Forest Fire Risk Prediction Based on Stacking Ensemble Learning for Yunnan Province of China

: Forest fire risk prediction is essential for building a forest fire defense system. Ensemble learning methods can avoid the problem of difficult model selection for disaster susceptibility prediction and can significantly improve modeling accuracy. This study introduces a stacking ensemble learning model for predicting forest fire risks in Yunnan Province by integrating various data types, such as meteorological, topographic, vegetation, and human activity factors. A total of 70,274 fire points and an equal number of randomly selected nonfire points were used to develop the model, with 70% of the data allocated for training and the remaining 30% for testing. The stacking model combined four diverse machine learning methods: random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and multilayer perceptron (MLP). We evaluated the model’s predictive performance using metrics like accuracy, area under the characteristic curve (AUC), and fire density (FD). The results demonstrated that the stacking fusion model exhibited remarkable accuracy with an AUC of 0.970 on the test set, significantly surpassing the performance of individual machine learning models, which had AUC values ranging from 0.935 to 0.953. Furthermore, the stacking fusion model effectively captured the maximum fire density in extremely high susceptibility areas, demonstrating enhanced generalization capabilities.


Introduction
As the main body of terrestrial ecosystems, forests have a variety of functions, such as regulating climate and maintaining soil and water.Among the many hazards facing forests, fires are increasingly harmful to forest resources [1].Since the 1980s, with continuous global warming and increase in extreme weather events, forest fires have shown a trend of longer fire duration, broader impact, and increased disaster losses [2].Forest fire risk prediction is essential in building a forest fire defense system [3].Analyzing the influencing factors of forest fires and developing a highly accurate and interpretable prediction model is essential for preventing forest fires and formulating relevant strategies.
The occurrence of forest fires is the result of a combination of various factors, such as meteorology, topography, vegetation, and human activities, and demonstrates a complex nonlinear behavior [4,5].In order to achieve more accurate forest fire risk prediction, multiple influencing factors must be comprehensively considered in the modeling process.However, the large number of influencing factors, complex correlations, and unknown internal mechanisms pose significant difficulties for research.
There are shortcomings and drawbacks in using traditional methods for forest fire risk prediction, which can be summarized in the following categories: (1) The assessment of forest fire risk associated with the weather involves intricate formulas and rules to categorize meteorological fire risk levels.The forest fire weather index (FWI) system, a widely employed system, relies solely on meteorological factors [6].However, the existing simplistic weather index falls short of meeting practical requirements.
(2) Multicriteria decision-making approaches [7], which have the capability to take into account various factors and objectives during the planning process, are particularly valuable when dealing with complex wildfire risk scenarios.This method is recognized as a crucial component in the wildfire risk assessment framework established by the U.S. Forest Service [8].However, it relies on subjective expert ratings, lacking objectivity.(3) Statistical and bivariate models, such as frequency ratios [9], weight of evidence [10], and fuzzy logic [11].Constrained by linear assumptions and low-order properties, these approaches may exhibit greater accuracy when handling high-dimensional data.(4) The relationship between different combustibles and meteorological factors is established by combustion experiments to predict forest fire risk [12].This method requires a large number of field experiments, and the physical parameters are very labor intensive to prepare and only applicable to a small area.
The swift progress in artificial intelligence has led to widespread applications of machine learning methods, effectively addressing complex nonlinear challenges in engineering.Forest fire risk prediction, seen as a binary classification problem in machine learning [13], utilizes influencing factors as features and forest fire occurrence/nonoccurrence as labels, as evidenced by a substantial body of literature.Logistic regression (LR), renowned for its robust fitting of linear relationships and high interpretability, has been widely utilized in forest fire risk prediction studies.In Shanxi Province, Pan et al. [14] applied LR to predict forest fire risk, revealing significant influences of surface temperature and NDVI on fire occurrence.Rodrigues et al. [15] observed regional variations in the impact of different fire risk drivers in a study on forest fires in Spain.However, the LR employing a uniform set of model parameters for the entire study area faced challenges in adapting to spatial heterogeneity.Random forest (RF), an algorithm in the bagging ensemble learning framework, is extensively used in forest fire risk prediction [16].RF employs decision trees as the base classifier, introducing random feature selection during decision making and enabling automatic variable importance assessment and selection.Milanović et al. [17] found that the RF model incorporating terrain, vegetation, human factors, and climatic features outperformed LR in predicting forest fire occurrences in eastern Serbia.Emerging in the boosting ensemble learning framework, XGBoost and LightGBM have gained popularity, demonstrating outstanding performance in diverse prediction studies [18].Sun et al. [19], aiming to enhance forest fire risk prediction accuracy, used LightGBM to generate forest fire susceptibility maps, showing superior performance compared to LR and RF in accuracy and AUC values.Neural networks, which are widely used in disaster prediction assessments due to their function approximation ability [20], were employed by Naderpour et al. [21] to develop a fire risk assessment model.Selecting 36 indicators from diverse backgrounds, they achieved an accuracy exceeding 90% in assessing forest fire susceptibility in Sydney's northern beaches area.These machine learning methods address the high workload, subjectivity, and low predictive accuracy of traditional methods.
However, the choice of an appropriate method remains uncertain within the plethora of available options.Single machine learning methods are susceptible to data flaws, such as outliers and multicollinearity among feature variables.Ensemble learning methods offer a solution by combining multiple independent methods, mitigating the challenge of selecting a specific machine learning method.Moreover, ensemble learning methods excel in handling complex and high-dimensional data, yielding more accurate results than their single counterparts.While ensemble learning has proven successful in disaster prediction for landslides [22], floods [23], earthquakes [24], and other domains, its application in forest fire risk prediction is still limited and warrants further exploration and testing [25].Concurrently, current research in forest fire risk prediction predominantly focuses on achieving high model accuracy; yet, the interpretability of the prediction results remains inadequate, restricting the models' practical utility in disaster prediction.
Based on the above considerations, we developed a fusion model using the stacking ensemble learning technique for forest fire risk prediction in Yunnan Province.To the best of our knowledge, there is currently no exploration of forest fire risk prediction using this model.In order to be able to objectively validate the effectiveness of the stacking fusion model, a comparative analysis was performed with the four base models.Hereafter, the body of this paper is organized into four major parts.Section 2 provides descriptions of the study area, data preparation, and the models used.The implementation of the models and the presentation of results are detailed in Section 3. Section 4 discusses the achievements.Finally, Section 5 concludes the paper.

Study Area
Yunnan Province is located on the southwestern border of China (Figure 1), which is between 21°09′-29° 15′ N and 97° 31′-106°11′ E, with a total area of 394,100 square kilometers.The topography of this region is high in the northwest and low in the southeast and descends step by step from north to south.Mountainous areas account for 88.64% of the total area of this province.Yunnan generally has a subtropical monsoon climate with large daily temperature differences and small annual temperature differences.The temperature varies significantly vertically with the height of the terrain.Precipitation is very unevenly distributed over the seasons [26].The wet season runs from May to October, accounting for 85% of the total annual rainfall.The dry season runs from November to April.The topography and meteorology of this region are suitable for the growth of forests.By 2020, the forest area of Yunnan Province had reached 24.936 million hectares with 65.04% forest coverage, making it one of the provinces with abundant and diverse forest resources in China [27].The dominant tree species in Yunnan are Pinus yunnanensis, Pinus armandii, and Keteleeria evelyniana, all of which are flammable tree species [28].According to the statistics of the Yunnan Forest Fire Brigade, there were 5036 forest fires in Yunnan from 2004 to 2018.It is a region in China with severe fire hazards.Meanwhile, in densely populated forest areas, there has always been the practice of burning wasteland.If the fire is used carelessly in the dry season, it can easily cause large-scale forest fires.As a result of the above factors, the potential risk of forest fires in Yunnan Province is high, and its forest fire prevention work faces serious challenges [29].

Fire Point Inventory
Fire point data were collected from the NASA Fire Information Resource Management System (FIRMS) Visible Infrared Imaging Radiometer Suite (VIIRS) 375 m anomalous hotspot remote sensing data [30].The VIIRS sensor provides a spatial resolution of two types of anomalous hotspot data, 750 and 375 m, with global coverage every 12 h.It is widely used in fire point monitoring [31], burned area mapping [32], and fire emission assessments [33].Compared to a moderate resolution imaging spectroradiometer (MODIS), VIIRS products have higher precision and more significant application potential.
The VIIRS 375 m dataset encompasses critical fire characteristics, including fire point coordinates, ignition dates, and confidence levels.In extracting fire points, VIIRS highconfidence thermal anomaly data (Type H) from Yunnan Province spanning the years 2004 to 2018 were selected to represent forest fire points.To mitigate the influence of stationary heat sources, such as power plants, thermal anomalies not classified as forest land were excluded from the analysis.The resulting dataset still contained a significant number of fire points, attributed to VIIRS's capability of capturing changes in the perimeters of fire scenes during the spread of fires.To refine the dataset further, we adopted an adaptive kmeans clustering algorithm, as referenced in [34], which clusters the thermal anomalies based on longitude, latitude, and occurrence time.The method was employed to identify points belonging to the same fire incident and to construct fire polygons encompassing all points involved in individual fire events.Ultimately, 20,676 fire incidents were identified, as illustrated in Figure 2.Among these, the earliest occurring points were retained as forest fire points for experimental purposes, resulting in a dataset of 70,274 fire points.While these data may still include points related to the same fire incident, variations were present within them.For the sake of expanding the training dataset, these points were not excluded.In addition to the fire points, random points were also necessary and selected to serve as nonfire points.For the dependent variable labels, ignition points were labeled as "1" and nonignition points as "0".To ensure the accuracy of the nonfire point selection, a 500 m buffer zone was established around the fire polygons obtained from clustering, prohibiting the allocation of random points within this buffer on the same day.Furthermore, to maintain data balance and minimize bias, we randomly selected an equal number of nonfire points based on previous research [14,[16][17][18]20].The selection of random points as nonfire points ensured dual randomness in both temporal and spatial dimensions.The dataset, comprising both fire and nonfire points, was divided based on the occurrence dates, with 70% of the data allocated for training and the remaining 30% for testing.To address the issue of data leakage, we ensured that the data for each distinct fire event were exclusively contained within either the training or test set and not both.This approach prevented the overlap of fire events across sets, thereby eliminating the risk of the model being inadvertently exposed to test conditions during training.

Forest Fires Risk Influencing Factors
The occurrence of forest fires is the result of a combination of factors.We classified these factors into four major categories based on previous studies: meteorological factors, topographic factors, vegetation factors, and human activity factors.
The influencing factors related to topography include elevation, slope, and aspect.Usually, the higher the elevation and the higher the moisture content of the vegetation, the lower the risk of forest fires [35].Slope can affect the rate of fire spread.For uphill fires, combustible materials in the upper part are preheated by the fire below, causing a significant loss of moisture and accelerating fire spread.Also, the greater the slope, the easier it is to lose moisture from the soil, the drier the vegetation tends to be, and the higher the risk of forest fires [36].Slope aspects can be divided into sunny slopes and shady slopes.Vegetation on sunny slopes is more susceptible to forest fires due to prolonged exposure to sunlight, increased surrounding temperatures, and reduced air humidity [37].
Meteorological factors include temperature, humidity, precipitation, wind speed, and hours of sunshine.The temperature is a crucial factor in forest fire occurrence and spread.The higher the temperature, the stronger the transpiration of vegetation, leading to a decrease in the water content of combustible materials and an increase in the risk of forest fires [38].The amount of precipitation and the air's relative humidity affect the vegetation's water content [39].In addition, the wind can accelerate airflow, reduce air humidity, and increase the likelihood of forest fires.
The occurrence of forest fires is not only influenced by same-day meteorological factors but also by the cumulative effects of previous climate.One of the most commonly used methods to visualize the impact of this cumulative weather effect's impact on forest occurrence is the forest fire weather index (FWI).The FWI is based on the time-lag equilibrium water content theory.The individual indices of the FWI are calculated from the temperature, relative humidity, wind speed, and precipitation data observed at solar noon each day [40].The FWI index includes three combustible moisture codes, which are fine fuel moisture content (FFMC), duff moisture content (DMC), and drought code (DC).FFMC represents the fuel moisture of forest litter fuels under the shade of a forest canopy.It is intended to represent moisture conditions for shaded litter fuels, the equivalent of 16 h time lag, and ranges from 0 to 101.The FFMC value varies with the moisture content of combustible material affected by precipitation, temperature, relative humidity, and wind speed.DMC represents fuel moisture of decomposed organic material underneath the litter and represents moisture conditions for the equivalent of 15-day (or 360 h) time-lag fuels.DC represents drying deep into the soil and approximates moisture conditions for the equivalent of 53-day (1272 h) time-lag fuels.It is unitless and has a maximum value of 1000.Extreme drought conditions have produced DC values near 800 [41].The FFMC, DMC, and DC indicators are a continuous calculation process, and the results of the next day depend on the values of the previous day, so the initial values of the three indicators need to be set.The initial value of FFMC is 85, DMC is 6, and DC is 15 [42].
Vegetation factors include the vegetation normalized index (NDVI) and vegetation type (VT).NDVI is remote sensing data reflecting vegetation's spatial distribution and density, and it is often used to estimate the vegetation's growth status and density [43].The occurrence of forest fires is intricately linked to the type of vegetation as it directly influences the characteristics and quantity of combustible materials.
According to relevant statistics, the vast majority of forest fires that have occurred in recent years have been caused by human activities [44,45].Ritual fires, spring burning, and wild smoking are the leading causes of human-induced forest fires [46].We chose population density distribution and distance to roads to characterize the impact of human activities on forest fires.For ease of description, we categorize the distance to rivers as part of human activities.Areas with rivers tend to have moist soil and higher plant moisture content, making them less prone to wildfires.Additionally, rivers can effectively act as barriers to prevent the spread of wildfires [47].
Through the investigation of forest fire risk influencing factors, we finally selected 17 factors as features, and the detailed data description is shown in Table 1.The distribution of influencing factors is shown in Figure 3    The distribution of meteorological stations in Yunnan Province is characterized by discrete limitations and uneven coverage, necessitating the use of spatial interpolation techniques to derive accurate meteorological data.Spatial interpolation involves mathematically deriving data for unknown regions based on known discrete data and is commonly accomplished through methods such as inverse distance weighting (IDW), kriging, and spline [48].
Studies indicate that, particularly in Yunnan, thin-plate spline interpolation using the Anusplin software yields higher overall accuracy compared to IDW and kriging [49].
Meteorological data were sourced from the "Daily meteorological dataset of basic meteorological elements of China National Surface Weather Station (V3.0)" provided by the China Meteorological Data Network, covering 15 years from 2004 to 2018.To ensure precise interpolation, we selected the three-variable thin-plate spline function with elevation as a covariate, specifically the TVPTPS3 model, based on prior research findings [50].Utilizing Anusplin software with this model configuration, we generated a spatial interpolation raster dataset for meteorological variables at a resolution of 500 m × 500 m.Precipitation interpolation employed square root transformation due to data range and high uncertainty.
To assess the reliability and applicability of Anusplin interpolation for meteorological feature data, we conducted a comparative analysis with commonly employed kriging [17] and IDW [51] interpolation methods frequently utilized in forest fire prediction or risk assessment.Monthly average meteorological elements from the 15th day of each month (temperature, humidity, precipitation, hours of sunshine, and wind speed) were selected for interpolation processing.Of the 86 meteorological stations, data from 80% (69 stations) were used for interpolation computation, while data from the remaining 20% (17 stations) were used for verification.Mean absolute error (MAE) and root mean square error (RMSE) comparisons between actual and predicted values at verification stations were performed to assess interpolation accuracy [48].The MAE indicates the level of overall error, while the RMSE reflects the sensitivity and extreme conditions of the sample data estimation.The smaller the MAE and RMSE values, the higher the precision and robustness of the spatial interpolation.
Comparative analysis of IDW, kriging, and Anusplin for average maximum air temperature distribution on 15 May from 2004 to 2018 (Figure 4) revealed distinctive features.The IDW method resulted in locally low-or high-temperature phenomena because it only considered the spatial distance between meteorological stations and points to be interpolated.The kriging method used semivariogram to describe spatial correlation between sites, capturing trends in geographical space using covariance functions.Although it yielded the smoothest interpolation distribution among the three methods, it did not precisely reflect the influence of geographic elements on temperature.The Anusplin method using a regression model based on altitude distinctively delineated mountainous features of the Yunnan region.Overall, the Anusplin interpolation method achieved the smallest RMSE and MAE values among all the five meteorological elements.As a result, it stood out as the optimal spatial interpolation method for daily basic elements in the Yunnan area, proving that the interpolated data are highly reliable [49].

Data Preprocessing
To reduce implicit transformations in the model and eliminate outliers in raw data to improve accuracy, data preprocessing is the first crucial step in machine learning.We extracted forest fire influencing factors as features from raster data using the ArcGIS software and converted them to tabular data.Among the features, aspect, VT, road, and river were categorical variables, while the rest were continuous variables.We chose the z-score normalization method to transform continuous variables into a form with a mean of 0 and a standard deviation of 1 [52].The formula is as follows: where x t * is the normalized data, x t is the original data, and x avg and x sd are the mean and standard of the full original data, respectively.

Methodology
The methodology included five steps: (1) selecting the forest fire influencing factors, (2) modeling the forest fire risk using the stacking ensemble technique, (3) ranking feature importance based on the SHAP interpretation framework, (4) evaluating the forest fire risk prediction models, and (5) producing and validating the forest fire susceptibility maps.Figure 6 shows the flowchart.

Selecting the Forest Fire Influencing Factors
Feature selection is usually required before modeling, which can prevent inaccurate model prediction due to multicollinearity among features.We used the Pearson correlation coefficient [53], tolerance level (TOL), and variance inflation factor (VIF) to identify feature variables with high covariance.Pearson correlation coefficient >0.8, tolerance (TOL) <0.1, and variance inflation factor (VIF) >10 are usually considered signs of multicollinearity [54].The features that did not satisfy the above conditions were not used to construct the model.

Modeling the Forest Fire Risk Using the Stacking Ensemble Technique
Single machine learning models usually have various drawbacks.Ensemble learning involves combining individual models together to form a model with stronger generalization performance, effectively reducing or eliminating the limitations of individual machine learning models.

5:
The   The structure of the stacking fusion model is based on the prediction ability and variance of the base model, so the model with strong prediction ability and high diversity should be selected as the base model in the first layer of the stacking model.For the second layer, the metamodel should have a strong generalization ability to find and correct the bias of multiple base models for the training set.Hence, we used four heterogeneous machine learning algorithms (random forest, XGBoost, LightGBM, and MLP) as the base model and LightGBM as the metamodel.These four base models are briefly described below.
Random forest (RF) is an ensemble learning model design based on the bagging framework [57].Random forest integrates decision trees as the base estimators and introduces sample random sampling and feature random sampling in the training process.Random forest significantly improves the accuracy and stability of the model, reduces sensitivity to noise and outliers, and effectively avoids overfitting.
eXtreme gradient boosting (XGBoost) is a novel gradient boosting decision tree (GBDT) model [58].Unlike RF, GBDT is an ensemble learning model based on the boosting framework.Each estimator of GBDT relies on the residuals formed by the previous round of estimator training for the current round.The final result is obtained by accumulating all the results, which is serially generated.In contrast, RF's base estimators are generated in parallel, and the final results are obtained by majority voting.The XGBoost uses Taylor's second-order expansion to optimize the loss function, which reduces the training time compared to GBDT and thus improves the efficiency of the solution.It also incorporates a regular term to reduce model complexity and prevent overfitting.The core of XGBoost is to build new trees by continuously splitting features to fit the residuals between the last prediction and the actual value and to sum up the results of all trees as the final prediction result.
The light gradient boosting machine (LightGBM) introduces a leaf-wise leaf growth strategy with a depth limit, a histogram algorithm, a gradient-based one-sided sampling (GOSS) algorithm, and an exclusive feature bundling (EFB) algorithm based on the traditional gradient boosting tree [59].All four algorithms ensure that the model accuracy is not affected and improve the model training speed under the growth of sample data volume and feature volume.
Multilayer perceptron (MLP), also called artificial neural network (ANN), mainly consists of three parts: input layer, hidden layer, and output layer [60].The layers of MLP transform the output through the activation function, which can handle complex multiinput and multioutput nonlinear problems and is very good for classification problems.
The base model of stacking determines the optimal hyperparameters by Bayesian optimization [61].We used accuracy (ACC) as the evaluation indicator for each prediction effectiveness and combined it with a five-fold cross-validation approach by comparing the prediction effectiveness in the test set.
The experiments were conducted on a computer with an Intel i7-12700H processor and 16 GB memory, and the operating environment was a 64-bit Windows 11 system.The Scikit-learn tool library in Python 3.8 was used for model construction, and the bayes_opt tool library was used for Bayesian optimization.The optimal set of hyperparameters for each model is shown in Table 2. Feature importance is a method for scoring the input features of a prediction model, which can reveal the relative importance of each feature.The Scikit-learn tool library provides the internal method for tree models such as RF, XGBoot, and LightGBM to return feature importance.However, this method is usually based on the training set, and the relative importance of feature variables may not be accurate when the model is overfitted.As for machine learning algorithms, such as neural networks, which are not based on tree models, feature importance is not available.
The stacking model proposed in this study has a complex internal structure.It is difficult to determine the correlation between the input and output, so the feature importance cannot be obtained using the internal mechanism of the model.We used the SHapley Additive exPlanation (SHAP) interpretation framework to obtain the Shapley value of each feature to indicate the feature's importance.The SHAP is an additive feature attribution model based on the cooperative game theory [62].The SHAP value can be used to represent each feature's marginal contribution value and measure the feature's importance [63].The SHAP value of feature j is defined as follows: where N represents the set of all features, and S represents all feature subsets obtained from F after removing the feature j. f(S) represents the output of the machine learning model of the feature subset S. f(S ∪ {j}) − f(S) represents the cumulative contribution value of feature j.

Evaluating the Forest Fire Risk Prediction Models
As explained in the previous section, forest fire risk prediction is a binary classification problem, so the threshold value needs to be set when predicting whether a sample belongs to a fire point.In the case of a balanced number of positive and negative samples, the threshold value can be set to 0.5.When the calculated probability is greater than 0.5, the sample is judged to be a fire point and vice versa for a nonfire point.
We used the confusion matrix and its associated parameters to verify the predictive performance of different models.The confusion matrix contains four possible outcomes: true positive (TP), true negative (TN), false positive (FP), and false negative (FN).TP and TN denote the number of correctly classified fire point and nonfire point samples, respectively, while FN and FP denote the number of misclassified fire point and nonfire point samples, respectively.
The receiver operating characteristic curve (ROC) is a valuable tool for evaluating the performance of forest fire risk prediction models [64].The area under the curve (AUC) can quantitatively reflect the model performance measured on the basis of the ROC with values between 0.5 and 1.The closer the AUC to 1, the more accurate the model.
In addition to the above classification performance indicators, accuracy, precision, recall, and F1 score were used to evaluate the model effect [65].The higher accuracy, precision, recall, and F1 score represent the better predictive ability of the model.In general, the model evaluation indicators, as described above, should be sufficient to validate the predictive capability of the models.However, susceptibility mapping for forest fire risk prediction is critical because it provides preliminary information on which areas are at risk [13].Therefore, each model must be output as a susceptibility map to validate and evaluate performance.
The forest fire susceptibility map is a probability map of fire occurrence, indicating the probability of forest fire risk.The trained model was applied to the entire study area, and the probability of forest fire occurrence was calculated for each cell.The equal intervals method was used to reclassify susceptibility into five classes, including extremely low (0~0.2),low (0.2~0.4), medium (0.4~0.6), high (0.6~0.8), and extremely high (0.8~1).
A good forest fire susceptibility map should contain more fire points with fewer high susceptibility areas.For this purpose, we introduced fire density (FD) as a new evaluation indicator.FD is defined as the ratio between the proportion of fire points falling into each class and the proportion of the number of pixels in that class.The higher the FD for a given class, the more fire points per unit area for that class.The formula for FD is as follows: where N i represents the number of fire points falling in class i, N represents the total number of fire points in the study area, S i represents the number of cells in class i, and S represents the total number of cells in the study area.

Forest Fire Influencing Factor Selection
Figure 8 illustrates the Pearson correlation coefficients among the 17 features.The correlation analysis indicated that, with the exception of the strong correlation between average daily air temperature and maximum daily air temperature (0.92), most factors exhibited weak correlations with each other.In light of practical considerations and the understanding that elevated temperatures are a crucial component of conditions conducive to forest fire risk, average daily air temperature was subsequently excluded from the features.The multicollinearity analysis showed the tolerances (TOL) and variance inflation factors (VIF) of the remaining 16 features, as presented in Table 3.The TOL of the features ranged from 0.20 to 0.92, and the VIF ranged from 1.09 to 5.0, which satisfied the threshold requirements of TOL >0.1 and VIF <10.This indicated that there was no multicollinearity among the selected features, and it could be used for model training and validation.

Model Comparison and Validation
The stacking fusion model was compared with its four base models.We used the six evaluation indicators in Section 2.3.4 to validate the models' prediction effectiveness and generalization ability.
Figure 9 shows the ROC curve of the five models in the testing dataset.The results showed that the stacking fusion model had the highest AUC value (0.9701), followed by XGBoost (0.953), LightGBM (0.946), RF (0.942), and MLP (0.934).As demonstrated in Table 4, the stacking fusion model was better than the other four models in terms of accuracy, precision, recall, and F1 score.The proposed stacking fusion model, compared to the bestperforming model among the single models, XGBoost, improved by 2.24% in accuracy, 2.22% in precision, 2.24% in recall, and 2.12% in F1 score.The predictive power of all five models in this study was within the acceptable range, but the stacking model had better generalization ability and better prediction accuracy.

Forest Fire Susceptibility Map
To comprehensively compare the performance of the five machine learning models and present their prediction outcomes more clearly, we conducted an analysis and comparison of the forest fire susceptibility maps generated by these models.The months of February and March are recognized as peak periods for forest fires in Yunnan Province.For the validation of the predictive capabilities of the five models, we selected the complete datasets for February and March in the year 2018.Notably, the fire point data for these months were deliberately excluded from the training dataset.We randomly selected 6 March 2018 from the two peak months and utilized five models to generate forest fire susceptibility maps for that specific day.The resolution of these maps was set at 500 m, as depicted in Figure 10.Overall, the five maps had similarities in spatial distribution.For example, the extremely high susceptibility area was mostly distributed in the eastern region, and most fire points also fell in the extremely high class.In addition, there were some differences among the five maps.The MLP had the largest number of extremely high and high susceptibility areas.
For quantitative susceptibility map analysis, we employed fire density (FD) to assess the reliability and accuracy of the results.A higher FD value in a specific area indicates a greater number of fire points per unit area in that region.Table 5 illustrates the FD values corresponding to the susceptibility maps for the five models across different susceptibility levels, ranging from extremely low to extremely high.The distribution aligns well with the actual scenario, wherein fire points are more likely to occur in extremely high susceptibility areas while the probability is considerably lower in extremely low susceptibility areas.At the same time, the stacking fusion model exhibited a significantly higher FD value in the extremely high susceptibility area (2.68) compared to other models.This suggests that the stacking fusion model excels in accurately identifying fire points with only a minimal number of misclassifications in low susceptibility areas.

Feature Importance
The predictions of the stacking fusion model were analyzed using SHAP to generate a comprehensive feature summary plot illustrating feature importance and effects.In  As shown in Figure 8, features such as sunshine duration (Suh), altitude, fine fuel moisture code (FFMC), normalized difference vegetation index (NDVI), duff moisture code (DMC), population density (Pop), average relative humidity (Arh), maximum daily air temperature (Mte), maximum daily wind speed (Mws), 24 h cumulative precipitation (Pre), and vegetation type (VT) significantly influenced the model's predictive output.Specifically, an increase in the values of Suh, FFMC, NDVI, DMC, Pop, Mte, and Mws resulted in higher SHAP values, indicating a positive impact on the occurrence of forest fires.Conversely, altitude, Arh, and Pre exhibited a negative influence on the predicted output as their values increased.For categorical features, such as aspect, slope, and VT, where the feature values represent categories, it is challenging to assess their impact on the model output based solely on value changes.However, their clustered distribution suggests that specific categories had a significant impact on the model.

Predictive Model Comparison
Forest fire risk prediction is affected by different meteorological, topographical, and human factors in the study area, which is complex and varied.Thus, it is difficult to have a general model to solve the prediction problem in all areas [66].
We opted for four models, namely, random forest (RF), XGBoost, LightGBM, and multilayer perceptron (MLP), each exhibiting robust generalization capabilities and significant theoretical differences, as our base models.Leveraging the stacking ensemble learning technique, we constructed a multimodel fusion model by combining them.When the stacking model was compared with these four base models, experimental results showed that the stacking fusion model improved the prediction performance (highest AUC values and accuracy).This improved effect resulted from reducing the bias and variance and avoiding overfitting problems [67].Among the four base models, tree-based models, namely, RF, XGBoost, and LightGBM, demonstrated comparable performance in accuracy, precision, recall, F1 score, and AUC evaluation metrics.They outperformed MLP, confirming the notion that neural networks may not perform as well as tree models, such as random forest and XGBoost, on tabular data [68].
When modeling the forest fire susceptibility map, MLP, in order to fit the sample data, predicted the forest fire risk for the entire Yunnan Province as a higher level, failing to reflect the polar distribution of extremely high and extremely low forest fire risks.However, some argue that an extreme polar distribution of very high and very low risks may not be the optimal choice for susceptibility maps as moderate risk levels also provide crucial information [51].The stacking model proposed in this study leverages the advantages of integrating multiple models, allowing it to encompass the maximum number of extremely high susceptibility areas with minimal coverage, achieving the highest fire density without losing the intermediate risk levels.We believe it strikes a good balance between these two perspectives.
There is no free lunch, and no one model is always good [69].The stacking fusion model achieves the highest accuracy, but it also has the most complex structure, requiring a considerable amount of time for training and incurring high costs.Given these considerations, many commercial companies may opt for a single model, such as XGBoost or LightGBM.In the realm of model selection, balancing accuracy and complexity has always been a focal point in machine learning research.Additionally, compared to other treebased models, stacking cannot identify the relative weights of features, leading to lower interpretability.To address this, incorporating Shapley values on the foundation of the SHAP interpretability framework enhances the interpretability of the stacking model.

Impact Factor Analysis
Meteorology was most closely related to forest fire risk.As shown in Figure 11, the first two features were both meteorological factors.Forest fires in Yunnan Province mainly occur in the dry season (January-May).According to Figure 11, moderate correlations were observed among specific factors, such as sunshine duration and precipitation as well as sunshine duration and relative humidity.This suggests that many variables exhibit interdependence.The dry season is characterized by abundant sunlight, low precipitation, gradual temperature increase, low air humidity, and the impact of subtropical monsoons.These conditions contribute to a continuous decrease in the water content of combustible materials, making the area more susceptible to forest fires.
Regarding vegetation factors, some studies suggest that higher NDVI values indicate a higher frequency of forest fires.As shown in Figure 11, the feature values of NDVI were positively correlated with SHAP values, confirming this observation.Additionally, we observed a clustered distribution of vegetation types, as shown in Figure 11.Upon examining the vegetation types corresponding to these high SHAP values, we found that they were predominantly coniferous forests, such as pine trees.This indicates that different vegetation types can also influence the occurrence of forest fires [70].
As for topographic factors, altitude significantly impacted the model output, probably because of the significant variations in altitude in Yunnan.Areas with high elevation have low temperatures, and together with the lack of human presence, the probability of forest fire risk is low.Aspect had less influence on the model output, probably because Yunnan is located at low latitudes and the difference in heat between shaded and sunny slopes is insignificant.
On the human activity factors, the SHAP framework had a poor interpretation of Pop, and many points with high feature values were mixed with those with low feature values.It may be that, within a certain threshold, an increase in population density would indicate high human activity in the forest, increasing forest fire frequency [71].However, forest fires are less likely to occur in towns with high population density, where forest cover is low.In general, the distance from rivers and roads is considered a crucial feature for assessing forest fire risk.However, in this study, these two features were found to have a weaker impact, possibly related to the categorization of distance.In studies involving large-scale regions like Yunnan Province, the data used lacked information on roads below the county level, making it challenging to categorize road distances in units as small as meters, as is carried out in smaller-scale studies [72].This categorization might not have fully explored the correlation between fire occurrences and road distance, leading to a relatively weak influence on the model output.
Based on the analysis above, in Yunnan Province, with its varied topography and rich biodiversity, managing forest fires is challenging.Advanced monitoring through satellites and drones is crucial for early fire detection, especially in remote areas.This should be coupled with a rapid response system combining aerial and ground teams for effective fire suppression.The province's rugged terrain requires special firefighting methods and equipment, particularly for mountainous regions.Helicopters are vital for surveillance and fighting fires in hard-to-reach areas [20].Additionally, managing vegetation and firebreaks is essential to reduce fuel and prevent wildfire spread.The distinct dry and wet seasons in Yunnan also demand an adaptive management strategy to effectively prevent and control fires and to mitigate their impact.

Limitations and Prospects
The quality of the input data is critical.Both the type and resolution of data can have an impact on model construction.Different versions of elevation, vegetation, and meteorological data may lead to different forest fire risk prediction results.Although some free elevation data are currently available [73], unfortunately, there are still difficulties in obtaining high-resolution meteorological data and vegetation data.As more and more high-resolution data become available, the reliability of predictive assessment modeling developed based on machine learning will continue to increase in the future.
In constructing our dataset, we aimed for a balanced representation and minimized bias by setting a fire to nonfire point ratio at 1:1, aligning with prior research insights.
However, due to the infrequency of forest fires, this sampling approach may potentially overemphasize the proportion of fire incidents, leading to inadequate model training and reduced accuracy.Recent work in landslide susceptibility assessment [74] suggests the use of Bayesian optimization algorithms to optimize sample ratios, resulting in improved AUC values.Given the rarity of forest fires, similar to landslides, our future research will explore the applicability of Bayesian optimization algorithms in determining the optimal ratio of fire and nonfire points.
The proposed stacking fusion model in this study effectively enhances prediction accuracy and addresses the challenge of selecting the appropriate machine learning algorithm.Nonetheless, a notable drawback is the model's increased complexity and reduced interpretability compared to other models.Investigating the integration of interpretable machine learning models in forest fire risk prediction, elucidating corresponding trends in the feature space, and explaining the decision process for specific cases pose challenges for future research.
Although our current model primarily leverages static spatiotemporal influencing factors, we acknowledge the growing body of research utilizing neural network models based on time-series data to enhance predictive capabilities [75].Recognizing the potential advancements offered by incorporating temporal dynamics, our future research will explore the integration of time-series data into the modeling framework.This evolution aims to capture the dynamic nature of forest fire risk, providing a more nuanced and accurate predictive assessment.As temporal data availability and computational capabilities advance, this avenue holds promise for refining and advancing our predictive modeling strategies.
Considering the ultimate goal of applying the model to the entire Yunnan Province, future efforts will focus on validating and adjusting the model to ensure its applicability and reliability across the entire domain.This step is critical as the reported metrics, while encouraging, may not fully reflect the model's performance when applied to the broader and more diverse dataset of the entire province.

Conclusions
Forest fire management is a protracted undertaking necessitating the development of reliable fire risk prediction models tailored to local conditions.The key contributions of this study are as follows:

•
We devised a stacking fusion model by amalgamating four disparate machine learning methods (RF, XGBoost, LightGBM, and MLP) for forest fire risk prediction.Although currently limited by the specific sampling dataset used in this study, the model showed promising results in generating daily forest fire sensitivity maps.

•
Through multiple covariance tests and Pearson coefficient analyses involving meteorological, topographic, vegetation, and human activity data, we identified 16 significant factors influencing forest fire risk.

•
Model performance was meticulously evaluated using various metrics, including accuracy, AUC, and fire density.The results demonstrated that the stacking fusion model exhibited remarkable accuracy with an AUC of 0.970 on the test set, significantly surpassing the performance of individual machine learning models, which had AUC values ranging from 0.935 to 0.953.Furthermore, the stacking fusion model effectively captured the maximum fire density in extremely high susceptibility areas, demonstrating enhanced generalization capabilities.This research expands the application of stacking ensemble learning for predicting forest fire risk.

•
To address the interpretability challenges arising from the intricate internal structure of stacking fusion models, we employed the SHAP framework for an interpretable analysis of the model's prediction results, yielding a feature importance ranking.
In summary, this study provides valuable insights for policymakers, enabling a comprehensive understanding of the influencing factors and susceptibility distribution of forest fires in Yunnan Province.It facilitates the formulation of more precise fire prevention strategies to mitigate potential fire incidents.

Figure 1 .
Figure 1.Study area (map) and locations of forest fire points from NASA's Fire Information for Resource Management System.The histogram shows the proportion of forest fire points in the study area from 2004 to 2018.

Figure 2 .
Figure 2. The portion displaying fire incidents identified through adaptive k-means clustering.

Figure 4 .
Figure 4. Distribution of average maximum temperature on 15 May from 2004 to 2018 by IDW, kriging, and Anusplin.Numerically, a comparison of the verification of the three interpolation methods for average maximum temperature (a), relative humidity (b), hours of sunshine (c), precipitation (d), and wind speed (e) are illustrated in Figure 5, depicting the average MAE and RMSE across each month from 2004 to 2018.Overall, the Anusplin interpolation method achieved the smallest RMSE and MAE values among all the five meteorological elements.As a result, it stood out as the optimal

Figure 5 .
Figure 5.The average MEA and RMSE for the average maximum temperature (a), relative humidity (b), hours of sunshine (c), precipitation (d), and wind speed (e) across each month from 2004 to 2018 employing three interpolation methods.

Figure 7
Figure7shows the structure of the proposed stacking fusion model, which contains two layers.The first layer contains several base models, while the second has only one metamodel.In order to avoid the problem of poor generalization ability due to the small scale of data division, base models are usually trained by cross-validation.The results of numerous training of base models in the first layer were used as the new training and testing sets, and the metamodel in the second layer was trained and predicted using the new training and testing sets.The following pseudo-codes describe the procession of the model method in detail.

Figure 7 .
Figure 7.The structure of the stacking fusion model.

5 .
Producing and Validating the Forest Fire Susceptibility Maps

Figure 10 .
Figure 10.Forest fire susceptibility maps and observed fire points distribution on 6 March 2018.
Figure 11, each row corresponds to a feature, with features arranged in descending order of importance (from top to bottom).Each point represents the SHAP value of an instance, and the color gradient from blue to red indicates the feature's low to high value.The xaxis represents the SHAP values, where a positive value signifies a positive effect on the model output and a negative value indicates a negative effect on the model output.

Figure 11 .
Figure 11.SHAP summary plot for the stacking fusion model.

Table 1 .
Data description of forest fire risk influencing factors.

Table 2 .
The optimal set of hyperparameters for each model.

Table 3 .
Multicollinearity diagnosis results of features.

Table 4 .
Performance of different prediction models.

Table 5 .
The evaluation of forest fire susceptibility classes.