1. Introduction
Wildfire frequency and intensity have risen globally in recent decades, primarily due to climate change—a trend that poses mounting threats to ecosystems and human settlements [
1]. Wildfire activity is a rising concern worldwide amid warming climates and expanding human development [
2,
3]. In South Korea, which has reforested significantly in the late 20th century, forests now cover about 62–63% of the land area [
4]. This recovery makes the country one of the most forested members of the Organisation for Economic Co-operation and Development (OECD), an international organization of 38 member countries that promotes economic and social policy coordination; it ranks fourth behind Finland, Sweden, and Japan [
4]. This forest recovery supports biodiversity and carbon storage but also increases wildfire risk, especially in wildland–urban interface (WUI) zones that now occupy large swaths of forest edge. Between 2016 and 2022, WUI areas accounted for approximately 29% of the nation’s total forest fire incidents [
5,
6]. Globally, extreme wildfire events—from Australia’s 2019–2020 bushfires to the ongoing expansions of burned tree cover around the world [
7,
8]—demonstrate that wildfire risk has become an urgent challenge. Understanding wildfire trends in South Korea’s monsoonal climate context is therefore critical for adaptation [
5,
9].
Despite South Korea’s remarkable reforestation and robust suppression efforts, wildfire trends highlight persistent and in some cases escalating threats. The country recorded an average of 562 wildfire events per year from 2016 to 2022, burning 1863 hectares annually [
5]. While the annual burned area has not demonstrated a consistent long-term increase [
9], multiple large-scale wildfire events have occurred in recent decades. For instance, the catastrophic 2000 East Coast fires scorched 23,794 hectares, then a record, only to be surpassed in March 2022 when fires burned approximately 24,000 hectares [
10].
Climate change exacerbates wildfire hazards by raising temperatures and lowering humidity levels, both of which contribute to more frequent and intense fires [
11]. Additionally, South Korea’s conifer-heavy forests—dominated by highly flammable pine species—further amplify potential fire damage [
9,
12]. Meanwhile, accelerating WUI expansion has introduced more ignition sources near forests, making human-triggered fires more common [
5]. Recent analyses suggest that fire frequency has risen but without strong statistical significance [
9]. These overlapping factors underscore that climate, ecological, and anthropogenic drivers collectively contribute to escalating wildfire hazards.
Amid rising global wildfire activity, South Korea’s extensive forest cover and expanding WUI exacerbate vulnerabilities [
13,
14]. The spring fire season, when low humidity and strong winds prevail, poses a particularly high risk [
11]. With climate change projected to bring more frequent heatwaves and erratic rainfall, conditions for severe wildfires could intensify [
1]. In addition, the country’s demographic shifts—urban sprawl, rural depopulation, and an aging population in forested areas—complicate evacuation and suppression strategies [
5]. Thus, large sections of the population and vital infrastructure face heightened fire exposure. Major global wildfire events, such as Australia’s 2019–2020 bushfires, illustrate the scale of risk that comparable forested nations may face under changing climatic conditions [
7,
8]. Recent studies highlight South Korea’s increasing wildfire risks driven by climatic extremes, expanding WUI, and socioeconomic pressures [
15,
16]. These combined factors demand adaptive forest management and technology-driven strategies for wildfire mitigation [
16,
17].
Prolonged droughts reduce fuel moisture, making coniferous stands exceptionally flammable. Moreover, fires burning near the WUI often threaten human safety and property, placing tremendous pressure on local governments [
5]. These conditions highlight the need for comprehensive datasets on large fires (≥5 ha) and underscore concerns that continuing climate trends, coupled with growing human encroachment, may further intensify wildfire severity and frequency.
Addressing these risks requires continuous monitoring of climate variables and integrated analysis; fuel management (e.g., thinning or controlled burns) [
12], WUI-focused planning, and dynamic policy adaptation to demographic shifts are crucial [
5]. By leveraging statistical methods and advanced machine-learning models (Random Forest, XGBoost) with interpretability tools such as SHAP, researchers can identify how different drivers interact and establish threshold conditions that may dramatically increase fire spread potential [
9,
18]. Existing wildfire research often isolates factors such as climate, fuels, and human activity rather than analyzing their interplay. Our goal is to examine how low fuel moisture, high temperatures, and wind extremes interact to intensify fire risk, emphasizing non-linear, compound effects [
18]. While previous studies have emphasised broad human expansion metrics [
19], we also investigate how WUI presence may amplify ignition risk and spread, particularly in conifer-rich areas under drought [
20]. By quantifying these relationships, we offer a more nuanced framework linking urban expansion, climate change, and ecological conditions.
Overall, wildfire risk emerges from the complex interplay of climatic, ecological, and human factors, implying the need for integrated models and proactive management.
Study Hypotheses.
H1 (Climate): Peak wind speed is positively associated with burned area, whereas relative humidity is negatively associated; temperature is positively associated, conditional on wind and RH.
H2 (Forest): Conifer and mature-stand ratios are positively associated, and broadleaf ratio is negatively associated, with burned area.
H3 (WUI): Greater WUI exposure (e.g., population density or proximity to built-up areas) is positively associated with burned area.
H4 (Interactions): Climate–fuel interactions intensify losses (e.g., Wind × low RH; Conifer/Mature × low RH).
H5 (Thresholds): Under very dry, windy conditions, upper-tail losses are more likely in landscapes with high conifer or mature-stand proportions.
2. Literature Review
Global wildfire activity has intensified, with evidence linking this trend to anthropogenic climate change. In 2023, Canada experienced its most destructive fire season on record, burning approximately 15 million hectares. This event was driven by early snowmelt, prolonged drought, and a +2.2 °C temperature anomaly [
21]. Similarly, in the Western United States, a small number of extreme single-day fires account for the majority of burned area, with frequency expected to more than double under +2 °C warming [
22]. The 2019/20 Australian Black Summer fires exemplify how overlapping drought, long-term warming, and climate variability now surpass historical fire norms [
23].
Comparisons to other regions with similar climates underscore the urgency of this work: in Mediterranean Europe, wildfires are increasingly driven by climatic and fuel conditions, and projections indicate that climate change could increase wildfire danger and burned area by anywhere from 2–4% to as much as 50% per decade [
24,
25]. Likewise, in the Western United States and Pacific Northwest, warmer and drier conditions have lengthened fire seasons, increased fire severity and led to exponential growth in fire frequency and size since the 1950s, resulting in more frequent large and mega-fires [
26,
27].
Fire behavior is highly sensitive to atmospheric and biophysical conditions; declining fuel moisture driven by warming and drought has increased global fire risk, even in typically moist ecosystems [
18,
20]. Lightning-caused wildfires are expected to rise by 39–65% per +1 °C increase, particularly in dense evergreen forests [
28]. In southern Europe, fire danger and burned area are projected to increase by up to 50% per decade under high-emission scenarios [
25]. However, in mainland China, strict fire suppression policies have mitigated climate impacts, especially in southern regions [
29]. Conversely, in Canadian forests, fire spread occurs under milder conditions in non-mountainous, conifer-dense areas [
30].
In the United States, wildfires have grown significantly larger and more frequent since the 2000s, reflecting intensified climate-change effects and increased concurrent extreme fires [
31]. Globally, fire activity is escalating particularly in Mediterranean and high-latitude forests due to intensifying fire weather, although regional outcomes vary due to bioclimatic and human factors [
32]. Effective wildfire mitigation in the U.S. increasingly requires understanding the combined social, biological, and physical triggers of wildfire ignitions and impacts [
33]. Warming-driven increases in fuel aridity have doubled the burned area in the Western United States relative to a no-climate-change scenario [
34].
Ecological consequences of heightened fire activity are severe. Canada’s 2023 wildfires emitted over 1.3 petagrams of CO₂, surpassing its decade-long carbon mitigation targets [
33]. Arctic fires are also alarming due to their potential carbon feedback loops from peat soil ignition [
33]. Globally, fire-prone zones may expand by 29%, particularly in boreal (+111%) and temperate (+25%) biomes [
35]. Human-induced climate change roughly doubled the probability of extreme wildfire-conducive conditions during Southwest France’s 2022 season [
36]. Rapid Arctic warming has driven record-breaking wildfire seasons in Siberia [
37]. Wildfire frequency is increasing across South Asia; for example, India’s annual forest fire incidents rose from ~8430 in 2005 to over 104,500 by 2021 [
38].
Expansion of the WUI significantly escalates wildfire risk. Between 1985 and 2020, WUI areas expanded globally by over 12%, with 10 million people residing in high-risk zones [
39]. Since 2000, urbanization-driven WUI expansion increased by 35.6%, amplifying human exposure to wildfires [
40]. As a result, WUI fires disproportionately affect air quality and health due to dense populations [
41]. Recent estimates indicate that WUI regions comprise 4.7% of Earth’s surface yet house 3.5 billion people [
6]. Case studies reinforce this concern. In the U.S., the number of homes within wildfire perimeters has doubled since the 1990s [
18]. In California, the combination of climate change and WUI growth has raised extreme wildfire risk more than fourfold, primarily due to human-caused ignitions near populated areas [
19].
These global and regional patterns resonate with South Korea’s experience. In South Korea, recurrent severe wildfires along the east coast require enhanced risk assessments. For example, fire risk was assessed using grid-based models that incorporated topography, vegetation, and human infrastructure proximity [
42]. Regional variability has been observed, with the Yeongdong area showing heightened susceptibility to surface and crown fires due to fuel characteristics [
43]. In addition, building materials and urban layout significantly influence fire spread in WUI contexts [
44]. More recently, the need to protect LPG stations in WUI–Petroleum zones has been underscored, given the associated explosion risks [
13]. As fire-prone environments continue to shift with climate and land-use changes, comprehensive, interdisciplinary approaches are essential to manage and mitigate future wildfire threats.
3. Materials and Methods
3.1. Data Source
We obtained wildfire records from the Korea Forest Service (KFS) Wildfire Statistics and meteorological observations from the Korea Meteorological Administration (KMA) Automated Synoptic Observing System (ASOS) and Automatic Weather Station (AWS) databases. The KFS statistics compile standardized fire records through field inspections and local fire department reports, and the KMA databases provide quality-controlled weather station data. We focused on large wildfires (burned area ≥ 5 ha) nationwide from 1980 through 2024. We chose 1980 as the start year because national wildfire statistics and variable definitions are consistently available from this period onward, while earlier records are sparse and not directly comparable; a 45-year window also captures multi-decadal climate variability, forest ageing/reforestation, major policy phases, and landmark events (e.g., the 2000 East Coast fires; the 2022 Uljin–Samcheok fires). We end in 2024 to include the most recent complete fire seasons available at the time of analysis.
Spatial resolution and temporal intervals. KMA ASOS/AWS data are point-based station observations aggregated to daily metrics (event-day averages and maxima) and matched to each fire’s administrative unit (Si/Gun/Gu) using the nearest station within the same province. Forest/land-use attributes are compiled from the KFS Forest Statistical Yearbook and related Forest Spatial Information products at the Si/Gun/Gu level, with annual reporting where available; variables released on inventory cycles (e.g., age-class distributions from the National Forest Inventory) are aligned to event years via nearest-year assignment.
The dataset includes over 905 fire events and contains details on: burned area (hectares) as the dependent variable; forest factors (timber stock volume total and per hectare, species composition—counts and ratios of coniferous, broadleaf, mixed forests, forest age classes I–VI counts and ratios, where V–VI indicates mature forest, forest area, and forest coverage rate of the region); climate factors (daily average wind speed, maximum wind speed on the day of fire, daily average temperature, and average relative humidity). To enhance representativeness across decades and reduce sensitivity to evolving detection/reporting practices, we restricted analysis to large events (≥5 ha) and used burned area (ha) as the primary response—measures that are less affected than raw event counts by surveillance intensity. We also summarised patterns by multi-year intervals (see
Figure 1 and
Figure 2) to contextualise decadal shifts and check that inference is not dominated by any single epoch. Because an annually consistent, nation-wide WUI time series was unavailable for 1980–2024, WUI effects are discussed qualitatively and flagged as a limitation for future geospatial integration.
3.2. Data Preprocessing
Data were rigorously preprocessed to ensure quality and reliability. Duplicate or erroneous entries were removed based on verification through event dates and locations. Missing values were handled by imputing regional averages for climate-related data, thereby minimizing potential bias [
45]; all imputations were flagged for transparency and later sensitivity checks. Forest variables with gaps, such as unknown age-class distributions, were filled using the nearest available year (earlier or later) within the same region, assuming gradual temporal changes in forest composition [
46].
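The nearest-year gap filling described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `nearest_year_value`, the sample ratios, and the rule that ties go to the earlier year are all assumptions made for the example.

```python
def nearest_year_value(records, target_year):
    """Return (source_year, value) for the inventory year closest to target_year.

    `records` maps inventory year -> value for one region.
    Ties are broken in favour of the earlier year (an assumption of this sketch).
    """
    if not records:
        raise ValueError("no inventory years available for this region")
    source_year = min(records, key=lambda year: (abs(year - target_year), year))
    return source_year, records[source_year]


# Hypothetical mature-stand ratios (%) for one region, by inventory year.
mature_ratio = {1996: 12.0, 2006: 21.5, 2016: 34.0}

source_year, value = nearest_year_value(mature_ratio, 2010)
print(source_year, value)
```

Returning the source year alongside the value makes it straightforward to flag filled records for later sensitivity checks.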
Coordinate and schema harmonisation. Event locations were standardised to administrative units (Si/Gun/Gu) and, where coordinates were available, projected to WGS84 (EPSG:4326). Variable names/units were normalised across vintages.
All variables were standardized for consistency. Wind speed measurements were converted to meters per second (m/s), and temperature measurements to degrees Celsius (°C). Additionally, we derived ratios (percentages) for forest composition types (coniferous, broadleaf, and mixed forests) and mature forest classes (age class V–VI) to facilitate comparable analysis across varying regional sizes. Where appropriate, continuous predictors were centred and scaled to aid comparability and numerical stability. To address potential secular changes in observing systems over 1980–2024, we (i) adopted the ≥ 5 ha threshold, (ii) modeled burned area rather than only counts, and (iii) examined time-slice summaries to verify that results are consistent across periods.
Climatic data QC and matching. KMA station records were screened for physically implausible values (e.g., RH < 2% or >100%), unit-harmonised, and aggregated to daily event-day metrics (Avg Wind, Max Wind, Avg Temperature, Avg RH). If the nearest station had missing values on the event day, the next-nearest station within the same province was used; short gaps were imputed with provincial daily means and flagged.
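A minimal sketch of this screening and fallback logic for relative humidity is given below. It assumes that implausible readings are treated the same way as missing ones, which the text does not state explicitly; the function name `event_day_rh` and the flag labels are hypothetical.

```python
def event_day_rh(station_values, provincial_mean, rh_min=2.0, rh_max=100.0):
    """Pick the event-day RH from stations ordered nearest first.

    `station_values` is a list of RH readings (%) with None for missing data.
    Readings outside [rh_min, rh_max] are skipped as implausible (assumption).
    Returns (value, flag) so that imputed records can be identified later.
    """
    for rank, rh in enumerate(station_values):
        if rh is not None and rh_min <= rh <= rh_max:
            return rh, "nearest" if rank == 0 else "next_nearest"
    # No usable station in the province: fall back to the provincial daily mean.
    return provincial_mean, "imputed_provincial_mean"


print(event_day_rh([None, 35.0], provincial_mean=40.0))
print(event_day_rh([120.0, None], provincial_mean=41.0))
```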
Multicollinearity among independent variables was assessed using Pearson correlation and Variance Inflation Factor (VIF) analyses [
47]. Variables exhibiting a VIF greater than 5 were identified, and during subsequent regression modeling, we employed a stepwise selection approach to exclude highly collinear variables, giving priority to more interpretable ratio-based metrics [
48]. We note that VIF thresholds of 5–10 are commonly reported in the literature; here we chose the more conservative cutoff of 5 to minimize redundancy and improve interpretability of coefficients.
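To illustrate the VIF screening, the sketch below computes VIFs with plain NumPy, using the identity that the VIF of predictor j equals the j-th diagonal element of the inverse correlation matrix. It is a stand-in for, not a reproduction of, the Statsmodels routine used in the study, and the synthetic predictors are invented for the example.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (n_samples, n_features)."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Synthetic predictors (hypothetical): conifer and broadleaf ratios are
# strongly inversely related, while wind is independent of both.
rng = np.random.default_rng(0)
n = 500
conifer = rng.uniform(0, 100, n)
broadleaf = 100 - conifer + rng.normal(0, 5, n)
max_wind = rng.uniform(0, 15, n)

names = ["conifer_ratio", "broadleaf_ratio", "max_wind"]
vifs = vif(np.column_stack([conifer, broadleaf, max_wind]))
flagged = [name for name, value in zip(names, vifs) if value > 5]
print(dict(zip(names, np.round(vifs, 2))), flagged)
```

With a cutoff of 5, the two composition ratios are flagged as collinear, so only one of them would be carried forward in a linear model.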
Average temperature was included in an initial model but became insignificant when max wind and RH were present (it dropped out at p ≈ 0.15)—its effect seems largely captured by these correlated weather variables. The p-value cutoff of 0.15 for variable exclusion follows recommendations in applied ecological and climatic modeling studies, which suggest that a slightly relaxed threshold prevents premature removal of variables that may contribute under interaction or nonlinear modeling frameworks. To ensure robustness, we further cross-checked variable stability by examining confidence intervals and sensitivity analyses.
3.3. Correlation Analysis (Data, Metrics, and Computation)
Correlation analysis was initially conducted using Pearson’s correlation coefficient, computed in Python (version 3.8.12) with the SciPy library (version 1.7.1); heatmaps were generated using the Seaborn library (version 0.11.2). These visualizations and statistics helped detect preliminary relationships among variables. Significant correlations guided further in-depth analyses, particularly indicating strong negative associations between humidity and burned area, and positive associations between wind speed and burned area [
49].
3.4. Multiple Linear Regression (MLR)
An MLR model was constructed as follows:
$$Y = \beta_0 + \sum_{j=1}^{k} \beta_j X_j + \varepsilon$$
where Y is the burned area (ha) and β0 is the intercept.
The model fitting was performed using Python (version 3.8.12) and the Statsmodels library (version 0.14.0). Variance inflation factors (VIFs) were calculated using the variance_inflation_factor function in Statsmodels to assess multicollinearity. Although logistic regression and other generalized linear models (GLMs) for categorical outcomes are popular, they were not appropriate here because the response variable (burned area) is continuous; we therefore used a multiple linear regression, which is equivalent to a GLM with a Gaussian family and identity link.
Here, Xj included selected forest and climate variables based on previous correlation analyses and VIF results. The model initially comprised approximately 10 predictors. We progressively refined the model by removing variables with non-significant coefficients (p > 0.1) to avoid overfitting, particularly given the sample size of approximately 900 events. In this equation, k represents the total number of predictor variables used in the final model, βj denotes the coefficient for each predictor Xj, and ε is the error term.
Assumptions including normality of residuals, homoscedasticity, and independence were assessed. Residual plots indicated mild heteroskedasticity; thus, robust standard errors (HC3) were applied for inference [
50].
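For readers unfamiliar with HC3, the sketch below fits ordinary least squares and computes HC3 heteroskedasticity-robust standard errors directly in NumPy. It is a hand-rolled illustration on synthetic data (the coefficients 8 and −3 and the noise structure are invented), not the Statsmodels workflow used in the study.

```python
import numpy as np


def ols_hc3(X, y):
    """OLS coefficients and HC3 robust standard errors.

    X must already contain an intercept column.
    HC3 covariance: (X'X)^-1 X' diag(e_i^2 / (1 - h_ii)^2) X (X'X)^-1.
    """
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)  # hat-matrix diagonal
    omega = (resid / (1.0 - leverage)) ** 2
    cov = xtx_inv @ (X.T * omega) @ X @ xtx_inv
    return beta, np.sqrt(np.diag(cov))


# Synthetic data (hypothetical): the error variance grows with wind speed,
# which mimics the mild heteroskedasticity reported for the residuals.
rng = np.random.default_rng(1)
n = 2000
max_wind = rng.uniform(0, 15, n)
rh = rng.uniform(10, 90, n)
y = 10.0 + 8.0 * max_wind - 3.0 * rh + rng.normal(0, 1, n) * (1.0 + max_wind)

X = np.column_stack([np.ones(n), max_wind, rh])
beta, se = ols_hc3(X, y)
print(np.round(beta, 2), np.round(se, 3))
```

The point estimates are the same as in ordinary OLS; only the standard errors, and therefore the p-values and confidence intervals, change.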
3.5. Random Forest Modeling
To capture potential nonlinear relationships and interactions among variables, a Random Forest (RF) model was employed [
51]. This model was implemented using the RandomForestRegressor class from the Scikit-Learn library (version 1.1.3) in Python (version 3.8.12). The model was configured with 1000 trees, using bootstrap sampling and evaluating a random subset of predictors at each split.
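A configuration of this kind might look like the following sketch, which also includes a 70/30 split, the out-of-bag score, and permutation importance. The data are synthetic, `max_features="sqrt"` is an assumed choice for the "random subset of predictors", and the plain random split does not reproduce any balancing by period or region.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): two informative predictors, one noise.
rng = np.random.default_rng(42)
n = 300
max_wind = rng.uniform(0, 15, n)
rh = rng.uniform(10, 90, n)
noise_feature = rng.normal(0, 1, n)
X = np.column_stack([max_wind, rh, noise_feature])
y = 8.0 * max_wind - 3.0 * rh + rng.normal(0, 10, n)
feature_names = ["max_wind", "rh", "noise_feature"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = RandomForestRegressor(
    n_estimators=1000,    # number of trees, as stated in the text
    bootstrap=True,       # bootstrap sampling of training rows
    oob_score=True,       # out-of-bag R^2 for internal validation
    max_features="sqrt",  # assumed size of the random predictor subset
    random_state=0,
)
model.fit(X_train, y_train)

# Permutation importance on held-out data: drop in score when a column is shuffled.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranking = [feature_names[i] for i in np.argsort(result.importances_mean)[::-1]]
print(round(model.oob_score_, 2), ranking)
```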
Mathematically, the RF prediction is the average of outputs from all individual decision trees: $\hat{y} = \frac{1}{T} \sum_{t=1}^{T} f_t(x)$, where T is the total number of trees and $f_t(x)$ is the prediction from the t-th tree. Data was randomly partitioned into training (70%) and testing (30%) subsets, ensuring balanced temporal and regional representation. Model performance was internally validated using out-of-bag (OOB) error estimation, and variable importance was assessed using permutation-based importance metrics [
52]. While our models assume independent observations, wildfire occurrences can exhibit spatial clustering. We therefore assessed Moran’s I on the regression residuals and found no significant spatial autocorrelation (
p > 0.1), suggesting the residuals are spatially independent. Nevertheless, we recognize that incorporating spatial effects (e.g., spatial lag terms or region-specific random effects) could further improve the model and should be explored in future work.
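For reference, Moran's I can be computed from residuals and a spatial weight matrix as in the sketch below. The chain-adjacency weights and the two test patterns are hypothetical, and the significance test (for example, a permutation test) is omitted for brevity.

```python
import numpy as np


def morans_i(values, weights):
    """Moran's I = (n / sum(w)) * (z' W z) / (z' z), with z the centred values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    w = np.asarray(weights, dtype=float)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)


def chain_adjacency(n):
    """Binary weights for n regions in a row, each adjacent to its neighbours."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w


w = chain_adjacency(6)
i_gradient = morans_i([1, 2, 3, 4, 5, 6], w)        # smooth trend: clustered
i_alternating = morans_i([1, -1, 1, -1, 1, -1], w)  # checkerboard: dispersed
print(i_gradient, i_alternating)
```

Values near zero, as reported for the regression residuals, indicate neither clustering nor dispersion.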
3.6. XGBoost Modeling and SHAP Interpretation
Recent studies have demonstrated the growing role of AI and satellite systems in wildfire forecasting and detection. XGBoost-SHAP modeling [
53], VIIRS/Himawari-8 satellite integration [
54], and deep learning–based smoke detection [
55] exemplify operational advances since 2020. In this study, Extreme Gradient Boosting (XGBoost) models were implemented using the XGBoost library (version 1.6.0) in Python (version 3.8.12), and hyperparameter tuning was conducted via GridSearchCV from Scikit-Learn (version 1.1.3) [
56]. Mathematically, XGBoost minimizes a regularized objective function at each boosting iteration: $L^{(t)} = \sum_{i} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)$, where l is a loss function (e.g., squared error), $\hat{y}_i^{(t-1)}$ is the previous prediction, $f_t(x_i)$ is the new tree, and $\Omega(f_t)$ penalizes model complexity.
We performed a grid-search 5-fold cross-validation for XGBoost hyperparameter tuning. Tested parameters included max_depth (3–9), learning_rate (0.01–0.3), n_estimators (50–500), subsample (0.5–1.0), and colsample_bytree (0.5–1.0). The optimal combination found was max_depth = 5, learning_rate = 0.10, n_estimators = 200, subsample = 0.8, colsample_bytree = 0.8. These final hyperparameters were selected based on the lowest cross-validation RMSE, and are summarized in
Table A1.
To interpret complex model interactions and feature importance clearly, SHapley Additive exPlanations (SHAP) values were calculated using the SHAP Python library (version 0.41.0) [
57]. SHAP summary plots illustrated feature impacts across the dataset, while individual force plots explained model behavior for extreme fire events qualitatively, enhancing interpretability.
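The idea behind SHAP values can be illustrated with an exact, brute-force Shapley computation on a toy model. This is not the algorithm used by the SHAP library for tree ensembles; the toy model, the instance, and the single-baseline treatment of "absent" features are simplifying assumptions of the sketch.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical burned-area model with a wind x dryness interaction."""
    wind, rh = z
    return 8.0 * wind - 3.0 * rh + 0.5 * wind * (100.0 - rh)


x = [12.0, 20.0]         # windy, dry event day
baseline = [4.0, 60.0]   # calm, humid reference day
phi = shapley_values(toy_model, x, baseline)
print(phi, toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction for the event and the prediction for the reference day, which is the additivity property that makes force plots readable; the interaction term is shared between wind and humidity.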
3.7. Wildland–Urban Interface (WUI) Assessment
The influence of the Wildland–Urban Interface (WUI) on wildfire outcomes was qualitatively assessed using documented domestic case studies. Specific events where fires impacted infrastructure and residential areas within WUI zones were examined to contextualize and discuss the role of proximity between settlements and forested regions in exacerbating wildfire damage [
58]. This provided valuable insight into the socio-economic dimension of wildfire danger in South Korea, which informed subsequent recommendations and policy considerations. Because of data limitations, WUI influence was examined only qualitatively through case studies, and no specific WUI predictor was included in our models. We acknowledge this as a limitation; future analyses should integrate geospatial WUI indicators or proxies (e.g., housing or road density) to better capture human presence in fire-prone areas.
AI-based language tools (ChatGPT, OpenAI, San Francisco, CA, USA) were used to assist with language editing and improving readability. All content was subsequently reviewed and validated by the authors.
4. Results
4.1. Correlation Analysis: Forest, Climate vs. Burned Area
The heatmap of correlation coefficients (Figure 1) shows that climate factors are most strongly correlated with burned area. Maximum wind speed had the highest Pearson correlation with burned area (r = 0.75, p < 0.001), followed by average wind speed (r = 0.68, p < 0.001) and relative humidity (r = −0.55, p < 0.001). This indicates that wind, particularly at higher speeds, plays a dominant role in driving fire behavior. Average temperature also correlated positively (r = 0.47, p < 0.01). These results suggest that large wildfires are strongly associated with windy, dry, and warm conditions. Forest variables showed generally weaker, yet statistically significant, correlations. Forest coverage rate (r = 0.50) and forest area (r = 0.44) were moderately correlated with burned area, suggesting that larger or denser forests may offer more continuous fuel, potentially facilitating more extensive fires. The coniferous tree ratio was positively correlated (r = 0.20, p < 0.05), while the broadleaf tree ratio showed a negative correlation (r = −0.22, p < 0.05). This supports the hypothesis that conifer-dominated forests, due to their resinous nature, are more prone to larger fires compared to broadleaf forests. The mixed forest ratio was essentially uncorrelated (r ≈ 0), indicating no linear preference in fire size. The proportion of mature forests (Age V–VI) showed a positive correlation (r = 0.33, p < 0.01), suggesting that older stands may contribute to larger fires due to the accumulation of dry fuel material over time.
These correlation results illustrate the interplay between climate and fuel: extreme weather conditions (such as windy, low-humidity days) are conducive to large fires, but in their absence, even fuel-rich environments may not experience significant fire spread [
59]. Conversely, under adverse weather, even areas with limited fuel can suffer large fires. In short, weather sets the stage for potential fire behavior, and fuel conditions modulate the extent of fire spread. This dynamic is analogous to the distinction between fire intensity and burn severity [
59].
4.2. Multiple Linear Regression (Independent Effects)
Our MLR model (
Table 1) confirmed many of the above correlations while controlling for others. The model’s R² of 0.62 indicates a substantial portion of variability is explained by the included factors. In addition to reporting coefficients, we calculated 95% confidence intervals (CIs) for each parameter estimate to reflect uncertainty in effect sizes (
Table 1). For example, the effect of maximum wind speed (β = +8.47) had a 95% CI of approximately +6.4 to +10.5 ha, confirming the robustness of this predictor.
Climate factors: Max wind speed emerged as the strongest predictor (β = +8.47 ± 1.05, p < 0.001), meaning each additional m/s in max wind is associated with ~8.5 ha more burned, holding other factors constant. Relative humidity had a significant negative coefficient (β = −3.15 ± 0.88, p = 0.002); higher humidity significantly lowers burned area. Average temperature was included in an initial model but became insignificant when max wind and RH were present (it dropped out at p ~ 0.15)—its effect seems largely captured by these correlated weather variables. These findings quantitatively support that wind and dryness are primary drivers.
Forest factors: Forest coverage rate had a positive independent effect (β = +0.92 ± 0.43, p = 0.03). Denser forest cover contributes to larger fires, likely by enabling fire spread through contiguous canopy and ground fuels. Broadleaf ratio showed a significant negative effect (β = −0.51 ± 0.24, p = 0.04), implying broadleaf-dominated forests experience smaller fires even after accounting for weather—possibly due to higher moisture content and lower flammability of broadleaf litter. Coniferous ratio had a positive coefficient (β = +0.45 ± 0.25) with marginal significance (p = 0.07). While not definitive, it suggests a trend that more conifers lead to larger fires. Mature forest ratio also showed a marginally significant positive effect (β = +0.37, p = 0.08), echoing that older stands (with heavy fuel loads) can exacerbate fires. Timber stock volume per hectare had a very small, non-significant coefficient (β ≈ +0.002, p = 0.20), indicating that once we account for forest cover and composition, total biomass density does not add much predictive power (likely because volume and composition are interrelated, and composition is more directly tied to flammability).
Model implications: In essence, the regression suggests that for two otherwise identical forested regions, a day that is 5 m/s windier would be predicted to burn roughly 42 ha more, and a day with relative humidity 20 percentage points higher would be predicted to burn roughly 63 ha less, demonstrating how powerful those factors are. Meanwhile, a region with a 20-percentage-point higher broadleaf ratio would see ~10 ha less burned, indicating that vegetation management (promoting broadleaf species) can mitigate fire spread to an extent. The modest coefficient on forest cover (≈0.92 ha per percentage point) accumulates over large differences: for example, a region with 50% forest cover vs. 30% could expect ~18 ha more burned, all else equal. The regression’s residuals did not show strong bias against any particular period or region, suggesting it captured general patterns reasonably well. However, some extreme fires (e.g., the 2000 and 2019 events) were under-predicted, indicating that such events involve compounding factors beyond additive effects, a hint that interactive modeling (such as Random Forest,
Section 4.3) is warranted.
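The arithmetic behind these implications can be checked directly from the reported coefficients, as in the short sketch below. The dictionary keys and helper names are hypothetical, and the confidence interval assumes that the reported ± 1.05 is a standard error and that a normal approximation applies.

```python
# Reported MLR coefficients (ha per unit change in each predictor).
coef = {"max_wind": 8.47, "rh": -3.15, "broadleaf": -0.51, "forest_cover": 0.92}


def delta_burned_area(changes):
    """Predicted change in burned area (ha) for the given predictor changes."""
    return sum(coef[name] * change for name, change in changes.items())


def normal_ci(estimate, std_error, z=1.96):
    """Approximate 95% confidence interval under a normal approximation."""
    return estimate - z * std_error, estimate + z * std_error


print(delta_burned_area({"max_wind": 5}))        # 5 m/s windier
print(delta_burned_area({"rh": 20}))             # RH 20 points higher
print(delta_burned_area({"broadleaf": 20}))      # broadleaf ratio 20 points higher
print(delta_burned_area({"forest_cover": 20}))   # 50% vs. 30% forest cover
print(normal_ci(8.47, 1.05))                     # CI for the max-wind coefficient
```

Because the model is additive, combined scenarios are simply the sum of the individual effects, which is one reason the largest, compound events are under-predicted.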
4.3. Random Forest Analysis (Feature Importance)
The Random Forest achieved an OOB R² of ~0.65 on training data and ~0.64 on test data, slightly higher than the linear model, indicating it captured non-linear effects effectively. To assess robustness, we repeated the RF training across multiple bootstrap resamples and cross-validation folds. Variable importance rankings remained stable, with maximum wind speed and relative humidity consistently emerging as the top two predictors in more than 90% of runs, underscoring the generalizability of the findings. The permutation importance results (
Figure 2) identified max wind speed and relative humidity as the top two features by a large margin, aligning with their known critical role. Average temperature was the third most important, more so than in the linear analysis; the RF likely used temperature to discriminate fire growth under, for example, moderately windy conditions. Forest coverage ranked fourth, showing that in the ensemble model, areas of contiguous forest consistently contributed to fire spread outcomes. Notably, coniferous ratio outranked broadleaf ratio in importance. Because the two are inversely related, the RF can split on either one, and it likely favored coniferous ratio because of its more consistent relationship with larger fires. The importance of the age class V–VI ratio may indicate that the model detects threshold effects: beyond a certain proportion of mature forest, combined with particular weather conditions, fire risk could increase.
An interesting observation is that timber stock volume, though not significant in linear analyses, was moderately important in the RF model. This suggests that stands with exceptionally high biomass loads—such as densely stocked pine plantations—may contribute to intense fires in ways that are not captured by percentage-based variables. The RF model appears capable of detecting threshold effects, whereby fire intensities rise sharply when timber volume exceeds certain levels during dry conditions. We also examined Random Forest’s partial dependence plots to understand predictor relationships clearly. Predicted burned area consistently increased with rising maximum wind speed, confirming wind’s known influence on fire spread. Conversely, higher relative humidity corresponded to reduced predicted burned area, as expected due to moisture’s mitigating effect on ignition and spread.
Temperature exhibited a nonlinear relationship; predictions rose steadily with temperatures up to approximately 15–20 °C, then plateaued or slightly decreased. This pattern might reflect fewer fires during mid-summer due to increased rainfall, suggesting an interaction between temperature and seasonal rainfall patterns.
Forest cover showed another nonlinear effect: as cover increased up to about 50%, predicted fire size grew substantially. Beyond 50%, further increases in cover added little to predicted fire size, possibly because very high cover tends to coincide with wetter forest conditions or with other factors that reduce flammability.
Conifer ratio displayed clear threshold behavior. Forests with low conifer ratios (below 20%) were associated with markedly lower predicted wildfire risk. Above approximately 50% conifer coverage, predicted risk plateaued at a high level, indicating that additional conifers beyond this point have minimal incremental effect.
Age ratio revealed an important threshold at around 30% mature forest composition, beyond which predicted fire sizes substantially increased. This indicates a potential tipping point in fuel connectivity, where sufficient mature vegetation substantially elevates wildfire risk.
Overall, the Random Forest findings align closely with the regression results in identifying the key variables, while additionally revealing thresholds and nonlinear relationships (for temperature, forest cover, conifer ratio, and age ratio) that are relevant to wildfire management strategies.
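For readers unfamiliar with partial dependence, the curves discussed in this section can be computed directly from the definition: fix one predictor at a grid value for every observation, predict, and average. The sketch below assumes scikit-learn and uses synthetic data; the variable ranges, the linear data-generating rule, and the helper name `partial_dependence_1d` are illustrative assumptions, not the study's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data only: column 0 stands in for maximum wind speed (m/s),
# column 1 for relative humidity (%). The response rule is made up.
rng = np.random.default_rng(1)
n = 500
wind = rng.uniform(0, 20, size=n)
humidity = rng.uniform(10, 90, size=n)
X = np.column_stack([wind, humidity])
y = 2.0 * wind - 0.5 * humidity + rng.normal(scale=2.0, size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def partial_dependence_1d(model, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite one column, keep the rest as observed
        curve.append(model.predict(X_mod).mean())
    return np.array(curve)

wind_grid = np.linspace(2, 18, 5)
humidity_grid = np.linspace(20, 80, 5)
pd_wind = partial_dependence_1d(model, X, feature=0, grid=wind_grid)
pd_humidity = partial_dependence_1d(model, X, feature=1, grid=humidity_grid)

for v, p in zip(wind_grid, pd_wind):
    print(f"wind={v:5.1f}  mean prediction={p:7.2f}")
```

Because the curve averages over the observed values of the other predictors, it can be misleading when predictors are strongly correlated, which is worth keeping in mind for inversely related variables such as coniferous and broadleaf ratio.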
4.4. XGBoost with SHAP (Interaction Interpretability)
XGBoost performed similarly to RF (test R² of ~0.66) but is easier to interpret with SHAP. We further examined robustness by performing 5-fold cross-validation with repeated random splits. Across folds, SHAP value patterns remained consistent, particularly for maximum wind speed, relative humidity, and conifer ratio, which showed stable contributions to predicted fire size. This stability across data partitions indicates that the model’s interpretability results are not artifacts of a single training set. The SHAP summary plot (
Figure 3) visualized the effect of each feature on the prediction across all fires. For example, almost all instances with high max wind had positive SHAP values (indicating larger predicted fire size), confirming that high wind consistently pushes fire size upward. Low humidity also showed mostly positive SHAP contributions to fire size. Temperature had a more mixed influence (some high-temperature instances contributed strongly, but others not as much, likely due to interactions with humidity).
Forest cover and composition variables had more moderate SHAP effects; notably, a high broadleaf ratio usually contributed negative SHAP values (reducing predicted fire size), whereas a high coniferous ratio contributed positive SHAP values in many cases. These patterns reinforce our findings that weather factors are paramount, with forest factors modulating outcomes. SHAP analysis also highlighted a few specific cases: for instance, one large fire in 1996 burned through predominantly oak forest under severe drought conditions. Broadleaf cover normally lowers predicted fire size, but the extreme drought negated that advantage, and the SHAP values reflected that anomaly.
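To make the notion of a per-feature SHAP contribution concrete, the sketch below computes exact Shapley values by brute force for a small hypothetical scoring function. It is not the fitted XGBoost model and does not use the `shap` library, whose tree explainers rely on tree-specific algorithms and a background dataset; here a single made-up baseline point plays that role. The function `toy_model`, its coefficients, and both input points are assumptions for illustration only.

```python
from itertools import combinations
from math import factorial

def toy_model(x):
    """Hypothetical fire-size score with a wind x dryness interaction."""
    wind, humidity, conifer = x
    return (2.0 * wind - 0.1 * humidity + 5.0 * conifer
            + 0.05 * wind * (100 - humidity))

def shapley_values(f, x, baseline):
    """Exact Shapley attribution of f(x) - f(baseline) to each feature."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value; others stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

baseline = [5.0, 60.0, 0.3]    # made-up "typical" conditions
windy_dry = [15.0, 20.0, 0.7]  # made-up windy, dry, conifer-heavy case
phi = shapley_values(toy_model, windy_dry, baseline)

for name, contribution in zip(["max_wind", "rel_humidity", "conifer_ratio"], phi):
    print(f"{name:14s} {contribution:+.2f}")
```

The contributions sum to the difference between the prediction for the case and the prediction at the baseline, which is the additivity property that lets a SHAP summary plot be read as a decomposition of each prediction. In this toy case the interaction term is split between wind and humidity, so low humidity receives a positive contribution even though its direct coefficient is small.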
4.5. Comparative Analysis of WUI Dynamics and Wildfire Behavior (Empirical Results)
The WUI is increasingly recognised as a significant concern alongside climate change [
14]. The present study focused primarily on burned area, but WUI dynamics involve the propagation of fire into areas with human structures. In recent Korean cases, when fires encroach on WUI zones—where the decline of traditional farmland has removed natural fuel breaks—suppression becomes more complex: firefighters must prioritise the protection of lives and property, which can allow fires to expand. Various studies have identified risk factors associated with the WUI. Park et al. (2024) emphasised the importance of understanding interactions between petroleum facilities, such as LPG filling stations, and wildfire risks in WUI–Petroleum areas [
13]. Garner and Kovacik (2022) reported that extreme combinations of strong winds and low relative humidity occur near densely populated areas in southern California, and that fuel treatments are less effective under such extreme weather [
60]. Wasserman and Mueller (2023) showed that increased temperature and precipitation variability intensify droughts and wildfire severity in the Western United States, implying similar risks may exist in Northeast Asia [
61]. This research extends these findings by integrating climatic variables with forest composition and age structure. Unlike the KO-G Dynamic forest growth model used by Hong et al. (2022), which focused on forest carbon and growth dynamics [
62], the present approach directly models wildfire occurrence and size using climate extremes and WUI factors. By jointly considering human and environmental drivers, the analysis indicates that as Korean forests transition from young to over-mature stands, growth slows and vulnerability to disturbances, including wildfires, increases. This underscores the need for proactive forest management strategies to mitigate the fire hazards associated with aging stands.
4.6. Wildfire Damage Area in South Korea (1980–2024, 5-Year Interval)
An analysis of wildfire damage area from 1980 to 2024 in five-year intervals reveals a gradual increase in overall fire damage.
Figure 4 illustrates the trend of large-scale wildland fire damage area and the number of large-scale wildland fires for each period, while
Figure 5 maps the spatial distribution of damage across the country. In addition,
Figure 6 presents the monthly distribution of large wildfires (≥5 ha), showing a pronounced spring peak in March–April, and
Figure 7 summarizes provincial totals, highlighting the concentration of burned area in North Gyeongsang and Gangwon, with comparatively low totals in metropolitan areas. The period from 2020 to 2024 recorded an unprecedented level of damage, which can be attributed to a combination of rising temperatures, changes in forest structure, and limitations in wildfire suppression methods. Extreme droughts and heatwaves under climate change, which have produced record-breaking wildfire seasons across multiple regions, contributed to this increase. These conditions culminated in large wildfires; for example, the 2022 Uljin–Samcheok wildfire burned more than 17,000 ha, making it one of the largest modern wildfires in South Korea [
63].
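The five-year aggregation underlying this section can be sketched as follows. The records are made up, and the rule of counting only fires of at least 5 ha is an assumption borrowed from the large-fire threshold mentioned elsewhere in the paper; the helper names `period_label` and `summarize` are hypothetical.

```python
from collections import defaultdict

def period_label(year, start=1980, width=5):
    """Map a year to its five-year interval label, e.g. 1987 -> '1985-1989'."""
    lower = start + ((year - start) // width) * width
    return f"{lower}-{lower + width - 1}"

def summarize(fires, min_area_ha=5.0):
    """Sum burned area and count fires of at least `min_area_ha` per interval."""
    totals = defaultdict(lambda: {"burned_ha": 0.0, "large_fires": 0})
    for year, area_ha in fires:
        if area_ha >= min_area_ha:
            bucket = totals[period_label(year)]
            bucket["burned_ha"] += area_ha
            bucket["large_fires"] += 1
    return dict(totals)

# Made-up records (year, burned area in ha), purely for illustration.
records = [(1982, 12.0), (1984, 3.0), (1986, 40.0), (1989, 7.5), (2022, 150.0)]
summary = summarize(records)

for label in sorted(summary):
    row = summary[label]
    print(f"{label}: {row['burned_ha']:.1f} ha in {row['large_fires']} large fire(s)")
```

Anchoring the bins at 1980 with a fixed width of five years yields nine intervals through 2020-2024, matching the span named in the section title.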
Figure 4 shows that both the area burned and the number of large-scale wildland fires were relatively low in 1980–1984, the first interval, with only about 85 large fires recorded. However, historical underreporting or incomplete data cannot be ruled out. Most fires were small-scale and locally contained; agricultural fires (e.g., burning of field residues) occasionally spread into nearby forests.
Figure 5 indicates that damage was concentrated in Gangwon and Gyeongsang provinces, but overall intensity remained modest. This period overlaps with the 2nd Basic Forest Plan (1979–1987), which emphasized afforestation and erosion control (e.g., reforestation and slope stabilization). Regulatory frameworks under the Forestry Act were being updated, yet there was no specialized wildfire management system. Basic preventive measures (patrols, limited public awareness campaigns) were in place, but resources for aerial firefighting or specialized fire brigades were minimal.
According to
Figure 4, wildfire damage doubled in 1985–1989 and the number of large-scale fires rose to 115, reflecting rapid industrialization and urbanization that led to rural depopulation and reduced maintenance of forest–agricultural interfaces.
Figure 5 highlights expanded areas of fire damage, especially along the eastern coast and in Gangwon province. Accumulating forest fuel loads (leaf litter, deadwood) and drier conditions contributed to more frequent and severe ignitions. Still under the 2nd Basic Forest Plan, the Korea Forest Service (KFS) attempted to expand fire prevention (local monitoring, patrols), but large-scale fire response capacity remained limited. Mountainous regions saw incremental improvements in watchtower construction and volunteer firefighter training, yet overall capacity to respond to wildfires was insufficient.
Figure 4 shows a decline in both burned area and large-fire counts (about 142) in 1990–1994, possibly due to localized increases in precipitation, more effective early detection, and incremental improvements in suppression efforts.
Figure 5 maps a reduction in high-intensity hotspots, though damage remained focused in eastern provinces. Nevertheless, forest stands continued to age, and fuel loads gradually accumulated, highlighting an ongoing latent risk of larger fires. The early years of the 3rd Basic Forest Plan (1988–1997) saw new regulations under the Forestry Act, including expanded zoning for forest protection and initial development of wildfire detection systems. Funding increased for regional firefighting personnel, though national-level aerial resources (helicopters, specialized aircraft) were still in nascent stages.
In 1995–1999, wildfire damage surged, marking the first occurrence of modern large-scale wildfires in national records.
Figure 4 reflects a sharp rise in the number of large fires (peaking at 215) and burned area.
Figure 5 shows that heavy damage occurred along coastal and mountainous corridors, particularly in Gangwon Province. Strong spring winds, low humidity, and abundant forest fuels contributed to rapid fire spread. Late in the 3rd Basic Forest Plan and early in the 4th Basic Forest Plan (1998–2007), authorities began introducing expanded aerial firefighting initiatives and reorganizing local fire management units. Following the 1997 financial crisis, public works projects employed more local fire lookouts and patrols. However, policy efforts to curtail large-scale fires were still at an early stage; resources were not always consistently mobilized.
In 2000–2004, slightly higher wildfire damage (4651 ha) was observed.
Figure 4 shows that large-scale fire counts declined to 146, but burned area remained elevated, partly due to notable events such as the 2000 East Coast fires.
Figure 5 continues to highlight hotspots in eastern provinces. Residential and recreational development near forested areas led to faster fire spread and heightened response complexity. Under the 4th Basic Forest Plan, aerial firefighting units were more formally institutionalized (e.g., the expansion of the Korea Forest Aviation Headquarters). Fire suppression practices improved through better coordination among central and local agencies, though large-scale fires continued to exceed existing suppression capacity when extreme weather aligned with high fuel loads.
Figure 4 records a substantial decline in both burned area and large-fire counts (about 46) in 2005–2009, reflecting enhanced early warning systems, increased firefighting aircraft availability, and stronger ground coordination.
Figure 5 shows fewer regions with intense damage, indicating more effective containment. The latter phase of the 4th Basic Forest Plan and early stage of the 5th Basic Forest Plan (2008–2017) saw focused investments in wildfire prevention and suppression: heightened deployment of fire suppression crews and expansion of monitoring networks (e.g., CCTV, lookout towers). Legislative amendments (the Wildfire Prevention Act, the Forestry Act) introduced stricter penalties for negligence and illegal burning.
Figure 4 displays the lowest recorded damage and very few large-scale fires (about 36) in 2010–2014, with relatively favourable climatic conditions (increased humidity, fewer days of strong winds) coinciding with significantly improved response infrastructure.
Figure 5 shows only isolated pockets of minor damage. This period falls in the mid-course of the 5th Basic Forest Plan. Key initiatives included adoption of electronic forest mapping for real-time fire location tracking; expansion of professional firefighting teams (the “wildfire prevention and suppression squads”); and strengthening of interagency coordination, with local governments and the KFS collaborating to address WUI fires more systematically.
In 2015–2019, wildfire damage rebounded sharply, increasing by 367% compared with 2010–2014.
Figure 4 shows a moderate burned area but a slight increase in large-scale fires (38). This rebound coincided with prolonged droughts, higher spring temperatures and stronger winds—conditions consistent with broader climate-related trends.
Figure 5 illustrates resurgent hotspots in Gangwon and Gyeongsang provinces. Consistent with
Figure 6, most events clustered in the spring months, reinforcing the seasonality of recent outbreaks;
Figure 7 also shows that provinces in the east and southeast (Gangwon and North Gyeongsang) account for the largest cumulative burned areas. Aging forests (with increased fuel loads) and expanding WUI areas made fire containment more difficult. As the 5th Basic Forest Plan concluded (2008–2017), new frameworks were established. Government measures such as advanced aerial resources (drones, additional helicopters) and specialized large-fire task forces were introduced, particularly in response to events like the 2019 Goseong–Sokcho fire. Nonetheless, large conflagrations highlighted the challenges posed by extreme weather and accumulated forest fuels.
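The 367% rebound reported for this interval can be read as a ratio, assuming it is a conventional relative change in burned area between consecutive intervals (the text does not state the formula). The symbols A_old and A_new are introduced here only for illustration and denote the burned area in the earlier and later interval, respectively.

```latex
\[
\Delta_{\%} = \frac{A_{\text{new}} - A_{\text{old}}}{A_{\text{old}}} \times 100
\quad\Longrightarrow\quad
A_{\text{new}} = \left(1 + \frac{\Delta_{\%}}{100}\right) A_{\text{old}}
\]
\[
\Delta_{\%} = 367 \;\Longrightarrow\; A_{\text{new}} = 4.67\, A_{\text{old}}
\]
```

Under that reading, the later interval burned roughly 4.7 times the area of the earlier one, starting from the lowest damage level in the record.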
The 2020–2024 period saw an unprecedented surge in wildfire damage, exceeding all previous intervals combined (29,905 ha).
Figure 4 depicts an explosive increase in burned area alongside a rise in large-scale fires (91).
Figure 5 identifies extensive damage concentrated in the southeastern provinces, particularly Gyeongsang. The monthly peak documented in
Figure 6 (March–April) persists, and
Figure 7 confirms that North Gyeongsang and Gangwon dominate provincial totals during this interval. Notable incidents include the 2022 Uljin–Samcheok wildfire, which burned over 17,000 ha under strong winds and exceptionally dry conditions. The convergence of extreme meteorological factors (heat waves, low humidity, high wind speeds) and expanding WUI zones markedly escalated damage. In response, national laws (e.g., the Wildfire Prevention Act) were revised to impose harsher penalties for negligence and to expand the deployment of wildfire-specialized crews. Advanced technologies—AI-driven risk modeling, real-time satellite monitoring, and drone-based reconnaissance—were accelerated. Yet, given ongoing climate change and extensive fuel accumulations, controlling emergent mega-fires remains a formidable challenge.
6. Conclusions
6.1. Practical Implications
National and regional agencies play a pivotal role in wildfire prevention, preparedness, and response, especially given the escalating risks identified in this study. Integrated strategies are needed: land-use planning to limit WUI expansion; targeted fuel management in conifer-heavy and mature forests; early warning systems that monitor compound weather extremes; and coordinated action across fire agencies, forestry services, and local governments. Our analysis indicates that wildfire drivers are interconnected: WUI expansion intensifies risk, conifer forests are particularly vulnerable, and machine-learning tools help reveal compound dynamics. Effective mitigation therefore requires a collaborative, multi-agency response. Public awareness campaigns and building codes tailored to fire-prone areas can further reduce ignition sources and enhance resilience at the community level.
Specifically, high-risk eastern provinces should prioritize mechanical thinning or prescribed burning in conifer-dominated stands exceeding 50% cover or 30% mature age-class ratio. WUI buffer zones of 50–100 m with ignition-resistant vegetation should be mandated in new residential developments. Nationwide fire safety weeks, coupled with school-based training and community evacuation drills, can rapidly scale preparedness. In addition, integrating machine learning-based fire weather thresholds into the Korea Forest Service’s national early warning system would provide real-time, scalable alerts. Pilot initiatives in Gangwon and North Gyeongsang could serve as models for nationwide implementation.
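The stand-level prioritisation rule suggested above could be operationalised along the following lines. The 50% conifer and 30% mature-ratio thresholds come from the text; the "either threshold" reading, the `Stand` record, and the example stands are assumptions made for illustration, not an operational specification.

```python
from dataclasses import dataclass

CONIFER_COVER_THRESHOLD = 0.50  # conifer share of stand cover (from the text)
MATURE_RATIO_THRESHOLD = 0.30   # share of mature age classes (from the text)

@dataclass
class Stand:
    name: str
    conifer_ratio: float  # fraction between 0 and 1
    mature_ratio: float   # fraction between 0 and 1

def needs_fuel_treatment(stand: Stand) -> bool:
    """True when either threshold is exceeded (assumed 'or' interpretation)."""
    return (stand.conifer_ratio > CONIFER_COVER_THRESHOLD
            or stand.mature_ratio > MATURE_RATIO_THRESHOLD)

# Hypothetical stands, not real inventory data.
stands = [
    Stand("A", conifer_ratio=0.65, mature_ratio=0.20),
    Stand("B", conifer_ratio=0.40, mature_ratio=0.35),
    Stand("C", conifer_ratio=0.30, mature_ratio=0.10),
]
priority = [s.name for s in stands if needs_fuel_treatment(s)]
print(priority)
```

Keeping the thresholds as named constants makes it straightforward to revise them as partial dependence results are updated with new data.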
6.2. Limitations and Future Research Directions
This study’s focus on fires ≥ 5 ha, potential inconsistencies in data collection, and limited microclimate information constrain the ability to fully capture small fire dynamics and localised risk factors. In addition, while certain climatic, vegetative, and human variables were identified as strong predictors, they should not be interpreted as absolute determinants of wildfire behaviour. The models capture correlations and predictive power rather than direct causality, and unmeasured influences such as suppression effectiveness, ignition source distribution, or socio-economic changes may also shape observed patterns.
Future research should incorporate finer-scale fire data, microclimatic variables, socio-economic indicators, and WUI proxies to better quantify human influences, and should examine post-fire vegetation shifts to refine fuel management strategies. Such enhancements will support more adaptive and holistic wildfire policies as climate and land-use patterns continue to evolve.