Human Activities and Wildfires: The Impact of Forest Roads, Trails, and Forest Management on Wildfire Occurrence

Yeo-Chang, Youn; Lee, Se-Eum; Lee, Soo-Jin; Kim, Hyo-Rin

doi:10.3390/fire9060246


¹ College of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
² aSSIST University, Seoul 03767, Republic of Korea
³ Institute of Sustainable Social-Ecological Systems (ISSES), Seoul 04779, Republic of Korea
⁴ Indidlab, Seoul 07325, Republic of Korea
* Authors to whom correspondence should be addressed.

Fire 2026, 9(6), 246; https://doi.org/10.3390/fire9060246

Submission received: 14 April 2026 / Revised: 24 May 2026 / Accepted: 27 May 2026 / Published: 9 June 2026

(This article belongs to the Special Issue Fire Patterns, Driving Factors, and Multidimensional Impacts Under Climate Change and Human Activities)


Abstract

The risk of wildfires is increasing due to high temperatures and dry weather conditions caused by climate change. The outbreak and spread of wildfires are usually conditioned by weather, topography, and fuel characteristics. In the Republic of Korea (hereafter, the ROK), most wildfires are caused by anthropogenic factors rather than natural ones. However, the forest fire forecasting system currently operated in the ROK does not account for anthropogenic factors. To analyze the impact of human and physical factors on wildfire occurrence, a binary logistic regression model was constructed using data from the Gangwon and Gyeongbuk provinces from January 2022 to August 2025. The dependent variable was defined as the occurrence of a wildfire, while the independent variables comprised meteorological, seasonal, stand, and anthropogenic factors. To address multicollinearity, variables with high correlation coefficients were excluded. The independent variables were then selected using three approaches: logistic regression and two machine learning techniques (Random Forest and XGBoost), with the machine learning models used to identify variables with high feature importance. The explanatory power of the logistic regression analysis with independent variables selected by the machine learning models was about 1.3 times higher than that of the model using variables adjusted solely for multicollinearity. The results of the logistic regression analysis revealed that weather and coniferous forests are the most important factors fostering wildfires, while mean stand age was the most significant factor hindering wildfires. Among the anthropogenic factors, forest road density acted as a suppressor of wildfire spread rather than a promoter of occurrence. Conversely, trail density tends to increase the risk of wildfire occurrence. Among forest management activities, plantation forests may increase the risk of forest fires, although this remains uncertain.
These findings suggest that preventing wildfires requires a paradigm shift in forest resource management policies, including extending forest rotation ages and converting coniferous forests to broadleaf forests. They also indicate the need to restrict the expansion of hiking trails and to improve regulations regarding hiker access and behavior.

Keywords: forest fire; fire probability; forest age; road density; trail density; forest management; logistic regression; machine learning

1. Introduction

Global climate change causes extreme weather events, such as droughts, to occur more frequently. These climatic conditions are considered primary causes of the increasing frequency and scale of wildfire damage [1,2,3]. Globally, the frequency of mega-fires is rising, and wildfire seasons are becoming prolonged, leading to severe ecological and economic losses. In the era of the climate crisis, it is particularly concerning that wildfires release massive amounts of carbon dioxide into the atmosphere in a short period of time. This emitted carbon dioxide exacerbates climate change [4]. Boreal forest fires typically account for about 10% of global wildfire carbon dioxide emissions; however, in 2021, this figure surged to 23%, the highest proportion since 2000. In 2021, driven by global warming, abnormal moisture deficits led to extreme wildfires across North America and Eurasia [5]. Wildfires do not merely cause short-term damage; they trigger a severe positive feedback loop that accelerates global warming by releasing large quantities of carbon previously stored in forests [6].

In the ROK, the largest mega-fire in the nation’s history occurred in March 2025, primarily affecting Gyeongbuk, Gyeongnam, and Ulsan, with the damaged area reaching approximately 104,000 hectares, equivalent to 1.64% of total forest area [7]. Furthermore, this wildfire recorded the highest number of human casualties in history, in addition to extensive forest damage [8]. Korea is constantly exposed to the risk of wildfires, which have become an everyday threat. Given that about 63% of the land is covered by forests, combined with dry spring weather, strong localized winds, and rugged topography, there is a high risk of fires spreading rapidly.

The current fire-prone landscape in Korea is in part a product of post-war reforestation policy. Following the devastation during the Korean War (1950–1953), which left forests severely degraded, the government launched large-scale afforestation campaigns from the 1960s through the 1980s. These programs relied heavily on fast-growing coniferous species—most notably Pinus densiflora (Japanese red pine) and Pinus koraiensis (Korean pine)—which, although native, were planted in dense monoculture stands far exceeding their natural distribution [9]. As a result, coniferous plantations now dominate large portions of the landscape in Gangwon and Gyeongbuk provinces, creating structurally homogeneous forests with high fuel continuity that are particularly susceptible to rapid fire spread.

An examination of wildfire trends in Korea reveals that the interaction of climatic, structural, and social factors is making wildfires larger and more routine. According to statistics, an average of 450 to 500 wildfires occurs annually, destroying approximately 3700 to 4000 hectares of forests each year [10,11]. Notably, the success of past reforestation projects and continuous forest protection efforts have led to an increase in forest growing stock since the 1960s. This has resulted in fuel accumulation, acting as a structural factor that exacerbates wildfire scale [10,11]. Forest growing stock in Korea has increased 18.4-fold compared to 1946, reaching 1,040,447,000 m³ (165.2 m³/ha) as of 2020 [12].

Meteorological patterns, characterized by decreasing precipitation days and a sharp increase in dry weather warnings due to climate change, further aggravate the risk of mega-fires [13]. Changes in the timing and patterns of occurrence are also distinct. Wildfires, once concentrated in the spring, are expanding into May and the winter season, while the increase in recreational hiking has led to more fires on weekends [10,11]. The decline and aging of rural populations make initial firefighting efforts increasingly difficult [10,11]. This implies that the ignition and spread of wildfires are closely linked to human activities and social structural changes, beyond simple climatic or topographical conditions.

Unlike natural causes such as lightning strikes, wildfires in Korea are predominantly caused by anthropogenic factors, such as accidental fires by hikers or the burning of agricultural waste [14]. The current National Forest Fire Danger Rating System, operated by the National Institute of Forest Science, issues fire risk ratings based on statistical analyses of weather, stand, and topographical factors in relation to wildfire occurrences (2000–2010) [14], but it does not incorporate anthropogenic factors. Therefore, it is necessary to account for the impact of human-induced factors on wildfire occurrence for improved forest fire forecasting systems and effective forest fire prevention plans.

Won et al. [15] developed a national integrated Daily Weather Index (DWI) model for calculating forest fire danger ratings in spring and autumn, identifying temperature, relative humidity, effective humidity, and wind speed as the key meteorological predictors. Ryu et al. [16] analyzed data of forest fire breakouts over the last 30 years and found that wildfire risk periods have been extended due to climate change in the ROK. Kwak et al. [17] showed that slope, elevation, aspect, distance to roads, and population density are significant explanatory factors for wildfire occurrence. Kim et al. [18] analyzed forest fire probability using multi-temporal socio-economic and environmental variables, demonstrating that fire risk is associated with both biophysical conditions and human land-use patterns. Kim et al. [19] reported that over a 30-year period (1991–2020), annual wildfire incidents have been increasing, with worsening spatial unevenness, as mega-fires are concentrated in the northeastern regions of Gangwon and Gyeongbuk.

However, empirical research that quantitatively and spatially analyzes the impact of anthropogenic factors on wildfires is lacking. Hong et al. [20] hypothesized that forest roads can exacerbate wildfire outbreaks, whereas Lee et al. [21] suggested that forest roads hinder the spread of wildfires. Whether and how forest roads are related to wildfires therefore needs to be clarified. In addition, Park et al. [22] argued that the ROK Government’s policy of subsidizing “forest improvement tending” promotes forest fires.

To better understand the factors associated with wildfire occurrence in the ROK, this study aims to test the following three hypotheses:

Hypothesis 1: Forest stand age and its species composition influence wildfire occurrence.
Hypothesis 2: Plantation forest tending activities can promote wildfire occurrence.
Hypothesis 3: Expansion of forest road networks and trail infrastructure (density and accessibility) increases wildfire risk.

2. Materials and Methods

2.1. Study Area and Research Flow

The spatial scope of this study was set to the Gangwon and Gyeongbuk provinces in the ROK (Figure 1). Both regions feature rugged terrain and have recorded frequent wildfires. The temporal scope was defined from January 2022 to August 2025. The research followed the procedure illustrated in Figure 2 below.

Variables were selected and collected from relevant organizations, followed by a data refinement process to determine their impact on wildfire occurrence. In the data preprocessing stage, the study areas of Gangwon and Gyeongbuk were divided into 1 km × 1 km square grids to construct grid data. All spatial data preprocessing and variable mapping were performed using the open-source software QGIS 3.40.15.

The dependent variable (Target) was set as a binary classification, defining grids with a wildfire occurrence during the period as 1, and those without as 0. For grids where wildfires occurred (Target = 1), addresses provided by the Korea Forest Service’s wildfire statistics were converted into latitude and longitude coordinates using Geocoder (https://geocoder.gimi9.com/ (accessed on 26 May 2026)), a web-based geocoding site. The coordinates represent the centroid of the address parcel area. The geocoded location data were mapped to grids as point data to identify Target grids. Independent variables $X_{1}$ to $X_{5}$ were also mapped to these grids along with the dependent variable $Y$. Data collection and processing methods for these variables are detailed in Section 2.2. The constructed spatial information was then transformed into a dataset structured suitably for machine learning model training and binary logistic regression analysis.

First, data on wildfire occurrence locations were obtained through an information disclosure request to the Korea Forest Service. A total of 492 wildfire cases were recorded in Gangwon and Gyeongbuk provinces from January 2022 to August 2025. However, after excluding cases where the exact address was difficult to identify and duplicate cases occurring within the same 1 km × 1 km grid, the final number of wildfire occurrence grids (Target = 1) was established as 471.
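The mapping of geocoded fire points to 1 km × 1 km grid cells, and the collapsing of several fires in one cell into a single occurrence grid, can be sketched as follows. This is only an illustrative sketch, not the QGIS workflow used in the study: it assumes the points have already been projected to a metric coordinate system, and the function names and sample coordinates are hypothetical.

```python
import math

CELL_SIZE_M = 1000.0  # 1 km x 1 km grid, as in the study


def grid_cell(x_m, y_m, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a point.

    Assumes x_m and y_m are projected coordinates in metres.
    """
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))


def occurrence_cells(fire_points):
    """Collapse fire points to the set of distinct cells they fall in."""
    return {grid_cell(x, y) for x, y in fire_points}


# Hypothetical projected coordinates (metres) of three geocoded fires;
# the first two fall in the same cell and therefore count once.
fires = [(1050.0, 2990.0), (1900.0, 2010.0), (5200.0, 750.0)]
target_cells = occurrence_cells(fires)


def target(cell):
    """Binary label: 1 if any fire was recorded in the cell, else 0."""
    return 1 if cell in target_cells else 0
```

Using a set of cell indices means that duplicate fires within one grid are removed automatically, which mirrors the reduction from recorded cases to distinct occurrence grids described above.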

Second, non-occurrence grids (Target = 0) were extracted for machine learning and binary logistic regression analysis based on the refined data. Simple random sampling of non-occurrence grids could lead to spatial bias regarding regional meteorological and topographical characteristics. Therefore, to resolve the class imbalance caused by the difference in quantities between occurrence and non-occurrence grids, Region-based Stratified Random Sampling was employed, extracting 1000 non-occurrence grids. The specific sampling process is as follows.

The study area was divided into three zones based on topographical and meteorological similarities: Yeongdong (Gangwon East Coast), Yeongseo (Gangwon Inland), and Gyeongbuk (Figure 3). The wildfire occurrences in each zone were examined. Out of 471 wildfires between January 2022 and August 2025, Gyeongbuk accounted for 266 (56.48%), Yeongseo for 139 (29.51%), and Yeongdong for 66 (14.01%). To ensure the model is balanced by learning each region’s unique environmental characteristics based on occurrence weights, the actual occurrence ratio was used as the extraction weight. The target of 1000 non-occurrence grids was allocated accordingly: 566 from Gyeongbuk, 296 from Yeongseo, and 138 from Yeongdong were randomly selected. Through this process, a final analysis dataset of 1471 grids was constructed, comprising 471 occurrence grids and 1000 stratified non-occurrence grids (Table 1). Notably, for the Yeongdong region—characterized by unique weather conditions like Yangganjipung (local strong winds)—138 non-occurrence grids (about double the 66 occurrence grids) were allocated to effectively train the model on the complex ignition risk factors of the area.
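A minimal sketch of the zone-wise random draw is given below. It takes the allocation reported above (566, 296, and 138 grids) as given rather than recomputing it, and samples without replacement within each zone. The candidate pools, grid identifiers, and the function name are hypothetical placeholders.

```python
import random

# Allocation reported in the text (non-occurrence grids per zone).
ALLOCATION = {"Gyeongbuk": 566, "Yeongseo": 296, "Yeongdong": 138}


def sample_non_occurrence(candidates_by_zone, allocation, seed=0):
    """Draw a fixed number of non-occurrence grid IDs from each zone."""
    rng = random.Random(seed)
    sampled = {}
    for zone, n in allocation.items():
        pool = candidates_by_zone[zone]
        if n > len(pool):
            raise ValueError(f"zone {zone!r} has only {len(pool)} candidates")
        sampled[zone] = rng.sample(pool, n)  # without replacement
    return sampled


# Hypothetical candidate pools: IDs of grids with no recorded wildfire.
candidates = {
    "Gyeongbuk": [f"GB-{i}" for i in range(5000)],
    "Yeongseo": [f"YS-{i}" for i in range(3000)],
    "Yeongdong": [f"YD-{i}" for i in range(1500)],
}
sample = sample_non_occurrence(candidates, ALLOCATION, seed=42)
total = sum(len(ids) for ids in sample.values())
```

Fixing the random seed makes the draw reproducible, which matters when the non-occurrence sample is reused across several model specifications.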

Modeling, feature selection, logistic regression analysis, and SHAP visualization were implemented using the pandas, scikit-learn, xgboost, statsmodels, and shap libraries in Python 3.12.13 (GCC 11.4.0, Linux).

2.2. Data Collection and Processing Methods

To analyze the multidimensional factors affecting wildfire occurrence, independent variables were categorized into five groups based on raw data: $X_{1}$ (Weather), $X_{2}$ (Forest Characteristics), $X_{3}$ (Infrastructure), $X_{4}$ (Forest Management), and $X_{5}$ (Temporal Factors) (Table 2). The spatial distribution of forest attributes, infrastructure, and management areas across the study region, together with wildfire occurrence points, is illustrated in Figure 4. Each independent variable was spatially joined to the grids. Missing rates and treatment strategies varied by variable type. Forest management variables (plantation forest tending, natural forest tending, other management) and infrastructure variables (road density, trail density, distance to road, distance to trail) exhibited a missing rate of 0%, because grids with no management history or infrastructure were assigned a structural zero (0, meaning that no such activity or infrastructure is present) at the raw data construction stage, rather than being treated as missing. For meteorological variables, minor missing rates were observed: effective humidity (1.16%), maximum wind speed (0.14%), and daily precipitation (0.07%). To verify that these small rates did not affect the results, all three model specifications (VIF-based, Random Forest-based, and XGBoost-based logistic regression) were re-estimated after applying listwise deletion to meteorological missing values ($n = 1453$). Across all three models, the selected variable sets were identical to those obtained with the original dataset, and model performance either improved marginally or remained equivalent (VIF: AUC 0.803 → 0.809; RF: AUC 0.795 → 0.801; XGBoost: AUC 0.791 → 0.795), confirming that the missing data had little effect on the analysis. Detailed collection and processing methods for each variable are described below.

2.2.1. Weather Factors ( $X_{1}$ )

Daily weather data provided by the Korea Meteorological Administration (KMA) were utilized. For each grid point, the distance to every weather station was calculated, and data were extracted from the nearest station. The extracted data included effective humidity (eff_hum; %), daily precipitation (daily_precip; mm), and maximum wind speed (max_wind; m/s). For occurrence grids, weather data from the day before the fire (D-1) were used. For non-occurrence grids, weather data from a randomly assigned day between January 2022 and August 2025 were extracted.
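The nearest-station lookup can be illustrated with a small sketch. The text does not state which distance measure was used, so the great-circle (haversine) distance here is an assumption, and the station names and coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r_earth = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r_earth * math.asin(math.sqrt(a))


def nearest_station(grid_lat, grid_lon, stations):
    """Return the name of the station closest to the grid point."""
    return min(
        stations,
        key=lambda name: haversine_km(grid_lat, grid_lon, *stations[name]),
    )


# Hypothetical station coordinates (latitude, longitude), illustration only.
stations = {"station_a": (37.75, 128.89), "station_b": (36.57, 128.71)}
closest = nearest_station(37.50, 128.90, stations)
```

Once the nearest station is known for a grid, the daily records of that station for the relevant date (D-1 for occurrence grids, a random day otherwise) supply the weather variables.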

In particular, effective humidity (eff_hum) was employed as a crucial indicator of the moisture content in forest fuels, which reflects the cumulative dryness of the environment. Unlike simple daily average humidity (avg_hum), effective humidity is calculated as a weighted moving average of the daily relative humidity over a specific preceding period. A decay coefficient ($r = 0.7$) was applied to assign higher weights to more recent days according to the following formula:

$$H_{e} = \frac{H_{0} + r H_{1} + r^{2} H_{2} + r^{3} H_{3} + r^{4} H_{4}}{1 + r + r^{2} + r^{3} + r^{4}} \tag{1}$$

where $H_{e}$ is the effective humidity, $H_{0}$ is the relative humidity on the analysis date (D-1 for occurrence grids), and $H_{n}$ denotes the relative humidity $n$ days prior to the analysis date, with the decay coefficient $r$ weighting more recent observations more heavily [23]. This approach effectively captures the persistent drying conditions that critically influence wildfire ignition probabilities.
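Equation (1) can be evaluated directly. The sketch below is a plain transcription of the formula with $r = 0.7$; the function name and the sample humidity readings are hypothetical.

```python
def effective_humidity(rh, r=0.7):
    """Effective humidity from five daily relative-humidity values.

    rh[0] is the analysis date (H_0) and rh[n] is n days earlier (H_n),
    matching Equation (1).
    """
    if len(rh) != 5:
        raise ValueError("expected relative humidity for days 0..4")
    weights = [r ** n for n in range(5)]
    return sum(w * h for w, h in zip(weights, rh)) / sum(weights)


# Hypothetical readings (%): a dry analysis date preceded by wetter days.
h_e = effective_humidity([30.0, 40.0, 50.0, 60.0, 70.0])
```

For these readings the result is about 43.2%, closer to the recent dry days than the simple five-day mean of 50%, which is the intended effect of the decay weighting.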

Daily precipitation (daily_precip) was initially included as a continuous variable (mm). However, because 81.1% of the 1471 observations recorded 0 mm (a highly right-skewed distribution in which the median is zero), we additionally tested a binary specification (precip_binary: 1 = precipitation recorded, 0 = no precipitation). Rows with missing meteorological values were excluded for the robustness check (retained $n = 1453$), reducing the missing rate from 1.16% to 0% for effective humidity. The results of this sensitivity analysis are reported in Section 3.2.

2.2.2. Forest Characteristics ( $X_{2}$ )

To reflect the ecological structure and physical state of the forest, the 2024 large-scale forest type map (1:5000) produced by the Korea Forest Service was used. Based on the spatial data, five variables were calculated per grid: mean stand age (stand_age_mean; years) indicating maturity, coniferous tree ratio (conifer_ratio; ratio) representing species composition, mean diameter class (dmcls; cm), stand density (dnst; %), and mean tree height (height; m).

2.2.3. Infrastructure Factors ( $X_{3}$ )

Data for national forest roads and trails were sourced from the Korea Forest Service’s Forest Spatial Information Service. Data on public and private forest roads were obtained through information disclosure requests submitted to officials in Gangwon and Gyeongbuk provinces. Infrastructure factors act as indicators of human accessibility and firefighting resources. The forest road density (road_density; km/km²), trail density (trail_density; km/km²), and the distance to the nearest road and trail (dist_road, dist_trail; km) per grid were calculated through GIS spatial analysis. Infrastructure density was calculated as total line length per grid area using the QGIS ‘Sum line lengths’ function, while infrastructure distance was calculated as the Euclidean distance (km) from the grid centroid to the nearest infrastructure object using the ‘Distance to nearest hub’ function. Province-level summary statistics of forest area, forest road length and density, and trail length and density across the two study provinces are presented in Table 3.

2.2.4. Forest Management Factors ( $X_{4}$ )

Spatial data on tending and afforestation projects in public and private forests from 2015 to 2017, provided by the Korea Forest Service, were utilized. The implementation records for detailed forest management activities in the study areas are shown in Table 4. The 12 detailed activities from the collected data were calculated as Area Ratios, dividing the total activity area performed within each grid (1 km × 1 km) by the grid’s total area. These variables were classified into three groups based on their purpose:

Plantation Forest Tending: Activities aimed at improving growth and timber quality in planted stands, comprising the following operations [24]:
- Pruning: Removal of dead or live branches to produce knot-free, straight timber and improve stand structure.
- Tending of young trees: Cutting of diseased, suppressed, or competing vegetation around planted seedlings, typically conducted 5–10 years after planting.
- Thinning: Selective removal of poor-quality trees to promote the growth of superior stems; generally initiated 15 years after planting and repeated every 5–10 years.
- Stand cleaning: Removal of slash, logging residues, and ground-level combustible material to reduce fuel loads.
- Weeding: Elimination of competing grasses and shrubs around planted seedlings during the establishment phase.
- Planting: Afforestation or reforestation of open or degraded lands.
Natural Forest Tending: Activities to improve the health and ecological value of naturally regenerated stands, including tending for public benefits, natural forest improvement, and natural forest conservation [24].
Other Forest Management: Miscellaneous activities including vine removal, byproduct collection, and other operations.

A preliminary validation was conducted using the AUC (Area Under the ROC Curve) to evaluate which configuration—treating activities individually or grouping them—best predicts wildfire risk. The validation showed that the model’s predictive performance was superior when utilizing the three grouped categories (Plantation Tending, Natural Tending, and Other Management; AUC: 0.7590) compared to inputting 12 individual variables or integrating them into a single category (AUC: 0.7491). Thus, the three-group classification system was adopted.
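For readers unfamiliar with the metric, the AUC used in this comparison can be computed from its pairwise definition. The sketch below is a small, dependency-free illustration and not the evaluation code used in the study; the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = wildfire grid) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]
score = auc(y_true, y_prob)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the difference between 0.7491 and 0.7590 reported above is a modest gain in ranking quality.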

2.2.5. Temporal Factors ( $X_{5}$ )

To account for temporal variations in wildfire occurrence, a ‘Season’ variable was derived from the mapped date of each event. The months were grouped into four categories: Spring (March–May), Summer (June–August), Fall (September–November), and Winter (December–February). For statistical analysis, these were transformed into dummy variables, with Summer serving as the reference category to avoid perfect multicollinearity among the dummies.
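The season encoding can be sketched as follows, with Summer as the omitted reference category. The dummy column names are hypothetical and chosen only for illustration.

```python
SEASON_BY_MONTH = {
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
    12: "Winter", 1: "Winter", 2: "Winter",
}
DUMMY_SEASONS = ("Spring", "Fall", "Winter")  # Summer is the reference


def season_dummies(month):
    """Encode a month (1-12) as three 0/1 dummies; Summer maps to all zeros."""
    season = SEASON_BY_MONTH[month]
    return {f"season_{s}": int(s == season) for s in DUMMY_SEASONS}
```

With this coding, each seasonal coefficient in the regression is read as a contrast against Summer, for example `season_dummies(4)` sets only the Spring dummy to 1.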

2.3. Feature Selection Based on Machine Learning

To identify the core variables substantially affecting wildfire occurrence among the independent variables, statistical multicollinearity diagnostics and machine learning algorithms were used concurrently. First, to prevent variables that behave similarly from degrading the model’s statistical accuracy, Variance Inflation Factor (VIF) values were calculated to diagnose multicollinearity [25]. Stable variables with a VIF of less than 10 were initially selected (Table 5).
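The VIF screening step can be illustrated with a short NumPy sketch. It relies on the standard identity that the VIFs equal the diagonal of the inverse correlation matrix of the predictors; the study itself does not state how the VIFs were computed, and the synthetic data below are purely illustrative.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (rows = observations).

    Uses the identity that the VIFs are the diagonal of the inverse of the
    predictors' correlation matrix, equivalent to 1 / (1 - R_j^2).
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Synthetic predictors: x3 is nearly a linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + x2 + rng.normal(scale=0.1, size=500)
x4 = rng.normal(size=500)
values = vif(np.column_stack([x1, x2, x3, x4]))
keep = values < 10  # screening threshold used in the study
```

In this toy case the three mutually dependent columns exceed the threshold while the independent column stays close to 1. In practice, collinear variables are usually dropped one at a time and the VIFs recomputed, since removing one variable lowers the VIFs of the others.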

Second, ensemble machine learning models (Random Forest, XGBoost) were introduced to capture complex interactions among variables. Wildfires are nonlinear phenomena in which weather, topography, and infrastructure intertwine. To overcome the limitations of traditional statistical methods that only consider linear relationships, tree-based models, which excel at finding hidden patterns, were utilized. The entire dataset was split into training (80%) and validation (20%) sets using stratified sampling.

Feature importance was extracted from each trained model to identify the top 10 variables (Table 6). The results demonstrated a high degree of consensus between the two algorithms, with 9 out of the top 10 variables overlapping. Notably, while the VIF-based model retained the ‘Season’ variables as significant linear predictors, both machine learning models excluded all seasonal variables from the top 10. Instead, effective humidity and mean stand age, which were excluded in the VIF diagnostics due to collinearity, emerged as the top-ranked variables in both models. This suggests that the machine learning algorithms identified the underlying physical trigger of wildfires (i.e., extreme dryness represented by low effective humidity) rather than relying on the temporal proxy of ‘Season’.

To construct the final probability function, the variable set from the Random Forest (RF) model was adopted, as it exhibited a slightly higher predictive performance (AUC: 0.8001, Pseudo $R^{2}$: 0.1950) compared to XGBoost (AUC: 0.7960, Pseudo $R^{2}$: 0.1818).

2.4. Estimation of Forest Fire Probability Function and Hypothesis Verification

The core objective of this study lies in causal inference rather than predictive accuracy per se. Specifically, this research aims to identify the direction and statistical significance of the impacts of anthropogenic factors—such as forest road density, trail density, and forest management activities—on wildfire occurrence. To achieve this inferential goal, a model that generates interpretable coefficients is essential. While ‘black-box’ models, such as Random Forest or deep neural networks, offer high raw predictive power, they lack the coefficient-level transparency required to refine the algorithms of the Korea Forest Service’s National Forest Fire Danger Rating System.

Therefore, in this study, a Logistic Regression Model [26] was constructed to determine the directionality and statistical significance of the top variables selected via machine learning on actual wildfire occurrence probability. Logistic regression is suitable for binary dependent variables, with wildfire occurrence set as 1 and non-occurrence as 0. The regression equation comprising the selected 10 explanatory variables is expressed as Equation (2):

$$\ln\left(\frac{P}{1 - P}\right) = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \dots + \beta_{10} X_{10} + \epsilon \tag{2}$$

Here, $P$ is the probability of wildfire occurrence, $\beta_{0}$ is the constant term, $\beta_{1}$ to $\beta_{10}$ are the regression coefficients representing the influence of each independent variable, and $\epsilon$ represents unobserved factors and the random error term not explained by the model’s independent variables.

McFadden’s Pseudo $R^{2}$ was used to evaluate the goodness-of-fit. The final machine learning-based model, which exhibited the highest explanatory power, was adopted. The regression coefficients, significance probabilities (p-values), and odds ratios derived from this model were calculated to verify the practical impact of each factor.
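McFadden’s pseudo $R^{2}$ compares the log-likelihood of the fitted model with that of an intercept-only (null) model. The sketch below shows the calculation on hypothetical labels and fitted probabilities; it is not the statsmodels output of the study.

```python
import math


def log_likelihood(labels, probs):
    """Bernoulli log-likelihood of 0/1 labels given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probs)
    )


def mcfadden_r2(labels, probs):
    """McFadden's pseudo R^2 = 1 - LL(model) / LL(null).

    The null model predicts the overall occurrence rate for every case.
    """
    base_rate = sum(labels) / len(labels)
    ll_null = log_likelihood(labels, [base_rate] * len(labels))
    return 1.0 - log_likelihood(labels, probs) / ll_null


# Hypothetical labels and fitted probabilities, for illustration only.
y = [1, 1, 0, 0, 0, 0]
p_hat = [0.7, 0.6, 0.3, 0.2, 0.2, 0.1]
r2 = mcfadden_r2(y, p_hat)
```

Values of this statistic are not comparable to the $R^{2}$ of linear regression and are typically much lower for the same quality of fit, which is worth keeping in mind when reading pseudo $R^{2}$ values.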

2.5. Contribution Analysis to Wildfire Risk Using SHAP

Despite high predictive performance, machine learning models possess a ‘black-box’ characteristic, making internal decision-making processes hard to grasp. To overcome this, the SHAP (SHapley Additive exPlanations) technique, based on game theory, was introduced [27]. SHAP quantitatively decomposes and explains each variable’s contribution at the individual prediction level using Shapley values [27]. SHAP allows for the observation of how wildfire risk changes as variable values shift at the individual grid level. In the SHAP summary plot (beeswarm plot), each point represents one grid observation. The y-axis lists variables ranked by mean absolute SHAP value (i.e., overall importance), and the x-axis shows the SHAP value, where positive values indicate increased fire probability and negative values indicate suppressed probability. Point color reflects the original feature value: red indicates a high feature value, and blue indicates a low feature value. For example, a cluster of red points on the positive x-axis for a given variable means that high values of that variable are associated with higher wildfire risk.
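The idea behind Shapley values can be shown on a toy example. The sketch below computes exact Shapley values by enumerating all feature orderings; it is not the shap library and not the fitted model of the study, and the scoring function, grid values, and reference values are all hypothetical.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features absent from a coalition take their baseline value. The cost
    grows factorially with the number of features, so this is only
    practical for small toy cases.
    """
    names = list(x)
    contrib = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            value = model(current)
            contrib[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in contrib.items()}


def toy_log_odds(f):
    """Hypothetical risk score with an interaction term (not the fitted model)."""
    return -0.05 * f["eff_hum"] + 1.5 * f["conifer_ratio"] * f["trail_density"]


grid = {"eff_hum": 30.0, "conifer_ratio": 0.8, "trail_density": 1.0}
reference = {"eff_hum": 50.0, "conifer_ratio": 0.4, "trail_density": 0.5}
phi = shapley_values(toy_log_odds, grid, reference)
```

The contributions sum exactly to the difference between the score of the grid and the score of the reference point, which is the additivity property that makes per-grid SHAP values interpretable. Libraries such as shap use faster tree-specific algorithms instead of this brute-force enumeration.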

3. Results

3.1. Seasonal Distribution of Wildfire Occurrences

A total of 471 wildfire events were recorded across the study area from January 2022 to August 2025. Analysis of the monthly distribution revealed a pronounced seasonal concentration in late winter and spring (Figure 5). April recorded the highest share at 20.4% (96 events), followed by March (18.5%, 87 events), February (17.8%, 84 events), and January (12.1%, 57 events). Collectively, the January–April peak season accounted for 68.8% of all recorded wildfires, consistent with Korea’s climatological pattern of low humidity and strong winds in spring.

In contrast, the summer and early autumn months (June–October) showed markedly suppressed activity, with monthly shares ranging from 1.1% (July, 5 events) to 3.6% (June, 17 events). A secondary, modest increase was observed in November (4.7%, 22 events) and December (6.2%, 29 events), reflecting the dry conditions of the winter season.

Regional disaggregation revealed that Gyeongbuk consistently dominated fire counts throughout the year, peaking at 51 events in February. Yeongseo exhibited its highest activity in April (39 events), while Yeongdong maintained relatively low but persistent counts across all months, with a maximum of 13 events in February. All three regions followed the same unimodal seasonal pattern, suggesting that the spring peak is a phenomenon common to the whole study area, driven by shared meteorological conditions rather than region-specific factors.

3.2. Machine Learning-Based Forest Fire Probability Function Analysis Results

To reflect the non-linear and complex mechanisms of wildfire occurrence, the top 10 core variables derived from machine learning algorithms were incorporated into the final logistic regression model. The model’s Pseudo $R^{2}$ was 0.1950, an explanatory power about 1.3 times higher than that of the logistic regression model that only excluded variables with high VIF values (0.1505). The final model’s AUC score reached 0.8001, indicating excellent predictive performance. The results of the logistic regression and the odds ratios of each variable are shown in Table 7.

As a robustness check, we re-estimated the model replacing the continuous daily precipitation variable with a binary indicator (precip_binary: 1 = precipitation recorded, 0 = no precipitation). This transformation is statistically motivated by the highly right-skewed distribution of daily_precip, in which 81.1% of observations recorded 0 mm. Rows with missing meteorological values were excluded for this analysis ($n = 1453$). The binary specification yielded virtually identical model performance (Pseudo $R^{2}$ = 0.212; AUC = 0.806). Critically, the direction and significance of all primary predictors (effective humidity, conifer ratio, mean stand age, road density, and trail density) remained unchanged across both specifications. When precip_binary was included alongside the RF-selected variables, it attained a significant negative coefficient ($\beta = -0.823$, OR = 0.439, $p < 0.01$), confirming that precipitation occurrence reduces wildfire probability. These results demonstrate that the primary findings of this study are robust to the operationalization of the precipitation variable.

To interpret non-linear patterns between variables and individual data contributions that logistic regression struggles to capture, a SHAP summary plot was analyzed (Figure 6). The plot identified effective humidity (eff_hum), mean stand age (stand_age_mean), and coniferous tree ratio (conifer_ratio) as the top contributing variables to fire prediction. Variables whose red points (high feature values) lie in the positive (+) direction, such as coniferous ratio and trail density, aggravate wildfire risk. Conversely, variables whose red points lie in the negative (−) direction, such as stand age and road density, suppress the risk.

3.3. Hypothesis 1 Testing: Impact of Forest Age and Species Composition

Based on the regression results in Table 7, Hypothesis 1 was statistically supported. The regression coefficient for mean stand age was $-2.8588$ ($p < 0.001$), identifying it as the strongest suppressor of wildfire occurrence among the model’s variables. The coniferous tree ratio had a coefficient of 1.4446 ($p < 0.001$), acting as a positive (+) factor that significantly increases fire risk. The corresponding odds ratio of 4.240 indicates that a one-unit increase in the conifer ratio multiplies the odds of wildfire occurrence by roughly 4.24, holding the other variables constant.

In the SHAP plot regarding age and species, the mean stand age showed a trend of SHAP values falling below 0 during the maturation stage (Figure 7). This indicates high vulnerability in young forests but a decreasing risk as age increases. In contrast, as the coniferous ratio increased, SHAP values rose linearly, showing a clear pattern of heightening fire risk.

3.4. Hypothesis 2 Testing: Impact of Plantation Forest Tending Activities

Hypothesis 2, which posited an association between plantation forest tending activities and wildfire occurrence, was not supported in this analysis. The regression coefficient for plantation forest tending was 0.0501 ($p = 0.954$), indicating no significant relationship. This does not support the argument that anthropogenic forest management activities like afforestation or thinning directly cause wildfires.

On the SHAP plot for plantation forest tending, the majority of data points were densely clustered around a SHAP value of 0. This suggests that the marginal contribution of fluctuations in forest management ratios to the prediction of wildfire occurrence in individual grids is negligible.

3.5. Hypothesis 3 Testing: Impact of Forest Road and Trail Infrastructure

Hypothesis 3, which suggested that infrastructure factors increase fire risk, was partially supported, as conflicting results emerged depending on the infrastructure type. Trail density recorded a coefficient of 1.4625 ($p = 0.094$), showing a tendency to increase fire probability that is significant only at the 10% level. The odds ratio was high at 4.317, consistent with the view that areas with frequent hiker access face aggravated fire risks due to accidental ignitions. In stark contrast, forest road density (road_density) significantly decreased the occurrence probability, with a coefficient of $-2.7202$ ($p = 0.006$). This suggests that forest roads facilitate the rapid deployment of firefighting equipment and personnel, thereby preventing the spread of flames and reducing damage probabilities.
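
The p-values quoted for the two infrastructure variables follow from their z-values in Table 7 under a standard normal reference distribution. The sketch below, with a hypothetical helper name, shows that calculation; it assumes the usual two-sided Wald test, which the reported numbers are consistent with.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# z-values transcribed from Table 7.
z_trail_density = 1.676
z_road_density = -2.752

p_trail = two_sided_p(z_trail_density)
p_road = two_sided_p(z_road_density)

print(f"trail density: p = {p_trail:.3f}, below 0.10: {p_trail < 0.10}, below 0.05: {p_trail < 0.05}")
print(f"road density:  p = {p_road:.3f}, below 0.05: {p_road < 0.05}")
```

Trail density clears the 10% threshold but not the 5% threshold, which is why its effect is described as a tendency, whereas road density is significant at conventional levels.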

The SHAP graphs for infrastructure variables clearly illustrate these opposing roles (Figure 8). As trail density increases, SHAP values rise in the positive (+) direction, indicating heightened risk; whereas for road density, higher densities push SHAP values in the negative (−) direction, exhibiting a suppressive trend on occurrences.

3.6. Impact of Other Environmental Variables: Effective Humidity and Precipitation

Meteorological factors acted as critical control variables determining wildfire occurrence in both machine learning importance evaluations and logistic regression. Effective humidity (eff_hum) had a coefficient of $-0.0713$ ($p < 0.001$), and daily precipitation (daily_precip) was $-0.1881$ ($p = 0.005$), confirming that drier atmospheres drastically and significantly increase the probability of fires. These results align with previous studies [23]. On the effective humidity SHAP plot, dropping below a specific dryness threshold caused SHAP values to spike, aggravating the risk (Figure 9). In zones with sufficient humidity, risk was consistently suppressed. This suggests that even under identical topographical, structural, and infrastructure conditions, reaching meteorological tipping points has a profound impact on triggering wildfires.
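
Because the logistic coefficients act on the log-odds, the effect of a multi-unit change in a weather variable is obtained by scaling the coefficient before exponentiating. The sketch below illustrates this with the two reported coefficients. It assumes that effective humidity enters the model in percentage points and precipitation in millimetres, as listed in Table 2; the chosen changes of 10 points and 5 mm are hypothetical.

```python
import math


def odds_multiplier(beta: float, delta: float) -> float:
    """Factor by which the odds change when a predictor shifts by delta units."""
    return math.exp(beta * delta)


BETA_EFF_HUM = -0.0713       # Table 7, assumed per percentage point
BETA_DAILY_PRECIP = -0.1881  # Table 7, assumed per mm

# Effective humidity falls by 10 points: the odds of occurrence rise.
drier = odds_multiplier(BETA_EFF_HUM, -10)

# Daily precipitation rises by 5 mm: the odds of occurrence fall.
wetter = odds_multiplier(BETA_DAILY_PRECIP, 5)

print(f"10-point drop in effective humidity: odds x {drier:.2f}")
print(f"5 mm more precipitation: odds x {wetter:.2f}")
```

This multiplier is constant across the humidity range by construction, so the threshold behaviour seen in the SHAP plot reflects the machine learning model rather than the logistic specification.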

4. Discussion

4.1. Wildfire Suppression Effect of Mature Forests and Ecological Mechanisms

The analysis revealed that among ecological factors, the mean stand age lowered fire probability second only to meteorological variables. This supports our hypothesis that mature forests ecologically suppress wildfires, aligning with Zald and Dunn’s [28] findings that young forests heavily impact fire severity. Immature forests or homogeneous plantation stands often have canopies close to the ground, serving as ladder fuels that carry flames upward, and they tend to have abundant dry fine debris, making them highly vulnerable to ignition.

The vertical and horizontal heterogeneity and dense canopies of mature forests provide an insulating effect by shading the forest floor, lowering temperatures, and retaining moisture. As forests age, bark thickness increases and canopy fuels become more elevated, both of which substantially enhance resistance to surface fires beyond the microclimate effects alone.

As noted in the Introduction, Korea’s post-war coniferous plantations have matured into dense stands aged 31 to 50 years, with forest growing stock increasing 18.4-fold since 1946 [12]. Recently, there has been an active debate between forest policies focused on short-rotation clearcutting for economic timber and carbon absorption of young forests versus ecological preservation. At this juncture, forest policies must be established through scientific evaluations of stand age, climate change mitigation, and ecosystem services. Our findings present evidence that extending rotation periods—allowing stands to reach older age classes—may serve as a forest management strategy for suppressing mega-fires in the climate crisis era.

However, because topographical characteristics were not jointly considered, this study could not clarify whether the suppressive effect of longer-rotation management stems solely from ecological factors such as fuel elevation and bark thickness, or from a combination of topographical isolation that restricts human activity. Further research is needed.

Currently, research verifying whether aging forests suppress wildfires in Korea is scarce. Therefore, in-depth follow-up studies utilizing remote sensing technologies to explore the relationship between forest structure, stand age, and wildfire resistance are required.

4.2. The Paradox of Human Activity Infrastructure: Conflicting Roles of Forest Roads and Trails

Anthropogenic infrastructures demonstrated opposing impacts depending on their characteristics. An increase in hiking trail density was identified as an ignition factor raising fire risk. In Korea, dry spring and autumn seasons coincide with peak hiking periods. Fine fuels like dry fallen leaves are easily exposed around trails. Thus, accidental fires caused by human negligence, such as discarded cigarette butts or illegal cooking, readily escalate into actual wildfires.

Conversely, increased forest road density served as a suppressor. While forest roads could act as potential ignition sources by increasing human access, they simultaneously perform a vital fire prevention role regarding emergency response. During a fire, roads enable rapid access for fire trucks and personnel, facilitating mopping-up operations and restricting spread. They are functionally indispensable, especially at night or when helicopters cannot be deployed.

Nevertheless, arguments exist advocating for minimizing road construction due to increased landslide risks and ecosystem fragmentation. Studies investigating the link between roads and fires also present varied outcomes. Hong et al. [20] suggested roads could be primary ignition sources by increasing accessibility. Yet, others (Lee et al. [21]) argue they inhibit the spread. Thus, sophisticated empirical research proving the exact effects of forest roads is required, alongside the development of construction methods that minimize ecological damage.

4.3. Comparison with Recent Wildfire Prediction Models

The predictive performance of the present model compares favorably with recent studies employing similar methodologies in analogous settings. Lee et al. [29], applying ensemble machine learning models (Extra Trees, Random Forest, XGBoost, and LightGBM) with SHAP analysis to daily wildfire prediction in Gangwon Province, achieved a maximum AUC of 0.839 using meteorological, forest-related, and socioeconomic variables. Lee et al. [30], applying a Random Forest model integrating human proximity variables, topographic, and meteorological factors along Korea’s eastern coast (2015–2024), reported an overall accuracy of 0.733 and F1-score of 0.515. At the national scale, Choi et al. [31], comparing Random Forest, XGBoost, and ANN using satellite-based environmental variables across the entire Republic of Korea, reported AUC values of 0.74–0.76. The logistic regression model developed in this study, augmented by machine learning-based variable selection, achieved an AUC of 0.8001 and a McFadden Pseudo $R^{2}$ of 0.1950, competitive with these benchmarks while additionally providing coefficient-level interpretability that black-box models cannot offer. This balance between predictive accuracy and statistical transparency is particularly relevant for policy applications such as refining the Korea Forest Service’s National Forest Fire Danger Rating System.
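
Since the comparison above rests largely on AUC, a minimal sketch of what that statistic measures may help. The function below computes AUC as the share of (fire, non-fire) pairs in which the fire case receives the higher predicted score. The labels and scores are hypothetical and were chosen so that the result is 0.8; they are not model outputs from this study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical grid cells: 1 = fire occurred, 0 = no fire.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.5, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

Read this way, an AUC near 0.80 means that in about four of five such pairs the model ranks the fire cell above the non-fire cell.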

5. Conclusions

To combat the increasing scale and routine nature of wildfire disasters driven by climate change, this study investigated the complex impacts of meteorological, ecological, and anthropogenic factors on wildfire occurrences in the Gangwon and Gyeongbuk provinces of the ROK via logistic regression analysis. The results confirm that, excluding weather, the most critical fire-suppressing factor is forest stand age. We found that a high proportion of coniferous forests and increased trail density could serve as primary ignition and spread factors of forest fires. Furthermore, increased forest road density significantly reduces occurrence probability, identifying it as a core firefighting asset. The hypothesis that plantation forest tending increases fuel loads and fire risk requires further follow-up research.

These findings suggest the necessity of re-evaluating Korea’s current forest policies centered on economic timber and short-rotation logging. To effectively mitigate fire damage and enhance climate resilience, forests should be managed until older age classes by extending rotation periods. Additionally, ecological forest tending that transitions highly flammable, uniform coniferous plantation forests into fire-resistant broadleaf ecosystems is necessary.

Since this study conducted macroscopic spatial analyses at a 1 km grid resolution, the effects of microscopic changes in understory microclimates conditioned by topographical factors like slope and aspect were not addressed. Furthermore, the use of infrastructure density as a proxy for human activity is a simplification; future studies incorporating direct measures such as visitor counts, ignition source records, and agricultural burning data would better characterize anthropogenic fire risk. The forest management data used in this study were limited to the period 2015–2017, the most recent publicly available records, while fire occurrence data span 2022–2025; this temporal gap is a limitation, though silvicultural treatments such as thinning and stand cleaning may influence forest structure after implementation [32,33]. Finally, the model was developed for Gangwon and Gyeongbuk provinces, and applying these findings to other regions or countries would require similarly detailed infrastructure and management data, which may not be readily available elsewhere.

Author Contributions

Conceptualization, Y.Y.-C. and S.-E.L.; methodology, S.-J.L.; software, H.-R.K.; validation, Y.Y.-C. and S.-J.L.; formal analysis, S.-E.L.; investigation, S.-E.L.; resources, Y.Y.-C.; data curation, S.-E.L. and H.-R.K.; writing—original draft preparation, S.-E.L.; writing—review and editing, Y.Y.-C.; visualization, S.-E.L.; supervision, Y.Y.-C.; project administration, Y.Y.-C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors gratefully acknowledge the Korea Forest Service and the Korea Meteorological Administration for providing the official national wildfire statistics and climatological data used in this study. The authors also thank the anonymous reviewers for their constructive comments, but the authors are solely responsible for the contents of this report.

Conflicts of Interest

Author Hyo-Rin Kim was employed by the company Indidlab. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Abatzoglou, J.T.; Williams, A.P. Impact of anthropogenic climate change on wildfire across western US forests. Proc. Natl. Acad. Sci. USA 2016, 113, 11770–11775.
2. Bowman, D.M.J.S.; Kolden, C.A.; Abatzoglou, J.T.; Johnston, F.H.; van der Werf, G.R.; Flannigan, M. Vegetation fires in the Anthropocene. Nat. Rev. Earth Environ. 2020, 1, 500–515.
3. IPCC. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK; New York, NY, USA, 2021.
4. Lee, C.B.; Kang, W.S.; Kwon, C.G.; Kim, S.Y.; Kim, E.S.; No, N.J.; Ryu, J.Y.; Park, B.B.; Park, J.W.; Seo, K.W.; et al. Scientific Understanding of Forest Fire Management; Jieul: Seoul, Republic of Korea, 2023. (In Korean)
5. Zheng, B.; Ciais, P.; Chevallier, F.; Yang, H.; Canadell, J.G.; Chen, Y.; Zhang, Q. Record-high CO₂ emissions from boreal fires in 2021. Science 2023, 379, 912–917.
6. IPCC. Climate Change and Land: An IPCC Special Report on Climate Change, Desertification, Land Degradation, Sustainable Land Management, Food Security, and Greenhouse Gas Fluxes in Terrestrial Ecosystems; Intergovernmental Panel on Climate Change: Geneva, Switzerland, 2019.
7. Provisional Scale of Forest Fire Damage in Gyeongbuk, Gyeongnam, and Ulsan Is 104 Thousand ha, Korea Forest Service Is Doing Its Best for Restoration. Available online: https://www.asiae.co.kr/en/article/2025041811085324794 (accessed on 21 March 2026).
8. National Assembly Research Service. National Response Tasks for Large-Scale Wildfires: In the Wake of the 2025 Yeongnam Region Large Wildfire (Special Report of the Forest Fire Response Research TF); National Assembly Research Service: Seoul, Republic of Korea, 2025. (In Korean)
9. Choi, Y.; Lim, C.H.; Chung, H.I.; Kim, Y.; Cho, H.J.; Hwang, J.; Kraxner, F.; Biging, G.S.; Lee, W.K.; Chon, J.; et al. Forest management can mitigate negative impacts of climate and land-use change on plant biodiversity: Insights from the Republic of Korea. J. Environ. Manag. 2021, 288, 112400.
10. Yang, C. A Study on the Improvement of Forest Fire Response System in Korea. Master’s Thesis, University of Seoul, Seoul, Republic of Korea, 2017. (In Korean)
11. Lee, M.-W.; Lee, S.-Y.; Lee, J.H. Study of the Characteristics of Forest Fire Based on Statistics of Forest Fire in Korea. J. Korean Soc. Hazard Mitig. 2012, 12, 185–192. (In Korean)
12. Korea Forest Service. Our Well-Managed Forests Are Greener and More Lush!—Announcement of the 2020 Forest Basic Statistics: Increase in Growing Stock; Korea Forest Service: Daejeon, Republic of Korea, 2021. Available online: https://www.forest.go.kr/kfsweb/cop/bbs/selectBoardArticle.do?nttId=3163039&bbsId=BBSMSTR_1036&mn=NKFS_04_02_01 (accessed on 17 May 2026). (In Korean)
13. Chang, D.Y.; Jeong, S.; Park, C.-E.; Park, H.; Shin, J.; Bae, Y.; Park, H.; Park, C.R. Unprecedented wildfires in Korea: Historical evidence of increasing wildfire activity due to climate change. Agric. For. Meteorol. 2024, 348, 109920.
14. Korea Forest Service. Explanation of the Risk Index Calculation Algorithm for the National Forest Fire Danger Rating System; Forest Fire Research Division: Daejeon, Republic of Korea, 2024. (In Korean)
15. Won, M.; Jang, K.; Yoon, S. Development of the National Integrated Daily Weather Index (DWI) Model to Calculate Forest Fire Danger Rating in the Spring and Fall. Korean J. Agric. For. Meteorol. 2018, 20, 348–356. (In Korean)
16. Ryu, J.; Kim, S.Y.; Lim, C.; Kwon, C.G. Readjustment of Forest Fire Danger Season According to Climate Change. Crisisonomy 2024, 20, 83–91. (In Korean)
17. Kwak, H.; Lee, W.K.; Saborowski, J.; Lee, S.Y.; Won, M.S.; Koo, K.S.; Lee, M.B.; Kim, S.N. Estimating the spatial pattern of human-caused forest fires using a generalized linear mixed model with spatial autocorrelation in South Korea. Int. J. Geogr. Inf. Sci. 2012, 26, 1589–1602.
18. Kim, S.J.; Lim, C.-H.; Kim, G.S.; Lee, J.; Geiger, T.; Rahmati, O.; Son, Y.; Lee, W.-K. Multi-temporal analysis of forest fire probability using socio-economic and environmental variables. Remote Sens. 2019, 11, 86.
19. Kim, J.; Kim, T.; Lee, Y.E.; Im, S. Spatial and temporal variability of forest fires in the Republic of Korea over 1991–2020. Nat. Hazards 2025, 121, 9801–9821.
20. Hong, S.; Ahn, M.; Hwang, J. The Effect of Road Density and Vegetation Type on Large Forest Fire Damage—Centered on the 2023 Hongseong Forest Fire. Korean J. Environ. Ecol. 2024, 38, 634–645. (In Korean)
21. Lee, H.-E.; Kwon, S.; Lim, C.-H. Investigating the Environmental Influencing Factors on Large Wildfire Spread Rate Considering the Spatial Configuration of Forest Roads. J. Korean Soc. For. Sci. 2025, 114, 558–569. (In Korean)
22. Park, J.; Kwon, S.-A.; Cho, S.; Lee, G.; Ryu, S. Derivation of Improvement Direction through Case Analysis of Wildfire Response in Korea. J. Korean Assoc. Crisis Emerg. Manag. 2025, 15, 45–58. Available online: https://www.earticle.net/Article/A475782 (accessed on 26 May 2026). (In Korean)
23. Kang, S.-C.; Won, M.; Yoon, S. Large Fire Forecasting Depending on the Changing Wind Speed and Effective Humidity in Korean Red Pine Forests Through a Case Study. J. Korean Assoc. Geogr. Inf. Stud. 2016, 19, 146–156. (In Korean)
24. Korea Forest Service. Forest Tending Projects. Korea Forest Service: Daejeon, Republic of Korea. Available online: https://www.forest.go.kr/kfsweb/kfi/kfs/cms/cmsView.do?cmsId=FC_000900&mn=AR01_03_01 (accessed on 17 May 2026). (In Korean)
25. O’Brien, R.M. A caution regarding rules of thumb for variance inflation factors. Qual. Quant. 2007, 41, 673–690.
26. Hosmer, D.W.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression, 3rd ed.; Wiley: Hoboken, NJ, USA, 2013.
27. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
28. Zald, H.S.J.; Dunn, C.J. Severe fire weather and intensive forest management increase fire severity in a multi-ownership landscape. Ecol. Appl. 2018, 28, 1068–1080.
29. Lee, C.; Choi, E.H.; Han, Y.; Lee, Y. Year-round daily wildfire prediction and key factor analysis using machine learning: A case study of Gangwon State, South Korea. Sci. Rep. 2025, 15, 29910.
30. Lee, J.; Ahn, S.; Im, S. Spatial prediction of forest fire occurrence integrating human proximity: A machine learning approach for Korea’s eastern coast. Forests 2026, 17, 281.
31. Choi, J.; Yun, Y.; Chae, H. Forest fire risk prediction in South Korea using Google Earth Engine: Comparison of machine learning models. Land 2025, 14, 1155.
32. Lee, S.J.; Kwon, C.G.; Seo, K.W.; Lee, Y.J.; Kim, S.Y. Thinning Effect on Fuel Load and Crown Fire Hazard - A Case Study of Pinus Densiflora in Goseong, Gangwon Province. Crisisonomy 2023, 19, 27–37. (In Korean)
33. Lee, Y.E.; Lee, S.J.; Kwon, C.G.; Seo, K.W.; Bang, C.A.; Kim, S.Y. The Effects of Thinning Slash on Wildfire Fuel Type. Crisisonomy 2020, 16, 61–69. (In Korean)

Figure 1. Study area.

Figure 2. Research framework for analyzing the impact of physical and human-driven factors on forest fire occurrence.

Figure 3. Regional division map of Gangwon-Yeongseo, Gangwon-Yeongdong, and Gyeongbuk.

Figure 4. Maps showing the distribution of forest attributes, infrastructure, and management areas with forest fire breakout points: (a) normalized stand age mean, (b) normalized conifer ratio, (c) normalized diameter class, (d) normalized stand density, (e) normalized forest height, (f) municipal boundaries and forest road, (g) municipal boundaries and hiking trails, and (h) municipal boundaries with forest management areas (planting and forest tending zones).

Figure 5. Seasonal wildfire pattern in Korean provinces (2022–2025). (Top) Monthly distribution of wildfire occurrences as a percentage of the annual total (n = 471). Peak season (January–April) is highlighted in red. (Bottom) Monthly fire event counts disaggregated by province (Gyeongbuk, Gangwon-Yeongdong, Gangwon-Yeongseo).

Figure 6. SHAP summary plot (beeswarm plot).

Figure 7. SHAP dependence plots for stand age mean and conifer ratio (x-axis: normalized stand age, ranging from 0 to 1, where higher values represent older stands).

Figure 8. SHAP dependence plots for trail density and road density.

Figure 9. SHAP dependence plots for effective humidity and daily precipitation.

Table 1. Composition of the final analysis dataset by region (Target = 1 vs. Target = 0).

Region	Target = 1 (Occurrence)	Target = 0 (Non-Occurrence)	Total Grids
Gyeongbuk	266 (56.48%)	566	832
Yeongseo	139 (29.51%)	296	435
Yeongdong	66 (14.01%)	138	204
Total	471 (100%)	1000	1471

Table 2. Definitions of dependent and independent variables for logistic regression analysis.

Types	Variables	Specifications
Weather Factors ( $X_{1}$ )	Effective Humidity	Daily effective humidity (%)
	Maximum Wind Speed	Daily maximum wind speed (m/s)
	Precipitation	Daily total precipitation (mm)
Forest Characteristics ( $X_{2}$ )	Stand Age Mean	Mean age of forest stands (years)
	Conifer Ratio	Proportion of coniferous forest area (%)
	Diameter Class	Mean diameter class of trees (cm/dmcls)
	Stand Density	Degree of forest stocking/density (%)
	Average Height	Mean height of forest stands (m)
Infrastructure Factors ( $X_{3}$ )	Road Density	Total length of forest roads per grid (km/km$^{2}$)
	Trail Density	Total length of hiking trails per grid (km/km$^{2}$)
	Distance to Road	Euclidean distance to the nearest road (km)
	Distance to Trail	Euclidean distance to the nearest trail (km)
Forest Management Factors ( $X_{4}$ )	Plantation Forest Tending	Area ratio of plantation forest management activities (%)
	Natural Forest Tending	Area ratio of natural forest management activities (%)
	Other Management	Area ratio of other management activities (%)
Temporal Factors ( $X_{5}$ )	Season	Categorized as Spring, Summer, Fall, and Winter based on occurrence date (Reference: Summer)
Target (Y)	Fire Occurrence	Daily wildfire occurrence (Binary: 0 or 1)

Table 3. Characteristics of forest roads and forest area in the study area.

Region (Si-Do)	Forest Area (ha)	Forest Road Length (km)	Forest Road Density (m/ha)	Trail Length (km)	Trail Density (m/ha)
Gangwon-do	1,365,746	5496.47	4.02	5197.51	3.81
Gyeongsangbuk-do	1,286,222	4464.91	3.47	5600.11	4.35
Total/Average	2,651,968	9961.38	3.76	10,797.62	4.07

Note: Forest Area (ha): Korea Forest Service (2024) Statistical Yearbook of Forestry. Forest Road Length (km): Spatial data on forest roads were obtained from the Forest Geospatial Information Service (FGIS) of the Korea Forest Service; data on public and private forest roads were additionally acquired through information disclosure requests submitted to the relevant authorities of Gangwon-do and Gyeongsangbuk-do; forest road lengths were calculated using the Field Calculator function in QGIS. Trail Length (km): Spatial data on mountain trails were obtained from the FGIS of the Korea Forest Service; in accordance with the dataset’s table definition, trail lengths were derived by summing the values of the PMNTN_LT attribute (planimetric length of mountain trails, in km).
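
The density columns of Table 3 can be re-derived from the length and area columns. The sketch below does so with the values transcribed from the table; the function and variable names are hypothetical.

```python
def density_m_per_ha(length_km: float, area_ha: float) -> float:
    """Convert a length in km over an area in ha to a density in m/ha."""
    return length_km * 1000.0 / area_ha


# (forest area in ha, forest road length in km, trail length in km), from Table 3.
REGIONS = {
    "Gangwon-do": (1_365_746, 5496.47, 5197.51),
    "Gyeongsangbuk-do": (1_286_222, 4464.91, 5600.11),
}

road_density = {}
trail_density = {}
for region, (area_ha, road_km, trail_km) in REGIONS.items():
    road_density[region] = density_m_per_ha(road_km, area_ha)
    trail_density[region] = density_m_per_ha(trail_km, area_ha)

# Pooled figures: total length divided by total area.
total_area = sum(area for area, _, _ in REGIONS.values())
pooled_road = density_m_per_ha(sum(r for _, r, _ in REGIONS.values()), total_area)
pooled_trail = density_m_per_ha(sum(t for _, _, t in REGIONS.values()), total_area)

print({k: round(v, 2) for k, v in road_density.items()}, round(pooled_road, 2))
print({k: round(v, 2) for k, v in trail_density.items()}, round(pooled_trail, 2))
```

The pooled values reproduce the Total/Average row (3.76 and 4.07 m/ha), which indicates that the row is total length over total area rather than a simple mean of the two provincial densities.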

Table 4. Summary of forest management activities by region (2015–2017).

Management Groups	Specific Activities	Gangwon (ha)	Gyeongsangbuk (ha)
	Pruning	13.7	1.5
	Thinning	3852.6	2258.4
	Tending of young trees	2344.5	1737.5
Plantation Tending	Stand cleaning	681.6	11.2
	Weeding	4993.6	5347.6
	Planting	2017.7	1311.8
	Public forest tending	6189.5	2910.3
Natural Tending	Natural forest improvement	693.5	10,437.3
	Natural forest tending	2456.0	5005.2
	Others	0.0	279.9
Other Management	Vine removal	867.8	931.8
	Logging residue collection	55.7	840.1
Total Area		24,166.2	30,132.6

Table 5. VIF-based Primary Selection.

Category	Selected Independent Variables	VIF
Infrastructure	Road density, Trail density, Distance to road, Distance to trail	<10
Management treatment	Other Management, Plantation forest tending, Natural forest tending	<10
Environmental factors	Conifer ratio, Max wind speed, Daily precipitation	<10
Temporal factors	Season (Spring, Fall, Winter)	<10
Model Fit	Pseudo $R^{2}$ = 0.1505, AUC = 0.7494

Note: Variables with a Variance Inflation Factor (VIF) $< 10$ were retained as candidates for logistic regression. VIF measures the degree of multicollinearity among predictors; a threshold of 10 is commonly applied to exclude highly collinear variables [25].
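
As a rough illustration of the screening rule in this note, the sketch below computes a VIF for the simplest case of two predictors, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation. In the general case each predictor is regressed on all the others, which this sketch does not do. The data and variable names are hypothetical and do not come from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors: two that move together and one unrelated.
stand_age = [10, 20, 30, 40, 50]
diameter = [8, 14, 22, 26, 34]        # closely tracks stand_age
trail_km = [2, 1, 3, 1, 2]            # uncorrelated with stand_age

vif_high = vif_two_predictors(stand_age, diameter)
vif_low = vif_two_predictors(stand_age, trail_km)

THRESHOLD = 10  # cutoff stated in the table note
print(f"stand_age vs diameter: VIF = {vif_high:.1f}, exceeds cutoff: {vif_high >= THRESHOLD}")
print(f"stand_age vs trail_km: VIF = {vif_low:.1f}, exceeds cutoff: {vif_low >= THRESHOLD}")
```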

Table 6. Final selection based on ML feature importance.

Variables Finally Selected	RF Rank	XGB Rank	Remarks
Effective humidity	1	1	Excluded based on VIF criteria
Stand age mean	2	3	Excluded based on VIF criteria
Conifer ratio	3	2
Distance to road	4	4
Distance to trail	5	5
Max wind speed	6	8
Road density	7	7
Daily precipitation	8	-	Selected only by RF
Plantation forest tending	9	6
Trail density	10	9
Natural forest tending	-	10	Selected only by XGB
Model Fit (using Top 10)	AUC = 0.8001 Pseudo $R^{2} = 0.1950$	AUC = 0.7960 Pseudo $R^{2} = 0.1818$	The RF variable set was selected for the final logistic regression.

Table 7. Logistic regression analysis results for forest fire occurrence.

Variables	Coefficient	Std. Error	z-Value	$p > \| z \|$	Odds Ratio
Intercept	3.7124	0.549	6.764	<0.001 ***	-
Effective humidity	−0.0713	0.006	−12.012	<0.001 ***	0.931
Conifer ratio	1.4446	0.293	4.926	<0.001 ***	4.240
Stand age mean	−2.8588	0.737	−3.881	<0.001 ***	0.057
Distance to road	1.1206	2.680	0.418	0.676	3.067
Distance to trail	−2.6321	1.632	−1.613	0.107	0.072
Road density	−2.7202	0.988	−2.752	0.006 **	0.066
Max wind speed	0.0117	0.035	0.334	0.739	1.012
Plantation tending	0.0501	0.862	0.058	0.954	1.051
Trail density	1.4625	0.873	1.676	0.094 *	4.317
Daily precipitation	−0.1881	0.067	−2.826	0.005 **	0.829

Note: Pseudo $R^{2}$ = 0.1950. Significance levels: * $p < 0.1$, ** $p < 0.05$, *** $p < 0.01$. Odds Ratio is calculated as exp(Coefficient).
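
The relationship stated in the note, Odds Ratio = exp(Coefficient), can be checked directly against the table. The sketch below recomputes each odds ratio from the coefficients transcribed from Table 7 and reports the largest gap from the published value; the dictionary and function names are hypothetical.

```python
import math

# (coefficient, reported odds ratio) pairs transcribed from Table 7.
TABLE7 = {
    "Effective humidity": (-0.0713, 0.931),
    "Conifer ratio": (1.4446, 4.240),
    "Stand age mean": (-2.8588, 0.057),
    "Distance to road": (1.1206, 3.067),
    "Distance to trail": (-2.6321, 0.072),
    "Road density": (-2.7202, 0.066),
    "Max wind speed": (0.0117, 1.012),
    "Plantation tending": (0.0501, 1.051),
    "Trail density": (1.4625, 4.317),
    "Daily precipitation": (-0.1881, 0.829),
}


def odds_ratio(coefficient: float) -> float:
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(coefficient)


recomputed = {name: odds_ratio(coef) for name, (coef, _) in TABLE7.items()}
max_gap = max(abs(recomputed[name] - reported) for name, (_, reported) in TABLE7.items())

for name, value in recomputed.items():
    print(f"{name}: {value:.3f}")
print(f"largest gap from reported values: {max_gap:.4f}")
```

An odds ratio below 1 (for example road density) marks a variable whose increase lowers the odds of occurrence, and a value above 1 (for example trail density) marks one whose increase raises them.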

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
