Article

Predicting Persistent Forest Fire Refugia Using Machine Learning Models with Topographic, Microclimate, and Surface Wind Variables

1
Department of Geography and Environmental Studies, Stellenbosch University, P. Bag X1, Stellenbosch 7602, South Africa
2
Natural Resource Science and Management, Science Faculty, Nelson Mandela University, George Campus, George 6530, South Africa
3
Forest Science Program, Department of Plant and Soil Sciences, University of Pretoria, c/o Forestwood cc, 35 Grace Avenue, Murrayfield, Pretoria 0184, South Africa
4
Centre for Geospatial and Computing Technologies, Faculty of Environment, Society and Design|Te Wāhaka ki te Taiao, te Hāpori Whānui me kā Mahi Hoahoa, Lincoln University, Forbes Building, Ellesmere Junction Road, Lincoln 7608, New Zealand
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2025, 14(12), 480; https://doi.org/10.3390/ijgi14120480
Submission received: 7 September 2025 / Revised: 24 November 2025 / Accepted: 29 November 2025 / Published: 5 December 2025

Abstract

Persistent forest fire refugia are areas within fire-prone landscapes that remain fire-free over long periods of time and are crucial for ecosystem resilience. Modelling to develop maps of these refugia is key to informing fire and land use management. We predict persistent forest fire refugia using variables linked to the fire triangle (aspect, slope, elevation, topographic wetness, convergence and roughness, solar irradiation, temperature, surface wind direction, and speed) in machine learning algorithms (Random Forest, XGBoost; two ensemble models) and K-Nearest Neighbour. All models were run with and without ADASYN over-sampling and grid search hyperparameterisation. Six iterations were run per algorithm to assess the impact of omitting variables. Aspect is twice as influential as any other variable across all models. Solar radiation and surface wind direction are also highlighted, although the order of importance differs between algorithms. The predominant importance of aspect relates to solar radiation received by sun-facing slopes and resultant heat and moisture balances and, in this study area, the predominant fire wind direction. Ensemble models consistently produced the most accurate results. The findings highlight the importance of topographic and microclimatic variables in persistent forest fire refugia prediction, with ensemble machine learning providing reliable forecasting frameworks.

1. Introduction

Forest fire refugia are fire-free areas within fire-prone landscapes (persistent fire refugia, [1,2]). Sometimes fire refugia are defined as areas that experience fire less frequently or with lower severity than their surroundings (ephemeral fire refugia, [1,3,4]). Forest fire refugia are often surrounded by forest, but sometimes they are embedded in other vegetation types, such as Mediterranean shrublands [5,6] or Spanish open woodlands [2]. Fire refugia are ecologically important, providing habitat for fire-sensitive forest flora and fauna in fire-prone matrices, shelter during the fire, and food after the fire, as well as maintaining mature vegetation and seed sources, which are essential for post-fire forest ecosystem recovery. Consequently, forest fire refugia enhance forest resilience [7].
Fire behaviour and spread are determined by factors illustrated by the fire behaviour triangle, namely, fuels, weather, and topography [1,8]. For a fire to start there must be fuel to burn, oxygen, and an ignition, often termed ‘heat’. Once started, fire behaviour will be determined by the types and condition of the fuel, weather, and topographical features. Dry fuel promotes fire spread, whereas wet, discontinuous, or sparse fuel slows or stops fire. Hot, dry, windy weather will promote fire spread, whereas cold, damp weather can help slow or contain a fire. Topographical features such as slope or channels can promote fire spread. Cooler slopes and drainage lines can slow a fire. Scree slopes and cliffs can be a barrier to fires [9]. Topographic features, such as hills, mountains, and deep valleys, can shield refugia from fire driven by prevailing winds, particularly hot and dry foehn and katabatic winds [2]. All of these factors have been shown to be important in the formation of fire refugia in North American forests [3,10,11] and Mediterranean-type ecosystems [2]. Understanding the interplay between topography, surface wind movement, local climatic conditions, and fire behaviour is essential for identifying and conserving persistent forest fire refugia. No management technique can completely ensure that refugia will not burn under extreme fire conditions. However, knowledge of forest fire refugia locations and drivers can inform forest management practices aimed at promoting ecosystem resilience in fire-prone regions (sensu [9]), such as ensuring that these areas are not harvested.
Studies aiming to identify predictors of forest fire refugia have examined slope, aspect, the topographic wetness index, the topographic convergence index, topographic roughness or ruggedness [3,9,12], and elevation [13]. The topographic wetness index is a measure of water accumulation and soil saturation based on landscape position [14]. The topographic convergence index is a metric of cold air pooling [3]. Topographic roughness is a measure of topographic complexity, and solar irradiation relates to heat received by a site.
Globally, forest occurrence is determined by climate and soil characteristics and fire exposure, with feedback loops of shading and moisture availability due to forest presence further promoting the location of forest patches [5,15,16,17,18,19]. Afromontane forests in the Southern Cape and Tsitsikamma region of South Africa occur in a matrix of fire-prone fynbos shrublands in locations protected from fire [5,20,21] and with higher soil moisture [16,17,22]. Typically, fire-protected positions in this landscape occur on the lee side of topographic features in relation to the directions of desiccating katabatic (‘berg’) winds [5,19,23,24]. In the Southern Cape region, forest refugia seldom burn, with patches recorded as surviving fire in excess of 100–230 years [25,26], while some patches never burn [27]. The fire regime of surrounding fynbos sees a mean fire return interval of 10–13 years [28,29].
This study will include surface wind speed and direction derived from computational fluid dynamics models, along with various topographic and microclimatic variables, in machine learning models to predict persistent forest fire refugia locations in the Southern Cape of South Africa to inform management action planning.

2. Materials and Methods

2.1. Study Area

The Southern Cape and Tsitsikamma region of South Africa (running from Stormsriver in the east to George in the west, Figure 1) contains the largest area of Afrotemperate forest in southern Africa [19,30]. The forest occurs embedded in fire-prone, fire-adapted fynbos, which is one of the five Mediterranean-type ecosystems of the world and part of the Cape Floristic Region [31]. Boundaries between forest patches and surrounding fynbos vegetation are likely determined by fire occurrence, climate, and soil characteristics [5,15,16,17,18,19]. On a local scale, the potential of a landscape to accommodate forest is shaped by site characteristics, including climatic and substrate variables. In the relatively uniform Tsitsikamma region, evergreen forest is supported across a rainfall gradient of over 1200 mm/year to about 500 mm/year, a pattern consistent across South Africa and southeastern Democratic Republic of Congo [32]. Detailed field work by one of the authors (250 plots within evergreen forests along six transects from mountain to coast in the Southern Cape) recorded species composition and substrate variables [33]. In all areas, fire-adapted fynbos shrublands occurred adjacent to evergreen forest along typical fire pathways, with slope as the key differentiating factor. This supports the hypothesis that fire dynamics convert landscape potential into actual vegetation patterns, with forests persisting in fire-shadow zones and fire-adapted communities dominating fire pathways.
The narrow (10–40 km wide) coastal plain of the Southern Cape–Tsitsikamma region has a predominantly southerly aspect and rises from the Indian Ocean in the south to the foothills and rugged crest of the Outeniqua and Tsitsikamma Mountains in the north, peaking at 1675 m [34]. Due to the maritime influence, the climate is temperate. Although rain falls throughout the year, it peaks in spring and autumn. Summer months are the driest due to high evaporative demand, particularly associated with prevailing southeasterly winds. In winter, hot and dry northerly bergwinds sporadically create high-fire-danger conditions conducive to fires [34,35]. Consequently, fires may occur at any time of the year [29], and there is no distinct fire season in the study area. The western Sedgefield–Knysna portion of the Southern Cape is the focus of this study. It experiences northwesterly foehn winds (white box, Figure 1).

2.2. Data for Persistent Forest Fire Refugia

The fire regime of the forest is characterised by very long fire return intervals and patches that never burn [27], with an analysed tree ring showing that at least one forest patch has remained unburned for over 230 years [25]. Persistent forest fire refugia were thus identified as areas not burned in the last 100 years [34] and which have forest stands in contrast to the surrounding fynbos matrix, which is a shrubland of smaller stature and which burns at fire return intervals of 10–13 years [27,36,37,38]. This definition is similar to the analysis in [26]. The 100-year threshold used to define persistent refugia is ecologically grounded but remains somewhat arbitrary and could be refined through sensitivity testing. In addition, the models are based on current climatic and fire regime conditions. Persistent forest fire refugia were initially identified using the MODIS burned area product (500 m) between 2000 and 2022 [36]. These were subsequently validated through manual interpretation of aerial photographic imagery with a pixel size of 10–20 m and spanning approximately 100 years (1930–2022). This provided finer spatial detail and extended temporal coverage, helping to mitigate the under-detection of small or low-intensity fires that may be missed by coarse satellite products covering only two decades. This combined approach improves confidence in refugia mapping, though it does not fully eliminate uncertainty associated with temporal and spatial data gaps. Presence of persistent forest stands is visible on all imagery and was verified by local experts with thorough in-field knowledge of the forest stands. This set of persistent forest fire refugia was iteratively subjected to a 75/25 split for training and validation.

2.3. Data for Predictor Variables

Human-impacted areas such as urban developments and intensively farmed areas were excluded using the 2020 South African National Landcover Classification dataset (https://za.africageoportal.com/maps/cff4e27fa6fb46f6bde59edb889398ef/explore; accessed 15 June 2023). We investigated the following topographic, microclimatic, and surface wind direction and speed variables that can be easily measured over large areas in a consistent manner as possible correlates of the location of persistent forest fire refugia (Table 1). The Shuttle Radar Topographic Mission’s (SRTM; https://www.usgs.gov/centers/eros/science/usgs-eros-archive-digital-elevation-shuttle-radar-topography-mission-srtm-1, USGS, 2004; accessed 4 February 2023) Digital Elevation Model (DEM) [37,38] is supplied at a spatial resolution of 30 m. The SRTM specifications indicate an absolute vertical accuracy of ±5.94 to ±16 m at the 95% confidence level [39]. The SRTM DEM has been thoroughly tested [40,41] and was used to create variables for elevation [13], slope, aspect, the topographic wetness index, the topographic convergence index [3,42], and the topographic roughness index or ruggedness [5,12]. Processing was conducted in QGIS (3.18.2).
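The DEM-derived terrain variables can be illustrated with a minimal NumPy sketch of slope and aspect. This is not the QGIS workflow used in the study: the function name `slope_aspect`, the simple finite-difference gradient, and the north-up raster layout are assumptions made for illustration, and GIS packages may use different kernels and conventions.

```python
import numpy as np

def slope_aspect(dem, cell_size=30.0):
    """Slope (degrees) and aspect (degrees clockwise from north) for a north-up DEM.

    Illustrative helper only; assumes square cells and rows running north to south.
    """
    # np.gradient returns the derivative along rows (southward) and columns (eastward).
    dz_drow, dz_dcol = np.gradient(np.asarray(dem, dtype=float), cell_size)
    dz_dnorth = -dz_drow  # row index increases towards the south
    dz_deast = dz_dcol
    slope = np.degrees(np.arctan(np.hypot(dz_deast, dz_dnorth)))
    # Aspect is the compass bearing of the downslope direction.
    aspect = np.degrees(np.arctan2(-dz_deast, -dz_dnorth)) % 360.0
    # Flat cells have no defined aspect.
    aspect = np.where(slope == 0, np.nan, aspect)
    return slope, aspect

# Synthetic plane: elevation rises 30 m per 30 m cell towards the south,
# so the surface faces north with a 45 degree slope.
rows = np.arange(5, dtype=float).reshape(-1, 1)
dem = np.repeat(rows * 30.0, 5, axis=1)
slope, aspect = slope_aspect(dem)

# Transposed plane: elevation rises towards the east, so the surface faces west.
slope_w, aspect_w = slope_aspect(dem.T)
```

Because aspect is circular (0° and 360° are the same direction), it is often encoded as sine and cosine components before modelling; whether the study did so is not stated.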
Long-term averages of local-scale annual temperature and annual solar irradiation (kWh/m²) were sourced from the World Bank’s Global Solar Atlas at a 250 m spatial resolution [43] (https://solargis.com). These data represent values at 2 m above ground level [43]. These were used in preference to the calculation of solar irradiation from a DEM, which would still need to be validated and is, at best, an estimation of the potential solar irradiation [44]. Global radiation, sometimes called global horizontal irradiation, is the total amount of solar radiation reaching a horizontal surface. It consists of direct radiation (or direct normal irradiation) that reaches the surface and diffuse radiation (diffuse horizontal irradiation) that is scattered by the atmosphere. We include all three in the models.
Surface wind direction and speed were calculated using Computational Fluid Dynamics (CFD) in OpenFOAM [45] for 30 m pixels. Wind speed and direction were calculated at a height of 2 m above the surface, an elevation critical for fire modelling and behaviour [46]. Internally, the simulation achieved stable convergence, with the k-equation residual reducing from 5.54 × 10⁻⁴ to 3.41 × 10⁻⁵, and global and cumulative continuity residuals of −8.53 × 10⁻¹⁰ and −1.01 × 10⁻⁴, respectively, confirming good mass conservation. As no meteorological data points were available within the study area, a pattern-based validation approach was used to evaluate CFD wind direction and speed outputs against known wind movements sensu [47] (Appendix S1) and previously recorded fire paths (Appendix S2). The latter showed that the fire spread model that included the CFD-derived wind data layers most closely matched the 45 min recorded [48] between fire ignition and the arrival of the fire at the Kooigoed weather station (Appendix S2). This type of validation, whereby patterns produced by model outputs are compared with patterns observed in the real world, is often used to evaluate agent-based models in ecology, another type of modelling for which exact validation data are frequently difficult to obtain [49]. Although this provides confidence in directional accuracy and fire spread impact, future work would benefit from additional field-based wind measurements to refine model calibration. Wind speed and direction were converted into raster datasets, georeferenced, and interpolated using neighbourhood averaging to fill gaps in QGIS.
All datasets were resampled using bilinear interpolation to a common, co-registered, aligned 30 m grid based on the SRTM DEM in Python 3.10.9. Bilinear interpolation has been shown to be an effective method for continuous datasets and DEM data [50]. While the original spatial resolution of the Solargis datasets should be considered in interpretations, upsampling allowed for retention of crucial topographic details that would otherwise be lost [51].
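To make the resampling step concrete, the following is a minimal sketch of bilinear interpolation on a plain NumPy array. The study states only that resampling was done in Python, so the function name `bilinear_resample` and the cell-centre alignment are assumptions; a real workflow would normally use a georeferenced raster library so that the 250 m and 30 m grids are aligned by their coordinates rather than by array index.

```python
import numpy as np

def bilinear_resample(src, out_shape):
    """Resample a 2-D array to out_shape using bilinear interpolation.

    Illustrative helper: source and target grids are assumed to cover the same
    extent, with values located at cell centres.
    """
    src = np.asarray(src, dtype=float)
    in_rows, in_cols = src.shape
    out_rows, out_cols = out_shape
    # Position of each target cell centre in source pixel coordinates.
    y = (np.arange(out_rows) + 0.5) * in_rows / out_rows - 0.5
    x = (np.arange(out_cols) + 0.5) * in_cols / out_cols - 0.5
    # Clamp to the source extent so edge cells repeat the nearest source value.
    y = np.clip(y, 0, in_rows - 1)
    x = np.clip(x, 0, in_cols - 1)
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, in_rows - 1)
    x1 = np.minimum(x0 + 1, in_cols - 1)
    wy = (y - y0)[:, None]  # vertical weights, one per output row
    wx = (x - x0)[None, :]  # horizontal weights, one per output column
    # Interpolate along columns first, then along rows.
    top = (1 - wx) * src[np.ix_(y0, x0)] + wx * src[np.ix_(y0, x1)]
    bottom = (1 - wx) * src[np.ix_(y1, x0)] + wx * src[np.ix_(y1, x1)]
    return (1 - wy) * top + wy * bottom

# A coarse 2 x 2 grid upsampled to 4 x 4.
coarse = np.array([[0.0, 10.0], [20.0, 30.0]])
fine = bilinear_resample(coarse, (4, 4))
```

The output never exceeds the range of the input values, which illustrates why upsampling a coarse layer adds no new information: it only produces a smooth surface on the finer grid.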
Table 1. Variables used to model spatial locations of persistent forest fire refugia locations.
| Variable | Relevance | Reference |
|---|---|---|
| Elevation | Higher elevations may be cooler: >moisture, <flammable. Some sites also see the following: <fuel continuity at highest elevations | [3,4,13,14] |
| Slope | Steeper slopes: >fuel preheating, >updrafts, >fire spread | [3,9] |
| Aspect | Sun-facing: >solar radiation, <moisture. Opposite aspect to prevailing fire season wind direction: fire shield | [3,9] |
| Topographic wetness index | Water accumulation: >moisture, <flammability | [3] |
| Topographic convergence index | Cold air pooling: >moisture, <flammability | [3] |
| Topographic roughness index | Rugged terrain: >volatile surface wind > complex fire behaviour, >fire spread | [5] |
| Temperature | Hot areas: <moisture, >flammability | |
| Solar irradiation (global, direct, and diffuse) | >heat, <moisture, >flammability | [4] |
| Wind direction | Prevailing fire wind direction: >heat, <moisture, >fire spread. Lee of prevailing wind direction may experience fire skipping | [46] |
| Wind speed | >wind speed: <moisture, >flammability, and >pre-heating, >rate of fire spread | [46] |

2.4. Machine Learning Models

Several machine learning techniques, chosen for their differing assumptions and suitability for spatial datasets, were employed in this study. To enhance model performance and reduce overfitting, the dataset underwent Principal Component Analysis (PCA), which improved separability between forest fire refugia and non-refugia areas [52]. PCA1 and PCA2 explained 68.03% and 20.38% of variance, respectively, and were used to increase separability of refugia and non-refugia (Appendix S3). This refinement resulted in 9959 forest refugia pixels and 21,769 non-refugia pixels for training. As there are more than double the number of non-refugia pixels compared to forest refugia pixels in the training dataset, the training dataset can be considered imbalanced. Imbalanced datasets often cause inaccurate results in classifications [53], and consequently, adaptive synthetic (ADASYN) oversampling was tested as it has been found to be useful in improving classification with imbalanced training data [54]. ADASYN aims to keep classes truer to the original data while addressing class imbalance issues. To further reduce overfitting, a 70/30 spatial block holdout split was used for training and validation. This was performed by splitting the input raster into 64 × 64 pixel blocks and assigning each block an ID; only blocks that contain refugia or non-refugia pixels are considered for training or validation. All machine learning algorithms were executed 10 times with different random seeds for the block selection, and the reported results represent the mean, standard deviation, and 95% confidence intervals across these spatial splits. These splits are run iteratively as specified after the discussion on GridSearch.
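The spatial block holdout described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `spatial_block_split`, the label coding (1 = refugia, 0 = non-refugia, -1 = unlabelled), and the rule that a block qualifies if it holds at least one labelled pixel are all assumptions.

```python
import numpy as np

def spatial_block_split(labels, block_size=64, train_fraction=0.7, seed=0):
    """Assign labelled pixels to training or validation by whole blocks.

    labels: 2-D integer array, 1 = refugia, 0 = non-refugia, -1 = unlabelled
    (hypothetical coding). Returns boolean masks and the block ID raster.
    """
    n_rows, n_cols = labels.shape
    blocks_per_row = -(-n_cols // block_size)  # ceiling division
    row_idx, col_idx = np.indices(labels.shape)
    block_id = (row_idx // block_size) * blocks_per_row + (col_idx // block_size)
    labelled = labels >= 0
    # Only blocks that contain at least one labelled pixel are candidates.
    candidate_blocks = np.unique(block_id[labelled])
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(candidate_blocks)
    n_train = int(round(train_fraction * shuffled.size))
    in_train_block = np.isin(block_id, shuffled[:n_train])
    train_mask = labelled & in_train_block
    valid_mask = labelled & ~in_train_block
    return train_mask, valid_mask, block_id

# Synthetic 256 x 320 label raster, giving 4 x 5 = 20 blocks of 64 x 64 pixels.
rng = np.random.default_rng(42)
labels = rng.integers(-1, 2, size=(256, 320))
train_mask, valid_mask, block_id = spatial_block_split(labels, seed=1)
```

Calling the function with ten different `seed` values would reproduce the idea of ten runs with different block selections. Because whole blocks are held out, neighbouring pixels from the same block never appear in both sets, which is the point of using blocks rather than a random pixel split.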
Three machine learning algorithms—Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbours (KNN)—were applied to assess the relationship between variables described earlier and forest fire refugia and to predict unknown forest fire refugia. RF is an ensemble learning method that constructs multiple decision trees and aggregates their outputs [55]. This approach reduces overfitting and enhances generalisability [56]. RF also provides feature importance rankings, aiding the understanding of key drivers of forest fire refugia [55]. The classifier, implemented in Python using Scikit-learn (Appendix S4), is effective at handling correlated datasets [57]. XGBoost is a decision-tree-based algorithm that iteratively improves predictions through gradient boosting [58]. It is well-regarded for its speed, scalability, and effectiveness in handling imbalanced datasets [59]. XGBoost was implemented using the XGBoost package in Python (Appendix S5), with precautions taken to mitigate overfitting [60]. K-Nearest Neighbours (KNN) is a non-parametric algorithm that classifies data points based on proximity. It does not require large training datasets and can perform well with both balanced and imbalanced datasets [61]. However, it is sensitive to the choice of K and may experience performance degradation with correlated features [62]. The KNN implementation was performed in Python (Appendix S6). To enhance classification performance, ensemble learning was applied using both hard voting (majority vote, Appendix S7) and soft voting (weighted vote, Appendix S8) approaches. Ensemble methods leverage the strengths of multiple classifiers to improve predictive accuracy. Similar sets of algorithms have been used in other recent studies looking at factors affecting various properties of fire [63,64].
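As a rough illustration of the two voting schemes, the sketch below combines three classifiers with scikit-learn's `VotingClassifier` on synthetic data. It is not the code from Appendices S7 and S8: the data, the hyperparameters, and the use of `GradientBoostingClassifier` as a stand-in for XGBoost (to keep the example dependent on scikit-learn only) are assumptions. In scikit-learn, hard voting takes the majority class label, whereas soft voting averages the predicted class probabilities, optionally with weights.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, moderately imbalanced two-class data standing in for pixel samples.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

def base_models():
    """Fresh, unfitted base classifiers (illustrative settings only)."""
    return [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),  # stand-in for XGBoost
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ]

# Hard voting: each model casts one vote per sample and the majority label wins.
hard = VotingClassifier(estimators=base_models(), voting="hard").fit(X_train, y_train)
hard_pred = hard.predict(X_test)

# Soft voting: class probabilities are averaged and the largest mean wins.
soft = VotingClassifier(estimators=base_models(), voting="soft").fit(X_train, y_train)
soft_proba = soft.predict_proba(X_test)
soft_pred = soft.predict(X_test)
```

Soft voting requires every base model to expose `predict_proba`, which all three classifiers used here do.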
Using default parameters for machine learning often gives suboptimal results [65]. Fine-tuning of parameters, such as the minimum number of samples per split, the minimum samples per leaf, number of estimators, maximum number of features, and maximum depth, can improve model results by up to 40% [65,66]. Grid search, a widely used hyperparameter optimisation technique for machine learning algorithms, searches through a predefined set of hyperparameter values and evaluates metrics such as R² and accuracy, normally using k-fold cross-validation. A variety of settings were tested for each technique using grid search (Table 2). The setting identified by the grid search as optimal for each experiment can be seen in Appendix S9. Grid search was run for the first experiment for each of RF, XGBoost, and KNN. The remaining nine experiments applied these settings using a stable random seed for the classifier and a varying random seed for the training/validation dataset. The ensemble methods used the best parameters from the single-method experiments with the same features. The number of cross-validation iterations (random splitting of the training and verification datasets) run for each of the 12 experiments was calculated as the product of the number of possible combinations of the GridSearchCV parameters and the number of GridSearch folds (CV), which was set to 3. This totalled 1296 for RF, 2160 for XGBoost, and 72 for KNN. GridSearchCV in Scikit-learn performs k-fold cross-validation.
A total of 30 machine learning experiments were conducted, systematically varying dataset inputs and employing ADASYN oversampling in subsequent runs. For each of the five modelling approaches (RF, XGBoost, KNN, and the two voting ensembles), variables were omitted to determine the impact of their exclusion, starting with exclusion of elevation (experiments 3, 9, 15, 21, and 27, Appendix S10), temperature (experiments 4, 10, 16, 22, and 28), diffuse irradiation, direct irradiation, and global irradiation (experiments 5, 11, 17, 23, and 29), and wind speed and direction, until only DEM-derived variables (aspect, slope, topographical convergence, topographical roughness, and topographical wetness index) remained (experiments 6, 12, 18, 24, and 30).
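The arithmetic behind the number of grid search fits can be checked with a few lines of Python. With three folds, the reported totals imply 432 parameter combinations for RF (1296 / 3), 720 for XGBoost (2160 / 3), and 24 for KNN (72 / 3). The grid below is hypothetical: Table 2 is not reproduced here, so the parameter values are placeholders chosen only so that the RF grid has 432 combinations.

```python
from itertools import product
from math import prod

# Hypothetical RF grid; values are illustrative, not those in Table 2.
rf_grid = {
    "n_estimators": [100, 200, 300, 500],
    "max_depth": [None, 10, 20, 30],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2", None],
}
cv_folds = 3  # number of cross-validation folds stated in the text

def n_combinations(grid):
    """Number of distinct settings in a full grid (product of list lengths)."""
    return prod(len(values) for values in grid.values())

def n_fits(grid, folds):
    """Model fits performed by an exhaustive grid search with k-fold CV."""
    return n_combinations(grid) * folds

rf_combinations = n_combinations(rf_grid)  # 4 * 4 * 3 * 3 * 3
rf_fits = n_fits(rf_grid, cv_folds)

# Enumerating the grid explicitly gives the same count.
all_settings = [dict(zip(rf_grid, values)) for values in product(*rf_grid.values())]
```

Adding one more value to any parameter multiplies the number of fits, which is why the grid was searched once per algorithm and the selected settings were reused for the remaining runs.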

2.5. Model Performance and Evaluation

Standard accuracy metrics were used, including accuracy, precision, and recall. These measures of fit are frequently used in the evaluation of machine learning models and provide a multi-faceted approach to evaluating the model fit. Feature importance was used for the RF models, and gain, which can be interpreted as a form of feature importance, was used in XGBoost. Standard deviation and 95% confidence intervals were calculated for feature importance and gain over the multiple runs.
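For reference, the three metrics and the summary across repeated runs can be computed as in the sketch below. The helper names, the example labels, and the ten accuracy values are hypothetical, and the use of a Student-t interval for the 95% confidence interval is an assumption, since the text does not state how the intervals were derived.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for a binary problem (1 = refugia)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return {
        "accuracy": (tp + tn) / y_true.size,
        # Precision: share of predicted refugia that really are refugia.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: share of true refugia that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

def mean_std_ci95(values, t_crit=2.262):
    """Mean, sample standard deviation, and 95% CI across runs.

    t_crit = 2.262 is the two-sided 95% Student-t value for 9 degrees of
    freedom, i.e. 10 runs; change it if the number of runs differs.
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)
    half_width = t_crit * std / np.sqrt(values.size)
    return mean, std, (mean - half_width, mean + half_width)

# Hypothetical labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)

# Hypothetical accuracies from ten runs with different block-selection seeds.
run_accuracies = [0.80, 0.81, 0.82, 0.83, 0.84, 0.80, 0.81, 0.82, 0.83, 0.84]
mean_acc, std_acc, (ci_low, ci_high) = mean_std_ci95(run_accuracies)
```

Reporting recall alongside accuracy matters for imbalanced data: a model can reach high accuracy by favouring the majority non-refugia class while still missing many refugia pixels.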

3. Results

3.1. What Determines Persistent Forest Fire Refugia

Aspect is consistently the most important variable identified for modelling persistent forest fire refugia locations based on both feature importance and gain (Figure 2, Table 3). It is identified as twice as important as any other variable. Solar irradiation, particularly global horizontal irradiation, is also identified as important across all models (Figure 2, Table 3). Surface wind direction and speed have the next largest impact on feature importance and gain (Table 3), although the exact order of importance differs between algorithms. Temperature is only identified as important by Random Forest models. Temperature showed some importance in XGBoost (experiment 7, Table 3), but this was reduced once ADASYN oversampling was implemented (experiment 8, Table 3). For the other classifiers, removing temperature had a small impact (experiments 4, 10, 16, 22, and 28), with a few more forest fire refugia being predicted, some of which were shown to house forests when inspected on high-resolution imagery. Overall, RF models spread feature importance more broadly, with XGBoost concentrating mostly around aspect. While error bars show more variability over the experiment runs around aspect than the other features, aspect still remains the primary feature of importance for both RF and XGBoost.

3.2. Model Performance

Accuracy metrics suggest that the results of all experiments (all algorithms/models across all combinations of variables) can be treated with confidence with an average overall accuracy of 0.82 (Table 4). The ensemble methods and XGBoost performed the best, with RF showing slightly lower accuracy. KNN had the lowest performance, especially in recall. The unbalanced datasets (with no ADASYN oversampling) generally performed well across all classifiers but struggled with recall. The best-performing experiments overall, especially in recall, were those where elevation and temperature were dropped with all other variables kept. Comparing the mapped outputs of experiments RF3 and 4 with KNN15 and 16 (Figure 3), it is clear that KNN predicts more lowland forests, some of which may be refugia, but others are overestimates when compared to aerial imagery. KNN, however, dramatically overestimates forest fire refugia once elevation is removed (experiments 3, 9, and 15, Figure 3), indicating that KNN outputs are sensitive to which variables are included in the model.

4. Discussion

4.1. What Determines Forest Fire Refugia

Aspect has been highlighted by other studies as an important variable in predicting persistent forest fire refugia. Previous studies [3,13] found strong links between aspect and fire exclusion, which is what we term persistent forest fire refugia. Aspect of a slope can relate to the solar radiation received and the energy or heat balance and the resultant moisture balance. This is also demonstrated by the decrease in model accuracy when solar irradiation is removed. Sun-facing slopes have higher solar radiation [13], less moisture, and drier, more flammable vegetation or fuels [9] (Figure 4). In the southern hemisphere, these are the north and northwestern aspects, whereas in the northern hemisphere they are the southern slopes. Aspect can also affect the type of vegetation or fuel and how continuous it is, with warmer slopes sometimes having sparser vegetation with less fuel continuity and also different vegetation types that are adapted to hotter, drier conditions, which often result in vegetation structures that are more flammable [67]. In this study area, hot northwestern slopes are also those facing the predominant fire wind direction, and cool southwestern slopes act as fire shields that are very important for forest fire refugia persistence [5]. Persistent long-term forest fire refugia are seen in deep valleys on the south-to-southeast aspect in Figure 5. Similar spatial patterns of evergreen forest persistence within fire-shadow zones have been observed throughout southern and tropical Africa, including the Congo Basin, Mount Cameroon, Ethiopia, Malawi, Mozambique, Angola, and across South Africa’s varied landscapes [32].
Solar irradiation (particularly global irradiation) is also identified as important across all models and agrees with the findings of [4]. Elevation has previously been identified as important in mapping forest refugia [13], with fire survival increasing at higher elevations. Elevation did not appear to improve model accuracy in this study area. This might be due to the smaller altitudinal range in our study area (0–1675 m) compared to that in the Canadian Rockies (0–3000+ m, [13]) or the southwestern United States (3000–5000 m, [7]) or the Pacific Cordillera of Canada [3]. In addition, the effect of elevation might be emphasised in those studies due to the higher latitudes of the North American forests, which will further magnify the cooling effect of altitude. Differences in factors identified as important for forest fire refugia will also depend on whether studies focused on persistent forest fire refugia alone, as this study did, or also included ephemeral forest fire refugia (as in the case of [3,7]).
During the last 50 years, climate change seems to have impacted some facets of fire occurrence in the study area. Fire danger weather has become more severe, with higher temperatures and wind speeds and lower relative humidity. This is manifesting in shorter fire return intervals and increasing occurrence of large and high-severity fires [28,29,35,67]. High-severity fires are likely to cause retraction of forest patches, as it has been shown that high-severity fires burn deeper into forest margins [22,24]. However, in the Tsitsikamma area, longer-term rainfall fluctuations over the past ~250 years [27], reconstructed using stem sections of Outeniqua Yellowwood trees collected from the Witelsbos Forest [33], correspond with observations of forest retraction and expansion over longer periods. Forests can expand into fire pathways during wetter, fire-free intervals and then retract to long-term boundaries following severe fires. This pattern is evident across various moist, dry, and scrub forest types in Africa [33]. Potential changes in wind directions due to climate change have not been assessed but are unlikely to affect the fire-flow patterns around and across natural topographic features of the landscape.

4.2. Methods for Mapping Persistent Forest Fire Refugia

Ensemble learning methods and XGBoost consistently achieve the highest accuracy of fit. The stated aims of ensembles are reducing overfitting and enhancing generalisability. KNN did not benefit from ADASYN oversampling, which may highlight its ability to handle unbalanced training data well, as indicated by [61]. However, KNN’s performance metrics are the lowest of all models. Across all experiments, choice of variables included seemed to have more impact on model accuracy than the choice of algorithm. We found that it is useful to compare a range of suitably designed models (methods and data), because after aspect, the order of importance of variables was sensitive to model methods (algorithm, use of ADASYN oversampling) and data.

4.3. Limitations

While this study demonstrates the growing capacity of machine learning models to map persistent forest fire refugia, several methodological and data-related constraints should be acknowledged to contextualise the findings and guide future exploration.
As is often the case with machine learning models, data availability and preprocessing represented some of the most significant constraints in this study. Mixed spatial resolution was a particular challenge. To ensure model compatibility, all raster datasets were standardised to the same resolution and extent. Lower-resolution datasets, such as solar irradiation, were upsampled to match the higher-resolution DEM-derived datasets to avoid losing detail from more finely resolved datasets. This approach is commonly applied in geospatial modelling to ensure spatial consistency and preserve key terrain features [47,50]. However, it can introduce artificial precision and minor spatial inconsistencies, particularly in heterogeneous landscapes. The main implication is a potential smoothing of local variability, which may affect the delineation of smaller refugia.
Like many environmental and medical datasets, class imbalance posed a challenge, with substantially more non-refugia than refugia pixels. Such imbalance can bias classification performance toward the majority class. To reduce this ADASYN oversampling, a widely used technique for addressing class imbalance was applied [54]. While ADASYN improved model training by increasing representation of the minority class, more balanced or stratified samples would further strengthen the robustness of future analyses. Additional techniques, such as ensemble balancing or cost-sensitive learning, may also help reduce residual imbalance effects.
Some model accuracy values may reflect, at least in part, the influence of spatial autocorrelation between training and validation samples. PCA preprocessing reduced dimensionality and may have mitigated some spatial dependence, and block holdout was used to compensate for spatial dependency. Future work could examine ways of minimising spatial clustering in the training data.
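A block holdout keeps whole spatial blocks out of training, so that validation pixels are not immediate neighbours of training pixels. The following is a minimal sketch of that idea only. The block size, coordinates, and function names are hypothetical and do not describe the configuration used in this study.

```python
def block_of(x, y, block_size):
    """Index of the square block containing map coordinate (x, y)."""
    return (int(x // block_size), int(y // block_size))

def block_holdout(samples, block_size, held_out_blocks):
    """Split (x, y, label) samples so that held-out blocks go entirely to test."""
    train, test = [], []
    for x, y, label in samples:
        target = test if block_of(x, y, block_size) in held_out_blocks else train
        target.append((x, y, label))
    return train, test

# Hypothetical pixel centres in metres, with 1 = refugium and 0 = non-refugium
samples = [
    (120.0, 80.0, 1), (150.0, 95.0, 1),      # block (0, 0)
    (1100.0, 140.0, 0), (1350.0, 60.0, 0),   # block (1, 0)
    (400.0, 1700.0, 0), (450.0, 1720.0, 1),  # block (0, 1)
]
train, test = block_holdout(samples, block_size=1000.0, held_out_blocks={(1, 0)})

train_blocks = {block_of(x, y, 1000.0) for x, y, _ in train}
test_blocks = {block_of(x, y, 1000.0) for x, y, _ in test}
print(sorted(train_blocks), sorted(test_blocks))
```

No block contributes to both sets, which a purely random pixel split would not guarantee.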

5. Conclusions

We found that persistent forest fire refugia can be accurately predicted over large areas and in topographically complex settings, such as the Sedgefield–Knysna region of southern Africa. Machine learning algorithms, especially ensemble models, using topographic and microclimatic variables emphasise the importance of aspect and its relationship to site-level heat and water budgets, in agreement with the findings of [3,13]. Aspect is also related to the occurrence of fire shields from the prevailing fire weather wind direction, in agreement with findings from Australia [9]. The importance of surface wind direction has been emphasised by other studies, such as the ‘topographic concavities’ of [7]. While aspect is likely to be important in many areas, the specific aspect will depend on an individual region’s predominant fire wind direction as well as solar exposure, the latter being determined by the hemisphere in which a study area lies.
These findings support previous studies emphasising the influences of topography, microclimate, and surface wind on forest fire refugia persistence, particularly in the Sedgefield–Knysna region [5,19]. The machine learning approach used here to predict fire and wind refugia may be applied to other regions where wind and topography are determinants of fire occurrence and vegetation persistence.
In the Sedgefield–Knysna region, these results highlight the need to protect indigenous forest patches occurring in the lee of topographic features relative to the prevailing wind direction. These locations are likely to be able to sustain forest in the long term. While climate change may bring an increase in fire intensity and frequency, this is unlikely to affect the fire-flow patterns around and across natural topographic features of the landscape. Management of these forest areas may be prioritised, for example to protect them from clear-felling, conversion to agriculture or lifestyle blocks, or replanting with commercial plantations. These areas can furthermore be incorporated as natural fire breaks or low-risk areas in fire management plans. Conversely, identifying areas in the landscape that are not fire refugia may be equally useful to guide fire and vegetation management practices. In such fire-exposed areas, the conservation of fire-prone vegetation should be prioritised. These areas should thus be included in prescribed burning schedules, while invasion by forest species [32] and forest establishment efforts should be discouraged here. Use of fire-exposed areas for plantations, as is currently often carried out, is likely to lead to high economic loss.
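The lee-of-the-wind reasoning can be expressed as a simple angular test. This sketch assumes aspect is the compass direction a slope faces (degrees clockwise from north), takes the northwesterly fire weather wind of Figure 1 as 315°, and treats any slope facing more than 90° away from the wind source as leeward. The threshold and function names are illustrative assumptions, not part of the study's method.

```python
def angular_difference(a_deg, b_deg):
    """Smallest angle between two compass bearings, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def is_leeward(aspect_deg, wind_from_deg, threshold_deg=90.0):
    """True when the slope faces away from the direction the wind comes from."""
    return angular_difference(aspect_deg, wind_from_deg) > threshold_deg

NORTHWESTERLY = 315.0  # assumed bearing for the northwesterly foehn wind

for name, aspect in [("south-east", 135.0), ("south", 180.0), ("north-west", 315.0)]:
    print(name, is_leeward(aspect, NORTHWESTERLY))
```

For a region with a different predominant fire wind, only the wind bearing changes. The sun-facing aspect would additionally depend on the hemisphere.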

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijgi14120480/s1, Appendix S1. Patterns based validation, Appendix S2. Fire model pattern based validation, Appendix S3. PCA, Appendix S4. Random forest script with grid search and ADASYN sampling, Appendix S5. XGBoost implementation in Python, Appendix S6. The KNN implementation was performed in Python, Appendix S7. Ensemble script soft, Appendix S8. Ensemble hard script, Appendix S9. Experimental design showing experiment number, dataset and oversampling, X indicate dataset was used in experiments, Appendix S10. Best parameters per experiment.

Author Contributions

Conceptualisation, Sven Christ, Helen M. de Klerk and Coert J. Geldenhuys; methodology, Sven Christ and Helen M. de Klerk; formal analysis, Sven Christ; writing—original draft preparation, Sven Christ and Helen M. de Klerk; writing—review and editing, Helen M. de Klerk, Coert J. Geldenhuys and Tineke Kraaij. All authors have read and agreed to the published version of the manuscript.

Funding

Sven Christ would like to acknowledge the National Institute for the Humanities and Social Sciences scholarship (NIHSS SDS17/1130, http://nihss.ac.za/) for funding. The content of this article is solely the responsibility of the authors and does not necessarily represent the views of the funder.

Data Availability Statement

The code used is supplied in the Appendices in the Supplementary Materials. Open-source data are used, and links to the public archives are supplied.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Meddens, A.J.H.; Kolden, C.A.; Lutz, J.A.; Smith, A.M.S.; Cansler, C.A.; Abatzoglou, J.T.; Meigs, G.W.; Downing, W.M.; Krawchuk, M.A. Fire Refugia: What Are They, and Why Do They Matter for Global Change? BioScience 2018, 68, 944–954.
  2. Román-Cuesta, R.M.; Gracia, M.; Retana, J. Factors influencing the formation of unburned forest islands within the perimeter of a large forest fire. For. Ecol. Manag. 2009, 258, 71–80.
  3. Krawchuk, M.A.; Haire, S.L.; Coop, J.; Parisien, M.A.; Whitman, E.; Chong, G.; Miller, C. Topographic and fire weather controls of fire refugia in forested ecosystems of northwestern North America. Ecosphere 2016, 7, e01632.
  4. Rogeau, M.-P.; Barber, Q.E.; Parisien, M.-A. Effect of Topography on Persistent Fire Refugia of the Canadian Rocky Mountains. Forests 2018, 9, 285.
  5. Geldenhuys, C.J. Bergwind Fires and the Location Pattern of Forest Patches in the Southern Cape Landscape, South-Africa. J. Biogeogr. 1994, 21, 49–62.
  6. Giddey, B.L.; Baard, J.A.; Kraaij, T. Verification of the differenced Normalised Burn Ratio (dNBR) as an index of fire severity in Afrotemperate Forest. S. Afr. J. Bot. 2022, 146, 348–353.
  7. Rodman, K.C.; Davis, K.T.; Parks, S.A.; Chapman, T.B.; Coop, J.D.; Iniguez, J.M.; Roccaforte, J.P.; Meador, A.J.S.; Springer, J.D.; Stevens-Rumann, C.S.; et al. Refuge-yeah or refuge-nah? Predicting locations of forest resistance and recruitment in a fiery world. Glob. Change Biol. 2023, 29, 7029–7050.
  8. Rothermel, R.C. A Mathematical Model for Predicting Fire Spread in Wildland Fuels; USDA Forest Services Research Paper; Research Paper INT-115; USDA: Washington, DC, USA, 1972; pp. 1–41.
  9. Penman, T.D.; Smith, A.; Burton, J.; Heap, A.; McColl-Gausden, S.C.; Najera-Umana, J.; Gordon, F.; Holyland, B.; Marshall, E. What does it take to survive? An expert elicitation approach to understanding the drivers of fire Refugia occurrence and persistence. Biol. Conserv. 2025, 308, 111257.
  10. Camp, A.; Oliver, C.; Hessburg, P.; Everett, R. Predicting late-successional fire refugia pre-dating European settlement in the Wenatchee Mountains. For. Ecol. Manag. 1997, 95, 63–77.
  11. Chafer, C.J.; Noonan, M.; Macnaught, E. The post-fire measurement of fire severity and intensity in the Christmas 2001 Sydney wildfires. Int. J. Wildland Fire 2004, 13, 227–240.
  12. Riley, S.J.; DeGloria, S.D.; Elliot, R. A Terrain Ruggedness Index that Quantifies Topographic Heterogeneity. Intermt. J. Sci. 1999, 5, 23–27.
  13. Rogeau, M.P.; Armstrong, G.W. Quantifying the effect of elevation and aspect on fire return intervals in the Canadian Rocky Mountains. For. Ecol. Manag. 2017, 384, 248–261.
  14. Azedou, A.; Lahssini, S.; Khattabi, A.; Meliho, M.; Rifai, N. A Methodological Comparison of Three Models for Gully Erosion Susceptibility Mapping in the Rural Municipality of El Faid (Morocco). Sustainability 2021, 13, 682.
  15. Manders, P.T. Fire and Other Variables as Determinants of Forest Fynbos Boundaries in the Cape-Province. J. Veg. Sci. 1990, 1, 483–490.
  16. Coetsee, C.; Bond, W.J.; Wigley, B.J. Forest and fynbos are alternative states on the same nutrient poor geological substrate. S. Afr. J. Bot. 2015, 101, 57–65.
  17. Cramer, M.D.; Power, S.C.; Belev, A.; Gillson, L.; Bond, W.J.; Hoffman, M.T.; Hedin, L.O. Are forest-shrubland mosaics of the Cape Floristic Region an example of alternate stable states? Ecography 2019, 42, 717–729.
  18. Lu, M.Z.; Bond, W.J.; Sheffer, E.; Cramer, M.D.; West, A.G.; Allsopp, N.; February, E.C.; Chimphango, S.; Ma, Z.Q.; Slingsby, J.A.; et al. Biome boundary maintained by intense belowground resource competition in world’s thinnest-rooted plant community. Proc. Natl. Acad. Sci. USA 2022, 119, e2117514119.
  19. Mucina, L.; Geldenhuys, C. Afrotemperate, subtropical and azonal forests. In The Vegetation of South Africa, Lesotho and Swaziland; Rutherford, L.M.M., Ed.; South African National Biodiversity Institute: Pretoria, South Africa, 2006; pp. 586–615.
  20. Pausas, J.G. Alternative fire-driven vegetation states. J. Veg. Sci. 2015, 26, 4–6.
  21. Pausas, J.G.; Bond, W.J. Humboldt and the reinvention of nature. J. Ecol. 2019, 107, 1031–1037.
  22. Giddey, B.; Baard, J.A.; Vhengani, L.; Kraaij, T. The effect of adjacent vegetation on fire severity in Afrotemperate forest along the southern Cape coast of South Africa. South For. 2021, 83, 225–230.
  23. Bond, W.J.; Midgley, G.F.; Woodward, F.I. What controls South African vegetation—Climate or fire? S. Afr. J. Bot. 2003, 69, 79–91.
  24. Van Wilgen, B.W.; Higgins, K.B.; Bellstedt, D.U. The role of vegetation structure and fuel chemistry in excluding fire from forest patches in the fire-prone fynbos shrublands of South Africa. J. Ecol. 1990, 78, 210–222.
  25. Watson LH, C.M. The influence of fire on a southern cape mountain forest. S. Afr. For. J. 2001, 191, 39–42.
  26. Strydom, T.; Kraaij, T.; Grobler, B.A.; Cowling, R.M. Canopy plant composition and structure of Cape subtropical dune thicket are predicted by the levels of fire exposure. PeerJ 2022, 10, e14310.
  27. McNaughton, J.; Tyson, P.O. A preliminary assessment of Podocarpus falcatus in dendrochronological and dendroclimatological studies in the Witelsbos Forest Reserve. S. Afr. For. J. 1979, 111, 29–33.
  28. Kraaij, T.; Baard, J.; Schutte-Vlok, A.L. Plant response to the fire regime (1970–2023) in a fynbos World Heritage Site: Ecological indicators for fire management. Ecol. Indic. 2025, 170, 113001.
  29. Kraaij, T.; Baard, J.A.; Cowling, R.M.; van Wilgen, B.W.; Das, S. Historical fire regimes in a poorly understood, fire-prone ecosystem: Eastern coastal fynbos. Int. J. Wildland Fire 2013, 22, 277–287.
  30. Quick, L.J.; Chase, B.M.; Wündsch, M.; Kirsten, K.L.; Chevalier, M.; Mäusbacher, R.; Meadows, M.E.; Haberzettl, T. A high-resolution record of Holocene climate and vegetation dynamics from the southern Cape coast of South Africa: Pollen and microcharcoal evidence from Eilandvlei. J. Quat. Sci. 2018, 33, 487–500.
  31. Goldblatt, P.; Manning, J. Cape Plants: A Conspectus of the Cape Flora of South Africa; National Botanical Institute: Pretoria, South Africa; Missouri Botanical Garden: St Louis, MO, USA, 2000; Volume 9.
  32. Geldenhuys, C.J.; Moll, E.J.; Swart, R.C. Evergreen forests in South Africa—Their composition, biogeography, ecological dynamics and sustainable use management. S. Afr. J. Bot. 2026, 188, 63–98.
  33. Geldenhuys, C.J. Composition and Dynamics of Plant Communities in the Southern Cape Forests; CSIR, Division of Forest Science and Technology: Pretoria, South Africa, 1993; p. 56.
  34. Geldenhuys, C.J. The Management of the Southern Cape Forests. S. Afr. For. J. 1982, 121, 4–10.
  35. Kraaij, T.; Cowling, R.M.; van Wilgen, B.W. Lightning and fire weather in eastern coastal fynbos shrublands: Seasonality and long-term trends. Int. J. Wildland Fire 2013, 22, 288–295.
  36. Giglio, L.; Boschetti, L.; Roy, D.P.; Humber, M.L.; Justice, C.O. The Collection 6 MODIS burned area mapping algorithm and product. Remote Sens. Environ. 2018, 217, 72–85.
  37. Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L.; et al. The shuttle radar topography mission. Rev. Geophys. 2007, 45, RG2004.
  38. NASA JPL. NASA Shuttle Radar Topography Mission Global 1 arc Second [Data Set]. 2013. Available online: https://www.earthdata.nasa.gov/data/catalog/lpcloud-srtmgl1-003 (accessed on 4 February 2023).
  39. Elkhrachy, I. Vertical accuracy assessment for SRTM and ASTER Digital Elevation Models: A case study of Najran city, Saudi Arabia. Ain Shams Eng. J. 2018, 9, 1807–1817.
  40. Mashimbye, Z.E.; de Clercq, W.P.; Van Niekerk, A. An evaluation of digital elevation models (DEMs) for delineating land components. Geoderma 2014, 213, 312–319.
  41. Huggel, C.; Schneider, D.; Miranda, P.J.; Delgado Granados, H.; Kääb, A. Evaluation of ASTER and SRTM DEM data for lahar modeling: A case study on lahars from Popocatépetl Volcano, Mexico. J. Volcanol. Geotherm. Res. 2008, 170, 99–110.
  42. Grass Development Team. Geographic Resources Analysis Support System (GRASS) Software, Version 8.4; Open Source Geospatial Foundation: Arlington, VA, USA, 2024.
  43. Solargis. Solar Resource Map; Solargis: Singapore, 2021; Available online: https://solargis.com/ (accessed on 28 November 2025).
  44. Biljecki, F.; Heuvelink, G.B.M.; Ledoux, H.; Stoter, J. Propagation of positional error in 3D GIS: Estimation of the solar irradiation of building roofs. Int. J. Geogr. Inf. Sci. 2015, 29, 2269–2294.
  45. Christ, S. The Influence of Topographical Variability on Wildfire Occurrence and Propagation; Stellenbosch University: Stellenbosch, South Africa, 2024.
  46. Morvan, D. Physical Phenomena and Length Scales Governing the Behaviour of Wildfires: A Case for Physical Modelling. Fire Technol. 2011, 47, 437–460.
  47. Sharples, J.J.; McRae, R.H.; Wilkes, S.R. Wind–terrain effects on the propagation of wildfires in rugged terrain: Fire channelling. Int. J. Wildland Fire 2012, 21, 282–296.
  48. Frost, P.; Kleyn, L.; Van Den Dool, R.; Burgess, M.; Vhengani, L.; Steenkamp, K.; Wessels, K. The Elandskraal Fire, Knysna: A Data Driven Analysis; CSIR Meraka Institute: Pretoria, South Africa, 2018.
  49. Gallagher, C.A.; Chudzinska, M.; Larsen-Gray, A.; Pollock, C.J.; Sells, S.N.; White, P.J.C.; Berger, U. From theory to practice in pattern-oriented modelling: Identifying and using empirical patterns in predictive models. Biol. Rev. 2021, 96, 1868–1888.
  50. Pithani, M.B.; Sanyal, S.; Shukla, A.K. Bilinear and Bicubic Interpolations for Image Presentation of Mechanical Stress and Temperature Distribution. Power Eng. Eng. Thermophys. 2022, 1, 8–18.
  51. Gao, Z.T.; Wang, L.M.; Wu, G.S. LIP: Local Importance-based Pooling. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3354–3363.
  52. Chandra, R.; Bansal, C.; Kang, M.Y.; Blau, T.; Agarwal, V.; Singh, P.; Wilson, L.O.W.; Vasan, S. Unsupervised machine learning framework for discriminating major variants of concern during COVID-19. PLoS ONE 2023, 18, e0285719.
  53. He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. IEEE Int. Jt. Conf. Neural Netw. 2008, 2008, 1322–1328.
  54. Ahmed, G.; Er, M.J.; Fareed, M.M.S.; Zikria, S.; Mahmood, S.; He, J.; Asad, M.; Jilani, S.F.; Aslam, M. DAD-Net: Classification of Alzheimer’s Disease Using ADASYN Oversampling Technique and Optimized Neural Network. Molecules 2022, 27, 7085.
  55. Cutler, D.R.; Edwards, T.; Beard, K.H.; Cutler, A.; Hess, K.T. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792.
  56. Ramos, D.; Carneiro, D.; Novais, P. evoRF: An Evolutionary Approach to Random Forests. Stud. Comput. Intell. 2020, 868, 102–107.
  57. Chen, X.; Ishwaran, H. Random forests for genomic data analysis. Genomics 2012, 99, 323–329.
  58. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In KDD 16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Krishnapuram, B., Shah, M., Eds.; Association for Computing Machinery: New York, NY, USA; San Francisco, CA, USA, 2016; pp. 785–794.
  59. Georganos, S.; Grippa, T.; Vanhuysse, S.; Lennert, M.; Shimoni, M.; Wolff, E. Very High Resolution Object-Based Land Use-Land Cover Urban Classification Using Extreme Gradient Boosting. IEEE Geosci. Remote Sens. Lett. 2018, 15, 607–611.
  60. Ma, C.L.; Peng, Y.X.; Wu, L.T.; Guo, X.Y.; Wang, X.B.; Kong, X.Q. Application of Machine Learning Techniques to Predict the Occurrence of Distraction-affected Crashes with Phone-Use Data. Transp. Res. Rec. 2022, 2676, 692–705.
  61. Thanh Noi, P.; Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 2017, 18, 18.
  62. Prakash, K.B.; Kanagachidambaresan, G.R. Pattern Recognition and Machine Learning. In Programming with TensorFlow; EAI/Springer Innovations in Communication and Computing; Prakash, K.B., Kanagachidambaresan, G.R., Eds.; Springer: Cham, Switzerland, 2021; pp. 105–144.
  63. Barzani, A.R.; Pahlavani, P.; Ghorbanzadeh, O.; Gholamnia, K.; Ghamisi, P. Evaluating the Impact of Recursive Feature Elimination on Machine Learning Models for Predicting Forest Fire-Prone Zones. Fire Ecol. 2024, 7, 440.
  64. Zhang, L.; Shi, C.; Zhang, F. Predicting Forest Fire Area Growth Rate Using an Ensemble Algorithm. Forests 2024, 15, 1493.
  65. Tantithamthavorn, C.; McIntosh, S.; Hassan, A.E.; Matsumoto, K. The Impact of Automated Parameter Optimization on Defect Prediction Models. IEEE Trans. Softw. Eng. 2019, 45, 683–711.
  66. Goldstein, B.A.; Polley, E.C.; Briggs, F.B.S. Random Forests for Genetic Association Studies. Stat. Appl. Genet. Mol. 2011, 10, 32.
  67. Kraaij, T.; Baard, J.A.; Arndt, J.; Vhengani, L.; van Wilgen, B.W. An assessment of climate, weather, and fuel factors influencing a large, destructive wildfire in the Knysna region, South Africa. Fire Ecol. 2018, 14, 4.
Figure 1. The Sedgefield–Knysna area (white box) within the Southern Cape–Tsitsikamma region of South Africa experiences northwesterly foehn winds (indicated by the white arrow) as the predominant fire weather wind. Afrotemperate forest of the region is shown in dark green.
Figure 2. Feature importance (a) from Random Forest experiments (values range from 0 to 1), and feature gain (b) from XGBoost experiments (values are unbounded), showing the relative contribution of each variable to model performance.
Figure 3. Spatial arrangement of predicted persistent forest fire refugia (green) and non-refugia (red) for classifications using RF, XGBoost, KNN, ensemble with soft voting, and ensemble with hard voting across various experiments.
Figure 4. Relationship between aspect, solar radiation (heat budget), moisture balance, vegetation flammability, and predominant fire wind direction, which facilitates wind- and fire-shadows in the lee of the slope. Features that increase fire risk and spread are in red text and those that reduce fire risk and retard fire spread are in blue text. Sun-facing aspect is the north/northwestern slopes in the southern hemisphere.
Figure 5. Small, persistent forest fire refugia (indicated by white arrows) in deep valley bottoms on the south-to-southeast slopes. These slopes are in the lee of the predominant fire wind direction from the northeast.
Table 2. Gridsearch hyperparameterisation settings tested and optimisation settings selected (bold).
| Parameter | Search values |
|---|---|
| Random Forest | |
| Out-of-bag score | TRUE |
| N estimators | 50, 100, 200, 500 |
| Max features | sqrt, log2 |
| Max depth | none, 10, 20, 30 |
| Min samples split | 2, 5, 10 |
| Min samples leaf | 1, 2, 4 |
| XGBoost | |
| Eta (Learning rate) | 0.01, 0.1, 0.5 |
| N estimators | 50, 100, 200, 1500 |
| Max depth | 3, 5, 7 |
| Reg_alpha (Lasso regression) | 0, 0.001, 0.005, 0.01, 0.05 |
| Reg_lambda (Ridge regression) | 0.5, 1, 1.5, 2 |
| KNN | |
| N neighbours | 3, 5, 7, 9 |
| Weights | uniform, distance |
| Metric | Euclidean, Manhattan, Minkowski |
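To give a sense of the search effort implied by Table 2, an exhaustive grid search evaluates every combination of the listed values. The sketch below enumerates the Random Forest and KNN grids. The dictionary keys follow common scikit-learn naming and are assumptions, as is the helper name `expand`. The actual search scripts are in the Supplementary Materials.

```python
from itertools import product

# Search values as listed in Table 2 (the out-of-bag setting is fixed, not searched)
rf_grid = {
    "n_estimators": [50, 100, 200, 500],
    "max_features": ["sqrt", "log2"],
    "max_depth": [None, 10, 20, 30],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
knn_grid = {
    "n_neighbors": [3, 5, 7, 9],
    "weights": ["uniform", "distance"],
    "metric": ["euclidean", "manhattan", "minkowski"],
}

def expand(grid):
    """List every parameter combination in a grid as a dict."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*grid.values())]

rf_candidates = expand(rf_grid)
knn_candidates = expand(knn_grid)
print(len(rf_candidates), "RF candidates;", len(knn_candidates), "KNN candidates")
```

Each candidate is then fitted and scored, so the cost of the search grows multiplicatively with every parameter added to the grid.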
Table 3. Model feature importance (RF, values range from 0 to 1) and gain (XGBoost, values are unbounded) results. Strongest values are in the darkest shades of green.
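| Method | Experiment Number | Elevation | Temperature | Diffuse Horizontal Irradiation | Direct Normal Irradiation | Global Horizontal Irradiation | Wind Speed | Wind Direction | Aspect | Slope | Topographical Convergence | Topographical Roughness | Topographical Wetness Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RF | 1 | 0.07 | 0.09 | 0.08 | 0.07 | 0.17 | 0.06 | 0.14 | 0.27 | 0.01 | 0.01 | 0.02 | 0.03 |
| RF | 2 | 0.08 | 0.14 | 0.08 | 0.07 | 0.14 | 0.09 | 0.10 | 0.19 | 0.04 | 0.02 | 0.03 | 0.04 |
| RF | 3 | - | 0.15 | 0.08 | 0.07 | 0.16 | 0.09 | 0.10 | 0.24 | 0.03 | 0.01 | 0.03 | 0.04 |
| RF | 4 | - | - | 0.09 | 0.08 | 0.17 | 0.12 | 0.12 | 0.26 | 0.05 | 0.02 | 0.04 | 0.05 |
| RF | 5 | - | - | - | - | - | 0.12 | 0.18 | 0.42 | 0.07 | 0.06 | 0.07 | 0.09 |
| RF | 6 | - | - | - | - | - | - | - | 0.51 | 0.12 | 0.11 | 0.12 | 0.14 |
| XGBoost | 7 | 56.27 | 48.56 | 63.49 | 52.52 | 99.87 | 44.88 | 103.19 | 452.29 | 13.23 | 14.88 | 18.83 | 45.64 |
| XGBoost | 8 | 19.66 | 13.67 | 13.07 | 10.02 | 30.49 | 10.33 | 19.57 | 55.02 | 3.99 | 5.32 | 12.69 | 10.44 |
| XGBoost | 9 | - | 9.61 | 9.94 | 9.02 | 21.65 | 7.07 | 13.28 | 52.39 | 2.29 | 1.52 | 11.27 | 7.92 |
| XGBoost | 10 | - | - | 14.96 | 13.08 | 27.28 | 9.54 | 18.84 | 62.25 | 3.21 | 3.19 | 19.18 | 12.26 |
| XGBoost | 11 | - | - | - | - | - | 0.83 | 1.72 | 4.62 | 0.30 | 0.58 | 1.06 | 0.92 |
| XGBoost | 12 | - | - | - | - | - | - | - | 2.31 | 0.50 | 0.56 | 0.72 | 0.67 |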
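Because XGBoost gain is unbounded while Random Forest importance lies between 0 and 1, one way to compare the two columns of Table 3 is to rescale the gains to shares of their total. The sketch below does this for experiment 7 and sets the result beside the RF importances of experiment 1. The rescaling is an illustrative choice made here, not a step reported by the authors.

```python
variables = [
    "Elevation", "Temperature", "Diffuse Horizontal Irradiation",
    "Direct Normal Irradiation", "Global Horizontal Irradiation",
    "Wind Speed", "Wind Direction", "Aspect", "Slope",
    "Topographical Convergence", "Topographical Roughness",
    "Topographical Wetness Index",
]
# Values copied from Table 3
xgb_gain_exp7 = [56.27, 48.56, 63.49, 52.52, 99.87, 44.88,
                 103.19, 452.29, 13.23, 14.88, 18.83, 45.64]
rf_importance_exp1 = [0.07, 0.09, 0.08, 0.07, 0.17, 0.06,
                      0.14, 0.27, 0.01, 0.01, 0.02, 0.03]

total_gain = sum(xgb_gain_exp7)
xgb_share = {name: gain / total_gain for name, gain in zip(variables, xgb_gain_exp7)}
rf_share = dict(zip(variables, rf_importance_exp1))

# Print variables from most to least influential under the rescaled gain
for name in sorted(variables, key=xgb_share.get, reverse=True):
    print(f"{name:32s} XGBoost share {xgb_share[name]:.3f}   RF importance {rf_share[name]:.2f}")
```

On this rescaled basis, aspect ranks first in both models, while the order of the remaining variables differs between them.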
Table 4. Model performance for various experiments to evaluate the effect on accuracy of omitting certain variables. For each accuracy metric, the mean, standard deviation (Std), and upper and lower 95% confidence intervals (CI_low and CI_high) are presented. Strongest values are in the darkest shades of green. Top scoring model per measure (as measured by the mean) is highlighted in bold.
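| Method | Experiment_No | Accuracy_Mean | Accuracy_Std | Accuracy_CI_Low | Accuracy_CI_High | Precision_Mean | Precision_Std | Precision_CI_Low | Precision_CI_High | Recall_Mean | Recall_Std | Recall_CI_Low | Recall_CI_High | F1_Mean | F1_Std | F1_CI_Low | F1_CI_High | OOB_Mean | OOB_Std | OOB_CI_Low | OOB_CI_High | AUC_ROC_Mean | AUC_ROC_Std | AUC_ROC_CI_Low | AUC_ROC_CI_High |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RF | 1 | 0.810 | 0.097 | 0.750 | 0.870 | 0.796 | 0.146 | 0.706 | 0.886 | 0.624 | 0.257 | 0.464 | 0.783 | 0.654 | 0.185 | 0.540 | 0.769 | 0.997 | 0.001 | 0.996 | 0.997 | 0.902 | 0.068 | 0.861 | 0.944 |
| RF | 2 | 0.813 | 0.088 | 0.758 | 0.867 | 0.794 | 0.142 | 0.706 | 0.882 | 0.623 | 0.228 | 0.481 | 0.764 | 0.660 | 0.166 | 0.557 | 0.763 | 0.998 | 0.000 | 0.998 | 0.999 | 0.895 | 0.067 | 0.853 | 0.936 |
| RF | 3 | 0.822 | 0.082 | 0.771 | 0.873 | 0.788 | 0.141 | 0.700 | 0.875 | 0.664 | 0.229 | 0.522 | 0.806 | 0.685 | 0.157 | 0.587 | 0.782 | 0.998 | 0.000 | 0.998 | 0.999 | 0.908 | 0.058 | 0.872 | 0.944 |
| RF | 4 | 0.834 | 0.045 | 0.806 | 0.862 | 0.762 | 0.163 | 0.661 | 0.863 | 0.707 | 0.136 | 0.623 | 0.791 | 0.716 | 0.110 | 0.648 | 0.784 | 0.993 | 0.002 | 0.992 | 0.995 | 0.914 | 0.028 | 0.896 | 0.932 |
| RF | 5 | 0.831 | 0.030 | 0.812 | 0.849 | 0.716 | 0.152 | 0.622 | 0.810 | 0.773 | 0.115 | 0.702 | 0.845 | 0.729 | 0.096 | 0.670 | 0.789 | 0.965 | 0.004 | 0.963 | 0.968 | 0.914 | 0.027 | 0.898 | 0.931 |
| RF | 6 | 0.806 | 0.043 | 0.779 | 0.833 | 0.662 | 0.157 | 0.564 | 0.759 | 0.836 | 0.067 | 0.794 | 0.878 | 0.723 | 0.094 | 0.665 | 0.782 | 0.891 | 0.014 | 0.882 | 0.900 | 0.894 | 0.031 | 0.875 | 0.913 |
| XGBoost | 7 | 0.821 | 0.089 | 0.766 | 0.876 | 0.766 | 0.129 | 0.686 | 0.846 | 0.685 | 0.238 | 0.537 | 0.833 | 0.692 | 0.156 | 0.595 | 0.789 | - | - | - | - | 0.905 | 0.058 | 0.869 | 0.940 |
| XGBoost | 8 | 0.820 | 0.084 | 0.768 | 0.871 | 0.761 | 0.131 | 0.680 | 0.842 | 0.682 | 0.220 | 0.546 | 0.819 | 0.691 | 0.145 | 0.601 | 0.782 | - | - | - | - | 0.897 | 0.061 | 0.859 | 0.935 |
| XGBoost | 9 | 0.825 | 0.082 | 0.774 | 0.876 | 0.755 | 0.128 | 0.676 | 0.834 | 0.707 | 0.223 | 0.569 | 0.845 | 0.703 | 0.145 | 0.614 | 0.793 | - | - | - | - | 0.898 | 0.068 | 0.856 | 0.940 |
| XGBoost | 10 | 0.843 | 0.039 | 0.819 | 0.868 | 0.758 | 0.152 | 0.664 | 0.852 | 0.743 | 0.119 | 0.670 | 0.817 | 0.739 | 0.101 | 0.676 | 0.801 | - | - | - | - | 0.920 | 0.031 | 0.902 | 0.939 |
| XGBoost | 11 | 0.819 | 0.023 | 0.805 | 0.833 | 0.709 | 0.153 | 0.614 | 0.804 | 0.734 | 0.107 | 0.668 | 0.801 | 0.707 | 0.092 | 0.650 | 0.764 | - | - | - | - | 0.895 | 0.027 | 0.879 | 0.912 |
| XGBoost | 12 | 0.796 | 0.036 | 0.774 | 0.819 | 0.662 | 0.156 | 0.565 | 0.758 | 0.775 | 0.062 | 0.737 | 0.814 | 0.699 | 0.087 | 0.645 | 0.753 | - | - | - | - | 0.867 | 0.030 | 0.848 | 0.885 |
| KNN | 13 | 0.812 | 0.043 | 0.785 | 0.839 | 0.709 | 0.163 | 0.608 | 0.810 | 0.741 | 0.133 | 0.659 | 0.823 | 0.703 | 0.092 | 0.646 | 0.761 | - | - | - | - | 0.842 | 0.045 | 0.814 | 0.869 |
| KNN | 14 | 0.804 | 0.039 | 0.780 | 0.827 | 0.685 | 0.171 | 0.579 | 0.791 | 0.758 | 0.123 | 0.681 | 0.834 | 0.698 | 0.101 | 0.635 | 0.761 | - | - | - | - | 0.822 | 0.042 | 0.796 | 0.848 |
| KNN | 15 | 0.823 | 0.026 | 0.807 | 0.839 | 0.693 | 0.158 | 0.595 | 0.791 | 0.792 | 0.100 | 0.730 | 0.854 | 0.726 | 0.106 | 0.660 | 0.791 | - | - | - | - | 0.844 | 0.032 | 0.824 | 0.863 |
| KNN | 16 | 0.821 | 0.027 | 0.804 | 0.837 | 0.690 | 0.158 | 0.592 | 0.788 | 0.792 | 0.100 | 0.730 | 0.854 | 0.724 | 0.105 | 0.659 | 0.789 | - | - | - | - | 0.842 | 0.031 | 0.823 | 0.861 |
| KNN | 17 | 0.799 | 0.027 | 0.782 | 0.815 | 0.656 | 0.151 | 0.562 | 0.750 | 0.786 | 0.087 | 0.732 | 0.839 | 0.701 | 0.091 | 0.644 | 0.757 | - | - | - | - | 0.834 | 0.023 | 0.820 | 0.849 |
| KNN | 18 | 0.778 | 0.036 | 0.756 | 0.801 | 0.630 | 0.153 | 0.535 | 0.725 | 0.782 | 0.058 | 0.746 | 0.818 | 0.682 | 0.091 | 0.626 | 0.739 | - | - | - | - | 0.828 | 0.024 | 0.813 | 0.843 |
| EnsembleS | 19 | 0.818 | 0.087 | 0.763 | 0.872 | 0.771 | 0.137 | 0.687 | 0.856 | 0.671 | 0.234 | 0.526 | 0.816 | 0.683 | 0.154 | 0.588 | 0.779 | - | - | - | - | 0.909 | 0.056 | 0.874 | 0.944 |
| EnsembleS | 20 | 0.823 | 0.078 | 0.774 | 0.871 | 0.762 | 0.145 | 0.672 | 0.852 | 0.695 | 0.212 | 0.564 | 0.827 | 0.698 | 0.140 | 0.611 | 0.785 | - | - | - | - | 0.906 | 0.056 | 0.871 | 0.941 |
| EnsembleS | 21 | 0.835 | 0.070 | 0.792 | 0.878 | 0.764 | 0.132 | 0.682 | 0.845 | 0.726 | 0.197 | 0.603 | 0.848 | 0.721 | 0.127 | 0.643 | 0.800 | - | - | - | - | 0.919 | 0.040 | 0.894 | 0.944 |
| EnsembleS | 22 | 0.843 | 0.036 | 0.821 | 0.865 | 0.745 | 0.163 | 0.644 | 0.846 | 0.765 | 0.117 | 0.692 | 0.837 | 0.741 | 0.111 | 0.673 | 0.810 | - | - | - | - | 0.919 | 0.026 | 0.903 | 0.935 |
| EnsembleS | 23 | 0.825 | 0.027 | 0.809 | 0.842 | 0.702 | 0.154 | 0.606 | 0.798 | 0.780 | 0.108 | 0.714 | 0.847 | 0.725 | 0.096 | 0.666 | 0.785 | - | - | - | - | 0.908 | 0.026 | 0.892 | 0.924 |
| EnsembleS | 24 | 0.803 | 0.043 | 0.776 | 0.830 | 0.658 | 0.158 | 0.560 | 0.756 | 0.833 | 0.065 | 0.793 | 0.873 | 0.720 | 0.097 | 0.660 | 0.780 | - | - | - | - | 0.890 | 0.030 | 0.872 | 0.908 |
| EnsembleH | 25 | 0.814 | 0.092 | 0.757 | 0.871 | 0.777 | 0.140 | 0.690 | 0.864 | 0.653 | 0.240 | 0.504 | 0.802 | 0.673 | 0.163 | 0.572 | 0.774 | - | - | - | - | 0.909 | 0.056 | 0.874 | 0.944 |
| EnsembleH | 26 | 0.820 | 0.082 | 0.769 | 0.871 | 0.771 | 0.141 | 0.683 | 0.859 | 0.672 | 0.219 | 0.536 | 0.807 | 0.687 | 0.147 | 0.596 | 0.778 | - | - | - | - | 0.906 | 0.056 | 0.871 | 0.941 |
| EnsembleH | 27 | 0.831 | 0.075 | 0.785 | 0.877 | 0.771 | 0.134 | 0.688 | 0.854 | 0.708 | 0.212 | 0.577 | 0.839 | 0.711 | 0.137 | 0.626 | 0.796 | - | - | - | - | 0.919 | 0.040 | 0.894 | 0.944 |
| EnsembleH | 28 | 0.841 | 0.038 | 0.818 | 0.865 | 0.750 | 0.165 | 0.648 | 0.853 | 0.750 | 0.123 | 0.673 | 0.826 | 0.736 | 0.113 | 0.665 | 0.806 | - | - | - | - | 0.919 | 0.026 | 0.903 | 0.935 |
| EnsembleH | 29 | 0.828 | 0.028 | 0.810 | 0.845 | 0.708 | 0.155 | 0.612 | 0.804 | 0.775 | 0.111 | 0.706 | 0.844 | 0.726 | 0.099 | 0.665 | 0.787 | - | - | - | - | 0.908 | 0.026 | 0.892 | 0.924 |
| EnsembleH | 30 | 0.805 | 0.045 | 0.777 | 0.833 | 0.659 | 0.159 | 0.560 | 0.758 | 0.840 | 0.063 | 0.800 | 0.879 | 0.723 | 0.099 | 0.662 | 0.785 | - | - | - | - | 0.890 | 0.030 | 0.872 | 0.908 |
| Mean | - | 0.819 | - | - | - | 0.728 | - | - | - | 0.736 | - | - | - | 0.706 | - | - | - | 0.974 | - | - | - | 0.891 | - | - | - |
| Std dev | - | 0.014 | - | - | - | 0.048 | - | - | - | 0.059 | - | - | - | 0.022 | - | - | - | 0.039 | - | - | - | 0.030 | - | - | - |
| Min | - | 0.778 | - | - | - | 0.630 | - | - | - | 0.623 | - | - | - | 0.654 | - | - | - | 0.891 | - | - | - | 0.822 | - | - | - |
| Max | - | 0.843 | - | - | - | 0.796 | - | - | - | 0.840 | - | - | - | 0.741 | - | - | - | 0.998 | - | - | - | 0.920 | - | - | - |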
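For readers checking the intervals in Table 4, a normal-approximation confidence interval is one plausible reading of the CI_Low and CI_High columns. The sketch below assumes mean ± 1.96 × std / √n with n = 10 repeats. Both the formula and n are assumptions, chosen only because they reproduce the tabulated accuracy limits for RF experiment 1, and this section does not state how the intervals were computed.

```python
import math

def normal_ci(mean, std, n, z=1.96):
    """Two-sided normal-approximation confidence interval for a mean."""
    half_width = z * std / math.sqrt(n)
    return mean - half_width, mean + half_width

# RF experiment 1 accuracy from Table 4: mean 0.810, std 0.097
low, high = normal_ci(0.810, 0.097, n=10)  # n = 10 is an assumed repeat count
print(f"{low:.3f} to {high:.3f}")
```

The wide intervals for experiments with large standard deviations mean that many of the differences between models in Table 4 fall within each other's intervals.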
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Christ, S.; Kraaij, T.; Geldenhuys, C.J.; de Klerk, H.M. Predicting Persistent Forest Fire Refugia Using Machine Learning Models with Topographic, Microclimate, and Surface Wind Variables. ISPRS Int. J. Geo-Inf. 2025, 14, 480. https://doi.org/10.3390/ijgi14120480
