Article

Regional Prediction of Fire Characteristics Using Machine Learning in Australia

1
School of Computer Science, University of Wollongong in Dubai, Dubai P.O. Box 20183, United Arab Emirates
2
School of Engineering, University of Wollongong in Dubai, Dubai P.O. Box 20183, United Arab Emirates
*
Authors to whom correspondence should be addressed.
Fire 2025, 8(8), 330; https://doi.org/10.3390/fire8080330
Submission received: 1 July 2025 / Revised: 9 August 2025 / Accepted: 10 August 2025 / Published: 16 August 2025
(This article belongs to the Special Issue Intelligent Forest Fire Prediction and Detection)

Abstract

Wildfires are increasing in frequency and severity, with Australia’s 2019–2020 Black Summer burning over 18 million hectares. Accurate prediction of wildfire behavior is essential for effective risk assessment and emergency response. This study presents a machine learning framework for predicting wildfire dynamics across Australia’s seven regions using the IBM wildfire dataset. Various Machine Learning (ML) models were evaluated to forecast three key indicators: Fire Area (km²), Fire Brightness Temperature (K), and Fire Radiative Power (MW). Lasso Regression consistently outperformed the other models, achieving an average RMSE of 0.04201 and R² of 0.29355. Performance varied across regions, with stronger results in areas like New South Wales and Queensland, likely influenced by differences in topography, microclimate, and vegetation. However, limitations include the exclusion of ignition sources such as lightning and human activity, which are critical for characterizing the fire environment and improving predictive accuracy. Future work will integrate these factors alongside more detailed weather and vegetation data. Practical implementation may face challenges related to real-time data availability, system integration, and response coordination, but this approach offers promising potential for operational wildfire decision support.

1. Introduction

Wildfires are a critical and escalating global issue, leading to widespread environmental degradation, economic disruption, and severe health consequences. In 2020 alone, approximately 9.5 million acres were consumed by wildfires in the western United States, an alarming escalation from historical norms [1]. Although the number of wildfire incidents has declined since the 1990s, the total area burned annually has more than doubled [2], indicating a shift toward more intense and prolonged fire seasons. These trends are largely driven by climate change, land use patterns, and anthropogenic factors. The impacts of wildfires are widespread and multifaceted, affecting human health, ecosystems, and economies.
Smoke inhalation has been conclusively linked to respiratory and cardiovascular diseases [3]. The 2019–2020 Australian bushfires led to a substantial rise in emergency department visits, particularly in lower socioeconomic areas [4], underscoring how extreme events disproportionately affect vulnerable populations. Further, fire-related ambient nitrate pollution has been linked to significant global health risks, with increased mortality in densely populated regions such as China and India [5]. In the Brazilian Amazon, data-driven studies using machine learning and time series models found that fire exposure was correlated with respiratory hospitalizations and COVID-19 mortality [6]. Socioeconomic disadvantage has consistently emerged as a key factor in wildfire vulnerability [7]. These disparities were evident during Australia’s Black Summer, where unequal protection capacities and resources increased risk exposure. Fire activity also interacts with the global carbon cycle. Ref. [8] showed that declining fire activity linked to demographic shifts can enhance land carbon uptake, potentially offsetting 5–10 years of CO₂ emissions by 2100 and contributing to climate change mitigation.
Beyond vegetation and atmospheric effects, fire also impacts faunal communities. Predators, which play a crucial role in shaping ecosystems through top-down regulation, show varied and context-specific responses to wildfire. A global systematic review and meta-analysis found inconsistent responses among terrestrial predators, largely dependent on species traits and geographic context, highlighting critical knowledge gaps in predator–fire interactions [9].
Understanding how fire regimes evolve under climate and anthropogenic pressures remains challenging. Expert assessments project a significant increase in fire frequency, severity, and size, especially under high-emission scenarios, and emphasize the urgency for adaptive management strategies [10]. Historical and political dynamics further shape wildfire regimes. In Algeria, colonial fire suppression policies were not only ecological strategies but also instruments of control rooted in European forestry ideologies and punitive measures [11].
The underlying drivers of wildfires include biophysical, climatic, and anthropogenic factors. Climate plays a pivotal role in shaping the spatial and temporal patterns of ignition and spread. In Portugal, different pyroregions align with regional climate zones, with intra-annual fire peaks reflecting spring and summer weather conditions [12]. Wildfire smoke emissions also feed back into the climate system, contributing to radiative forcing and altering local weather patterns [13]. Recent advances in global fire emissions modeling at finer spatial resolution (500 m) have improved estimates of carbon emissions and highlighted the importance of capturing landscape heterogeneity for accurate fire impact assessments [14]. Mediterranean countries such as Portugal, Spain, and France have experienced increasingly severe fire seasons, highlighted in 2022 by extreme fuel dryness and unusually early large fires [15]. Oceanic climate signals like anomalous sea surface temperatures have also been shown to influence fire risk in the Amazon by reducing regional precipitation and humidity [16].
Topography and fuel composition further modulate fire behavior. Laboratory tests have demonstrated that spot fires on sloped terrain can accelerate spread, suggesting that models may underestimate fire risk in hilly regions [17]. Human activity is another dominant ignition source, particularly in the United States, where the majority of fires are human-caused, resulting in longer fire seasons and more frequent events [18]. Accurate prediction of wildfire risk and behavior remains a complex challenge due to the non-linear interactions between variables such as weather, vegetation, topography, fuel moisture, and human activity [19]. Traditional fire danger indices, based on static and short-term weather data, often fail to deliver the precision and responsiveness required for real-time decision-making. Traditionally, fire danger has been assessed using daytime climatic conditions, but emerging studies suggest night-time fire behavior is becoming increasingly relevant.
Ref. [20] showed that globally, flammable night-time hours have increased due to rising vapor pressure deficits, reducing the natural overnight suppression window and intensifying wildfires. Empirical models, particularly those used for public communication and resource management, show widely differing sensitivities to temperature and humidity, highlighting the variability and uncertainty in projections of future fire behavior [21]. These variations emphasize the importance of carefully selecting fire models when developing climate-informed fire risk assessments.
In response, researchers have increasingly turned to time series modeling. By incorporating longitudinal data on meteorological, environmental, and fire-related variables, such approaches have enhanced predictive capacity. Ref. [22] found that lagged temperature and moisture indices significantly improved wildfire prediction in Southwest China.
Machine learning (ML) and deep learning (DL) have further advanced wildfire modeling. Techniques such as Random Forest (RF) [23], Extreme Gradient Boosting (XGBoost) [24], Support Vector Machines (SVM) [25], and Long Short-Term Memory (LSTM) networks [26] have outperformed traditional statistical models in high-dimensional, non-linear contexts. These methods enable real-time inference and dynamic updating, making them especially useful for operational risk management.
In fire-burned area prediction, hybrid spatiotemporal models and surrogate deep neural networks have improved both computational efficiency and accuracy [27,28]. In California, attention-based models and ConvLSTM architectures have shown promising results for short-term forecasting [29,30], though regional generalizability remains a challenge. Time series architectures such as U-LSTM [31], Transformer-based attention models [32], and dynamic indices such as the Hourly Fire Risk Index [33] have demonstrated robust performance across regions. In particular, ref. [24] reported an AUC of 0.998, and ref. [26] achieved an RMSE of 0.3253 in predicting the Canadian Fire Weather Index. However, limitations persist, including data imbalance [34], exclusion of ignition-specific variables [35], and limited multiregional validation [36].
To operationalize fire risk management, various fire danger rating systems have been developed globally. These include the Canadian Forest Fire Danger Rating System (CFFDRS), the U.S. National Fire Danger Rating System (NFDRS), the European Forest Fire Information System (EFFIS), and Russia’s ISDM-Rosleskhoz. These systems integrate weather, fuel, and terrain data but often require regional calibration.
Recent studies have improved the Canadian Fire Weather Index system by incorporating overwinter drought effects, improving early season fire risk assessments with high-resolution meteorological data [37]. In Crete, the Canadian FWI had to be reconfigured to local climatic thresholds for effective predictions [38]. In Southern China, CFFDRS showed good alignment with daily fire activity [39]. NFDRS has exhibited strong predictive performance in the U.S. West but underperformed in the South and East [40], indicating a need for localized refinement. Russia’s ISDM-Rosleskhoz is evolving to address climate-driven increases in wildfire activity, with updates focusing on zoning, suppression strategies, and post-fire assessments [41]. Advanced hybrid models further improve fire predictions by integrating ensemble Kalman filters with ML techniques. These have successfully forecasted fire perimeters and optimized fuel parameters in real-world cases like the 2018 Camp Fire [42]. Ensemble Transform Kalman Filters (ETKF) have also shown superior accuracy over rugged terrain, especially when integrated with tools like FARSITE [43]. Long-duration fire simulations have demonstrated the potential to combine real-time meteorological forecasts with post-event evaluations [44].
Remote sensing technologies, including Sentinel-2 and Landsat 8, have become indispensable for fire detection and post-burn analysis. In North Africa, these sensors effectively mapped burn scars and offered improvements over EFFIS in capturing local fire activity [45]. Long-term fused datasets of monthly burned area combining MODIS, Landsat, and Sentinel-2 enable better tracking of global fire trends over multiple decades [46].
Australia faces a high risk of wildfires across diverse climates. However, the modeling literature remains fragmented. Most studies focus on single regions such as New South Wales [47,48], Victoria [49,50], or Western Australia [25], using methods such as gene-expression programming [51] or hybrid XGBoost [49]. Only one study has analyzed multiple regions (NSW, VI, QL, WA, NT, ACT) continent-wide [52], and none have implemented time series modeling across all seven regions.
In response to these challenges, this study presents a unified machine learning-based time series framework for predicting three key wildfire metrics, Fire Area (FA), Fire Brightness Temperature (FBT), and Fire Radiative Power (FRP), for Australia as a whole and for each of its seven regions. To our knowledge, no previous study has simultaneously modeled wildfire dynamics across all seven regions of Australia. Using the IBM wildfire dataset, which integrates diverse inputs such as climate variables and historical fire records, this research advances the field not only by targeting multiple fire metrics, but also by generating region-specific predictions within a single cohesive model. The proposed approach aims for high accuracy and regional adaptability, and creates a scalable system capable of supporting national fire policy and resource planning.
The remainder of this paper is structured as follows. The Materials and Methods section begins with the study area, followed by the proposed framework for wildfire prediction. The research dataset is then described, including climate and wildfire data. This is followed by data preparation and preprocessing procedures, an overview of the machine learning models used to estimate wildfire characteristics, and a description of the model evaluation and selection process. The Results section presents the overall best performance of wildfire prediction across Australia, followed by an analysis of the best-performing machine learning models by region and target variable. This includes detailed results for FA, FBT, and FRP across Australian regions.
The Discussion introduces a severity classification system that maps predicted fire characteristics to operational thresholds. This system supports actionable response planning by categorizing fire events into severity levels and linking them to appropriate resource recommendations. The Conclusion and Future Work section summarizes key contributions and outlines directions for continued research. Additional sections cover data availability, conflicts of interest, funding, acknowledgements, and references.

2. Materials and Methods

This study investigates the prediction of wildfire characteristics based on meteorological and environmental factors. The goal is to predict three key wildfire attributes: Estimated FA, Mean Estimated FBT, and Mean Estimated FRP. To achieve this, a variety of machine learning models were tested across different regions. The methodology used in feature preparation, model training, evaluation, and feature importance is described below.

2.1. Study Area

The study area consists of the seven regions of Australia, shown in Figure 1: Western Australia (WA), Northern Territory (NT), Queensland (QL), New South Wales together with the Australian Capital Territory (NSW), Victoria (VI), Tasmania (TA), and South Australia (SA). Australia has historically experienced wildfires in all seven regions, and this study captures all wildfires from 2005 onwards. This period includes significant events such as the 2009 Black Saturday bushfires, one of Australia’s worst disasters, which caused 173 fatalities. The Kilmore East fire alone accounted for 70% of those deaths and consumed 100,000 ha in under 12 h [53].

2.2. Framework for Wildfire Prediction

This methodology outlines a structured approach to wildfire prediction using machine learning, progressing through data integration, feature engineering, and model development, as seen in Figure 2.
  • Step 1: Data Integration
    In the first stage, two primary datasets are combined. The fire data provide the three prediction targets: Estimated FA, Mean Estimated FBT, and Mean Estimated FRP. These are crucial indicators of wildfire intensity and spread. Alongside this, weather data are integrated, covering precipitation, relative humidity, temperature, wind speed, soil moisture content, and solar radiation. These environmental factors significantly influence wildfire behavior and are essential for building a reliable predictive model.
  • Step 2: Feature Engineering
    After integrating the datasets, the next step involves aggregating, cleaning, and preprocessing the data. Feature extraction is then performed to identify and generate the most relevant attributes from the combined datasets. To ensure that the models are validated rigorously and to reduce the risk of overfitting, a K-fold cross-validation strategy with five splits is used. This helps in assessing model performance across different subsets of the data.
  • Step 3: Build Wildfire Prediction Models
    The processed dataset is divided into training (80%) and validation (20%) sets. Several machine learning algorithms are used to model the data, detailed in Section 2.5 Prediction Models for Wildfire Characteristics. These models are then evaluated using two performance metrics: Root Mean Squared Error (RMSE), which measures the average prediction error, and the coefficient of determination (R²), which indicates the proportion of variance explained by the model.
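The 80/20 division in Step 3 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: it assumes the rows are already sorted by date and that the split is made without shuffling, so the validation set is the most recent 20% of records. The function name `train_validation_split` and the toy rows are hypothetical.

```python
def train_validation_split(rows, train_fraction=0.8):
    """Split date-ordered rows into a training part and a validation part.

    Assumption (not stated in the source): no shuffling, so the validation
    rows are the most recent ones.
    """
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Ten hypothetical daily records, already in chronological order.
rows = [{"day": d, "fire_area": 0.1 * d} for d in range(1, 11)]
train_rows, validation_rows = train_validation_split(rows)
```

If the study shuffled the records before splitting, the slicing step would be preceded by a random permutation; the source does not say which variant was used.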

2.3. Research Dataset Description

Effective wildfire prediction and management depend on the integration of different datasets that capture various environmental and land characteristics. This section introduces the key datasets used in the analysis, focusing on climate and wildfire data. The following subsections detail the characteristics, significance, and sources of these datasets.

2.3.1. Climate Data

This study draws on meteorological and climate data, shown in Table 1, primarily the ERA5 reanalysis dataset by ECMWF. ERA5 merges satellite, station, and other observations to produce consistent hourly estimates of variables like precipitation, temperature, wind, humidity, soil moisture, and solar radiation [54]. For this work, daily data spanning from 2005 to 2021 were used, with hourly resolution available for some variables. With a spatial resolution of around 31 km, ERA5 is well-suited for analyzing both long-term climate patterns and short-term weather events relevant to wildfire activity. The dataset includes several key variables: precipitation, temperature, wind speed, relative humidity, soil water content, and solar radiation. These are widely recognized as important factors in understanding and predicting fire behavior. Lower precipitation increases dryness risk, while higher temperature and wind escalate fire growth [55]. Relative humidity influences fuel moisture content [56]. These variables, widely used in predictive modelling, offer concise, informative summaries of environmental trends over time [57]. The availability of high-frequency data makes it possible to monitor subtle changes in weather conditions, which is especially valuable for short-term fire prediction and emergency response.

2.3.2. Wildfire Data

A summary of the wildfire data used is in Table 1, also showing the available statistical measures of each of the variables. The FA helps quantify the scale of a wildfire, directly correlating with potential damage and the level of resources required for firefighting efforts [58]. FBT and FRP serve as proxies for fire intensity, where higher values typically indicate larger, hotter fires that may spread more rapidly [59]. Additional metrics such as the mean, variance, and standard deviation of fire-related values can offer deeper insights into fire behavior patterns. The pixel count is also critical for tracking the spatial extent of fire over time [60]. The dataset used spans from January 2005 to January 2021 and records fire events on a daily basis, provided a fire occurs. As illustrated in Figure 3, the distribution of fire records reveals substantial regional variation, with certain areas demonstrating significantly higher fire incidence. The fire data originates from the MODIS MCD14DL product, a widely utilized source for global fire monitoring due to its high spatial (250 m) and temporal (daily) resolution [58].

2.4. Data Preparation and Preprocessing

In predictive wildfire modeling, dealing with missing fire records is a crucial and delicate step. For this study, any days that had weather data but no corresponding wildfire information were removed to reduce noise and avoid introducing bias into the model. To start, the climate and wildfire datasets were combined using the date as the key, so that each weather record could be matched with a wildfire event if one occurred on that day. Any records without matching fire data were excluded, following best practices for handling incomplete datasets. Ref. [61] warns against simply assuming that missing values mean “no fire” happened without solid evidence, as this can lead to inaccurate conclusions.
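The date-keyed merge described above behaves like an inner join: a weather record is kept only when a fire record exists for the same day. The following minimal Python sketch illustrates this under stated assumptions. The field names (`date`, `temperature`, `fire_area`) and the toy values are hypothetical, and the sketch handles a single region; the source does not specify how regions were keyed.

```python
from datetime import date


def join_on_date(weather_rows, fire_rows):
    """Inner-join weather and fire records on their shared date key."""
    fire_by_date = {row["date"]: row for row in fire_rows}
    merged = []
    for weather in weather_rows:
        fire = fire_by_date.get(weather["date"])
        if fire is None:
            # Weather-only day: dropped rather than treated as "no fire".
            continue
        merged.append({**weather, **fire})
    return merged


weather = [
    {"date": date(2020, 1, 1), "temperature": 31.0},
    {"date": date(2020, 1, 2), "temperature": 35.5},
    {"date": date(2020, 1, 3), "temperature": 28.2},
]
fires = [
    {"date": date(2020, 1, 1), "fire_area": 12.4},
    {"date": date(2020, 1, 3), "fire_area": 3.1},
]
merged = join_on_date(weather, fires)
```

In this toy case the record for 2 January is discarded because no fire observation exists for that day, which mirrors the exclusion rule stated in the text.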
In the case of wildfires, missing records do not necessarily mean that there was no fire. Instead, they can reflect inconsistencies in reporting or issues during data collection. After removing these unmatched records, we performed exploratory data analysis (EDA) to examine the cleaned dataset and understand its characteristics. This involved checking the distribution of data points over time, reviewing weather conditions, and ensuring that the geographic coverage still reflected reality. EDA helps confirm that no unintended biases were introduced when removing data and that the dataset still fairly represents the original situation.
By doing this, we can be confident that the data feeding into the predictive model truly captures the real relationships between weather and wildfire events. After the dataset was refined through EDA, it was preprocessed with wildfire occurrences set as the target variable and the weather characteristics as the predictors. This ensures the model learns meaningful patterns, which is vital for accuracy and reliability. Overall, this careful data cleaning, exploration, and preparation process reduces outliers and keeps the input data robust, making the wildfire prediction model more dependable.

2.5. Prediction Models for Wildfire Characteristics

Machine learning models are widely used for predictive analytics due to their ability to capture complex, often non-linear relationships between features and targets. This study evaluates a wide range of machine learning algorithms: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Support Vector Machine (SVM), Linear Regression (LR), Lasso Regression, Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and K-Nearest Neighbor (KNN). The models are grouped below by family, together with their use in recent wildfire and related predictive modeling research. All models are first tested on the dataset for Australia overall. The five most promising models are then selected for the subsequent regional analysis based on RMSE and R².

2.5.1. Tree-Based Models

RF is an ensemble method composed of multiple decision trees trained on bootstrapped subsets of the data. At each split, a random subset of features is selected to decorrelate trees and reduce overfitting. The final prediction is an average of the individual tree predictions:
$$\hat{y} = \frac{1}{T} \sum_{t=1}^{T} h_t(x)$$
where $\hat{y}$ is the final predicted value, $T$ is the total number of trees, $h_t(x)$ is the prediction from the $t$-th tree, and $x$ is the input feature vector. RF is widely used for wildfire prediction due to its high accuracy and robustness. It has outperformed other models in diverse settings like Bolivia, China, and Iran [62,63]. RF handles complex variables well, including vegetation indices and human activity [64]. Studies recommend combining RF with resampling and explainable AI to enhance performance in climate-affected regions [65,66].
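As an illustration of the averaging rule, the short Python sketch below combines the outputs of three already-fitted trees into one prediction. The three stump functions, their thresholds, and the two-feature input are hypothetical and are not taken from the study; tree fitting and bootstrapping are deliberately left out.

```python
def forest_predict(trees, x):
    """Average the predictions of T fitted trees for a single input x."""
    return sum(tree(x) for tree in trees) / len(trees)


# Three hypothetical decision stumps standing in for fitted trees.
# x[0] plays the role of temperature, x[1] of relative humidity.
trees = [
    lambda x: 2.0 if x[0] > 30.0 else 1.0,
    lambda x: 3.0 if x[1] < 20.0 else 1.0,
    lambda x: 2.5,
]

y_hat_hot_dry = forest_predict(trees, [35.0, 15.0])   # (2.0 + 3.0 + 2.5) / 3
y_hat_cool_wet = forest_predict(trees, [25.0, 40.0])  # (1.0 + 1.0 + 2.5) / 3
```

Because each tree sees a different bootstrap sample and feature subset in a real forest, the individual predictions differ, and the average is typically less variable than any single tree.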
XGBoost constructs an additive model through gradient-based optimization of a regularized objective function:
$$\mathcal{L} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
where $n$ is the number of training examples, $y_i$ is the true label, $\hat{y}_i$ is the predicted output, $l(\cdot)$ is a differentiable loss function, $K$ is the number of trees, and $\Omega(f_k)$ is a regularization term that penalizes model complexity for each tree $f_k$ [67].
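The objective can be evaluated by hand for a small case. The sketch below is illustrative only and makes two assumptions that the text does not state: the loss is the squared error, and the penalty takes the form commonly given for XGBoost, namely gamma times the number of leaves plus half of lambda times the sum of squared leaf weights. Each tree is represented simply as a list of leaf weights; all values are hypothetical.

```python
def regularized_objective(y_true, y_pred, trees, gamma=1.0, lam=1.0):
    """Training loss plus a complexity penalty summed over all trees.

    Assumptions: squared-error loss; penalty per tree equals
    gamma * (number of leaves) + 0.5 * lam * (sum of squared leaf weights).
    """
    loss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    penalty = sum(
        gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)
        for leaf_weights in trees
    )
    return loss + penalty


# Two hypothetical trees, given by their leaf weights.
trees = [[0.5, -0.5], [0.2, 0.1, -0.3]]
objective = regularized_objective([1.0, 0.0], [0.7, 0.1], trees)
# loss = 0.09 + 0.01; penalty = (2 + 0.25) + (3 + 0.07)
```

The penalty grows with every additional leaf, which is how the objective discourages overly deep or numerous trees.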
XGBoost performs well with high-dimensional data, offering over 81% accuracy in prediction of the fire period [68]. It integrates well with real-time meteorological inputs and supports explainable AI [48,69]. Although adaptable, its performance depends on the selection of features adapted to local conditions [70].
LightGBM is a gradient boosting framework optimized for speed and scalability. It uses histogram-based feature binning and a leaf-wise growth strategy, making it efficient for large datasets.
LightGBM is fast, accurate, and effective with large datasets. It has been hybridized with metaheuristics to predict wildfire risk with AUC > 0.85 [71] and outperformed LSTM in time series predictions [72]. The SHAP integration helps to explain the importance of the features. LightGBM is suitable for both static mapping and real-time wildfire prediction systems.

2.5.2. SVM

SVMs aim to find a decision boundary (hyperplane) that maximizes the margin between classes in the feature space. The soft-margin optimization problem is defined as follows:
$$\min_{w, b} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i$$
subject to
$$y_i \left( w \cdot \phi(x_i) + b \right) \geq 1 - \xi_i, \quad \xi_i \geq 0$$
where $w$ is the weight vector, $b$ is the bias term, $\phi(\cdot)$ is a kernel mapping function, $\xi_i$ are slack variables allowing margin violations, $C$ is the penalty parameter, and $y_i \in \{-1, 1\}$ are the class labels.
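To make the role of the slack variables concrete, the following Python sketch evaluates the soft-margin objective for a fixed hyperplane. It assumes a linear kernel, so the feature map is the identity, and it uses the smallest slack that satisfies each constraint, which is max(0, 1 - y_i (w · x_i + b)). The data points and the hyperplane are hypothetical; no optimization is performed.

```python
def soft_margin_objective(w, b, X, y, C=1.0):
    """Return the soft-margin objective and the per-sample slack values.

    Assumption: linear kernel, i.e. phi(x) = x.
    """
    slacks = []
    for x_i, y_i in zip(X, y):
        margin = y_i * (sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
        slacks.append(max(0.0, 1.0 - margin))  # smallest feasible slack
    value = 0.5 * sum(w_j * w_j for w_j in w) + C * sum(slacks)
    return value, slacks


X = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]]
y = [1, 1, -1]
value, slacks = soft_margin_objective([1.0, 0.0], 0.0, X, y, C=1.0)
```

Only the second point lies inside the margin, so it is the only one with non-zero slack. Increasing C makes such violations more costly relative to the margin width.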
SVMs have shown strong performance in wildfire prediction. Ref. [36] reported an AUC of 0.908 in Indian forests using SVM with MODIS and climate data. Ref. [73] found that a Boosted Trees–Random Forest model outperformed SVM, but noted that SVM still performed reliably. Other studies, including [51,74,75], also tested SVM models, though they were not the top performers in those cases. These findings underscore SVM’s robustness across different contexts, even when outperformed by ensemble or deep learning models.

2.5.3. Regression Models

LR models a linear relationship between the dependent variable y and predictors x:
$$y = \beta_0 + \sum_{j=1}^{p} \beta_j x_j + \epsilon$$
where $\beta_0$ is the intercept, $\beta_j$ are regression coefficients, $x_j$ are input features, $p$ is the number of features, and $\epsilon$ is the random error term [66].
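For a single predictor (p = 1), the least-squares estimates of the intercept and slope have a closed form, which the Python sketch below computes. It is a minimal illustration with hypothetical data; the multi-feature case used in practice requires solving the full normal equations or calling a library routine.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares for one predictor: returns (beta_0, beta_1)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    covariance = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    variance = sum((xi - x_mean) ** 2 for xi in x)
    beta_1 = covariance / variance
    beta_0 = y_mean - beta_1 * x_mean
    return beta_0, beta_1


# Hypothetical noise-free data generated from y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
beta_0, beta_1 = fit_simple_linear_regression(x, y)
```

With noise-free data the estimates recover the generating intercept and slope exactly; with real observations the error term absorbs the remaining scatter.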
LR remains a simple, interpretable tool for wildfire prediction, though it often underperforms compared to advanced ML models. Ref. [76] used MLR for fire severity prediction and emphasized its accessibility for users without technical backgrounds, despite ANN showing higher accuracy. Ref. [70] found LR’s MAE (7.48 km2) notably worse than RF and XGBoost (3.13 km2), showing LR’s poor generalization across regions. Similarly, ref. [77] found that LSTM outperformed MLR in capturing non-linear climatic patterns in Thailand. Ref. [78] showed that decision trees outperform both linear and ridge regression due to their ability to capture complex variable relationships. While LR’s simplicity is useful in low-resource or interpretability-focused settings, most studies recommend non-linear ML models for greater accuracy and adaptability.
Lasso Regression introduces $L_1$ regularization to encourage sparsity in coefficients, optimizing the following:
$$\min_{\beta} \; \sum_{i=1}^{n} \left( y_i - x_i^{\top} \beta \right)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$
where $\lambda$ is the regularization parameter, $x_i$ is the feature vector for sample $i$, and $\beta_j$ are model coefficients.
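One standard way to minimize this objective is cyclic coordinate descent with soft-thresholding, sketched below in plain Python. The source does not say which solver was used, so this is an assumption for illustration. The sketch follows the objective exactly as written (no intercept, no 1/n scaling), which gives a threshold of lambda/2 for each coordinate update. It also assumes no feature column is entirely zero. The toy data are hypothetical.

```python
def soft_threshold(rho, threshold):
    """Shrink rho toward zero by threshold; return 0 inside the band."""
    if rho > threshold:
        return rho - threshold
    if rho < -threshold:
        return rho + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize sum_i (y_i - x_i . beta)^2 + lam * sum_j |beta_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j (assumed non-zero)
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho, lam / 2.0) / z
    return beta


# Feature 0 carries a strong signal, feature 1 a very weak one.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta_lasso = lasso_coordinate_descent(X, y, lam=1.0)
beta_unpenalized = lasso_coordinate_descent(X, y, lam=0.0)
```

With the penalty switched on, the weak coefficient is set exactly to zero while the strong one is only shrunk, which is the sparsity property described in the text. Setting lambda to zero recovers the ordinary least-squares fit.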
Ref. [79] applied Lasso regression in Jilin Province, achieving 85% accuracy and highlighting weather and topography as dominant fire risk drivers. Lasso’s strength lies in feature selection and model simplicity, though it trails ML models like RF and SVM in performance. While Lasso identified key risk zones, it lacks the flexibility to model non-linear wildfire dynamics. Integrating Lasso with other models or visualization tools may improve real-time wildfire prevention strategies.

2.5.4. Neural Networks

MLPs consist of stacked layers of neurons applying non-linear transformations:
$$a^{(l)} = \sigma\left( W^{(l)} a^{(l-1)} + b^{(l)} \right)$$
where $a^{(l)}$ is the activation vector in layer $l$, $W^{(l)}$ and $b^{(l)}$ are the weight matrix and bias vector of layer $l$, and $\sigma(\cdot)$ is an activation function such as ReLU or sigmoid.
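The layer rule can be traced numerically with a two-layer forward pass. The Python sketch below is illustrative: the weights, biases, and input are hypothetical, the hidden layer uses ReLU, and the output layer uses the identity activation, as is common for regression targets. No training is shown.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def identity(values):
    return list(values)


def dense_layer(W, a_prev, b, activation):
    """Compute activation(W a_prev + b) for one fully connected layer."""
    pre_activation = [
        sum(w * a for w, a in zip(row, a_prev)) + bias
        for row, bias in zip(W, b)
    ]
    return activation(pre_activation)


x = [1.0, 2.0]                      # input features, a^(0)
W1 = [[0.5, -1.0], [1.0, 1.0]]      # hidden layer weights (hypothetical)
b1 = [0.0, -1.0]
W2 = [[1.0, 0.5]]                   # output layer weights (hypothetical)
b2 = [0.1]

hidden = dense_layer(W1, x, b1, relu)           # pre-activation [-1.5, 2.0]
output = dense_layer(W2, hidden, b2, identity)  # 0.5 * 2.0 + 0.1
```

The first hidden unit is clipped to zero by ReLU, which is the non-linearity that lets stacked layers represent relationships a single linear map cannot.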
MLP models demonstrate strong wildfire prediction accuracy across diverse settings. Ref. [80] achieved 82.97% accuracy in California, while global studies showed MLPs consistently outperforming LR [70]. Vegetation was a key predictor across models, and preprocessing steps such as the dynamic factor extraction of [81] enhanced performance. Visualization tools, such as the 3D maps of [80], increased usability for decision-makers. MLPs show promise for operational wildfire early warning systems, especially when supported by feature engineering and visualization.
LSTM networks are a type of recurrent neural network (RNN) designed to capture long-term dependencies using memory cells and gated updates:
$$f_t = \sigma\left( W_f \cdot [h_{t-1}, x_t] + b_f \right)$$
$$i_t = \sigma\left( W_i \cdot [h_{t-1}, x_t] + b_i \right)$$
$$\tilde{C}_t = \tanh\left( W_C \cdot [h_{t-1}, x_t] + b_C \right)$$
$$C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t$$
$$o_t = \sigma\left( W_o \cdot [h_{t-1}, x_t] + b_o \right)$$
$$h_t = o_t \times \tanh(C_t)$$
where $x_t$ is the input vector at time $t$, $h_t$ is the hidden state, $C_t$ is the cell state, $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, and $W_{*}$ and $b_{*}$ are learnable weights and biases.
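A single time step of these gate equations can be written out directly. The Python sketch below is a minimal illustration using plain lists; the parameter dictionary keys (`W_f`, `b_f`, and so on) are hypothetical names, the products between gates and states are taken element-wise, and the example uses all-zero parameters so that every gate evaluates to 0.5 and the result is easy to verify by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W v + b for a weight matrix given as a list of rows."""
    return [sum(w * vi for w, vi in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """Advance the LSTM by one time step; returns (h_t, C_t)."""
    hx = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], hx, params["b_f"])]
    i_t = [sigmoid(z) for z in affine(params["W_i"], hx, params["b_i"])]
    c_tilde = [math.tanh(z) for z in affine(params["W_C"], hx, params["b_C"])]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = [sigmoid(z) for z in affine(params["W_o"], hx, params["b_o"])]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hidden size 1, input size 2, all parameters zero (hypothetical).
zero_params = {name: [[0.0, 0.0, 0.0]] for name in ("W_f", "W_i", "W_C", "W_o")}
zero_params.update({name: [0.0] for name in ("b_f", "b_i", "b_C", "b_o")})

h_t, c_t = lstm_step(x_t=[0.3, -0.2], h_prev=[0.0], c_prev=[2.0], params=zero_params)
```

With zero parameters the forget gate halves the previous cell state and the candidate contributes nothing, so the new cell state is 1.0 and the hidden state is 0.5 · tanh(1.0). Trained weights would make the gates depend on the input and the previous hidden state.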
LSTM excels in time series wildfire prediction by incorporating dynamic climatic data. Refs. [26,77] found that LSTM outperformed static ML models in predicting fire risk in China and Thailand. Advanced LSTM architectures (ULSTM, SLA-ConvLSTM) by refs. [31] and [30] integrated spatio-temporal and teleconnection data for >97% accuracy in California and globally. LSTM-based surrogate models, as shown by [28], drastically reduced computation time while maintaining high accuracy. However, LSTM struggles with rare events and lacks accuracy in regions with limited historical data [72,82].

2.5.5. Instance-Based Models

KNN is a non-parametric method that predicts a target value for a given input x by averaging the outputs of the k closest training samples:
$$\hat{y} = \frac{1}{k} \sum_{i \in N_k(x)} y_i$$
where $N_k(x)$ is the set of indices of the $k$ nearest neighbors to $x$, and $y_i$ are their associated outputs.
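The neighbor-averaging rule translates almost line for line into code. The Python sketch below assumes Euclidean distance, which the text does not specify, and uses hypothetical one-dimensional training data.

```python
import math


def knn_predict(X_train, y_train, x, k=3):
    """Predict by averaging the targets of the k nearest training samples.

    Assumption: Euclidean distance between feature vectors.
    """
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    neighbors = order[:k]
    return sum(y_train[i] for i in neighbors) / k


X_train = [[0.0], [1.0], [2.0], [10.0]]
y_train = [1.0, 2.0, 3.0, 50.0]
prediction = knn_predict(X_train, y_train, [1.2], k=3)
```

The distant sample at 10.0 is ignored, so its extreme target value does not affect the prediction. Because distances are sensitive to feature scale, inputs are usually standardized before KNN is applied.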
KNN’s simplicity helps in mapping wildfire patterns, but it struggles with non-linear or imbalanced data. Refs. [83,84] noted poor performance in high-dimensional datasets. Though KNN achieved 97% accuracy at a county level [85], it overfit on imbalanced data. While useful in localized fire monitoring, its predictive accuracy is limited unless paired with preprocessing.

2.6. Model Performance Evaluation and Selection

The evaluation process involved optimizing model parameters and validating performance using k-fold cross-validation. In this approach, the dataset is divided into five folds. Each fold serves once as the test set (20% of the data), while the remaining four folds form the training set (80%). The cycle is repeated five times and the results are averaged to give the final evaluation metric, which reduces the risk of overfitting to a single split and yields a more robust performance estimate while preserving the time series order.
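The fold construction can be sketched as follows. This is one possible reading of the procedure, not the authors' code: the folds are contiguous, unshuffled blocks of the date-ordered data, so the order of records within each fold is preserved. The function name `ordered_kfold` is hypothetical.

```python
def ordered_kfold(n_samples, n_splits=5):
    """Yield (train_indices, test_indices) for contiguous, unshuffled folds."""
    base, remainder = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < remainder else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(ordered_kfold(10, n_splits=5))
first_train, first_test = folds[0]
```

Note that with this scheme the training folds can contain records that are later in time than the test fold. A forward-chaining split, in which each test block is preceded only by earlier data, is a stricter alternative when leakage from the future is a concern.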
For each region, the selection of the best-performing model was guided by evaluation metrics, including Root Mean Squared Error (RMSE) and R². RMSE, a widely used metric, measures the model prediction accuracy by calculating the square root of the average squared differences between the predicted and observed values. These steps ensured a thorough and systematic approach to identifying the most effective models for regional wildfire prediction.
The RMSE measures the average magnitude of the prediction errors:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$
where
  • $n$ is the total number of observations;
  • $y_i$ is the actual value at time $i$;
  • $\hat{y}_i$ is the predicted value at time $i$.
The coefficient of determination $R^2$ assesses the proportion of variance in the target variable explained by the model, with higher values indicating better model fit [86]:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
where
  • $\bar{y}$ is the mean of the actual values;
  • $n$ is the total number of observations.
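Both metrics can be computed directly from their definitions. The following is a minimal sketch with hypothetical toy values, not the evaluation code used in the study.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error, in the units of the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical observations and predictions
y_obs = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.0, 2.0, 3.0, 5.0]
example_rmse = rmse(y_obs, y_hat)
example_r2 = r2_score(y_obs, y_hat)

# Predicting the mean everywhere gives R2 = 0; anything worse is negative
mean_r2 = r2_score(y_obs, [2.5, 2.5, 2.5, 2.5])
bad_r2 = r2_score(y_obs, [4.0, 3.0, 2.0, 1.0])
```

The last two lines show how to read the scale of R2: a value of zero corresponds to always predicting the mean of the observations, and a negative value means the model does worse than that baseline.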

3. Results

The tested models were assessed using RMSE and R2 score across the different targets. MLP achieved negative R2 scores and was therefore excluded. SVM and LSTM performed worse than the six selected models in both RMSE and R2 and are not included. The chosen models achieved positive R2 scores for at least one target and are analyzed further for regional performance.
Overall, Lasso Regression stands out for its accurate predictions across all targets, particularly for FA and FRP. Random Forest and LightGBM emerged as reliable models for FBT, balancing RMSE and R2 score. While XGBoost and KNN perform reasonably well, they lag slightly behind the top-performing models. Lasso Regression, Random Forest, and LightGBM exhibit the strongest potential for wildfire-related predictions, with Lasso performing best overall. The results for the best-performing models overall, and then per region and target, are discussed in Section 3.1 and Section 3.2.

3.1. Australia Overall Best Performance of Wildfire Prediction

Based on the analysis of all machine learning models evaluated, Lasso Regression emerges as the overall best-performing method for predicting the three wildfire-related targets. While RF achieved slightly better results for mean estimated FBT, Lasso Regression consistently provides reliable predictions across all targets with the added advantages of simplicity and interpretability.
Lasso Regression demonstrated consistent performance across all three prediction targets associated with wildfire characterization, with the details shown in Table 2. In modeling the estimated FA, the approach yielded an RMSE of 0.02105 and a coefficient of determination (R2) of 0.0845, indicating a low absolute error but limited explanatory power. When applied to the prediction of mean estimated FBT, Lasso achieved an RMSE of 0.06415 and an R2 of 0.28447, suggesting a comparatively stronger fit. For the prediction of mean estimated FRP, the method produced an RMSE of 0.01769 and an R2 value of 0.02165, reflecting limited but measurable predictive alignment. These results emphasize Lasso Regression’s ability to predict fire targets, with variable effectiveness depending on the response variable.
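A low RMSE can coexist with a low R2 because RMSE is expressed in the units of the target, while R2 compares the error with the spread of the target. From the two definitions, R2 = 1 − RMSE² / σ², where σ is the standard deviation of the observed values, so σ = RMSE / √(1 − R2). The sketch below applies this identity to the values reported above. It assumes that both metrics were computed on the same observations; if the reported figures are averages over folds or regions, the identity holds only approximately. The function name is hypothetical.

```python
import math

def implied_target_std(rmse, r2):
    """Standard deviation of the target implied by an (RMSE, R2) pair.

    Follows from R2 = 1 - RMSE**2 / sigma**2, assuming both metrics
    refer to the same set of observations.
    """
    if r2 >= 1.0:
        raise ValueError("R2 must be below 1 for the identity to be usable")
    return rmse / math.sqrt(1.0 - r2)

# (RMSE, R2) pairs reported for Lasso Regression in this section
reported = {
    "FA": (0.02105, 0.0845),
    "FBT": (0.06415, 0.28447),
    "FRP": (0.01769, 0.02165),
}
implied = {target: implied_target_std(*pair) for target, pair in reported.items()}
```

Under that assumption, the implied spread of each target is only slightly larger than the corresponding RMSE for FA and FRP, which is consistent with a model that improves little on predicting the mean.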

3.2. Best ML Model per Region and Target of Wildfire Prediction

The performance of the models across the regions and targets is visualized and discussed in the following subsections.

3.2.1. Estimated FA Prediction Results Across Australian Regions

The best performing models per region are shown in Figure 4 and Table 3. Lasso Regression emerged as the top-performing model in NSW and WA. In NSW, it recorded an RMSE of 0.00394 and an R2 score of 0.05249. Its strength lies in reducing overfitting through regularization, which is particularly effective in regions characterized by heterogeneous topography and diverse vegetation structures. This includes variations in slope, elevation, and aspect, factors known to influence burn severity and fuel accumulation. In WA, Lasso achieved an RMSE of 0.03549 and an R2 score of 0.15189, effectively managing the sparse vegetation and open shrubland typical of this region, where fire spread is influenced by wind exposure and lower canopy cover [25].
In SA, TA, and VI, KNN showed superior performance. For SA, KNN achieved an RMSE of 0.00773 and R2 = 0.12744; in TA, RMSE = 0.00476 and R2 = 0.14280; and in VI, RMSE = 0.00596 and R2 = 0.20591. These regions are characterized by highly variable microclimates, where local topography, canopy density, and surface moisture drive fine-scale variation in fire behavior. KNN’s spatially sensitive modeling approach effectively captures these local dependencies [49,87]. In TA and VI, complex interactions between solar radiation, wind speed, and relative humidity further influence fire ignition and spread, making localized models particularly suitable.
In QL, RF achieved the highest R2 score across all regions at 0.51979 with an RMSE of 0.01859. RF’s strength lies in its ability to model non-linear interactions between environmental features such as fuel moisture, canopy density, and terrain. This capability is especially useful in QL’s diverse ecosystems, ranging from coastal rainforests to savanna woodlands, where vegetation density and seasonal rainfall strongly impact fire dynamics [88].
In the NT, LightGBM outperformed other models, achieving an RMSE of 0.02454 and an R2 score of 0.20750. NT’s fire regimes are shaped by strong seasonality, frequent lightning strikes, and rapid vegetation transitions. LightGBM’s gradient boosting approach allows it to effectively learn from high-variance data and prioritize informative features across seasons, making it well-suited for this environment [88].
These results suggest that the best model for each region aligns closely with its ecological and climatic characteristics. Lasso Regression handles heterogeneity and sparsity through feature selection; KNN excels in regions where local microclimatic effects dominate; RF captures non-linear patterns in diverse ecological contexts; and LightGBM performs well under high variability.
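The feature-selection behavior attributed to Lasso can be illustrated with the soft-thresholding operator. The sketch assumes the objective ½‖y − Xβ‖² + λ‖β‖₁ with orthonormal features, in which case each Lasso coefficient is the soft-thresholded least-squares coefficient. The coefficient values and the penalty λ = 0.10 are hypothetical and are not taken from the study.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each value toward zero by lam and set small values to zero."""
    z = np.asarray(z, dtype=float)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Hypothetical least-squares coefficients for four features
ols_coefs = np.array([0.80, -0.05, 0.30, 0.02])

# Assumption: orthonormal features, so Lasso reduces to soft-thresholding
lasso_coefs = soft_threshold(ols_coefs, lam=0.10)
n_selected = int(np.count_nonzero(lasso_coefs))
```

Coefficients whose magnitude is below the penalty are set exactly to zero, which removes weak predictors, while the remaining coefficients are shrunk. This shrinkage is also why Lasso predictions tend to be conservative.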

3.2.2. Mean FBT Prediction Results Across Australian Regions

Lasso Regression emerged as the most effective model for predicting mean estimated FBT across all the Australian regions except NT, as seen in Figure 5. Table 4 shows the models’ performance in all the regions. In NSW, it achieved an RMSE of 0.07470 and an R2 of 0.41712, successfully modeling brightness intensity despite limited feature inputs. This performance may be attributed to Lasso’s capacity for sparse feature selection, which is advantageous in regions with variable fuel loads and fragmented ecosystems.
In WA, the RMSE was 0.06172 with an R2 of 0.32574. The region’s large-scale wind systems and high solar exposure result in sharp contrasts in thermal readings, which Lasso handles by weighting key predictors such as wind gusts and radiative fluxes [49]. Similarly, in QL (RMSE = 0.05747, R2 = 0.29785), Lasso effectively captured FBT patterns in areas influenced by frequent storms and dense tropical canopies, where reflectivity and moisture retention impact satellite-based brightness temperature readings [88].
In SA (RMSE = 0.09551, R2 = 0.34424), FBT is strongly governed by dry spells and diurnal temperature variation, factors that Lasso’s regularized approach efficiently incorporates. For VI and TA, the RMSE values were 0.08127 and 0.10238, with R2 scores of 0.24826 and 0.14451, respectively. These lower scores likely reflect the challenges posed by dense cloud cover and complex terrain affecting satellite detection of FBT [87].
In the NT, Gradient Boosting outperformed all other models (RMSE = 0.06007, R2 = 0.24618). This result underscores Gradient Boosting’s ability to model non-linear interactions among weather variables, vegetation type, and satellite reflectance. NT’s landscape is influenced by highly seasonal monsoonal transitions and frequent lightning ignitions, making it well-suited for ensemble learners like Gradient Boosting that prioritize informative features across decision paths [88]. These results reinforce that model effectiveness is tightly coupled with regional fire regimes, vegetation structures, and satellite signal characteristics. Lasso Regression is favored where linear feature impacts dominate and data sparsity must be addressed, while Gradient Boosting thrives in ecologically diverse, noise-prone regions.

3.2.3. Mean FRP Prediction Results Across Australian Regions

Lasso Regression consistently demonstrated the lowest RMSE values for predicting mean FRP across all Australian regions, affirming its strength in minimizing absolute prediction error, as seen in Table 5. In QL, the model achieved an RMSE of 0.01347, its best performance among all regions, indicating precise alignment between modeled and observed radiative energy emitted during fire events. This suggests Lasso’s efficiency in selecting dominant features and eliminating noise in tropical and sub-tropical fire environments, where FRP is influenced by convective heat flux and vegetation type [89].
WA and NSW followed with RMSE values of 0.02295 and 0.02325, respectively. These outcomes highlight Lasso’s capability in reducing error even in ecosystems affected by sporadic lightning-induced fires and coastal wind corridors, which typically complicate FRP estimates [49]. In SA and TA, regions with heterogeneous landscapes and cooler climates, the model maintained competitiveness, achieving RMSEs of 0.03525 and 0.04813. These findings illustrate that Lasso can generalize across climatic zones, even when satellite-based thermal detection faces obstructions due to moisture or terrain [87].
However, the R2 scores reveal a shortcoming: the model’s ability to explain variance in FRP remains limited. The highest R2 value was observed in QL (0.13965), indicating a weak-to-moderate correlation between predicted and actual FRP. In SA and TA, the R2 scores dropped sharply to 0.00668 and 0.01558, respectively. These values suggest that while Lasso is capable of reducing the magnitude of prediction errors, it falls short in modeling the complexity and variability inherent in FRP dynamics, which are driven by non-linear interactions between biomass combustion intensity, wind patterns, humidity, and fuel moisture [88].
This pattern suggests that Lasso may act more as a conservative estimator in FRP modeling, reducing extreme error without fully capturing the chaotic behavior of high-energy fire events. The limited explanatory power indicated by the low R2 scores points to the need for complementary models or hybrid approaches in regions with complex fire regimes.

4. Discussion

The results demonstrate that Lasso Regression consistently performs well in predicting wildfire-related parameters across diverse Australian regions. Its predictive accuracy for FA, FBT, and FRP aligns with earlier applications of Lasso in environmental modeling tasks such as rainfall prediction and flood risk [90,91]. Lasso’s inherent strength in variable selection and regularization makes it an attractive choice for operational early warning systems, balancing accuracy with computational efficiency.
Notably, regions such as NSW, WA, QL, and the NT exhibited strong predictive performance, highlighting Lasso’s utility for real-time alerts, mitigation planning, and strategic resource deployment. Its relatively low computational burden compared to ensemble models makes it especially viable for national-scale implementation.
While Lasso performed well overall, limitations remain, particularly in regions like VI and SA, where lower R2 scores indicate underfitting. The model’s linear structure may struggle to capture complex non-linear interactions inherent in wildfire behavior, which is influenced by a combination of climatic conditions (e.g., humidity, temperature extremes), vegetation dynamics (e.g., fuel type and load), and topography (e.g., slope, elevation). These factors are not fully represented in the current dataset, highlighting the need for more sophisticated or hybrid models that balance interpretability with the ability to model complex fire dynamics. Furthermore, the exclusion of anthropogenic drivers, such as ignition sources, land-use patterns, and proximity to infrastructure, limits the model’s ability to forecast human-caused fire events or assess urban–wildland interface risk. Addressing these gaps requires the integration of high-resolution spatiotemporal datasets, including land-use maps, human population density, and fire history.
Future research should explore hybrid models that combine Lasso with non-linear learners such as Gradient Boosting and Random Forests, and evaluate additional models such as CNN and Elastic Net, to improve predictive accuracy. Integrating real-time multi-modal data, such as sensor readings, satellite observations, and machine learning forecasts, can better capture complex fire–environment relationships while maintaining operational feasibility. Further enhancements include incorporating high-resolution spatiotemporal climate data, topographic features, and human activity indicators (e.g., ignition sources, land-use patterns) to improve ecological and operational relevance. Climate change projections, feature importance analysis, and region-specific customization will also strengthen model robustness. Additionally, ongoing development of the alert system aims to dynamically adapt severity thresholds based on land type classifications (e.g., forest, grassland, urban–wildland interface zones) and monitor rapid changes in fire metrics to trigger timely and targeted wildfire management interventions.
To support wildfire response and management, the machine learning model developed in this study predicts three key fire parameters: estimated FA (km2), FBT (K), and FRP (MW). These outputs are automatically mapped to a five-level severity scale: No Danger, Low, Moderate, High, and Extreme. The scale is based on empirically derived thresholds informed by satellite fire detection research and operational fire behavior frameworks. Table 6 summarizes the threshold criteria used for severity classification.
The FBT thresholds reflect the logic of the MODIS fire detection algorithm [58], which uses contextual criteria based on thermal contrast and absolute temperature. MODIS typically identifies fire pixels above 330 K, with high-intensity events exceeding 365 K. These categories align with brightness patterns commonly observed in moderate to extreme wildfires and allow for operationally meaningful classification of fire severity.
FRP thresholds are grounded in studies of MODIS and VIIRS data. Active fires are generally detectable at 30 MW, with moderate-intensity events between 100 and 300 MW and intense fires frequently exceeding 1500 MW [59]. The upper category, “Extreme”, reflects large, fast-spreading fires that demand substantial suppression resources.
FA thresholds are informed by both the NWCG Fire Size Class system and EFFIS reporting standards. NWCG defines fires over 10 km2 as large-scale events, often requiring interagency response coordination. Similarly, EFFIS considers events above 0.3 km2 as significant. This study expands the scale further to reflect operational needs, enabling finer-grained assessments of fire size.
By integrating this severity framework, the prediction system enhances situational awareness for decision-makers, enabling real-time assessments of fire intensity and spatial extent. This facilitates targeted interventions, more efficient resource allocation, and earlier warnings for high-risk areas.
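The mapping from a predicted value to a severity level can be sketched as a lookup against ordered cut points. The cut points below are illustrative only: they reuse the FRP figures quoted in the text (30, 100, 300, and 1500 MW), but their assignment to the five levels is an assumption and does not reproduce Table 6. The names `classify_severity` and `FRP_CUTS_MW` are hypothetical.

```python
from bisect import bisect_right

SEVERITY_LEVELS = ["No Danger", "Low", "Moderate", "High", "Extreme"]

# Illustrative cut points in MW (assumption); NOT the Table 6 thresholds
FRP_CUTS_MW = [30.0, 100.0, 300.0, 1500.0]

def classify_severity(value, cuts, levels=SEVERITY_LEVELS):
    """Return the severity level whose interval contains value.

    cuts must be sorted ascending and hold one fewer entry than levels;
    a value equal to a cut point falls into the higher level.
    """
    if len(cuts) != len(levels) - 1:
        raise ValueError("need exactly len(levels) - 1 cut points")
    return levels[bisect_right(cuts, value)]

low_fire = classify_severity(10.0, FRP_CUTS_MW)
mid_fire = classify_severity(250.0, FRP_CUTS_MW)
big_fire = classify_severity(2000.0, FRP_CUTS_MW)
```

The same function could be applied to FA and FBT predictions with their own cut points. How the three per-metric levels are combined into a single alert is not described in this section and is left open here.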

5. Conclusions

Key contributions of this study include:
1. Development and evaluation of Lasso Regression models tailored to seven diverse Australian fire regions, demonstrating regional adaptability and reliability.
2. Comparative analysis of multiple machine learning approaches for predicting FA, FBT, and FRP, establishing Lasso’s competitive edge.
3. Integration of environmental and climatic variables in predictive modeling, highlighting their differential impacts across the seven Australian regions.
4. Provision of a scalable framework for operational wildfire monitoring with reduced computational complexity, suited for real-time applications.
5. Recommendations for improving wildfire early warning systems through dynamic alert thresholds and temporal monitoring of fire characteristic changes.
This study confirms the growing utility of Lasso Regression in modeling region-specific wildfire behavior across Australia. Despite promising predictive accuracy, limitations such as the exclusion of ignition sources and potential real-time constraints indicate room for future enhancement. The proposed framework lays a foundation for scalable, adaptive wildfire forecasting and supports the development of robust decision-making tools for emergency response agencies.

Author Contributions

Conceptualization, Z.A.; methodology, Z.A. and A.E.; software, Z.A.; validation, Z.A.; formal analysis, Z.A.; investigation, Z.A.; resources, A.E. and M.E.B.; data curation, Z.A. and M.E.B.; writing—original draft, Z.A.; writing—review and editing, Z.A., A.E. and O.A.-K.; visualization, Z.A. and A.E.; supervision, A.E.; project administration, A.E. and M.E.B.; funding acquisition, A.E., O.A.-K. and M.E.B. All authors have read and agreed to the published version of the manuscript.

Funding

Funding was provided by the University of Wollongong in Dubai. Grant ID: URC25016.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support this study were obtained from IBM by permission/license. Data will be shared upon reasonable request to the corresponding author with permission from IBM.

Acknowledgments

We gratefully acknowledge International Business Machines Corporation (IBM) for acquiring, processing, and providing the data that enabled this research. The data sources, detailed in Table 1, are referenced in the text. IBM’s contribution was vital to our analysis and to advancing wildfire prediction using machine learning.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ML: Machine Learning
DL: Deep Learning
RF: Random Forest
XGBoost: Extreme Gradient Boosting
SVM: Support Vector Machine
LSTM: Long Short-Term Memory
LR: Linear Regression
MLP: Multilayer Perceptron
KNN: K-Nearest Neighbor
LightGBM: Light Gradient Boosting Machine
CNN: Convolutional Neural Network
RMSE: Root Mean Squared Error
R2: Coefficient of Determination
FA: Fire Area
FBT: Fire Brightness Temperature
FRP: Fire Radiative Power
km2: Square Kilometers
MW: Megawatts
K: Kelvin (temperature unit)
EDA: Exploratory Data Analysis
ERA5: ECMWF Reanalysis v5
ECMWF: European Centre for Medium-Range Weather Forecasts
MODIS: Moderate Resolution Imaging Spectroradiometer
VIIRS: Visible Infrared Imaging Radiometer Suite
NWCG: National Wildfire Coordinating Group
EFFIS: European Forest Fire Information System
WA: Western Australia
NT: Northern Territory
QL: Queensland
NSW: New South Wales and Australian Capital Territory
VI: Victoria
TA: Tasmania
SA: South Australia

References

  1. Koh, J. Gradient boosting with extreme-value theory for wildfire prediction. Extremes 2023, 26, 273–299.
  2. U.S. Department of Agriculture. Wildfire Risk and Resilience: A Comprehensive Strategy for the 2022–2030 Period; U.S. Forest Service: Washington, DC, USA, 2021. Available online: https://www.fs.usda.gov (accessed on 4 March 2025).
  3. Fann, N.; Alman, B.; Broome, R.A.; Morgan, G.G.; Johnston, F.H.; Pouliot, G.; Rappold, A.G. The health impacts and economic value of wildland fire episodes in the U.S.: 2008–2012. Sci. Total Environ. 2022, 807, 150741.
  4. Wen, B.; Wu, Y.; Xu, R.; Guo, Y.; Li, S. Excess emergency department visits for cardiovascular and respiratory diseases during the 2019–20 bushfire period in Australia: A two-stage interrupted time-series analysis. Sci. Total Environ. 2022, 809, 152226.
  5. Sun, W.; Tang, D.; Li, R. Global impact of fire emission on ambient nitrate (NO3) and health effects during 2005–2019. J. Hazard. Mater. 2025, 494, 138509.
  6. Schroeder, L.; de Souza, E.M.; Rosset, C.; Junior, A.M.; Boquett, J.A.; Rofatto, V.F.; Brum, D.; Gonzaga, L., Jr.; de Oliveira, M.Z.; Veronez, M.R. Fire association with respiratory disease and COVID-19 complications in the state of Pará, Brazil. Lancet Reg. Health Am. 2022, 6, 100102.
  7. Akter, S.; Grafton, R.Q. Do fires discriminate? Socio-economic disadvantage, wildfire hazard exposure and the Australian 2019–20 “Black Summer” fires. Clim. Change 2021, 165, 30.
  8. Wu, C.; Jiang, C.; Wang, X.; Thompson, J.R.; Peñuelas, J.; Yue, C.; Peng, S.; Huang, Y. Reduced global fire activity due to human demography slows global warming by enhanced land carbon uptake. Proc. Natl. Acad. Sci. USA 2022, 119, e2101186119.
  9. Geary, W.L.; Doherty, T.S.; Nimmo, D.G.; Tulloch, A.I.; Ritchie, E.G. Predator responses to fire: A global systematic review and meta-analysis. J. Anim. Ecol. 2020, 89, 955–971.
  10. Sayedi, S.S.; van Marle, M.J.E.; Hantson, S.; Forkel, M.; van der Werf, G.R. Assessing changes in global fire regimes. Fire Ecol. 2024, 20, 18.
  11. Plarier, A. Agricultural fire or arson? Hist. Reflect. 2020, 46, 9–24.
  12. Pereira, M.; Gonçalves, N.; Amraoui, M. The Influence of Wildfire Climate on Wildfire Incidence: The Case of Portugal. Fire 2024, 7, 234.
  13. Liu, Y.; Goodrick, S.; Heilman, W. Wildland fire emissions, Carbon, and climate: Wildfire–climate interactions. For. Ecol. Manag. 2014, 317, 80–96.
  14. van Wees, D.; van der Werf, G.R.; Randerson, J.T.; Giglio, L.; van Leeuwen, T.T. Global biomass burning fuel consumption and emissions at 500m spatial resolution based on the global fire emissions database (GFED). Geosci. Model Dev. 2022, 15, 8411–8437.
  15. Rodrigues, M.; Camprubí, À.C.; Balaguer-Romano, R.; Megía, C.J.C.; Castañares, F.; Ruffault, J.; Fernandes, P.M.; de Dios, V.R. Drivers and implications of the extreme 2022 wildfire season in Southwest Europe. Sci. Total Environ. 2023, 859, 160320.
  16. Dong, X.; Li, F.; Lin, Z.; Harrison, S.P.; Chen, Y.; Kug, J.S. Climate influence on the 2019 fires in Amazonia. Sci. Total Environ. 2021, 794, 148718.
  17. Storey, M.A.; Price, O.F.; Almeida, M.; Ribeiro, C.; Bradstock, R.A.; Sharples, J.J. Experiments on the influence of spot fire and topography interaction on fire rate of spread. PLoS ONE 2021, 16, e0245132.
  18. Cattau, M.E.; Wessman, C.; Mahood, A.; Balch, J.K. Anthropogenic and lightning-started fires are becoming larger and more frequent over a longer season length in the U.S.A. Glob. Ecol. Biogeogr. 2020, 29, 668–681.
  19. Sayad, Y.O.; Mousannif, H.; Al Moatassime, H. Predictive modeling of wildfires: A new dataset and Machine Learning Approach. Fire Saf. J. 2019, 104, 130–146.
  20. Balch, J.K.; St. Denis, L.A.; Mahood, A.L.; Magi, B.I.; Shuman, J.K.; Brando, P.M.; Forkel, M. Warming weakens the night-time barrier to global fire. Nature 2022, 602, 442–448.
  21. Tory, K.J.; Cruz, M.G.; Matthews, S.; Kilinc, M.; McCaw, W.L. On the sensitivity of fire-weather climate projections to empirical fire models. Agric. For. Meteorol. 2024, 348, 109928.
  22. Quan, X.; Wang, W.; Xie, Q.; He, B.; de Dios, V.R.; Yebra, M.; Jiao, M.; Chen, R. Improving wildfire occurrence modelling by integrating time-series features of weather and fuel moisture content. Environ. Model. Softw. 2023, 170, 105840.
  23. Makowski, D. Simple random forest classification algorithms for predicting occurrences and sizes of wildfires. Extremes 2022, 26, 331–338.
  24. Wang, Z.; Liu, Y.; Zhang, C.; Zhang, Y.; Zhang, L. Improving wildfire danger assessment using time series features of weather and fuel in the Great Xing’an Mountain Region, China. Forests 2023, 14, 986.
  25. Ismail, F.N.; Noor, A.M.; Tan, W.L.; Hassan, R.; Ahmad, S.A. An assessment of existing wildfire danger indices in comparison to one-class machine learning models. Nat. Hazards 2024. Preprint.
  26. Chen, L.; Zhao, X.; Li, Y. Time Series Prediction of Wildfires Using LSTM Networks in China. Atmos. Environ. 2023, 278, 119466.
  27. Zhu, Q.; Riley, W.J.; Lin, Y.; Chen, Y.; Jones, A.D.; Neale, R.B. Building a machine learning surrogate model for wildfire activities within a global earth system model. Geosci. Model Dev. 2022, 15, 1899–1911.
  28. Cheng, S.; Wu, Y.; Karniadakis, G.E. Data-Driven Surrogate Model with Latent Data Assimilation: Application to Wildfire Forecasting. J. Comput. Phys. 2022, 464, 111302.
  29. Li, F.; van Wees, D.; Luo, Y.; Yue, C.; Ward, D.S.; Randerson, J.T. Attentionfire_v1.0: Interpretable machine learning fire model for burned-area predictions over Tropics. Geosci. Model Dev. 2023, 16, 869–884.
  30. Ji, Y.; Zhang, L.; Fang, X.; Zhou, X.; Li, Y.; Xu, Y. Global wildfire danger predictions based on deep learning taking into account static and dynamic variables. Forests 2024, 15, 216.
  31. Bhowmik, R.T.; Jung, Y.S.; Aguilera, J.A.; Prunicki, M.; Nadeau, K. A multi-modal wildfire prediction and early-warning system based on A Novel Machine Learning Framework. J. Environ. Manag. 2023, 341, 117908.
  32. Miao, X.; Li, J.; Mu, Y.; He, C.; Ma, Y.; Chen, J.; Wei, W.; Gao, D. Time series forest fire prediction based on improved transformer. Forests 2023, 14, 1596.
  33. Kang, Y.; Jang, E.; Im, J.; Kwon, C.; Kim, S. Developing a new hourly forest fire risk index based on Catboost in South Korea. Appl. Sci. 2020, 10, 8213.
  34. Phelps, N.; Woolford, D.G. Guidelines for effective evaluation and comparison of wildland fire occurrence prediction models. Int. J. Wildland Fire 2021, 30, 225.
  35. Wilson, R.; Wickramasuriya, R.; Marchiori, D. An empirical modelling and simulation framework for fire events initiated by vegetation and electricity network interactions. Fire 2023, 6, 61.
  36. Sharma, S.K.; Aryal, J.; Rajabifard, A. Remote Sensing and meteorological data fusion in predicting bushfire severity: A case study from Victoria, Australia. Remote Sens. 2022, 14, 1645.
  37. McElhinny, M.; Jain, P.; Wang, X.; Flannigan, M.D. A high-resolution reanalysis of global fire weather from 1979 to 2018—overwintering the drought code. Earth Syst. Sci. Data 2020, 12, 1823–1833.
  38. Dimitrakopoulos, A.P.; Bemmerzouk, A.M.; Mitsopoulos, I.D. Evaluation of the Canadian Fire Weather Index system in an Eastern Mediterranean environment. Meteorol. Appl. 2011, 18, 83–93.
  39. Ullah, M.R.; Liu, X.D.; Al-Amin, M. Spatial-temporal distribution of forest fires and Fire Weather Index calculation from 2000 to 2009 in China. J. For. Sci. 2013, 59, 279–287.
  40. Walding, N.G.; Williams, H.T.P.; McGarvie, S.; Belcher, C.M. A comparison of the US National Fire Danger Rating System (NFDRS) with recorded fire occurrence and final fire size. Int. J. Wildland Fire 2018, 27, 99–113.
  41. Balashov, I.V.; Loupian, E.A.; Bartalev, S.A.; Burtsev, M.A.; Mazurov, A.A. ISDM-ROSLESKHOZ operation and evolution experience. IOP Conf. Ser. Earth Environ. Sci. 2021, 806, 012007.
  42. Zhou, T.; Ding, L.; Ji, J.; Yu, L.; Wang, Z. Combined estimation of fire perimeters and fuel adjustment factors in FARSITE for forecasting wildland fire propagation. Fire Saf. J. 2020, 116, 103167.
  43. Zhou, T.; Ding, L.; Ji, J.; Li, L.; Huang, W. Ensemble Transform Kalman Filter (ETKF) for large-scale wildland fire spread simulation using FARSITE tool and state estimation method. Fire Saf. J. 2019, 105, 95–106.
  44. Martins, L.; Almeida, R.V.; Maia, A.; Vieira, P. Analysing Fire Propagation Models: A Case Study on FARSITE for Prolonged Wildfires. Fire 2025, 8, 166.
  45. Achour, H.; Toujani, A.; Trabelsi, H.; Jaouadi, W. Evaluation and comparison of Sentinel-2 MSI, Landsat 8 OLI, and EFFIS data for forest fires mapping: Illustrations from the summer 2017 fires in Tunisia. Geocarto Int. 2021, 37, 7021–7040.
  46. Chen, Y.; van der Werf, G.R.; Giglio, L.; Randerson, J.T. Multi-decadal trends and variability in burned area from the fifth version of the Global Fire Emissions Database (GFED5). Earth Syst. Sci. Data 2023, 15, 5227–5259.
  47. Nur, A.; Smith, J.; Lee, H.; Kumar, P. Spatial prediction of wildfire susceptibility using hybrid machine learning models based on support vector regression in Sydney, Australia. Remote Sens. 2023, 15, 760.
  48. McNorton, J.R.; Johnston, J.; Cruz, M.G.; Brown, T.J.; Jiménez, E.; Tansey, K.; Wooster, M.J. A global probability-of-fire (POF) forecast. Geophys. Res. Lett. 2024, 51, e2023GL107929.
  49. Ma, J.; Wang, X.; Liu, Y.; Chen, Z. Real-time detection of wildfire risk caused by powerline vegetation faults using advanced machine learning techniques. Adv. Eng. Inform. 2020, 44, 101070.
  50. Abdollahi, A.; Pradhan, B. Explainable artificial intelligence (XAI) for interpreting the contributing factors feed into the wildfire susceptibility prediction model. Sci. Total Environ. 2023, 879, 163004.
  51. Hosseini, M.; Lim, S. Gene expression programming and machine learning methods for bushfire susceptibility mapping in New South Wales, Australia. Nat. Hazards 2021. Preprint.
  52. Sankaran, K.S.; Lim, S.-J.; Bhaskar, S.C. An automated prediction of remote sensing data of Queensland-Australia for flood and wildfire susceptibility using Bissoa-DBMLA scheme. Acta Geophys. 2022, 70, 3005–3021.
  53. Cruz, M.G.; Sullivan, A.L.; Gould, J.S.; Sims, N.C.; Bannister, A.J.; Hollis, J.J.; Hurley, R.J. Anatomy of a Catastrophic Wildfire: The Black Saturday Kilmore East Fire in Victoria, Australia. For. Ecol. Manag. 2012, 284, 269–285.
  54. Hersbach, H.; Bell, B.; Berrisford, P.; Hirahara, S.; Horányi, A.; Muñoz-Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Schepers, D.; et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020, 146, 1999–2049.
  55. Bradstock, R.A. A biogeographic model of fire regimes in Australia: Current and future implications. Glob. Ecol. Biogeogr. 2010, 19, 145–158.
  56. Atkins, J.W.; Fahey, R.T.; Hardiman, B.S.; Gough, C.M. Forest canopy structural complexity and light absorption relationships at the subcontinental scale. J. Geophys. Res. Biogeosci. 2018, 123, 1387–1405.
  57. Friedman, J.H.; Hastie, T.; Tibshirani, R. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2001.
  58. Giglio, L.; Schroeder, W.; Justice, C.O. The Collection 6 MODIS active fire detection algorithm and fire products. Remote Sens. Environ. 2016, 178, 31–41.
  59. Schroeder, W.; Oliva, P.; Giglio, L.; Csiszar, I.A. The New VIIRS 375 m active fire detection data product: Algorithm description and initial assessment. Remote Sens. Environ. 2014, 143, 85–96.
  60. Andela, N.; Morton, D.C.; Giglio, L.; Chen, Y.; van der Werf, G.R.; Kasibhatla, P.S.; Randerson, J.T. The Global Fire Atlas of individual fire size, duration, speed and direction. Earth Syst. Sci. Data 2019, 11, 529–552.
  61. Schafer, J.L.; Graham, J.W. Missing data: Our view of the state of the art. Psychol. Methods 2002, 7, 147–177.
  62. Bustillo Sánchez, M.; Ibañez, C.; Cisneros, A.; Bazán, A.; Delgado, S. Spatial assessment of wildfires susceptibility in Santa Cruz (Bolivia) using random forest. Geosciences 2021, 11, 224.
  63. Wu, X.; Li, J.; Zhang, Y.; Chen, M.; Wang, H.; Liu, Y. Machine learning for predicting forest fire occurrence in Changsha: An innovative investigation into the introduction of a forest fuel factor. Remote Sens. 2023, 15, 4208.
  64. Dong, H.; Silva, C.A.; Morais, M.; Camacho, P.; Fraga, H.; Oliveira, J.T. Wildfire prediction model based on spatial and temporal characteristics: A case study of a wildfire in Portugal’s Montesinho Natural Park. Sustainability 2022, 14, 10107.
  65. Li, Y.; Zhang, J.; Li, X.; Wang, T.; Chen, Y.; Gao, J.; Wang, Y. Risk factors and prediction of the probability of wildfire occurrence in the China–Mongolia–Russia cross-border area. Remote Sens. 2023, 15, 42.
  66. Rubí, J.N.; Gondim, P.R. A performance comparison of machine learning models for wildfire occurrence risk prediction in the Brazilian federal district region. Environ. Syst. Decis. 2024, 44, 351–368.
  67. Haydar, M.; Islam, M.N.; Rahman, M.T.; Hasan, M.K.; Chowdhury, M.A.; Hossain, M.S. Data driven forest fire susceptibility mapping in Bangladesh. Ecol. Indic. 2024, 166, 112264.
  68. Agrawal, N.; Nelson, P.V.; Low, R.D. A Novel Approach for Predicting Large Wildfires Using Machine Learning towards Environmental Justice via Environmental Remote Sensing and Atmospheric Reanalysis Data across the United States. Remote Sens. 2023, 15, 5501.
  69. Tonbul, H.; Yilmaz, E.O.; Kavzoglu, T. Comparative analysis of Deep Learning and machine learning models for burned area estimation using Sentinel-2 image: A case study in Muğla-Bodrum, Turkey. In Proceedings of the 2023 10th International Conference on Recent Advances in Air and Space Technologies (RAST), Istanbul, Turkey, 7–9 June 2023; pp. 1–5.
  70. Shmuel, A.; Heifetz, E. Global wildfire susceptibility mapping based on machine learning models. Forests 2022, 13, 1050. [Google Scholar] [CrossRef]
  71. Janizadeh, S.; Tran, T.T.K.; Bateni, S.M.; Jun, C.; Kim, D.; Trauernicht, C.; Heggy, E. Advancing the LIGHTGBM approach with three novel nature-inspired optimizers for predicting wildfire susceptibility in Kaua’i and Moloka’i Islands, Hawaii. Expert Syst. Appl. 2024, 258, 124963. [Google Scholar] [CrossRef]
  72. Sri Sowmya, V.; Sasikala, D.; Theetchenya, S. A comparative exploration of time series models for Wild fire prediction. In Proceedings of the 2024 Fourth International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), Bhilai, India, 11–12 January 2024; Volume 34, pp. 1–5. [Google Scholar] [CrossRef]
  73. Aldiansyah, S.; Madani, I. Spatial Model of Wildfire Susceptibility Using Machine Learning Approaches on Rawa Aopa Watumohai National Park, Indonesia. GeoScape 2024, 18, 1–20. [Google Scholar] [CrossRef]
  74. Trucchia, A.; Mari, G.; Carrion, D.; Batlles, F.J. Machine-Learning Applications in Geosciences: Comparison of Different Algorithms and Vegetation Classes’ Importance Ranking in Wildfire Susceptibility. Geosciences 2022, 12, 424. [Google Scholar] [CrossRef]
  75. Shahzad, F.; Abbas, Q.; Iqbal, M.F.; Shahzad, M.I. Comparing Machine Learning Algorithms to Predict Vegetation Fire Detections in Pakistan. Fire Ecol. 2024, 20, 1. [Google Scholar] [CrossRef]
  76. Ali, S.D.; Ridwan, I.; Septiana, M.; Fithria, A.; Rezekiah, A.A.; Rahmadi, A.; Asyari, M.; Rahman, H.; Syafarina, G.A. GeoAI for Disaster Mitigation: Fire Severity Prediction Models Using Sentinel-2 and ANN Regression. In Proceedings of the 2022 IEEE International Conference on Aerospace Electronics and Remote Sensing Technology (ICARES), Yogyakarta, Indonesia, 24–25 November 2022; Volume 43, pp. 1–7. [Google Scholar] [CrossRef]
  77. Sawetsuthipan, T.; Wongchaisuwat, P. Forecasting burned areas of wildfires: A case study of Mae Hong Son Province in Thailand. In Proceedings of the 2023 4th International Conference on Computers and Artificial Intelligence Technology (CAIT), Macau, Macao, 13–15 December 2023; Volume 9, pp. 52–56. [Google Scholar] [CrossRef]
  78. Qin, L.; Shao, W.; Du, G.; Mou, J.; Bi, R. Predictive modeling of wildfires in the United States. In Proceedings of the 2021 2nd International Conference on Computing and Data Science (CDS), Stanford, CA, USA, 28–29 January 2021; pp. 562–567. [Google Scholar] [CrossRef]
  79. Gao, B.; Shan, Y.; Liu, X.; Yin, S.; Yu, B.; Cui, C.; Cao, L. Prediction and driving factors of forest fire occurrence in Jilin province, China. J. For. Res. 2023, 35, 1. [Google Scholar] [CrossRef]
  80. Castrejon, D.J.; Wang, C.; Osmak, D.; Kukadiya, B.; Liu, L.; Giraldo, M.; Jiang, X. Machine learning-based California Wildfire Risk Prediction and Visualization. In Proceedings of the 2023 International Conference on Machine Learning and Applications (ICMLA), Jacksonville, FL, USA, 15–17 December 2023; pp. 1212–1217. [Google Scholar] [CrossRef]
  81. Zhao, L.; Ge, Y.; Guo, S.; Li, H.; Li, X.; Sun, L.; Chen, J. Forest fire susceptibility mapping based on precipitation-constrained cumulative dryness status information in southeast China: A novel machine learning modeling approach. For. Ecol. Manag. 2024, 558, 121771. [Google Scholar] [CrossRef]
  82. Gupta, Y.; Goyal, N.; Varghese, V.J.; Goyal, P. Utilizing MODIS fire mask for predicting forest fires using Landsat-9/8 and meteorological data. In Proceedings of the 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA), Thessaloniki, Greece, 9–13 October 2023; Volume 27, pp. 1–10. [Google Scholar] [CrossRef]
  83. Haji Zahari, M.I.H.B.P.; Karri, R.R.; Isa, M.H.; Zahran, E.S.M.M.; Nagendra, S.S. Soft computing techniques for prediction of forest fire occurrence in Brunei Darussalam. AIP Conf. Proc. 2023, 2643, 030023. [Google Scholar] [CrossRef]
  84. Laube, R.; Hamilton, H.J. Wildfire occurrence prediction using time series classification: A comparative study. In Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 15–18 December 2021; pp. 4178–4182. [Google Scholar] [CrossRef]
  85. Pham, K.; Nguyen, T.; Tran, Q.; Le, H.; Vu, D. California wildfire prediction using Machine Learning. In Proceedings of the 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), Nassau, Bahamas, 12–14 December 2022; Volume 16, pp. 525–530. [Google Scholar] [CrossRef]
  86. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar]
  87. Davis, K.T.; Dobrowski, S.Z.; Higuera, P.E.; Holden, Z.A.; Veblen, T.T.; Rother, M.T.; Parks, S.A.; Sala, A.; Maneta, M.P. Wildfires and climate change push low-elevation forests across a critical climate threshold for tree regeneration. Proc. Natl. Acad. Sci. USA 2019, 116, 6193–6198. [Google Scholar] [CrossRef]
  88. Partheepan, S.; Sanati, F.; Hassan, J. Bushfire severity modelling and predicting future trends in Australia using remote sensing and machine learning. Environ. Model. Softw. 2025, 188, 106377. [Google Scholar] [CrossRef]
  89. Yazici, K.; Taskin, A. A Comparative Bayesian Optimization-Based Machine Learning and Artificial Neural Networks Approach for Burned Area Prediction in Forest Fires: An Application in Turkey. Nat. Hazards 2023, 119, 1883–1912. [Google Scholar] [CrossRef]
  90. Kumar, V.; Sharma, P.; Yadav, R.; Rani, M. A Comparison of Machine Learning Models for Predicting Rainfall in Urban Metropolitan Cities. Sustainability 2023, 15, 13724. [Google Scholar] [CrossRef]
  91. Islam, M.R.; Najafi, M.R. Machine Learning-Based Regional Flood Frequency Framework for Climate Resilient Infrastructure. J. Hydrol. 2025, 661, 133703. [Google Scholar] [CrossRef]
Figure 1. The seven Australian regions.
Figure 2. Framework for wildfire prediction: data, models, and analysis.
Figure 3. Number of wildfire occurrences per region.
Figure 4. Best model for each Australian region for estimated FA.
Figure 5. Best model for each Australian region for mean estimated FBT.
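Table 1. Summary of dataset features used in wildfire prediction.
| Type | Indicator | Statistical Measures | Time Period | Frequency | Data Source |
|---|---|---|---|---|---|
| Climate Data | Precipitation [mm/day]; Relative Humidity [%]; Soil Water Content [m³]; Solar Radiation [MJ/day]; Temperature [°C]; Wind Speed [m/s] | Minimum, Maximum, Mean, Variance | 1 January 2005 to 23 January 2021 | Daily | ERA5 Hourly Reanalysis |
| Wildfire Data | Estimated FA | Sum | 4 January 2005 to 28 January 2021 | Daily | MODIS MCD14DL |
| Wildfire Data | Mean Estimated FBT | Mean | 4 January 2005 to 28 January 2021 | Daily | MODIS MCD14DL |
| Wildfire Data | Mean Estimated FRP | Mean | 4 January 2005 to 28 January 2021 | Daily | MODIS MCD14DL |
| Wildfire Data | Confidence | Std. Deviation, Variance | 4 January 2005 to 28 January 2021 | Daily | MODIS MCD14DL |
| Wildfire Data | Pixel Count | Count | 4 January 2005 to 28 January 2021 | Daily | MODIS MCD14DL |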
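The four climate statistics listed in Table 1 can be illustrated with a minimal sketch. It assumes a flat list of samples for one variable in one region on one day; Table 1 does not say whether the aggregation runs over hours, grid cells, or both, and the choice of population variance is likewise an assumption. The function name `summarize` and the sample values are hypothetical.

```python
from statistics import fmean, pvariance


def summarize(samples):
    """Return min, max, mean, and variance for one variable on one region-day."""
    values = [float(v) for v in samples]
    if not values:
        raise ValueError("need at least one sample")
    return {
        "min": min(values),
        "max": max(values),
        "mean": fmean(values),
        # Population variance is an assumption; the source does not specify
        # whether the population or sample estimator was used.
        "variance": pvariance(values),
    }


# Hypothetical temperature samples in °C (illustrative, not from the dataset).
temperature_c = [18.0, 21.0, 27.0, 30.0]
stats = summarize(temperature_c)
```

Swapping `pvariance` for `statistics.variance` gives the sample estimator if that turns out to match the dataset's definition.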
Table 2. Best model performance for Australia overall.
| Best Model | Wildfire Target | RMSE | R² Score |
|---|---|---|---|
| Lasso Regression | Estimated FA (km²) | 0.021053 | 0.084497 |
| Lasso Regression | Mean Estimated FBT (K) | 0.064149 | 0.284465 |
| Lasso Regression | Mean Estimated FRP (MW) | 0.017694 | 0.021653 |
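For readers less familiar with the two metrics reported in Table 2, the sketch below computes RMSE and R² from their standard definitions. It is not the authors' evaluation code; the function names and the toy values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Toy values (hypothetical, not taken from the study).
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 2.0, 2.5]

error = rmse(observed, predicted)
explained = r2_score(observed, predicted)
```

Read this way, an R² of 0.084497 for Estimated FA means the model accounts for roughly 8% of the variance in that target, while RMSE is expressed in the same units as the target as it was fed to the model.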
Table 3. Estimated FA prediction performance and environmental interpretation by region.
| Region | RMSE | R² | Environmental Interpretation |
|---|---|---|---|
| NSW | 0.00394 | 0.05249 | Captures sparse fuel and topographic variability via regularization |
| WA | 0.03549 | 0.15189 | Handles sparse vegetation and gentle terrain transitions |
| SA | 0.00773 | 0.12744 | Local microclimate and canopy-driven fire behavior captured by nearest neighbors |
| TA | 0.00476 | 0.14280 | Effective for fine-scale fire behavior in forested and mountainous areas |
| VI | 0.00596 | 0.20591 | Adapts well to complex terrain and urban–wildland interface dynamics |
| QL | 0.01859 | 0.51979 | Captures rich non-linear relations among climate, topography, and vegetation |
| NT | 0.02454 | 0.20750 | Handles high-variance seasonal regimes and heterogeneous vegetation transitions |
Table 4. Mean estimated FBT prediction performance and environmental interpretation by region.
| Region | RMSE | R² | Environmental Interpretation |
|---|---|---|---|
| NSW | 0.0747 | 0.41712 | Fuel heterogeneity, fragmented landscapes |
| WA | 0.06172 | 0.32574 | High solar radiation, wind variability |
| SA | 0.09551 | 0.34424 | Diurnal cycles, temperature spikes |
| TA | 0.10238 | 0.14451 | Moist forest cover, thermal signal disruption |
| VI | 0.08127 | 0.24826 | Complex topography, patchy cloud cover |
| QL | 0.05747 | 0.29785 | Tropical canopy, humidity, storms |
| NT | 0.06007 | 0.24618 | Monsoon seasonality, ignition from lightning, high variance |
Table 5. Mean estimated FRP prediction performance and environmental interpretation by region.
| Region | RMSE | R² | Environmental Interpretation |
|---|---|---|---|
| NSW | 0.02325 | 0.102909 | Coastal winds and terrain variations introduce noise, but model accuracy remains strong. High seasonal activity aids feature learning. |
| WA | 0.02295 | 0.122765 | Effective despite sparse vegetation and arid conditions. Performance aided by fewer clouds and clearer satellite readings. |
| SA | 0.03525 | 0.00668 | Complex terrain and patchy vegetation reduce variance explanation. Fires are wind-driven and sporadic, challenging for linear models. |
| TA | 0.04813 | 0.01558 | Cool, moist climate and fragmented vegetation complicate FRP prediction. Low solar intensity may reduce satellite fire detection fidelity. |
| VI | 0.051757 | 0.024745 | Complex terrain and volatile weather (e.g., wind shifts, dry lightning) introduce non-linear fire dynamics. Lasso captures baseline trends but misses finer interactions. |
| QL | 0.01347 | 0.13965 | High accuracy due to relatively stable vegetation types and strong thermal signatures from high biomass combustion. Warm, humid climate supports more predictable FRP patterns. |
| NT | 0.025449 | 0.065457 | Monsoonal wet–dry seasonality creates sharp contrasts in fire potential. Sparse features and vast open rangeland reduce model effectiveness for FRP variation. |
Table 6. Thresholds for classifying wildfire severity levels based on predicted FA, FBT, and FRP, adapted from operational and satellite detection standards.
| Parameter | No Danger | Low | Moderate | High | Extreme |
|---|---|---|---|---|---|
| FA (km²) | <0.1 | 0.1–0.5 | 0.5–2 | 2–10 | >10 |
| FBT (K) | <310 | 310–330 | 330–345 | 345–365 | >365 |
| FRP (MW) | <30 | 30–100 | 100–300 | 300–1500 | >1500 |
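The thresholds in Table 6 translate directly into a lookup, sketched below. The cut points come from the table; everything else is an assumption made for illustration: the names `LEVELS`, `THRESHOLDS`, and `classify` are hypothetical, inputs are assumed to be in the physical units shown in the table (km², K, MW), and a value that falls exactly on an interior boundary (for example FA = 0.5) is assigned to the higher class, since the table does not say which side such values belong to. The outer bounds follow the table literally: values below the first cut are "No Danger" and only values strictly above the last cut are "Extreme".

```python
from bisect import bisect_right

LEVELS = ["No Danger", "Low", "Moderate", "High", "Extreme"]

# Cut points copied from Table 6, in the table's units.
THRESHOLDS = {
    "FA": [0.1, 0.5, 2.0, 10.0],         # km²
    "FBT": [310.0, 330.0, 345.0, 365.0],  # K
    "FRP": [30.0, 100.0, 300.0, 1500.0],  # MW
}


def classify(parameter, value):
    """Map a predicted FA, FBT, or FRP value to a Table 6 severity level."""
    cuts = THRESHOLDS[parameter]
    # "Extreme" is strictly greater than the last cut (">10", ">365", ">1500").
    if value > cuts[-1]:
        return "Extreme"
    # Assumption: interior boundary values go to the higher class.
    return LEVELS[bisect_right(cuts[:-1], value)]


fa_level = classify("FA", 0.3)
fbt_level = classify("FBT", 350.0)
frp_level = classify("FRP", 2000.0)
```

If model outputs are scaled before training, they would need to be mapped back to physical units before this lookup is applied.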
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Abohaia, Z.; Elkhouly, A.; Barachi, M.E.; Al-Khatib, O. Regional Prediction of Fire Characteristics Using Machine Learning in Australia. Fire 2025, 8, 330. https://doi.org/10.3390/fire8080330