A Comparative Study of a Deep Reinforcement Learning Solution and Alternative Deep Learning Models for Wildfire Prediction

Vidal-Silva, Cristian; Pizarro, Roberto; Castillo-Soto, Miguel; Ingram, Ben; de la Fuente, Claudia; Duarte, Vannessa; Sangüesa, Claudia; Ibañez, Alfredo

doi:10.3390/app15073990


by Cristian Vidal-Silva ¹,*, Roberto Pizarro ²,³,⁴, Miguel Castillo-Soto ⁵, Ben Ingram ¹, Claudia de la Fuente ¹, Vannessa Duarte ⁶, Claudia Sangüesa ²,³ and Alfredo Ibañez ²,³

¹ Departamento de Visualización Interactiva y Realidad Virtual, Facultad de Ingeniería, Universidad de Talca, Talca 3460000, Chile
² Cátedra Unesco en Hidrología de Superficie, Universidad de Talca, Talca 3467769, Chile
³ Centro Nacional de Excelencia para la Industria de la Madera (CENAMAD)-ANID BASAL FB210015, Pontificia Universidad Católica de Chile, Santiago 7810128, Chile
⁴ Facultad de Ciencias Forestales y de la Conservación de la Naturaleza, Universidad de Chile, Santiago 8820808, Chile
⁵ Forest Fire Laboratory, University of Chile, Santiago 8820808, Chile
⁶ Escuela de Ciencias Empresariales, Universidad Católica del Norte, Larrondo 1280, Coquimbo 1781421, Chile
* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(7), 3990; https://doi.org/10.3390/app15073990

Submission received: 23 February 2025 / Revised: 24 March 2025 / Accepted: 28 March 2025 / Published: 4 April 2025


Abstract

Wildfires pose an escalating threat to ecosystems and human settlements, making accurate forecasting essential for early mitigation. This study compared three deep learning models for wildfire prediction: Deep Reinforcement Learning (DRL) with Actor–Critic architecture, Convolutional Neural Network (CNN), and Transformer-based models. The models were trained and evaluated using historical data from Chile (2000–2023), including wildfire occurrences, meteorological variables, topography, and vegetation indices. After preprocessing and class balancing, each model was tested over 100 experimental runs. All models achieved outstanding performance, with F1-Scores exceeding 0.999 and perfect AUC-ROC scores. The Transformer model showed a slight advantage over the CNN (99.94%) and Actor–Critic DRL (99.93%) in accuracy. Feature importance analysis identified wind speed, temperature, and vegetation indices as the most influential variables. While DRL offers theoretical benefits for adaptive decision-making, Transformer architectures more effectively capture spatiotemporal dependencies in wildfire dynamics. The findings can support the integration of deep learning models into early warning systems, contributing to proactive wildfire risk management. Future work will include validation with diverse regional datasets, real-time deployment, and collaboration with emergency response agencies.

Keywords:

wildfire prediction; deep reinforcement learning; actor–critic; machine learning; convolutional neural networks; transformer models; AI-driven risk assessment; fire behavior modeling

1. Introduction

Wildfires have become an increasingly prevalent global threat, exacerbated by climate change, rising temperatures, and prolonged drought periods [1]. The frequency and intensity of these fires have escalated in recent decades, resulting in severe ecological, economic, and human losses [2]. Countries in Latin America, particularly Chile, experience recurrent and devastating wildfire seasons, significantly affecting biodiversity, air quality, and local communities [3]. The expansion of human settlements into fire-prone areas has further increased the vulnerability of populations to wildfire hazards [4].

Accurate wildfire prediction is essential to mitigate their devastating effects, improve early warning systems, and optimize resource allocation for fire suppression [5]. Traditional wildfire prediction models rely on statistical methods and machine learning techniques such as logistic regression, decision trees, and gradient boosting [6]. Although these models provide valuable information, they struggle to capture the complex, nonlinear interactions between the environmental factors that influence wildfire spread [7]. Recent advances in deep learning, particularly Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, have shown promising results in wildfire risk assessment [8]. However, these models often require extensive labeled data and face challenges adapting to non-stationary environmental dynamics [9].

Deep Reinforcement Learning (DRL) has emerged as a powerful alternative, capable of learning optimal prediction strategies through interaction with simulated wildfire environments [10]. The Actor–Critic DRL framework, in particular, combines policy-based and value-based learning, enabling adaptive decision-making in dynamic conditions [11]. Given the high incidence and impact of wildfires in Chile and Latin America, this study explored the effectiveness of DRL-based wildfire prediction models compared to traditional deep learning approaches, assessing their predictive accuracy, adaptability, and computational feasibility. Recent studies have explored DRL for wildfire simulation and suppression, such as simulation-based approaches for dynamic fire behavior modeling [11,12]. These works support the potential of DRL for adaptive strategies in real-time decision-making, motivating its comparison against more traditional models in this study.

The primary objective of this research was to evaluate the efficacy of DRL-based wildfire prediction models, specifically focusing on the Actor–Critic architecture. The following specific objectives outline the work in this paper:

Develop and implement a DRL-based Actor–Critic model for wildfire prediction using historical environmental data.
Compare the performance of DRL models with traditional deep learning approaches, including CNNs and Transformer-based architectures.
Analyze the impact of different environmental features (temperature, wind speed, vegetation type, and topography) on wildfire occurrences using feature importance analysis.
Propose a practical framework for integrating DRL models into operational wildfire management systems, to enhance predictive accuracy and decision-making capabilities.

This study is particularly relevant in the Latin American and Chilean contexts, due to the high occurrence of wildfires and their impact in these regions. The extensive forested landscapes and changing climate patterns create conditions conducive to frequent and large-scale wildfires [13]. Chile’s government agencies and emergency response teams, such as the National Forestry Corporation (CONAF), require advanced predictive tools to improve wildfire preparedness and response strategies [14]. Through AI-driven wildfire prediction models, this research contributes to developing proactive fire management strategies, to minimize environmental destruction and human casualties.

Several major wildfire events illustrate the scale of the problem in Chile. According to the Centro de Estudios Públicos (CEP), the 2017 wildfires affected over 518,000 hectares across multiple regions [15]. The 2023 wildfires began in late January and impacted more than 430,000 hectares in central and southern Chile, destroying over 800 homes and causing 26 fatalities [16]. More recently, in 2024, devastating fires consumed approximately 43,000 hectares, mainly affecting Viña del Mar and Quilpué, and resulted in at least 131 deaths [17].

Despite the potential benefits, this study has several limitations. The precision of DRL-based wildfire prediction models depends on the availability and quality of historical wildfire data, which may be incomplete or inconsistent in certain regions [18]. Furthermore, computational constraints may pose challenges for real-time implementation in emergency response systems [19]. Future work should explore methods to improve data quality, enhance model interpretability, and assess the feasibility of deploying DRL-based models in real-world operational settings.

The remainder of this paper is structured as follows: Section 2 provides an overview of wildfires, their causes, and the role of artificial intelligence in prediction. Section 3 describes the dataset and key variables used in the study. Section 4 presents the implementation details of the three AI-based approaches. Section 5 discusses the experimental results and performance comparison, followed by the conclusions, limitations, and future research directions in Section 6.

2. Background

Wildfires are natural or human-induced fires in forests, grasslands, and other vegetation-covered areas. The leading causes of wildfires include lightning strikes, prolonged drought conditions, human activities such as agricultural clearing and arson, and climate change effects [20]. Rising global temperatures and reduced precipitation have significantly increased the frequency and intensity of wildfires in many regions worldwide [21].

The consequences of wildfires are multifaceted and devastating, leading to loss of biodiversity, destruction of property, increased air pollution, and health hazards due to smoke inhalation [22]. Additionally, economic damages from wildfires can reach billions of dollars annually, affecting local and national economies [23]. Table 1 summarizes the key impacts of wildfires on different aspects of society and the environment.

2.1. Machine Learning

Machine Learning (ML) is a class of algorithms that learn from data to make predictions or decisions without being explicitly programmed [24]. ML techniques have been widely applied in wildfire prediction, ranging from statistical methods to advanced artificial intelligence models [25]. Traditional ML models, such as decision trees, support vector machines, and logistic regression, have effectively assessed wildfire risk based on historical climate and environmental data.

Advantages of ML-based wildfire prediction include improved accuracy compared to rule-based systems and the ability to handle complex, high-dimensional data. However, significant challenges remain, such as the need for high-quality labeled data and difficulty modeling the non-linear interactions between fire-related variables [26]. Examples of ML in wildfire prediction include logistic regression models [25], decision trees and ensemble methods [6], and SVMs applied to environmental features [9].

Recent reviews have highlighted the effectiveness of ensemble learning, support vector classification, and probabilistic models in wildfire risk modeling [8,26].

2.2. Deep Learning

Deep Learning (DL) is a subset of ML that employs multi-layered neural networks to automatically extract features from raw data [27]. Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been successfully applied in wildfire prediction tasks [28]. CNNs effectively process satellite imagery to detect fire-prone regions, while RNNs capture temporal dependencies in wildfire occurrences, allowing for more robust predictive modeling. Figure 1 depicts the structure of a deep neural network with input, hidden, and output layers used for data modeling.

Table 2 overviews the different DL approaches used in wildfire prediction.

In recent years, CNN-LSTM and BiLSTM architectures have also been tested for the spatial and temporal modeling of fire spread and ignition [29,30].

2.3. Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) is an advanced AI approach that combines deep learning with reinforcement learning techniques to enable adaptive decision-making [31]. DRL models learn optimal strategies by interacting with an environment and maximizing cumulative rewards [32]. The Actor–Critic DRL framework, in particular, has shown promise for optimizing wildfire suppression and risk assessment strategies [12].

The advantages of DRL-based wildfire prediction include adaptability to changing environmental conditions and the ability to dynamically optimize decision-making. However, challenges such as high computational costs and extensive data requirements must be addressed for practical implementation [33].

Figure 2 [34] presents a Deep Reinforcement Learning (DRL) framework in which an agent continuously interacts with its environment. Guided by a deep neural network (DNN) that implements the policy $\pi_{\theta}(s, a)$, the agent selects actions based on the current state. As noted by [35], these actions modify the environment, supplying both a reward and an updated state. This cyclical feedback mechanism enables the agent to progressively refine its decision-making by optimizing the policy to maximize cumulative rewards.
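The interaction cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-state toy environment, the tabular logits `theta`, the discount factor `gamma`, and all function names are hypothetical and are not taken from the study's implementation, which uses a deep neural network as the policy.

```python
import math
import random

def softmax_policy(theta, state):
    """pi_theta(s, .): action probabilities from per-state logits (tabular stand-in for a DNN)."""
    logits = theta[state]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def step(state, action):
    """Hypothetical toy environment: reward 1 when the action matches the state, then flip the state."""
    reward = 1.0 if action == state else 0.0
    next_state = 1 - state
    return reward, next_state

def run_episode(theta, steps=10, gamma=0.99, seed=0):
    """Roll out the policy and accumulate the discounted cumulative reward."""
    rng = random.Random(seed)
    state, discounted_return = 0, 0.0
    for t in range(steps):
        probs = softmax_policy(theta, state)
        action = rng.choices([0, 1], weights=probs)[0]
        reward, state = step(state, action)
        discounted_return += (gamma ** t) * reward
    return discounted_return

# A policy whose logits strongly favor the rewarded action in each state.
theta = {0: [50.0, -50.0], 1: [-50.0, 50.0]}
episode_return = run_episode(theta)
```

In a full Actor–Critic setup, the return (or a critic's estimate of it) would drive gradient updates of the policy parameters; that learning step is omitted here.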

3. Dataset Overview

The dataset includes wildfire and environmental records from multiple administrative regions in central and southern Chile, including Maule, Ñuble, Biobío, and Valparaíso. This study used a comprehensive dataset of historical environmental and geographic data from wildfire events over the years 2000 to 2023. The dataset was partitioned into training (70%), validation (15%), and test (15%) sets using stratified sampling to maintain class distribution. The primary objective was to predict the probability of wildfire occurrence based on various environmental conditions, with a binary target variable that indicated wildfire events. Table 3 provides representative records demonstrating the range of ecological conditions captured.
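A stratified 70/15/15 partition of this kind can be sketched as follows. The function name, the seed, and the toy label vector are assumptions made for illustration; the study's actual partitioning code may differ.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)                       # randomize within each class
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]          # remainder goes to the test set
    return train, val, test

# Toy labels with 78% positives, loosely mirroring the imbalance of the raw data.
labels = [1] * 780 + [0] * 220
train_idx, val_idx, test_idx = stratified_split(labels)
```

Splitting each class separately is what keeps the positive rate the same in all three subsets.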

3.1. Feature Description

The complete dataset consists of approximately 210,240 daily records covering the period from 2000 to 2023, with an average of 8760 entries per year. These records represent both wildfire and non-wildfire conditions across selected Chilean regions. The dataset encompasses various environmental attributes crucial for wildfire modeling:

(i) Temperature (°C): Atmospheric heat measurements influencing fuel dryness and fire risk, with daily mean values ranging from 18.3 °C to 35.7 °C.
(ii) Relative Humidity (%): Air moisture levels, where lower values (below 40%) strongly correlate with higher fire susceptibility.
(iii) Wind Speed (m/s): A critical factor governing fire spread rate and direction, with values ranging from 0.5 to 12.8 m/s.
(iv) Elevation (m): Altitude of the affected region (100–1500 m), impacting fire behavior through atmospheric conditions and vegetation characteristics.
(v) Vegetation Type: Categorical feature with three primary classes (forests, shrublands, grasslands) encoded using one-hot encoding for model integration.
(vi) Urban–Rural Interface (URI): Binary indicator (Yes/No) of proximity to human settlements, crucial for the assessment of fire hazards and prioritization of management.

The dataset was compiled from multiple sources, including wildfire records from CONAF (Chilean National Forestry Corporation), meteorological variables from national weather stations, and satellite-derived vegetation indices (e.g., NDVI from MODIS). To ensure consistency, all data were standardized to a daily resolution and aligned geospatially. Biases due to data scale or measurement procedure were minimized through feature normalization and quality filtering.
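As an illustration of the preprocessing mentioned above, the sketch below one-hot encodes the vegetation type and standardizes a numeric column. It assumes z-score standardization, which is one common form of feature normalization; the source does not state which scheme was used, and the class labels and sample values here are hypothetical.

```python
from statistics import mean, pstdev

# Assumed spelling of the three vegetation classes named in the text.
VEGETATION_CLASSES = ("forest", "shrubland", "grassland")

def one_hot(vegetation):
    """Encode a vegetation class as a 3-element indicator vector."""
    return [1.0 if vegetation == c else 0.0 for c in VEGETATION_CLASSES]

def standardize(column):
    """Z-score a numeric column: zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(x - mu) / sigma for x in column]

temperatures = [18.3, 24.0, 29.5, 35.7]  # illustrative values within the reported range
z_temperatures = standardize(temperatures)
encoded = one_hot("shrubland")
```

In practice, the mean and standard deviation would be estimated on the training split only and then reused for the validation and test splits.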

3.2. Target Variable

The target variable represents the wildfire probability (P), signifying the likelihood of wildfires.

$P < 0.5$ : Lower probability, indicating minimal wildfire risk.
$P \geq 0.5$ : Higher probability, suggesting an increased likelihood of wildfire occurrence.

The dataset exhibited a considerable class imbalance, with positive cases (wildfire occurrences, $P \geq 0.5$) constituting 78.3% of the records, posing methodological challenges for predictive modeling and evaluation. Across the 24 years analyzed, annual wildfire occurrences ranged between 1200 and 6800 events. The most critical years were 2017 and 2023, which corresponded to national emergencies with extensive wildfire damage.

3.3. Addressing Dataset Imbalance

The following strategies were applied to improve the predictive power of the model:

Synthetic Data Generation: Creating artificial samples to simulate low-risk conditions where wildfires did not occur.
Probabilistic Labeling: Assigning probabilistic values to real and synthetic samples to reflect the spectrum of fire risk scenarios.
Feature Normalization: Standardizing numerical attributes to ensure uniformity across scales, facilitating model convergence.

To evaluate redundancy and potential multicollinearity, a Pearson correlation matrix was generated among all input features. No pair exceeded a correlation coefficient of 0.75, suggesting a sufficiently independent set of predictors. The correlation matrix is provided in Figure 3.
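The multicollinearity check can be reproduced in outline as below. The feature columns are made-up toy values, not the study's data; only the 0.75 threshold comes from the text.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def highly_correlated_pairs(features, threshold=0.75):
    """List feature pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Hypothetical toy columns for three of the input features.
features = {
    "temperature": [20.0, 25.0, 30.0, 35.0, 28.0],
    "humidity": [60.0, 35.0, 50.0, 30.0, 55.0],
    "wind_speed": [3.0, 9.0, 2.0, 6.0, 11.0],
}
flagged = highly_correlated_pairs(features)
```

An empty `flagged` list corresponds to the situation reported above, where no pair of predictors is considered redundant.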

The original dataset exhibited a pronounced class imbalance, with approximately 78% of the entries corresponding to wildfire-positive instances. This bias posed a risk of overfitting, particularly under high-capacity models. Synthetic data were generated to simulate low-risk conditions and enrich the feature space with negative cases for model training.

3.4. Optimal Modeling Approach

Binary cross-entropy loss is grounded in information theory. Based on Shannon’s entropy concept [39], the formulation also draws from Good’s minimum cross-entropy principle [40] and Jaynes’ maximum entropy framework [41], which allow probabilistic modeling of uncertainty under incomplete knowledge.

Wildfire prediction constitutes a probabilistic classification task, necessitating a binary cross-entropy loss function:

$$L_{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_{i} \log(\hat{y}_{i}) + (1 - y_{i}) \log(1 - \hat{y}_{i}) \right] \tag{1}$$

where $y_{i}$ represents the true label, and $\hat{y}_{i}$ is the predicted probability. This approach enabled the model to assign confidence scores to predictions, rather than make rigid binary decisions. Class-weighted loss functions were also implemented to address the remaining imbalance, with weights inversely proportional to class frequencies. Combining dataset balancing techniques and appropriate loss functions yielded improved generalization capabilities, enhancing the wildfire risk assessment and mitigation strategies.
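A minimal sketch of Equation (1) with class weights is given below. The specific weighting formula, `total / (2 * class_count)`, is an assumption: it is one common way to make weights inversely proportional to class frequency, and the source does not give the exact form. The class counts are the approximate per-run figures reported in Section 5.1; the predictions are toy values.

```python
from math import log

def class_weights(n_positive, n_negative):
    """Weights inversely proportional to class frequency (average weight per sample is 1)."""
    total = n_positive + n_negative
    return {1: total / (2 * n_positive), 0: total / (2 * n_negative)}

def weighted_bce(y_true, y_pred, weights, eps=1e-12):
    """Class-weighted binary cross-entropy averaged over the samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += weights[y] * -(y * log(p) + (1 - y) * log(1 - p))
    return total / len(y_true)

weights = class_weights(n_positive=164_450, n_negative=45_790)
loss = weighted_bce([1, 0, 1, 1], [0.9, 0.2, 0.6, 0.99], weights)
```

With these counts the minority (non-wildfire) class receives the larger weight, so errors on it contribute more to the loss. Setting both weights to 1 recovers the unweighted form of Equation (1).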

4. Materials and Methods

This section describes the implementation and configuration of the three wildfire prediction models: the Actor–Critic reinforcement learning model, the Convolutional Neural Network (CNN), and the Transformer-based model. Each method was designed to tackle different challenges in wildfire prediction, with their strengths and known limitations.

4.1. Actor–Critic Training for Wildfire Prediction

The Actor–Critic framework represents a hybrid reinforcement learning approach that enhances prediction accuracy through continuous environment–agent interaction [42]. This dual-network system employs a policy network (Actor) to determine actions and a value network (Critic) to evaluate state–action pairs. It is particularly suitable for dynamic wildfire forecasting scenarios, where environmental conditions evolve rapidly.

The specific training procedure for the Actor–Critic model is available in the Supplementary Material (Algorithm S1), where the agent iteratively refined its decision-making strategy through policy optimization and reward-based learning. Some advantages of this algorithm include its adaptability to dynamic wildfire conditions and its ability to optimize decision-making over time. However, known issues with this algorithm include its high computational cost and the need for extensive training data to converge efficiently.

The reward function was defined as a weighted binary cross-entropy loss, rewarding correct identification of wildfire occurrences and penalizing false negatives more heavily. Training was conducted over 100 episodes, with early stopping criteria based on validation loss. The Adam optimizer with learning rate scheduling was used for stability.
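One possible reading of this reward design is sketched below: the reward is taken as the negative of a binary cross-entropy term in which the wildfire-positive part is scaled up, so that missed fires cost more than false alarms. The sign convention, the `fn_weight` value of 2.0, and the function name are all assumptions; the actual definition is in the study's Algorithm S1.

```python
from math import log

def reward(y_true, p_fire, fn_weight=2.0, eps=1e-12):
    """Negative weighted BCE: higher is better, and missed fires are penalized more heavily."""
    p = min(max(p_fire, eps), 1.0 - eps)  # clip to avoid log(0)
    loss = -(fn_weight * y_true * log(p) + (1 - y_true) * log(1.0 - p))
    return -loss

missed_fire = reward(y_true=1, p_fire=0.1)   # fire occurred, model said 10%
false_alarm = reward(y_true=0, p_fire=0.9)   # no fire, model said 90%
confident_hit = reward(y_true=1, p_fire=0.99)
```

Under these assumptions, an equally confident mistake is twice as costly when it is a missed wildfire as when it is a false alarm.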

4.2. CNN Training for Wildfire Prediction

Convolutional Neural Networks (CNNs) are widely used in image-based wildfire prediction, because they capture spatial dependencies in data [43]. This model is particularly effective when dealing with satellite imagery and geospatial information.

The training procedure for the CNN model is available in the Supplementary Material (Algorithm S2), where the network learned hierarchical feature representations to optimize wildfire prediction accuracy. Some advantages of CNNs include their efficiency in feature extraction and their robustness when dealing with geospatial datasets. However, CNNs struggle to capture temporal dependencies, which limits their performance in predicting the progression of wildfires over time.
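To make the notion of spatial feature extraction concrete, the following sketch applies a single one-dimensional convolution filter followed by a ReLU activation. It is a deliberately reduced illustration: the kernel, the NDVI-like transect, and the function name are hypothetical, and the study's CNN (Algorithm S2) learns its filters rather than using a fixed one.

```python
def conv1d_relu(signal, kernel):
    """Valid-mode 1D cross-correlation followed by a ReLU activation."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        s = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, s))  # ReLU keeps only positive responses
    return out

# Hypothetical vegetation-index values along a spatial transect.
ndvi = [0.2, 0.3, 0.7, 0.8, 0.4]
# A fixed difference filter that responds to increases across neighbouring cells.
edge_kernel = [-1.0, 0.0, 1.0]
activations = conv1d_relu(ndvi, edge_kernel)
```

The filter responds where the index rises across neighbouring cells and stays at zero where it falls, which is the kind of local spatial pattern a convolutional layer can pick up.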

4.3. Transformer Training for Wildfire Prediction

Transformer models have revolutionized sequential data processing through self-attention mechanisms [44]. This architecture can effectively analyze temporal relationships in meteorological and vegetation data sequences for wildfire prediction.

The specific training procedure for the Transformer model is available in the Supplementary Material (Algorithm S3), where the self-attention mechanism enabled the model to capture long-term dependencies in wildfire-related features. Some advantages of Transformers include their ability to capture long-range dependencies in wildfire data and their flexibility in handling multiple input types. However, Transformers require high computational resources and large datasets, making them challenging to implement in real-time wildfire prediction systems.
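The self-attention operation at the core of the Transformer can be illustrated with a single head and no learned projections. This simplification (queries, keys, and values all equal to the input) and the toy sequence are assumptions for illustration; the study's model (Algorithm S3) would use learned projection matrices and multiple heads.

```python
from math import exp, sqrt

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    top = max(row)
    exps = [exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head scaled dot-product self-attention with identity projections (Q = K = V = x)."""
    d = len(x[0])
    scores = [[sum(q * k for q, k in zip(qi, kj)) / sqrt(d) for kj in x] for qi in x]
    weights = [softmax(row) for row in scores]
    output = [[sum(w * v[c] for w, v in zip(w_row, x)) for c in range(d)]
              for w_row in weights]
    return weights, output

# Hypothetical sequence of three time steps with two normalized features each.
sequence = [[0.2, 0.1], [0.8, 0.4], [0.5, 0.9]]
attn_weights, attended = self_attention(sequence)
```

Each output row is a weighted average of all time steps, which is how a position can draw on information from distant parts of the sequence.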

5. Results and Discussion

This section evaluates the three distinct deep learning architectures for wildfire prediction: the Convolutional Neural Network (CNN), Transformer-based model, and Actor–Critic Reinforcement Learning. Model performance was systematically assessed across 100 experimental runs using five established evaluation metrics: Accuracy, Precision, Recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). Table 4 summarizes the results, while Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 illustrate the comparative performance across experiments.

5.1. Accuracy Analysis

The F1-Score [45], used in this evaluation, is the harmonic mean between precision and recall. It provides a balanced measure of model performance, especially useful when the dataset is imbalanced, as in wildfire prediction tasks. All models demonstrated exceptional accuracy, with values consistently approaching 1.0 across experiments (Figure 4). The Transformer architecture achieved the highest mean accuracy (0.9995), followed by the CNN (0.9992) and Actor–Critic (0.9990). Statistical significance testing (paired t-test, $p < 0.05$) confirmed that the numerically small differences represented statistically significant performance variations.
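The paired t-test compares two models run by run. The sketch below computes only the test statistic and its degrees of freedom; the per-run scores are made-up toy values for two unnamed models, not results from this study.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """t statistic and degrees of freedom for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # mean difference over its standard error
    return t, n - 1

# Hypothetical per-run scores for two models evaluated on the same five partitions.
model_a = [0.90, 0.92, 0.91, 0.93, 0.94]
model_b = [0.88, 0.91, 0.90, 0.90, 0.92]
t_stat, dof = paired_t_statistic(model_a, model_b)
```

For 4 degrees of freedom the two-sided critical value at the 0.05 level is about 2.776, so a statistic above that would be judged significant. Because the test works on paired differences, very small but consistent gaps between models can still reach significance.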

Table 5 presents the confusion matrix results for each wildfire prediction model. True Positives (TP) represent correctly classified wildfire occurrences, while True Negatives (TN) correspond to correctly identified non-wildfire events. False Positives (FP) indicate cases where a wildfire was incorrectly predicted, and False Negatives (FN) denote missed wildfire events.
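The evaluation metrics follow directly from these four confusion-matrix counts, as the sketch below shows. The counts in the example are illustrative only and do not come from Table 5.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts: no false positives, two missed wildfire events.
metrics = classification_metrics(tp=90, tn=8, fp=0, fn=2)
```

With no false positives, precision stays at 1.0, while each missed event lowers recall and, through the harmonic mean, the F1-score.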

The results show that the CNN and Transformer models achieved a perfect classification, with no False Positives or False Negatives. The Actor–Critic model slightly underperformed, with one False Negative (FN), meaning it failed to detect one wildfire case. This slight misclassification could be due to the stochastic nature of reinforcement learning, which sometimes struggles with precise classification boundaries [46].

These findings indicate that deep learning models, particularly CNNs and Transformers, are highly reliable for wildfire prediction, achieving zero FP and FN rates across all experiments. Meanwhile, the Actor–Critic model remains a promising alternative, though it may require additional fine-tuning or hybrid approaches to reach the same level of precision. Future research could explore ways to enhance reinforcement learning techniques by integrating auxiliary loss functions or combining supervised learning with reinforcement mechanisms [47].

All performance metrics in Table 4 are the result of 100 independent experimental runs using randomized data partitions. Each run included approximately 164,450 wildfire-positive cases and 45,790 non-wildfire instances.

5.2. Validation Performance Analysis

To assess model generalization during training, performance metrics were monitored on both the validation and test sets across all 100 experimental runs. Table 6 presents the mean accuracy, F1-score, and AUC-ROC for the validation phase compared to the test results reported in Table 4. The results show minimal differences (≤0.0003), suggesting that the models did not overfit and were able to generalize well to unseen data.

These results confirm that the model selection process based on validation metrics was effective. No significant performance degradation was observed when transitioning from validation to testing, further reinforcing the reliability and robustness of the trained models.

5.3. Precision and Recall

All three models achieved a perfect precision (1.0000), indicating the absence of false positives (Figure 5). This is particularly valuable in wildfire prediction contexts, where false alarms could misallocate resources. Recall scores (Figure 6) demonstrated slight variations among models, with the Transformer architecture achieving the highest mean recall (0.9993), followed by the CNN (0.9989) and Actor–Critic (0.9987). These differences, while minimal, indicate the Transformer model’s superior ability to identify actual wildfire occurrences with fewer false negatives, a critical consideration for early warning systems.

5.4. F1-Score Analysis

The F1-scores, representing the harmonic mean of precision and recall, remained exceptionally high across all models (Figure 7). The Transformer model achieved the highest mean F1-score (0.9997), followed by the CNN (0.9994) and Actor–Critic (0.9993). The observed minor fluctuations across experiments (standard deviation ≤ 0.0003) indicate robust model performance across different data partitions. A one-way ANOVA with post hoc Tukey HSD test confirmed the statistical significance of these differences ($p < 0.01$).
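The F statistic behind a one-way ANOVA can be computed as in the sketch below. Only the omnibus F statistic is shown; the Tukey HSD post hoc step is omitted. The three groups are toy values chosen for easy arithmetic and are not the study's F1-scores.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three models, three runs each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
```

A large F value means the spread between group means is large relative to the spread within groups, which is why tight run-to-run variation can make small mean differences significant.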

5.5. AUC-ROC Performance

All models achieved perfect AUC-ROC scores (1.0000), indicating optimal discrimination between wildfire and non-wildfire instances (Figure 8). This metric confirms the models’ exceptional ability to rank positive instances higher than negative ones, regardless of the specific classification threshold selected.
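The ranking interpretation of AUC-ROC can be made explicit with a pairwise computation: the AUC equals the fraction of positive-negative pairs in which the positive instance receives the higher score, with ties counted as half. The labels and scores below are toy values for illustration.

```python
def auc_roc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

perfect = auc_roc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])    # every positive outranks every negative
imperfect = auc_roc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])  # one positive falls below a negative
```

No classification threshold appears anywhere in the computation, which is why the metric is threshold-independent. This quadratic pairwise form is meant for clarity; for large datasets a rank-based implementation is more practical.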

5.6. Feature Importance Analysis

To understand the environmental factors most critical for wildfire prediction, we employed a SHAP (Shapley Additive Explanations) analysis to interpret the contribution of each input feature to the model predictions. This method provided a robust approach for identifying which variables had the most significant impact on the likelihood of wildfire occurrence.
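The idea behind SHAP can be illustrated by computing exact Shapley values for a small scoring function. This is not the SHAP library and not the study's models: the additive `risk_score` function, its coefficients, and the feature names are hypothetical, and exact enumeration is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley value of each feature for a set-valued scoring function."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                coalition = set(subset)
                # Marginal contribution of f when added to this coalition.
                total += weight * (value_fn(coalition | {f}) - value_fn(coalition))
        phi[f] = total
    return phi

def risk_score(present):
    """Hypothetical risk model: additive effects plus a temperature-wind interaction."""
    score = 0.0
    if "temperature" in present:
        score += 0.30
    if "wind_speed" in present:
        score += 0.20
    if {"temperature", "wind_speed"} <= present:
        score += 0.10
    if "humidity" in present:
        score -= 0.15
    return score

phi = shapley_values(risk_score, ["temperature", "wind_speed", "humidity"])
```

The interaction term is split equally between temperature and wind speed, and the attributions sum to the difference between the full-model score and the empty baseline.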

Figure 9 presents the SHAP feature importance results for the three models (CNN, Transformer, and Actor–Critic). The most influential variables across all models were as follows:

Temperature (°C): Higher temperatures strongly correlated with increased wildfire risk.
Relative Humidity (%): Inversely related to wildfire probability, with lower humidity contributing to drier conditions.
Wind Speed (m/s): Strong winds facilitated fire spread, making this a crucial factor in predictions.
Vegetation Type: Certain vegetation categories (e.g., dense forests, shrublands) exhibited higher susceptibility to ignition.
Urban–Rural Interface (URI): Areas closer to human settlements had higher fire occurrences, likely due to anthropogenic factors.

The Transformer-based model exhibited the highest sensitivity to temporal variables, particularly wind speed variations over time. The CNN model, in contrast, showed a more significant reliance on spatial patterns, such as elevation and vegetation type. The Actor–Critic reinforcement learning approach dynamically adjusted its feature importance during training, indicating adaptability to varying wildfire conditions. These findings suggest that incorporating multi-source environmental data (e.g., meteorological patterns, land cover, and human activity) enhances the predictive performance of AI-driven wildfire models. Future improvements could explore ensemble methods combining CNN, Transformer, and reinforcement learning models to leverage their respective strengths. It is important to note that SHAP values may reflect biases from measurement instruments such as weather stations and satellite-derived indices. To mitigate this, the data preprocessing included normalization and smoothing filters.

5.7. Discussion

The experimental results demonstrate that all three deep learning models—the CNN, Transformer, and Actor–Critic DRL—performed exceptionally in the wildfire prediction tasks. Across 100 experimental runs, all models consistently achieved high accuracy, precision, recall, and F1-Scores (F1 > 0.999), with perfect AUC-ROC scores, suggesting strong discriminative capabilities.

Among the models, the Transformer architecture exhibited a slight advantage in generalization performance. This can be attributed to its multi-head self-attention mechanism, which effectively captures long-range temporal dependencies and spatial patterns across environmental variables. Such capabilities make Transformers particularly suitable for modeling wildfire dynamics and enhancing early warning systems. However, the CNN and Actor–Critic models also demonstrated competitive results and may be more practical in scenarios with limited computational resources, due to their lower complexity and faster inference times. These findings support the integration of AI-based prediction systems into operational wildfire management frameworks, especially in countries like Chile with a history of severe fire events. This approach is also applicable globally, particularly in regions increasingly affected by climate-driven wildfires such as Australia, the United States, and Southern Europe. In addition, the application of this methodology provides robust support for the delineation of occurrence zones [48], which have persisted and, in some cases, expanded over the last five years.

Despite the promising outcomes, several challenges remain. High performance scores raise the possibility of overfitting, although mitigation techniques such as stratified sampling, synthetic data generation, and class-balanced training were applied. Additionally, the performance was evaluated over 100 randomized runs, showing minimal variance—an indication of model robustness. Nonetheless, external validation across diverse regions and ecosystems is essential for assessing generalizability.

Future directions include the incorporation of real-time meteorological data, satellite imagery, and additional environmental variables to enhance prediction granularity and robustness. Moreover, lightweight architectures should be explored to facilitate deployment in resource-constrained environments. These enhancements will strengthen the operational viability of AI-driven wildfire early warning systems.

6. Conclusions

This study presented a comparative evaluation of three AI-based models for wildfire prediction: a Convolutional Neural Network, Transformer architecture, and Actor–Critic Deep Reinforcement Learning. Across 100 experimental runs, all models achieved excellent predictive performance, with accuracy, precision, recall, and F1-Scores approaching 1.0. The Transformer model demonstrated marginally superior generalization capabilities, supported by its attention mechanism for capturing spatiotemporal patterns. These results underscore the value of deep learning techniques in enhancing early warning systems and optimizing wildfire management strategies. In Chile and other fire-prone regions, such models can assist in proactive resource allocation and risk mitigation. Their global relevance is further reinforced by the increasing frequency and intensity of wildfires driven by climate change.

While the models performed robustly, several limitations should be addressed. Validation using geographically diverse datasets is necessary to ensure generalizability. Additionally, the computational demands of Transformer architectures may hinder real-time implementation in low-resource settings. Exploring lightweight alternatives, such as optimized CNNs or hybrid approaches, is recommended. Future research will focus on multi-modal data integration—including satellite imagery, real-time meteorological inputs, and land use data—to improve forecasting precision. Expanding model outputs to include fire spread and intensity predictions could also provide critical insights for emergency response planning.

In summary, this work establishes a foundation for integrating deep learning methods into operational wildfire prediction systems. Continued refinement and contextual adaptation will enhance their impact on sustainable environmental management and disaster preparedness worldwide.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app15073990/s1, Algorithm S1: Actor–Critic Training for Wildfire Prediction; Algorithm S2: CNN Training for Wildfire Prediction; Algorithm S3: Transformer Training for Wildfire Prediction.

Author Contributions

Conceptualization, R.P., M.C.-S., C.S. and A.I.; methodology, B.I.; software, C.V.-S. and C.d.l.F.; validation, V.D., B.I. and C.V.-S.; formal analysis, R.P., M.C.-S., C.S. and A.I.; resources, R.P., M.C.-S., C.S. and A.I.; data curation, C.V.-S.; writing—original draft preparation, C.V.-S.; writing—review and editing, M.C.-S.; visualization, C.S.; supervision, V.D.; project administration, R.P., M.C.-S. and C.S.; funding acquisition, A.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets and Python scripts are available at https://github.com/cvidalmsu/FONDEF-IDEA- (accessed on 30 March 2025).

Acknowledgments

We are very grateful to the Department of Forest Fire Protection of CONAF in the Maule Region, Chile, for providing the data and for being part of this project's team. The authors also gratefully acknowledge the support provided by the ANID BASAL Center FB210015 (CENAMAD).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Richardson, D.; Black, A.S.; Irving, D.; Matear, R.J.; Monselesan, D.P.; Risbey, J.S.; Squire, D.T.; Tozer, C.R. Global Increase in Wildfire Potential from Compound Fire Weather and Drought. Npj Clim. Atmos. Sci. 2022, 5, 23.
2. Duane, A.; Castellnou, M.; Brotons, L. Towards a Comprehensive Look at Global Drivers of Novel Extreme Wildfire Events. Clim. Change 2021, 165, 43.
3. Villagra, P.; Paula, S. Wildfire Management in Chile: Increasing Risks Call for More Resilient Communities. Environ. Sci. Policy Sustain. Dev. 2021, 63, 4–14.
4. Aguirre, P.; León, J.; González-Mathiesen, C.; Román, R.; Penas, M.; Ogueda, A. Modelling the vulnerability of urban settings to wildland–urban interface fires in Chile. Nat. Hazards Earth Syst. Sci. 2024, 24, 1521–1537.
5. Bhowmik, R.T.; Jung, Y.S.; Aguilera, J.A.; Prunicki, M.; Nadeau, K. A multi-modal wildfire prediction and early-warning system based on a novel machine learning framework. J. Environ. Manag. 2023, 341, 117908.
6. Bot, K.; Borges, J.G. A Systematic Review of Applications of Machine Learning Techniques for Wildfire Management Decision Support. Inventions 2022, 7, 15.
7. Singh, H.; Ang, L.M.; Lewis, T.; Paudyal, D.; Acuna, M.; Srivastava, P.K.; Srivastava, S.K. Trending and Emerging Prospects of Physics-Based and ML-Based Wildfire Spread Models: A Comprehensive Review. J. For. Res. 2024, 35, 135.
8. Marjani, M.; Mahdianpari, M.; Mohammadimanesh, F. CNN-BiLSTM: A Novel Deep Learning Model for Near-Real-Time Daily Wildfire Spread Prediction. Remote Sens. 2024, 16, 1467.
9. Carta, F.; Zidda, C.; Putzu, M.; Loru, D.; Anedda, M.; Giusto, D. Advancements in Forest Fire Prevention: A Comprehensive Survey. Sensors 2023, 23, 6635.
10. Gautam, M. Deep Reinforcement Learning for Resilient Power and Energy Systems: Progress, Prospects, and Future Avenues. Electricity 2023, 4, 336–380.
11. Tupayachi, J.; Ferguson, M.M.; Li, X. A Simulation-Based Real-Time Deep Reinforcement Learning Approach for Fighting Wildfires. In Proceedings of the 2024 Annual Modeling and Simulation Conference (ANNSIM), Washington, DC, USA, 20–23 May 2024; pp. 1–12.
12. Altamimi, A.; Lagoa, C.; Borges, J.G.; McDill, M.E.; Andriotis, C.P.; Papakonstantinou, K.G. Large-Scale Wildfire Mitigation Through Deep Reinforcement Learning. Front. For. Glob. Change 2022, 5, 734330.
13. McWethy, D.B.; Garreaud, R.D.; Holz, A.; Pederson, G.T. Broad-Scale Surface and Atmospheric Conditions during Large Fires in South-Central Chile. Fire 2021, 4, 28.
14. CONAF. Annual Report on Wildfire Management in Chile; CONAF Technical Reports; CONAF: Santiago, Chile, 2021.
15. Centro de Estudios Públicos (CEP). Informe Sobre Incendios Forestales en Chile; Centro de Estudios Públicos: Santiago, Chile, 2017.
16. Reuters. Wildfires in Chile raise great concern, says minister. Reuters, 19 February 2023.
17. Today, G. Forest fires kill 112 in Chile’s worst disaster since 2010 earthquake. The Japan Times, 5 February 2024.
18. Sivamayil, K.; Rajasekar, E.; Aljafari, B.; Nikolovski, S.; Vairavasundaram, S.; Vairavasundaram, I. A Systematic Study on Reinforcement Learning Based Applications. Energies 2023, 16, 1512.
19. Damaševičius, R.; Bacanin, N.; Misra, S. From Sensors to Safety: Internet of Emergency Services (IoES) for Emergency Response and Disaster Management. J. Sens. Actuator Netw. 2023, 12, 41.
20. Mansoor, S.; Farooq, I.; Kachroo, M.M.; Mahmoud, A.E.D.; Fawzy, M.; Popescu, S.M.; Alyemeni, M.; Sonne, C.; Rinklebe, J.; Ahmad, P. Elevation in wildfire frequencies with respect to the climate change. J. Environ. Manag. 2022, 301, 113769.
21. Jones, M.W.; Abatzoglou, J.T.; Veraverbeke, S.; Andela, N.; Lasslop, G.; Forkel, M.; Smith, A.J.P.; Burton, C.; Betts, R.A.; Werf, G.R.v.; et al. Global and regional trends and drivers of fire under climate change. Rev. Geophys. 2022, 60, e2020RG000726.
22. Jakhar, R.; Samek, L.; Styszko, K. A Comprehensive Study of the Impact of Waste Fires on the Environment and Health. Sustainability 2023, 15, 14241.
23. Tavor, T. Assessing the financial impacts of significant wildfires on US capital markets: Sectoral analysis. Empir. Econ. 2024, 67, 1115–1148.
24. Sarker, I.H. Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions. SN Comput. Sci. 2021, 2, 420.
25. Pérez-Porras, F.J.; Triviño-Tarradas, P.; Cima-Rodríguez, C.; Meroño-de Larriva, J.E.; García-Ferrer, A.; Mesas-Carrascosa, F.J. Machine Learning Methods and Synthetic Data Generation to Predict Large Wildfires. Sensors 2021, 21, 3694.
26. Mambile, C.; Kaijage, S.; Leo, J. Application of Deep Learning in Forest Fire Prediction: A Systematic Review. IEEE Access 2024, 12, 190554–190581.
27. Matsuo, Y.; LeCun, Y.; Sahani, M.; Precup, D.; Silver, D.; Sugiyama, M.; Uchibe, E.; Morimoto, J. Deep learning, reinforcement learning, and world models. Neural Netw. 2022, 152, 267–275.
28. Marjani, M.; Ahmadi, S.A.; Mahdianpari, M. FirePred: A hybrid multi-temporal convolutional neural network model for wildfire spread prediction. Ecol. Inform. 2023, 78, 102282.
29. Jiang, P.; Wang, L.; Liu, Y.; Zhang, X. Spatiotemporal Wildfire Prediction Using a Hybrid CNN-LSTM Deep Learning Model. Int. J. Wildl. Fire 2023, 32, 123–135.
30. Yun, S.; Lee, J.; Kim, S.H. BiLSTM-Based Wildfire Ignition Prediction Using Environmental Time Series Data. Appl. Sci. 2022, 12, 9475.
31. Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep Reinforcement Learning: A Brief Survey. IEEE Signal Process. Mag. 2017, 34, 26–38.
32. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.; Veness, J.; Bellemare, M.; Graves, A.; Riedmiller, M.; Fidjeland, A.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
33. Cai, Q.; Cui, C.; Xiong, Y.; Wang, W.; Xie, Z.; Zhang, M. A Survey on Deep Reinforcement Learning for Data Processing and Analytics. IEEE Trans. Knowl. Data Eng. 2023, 35, 4446–4465.
34. Liu, W.; Cai, J.; Chen, Q.C.; Wang, Y. DRL-R: Deep reinforcement learning approach for intelligent routing in software-defined data-center networks. J. Netw. Comput. Appl. 2021, 177, 102865.
35. Zhang, J.; Chang, C.; Zeng, X.; Li, L. Multi-Agent DRL-Based Lane Change With Right-of-Way Collaboration Awareness. IEEE Trans. Intell. Transp. Syst. 2023, 24, 854–869.
36. Corporación Nacional Forestal (CONAF). Estadísticas de Incendios Forestales en Chile: Temporadas 2000–2023; Corporación Nacional Forestal: Santiago, Chile, 2023.
37. Dirección Meteorológica de Chile. Datos meteorológicos históricos por Estación; Dirección Meteorológica de Chile: Santiago, Chile, 2023.
38. NASA LP DAAC. MOD13Q1 Version 6.1: MODIS/Terra Vegetation Indices 16-Day L3 Global 250m; NASA LP DAAC: Sioux Falls, SD, USA, 2022.
39. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
40. Good, I.J. Maximum Entropy for Hypothesis Formulation, Especially for Multidimensional Contingency Tables. Ann. Math. Stat. 1963, 34, 911–934.
41. Jaynes, E.T. Information Theory and Statistical Mechanics. Phys. Rev. 1957, 106, 620–630.
42. Mansfield, D.; Montazeri, A. A survey on autonomous environmental monitoring approaches: Towards unifying active sensing and reinforcement learning. Front. Robot. AI 2024, 11, 1336612.
43. Ghali, R.; Akhloufi, M.A. Deep Learning Approaches for Wildland Fires Using Satellite Remote Sensing Data: Detection, Mapping, and Prediction. Fire 2023, 6, 192.
44. Luo, Q.; Zeng, W.; Chen, M.; Peng, G.; Yuan, X.; Yin, Q. Self-Attention and Transformers: Driving the Evolution of Large Language Models. In Proceedings of the 2023 IEEE 6th International Conference on Electronic Information and Communication Technology (ICEICT), Qingdao, China, 21–24 July 2023; pp. 401–405.
45. Kumar, G.R.S.N.; Sankuri, R.S.; Karri, S.P.K. Multi Scale aided Deep Learning model for High F1-score classification of fundus images based Diabetic Retinopathy and Glaucoma. In Proceedings of the 2023 International Conference on Computer, Electronics & Electrical Engineering & their Applications (IC2E3), Srinagar Garhwal, India, 8–9 June 2023; pp. 1–6.
46. Li, S. Deep Reinforcement Learning. In Reinforcement Learning for Sequential Decision and Optimal Control; Springer: Singapore, 2023.
47. Mosavi, A.; Faghan, Y.; Ghamisi, P.; Duan, P.; Faizollahzadeh Ardabili, S.; Salwana, E.; Band, S.S. Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics. Mathematics 2020, 8, 1640.
48. Castillo, M.; Molina, J.-R.; Rodríguez y Silva, F.; García-Chevesich, P.; Garfias, R.A. System to evaluate fire impacts from simulated fire behavior in Mediterranean areas of Central Chile. Sci. Total Environ. 2017, 584–585, 1200–1209.

Figure 1. Application of Deep Learning techniques in wildfire prediction.

Figure 2. Deep Reinforcement Learning framework for wildfire prediction.

Figure 3. Correlation matrix among environmental variables used in the wildfire prediction models. Values represent Pearson correlation coefficients.

Figure 4. Accuracy comparison across 100 experiments for CNN, Transformer, and Actor–Critic models.

Figure 5. Precision comparison across 100 experiments. Note: The three curves are fully overlapped, as all three models achieved identical precision (1.0).

Figure 6. Recall comparison across 100 experiments.

Figure 7. F1-score comparison across 100 experiments.

Figure 8. AUC-ROC comparison across 100 experiments. Note: The three curves are fully overlapped, as all three models achieved identical AUC-ROC values (1.0).

Figure 9. SHAP feature importance analysis for wildfire prediction models. Higher absolute SHAP values indicate a greater influence on model decisions.

Table 1. Summary of wildfire impacts.

Impact Area	Consequences
Environmental	Deforestation, biodiversity loss, soil degradation, carbon emissions
Health	Respiratory diseases, fatalities, psychological stress in affected populations
Economic	Property loss, infrastructure damage, suppression costs
Ecosystem Services	Disruption of water cycles, loss of pollinators, habitat fragmentation

Table 2. Deep Learning techniques for wildfire prediction.

Method	Application	Advantage
CNN	Image-based fire detection	High spatial accuracy
RNN	Time-series fire prediction	Captures temporal trends
Transformer	Large-scale fire modeling	Efficient in long-term prediction

Table 3. Representative wildfire event records from the dataset (2000–2023). Data sourced from CONAF incident reports [36], meteorological data from the Dirección Meteorológica de Chile [37], and vegetation indices from MODIS NDVI products [38].

Year	Temperature (°C)	Humidity (%)	Wind (m/s)	Elevation (m)	Vegetation Type	URI	Probability
2000	28.5	40	3.2	500	Forest	Yes	0.8
2001	32.1	35	4.1	450	Shrubland	No	0.7
2002	30.0	45	2.8	600	Grassland	Yes	0.85
⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
2023	33.5	38	5.0	700	Forest	No	0.9

Table 4. Performance metrics of CNN, Transformer, and Actor–Critic models across 100 experiments.

Model	Accuracy	Precision	Recall	F1-Score	AUC-ROC
CNN	0.9992	1.0000	0.9989	0.9994	1.0000
Transformer	0.9995	1.0000	0.9993	0.9997	1.0000
Actor–Critic	0.9990	1.0000	0.9987	0.9993	1.0000

Table 5. Confusion matrix for each model.

Model	TN	FN	TP
CNN	88	0	198
Transformer	88	0	198
Actor–Critic	88	1	197
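As a worked check of how the reported metrics follow from confusion-matrix counts, the sketch below applies the standard definitions of accuracy, precision, recall, and F1-Score to the counts in Table 5. Table 5 does not list false positives, so FP = 0 is an assumption here, chosen to be consistent with the precision of 1.0 in Table 4. The function name `classification_metrics` is hypothetical, and because Table 5 appears to describe a single split, the resulting values need not equal the 100-run averages in Table 4.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# TN, FN, TP copied from Table 5.
table5 = {
    "CNN": {"tn": 88, "fn": 0, "tp": 198},
    "Transformer": {"tn": 88, "fn": 0, "tp": 198},
    "Actor-Critic": {"tn": 88, "fn": 1, "tp": 197},
}

results = {}
for model, counts in table5.items():
    # ASSUMPTION: fp=0, since Table 5 has no FP column.
    results[model] = classification_metrics(fp=0, **counts)
    m = results[model]
    print(f"{model}: acc={m['accuracy']:.4f} prec={m['precision']:.4f} "
          f"rec={m['recall']:.4f} f1={m['f1']:.4f}")
```

Under these assumptions, the single false negative of the Actor–Critic model is the only source of difference among the three models in this table.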

Table 6. Mean validation vs. test performance metrics across 100 experiments.

Model	Dataset	Accuracy	F1-Score	AUC-ROC
CNN	Validation	0.9991	0.9993	1.0000
CNN	Test	0.9992	0.9994	1.0000
Transformer	Validation	0.9994	0.9996	1.0000
Transformer	Test	0.9995	0.9997	1.0000
Actor–Critic	Validation	0.9989	0.9992	1.0000
Actor–Critic	Test	0.9990	0.9993	1.0000
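The comparison in Table 6 can be summarized as a validation-to-test gap per model. The sketch below computes that gap from the tabulated accuracy and F1-Score values; the dictionary layout and the helper name `validation_test_gap` are illustrative choices, not part of the authors' code.

```python
# (accuracy, F1-Score) pairs copied from Table 6.
table6 = {
    "CNN": {"validation": (0.9991, 0.9993), "test": (0.9992, 0.9994)},
    "Transformer": {"validation": (0.9994, 0.9996), "test": (0.9995, 0.9997)},
    "Actor-Critic": {"validation": (0.9989, 0.9992), "test": (0.9990, 0.9993)},
}


def validation_test_gap(entry):
    """Test minus validation for each metric, rounded to the table's 4 decimals."""
    return tuple(round(t - v, 4) for v, t in zip(entry["validation"], entry["test"]))


gaps = {model: validation_test_gap(entry) for model, entry in table6.items()}
for model, (acc_gap, f1_gap) in gaps.items():
    print(f"{model}: accuracy gap={acc_gap:+.4f}, F1 gap={f1_gap:+.4f}")
```

A gap this small suggests the models did not fit the validation split more closely than the test split. Both splits come from the same dataset, so this does not by itself address generalization to other regions.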

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
