An Integrated MADQN–Heuristic Framework for Swarm Robotic Fire Detection and Extinguishing
Abstract
1. Introduction
1.1. Wildfire Challenges and the Role of Swarm Robotics
1.2. Algorithmic Paradigms in Swarm Robotics
1.2.1. Heuristic and Bio-Inspired Coordination Strategies
1.2.2. Reinforcement Learning and Multi-Agent Coordination
1.2.3. Hybrid and Integrated Control Frameworks
- Dual-pheromone heuristic design: A novel dual-heuristic mechanism was introduced, combining a fire-attraction pheromone for suppression guidance and an exploration pheromone for area coverage, enabling adaptive balance between exploration and exploitation.
- Reward function formulation for MADQN: A task-specific reward function was designed to integrate fire intensity, suppression efficiency, and resource optimization, effectively shaping agent behavior toward cooperative and sustainable firefighting actions.
- Algorithmic fusion of heuristic control and MADQN: The proposed HG-MADQN algorithm fuses heuristic swarm coordination with Multi-Agent Deep Q-Network learning, creating a hybrid decision-making framework capable of decentralized and adaptive fire suppression.
- Comparative performance analysis: A comprehensive experimental evaluation was conducted across four algorithms: Heuristic, Lévy, Reinforcement Learning (MADQN), and HG-MADQN, demonstrating the superior containment speed, spatial efficiency, and robustness of the proposed approach.
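
The dual-pheromone guidance and the heuristic/MADQN fusion named in the contributions above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the five-move action set, the function names `heuristic_scores` and `fused_action`, the additive fusion rule, and the mixing coefficient `beta` are hypothetical. Only the default weights (fire attraction 2.0, coverage repulsion 1.0) are taken from the parameter table.

```python
import numpy as np

# Hypothetical discrete action set: up, down, left, right, stay.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]


def heuristic_scores(pos, fire_pher, explore_pher, w_fire=2.0, w_cov=1.0):
    """Score each candidate move from two pheromone grids.

    Cells with fire pheromone attract the agent; cells carrying
    exploration pheromone (already covered) repel it.
    Moves that leave the grid get a score of -inf.
    """
    n_rows, n_cols = fire_pher.shape
    scores = np.full(len(MOVES), -np.inf)
    for k, (dr, dc) in enumerate(MOVES):
        r, c = pos[0] + dr, pos[1] + dc
        if 0 <= r < n_rows and 0 <= c < n_cols:
            scores[k] = w_fire * fire_pher[r, c] - w_cov * explore_pher[r, c]
    return scores


def fused_action(q_values, h_scores, beta=0.5):
    """Assumed additive fusion: argmax of Q + beta * normalized heuristic."""
    valid = np.isfinite(h_scores)
    h = np.where(valid, h_scores, 0.0)
    span = h[valid].max() - h[valid].min()
    # Rescale valid heuristic scores to [0, 1]; a flat field contributes nothing.
    h = (h - h[valid].min()) / span if span > 0 else np.zeros_like(h)
    combined = np.where(valid, np.asarray(q_values, dtype=float) + beta * h, -np.inf)
    return int(np.argmax(combined))


# Usage: a 5 x 5 patch with fire pheromone to the right of the agent
# and exploration pheromone on the agent's current cell.
fire = np.zeros((5, 5))
fire[2, 3] = 1.0
explored = np.zeros((5, 5))
explored[2, 2] = 1.0
scores = heuristic_scores((2, 2), fire, explored)
action = fused_action(np.zeros(5), scores)  # index into MOVES
```

With uninformative Q-values the heuristic term decides the move, while a sufficiently large Q-value gap overrides it; under this assumed rule, `beta` sets where that trade-off lies.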
2. Proposed Methodology
2.1. Overall Architecture and Environment
2.2. Heuristic Function for Fire-Guided Navigation
2.3. Multi-Agent Deep Q-Network (MADQN) Model
2.4. Fusion Between Heuristic and MADQN Policies
Algorithm 1: Fire detection and extinguishing framework
3. Experimental Setup and Results
3.1. Setup
3.2. Fire Propagation and Burned Area Analysis
3.3. Payload Consumption Analysis
3.4. Success Rate
3.5. Simulation Analysis of HG-MADQN Fire-Suppression Behavior
3.6. Impact of Swarm Size on Fire Suppression Performance
3.7. Pheromone Field Evolution and Fire Suppression Dynamics
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest

| Symbol/Variable | Description | Value |
|---|---|---|
| N × N | Grid size (environment dimensions) | 300 × 300 |
| | Number of drones (agents) | 50 |
| Batch size | Mini-batch size for training | 512 |
| Learning rate | Optimizer learning rate | 0.001 |
| | Coverage-repulsion weight | 1.0 |
| | Inter-agent separation weight | 10 |
| | Fire attraction weight | 2.0 |
| | Exploration bonus (new cells) | 0.05 |
| | Penalty for revisiting explored areas | −0.05 |
| γ | Discount factor (MADQN) | 0.98 |
| | Fire detection radius | 3 cells |
| | Fire suppression radius | 2 cells |
| | Reward for discovering new cells in the local 5 × 5 observation window | 0.1 per new cell |
| | Reward for detecting fire | 0.1 |
| | Penalty for proximity-based collisions | 0.2 |
| | Penalty for visited places | 0.001 per cell |
| | Penalty for proximity to map boundaries | 1 |
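
To make the reward-related rows of the table concrete, the following is a minimal sketch of how a per-step reward might be assembled from them. The additive combination, the subtraction of the penalty magnitudes, and the names `StepInfo` and `step_reward` are assumptions for illustration; the table gives only the coefficient values, and any suppression-related reward terms are omitted here because their values are not listed.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step observations for one agent."""
    new_cells: int        # unseen cells in the local 5 x 5 window
    visited_cells: int    # already-visited cells in the same window
    fire_detected: bool
    near_collision: bool  # another agent within the proximity threshold
    near_boundary: bool   # agent close to the edge of the map


def step_reward(info: StepInfo) -> float:
    """Sum the tabulated reward and penalty terms (assumed to be additive)."""
    reward = 0.1 * info.new_cells         # 0.1 per newly discovered cell
    reward -= 0.001 * info.visited_cells  # 0.001 per visited cell
    if info.fire_detected:
        reward += 0.1                     # reward for detecting fire
    if info.near_collision:
        reward -= 0.2                     # proximity-based collision penalty
    if info.near_boundary:
        reward -= 1.0                     # map-boundary proximity penalty
    return reward


# Usage: 5 new cells, 20 visited cells, fire in view, no hazards.
exploring = step_reward(StepInfo(5, 20, True, False, False))
# Usage: nothing new, fully visited window, near another agent and the boundary.
cornered = step_reward(StepInfo(0, 25, False, True, True))
```

Under these assumed signs, the boundary penalty outweighs every other single term, so an agent near the map edge receives a negative reward unless it uncovers many new cells in the same step.
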
| Interval | Heuristic | HG-MADQN | Lévy | RL |
|---|---|---|---|---|
| 0–100 | 24.9 | 22.7 | 23.9 | 13.8 |
| 100–200 | 625.1 | 202.8 | 343.3 | 261.9 |
| 200–300 | 1014.3 | 360.6 | 642.8 | 665.3 |
| 300–400 | 1116.4 | 468.3 | 675.2 | 996.1 |
| 400–500 | 754.1 | 466.8 | 626.5 | 1103.3 |
| 500–600 | 621.8 | 457.8 | 511.5 | 629.5 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Dutceac, A.; Vizitiu, C.I. An Integrated MADQN–Heuristic Framework for Swarm Robotic Fire Detection and Extinguishing. Robotics 2026, 15, 5. https://doi.org/10.3390/robotics15010005
