# Multi-Target Optimization Strategy for Unmanned Aerial Vehicle Formation in Forest Fire Monitoring Based on Deep Q-Network Algorithm

## Abstract


## 1. Introduction

## 2. System Model

#### 2.1. Pure Azimuth Passive Localization

${\alpha}_{1}$, ${\alpha}_{2}$ and ${\alpha}_{3}$, respectively. Pure azimuth passive localization utilizes this angle information to determine the position of the receiving UAV.

#### 2.2. Passive Localization Model of UAV Based on Relationships among Triangle Sides and Angles

- (a) The passive receiving UAV knows which signal comes from which transmitting UAV.
- (b) The relative positions of the UAVs remain unchanged during the operation.
- (c) The positions of the transmitting UAVs have no bias.
- (d) The transmitting UAVs cannot receive signals simultaneously.
- (e) The transmitted signals are accurate and unaffected by external factors.
- (f) The UAVs are not affected by external interference during their flight.

${\alpha}_{1}$, the angle between UAV-0 and UAV-2 is ${\alpha}_{2}$, and the angle between UAV-1 and UAV-2 is ${\alpha}_{3}$. The geometric relationship between the passive receiving UAV and the three transmitting UAVs can be divided into the following two cases. (In practice, the positions of the four UAVs are unknown, and this analysis is conducted to derive a general localization model.)
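
The forward direction of this geometry can be sketched in a few lines: given known positions, compute the three angles that the receiving UAV would measure. This is only an illustration and not the paper's localization model. It assumes that ${\alpha}_{1}$, ${\alpha}_{2}$ and ${\alpha}_{3}$ are the angles subtended at the receiving UAV by the pairs (UAV-0, UAV-1), (UAV-0, UAV-2) and (UAV-1, UAV-2); the function names and the example positions (a circle of radius 100 with the receiver at 80°) are hypothetical.

```python
import math

def subtended_angle(receiver, p, q):
    """Unsigned angle in degrees at `receiver` between the directions to p and q."""
    ax, ay = p[0] - receiver[0], p[1] - receiver[1]
    bx, by = q[0] - receiver[0], q[1] - receiver[1]
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return math.degrees(abs(math.atan2(cross, dot)))

def bearing_angles(receiver, uav0, uav1, uav2):
    """Return (alpha1, alpha2, alpha3) as seen from the receiving UAV (assumed convention)."""
    alpha1 = subtended_angle(receiver, uav0, uav1)
    alpha2 = subtended_angle(receiver, uav0, uav2)
    alpha3 = subtended_angle(receiver, uav1, uav2)
    return alpha1, alpha2, alpha3

def on_circle(radius, angle_deg):
    """Cartesian point on a circle centered at the origin."""
    a = math.radians(angle_deg)
    return (radius * math.cos(a), radius * math.sin(a))

# Hypothetical layout: UAV-0 at the center, UAV-1 at 0 deg, UAV-2 at 40 deg,
# and the receiving UAV at 80 deg on the same circle.
uav0 = (0.0, 0.0)
uav1 = on_circle(100.0, 0.0)
uav2 = on_circle(100.0, 40.0)
receiver = on_circle(100.0, 80.0)

print(bearing_angles(receiver, uav0, uav1, uav2))  # approximately (50.0, 70.0, 20.0)
```

In this layout the angles satisfy ${\alpha}_{2}-{\alpha}_{1}={\alpha}_{3}$, which is one of the two relations named for case 2.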

#### 2.3. Simulation Verification of Circular Formation

## 3. Problem Formulation

## 4. Algorithm

#### 4.1. The Mechanism of the DQN Algorithm

#### 4.2. Algorithm Framework

**Algorithm 1:** Deep Q-learning (DQN)

- Initialize replay memory $D$ to capacity $N$
- Initialize action-value function $Q$ with random weights $\theta$
- Initialize target action-value function $\hat{Q}$ with weights $\theta^{-} = \theta$
- **for** episode $= 1, M$ **do**
  - Initialize sequence $s_{1} = \{x_{1}\}$ and preprocessed sequence $\phi_{1} = \phi(s_{1})$
  - **for** $t = 1, u$ **do**
    - With probability $\varepsilon$ select a random action $a_{t}$ (the action refers to selecting which UAV is to be the signal-emitting UAV); otherwise select $a_{t} = \arg\max_{a} Q(\phi(s_{t}), a; \theta)$
    - Execute action $a_{t}$ in the emulator and observe reward $r_{t}$ and image $x_{t+1}$. (The reward $r_{t}$ represents the discrepancy between the adjusted position and the ideal position. Since the goal of DRL is to maximize the reward and the objective of this task is to minimize the discrepancy, the reward is defined as the negative of the discrepancy: reward = −discrepancy.)
    - Set $s_{t+1} = s_{t}, a_{t}, x_{t+1}$ and preprocess $\phi_{t+1} = \phi(s_{t+1})$
    - Store transition $(\phi_{t}, a_{t}, r_{t}, \phi_{t+1})$ in $D$
    - Sample a random minibatch of transitions $(\phi_{j}, a_{j}, r_{j}, \phi_{j+1})$ from $D$
    - Set $y_{j} = r_{j}$ for terminal $\phi_{j+1}$, and $y_{j} = r_{j} + \gamma \max_{a'} \hat{Q}(\phi_{j+1}, a'; \theta^{-})$ for non-terminal $\phi_{j+1}$
    - Perform a gradient descent step on $(y_{j} - Q(\phi_{j}, a_{j}; \theta))^{2}$ with respect to the network parameters $\theta$
    - Every $C$ steps reset $\hat{Q} = Q$
  - **end for**
- **end for**
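
Three building blocks of Algorithm 1 can be sketched in plain Python: the replay memory, the ε-greedy action choice, and the target value $y_j$. This is a minimal sketch under stated assumptions and not the authors' implementation. The names `ReplayMemory`, `select_action` and `td_target` are illustrative, Q-values are passed in as plain lists instead of being produced by a neural network, and the default discount factor 0.95 and capacity 10,000 are taken from the parameter table.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-capacity store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatch without replacement.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise take the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

def td_target(reward, next_q_target, done, gamma=0.95):
    """y_j = r_j for terminal transitions, else r_j + gamma * max_a' Q_hat(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)

# Usage: the reward is the negative discrepancy, as defined in Algorithm 1.
memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.push(state=step, action=0, reward=-1.0, next_state=step + 1, done=False)
print(len(memory))                          # 3: only the newest transitions are kept
print(select_action([0.1, 0.9, 0.3], 0.0))  # 1: greedy choice when epsilon is 0
```

A full agent would additionally need the Q-network, the gradient step on $(y_j - Q)^2$, and the periodic copy of the weights into the target network.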

## 5. Experiment

#### 5.1. Experimental Background and Environment

#### 5.2. Setting of Environmental Parameters

#### 5.3. Experimental Result Analysis

## 6. Conclusions

- (a) Based on the equivalence relationship between the three sides and angles, the distances between the three emitting UAVs and the receiving UAV can be solved. Then, taking each of the three emitting UAVs as a center, circles are drawn with the distance to the receiving UAV as the radius. The intersection points of the three circles give the candidate positions of the receiving UAV. Since the system of three quadratic equations has multiple suitable real solutions, the point closest to the ideal position is selected as the position of the receiving UAV.
- (b) If UAVs 1–9 are required to be evenly arranged on a circle of radius 100 m centered at UAV 0, then, because only UAV 1 has an unbiased position, it is not possible to place all UAVs at exactly unbiased positions within a limited number of adjustments. Therefore, the optimization goal is to minimize both the deviation from the ideal position and the number of adjustments. A mathematical programming model is established whose decision variables are the emitting UAVs selected for each adjustment, and whose objectives are to minimize the sum of squared errors between the final adjusted UAV coordinates and the ideal position coordinates, as well as the number of adjustments.
- (c) Due to environmental constraints and other factors, all experiments conducted in this study are simulations. Therefore, the influence of external factors on the drones is not considered.
- (d) The DQN procedure first generates many instances to train the DQN algorithm. The trained DQN model is then tested on given examples, which demonstrates the effectiveness of the algorithm and the model. For the same example, the results of the DQN algorithm are consistent.
- (e) In the 5th adjustment round, the DQN algorithm yielded a sum of squared errors between the actual and ideal positions of $9.985 \times 10^{-8}$, indicating a negligible deviation between the actual and ideal positions at this stage.
- (f) The testing time of the DQN algorithm is 2.7 s, while that of the genetic algorithm is 127.9 s. The DQN algorithm has a significantly lower testing time than the genetic algorithm, making it more responsive to the rapid nature of drone operations in monitoring and extinguishing forest residual fires.
- (g) In the 5th adjustment round, the DQN algorithm's model tended to stabilize after the 20th iteration, indicating convergence of the results.
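
The circle-intersection step in conclusion (a) can be sketched as follows. This is a simplified illustration and not the paper's solver: it assumes the three distances are already known (the paper derives them from the measured angles), intersects the first two circles to obtain candidates, keeps those that also lie on the third circle, and returns the candidate closest to the ideal position. The function names and the tolerance `tol` are hypothetical.

```python
import math

def circle_intersections(c0, r0, c1, r1):
    """Return the intersection points of two circles (empty list if none)."""
    (x0, y0), (x1, y1) = c0, c1
    d = math.hypot(x1 - x0, y1 - y0)
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    a = (r0 ** 2 - r1 ** 2 + d ** 2) / (2 * d)   # distance from c0 to the chord
    h = math.sqrt(max(r0 ** 2 - a ** 2, 0.0))    # half-length of the chord
    mx = x0 + a * (x1 - x0) / d
    my = y0 + a * (y1 - y0) / d
    ox = h * (y1 - y0) / d
    oy = h * (x1 - x0) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]

def locate(centers, radii, ideal, tol=1e-6):
    """Pick the candidate on all three circles that is closest to `ideal`."""
    candidates = circle_intersections(centers[0], radii[0], centers[1], radii[1])
    consistent = [p for p in candidates
                  if abs(math.dist(p, centers[2]) - radii[2]) <= tol]
    pool = consistent or candidates
    if not pool:
        return None
    return min(pool, key=lambda p: math.dist(p, ideal))

# Emitters: UAV 0 at the center, UAV 1 at 0 deg and UAV 2 at 40 deg on radius 100.
centers = [(0.0, 0.0), (100.0, 0.0),
           (100 * math.cos(math.radians(40)), 100 * math.sin(math.radians(40)))]
true_position = (8.637, 96.572)   # actual coordinates of UAV 03 from the table
radii = [math.dist(true_position, c) for c in centers]
estimate = locate(centers, radii, ideal=(17.3, 98.4))
print(estimate)  # close to (8.637, 96.572)
```

Using only two circles leaves a mirror-image ambiguity across the line through their centers; the third circle or the distance to the ideal position resolves it.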

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest


**Figure 2.** Two scenarios of case 1: (**a**) UAV with two non-central emitting UAVs on the same side; (**b**) UAV with two non-central emitting UAVs not on the same side.

**Figure 3.** Two scenarios of case 2: (**a**) Scenario of ${\alpha}_{1}-{\alpha}_{2}={\alpha}_{3}$; (**b**) Scenario of ${\alpha}_{2}-{\alpha}_{1}={\alpha}_{3}$.

**Figure 5.** Representation of real coordinates and estimated coordinates in the simulation. \* represents the position coordinates of the signal-transmitting UAV. \* represents the position coordinates of the signal-receiving UAV.

Number | Actual Coordinates | Estimated Coordinates |
---|---|---|
UAV 03 | (8.637, 96.572) | (8.637, 96.572) |
UAV 04 | (−51.033, 83.919) | (−51.033, 83.919) |
UAV 05 | (−88.699, 36.76) | (−88.699, 36.76) |
UAV 06 | (−88.53, −25.545) | (−88.53, −25.545) |
UAV 07 | (−40.545, −92.762) | (−40.545, −92.762) |
UAV 08 | (10.142, −94.555) | (10.142, −94.555) |
UAV 09 | (68.481, −63.771) | (68.481, −63.771) |

Parameter | Parameter Configuration |
---|---|
Learning rate | 0.001 |
Batch size | 512 |
Discount factor | 0.95 |
Maximum capacity of the experience replay | 10,000 |
Exploration factor | Exploration factor |
Optimizer | Adam |

UAV Number | Initial Position in Polar Coordinates (m, °) |
---|---|
0 | (0, 0) |
1 | (100, 0) |
2 | (98, 40.10) |
3 | (112, 80.21) |
4 | (105, 119.75) |
5 | (98, 159.86) |
6 | (112, 119.96) |
7 | (105, 240.07) |
8 | (98, 280.17) |
9 | (112, 320.28) |

UAV Number | Ideal Position in Cartesian Coordinates (m, m) |
---|---|
0 | (0, 0) |
1 | (100, 0) |
2 | (76.6, 64.2) |
3 | (17.3, 98.4) |
4 | (−50, 86.6) |
5 | (−93.9, 34.2) |
6 | (−93.9, −34.2) |
7 | (−50, −86.6) |
8 | (17.3, −98.4) |
9 | (76.6, −64.2) |
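
The ideal positions listed above can be regenerated from the formation geometry. The sketch below assumes nine UAVs evenly spaced at 40° intervals on a circle of radius 100 m centered at UAV 0, with UAV 1 at 0°, and returns Cartesian $(x, y)$ pairs; the function name `ideal_positions` is illustrative. The tabulated values agree with these coordinates to within 0.1 m and appear to be truncated to one decimal place.

```python
import math

def ideal_positions(n=9, radius=100.0):
    """Cartesian ideal positions of UAVs 1..n, evenly spaced on a circle."""
    step = 360.0 / n
    positions = {}
    for k in range(1, n + 1):
        angle = math.radians(step * (k - 1))
        positions[k] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions

# Values copied from the ideal-position table for comparison.
tabulated = {
    1: (100, 0), 2: (76.6, 64.2), 3: (17.3, 98.4),
    4: (-50, 86.6), 5: (-93.9, 34.2), 6: (-93.9, -34.2),
    7: (-50, -86.6), 8: (17.3, -98.4), 9: (76.6, -64.2),
}

computed = ideal_positions()
for k in sorted(computed):
    x, y = computed[k]
    print(f"UAV {k}: ({x:.3f}, {y:.3f})  table: {tabulated[k]}")
```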

Experimental Environment | Parameter Configuration |
---|---|
CPU | Intel Core i7-12700K |
GPU | NVIDIA GeForce RTX 3060 |
RAM | 32 GB |
Hard disk | 1 TB |
Programming environment | Python 3.9 |
PyTorch version | 1.11.0 |

Algorithm | CPU Time |
---|---|
DQN | 2.7 s |
GA | 127.9 s |

Adjustment Round | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
Transmitting UAVs | [0, 1, 2] | [0, 1, 3] | [0, 1, 5] | [0, 1, 8] | [0, 1, 6] |
Adjusted position of UAV 1 | (100, 0) | (100, 0) | (100, 0) | (100, 0) | (100, 0) |
Adjusted position of UAV 2 | (74.9, 63.1) | (74.9, 63.1) | (76.4, 64.3) | (76.4, 64.3) | (76.6, 64.2) |
Adjusted position of UAV 3 | (12.9, 95.6) | (12.9, 95.6) | (17.1, 98.3) | (17.1, 98.3) | (17.3, 98.4) |
Adjusted position of UAV 4 | (−49.9, 82.5) | (−49.9, 83.3) | (−49.9, 86.2) | (−49.9, 86.7) | (−50, 86.6) |
Adjusted position of UAV 5 | (−92.0, 33.7) | (−93.6, 34.0) | (−93.6, 34.0) | (−94.0, 34.2) | (−93.9, 34.2) |
Adjusted position of UAV 6 | (−93.5, −33.9) | (−93.5, −33.9) | (−93.7, −34.0) | (−93.9, −34.2) | (−93.9, −34.2) |
Adjusted position of UAV 7 | (−52.3, −90.9) | (−49.8, −91.7) | (−49.9, −86.3) | (−50, −86.5) | (−50, −86.6) |
Adjusted position of UAV 8 | (17.3, −96.4) | (17.3, −96.4) | (17.2, −98.4) | (17.2, −98.4) | (17.3, −98.4) |
Adjusted position of UAV 9 | (86.1, −71.5) | (86.1, −71.5) | (76.6, −64.2) | (76.6, −64.2) | (76.6, −64.2) |
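
The sum-of-squared-errors objective can be evaluated directly on the tabulated positions. The sketch below copies the adjusted positions of the nine circumference UAVs for each round and the tabulated ideal positions, then computes the sum of squared coordinate errors per round. Because the table is rounded to one decimal place, the result for round 5 collapses to zero and cannot reproduce the reported value of 9.985 × 10^−8; the function name `sum_squared_error` is illustrative.

```python
ideal = [(100, 0), (76.6, 64.2), (17.3, 98.4), (-50, 86.6), (-93.9, 34.2),
         (-93.9, -34.2), (-50, -86.6), (17.3, -98.4), (76.6, -64.2)]

# Adjusted positions of the nine circumference UAVs after each round (from the table).
rounds = {
    1: [(100, 0), (74.9, 63.1), (12.9, 95.6), (-49.9, 82.5), (-92.0, 33.7),
        (-93.5, -33.9), (-52.3, -90.9), (17.3, -96.4), (86.1, -71.5)],
    2: [(100, 0), (74.9, 63.1), (12.9, 95.6), (-49.9, 83.3), (-93.6, 34.0),
        (-93.5, -33.9), (-49.8, -91.7), (17.3, -96.4), (86.1, -71.5)],
    3: [(100, 0), (76.4, 64.3), (17.1, 98.3), (-49.9, 86.2), (-93.6, 34.0),
        (-93.7, -34.0), (-49.9, -86.3), (17.2, -98.4), (76.6, -64.2)],
    4: [(100, 0), (76.4, 64.3), (17.1, 98.3), (-49.9, 86.7), (-94.0, 34.2),
        (-93.9, -34.2), (-50, -86.5), (17.2, -98.4), (76.6, -64.2)],
    5: [(100, 0), (76.6, 64.2), (17.3, 98.4), (-50, 86.6), (-93.9, 34.2),
        (-93.9, -34.2), (-50, -86.6), (17.3, -98.4), (76.6, -64.2)],
}

def sum_squared_error(positions, targets):
    """Sum over UAVs of the squared x and y deviations from the targets."""
    return sum((px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(positions, targets))

sse = {r: sum_squared_error(pos, ideal) for r, pos in rounds.items()}
for r in sorted(sse):
    print(f"round {r}: SSE = {sse[r]:.2f}")
```

On these rounded values the error decreases with every round, with the largest drop between rounds 2 and 3.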


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Liu, W.; Lyu, S.-K.; Liu, T.; Wu, Y.-T.; Qin, Z.
Multi-Target Optimization Strategy for Unmanned Aerial Vehicle Formation in Forest Fire Monitoring Based on Deep Q-Network Algorithm. *Drones* **2024**, *8*, 201.
https://doi.org/10.3390/drones8050201
