Decision-Making Strategies for Close-Range Air Combat Based on Reinforcement Learning with Variable-Scale Actions
Abstract
1. Introduction
2. Close-Range Air Combat Model
2.1. Aircraft Model
2.2. Situation Assessment
2.3. Decision-Making Strategy
3. Virtual Lag Points Based on Trajectory Prediction
3.1. Trajectory Prediction Method
3.1.1. Method and Structure
3.1.2. Conventional Neural Network Prediction
3.1.3. Error Prediction
3.1.4. Correction and Judgment
3.2. Prediction Results
4. Strategy Training
4.1. Markov Decision Model
4.1.1. States
4.1.2. Actions and State Transition Functions
4.1.3. Reward
4.2. Training Algorithm
Algorithm 1: PPO.
for iteration = 1, 2, … do
    for actor = 1, 2, …, N do
        Run policy πθold in the environment for T timesteps
        Compute advantage estimates Â1, …, ÂT
    end for
    Optimize surrogate L wrt θ, with K epochs and minibatch size M < NT
    θold ← θ
end for
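The two computational steps of Algorithm 1, the advantage estimates and the surrogate objective, can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the outline does not say which advantage estimator or surrogate form is used, so the sketch assumes generalized advantage estimation (GAE) and the clipped PPO surrogate. The function names and the default values of `gamma`, `lam`, and `clip_eps` are hypothetical.

```python
import numpy as np


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Advantage estimates for one T-step rollout (assumed estimator: GAE).

    rewards, values: length-T sequences collected under the old policy.
    last_value: value estimate of the state reached after the final step.
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.append(np.asarray(values, dtype=float), last_value)
    advantages = np.zeros_like(rewards)
    running = 0.0
    # Accumulate discounted TD residuals backwards in time.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective (assumed form of L), to be maximized."""
    ratio = np.exp(np.asarray(logp_new, dtype=float)
                   - np.asarray(logp_old, dtype=float))
    adv = np.asarray(advantages, dtype=float)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # The element-wise minimum removes the incentive to move the
    # probability ratio outside [1 - clip_eps, 1 + clip_eps].
    return float(np.mean(np.minimum(unclipped, clipped)))


# Usage on a toy three-step rollout with made-up numbers.
adv = gae_advantages([1.0, 0.5, -0.2], [0.3, 0.2, 0.1], last_value=0.0)
objective = clipped_surrogate([-1.0, -0.9, -1.2], [-1.1, -1.0, -1.0], adv)
```

In a full training loop, each of the N actors would produce one such rollout, and the objective would be maximized over K epochs of minibatches drawn from the pooled NT samples before θold is overwritten.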
5. Simulation and Analysis
5.1. Typical Scenes
5.1.1. Blue Aircraft in Advantage
5.1.2. Blue Aircraft in Disadvantage
5.1.3. Two Aircraft in Parallel Flight
5.1.4. Two Aircraft in Opposite Flight
5.2. Monte Carlo Mathematical Simulation
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
| Scenario | xb (km) | yb (km) | zb (km) | Vb (m/s) | γb (°) | χb (°) | xr (km) | yr (km) | zr (km) | Vr (m/s) | γr (°) | χr (°) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | −2 | 0 | −6 | 200 | 0 | 0 | 2 | 0 | −6 | 200 | 0 | 0 |
| 2 | 2 | 0 | −6 | 200 | 0 | 0 | −2 | 0 | −6 | 200 | 0 | 0 |
| 3 | 0 | 2 | −6 | 200 | 0 | 0 | 0 | −2 | −6 | 200 | 0 | 0 |
| 4 | 0 | 2 | −6 | 200 | 0 | −90 | 0 | −2 | −6 | 200 | 0 | 90 |
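The four initial conditions in the table can be encoded as a small configuration structure, sketched below in Python. Several points are assumptions rather than statements from the source: the subscripts b and r are read as the blue and red aircraft, γ as the flight-path angle, and χ as the heading (track) angle. The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class InitialState:
    """Initial condition of one aircraft, using the table's columns and units."""
    x_km: float
    y_km: float
    z_km: float
    speed_mps: float
    gamma_deg: float  # assumed: flight-path angle
    chi_deg: float    # assumed: heading (track) angle


# Scenario number -> (blue, red); values copied from the table.
SCENARIOS = {
    1: (InitialState(-2, 0, -6, 200, 0, 0), InitialState(2, 0, -6, 200, 0, 0)),
    2: (InitialState(2, 0, -6, 200, 0, 0), InitialState(-2, 0, -6, 200, 0, 0)),
    3: (InitialState(0, 2, -6, 200, 0, 0), InitialState(0, -2, -6, 200, 0, 0)),
    4: (InitialState(0, 2, -6, 200, 0, -90), InitialState(0, -2, -6, 200, 0, 90)),
}


def initial_separation_km(blue: InitialState, red: InitialState) -> float:
    """Straight-line distance between the two aircraft at the start."""
    return math.dist(
        (blue.x_km, blue.y_km, blue.z_km),
        (red.x_km, red.y_km, red.z_km),
    )


for number, (blue, red) in SCENARIOS.items():
    print(number, initial_separation_km(blue, red))
```

Keeping the scenarios in one mapping makes it straightforward to loop over them in a simulation driver. With the values in the table, every scenario starts with both aircraft at the same z coordinate and the same speed, so the cases differ only in relative position and heading.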
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, L.; Wang, J.; Liu, H.; Yue, T. Decision-Making Strategies for Close-Range Air Combat Based on Reinforcement Learning with Variable-Scale Actions. Aerospace 2023, 10, 401. https://doi.org/10.3390/aerospace10050401