A Multi-Population Mean-Field Game Approach for Large-Scale Agents Cooperative Attack-Defense Evolution in High-Dimensional Environments
Abstract
1. Introduction
- We formulate the cooperative attack–defense evolution of large-scale agents in high-dimensional environments as a multi-population high-dimensional stochastic mean-field game (MPHD-MFG), which significantly reduces the communication frequency and computational complexity.
- We propose ECA-Net, an extended nonlinear coupled alternating neural network composed of multiple generators and multiple discriminators. Exploiting the variational primal–dual structure underlying MFGs, we solve the MPHD-MFG tractably with the ECA-Net algorithm.
- We carry out an integrated experiment in which the convergence of the Hamilton–Jacobi–Bellman equation’s residual errors shows the fast convergence of our cooperative attack–defense evolution algorithm. The experiment also shows that a large number of drones can avoid obstacles and smoothly evolve their attack and defense behaviors while minimizing their energy consumption. The comparison with the baseline methods shows that our approach handles a large scale of UAVs in the most complex scene considered, a 3D obstacle scene.
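To make the role of the Hamilton–Jacobi–Bellman residual as a convergence indicator concrete, the following Python sketch evaluates it on a toy problem. Everything in it is an assumption made for illustration and is not the MPHD-MFG system of this paper: a single population in one spatial dimension, the Hamiltonian H(p) = p²/2, no interaction term, the sign convention −∂ₜφ − ν∂ₓₓφ + H(∂ₓφ) = 0 with terminal cost x²/2, and the hypothetical names `phi_exact`, `phi_perturbed`, `hjb_residual`, and `mean_abs_residual`. For this toy problem, φ(x, t) = x²/(2(1 + T − t)) + ν ln(1 + T − t) is an exact solution, so its residual should vanish up to finite-difference error, whereas a perturbed candidate leaves a visibly larger residual.

```python
import math
import random

NU = 0.1  # assumed diffusion coefficient (illustrative value)
T = 1.0   # assumed terminal time (illustrative value)


def phi_exact(x, t):
    """Exact solution of the toy HJB equation with terminal cost x^2 / 2."""
    a = 1.0 / (1.0 + T - t)
    return 0.5 * a * x * x + NU * math.log(1.0 + T - t)


def phi_perturbed(x, t):
    """A deliberately wrong candidate: the exact solution plus a smooth error."""
    return phi_exact(x, t) + 0.1 * math.sin(x) * (T - t)


def hjb_residual(phi, x, t, h=1e-3):
    """Residual -phi_t - NU * phi_xx + 0.5 * phi_x^2 via central differences."""
    phi_t = (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)
    phi_x = (phi(x + h, t) - phi(x - h, t)) / (2.0 * h)
    phi_xx = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / (h * h)
    return -phi_t - NU * phi_xx + 0.5 * phi_x ** 2


def mean_abs_residual(phi, n=2000, seed=0):
    """Average absolute residual over randomly sampled (x, t) points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        t = rng.uniform(0.1, 0.9)
        total += abs(hjb_residual(phi, x, t))
    return total / n


if __name__ == "__main__":
    print("exact solution residual:    ", mean_abs_residual(phi_exact))
    print("perturbed candidate residual:", mean_abs_residual(phi_perturbed))
```

In a learning-based solver, the same quantity would be evaluated on the current network approximation of φ after each training iteration, and a residual that decays toward zero indicates that the approximation is approaching a solution of the equation.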
2. Modeling and Formulating
2.1. Kinematics Equation
2.2. Objective Function
2.2.1. Blue-Side Control Problem
2.2.2. Red Side Control Problem
2.3. Multi-Population High-Dimensional Mean-Field Game
3. GAN-Based Approach for MPHD-MFG
3.1. Variational Primal–Dual Structure of MPHD-MFG
3.2. ECA-Net for Cooperative Attack–Defense Evolution
Algorithm 1 ECA-Net for cooperative attack–defense evolution.
Require: diffusion parameter, terminal cost, Hamiltonian, interaction term, .
Require: Initialize neural networks and , batch size B.
Require: Set and as in (21).
while not converged do
  train :
    Sample batch , , where , and .
    , for .
    Backpropagate total loss to weights.
  train
    Sample batch , , where , and .
    Backpropagate total loss to weights.
end while
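The alternating structure of Algorithm 1, in which one group of networks is updated while the other is held fixed, can be illustrated on a toy saddle-point problem. The Python sketch below is not ECA-Net: it assumes a scalar `omega` standing in for discriminator weights and a scalar `theta` standing in for generator weights, a hypothetical objective L(θ, ω) = θ²/2 + θω − ω²/2 that is minimized in θ and maximized in ω, and exact gradients in place of sampled batches and backpropagation. The function names are likewise hypothetical.

```python
def toy_gradients(theta, omega):
    """Gradients of the assumed objective L = 0.5*theta^2 + theta*omega - 0.5*omega^2."""
    d_theta = theta + omega
    d_omega = theta - omega
    return d_theta, d_omega


def alternating_updates(theta=1.0, omega=-1.0, lr=0.1, steps=500):
    """Alternate one ascent step on omega with one descent step on theta."""
    history = []
    for _ in range(steps):
        # "Train the discriminator": gradient ascent on omega, theta frozen.
        _, d_omega = toy_gradients(theta, omega)
        omega += lr * d_omega
        # "Train the generator": gradient descent on theta, using the updated omega.
        d_theta, _ = toy_gradients(theta, omega)
        theta -= lr * d_theta
        history.append((theta, omega))
    return theta, omega, history


if __name__ == "__main__":
    theta, omega, _ = alternating_updates()
    print("theta:", theta, "omega:", omega)
```

For this objective, the unique saddle point is (θ, ω) = (0, 0), and with the step size chosen here the alternating iteration contracts toward it. In the full algorithm, each scalar update would be replaced by a batch-sampled loss evaluation followed by backpropagation to the corresponding network weights.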
4. Simulation Results
4.1. Experimental Setup
4.2. Convergence Analysis
4.3. Performance Analysis
4.4. Comparison with Baselines
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
MFG | Mean-field game |
3D | Three-dimensional |
UAV | Unmanned aerial vehicle |
GANs | Generative adversarial neural networks |
ECA-Net | Extended coupled alternating neural network |
HJB | Hamilton–Jacobi–Bellman (partial differential equation) |
FPK | Fokker–Planck (equation) |
Appendix A. The 3D Renderings of Numerical Results and More Experiments
Appendix A.1. 3D Run Diagram (Figure A1) Corresponding to Figure 5
Appendix A.2. Asymmetric Case Study
References
- Yu, C.; Zhang, M.; Ren, F.; Tan, G. Multiagent Learning of Coordination in Loosely Coupled Multiagent Systems. IEEE Trans. Cybern. 2015, 45, 2853–2867.
- Yang, Y.; Luo, R.; Li, M.; Zhou, M.; Zhang, W.; Wang, J. Mean Field Multi-Agent Reinforcement Learning. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 5571–5580.
- Chen, C.; Mo, L.; Zheng, D. Cooperative attack–defense game of multiple UAVs with asymmetric maneuverability. Acta Aeronaut. Astronaut. Sin. 2020, 41, 324152.
- Huang, L.; Fu, M.; Qu, H.; Wang, S.; Hu, S. A deep reinforcement learning-based method applied for solving multi-agent defense and attack problems. Expert Syst. Appl. 2021, 176, 114896.
- Huang, H.; Zhang, W.; Ding, J.; Stipanovic, D.M.; Tomlin, C.J. Guaranteed decentralized pursuit-evasion in the plane with multiple pursuers. In Proceedings of the IEEE Conference on Decision and Control and European Control Conference, Orlando, FL, USA, 12–15 December 2011.
- Zha, W.; Chen, J.; Peng, Z.; Gu, D. Construction of Barrier in a Fishing Game With Point Capture. IEEE Trans. Cybern. 2017, 47, 1409–1422.
- Liu, Y.; Qi, N.; Tang, Z. Linear Quadratic Differential Game Strategies with Two-pursuit Versus Single-evader. Chin. J. Aeronaut. 2012, 25, 896–905.
- Wan, K.; Wu, D.; Zhai, Y.; Li, B.; Gao, X.; Hu, Z. An Improved Approach towards Multi-Agent Pursuit–Evasion Game Decision-Making Using Deep Reinforcement Learning. Entropy 2021, 23, 1433.
- Garcia, E.; Casbeer, D.W.; Moll, A.V.; Pachter, M. Multiple Pursuer Multiple Evader Differential Games. IEEE Trans. Autom. Control 2021, 66, 2345–2350.
- Earl, M.; D’Andrea, R. Modeling and control of a multi-agent system using mixed integer linear programming. In Proceedings of the 41st IEEE Conference on Decision and Control, Las Vegas, NV, USA, 10–13 December 2002.
- Earl, M.; D’Andrea, R. A study in cooperative control: The RoboFlag drill. In Proceedings of the 2002 American Control Conference (IEEE Cat. No.CH37301), Anchorage, AK, USA, 8–10 May 2002.
- Bakolas, E.; Tsiotras, P. Optimal pursuit of moving targets using dynamic Voronoi diagrams. In Proceedings of the 49th IEEE Conference on Decision and Control (CDC), Atlanta, GA, USA, 15–17 December 2010.
- Isaacs, R. Differential Games; Wiley: Hoboken, NJ, USA, 1967.
- Lasry, J.M.; Lions, P.L. Jeux à champ moyen. I–Le cas stationnaire. Comptes Rendus Math. 2006, 343, 619–625.
- Lasry, J.M.; Lions, P.L. Jeux à champ moyen. II–Horizon fini et contrôle optimal. Comptes Rendus Math. 2006, 343, 679–684.
- Lasry, J.M.; Lions, P.L. Mean field games. Jpn. J. Math. 2007, 2, 229–260.
- Huang, M.; Caines, P.; Malhame, R. Individual and mass behaviour in large population stochastic wireless power control problems: Centralized and nash equilibrium solutions. In Proceedings of the 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475), Maui, HI, USA, 9–12 December 2003.
- Caines, P.E.; Huang, M.; Malhamé, R.P. Large population stochastic dynamic games: Closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 2006, 6, 221–252.
- Huang, M.; Caines, P.E.; Malhame, R.P. Large-Population Cost-Coupled LQG Problems With Nonuniform Agents: Individual-Mass Behavior and Decentralized $\varepsilon$-Nash Equilibria. IEEE Trans. Autom. Control 2007, 52, 1560–1571.
- Gomes, D.; Saúde, J. A mean-field game approach to price formation in electricity markets. arXiv 2018, arXiv:1807.07088.
- Kizilkale, A.C.; Salhab, R.; Malhamé, R.P. An integral control formulation of mean field game based large scale coordination of loads in smart grids. Automatica 2019, 100, 312–322.
- Paola, A.D.; Trovato, V.; Angeli, D.; Strbac, G. A Mean Field Game Approach for Distributed Control of Thermostatic Loads Acting in Simultaneous Energy-Frequency Response Markets. IEEE Trans. Smart Grid 2019, 10, 5987–5999.
- Lachapelle, A.; Wolfram, M.T. On a mean field game approach modeling congestion and aversion in pedestrian crowds. Transp. Res. Part B Methodol. 2011, 45, 1572–1589.
- Burger, M.; Francesco, M.D.; Markowich, P.A.; Wolfram, M.T. Mean field games with nonlinear mobilities in pedestrian dynamics. Discret. Contin. Dyn. Syst.-B 2014, 19, 1311–1333.
- Aurell, A.; Djehiche, B. Mean-Field Type Modeling of Nonlocal Crowd Aversion in Pedestrian Crowd Dynamics. SIAM J. Control Optim. 2018, 56, 434–455.
- Achdou, Y.; Lasry, J.M. Mean Field Games for Modeling Crowd Motion. In Computational Methods in Applied Sciences; Springer International Publishing: Cham, Switzerland, 2018; pp. 17–42.
- Liu, Z.; Wu, B.; Lin, H. A Mean Field Game Approach to Swarming Robots Control. In Proceedings of the IEEE 2018 Annual American Control Conference (ACC), Milwaukee, WI, USA, 27–29 June 2018.
- Elamvazhuthi, K.; Berman, S. Mean-field models in swarm robotics: A survey. Bioinspir. Biomimet. 2019, 15, 015001.
- Lee, W.; Liu, S.; Tembine, H.; Li, W.; Osher, S. Controlling Propagation of Epidemics via Mean-Field Control. SIAM J. Appl. Math. 2021, 81, 190–207.
- Chang, S.L.; Piraveenan, M.; Pattison, P.; Prokopenko, M. Game theoretic modelling of infectious disease dynamics and intervention methods: A review. J. Biol. Dyn. 2020, 14, 57–89.
- E, W.; Han, J.; Li, Q. A mean-field optimal control formulation of deep learning. Res. Math. Sci. 2018, 6, 10.
- Guo, X.; Hu, A.; Xu, R.; Zhang, J. Learning Mean-Field Games. In Advances in Neural Information Processing Systems; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32.
- Carmona, R.; Laurière, M.; Tan, Z. Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods. arXiv 2019, arXiv:1910.04295.
- Guéant, O.; Lasry, J.M.; Lions, P.L. Mean Field Games and Applications. In Paris-Princeton Lectures on Mathematical Finance 2010; Springer: Berlin/Heidelberg, Germany, 2011; pp. 205–266.
- Lin, A.T.; Fung, S.W.; Li, W.; Nurbekyan, L.; Osher, S.J. Alternating the population and control neural networks to solve high-dimensional stochastic mean-field games. Proc. Natl. Acad. Sci. USA 2021, 118, e2024713118.
- Wang, G.; Yao, W.; Zhang, X.; Niu, Z. Coupled Alternating Neural Networks for Solving Multi-Population High-Dimensional Mean-Field Games with Stochasticity. TechRxiv Preprint 2022.
- Wang, G.; Zhang, X.; Yao, W.; Ren, L. Cooperative attack–defense evolution of large-scale agents. In Proceedings of the ACM Genetic and Evolutionary Computation Conference Companion, Boston, MA, USA, 9–13 July 2022.
- Chang, K.; Xia, Y.; Huang, K. UAV formation control design with obstacle avoidance in dynamic three-dimensional environment. SpringerPlus 2016, 5, 1124.
- Onken, D.; Nurbekyan, L.; Li, X.; Fung, S.W.; Osher, S.; Ruthotto, L. A Neural Network Approach for High-Dimensional Optimal Control Applied to Multiagent Path Finding. IEEE Trans. Control. Syst. Technol. 2022, 1–17.
- Schulte, J.M. Adjoint Methods for Hamilton–Jacobi–Bellman Equations. Ph.D. Thesis, University of Münster, Münster, Germany, 2010.
- Fujii, M. Probabilistic Approach to Mean Field Games and Mean Field Type Control Problems with Multiple Populations. SSRN Electron. J. 2019.
- Bensoussan, A.; Huang, T.; Laurière, M. Mean Field Control and Mean Field Game Models with Several Populations. arXiv 2018, arXiv:1810.00783.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein Generative Adversarial Networks. In Proceedings of the 34th International Conference on Machine Learning—Volume 70 (ICML’17), Sydney, Australia, 6–11 August 2017; pp. 214–223.
- Benamou, J.D.; Carlier, G.; Santambrogio, F. Variational Mean Field Games. In Active Particles; Modeling and Simulation in Science, Engineering & Technology; Springer International Publishing: Cham, Switzerland, 2017; Volume 1, pp. 141–171.
- Cardaliaguet, P.; Graber, P.J. Mean field games systems of first order. ESAIM Control. Optim. Calc. Var. 2015, 21, 690–722.
- Cardaliaguet, P.; Graber, P.J.; Porretta, A.; Tonon, D. Second order mean field games with degenerate diffusion and local coupling. Nonlinear Differ. Equ. Appl. NoDEA 2015, 22, 1287–1317.
Method | Scene | Scale of UAVs | Scene Complexity | Communication |
---|---|---|---|---|
[7] | 3D blank scene | Small | 0.67 1 | 2 |
[8] | 2D obstacle scene | Small | 0.67 | |
[9] | 2D blank scene | Large | 0.67 | |
[37] | 3D blank scene | Large | 0.83 | |
Ours | 3D obstacle scene | Large | 1 | |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, G.; Li, Z.; Yao, W.; Xia, S. A Multi-Population Mean-Field Game Approach for Large-Scale Agents Cooperative Attack-Defense Evolution in High-Dimensional Environments. Mathematics 2022, 10, 4075. https://doi.org/10.3390/math10214075