Enhancing Autonomous Underwater Vehicle Decision Making through Intelligent Task Planning and Behavior Tree Optimization
Abstract
1. Introduction
- Time-Intensive Manual Design: Manually designed decision control systems cannot meet the timeliness demands of intricate tasks or of a growing number of AUVs.
- Balancing Safety and Efficiency: Ensuring both the safety of AUV operations and the efficiency of task completion within complex scenarios remains a critical challenge in decision control for AUVs.
- Enhancing Decision Structures: Reactive decision control structures are pivotal for moving beyond purely goal-driven behaviors toward more effective decision-making in AUV operations.
- Establishment of Behavior Tree Search Model: A behavior tree search model based on directed acyclic graphs is developed, serving as a foundational framework for the design of AUV control systems.
- Introduction of Optimization Algorithm: An optimization algorithm based on MCTS-QPSO is presented that automatically searches for the optimal behavior tree structure, improving the efficiency and effectiveness of decision-making.
- Enhanced Optimization Efficiency: The optimization algorithm’s efficiency is further bolstered through the pre-grouping of actions and states, reducing unnecessary search costs and streamlining the overall optimization process.
2. Literature Review
3. Problem Formulation
4. Behavior Tree Learning Algorithm
4.1. Traditional Behavior Tree and AUV Control
Algorithm 1 Sequence node of BTs
for child ∈ children do
    status = tick(child);
    if status = Running OR status = Failure then
        return status;
    end
end
return Success
Algorithm 2 Fallback node of BTs
for child ∈ children do
    status = tick(child);
    if status = Running OR status = Success then
        return status;
    end
end
return Failure
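Algorithms 1 and 2 translate almost line for line into code. The following is a minimal Python sketch, not the authors' implementation: the `Status` enum, the modelling of a node as a zero-argument callable, and the `leaf` helper are assumptions introduced only for illustration.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


# Assumption: a node is any zero-argument callable that returns a Status when ticked.
Node = Callable[[], Status]


def leaf(status: Status) -> Node:
    """Hypothetical leaf that always reports the same status (for demonstration)."""
    return lambda: status


def sequence(children: List[Node]) -> Node:
    """Algorithm 1: stop at the first child that is Running or Failure."""
    def tick() -> Status:
        for child in children:
            status = child()
            if status in (Status.RUNNING, Status.FAILURE):
                return status
        return Status.SUCCESS
    return tick


def fallback(children: List[Node]) -> Node:
    """Algorithm 2: stop at the first child that is Running or Success."""
    def tick() -> Status:
        for child in children:
            status = child()
            if status in (Status.RUNNING, Status.SUCCESS):
                return status
        return Status.FAILURE
    return tick


if __name__ == "__main__":
    tree = fallback([sequence([leaf(Status.SUCCESS), leaf(Status.FAILURE)]),
                     leaf(Status.RUNNING)])
    print(tree())  # the sequence fails, so the fallback moves on to the second child
```

The two nodes are mirror images: a Sequence short-circuits on Failure, while a Fallback short-circuits on Success. Both pass Running upward immediately, which is what makes the tree reactive between ticks.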
Algorithm 3 Parallel node of BTs
for child ∈ children do
    status = tick(child);
end
if … then
    return Success;
end
if … then
end
return Failure
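The two conditions of Algorithm 3 are not legible in the listing, so the sketch below assumes the commonly used threshold rule for a Parallel node with N children and a success threshold M: Success once at least M children succeed, Failure once more than N − M children fail (success is then unreachable), and Running otherwise. The exact inequalities, the `Status` enum, and the `leaf` helper are assumptions for illustration and may differ from the authors' definition.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


# Assumption: a node is any zero-argument callable that returns a Status when ticked.
Node = Callable[[], Status]


def leaf(status: Status) -> Node:
    """Hypothetical leaf that always reports the same status (for demonstration)."""
    return lambda: status


def parallel(children: List[Node], m: int) -> Node:
    """Tick every child, then compare success/failure counts with the threshold m."""
    def tick() -> Status:
        statuses = [child() for child in children]  # all children are ticked
        n = len(children)
        successes = statuses.count(Status.SUCCESS)
        failures = statuses.count(Status.FAILURE)
        if successes >= m:          # assumed success condition
            return Status.SUCCESS
        if failures > n - m:        # assumed failure condition: m successes unreachable
            return Status.FAILURE
        return Status.RUNNING
    return tick


if __name__ == "__main__":
    node = parallel([leaf(Status.SUCCESS), leaf(Status.RUNNING), leaf(Status.RUNNING)], m=2)
    print(node())  # one success, no failures: still undecided
```

Unlike Sequence and Fallback, the Parallel node never short-circuits while ticking: every child is ticked on each pass, and only the aggregated counts decide the returned status.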
4.2. Behavior Tree Formal Grammar
4.3. MCTS for Subtree Learning
Algorithm 4 MCTS for subtree search frame
Input: AUVs number, subtree number, iterations IterNum, initialized three types of subtrees: TSs, TEs, MEs, PTs reward value calculation equation
for i ← 1 to … do
    Create a new tree with root node and initialize root:
    root.N ← 0, root.Q ← 0
    for j ← 1 to IterNum do
        Evaluate the draw profit value;
        while (True)
            if p is leaf
                break;
            end if
            find the best subtree of p and its index ind;
            p ← best subtree of p;
        end while
        if p is not leaf node
            Expand the node p;
        end if
        Simulation: Simulate returns according to the reward
        Back up
    end for
end for
Return optimized subtree
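The four phases named in Algorithm 4 (selection of the best subtree, expansion, simulation, and back up of the visit count N and value Q) follow the general MCTS pattern. The sketch below is a generic illustration of that pattern, not the authors' search frame: it assumes that "best subtree" means the child with the highest UCB1 score, that a candidate solution is a fixed-length sequence of subtree labels, and that `reward` is a user-supplied function; `TreeNode`, `ucb1`, `mcts`, and all parameter names are hypothetical.

```python
import math
import random
from typing import Callable, List, Optional


class TreeNode:
    def __init__(self, choice: Optional[str], parent: Optional["TreeNode"] = None):
        self.choice = choice          # subtree label chosen at this node
        self.parent = parent
        self.children: List["TreeNode"] = []
        self.N = 0                    # visit count
        self.Q = 0.0                  # accumulated reward

    def path(self) -> List[str]:
        """Sequence of choices from the root down to this node."""
        node, out = self, []
        while node.parent is not None:
            out.append(node.choice)
            node = node.parent
        return out[::-1]


def ucb1(child: TreeNode, parent_visits: int, c: float = 1.4) -> float:
    if child.N == 0:
        return math.inf               # always try unvisited children first
    return child.Q / child.N + c * math.sqrt(math.log(parent_visits) / child.N)


def mcts(candidates: List[str], depth: int,
         reward: Callable[[List[str]], float],
         iter_num: int, rng: random.Random) -> List[str]:
    root = TreeNode(None)
    for _ in range(iter_num):
        # Selection: descend to a leaf by repeatedly taking the best child.
        p = root
        while p.children:
            p = max(p.children, key=lambda ch: ucb1(ch, p.N))
        # Expansion: add one child per candidate unless the sequence is complete.
        if len(p.path()) < depth:
            p.children = [TreeNode(c, p) for c in candidates]
            p = rng.choice(p.children)
        # Simulation: complete the sequence at random and score it.
        rollout = p.path()
        while len(rollout) < depth:
            rollout.append(rng.choice(candidates))
        value = reward(rollout)
        # Back up: update N and Q on the path back to the root.
        while p is not None:
            p.N += 1
            p.Q += value
            p = p.parent
    # Read out the most visited path as the search result.
    best, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.N)
        best.append(node.choice)
    return best


if __name__ == "__main__":
    target = ["search", "track", "pursue"]          # made-up labels
    score = lambda seq: sum(a == b for a, b in zip(seq, target)) / len(target)
    print(mcts(["search", "track", "pursue"], 3, score, 2000, random.Random(0)))
```

Reading out the most visited child rather than the one with the highest mean value is a common design choice, because visit counts are less sensitive to a few lucky rollouts.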
4.4. Optimization Algorithm for Subtree Fusion
- (1) Initialize the position information of the particles and determine the particle population size and the particle dimension.
- (2) Calculate the mean best position (the middle position) of the particle swarm.
- (3) Calculate the fitness of each particle, and select the particle with the optimal fitness value as the optimal particle.
- (4) Compare the fitness values of all optimal particles, and select the one with the best fitness value as the global optimal particle.
- (5) For each dimension of each particle, obtain a random point between the optimal particle and the global optimal particle.
- (6) Obtain the new position of the particle.
- (7) Check whether the particle meets the limit condition; if it does not, solve for the control step so that the condition is met.
- (8) Repeat steps 2–7 until the algorithm reaches the accuracy standard or the maximum number of iterations, and output the global optimal particle position and its fitness.
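The steps above match the general shape of quantum-behaved particle swarm optimization (QPSO). The sketch below is a standard textbook form of QPSO under stated assumptions, not the authors' fusion algorithm: fitness is minimized, the limit condition of step 7 is replaced by simple clamping to box bounds, the contraction-expansion coefficient `beta` is fixed, and the individual and global bests are updated as soon as a particle moves. All function and parameter names are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple


def qpso(fitness: Callable[[List[float]], float], dim: int,
         bounds: Tuple[float, float], swarm_size: int = 20,
         max_iter: int = 100, beta: float = 0.75,
         seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: initialize particle positions.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    pbest = [p[:] for p in x]                       # individual best positions
    pbest_fit = [fitness(p) for p in x]
    g = min(range(swarm_size), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]    # global best position
    for _ in range(max_iter):
        # Step 2: mean best position of the swarm.
        mbest = [sum(p[d] for p in pbest) / swarm_size for d in range(dim)]
        for i in range(swarm_size):
            for d in range(dim):
                # Step 5: random point between individual best and global best.
                phi = rng.random()
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                # Step 6: new position drawn around the attractor.
                u = 1.0 - rng.random()              # in (0, 1], keeps log finite
                step = beta * abs(mbest[d] - x[i][d]) * math.log(1.0 / u)
                new = attractor + step if rng.random() < 0.5 else attractor - step
                # Step 7 (assumption): enforce the limit by clamping to the bounds.
                x[i][d] = min(max(new, lo), hi)
            # Steps 3 and 4: refresh individual and global bests.
            fit = fitness(x[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = x[i][:], fit
    # Step 8: output the global best position and its fitness.
    return gbest, gbest_fit


if __name__ == "__main__":
    sphere = lambda v: sum(t * t for t in v)        # toy objective
    print(qpso(sphere, dim=3, bounds=(-5.0, 5.0)))
```

Unlike classical PSO, there is no velocity term: each coordinate is resampled around an attractor, with a spread proportional to the particle's distance from the mean best position.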
5. Simulation Experiment
5.1. Verification of Multi-AUV Cooperative Task Effect
5.2. Algorithm Effectiveness Analysis
5.3. Algorithm Superiority Analysis
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Node Type | Success | Failure | Running |
---|---|---|---|
Fallback | One child succeeds | All children fail | One child running |
Sequence | All children succeed | One child fails | One child running |
Parallel | >M children succeed | N−M children fail | Else |
Action | Upon completion | Not complete | During completion |
Condition | True | False | Never |

Condition | Action |
---|---|
Target is an obstacle | Mobile_evasion |
Velocity below maximum | Accelerate |
Heading is not satisfied | Adjust heading |
Within pursuit range | Pursue |
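The condition-action pairs in this table can be read as a prioritized rule list in which the first satisfied condition selects the action. The sketch below illustrates that reading only: the priority order (taken from the row order), the `AUVState` fields, and the thresholds are assumptions, not values from the article.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class AUVState:
    # All fields are hypothetical; they exist only to make the conditions testable.
    target_is_obstacle: bool
    speed: float
    max_speed: float
    heading_error: float        # absolute value compared with the tolerance
    heading_tolerance: float
    distance_to_target: float
    pursuit_range: float


Rule = Tuple[Callable[[AUVState], bool], str]

# One rule per table row, in the same order as the table.
RULES: List[Rule] = [
    (lambda s: s.target_is_obstacle, "Mobile_evasion"),
    (lambda s: s.speed < s.max_speed, "Accelerate"),
    (lambda s: abs(s.heading_error) > s.heading_tolerance, "Adjust heading"),
    (lambda s: s.distance_to_target <= s.pursuit_range, "Pursue"),
]


def select_action(state: AUVState, rules: List[Rule] = RULES) -> Optional[str]:
    """Return the action of the first rule whose condition holds, or None."""
    for condition, action in rules:
        if condition(state):
            return action
    return None


if __name__ == "__main__":
    state = AUVState(False, 2.0, 2.0, 0.5, 0.1, 50.0, 100.0)
    print(select_action(state))  # heading is off, so the heading rule fires
```

In behavior tree terms this corresponds to a Fallback node over Sequence(condition, action) pairs, so reordering `RULES` changes which behavior takes priority.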
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yu, D.; Wang, H.; Cao, X.; Wang, Z.; Ren, J.; Zhang, K. Enhancing Autonomous Underwater Vehicle Decision Making through Intelligent Task Planning and Behavior Tree Optimization. J. Mar. Sci. Eng. 2024, 12, 791. https://doi.org/10.3390/jmse12050791