A Hybrid Path Planning Framework Integrating Deep Reinforcement Learning and Variable-Direction Potential Fields
Abstract
1. Introduction
- (1) An obstacle classification algorithm categorizes obstacles into trap and non-trap types according to their spatial distribution, allowing the robot to select a steering strategy appropriate to each type (an illustrative sketch is given under Section 3.1 below).
- (2) A safe variable-direction potential field is developed. By decoupling the attractive and repulsive potentials and modifying the direction of the repulsive force, the robot can execute detour maneuvers and thereby escape local minima and obstacle-enclosed trap regions.
- (3) A hybrid framework integrating RL (Reinforcement Learning) and APF (Artificial Potential Field) is constructed. A weight factor balances the attractive and repulsive potentials; by trading a small amount of safety for a larger action space, the RL algorithm continuously adjusts the magnitudes of the attractive and repulsive forces while updating the robot's state variables and action policy during training, speeding convergence to the target position (see the worked equations after this list).
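For context, the attractive and repulsive potentials underlying the framework follow the classical APF formulation; the weighted resultant force in the last line is only a hedged illustration of how a weight factor w might balance the two components, not the paper's exact formulation, which Sections 3 and 4 develop as the variable-direction potential field.

```latex
% Classical APF potentials (standard definitions). The weighted resultant in the
% last line is an illustrative assumption about how the weight factor w could
% enter; the paper's variable-direction formulation is given in Sections 3-4.
U_{\mathrm{att}}(q) = \tfrac{1}{2}\,k_{a}\,\lVert q - q_{\mathrm{goal}}\rVert^{2},
\qquad
U_{\mathrm{rep}}(q) =
\begin{cases}
\tfrac{1}{2}\,k_{r}\!\left(\dfrac{1}{\rho(q)} - \dfrac{1}{\rho_{0}}\right)^{2}, & \rho(q) \le \rho_{0},\\[4pt]
0, & \rho(q) > \rho_{0},
\end{cases}
\qquad
F(q) = -\,w\,\nabla U_{\mathrm{att}}(q) \;-\; (1-w)\,\nabla U_{\mathrm{rep}}(q),
\quad w \in [0,1],
```

where ρ(q) denotes the distance from configuration q to the nearest obstacle and ρ₀ is the repulsive influence radius.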
2. Related Works
2.1. Brief Review of APF Method
2.2. Path Planning Problems in Reinforcement Learning
2.3. Brief Review of Twin Delayed Deep Deterministic Policy Gradient Algorithm
3. Variable-Direction Potential Fields
3.1. Obstacle Classification Algorithm
Algorithm 1: Obstacle classification algorithm.
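The pseudocode of Algorithm 1 is not reproduced in this extract. As a hedged illustration of the idea described in contribution (1), the sketch below labels obstacle clusters as trap or non-trap using a simple geometric heuristic; the function name classify_obstacles, the clustering radius, and the angular-coverage thresholds are all hypothetical and are not the authors' criteria.

```python
import numpy as np

def classify_obstacles(robot, goal, obstacles,
                       cluster_radius=2.0, coverage_threshold=np.pi):
    """Illustrative trap/non-trap labelling (NOT the paper's Algorithm 1).

    Assumes circular obstacles given as (x, y, r). A cluster is flagged as a
    trap when, seen from the robot, its members cover a wide angular span and
    something in the cluster sits roughly on the robot-to-goal line, i.e. the
    cluster forms a concave "wall" the robot would have to detour around.
    """
    robot, goal = np.asarray(robot, float), np.asarray(goal, float)
    centers = np.asarray([o[:2] for o in obstacles], float)

    # 1. Greedy seeding: each unlabelled obstacle within cluster_radius of a
    #    seed joins that seed's cluster.
    labels = -np.ones(len(centers), dtype=int)
    for i, c in enumerate(centers):
        if labels[i] >= 0:
            continue
        labels[i] = i
        for j in range(i + 1, len(centers)):
            if labels[j] < 0 and np.linalg.norm(centers[j] - c) < cluster_radius:
                labels[j] = i

    # 2. For each cluster, measure the angular span it covers around the
    #    robot-to-goal bearing.
    goal_bearing = np.arctan2(*(goal - robot)[::-1])
    types = {}
    for lab in np.unique(labels):
        members = centers[labels == lab]
        rel = members - robot
        angles = np.arctan2(rel[:, 1], rel[:, 0])
        # Wrap angles relative to the goal bearing into (-pi, pi].
        offsets = (angles - goal_bearing + np.pi) % (2 * np.pi) - np.pi
        span = offsets.max() - offsets.min() if len(offsets) > 1 else 0.0
        blocks_goal = np.any(np.abs(offsets) < 0.3)  # ~17 deg of the goal line
        types[lab] = "trap" if (span > coverage_threshold and blocks_goal) else "non-trap"
    return {tuple(centers[i]): types[labels[i]] for i in range(len(centers))}
```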
3.2. Processing of Resultant Force
3.3. Design of Variable-Direction Potential Field
4. VDPF-TD3 Algorithm
4.1. Safety of Variable-Direction Potential Field
4.2. VDPF-TD3 Algorithm
Algorithm 2: VDPF-TD3 algorithm.
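The pseudocode of Algorithm 2 is likewise not reproduced here. The sketch below shows one plausible shape of the hybrid loop implied by contribution (3): the TD3 policy outputs attractive/repulsive gains, the variable-direction potential field turns them into a motion command, and the transition feeds a standard off-policy update. The interfaces vdpf_td3_episode, td3_agent, and vdpf are assumptions made for illustration, not the authors' implementation.

```python
def vdpf_td3_episode(env, td3_agent, vdpf, max_steps=500):
    """One training episode of a hypothetical VDPF-TD3 loop (illustrative only).

    The TD3 action is interpreted as gains (k_att, k_rep) for the potential
    field; the field then produces the actual motion command, so the learned
    policy modulates force magnitudes rather than raw velocities.
    """
    state = env.reset()
    episode_return = 0.0
    for _ in range(max_steps):
        # TD3 proposes attractive/repulsive gains (with exploration noise).
        k_att, k_rep = td3_agent.select_action(state)

        # The variable-direction potential field converts the gains and local
        # obstacle information into a steering command (heading, speed).
        force = vdpf.resultant_force(state, k_att=k_att, k_rep=k_rep)
        command = vdpf.force_to_command(force)

        next_state, reward, done, info = env.step(command)

        # Standard off-policy TD3 update from the replay buffer.
        td3_agent.replay_buffer.add(state, (k_att, k_rep), reward, next_state, done)
        td3_agent.train()

        episode_return += reward
        state = next_state
        if done:
            break
    return episode_return
```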
5. Simulation
5.1. Simulation Experiments of VDPF Algorithm
5.2. Simulation Experiments of VDPF-TD3 Algorithm
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
| Parameter | Value |
|---|---|
| Timesteps | 1,000,000 |
| Maximum Steps per Episode | 500 |
| Soft Update Coefficient | 0.005 |
| Discount Factor | 0.99 |
| Learning Rate | 0.0003 |
| Tau | 0.005 |
| Policy Noise | 0.2 |
| Policy Delay | 2 |
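As a minimal sketch of how the tabulated values could map onto a concrete TD3 setup, the snippet below configures a Stable-Baselines3 TD3 model with these hyperparameters; the environment id "NavEnv-v0" is a placeholder, and the authors' actual training code may differ.

```python
# Hedged illustration: the tabulated hyperparameters plugged into a
# Stable-Baselines3 TD3 configuration. "NavEnv-v0" is a placeholder
# environment id, not the paper's simulator.
import gymnasium as gym
from stable_baselines3 import TD3

env = gym.wrappers.TimeLimit(gym.make("NavEnv-v0"), max_episode_steps=500)  # Maximum Steps per Episode

model = TD3(
    "MlpPolicy",
    env,
    learning_rate=3e-4,        # Learning Rate
    gamma=0.99,                # Discount Factor
    tau=0.005,                 # Soft Update Coefficient / Tau
    target_policy_noise=0.2,   # Policy Noise
    policy_delay=2,            # Policy Delay
    verbose=1,
)
model.learn(total_timesteps=1_000_000)  # Timesteps
```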
| Env | Algo | GS | LS | Time | Length |
|---|---|---|---|---|---|
| 5 obstacles | APF-TD3 | | | | |
| | VDPF-TD3 | | | | |
| | APF-SAC | | | | |
| | VDPF-SAC | | | | |
| | APF-PPO | | | | |
| | VDPF-PPO | | | | |
| 10 obstacles | APF-TD3 | | | | |
| | VDPF-TD3 | | | | |
| | APF-SAC | | | | |
| | VDPF-SAC | | | | |
| | APF-PPO | | | | |
| | VDPF-PPO | | | | |
| 15 obstacles | APF-TD3 | — | — | — | — |
| | VDPF-TD3 | | | | |
| | APF-SAC | — | — | — | — |
| | VDPF-SAC | | | | |
| | APF-PPO | — | — | — | — |
| | VDPF-PPO | | | | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).