Adaptive A*–Q-Learning–DWA Fusion with Dynamic Heuristic Adjustment for Safe Path Planning in Spraying Robots
Abstract
1. Introduction
1.1. Background
1.2. Related Work
Author | Algorithm Type | Core Improvements | Existing Limitations |
---|---|---|---|
Zhang et al. [12] | Hybrid Algorithm (Improved A* + DWA) | 1. Improved A*: Optimizes path efficiency and smoothness through adjustment factors, 5-neighborhood search, and Floyd’s algorithm. 2. Improved DWA: Introduces a smoothing coefficient and a local target selection strategy. | 1. A* has fixed neighborhood sampling (only 5-neighborhood), which cannot adapt to changes in obstacle distribution. 2. No safety distance constraint is set, so the path is prone to being close to obstacles. |
Fu et al. [13] | Adaptive A* | 1. Optimizes traditional neighborhood to 16-neighborhood. 2. Removes redundant path points via node deletion, bidirectional search, and dynamic weights. | 1. 16-neighborhood expands search space exponentially, increasing computation time. 2. Unoptimized turning angle may violate robot kinematic constraints. |
Jin et al. [14] | Adaptive A* | Reduces node redundancy and improves search efficiency via 5-neighborhood filtering. | 1. Failure to address turning angle optimization; path may have sharp turns (>90°). 2. No dynamic obstacle avoidance capability. |
1.3. Motivation and Contributions
- Dynamic heuristic A*: An adaptive A* algorithm is proposed, incorporating dynamic heuristic weighting (Equation (3)) and adaptive connectivity selection. A safety distance constraint (Equation (7)) ensures that all path nodes maintain a distance greater than one grid unit from obstacles, thereby satisfying geometric safety requirements for mobile robots.
- Q-learning for safety re-planning: Unsafe path segments are modeled as Markov Decision Processes (S, A, P, R) and re-planned using Q-learning.
- DWA integration: The integration of the DWA enhances the algorithm’s capability to manage both moving and unknown static obstacles in real time, thereby improving adaptability in complex and dynamic environments.
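The safety-distance requirement in the first contribution can be illustrated with a short check. The following is a minimal Python sketch, not the authors' implementation. It assumes the obstacles are stored as a set of grid cells, that distance is Euclidean between cell centres, and that a path node counts as unsafe when its nearest obstacle is not strictly farther than one grid unit. The names `min_obstacle_distance` and `unsafe_nodes` are hypothetical.

```python
import math

def min_obstacle_distance(node, obstacles):
    """Euclidean distance from a grid node to the nearest obstacle cell."""
    if not obstacles:
        return math.inf
    return min(math.dist(node, obs) for obs in obstacles)

def unsafe_nodes(path, obstacles, safety_distance=1.0):
    """Return the path nodes whose nearest obstacle is not strictly farther
    than safety_distance (assumed threshold: one grid unit)."""
    return [n for n in path
            if min_obstacle_distance(n, obstacles) <= safety_distance]

# Toy example with two obstacle cells and a five-node path.
obstacles = {(2, 3), (5, 5)}
path = [(1, 1), (2, 2), (3, 3), (4, 3), (5, 3)]
print(unsafe_nodes(path, obstacles))  # [(2, 2), (3, 3)]
```

Nodes flagged by such a check would be the candidates for the re-planning step described in the second contribution.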
2. Proposed Methods
2.1. Optimized A* Algorithm
2.1.1. Traditional A* Algorithm
2.1.2. Adaptive A* Algorithm
- (1) Optimization of the Evaluation Function
  - When r ≈ R (early stage), (1 + r/R) ≈ 2, which strengthens the heuristic term and accelerates global exploration;
  - When r → 0 (near the goal), (1 + r/R) → 1, which reduces the influence of the heuristic and prioritizes local safety.
- (2) Dynamic Connectivity Selection for Child Node Sampling
- (3) Obstacle Avoidance Optimization of the A* Algorithm
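The behaviour of the dynamic weight (1 + r/R) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's Equation (3): r is taken as the Euclidean distance from the current node to the goal, R as the Euclidean distance from the start to the goal, and the evaluation function is assumed to have the form f(n) = g(n) + (1 + r/R)·h(n). The clamp to the interval [1, 2] and the function names are additions made for this sketch.

```python
import math

def heuristic_weight(node, start, goal):
    """Dynamic weight 1 + r/R, where r is the node-to-goal distance and
    R is the start-to-goal distance (assumed definitions)."""
    big_r = math.dist(start, goal)
    if big_r == 0:
        return 1.0
    r = math.dist(node, goal)
    # Clamp r/R to 1 so the weight stays in [1, 2] (an addition for this sketch).
    return 1.0 + min(r / big_r, 1.0)

def f_cost(g, node, start, goal):
    """Assumed evaluation f(n) = g(n) + (1 + r/R) * h(n), Euclidean h(n)."""
    h = math.dist(node, goal)
    return g + heuristic_weight(node, start, goal) * h

start, goal = (3, 3), (28, 28)
print(heuristic_weight(start, start, goal))  # 2.0 at the start (r = R)
print(heuristic_weight(goal, start, goal))   # 1.0 at the goal (r = 0)
```

Under these assumptions the weight decays smoothly from 2 to 1 as the search approaches the goal, which matches the two limiting cases listed above.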
2.2. Path Smoothing Based on B-Splines
2.3. Path Optimization Using the Q-Learning Algorithm
2.3.1. Introduction to the Q-Learning Algorithm
- (1) State Space (S):
- (2) Action Space (A):
- (3) State Transition Dynamics (P):
- (4) Reward Function (R):
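The four MDP components listed above can be made concrete with a generic tabular Q-learning sketch. It is not the configuration used in this paper; every specific choice is an assumption made for illustration: states are grid cells, actions are four-connected moves, transitions are deterministic, and the reward values (+100 at the goal, −10 for a blocked move, −1 per step) as well as the learning parameters are placeholders.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumed 4-connected moves

def step(state, action, size, obstacles, goal):
    """Deterministic transition with an assumed reward scheme."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or nxt in obstacles:
        return state, -10.0, False   # blocked: stay in place, penalty
    if nxt == goal:
        return nxt, 100.0, True      # goal reached
    return nxt, -1.0, False          # ordinary step cost

def best_action(q, state):
    """Index of the action with the highest Q-value (unseen pairs count as 0)."""
    return max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))

def train(size, obstacles, start, goal, episodes=500,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # (state, action index) -> Q-value
    for _ in range(episodes):
        state = start
        for _ in range(4 * size * size):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = best_action(q, state)
            nxt, reward, done = step(state, ACTIONS[a], size, obstacles, goal)
            best_next = 0.0 if done else q.get((nxt, best_action(q, nxt)), 0.0)
            old = q.get((state, a), 0.0)
            # Standard update: Q <- Q + alpha * (r + gamma * max Q' - Q)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q

def greedy_path(q, size, obstacles, start, goal):
    """Follow the learned greedy policy from start, with a step cap."""
    path, state = [start], start
    for _ in range(size * size):
        state, _, done = step(state, ACTIONS[best_action(q, state)],
                              size, obstacles, goal)
        path.append(state)
        if done:
            break
    return path

obstacles = {(1, 1), (2, 1)}
q_table = train(4, obstacles, (0, 0), (3, 3))
print(greedy_path(q_table, 4, obstacles, (0, 0), (3, 3)))
```

In a safety re-planning setting, the reward would additionally penalize cells that violate the safety distance, so that the learned policy routes the unsafe segment away from obstacles.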
2.3.2. Q-Learning Algorithm for Path Optimization of the A* Algorithm
2.4. DWA Algorithm
3. Experiments
3.1. Experimental Environment
- Group 1 used four map sizes (30 × 30 to 200 × 200) at a 10% obstacle density, with 10 trials per map; fixed MATLAB random seeds, rng(1) to rng(10), kept the setups consistent across trials;
- Group 2 used a fixed 50 × 50 map with obstacle densities ranging from 0% to 20% (seed-controlled distributions).
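The seed-controlled setup can be mirrored in a few lines. The sketch below is a Python analogue, not the MATLAB code used in the experiments, and it will not reproduce the obstacle layouts generated by MATLAB's `rng`. The function `make_map`, the use of 0-based indices, and the choice to keep the start and goal cells free are assumptions made for this illustration.

```python
import random

def make_map(size, density, seed, keep_free=()):
    """Return a size x size grid (list of lists) with 1 = obstacle, 0 = free.
    round(density * size * size) obstacle cells are drawn with a fixed seed,
    never on the cells listed in keep_free."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)
             if (x, y) not in keep_free]
    n_obstacles = round(density * size * size)
    grid = [[0] * size for _ in range(size)]
    for x, y in rng.sample(cells, n_obstacles):
        grid[x][y] = 1
    return grid

# Ten trials on a 30 x 30 map at 10% density, one seed per trial.
maps = [make_map(30, 0.10, seed, keep_free={(3, 3), (28, 28)})
        for seed in range(1, 11)]
print(sum(map(sum, maps[0])))  # 90 obstacle cells
```

Because each trial is tied to a seed, the compared algorithms can be run on identical maps, which is the purpose of the fixed seeds described above.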
3.2. Simulation Experiments of the Adaptive A* Algorithm
3.2.1. Simulation Experiments with Different Map Sizes
- 30 × 30 map: start point = (3, 3); goal point = (28, 28).
- 50 × 50 map: start point = (3, 3); goal point = (48, 45).
- 100 × 100 map: start point = (3, 3); goal point = (95, 95).
- 200 × 200 map: start point = (3, 3); goal point = (195, 195).
3.2.2. Simulation Experiments with Different Obstacle Rates
4. Results Analysis and Discussion
4.1. Simulation Data Analysis
4.1.1. Performance of the Adaptive A* Algorithm Under Varied Map Sizes
4.1.2. Performance of the Adaptive A* Algorithm Under Varied Obstacle Densities
4.1.3. Optimization Mechanism and Safety Improvement of the Adaptive A* Algorithm
4.2. Performance Comparison of the Adaptive A* Algorithm
4.3. Q-Learning Algorithm Path Optimization Simulation Experiment
4.4. Simulation Experiments with the Integration of the DWA Algorithm
4.5. Experimental Results Analysis
- (1) Evaluation Function Enhancement:
- (2) Optimized Child Node Selection:
- (3) Unsafe Path Re-planning:
- (4) Path Smoothing:
- (5) Handling of Unknown Obstacles:
4.6. Uncertainty Analysis and Coping Strategies
4.6.1. Internal Uncertainties
- (1) Sensor Noise:
- (2) Motion Model Errors:
- (3) Mitigation Strategy:
  - EKF (extended Kalman filter) fusion of LiDAR and monocular vision, which reduces the ranging error to ±0.8 cm, combined with a larger safety threshold when the obstacle confidence is low;
  - Correction of turning costs by a dynamic error coefficient β derived from historical motion data, expressed as C′(n) = β × C(n);
  - Reduction of the heuristic weight when the sensor confidence is low (decreasing 1 + r/R from 2 to 1.2).
4.6.2. External Uncertainties
- (1) Randomness of Dynamic Obstacles:
- (2) Inaccurate Environmental Modeling:
- (3) Mitigation Strategies:
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Li, C.; Zhang, X. Operation technology of unmanned mining robot for coal mine based on intelligent control technology. In Proceedings of the 2023 International Conference on Computer Simulation and Modeling, Information Security (CSMIS), Buenos Aires, Argentina, 15–17 November 2023; IEEE: Piscataway, NJ, USA, 2024; pp. 86–91.
- Liu, Z. Navigation system of coal mine rescue robot based on virtual reality technology. In Proceedings of the 2022 World Automation Congress (WAC), San Antonio, TX, USA, 11–15 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 379–383.
- Zhang, H.; Xie, X.; Wei, M.; Wang, X.; Song, D.; Luo, J. An improved goal-bias RRT algorithm for unmanned aerial vehicle path planning. In Proceedings of the 2024 IEEE International Conference on Mechatronics and Automation (ICMA), Tianjin, China, 4–7 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1360–1365.
- Fu, S. Robot Path Planning Optimization Based on RRT and APF Fusion Algorithm. In Proceedings of the 2024 8th International Conference on Robotics and Automation Sciences (ICRAS), Tokyo, Japan, 21–23 June 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 32–36.
- Zhu, W.; Qiu, G. Path Planning of Intelligent Mobile Robots with an Improved RRT Algorithm. Appl. Sci. 2025, 15, 3370.
- Wu, D.; Li, Y. Mobile Robot Path Planning Based on Improved Smooth A* Algorithm and Optimized Dynamic Window Approach. In Proceedings of the 2024 2nd International Conference on Signal Processing and Intelligent Computing (SPIC), Guangzhou, China, 20–22 September 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 345–348.
- Meng, F.; Sun, X.; Zhu, J.; Mei, B.; Zheng, P. Research on Ship Path Planning Based on Bidirectional A*-APF Algorithm. In Proceedings of the 2024 4th International Conference on Consumer Electronics and Computer Engineering (ICCECE), Guangzhou, China, 12–14 January 2024; pp. 460–466.
- Fu, B.; Chen, L.; Zhou, Y.; Zheng, D.; Wei, Z.; Dai, J.; Pan, H. An improved A* algorithm for the industrial robot path planning with high success rate and short length. Robot. Auton. Syst. 2018, 106, 26–37.
- Lv, Y.; Wang, J.; Yang, H.; Zhang, X. Dynamic Guidance Point Obstacle Avoidance Planning Method and Optimization Based on DWA. In Proceedings of the 2024 International Conference on Cyber-Physical Social Intelligence (ICCSI), Doha, Qatar, 8–12 November 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–6.
- Nie, W.; Fu, H.; Ma, S.; Deng, Y.; Yang, J. An Improved FFPF-APF Path Planning. In Proceedings of the 4th 2024 International Conference on Autonomous Unmanned Systems (4th ICAUS 2024): Volume II, Shenyang, China, 19–21 September 2024; Springer Nature: Berlin/Heidelberg, Germany, 2025; Volume 1375, pp. 156–170.
- Zhang, H.M.; Li, M.L.; Yang, L. Safe path planning of mobile robot based on improved A* algorithm in complex terrains. Algorithms 2018, 11, 44.
- Zhang, J.; Ling, H.; Tang, Z.; Song, W.; Lu, A. Path planning of USV in confined waters based on improved A* and DWA fusion algorithm. Ocean. Eng. 2025, 322, 120475.
- Fu, X.; Huang, Z.; Zhang, G.; Wang, W.; Wang, J. Research on path planning of mobile robots based on improved A* algorithm. PeerJ Comput. Sci. 2025, 11, e2691.
- Jing, M.; Wang, H. Robot path planning by integrating improved A* algorithm and DWA algorithm. J. Phys. Conf. Ser. 2023, 492, 012017.
- Huang, X.; Li, G. An improved Q-learning algorithm for path planning. In Proceedings of the 2023 IEEE International Conference on Sensors, Electronics and Computer Engineering (ICSECE), Jinzhou, China, 18–20 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 277–281.
- Gan, X.; Huo, Z.; Li, W. Dp-a*: For path planing of ugv and contactless delivery. IEEE Trans. Intell. Transp. Syst. 2023, 25, 907–919.
- Liao, C.; Wang, S.; Wang, Z.; Zhai, Y. GAA-DFQ: A dual-layer learning model for robot path planning in dynamic environments integrating genetic algorithms, DWA, fuzzy control and Q-learning. In Proceedings of the 2025 8th International Conference on Advanced Algorithms and Control Engineering (ICAACE), Shanghai, China, 21–23 March 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 339–343.
- Hwang, U.; Hong, S. On Practical Robust Reinforcement Learning: Adjacent Uncertainty Set and Double-Agent Algorithm. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 7696–7710.
- De Carvalho, K.B.; de OBBatista, H.; Fagundes-Junior, L.A.; de Oliveira, I.R.L.; Brandão, A.S. Q-learning global path planning for UAV navigation with pondered priorities. Intell. Syst. Appl. 2025, 25, 200485.
- Lu, Y.; Da, C. Global and local path planning of robots combining ACO and dynamic window algorithm. Sci. Rep. 2025, 15, 9452.
- Du, X.; Luo, P.; Lv, X. Path Planning Algorithm Based on Improved RRT and Artificial Potential Field. In Proceedings of the 2024 8th International Conference on Electrical, Mechanical and Computer Engineering (ICEMCE), Xi’an, China, 25–27 October 2024; pp. 1767–1774.
- Liu, Y.T.; Guo, S.J.; Tang, S.F.; Zhang, X.W.; Li, T.T. Path planning based on fusion of improved A* and ROA-DWA for robot. Zhejiang Daxue Xuebao (Gongxue Ban)/J. Zhejiang Univ. (Eng. Sci.) 2024, 58, 360–369.
- Wang, P.; Gupta, K. View planning via maximal c-space entropy reduction. In Algorithmic Foundations of Robotics V; Springer: Berlin/Heidelberg, Germany, 2004; pp. 149–165.
- Lanillos, P.; Besada-Portas, E.; Pajares, G.; Ruz, J.J. Minimum time search for lost targets using cross entropy optimization. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Algarve, Portugal, 7–12 October 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 602–609.
- Gong, X.; Gao, Y.; Wang, F.; Zhu, D.; Zhao, W.; Wang, F.; Liu, Y. A local path planning algorithm for robots based on improved DWA. Electronics 2024, 13, 2965.
- Guo, Z.; Chen, H.; Xu, F.; Hu, Y.; Lin, J.; Guo, L. Uncertainty-Aware Safe Trajectory Planner Based on Model Predictive Control for Autonomous Driving. IEEE Trans. Intell. Transp. Syst. 2025, 26, 12068–12079.
Author | Algorithm Type | Core Improvements | Existing Limitations |
---|---|---|---|
Gan et al. [16] | Hybrid algorithm (DP-A* + Q-Learning) | Integrates Q-learning into A* heuristic function to optimize dynamic obstacle avoidance. | 1. Relies on predefined environment models, unable to handle unknown static obstacles. 2. Lacks re-planning for locally unsafe segments, posing collision risks. |
Liao et al. [17] | Dual-layer hybrid model (GAA-DFQ) | 1. Global layer: Improved multi-objective genetic algorithm (GAA) integrates A* to generate initial population, optimizing path length and safety margin. 2. Local layer: DFQ algorithm fuses DWA, fuzzy control, and Q-learning. | 1. High algorithm complexity and strict hardware requirements. 2. Unsuitable for single-target scenarios of coal mine shotcrete robots. |
Author | Algorithm Type | Core Improvements | Existing Limitations |
---|---|---|---|
Du et al. [21] | Improved RRT (Fused with APF) | 1. Target bias strategy. 2. Sampling restrictions in the target area. 3. Integration of Artificial Potential Field (APF). | 1. Relies on static APF, resulting in weak response to dynamic obstacles. 2. The random sampling characteristic of RRT remains unmodified, leading to poor global path optimality. |
Liu Yuting et al. [22] | Hybrid Planner (Improved A* + ROA-DWA) | 1. Improved A*: Heuristic weight tuning + redundant node pruning. 2. Integration of ROA-DWA (Region of Avoidance—Dynamic Window Approach). | 1. The node pruning strategy of the improved A* is fixed, leading to still-high redundancy in complex environments. 2. Lack of re-planning for unsafe path segments, making it difficult to ensure safe distance. |
Map Size (Grid) | Algorithm | Search Time (s) | Path Length (Grid Unit) | Number of Path Nodes (Node) | Unsafe Nodes |
---|---|---|---|---|---|
30 × 30 | Traditional A* algorithm | 0.43 ± 0.05 | 36.284 ± 1.2 | 30 ± 2 | 15 ± 1 |
30 × 30 | Adaptive A* algorithm | 0.45 ± 0.05 | 42.614 ± 1.5 | 22 ± 1 | 4 ± 1 |
50 × 50 | Traditional A* algorithm | 1.36 ± 0.10 | 61.983 ± 1.8 | 47 ± 3 | 16 ± 2 |
50 × 50 | Adaptive A* algorithm | 1.29 ± 0.10 | 68.242 ± 2.0 | 33 ± 2 | 9 ± 2 |
100 × 100 | Traditional A* algorithm | 4.61 ± 0.52 | 131.450 ± 2.5 | 97 ± 4 | 31 ± 2 |
100 × 100 | Adaptive A* algorithm | 4.26 ± 0.50 | 145.249 ± 2.8 | 65 ± 3 | 16 ± 2 |
200 × 200 | Traditional A* algorithm | 15.29 ± 1.30 | 274.63 ± 4.5 | 200 ± 6 | 70 ± 2 |
200 × 200 | Adaptive A* algorithm | 12.51 ± 1.10 | 305.45 ± 5.0 | 134 ± 4 | 39 ± 2 |
Obstacle Rate | Algorithm Type | Path Length (Grid Unit) | Length Ratio (Traditional/Adaptive) | Number of Nodes (Node) | Node Ratio (Traditional/Adaptive) | Unsafe Nodes |
---|---|---|---|---|---|---|
0% | Traditional A* algorithm | 61.55 ± 1.5 | 1.033 ± 0.05 | 45 ± 2 | 1.956 ± 0.1 | 0 |
0% | Adaptive A* algorithm | 59.56 ± 1.3 | | 23 ± 1 | | 0 |
 | Traditional A* algorithm | 61.98 ± 1.5 | 0.988 ± 0.05 | 45 ± 2 | 1.60 ± 0.1 | 10 ± 1 |
 | Adaptive A* algorithm | 62.72 ± 1.4 | | 28 ± 1 | | 6 ± 1 |
10% | Traditional A* algorithm | 61.98 ± 1.6 | 0.908 ± 0.05 | 47 ± 2 | 1.42 ± 0.1 | 16 ± 2 |
10% | Adaptive A* algorithm | 68.24 ± 1.7 | | 33 ± 2 | | 9 ± 2 |
 | Traditional A* algorithm | 62.36 ± 1.6 | 0.871 ± 0.05 | 48 ± 3 | 1.37 ± 0.1 | 20 ± 2 |
 | Adaptive A* algorithm | 71.58 ± 1.7 | | 35 ± 2 | | 13 ± 2 |
20% | Traditional A* algorithm | 63.54 ± 1.5 | 0.842 ± 0.05 | 48 ± 3 | 1.26 ± 0.1 | 26 ± 2 |
20% | Adaptive A* algorithm | 75.46 ± 1.8 | | 38 ± 2 | | 16 ± 2 |
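The ratio columns in the obstacle-rate table are numerically consistent with dividing the traditional A* value by the adaptive A* value. The table does not state this definition, so it is an inference from the reported numbers. The short check below recomputes both ratios from the tabulated means; the variable names are illustrative only.

```python
# (traditional, adaptive) mean values copied from the obstacle-rate table,
# one pair per obstacle rate, in table order.
path_lengths = [(61.55, 59.56), (61.98, 62.72), (61.98, 68.24),
                (62.36, 71.58), (63.54, 75.46)]
node_counts = [(45, 23), (45, 28), (47, 33), (48, 35), (48, 38)]

reported_length_ratio = [1.033, 0.988, 0.908, 0.871, 0.842]
reported_node_ratio = [1.956, 1.60, 1.42, 1.37, 1.26]

def ratios(pairs):
    """Traditional value divided by adaptive value for each pair."""
    return [traditional / adaptive for traditional, adaptive in pairs]

length_ratio = ratios(path_lengths)
node_ratio = ratios(node_counts)

for computed, reported in zip(length_ratio, reported_length_ratio):
    print(f"length ratio {computed:.3f} (reported {reported})")
for computed, reported in zip(node_ratio, reported_node_ratio):
    print(f"node ratio {computed:.3f} (reported {reported})")
```

A length ratio below 1 therefore means that the adaptive path is longer than the traditional one, while a node ratio above 1 means that it uses fewer nodes.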
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Su, C.; Zhao, L.; Xiang, D. Adaptive A*–Q-Learning–DWA Fusion with Dynamic Heuristic Adjustment for Safe Path Planning in Spraying Robots. Appl. Sci. 2025, 15, 9340. https://doi.org/10.3390/app15179340