Applying Reinforcement Learning for AMR’s Docking and Obstacle Avoidance Behavior Control
Abstract
1. Introduction
2. Deep Q-Network Methodology
2.1. Localization and Differential Wheel Control
2.2. Q-Learning
2.3. DQN (Deep Q-Network)
2.4. AprilTag
3. Autonomous Mobile Robot Design
3.1. Design Idea
3.2. Hardware Architecture
3.3. Software Architecture
3.4. The Training Scenarios
4. Experiments
4.1. Results of Training with 32 Neurons
4.2. Results of Training with 50 Neurons
4.3. Results of Training with 64 Neurons
4.4. Simulation Experiment Results
4.5. Real Robot Experiment Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Chen, C.S.; Lin, C.J.; Lai, C.C. Non-Contact Service Robot Development in Fast-Food Restaurants. IEEE Access 2022, 10, 31466–31479.
- Yan, C.; Chen, G.; Li, Y.; Sun, F.; Wu, Y. Immune Deep Reinforcement Learning-Based Path Planning for Mobile Robot in Unknown Environment. Appl. Soft Comput. 2023, 145, 110601.
- Bailey, T.; Nieto, J.; Guivant, J.; Stevens, M.; Nebot, E. Consistency of the EKF-SLAM Algorithm. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Beijing, China, 9–15 October 2006; pp. 3562–3568.
- Grisetti, G.; Stachniss, C.; Burgard, W. Improved Techniques for Grid Mapping with Rao-Blackwellized Particle Filters. IEEE Trans. Robot. 2007, 23, 34–46.
- Da Silva, B.M.F.; Xavier, R.S.; Do Nascimento, T.P.; Goncalves, L.M.G. Experimental Evaluation of ROS-Compatible SLAM Algorithms for RGB-D Sensors. In Proceedings of the Latin American Robotics Symposium (LARS), Curitiba, Brazil, 6–10 November 2017; pp. 1–6.
- Hu, F.; Wu, G. Distributed Error Correction of EKF Algorithm in Multisensory Fusion Localization Model. IEEE Access 2020, 8, 93211–93218.
- Sinisa, M. Evaluation of SLAM Methods and Adaptive Monte Carlo Localization. Doctoral Dissertation, Vienna University of Technology, Vienna, Austria, 2022. Available online: https://api.semanticscholar.org/CorpusID:251380609 (accessed on 20 January 2025).
- Lee, D.; Lee, S.-J.; Seo, Y.-J. Application of Recent Developments in Deep Learning to ANN-Based Automatic Berthing Systems. Int. J. Eng. Technol. Innov. 2020, 10, 75–90.
- Yang, Y.; Li, J.; Peng, L. Multi-Robot Path Planning Based on a Deep Reinforcement Learning DQN Algorithm. CAAI Trans. Intell. Technol. 2020, 5, 177–183.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- Badia, A.P.; Sprechmann, P.; Vitvitskyi, A.; Guo, D.; Piot, B.; Kapturowski, S.; Tieleman, O.; Arjovsky, M.; Pritzel, A.; Bolt, A.; et al. Never Give Up: Learning Directed Exploration Strategies. arXiv 2020, arXiv:2002.06038.
- Zong, L.; Yu, Y.; Wang, J.; Liu, P.; Feng, W.; Dai, X.; Chen, L.; Gunawan, C.; Yun, S.J.; Amal, R.; et al. Oxygen-Vacancy-Rich Molybdenum Carbide MXene Nanonetworks for Ultrasound-Triggered and Capturing-Enhanced Sonocatalytic Bacteria Eradication. Biomaterials 2023, 296, 122074.
- Sivaranjani, A.; Vinod, B. Artificial Potential Field Incorporated Deep-Q-Network Algorithm for Mobile Robot Path Prediction. Intell. Autom. Soft Comput. 2023, 35, 1135–1150.
- Zhang, J.; Xu, Z.; Wu, J.; Chen, Q.; Wang, F. Lightweight Intelligent Autonomous Unmanned Vehicle Based on Deep Neural Network in ROS System. In Proceedings of the 2022 IEEE 5th International Conference on Information Systems and Computer Aided Education (ICISCAE), Dalian, China, 23–25 September 2022; pp. 679–684.
- Miyama, M. Robust Inference of Multi-Task Convolutional Neural Network for Advanced Driving Assistance by Embedding Coordinates. In Proceedings of the 8th World Congress on Electrical Engineering and Computer Systems and Science (EECSS), Prague, Czech Republic, 28–30 July 2022; pp. 105-1–105-9. Available online: https://api.semanticscholar.org/CorpusID:251856335 (accessed on 6 March 2025).
- Jebbar, M.; Maizate, A.; Abdelouahid, R.A. Moroccan’s Arabic Speech Training and Deploying Machine Learning Models with Teachable Machine. Procedia Comput. Sci. 2022, 203, 801–806.
- Copot, C.; Shi, L.; Smet, E.; Ionescu, C.; Vanlanduit, S. Comparison of Deep Learning Models in Position-Based Visual Servoing. In Proceedings of the 2022 IEEE 27th International Conference on Emerging Technologies and Factory Automation (ETFA), Stuttgart, Germany, 6–9 September 2022; pp. 1–4.
- Liu, J.; Rangwala, M.; Ahluwalia, K.; Ghajar, S.; Dhami, H.; Tokekar, P.; Tracy, B.; Williams, R. Intermittent Deployment for Large-Scale Multi-Robot Forage Perception: Data Synthesis, Prediction, and Planning. IEEE Trans. Autom. Sci. Eng. 2022, 21, 27–47. Available online: https://ieeexplore.ieee.org/document/9923747 (accessed on 6 March 2025).
- Lai, J.; Ramli, H.; Ismail, L.; Hasan, W. Real-Time Detection of Ripe Oil Palm Fresh Fruit Bunch Based on YOLOv4. IEEE Access 2022, 10, 95763–95770.
- Lin, H.Z.; Chen, H.H.; Choophutthakan, K.; Li, C.H. Autonomous Mobile Robot as a Cyber-Physical System Featuring Networked Deep Learning and Control. In Proceedings of the 2022 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Sapporo, Japan, 11–15 July 2022; pp. 268–274.
- Mandel, N.; Sandino, J.; Galvez-Serna, J.; Vanegas, F.; Milford, M.; Gonzalez, F. Resolution-Adaptive Quadtrees for Semantic Segmentation Mapping in UAV Applications. In Proceedings of the 2022 IEEE Aerospace Conference (AERO), Big Sky, MT, USA, 5–12 March 2022; pp. 1–17.
- Chen, Y.; Li, D.; Zhong, H.; Zhao, R. The Method for Automatic Adjustment of AGV’s PID Based on Deep Reinforcement Learning. J. Phys. Conf. Ser. 2022, 2320, 012008.
- Chen, S.-L.; Huang, L.-W. Using Deep Learning Technology to Realize the Automatic Control Program of Robot Arm Based on Hand Gesture Recognition. Int. J. Eng. Technol. Innov. 2021, 11, 241–250.
- Yin, Y.; Chen, Z.; Liu, G.; Guo, J. Mapless Local Path Planning Approach Using Deep Reinforcement Learning Framework. Sensors 2023, 23, 2036.
- Escobar-Naranjo, J.; Caiza, G.; Ayala, P.; Jordan, E.; Garcia, C.A.; Garcia, M.V. Autonomous Navigation of Robots: Optimization with DQN. Appl. Sci. 2023, 13, 7202.
- Li, J.; Chavez-Galaviz, J.; Azizzadenesheli, K.; Mahmoudian, N. Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller. Sensors 2023, 23, 3572.
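For reference, the symbols tabulated below parameterize the standard Q-learning update rule, which in its usual form reads

$$
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]
$$

with the symbols defined as follows: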
| Symbol | Description |
|---|---|
| $Q(s,a)$ | The Q value of executing action $a$ in the current state $s$. |
| $\alpha$ | The learning rate, which controls how strongly new information updates the existing Q value. |
| $r$ | The immediate reward obtained by executing action $a$. |
| $\gamma$ | The discount factor, which balances immediate rewards against future rewards. |
| $\max_{a'} Q(s',a')$ | The maximum Q value attainable in the new state $s'$. |
| $s'$ | The new state reached after executing action $a$. |
The Q-table is a matrix indexed by states and actions:

| Symbol | Meaning |
|---|---|
| $s$ | Number of states |
| $a$ | Number of actions |

A DQN replaces the Q-table of tabular Q-learning with a deep neural network:

| Q-Learning | Deep learning |
|---|---|
| Q-value | Output of the deep network |
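As a minimal sketch of this correspondence, the tabular update below implements the equation above directly; the state/action counts, learning rate, and discount factor are illustrative placeholders, not values from the paper.

```python
import numpy as np

# Illustrative sizes and hyperparameters (placeholders, not the paper's values).
n_states, n_actions = 100, 5
alpha, gamma = 0.1, 0.9

# Q-table: one row per state, one column per action.
Q = np.zeros((n_states, n_actions))

def q_update(s: int, a: int, r: float, s_next: int) -> None:
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

# Example transition: action 2 taken in state 3 yields reward 1.0 and lands in state 7.
q_update(s=3, a=2, r=1.0, s_next=7)
```

A DQN swaps the `Q` array for a neural network so that continuous or high-dimensional states (e.g., laser scans) can be handled without enumerating a table.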
Training results for the three hidden-layer sizes (Sections 4.1–4.3):

| Neurons | Training episodes | Successes | Failures | Time (min) | Success rate | Trials |
|---|---|---|---|---|---|---|
| 32 | 500 | 240 | 10 | 108 | 96.00% | 250 |
| 50 | 500 | 249 | 1 | 108 | 99.60% | 250 |
| 64 | 500 | 249 | 1 | 200 | 99.60% | 250 |
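To make the neuron-count comparison concrete, here is a minimal PyTorch sketch of a Q-network with a configurable hidden-layer width (32, 50, or 64 as in the table) together with the temporal-difference target it is trained against. The state and action dimensions are hypothetical placeholders; the paper's actual architecture, reward design, and ROS integration are not reproduced here.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per discrete action."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden),  # hidden width: 32, 50, or 64 in the experiments
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def td_target(reward, next_state, done, target_net, gamma=0.99):
    """Compute r + gamma * max_a' Q_target(s', a'), zeroing the bootstrap term at episode end."""
    with torch.no_grad():
        max_next_q = target_net(next_state).max(dim=1).values
    return reward + gamma * max_next_q * (1.0 - done)

# Hypothetical dimensions for illustration: a 24-dimensional state (e.g., range readings
# plus docking-tag pose) and 5 discrete wheel commands.
q_net = QNetwork(state_dim=24, n_actions=5, hidden=50)
target_net = QNetwork(state_dim=24, n_actions=5, hidden=50)
target_net.load_state_dict(q_net.state_dict())  # target network starts as a frozen copy
```

During training, the online network is regressed toward `td_target` on minibatches drawn from a replay buffer, and the target network's weights are refreshed periodically; the hidden width trades off training time against success rate, as the table above shows.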