Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm
Abstract
1. Introduction
2. PPO Algorithm Model
2.1. PPO Optimization Algorithm
2.2. State Model
2.3. The Reward and Punishment Function of an Unsignalized Roundabout
3. Results and Discussion
3.1. Experimental Environment
3.2. Simulation and Analysis
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Category | Variable | Characteristics |
---|---|---|
Safety | Lateral offset | Left–right offset, offset range |
Safety | Lane change time | Average time |
Safety | Vehicle collision | Collision or not |
Safety | Vehicle spacing | Minimum spacing |
Comfort level | Lateral acceleration | Maximum value, average value |
Comfort level | Longitudinal acceleration | Maximum value, average value |
Comfort level | Vehicle deviation angle | Offset angle, offset range |
Driving efficiency | Speed | Maximum value, average value |
Driving efficiency | Vehicle acceleration | Maximum value, average value |
Driving efficiency | Transit time | Average value |

Total training steps | | 300,000 | 300,000 | 300,000 | 600,000 | 600,000 | 600,000 |
---|---|---|---|---|---|---|---|
Scenes | | 20 | 100 | 200 | 20 | 100 | 200 |
Average transit time (s) | PPO+CCMR algorithm | 34.42 | 34.86 | 41.81 | 31.61 | 31.97 | 39.61 |
Average transit time (s) | Optimized PPO algorithm | 33.62 | 33.95 | 36.51 | 30.86 | 31.13 | 34.08 |
Reduction in average transit time | | 2.32% | 2.61% | 12.67% | 2.37% | 2.62% | 13.96% |
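
The reduction row in the table above can be reproduced from the two transit-time rows. The sketch below assumes the reduction is the relative decrease of the optimized PPO time with respect to the PPO+CCMR time; the source does not state the formula, and the variable and function names are illustrative only.

```python
# Average transit times (s), copied from the table above.
# Column order: 300,000 steps (20, 100, 200 scenes), then 600,000 steps.
ppo_ccmr = [34.42, 34.86, 41.81, 31.61, 31.97, 39.61]
optimized = [33.62, 33.95, 36.51, 30.86, 31.13, 34.08]

# Reduction percentages as printed in the table, for comparison.
reported = [2.32, 2.61, 12.67, 2.37, 2.62, 13.96]


def relative_reduction(baseline, improved):
    """Percentage decrease of `improved` relative to `baseline` (assumed formula)."""
    return (baseline - improved) / baseline * 100.0


reductions = [relative_reduction(b, o) for b, o in zip(ppo_ccmr, optimized)]

for computed, table_value in zip(reductions, reported):
    print(f"computed {computed:.3f}%  reported {table_value:.2f}%")
```

Under this assumption the computed values match the reported ones to within about 0.01 percentage points. The two small differences (columns 3 and 5) are consistent with the table truncating rather than rounding to two decimals.
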

Total training steps | | 300,000 | 300,000 | 300,000 | 600,000 | 600,000 | 600,000 |
---|---|---|---|---|---|---|---|
Scenes | | 20 | 100 | 200 | 20 | 100 | 200 |
Traffic ratio (inner:outer) | | 5:5 | 5:5 | 6:4 | 9:1 | 9:1 | 8:2 |
Average time (s) | Inner | 31.30 | 31.55 | 35.15 | 30.48 | 30.74 | 33.52 |
Average time (s) | Outer | 35.94 | 36.35 | 38.55 | 34.26 | 34.62 | 36.31 |

Total training steps | | 300,000 | 300,000 | 300,000 | 600,000 | 600,000 | 600,000 |
---|---|---|---|---|---|---|---|
Scenes | | 20 | 100 | 200 | 20 | 100 | 200 |
PPO algorithm | Successes | 2 | 2 | 1 | 4 | 4 | 3 |
PPO algorithm | Failures | 18 | 18 | 19 | 16 | 16 | 17 |
PPO algorithm | Success rate | 10% | 10% | 5% | 20% | 20% | 15% |
PPO+CCMR algorithm | Successes | 9 | 9 | 7 | 16 | 15 | 13 |
PPO+CCMR algorithm | Failures | 11 | 11 | 13 | 4 | 5 | 7 |
PPO+CCMR algorithm | Success rate | 45% | 45% | 35% | 80% | 75% | 65% |
Optimized PPO algorithm | Successes | 9 | 9 | 8 | 16 | 15 | 14 |
Optimized PPO algorithm | Failures | 11 | 11 | 12 | 4 | 5 | 6 |
Optimized PPO algorithm | Success rate | 45% | 45% | 40% | 80% | 75% | 70% |
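
The success rates in the table above follow from the success and failure counts, which sum to 20 in every column. A minimal sketch of that arithmetic, with the rate taken as successes divided by total trials (the dictionary layout and function name are illustrative, not from the paper):

```python
# Success and failure counts copied from the table above.
# Column order: 300,000 steps (20, 100, 200 scenes), then 600,000 steps.
results = {
    "PPO": ([2, 2, 1, 4, 4, 3], [18, 18, 19, 16, 16, 17]),
    "PPO+CCMR": ([9, 9, 7, 16, 15, 13], [11, 11, 13, 4, 5, 7]),
    "Optimized PPO": ([9, 9, 8, 16, 15, 14], [11, 11, 12, 4, 5, 6]),
}


def success_rate(successes, failures):
    """Success rate in percent: successes over total trials."""
    return 100.0 * successes / (successes + failures)


rates = {
    name: [success_rate(s, f) for s, f in zip(succ, fail)]
    for name, (succ, fail) in results.items()
}

for name, values in rates.items():
    print(name, [f"{v:.0f}%" for v in values])
```
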
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gan, J.; Zhang, J.; Liu, Y. Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm. Appl. Sci. 2024, 14, 2889. https://doi.org/10.3390/app14072889