Differential Evolution Deep Reinforcement Learning Algorithm for Dynamic Multiship Collision Avoidance with COLREGs Compliance
Abstract
1. Introduction
- DQN is used to search for a path in global path planning. DE is then applied to optimize and smooth the path that DQN finds, so that the combined DE and DQN procedure yields a shorter global path.
- A CPRM is proposed for local path planning that takes both the course angle and the position into account. Integrated with DQN, it constrains the agent so that it reaches its destination in fewer steps in dynamic multi-ship local path planning.
- DEDRL is evaluated against six other algorithms, covering reinforcement learning and other intelligent algorithms. It attains the shortest average path length in global path planning and in each of the three local encounter scenarios.
2. Materials and Methods
2.1. Global and Local Path Planning
2.2. Deep Q-Network
2.3. Differential Evolution
- (1) Initialization: The population is initialized at random within the search domain.
- (2) Mutation: DE generates a mutant vector for each individual in the current population, using a prevalent mutation operator, "DE/rand/1".
- (3) Crossover: The trial vector is produced by recombining the mutant vector with the target vector.
- (4) Selection: The target vector is replaced by the trial vector when the trial vector's fitness surpasses that of the target, which is a greedy replacement rule.
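The four steps above can be sketched in a few lines of Python. This is a minimal illustration assuming the classic DE/rand/1 mutation with binomial crossover. The scale factor `F`, crossover rate `CR`, population size, generation count, the clipping of mutants to the search domain, and the `sphere` test function are all illustrative assumptions, not settings taken from this paper.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=100, seed=0):
    """Minimize `fitness` with DE/rand/1/bin (hypothetical helper).

    Returns (best_vector, best_value, history of best value per generation).
    """
    rng = random.Random(seed)
    dim = len(bounds)

    # (1) Initialization: sample every component uniformly inside its bounds.
    pop = [[lo + rng.random() * (hi - lo) for lo, hi in bounds]
           for _ in range(pop_size)]
    fit = [fitness(x) for x in pop]
    history = [min(fit)]

    for _ in range(generations):
        new_pop, new_fit = list(pop), list(fit)
        for i in range(pop_size):
            # (2) Mutation, DE/rand/1: v = x_r1 + F * (x_r2 - x_r3),
            # with r1, r2, r3 mutually distinct and different from i.
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            mutant = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
                      for j in range(dim)]
            # Assumed repair step: clip the mutant back into the search domain.
            mutant = [min(max(v, lo), hi)
                      for v, (lo, hi) in zip(mutant, bounds)]

            # (3) Crossover (binomial): take the mutant component with
            # probability CR; index j_rand always comes from the mutant.
            j_rand = rng.randrange(dim)
            trial = [mutant[j] if (rng.random() < CR or j == j_rand)
                     else pop[i][j] for j in range(dim)]

            # (4) Selection (greedy): the trial replaces the target only
            # when its fitness is strictly better (lower, for minimization).
            trial_fit = fitness(trial)
            if trial_fit < fit[i]:
                new_pop[i], new_fit[i] = trial, trial_fit
        pop, fit = new_pop, new_fit
        history.append(min(fit))

    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best], history


def sphere(x):
    """Illustrative objective only: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


bounds = [(-5.0, 5.0)] * 3
best_x, best_f, history = differential_evolution(sphere, bounds)
print(best_f)
```

Because selection is greedy, the best fitness in the population can never get worse from one generation to the next, which is why `history` is non-increasing.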
2.4. COLREGs and Ship Maneuverability Restriction
2.4.1. COLREGs
- Overtaking: The Own Ship (OS) is overtaking when it approaches the Target Ship (TS) from astern, that is, from a direction more than 22.5 degrees abaft the TS's beam. To avoid a potential collision in this situation, the OS must alter its heading to starboard or port.
- Crossing: When two power-driven vessels are crossing so as to involve risk of collision, the vessel that has the other on her starboard side must give way. The OS therefore alters course to starboard when it encounters a TS on its starboard side, which is a starboard-side crossing. The give-way vessel should also avoid crossing ahead of the other vessel whenever circumstances permit.
- Head-on: A head-on situation exists when the OS and the TS are on opposite or nearly opposite courses, so that each vessel sees the other ahead or nearly ahead. To avert a collision, both the OS and the TS alter course to starboard, so that each passes on the port side of the other.
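The three encounter types above can be told apart from relative bearings and headings. The sketch below is one possible classifier, not the procedure used in this paper. The 22.5 degree sector comes from the overtaking rule quoted above. The 6 degree head-on tolerance, the coordinate convention (x east, y north, compass headings in degrees clockwise from north), and all function names are assumptions made for illustration. It also assumes that risk of collision has already been established by a separate check.

```python
import math


def bearing_deg(from_xy, to_xy):
    """Compass bearing from one point to another, in [0, 360).

    Assumed convention: x points east, y points north.
    """
    dx = to_xy[0] - from_xy[0]
    dy = to_xy[1] - from_xy[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def classify_encounter(os_xy, os_heading, ts_xy, ts_heading, head_on_tol=6.0):
    """Label the encounter from the Own Ship's point of view (hypothetical helper)."""
    # Where the OS sits relative to the TS's bow. The TS's beam is at 90 and
    # 270 degrees, so "more than 22.5 degrees abaft the beam" is the open
    # sector (112.5, 247.5).
    rel_from_ts = (bearing_deg(ts_xy, os_xy) - ts_heading) % 360.0
    if 112.5 < rel_from_ts < 247.5:
        return "overtaking"

    # Where the TS sits relative to the OS's bow, and how far apart the
    # two headings are (0 to 180 degrees).
    rel_from_os = (bearing_deg(os_xy, ts_xy) - os_heading) % 360.0
    heading_diff = abs((ts_heading - os_heading + 180.0) % 360.0 - 180.0)

    # Head-on: nearly reciprocal courses with the TS nearly dead ahead.
    # head_on_tol is an assumed tolerance; the rule text gives no number.
    ts_ahead = rel_from_os <= head_on_tol or rel_from_os >= 360.0 - head_on_tol
    if heading_diff >= 180.0 - head_on_tol and ts_ahead:
        return "head-on"

    if rel_from_os <= 112.5:
        return "crossing (give way)"    # TS on the OS's starboard side
    if rel_from_os >= 247.5:
        return "crossing (stand on)"    # TS on the OS's port side
    return "being overtaken"            # TS is abaft the OS's beam


# OS at the origin steering north in every example.
print(classify_encounter((0, 0), 0.0, (0, 5), 0.0))     # TS ahead, same course
print(classify_encounter((0, 0), 0.0, (0, 5), 180.0))   # TS ahead, opposite course
print(classify_encounter((0, 0), 0.0, (5, 5), 270.0))   # TS to starboard, heading west
print(classify_encounter((0, 0), 0.0, (-5, 5), 90.0))   # TS to port, heading east
```

Checking the overtaking sector first matters: a vessel coming up from astern is overtaking regardless of what the later bearing tests would say.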
2.4.2. Ship Maneuverability Restriction
2.5. Dynamical Collision Risk Detection
3. Proposed DEDRL Algorithm
3.1. Global Path Planning Utilizing DQN and DE
3.1.1. State Space
3.1.2. Action Space
3.1.3. Reward Function
3.2. Local Path Planning with CPRM
3.2.1. State Space
Algorithm 1: DEDRL (Global path planning)
3.2.2. Action Space
3.2.3. Reward Function
Algorithm 2: DEDRL (Local path planning)
4. Experimental Results
4.1. Environment Map Construction
4.2. Evaluation Metrics and Parameters Setting
4.3. Algorithms Comparison in Global Path Planning
4.4. Algorithms Comparison in Local Path Planning
5. Discussion
Ablation Analysis
6. Conclusions and Outlook
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Metrics (global path planning) | Time/s: DEDRL | Time/s: Q-Learning | Time/s: Sarsa | Time/s: RRT | Time/s: RRTe | Time/s: RRTes | Time/s: FFC-A* | Length/n miles: DEDRL | Length/n miles: Q-Learning | Length/n miles: Sarsa | Length/n miles: RRT | Length/n miles: RRTe | Length/n miles: RRTes | Length/n miles: FFC-A* |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Average | 31.8557 | 17.9915 | 18.3151 | 70.6179 | 262.1404 | 488.1923 | 0.2034 | 28.6943 | 30.8057 | 30.5759 | 35.2297 | 28.8783 | 30.1380 | 29.2346 |
Maximum | 46.8148 | 20.9226 | 32.8669 | 176.6006 | 337.8111 | 611.7446 | 0.2370 | 28.9354 | 33.0558 | 32.8669 | 38.5772 | 29.1695 | 32.0473 | 29.2346 |
Minimum | 22.3778 | 15.8740 | 15.7925 | 41.0793 | 135.6768 | 360.8642 | 0.1890 | 28.4539 | 30.3213 | 30.3213 | 31.2561 | 28.6456 | 29.0754 | 29.2346 |
Situation | OS position/n miles | OS heading/° | OS speed/kn | TS position/n miles | TS heading/° | TS speed/kn | Distance/n miles |
---|---|---|---|---|---|---|---|
Overtaking | (0.0, 0.0) | 49.3 | 12.6 | (8.57, 10.0) | 49.3 | 4.2 | 11 |
Crossing | (34.0, 40.0) | 49.3 | 12.6 | (47.6, 48.2) | 139.3 | 4.2 | 13 |
Head-on | (72.0, 81.0) | 34.1 | 12.6 | (89.1, 92.6) | 214.1 | 4.2 | 17 |
Metrics (local path planning, overtaking) | Time/s: DEDRL | Time/s: Q-Learning | Time/s: Sarsa | Time/s: RRT | Time/s: RRTe | Time/s: RRTes | Time/s: FFC-A* | Length/n miles: DEDRL | Length/n miles: Q-Learning | Length/n miles: Sarsa | Length/n miles: RRT | Length/n miles: RRTe | Length/n miles: RRTes | Length/n miles: FFC-A* |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Average | 24.1769 | 4.8217 | 4.7234 | 12.8933 | 70.0088 | 128.7108 | 0.2594 | 9.2443 | 12.6861 | 14.5729 | 13.3172 | 9.3392 | 9.6350 | 9.3373 |
Maximum | 34.7880 | 6.5335 | 5.1490 | 35.6695 | 110.0946 | 171.4856 | 0.2996 | 9.2643 | 16.2569 | 16.9765 | 25.7088 | 9.3665 | 10.3167 | 9.3373 |
Minimum | 12.4860 | 3.6540 | 4.0960 | 6.7950 | 54.7665 | 91.9505 | 0.2370 | 9.2335 | 10.4225 | 12.5314 | 10.4541 | 9.3180 | 9.3171 | 9.3373 |
Metrics (local path planning, crossing) | Time/s: DEDRL | Time/s: Q-Learning | Time/s: Sarsa | Time/s: RRT | Time/s: RRTe | Time/s: RRTes | Time/s: FFC-A* | Length/n miles: DEDRL | Length/n miles: Q-Learning | Length/n miles: Sarsa | Length/n miles: RRT | Length/n miles: RRTe | Length/n miles: RRTes | Length/n miles: FFC-A* |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Average | 17.0493 | 2.5412 | 2.5048 | 9.1258 | 162.4348 | 122.0941 | 0.6230 | 6.1476 | 7.8918 | 9.3052 | 8.4623 | 6.2396 | 6.2597 | 6.1645 |
Maximum | 21.5565 | 3.9015 | 3.7220 | 12.5030 | 269.7891 | 174.9756 | 0.6550 | 6.1693 | 10.5152 | 11.3638 | 10.5542 | 6.2649 | 6.3044 | 6.1645 |
Minimum | 10.3920 | 1.7430 | 2.1235 | 3.8760 | 84.6595 | 85.3990 | 0.5970 | 6.1371 | 6.8426 | 7.6912 | 7.2373 | 6.2237 | 6.2250 | 6.1645 |
Metrics (local path planning, head-on) | Time/s: DEDRL | Time/s: Q-Learning | Time/s: Sarsa | Time/s: RRT | Time/s: RRTe | Time/s: RRTes | Time/s: FFC-A* | Length/n miles: DEDRL | Length/n miles: Q-Learning | Length/n miles: Sarsa | Length/n miles: RRT | Length/n miles: RRTe | Length/n miles: RRTes | Length/n miles: FFC-A* |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Average | 7.4574 | 3.1644 | 2.5608 | 83.9183 | 298.8182 | 313.7402 | 0.2485 | 6.7938 | 10.5668 | 10.6684 | 7.1160 | 6.7941 | 11.7772 | 6.9877 |
Maximum | 9.3560 | 5.4490 | 3.1555 | 114.3611 | 623.3218 | 462.7472 | 0.2720 | 6.8015 | 13.6207 | 14.6148 | 7.6202 | 6.7972 | 13.2857 | 6.9877 |
Minimum | 6.5970 | 2.3745 | 2.1870 | 48.6670 | 116.4556 | 214.5826 | 0.2270 | 6.7869 | 7.8266 | 7.8266 | 6.8719 | 6.7905 | 10.7456 | 6.9877 |
Path length/n miles | Global: DEDRL | Global: DQN | Local, overtaking: DEDRL | Local, overtaking: DQN | Local, crossing: DEDRL | Local, crossing: DQN | Local, head-on: DEDRL | Local, head-on: DQN |
---|---|---|---|---|---|---|---|---|
Average | 28.6943 | 32.0265 | 9.2443 | 9.3683 | 6.1476 | 6.2261 | 6.7938 | 6.9112 |
Maximum | 28.9354 | 32.3053 | 9.2643 | 9.4217 | 6.1693 | 6.2756 | 6.8015 | 6.9671 |
Minimum | 28.4539 | 31.3758 | 9.2335 | 9.3029 | 6.1371 | 6.1401 | 6.7869 | 6.8380 |
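To read the ablation figures as relative gains, the average path lengths in the table above can be converted into the percentage by which DEDRL shortens the path compared with plain DQN. The short script below does only that arithmetic. The numbers are copied from the "Average" row, while the variable names and the choice of DQN as the baseline for the percentage are assumptions of this sketch.

```python
# Average path length in n miles, copied from the "Average" row above:
# scenario -> (DEDRL, DQN).
avg_length = {
    "Global": (28.6943, 32.0265),
    "Overtaking": (9.2443, 9.3683),
    "Crossing": (6.1476, 6.2261),
    "Head-on": (6.7938, 6.9112),
}

# Relative reduction of DEDRL with respect to DQN, in percent.
reduction = {
    scenario: 100.0 * (dqn - dedrl) / dqn
    for scenario, (dedrl, dqn) in avg_length.items()
}

for scenario, pct in reduction.items():
    print(f"{scenario}: {pct:.2f}% shorter than DQN")
```

On these averages the reduction is roughly 10.4% for global path planning and between about 1.3% and 1.7% for the three local encounter scenarios, so most of the path-length gain from adding DE appears in the global stage.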
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shen, Y.; Liao, Z.; Chen, D. Differential Evolution Deep Reinforcement Learning Algorithm for Dynamic Multiship Collision Avoidance with COLREGs Compliance. J. Mar. Sci. Eng. 2025, 13, 596. https://doi.org/10.3390/jmse13030596