Path Planning for Automatic Berthing Using Ship-Maneuvering Simulation-Based Deep Reinforcement Learning
Abstract
1. Introduction
2. Mathematical Model
2.1. Coordinated System
2.2. Mathematical Model of USV
2.3. Hydrodynamic and Interaction Coefficients
2.4. Maneuverability
3. Path-Planning Approach
- The use of twin critic networks, which work in pairs so that the smaller of the two value estimates is used as the learning target.
- Delayed updates of the actor network.
- Action noise regularization.
3.1. Conception
Algorithm 1: Pseudocode of the TD3 Algorithm | |
1 | Initialize the critic networks Q_θ1, Q_θ2 and the actor network π_φ with random parameters θ1, θ2, φ |
2 | Initialize the target parameters to the main parameters: θ1′ ← θ1, θ2′ ← θ2, φ′ ← φ |
3 | Repeat: |
4 | Observe the state s of the environment and choose the action a = clip(π_φ(s) + ε, a_low, a_high), with exploration noise ε ~ N(0, σ) |
5 | Execute action a in the TD3 environment and observe the new state s′, the reward r, and the done signal d that gives the signal to stop training for this step. |
6 | Store the training set (s, a, r, s′, d) in the replay buffer D |
7 | If the goal point is reached, reset the environment state |
8 | If it is time for an update: for j in range (custom decided) do: |
9 | Randomly sample a batch of transitions B = {(s, a, r, s′, d)} from D |
10 | Compute target actions: a′(s′) = clip(π_φ′(s′) + clip(ε, −c, c), a_low, a_high), ε ~ N(0, σ̃) |
11 | Compute targets: y(r, s′, d) = r + γ(1 − d) min_{i=1,2} Q_θi′(s′, a′(s′)) |
12 | Update the Q-functions by one step of gradient descent on the loss (Q_θi(s, a) − y(r, s′, d))², for i = 1, 2 |
13 | If j mod policy_delay == 0 then: update the policy by one-step deterministic policy gradient ascent using ∇_φ Q_θ1(s, π_φ(s)) |
14 | Update target networks: θi′ ← ρ θi′ + (1 − ρ) θi, φ′ ← ρ φ′ + (1 − ρ) φ |
15 | End if; end for; end if; until convergence |
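The core quantities of Algorithm 1 (target actions, Bellman targets, and the soft target update) can be illustrated with a minimal plain-Python sketch. The helper names are hypothetical, and scalar values stand in for the neural-network outputs used in the paper; this is not the authors' implementation.

```python
import random

def target_policy_action(mu_next, sigma=0.2, clip_c=0.5, a_low=-1.0, a_high=1.0):
    """Line 10: add clipped Gaussian smoothing noise to the target actor's
    output, then clip the result to the valid action range."""
    eps = max(-clip_c, min(clip_c, random.gauss(0.0, sigma)))
    return max(a_low, min(a_high, mu_next + eps))

def td3_target(reward, done, next_q1, next_q2, gamma=0.99):
    """Line 11: the Bellman target uses the smaller of the two target-critic
    estimates, which curbs the overestimation bias of a single critic."""
    return reward + gamma * (1.0 - done) * min(next_q1, next_q2)

def polyak_update(target_params, main_params, rho=0.995):
    """Line 14: soft (Polyak) target update theta' <- rho*theta' + (1-rho)*theta.
    The paper's soft update coefficient of 0.005 corresponds to rho = 0.995."""
    return [rho * t + (1.0 - rho) * m for t, m in zip(target_params, main_params)]
```

Taking the minimum of the two critics in `td3_target` and delaying the actor update (line 13) are the two changes that distinguish TD3 from DDPG.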
3.2. Setting for Reinforcement Learning
- Observation space and state: The observation space and state were defined as the set of physical velocity, position, and orientation. The state vector includes the position (x, y), the heading angle ψ as the orientation, the surge and sway velocities (u, v), and the yaw rate r;
- Action: The control action consists of the control inputs of the thruster (propeller revolution) and steering (rudder angle) systems. The action signal is continuous in the range [−1, 1], which is mapped to [−300, 100] rpm for the thrust system and to [−35, 35] degrees for the steering system.
- Reward function: This plays a crucial role in the design of a reinforcement learning application. It serves as a guide for the network training process and helps optimize the model’s performance throughout each episode. If the reward function does not accurately capture the objectives of the target task, the model may struggle to achieve desirable performance.
- Environment: The environment receives the control input and the current state, and returns the ship’s new state and the reward for that action. The environment function was built on a maneuvering simulation that uses the MMG model as its mathematical model.
- Agent: The hyperparameters of the TD3 model were selected as follows: two hidden layers with 512 units each; learning rates α and β of 0.0001 for the actor and critic networks; a discount factor of 0.99; a soft update coefficient of 0.005; and a batch size of 128. The training process ran for 50,000 steps, of which the first 20,000 served as the warmup of the model, with the exploration noise set as in Table 5.
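The action mapping described above can be sketched as follows. The linear scaling is an assumption, since the text only gives the endpoint ranges, and the function names are hypothetical:

```python
def scale_action(a, low, high):
    """Linearly map a normalized action a in [-1, 1] to the range [low, high]."""
    a = max(-1.0, min(1.0, a))           # clip to the valid normalized range
    return low + (a + 1.0) * 0.5 * (high - low)

def to_controls(action):
    """Map a 2-D normalized action to physical control inputs:
    propeller revolution in [-300, 100] rpm, rudder angle in [-35, 35] deg."""
    n_rpm = scale_action(action[0], -300.0, 100.0)
    delta_deg = scale_action(action[1], -35.0, 35.0)
    return n_rpm, delta_deg
```

Note that under this linear mapping a zero thruster action corresponds to −100 rpm, not to a stopped propeller, because the rpm range given in the text is asymmetric.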
3.3. Boundary Conditions
4. Simulation Results and Discussion
5. Conclusions and Remarks
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Nomenclature
Unit | Item
---|---
- | Effective inflow angle to the rudder
 | Drift angle of the ship
 | Inflow angle to the propeller
 | Inflow angle to the rudder
- | Flow straightening coefficient of the rudder
 | Rudder angle
- | Propeller diameter to rudder span ratio
- | Rudder aspect ratio
- | Experimental coefficient for the longitudinal inflow velocity to the rudder
 | Displacement of the ship
 | Heading angle of the ship
 | Water density
- | Ratio of the wake fraction at the propeller to that at the rudder
 | Rudder profile area
- | Increase factor of the rudder force
 | Breadth of the ship
 | Average rudder chord length
- | Experimental constant due to the wake characteristic in maneuvering
 | Diameter of the propeller
 | Draft of the ship
 | Normal force of the rudder
 | Surge and sway forces acting on the ship
- | Lift gradient coefficients of the rudder
 | Span of the rudder
 | Moment of inertia of the ship
- | Advance ratio of the propeller
- | Propeller thrust open-water characteristic
- | Coefficients relative to
 | Ship length between perpendiculars
 | Effective longitudinal length of the rudder position
 | Yaw moment acting on the ship
 | Ship mass
 | Propeller revolutions per minute (rpm)
- | Earth-fixed coordinate system
- | Body-fixed coordinate system
- | Resistance of the ship in straight motion
 | Yaw rate
 | Reward
 | Longitudinal propeller force
 | Time
- | Thrust deduction factor
- | Steering deduction factor
 | Resultant velocity
 | Initial resultant velocity
 | Resultant inflow velocity to the rudder
 | Longitudinal and lateral velocities of the ship in the body-fixed coordinate system
 | Longitudinal and lateral inflow velocities at the rudder position
- | Wake coefficient at the propeller in maneuvering motion
- | Wake coefficient at the propeller in straight motion
- | Wake coefficient at the rudder position
 | Surge force, sway force, and yaw moment around the midship
 | Surge force, sway force, and yaw moment acting on the ship’s hull
 | Surge force, sway force, and yaw moment due to the propeller
 | Surge force, sway force, and yaw moment due to the rudder
 | Longitudinal position of the center of gravity
 | Longitudinal position of the acting point of the additional lateral force
 | Longitudinal position of the propeller
 | Longitudinal position of the rudder
Abbreviations
Item | |
AI | Artificial Intelligence |
CFD | Computational Fluid Dynamics |
DDPG | Deep Deterministic Policy Gradients |
DRL | Deep Reinforcement Learning |
MMG | Maneuvering Modeling Group |
RL | Reinforcement Learning |
TD3 | Twin Delayed DDPG (a variant of the DDPG algorithm) |
USV | Unmanned Surface Vehicle |
References
- Chaal, M.; Ren, X.; BahooToroody, A.; Basnet, S.; Bolbot, V.; Banda, O.A.V.; van Gelder, P. Research on risk, safety, and reliability of autonomous ships: A bibliometric review. In Safety Science (Vol. 167); Elsevier B.V.: Amsterdam, The Netherlands, 2023. [Google Scholar] [CrossRef]
- Oh, K.G.; Hasegawa, K. Low speed ship manoeuvrability: Mathematical model and its simulation. In Proceedings of the International Conference on Offshore Mechanics and Arctic Engineering—OMAE, Nantes, France, 9–14 June 2013; p. 9. [Google Scholar] [CrossRef]
- Shouji, K. An Automatic Berthing Study by Optimal Control Techniques. IFAC Proc. Vol. 1992, 25, 185–194. [Google Scholar] [CrossRef]
- Skjåstad, K.G.; Barisic, M. Automated Berthing (Parking) of Autonomous Ships. Ph.D. Thesis, NTNU, Trondheim, Norway, 2018. [Google Scholar]
- Mizuno, N.; Uchida, Y.; Okazaki, T. Quasi real-time optimal control scheme for automatic berthing. IFAC-PapersOnLine 2015, 28, 305–312. [Google Scholar] [CrossRef]
- Nguyen, V.S.; Im, N.K. Automatic ship berthing based on fuzzy logic. Int. J. Fuzzy Log. Intell. Syst. 2019, 19, 163–171. [Google Scholar] [CrossRef]
- Zhang, Y.; Zhang, M.; Zhang, Q. Auto-berthing control of marine surface vehicle based on concise backstepping. IEEE Access 2020, 8, 197059–197067. [Google Scholar] [CrossRef]
- Sawada, R.; Hirata, K.; Kitagawa, Y.; Saito, E.; Ueno, M.; Tanizawa, K.; Fukuto, J. Path following algorithm application to automatic berthing control. J. Mar. Sci. Technol. 2021, 26, 541–554. [Google Scholar] [CrossRef]
- Wu, G.; Zhao, M.; Cong, Y.; Hu, Z.; Li, G. Algorithm of berthing and maneuvering for catamaran unmanned surface vehicle based on ship maneuverability. J. Mar. Sci. Eng. 2021, 9, 289. [Google Scholar] [CrossRef]
- Im, N.; Seong Keon, L.; Hyung Do, B. An Application of ANN to Automatic Ship Berthing Using Selective Controller. Int. J. Mar. Navig. Saf. Sea Transp. 2007, 1, 101–105. [Google Scholar]
- Ahmed, Y.A.; Hasegawa, K. Automatic ship berthing using artificial neural network trained by consistent teaching data using nonlinear programming method. Eng. Appl. Artif. Intell. 2013, 26, 2287–2304. [Google Scholar] [CrossRef]
- Im, N.; Hasegawa, K. Automatic ship berthing using parallel neural controller. IFAC Proc. Vol. 2001, 34, 51–57. [Google Scholar] [CrossRef]
- Im, N.K.; Nguyen, V.S. Artificial neural network controller for automatic ship berthing using head-up coordinate system. Int. J. Nav. Archit. Ocean. Eng. 2018, 10, 235–249. [Google Scholar] [CrossRef]
- Marcelo, J.; Figueiredo, P.; Pereira, R.; Rejaili, A. Deep Reinforcement Learning Algorithms for Ship Navigation in Restricted Waters. Mecatrone 2018, 3, 151953. [Google Scholar] [CrossRef]
- Lee, D. Reinforcement Learning-Based Automatic Berthing System. arXiv 2021, arXiv:2112.01879. [Google Scholar]
- Fujimoto, S.; van Hoof, H.; Meger, D. Addressing Function Approximation Error in Actor-Critic Methods. Int. Conf. Mach. Learn. 2018, 80, 1587–1596. [Google Scholar]
- Yasukawa, H.; Yoshimura, Y. Introduction of MMG Standard Method for Ship Maneuvering Predictions. J. Mar. Sci. Technol. 2015, 20, 37–52. [Google Scholar] [CrossRef]
- Khanfir, S.; Hasegawa, K.; Nagarajan, V.; Shouji, K.; Lee, S.K. Manoeuvring characteristics of twin-rudder systems: Rudder-hull interaction effect on the manoeuvrability of twin-rudder ships. J. Mar. Sci. Technol. 2011, 16, 472–490. [Google Scholar] [CrossRef]
- Vo, A.K.; Mai, T.L.; Jeon, M.; Yoon, H.k. Experimental Investigation of the Hydrodynamic Characteristics of a Ship due to Bank Effect. Port. Res. 2022, 46, 294–301. [Google Scholar] [CrossRef]
- Kim, D.J.; Choi, H.; Kim, Y.G.; Yeo, D.J. Mathematical Model for Harbour Manoeuvres of Korea Autonomous Surface Ship (KASS) Based on Captive Model Tests. In Proceedings of the Conference of Korean Association of Ocean Science and Technology Societies, Incheon, Republic of Korea, 13–14 May 2021. [Google Scholar]
- Vo, A.K. Application of Deep Reinforcement Learning on Ship’s Autonomous Berthing Based on Maneuvering Simulation. Ph.D. Thesis, Changwon National University, Changwon, Republic of Korea, 2022. [Google Scholar]
Item (Unit) | Value |
---|---|
Length between perpendiculars, () | 22.000
Breadth, () | 6.000 |
Draft, () | 1.250 |
Displacement Volume, () | 86.681 |
Rudder area, () | 0.518 |
Rudder span, () | 0.900 |
Propeller diameter, () | 0.950
Hull | |
---|---|---
−81 | −1034 | 64
−627 | −126 | −33
−407 | −2610 | −130
675 | −3530 | −513
226 | 3080 | −2
390 | −138 |
−47 | −178 |
−2170 | −253 |
−3590 | −420 |
−1830 | |
Propeller and Rudder | |
---|---|---
0.934 | 0.702 | 0.342
0.960 | −2.713 | 0.634
0.695 | 11.211 |
Starboard Turning | Port Turning | |
---|---|---|
Advance () | 2.681 | 2.680 |
Transfer () | 1.114 | 1.115 |
Turning Radius () | 2.625 | 0.623 |
Tactical diameter () | 2.935 | 2.932 |
Item | Value (Exploration Rate ϵ) |
---|---|
Step [0–5000] | 0.5 |
Step [5000–10,000] | 0.4
Step [10,000–15,000] | 0.3 |
Step [15,000–20,000] | 0.2 |
Step [20,000–50,000] | 0.1 |
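The exploration-rate schedule in the table above can be expressed as a small piecewise-constant lookup. The treatment of interval boundaries and of steps beyond 50,000 is an assumption, since the table does not specify them:

```python
def exploration_rate(step):
    """Piecewise-constant exploration noise schedule from the table above.
    Each pair is (exclusive upper step bound, exploration rate)."""
    schedule = [(5000, 0.5), (10000, 0.4), (15000, 0.3), (20000, 0.2), (50000, 0.1)]
    for upper, eps in schedule:
        if step < upper:
            return eps
    return 0.1  # beyond 50,000 steps, keep the final rate (assumption)
```

Decaying the exploration noise in stages like this lets the agent explore broadly early in training and exploit the learned policy later on.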
Item | Value |
---|---|
[−20, 20] | |
[−10, 10] | |
[−5, 5] | |
1 | |
0 | |
0 |
Item | Value |
---|---|
180 2 | |
20 0.5 | |
0 3 | |
0 0.1 | |
0 0.05 | |
0 1 |
Item | Value |
---|---|
[−20, 182] | |
[−10, 20.5] |
Item | Value |
---|---|
[−20, 20] | |
[−10, 10] | |
[−5, 5] | |
1 | |
0 | |
0 |
Item | Value |
---|---|
150 0.5 | |
30 2 | |
90 3 | |
0 0.1 | |
0 0.05 | |
0 1 |
Item | Value |
---|---|
[−20, 150.5] | |
[−10, 32] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Vo, A.K.; Mai, T.L.; Yoon, H.K. Path Planning for Automatic Berthing Using Ship-Maneuvering Simulation-Based Deep Reinforcement Learning. Appl. Sci. 2023, 13, 12731. https://doi.org/10.3390/app132312731