Feature-Based MPPI Control with Applications to Maritime Systems
Abstract
1. Introduction
2. Model Predictive Path Integral Control
Algorithm 1 Optimize Control Sequence (OCS) according to [4]
Input:: Transition model;
K: Number of samples;
T: Number of timesteps;
: Initial control sequence;
: Recent state estimate;
: Control hyper-parameters;
Output:: Optimized control sequence
: Average costs;
1: for do
2: ;
3: Sample ;
4: ;
5: for do
6: ;
7: ;
8: end for
9: ;
10: end for
11: ;
12: ;
13: for do
14: ;
15: end for
16: for do
17: ;
18: end for
19: ;
20: return and
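The structure of Algorithm 1 (sample noise, roll out the model, accumulate costs, weight the samples, update the sequence) can be illustrated in plain Python. The following is a minimal sketch in the spirit of the information-theoretic formulation of [4], not the authors' implementation. It assumes a scalar control input and isotropic Gaussian noise. All names (`optimize_control_sequence`, `rollout_cost`, `F`, `q`, `phi`, `sigma`, `lam`) are illustrative, and returning the arithmetic mean of the sampled costs as the "average costs" is an assumption.

```python
import math
import random


def optimize_control_sequence(F, q, phi, u, x0, K, sigma, lam, rng):
    """One sampling-based update of a scalar control sequence (illustrative).

    F(x, v): transition model, q(x): state cost, phi(x): terminal cost,
    u: list of T nominal controls, x0: state estimate, K: number of samples,
    sigma: noise standard deviation, lam: temperature.
    Returns the updated sequence and the arithmetic mean of the sampled costs.
    """
    T = len(u)
    noise = [[rng.gauss(0.0, sigma) for _ in range(T)] for _ in range(K)]
    costs = []
    for eps in noise:
        x, s = x0, 0.0
        for t in range(T):
            x = F(x, u[t] + eps[t])
            # State cost plus the control-cost term lam * u * eps / sigma^2.
            s += q(x) + lam * u[t] * eps[t] / sigma ** 2
        costs.append(s + phi(x))
    beta = min(costs)  # subtracting the minimum keeps at least one weight at 1
    weights = [math.exp(-(s - beta) / lam) for s in costs]
    eta = sum(weights)
    u_new = [
        u[t] + sum(weights[k] * noise[k][t] for k in range(K)) / eta
        for t in range(T)
    ]
    return u_new, sum(costs) / K


def rollout_cost(F, q, phi, u, x0):
    """Cost of the noise-free rollout of a control sequence."""
    x, s = x0, 0.0
    for v in u:
        x = F(x, v)
        s += q(x)
    return s + phi(x)
```

A typical use is to call `optimize_control_sequence` repeatedly with a seeded `random.Random` instance and to monitor `rollout_cost` of the returned sequence, for example on a single integrator `F = lambda x, v: x + 0.1 * v` with a quadratic distance to a target state as cost.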
3. Extension to Feature-Based Proposal Density
3.1. Exploration Problem of Classical MPPI Control
3.2. Feature-Based Extension of the Search Space
3.3. Resulting Feature-Based MPPI Algorithm
Algorithm 2 Feature-Based MPPI Control
Input:: Transition model;
K: Number of samples;
T: Number of timesteps;
: Initial control sequence;
: Recent state estimate;
: Control hyper-parameters;
I: Number of features with feature index ;
: Initial control sequence of feature;
: State dependent costs of feature;
: Number of features samples;
1: while Controller is active do
2:
3: for do
4:
5:
6: end for
7:
8:
9: SendToActuator;
10: for do
11:
12: end for
13:
14: end while
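One cycle of the control loop in Algorithm 2 might look as follows. This is a sketch under stated assumptions rather than the authors' method: the feature layout `(u_i, q_i, K_i)`, the ranking of candidates by the average main cost returned by the update routine, the order of the main and feature updates, and the re-initialisation value `u_init` are all assumptions made for illustration. The helper `ocs` is a compact scalar-control update without a terminal cost.

```python
import math
import random


def ocs(F, q, u, x0, K, sigma, lam, rng):
    """Compact sampling-based update: (new sequence, average sampled cost)."""
    T = len(u)
    eps = [[rng.gauss(0.0, sigma) for _ in range(T)] for _ in range(K)]
    costs = []
    for e in eps:
        x, s = x0, 0.0
        for t in range(T):
            x = F(x, u[t] + e[t])
            s += q(x) + lam * u[t] * e[t] / sigma ** 2
        costs.append(s)
    beta = min(costs)
    w = [math.exp(-(s - beta) / lam) for s in costs]
    eta = sum(w)
    u_new = [u[t] + sum(w[k] * eps[k][t] for k in range(K)) / eta
             for t in range(T)]
    return u_new, sum(costs) / K


def feature_based_step(F, q, u, x0, features, K, sigma, lam, rng, u_init=0.0):
    """One controller cycle; features is a list of (u_i, q_i, K_i) tuples."""
    # Update the main sequence with the full sample budget K.
    best_u, best_cost = ocs(F, q, u, x0, K, sigma, lam, rng)
    for u_i, q_i, K_i in features:
        # Explore from the feature's sequence under the feature's own cost.
        u_i, _ = ocs(F, q_i, u_i, x0, K_i, sigma, lam, rng)
        # Score the result under the main cost (assumed ranking criterion).
        u_i, cost_i = ocs(F, q, u_i, x0, K_i, sigma, lam, rng)
        if cost_i < best_cost:
            best_u, best_cost = u_i, cost_i
    command = best_u[0]              # value that would be sent to the actuator
    u_next = best_u[1:] + [u_init]   # shift by one step, re-initialise the tail
    return command, u_next
```

In this sketch a feature only changes the outcome when its sequence scores better under the main cost than the updated main sequence, so a poorly chosen feature costs samples but does not degrade the selected sequence.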
4. Maritime Application Scenario
4.1. Dynamics of the Vessel
4.2. Inequality Constraints
4.3. Equality Constraints
4.4. Cost Function
- A collision should be prevented.
- The actuators must not be overloaded.
- The position should follow a predefined trajectory.
- The orientation should be in line with the trajectory.
- While the surge velocity component should match a reference, the absolute value of the sway component and yaw-rate component should be minimized.
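The objectives listed above can be combined into a single state-dependent cost. The following is a minimal sketch of one way to do so, assuming absolute-value penalties and fixed penalties for collision and actuator overload. The function `stage_cost`, the `Weights` container, and every numerical weight are hypothetical placeholders and are not the values or the exact cost terms used in this work.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical weights, chosen only for illustration."""
    collision: float = 1e6
    overload: float = 1e4
    position: float = 1.0
    heading: float = 1.0
    surge: float = 1.0
    sway: float = 1.0
    yaw_rate: float = 1.0


def wrap_angle(a):
    """Map an angle in rad to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def stage_cost(pose, vel, ref_pose, surge_ref, collided, overloaded, w=Weights()):
    """Sum of the five objectives for one predicted state.

    pose = (x, y, psi), vel = (u, v, r), ref_pose = (x_ref, y_ref, psi_ref).
    """
    x, y, psi = pose
    u, v, r = vel
    x_ref, y_ref, psi_ref = ref_pose
    cost = 0.0
    if collided:      # a collision should be prevented
        cost += w.collision
    if overloaded:    # the actuators must not be overloaded
        cost += w.overload
    # Position follows the trajectory, orientation is aligned with it.
    cost += w.position * math.hypot(x - x_ref, y - y_ref)
    cost += w.heading * abs(wrap_angle(psi - psi_ref))
    # Surge matches its reference; sway and yaw rate are kept small.
    cost += w.surge * abs(u - surge_ref) + w.sway * abs(v) + w.yaw_rate * abs(r)
    return cost
```

Wrapping the heading error matters near ±π: without it, a vessel heading 3.1 rad against a reference of −3.1 rad would be penalized for an error of 6.2 rad instead of roughly 0.08 rad.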
4.4.1. Costs Dependent on the Position
4.4.2. Costs Dependent on the Velocity
4.5. Resulting Problem Formulation
4.6. Feature Definition
5. Controller Design
5.1. Architecture
5.2. Controller Parameters
5.2.1. Numerical Solution of the Initial Value Problem
5.2.2. Choice of Controller Step Size
5.2.3. Prediction Horizon
5.2.4. Covariance Matrix of Additive Noise
5.2.5. Temperature
5.2.6. Number of Predicted Trajectories
5.2.7. Cost Function Parameters
6. Simulation Results
7. Full-Scale Results
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AWGN | Additive white Gaussian noise |
CG | Center of gravity |
HJB | Hamilton–Jacobi–Bellman |
MABX | MicroAutoBox |
MPPI | Model predictive path integral |
NMPC | Nonlinear model predictive control |
OCP | Optimal control problem |
RL | Reinforcement learning |
UKF | Unscented Kalman filter |
References
1. Homburger, H.; Wirtensohn, S.; Reuter, J. Feature-Based Proposal Density Optimization for Nonlinear Model Predictive Path Integral Control. In Proceedings of the 6th IEEE Conference on Control Technology and Applications (CCTA), Trieste, Italy, 22–25 August 2022.
2. Kleinert, H. Path Integrals in Quantum Mechanics, Statistics, Polymer Physics, and Financial Markets, 5th ed.; World Scientific Publishing Ltd.: Singapore, 2009.
3. Kappen, H.J. Path Integrals and Symmetry Breaking for Optimal Control Theory. J. Stat. Mech. Theory Exp. 2005, 2005, P11011.
4. Williams, G.; Wagener, N.; Goldfain, B.; Drews, P.; Rehg, J.; Boots, B.; Theodorou, E. Information Theoretic MPC for Model-Based Reinforcement Learning. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017.
5. Schaal, S.; Atkeson, C. Learning Control in Robotics. IEEE Robot. Autom. Mag. 2010, 10, 20–29.
6. Theodorou, E.A.; Buchli, J.; Schaal, S. A Generalized Path Integral Control Approach to Reinforcement Learning. J. Mach. Learn. Res. 2010, 11, 3137–3181.
7. Kappen, H.J.; Ruiz, H. Adaptive Importance Sampling for Control and Inference. J. Stat. Phys. 2016, 162, 1244–1266.
8. Gómez, V.; Thijssen, S.; Symington, A.; Hailes, S.; Kappen, H.J. Real-Time Stochastic Optimal Control for Multi-agent Quadrotor Systems. In Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS 16), London, UK, 12–17 June 2016.
9. Homburger, H.; Wirtensohn, S.; Reuter, J. Swinging Up and Stabilization Control of the Furuta Pendulum using Model Predictive Path Integral Control. In Proceedings of the 30th Mediterranean Conference on Control and Automation (MED), Athens, Greece, 28 June–1 July 2022.
10. Theodorou, E.A.; Todorov, E. Relative entropy and free energy dualities: Connections to path integral and KL control. In Proceedings of the IEEE 51st Annual Conference on Decision and Control (CDC), Grand Wailea Maui, HI, USA, 10–13 December 2012; pp. 1466–1473.
11. Theodorou, E.A. Nonlinear Stochastic Control and Information Theoretic Dualities: Connections, Interdependencies and Thermodynamic Interpretations. Entropy 2015, 17, 3352–3375.
12. Kappen, H.J. An Introduction to Stochastic Control Theory, Path Integrals and Reinforcement Learning. Coop. Behav. Neural Syst. Ninth Granada Lect. 2007, 887, 149–181.
13. Thijssen, S.; Kappen, H.J. Path Integral Control and State Dependent Feedback. Phys. Rev. E 2015, 91, 032104.
14. Gandhi, M.S.; Vlahov, B.; Gibson, J.; Williams, G.; Theodorou, E. Robust Model Predictive Path Integral Control: Analysis and Performance Guarantees. IEEE Robot. Autom. Lett. 2021, 6, 1423–1430.
15. Yin, J.; Zhang, Z.; Theodorou, E.; Tsiotras, P. Improving Model Predictive Path Integral using Covariance Steering. arXiv 2021, arXiv:2109.12147.
16. Kusumoto, R.; Palmieri, L.; Spies, M.; Csiszar, A.; Arras, K.O. Informed Information Theoretic Model Predictive Control. In Proceedings of the International Conference on Robotics and Automation (ICRA), Montreal, Canada, 20–24 May 2019; pp. 2047–2053.
17. Williams, G.; Drews, P.; Goldfain, B.; Rehg, J.M.; Theodorou, E. Information-Theoretic Model Predictive Control: Theory and Applications to Autonomous Driving. IEEE Trans. Robot. 2018, 34, 1603–1622.
18. Abdelaal, M.; Hahn, A. NMPC-based trajectory tracking and collision avoidance of unmanned surface vessels with rule-based colregs confinement. In Proceedings of the IEEE Conference on Systems, Process and Control (ICSPC), Hammamet, Tunisia, 16–18 December 2016; pp. 23–28.
19. Lutz, M.; Meurer, T. Optimal trajectory planning and model predictive control of underactuated marine surface vessels using a flatness-based approach. arXiv 2021, arXiv:2101.12730.
20. Bärlund, A.; Linder, J.; Feyzmahdavian, H.R.; Lundh, M.; Tervo, K. Nonlinear MPC for combined motion control and thrust allocation of ships. In Proceedings of the 21st IFAC World Congress, Berlin, Germany, 11–17 July 2020.
21. Zare, N.; Brandoli, B.; Sarvmaili, M.; Soares, A.; Matwin, S. Continuous Control with Deep Reinforcement Learning for Autonomous Vessels. arXiv 2021, arXiv:2106.14130.
22. Martinsen, A.; Lekkas, A. Curved Path Following with Deep Reinforcement Learning: Results from Three Vessel Models. In Proceedings of the IEEE Oceans MTS, Charleston, SC, USA, 22–25 October 2018; pp. 1–8.
23. Yin, Z.; He, W.; Yang, C.; Sun, C. Control Design of a Marine Vessel System Using Reinforcement Learning. Neurocomputing 2018, 311, 353–362.
24. Homburger, H.; Wirtensohn, S.; Reuter, J. Docking Control of a Fully-Actuated Autonomous Vessel using Model Predictive Path Integral Control. In Proceedings of the 20th European Control Conference (ECC), London, UK, 12–15 July 2022.
25. Lloyd. Articles of the Convention on the International Regulations for Preventing Collisions at Sea, 1972; Lloyd’s Register of International Maritime Organization: London, UK, 2005.
26. Kinjo, L.M.; Wirtensohn, S.; Reuter, J.; Menard, T.; Gehan, O. Trajectory tracking of a fully-actuated surface vessel using nonlinear model predictive control. In Proceedings of the 13th IFAC Conference on Control Applications in Marine Systems, Robotics, and Vehicles (CAMS), Oldenburg, Germany, 22–24 September 2021; pp. 51–56.
27. Fossen, T.I. Marine Control Systems: Guidance, Navigation and Control of Ships, Rigs and Underwater Vehicles, 1st ed.; Marine Cybernetics: Trondheim, Norway, 2002.
28. Blanke, M. Ship Propulsion Losses Related to Automatic Steering and Prime Mover Control, 1st ed.; Technical University of Denmark: Lyngby, Denmark, 1981.
29. Ramachandran, D.; Amir, E. Bayesian Inverse Reinforcement Learning. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India, 6–12 January 2007.
30. Ng, A.Y.; Russell, S. Algorithms for Inverse Reinforcement Learning. In Proceedings of the 17th International Conference on Machine Learning (ICML), San Francisco, CA, USA, 29 June–2 July 2000.
31. Arora, S.; Doshi, P. A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress. Artif. Intell. 2021, 297, 103500.
32. Google Maps. Rhine River. Available online: https://www.google.com/maps/@47.6679863,9.1762762,16.96z (accessed on 5 May 2022).
33. Wirtensohn, S.; Hamburger, O.; Homburger, H.; Kinjo, L.M.; Reuter, J. Comparison of Advanced Control Strategies for Automated Docking. In Proceedings of the 13th IFAC Conference on Control Applications in Marine Systems, Robotics, and Vehicles (CAMS), Oldenburg, Germany, 22–24 September 2021; pp. 295–300.
| Vessel’s Dynamics Parameter | Value | Actuators’ Dynamics Parameter | Value |
|---|---|---|---|
| m | 3100 kg | | 2300 rpm |
| | 0 m | | 105 ms |
| | 21,179 kgm | | 240 ms |
| | kg | | 71.66 1/s |
| | kg | | - |
| | kgm | | 95 ms |
| | kgm | | 160 ms |
| | Ns/m | | 1.496 rad/s |
| | Ns/m | | 3800 rpm |
| | Ns/m | | 270 ms |
| | Ns/m | | 80 ms |
| | Ns/m | | 10,000 1/s |
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| | 0.9047 | | 0.21 |
| | 0.6545 | | 0.24 |
| | 0.36 m | | 1000 kg/m |
| | 0.23 m | | 2.9 m |
Point | Latitude | Longitude | Description |
---|---|---|---|
P1 | 47.668279218023123 | 9.174018424704204 | Start position |
P2 | 47.666282640256334 | 9.178596676673507 | End position |
| Parameter | MPPI | Feature-Based MPPI |
|---|---|---|
| K | 9000 | 8500 |
| T | 41 | 41 |
| | 1 | 1 |
| | Equation (33) | Equation (33) |
| | 0 | 0 |
| | - | 500 |
| | - | Equation (38) |
| | - | Equation (38) |
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| | 500 1/m | | 409.9 m |
| | 200 1/m | | 2 m/s |
| | 2500 1/rad | | 300 s/m |
| | 5000 s/m | | 30 s/m |
| | 10 s/m | | 10 s/rad |
| | 7500 s/rad | | 60 m |
| | 80 m | | |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Homburger, H.; Wirtensohn, S.; Diehl, M.; Reuter, J. Feature-Based MPPI Control with Applications to Maritime Systems. Machines 2022, 10, 900. https://doi.org/10.3390/machines10100900