Deep Reinforcement Learning for Variable Tension Control of Unmanned Underwater Vehicle Arresting Gear Under Nonlinear Effects
Abstract
1. Introduction
- Dynamic models of the arresting cable system and the UUV are established, and a nonlinear friction model based on the Stribeck effect is constructed and embedded in the deep reinforcement learning environment used for training and control strategy development.
- The Proximal Policy Optimization (PPO) algorithm, built on the Actor–Critic framework, is used as the core learning method. Generalized advantage estimation is combined with a clipping mechanism that bounds the size of each policy update, which stabilizes training and improves convergence. To match the operational objectives of UUV recovery, a reward structure that combines dense rewards with sparse terminal rewards is proposed to guide the agent throughout learning. An entropy regularization term is added to the objective function to encourage exploration of the policy space and to reduce premature convergence to suboptimal local optima.
- To assess the proposed WSR-E-PPO approach, simulation studies were performed in which it was compared against five representative control strategies during the UUV retrieval phase, that is, the stage in which the vehicle is guided from its maximum arresting distance back to the designated docking interface. In these simulations, WSR-E-PPO performed best: the UUV returned to the docking point and came to a stable halt in 82 s, the shortest time in the comparison. The velocity trajectory remained smooth and the tension variations in the arresting cable were moderate, indicating improved dynamic stability and recovery efficiency.
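
The generalized advantage estimation mentioned in the second contribution can be illustrated with a minimal sketch. It assumes the discount factor 0.98 and GAE parameter 0.90 listed in the training hyperparameter table of this article; the function name `compute_gae` and its argument layout are hypothetical and are not taken from the authors' implementation.

```python
def compute_gae(rewards, values, last_value, gamma=0.98, lam=0.90):
    """Generalized advantage estimation for one trajectory segment.

    rewards[t] and values[t] are the reward and critic estimate at step t.
    last_value is the critic estimate for the state after the final step
    (use 0.0 if the episode terminated there).
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the accumulated future advantage.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    # Critic regression targets are the advantages plus the value baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns
```

In a PPO update, the advantages would typically be normalized per batch before entering the clipped objective; that step is omitted here.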
2. Related Work
2.1. Friction Compensation Methods
2.2. Deep Reinforcement Learning in Friction Compensation
3. Preliminaries
3.1. Principle and Modeling of Arresting Gear System
- Only the case in which the UUV impacts the arresting cable along the X-direction is considered. Extensive preliminary land-based tests showed that the deflection angle has a negligible influence on the final test results.
- The present analysis is confined to the translational motion and corresponding forces along the X-axis.
- Simulation results from prior studies indicate that, irrespective of the initial drift angle or lateral deviation during engagement with the arresting cable, the UUV ultimately slides to the central point of the cable as it decelerates.
3.2. Stribeck-Based Friction Modeling of the Arresting System
4. Method
4.1. Introduction of Deep Reinforcement Learning
4.2. Agent Design
- State
- Action
- Reward
4.3. Algorithm Model
Algorithm 1. Training Process of the WSR-E-PPO Algorithm
5. Results
5.1. Simulation Results
5.2. Implementation and Statistical Reproducibility
5.3. Robustness Evaluation Under Uncertainties
5.4. Ablation Study
5.4.1. Ablation of PBRS
5.4.2. Ablation of Entropy Regularization
5.4.3. Ablation of Both Mechanisms
5.5. Comparative Evaluation
- Constant Tension Control: This is the basic passive winch mechanism widely used as a fail-safe in real-world marine engineering. It serves as the lower-bound reference for tracking performance and energy efficiency in the comparative analysis.
- Kinetic Energy Decay Trajectory + PID [19]: This strategy is the current mainstream model-based approach for UUV retrieval. It plans a velocity trajectory based on kinetic energy dissipation and tracks it with a conventional PID controller. Including this benchmark exposes the limitations of linear feedback controllers and static mathematical models under strongly nonlinear Stribeck friction dynamics.
- Soft Actor–Critic (SAC): SAC is selected as the state-of-the-art (SOTA) deep reinforcement learning benchmark for continuous control tasks. The comparison with standard SAC tests whether the customized control architecture is competitive and better suited to this heavily constrained, friction-dominated towing environment.
- Vanilla PPO (Ablation Benchmark): To validate the core contributions of this paper, a degraded version of the proposed algorithm, Vanilla PPO without entropy regularization and Potential-Based Reward Shaping (PBRS), is included as an ablation benchmark. It shows that, without the customized reward shaping and exploration mechanisms, standard DRL agents frequently suffer from inefficient exploration and become trapped in friction-induced dead zones or suboptimal local minima.
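
To make the tracking half of the PID baseline concrete, the following is a minimal discrete PID sketch. It assumes the 0.1 s sampling time listed under Controller Setup in the system parameter table; the gains, the optional output limits, and the class name `PID` are hypothetical placeholders, and the kinetic energy decay trajectory planner of [19] is not reproduced here.

```python
class PID:
    """Discrete PID controller for tracking a reference velocity.

    ASSUMPTION: kp, ki, kd and the saturation limits are illustrative
    values chosen by the caller, not the tuning used in the article.
    """

    def __init__(self, kp, ki, kd, dt=0.1, u_min=None, u_max=None):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0
        self.prev_error = None

    def step(self, reference, measurement):
        error = reference - measurement
        self.integral += error * self.dt
        # No derivative kick on the first sample, when no history exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Optional actuator saturation.
        if self.u_max is not None:
            u = min(u, self.u_max)
        if self.u_min is not None:
            u = max(u, self.u_min)
        return u
```

A fixed-gain loop of this kind has no mechanism for adapting to the velocity-dependent friction near zero speed, which is the limitation the comparison is intended to expose.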
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Repoulias, F.; Papadopoulos, E. Planar trajectory planning and tracking control design for underactuated AUVs. Ocean Eng. 2007, 34, 1650–1667.
- Sariel, S.; Balch, T.; Erdogan, N. Naval mine countermeasure missions. IEEE Robot. Autom. Mag. 2008, 15, 45–52.
- Yan, Z.; Yang, Z.; Pan, X.; Zhou, J.; Wu, D. Virtual leader based path tracking control for Multi-UUV considering sampled-data delays and packet losses. Ocean Eng. 2020, 216, 108065.
- Yuh, J. Design and control of autonomous underwater robots: A survey. Auton. Robot. 2000, 8, 7–24.
- Gong, P.; Yan, Z.; Zhang, W.; Tang, J. Lyapunov-based model predictive control trajectory tracking for an autonomous underwater vehicle with external disturbances. Ocean Eng. 2021, 232, 109010.
- Palomeras, N.; Vallicrosa, G.; Mallios, A.; Bosch, J.; Vidal, E.; Hurtos, N.; Carreras, M.; Ridao, P. AUV homing and docking for remote operations. Ocean Eng. 2018, 154, 106–120.
- Zhang, W.; Zeng, J.; Yan, Z.; Wei, S.; Tian, W. Leader-following consensus of discrete-time multi-AUV recovery system with time-varying delay. Ocean Eng. 2021, 219, 108258.
- Jun, B.H.; Park, J.Y.; Lee, F.Y.; Lee, P.M.; Lee, C.M.; Kim, K.; Lim, Y.K.; Oh, J.H. Development of the AUV ‘ISiMI’ and a free running test in an Ocean Engineering Basin. Ocean Eng. 2009, 36, 2–14.
- Sato, Y.; Maki, T.; Masuda, K.; Matsuda, T.; Sakamaki, T. Autonomous docking of hovering type AUV to seafloor charging station based on acoustic and visual sensing. In Proceedings of the 2017 IEEE Underwater Technology (UT); IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
- Walton, J. AUV launch and recovery from US navy ships: Problems and solutions. In Proceedings of the PACON, San Francisco, CA, USA, 8–11 July 2001; PACON International: Honolulu, HI, USA, 2001; pp. 324–331.
- Fan, S.; Liu, C.; Li, B.; Xu, Y.; Xu, W. AUV docking based on USBL navigation and vision guidance. J. Mar. Sci. Technol. 2019, 24, 673–685.
- Page, B.R.; Mahmoudian, N. Simulation-driven optimization of underwater docking station design. IEEE J. Ocean. Eng. 2019, 45, 404–413.
- Zhang, W.; Jia, G.; Wu, P.; Yang, S.; Huang, B.; Wu, D. Study on hydrodynamic characteristics of AUV launch process from a launch tube. Ocean Eng. 2021, 232, 109171.
- Zhang, W.; Teng, Y.; Chen, H.; Yu, C. On the robust model predictive control method of dynamic positioning to line for UUV recovery. In Proceedings of the OCEANS 2016 MTS/IEEE Monterey; IEEE: Piscataway, NJ, USA, 2016; pp. 1–6.
- Hardy, T.; Barlow, G. Unmanned Underwater Vehicle (UUV) deployment and retrieval considerations for submarines. In Proceedings of the International Naval Engineering Conference and Exhibition; OODA Technologies Inc.: Montreal, QC, Canada, 2008; Volume 2008.
- Li, Y.; Jiang, Y.; Cao, J.; Wang, B.; Li, Y. AUV docking experiments based on vision positioning using two cameras. Ocean Eng. 2015, 110, 163–173.
- Kim, J.; Lee, G. A study on the UUV docking system by using torpedo tubes. In Proceedings of the 2011 8th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI); IEEE: Piscataway, NJ, USA, 2011; pp. 842–844.
- Bai, G.; Gu, H.; Zhang, H.; Meng, L.; Tang, D. V-shaped wing design and hydrodynamic analysis based on moving base for recovery AUV. In Proceedings of the 2018 WRC Symposium on Advanced Robotics and Automation (WRC SARA); IEEE: Piscataway, NJ, USA, 2018; pp. 320–325.
- Wang, X.; Liang, L.; Lei, M.; Huang, J. Research on Variable Tension Control Strategy for UUV Arresting Gear System. In Proceedings of the ISOPE International Ocean and Polar Engineering Conference; ISOPE: Mountain View, CA, USA, 2024; p. ISOPE-I-24-245.
- Shang, W.; Cong, S.; Zhang, Y. Nonlinear friction compensation of a 2-DOF planar parallel manipulator. Mechatronics 2008, 18, 340–346.
- Li, B.; Xie, X.; Yu, B.; Liao, Y.; Fan, D. Data-driven friction modeling and compensation for rotary servo actuators. Front. Mech. Eng. 2024, 19, 41.
- Kim, M.J.; Beck, F.; Ott, C.; Albu-Schäffer, A. Model-free friction observers for flexible joint robots with torque measurements. IEEE Trans. Robot. 2019, 35, 1508–1515.
- Lischinsky, P.; Canudas-de Wit, C.; Morel, G. Friction compensation for an industrial hydraulic robot. IEEE Control Syst. Mag. 2002, 19, 25–32.
- Liu, Y.; Alambeigi, F. Impact of generic tendon routing on tension loss of tendon-driven continuum manipulators with planar deformation. IEEE Robot. Autom. Lett. 2022, 7, 3624–3631.
- Kim, Y.H.; Lewis, F.L. Reinforcement adaptive learning neural-net-based friction compensation control for high speed and precision. IEEE Trans. Control Syst. Technol. 2002, 8, 118–126.
- Hernández, R.; García-Hernández, R.; Jurado, F. Deep Reinforcement Learning for a Mechanical System under Friction Effect. In Proceedings of the 2022 International Conference on Mechatronics, Electronics and Automotive Engineering (ICMEAE); IEEE: Piscataway, NJ, USA, 2022; pp. 29–34.
- Al-Mahasneh, A.; Abu Mallouh, M.; Al-Khawaldeh, M.A.; Jouda, B.; Shehata, O.; Baniyounis, M. Online reinforcement learning control of robotic arm in presence of high variation in friction forces. Syst. Sci. Control Eng. 2023, 11, 2251521.
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 1998; Volume 1.
- Nguyen, H.; La, H. Review of deep reinforcement learning for robot manipulation. In Proceedings of the 2019 Third IEEE International Conference on Robotic Computing (IRC); IEEE: Piscataway, NJ, USA, 2019; pp. 590–595.
- Koh, S.; Zhou, B.; Fang, H.; Yang, P.; Yang, Z.; Yang, Q.; Guan, L.; Ji, Z. Real-time deep reinforcement learning based vehicle navigation. Appl. Soft Comput. 2020, 96, 106694.
- Pérez-Gil, Ó.; Barea, R.; López-Guillén, E.; Bergasa, L.M.; Gomez-Huelamo, C.; Gutiérrez, R.; Diaz-Diaz, A. Deep reinforcement learning based control for Autonomous Vehicles in CARLA. Multimed. Tools Appl. 2022, 81, 3553–3576.
- Chen, P.; Pei, J.; Lu, W.; Li, M. A deep reinforcement learning based method for real-time path planning and dynamic obstacle avoidance. Neurocomputing 2022, 497, 64–75.
- Wang, L.; Zhang, G.; Yang, Q.; Han, T. An adaptive traffic signal control scheme with Proximal Policy Optimization based on deep reinforcement learning for a single intersection. Eng. Appl. Artif. Intell. 2025, 149, 110440.
- Lampaert, V.; Swevers, J.; Al-Bender, F. Comparison of model and non-model based friction compensation techniques in the neighbourhood of pre-sliding friction. In Proceedings of the 2004 American Control Conference; IEEE: Piscataway, NJ, USA, 2004; Volume 2, pp. 1121–1126.
- Ruderman, M.; Iwasaki, M. Observer of nonlinear friction dynamics for motion control. IEEE Trans. Ind. Electron. 2015, 62, 5941–5949.
- Ruderman, M. Tracking control of motor drives using feedforward friction observer. IEEE Trans. Ind. Electron. 2013, 61, 3727–3735.
- Huang, W.S.; Liu, C.W.; Hsu, P.L.; Yeh, S.S. Precision control and compensation of servomotors and machine tools via the disturbance observer. IEEE Trans. Ind. Electron. 2009, 57, 420–429.
- Lin, F.J.; Shieh, H.J.; Huang, P.K.; Shieh, P.H. An adaptive recurrent radial basis function network tracking controller for a two-dimensional piezo-positioning stage. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 2008, 55, 183–198.
- Lin, F.J.; Shieh, P.H.; Hung, Y.C. An intelligent control for linear ultrasonic motor using interval type-2 fuzzy neural network. IET Electr. Power Appl. 2008, 2, 32–41.
- Selmic, R.R.; Lewis, F.L. Neural-network approximation of piecewise continuous functions: Application to friction compensation. IEEE Trans. Neural Netw. 2002, 13, 745–751.
- Tan, K.; Lee, T.H.; Zhou, H.X. Micro-positioning of linear-piezoelectric motors based on a learning nonlinear PID controller. IEEE/ASME Trans. Mechatron. 2001, 6, 428–436.
- Amthor, A.; Zschack, S.; Ament, C. Position control on nanometer scale based on an adaptive friction compensation scheme. In Proceedings of the 2008 34th Annual Conference of IEEE Industrial Electronics; IEEE: Piscataway, NJ, USA, 2008; pp. 2568–2573.
- Lin, C.M.; Li, H.Y. Intelligent control using the wavelet fuzzy CMAC backstepping control system for two-axis linear piezoelectric ceramic motor drive systems. IEEE Trans. Fuzzy Syst. 2013, 22, 791–802.
- Olsson, H.; Åström, K.J.; De Wit, C.C.; Gäfvert, M.; Lischinsky, P. Friction models and friction compensation. Eur. J. Control 1998, 4, 176–195.
- De Wit, C.C.; Olsson, H.; Åström, K.J.; Lischinsky, P. A new model for control of systems with friction. IEEE Trans. Autom. Control 1995, 40, 419–425.
- Swevers, J.; Al-Bender, F.; Ganseman, C.G.; Projogo, T. An integrated friction model structure with improved presliding behavior for accurate friction compensation. IEEE Trans. Autom. Control 2002, 45, 675–686.
- Hsieh, C.; Pan, Y.C. Dynamic behavior and modelling of the pre-sliding static friction. Wear 2000, 242, 1–17.
- Al-Bender, F.; Lampaert, V.; Swevers, J. The generalized Maxwell-slip model: A novel model for friction simulation and compensation. IEEE Trans. Autom. Control 2005, 50, 1883–1887.
- Kabziński, J.; Jastrzębski, M. Practical implementation of adaptive friction compensation based on partially identified LuGre model. In Proceedings of the 2014 19th International Conference on Methods and Models in Automation and Robotics (MMAR); IEEE: Piscataway, NJ, USA, 2014; pp. 699–704.
- Zschäck, S.; Büchner, S.; Amthor, A.; Ament, C. Maxwell Slip based adaptive friction compensation in high precision applications. In Proceedings of the IECON 2012-38th Annual Conference on IEEE Industrial Electronics Society; IEEE: Piscataway, NJ, USA, 2012; pp. 2331–2336.
- Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep reinforcement learning: A brief survey. IEEE Signal Process. Mag. 2017, 34, 26–38.
- Nguyen, N.D.; Nguyen, T.; Nahavandi, S. System design perspective for human-level agents using deep reinforcement learning: A survey. IEEE Access 2017, 5, 27091–27102.
- Tsitsiklis, J.; Van Roy, B. Analysis of temporal-difference learning with function approximation. IEEE Trans. Autom. Control 1997, 42, 674–690.
- Van Hasselt, H.; Guez, A.; Silver, D. Deep reinforcement learning with double q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI: Washington, DC, USA, 2016; Volume 30.
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
- Popov, I.; Heess, N.; Lillicrap, T.; Hafner, R.; Barth-Maron, G.; Vecerik, M.; Lampe, T.; Tassa, Y.; Erez, T.; Riedmiller, M. Data-efficient deep reinforcement learning for dexterous manipulation. arXiv 2017, arXiv:1704.03073.
- Schulman, J.; Levine, S.; Abbeel, P.; Jordan, M.; Moritz, P. Trust region policy optimization. In Proceedings of the International Conference on Machine Learning; PMLR: London, UK, 2015; pp. 1889–1897.
- Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347.
- Haarnoja, T.; Zhou, A.; Abbeel, P.; Levine, S. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the International Conference on Machine Learning; PMLR: London, UK, 2018; pp. 1861–1870.
- Wang, F.; Hu, J.; Qin, Y.; Guo, F.; Jiang, M. Trajectory tracking control based on deep reinforcement learning for a robotic manipulator with an input deadzone. Symmetry 2025, 17, 149.
- Xu, H.; Terakawa, T.; Komori, M. Deep-reinforcement-learning-based trajectory tracking control for slidable-wheel omnidirectional mobile robot. J. Adv. Mech. Des. Syst. Manuf. 2025, 19, JAMDSM0031.
- Pavlichenko, D.; Behnke, S. Real-robot deep reinforcement learning: Improving trajectory tracking of flexible-joint manipulator with reference correction. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA); IEEE: Piscataway, NJ, USA, 2022; pp. 2671–2677.
- Johannink, T.; Bahl, S.; Nair, A.; Luo, J.; Kumar, A.; Loskyll, M.; Ojea, J.A.; Solowjow, E.; Levine, S. Residual reinforcement learning for robot control. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA); IEEE: Piscataway, NJ, USA, 2019; pp. 6023–6029.

| Parameter | Value |
|---|---|
| Position absolute penalty | 5.0 |
| Excess velocity penalty | 100.0 |
| Passive acceleration boundary | 0.010 m/s² |
| Speed limit bounds | m/s |
| Maximum potential energy | 200.0 |
| Discount factor | 0.98 |
| Starting distance threshold | 2.5 m |
| Terminal success bonus | 150.0 |
| Position tolerance | 0.1 m |
| Velocity tolerance | 0.01 m/s |

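
The reward parameters above can be tied to potential-based reward shaping (PBRS) with a minimal sketch. The shaping term uses the standard PBRS form, F(s, s') = γΦ(s') − Φ(s). The linear potential, the interpretation of the 2.5 m threshold as the starting distance from the docking point, and all function names are assumptions made for illustration; the article's actual potential function is not reproduced in this section.

```python
GAMMA = 0.98          # discount factor (from the table)
PHI_MAX = 200.0       # maximum potential energy (from the table)
X_START = 2.5         # starting distance threshold in m (from the table)
SUCCESS_BONUS = 150.0 # terminal success bonus (from the table)
POS_TOL = 0.1         # position tolerance in m (from the table)
VEL_TOL = 0.01        # velocity tolerance in m/s (from the table)


def potential(x):
    """ASSUMPTION: potential grows linearly as the distance |x| to the
    docking point shrinks, from 0 at X_START to PHI_MAX at the dock."""
    frac = min(abs(x) / X_START, 1.0)
    return PHI_MAX * (1.0 - frac)


def shaping_term(x, x_next):
    """Standard PBRS term: gamma * Phi(s') - Phi(s)."""
    return GAMMA * potential(x_next) - potential(x)


def terminal_bonus(x, v):
    """Sparse reward paid once position and velocity are within tolerance."""
    if abs(x) <= POS_TOL and abs(v) <= VEL_TOL:
        return SUCCESS_BONUS
    return 0.0
```

Because the shaping term is a discounted difference of potentials, it adds dense guidance toward the docking point without changing which policies are optimal, which is the usual motivation for PBRS.
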
| Parameters | Value |
|---|---|
| State dimension | 2 |
| Action dimension | 1 |
| Actor hidden layers | 2 |
| Actor hidden units | 128, 128 |
| Actor activation | Tanh |
| Critic hidden layers | 2 |
| Critic hidden units | 128, 64 |
| Critic activation | Tanh |
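
The layer sizes in the table above can be checked with a small dependency-free sketch that builds both networks and counts their parameters. The uniform initialization, the linear output layers, and the reading of the two state components as position and velocity are assumptions; a real implementation would normally use a deep learning framework.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create a list of (weights, biases) layers for the given layer sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        bound = 1.0 / math.sqrt(n_in)  # ASSUMPTION: simple uniform init
        weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Tanh on hidden layers, linear output layer (output form is assumed)."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def count_params(layers):
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


actor = init_mlp([2, 128, 128, 1])   # state dim 2 -> 128 -> 128 -> action dim 1
critic = init_mlp([2, 128, 64, 1])   # state dim 2 -> 128 -> 64 -> scalar value

state = [2.5, 0.0]  # ASSUMPTION: (distance to dock in m, velocity in m/s)
action_mean = forward(actor, state)
state_value = forward(critic, state)
```

With these sizes the actor has 17,025 trainable parameters and the critic 8,705, so both networks are small enough to evaluate within the 0.1 s control period on modest hardware.
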

| Category | Parameter | Symbol | Value |
|---|---|---|---|
| UUV Dynamics | Mass of UUV | m | 450,000 kg |
| | Initial velocity | | 0 m/s |
| | Propulsion force | | 1450 N |
| | Target threshold | | 2.5 m |
| Hydraulic Drive | Motor displacement | | m³/rad |
| | Mechanical efficiency | | 0.92 |
| | Valve flow gain | | 0.05 m³/(s·A) |
| | Flow-pressure coefficient | | m³/(s·Pa) |
| | Total control volume | | m³ |
| | Effective bulk modulus | | Pa |
| Friction (Stribeck) | Static friction coefficient | | 0.37 |
| | Coulomb friction coefficient | | 0.30 |
| | Stribeck velocity | | 0.1 m/s |
| | Stribeck shape factor | p | 1.0 |
| Cable Geometry | Diameter | d | 0.05 m |
| | Material | | Nylon |
| Controller Setup | Sampling time | | 0.1 s |

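
The Stribeck parameters in the table above can be illustrated with the commonly used exponential Stribeck curve, μ(v) = μ_c + (μ_s − μ_c)·exp(−(|v|/v_s)^p). This is a sketch under the assumption that the article uses this standard form; its own friction equations are not reproduced in this section, and the `normal_force` argument and function names are hypothetical.

```python
import math

MU_S = 0.37  # static friction coefficient (from the table)
MU_C = 0.30  # Coulomb friction coefficient (from the table)
V_S = 0.1    # Stribeck velocity in m/s (from the table)
P = 1.0      # Stribeck shape factor (from the table)


def stribeck_coefficient(v):
    """Friction coefficient that decays from MU_S at rest toward MU_C."""
    return MU_C + (MU_S - MU_C) * math.exp(-((abs(v) / V_S) ** P))


def friction_force(v, normal_force, eps=1e-6):
    """Friction force opposing the sliding direction.

    ASSUMPTION: normal_force (in N) is supplied by the caller. Below the
    small speed eps the force is set to zero, which is a simplification
    that ignores stiction balancing the applied load.
    """
    if abs(v) < eps:
        return 0.0
    return -math.copysign(stribeck_coefficient(v) * normal_force, v)
```

The coefficient changes most rapidly for speeds below roughly the Stribeck velocity of 0.1 m/s, which is the low-speed regime the vehicle passes through as it settles at the docking point.
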
| Parameters | Value |
|---|---|
| Total episodes | 4200 |
| Max steps per episode | 1000 |
| Batch episodes (Update frequency) | 30 |
| Discount factor | 0.98 |
| GAE parameter | 0.90 |
| Actor learning rate | 0.0003 |
| Critic learning rate | 0.001 |
| Clip parameter | 0.2 |
| Entropy coefficient | 0.01 |
| Gradient update epochs per batch | 10 |
| Optimizer | Adam |
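
The clip parameter and entropy coefficient above enter the PPO actor objective as in the following minimal sketch. It assumes a Gaussian policy with a state-independent log standard deviation, which is a common choice but is not confirmed by this section; the function names are hypothetical.

```python
import math

CLIP_EPS = 0.2       # clip parameter (from the table)
ENTROPY_COEF = 0.01  # entropy coefficient (from the table)


def gaussian_entropy(log_std):
    """Differential entropy of a one-dimensional Gaussian policy."""
    return 0.5 * math.log(2.0 * math.pi * math.e) + log_std


def clipped_surrogate(ratio, advantage, eps=CLIP_EPS):
    """PPO clipped term: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def actor_loss(ratios, advantages, log_std):
    """Negative of (mean clipped surrogate + entropy bonus), to be minimized.

    ratios[i] is pi_new(a_i | s_i) / pi_old(a_i | s_i) for sample i.
    """
    surrogate = sum(clipped_surrogate(r, a)
                    for r, a in zip(ratios, advantages)) / len(ratios)
    return -(surrogate + ENTROPY_COEF * gaussian_entropy(log_std))
```

With a clip parameter of 0.2, a sample with positive advantage stops contributing gradient once its probability ratio exceeds 1.2, while the entropy bonus rewards a wider action distribution and so discourages early collapse onto a single tension command.
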
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Wang, X.; Li, W.; Huang, J.; Liu, F. Deep Reinforcement Learning for Variable Tension Control of Unmanned Underwater Vehicle Arresting Gear Under Nonlinear Effects. Machines 2026, 14, 654. https://doi.org/10.3390/machines14060654

