Online PID Tuning Strategy for Hydraulic Servo Control Systems via SAC-Based Deep Reinforcement Learning
Abstract
1. Introduction
2. System Description and Modeling
2.1. Introduction of Hydraulic Servo System
2.2. Mathematical Model
3. SAC-PID Control Strategy
3.1. Overview of the Control Strategy
3.2. Design of the Upper Controller
3.3. Algorithm Statement
Algorithm 1: Pseudocode of the SAC-PID control strategy.

    Initialize the relevant parameters of the policy network and the replay buffer size
    for t = 1, 2, … do
        if t = 10, 20, … then
            for episode = 1, 2, …, E do
                Receive the initial state
                for step = 1, 2, …, T1 do
                    Select an action based on the current state
                    Compute the control signal according to the action
                    Apply the control signal and observe the next state
                    Compute the current reward
                    Store the transition in replay buffer R
                    if it is time to update then
                        Update the Q network parameters
                        Update the policy network parameters
                        Update the entropy parameter
                        Update the target network parameters online
                    end if
                end for
            end for
        end if
    end for
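
The inner loop of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the first-order plant, the two-element state (error and error rate), the gain ranges, the reward, and the names `PID`, `placeholder_policy`, and `run_episode` are all assumptions introduced here for illustration, and a random policy stands in for the SAC actor. The network updates are omitted.

```python
import random
from collections import deque


class PID:
    """Discrete PID controller whose gains are supplied at every step."""

    def __init__(self, dt):
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def control(self, error, gains):
        kp, ki, kd = gains
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative


def placeholder_policy(state):
    """Stand-in for the SAC actor: random (Kp, Ki, Kd) in assumed ranges."""
    return (
        random.uniform(0.5, 2.0),
        random.uniform(0.0, 0.5),
        random.uniform(0.0, 0.05),
    )


def run_episode(steps=200, dt=0.01, target=1.0, buffer=None):
    """One episode of the inner loop on a hypothetical plant x' = -x + u."""
    buffer = buffer if buffer is not None else deque(maxlen=10_000)
    pid = PID(dt)
    position = 0.0
    state = (target - position, 0.0)  # assumed state: (error, error rate)
    for _ in range(steps):
        gains = placeholder_policy(state)       # select an action
        u = pid.control(state[0], gains)        # compute the control signal
        position += dt * (-position + u)        # apply it to the toy plant
        error = target - position
        next_state = (error, (error - state[0]) / dt)
        reward = -abs(error)                    # assumed reward: tracking error
        buffer.append((state, gains, reward, next_state))  # store the transition
        state = next_state
        # Q network, policy, entropy, and target updates would go here.
    return buffer
```

The point of the sketch is the division of labor: the agent's action is a set of PID gains rather than the valve command itself, so the lower-level PID law still produces the control signal at every step.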
4. Simulation Environments
4.1. Simulation Setup
4.2. Training Samples Setup
5. Simulation Results
5.1. The Tracking Response of Random Signals Input
5.2. The Tracking Response of Sinusoidal Signals Input with Sudden Pressure Drop
5.3. The Response of Sinusoidal Signals Input with External Disturbance Force
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest

| Parameter | Value | Parameter | Value | 
|---|---|---|---|
| Pump displacement | | Actuator stroke | |
| Motor speed | | Rod diameter | |
| Servo valve’s natural frequency | | Piston diameter | |
| Servo valve’s input signal | | Load mass | |
| Servo valve’s max flow | | Relief valve’s opening pressure | |

| Parameter | Value |
|---|---|
| Nonlinearity | ReLU | 
| Optimizer | Adam | 
| Learning rate ( and ) | 0.001 | 
| Discount rate | 0.99 | 
| Size of the replay buffer | |
| Number of hidden units per layer (all networks) | 128 |

| | NB | NM | NS | Z | PS | PM | PB |
|---|---|---|---|---|---|---|---|
| NB | NB | NB | NB | NM | NM | NS | Z | 
| NM | NB | NB | NM | NS | NS | Z | PS | 
| NS | NB | NM | NS | NS | Z | PS | PM | 
| Z | NM | NS | NS | Z | PS | PS | PM | 
| PS | NM | NS | Z | PS | PS | PM | PB | 
| PM | NS | Z | PS | PS | PM | PB | PB | 
| PB | Z | PS | PM | PM | PB | PB | PB | 
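
The fuzzy rule table above can be encoded as a simple lookup structure. The following Python sketch copies the entries as printed; which input the rows and columns correspond to (for example, error and change of error) and which output the table drives are not stated in this part of the text, so the names `row_label` and `col_label` are deliberately neutral, and `lookup` is a hypothetical helper.

```python
# Linguistic labels in the order used by the table's header row.
LABELS = ["NB", "NM", "NS", "Z", "PS", "PM", "PB"]

# One entry per table row; each list follows the column order in LABELS.
RULE_ROWS = {
    "NB": ["NB", "NB", "NB", "NM", "NM", "NS", "Z"],
    "NM": ["NB", "NB", "NM", "NS", "NS", "Z", "PS"],
    "NS": ["NB", "NM", "NS", "NS", "Z", "PS", "PM"],
    "Z":  ["NM", "NS", "NS", "Z", "PS", "PS", "PM"],
    "PS": ["NM", "NS", "Z", "PS", "PS", "PM", "PB"],
    "PM": ["NS", "Z", "PS", "PS", "PM", "PB", "PB"],
    "PB": ["Z", "PS", "PM", "PM", "PB", "PB", "PB"],
}


def lookup(row_label, col_label):
    """Return the output label for a given row label and column label."""
    return RULE_ROWS[row_label][LABELS.index(col_label)]
```

For instance, `lookup("NB", "PB")` reads the last cell of the first row. A full fuzzy PID controller would additionally need membership functions and defuzzification, which are outside the scope of this sketch.
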

| Sample Types | Training Samples | |
|---|---|---|
| Random signals | Ramp | |
| Random signals | Sinusoidal | |
| Signals with disturbance | Pressure drop | |
| Signals with disturbance | Transient force | |

| Ramp Signals / Control Strategies | PID | Fuzzy PID | SAC-PID |
|---|---|---|---|
| S1.1 | 116.82 | 6.81 | 3.94 | 
| S1.2 | 255.64 | 10.22 | 7.57 | 
| S1.3 | 336.23 | 14.82 | 11.67 | 
| S1.4 | 446.84 | 28.63 | 28.34 | 

| Sinusoidal Signals / Control Strategies | PID | Fuzzy PID | SAC-PID |
|---|---|---|---|
| S2.1 | 268.71 | 54.77 | 16.97 |
| S2.2 | 427.51 | 71.14 | 22.53 |
| S2.3 | 482.68 | 62.55 | 21.79 |
| S2.4 | 437.08 | 36.90 | 12.98 |

| Pressure Drop / Control Strategies | PID | Fuzzy PID | SAC-PID |
|---|---|---|---|
| W1.1 | 257.29 | 22.31 | 7.24 |
| W1.2 | 285.92 | 27.45 | 11.17 |
| W1.3 | 329.63 | 48.10 | 26.19 |
| W1.4 | 427.57 | 169.36 | 142.65 |

| Transient Force / Control Strategies | PID | Fuzzy PID | SAC-PID |
|---|---|---|---|
| W2.1 | 437.82 | 44.75 | 22.99 |
| W2.2 | 443.19 | 63.21 | 46.11 |
| W2.3 | 447.73 | 80.08 | 57.79 |
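
To compare the three strategies across the four result tables, the relative reduction of the tabulated index can be computed directly. The Python sketch below uses the values exactly as listed above; it assumes that the index is an error measure for which lower is better (the tables do not name the metric or its unit here), and `reduction_percent` is a hypothetical helper.

```python
# (PID, Fuzzy PID, SAC-PID) values copied from the four result tables.
RESULTS = {
    "S1.1": (116.82, 6.81, 3.94),
    "S1.2": (255.64, 10.22, 7.57),
    "S1.3": (336.23, 14.82, 11.67),
    "S1.4": (446.84, 28.63, 28.34),
    "S2.1": (268.71, 54.77, 16.97),
    "S2.2": (427.51, 71.14, 22.53),
    "S2.3": (482.68, 62.55, 21.79),
    "S2.4": (437.08, 36.90, 12.98),
    "W1.1": (257.29, 22.31, 7.24),
    "W1.2": (285.92, 27.45, 11.17),
    "W1.3": (329.63, 48.10, 26.19),
    "W1.4": (427.57, 169.36, 142.65),
    "W2.1": (437.82, 44.75, 22.99),
    "W2.2": (443.19, 63.21, 46.11),
    "W2.3": (447.73, 80.08, 57.79),
}


def reduction_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    for case, (pid, fuzzy, sac) in RESULTS.items():
        print(
            f"{case}: SAC-PID vs PID {reduction_percent(pid, sac):6.2f} %, "
            f"SAC-PID vs Fuzzy PID {reduction_percent(fuzzy, sac):6.2f} %"
        )
```

In every listed case the SAC-PID value is the smallest of the three, although the margin over fuzzy PID varies widely between cases (compare S1.4 with S2.1).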
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
He, J.; Su, S.; Wang, H.; Chen, F.; Yin, B. Online PID Tuning Strategy for Hydraulic Servo Control Systems via SAC-Based Deep Reinforcement Learning. Machines 2023, 11, 593. https://doi.org/10.3390/machines11060593