Energy Consumption Minimization of Quadruped Robot Based on Reinforcement Learning of DDPG Algorithm
Abstract
1. Introduction
2. Robot Modeling
2.1. Kinematics Modeling
2.1.1. Swing Phase of Leg Mechanism
2.1.2. Supporting Phase of Leg Mechanism
2.2. Energy Model
2.3. Foot Trajectory Analysis
3. Methods
4. Results and Discussion
4.1. Simulation Experiment
4.2. Prototype Experiment
4.3. Discussion
5. Conclusions
- This paper proposes a reinforcement learning method for quadruped robots based on the DDPG algorithm, aimed at minimizing the energy consumption of gait patterns.
- Training was performed in two stages using TD3, the twin delayed variant of the DDPG algorithm, which improved learning efficiency and the accuracy of the results (a minimal sketch of the update rule follows this list).
- We compared the simulated energy consumption of this paper's method against two common foot trajectories: the straight-line trajectory and the composite pendulum trajectory.
- The foot trajectories obtained by the proposed method outperformed the other two at different walking speeds, consuming 7% to 9% less energy over the same movement time.
- This paper focused on optimizing the energy consumption of quadruped robots without considering friction at the foot end and in the joints, the initial attitude of the body, or the terrain conditions.
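For concreteness, the sketch below shows a minimal TD3-style update (twin critics, target-policy smoothing, delayed actor updates) of the kind the two-stage DDPG/TD3 training refers to. It is an illustrative assumption, not the authors' implementation: only the 24-value observation, the 12 torque actions, the ±25 torque limit, and the actor/critic learning rates come from the tables below; the network sizes, discount, smoothing noise, and batch interface are assumed.

```python
import copy
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, TORQUE_LIM = 24, 12, 25.0   # from the tables below (obs layout assumed)
GAMMA, TAU, POLICY_DELAY = 0.99, 0.005, 2     # assumed TD3 defaults

def mlp(i, o):
    return nn.Sequential(nn.Linear(i, 256), nn.ReLU(),
                         nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, o))

actor = mlp(OBS_DIM, ACT_DIM)                              # deterministic policy mu(s)
critics = [mlp(OBS_DIM + ACT_DIM, 1) for _ in range(2)]    # twin critics (TD3)
actor_t, critics_t = copy.deepcopy(actor), copy.deepcopy(critics)
pi_opt = torch.optim.Adam(actor.parameters(), lr=4e-4)     # actor lr from the table
q_opt = torch.optim.Adam([p for c in critics for p in c.parameters()], lr=2e-3)

def update(batch, step):
    # batch is assumed to hold tensors: states, actions, rewards, next states, done flags
    s, a, r, s2, done = batch
    with torch.no_grad():
        noise = (0.2 * torch.randn_like(a)).clamp(-0.5, 0.5)           # target-policy smoothing
        a2 = (torch.tanh(actor_t(s2)) * TORQUE_LIM + noise).clamp(-TORQUE_LIM, TORQUE_LIM)
        q_targ = torch.min(*[c(torch.cat([s2, a2], 1)) for c in critics_t])
        y = r + GAMMA * (1 - done) * q_targ                            # clipped double-Q target
    q_loss = sum(((c(torch.cat([s, a], 1)) - y) ** 2).mean() for c in critics)
    q_opt.zero_grad(); q_loss.backward(); q_opt.step()
    if step % POLICY_DELAY == 0:                                       # delayed actor update
        a_pi = torch.tanh(actor(s)) * TORQUE_LIM
        pi_loss = -critics[0](torch.cat([s, a_pi], 1)).mean()
        pi_opt.zero_grad(); pi_loss.backward(); pi_opt.step()
        for net, net_t in [(actor, actor_t)] + list(zip(critics, critics_t)):
            for p, p_t in zip(net.parameters(), net_t.parameters()):
                p_t.data.mul_(1 - TAU).add_(TAU * p.data)              # Polyak averaging
```

The clipped double-Q target and the delayed actor update are what distinguish TD3 from plain DDPG, and they are the usual reason for the improved learning stability the second bullet reports.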
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
D–H parameters of the leg mechanism in the swing phase:

| i | ai−1/mm | αi−1/(°) | θi/(°) | di/mm |
|---|---|---|---|---|
| 1 | 0 | 0 | θ1 | 0 |
| 2 | l1 | 90 | θ2 | 0 |
| 3 | l2 | 0 | θ3 | 0 |
| 4 | l3 | 0 | 0 | 0 |
D–H parameters of the leg mechanism in the supporting phase:

| i | ai−1/mm | αi−1/(°) | θi/(°) | di/mm |
|---|---|---|---|---|
| 1 | 0 | 0 | θ4 | 0 |
| 2 | l3 | 0 | θ3 | 0 |
| 3 | l2 | 0 | θ2 | 0 |
| 4 | 0 | 0 | θ1 | 0 |
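Both tables follow the Denavit–Hartenberg convention. As a worked example, the sketch below chains the swing-phase link transforms to obtain the foot position. It assumes the modified (Craig) D–H convention implied by the ai−1/αi−1 headers; the link lengths l1–l3 and the joint angles are placeholder values, not the prototype's dimensions.

```python
import numpy as np

def dh_transform(a_prev, alpha_prev_deg, theta_deg, d):
    """Link transform T(i-1 -> i) in the modified (Craig) D-H convention,
    using a_{i-1} and alpha_{i-1} as in the tables above."""
    ca, sa = np.cos(np.radians(alpha_prev_deg)), np.sin(np.radians(alpha_prev_deg))
    ct, st = np.cos(np.radians(theta_deg)), np.sin(np.radians(theta_deg))
    return np.array([
        [ct,      -st,      0.0,  a_prev],
        [st * ca,  ct * ca, -sa, -sa * d],
        [st * sa,  ct * sa,  ca,  ca * d],
        [0.0,      0.0,      0.0, 1.0],
    ])

# Swing-phase rows from the first table. l1..l3 and the joint angles are
# placeholder values (mm, degrees), not the robot's actual dimensions.
l1, l2, l3 = 50.0, 200.0, 200.0
th1, th2, th3 = 10.0, 25.0, -50.0

rows = [(0.0, 0.0, th1, 0.0),
        (l1, 90.0, th2, 0.0),
        (l2, 0.0,  th3, 0.0),
        (l3, 0.0,  0.0, 0.0)]
T = np.eye(4)
for a_prev, alpha_prev, theta, d in rows:
    T = T @ dh_transform(a_prev, alpha_prev, theta, d)
print("foot position in the hip frame (mm):", T[:3, 3])
```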
Observations and actions of the agent (RF/RH = right front/hind leg, LF/LH = left front/hind leg):

| Observation (Major): angular velocity | Observation (Major): torque | Action: torque command |
|---|---|---|
| Hip joint angular velocity (RF) | Hip joint torque (RF) | Hip joint torque (RF) |
| Femoral joint angular velocity (RF) | Femoral joint torque (RF) | Femoral joint torque (RF) |
| Knee joint angular velocity (RF) | Knee joint torque (RF) | Knee joint torque (RF) |
| Hip joint angular velocity (RH) | Hip joint torque (RH) | Hip joint torque (RH) |
| Femoral joint angular velocity (RH) | Femoral joint torque (RH) | Femoral joint torque (RH) |
| Knee joint angular velocity (RH) | Knee joint torque (RH) | Knee joint torque (RH) |
| Hip joint angular velocity (LF) | Hip joint torque (LF) | Hip joint torque (LF) |
| Femoral joint angular velocity (LF) | Femoral joint torque (LF) | Femoral joint torque (LF) |
| Knee joint angular velocity (LF) | Knee joint torque (LF) | Knee joint torque (LF) |
| Hip joint angular velocity (LH) | Hip joint torque (LH) | Hip joint torque (LH) |
| Femoral joint angular velocity (LH) | Femoral joint torque (LH) | Femoral joint torque (LH) |
| Knee joint angular velocity (LH) | Knee joint torque (LH) | Knee joint torque (LH) |
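Read as a whole, the table defines a 24-value observation (angular velocity plus applied torque for each of the 12 joints) and a 12-value torque action. Below is a minimal sketch of how such vectors could be assembled; the fixed leg/joint ordering is an assumption taken from the row order above.

```python
import numpy as np

LEGS = ["RF", "RH", "LF", "LH"]     # right/left, front/hind, in table order
JOINTS = ["hip", "femoral", "knee"]
TORQUE_LIM = 25.0                   # torque-command limit from the hyperparameter table

def build_observation(joint_vel, joint_torque):
    """Assemble the 24-value observation: 12 joint angular velocities
    followed by the 12 currently applied joint torques.
    joint_vel, joint_torque: dicts keyed by (leg, joint) -> float."""
    obs = [joint_vel[(leg, joint)] for leg in LEGS for joint in JOINTS]
    obs += [joint_torque[(leg, joint)] for leg in LEGS for joint in JOINTS]
    return np.asarray(obs, dtype=np.float32)

def clip_action(raw_action):
    """Saturate the 12 torque commands to the +/-25 limit."""
    return np.clip(np.asarray(raw_action, dtype=np.float32),
                   -TORQUE_LIM, TORQUE_LIM)
```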
Hyperparameters used for training:

| Hyperparameter | Value |
|---|---|
| Value limit for torque commands | 25 |
| Actor learning rate | 0.0004 |
| Critic learning rate | 0.002 |
| Max steps per episode | 400 |
| Score averaging window length | 250 |
| Stop training value | 190 |
| Save agent value | 200 |
| Joint control range | −45° to 45° |
| Max episodes | 20,000 |
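Collected into one place, the table maps naturally onto a training configuration. The sketch below is a hedged illustration: the dictionary keys and the exact form of the stop rule are assumptions, while the values are taken from the table.

```python
# Training configuration assembled from the hyperparameter table above.
DDPG_CONFIG = {
    "torque_limit": 25.0,            # cap on each torque command
    "actor_lr": 4e-4,
    "critic_lr": 2e-3,
    "max_steps_per_episode": 400,
    "score_window": 250,             # episodes averaged when checking progress
    "stop_training_score": 190,      # halt once the windowed average reaches this
    "save_agent_score": 200,         # snapshot the agent at this score
    "joint_range_deg": (-45.0, 45.0),
    "max_episodes": 20_000,
}

def should_stop(recent_scores, cfg=DDPG_CONFIG):
    """Stop once the average over the last `score_window` episode scores
    reaches the stop-training threshold."""
    window = recent_scores[-cfg["score_window"]:]
    return (len(window) == cfg["score_window"]
            and sum(window) / len(window) >= cfg["stop_training_score"])
```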
Energy consumption of the three foot trajectories at different walking speeds (simulation):

| Trajectory | 1 m/s | 1.2 m/s | 1.4 m/s |
|---|---|---|---|
| Simulation (learned) trajectory | 1.6236 | 1.8152 | 2.8261 |
| Composite pendulum trajectory | 1.8451 | 2.0489 | 3.1021 |
| Straight-line trajectory | 2.3571 | 2.7571 | 3.2871 |
| Decrement | 12% | 11% | 9% |
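As a consistency check, the Decrement row is reproduced below under the assumption that it measures the learned trajectory against the composite pendulum trajectory at each speed:

```python
# Reproduces the "Decrement" row, assuming it compares the learned
# (simulation) trajectory against the composite pendulum trajectory.
learned  = [1.6236, 1.8152, 2.8261]
pendulum = [1.8451, 2.0489, 3.1021]
for speed, e_new, e_old in zip(["1 m/s", "1.2 m/s", "1.4 m/s"], learned, pendulum):
    print(f"{speed}: {(e_old - e_new) / e_old:.0%}")   # -> 12%, 11%, 9%
```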