Reinforcement Learning-Based Landing Impact Mitigation and Stabilization Control for Lunar Quadruped Robots Under Complex Operating Conditions
Abstract
1. Introduction
2. Problem Formulation and Modeling
2.1. Dynamics Model of a Lunar Quadruped Robot
- Hip abduction/adduction: Enables leg spreading and retraction in the lateral plane, used to adjust the support polygon, achieve turning, and maintain lateral balance.
- Hip pitch: Cooperates with the knee to generate forward and backward leg swinging, serving as the primary source of propulsion.
- Knee pitch: Implemented via the PFBM, works in conjunction with hip pitch to accomplish foot-end trajectory tracking, ground contact force control, and terrain adaptation.
2.2. Lunar Sloped-Terrain Landing as a Hybrid Control Problem
2.3. Control Objectives and Engineering Constraints
2.3.1. Impact Mitigation and Peak Suppression
2.3.2. Posture Stability and Slip Suppression
2.3.3. Sustained Stability and Effective Buffering
2.4. Limitations of End-to-End Reinforcement Learning
3. Phase-Structured Landing Framework
3.1. Implicit Phase Decomposition of the Landing Process
- Contact Preparation Phase;
- Energy Dissipation (Buffering) Phase;
- Stabilization Phase.
3.2. Phase-Dominant Control Objectives
3.2.1. Contact Preparation Phase
3.2.2. Energy Dissipation (Buffering) Phase
3.2.3. Stabilization Phase
3.3. Implicit Phase Encoding Without Policy Switching
4. Terrain-Aware State and Control Representation
4.1. Equivalent Support Direction Construction
4.2. Body-Frame Velocity Representation
4.3. Normal–Tangential Velocity Decomposition
4.4. Contact-Gated Variable Modulation
5. Learning-Based Control Design
5.1. Policy Architecture and Action Parameterization
- (i) It preserves a stable baseline behavior;
- (ii) It reduces the effective action space explored by the policy;
- (iii) It allows the policy to focus on critical adjustments during landing buffering rather than generating full-body motion from scratch.
5.2. Episode-Peak-Based Impact Suppression Modeling
5.2.1. Peak Acceleration Definition
5.2.2. Peak Growth Modeling
5.2.3. Barrier-Based Safety Modeling
5.3. Reward Function Design
5.3.1. State Stabilization Terms
5.3.2. Control Regularization
5.3.3. Impact Peak Penalties
5.4. Success Criteria with Stability Window and Buffering Sufficiency
5.4.1. Stability Window Criterion
5.4.2. Buffering Sufficiency Criterion
5.4.3. Overall Success Definition
6. Training Configuration and Implementation Details
6.1. PPO Hyperparameter Configuration
6.2. Observation and Action Space Specification
6.2.1. Observation Space
- Terrain-aligned geometric variables
  - Minimum body–terrain clearance along the equivalent support direction.
  - Equivalent support direction expressed in the body frame.
- Velocity and posture states
  - Normal and tangential velocity components.
  - Body roll and pitch angles.
  - Body-frame angular velocity.
- Joint states
  - Joint positions and velocities for all actuated joints.
- Contact-related information
  - Binary foot contact indicators.
  - Accumulated normal contact forces.
- Episode-level peak statistics
  - Historical peak linear and angular accelerations.
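
The normal and tangential velocity components listed above are consistent with a standard projection of the body velocity onto the equivalent support direction. The following is a minimal sketch under that assumption; the function name `split_velocity`, the sign convention, and the example values are illustrative and are not taken from the paper, whose exact definition may differ.

```python
import math
from typing import Sequence, Tuple


def split_velocity(
    v: Sequence[float], n: Sequence[float]
) -> Tuple[float, Tuple[float, ...]]:
    """Split velocity v into a signed normal speed and a tangential vector.

    n is the (assumed) equivalent support direction; it is normalized here,
    so it does not need to be a unit vector on input.
    """
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0.0:
        raise ValueError("support direction must be non-zero")
    n_hat = [c / norm for c in n]
    # Signed speed along the support direction (negative means approaching).
    v_normal = sum(vc * nc for vc, nc in zip(v, n_hat))
    # Remainder after removing the normal part lies in the support plane.
    v_tangential = tuple(vc - v_normal * nc for vc, nc in zip(v, n_hat))
    return v_normal, v_tangential


# Hypothetical example: 0.8 m/s descent with 0.2 m/s lateral drift,
# support direction taken along +z (flat ground assumed).
v_n, v_t = split_velocity((0.2, 0.0, -0.8), (0.0, 0.0, 1.0))
```

On a slope, only the support direction `n` changes; the same projection then separates the impact-relevant component from the slip-relevant one.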
6.2.2. Action Space
6.3. Domain Randomization Strategy
6.3.1. Mass and Inertia Randomization
6.3.2. Initial Velocity Randomization
6.3.3. Contact-Consistent Initialization
7. Simulation-Based Training
7.1. Simulation Environment and Training Configuration
7.2. Task Definition and Evaluation Metrics
- Episodic return: the cumulative reward over an episode, reflecting the composite objective combining safety, stability, and efficiency;
- Evaluation return: computed in a separate evaluation environment to assess generalization and detect overfitting.
- Minimum base height and compression;
- Step-wise and episode-level peak accelerations;
- Angular velocity norm;
- Roll and pitch angles.
7.3. Experimental Scenarios and Protocol
7.4. Results Analysis
8. Hardware Experiments on a Quadruped Robot
8.1. Test Condition 1: Total Mass 200 kg, Vertical Landing Velocity 0.3 m/s, Slope Angle 0°, Without Surface Protrusions
8.2. Test Condition 2: Total Mass 250 kg, Vertical Landing Velocity 0.3 m/s, Slope Angle 0°, Without Surface Protrusions
8.3. Test Condition 3: Total Mass 250 kg, Vertical Landing Velocity 0.8 m/s, Horizontal Landing Velocity 0.2 m/s, Slope Angle 8°, Without Surface Protrusions
9. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest












| Parameter | Unit | Value |
|---|---|---|
| Mass | kg | 200–360 |
| Thigh link length | m | 0.5 |
| Shank link length | m | 0.5 |
| Body length | m | 1.45 |
| Body width | m | 1.45 |
| Body height | m | 0.4 |
| Maximum joint torque | N·m | 140 |
| Roll joint motion range | ° | −90 to 90 |
| Hip pitch joint motion range | ° | −90 to 30 |
| Knee pitch joint motion range | ° | −180 to 70 |
| Tension spring stiffness | N/m | 6000 |
| Tension spring free length | m | 0.21 |
| Joint mass | kg | 3 |
| Thigh link mass | kg | 2.1 |
| Shank link mass | kg | 1.2 |
| Foot pad mass | kg | 0.6 |

| Parameter | Value |
|---|---|
| Simulation timestep | 0.002 s |
| Control frequency | 50 Hz |
| Control period | 0.02 s |
| Episode duration | 8.0 s |
| Episode length | 400 control steps |
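
The timing values in the table above are mutually consistent, which the short check below makes explicit. It is a sketch only: the constant names are hypothetical, and the interpretation that the simulator advances several physics steps per control step is an assumption, not something stated in the table.

```python
# Values copied from the table; names are illustrative.
SIM_TIMESTEP_S = 0.002   # simulation timestep
CONTROL_HZ = 50          # control frequency
EPISODE_S = 8.0          # episode duration

# Control period implied by the control frequency.
control_period_s = 1.0 / CONTROL_HZ

# Assumed: physics substeps executed per control step.
substeps_per_control = round(control_period_s / SIM_TIMESTEP_S)

# Number of control steps in one episode.
episode_steps = round(EPISODE_S * CONTROL_HZ)

print(control_period_s, substeps_per_control, episode_steps)
```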

| Parameter | Value |
|---|---|
| Proportional gain | 210.0 |
| Derivative gain | 21.0 |
| Torque limit | ±140 N·m |
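
One common way to use gains and a torque limit like those in the table above is a joint-level PD law with output saturation. The sketch below assumes that structure, assumes gain units of N·m/rad and N·m·s/rad, and uses a hypothetical function name `pd_torque`; the paper's low-level controller may differ.

```python
KP = 210.0             # proportional gain (units assumed: N·m/rad)
KD = 21.0              # derivative gain (units assumed: N·m·s/rad)
TORQUE_LIMIT = 140.0   # symmetric torque limit in N·m


def pd_torque(q_des: float, q: float, qd: float, qd_des: float = 0.0) -> float:
    """Return a saturated PD torque for one joint.

    q_des, q are target and measured joint positions (rad);
    qd_des, qd are target and measured joint velocities (rad/s).
    """
    raw = KP * (q_des - q) + KD * (qd_des - qd)
    # Clamp to the actuator limit so large errors cannot exceed ±140 N·m.
    return max(-TORQUE_LIMIT, min(TORQUE_LIMIT, raw))


small_error = pd_torque(q_des=0.1, q=0.0, qd=0.0)   # inside the limit
large_error = pd_torque(q_des=1.0, q=0.0, qd=0.0)   # saturates
```

With these gains, a position error of about 0.67 rad at zero velocity is already enough to reach the torque limit, so saturation is likely to be active during hard touchdowns.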

| Quantity | Limit | Abort |
|---|---|---|
| Linear acceleration | 2.0 | 10.2 |
| Angular acceleration | 12.0 | 30.0 |
| Body tilt | 20° | termination |

| Parameter | Value |
|---|---|
| Alive penalty | −0.01 |
| Linear acc barrier | 50.0 |
| Angular acc barrier | 5.0 |
| Step peak lin acc | 0.002 |
| Step peak ang acc | 0.001 |
| Peak growth lin | 0.05 |
| Peak growth ang | 0.02 |
| Terminal peak lin | 1.0 |
| Terminal peak ang | 0.3 |
| Normal velocity | 1.2 |
| Tangential velocity | 2.0 |
| Attitude | 30.0 |
| Angular velocity | 0.10 |
| Height barrier | 350.0 |
| Joint deviation | 0.50 |
| Joint velocity | 0.03 |
| Torque effort | 1 × 10⁻⁴ |
| Action smoothness | 1 × 10⁻³ |
| Contact ratio | 0.15 |
| Normal force variance | 1 × 10⁻⁴ |
| Failure penalty | −300 |
| Success bonus | +250 |

| Metric | Threshold |
|---|---|
| Roll and pitch angles | ≤2° |
| Normal velocity | ≤0.04 m/s |
| Tangential velocity | ≤0.05 m/s |
| Angular velocity norm | ≤0.125 rad/s |
| Minimum clearance | ≥0.15 m |
| Contact ratio | ≥0.75 (≥3 feet) |
| Window duration | 0.9 s |
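
The thresholds above can be read as a per-step test that must hold continuously for the window duration. The sketch below assumes that reading, a 50 Hz control rate (so 0.9 s corresponds to 45 consecutive steps), and magnitude-based comparisons for angles and velocities. The names `Sample`, `is_stable`, and `window_satisfied` are hypothetical, and the buffering sufficiency criterion of Section 5.4.2 is not modeled here.

```python
from dataclasses import dataclass
from typing import Iterable

CONTROL_HZ = 50                          # assumed control rate
WINDOW_STEPS = round(0.9 * CONTROL_HZ)   # 0.9 s window -> 45 steps


@dataclass
class Sample:
    roll_deg: float
    pitch_deg: float
    v_normal: float       # m/s
    v_tangential: float   # m/s (magnitude)
    ang_vel_norm: float   # rad/s
    clearance: float      # m
    contact_ratio: float  # fraction of feet in contact


def is_stable(s: Sample) -> bool:
    """Check one control step against every threshold in the table."""
    return (
        abs(s.roll_deg) <= 2.0
        and abs(s.pitch_deg) <= 2.0
        and abs(s.v_normal) <= 0.04
        and abs(s.v_tangential) <= 0.05
        and s.ang_vel_norm <= 0.125
        and s.clearance >= 0.15
        and s.contact_ratio >= 0.75
    )


def window_satisfied(samples: Iterable[Sample]) -> bool:
    """True once WINDOW_STEPS consecutive samples are all stable."""
    run = 0
    for s in samples:
        run = run + 1 if is_stable(s) else 0  # any violation resets the run
        if run >= WINDOW_STEPS:
            return True
    return False
```

Resetting the counter on any violation means a brief rebound or slip late in the episode restarts the window, which is the stricter of the two plausible readings of a "stability window".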

| Parameter | Value |
|---|---|
| Algorithm | PPO |
| Policy network | MLP |
| Actor–critic layers | [512, 512] |
| Learning rate | |
| Rollout length | 2048 |
| Batch size | 2048 |
| Epochs per update | 5 |
| Discount factor (γ) | 0.99 |
| GAE (λ) | 0.95 |
| Clip range | 0.2 |
| Target KL | 0.12 |
| Entropy coefficient | 0.0 |
| Value coefficient | 0.5 |
| Gradient norm | 0.5 |
| Parallel envs | 16 |
| Total steps | |
| Device | GPU (CUDA) |
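
The rollout, batch, and epoch settings above jointly determine how much data each policy update sees. The arithmetic below is a sketch that assumes the rollout length is counted per environment, as in several common PPO implementations; if it is instead the total across environments, the derived numbers change. Constant names are illustrative.

```python
# Values copied from the table; names are illustrative.
ROLLOUT_LENGTH = 2048   # steps collected per environment (assumed per-env)
PARALLEL_ENVS = 16
BATCH_SIZE = 2048       # minibatch size for each gradient step
EPOCHS_PER_UPDATE = 5

# Transitions gathered before each policy update.
samples_per_update = ROLLOUT_LENGTH * PARALLEL_ENVS

# Minibatches needed to sweep the rollout buffer once.
minibatches_per_epoch = samples_per_update // BATCH_SIZE

# Gradient steps per update, ignoring any early stop from the target KL.
gradient_steps_per_update = minibatches_per_epoch * EPOCHS_PER_UPDATE

print(samples_per_update, minibatches_per_epoch, gradient_steps_per_update)
```

Under the same per-environment assumption and a 400-step episode, each environment contributes roughly five episodes to every rollout.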

| No. | Parameter | Test Value |
|---|---|---|
| 1 | Bulk density (g/cm³) | – |
| 2 | Deformation index | – |
| 3 | Cohesion modulus (kN/mⁿ⁺¹) | – |
| 4 | Friction modulus (kN/mⁿ⁺²) | 281–652 |
| 5 | Shear deformation modulus (cm) | – |
| 6 | Equivalent stiffness modulus (kPa/mⁿ) | 840–2800 |
| 7 | Contact stiffness (N) | – – |
| 8 | Cohesion (kPa) | – |
| 9 | Internal friction angle (°) | – |
| 10 | Thermal conductivity () | – – |
| 11 | Albedo | – |

| Buffering Mobile Leg | Hip Roll Joint Motor Speed (rpm) | Hip Pitch Joint Motor Speed (rpm) | Knee Pitch Joint Motor Speed (rpm) |
|---|---|---|---|
| M01-1 | 73 | 911 | 545 |
| M01-2 | 91 | 721 | 399 |
| M01-3 | 3 | 780 | 461 |
| M01-4 | 73 | 882 | 527 |
| Maximum | 91 | 911 | 545 |

| Buffering Mobile Leg | Hip Roll Joint Torque (N·m) | Hip Pitch Joint Torque (N·m) | Knee Pitch Joint Torque (N·m) |
|---|---|---|---|
| M01-1 | 9.66 | 75.69 | 31.82 |
| M01-2 | 16.73 | 84.51 | 31.43 |
| M01-3 | 3.92 | 101.5 | 30.96 |
| M01-4 | 17.36 | 96.05 | 35.87 |
| Maximum | 17.36 | 101.5 | 35.87 |

| Buffering Mobile Leg | Duration of the First Touchdown | Duration of the Second Touchdown | Duration of the Third Touchdown |
|---|---|---|---|
| Leg 1 | 5 ms | 667 ms | Duration |
| Leg 2 | Duration | / | / |
| Leg 3 | Duration | / | / |
| Leg 4 | 4 ms | Duration | / |

| Number | FX (N) | FY (N) | FZ (N) | MX (N·m) | MY (N·m) | MZ (N·m) |
|---|---|---|---|---|---|---|
| Leg 1 | 36.66 | −24.19 | 273.20 | −8.67 | −170.60 | −9.07 |
| Leg 2 | −27.38 | 16.08 | 313.29 | 33.13 | −180.58 | 10.19 |
| Leg 3 | −77.62 | 22.62 | 312.68 | −7.26 | −184.85 | 10.74 |
| Leg 4 | −56.51 | −35.80 | 332.34 | −11.93 | −188.11 | −13.75 |
| Maximum absolute value | 77.62 | 35.80 | 332.34 | 33.13 | 188.11 | 13.75 |

| Category | Pitch Angle | Roll Angle | Yaw Angle |
|---|---|---|---|
| Initial landing (°) | 22.44 | 8.70 | 39.02 |
| Maximum deflection (°) | 22.42 | 8.12 | 39.73 |
| Landing completion (°) | 22.37 | 8.15 | 39.65 |
| Maximum deviation (°) | 0.07 | 0.58 | 0.71 |

| Joint | Quantity | Motor Speed (rpm) | Joint Torque (N·m) |
|---|---|---|---|
| Hip pitch joint | Measured maximum value | 911 | 75.69 |
| Hip pitch joint | Simulated maximum value | 884 | 81.73 |
| Hip pitch joint | Deviation | 26 | 6.04 |
| Knee pitch joint | Measured maximum value | 545 | 31.82 |
| Knee pitch joint | Simulated maximum value | 551 | 39.21 |
| Knee pitch joint | Deviation | 7 | 7.39 |
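
The torque deviations in the measured-versus-simulated comparison above are absolute differences; expressing them relative to the measured value gives a rough sense of the simulation-to-hardware gap. The sketch below does this for the two joint torque rows only. Normalizing by the measured value and the helper name `deviation` are choices made for illustration, not taken from the paper.

```python
from typing import Tuple


def deviation(measured: float, simulated: float) -> Tuple[float, float]:
    """Return (absolute deviation, deviation relative to the measured value)."""
    absolute = abs(simulated - measured)
    return absolute, absolute / abs(measured)


# Peak joint torques (N·m) copied from the comparison table.
hip_abs, hip_rel = deviation(measured=75.69, simulated=81.73)
knee_abs, knee_rel = deviation(measured=31.82, simulated=39.21)

for name, a, r in (("hip pitch", hip_abs, hip_rel),
                   ("knee pitch", knee_abs, knee_rel)):
    print(f"{name}: {a:.2f} N·m ({100 * r:.1f}% of measured)")
```

Although the two absolute torque deviations are similar in size, the knee value is the larger fraction of its measured peak because the knee torque itself is smaller.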

| Buffering Mobile Leg | Hip Roll Joint Motor Speed (rpm) | Hip Pitch Joint Motor Speed (rpm) | Knee Pitch Joint Motor Speed (rpm) |
|---|---|---|---|
| M01-1 | 51 | 944 | 659 |
| M01-2 | 95 | 966 | 637 |
| M01-3 | 98 | 1175 | 651 |
| M01-4 | 95 | 1153 | 684 |
| Maximum | 98 | 1175 | 684 |

| Buffering Mobile Leg | Hip Roll Joint Torque (N·m) | Hip Pitch Joint Torque (N·m) | Knee Pitch Joint Torque (N·m) |
|---|---|---|---|
| M01-1 | 10.91 | 84.44 | 35.96 |
| M01-2 | 19.35 | 96.4 | 38.65 |
| M01-3 | 8.91 | 114.19 | 37.15 |
| M01-4 | 24.02 | 106.1 | 41.53 |
| Maximum | 24.02 | 114.19 | 41.53 |

| Buffering Mobile Leg | Duration of the First Touchdown | Duration of the Second Touchdown | Duration of the Third Touchdown | Duration of the Fourth Touchdown | Duration of the Fifth Touchdown |
|---|---|---|---|---|---|
| Leg 1 | 4 ms | 644 ms | Duration | / | / |
| Leg 2 | 4 ms | 713 ms | Duration | / | / |
| Leg 3 | 4 ms | 4 ms | 18 ms | 3 ms | Duration |
| Leg 4 | 4 ms | Duration | / | / | / |

| Number | FX (N) | FY (N) | FZ (N) | MX (N·m) | MY (N·m) | MZ (N·m) |
|---|---|---|---|---|---|---|
| Leg 1 | 40.47 | −26.1 | 317.7 | −9.55 | −187.47 | −8.02 |
| Leg 2 | 35.84 | −22.89 | 353.34 | 25.62 | −197.34 | −9.99 |
| Leg 3 | −94.09 | 33.34 | 345.5 | 7.98 | −200.17 | 13.08 |
| Leg 4 | −72.14 | −41.04 | 372.29 | −17.94 | −202.79 | −15.09 |
| Maximum absolute value | 94.09 | 41.04 | 372.29 | 25.62 | 202.79 | 15.09 |

| Joint | Quantity | Motor Speed (rpm) | Joint Torque (N·m) |
|---|---|---|---|
| Hip pitch joint | Measured maximum value | 944 | 84.44 |
| Hip pitch joint | Simulated maximum value | 1063 | 72.81 |
| Hip pitch joint | Deviation | 120 | 11.63 |
| Knee pitch joint | Measured maximum value | 659 | 35.96 |
| Knee pitch joint | Simulated maximum value | 647 | 22.26 |
| Knee pitch joint | Deviation | 11 | 13.69 |

| Category | Pitch Angle | Roll Angle | Yaw Angle |
|---|---|---|---|
| Initial landing (°) | 23.00 | 9.13 | 30.63 |
| Maximum deflection (°) | 22.96 | 8.24 | 31.31 |
| Landing completion (°) | 22.96 | 8.34 | 31.31 |
| Maximum deviation (°) | 0.04 | 0.89 | 0.68 |

| Buffering Mobile Leg | Hip Roll Joint Motor Speed (rpm) | Hip Pitch Joint Motor Speed (rpm) | Knee Pitch Joint Motor Speed (rpm) |
|---|---|---|---|
| M01-1 | 300 | 1585 | 1124 |
| M01-2 | 355 | 1149 | 919 |
| M01-3 | 227 | 2288 | 1146 |
| M01-4 | 318 | 2482 | 1124 |
| Maximum | 355 | 2482 | 1146 |

| Buffering Mobile Leg | Hip Roll Joint Torque (N·m) | Hip Pitch Joint Torque (N·m) | Knee Pitch Joint Torque (N·m) |
|---|---|---|---|
| M01-1 | 62.1 | 98.58 | 85.56 |
| M01-2 | 95.75 | 126.15 | 89.69 |
| M01-3 | 30.71 | 138.06 | 95.65 |
| M01-4 | 63.1 | 134.47 | 76.31 |
| Maximum | 95.75 | 138.06 | 95.65 |

| Buffering Mobile Leg | Duration of the First Touchdown | Duration of the Second Touchdown | Duration of the Third Touchdown | Duration of the Fourth Touchdown | Duration of the Fifth Touchdown |
|---|---|---|---|---|---|
| Leg 1 | 3 ms | 921 ms | / | / | / |
| Leg 2 | 4 ms | 912 ms | / | / | / |
| Leg 3 | 4 ms | 3 ms | 4 ms | 754 ms | / |
| Leg 4 | 5 ms | Duration | / | / | / |

| Number | FX (N) | FY (N) | FZ (N) | MX (N·m) | MY (N·m) | MZ (N·m) |
|---|---|---|---|---|---|---|
| Leg 1 | −315.73 | 110.64 | 528.84 | 82.06 | −216.11 | 49.02 |
| Leg 2 | −260.42 | −208.21 | 549.78 | 63.32 | −227.44 | −102.15 |
| Leg 3 | −317.58 | 222.36 | 539.44 | 54.66 | −288.84 | 75.16 |
| Leg 4 | −227.97 | 283.05 | 596.76 | 83.51 | −337.47 | −101.14 |
| Maximum absolute value | 317.58 | 283.05 | 596.76 | 83.51 | 337.47 | 102.15 |

| Category | Pitch Angle | Roll Angle | Yaw Angle |
|---|---|---|---|
| Initial landing (°) | 21.82 | 9.27 | −94.98 |
| Maximum deflection (°) | 21.86 | 2.22 | −92.35 |
| Landing completion (°) | 22.38 | 3.15 | −92.89 |
| Maximum deviation (°) | 0.56 | 7.05 | 2.63 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Li, J.; Yuan, Y.; Liu, Z.; Sun, S. Reinforcement Learning-Based Landing Impact Mitigation and Stabilization Control for Lunar Quadruped Robots Under Complex Operating Conditions. Machines 2026, 14, 417. https://doi.org/10.3390/machines14040417

