Robust Locomotion Control of Quadrupedal Wheel-Legged Robots via Contrastive History-Aware Reinforcement Learning in Complex Environments

Dai, Deyun; Liu, Tao; Tang, Tengfei

doi:10.3390/machines14050568

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article

by Deyun Dai ¹,*, Tao Liu ¹ and Tengfei Tang ²,*

¹ School of Mechanical Engineering, Zhejiang University, Hangzhou 310058, China
² School of Mechanical Engineering, Zhejiang Sci-Tech University, Hangzhou 310018, China
* Authors to whom correspondence should be addressed.

Machines 2026, 14(5), 568; https://doi.org/10.3390/machines14050568

Submission received: 15 April 2026 / Revised: 14 May 2026 / Accepted: 18 May 2026 / Published: 20 May 2026

(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)

Abstract

Quadrupedal wheel-legged robots possess exceptional mobility in complex terrains, but their robust locomotion control is severely hindered by the difficulty of accurate state estimation without external sensors. Existing reinforcement learning methods relying on two-stage imitation often suffer from representation collapse and information loss during sim-to-real transfer. To address these challenges, this paper proposes a novel end-to-end reinforcement learning framework for implicit state estimation, incorporating terrain and external force features. Inspired by internal model control, the proposed method leverages a history of purely proprioceptive observations to extract explicit kinematic responses, as well as implicit environmental and external force representations via prototypical contrastive learning, completely circumventing explicit terrain regression and the need for physical force sensors. Furthermore, a tailored composite reward function and a progressive curriculum training strategy with large-scale domain randomization are integrated to ensure dynamic stability and hardware safety. Extensive cross-simulator validations and real-world deployments demonstrate that the approach achieves highly agile and robust locomotion, including adaptive traversal over diverse terrains. Experiments show that the method significantly enhances robustness under external disturbances, notably reducing the lateral linear velocity tracking error from 0.2421 m/s to 0.1319 m/s. The proposed method realizes zero-shot sim-to-real transfer with superior sample efficiency, providing a reliable and universal control paradigm for wheel-legged robots in unstructured environments.
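The abstract names prototypical contrastive learning but does not give the loss. As a rough illustration only, the sketch below shows one common prototype-based contrastive objective: a latent vector encoded from the observation history is pulled toward its assigned prototype and pushed away from the others through a temperature-scaled softmax. The function names, the temperature value, and the way prototype assignments are supplied are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def proto_contrastive_loss(z, prototypes, assignments, temperature=0.1):
    """Hypothetical prototype-based contrastive loss (not the paper's exact form).

    z           : (B, D) latent vectors, e.g. encoded from a proprioceptive history
    prototypes  : (K, D) prototype vectors (learnable in a real training loop)
    assignments : (B,) integer index of the prototype assigned to each sample
    temperature : softmax temperature (assumed value)
    """
    z = l2_normalize(z)
    c = l2_normalize(prototypes)
    logits = z @ c.T / temperature                        # (B, K) scaled cosine similarity
    logits = logits - logits.max(axis=1, keepdims=True)   # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy between the softmax over prototypes and the assigned prototype.
    return -log_probs[np.arange(len(z)), assignments].mean()


if __name__ == "__main__":
    protos = np.eye(3)
    latents = np.eye(3)
    aligned = proto_contrastive_loss(latents, protos, np.array([0, 1, 2]))
    misaligned = proto_contrastive_loss(latents, protos, np.array([1, 2, 0]))
    print(aligned < misaligned)
```

In this toy setup the loss is near zero when each latent coincides with its assigned prototype and grows when the assignments are shuffled, which is the behavior a contrastive objective of this kind relies on. How the assignments are obtained in practice (for example by clustering or an online assignment step) is left open here.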

Keywords: quadrupedal wheel-legged robot; deep reinforcement learning; implicit state estimation; contrastive learning; sim-to-real transfer
