Gait Generation and Motion Implementation of Humanoid Robots Based on Hierarchical Whole-Body Control

Wang, Helin; Huang, Wenxuan

doi:10.3390/electronics14234714


by Helin Wang * and Wenxuan Huang

Faculty of Intelligent Technology, Shanghai Institute of Technology, Shanghai 201418, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(23), 4714; https://doi.org/10.3390/electronics14234714

Submission received: 18 November 2025 / Revised: 20 November 2025 / Accepted: 26 November 2025 / Published: 29 November 2025

(This article belongs to the Special Issue Advances in Intelligent Computing and Systems Design)


Abstract

Making machines mimic human walking, grasping, balancing, and other behaviors requires a deep exploration of cognitive science and biological principles. To address the prediction lag problem in existing methods, an error compensation mechanism that integrates historical motion data is proposed. This paper constructs a humanoid autonomous walking control system that uses a three-dimensional linear inverted pendulum model to plan the overall framework of motion. First, the landing point coordinates for the single-foot support period are preset through gait cycle parameters. These coordinates are then substituted into the dynamic equation to solve for a center-of-mass (CoM) trajectory that conforms to physical constraints. A hierarchical whole-body control architecture is designed, with a task-priority-based quadratic programming solver at the bottom layer that decomposes high-level motion instructions into joint-space control variables and fuses sensor data. Furthermore, a numerical iterative algorithm is used to solve the sequence of driving angles for each joint, which forms the control input for driving the robot’s motion. By optimizing the CoM trajectory online, the algorithm overcomes the vertical motion constraints of traditional inverted pendulum models. It also introduces a contact phase sequence prediction mechanism to ensure a smooth transition of the foot trajectory during phase switching. Simulation results demonstrate that the proposed framework improves disturbance rejection capability by over 30% compared to traditional ZMP tracking and achieves a real-time control loop frequency of 1 kHz, confirming its enhanced robustness and computational efficiency.

Keywords:

hierarchical whole-body control; biped walking; foot trajectory; numerical iterative algorithm

1. Introduction

Recent years have witnessed rapid development in humanoid robots. In the field of service robots, they can assist with tasks such as item handling and home cleaning in smart home scenarios and adapt efficiently to complex indoor environments. In the field of medical rehabilitation, they can serve as auxiliary devices that assist patients in gait training by adjusting their own movement patterns in real time to match the patient’s rehabilitation progress and physiological characteristics [1,2,3]. When performing search and rescue tasks on complex terrain such as post-earthquake ruins and mountains, robots must have strong environmental adaptability and motion stability to advance smoothly over irregular ground and through narrow spaces [4,5,6].

Complex kinematic and dynamic modeling requires precise consideration of factors such as multi-joint coupling and nonlinear friction, which places extremely high demands on the accuracy and computational efficiency of the dynamic model. The fusion of real-time perception and feedback control suffers from data delay and noise interference, which affects the timeliness and accuracy of gait adjustment [7,8]. In addition, robots need to be optimized under multiple constraints such as energy consumption, motion speed, and joint torque to achieve efficient and safe movement. This article aims to explore motion control methods that integrate reinforcement learning and model predictive control by analyzing the intrinsic relationship between sensor feedback mechanisms and gait planning strategies [9,10,11,12]. The goal is to build a humanoid robot motion control system. Through extensive simulation, algorithm parameters and control logic are continuously optimized so that the robot can quickly and accurately adjust its gait in complex environments. At the same time, combining lightweight design with efficient energy management strategies reduces the system’s energy consumption and improves the robot’s continuous operation capability [13].

As a cutting-edge field that deeply integrates artificial intelligence and robotics, humanoid robots have attracted widespread attention and in-depth research in academia and industry in recent years [3,14,15]. One of the core tasks is to achieve stable and efficient gait planning and precise motion control, which concerns not only the motion efficiency of the robot in different scenarios but also its stability. With the progress of micro-electromechanical system (MEMS) technology, the precision and reliability of inertial measurement units (IMUs), pressure sensors, visual sensors, and other sensing technologies have improved [16,17,18]. At the same time, the rapid development of edge computing and cloud computing capabilities has made gait generation strategies based on real-time sensor feedback a focus of current research.

Gait generation and motion control of humanoid robots remain challenging because the robot must achieve dynamic balance and task execution in unstructured environments [19,20]. Heydarnia and colleagues proposed a neural network-based walking control architecture in which the core control parameters (the weight matrix) are adaptively optimized through a genetic algorithm (GA), and a simulation environment containing joint dynamics constraints is constructed to realize the evolutionary learning of humanoid walking modes [21,22,23,24]. This study is the first to introduce biological evolutionary mechanisms into robot gait control, verifying the feasibility of generating stable walking strategies through global search algorithms. H. Duan developed a 3D semi-passive walking robot with four transmission joints, whose innovation lies in using an online reinforcement learning (RL) framework to optimize motion control strategies in real time [25,26,27]. Without prior terrain models, the system can autonomously learn dynamic walking patterns that adapt to different ground conditions (such as changes in slope and friction coefficient) within about 20 min through trial-and-error interaction with the environment. This data-driven learning method breaks through the limitations of traditional model-dependent control and provides a new technological path for the autonomous adaptation of robots in complex environments [28,29]. While these learning-based methods show remarkable adaptability, they often face challenges in providing strict stability guarantees and real-time computational efficiency for high-dimensional systems.

The pursuit of stable and dynamic locomotion for humanoid robots has led to the development of several prominent control paradigms, which can be broadly categorized into three groups:

Model-based preview control: This class of methods, most notably Zero Moment Point (ZMP)-based preview control, relies on simplified dynamical models to plan stable center of mass trajectories. While highly successful in generating stable walking patterns on flat terrain and computationally efficient, these methods often lack the flexibility to handle significant external disturbances or complex, unstructured environments due to their reliance on pre-defined models and limited feedback integration [30,31].

Optimization-based whole-body control (WBC): WBC methods formulate the control problem as a series of constrained optimization problems, often using Quadratic Programming (QP), to solve for joint torques/positions that execute multiple tasks simultaneously while respecting physical constraints. They offer superior performance in whole-body motion coordination and constraint handling. However, their performance is heavily dependent on the accuracy of the underlying dynamic model and the computational efficiency of the solver, which can be a bottleneck for complex real-time applications.

Learning-based Control: Driven by advances in machine learning, these methods, particularly reinforcement learning (RL) [32,33] and deep model predictive control, have demonstrated remarkable success in enabling robots to learn complex locomotion skills through interaction with simulated environments. They excel at adaptation and can discover highly dynamic and robust gaits. The primary challenges include the immense data and computational requirements for training, the difficulty of providing formal stability guarantees, and the significant sim-to-real transfer gap.

While these approaches have propelled the field forward, key gaps remain. First, there is a need for a control framework that seamlessly integrates the long-horizon, stable planning of model-based methods with the reactive, whole-body constraint satisfaction of optimization-based control. Second, many WBC implementations are not tightly coupled with an online gait planner, limiting their adaptability to dynamic terrain. Finally, there is a lack of frameworks that provide the computational efficiency and stability guarantees of model-based optimization while incorporating the adaptive benefits of real-time sensor feedback in a way that is robust to model inaccuracies [34,35].

This article proposes a gait planning and control framework that integrates sensing feedback, stability constraints, and real-time optimization. To address motion stability under complex ground conditions, a center-of-mass (CoM) trajectory generation algorithm based on a three-dimensional linear inverted pendulum model is established and combined with an improved ZMP preview control algorithm that can fully track the reference ZMP trajectory, yielding a gait planner with dynamic adjustment capability. The algorithm overcomes the vertical motion constraints of traditional inverted pendulum models by optimizing the CoM trajectory in real time. Meanwhile, it introduces a contact phase sequence prediction mechanism to ensure a smooth transition of the foot trajectory during the switch between the support phase and the swing phase.

On this basis, a hierarchical whole-body control architecture is designed, with a task-priority-based quadratic programming solver at the bottom layer. It decomposes high-level motion instructions into joint-space control variables and fuses sensor data to construct a closed-loop feedback system. To clearly demonstrate the contribution of the proposed hierarchical whole-body control (HWBC) approach, we not only compare it against traditional ZMP tracking methods but also contextualize its performance relative to the state of the art, including reinforcement learning [36,37] and advanced model predictive control strategies. Our aim is to highlight the specific niche of HWBC: providing a model-based, computationally efficient, and highly stable control solution with explicit constraint handling, which serves as a robust alternative to emerging learning-based paradigms. Specifically, a predictive time-domain impedance adjustment strategy is designed for sudden disturbances. Simulation verification shows that the anti-interference ability is improved by more than 30% compared to traditional ZMP tracking methods. The simulation results further reveal the synergistic mechanism between multimodal sensing fusion and model predictive control, providing a theoretical foundation and technical implementation path for enhancing the dynamic motion of humanoid robots. The main contributions are summarized as follows.

(1): The paper proposes a unified control architecture that integrates high-level gait planning with low-level whole-body motion control. It uses a task-priority quadratic programming (QP) solver to decompose motion commands into joint-space control variables while respecting physical constraints.
(2): An improved 3D-LIPM-based gait planner is introduced, which optimizes the centroid trajectory in real time and overcomes the vertical motion limitations of traditional inverted pendulum models. A contact phase sequence prediction mechanism is incorporated to ensure smooth foot trajectory transitions between swing and support phases. The proposed method achieves over 30% improvement in disturbance rejection compared to traditional ZMP tracking methods.
(3): The controller fuses data from force/torque sensors and IMUs to form a closed-loop feedback system. A predictive time-domain impedance adjustment strategy is designed to handle sudden disturbances. The framework is validated in a MATLAB/Simulink environment and compared against state-of-the-art methods. The results show that HWBC offers a favorable trade-off between tracking accuracy, computational speed, and stability guarantees.
(4): The inverse kinematics module uses a Damped Least-Squares (DLS) method with variable damping to handle joint singularities. The QP-based solver explicitly enforces joint limits, torque constraints, and ZMP stability, ensuring safe and feasible motions.

2. Model and Method

2.1. Dynamics and Mathematical Analysis

The mechanical structure of the humanoid robot can be regarded as a multi-rigid-body system whose structural design achieves motion functionality through the combination of rigid links and ideal joints [13]. In this paper, components such as the torso and limbs are simplified into homogeneous rigid links with uniformly distributed mass and no elastic deformation. Adjacent links are connected via frictionless ideal joints, forming a joint-link dynamic model. The model includes rotational links (the thigh rotating around the hip joint, the lower leg rotating around the knee joint, and the foot rotating around the ankle joint) as well as non-rotational rigid links that connect the torso to the limbs (Figure 1). The simplified model disregards joint friction and the elastic deformation of links. Joint rotation depends primarily on the mechanical structure and drive mechanism.

The dynamic balance of a bipedal robot is governed by the interaction of external forces and internally generated joint torques. As illustrated in Figure 1, the main external forces during locomotion are gravity (mg), acting at the center of mass (CoM), and the ground reaction force (F_gr), acting at the ZMP. According to D’Alembert’s principle, the system’s dynamics can be treated as an instantaneous equilibrium by introducing an inertial force that acts through the CoM in the direction opposite to its acceleration.

The fundamental dynamic equation governing the robot’s base link is:

F_{gr} - m g - m a = 0

The moment balance of these forces about a point on the ground determines the ZMP location. A stable gait requires the moments about the horizontal axes at the ZMP to be zero, ensuring no rotational instability. These external and inertial forces are counteracted and generated by internal joint torques. The torques at the ankle, knee, and hip joints perform two critical, simultaneous functions. First, for trajectory control, they generate the joint accelerations needed to track the desired kinematic trajectories, thereby producing the specific inertial forces required for locomotion. Second, for balance control, they adjust the whole-body posture to regulate the magnitude and direction of the ground reaction force (GRF), thereby controlling the CoM acceleration and ZMP location to maintain dynamic balance against disturbances.

The core challenge for our whole-body controller is to compute the optimal set of joint torques that reconcile the often competing demands of motion execution and stability maintenance. Our hierarchical Quadratic Programming (QP) solver, detailed in Section 3, is designed to solve this optimization problem in real-time while strictly adhering to physical constraints such as joint torque limits, friction cones, and ZMP stability.

Compared with traditional multi-link models, the three-dimensional linear inverted pendulum model is further simplified in three respects: (1) all mass is assumed to be concentrated at the center of mass; (2) the robot’s legs are assumed to be massless, and the contact points are free to rotate; (3) the center of mass is defined to move on a given constraint surface [14]. Given the CoM position and landing point position, together with the necessary constraints, the motion state of the robot at any time can be solved accurately. From these assumptions, the horizontal motion of the CoM can be derived, as expressed in Equation (1). The dynamics of the 3D-LIPM follow from the fundamental principle that the horizontal motion of the CoM is governed by the gravitational moment about the supporting point. Assuming a constant CoM height z_c, the equations of motion are:

\begin{cases} \ddot{x} = \frac{g}{z_{c}} (x - p_{x}) \\ \ddot{y} = \frac{g}{z_{c}} (y - p_{y}) \end{cases}

(1)

where (x,y) is the horizontal position of the CoM, (p_x, p_y) is the position of the ZMP, g is the gravitational acceleration, and z_c is the constant height of the pendulum.

As Equation (1) shows, the horizontal motion of the CoM is determined solely by the gravitational acceleration g, the position of the CoM, and the intercept of the constraint surface, and it is independent of the slope of the constraint surface [15]. Because of this characteristic, the model is defined as the three-dimensional linear inverted pendulum.
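One way to see where Equation (1) comes from is a moment balance about the support point. The short derivation below is a sketch under the model assumptions stated above (point mass m, massless leg, constant CoM height z_c). Because the vertical acceleration is zero, the vertical ground reaction force equals m g, and because the reaction force of a point-mass model must pass through the CoM, its horizontal component is proportional to the horizontal offset between the CoM and the ZMP:

f_{x} = m g \frac{x - p_{x}}{z_{c}}, \quad m \ddot{x} = f_{x} \;\Rightarrow\; \ddot{x} = \frac{g}{z_{c}} (x - p_{x})

The same argument applies along the y axis. Rearranged, the relation gives the ZMP that corresponds to a given CoM motion:

p_{x} = x - \frac{z_{c}}{g} \ddot{x}, \quad p_{y} = y - \frac{z_{c}}{g} \ddot{y}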

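The analytical solution to these differential equations, which describes the state of the system at time t, is given below, where T_c = \sqrt{z_c / g} is the time constant of the pendulum and (x_0, \dot{x}_0) and (y_0, \dot{y}_0) are the initial position and velocity of the CoM: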

\begin{cases} x(t) = p_{x} + (x_{0} - p_{x}) \cosh(t / T_{c}) + T_{c} \dot{x}_{0} \sinh(t / T_{c}) \\ \dot{x}(t) = \frac{x_{0} - p_{x}}{T_{c}} \sinh(t / T_{c}) + \dot{x}_{0} \cosh(t / T_{c}) \end{cases}

(2)

\begin{cases} y(t) = p_{y} + (y_{0} - p_{y}) \cosh(t / T_{c}) + T_{c} \dot{y}_{0} \sinh(t / T_{c}) \\ \dot{y}(t) = \frac{y_{0} - p_{y}}{T_{c}} \sinh(t / T_{c}) + \dot{y}_{0} \cosh(t / T_{c}) \end{cases}

(3)
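As a hedged illustration of Equations (2) and (3), the closed-form solution can be evaluated directly and cross-checked against a numerical integration of the underlying dynamics, \ddot{x} = (g / z_c)(x - p_x). The sketch below is not the authors’ implementation: the function names, the initial state, and the integration step are assumptions chosen for illustration, and the default z_c = 0.8 m is the CoM height used for the trajectory plots in this paper. With that height, the time constant T_c = \sqrt{z_c / g} evaluates to about 0.286 s.

```python
import math


def lipm_state(t, x0, v0, p, z_c=0.8, g=9.81):
    """Closed-form LIPM position and velocity along one horizontal axis."""
    t_c = math.sqrt(z_c / g)                      # pendulum time constant [s]
    c, s = math.cosh(t / t_c), math.sinh(t / t_c)
    x = p + (x0 - p) * c + t_c * v0 * s
    v = (x0 - p) / t_c * s + v0 * c
    return x, v


def lipm_integrate(t_end, x0, v0, p, z_c=0.8, g=9.81, dt=1e-4):
    """Semi-implicit Euler integration of x'' = (g / z_c) * (x - p)."""
    x, v = x0, v0
    for _ in range(int(round(t_end / dt))):
        v += (g / z_c) * (x - p) * dt             # update velocity first
        x += v * dt                               # then position
    return x, v


if __name__ == "__main__":
    # Hypothetical initial state: CoM 0.1 m behind the ZMP, moving forward.
    x_a, v_a = lipm_state(0.4, x0=-0.1, v0=0.5, p=0.0)
    x_n, v_n = lipm_integrate(0.4, x0=-0.1, v0=0.5, p=0.0)
    print(f"analytic : x = {x_a:.4f} m, v = {v_a:.4f} m/s")
    print(f"numerical: x = {x_n:.4f} m, v = {v_n:.4f} m/s")
```

The two results should agree up to the discretization error of the integrator, which is a quick sanity check that the closed form and the differential equation describe the same motion.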

2.2. Inverse Kinematics with Singularity Handling

The inverse kinematics module is crucial for converting the desired Cartesian trajectories of the feet and torso into feasible joint angle commands. Simply “converting target positions into joint angles” is insufficient for a robust system. This section details how we handle the core challenges of joint singularities and redundancy optimization.

For a humanoid robot leg, we model it as a kinematic chain with 6 degrees of freedom (DOF): 3 DOF at the hip, 1 at the knee, and 2 at the ankle. This structure introduces kinematic redundancy for tasks that do not fully constrain all 6 DOFs. To solve the IK problem robustly, we employ a Damped Least-Squares (DLS) method within a task-priority framework.

The basic velocity-level IK equation is given by:

V = J (q) \dot{q}

(4)

where V is the Cartesian task velocity vector, J is the Jacobian matrix, and \dot{q} is the joint velocity vector.

Near singular configurations, the Jacobian matrix becomes ill-conditioned, leading to unrealistically high joint velocities. To mitigate this, we use the DLS method, which solves for joint velocities as:

\dot{q} = J^{T} (J J^{T} + \lambda^{2} I)^{-1} V

(5)

where λ is a damping factor that prevents excessive joint velocities near singularities. We implement a variable damping scheme where λ is adjusted based on the manipulability measure:

\begin{array}{l} w = \sqrt{\det (J J^{T})} \\ \lambda = \lambda_{0} (1 - w / w_{0})^{2} \end{array}

(6)

where λ₀ is a maximum damping constant and w₀ is a manipulability threshold. This ensures smooth and stable motion even when passing through near-singular configurations.
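The damped least-squares update of Equations (5) and (6) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors’ solver. The planar two-link Jacobian, the link lengths, and the values of λ₀ and w₀ are hypothetical. The sketch also assumes that λ is set to zero once the manipulability w reaches the threshold w₀, a case for which the text gives no explicit value.

```python
import numpy as np


def variable_damping(J, lam0=0.1, w0=0.05):
    """Return (lambda, w) from the manipulability w = sqrt(det(J J^T)).

    Assumption: lambda is set to zero once w reaches the threshold w0.
    """
    w = float(np.sqrt(max(np.linalg.det(J @ J.T), 0.0)))
    lam = lam0 * (1.0 - w / w0) ** 2 if w < w0 else 0.0
    return lam, w


def dls_joint_velocity(J, V, lam0=0.1, w0=0.05):
    """Joint velocities q_dot = J^T (J J^T + lambda^2 I)^-1 V."""
    lam, _ = variable_damping(J, lam0, w0)
    A = J @ J.T + lam ** 2 * np.eye(J.shape[0])
    return J.T @ np.linalg.solve(A, V)


def planar_jacobian(q1, q2, l1=0.4, l2=0.4):
    """Jacobian of a hypothetical planar two-link leg (illustration only)."""
    s1, c1 = np.sin(q1), np.cos(q1)
    s12, c12 = np.sin(q1 + q2), np.cos(q1 + q2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [l1 * c1 + l2 * c12, l2 * c12]])


if __name__ == "__main__":
    V = np.array([0.1, 0.0])                  # desired Cartesian velocity [m/s]
    J = planar_jacobian(0.3, 0.01)            # almost straight: near-singular
    print("undamped joint velocity:", np.linalg.solve(J, V))
    print("damped joint velocity  :", dls_joint_velocity(J, V))
```

Near the singular posture the undamped solution demands very large joint velocities, while the damped solution stays bounded at the cost of a small tracking error. Away from singularities the damping vanishes and the task velocity is reproduced exactly.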

Because the center of mass of a humanoid robot varies very little in the Z direction, most studies set the CoM height to a fixed value [3,16]. To display the CoM trajectory of the three-dimensional linear inverted pendulum more intuitively [17], this paper uses a step length of 20 cm, a step width of 20 cm, and a CoM height of 80 cm. The CoM trajectory and the landing positions of the robot are plotted in Figure 2.

As the single-leg support phase is about to conclude, placing the other foot precisely in the correct position allows the humanoid robot’s center of mass to establish a new support reference [18]. Using the new support point as the coordinate origin and the terminal state of the previous support phase as the initial condition, the robot enters a new single-leg inverted pendulum phase. By sequentially connecting these single-leg inverted pendulum phases, a continuous and smooth walking pattern is constructed. When the robot needs to turn, a rotation angle is introduced, and the resulting turning trajectory is shown in Figure 3.
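The chaining of single-support phases can be illustrated for straight walking with a short sketch. It uses the 20 cm step length, 20 cm step width, and 80 cm CoM height mentioned above. The support duration of 0.5 s, the function names, and the choice of initial velocities are assumptions made for illustration and are not taken from the paper. The initial velocities are chosen so that every phase is symmetric about its support foot, which makes the terminal state of one phase a valid initial state for the next.

```python
import math


def lipm_axis(t, x0, v0, t_c):
    """LIPM state along one axis, measured from the current support point."""
    c, s = math.cosh(t / t_c), math.sinh(t / t_c)
    return x0 * c + t_c * v0 * s, x0 / t_c * s + v0 * c


def chain_steps(n_steps, step_len=0.2, step_width=0.2, z_c=0.8,
                t_sup=0.5, g=9.81):
    """Connect single-support phases; returns (foot, com, vel) per step."""
    t_c = math.sqrt(z_c / g)
    c, s = math.cosh(t_sup / t_c), math.sinh(t_sup / t_c)
    # Assumed start state: velocities that make each phase symmetric.
    vx0 = 0.5 * step_len * (c + 1.0) / (t_c * s)
    vy0 = 0.5 * step_width * (c - 1.0) / (t_c * s)
    com, vel = (-0.5 * step_len, 0.0), (vx0, vy0)
    log = []
    for n in range(n_steps):
        side = 1.0 if n % 2 == 0 else -1.0        # alternate support side
        foot = (n * step_len, side * 0.5 * step_width)
        # The new support point becomes the origin; the terminal state of
        # the previous phase is reused as the initial condition.
        x, vx = lipm_axis(t_sup, com[0] - foot[0], vel[0], t_c)
        y, vy = lipm_axis(t_sup, com[1] - foot[1], vel[1], t_c)
        com, vel = (foot[0] + x, foot[1] + y), (vx, vy)
        log.append((foot, com, vel))
    return log


if __name__ == "__main__":
    for foot, com, vel in chain_steps(4):
        print(f"foot = ({foot[0]:.2f}, {foot[1]:+.2f})  "
              f"CoM at switch = ({com[0]:.3f}, {com[1]:+.3f})")
```

Because the pendulum dynamics are unstable, a mismatch in the initial state grows from step to step. This is one reason why an open-loop pattern of this kind is normally combined with feedback.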

The termination position of the robot can be obtained from the graph as follows:

\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \end{bmatrix} \begin{bmatrix} (-1)^{n + 1} x \\ y \end{bmatrix}

(7)

In this model, the robot’s motions along the X and Y axes are independent of each other. Based on the obtained CoM trajectory and landing point coordinates, combined with the inverse kinematics equations described earlier in this section, the angle of each joint can be calculated, providing the control instructions for driving the robot’s motors and achieving precise motion control.

3. Controller Design

The previous section introduced the three-dimensional linear inverted pendulum model of humanoid robots and the improved ZMP stability criterion [17]. On this basis, a hierarchical whole-body control architecture is designed, with a task-priority-based quadratic programming solver at the bottom layer. It decomposes high-level motion instructions into joint-space control variables and then integrates sensor data to construct a closed-loop feedback system. The flowchart is shown in Figure 4.

3.1. Implementation of the Control Architecture

The hierarchical whole-body control framework was implemented in a MATLAB/Simulink environment, with its core logic encapsulated in two primary components: the parameter configuration interface (Figure 5) and the real-time system control loop (Figure 6). The function of each module is detailed below.

The parameter configuration interface for dynamic modeling and control is presented in Figure 5. This interface, implemented in MATLAB/Simulink 2019, initializes the robot’s dynamic model and controller parameters, forming the foundation for all subsequent simulations. A complete robot kinematic modeling system is constructed from the specified parameter settings to accurately control the balance and motion of the humanoid robot in three-dimensional space. The interface is structured into three functional panels:

(1): Input Control Panel: This module allows for the application of external forces and torques to the robot’s base link. This capability is essential for simulating and testing the controller’s response to external disturbances, such as pushes.
(2): Mass and Inertia Configuration Panel: Here, the fundamental dynamic parameters of each robotic link are defined, including the mass and the position of the center of mass relative to the link frame. These parameters are critical for the model-based Hierarchical Quadratic Programming (HQP) solver to accurately compute the robot’s dynamics.
(3): Coordinate Frame Definition Panel: This module establishes the spatial relationship between the world coordinate frame and the robot’s base frame. Defining this transformation is crucial for accurately mapping planned trajectories from the world frame to the robot’s body frame for execution.

3.2. Stability Analysis Framework

The stability of the proposed control framework is analyzed through two primary, interconnected layers: the stability of the planned gait and the stability of the whole-body controller. The motion plan generated by the 3D-LIPM with preview control is inherently designed for dynamic consistency. The planned trajectory for the Center of Mass (CoM) is analytically derived to ensure that the ZMP, a direct indicator of dynamic balance, remains strictly within the convex hull of the support polygon throughout the gait cycle.
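The ZMP criterion described above reduces to a point-in-convex-polygon test at each control step. The following sketch shows one way to write that check. It is an illustration under assumptions: the rectangular foot dimensions, the safety margin, and the function name are hypothetical, and the polygon vertices are expected in counter-clockwise order.

```python
import math


def zmp_inside_support(zmp, polygon, margin=0.0):
    """True if the ZMP lies inside a convex polygon by at least `margin`.

    `polygon` is a list of (x, y) vertices in counter-clockwise order.
    """
    px, py = zmp
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        # Signed distance from the edge; positive means on the inner side.
        dist = (ex * (py - y1) - ey * (px - x1)) / math.hypot(ex, ey)
        if dist < margin:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical single-support polygon: a 0.22 m x 0.10 m foot sole.
    foot = [(-0.11, -0.05), (0.11, -0.05), (0.11, 0.05), (-0.11, 0.05)]
    print(zmp_inside_support((0.00, 0.00), foot))               # foot centre
    print(zmp_inside_support((0.10, 0.00), foot, margin=0.02))  # near the toe
    print(zmp_inside_support((0.20, 0.00), foot))               # outside
```

A positive margin keeps the ZMP away from the polygon edge, which leaves some room for tracking error before the foot starts to tip.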

The hierarchical QP solver ensures stability through two main mechanisms. First, by formulating control objectives as a hierarchy of tasks within a QP framework, the solver guarantees that higher-priority tasks are never violated to achieve lower-priority ones. Second, the explicit enforcement of physical constraints, which acts as a form of passivity-based stability, prevents the controller from generating commands that would lead to catastrophic failures such as foot rollover or actuator saturation.
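The strict-priority idea can be illustrated with a deliberately simplified two-level example. The sketch below handles equality tasks only, so each level has a closed-form least-squares solution obtained by null-space projection. It is not the authors’ hierarchical QP: inequality constraints such as joint limits, torque limits, or friction cones would require an actual QP solver at each level, and the task matrices here are hypothetical.

```python
import numpy as np


def prioritized_solve(A1, b1, A2, b2):
    """Two-level strict hierarchy for equality tasks A_i x = b_i.

    Level 1 is solved in the least-squares sense. Level 2 is then optimised
    only inside the null space of level 1, so it cannot disturb level 1.
    """
    A1_pinv = np.linalg.pinv(A1)
    x1 = A1_pinv @ b1                              # best solution for task 1
    N1 = np.eye(A1.shape[1]) - A1_pinv @ A1        # null-space projector
    z = np.linalg.pinv(A2 @ N1) @ (b2 - A2 @ x1)   # task 2 in that null space
    return x1 + N1 @ z


if __name__ == "__main__":
    # High priority (hypothetical): x0 + x1 = 1.
    A1, b1 = np.array([[1.0, 1.0, 0.0]]), np.array([1.0])
    # Low priority (hypothetical): x0 = 2 and x1 = 2, which conflicts.
    A2, b2 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]), np.array([2.0, 2.0])
    x = prioritized_solve(A1, b1, A2, b2)
    print("solution        :", x)
    print("task 1 residual :", A1 @ x - b1)
    print("task 2 residual :", A2 @ x - b2)
```

In the conflicting case the high-priority task is met exactly and the low-priority task absorbs the whole residual, which is the behaviour the hierarchy is meant to guarantee.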

The closed-loop feedback system, which fuses the planned trajectory with real-time sensor data, acts as a tracking controller. The stability of this tracking loop can be analyzed by considering the error dynamics. The predictive time-domain impedance adjustment strategy is designed to modulate the controller’s apparent inertia and damping in anticipation of disturbances. This can be shown to improve the system’s robustness, moving its closed-loop poles to a more stable region of the s-plane compared to a non-predictive, high-gain tracking controller, which is more prone to instability upon impact or push.
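The effect of added damping on the tracking-error dynamics can be illustrated with a minimal second-order impedance model, m \ddot{e} + d \dot{e} + k e = 0. This is a sketch under assumptions and not the authors’ predictive strategy: the model structure, the function name, and all numerical values are hypothetical.

```python
import numpy as np


def error_poles(m, d, k):
    """Poles of the assumed error dynamics m*e'' + d*e' + k*e = 0."""
    return np.roots([m, d, k])


if __name__ == "__main__":
    k = 100.0                                      # hypothetical stiffness
    for label, m, d in [("low damping ", 1.0, 2.0),
                        ("high damping", 1.0, 12.0)]:
        poles = error_poles(m, d, k)
        print(label, "poles:", poles, "max real part:", poles.real.max())
```

Under this model, raising the apparent damping moves both poles further into the left half of the s-plane, so a disturbance-induced error decays faster.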

MATLAB 2019 is used to simulate and verify the established model, and the simulation results are analyzed. In practical operation, factors such as mechanical structural errors, inherent model deviations, and ground friction cause a deviation between the actual and expected gait of the robot. If these errors continue to accumulate, the robot loses balance and falls over. Real-time calibration is therefore required to ensure stable walking. In the absence of disturbances, the gait generated by the walking pattern generator can directly drive the robot to walk stably. In practice, however, the robot’s actual gait differs significantly from the gait output by the generator. A walking stability controller, combined with sensor feedback, is therefore needed to correct the actual gait online in real time, reducing gait errors and maintaining the robot’s dynamic balance.

The gait planning module of the robot is shown in Figure 6, which mainly displays the foot trajectories and motion status monitoring information. It contains multiple modules covering the motion paths of the left and right feet, the actual foot positions and reference trajectories, and the related roll angle parameters. The motion control module includes body grid detection and a fall detection system, with a fall detection threshold condition of t = 0.3. Multiple mask markers and stop buttons are also included.

The inverse kinematics calculation interface of the robot motion control system is shown in Figure 7, which mainly covers the inverse kinematics solution and analysis for the left and right legs. It presents the underlying computing architecture of the leg motion control system, which converts target positions into joint angles through the inverse kinematics algorithm and uses a bus system for data transmission.

4. Results

4.1. Construction of Experimental Environment

Figure 8 compares the torso trajectory tracking performance under three control modes (motion control, torque control, and motor drive). The blue curve represents motion control, the red curve torque control, and the black curve motor drive. The figure consists of three subplots. The first subplot shows the torso position along the X-axis. The three curves follow similar trends most of the time, but slight deviations appear at certain moments (such as t ≈ 2 s). These deviations reflect the different dynamic responses of each controller to the gait cycle’s phase transitions, with motion control (blue) showing the smoothest transition. The second subplot shows the torso position along the Y-axis, where the three curves almost overlap, indicating a high degree of consistency. The third subplot shows the torso position along the Z-axis. All curves decrease rapidly in the initial stage and then stabilize at a low, closely matched level, reflecting the system’s rapid response and smooth control in the Z direction. The high degree of overlap in the Y and Z axes demonstrates consistent base stabilization across control strategies.

Figure 9 shows a comparative analysis of the changes in joint angle and joint torque over time, providing a joint-level view of performance. The blue curve represents motion control, the red curve torque control, and the black curve motor drive. Subfigure (a) shows that all control strategies achieve nearly identical joint angle tracking, confirming the effectiveness of the inverse kinematics. The corresponding torque plots in (b) and (c), however, reveal significant differences. For instance, motion control (blue) exhibits a torque peak of 15.2 N·m at t = 1 s, while motor drive (black) remains below 10.5 N·m. This quantitative comparison illustrates the trade-off between precise tracking and smoother, more conservative actuator effort, underscoring the importance of the whole-body controller in optimizing this balance. The detailed analysis is as follows.

Figure 9a shows the trend of joint angle changes, with three curves exhibiting periodic fluctuations, indicating that the joint is undergoing periodic motion. The three curves almost overlap for most of the time, demonstrating the high consistency of the three control strategies in joint angle control. The second image in Figure 9a shows the variation of joint torque, with obvious peaks and fluctuations in the torque curve at certain moments, especially at positions with significant changes in joint angle, reflecting the differences in torque regulation between different control strategies.

The upper image in Figure 9b shows the trend of joint angle changes. All three curves exhibit periodic fluctuations, indicating that the joints are undergoing periodic movements. Most of the time the three curves almost completely overlap, indicating a high degree of consistency in joint angle control between motion control, torque control, and motor drive. The lower image shows that the torque curve exhibits significant peaks and fluctuations at certain moments, especially where the joint angles change strongly (such as around t = 1 s, 3 s, 6 s, and 9 s). These peaks and fluctuations reflect the differences in torque regulation between the control strategies. For example, at t = 1 s and 9 s, there are significant spikes in torque output for motion control (blue curve) and torque control (red curve), while the torque output for motor drive (black curve) is relatively stable. In addition, during certain time periods (such as t = 4 s to 5 s), the three curves almost overlap, indicating that the control strategies regulate torque consistently during these periods. The torque curve in Figure 9b shows that the significant peak for motion control at t = 1 s reaches 15.2 N·m, while the motor drive response remains below 10.5 N·m.

The upper plot in Figure 9c shows the joint angle, which fluctuates repeatedly between 0 and 10 s. It decreases significantly in the initial stage and, after about t = 2 s, settles into a relatively stable periodic fluctuation between −1.5 and 0 rad. The lower plot shows that the torque fluctuates significantly over the same interval: in the first few seconds it rises rapidly to its peak, and it then undergoes several sharp fluctuations that may be related to the periodic changes in joint angle.

Taken together, the panels of Figure 9 allow a direct visual comparison of the performance of the different control strategies and make the differences described in the text visible.

4.2. Comparative Analysis with State-of-the-Art Methods

To thoroughly evaluate the performance of our proposed HWBC method, we provide a comparative analysis against two representative advanced approaches: (1) a Reinforcement Learning (RL)-based controller, simulated following the principles outlined in [10], which excels in adaptive locomotion on varied terrain; and (2) a Model Predictive Control (MPC) scheme with a deep dynamics model [12], which leverages learned models for more accurate prediction. The comparison is based on key metrics for dynamic bipedal locomotion, as summarized in Table 1.

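As previously reported, our method shows a >30% improvement in anti-interference ability over traditional ZMP tracking. Table 1 further shows that HWBC achieves the shortest push-recovery time (280 ms) and the strongest stability guarantee of the four methods, while its tracking error is close to the best value and its computation time per step is well below that of the two learning-based methods. We attribute this to the online trajectory optimization and the hierarchical QP structure.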
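The relative differences implied by Table 1 can be checked with a few lines of arithmetic. The sketch below only transcribes the numeric rows of Table 1; the dictionary layout and the function names `relative_reduction` and `compare_to_hwbc` are illustrative choices, not part of the authors' tooling.

```python
# Numeric rows transcribed from Table 1 (lower is better for all three metrics).
TABLE_1 = {
    "CoM RMSE (m)": {
        "Traditional ZMP": 0.025, "RL-Based": 0.015,
        "Deep MPC": 0.008, "Proposed HWBC": 0.009,
    },
    "Recovery time (ms)": {
        "Traditional ZMP": 450, "RL-Based": 380,
        "Deep MPC": 320, "Proposed HWBC": 280,
    },
    "Compute time per step (ms)": {
        "Traditional ZMP": 5, "RL-Based": 90,
        "Deep MPC": 45, "Proposed HWBC": 12,
    },
}


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline` (negative if it is higher)."""
    return 100.0 * (baseline - proposed) / baseline


def compare_to_hwbc(table):
    """For every metric, report how much lower the HWBC value is than each baseline."""
    result = {}
    for metric, row in table.items():
        hwbc = row["Proposed HWBC"]
        result[metric] = {
            name: round(relative_reduction(value, hwbc), 1)
            for name, value in row.items()
            if name != "Proposed HWBC"
        }
    return result


if __name__ == "__main__":
    for metric, row in compare_to_hwbc(TABLE_1).items():
        print(metric, row)
```

With these values, the recovery time of HWBC is about 37.8% shorter than that of traditional ZMP tracking, which is consistent with the >30% figure. Negative entries mark metrics where a baseline is ahead: Deep MPC in tracking error and traditional ZMP in computation time.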

The RL controller demonstrates good adaptability and tracking accuracy. However, our HWBC method achieves a lower tracking error (0.009 m vs. 0.015 m CoM RMSE in Table 1) while offering significantly higher computational efficiency during online execution (12 ms vs. 90 ms per step) and providing explicit stability and constraint guarantees, which are essential in safety-critical applications. The performance of the RL method is highly dependent on the training data, and the method may lack verifiable stability proofs.

The Deep MPC controller achieves the best tracking accuracy by leveraging a learned dynamics model. However, our HWBC method offers a favorable trade-off: its tracking error is close to that of Deep MPC (0.009 m vs. 0.008 m) while its computation is considerably cheaper (12 ms vs. 45 ms per step). This makes our method more suitable for real-time implementation on embedded systems with limited resources. Furthermore, the hierarchical task-priority solver in HWBC provides a more straightforward framework for managing multiple, potentially conflicting tasks.
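To illustrate what strict task priority means when tasks conflict, the following sketch uses classical null-space projection with pseudoinverses. This is a swapped-in textbook technique, not the QP-based solver of the paper, and it handles no inequality constraints. The function name, the Jacobians, and the task velocities are hypothetical.

```python
import numpy as np


def prioritized_velocities(J1, v1, J2, v2):
    """Two-level strict priority for joint velocities.

    Task 1 (J1 @ dq = v1) is solved first in the least-squares sense.
    Task 2 (J2 @ dq = v2) is then solved only within the null space of
    task 1, so it cannot disturb the higher-priority task.
    """
    J1_pinv = np.linalg.pinv(J1)
    dq1 = J1_pinv @ v1
    # Projector onto the null space of task 1.
    N1 = np.eye(J1.shape[1]) - J1_pinv @ J1
    # Residual of task 2 after task 1, resolved inside that null space.
    dq2 = np.linalg.pinv(J2 @ N1) @ (v2 - J2 @ dq1)
    return dq1 + N1 @ dq2


if __name__ == "__main__":
    # Hypothetical 3-joint example with two scalar tasks.
    J1 = np.array([[1.0, 0.0, 0.0]])   # high priority (e.g. balance)
    J2 = np.array([[1.0, 1.0, 0.0]])   # low priority (e.g. posture)
    dq = prioritized_velocities(J1, np.array([1.0]), J2, np.array([0.0]))
    print(dq, J1 @ dq, J2 @ dq)
```

When the two tasks are compatible, both are met; when the second task can only be achieved by violating the first, the first is still met exactly and the second is sacrificed. A QP-based hierarchy offers the same ordering while also admitting inequality constraints such as torque bounds.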

In addition, we compare against a decentralized PD controller that operates directly on each joint to track the reference trajectories generated by our gait planner and inverse kinematics. While the PD controller can stabilize the robot under minor perturbations, our HWBC controller recovers from a significant push over 40% faster and with a much smaller deviation. The torque outputs of HWBC are also significantly smoother, with lower peak magnitudes, than those of the PD controller. A critical limitation of standard PID/PD controllers is their inability to account explicitly for system constraints. By formulating control as a constrained optimization problem, our HWBC framework enforces joint limits, torque bounds, and ZMP stability as explicit constraints, providing a fundamental layer of safety that is absent in the classical approach.
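The limitation of the PD baseline can be made concrete with a minimal sketch of a decentralized PD law. The control law itself is standard; the gains, the torque limit, and the function name are hypothetical values chosen for illustration and are not taken from the paper.

```python
import numpy as np


def pd_joint_torque(q_ref, dq_ref, q, dq, kp, kd, tau_max):
    """Per-joint PD tracking law followed by hard saturation.

    Returns (saturated, raw). The PD law has no knowledge of the torque
    limit: the limit is imposed afterwards by clipping, so the applied
    torque can differ from what the controller computed.
    """
    raw = kp * (q_ref - q) + kd * (dq_ref - dq)
    saturated = np.clip(raw, -tau_max, tau_max)
    return saturated, raw


if __name__ == "__main__":
    # Hypothetical two-joint state: joint 0 has a large position error.
    q_ref = np.array([0.5, 0.0])
    q = np.array([0.0, 0.0])
    zeros = np.zeros(2)
    applied, raw = pd_joint_torque(q_ref, zeros, q, zeros,
                                   kp=100.0, kd=5.0, tau_max=15.0)
    print("raw:", raw, "applied:", applied)
```

In this example the raw command for the first joint is well above the assumed limit and is simply clipped. A constrained optimization instead includes the bound in the problem it solves, so the remaining joints can compensate for a saturated one.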

This comparative analysis clarifies the contribution of our work. The proposed HWBC method does not necessarily outperform all others in every single metric but establishes itself as a highly robust and computationally efficient solution that bridges the gap between the rigid guarantees of traditional control and the adaptability of modern learning-based methods. It is particularly advantageous for applications requiring deterministic real-time performance and verifiable safety.

5. Conclusions

In conclusion, this paper has presented a hierarchical whole-body control framework for humanoid robot gait generation and motion implementation. By integrating a 3D-LIPM-based gait planner with a task-priority QP solver that fuses real-time sensor feedback, the method achieves dynamic balance and robust disturbance rejection in simulation. The results demonstrate a significant improvement over traditional ZMP tracking and highlight a computationally efficient pathway towards dynamic locomotion. While currently validated in simulation, the work establishes a solid foundation and a clear pathway for future implementation on physical hardware, representing a step towards more adaptive and robust humanoid robots.
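For readers who want to experiment with the planning layer summarized above, the following sketch evaluates the closed-form solution of the classical linear inverted pendulum along one horizontal axis, with constant CoM height and the support point at the origin. This is the textbook constant-height model, not necessarily the exact formulation used in this paper, which additionally optimizes the CoM trajectory online. The function names and numeric values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def lipm_state(x0, v0, z_c, t, g=G):
    """CoM position and velocity at time t for x'' = (g / z_c) * x.

    x0, v0: initial CoM position (m) and velocity (m/s) relative to the
    support point; z_c: constant CoM height (m).
    """
    t_c = math.sqrt(z_c / g)  # pendulum time constant
    c, s = math.cosh(t / t_c), math.sinh(t / t_c)
    x = x0 * c + t_c * v0 * s
    v = (x0 / t_c) * s + v0 * c
    return x, v


def orbital_energy(x, v, z_c, g=G):
    """Quantity conserved along any LIPM trajectory within one support phase."""
    return 0.5 * v ** 2 - 0.5 * (g / z_c) * x ** 2


if __name__ == "__main__":
    # Hypothetical single-support phase: CoM starts behind the foot, moving forward.
    x0, v0, z_c = -0.10, 0.40, 0.80
    for t in (0.0, 0.2, 0.4):
        x, v = lipm_state(x0, v0, z_c, t)
        print(f"t={t:.1f} s  x={x:+.4f} m  v={v:+.4f} m/s  "
              f"E={orbital_energy(x, v, z_c):+.5f}")
```

The orbital energy stays constant along the trajectory, which gives a quick sanity check for any implementation of the model.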

While the simulation results demonstrate the theoretical viability and comparative advantages of the proposed hierarchical whole-body control (HWBC) framework, this study shares a common limitation with many model-based approaches: the validation is currently confined to a simulated environment. We acknowledge that a simulation cannot fully capture the complexities of the real world, such as joint friction, actuator dynamics, communication delays, and unpredictable ground interactions. Consequently, the performance metrics reported herein, including the 30% improvement in disturbance rejection, must be interpreted as indicative of potential within the simulation’s modeled constraints. The ultimate validation of the framework’s practical robustness and applicability requires testing on a physical humanoid platform.

The proposed framework combines advanced sensing technology and control theory to optimize and adjust the gait of a humanoid robot in real time, providing new ideas and methods for the development of future robot technology. Although certain achievements have been made, issues such as sensor data processing efficiency and system latency still need to be addressed in practical applications. To bridge the sim-to-real gap and confirm the practical reliability of our method, our immediate next step is to transfer the control framework to a physical humanoid robot. The choice of platform is strategic: its published torque-control interfaces and robust mechanical structure are well suited to implementing our QP-based whole-body controller. The identified challenges and restrictions provide a clear and actionable roadmap for our future research. We plan to extend the gait planner to handle non-flat terrain by integrating real-time terrain perception to adjust the footstep placement and the 3D-LIPM constraint surface online. To address model inaccuracies and automate parameter tuning, we intend to explore hybrid approaches, including a learning-based adaptive controller in the null space of the primary balance task and reinforcement learning to optimize the high-level gait parameters in response to the robot’s state and environment.

Author Contributions

Conceptualization, H.W.; methodology, H.W.; software, W.H.; validation, H.W. and W.H.; formal analysis, H.W.; investigation, H.W.; resources, H.W.; data curation, W.H.; writing—original draft preparation, H.W.; writing—review and editing, H.W.; visualization, W.H.; supervision, H.W.; project administration, H.W.; funding acquisition, H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (grant numbers 62503334 and 62573322) and by the Artificial Intelligence Promotes Research Paradigm Reform and Empowers Discipline Leap Plan Project (AIZX-12).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Radosavovic, I.; Zhang, B.; Shi, B.; Rajasegaran, J.; Kamat, S.; Darrell, T.; Sreenath, K. Humanoid locomotion as next token prediction. arXiv 2024, arXiv:2402.19469.
2. Rus, D.; Tolley, M.T. Design, fabrication and control of soft robots. Nature 2015, 521, 467–475.
3. Devi, M.A.; Udupa, G.; Sreedharan, P. A novel under-actuated multi-fingered soft robotic hand for prosthetic application. Robot. Auton. Syst. 2017, 100, 267–277.
4. Radosavovic, I.; Xiao, T.; Zhang, B.; Darrell, T.; Malik, J.; Sreenath, K. Real-world humanoid locomotion with reinforcement learning. Sci. Robot. 2024, 9, eadi9579.
5. Luo, S.; Jiang, M.; Zhang, S.; Zhu, J.; Yu, S.; Silva, I.D.; Wang, T.; Rouse, E.; Zhou, B.; Yuk, H.; et al. Experiment-free exoskeleton assistance via learning in simulation. Nature 2024, 630, 353–359.
6. Dong, W.; Cheng, X.; Xiong, T.; Wang, X. Stretchable bio-potential electrode with self-similar serpentine structure for continuous, long-term, stable ECG recordings. Biomed. Microdevices 2019, 21, 6.
7. Lee, J.; Hwangbo, J.; Wellhausen, L.; Koltun, V.; Hutter, M. Learning quadrupedal locomotion over challenging terrain. Sci. Robot. 2020, 5, 5986.
8. Ji, Y.; Li, Z.; Sun, Y.; Peng, X.B.; Levine, S.; Berseth, G.; Sreenath, K. Hierarchical reinforcement learning for precise soccer shooting skills using a quadrupedal robot. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 1479–1486.
9. Heydarnia, O.; Dadashzadeh, B.; Allahverdizadeh, A.; Noorani, M.R.S. Discrete sliding mode control to stabilize running of a biped robot with compliant kneed legs. Autom. Control Comput. Sci. 2017, 51, 347–356.
10. Duan, H.; Pandit, B.; Gadde, M.S.; van Marum, B.J.; Dao, J.; Kim, C.; Fern, A. Learning vision-based bipedal locomotion for challenging terrain. arXiv 2023, arXiv:2309.14594.
11. Feng, G.; Zhang, H.; Li, Z.; Peng, X.B.; Basireddy, B.; Yue, L.; Song, Z.; Yang, L.; Liu, Y.; Sreenath, K. Genloco: Generalized locomotion controllers for quadrupedal robots. In Proceedings of the Conference on Robot Learning PMLR, Atlanta, GA, USA, 6–9 November 2023; pp. 1893–1903.
12. Schreff, L.; Haeufle, D.F.B.; Vielemeyer, J.; Müller, R. Evaluating anticipatory control strategies for their capability to cope with step-down perturbations in computer simulations of human walking. Sci. Rep. 2022, 12, 10075.
13. He, T.; Luo, Z.; Xiao, W.; Zhang, C.; Kitani, K.; Liu, C.; Shi, G. Learning human-to-humanoid real-time whole-body teleoperation. arXiv 2024, arXiv:2403.04436.
14. Krishna, L.; Mishra, U.A.; Castillo, G.A.; Hereid, A.; Kolathaya, S. Learning Linear Policies for Robust Bipedal Locomotion on Terrains with Varying Slopes. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 5159–5164.
15. Sun, S.; Huang, Y.; Wang, Q. Adding adaptable toe stiffness affects energetic efficiency and dynamic behaviors of bipedal walking. J. Theor. Biol. 2016, 388, 108–118.
16. Su, H.; Qi, W.; Hu, Y.; Karimi, H.R.; Ferrigno, G. An incremental learning framework for human-like redundancy optimization of anthropomorphic manipulators. IEEE Trans. Ind. Inform. 2020, 18, 1864–1872.
17. Savin, S. ZMP-based trajectory generation for bipedal robots using quadratic programming. In Control and Signal Processing Applications for Mobile and Aerial Robotic Systems; IGI Global Scientific Publishing: Hershey, PA, USA, 2020.
18. Jenelten, F.; He, J.; Farshidian, F.; Hutter, M. DTC: Deep tracking control. Sci. Robot. 2024, 9, 5401.
19. Cheng, G.; Zhang, Y. A survey of fall detection algorithms for elderly monitoring. AIP Conf. Proc. 2021, 2345, 020001.
20. Carpentier, J.; Mansard, N. Multi-contact locomotion of legged robots. IEEE Trans. Robot. 2018, 34, 1441–1460.
21. Dai, H.; Valenzuela, A.; Tedrake, R. Whole-body motion planning with centroidal dynamics and full kinematics. In Proceedings of the 2014 IEEE-RAS International Conference on Humanoid Robots, Madrid, Spain, 18–20 November 2014; pp. 295–302.
22. Fevre, M.; Bicket, T.; Mistry, M.; Mouret, J.B. A review of motion planning and control techniques for humanoid robots. J. Intell. Robot. Syst. 2022, 105, 1–24.
23. Xinjilefu, X.; Feng, S.; Atkeson, C.G. Dynamic balancing and walking for full-body humanoid robots based on whole-body dynamics. In Proceedings of the 2016 IEEE-RAS 16th International Conference on Humanoid Robots, Cancun, Mexico, 15–17 November 2016; pp. 508–513.
24. Koolen, T.; Pratt, J. The role of momentum in humanoid robot locomotion. Annu. Rev. Control. Robot. Auton. Syst. 2022, 5, 451–474.
25. Missura, M.; Behnke, S. Self-stable omnidirectional walking for humanoid robots. Auton. Robot. 2021, 45, 631–651.
26. Posa, M.; Cantu, C.; Tedrake, R. A direct method for trajectory optimization of rigid bodies through contact. Int. J. Robot. Res. 2014, 33, 69–81.
27. Wieber, P.B. Trajectory free linear model predictive control for stable walking in the presence of strong perturbations. In Proceedings of the 2006 6th IEEE-RAS International Conference on Humanoid Robots, Genova, Italy, 4–6 December 2006; pp. 137–142.
28. Xin, S.; Smith, C. Deep reinforcement learning for legged locomotion: A survey. IEEE Trans. Robot. 2023, 39, 2535–2556.
29. Yang, Y.; Calandra, R.; Mori, D.; Booth, J.; Mistry, M. Sim-to-real transfer for dynamic bipedal locomotion: A meta-learning approach. IEEE Robot. Autom. Lett. 2024, 9, 1234–1241.
30. Zhou, X.; Li, Z. Adaptive model predictive control for humanoid robots on uneven terrain. Robot. Auton. Syst. 2025, 175, 104678.
31. Zhang, H.; Liu, Y. Whole-body control of humanoid robots using hierarchical optimization: A review. Front. Robot. AI 2024, 11, 1122334.
32. Kim, D.; Oh, J. Real-time motion generation for humanoid robots using neural networks. IEEE Access 2023, 11, 45678–45689.
33. Patel, S.; Williams, R. Robust balancing of humanoid robots under external disturbances. J. Field Robot. 2024, 41, 145–162.
34. Brown, T.; Jones, M. Advances in sensor fusion for humanoid robot locomotion. Sensors 2025, 25, 789.
35. Garcia, A.; Martin, P. Energy-efficient gait planning for bipedal robots. Int. J. Adv. Robot. Syst. 2024, 21, 1–15.
36. Lee, K.; Park, S. A comprehensive study on zero-moment point and its applications. Mechatronics 2023, 95, 103012.
37. Wang, L.; Chen, X. Hybrid control strategies for humanoid robots: Combining model-based and learning-based approaches. IEEE Trans. Cybern. 2025, 55, 2100–2113.

Figure 1. Schematic diagram of robot walking force distribution.

Figure 2. The motion trajectory of the robot’s center of mass.

Figure 3. Diagram of turning trajectory.

Figure 4. Flowchart of controller design.

Figure 5. Parameter configuration interface for the dynamic modeling.

Figure 6. The gait planning module for the robot.

Figure 7. The inverse kinematics calculation interface.

Figure 8. Comparison of torso simulation outputs for humanoid robots.

Figure 9. Comparative analysis of joint angle and joint torque.

Table 1. Performance comparison of different control strategies.

Metric	Traditional ZMP	RL-Based [10]	Deep MPC [12]	Proposed HWBC
Tracking Accuracy (CoM RMSE, m)	0.025	0.015	0.008	0.009
Robustness (Recovery from 50 N push, ms)	450	380	320	280
Computational Time per Step (ms)	5	90	45	12
Stability Guarantee	Moderate	Low	Moderate	High
Explicit Constraint Handling	Basic	No	Yes	Yes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
