1. Introduction
Several years have witnessed rapid development in humanoid robots. In the field of service robots, they can assist in tasks such as item handling and home cleaning in smart home scenarios, and achieve efficient adaptation to complex indoor environment, while in the field of medical rehabilitation, they can be used as an auxiliary device to assist patients in gait training by adjusting its own movement pattern in real time to match the patient’s rehabilitation progress and physiological characteristics [
1,
2,
3]. When facing complex terrains such as ruins and mountains after earthquakes to perform search and rescue tasks, robots must have strong environmental adaptability and motion stability to ensure smooth advancement in irregular ground and narrow spaces [
4,
5,
6].
Complex kinematic and dynamic modeling requires precise consideration of factors such as multi joint coupling and nonlinear friction, which places extremely high demands on the accuracy and computational efficiency of dynamic model. The fusion of real-time perception and feedback control suffers from data delay and noise interference, which affects the timeliness and accuracy of gait adjustment [
7,
8]. In addition, robots need to be optimized under multiple constraints such as energy consumption, motion speed, and joint torque to achieve efficient and safe movement. This article aims to explore motion control method that integrate reinforcement learning and model predictive control by analyzing the intrinsic relationship between sensor feedback mechanisms and gait planning strategies [
9,
10,
11,
12]. The goal is to build a humanoid robot motion control system. Through extensive simulation, algorithm parameters and control logic are continuously optimized to ensure that the robot can quickly and accurately adjust its gait in complex environments. At the same time, by combining lightweight design with efficient energy management strategies, the system’s energy consumption can be reduced and the robot’s continuous operation capability can be improved [
13].
As a cutting-edge field of deep integration between artificial intelligence and robotics technology, humanoid robots have attracted widespread attention and in-depth research in academia and industry in recent years [
3,
14,
15]. One of its core tasks is to achieve stable and efficient gait planning and precise motion control, which not only concerns the motion efficiency of the robot in different scenarios, but also its stability. With the progress of micro electromechanical system (MEMS) technology, the precision and reliability of inertial measurement unit (IMU), pressure sensor, visual sensor and other sensing technologies [
16,
17,
18]. At the same time, the rapid development of edge computing and cloud computing capabilities makes gait generation strategy based on real-time sensor feedback become the focus of current research.
The gait generation and motion control of humanoid robots is a challenge for achieving dynamic balance and task execution in the unstructured environment [
19,
20]. The Heydarnia O team proposed a neural network-based walking control architecture, in which the core control parameters (weight matrix) are adaptively optimized through genetic algorithm (GA), and a simulation environment containing joint dynamics constraints is constructed to achieve the evolutionary learning process of humanoid walking mode [
21,
22,
23,
24]. This study is the first to introduce biological evolutionary mechanisms into robot gait control, verifying the feasibility of generating stable walking strategies through global search algorithms. The H. Duan has developed a 3D semi passive walking robot with four transmission joints, which is innovative in using an online reinforcement learning (RL) framework to optimize motion control strategies in real time [
25,
26,
27]. The system can autonomously learn dynamic walking patterns that adapt to different ground conditions (such as slope and friction coefficient changes) within about 20 min through interactive trial and error with the environment without prior terrain models. This data-driven learning method breaks through the limitations of traditional model dependent control and provides a new technological path for the autonomous adaptation ability of robots in complex environments [
28,
29]. While these learning-based methods show remarkable adaptability, they often face challenges in providing strict stability guarantees and real-time computational efficiency for high-dimensional systems.
The pursuit of stable and dynamic locomotion for humanoid robots has led to the development of several prominent control paradigms, which can be broadly categorized into three groups:
Model-based preview control: This class of methods, most notably Zero Moment Point (ZMP)-based preview control, relies on simplified dynamical models to plan stable center of mass trajectories. While highly successful in generating stable walking patterns on flat terrain and computationally efficient, these methods often lack the flexibility to handle significant external disturbances or complex, unstructured environments due to their reliance on pre-defined models and limited feedback integration [
30,
31].
Optimization-based whole-body control (WBC): WBC methods formulate the control problem as a series of constrained optimization problems, often using Quadratic Programming (QP), to solve for joint torques/positions that execute multiple tasks simultaneously while respecting physical constraints. They offer superior performance in whole-body motion coordination and constraint handling. However, their performance is heavily dependent on the accuracy of the underlying dynamic model and the computational efficiency of the solver, which can be a bottleneck for complex real-time applications.
Learning-based Control: Driven by advances in machine learning, these methods, particularly reinforcement learning (RL) [
32,
33] and deep model predictive control, have demonstrated remarkable success in enabling robots to learn complex locomotion skills through interaction with simulated environments. They excel at adaptation and can discover highly dynamic and robust gaits. The primary challenges include the immense data and computational requirements for training, the difficulty of providing formal stability guarantees, and the significant sim-to-real transfer gap.
While these approaches have propelled the field forward, key gaps remain. First, there is a need for a control framework that seamlessly integrates the long-horizon, stable planning of model-based methods with the reactive, whole-body constraint satisfaction of optimization-based control. Second, many WBC implementations are not tightly coupled with an online gait planner, limiting their adaptability to dynamic terrain. Finally, there is a lack of frameworks that provide the computational efficiency and stability guarantees of model-based optimization while incorporating the adaptive benefits of real-time sensor feedback in a way that is robust to model inaccuracies [
34,
35].
This article proposes a gait planning and control framework that integrates sensing feedback, stability constraints and real-time optimization. To address the issue of motion stability under complex ground conditions, a centroid motion trajectory generation algorithm based on a three-dimensional linear inverted pendulum model is established, combined with an improved version of the ZMP preview control algorithm, which can fully track the reference ZMP trajectory and construct a gait planner with dynamic adjustment capability. The algorithm solves the limitations of traditional inverted pendulum models on vertical motion constraints by optimizing the centroid motion trajectory in the real time. Meanwhile, it introduces a contact phase sequence prediction mechanism to ensure a smooth transition of the foot trajectory during the switching process between the support phase and the swing phase.
On this basis, a hierarchical whole-body control architecture is designed, with a task priority based quadratic programming solver used at the bottom. It will decompose high-level motion instructions into joint space control variables and fuse sensor data to construct a closed-loop feedback system. To clearly demonstrate the contribution of our proposed hierarchical whole-body control (HWBC) approach, we will not only compare it against traditional ZMP tracking methods but also contextualize its performance relative to the state-of-the-art, including reinforcement learning [
36,
37] and advanced model predictive control strategies. Our aim is to highlight the specific niche of HWBC: providing a model-based, computationally efficient, and highly stable control solution with explicit constraint handling, which serves as a robust alternative to emerging learning-based paradigms. Specifically, a predictive time-domain impedance adjustment strategy is designed for sudden disturbances. Simulation verification shows that the anti-interference ability is improved by more than 30%, compared to traditional ZMP tracking methods. The simulation results further reveal the synergistic mechanism between multimodal sensing fusion and model predictive control, providing a theoretical foundation and technical implementation path for enhancing the dynamic motion of humanoid robots. The main contributions can be clearly summarized as follows.
- (1)
The paper proposes a unified control architecture that integrates high-level gait planning with low-level whole-body motion control. It uses a task-priority quadratic programming (QP) solver to decompose motion commands into joint-space control variables while respecting physical constraints.
- (2)
An improved 3D-LIPM-based gait planner is introduced, which optimizes the centroid trajectory in real time and overcomes the vertical motion limitations of traditional inverted pendulum models. A contact phase sequence prediction mechanism is incorporated to ensure smooth foot trajectory transitions between swing and support phases. The proposed method achieves over 30% improvement in disturbance rejection compared to traditional ZMP tracking methods.
- (3)
The controller fuses data from force/torque sensors and IMUs to form a closed-loop feedback system. A predictive time-domain impedance adjustment strategy is designed to handle sudden disturbances. The framework is validated in a MATLAB/Simulink environment and compared against state-of-the-art methods. The results show that HWBC offers a favorable trade-off between tracking accuracy, computational speed, and stability guarantees.
- (4)
The inverse kinematics module uses a Damped Least-Squares (DLS) method with variable damping to handle joint singularities. The QP-based solver explicitly enforces joint limits, torque constraints, and ZMP stability, ensuring safe and feasible motions.
2. Model and Method
2.1. Dynamics and Mathematical Analysis
The mechanical structure of the humanoid robot can be regarded as a multi-rigid-body system, where its structural design must achieve motion functionality through the combination of rigid links and ideal joints [
13]. In this paper, components such as the torso and limbs are simplified into homogeneous rigid links with uniformly distributed mass and no elastic deformation. Adjacent links are connected via frictionless ideal joints, forming a joint-link dynamic model. The model includes rotational links, the thigh rotating around the hip joint, the lower leg rotating around the knee joint, and the foot rotating around the ankle joint, as well as non-rotational rigid links that connect the torso to the limbs (
Figure 1). The simplified model disregards joint friction and the elastic deformation of links. The joint rotation primarily depends on the mechanical structure and drive mechanism.
The dynamic balance of a bipedal robot is governed by the interaction of external forces and internally generated joint torques. As illustrated in the revised
Figure 1, during locomotion, the main external forces are gravity (mg), acting at the center of mass (CoM), and the ground reaction force (Fgr), acting at the ZMP. According to D’Alembert’s principle, the system’s dynamics can be treated as an instantaneous equilibrium by introducing an inertial force that acts through the CoM in a direction opposite to its acceleration.
The fundamental dynamic equation governing the robot’s base link is:
The moment balance of these forces about a point on the ground determines the ZMP location. A stable gait requires the moment about the ZMP’s horizontal axes to be zero, ensuring no rotational instability. These external and inertial forces are counteracted and generated by internal joint torques. The torques at the ankle, knee, and hip joints perform two critical, simultaneous functions. On the one hand, in orbit control, they generate the joint accelerations needed to track the desired kinematic trajectories, thereby producing the specific inertial forces required for locomotion. On the other hand, in balance control, they manipulate the entire body’s posture to regulate the magnitude and direction of the GRF, thereby controlling the CoM acceleration and ZMP location to maintain dynamic balance against disturbances.
The core challenge for our whole-body controller is to compute the optimal set of joint torques that reconcile the often competing demands of motion execution and stability maintenance. Our hierarchical Quadratic Programming (QP) solver, detailed in
Section 3, is designed to solve this optimization problem in real-time while strictly adhering to physical constraints such as joint torque limits, friction cones, and ZMP stability.
Compared with traditional multi-connected models, the three-dimensional linear inverted pendulum model has been further simplified in three aspects: (1) It is assumed that all masses are concentrated at the center of point; (2) It ensures that the robot’s legs have no mass and that the contact points have rotatable features; (3) The periodic motion of the center of mass of an object on a certain constraint surface is defined [
14]. Given the centroid position and landing point position information, coupled with necessary constraints, the motion state of the robot at any time point can be accurately solved. From the above equation, the horizontal motion trajectory of the centroid can be derived, with its expression shown in (1). The dynamics of the 3D-LIPM are derived from the fundamental principle that the horizontal motion of the CoM is governed by the gravitational moment about the supporting point. Assuming a constant height
zc, for the CoM, the equations of motion are:
where (
x,
y) is the horizontal position of the CoM, (
px,
py) is the position of the ZMP,
g is the gravitational acceleration, and
zc is the constant height of the pendulum.
It can be seen from Equation (1), the lateral motion of the mass center is solely determined by the gravitational acceleration
g, the position of the center of mass, and the intercept of the constraint surface, but it is independent of the slope of the constraint surface [
15]. Based on above characteristic, such model is defined as three-dimensional linear inverted pendulum.
The analytical solution to these differential equations, which describes the state of the system at time
t, is given by:
2.2. Inverse Kinematics with Singularity
The inverse kinematics module is crucial for converting the desired Cartesian trajectories of the feet and torso into feasible joint angle commands. Simply “converting target positions into joint angles” is insufficient for a robust system. This section details how we handle the core challenges of joint singularities and redundancy optimization.
For a humanoid robot leg, we model it as a kinematic chain with 6 degrees of freedom (DOF): 3 DOF at the hip, 1 at the knee, and 2 at the ankle. This structure introduces kinematic redundancy for tasks that do not fully constrain all 6 DOFs. To solve the IK problem robustly, we employ a Damped Least-Squares (DLS) method within a task-priority framework.
The basic velocity-level IK equation is given by:
where
V is the Cartesian task velocity vector,
J is the Jacobian matrix, and
is the joint velocity vector.
Near singular configurations, the Jacobian matrix becomes ill-conditioned, leading to unrealistically high joint velocities. To mitigate this, we use the DLS method, which solves for joint velocities as:
where
λ is a damping factor that prevents excessive joint velocities near singularities. We implement a variable damping scheme where
λ is adjusted based on the manipulability measure:
where
λ0 is a maximum damping constant and
w0 is a manipulability threshold. This ensures smooth and stable motion even when passing through near-singular configurations.
Due to the extremely small variation of the center of mass of humanoid robots in the Z direction, most studies often set the center of mass height to a fixed value [
3,
16]. In order to more intuitively display the center of mass motion trajectory of the three-dimensional linear inverted pendulum [
17], this paper plans to use a single step of 20 cm and a step width of 20 cm, with a center of mass height of 80 cm. The trajectory and landing position of the robot are drawn separately in
Figure 2.
As the single-leg support phase is about to conclude, simply placing the other foot precisely in the correct position allows the humanoid robot’s center of mass to establish a new support reference [
18]. Using the new support point as the coordinate origin and setting the termination state of the previous support point as the initial condition, the robot promptly enters a new round of single-leg inverted pendulum motion. By sequentially connecting these linked single-leg inverted pendulum phases, a continuous and smooth walking pattern is ultimately constructed. When the robot needs to turn to increase the rotation angle, the trajectory diagram of the turn is shown in
Figure 3.
The termination position of the robot can be obtained from the graph as follows:
In this model, the motion of robots in the XY axis direction is independent of each other. Based on the obtained centroid motion trajectory and landing point coordinates, combined with the robot inverse kinematics equation described in Chapter 2, the angle parameters of each joint can be accurately calculated, providing key control instructions for driving the robot motor and achieving precise motion control.
3. Controller Design
The previous chapter introduced the three-dimensional linear inverted pendulum model of humanoid robots and the improved ZMP stability criterion [
17]. On this basis, a hierarchical whole-body control architecture is designed, with a task priority based quadratic programming solver used at the bottom. Its main working principle is to decompose high-level motion instructions into joint space control variables, and then integrate sensor data to construct a closed-loop feedback system. The flowchart is shown in
Figure 4.
3.1. Implementation of the Control Architecture
The hierarchical whole-body control framework was implemented in a MATLAB/Simulink environment, with its core logic encapsulated in two primary components: the parameter configuration Interface (
Figure 5) and the real-time system control loop (
Figure 6). The following details the function of each module.
The parameter configuration interface for dynamic modeling and control is presented in
Figure 5. This interface, implemented in MATLAB/Simulink 2019, initializes the robot’s dynamic model and controller parameters, forming the foundation for all subsequent simulations. It is structured into three distinct functional panels. A complete robot kinematic modeling system is constructed based on specific parameter settings to accurately control the balance and motion of the humanoid robot in three-dimensional space. This interface initializes the robot’s dynamic model and controller parameters, forming the foundation for all subsequent simulations. It is structured into three functional panels:
- (1)
Input Control Panel: This module allows for the application of external forces and torques to the robot’s base link. This capability is essential for simulating and testing the controller’s response to external disturbances, such as pushes.
- (2)
Mass and Inertia Configuration Panel: Here, the fundamental dynamic parameters of each robotic link are defined. This includes mass, the position of the center of mass relative to the link frame. These parameters are critical for the model-based Hierarchical Quadratic Programming (HQP) solver to accurately compute the robot’s dynamics.
- (3)
Coordinate Frame Definition Panel: This module establishes the spatial relationship between the world coordinate frame and the robot’s base frame. Defining this transformation is crucial for accurately mapping planned trajectories from the world frame to the robot’s body frame for execution.
3.2. Stability Analysis Framework
The stability of the proposed control framework is analyzed through two primary, interconnected layers: the stability of the planned gait and the stability of the whole-body controller. The motion plan generated by the 3D-LIPM with preview control is inherently designed for dynamic consistency. The planned trajectory for the Center of Mass (CoM) is analytically derived to ensure that the ZMP, a direct indicator of dynamic balance, remains strictly within the convex hull of the support polygon throughout the gait cycle.
The hierarchical QP solver ensures stability through two main mechanisms: By formulating control objectives as a hierarchy of tasks within a QP framework, the solver guarantees that higher-priority tasks are never violated to achieve lower-priority ones. This explicit enforcement of physical constraints is a form of passivity-based stability, preventing the controller from generating commands that would lead to catastrophic failures like foot rollover or actuator saturation.
The closed-loop feedback system, which fuses the planned trajectory with real-time sensor data, acts as a tracking controller. The stability of this tracking loop can be analyzed by considering the error dynamics. The predictive time-domain impedance adjustment strategy is designed to modulate the controller’s apparent inertia and damping in anticipation of disturbances. This can be shown to improve the system’s robustness, moving its closed-loop poles to a more stable region of the s-plane compared to a non-predictive, high-gain tracking controller, which is more prone to instability upon impact or push.
MATLAB 2019 software is used to simulate and verify the established model, and the simulation results are analyzed. In practical operation, due to factors like mechanical structural errors, inherent model deviations, and ground friction, there is a certain deviation between the actual and expected gait of the robot. If errors continue to accumulate, it will cause the robot to become imbalanced and capsize. In order to ensure the stable walking of the humanoid robot, the real-time calibration is required. Under no interference condition, the gait information generated by the walking pattern generator will directly drive robot to walk stably. However, there are significant differences between the robot’s actual gait and the gait output by the generated program. It is necessary to use a walking stability controller, combined with sensor feedback information, to perform real-time online control of actual gait to reduce gait errors and maintain the robot dynamic balance.
The gait planning module for robot is shown in
Figure 6, which mainly displays the foot trajectory and motion status monitoring information. It contains multiple modules: the motion paths of the left and right feet. Actual foot position and reference trajectory, as well as related rolling angle parameters. Motion control module includes body grid detection and fall detection system, with a threshold condition of (t = 0.3) fall detection. Multiple mask markers and stop buttons have also been added.
The inverse kinematics calculation interface of a robot motion control system is shown in
Figure 7, which mainly includes the solution and analysis of inverse kinematics of the left and right leg. The underlying computing architecture of robot leg motion control system is presented, which converts the target position into joint angles through inverse kinematics algorithm with a bus system for data transmission.
5. Conclusions
In conclusion, this paper has presented a hierarchical whole-body control framework for humanoid robot gait generation and motion implementation. By integrating a 3D-LIPM-based gait planner with a task-priority QP solver that fuses real-time sensor feedback, the method achieves dynamic balance and robust disturbance rejection in simulation. The results demonstrate a significant improvement over traditional ZMP tracking and highlight a computationally efficient pathway towards dynamic locomotion. While currently validated in simulation, the work establishes a solid foundation and a clear pathway for future implementation on physical hardware, representing a step towards more adaptive and robust humanoid robots.
While the simulation results demonstrate the theoretical viability and comparative advantages of the proposed hierarchical whole-body control (HWBC) framework, this study shares a common limitation with many model-based approaches: the validation is currently confined to a simulated environment. We acknowledge that a simulation cannot fully capture the complexities of the real world, such as joint friction, actuator dynamics, communication delays, and unpredictable ground interactions. Consequently, the performance metrics reported herein, including the 30% improvement in disturbance rejection, must be interpreted as indicative of potential within the simulation’s modeled constraints. The ultimate validation of the framework’s practical robustness and applicability requires testing on a physical humanoid platform.
It combines advanced sensing technology and control theory to achieve real-time optimization and adjustment of humanoid robot gait, providing new ideas and methods for the development of future robot technology. Although certain achievements have been made, issues such as sensor data processing efficiency and system latency still need to be addressed in practical applications. To bridge the sim-to-real gap and confirm the practical reliability of our method, future work has been planned for implementation on a physical humanoid robot. The choice of this platform is strategic; its published torque-control interfaces and robust mechanical structure are well-suited for implementing our QP-based whole-body controller. The identified challenges and restrictions provide a clear and actionable roadmap for our future research. Our immediate next step is the transfer of the control framework to a physical humanoid robot. We plan to extend the gait planner to handle non-flat terrain by integrating real-time terrain perception to adjust the footstep placement and the 3D-LIPM constraint surface online. To address model inaccuracies and automate parameter tuning, we intend to explore hybrid approaches. This includes using a learning-based adaptive controller in the null-space of the primary balance task or employing reinforcement learning to optimize the high-level gait parameters in response to the robot’s state and environment.