Balanced Standing on One Foot of Biped Robot Based on Three-Particle Model Predictive Control

Balancing is a fundamental task in the motion control of bipedal robots. Compared to two-foot balancing, one-foot balancing introduces new challenges, such as a smaller supporting polygon and control difficulty coming from the kinematic coupling between the center of mass (CoM) and the swinging leg. Although nonlinear model predictive control (NMPC) may solve this problem, it is not feasible to implement it on the actual robot because of its large amount of calculation. This paper proposes the three-particle model predictive control (TP-MPC) approach. It combines with the hierarchical whole-body control (WBC) to solve the one-leg balancing problem in real time. The bipedal robot’s torso and two legs are modeled as three separate particles without inertia. The TP-MPC generates feasible swing leg trajectories, followed by the WBC to adjust the robot’s center of mass. Since the three-particle model is linear, the TP-MPC requires less computational cost, which implies real-time execution on an actual robot. The proposed method is verified in simulation. Simulation results show that our method can resist much larger external disturbance than the WBC-only control scheme.


Background
Motion control of a bipedal robot is a typical multi-task control problem. The robot has to keep its balance while moving or standing, interact with the environment, and complete tasks assigned by the user. Among the various tasks, balancing under external disturbance is a fundamental one for the robot to complete other tasks. Therefore, the balance control of a bipedal robot while walking or standing on both feet has been widely studied in recent years. However, the situation where robots stand on one leg and keep their balance has received little attention. Examples of such a situation include a humanoid trying to reach a distant object with its arms or a bipedal robot kicking a football. Compared with balancing on both feet, one-foot balancing presents new challenges, such as significantly reduced support polygons and reduced overall system stability. Furthermore, if the swing leg is heavy enough, its motion may significantly affect the motion of the entire robot [1]. In this sense, the problem of robot balance control with one foot is similar to that of robot balance control with two arms [2]. From a physical point of view, maintaining the balance of a biped robot with its arms reduces the difficulty of the control problem because the robot's supporting polygon remains unchanged, while extra degrees of freedom can be used to accomplish the balance task [3]. Therefore, the single-leg balancing stance method can be applied to dual-arm robots and vice versa.

Motivation
A bipedal robot's balancing on one foot is a typical multi-task control problem. The robot must keep the torso and the stance foot stable, move the swing foot properly, and conform to external and internal constraints. The whole-body control (WBC) method has been widely adopted in the balancing control of bipedal robots due to its capabilities to generate dynamically feasible control outputs and to handle various tasks and constraints [4]. However, the WBC only considers the current states of the robot and cannot handle strongly infeasible reference trajectories [5]. The model predictive control (MPC) is commonly used in conjunction with WBC to complement these insufficiencies [6]. In case of large external disturbance, the MPC foresees the robot's future states according to a specific dynamical model and acts as an online re-planner for the WBC. However, the full dynamics of a one-foot standing robot are high degrees of freedom and strongly nonlinear. Therefore, an MPC employing full dynamics has to solve a complicated nonlinear optimization problem. Thus, the full-dynamics MPC is computationally expensive and cannot be executed in real time on an actual robot [7][8][9][10][11].
Current MPC algorithms for legged robots usually adopt simplified models [12], for example, the linear inverted pendulum (LIP) model [2,13,14], the single particle model [15] or the single rigid body (SRB) model [6], to balance between model precision and computation time. In the case of balancing on one foot, these models are unsuitable because they adopt the dynamics model of a particle or a rigid body. Thus, the motion of the swinging leg cannot be predicted in MPC. In this research, we propose to adopt a three-particle model in the model predictive control, which is called the three-particle model predictive control (TP-MPC) approach. It considers the motion of the swing leg and requires far less computation time than a full-dynamics MPC. As a result, the TP-MPC combined with the WBC will solve the problem of one-foot balance control for practical biped robots with real-time computing requirements.

Related Work
Simplified models have been widely used in the balance control of bipedal robots. For example, Vukobratovic et al. first proposed the zero moment point (ZMP) concept and applied it to balance a bipedal robot [16]. Next, Kajita et al. proposed the LIP model, which simplifies the biped robot into a single mass point moving horizontally and connected to the ground by a massless retractable stick [17]. Finally, Pratt et al. extended the LIP model to a Linear Flywheel Inverted Pendulum Model, which takes the robot's angular momentum into account [18]. Their method can resist a large external disturbance by computing the robot's Capture Point and Capture Region. Li et al. used the momentum to control the balance of a biped robot, and they found that linear momentum was more critical than angular momentum in the balance control task. When the two momentum tasks cannot be satisfied simultaneously due to a significant disturbance, the linear momentum will be given priority [19].
WBC has also been applied to the bipedal robot's balancing. Xie et al. used the hierarchical WBC to maintain a balance of a standing biped robot [20]. Their algorithm can execute on an onboard computer in real time by optimizing the computing process. However, their control scheme needs a planner to generate feasible trajectories followed by the WBC to resist large external disturbances. Kim et al. proposed a new whole-body control approach called whole-body motion control [21]. It can realize dynamic walking on a bipedal robot with a passive ankle and maintain balance under external disturbance.
Model predictive control repeatedly solves a finite-horizon optimal control problem starting from the system's current states [22]. Moreover, the MPC can prepare for future motions in advance. Therefore, it is very suitable for highly dynamic motion control of legged robots. Li et al. applied a single rigid body model predictive control to a biped robot with light legs [23]. The MPC generates optimal contact forces and torques at each foot, which are converted to joint torque commands through contact Jacobians. Their method enables the robot to walk at a speed of 1.6 m/s on complex terrains and achieve a wide range of dynamic motions. Luo et al. proposed a three-mass model predictive control method for gait control of biped robots [24]. Their planar three-mass model considers the robot's ZMP and angular momentum. The swing leg trajectories are generated by heuristics rather than the MPC. Therefore, their method did not utilize the swing leg to balance the robot.
From a biomimicry point of view, swinging the legs is crucial for the balanced control of the torso [1]. Boston Dynamics released a video in 2013 of using swinging legs and arms to adjust posture but did not release relevant technical details [25]. To the best of our knowledge, other than the two works mentioned above, no literature has been published on the use of swinging legs to control the balance of biped robots.

Contribution
The main contributions of this paper are as follows: 1.
The proposed TP-MPC method can generate feasible swing leg trajectories that balance the robot while standing on one foot. The WBC tracks the generated swing leg trajectories. As a result, the overall control scheme can resist large external disturbances.

2.
The TP-MPC catches the main effects of the swing leg motion while being simple enough to operate at the same frequency as the WBC.
The overall structure of this paper is as follows: In Section 2, the three-particle model and the TP-MPC are derived. Section 3 introduces the hierarchical whole-body control approach and the overall control scheme. Section 4 discusses the simulation setups and results. Finally, Section 5 is for the conclusions and future work.

Three-Particle Model Predictive Control
In this section, we derive the three-particle model predictive control method. First, we introduce the simplified three-particle model for a bipedal robot standing on a single foot. Our MPC method requires to solve repeatedly a discrete, finite-horizon optimal control problem as follows: where n is the MPC's prediction length; x k and x k,re f are the actual state and reference state at step k, u k is the input of step k, and Q k and R k are the weight matrices, λ Γ = λ Γλ means the weighted value of λ of weight Γ . Equations (2) and (3) are equality and inequality constraints, respectively. Figure 1 shows the three-particle simplified model. The stance leg of the biped robot is fixed, and the position of the robot's center of mass can be adjusted by moving the torso and the swing leg to fall near the desired position. Here, we simplified the legs and the torso into three mass points without rotational inertia. In most cases, the torso's center of mass is located directly above the midpoint of the two hip joints but has an offset in the sagittal direction.

Three-Particle Simplified Model
As shown in Figure 1, p st = p st,x p st,y p st,z , p sw = p sw,x p sw,y p sw,z are the positions of the end of the stance leg and the swing leg in the world coordinate system, p b = p b,x p b,y p b,z is the position of the torso in the world coordinate system, p st = ṗ st,xṗst,yṗst,z ,ṗ sw = ṗ sw,xṗsw,yṗsw,z are the velocities of the end of the stance leg and the swing leg in the world coordinate system,ṗ b = ṗ b,xṗb,yṗb,z is the velocity of the torso in the world coordinate system. m st , m sw , and m b denote the mass of the stance leg, swing leg, and torso, respectively. In general, bipedal robots are symmetrical in the y direction, so the hip joint's distance from the hip's center in the y direction is l y . Let the distance between the torso center of mass and the hip plane be l z , and l x is the distance between the torso center of mass in the x direction and the hip motor of both legs. Let p m,st ,p m,sw be the center of mass position of the stance leg and the swing leg, respectively. To simplify the model, we assume that each leg's center of mass lies on the midpoint of the hip and foot, which means The robot system's center of mass can be derived as follows: where m = m b + m st + m sw represents the robot's total mass. Therefore, the robot's center of mass at time k is: Let x = p b p swṗ bṗ sw be the state variables of the system, u = p bp sw be the control input. For the model predictive control design, using Taylor expansion, we obtain the state equation of the discrete-time linear system as: Substituting Equations (4) and (5) into Equation (7) yields the location of the biped robot's center of mass: where m 1 = 1 2 (m sw + m st ) + m b and m 2 = 1 2 m sw .

Tasks
For the one-foot balance control of a biped robot, the primary task is to minimize the position and velocity tracking errors of the robot's center of mass. The position tracking error of the swing leg should also be minimized. Moreover, the system input should be regularized to avoid excessive joint torques. Therefore, we formulate the cost function of the MPC as follows: where p CoM,k − p CoM,re f ,k Q is the CoM position tracking task where p re f ,k is the reference trajectory of the CoM; ṗ CoM,k T is the CoM velocity tracking task, where the desired velocity is set to 0; p sw,k − p sw,re f ,k S is the position and velocity tracking task of the swing leg, where p sw,k and p sw,re f ,k are the actual and desired trajectory of the swing leg, respectively; p b,k − p b,re f ,k W is the position and velocity tracking task of the torso; p b,k and p b,re f ,k are the actual trajectory and the desired trajectory of the torso; u k R is the input penalty term; Q ∈ R 3×3 , T ∈ R 3×3 , S ∈ R 6×6 , W ∈ R 6×6 , R ∈ R 6×6 are the correspondent weight matrices. Substituting Equation (8) into Equation (9), the position of the robot's center of mass can be expressed as: The velocity of the center of mass can be obtained by taking the derivative of Equation (6). Then, its discrete form can be obtained: where The position and velocity of the swing leg can be derived by p sw,k = Ex k , where is the selection matrix. Similarly, for the torso position and velocity, p b,k = Gx k , where

Constraints
In order to ensure the real-time performance of the model predictive control algorithm, there should not be too many constraints in the MPC. Here, we only constrain the length of the swing leg to avoid the kinematic singularity. Since the Cartesian distance L = (x − x 1 ) 2 + (y − y 1 ) 2 + (z − z 1 ) 2 is a nonlinear function of coordinates x, y, z, we use the Manhattan distance instead. The swing leg's length constraint is written as Take the intermediate variable s leg = s x s y s z , which satisfies Then, s leg ≤ L leg is equivalent to the constraints of Equation (16). The matrix form of the constraints can be derived as:

Tasks
In order to improve the calculation efficiency, we eliminate the intermediate variables by substituting system dynamics into the cost function. Equation (8) can be expanded to Rewrite Equation (22) into matrix form as Then, by replacing the matrices with X, A, B and U, Equation (23) can be rewritten as: Similarly, for the robot's center of mass position, where C = diag(C, C, · · · , C) ∈ R 3n×12n , D = D , D , · · · , D ∈ R 3n×1 . The CoM velocity can be written as:Ṗ CoM = FX (26) where The position and velocity equations of the swing leg can be written as: where Similarly, the position and velocity equations of the torso can be written as: where The stacked acceleration vector is as follows: Therefore, the cost function of Equation (10) can be rewritten as where Q = diag(Q, Q · · · Q) ∈ R 3n×3n (34) By eliminating the intermediate variables, Equation (33) can be rewritten as

Constraints
According to Equation (3), the stacked form of the swing leg length constraint can be derived as where L leg = L leg , L leg , · · · , L leg ∈ R n×1 (43)

The Quadratic Programming Problem
The MPC problem can finally be transformed into the following quadratic optimization problem, which can be solved by off-the-shell QP solvers: where

Problem Formulation
The generalized coordinates of the floating base bipedal robot are defined as: where q f ∈ R n f represents the position and orientation of the robot's floating base relative to the inertial coordinate system, and q j ∈ R n j represents the angle of the actuated joints of the robot and n q = n f + n j . The dynamic equation of the system is formulated as whereq andq are the first and second-order derivative of q, respectively, M(q) ∈ R n q ×n q is the mass matrix of the robot, h(q,q) ∈ R n q collects the Coriolis force, centrifugal force and gravity, and τ j ∈ R n j is the joint torque: ω c,h = f c,h τ c,h ∈ R 6×1 is the external wrench applied to one leg, including three-dimensional external contact force and three-dimensional contact torque. For the biped robot, ω c = ω c,0 ω c,1 ∈ R 12×1 . J c (q) ∈ R 12×n q is the contact Jacobian matrix. In order to enable the robot to complete tasks with different priorities, a quadratic programming method with priority hierarchy was proposed [26]. The idea is to put the solution space of the upper priority level as a constraint of the lower level in the optimization problem. The optimization problem of the ith level task is defined as: where A i and b i represent the task matrix and target vector corresponding to the current priority. If multiple tasks have the same priority, they could be weighted and combined by a diagonal matrix Q i . C i , ld i and ud i represent the constraint matrix, lower bound, and upper bound of all constraints of the current priority. A aug i−1 = A 1 · · · A i−1 is the augmented task matrix corresponding to all tasks in the previous priority, and χ * i−1 is the optimal solution of the previous priority. C aug i−1 , ld aug i−1 , ud aug i−1 , with a similar form of A aug i−1 , represent the augmented constraint matrix and bounds corresponding to all constraints in the previous priority. Here, the tasks in level i have higher priorities than tasks in level j when i < j. χ i in level i represents the optimization variables to complete the tasks with priorities higher than i + 1, which means that the final optimal solution of multi-tasks with priority level n l is χ * n l . Here, we select the generalized joint acceleration and the contact wrench as the optimization variables in each level:

Tasks
The definition of each level's tasks and constraints is given in Table 1. The floating base dynamics task is the basis of all motions of the mobile robot, and all the active forces come from joint motors, so the floating base dynamics task and joint moment constraints are set as the first priority. The linear momentum and trunk posture are critical factors for robot stability, while the ZMP constraint and the plantar friction cone constraint can ensure that the foot of the biped robot will not flip and slip, so these tasks and constraints are set as the second priority. The position of the robot's feet, posture task, and plantar force screw task are set as the third priority. The detailed definition of each task and constraint is as follows.

Floating Base Dynamics Task
The first six rows of Equation (61) are part of the floating base dynamics, and it can be extracted by a selection matrix S f :

Centroidal Dynamics Task
Set the desired external force of the centroid task as f G,des ∈ R 3 , and design the following PD controller to track the centroid task: where p G,tar ∈ R 3 ,ṗ G,tar ∈ R 3 andp G,tar ∈ R 3 represent the desired position, velocity, and acceleration, respectively; p G ∈ R 3 ,ṗ G ∈ R 3 andp G ∈ R 3 represent the actual position, velocity, and acceleration, respectively; S G = I 3×3 0 3×3 represents the selected matrix, A G ∈ R 6×n q represents the centroidal momentum matrix [27]; K p,0 ∈ R 3×3 and K d,0 ∈ R 3×3 represent PD gain matrices.

Torso Orientation Task
Set the desired acceleration of the torso orientation task asθ t,des ∈ R 3 , and design the following PD controller to track the torso orientation: where θ t,tar ∈ R 3 ,θ t,tar ∈ R 3 , andθ t,tar ∈ R 3 represent the desired torso orientation angle, angular velocity, and angular acceleration, respectively; θ t ∈ R 3 ,θ t ∈ R 3 andθ t ∈ R 3 represent the actual torso orientation angle, angular velocity, and angular acceleration, respectively; K p,1 ∈ R 3×3 and K d,1 ∈ R 3×3 represent PD parameter matrices. The first six rows in the optimization variable Υ represent the position and orientation of the torso, and the fourth to sixth rows are the torso orientation, so, for the torso orientation task:

Feet Position and Orientation Task
As above, letr f ,des ∈ R 12 be the desired acceleration of the foot posture, and the PD controller is designed as follows: where r f ,tar ∈ R 12 ,ṙ f ,tar ∈ R 12 andr f ,tar ∈ R 12 represent the desired foot posture, velocity, and acceleration, respectively; r f ∈ R 12 ,ṙ f ∈ R 12 andr f ∈ R 12 represent the actual foot posture, velocity and acceleration, respectively; K p,2 ∈ R 12×12 and K d,2 ∈ R 12×12 represent PD parameter matrices. The acceleration of the foot of the end can be obtained according to the following equation: Then,

Contact Wrench Task
The contact wrench task tracks the desired contact wrench. Let ω c,des be the desired contact wrench. The last 12 dimensions of the optimization variable Υ are the contact wrench, so for the contact wrench task,

Joint Torque Constraint
The last ten rows of Equation (61) represent the actuated joints, which should be limited by torque: where S j = 0 n j ×n f I n j ×n j represents the selection matrix. The joint torque can be derived as: Therefore, the joint torque constraint is: where τ limit ∈ R n j is the torque limit of each actuated joint. Then, it can be obtained that:

ZMP Constraint
According to [28,29], the ZMP constraint on x direction can be written as where l − x and l + x are the minimum and maximum values of ZMP, d is the vertical distance between the foot joint and the ground. τ c,y is the moment applied to the foot in the direction of y, f c,x , and f c,z represent the force applied to the foot in the direction of x and z. For the y direction, the ZMP constraint can be derived similarly. Rewrite the ZMP constraints in matrix form, where

Foot Friction Cone Constraint
The stance leg's ground reaction force should stay in the friction cone to avoid slipping. Here, we employ a pyramid approximation of the friction cone and formulate the foot friction cone constraint as where µ is the coefficient of friction; f x,y is the resultant force in the direction of x and y, and f z is the force in the direction of z. Expand and arrange the aforementioned equation into the matrix form: Figure 2 shows the overall control framework. Both TP-MPC and WBC run at 1 kHz. The input of TP-MPC is the desired position of CoM, swing leg, and torso in the future n steps. WBC then takes the target trajectory of the swing leg calculated by TP-MPC and the target trajectory of CoM and torso orientation given by the stand planner as input. Although TP-MPC can also obtain the trajectories of the torso and CoM, we do not use them in our controller.

Simulation Setup
The simulation platform is the open-source robot simulator Webots [30]. The control algorithms are implemented with C++. We adopt the rigid body dynamics library (RBDL) [31] to calculate robot kinematics and dynamics and the qpOASES [32] toolkit to solve quadratic programming problems. The simulation environment and the controller run on a computer with an i7-11800H 2.30 GHz processor, 32 GB memory, and RTX3070 8 GB RAM GPU.
The bipedal robot in simulation has ten actuated joints and a total weight of 45 kg. The weights of the torso and a single leg are 22 kg and 11.5 kg, respectively. A single leg accounts for around one-fourth of the robot's total weight, which implies a significant effect of the swing leg's motion on the entire robot states. The robot's soles are 0.22 m-by-0.1 m rectangles.
As shown in Figure 3, the biped robot stands on a single leg, and a rigid hanging ball swings down to hit the robot. The suspension point of the ball is vertically above the robot. The ball is released from the same height and angle with zero initial speed. The robot's torso is hit 0.8 m above the ground in the frontal or the side direction. The initial height of the robot's CoM is 0.604 m. Therefore, the hit will exert a net torque concerning the robot's CoM. By increasing the rigid ball's mass, the impulse received by the robot at the instance of impact increases, meaning larger external disturbance. We compare our control scheme with the WBC-only control scheme in each experiment. For both control schemes, the WBC part is the same. For the WBC-only scheme, the desired torso and leg trajectories remain equal to their initial values.

Frontal Impact
In the frontal impact experiment, the rigid ball swings along the x-axis and hits the center of the torso's front surface. When the mass of the rigid ball is 9 kg or 11 kg, the robot's CoM trajectories during the impact and recovery stages are shown in Figures 4 and 5, respectively. In the 9 kg case, both control schemes can resist the external disturbance and bring the robot's CoM back to its initial position, but our MPC-WBC scheme achieves smaller CoM motions and faster recovery than the WBC-only scheme. For both control schemes, Figures 6 and 7 show the actual trajectories of the swing leg's foot end and the correspondent reference trajectories given to WBC. For the WBC-only scheme, the WBC's reference trajectories are constant, while the actual trajectories deviate significantly from the reference. This deviation results from the WBC sacrificing the foot-end tracking task for the high-priority CoM tracking task. For our MPC-WBC scheme, the reference foot end trajectories provided by the MPC and the actual trajectories almost coincide with each other. This suggests that the MPC generates swing foot trajectories that are coherent with the CoM tracking targets. It is also observed that the MPC-WBC scheme generates larger swing foot motions than the WBC-only scheme, which confirms the swing leg's vital role in balancing the robot.
When the ball's mass increases to 11 kg, the WBC-only scheme causes the divergence of the robot's CoM trajectories and thus the failure of the entire robot. However, our MPC-WBC scheme can still withstand the impact. Screenshots in Figure 8 depict the robot's motions during the impact and recovery process under our MPC-WBC scheme. Significant forward and upward motions of the swing leg can be observed after the impact, implying that our method successfully utilizes the swing leg motions to balance the robot. Figures 9 and 10 show the robot's centroidal pitch angular momentum. Although the rotational dynamics are not included in our TP-MPC, the simulation results show that our method is sufficient to deal with a certain degree of torque perturbation relative to the CoM. Figure 11 shows the computation time cost of the TP-MPC algorithm. It can be seen that most of the computations are within 500 µs.

Side Impact
Compared with frontal impact, the robot is more vulnerable to side impact since the supporting polygon is narrower in the side direction. For side impact, when the ball's mass is 2.7 kg or 3.2 kg, the robot's CoM trajectories during the impact and recovery stages are shown in Figures 12 and 13, respectively. Figures 14 and 15 show the actual trajectories of the swing leg's foot end and the correspondent reference trajectories given to WBC. Figures 16 and 17 show the robot's centroidal roll angular momentum. Similar to the situation of frontal impact in Section 4.2.1, both control schemes can balance the robot under the 2.7 kg ball's impact, with our MPC-WBC scheme achieving better performance. Under the 3.2 kg ball's impact, the WBC-only scheme fails, while our scheme still balances the robot. Screenshots in Figure 18 show that the robot can also side-swing its leg under a side impact. Figure 19 shows the computation time cost of the TP-MPC algorithm when the robot is subjected to a side impact.

Discussion
In the WBC, the centroidal dynamics task is considered the second-priority task. Due to its existence, the biped robot can mobilize all joints to maintain the position and velocity of the CoM. Therefore, the WBC-only controller has a certain degree of anti-interference ability. However, it may fail to deal with a considerable impact. The centroidal dynamics task is expressed on the acceleration level. When the acceleration required to achieve the second-priority tasks has been satisfied, the third-priority tasks will be executed as much as possible. That is to say, the WBC will execute the task of swing foot to some extent, even if the robot's CoM has not returned to its initial state. In the WBC-only control scheme, the desired trajectory of the swing foot only keeps the initial state unchanged, which is almost unhelpful for maintaining the stability of the CoM. On the contrary, in our control scheme, the TP-MPC provides the swing foot with the desired motion trajectory that can balance the position of the CoM. The task of swing foot is almost consistent with the centroidal dynamics task in WBC. Therefore, the control scheme with TP-MPC shows a better ability to maintain the stability of the center of mass.
In the above experiments, the maximum computation time of our TP-MPC is 961 µs, and the average value is about 200 µs. Through proper optimization in coding, our algorithm can run at the frequency of 1 kHz on the real robot.

Conclusions
This paper proposes the three-particle model predictive control scheme to solve the balance control problem of a bipedal robot standing on one foot. We verified our control scheme in simulation. Compared with the WBC-only control scheme, our scheme can resist much larger disturbances and bring the robot back to its initial state much faster. Although the TP-MPC discards the complicated full dynamics of the robot, it can still catch the main effects of the swing leg motions on the robot's CoM. Furthermore, since our MPC employs only linear dynamics and constraints, it requires little computational cost, implying real-time execution at the same frequencies as the WBC on a real robot.
Our controller also has some deficiencies. The three-particle model may be oversimplified in some cases, for example, where the robot's legs or torso have large angular velocities. In the future, we plan to improve our simplified model, for example, including the rigid body's rotational inertia or adding more particles to the model. We also plan to verify our method on an actual bipedal robot.