Whole-Body Dynamics-Based Aerial Fall Trajectory Optimization and Landing Control for Humanoid Robot

When humanoid robots work in human environments, falls are inevitable due to the complexity of such environments. Current research on humanoid robot falls has mainly focused on falls on the ground, with little research on humanoid robots falling from the air. In this paper, we employ an extended state variable formulation that directly maps from the high-level motion strategy space to the full-body joint space to optimize the falling trajectory in order to protect the robot when falling from the air. In order to mitigate the impact force generated by the robot’s fall, during the aerial phase, we employ simple proportion differentiation (PD) control. In the landing phase, we optimize the optimal contact force at the contact point using the centroidal dynamics model. Based on the contact force, the changes to the end-effector positions are solved using a dual spring–damper model. In the simulation experiments, we conduct three comparative experiments, and the simulation results demonstrate that the robot can safely fall 1.5 m from the ground at a pitch angle of 45°. Finally, we experimentally validate the methods on an actual robot by performing a side-fall experiment. The experimental results show that the proposed trajectory optimization and motion control methods can provide excellent shock absorption for the impact generated when a robot falls.


Introduction
Humanoid robots are a tangible manifestation of human technological advancement, making them of great interest to researchers.The world's most advanced humanoid, Atlas, demonstrates flexible motion and manipulation abilities, pointing to the current direction of humanoid robotics development [1].Business magnate Musk has greatly supported humanoid robotic technology by funding relevant research.As seen in the latest video [2], Optimus can now assist humans in performing tasks in human environments.
At present, humanoid robot research primarily centers on walking, running, jumping, and aiding people in performing duties [3][4][5][6].Due to the complexity of the human environment, robots will inevitably experience falling behavior when completing these tasks.Therefore, designing a structure or control method that can effectively protect robots is very important.According to the law of impulse, when the robot has a certain mass, reducing the impact force of the robot on the ground mainly involves increasing the touchdown time and reducing the landing speed.For the aspect of increasing the ground impact time, Kajita et al. introduced a method using airbags [7]; however, the airbags employed can only be used once and must reinflate before they can be used again, making the method complex and expensive.Similarly, Sung-Hee Lee introduced a way of placing a backpack on the robot's back [8] to change the falling direction after the robot receives an impact, which is suitable for environments wherein surrounding objects do not interfere.In [9], Kakiuchi designed a robot with hard points mounted over its entire body, which looks like a man wearing a set of shackles and significantly limits the robot's workspace.Inspired by the self-protection of turtles, Nguyen et al. designed a type of shell protector.However, in their latest research, since the robot only had a lower body, the actual fall performance of the whole body has not yet been observed [10].In the aspect of decreasing the ground impact velocity, researchers use methods for optimizing trajectories and motion control methods.For example, Fujiawara optimized the falling trajectories of robots in different orientations through imitating Ukemi motion by using energy shaping and distribution methods [11].Luca established a variable-length inverted pendulum model to optimize forward falling trajectories for a robot [12].In addition, some researchers optimize the the distribution of contact points and the contact sequence, as described in [13,14].Regarding robot motion control, being able to effectively cushion impact forces is a very important strategy.Mujica proposed a variable admittance control that successfully achieved human-robot collaborative tasks [15].For the task of adjusting control parameters, ref. [16] proposed a new Lyapunov-based offline self-tuning control method.Similarly to [15,17] presented a compliant adaptive control method that can enable stable and safe physical human-robot interaction during work.In recent research, MIT proposed using model predictive control to mitigate the impact forces of acrobatic robot maneuvers [18].HaoxiangQi achieved stable control after a robot's high jump landing by optimizing contact forces and using a virtual model [19].
Based on the above references, we can see that current research on robot falls mainly considers robots falling on flat ground.However, research on robots falling from high altitudes is still lacking.Because human exploration desires are endless, people also hope that robots can replace humans to accomplish some high-difficulty actions such as aerial acrobatics and parachuting.But when performing these actions, robots face the risk of falling.Therefore, more research needs to focus on solutions to help robots mitigate the impact of falling from high altitudes in order to achieve complex aerial tasks that are difficult for humans.
Research on falling mitigation for robots mainly focuses on quadruped robots, notably the Mini Cheetah from MIT [20,21].Research employs offline or online optimization to combine machine learning or neural networks when falling from heights.The results show that by using the method in [20], a robot can fall from a height of 10 feet (about 3 m); while using the method in [21], a robot can fall from around a 2 m height.Another avenue of research is from a bio-inspired perspective, where a tail is added to the quadruped robot to cushion the impact of falling, which is discussed in papers [22,23].Different from the methods and experimental objectives mentioned above, Francesc proposed an optimizationbased reactive landing controller for handling horizontal impacts during falls [24] and showed that a quadruped robot can recover from falls at horizontal speeds up to 3 m/s.In this paper, we adopted a direct mapping relationship from the high-level motion task strategy space to the robot's whole joint space by setting the robot's different state variables and adding the corresponding contact constraint equations.In this method, the high-level action task strategy is utilized as the objective function of the robot, allowing the optimizer to satisfy our desired actions.For instance, when a robot experiences a fall from the air, its high-level motion task strategy space is to minimize contact force, follow a prior trajectory function, and distribute the force evenly across each contact point.Using the optimizer solution method, we optimized the trajectory of a robot falling from the air given the initial and target states.When the robot was in the aerial stage, we used a proportion derivative (PD) method to control the posture of the centroid and endpoints.When the robot was in the landing stage, the target reference force was optimized using centroid dynamics to reduce the impact force by damping control.Figure 1 illustrates the overall framework of our method.The contribution of this work is three-fold:

•
An optimized trajectory for the robot falling from the air is generated, enabling the humanoid robot to land smoothly on the ground.• Direct mapping is adopted from the high-level task strategy space to the joint space, bypassing the need to go through the centroid space as in previous methods.

•
Simulation results show that the trajectories optimized after adding contact point information are more human-like compared to those without contact information.

Trajectory Optimization
Traditional motion planning methods map from the high-level motion task strategy space to the centroid space and then to the full-body joint space of humanoid robots [25].The advantage of this method is that the centroid motion model is simple and can easily generate whole-body joint motions.However, due to the oversimplified movements in the centroid space, which cannot consider all possible contacts, and its inability to utilize known information about the robot's spatial posture, we employ an optimization method that maps directly from the high-level motion task strategy space to the full-body joint space of the robot in this paper, as shown in Figure 2.

Methods
According to the above-mentioned method, we adopt an approach that works from the high-level task space directly to the joint space, putting the robot's joint angles into the optimization variables.The state variables of existing trajectory optimization contact models mainly focus on the position, direction, velocity, and angular velocity of the center of mass; and since a humanoid robot produces contact forces with the ground during the landing phase, these contact forces also have an impact on the robot's overall state, so this paper considers the robot's contact forces and the velocity and energy of the contact point and establishes the following form of state variables: x = [p com ; θ com ; q all ] T (1) where p com represents the position of the centroid, θ com represents the direction of the centroid, q all represents all joint angles, p c represents the position of the contact point, ṗc represents the velocity of the contact point, and W ci represents the work done by the contact force corresponding to the i-th contact point.
Here, we represent the control input of the robot as follows where τ represents the driving torque of the corresponding joint of the robot, and f n represents the contact force of the n-th contact point.The advantages of using this form of state variables and control inputs are: First, we added the state update equations for the contact point position dimensions, allowing us to leverage partial prior information about contact points from the reference trajectory.Second, the additional terms for contact force work nicely to avoid energy conservation issues while facilitating reference trajectories based on energy changes.To further validate that the state variables proposed in this paper can provide good guidance for trajectory optimization of the robot's falling state, we conducted a comparison in the simulation section to demonstrate its effectiveness.When a humanoid robot falls from the air, it has no contact with the environment, but when it lands on the ground, parts of its body come into contact with the surroundings.These contact forces affect the robot's base; however, the full-body dynamic model provides equations describing the impact of external forces on the robot's motion.According to reference [26], we can obtain the robot dynamic equation involving all external contact forces as where ) is gravity acting on the robot, τ ∈ R (N+6)×1 are the joint torques of the robot, x ∈ R (N+6)×1 are the state variables of the robot, is the Jacobean matrix of the i-th contact point, and f i ∈ R (6n)×1 is the contact force of the robot at the i-th contact point, where N denotes the number of degrees of freedom.When a robot's body comes into contact with the ground, an acceleration is produced at the point of contact.According to the contact dynamics equation mentioned above, the acceleration at the contact point can be related to the contact force via an expression as follows

Cost Function
In this paper, we aim to achieve the reference target with minimal driving force, and there should be no uneven force distribution at each contact point.Therefore, the objective function can be defined as follows: where U represents the robot's control input; U re f represents the robot's reference input; X and X re f represent the state variables and reference variables mentioned earlier, respectively; F re f represents the reference contact force; and w 1 , w 2 , and w 3 represent the corresponding weight matrices.The purpose of setting the third term is that we do not expect the robot to have an uneven distribution of forces after falling down the ground.In optimization, we expect the robot's knees and arms to land, resulting in a F re f value of mg/4.In addition, we expect the robot's center of mass to be 0.8 m above the ground with no slippage at the contact points and with the robot's roll, pitch, and yaw angles to be 0.

Constraints
The following constraints are crucial for optimizing the trajectory: (1).At the start of optimizing the robot, the minimum and maximum values need to be specified for the initial state variable X 0 ; also, the minimum and maximum desired values for the final state variable X F need to be specified.Additionally, the minimum time t min and maximum time t max for the desired robot motion, the minimum q min and maximum q max values for all joints, and the lower limit U min and upper limit U max for the control inputs need to be specified.To accelerate convergence, an initial guess value is also created, including the motion time t guess , state X guess , and control input U guess for the robot.
(2).For the dynamics, we want the entire motion trajectory to satisfy the full-body dynamical model, as shown in Equation ( 4).
(3).In terms of kinematics, during the optimization process of the robot, the maximum extension of the robot's end-effector should less than the length of the robot arm or leg and should meet the forward kinematics equation [21].
(4).During the optimization process, we found that when the robot falls in a nondynamic environment, the end points of the robot's arms or legs are prone to mold piercing when colliding with the ground, as shown in Figure 3. Therefore, it is necessary to set the vertical distance p z i > 0 of the robot's collision point.(5).When the robot makes contact with the ground, we want no slippage to occur.Based on Posa's contact complementarity constraint equations [27], we can obtain the relationship between the contact position, contact force, and contact velocity, as described below, where ξ is a slack coefficient to encourage convergence.6).In [28], the robot's sole was simplified to four support points, with each support point corresponding to a tetrahedron that forms a relatively complex friction cone constraint.In this paper, a simplified model is adopted by approximating each contact location to a single point.Based on the friction relationship, the following equations can be obtained, where F N and F f are optimization variables from the inputs above.

Air Stage Controller
Through previous trajectory optimizations, we can optimize the state variables and control inputs of the robot during the falling process.These parameters are essential for setting the motion controller.During the air stage, due to differences between the simulation model and the actual physical model, there will be deviations in the position and orientation of the robot's center of mass during motion.Therefore, designing an effective motion controller to minimize these deviations becomes crucial.The PD controller has the function of simple parameters and easy implementation, so we set the following control equations where p represents the position, R represents the orientation, subscript a represents the actual value, subscript d represents the desired actual value, and end represents the endpoint of arms or legs.The variables k p c , k d c , k R c , k w c represent the corresponding coefficients and have the same meaning as the coefficients in Equation (12).

Landing Controller
During a robot's walk or run, the center of mass is usually placed at the hip joint center; however, the robot requires a more precise position during falls.Therefore, we use the actual link lengths and mass distributions to solve for the corresponding center-ofmass position.
where N n is the total number of links, m i is the mass of the ith link, and p i (q) represents the position of the ith link in the world coordinate frame.
Although the aforementioned trajectory optimization method solves for the impact force of the robot's contact points during landing, this is done without considering external disturbances and lacks some robustness.In other words, since the landing time is relatively short, designing a method that can efficiently and quickly respond to such impact forces becomes very important.Model predictive control needs high precision in modeling, but optimization is computationally intensive and time-consuming.Also, different cost functions need to be designed for various scenarios, limiting generalizability.Since a robot's fall from the air and contact with the ground happens very quickly, a fast-responding algorithm is needed.Here, we refer to the landing control method proposed in [29]; according to the Newton-Euler laws of motion, we can obtain the following: where m denotes the mass of the robot, p com is the position of the center of mass, c is the position from the contact point to the center of mass, L is the angular momentum about the center of mass, and F all and n represent the force and moment, respectively, exerted on the robot by the environment, expressed in the world frame.Similar to [30], we approximate the angular momentum equation of the robot's center of mass as follows: where I all is the angular part of the centroidal momentum matrix; q is the angular velocity of all joints, including the floating base; I base is the matrix block corresponding to the base coordinate in I all ; w is the angular velocity of the base link; and I is a constant and diagonal approximation of I base .Combining Equations ( 14) and ( 15) and neglecting the effect of the moment, we can obtain the following equation: We set the force and torque when the robot lands as Let We can express Equation ( 16) as a form of a quadratic program (QP), where α 1 and α 2 represent the corresponding weight coefficients, the rightmost term in the equation means that the expected landing impact force of the robot is minimized.

Spring-Damper Controller
Using the method above, we have optimized for the impact force during the robot's landing.However, since our robot is position controlled, we need to convert this to corresponding joint angles.To do so, we adopt a dual spring-damper model as shown in Figure 4.Where k 1 , k 2 , D 1 , and D 2 are the spring and damper coefficients, respectively.Let ε be the overall deformation of the robot after being subjected to an external force f , where ε 1 and ε 2 are the deformations on the left and right sides, respectively.Then according to Hooke's law, we can obtain: Applying a Laplace transformation to the second term of Equation ( 20) yields Substituting Equation ( 21) into the first term of Equation ( 20) and taking the inverse Laplace transform, we obtain Let the state variable of the spring-damper model be χ = f , ε, ε T , where the control input u = ε; then the state equation of this spring-damper model can be written as: Based on Equation ( 23), we can obtain the relationship between the desired state variable and the desired input as follows: Subtracting Equation ( 23) from ( 25), we have The state feedback controller is given by ∆ν = −K∆χ (27) where k = [k 1 ; k 2 ; k 3 ] T , which can be obtained through the LQR method.Define the cost function as where Q and R are the weight matrices of the state variables X and the input variable u, respectively.Let K = R −1 B T P, where P can be obtained through the Riccati equation: Through Equations ( 28) and ( 29), we can get the values of the gain coefficient k = [k 1 ; k 2 ; k 3 ] T ; then the end change can be expressed as follows:

Simulation Platform
The simulation platform used in this paper is a bipedal robot independently developed by our laboratory, as shown Figure 5.The robot weighs 50 kg in total and has 20 degrees of freedom, including 6 degrees of freedom in the legs, 2 degrees of freedom in the waist, and 3 degrees of freedom in the arms.The specific dimensions and parameters are shown in Table 1.

Trajectory Optimization
The robot's falling trajectory was optimized in MATLAB, while the robot's kinematics and dynamics were generated by the open-source software FROST [31].The optimization library used consulted Matthew Kelly [32,33].In the optimization process, we first follow the first constraint introduced in Section 2.3 of the article to give the upper and lower bounds of the robot's state variables, motion time, state input, and so on.To shorten the optimization time and avoid local optima in the first optimization process, we give a free-fall trajectory as an initial trajectory, which is a simple trajectory that does not satisfy the dynamic constraints but can constrain the optimization result to an ideal reliable value range, as shown in Figure 6.Additionally, we selected the trapezoidal method as the interpolation approach.Obviously, the initial reference trajectory from the first optimization did not satisfy the dynamics equations.So after this, we put the optimized reference trajectory into the estimate, performed a second optimization, and repeated this process until the optimal trajectory appeared.

Simulation
We perform the simulations using Coppeliasim and MATLAB.In Coppeliasim, the robot's physical engine is Bullet 2.78 and the control period is 5 ms.The initial pitch angle of the robot is 45 • , the distance from the ground is 1.5 m, the ground friction coefficient is 0.75, and the other parameters are introduced in Table 2, which appears at the end of the article.We set two different sets of state variables to compare the optimization results: the first set of optimization variables are as shown in Equation ( 1), and the second set of optimization variables are as shown in Equation (2).The other constraints are kept consistent.In addition, we also simulated the effect of adding the controller to the robot when it landed.As described above, we used the equation in (1) for the optimization variables.Figure 7 shows the optimized robot motion states obtained using this method.As can be seen from Figure 7(1)-(4), in order to reduce the landing speed, the robot swings its arms backward.Figures 7(5)- (8) show that after the robot lands, its knees quickly touch the ground and its arms also start to find the landing position.Figure 7(8) is the final state, and it can be seen that without the constraint of touchdown information, the robot has problems such as flipping over backward and unbalanced ground contact, and the overall optimized motion exhibits unreasonable phenomena.

Extended State Variables
In the second set of trajectory optimization experiments, we optimized using extended state variables.The optimized robot motion states obtained are shown in Figure 8.Compared to Figure 7, the state variables contain touchdown information.It can be seen that when in the air, the robot stretches its arms and legs straight while bending its waist, which is conducive to adapting when contacting the ground; after the front end of the foot touches the ground, the robot immediately bends its knees to reduce the ground impact while the arms touch the ground to share the pressure.After the robot's hands or knees make contact with the ground, the whole body keeps balance.

Extended State Variables and Control
Although the trajectories optimized using extended state variables conform to the full-body dynamics model, the impact force during the robot's landing remains large.To solve this problem, we incorporated the motion controller described above.When the robot is moving in the air, simple PD control is used.However, when the robot lands, the landing controller is engaged.Figure 9(5)- (8) shows the effect of it.Once contact with the ground is detected, the robot swings its arms backward, presses down its body, and moves forward.Near the end of landing buffering, in order to maintain an overall balanced posture, the robot's upper body moves upward and recovers to a four-point landing state.

Graphical Analysis
Due to the significant changes in the z-and y-directions during the robot's fall, this study primarily considers the hip, knee, ankle, waist, shoulder, and elbow joints.Figure 10 shows the joint angle trajectories of the robot joints optimized using the aforementioned trajectory optimization method.It can be observed from the figure that the joint angle trajectories vary smoothly without any anomalous values.Figures [11][12][13] show schematic diagrams of the robot's actual center-of-mass position, velocity, and orientation, respectively, when falling from the air.The trajectory plots include the cases of extended state variables and control (with control), extended state variables (without control), simple state variables (without contact points), and free-fall motion.The other snapshots are similar.The red curve in Figure 11 shows that after adding the landing controller, the robot's center-of-mass trajectory continues moving downward and then recovers to a stable state, consistent with the motion in Figure 9.In Figure 12, the free-fall motion velocity curve is blue and has a maximum velocity reaching 4.02 m/s.The purple curve indicates optimization with no contact point information in the state variables: the robot lands after 0.76 s with a maximum velocity of 3.92 m/s.Comparing the red and fluorescent curves (without control), the speed of the robot drops directly from 3.1315 m/s to 0.18185 m/s within 0.04 s after landing, while with control, the velocity decreases from 2.9311 m/s to 1.3363 m/s.This demonstrates that the proposed landing controller provides good shock absorption.Figure 13 shows the orientation of the robot during landing, and it can be seen that after touchdown, the landing controller starts to take effect, demonstrating the effectiveness of the proposed method.
Figure 14 shows a schematic diagram of the impact force on the robot's right hand when striking the ground.Since in free-fall motion we do not want the robot's arms to contact the ground, the impact force is almost 0 during landing.Compared to no landing controller and trajectory optimization without contact point information, the red curve represents the use of control during the fall and has a maximum impact force of 5797 N at the instant of touchdown.Approximately 0.1 s later, a secondary impact occurs, but this time the impact force is 709.2N, which is less than the 1132 N experienced without a landing controller.
Figure 15 shows the ground reaction force on the robot's right foot: it can be seen that free-fall motion has the most significant impact force.After adding the landing controller, the robot's first impact force is reduced to about 4100 N, and the second impact force is reduced to 2300 N. Comparing without control and without contact points, it can be seen that the impact effects of both are similar.These data demonstrate that the proposed methods can handle ground impact forces effectively.

Experiment
Falling from the air is a very dangerous maneuver for the robot and requires various safety equipment.However, we were unable to complete this experiment at this time since the hardware environment is still being set up.To validate the method proposed in this paper, we conducted an experiment outdoors using the example of the robot falling forwards to the right.We completed this experiment using trajectory optimization and motion control methods.Figure 16 shows the joint angle profiles of the robot optimized using the aforementioned optimization approach.During the motion, we applied a lateral force of approximately 150 N to the robot for 0.3 s.When the lateral push force exceeded the robot's self-adjustment range, it had to fall.Our robot detected the falling direction and threshold based on the method proposed in [34].Upon detecting the fall, the right leg quickly lifted up and the waist joint immediately twisted to brace for impact with the ground.Since the robot's hands are quite delicate, to avoid damage during the collision, we swung the arms upwards during the fall.This minimized the impact on the hands; the final effect is shown in Figure 17.

Conclusions
This paper optimizes a protection trajectory for a robot falling from the air.Compared with traditional trajectory optimization methods, this paper abandons the strategy of humanoid robots of working from the high-level motion task strategy space to the center-of-mass space and then to the whole-body joint space.Rather, we establish a relationship between the high-level motion task strategy space and the whole-body joint space.Moreover, this paper adds the robot's contact point information to the state variables, enabling it to utilize reference contact information to avoid phenomena that do not comply with contact dynamics during trajectory optimization.A PD controller is added during the robot's flight phase to control the position and direction of the center of mass or the endpoints.During the robot's contact phase, according to the center-of-mass dynamics model, the contact force is optimized.Assuming that force sensors are installed at the endpoint parts of the robot, then according to the actual applied force and the optimized applied force, we can use a damping controller to calculate the movement of the endpoint to finally put the above results into an inverse kinematics optimization based on QP to obtain the joint angle changes required to desired forces.The simulation and hardware experiment results show that by combining trajectory optimization and motion control methods, the robot can safely fall to the ground.Compared to simple free-fall motion or only simple state variables, this method can effectively buffer the impact force.

Future Work
In the future, we will implement online motion trajectory optimization of the robot and deploy it to the physical prototype to further prove the effectiveness of this method.

Figure 1 .
Figure 1.Planning and control framework for a humanoid robot falling from the air.The superscripts "d" and "a" mean the desired value and the actual (measured) value, respectively; X 0 and X F represent the initial state and the final state, respectively; X d and U d represent the expected state and the expected input value, respectively.

Figure 2 .
Figure 2. The differences between step-by-step centroid space optimization and unified strategy space planning methods.

Figure 3 .
Figure3.The robot exhibits ground penetration.In the simulation, the robot is in a non-dynamic environment with its arms and knees already penetrating the ground.

Figure 5 .
Figure 5. Snapshots of the humanoid robot.The left side represents a three-dimensional view of the robot, and the right side depicts a schematic diagram of the robot's joints.

Figure 6 .
Figure 6.The robot free-falls through the air and in the final state lies flat on the ground.

Figure 7 .
Figure 7. Snapshots of the robot falling from the air without contact point information in the state variables.

Figure 8 .
Figure 8. Snapshots of the robot landing without damping controller.

Figure 9 .
Figure 9. Snapshots of the robot landing with the damping controller.

Figure 10 .
Figure10.The joint trajectory curve optimized by our method.Owing to the bilateral symmetry between the legs and arms, only the joint angle profiles of the right leg, waist, and arm of the robot are displayed.

Figure 17 .
Figure 17.The robot falls on its side, and in Figures (a,b) an external force is applied to the robot,Figures (c,d) represents that the robot completes this protection action according to the optimized trajectory and PD control, Figures(e,f) indicates that the landing controller takes effect after the robot reaches the ground.

Table 1 .
The parameters of our robot.

Table 2 .
The control parameters of the simulation.
Snapshots of the centroid position of the robot falling from the air.Snapshots of the centroid velocity of the robot falling from the air.Snapshots of the centroid orientation of the robot falling from the air.Schematic diagram of the impact force generated by the robot's right hand hitting the ground.
Figure 15.Schematic diagram of the impact force generated by the robot's right foot hitting the ground.