Modeling of Cooperative Robotic Systems and Predictive Control Applied to Biped Robots and UAV-UGV Docking with Task Prioritization

This paper studies a cooperative modeling framework to reduce the complexity of deriving the governing dynamical equations of complex systems composed of multiple bodies, such as biped robots and unmanned aerial and ground vehicles. The approach also allows for optimization-based trajectory generation for the complex system. This work also studies a fast–slow model predictive control strategy with task prioritization to perform docking maneuvers on cooperative systems. The method allows either multiple agents or a single agent to perform a docking maneuver. In addition, agents give different priorities to a specific subset of shared states; in this way, the overall degrees of freedom available to achieve the docking task are distributed among various subsets of the task space. The fast–slow model predictive control strategy combines non-linear and linear model predictive control formulations: docking is handled as a non-linear problem until the agents are close enough, with the direct transcription calculated using Euler discretization. The trajectory generated in this phase is then tracked with a linear model predictive controller that addresses the close-proximity motion to complete docking. The trajectory generation and modeling are demonstrated on a biped robot, and the proposed MPC framework is illustrated in a case study where a quadcopter docks on a non-holonomic rover using a leader–follower topology.


Introduction
Planning a trajectory for complex robots such as bipeds is a demanding task, as the free-floating base of the robot is moved by the discontinuous contact forces acting on its feet. This propulsion method requires careful planning of the motion of the contact points and the contact forces while considering the dynamic effects on the robot [1]. In addition, a straightforward optimal control formulation results in intractable non-linear programs, as stated in the literature [2,3].
There are multiple causes of complexity in planning the trajectory of a floating base robot. First of all, the pose of the floating base is described by unactuated base coordinates with six degrees of freedom (DoF). Then, the actuated joint coordinates of the legs are added on top of that, which results in a relatively large set of joint coordinates [1]. Therefore, the trajectory optimization of such a system should find optimal values for all these coordinates. Secondly, contact between the feet and the terrain must comply with certain contact conditions, such as unilateral contact forces [4]. In the literature, this problem is handled as a trajectory optimization problem using numerical optimization techniques such as direct optimal control methods [5], indirect optimal control methods [6], dynamic programming [7], and sequential methods [8]. In this way, the trajectory of the robot, along with the joint coordinates, torques, and contact forces, is calculated while considering the upper-level constraints on the task.
The complexity of the problem is reduced in multiple ways. One way to carry this out is to use low-fidelity dynamic models for the robot, such as the single rigid body dynamics model, where a lumped inertia is attached to the body frame and the actuated links are assumed to move slowly and have low inertia [2]. Another way to reduce complexity is to split the optimization into smaller sub-problems and to pre-define some portions of the trajectory, such as footholds [9].
However, simplifications in the robot model and the optimization problem are traded off against the complexity of the motion that can be calculated with trajectory optimization. For this reason, having means to simplify full-body dynamics without compromising the fidelity of the model is vital. Recently, the trajectory optimization problem was distributed into smaller alternating sub-problems, where one first satisfies the dynamic constraints of the problem by finding the robot momentum and contact forces, and then finds the leg kinematics that satisfy the robot dynamics. This is denoted as centroidal and whole-body model splitting and was first introduced in [8], where a sequential optimal control formulation was presented. In another work based on the same splitting of whole-body motion and centroidal dynamics, the locomotion problem is cast into a mathematical framework based on the Alternating Direction Method of Multipliers (ADMM) [10]. That paper exploits the natural splitting between centroidal and manipulator dynamics and ensures consensus between these models. Another recent example uses the same splitting and introduces an accelerated ADMM algorithm to solve the locomotion problem [3].
Besides the centroidal and whole-body splitting of the dynamic model of a legged robot, there is another natural split between the portions of the body. This splitting is defined by the design of legged robots and is referred to as body and limbs. Depending on the design, these can contain a single body or a set of multi-bodies, and the interconnections between these sets are maintained over a connection node, which is a joint. A formal way to describe the cooperation between these sets already exists in the literature within the cooperative system framework [11]. This framework defines a relationship between independent agents using communication graphs, which dictate a mapping of the information flow among agents. Cooperative system frameworks are used in many areas, such as increasing the performance of a sensory network in the localization of data [12] or calculating the communication topology between dynamic agents within the cooperative system [13]. They are also used in designing controllers for agents within a network [14]. All these applications have one thing in common: the distributed modeling of the cooperative system and the distributed calculation of the variables to reach the desired objective.
Autonomous aerial systems have become essential in numerous practical fields, such as public safety, as surveillance and first-response tools [15][16][17]. They have also been studied by industry for the delivery of goods [18,19]. A critical task during the operation of an aerial system is the docking maneuver, where it might approach a stationary or a moving platform [20,21] to drop or load cargo. In addition, aerial vehicles require refueling or recharging to extend their workspace [22]. Docking is an intricate maneuver that necessitates awareness of the docking path's constraints regarding flight safety and the docked platform's attitude [23]. In addition, nonlinear effects such as the wake of the leading agent due to proximity flight must be considered [24], as well as the ground effect [25]. Thus, it is clear that the character of this maneuver requires the agent to handle certain constraints and uncertainties.
A popular control method to handle the previously described constraints and calculate control actions is model predictive control (MPC). An MPC reaches the desired control action by minimizing a given objective using linear and non-linear optimization theory, as applied in the case of a tracking controller [26]. MPC also allows the integration of secondary tasks into decision-making, as in [27], which makes it a unified strategy for handling docking maneuvers without requiring the ad-hoc integration of multiple frameworks that would increase complexity. MPC-based docking strategies for space applications exist in case studies where line-of-sight constraints are satisfied, energy-saving strategies are pursued, and docking on tumbling objects is executed [28][29][30].
Autonomous docking for aerial vehicles requires state information about the docked platform, which the docking agent receives. Therefore, it requires an on-board or external mechanism to sense and estimate states. On-board sensors such as cameras or LiDAR are widely implemented solutions (see [31]), yet they are bound by range limitations. External mechanisms to obtain state information about the docked platform are sensors placed on agents other than the docking agent, or external observers. As a result, this information is shared over a communication grid, as illustrated in [32]. Besides state information, some of the autonomous docking literature can be grouped in terms of cooperation at the controller level. In one work, a central controller calculates lateral and longitudinal velocity commands for both docking maneuver devices: an uncrewed ground vehicle (UGV) and an uncrewed aerial vehicle (UAV). These control signals are then fed to the agents [33]. In an extension of this work, an MPC strategy is utilized on the same system, where both agents cooperatively execute the docking maneuver [34]. In another work, a multirotor docks onto a fixed-wing platform, where only the multirotor executes the docking maneuver [21]. The aerial refueling problem is addressed as a docking problem in [22]; however, only the docking platform and the boom are manipulated. Recent work studies uncrewed sea surface vehicle (USV) landing and designs an MPC controller for a multirotor, where cooperative docking is executed [35]. However, the docking trajectory is calculated on the multirotor and shared back to the USV.

Motivation
In this context, this paper aims to keep the model fidelity of the biped robot higher than that of single-rigid-body dynamics and to adopt a cooperative system framework in defining the dynamics of the robot. As a result, the robot is defined by multiple cooperating agents, where cooperation is introduced into the trajectory optimization by an adjacency matrix acting as a constraint. In this way, the lumped trajectory optimization is converted into separate objectives and constraints that are linearly related.
When multiple agents communicate with each other and obey specific rules, i.e., collision avoidance, velocity matching, and staying within the vicinity of their neighbors, the aggregate of these agents is called cooperative agents [36], and the applications of these systems are vast due to their advantages. Popular applications include uncrewed vehicles [37] and space applications [38]. The nature of autonomous docking makes the system performing this maneuver a cooperative system. However, it is not formally addressed as such in the vast majority of the literature. This is due to the fact that in most scenarios, one of the vehicles is assumed to be tightly controlled and hence passive. The control laws are then derived for the remaining active vehicle. On the other hand, controllers cooperatively addressing this problem are mainly centralized on a ground controller or in one of the agents, which could suffer in performance or even fail due to uncertainties in the communication. Apart from the centralized implementation of cooperative docking, there are a few applications where both agents take part in the docking maneuver. Since the cooperative control framework allows various communication topologies, methods based on MPC with local neighbor state information can be applied. Besides the cooperative aspect, prioritization of the states to track during docking is typically not studied in the literature. As summarized before, agents that are too far apart from each other could first minimize the positional difference, so as to be in a feasible solution set when the MPC for docking is initiated. Instead of handling this problem as two separate sub-problems, an automated prioritization of tracking certain states can be defined so that linear states are given higher priority over the angular states (see [39]).

Contributions
The key contributions of this paper are as follows. The methods introduced in this paper are used to divide the lumped multi-body model of a large robotic system into cooperative multi-bodies and to address the trajectory optimization problem using this distributed model. In the case of the biped robot, although it has lightweight legs compared to the shoulders and the floating base, the robot will be modeled as three cooperative agents: the floating base, the right shoulder, and the left shoulder. All agents are defined as multi-bodies, as detailed in the following sections. The contributions of this paper are summarized as follows:

•
This method divides the EoM of the biped robot into smaller cooperative agents, which have simpler EoMs. Agents with simpler dynamics result in simpler equality constraints for the trajectory optimization.

•
The non-linear programming formulation given in this paper casts the trajectory optimization problem with a single objective and a single augmented Hamiltonian into split objectives and constraints.

•
This paper also proposes a cooperative control strategy based on MPC for docking.
The designed strategy implements a non-linear and a linear MPC for the coarse approach (long distance) and the delicate docking maneuver (short distance), based on the same objective function with tailored optimization strategies. A leader–follower type of topology is adopted, where the quadcopter docks on the UGV. As a showcase, this controller performs short- and long-distance docking of a quadcopter on a UGV.

•
Formulation of the MPC includes task prioritization, which is based on a null-space projection of the ranked tasks. The formulation is adopted from [40] by defining the docking task in terms of the docking agents' degrees of freedom (DoF).
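As a concrete illustration of null-space projection, the following sketch implements a standard two-level prioritization scheme in NumPy. This is a generic textbook construction, not necessarily the exact formulation of [40]; the Jacobians and task rates are hypothetical.

```python
import numpy as np

def prioritized_rates(J1, x1_dot, J2, x2_dot):
    """Two-level null-space task prioritization (standard scheme).

    The secondary task is projected into the null space of the primary
    task, so it can never disturb the primary task's tracking."""
    J1_pinv = np.linalg.pinv(J1)
    N1 = np.eye(J1.shape[1]) - J1_pinv @ J1       # primary null-space projector
    q_dot1 = J1_pinv @ x1_dot                     # primary task contribution
    q_dot2 = np.linalg.pinv(J2 @ N1) @ (x2_dot - J2 @ q_dot1)
    return q_dot1 + N1 @ q_dot2
```

With compatible tasks both are achieved exactly; with conflicting tasks the primary one wins, which is the behavior the ranking is meant to enforce.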
The organization of the paper is as follows: Section 2 expresses the governing equations of the agents and the underlying graph structure. Following that, Section 8 provides the task prioritization and MPC strategy for docking. The simulation results are shared in Section 9. Finally, conclusions are provided in Section 11.

Notation and Preliminaries
The notation of this work is as follows. s ∈ ℝ, v ∈ ℝ^{n_v}, and M ∈ ℝ^{n_r×n_c} represent an arbitrary scalar, vector, and matrix, respectively. F_I and F_B represent the inertial and body frames, where the expression of a vector in these frames is written as ^i v, i = I, B. Unit vectors in orthonormal frames are denoted as u_1 = [1 0 0]^T, u_2 = [0 1 0]^T, and u_3 = [0 0 1]^T. The composite rotation matrix ^I C_B ∈ SO(3) is written from F_B to F_I; the rotation sequence is z−y−x, and the associated Euler angles are denoted as ψ, θ, and ϕ. g = 9.81 m/s² is the gravitational acceleration. col{·}^M_{m=1}, row{·}^M_{l=1}, and diag{·}^M_{m=1} represent column, row, and diagonal concatenation, respectively, of the entity within the braces.
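For reference, the z−y−x composite rotation ^I C_B can be sketched as follows; this is a minimal NumPy helper following the angle ordering ψ, θ, ϕ defined above.

```python
import numpy as np

def rot_z(psi):
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_x(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def C_IB(psi, theta, phi):
    """Composite rotation from F_B to F_I for the z-y-x Euler sequence."""
    return rot_z(psi) @ rot_y(theta) @ rot_x(phi)
```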

Underlying Graph Structure
The communication among agents in the cooperative system is described by the graph G = (N, E), which consists of the node set N and edge set E [41]. The edge set E ⊂ N × N is given between nodes i ∈ N and j ∈ N such that (j, i) ∈ E denotes that node i receives information from node j. Let n_w be an arbitrary signal dimension; the shared signal sizes among agents are assumed to be identical and equal to n_w. The adjacency matrix is A = [a_ij] ∈ ℝ^{N×N}, where a_ij quantifies the strength of the connection from node j to node i, with a_ij > 0 if (j, i) ∈ E and a_ij = 0 otherwise. N is the number of agents in the cooperative system (CS).
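The adjacency construction above can be sketched as follows; the three-node chain at the end is a hypothetical example, not one of the systems in the paper.

```python
import numpy as np

def adjacency(n_nodes, edges):
    """Build A = [a_ij], where a_ij > 0 iff node i receives from node j.

    `edges` is a set of (j, i) pairs, matching the (j, i) in E convention."""
    A = np.zeros((n_nodes, n_nodes))
    for (j, i) in edges:
        A[i, j] = 1.0  # unit edge weight; weighted graphs would use a_ij > 0
    return A

# Hypothetical 3-node chain: node 1 listens to node 0, node 2 to node 1.
A = adjacency(3, {(0, 1), (1, 2)})
```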

ASLB System Composition
ASLB is a floating bipedal platform in which each leg is composed of a hybrid structure with three degrees of freedom (DoFs). Specifically, starting from the body, the kinematic structure of each leg is a revolute joint (at the shoulder) followed by a parallel 5R mechanism. These kinematics result in three actuated and two passive joint coordinates in each leg. A rendered 3D model and the manufactured prototype of ASLB are provided in Figure 1.
Table 1 provides a summary of the key variables and terms used for the development of the ASLB dynamic model.

ASLB Kinematics
The kinematic model of the floating platform starts with a set of unactuated joints that gives six DoFs to the mobile platform. These joints are collected in a column matrix q_B, with the pose evolving on ℝ³ × SO(3). Starting from the inertial frame, the joints are located in a sequence such that the translational joints q_{t,B} = [q_1, q_2, q_3]^T ∈ ℝ³ come first, in the respective x, y, and z directions. Assuming the Euler sequence YZX, the rotational joints are q_{r,B} = [q_4, q_5, q_6]^T ∈ ℝ³. The body frame F_B is kinematically represented with respect to F_I by q_B = [q^T_{t,B}, q^T_{r,B}]^T such that r_B(q_{t,B}) and C_{IB}(q_{r,B}) are the translation vector and rotation matrix from F_I to F_B, respectively. There are two legs attached to the body, and the respective joints are represented by θ_i ∈ ℝ^{n_i}, where n_i = 5 and i = R, L. In total, the combined DoF for the legs is n_a = 10.
The composition of θ_i for each leg is given as θ_i = [θ_{0,i} θ_{1,i} θ_{2,i} θ_{3,i} θ_{4,i}]^T, where there are three active and two passive joints, respectively θ_{a,i} = [θ_{0,i} θ_{1,i} θ_{3,i}]^T ∈ ℝ³ and θ_{p,i}(θ_{a,i}) = [θ_{2,i}(θ_{a,i}) θ_{4,i}(θ_{a,i})]^T. The passive joints can be written as a function of θ_{a,i} based on a velocity constraint, as described in Section 4.2.1. As a result, the total joint space of ASLB is given by q = [q_B, θ_{a,R}, θ_{a,L}]^T ∈ ℝ^{12}. The kinematic structure of the right leg is illustrated in Figure 2.

Passive-Active Joint Relation
As described in [42], starting from the origin of the body frame F_B, there are two chains to reach point v on both legs. Let R_* represent an elementary rotation matrix, where * = x, y, z is the active axis. Then these two chains can be written as in Equation (2).
Differentiating Equation (2) results in the velocity equations of v_{r,1} and v_{r,2}, which are equal. Writing this relationship relates the passive joints to the active joints, as described in Equation (3). Let [θ_{1,i} θ_{3,i}]^T denote the subset of active joints related to the closed loop; then J_{a,i} ∈ ℝ^{2×2} becomes a square matrix.
Finally, passive joints are related to active joints as given in Equation (4).

Forward Kinematics
Forward kinematics for each leg are used to calculate the position of the contact point c with respect to the origin of F_B, as illustrated in Figure 2. This position vector is denoted as r_c. As mentioned in Section 4.2.1, the active and passive joint angles are related to each other, and unless the passive joints are measured by sensors, they have to be calculated from this relationship. To do so, the position vectors r_{v,1} and r_{v,2} are arranged as shown below.
The first two rows of Equation (5) can be written in minimal form, as provided in Equation (6), such that the terms including θ_{3,R} and θ_{4,R} are isolated.
The elements of Equation (6) are squared and summed to obtain Equation (7). The trigonometric expressions are then written in terms of tangents of the half angles, which leads to a solution for θ_{12,R} = θ_{1,R} + θ_{2,R}, as provided in Equation (8).
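The tangent-half-angle reduction used here can be sketched numerically. For an equation of the form a cos θ + b sin θ = c, substituting T = tan(θ/2) yields a quadratic in T; this is the generic construction, and the specific coefficients of Equations (7) and (8) are not reproduced here.

```python
import numpy as np

def solve_trig(a, b, c):
    """Solve a*cos(th) + b*sin(th) = c for th via T = tan(th/2).

    With cos th = (1 - T^2)/(1 + T^2) and sin th = 2T/(1 + T^2), the
    equation becomes the quadratic (a + c) T^2 - 2 b T + (c - a) = 0."""
    roots = np.roots([a + c, -2.0 * b, c - a])
    return [2.0 * np.arctan(r.real) for r in np.atleast_1d(roots)
            if abs(r.imag) < 1e-9]
```

Each real root of the quadratic corresponds to one admissible branch of the joint angle; joint limits then select the physically valid one.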

Inverse Kinematics
Inverse kinematics are illustrated on the right leg, and the calculations provided here can be duplicated for the left leg. The inverse kinematics solutions in this work rely on the geometric calculation of θ_{0,R}, as provided in Equation (11). The geometric entities are illustrated in Figure 3.
To calculate the active joints θ_{1,R} and θ_{3,R}, the x and y components of the position vector r_pc given in Equation (12) are expanded in Equation (13).
Equation (13) is rewritten as in Equation (14), and each scalar equation can be squared and summed as in Equation (15).
Then, using tangents of the half angle, Equation (15) can be further manipulated into Equation (16). The solution to θ_{34,R} can be calculated by solving the quadratic problem for T_34, as illustrated in Equation (8).
After finding the solution to θ_{34,R}, these values are inserted into Equation (12) to calculate θ_{3,R}. This completes the solution to one chain of the leg mechanism.
Geometric definitions such as r_pcl and r_cl are provided in Figure 4 to calculate the joint variables of the other chain, which contains a_2 and a_3. Using the known joint variables, point v is represented from the origin of joint 3 with a vector denoted as r_pcl. Alternatively, point v can also be represented from the origin of joint 1 with the vector r_cl. These vectors are defined on the (P, R) plane, where the 5R mechanism lies, and are provided in Equations (17) and (18), respectively.
The scalar equations r_{cl,x} and r_{cl,y} of Equation (18) are squared, summed, and reorganized as provided in Equation (19) to calculate β_R, which leads to θ_{2,R}.
To calculate θ_{1,R}, r_{cl,x} and r_{cl,y} are represented as functions of r_pcl. This is given in matrix form in Equation (20), where the terms with θ_{1,R} are the unknown variables. Solving Equation (20) for θ_{1,R} finishes the inverse kinematic solution of the right leg. The procedure for the left leg is identical.

ASLB Dynamics
Based on the generalized coordinates, the multi-body dynamics of ASLB are formulated as in Equation (21), where M(q) ∈ ℝ^{12×12}, C(q, q̇) ∈ ℝ^{12}, and G(q) ∈ ℝ^{12} are the generalized mass, Coriolis and centrifugal, and gravitational terms, respectively. S ∈ ℝ^{12×6}, τ ∈ ℝ^6, F_{C,i} ∈ ℝ³, and J_{C,i} ∈ ℝ^{3×12} are the selection matrix for the actuated joints of the respective legs, the actuated joint torques, the external force on the tip of the i-th leg, and the geometric Jacobian of the tip point of the i-th leg. Let x = [q, q̇]^T ∈ ℝ^{24}, u_τ = τ, and u_c = F_{C,i} ∀i be the states, motor inputs, and external forces acting on the tip points of the legs, respectively; then, the non-linear dynamics of the robot can be written as in Equation (22).
The application of graph-theoretic modeling to ASLB, by separating the legs, which are defined as Agent 2 and Agent 3, from the floating base, introduces simplifications to the overall complexity of the model and optimization. One of the simplifications appears in modeling, as the leg dynamics do not necessarily need to be modeled with respect to F_I using q_B but are better defined with respect to F_B. This local representation of the leg dynamics leads to the following outcomes for the dynamics of the agents:

•
The leg is only used to find adjacency and contact forces on the floating base and ground,

•
The state-space representation of the leg dynamics can be kept at the velocity level.
Note that the contact forces and geometric properties of the contact point must be converted to F_B. For ASLB, the floating base of the robot is denoted as Agent 1, the right leg as Agent 2, and the left leg as Agent 3. Agent 1 is defined with two nodes, node 1 and node 2; Agent 2 is defined with node 3; and Agent 3 is defined with node 4. The locations of these nodes are illustrated in Figure 5; it should be noted that node 1 and node 2 coincide. Without any interconnection constraint, the agents are independent of each other; however, there are rigid joints connecting them. In this case there are two bi-directional edges: (node 3, node 1) ∈ E and (node 2, node 4) ∈ E. Connections between nodes are assumed to be rigid, and the adjacency matrix is then composed of edge weights a_mn = 1. This yields an adjacency matrix as in (23).
Finally, the relationships between cooperative agents are given in (24) by the Laplacian matrix. I_p represents the identity matrix of size p.
Let W ∈ ℝ^p be the signal that is being shared between agents; then, L can be expanded as in (25) to comply with the signal dimension. ⊗ is the Kronecker product operator.
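The construction of the Laplacian and its Kronecker expansion for the four-node ASLB graph can be sketched as follows; the unit edge weights and the signal size p = 6 (a wrench) follow the text, while the node indices are 0-based here.

```python
import numpy as np

# Adjacency for the ASLB graph: node 1 <-> node 3 and node 2 <-> node 4
# in the paper's numbering (0-indexed below), with rigid connections
# giving unit edge weights a_mn = 1, as in (23).
A = np.zeros((4, 4))
for i, j in [(0, 2), (1, 3)]:
    A[i, j] = A[j, i] = 1.0

D = np.diag(A.sum(axis=1))       # degree matrix
L = D - A                        # graph Laplacian, as in (24)

p = 6                            # shared signal dimension (a wrench)
L_bar = np.kron(L, np.eye(p))    # expanded Laplacian L (x) I_p, as in (25)
```

A quick sanity check on any Laplacian is that its rows sum to zero, so consensus signals (identical values at connected nodes) lie in its null space.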
Recalling that node 1 and node 2 are coincident, one can define adjacency constraints using the extended Laplacian definition given in (25); these are provided in (26). Another aspect of splitting a lumped multi-body model into distributed cooperating multi-body models concerns the generalized coordinates. This operation necessarily duplicates the generalized coordinates of the floating base of the lumped model across the distributed models. In addition, Agent 2 and Agent 3 have actuated joints q_{a,i}, i = 2, 3.
Therefore, the generalized coordinates of the agents are defined as q_1 = [q^T_{t,1} q^T_{r,1}]^T ∈ ℝ^6, q_2 = [q^T_{t,2} q^T_{r,2} q^T_{a,2}]^T ∈ ℝ^9, and q_3 = [q^T_{t,3} q^T_{r,3} q^T_{a,3}]^T ∈ ℝ^9, respectively, for Agent 1, Agent 2, and Agent 3. This is illustrated in Figure 5.

Agent Kinematics
The YZX Euler sequence is used in defining the composite rotations C_{Ii}(q_{r,i}) from F_I to F_i, where subscript i is the agent index and F_i is the local frame of the agent. The contact point position and velocity of Agent 2 and Agent 3 are calculated with respect to F_B, as denoted in (27). K^+(q_i) in Equation (27) represents the forward kinematics of leg i, which is explained in Section 4.2.

Agent Dynamics
The dynamics of the agents yield a similar equation to (22); for brevity, a representative Equation of Motion (EoM) is given in this section.
Generalized velocities are assigned to the states of each agent as x_i = q̇_i. External forces in the distributed notation are divided into two: the first, denoted F_{C,i}, acts on the agents as a result of ground contact. The second, denoted F_{A,m}, is exerted on the agents by the adjacent nodes. Adjacent nodes also transmit a moment, M_{A,m}; therefore, it is convenient to collect the forces and moments at adjacent nodes in a wrench, denoted as W_{A,m}. As a result, the non-linear dynamics of each agent are written as given in (28). Let x_i ∈ ℝ^{n_xi}, u_{τ,i} ∈ ℝ^{n_ti}, u_{c,i} ∈ ℝ^{n_fi}, and u_{w,i} = W_{A,i} ∈ ℝ^{n_wi} be the states, torques, contact forces, and adjacency wrench, respectively; then, the non-linear dynamics of each agent take the form ẋ_i = f_i(q_i, u_{τ,i}, u_{c,i}, u_{w,i}). The signal sizes for Agent 1 are n_x1 = 6, n_t1 = 0, n_f1 = 0, and n_w1 = 6, while Agent 2 and Agent 3 have signal sizes of n_xi = 9, n_ti = 3, n_fi = 3, and n_wi = 6 for i = 2, 3.

Cooperative Graph-Theoretic Online Trajectory Generation for ASLB
Graph-theoretic online trajectory generation relies on the cooperative modeling of the robot and is composed of a series of optimizations: the contact phase, swing phase, and force optimizations. The contact optimization finds the optimal finite horizon for the current contact phase and a sequence of contact trajectories that will keep the robot states bounded over the defined phase horizon. For this reason, the resultant contact phase trajectories, except for the one associated with the current contact leg, are not passed to the next optimization. Another cardinal piece of data passed from the contact phase to the next optimization is the initial point of the subsequent contact phase. This information is required to generate a rough swing trajectory for the leg that is in the air. It should be noted that both contact and swing trajectories are calculated to provide an initial trajectory for the force optimization, where the trajectories are refined using the cooperative system framework. The flow of trajectories through these optimizations is illustrated in Figure 6.

Contact Phase Optimization
The contact-phase optimization calculates a set of trajectories using the linear inverted pendulum model (LIPM) and contact constraints. The LIPM dynamics used in this work are widely adopted in the literature. As shown in Section 5.3, the dynamics are written with respect to the body frame F_B. Besides that, the dynamics are also kept at the velocity level. Under these circumstances, the contact conditions for the leg in contact need to be defined accordingly.

Contact Condition
Under contact conditions with the no-slip assumption, r_{c,i} has no relative motion with respect to the ground when observed from the inertial frame F_I. Observed from F_B, this condition reads ṙ_{c,i} = −u_B, as illustrated in Equation (29) for agents i = 2, 3. Recall that the states of agents i = 2, 3 are denoted as x_i = [q̇^T_{t,i} q̇^T_{r,i} q̇^T_{a,i}]^T, and q̇_{t,i}, i = 2, 3, is a duplicate of u_B, assuming that the connections between nodes are as defined in Equation (26).

LIPM Model
The implementation of the LIPM in this work has some nuances compared to the work where it was proposed [43,44], and is illustrated in Figure 7. Let F_xz ∈ ℝ² be the virtual force created on the x−z plane due to the displacement between the origin and the contact locations ^xz r_{c,2} and ^xz r_{c,3}. Note that the origin represents the center of mass (CoM) and is not the origin of F_B. Although the CoM moves with respect to the origin of F_B, in practice it is assumed to be fixed with an offset from F_B. Under these assumptions, the LIPM dynamics are provided in state-space form in Equation (30). The states of this system, denoted as x_c, are defined around the contact point, denoted ∆r_xz, since either of the two contact locations ^xz r_{c,2} and ^xz r_{c,3} can be assigned to it. This system is inherently unstable; therefore, what is being pursued is a set of initial conditions, denoted ^0 x_c, that will propel the CoM of the robot toward the desired velocity vector while a set of state bounds is also satisfied. Equation (30) is discretized using Euler propagation as provided below, where ∆t is the sampling time.
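A minimal sketch of the Euler-discretized planar LIPM follows; the CoM height h, the sampling time, and the sign convention of the virtual force are illustrative assumptions, not the paper's values.

```python
import numpy as np

g, h, dt = 9.81, 0.45, 0.01      # gravity, assumed CoM height, sampling time

# Planar LIPM in state-space form with x_c = [dr_x, dr_z, dr_x_dot, dr_z_dot]:
# the offset dr on the x-z plane is accelerated by the virtual force (g/h) dr.
A = np.block([[np.zeros((2, 2)), np.eye(2)],
              [(g / h) * np.eye(2), np.zeros((2, 2))]])

Ad = np.eye(4) + dt * A          # forward-Euler discretization

def propagate(x0, n_steps):
    """Roll the discrete LIPM forward from initial condition x0."""
    x = x0.copy()
    for _ in range(n_steps):
        x = Ad @ x
    return x
```

Propagating any nonzero offset shows the instability the text refers to: the state diverges exponentially, which is why the optimization searches for initial conditions that steer this divergence toward the desired CoM velocity.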

Contact Phase Optimization
This method relies on finding a set of trajectories that will keep the proceeding steps within bounds; therefore, a phase horizon is defined as N_p ∈ ℕ, which represents the number of phases to be calculated during the optimization, including the current phase. A finite horizon for each phase is defined as N_n ∈ ℕ, which will be minimized for p = 1 and kept at its nominal value for p > 1. The states of the LIPM dynamics in each phase are denoted as ^p x_{c,k}, where p and k are the phase and step indices, respectively. The combined states for each phase are denoted as ^p x_{c,K}, as provided below.
Equation (33) represents the quadratic problem that runs in the contact optimization phase, where L_c(^p x_{c,1}, u_{B,ref}) is the objective function, and C_cont(^p x_{c,1}) and C_bounds(^p x_{c,1}) are the equality and inequality constraints that ensure continuity of the states between phases and keep the states within predefined bounds.

Contact Phase: Continuity Constraint
As explained earlier, the contact optimization seeks to find several contact phase trajectories, and these trajectories should ideally be continuous. This is achieved by an equality constraint defined as in Equation (34). State bounds are assigned to S_p such that S_p = S_i if c_i = 1. Note that we are assuming a single point of contact. Using the bounds, the state boundary constraints are defined as in Equation (35).

Contact Phase: Cost Function
The contact phase optimization aims to reach the x−z projection of the reference velocity provided by the user, u_{B,ref}, which is u_{B,xz}. Recall that ṙ_{c,i} = −u_B; therefore, the cost function is written as in Equation (36).

Swing Phase Optimization
Swing phase optimization calculates a rough trajectory for the swinging leg by connecting the current position of the tip of the swinging leg to the initial point of the proceeding contact phase trajectory with a Bézier curve. Instances of the swing trajectory are denoted as ^p x_{s,k}, and the current and final positions of the swing trajectory are denoted as ^p x_{s,1} and ^p x_{s,N_n}, respectively; these are the projections of the corresponding vectors on the x−z plane, and the initial point of the proceeding contact phase trajectory is ^{p+1} x_{c,1}. Note that the contact phase optimization generates a planar trajectory. Therefore, the implementation of swing phase trajectory optimization requires a modification of these vectors by adding the height of the CoM to the y axis of any projected vector that needs to be passed to the swing trajectory. Figure 8 illustrates the previously mentioned vectors. Dashed lines represent the projection of the swing trajectory, while solid lines show the contact trajectory. n_0, n_f, and L represent the unit vector to the CoM, the unit vector from the CoM, and the straight line on the x−z plane between the initial and final positions of the swing trajectory. Formal definitions for n_0, n_f, and L are provided in Equation (37).
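The swing-trajectory construction can be sketched with a generic Bézier evaluator. The control points below, built from assumed lift-off and touch-down offsets, are hypothetical values loosely playing the role of the n_0/n_f directions, not the paper's optimized ones.

```python
import numpy as np
from math import comb

def bezier(control_pts, n_samples):
    """Sample a Bezier curve of degree m-1 from its m control points."""
    m = len(control_pts)
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = np.zeros((n_samples, len(control_pts[0])))
    for k, P in enumerate(control_pts):
        # Bernstein basis polynomial of degree m-1, term k
        bern = comb(m - 1, k) * ts**k * (1.0 - ts)**(m - 1 - k)
        pts += bern[:, None] * np.asarray(P, float)[None, :]
    return pts

# Hypothetical swing on the x-z plane, points given as [x, z]:
p0 = np.array([0.10, 0.00])    # current tip position  (^p x_{s,1})
pf = np.array([0.30, 0.00])    # next contact point    (^{p+1} x_{c,1})
lift = np.array([0.00, 0.08])  # assumed apex offset lifting the leg
traj = bezier([p0, p0 + lift, pf + lift, pf], n_samples=50)
```

By construction the curve interpolates its first and last control points, so the swing starts at the current tip position and lands exactly on the next contact point.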

Swing Phase Optimization
Swing phase optimization is run only for the current phase; therefore, unlike the contact phase optimization, p = 1. The finite horizon for this phase is N_n at p = 1. Note that at p = 1, N_n is optimized in the contact phase optimization.
The implementation of swing phase optimization requires the calculation of 0 d J b,c ,

Swing Phase: Cost Function
Swing phase optimization aims to pull the swinging leg toward the CoM at the beginning of the swing motion and then push it toward the final position. Along with these, it also tries to approach the straight line L. Therefore, the cost function of the swing phase optimization is written as in Equation (41).

Cooperative Force Optimization
Approximate trajectories are obtained in the contact and swing phase optimizations, and these trajectories are used in the cooperative force optimization as the initial trajectory. To make the method easier to follow, the agent dynamics are rewritten in Equation (43), where M h and C h are partitioned as given in Equation (42). In addition, M i , C i , and G i for i = 2, 3 are assigned to M h , C h , and G h , where h = S represents swing, h = C represents contact, and h = B represents floating base matrices. A minimal representation of the dynamics is provided in Equation (44). The floating base dynamics do not switch; however, Agent 2 and Agent 3 dynamics are assigned to h = C or h = S depending on c i for i = 2, 3. Similarly, wrenches W A,m are assigned to W A,h based on c i such that if c 2 = 1, then W A,C = W A,2 , and if c 3 = 1, then W A,C = W A,3 . Finally, r C is the tip point position r c,i of the leg with c i = 1, and F C is the interaction force between the contact leg and the ground.
Then, the continuous models in Equation (43) are converted into discrete models using Euler discretization, and Equation (45) provides the discrete system model that is used in cooperative force optimization. The current states are denoted as x h,k , where h and k represent the model identifier and the prediction step, respectively.
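The forward-Euler conversion used here can be sketched for a generic linear model ẋ = A x + B u, giving x k+1 = (I + ∆t A) x k + ∆t B u k ; the double-integrator matrices below are a toy stand-in, not the paper's agent dynamics.

```python
def euler_discretize(A, B, dt):
    """Forward-Euler discretization of xdot = A x + B u:
    x_{k+1} = (I + dt * A) x_k + (dt * B) u_k."""
    n = len(A)
    Ad = [[(1.0 if i == j else 0.0) + dt * A[i][j] for j in range(n)]
          for i in range(n)]
    Bd = [[dt * b for b in row] for row in B]
    return Ad, Bd

# Toy double integrator standing in for one agent block:
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
Ad, Bd = euler_discretize(A, B, 0.05)  # dt matches the 0.05 s sampling time
```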
For brevity, Equation (45) is written in a minimal representation as follows.

Cooperative Force Optimization Problem
This method relies on the trajectories initially provided by the contact and swing phases. In this phase, the decision variables are defined as corrections to the nominal trajectories, and a complete trajectory is defined accordingly. Nominal trajectories for the states are denoted as 0 x h,k , and corrections to the nominal trajectories at every instant are denoted as ∆x h,k . Force trajectories are defined in the same fashion, such that 0 F C,k and 0 W A,m are the nominal force trajectories, while ∆F C,k and ∆W A,m are the corrections to the relevant trajectories. States, contact forces, and wrenches for the entire trajectory are combined as in Equation (47). The relationship between x h,K , F C,K , and W h,K can be written for the entire trajectory using combined matrices as provided in Equation (49).
The dynamics for the floating base, contacting, and swinging bodies are written as provided in Equation (50).

The optimization for the three agents is represented with a single objective, Equation (51), and a set of constraints in this work; however, the problem is readily amenable to distributed optimization. Based on the dynamics given in Equation (49) and the representation of the trajectories provided in Equation (47), the matrices for the equality constraints are denoted as A h,dyn and B h,dyn . The equality constraint is provided only for the floating base for brevity.

Cooperative Force Optimization: Contact Constraint
The contact constraint was provided previously in Equation (29). This condition is modified for the definition of the trajectory provided in Equation (47). Put simply, the contact point velocity has to be equal to the body velocity in the opposite direction under the no-slip condition, and Equation (53) projects this relationship onto the decision variables for the quadratic optimization.

Cooperative Force Optimization: Force Cone Constraint
The cooperative force optimization phase is designed as a quadratic problem; therefore, the constraints have to be set accordingly. Contact constraints are dedicated to keeping tangential forces small so that no slipping occurs. To do so, a friction pyramid is created inside a friction cone. The friction cone is a geometric interpretation of the magnitude of the allowable tangential force that can be applied on the ground. The allowable limit is calculated by simply multiplying the normal component of the force applied by the contact point with the friction coefficient. The normal component of the force is denoted as B f c,n , and the tangential components of the applied force are denoted as B f c,t and B f c,s , respectively. It should be noted that the applied force is defined with respect to F B . The friction coefficient is denoted as µ. The radius of the friction cone is defined as r f c = µ B f c,n . The friction pyramid is defined such that it is always upper-bounded by r f c , and this is achieved by setting linear bounds denoted as r f c,s and r f c,t . These bounds are calculated such that |r f c,s | ≈ 0.707r f c and |r f c,t | ≈ 0.707r f c . It should be noted that bounding by r f c directly creates a non-linear relationship and makes the optimization a non-linear problem; however, the bounds defined for the tangential components can be implemented in a linear fashion. The geometric interpretation of the friction cone (red solid line) and pyramid (blue solid line) is provided in Figure 9. Based on these linear and conservative bounds, the following linear constraints are defined for the tangential forces.
The relationship given in Equation (54) is written compactly as provided in Equation (55), where B F C,k is the vector containing the decision variables. B n, B t, and B s are the unit vectors attached to the contact points and defined in F B . As a practical note, calculating these unit vectors with respect to F B is straightforward when there are passive joints at the ankle of the contact legs. Depending on the kinematic structure of the leg, certain unit vectors can be assumed to be in the same direction as the axes of the body frame.
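As a numeric sketch of the conservative bound, the snippet below checks a candidate tangential force against the inscribed pyramid, |B f c,t | ≤ 0.707 r f c and |B f c,s | ≤ 0.707 r f c ; the normal force and friction coefficient are illustrative values.

```python
import math

def pyramid_half_width(f_n, mu):
    """Half-width of the friction pyramid inscribed in the cone r_fc = mu * f_n:
    approximately 0.707 * r_fc (exactly r_fc / sqrt(2))."""
    return mu * f_n / math.sqrt(2.0)

def inside_pyramid(f_n, f_t, f_s, mu):
    """Linear check that both tangential components respect the pyramid bounds."""
    b = pyramid_half_width(f_n, mu)
    return abs(f_t) <= b and abs(f_s) <= b

# Illustrative contact: 10 N normal force, friction coefficient 0.6.
f_n, mu = 10.0, 0.6
# A pyramid corner lies exactly on the cone, so the pyramid is conservative:
corner = pyramid_half_width(f_n, mu)
```

Both conditions are linear in the force components, which is what keeps the cooperative force optimization a quadratic program.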
Within the current work, B n, B t and B s are assumed to be constant and defined as

Cooperative Force Optimization: Cost Function
Force phase optimization tries to keep the commanded motion intact while minimizing the disturbance injected into the system by the joint accelerations. The joint accelerations affect each agent through C coop , which is provided in Equation (26). The constraints introduced previously in this section ensure that the contact and force cone conditions are satisfied. The cost function for this optimization is defined in Equation (56).
where I Q is a selection matrix that selects the states that are the joint velocities of the contact and swing legs and approximates the acceleration of these selected states. The joint velocities within the states x h,k are denoted as q h,k for h = C, S, and the selection of these states is defined in Equation (57).
The approximation for acceleration is given below, where states are assumed to propagate with the first-order Euler method.
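Under this first-order propagation assumption, the acceleration of the selected joint velocities is a forward difference of consecutive samples, as in the sketch below (∆t = 0.05 s as in the simulations; the velocity samples are illustrative).

```python
def accel_approx(qdot, dt=0.05):
    """First-order Euler approximation of accelerations from a velocity
    trajectory: qddot_k ~ (qdot_{k+1} - qdot_k) / dt."""
    return [(qdot[k + 1] - qdot[k]) / dt for k in range(len(qdot) - 1)]

# Three illustrative velocity samples over the prediction horizon:
qddot = accel_approx([0.0, 0.1, 0.3])
```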

Preliminary Results for the ASLB Platform
This section presents a simulation of the proposed method. The simulation is not executed in a physics environment, and the trajectories illustrated in this section are estimated trajectories only. Solutions are obtained on an Intel(R) Core i7-4720HQ CPU @ 2.60 GHz, 16 GB RAM PC with MATLAB 2019b.
The sampling time ∆t for the discrete model is selected to be ∆t = 0.05 s. Nominal values for N n and N p are selected as N n = 20 and N p = 4, respectively. Q s and Q p for contact phase optimization are selected to be Q s = 5 and Q p = 2. Q 0 , Q f , and Q t for swing phase optimization are selected to be Q 0 = 12.5, Q f = 12.5, and Q t = 30. Finally, Q u and Q w for cooperative force optimization are selected to be Q u = 100 and Q w = 5.
Based on these settings, and from the initial conditions of u B (t = 0) = [0 0 0] T and r 2 (t = 0) = [0.01 −0.344 0.063] T , ASLB is asked to move forward by u B,ref = [0.1 0 0] T . The results are provided in the following figures. Figure 10 illustrates the calculated swing and contact trajectories. N n for the first optimization is minimized to N n = 6 from a nominal 20, as the contact initial position for p=1 r c,1 is already provided by the sensor information. For the feasible finite horizon N n = 6, the contact leg, which is the right leg (Agent 2), is used to calculate an approximate trajectory for r c starting from r 2 (t = 0) that diverges from the CoM toward the (+) x and (+) z directions. Note that this trajectory is calculated with respect to F B . The swing phase optimization connects r 3 (t = 0) to p=2 r c,1 without violating the kinematic bounds. Figure 11 illustrates the nominal force trajectory for the contact leg as it interacts with the ground. 0 F C is the initial trajectory that is calculated by substituting q a,i and x i into the agent dynamics, and opt F C is the resultant contact force trajectory. The figure shows that there is no significant change along the y-axis; however, opt F C is shifted in the (+) x direction by approximately 0.4 N. Figure 12 shows the contact and swing trajectories for the second step, where N n is minimized to N n = 16. Similar to the first step, p=1 r c,1 for contact phase optimization is the initial position of the current contact leg, which is Agent 3. r c in Figure 12 converges toward the CoM in the beginning of this phase and is then pushed away toward the (-) x and (-) z directions. A smaller correction occurs in the contact force, as 0 F C and opt F C have only slight differences in all three directions, as seen in Figure 13.
Figure 14 provides the u B trajectory throughout the walking simulation along with r c,i for i = 2, 3. This figure illustrates the characteristic of the method, which calculates the contact positions and forces so as to track a reference velocity u B,ref . The black circle in Figure 14 indicates the CoM, and the decision variables are defined with respect to it. The trajectory of u B , which is given with the blue line, settles into a cyclic pattern. The mean of the magnitude in the x direction is approximately 0.28 m/s, while it is almost zero for the z direction. The trajectory of I r B reveals that the body drifts away from the line z = 0 while achieving forward motion, as desired. The drift is due to the first contact leg, which is Agent 2, and cannot be corrected.
With the proposed method, a velocity command could be tracked with some offset. During the simulation of the algorithm, it was observed that relaxing the state bounds decreases the offsets between the velocity commands and the actual velocity. In addition, since this method is designed for velocity-level dynamics, a drift in the walking does occur, which can be mitigated by an appropriate control scheme. That said, extracting the dynamics from the kinematics and developing a controller for this simpler set of equations allows faster calculation of the future steps.

Cooperative UAV-UGV Docking with Task Prioritization
Motivated by the results from the preceding section, we extend the framework to study systems that are not on the same platform, for example, a cooperative UAV-UGV system similar to that in [39], with the objective of having the UAV (quadcopter) dock on the moving UGV (rover). Note that this example further illustrates that the framework can accommodate very different dynamical systems, unlike the ASLB system, where the left and right leg dynamics were identical.
Table 2 provides a summary of the key variables and terms used for the development of the quadcopter and rover dynamical models.

F B : body-fixed reference frame (quadcopter)
m q : mass of the quadcopter
J q : moment of inertia of the quadcopter
x q : state vector of the quadcopter
I C q (θ q ) = I C q : rotation matrix for the quadcopter
f q : total thrust generated by the motors in the body frame
t q : moments generated on the body
u q : input vector for the quadcopter model
g : acceleration due to gravity
[u 1 u 2 u 3 ] T : unit vectors representing the body frame
m r : mass of the rover
J r : moment of inertia of the rover
x r : state vector for the rover dynamics
I C r : rotation matrix of the rover
u r : inputs to the rover

Quadcopter Dynamics
The six-degree-of-freedom (DoF) rigid body dynamics of the quadcopter with mass m q and moment of inertia J q are represented in (60). The model's states are denoted as x q , where p q = I p q ∈ R 3 , v q = I v q ∈ R 3 , θ q ∈ R 3 , and w q = B w q ∈ R 3 are the inertial position, inertial velocity, Euler angles, and angular velocity, respectively. Given the Euler angles, the rotation matrix for the quadcopter is denoted as I C q (θ q ) = I C q . The inputs of the system are denoted as u q = [ f q t q T ] T ∈ R 4 , where f q is the total thrust generated by the motors in F B and t q ∈ R 3 is the column matrix of moments generated on the body, defined in F B . E q (θ q ) is the mapping between w q and θ̇q such that w q = E q (θ q ) θ̇q for a pre-defined rotation sequence. The quadcopter properties are taken from [45].
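The translational part of (60) can be sketched under simplifying assumptions (thrust along the body z axis, a ZYX Euler sequence, and unit mass); these are illustrative choices, not the parameters of [45].

```python
import math

def quad_accel(theta, f_q, m_q=1.0, g=9.81):
    """Inertial translational acceleration of a quadcopter whose total thrust
    f_q acts along the body z axis. theta = (roll, pitch, yaw); a ZYX Euler
    sequence is assumed and m_q = 1 kg is illustrative."""
    phi, th, psi = theta
    # Third column of the ZYX rotation matrix maps body-z thrust to inertial axes.
    ax = f_q / m_q * (math.cos(psi) * math.sin(th) * math.cos(phi)
                      + math.sin(psi) * math.sin(phi))
    ay = f_q / m_q * (math.sin(psi) * math.sin(th) * math.cos(phi)
                      - math.cos(psi) * math.sin(phi))
    az = f_q / m_q * (math.cos(th) * math.cos(phi)) - g
    return ax, ay, az

# Hover check: level attitude and thrust m_q * g give zero acceleration.
hover = quad_accel((0.0, 0.0, 0.0), 9.81)
```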

Rover Dynamics
The dynamics of the rover with mass m r and moment of inertia J r are derived assuming that it runs on a flat surface and are provided in (61). Based on this assumption, the model states are denoted as x r = [p r,x p r,y Ψ r v r,x w r,z ] T , where p r,x and p r,y represent the inertial position on the x − y plane, Ψ r is the Euler angle about the z axis, B v r,x = v r,x is the horizontal velocity of the rover, and B w r,z = w r,z is the angular velocity of the rover. The composite rotation matrix for the rover is denoted as I C r = I C r (Ψ r ). The inputs to the system are denoted as u r = [ f r,1 f r,2 ] T , which represent the traction forces applied on the surface by the sets of wheels on the right- and left-hand sides of the rover, respectively. m r is taken as 1 kg, and
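A sketch of this nonholonomic model is given below; the state ordering follows x r , while J r and the half track width d are illustrative values, and the yaw sign convention is an assumption.

```python
import math

def rover_rates(x, u, m_r=1.0, J_r=0.1, d=0.2):
    """State derivative for a differential-drive rover on a flat surface.
    x = [p_rx, p_ry, psi, v_x, w_z]; u = [f_1, f_2] are the right/left
    traction forces. J_r = 0.1 kg m^2 and d = 0.2 m are illustrative."""
    p_x, p_y, psi, v_x, w_z = x
    f_1, f_2 = u
    return [
        v_x * math.cos(psi),    # inertial x velocity
        v_x * math.sin(psi),    # inertial y velocity
        w_z,                    # heading rate
        (f_1 + f_2) / m_r,      # forward acceleration from both wheel sets
        (f_1 - f_2) * d / J_r,  # yaw acceleration (sign convention assumed)
    ]

# Equal traction forces drive the rover straight: no yaw acceleration.
rates = rover_rates([0.0, 0.0, 0.0, 1.0, 0.0], [0.5, 0.5])
```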

Cooperative Model-Predictive Control Methodology
This section introduces a unified MPC strategy to maintain a docking approach for the long range and a finer docking maneuver in the short range by accommodating a nonlinear and a linear MPC designed with edge weight information and task prioritization.This section is divided into four subsections, where non-linear MPC (NMPC), linear MPC (LMPC), cooperative task prioritization, and implementation of the control strategy are described in Sections 8.1, 8.2, 8.3, and 8.4, respectively.
The introduced method can be applied to all of the agents, as the formulation only considers local neighbor information; therefore, the formulations will be provided for the agent denoted as i, which is defined by the states x i (t) ∈ R n x , inputs u i (t) ∈ R n u , and outputs y i (t) ∈ R n y . Given the states and inputs, the non-linear dynamics of agent i are given in the following form: ẋi (t) = f i (x i (t), u i (t)). Equation (63) provides the discrete linear representation of the agents' dynamics. This is obtained by linearizing about the current state and input, denoted as (x i,c , u i,c ), and then converting the continuous-time system into a discrete system using Euler discretization (see [46]), where k is the current sample such that ∆x i,k = x i,k − x i,c and ∆u i,k = u i,k − u i,c . In this work, we assume that full state information is shared among the agents. Therefore, the matrices C i and D i are assumed to be I and 0, respectively, with compatible sizes.
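The linearize-then-discretize step behind (63) can be sketched with a finite-difference Jacobian followed by forward Euler; the toy dynamics at the end are illustrative, not an agent model from the paper.

```python
def jacobians(f, x_c, u_c, eps=1e-6):
    """Finite-difference Jacobians A = df/dx, B = df/du about (x_c, u_c)."""
    n, m = len(x_c), len(u_c)
    f0 = f(x_c, u_c)
    A = [[0.0] * n for _ in range(n)]
    B = [[0.0] * m for _ in range(n)]
    for j in range(n):
        xp = list(x_c)
        xp[j] += eps
        fj = f(xp, u_c)
        for i in range(n):
            A[i][j] = (fj[i] - f0[i]) / eps
    for j in range(m):
        up = list(u_c)
        up[j] += eps
        fj = f(x_c, up)
        for i in range(n):
            B[i][j] = (fj[i] - f0[i]) / eps
    return A, B

def euler(A, B, dt):
    """Forward Euler: Ad = I + dt * A, Bd = dt * B."""
    n = len(A)
    Ad = [[(1.0 if i == j else 0.0) + dt * A[i][j] for j in range(n)]
          for i in range(n)]
    Bd = [[dt * b for b in row] for row in B]
    return Ad, Bd

# Toy dynamics: undamped oscillator with a force input (illustrative only).
f = lambda x, u: [x[1], -x[0] + u[0]]
A, B = jacobians(f, [0.0, 0.0], [0.0])
Ad, Bd = euler(A, B, 0.05)
```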

Non-linear MPC Formulation
Equation (64) represents the non-linear problem that runs in the NMPC, where L i (x i,k , u i,k , x i,re f ) is the objective function, and C eq (x i,k , u i,k ) and C ineq (x i,k , u i,k ) are the equality and inequality constraints, respectively. M in (64) denotes the prediction horizon for the NMPC. The objective function implemented in this work has the following quadratic form.
where Q ∈ R n x ×n x and R ∈ R n u ×n u are square matrices.This objective function drives the system to achieve the cooperative task defined by L i,tp (x i,re f , x i,k ), which brings a nuanced cooperativeness with task prioritization to the tracking and is explained in Section 8.3.
The equality and inequality constraints serve the following purposes. C eq (x i,k , u i,k ) is introduced to enforce the initial conditions, which are x i,c = x i (t = 0) and u i,c = u i (t = 0), and C ineq (x i,k , u i,k ) is introduced to enforce that the states and inputs stay within the predefined ranges. These constraints are provided in [45].

Linear MPC Formulation
The LMPC method implemented in this work is the implicit LMPC applied in [26,45], and details can be found therein; the method is summarized here for completeness. The quadratic problem for the LMPC is provided in (66). The quadratic objective function is denoted as L i (x i,K , ∆u i,K , x i,re f ) and given in (67), where the decision variable is ∆u i,K and x i,K is a function of ∆u i,K . Due to the implicit formulation, state predictions are described with respect to the decision variables and combined in a lumped representation. Similarly, the decision variables, which are the inputs, are also collected under a lumped term. These lumped states and inputs are denoted as ∆x i,K and ∆u i,K and provided in (68).
The relationship between ∆x i,K and ∆u i,K is provided below with matrices F i ∈ R Mn x ×n x and H i ∈ R Mn x ×Mn u for the prediction horizon of M.
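The structure of F i and H i can be illustrated on a scalar system, where the block rows reduce to powers of the discrete dynamics; a scalar integrator with a = 1 and b = 0.05 is used below purely for illustration.

```python
def prediction_matrices(a, b, M):
    """Lumped prediction matrices for the scalar system
    dx_{k+1} = a * dx_k + b * du_k, so that dx_K = F * dx_0 + H @ dU,
    with F[i] = a^(i+1) and H[i][j] = a^(i-j) * b for j <= i."""
    F = [a ** (i + 1) for i in range(M)]
    H = [[a ** (i - j) * b if j <= i else 0.0 for j in range(M)]
         for i in range(M)]
    return F, H

# Scalar integrator over a horizon of M = 3:
F, H = prediction_matrices(1.0, 0.05, 3)
```

For the integrator, the last row of H sums all input corrections, matching dx_3 = dx_0 + 0.05 (du_0 + du_1 + du_2); in the LMPC, the same lower-triangular structure appears blockwise with the discrete A i and B i matrices.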
Terminal states are given as a function of x i,k and ∆u i,K below.
C eq (x i,K , u i,K ) and C ineq (x i,K , u i,K ) serve the same purpose as C eq (x i,k , u i,k ) and C ineq (x i,k , u i,k ); however, they are modified to comply with the LMPC notation. The calculations of these constraints are provided in [26].

Cooperative Task Prioritization
Let ϵ i,t ∈ R n ϵt , as provided in (71), represent a cooperative task for agent i based on local neighbor state information received from a neighboring agent, denoted as j. Assume that ϵ i,t is defined for subsets of states denoted as xi ⊂ x i and xj ⊂ x j , where xi ∈ R n s and xj ∈ R n s . It should be noted that xi and xj are assumed to be measured with respect to the same coordinate frame; otherwise, the task ϵ i,t becomes non-linear.
where T is the maximum number of tasks. The first derivative of ϵ i,t is calculated as in (72). Collecting the terms on the right-hand side with M i (a ij ) = [a ij I n s  −a ij I n s ] and Ṡi,t = [ ˙x T j  ˙x T i ] T ∈ R n st , the mapping in (72) between task space (TS) velocities and local state space (LSS) velocities takes the following form. Following that, the inverse mapping from TS to LSS is calculated as below, where M + i is the pseudo-inverse of M i,t and a ij is dropped from the expression for brevity. Note that x j is the state of a neighboring agent.
As illustrated in [47] for robots, excess (redundant) LSS dimensions can be utilized to manage secondary tasks. Let ϵ i,1 and S i,1 denote the first TS and LSS, and let ϵ i,t and S i,t , t = 2, • • • , T, denote the remaining ones; then, the following task management methodology can be used [40], where Φ i,t is the null space projection matrix and is calculated as follows. The formulation defined in (75) is collected in a minimal representation as in (77).
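The null-space prioritization in (75) can be sketched for two scalar tasks over a two-dimensional LSS. The task maps and rates below are illustrative, and the pseudo-inverse is specialized to row vectors; the primary task rate is preserved exactly, while the secondary task acts only in the remaining null space.

```python
def pinv_row(m):
    """Pseudo-inverse of a 1 x n row vector: m^T / (m m^T)."""
    s = sum(v * v for v in m)
    return [v / s for v in m]

def prioritized_rates(M1, ds1, M2, ds2):
    """Two-task prioritization: dx = M1+ ds1 + Phi1 * (M2+ ds2),
    with null-space projector Phi1 = I - M1+ M1."""
    n = len(M1)
    p1, p2 = pinv_row(M1), pinv_row(M2)
    dx = [p1[i] * ds1 for i in range(n)]           # primary task contribution
    phi = [[(1.0 if i == j else 0.0) - p1[i] * M1[j]
            for j in range(n)] for i in range(n)]  # null-space projector
    sec = [p2[i] * ds2 for i in range(n)]          # secondary task contribution
    for i in range(n):
        dx[i] += sum(phi[i][j] * sec[j] for j in range(n))
    return dx

# Primary task: relative rate between two scalar states (map [1, -1]);
# secondary task: move the first state alone (map [1, 0]).
dx = prioritized_rates([1.0, -1.0], 0.4, [1.0, 0.0], 0.2)
```

Here M1 · dx recovers the commanded 0.4 exactly, while the secondary task is only partially realized, which is the behavior the task priority map relies on.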
Finally, based on (78), L i,tp (x i,re f , x i,k ) and L i,tp (x i,re f , x i,K ) are calculated as given in (79) and (83), respectively.
Remark 1. If the states of agents i and j are linearly related, then xi and xj are linearly related; therefore, M i,1 is constant. As a result, L( εi ) outputs a quadratic function of εi,t with scalars γ t such that L( εi ) = ∑ T t=1 γ t ε2 i,t . The relationship given for the derivatives of the state space and task space in (78) is also valid for the state space and task space themselves. Let Q ϵ be partitioned as in (82), and let X i,re f := 1 ⊗ x i,re f . Then, the overall task matrix ϵ i can be written as an affine function of X i,re f , and L i,tp (x i,re f , x i,K ) is calculated as provided in (83).

Implementation of the MPCs
The docking controller is designed with several layers to decide how to perform the docking maneuver based on the distance between agent i and the neighboring agents. The layers, as mentioned earlier, are the selector, NMPC, and LMPC. Essentially, the selector works as a governor and decides the reference trajectory that the LMPC tracks. If the absolute distance between agent i and the neighboring agents is greater than a threshold, the NMPC generates a trajectory toward the neighboring agents using neighboring state information. Note that NMPC does not aim to achieve the terminal state constraint; rather, it repeatedly minimizes the state error within the given prediction horizon. Then, the generated trajectory, which creeps toward the neighbors, is passed to the LMPC. If the distance is smaller than the threshold, the LMPC receives the neighboring state information to generate control inputs without needing the NMPC. The distance and threshold are denoted as e = ∥p j − p i ∥ 2 ∈ R and e d ∈ R. The currently available state and input information are denoted as x i (t − 1), x j (t − 1), and u i (t − 1), where t represents the current sample and (t − 1) represents the previous sample. The pipeline for the control strategy described above is given in Figure 16. When triggered, the NMPC generates a rough trajectory toward agent j for a given prediction horizon M based on the formulation given in (64). Then, a portion of this trajectory, determined by the control horizon M u , is extracted; this is denoted as x nmpc . The optimization parameters for the NMPC, such as M, t f , and M u , are selected empirically to achieve a rough trajectory and allow a time interval to queue a new trajectory to guide agent i to the vicinity of the other agents. The selection of the optimization parameters with the right-hand-side sparsity template described in [46] provides a longer time to queue a new trajectory. Before the LMPC receives the output of the NMPC or the neighboring state information, i.e., x nmpc or x j (t − 1), respectively, a line or a set of line segments is populated based on the prediction horizon of the LMPC. Finally, the tracking based on the optimization provided in (66) is performed by the LMPC.
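The selector layer described above can be sketched as a simple distance test; the threshold value and the placeholder references below are illustrative.

```python
def select_reference(p_i, p_j, x_nmpc, x_j_prev, e_d=0.5):
    """Selector layer: at long range (e > e_d), the LMPC tracks the NMPC
    trajectory x_nmpc; in proximity, it tracks the neighbor state directly.
    e_d = 0.5 m is an illustrative threshold."""
    e = sum((a - b) ** 2 for a, b in zip(p_i, p_j)) ** 0.5
    return x_nmpc if e > e_d else x_j_prev

# Far apart -> NMPC-generated trajectory; close -> neighbor state x_j(t-1).
far = select_reference((0.0, 0.0, 1.0), (5.0, 0.0, 0.0), "x_nmpc", "x_j_prev")
near = select_reference((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), "x_nmpc", "x_j_prev")
```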

Cooperative MPC Simulation and Results
This section presents the simulation results for the proposed strategy based on the simulation setup of a quadcopter and a rover, as illustrated in Figure 17 (see [39]).
The NMPC and LMPC parameters are provided in Figure 18, and the quadcopter properties are given in Table 1 of [45]. Two scenarios are illustrated: (1) proximity docking on a moving agent and (2) long-range docking on a moving agent, where the LMPC and NMPC-LMPC strategies are tested, respectively. Let P s = {1, 2, 3, 4} represent the priority of states, where 1 is the highest priority and 4 is the lowest priority, and s = p, v, e, w represents the subset of states that P is representing. On this basis, the LMPC uses the priority map {P p , P v , P e , P w } = {2, 3, 1, 4}. Priority mapping is used to arrange the tasks defined as ϵ i,t , where t ∈ {1, 2, 3, 4}. This mapping is valid for both the LMPC and the NMPC.
The initial inputs for the quadcopter are u i,c = [9.81m q 0 0 0] T to temper the behavior of f q . Given the initial values, the quadcopter is guided to the rover with a −0.1 m offset in the z direction, as illustrated in Figure 19. As shown in Figure 19, the quadcopter makes a free fall for about 3/4 of a second and then recovers. During this phase, it loses altitude and gains speed in the w component up to 6 m/s. The duration of the free fall is correlated with R in the cost function and the inequality constraints provided to the LMPC, which is C ineq (x i,K , u i,K ). After the free fall, at approximately t = 1 s, the controller makes a correction in e and starts approaching the rover in the x- and y-directions. As mentioned above, the free fall can also be seen in Figure 20 in the f q subplot, where f q stays at 0 N for that period. During the maneuver, f q is upper-bounded by 12 N. The maximums for t q are reached in the x-direction, where −0.52 and 0.05 Nm of torque are observed at the extremes.
Figure 20. Inputs f q and t q over the duration of the docking maneuver.
Figure 21 provides an overview of the performed trajectory along with the heading direction of the quadcopter, which is aligned with the u 1 axis of F B .
The initial inputs for the quadcopter are u i,c = [9.81m q 0 0 0] T . This case illustrates the long-range capability of the proposed control strategy. An overview of the scenario is provided as a set of snapshots in Figure 17, and the video of the performed scenario can be found at https://www.youtube.com/watch?v=QJ-NuwZQ8zw (accessed on 12 May 2024).
In this scenario, the LMPC tracks the x nmpc that the NMPC generates until e is less than e d . At about 41 s, a transition happens, and the LMPC starts tracking x j (t − 1). Figure 22 shows the states of the quadcopter. An important feature of this motion is that the signal oscillates at the velocity level in both the linear and angular motion. The same oscillatory behavior is visible for the inputs f q and t q , as provided in Figure 23, because the quadcopter's current states change as x nmpc is generated on the NMPC side, sent to the LMPC, and populated as line segments. Figures 22 and 24 reveal the difference between the proximity and long-range docking maneuvers, to which the LMPC is more sensitive, as can be seen from the magnitude of the tracking error e.
Figure 23. Inputs f q and t q over the duration of the docking maneuver.
The positional errors between the rover and the quadcopter in Figure 24 illustrate that the errors in the x− and y−directions approach zero, while the error in the z−direction approaches −0.1 m. Figure 25 illustrates the calculation time of each NMPC trigger. Initially, the generation costs approximately 0.1 s, and as the quadcopter settles on the trajectory, the generation times decrease to 0.05 s. For the given elapsed time profile, the overall calculation time is 13.91 s. The upper bound line illustrates the control horizon M u for the NMPC (see [39]).

Discussion
The central premise of this paper is to show that a complex dynamical system and/or a control system can be decomposed into parts. These parts can be constrained to develop the governing equations to study the dynamics or to synthesize a control strategy for the composite system, using the interconnection between the sub-systems as constraints. Note that while these are two separate applications, they can be studied using the same framework. In the biped dynamics, the legs (left and right) and the torso are considered as sub-systems, which are coupled via constraints and thus modeled using the cooperative system framework. The trajectory for the biped robot is computed using a sequence of optimizations. Given a desired amount of forward movement of the center of mass, one begins with the contact phase optimization, the results of which feed into the swing phase optimization, which provides the necessary smooth trajectories for moving to the new location. This is finally passed on to the cooperative force optimization, which computes the necessary generalized forces to accomplish the movement.
While there are promising early results from this approach, further study is warranted to investigate the sensitivity of the optimization approaches to parameters such as the number of phases, the number of sample instants, the sampling period, the contact force models (friction cone), the asymmetry in the legs, sensor noise, and other unmodeled terms in the dynamics.
Similarly, in the case of the cooperative docking scenario, the quadcopter and the ground rover are two separate sub-systems of the overall system and are constrained to work together in the cooperative setup. In this context, the framework also allows one to choose different control schemes and combine them for the overall cooperative mission. For example, a non-linear MPC was chosen to bring the quadcopter toward the ground rover because we wish for a faster approach to the goal; however, close to the goal, a linear MPC controller is activated, since near the desired state the linear terms typically dominate higher-order polynomial terms and/or harmonic functions.
In the cooperative task decomposition for the UAV-UGV problem, one would have to further study the impact of communication delays and environmental disturbances such as unstructured terrain models to fully understand the efficacy of the proposed cooperative control formulation.

Summary and Conclusions
This paper presents a cooperative modeling framework that is shown to reduce the complexity of deriving the governing dynamical equations of complex systems composed of multiple bodies.The approach also allows for an optimization-based trajectory generation for the complex system.
The results show that such cooperative modeling enables online trajectory generation through a series of optimizations that generate approximate trajectories and pass them to a final optimization that refines the motion with the cooperative system framework.
The paper also proposes a unified MPC control strategy capable of handling long-range to proximity docking maneuvers. A rough but fast NMPC method is run to propagate the agent to a closer vicinity, and then the LMPC handles the more sensitive proximity motion. The sensitivity of the LMPC is apparent in Figures 22 and 24, where a smooth approach suddenly becomes relatively violent. Based on the simulations, the selection of the matrices Q, R, and Q s makes the transition from NMPC-LMPC to LMPC smoother. Note that the docking of the quadcopter is almost tangential to the x − y plane, and during the final section of the maneuver, it penetrates the docking platform. The latter problem can be resolved by adding an approach constraint to the MPC controllers. Most importantly, the proposed method is readily applicable across dynamic systems, including non-linear systems, since the controller's structure stays the same. Using local neighbor information instead of external sensors or observers naturally makes this control strategy decentralized.

Figure 5 .
Figure 5. Cooperative interconnection between agents and generalized coordinates at every agent.

Figure 6 .
Figure 6.Cooperative interconnection between agents and generalized coordinates at every agent.

Figure 7 .
Figure 7. LIPM dynamics projected on the x − z plane.

5.1.5. Contact Phase: Constraint for State Bounds
State bounds are defined based on the leg in contact; therefore, the state bounds switch between the bounds of Agent 2 and Agent 3. Let c 2 ∈ {0, 1} and c 3 ∈ {0, 1} be the contact indicators of Agent 2 and Agent 3, respectively. Under contact, c i = 1; otherwise, c i = 0 for i = 2, 3 at phase p. The bounds for Agent 2 and Agent 3 are defined as S i = [S T i,UB , S T i,LB ] T
Let b x,j , b y,j , and b z,j be the coefficients of the Bézier curve, where j = 1, • • • , n b . c b and v b , which are sorted collections of b x,j , b y,j , and b z,j , are defined. This classification collects the coefficients related to the initial and final positions of the curve under c b and the coefficients being optimized under v b . Specifically, c b is defined as c b = [ p=1 x T s,1  p=1 x T s,N n ] T . Based on the previously described notation, Bézier curves for the swing trajectory are defined as in Equation (39), where J b,c and J b,v are matrix-valued functions of k, which can be populated for k = 1, • • • , N n and map c b and v b to p=1 x s,k ∈ R 3 . Similarly, c b and v b are mapped to p=1 ẋs,k ∈ R 3 using d J b,c and d J b,v . Equation (38) represents the quadratic problem that runs in the swing optimization phase, where L s (v b ) is the objective function and S bounds (v b ) is the set of inequality constraints to keep states within predefined bounds.

Figure 9 .
Figure 9. Geometric interpretation of the friction pyramid.

Figure 10 .
Figure 10.Calculated swing and contact leg trajectories for first step.

Figure 11 .
Figure 11.Calculated force trajectory for the first step.

Figure 12. Calculated swing and contact leg trajectories for the second step.

Figure 13. Calculated force trajectory for the second step.

Figure 14. u_B trajectory plot along with leg motion with respect to the CoM.

Figure 15 provides a more intuitive illustration of the motion of the robot, as ^I r_B is calculated from u_B. ^I r_B gives the position of the origin of F_B with respect to F_I. The trajectory of ^I r_B reveals that the body drifts away from the line z = 0 while achieving forward motion, as desired. The drift is due to the first contact leg, which is Agent 2, and cannot be corrected. With the proposed method, a velocity command can be tracked with some offset. During the simulation of the algorithm, it was observed that relaxing the state bounds decreases the offsets between the velocity commands and the actual velocity. In addition, since this method is designed for velocity-level dynamics, a drift in the walking does occur, which can be mitigated by an appropriate control scheme. That said, extracting the dynamics from the kinematics and developing a controller for this simpler set of equations allows faster calculation of the future steps.

Figure 15. ^I r_B plot along with leg motion with respect to F_I.

Figure 19. State trajectory of the quadcopter agent.

Figure 21. Trajectory and heading of the quadcopter.

Figure 22. State trajectory of the quadcopter agent.

Figure 24. Position error between the rover and the quadcopter.

Figure 25. Calculation time for the NMPC problem at every instant.

Table 1. Summary of key variables used for the description of the ASLB kinematics and dynamics.
^p x_{c,K}: contact state x_c at instant K in phase p, using Equation (39) at k = 1 and k = N_n.
5.2.2. Swing Phase: Constraint for State Bounds

State bounds are defined based on the leg in the swing; therefore, the state bounds switch between the bounds of Agent 2 and Agent 3. The bounds for Agent 2 and Agent 3 are defined as S_i = [S_{i,UB}^T, S_{i,LB}^T]^T and assigned to S_p such that S_p = S_i if c_i = 0. Using these bounds, the state boundary constraints are defined as in Equation (40).

Table 2. Summary of key variables used for the description of the quadcopter and rover kinematics and dynamics.