Robust Collision-Free Guidance and Control for Underactuated Multirotor Aerial Vehicles

: This paper is concerned with the robust collision-free guidance and control of underactuated multirotor aerial vehicles in the presence of moving obstacles capable of accelerating, linear velocity and rotor thrust constraints, and matched model uncertainties and disturbances. We address this problem by using a hierarchical ﬂight control architecture composed of a supervisory outer-loop guidance module and an inner-loop stabilizing control one. The inner loop is designed using a typical hierarchical control scheme that nests the attitude control loop inside the position one. The effectiveness of this scheme relies on proper time-scale separation (TSS) between the closed-loop (faster) rotational and (slower) translational dynamics, which is not straightforward to enforce in practice. However, by combining an integral sliding mode attitude control law, which guarantees instantaneous tracking of the attitude commands, with a smooth and robust position control one, we enforce, by construction, the satisfaction of the TSS, thus avoiding the loss of robustness and use of a dull trial-and-error tweak of gains. On the other hand, the outer-loop guidance is built upon the continuous-control-obstacles method, which is incremented to respect the velocity and actuator constraints and avoid multiple moving obstacles that can accelerate. The overall method is evaluated using a numerical Monte Carlo simulation and is shown to be effective in providing satisfactory tracking performance, collision-free guidance, and the satisfaction of linear velocity and actuator constraints.


Introduction
Multirotor aerial vehicles (MAVs) have been extensively researched in recent decades and are expected to be widely used for a variety of new, challenging applications, such as delivery [1] and air taxis [2].These applications demand a high level of safety and are usually carried out in disturbed environments where the MAV may fly among obstacles such as buildings, other manned or unmanned aircraft, and birds.To enable dependable operation, the MAV flight control system must provide outstanding performance, robustness, and collision avoidance.Aiming at contributing to this topic, this paper focuses on underactuated MAVs equipped with fixed rotors (which includes the conventional and widespread quadrotor).We start by adopting a typical hierarchical guidance and control architecture [3], such as the one shown in Figure 1.This architecture is composed of an outer-loop guidance system whose goal is to conduct the vehicle to the desired state while avoiding collisions and respecting operational and physical constraints.It also contains an inner control loop providing stability and robustness.
To address the robustness requirement of MAV flight control systems against model uncertainties and disturbances, the sliding mode control [4,5] (SMC) strategy has been widely adopted to design the position and attitude control laws [6][7][8][9][10][11].This technique provides a high-frequency switching control signal to drive and keep a system output on a sliding manifold, where the system ideally becomes insensitive to bounded model uncertainties and disturbances of the matched type.In particular, there are also SMC design techniques based on sliding manifolds suitably constructed to eliminate the reaching phase, i.e., to enable a sliding motion (with the insensitivity property) from the very beginning [12][13][14][15][16].These design techniques can be found in the literature under two main alternative designations: the integral sliding mode control (ISMC) [14] and sometimes the global sliding mode control (GSMC) [13,16].To deal with the underactuated dynamics of the considered class of MAVs, most of the literature relies on the assumption of time-scale separation (TSS) between the (slower) position and (faster) attitude closed-loop dynamics [6][7][8][9][17][18][19][20][21], which allows us to nest the attitude control loop inside the position one.As a consequence, the position and attitude control laws can be designed separately, but their gains have to be suitably tuned to achieve a sufficient TSS.In particular, the use of SMC in underactuated control systems is still a challenge [22].In fact, under this nested control architecture, by using a highfrequency switching signal in the outer-loop position control law, it is not possible to properly guarantee the required TSS since such a signal is too fast to be considered slow in the design of the inner-loop attitude control.Nevertheless, there is no restriction on the use of such a policy in the attitude control.In this context, Besnard et al. [6] designed flight controllers using a continuous SMC driven by a sliding mode disturbance observer (SMDO).A procedure to tune the controllers' gains in order to respect the TSS between the control loops was presented.Muñoz Palacios et al. [7] adopted a modified super-twisting SMC driven by a high-order sliding mode observer.Silva and Santos [8] and Labbadi and Cherkaoui [9] designed nonsingular fast terminal sliding mode flight control laws, but, to avoid the TSS violation, the position control loop was smoothed using sigmoid functions at the cost of robustness.It is worth noting that all the above-cited methods demand a trial-and-error procedure for the tuning of the flight control laws to respect the TSS assumption and avoid eventual instability.
The guidance of MAVs among obstacles has been addressed in the literature using different methods, such as nonlinear model predictive control (NMPC) [23,24], samplingbased search [25], and velocity obstacles (VO) [26][27][28].In particular, Kamel et al. [23] and Pereira et al. [24] employed an NMPC to guide an MAV through static obstacles, addressing the collision avoidance problem by using a penalty in the cost function.Furthermore, Bouzid et al. [25] guided an MAV among static obstacles by combining an optimal rapidly exploring random tree (RRT) method with a genetic algorithm to compute the shortest path to a given target.On the other hand, Bareiss and Van Den Berg [27] proposed the linear-quadratic regulator obstacles (LQR obstacles) to address collision avoidance for mobile robots described by linear dynamics, rather than by single integrators as in the original VO [26].The method was evaluated using quadcopters whose nonlinear dynamics were supposed to be known and linearized around the hover state.Other methods, such as the acceleration velocity obstacles (AVO) [29] and continuous control obstacles (CCO) [30], were also developed as generalizations of the VO to deal with more complicated known linear dynamics.Among the references [23][24][25]27], only [27] addresses the guidance problem for MAVs in the presence of moving obstacles.However, they use a simplified dynamic model and, like the VO-based methods [26,29,30], predict the obstacles' future positions assuming that they keep a constant velocity, which is generally not satisfied in practice, but is addressed using a quick replanning loop with no guarantee that collision can be avoided when the obstacles accelerate.This paper tries to fill the aforementioned gaps in the current state of the art by addressing the guidance and control for underactuated MAVs subject to matched model uncertainties and disturbances, as well as velocity and rotor thrust constraints, in the presence of obstacles that can accelerate.To tackle this problem, we start by adopting the hierarchical architecture shown in Figure 1 [3].To design the position and attitude control laws, we propose a new hierarchical SMC scheme that, differently from [6][7][8][9][10], enforces the TSS without losing robustness and using a dull trial-and-error tweak of gains.Under this scheme, the attitude control law is designed using an ISMC strategy, in such a way as to guarantee exact tracking capability in the attitude control loop during the entire time, under the condition that the attitude command is sufficiently smooth.Then, this smoothness requirement is fulfilled by designing the position control law using a proportional-derivative (PD) approach combined with an SMDO to endow the position control loop with robustness.By doing so, the TSS is instantaneously enforced since the inner loop is made infinitely fast.On the other hand, we propose a nonlinear and robust guidance strategy based on the CCO method [30].The guidance considers the use of smooth reference filters and, rather than relying on overly simplified nominal dynamics, it is able to deal with the uncertain nonlinear system dynamics.These system dynamics, as seen by the guidance algorithm, encompass the reference filters and the translational and rotational control loops, including the uncertainties and disturbances.Then, using the fact that the TSS is enforced by the control strategy, we tighten the admissible sets according to the bounds of the MAV tracking errors as well as the bounds of the disturbance force and torque.By doing so, different from that of Bareiss and Van Den Berg [27], the proposed strategy is able to guarantee collision avoidance and the satisfaction of velocity and rotor thrust constraints for underactuated MAVs with uncertain dynamics.Furthermore, to reliably address the problem of collision avoidance in the presence of obstacles that can accelerate, we adopt our previous method [28] that uses a high-order sliding mode differentiator (SMD) [31,32] to robustly estimate the obstacles' maximum accelerations.Then, using these estimates, the obstacles' future positions are not predicted using first-order models, as generally done in the VO-based methods [26,27,29,30], but considering the obstacles' maximum accelerations.To summarize, the main contributions of this paper are the proposal of • a hierarchical SMC scheme to design the position and attitude control laws for underactuated MAVs that, by construction, enforces the TSS assumption and • robust guidance, based on the CCO and designed using a constraint-tightening approach, for underactuated MAVs with uncertain dynamics in the presence of velocity and rotor thrust constraints and multiple moving obstacles capable of accelerating.
The remaining text is structured as follows.Section 1.1 presents the notation.Section 2 defines the problem.Sections 3 and 4 present, respectively, the control and guidance methods.Section 5 evaluates the proposed method using numerical simulations.Lastly, Section 6 concludes the paper.

Notation
The sets of real numbers, positive real numbers, and non-negative real numbers are denoted by R, R >0 , and R ≥0 , respectively.In the same manner, the sets of integer numbers, positive integer numbers, and non-negative integer numbers are denoted by Z, Z >0 , and Z ≥0 , respectively.Uppercase and lowercase boldface letters are used, respectively, to denote matrices and algebraic vectors, while geometric (Euclidean) vectors are denoted as in a.The symbols I n and 0 n×m denote, respectively, the n × n identity matrix and the n × m zero matrix.Moreover, the vectors of ones and zeros of dimension n are denoted, respectively, by 1 n and 0 n .A Cartesian coordinate system (CCS) is represented as S b {B; x b , y b , z b }, with B denoting its origin and x b , y b , and z b representing the unit geometric vectors along its orthogonal axes.The algebraic vectors corresponding to the projection of an arbitrary physical vector a ∈ R 3 onto S b and S r are denoted, respectively, by a b ∈ R 3 and a r ∈ R The inverse of D b/r is equal to its transpose, being denoted by D r/b .Let a b/r represent an arbitrary quantity of S b relative to S r , e.g., throughout this paper, r b/r will denote the position of S b with respect to S r .Consider two arbitrary algebraic vectors x = (x 1 , . . ., x n ) and y = (y 1 , . . ., y n ).The vector inequality x < y means that x i < y i , ∀i ∈ {1, . . ., n}.The kth-order time derivative of x is denoted by x (k) .Furthermore, we define the vector signum function as sign(x) (sign(x 1 ), . . ., sign(x n )), where sign(x i ) , where A i,j is the element of the ith row and jth column of A. The standard basis vectors of R 3 are denoted by e 1 (1, 0, 0), e 2 (0, 1, 0), and e 3 (0, 0, 1).A closed sphere of radius ρ ∈ R >0 centered at p ∈ R 3 , the Minkowski sum of two sets, and the set subtraction are denoted, respectively, by

Problem Definition
This section defines the main problem of the paper.First, the MAV translational and rotational dynamics are derived in Section 2.1.Second, the control and guidance problems are stated in Section 2.2.

MAV Dynamic Modeling
Consider an inertial reference CCS S r {R; x r , y r , z r } fixed on the ground at a known point R, with z r oriented upwards, parallel to the local vertical.Moreover, consider a body-fixed CCS S b {B; x b , y b , z b } located at a point B fixed to the MAV center of mass, as shown in Figure 2, along with an illustration of a general underactuated MAV equipped with n r ≥ 4 fixed rotors parallel to z b .
The attitude kinematics of S b with respect to S r can be described in SO(3) by [33] Ḋb where D b/r ∈ SO(3) and ω b/r b ∈ R 3 are, respectively, the attitude matrix and the angular velocity of the MAV.Using the Euler equation [34], the MAV rotational dynamics can be described by where h b ∈ R 3 is the angular momentum of the MAV relative to point B, τ c b ∈ R 3 is the control torque, and τ d b ∈ R 3 is the disturbance torque, which can include state-dependent terms related to parametric and model uncertainties [11]; both torques are with respect to B. Assuming a rigid airframe and negligible rotor inertias, the angular momentum h b can be written as where J b ∈ R 3×3 is the MAV inertia tensor.By substituting (3) into (2), the rotational dynamics can be finally described by The translational kinematics of S b relative to S r , on the other hand, are represented by where r b/r r ∈ R 3 and v b/r r ∈ R 3 are, respectively, the position and linear velocity of the MAV.Using Newton's second law, the translational dynamics of the MAV can be described in S r by vb/r where g ∈ R >0 is the gravity acceleration magnitude, m ∈ R >0 is the MAV mass, f c r ∈ R 3 is the control force, and f d r ∈ R 3 is the disturbance force, which can include state-dependent terms related to parametric and model uncertainties.
The set of Equations ( 1) and ( 4)-( 6) can be used to represent the six-degrees-of-freedom (DOF) dynamics of any fixed-rotor rigid MAV with negligible rotor inertia, regardless of its rotor configuration.For the considered class of underactuated MAVs, the control force is given by where n r D r/b e 3 is the S r representation of z b and f f c r is the resultant thrust magnitude.
Denote by f i ∈ R the thrust of the ith rotor and define the vector f r ( f 1 , f 2 , . . ., f n r ).The control allocation equation relating f and τ c b with f r is [35] f where Γ ∈ R 4×n r is the control allocation matrix, which depends on the rotors' coefficients and arrangement.For the considered class of underactuated MAVs, Γ is always of full-row rank, i.e., rank(Γ) = 4, thus allowing the rotors to produce, within certain limits, a three-dimensional torque and a resultant thrust in the z b direction.
The resulting system (1), ( 4)-( 8) is said to be underactuated since it has only four control inputs, τ c b and f , to control a six-DOF motion.Furthermore, from ( 6) and ( 7), it can be seen that the translational dynamics depend on the MAV attitude.For this reason, we can say that the rotational and translational dynamics are cascaded.

Problem Statement
Consider that the MAV is subject to velocity and rotor thrust constraints and flies among moving obstacles.These circumstances give rise to the following constraints: where P (t) ⊆ R 3 denotes the collision-free space, which changes as the obstacles move and is generally non-convex; V ⊆ R 3 is an admissible convex set; and F is assumed to be , respectively, the given lower and upper bounds for the ith rotor thrust.
Let us define the heading angle ψ ∈ [−π, π) as the third rotation in the 1-2-3 Euler angle representation of the MAV attitude.
Problem 1 is a nonlinear guidance and control problem of an underactuated MAV subject to disturbances and constraints, operating in a dynamic environment with multiple moving obstacles that can accelerate.To tackle this problem, we adopt the guidance and control architecture shown in Figure 1.The adopted strategy is to design the control module to provide robustness with respect to disturbances and uncertainties, and the guidance module to satisfy the position, velocity, and rotors' thrust constraints while conducting the vehicle to reach the desired position and heading.The control and guidance methods are, respectively, detailed in Sections 3 and 4. It is worth mentioning that the objective of Problem 1 can be immediately generalized for a sequence of desired positions and headings [3].

Control Design
This section presents the design of the position and attitude control laws to address Problem 1. Specifically, Section 3.1 introduces the adopted hierarchical flight control architecture.Section 3.2 formulates the attitude control law.Section 3.3 formulates the position control law.Lastly, Section 3.4 presents the control allocation.
Before proceeding to the next subsection, let us denoted the command of a given variable by using an overbar symbol, e.g., rb/r r and ψ denote the position and heading commands, respectively.
Assumption 1.The rotor dynamics are instantaneous and their static parameters are exactly known.
Assumption 1 is common in practice since the rotor dynamics are much faster than the attitude and position ones, thus allowing us to suppose, in the design of the controllers, that the thrust commands are instantaneously achieved, i.e., f r ≡ f r , ∀t ≥ 0. Consequently, from (8), one can say that f ≡ f and τ c b ≡ τ c b .By assuming that the rotational dynamics are faster than the translational ones, the TSS allows consideration in the design of the position controller that n r = nr .Then, by also considering Assumption 1, Equation (7) becomes where nr ≡ Dr/b e 3 , which in theory removes the underactuation of the system since the dependency of f c r on the actual attitude is removed.Then, the attitude and position control laws can be separately designed by considering the resulting fully actuated system described by Equations ( 1), ( 4)-( 6), and (12).The critical point is that, when using this strategy, the controllers' gains have to be carefully tuned in order to respect the TSS assumption, whereas, in practice, this tuning is a cumbersome trial-and-error process, thus requiring improved safety procedures to deal with the eventual instability that may occur in the case of insufficient TSS [36].
To avoid this drawback, we propose a hierarchical sliding mode control scheme for underactuated MAVs that enforces the attitude-position TSS.To this end, we design a multi-input attitude control law using an ISMC strategy, in such a way as to guarantee exact tracking capability in the attitude control loop during all the time.Such exact tracking makes n r = nr without the need for tuning to reach the TSS.Therefore, the position control law can be designed using the fully actuated system described by ( 5), (6), and (12).

Integral Sliding Mode Attitude Control Law
Consider the objective of tracking a time-varying attitude command Db/r that satisfies the following assumption.

Assumption 2. The time-varying attitude command
Db/r is such that, at the initial time, Db/r (0) = D b/r (0) and ωb/r b (0) = ω b/r b (0).Moreover, it is (h − 1)th-order differentiable with respect to time, where h ≥ 2, such that its first-time derivative Ḋb/r is Lipschitz-continuous. Assumption 2 is not restrictive; it only requires the knowledge of the MAV's initial attitude and angular velocity and the use of a suitably smooth attitude command.These conditions can be fulfilled by properly designing the heading command ψ and the position control law (see Figure 3).
The attitude and angular velocity control errors can be defined, respectively, as [33] D ω where The attitude and angular velocity errors ( 13) and ( 14) allow the description of the attitude error kinematics by a conventional attitude kinematics differential equation, such as (1), i.e., Besides the attitude matrix D, a three-dimensional attitude representation is required for the proposed control design.Here, we adopt the Gibbs vector [33] g ε tan( θ/2), where ε ∈ S 2 {x ∈ R 3 | x = 1} and θ ∈ R are, respectively, the principal Euler axis and angle corresponding to D. Note that the Gibbs vector is singular at the angles θ = (2i + 1)π rad, ∀i ∈ Z.However, since g represents the attitude error (not the full attitude), an effective control design will keep θ << π and singularities will not be reached in practice [8].The direct and inverse relations between D and g are, respectively, given by where Dij is the element of the ith row and jth column of D, and tr(•) is the trace operator.
The attitude error kinematics (15) can alternatively be described using the Gibbs vector as [33] ġ = 1 2 On the other hand, by replacing ( 13) and ( 14) into (4) and considering Assumption 1, the attitude error dynamics can be described by By defining x 1 g, x 2 ω, u τc b , d τ d b , the error kinematics (16) and dynamics (17) can be inserted into the state-space model where B J −1 b , Now, let us define the attitude sliding variable where C ∈ R 3×3 .The corresponding sliding set is From Assumption 2, one can see that the system is in the sliding set S at the initial time instant.Therefore, by designing a control u such that the inequality V ≤ −βV 1/2 [37], with β ∈ R >0 , is satisfied from the very beginning, it holds that (x 1 , ẋ1 ) ∈ S during the entire time, and, consequently, (x 1 (t), ẋ1 (t)) = (0 3 , 0 3 ), ∀t ≥ 0. Therefore, from (20), it holds that x 2 (t) = 0 3 , ∀t ≥ 0. In other words, the system is capable of exactly tracking the attitude and angular velocity commands during the entire time, i.e., D b/r (t) = Db/r (t) and (21) and using ( 18) and ( 19), we can obtain the dynamic equation for s (for conciseness, we omit the function-independent variables): Regarding the disturbance d, consider the following assumption.
Assumption 3. The disturbance d is bounded according to |d| ≤ τ max , where τ max ∈ R 3 is a known vector with positive components.
The boundedness in Assumption 3 is reasonable in practice, but one can rarely obtain a non-conservative estimate of the bound τ max without a switching-gain adaptation scheme [38].
Lemma 1 gives a control law that ensures that the system ( 18) and ( 19) has an integral sliding mode in S.
Lemma 1.Under Assumptions 2 and 3, the following control law guarantees the integral sliding mode of the system ( 18) and ( 19) in S: where B, and K 1 ∈ R 3×3 is a positive-definite diagonal matrix satisfying Proof.Consider the Lyapunov candidate function V(s) = s T s/2.From Assumption 2, the system ( 18) and ( 19) starts the motion in the sliding set S. Therefore, to prove the integral sliding mode, it is sufficient to show the satisfaction of the inequality V ≤ −βV 1/2 .To this end, differentiating V(s) with respect to time, using ( 22) and ( 23), and choosing K 1 according to (24), one can show that Thus, we complete the proof.

Attitude Command Generator
The attitude command generator (ACG) converts nr and ψ into Db/r .Since the heading angle ψ is defined as the third rotation in the 1-2-3 Euler angle representation of D b/r , it is appropriate to parameterize Db/r also using the Euler angles ( φ, θ, ψ) in the 1-2-3 sequence, i.e., Db/r where c * and s * are short notations for cos( * ) and sin( * ), respectively.From the definition of nr given after Equation ( 12), it can be seen that its transpose is equal to the third line of the attitude command Db/r .Then, the commands φ and θ in (25) can be calculated from nr , respectively, as φ = −tan −1 e T 2 nr e T 3 nr Regarding the smoothness of Db/r in Assumption 2, consider the following remark.
Remark 1.For the attitude command Db/r to be in fact (h − 1)th-order differentiable with respect to time, such that its first-time derivative is Lipschitz-continuous, as supposed by Assumption 2, it can be seen from ( 25)-( 27) that nr and ψ must have the same smoothness degree.In turn, the required smoothness of nr and ψ can be achieved by properly designing the guidance method and the position control law.

Position Control Law
Using Assumption 1 and the fact that the attitude loop has an integral sliding mode, which ideally imply that D b/r (t) ≡ Db/r (t) and n r (t) ≡ nr (t), ∀t ≥ 0, the position control law can be designed using the fully actuated model ( 5) and ( 6) with f c r given by (12) instead of (7).
Therefore, consider the objective of tracking a time-varying position command rb/r r (t).To this end, define the position and linear velocity errors, respectively, by Then, the translational error kinematics and dynamics are obtained, respectively, by substituting ( 28) into (5), and ( 12) and ( 29) into (6), yielding ṙ = ṽ, ( Consider the following assumption regarding the disturbance force f d r . Assumption 4. The disturbance force is bounded according to |f d r | ≤ f max , with f max ∈ R 3 being a known vector with positive components.Moreover, f d r is (h − 1)th-order differentiable with respect to time, being h the smoothness parameter defined in Assumption 2, such that f d(1) r is Lipschitz and has a known Lipschitz constant vector γ p > 0 3 .Assumption 4 is not very restrictive.It essentially states that the disturbance is limited and has a Lipschitz-continuous first and (h − 1)th derivative with respect to time, which is reasonable in practice.In fact, model uncertainties and external aerodynamic disturbances usually fulfill such an assumption [39].
In order to satisfy the smoothness requirements for Db/r as discussed in Remark 1, while ensuring robustness with respect to the disturbance force, we design the position control law by combining a proportional-derivative (PD) policy with an SMDO.This control law is given by where K 2 ∈ R 3×3 and K 3 ∈ R 3×3 are positive-definite diagonal matrices, and f d r ∈ R 3 is the disturbance force estimate to be provided by the SMDO.
Then, since and z d i → σ (i) , ∀i ∈ {1, . . ., h}, the estimate of the disturbance force and its time derivatives are simply given by The proposed SMDO is represented by Equations ( 33)- (36) and exhibits low sensitivity to parameter variations, making it easy to tune.The tuning trade-off is that the larger the gains Λ d j , the faster the convergence and the higher the sensitivity to measurement noise, time discretization, and unmodeled dynamics [31].
Substituting the control law ( 32) into ( 31), the closed-loop position dynamics can be described by Given the disturbance force bound of Assumption 4, it holds that the disturbance estimation error is bounded during the SMDO ( 33)-( 36) convergence phase and vanishes after a finite time.Then, the origin (r, ṽ) = (0 3 , 0 3 ) of the system described by ( 30) and ( 37) can be made asymptotically stable by suitably choosing the gains K 2 and K 3 .Now, to satisfy the smoothness requirement for nr as discussed in Remark 1, f c r must have the same smoothness degree.Regarding the smoothness of f c r , consider the following remark.

Remark 2. For the control force command
f c r to be (h − 1)th-order differentiable with respect to time and have a Lipschitz-continuous first-time derivative, it can be seen from (32) that, since f d r already satisfies this requirement (see (33)-( 36)), the acceleration command v b/r r must have the same smoothness degree.Such a command will be properly generated by the guidance strategy presented in Section 4.
The adaptation of K 3 must be quick enough to provide good performance to the position control loop.To guarantee the position control smoothness (see Remark 1), the adaptive gain K 3 must be (h − 1)th-order differentiable with respect to time, such that its first-time derivative is Lipschitz.To achieve this, we increase K 3 (t) during a prescribed time t s ∈ R >0 according to where K3 ∈ R 3×3 is a positive-definite diagonal matrix corresponding to the final desired value of K 3 , and I [0,t s ) (t) is an indicator function that is equal to one if t ∈ [0, t s ) and equal to zero otherwise.

Control Allocation
The control allocation calculates f r from f and τc b as a solution to the control allocation Equation [35]: Assuming that the rotor set is configured in such a way that Γ is always of full-row rank, i.e., rank(Γ) = 4, the solution of ( 44) is unique when n r = 4 and is simply calculated by inverting Γ.For n r > 4, a simple solution can also be obtained, but now using the pseudo-inverse method [40].

Guidance Design
Consider that the MAV flies among n o ∈ Z >0 obstacles and define a set I containing the identification of the obstacles.Moreover, consider a point C i fixed to the center of mass of obstacle i ∈ I and denote its position and velocity by r i/r r ∈ R 3 and v i/r r ∈ R 3 , respectively.To guarantee the smoothness requirement of ψ and rb/r r as discussed in Remarks 1 and 2, the proposed guidance algorithm generates these commands using reference filters, as depicted in Figure 4.The heading reference filter is designed using an overdamped LPF to make ψ converge smoothly to a target heading ψ * ∈ (−π, π] provided by the CCO method.Similarly, the position reference filter consists of an overdamped LPF to make v b/r r smoothly converge to a target velocity v * r ∈ R 3 , also given by the CCO, and an integrator for calculating r b/r r .We assume that the MAV is aware of the obstacles' position and velocity O r i/r r , v i/r r , ∀i ∈ I .Then, to avoid collision with accelerated obstacles, we use a high-order SMD [31] to provide an estimate of the acceleration of obstacle i, denoted by z i 1 ∈ R 3 , from observations of v i/r r .In summary, the CCO receives the desired position ř b/r r and the desired heading ψ as input and the MAV states r b/r r , v b/r r , D b/r , and ω b/r b , as well as z i 1 , ∀i ∈ I, and O, as feedback.Using this information, it chooses v * r and ψ * aiming at avoiding collisions with the obstacles and respecting the linear velocity constraint (10) and the rotor thrust constraint (11).The heading reference filter consists of a hth-order overdamped LPF to guarantee the necessary smoothness of ψ and ensure that it has no overshoot with respect to the input ψ * .This reference filter can be represented by the state-space model where The initial heading and heading rate commands are set equal to the respective MAV initial conditions, while the remaining filter states are set equal to zero, i.e., y ψ (0) (ψ(0), ψ(0), 0, . . . , 0). (46) It is worth noting that the target heading ψ * (t) given by the CCO may be discontinuous [16].Then, designing the heading reference filter using an LPF of order h guarantees that ψ is minimally (h − 1)-times differentiable with respect to time and has a Lipschitz-continuous first-time derivative.As a result, the smoothness requirement for ψ can be satisfied.
Similarly, to guarantee the necessary smoothness of v b/r r , the position reference filter is designed using an overdamped LPF of order h + 1 and an integrator, being represented by the state-space model ẏp where The initial position and velocity commands are set equal to the corresponding MAV initial conditions, while the remaining states of the filter are set equal to zero, i.e., Analogous to ψ * , the target velocity v * r (t) calculated by the CCO may be discontinuous [16].Then, by using an LPF of order h + 1, we guarantee that the acceleration command v b/r r is minimally (h − 1)-times differentiable with respect to time and has a Lipschitz-continuous first-time derivative.As a result, the smoothness requirement of v b/r r is satisfied.
From the choice of y p (0), we have that r(0) = 0 3 and ṽ(0) = 0 3 .Therefore, by properly designing the position control law (32), r and ṽ can be limited, respectively, as where r ∈ R ≥0 and v ∈ R ≥0 can be chosen as time-dependent functions.It is worth noting that r and v cannot be analytically calculated, but they can be approximated from numerical simulations of the position closed-loop system ( 30) and ( 37) using a great amount of values of the disturbance force inside the bounds provided in Assumption 4.
Assume that obstacle i and the MAV can be contained, respectively, in the closed spheres B(C i , ρ i ) and B(B, ρ b ), where ρ i ∈ R >0 and ρ b ∈ R >0 are of a given radius.Therefore, a collision between the MAV and obstacle i is assumed to take place if and only if where Using the position error bound (49) and the position error definition (28), condition (51) can be tightened to obtain whose satisfaction implies that condition (51) is true.
To prevent collisions, the adopted strategy is to avoid satisfying (52) in a future time horizon of finite length τ ∈ R >0 .To this end, we must predict r b/r r (t) and r i/r r (t) inside the time interval [t 0 , t 0 + τ], where t 0 ∈ R ≥0 is the current time.By using the length τ, we only consider the obstacles that can cause imminent collisions, i.e., collisions that can occur inside [t 0 , t 0 + τ].This can be effective to reduce the computational burden when there are many obstacles.However, the parameter τ must be chosen based on the physical bounds of the obstacles' trajectories and the MAV dynamics in such a way that the MAV can avoid a collision when it is detected as imminent.
From (47), one can see that the future values of r b/r r are unknown once they depend on the future values of v * r , which are calculated in real time by the CCO and dependent on the unknown future behavior of the obstacles.In this context, we predict r b/r r on the time interval [t 0 , t 0 + τ] by considering v * r as a constant.Therefore, to support the forthcoming derivations, Lemma 3 provides a unique solution to (47) with a given initial condition y p (t 0 ), while considering v * r as a constant.
Lemma 3. Considering v * r as a constant input, the solution of (47) in t ∈ [t 0 , ∞) with initial condition y p (t 0 ) is given by Proof.See Appendix A.
Considering v * r as a constant input, the predicted values of the MAV position command r b/r r (t), which we denote by r b/r r (t) ∈ R 3 , can be obtained from (53) as where The earlier VO-based approaches generally assume that the obstacles keep a constant velocity.In practice, however, this assumption only ensures collision avoidance against obstacles with low acceleration capacity or under a large τ.Here, instead of predicting the obstacles' trajectories using a first-order approximation, we consider a set of possible future trajectories for the obstacles based on estimates of their acceleration bounds calculated from z i 1 provided by the SMD.In this context, the predicted position r i/r r ∈ R 3 of obstacle i inside the time interval [t 0 , t 0 + τ] belongs to where âmax i ∈ R ≥0 is a bound estimate of obstacle i acceleration.
Note that, before the differentiator (56)'s convergence time, denoted by t i c ∈ R >0 , there is no accurate information about the acceleration bound of obstacle i.In this context, we choose to calculate âmax According to (57), before t i c , âmax i is converging according to the SMD (56), and after t i c , by maximizing z i 1 over the time interval [t i c , t], âmax i becomes an accurate estimate of obstacle i's acceleration bound inside the considered time interval.However, since t i c is difficult to calculate without conservativeness, a very intuitive choice to approximate it is by verifying when the SMD states enter a small neighborhood of the sliding manifold, i.e., when where α ∈ R >0 is satisfied.Using r b/r r (t) and r i/r r (t) given, respectively, by ( 54) and (55), the collision condi- tion (52) can be further tightened to obtain where c i (δt) −C 1 e Aδt y p (t 0 ) + r i/r r (t 0 ) + δtv i/r r (t 0 ).In turn, we can also notice that the collision condition (59) is always satisfied if where ρ + bi (δt) 2 .From the definition of G 1 (δt), one can see that it is nonsingular ∀δt > 0.Then, by multiplying both sides of (60) by G −1 1 (δt) , it can be shown that ( 60) is satisfied if Now, using (61), define a set of target velocities v * r that may result in a collision with obstacle i within (t 0 , t 0 + τ] as Therefore, the set of target velocities v * r that may result in a collision with any obstacle within (t 0 , t 0 + τ] is given by In other words, we can guarantee collision avoidance against accelerated obstacles by merely continuously choosing v * r / ∈ CCO τ b , thus satisfying the position constraint (9).To consider the linear velocity constraint (10), we can rewrite it in terms of the target velocity v * r .To this end, note that, given the choice of y p (0) in (48) and the design of the position reference filter (47) using the adopted overdamped LPF, v b/r r presents no overshoot relative to the input v * r .Consequently, by choosing v * r ∈ V, we guarantee that v b/r r ∈ V since V is a convex set.Then, from the velocity tracking error definition (29) and bound (50), one can see that by choosing it holds that v b/r r ∈ V, thus respecting the velocity constraint (10).Then, the position and velocity constraints ( 9) and ( 10) can be respected by continuously choosing It should be noted that the set V R is generally non-convex.
On the other hand, to account for the rotor thrust constraint (11), it can be seen from the control allocation Equation ( 8) and Assumption 1 that (11) results in the following control command constraints: where Knowing that f = f c r and the attitude control loop has an integral sliding mode, i.e., D b/r ≡ D b/r and ω b/r b ≡ ω b/r b , by substituting the control torque command (23) and the control force command (32) into (64), we obtain where To respect the rotor thrust constraint (11), our strategy is to guarantee the satisfaction of (65) in the time horizon τ.To this end, we have to predict the left side of (65).However, note that we cannot predict the terms K 2 r, K 3 ṽ, and f d r .In this sense, using the tracking error bounds (49) and (50), Assumption 4, and the definition of sign(s), condition (65) can be tightened to obtain where A prediction for v b/r r inside the time interval [t 0 , t 0 + τ] can be directly obtained from (53).On the other hand, a prediction for ωb/r b cannot be exactly obtained.To support this statement, note that for the considered class of underactuated MAVs, the angular velocity command is given by ωb/r b = e 3 × Db/r ṅr + ψe 3 .
Then, note that D b/r depends on nr (see ( 25)-( 27)), which is a unit vector that has the same direction and orientation of the control force command f c r (see (12)).However, it can be seen from ( 32) that f c r cannot be exactly predicted since the future values of the terms K 2 r, K 3 ṽ, and f d r are unknown.As a result, we cannot obtain an exact prediction for ωb/r b inside [t 0 , t 0 + τ].
On the basis of the above discussion, nr can be rewritten in terms of a nominal part nn r ∈ R 3 and an unknown part ∆ nr ∈ R 3 , i.e., where nn where ωn b e 3 × Db/r,n ṅn r + ψe 3 , being Db/r,n calculated in the same way as Db/r in (25) but considering nn r instead of nr , and ∆ ωb ∈ R 3 is an unknown part.Substituting (69), condition (66) can be rewritten as where where Regarding the set ∆ W, consider the following remark.
Remark 3.An analytic expression for ∆ ωb is difficult to obtain since the relation between D b/r and nr is strongly nonlinear (see (25)-( 27)).Consequently, it is complex to obtain an analytical expression for the set ∆ W. Therefore, we approximate it based on computer simulations.Now, we are able to predict the left side of (71) inside the time interval [t 0 , t 0 + τ].In this sense, let us rewrite (71) in terms of the predicted values for each variable, i.e., The acceleration command prediction â b/r r can be obtained from (53) using the same strategy adopted to predict the position command r b/r r (t), i.e., considering v * r as a constant input inside the time interval [t 0 , t 0 + τ], yielding where On the other hand, note that the angular velocity prediction ωn b can be calculated by differentiating Db/r,n with respect to time.Here, we choose to predict Db/r,n (t) inside the interval [t 0 , t 0 + τ] using the Euler angles 1-2-3 parameterization, denoted by α b/r (t) ( φ(t), θ(t), ψ(t)).The prediction of the heading command ψ is obtained by adopting the same strategy used to predict r b/r r (t) and v b/r r (t), i.e., considering ψ * as a constant input.Under this assumption, ψ(t) can be calculated from the actual time instant t 0 by solving Equation (45), yielding where The predictions φ and θ inside [t 0 , t 0 + τ] can be obtained from Equations ( 26 where The angular acceleration prediction ˙ω n b is simply obtained by differentiating (75) with respect to time.
The continuous selection of v * r and ψ * such that condition (72) is satisfied in [t 0 , t 0 + τ] guarantees the satisfaction of the control command constraints (64) and, consequently, the rotor thrust constraint (11).
Lastly, to completely solve Problem 1, we formulate a bilevel optimization problem [41] that calculates v * r and ψ * , giving priority to collision avoidance.In this sense, we compute v * r and ψ * as the solution of the following minimization that aims to satisfy the linear velocity and rotor thrust constraints, respectively, given by ( 10) and (11), and make the MAV reach its desired position řb/r r and desired heading ψ without collision: where v pref ∈ R 3 is a preferred velocity and Z is the set of solutions to the ν-parameterized problem min .
Here, v pref is a vector that points to ř b/r r and has a magnitude equal to the maximum admissible velocity in this direction.To provide a smooth deceleration phase, the magnitude of v pref is gradually decreased according to the remaining distance when the vehicle is near ř b/r r .However, we highlight that the design of v pref can be done using other strategies [26,42].
When obstacles are present, the set V R is generally non-convex and can even become empty in very dense scenarios.In this case, it is appropriate to choose a target velocity and heading that result in the largest time to collide, while respecting constraints (10) and (11).The shortest time for a collision to occur, denoted by t col ∈ R >0 , as a result of choosing a certain target velocity ν is given by Therefore, in case there is no solution to (76), i.e., V R = ∅, we propose to choose the target velocity and heading that imply the largest t col , while respecting constraints ( 10) and (11), i.e., (v * r , ψ * ) = arg max In general, finding the global minima of a non-convex optimization problem, such as the one in (76), is computationally intensive.Attempting to reduce the computational burden, we approximate the solution of (76) using a fixed number of velocity samples calculated from a uniform distribution over V B(0 3 , v ) and creating an equally spaced fixed number of heading angle samples inside the interval (−π, π].When (76) has no solution, we use the same sampling strategy to solve the optimization problem (77).Algorithm  Algorithm 1 generates n v random velocity samples inside V B(0 3 , v ) and n ψ equally spaced samples inside (−π, π], sorting them in ascending cost order.Then, it sequentially checks that each velocity sample does not result in a collision, i.e., if it does not belong to CCO τ b .When a velocity sample is collision-free, the algorithm tries to choose a heading sample that, given this collision-free velocity, respects the rotors' thrust constraints.If such heading exists, the algorithm returns the corresponding pair of velocity and heading samples.On the other hand, when a velocity sample is not collision-free, the algorithm tries to choose a heading sample that respects the rotors' thrust constraints and, if such heading exists, it calculates the time to collision for this particular pair of samples.Then, in the event that any velocity sample is free of collision, the algorithm returns the pair of samples that results in the minimum time to collision and respects the velocity and rotors' thrust constraints.For this case, the main for loop in Algorithm 1 iterates n v times and, within each iteration, the nested else condition is executed.On the other hand, when the first velocity sample in V * o is collision-free and there is a heading sample that, given this first velocity sample, respects the rotors' thrust constraints, the main for loop is executed only one time.

Numerical Simulation
The effectiveness of the proposed method is evaluated in a numerical simulation implemented on the basis of the nonlinear equations of motion ( 1) and ( 4)-( 8) and coded in MATLAB utilizing the first-order explicit Euler method with a sampling time of 0.01 s.The vehicle adopted for the simulation is an x-shaped quadcopter with a mass of 1 kg, arm length of 0.5 m, and inertia matrix J b = diag(0.015,0.015, 0.03) kg m 2 .
The proposed method is compared with another one that uses the same control strategy and the original CCO [30] as the guidance strategy.To compare the methods, we conduct a Monte Carlo simulation where the quadcopter has an initial position (0, 0, 5) m, zero initial velocity, and has to go to the desired position (38, 0, 5) m, while avoiding collision with 270 quadcopters that are used to represent the obstacles, as depicted in Figure 5  The vehicle can be circumscribed by a sphere of radius 0.5 m, while all obstacles can be circumscribed by spheres of radius 0.7 m.The controlled quadcopter has the following linear velocity and rotor thrust admissible sets: Moreover, the vehicle is subject to the disturbances Table 1 shows the adopted parameters for the proposed guidance and control strategies.
Table 1.Parameters of the proposed method.

Control Law Variable Value
Attitude control law Moreover, we have used in Algorithm 1 a total of 800 target velocity samples calculated using a uniform distribution over V B(0 3 , v ) and 50 equally spaced target heading samples inside the interval (−π, π]. The MAV receives the desired heading signal (in degrees) In order to conduct a more comprehensive analysis of the proposed method, we present in Figures 6 and 7 the relevant results from a single iteration of the Monte Carlo simulation.A three-dimensional animation of this simulation is available at the following link: https://youtu.be/d7K4e6Ytn8M(accessed on 23 September 2023).Figure 6a-c, respectively, show the distance between the quadcopter and the obstacles, the Euclidean norm of the quadcopter's velocity and its bound, and the rotors' thrust and their bounds.It can be seen that the proposed method guarantees the satisfaction of the position constraint (9), the linear velocity constraint (10), and the rotors' thrust constraint (11), which demonstrates the effectiveness and robustness of the proposed method.
Figure 7 presents the position and attitude tracking performance as well as the force and torque control commands.In Figure 7a, it can be seen that the position commands are satisfactorily tracked despite the presence of the disturbance force.Figure 7b shows that the attitude commands are exactly tracked during the entire time despite the presence of the disturbance torque, thus confirming that by combining the proposed sliding mode attitude and smooth position control laws, an integral sliding mode exists in the attitude loop.This fact is also confirmed by Figure 8, which shows that the Euclidean norm of the attitude sliding variable s is restricted to a relatively small neighborhood of the origin during the entire time.Figure 7c and Figure 7d, respectively, show the force and torque control commands.These plots confirm the smoothness and the switching behavior of the force and torque control commands, respectively.

Monte Carlo Simulation
We have performed, for each method, 100 iterations of the above-described Monte Carlo simulation.For both methods, we have analyzed at each iteration whether collisions have occurred and their number.For the proposed method, we have also analyzed whether the linear velocity and rotors' thrust constraints are satisfied.where n col ∈ Z ≥0 is the total number of collisions that have occurred in the 100 iterations.
For the proposed method, Table 2 also shows the percentage of iterations in which the linear velocity and rotor thrust constraints are respected.The results in Table 2 confirm the effectiveness of the proposed method in respecting the linear velocity rotors' thrust constraints.Moreover, one can see that, by observing the obstacles' acceleration and considering the vehicle position tracking error bound r , the proposed method is more effective than the original CCO in preventing collisions.We emphasize that this is remarkable since the original CCO, by not considering the rotors' thrust constraints, has more control force and torque available to avoid collisions.To support this statement, we also ran 100 iterations for the proposed method, disregarding the rotors' thrust constraints.In this simulation, the proposed method had a CR = 0%, demonstrating that the previous CR of 25% was only related to the inability to avoid collisions due to the control force and torque limitations imposed by the rotors' thrust constraints.

Discussion
The results of the numerical simulations demonstrate the efficiency of the proposed method in guiding underactuated MAVs subject to model uncertainties/disturbances as well as velocity and rotors' thrust constraints in the presence of obstacles that can accelerate.These results have practical importance given the recent growing interest in autonomous flight and the number of aircraft sharing the same air space.On the other hand, the performed simulation study, although comprehensive, did not consider sensor noise and inaccuracies.These uncertainties present in real-world scenarios could be immediately considered in the proposed strategy by suitably increasing the MAV tracking error bounds, the obstacles' radii, and the guidance time horizon τ.Moreover, it is worth mentioning that the proposed formulation considers that the MAV and the obstacles can be enclosed by spheres.While this is not a significant limitation, as more complex shapes can be approximated using a collection of spheres [43], there are situations where employing this approach might lead to an increased level of complexity to represent the obstacles.

Conclusions
This paper proposes robust guidance and control methods for underactuated MAVs equipped with fixed rotors subject to model uncertainties/disturbances and linear velocity and rotor thrust constraints, in the presence of accelerated obstacles.The attitude and position control laws are designed using a hierarchical SMC scheme that enforces the attitude-position TSS.The former is designed using a multi-input ISMC strategy and the latter is designed using a proportional-derivative policy combined with an SMDO.On the other hand, the proposed guidance strategy is based on the CCO and designed using a constraint-tightening approach for robustness against uncertainties/disturbances and a high-order SMD to robustly estimate the maximum accelerations of the obstacles.A Monte Carlo numerical simulation was performed using a quadcopter to compare the proposed method with another one that uses the same attitude and position control design and the original CCO in the guidance level.The proposed guidance method outperformed the original CCO in avoiding collisions with moving obstacles that can accelerate and, additionally, has been shown to be very effective in respecting linear velocity and rotor thrust constraints.In future works, the proposed method can be evaluated experimentally and applied to other mobile robotics systems.Moreover, Algorithm 1 can be further improved in performance, aiming at its implementation in embedded systems.

Figure 1 .
Figure 1.A typical hierarchical guidance and control architecture for MAVs.
3 .The relation between a r and a b is a b = D b/r a r , where D b/r ∈ SO(3) is the attitude matrix of S b relative to S r and SO(3) {D ∈ R 3×3 | D T D = I 3 } denotes the special orthogonal group.

Finally, consider the S r representations a r and b r (b 1 , b 2 , b 3 )
of a and b, respectively.The vector product c = b × a is represented in S r by c r = [b r ×]a r , where [b r ×] is the following skew-symmetric matrix:

Figure 2 .
Figure 2. The adopted CCSs and a general underactuated MAV equipped with n r fixed rotors parallel to z b .
has designed the position and attitude controllers relying on a TSS between the (slower) position and (faster) attitude closed-loop dynamics, which allows us to nest the attitude control loop inside the position one, as shown in the block diagram of Figure 3.In this architecture, the position controller receives rb/r r as the command input, r b/r r and v b/r r as feedback, and produces f c r as the output.The attitude command generator (ACG) converts nr and ψ into the three-dimensional attitude command Db/r .The attitude controller receives Db/r as the command input, D b/r and ω b/r b as feedback, and produces τ c b as output.Lastly, the control allocation converts f and τ c b into individual thrust commands f r f1 , . . ., fn r .

Figure 3 .
Figure 3. Hierarchical control architecture for underactuated MAVs equipped with fixed rotors.ACG stands for attitude command generator.
Db/r D b/r , ωb/r b ω b/r b , and b refers to a CCS S b representing the commanded attitude for S b .

The relation between the 1 - 2 - 3
Euler angles α b/r (φ, θ, ψ) that parameterize D b/r and the angular velocity ω b/r b is α

Figure 4 .
Figure 4. Block diagram of the proposed guidance strategy.

where â b/r r ∈
R 3 and ωn b ∈ R 3 are, respectively, the predictions of v b/r r and ωn b inside [t 0 , t 0 + τ].
) and (27) and the definition of nn r given after (68), being given by φ = − tan −1 e prediction defined in Equation (73).Then, the angular velocity prediction ωn b can be immediately calculated by ωn b = A α ˙α b/r , . The obstacles are arranged between the MAV's initial and desired positions in thirty evenly spaced groups of nine.The kth group, where k ∈ {1, . . ., 30}, contains a static obstacle that has the randomly uniformly selected position offsets ∆y b,k ∈ [−6, 6] m and ∆z b,k ∈ [−2, 7] m relative to the line connecting the MAV's initial and desired positions in the directions of y r and z r , respectively.The remaining obstacles are circularly and uniformly distributed in the plane y r -z r and rotate around the static obstacle with a randomly uniformly selected radius ρ b,k ∈ [2, 5] and angular velocity ω k ∈ ([−0.4,−0.1] ∪ [0.1, 0.4]) x r .

Figure 5 .
Figure 5. Schematic illustration of the proposed Monte Carlo simulation scenario.(a) Overview of the scenario showing the MAV, its target position, and the groups of obstacles.(b) The kth group of obstacles in detail, where k ∈ {1, . . ., 30}.

Figure 6 .
Figure 6.Constraint plots for the proposed method.(a) Distance from the quadcopter to the obstacles.(b) Euclidean norm of the quadcopter velocity and its bound.(c) Rotors' thrusts and their bounds.

Figure 7 .
Figure 7. Position tracking performance, attitude tracking performance, and control commands for the proposed method.(a) Position tracking performance.(b) Attitude tracking performance.(c) Control force command.(d) Control torque command.Legend: vector first component, vector second component, and vector third component.

Figure 8 .
Figure 8. Euclidean norm of the attitude sliding variable s.
n}.The Euclidean norm and component-wise absolute value of x are denoted, respectively, by x x 2 1 + • • • + x 2 n and |x| (|x 1 |, . . ., |x n |).On the other hand, the component-wise absolute value of a generic matrix A 1 is proposed to solve (76) or (77).
Table2shows the Monte Carlo simulation results.It presents, for both methods, the collision rate (CR) ∈ Z ≥0 is the number of iterations where collisions have occurred, and it also shows the average number of collisions per iteration (ACI)

Table 2 .
Monte Carlo simulation results for the proposed and original CCO methods.N/A: not applicable.