A Backstepping Approach to Nonlinear Model Predictive Horizon for Optimal Trajectory Planning

: This paper presents a novel trajectory planning approach for nonlinear dynamical systems; a multi-rotor drone, built on an optimization-based framework proposed by the authors named the Nonlinear Model Predictive Horizon. In the present work, this method is integrated with a Backstepping Control technique. The goal is to remove the non-convexity of the optimization problem in order to provide real-time computation of reference trajectories for the vehicle which respects its dynamics while avoiding sensed static and dynamic obstacles in the environment. Our method is applied to two models of multi-rotor drones to demonstrate its ﬂexibility. Several simulation and hardware ﬂight experiments are presented to validate the proposed design and demonstrate its performance improvement over earlier work.


Introduction
Planning collision-free trajectories for autonomous unmanned vehicles operating within unknown, dynamic, 3D, geometrically complex, and GPS-denied environments is a challenging and exciting research problem for both academia and industry. For agile drone systems, generating efficient trajectories in real-time requires using trajectory planning methods which respect the vehicle's dynamics and input constraints as part of the prediction process. Researchers are now studying trajectory planning approaches which can take these considerations into account, for instance receding horizon-based methods [1][2][3].
Trajectory planning provides a parameterized path from a starting configuration to a terminal setpoint while avoiding obstacles. It is sometimes called motion planning and mistakenly referred to as path planning, since trajectory planning is a superset of path planning by generating reference kinematics over the entire path instead of geometric paths only [4]. Planning algorithms have received much attention from robotics researchers, where most of the published algorithms fall under one of the following categories: searchbased, sampling-based, artificial potential field, artificial intelligence and optimizationbased methods.
Search-based methods, a.k.a. grid-based methods, use search algorithms to find the best possible collision-free path in a discretized graph of grids. Some of the widely used graph search algorithms include Dijkstra algorithm [5], A* [6], Lifelong Planning A* (LPA*) [7], D* Lite [8], Anytime Repairing A* (ARA*) [9], and Hybrid-state A* [10]. The sampling-based approaches randomly sample a space of possible robot configurations distributed in space and use search algorithms to form a directed graph of collision-free motions. Probabilistic Road-Map (PRM) [11], Rapidly-Exploring Random Tree (RRT) [12], RRT*, and Rapidly-Exploring Random Graph (RRG) methods [13] are some examples of this thoroughly studied approach in the literature. The artificial potential field method developed in [14] plans a path to a goal that avoids collisions by assigning an attractive force to the desired goal and repulsive forces to obstacles [15]. A variety of artificial intelligence-based algorithms for path planning have been proposed in the literature, for instance, Artificial Neural Network (ANN) [16], Genetic Algorithm (GA) [17], Ant Colony Optimization (ACO) [18], Particle Swarm Optimization (PSO) [19], and Simulated Annealing (SA) [20] algorithms. Reference [21] presents a comprehensive review of artificial intelligence-based methods for path planning up to 2018. A recent line of research involves trajectory generation for flexible space robots based on deterministic artificial intelligence, which aims to use the dynamic equations of the system to autonomously determine an optimal reference trajectory (c.f. [22] and the references therein).
However, the above-mentioned motion planning algorithms have analytical and practical limitations, such as: • Absence or limited consideration of the internal and external constraints imposed by the system and dynamic environment; • Lack of repeatability in generating trajectories between a starting and ending configuration for a fixed initial vehicle state and environment scenario; • Computational inefficiency in regenerating paths while moving between the start and terminal point; • For some approaches, generating non-smooth paths that lead to jerky motions and create inefficiency in the vehicle's power draw [23].
Recently, optimization-based motion planning methods have gained researchers' attention due to their ability to resolve some or all of the above-mentioned limitations. The best-known optimization-based methods for motion planning are Covariant Hamiltonian Optimization for Motion Planning (CHOMP) [24] and Stochastic Trajectory Optimization for Motion Planning (STOMP) [25]. The former uses gradient-based optimization while the latter uses stochastic optimization, and both produce collision-free trajectories that satisfy given constraints, but are computationally expensive. In addition, CHOMP is prone to getting trapped in local minima, where it returns infeasible or sub-optimal solutions. A very recent publication on optimization-based trajectory generation is [26], which uses Pontryagin's maximum principle to efficiently generate slew trajectories for a spacecraft.
Our proposed approach, called the Nonlinear Model Predictive Horizon (NMPH) [27], uses optimization for planning smooth, optimal, consistent, and computationally efficient trajectories which respect the internal and external constraints of the vehicle. This approach compensates for the nonlinearities of the trajectory planning dynamics and hence reduces or removes the non-convexity of the optimization problem. This is performed by combining the nonlinear plant model with the Backstepping Control (BSC) method within the optimization problem. It also accounts for static and moving obstacles as constraints in the optimization problem.
In ref. [27], we proposed an NMPH formulation which used Feedback Linearization (FBL) within its dynamics to compensate for nonlinearities. In that work, the state augmentation process required to make the system state feedback linearizable created numerical difficulties due to the need to obtain the second-order time derivative of the total thrust of the drone. This difficulty with using the FBL design steered us to find an alternative feedback design methodology to be used within the NMPH design, namely BSC. This new NMPH-BSC formulation compensates for the system nonlinearities and guarantees global asymptotic stability of the closed-loop system within the optimization problem used to predict the reference trajectories for the drone vehicle.
The backstepping control algorithm was originally developed by [28] to design adaptive controllers for a special class of nonlinear dynamical systems. Backstepping is a recursive Lyapunov-based design technique applied to a nonlinear strict-feedback system with a series of cascaded subsystems, in which the choice of the Lyapunov functions for the subsystems guarantees the global asymptotic stability of the overall controlled system [29].
One of the most important advantages of the BSC method is the systematic procedure that the designer can follow to construct a feedback control law based on an appropriate choice of Lyapunov functions [30]. This step-by-step procedure uses the system's dynamics model, which accounts for the system nonlinearities.
The flexibility of the BSC methodology and its ability to guarantee global asymptotic stability [31,32] make it an excellent candidate to be used inside the NMPH approach, as this integration compensates for the drone's nonlinearities and leads to a less non-convex optimization problem.
To demonstrate the versatility of our NMPH-BSC approach over the earlier NMPH-FBL [27], a more detailed model of the drone vehicle dynamics, which includes linear and angular velocity drag forces and rotor gyroscopic effects, is employed. The optimization problem is also used to predict both the reference trajectory as well as its rates of change, which provides higher-quality tracking performance.
The research contributions of this paper are: • Implementing the BSC method within NMPH to compensate for nonlinearities in order to reduce the non-convexity of the trajectory generation optimization problem; • Demonstrating the versatility of the NMPH-BSC approach by using both a simplified and a higher-fidelity dynamics model of the drone; • Using the NMPH optimization problem to predict both the reference trajectory as well as its rates of change for the onboard flight trajectory controller; • Validating and evaluating the performance of the proposed approach in both simulation and hardware drone flight experiments.
The remainder of this paper is arranged as follows: the design of the NMPH-BSC for two variants of drone dynamics models is presented in Section 2. Section 3 evaluates the proposed designs in simulation and hardware flight tests. Section 4 summarizes the paper and proposes future work.

Problem Formulation
The Nonlinear Model Predictive Horizon (NMPH) is a recent approach to path planning proposed by the authors in [27]. In simple terms, NMPH integrates a linearizing feedback law to reduce the non-convexity of the optimization problem handled by a Nonlinear Model Predictive Control (NMPC) algorithm. NMPH generates optimal reference trajectories for the closed-loop system which steers an aerial drone's 3D position, heading angle, and their rates of change. The original NMPH [27] relied on state feedback linearization [33], which required the addition of integral states to satisfy the conditions of the method, translating to the need for numerical differentiation of measured outputs. In the present work, we replace feedback linearization with Backstepping Control (BSC) [28] to compensate for the system's nonlinearities. The recursive structure of BSC provides stable response of a dynamic system and makes it more robust to parameter uncertainties.
In this section, an overview of the NMPH framework is presented, both high-fidelity and simplified drone dynamics models are presented, and finally a backstepping design for each model is derived and integrated into the NMPH.

Nonlinear Model Predictive Horizon
NMPH is an optimization-based method whose objective is to generate optimal reference trajectories for closed-loop nonlinear systems. The optimization problem of NMPH considers the nonlinear plant model, nonlinear control law, and user-specified constraints to continuously solve for reference trajectories which are fed to the closed-loop system and which respect its dynamics. An overview of the NMPH architecture is shown in Figure 1.
Embedding the nonlinear control law in the optimization problem is performed to decrease the nonlinearity of the closed-loop system dynamics. Consequently, this will reduce the non-convexity of the optimization problem and enhance convergence and computational time of the solution, enabling real-time trajectory generation for purposes such as motion planning and environmental exploration. For the closed-loop system, which is depicted by a blue box in Figure 1, assume that a nonlinear plant is controlled by a nonlinear feedback law as follows: where x ∈ X ⊆ R n x , u ∈ U ⊆ R n u , and ξ ∈ Ξ ⊆ R n ξ are the system states, inputs, and outputs, respectively. The system outputs considered in this work are the vehicle's 3D position and heading angle and their rates, such that Ξ ⊆ X. The reference trajectory ξ re f ∈ Ξ is generated by our proposed algorithm, as will be presented below in (2). The system dynamics are represented by a smooth map f : X × U → X, while the feedback control law is the smooth map g : X × Ξ → U which makes the system output to follow a reference trajectory ξ re f . In this work the control law is designed using the Backstepping Control technique. For the NMPH, which is represented by the gray box in Figure 1, a copy of (1) plus assigned constraints are used in the optimization problem to compute two predictions over a finite time horizon, from the current state x to a terminal stabilization setpoint x ss . The first prediction is the predicted system state trajectoryx, which includes the predicted output trajectoryξ as a subset. The second one is the estimated reference trajectoryξ re f , which is used as a (continuously updated) reference trajectory for the actual closed-loop system.
The online NMPH optimization problem for a stabilization setpoint x ss is shown in Equation (2) [34]. Let t n , n = 0, 1, 2, · · · represent successive sampling times. At every sampling time, the optimization problem solves forx andξ re f as long as for τ ∈ [t n , t n + T].
The stage L and terminal cost E functions in (2) can be selected as the terms (3a,b), which allows the use of the Gauss-Newton method for quickly finding a good approximation of the Hessian matrix within the optimization problem: where the constraint sets for the state, control and output trajectory are X ⊆ X, U ⊆ U, and Z ⊆ X, respectively. A total of p static and dynamic obstacles within the environment are represented by the inequality constraint O i (x) ≤ 0 in (2e). The deviation between the predicted system state trajectoryx and the stabilization setpoint x ss and between the predicted output trajectoryξ and the estimated reference trajectoryξ re f are penalized within the stage cost function L (3a) by the weighting matrices W x ∈ R n x ×n x and W ξ ∈ R n ξ ×n ξ , respectively. The terminal cost function E (3b), which represents the cost of steady-state error, penalizes the deviation between the end value of the system state predictionx(t n + T) and the terminal setpoint x ss by the weighting matrix W T . In words, the optimization process (2) runs as follows: 1.
Measure or estimate the current state x(t n ) of the actual closed-loop system; 2.
Predictx andξ re f by minimizing the cost function J x,ξ re f over the prediction horizon [t n , t n + T] subject to the system dynamics (2b), control law (2c), and assigned constraints (2e); 3.
Use the estimated reference trajectoryξ re f (or the predicted output trajectoryξ, as both converge to each other) as the reference trajectory of the actual closed-loop system; 4.
Repeat Steps 1-3 until the drone vehicle approaches the terminal setpoint x ss or the optimization problem produces an infeasible solution.
A comprehensive study about NMPH, including discrete-time representation, use of constraints, and software implementation of the optimization problem can be found in our recent work [27].

Drone Dynamics
The dynamics of a multi-rotor drone vehicle can be modeled using the Newton-Euler equations [35], governing six degrees of freedom rigid-body motion, augmented with force and torque generation models of the individual rotors. The model can either assume static hover conditions for simplicity, or include linear and angular velocity drag forces and rotor gyroscopic effects to yield a more complicated but higher-fidelity model.
To model rigid-body dynamics, two reference frames are used: a stationary groundfixed navigation frame N , and a moving body-fixed frame B. The origin of the latter is placed at the drone's center of gravity, as shown in Figure 2. The frame bases follow the East, North, and Up (ENU) convention, with orthonormal basis vectors {n 1 , n 2 , n 3 } and {b 1 , b 2 , b 3 } for the navigation and body frames, respectively. Rigid-body pose in space can be described as a member of the Special Euclidean group SE(3) = SO(3) × R 3 , the product space of the orientation and position R nb , p n where R nb ∈ SO(3) is the rotation matrix of the body frame with respect to the navigation frame, p n = [x y z] T ∈ R 3 is the position vector of the vehicle's body frame with respect to the navigation frame, and SO (3) is the Special Orthogonal group defined as SO(3) = {R ∈ R 3×3 | RR T = R T R = I, det(R) = +1}. The roll-pitch-yaw Euler angles η = [φ θ ψ] T are employed to parametrize the rotation matrix as: where s (·) = sin(·) and c (·) = cos(·). Similarly, t (·) = tan(·) which will be seen later.
Conversion between translational and rotational velocity vectors can be performed using the transformations [36] where v n , v b ∈ R 3 are the translational velocity vectors in frame N and B coordinates, respectively,η = [φθψ] T is the vector of Euler angle rates, and ω b ∈ R 3 is the angular velocity vector in frame B coordinates. The rotational velocity transformation matrix W is given by: The time derivative of the rotation matrix isṘ nb = R nb S(ω b ), where S(·) : R 3 → so(3) maps a vector to a skew-symmetric matrix such that S(a)b = a × b for a, b ∈ R 3 . Taking the time derivatives of (5a,b), whereẆ =φ(∂W/∂φ) +θ(∂W/∂θ). The Newton-Euler equations for a multi-rotor drone read [36] T is the vector of torques about the b 1 , b 2 and b 3 frame axes,ḡ = [0 0 g] T is the gravitational acceleration vector where g = 9.81 m/s 2 , J = diag(J x , J y , J z ) is the drone's mass moment of inertia matrix which is assumed to be diagonal, the scalar J r is the rotor's inertia, where ω i is the angular speed of the ith propeller, and K t = diag(k t1 , k t2 , k t3 ), K r = diag(k r1 , k r2 , k r3 ) represent the translational and rotational drag coefficient matrices of the drone, respectively.
To express the drone's dynamics with respect to the navigation frame, Equation (8) are combined with (5) and (7) to yield: This leads to: Remark 2. Each multi-rotor configuration (quadrotor, hexarotor, octorotor, and so on) has a different expression for the net body-frame thrust and torque vectorsū andτ. These expressions are algebraic and can be readily calculated. The dynamics presented in (10) thus model any multi-rotor drone as long as the correctū andτ expressions are used.
The development of the proposed NMPH with a backstepping control design will be based on the dynamics model presented in (10). We will also present a design based on a simplified version of (10) to illustrate the ease of adapting the proposed approach to different model representations. This is in contrast to the formulation of NMPH with feedback linearization presented in our recent work [27] where this adaptation requires a fundamental re-derivation of the expressions involved.
In the simplified version of (10), body and propeller gyroscopic effects are dropped from the model, and the translational and rotational drags are neglected as well. The simplified model can then be written as:

Backstepping Control Design
Because a drone's dynamics are nonlinear, solving the NMPC optimization problem is challenging because of its non-convexity. Introducing backstepping control into the optimization problem (within the framework of NMPH) will either remove or reduce the nonlinearity of the overall system, and consequently the non-convexity of the optimization problem. This will make it possible to find an optimal solution more quickly.
In this section, a backstepping control law is derived for the drone dynamics, which will be used within the NMPH framework to enhance the performance of the reference trajectory prediction. To demonstrate the versatility of this methodology, both the simplified (11) and high-fidelity (10) models of the drone dynamics will be considered.
Our backstepping control design consists of a coupling of inner and outer control loops [37]. The inner loop controls the rotational dynamics of the drone and tracks desired values provided by the outer loop, which controls the translational dynamics. In the literature, many studies of backstepping control applied to multi-rotor drones considered applying the design steps to each system output separately [36,[38][39][40], but in our work the method is first applied to the rotational dynamics subsystem by itself, then to the translational dynamics subsystem. This approach will facilitate the integration of BSC within the NMPH framework as discussed later in Section 2.4.
First, recall the terms: where η d ∈ R 3 are the desired Euler angles, to be provided by the outer loop design. Now, define the tracking error vector δ 1 ∈ R 3 as: and choose a positive semi-definite Lyapunov function candidate V 1 ≥ 0 ∈ R, such that The time derivative of (14) is: Next, define the virtual tracking error rate δ 2 ∈ R 3 and the first virtual where Λ 1 = diag(λ 1 , λ 2 , λ 3 ) ∈ R 3×3 is a diagonal gain matrix that contains positive entries such that Λ 1 is positive definite or Λ 1 > 0. Using (16) and (17), the derivative of the Lyapunov function candidate (15) can be written as: which by inspection may or may not be negative semi-definite. Therefore, a recursive step must be performed. Note that the time derivatives of (13) and (16) are: Now define the new positive semi-definite Lyapunov function candidate V 2 ≥ 0 ∈ R as: such thatV where δ T 1 δ 2 = δ T 2 δ 1 . To stabilize the tracking errors δ 1 and δ 2 , the backstepping control formulation will introduce a second virtual control v 2 ∈ R 3 . We will define v 2 based on the system dynamics, and then recursively design it within the backstepping control structure.
As mentioned in Section 2.2, we will apply the backstepping technique to both the full and simplified system dynamics presented in (10) and (11), respectively. For the full model, the attitude dynamics in (10b) can be written as: where The second virtual control is defined as v 2 =ḡ 1 (x,τ) = [τ φ τ θ τ ψ ] T , where τ φ , τ θ and τ ψ are the virtual inputs of the rotational subsystem.
We now design v 2 to make the time derivative of the Lyapunov candidate Function (22) negative semi-definite. Let where Λ 2 = diag(λ 4 , λ 5 , λ 6 ) ∈ R 3×3 is the second diagonal gain matrix with positive entries such that Λ 2 > 0. By substituting (23) and (26) into (22) we obtain: By (21) and (27), we can thus conclude the asymptotic stability of the error terms δ 1 and δ 2 , and thus the rotational subsystem. Consequently, the physical control law for the rotational subsystem can be found by returning to (26), then solving (25) to obtain the physical control inputs as: We now perform the backstepping design for the translational dynamics of the drone. The actual and desired position vectors with respect to the navigation frame are written as: For the first step of the backstepping design, the position tracking error vector and its time derivative are defined as: Consider the Lyapunov candidate function V 3 ∈ R and its time derivative, Define the virtual tracking error rate and the first virtual control for the translational subsystem as: where Λ 3 = diag(λ 7 , λ 8 , λ 9 ) ∈ R 3×3 is a diagonal gain matrix with positive entries, and v 3 = [v x v y v z ] T ∈ R 3 is the virtual control vector. Substituting (34) and (35) into the Lyapunov candidate function rate (33) yields: which cannot be guaranteed to be negative semi-definite. Therefore, a new Lyapunov candidate function V 4 ∈ R is defined as: The translational dynamics of the full model (10a) can be written as: and where u ∈ R is the total thrust of the propellers, a physical input. The next step of the backstepping design introduces the second virtual control for the translational system v 4 =ḡ 2 (x, u) = [u x u y u z ] T . Assign this control as: where Λ 4 = diag(λ 10 , λ 11 , λ 12 ) ∈ R 3×3 contains positive entries. Substituting (39) and (42) into (38) yields:V such thatV 4 ≤ 0, meaning the error terms δ 3 , δ 4 are asymptotically stable, and thus the translational subsystem dynamics. For our cascaded control design, the desired roll and pitch angles for the inner loop system are extracted from (41) after computing (42). Assume ψ is a measured state of the system. Then, the desired roll and pitch angles φ d , θ d and thrust u are obtained by solving (41), which gives: Remark 3. In addition to the Euler angles limitations mentioned in Remark 1, the outer loop control law provides solutions if and only if the total thrust is a non-zero positive value, u > 0. This condition must be included within the constraints of the optimization problem in (2d) to avoid solution infeasibility.
We can also perform the backstepping control design using the simplified system model (11). The second time derivative of the Euler angles vector η is written as: The terms δ 1 , δ 2 and virtual input v 1 are defined exactly as in the full model backstepping design. Analogously to (28), the second virtual input v 2 for the (simplified) rotational dynamics is now assigned as: and since v 2 = [τ φ τ θ τ ψ ] T =ḡ 3 (τ), the physical inputs are obtained from (46) as: For the translational dynamics, define the position vector χ, whose second time derivative is written asχ =f 4 (x) +ḡ 4 (x, u), where: The definitions of δ 3 , δ 4 and virtual input v 3 remain identical to the full model case. Assign the virtual control v 4 as in (42): wheref 4 (x) is given in (49). Since v 4 = [u x u y u z ] T =ḡ 4 (x, u), assuming the yaw angle ψ is known, we solve this expression for the desired roll and pitch angles φ d , θ d and the total thrust u using Equation (44).

Backstepping Control Law Integration within NMPH
A copy of the full system model in (10) is used within the NMPH optimization problem asẋ = [p nṽnηη ] T . Letχ =p n andχ =ṽ n below. The backstepping control design representingũ(τ) = g x(τ),ξ re f (τ) within the NMPH (2c) is implemented in two stages. The first stage is the outer loop, which takes the desired position vector of the drone, and computes the thrust plus the desired roll and pitch angles. The second stage is the inner loop, which takes the computed roll and pitch angles plus a desired yaw angle, and computes the torque inputs. The details of the two-stage process (implemented as input constraints within the NMPH) are as follows: 1.
For the first NMPH input constraint, define the virtual input for the translational dynamics (42) as:ṽ Associate the total thrust u and the desired roll and pitch angles φ d , θ d from (44) with [u x u y u z ] T =ṽ 4 and ψ =η(3) 3.
For the second NMPH input constraint, define the virtual input for the rotational dynamics (28):ṽ in the NMPH optimization problem, a function of the NMPH statesx and the estimated reference trajectoriesξ re f , wherê 6. Solve the optimization problem (2), which leads to the prediction of the system states x and the estimated reference trajectoriesξ re f . The latter is used as the reference trajectory for the actual closed-loop system.

Evaluation of NMPH-BSC
For testing and validation, the NMPH-BSC approach was implemented and tested in simulated and hardware flight tests on quadcopter and hexacopter drone vehicles, respectively.
The algorithms are implemented within the Robot Operating System (ROS) [41], a Linux-based software environment that handles communications between the vehicle's onboard computer and its hardware subsystems. The ACADO Toolkit [42] is used to solve the optimization problem. For implementation, the overall NMPH problem (2) was coded in C++, then converted into highly efficient C code by ACADO to be able to run the calculations in real-time.
The set of continuous-time equations in (2) is a Nonlinear Programming (NLP) optimization problem, which can be discretized using the direct multiple-shooting method. NLP solves optimization problems, which include nonlinear functions and/or nonlinear constraints using Sequential Quadratic Programming (SQP) [43], and in our case the qpOASES solver is used to solve SQP numerically [44].

Simulation Environment
The proposed approach was first implemented in a simulation on a quadcopter drone using the AirSim simulator [45]. AirSim is an open-source package which provides photorealistic rendered environments and a physics engine to enable performing lifelike simulations of drone vehicles. Moreover, we used the PX4 autopilot [46] running onboard the drone for software-in-the-loop operation to make the simulated drone's characteristics more closely resemble the hardware unit.
In our work, an incremental volumetric mapping technique named Voxblox [47] was used. Voxblox represents the environment volumetrically using a signed distance field and classification into unknown, free, or occupied spaces. The drone then uses this generated map to continuously build dynamic obstacle constraints that are used by the NMPH optimization problem to generate collision-free trajectories [34].
A desktop computer equipped with an Intel Core i7-10750H CPU and an Nvidia GeForce RTX 2080 Super GPU was used to run the optimization calculations and simulation environment in conjunction with ROS. The prediction horizon of the optimization problem was set to 8.0 s and discretized into 40 samples, which was found to be sufficient for motion planning purposes. The cost function weights were empirically tuned to provide good trajectory planning performance.
The drone's pose and environmental sensor readings were obtained from the AirSim simulator and communicated to our NMPH-based global motion planning system. The resulting output was used as a reference trajectory for the drone vehicle's flight controller. This design enables the drone to explore an unmapped environment, as will be discussed in Section 3.1.2.

Trajectory Planning
In this simulation, an optimal reference trajectory was planned and tracked within the Air-Sim simulator as shown in Figure 3a. The quadcopter vehicle started at p n = [−9, −3.5, 2] T m, ψ = 0 • and the NMPH-BSC algorithm was used to generate an optimal trajectory to the desired setpoint p n d = [−5, −8, 5] T m, ψ d = 90 • while avoiding an obstacle as shown in Figure 3b. The optimization problem within NMPH-BSC provided an estimate of the reference output trajectoryξ re f and a prediction of the system state trajectoryx, which included the predicted output trajectoryξ as a subset.   Figure 4 shows that the estimated reference trajectories of the vehicle's position and velocitiesξ re f = [ξ x,re f ,ξ y,re f ,ξ z,re f ,ξ ψ,re f ,ξ˙x ,re f ,ξ˙y ,re f ,ξ˙z ,re f ,ξψ ,re f ] perfectly match their corresponding predicted reference trajectoriesξ = [x x,re f ,x y,re f ,x z,re f ,x ψ,re f ,x˙x ,re f ,x˙y ,re f ,x˙z ,re f ,xψ ,re f ]. This confirms that using the stage cost function in (3a) minimizes the deviation between the estimated and predicted reference trajectories, and thus ensures their convergence towards each other. This validates the statement made in Section 2.1, that either the estimated or the predicted trajectory can be used as the reference trajectory for the closed-loop system.  In the second simulation, tracking performance is assessed by having the vehicle move from the initial position p n = [0, 0, 2] T m to the desired terminal point p d = [6, −4, 2] T m using the optimized reference trajectory provided by the NMPH-BSC. The tracking performance is plotted in Figure 5, showing the vehicle satisfactorily tracks the time-varying reference trajectory generated by the NMPH-BSC algorithm. The small variation between the desired and actual outputs is because the former are obtained from numerical integration of the drone dynamics (10), while the latter are obtained from the physics engine of the simulation environment, which likely uses more complicated aerodynamic force and torque models than our design. This modeling mismatch can also be expected for real-world hardware testing, which will be covered in Section 3.2.

Exploration of Unknown Environment
In this simulation test, the drone explored an unknown environment by using a modular global motion planner, as described by the authors of [34]. This graph-based motion planner generated terminal setpoints within unexplored areas of an incrementally built-up volumetric map of the environment [47,48], and the NMPH-BSC algorithm was used to calculate optimal trajectories from the vehicle's current pose to these terminal setpoints. To achieve a smooth integration between the graph-based planner and the NMPH-BSC trajectory planning approach, computationally efficient algorithms for obstacle mapping and avoidance plus robust path guidance algorithms were used. Further details of this methodology can be found in [34].   Table 1 offers a comparison between three approaches: graph-based planner only [48], graph-based planner plus NMPH-FBL [27], and graph-based planner plus NMPH-BSC (this paper). Based on this comparison, it can be seen that both NMPH algorithms have the effect of reducing the distance traveled, which reduces the mission time and consequently the energy consumption of the vehicle. Using the proposed NMPH-BSC leads to an improved performance over the NMPH-FBL. This is in addition to NMPH-BSC's advantages of avoiding numerical differentiation and its ability to extend to more complicated plant models as compared to NMPH-FBL.

Hardware Flight Experiments
For hardware experiments, a custom DJI FlameWheel F550 hexacopter was built and instrumented to explore unknown environments using our proposed NMPH-BSC approach in conjunction with a global motion planner. The drone was equipped with a Pixhawk 2.1 flight controller running the PX4 autopilot system [46], plus an Nvidia Jetson Xavier NX system-on-module running ROS Melodic Morenia [41] under Ubuntu 18.04. A Velodyne Puck LITE LiDAR sensor and an Intel RealSense T265 stereo camera were mounted on the drone to provide 360 • point cloud and estimated pose data, respectively. The drone system used for testing is depicted in Figure 7.  Trajectory generation and tracking were evaluated by running the NMPH-BSC algorithm onboard the vehicle. Figure 8a shows the planned trajectory between two setpoints avoiding a sensed obstacle generated by the NMPH-BSC algorithm, while Figure 8b depicts the flight trajectory achieved by the drone using its flight controller to track the planned trajectory. The NMPH-BSC solver provides continuous updates of the estimated and predicted reference trajectories as the vehicle moves towards its endpoint. In this way, the system can handle uncertainties and disturbances such as dynamic environments and moving obstacles. The regeneration rate was set to 5 Hz, although this can be set as high as 100 Hz with the presented hardware.
Trajectory planning using NMPH-BSC in the presence of dynamic obstacles was evaluated experimentally as shown in Figure 9. In this Figure, it can be seen that the drone was able to regenerate trajectories to avoid a moving obstacle while flying through a constrained indoor environment. It is worth pointing out that the continuous trajectory regeneration process of NMPH-BSC provided smooth transitions between the generated trajectories, leading to smooth flight around the moving obstacle.
In a second flight test experiment, the navigation capabilities of the system were tested within a confined indoor environment as illustrated in Figure 10. During this test, the motion planner generated five terminal setpoints and the NMPH-BSC algorithm provided continuously regenerating local reference trajectories between the successive setpoints to ensure smooth flight.  A final hardware flight test was performed outdoors in order to assess the trajectory planning performance in the presence of wind, in this case approximately 15 km/h. As before, the graph-based motion planner provided multiple terminal setpoints, while our NMPH-BSC planned smooth trajectories between them in real-time. The resulting flight trajectory can be seen in Figure 11. Despite the presence of a wind disturbance, the drone was able to smoothly navigate between generated setpoints as in the earlier tests.

Conclusions and Future Work
This paper proposed an optimization-based trajectory planning approach for drones operating in unknown environments. The proposed method embeds the Backstepping Control technique within our recently-proposed Nonlinear Model Predictive Horizon framework [27] for generating reference trajectories for nonlinear dynamical systems. This integration reduces the non-convexity of the optimization problem and thus enables realtime computation of optimal trajectories which respect the nonlinear dynamics of the vehicle while avoiding static and dynamic obstacles.
The resulting NMPH-BSC design was tested in simulation and hardware flight experiments on quadcopter and hexacopter drone vehicles, respectively. The results showed an improvement in performance over a system previously proposed by the authors based on Feedback Linearization [34]. The new design was shown to offer additional implementation advantages over our earlier work including the ability to readily extend to more complicated plant models and avoid numerical differentiation.
Future work will include implementing an adaptation scheme allowing the online updating of the optimization problem weights within the NMPH, and testing the developed system in large-scale, unknown and GPS-denied environments such as subterranean mines.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript: Navigation and body-fixed frame (n 1 , n 2 , n 3 ), (b 1 , b 2 , b 3 ) Navigation and body frame basis vectors SE(3), SO (3) Special Euclidean and Special Orthogonal groups R nb Rotation matrix of B relative to N p n Position vector of vehicle's B frame origin relative to N η Vector of three Euler angles φ, θ, ψ Roll, pitch and yaw Euler angles s (·) , c (·) , t (·) Sine, cosine and tangent of angle