Evaluation of Hunting-Based Optimizers for a Quadrotor Sliding Mode Flight Controller

The design of Multi-Input Multi-Output nonlinear control systems for a quadrotor can be a difficult task. Nature-inspired optimization techniques can greatly improve the design of nonlinear control systems. Two recently proposed hunting-based swarm intelligence techniques are the Grey Wolf Optimizer (GWO) and the Ant Lion Optimizer (ALO). This paper proposes the use of both GWO and ALO to design a Sliding Mode Control (SMC) flight system for improved altitude and attitude tracking in a quadrotor dynamic model. SMC is a nonlinear technique whose strongly coupled parameters, related to its continuous and discontinuous components, must be correctly adjusted for proper operation. This requires minimizing the tracking error while keeping the chattering effect and the control signal magnitude within suitable limits. The performance achieved with both GWO and ALO under realistic disturbed flight scenarios is presented and compared to the classical Particle Swarm Optimization (PSO) algorithm. Simulation results show that GWO and ALO outperformed PSO in terms of precise tracking, for both ideal and disturbed conditions. The stronger stochastic nature of these hunting-based algorithms provided more confidence in local optima avoidance, suggesting the feasibility of achieving more precise tracking in practical use.


Introduction
A QuadRotor (QR) is a small rotorcraft which can be remotely controlled or fly autonomously through GPS-based embedded flight plans. Besides educational and leisure applications, its professional use has been increasing for intervention in hostile environments, remote sensing, precision agriculture, and urban planning [1]. This system is nonlinear, strongly coupled, Multiple-Input Multiple-Output (MIMO), and underactuated. Moreover, since its operation is subject to external disturbances, robust control strategies have been investigated. Although some approaches are based on linear approximations and Proportional-Integral-Derivative (PID) controllers [2,3], most controllers are based on nonlinear algorithms and tools, such as feedback linearization, backstepping, and Sliding Mode Control (SMC). SMC [4] is a robust control technique based upon a switching control law and the definition of a sliding surface, a function of the system state variables. In [5], backstepping and SMC were shown to be equivalent from a robustness point of view, with SMC presenting a smoother and faster response. This has motivated research on both continuous and discrete-time SMC formulations. The main contributions of this work are:
• First evaluation of SMC parameter optimization for a quadrotor flight system with the hunting-based algorithms ALO and GWO.
• Parameters obtained by ALO and GWO provided more confidence and repeatability during the optimization process.
• Parameters obtained by ALO and GWO provided lower tracking error.
• Novel extension of such optimization approaches, usually applied to PID control, to SMC controller tuning.

Quadrotor Dynamics
A quadrotor consists of a rigid body frame equipped with four rotors, as shown in Figure 1, where the following assumptions are made: (1) the center of gravity is fixed at the origin of the quadrotor frame B and coincides with the inertial frame E; (2) its structure is symmetrical; and (3) thrust and drag are proportional to the square of each motor's speed. The full quadrotor Newton-Euler dynamic model in Equation (4) represents the x, y, z motions as a consequence of a roll (φ), pitch (θ), or yaw (ψ) rotation. A detailed derivation of the quadrotor physical model may be seen in [34]. The lift forces F_1, F_2, F_3, F_4 are basically the thrust generated by each propeller, and they are related to the input signals U_i, i = 1 . . . 4. The vehicle can be controlled and stabilized by altering the rotor speeds ω_i, i = 1 . . . 4: varying these speeds changes the lift forces and creates motion. The collective input U_1 is the sum of each DC motor thrust. Defining the state vector as

X = [x, ẋ, y, ẏ, z, ż, φ, φ̇, θ, θ̇, ψ, ψ̇]^T ∈ R^12,  (1)

and considering x = x_1, ẋ = x_2, y = x_3, ẏ = x_4, z = x_5, ż = x_6, φ = x_7, φ̇ = x_8, θ = x_9, θ̇ = x_10, ψ = x_11, ψ̇ = x_12, Equation (1) may be written in state-space form with the control input vector U. The full state-space system may be split into translational and rotational subsystems (see Table 1), with the model parameters of the micro Vertical Take-Off and Landing (VTOL) flying robot proposed in [34], detailed in Table 2.

Table 1. Quadrotor dynamical subsystems.

Subsystems: Translational (x, y, z) | Rotational (φ, θ, ψ)

Table 2. Model parameters [34] (as recovered):
propeller lift coefficient (thrust factor): 3.13 × 10^−5
J_r — moment of propeller inertia around the Z axis: 8.66 × 10^−7
I_x — moment of inertia around the X axis: 6.228 × 10^−3 kg·m²
I_y — moment of inertia around the Y axis: 6.228 × 10^−3 kg·m²
I_z — moment of inertia around the Z axis: 1.121 × 10^−2 kg·m²

Therefore, using the notation presented in Table 1, the quadrotor model described in [34] may be written as:

Ẋ = [ x_2,
      (cos x_7 sin x_9 cos x_11 + sin x_7 sin x_11) U_1/m,
      x_4,
      (cos x_7 sin x_9 sin x_11 − sin x_7 cos x_11) U_1/m,
      x_6,
      −g + (cos x_7 cos x_9) U_1/m,
      x_8,
      x_10 x_12 (I_y − I_z)/I_x − (J_r/I_x) x_10 Ω_r + (l/I_x) U_2,
      x_10,
      x_8 x_12 (I_z − I_x)/I_y + (J_r/I_y) x_8 Ω_r + (l/I_y) U_3,
      x_12,
      x_8 x_10 (I_x − I_y)/I_z + (1/I_z) U_4 ]^T  (4)

where m is the vehicle mass, l is the arm length, and Ω_r is a disturbance associated with the difference between the clockwise and counterclockwise propellers' rotation in Figure 1, defined as Ω_r = ω_2 − ω_1 + ω_4 − ω_3. The control inputs U_i and rotor speeds ω_i are related by the following equation:

Altitude and Attitude Control-Sliding Mode Flight System
This design closely follows Herrera et al. [13] and is based on the equivalent control method [35]. A Lyapunov-based stability analysis may be found in [36]. First, a Proportional-Derivative sliding surface S = ė + λe is defined for each process variable (z, φ, θ, ψ) and respective tracking error (e_z, e_φ, e_θ, e_ψ), to be reached by the trajectories in finite time. In this condition, named sliding mode, S = 0 and a non-switching control signal U_eq, referred to as the equivalent control, is designed. The second step consists of designing the control law U_sm, which drives the process variable to the sliding surface during the reaching mode; it must obey a reaching law, considered here as SṠ ≤ −η|S|, η > 0. Therefore, considering Equation (4), one obtains the altitude control components in Equations (9) and (10); following a similar procedure for the roll, pitch, and yaw angles yields the expressions presented in Table 3.

Table 3. Roll, pitch, and yaw control signals.

Roll
From Equations (9) and (10) and the Table 3 expressions for the control components U_i^eq, U_i^sm (i = 2, 3, 4), it can be seen that twelve parameters must be adjusted, namely the set in Equation (11), detailed in Table 4.

Table 4. Controller parameters to be optimized.

Control Component | Controller Parameters
Altitude z | λ_1, k_1, δ_1
Roll φ | λ_2, k_2, δ_2
Pitch θ | λ_3, k_3, δ_3
Yaw ψ | λ_4, k_4, δ_4

When the controller parameter set in Equation (11) is badly tuned, non-negligible tracking error may occur. Besides, high-frequency oscillations (chattering) and control signal saturation must be kept within feasible limits. This situation is undesirable in practice [8], as it increases the risk of damage to actuators and motors, as shown in Figure 2 for thrusts U_2 and U_3. Therefore, an optimal (or suboptimal) set of parameters is desired to minimize a suitable fitness function. The next section presents the ALO and GWO algorithms as the optimization metaheuristics selected for the specific problem stated in this work.
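To make the role of these parameters concrete, the following minimal Python sketch (an illustration under assumed values, not the paper's tuned controller) shows a PD sliding surface and a boundary-layer switching term for a single axis:

```python
import numpy as np

def sliding_surface(e, e_dot, lam):
    """PD sliding surface S = e_dot + lambda * e."""
    return e_dot + lam * e

def switching_control(S, k, delta):
    """Discontinuous component k * sat(S / delta); the boundary layer
    width delta smooths the sign function to limit chattering."""
    return k * np.clip(S / delta, -1.0, 1.0)

# Hypothetical values: a positive tracking error with zero error rate
# gives S > 0, so the switching term pushes the state back to S = 0.
S = sliding_surface(e=0.5, e_dot=0.0, lam=2.0)   # S = 1.0
u_sm = switching_control(S, k=0.1, delta=0.5)    # saturates at k = 0.1
```

Increasing k improves disturbance rejection but amplifies chattering, while a larger delta softens the switching at the cost of robustness, which is exactly the trade-off the optimizers must resolve.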

Ant Lion Optimizer
ALO is a nature-inspired stochastic optimization algorithm which mimics the hunting mechanism of ant lions (predators), that is, how they build pits (traps) into which ants (prey) slide and are caught. It has presented results competitive with PSO [17] in terms of improved exploration, local optima avoidance, exploitation, and convergence when solving constrained problems with diverse search spaces. Exploration (global search) is guaranteed by the partly random selection of ant lions and by the random walks of ants around ant lions, so the probability of avoiding local optima is high [21]. Exploitation (local search) is ensured by the shrinking size of the ant lion traps, and the most promising solution is preserved by the elite ant lion, so the convergence accuracy is good. The main interpretation is that ant lions build pits proportional to their fitness: the fitter the ant lion, the larger the pit. The stochastic behavior also yields different solutions in each independent run, and therefore the reported final parameters and convergence curve represent the mean over the population. The ALO steps are detailed next.

Random Walk of Ants
The basic idea of ALO is to model the ants' random walks around the ant lions and keep them inside the search space, which is achieved by a min-max normalization:

X(t) = [0, cs(2r(t_1) − 1), cs(2r(t_2) − 1), ..., cs(2r(T) − 1)]  (12)

X_i^t = (X_i^t − a_i)(d_i^t − c_i^t) / (b_i − a_i) + c_i^t  (13)

where cs is the cumulative sum; t is the iteration of the random walk; T is the total number of iterations; r(t) is a function defined as 1 if rnd > 0.5 and 0 if rnd ≤ 0.5; rnd is generated with uniform distribution in the interval [0, 1]; a_i and b_i are the minimum and maximum of the random walk of variable i; and d_i^t and c_i^t are the upper and lower bounds of variable i at iteration t.
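The random walk and its min-max normalization can be sketched in Python as follows; the walk length and the bounds are illustrative assumptions:

```python
import numpy as np

def random_walk(T, rng):
    """Random walk X(t): cumulative sum of (2 r(t) - 1) steps with a
    leading zero, where r(t) is 1 if rnd > 0.5 and 0 otherwise."""
    steps = 2.0 * (rng.random(T) > 0.5) - 1.0
    return np.concatenate(([0.0], np.cumsum(steps)))

def normalize_walk(walk, c_t, d_t):
    """Min-max normalization: map the walk from its own range
    [a_i, b_i] into the current bounds [c_t, d_t]."""
    a, b = walk.min(), walk.max()
    return (walk - a) * (d_t - c_t) / (b - a) + c_t

rng = np.random.default_rng(0)
walk = normalize_walk(random_walk(200, rng), c_t=-1.0, d_t=1.0)
# every position of the walk now lies inside [-1, 1]
```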

Ant Lions Building Traps
To build a trap and associate every ant with an ant lion, a roulette wheel mechanism selects an ant lion based on its fitness, and the trapping in its pit is modeled by shifting the bounds of the ant's random walk towards the selected ant lion:

c_i^t = c^t + Antlion_j^t  (14)
d_i^t = d^t + Antlion_j^t  (15)

where c^t and d^t are the minimum and maximum over the entire ant vector, considering all dimensional variables, at iteration t, and Antlion_j^t is the position of the selected ant lion j at iteration t.
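The roulette wheel selection can be sketched as follows; since the flight-control problem is a minimization, the inverse-cost weighting used here is an assumed (and common) implementation choice, not taken from the paper:

```python
import numpy as np
from collections import Counter

def roulette_select(costs, rng):
    """Pick an ant lion index with probability proportional to the
    inverse cost: fitter (lower-cost) ant lions get larger slices."""
    weights = 1.0 / (np.asarray(costs, dtype=float) + 1e-12)
    probs = weights / weights.sum()
    return rng.choice(len(costs), p=probs)

rng = np.random.default_rng(1)
picks = Counter(roulette_select([0.1, 1.0, 10.0], rng)
                for _ in range(1000))
# the fittest ant lion (cost 0.1) attracts most of the ants
```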

The Entrapment of Ants in Traps
To move the ants towards the ant lions, the radius of their random walks is adaptively decreased, which reduces exploration and intensifies exploitation as the iterations t increase:

c^t = c^t / I,  d^t = d^t / I,  I = 10^w (t/T)  (16)

where w is a constant that grows in stages with t/T, so the trap boundaries shrink progressively faster towards the end of the search.
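The staged growth of w below follows the original ALO formulation [17] and is an assumption here; the sketch shows the bounds staying untouched early on and collapsing sharply near the end of the search:

```python
def shrink_ratio(t, T):
    """Adaptive ratio I = 10^w * t/T; w grows in stages with t/T."""
    ratio = t / T
    if ratio <= 0.1:
        return 1.0              # no shrinking in the earliest phase
    if ratio > 0.95:
        w = 6
    elif ratio > 0.9:
        w = 5
    elif ratio > 0.75:
        w = 4
    elif ratio > 0.5:
        w = 3
    else:
        w = 2
    return (10 ** w) * ratio

def shrink_bounds(c_t, d_t, t, T):
    I = shrink_ratio(t, T)
    return c_t / I, d_t / I

early = shrink_bounds(-1.0, 1.0, t=10, T=250)   # unchanged bounds
late = shrink_bounds(-1.0, 1.0, t=240, T=250)   # bounds near zero
```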

Ant Lions Catching Ants and Re-Building Traps
The action of catching a prey (ant) and re-building the pit is represented by replacing each ant lion with its corresponding ant whenever the ant becomes fitter:

Antlion_j^t = Ant_i^t  if Ant_i^t is fitter than Antlion_j^t  (17)

Moreover, ALO has an elitism scheme, where the best of all ant lions is kept as the elite solution. Every ant movement combines a random walk R_A, performed around an ant lion selected using the roulette wheel, and a random walk R_E around the best ant lion (elite), averaged as:

Ant_i^t = (R_A^t + R_E^t) / 2  (18)

The ALO pseudocode is presented in Algorithm 1.

Algorithm 1: Pseudocode for the ALO algorithm.
begin
  Randomly initialize the populations of ants and ant lions;
  Calculate the fitness of ants and ant lions;
  Find the best of all ant lions and assume it as the elite (determined optimum);
  while not (termination criterion) do
    for each search agent-ant do
      Select an ant lion using the roulette wheel;
      Update c and d using Equation (15);
      Create a random walk and normalize it using Equations (12)-(13);
      Update the ant position using Equation (18);
    end
    Evaluate the fitness of all ants;
    Replace each ant lion with its corresponding ant if it becomes fitter (Equation (17));
    Update the elite if an ant lion becomes fitter than the current elite;
  end
end
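The inner-loop update and the catch step of the ALO pseudocode (Equations (17)-(18)) can be sketched as follows, assuming a minimization problem and made-up sample positions:

```python
import numpy as np

def update_ant(R_A, R_E):
    """Equation (18): average of the walk around the roulette-selected
    ant lion (R_A) and the walk around the elite (R_E)."""
    return (np.asarray(R_A, dtype=float) + np.asarray(R_E, dtype=float)) / 2.0

def catch_ant(antlion_pos, antlion_cost, ant_pos, ant_cost):
    """Equation (17): the ant lion is replaced by its ant when the
    ant is fitter (lower cost here)."""
    if ant_cost < antlion_cost:
        return ant_pos, ant_cost
    return antlion_pos, antlion_cost

pos = update_ant([1.0, 3.0], [3.0, 5.0])     # midpoint of the walks
lion = catch_ant([0.0], 2.0, [1.0], 0.5)     # the fitter ant wins
```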

Grey Wolf Optimizer
The GWO algorithm, as proposed by Mirjalili et al. [19], is inspired by the social hierarchy of a wolf pack. An analogy is established between a set of potential solutions for a given problem and a population of wolves chasing a prey. Following the pack's social hierarchy, GWO establishes four organization levels:
• Alpha (α) wolves are dominant and thus followed by the rest of the pack.
• Beta (β) wolves are second in command, helping the alphas in the decision process and establishing a bridge between the alphas and the lower levels.
• Delta (δ) wolves are third in the pack hierarchy; while submissive to the alphas and betas, they dominate the lowest rank, the omegas. Deltas represent wolves such as scouts, sentinels, elders, hunters, and caretakers.
• Omega (ω) wolves represent the rest of the population of solutions.
The movement of the three higher-rank elements (α, β, and δ) mimics the encirclement of the prey by the wolves, modeled by:

D = |C · X_p(t) − X(t)|  (19)

where t represents the current iteration, X_p is the prey position vector, X is a grey wolf position, and C = c_1 r_2 is a coefficient vector evaluated using a uniformly random vector r_2 generated in the interval [0, 1]; c_1 is an adjustable constant, proposed in [19] as c_1 = 2. The resulting difference vector D is then used to move the specific element towards or away from the region where the best solution is located (the prey):

X(t + 1) = X_p(t) − A · D,  A = 2a · r_1 − a  (20)

where r_1 is a uniformly random vector generated in the interval [0, 1] and a is linearly decreased from a_max = 2 to a_min = 0 throughout the predefined number of iterations. If the absolute value of A is smaller than 1, this corresponds to exploitation and mimics the wolf attacking the prey; otherwise, if the absolute value of A is larger than 1, this corresponds to exploration and mimics the wolf diverging from the prey. The values proposed by Mirjalili et al. [19] for A lie in the interval [−2, 2]. Applying the general expressions (Equations (19)-(20)) to the higher wolf ranks (α, β, and δ) results, respectively, in:

D_α = |C_1 · X_α − X|,  X_1 = X_α − A_1 · D_α  (21)
D_β = |C_2 · X_β − X|,  X_2 = X_β − A_2 · D_β  (22)
D_δ = |C_3 · X_δ − X|,  X_3 = X_δ − A_3 · D_δ  (23)

In Equations (21)-(23), the prey position corresponds, respectively, to the best position attained by each of the wolf ranks (α, β, and δ). All wolf positions are then updated as the average of the three candidates:

X(t + 1) = (X_1 + X_2 + X_3) / 3  (24)

The GWO pseudocode is presented in Algorithm 2.
Algorithm 2: Pseudocode for the GWO algorithm.
begin
  Initialize the grey wolf population X(t);
  Initialize a;
  Evaluate each search agent-wolf fitness;
  X_α = the best search agent;
  X_β = the second best search agent;
  X_δ = the third best search agent;
  while not (termination criterion) do
    for each search agent do
      Update the position of the current search agent using Equation (24);
    end
    Update a, A, and C;
    Evaluate all search agents' fitness;
    Update X_α, X_β, and X_δ using Equations (21)-(23);
  end
end
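Algorithm 2 can be condensed into the following runnable Python sketch; the population size, iteration budget, and the toy sphere objective are illustrative assumptions, not the paper's SMC tuning setup:

```python
import numpy as np

def gwo(f, dim, n_wolves, T, lb, ub, seed=0):
    """Minimize f over [lb, ub]^dim with the Grey Wolf Optimizer."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n_wolves, dim))
    for t in range(T):
        a = 2.0 - 2.0 * t / T                    # a: linearly 2 -> 0
        fit = np.array([f(x) for x in X])
        order = np.argsort(fit)
        # copy the leaders so in-place updates below do not alias them
        X_a, X_b, X_d = (X[order[k]].copy() for k in range(3))
        for i in range(n_wolves):
            cands = []
            for lead in (X_a, X_b, X_d):         # Equations (21)-(23)
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                cands.append(lead - A * np.abs(C * lead - X[i]))
            X[i] = np.clip(np.mean(cands, axis=0), lb, ub)  # Eq. (24)
    fit = np.array([f(x) for x in X])
    return X[np.argmin(fit)], fit.min()

best_x, best_f = gwo(lambda x: float(np.sum(x ** 2)), dim=2,
                     n_wolves=20, T=100, lb=-5.0, ub=5.0)
# best_f shrinks towards 0 as the pack closes in on the origin
```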

Simulations and Discussion
All simulations were carried out on an Intel Core i7 3.4 GHz with 8 GB DDR3 RAM, a 500 GB hard disk drive, Windows 7 64-bit, and Matlab® R2014b, and represent a flight time of 300 s. A sampling frequency of f_s = 30 Hz was selected, within the feasible sample time limit for a small quadrotor with a diameter around 50 cm [37].

Fitness Function and Optimization Methodology
ALO and GWO were used to iteratively improve a set of random potential solutions encoding the parameter set in Equation (11) by minimizing a representative aggregated cost function, Equation (25), which equally balances the objectives of set point tracking and control signal variation. The results were compared to PSO subject to the same cost function. Although many PSO variants are available [38], the conventional PSO [14] is the common choice for performance comparison [17,19] and is already available as an optimization solver in software such as Matlab®. The fitness of each potential solution (particle, ant, or wolf) was evaluated using Equation (25), considering a full quadrotor flight (see Figure 3) which explores all common VTOL movements (vertical take-off and landing and curves) during 300 s (see Table 5). Although no trajectory benchmark is available, most works consider rectangular, helical, elliptical, or mixed paths [9,13], in such a way that the main VTOL movements are explored. It is required that the same reference trajectory be used for all optimization algorithms. The proposed general fitness function J is the weighted sum

J = Σ_{i=1}^{4} w_i J_i  (25)

where each component J_i, i = 1, . . . , 4, combines a tracking error term and a control variation term, weighted by δ_{i1} and δ_{i2}, with e_z, e_φ, e_θ, and e_ψ defined as in Equation (6) and Table 3. To equally balance all components, w_i = 0.25 (i = 1, . . . , 4) and δ_{ij} = 0.5 (i = 1, . . . , 4; j = 1, 2). The control action constraints are given by the motor specifications [34], with the physical parameters presented in Table 2.
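In the spirit of the aggregated cost above, a hedged Python sketch is given below; the per-axis combination of squared error and total control variation, and the synthetic signals, are assumptions for illustration:

```python
import numpy as np

def axis_cost(e, u, d1=0.5, d2=0.5):
    """One component J_i: delta_i1 * ISE + delta_i2 * TV."""
    ise = float(np.sum(e ** 2))             # squared tracking error
    tv = float(np.sum(np.abs(np.diff(u))))  # control signal variation
    return d1 * ise + d2 * tv

def fitness(errors, controls, w=(0.25, 0.25, 0.25, 0.25)):
    """Aggregated J = sum_i w_i J_i over z, phi, theta, psi."""
    return sum(w_i * axis_cost(e, u)
               for w_i, e, u in zip(w, errors, controls))

t = np.linspace(0.0, 1.0, 100)
errors = [np.exp(-5.0 * t)] * 4            # decaying tracking errors
controls = [np.ones_like(t)] * 4           # constant control signals
J = fitness(errors, controls)              # TV = 0, so only ISE counts
```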

The same conditions were applied to the PSO algorithm [14], with cognitive and social constants c_1 = c_2 = 2 and an inertia weight linearly decayed from 0.9 to 0.4 along the number of iterations (T = 250). A pre-defined number of independent runs, N_runs = 30, was executed for each algorithm, a suitable and representative choice to assure variability [17,19]. Populations of size N = 100 were considered, and all initial populations differ among runs and algorithm types. The bounds for the common search space are:

The optimization results achieved are presented in Table 6 for the fitness values of Equation (25) and in Table 7 for the controller parameter sets. In Table 7, two sets are presented for each method: (i) the mean best values over all 30 runs; and (ii) the best set (which provided the best fitness value) throughout the N_runs = 30 independent runs. The indicators (metrics) used to compare the different algorithms are:
• Best: the minimum fitness function value F_i obtained over N_runs.
• Worst: the maximum fitness function value F_i obtained over N_runs.
• Median: the middle fitness function value F_i in a sorted list, obtained over N_runs; if there are two middle numbers (N_runs even), the median is their average.
• Mean: the average performance of the stochastic algorithm applied N_runs times, where F_i* is the optimal solution at the ith run.
• Standard deviation (Std): indicates the optimizer's stability and robustness, preferably as small as possible.
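These indicators can be computed as follows; the sample fitness values are made up for illustration:

```python
import numpy as np

def run_statistics(best_per_run):
    """Best/Worst/Median/Mean/Std of the per-run best fitness values."""
    f = np.asarray(best_per_run, dtype=float)
    return {"Best": f.min(), "Worst": f.max(),
            "Median": float(np.median(f)), "Mean": f.mean(),
            "Std": f.std(ddof=1)}          # sample standard deviation

stats = run_statistics([0.82, 0.79, 0.85, 0.80, 0.81, 0.83])
# even run count: Median is the average of the two middle values
```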
As shown in Figure 4a and by the Best measure in Table 6, the results achieved by PSO and GWO are almost the same. However, the exploratory behavior of GWO (see Figure 4b) is more suitable for a stable convergence (reflected by the Std and Mean indexes), outperforming PSO. Although the Best value for ALO is slightly greater, its Std value suggests a higher probability of obtaining similar results at each independent run. An equivalent convergence rate for the population mean fitness value may be verified in all algorithms, at around 200 iterations. In Table 7, there is an interesting similarity among PSO, ALO, and GWO for the roll angle, since this combination provides the best fitness value for each algorithm. The values for the switching component U_sm converged to the boundary limits, with the lower bound 0.1 for the gain of this component and the upper bound for δ. The pair (k, δ) has an intrinsic trade-off and affects disturbance rejection (robustness) and chattering. It is noteworthy that the highest λ found by ALO (using the mean controller parameter set) for the roll and pitch angles presents better disturbance rejection properties in the flight simulation shown in the next section. A simple comparison between Figures 2 and 5 gives a good understanding of how the used SMC technique reduces chattering.

Flight Simulation
With these parameters (Table 7), four Flight Plans (FP) were simulated (Table 8) covering common movements such as vertical take-off with yaw correction, hovering, maneuvering, and landing. Such movements usually explore non-ideal conditions [6], such as constant input disturbances, parameter variations, motor failure, and noise, here simulated as white Gaussian noise in the range of ±15 mV. A diversity of common paths (rectangular, elliptical, and helical) is also used [9,13]. These flight plans are described as follows, and, where not explicitly mentioned, all initial values are zero. The total performance indexes used in Tables 9 and 10 comprise the total tracking error (ISE_T), the total control effort (U_T), and the total control signal variation (TV_U), where f_s is the sampling frequency and t_sim is the total simulation time; Std represents the usual standard deviation (Equation (40)) of each control component U_i, i = 1, . . . , 4. The results are summarized in Tables 9 and 10. All simulations used the mean controller parameter set presented in Table 7, which is more representative of the stochastic optimization outcome. At least one hunting-based method outperforms PSO in terms of general tracking (ISE_T): both ALO and GWO when the mean controller parameters are used (Table 9), and ALO when the choice is the controller parameter set related to the best fitness values (Table 10). Although these benefits may require a more oscillatory control signal (TV_U) compared to PSO, the total control effort (U_T) is quite similar. Figures 6 and 7 show the ideal case with a zoom at the initial instants, highlighting that ALO provides faster tracking for all process variables. The worse performance of GWO for pitch and roll control in Figures 6 and 12 is explained by the smaller parameters λ_2 and λ_3. However, since the quadrotor control must be considered as a whole, ISE_T is still better for GWO than for PSO. For the disturbed cases in Figures 8-12, the overall best tracking of ALO and GWO is also verified.
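The tracking and control-effort indexes discussed above can be sketched as discrete-time approximations; the exact definitions in Equation (40) are not fully recoverable here, so the scalings by the sampling frequency f_s are assumptions:

```python
import numpy as np

FS = 30.0  # sampling frequency [Hz], as in the simulations

def ise(e, fs=FS):
    """Integral of Squared Error, approximated as sum(e^2) / fs."""
    return float(np.sum(e ** 2)) / fs

def control_effort(u, fs=FS):
    """Total control effort, approximated as sum(|u|) / fs."""
    return float(np.sum(np.abs(u))) / fs

def total_variation(u):
    """TV: accumulated control signal changes, sum |u(k+1) - u(k)|."""
    return float(np.sum(np.abs(np.diff(u))))

e = np.array([1.0, 0.5, 0.25, 0.0])   # sample error sequence
u = np.array([2.0, 2.0, 1.5, 1.5])    # sample control sequence
```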
The zoom in the top plot of Figure 8 shows the best recovery achieved when the system's physical parameters vary. If a motor failure is considered (Figure 10, roll plot), ALO outperforms PSO and GWO, for the same reason explained above for pitch and roll. Figure 12 presents only the altitude tracking under measurement noise, but the other process variables show similar behavior. In terms of tracking, both GWO and ALO are slightly better than PSO. This finding is in accordance with the results in [17,19,32,33]. In the two latter works, the Gravitational Search Algorithm [39] outperforms both, but it is not considered here since it is not an animal-behavior method but is inspired by physical laws of motion. For an actual digital implementation where robustness and a stable convergence rate must be emphasized [7,8], GWO may be recommended in its original formulation, since it presented a good balance between exploration and exploitation. Regarding ALO, non-systematic adaptation methods of the exploration rate along the search space have recently been proposed to outperform the original ALO [21]. However, here, this feature did not compromise the tracking results, due to the type of aggregated cost function and since the control must be considered for the four motors as a whole. In Figures 7-13, for illustrative purposes, the trajectories in Table 8 are controlled using the SMC with the parameters found by ALO (see Table 7). When the curves almost coincide with the set point, this is due to the SMC transient response, which is very fast and non-oscillatory, achieving this work's objective of more precise tracking with equivalent and feasible control signal effort and chattering compared to PSO.

Conclusions
This paper proposed the use of the ALO and GWO algorithms to solve the optimization problem of finding a suitable parameter set for quadrotor Sliding Mode control of altitude and attitude stabilization and tracking. As trial-and-error adjustment is the common practice, this work addresses a challenging and open issue due to the strong trade-off among the parameters related to set point tracking, while robustness is guaranteed and chattering is kept within feasible limits. Both techniques were simulated and compared with PSO, and both GWO and ALO outperformed PSO in terms of precise tracking, for ideal and disturbed conditions. Regarding the optimization process, since ALO did not achieve the same best fitness values found by GWO and PSO, this result stimulates further research on ALO improvements for control and robotics applications, following the path that PSO and other metaheuristic optimization algorithms have undergone. This finding is in accordance with the motivation for novel ALO approaches, such as the Dynamic ALO [21] and the Chaotic ALO [15,40], to be explored in future works. Moreover, a multi-objective approach can be explored, as well as hybrid optimization techniques, since ALO and GWO presented good results for different process variables. Besides, a practical implementation is intended in order to evaluate on-board (embedded) specific issues and to corroborate these simulation results.

Author contributions: O. designed the model and the computational framework, analysed the data, carried out the simulation, and wrote the original draft with input from all authors; J.B.-C., besides reviewing, contributed to funding acquisition; T.P. conducted simulation and visualization editing. All authors discussed the results and took part in the writing, review, and editing process. All authors have read and agreed to the published version of the manuscript.