Article

Adaptive-Dynamic-Programming-Based Robust Control for a Quadrotor UAV with External Disturbances and Parameter Uncertainties

1
Institute of Logistics Science and Engineering, Shanghai Maritime University, Shanghai 201306, China
2
School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(23), 12672; https://doi.org/10.3390/app132312672
Submission received: 5 November 2023 / Revised: 19 November 2023 / Accepted: 23 November 2023 / Published: 26 November 2023

Abstract

This work addresses the trajectory-tracking-control problem for a quadrotor unmanned aerial vehicle with external disturbances and parameter uncertainties. A novel adaptive-dynamic-programming-based robust control method is proposed to eliminate the effects of lumped uncertainties (including external disturbances and parameter uncertainties) and to ensure the approximate optimal control performance. Its novelty lies in that two radial basis function neural network observers with fixed-time convergence properties were first established to reconstruct the lumped uncertainties. Notably, they tune only the scalar parameters online and have low computational complexities. Subsequently, two actor–critic neural networks were designed to approximate the optimal cost functions and control policies for the nominal system. In this design, two new actor–critic neural network weight update laws are proposed to eliminate the persistent excitation condition. Then, two adaptive-dynamic-programming-based robust control laws were obtained by integrating the observer reconstruction information and the nominal control policies. The uniformly ultimately bounded stability of the closed-loop tracking control systems was ensured using the Lyapunov methodology. Finally, numerical results are shown to verify the effectiveness and superiority of the proposed control scheme.

1. Introduction

In recent years, unmanned aerial vehicles (UAVs) have attracted significant research interest on account of their wide range of applications in both the civilian and military fields, such as agricultural production, logistics and distribution, urban management, border blockade prevention, reconnaissance, surveillance, etc. [1,2,3,4,5]. Compared with other flight vehicles, a quadrotor UAV has many significant advantages, including a simple structure, vertical take-off and landing, a low manufacturing cost, and so on [6]. However, the quadrotor UAV flight control system is an underactuated system, and its dynamic model possesses the characteristics of nonlinearity and a strong coupling of the translational and rotational dynamics. Moreover, it is very sensitive to disturbances, including external gust disturbances, aerodynamic effects, model uncertainties, and so on. Considering the trajectory tracking control of a quadrotor UAV as the basis for the previously mentioned applications, a series of extensive studies has been undertaken on this subject. For practical applications, linear control schemes have been adopted to linearize the complex dynamic model to ensure the basic operation of a quadrotor UAV, such as proportional–integral–derivative (PID) control [7] and linear quadratic regulator (LQR) control [8]. However, the linear model can cause performance degradations when performing complex tasks and when in unknown environments. In order to improve the tracking performance, various nonlinear control strategies have been further studied, including sliding mode control [9,10], backstepping control [11], adaptive control [12], and so on. However, the control schemes mentioned above only focus on the stability of the system and do not consider the control cost. Since mobile vehicles can carry limited energy equipment, it is necessary to consider the control cost and energy optimization [13,14]. 
Therefore, it is of great significance to design a tracking control scheme that can not only stabilize the quadrotor UAV flight control system, but also minimize its control cost.
Adaptive dynamic programming (ADP) is an optimization-theory-based control method and achieves a balance between the control cost and control performance by adaptively adjusting the control policy. As an effective method for solving the optimal control problems of nonlinear systems, it has received much attention in its related fields. Werbos et al. [15] first proposed a heuristic dynamic programming algorithm based on reinforcement learning and designed a preliminary online learning control framework. Vamvoudakis et al. [16] proposed an online algorithm based on actor–critic neural networks (AC NNs) to learn the optimal control solutions for nonlinear systems of known dynamics with an infinite horizon cost. The AC NNs were used to approximate the control policy and cost function, respectively. Kamalapurkar et al. [17] investigated an ADP-based optimal tracking problem for nonlinear systems. Wang et al. [18] solved the robust stabilization problem via adaptive-critic-based techniques. Wen et al. [19] introduced a new optimized backstepping control method based on ADP and backstepping technologies, which solved the tracking control problem of strict feedback systems. Furthermore, ADP technology has also been extensively researched in fault-tolerant control [20], zero-sum games [21], multi-agent systems [22], etc. However, most of the existing weight update laws are required to satisfy persistent excitation (PE) conditions. It is worth mentioning that some scholars have performed successive studies to avoid the use of PE conditions and make ADP technology easier to implement in practical engineering. In [23,24], the experience replay technique was utilized to overcome the PE condition for weight convergence. Dong et al. [25] proposed a simplified online learning strategy, which removed the PE conditions via the concurrent learning method. Liu et al. [26] designed a novel weight update law by utilizing adaptive technology, which removed the PE condition. 
Unfortunately, most of the existing methods still rely on real data. Thus, how to effectively avoid the use of PE conditions is still an open problem.
Meanwhile, uncertainties can seriously degrade the performance of a quadrotor UAV flight control system. Uncertainties usually have time-varying characteristics, and the ADP technique lacks the ability to tackle time-varying signals [26,27]. Thus, it alone cannot be utilized to achieve high-quality control of quadrotor UAVs. Therefore, integrating the uncertainty attenuation technique with the ADP technique is of great significance in solving the control problems of quadrotor UAVs. In recent years, due to their strong fitting ability, neural networks (NNs) have provided some promising approaches to the uncertainty attenuation problem in practical systems such as quadrotor UAVs [28,29,30,31,32]. Zhao et al. [30] proposed an NN-based adaptive control scheme for the unknown and continuous dynamics of quadrotor UAVs, which cannot achieve fast convergence characteristics. To improve the convergence speed of the control system, some scholars have integrated finite-time control techniques with NN-based techniques. Wang et al. [31] proposed a control scheme based on a finite-time multivariate NN interference observer for helicopter systems. Liu et al. [32] proposed an adaptive NN-learning-based control scheme with finite-time convergence characteristics for the quadrotor UAV fault-tolerant control problem. However, the performance of the finite-time control techniques depends heavily on the initial state errors or observation errors, which limits the scope of their application in quadrotor UAVs. Therefore, combining NN-based techniques with fixed-time techniques for application in a quadrotor UAV control system requires further research. In addition, NN units commonly directly approximate the uncertainties in the above-mentioned studies, which leads to a large computational load and parameter explosion.
Motivated by the aforementioned research, this paper investigates the control of a quadrotor UAV under external disturbances and parameter uncertainties. The investigation focuses on tracking control by integrating the NN-based observer technique with the ADP-based control technique, so that the resulting control laws are both robust and approximately optimal. The main contributions are outlined below:
(1)
Two fixed-time NN-based observers (FTNNOBs) are developed to compensate the control inputs of the quadrotor UAV; they estimate the external disturbances and parameter uncertainties in a fixed time. Different from the traditional NN approximators designed in [33,34,35], the FTNNOBs only need to adjust scalar parameters rather than weight vectors or matrices, which gives them a simple structure and a low computational cost.
(2)
A novel ADP-based robust control scheme is proposed by combining the ADP technique with the estimated information from the FTNNOBs, which improves the tracking control accuracy and reduces the control cost. Different from the existing weight update laws of AC NNs [16,25], two novel weight update laws are introduced in this work. They not only have a simple structure, but are also independent of the PE condition. Moreover, two auxiliary terms related to the system states are introduced to improve the data utilization efficiency of the AC NNs and make the training results more effective.
The rest of the paper is organized as follows. Section 2 introduces the dynamics model of a quadrotor UAV and derives the position and attitude error systems. Section 3 describes the design process of the ADP-based robust control scheme and the stability analysis of the overall closed-loop system. Section 4 introduces the simulation results. Section 5 presents the conclusions of this paper.
Notations 1. 
$\|\cdot\|$ denotes the Euclidean norm of vectors and matrices. $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ represent the minimum and maximum eigenvalues of a symmetric matrix, respectively. For any $x = [x_1, x_2, \ldots, x_n]^T$, we define $\mathrm{sig}^{\eta}(x) = [|x_1|^{\eta}\,\mathrm{sgn}(x_1), \ldots, |x_n|^{\eta}\,\mathrm{sgn}(x_n)]^T$, where $\eta \in \mathbb{R}^{+}$ and $\mathrm{sgn}(\cdot)$ denotes the sign function. $I_n$ represents the identity matrix of size $n$.
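As an illustration of the notation, the element-wise operator $\mathrm{sig}^{\eta}(\cdot)$ can be sketched in a few lines of Python. The function name `sig` and the list-based vector representation are choices made here for illustration only; they are not taken from the paper.

```python
import math

def sig(x, eta):
    """Element-wise |x_j|**eta * sgn(x_j) for a list of floats (eta > 0)."""
    # copysign restores the sign after the magnitude is raised to eta;
    # zero entries are mapped to 0.0 explicitly.
    return [math.copysign(abs(v) ** eta, v) if v != 0.0 else 0.0 for v in x]
```

For example, `sig([-8.0, 0.0, 27.0], 1/3)` is approximately `[-2.0, 0.0, 3.0]`: the magnitude is compressed while the sign of each entry is preserved.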

2. Model Description and Transformation

2.1. Dynamics Model

The mechanical structure of a quadrotor UAV and the coordinate reference system are shown in Figure 1. $\varsigma_E\{o_E x_E y_E z_E\}$ is the Earth-fixed inertial frame, and $\varsigma_B\{o_B x_B y_B z_B\}$ denotes the body-fixed frame. In the frame $\varsigma_E$, $\zeta = [x, y, z]^T \in \mathbb{R}^3$ represents the position state vector, and $\Theta = [\phi, \theta, \psi]^T \in \mathbb{R}^3$ is the Euler angle vector, in which $\phi$, $\theta$, and $\psi$ denote the roll, pitch, and yaw angles. $\eta = [u, v, w]^T \in \mathbb{R}^3$ and $\Omega = [p, q, r]^T \in \mathbb{R}^3$ are the linear and angular velocity vectors under the frame $\varsigma_B$, respectively. Furthermore, the kinematic transformation relationships of the quadrotor UAV between the frames $\varsigma_E$ and $\varsigma_B$ are introduced as [32]:
$$\dot{\zeta} = R_B^E \eta, \quad \dot{\Theta} = \Phi_B^E \Omega, \tag{1}$$
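where the transformation matrices $R_B^E$ and $\Phi_B^E$ from the frame $\varsigma_B$ to $\varsigma_E$ can be expressed as
$$R_B^E = \begin{bmatrix} C_\theta C_\psi & S_\phi S_\theta C_\psi - C_\phi S_\psi & C_\phi S_\theta C_\psi + S_\phi S_\psi \\ C_\theta S_\psi & S_\phi S_\theta S_\psi + C_\phi C_\psi & C_\phi S_\theta S_\psi - S_\phi C_\psi \\ -S_\theta & S_\phi C_\theta & C_\phi C_\theta \end{bmatrix},$$
$$\Phi_B^E = \begin{bmatrix} 1 & S_\phi T_\theta & C_\phi T_\theta \\ 0 & C_\phi & -S_\phi \\ 0 & S_\phi / C_\theta & C_\phi / C_\theta \end{bmatrix},$$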
with $S_{(\cdot)}$, $C_{(\cdot)}$, and $T_{(\cdot)}$ denoting $\sin(\cdot)$, $\cos(\cdot)$, and $\tan(\cdot)$, respectively. The dynamical model [36,37] with respect to the frame $\varsigma_B$ can be established as
$$m\dot{\eta} = -\Omega \times m\eta - F_f + F_g + U_f + D_f, \quad J\dot{\Omega} = -\Omega \times J\Omega - M_t + G_a + U_t + D_t, \tag{2}$$
where $m$ is the total mass of the quadrotor UAV. $J = \mathrm{diag}\{J_x, J_y, J_z\} \in \mathbb{R}^{3\times 3}$ represents the inertia matrix. $F_f = [k_1 u, k_2 v, k_3 w]^T \in \mathbb{R}^3$ and $M_t = [k_4 p, k_5 q, k_6 r]^T \in \mathbb{R}^3$ represent the aerodynamic friction vectors with $k_1, k_2, \ldots, k_6$ being the aerodynamic friction coefficients. $U_f = [0, 0, f]^T \in \mathbb{R}^3$ is the total thrust vector, and $U_t = [U_{t1}, U_{t2}, U_{t3}]^T \in \mathbb{R}^3$ denotes the control torque vector. $D_f = [D_{f1}, D_{f2}, D_{f3}]^T \in \mathbb{R}^3$ and $D_t = [D_{t1}, D_{t2}, D_{t3}]^T \in \mathbb{R}^3$ denote the external disturbance vectors of the quadrotor UAV.
$F_g = (R_B^E)^{-1}[0, 0, -mg]^T \in \mathbb{R}^3$ denotes the gravity vector with $g$ being the gravitational acceleration. $G_a = [J_r q\bar{\omega}, -J_r p\bar{\omega}, 0]^T \in \mathbb{R}^3$ represents the gyroscopic moment vector, where $J_r$ is the inertia of each motor, and $\bar{\omega} = \omega_1 - \omega_2 + \omega_3 - \omega_4$ is the total residual rotor speed. Because the rotor speeds are not accessible, $G_a$ is treated as an additional disturbance during the control design [38]. In addition, $(\cdot)^{\times}$ denotes the skew-symmetric matrix satisfying
$$x^{\times} = \begin{bmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{bmatrix}, \quad x = [x_1, x_2, x_3]^T \in \mathbb{R}^3.$$
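The kinematic matrices and the skew-symmetric operator above can be sketched as follows. This is a minimal sketch that assumes the standard Z-Y-X Euler-angle convention for $R_B^E$ (so that the matrix is orthogonal); the function names and the nested-list matrix representation are illustrative choices, not part of the paper.

```python
import math

def rotation_body_to_earth(phi, theta, psi):
    """Z-Y-X Euler-angle rotation matrix R_B^E (body frame to Earth frame)."""
    s, c = math.sin, math.cos
    return [
        [c(theta) * c(psi),
         s(phi) * s(theta) * c(psi) - c(phi) * s(psi),
         c(phi) * s(theta) * c(psi) + s(phi) * s(psi)],
        [c(theta) * s(psi),
         s(phi) * s(theta) * s(psi) + c(phi) * c(psi),
         c(phi) * s(theta) * s(psi) - s(phi) * c(psi)],
        [-s(theta), s(phi) * c(theta), c(phi) * c(theta)],
    ]

def euler_rate_matrix(phi, theta):
    """Phi_B^E, mapping body angular velocity to Euler-angle rates (cos(theta) != 0)."""
    s, c, t = math.sin, math.cos, math.tan
    return [
        [1.0, s(phi) * t(theta), c(phi) * t(theta)],
        [0.0, c(phi), -s(phi)],
        [0.0, s(phi) / c(theta), c(phi) / c(theta)],
    ]

def skew(x):
    """Skew-symmetric matrix x^x such that skew(x) @ y equals the cross product x x y."""
    x1, x2, x3 = x
    return [[0.0, -x3, x2], [x3, 0.0, -x1], [-x2, x1, 0.0]]

def matvec(a, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]
```

A quick sanity check is that the rows of `rotation_body_to_earth` are orthonormal for any angles, and that `matvec(skew(x), y)` reproduces the cross product of `x` and `y`.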
Consider additive parameter uncertainties in the dynamics (2); the model parameters are assumed to be $m = m_0 + \Delta m$, $J_x = J_{x0} + \Delta J_x$, $J_y = J_{y0} + \Delta J_y$, $J_z = J_{z0} + \Delta J_z$, $k_1 = k_{10} + \Delta k_1$, $k_2 = k_{20} + \Delta k_2$, $k_3 = k_{30} + \Delta k_3$, $k_4 = k_{40} + \Delta k_4$, $k_5 = k_{50} + \Delta k_5$, and $k_6 = k_{60} + \Delta k_6$. $m_0$ denotes the nominal total mass. $J_0 = \mathrm{diag}\{J_{x0}, J_{y0}, J_{z0}\}$ denotes the nominal inertia matrix. $k_{10}, k_{20}, \ldots, k_{60}$ denote the nominal values of the aerodynamic friction coefficients, and $\Delta *$ denotes the corresponding parameter uncertainties. Combining (1) and (2), one can further obtain the position and attitude subsystems in the form of
$$\dot{x}_{11} = x_{12}, \quad \dot{x}_{12} = -gq - P_1 x_{12} + \frac{1}{m_0}(u + d_1), \tag{3}$$
$$\dot{x}_{21} = x_{22}, \quad \dot{x}_{22} = -P_2 x_{22} + J_0^{-1}(\tau + d_2) \tag{4}$$
with $x_{11} = \zeta$, $x_{12} = \dot{\zeta}$, $x_{21} = \Theta$, $x_{22} = \dot{\Theta}$, and $q = [0, 0, 1]^T$. $u = R_B^E U_f$ and $\tau = \Phi_B^E U_t$ represent the control inputs under the frame $\varsigma_E$. $P_1$ and $P_2$ are defined as
$$P_1 = \frac{1}{m_0}\mathrm{diag}\{k_{10}, k_{20}, k_{30}\}, \quad P_2 = -\dot{\Phi}_B^E(\Phi_B^E)^{-1} + J_0^{-1}F_1 + \Phi_B^E J_0^{-1}\big[(\Phi_B^E)^{-1}\dot{\Theta}\big]^{\times} J_0 (\Phi_B^E)^{-1}$$
with $F_1 = \mathrm{diag}\{k_{40}, k_{50}, k_{60}\}$. $d_1 = u_d - f_p$ and $d_2 = \tau_d - f_a$ represent the lumped uncertainties, where $u_d = R_B^E D_f$ and $\tau_d = \Phi_B^E D_t$; the parameter uncertainties $f_p$ and $f_a$ are defined as
$$f_p = \Delta m\, g q + \Delta m\, \ddot{\zeta} + \Delta P_1 \dot{\zeta}, \quad f_a = \Delta J \ddot{\Theta} + \Delta P_3 \dot{\Theta} + \Delta F_1 \dot{\Theta}$$
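with
$$\Delta P_1 = \mathrm{diag}\{\Delta k_1, \Delta k_2, \Delta k_3\}, \quad \Delta F_1 = \mathrm{diag}\{\Delta k_4, \Delta k_5, \Delta k_6\},$$
$$\Delta P_3 = \Phi_B^E \Delta J (\dot{\Phi}_B^E)^{-1} + \Phi_B^E \Delta J \big[(\Phi_B^E)^{-1}\dot{\Theta}\big]^{\times} + F_2 (\Phi_B^E)^{-1}, \quad F_2 = \begin{bmatrix} 0 & -J_r\bar{\omega} & 0 \\ J_r\bar{\omega} & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Since $\Delta P_3$ is a Coriolis-like matrix, we have the following result according to the properties of the Coriolis matrix [39].
Property 1. 
$\|\Delta P_3\| \le \alpha_1 \|\dot{\Theta}\|$ with $\alpha_1$ being a positive constant.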

2.2. Model Transformation

Define the reference trajectories of the position and attitude subsystems as $\zeta_d = [x_d, y_d, z_d]^T$ and $\Theta_d = [\phi_d, \theta_d, \psi_d]^T$. Due to the under-actuated characteristics of a quadrotor UAV, the desired roll and pitch angles can be generated by
$$\phi_d = \arcsin\!\left(\frac{u_{q1}\sin\psi_d - u_{q2}\cos\psi_d}{\|u_q\|}\right), \quad \theta_d = \arctan\!\left(\frac{u_{q1}\cos\psi_d + u_{q2}\sin\psi_d}{u_{q3}}\right),$$
where $u_q = u/m_0$ with $u_q = [u_{q1}, u_{q2}, u_{q3}]^T$.
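The attitude-extraction step can be sketched as a small function. This is a minimal sketch, assuming the formula reads as the usual yaw-rotated projection of the commanded acceleration $u_q$ (a minus sign in the roll numerator and a plus sign in the pitch numerator); the function name `desired_roll_pitch` is hypothetical.

```python
import math

def desired_roll_pitch(u_q, psi_d):
    """Desired roll and pitch from the commanded acceleration u_q = u / m0 and desired yaw psi_d."""
    uq1, uq2, uq3 = u_q
    norm = math.sqrt(uq1 ** 2 + uq2 ** 2 + uq3 ** 2)
    # Roll: lateral component of u_q seen in the yaw-rotated frame, normalized by ||u_q||.
    phi_d = math.asin((uq1 * math.sin(psi_d) - uq2 * math.cos(psi_d)) / norm)
    # Pitch: forward component of u_q relative to its vertical component.
    theta_d = math.atan((uq1 * math.cos(psi_d) + uq2 * math.sin(psi_d)) / uq3)
    return phi_d, theta_d
```

With a purely vertical command such as `u_q = (0.0, 0.0, 9.81)`, both angles are zero for any desired yaw, which matches the hover case.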
Then, the tracking errors can be represented as $\zeta_e = [x_e, y_e, z_e]^T = \zeta - \zeta_d$ and $\Theta_e = [\phi_e, \theta_e, \psi_e]^T = \Theta - \Theta_d$. Combining these with (3) and (4), one can further obtain the error subsystems:
$$\dot{e}_{11} = e_{12}, \quad \dot{e}_{12} = -gq - P_1 e_{12} + \frac{1}{m_0}(u + d_1) - \ddot{\zeta}_d - P_1\dot{\zeta}_d, \tag{5}$$
$$\dot{e}_{21} = e_{22}, \quad \dot{e}_{22} = -P_2 e_{22} + J_0^{-1}(\tau + d_2) - \ddot{\Theta}_d - P_2\dot{\Theta}_d, \tag{6}$$
where $e_{11} = \zeta_e$, $e_{12} = \dot{\zeta}_e$, $e_{21} = \Theta_e$, and $e_{22} = \dot{\Theta}_e$. For the subsequent optimal controller design, divide the control inputs $u$ and $\tau$ into two parts (i.e., $u = u_0 + u_l$, $\tau = \tau_0 + \tau_l$). By setting the two feedforward controllers $u_l = m_0(gq + \ddot{\zeta}_d + P_1\dot{\zeta}_d)$ and $\tau_l = J_0(\ddot{\Theta}_d + P_2\dot{\Theta}_d)$, (5) and (6) can be formulated as
$$\dot{e}_i = f(e_i) + g_i(\chi_i + d_i). \tag{7}$$
In (7), $i = 1, 2$ index the position and attitude error subsystems, respectively. In addition, $e_i = [e_{i1}^T, e_{i2}^T]^T$, $\chi_1 = u_0$, $\chi_2 = \tau_0$, and
$$f(e_i) = \begin{bmatrix} e_{i2} \\ -P_i e_{i2} \end{bmatrix}, \quad g_1 = \begin{bmatrix} 0_{3\times 3} \\ \frac{1}{m_0} I_3 \end{bmatrix}, \quad g_2 = \begin{bmatrix} 0_{3\times 3} \\ J_0^{-1} \end{bmatrix}.$$
On the basis of the above transformation of the dynamical model, the UAV trajectory tracking control problem is transformed into a stabilization problem. To this end, we first design FTNNOBs to estimate the uncertainties in a fixed time and to provide online compensation for the approximate optimal control policies. Then, the approximate optimal control policies for the nominal systems $\dot{e}_i = f(e_i) + g_i\chi_i$ are obtained using the ADP technique. The block diagram of the control systems is shown in Figure 2.
Lemma 1 
([40]). Consider a nonlinear system in the form:
$$\dot{x} = f(x), \quad f(0) = 0, \tag{8}$$
where $f: U_0 \to \mathbb{R}^n$ is continuous in an open neighborhood $U_0$ of the origin. For the system (8), if a Lyapunov function $V(x)$ satisfies $\dot{V}(x) \le -\alpha V^{p}(x) - \beta V^{q}(x) + \vartheta$, where $\alpha$, $\beta$, and $\vartheta$ are positive constants, and $0 < p < 1$, $q > 1$, then the trajectory of the system is practical fixed-time stable within the settling time $t_f$, which satisfies $t_f \le \frac{1}{\alpha\omega(1-p)} + \frac{1}{\beta\omega(q-1)}$ with $0 < \omega < 1$. In addition, the residual set is given by $\bar{\vartheta} = \left\{x \,\middle|\, V(x) \le \min\left\{\left(\frac{\varrho}{\alpha}\right)^{\frac{1}{p}}, \left(\frac{\varrho}{\beta}\right)^{\frac{1}{q}}\right\}\right\}$ with $\varrho = \frac{\vartheta}{1-\omega}$.
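To make the bounds in Lemma 1 concrete, the settling-time estimate and the residual level can be evaluated numerically. The helper below is an illustrative sketch; its name and argument order are assumptions, and it simply evaluates the two expressions stated in the lemma.

```python
def fixed_time_bounds(alpha, beta, p, q, vartheta, omega):
    """Return (settling-time bound, residual level on V) from the practical fixed-time lemma."""
    if not (alpha > 0 and beta > 0 and vartheta > 0):
        raise ValueError("alpha, beta, and vartheta must be positive")
    if not (0 < p < 1 and q > 1 and 0 < omega < 1):
        raise ValueError("need 0 < p < 1, q > 1, and 0 < omega < 1")
    # Settling-time bound: independent of the initial condition.
    t_f = 1.0 / (alpha * omega * (1.0 - p)) + 1.0 / (beta * omega * (q - 1.0))
    # Residual level: V(x) <= min{(rho/alpha)^(1/p), (rho/beta)^(1/q)}.
    rho = vartheta / (1.0 - omega)
    residual = min((rho / alpha) ** (1.0 / p), (rho / beta) ** (1.0 / q))
    return t_f, residual
```

For instance, with `alpha = beta = 2`, `p = 0.5`, `q = 1.5`, `vartheta = 0.1`, and `omega = 0.5`, the settling-time bound is 4 and the residual level is about 0.01. Note that the bound does not depend on the initial state, which is the property that distinguishes fixed-time from finite-time convergence.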
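Lemma 2 
([41]). For any $x, y \in \mathbb{R}$, $|x + y|^{v_1} \le 2^{v_1 - 1}\left(|x|^{v_1} + |y|^{v_1}\right)$, if $v_1 \in \mathbb{R}^{+}$ and $v_1 > 1$.
Lemma 3 
([37]). For any $x, y \in \mathbb{R}$, $(|x| + |y|)^{v_2} \le |x|^{v_2} + |y|^{v_2}$ with $0 < v_2 \le 1$.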
Assumption 1. 
Without loss of generality, we make the following assumption from [42,43]. The parameter uncertainties $f_p$ and $f_a$ are bounded by the following functions:
$$\|f_p\| \le \alpha_2 + \alpha_3\|\zeta\| + \alpha_4\|\dot{\zeta}\|^2, \quad \|f_a\| \le \alpha_5 + \alpha_6\|\Theta\| + \alpha_7\|\dot{\Theta}\|^2,$$
where $\alpha_2$, $\alpha_3$, $\alpha_4$, $\alpha_5$, $\alpha_6$, and $\alpha_7$ are unknown but bounded positive constants.
Assumption 2. 
The external continuous disturbances $u_d$, $\tau_d$ under the frame $\varsigma_E$ and the system matrices $g_i$ are norm-bounded (i.e., $\|u_d\| \le \bar{u}_d$, $\|\tau_d\| \le \bar{\tau}_d$, and $\|g_i\| \le \bar{g}_i$), where $\bar{u}_d$, $\bar{\tau}_d$, and $\bar{g}_i$ are positive constants.

3. Adaptive-Dynamic-Programming-Based Robust Control Design

3.1. Online-Uncertainty-Compensation-Based Fixed-Time NN-Based Observers

Consider the systems (7) with uncertainties; the auxiliary systems are introduced as
$$\dot{\hat{e}}_i = f(e_i) + g_i(\chi_i + \hat{d}_i),$$
where $\hat{d}_i$ are the estimated values of the lumped uncertainties. Define the auxiliary variables as $\Xi_i = g_i^{+}(e_i - \hat{e}_i)$ with $g_i^{+}$ being the pseudo-inverse matrices of $g_i$. Considering Assumptions 1 and 2, this yields
$$\|d_i\| \le h_i H_i(S_i) = \xi_i(S_i)$$
with
$$h_1 = \max\{\bar{u}_d, \alpha_2, \alpha_3, \alpha_4\}, \quad h_2 = \max\{\bar{\tau}_d, \alpha_5, \alpha_6, \alpha_7\}, \quad H_1(S_1) = 1 + \|\zeta\| + \|\dot{\zeta}\|^2, \quad H_2(S_2) = 1 + \|\Theta\| + \|\dot{\Theta}\|^2,$$
where $H_i(S_i)$ are defined as the core functions [44]. Furthermore, they are dependent on $S_1 = [\|\zeta\|, \|\dot{\zeta}\|, \|\dot{\zeta}\|^2]^T$ and $S_2 = [\|\Theta\|, \|\dot{\Theta}\|, \|\dot{\Theta}\|^2]^T$. As pointed out in [45,46], the continuous nonlinear scalar functions $\xi_i(S_i)$ can be approximated using NNs with Gaussian basis functions:
$$\xi_i(S_i) = W_{fi}^{T} Z(S_i) + \varepsilon_{fi},$$
where $Z(S_i) = [Z_1(S_i), Z_2(S_i), \ldots, Z_n(S_i)]^T$ are the activation function vectors. Furthermore, $Z_n(S_i)$ are expressed in detail as
$$Z_n(S_i) = \exp\!\left(-\frac{\|S_i - \iota_{in}\|^2}{h_{in}^2}\right),$$
where $\iota_{in}$ and $h_{in}$ are the center vectors and widths, and $n$ is the number of Gaussian basis functions. It is straightforward to show that $Z(S_i)$ have norm upper bounds $\epsilon_i$, that is, $\|Z(S_i)\| < \epsilon_i < \infty$ with $\epsilon_i$ being positive constants. $W_{fi}$ denote the optimal weight vectors, and $\varepsilon_{fi}$ are the approximation errors, in which $W_{fi}$ and $\varepsilon_{fi}$ are bounded, i.e., $\|W_{fi}\| \le \bar{W}_{fi}$ and $|\varepsilon_{fi}| \le \bar{\varepsilon}_{fi}$.
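The Gaussian activation vector can be sketched as below. The function name, the list-of-lists layout for the centers, and the example values are illustrative assumptions; only the exponential form follows the text. Because every entry lies in $(0, 1]$, the norm of the vector is at most $\sqrt{n}$, which is one way to see that a finite bound $\epsilon_i$ exists.

```python
import math

def gaussian_basis(S, centers, widths):
    """Activation vector Z(S) with entries exp(-||S - center_j||^2 / width_j^2)."""
    z = []
    for center, h in zip(centers, widths):
        sq_dist = sum((s - c) ** 2 for s, c in zip(S, center))
        z.append(math.exp(-sq_dist / (h * h)))
    return z
```

For example, evaluating the basis at an input that coincides with one of the centers returns exactly 1.0 in that entry and smaller positive values elsewhere.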
In order to reduce the number of parameters for online calculation, the following procedure is designed. Firstly, we have
$$\|d_i\| \le \xi_i(S_i) \le \|W_{fi}\|\,\|Z(S_i)\| + |\varepsilon_{fi}| \le b_i Z_{fi}$$
with $b_i = \max\{\|W_{fi}\|, |\varepsilon_{fi}|\}$ and $Z_{fi} = 1 + \|Z(S_i)\|$.
Then, the FTNNOBs are established as
$$\hat{d}_i = r_{i3}\,\mathrm{sig}^{\frac{m_i}{n_i}}(\Xi_i) + r_{i4}\,\mathrm{sig}^{\frac{p_i}{q_i}}(\Xi_i) + c_i \hat{l}_i \Xi_i Z_{fi}^2, \tag{10}$$
where $m_i$, $n_i$, $p_i$, and $q_i$ are positive odd integers satisfying $m_i < n_i$ and $p_i > q_i$. $r_{i3}$, $r_{i4}$, and $c_i$ are positive constants. $\hat{l}_i$ are the estimated values of $l_i = b_i^2$, which are tuned by the following adaptive laws:
$$\dot{\hat{l}}_i = -\varrho_{i1}\hat{l}_i + c_i\|\Xi_i\|^2 Z_{fi}^2, \quad \hat{l}_i(0) \ge 0 \tag{11}$$
with $\varrho_{i1}$ being positive constants. The estimation errors of $l_i$ are defined as $\tilde{l}_i = l_i - \hat{l}_i$. We have the following result for the proposed procedure.
Lemma 4. 
Based on the proposed adaptive laws (11), there exist positive constants $\bar{l}_i$ such that $\hat{l}_i \le \bar{l}_i$ and $l_i \le \bar{l}_i$.
Proof. 
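Select the following positive definite Lyapunov functions:
$$L_{i1} = \frac{1}{2}\Xi_i^T\Xi_i + \frac{1}{2}\tilde{l}_i^2.$$
Substituting (10) and (11) into the time derivative of $L_{i1}$ yields
$$\dot{L}_{i1} \le b_i Z_{fi}\|\Xi_i\| - r_{i3}\|\Xi_i\|^{2\frac{m_i+n_i}{2n_i}} - c_i\hat{l}_i\|\Xi_i\|^2 Z_{fi}^2 - c_i\tilde{l}_i\|\Xi_i\|^2 Z_{fi}^2 + \varrho_{i1}\tilde{l}_i\hat{l}_i. \tag{12}$$
According to Young’s inequality [47], one has
$$xy \le \frac{\check{\alpha}^{a}}{a}|x|^{a} + \frac{1}{p\,\check{\alpha}^{p}}|y|^{p},$$
where $x, y \in \mathbb{R}$, $\check{\alpha} > 0$, $a > 1$, $p > 1$, and $(a-1)(p-1) = 1$. Then, one has
$$b_i Z_{fi}\|\Xi_i\| \le c_i b_i^2\|\Xi_i\|^2 Z_{fi}^2 + \frac{1}{4c_i}, \quad \varrho_{i1}\tilde{l}_i\hat{l}_i \le -\frac{\varrho_{i1}}{2}\tilde{l}_i^2 + \frac{\varrho_{i1}}{2}l_i^2. \tag{13}$$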
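For readers who want to verify these two bounds directly, the following short derivation (a sketch using only the definitions $l_i = b_i^2$ and $\tilde{l}_i = l_i - \hat{l}_i$) obtains the first by completing the square and the second by expanding $\hat{l}_i = l_i - \tilde{l}_i$:

```latex
\begin{aligned}
0 &\le \Big(\sqrt{c_i}\, b_i Z_{fi}\|\Xi_i\| - \tfrac{1}{2\sqrt{c_i}}\Big)^2
   = c_i b_i^2 \|\Xi_i\|^2 Z_{fi}^2 - b_i Z_{fi}\|\Xi_i\| + \tfrac{1}{4c_i}
   \;\Longrightarrow\;
   b_i Z_{fi}\|\Xi_i\| \le c_i b_i^2 \|\Xi_i\|^2 Z_{fi}^2 + \tfrac{1}{4c_i}, \\
\tilde{l}_i \hat{l}_i &= \tilde{l}_i (l_i - \tilde{l}_i)
   = \tilde{l}_i l_i - \tilde{l}_i^2
   \le \tfrac{1}{2}\tilde{l}_i^2 + \tfrac{1}{2} l_i^2 - \tilde{l}_i^2
   = -\tfrac{1}{2}\tilde{l}_i^2 + \tfrac{1}{2} l_i^2 .
\end{aligned}
```

The term $c_i b_i^2\|\Xi_i\|^2 Z_{fi}^2$ produced by the first bound is exactly what the two adaptive terms $-c_i(\hat{l}_i + \tilde{l}_i)\|\Xi_i\|^2 Z_{fi}^2 = -c_i l_i\|\Xi_i\|^2 Z_{fi}^2$ cancel in the Lyapunov derivative.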
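According to Lemma 3 and substituting (13) into (12), one can obtain that
$$\begin{aligned} \dot{L}_{i1} \le{}& -r_{i3}2^{a_{i2}}\Big(\frac{1}{2}\|\Xi_i\|^2\Big)^{a_{i2}} - \varrho_{i1}2^{a_{i2}}\Big(\frac{1}{2}\tilde{l}_i^2\Big)^{a_{i2}} + \varrho_{i1}\big(\tilde{l}_i^2\big)^{a_{i2}} - \varrho_{i1}\tilde{l}_i^2 + \frac{\varrho_{i1}}{2}l_i^2 + \frac{1}{4c_i} \\ \le{}& -\pi_{i1}L_{i1}^{a_{i2}} - \varrho_{i1}\tilde{l}_i^2 + \varrho_{i1}\big(\tilde{l}_i^2\big)^{a_{i2}} + \frac{\varrho_{i1}}{2}l_i^2 + \frac{1}{4c_i}. \end{aligned}$$
If $\tilde{l}_i^2 \ge 1$, one can obtain that
$$\varrho_{i1}\big(\tilde{l}_i^2\big)^{a_{i2}} - \varrho_{i1}\tilde{l}_i^2 \le 0.$$
If $\tilde{l}_i^2 < 1$, one can obtain that
$$0 \le \varrho_{i1}\big(\tilde{l}_i^2\big)^{a_{i2}} - \varrho_{i1}\tilde{l}_i^2 \le a_{i1},$$
where
$$\pi_{i1} = \min\{r_{i3}2^{a_{i2}}, \varrho_{i1}2^{a_{i2}}\}, \quad a_{i1} = \varrho_{i1}\Big(a_{i2}^{\frac{a_{i2}}{1-a_{i2}}} - a_{i2}^{\frac{1}{1-a_{i2}}}\Big)$$
with $a_{i2} = \frac{m_i+n_i}{2n_i}$. Then, it can be further deduced that $\dot{L}_{i1} \le -\pi_{i1}L_{i1}^{a_{i2}} + \vartheta_{i0}$ with $\vartheta_{i0} = a_{i1} + \frac{\varrho_{i1}}{2}l_i^2 + \frac{1}{4c_i}$. Hence, $\dot{L}_{i1} < 0$ whenever $\pi_{i1}L_{i1}^{a_{i2}} > \vartheta_{i0}$, so $L_{i1}$ will reach the invariant set
$$\left\{L_{i1} \,\middle|\, L_{i1} \le \Big(\frac{\vartheta_{i0}}{\pi_{i1}}\Big)^{\frac{1}{a_{i2}}}\right\}.$$
Thus, one can conclude that $L_{i1}$ and $\tilde{l}_i = l_i - \hat{l}_i$ are bounded, and there exist positive constants $\bar{l}_i$ such that $\hat{l}_i \le \bar{l}_i$ and $l_i \le \bar{l}_i$. □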
Furthermore, we will show that the proposed FTNNOBs are stable, as shown in the following theorem.
Theorem 1. 
Consider the systems with uncertainties (7); if the FTNNOBs are designed in the form of (10) with $\hat{l}_i$ being updated by (11), the uncertainty observation errors $\tilde{d}_i = d_i - \hat{d}_i$ will converge to a bounded region around the origin in a fixed time.
Proof. 
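Select candidate Lyapunov functions in the form of
$$L_{i2} = \frac{1}{2}\Xi_i^T\Xi_i + \frac{1}{2}\check{l}_i^2,$$
where $\check{l}_i = \bar{l}_i - \hat{l}_i$. Substituting (10) and (11) into the time derivative of $L_{i2}$, one can achieve
$$\dot{L}_{i2} \le b_i Z_{fi}\|\Xi_i\| - r_{i3}\|\Xi_i\|^{2a_{i2}} - r_{i4}\|\Xi_i\|^{2a_{i4}} - c_i\hat{l}_i\|\Xi_i\|^2 Z_{fi}^2 - c_i\check{l}_i\|\Xi_i\|^2 Z_{fi}^2 + \varrho_{i1}\check{l}_i\hat{l}_i, \tag{14}$$
where $a_{i4} = \frac{p_i+q_i}{2q_i}$. According to Young’s inequality, one has
$$b_i Z_{fi}\|\Xi_i\| \le c_i\bar{l}_i\|\Xi_i\|^2 Z_{fi}^2 + \frac{1}{4c_i}, \quad \varrho_{i1}\check{l}_i\hat{l}_i \le -\frac{\varrho_{i1}}{2}\check{l}_i^2 + \frac{\varrho_{i1}}{2}\bar{l}_i^2. \tag{15}$$
Substituting (15) into (14), one can obtain that
$$\begin{aligned} \dot{L}_{i2} \le{}& -r_{i3}2^{a_{i2}}\Big(\frac{1}{2}\|\Xi_i\|^2\Big)^{a_{i2}} - \varrho_{i1}2^{a_{i2}}\Big(\frac{1}{2}\check{l}_i^2\Big)^{a_{i2}} - r_{i4}2^{a_{i4}}\Big(\frac{1}{2}\|\Xi_i\|^2\Big)^{a_{i4}} - \varrho_{i1}2^{a_{i4}}\Big(\frac{1}{2}\check{l}_i^2\Big)^{a_{i4}} \\ & - \varrho_{i1}\check{l}_i^2 + \varrho_{i1}\big(\check{l}_i^2\big)^{a_{i2}} + \varrho_{i1}\big(\check{l}_i^2\big)^{a_{i4}} + \frac{\varrho_{i1}}{2}\bar{l}_i^2 + \frac{1}{4c_i}. \end{aligned}$$
If $\check{l}_i^2 \ge 1$, one can obtain that
$$\varrho_{i1}\big(\check{l}_i^2\big)^{a_{i2}} - \varrho_{i1}\check{l}_i^2 \le 0.$$
If $\check{l}_i^2 < 1$, one can obtain that
$$0 \le \varrho_{i1}\big(\check{l}_i^2\big)^{a_{i2}} - \varrho_{i1}\check{l}_i^2 \le a_{i1}.$$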
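On account of the inequalities $\check{l}_i = \bar{l}_i - \hat{l}_i \le \bar{l}_i$ and $0 < \hat{l}_i < \bar{l}_i$, and according to Lemmas 2 and 3, one can further obtain that
$$\dot{L}_{i2} \le -\pi_{i1}L_{i2}^{a_{i2}} - \pi_{i2}2^{a_{i3}}L_{i2}^{a_{i4}} + \vartheta_{i1},$$
where $\pi_{i1} = \min\{r_{i3}2^{a_{i2}}, \varrho_{i1}2^{a_{i2}}\}$, $\pi_{i2} = \min\{r_{i4}2^{a_{i4}}, \varrho_{i1}2^{a_{i4}}\}$, $a_{i3} = \frac{q_i - p_i}{2q_i}$, and $\vartheta_{i1} = \frac{1}{4c_i} + a_{i1} + \varrho_{i1}\big(\frac{\bar{l}_i^2}{2} + \bar{l}_i^{2a_{i4}}\big)$. Based on Lemma 1, $L_{i2}$ will reduce to a residual set in a fixed time. The settling time satisfies $t_{if} \le \frac{1}{\pi_{i1}\omega_i(1-a_{i2})} + \frac{1}{\pi_{i2}2^{a_{i3}}\omega_i(a_{i4}-1)}$ with $0 < \omega_i < 1$. Defining an augmented vector $\check{X}_i = [\Xi_i^T, \check{l}_i]^T$, $L_{i2}$ can also be represented as $L_{i2} = \frac{1}{2}\check{X}_i^T\check{X}_i$, and it is bounded by
$$\check{\vartheta}_{i1} = \left\{\check{X}_i \,\middle|\, L_{i2}(\check{X}_i) \le \min\{\vartheta_{i2}, \vartheta_{i3}\}\right\},$$
where $\vartheta_{i2} = \big(\frac{\varrho_{i0}}{\pi_{i1}}\big)^{\frac{1}{a_{i2}}}$ and $\vartheta_{i3} = \big(\frac{\varrho_{i0}}{\pi_{i2}2^{a_{i3}}}\big)^{\frac{1}{a_{i4}}}$ with $\varrho_{i0} = \frac{\vartheta_{i1}}{1-\omega_i}$. Then, one can conclude that
$$\|\Xi_i\| \le \|\check{X}_i\| \le \sqrt{2\bar{\vartheta}_{i1}},$$
where $\bar{\vartheta}_{i1} = \min\{\vartheta_{i2}, \vartheta_{i3}\}$. Considering the definition of $\Xi_i$, one has
$$\dot{\Xi}_i = d_i - \hat{d}_i = d_i - r_{i3}\,\mathrm{sig}^{\frac{m_i}{n_i}}(\Xi_i) - r_{i4}\,\mathrm{sig}^{\frac{p_i}{q_i}}(\Xi_i) - c_i\hat{l}_i\Xi_i Z_{fi}^2 = \tilde{d}_i.$$
Then, according to the result (16), the upper norm bounds of $\tilde{d}_i$ are given by
$$\begin{aligned} \|\tilde{d}_i\| &\le \|d_i\| + r_{i3}\big\|\mathrm{sig}^{\frac{m_i}{n_i}}(\Xi_i)\big\| + r_{i4}\big\|\mathrm{sig}^{\frac{p_i}{q_i}}(\Xi_i)\big\| + c_i\bar{l}_i Z_{fi}^2\|\Xi_i\| \\ &\le b_i(1+\epsilon_i) + 3r_{i3}\big(\sqrt{2\bar{\vartheta}_{i1}}\big)^{\frac{m_i}{n_i}} + 3r_{i4}\big(\sqrt{2\bar{\vartheta}_{i1}}\big)^{\frac{p_i}{q_i}} + c_i\bar{l}_i\sqrt{2\bar{\vartheta}_{i1}}\,(1+\epsilon_i)^2. \end{aligned}$$
Based on the above analysis and the selection of appropriate parameter values, the uncertainty observation errors $\tilde{d}_i$ will converge to a bounded region around the origin in a fixed time. □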
Remark 1. 
Compared with previous NN-based observer methods, it is obvious that the proposed FTNNOBs only need to estimate a scalar parameter online instead of a weight vector or matrix, which reduces the computational complexities. Moreover, the fixed-time convergence of the estimation error is guaranteed.
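To illustrate how the observer and its scalar adaptive law interact, the following toy simulation applies them to a scalar plant $\dot{e} = d(t)$ with $g = 1$ (so $g^{+} = 1$), $f = 0$, and zero control input. Everything here is an assumption made for illustration: the disturbance $d(t) = \sin t$, the gains, the constant value used for $Z_{f}$, the exponents $3/5$ and $5/3$, and the forward-Euler integration. It is a sketch of the mechanism, not a reproduction of the paper's simulation.

```python
import math

def sig(x, eta):
    """Scalar version of sig^eta: |x|**eta * sgn(x)."""
    return math.copysign(abs(x) ** eta, x) if x != 0.0 else 0.0

def simulate_observer(T=10.0, dt=1e-3, r3=10.0, r4=10.0, c=1.0, rho=0.5, z_f=2.0):
    """Euler simulation of the scalar observer; returns (t, Xi, d - d_hat, l_hat) samples."""
    m_over_n, p_over_q = 3.0 / 5.0, 5.0 / 3.0   # ratios of odd integers, m < n and p > q
    e, e_hat, l_hat = 1.0, 0.0, 0.0             # l_hat(0) >= 0 as required by the adaptive law
    history = []
    for k in range(round(T / dt)):
        t = k * dt
        d = math.sin(t)                          # hypothetical lumped uncertainty
        xi = e - e_hat                           # auxiliary variable Xi = g^+ (e - e_hat)
        d_hat = r3 * sig(xi, m_over_n) + r4 * sig(xi, p_over_q) + c * l_hat * xi * z_f ** 2
        l_hat_dot = -rho * l_hat + c * xi ** 2 * z_f ** 2
        history.append((t, xi, d - d_hat, l_hat))
        e += dt * d                              # plant: e_dot = d
        e_hat += dt * d_hat                      # auxiliary system: e_hat_dot = d_hat
        l_hat += dt * l_hat_dot                  # only one scalar is adapted online
    return history
```

In this toy setting, the auxiliary variable settles into a small neighborhood of zero shortly after the start, and the estimate `d_hat` then follows the sinusoidal disturbance with a small residual error, while the only adapted quantity is the scalar `l_hat`.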

3.2. ADP-Based Nominal Optimal Control Design

Consider the nominal systems of (7):
$$\dot{e}_i = f(e_i) + g_i\mu_i, \tag{17}$$
where $\mu_i$ are the approximate optimal control policies to be designed. According to [17,26,33], the infinite-horizon cost functions can be defined as
$$V_i(e_i(0)) = \int_0^{\infty}\big(e_i^T Q_i e_i + \mu_i^T R_i \mu_i\big)\,dt,$$
where $Q_i$ and $R_i$ are assumed to be positive definite diagonal constant matrices. Assuming that the optimal policies $\mu_i^*$ exist, the corresponding optimal cost functions of (17) are formulated as
$$V_i^*(e_i(0)) = \min_{\mu_i \in \Psi_i(\varpi_i)} \int_0^{\infty}\big(e_i^T Q_i e_i + \mu_i^T R_i \mu_i\big)\,dt,$$
where $\Psi_i(\varpi_i)$ are the admissible policy sets on the compact sets $\varpi_i$. Taking the time derivative of (18), one can obtain the Hamilton–Jacobi–Bellman (HJB) equations:
$$H_i\big(e_i, \mu_i^*, \nabla V_i^*(e_i)\big) = \nabla^T V_i^*(e_i)\big(f(e_i) + g_i\mu_i^*\big) + \mu_i^{*T} R_i \mu_i^* + e_i^T Q_i e_i = 0, \tag{19}$$
where $\nabla = \partial/\partial e_i$ denotes the gradient operator with respect to $e_i$. Then, the optimal control policies $\mu_i^*$ can be derived by taking the partial derivative of (19) with respect to $\mu_i^*$:
$$\mu_i^* = -\frac{1}{2}R_i^{-1}g_i^T\nabla V_i^*(e_i). \tag{20}$$
Substituting (20) into (19), the HJB equations can be further written as
$$0 = \nabla V_i^{*T}(e_i) f(e_i) + e_i^T Q_i e_i - \frac{1}{4}\nabla V_i^{*T}(e_i)\, g_i R_i^{-1} g_i^T \nabla V_i^*(e_i). \tag{21}$$
However, it is difficult to obtain analytical solutions of the HJB equations (21) owing to their nonlinearity. Therefore, the AC NN technique is introduced to solve this problem.
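The HJB equation is rarely solvable in closed form, but a scalar linear case shows what the optimal pair looks like. The sketch below assumes a hypothetical scalar system $\dot{e} = a e + b\mu$ with cost weights $q$ and $r$ (none of these appear in the paper). With the quadratic guess $V^*(e) = p e^2$, the HJB equation reduces to the scalar Riccati equation $2ap + q - b^2p^2/r = 0$, and the policy follows from $\mu^* = -\frac{1}{2}r^{-1}b\,\nabla V^*$.

```python
import math

def scalar_hjb_solution(a, b, q, r):
    """Positive root p of 2*a*p + q - b^2*p^2/r = 0, so that V*(e) = p*e^2."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

def optimal_policy(e, p, b, r):
    """mu* = -(1/2) * R^-1 * g^T * grad V*, with grad V* = 2*p*e."""
    return -0.5 * (1.0 / r) * b * (2.0 * p * e)

def hjb_residual(e, a, b, q, r, p):
    """Left-hand side of the reduced HJB equation evaluated for V* = p*e^2."""
    grad_v = 2.0 * p * e
    return grad_v * (a * e) + q * e * e - 0.25 * grad_v * b * (1.0 / r) * b * grad_v
```

For `a = -1`, `b = q = r = 1`, the solution is `p = sqrt(2) - 1`, the residual vanishes for every `e`, and the closed-loop pole `a - b*b*p/r` equals `-sqrt(2)`. For the nonlinear error dynamics no such closed form is available, which is why a function approximator is needed.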
For all $e_i \in \varpi_i$, the optimal cost functions $V_i^*(e_i)$ can be approximated by
$$V_i^*(e_i) = W_i^T\sigma_i(e_i) + \varepsilon_i,$$
where $W_i \in \mathbb{R}^n$ denote the unknown ideal weight vectors, and $\sigma_i(e_i) \in \mathbb{R}^n$ are the basis function vectors with $n$ being the number of hidden-layer neurons; $\varepsilon_i$ is the approximation error. In addition, the optimal control policies $\mu_i^*$ can be reconstructed in the form of
$$\mu_i^* = -\frac{1}{2}R_i^{-1}g_i^T\big(\nabla^T\sigma_i(e_i)W_i + \nabla\varepsilon_i\big).$$
Two NNs (i.e., AC NNs) are usually utilized to approximate the control policies and the optimal cost functions, respectively. The approximation results can be given by
$$V_i(e_i) = \hat{W}_{ic}^T\sigma_i(e_i), \tag{22}$$
$$\mu_i = -\frac{1}{2}R_i^{-1}g_i^T\nabla^T\sigma_i(e_i)\hat{W}_{ia}, \tag{23}$$
where $\hat{W}_{ia}, \hat{W}_{ic} \in \mathbb{R}^n$ denote the weight vectors of the AC NNs.
Substituting (22) and (23) into (21), one can obtain the approximate results of the HJB equations:
$$\hat{H}_i = \hat{W}_{ic}^T\nabla\sigma_i(e_i)\Big(f(e_i) - \frac{1}{2}g_i R_i^{-1}g_i^T\nabla^T\sigma_i(e_i)\hat{W}_{ia}\Big) + e_i^T Q_i e_i + \frac{1}{4}\hat{W}_{ia}^T B_i\hat{W}_{ia} = E_i,$$
where $\hat{H}_i = H_i\big(e_i, \mu_i, \nabla\sigma_i(e_i)\big)$ and $B_i = \nabla\sigma_i(e_i)\,g_i R_i^{-1}g_i^T\nabla^T\sigma_i(e_i)$. $E_i$ represent the Bellman residual errors. Two positive functions are constructed as
$$\kappa_i(t) = (\hat{W}_{ia} - \hat{W}_{ic})^T(\hat{W}_{ia} - \hat{W}_{ic}).$$
Then, calculating the time derivative of $\kappa_i(t)$ along $\hat{W}_{ia}$ and $\hat{W}_{ic}$, one has
$$\frac{d\kappa_i(t)}{dt} = \frac{\partial\kappa_i(t)}{\partial\hat{W}_{ia}}\dot{\hat{W}}_{ia} + \frac{\partial\kappa_i(t)}{\partial\hat{W}_{ic}}\dot{\hat{W}}_{ic}. \tag{24}$$
The weight update laws of $\hat{W}_{ia}$ and $\hat{W}_{ic}$ are proposed in the form of
$$\dot{\hat{W}}_{ic} = -k_{i1}\Lambda_i\hat{W}_{ic} + k_{i3}\beta_{i1}, \tag{25}$$
$$\dot{\hat{W}}_{ia} = -k_{i1}\Lambda_i\hat{W}_{ic} - k_{i2}(\hat{W}_{ia} - \hat{W}_{ic}) + k_{i3}\beta_{i1}, \tag{26}$$
where $k_{i1}$, $k_{i2}$, and $k_{i3}$ are the learning rates. $k_{i1}$ and $k_{i2}$ are positive constants, and $k_{i3} = \|Q_i\|$. $\beta_{i1} = \frac{\beta_{i2}}{1 + \beta_{i2}^T\beta_{i2}}$ with $\beta_{i2} = \nabla\sigma_i(e_i)\,e_i$. $\Lambda_i = r_{i1}\|\beta_{i1}\| + r_{i2}$ with $r_{i1}$ and $r_{i2}$ being positive constants. In addition, the AC NNs’ weight errors are defined as $\tilde{W}_{ia} = W_i - \hat{W}_{ia}$ and $\tilde{W}_{ic} = W_i - \hat{W}_{ic}$. Substituting (25) and (26) into (24) yields
$$\frac{d\kappa_i(t)}{dt} = -2k_{i2}(\hat{W}_{ia} - \hat{W}_{ic})^T(\hat{W}_{ia} - \hat{W}_{ic}) \le 0. \tag{27}$$
According to (27), it can be further concluded that $\kappa_i(t) = 0$ will finally be achieved under the weight update laws (25) and (26). According to [48], the approximate optimal solutions $\mu_i$ are expected to drive the Bellman residual errors $E_i \to 0$. Furthermore, if $E_i = 0$ has a unique solution, it is equivalent to the following equations:
$$\frac{\partial E_i}{\partial\hat{W}_{ia}} = \frac{1}{2}B_i(\hat{W}_{ia} - \hat{W}_{ic}) = 0_{n\times 1}. \tag{28}$$
Clearly, once $\kappa_i(t) = 0$ is reached, (27) and (28) are satisfied.
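The consensus property of the two update laws can be checked numerically. In the sketch below the regressor $\beta_{i1}$ is frozen to a constant vector and $\Lambda_i$ is taken as the scalar $r_{i1}\|\beta_{i1}\| + r_{i2}$; both are simplifying assumptions made here, as are the initial weights, gains, and the forward-Euler integration. Because the actor law differs from the critic law only by the term $-k_{i2}(\hat{W}_{ia} - \hat{W}_{ic})$, the gap between the two weight vectors decays exponentially regardless of the regressor, so $\kappa_i(t) \approx \kappa_i(0)e^{-2k_{i2}t}$.

```python
import math

def simulate_weights(T=5.0, dt=1e-3, k1=1.0, k2=1.0, k3=1.0, r1=0.5, r2=0.1):
    """Euler simulation of the critic/actor update laws; returns (kappa(0), kappa(T))."""
    w_a = [1.0, -2.0, 0.5]                       # hypothetical initial actor weights
    w_c = [0.0, 1.0, -1.0]                       # hypothetical initial critic weights
    beta = [0.2, -0.1, 0.3]                      # frozen regressor beta_i1 (assumption)
    lam = r1 * math.sqrt(sum(b * b for b in beta)) + r2
    kappa0 = sum((a - c) ** 2 for a, c in zip(w_a, w_c))
    for _ in range(round(T / dt)):
        # Shared part of both laws: -k1 * Lambda * W_c + k3 * beta.
        common = [-k1 * lam * c + k3 * b for c, b in zip(w_c, beta)]
        w_c_dot = common
        w_a_dot = [cm - k2 * (a - c) for cm, a, c in zip(common, w_a, w_c)]
        w_c = [c + dt * d for c, d in zip(w_c, w_c_dot)]
        w_a = [a + dt * d for a, d in zip(w_a, w_a_dot)]
    kappa = sum((a - c) ** 2 for a, c in zip(w_a, w_c))
    return kappa0, kappa
```

With the default values, the final value of `kappa` agrees with `kappa0 * exp(-2 * k2 * T)` to within the discretization error, without any excitation requirement on the regressor.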

3.3. Adaptive-Dynamic-Programming-Based Robust Control Law Design

Based on the information reconstructed by the FTNNOBs, $\hat{d}_i$ in (10), and the designed approximate optimal control policies $\mu_i$ in (23), the feedforward method is employed to compensate the approximate optimal control policies and restrain the lumped uncertainties. Then, the ADP-based robust control laws can be designed as
$$\chi_i = \mu_i - \hat{d}_i. \tag{29}$$
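As a concrete instance of the composite law, the sketch below evaluates the actor policy and the compensated control for a hypothetical two-state, single-input error system with $g = [0, 1/m_0]^T$, scalar $R = r$, and the quadratic basis $\sigma(e) = [e_1^2, e_1e_2, e_2^2]^T$. The basis, the dimensions, and the function names are illustrative assumptions; the paper's subsystems are six-dimensional.

```python
def actor_policy(e, w_a, m0, r):
    """mu = -(1/2) * R^-1 * g^T * (grad sigma)^T * W_a for sigma = [e1^2, e1*e2, e2^2]."""
    e1, e2 = e
    w1, w2, w3 = w_a
    # (grad sigma)^T * W_a: gradient of the approximated cost with respect to e.
    grad_v = [2.0 * e1 * w1 + e2 * w2, e1 * w2 + 2.0 * e2 * w3]
    g = [0.0, 1.0 / m0]
    return -0.5 * (1.0 / r) * (g[0] * grad_v[0] + g[1] * grad_v[1])

def robust_control(e, w_a, d_hat, m0, r):
    """chi = mu - d_hat: nominal policy plus feedforward compensation of the estimate."""
    return actor_policy(e, w_a, m0, r) - d_hat
```

For `e = [1.0, 2.0]`, `w_a = [1.0, 0.5, 2.0]`, `m0 = 2.0`, and `r = 1.0`, the nominal policy is `-2.125`; with an uncertainty estimate `d_hat = 0.3`, the applied control is `-2.425`.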
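Assumption 3. 
For all $e_i \in \varpi_i$, the conditions $\|\nabla\sigma_i(e_i)\| \le \bar{\sigma}_i$, $\|W_i\| \le \bar{W}_i$, and $\|\nabla\varepsilon_i\| \le \bar{\varepsilon}_i$ are satisfied with $\bar{\sigma}_i$, $\bar{W}_i$, and $\bar{\varepsilon}_i$ being positive constants.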
With the designed control law, we can show that the closed-loop system is stable in the following theorem.
Theorem 2. 
Consider the systems in (7), Assumptions 1 and 3, and the cost function (18); if the proposed ADP-based robust control laws are defined in the form of (29) with the FTNNOBs (10) and the approximate optimal laws (23), then the tracking errors $e_i$ and the AC NNs’ weight errors $\tilde{W}_{ia}$, $\tilde{W}_{ic}$ can be guaranteed to be uniformly ultimately bounded (UUB).
Proof. 
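Select candidate Lyapunov functions as
$$L_{i3} = V_i^*(e_i) + \frac{\rho_{i1}}{2}\tilde{W}_{ia}^T\tilde{W}_{ia} + \frac{\rho_{i2}}{2}\tilde{W}_{ic}^T\tilde{W}_{ic},$$
where $\rho_{i1}$ and $\rho_{i2}$ are positive constants with $\rho_{i1} > \rho_{i2}$. Calculating the time derivative of $L_{i3}$ yields
$$\begin{aligned} \dot{L}_{i3} ={}& \nabla^T V_i^*\big(f(e_i) + g_i\mu_i^*\big) + \nabla^T V_i^*\big(g_i\mu_i - g_i\mu_i^*\big) + \nabla^T V_i^* g_i(d_i - \hat{d}_i) - \rho_{i2}\tilde{W}_{ic}^T\dot{\hat{W}}_{ic} - \rho_{i1}\tilde{W}_{ia}^T\dot{\hat{W}}_{ia} \\ ={}& -e_i^T Q_i e_i - \frac{1}{4}W_i^T B_i W_i + \frac{k_{i1}\rho_{i1}}{2}W_i^T\Lambda_i W_i - \frac{1}{2}W_i^T\nabla\sigma_i(e_i)\,g_i R_i^{-1}g_i^T\nabla\varepsilon_i \\ & - \frac{1}{2}\nabla^T\varepsilon_i\,g_i R_i^{-1}g_i^T\nabla^T\sigma_i(e_i)\tilde{W}_{ia} + \frac{1}{2}W_i^T\nabla\sigma_i(e_i)\,g_i R_i^{-1}g_i^T\nabla\varepsilon_i \\ & - \frac{k_{i1}\rho_{i1}}{2}\tilde{W}_{ic}^T\Lambda_i\tilde{W}_{ic} - \frac{k_{i1}\rho_{i1}}{2}\hat{W}_{ic}^T\Lambda_i\hat{W}_{ic} + \frac{1}{4}\nabla^T\varepsilon_i\,g_i R_i^{-1}g_i^T\nabla\varepsilon_i + k_{i1}\rho_{i2}\tilde{W}_{ia}^T\Lambda_i\hat{W}_{ic} \\ & - k_{i3}\rho_{i2}\tilde{W}_{ic}^T\beta_{i1} - k_{i3}\rho_{i1}\tilde{W}_{ia}^T\beta_{i1} - k_{i2}\rho_{i2}\tilde{W}_{ia}^T\tilde{W}_{ia} + k_{i2}\rho_{i2}\tilde{W}_{ia}^T\tilde{W}_{ic} + \vartheta_{i4}, \end{aligned}$$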
where ϑ i 4 = ( T σ i ( e i ) W i + T ε i ) g i d ˜ i . Based on Assumptions 2 and 3, it is reasonable to assume that 1 2 σ i ( e i ) g i R i 1 g T i T σ i ( e i ) , 1 2 T ε i g i R i 1 g i T ε i , and ( T σ i ( e i ) W i + T ε i ) g i are norm-bounded. From Theorem 1, since the disturbance error bounds are fixed-time stable, we can assume that d ˜ i are also norm-bounded, that is d ˜ i d i M with d i M being positive constants. Then, one has
1 2 σ i ( e i ) g i R i 1 g T i T σ i ( e i ) b D i , 1 2 T ε i g i R i 1 g i T ε i + ϑ i 4 α i 8
where b D i are positive constants. Consider the inequality β 1 σ ¯ i e i ; the following inequalities can be obtained by using Young’s inequality:
k i 3 ρ i 2 W ˜ i c T β i 1 σ ¯ i 2 2 W ˜ i c 2 + k i 3 2 ρ i 2 2 2 e i 2 ,
k i 3 ρ i 1 W ˜ i a T β i 1 σ ¯ i 2 2 W ˜ i a 2 + k i 3 2 ρ i 1 2 2 e i 2 .
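For readers who want the intermediate step, the critic bound can be worked out as follows. This is a sketch that assumes $k_{i3}$ and $\rho_{i2}$ are positive and uses only the Cauchy–Schwarz inequality, the bound $\|\beta_{i1}\| \le \bar{\sigma}_i\|e_i\|$, and Young’s inequality in the form $ab \le \frac{a^{2}}{2} + \frac{b^{2}}{2}$:

$$
\begin{aligned}
\left|k_{i3}\rho_{i2}\tilde{W}_{ic}^{T}\beta_{i1}\right|
&\le k_{i3}\rho_{i2}\,\|\tilde{W}_{ic}\|\,\|\beta_{i1}\| \\
&\le \big(\bar{\sigma}_i\|\tilde{W}_{ic}\|\big)\big(k_{i3}\rho_{i2}\|e_i\|\big) \\
&\le \frac{\bar{\sigma}_i^{2}}{2}\|\tilde{W}_{ic}\|^{2} + \frac{k_{i3}^{2}\rho_{i2}^{2}}{2}\|e_i\|^{2}.
\end{aligned}
$$

The actor bound follows in the same way with $\rho_{i1}$ and $\tilde{W}_{ia}$ in place of $\rho_{i2}$ and $\tilde{W}_{ic}$.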
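In addition, there exist constants $b_{iQ}$ satisfying $b_{iQ} < \lambda_{\min}(Q_i)$. By utilizing the above inequalities (30) and (31), one can further obtain that
$$
\begin{aligned}
\dot{L}_{i3} \le{}& -\left(b_{iQ} - \frac{k_{i3}^{2}\rho_{i1}^{2}}{2} - \frac{k_{i3}^{2}\rho_{i2}^{2}}{2}\right)\|e_i\|^{2} - \frac{k_{i2}\rho_{i2} - k_{i1}\rho_{i2}\gamma_i - b_{Di} - \bar{\sigma}_i^{2}}{2}\|\tilde{W}_{ia}\|^{2} + \alpha_{i8} \\
& - \frac{k_{i1}\rho_{i1}r_{i2} - k_{i2}\rho_{i2} - \bar{\sigma}_i^{2}}{2}\|\tilde{W}_{ic}\|^{2} + \alpha_{i9},
\end{aligned}
$$
where $\gamma_i = \|\Lambda_i\|_{\max} = r_{i1}\|\beta_{i1}\|_{\max} + r_{i2}$ and $\alpha_{i9} = \frac{k_{i1}\rho_{i1}\gamma_i}{2}\bar{W}_i^{2}$. Then, we can select the parameters $k_{i1}$, $k_{i2}$, $k_{i3}$ such that $2b_{iQ} - k_{i3}^{2}\rho_{i1}^{2} - k_{i3}^{2}\rho_{i2}^{2} > 0$, $k_{i2}\rho_{i2} - k_{i1}\rho_{i2}\gamma_i - b_{Di} - \bar{\sigma}_i^{2} > 0$, and $k_{i1}\rho_{i1}r_{i2} - k_{i2}\rho_{i2} - \bar{\sigma}_i^{2} > 0$ hold. Furthermore, one can conclude that $\dot{L}_{i3} \le 0$ when the condition
$$\|e_i\| > \sqrt{\frac{2\alpha_{i8}}{2b_{iQ} - k_{i3}^{2}\rho_{i1}^{2} - k_{i3}^{2}\rho_{i2}^{2}}} \quad \text{or} \quad \|\tilde{W}_{ia}\| > \sqrt{\frac{2\alpha_{i9}}{k_{i2}\rho_{i2} - k_{i1}\rho_{i2}\gamma_i - b_{Di} - \bar{\sigma}_i^{2}}} \quad \text{or} \quad \|\tilde{W}_{ic}\| > \sqrt{\frac{2\alpha_{i9}}{k_{i1}\rho_{i1}r_{i2} - k_{i2}\rho_{i2} - \bar{\sigma}_i^{2}}}$$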
is satisfied. Therefore, (32) indicates that the error states of the systems (7) and the AC NNs’ weight errors are UUB stable. □
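Each threshold in the condition above has the same origin: a bound of the form $\dot{L} \le -\frac{c}{2}\|x\|^{2} + \alpha$ becomes negative once $\|x\| > \sqrt{2\alpha/c}$. The short sketch below evaluates that radius for a single term at a time. The function name and the numbers are hypothetical, since the constants $\alpha_{i8}$, $\alpha_{i9}$, $b_{iQ}$, and $b_{Di}$ are not given numerically in the article.

```python
import math


def ultimate_bound(alpha, c):
    """Radius beyond which -(c / 2) * ||x||**2 + alpha is negative."""
    if alpha < 0 or c <= 0:
        raise ValueError("need alpha >= 0 and c > 0")
    return math.sqrt(2.0 * alpha / c)


def lyapunov_rate_bound(x_norm, alpha, c):
    """Upper bound on the Lyapunov derivative for a given ||x||."""
    return -0.5 * c * x_norm**2 + alpha


# Hypothetical constants, chosen only to make the arithmetic easy to follow.
radius = ultimate_bound(alpha=0.02, c=4.0)
```

Outside this radius the bound is negative, so the state is pushed back toward the ball; inside it the sign is not determined, which is why the result is ultimate boundedness rather than asymptotic convergence.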
Remark 2. 
Different from [16,25], we designed weight update laws of the AC NNs that avoid the PE condition by using an adaptive method, where $\Lambda_i$ are designed to guarantee stable convergence of $\tilde{W}_{ic}$ and $\tilde{W}_{ia}$. Moreover, the positive constants $r_{i2}$ are introduced in $\Lambda_i$ to guarantee that $\tilde{W}_{ic}$ and $\tilde{W}_{ia}$ can converge to a neighborhood of the origin when $\beta_{i1} = 0$.
Remark 3. 
In [48,49], the critic NNs’ weight update law only contains a term similar to the first term of (31), which makes the convergence of the weights depend heavily on their initial values. For instance, the training of the AC NNs’ weights can stall when the initial weights are set to zero. To emphasize the role of the system states, the second term in the critic NNs’ weight update laws (31) is added to improve the efficiency of data utilization and the adaptation capability of the AC NNs.

4. Simulation

In this section, numerical simulations are presented to verify the effectiveness and performance of the proposed ADP-based robust trajectory-tracking-control scheme (denoted as ADPFTNs). First, the robustness of the proposed ADPFTNs in (29) is verified in the presence of external disturbances and parameter uncertainties. Then, we demonstrate the ability of the proposed ADPFTNs to balance control performance and control cost. The initial position vector was selected as $\zeta(0) = [0.1, 0.2, 0.131]^{T}$ m. The desired position trajectory was $\zeta_d = [0.5\sin(0.6t), 0.5\cos(0.6t), 0.5 + \frac{3t}{50}]^{T}$ m. The desired yaw angle was $\psi_d = 0$ rad. The basis function vectors are
$$
\begin{aligned}
\sigma_1(e_1) &= [0.5x_e^{2},\ 0.5y_e^{2},\ 0.5z_e^{2},\ x_e\dot{x}_e,\ y_e\dot{y}_e,\ z_e\dot{z}_e]^{T}, \\
\sigma_2(e_2) &= [0.5\phi_e^{2},\ 0.5\theta_e^{2},\ 0.5\psi_e^{2},\ \phi_e\dot{\phi}_e,\ \theta_e\dot{\theta}_e,\ \psi_e\dot{\psi}_e]^{T}.
\end{aligned}
$$
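The reference trajectory and the position basis vector can be sketched in a few lines. The code below is illustrative only: it assumes the tracking error is defined as actual minus desired and that the vehicle starts at rest, neither of which is restated in this section, and all function names are hypothetical.

```python
import numpy as np


def desired_position(t):
    """Desired position zeta_d(t) in metres."""
    return np.array([0.5 * np.sin(0.6 * t), 0.5 * np.cos(0.6 * t), 0.5 + 3.0 * t / 50.0])


def desired_velocity(t):
    """Time derivative of desired_position(t)."""
    return np.array([0.3 * np.cos(0.6 * t), -0.3 * np.sin(0.6 * t), 0.06])


def basis(err, err_rate):
    """Basis vector [0.5*e^2 (3 entries), e*e_dot (3 entries)]."""
    err = np.asarray(err, dtype=float)
    err_rate = np.asarray(err_rate, dtype=float)
    return np.concatenate((0.5 * err**2, err * err_rate))


zeta_0 = np.array([0.1, 0.2, 0.131])         # initial position from the text
e0 = zeta_0 - desired_position(0.0)          # assumed error convention: actual - desired
e0_rate = np.zeros(3) - desired_velocity(0)  # assumed: vehicle initially at rest
sigma_1_0 = basis(e0, e0_rate)
```

The same `basis` function applies to the attitude subsystem when it is called with the attitude errors and their rates.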
The initial weight vectors are
$$
\begin{aligned}
W_{1a}(0) &= [27, 26, 21, 22, 24, 22]^{T}, & W_{2a}(0) &= [15, 16, 18, 16, 18, 12]^{T}, \\
W_{1c}(0) &= [28, 28, 26, 23, 22, 24]^{T}, & W_{2c}(0) &= [14, 15, 15, 16, 18, 24]^{T}.
\end{aligned}
$$
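To make the role of these weights concrete, the sketch below evaluates a critic estimate at one error state. It assumes the usual linear-in-weights form $\hat{V}_i = \hat{W}_{ic}^{T}\sigma_i(e_i)$, which is how the critic approximation is normally written in actor–critic ADP but is not restated in this section; the function names and the sample error are hypothetical.

```python
import numpy as np

# Initial critic weights of the position subsystem, as listed above.
W_1C_0 = np.array([28.0, 28.0, 26.0, 23.0, 22.0, 24.0])


def basis(err, err_rate):
    """Basis vector [0.5*e^2 (3 entries), e*e_dot (3 entries)]."""
    err = np.asarray(err, dtype=float)
    err_rate = np.asarray(err_rate, dtype=float)
    return np.concatenate((0.5 * err**2, err * err_rate))


def critic_value(weights, err, err_rate):
    """Assumed critic estimate: inner product of the weights and the basis vector."""
    return float(np.dot(weights, basis(err, err_rate)))


# Hypothetical error state with zero error rates.
v_hat = critic_value(W_1C_0, [0.1, -0.3, -0.369], [0.0, 0.0, 0.0])
```

With zero weights this estimate and its gradient are identically zero, which is consistent with the remark that training can stall under a zero initialization.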
In addition, the main parameters of the quadrotor UAV are given as $m_0 = 1.2$ kg, $g = 9.81$ m/s$^2$, $J_0 = \mathrm{diag}(0.125, 0.125, 0.25)$ kg·m$^2$, $J_r = 0.002$ kg·m$^2$, and $k_{10}, k_{20}, \ldots, k_{60} = 0.01$.

4.1. Robust Tracking Control Performances with Uncertainty

To highlight the superiority of the proposed ADPFTNs, the two NN-based observers (NNOBs) in [34], combined with the proposed approximate optimal control policies in (29) (denoted as ADPNNs), were used for comparison with the ADPFTNs. The uncertain parameter values were set as $\Delta m = 0.2 + 0.05 \cdot \mathrm{rand}$ kg, $\Delta J_x = \Delta J_y = \Delta J_z = 0.03 + 0.01 \cdot \mathrm{rand}$ kg·m$^2$, and $\Delta k_1 = \Delta k_2 = \Delta k_3 = \Delta k_4 = \Delta k_5 = \Delta k_6 = 0.005 + 0.001 \cdot \mathrm{rand}$, where the rand function generates a random number in the interval $[0, 1]$. The external disturbances were set as $u_d = [0.1\sin(0.8t), 0.1\sin(0.7t), 0.1\sin(0.5t)]^{T}$ N and $\tau_d = [0.1\sin(0.8t) + J_r\dot{\theta}, 0.1\sin(0.7t) - J_r\dot{\phi}, 0.1\sin(0.5t)]^{T}$ N·m, with $J_r\dot{\theta}$ and $J_r\dot{\phi}$ being gyroscopic moments of the quadrotor UAV. The parameters of the ADPFTNs are given in Table 1.
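The uncertainty and disturbance settings of this test case can be sketched as follows. This is an illustrative reading rather than the authors' simulation code: NumPy's generator stands in for the rand function, one draw is shared by the quantities the text sets equal, the sign of the second gyroscopic term is assumed to be negative, and all names are hypothetical.

```python
import numpy as np

J_R = 0.002  # rotor inertia in kg*m^2, from the parameter list


def sample_uncertainties(rng):
    """Draw one realisation of the parameter uncertainties."""
    return {
        "delta_m": 0.2 + 0.05 * rng.random(),      # kg
        "delta_J": 0.03 + 0.01 * rng.random(),     # kg*m^2, shared by the x, y, z axes
        "delta_k": 0.005 + 0.001 * rng.random(),   # shared by k_1 ... k_6
    }


def force_disturbance(t):
    """External force disturbance u_d(t) in newtons."""
    return 0.1 * np.array([np.sin(0.8 * t), np.sin(0.7 * t), np.sin(0.5 * t)])


def torque_disturbance(t, phi_rate, theta_rate):
    """External torque disturbance tau_d(t) in N*m, including gyroscopic moments."""
    gyro = np.array([J_R * theta_rate, -J_R * phi_rate, 0.0])  # assumed signs
    return force_disturbance(t) + gyro


rng = np.random.default_rng(0)
sample = sample_uncertainties(rng)
```

Passing a seeded generator keeps a run reproducible, which helps when two observers are compared under the same uncertainty realisation.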
Figure 3 shows the time response of the observation errors for the FTNNOBs and NNOBs, respectively. It can be observed that the observation error of the FTNNOBs stabilizes to the order of $10^{-4}$ within 0.5 s, while that of the NNOBs stabilizes to the order of $10^{-3}$ at around 5 s. It can be concluded that the FTNNOBs converge faster and with higher steady-state accuracy than the NNOBs when attenuating external disturbances and parameter uncertainties. Figure 4 shows the time response of the tracking errors for the ADPFTNs and the ADPNNs, respectively. For a fair comparison, we adjusted the convergence times of the two control schemes in the simulation to make them approximately equal (Section 4.2 uses the same setup). As shown in Figure 4, the tracking error curves of the proposed ADPFTNs are smoother than those of the ADPNNs and converge with higher accuracy. This means that, when there are external disturbances and parameter uncertainties in the quadrotor UAV system, the FTNNOBs can quickly and accurately estimate the unknown uncertainties and compensate the control policies.

To illustrate the tracking performance of the two control schemes more directly, the 3D trajectory tracking results are shown in Figure 5. It can be observed that both the ADPFTNs and the ADPNNs can successfully track the desired trajectory. Further, the red solid line (the proposed ADPFTNs in this work) stays more consistently close to the black solid line (desired trajectory) than the red dotted line (ADPNNs in [34]). Figure 6 shows the time responses of the AC NNs’ weights. One can see that the curves of the position error subsystem and attitude error subsystem stabilize after 5 s. Figure 7 shows the control inputs in the frame $\varsigma_E$.

4.2. Tracking Control Performances and Cost

In this subsection, the control performance of the ADPFTNs in (19) is further analyzed quantitatively from the perspective of control cost and compared with the backstepping control scheme in [11] (denoted as Backstepping) and the learning-based robust control scheme in [50] (denoted as LBRC). For a fair comparison, the same cost functions are defined as $V_p = \int_{0}^{\infty}(e_1^{T}Q_1e_1 + u^{T}R_1u)\,\mathrm{d}t$ and $V_a = \int_{0}^{\infty}(e_2^{T}Q_2e_2 + \tau^{T}R_2\tau)\,\mathrm{d}t$, in which $V_p$ and $V_a$ represent the control costs of the position and attitude subsystems, respectively. For a convenient comparison of the position subsystem control cost, the gravity compensation required by all three schemes was ignored in the calculation of the control cost $V_p$.
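In a simulation, a cost of this kind is evaluated over a finite horizon from sampled signals. The sketch below accumulates the quadratic integrand with the trapezoidal rule; it is an assumption-laden illustration that uses the $Q_1$ and $R_1$ values listed in Table 1, constant placeholder signals instead of real simulation data, and a hypothetical function name.

```python
import numpy as np


def cumulative_cost(t, e, u, Q, R):
    """Running integral of e^T Q e + u^T R u over the sample times t (trapezoidal rule).

    t has shape (N,), e has shape (N, n), u has shape (N, m).
    """
    t = np.asarray(t, dtype=float)
    e = np.asarray(e, dtype=float)
    u = np.asarray(u, dtype=float)
    integrand = np.einsum("ni,ij,nj->n", e, Q, e) + np.einsum("ni,ij,nj->n", u, R, u)
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))


Q1 = 4.0 * np.eye(6)   # position error weighting from Table 1
R1 = 2.5 * np.eye(3)   # position control weighting from Table 1

# Placeholder signals: constant error and input over 10 s (not simulation results).
t = np.linspace(0.0, 10.0, 1001)
e = np.full((t.size, 6), 0.1)
u = np.full((t.size, 3), 0.2)
V_p = cumulative_cost(t, e, u, Q1, R1)
```

Returning the running integral rather than only its final value makes it possible to plot the cost against time and to compare schemes during the transient and in steady state separately.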
Figure 8 shows the time response of the tracking errors for the ADPFTNs, Backstepping, and LBRC, respectively. As shown in Figure 8, the steady-state tracking errors of the ADPFTNs were of the order of $10^{-5}$, an order of magnitude smaller than the $10^{-4}$ achieved by Backstepping and LBRC for the same convergence time. During convergence, Backstepping and LBRC had steeper convergence trends and faster transient responses, while the ADPFTNs converged smoothly. It is worth noting that steep convergence trends and fast transient responses lead to a higher control cost. Figure 9 illustrates the control cost of the three control schemes. It can be observed that the ADPFTNs were capable of reducing the control cost while guaranteeing better control performance. During 0–10 s, the tracking errors were still converging, and the ADPFTNs had a lower control cost than Backstepping and LBRC, which demonstrates the advantage of the ADPFTNs in reducing the control cost. After 10 s, the tracking errors had converged to a neighborhood of the origin, and there was no major difference in the control cost of the three control schemes. Compared to Backstepping, the total control cost of the position subsystem was reduced by 5.5% and the total control cost of the attitude subsystem was reduced by 38.9%. In addition, the control inputs of the position subsystem were larger than those of the attitude subsystem, which accordingly led to a larger control cost of the position subsystem. The 3D trajectory tracking results are illustrated in Figure 5. It can be observed that all three control schemes can successfully track the desired trajectory. Further, the red solid line shows a more stable tracking result than the blue dotted line (Backstepping in [11]) and the black dashed line (LBRC in [50]). Figure 10 shows the time response of the control inputs in the frame $\varsigma_E$. As shown in Figure 10, the initial control inputs of the ADPFTNs were the smallest among the three schemes. Smaller control inputs incur a smaller control cost, which corresponds to the results in Figure 9. This further illustrates the advantage of the ADPFTNs in balancing control performance and control cost.

4.3. Matlab/Simscape Simulation Results

In this subsection, the effectiveness of the ADPFTNs is further validated in the semi-physical simulation environment of Matlab/Simscape, where the mathematical model was replaced with a 3D physical model built in SolidWorks. The parameters of the quadrotor UAV 3D physical model, the external disturbances, and the parameter uncertainties were set the same as in the numerical simulations (Section 4.1 and Section 4.2). Figure 11 shows the time response of the tracking errors of the ADPFTNs in the Matlab/Simscape semi-physical simulation environment. It can be observed that all errors converged well. The control inputs of the semi-physical simulation are displayed in Figure 12. The Matlab/Simscape results in Figure 11 and Figure 12 were almost the same as the numerical simulation results, which indicates that the ADPFTNs have the potential to be transferred to engineering applications.

5. Conclusions

This paper investigated the ADP-based robust control problem of a quadrotor UAV trajectory tracking control system subject to external disturbances and parameter uncertainties. Firstly, the error subsystems were obtained by preprocessing the quadrotor UAV system using the feedforward technique. Subsequently, two FTNNOBs were designed to estimate external disturbances and parameter uncertainties quickly and accurately (observation errors converged within 0.5 s with an accuracy of the order of $10^{-4}$). Then, the AC NNs approximated the optimal value functions and the optimal control policies by means of two new weight update laws. These weight update laws not only avoided the PE condition, but also improved the adaptive ability of the AC NNs. Finally, the ADPFTNs control scheme, with high-accuracy tracking performance (tracking accuracy of the order of $10^{-5}$) and low control cost (total control cost savings of 5.5% for the position subsystem and 38.9% for the attitude subsystem), was obtained by combining the FTNNOBs and the approximation results of the AC NNs. Through Lyapunov stability analysis, the proposed control scheme was shown to guarantee that the closed-loop tracking control systems are UUB stable. In the future, we will conduct experiments to demonstrate the practicability of the developed control scheme and carry out research on multi-UAV systems. Meanwhile, we will further improve the adopted model to enhance its accuracy and applicability so that it better reflects the actual situation.

Author Contributions

Conceptualization, S.Y.; methodology, S.Y.; software, S.Y.; validation, S.Y.; formal analysis, H.L. and H.Z.; investigation, H.L.; resources, H.L.; data curation, S.Y. and H.M.; writing—original draft preparation, S.Y. and F.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by National Natural Science Foundation of China (Grant Number 62073212).

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, N.; Shao, X. Desired compensation RISE-based IBVS control of quadrotors for tracking a moving target. Nonlinear Dyn. 2019, 95, 2605–2624.
  2. Das, D.N.; Sewani, R.; Wang, J.; Tiwari, M.K. Synchronized truck and drone routing in package delivery logistics. IEEE Trans. Intell. Transp. Syst. 2021, 22, 5772–5782.
  3. Wu, Y.; Wu, S.B.; Hu, X.T. Cooperative path planning of UAVs & UGVs for a persistent surveillance task in urban environments. IEEE Internet Things J. 2021, 8, 4906–4919.
  4. Gajbhiye, S.; Cabecinhas, D.; Silvestre, C.; Cunha, R. Geometric finite-time inner-outer loop trajectory tracking control strategy for quadrotor slung-load transportation. Nonlinear Dyn. 2022, 107, 2291–2308.
  5. Li, B.; Gong, W.Q.; Yang, Y.S.; Xiao, B. Distributed fixed-time leader-following formation control for multi-quadrotors with prescribed performance and collision avoidance. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 7281–7294.
  6. Labbadi, M.; Cherkaoui, M. Robust adaptive backstepping fast terminal sliding mode controller for uncertain quadrotor UAV. Aerosp. Sci. Technol. 2019, 93, 105306.
  7. Michael, N.; Mellinger, D.; Lindsey, Q.; Kumar, V. The GRASP Multiple Micro-UAV Testbed. IEEE Robot Autom. Mag. 2010, 17, 56–65.
  8. Sun, Y.B.; Xian, N.; Duan, H.B. Linear-quadratic regulator controller design for quadrotor based on pigeon-inspired optimization. Aircr. Eng. Aerosp. Tec. 2016, 88, 761–770.
  9. Li, B.; Gong, W.Q.; Yang, Y.S.; Xiao, B.; Ran, D.C. Appointed Fixed Time Observer-Based Sliding Mode Control for a Quadrotor UAV Under External Disturbances. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 290–303.
  10. Zhao, Z.; Cao, D.; Yang, J.; Wang, H. High-order sliding mode observer-based trajectory tracking control for a quadrotor UAV with uncertain dynamics. Nonlinear Dyn. 2020, 102, 2583–2596.
  11. Xiao, B.; Yin, S. A New Disturbance Attenuation Control Scheme for Quadrotor Unmanned Aerial Vehicles. IEEE Trans. Industr. Inform. 2017, 13, 2922–2932.
  12. Tran, V.P.; Santoso, F.; Garratt, M.A. Adaptive Trajectory Tracking for Quadrotor Systems in Unknown Wind Environments Using Particle Swarm Optimization-Based Strictly Negative Imaginary Controllers. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 1742–1752.
  13. Alifbek, K.K.; Stepan, D.; Murodbek, S.; Dmitry, A.P.; Anvari, G.; Javod, A. Expert system application for reactive power compensation in isolated electric power systems. Int. J. Electr. Comput. Eng. 2021, 11, 3682–3691.
  14. Martyushev, N.V.; Malozyomov, B.V.; Khalikov, I.H.; Kukartsev, V.A.; Kukartsev, V.V.; Tynchenko, V.S.; Tynchenko, Y.A.; Qi, M. Review of Methods for Improving the Energy Efficiency of Electrified Ground Transport by Optimizing Battery Consumption. Energies 2023, 16, 729.
  15. Werbos, P.J. Consistency of HDP applied to a simple reinforcement learning problem. Neural Netw. 1990, 3, 179–189.
  16. Vamvoudakis, K.G.; Lewis, F.L. Online actor–critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 2010, 46, 878–888.
  17. Kamalapurkar, R.; Dinh, H.; Bhasin, S.; Dixon, W.E. Approximate optimal trajectory tracking for continuous-time nonlinear systems. Automatica 2015, 51, 40–48.
  18. Wang, D.; Liu, D.R.; Li, H.L. Policy Iteration Algorithm for Online Design of Robust Control for a Class of Continuous-Time Nonlinear Systems. IEEE Trans. Autom. Sci. Eng. 2014, 11, 627–632.
  19. Wen, G.X.; Ge, S.S.; Tu, F.W. Optimized Backstepping for Tracking Control of Strict-Feedback Systems. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 13.
  20. Zhao, B.; Liu, D.R.; Li, Y.C. Observer based adaptive dynamic programming for fault tolerant control of a class of nonlinear systems. Inf. Sci. 2017, 384, 21–33.
  21. Wei, Q.L.; Liu, D.R.; Lin, Q.; Song, R.Z. Adaptive Dynamic Programming for Discrete-Time Zero-Sum Games. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 957–969.
  22. Zhao, W.; Liu, H.; Lewis, F.L. Data-Driven Fault-Tolerant Control for Attitude Synchronization of Nonlinear Quadrotors. IEEE Trans. Automat. Control 2021, 66, 5584–5591.
  23. Chowdhary, G.; Yucelen, T.; Mühlegg, M.; Johnson, E.N. Concurrent learning adaptive control of linear systems with exponentially convergent bounds. Int. J. Adapt. Control Signal Process. 2013, 27, 280–301.
  24. Yang, X.; He, H. Self-learning robust optimal control for continuous-time nonlinear systems with mismatched disturbances. Neural Netw. 2018, 99, 19–30.
  25. Dong, H.Y.; Zhao, X.W.; Yang, H.Y. Reinforcement Learning-Based Approximate Optimal Control for Attitude Reorientation Under State Constraints. IEEE Trans. Control Syst. Technol. 2021, 29, 1664–1673.
  26. Liu, H.; Li, B.; Xiao, B.; Ran, D.C.; Zhang, C.X. Reinforcement learning-based tracking control for a quadrotor unmanned aerial vehicle under external disturbances. Int. J. Robust Nonlinear Control 2022, 33, 10360–10377.
  27. Sun, J.L.; Liu, C.S. Disturbance observer-based robust missile autopilot design with full-state constraints via adaptive dynamic programming. J. Frankl. Inst. 2018, 355, 2344–2368.
  28. Zhao, B.; Xu, S.Y.; Guo, J.Q.; Jiang, R.M.; Zhou, J. Integrated strapdown missile guidance and control based on neural network disturbance observer. Aerosp. Sci. Technol. 2019, 84, 170–181.
  29. Zhang, R.; Xu, B.; Shi, P. Finite time observer-based output feedback control of MEMS gyroscopes with input saturation. Int. J. Robust Nonlinear Control 2022, 32, 4300–4317.
  30. Zhao, Z.Y.; Jin, X.Z. Adaptive neural network-based sliding mode tracking control for agricultural quadrotor with variable payload. Comput. Electr. Eng. 2022, 103, 108336.
  31. Wang, D.D.; Zong, Q.; Tian, B.L.; Shao, S.K.; Zhang, X.Y.; Zhao, X.Y. Neural network disturbance observer-based distributed finite-time formation tracking control for multiple unmanned helicopters. ISA Trans. 2018, 73, 208–226.
  32. Liu, K.; Wang, R.J.; Wang, X.D.; Wang, X.X. Anti-saturation adaptive finite-time neural network based fault-tolerant tracking control for a quadrotor UAV with external disturbances. Aerosp. Sci. Technol. 2021, 115, 106790.
  33. Fan, Q.Y.; Yang, G.H. Adaptive actor–critic design-based integral sliding-mode control for partially unknown nonlinear systems with input disturbances. IEEE Trans. Neural Netw. Learn. Syst. 2016, 27, 165–177.
  34. Liu, X.; Zhao, B.; Liu, D.R. Fault tolerant tracking control for nonlinear systems with actuator failures through particle swarm optimization-based adaptive dynamic programming. Appl. Soft Comput. 2020, 97, 106766.
  35. Adhyaru, D.M. State observer design for nonlinear systems using neural network. Appl. Soft Comput. 2012, 12, 2530–2537.
  36. Farid, K.; Zhen Yu, Y.; Kenzo, N. Guidance and nonlinear control system for autonomous flight of minirotorcraft unmanned aerial vehicles. J. Field Robot. 2010, 27, 311–334.
  37. Abeywardena, D.; Kodagoda, S.; Dissanayake, G.; Munasinghe, R. Improved State Estimation in Quadrotor MAVs: A Novel Drift-Free Velocity Estimator. IEEE Robot. Autom. Mag. 2013, 20, 32–39.
  38. Tang, P.; Zhang, F.B.; Ye, J.C.; Lin, D.F. An integral TSMC-based adaptive fault-tolerant control for quadrotor with external disturbances and parametric uncertainties. Aerosp. Sci. Technol. 2021, 109, 106415.
  39. Xiao, B.; Yin, S. Exponential Tracking Control of Robotic Manipulators With Uncertain Dynamics and Kinematics. IEEE Trans. Industr. Inform. 2019, 15, 689–698.
  40. Li, B.; Zhang, H.C.; Xiao, B.; Wang, C.H.; Yang, Y.S. Fixed-time integral sliding mode control of a high-order nonlinear system. Nonlinear Dyn. 2022, 107, 909–920.
  41. Zhao, L.; Yu, J.P.; Lin, C.; Yu, H.S. Distributed adaptive fixed-time consensus tracking for second-order multi-agent systems using modified terminal sliding mode. Appl. Math. Comput. 2017, 312, 23–35.
  42. Yu, S.H.; Yu, X.H.; Shirinzadeh, B.; Man, Z.H. Continuous finite-time control for robotic manipulators with terminal sliding mode. Automatica 2005, 41, 1957–1964.
  43. Shao, K.; Zheng, J.C.; Huang, K.; Wang, H.; Man, Z.H.; Fu, M.Y. Finite-time control of a linear motor positioner using adaptive recursive terminal sliding mode. IEEE Trans. Ind. Electron. 2020, 67, 6659–6668.
  44. Song, Y.D.; Huang, X.C.; Wen, C.Y. Tracking Control for a Class of Unknown Nonsquare MIMO Nonaffine Systems: A Deep-Rooted Information Based Robust Adaptive Approach. IEEE Trans. Automat. Control 2016, 61, 3227–3233.
  45. Hornik, K.; Stinchcombe, M.; White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989, 2, 359–366.
  46. Girosi, F.; Poggio, T. Networks and the best approximation property. Biol. Cybern. 1990, 63, 169–176.
  47. Zhang, D.H.; Kong, L.H.; Zhang, S.; Li, Q.; Fu, Q. Neural networks-based fixed-time control for a robot with uncertainties and input deadzone. Neurocomputing 2020, 390, 139–147.
  48. Wen, G.X.; Chen, C.L.P.; Ge, S.S. Simplified Optimized Backstepping Control for a Class of Nonlinear Strict-Feedback Systems With Unknown Dynamic Functions. IEEE Trans. Cybern. 2021, 51, 4567–4580.
  49. Li, K.W.; Li, Y.M. Adaptive NN optimal consensus fault-tolerant control for stochastic nonlinear multiagent systems. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 947–957.
  50. Mu, C.X.; Zhang, Y. Learning-Based Robust Tracking Control of Quadrotor With Time-Varying and Coupling Uncertainties. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 259–273.
Figure 1. Mechanical structure of a quadrotor UAV.
Figure 2. Framework for ADP-based robust tracking control of the quadrotor UAV system.
Figure 3. Time response of the observation errors by different observers.
Figure 4. Time response of tracking errors under ADPFTNs and ADPNNs.
Figure 5. Three-dimensional trajectory tracking results under different control schemes.
Figure 6. Time responses of AC NNs’ weights of ADPFTNs.
Figure 7. Time response of control inputs under ADPFTNs and ADPNNs.
Figure 8. Time response of tracking errors under different schemes.
Figure 9. The position control cost and attitude control cost under different schemes.
Figure 10. Time responses of control inputs under different schemes.
Figure 11. Time response of tracking errors of ADPFTNs in Simscape.
Figure 12. Time response of control inputs of ADPFTNs in Simscape.
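Table 1. Parameters’ selection.

| Variable | Position Error System (i = 1) | Attitude Error System (i = 2) |
|---|---|---|
| $r_{i1}$ | $1 \times 10^{-6}$ | $1 \times 10^{-6}$ |
| $r_{i2}$ | $1 \times 10^{-5}$ | $1 \times 10^{-5}$ |
| $k_{i1}$ | 0.01 | 0.01 |
| $k_{i2}$ | 1.2 | 1.2 |
| $Q_i$ | $4I_6$ | $10I_6$ |
| $R_i$ | $2.5I_3$ | $5I_3$ |
| $r_{i3}$ | 1.6 | 1.6 |
| $r_{i4}$ | 0.2 | 0.2 |
| $c_i$ | 200 | 700 |
| $\varrho_{i1}$ | 0.06 | 0.06 |
| $m_i$ | 97 | 99 |
| $n_i$ | 99 | 101 |
| $p_i$ | 105 | 105 |
| $q_i$ | 99 | 101 |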
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Yang, S.; Yu, F.; Liu, H.; Ma, H.; Zhang, H. Adaptive-Dynamic-Programming-Based Robust Control for a Quadrotor UAV with External Disturbances and Parameter Uncertainties. Appl. Sci. 2023, 13, 12672. https://doi.org/10.3390/app132312672
