Backstepping-Based Finite-Horizon Optimization for Pitching Attitude Control of Aircraft

Ang Li; Yaohua Shen; Bin Du

doi:10.3390/aerospace12080653

Abstract

In this paper, the problem of pitching attitude finite-horizon optimization for aircraft is posed with system uncertainties, external disturbances, and input constraints. First, a neural network (NN) and a nonlinear disturbance observer (NDO) are employed to estimate the system uncertainties and external disturbances. Taking input constraints into account, an auxiliary system is designed to compensate for the constrained input. Subsequently, the backstepping control containing the NN and NDO is used to ensure the stability of the system and suppress the adverse effects caused by the system uncertainties and external disturbances. In order to avoid differentiating the virtual control in the backstepping process, a dynamic surface control (DSC) technique is utilized. Simultaneously, the estimations of the NN and NDO are applied to derive the backstepping control law. To achieve finite-horizon optimization for pitching attitude control, adaptive dynamic programming (ADP) with a single NN, termed the critic, is applied to obtain the optimal control. Time-varying feature functions are applied to construct the critic NN in order to approximate the value function in the Hamilton–Jacobi–Bellman (HJB) equation. Furthermore, a supplementary term is added to the weight update law to minimize the terminal constraint. Lyapunov stability theory is used to prove that the signals in the control system are uniformly ultimately bounded (UUB). Finally, simulation results illustrate the effectiveness of the proposed finite-horizon optimal attitude control method.

Keywords:

pitching attitude control; backstepping control; finite-horizon optimization; adaptive method

1. Introduction

In flight control, the problem of finite-horizon optimization for pitching attitude tracking control of aircraft can be treated as the pitching attitude tracking the command signal with the desired finite-horizon optimal index. To resolve the above problem, both the tracking control and finite-horizon optimization should be taken into consideration, which makes the overall problem significantly difficult.

In order to achieve attitude tracking for aircraft, much research has been conducted. In [1], an adaptive second-order sliding mode control method was presented in order to improve the performance in the presence of external disturbances. In [2], a gain-scheduling control method was proposed for linear parameter-varying systems with multi-input multi-output in order to obtain a satisfactory performance in the bank-to-turn control of aircraft. In [3], nonlinear dynamic inversion technology was proposed for the design of a supermaneuverable aircraft control. In [4], backstepping control was employed to execute the attitude tracking control of a mini unmanned aerial vehicle.

Among the control methods above, the backstepping control scheme is widely used owing to some advantages. First, virtual control is designed separately for each subsystem in the process of backstepping control to reduce the complexity of high-order system control. Subsequently, the backstepping control can be combined with other control methods such as sliding mode control, NN, adaptive control method, and disturbance observer to improve the control performance. In [5], a robust backstepping control scheme combining sliding mode control and neural network (NN) was proposed to achieve the reentry attitude tracking control of a near-space hypersonic vehicle in the presence of parameter variations and external disturbances. In [6], a deep convolutional NN-based backstepping method was used to identify system uncertainties and hidden states in attitude control in order to enhance the robustness. In [7], an auxiliary system-based backstepping control was constructed for the aircraft subject to the input saturation problem caused by wing rock. In [8], a finite-time convergence backstepping control scheme was designed. In the scheme, a finite-time observer and finite-time auxiliary system were used to suppress the effects of unsteady aerodynamic disturbances and compensate for the effect of input saturation, respectively. However, the aforementioned attitude control methods do not take into account the optimal control that meets some desired index. Actually, the optimal control problem is intractable, especially for a nonlinear system. Hence, there is relatively little research on the attitude control of aircraft in an optimal way, which is a nonlinear optimization problem in nature, not to mention the finite-horizon optimization for attitude tracking control. Thus, how to control the aircraft attitude in an optimal way should be further studied.

Quadratic optimal control is an optimal control method applied earlier in flight control. A given quadratic index is used to control the system with a desired optimal performance. In [9], a nonlinear system was divided into two parts of a linear nominal system and compound disturbances. Then, a linear quadratic regulator was designed to control the linear nominal system, while a robust control was derived to compensate for the effects caused by compound disturbances. In order to cope with the problem of recovering open-loop singular values in the quadratic optimal control, the LQG/LTR technique was applied for a multivariable vertical short take-off and landing aircraft linear system in [10]. However, the above control methods can only be applied to linear systems. For a nonlinear system, Hamilton–Jacobi–Bellman (HJB) equations without analytical solutions need to be solved, which makes it intractable to execute optimal control.

To cope with the problem of solving HJB equations for a nonlinear system, some numerical methods were applied to approximate the solution. In [11], a dynamic programming algorithm was presented, which is supposed to be solved in an off-line manner. In [12], a recursive optimization approach was proposed for a nonlinear system. In [13], a state-dependent Riccati equation (SDRE) method, which used a parameterization technique to convert the nonlinear system into a linear structure with state-dependent coefficients, was proposed to deal with the problem of nonlinear optimization. Nevertheless, a heavy computational burden is the main obstacle to applying the above three methods to nonlinear optimization. Inspired by the dynamic programming algorithm, the ADP algorithm was proposed in [14]. Compared with the dynamic programming algorithm, a critic NN was constructed to approximate the value function in order to solve the HJB equation forward-in-time in the ADP algorithm. Thus, the heavy computational burden was avoided, and on-line optimization was achieved.

The ADP algorithm, characterized by strong abilities of self-learning and adaptivity, has received significantly increased attention and has become an important intelligent optimal control method for nonlinear systems [15]. Due to its advantage of a low calculation cost, ADP has been applied in flight control. In [16], an adaptive critic design (ACD)-based optimal control algorithm was proposed. Under the premise of ensuring system stability, the ACD algorithm was utilized to improve the control performance of the system. In [17], a constrained ADP approach and linear parameter-varying technique were employed to guarantee the closed-loop stability and excellent control performance of the flight with aerodynamic parameter uncertainties and actuator failures. In [18], an incremental ADP algorithm was proposed to control the attitude tracking of spacecraft. In [19], an integral sliding-mode control based adaptive actor–critic algorithm was developed to guarantee the optimal control for sliding-mode dynamics online. As discussed above, the backstepping control method is favored by researchers due to its advantages. Because of its feature of easy combination with other control methods, the backstepping-based ADP scheme has been applied in many works. In [20], a backstepping-based ADP algorithm was developed to solve the problem of missile-target interception with state and input constraints. In the scheme of backstepping, a barrier Lyapunov function was used in the virtual controller design process for each subsystem to guarantee the state constraints, and an auxiliary system was designed to compensate for the constrained input. In [21], a backstepping-based ADP algorithm with zero-sum differential game method was applied to the zero-sum game problem for a missile and target. 
The zero-sum differential game technique was applied in the scheme of ADP algorithm in order to control the missile and target in an optimal way, and a critic network was constructed to approximate the value function in Hamilton–Jacobi–Isaacs (HJI) in order to achieve optimization online. In [22], an NN-based optimal control scheme was proposed for the near-space vehicle attitude tracking control. In the scheme, the NN and NDO were designed in a backstepping scheme to approximate the system uncertainties and external disturbances, while the critic network was constructed to approximate the value function in the HJB equation. However, it should be noted that the developments of the ADP algorithm above mainly address only the problem of infinite-horizon optimization. In fact, it is required to control in a finite-horizon optimal way for many systems, especially for flight systems.

Compared with infinite-horizon optimal control, finite-horizon optimal control is considered to be more challenging. First, the value function of the finite-horizon optimal control system is time-to-go-dependent, which leads to a time-varying associated HJB equation. Hence, it is more difficult to solve the HJB equation. Second, the terminal constraint should be taken into account for finite-horizon optimization [23]. For the purpose of addressing the issues above, some research has been conducted. In [24], time-dependent weights and state-dependent feature functions were incorporated to construct an NN in order to approximate the time-to-go-dependent value function, and the least-square-based gradient descent method was utilized to update the weights off-line. In contrast, an NN consisting of constant weights and time-state-dependent feature functions was designed to achieve the approximation of the value function in the HJB equation online in [25]. Nevertheless, the constrained input and system uncertainties were not taken into account. Considering input constraints, a non-quadratic function was utilized to eliminate the input constraints in [26]. Regarding system uncertainties, an online NN identifier was designed to approximate system uncertainties, and an actor–critic algorithm was introduced to solve the HJB equation to guarantee that the system was controlled with the optimal index in finite-horizon in [27]. Unfortunately, most of the research considered discrete systems. Furthermore, the constrained input, system uncertainties, and external disturbances are not considered in the finite-horizon optimal control together, which limits its application in flight control.

In order to address the problem of finite-horizon optimization pitching attitude tracking with system uncertainties, external disturbances, and input constraints, a novel backstepping-based finite-horizon optimization is developed in this work. The backstepping scheme, in which NN and NDO are employed to estimate the system uncertainties and external disturbances and an auxiliary system is designed to compensate for the constrained input, is introduced to ensure the stability of the system. The ADP algorithm with a critic NN that consists of constant weights and time-state-dependent feature functions is employed to obtain finite-horizon optimal control. A novel updating law of the critic NN weights is derived to solve the HJB equation and minimize the terminal constraints. Furthermore, the Lyapunov stability method is applied to prove that the signals in the control system are UUB. Finally, simulation results illustrate the effectiveness of the proposed control scheme.

The main contributions of this paper include the following:

(1): A backstepping-based ADP scheme is used to achieve finite-horizon optimal control. In the backstepping control scheme, NN is applied to approximate system uncertainties, while NDO is employed to estimate external disturbances. The ADP is used to control the nominal system in a finite-horizon optimal manner. Due to the integration of the backstepping method and the advantages of ADP, the backstepping-based finite-horizon optimization ADP scheme is promising for pitching attitude tracking control.
(2): A novel updating law of the critic NN weights is derived in order to satisfy the terminal constraints, relax the requirement of an initial admissible control, and guarantee the stability of the system.

The rest of the paper is organized as follows. Section 2 formally states the preliminaries of the research object of this paper. The designs for backstepping control and finite-horizon optimal control are given in Section 3 and Section 4, respectively. Then, the stability analysis is developed in Section 5. The simulation results are presented in Section 6. Finally, the conclusions of the paper are given in the last section.

Notations. Throughout the paper,

R^{m \times n}

stands for the set of all

m \times n

real matrices.

\nabla_{x} f

stands for the gradient of f with respect to x such as

\nabla_{x} f = \frac{\partial f}{\partial x}

.

\mathrm{sign} (\cdot)

stands for the sign function.

2. Problem Descriptions and Preliminaries

Before giving the attitude dynamics of an aircraft, the model parameters of the aircraft are listed in Table 1. Neglecting unsteady aerodynamics, the longitudinal attitude dynamics of an aircraft with system uncertainties and external disturbances can be described as follows [7].

\dot{α} = f_{1} (α) + q + Δ f_{1} (α) + d_{1},

(1)

\dot{q} = f_{2} (α, q) + g_{2} δ_{z} (u) + Δ f_{2} (α, q) + d_{2},

(2)

where

f_{1} (α) \in R^{1}

and

f_{2} (α, q) \in R^{1}

are the known internal system dynamics.

g_{2} \in R^{1}

is the known control coefficient.

u \in R^{1}

is the unconstrained control to be designed.

Δ f_{1} (α) \in R^{1}

and

Δ f_{2} (α, q) \in R^{1}

are the unknown system uncertainties.

d_{1} \in R^{1}

and

d_{2} \in R^{1}

are the unknown external disturbances.

f_{1} (α)

,

f_{2} (α, q)

,

g_{2}

,

δ_{z} (u)

are given by

f_{1} (α) = \frac{1}{M V} (- L - T sin α + M g cos γ),

(3)

f_{2} (α, q) = \frac{\bar{q} S \bar{c} C_{m}}{I_{y y}},

(4)

g_{2} = \frac{1}{I_{y y}} x_{T} T \frac{π}{180},

(5)

δ_{z} (u) = \begin{cases} u_{M} \mathrm{sign} (u), & |u| \geq u_{M} \\ u, & |u| < u_{M} \end{cases},

(6)

where

u_{M}

is a known boundary of u.
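The saturation in Equation (6) can be sketched in a few lines of Python. This is a minimal illustration only; the function name `saturate` and the numerical bound used in the usage lines are assumptions for the example, not values from the paper.

```python
import math


def saturate(u: float, u_max: float) -> float:
    """Constrained input delta_z(u) of Equation (6).

    Returns u_max * sign(u) when |u| >= u_max, and u otherwise.
    """
    if abs(u) >= u_max:
        return math.copysign(u_max, u)
    return u


# Hypothetical bound u_M = 25 (units as in the aircraft model).
inside = saturate(10.0, 25.0)      # unsaturated: passes through
above = saturate(30.0, 25.0)       # clipped at +u_M
below = saturate(-30.0, 25.0)      # clipped at -u_M
```

Inside the bound the commanded and applied inputs coincide, so the mismatch between them is non-zero only while the actuator is saturated.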

Table 1. The illustration of aircraft model parameters.

In this paper, the control objective is to design a controller u, so that the angle of attack

α

is driven to track a desired signal

α_{c}

in a finite-horizon optimal manner, and all the signals in the control system (1) and (2) are uniformly ultimately bounded (UUB). To illustrate the design of the proposed method, the control block diagram is shown in Figure 1. To deal with the external disturbances and the system uncertainties, the NDO and the NN are designed together with the backstepping control. To handle the input constraints, the auxiliary system is used in the forward control, which transforms the longitudinal attitude dynamics of the aircraft into the nominal form. To carry out optimal control, the ADP-based finite-horizon optimal control method is designed with the critic NN.

Figure 1. The control block diagram of the proposed method.

For the design of the controller hereinafter, the following assumption is required.

Assumption 1.

In Equations (1) and (2), the system uncertainties

Δ f_{1} (α)

and

Δ f_{2} (α, q)

as well as the external disturbances

d_{1}

and

d_{2}

are differentiable. In addition, the first derivatives of

d_{1}

and

d_{2}

are bounded such as

|{\dot{d}}_{1}| \leq {\dot{d}}_{1 M}

. Furthermore,

g_{2}

is invertible and bounded such as

0 < |g_{2}| \leq g_{2 M}

.

Remark 1.

Investigating the expression of

g_{2}

shown as Equation (5), the boundedness of

I_{y y}

,

x_{T}

, and T yields the boundedness of

g_{2}

, and the non-zero values of

x_{T}

and T make

g_{2}

invertible. Thus, Assumption 1 is reasonable.

3. Design for Backstepping Control

In this section, a backstepping method with NN and NDO is derived to design a forward controller. The NN is constructed to approximate the system uncertainties while the NDO is designed to approximate the external disturbances. Then, the system comprising (1) and (2) is transformed into a nominal system to be controlled in a finite-horizon optimal manner by ADP.

In order to obtain satisfactory control performance, the negative effects caused by unknown system uncertainties must be eliminated. According to the NN theory, the system uncertainties

Δ f_{1} (α)

and

Δ f_{2} (α, q)

can be approximated as [22]

Δ f_{1} (α) = L_{1}^{- 1} (W_{1}^{* T} a_{1} (α) + r_{1}^{*}),

(7)

Δ f_{2} (α, q) = L_{2}^{- 1} (W_{2}^{* T} a_{2} (α, q) + r_{2}^{*}),

(8)

where

L_{1} \in R^{1}

,

L_{2} \in R^{1}

are the parameters to be designed.

a_{1} (α) \in R^{n_{1}}

and

a_{2} (α, q) \in R^{n_{2}}

are basis functions.

n_{1}

and

n_{2}

are the numbers of the basis functions of

a_{1} (α)

and

a_{2} (α, q)

, respectively.

W_{1}^{*} \in R^{n_{1}}

and

W_{2}^{*} \in R^{n_{2}}

are the desired weight vectors.

r_{1}^{*}

and

r_{2}^{*}

are the approximation errors of the NN.

Assumption 2

([20]).

W_{1}^{*}

and

W_{2}^{*}

are both bounded as

∥W_{1}^{*}∥ \leq W_{1 M}^{*}

,

∥W_{2}^{*}∥ \leq W_{2 M}^{*}

.

Substituting Equations (7) and (8) into (1) and (2) yields

\dot{α} = f_{1} (α) + q + L_{1}^{- 1} W_{1}^{* T} a_{1} (α) + D_{1},

(9)

\dot{q} = f_{2} (α, q) + g_{2} δ_{z} (u) + L_{2}^{- 1} W_{2}^{* T} a_{2} (α, q) + D_{2},

(10)

where

D_{1} = L_{1}^{- 1} r_{1}^{*} + d_{1} \in R^{1}

and

D_{2} = L_{2}^{- 1} r_{2}^{*} + d_{2} \in R^{1}

are treated as compound disturbances [22].

For the purpose of compensating for input constraints, an auxiliary system is designed as

{\dot{S}}_{2} = - k_{a u x} S_{2} + g_{2} Δ,

(11)

where

Δ = δ_{z} (u) - u

.

S_{2} \in R^{1}

is an auxiliary control signal.

k_{a u x}

is the parameter to be designed.
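The behaviour of the auxiliary system (11) can be illustrated with a short forward-Euler simulation. This is a sketch under stated assumptions: the values of g_2, k_aux, u_M, the step size, and the commanded input profile are all hypothetical and chosen only to make the effect visible.

```python
def saturate(u, u_max):
    """Clip u to the interval [-u_max, u_max], as in Equation (6)."""
    return max(-u_max, min(u_max, u))


def aux_step(s2, u, g2, k_aux, u_max, dt):
    """One forward-Euler step of S2_dot = -k_aux * S2 + g2 * Delta,
    with Delta = delta_z(u) - u."""
    delta = saturate(u, u_max) - u
    return s2 + dt * (-k_aux * s2 + g2 * delta)


def run(u_profile, g2=1.0, k_aux=5.0, u_max=25.0, dt=1e-3):
    """Integrate the auxiliary state over a commanded-input profile."""
    s2 = 0.0
    history = []
    for u in u_profile:
        s2 = aux_step(s2, u, g2, k_aux, u_max, dt)
        history.append(s2)
    return history


# 1 s of a command above the bound (40 > 25), then 2 s of zero command.
profile = [40.0] * 1000 + [0.0] * 2000
history = run(profile)
s2_end_of_saturation = history[999]
s2_final = history[-1]
```

With these illustrative numbers, S_2 moves toward g_2 Δ / k_aux = -3 while the input is saturated and decays back toward zero once Δ vanishes, which is the sense in which the auxiliary signal records the saturation mismatch.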

Assumption 3

([8]). Δ is bounded such as

∥Δ∥ \leq \bar{Δ}

.

The error system is defined as [20,22]

z_{1} = α - α_{c},

(12)

z_{2} = q - q_{c} - S_{2},

(13)

where

q_{c} \in R^{1}

is the virtual control law to be designed.

Invoking Equations (9)–(11), the dynamics of the error system are derived as

{\dot{z}}_{1} = f_{1} (α) + q + L_{1}^{- 1} W_{1}^{* T} a_{1} (α) + D_{1} - {\dot{α}}_{c},

(14)

\begin{matrix} {\dot{z}}_{2} = f_{2} (α, q) + g_{2} u + L_{2}^{- 1} W_{2}^{* T} a_{2} (α, q) + D_{2} - {\dot{q}}_{c} + k_{a u x} S_{2} \end{matrix} .

(15)

For the purpose of achieving the backstepping control and finite-horizon optimal control, both of the virtual control

q_{c}

and the unconstrained control u are divided into two parts as

q_{c} = q_{c}^{a} + q_{c}^{*},

(16)

u = u^{a} + u^{*},

(17)

where

q_{c}^{a}

and

u^{a}

are the virtual control input and unconstrained control input in the backstepping scheme, respectively, and

q_{c}^{*}

and

u^{*}

are the virtual control input and unconstrained control input in the finite-horizon optimal control scheme, respectively.

To estimate the compound disturbances, NDOs are designed as [22]

\begin{cases} {\hat{D}}_{1} = ς_{1} + H_{1} (α) \\ {\dot{ς}}_{1} = - L_{1} (f_{1} (α) + q + {\hat{D}}_{1}) - {\hat{W}}_{1}^{T} a_{1} (α) + z_{1} \end{cases},

(18)

\begin{cases} {\hat{D}}_{2} = ς_{2} + H_{2} (q) \\ {\dot{ς}}_{2} = - L_{2} (f_{2} (α, q) + g_{2} δ_{z} (u) + {\hat{D}}_{2}) - {\hat{W}}_{2}^{T} a_{2} (α, q) + z_{2} \end{cases},

(19)

where

{\hat{D}}_{1}

and

{\hat{D}}_{2}

are the estimations of

D_{1}

and

D_{2}

, respectively.

H_{1} (α) \in R^{1}

and

H_{2} (q) \in R^{1}

are functions to be designed that satisfy

L_{1} = \partial H_{1} (α) / \partial α

and

L_{2} = \partial H_{2} (q) / \partial q

.

{\hat{W}}_{1}

and

{\hat{W}}_{2}

are the estimations of

W_{1}^{*}

and

W_{2}^{*}

, respectively.
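To make the observer structure in Equation (18) concrete, the following sketch runs it on a toy first-order plant. Several simplifications are assumed and are not from the paper: H_1(α) = L_1 α (so that ∂H_1/∂α = L_1), a hypothetical known dynamics f_1(α) = -α, q = 0, a constant disturbance, no model uncertainty (the NN term is zero), and the z_1 coupling term switched off so that the observer can be examined in isolation.

```python
def ndo_step(sigma, alpha, q, f1_val, nn_term, z1, L1, dt):
    """One forward-Euler step of the observer state in Equation (18).

    nn_term stands for W_hat_1^T a_1(alpha); H_1(alpha) = L1 * alpha is assumed.
    """
    d_hat = sigma + L1 * alpha
    sigma_dot = -L1 * (f1_val + q + d_hat) - nn_term + z1
    return sigma + dt * sigma_dot


def simulate(d_true=0.5, L1=10.0, dt=1e-3, steps=2000):
    """Toy plant alpha_dot = f1(alpha) + q + D with a constant disturbance D."""
    alpha, q = 0.0, 0.0
    sigma = -L1 * alpha  # makes the initial estimate D_hat(0) equal to zero
    for _ in range(steps):
        f1_val = -alpha  # hypothetical known dynamics f_1(alpha) = -alpha
        # nn_term = 0 and z1 = 0: simplifications for this illustration only
        sigma_next = ndo_step(sigma, alpha, q, f1_val, 0.0, 0.0, L1, dt)
        alpha = alpha + dt * (f1_val + q + d_true)
        sigma = sigma_next
    return sigma + L1 * alpha  # final estimate D_hat


d_hat_final = simulate()
```

Under these assumptions the estimation error obeys a first-order decay with rate L_1, so a larger L_1 gives faster convergence of the estimate to the true disturbance.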

We define the estimation errors as

{\tilde{W}}_{1} = {\hat{W}}_{1} - W_{1}^{*}

(20)

{\tilde{W}}_{2} = {\hat{W}}_{2} - W_{2}^{*}

(21)

{\tilde{D}}_{1} = D_{1} - {\hat{D}}_{1}

(22)

{\tilde{D}}_{2} = D_{2} - {\hat{D}}_{2} .

(23)

Invoking Equations (9), (10), (18), (19), (22), and (23), the dynamics of

{\tilde{D}}_{1}

and

{\tilde{D}}_{2}

can be written as

{\dot{\tilde{D}}}_{1} = {\dot{D}}_{1} - L_{1} {\tilde{D}}_{1} + {\tilde{W}}_{1}^{T} a_{1} (α) - z_{1}

(24)

{\dot{\tilde{D}}}_{2} = {\dot{D}}_{2} - L_{2} {\tilde{D}}_{2} + {\tilde{W}}_{2}^{T} a_{2} (α, q) - z_{2} .

(25)

Then, the backstepping control law can be designed as follows.

Step 1: Taking Equation (14) into account, the virtual control input

q_{c}^{a}

is designed as

q_{c}^{a} = - (k_{1} z_{1} + f_{1} (α_{c}) + L_{1}^{- 1} {\hat{W}}_{1}^{T} a_{1} (α) + {\hat{D}}_{1} - {\dot{α}}_{c}),

(26)

where

k_{1}

is the parameter to be designed.

The weights vector

{\hat{W}}_{1}

is updated as

{\dot{\hat{W}}}_{1} = Ω_{1}^{- 1} (a_{1} (α) z_{1}^{T} L_{1}^{- 1} - τ_{1} {\hat{W}}_{1}),

(27)

where

Ω_{1} \in R^{n_{1} \times n_{1}}

is the positive definite symmetric matrix to be designed [22].

τ_{1}

is the parameter to be designed.
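The weight update law (27) can be sketched as follows. The paper does not fix the basis functions at this point, so Gaussian radial basis functions are assumed here, and Ω_1 is assumed diagonal so that its inverse acts element-wise; the names `rbf_basis` and `weight_rate` and all numerical values are hypothetical.

```python
import math


def rbf_basis(x, centers, width):
    """Gaussian radial basis functions a_1(x) (an assumed choice of basis)."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def weight_rate(w_hat, alpha, z1, centers, width, omega_diag, L1, tau1):
    """Right-hand side of Equation (27) for a diagonal Omega_1:
    W_hat_dot = Omega_1^{-1} (a_1(alpha) * z1 / L1 - tau1 * W_hat)."""
    a = rbf_basis(alpha, centers, width)
    return [
        (a_i * z1 / L1 - tau1 * w_i) / om_i
        for a_i, w_i, om_i in zip(a, w_hat, omega_diag)
    ]


# With zero tracking error only the leakage term -tau1 * W_hat remains.
leak_only = weight_rate([1.0, 2.0], 0.0, 0.0, [-1.0, 1.0], 1.0, [2.0, 2.0], 1.0, 0.5)
# With zero weights and positive tracking error every weight increases.
drive_only = weight_rate([0.0, 0.0], 0.0, 0.3, [-1.0, 1.0], 1.0, [2.0, 2.0], 1.0, 0.5)
```

The first term drives the weights with the tracking error, while the τ_1 term pulls them back toward zero, which is what keeps the estimate bounded.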

Invoking Equations (13), (14), (16), and (26) yields

\begin{matrix} {\dot{z}}_{1} = f_{1} (α) - f_{1} (α_{c}) + q_{c}^{*} + z_{2} - k_{1} z_{1} - L_{1}^{- 1} {\tilde{W}}_{1}^{T} a_{1} (α) + {\tilde{D}}_{1} + S_{2} \end{matrix} .

(28)

In the normal backstepping scheme, it is inevitable to differentiate

q_{c}

. Nevertheless, due to the unknown information in the partial derivative of

q_{c}

, it is intractable to obtain this derivative. In order to avoid the differentiation, a dynamic surface control (DSC) technique is applied as [28]

τ \dot{λ} + λ = q_{c}, λ (0) = q_{c} (0),

(29)

where τ is the parameter to be designed. λ is a first-order filter in nature to approximate

q_{c}

such that

\dot{λ}

can substitute for

{\dot{q}}_{c}

.
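As an illustration of how the filter (29) replaces an analytic derivative, the sketch below passes a known test signal through it. The test signal q_c(t) = sin(t), the filter constant τ = 0.02, and the step size are assumptions for the example only.

```python
import math


def dsc_filter_step(lam, q_c, tau, dt):
    """One forward-Euler step of tau * lam_dot + lam = q_c (Equation (29)).

    Returns the updated filter state and the derivative estimate lam_dot.
    """
    lam_dot = (q_c - lam) / tau
    return lam + dt * lam_dot, lam_dot


tau, dt = 0.02, 1e-3
t = 0.0
lam = math.sin(t)  # lambda(0) = q_c(0)
lam_dot = 0.0
for _ in range(1000):
    lam, lam_dot = dsc_filter_step(lam, math.sin(t), tau, dt)
    t += dt

# lam_dot was evaluated at time t - dt; compare with the true derivative cos(t - dt).
derivative_error = abs(lam_dot - math.cos(t - dt))
```

After the initial transient, the estimate lags the true derivative by an amount of order τ, so a smaller τ gives a better substitute at the cost of a stiffer filter.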

We define the error as

e = λ - q_{c} .

(30)

Then, we have

\begin{matrix} \dot{e} = \dot{λ} + (- {\dot{q}}_{c}) \\ = - e / τ + (- \frac{\partial q_{c}}{\partial α} \dot{α} - \frac{\partial q_{c}}{\partial α_{c}} {\dot{α}}_{c} - \frac{\partial q_{c}}{\partial z_{1}} {\dot{z}}_{1} - \frac{\partial q_{c}}{\partial {\hat{W}}_{1}} {\dot{\hat{W}}}_{1} - \frac{\partial q_{c}}{\partial {\hat{D}}_{1}} {\dot{\hat{D}}}_{1} - \frac{\partial q_{c}}{\partial {\dot{α}}_{c}} {\ddot{α}}_{c} - \frac{\partial q_{c}}{\partial {\hat{W}}_{c}} {\dot{\hat{W}}}_{c}) \\ = - e / τ + M_{d} (z_{1}, z_{2}, e, {\hat{W}}_{1}, {\hat{D}}_{1}, α_{c}, {\dot{α}}_{c}, {\ddot{α}}_{c}) \end{matrix},

(31)

where

M_{d} (z_{1}, z_{2}, e, {\hat{W}}_{1}, {\hat{D}}_{1}, α_{c}, {\dot{α}}_{c}, {\ddot{α}}_{c}) = - \frac{\partial q_{c}}{\partial α} \dot{α} - \frac{\partial q_{c}}{\partial α_{c}} {\dot{α}}_{c} - \frac{\partial q_{c}}{\partial z_{1}} {\dot{z}}_{1} - \frac{\partial q_{c}}{\partial {\hat{W}}_{1}} {\dot{\hat{W}}}_{1} - \frac{\partial q_{c}}{\partial {\hat{D}}_{1}} {\dot{\hat{D}}}_{1} - \frac{\partial q_{c}}{\partial {\dot{α}}_{c}} {\ddot{α}}_{c} - \frac{\partial q_{c}}{\partial {\hat{W}}_{c}} {\dot{\hat{W}}}_{c}

.

{\hat{W}}_{c} \in R^{L}

is the weight vector of critic NN designed hereinafter.

Assumption 4

([28]).

M_{d} (z_{1}, z_{2}, e, {\hat{W}}_{1}, {\hat{D}}_{1}, α_{c}, {\dot{α}}_{c}, {\ddot{α}}_{c})

is a continuous function. For any

C_{1}

and

C_{2}

, the sets

Π_{1} : = {(α_{c}, {\dot{α}}_{c}, {\ddot{α}}_{c}) : α_{c}^{2} + {\dot{α}}_{c}^{2} + {\ddot{α}}_{c}^{2} \leq C_{1}}

and

Π_{2} : = {\sum_{j = 1}^{2} z_{j}^{2} + {\tilde{W}}_{1}^{T} Ω_{1} {\tilde{W}}_{1} + {\tilde{W}}_{c}^{T} {\tilde{W}}_{c} + e^{2} + {\tilde{D}}_{1}^{2} + {\ddot{α}}_{c}^{2} \leq C_{2}}

are compact in

R^{3}

and

R^{5 + n_{1} + L}

, respectively. Hence,

Π_{1} \times Π_{2}

is also compact. Considering the continuous property, the function

M_{d} (\cdot)

is bounded for the given initial conditions in the compact set

Π_{1} \times Π_{2}

such as

∥M_{d} (\cdot)∥ \leq M

.

We consider the Lyapunov function candidate as

V_{1} = \frac{1}{2} z_{1}^{2} + \frac{1}{2} e^{2} + \frac{1}{2} {\tilde{W}}_{1}^{T} Ω_{1} {\tilde{W}}_{1} + \frac{1}{2} {\tilde{D}}_{1}^{2}

(32)

Differentiating

V_{1}

and invoking Equations (20), (24), and (27)–(31) yields

\begin{matrix} {\dot{V}}_{1} = z_{1} (f_{1} (α) - f_{1} (α_{c}) + q_{c}^{*}) + z_{1} z_{2} - k_{1} z_{1}^{2} - \frac{e^{2}}{τ} + e (- {\dot{q}}_{c}) + z_{1} S_{2} \\ - τ_{1} {\tilde{W}}_{1}^{T} {\hat{W}}_{1} + {\tilde{D}}_{1} {\dot{D}}_{1} - L_{1} {\tilde{D}}_{1}^{2} + {\tilde{D}}_{1} {\tilde{W}}_{1}^{T} a_{1} (α) \end{matrix} .

(33)

In addition, we have

{\tilde{W}}_{1}^{T} {\hat{W}}_{1} \geq \frac{1}{2} {∥{\tilde{W}}_{1}∥}^{2} - \frac{1}{2} {∥W_{1}^{*}∥}^{2} .

(34)

Taking Assumption 4 and Young’s inequality into account and invoking inequality (34) yields

\begin{matrix} {\dot{V}}_{1} \leq z_{1} (f_{1} (α) - f_{1} (α_{c}) + q_{c}^{*}) - (k_{1} - 1) z_{1}^{2} + \frac{1}{2} z_{2}^{2} - (\frac{1}{τ} - \frac{1}{2}) e^{2} + \frac{1}{2} S_{2}^{2} \\ - \frac{1}{2} (τ_{1} - ι_{1}^{- 1}) {∥{\tilde{W}}_{1}∥}^{2} - (L_{1} - \frac{1}{2} - \frac{1}{2} ι_{1} a_{1 M}^{2}) {\tilde{D}}_{1}^{2} + \frac{1}{2} {\dot{D}}_{1}^{2} + \frac{1}{2} τ_{1} {∥W_{1}^{*}∥}^{2} + \frac{1}{2} M^{2} \end{matrix},

(35)

where

∥a_{1} (α)∥ \leq a_{1 M}

.

ι_{1}

is the parameter to be designed.
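For reference, one way to obtain inequality (35) from Equation (33) is to bound each cross term separately with Young's inequality. The bounds below are a sketch consistent with the coefficients in (35); they assume that the cross term in (33) between the disturbance error and the NN involves the weight error \tilde{W}_{1}, and they use -\dot{q}_{c} = M_{d} from Equation (31) together with the bound M from Assumption 4.

```latex
\begin{aligned}
z_{1} z_{2} &\leq \tfrac{1}{2} z_{1}^{2} + \tfrac{1}{2} z_{2}^{2}, \\
z_{1} S_{2} &\leq \tfrac{1}{2} z_{1}^{2} + \tfrac{1}{2} S_{2}^{2}, \\
e (-\dot{q}_{c}) = e M_{d} &\leq \tfrac{1}{2} e^{2} + \tfrac{1}{2} M^{2}, \\
\tilde{D}_{1} \dot{D}_{1} &\leq \tfrac{1}{2} \tilde{D}_{1}^{2} + \tfrac{1}{2} \dot{D}_{1}^{2}, \\
\tilde{D}_{1} \tilde{W}_{1}^{T} a_{1}(\alpha) &\leq \tfrac{1}{2} \iota_{1}^{-1} \|\tilde{W}_{1}\|^{2} + \tfrac{1}{2} \iota_{1} a_{1M}^{2} \tilde{D}_{1}^{2}.
\end{aligned}
```

Collecting terms gives the coefficient -(k_1 - 1) on z_1^2, -(1/τ - 1/2) on e^2, and -(L_1 - 1/2 - ι_1 a_{1M}^2 / 2) on \tilde{D}_1^2, while inequality (34) converts -τ_1 \tilde{W}_1^T \hat{W}_1 into the \|\tilde{W}_1\|^2 and \|W_1^*\|^2 terms.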

Step 2: Taking Equation (15) into account, the control input

u^{a}

is designed as

\begin{matrix} u^{a} = - g_{2}^{- 1} (f_{2} (α_{c}, q_{c}) - \dot{λ} + k_{2} z_{2} + L_{2}^{- 1} {\hat{W}}_{2}^{T} a_{2} (α, q) + {\hat{D}}_{2} + k_{a u x} S_{2}), \end{matrix}

(36)

where

k_{2}

is the parameter to be designed.

The weights vector

{\hat{W}}_{2}

is updated as

{\dot{\hat{W}}}_{2} = Ω_{2}^{- 1} (a_{2} (α, q) z_{2}^{T} L_{2}^{- 1} - τ_{2} {\hat{W}}_{2}),

(37)

where

Ω_{2} \in R^{n_{2} \times n_{2}}

is the positive definite symmetric matrix to be designed [22].

τ_{2}

is the parameter to be designed.

Invoking Equations (15), (17), (29)–(31), and (36) yields

\begin{matrix} {\dot{z}}_{2} = f_{2} (α, q) - f_{2} (α_{c}, q_{c}) + g_{2} u^{*} - L_{2}^{- 1} {\tilde{W}}_{2}^{T} a_{2} (α, q) + {\tilde{D}}_{2} - k_{2} z_{2} - \frac{e}{τ} - {\dot{q}}_{c} \end{matrix} .

(38)

We consider the Lyapunov function candidate as

V_{2} = \frac{1}{2} z_{2}^{2} + \frac{1}{2} {\tilde{W}}_{2}^{T} Ω_{2} {\tilde{W}}_{2} + \frac{1}{2} {\tilde{D}}_{2}^{2} + \frac{1}{2} S_{2}^{2} .

(39)

Differentiating

V_{2}

and invoking Equations (21), (25), (37), and (38) yields

\begin{matrix} {\dot{V}}_{2} = z_{2} (f_{2} (α, q) - f_{2} (α_{c}, q_{c}) + g_{2} u^{*}) - k_{2} z_{2}^{2} - z_{2} \frac{e}{τ} + z_{2} (- {\dot{q}}_{c}) \\ - τ_{2} {\tilde{W}}_{2}^{T} {\hat{W}}_{2} + {\tilde{D}}_{2} {\dot{D}}_{2} - L_{2} {\tilde{D}}_{2}^{2} + {\tilde{D}}_{2} {\tilde{W}}_{2}^{T} a_{2} (α, q) - k_{a u x} S_{2}^{2} + S_{2} g_{2} Δ \end{matrix} .

(40)

In addition, we have

\begin{matrix} {\tilde{W}}_{2}^{T} {\hat{W}}_{2} = \frac{1}{2} {∥{\tilde{W}}_{2}∥}^{2} + \frac{1}{2} {∥{\hat{W}}_{2}∥}^{2} - \frac{1}{2} {∥W_{2}^{*}∥}^{2} \geq \frac{1}{2} {∥{\tilde{W}}_{2}∥}^{2} - \frac{1}{2} {∥W_{2}^{*}∥}^{2} \end{matrix} .

(41)

Taking Assumptions 3 and 4 and Young’s inequality into account and invoking inequality (41) yields

\begin{matrix} {\dot{V}}_{2} \leq z_{2} (f_{2} (α, q) - f_{2} (α_{c}, q_{c}) + g_{2} u^{*}) - (k_{2} + \frac{1}{2 τ} - \frac{1}{2}) z_{2}^{2} - \frac{1}{2 τ} e^{2} - (\frac{τ_{2}}{2} - \frac{1}{2} ι_{2}^{- 1}) {∥{\tilde{W}}_{2}∥}^{2} \\ - (L_{2} - \frac{1}{2} - \frac{1}{2} ι_{2} a_{2 M}^{2}) {\tilde{D}}_{2}^{2} - (k_{a u x} - \frac{1}{2} θ_{a u x} g_{2}^{2}) S_{2}^{2} + \frac{1}{2} {\dot{D}}_{2}^{2} + \frac{τ_{2}}{2} {∥W_{2}^{*}∥}^{2} + \frac{1}{2} θ_{a u x}^{- 1} {\bar{Δ}}^{2} + \frac{1}{2} M^{2} \end{matrix},

(42)

where

∥a_{2} (α, q)∥ \leq a_{2 M}

.

ι_{2}

,

θ_{a u x}

are the parameters to be designed.

The nominal affine nonlinear system is defined as

\dot{Z} = F (Z) + G U,

(43)

where

Z = {[z_{1}, z_{2}]}^{T} \in R^{2}

(44)

F (Z) = {[f_{1} (α) - f_{1} (α_{c}), f_{2} (α, q) - f_{2} (α_{c}, q_{c})]}^{T} \in R^{2}

(45)

G = [\begin{matrix} 1 & 0 \\ 0 & g_{2} \end{matrix}] \in R^{2 \times 2}

(46)

U = {[q_{c}^{*}, u^{*}]}^{T} \in R^{2} .

(47)

We consider the Lyapunov function candidate

V_{b}

in the backstepping scheme as

V_{b} = V_{1} + V_{2} .

(48)

Differentiating

V_{b}

and invoking Equations (35) and (42)–(47) yields

\begin{matrix} {\dot{V}}_{b} \leq Z^{T} (F (Z) + G U) - (k_{1} - 1) z_{1}^{2} - (k_{2} + \frac{1}{2 τ} - \frac{1}{2}) z_{2}^{2} - (\frac{3}{2 τ} - \frac{1}{2}) e^{2} - (k_{a u x} - \frac{1}{2} - \frac{1}{2} θ_{a u x} g_{2}^{2}) S_{2}^{2} \\ - \frac{1}{2} (τ_{1} - ι_{1}^{- 1}) {∥{\tilde{W}}_{1}∥}^{2} - \frac{1}{2} (τ_{2} - ι_{2}^{- 1}) {∥{\tilde{W}}_{2}∥}^{2} - (L_{1} - \frac{1}{2} - \frac{1}{2} ι_{1} a_{1 M}^{2}) {\tilde{D}}_{1}^{2} - (L_{2} - \frac{1}{2} - \frac{1}{2} ι_{2} a_{2 M}^{2}) {\tilde{D}}_{2}^{2} \\ + \frac{1}{2} {\dot{D}}_{1}^{2} + \frac{1}{2} {\dot{D}}_{2}^{2} + \frac{1}{2} τ_{1} {∥W_{1}^{*}∥}^{2} + \frac{1}{2} τ_{2} {∥W_{2}^{*}∥}^{2} + \frac{1}{2} θ_{a u x}^{- 1} {\bar{Δ}}^{2} + M^{2} \end{matrix} .

(49)

Remark 2.

Based on Assumption 1, the first derivatives of

D_{1}

and

D_{2}

are bounded such as

∥{\dot{D}}_{1}∥ \leq {\dot{D}}_{1 M}

,

∥{\dot{D}}_{2}∥ \leq {\dot{D}}_{2 M}

. In addition, considering the optimal approximation property of the NN, the desired weight vectors are bounded. Thus, Assumption 2 is reasonable. Furthermore, if the difference Δ between the desired control input and saturation input is unbounded, the desirable attitude motion will be uncontrollable. Thus, Assumption 3 is reasonable. In addition, in terms of the compact property and the continuous property, which have been detailed in [28], Assumption 4 is reasonable. Detailed derivations of some equations above are provided in Appendix A.

4. Design for Finite-Horizon Optimal Control

In this section, an ADP based finite-horizon optimal control method is designed to make the nominal system (43) controlled in a finite-horizon optimal manner. In order to approximate the value function in the HJB equation, an NN consisting of constant weights and a time-state-dependent feature function is constructed. A novel weight updating law is proposed in order to minimize the objective function, remove the requirement for the initial admissible control, and guarantee the Lyapunov stability.

The objective of the finite-horizon optimal control is to minimize the finite-horizon cost function defined as

V (Z, t) = ψ (Z (t_{f}), t_{f}) + \int_{t}^{t_{f}} r (Z, U) d t,

(50)

where

ψ (Z (t_{f}), t_{f})

is the terminal constraint of the terminal state

Z (t_{f})

.

r (Z, U)

is the cost-to-go function defined as

r (Z, U) = Z^{T} Q Z + U^{T} R U,

(51)

where

Q > 0 \in R^{2 \times 2}

,

R > 0 \in R^{2 \times 2}

are symmetric positive definite matrices.

Similarly, the terminal cost function is defined as

V (Z, t_{f}) = ψ (Z (t_{f}), t_{f}) .

(52)
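The cost in Equation (50) can be evaluated numerically along a sampled trajectory, for example with the trapezoidal rule. The sketch below assumes diagonal Q and R, and a quadratic terminal term ψ = Z^T Z; the paper does not specify ψ at this point, so that choice and the sample trajectory are purely illustrative.

```python
def stage_cost(Z, U, Q_diag, R_diag):
    """r(Z, U) = Z^T Q Z + U^T R U of Equation (51) for diagonal Q and R."""
    state_part = sum(q * z * z for q, z in zip(Q_diag, Z))
    input_part = sum(r * u * u for r, u in zip(R_diag, U))
    return state_part + input_part


def finite_horizon_cost(Z_traj, U_traj, dt, Q_diag, R_diag, terminal_cost):
    """Terminal term plus a trapezoidal approximation of the integral in (50)."""
    running = 0.0
    for k in range(len(Z_traj) - 1):
        r_now = stage_cost(Z_traj[k], U_traj[k], Q_diag, R_diag)
        r_next = stage_cost(Z_traj[k + 1], U_traj[k + 1], Q_diag, R_diag)
        running += 0.5 * (r_now + r_next) * dt
    return terminal_cost(Z_traj[-1]) + running


# Constant trajectory over a 1 s horizon (11 samples, dt = 0.1).
Z_traj = [[1.0, 0.0]] * 11
U_traj = [[0.0, 0.0]] * 11
cost = finite_horizon_cost(
    Z_traj, U_traj, 0.1, [2.0, 1.0], [1.0, 1.0],
    lambda Z: Z[0] ** 2 + Z[1] ** 2,  # hypothetical terminal term psi
)
```

For this constant trajectory the running cost is 2 per unit time over 1 s and the assumed terminal term contributes 1, so the total is 3.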

Considering Equation (50), the Hamiltonian function of the nominal system (43) is given as

\begin{matrix} H (Z, U, V (Z, t)) = \nabla_{t} V (Z, t) + Z^{T} Q Z + U^{T} R U + \nabla_{Z}^{T} V (Z, t) (F (Z) + G U) \end{matrix} .

(53)

Then, the optimal cost function

V^{*} (Z, t)

satisfies the equation as [20]

min_{U} H (Z, U, V^{*} (Z, t)) = 0 .

(54)

According to Equation (54), the optimal control input

U^{*}

meets the conditions as

\frac{\partial H (Z, U, V^{*} (Z, t))}{\partial U} |_{U = U^{*}} = 0 .

(55)

Hence, the optimal control input

U^{*}

can be obtained as

U^{*} = - \frac{1}{2} R^{- 1} G^{T} \nabla_{Z} V^{*} (Z, t) .

(56)
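The step from Equation (55) to (56) follows from differentiating the Hamiltonian (53) with respect to U, which gives 2RU + G^T ∇_Z V^* = 0. The sketch below evaluates Equation (56) for hypothetical 2 × 2 matrices R and G and a hypothetical gradient vector, and compares the U-dependent part of the Hamiltonian at U^* with its value at perturbed inputs. All numerical values and helper names are assumptions for illustration.

```python
def mat_vec(M, x):
    """Matrix-vector product for nested lists."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def inv2(M):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def optimal_input(R, G, grad_V):
    """U* = -1/2 R^{-1} G^T grad_V, Equation (56)."""
    v = mat_vec(inv2(R), mat_vec(transpose(G), grad_V))
    return [-0.5 * vi for vi in v]


def hamiltonian_u_part(U, R, G, grad_V):
    """U-dependent part of Equation (53): U^T R U + grad_V^T G U."""
    RU = mat_vec(R, U)
    GU = mat_vec(G, U)
    return sum(u * ru for u, ru in zip(U, RU)) + sum(
        g * gu for g, gu in zip(grad_V, GU)
    )


# Hypothetical values, not the aircraft model of this paper.
R = [[2.0, 0.0], [0.0, 1.0]]
G = [[1.0, 0.0], [0.0, 3.0]]
grad_V = [0.4, -0.2]
U_star = optimal_input(R, G, grad_V)
H_star = hamiltonian_u_part(U_star, R, G, grad_V)
```

Because R is positive definite, any perturbation δU raises the Hamiltonian by δU^T R δU, so the stationary point is indeed the minimizer required by Equation (54).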

Invoking Equation (56) into (53) and considering (54) yields

\begin{matrix} H (Z, U^{*}, V^{*} (Z, t)) \\ = \nabla_{t} V^{*} (Z, t) + Z^{T} Q Z + \nabla_{Z}^{T} V^{*} (Z, t) F (Z) - \frac{1}{4} \nabla_{Z}^{T} V^{*} (Z, t) G R^{- 1} G^{T} \nabla_{Z} V^{*} (Z, t) = 0 \end{matrix} .

(57)
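To make Equation (57) concrete, consider a hypothetical scalar linear system with F(Z) = aZ, a scalar input gain g, and weights q and r (none of which come from this paper). With the ansatz V^*(Z, t) = p(t) Z^2, Equation (57) reduces to the scalar Riccati equation dp/dt = -(q + 2ap - g^2 p^2 / r) with p(t_f) fixed by the terminal constraint. The sketch below integrates this equation backward in time with a classical fourth-order Runge–Kutta scheme.

```python
import math


def riccati_rhs(p, a, g, q, r):
    """dp/dt obtained from the scalar form of Equation (57)."""
    return -(q + 2.0 * a * p - (g * g / r) * p * p)


def solve_backward(a, g, q, r, p_tf, t_f, steps=2000):
    """Integrate p(t) from t = t_f back to t = 0 with classical RK4."""
    h = -t_f / steps  # negative step: march from t_f toward 0
    p = p_tf
    for _ in range(steps):
        k1 = riccati_rhs(p, a, g, q, r)
        k2 = riccati_rhs(p + 0.5 * h * k1, a, g, q, r)
        k3 = riccati_rhs(p + 0.5 * h * k2, a, g, q, r)
        k4 = riccati_rhs(p + h * k3, a, g, q, r)
        p += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p


# Hypothetical parameters; zero terminal constraint, i.e. p(t_f) = 0.
a, g, q, r = -1.0, 1.0, 1.0, 1.0
p_short = solve_backward(a, g, q, r, p_tf=0.0, t_f=1.0)
p_long = solve_backward(a, g, q, r, p_tf=0.0, t_f=20.0)
# Stationary (infinite-horizon) solution of the same Riccati equation.
p_inf = r * (a + math.sqrt(a * a + q * g * g / r)) / (g * g)
```

In this toy case the optimal input (56) becomes U^* = -(g / r) p(t) Z, so the feedback gain depends on the time to go; it approaches the infinite-horizon gain only when the remaining horizon is long. This time dependence is what motivates a time-varying value function approximation.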

We rewrite the optimal cost function

V^{*} (Z, t)

by NN as

V^{*} (Z, t) = W_{c}^{T} b_{c} (Z, t_{f} - t) + ε (Z, t),

(58)

where

b_{c} (Z, t_{f} - t) \in R^{L}

is the vector of basis functions, and L is the number of basis functions.

W_{c} \in R^{L}

is the weight vector.

ε (Z, t)

is the approximation error.

Similarly, the terminal optimal cost function

V^{*} (Z, t_{f})

can be written as

V^{*} (Z, t_{f}) = W_{c}^{T} b_{c} (Z (t_{f}), 0) + ε (Z, t_{f}) .

(59)

The gradients of

V^{*} (Z, t)

with respect to t and Z are

\nabla_{t} V^{*} (Z, t) = \nabla_{t}^{T} b_{c} (Z, t_{f} - t) W_{c} + \nabla_{t} ε (Z, t),

(60)

\nabla_{Z} V^{*} (Z, t) = \nabla_{Z}^{T} b_{c} (Z, t_{f} - t) W_{c} + \nabla_{Z} ε (Z, t) .

(61)

Assumption 5

([20,22,25,26,29,30]).

W_{c}

,

ε (Z, t)

,

\nabla_{t} ε (Z, t)

,

\nabla_{Z} ε (Z, t)

,

b_{c} (Z, t_{f} - t)

,

\nabla_{t} b_{c} (Z, t_{f} - t)

, and

\nabla_{Z} b_{c} (Z, t_{f} - t)

are all bounded such that

∥W_{c}∥ \leq W_{c M}

,

∥ε (Z, t_{f})∥ \leq ε_{M}

,

∥\nabla_{t} ε (Z, t)∥ \leq ε_{t M}^{'}

,

∥\nabla_{Z} ε (Z, t)∥ \leq ε_{Z M}^{'}

,

∥b_{c} (Z, t_{f} - t)∥ \leq b_{M}^{'}

,

∥\nabla_{t} b_{c} (Z, t_{f} - t)∥ \leq b_{t M}^{'}

,

∥\nabla_{Z} b_{c} (Z, t_{f} - t)∥ \leq b_{Z M}^{'}

.

Invoking Equation (61) into (56) yields

\begin{matrix} U^{*} = - \frac{1}{2} R^{- 1} G^{T} \nabla_{Z}^{T} b_{c} (Z, t_{f} - t) W_{c} - \frac{1}{2} R^{- 1} G^{T} \nabla_{Z} ε (Z, t) \end{matrix} .

(62)

Invoking Equations (60)–(62) into (57) yields

\begin{matrix} H (Z, U^{*}, V^{*} (Z, t)) \\ = \nabla_{t}^{T} b_{c} (Z, t_{f} - t) W_{c} + Z^{T} Q Z + W_{c}^{T} \nabla_{Z} b_{c} (Z, t_{f} - t) F (Z) - \frac{1}{4} W_{c}^{T} X W_{c} + ε_{H J B} = 0 \end{matrix},

(63)

where

\begin{matrix} X = \nabla_{Z} b_{c} (Z, t_{f} - t) G R^{- 1} G^{T} \nabla_{Z}^{T} b_{c} (Z, t_{f} - t) \in R^{L \times L} \end{matrix},

(64)

\begin{matrix} ε_{H J B} = \nabla_{t} ε (Z, t) + \nabla_{Z}^{T} ε (Z, t) (F (Z) + G U^{*}) + \frac{1}{4} \nabla_{Z}^{T} ε (Z, t) G R^{- 1} G^{T} \nabla_{Z} ε (Z, t) \end{matrix} .

(65)

Lemma 1

([31]). The nominal system (43) is asymptotically stable under the control

U = - \frac{1}{2} ζ R^{- 1} G^{T} \nabla_{Z} V^{*} (Z, t),

(66)

where

ζ \geq \frac{1}{2}

.

Assumption 6.

X

and

ε_{H J B}

are both bounded as

∥X∥ \leq X_{M}

and

∥ε_{H J B}∥ \leq ε_{H J B M}

.

Since

W_{c}

is unknown, a critic NN is constructed to approximate the optimal cost function as

\hat{V} (Z, t) = {\hat{W}}_{c}^{T} b_{c} (Z, t_{f} - t),

(67)

where

{\hat{W}}_{c}

is the estimation of

W_{c}

.

We define the estimation error

{\tilde{W}}_{c} = W_{c} - {\hat{W}}_{c} .

(68)

Invoking Equation (67), the gradients of the optimal cost function with respect to t and Z can be approximated as

\nabla_{t} \hat{V} (Z, t) = \nabla_{t}^{T} b_{c} (Z, t_{f} - t) {\hat{W}}_{c},

(69)

\nabla_{Z} \hat{V} (Z, t) = \nabla_{Z}^{T} b_{c} (Z, t_{f} - t) {\hat{W}}_{c} .

(70)

Invoking Equations (56) and (70), the estimation of the optimal control input can be written as

\hat{U} = - \frac{1}{2} R^{- 1} G^{T} \nabla_{Z}^{T} b_{c} (Z, t_{f} - t) {\hat{W}}_{c},

(71)

where

\hat{U} = {[{\hat{q}}_{c}^{*}, {\hat{u}}^{*}]}^{T} \in R^{2}

.
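As a minimal sketch of Equations (64) and (71), the code below forms X and \hat{U} for a hypothetical critic with L = 3 basis functions and checks the identity \hat{W}_c^T ∇_Z b_c G \hat{U} = -(1/2) \hat{W}_c^T X \hat{W}_c, which follows directly from these two definitions. The Jacobian, G, R^{-1}, and weight values are assumptions chosen for illustration only.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Hypothetical values: L = 3 basis functions, two states, two inputs.
grad_b = [[1.0, 0.0], [0.0, 2.0], [0.5, -1.0]]  # nabla_Z b_c, shape L x 2
G = [[1.0, 0.0], [0.0, 2.0]]
R_inv = [[0.5, 0.0], [0.0, 1.0]]
W_hat = [[0.2], [-0.1], [0.4]]  # column vector, shape L x 1

# X = nabla_Z b_c G R^{-1} G^T nabla_Z^T b_c, Equation (64)
GRG = matmul(matmul(G, R_inv), transpose(G))
X = matmul(matmul(grad_b, GRG), transpose(grad_b))

# U_hat = -1/2 R^{-1} G^T nabla_Z^T b_c W_hat, Equation (71)
unscaled = matmul(matmul(matmul(R_inv, transpose(G)), transpose(grad_b)), W_hat)
U_hat = [[-0.5 * row[0]] for row in unscaled]

# Both sides of the identity W_hat^T nabla_Z b_c G U_hat = -1/2 W_hat^T X W_hat
lhs = matmul(matmul(matmul(transpose(W_hat), grad_b), G), U_hat)[0][0]
rhs = -0.5 * matmul(matmul(transpose(W_hat), X), W_hat)[0][0]
```

The same identity explains why the quadratic term carries the coefficient 1/4 once \hat{U} is substituted into the approximate Hamiltonian.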

Similar to Equation (57), the estimation of the Hamiltonian function can be expressed as

\begin{matrix} \hat{H} (Z, \hat{U}, \hat{V} (Z, t)) = \nabla_{t}^{T} b_{c} (Z, t_{f} - t) {\hat{W}}_{c} + Z^{T} Q Z + {\hat{W}}_{c}^{T} \nabla_{Z} b_{c} (Z, t_{f} - t) F (Z) - \frac{1}{4} {\hat{W}}_{c}^{T} X {\hat{W}}_{c} = e_{c} \end{matrix} .

(72)

Then, the optimal terminal cost function can be estimated as

\hat{V} (Z, t_{f}) = {\hat{W}}_{c}^{T} b_{c} (\hat{Z} (t_{f}), 0),

(73)

where

\hat{Z} (t_{f})

is the estimation of

Z (t_{f})

[25,26,32].

We define the terminal constraints estimation error as

e_{t_{f}} = ψ (Z (t_{f}), t_{f}) - {\hat{W}}_{c}^{T} b_{c} (\hat{Z} (t_{f}), 0) .

(74)

Invoking Equations (72) and (74), a total squared error is defined as

E = \frac{1}{2} e_{c}^{T} e_{c} + \frac{1}{2} e_{t_{f}}^{T} e_{t_{f}} .

(75)
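The three error quantities in Equations (72), (74), and (75) are scalars and can be evaluated directly once the basis-function terms are available. The sketch below assumes these terms are supplied as plain lists; the function names and the numerical values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hamiltonian_residual(W_hat, db_dt, grad_b_F, X, ZQZ):
    """e_c of Equation (72).

    db_dt    : nabla_t b_c (length L)
    grad_b_F : nabla_Z b_c F(Z) (length L)
    X        : L x L matrix of Equation (64)
    ZQZ      : scalar Z^T Q Z
    """
    XW = [dot(row, W_hat) for row in X]
    return dot(db_dt, W_hat) + ZQZ + dot(W_hat, grad_b_F) - 0.25 * dot(W_hat, XW)


def terminal_error(psi_tf, W_hat, b_tf):
    """e_tf of Equation (74), with b_tf = b_c(Z_hat(t_f), 0)."""
    return psi_tf - dot(W_hat, b_tf)


def total_squared_error(e_c, e_tf):
    """E of Equation (75) for scalar errors."""
    return 0.5 * e_c * e_c + 0.5 * e_tf * e_tf


# Hypothetical values with L = 2.
W_hat = [1.0, 2.0]
e_c = hamiltonian_residual(
    W_hat,
    db_dt=[0.5, 0.0],
    grad_b_F=[-1.0, 0.5],
    X=[[2.0, 0.0], [0.0, 1.0]],
    ZQZ=2.0,
)
e_tf = terminal_error(0.0, W_hat, b_tf=[1.0, -1.0])
E = total_squared_error(e_c, e_tf)
```

Both residuals vanish only when the critic satisfies the HJB equation along the trajectory and reproduces the terminal constraint, which is why E is used as the training objective.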

Prior to designing the weight updating law for the critic NN, an assumption is given.

Assumption 7.

Considering system (43) with the optimal control input (56), we can always find a Lyapunov function

J_{1}

that satisfies

{\dot{J}}_{1} = \nabla_{Z}^{T} J_{1} (F (Z) + G U^{*}) < 0

. Furthermore, there is always a positive function

Λ (Z) \in R^{2 \times 2}

that satisfies the following inequality.

\nabla_{Z}^{T} J_{1} (F (Z) + G U^{*}) < - \nabla_{Z}^{T} J_{1} Λ (Z) \nabla_{Z} J_{1}

(76)

In order to minimize the total squared error (75), a novel weight updating law based on the gradient descent theory is developed as

\begin{matrix} {\dot{\hat{W}}}_{c} = - c_{1} \frac{{\bar{β}}_{1}}{m_{s}} e_{c} - c_{1} \frac{{\bar{β}}_{2}}{m_{t}} e_{t_{f}} + \frac{1}{2} c_{1} Φ \nabla_{Z} b_{c} (Z, t_{f} - t) G R^{- 1} G^{T} \nabla_{Z} J_{1} \\ + c_{1} (\frac{1}{4} \frac{{\bar{β}}_{1}}{m_{s}} {\hat{W}}_{c}^{T} X {\hat{W}}_{c} - (b_{c} (\hat{Z} (t_{f}), 0) \frac{{\bar{β}}_{2}^{T}}{m_{t}} + Y_{2} - Y_{1} {\bar{β}}_{1}^{T}) {\hat{W}}_{c}) \end{matrix},

(77)

where

c_{1} > 0

is the parameter to be designed.

J_{1}

is designed in Remark 3.

Y_{1} \in R^{L \times 1}

,

Y_{2} \in R^{L \times L}

are the vector and matrix to be designed, respectively.

{\bar{β}}_{1}

,

{\bar{β}}_{2}

,

m_{s}

,

m_{t}

are expressed as

{\bar{β}}_{1} = \frac{β_{1}}{1 + β_{1}^{T} β_{1}}

,

{\bar{β}}_{2} = \frac{β_{2}}{1 + β_{2}^{T} β_{2}}

,

m_{s} = 1 + β_{1}^{T} β_{1}

,

m_{t} = 1 + β_{2}^{T} β_{2}

, where

β_{1}

and

β_{2}

are written as

\begin{matrix} β_{1} = \nabla_{t} b_{c} (Z, t_{f} - t) + \nabla_{Z} b_{c} (Z, t_{f} - t) (F (Z) + G \hat{U}) \in R^{L \times 1} \end{matrix},

(78)

β_{2} = - b_{c} (\hat{Z} (t_{f}), 0) \in R^{L \times 1} .

(79)

Φ is given as

Φ = \{\begin{matrix} 0, & \text{if } \nabla_{Z} J_{1}^{T} (F (Z) + G \hat{U}) < 0 \\ 1, & \text{otherwise} \end{matrix} .

(80)
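The normalization used in Equation (77) and the switching rule (80) can be sketched as follows. The helper names are hypothetical, and the example uses J_1 = (1/2) Z^T Z, so that ∇_Z J_1 = Z; the numerical values are assumptions for illustration.

```python
def normalized_regressor(beta):
    """Return (beta_bar, m) with m = 1 + beta^T beta and beta_bar = beta / m."""
    m = 1.0 + sum(b * b for b in beta)
    return [b / m for b in beta], m


def stability_switch(grad_J1, Z_dot_hat):
    """Phi of Equation (80).

    grad_J1   : nabla_Z J_1
    Z_dot_hat : F(Z) + G U_hat
    Returns 0 when grad_J1^T Z_dot_hat < 0 and 1 otherwise.
    """
    return 0 if sum(g * z for g, z in zip(grad_J1, Z_dot_hat)) < 0 else 1


# Hypothetical regressor and state values.
beta_bar, m_s = normalized_regressor([3.0, 4.0])
Z = [1.0, -2.0]  # with J_1 = 1/2 Z^T Z, nabla_Z J_1 = Z
phi_decreasing = stability_switch(Z, [-1.0, 1.0])   # J_1 is decreasing
phi_increasing = stability_switch(Z, [2.0, 0.0])    # J_1 is not decreasing
```

The switch therefore adds the stabilizing term to the update law only while J_1 fails to decrease along the closed-loop trajectory.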

Remark 3.

Considering the optimal approximation property of the NN, Assumption 5 is reasonable. In addition, the optimal control

U^{*}

can be obtained when

ζ = 1

in Equation (66). Taking Lemma 1 into account,

U^{*}

can stabilize the nominal system (43); that is,

F (Z) + G U^{*}

is bounded. Simultaneously, considering Assumptions 1 and 5, Assumption 6 is reasonable. As established above, the optimal control input

U^{*}

can stabilize the nominal system (43). Hence, we can always find a Lyapunov function

J_{1}

where the derivative of

J_{1}

with respect to t is negative and bounded. In general,

J_{1}

can be designed as

J_{1} = \frac{1}{2} Z^{T} Z

. Thus, Assumption 7 is reasonable.

Remark 4.

According to the expression of

{\bar{β}}_{1}

,

m_{s}

and

m_{t}

, we have that

∥{\bar{β}}_{1}∥

,

\frac{1}{m_{s}}

and

\frac{1}{m_{t}}

are bounded as

∥{\bar{β}}_{1}∥ \leq {\bar{β}}_{1 M}

,

0 < \frac{1}{m_{s}} \leq 1

,

0 < \frac{1}{m_{t}} \leq 1

, respectively.
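The bound on the normalized regressor can be made explicit: since (1 - ∥β_1∥)^2 ≥ 0 implies 2∥β_1∥ ≤ 1 + ∥β_1∥^2, the norm of \bar{β}_1 never exceeds 1/2. The short sketch below checks this numerically on a hypothetical grid of regressor magnitudes; the value 1/2 is a consequence of the normalization and is not a constant quoted from this paper.

```python
import math


def beta_bar_norm(beta):
    """Norm of beta / (1 + beta^T beta)."""
    sq = sum(b * b for b in beta)
    return math.sqrt(sq) / (1.0 + sq)


# Scan regressor magnitudes from 0 to 10 along one (hypothetical) direction.
norms = [beta_bar_norm([0.01 * k, 0.0]) for k in range(0, 1001)]
largest = max(norms)
```

The maximum is attained at ∥β_1∥ = 1, and the normalized regressor decays toward zero for both very small and very large β_1, which keeps the update law well scaled.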

Remark 5.

The first and second terms in Equation (77) are employed to minimize the total squared error based on gradient descent theory. Moreover, the third term is used to enhance the stabilizing ability of the controller. In more detail, according to Equation (80), the third term disappears when

\nabla_{Z} J_{1}^{T} (F (Z) + G \hat{U}) < 0

, which can be treated as a stability characteristic of the system, while the third term is activated to reinforce the stability of the system when the stability characteristic is gone. Thus, the requirement of an initial admissible control is avoided. In addition, the fourth term is designed for the UUB stability of the system in the subsequent process of the proof.

5. Stability Analysis

In this section, Theorem 1 is proposed to analyze the stability of the closed-loop system controlled by the backstepping and finite-horizon optimal control methods. Theorem 1 is given as follows.

Theorem 1.

For the system comprising (1) and (2) with associated finite-horizon cost function (50), the backstepping control inputs and finite-horizon optimal control inputs are designed as Equations (26), (36), and (71), respectively. The virtual control input

q_{c}

and the unconstrained control input u are designed as Equations (16) and (17), respectively. The NN weight vector tuning laws are given by Equations (27), (37), and (77). Then, the closed-loop system state errors

z_{1}

,

z_{2}

, the weight vector estimation errors

{\tilde{W}}_{1}

,

{\tilde{W}}_{2}

,

{\tilde{W}}_{c}

, the disturbance estimation errors

{\tilde{D}}_{1}

,

{\tilde{D}}_{2}

, the DSC system state error e, and the auxiliary system state error

S_{2}

are UUB with appropriately designed parameters.

Proof.

We consider the following Lyapunov function as

J = V_{b} + \frac{1}{2} {\tilde{W}}_{c}^{T} c_{1}^{- 1} {\tilde{W}}_{c},

(81)

where

V_{b}

is defined as Equation (48).

Differentiating J yields

\dot{J} = {\dot{V}}_{b} + {\tilde{W}}_{c}^{T} c_{1}^{- 1} {\dot{\tilde{W}}}_{c},

(82)

where

{\dot{V}}_{b}

is given as Equation (49).

Next,

{\tilde{W}}_{c}^{T} c_{1}^{- 1} {\dot{\tilde{W}}}_{c}

is derived as follows.

Invoking Equations (69)–(71) and (78) into (72) yields

e_{c} = Z^{T} Q Z + {\hat{W}}_{c}^{T} β_{1} + \frac{1}{4} {\hat{W}}_{c}^{T} X {\hat{W}}_{c} .

(83)

Invoking Equation (63) yields

\begin{matrix} \nabla_{t}^{T} b_{c} (Z, t_{f} - t) W_{c} + W_{c}^{T} \nabla_{Z} b_{c} (Z, t_{f} - t) F (Z) = - Z^{T} Q Z + \frac{1}{4} W_{c}^{T} X W_{c} - ε_{H J B} \end{matrix} .

(84)

Invoking Equations (68)–(70) and (84) into (83) yields

\begin{matrix} e_{c} = - {\tilde{W}}_{c}^{T} β_{1} + W_{c}^{T} β_{1} - W_{c}^{T} β - \frac{1}{4} W_{c}^{T} X W_{c} + \frac{1}{4} {\hat{W}}_{c}^{T} X {\hat{W}}_{c} - ε_{H J B} \end{matrix},

(85)

where

\begin{matrix} β = \nabla_{t} b_{c} (Z, t_{f} - t) + \nabla_{Z} b_{c} (Z, t_{f} - t) (F (Z) - \frac{1}{2} G R^{- 1} G^{T} \nabla_{Z}^{T} b_{c} (Z, t_{f} - t) W_{c}) \end{matrix} .

(86)

Invoking Equations (64), (71), (78), and (86) yields

W_{c}^{T} β_{1} - W_{c}^{T} β = - \frac{1}{2} W_{c}^{T} X {\hat{W}}_{c} + \frac{1}{2} W_{c}^{T} X W_{c} .

(87)

Invoking Equation (87) yields

\begin{matrix} W_{c}^{T} β_{1} - W_{c}^{T} β - \frac{1}{4} W_{c}^{T} X W_{c} + \frac{1}{4} {\hat{W}}_{c}^{T} X {\hat{W}}_{c} = \frac{1}{4} {\tilde{W}}_{c}^{T} X {\tilde{W}}_{c} \end{matrix} .

(88)

Invoking Equation (88) into (85) yields

e_{c} = - {\tilde{W}}_{c}^{T} β_{1} + \frac{1}{4} {\tilde{W}}_{c}^{T} X {\tilde{W}}_{c} - ε_{H J B} .

(89)

Invoking Equation (89) yields

\begin{matrix} {\tilde{W}}_{c}^{T} \frac{{\bar{β}}_{1}}{m_{s}} e_{c} = - {\tilde{W}}_{c}^{T} {\bar{β}}_{1} {\bar{β}}_{1}^{T} {\tilde{W}}_{c} + \frac{1}{4} {\tilde{W}}_{c}^{T} \frac{{\bar{β}}_{1}}{m_{s}} {\tilde{W}}_{c}^{T} X {\tilde{W}}_{c} - {\tilde{W}}_{c}^{T} \frac{{\bar{β}}_{1}}{m_{s}} ε_{H J B} \end{matrix} .

(90)

In addition, we have

{\tilde{W}}_{c}^{T} X {\tilde{W}}_{c} = 2 W_{c}^{T} X {\tilde{W}}_{c} - W_{c}^{T} X W_{c} + {\hat{W}}_{c}^{T} X {\hat{W}}_{c} .

(91)
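Identities (88) and (91) are purely algebraic and rely only on the symmetry of X and on the definition (68). As a sanity check, the sketch below evaluates both sides for randomly generated vectors; the dimension L = 4, the random seed, and the construction of X as A A^T are assumptions made for illustration.

```python
import random


def quad(u, X, v):
    """Bilinear form u^T X v."""
    return sum(u[i] * X[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


random.seed(0)
L = 4
A = [[random.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(L)]
# X = A A^T is symmetric, as is X in Equation (64).
X = [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(L)] for i in range(L)]
W = [random.uniform(-1.0, 1.0) for _ in range(L)]
W_hat = [random.uniform(-1.0, 1.0) for _ in range(L)]
W_til = [w - wh for w, wh in zip(W, W_hat)]  # Equation (68)

# Equation (88), with W^T beta_1 - W^T beta replaced by Equation (87).
lhs_88 = (
    -0.5 * quad(W, X, W_hat)
    + 0.5 * quad(W, X, W)
    - 0.25 * quad(W, X, W)
    + 0.25 * quad(W_hat, X, W_hat)
)
rhs_88 = 0.25 * quad(W_til, X, W_til)

# Equation (91).
lhs_91 = quad(W_til, X, W_til)
rhs_91 = 2.0 * quad(W, X, W_til) - quad(W, X, W) + quad(W_hat, X, W_hat)
```

Agreement of both pairs up to rounding error confirms that the residual e_c in Equation (89) depends on the weights only through the estimation error.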

Invoking Equations (52), (58), and (68) into (74) yields

e_{t_{f}} = W_{c}^{T} {\tilde{b}}_{c} (Z (t_{f}), 0) + ε (Z, t_{f}) + {\tilde{W}}_{c}^{T} b_{c} (\hat{Z} (t_{f}), 0),

(92)

where

{\tilde{b}}_{c} (Z (t_{f}), 0) = b_{c} (Z (t_{f}), 0) - b_{c} (\hat{Z} (t_{f}), 0) .

(93)

Invoking Equations (68) and (90)–(92) into (77) yields

\begin{matrix} {\tilde{W}}_{c}^{T} c_{1}^{- 1} {\dot{\tilde{W}}}_{c} = - {\tilde{W}}_{c}^{T} {\bar{β}}_{1} {\bar{β}}_{1}^{T} {\tilde{W}}_{c} - {\tilde{W}}_{c}^{T} Y_{2} {\tilde{W}}_{c} + {\tilde{W}}_{c}^{T} {\bar{β}}_{1} (Y_{1}^{T} + \frac{1}{2} \frac{1}{m_{s}} W_{c}^{T} X) {\tilde{W}}_{c} \\ + {\tilde{W}}_{c}^{T} {\bar{β}}_{1} (- \frac{1}{4} \frac{1}{m_{s}} W_{c}^{T} X W_{c} + \frac{1}{m_{s}} κ_{1}) + {\tilde{W}}_{c}^{T} ((b_{c} (\hat{Z} (t_{f}), 0) \frac{{\bar{β}}_{2}^{T}}{m_{t}} + Y_{2}) W_{c} - Y_{1} {\bar{β}}_{1}^{T} W_{c} + \frac{{\bar{β}}_{2}}{m_{t}} κ_{2}) \\ - \frac{1}{2} {\tilde{W}}_{c}^{T} Φ \nabla_{Z} b_{c} (Z, t_{f} - t) G R^{- 1} G^{T} \nabla_{Z} J_{1} \end{matrix},

(94)

where

κ_{1} = - ε_{H J B}

(95)

κ_{2} = W_{c}^{T} {\tilde{b}}_{c} (Z (t_{f}), 0) + ε (Z, t_{f}) .

(96)

Invoking Equations (49), (82), and (94) yields

\begin{matrix} \dot{J} \leq \nabla_{Z}^{T} J_{1} (F (Z) + G \hat{U}) - (k_{1} - 1) z_{1}^{2} - (k_{2} + \frac{1}{2 τ} - \frac{1}{2}) z_{2}^{2} - (\frac{3}{2 τ} - \frac{1}{2}) e^{2} \\ - (k_{a u x} - \frac{1}{2} - \frac{1}{2} θ_{a u x} g_{2}^{2}) S_{2}^{2} - \frac{1}{2} (τ_{1} - ι_{1}^{- 1}) {∥{\tilde{W}}_{1}∥}^{2} - \frac{1}{2} (τ_{2} - ι_{2}^{- 1}) {∥{\tilde{W}}_{2}∥}^{2} \\ - (L_{1} - \frac{1}{2} - \frac{1}{2} ι_{1} a_{1 M}^{2}) {\tilde{D}}_{1}^{2} - (L_{2} - \frac{1}{2} - \frac{1}{2} ι_{2} a_{2 M}^{2}) {\tilde{D}}_{2}^{2} + \frac{1}{2} {\dot{D}}_{1}^{2} + \frac{1}{2} {\dot{D}}_{2}^{2} \\ + \frac{1}{2} τ_{1} {∥W_{1}^{*}∥}^{2} + \frac{τ_{2}}{2} {∥W_{2}^{*}∥}^{2} + \frac{1}{2} θ_{a u x}^{- 1} {\bar{Δ}}^{2} + M^{2} - {\tilde{W}}_{c}^{T} {\bar{β}}_{1} {\bar{β}}_{1}^{T} {\tilde{W}}_{c} \\ + {\tilde{W}}_{c}^{T} {\bar{β}}_{1} (Y_{1}^{T} + \frac{1}{2} \frac{1}{m_{s}} W_{c}^{T} X) {\tilde{W}}_{c} - {\tilde{W}}_{c}^{T} Y_{2} {\tilde{W}}_{c} + {\tilde{W}}_{c}^{T} {\bar{β}}_{1} (- \frac{1}{4} \frac{1}{m_{s}} W_{c}^{T} X W_{c} + \frac{κ_{1}}{m_{s}}) \\ + {\tilde{W}}_{c}^{T} ((b_{c} (\hat{Z} (t_{f}), 0) \frac{{\bar{β}}_{2}^{T}}{m_{t}} + Y_{2}) W_{c} - Y_{1} {\bar{β}}_{1}^{T} W_{c} + \frac{{\bar{β}}_{2}}{m_{t}} κ_{2}) \\ - \frac{1}{2} {\tilde{W}}_{c}^{T} Φ \nabla_{Z} b_{c} (Z, t_{f} - t) G R^{- 1} G^{T} \nabla_{Z} J_{1} \end{matrix} .

(97)

Let

Ξ = {[z_{1}, z_{2}, {\tilde{W}}_{c}^{T}, {\tilde{W}}_{1}^{T}, {\tilde{W}}_{2}^{T}, {\tilde{D}}_{1}, {\tilde{D}}_{2}, e, S_{2}]}^{T} \in R^{(6 + L + n_{1} + n_{2})}

; then, Equation (97) can be rewritten in the form of a matrix as

\begin{matrix} \dot{J} \leq \nabla_{Z}^{T} J_{1} (F (Z) + G \hat{U}) - Ξ^{T} \bar{M} Ξ + Ξ^{T} \bar{N} + d - \frac{1}{2} {\tilde{W}}_{c}^{T} Φ \nabla_{Z} b_{c} (Z, t_{f} - t) G R^{- 1} G^{T} \nabla_{Z} J_{1} \end{matrix},

(98)

where

\bar{M} = [\begin{matrix} {\bar{M}}_{11} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & {\bar{M}}_{22} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & {\bar{M}}_{33} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & {\bar{M}}_{44} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & {\bar{M}}_{55} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & {\bar{M}}_{66} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & {\bar{M}}_{77} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & {\bar{M}}_{88} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & {\bar{M}}_{99} \end{matrix}] \in R^{(6 + L + n_{1} + n_{2}) \times (6 + L + n_{1} + n_{2})}

(99)

\begin{matrix} \bar{N} = {[\begin{matrix} 0 & 0 & {\bar{N}}_{3}^{T} & 0^{1 \times n_{1}} & 0^{1 \times n_{2}} & 0 & 0 & 0 & 0 \end{matrix}]}^{T} \in R^{(6 + L + n_{1} + n_{2})} \end{matrix}

(100)

{\bar{M}}_{11} = k_{1} - 1 \in R^{1}

(101)

{\bar{M}}_{22} = k_{2} + \frac{1}{2 τ} - \frac{1}{2} \in R^{1}

(102)

{\bar{M}}_{33} = {\bar{β}}_{1} {\bar{β}}_{1}^{T} + Y_{2} - {\bar{β}}_{1} (Y_{1}^{T} + \frac{1}{2} \frac{1}{m_{s}} W_{c}^{T} X)

(103)

{\bar{M}}_{44} = \frac{1}{2} (τ_{1} - ι_{1}^{- 1}) I^{n_{1} \times n_{1}} \in R^{n_{1} \times n_{1}}

(104)

{\bar{M}}_{55} = \frac{1}{2} (τ_{2} - ι_{2}^{- 1}) I^{n_{2} \times n_{2}} \in R^{n_{2} \times n_{2}}

(105)

{\bar{M}}_{66} = L_{1} - \frac{1}{2} - \frac{1}{2} ι_{1} a_{1 M}^{2} \in R^{1}

(106)

{\bar{M}}_{77} = L_{2} - \frac{1}{2} - \frac{1}{2} ι_{2} a_{2 M}^{2} \in R^{1}

(107)

{\bar{M}}_{88} = \frac{3}{2 τ} - \frac{1}{2} \in R^{1}

(108)

{\bar{M}}_{99} = k_{a u x} - \frac{1}{2} - \frac{1}{2} θ_{a u x} g_{2}^{2} \in R^{1}

(109)

{\bar{N}}_{3} = {\bar{β}}_{1} (- \frac{1}{4} \frac{1}{m_{s}} W_{c}^{T} X W_{c} + \frac{κ_{1}}{m_{s}}) + (b_{c} (\hat{Z} (t_{f}), 0) \frac{{\bar{β}}_{2}^{T}}{m_{t}} + Y_{2}) W_{c} - Y_{1} {\bar{β}}_{1}^{T} W_{c} + \frac{{\bar{β}}_{2}}{m_{t}} κ_{2}

(110)

\begin{matrix} d = \frac{1}{2} {\dot{D}}_{1}^{2} + \frac{1}{2} {\dot{D}}_{2}^{2} + \frac{1}{2} τ_{1} {∥W_{1}^{*}∥}^{2} + \frac{1}{2} τ_{2} {∥W_{2}^{*}∥}^{2} + \frac{1}{2} θ_{a u x}^{- 1} {\bar{Δ}}^{2} + M^{2} \in R^{1} \end{matrix} .

(111)

Let parameters

k_{1}

,

k_{2}

,

τ

,

Y_{1}

,

Y_{2}

,

τ_{1}

,

τ_{2}

,

L_{1}

,

L_{2}

,

k_{a u x}

be chosen such that

\bar{M} ≻ 0

, and invoking Equation (98) yields

\begin{matrix} \dot{J} \leq \nabla_{Z}^{T} J_{1} (F (Z) + G \hat{U}) - λ_{min} (\bar{M}) {∥Ξ∥}^{2} + ∥Ξ∥ ∥\bar{N}∥ + d \\ - \frac{1}{2} {\tilde{W}}_{c}^{T} Φ \nabla_{Z} b_{c} (Z, t_{f} - t) G R^{- 1} G^{T} \nabla_{Z} J_{1} \end{matrix},

(112)

where

λ_{min} (\bar{M})

stands for the minimum eigenvalue of

\bar{M}

.

Case 1: If

Φ = 0

, then

\nabla_{Z}^{T} J_{1} \dot{Z} < 0

; that is,

\exists χ > 0

satisfies that

0 < χ \leq ∥\dot{Z}∥

. Thus, we have

\nabla_{Z}^{T} J_{1} \dot{Z} \leq - χ ∥\nabla_{Z}^{T} J_{1}∥ < 0 .

(113)

Invoking Equation (113) into (112) yields

\begin{matrix} \dot{J} \leq - χ ∥\nabla_{Z}^{T} J_{1}∥ - λ_{min} (\bar{M}) {(∥Ξ∥ - \frac{1}{2} \frac{∥\bar{N}∥}{λ_{min} (\bar{M})})}^{2} + \frac{1}{4} \frac{{∥\bar{N}∥}^{2}}{λ_{min} (\bar{M})} + d \end{matrix} .

(114)

Invoking Equation (114) yields that

\dot{J} < 0

, as long as one of the following conditions holds.

∥\nabla_{Z}^{T} J_{1}∥ > \frac{{∥\bar{N}∥}^{2} + 4 d λ_{min} (\bar{M})}{4 χ λ_{min} (\bar{M})}

(115)

or

∥Ξ∥ > \sqrt{\frac{{∥\bar{N}∥}^{2} + 4 d λ_{min} (\bar{M})}{4 λ_{min}^{2} (\bar{M})}} + \frac{1}{2} \frac{∥\bar{N}∥}{λ_{min} (\bar{M})}

(116)
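For given constants, the thresholds (115) and (116) are straightforward to evaluate. The following sketch computes them and the right-hand side of inequality (114); the function names and all numerical values are hypothetical and serve only to illustrate how the bounds scale with ∥\bar{N}∥, d, λ_min(\bar{M}), and χ.

```python
import math


def case1_bounds(N_norm, d, lam_M, chi):
    """Thresholds of Equations (115) and (116)."""
    common = N_norm ** 2 + 4.0 * d * lam_M
    grad_bound = common / (4.0 * chi * lam_M)
    xi_bound = math.sqrt(common / (4.0 * lam_M ** 2)) + 0.5 * N_norm / lam_M
    return grad_bound, xi_bound


def j_dot_upper_case1(grad_norm, xi_norm, N_norm, d, lam_M, chi):
    """Right-hand side of inequality (114)."""
    return (
        -chi * grad_norm
        - lam_M * (xi_norm - 0.5 * N_norm / lam_M) ** 2
        + 0.25 * N_norm ** 2 / lam_M
        + d
    )


# Hypothetical constants.
N_norm, d, lam_M, chi = 2.0, 1.0, 1.0, 0.5
grad_bound, xi_bound = case1_bounds(N_norm, d, lam_M, chi)
```

Increasing λ_min(\bar{M}) through the design parameters shrinks both thresholds, which is the practical meaning of the requirement that \bar{M} be positive definite.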

Case 2: If

Φ = 1

, then

\nabla_{Z}^{T} J_{1} \dot{Z} \geq 0

. Invoking

Φ = 1

and Equations (62) and (71) into (112) yields

\begin{matrix} \dot{J} \leq \nabla_{Z}^{T} J_{1} (F (Z) + G U^{*}) - λ_{min} (\bar{M}) {∥Ξ∥}^{2} + ∥Ξ∥ ∥\bar{N}∥ + d + \frac{1}{2} \nabla_{Z}^{T} J_{1} G R^{- 1} G^{T} \nabla_{Z} ε (Z, t) \end{matrix} .

(117)

Taking Assumptions 1 and 5 into account,

\frac{1}{2} G R^{- 1} G^{T} \nabla_{Z} ε (Z, t)

is bounded such that

∥\frac{1}{2} G R^{- 1} G^{T} \nabla_{Z} ε (Z, t)∥ \leq d_{M} .

(118)

Invoking inequality (118) into (117) and considering Assumption 7 yields

\begin{matrix} \dot{J} \leq - λ_{min} (Λ) {∥\nabla_{Z} J_{1}∥}^{2} - λ_{min} (\bar{M}) {(∥Ξ∥ - \frac{1}{2} \frac{∥\bar{N}∥}{λ_{min} (\bar{M})})}^{2} + \frac{1}{4} \frac{{∥\bar{N}∥}^{2}}{λ_{min} (\bar{M})} + d + ∥\nabla_{Z} J_{1}∥ d_{M} \\ = - λ_{min} (Λ) {(∥\nabla_{Z} J_{1}∥ - \frac{d_{M}}{2 λ_{min} (Λ)})}^{2} + \frac{d_{M}^{2}}{4 λ_{min} (Λ)} \\ - λ_{min} (\bar{M}) {(∥Ξ∥ - \frac{1}{2} \frac{∥\bar{N}∥}{λ_{min} (\bar{M})})}^{2} + \frac{1}{4} \frac{{∥\bar{N}∥}^{2}}{λ_{min} (\bar{M})} + d \end{matrix} .

(119)

Invoking Equation (119) yields that

\dot{J} < 0

, as long as one of the following conditions holds.

∥\nabla_{Z}^{T} J_{1}∥ > \sqrt{\frac{N u m}{4 λ_{min} (\bar{M}) λ_{min}^{2} (Λ)}} + \frac{d_{M}}{2 λ_{min} (Λ)},

(120)

where

\begin{matrix} N u m = λ_{min} (\bar{M}) d_{M}^{2} + λ_{min} (Λ) {∥\bar{N}∥}^{2} + 4 λ_{min} (\bar{M}) λ_{min} (Λ) d \end{matrix}

(121)

or

∥Ξ∥ > \sqrt{\frac{N u m}{4 λ_{min}^{2} (\bar{M}) λ_{min} (Λ)}} + \frac{1}{2} \frac{∥\bar{N}∥}{λ_{min} (\bar{M})} .

(122)
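The Case 2 thresholds (120)–(122) can be evaluated in the same spirit. In the sketch below, the function names and the constants are hypothetical, and the final expression reproduces the last two lines of inequality (119).

```python
import math


def case2_bounds(N_norm, d, d_M, lam_M, lam_L):
    """Thresholds of Equations (120) and (122), with Num from Equation (121).

    lam_M stands for lambda_min(M_bar) and lam_L for lambda_min(Lambda).
    """
    num = lam_M * d_M ** 2 + lam_L * N_norm ** 2 + 4.0 * lam_M * lam_L * d
    grad_bound = math.sqrt(num / (4.0 * lam_M * lam_L ** 2)) + d_M / (2.0 * lam_L)
    xi_bound = math.sqrt(num / (4.0 * lam_M ** 2 * lam_L)) + 0.5 * N_norm / lam_M
    return num, grad_bound, xi_bound


def j_dot_upper_case2(grad_norm, xi_norm, N_norm, d, d_M, lam_M, lam_L):
    """Completed-square form on the right-hand side of inequality (119)."""
    return (
        -lam_L * (grad_norm - d_M / (2.0 * lam_L)) ** 2
        + d_M ** 2 / (4.0 * lam_L)
        - lam_M * (xi_norm - 0.5 * N_norm / lam_M) ** 2
        + 0.25 * N_norm ** 2 / lam_M
        + d
    )


# Hypothetical constants.
N_norm, d, d_M, lam_M, lam_L = 2.0, 1.0, 2.0, 1.0, 1.0
num, grad_bound, xi_bound = case2_bounds(N_norm, d, d_M, lam_M, lam_L)
```

Compared with Case 1, the additional term d_M enters because the estimated input differs from U^* by the critic approximation error.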

Combining Case 1 and Case 2, we can obtain the conclusion that the augmented state vector

Ξ

is UUB. This completes the proof. □

Remark 6.

Recalling Assumptions 5, 6, and Remark 4 yields that

∥\bar{N}∥

is bounded.

Remark 7.

d is expressed in quadratic form, as shown in Equation (111). Hence, we have that

d > 0

. Simultaneously, taking

\bar{M} ≻ 0

into account,

{∥\bar{N}∥}^{2} + 4 d λ_{min} (\bar{M}) > 0

. In addition, recalling Assumptions 1–4 and Remark 1 yields that d is bounded.

Remark 8.

Recalling Assumption 7, we have that

Λ ≻ 0

. Simultaneously, taking Remark 7 into account yields that

d > 0

,

\bar{M} ≻ 0

. Thus, it can be guaranteed that

N u m > 0

,

4 λ_{min} (\bar{M}) λ_{min}^{2} (Λ) > 0

, and

4 λ_{min}^{2} (\bar{M}) λ_{min} (Λ) > 0

.

6. Simulation Results

In this section, the simulation of the pitching attitude tracking of the aircraft controlled by the backstepping finite-horizon optimal control is described to illustrate the effectiveness of the proposed scheme. The parameters of the aircraft model and the designed control system are given in Table 2 and Table 3, respectively. The finite horizon is selected as

t_{f} = 1 s

. The terminal constraint is chosen as

ψ (Z (t_{f}), t_{f}) = 0

. The basis functions are designed as

b_{c} (Z, t_{f} - t) = [z_{1} exp (- 0.1 (t_{f} - t)),

z_{2} exp (- 0.1 (t_{f} - t)),

z_{1}^{2} (t_{f} - t),

z_{2}^{2} (t_{f} - t),

z_{1}^{3},

z_{2}^{3},

sin (z_{1}) exp (- 0.1 (t_{f} - t)),

sin (z_{2}) exp (- 0.1 (t_{f} - t)),

sin (2 z_{1}),

sin (2 z_{2}),

tanh (z_{1}) exp (- 0.1 (t_{f} - t)),

tanh (z_{2}) exp (- 0.1 (t_{f} - t)),

tanh (2 z_{1}),

and

tanh (2 z_{2})]^{T}

. The objective of the designed control law is to obtain an optimized performance in the finite-horizon

t_{f}

under the guarantee of basic tracking ability.
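The basis vector listed above can be written down directly. The sketch below implements the 14 entries as functions of z_1, z_2, and the time to go t_f - t, and approximates ∇_Z b_c by central differences. The function names, the finite-difference step, and the use of numerical rather than analytical gradients are assumptions for illustration; the weights passed to the critic are left to the caller.

```python
import math


def basis(z1, z2, tau):
    """b_c(Z, tau) with tau = t_f - t, following the list in the text."""
    decay = math.exp(-0.1 * tau)
    return [
        z1 * decay,
        z2 * decay,
        z1 ** 2 * tau,
        z2 ** 2 * tau,
        z1 ** 3,
        z2 ** 3,
        math.sin(z1) * decay,
        math.sin(z2) * decay,
        math.sin(2.0 * z1),
        math.sin(2.0 * z2),
        math.tanh(z1) * decay,
        math.tanh(z2) * decay,
        math.tanh(2.0 * z1),
        math.tanh(2.0 * z2),
    ]


def basis_jacobian(z1, z2, tau, h=1e-6):
    """Central-difference approximation of nabla_Z b_c (shape 14 x 2)."""
    col1 = [
        (p - m) / (2.0 * h)
        for p, m in zip(basis(z1 + h, z2, tau), basis(z1 - h, z2, tau))
    ]
    col2 = [
        (p - m) / (2.0 * h)
        for p, m in zip(basis(z1, z2 + h, tau), basis(z1, z2 - h, tau))
    ]
    return [[a, b] for a, b in zip(col1, col2)]


def critic_value(W_hat, z1, z2, tau):
    """V_hat(Z, t) = W_hat^T b_c(Z, t_f - t), Equation (67)."""
    return sum(w * b for w, b in zip(W_hat, basis(z1, z2, tau)))
```

Every entry vanishes at Z = 0, so the approximated value function is zero at the origin for any weight vector and any time to go.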

Table 2. The parameters of the aircraft model.

Table 3. The parameters of the control system.

In order to simulate system uncertainties

Δ f_{1} (α)

,

Δ f_{2} (α, q)

, the aerodynamic parameters are set to 1.1 times their nominal values. The disturbances are given as

d_{1} = d_{2} = 0.05 sin t

. NN and NDO are designed as the estimators of the system uncertainties and external disturbances, respectively. In order to illustrate the effectiveness of the NN and NDO, the estimations of the sums of

d_{1}

,

Δ f_{1} (α)

and

d_{2}

,

Δ f_{2} (α, q)

are shown in Figure 2 and Figure 3, respectively. In addition, the responses of the angle of attack with and without the designed estimators are compared in Figure 4. As shown in Figure 2 and Figure 3, the NN and NDO estimate the system uncertainties and external disturbances accurately and quickly. Figure 4 shows that, owing to the designed estimators, the adverse effects caused by the system uncertainties and external disturbances are greatly reduced, so that the angle of attack tracks the command signal more precisely.

Figure 2. The estimations of the sum of

d_{1}

and

Δ f_{1} (α)

.

Figure 3. The estimations of the sum of

d_{2}

and

Δ f_{2} (α, q)

.

Figure 4. The response of the angle of attack controlled with and without estimators.

In order to achieve the finite-horizon optimal control, the ADP algorithm is applied. The objective function is given as

\frac{1}{2} e_{c}^{T} e_{c} + \frac{1}{2} e_{t_{f}}^{T} e_{t_{f}}

in Equation (75). The objective of the ADP algorithm is to minimize the objective function. For the purpose of illustrating the effectiveness of the designed ADP algorithm, the response of

\frac{1}{2} e_{c}^{T} e_{c} + \frac{1}{2} e_{t_{f}}^{T} e_{t_{f}}

is shown in Figure 5. It can be observed that

\frac{1}{2} e_{c}^{T} e_{c} + \frac{1}{2} e_{t_{f}}^{T} e_{t_{f}}

gradually decreases and eventually converges to zero. Hence, the designed ADP algorithm is effective.

Figure 5. The response of the objective function.

The virtual control inputs

q^{a}

defined in Equation (26), the control inputs

u^{a}

defined in Equation (36), and

{\hat{q}}^{*}

and

{\hat{u}}^{*}

defined in Equation (71) are shown in Figure 6 and Figure 7, respectively. Under the action of the backstepping-based finite-horizon optimal control inputs, the response of the angle of attack is shown in Figure 8. For comparison, Figure 8 also gives the response of the angle of attack controlled by the backstepping method alone with the same parameters. The comparison indicates that the system controlled by the designed backstepping-based finite-horizon optimal control method evolves in a finite-horizon optimal way and achieves better performance than backstepping control alone.

Figure 6. The response of

q^{a}

and

q^{*}

.

Figure 7. The response of

u^{a}

and

u^{*}

.

Figure 8. The response of the angle of attack with and without finite-horizon optimization.

7. Conclusions

In this paper, a backstepping-based finite-horizon optimal control scheme is proposed to complete the task of angle of attack tracking in a finite-horizon optimal manner. An auxiliary system is designed to compensate for the input constraints. NN and NDO are applied to estimate the system uncertainties and external disturbances. Furthermore, the backstepping method containing NN and NDO is employed to ensure the stability of the system and suppress the adverse effects caused by the system uncertainties and external disturbances. In addition, the DSC technique is utilized to avoid the derivation operation in the process of the backstepping control. Moreover, the ADP algorithm is used to control the system in a finite-horizon optimal manner. In the design of the ADP, a critic NN is constructed by time-state-dependent feature functions to approximate the value function in the HJB equation. Finally, simulation results illustrate the effectiveness of the proposed backstepping-based finite-horizon optimal control scheme.

In future work, a practical experiment will be constructed and carried out to verify the proposed control method.

Author Contributions

Conceptualization, A.L., Y.S. and B.D.; methodology, A.L. and Y.S.; writing—original draft preparation, A.L. and Y.S.; writing—review and editing, A.L., Y.S. and B.D.; visualization, A.L. and Y.S.; supervision, B.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

Author Ang Li was employed by the company Shenyang Aircraft Design and Research Institute Yangzhou Collaborative Innovation Research Institute Co., Ltd., Yangzhou, China. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A. Derivation of Equations

Appendix A.1. The Derivation of Equation (33) Is as Follows

\begin{matrix} {\dot{V}}_{1} = & z_{1} {\dot{z}}_{1} + e \dot{e} + {\tilde{W}}_{1}^{T} Ω_{1} {\dot{\tilde{W}}}_{1} + {\tilde{D}}_{1} {\dot{\tilde{D}}}_{1} \\ = & z_{1} (f_{1} - f_{1} (α_{c}) + q_{c}^{*} + z_{2} - k_{1} z_{1} - L_{1}^{- 1} {\tilde{W}}_{1}^{T} a_{1} (α) + {\tilde{D}}_{1} + S_{2}) \\ + e (- \frac{e}{τ} + (- {\dot{q}}_{c})) + {\tilde{W}}_{1}^{T} Ω_{1} (Ω_{1}^{- 1} (a_{1} (α) z_{1} L_{1}^{- 1} - τ_{1} {\hat{W}}_{1})) \\ + {\tilde{D}}_{1} ({\dot{D}}_{1} - L_{1} {\tilde{D}}_{1} + {\hat{W}}_{1}^{T} a_{1} (α) - z_{1}) \\ = & z_{1} (f_{1} - f_{1} (α_{c}) + q_{c}^{*}) + z_{1} z_{2} - k_{1} z_{1}^{2} - \frac{e^{2}}{τ} + e (- {\dot{q}}_{c}) + z_{1} S_{2} \\ - τ_{1} {\tilde{W}}_{1}^{T} {\hat{W}}_{1} + {\tilde{D}}_{1} {\dot{D}}_{1} - L_{1} {\tilde{D}}_{1}^{2} + {\tilde{D}}_{1} {\hat{W}}_{1}^{T} a_{1} (α) \end{matrix}

(A1)

Appendix A.2. The Derivation of Equation (34) Is as Follows

\begin{matrix} {\tilde{W}}_{1}^{T} {\hat{W}}_{1} = & \frac{1}{2} {\tilde{W}}_{1}^{T} ({\tilde{W}}_{1} + W_{1}^{*}) + \frac{1}{2} {({\hat{W}}_{1} - W_{1}^{*})}^{T} {\hat{W}}_{1} \\ = & \frac{1}{2} {∥{\tilde{W}}_{1}∥}^{2} + \frac{1}{2} {∥{\hat{W}}_{1}∥}^{2} + \frac{1}{2} {\tilde{W}}_{1}^{T} W_{1}^{*} - \frac{1}{2} W_{1}^{* T} {\hat{W}}_{1} \\ = & \frac{1}{2} {∥{\tilde{W}}_{1}∥}^{2} + \frac{1}{2} {∥{\hat{W}}_{1}∥}^{2} - \frac{1}{2} {∥W_{1}^{*}∥}^{2} \\ \geq & \frac{1}{2} {∥{\tilde{W}}_{1}∥}^{2} - \frac{1}{2} {∥W_{1}^{*}∥}^{2} \end{matrix}

(A2)
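The inequality in (A2) holds because the discarded term (1/2)∥\hat{W}_1∥^2 is non-negative. As a quick numerical check, the sketch below samples random weight vectors and records the smallest gap between the two sides. It follows the sign convention implied by the first line of (A2), namely \hat{W}_1 = \tilde{W}_1 + W_1^*; the dimension, the sampling range, and the seed are assumptions for illustration.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


random.seed(1)
smallest_gap = float("inf")
for _ in range(1000):
    W_star = [random.uniform(-2.0, 2.0) for _ in range(3)]
    W_hat = [random.uniform(-2.0, 2.0) for _ in range(3)]
    # Convention implied by (A2): W_tilde = W_hat - W_star.
    W_til = [wh - ws for wh, ws in zip(W_hat, W_star)]
    lower = 0.5 * dot(W_til, W_til) - 0.5 * dot(W_star, W_star)
    gap = dot(W_til, W_hat) - lower  # equals 1/2 ||W_hat||^2
    smallest_gap = min(smallest_gap, gap)
```

A non-negative smallest gap over all samples is consistent with the last line of (A2).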

Appendix A.3. The Derivation of Equation (35) Is as Follows

\begin{matrix} {\dot{V}}_{1} \leq & z_{1} (f_{1} - f_{1} (α_{c}) + q_{c}^{*}) + \frac{1}{2} z_{1}^{2} + \frac{1}{2} z_{2}^{2} - k_{1} z_{1}^{2} - \frac{e^{2}}{τ} + \frac{1}{2} e^{2} + \frac{1}{2} M^{2} + \frac{1}{2} z_{1}^{2} + \frac{1}{2} S_{2}^{2} \\ - \frac{1}{2} τ_{1} {∥{\tilde{W}}_{1}∥}^{2} + \frac{1}{2} τ_{1} {∥W_{1}^{*}∥}^{2} + \frac{1}{2} {\tilde{D}}_{1}^{2} + \frac{1}{2} {\dot{D}}_{1}^{2} - L_{1} {\tilde{D}}_{1}^{2} + \frac{1}{2} ι_{1} a_{1}^{2} {\tilde{D}}_{1}^{2} + \frac{1}{2} ι_{1}^{- 1} {∥{\tilde{W}}_{1}∥}^{2} \\ = & z_{1} (f_{1} - f_{1} (α_{c}) + q_{c}^{*}) - (k_{1} - 1) z_{1}^{2} + \frac{1}{2} z_{2}^{2} - (\frac{1}{τ} - \frac{1}{2}) e^{2} + \frac{1}{2} S_{2}^{2} - \frac{1}{2} (τ_{1} - ι_{1}^{- 1}) {∥{\tilde{W}}_{1}∥}^{2} \\ - (L_{1} - \frac{1}{2} - \frac{1}{2} ι_{1} a_{1}^{2}) {\tilde{D}}_{1}^{2} + \frac{1}{2} {\dot{D}}_{1}^{2} + \frac{1}{2} τ_{1} {∥W_{1}^{*}∥}^{2} + \frac{1}{2} M^{2} \end{matrix}

(A3)

Appendix A.4. The Derivation of Equation (38) Is as Follows

\begin{matrix} {\dot{z}}_{2} = & f_{2} + L_{2}^{- 1} W_{2}^{* T} a_{2} (α, q) + D_{2} - {\dot{q}}_{c} + k_{a u x} S_{2} + g_{2} u^{*} + g_{2} u^{a} \\ = & f_{2} - f_{2} (α_{c}, q_{c}) + g_{2} u^{*} - L_{2}^{- 1} {\hat{W}}_{2}^{T} a_{2} (α, q) + L_{2}^{- 1} W_{2}^{* T} a_{2} (α, q) + D_{2} - {\hat{D}}_{2} - k_{2} z_{2} + \dot{λ} - {\dot{q}}_{c} \\ = & f_{2} - f_{2} (α_{c}, q_{c}) + g_{2} u^{*} - L_{2}^{- 1} {\tilde{W}}_{2}^{T} a_{2} (α, q) + {\tilde{D}}_{2} - k_{2} z_{2} + \dot{e} \\ = & f_{2} - f_{2} (α_{c}, q_{c}) + g_{2} u^{*} - L_{2}^{- 1} {\tilde{W}}_{2}^{T} a_{2} (α, q) + {\tilde{D}}_{2} - k_{2} z_{2} - \frac{e}{τ} - {\dot{q}}_{c} \end{matrix}

(A4)

References

Castañeda, H.; Salas-Peña, O.S.; de León-Morales, J. Extended observer based on adaptive second order sliding mode control for a fixed wing UAV. ISA Trans. 2017, 66, 226–232. [Google Scholar] [CrossRef] [PubMed]
Lee, C.H.; Chung, M.J. Gain-scheduled state feedback control design technique for flight vehicles. IEEE Trans. Aerosp. Electron. Syst. 2001, 37, 173–182. [Google Scholar]
Snell, S.A.; Enns, D.F.; Garrard, W.L. Nonlinear inversion flight control for a supermaneuverable aircraft. J. Guid. Control. Dyn. 1992, 15, 976–984. [Google Scholar] [CrossRef]
Lungu, M. Stabilization and control of a UAV flight attitude angles using the backstepping method. World Acad. Sci. Eng. Technol. 2012, 6, 241–248. [Google Scholar]
Zhang, J.; Sun, C.; Zhang, R.; Qian, C. Adaptive sliding mode control for re-entry attitude of near space hypersonic vehicle based on backstepping design. IEEE/CAA J. Autom. Sin. 2015, 2, 94–101. [Google Scholar] [CrossRef]
Kang, Y.; Chen, S.; Wang, X.; Cao, Y. Deep convolutional identifier for dynamic modeling and adaptive control of unmanned helicopter. IEEE Trans. Neural Netw. Learn. Syst. 2018, 30, 524–538. [Google Scholar] [CrossRef] [PubMed]
Wu, D.; Chen, M.; Gong, H.; Wu, Q. Robust backstepping control of wing rock using disturbance observer. Appl. Sci. 2017, 7, 219. [Google Scholar] [CrossRef]
Wu, D.; Chen, M.; Gong, H. Robust control of post-stall pitching maneuver based on finite-time observer. ISA Trans. 2017, 70, 53–63. [Google Scholar] [CrossRef] [PubMed]
Liu, H.; Lu, G.; Zhong, Y. Robust LQR Attitude Control of a 3-DOF Laboratory Helicopter for Aggressive Maneuvers. IEEE Trans. Ind. Electron. 2013, 60, 4627–4636. [Google Scholar] [CrossRef]
Zarei, J.; Montazeri, A.; Motlagh, M.R.J.; Poshtan, J. Design and comparison of LQG/LTR and H controllers for a VSTOL flight control system. J. Frankl. Inst. 2007, 344, 577–594. [Google Scholar] [CrossRef]
Bellman, R. Dynamic programming. Science 1966, 153, 34–37.
Chanane, B. Optimal control of nonlinear systems: A recursive approach. Comput. Math. Appl. 1998, 35, 29–33.
Mracek, C.P.; Cloutier, J.R. Control designs for the nonlinear benchmark problem via the state-dependent Riccati equation method. Int. J. Robust Nonlinear Control 1998, 8, 401–433.
Werbos, P. Approximate Dynamic Programming for Real-Time Control and Neural Modeling; Academic Press: New York, NY, USA, 1977.
Wei, Q.; Song, R.; Yan, P. Data-driven zero-sum neuro-optimal control for a class of continuous-time unknown nonlinear systems with disturbance using ADP. IEEE Trans. Neural Netw. Learn. Syst. 2017, 27, 444–458.
Ferrari, S.; Stengel, R.F. Online adaptive critic flight control. J. Guid. Control. Dyn. 2004, 27, 777–786.
Ferrari, S.; Steck, J.E.; Chandramohan, R. Adaptive feedback control by constrained approximate dynamic programming. IEEE Trans. Syst. Man Cybern. Part B 2008, 38, 982–987.
Zhou, Y.; Kampen, E.J.V.; Chu, Q.P. Incremental approximate dynamic programming for nonlinear adaptive tracking control with partial observability. J. Guid. Control. Dyn. 2018, 41, 1–14.
Fan, Q.Y.; Yang, G.H. Adaptive actor–critic design-based integral sliding-mode control for partially unknown nonlinear systems with input disturbances. IEEE Trans. Neural Netw. Learn. Syst. 2017, 27, 165–177.
Sun, J.; Liu, C. Backstepping-based adaptive dynamic programming for missile-target guidance systems with state and input constraints. J. Frankl. Inst. 2018, 355, 8412–8440.
Sun, J.; Liu, C.; Zhao, X. Backstepping-based zero-sum differential games for missile-target interception systems with input and output constraints. IET Control Theory Appl. 2018, 12, 243–253.
Xia, R.; Chen, M.; Wu, Q. Neural network based optimal adaptive attitude control of near-space vehicle with system uncertainties and disturbances. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2019, 233, 641–656.
Cui, X.; Zhang, H.; Luo, Y.; Zu, P. Online finite-horizon optimal learning algorithm for nonzero-sum games with partially unknown dynamics and constrained inputs. Neurocomputing 2016, 185, 37–44.
Cheng, T.; Lewis, F.L.; Abu-Khalaf, M. A neural network solution for fixed-final time optimal control of nonlinear systems. Automatica 2007, 43, 482–490.
Zhao, Q.; Xu, H.; Jagannathan, S. Neural network-based finite-horizon optimal control of uncertain affine nonlinear discrete-time systems. IEEE Trans. Neural Netw. Learn. Syst. 2014, 26, 486–499.
Sun, J.; Liu, C. Finite-horizon differential games for missile–target interception system using adaptive dynamic programming with input constraints. Int. J. Syst. Sci. 2018, 49, 264–283.
Xu, H.; Jagannathan, S. Neural network-based finite horizon stochastic optimal control design for nonlinear networked control systems. IEEE Trans. Neural Netw. Learn. Syst. 2014, 26, 472–485.
Chen, M.; Tao, G.; Jiang, B. Dynamic surface control using neural networks for a class of uncertain nonlinear systems with input saturation. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 2086–2097.
Abu-Khalaf, M.; Lewis, F.L. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 2005, 41, 779–791.
Dierks, T.; Jagannathan, S. Optimal control of affine nonlinear continuous-time systems. In Proceedings of the 2010 American Control Conference, Baltimore, MD, USA, 30 June–2 July 2010; pp. 1568–1573.
Wang, D.; Liu, D.; Li, H.; Ma, H. Neural-network-based robust optimal control design for a class of uncertain nonlinear systems via adaptive dynamic programming. Inf. Sci. 2014, 282, 167–179.
Xu, H.; Zhao, Q.; Dierks, T.; Jagannathan, S. Neural network-based finite-horizon approximately optimal control of uncertain affine nonlinear continuous-time systems. In Proceedings of the 2014 American Control Conference, Portland, OR, USA, 4–6 June 2014; pp. 1243–1248.

Figure 1. The control block diagram of the proposed method.

Figure 2. The estimations of the sum of $d_{1}$ and $Δ f_{1} (α)$.

Figure 3. The estimations of the sum of $d_{2}$ and $Δ f_{2} (α, q)$.

Figure 4. The response of the angle of attack controlled with and without estimators.

Figure 5. The response of the objective function.

Figure 6. The response of $q^{a}$ and $q^{*}$.

Figure 7. The response of $u^{a}$ and $u^{*}$.

Figure 8. The response of the angle of attack with and without finite-horizon optimization.

Table 1. The illustration of aircraft model parameters.

Symbol	Description	Symbol	Description
$α$	angle of attack	$γ$	flight path angle
$q$	pitching rate	$I_{y y}$	moment of inertia
$δ_{z} (u)$	constrained normal thrust vectoring angle	$\bar{q}$	dynamic pressure
$M$	mass of aircraft	$S$	reference surface area of wing
$V$	airspeed of aircraft	$\bar{c}$	mean aerodynamic chord
$L$	lift force	$C_{m}$	pitch moment aerodynamic coefficient
$T$	thrust	$x_{T}$	distance between engine nozzle and center of mass

Table 2. The parameters of the aircraft model.

Parameter	Value	Parameter	Value
$M$ (kg)	10,617	$V$ (m/s)	70
$T$ (N)	146,000	$γ$ (°)	0
$I_{y y}$ (kg · m²)	77,095	$\bar{q}$ (kg/(m · s²))	2724
$S$ (m²)	57.7	$\bar{c}$ (m)	4.4
$x_{T}$ (m)	8.5	$u_{M}$ (°)	15
$δ_{c}$ (°)	−71.7	$α_{c}$ (°)	10
$α_{0}$ (°)	0	$q_{0}$ (°)	0
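
As a rough sanity check on the magnitudes in Table 2, the following sketch collects the listed parameters in a Python dictionary and derives two quantities that the table does not state: the aerodynamic moment scaling $\bar{q} S \bar{c} / I_{y y}$ and the thrust moment scaling $T x_{T} / I_{y y}$. The dictionary keys, the helper `saturate`, and the reading of $u_{M}$ as a symmetric bound on the thrust vectoring angle are assumptions made for illustration; none of this is taken from the authors' simulation code.

```python
# Values copied from Table 2; the key names are illustrative only.
params = {
    "M": 10617.0,      # mass of aircraft, kg
    "T": 146000.0,     # thrust, N
    "I_yy": 77095.0,   # moment of inertia, kg*m^2
    "S": 57.7,         # reference surface area of wing, m^2
    "x_T": 8.5,        # distance between engine nozzle and center of mass, m
    "V": 70.0,         # airspeed, m/s
    "q_bar": 2724.0,   # dynamic pressure, kg/(m*s^2)
    "c_bar": 4.4,      # mean aerodynamic chord, m
    "u_M_deg": 15.0,   # input bound, degrees (assumed symmetric)
}


def saturate(u_deg, u_max_deg):
    """Clip a commanded angle to [-u_max_deg, u_max_deg] (assumed symmetric bound)."""
    return max(-u_max_deg, min(u_max_deg, u_deg))


# Pitch acceleration per unit pitching-moment coefficient, in 1/s^2.
aero_scale = params["q_bar"] * params["S"] * params["c_bar"] / params["I_yy"]

# Pitch acceleration per unit sine of the vectoring angle, in 1/s^2.
thrust_scale = params["T"] * params["x_T"] / params["I_yy"]

if __name__ == "__main__":
    print(f"aero_scale   = {aero_scale:.3f} 1/s^2")
    print(f"thrust_scale = {thrust_scale:.3f} 1/s^2")
    print(f"saturate(20 deg) = {saturate(20.0, params['u_M_deg'])} deg")
```

Both scalings have units of 1/s², so they can be compared directly: under these assumptions the thrust moment arm contributes a larger pitch acceleration per unit of its argument than the aerodynamic term does per unit of $C_{m}$.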

Table 3. The parameters of the control system.

Parameter	Value	Parameter	Value
$k_{1}$	2	$k_{2}$	20
$L_{1}$	200	$L_{2}$	200
$τ_{1}$	100	$τ_{2}$	100
$τ$	0.1	$k_{a u x}$	7.9
$Y_{1}$	$40 \cdot [\underbrace{1 \dots 1}_{14}]$	$Y_{2}$	$120 \cdot I^{14 \times 14}$

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
