Dynamic Event-Triggered Integral Sliding Mode Adaptive Optimal Tracking Control for Uncertain Nonlinear Systems

Abstract: In this paper, we study the event-triggered integral sliding mode optimal tracking problem for nonlinear systems with matched and unmatched disturbances. The goal is to design an adaptive dynamic programming (ADP)-based sliding mode controller that stabilizes the closed-loop system and guarantees the optimal performance of the sliding-mode dynamics. First, to remove the effects of the matched uncertainties, an event-triggered sliding mode controller is designed to force the state of the system onto the sliding mode surface without Zeno behavior. Second, another event-triggered controller is designed to suppress the unmatched disturbances with nearly optimal performance while also guaranteeing Zeno-free behavior. Finally, the benefits of the proposed algorithm are shown in comparison with several traditional triggering and learning-based mechanisms.


Introduction
Traditional control systems are implemented with time-triggered (TT) sampling, i.e., periodic sampling: data are sent from the controller to the actuator (or from the sensor to the controller) with a fixed sampling period. However, in modern control systems, especially networked control systems, control signals are often implemented aperiodically [1]. The advantages of aperiodic sampling in terms of update times are elaborated in detail in [2]. In fact, the limited communication bandwidth of networked control systems has attracted considerable attention to event-triggered control (ETC) [3][4][5][6][7][8][9][10][11] as an alternative to TT sampling. Despite the progress in the field, some problems remain open, such as how to simultaneously counteract the effects of matched and unmatched disturbances while guaranteeing the optimal performance of ETC systems; this is one of our concerns and the issue studied here.
Owing to distinguished features such as fast dynamic response, robustness, order reduction, and implementation simplicity, sliding mode control (SMC) is widely studied in the fields of transportation, power grids, industrial communication networks, and uncertain serial industrial robots [12][13][14][15][16][17][18][19][20]. In particular, SMC, which derives from variable structure systems theory, has been recognized as one of the most popular and powerful control tools for power converters. Its popularity also comes from its robustness, which removes the need for accurate modeling of the system parameters; however, it may lack robustness against unmatched disturbances. Although many achievements in the codesign of sliding mode control and event-triggered mechanisms have emerged [21][22][23], they all neglect how to guarantee the optimal performance of the controlled system in the presence of matched and unmatched disturbances.
Optimal control theory is now quite mature [24][25][26][27]. Optimal control for nonlinear systems requires solving the Hamilton-Jacobi-Bellman (HJB) or Hamilton-Jacobi-Isaacs (HJI) equations. However, the nonlinearity of these equations makes it impossible to obtain an analytical solution [28]. Although numerical solutions via dynamic programming exist, they suffer from the curse of dimensionality, which motivates adaptive dynamic programming (ADP)-based approximation. Against this background, the main contributions of this paper are as follows.
• Different from the combined SMC and ADP frameworks of [39,40], this paper proposes a new dynamic event-triggered (DET) mechanism that introduces a non-negative auxiliary variable. This increases the length of the time intervals between triggering events, further reducing the communication burden compared with [39,40].
• A novel integral sliding mode control (ISMC) scheme based on DET for uncertain nonlinear systems is proposed, consisting of two control laws. A first event-triggered controller is designed to tackle the matched uncertainties and force the trajectory of the system onto the sliding mode surface. A second event-triggered controller is designed to tackle the unmatched uncertainties and guarantee optimal performance.
• To solve the resulting optimal control problem, a critic-only neural network (NN) based on ADP is proposed via the experience replay technique, which helps relax the excitation condition typically required for ADP methods to work. Stability of the closed-loop system is proven in the sense of uniform ultimate boundedness, while guaranteeing Zeno-free behavior of the triggering mechanism.
The paper is arranged as follows. In Section 2, we introduce the model formulation and preliminaries. Section 3 covers the event-triggered-based ISMC design. Section 4 presents the framework of a dynamic triggered ADP strategy, along with stability analysis. Section 5 illustrates the effectiveness of the novel algorithm via comparative simulations. Section 6 gives the conclusions and possible future works.
Notations: R+ denotes the set of positive real numbers. R^n and R^{n×m} denote the space of all real n-vectors and the space of all n × m real matrices, respectively. The symbol ≜ means "equal by definition", and I_n is the n × n identity matrix. The superscript T denotes transposition. C^1 denotes the class of functions with continuous derivative. λ_min(X) is the minimum eigenvalue of a matrix X, and ‖·‖ denotes the 2-norm of a vector or matrix. For any full column rank matrix F(·), its left pseudoinverse is F†(·) ≜ (F^T(·)F(·))^{-1} F^T(·).
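As a quick numerical check, the left pseudoinverse defined above can be sketched in a few lines of NumPy; the matrix F below is an arbitrary full-column-rank example, not a quantity from the paper.

```python
import numpy as np

# Left pseudoinverse of a full-column-rank matrix, as defined in the notation:
# F_dagger = (F^T F)^{-1} F^T, which satisfies F_dagger @ F = I.
def left_pinv(F):
    # solve (F^T F) X = F^T instead of forming the inverse explicitly
    return np.linalg.solve(F.T @ F, F.T)

F = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])   # arbitrary 3x2 full-column-rank example
F_dag = left_pinv(F)
```

For full-column-rank F this coincides with `np.linalg.pinv(F)`.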

System Formulation and Preliminaries
In this section, an ISMC design oriented toward optimal tracking is discussed. To target optimality, it is useful to introduce an augmented system associated with the tracking error system and the desired reference system.
Consider the following nonlinear system with matched and unmatched uncertainties [39]:

ẋ(t) = p(x) + q(x)(u + d) + h(x)w,   (1)

where x ∈ R^n, u ∈ R^m, d ∈ R^m, and w ∈ R^q are the system state, control input, unknown bounded matched disturbance, and unknown bounded unmatched disturbance, respectively. The system dynamics p(x), q(x), and h(x) are known Lipschitz functions with p(0) = 0, and q(x) and h(x) have positive upper bounds. Meanwhile, let system (1) be controllable, and let the matrix function q(x) be of full column rank with q(·) ≠ h(·).

Remark 1.
Herein, it should be noted that system (1) is favored in theoretical studies [18,33,35] and has also been widely explored in practical applications, such as single-link robot arms [39,40], power systems [34], and spacecraft [41].
The desired signal x_d is subject to

ẋ_d = r(x_d),   (2)

where the bounded x_d ∈ R^n is Lipschitz continuous with r(0) = 0. Define the tracking error

e_d = x − x_d.   (3)

Combining (1)–(3), e_d satisfies the following dynamics

ė_d = p(x) + q(x)(u + d) + h(x)w − r(x_d).   (4)

Define the augmented state ξ = [e_d^T, x_d^T]^T ∈ R^{2n}. Combining (2) with (4) generates the following augmented system

ξ̇ = f(ξ) + g(ξ)(u + d) + h(ξ)w,   (5)

where f(ξ) ∈ R^{2n}, g(ξ) ∈ R^{2n×m}, and h(ξ) ∈ R^{2n×q}. For simplicity and convenience of analysis, we will omit the arguments of these functions in the remainder of the paper (f(ξ), g(ξ), and h(ξ) will be written as f, g, and h, respectively). To this end, the following two standard assumptions are required.

Assumption 1 ([39]). The system dynamics (5) with f(0) = 0 are Lipschitz continuous, and g and h have positive upper bounds.

Assumption 2 ([39]). The matrix function g is of full column rank, its left pseudoinverse is given by g† = (g^T g)^{-1} g^T, and there exists a positive value b_{q†} ∈ R+ such that ‖g†‖ ≤ b_{q†}. Then, the following equality holds

hw = gg†hw + (I − gg†)hw,   (6)

where ‖gg†h‖ ≤ b_{g†} (b_{g†} ∈ R+).
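The construction of the augmented system (5) from the error dynamics (4) and the reference dynamics (2) can be sketched numerically. The functions p, q, h, r below are illustrative placeholders (not the paper's plant), chosen only to satisfy p(0) = 0 and r(0) = 0.

```python
import numpy as np

# Hedged sketch of the augmented dynamics for xi = [e_d; x_d]:
# e_d_dot = p(x) + q(x)(u + d) + h(x)w - r(x_d), x_d_dot = r(x_d), x = e_d + x_d.
def p(x):    return -x                       # nominal drift, p(0) = 0
def q(x):    return np.array([[1.0], [0.5]]) # input matrix (placeholder)
def hfun(x): return np.array([[0.2], [1.0]]) # unmatched channel (placeholder)
def r(xd):   return -0.5 * xd                # reference dynamics, r(0) = 0

def augmented_rhs(xi, u, d, w, n=2):
    e_d, x_d = xi[:n], xi[n:]
    x = e_d + x_d
    ed_dot = p(x) + q(x) @ (u + d) + hfun(x) @ w - r(x_d)
    xd_dot = r(x_d)
    return np.concatenate([ed_dot, xd_dot])

xi = np.array([0.1, -0.2, 0.5, 0.3])
rhs = augmented_rhs(xi, np.array([0.0]), np.array([0.0]), np.array([0.0]))
```

The lower block of the right-hand side depends only on x_d, exactly as in (5).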
Control objective: This article aims to achieve optimal tracking control of the system state x with respect to the desired trajectory x_d, so that the tracking error is uniformly ultimately bounded (UUB).
The control input should eliminate the effect of the matched disturbance and reduce the effects of the unmatched disturbance.
To achieve this goal, first, a new composite control law of the form u = u_0 + u_1 will be considered. Second, exploiting the robustness of the SMC technique to nonlinearities and uncertainties, Section 3 designs an integral sliding surface and an event-triggered controller u_1 aiming to suppress the matched effects while forcing the state of the system onto the sliding mode surface without Zeno behavior. Then, an event-triggered optimal controller u_0 is designed in Section 4 to reduce the unmatched effects and guarantee the optimal performance of the sliding-mode dynamics. Finally, a nonlinear single-link robot arm is considered to verify the effectiveness of the proposed algorithm.

DET-Based ISMC Design
To tackle the uncertainty affecting the nonlinear system (5), an integral-type sliding surface is designed as

S(ξ, t) = M[ξ(t) − ξ(0) − ∫_0^t (f(ξ(τ)) + g(ξ(τ))u_0(τ)) dτ],   (7)

where M ∈ R^{m×2n} (m < 2n) is a projection matrix satisfying that Mg is invertible. Here, u_0 is an optimal control input that will be designed in the next section. In this section, we aim to design the control u_1 to force the augmented system (5) onto the manifold {ξ | S(ξ, t) = 0} in finite time, to remove the effect of the matched disturbances. This can be achieved via

u_1(t) = −X sgn(g^T M^T S(ξ(t))),   (8)

where X > 0 and the sign function sgn(·) acts componentwise, with sgn(s) = 1 for s > 0, sgn(s) = 0 for s = 0, and sgn(s) = −1 for s < 0. Next, we introduce the dynamics of the sliding variable and the notion of practical sliding mode, which will be required later. The dynamics of the sliding variable are obtained by differentiating (7) with respect to time:

Ṡ = M[g(u_1 + d) + hw].   (9)

According to SMC theory, when the system trajectories stay on the manifold, one has Ṡ = 0. Then, by combining (5), (6), and (9), one obtains an equivalent control u_1eq satisfying

u_1eq = −d − (Mg)^{-1}Mhw.   (10)

Substituting the equivalent control (10) into the augmented system (5), one gets the dynamics on the sliding manifold

ξ̇ = f + gu_0 + w_u,   (11)

where w_u = [I − g(Mg)^{-1}M]hw, which reduces to w_u = [I − gg†]hw for the choice M = g†. The following result is recalled.

Lemma 1 ([18]). The optimal solution to the optimization problem arg min_{M∈R^{m×2n}} ‖[I − g(Mg)^{-1}M]hw‖ is M = g†.

Remark 2.
Since M is chosen as the left pseudoinverse of g, one knows that: (1) M = g† minimizes ‖w_ueq‖, which leads to w_ueq = w_u; refer to [18] for the detailed proof; (2) the modulation gain associated with u_1(t) is minimized, which means that the amplitude of chattering is reduced; (3) M = g† avoids amplifying the effect of the unmatched disturbance.
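The claim that M = g† minimizes the residual unmatched disturbance can be checked numerically: for any admissible M (Mg invertible), the residual (I − g(Mg)^{-1}M)hw is hw minus a vector in range(g), and the orthogonal choice M = g† gives the smallest possible norm. A minimal sketch with random placeholder data:

```python
import numpy as np
rng = np.random.default_rng(0)

g = rng.standard_normal((4, 2))        # full column rank (almost surely)
hw = rng.standard_normal(4)            # stand-in for the term h(xi) w
g_dag = np.linalg.solve(g.T @ g, g.T)  # left pseudoinverse g-dagger

def residual(M):
    # disturbance left on the sliding manifold for projection matrix M
    return (np.eye(4) - g @ np.linalg.solve(M @ g, M)) @ hw

r_opt = residual(g_dag)                # orthogonal-projection residual
# every other admissible M leaves a residual at least as large
for _ in range(100):
    M = rng.standard_normal((2, 4))
    if abs(np.linalg.det(M @ g)) > 1e-6:
        assert np.linalg.norm(residual(M)) >= np.linalg.norm(r_opt) - 1e-9
```

This is the least-squares property of the pseudoinverse: c = g†hw minimizes ‖hw − gc‖.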
To extend the continuous-time control u_1 to the event-triggered paradigm, we define a virtual control input

µ(t) ≜ u_1(t_k) = −X sgn(g^T M^T S(ξ(t_k))), t ∈ [t_k, t_{k+1}),   (12)

where {t_k}_{k=0}^∞ is a sampling sequence with t_k < t_{k+1}, k ∈ N, N = {0, 1, 2, ...}. Define the measurement error

e_{u_1}(t) = µ(t) − u_1(t),   (13)

satisfying the following triggering condition

t_{k+1} = inf{t > t_k : ‖e_{u_1}(t)‖ ≥ X − b_d − b_{q†}},   (14)

where b_d ∈ R+ is the bound of the matched disturbance (‖d‖ ≤ b_d). The following theorem establishes a sufficient condition for the reachability of the specified sliding surface (7).

Theorem 1. Suppose Assumptions 1 and 2 hold. Consider the augmented system (5) with the sliding surface (7) under the event-triggered controller (12). Then, practical sliding mode is achieved if the triggering condition satisfies (14).

Proof. Consider the Lyapunov function V_1 = (1/2)S^T S. Noticing the triggering condition (14), we know that ‖e_{u_1}‖ ≤ X − b_d − b_{q†} holds at all times; then V̇_1 < 0 for S ≠ 0, which implies that the system state ξ, starting from the initial state ξ(0), robustly slides toward the switching surface from the initial time t = 0 and gradually reaches the sliding mode surface S = 0. The proof is completed.

Theorem 2. The event-triggered rule (14) avoids Zeno behavior, since the minimal triggering interval is strictly positive.

Proof. Define T_{u_1}(k) = t_{k+1} − t_k as the interexecution time for u_1. Recall that the control u_1 is updated at t = t_k with e_{u_1}(t_k) = 0. During the event intervals, the technique of [39] is utilized to approximate the sign function, i.e., sgn(S) ≈ tanh(S/ν). Integrating the resulting bound on the growth of e_{u_1} over [t_k, t_{k+1}) and using the event-triggered rule (14), when t ∈ [0, τ_∞), one obtains a strictly positive lower bound on T_{u_1}(k). The proof is completed.
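The behavior described by Theorems 1 and 2 (the control held between events, a fixed threshold on the measurement error, and a positive dwell time) can be sketched on a scalar sliding variable. The plant, gain X, threshold, and disturbance below are illustrative placeholders, not the paper's values.

```python
import numpy as np

# Illustrative scalar sketch: hold u1 between events and fire when the
# measurement error |mu - u1| reaches a fixed threshold (cf. rule (14)).
X, thresh, dt = 1.0, 0.3, 1e-3
s, mu, t = 0.8, 0.0, 0.0      # sliding variable, held input, time
events = []
for step in range(5000):
    u1 = -X * np.sign(s)                  # continuous-time SMC law
    if step == 0 or abs(mu - u1) >= thresh:
        mu = u1                           # event: refresh the held input
        events.append(t)
    s += dt * (mu + 0.1 * np.sin(5 * t))  # held input + matched disturbance
    t += dt
gaps = np.diff(events)                    # interexecution times, all positive
```

The sliding variable reaches a small neighborhood of S = 0 and stays there (practical sliding mode), while the gaps between events remain bounded away from zero.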
Next, a dynamically triggered controller u_0 will be designed to suppress the unmatched disturbances and guarantee the event-triggered stability of the sliding manifold dynamics (20) with nearly optimal performance. We shall find the dynamic event-triggered condition and solve the optimal control problem using a critic-only NN approximation strategy.

DET-Based Optimal Controller Design
The design of u_0 comprises two steps. First, find the event-triggered rule that guarantees stability and optimal performance; second, solve the resulting optimal control problem approximately by using ADP and an NN approximation strategy. With M = g†, the sliding mode dynamics (11) can be revised as

ξ̇ = f + gu_0 + k̄w,   (20)

where k̄ = (I − gg†)h.
The discounted cost function V(ξ), which is subject to the above dynamics (20), is defined as

V(ξ) = ∫_t^∞ e^{−α(τ−t)} [ξ^T Q̄ ξ + u_0^T R u_0 − γ² w^T w] dτ,   (21)

where α > 0 is a discount factor, γ > 0, and Q̄ = [Q, 0_{n×n}; 0_{n×n}, 0_{n×n}]. Moreover, Q ∈ R^{n×n} and R ∈ R^{m×m} are symmetric positive definite matrices weighting the system state and input, respectively. Meanwhile, if u_0 is admissible and V (V = V(ξ)) is C^1, the corresponding nonlinear Bellman equation is

αV = ξ^T Q̄ ξ + u_0^T R u_0 − γ² w^T w + ∇V^T (f + gu_0 + k̄w).   (22)

Define the Hamiltonian

H(ξ, u_0, w, ∇V) = ξ^T Q̄ ξ + u_0^T R u_0 − γ² w^T w − αV + ∇V^T (f + gu_0 + k̄w),   (23)

where ∇V = ∂V/∂ξ. According to the zero-sum game formulation [27], we get the optimal cost function V*(ξ) by

V*(ξ) = min_{u_0} max_w ∫_t^∞ e^{−α(τ−t)} [ξ^T Q̄ ξ + u_0^T R u_0 − γ² w^T w] dτ,   (24)

which satisfies the HJI equation min_{u_0} max_w H(ξ, u_0, w, ∇V*) = 0. Consider the stationarity conditions [25]:

∂H/∂u_0 = 0, ∂H/∂w = 0.   (25)

Solving the stationarity conditions (25), we obtain the optimal control input and worst-case disturbance

u_0* = −(1/2) R^{-1} g^T ∇V*,  w* = (1/(2γ²)) k̄^T ∇V*.   (26)

Substituting the control input (26) into (22), the HJI equation is written as

ξ^T Q̄ ξ − αV* + ∇V*^T f − (1/4) ∇V*^T g R^{-1} g^T ∇V* + (1/(4γ²)) ∇V*^T k̄ k̄^T ∇V* = 0.   (27)
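For intuition, the stationarity solution for the input can be evaluated in closed form when the value function happens to be quadratic, V(ξ) = ξ^T P ξ, so that ∇V = 2Pξ. The numbers P, g, R below are illustrative placeholders, not quantities from the paper.

```python
import numpy as np

# Illustrative evaluation of u0* = -(1/2) R^{-1} g^T grad V*,
# assuming (for the sketch) a quadratic value V*(xi) = xi^T P xi.
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
g = np.array([[0.0],
              [1.0]])
R = np.array([[1.0]])

def u0_star(xi):
    gradV = 2.0 * P @ xi                      # gradient of the quadratic value
    return -0.5 * np.linalg.solve(R, g.T @ gradV)

u = u0_star(np.array([1.0, -1.0]))
```

In the paper the gradient is of course not available in closed form, which is why the critic NN of the next sections approximates it.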

Dynamic Triggering Rule for the Optimal Input
To propose a dynamic triggering rule, we define again a new sampling sequence {t_i}_{i=0}^∞ with t_i < t_{i+1}, i ∈ N. Define the error between the sampled state ξ_i and the current state ξ as

e_i(t) = ξ_i − ξ(t), t ∈ [t_i, t_{i+1}),   (28)

where ξ_i ≜ ξ(t)|_{t=t_i}. Thus, for t ∈ [t_i, t_{i+1}), the sampled-data version of the system (20) can be rewritten as

ξ̇ = f + gu_0(ξ_i) + k̄w.   (29)

Considering the event-based sampling rule, (26) is written as

u_0*(ξ_i) = −(1/2) R^{-1} g^T ∇V*(ξ_i),   (30)

where ∇V*(ξ_i) = ∂V*(ξ)/∂ξ|_{ξ=ξ_i}. The HJI equation with the event-triggered law is obtained by replacing u_0* with u_0*(ξ_i) in (27), yielding (31). Now, a necessary assumption is introduced for stability analysis.

Assumption 3 ([36]). The optimal controller u_0* is Lipschitz continuous on Ω, viz. there exists a constant L > 0 such that ‖u_0*(ξ) − u_0*(ξ_i)‖ ≤ L‖e_i‖.

We define an internal dynamic variable η evolving according to the following differential equation:

η̇ = −λη + ‖e_T‖² − 2L²‖e_i‖², η(0) > 0,   (32)

where λ > 0 and ‖e_T‖² denotes the triggering threshold. Here, η is designed as a filtered value associated with ‖e_T‖² − 2L²‖e_i‖². This new DET technique avoids requiring ‖e_T‖² − 2L²‖e_i‖² to be always nonnegative if the following condition is used

t_{i+1} = inf{t > t_i : η(t) + θ(‖e_T‖² − 2L²‖e_i‖²) ≤ 0},   (33)

where θ ∈ R+. This is stated in the following result.
Two main stability results for u * 0 follow.

Theorem 3. Consider the sliding mode dynamics (20) with the optimal cost V* and the controller (30), and let Assumptions 1-3 hold. Then the tracking error e_d and the closed-loop system (29) achieve UUB via the DET rule (33).

Proof. Under (30), taking the derivative of the Lyapunov function along the trajectory of (29), using (26) together with the time-triggered HJI (27), the derivative is bounded in terms of ‖e_i‖ and η. Since V* is continuously differentiable on Ω, one can conclude that both V* and its gradient ∇V* are bounded on Ω; that is, max{‖V*‖, ‖∇V*‖} ≤ b_{v*}, where b_{v*} ∈ R+ is a constant. Recalling the triggering condition (33), the Lyapunov derivative is rendered negative outside a compact set. Accordingly, the UUB of the closed-loop system is ensured. The proof is completed.
Next, we prove that the dynamic triggering rule (33) avoids the Zeno behavior.

Assumption 4 ([36]). f + gu*(ξ_i) is Lipschitz continuous; that is, there exist Lipschitz constants L_f and L_g bounding its variation with respect to ξ and ξ_i, respectively.

Theorem 4. Let Assumptions 1 and 4 hold. Then the dynamic triggering rule (33) avoids Zeno behavior, and the minimal triggering interval is strictly positive.

Proof. Define T_{u_0}(i) = t_{i+1} − t_i as the interexecution time for u_0. The control u_0 is updated at t = t_i with e_i(t_i) = 0. During the event intervals, the growth of ‖e_i‖ is bounded by the Lipschitz estimates of Assumption 4; using the comparison lemma, ‖e_i(t)‖ is bounded by an exponential envelope starting from zero. Recalling the triggering condition (33), an event cannot occur before this envelope reaches the threshold determined by η and θ, which takes a strictly positive time since η and θ are positive for all t > 0. The proof is completed.
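The mechanics of the dynamic trigger can be sketched in a few lines, under the assumption that η filters the static margin ‖e_T‖² − 2L²‖e_i‖² and an event fires only when η + θ·(margin) drops below zero. All numbers are illustrative; the names L, λ, θ mirror the simulation section only in name.

```python
import numpy as np

# Illustrative sketch of the dynamic trigger: eta filters the static margin
# z = ||e_T||^2 - 2*L^2*||e_i||^2, and an event fires only when
# eta + theta*z < 0, so z may dip negative transiently without triggering.
lam, theta, L, dt = 0.1, 1.0, 10.0, 1e-3

def det_step(eta, e_T_sq, e_i_sq):
    z = e_T_sq - 2.0 * L**2 * e_i_sq
    fire = (eta + theta * z) < 0.0         # dynamic triggering test
    eta = eta + dt * (-lam * eta + z)      # forward-Euler update of eta
    return eta, fire

eta, fired = 0.1, False
for _ in range(200):                       # small measurement error: no events
    eta, fire = det_step(eta, e_T_sq=0.05, e_i_sq=1e-4)
    fired = fired or fire
```

While the margin stays positive, η grows and builds a "budget" that lets the static condition be violated briefly later, which is exactly why DET fires fewer events than a static rule.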

Dynamically Triggered ADP with Single Critic NN
The dynamic event-triggered optimal control u_0* has been analyzed above. In the sequel, to guarantee UUB of the sliding-mode dynamics (20), an approximate optimal solution V* is obtained by a critic-only NN approximation structure using a reinforcement learning method.
Next, the solution V* of the HJI equation (31) is approximated with an NN using the Weierstrass high-order approximation theorem:

V*(ξ) = ω_c^T φ(ξ) + ε(ξ),

where ω_c ∈ R^l is the unknown ideal weight vector, φ ∈ R^l is the activation function vector, l is the number of hidden neurons, and ε ∈ R is the critic NN approximation error. The gradient of V* is

∇V*(ξ) = ∇φ ω_c + ∇ε,

where ∇φ^T = ∂φ/∂ξ and ∇ε = ∂ε/∂ξ.
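A concrete critic parameterization can be sketched with quadratic activations for a 2-D state; the choice φ = [ξ₁², ξ₁ξ₂, ξ₂²] and the weights below are illustrative, not the paper's network.

```python
import numpy as np

# Critic approximation V_hat(xi) = w^T phi(xi) with quadratic activations;
# the gradient used by the controller is grad_phi^T @ w.
def phi(xi):
    return np.array([xi[0]**2, xi[0]*xi[1], xi[1]**2])

def grad_phi(xi):
    # row k holds d(phi_k)/d(xi)
    return np.array([[2*xi[0], 0.0],
                     [xi[1],   xi[0]],
                     [0.0,     2*xi[1]]])

w = np.array([1.0, 0.2, 0.5])          # illustrative critic weights
xi = np.array([0.3, -0.4])
V_hat = w @ phi(xi)                    # approximate value
gradV_hat = grad_phi(xi).T @ w         # approximate gradient for u0
```

The controller only ever consumes `gradV_hat`, which is why a critic-only (no actor network) structure suffices here.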

Stability Analysis
According to the critic-only NN strategy presented above, the ISMC can be obtained by solving the HJI Equation (31) approximately in a DET way. In this section, the weight estimation error and the tracking error are proven to be UUB via a Lyapunov function. To this end, Assumption 5 below is necessary for the analysis.
In the sequel, an important theorem is presented to guarantee that the weight estimation error and the tracking error are UUB under the dynamic event-triggered condition (33) and the controller (49). Proof. We discuss two cases, i.e., the continuous dynamics and the jump dynamics.

Algorithm Design of the Event-Triggered ISM Optimal Tracking Control
Under the assumptions made in this article, the following Algorithm 1 shows the procedure of the event-triggered ISM optimal tracking control.

Algorithm 1 Event-Triggered ISM Optimal Tracking Control.
Input: initial states of the sliding mode dynamics (11).
1: Select an initial admissible policy u_0^0(ξ), initial critic weights ω^0, and a proper small scalar threshold > 0; the optimal continuous control law with input constraints is approximated as û_0 via the critic NN.
2: To tackle the uncertainty affecting the nonlinear system (5), construct the integral-type sliding surface (7).
3: To extend the continuous-time control u_1 to the event-triggered paradigm, define the virtual control input µ(t) ≜ u_1(t_k), t ∈ [t_k, t_{k+1}), where µ(t) = −X sgn(g^T M^T S(ξ(t_k))).
4: Hence, the event-triggered ISMC becomes the composite law u = û_0 + µ.
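The steps of Algorithm 1 can be sketched as a simulation skeleton: a composite input u = u₀ + u₁, each term held between its own triggering instants. Every dynamic, gain, and threshold below is an illustrative placeholder; in particular, a fixed linear gain K stands in for the learned critic-based law.

```python
import numpy as np

# Hedged skeleton of Algorithm 1 on a toy 2-D system (all values illustrative).
dt, X = 1e-3, 1.0
A = np.array([[0.0, 1.0], [-1.0, -1.0]])   # stable placeholder drift
g = np.array([0.0, 1.0])
K = np.array([0.5, 0.8])                   # stand-in for the learned gain

xi = np.array([0.5, -0.3])
xi_i = xi.copy()                           # last sampled state for u0
mu0 = -K @ xi_i                            # held optimal-control term
s, mu1 = 0.2, -X * np.sign(0.2)            # sliding variable, held SMC term
n0 = n1 = 0
for step in range(3000):
    if np.linalg.norm(xi - xi_i) > 0.05:   # u0 trigger (state error)
        xi_i = xi.copy(); mu0 = -K @ xi_i; n0 += 1
    u1 = -X * np.sign(s)
    if abs(mu1 - u1) >= 0.5:               # u1 trigger (input error)
        mu1 = u1; n1 += 1
    u = mu0 + mu1                          # composite event-triggered input
    xi = xi + dt * (A @ xi + g * u)        # plant under held inputs
    s += dt * mu1                          # illustrative sliding dynamics
```

Both counters n0 and n1 stay far below the 3000 time-triggered steps, which is the communication saving the algorithm targets.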

Remark 3.
An optimal tracking composite controller u = u_0 + u_1, subject to two different dynamic event-triggered conditions, is presented in this paper. Subsequent numerical experiments (cf. Tables 1 and 2 and Figures 1 and 2, respectively) show that the proposed algorithm not only reduces the communication burden but also improves the speed of convergence.

Simulation
In this section, a nonlinear single-link robot arm is considered to verify the effectiveness of the proposed algorithm. Consider the following system dynamics [39]:

G θ̈(t) = −M g l sin(θ(t)) − D θ̇(t) + u(t),   (64)

where θ(t) is the angular position of the robot arm and u(t) is the control input. Moreover, M is the mass of the payload, G is the moment of inertia, g is the acceleration of gravity, l is the length of the arm, and D is the viscous friction; here g, l, D are the system parameters and M, G are the design parameters. Set the values of the system parameters as g = 9.81, D = 1, and l = 1; the design parameters M and G are alterable. Let x_1(t) = θ(t) and x_2(t) = θ̇(t). Considering the effect of interference on the actuator, the dynamics with the selected system parameters can be written in the form of (1). Let us take the initial state x(0) = [0.02; −0.5]. The matched disturbance is taken as a bounded signal, and the desired trajectory is generated by ẋ_d = r(x_d) with the initial state x_d(0) = [0.1; 0.65]. Therefore, based on (64)-(66), the augmented system is reformulated as in (5). Based on Lemma 1, one has g† = [0; 25; 0; 0] and k̄ = (I − gg†)h = [0.1013; 0; 0; 0]. The integral sliding mode surface is as in (7) with the sliding mode gain X = 1.
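The arm model can be integrated numerically as a sanity check; we assume the common form Gθ̈ = −Mgl sin θ − Dθ̇ + u with the stated parameters g = 9.81, D = 1, l = 1, and illustrative design values for M and G (the paper leaves them alterable).

```python
import numpy as np

# Sketch of the single-link arm: G*theta_ddot = -M*g*l*sin(theta) - D*theta_dot + u.
g_acc, D, l = 9.81, 1.0, 1.0    # stated system parameters
M, G = 1.0, 10.0                # illustrative design-parameter choices

def arm_rhs(x, u):
    x1, x2 = x                  # x1 = theta (angle), x2 = theta_dot (rate)
    return np.array([x2, (-M * g_acc * l * np.sin(x1) - D * x2 + u) / G])

# unforced forward-Euler rollout from the paper's initial state x(0) = [0.02, -0.5]
x, dt = np.array([0.02, -0.5]), 1e-3
for _ in range(1000):
    x = x + dt * arm_rhs(x, u=0.0)
```

With damping D > 0 and no input, the state decays toward the origin, consistent with the model being a well-posed test plant.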
For simulation, the parameters of the algorithm are chosen as Q = diag(100, 100, 0, 0), γ = 5, ζ = 0.9, and the small scalar threshold of Algorithm 1 is 0.5. The parameters of the triggering condition are selected as η_1(0) = 0.1, θ_1 = 1, η(0) = 0.1, L = 10, λ = 0.1, θ = 1, and α = 0.5. The initial NN weight is selected as ω_c = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], and the NN activation function vector φ(ξ) is chosen accordingly. To make a comparison with DET via PE, a small probing noise is injected into u for the first 90 s during the neural network implementation. Figure 1 presents the evolution of the tracking system states and the augmented system states with the DET via PE technique. Figure 2 presents the evolution of the tracking system states and the augmented system states with the DET-based ER technique. From Figures 1 and 2, it is obvious that the convergence of the states and tracking trajectories via the ER method is faster than via the PE method. The optimal control u_0, the ISMC law u_1, the composite law u, and the sliding mode function S are shown in Figures 3 and 4 for the DET via PE technique and the DET via ER technique, respectively; from these figures, we again see that convergence via the ER method is faster. The initial weights are selected randomly in the interval [0, 1]. After a learning process, Figure 5 presents the convergence of the critic NN weights ω_c. The evolution of the event error ‖e_i‖² and the triggering threshold ‖e_T‖² is shown in Figure 6. The interevent times of the control u_0 under the four strategies are shown in Figure 7. Next, the characteristics of the different strategies are analyzed by comparison through Tables 1-3. First, Table 1 shows the data comparison for the four control strategies. As is well known, the number of triggering events is one of the most important factors in evaluating a triggering mechanism; a smaller number of triggering events reduces the communication burden and saves resources.
To achieve this goal, unnecessary updates of the controller must be reduced while guaranteeing system performance. Based on this, a novel adaptive adjustment technique consisting of DET via the ER technique is used. In addition, with the help of the simulation platform and MATLAB, using the techniques of [39,40] and the technique of this paper, respectively, the experimental results are shown in Table 1. From Table 1, it is obvious that the DET via ER technique performs best and can better reduce the communication burden and save resources, since only 418 triggering events occur thanks to the larger average triggering interval of 0.2392 s. In particular, one may notice that the ER technique is also beneficial in reducing the number of triggering events compared with the PE technique under the same event-triggered conditions. Similarly, DET is beneficial in reducing the number of triggering events compared with ET under the same PE (or ER) technique. In addition, a positive minimal interval indicates Zeno-free behavior; the minimal interval values of the four strategies are consistent with Figure 7. Moreover, according to Figure 7, the DET via ER technique generates the largest minimal interval compared with the other techniques, so the effectiveness of the technique designed in this paper is verified.
Next, we define another important factor to evaluate the control strategy, called the triggering rate (relative to TT). The triggering rate is calculated as the number of DET events divided by the number of TT samples. When θ = 1 and λ = 0.01, Tables 2 and 3 show that the rate of DET via PE and the rate of DET via ER are 23.75% and 20.08%, respectively. Generally speaking, a small triggering rate is more favorable than a large one. To investigate how the parameters of the triggering rule affect this quantity, triggering rates for several settings are shown in Tables 2 and 3, which report the effect of the parameters θ and λ with DET via PE and ER, respectively. In summary, a bigger θ means a reduction in the number of events; in contrast, a bigger λ means an increase in the number of events.
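The triggering-rate metric is a simple ratio; a minimal sketch follows. The event and sample counts below are illustrative, chosen only to reproduce a 20.08%-style rate, and are not taken from the tables.

```python
# Triggering rate relative to time-triggered sampling:
# rate = (number of DET events) / (number of TT samples on the same horizon).
def triggering_rate(det_events, tt_samples):
    return det_events / tt_samples

# illustrative counts: 2008 events against 10000 TT samples give 20.08%
rate = triggering_rate(2008, 10000)   # = 0.2008
```

A rate below 100% quantifies directly how much communication the event-triggered scheme saves over periodic sampling.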

Conclusions
In this article, a learning-based event-triggered optimal tracking control technique for nonlinear systems was developed via ADP. Matched uncertainties have been eliminated by the proposed ISMC. Unmatched uncertainties have been attenuated using a projection matrix and an optimal controller. A critic NN with a novel dynamic event-triggered rule has been constructed to approximately solve the HJI equation, and all errors have been proved UUB using the Lyapunov analysis method. Moreover, the simulation results revealed that our control algorithm is more favorable than traditional event-triggered control algorithms. Future work will further explore this framework, e.g., in the presence of various delays or system constraints.

Remark 5.
(1) The pros: This study first provides a codesign of the dynamic event mechanism and the experience replay-based weighted adaptive adjustment technique to guarantee the optimal approximation of the cost function V*. In addition, this codesign not only speeds up the approximation of the cost function but also enlarges the average inter-event interval, reducing the update frequency of the controller and saving computation and communication resources.
(2) The cons: In the proof of Theorem 2, approximating the sign function as sgn(S) ≈ tanh(S/ν) with ν ≥ 1 means that we only obtain an approximate condition rather than a sufficient one, which weakens the effect of the sliding-mode control. This issue should be investigated further.