Abstract
This paper addresses the problem of decentralized safety control (DSC) of constrained interconnected nonlinear safety-critical systems under reinforcement learning strategies, where asymmetric input constraints and security constraints are considered. First, improved performance functions associated with the actuator estimates for each auxiliary subsystem are constructed. Then, the decentralized control problem with security constraints and asymmetric input constraints is transformed into an equivalent decentralized control problem with asymmetric input constraints only, using the barrier function. This approach ensures that safety-critical systems operate and learn optimal DSC policies within their safe global domains. Next, the optimal control strategy is shown to ensure that the entire system is uniformly ultimately bounded (UUB). In addition, Lyapunov theory is used to show that all signals in the closed-loop auxiliary subsystems are UUB, and the effectiveness of the designed method is verified by a simulation example.
1. Introduction
Over the past few decades, safety has received increasing attention in autonomous driving [1], intelligent robots [2], robotic arms [3], adaptive cruise control [4], etc. The design of these systems and controllers requires that the system state trajectories evolve within a set called the safe set, reflecting the inherent properties of the system [5]. In practice, many engineering systems must operate within a specific safety range, beyond which the controlled system may be at risk [6]. Safety-critical systems primarily refer to systems having control behaviors that prioritize safety. The designed control schemes aim to reduce the potential for severe consequences, such as personal injury and environmental pollution, which may arise due to system shutdown or operational errors [7]. To ensure the safety and reliability of the system, scholars have developed many safety control schemes. The classical approach focused on extending and applying Nagumo’s theorem to safe sets defined by continuously differentiable functions [8]. In particular, barrier functions (BFs) have become an effective tool for verifying safety and have been widely used in [9,10,11]. They were used to convert a system with security constraints into an equivalent system that satisfies the security requirements, after which a safety controller was designed to protect the system. In [9,10], penalty functions and BF-based state transitions were employed to merge states into a reinforcement learning framework to solve optimal control problems with full-state constraints. In [11], a safe non-strategic reinforcement learning method to solve secure nonlinear systems with dynamic uncertainty was proposed. In [12,13], a new secure reinforcement learning method was proposed to solve secure nonlinear systems with symmetric input constraints. However, the results in [9,10,11,12,13] mainly studied optimal safety control for a single continuous-time or discrete-time nonlinear system. The safety control of interconnected systems has not been fully resolved.
On the other hand, interconnected systems consist of multiple subsystems with interconnected characteristics, and designing controllers for them in the same way as for a single system is difficult [14]. To solve this problem, the decentralized control approach, based on local subsystem information, was proposed [15,16,17]. This approach uses multiple controllers to control the interconnected system. In [18,19], the decentralized control approach differed by initially decomposing the entire system control problem into a series of subproblems that could be solved independently. The solutions to the subproblems (i.e., independent controllers) were then joined to form a decentralized controller to stabilize the entire system. In addition, implementing the decentralized control algorithm used only the local subsystem’s knowledge, not the complete system’s information. Recently, scholars have proposed many schemes or techniques for designing decentralized controllers, including quantization techniques [20], fuzzy techniques [21], and optimal control methods [22]. This paper develops decentralized control strategies from the optimal control perspective. Optimal control problems are usually solved via the solution of the Hamilton–Jacobi–Bellman (HJB) partial differential equation [23,24]. However, the HJB equation is generally not solvable analytically due to its inherent nonlinearity [25,26]. Therefore, adaptive dynamic programming (ADP) and reinforcement learning (RL) algorithms were proposed to obtain numerical solutions to the HJB equation and were widely applied to nonlinear interconnected systems [27,28,29,30]. As discussed in [31,32], ADP and RL can be deemed closely related, as they exhibit similar characteristics in addressing optimal control problems. 
For example, in [27,28], the distributed optimal controller was designed using robust ADP for nonlinear interconnected systems with unknown dynamics and parameters. In [29], the optimal decentralized control problem for interconnected nonlinear systems subject to stochastic dynamics was solved by enhancing the performance function of the auxiliary subsystem and transforming the original control problem into a set of optimal control strategies sampled in periodic patterns. Furthermore, in [30], the identifier–critic network framework was used to solve the problem of decentralized event-triggered control based on sliding-mode surfaces, avoiding the need for knowledge of the system’s internal dynamics. It is worth noting that the control results provided in [27,28,29,30] did not consider input constraints.
Control constraints are widespread in industrial processes and have a detrimental impact on system performance [33,34]. Therefore, the study of constrained nonlinear systems is of practical importance. In [35,36], the RL-based decentralized algorithm was developed for tracking control of constrained interconnected nonlinear systems. In [37], the problem of decentralized optimal control of a constrained interconnected nonlinear system was solved by introducing a nonquadratic performance function to overcome the symmetric input constraint. The results in [35,36,37] mainly addressed the symmetric input constraint. However, asymmetric input constraints arise in several engineering applications [38,39]. In [40], the optimal decentralized control problem with asymmetric input constraints was solved by designing a new non-quadratic performance function. In [41], a new performance function was proposed for interconnected nonlinear systems to overcome the asymmetric input constraint and to solve the decentralized fault-tolerant control problem. However, none of the above studies considered the safety of the system. The optimal decentralized safety control (DSC) for constrained interconnected nonlinear safety-critical systems has not been thoroughly investigated thus far, which inspired our current study.
Motivated by the above discussion, this paper proposes an RL-based DSC strategy for constrained interconnected nonlinear safety-critical systems. The main contributions are summarized below:
- The reinforcement learning algorithm is used to solve the optimal DSC problem for constrained interconnected nonlinear safety-critical systems, and the asymmetric input constraint is successfully handled. The method optimizes the control strategy by minimizing the performance function, ensuring the safety of the system’s state, while considering the asymmetric input constraints.
- Nonlinear interconnected safety-critical systems with asymmetric input constraints and safety constraints are converted to equivalent systems that satisfy user-defined safety constraints using barrier functions. Unlike the nonlinear safety-critical systems [3,9,10,13], this paper solves the security constraint problem of the interconnection term through the potential barrier function, which ensures the interconnected nonlinear safety-critical system satisfies the security constraint.
- The asymmetric input constraints are handled by utilizing a single critic neural network (critic-NN) architecture for online approximation of the performance function. Theoretical analysis shows that the optimal DSC method achieves uniformly ultimately bounded (UUB) system states and neural network weight estimation errors. In addition, a simulation example verifies the feasibility and effectiveness of the developed DSC method.
The remainder of this article is structured as follows. In Section 2, the problem formulation and transformation are presented. In Section 3, the decentralized optimal DSC design scheme is presented. The design scheme for the critic neural network is presented in Section 4. In Section 5, the system stability analysis is presented. In Section 6, a simulation example demonstrates the effectiveness of the presented approach. Lastly, conclusions are given in Section 7.
2. Preliminaries
2.1. Problem Descriptions
Consider a constrained interconnected nonlinear safety-critical system composed of n subsystems, described by:
where is the ith subsystem’s state vector and represents the initial state, represents the overall state vector of the constrained interconnected nonlinear safety-critical system, represents the control input, and the set of asymmetric constraints is represented as with and being the asymmetric saturating minimum and maximum bounds, and represent the drift system dynamics and input dynamics of the ith subsystem, respectively, and are Lipschitz continuous, and represents the unknown interconnected term.
To simplify the design of the controller, let us introduce some assumptions. For , we suppose the equilibrium of the ith subsystem’s state is .
Assumption 1.
For , the satisfies the below unmatched condition:
where is a known function with , and is a bounded vector function that satisfies
where is a constant, and are positive definite functions. Furthermore, and . Then, assuming , the inequality (2) can be written as:
where is a positive constant, and .
Remark 1.
It is noted that constraints (2) and (3) specified by Assumption 1 are strict restrictions on specific interconnected nonlinear systems. Nevertheless, when we consider a function that does not satisfy constraints (2) and (3), the computational cost of analyzing the stability of the closed-loop system is high. In fact, in real-world applications, constraints such as inequalities (2) and (3) are commonly imposed on the mismatched interconnection terms of system (1) [40,42].
Assumption 2.
For , the known function is bounded as , where is a known constant. Furthermore, and .
Based on the ith subsystem (1) described, the ith auxiliary subsystem is designed as:
where is used to compensate for the mismatched interconnections and stands for the auxiliary control, and is the Moore–Penrose pseudoinverse. According to Assumption 2, it can be found that the matrix and . Then, we rewrite the auxiliary subsystem (4) as:
2.2. Security Conversion Issues
For the ith subsystem in the system (1), its state satisfies the following security constraints:
For interconnected nonlinear safety-critical systems with asymmetric input constraints and security constraints, we define the performance function as:
where is the discount factor, and with and are positive definite functions, where and are positive design parameters.
Remark 2.
Because safety constraints and asymmetric input constraints are accounted for in (7), the optimal control law does not converge to zero when the system state reaches the steady phase [43]. Without a discount factor, the performance function may therefore be unbounded, so it is necessary to include the discount factor.
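As a hedged illustration of Remark 2, suppose the discount enters (7) as an exponential weight and the utility settles to a constant value c > 0 because the control does not vanish at steady state. The symbols \gamma (discount factor) and c are assumptions introduced here for illustration only. Then the undiscounted integral diverges while the discounted one stays finite:

```latex
\int_{t}^{\infty} c \,\mathrm{d}\tau = \infty,
\qquad
\int_{t}^{\infty} e^{-\gamma(\tau - t)}\, c \,\mathrm{d}\tau = \frac{c}{\gamma} < \infty,
\quad \gamma > 0 .
```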
Problem 1.
(Decentralized control problems with security constraints and asymmetric input constraints) Consider the safety-critical system (1) and find the policy and auxiliary control strategy in the ith subsystem. The performance function is given by (7) with the ith subsystem state and the control input satisfying the following conditions:
Ensure that the security-critical system state is consistently within the security constraints. Further, the definitions of some barrier functions are given.
Definition 1
(Barrier function [9,10]). The function defined on interval (a, A) is referred to as the barrier function if
where a and A are two constants satisfying . Moreover, the potential function is invertible on the interval , i.e.,
Furthermore, the derivative of (11) is
Based on Definition 1, we consider the state transition based on the potential barrier function as follows:
where . So, the derivative of with respect to t is , and after using Definition 1, we obtain:
where
and is the interconnection term of the th term in the ith subsystem.
Then, the interconnected nonlinear safety-critical system (1) can be rewritten as:
where , and is the unknown interconnected term.
Based on Assumption 1, we define the unknown interconnection term after the system transformation as:
where , and
and is a bounded vector function that satisfies
where is a positive definite function. Then, assuming and , where
According to (3) and (18), the inequality (17) is expressed as:
where is a positive constant, and .
Assumption 3.
Lemma 1
([32]). , we have the following condition,
where and .
Remark 3.
The barrier function in Definition 1, which has the following characteristics, ensures that the safety-critical system (15) always satisfies the safety constraints [9,10].
- 1.
- 2. When the system’s state approaches the boundary of the safety area, the barrier function changes as follows:
- 3. The barrier function vanishes when the system state reaches the equilibrium, i.e.,
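As a numerical illustration of Definition 1 and Remark 3, the sketch below assumes the logarithmic barrier function commonly used in BF-based safe reinforcement learning, B(z; a, A) = ln(A(a − z) / (a(A − z))) with a < 0 < A. This specific form and all function names are assumptions made here for illustration; the cited works [9,10] should be consulted for the exact definition.

```python
import math


def barrier(z, a, A):
    """Assumed log barrier B(z; a, A), defined for a < z < A with a < 0 < A."""
    if not (a < 0.0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    if not (a < z < A):
        raise ValueError("state must lie strictly inside (a, A)")
    return math.log((A * (a - z)) / (a * (A - z)))


def barrier_inverse(s, a, A):
    """Inverse map: recovers z in (a, A) from the transformed state s."""
    e_pos = math.exp(s / 2.0)
    e_neg = math.exp(-s / 2.0)
    return a * A * (e_pos - e_neg) / (a * e_pos - A * e_neg)


def barrier_inverse_derivative(s, a, A):
    """Derivative of the inverse map with respect to s (always positive)."""
    return (A * a**2 - a * A**2) / (
        a**2 * math.exp(s) - 2.0 * a * A + A**2 * math.exp(-s)
    )


if __name__ == "__main__":
    a, A = -2.0, 3.0  # hypothetical asymmetric safety bounds
    for z in (-1.999, -1.0, 0.0, 1.5, 2.999):
        s = barrier(z, a, A)
        print(f"z = {z:7.3f}  ->  s = {s:9.4f}  ->  z = {barrier_inverse(s, a, A):7.3f}")
```

Under this assumed form, the transformed state is zero at the origin and grows without bound in magnitude as the original state approaches either limit, so any bounded transformed state maps back to a state strictly inside the safe interval.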
3. Decentralized Optimal DSC Design
This section consists of two main subsections that establish the decentralized optimal DSC method. First, the security constraint problem is dealt with through the barrier-function-based system transformation, and the HJB equation for the ith auxiliary subsystem without security constraints is developed by introducing the improved performance function. Second, the decentralized safety controller is constructed by solving the HJB equation for the auxiliary subsystem.
3.1. Barrier Function Conversion
According to the ith subsystem (15) described, the ith auxiliary subsystem is designed as:
where is the Moore–Penrose pseudoinverse. According to Assumptions 2 and 3, the matrix is found to be and . Then, the auxiliary subsystem (20) is rewritten as:
Regarding the converted system (15), analogous to (7), the performance function below is introduced:
where and , is the positive definite matrix. Furthermore, denotes the initial state, and is a non-quadratic utility function that handles the asymmetric input constraint. Then, is defined in the following form:
where and , and represent the monotonic odd function, where . In this paper, without loss of generality, .
Remark 4.
Unlike the traditional form of symmetric input constraints [35], this article considers asymmetric constraints on the control inputs [44]. The modified hyperbolic tangent function presented in (22) effectively transforms the asymmetric constrained control problem into an unconstrained control problem by allowing different maximum and minimum bounds.
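The idea in Remark 4 can be sketched as follows. The code assumes one common construction from the constrained ADP literature: the control is a hyperbolic tangent scaled by half the admissible range and shifted to the midpoint of the bounds, and the utility is the matching integral of the inverse hyperbolic tangent. The exact expressions in (22) and (23) may differ, and the names `constrained_control`, `nonquadratic_utility`, `u_min`, `u_max`, and `r` are hypothetical.

```python
import math


def constrained_control(v, u_min, u_max):
    """Map an unconstrained signal v into the asymmetric interval [u_min, u_max]."""
    if not u_min < u_max:
        raise ValueError("u_min must be smaller than u_max")
    half_range = 0.5 * (u_max - u_min)  # scaling of the tanh
    center = 0.5 * (u_max + u_min)      # shift that makes the bounds asymmetric
    return half_range * math.tanh(v) + center


def nonquadratic_utility(u, u_min, u_max, r=1.0):
    """Closed form of 2*r * integral from center to u of half_range*atanh((s-center)/half_range) ds."""
    half_range = 0.5 * (u_max - u_min)
    center = 0.5 * (u_max + u_min)
    w = (u - center) / half_range
    if not -1.0 < w < 1.0:
        raise ValueError("u must lie strictly inside (u_min, u_max)")
    return 2.0 * r * half_range**2 * (w * math.atanh(w) + 0.5 * math.log(1.0 - w * w))


if __name__ == "__main__":
    u_min, u_max = -1.0, 3.0  # hypothetical asymmetric bounds
    for v in (-5.0, -1.0, 0.0, 1.0, 5.0):
        u = constrained_control(v, u_min, u_max)
        print(f"v = {v:5.1f}  ->  u = {u:7.4f}")
```

With this construction the control never leaves the admissible interval regardless of the magnitude of the unconstrained signal, and the utility is zero at the midpoint of the bounds and positive elsewhere.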
Problem 2.
(Optimal decentralized control problems with asymmetric input constraints) Find the control policy and auxiliary control strategy in the ith subsystem that minimize the performance function (22).
Based on the subsystem (21), as well as the performance function (22), the corresponding Hamiltonian is given by:
with .
The optimal performance function is
where is a collection of all acceptable control policies and auxiliary control strategies for .
Based on Bellman’s optimality principle [31], in (25) satisfies the HJB equation
where . Then, the optimal control policy and the auxiliary control policy can be derived as follows:
where .
Substituting and into (26), the HJB equation is rewritten as:
with .
Through the BF-based system transformation, the decentralized control problem 1 with asymmetric input constraints and security constraints is transformed into an unconstrained optimization problem, i.e., the decentralized control problem 2. Next, the following lemma is discussed to ensure the equivalence between the decentralized control problems 1 and 2.
Lemma 2.
Assume that Assumptions 1 to 3 are met and that control policy and auxiliary control strategy solve the decentralized control problem 2 of (21). It follows, then, that the below holds:
Proof.
The performance function and Assumption 3 together satisfy zero-state observability, guaranteeing the existence of the safety-optimal performance function . From (24), we obtain , which allows us to obtain for all . Consequently, as stated in Remark 3, if the initial state of the system (21) satisfies the security constraint (6), and is bounded, then the is also bounded. Finally, we obtain
Therefore, the given and satisfy the constraints of the decentralized control problem 1.
Now, consider the state transition based on the barrier function described in (13) and (14). Since satisfies the constraints given in (8), each element of the state is finite. By comparing the performance functions (7) and (22), the equivalence relation is obtained, provided that . This completes the proof. □
3.2. Designing the Optimal DSC Strategy by Solving n HJB Equations
Throughout this section, we show that the optimal DSC strategies for interconnected nonlinear systems can be constructed by solving the n HJB equations.
Theorem 1.
Consider n subsystems under Assumptions 1 to 3 with DSC policies and auxiliary control strategies , having the corresponding conditions as below:
Next, consider n positive constants , such that for any , the optimal DSC policies , , …, guarantee that the interconnected nonlinear system (15) with security constraints is UUB.
Proof.
The following Lyapunov function candidate is selected:
where the is defined in the same way as (22), and we denote the time derivative along the trajectory as:
According to the optimal DSC policy (27), the term becomes
By appealing to the proof in [44], Equation (37) can be further reduced to
Substituting (38) into (36), one has
It is known from [45] that there is a positive constant such that . Therefore, using Lemma 1, Assumption 1, (17), (19), and (27), we obtain
Utilizing the integral mean value theorem [46] and the inequality (40), (38) can be deduced as:
where .
From [27], we conclude that , where is a positive constant. Then, plugging (40) and (41) into (39), and taking into consideration the conclusion mentioned above, we can rephrase inequality (39) as follows:
by denoting and . Let the condition (31) be satisfied, so we have
with and .
From the matrix X expression, positive definiteness is maintained by choosing a sufficiently large . In other words, there is , such that , ensuring . Thus, the inequality (43) is further deduced as:
The inequality (44) means that whenever lies outside the following set :
Based on Lyapunov’s extension theorem [47], it is shown that the optimal performance functions guarantee that the interconnected nonlinear system (15) with asymmetric input constraints is UUB. Since the performance functions (7) and (22) yield the same results, it follows that the optimal performance function guarantees that the interconnected nonlinear safety-critical system (1) with security constraints and asymmetric input constraints is UUB. □
4. Critic Network for Approximation
The critic neural network is introduced in this section, with the aim of approximating the optimal performance function. Then, the evaluation network of the auxiliary subsystem (21) is used to construct the estimated optimal control strategy. According to [48], is expressed as:
where denotes the activation function, denotes the ideal weight vector, denotes the number of neurons, and is the reconstruction error of NN. The vector activation function is denoted as a continuously differentiable function, where . For , is linearly independent. Then, the derivative of can be expressed as:
where and .
From Equations (27), (28) and (47), the optimal safety control policy and the auxiliary control strategy are rephrased as:
where
with . The selected value of is between and .
The ideal weight vector is not available and the optimal control strategy is not directly applicable. Therefore, the estimated weight vector is constructed to replace as:
The estimation error is defined. Similarly, according to (50), (49) and (48) are further developed as:
According to (53), the error of the Hamiltonian is given by:
with . In order to make , the error should be guaranteed to be sufficiently small. To solve this issue, a critic weight adjustment law is proposed to minimize the objective function . Next, the critic updating law is developed as:
where the constant is the positive learning rate.
Remark 5.
To minimize the Hamiltonian error , it is necessary to maintain the derivative of as . Therefore, the critic weight adjustment law is derived by employing the normalization term and applying the gradient descent method with respect to [49].
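The normalized gradient-descent update described in Remark 5 can be sketched as one explicit Euler step. The sketch makes a simplifying assumption that is not taken from the paper: the Hamiltonian error is treated as affine in the critic weights, e = phi^T w_hat + residual, where the regressor `phi` and the scalar `residual` collect all remaining terms. The normalization by (1 + phi^T phi)^2 is the form commonly used in critic-only ADP designs; the names `phi`, `residual`, `alpha`, and `dt` are hypothetical.

```python
def critic_update_step(w_hat, phi, residual, alpha, dt):
    """One Euler step of normalized gradient descent on E = 0.5 * e**2.

    w_hat    : current critic weight estimates (list of floats)
    phi      : regressor multiplying the weights in the Hamiltonian error
    residual : part of the Hamiltonian error that does not depend on w_hat
    alpha    : positive learning rate
    dt       : integration step size
    Returns the updated weights and the error evaluated before the update.
    """
    # Hamiltonian error under the affine-in-weights assumption.
    e = sum(p * w for p, w in zip(phi, w_hat)) + residual
    # Normalization term keeps the update bounded for large regressors.
    norm = (1.0 + sum(p * p for p in phi)) ** 2
    w_next = [w - dt * alpha * e * p / norm for w, p in zip(w_hat, phi)]
    return w_next, e


if __name__ == "__main__":
    weights = [0.0, 0.0]
    regressor = [1.0, 2.0]  # held constant here purely for illustration
    for k in range(5):
        weights, err = critic_update_step(weights, regressor, -3.0, alpha=1.0, dt=0.1)
        print(f"step {k}: error before update = {err:.4f}")
```

In a full implementation the regressor and residual would be recomputed from the current state and control at every step; holding them fixed here only shows that the update drives the error magnitude down.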
By considering the definition of , we obtain
where and . denotes the residual error, defined as .
The proposed decentralized DSC strategy for the ith subsystem with a single critic-NN is illustrated in Figure 1.
Figure 1.
The block diagram of the developed optimal DSC scheme.
5. Stability Analysis
This section focuses on the stability of the n auxiliary subsystems under the given control scheme. The following assumptions are needed to establish the theorem.
Assumption 4.
For , there exist some positive constants and satisfying , , and .
Assumption 5.
Consider the time period and . Then, the term fulfills the following condition:
where and are positive constants.
Theorem 2.
Proof.
The candidate Lyapunov function is considered to be:
Then, defining and , the time derivative by is
According to Lemma 1, and taking into account (40), (48), (51), we observe that the term in (61) is satisfied by
where . Then, based on the fact in [44], according to Assumption 5, is derived as:
The error weight update law . is considered with the time derivative
Combining Lemma 1 and Assumption 4, the following conclusion is drawn:
Substituting (65) and (68) into (59), the following inequality is obtained:
where , means the minimum eigenvalue of .
By Lyapunov’s extension theorem [47], the stability of the closed-loop system is ensured, and the weight estimation error is UUB. This completes the proof. □
Remark 6.
In contrast to techniques that aim to achieve input saturation [10,13], this article proposes an RL technique to solve the optimal DSC problem with safety constraints and asymmetric input constraints. This approach not only ensures the safety of the system but also satisfies the input constraints. Therefore, the developed reinforcement learning technique, based on security constraints and asymmetric input constraints, is better suited for some engineering applications, particularly for systems where the system state must remain globally within the security settings.
6. Simulation Example
In this section, we provide a simulation example to verify the effectiveness of the proposed approach. The simulation involved a dual-linked robotic arm system [42]. The state space model of the system is defined by
where and indicate the angular position of the robot arm, stands for control input, and the represents the interconnection terms. The other parameters of the robotic arm system (72) are listed in Table 1. The initial system state was selected as . We first defined the state variable and constructed the internal dynamics and input gain matrix as follows:
where , denote the uncertain interconnection terms of subsystems 1 and 2, i.e.,
Table 1.
Meanings and values of symbols used in robotic arm systems.
Furthermore, the two robotic arm subsystems were in a state that satisfied the below security constraints:
Therefore, to deal with the security constraint, the following system of transformations without security constraint was obtained, using the BF-based system transformation (13):
where
For the transformed dual-linked robotic arm system (74), the initial state was chosen as . The discount factors were chosen as and . The matrices were designed as and , and . The upper and lower limits were allocated as below: , and , . Let and . Additional design factors were set up as below: . Choose the activation functions and .
The simulation outcomes are presented in Figures 2–13. The states of the system without the DSC method are depicted in Figure 2 and Figure 8; the closed-loop system stabilized after 20 s and 35 s, respectively, but failed to meet the specified security constraints. In comparison, Figure 3 and Figure 9 show that, with the DSC method, the system states not only converged to zero but also satisfied the given safety constraints. The evolving states and are presented in Figure 4 and Figure 10, based on the safe control method with asymmetric input constraints. The optimal DSC policies are shown in Figure 5 and Figure 11. We found that the optimal DSC policies were restricted to the asymmetric set and . Figure 6 and Figure 12 represent the optimal auxiliary control strategies for subsystems 1 and 2, respectively. Figure 7 and Figure 13 show the evolution of the critic weights. It can be observed that the weights converged after 15 s. According to Theorem 2, we concluded that the proposed optimal safety control policy and the auxiliary control policy could stabilize the closed-loop nonlinear system and satisfy the safety constraints on the system state. Moreover, the optimal control policy eventually converged to a predefined set of constraints. Finally, the results of the simulation showed that the presented optimal DSC solution for constrained interconnected nonlinear safety-critical systems, affected by system state constraints, is effective.
Figure 2.
Evolution of state without using the DSC method.
Figure 3.
Evolution of state using the DSC method.
Figure 4.
Evolution of state using the DSC method.
Figure 5.
Control evolution of input .
Figure 6.
Evolution of the auxiliary control input using the DSC method.
Figure 7.
Evolution of the critic weight vector using the DSC method.
Figure 8.
Evolution of state without using the DSC method.
Figure 9.
Evolution of state using the DSC method.
Figure 10.
Evolution of state using the DSC method.
Figure 11.
Control evolution of input .
Figure 12.
Evolution of the auxiliary control input using the DSC method.
Figure 13.
Evolution of the critic weight vector using the DSC method.
7. Conclusions
This article presents an RL-based DSC scheme for interconnected nonlinear safety-critical systems with security constraints and asymmetric input constraints. The proposed method transforms an interconnected nonlinear safety-critical system with security and asymmetric input constraints into an interconnected nonlinear safety-critical system with only asymmetric input constraints by using the barrier function. A non-quadratic utility function is added to the performance function to address the asymmetric input constraint. A critic network is used to approximate the optimal performance function and to obtain the optimal safety control policy. The control scheme stabilizes the closed-loop system and minimizes the improved performance function. In addition, the simulation results demonstrate the efficacy of the proposed decentralized safety control scheme. Future work will explore the optimal safety control of stochastic interconnected nonlinear systems with event triggering.
Author Contributions
C.Q. and Y.W. provided methodology, validation, and writing—original draft preparation; T.Z. provided conceptualization, writing—review; J.Z. provided supervision; C.Q. provided funding support. All authors read and agreed to the published version of the manuscript.
Funding
This work was supported by the science and technology research project of the Henan province 222102240014.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
The authors can confirm that all relevant data are included in the article.
Conflicts of Interest
The authors declare that they have no conflict of interest. All authors have approved the manuscript and agreed with submission to this journal.
References
- Son, T.D.; Nguyen, Q. Safety-critical control for non-affine nonlinear systems with application on autonomous vehicle. In Proceedings of the 2019 IEEE 58th Conference on Decision and Control (CDC), Nice, France, 11–13 December 2019; pp. 7623–7628.
- Manjunath, A.; Nguyen, Q. Safe and robust motion planning for dynamic robotics via control barrier functions. In Proceedings of the 2021 60th IEEE Conference on Decision and Control (CDC), Austin, TX, USA, 14–17 December 2021; pp. 2122–2128.
- Wang, J.; Qin, C.; Qiao, X.; Zhang, D.; Zhang, Z.; Shang, Z.; Zhu, H. Constrained optimal control for nonlinear multi-input safety-critical systems with time-varying safety constraints. Mathematics 2022, 10, 2744.
- Liu, Z.; Yuan, Q.; Nie, G.; Tian, Y. A multi-objective model predictive control for vehicle adaptive cruise control system based on a new safe distance model. Int. J. Automot. Technol. 2021, 22, 475–487.
- Ames, A.D.; Xu, X.; Grizzle, J.W.; Tabuada, P. Control barrier function based quadratic programs for safety critical systems. IEEE Trans. Autom. Control 2016, 62, 3861–3876.
- Qin, C.; Wang, J.; Zhu, H.; Zhang, J.; Hu, S.; Zhang, D. Neural network-based safe optimal robust control for affine nonlinear systems with unmatched disturbances. Neurocomputing 2022, 506, 228–239.
- Qin, C.; Wang, J.; Zhu, H.; Xiao, Q.; Zhang, D. Safe adaptive learning algorithm with neural network implementation for H∞ control of nonlinear safety-critical system. Int. J. Robust Nonlinear Control 2023, 33, 372–391.
- Srinivasan, M.; Abate, M.; Nilsson, G.; Coogan, S. Extent-compatible control barrier functions. Syst. Control Lett. 2021, 150, 104895.
- Yang, Y.; Yin, Y.; He, W.; Vamvoudakis, K.G.; Modares, H. Safety-aware reinforcement learning framework with an actor-critic-barrier structure. In Proceedings of the 2019 American Control Conference (ACC), Philadelphia, PA, USA, 10–12 July 2019; pp. 2352–2358.
- Yang, Y.; Vamvoudakis, K.G.; Modares, H.; Yin, Y.; Wunsch, D.C. Safe intermittent reinforcement learning with static and dynamic event generators. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 5441–5455.
- Xu, J.; Wang, J.; Rao, J.; Zhong, Y.; Wang, H. Adaptive dynamic programming for optimal control of discrete-time nonlinear system with state constraints based on control barrier function. Int. J. Robust Nonlinear Control 2022, 32, 3408–3424.
- Brunke, L.; Greeff, M.; Hall, A.W.; Yuan, Z.; Zhou, S.; Panerati, J.; Schoellig, A.P. Safe learning in robotics: From learning-based control to safe reinforcement learning. Annu. Rev. Control Robot. Auton. Syst. 2022, 5, 411–444.
- Qin, C.; Zhu, H.; Wang, J.; Xiao, Q.; Zhang, D. Event-triggered safe control for the zero-sum game of nonlinear safety-critical systems with input saturation. IEEE Access 2022, 10, 40324–40337.
- Bakule, L. Decentralized control: An overview. Annu. Rev. Control 2008, 32, 87–98.
- Xu, L.X.; Wang, Y.L.; Wang, X.; Peng, C. Decentralized Event-Triggered Adaptive Control for Interconnected Nonlinear Systems With Actuator Failures. IEEE Trans. Fuzzy Syst. 2022, 31, 148–159.
- Guo, B.; Dian, S.; Zhao, T. Robust NN-based decentralized optimal tracking control for interconnected nonlinear systems via adaptive dynamic programming. Nonlinear Dyn. 2022, 110, 3429–3446.
- Feng, Z.; Li, R.B.; Wu, L. Adaptive decentralized control for constrained strong interconnected nonlinear systems and its application to inverted pendulum. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–11.
- Zouhri, A.; Boumhidi, I. Stability analysis of interconnected complex nonlinear systems using the Lyapunov and Finsler property. Multimed. Tools Appl. 2021, 80, 19971–19988.
- Li, X.; Zhan, Y.; Tong, S. Adaptive neural network decentralized fault-tolerant control for nonlinear interconnected fractional-order systems. Neurocomputing 2022, 488, 14–22.
- Tan, Y.; Yuan, Y.; Xie, X.; Tian, E.; Liu, J. Observer-based event-triggered control for interval type-2 fuzzy networked system with network attacks. IEEE Trans. Fuzzy Syst. 2023, 1–10.
- Zhang, J.; Li, S.; Ahn, C.K.; Xiang, Z. Adaptive fuzzy decentralized dynamic surface control for switched large-scale nonlinear systems with full-state constraints. IEEE Trans. Cybern. 2021, 52, 10761–10772.
- Huo, X.; Karimi, H.R.; Zhao, X.; Wang, B.; Zong, G. Adaptive-critic design for decentralized event-triggered control of constrained nonlinear interconnected systems within an identifier-critic framework. IEEE Trans. Cybern. 2021, 52, 7478–7491.
- Bao, C.; Wang, P.; Tang, G. Data-Driven Based Model-Free Adaptive Optimal Control Method for Hypersonic Morphing Vehicle. IEEE Trans. Aerosp. Electron. Syst. 2022, 1–15.
- Farzanegan, B.; Suratgar, A.A.; Menhaj, M.B.; Zamani, M. Distributed optimal control for continuous-time nonaffine nonlinear interconnected systems. Int. J. Control 2022, 95, 3462–3476. [Google Scholar] [CrossRef]
- Heydari, M.H.; Razzaghi, M. A numerical approach for a class of nonlinear optimal control problems with piecewise fractional derivative. Chaos Solitons Fractals 2021, 152, 111465. [Google Scholar] [CrossRef]
- Liu, S.; Niu, B.; Zong, G.; Zhao, X.; Xu, N. Data-driven-based event-triggered optimal control of unknown nonlinear systems with input constraints. Nonlinear Dyn. 2022, 109, 891–909. [Google Scholar] [CrossRef]
- Niu, B.; Liu, J.; Wang, D.; Zhao, X.; Wang, H. Adaptive decentralized asymptotic tracking control for large-scale nonlinear systems with unknown strong interconnections. IEEE/CAA J. Autom. Sin. 2021, 9, 173–186. [Google Scholar] [CrossRef]
- Zhao, B.; Luo, F.; Lin, H.; Liu, D. Particle swarm optimized neural networks based local tracking control scheme of unknown nonlinear interconnected systems. Neural Netw. 2021, 134, 54–63. [Google Scholar] [CrossRef] [PubMed]
- Zhao, Y.; Niu, B.; Zong, G.; Xu, N.; Ahmad, A.M. Event-triggered optimal decentralized control for stochastic interconnected nonlinear systems via adaptive dynamic programming. Neurocomputing 2023, 539, 126163. [Google Scholar] [CrossRef]
- Wang, T.; Wang, H.; Xu, N.; Zhang, L.; Alharbi, K.H. Sliding-mode surface-based decentralized event-triggered control of partially unknown interconnected nonlinear systems via reinforcement learning. Inf. Sci. 2023, 641, 119070. [Google Scholar] [CrossRef]
- Lewis, F.L.; Vrabie, D. Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits Syst. Mag. 2009, 9, 32–50. [Google Scholar] [CrossRef]
- Tang, F.; Niu, B.; Zong, G.; Zhao, X.; Xu, N. Periodic event-triggered adaptive tracking control design for nonlinear discrete-time systems via reinforcement learning. Neural Netw. 2022, 154, 43–55. [Google Scholar] [CrossRef]
- Sun, J.; Liu, C. Backstepping-based adaptive dynamic programming for missile-target guidance systems with state and input constraints. J. Frankl. Inst. 2018, 355, 8412–8440. [Google Scholar] [CrossRef]
- Zhao, S.; Wang, J.; Wang, H.; Xu, H. Goal representation adaptive critic design for discrete-time uncertain systems subjected to input constraints: The event-triggered case. Neurocomputing 2022, 492, 676–688. [Google Scholar] [CrossRef]
- Liu, C.; Zhang, H.; Xiao, G.; Sun, S. Integral reinforcement learning based decentralized optimal tracking control of unknown nonlinear large-scale interconnected systems with constrained-input. Neurocomputing 2019, 323, 1–11. [Google Scholar] [CrossRef]
- Sun, H.; Hou, L. Adaptive decentralized finite-time tracking control for uncertain interconnected nonlinear systems with input quantization. Int. J. Robust Nonlinear Control 2021, 31, 4491–4510. [Google Scholar] [CrossRef]
- Duan, D.; Liu, C. Finite-horizon optimal tracking control for constrained-input nonlinear interconnected system using aperiodic distributed nonzero-sum games. IET Control Theory Appl. 2021, 15, 1199–1213. [Google Scholar] [CrossRef]
- Li, Y.; Li, Y.-X.; Tong, S. Event-based finite-time control for nonlinear multi-agent systems with asymptotic tracking. IEEE Trans. Autom. Control 2023, 68, 3790–3797. [Google Scholar] [CrossRef]
- Zhang, H.; Zhao, X.; Zong, G.; Xu, N. Fully distributed consensus of switched heterogeneous nonlinear multi-agent systems with bouc-wen hysteresis input. IEEE Trans. Netw. Sci. Eng. 2022, 9, 4198–4208. [Google Scholar] [CrossRef]
- Yang, X.; Zhou, Y.; Dong, N.; Wei, Q. Adaptive critics for decentralized stabilization of constrained-input nonlinear interconnected systems. IEEE Trans. Syst. Man Cybern. Syst. 2021, 52, 4187–4199. [Google Scholar] [CrossRef]
- Zhao, Y.; Wang, H.; Xu, N.; Zong, G.; Zhao, X. Reinforcement learning-based decentralized fault tolerant control for constrained interconnected nonlinear systems. Chaos Solitons Fractals 2023, 167, 113034. [Google Scholar] [CrossRef]
- Cui, L.; Zhang, Y.; Wang, X.; Xie, X. Event-triggered distributed self-learning robust tracking control for uncertain nonlinear interconnected systems. Appl. Math. Comput. 2021, 395, 125871. [Google Scholar] [CrossRef]
- Tang, Y.; Yang, X. Robust tracking control with reinforcement learning for nonlinear-constrained systems. Int. J. Robust Nonlinear Control 2022, 32, 9902–9919. [Google Scholar] [CrossRef]
- Yang, X.; Zhao, B. Optimal neuro-control strategy for nonlinear systems with asymmetric input constraints. IEEE/CAA J. Autom. Sin. 2020, 7, 575–583. [Google Scholar] [CrossRef]
- Beard, R.W.; Saridis, G.N.; Wen, J.T. Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation. Automatica 1997, 33, 2159–2177. [Google Scholar] [CrossRef]
- Liu, D.; Yang, X.; Wang, D.; Wei, Q. Reinforcement-learning-based robust controller design for continuous-time uncertain nonlinear systems subject to input constraints. IEEE Trans. Cybern. 2015, 45, 1372–1385. [Google Scholar] [CrossRef]
- Pishro, A.; Shahrokhi, M.; Sadeghi, H. Fault-tolerant adaptive fractional controller design for incommensurate fractional-order nonlinear dynamic systems subject to input and output restrictions. Chaos Solitons Fractals 2022, 157, 111930. [Google Scholar] [CrossRef]
- Zhang, L.; Zhao, X.; Zhao, N. Real-time reachable set control for neutral singular Markov jump systems with mixed delays. IEEE Trans. Circuits Syst. II Express Briefs 2021, 69, 1367–1371. [Google Scholar] [CrossRef]
- Lakmesari, S.H.; Mahmoodabadi, M.J.; Ibrahim, M.Y. Fuzzy logic and gradient descent-based optimal adaptive robust controller with inverted pendulum verification. Chaos Solitons Fractals 2021, 151, 111257. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).