Dynamic Event-Triggering Surrounding Control for Multi-USVs Under FDI Attacks via Adaptive Dynamic Programming

Dongwei Wang; Ying Zhang; Qing Hu

doi:10.3390/jmse13081588


¹

Logistics Engineering College, Shanghai Maritime University, Shanghai 201306, China

²

College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China

³

School of Mechanical Engineering, Shanghai Dianji University, Shanghai 201306, China

⁴

School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China

J. Mar. Sci. Eng. 2025, 13(8), 1588; https://doi.org/10.3390/jmse13081588

This article belongs to the Special Issue Ship Wireless Sensor


Abstract

This paper investigates the surrounding control problem of multiple unmanned surface vehicles (USVs) under false data injection (FDI) attacks and proposes a learning-based prescribed performance control (PPC) scheme integrated with a dynamic event-triggering (DET) mechanism. First, a predefined-time observer (PTO) is designed to estimate the injected false data. Next, the constrained surrounding tracking error of the multi-USVs is formulated based on an exponential prescribed performance function. To facilitate the control law design, the constrained surrounding problem is transformed into an unconstrained one using a hyperbolic tangent function. Based on adaptive dynamic programming (ADP) and the DET mechanism, a prescribed performance time-varying surrounding control scheme is then developed. Finally, the effectiveness and superiority of the proposed control strategy are demonstrated through rigorous theoretical analysis and simulation experiments, and Zeno behavior in the event-triggered mechanism is excluded.

Keywords:

adaptive dynamic programming; dynamic event-triggering; surrounding control; prescribed performance control; predefined-time observer; multi-USVs

1. Introduction

In recent years, multi-agent systems (MASs) control has emerged as a significant research focus due to the inherent flexibility, scalability, and intelligent emergence characteristics of MASs. The MASs control technology has been widely used in the formation control of unmanned systems [1], surrounding control of unmanned surface vehicles (USVs) [2], and reconnaissance missions of multiple unmanned aerial vehicles (UAVs) [3].

Multi-USV systems have attracted significant research attention globally, owing to the advantages of their low cost, high maneuverability, and operational reliability. An equal-distance surrounding control strategy was proposed to address the encirclement tasks of multi-USVs with collision avoidance in [4]. Skjetne et al. [5] designed an adaptive recursive control scheme for nonlinear USV systems with parameter uncertainties. In [6], a distributed prescribed performance formation control method with obstacle avoidance was presented for multiple USVs under parameter uncertainties and external disturbances. However, the above-mentioned control methods, such as sliding mode control, adaptive control, and backstepping control, fail to account for the trade-off between control performance and control cost. Moreover, in practical applications, the communication networks of multi-USVs may be vulnerable to cyber-attacks, and the closed-loop systems (CLSs) are required to achieve good steady-state and transient performance. Therefore, it is significant to design a robust, secure, and optimal control scheme with prescribed performance to address the surrounding control problem of multiple USVs under network attacks.

In [7], a fixed-time extended state observer (FTESO) was designed to estimate the velocities and disturbances of multi-USVs, and an FTESO-based fixed-time distributed formation control method was developed. In [8], a leader–follower formation control strategy based on an event-triggered mechanism and an artificial potential function was proposed for multiple autonomous underwater vehicles (AUVs) in a three-dimensional environment. In [9], the formation control problem of multiple USVs under constraints on line-of-sight range and angular measurements was investigated, and a fault-tolerant finite-time formation control method based on a time-varying tangent barrier Lyapunov function was proposed. Formation control methods are generally used to realize cooperative tasks, whereas surrounding control and containment control methods can be utilized to address MAS control problems involving non-cooperative targets [10]. Xu et al. [11] developed an event-triggered control method to address the surrounding-formation control problem with multiple dynamic targets, in which Zeno behavior was excluded. Reference [12] proposed a sampled-data-based containment control strategy to solve the second-order multi-agent control problem involving a dynamic leader with nonzero input, ultimately guaranteeing that all followers converge within the convex hull formed by the dynamic leader. In practical applications, USVs are typically powered by lithium batteries, which are limited in capacity due to the constraints of the USV's size and structure. Therefore, designing control schemes that minimize energy consumption while maintaining USV performance is of significant value. In [13], an optimal tracking control scheme that balances control cost and performance was designed for USVs subject to dead-zone input nonlinearity and external disturbances. In [14], the authors addressed the control problem of fixed-wing UAVs subjected to both matched and unmatched disturbances, and an adaptive optimal control method based on the adaptive gain generalized super-twisting algorithm was proposed to mitigate the negative impact of disturbances on the CLSs.

The aforementioned works primarily focus on designing control methods to ensure the steady-state performance of control systems. However, these control methods do not simultaneously account for both steady-state and transient performance, including factors such as convergence accuracy, convergence speed, and overshoot in the CLSs. In [15], a prescribed performance control (PPC) method was first designed for MIMO nonlinear systems, which can ensure that the state of CLSs remains within the range defined by a prescribed performance function. A reinforcement learning (RL)-based prescribed performance optimal control scheme was presented to deal with the constrained tracking control problem of USVs [16]. An adaptive fault-tolerant control scheme based on PPC techniques was developed for spacecraft in the presence of external disturbances and actuator failures in [17]. In [18], an RL-based prescribed performance attitude control method was designed for spacecraft attitude control systems under disturbances and state constraints, and a neural network (NN) weight update law without persistent excitation (PE) conditions was proposed in the RL algorithm. In [19], a leader–follower prescribed performance formation control approach was developed to solve second-order MASs output feedback control problems without velocity measurement. In [20], an FTESO-based PPC method was proposed to handle the path-tracking problem of USVs via sliding mode control (SMC) techniques. An event-triggered reinforcement learning PPC method was presented for the underactuated USV in [21]. The event-triggered strategies in the aforementioned works are mainly static triggering mechanisms. To further reduce the number of event triggers, reference [22] employed a dynamic event-triggered (DET) mechanism to design a finite-time formation control method for USVs.

Multi-UAVs are one branch of cyber-physical systems (CPSs); malicious cyberattacks can have severe negative impacts on their security and stability [23,24]. The cyberattacks are mainly categorized into denial of service (DoS) attacks [25] that destroy data availability, and false data injection (FDI) attacks [26] that damage data integrity. Compared to DoS attacks, FDI attacks are more covert and destructive. Considering the consensus control problem of multiple UAVs under deception attacks, a memory-based event-triggered control strategy was proposed by incorporating an averaging mechanism to suppress event mis-triggering caused by deception attacks [27]. For smart grid systems subjected to FDI attacks, a deep-learning-based intelligent detection algorithm was designed to identify the behavioral characteristics of false data using historical measurement data in [28]. He et al. [29] proposed a pulse-synchronization-based secure control method for MASs under deception attacks. For MASs under composite attacks combining FDI and information replay, an adaptive observer-based distributed leader–follower resilient control scheme was proposed in [30]. A periodic event-triggered resilient control strategy was developed for the consensus control problem of linear MASs under FDI and DoS attacks [31]. In [32], a fully distributed resilient control protocol was developed for the formation-containment control problem of higher-order heterogeneous MASs under unbounded FDI attacks. In [33], a distributed resilient containment control method was proposed for MASs against FDI attacks. Zhang et al. [34] designed a predictive sliding mode control method to address the output tracking problem of higher-order fully actuated systems in the presence of FDI attacks. In [35], considering the higher-order fully actuated MASs under FDI attacks, a predictive sliding mode control approach was proposed to ensure the stability and security of CLSs.

By analyzing the aforementioned works, it can be observed that there is relatively little research on the surrounding control problem of multi-USVs under FDI cyberattacks. Inspired by the above studies, this paper proposes a learning-based resilient DET surrounding control strategy for multi-USVs under FDI attacks. The main contributions of this work are as follows:

1. A novel predefined-time observer (PTO) is proposed to estimate false data in the communication network of multi-USVs, and a PTO-based resilient control scheme is designed. In contrast to the asymptotic convergence FDI observer presented in [27], the proposed PTO can estimate the false data within a user-predefined time and does not require the assumption that the false data are bounded.

2. Based on the information from the PTO, a prescribed performance time-varying surrounding control method is designed for multi-USVs within the adaptive dynamic programming (ADP) framework. In [16], an RL-based control method for the tracking control problem of USVs was proposed using the NN weight update law under the persistent excitation condition (PE). In contrast, the NN weight update law in this paper relaxes the PE requirement by incorporating the system’s state into the learning rate, thereby accelerating the convergence of the weight update law.

3. Compared to these control schemes for USVs in [4,11,16], this work proposes an ADP-based prescribed performance controller integrated with the DET mechanism. In contrast to the time-sampling controller in [4,11], and the static event-triggering strategy in [16], the proposed DET scheme further reduces the event-triggering numbers to reduce energy consumption. Moreover, by combining the ADP and PPC techniques, the proposed optimal surrounding controller effectively balances control cost and performance while considering both the steady-state and transient performance of the CLSs.

2. Materials and Methods

This paper investigates the surrounding control problem of multiple unmanned surface vehicles (USVs) under false data injection (FDI) attacks. The control structure of the multi-USVs under FDI attacks is given in Figure 1. The false data are injected into the communication network between USVs to degrade the performance of the CLSs. The proposed FDI compensation observer estimates the false signals so that the impact of FDI attacks can be mitigated. Finally, by combining the framework of adaptive dynamic programming with the dynamic event-triggered mechanism, a resilient prescribed performance surrounding control method is designed for multi-USVs.

Figure 1. The structure illustration of surrounding control for multi-USVs under FDI attacks.

2.1. Communication Network Diagram of USV

The time-varying surrounding control problem of multi-USVs is investigated in this work. Considering the multi-USVs with N agents, the directed communication network of USVs is defined as $G = (V, ε)$, where $V = \{v_{1}, v_{2}, \dots, v_{N}\}$ denotes the node set and $ε \subseteq V \times V$ is the edge set of the directed graph $G$. The symbol $A = [a_{i j}] \in R^{N \times N}$ is defined as the adjacency matrix of multi-USVs, where $a_{i j}$ denotes the connection weight between the ith USV and jth USV. If $(v_{j}, v_{i}) \in ε$, which means that the ith USV can acquire the state of the jth USV, then $a_{i j} = 1$; otherwise $a_{i j} = 0$. The diagonal matrix $D = diag \{d_{i}\}$ is denoted as the in-degree matrix of USVs, where $d_{i} = \sum_{j = 1}^{N} a_{i j}$. The Laplacian matrix is given by $L = D - A$.

Let $η_{0}$ denote the state of the target USV. If the state information of the target USV can flow to the ith USV, then $b_{i 0} = 1$; otherwise $b_{i 0} = 0$. The digraph $\bar{G}$ with $N + 1$ agents contains the target USV and the N pursuer USVs. The Laplacian matrix of the digraph $\bar{G}$ is defined as $H = L + B$, where $B = diag \{b_{i 0}\}$.
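As an illustration of the definitions above, the following minimal sketch assembles $A$, $D$, $L$, $B$, and $H = L + B$ for a small network. The function name `graph_matrices`, the edge-list input format, and the three-USV chain topology are assumptions made for this example and do not come from the paper.

```python
def graph_matrices(edges, target_links, n):
    """Build A, D, L, B and H = L + B for n pursuer USVs.

    edges: iterable of (j, i) pairs, meaning USV i receives the state of USV j
           (0-based indices; a hypothetical input format).
    target_links: set of pursuers that receive the target state (b_i0 = 1).
    """
    A = [[0] * n for _ in range(n)]
    for j, i in edges:
        A[i][j] = 1                      # a_ij = 1 when (v_j, v_i) is an edge
    d = [sum(row) for row in A]          # in-degree d_i = sum_j a_ij
    D = [[d[i] if i == k else 0 for k in range(n)] for i in range(n)]
    L = [[D[i][k] - A[i][k] for k in range(n)] for i in range(n)]
    b = [1 if i in target_links else 0 for i in range(n)]
    B = [[b[i] if i == k else 0 for k in range(n)] for i in range(n)]
    H = [[L[i][k] + B[i][k] for k in range(n)] for i in range(n)]
    return A, D, L, B, H


# Illustrative chain: target -> USV 0 -> USV 1 -> USV 2
A, D, L, B, H = graph_matrices(edges=[(0, 1), (1, 2)], target_links={0}, n=3)
```

In this chain every pursuer is reachable from the target, and each row of $L$ sums to zero, as expected for a Laplacian built from in-degrees.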

Assumption 1.

For the digraph $\bar{G}$, there exists at least one directed path from the target USV to every pursuer USV; that is, $\bar{G}$ contains a directed spanning tree rooted at the target USV.

2.2. The Dynamical Model of USV

The simplified schematic diagram of the unmanned surface vehicle (USV) is shown in Figure 2, where $F_{e} O_{e} X_{e} Y_{e}$ represents the Earth-fixed coordinate frame and $F_{B} O_{B} X_{B} Y_{B}$ denotes the body-fixed coordinate frame of the USV.

Figure 2. Construction and frame of the unmanned surface vehicle (USV).

According to references [6,16], the kinematic and dynamic models of the USVs are given as follows:

\begin{cases} {\dot{η}}_{i} = R_{i} (φ_{i}) v_{i} \\ {\dot{v}}_{i} = - M_{i}^{- 1} [C_{i} (v_{i}) v_{i} + D_{i} (v_{i}) v_{i} - τ_{i}] \end{cases}

(1)

where

η_{i} = {[x_{i}, y_{i}, φ_{i}]}^{T}

denotes the state of the ith USV,

(x_{i}, y_{i})

is the position state and

φ_{i}

is the yaw angle.

v_{i} = {[u_{i}, v_{i}, r_{i}]}^{T}

is the velocity vector,

u_{i}, v_{i}, r_{i}

are the surge, sway, yaw velocities, respectively.

τ_{i} = {[τ_{u_{i}}, τ_{v_{i}}, τ_{r_{i}}]}^{T}

is the control input.

M_{i} \in R^{3 \times 3}

represents the inertia matrix, satisfying

M_{i} = M_{i}^{T}

,

C_{i} (v_{i}) \in R^{3 \times 3}

is the skew-symmetric Coriolis force matrix,

C_{i} (v_{i}) = - C_{i} {(v_{i})}^{T}

, and

D_{i} (v_{i}) \in R^{3 \times 3}

is the damping matrix.

R_{i} (φ_{i})

is the rotation matrix from

F_{B} O_{B} X_{B} Y_{B}

to

F_{e} O_{e} X_{e} Y_{e}

, defined as follows:

R_{i} (φ_{i}) = [\begin{matrix} \cos φ_{i} & - \sin φ_{i} & 0 \\ \sin φ_{i} & \cos φ_{i} & 0 \\ 0 & 0 & 1 \end{matrix}]

(2)

where the rotation matrix has the property

R_{i}^{T} (φ_{i}) R_{i} (φ_{i}) = I

. The matrices

M_{i}

,

C_{i} (v_{i})

,

D_{i} (v_{i})

are expressed as follows:

M_{i} = [\begin{matrix} m_{11 i} & 0 & 0 \\ 0 & m_{22 i} & m_{23 i} \\ 0 & m_{23 i} & m_{33 i} \end{matrix}], C_{i} (v_{i}) = [\begin{matrix} 0 & 0 & - c_{13 i} \\ 0 & 0 & c_{23 i} \\ c_{13 i} & - c_{23 i} & 0 \end{matrix}], D_{i} (v_{i}) = [\begin{matrix} d_{11 i} & 0 & 0 \\ 0 & d_{22 i} & 0 \\ 0 & 0 & d_{33 i} \end{matrix}]

(3)

with $m_{11 i} = m_{i} - X_{\dot{u} i}$, $m_{22 i} = m_{i} - Y_{\dot{v} i}$, $m_{33 i} = I_{z i} - N_{\dot{r} i}$, and $m_{23 i} = m_{i} x_{g i} - Y_{\dot{r} i}$, where $m_{i}$ is the mass of the ith USV, $I_{z i}$ denotes the moment of inertia about the z-axis, $x_{g i}$ is the x-coordinate of the center of gravity of the USV in the body-fixed frame, and $X_{\dot{u} i}, Y_{\dot{v} i}, Y_{\dot{r} i}, N_{\dot{r} i}$ are the added-mass coefficients. The Coriolis terms are $c_{13 i} = m_{22 i} v_{i} + m_{23 i} r_{i}$ and $c_{23 i} = m_{11 i} u_{i}$. The damping terms are $d_{11 i} = - (X_{u i} + X_{u | u | i} |u_{i}|)$, $d_{22 i} = - (Y_{v i} + Y_{v | v | i} |v_{i}|)$, and $d_{33 i} = - (N_{r i} + N_{r | r | i} |r_{i}|)$, where $X_{u i}, Y_{v i}, N_{r i}$ are the linear damping coefficients of the USV and $X_{u | u | i}, Y_{v | v | i}, N_{r | r | i}$ denote the quadratic damping coefficients.
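To make the structure of model (1) with the matrices in (2) and (3) concrete, the sketch below evaluates its right-hand side for a single USV. The function name `usv_derivatives`, the dictionary keys, and all numerical parameter values are hypothetical and chosen only for illustration. The body-fixed velocity vector is called `nu` in the code to avoid clashing with the sway velocity `v`.

```python
import math


def usv_derivatives(eta, nu, tau, p):
    """Right-hand side of model (1) for one USV.

    eta = [x, y, psi], nu = [u, v, r], tau = [tau_u, tau_v, tau_r];
    p is a dict of model parameters (hypothetical names and values).
    """
    u, v, r = nu
    psi = eta[2]
    # Kinematics: eta_dot = R(psi) * nu, with R from Equation (2).
    eta_dot = [math.cos(psi) * u - math.sin(psi) * v,
               math.sin(psi) * u + math.cos(psi) * v,
               r]
    # Coriolis and damping entries from Equation (3).
    c13 = p["m22"] * v + p["m23"] * r
    c23 = p["m11"] * u
    d11 = -(p["Xu"] + p["Xuu"] * abs(u))
    d22 = -(p["Yv"] + p["Yvv"] * abs(v))
    d33 = -(p["Nr"] + p["Nrr"] * abs(r))
    # rhs = tau - C(nu) * nu - D(nu) * nu, written out row by row.
    rhs = [tau[0] + c13 * r - d11 * u,
           tau[1] - c23 * r - d22 * v,
           tau[2] - c13 * u + c23 * v - d33 * r]
    # M is block diagonal: a scalar surge block and a 2x2 sway-yaw block.
    det = p["m22"] * p["m33"] - p["m23"] ** 2
    nu_dot = [rhs[0] / p["m11"],
              (p["m33"] * rhs[1] - p["m23"] * rhs[2]) / det,
              (p["m22"] * rhs[2] - p["m23"] * rhs[1]) / det]
    return eta_dot, nu_dot


# Illustrative parameters only; they are not taken from the paper.
params = {"m11": 25.8, "m22": 33.8, "m23": 1.0, "m33": 2.76,
          "Xu": -0.72, "Xuu": -1.33, "Yv": -0.86, "Yvv": -36.28,
          "Nr": -1.9, "Nrr": -0.75}
eta_dot, nu_dot = usv_derivatives([0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                                  [25.8, 0.0, 0.0], params)
```

Solving the sway-yaw block by hand keeps the example free of external libraries; a general linear solver would serve equally well.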

Next, according to the Euler-Lagrange modeling method [36,37], the dynamical model (1) of the USV is transformed as follows:

v_{i} = R_{i}^{- 1} {\dot{η}}_{i}

(4)

Taking the derivative of Equation (4), we obtain the following equation:

{\dot{v}}_{i} = {\dot{R}}_{i}^{- 1} {\dot{η}}_{i} + R_{i}^{- 1} {\ddot{η}}_{i}

(5)

Substituting (4) and (5) into (1), we can obtain the following:

J_{i} {\ddot{η}}_{i} + H_{i} {\dot{η}}_{i} = τ_{i}

(6)

where $J_{i} = M_{i} R_{i}^{- 1}$ and $H_{i} = M_{i} {\dot{R}}_{i}^{- 1} + (C_{i} + D_{i}) R_{i}^{- 1}$.

In this work, the control inputs under the FDI attacks are represented as follows:

{\tilde{τ}}_{i} = τ_{i} + E_{i} Γ_{i}

(7)

where $Γ_{i}$ denotes the false information injected by the attacker and $E_{i} = diag \{γ_{i 1}, γ_{i 2}, γ_{i 3}\}$ is the FDI attack probability matrix, whose entries $γ_{i 1}$, $γ_{i 2}$, and $γ_{i 3}$ are random variables that follow the Bernoulli distribution. Their probability distributions satisfy the following:

\begin{array}{l} \mathrm{Prob} \{γ_{i 1} = 1\} = {\bar{γ}}_{i 1}, \\ \mathrm{Prob} \{γ_{i 2} = 1\} = {\bar{γ}}_{i 2}, \\ \mathrm{Prob} \{γ_{i 3} = 1\} = {\bar{γ}}_{i 3}, \end{array}

(8)

where ${\bar{γ}}_{i 1}, {\bar{γ}}_{i 2}, {\bar{γ}}_{i 3} \in [0, 1]$.

Substituting (7) into (6), we can obtain the dynamic systems of multiple USVs under FDI attacks as follows:

J_{i} {\ddot{η}}_{i} + H_{i} {\dot{η}}_{i} = τ_{i} + E_{i} Γ_{i}

(9)
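The attack model in (7) and (8) can be sketched as follows: each control channel is corrupted independently with its own Bernoulli probability. The function name `attacked_input` and the numerical values are assumptions for illustration, and a seeded `random.Random` instance is used only to make runs reproducible.

```python
import random


def attacked_input(tau, false_data, gamma_bar, rng):
    """Return (tau_tilde, E_diag) with tau_tilde = tau + E * Gamma, Equation (7).

    tau, false_data: 3-element lists (commanded input, injected data Gamma).
    gamma_bar: per-channel probabilities Prob{gamma_ik = 1}, Equation (8).
    rng: a random.Random instance (hypothetical interface for this sketch).
    """
    # Each diagonal entry of E is an independent Bernoulli draw.
    e_diag = [1 if rng.random() < prob else 0 for prob in gamma_bar]
    tau_tilde = [t + e * g for t, e, g in zip(tau, e_diag, false_data)]
    return tau_tilde, e_diag


rng = random.Random(0)
tau = [1.0, 0.5, -0.2]            # illustrative commanded input
false_data = [0.3, -0.4, 0.1]     # illustrative injected data

never, _ = attacked_input(tau, false_data, [0.0, 0.0, 0.0], rng)
always, _ = attacked_input(tau, false_data, [1.0, 1.0, 1.0], rng)
```

With all probabilities equal to zero the input is left untouched, and with all probabilities equal to one every channel carries the injected term.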

2.3. The Surrounding Control for Multi-USVs

Definition 1

[38]. Let $X = \{x_{1}, x_{2}, \dots, x_{n}\}$ be a set of points contained in a convex set $C$. The convex hull of $X$ is defined as $C o (X) = \{\sum_{i = 1}^{n} δ_{i} x_{i} | x_{i} \in X, δ_{i} \geq 0, \sum_{i = 1}^{n} δ_{i} = 1\}$.

Definition 2.

The time-varying surrounding control of multi-USVs is achieved if every pursuer USV satisfies the following equation:

\lim_{t \to \infty} ‖η_{i} - η_{0} - ρ_{i 0}‖ = 0

(10)

where $η_{0}$ denotes the state of the target USV, and the time-varying surrounding function is $ρ_{i 0} = {[c_{1} \cos (c_{2} t + 2 π i / N), c_{1} \sin (c_{2} t + 2 π i / N), 0]}^{T}$ with $c_{1}, c_{2} > 0$ and $i = 1, 2, \dots, N$.
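The surrounding function in Definition 2 places the pursuers at equal angular spacing on a circle of radius $c_{1}$ that rotates at rate $c_{2}$. The sketch below evaluates these offsets and their centroid; the function name and the numerical values are illustrative assumptions.

```python
import math


def surrounding_offset(i, n, t, c1, c2):
    """rho_i0(t) for pursuer i (1-based) out of n, as in Definition 2."""
    phase = c2 * t + 2.0 * math.pi * i / n
    return [c1 * math.cos(phase), c1 * math.sin(phase), 0.0]


n, c1, c2, t = 4, 5.0, 0.1, 12.0   # illustrative values, not from the paper
offsets = [surrounding_offset(i, n, t, c1, c2) for i in range(1, n + 1)]
centroid = [sum(o[k] for o in offsets) / n for k in range(3)]
```

For $N \geq 2$ the offsets sum to zero at every time instant, so the target coincides with the centroid of the desired pursuer positions and therefore lies in their convex hull.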

Figure 3 illustrates the process of multiple USVs moving from their initial states to uniformly encircling the target. It can be observed that the target USV finally lies within the convex hull spanned by the pursuer USVs.

Figure 3. Schematic diagram of surrounding control for multi-USVs.

Lemma 1

[39]. Consider the dynamic system $\dot{x} = f (x, u)$, where $x$ represents the system state and $u$ is the control input. If, for positive constants $a$ and $0 < p < 1$, there exists a positive-definite Lyapunov function $V (t)$ satisfying the following:

\dot{V} (t) \leq - \frac{1}{a p T_{c}} {(1 + a V {(t)}^{p})}^{2} V {(t)}^{1 - p}

(11)

with $T_{c} > 0$, then the system state $x$ will converge to the origin within the user-predefined time $T_{c}$.
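A short numerical check can make Lemma 1 tangible. When (11) holds with equality, the substitution $g = 1 / (1 + a V^{p})$ gives $\dot{g} = 1 / T_{c}$, so $V$ reaches zero at $t = T_{c} (1 - 1 / (1 + a V(0)^{p}))$, which is below $T_{c}$ for every initial value. The sketch below compares this closed form with a forward-Euler integration; the function names, step size, and parameter values are assumptions for illustration.

```python
def settle_time(v0, a, p, t_c):
    """Time at which V reaches zero when (11) holds with equality."""
    return t_c * (1.0 - 1.0 / (1.0 + a * v0 ** p))


def simulate(v0, a, p, t_c, dt=1e-4):
    """Forward-Euler integration of the equality case of (11)."""
    v, t = v0, 0.0
    while v > 1e-9 and t < 2.0 * t_c:
        v_dot = -(1.0 + a * v ** p) ** 2 * v ** (1.0 - p) / (a * p * t_c)
        v = max(v + dt * v_dot, 0.0)   # clamp so V stays non-negative
        t += dt
    return t


a, p, t_c = 1.0, 0.5, 1.0              # illustrative constants
for v0 in (1.0, 100.0, 1e4):
    print(v0, settle_time(v0, a, p, t_c), simulate(v0, a, p, t_c))
```

The settling time grows with the initial value but stays bounded by $T_{c}$, which is the property the observer design relies on.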

2.4. Design of a Preset Time Controller Based on ADP and Dynamic Event-Triggering Mechanism

2.4.1. Predefined-Time FDI Attack Observer Design

This work considers the surrounding control problem of multi-USVs under FDI attacks. If the false information in the communication network of USVs is not handled properly, it will lead to the performance degradation of the closed-loop system. To mitigate the impact of FDI attacks, a predefined-time attack compensation observer will be designed for the multi-USVs. One auxiliary variable is defined as follows:

S_{i} = k_{i} \int_{t_{0}}^{t} (τ_{i} - H_{i} {\dot{η}}_{i} + {\dot{J}}_{i} {\dot{η}}_{i} - S_{i} (s)) d s - k_{i} J_{i} {\dot{η}}_{i}

(12)

where

k_{i}

is a positive constant. Taking the time derivative of Equation (12), we obtain the following expression:

{\dot{S}}_{i} = - k_{i} (S_{i} + F_{i})

(13)

where

F_{i} = E_{i} Γ_{i}

.

Theorem 1.

Based on the auxiliary variable

S_{i}

and its derivative

{\dot{S}}_{i}

, which are specifically defined in Equations (12) and (13), a predefined-time observer is designed as follows:

{\dot{\hat{S}}}_{i} = \frac{1}{2 a_{i} p_{i} T_{i}} {(1 + a_{i} L_{i}^{p_{i}})}^{2} L_{i}^{- p_{i}} {\tilde{S}}_{i} + {\dot{S}}_{i}

(14)

with

a_{i} > 0

,

0 < p_{i} < 1

,

L_{i} = \frac{1}{2} {\tilde{S}}_{i}^{T} {\tilde{S}}_{i}

,

{\tilde{S}}_{i} = S_{i} - {\hat{S}}_{i}

. According to Equations (13) and (14), the predefined-time FDI observer that alleviates the impact of FDI attacks on the closed-loop system of USVs is given as follows:

{\hat{F}}_{i} = - \frac{1}{k_{i}} (k_{i} {\hat{S}}_{i} + {\dot{S}}_{i})

(15)

Under the observer (15), the estimation error

{\tilde{F}}_{i} = F_{i} - {\hat{F}}_{i}

will converge to zero within the predefined time

T_{i}

.

Proof.

The function

L_{i} = \frac{1}{2} {\tilde{S}}_{i}^{T} {\tilde{S}}_{i}

is selected as the Lyapunov function, and the

{\dot{L}}_{i}

can be obtained as follows:

\begin{aligned} {\dot{L}}_{i} &= {\tilde{S}}_{i}^{T} {\dot{\tilde{S}}}_{i} \\ &= {\tilde{S}}_{i}^{T} (- \frac{1}{2 a_{i} p_{i} T_{i}} {(1 + a_{i} L_{i}^{p_{i}})}^{2} L_{i}^{- p_{i}} {\tilde{S}}_{i}) \\ &\leq - \frac{1}{a_{i} p_{i} T_{i}} {(1 + a_{i} L_{i}^{p_{i}})}^{2} L_{i}^{1 - p_{i}} \end{aligned}

(16)

According to Lemma 1, it can be derived that

‖{\tilde{S}}_{i}‖ \to 0

as the time

t \to T_{i}

. When

t \geq T_{i}

,

‖{\tilde{S}}_{i}‖ = 0

can be obtained. Based on Equations (13) and (15), we can obtain the following:

\begin{aligned} {\tilde{F}}_{i} &= F_{i} - {\hat{F}}_{i} \\ &= F_{i} + \frac{1}{k_{i}} (k_{i} {\hat{S}}_{i} - k_{i} (S_{i} + F_{i})) \\ &= - {\tilde{S}}_{i} \end{aligned}

(17)

Based on the above analysis, it can be concluded that the estimation error

‖{\tilde{F}}_{i}‖

of the false data will converge to zero within the predefined time

T_{i}

. □

2.4.2. Tracking Error Subsystem of USVs

Considering the second-order Euler-Lagrange dynamic model of the USV in Equation (6), the following controller is given as follows:

τ_{i} = τ_{r i} + τ_{f i}

(18)

where the

τ_{r i}

denotes the ADP-based controller, the feedforward control law

τ_{f i}

is defined as

τ_{f i} = H_{i} {\dot{η}}_{i} - {\hat{F}}_{i} + J_{i} {\ddot{h}}_{i 0}

, and

h_{i 0}

is the surrounding position vector of the ith USV relative to the target.

The position vector of the target USV is set as

η_{d} = {[x_{d}, y_{d}, φ_{d}]}^{T}

, and

v_{d} = {\dot{η}}_{d}

is the velocity vector of the target. Next, the surrounding tracking error model for the multi-USVs will be established. The position error

e_{i, 1}

of the ith USV relative to its neighbors is formulated as follows:

e_{i, 1} = \sum_{j = 0}^{N} a_{i j} (η_{i} - η_{j} - h_{i j})

(19)

where

h_{i j} = h_{i 0} - h_{j 0}

denotes the relative position between ith USV and jth USV,

η_{0} = η_{d}

.

The velocity tracking error

e_{i, 2}

is defined as follows:

{\dot{e}}_{i, 1} = e_{i, 2} = \sum_{j = 0}^{N} a_{i j} ({\dot{η}}_{i} - {\dot{η}}_{j} - {\dot{h}}_{i j})

(20)

Taking the derivative of Equation (20) with respect to time and substituting (9) and (18) yields the following:

\begin{aligned} {\dot{e}}_{i, 2} &= \sum_{j = 0}^{N} a_{i j} ({\ddot{η}}_{i} - {\ddot{η}}_{j} - {\ddot{h}}_{i j}) \\ &= \sum_{j = 0}^{N} a_{i j} (J_{i}^{- 1} τ_{r i} - J_{j}^{- 1} τ_{r j} + J_{i}^{- 1} F_{i} - J_{i}^{- 1} {\hat{F}}_{i} - (J_{j}^{- 1} F_{j} - J_{j}^{- 1} {\hat{F}}_{j})) \\ &= \sum_{j = 0}^{N} a_{i j} (τ_{R i} - τ_{R j} + J_{i}^{- 1} {\tilde{F}}_{i} - J_{j}^{- 1} {\tilde{F}}_{j}) \end{aligned}

(21)

with

τ_{R i} = J_{i}^{- 1} τ_{r i}

,

τ_{R j} = J_{j}^{- 1} τ_{r j}

,

τ_{R 0} = {\ddot{η}}_{d}

. One can determine that

{\tilde{F}}_{i} = 0_{3}

and

{\tilde{F}}_{j} = 0_{3}

when time

t \geq T_{i}

. According to (20) and (21), a combined dynamic system is defined as follows:

{\dot{e}}_{i} = A_{i} e_{i} + B_{i} τ_{R i} - B_{i} τ_{R j}

(22)

where

e_{i} = {[e_{i, 1}^{T}, e_{i, 2}^{T}]}^{T}

,

e_{i, 1} = {[e_{i 1}, e_{i 2}, e_{i 3}]}^{T}

,

e_{i, 2} = {[e_{i 4}, e_{i 5}, e_{i 6}]}^{T}

,

A_{i} = [\begin{matrix} 0_{3 \times 3} & I_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} \end{matrix}]

,

B_{i} = [\begin{matrix} 0_{3 \times 3} \\ I_{3 \times 3} \sum_{j = 0}^{N} a_{i j} \end{matrix}]

.

In this paper, to ensure that the surrounding tracking error of USVs has both good steady-state and good transient performance, the following prescribed performance function is designed:

λ_{i} (t) = (λ_{i} (0) - λ_{i, \infty}) \exp (- l_{i} t) + λ_{i, \infty}

(23)

where

λ_{i} (t) = {[λ_{i 1}, λ_{i 2}, \dots λ_{i 6}]}^{T}

,

l_{i} > 0

,

0 < |e_{i k} (0)| < λ_{i k} (0)

,

0 < λ_{i k, \infty} < λ_{i k} (0)

,

k = 1, \dots 6

. The surrounding tracking error

e_{i}

can be expressed as follows:

e_{i} (t) = λ_{i} (t) Γ_{i} (z_{i})

(24)

where

Γ_{i} (z_{i})

is a monotonically increasing smooth function such that

\lim_{z_{i} \to + \infty} Γ_{i} (z_{i}) = 1

,

\lim_{z_{i} \to - \infty} Γ_{i} (z_{i}) = - 1

. The mapping function

Γ_{i} (z_{i})

is designed as follows:

Γ_{i} (z_{i}) = \frac{\exp (z_{i}) - \exp (- z_{i})}{\exp (z_{i}) + \exp (- z_{i})}

(25)

Because

- 1 < Γ_{i} (z_{i}) < 1

, it can be deduced that

- λ_{i} (t) < λ_{i} (t) Γ_{i} (z_{i}) < λ_{i} (t)

from Equation (25). Furthermore, we can obtain the following:

- λ_{i} (t) < e_{i} (t) < λ_{i} (t)

(26)

From (26), the inverse function of the variable

Γ_{i} (z_{i})

is given as follows:

z_{i} = \frac{1}{2} \ln \frac{1 + Γ_{i} (z_{i})}{1 - Γ_{i} (z_{i})} = \frac{1}{2} \ln \frac{1 + (e_{i} / λ_{i})}{1 - (e_{i} / λ_{i})}

(27)

The time derivative of Equation (27) can be given as follows:

{\dot{z}}_{i} = \frac{1}{2} (\frac{{\dot{λ}}_{i} + {\dot{e}}_{i}}{λ_{i} + e_{i}} - \frac{{\dot{λ}}_{i} - {\dot{e}}_{i}}{λ_{i} - e_{i}}) = G_{i 1} ({\dot{e}}_{i} - G_{i 2} e_{i})

(28)

with

G_{i 1} = diag \{G_{i 11}, G_{i 12}, \dots, G_{i 16}\}

,

G_{i 1 k} = λ_{i k} / (λ_{i k}^{2} - e_{i k}^{2})

,

G_{i 2} = diag \{G_{i 21}, G_{i 22}, \dots, G_{i 26}\}

,

G_{i 2 k} = {\dot{λ}}_{i k} / λ_{i k}

,

k = 1, \dots 6

. Substituting Equation (22) into (28), we obtain the following:

{\dot{z}}_{i} = F (z_{i}) + {\bar{B}}_{i} τ_{R i} - {\bar{B}}_{i} τ_{R j}

(29)

where

F (z_{i}) = G_{i 1} (A_{i} - G_{i 2}) e_{i}

,

{\bar{B}}_{i} = G_{i 1} B_{i}

. According to (24) and (27),

F (0) = 0

can be derived.
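The error transformation in (23) to (28) can be sketched for a single error component as follows. Equation (27) is the inverse hyperbolic tangent of $e_{i k} / λ_{i k}$, so the standard library functions `math.atanh` and `math.tanh` are used directly. The function names and numerical values are illustrative assumptions.

```python
import math


def performance_bound(t, lam0, lam_inf, l):
    """Exponential performance function lambda(t) of Equation (23)."""
    return (lam0 - lam_inf) * math.exp(-l * t) + lam_inf


def to_unconstrained(e, lam):
    """z = 0.5 * ln((1 + e/lam) / (1 - e/lam)), Equation (27); needs |e| < lam."""
    if not abs(e) < lam:
        raise ValueError("tracking error is outside the prescribed bound")
    return math.atanh(e / lam)


def to_constrained(z, lam):
    """e = lam * tanh(z), Equations (24) and (25)."""
    return lam * math.tanh(z)


def z_dot(e, e_dot, lam, lam_dot):
    """Scalar form of Equation (28): G1 * (e_dot - G2 * e)."""
    g1 = lam / (lam ** 2 - e ** 2)
    g2 = lam_dot / lam
    return g1 * (e_dot - g2 * e)


lam = performance_bound(t=1.0, lam0=2.0, lam_inf=0.1, l=0.5)  # illustrative
z = to_unconstrained(0.3, lam)
e_back = to_constrained(z, lam)
```

Any finite $z$ maps back to an error strictly inside $(- λ, λ)$, which is why keeping $z$ bounded is sufficient to enforce the prescribed performance constraint (26).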

2.4.3. ADP-Based Optimal Time-Varying Surrounding Control

To save the energy of the USV, this paper adopts a dynamic event-triggered control strategy to reduce communication frequency and state sampling times between USVs. The sampling time sequence for the ith USV is defined as

t_{s}^{i}

, satisfying

t_{s}^{i} < t_{s + 1}^{i}

,

s = 0, \dots \infty

. The sampling error

q_{i} (t)

of ith USV is designed as follows:

q_{i} (t) = z_{i} (t_{s}^{i}) - z_{i} (t), t \in [t_{s}^{i}, t_{s + 1}^{i}]

(30)

where

z_{i} (t_{s}^{i})

represents the transformed state sampled at the most recent triggering instant

t_{s}^{i}

of ith USV. To facilitate the design of the event-triggered control strategy in the following discussion, we define

{\hat{z}}_{i} = z_{i} (t_{s}^{i})

. The performance index function for the ith USV is designed as follows:

J_{i} (z_{i}) = \int_{0}^{\infty} [z_{i}^{T} Q_{i} z_{i} + τ_{R i}^{T} R_{i} τ_{R i}] d t

(31)

where

Q_{i} = Q_{i}^{T} > 0

and

R_{i} = R_{i}^{T} > 0

. If the ith USV has an optimal control scheme

τ_{R i}^{*}

, the cost function in Equation (31) can be transformed into the following form:

J_{i}^{*} (z_{i}) = \int_{0}^{\infty} [z_{i}^{T} Q_{i} z_{i} + {(τ_{R i}^{*})}^{T} R_{i} τ_{R i}^{*}] d t

(32)

Taking the time derivative of the optimal cost function $J_{i}^{*} (z_{i})$ in (32) along the trajectories of (29) yields the following:

\begin{aligned} H (z_{i}, τ_{R i}^{*}, \nabla_{z_{i}} J_{i}^{*} (z_{i})) &= z_{i}^{T} Q_{i} z_{i} + {(τ_{R i}^{*})}^{T} R_{i} τ_{R i}^{*} + \nabla_{z_{i}}^{T} J_{i}^{*} (z_{i}) (F (z_{i}) + {\bar{B}}_{i} τ_{R i}^{*} - {\bar{B}}_{i} τ_{R j}^{*}) \\ &= 0 \end{aligned}

(33)

where

\nabla_{z_{i}} J_{i}^{*} (z_{i}) = \partial J_{i}^{*} (z_{i}) / \partial z_{i}

is the partial derivative of

J_{i}^{*} (z_{i})

with respect to

z_{i}

. Using Equation (33), the optimal control strategy

τ_{R i}^{*}

is obtained as follows:

τ_{R i}^{*} = - \frac{1}{2} R_{i}^{- 1} {\bar{B}}_{i}^{T} \nabla_{z_{i}} J_{i}^{*} (z_{i})

(34)
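The step from (33) to (34) is the usual stationarity condition of the Hamiltonian with respect to the control input. The sketch below assumes that the neighbor input $τ_{R j}^{*}$ is treated as independent of $τ_{R i}^{*}$ and that $J_{i}^{*}$ is differentiable.

```latex
\frac{\partial H}{\partial \tau_{Ri}^{*}}
  = 2 R_{i} \tau_{Ri}^{*} + \bar{B}_{i}^{T} \nabla_{z_{i}} J_{i}^{*}(z_{i}) = 0
\quad \Longrightarrow \quad
\tau_{Ri}^{*} = -\tfrac{1}{2} R_{i}^{-1} \bar{B}_{i}^{T} \nabla_{z_{i}} J_{i}^{*}(z_{i})
```

Because $R_{i}$ is symmetric positive definite, the Hamiltonian is strictly convex in $τ_{R i}^{*}$, so this stationary point is its unique minimizer.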

According to the definition of

z_{i} (t_{s}^{i})

in (30), the event-triggered optimal controller is obtained as follows:

{\hat{τ}}_{R i}^{*} ({\hat{z}}_{i}) = - \frac{1}{2} R_{i}^{- 1} {\bar{B}}_{i}^{T} \nabla_{z_{i}} J_{i}^{*} ({\hat{z}}_{i})

(35)

Substituting (35) into (33), the HJB equation of ith USV under the event-triggered strategy is given as follows:

\begin{array}{l} z_{i}^{T} Q_{i} z_{i} + \frac{1}{4} {(\nabla J_{i}^{*} ({\hat{z}}_{i}))}^{T} {\bar{B}}_{i} R_{i}^{- 1} {\bar{B}}_{i}^{T} \nabla J_{i}^{*} ({\hat{z}}_{i}) + {(\nabla J_{i}^{*} (z_{i}))}^{T} F (z_{i}) \\ - \frac{1}{2} {(\nabla J_{i}^{*} (z_{i}))}^{T} {\bar{B}}_{i} R_{i}^{- 1} {\bar{B}}_{i}^{T} \nabla J_{i}^{*} ({\hat{z}}_{i}) + \frac{1}{2} {(\nabla J_{i}^{*} (z_{i}))}^{T} {\bar{B}}_{i} R_{j}^{- 1} {\bar{B}}_{j}^{T} \nabla J_{j}^{*} (z_{j}) = 0 \end{array}

(36)

Based on the above analysis, we can obtain the event-triggered optimal control strategy

{\hat{τ}}_{R i}^{*} ({\hat{z}}_{i})

by solving the HJB Equation (36). However, the HJB equation is highly nonlinear and contains partial differential terms, making it difficult to obtain an analytical solution. Therefore, we use a neural network to approximate the optimal cost function, which can be formulated as follows:

J_{i}^{*} (z_{i}) = {(W_{i}^{*})}^{T} ϕ_{i} (z_{i}) + ε_{i} (z_{i})

(37)

where

W_{i}^{*} \in ℝ^{6 \times 1}

is the optimal weight vector of the NN,

ϕ_{i} (z_{i}) \in ℝ^{6 \times 1}

represents the activation function, and

ε_{i} (z_{i})

is the approximation error of the NN. Utilizing (37), the optimal event-triggered control strategy in (35) is re-expressed as follows:

{\hat{τ}}_{R i}^{*} ({\hat{z}}_{i}) = - \frac{1}{2} R_{i}^{- 1} {\bar{B}}_{i}^{T} (\nabla ϕ_{i} ({\hat{z}}_{i}) W_{i}^{*} + \nabla ε_{i} ({\hat{z}}_{i}))

(38)

The optimal weight vector

W_{i}^{*}

is unknown, which makes it impossible to directly obtain the optimal cost function

J_{i}^{*} (z_{i})

. Then, we design the

{\hat{W}}_{i}

to estimate the

W_{i}^{*}

, and

{\hat{J}}_{i} (z_{i}) = {\hat{W}}_{i}^{T} ϕ_{i} (z_{i})

is defined as the estimation of

J_{i}^{*} (z_{i})

. Therefore, the actual optimal control strategy

{\hat{τ}}_{R i} ({\hat{z}}_{i})

is given as follows:

{\hat{τ}}_{R i} ({\hat{z}}_{i}) = - \frac{1}{2} R_{i}^{- 1} {\bar{B}}_{i}^{T} \nabla ϕ_{i} ({\hat{z}}_{i}) {\hat{W}}_{i}

(39)

According to (33) and

{\hat{J}}_{i} (z_{i}) = {\hat{W}}_{i}^{T} ϕ_{i} (z_{i})

, the Hamiltonian function can be rewritten as follows:

H (z_{i}, {\hat{τ}}_{R i}, {\hat{W}}_{i}) = z_{i}^{T} Q_{i} z_{i} + {({\hat{τ}}_{R i})}^{T} R_{i} {\hat{τ}}_{R i} + {\hat{W}}_{i}^{T} \nabla_{z_{i}}^{T} ϕ_{i} (z_{i}) {\dot{z}}_{i} = 𝔼_{i}

(40)

Furthermore, based on Equations (33), (37) and (40), we can obtain the following:

H (z_{i}, {\hat{τ}}_{R i}, W_{i}^{*}) = z_{i}^{T} Q_{i} z_{i} + {({\hat{τ}}_{R i})}^{T} R_{i} {\hat{τ}}_{R i} + {(W_{i}^{*})}^{T} \nabla_{z_{i}}^{T} ϕ_{i} (z_{i}) {\dot{z}}_{i} = 𝔼_{i H}

(41)

where

𝔼_{i H} = - {(\nabla ε_{i} (z_{i}))}^{T} {\dot{z}}_{i}

. According to (40) and (41), we can obtain the following:

𝔼_{i H} - 𝔼_{i} = {\tilde{W}}_{i}^{T} ϖ_{i}

(42)

with

ϖ_{i} = \nabla_{z_{i}}^{T} ϕ_{i} (z_{i}) {\dot{z}}_{i}

and

{\tilde{W}}_{i} = W_{i}^{*} - {\hat{W}}_{i}

.

Assumption 2.

The gradients of the NN activation function and of the approximation error are both bounded, satisfying $‖\nabla ϕ_{i}‖ \leq {\bar{ϕ}}_{i}$ and $‖\nabla ε_{i}‖ \leq {\bar{ε}}_{i}$, where ${\bar{ϕ}}_{i}$ and ${\bar{ε}}_{i}$ are positive constants.

Assumption 3.

The gradient of the activation function

\nabla ϕ_{i}^{T} (z_{i})

is Lipschitz continuous [40], that is:

‖\nabla ϕ_{i}^{T} (z_{i}) - \nabla ϕ_{i}^{T} ({\hat{z}}_{i})‖ \leq c_{i} ‖q_{i} (t)‖

(43)

where

c_{i}

is a positive constant.

Theorem 2.

Considering the transformed subsystem (29) of the USVs, an event-triggered control strategy (39) based on reinforcement learning is designed. The dynamic event-triggering condition is designed as follows:

t_{s + 1}^{i} = \sup \{t > t_{s}^{i} |l_{4} (l_{3} {‖q_{i}‖}^{2} - σ l_{1} {‖z_{i}‖}^{2} - σ l_{2} {‖{\hat{τ}}_{R i}‖}^{2} - \exp (- ‖z_{i}‖)) \leq δ_{i}\}

(44)

where

l_{1} = λ_{\min} (Q_{i})

,

l_{2} = λ_{\min} (R_{i})

,

l_{4} > 0

,

l_{3} = c_{i}^{2} ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {‖{\hat{W}}_{i}‖}^{2}

,

σ \in (0, 1)

. The dynamic variable

δ_{i}

in (44) is given as follows:

{\dot{δ}}_{i} = - β_{i} δ_{i} + ξ_{i} (σ l_{1} {‖z_{i}‖}^{2} + σ l_{2} {‖{\hat{τ}}_{R i}‖}^{2} + \exp (- ‖z_{i}‖) - l_{3} {‖q_{i}‖}^{2})

(45)

where

δ_{i} (0) > 0

,

β_{i} > 0

,

ξ_{i} \in [0, 1]

. The NN weight update law without PE condition is given as follows:

{\dot{\hat{W}}}_{i} = - \frac{ς_{2}}{ς_{1}} (\exp (- ς_{3} / ‖z_{i}‖) + ς_{4}) {\hat{W}}_{i} - \frac{ς_{5} 𝔼_{i} ϖ_{i}}{ς_{1} {(ϖ_{i}^{T} ϖ_{i} + 1)}^{2}}

(46)

with

ς_{1}

,

ς_{2}

,

ς_{3}

,

ς_{4}

and

ς_{5}

being positive constants. The state tracking error

z_{i}

and the NN weight estimation error

{\tilde{W}}_{i} = W_{i}^{*} - {\hat{W}}_{i}

are ensured to be uniformly ultimately bounded (UUB) under the optimal control law (39) and the weight update law (46). Furthermore, the proposed dynamic event-triggered control strategy does not exhibit Zeno behavior.

Proof.

According to Equations (44) and (45), we can obtain

{\dot{δ}}_{i} \geq - β_{i} δ_{i} - \frac{ξ_{i}}{l_{4}} δ_{i} = - (β_{i} + \frac{ξ_{i}}{l_{4}}) δ_{i}

. Furthermore, since the initial value of the dynamic variable is positive, the comparison lemma yields

δ_{i} (t) \geq 0

. Based on the above analysis, a positive-definite Lyapunov function for the ith USV is chosen as follows:

V_{i} = J_{i}^{*} + \frac{1}{2} ς_{1} {\tilde{W}}_{i}^{T} {\tilde{W}}_{i} + δ_{i}

(47)

The following analysis will be divided into two parts: (1) the stability analysis; (2) the exclusion of Zeno behavior.

(1) Stability Analysis: The time derivative of the Lyapunov function

V_{i}

is given as follows:

{\dot{V}}_{i} = {(\nabla J_{i}^{*})}^{T} (F (z_{i}) + {\bar{B}}_{i} {\hat{τ}}_{R i} - {\bar{B}}_{i} {\hat{τ}}_{R j}) + ς_{1} {\tilde{W}}_{i}^{T} {\dot{\tilde{W}}}_{i} + {\dot{δ}}_{i}

(48)

According to Equations (33) and (46), Equation (48) can be further derived into the following form:

\begin{matrix} {\dot{V}}_{i} = & - z_{i}^{T} Q_{i} z_{i} + {(τ_{R i}^{*})}^{T} R_{i} τ_{R i}^{*} - 2 {(τ_{R i}^{*})}^{T} R_{i} {\hat{τ}}_{R i} + {(\nabla J_{i}^{*})}^{T} {\bar{B}}_{i} (τ_{R j}^{*} - {\hat{τ}}_{R j}) \\ - ς_{1} {\tilde{W}}_{i}^{T} {\dot{\hat{W}}}_{i} + {\dot{δ}}_{i} \end{matrix}

(49)

According to [18], the ideal weight

W_{i}^{*}

is bounded. Furthermore, we can obtain

‖{(\nabla J_{i}^{*})}^{T} {\bar{B}}_{i} (τ_{R j}^{*} - {\hat{τ}}_{R j})‖ \leq υ_{j}

, where

υ_{j} > 0

. Based on the weight update law (46) and Equation (49), the following inequality can be obtained:

\begin{matrix} - ς_{1} {\tilde{W}}_{i}^{T} {\dot{\hat{W}}}_{i} & = χ_{1 i} {\tilde{W}}_{i}^{T} {\hat{W}}_{i} + χ_{2 i} {\tilde{W}}_{i}^{T} 𝔼_{i} ϖ_{i} \\ = χ_{1 i} {\tilde{W}}_{i}^{T} (W_{i}^{*} - {\tilde{W}}_{i}) + χ_{2 i} {\tilde{W}}_{i}^{T} ϖ_{i} (𝔼_{i H} - {\tilde{W}}_{i}^{T} ϖ_{i}) \\ \leq - \frac{χ_{1 i}}{2} {‖{\tilde{W}}_{i}‖}^{2} - \frac{χ_{2 i}}{2} {‖{\tilde{W}}_{i}^{T} ϖ_{i}‖}^{2} + \frac{χ_{1 i}}{2} {‖W_{i}^{*}‖}^{2} + \frac{χ_{2 i}}{2} {‖𝔼_{i H}‖}^{2} \\ \leq - \frac{χ_{1 i}}{2} {‖{\tilde{W}}_{i}‖}^{2} - \frac{χ_{2 i}}{2} {‖{\tilde{W}}_{i}^{T} ϖ_{i}‖}^{2} + \frac{χ_{1 i}}{2} {\bar{W}}_{i}^{2} + \frac{χ_{2 i}}{2} {‖{\bar{𝔼}}_{i H}‖}^{2} \end{matrix}

(50)

where

χ_{1 i} = ς_{2} (\exp (- ς_{3} / ‖z_{i}‖) + ς_{4})

and

χ_{2 i} = ς_{5} / {(ϖ_{i}^{T} ϖ_{i} + 1)}^{2}

.

{\bar{W}}_{i}

is the upper bound of the ideal weight

‖W_{i}^{*}‖

, and

{\bar{𝔼}}_{i H}

is the upper bound of

𝔼_{i H}

. Substituting (50) into (49), we obtain the following:

\begin{matrix} {\dot{V}}_{i} \leq - z_{i}^{T} Q_{i} z_{i} + {(τ_{R i}^{*} - {\hat{τ}}_{R i})}^{T} R_{i} (τ_{R i}^{*} - {\hat{τ}}_{R i}) - {({\hat{τ}}_{R i})}^{T} R_{i} {\hat{τ}}_{R i} \\ - \frac{χ_{1 i}}{2} {‖{\tilde{W}}_{i}‖}^{2} - \frac{χ_{2 i}}{2} {‖{\tilde{W}}_{i}^{T} ϖ_{i}‖}^{2} + \frac{χ_{1 i}}{2} {\bar{W}}_{i}^{2} + \frac{χ_{2 i}}{2} {‖{\bar{𝔼}}_{i H}‖}^{2} + {\dot{δ}}_{i} \end{matrix}

(51)

According to Equations (34) and (37), the time-sampling-based control strategy

τ_{R i}^{*}

is obtained as follows:

τ_{R i}^{*} = - \frac{1}{2} R_{i}^{- 1} {\bar{B}}_{i}^{T} [\nabla ϕ_{i}^{T} (z_{i}) W_{i}^{*} + \nabla ε_{i} (z_{i})]

(52)

Furthermore, from (39) and (52), the following inequality can be derived:

\begin{matrix} {(τ_{R i}^{*} - {\hat{τ}}_{R i})}^{T} R_{i} (τ_{R i}^{*} - {\hat{τ}}_{R i}) & \leq ‖R_{i}‖ {‖τ_{R i}^{*} - {\hat{τ}}_{R i}‖}^{2} \\ \leq ‖R_{i}‖ {‖R_{i}^{- 1} {\bar{B}}_{i}^{T} (\nabla ϕ_{i}^{T} (z_{i}) - \nabla ϕ_{i}^{T} ({\hat{z}}_{i})) {\hat{W}}_{i}‖}^{2} \\ + ‖R_{i}‖ {‖R_{i}^{- 1} {\bar{B}}_{i}^{T} (\nabla ϕ_{i}^{T} (z_{i}) {\tilde{W}}_{i} + \nabla ε_{i} (z_{i}))‖}^{2} \\ \leq c_{i}^{2} ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {‖{\hat{W}}_{i}‖}^{2} {‖q_{i}‖}^{2} + 2 ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {\bar{ϕ}}_{i}^{2} {‖{\tilde{W}}_{i}‖}^{2} \\ + 2 ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {\bar{ε}}_{i}^{2} \end{matrix}

(53)

Substituting (53) into (51) yields the following:

\begin{matrix} {\dot{V}}_{i} & \leq - λ_{\min} (Q_{i}) {‖z_{i}‖}^{2} + c_{i}^{2} ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {‖{\hat{W}}_{i}‖}^{2} {‖q_{i}‖}^{2} + 2 ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {\bar{ϕ}}_{i}^{2} {‖{\tilde{W}}_{i}‖}^{2} + 2 ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {\bar{ε}}_{i}^{2} \\ - λ_{\min} (R_{i}) {‖{\hat{τ}}_{R i}‖}^{2} - \frac{χ_{1 i}}{2} {‖{\tilde{W}}_{i}‖}^{2} - \frac{χ_{2 i}}{2} {‖{\tilde{W}}_{i}^{T} ϖ_{i}‖}^{2} + \frac{χ_{1 i}}{2} {\bar{W}}_{i}^{2} + \frac{χ_{2 i}}{2} {‖{\bar{𝔼}}_{i H}‖}^{2} + {\dot{δ}}_{i} \\ \leq - l_{1} {‖z_{i}‖}^{2} - l_{2} {‖{\hat{τ}}_{R i}‖}^{2} + l_{3} {‖q_{i}‖}^{2} - ϑ_{1} {‖{\tilde{W}}_{i}‖}^{2} + ϑ_{2} + {\dot{δ}}_{i} \end{matrix}

(54)

where

ϑ_{1}

and

ϑ_{2}

are shown as follows:

\begin{cases} ϑ_{1} = \frac{χ_{1 i}}{2} + \frac{χ_{2 i} ϖ_{i}^{T} ϖ_{i}}{2} - 2 ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {\bar{ϕ}}_{i}^{2} > 0 \\ ϑ_{2} = 2 ‖R_{i}^{- 1}‖ {‖{\bar{B}}_{i}^{T}‖}^{2} {\bar{ε}}_{i}^{2} + \frac{χ_{1 i}}{2} {\bar{W}}_{i}^{2} + \frac{χ_{2 i}}{2} {\bar{𝔼}}_{i H}^{2} \end{cases}

(55)

Based on the dynamic event-triggering conditions (44) and (45), Equation (54) can be re-expressed in the following form:

\begin{matrix} {\dot{V}}_{i} & \leq - (1 - ξ_{i} σ) l_{1} {‖z_{i}‖}^{2} - (1 - ξ_{i} σ) l_{2} {‖{\hat{τ}}_{R i}‖}^{2} + (1 - ξ_{i}) l_{3} {‖q_{i}‖}^{2} \\ + ξ_{i} \exp (- ‖z_{i}‖) - ϑ_{1} {‖{\tilde{W}}_{i}‖}^{2} + ϑ_{2} - β_{i} δ_{i} \\ \leq - (1 - σ) l_{1} {‖z_{i}‖}^{2} - (1 - σ) l_{2} {‖{\hat{τ}}_{R i}‖}^{2} + (\frac{1 - ξ_{i}}{l_{4}} - β_{i}) δ_{i} - ϑ_{1} {‖{\tilde{W}}_{i}‖}^{2} + {\bar{ϑ}}_{2} \end{matrix}

(56)

where

l_{4} > (1 - ξ_{i}) / β_{i}

,

\exp (- ‖z_{i}‖) + ϑ_{2} \leq {\bar{ϑ}}_{2}

. Based on the above analysis, it is known that the variables

z_{i}

and

{\tilde{W}}_{i}

are UUB. It is reasonable to assume that the position tracking error is

\lim_{t \to \infty} ‖e_{i, 1}‖ \leq k_{i 1}

, where

k_{i 1}

is a positive constant. Therefore, we can obtain the following:

\sum_{i = 1}^{N} \frac{1}{N} ‖η_{i} - η_{0} - ρ_{i 0}‖ = ‖η_{0} - \frac{1}{N} \sum_{i = 1}^{N} η_{i}‖ \leq k_{i 1}

(57)

According to the definition of surrounding control and Equation (57), it can be concluded that the multi-USVs successfully complete the surrounding control task for the target.

(2) Exclusion of Zeno behavior: Taking the derivative of the event-triggered sampling error

q_{i} (t)

with respect to time

t \in (t_{s}^{i}, t_{s + 1}^{i})

, we obtain the following:

\begin{matrix} ‖{\dot{q}}_{i} (t)‖ & = ‖- {\dot{z}}_{i} (t)‖ \\ = ‖F (z_{i}) + {\bar{B}}_{i} τ_{R i} - {\bar{B}}_{i} τ_{R j}‖ \end{matrix}

(58)

From the proof results given above, it is known that these variables

z_{i}, τ_{R i}, τ_{R j}

are bounded. Therefore, Equation (58) can be written in the following form:

‖{\dot{q}}_{i} (t)‖ \leq k_{i 2}

(59)

where

k_{i 2} > 0

. Since time is within the interval

t \in (t_{s}^{i}, t_{s + 1}^{i})

, integrating both sides of Equation (59) yields the following expression:

t_{s + 1}^{i} - t_{s}^{i} \geq t - t_{s}^{i} \geq \frac{‖q_{i} (t)‖}{k_{i 2}} > 0

(60)

According to (60), there is no Zeno behavior in the proposed DETC. □
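As a supplementary note to the proof, the positivity of the dynamic variable invoked before (47) can be spelled out in a few lines. The sketch below uses only the triggering condition (44), the dynamics (45), and the stated parameter ranges; it is an illustration of the argument, not an additional result.

```latex
\begin{align*}
% Between two triggering instants, condition (44) holds, so
l_{4}\bigl(l_{3}\|q_{i}\|^{2}-\sigma l_{1}\|z_{i}\|^{2}
      -\sigma l_{2}\|\hat{\tau}_{Ri}\|^{2}-e^{-\|z_{i}\|}\bigr) &\le \delta_{i} \\
\Longrightarrow\quad
\sigma l_{1}\|z_{i}\|^{2}+\sigma l_{2}\|\hat{\tau}_{Ri}\|^{2}
      +e^{-\|z_{i}\|}-l_{3}\|q_{i}\|^{2} &\ge -\frac{\delta_{i}}{l_{4}} . \\
% Substituting this bound into (45), with \xi_i \ge 0:
\dot{\delta}_{i} &\ge -\Bigl(\beta_{i}+\frac{\xi_{i}}{l_{4}}\Bigr)\delta_{i} . \\
% Comparison lemma, with \delta_i(0) > 0:
\delta_{i}(t) &\ge \delta_{i}(0)\,
      e^{-\left(\beta_{i}+\xi_{i}/l_{4}\right)t} > 0 .
\end{align*}
```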

Remark 1.

A predefined-time FDI attack observer is proposed for the first time in this work for multi-USVs under FDI network attacks. Compared to the asymptotically convergent, finite-time, and fixed-time observers in [1,7,27], the proposed observer allows the convergence time to be predefined while maintaining a simple structure. The prescribed-time observer in [41] may encounter singularity issues at the prescribed time point, because measurement errors in the system state can lead to an infinite gain. The predefined-time approach proposed in this paper overcomes this singularity problem.

Remark 2.

Compared to the multi-USV surrounding control strategies in [4,11,42], this paper proposes, for the first time, a prescribed-performance time-varying surrounding control scheme based on ADP. Under the proposed control strategy, the multi-USVs achieve a time-varying, even encirclement of the target: all the pursuer USVs are evenly distributed on a circle of the encirclement radius and move in a circular motion around the target.
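The even, time-varying encirclement described in Remark 2 can be illustrated with a minimal Python sketch. The radius and angular rate below are hypothetical placeholders (the surrounding function actually used in the simulations is the one listed in Table 3), and the function name is introduced only for this illustration.

```python
import numpy as np

def surrounding_offsets(t, n_pursuers=4, radius=2.0, omega=0.1):
    """Offsets of the pursuers relative to the target at time t.

    radius and omega are illustrative assumptions, not the paper's values.
    Each row is [x offset, y offset, heading offset].
    """
    # Equal phase spacing places the pursuers evenly on the circle;
    # the common term omega * t rotates the whole configuration.
    phases = omega * t + 2.0 * np.pi * np.arange(n_pursuers) / n_pursuers
    return np.column_stack((radius * np.cos(phases),
                            radius * np.sin(phases),
                            np.zeros(n_pursuers)))

offsets = surrounding_offsets(t=5.0)
centroid = offsets.mean(axis=0)  # numerically zero for evenly spaced phases
```

Because the offsets of evenly spaced pursuers sum to zero, the target sits at the centroid of the desired pursuer positions, which is the geometric property expressed by (57).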

Remark 3.

Reference [16] proposed a reinforcement learning-based prescribed-performance tracking control strategy for a USV, whose weight update law relies on the persistent excitation (PE) assumption. Compared to [16], this paper introduces an NN weight update law that does not require the PE condition. Additionally, the system state is incorporated into the NN weight update law, which accelerates the convergence of the NN weights.
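For readers who wish to experiment with the weight update law (46), the following Python sketch evaluates its right-hand side together with the Bellman residual of (40). The gain values passed in are the caller's choice (the paper's values are in Table 2 and are not reproduced here), and the guard for a zero state norm is an implementation assumption based on the limit of the exponential term.

```python
import numpy as np

def bellman_residual(z, tau_hat, W_hat, varpi, Q, R):
    """Residual of (40): z'Qz + tau'R tau + W_hat' varpi."""
    return z @ Q @ z + tau_hat @ R @ tau_hat + W_hat @ varpi

def weight_update_rate(W_hat, z, residual, varpi, s1, s2, s3, s4, s5):
    """Right-hand side of the update law (46); s1..s5 stand for the five gains."""
    z_norm = np.linalg.norm(z)
    # exp(-s3 / ||z||) tends to 0 as ||z|| -> 0, so the division is guarded.
    state_term = np.exp(-s3 / z_norm) if z_norm > 0.0 else 0.0
    leak_gain = s2 * (state_term + s4)            # state-dependent damping
    grad_gain = s5 / (varpi @ varpi + 1.0) ** 2   # normalized gradient gain
    return -(leak_gain * W_hat + grad_gain * residual * varpi) / s1
```

A forward-Euler step `W_hat + dt * weight_update_rate(...)` would then advance the estimate; the damping term acts even when the regressor vanishes, which is the mechanism that removes the need for the PE condition.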

Remark 4.

Compared to the static event-triggered control strategy proposed in [21] for the USV tracking problem, this paper introduces an improved dynamic event-triggered mechanism (IDETM) to further reduce the number of triggers. Moreover, through rigorous theoretical proof, Zeno behavior is eliminated. The proposed reinforcement learning control method based on IDETM significantly reduces the energy consumption of multi-USVs caused by the controller and communication.
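The dynamic triggering rule (44) and the internal variable dynamics (45) can be sketched in discrete time as follows. This is a minimal illustration under a forward-Euler discretization, which is an assumption of the sketch; all parameter values are left to the caller, and the function names are hypothetical.

```python
import numpy as np

def trigger_gap(q, z, tau_hat, l1, l2, l3, sigma):
    """Bracketed term of (44): positive values push toward a trigger."""
    return (l3 * (q @ q) - sigma * l1 * (z @ z)
            - sigma * l2 * (tau_hat @ tau_hat)
            - np.exp(-np.linalg.norm(z)))

def det_step(delta, q, z, tau_hat, l1, l2, l3, l4, sigma, beta, xi, dt):
    """One Euler step of (45) plus the trigger test derived from (44)."""
    gap = trigger_gap(q, z, tau_hat, l1, l2, l3, sigma)
    # An event fires as soon as the inequality in (44) is violated.
    fire = l4 * gap > delta
    # (45): delta_dot = -beta * delta + xi * (-gap)
    delta_next = delta + dt * (-beta * delta - xi * gap)
    return fire, delta_next
```

When `fire` is true, the controller would resample the state so that the sampling error resets to zero; a static rule corresponds to dropping `delta` from the comparison, which is why the dynamic rule tends to trigger less often.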

3. Results and Discussion

In this section, numerical simulations are used to verify the effectiveness of the proposed control strategy. The computer specifications for this simulation are as follows: an AMD Ryzen 9-7945HX processor running at 2.5 GHz and 16 GB of RAM. The simulation is implemented in MATLAB/Simulink (R2016B) and MATLAB/Simscape (R2023B) with a fixed-step sampling method, where the step size is 0.01 and the total simulation time is 100 s. In the simulation, four pursuer USVs (

i = 1, 2, 3, 4

) and one target are designed. Figure 4 shows the communication topology of the multi-USVs.

Figure 4. The communication topology of the multi-USVs.

The settings for the inertia matrix

M_{i}

, Coriolis term

C_{i}

, and damping matrix

D_{i}

in the USV dynamics model follow [5], with specific values listed in Table 1. The parameters of the prescribed performance function are set as

λ_{i} (0) = 4

,

λ_{i, \infty} = 0.001

,

l_{i} = 0.5

. Additionally, the initial values of USV states, NN weights, the weight matrix, control parameters, and the parameters of the predefined-time FDI observer are provided in Table 2. The vector

{\hat{z}}_{i}

is expanded as

{\hat{z}}_{i} = {[{\hat{z}}_{i 1}, \dots, {\hat{z}}_{i 6}]}^{T}

, and

ϕ_{i} ({\hat{z}}_{i}) = {[{\hat{z}}_{i 1}^{2}, {\hat{z}}_{i 2}^{2}, {\hat{z}}_{i 3}^{2}, {\hat{z}}_{i 1} {\hat{z}}_{i 4}, {\hat{z}}_{i 2} {\hat{z}}_{i 5}, {\hat{z}}_{i 3} {\hat{z}}_{i 6}]}^{T}

is set as the NN activation function. The desired motion trajectory for the target is set as

η_{d} (t) = {[3 + 0.2 t, 3 + 0.2 t, 0.785]}^{T}

, while the time-varying surrounding configuration of the pursuer USVs relative to the target is given in Table 3. The injected false data are set as

Γ_{i} (t) = {[0.1 \sin (0.1 t), 0.1 \cos (0.1 t), 0.1 \sin (0.1 t)]}^{T}

, and the attack probability constants are set as

{\bar{γ}}_{i 1} = 0.3

,

{\bar{γ}}_{i 2} = 0.26

,

{\bar{γ}}_{i 3} = 0.23

.
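To make the simulated controller concrete, the following Python sketch evaluates the activation vector given above, its Jacobian, and the control law (39). Two points are assumptions of this sketch: the product of the activation gradient and the weight vector in (39) is read as the transposed Jacobian times the weights, and the input matrix `B_bar` is taken to be 6 x 3 and supplied by the caller (its actual form comes from the transformed subsystem (29)).

```python
import numpy as np

def phi(z):
    """Activation vector: [z1^2, z2^2, z3^2, z1*z4, z2*z5, z3*z6]."""
    z1, z2, z3, z4, z5, z6 = z
    return np.array([z1**2, z2**2, z3**2, z1 * z4, z2 * z5, z3 * z6])

def phi_jacobian(z):
    """Jacobian d(phi)/dz; row k holds the partial derivatives of phi_k."""
    z1, z2, z3, z4, z5, z6 = z
    J = np.zeros((6, 6))
    J[0, 0], J[1, 1], J[2, 2] = 2 * z1, 2 * z2, 2 * z3
    J[3, 0], J[3, 3] = z4, z1
    J[4, 1], J[4, 4] = z5, z2
    J[5, 2], J[5, 5] = z6, z3
    return J

def control_law(z_hat, W_hat, B_bar, R):
    """Event-sampled control (39): -0.5 * R^{-1} B_bar' grad(J_hat)."""
    grad_J = phi_jacobian(z_hat).T @ W_hat   # gradient of W_hat' phi(z)
    return -0.5 * np.linalg.solve(R, B_bar.T @ grad_J)
```

Between triggering instants the sampled state `z_hat` is held constant, so the control computed by `control_law` is piecewise constant apart from the evolution of `W_hat`.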

Table 1. Parameters of USVs.

Table 2. Parameters for simulation.

Table 3. Time-varying surrounding function.

Considering the surrounding control problem of multi-USVs under FDI attacks, this work proposes an ADP-based prescribed performance time-varying encirclement control strategy. This subsection verifies the effectiveness and superiority of the proposed scheme through numerical simulations and comparative experiments.

Figure 5 shows the 2D trajectories of the multi-USVs. At the initial moment, the target is located outside the region spanned by the four pursuer USVs. Under the proposed control method, the four pursuer USVs form a square configuration that encircles the target. The configuration is time-varying: each pursuer USV moves in a circle around the target, so that the target ultimately lies within the convex hull formed by the pursuer USVs.

Figure 5. Trajectories of multiple USVs with a time-varying surrounding configuration.

To demonstrate the superior performance of the proposed ADP-based prescribed-performance control method (PPCADP), Figure 6 presents comparative simulation results with three different control approaches, including the traditional ADP control method (TADP) in [38], the model predictive control strategy (MPC) in [43], and the sliding mode control method (SMC) in [44]. Figure 6 illustrates the simulation results of the state tracking error under different controllers. From Figure 6, it can be seen that the state tracking errors under the PPCADP strategy remain strictly within the prescribed bounds, indicating the superior control performance of the proposed PPCADP. Figure 6b–d show the simulation results of the state tracking errors under the TADP, MPC, and SMC methods, respectively. The three control methods without prescribed performance constraints fail to confine the state tracking errors within the predefined bounds.

Figure 6. Comparison of state tracking error results under different controllers.

To sum up, it can be observed from Figure 6 that the proposed PPCADP approach achieves faster convergence, higher convergence accuracy, and smaller overshoot compared with the other three control strategies.
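The prescribed bounds referred to in the discussion of Figure 6 can be reproduced with a short Python sketch. It assumes the common exponential form of the performance function, with the initial value 4, steady-state value 0.001, and decay rate 0.5 stated in the parameter settings, and it assumes a symmetric envelope around zero; the exact definition used in the paper appears in its earlier sections.

```python
import numpy as np

def prescribed_bound(t, lam0=4.0, lam_inf=0.001, decay=0.5):
    """Assumed exponential form: (lam0 - lam_inf) * exp(-decay * t) + lam_inf."""
    t = np.asarray(t, dtype=float)
    return (lam0 - lam_inf) * np.exp(-decay * t) + lam_inf

def within_bounds(error, t):
    """True where |error| stays strictly inside the assumed symmetric envelope."""
    return np.abs(error) < prescribed_bound(t)
```

Evaluating `within_bounds` along a logged error trajectory gives a direct numerical check of whether a controller respects the envelope at every sample.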

To compare the energy consumption quantitatively, a unified control energy cost function is defined for the different control methods:

J_{C} = \int_{0}^{T_{0}} \sum_{i = 1}^{4} [z_{i}^{T} Q z_{i} + {(τ_{i} - τ_{f i})}^{T} R (τ_{i} - τ_{f i})] d t

, where

Q = I_{6}

,

R = I_{3}

,

T_{0} = 100 s

. Figure 7 presents comparative simulation results of energy consumption under different control strategies.

Figure 7. Energy consumption results under different control strategies.

As observed from Figure 7, the PPCADP strategy achieves the lowest energy consumption, reducing energy cost by 12.8%, 49.3%, and 71.6% compared to the TADP, MPC, and SMC methods, respectively. This improvement is attributed to the incorporation of both control performance and control cost in the design of the cost function for the proposed PPCADP.
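The cost index used for Figure 7 can be evaluated from logged trajectories as in the Python sketch below. The array layout (time, USV, component) and the trapezoidal quadrature are assumptions of this illustration; the weights are the identity matrices stated above, so the quadratic forms reduce to sums of squares.

```python
import numpy as np

def energy_cost(t, z, tau, tau_f):
    """Approximate the cost index with Q = I_6 and R = I_3.

    t: (T,) sample times; z: (T, N, 6) tracking states;
    tau, tau_f: (T, N, 3) total and feedforward control inputs.
    """
    du = tau - tau_f
    # Sum over USVs and components gives the integrand at each sample.
    integrand = np.sum(z**2, axis=(1, 2)) + np.sum(du**2, axis=(1, 2))
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)))
```

The relative savings quoted in the text would then follow from the ratio of two such costs, for example `1 - J_proposed / J_baseline`.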

Figure 8 shows the control input curves. It can be observed that the resilient control strategy designed in this work ensures the stable performance of the multi-USVs under FDI attacks, and the control inputs converge to the origin within 10 s.

Figure 8. Time response of control input

τ_{i}

achieved by PPCADP.

Figure 9 illustrates the time response of the NN weights. It can be seen that the weights converge from their initial values to constant values, indicating that the weights remain stable as the states of the closed-loop systems (CLSs) converge.

Figure 9. Time response of NN weight

{\hat{W}}_{i}

achieved by PPCADP.

The time responses of the attack variables, which follow the Bernoulli distribution, are shown in Figure 10.

Figure 10. The time response results of random variable

γ_{i}

.
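A minimal Python sketch of how such Bernoulli attack indicators could be generated is given below, using the false data signal and the attack probability constants 0.3, 0.26, and 0.23 stated in the simulation settings. Multiplying the indicators element-wise with the false data is an assumption about how the attack enters the channel, and the function names are hypothetical.

```python
import numpy as np

GAMMA_BAR = np.array([0.30, 0.26, 0.23])  # attack probabilities per channel

def false_data(t):
    """False data signal from the simulation settings."""
    return 0.1 * np.array([np.sin(0.1 * t), np.cos(0.1 * t), np.sin(0.1 * t)])

def sample_attack(t, rng):
    """Draw Bernoulli indicators and the resulting injected signal.

    Element-wise gating of the false data is an illustrative assumption.
    """
    gamma = (rng.random(3) < GAMMA_BAR).astype(float)
    return gamma, gamma * false_data(t)

rng = np.random.default_rng(0)
draws = np.array([sample_attack(0.01 * k, rng)[0] for k in range(10000)])
empirical_rate = draws.mean(axis=0)  # should be close to GAMMA_BAR
```

Over a 100 s run with a 0.01 step, the empirical attack frequency on each channel should approach the corresponding probability constant.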

Figure 11 shows the estimation error of the predefined-time FDI observer for the injected false data. It can be observed that the FDI observer’s estimation error converges to zero within 10 s (the predefined convergence time), verifying the effectiveness and rapidity of the proposed predefined-time FDI observer.

Figure 11. Estimation error of the predefined-time FDI observer.

Figure 12 compares the number of triggering events under the proposed dynamic event-triggered control scheme (DETC) and the event-triggered reinforcement learning control (ETC) method proposed in [40]. The ETC method [40] is applied to the USV model in simulation, and all parameters of the multi-USVs are kept consistent with the settings mentioned above. In Figure 12, the blue bars represent the results under the DETC, while the orange-red bars represent the results under the ETC. It can be observed that all four USVs trigger fewer events under the DETC. The dynamic event-triggering strategy proposed in this paper reduces the total number of triggers by 31.5% compared to the ETC in [40].

Figure 12. The event-triggered number used in DETC and ETC.

Figure 13 shows the time response curves of the dynamic event-triggering condition

q_{i} (t)

(44) and trigger threshold

q_{i T}

. It can be observed that the value of the event-triggering condition is always less than or equal to the trigger threshold, and that both

q_{i} (t)

and

q_{i T}

gradually converge to zero as the closed-loop system state stabilizes.

Figure 13. The results of event-triggering condition and trigger threshold under the DETC.

In summary, the theoretical proofs and simulation results given above validate the effectiveness and superiority of the proposed control method (18). This method can effectively solve the time-varying surrounding control problem of multi-USVs under FDI attacks.

To further validate the engineering applicability of the proposed PPCADP control method, a 3D semi-physical simulation model of multiple USVs was developed using the Simscape simulation platform. The mass is

1.884 kg

, the inertia matrix of the USV is

I = \mathrm{diag} \{0.098, 0.017, 0.75\} kg \cdot m^{2}

, and the control parameters were kept consistent with those used in the previous Simulink simulations. Figure 14 and Figure 15 show the motion trajectories of the USVs and the time responses of the state tracking errors in the Simscape environment, respectively. These results demonstrate the effectiveness of the proposed PPCADP in the Simscape-based multi-USVs simulation.

Figure 14. The collaborative surrounding process of multi-USVs under Simscape simulation.

Figure 15. State tracking errors achieved by PPCADP under Simscape simulation.

Although the above simulation results verify the effectiveness and superiority of the proposed control method, sensitivity analysis was not conducted [45,46]. In future work, we plan to further investigate the reliability of the algorithm under varying simulation parameters through box plots and Monte Carlo simulations.

4. Conclusions

This paper integrated an IDETM with the ADP algorithm and proposed a prescribed-performance time-varying surrounding control scheme for multiple unmanned surface vehicles (multi-USVs) under false data injection (FDI) attacks. First, a predefined-time FDI observer was developed to estimate the injected false data within a user-specified time, thereby removing the conventional assumption of bounded false data. Subsequently, an ADP-based control strategy utilizing the IDETM was designed. Within the learning framework of ADP, a weight update law was constructed without requiring the persistent excitation (PE) condition, enhancing the generality of the proposed method. Finally, theoretical analysis and numerical simulations demonstrated that the proposed approach enables the pursuer USVs to achieve a time-varying surrounding configuration around the target while avoiding Zeno behavior in the DETC. Future work will explore the surrounding control problem of multi-USVs in the presence of actuator failures.

Author Contributions

Y.Z. put forward the general framework of the article and provided the writing ideas, conceived and supervised the research and experiments, and made many constructive comments for the improvement and revision of the article. D.W. consulted the references, performed the validation, and completed the writing and revision of this article. Q.H. guided the structure of the paper and provided suggestions for revisions. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61673259, in part by the Natural Science Foundation of Shanghai under Grant 25ZR1401159, in part by the Open Project Funds for the Key Laboratory of Space Photoelectric Detection and Perception (Nanjing University of Aeronautics and Astronautics), Ministry of Industry and Information Technology under Grant NJ2024027-3, in part by the Fundamental Research Funds for the Central Universities under Grant NJ2024027, and in part by the Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, under Grant IIPL-2025-RD5-01.

Data Availability Statement

The data presented in this study are available in this article (tables and figures).

Acknowledgments

The authors would like to thank the College of Information Engineering and the Institute of Logistics Science and Engineering of Shanghai Maritime University for their support. The authors would also like to thank the anonymous reviewers for their helpful suggestions and comments to improve the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

USVs	Unmanned surface vehicles
FDI	False data injection
PPC	Prescribed performance control
DET	Dynamic event-triggering
PTO	Predefined-time observer
MASs	Multi-agent systems
FTESO	Fixed-time extended state observer
CLSs	Closed-loop systems
RL	Reinforcement learning
NN	Neural network
SMC	Sliding mode control
PE	Persistent excitation
IDETM	Improved dynamic event-triggered mechanism
PPCADP	Prescribed-performance control ADP
TADP	Traditional ADP
DETC	Dynamic event-triggered control
Symbols and terms.
$G$	The directed communication network of USVs
$V$	The node set among the directed graph $G$
$ε$	The edge set among the directed graph $G$
$A$	The adjacency matrix of multi-USVs
$a_{i j}$	The connection weight between the ith USV and jth USV
$D$	The in-degree matrix of USVs
$L$	The Laplacian matrix of USV
$H$	The Laplacian matrix of digraph
$η_{i}$	The state of ith USV
$η_{0}$	The state of the target USV
$(x_{i}, y_{i})$	The position state of ith USV
$φ_{i}$	The yaw angle of ith USV
$u_{i}, v_{i}, r_{i}$	The surge, sway, yaw velocities
$v_{i}$	The velocity vector of ith USV
$τ_{i}$	The control input
$M_{i}$	The inertia matrix
$C_{i} (v_{i})$	The skew-symmetric Coriolis force matrix
$D_{i} (v_{i})$	The damping matrix
$R_{i} (φ_{i})$	The rotation matrix from the Earth-fixed coordinate $(F_{B} O_{B} X_{B} Y_{B})$ to the body-fixed coordinate $(F_{e} O_{e} X_{e} Y_{e})$
$Γ_{i}$	The false information injected by the attacker
$E_{i}$	The FDI attacks probability matrix
$γ_{i 1}$ $, γ_{i 2}$ $, γ_{i 3}$	Random variables that follow the Bernoulli distribution
$ρ_{i 0}$	The time-varying surrounding function
$τ_{r i}$	The RL-based controller
$τ_{f i}$	The feedforward control law
${\tilde{F}}_{i}$	The estimation error
$e_{i, 1}$	The position error of ith USV relative to its neighbor
$e_{i, 2}$	The velocity tracking error of ith USV relative to its neighbor
$λ_{i} (t)$	The prescribed performance function
$q_{i} (t)$	The sampling error of ith USV
$J_{i} (z_{i})$	The performance index function for the ith USV

References

1. Zhao, J.; Li, X.; Yu, X.; Wang, H. Finite-Time Cooperative Control for Bearing-Defined Leader-Following Formation of Multiple Double-Integrators. IEEE Trans. Cybern. 2022, 52, 13363–13372.
2. Hu, B.-B.; Zhang, H.-T. Bearing-Only Motional Target-Surrounding Control for Multiple Unmanned Surface Vessels. IEEE Trans. Ind. Electron. 2022, 69, 3988–3997.
3. An, B.; Wang, B.; Fan, H.; Liu, L.; Hu, H.; Wang, Y. Fully distributed prescribed performance formation control for UAVs with unknown maneuver of leader. Aerosp. Sci. Technol. 2022, 130, 107886.
4. Hu, B.-B.; Zhang, H.-T.; Wang, J. Multiple-Target Surrounding and Collision Avoidance With Second-Order Nonlinear Multiagent Systems. IEEE Trans. Ind. Electron. 2021, 68, 7454–7463.
5. Skjetne, R.; Fossen, T.I.; Kokotović, P.V. Adaptive maneuvering, with experiments, for a model ship in a marine control laboratory. Automatica 2005, 41, 289–298.
6. He, S.; Wang, M.; Dai, S.-L.; Luo, F. Leader–Follower Formation Control of USVs With Prescribed Performance and Collision Avoidance. IEEE Trans. Ind. Inform. 2019, 15, 572–581.
7. Li, J.; Fan, Y.; Liu, J. Distributed fixed-time formation tracking control for multiple underactuated USVs with lumped uncertainties and input saturation. ISA Trans. 2024, 154, 186–198.
8. Wang, L.; Zhu, D.; Pang, W.; Luo, C. A Novel Obstacle Avoidance Consensus Control for Multi-AUV Formation System. IEEE/CAA J. Autom. Sin. 2023, 10, 1304–1318.
9. Jin, X. Fault tolerant finite-time leader–follower formation control for autonomous surface vessels with LOS range and angle constraints. Automatica 2016, 68, 228–236.
10. Xiong, H.; Zhang, Y. Time-Varying Formation-Surrounding Control for Multiquadrotors Pursuit–Evasion Games With Disturbances and Collision Avoidance. IEEE Trans. Aerosp. Electron. Syst. 2025, 61, 522–541.
11. Xu, B.; Zhang, H.-T.; Ding, Y.; Ren, W. Event-Triggered Surrounding Formation Control of Multiagent Systems for Multiple Dynamic Targets. IEEE Trans. Control Netw. Syst. 2023, 10, 752–764.
12. Ding, Y.; Ren, W. Sampled-data containment control for double-integrator agents with dynamic leaders with nonzero inputs. Syst. Control Lett. 2020, 139, 104673.
13. Wang, N.; Gao, Y.; Zhao, H.; Ahn, C.K. Reinforcement Learning-Based Optimal Tracking Control of an Unknown Unmanned Surface Vehicle. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 3034–3045.
14. Zhang, C.; Zhang, G.; Dong, Q. Adaptive dynamic programming-based adaptive-gain sliding mode tracking control for fixed-wing unmanned aerial vehicle with disturbances. Int. J. Robust Nonlinear Control 2023, 33, 1065–1097.
15. Bechlioulis, C.P.; Rovithakis, G.A. Robust Adaptive Control of Feedback Linearizable MIMO Nonlinear Systems With Prescribed Performance. IEEE Trans. Autom. Control 2008, 53, 2090–2099.
16. Wang, N.; Gao, Y.; Zhang, X. Data-Driven Performance-Prescribed Reinforcement Learning Control of an Unmanned Surface Vehicle. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 5456–5467.
17. Hu, Q.; Shao, X.; Guo, L. Adaptive Fault-Tolerant Attitude Tracking Control of Spacecraft With Prescribed Performance. IEEE/ASME Trans. Mechatron. 2018, 23, 331–341.
18. Xiao, B.; Zhang, H.; Chen, Z.; Cao, L. Fixed-Time Fault-Tolerant Optimal Attitude Control of Spacecraft With Performance Constraint via Reinforcement Learning. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 7715–7724.
19. Gong, W.; Li, B.; Yang, Y.; Xiao, B.; Ran, D. Leader-following output-feedback consensus for second order multiagent systems with arbitrary convergence time and prescribed performance. ISA Trans. 2023, 141, 251–260.
20. Nie, J.; Zhang, X.; Wang, H.; Sheng, C.; Zhang, C.; Zhang, C. Anti-saturation distributed fixed-time prescribed performance sliding mode formation control based on FXESO for uncertain USVs. Ocean Eng. 2025, 318, 120101.
21. Liu, H.; Chen, Y.; Tian, X.; Mai, Q. Reinforcement learning control for USVs using prescribed performance sliding surfaces and an event-triggered strategy. Ocean Eng. 2024, 306, 118045.
22. Feng, K.; Li, K.; Li, Y. Finite-Time Formation Tracking Control for USVs Based on Dynamic Event-triggered Mechanism. In Proceedings of the 2023 IEEE International Conference on Unmanned Systems (ICUS), Hefei, China, 13–15 October 2023; pp. 1148–1153.
23. Ma, Y.; Che, W.; Deng, C. Event-triggered model-free adaptive control for nonlinear cyber-physical systems with false data injection attacks. Int. J. Robust Nonlinear Control 2022, 32, 2442–2452.
24. Huang, S.; Zhang, Y. Secrecy Rate Optimization for Multi-User Secure Communication Assisted by Intelligent Reflecting Surfaces (IRS) Under Imperfect CSI Conditions. Trans. Emerg. Telecommun. Technol. 2025, 36, 70117.
25. Zhan, W.; Miao, Z.; Chen, Y.; Wu, Z.-G.; Wang, Y. Event-Triggered Finite-Time Formation Control for Networked Nonholonomic Mobile Robots Under Denial-of-Service Attacks. IEEE Trans. Netw. Sci. Eng. 2023, 10, 354–368.
26. He, W.; Gao, X.; Zhong, W.; Qian, F. Secure impulsive synchronization control of multi-agent systems under deception attacks. Inf. Sci. 2018, 459, 354–368.
27. Mu, X.; Gu, Z.; Lu, Q. Memory-event-triggered consensus control for multi-UAV systems against deception attacks. ISA Trans. 2023, 139, 95–105.
28. He, Y.; Mendis, G.J.; Wei, J. Real-Time Detection of False Data Injection Attacks in Smart Grid: A Deep Learning-Based Intelligent Mechanism. IEEE Trans. Smart Grid 2017, 8, 2505–2516.
29. He, W.; Mo, Z.; Han, Q.-L.; Qian, F. Secure impulsive synchronization in Lipschitz-type multi-agent systems subject to deception attacks. IEEE/CAA J. Autom. Sin. 2020, 7, 1326–1334.
30. Tahoun, A.; Arafa, M. Cooperative control for cyber–physical multi-agent networked control systems with unknown false data-injection and replay cyber-attacks. ISA Trans. 2021, 110, 1–14.
31. Amini, A.; Mohammadi, A.; Asif, A.; Hou, M.; Plataniotis, K.N. Fault-Tolerant Periodic Event-Triggered Consensus Under Communication Delay and Multiple Attacks. IEEE Syst. J. 2022, 16, 6338–6349.
32. Zuo, S.; Yue, D. Resilient Output Formation Containment of Heterogeneous Multigroup Systems Against Unbounded Attacks. IEEE Trans. Cybern. 2022, 52, 1902–1910.
33. Zuo, S.; Yue, D. Resilient Containment of Multigroup Systems Against Unknown Unbounded FDI Attacks. IEEE Trans. Ind. Electron. 2022, 69, 2864–2873.
34. Zhang, D.-W.; Liu, G.-P. Predictive sliding-mode control of networked high-order fully actuated systems under random deception attacks. Sci. China Inf. Sci. 2023, 66, 190204.
35. Zhang, D.-W.; Liu, G.-P. Predictive Sliding-Mode Control for Networked High-Order Fully Actuated Multiagents Under Random Deception Attacks. IEEE Trans. Syst. Man Cybern. Syst. 2024, 54, 484–496.
36. Li, D.; Zhang, W.; He, W.; Li, C.; Ge, S.S. Two-Layer Distributed Formation-Containment Control of Multiple Euler–Lagrange Systems by Output Feedback. IEEE Trans. Cybern. 2019, 49, 675–687.
37. Chen, L.; Li, C.; Xiao, B.; Guo, Y. Formation-containment control of networked Euler–Lagrange systems: An event-triggered framework. ISA Trans. 2019, 86, 87–97.
38. Xiong, H.; Zhang, Y. Reinforcement learning-based formation-surrounding control for multiple quadrotor UAVs pursuit-evasion games. ISA Trans. 2024, 145, 205–224.
39. Zhang, H.; Huang, H.; Xiao, B.; Dong, K. Command-filtered incremental backstepping attitude control of spacecraft with predefined-time stability. Aerosp. Sci. Technol. 2024, 155, 109552.
40. Xue, S.; Luo, B.; Liu, D. Event-Triggered Adaptive Dynamic Programming for Zero-Sum Game of Partially Unknown Continuous-Time Nonlinear Systems. IEEE Trans. Syst. Man Cybern. Syst. 2020, 50, 3189–3199.
41. Li, B.; Gong, W.; Xiao, B.; Yang, Y. Distributed prescribed-time leader-following formation control for second-order multi-agent systems with mismatched disturbances. Int. J. Robust Nonlinear Control 2023, 33, 9781–9803.
42. Xiong, H.; Zhang, Y. Dynamic Event-Triggering Formation-Surrounding Control for Multiagent Pursuit-Evasion Games under DoS Attacks. IEEE Internet Things J. 2025, 1–14.
43. Zhao, M.; Li, H. Distributed Model Predictive Contouring Control of Unmanned Surface Vessels. IEEE Trans. Ind. Electron. 2024, 71, 13012–13019.
44. Gao, Z.; Guo, G. Fixed-time sliding mode formation control of AUVs based on a disturbance observer. IEEE/CAA J. Autom. Sin. 2020, 7, 539–545.
Owais, M.; Moussa, G.S. Global sensitivity analysis for studying hot-mix asphalt dynamic modulus parameters. Constr. Build. Mater. 2024, 413, 134775. [Google Scholar] [CrossRef]
Idriss, L.K.; Owais, M. Global sensitivity analysis for seismic performance of shear wall with high-strength steel bars and recycled aggregate concrete. Constr. Build. Mater. 2024, 411, 134498. [Google Scholar] [CrossRef]

Figure 1. The structure illustration of surrounding control for multi-USVs under FDI attacks.

Figure 2. Construction and frame of the unmanned surface vehicle (USV).

Figure 3. Schematic diagram of surrounding control for multi-USVs.

Figure 4. The communication topology of the multi-USVs.

Figure 5. Trajectories of multiple USVs with a time-varying surrounding configuration.

Figure 6. Comparison of state tracking error results under different controllers.

Figure 7. Energy consumption results under different control strategies.

Figure 8. Time response of control input $τ_{i}$ achieved by PPCADP.

Figure 9. Time response of NN weight ${\hat{W}}_{i}$ achieved by PPCADP.

Figure 10. The time response results of random variable $γ_{i}$.

Figure 11. Estimation error of the predefined-time FDI observer.

Figure 12. The event-triggered number used in DETC and ETC.

Figure 13. The results of event-triggering condition and trigger threshold under the DETC.

Figure 14. The collaborative surrounding process of multi-USVs under Simscape simulation.

Figure 15. State tracking errors achieved by PPCADP under Simscape simulation.

Table 1. Parameters of USVs.

Parameters	Values	Parameters	Values
$m_{i 11}$	25.8 kg	$d_{i 11}$	$0.7225 + 1.3274 |u_{i}| + 5.8664 u_{i}^{2}$
$m_{i 22}$	33.8 kg	$d_{i 22}$	$0.8612 + 36.2823 |v_{i}| + 0.805 |r_{i}|$
$m_{i 23}$	1.0948 kg	$d_{i 23}$	$- 0.1079 + 0.845 |v_{i}| + 3.45 |r_{i}|$
$m_{i 32}$	1.0948 kg	$d_{i 32}$	$- 0.1052 - 5.0437 |v_{i}| - 0.13 |r_{i}|$
$m_{i 33}$	2.76 kg	$d_{i 33}$	$1.9 - 0.08 |v_{i}| + 0.75 |r_{i}|$
$c_{i 13}$	$- m_{i 22} v_{i} - m_{i 23} r_{i}$	$c_{i 23}$	$m_{i 11} u_{i}$
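
As a rough illustration of how the entries of Table 1 might be evaluated numerically, the sketch below assembles an inertia matrix and a velocity-dependent damping matrix from the listed values. Placing each entry at the row and column given by its subscript is an assumption based on the usual 3-DOF surface-vessel layout, and the function names `inertia_matrix`, `damping_matrix`, and `coriolis_terms` are hypothetical, not taken from the paper.

```python
# Minimal sketch: evaluate the Table 1 entries for one USV.
# Assumption: entry x_{i jk} sits at row j, column k of a 3x3 matrix;
# entries not listed in Table 1 are taken as zero.

def inertia_matrix():
    """Constant inertia entries m_{i11}, m_{i22}, m_{i23}, m_{i32}, m_{i33}."""
    return [
        [25.8, 0.0, 0.0],
        [0.0, 33.8, 1.0948],
        [0.0, 1.0948, 2.76],
    ]

def damping_matrix(u, v, r):
    """Damping entries d_{i jk} as functions of surge u, sway v, yaw rate r."""
    d11 = 0.7225 + 1.3274 * abs(u) + 5.8664 * u ** 2
    d22 = 0.8612 + 36.2823 * abs(v) + 0.805 * abs(r)
    d23 = -0.1079 + 0.845 * abs(v) + 3.45 * abs(r)
    d32 = -0.1052 - 5.0437 * abs(v) - 0.13 * abs(r)
    d33 = 1.9 - 0.08 * abs(v) + 0.75 * abs(r)
    return [
        [d11, 0.0, 0.0],
        [0.0, d22, d23],
        [0.0, d32, d33],
    ]

def coriolis_terms(u, v, r):
    """Return (c_{i13}, c_{i23}) exactly as written in Table 1."""
    m = inertia_matrix()
    c13 = -m[1][1] * v - m[1][2] * r
    c23 = m[0][0] * u
    return c13, c23

# Example evaluation at u = 1, v = 0, r = 0 (illustrative values only).
D_example = damping_matrix(1.0, 0.0, 0.0)
c13_example, c23_example = coriolis_terms(1.0, 0.0, 0.0)
```

At rest (u = v = r = 0) the damping entries reduce to their constant terms, which gives a quick sanity check on the transcription of the coefficients.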

Table 2. Parameters for simulation.

Parameters	Values	Parameters	Values
$η_{1} (0)$	${[0, 3, 2]}^{T}$	$a_{i}$	1
$η_{2} (0)$	${[- 3, 0, 1.2]}^{T}$	$p_{i}$	0.2
$η_{3} (0)$	${[0, - 3, 1.5]}^{T}$	$k_{i}$	10
$η_{4} (0)$	${[3, 0, 1.8]}^{T}$	$T_{i}$	10 s
$v_{1} (0)$	${[0, 0, 2]}^{T}$	$Q_{i}$	$15 I_{6}$
$v_{2} (0)$	${[0, 0, 1.2]}^{T}$	$R_{i}$	$8 I_{3}$
$v_{3} (0)$	${[0, 0, 1.5]}^{T}$	$ς_{1}$	1.1
$v_{4} (0)$	${[0, 0, 1.8]}^{T}$	$ς_{2}$	1
${\hat{W}}_{1} (0)$	${[16, 30, 35, 22, 28, 33]}^{T}$	$ς_{3}$	1.2
${\hat{W}}_{2} (0)$	${[20, 28, 32, 22, 28, 33]}^{T}$	$ς_{4}$	1.5
${\hat{W}}_{3} (0)$	${[18, 26, 33, 22, 28, 33]}^{T}$	$ς_{5}$	0.5
${\hat{W}}_{4} (0)$	${[17, 31, 32, 22, 27.5, 33]}^{T}$	$σ$	0.01
$l_{4}$	1	$β_{i}$	1.1
$δ_{i} (0)$	1	$ξ_{i}$	0.5

Table 3. Time-varying surrounding function.

Parameters	Values
$h_{10} (t)$	${[3 \cos (0.15 t + 2 π / 4), 3 \sin (0.15 t + 2 π / 4), 0]}^{T}$
$h_{20} (t)$	${[3 \cos (0.15 t + 4 π / 4), 3 \sin (0.15 t + 4 π / 4), 0]}^{T}$
$h_{30} (t)$	${[3 \cos (0.15 t + 6 π / 4), 3 \sin (0.15 t + 6 π / 4), 0]}^{T}$
$h_{40} (t)$	${[3 \cos (0.15 t + 8 π / 4), 3 \sin (0.15 t + 8 π / 4), 0]}^{T}$
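
The four entries of Table 3 share one pattern: a circle of radius 3 traversed at 0.15 rad/s, with a phase of $2 i π / 4$ for USV $i$. The following sketch evaluates that pattern; the function name `surrounding_offset` and the constant names are hypothetical and chosen only for illustration.

```python
import math

RADIUS = 3.0   # circle radius appearing in every row of Table 3
OMEGA = 0.15   # angular rate (rad/s) appearing in every row of Table 3

def surrounding_offset(i, t):
    """Evaluate h_{i0}(t) from Table 3 for USV i in 1..4 at time t (seconds)."""
    phase = OMEGA * t + 2.0 * i * math.pi / 4.0
    return [RADIUS * math.cos(phase), RADIUS * math.sin(phase), 0.0]

# Offsets of the four USVs at t = 0.
offsets_t0 = [surrounding_offset(i, 0.0) for i in range(1, 5)]
```

At t = 0 the planar components evaluate to (0, 3), (−3, 0), (0, −3), and (3, 0) up to floating-point rounding, which coincide with the first two components of $η_{1} (0)$ to $η_{4} (0)$ in Table 2. Neighbouring USVs stay a quarter turn apart at every time instant, so the four vehicles remain evenly spaced on the circle.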

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
