1. Introduction
The Perimeter Defense (PD) problem is a variant of the Pursuit-Evasion Game (PEG) proposed by Isaacs et al. [1], which concerns the following scenario. There is a region containing important facilities. One group of agents, known as intruders, attempts to reach the boundary of the target region without being intercepted, while another group, known as defenders, tries to intercept them or drive them away before they reach the boundary. During the past decade, increasing efforts have been devoted to solving PEGs involving multiple pursuers or multiple evaders.
When the number of pursuers and evaders is limited, the differential game methods in Fu et al. [2] can be used to analyze the entire problem. In Fisac et al. [3], a three-agent PEG is considered, with one pursuer, one evader, and a collaborator who helps the evader by blocking or delaying the pursuer. In an environment with obstacles, the game result is obtained by solving a double-obstacle Hamilton–Jacobi–Isaacs variational inequality. The proposed method decomposes the problem into two simpler two-agent games with dynamic objectives and constraints, allowing the three-agent game to be solved at a lower cost. In Shishika et al. [4], a game between two pursuer aircraft and one evader aircraft is described. A hierarchical decomposition of the game is introduced: at the upper collaborative level, the pursuers choose their optimal behavioral strategies (i.e., to pursue or to temporarily leave), thus forming a three-agent non-cooperative dynamic game, which is then solved at the lower level. This allows pursuers to intelligently change their behavior according to game-theoretic solutions, ultimately achieving the goal of capturing the evading aircraft. In Santos et al. [5], the researchers designed a parallel optimization algorithm to compute the optimal strategy for capturing the evader in PEGs. The algorithm minimizes the capture time in the game and, given a discrete topology, outputs the minimum number of pursuers required to capture the evader. The construction and solution of dominant regions in the game, and the use of graph methods represented by Voronoi diagrams described by Isaacs et al. [1], also play an important role in solving PEG problems. In Huang et al. [6], a decentralized control scheme based on the game domain is proposed, where pursuers collectively minimize the area of the evader’s Voronoi cell, continuously shrinking the evader’s room to escape until it is captured. The authors prove that, regardless of the evader’s actions, this scheme can guarantee the capture of the evader. In Pierson et al. [7], a distributed algorithm for PEGs in bounded convex environments is proposed using Voronoi diagrams, which applies to scenarios such as intercepting illegal drones in protected airspace. The authors point out that even if the pursuers do not know the evader’s strategy, by dividing the environment based on Voronoi diagrams and implementing a global “area minimization” strategy, all evaders can be captured in finite time. The algorithm is also extended to three-dimensional space in the paper. In Zhou et al. [8], a distributed, real-time algorithm for the cooperative pursuit of a single evader by multiple pursuers in a bounded, simply connected planar domain is proposed by continuously minimizing the area of the evader’s Voronoi partition. The algorithm allows pursuers to share state information but compute inputs independently, without making assumptions about the evader’s control strategy, only requiring that its control inputs follow speed limits. Capture can be guaranteed when the environmental domain is convex. In Oyler et al. [9], PEGs with obstacles in the region are considered. The paper suggests that in an obstacle environment, if an agent can reach a position before the opposing agent, regardless of the opposing agent’s actions, the agent can win by occupying a dominant region. The authors provide two construction schemes for dominant regions and show that dominance analysis can provide a complete solution to the game.
Later, the study of PEGs further developed and evolved into the PD problem, which has attracted significant research interest due to its broader range of practical applications. Firstly, small-scale PD problems were considered. For example, Shishika et al. [10] derived the solution for a two-agent game (one attacker and one defender) and proved it geometrically, which applies to general convex boundaries. Vonmoll et al. [11] analyzed the PD problem with a circular defense boundary, where defenders are restricted to moving on the circular boundary. The one-on-one and two-on-one scenarios were modeled as zero-sum differential games. The paper formally verified that the attacker’s game strategy given in Shishika et al. [10] for the one-on-one scenario is indeed the saddle-point equilibrium strategy of the game. For the two-on-one scenario, the state space was divided into multiple regions based on the equilibrium termination conditions, and the separating surfaces between these regions were derived. Additionally, the paper addresses another situation where the attacker aims to reach the target in the shortest possible time and provides solutions for both one-on-one and two-on-one game scenarios. Pourghorban et al. [12] considered a non-zero-sum game scenario. In the game scenario, there is only one defender whose goal is to capture a series of incoming attackers. The attackers’ goal is to break through the target boundary without being captured by the defender. Once the current attacker breaks through the target boundary or is captured by the defender, the next attacker will randomly appear on a fixed circumference around the target. Therefore, the defender’s position at the end of the current game will become the starting position for the next game. This requires the defender to choose a strategy that is not only advantageous for the current game but also for future games. Based on the information available to the agents, each game is divided into two phases: a partial information phase and a complete information phase. Under the assumption of certain sensing and speed capabilities, the paper analyzed the agents’ strategies under the two phases and derived the equilibrium strategies for both attackers and defenders, optimizing the capture rate through the concepts of “contact surfaces” and “capture circles.” Then, large-scale PD problems were considered. In particular, Shishika et al. [10] also proposed a method to decompose the global game into local games. Compared with Maximum Matching and Maximum Independent Set methods, the Local-game Regions method in Shishika et al. [10] was useful in the design and evaluation of defender-team strategies. Additionally, Shishika et al. [13] conducted an in-depth analysis of typical one-on-one and two-defenders-against-one scenarios and applied them to multi-agent games. In particular, building on Shishika et al. [10], Shishika et al. [13] further summarized the Local Game Region (LGR) defense strategy, which limits the computational complexity to polynomial time. Compared to the maximum matching method, this approach enhances the team collaboration between defenders (two defenders pincer-attacking one attacker), thus improving the interception performance against attackers. Compared with the maximum independent set method, the computational complexity is reduced. Moreover, collisions among defenders were studied in Velhal et al. [14] and Macharet et al. [15]. In particular, Velhal et al. [14] discussed a non-fully distributed approach. The independent sensing range of defenders was limited such that an attacker could only be detected when within a defender’s sensing range. Firstly, based on the attacker’s current position and motion state, their possible trajectories were calculated, and the time and position of their arrival at the defense boundary were estimated, forming spatiotemporal tasks for the defenders to handle. Then, defenders “bid” on these tasks, and a bidding algorithm was used to solve the task allocation for defenders. The entire PD problem was transformed into a Decentralized Multi-robot Spatiotemporal Multi-task Assignment (DMRST-MTA) problem. Macharet et al. [15] proposed an adaptive partitioning defense algorithm based on the distribution of attackers. The paper assumed a circular defense boundary, with potential attackers randomly appearing at positions with a fixed distance from the boundary, their directions determined by some unknown probability distribution. The research focused on two aspects: (i) estimating the probability density of the direction of the next attacker’s arrival; and (ii) spatial partitioning, enabling defenders to focus on capturing non-overlapping subsets of attackers. The proposed strategy is also effective when the spatial distribution of attackers’ arrivals is uneven.
Based on the foregoing discussion, the perimeter defense problem remains an open problem. In particular, it is worth noting that the methods in Vonmoll et al. [11] and Pourghorban et al. [12] are not suitable for large-scale multi-agent perimeter defense problems, whereas Shishika et al. [10] and Shishika et al. [13] neglect collisions among defenders and the presence of obstacles in the environment. Although Velhal et al. [14] and Macharet et al. [15] account for collisions among defenders, and their approaches are applicable to large-scale multi-agent systems, they overlook obstacles in the environment. Therefore, this paper considers perimeter defense strategies for multi-agent systems in multi-obstacle scenarios. The impact of obstacles on defense strategies is reflected mainly in the following two aspects. Firstly, obstacles directly affect the path planning of defenders heading towards the assumed attack points, increasing the complexity of path selection. Defenders should adopt more flexible and diversified path strategies to avoid obstacles. Secondly, interference caused by obstacles makes the boundary defense environment more complex. This not only affects the defenders’ trajectories but also increases the uncertainty of the time required to reach the target points. Due to blocked paths, there may be significant deviations in the defenders’ arrival times, thereby reducing the overall success rate of defense. To address the perimeter defense problem in obstacle-laden scenarios, this paper proposes a perimeter defense strategy based on priority path planning. The main contributions are as follows: (1) A minimum-weight matching algorithm is used to solve for the optimal task sequence, ensuring the maximum interception rate while optimizing the defense cost. (2) The priority of defense tasks is established based on the size of the relative time window, enabling defenders to autonomously adjust their defense paths while avoiding collisions.
The remaining parts of this paper are organized as follows. In Section 2, the problem formulation is presented. The main results are proposed in Section 3. In Section 4, the performance of the designed perimeter defense strategy is verified by simulations. Section 5 concludes the paper.
Notations. $\|x\|$ denotes the 2-norm of vector $x$. $\operatorname{sign}(x)$ denotes the sign function of vector $x$. $\nabla_x(\cdot)$ denotes the gradient of the cost function with respect to vector $x$. $\lceil x \rceil$ denotes the ceiling function, i.e., the smallest integer not less than $x$.
2. Problem Formulation
Consider a region
in a two-dimensional plane. As shown in
Figure 1,
is a convex region. A Cartesian coordinate system is established with the region’s center as the origin. Thus, the coordinates of the center of the region are
. After discretizing the boundary of the region, the vertices can be represented counterclockwise as
, where
. Some important facilities within the region, or obstacles, cannot be passed through. In the perimeter defense problem, there exists a multi-agent system in which all agents are divided into intruders and defenders.
Specifically, consider a set of defenders
. For defender
, the kinematic equations are given as follows:
where
is the position of the defender,
is the heading angle, and
is the angular velocity.
represents the speed of the defender. Furthermore,
is used to denote the radius of the defender. Note that the kinematic model in Equation (1) is standard and commonly used in mobile robot dynamics (see Siciliano et al. [16]).
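As an illustration only, the following sketch integrates one time step of the defender kinematics, assuming that Equation (1) is the standard unicycle model with position (x, y), heading angle theta, speed v, and angular velocity omega. The function name, the argument names, and the forward-Euler discretization are assumptions made for this example and are not taken from the paper.

```python
import math


def unicycle_step(x, y, theta, v, omega, dt):
    """Advance a unicycle-type agent by one forward-Euler step of length dt.

    Assumed model: x' = v*cos(theta), y' = v*sin(theta), theta' = omega.
    """
    x_next = x + v * math.cos(theta) * dt
    y_next = y + v * math.sin(theta) * dt
    theta_raw = theta + omega * dt
    # Wrap the heading into [-pi, pi] so that angle differences stay well defined.
    theta_next = math.atan2(math.sin(theta_raw), math.cos(theta_raw))
    return x_next, y_next, theta_next


if __name__ == "__main__":
    # An agent heading along the x-axis at unit speed, turning slowly to the left.
    print(unicycle_step(0.0, 0.0, 0.0, 1.0, 0.1, 0.5))
```

The same step can be reused for intruders, since their model is described with the same quantities.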
Similarly, consider another set of agents
, referred to as intruders. The kinematic equations of intruder
are given by
Similarly,
and
represent the position and speed of
, respectively.
is the heading angle and
is the angular velocity. The radius of the intruder is indicated by
. The speed and angular velocity of intruder
are calculated by the artificial potential field method given in
Section 4.
The following settings are made in this paper: (1) The number of defenders is greater than or equal to the number of intruders, i.e., ; (2) The defenders’ maximum speed is greater than or equal to that of the intruders, i.e., ; (3) The angular velocity and angular acceleration of intruders and defenders are bounded, i.e., , , and ; (4) The initial position of the defenders and intruders satisfy and ; (5) If the distance between a defender and an intruder is less than , the defender will intercept the intruder using a head-on collision strategy. The set of intercepted intruders at time t is defined as ; (6) For an obstacle with radius , if the distance between a defender and the obstacle is less than , the defender will collide with the obstacle. Similarly, if the distance between two defenders (or intruders) is less than (or ), they will collide with each other. (7) When a defender collides with an obstacle or another defender, or captures an intruder, it will lose its speed and the ability to capture other intruders. Moreover, an intruder will lose its speed and will no longer be captured by other defenders if it has been captured by a defender. This setting implies that, upon any collision or capture, the involved defender or intruder stops moving and is removed from further play.
Intruders launch attacks from outside , with the goal of successfully reaching . The defenders are restricted to move within , and their objective is to intercept the intruders on the boundary of . Let H denote the number of defenders that successfully intercept an intruder. Therefore, this paper aims to design a perimeter defense strategy, i.e., to design and , for the defenders that maximizes H.
Remark 1. Note that the attack, decision making, and path planning problems of multi-agent systems have been studied in the following references: [17,18,19,20,21]. In these references, the objective of agents is usually to minimize time/energy or maximize the success probability of reaching the target. Conversely, this paper addresses the perimeter defense problem where the objective of the defenders is to intercept the intruders on the boundary of a perimeter. Thus, the methods in [17,18,19,20,21] cannot be directly applied in this paper.
3. Main Result
In this section, we first introduce the main steps in the perimeter defense strategy, including the following: (1) relative attack time calculation, (2) relative defense time calculation, (3) task assignment and priority calculation, (4) swarm path planning, and (5) trajectory tracking. Finally, we summarize the strategy. Compared with the existing literature, the proposed method can handle large-scale agent-based perimeter defense while accounting for inter-defender collision avoidance and obstacle avoidance. Note that a comparison with existing results is given in
Table 1 to validate the novelty and performance of the proposed algorithm.
3.1. Relative Attack Time
Without loss of generality, assume that there is an intruder
outside the defense perimeter, as shown in
Figure 2. The current position of the intruder is
, and the heading angle is
. Assume the intruder’s target is the center of the defense area, and the intersection point of the line connecting the current position and the center of the area with the boundary is used as the hypothetical attack point
. The calculation of
is as follows.
First, calculate the intersection point
between the line passing through
and the line passing through
, that is,
Next, determine whether the intersection point is on the boundary, that is, whether
satisfies the following conditions:
If condition (4) is satisfied, then
. Then, the relative attack time can be obtained as follows:
It is worth noting that if condition (4) is not met, it implies that the intersection point of the intruder’s target path with the boundary of
is not within the boundary of
. In such cases, the intruder can be ignored. The pseudocode for this part is given in Algorithm 1.
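Algorithm 1 Attack Calculation
1: for each intruder (index 1 to M) do
2:   for each boundary edge (index 1 to E) do
3:     Calculate the intersection point (Equation (3))
4:     if the intersection point satisfies the conditions (Equation (4)) then
5:       Set the hypothetical attack point to the intersection point, and calculate the relative attack time (Equation (5))
6:     end if
7:   end for
8: end for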
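The attack calculation can be sketched as follows. This is a minimal illustration, not the authors' implementation: it intersects the segment from the intruder to the region center with each boundary edge directly, which combines the line intersection of Equation (3) and the on-boundary check of Equation (4) into one test. The relative attack time is assumed here to be the distance to the attack point divided by the intruder speed, since Equation (5) is not reproduced in the text. All function and parameter names are hypothetical.

```python
import math


def segment_intersection(p, q, a, b, eps=1e-12):
    """Return the point where segment p-q meets segment a-b, or None."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    sx, sy = b[0] - a[0], b[1] - a[1]
    denom = rx * sy - ry * sx
    if abs(denom) < eps:  # parallel or degenerate segments
        return None
    t = ((a[0] - p[0]) * sy - (a[1] - p[1]) * sx) / denom
    u = ((a[0] - p[0]) * ry - (a[1] - p[1]) * rx) / denom
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p[0] + t * rx, p[1] + t * ry)
    return None


def attack_point_and_time(intruder_pos, intruder_speed, vertices, center=(0.0, 0.0)):
    """Find the hypothetical attack point on a convex boundary.

    vertices: boundary vertices in counterclockwise order.
    Returns (attack_point, relative_attack_time), or (None, inf) if the
    straight path to the center does not cross the boundary.
    """
    for k in range(len(vertices)):
        a, b = vertices[k], vertices[(k + 1) % len(vertices)]
        hit = segment_intersection(intruder_pos, center, a, b)
        if hit is not None:
            dist = math.hypot(hit[0] - intruder_pos[0], hit[1] - intruder_pos[1])
            # Assumed form of the relative attack time: distance / speed.
            return hit, dist / intruder_speed
    return None, math.inf


if __name__ == "__main__":
    square = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
    print(attack_point_and_time((3.0, 0.0), 2.0, square))
```

For a convex region with the center in its interior, an intruder outside the region crosses the boundary exactly once on its way to the center, so returning the first hit is sufficient in that setting.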
3.2. Relative Defense Time
Assuming that defender
intercepts intruder
only at the hypothetical attack point
, the formula for calculating the relative defense time
is
where
is the planned path length from the current position of
to
. The problem is transformed into solving for
.
First, we establish a straight-line path between the current position of the defender and the target point, that is, the line segment from to , and discretize the path into a series of path points , using control points. If this path does not cross any obstacle, then the line segment between the start and end points is taken as the optimal path, and at this time .
Remark 2. When calculating the defense path for a single defender, there is a certain probability that the straight-line path will pass through obstacles. However, this is related to the distribution density of obstacles in the task scenario. When the density is low, the probability that the straight-line path of the defender to the hypothetical attack point passes through obstacles is small. In this case, using the straight-line path can reduce the computational load, and the straight-line path is the optimal path.
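The straight-line test described above can be sketched as follows, assuming circular obstacles given by a center and a radius. The sampling-based check, the function names, and the optional safety margin are assumptions introduced for illustration; a sampled check can miss an obstacle that is thinner than the spacing between path points, so the number of samples should be chosen with the obstacle sizes in mind.

```python
import math


def discretize(start, goal, n):
    """Return n + 2 evenly spaced points from start to goal, endpoints included."""
    return [
        (
            start[0] + (goal[0] - start[0]) * k / (n + 1),
            start[1] + (goal[1] - start[1]) * k / (n + 1),
        )
        for k in range(n + 2)
    ]


def straight_path_is_safe(start, goal, obstacles, n=20, margin=0.0):
    """Check whether no sampled path point lies inside any circular obstacle.

    obstacles: list of ((cx, cy), radius) pairs.
    margin: extra clearance (for example, the defender radius); an assumption.
    """
    for px, py in discretize(start, goal, n):
        for (cx, cy), radius in obstacles:
            if math.hypot(px - cx, py - cy) < radius + margin:
                return False
    return True


if __name__ == "__main__":
    reefs = [((5.0, 0.0), 1.0)]
    print(straight_path_is_safe((0.0, 0.0), (10.0, 0.0), reefs))  # path crosses the reef
    print(straight_path_is_safe((0.0, 3.0), (10.0, 3.0), reefs))  # path stays clear
```

When the check passes, the path length is simply the Euclidean distance between the two endpoints, and no optimization is needed.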
If the straight-line path passes through any obstacle, then the PSO algorithm is used to find the collision-free optimal interception path. Let a particle represent a possible path, and the cost of a particle (i.e., a possible path, referred to as the current path) can be calculated as follows: Let
be a path point of the current path and
be the next path point. The cost function for the path length between the
k-th and
-th path points can be defined as:
Second, for the obstacles in the scene, we calculate the collision penalty cost.
where
represents the geometric center of obstacle
s,
,
is the distance between the
k-th path point and the geometric center of obstacle
s,
is the penalty cost for the
k-th path point passing through obstacle
s,
is the obstacle penalty coefficient, and
is a very small number to prevent division by zero.
is the distance from the boundary of the obstacle to its center, and
is the penalty for a path point passing through all obstacles.
Remark 3. When calculating the penalty for a single path point passing through an obstacle, an obstacle penalty coefficient is introduced to adjust the penalty value, rather than directly using the distance between the path point and the obstacle center as the penalty value. This facilitates the adjustment of the impact of the obstacle penalty on the overall cost function.
Finally, the PSO algorithm is employed to solve the following cost function to obtain the shortest path:
Here,
n is the number of path points, and
is the proportionality coefficient for the path length cost. By optimizing Equation (9), the optimized path length is obtained as:
Remark 4. When calculating the total path cost, the product of the segment length and the penalty for passing through obstacles is used as the penalty for the path passing through obstacles. The segment length is used as a weight to ensure that if a longer segment passes through an obstacle, the overall path penalty will be higher. This achieves “local weighting” of particles, which can cause the path cost to increase rapidly when the path is extended and simultaneously disturbed by obstacles. This, in turn, encourages the algorithm to prefer paths that are both short and safe.
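A possible form of the single-defender path cost is sketched below. Because Equations (7)–(9) are not reproduced in the text, the exact expressions are assumptions: the point penalty is taken as the penalty coefficient divided by the distance to the obstacle center (plus a small constant) whenever the point lies inside the obstacle, and the total cost sums, over all segments, the weighted segment length plus the segment length multiplied by the point penalty, as described in Remarks 3 and 4. The names alpha, beta, and eps are hypothetical.

```python
import math


def path_cost(points, obstacles, alpha=1.0, beta=100.0, eps=1e-6):
    """Evaluate an assumed path cost for one candidate path (one PSO particle).

    points: list of (x, y) path points, start and goal included.
    obstacles: list of ((cx, cy), radius) pairs.
    alpha: weight of the path length cost.
    beta: obstacle penalty coefficient.
    eps: small constant that prevents division by zero.
    """
    total = 0.0
    for k in range(len(points) - 1):
        (x0, y0), (x1, y1) = points[k], points[k + 1]
        seg_len = math.hypot(x1 - x0, y1 - y0)
        penalty = 0.0
        for (cx, cy), radius in obstacles:
            d = math.hypot(x0 - cx, y0 - cy)
            if d < radius:  # the k-th path point lies inside this obstacle
                penalty += beta / (d + eps)
        # Segment length acts as a local weight on the obstacle penalty.
        total += alpha * seg_len + seg_len * penalty
    return total


def path_length(points):
    """Total Euclidean length of a piecewise-linear path."""
    return sum(
        math.hypot(points[k + 1][0] - points[k][0], points[k + 1][1] - points[k][1])
        for k in range(len(points) - 1)
    )


if __name__ == "__main__":
    path = [(0.0, 0.0), (3.0, 4.0)]
    print(path_cost(path, []), path_length(path))
```

A PSO routine would call `path_cost` on each particle and report `path_length` for the best particle found.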
The pseudocode for this part is provided in Algorithm 2.
Algorithm 2 Defense Calculation
1: for each defender (index 1 to N) do
2:   for each intruder (index 1 to M) do
3:     Calculate the straight-line defense path and discretize it into path points
4:     Mark the straight-line path as collision-free
5:     for each path point (index 1 to n) do
6:       for each obstacle (index 1 to z) do
7:         if the path point lies inside the obstacle then
8:           Mark the straight-line path as blocked
9:         end if
10:       end for
11:     end for
12:     if the straight-line path is collision-free then
13:       Calculate the path length as the straight-line distance
14:     else
15:       Construct the path cost function (Equations (7)–(9))
16:       Run the PSO algorithm in Kennedy et al. [22] to solve for the optimal path and calculate the length (Equation (10))
17:       Calculate the relative defense time (Equation (6))
18:     end if
19:   end for
20: end for
3.3. Task Assignment and Priority Determination
First, we use a graph-theoretic method to generate interception task pairs. We construct the following bipartite graph:
where the sets of defenders and intruders are defined as
For any defender
and intruder
, if the condition
is satisfied, it means that defender
can intercept intruder
before it reaches the target. In this case, an edge
is added to the graph
G. Here,
is a redundancy coefficient, which describes the additional time caused by disturbances and other uncertainties during the actual path tracking of the unmanned vessel swarm.
Based on the above graph construction, we transform the problem of solving the optimal interception task sequence into the following minimum-weight matching problem,
where
M is the maximum matching value presented as follows:
and the edge weights are chosen as the ceiling of the defense time, that is,
The notation
represents the ceiling function, which rounds a number up to the nearest integer. The Blossom algorithm is a classic graph algorithm for finding a maximum matching in general graphs. Its weighted extension solves maximum-weight and minimum-weight matching problems (see Galil et al. [23]). Using the weighted extension of the Blossom algorithm (via the NetworkX library in Python; see Hagberg et al. [24] for more details) to solve the minimum-weight matching problem (11), we obtain the optimal interception task sequence:
Remark 5. Using the minimum-weight matching algorithm to plan the interception tasks, we optimize the task costs by minimizing the cost of the defender swarm executing the interception tasks while ensuring the maximum number of matches.
After obtaining the pairs of interception tasks, consider that the size of the time window between the relative defense time and the relative attack time will affect the redundancy time available for the defender to reach the interception point, thus affecting the success rate of the interception task. Therefore, we determine the task priority based on this window size. The priority of
is defined as
A smaller relative time window means less reaction time available for the defender, indicating a higher risk. In contrast, a larger window provides more buffer time for the defender, resulting in a relatively lower task risk.
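The assignment and priority steps can be illustrated on a small instance. The sketch below is deliberately not the Blossom algorithm: it enumerates all assignments by brute force, which is only practical for a handful of agents, and serves as a stand-in to show what a maximum-cardinality, minimum-weight matching selects. The edge condition is assumed to be an additive margin (defense time plus redundancy no greater than attack time), and the priority is assumed to be the difference between attack time and defense time, with a smaller value meaning a more urgent task. These forms and all names are assumptions, since the corresponding equations are not reproduced in the text.

```python
import math
from itertools import permutations


def assign_tasks(t_def, t_att, delta=0.0):
    """Brute-force stand-in for the minimum-weight matching step.

    t_def[i][j]: relative defense time of defender i for intruder j.
    t_att[j]: relative attack time of intruder j.
    delta: assumed additive redundancy margin in the edge condition.
    Requires at least as many defenders as intruders.
    Returns (sorted list of (defender, intruder) pairs, priority per pair).
    """
    n_def, n_int = len(t_def), len(t_att)
    best_key, best_pairs = None, []
    for perm in permutations(range(n_def), n_int):
        # perm[j] is the defender tentatively assigned to intruder j;
        # keep only the pairs that are edges of the bipartite graph.
        pairs = [
            (i, j) for j, i in enumerate(perm) if t_def[i][j] + delta <= t_att[j]
        ]
        weight = sum(math.ceil(t_def[i][j]) for i, j in pairs)
        key = (-len(pairs), weight)  # most matches first, then least weight
        if best_key is None or key < best_key:
            best_key, best_pairs = key, pairs
    # Assumed priority: size of the time window (smaller means more urgent).
    priorities = {(i, j): t_att[j] - t_def[i][j] for i, j in best_pairs}
    return sorted(best_pairs), priorities


if __name__ == "__main__":
    pairs, prio = assign_tasks([[2.5, 4.5], [3.5, 8.5]], [10.0, 10.0], delta=1.0)
    print(pairs, prio)
```

For realistic swarm sizes, the polynomial-time weighted Blossom implementation referenced in the text is the appropriate tool; the brute-force search above grows factorially with the number of agents.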
3.4. Swarm Path Planning
After obtaining the interception tasks and their priorities, this stage focuses on path planning for the defender swarm. Specifically, if the defender
is assigned a task, let
be a path point on its trajectory and
be the next path point. The cost function for the path length between the
k-th and
-th path points of defender
can be defined as:
The cost for this segment of the path (between the
k-th and
-th path points) passing through obstacles is:
Here, represents the geometric center of obstacle s, , is the distance between the k-th path point and the geometric center of obstacle s, is the penalty cost for the k-th path point passing through obstacle s, is the obstacle penalty coefficient, is a very small number to prevent division by zero when the obstacle center coincides with the path point, and is the distance from the obstacle boundary to its center. is the total penalty for the path point passing through all obstacles.
Remark 6. If defender is not assigned an interception task, it will not participate in the swarm path planning or subsequent algorithm steps. will remain stationary and be treated as a static obstacle.
Assuming the set of defenders assigned interception tasks is
K, and defender
, meaning
is also assigned a task. Let
represent the
k-th path point on its trajectory. The mutual collision penalty function between two defenders is:
Here,
is the distance between the two defenders,
is the mutual collision penalty value between
and
, and
is the mutual collision penalty coefficient.
is a very small number to prevent division by zero when the centers of the defenders coincide.
will be penalized when the distance
is less than the sum of their radii and its time window
has lower priority (i.e.,
<
). This design ensures that, in the case of a potential path conflict, the defender with lower priority yields to avoid collision. The collision penalty for defender
at path point
with all other assigned defenders at the current index point (the
k-th path point of other defenders) is:
Then, the total cost function for defender
along this path is:
Here, n represents the number of path control points.
The PSO algorithm can then be used to solve the following cost function of swarm path planning to obtain the optimal collision-free path for the swarm.
Remark 7. When calculating the cost of a single path, task priority affects path planning: when is large (i.e., the task priority is low), the path of the defender may need to be adjusted more significantly compared to high-priority tasks to achieve avoidance and coordinated planning.
The pseudocode for this part is given in Algorithm 3.
Algorithm 3 Task Assignment and Path Planning
1: Run the weighted extension of the Blossom algorithm in Galil et al. [23] to solve for the optimal interception task sequence Task
2: Calculate task priorities (Equation (12))
3: Construct the path planning cost function (Equations (13)–(18))
4: Run the PSO algorithm in Kennedy et al. [22] to compute the optimal swarm path
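The priority-based yielding rule in the swarm cost can be sketched as follows. The penalty expression is an assumption modeled on the obstacle penalty (coefficient divided by distance plus a small constant), because the corresponding equations are not reproduced in the text. What the sketch preserves from the description is the rule itself: at each path index, a defender is penalized only if it is closer to another defender than the sum of their radii and its own time window is larger, that is, its task has lower priority. All names are hypothetical.

```python
import math


def defender_collision_penalty(path_i, window_i, radius_i, others, gamma=100.0, eps=1e-6):
    """Assumed mutual-collision penalty of one defender against all others.

    path_i: list of (x, y) path points of the defender being evaluated.
    window_i: its time window (larger window means lower priority).
    others: list of (path, window, radius) for the other assigned defenders;
            all paths are assumed to have the same number of points.
    gamma: mutual collision penalty coefficient.
    """
    total = 0.0
    for k, (xi, yi) in enumerate(path_i):
        for path_l, window_l, radius_l in others:
            xl, yl = path_l[k]  # compare points at the same path index
            d = math.hypot(xi - xl, yi - yl)
            # Only the lower-priority defender pays, so it is the one that yields.
            if d < radius_i + radius_l and window_i > window_l:
                total += gamma / (d + eps)
    return total


if __name__ == "__main__":
    path_a = [(0.0, 0.0), (1.0, 0.0)]
    path_b = [(5.0, 0.0), (1.5, 0.0)]
    # Defender A has the larger window, so A is penalized and B is not.
    print(defender_collision_penalty(path_a, 6.0, 0.5, [(path_b, 2.0, 0.5)]))
    print(defender_collision_penalty(path_b, 2.0, 0.5, [(path_a, 6.0, 0.5)]))
```

Making the penalty one-sided keeps the high-priority path close to its individually optimal shape, while the low-priority path absorbs the detour.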
3.5. Trajectory Tracking
In the previous stage, we obtained the optimal paths for the defender swarm using PSO. In this stage, we need to design a feedback tracking algorithm to ensure that the defenders can follow these optimal paths to execute their tasks. To achieve precise trajectory tracking, we employ a sliding mode controller to regulate the movement of the agents.
Suppose the position of the defender
i at time
t is
, and the optimal target point for the defender
i at time
t is given as
. Here, the optimal target point is selected by the following steps. We find three consecutive points on the path whose approximate curvatures all exceed
, and then the first point of these three consecutive points is chosen as the optimal target point
. The tracking error can be defined as:
We use the position error to evaluate the tracking performance, i.e.,
. Then, the desired heading angle is calculated by
. The difference between the current heading angle and the desired heading angle is defined as the heading error:
Next, we use the heading error to design the feedback controller. To enable the defenders to move precisely along the path, we can design the following sliding surface:
where
is the derivative of the heading error, and
is a positive constant used to control the convergence rate. The sliding mode controller can be expressed as:
where
k is the heading angular velocity satisfying
, and
is the sign function of the sliding surface. The defender’s speed is set to the maximum speed, i.e.,
, and only slows down when approaching the hypothetical attack point to gradually reach it. Thus, the speed of the defender satisfies:
where
is the distance from defender
to
, and
is a distance threshold.
The pseudocode for this part is as follows:
Remark 8. We now discuss parameter selection in the aforementioned algorithms. For Algorithms 2 and 3, all parameters are chosen to meet the requirements of the PSO algorithm. For Algorithm 4, the control gain is required to satisfy . This condition comes from the constraint on the heading angular velocity. It is worth noting that excessively high gains may cause severe chattering in sliding-mode control, whereas too-low gains enlarge trajectory-tracking errors and degrade the defender’s ability to intercept the intruder.
Algorithm 4 Trajectory Tracking
1: for each defender (index 1 to N) do
2:   Obtain the planned path and select the first path point with curvature exceeding the threshold as the target point
3:   Calculate the heading angular velocity (Equations (19)–(21)), and set the speed
4:   Update the defender’s pose (Equation (1))
5:   if satisfying the interception condition and then
6:     Mark the task as successful
7:   end if
8: end for
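One tracking update can be sketched as follows. The sketch assumes the heading error is the current heading minus the desired heading, the sliding surface is the error derivative plus a positive constant times the error, and the control is the negative gain times the sign of the sliding surface. The linear slow-down of the speed near the attack point is also an assumption, since the text only states that the defender slows down within a distance threshold. The finite-difference estimate of the error derivative and all names are introduced for illustration.

```python
import math


def wrap_angle(a):
    """Wrap an angle into [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def tracking_control(pos, theta, target, prev_err, dt, lam=1.0, k=1.0,
                     v_max=1.0, dist_goal=math.inf, d_slow=1.0):
    """Compute (speed, angular velocity, heading error) for one time step.

    pos, target: (x, y) of the defender and of the current target point.
    prev_err: heading error from the previous step (for the derivative).
    lam: positive constant of the sliding surface.
    k: control gain, to be chosen within the angular velocity bound.
    dist_goal: distance to the hypothetical attack point.
    d_slow: distance threshold below which the defender slows down.
    """
    theta_des = math.atan2(target[1] - pos[1], target[0] - pos[0])
    err = wrap_angle(theta - theta_des)
    err_dot = wrap_angle(err - prev_err) / dt
    s = err_dot + lam * err  # sliding surface
    sign_s = (s > 0) - (s < 0)
    omega = -k * sign_s  # switching control drives s toward zero
    # Assumed speed law: full speed far away, linear ramp-down near the goal.
    v = v_max if dist_goal > d_slow else v_max * dist_goal / d_slow
    return v, omega, err


if __name__ == "__main__":
    # Defender facing +x while the target lies straight up: it should turn left.
    print(tracking_control((0.0, 0.0), 0.0, (0.0, 1.0), -math.pi / 2, 0.1))
```

In practice the sign function is often replaced by a saturation function to reduce chattering, which relates to the gain trade-off discussed in Remark 8.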
3.6. Perimeter Defense Strategy
Assuming the start time, end time, and sampling time interval for the perimeter defense problem are
,
, and
T, respectively, the pseudocode for the perimeter defense strategy based on prioritized path planning is shown in Algorithm 5.
Algorithm 5 Perimeter Defense Strategy based on Prioritized Path Planning
1: repeat
2:   Advance the current time by the sampling interval T
3:   Construct the bipartite graph G
4:   for each intruder (index 1 to M) do
5:     Add the intruder node to the vertex set V
6:   end for
7:   for each defender (index 1 to N) do
8:     Add the defender node to the vertex set U
9:   end for
10:   Calculate hypothetical attack points and relative attack times (Algorithm 1)
11:   Calculate relative defense times (Algorithm 2)
12:   if a defender-intruder pair satisfies the edge condition of Section 3.3 then
13:     Add the corresponding edge to G with edge weight equal to the ceiling of the relative defense time
14:   end if
15:   Task assignment and swarm path planning (Algorithm 3)
16:   Trajectory tracking and interception decision (Algorithm 4)
17: until the end time is reached
Remark 9. The defense strategy needs to run Algorithms 1–4 at the initial moment. In subsequent iterations, if the intersection point calculated by Algorithm 1 does not change, Algorithms 2 and 3 can be skipped, and the defenders can continue tracking the trajectory planned in the previous iteration. For example, when the intruders adopt the shortest-path strategy to reach the boundary, none of their intersection points change. In this special case, Algorithms 2 and 3 only need to be run once at initialization, after which repeatedly running the tracking step in Algorithm 4 achieves the objective of perimeter defense. Additionally, even if the intersection point changes at each sampling time, Algorithms 1–3 can be run at with positive integer . In this way, the running frequency of Algorithms 1–3 is reduced, which decreases the computational cost and lowers the re-planning rate of the optimal path.
4. Simulations
In this section, the effectiveness of the algorithm is verified. A perimeter defense scenario is designed that includes obstacles of various shapes and sizes. The scenario is set in a square sea area with a length and width of
, as shown in
Figure 3. The defense boundary is a pentagon, and the central area is a circle with a radius of
. Inside the boundary, there are three sparsely arranged obstacles representing possible reefs around the central island. The geometric representations of the three obstacles are provided as follows: Obstacle 1 is a circle with its center at
and a radius of
; Obstacle 2 is a circle with its center at
and a radius of
; Polygonal Obstacle 3 has vertices (given in a counterclockwise direction) at
,
,
, and
. The initial positions of the intruders are randomly generated within an annular region ranging from
to
. The defenders are randomly generated within the defense boundary, excluding the areas occupied by obstacles.
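The random initialization described above could be sketched as follows. This is an illustrative sketch only: the helper names, the rejection-sampling approach, and the assumption that the annulus is centered on the region are not taken from the paper, and all numeric values must be supplied by the caller.

```python
import math
import random


def sample_annulus(rng, r_in, r_out, center=(0.0, 0.0)):
    """Draw a point uniformly (by area) from an annulus around `center`."""
    r = math.sqrt(rng.uniform(r_in ** 2, r_out ** 2))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (center[0] + r * math.cos(theta), center[1] + r * math.sin(theta))


def point_in_polygon(p, poly):
    """Ray-casting test; `poly` is a list of (x, y) vertices."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_defender(rng, boundary, circles, polygons, max_tries=10000):
    """Rejection-sample a point inside `boundary` but outside all obstacles.

    `circles` is a list of ((cx, cy), radius); `polygons` is a list of
    vertex lists. Both are hypothetical containers for the obstacles.
    """
    xs = [v[0] for v in boundary]
    ys = [v[1] for v in boundary]
    for _ in range(max_tries):
        p = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if not point_in_polygon(p, boundary):
            continue
        if any(math.dist(p, c) <= r for c, r in circles):
            continue
        if any(point_in_polygon(p, poly) for poly in polygons):
            continue
        return p
    raise RuntimeError("no free position found; check the obstacle layout")
```

Taking the square root of a uniform sample of the squared radius keeps the point density uniform over the annulus area rather than concentrating points near the inner radius.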
The models of the intruders and defenders are given in Equations (1) and (2). There are 10 agents on each side, and each agent updates its state once per iteration based on its heading and speed. The defender swarm employs the perimeter defense strategy in Algorithm 5. The intruder swarm uses a direct attack strategy combined with the artificial potential field method. Specifically, each intruder moves directly towards the center of the region, and the intersection point of its direct path with the defense boundary is taken as its target point, which exerts an attractive force on it. Meanwhile, each defender is treated as an obstacle that repels the intruders. The formulas for the artificial potential field method are as follows:
Here,
represents the attractive potential field,
q represents the position of the intruder,
is the attractive coefficient,
is the distance to the target point, and
is the threshold distance for the attractive potential.
represents the repulsive potential field,
is the repulsive coefficient,
represents the distance from the intruder’s current position
q to the nearest obstacle, defender, or other intruder.
is the distance within which the intruder is influenced by obstacles, defenders, or other intruders. For example, when the distance between a defender and an intruder is within
, the intruder is affected by that defender. For intruder
j, let
and
. Then, the linear speed and angular speed of
can be calculated as follows,
where
and
are control gains, and
and
are the components of the gradient of
along the x- and y-directions. The angle returned by the
function ranges between
and
, representing the angle between the ray from the origin to the point
and the positive direction of the
x-axis.
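As an illustration of the quantities defined above, the following sketch computes the gradient of a commonly used form of the artificial potential field and converts it into speed and turn-rate commands. It assumes the standard quadratic/conic attractive potential and the inverse-distance repulsive potential; the exact expressions, symbols, and control law of the paper are not reproduced here, and the names `k_att`, `k_rep`, `d_thresh`, `d_infl`, `k_v`, and `k_w` are hypothetical.

```python
import math


def attractive_gradient(q, goal, k_att, d_thresh):
    """Gradient of a quadratic well that becomes conic beyond `d_thresh`."""
    dx, dy = q[0] - goal[0], q[1] - goal[1]
    d = math.hypot(dx, dy)
    if d <= d_thresh:
        return (k_att * dx, k_att * dy)  # grad of 0.5 * k_att * d**2
    # grad of d_thresh * k_att * d (constant offset has zero gradient)
    return (d_thresh * k_att * dx / d, d_thresh * k_att * dy / d)


def repulsive_gradient(q, nearest, k_rep, d_infl):
    """Gradient of 0.5 * k_rep * (1/d - 1/d_infl)**2 inside the influence range."""
    dx, dy = q[0] - nearest[0], q[1] - nearest[1]
    d = math.hypot(dx, dy)
    if d >= d_infl or d == 0.0:
        return (0.0, 0.0)  # out of range, or degenerate coincident points
    scale = k_rep * (1.0 / d_infl - 1.0 / d) / d ** 3
    return (scale * dx, scale * dy)


def wrap_angle(a):
    """Map an angle to the range returned by atan2, i.e. [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def intruder_control(q, heading, goal, nearest, k_att, k_rep,
                     d_thresh, d_infl, k_v, k_w):
    """Assumed control law: speed from gradient norm, turn rate from heading error."""
    ga = attractive_gradient(q, goal, k_att, d_thresh)
    gr = repulsive_gradient(q, nearest, k_rep, d_infl)
    gx, gy = ga[0] + gr[0], ga[1] + gr[1]
    desired = math.atan2(-gy, -gx)  # descend the potential: follow -gradient
    v = k_v * math.hypot(gx, gy)
    w = k_w * wrap_angle(desired - heading)
    return v, w
```

Wrapping the heading error keeps the turn command on the shorter side of the circle, so a desired heading just past the ±π discontinuity does not produce a near-full rotation.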
The parameters used in the simulation are shown in
Table 2, where
n is the number of intermediate control points used to discretize the path during PSO optimization, nPop is the number of particles in the PSO algorithm,
is the coefficient for the path length in the cost function, and
and
are the penalty adjustment parameters used in both individual and swarm path planning. The parameters
,
,
and
are used in the artificial potential field method.
is the curvature threshold used in Step 5 of the algorithm to select tracking points. Note that the parameter selection is discussed in Remark 8.
The initial task list is shown in
Table 3, where the priority of each task is calculated using Equation (12). It can be seen that the difference between the relative attack time and the relative defense time for the interception task
is only
, which is relatively small. Therefore, it is assigned a higher priority (a smaller numerical value indicates a higher priority). In contrast, the task
has a time difference of
, which is relatively large, so its priority is low. This allows for more flexibility in swarm path planning to accommodate the movements of other defenders. After the swarm path planning was completed,
effectively avoided the circular obstacle below the central island reef while intercepting
, as shown in
Figure 4.
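The ordering rule described above (a smaller time margin gives a higher priority, expressed as a smaller number) can be illustrated with a short sketch. It is not Equation (12): it simply ranks tasks by the difference between a hypothetical attack time and defense time, and the function name and data layout are assumptions made for illustration.

```python
def rank_tasks(tasks):
    """Rank interception tasks by time margin; rank 1 is the most urgent.

    `tasks` maps a task identifier to (attack_time, defense_time), where the
    margin attack_time - defense_time is the slack available to the defender.
    """
    margins = {task: t_att - t_def for task, (t_att, t_def) in tasks.items()}
    ordered = sorted(margins, key=margins.get)  # smallest slack first
    return {task: rank for rank, task in enumerate(ordered, start=1)}
```

For example, `rank_tasks({"a": (10.0, 9.5), "b": (20.0, 5.0)})` gives task `"a"` rank 1, since its defender has the least slack, while task `"b"` can yield to other defenders during swarm path planning.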
Additionally, the cost function throughout all iterations is shown in
Figure 5; the curve shows an overall downward trend. In the first iteration, the straight-line path was used as the initial path, which gave the shortest path length and hence the lowest path-length cost. However, some defenders’ paths passed through obstacles, such as that of defender
, leading to the highest overall cost. After random exploration, the path length cost increased, but the obstacle penalty decreased. Moreover, with the appropriate values of the penalty coefficients
and
, the overall swarm cost still showed a downward trend. After dozens of iterations, the cost function decreased rapidly and reached its lowest point at the maximum number of iterations. The fluctuations observed during the descent were mainly due to mutual collisions among the unmanned vessels and the random exploration of the particles. Trajectory tracking is then carried out based on the current swarm path planning. The control inputs for the multi-agent system are given in
Figure 6. Throughout the entire defense process, the tracking errors of all defenders converge to zero, as shown in
Figure 7. By the end of the defense process, all interception tasks were completed successfully, which implies that all intruders were captured by the defenders. Moreover, the sensitivity to the collision penalty coefficient
is analyzed in
Figure 8. When the coefficient
is too large (e.g.,
), the cost function tends to ignore collision cases. Conversely, when the coefficient
is too small (e.g.,
), it fails to reflect the priority of avoiding collisions. Therefore, the parameter must be chosen with care.
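The trade-off between path length and penalties discussed above can be illustrated with a simplified cost function. The additive weighted form, the waypoint-only obstacle check, and the names `w_len`, `k_obs`, `k_col`, and `d_safe` are assumptions made for this sketch; the paper's actual cost function is defined elsewhere and may combine these terms differently.

```python
import math


def path_length(path):
    """Total Euclidean length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def obstacle_hits(path, circles):
    """Count waypoints inside circular obstacles ((cx, cy), radius).

    Only waypoints are checked, so a segment that clips an obstacle between
    two waypoints is missed; a finer discretization reduces that risk.
    """
    return sum(1 for p in path for c, r in circles if math.dist(p, c) < r)


def collision_hits(paths, d_safe):
    """Count same-index waypoint pairs of different agents closer than d_safe."""
    hits = 0
    for i in range(len(paths)):
        for j in range(i + 1, len(paths)):
            hits += sum(
                1 for a, b in zip(paths[i], paths[j]) if math.dist(a, b) < d_safe
            )
    return hits


def swarm_cost(paths, circles, w_len, k_obs, k_col, d_safe):
    """Weighted sum of total length, obstacle penalty, and collision penalty."""
    length = sum(path_length(p) for p in paths)
    obstacles = sum(obstacle_hits(p, circles) for p in paths)
    collisions = collision_hits(paths, d_safe)
    return w_len * length + k_obs * obstacles + k_col * collisions
```

In this simplified form, a straight path through an obstacle is the shortest but can still have the highest total cost once the obstacle penalty is added, which mirrors the behavior described for the first iteration.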
All simulations were conducted on a workstation equipped with an AMD EPYC 9654 96-Core Processor running at 2.40 GHz, 128 GB of RAM, and an NVIDIA RTX 4090 GPU. The system operated on a 64-bit Windows platform based on an x64 architecture. The simulation framework was developed in Python 3.8, and all experiments were executed in a single-GPU environment.