1. Introduction
Unmanned aerial vehicles (UAVs) have gained significant attention because of their numerous advantages, including high mobility, low operational cost, and the ability to access difficult or hazardous environments. These characteristics have led to their widespread application in various fields. However, when a single UAV is tasked with complex operations, its limited payload, coverage area, and redundancy often restrict its effectiveness. To overcome these limitations, there is a growing need for multiple UAVs to collaborate as a swarm and complete tasks more efficiently. Owing to their flexibility and efficiency, UAV swarms can be used for various cooperative tasks, such as surveillance [1], search [2], rescue [3], exploration [4], cooperative encirclement [5], and object transportation [6,7].
In the domain of UAV swarms, formation flying has emerged as a key strategy for enabling coordinated behavior. It allows UAVs to work together in predefined spatial configurations, thus optimizing resource usage, increasing coverage, and ensuring fault tolerance. The ability to manage and maintain formation not only enhances the operational capabilities of UAV swarms but also improves their overall robustness and efficiency in task execution, making it an essential approach in modern UAV systems.
Planning is an indispensable component of any autonomous system, offering safe and efficient guidance to complete specific tasks. Therefore, formation planning is a critical aspect of UAV swarms. Formation planning refers to the process of determining the optimal movement strategy for UAVs to ensure safe task completion while maintaining the desired formation shape. Collision avoidance between UAVs and formation shape maintenance are the two main challenges of the formation planning problem. Many researchers have devoted attention to this field in recent years.
Extensive research has been carried out on the collision avoidance problem (collision with obstacles and collision between UAVs) in multi-UAV planning. Alonso-Mora et al. [8] used the velocity obstacle method to establish a local obstacle avoidance planning problem for multiple UAVs. Collision avoidance, obstacle avoidance, and motion continuity were considered in the cost functions of the proposed optimization problems, and safe and feasible real-time local trajectories were obtained by solving them. Zhou et al. [9] presented EGO-swarm, a decentralized approach for autonomous navigation by multiple UAVs using only onboard resources. The planning system is formulated under a gradient-based local planning framework, where collision avoidance is realized by formulating the collision risk as a penalty term in a nonlinear optimization problem. On that basis, ref. [10] used MINCO instead of a B-spline to parameterize trajectories, thus resolving the difficulty of time adjustment when UAVs need to pass through the same area. It also produced smoother trajectories and a shorter optimization time. Tordesillas et al. [11] presented MADER, a 3D decentralized and asynchronous trajectory planner for UAVs that generates collision-free trajectories. Collision avoidance with other UAVs is realized by including their committed trajectories as constraints in the optimization and then executing a collision check–recheck scheme. Recently, Toumieh et al. [12] proposed a high-speed, decentralized, and synchronous motion planning framework (HDSM), which generates a time-aware safe corridor (TASC) to guarantee the safety of the UAV trajectories. Zhao et al. [13] introduced a new Theta*–APF method for drone swarm path planning in 3D space; the method reduces the searching time and the path length. Collision avoidance among the agents is realized using repulsive force fields.
Although the above work offers outstanding performance in collision avoidance for UAV swarms, the formation maintenance problem is not taken into account. Existing formation maintenance methods are summarized below.
First, from the perspective of control, a high-precision formation shape can be maintained. Leader–follower control is a hierarchical approach used in UAV swarm operations, where one or several UAVs are designated as leaders, and the others follow these leaders based on predefined rules [14]. This approach depends on the stability and reliability of the leader(s), meaning that disruption of a leader’s communication could impact the entire formation. Therefore, when faced with intricate scenarios and constraints, such methods [15,16,17] need further research and refinement to address all the requirements. Virtual structure methods establish a geometric formation in which the UAVs move together as a cohesive unit. Each UAV preserves its relative position as the formation moves, allowing the entire formation to translate, rotate, or scale as required to meet mission goals or avoid obstacles. These approaches enable precise control over the formation shape and provide an intuitive way to modify the formation in response to changing conditions. However, as the number of UAVs increases, the stability and accuracy of the virtual structure may become extremely difficult to maintain. Consensus control is also a classic and promising formation maintenance approach, enabling the UAVs to agree on specific parameters such as position, velocity, or angle so that the swarm moves cohesively. It promotes flexible adaptation to environmental changes and enhances collaborative task performance. Refs. [18,19,20] are all recent works that use consensus theory to design formation planning algorithms. However, the effectiveness of consensus algorithms depends heavily on the quality of communication between UAVs and on the convergence speed of the algorithms. Other control methods, such as fuzzy control and adaptive control [21], can also tackle the formation maintenance problem.
Second, from the perspective of searching and optimization, flexibility and robustness can be improved by treating the formation maintenance requirement as a soft constraint. Nguyen et al. [22] formulated a distributed optimization problem based on dynamic consensus and solved it using the Alternating Direction Method of Multipliers (ADMM), achieving time-varying formation shape maintenance with inter-robot collision avoidance. They took advantage of motion planning approaches based on model predictive control (MPC) to design reference trajectories at each replanning instant. Quan et al. [23] designed a differentiable cost function based on graph theory to evaluate UAV formations. By considering formation similarity, obstacle avoidance, and dynamic feasibility, they realized a balance between UAV formation maintenance and safety during flight. Peng et al. [24] proposed a distributed and synchronous motion planning framework for a formation of multiple UAVs equipped with an active sensing system in an obstacle-based environment using a gradient-based method. They used expanding fields of view (FOVs) to enhance safety in the UAV swarm motion planning task. The planning problem was solved with the distributed particle swarm optimization algorithm. Zhang et al. [25] proposed an online formation planning method for a tethered multirotor UAV cooperative transportation system. An optimization problem was constructed considering asymmetric tension-based swarm reciprocal avoidance, obstacle avoidance, and target transportation. All constraints were represented as soft constraints to realize the task requirements. Mikkelsen et al. [26] introduced a distributed planner for rigid formations. They first determined the scaling, rotation, and translation of a base configuration to obtain the desired velocities at each time step for the swarm. The desired velocities of the agents were mapped to a parameter space to guarantee consensus and constraints, and they were then remapped to the velocity space of the agents, ensuring that the robots maintained the shape of their formation. Liu et al. [27] proposed a global formation planning method with obstacle avoidance, which conceptualized robot formations into distinct configurations. The feasible configurations and the transitions between two configurations were represented as vertices and edges, respectively, to construct an undirected graph in which the optimal formation path could be found using searching algorithms.
Third, formation maintenance can also be realized from the perspective of deep reinforcement learning (DRL) [28,29,30,31]. In recent years, significant attention has been focused on DRL, in which deep neural networks are employed to approximate the value function, the policy, or both within reinforcement learning algorithms, allowing agents to effectively manage high-dimensional states. By utilizing DRL, UAVs are capable of independently coordinating their movements to preserve desired formation patterns, avoid collisions, and adapt to changing environments. Through continuous interaction with the environment, DRL methods allow UAVs to learn optimal control strategies, resulting in more efficient and resilient formation control compared to conventional approaches. DRL methods offer benefits such as complex decision-making, long-term reward optimization, and adaptability. However, challenges remain, including sample inefficiency, potential training instability, the need for hyperparameter tuning, and safety risks during the learning process. Furthermore, the selection of these approaches depends on various factors, including the specific needs of the formation task, available data, computational resources, and safety concerns [14]. In practice, high computational demands and long computing times are the main obstacles for DRL-based methods.
In summary, the different kinds of methods have their own advantages and shortcomings. A detailed comparison is shown in Table 1. Control-based methods have lower computational demands and higher real-time performance, which makes them easier to deploy, yet they lack environmental adaptability because they rely on accurate environmental models and strict communication among agents. The local minimum problem is also a challenge during intricate obstacle avoidance. Search-optimization-based methods perform well, with relatively high scalability and moderate computational complexity, when the environment is known (even if it is intricate). DRL-based methods achieve much higher scalability and adaptability, but the training process is time-consuming and requires enormous computing power. Accordingly, control-based methods are suitable for smaller swarms in simpler environments; search-optimization methods are good at handling normal formations in environments that allow offline planning; and DRL-based methods are recommended for large-scale formations in unknown and complex environments, provided that high computational power is available. Since this paper focuses on the formation planning problem in known static environments for medium-scale swarms, the search-optimization method is utilized. Compared with the existing search-optimization-based algorithms, our proposed method simplifies the expression of the formation cost by using reference relative vectors instead of the Laplacian matrix used in [23]. The path searching algorithm is also improved to avoid disintegration of the formation. In addition, the safe corridor approach is utilized to ensure flight safety instead of the soft constraints used in [23,24].
In this paper, we introduce the swarm-A* algorithm, which enhances the cohesion of the swarm and prevents disintegration of the formation during the path searching process, and we also present a distributed formation trajectory optimization framework that can balance collision avoidance and formation shape maintenance. In detail, we propose a distributed formation planning approach for UAVs, which can generate collision-free trajectories in environments with static obstacles. The proposed formation planning method mainly consists of swarm path searching and formation trajectory optimization. A sliding mode controller is designed to validate the dynamic feasibility of the trajectories. Simulations and experiments verify the effectiveness and adaptability of the proposed method.
The main contributions of this work are as follows:
A path searching algorithm that prevents formation disintegration is proposed. A swarm heuristic cost is applied during the search, and it noticeably enhances the cohesion of the swarm paths. As a result, the subsequent optimization problem is considerably easier to solve than when the paths are searched without consideration of swarm cohesion.
We propose a distributed formation trajectory optimization method that takes formation maintenance, obstacle avoidance, and kinematics into account. By solving the optimization problem, smooth rotation and translation of the UAV formation can be realized. The method enables the UAV system to balance between moving in the reference formation shape and avoiding obstacles.
A series of simulations and a real-world experiment are conducted, validating the effectiveness of our proposed method.
The remainder of this paper is organized as follows. In Section 2, we describe the studied system and formulate the problem. Some basic knowledge used in this paper is also introduced. In Section 3, the proposed formation planning method is described in detail. Section 4 introduces the formation tracking control method. In Section 5, simulations under different circumstances are presented. Section 6 shows the experimental results and analysis. Finally, Section 7 summarizes the paper and discusses future work.
3. Formation Planning
The overall framework of the proposed formation planning method is shown in Figure 1. First, the swarm-A* algorithm is designed to search for collision-free discrete waypoints for the center of the formation and for the UAVs. Second, the waypoints are densified, and the numbers of waypoints are equalized. Safe corridors, a set of convex polygons covering the free motion space of the UAVs, are constructed using these processed waypoints. Finally, a nonlinear distributed trajectory optimization problem is presented and solved to obtain safe and smooth trajectories that maintain the reference formation shape. Detailed explanations of the proposed method are provided in the subsections below.
3.1. Path Searching
This subsection presents a path searching method called the swarm-A* algorithm, which aims to obtain discrete collision-free path points for the UAVs while taking their cohesion into account. The search space is a two-dimensional grid map that contains free grid squares and occupied grid squares. Eight search directions are available in each search iteration.
Conventional algorithms such as A* are not directly applicable to formation path searching. If they are used to search paths for multiple UAVs sequentially, the UAVs are very likely to bypass obstacles on different sides. This is not conducive to maintaining the formation and imposes a large computational burden on the subsequent trajectory optimization, sometimes even making it unsolvable. Additionally, large obstacles may block communication if UAVs are located on different sides of them. To overcome this problem, we designed the swarm-A* algorithm, whose principle is presented below.
The swarm-A* algorithm is shown as Algorithm 1.
and
represent the parent node and the child node, respectively, during the searching process. In Line 2, the algorithm searches for a reference path
for the center of the formation using a hybrid A* algorithm [
34].
contains both position and orientation information, and its
kth waypoint is denoted as
, where
represents the orientation. In Line 12,
, the closest path point to
, is found in
, acting as a reference point. In Line 17, the algorithm judges whether the path passing
towards
is shorter than the path passing its previous parent node. If so, the parent of
is replaced with
. In Line 21, a node extension formula for swarm-A* is proposed, where
is the cumulative path length from the starting point to the current node
,
is the heuristic cost between the current node and the goal point, and
is the heuristic cost between the current node and the reference point.
and
are the weights of the two heuristic costs, respectively.
is the total cost. A comparison of different values of swarm weight is shown in
Figure 2. It can be observed that, as
increases, the cohesion of the searched paths also improves, i.e., the UAVs’ paths become closer to the reference path. When
, the paths bypass all obstacles on the same side. This demonstrates that the proposed swarm-A* algorithm is capable of enhancing the cohesion of swarm movements.
Algorithm 1 Swarm-A*
Input: UAV number n, starting and goal points
Output: UAV paths
1: Compute the starting point and goal point of the formation center ,
2: Use Hybrid A* algorithm to search for a reference path for the center of the formation,
3: for do
4:   ;
5:   ;
6:   while do
7:     Choose with minimal total cost f in ;
8:     ;
9:     if then
10:       break;
11:     end if
12:     Find the point in that is closest to ;
13:     for all do
14:       if or is unfeasible then
15:         continue;
16:       else if then
17:         ;
18:       else
19:         ;
20:         ;
21:         ;
22:       end if
23:     end for
24:   end while
25:   Obtain ;
26: end for
27: return
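To make the node extension rule concrete, the following Python sketch runs a grid search whose total cost adds a second heuristic toward the closest point of a reference path. It is a minimal illustration under stated assumptions, not the implementation used in this paper: the names `swarm_a_star`, `w_h`, and `w_s`, the Euclidean heuristics, and the assumed cost form f = g + w_h·h + w_s·s are all hypothetical, and the reference path is passed in directly rather than produced by a hybrid A* search.

```python
import heapq
import math

# Eight search directions on the grid, as stated in the text.
MOVES = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def swarm_a_star(grid, start, goal, ref_path, w_h=1.0, w_s=0.0):
    """Grid A* with an extra 'swarm' heuristic toward a reference path.

    grid[r][c] == 1 marks an occupied cell. ref_path is a non-empty list of
    (r, c) cells standing in for the formation-center path. w_h and w_s are
    hypothetical names for the two heuristic weights.
    """
    def h_goal(node):
        return math.dist(node, goal)

    def h_swarm(node):
        # Distance to the closest reference point (cf. Line 12 of Algorithm 1).
        return min(math.dist(node, p) for p in ref_path)

    def total(node, g):
        # Assumed form of the node extension formula (cf. Line 21).
        return g + w_h * h_goal(node) + w_s * h_swarm(node)

    g_cost = {start: 0.0}
    parent = {start: None}
    open_heap = [(total(start, 0.0), start)]
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry left over from a re-parenting
        closed.add(node)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for dr, dc in MOVES:
            child = (node[0] + dr, node[1] + dc)
            r, c = child
            if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
                continue
            if grid[r][c] == 1 or child in closed:
                continue
            g_new = g_cost[node] + math.hypot(dr, dc)
            # Re-parent the child when the new route is shorter (cf. Line 17).
            if g_new < g_cost.get(child, math.inf):
                g_cost[child] = g_new
                parent[child] = node
                heapq.heappush(open_heap, (total(child, g_new), child))
    return None  # goal not reachable
```

With `w_s = 0` the sketch reduces to plain A*. A larger `w_s` tends to draw each UAV path toward the side of an obstacle on which the reference path lies, at the price of a path that is no longer guaranteed to be the shortest.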
3.2. Waypoint Reallocation
Because the UAVs have different starting and goal points, the number of waypoints may differ from one UAV to another. Since the relative positions between UAVs are calculated at each time step during the trajectory optimization process, every UAV path should contain the same number of waypoints. In addition, since the reference path (formation center path) is used to provide a formation rotation angle for the UAVs at each time step (Section 4), the length of the reference path should be close to that of the UAV paths. To meet these requirements, the path points are densified by inserting h points equidistantly into the line segment between each pair of adjacent points. An equalization algorithm (Algorithm 2) is proposed to calculate the number h of points that need to be inserted and then insert them into the previously searched paths. The process of Algorithm 2 is as follows.
Algorithm 2 Waypoint reallocation method
Input: UAV paths 1, waypoint number , waypoint number of the formation center , maximum of the inserted point number
Output: reallocated paths
1: =
2:
3: for do
4:   if and then
5:     if then
6:
7:     else
8:     end if
9:   end if
10: end for
11: for do
12:
13: end for
14:
15: if then
16:
17: else
18: end if
19: return
In Line 1, the longest path among the UAVs is chosen, and its waypoint number is denoted as . In Line 2, each UAV’s path (except for the longest one) is extended by copying and appending its last waypoint to the end (function ), thus making all UAVs’ numbers of waypoints equivalent. In Lines 3–10, h is determined by comparing the numbers of waypoints on the reference path and the inserted paths. In Lines 11–18, the UAVs’ paths are densified (function ), and the length of the reallocated path is denoted as l. After that, the lengths of the reference path and the reallocated path are equalized. represents the final number of waypoints on all discrete paths.
To ensure the clarity and conciseness of the representation, the discrete path point set of UAV i obtained through path searching and waypoint reallocation is denoted as , where represents the position of the kth path point.
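The padding and densification steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `pad_to_length`, `densify`, and `reallocate` are hypothetical, the number of inserted points `h` is taken as an input instead of being selected as in Lines 3–10 of Algorithm 2, and the final equalization is done by repeating the last waypoint of whichever path is shorter.

```python
def pad_to_length(path, length):
    """Repeat the last waypoint until the path holds `length` waypoints."""
    return path + [path[-1]] * (length - len(path))


def densify(path, h):
    """Insert h equidistant points into every segment between adjacent waypoints."""
    dense = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        for j in range(h + 1):
            t = j / (h + 1)
            dense.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    dense.append(path[-1])
    return dense


def reallocate(uav_paths, ref_path, h):
    """Equalize waypoint counts across the UAV paths and the reference path."""
    # Pad every UAV path to the waypoint count of the longest one.
    n_max = max(len(p) for p in uav_paths)
    dense_paths = [densify(pad_to_length(p, n_max), h) for p in uav_paths]
    # Equalize the densified UAV paths and the reference path (assumed rule).
    target = max(len(dense_paths[0]), len(ref_path))
    dense_paths = [pad_to_length(p, target) for p in dense_paths]
    return dense_paths, pad_to_length(ref_path, target)
```

A path with N waypoints grows to (N − 1)(h + 1) + 1 waypoints after densification, so all padded UAV paths end up with the same count.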
3.3. Safe Corridor Generation
Safe corridors are a set of convex polygons that cover the feasible space of the environment and thereby ensure collision-free trajectories. The safe corridor set of the path
is represented as
, where
represents the safe corridor of the
kth waypoint. Note that the safe corridor needs to be sequentially connected, i.e., the safe corridors of two adjacent points must overlap, which is denoted as
In this paper, a rectangular safe corridor is generated around each waypoint by expanding a safe region centered at that point.
is initialized as the starting point
. Expansion proceeds in the four cardinal directions
until the distance between the corridor’s boundary and nearby obstacles is reduced to a specified safe distance in all directions. This expansion is repeated sequentially for each point in the path
until the final corridor for
is obtained. Inspired by [
35], the connectivity problem is solved by inserting points into the line segments between two adjacent waypoints (the same process as in
Section 3.2) of both the UAV paths and the reference path. The number of inserted points is chosen in consideration of the density of obstacles.
Figure 3 demonstrates a safe corridor generation result. The yellow dots represent the original waypoints, and the blue ones represent the inserted waypoints (
). The dots surrounded by red dashed lines represent the ‘expansion points’. The green rectangles represent the generated safe corridor for the path. During the generation process, if a waypoint
does not lie within the boundaries of the previously generated safe corridor
(
), i.e.,
, then it is called an ‘expansion point’. Otherwise, if
, then the corridor generation process for the expansion point is skipped, and we suppose that
. As is shown in
Figure 3, after the first corridor
is constructed, the next waypoint that is not within
is
. Therefore,
is defined as an expansion point, whose safe corridor, denoted as
, is then generated, etc. When the last waypoint
receives its corresponding corridor
, the generation process is finished.
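The corridor generation procedure can be illustrated with the short Python sketch below. It is a simplified version under stated assumptions: the map is an occupancy grid whose obstacles are taken to be already inflated by the safe distance, rectangles are stored as inclusive cell bounds, and the names `expand_rectangle`, `inside`, and `build_corridors` are hypothetical. Only waypoints that fall outside the previously generated rectangle (the expansion points) trigger a new expansion.

```python
def expand_rectangle(grid, cell):
    """Grow an axis-aligned rectangle of free cells around `cell`.

    Returns inclusive bounds (r_min, r_max, c_min, c_max). The rectangle grows
    one row or column at a time in the four cardinal directions until every
    direction is blocked by an occupied cell or the map border.
    """
    rows, cols = len(grid), len(grid[0])
    r_min = r_max = cell[0]
    c_min = c_max = cell[1]

    def strip_is_free(r0, r1, c0, c1):
        return all(grid[r][c] == 0
                   for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))

    grown = True
    while grown:
        grown = False
        if r_min > 0 and strip_is_free(r_min - 1, r_min - 1, c_min, c_max):
            r_min, grown = r_min - 1, True
        if r_max < rows - 1 and strip_is_free(r_max + 1, r_max + 1, c_min, c_max):
            r_max, grown = r_max + 1, True
        if c_min > 0 and strip_is_free(r_min, r_max, c_min - 1, c_min - 1):
            c_min, grown = c_min - 1, True
        if c_max < cols - 1 and strip_is_free(r_min, r_max, c_max + 1, c_max + 1):
            c_max, grown = c_max + 1, True
    return (r_min, r_max, c_min, c_max)


def inside(box, cell):
    """Check whether a grid cell lies within the inclusive bounds of a rectangle."""
    r_min, r_max, c_min, c_max = box
    return r_min <= cell[0] <= r_max and c_min <= cell[1] <= c_max


def build_corridors(grid, path):
    """Assign a rectangle to every waypoint; only expansion points create new ones."""
    corridors = []
    current = None
    for cell in path:
        if current is None or not inside(current, cell):
            current = expand_rectangle(grid, cell)  # `cell` is an expansion point
        corridors.append(current)
    return corridors
```

The sketch does not verify that consecutive rectangles overlap. In line with the text, that connectivity would be obtained by densifying the waypoints before the corridors are built.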
3.4. Trajectory Optimization
The discrete waypoints are refined into smooth formation trajectories in this subsection. The trajectory of the UAV i obtained through trajectory optimization is denoted as , where represents the time from the start to the kth trajectory point, is the unit time, and L represents the total number of points contained in the optimized trajectory.
First, the formation rotation matrix at each time step is calculated. By combining the reference formation shape with the rotation matrix, the formation cost is determined. Subsequently, a distributed trajectory optimization problem is formulated. Safe and smooth formation trajectories are obtained by solving the optimization problem.
3.4.1. Smoothness Cost
The smoothness cost involves two parts. The first part,
, describes the difference in linear speed between two adjacent waypoints, i.e., the acceleration of the UAV. The second part,
, describes the difference in acceleration between two adjacent trajectory segments, i.e., the jerk of the UAV. The smoothness cost is represented by
where
is the smoothness cost of UAV
i at the
kth trajectory point, while
and
are the weighting factors of the two parts of the smoothness cost.
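As a rough numerical illustration of the two smoothness terms, the Python sketch below approximates acceleration and jerk by finite differences of a discrete trajectory and sums their weighted squared norms. The function names, the weights `w_acc` and `w_jerk`, and the choice of squared norms summed over all trajectory points are assumptions made for this example; the exact expression used in this paper may differ.

```python
def finite_difference(samples, dt):
    """Forward difference of a sequence of 2D samples with time step dt."""
    return [((b[0] - a[0]) / dt, (b[1] - a[1]) / dt)
            for a, b in zip(samples, samples[1:])]


def smoothness_cost(points, dt, w_acc=1.0, w_jerk=1.0):
    """Weighted sum of squared acceleration and squared jerk along a trajectory."""
    vel = finite_difference(points, dt)   # velocity between adjacent points
    acc = finite_difference(vel, dt)      # change of velocity (acceleration)
    jerk = finite_difference(acc, dt)     # change of acceleration (jerk)
    acc_term = sum(ax * ax + ay * ay for ax, ay in acc)
    jerk_term = sum(jx * jx + jy * jy for jx, jy in jerk)
    return w_acc * acc_term + w_jerk * jerk_term
```

A straight trajectory flown at constant speed has zero cost under this definition, whereas speeding up and then slowing down is penalized by both terms.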
3.4.2. Formation Cost
The formation cost is designed to realize the formation maintenance through the optimization process. First, the calculation of the formation rotation matrix is introduced as follows.
For a two-dimensional vector, we define its rotation matrix $R$ as
$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix},$$
where $\theta$ represents the rotation angle. The counterclockwise rotation of a two-dimensional vector by an angle $\theta$ relative to its original orientation can be achieved by left-multiplying it by the rotation matrix $R$.
The orientation of the UAV formation is denoted as
. Let
when the formation is towards the positive direction of the x-axis in the xOy plane, and
gradually increases as the formation rotates counterclockwise. The starting and goal orientation angles of the formation are defined as
and
, respectively. To realize the rotation of the moving formation, the previously searched reference path (in Line 2 of Algorithm 1) is utilized to obtain the rotation angle of the formation at each time step, denoted as
. The
kth waypoint of the reference path is denoted as
, with
. Therefore, the rotation matrix of the reference formation at the
kth trajectory point can be written as
where
represents the orientation angle of the reference path at the
kth point. A transformation and rotation process of a triangle reference formation with 3 UAVs is illustrated in
Figure 4. The black circles and the red circles represent the UAVs and the center of the formation, respectively. The red dashed line represents the reference path (the path taken by the formation’s center). The figure shows that the orientation of the formation’s center is used to describe the rotation of the overall formation. As
k increases from 1 to
L, the formation’s orientation angle (shown as dark blue arrows) gradually reduces from
to
.
After the rotation matrix is obtained, the calculation of the formation cost is explained as follows.
We construct the formation cost of the optimization problem using two values, and , both representing the deviation between reference and actual relative position vectors:
is the deviation of the relative position vector between each UAV and the formation’s center at the
kth trajectory point. Based on the rotation matrix and the reference formation shape, the reference value can be calculated by
, where
represents the reference position of the formation’s center. The actual value is
, where
and
represent the
k-th waypoint of the
i-th UAV’s path and the reference path, respectively. Then,
is the deviation of relative position vectors between two UAVs at the
kth trajectory point. Only one neighbor
is taken into consideration for UAV
i in order to reduce the computational burden. The reference value is
, while the actual value is
. Then,
In
Figure 4, the gray arrows are examples of the two reference vectors mentioned above.
represents the reference relative position between UAV1 and UAV2, while
represents the reference relative position between UAV2 and the formation’s center. The sum of the two terms above yields the formation error vector at the
kth trajectory point (i.e., at time step
) as follows:
where
is the formation cost of UAV
i at the
kth trajectory point, while
are positive weighting constants. To obtain optimal formation shape maintenance, the objective of the optimization is to minimize the norm of
, i.e.,
.
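The construction of the formation error can be illustrated with the following Python sketch. It is hedged in several ways: the names `rotate`, `formation_error`, and `formation_cost`, the weights `w_center` and `w_neighbor`, and the representation of the reference formation shape as offsets from the formation center at zero orientation are assumptions made for illustration only. The error is taken as the weighted sum of the two deviation vectors, and the cost as its squared norm.

```python
import math


def rotate(vec, theta):
    """Rotate a 2D vector counterclockwise by theta (left-multiplication by R)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * vec[0] - s * vec[1], s * vec[0] + c * vec[1])


def formation_error(p_i, p_j, p_c, ref_i, ref_j, theta,
                    w_center=1.0, w_neighbor=1.0):
    """Weighted sum of the two relative-position deviations of UAV i.

    p_i, p_j, p_c: actual positions of UAV i, its neighbor j, and the center.
    ref_i, ref_j: reference offsets of UAV i and UAV j from the center at
    zero orientation (an assumed encoding of the reference formation shape).
    """
    ref_ic = rotate(ref_i, theta)
    ref_ij = rotate((ref_i[0] - ref_j[0], ref_i[1] - ref_j[1]), theta)
    # Deviation of the UAV-to-center relative position.
    e_c = (p_i[0] - p_c[0] - ref_ic[0], p_i[1] - p_c[1] - ref_ic[1])
    # Deviation of the UAV-to-neighbor relative position.
    e_n = (p_i[0] - p_j[0] - ref_ij[0], p_i[1] - p_j[1] - ref_ij[1])
    return (w_center * e_c[0] + w_neighbor * e_n[0],
            w_center * e_c[1] + w_neighbor * e_n[1])


def formation_cost(*args, **kwargs):
    """Squared norm of the formation error vector."""
    ex, ey = formation_error(*args, **kwargs)
    return ex * ex + ey * ey
```

For example, with two UAVs placed one unit on either side of the center and the formation rotated by 90 degrees, the cost vanishes when both UAVs sit exactly at their rotated reference positions and grows as UAV i drifts away from its slot.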
3.4.3. Trajectory Optimization Problem
The proposed trajectory optimization problem is as follows:
where
and
are weight constants. The number of the trajectory points is denoted as
L. The cost function (
13a) involves a smoothness cost
and a formation cost
, which are represented in quadratic form. The optimization variable is the acceleration of the UAVs. Equations (13b)–(13g) are the hard constraints of the optimization problem. The starting position and goal position of each UAV are limited by (13b), where
. The kinematics of each UAV are described by (13c) and (13d), where
. The specific forms of (13c) and (13d) are denoted as
The safe corridor constraint is denoted as (13e), where
. Equation (13f) determines the upper bound and lower bound of the control input. Equation (13g) represents the collision avoidance constraint among UAVs, where
.
is the collision radius of the UAV. From UAV1 to UAV
n, an optimization problem (
13a) is constructed and solved sequentially. The previously optimized trajectories are used by the subsequent UAVs to calculate the formation cost and to achieve reciprocal avoidance within the swarm. A directed graph is used to describe the communication topology between UAVs. Taking a formation with four UAVs as an example (see Figure 5), UAV2 receives the trajectory of UAV1, UAV3 receives the trajectories of both UAV1 and UAV2, and so on.
Note that the proposed optimization problem (13a) is non-convex and has nonlinear constraints, so a globally optimal solution cannot be guaranteed. However, if a slight deviation from the reference formation shape is acceptable, classic nonlinear optimization techniques can still be used to solve the problem and obtain a feasible solution.
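To illustrate how the hard constraints interact, the Python sketch below rolls out a trajectory from a sequence of accelerations and then checks the corridor, input-bound, and inter-UAV distance constraints against trajectories that were optimized earlier in the sequence. It is a sketch under stated assumptions rather than the formulation of this paper: the double-integrator update with piecewise-constant acceleration, the per-axis acceleration bound `a_max`, the box representation of the corridors, the threshold `min_dist`, and the function names are all hypothetical.

```python
import math


def rollout(p0, v0, accels, dt):
    """Integrate a 2D double integrator with piecewise-constant acceleration."""
    points = [p0]
    p, v = p0, v0
    for a in accels:
        p = (p[0] + v[0] * dt + 0.5 * a[0] * dt * dt,
             p[1] + v[1] * dt + 0.5 * a[1] * dt * dt)
        v = (v[0] + a[0] * dt, v[1] + a[1] * dt)
        points.append(p)
    return points


def is_feasible(points, accels, corridors, a_max, prior_trajs, min_dist):
    """Check the hard constraints for one UAV.

    corridors[k] = (x_min, x_max, y_min, y_max) is the box assigned to point k.
    prior_trajs holds the trajectories of UAVs optimized earlier in the sequence.
    """
    # Control input bounds (assumed per-axis).
    if any(abs(ax) > a_max or abs(ay) > a_max for ax, ay in accels):
        return False
    for k, (x, y) in enumerate(points):
        x_min, x_max, y_min, y_max = corridors[k]
        # Safe corridor constraint.
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            return False
        # Collision avoidance with previously optimized UAVs.
        if any(math.dist((x, y), other[k]) < min_dist for other in prior_trajs):
            return False
    return True
```

In a sequential scheme of this kind, the first UAV would be checked with an empty `prior_trajs`, the second against the trajectory of the first, and so on. A nonlinear solver would search over `accels` while a check of this form defines the feasible set.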