A Hybrid Search Behavior-Based Adaptive Grey Wolf Optimizer for Cooperative Path Planning for Multiple UAVs

Zheng, Zhiwen; Huang, Hao; Li, Chenbo; Yu, Yongbin; Wang, Xiangxiang; Cai, Jingye; Huang, Xi; Hu, Songbo

doi:10.3390/s25247657


by Zhiwen Zheng ¹, Hao Huang ², Chenbo Li ¹,*, Yongbin Yu ¹, Xiangxiang Wang ¹, Jingye Cai ¹, Xi Huang ³ and Songbo Hu ³

¹ School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China

² Electrical Engineering & Computer Sciences, University of California Berkeley, Berkeley, CA 94720, USA

³ Chengdu Pvirtech Technology Co., Ltd., Chengdu 610000, China

* Author to whom correspondence should be addressed.

Sensors 2025, 25(24), 7657; https://doi.org/10.3390/s25247657

Submission received: 27 October 2025 / Revised: 4 December 2025 / Accepted: 11 December 2025 / Published: 17 December 2025

(This article belongs to the Special Issue Intelligent Control and Robotic Technologies in Path Planning)


Abstract

Cooperative path planning of multiple unmanned aerial vehicles (UAVs) is pivotal for improving mission efficiency and safety in complex scenarios. However, the multiple constraints imposed on UAVs increase the design difficulty of cooperative path planning. To address this issue, a hybrid search behavior-based adaptive grey wolf optimizer (HSB-GWO) is proposed in this work. HSB-GWO incorporates three key innovations: (1) A dimension learning-based hunting (DLH) strategy is employed to enhance population diversity by enabling knowledge exchange between non-leader wolves and their neighbors. (2) Aquila exploration, which combines expand exploration for detecting globally promising regions with Lévy flight-based narrowed exploration, is adopted to enrich search behaviors and keep the population from becoming trapped in local optima. (3) An adaptive weight adjustment mechanism is designed for the leader wolves (α, β, and δ) to dynamically tune their contribution to offspring generation based on fitness, improving the utilization of high-quality solutions. The search performance of HSB-GWO was validated on the IEEE CEC 2017 and 2019 benchmark suites, in which HSB-GWO outperformed seven comparison algorithms (AO, AOA, CBOA, NOA, GWO, IGWO, and AGWO), with the Friedman test confirming its top overall rank (Rank 1). The cooperative path planning simulation results demonstrate that HSB-GWO can generate high-quality multi-UAV trajectories that guide UAVs from the start to the destination safely and smoothly with the smallest cost.

Keywords:

swarm intelligence algorithm; cooperative path planning; multiple UAVs; hybrid search; grey wolf optimizer; multi-constraint problem

1. Introduction

High maneuverability, low operational cost, and the ability to perform tasks in complex environments have led to the wide application of unmanned aerial vehicles (UAVs) in both civil and military fields, such as disaster rescue, environmental monitoring, and cooperative reconnaissance [1,2,3]. Among the various UAV-related technologies, cooperative path planning for multiple UAVs stands out as a core challenge, as it directly determines the efficiency, safety, and success rate of multi-UAV mission execution. The key difficulty of this task is designing the flight trajectories of multiple UAVs simultaneously. In addition to environmental constraints (maximum and minimum flight altitude) and threat avoidance (radar, missile, and artillery threats), collisions between UAVs and spatiotemporal coordination must also be considered [4,5,6,7,8,9]. As a result, the path design problem is complex.

The A* algorithm [10] and the Dijkstra algorithm [11], as traditional and classical path planning methods, are limited in handling high-dimensional and complex constraint spaces, especially when dealing with dynamic threats and multi-UAV collaboration [12,13,14]. Owing to their strong search capabilities in high-dimensional solution spaces and their flexibility in dealing with nonlinear constraints [15,16], metaheuristic algorithms have been widely adopted as promising solutions for optimization in various application scenarios [17,18,19]. The grey wolf optimizer (GWO), proposed by Mirjalili et al., is a representative metaheuristic inspired by the social hierarchy and cooperative hunting behavior of grey wolves [20]. It has been widely used in UAV path planning due to its simple structure and few control parameters [21,22,23]. However, the standard GWO still has inherent shortcomings: its search performance is highly dependent on the three leader wolves (α, β, and δ), which easily leads to premature convergence and trapping in local optima when facing complex optimization tasks (e.g., multi-UAV cooperative path planning with multiple threats and time–space constraints) [24,25].

To address the limitations of GWO, numerous improved versions have been proposed [26,27,28]. To mitigate the insufficient exploration and premature convergence of the original GWO, Zhu et al. [26] integrated chaotic maps into the position update process of GWO. Moreover, differential evolution (DE) combined with a fractal-based multi-scale search strategy was used to search at multiple levels of detail and thereby enhance the exploitation ability of GWO. In [27], Liu et al. integrated Gaussian mutation and dynamic trigonometric-function weights to differentiate the importance of the information provided by the three leaders. In addition, a spiral function was employed to perturb the position of the optimal individual, which prevented GWO from getting stuck in local optima. Because the traditional GWO relies on the guidance of the leader wolves (α, β, and δ) for population updates, the population tends to converge in the later stages of iteration, resulting in insufficient diversity and difficulty in escaping local optima. Zhou [28] directly integrated the crossover and mutation operators of the genetic algorithm (GA) into the GWO population update process, breaking population homogenization through gene recombination and random mutation and thereby alleviating premature convergence. Nevertheless, these improved algorithms still struggle to balance exploration and exploitation in complex solution spaces, such as hybrid and composition benchmark functions or multi-UAV path planning scenarios with dense threats. An imbalance between exploration and exploitation leads to inefficient search in either direction. Excessive exploration makes it difficult for individuals to search precisely within the region already found, which impedes the approach to the global optimum. Conversely, excessive exploitation leaves unexplored areas poorly searched, so the population falls into local optima, especially when the solution space contains multiple local optima [29,30]. In addition, existing algorithms for cooperative path planning often ignore the coupling between individual UAV trajectory optimization and global team coordination. For example, some algorithms focus on avoiding threats but may result in excessive flight distance and energy consumption, while others prioritize time coordination but cannot ensure collision avoidance between UAVs [31]. Therefore, there is an urgent need for a metaheuristic algorithm that can not only balance exploration and exploitation to handle complex optimization tasks, but also effectively integrate multiple UAV constraints (time coordination, spatial collision avoidance, threat avoidance) to generate high-quality cooperative trajectories.

Against this background, this study proposes a hybrid search behavior-based adaptive GWO (HSB-GWO) for multi-UAV cooperative path planning. The algorithm introduces dimension learning-based hunting (DLH) and Aquila exploration strategies to enrich the search behavior of the wolf pack, and adopts an adaptive weight adjustment mechanism for leader wolves to improve the utilization of high-quality solutions. By constructing a comprehensive cost model that integrates energy consumption, altitude constraints, threat avoidance, time coordination, and collision avoidance, HSB-GWO is expected to solve the multi-UAV cooperative path planning problem more effectively. The performance of HSB-GWO is verified through extensive experiments on IEEE CEC 2017 benchmark functions and a multi-UAV cooperative path planning scenario, providing a new efficient solution for multi-UAV path planning.

The remainder of this article is structured as follows. Section 2 introduces the modeling of cooperative path planning for multiple UAVs. Section 3 designs the proposed HSB-GWO. Section 4 reports the simulation results. The conclusion and future work are summarized in Section 5 and Section 6, respectively.

2. Modeling of Cooperative Path Planning for Multiple UAVs

This section introduces the model of the cooperative path planning task, including its various threats and constraints.

2.1. Terrain Modeling

Firstly, a three-dimensional geographic environment (planning space) is modeled by discretization. By dividing the planning space into cubic grids, the space can be reconstituted to numerous adjacent cubes of equal size. Based on the number of flight waypoints set, multiple waypoints are searched in the planning space in an orderly manner, connected from the starting point to the target point, forming a trajectory. The planning space is modeled by a two-dimensional matrix, where each element of the matrix represents the highest altitude of the corresponding cubic grids. Based on the above, the planning space O can be formed as

O = \begin{bmatrix} h_{1, 1} & h_{1, 2} & \dots & h_{1, M_{O}} \\ h_{2, 1} & h_{2, 2} & \dots & h_{2, M_{O}} \\ \vdots & \vdots & \ddots & \vdots \\ h_{N_{O}, 1} & h_{N_{O}, 2} & \dots & h_{N_{O}, M_{O}} \end{bmatrix},

(1)

in which $N_{O}$ and $M_{O}$ denote the size of the discretized planning space.
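As a rough illustration of how the elevation matrix of Equation (1) might be queried, the sketch below looks up the terrain height under a waypoint and checks that the waypoint lies above it. The function names, the unit grid-cell size, and the mapping of x to columns and y to rows are assumptions made for this example, not details given in the paper.

```python
from typing import List

def terrain_height(O: List[List[float]], x: float, y: float) -> float:
    """Highest terrain altitude of the grid cell containing (x, y).

    Assumes unit-sized cells, x indexing the M_O columns and y indexing
    the N_O rows; coordinates on the upper boundary are clamped.
    """
    row = min(int(y), len(O) - 1)
    col = min(int(x), len(O[0]) - 1)
    return O[row][col]

def is_above_terrain(O: List[List[float]], x: float, y: float, z: float) -> bool:
    """True if altitude z is at or above the terrain under (x, y)."""
    return z >= terrain_height(O, x, y)

# Toy 2 x 3 planning space (N_O = 2 rows, M_O = 3 columns).
O = [[10.0, 20.0, 15.0],
     [12.0, 35.0, 18.0]]
```

A call such as `is_above_terrain(O, 1.5, 1.2, 40.0)` then tests a waypoint against the cell whose stored height is 35.0.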

2.2. Constraint Design

2.2.1. Threat Model Design

The threat model has constraints such as maximum effective range and effective kill distance. A reasonable target cost function is established to introduce these constraint conditions into the target function to form radar, missile, anti-aircraft gun, and atmospheric threat models. The model definitions for different threats in this paper are summarized below.

Radar threat model

We define $r_{\max}^{R}$ as the maximum scanning radius of the radar, with the antenna performing a $360^{\circ}$ scan in azimuth. The detection probability of the radar for a UAV can be approximately expressed as

T_{R} (D_{R}) = \begin{cases} 0, & D_{\max}^{R} < D_{R} \\ \frac{1}{D_{R}^{4}}, & D_{\min}^{R} \leq D_{R} \leq D_{\max}^{R} \\ 1, & D_{R} < D_{\min}^{R} \end{cases},

(2)

in which $D_{R}$, $D_{\max}^{R}$, and $D_{\min}^{R}$ denote the distance between the UAV and the radar, the maximum radius of the radar detection area (the return signal will be too weak to recognize the UAV if $D_{\max}^{R} < D_{R}$), and the effective detection radius of the radar, respectively. $T_{R} (D_{R})$ is defined as the probability of radar threat.

Other threat models

In addition to radar threats, there are also missile, artillery, and meteorological threats. Since the mathematical models of these three threats are the same, they are uniformly defined as other threats. The probability of other threats can be formed as

T_{S} (D_{j}) = \begin{cases} 0, & D_{\max}^{S} < D_{j} \\ \frac{1}{D_{j}}, & D_{\min}^{S} \leq D_{j} \leq D_{\max}^{S} \\ 1, & D_{j} < D_{\min}^{S} \end{cases},

(3)

in which $D_{j} \in D_{S}$ ($D_{S} = \{D_{M}, D_{A}, D_{C}\}$) denotes the distance to a missile, artillery fire, or meteorological threat.
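The two piecewise threat probabilities in Equations (2) and (3) can be sketched as plain functions. This is only a minimal illustration; the function and parameter names are chosen here for readability and do not come from the paper.

```python
def radar_threat(d: float, d_min: float, d_max: float) -> float:
    """Radar threat probability T_R of Eq. (2) at distance d from the radar."""
    if d > d_max:          # outside the detection area
        return 0.0
    if d < d_min:          # inside the effective detection radius
        return 1.0
    return 1.0 / d ** 4    # probability decays with the fourth power of distance

def other_threat(d: float, d_min: float, d_max: float) -> float:
    """Missile / artillery / meteorological threat probability T_S of Eq. (3)."""
    if d > d_max:
        return 0.0
    if d < d_min:
        return 1.0
    return 1.0 / d
```

Both functions share the same three-zone structure and differ only in how fast the probability decays between the inner and outer radii.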

2.2.2. Collaborative Constraints

In addition to the constraints for a single UAV mentioned above, further consideration is needed for the constraints among UAVs to ensure the entire UAV formation can successfully complete the task. Therefore, two types of collaborative constraints will be introduced in the following.

Time collaborative constraint

For the m-th UAV ($m = 1, 2, \dots, M$), assuming that its trajectory length is $L_{m}$, the corresponding time constraint can be designed through the minimum and maximum arrival times as

t_{\min}^{m} \leq t_{m} \leq t_{\max}^{m}

(4)

where $t_{\min}^{m}$ and $t_{\max}^{m}$ are the shortest and longest arrival times for the m-th UAV, calculated from the corresponding maximum and minimum velocities as

\begin{cases} t_{\min}^{m} = \frac{L_{m}}{v_{\max}^{m}} \\ t_{\max}^{m} = \frac{L_{m}}{v_{\min}^{m}} \end{cases}

(5)

Space collaborative constraint

The space collaborative constraint, also known as the collision-free constraint, requires that the distance between UAVs is never less than the minimum safe flight distance. For the m-th UAV, the distance to the n-th UAV needs to satisfy

D_{m, n} > D_{K}, \quad n = 1, 2, \dots, m - 1

(6)

in which $D_{m, n}$ is the distance between the two UAVs and $D_{K}$ is defined as the minimum safe flight distance among the UAVs.
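A small sketch of the two collaborative constraints may help: the arrival-time window of Equations (4) and (5) and the separation check of Equation (6). The helper names are hypothetical, and the separation check is evaluated at a single time instant, which is a simplification of a check over the whole flight.

```python
import math
from typing import Sequence, Tuple

def arrival_window(length: float, v_min: float, v_max: float) -> Tuple[float, float]:
    """(t_min, t_max) of Eq. (5): fastest and slowest arrival times."""
    return length / v_max, length / v_min

def is_separated(p_m: Sequence[float],
                 others: Sequence[Sequence[float]],
                 d_k: float) -> bool:
    """Eq. (6): UAV m at position p_m keeps more than d_k from every other UAV."""
    return all(math.dist(p_m, p_n) > d_k for p_n in others)
```

For instance, a 100-unit trajectory flown at speeds between 10 and 20 units per second gives `arrival_window(100.0, 10.0, 20.0)`, i.e. a window from 5 s to 10 s.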

2.3. Cost Model for Trajectory Planning of UAVs

In the collaborative flight mission of multiple UAVs, trajectory evaluation needs to introduce indicators relating to time and space coordination on top of the single-UAV cost function. Thus, the cost function for the UAVs can be formed as

F = \sum_{m = 1}^{M} (ω_{1} f_{E, m} + ω_{2} f_{H, m} + ω_{3} f_{T, m} + ω_{4} f_{F, m} + ω_{5} f_{C, m})

(7)

in which $\{ω_{1}, \dots, ω_{5}\}$ are weight factors, and $f_{E, m}$, $f_{H, m}$, $f_{T, m}$, $f_{F, m}$, and $f_{C, m}$ are the energy consumption, altitude, threat, flight time, and collision costs for the m-th UAV, respectively.

For the energy consumption, $f_{E, m}$ can be computed by

f_{E, m} = p_{E} \sum_{d = 1}^{D_{m} - 1} l_{d, m},

(8)

where $p_{E}$ is defined as a proportional factor (the fuel consumption per kilometer), $D_{m}$ is the number of waypoints on the trajectory of the m-th UAV, and $l_{d, m}$ denotes the length of the d-th segment of the flight path, formed as

l_{d, m} = [{(x_{d + 1, m} - x_{d, m})}^{2} + {(y_{d + 1, m} - y_{d, m})}^{2} + {(z_{d + 1, m} - z_{d, m})}^{2}]^{1 / 2}

(9)

in which d indexes the waypoints and $(x_{d, m}, y_{d, m}, z_{d, m})$ is the spatial coordinate of the m-th UAV at the d-th waypoint.

For the altitude, $f_{H, m}$ is formed as

f_{H, m} = \sum_{d = 1}^{D_{m}} u_{d},

(10)

in which $u_{d}$ is the penalty function for the d-th waypoint of the m-th UAV, which can be described as

u_{d} = \begin{cases} p_{H_{1}} (z_{d, m} - H_{\max}), & \text{if } H_{\max} < z_{d, m} \\ 0, & \text{if } H_{\min} \leq z_{d, m} \leq H_{\max} \\ p_{H_{2}} (H_{\min} - z_{d, m}), & \text{if } 0 \leq z_{d, m} < H_{\min} \end{cases},

(11)

in which $H_{\min}$ and $H_{\max}$ are the minimum and maximum altitudes for UAV flight, and $p_{H_{1}}$ and $p_{H_{2}}$ are proportional factors.
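The energy and altitude terms of Equations (8) to (11) can be sketched for a single UAV as follows. Waypoints are represented as (x, y, z) tuples and the function names are illustrative choices, not the authors' implementation.

```python
import math
from typing import Sequence, Tuple

Waypoint = Tuple[float, float, float]

def energy_cost(waypoints: Sequence[Waypoint], p_e: float) -> float:
    """f_E of Eq. (8): p_E times the summed segment lengths of Eq. (9)."""
    return p_e * sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))

def altitude_cost(waypoints: Sequence[Waypoint], h_min: float, h_max: float,
                  p_h1: float, p_h2: float) -> float:
    """f_H of Eq. (10): sum of the per-waypoint penalties u_d of Eq. (11)."""
    total = 0.0
    for _, _, z in waypoints:
        if z > h_max:                      # flying too high
            total += p_h1 * (z - h_max)
        elif z < h_min:                    # flying too low
            total += p_h2 * (h_min - z)
    return total

path = [(0.0, 0.0, 5.0), (3.0, 4.0, 5.0), (3.0, 4.0, 12.0)]
```

With this toy path, the two segments have lengths 5 and 7, and all three waypoints fall outside an altitude band of [6, 10], so both terms are nonzero.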

The threat cost $f_{T, m}$ can be computed by

f_{T, m} = \sum_{d = 1}^{D_{m}} T_{R} (D_{R, d, m}) + \sum_{j = 1}^{3} (\sum_{d = 1}^{D_{m}} T_{S} (D_{j, d, m})),

(12)

where $D_{i, d, m} \in \{D_{R, d, m}, D_{S, d, m}\}$ can be calculated by

D_{i, d, m} = [{(x_{d, m} - x_{i})}^{2} + {(y_{d, m} - y_{i})}^{2} + {(z_{d, m} - z_{i})}^{2}]^{1 / 2}

(13)

in which $(x_{i}, y_{i}, z_{i})$ is defined as the center position coordinate of the radar or other threat.

According to the time collaborative constraint, the time cost $f_{F, m}$ can be modeled as

f_{F, m} = \begin{cases} 0, & \text{if } t_{\min}^{m} \leq t_{com} \leq t_{\max}^{m} \\ |t_{m} - t_{com}|, & \text{otherwise} \end{cases},

(14)

in which $t_{m}$ is the actual arrival time of the m-th UAV and $t_{com}$ is the instruction time. When the range of theoretical flight time $[t_{\min}^{m}, t_{\max}^{m}]$ for the m-th UAV does not include $t_{com}$, the m-th UAV is too fast or too slow to meet the time collaborative requirement.

Each UAV flies at a constant speed $v_{m}$, and the flight time $t_{m}$ can be calculated from the trajectory determined by the waypoints (Equation (9)) as

t_{m} = \frac{1}{v_{m}} \sum_{d = 1}^{D_{m} - 1} l_{d, m} .

(15)
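Equations (14) and (15) can be combined into a short sketch of the time cost. The path length is passed in directly (it would come from Equation (9)), and the function names and argument order are assumptions made for this example.

```python
def flight_time(path_length: float, v: float) -> float:
    """t_m of Eq. (15): constant-speed flight time over the whole trajectory."""
    return path_length / v

def time_cost(path_length: float, v: float,
              v_min: float, v_max: float, t_com: float) -> float:
    """f_F of Eq. (14): zero if t_com lies in [t_min, t_max], else |t_m - t_com|."""
    t_min = path_length / v_max   # Eq. (5)
    t_max = path_length / v_min
    if t_min <= t_com <= t_max:
        return 0.0
    return abs(flight_time(path_length, v) - t_com)
```

For a 100-unit path with speeds in [5, 20], the feasible window is [5, 20] seconds, so an instruction time of 12 s costs nothing while 25 s is penalized by its gap to the actual arrival time.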

The collision cost $f_{C, m}$ can be calculated based on the distance between the m-th UAV and the 1st to $(m - 1)$-th UAVs throughout the entire flight; a collision is considered to occur if Equation (6) is not satisfied.

Remark 1.

Considering that threat areas with spherical shapes are common in real environments, the aforementioned threat models are constructed. It is worth noting that threat modeling and HSB-GWO are independent, which means that different modeling methods will not affect the path planning algorithm. Therefore, different models and penalty functions can be designed according to actual needs.

Remark 2.

In addition to using weight factors to convert multiple indicators into a single fitness function (Equation (7)), constructing a multi-objective architecture is another strategy for designing the optimization task. Under the multi-objective architecture, the metaheuristic algorithm searches for the Pareto frontier composed of multiple indicators; no solution on this frontier dominates another, so all of them can be considered optimal solutions.

2.4. Optimization Task Mathematical Formulation

Based on the previous content, the mathematical formulation of the optimization problem for the collaborative path planning can be summarized as follows:

\begin{aligned} \min_{X} F (X) &= \sum_{m = 1}^{M} (ω_{1} f_{E, m} + ω_{2} f_{H, m} + ω_{3} f_{T, m} + ω_{4} f_{F, m} + ω_{5} f_{C, m}) \\ \text{s.t.} \quad & H_{\min} \leq z_{d, m} \leq H_{\max}, \; m \in [1, M], \; d \in [1, D_{m}] \\ & v_{\min} \leq v_{m} \leq v_{\max}, \; m \in [1, M] \\ & D_{m, n} > D_{K}, \; m \neq n, \; \{m, n\} \in [1, M] \\ & t_{\min}^{m} \leq t_{com} \leq t_{\max}^{m}, \; m \in [1, M] \\ & 0 \leq x_{d, m} \leq M_{O} \\ & 0 \leq y_{d, m} \leq N_{O} \\ & z_{d, m} \geq h_{n, m}, \; h_{n, m} \in O \end{aligned}

(16)

3. Hybrid Search Behavior-Based Adaptive Grey Wolf Optimizer

GWO, introduced by Mirjalili et al. [20], is inspired by the social hierarchy and cooperative hunting behavior of grey wolves. GWO emulates the leadership structure within a wolf pack, typically categorized into four levels, alpha (α), beta (β), delta (δ), and omega (ω), in which the α, β, and δ wolves, representing the fittest solutions, are regarded as leaders that guide the search process of the ω wolves that follow them. In [32], it is suggested that GWO is a variant of PSO (similar to SPSO-2011). Therefore, the shortcomings of GWO in search performance reflect common problems of PSO-based metaheuristic algorithms, and solving them is the focus of this work.

GWO mathematically models the strategies of encircling, hunting, and attacking prey. The positions of the search agents (wolves) are updated based on the perceived locations of the α, β, and δ wolves, simulating the collaborative effort of the pack to approach the optimal solution (prey). This mechanism is developed to balance the exploration of the search space and the exploitation of promising regions. However, the search performance of the whole population is strongly dependent on the three best-positioned wolves, which easily causes the population to fall into local optima and weakens search stability.

To overcome the weaknesses of the traditional GWO, this paper develops a novel algorithm named hybrid search behavior-based adaptive GWO (HSB-GWO). Dimension learning-based hunting (DLH) [33] and Lévy flight [34] are introduced into GWO to improve the balance between exploration and exploitation. In addition, an adaptive strategy is adopted to dynamically adjust the weights of the α, β, and δ wolves in offspring generation. The overall diagram of the HSB-GWO algorithm is depicted in Figure 1 and the pseudo-code is shown in Algorithm 1.

Algorithm 1: HSB-GWO

Remark 3.

It is worth noting that HSB-GWO performs fitness calculation twice per generation (see lines 13 and 32 of Algorithm 1). Therefore, the maximum number of generations G of HSB-GWO should be set to half that of the other algorithms so that the number of fitness evaluations (FEs) is equal for a fair comparison.

3.1. Individual Formulation

The cooperative path planning task for multiple UAVs is achieved by generating a certain number of waypoints on a three-dimensional map for each UAV. Therefore, each individual of HSB-GWO contains the coordinates of these waypoints. In addition, the search range of HSB-GWO for paths is determined by the size of the constructed 3D map (flight altitude range $[H_{\min}, H_{\max}]$ and map area $N_{O} \times M_{O}$). Figure 2 is an example of path planning in a two-dimensional map.

The data structure of each wolf for multi-UAV path planning is illustrated in Figure 3, where a sequence of waypoints starting from the initial node $Q_{m}$ ($m = 1, 2, \dots, M$) and terminating at the goal waypoint $G_{m}$ is adopted to record the spatial position information of each UAV for the calculation of $f_{E, m}$ and $f_{H, m}$. In addition, between $Q_{m}$ and $P_{m, D_{m}}$ ($D_{m} = D_{1}, D_{2}, \dots, D_{M} - 1$), state variables $S_{m, 0}$ (for $Q_{m}$) and $S_{m, D_{m}}$ (for $P_{m, D_{m}}$) are integrated into the information set to record constraint states (the number of collisions generated by the path segment formed by waypoints $\{Q_{m}, P_{m, 1}\}$ or $\{P_{m, D_{m}}, P_{m, D_{m} + 1}\}$, and the threat areas passed through) for the calculation of $f_{T, m}$ and $f_{C, m}$. Each UAV has an independently set fixed speed. Therefore, the speed $v_{m}$ is also recorded for the calculation of $f_{F, m}$.

According to the above content, HSB-GWO will generate waypoints and velocities for each UAV based on search strategies, then update the constraint state according to the coordinates of the waypoints and velocity, and finally record the relevant information in each individual for the calculation of the cost function, thereby quantifying the performance of the UAV under the cooperative path planning task.

3.2. Wolf Pack Dividing

Before the search in each iteration, the wolves are randomly divided into two subgroups $X^{S_{1}} (g)$ and $X^{S_{2}} (g)$ by

\begin{cases} X_{i} (g) \in X^{S_{1}} (g), & \text{if } rand > r_{E} \\ X_{i} (g) \in X^{S_{2}} (g), & \text{else} \end{cases},

(17)

in which $r_{E}$ is defined as the dividing probability, i denotes the i-th individual in the wolf pack ($i \leq N_{p}$), g is defined as the g-th iteration ($g \leq G$), and $rand$ is a random function that generates a disturbance within $[0, 1]$.
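The random split of Equation (17) amounts to one uniform draw per wolf. The sketch below returns index lists for the two subgroups; the function name and the use of an explicit random generator argument are choices made for this example only.

```python
import random
from typing import List, Tuple

def divide_pack(n_p: int, r_e: float, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Split wolf indices 0..n_p-1 into (S1, S2) following Eq. (17)."""
    s1: List[int] = []
    s2: List[int] = []
    for i in range(n_p):
        if rng.random() > r_e:   # rand > r_E -> first subgroup
            s1.append(i)
        else:                    # otherwise -> second subgroup
            s2.append(i)
    return s1, s2
```

With a dividing probability of 0.2, roughly 80% of the wolves are expected to land in the first subgroup on average, although the exact split varies from iteration to iteration.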

3.3. Adaptive Search

First, the encircling behavior is conducted to update the positions of the wolf pack, which can be modeled by

D = | C \circ X_{P} (g) - X (g) |

(18)

X^{AD} (g) = X_{P} (g) - A \circ D

(19)

where ∘ denotes the Hadamard product, $X_{P} (g)$ is defined as the position of the prey, $X (g)$ denotes the position of the wolf in the g-th generation, and A and C are random coefficient vectors: A is a control parameter that adjusts the trend of the individual position movement, and C is adopted to generate various distances from the population to the α, β, and δ wolves. The update laws of these two coefficient vectors are designed as

A = 2 a \cdot r_{1} - a

(20)

C = 2 \cdot r_{2}

(21)

where $r_{1}$ and $r_{2}$ are random vectors generated within $[0, 1]$, and a is defined as the attenuation coefficient, which decreases from 2 to 0 with increasing iterations.

Different from the linear decay adopted in the conventional GWO, the update law of a proposed in [35] is formed as

a = 1 + \tanh (2.5 (1 - \frac{2 g}{G}))

(22)

in which G denotes the maximum number of iterations for HSB-GWO. Figure 4 compares the decay curves of the linear method and Equation (22). Nonlinear decay can enrich the behavior of the wolf pack throughout the entire search, enabling it to exhibit some exploitation behavior in the early search stages to accelerate the convergence of the population. Similarly, some exploration behavior can still be exhibited in the later stages of the search, thereby reducing the possibility of falling into local optima.
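To make the difference between the two schedules concrete, the sketch below evaluates the conventional linear decay of a next to the tanh-based law of Equation (22). The linear form 2(1 - g/G) is the standard GWO schedule and is stated here as an assumption, since the paper does not write it out.

```python
import math

def a_linear(g: int, G: int) -> float:
    """Conventional GWO schedule (assumed form): a falls linearly from 2 to 0."""
    return 2.0 * (1.0 - g / G)

def a_tanh(g: int, G: int) -> float:
    """Nonlinear schedule of Eq. (22)."""
    return 1.0 + math.tanh(2.5 * (1.0 - 2.0 * g / G))

if __name__ == "__main__":
    G = 100
    for g in (0, 25, 50, 75, 100):
        print(g, round(a_linear(g, G), 4), round(a_tanh(g, G), 4))
```

Both schedules equal 1 at the midpoint g = G/2. The tanh law stays close to 2 at the start and close to 0 at the end without reaching either bound exactly, and changes fastest around the midpoint.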

The hunting stage is conducted after the positions of the leader wolves (α, β, and δ wolves) are updated. In wolf pack hunting, the leader wolves are the closest to the prey. Therefore, considering that the α, β, and δ wolves have better knowledge of the location of the global optimum, the other wolves are guided by them. The hunting behavior can be mathematically modeled by

D_{α} = | C_{1} \circ X_{α} (g) - X (g) |

(23)

D_{β} = | C_{2} \circ X_{β} (g) - X (g) |

(24)

D_{δ} = | C_{3} \circ X_{δ} (g) - X (g) |

(25)

where $C_{1}$, $C_{2}$, and $C_{3}$ are computed by Equation (21) and $| \cdot |$ denotes the absolute value. Substituting $D_{α}$, $D_{β}$, and $D_{δ}$ into Equation (19) yields

X_{i_{1}} (g) = X_{α} (g) - A_{i_{1}} \circ D_{α} (g)

(26)

X_{i_{2}} (g) = X_{β} (g) - A_{i_{2}} \circ D_{β} (g)

(27)

X_{i_{3}} (g) = X_{δ} (g) - A_{i_{3}} \circ D_{δ} (g)

(28)

in which $X_{α} (g)$, $X_{β} (g)$, and $X_{δ} (g)$ are the top three solutions at the g-th generation, and $A_{i_{1}}$, $A_{i_{2}}$, and $A_{i_{3}}$ are calculated by Equation (20).

In the conventional GWO, a new solution $X^{AD} (g)$ is generated by calculating the mean of $X_{i_{1}} (g)$, $X_{i_{2}} (g)$, and $X_{i_{3}} (g)$, which can be written as

X^{AD} (g) = \frac{1}{3} (X_{i_{1}} (g) + X_{i_{2}} (g) + X_{i_{3}} (g)) .

(29)

However, since the three leaders differ in fitness, their contributions to the generation of a new solution should also differ. For the current best solution $X_{α} (g)$, the importance of its $X_{i_{1}} (g)$ should be the highest. Therefore, Equation (29) is rewritten as follows:

X^{AD} (g) = (F_{i_{1}} (g) X_{i_{1}} (g) + F_{i_{2}} (g) X_{i_{2}} (g) + F_{i_{3}} (g) X_{i_{3}} (g)) {(F_{i_{1}} (g) + F_{i_{2}} (g) + F_{i_{3}} (g))}^{- 1},

(30)

where $F_{i_{1}} (g)$, $F_{i_{2}} (g)$, and $F_{i_{3}} (g)$ are the fitness of $X_{α} (g)$, $X_{β} (g)$, and $X_{δ} (g)$. F represents the total cost generated during the flight process, in which $f_{E, m} > 0$ always holds and the minimum values of the other elements ($f_{H, m}$, $f_{T, m}$, $f_{F, m}$, and $f_{C, m}$) are 0. Therefore, $F > 0$ always holds, so the normalization in Equation (30) is well defined.
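The weighted combination of Equation (30) can be sketched as a normalized weighted average of the three candidate positions. The function below accepts any positive weights; how the fitness values are mapped to weights (raw cost or a transformed score) is left open here, because the text does not spell it out. The function name is hypothetical.

```python
from typing import List, Sequence

def weighted_offspring(candidates: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Normalized weighted average of the candidate positions, as in Eq. (30).

    candidates: X_i1, X_i2, X_i3 (one position vector per leader).
    weights:    F_i1, F_i2, F_i3, assumed strictly positive.
    """
    total = sum(weights)
    dim = len(candidates[0])
    return [sum(w * x[d] for w, x in zip(weights, candidates)) / total
            for d in range(dim)]
```

With equal weights the function reduces to the plain mean of Equation (29), which is a convenient sanity check.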

3.4. Aquila Exploration

Aquila exploration as a hybrid search method is proposed by Ma et al. [24], in which two search strategies are introduced to prevent wolves from falling into a local optimal solution. Moreover, hybrid search modes combined with the aforementioned adaptive search method can further enrich the diversity of wolves to improve search ability and stability.

3.4.1. Expand Exploration

This search strategy acts as an investigator that locates potential areas in the solution space, which can be formed as

X^{Aquila} (g) = X_{α} (g) \cdot (1 - g / N_{g}) + (\bar{X} (g) - X_{α} (g) \circ rand),

(31)

where $X^{Aquila} (g)$ is defined as a new position for the wolf, $1 - g / N_{g}$ is adopted as the linear decay factor to control search behavior, and $\bar{X} (g)$ is the average of the positions of all wolves participating in the expand exploration, which can be calculated by

\bar{X} (g) = \frac{1}{N_{E}} \sum_{i = 1}^{N_{E}} X_{i} (g),

(32)

in which $N_{E}$ denotes the number of wolves in expand exploration.
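A minimal sketch of the expand exploration step in Equations (31) and (32) is given below. It assumes that $N_{g}$ is the maximum number of iterations and that $rand$ is drawn independently for each dimension; both are assumptions of this example, as is the function name.

```python
import random
from typing import List, Sequence

def expand_exploration(x_alpha: Sequence[float],
                       group: Sequence[Sequence[float]],
                       g: int, n_g: int, rng: random.Random) -> List[float]:
    """New position from Eq. (31) for the wolves in the expand-exploration group."""
    n_e = len(group)
    dim = len(x_alpha)
    # Eq. (32): mean position of the participating wolves.
    mean = [sum(w[d] for w in group) / n_e for d in range(dim)]
    decay = 1.0 - g / n_g          # linear decay factor
    return [x_alpha[d] * decay + (mean[d] - x_alpha[d] * rng.random())
            for d in range(dim)]
```

When the leader sits at the origin, the update collapses to the group mean, which makes the role of the second term easy to see.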

3.4.2. Narrowed Exploration

This search mode imitates the hunting behavior of Aquila after discovering prey: wolves are expected to perform local search while retaining the ability to escape from local optima. Therefore, Lévy flight, whose variable step size follows a heavy-tailed distribution, is introduced. The new position updated by this search mode can be formed as follows:

X^{Aquila} (g) = X_{α} (g) \circ Levy (D),

(33)

where $Levy (D)$ is calculated by

Levy (D) = c S (ν),

(34)

in which c represents a fixed constant (0.01) and the mathematical formula of $S (ν)$ is written as

S (ν) = \frac{1}{π} \int_{0}^{\infty} e^{- ε q^{ϕ}} \cos (q ν) \, d q, \quad 0.3 \leq ϕ \leq 1.99,

(35)

which can be approximated and computed by the Mantegna method [36] as

S (ν) = \frac{u}{{| v |}^{1 / ϕ}},

(36)

u \sim N (0, σ_{u}^{2}), \quad v \sim N (0, σ_{v}^{2}),

(37)

in which ϕ is a fixed constant set to 1.5 and u is drawn from a normal distribution with variance $σ_{u}^{2}$, where $σ_{u}$ is computed by

σ_{u} = {(\frac{Γ (1 + ϕ) \sin (π ϕ / 2)}{Γ (\frac{1 + ϕ}{2}) ϕ \cdot 2^{(ϕ - 1) / 2}})}^{1 / ϕ} .

(38)

Similarly, v is drawn from a normal distribution with variance $σ_{v}^{2} = 1$. $Γ (ϕ)$ is the gamma function, which can be computed according to Weierstrass's definition as

Γ (ϕ) = \frac{e^{- γ ϕ}}{ϕ} \prod_{n = 1}^{N} {(1 + \frac{ϕ}{n})}^{- 1} e^{ϕ / n}, \quad \text{s.t.} \; {(1 + \frac{ϕ}{n})}^{- 1} e^{ϕ / n} < 10^{- 12}

(39)

where γ is the Euler–Mascheroni constant, 0.577216.
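The Mantegna approximation of Equations (34) and (36) to (38) is short enough to sketch directly. This example uses the gamma function from Python's standard library instead of the Weierstrass product of Equation (39), which is a substitution made for brevity; the function names are illustrative.

```python
import math
import random
from typing import List, Sequence

def sigma_u(phi: float) -> float:
    """Standard deviation of u from Eq. (38)."""
    num = math.gamma(1.0 + phi) * math.sin(math.pi * phi / 2.0)
    den = math.gamma((1.0 + phi) / 2.0) * phi * 2.0 ** ((phi - 1.0) / 2.0)
    return (num / den) ** (1.0 / phi)

def levy_step(dim: int, rng: random.Random,
              phi: float = 1.5, c: float = 0.01) -> List[float]:
    """Levy(D) of Eq. (34): c * u / |v|^(1/phi), sampled per dimension."""
    s_u = sigma_u(phi)
    return [c * rng.gauss(0.0, s_u) / abs(rng.gauss(0.0, 1.0)) ** (1.0 / phi)
            for _ in range(dim)]

def narrowed_exploration(x_alpha: Sequence[float], rng: random.Random) -> List[float]:
    """Eq. (33): element-wise product of the alpha position and a Levy step."""
    step = levy_step(len(x_alpha), rng)
    return [x * s for x, s in zip(x_alpha, step)]
```

For ϕ = 1.5 the scale $σ_{u}$ evaluates to roughly 0.6966. Most sampled steps are small, with occasional large jumps caused by draws of v close to zero, which is the heavy-tailed behavior the text relies on.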

3.5. Dimension Learning-Based Hunting (DLH)

In the conventional GWO, unstable convergence is unavoidable because the generation of new individuals relies on the three leader wolves. Moreover, individual hunting, another interesting social behavior of the non-leader members, is neglected, although it offers information that could enhance search performance. Therefore, DLH is adopted to overcome these issues; it is an effective strategy that lets each wolf learn from its neighbors.

For the new position of each wolf, its d-th dimension $X_{i, d}^{DLH} (g)$ can be calculated by

X_{i, d}^{DLH} (g) = X_{i, d} (g) + rand \cdot (X_{i, d}^{N} (g) - X_{r, d} (g)),

(40)

in which $X_{i, d} (g)$ is the d-th dimension of the position of the i-th wolf, $X_{r, d} (g)$ is the d-th dimension of a wolf $X_{r} (g)$ selected randomly from the current population $X (g)$, and $X_{i, d}^{N} (g)$ is defined as the d-th dimension of a composite individual $X_{i}^{N} (g)$ combined from the neighbor group $N_{i} (g)$. The neighbors of each wolf $X_{i} (g)$, represented by $N_{i} (g)$, are constructed as follows:

N_{i} (g) = \{X_{j} (g) \mid D_{i} (X_{i} (g), X_{j} (g)) \leq R_{i} (g), X_{j} (g) \in X (g)\},

(41)

in which $D_{i} (X_{i} (g), X_{j} (g))$ is the Euclidean distance between $X_{i} (g)$ and $X_{j} (g)$. Similarly, the radius $R_{i} (g)$ is a Euclidean distance that can be computed by

R_{i} (g) = ∥X_{i} (g) - X_{i}^{'} (g)∥

(42)

where $X_{i} (g)$ is the current position of the i-th wolf and $X_{i}^{'} (g)$ denotes the new position of the wolf updated by the two search strategies designed in Section 3.3 and Section 3.4.
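The DLH update of Equations (40) to (42) can be sketched as follows. The text does not specify how the composite individual is assembled from the neighbor group, so this example assumes that each dimension is taken from a neighbor picked at random, as in the original DLH formulation; the function names are hypothetical.

```python
import math
import random
from typing import List, Sequence

def neighbors(pop: Sequence[Sequence[float]], i: int,
              candidate: Sequence[float]) -> List[int]:
    """Indices of wolves within radius R_i (Eq. (42)) of wolf i (Eq. (41))."""
    radius = math.dist(pop[i], candidate)
    return [j for j, x in enumerate(pop) if math.dist(pop[i], x) <= radius]

def dlh_position(pop: Sequence[Sequence[float]], i: int,
                 candidate: Sequence[float], rng: random.Random) -> List[float]:
    """Dimension-wise DLH candidate of Eq. (40) for wolf i."""
    nbrs = neighbors(pop, i, candidate)     # never empty: wolf i is its own neighbor
    new = []
    for d in range(len(pop[i])):
        x_n = pop[rng.choice(nbrs)]         # assumed: random neighbor per dimension
        x_r = pop[rng.randrange(len(pop))]  # random wolf from the whole population
        new.append(pop[i][d] + rng.random() * (x_n[d] - x_r[d]))
    return new
```

Here `candidate` plays the role of $X_{i}^{'} (g)$, the position proposed by the adaptive search or Aquila exploration, so a larger proposed move widens the neighborhood that wolf i learns from.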

4. Simulations

In this section, the search performance of the proposed HSB-GWO algorithm is evaluated on several test functions and in a simulation of cooperative path planning for UAVs. All simulations were conducted on a computer with an Intel Core(TM) i9-13900HX 2.2 GHz CPU (Intel, Santa Clara, CA, USA) and 64.00 GB of RAM, using Matlab R2023b.

4.1. Control Parameter Analysis

The search performance of the proposed HSB-GWO under different control parameter settings was evaluated on the IEEE Congress on Evolutionary Computation 2017 (IEEE CEC 2017) benchmark suite, which consists of 29 test functions [37]. This test suite combines unimodal (F1, F3), multimodal (F4–F10), hybrid (F11–F20), and composition (F21–F30) functions. All test functions were run in three dimensions, $D = 10$, $D = 30$, and $D = 50$, with 10 independent runs each. The number of fitness evaluations (FEs) was set to $D \times 10^{4}$ for fair comparison. The population size is $N_{p} = 100$.

The Friedman test [38] was used to rank HSB-GWO under seven settings of the control parameter $r_{E}$ based on the obtained fitness. As shown in Table 1, for the F1 and F3 group (unimodal functions), the mean ranks of HSB-GWO with $r_{E} = 0.2$ are 1.00 ($D = 10$), 2.00 ($D = 30$), and 1.50 ($D = 50$), which are relatively low in comparison to other $r_{E}$ settings. In the F4–F10 group (multimodal functions), the mean ranks for $r_{E} = 0.2$ are 1.57 ($D = 10$), 1.43 ($D = 30$), and 2.29 ($D = 50$), exhibiting superior performance over most other $r_{E}$ settings. Regarding the F11–F20 group (hybrid functions), the mean ranks with $r_{E} = 0.2$ are 3.20 ($D = 10$), 3.30 ($D = 30$), and 2.40 ($D = 50$), demonstrating competitiveness. In the F21–F30 group (composition functions), the mean ranks for $r_{E} = 0.2$ are 3.30 ($D = 10$), 2.20 ($D = 30$), and 3.20 ($D = 50$), indicating favorable results.

In terms of the overall mean ranks across all function groups, HSB-GWO with $r_{E} = 0.2$ obtained the lowest mean rank in every dimension (2.69 for $D = 10$, 2.38 for $D = 30$, and 2.59 for $D = 50$) among all $r_{E}$ configurations. A lower mean rank signifies better algorithm performance. Moreover, considering the complexity of the solution space in cooperative path planning for multiple UAVs, $r_{E} = 0.2$, which performs well on different types of benchmark functions, is chosen for the subsequent experiments. Because the control parameters determine the search behavior of HSB-GWO, the performance of the algorithm on different search tasks can be improved by adjusting them.
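The mean ranks discussed above can be reproduced in principle with a short routine: rank the settings on each test function (1 = lowest error, ties sharing the average rank), then average the ranks over the functions of a group. The following sketch uses made-up error values and hypothetical helper names (`average_ranks`, `mean_ranks`); it only illustrates how Friedman-style mean ranks are obtained and does not use data from Table 1.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = best); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mean_ranks(errors_by_function):
    """errors_by_function[f][s] is the error of setting s on function f."""
    per_function = [average_ranks(row) for row in errors_by_function]
    n_settings = len(errors_by_function[0])
    return [sum(r[s] for r in per_function) / len(per_function)
            for s in range(n_settings)]


# Hypothetical errors: two functions (rows), three settings (columns).
demo = [[0.1, 0.5, 0.5],
        [0.2, 0.1, 0.9]]
print(mean_ranks(demo))  # [1.5, 1.75, 2.75]
```

Averaging over only two functions yields multiples of 0.5, which would be consistent with the values reported for the F1, F3 group in Table 1.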

Remark 4.

In addition to finding the optimal configuration of control parameters through the above experiments, a variety of tuners, such as CRS-Tuning [39], F-Race [40], REVAC [41], and ParamILS [42], can also be adopted.

4.2. Search Performance Comparison Experiment on Benchmark Functions

To validate the search performance of HSB-GWO, seven metaheuristic algorithms are adopted for comparison: AO [43], AOA [44], CBOA [45], NOA [46], GWO [20], IGWO [25], and AGWO [24]. All control parameters of the seven comparative algorithms follow the settings recommended in their original works. Because the algorithms perform different numbers of fitness evaluations per iteration, the comparison is based on a maximum number of fitness evaluations (MaxFEs) rather than on iterations, with MaxFEs set to $D \times 10^{4}$. The population size is $N_{p} = 100$. Two benchmark function sets (IEEE CEC 2017 and 2019) were adopted to verify the search performance of the proposed HSB-GWO.
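One common way to enforce such an evaluation budget independently of how many evaluations an algorithm spends per iteration is to count calls to the objective itself. The wrapper below is a minimal sketch of that idea; the class name `BudgetedObjective` and the sphere function used as a stand-in objective are assumptions for illustration and are not taken from the paper.

```python
class BudgetedObjective:
    """Wrap an objective and count every call against MaxFEs."""

    def __init__(self, func, dim, fes_per_dim=10_000):
        self.func = func
        self.max_fes = dim * fes_per_dim  # MaxFEs = D x 10^4
        self.fes = 0

    @property
    def exhausted(self):
        return self.fes >= self.max_fes

    def __call__(self, x):
        if self.exhausted:
            raise RuntimeError("fitness evaluation budget exhausted")
        self.fes += 1
        return self.func(x)


def sphere(x):
    """Hypothetical stand-in objective (not a CEC function)."""
    return sum(v * v for v in x)


objective = BudgetedObjective(sphere, dim=10)
value = objective([1.0] * 10)
print(objective.max_fes, objective.fes, value)  # 100000 1 10.0
```

An optimizer loop would then run `while not objective.exhausted:` so that every algorithm stops after the same number of evaluations, whatever its per-iteration cost.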

The mean fitness errors (MFEs) representing the difference between the best fitness and the global optimum are summarized in Table 2, Table 3, Table 4, Table 5, Table 6 and Table 7. Moreover, standard deviations (STD) of fitness errors were also reported to measure the search performance of each algorithm. “w/t/l” denotes the number of wins (w), ties (t), and losses (l).
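As a small illustration of these summary statistics, the sketch below computes the mean and standard deviation of the fitness errors over a set of runs and tallies wins, ties, and losses between two algorithms. The numbers are invented, the use of the sample standard deviation is an assumption, and deciding a tie by equality of the MFEs (within a tolerance) is also an assumption; the paper may decide wins and ties with a statistical test instead.

```python
import statistics


def error_stats(best_fitness_per_run, global_optimum):
    """Return (MFE, STD) of the errors best_fitness - global_optimum."""
    errors = [f - global_optimum for f in best_fitness_per_run]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.mean(errors), statistics.stdev(errors)


def win_tie_loss(own_mfes, rival_mfes, tol=0.0):
    """Count functions where the own MFE is lower / equal / higher."""
    wins = ties = losses = 0
    for own, rival in zip(own_mfes, rival_mfes):
        if abs(own - rival) <= tol:
            ties += 1
        elif own < rival:
            wins += 1
        else:
            losses += 1
    return wins, ties, losses


# Hypothetical best fitness values of three runs on a function whose
# global optimum is 100.
mfe, std = error_stats([102.0, 101.0, 103.0], global_optimum=100.0)
print(mfe, std)  # 2.0 1.0
# Hypothetical MFEs of two algorithms on three functions.
print(win_tie_loss([0.0, 1.0, 5.0], [0.0, 2.0, 3.0]))  # (1, 1, 1)
```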

4.2.1. Results of IEEE CEC 2017

Exploitation and exploration ability analysis

F1 and F3, as unimodal test functions, are well suited for verifying the exploitation capability needed to locate the optimal solution. The results in Table 2 demonstrate that the proposed HSB-GWO algorithm yields highly competitive results on unimodal test functions. Notably, it significantly improved the results on F3 across all dimensions ($D = 10, 30, 50$) relative to the comparative algorithms. Hence, the effective exploitation ability of HSB-GWO in the region around the optimal solution is demonstrated.

Multimodal functions (F4–F10), which possess numerous local minima, can be utilized to test both the exploration ability and the local optimum avoidance capability of HSB-GWO. According to the results presented in Table 3, HSB-GWO provides superior results on the multimodal functions F4–F7 and F9 for all dimensions ($D = 10, 30, 50$), which indicates that the proposed HSB-GWO algorithm is competitive in terms of exploration.

Search performance analysis on complex optimization tasks

Optimization of the hybrid (F11–F20) and composition (F21–F30) functions, which possess complex solution spaces, requires strong and stable search performance. Thus, these test functions evaluate how well HSB-GWO balances exploitation and exploration.

The results summarized in Table 4 indicate that HSB-GWO achieves the best results on all hybrid functions across the three dimensions ($D = 10, 30, 50$) except F13 and F16. Although IGWO achieved better results on F13 with $D = 10$ and on F16 with $D = 30, 50$, the overall search performance of HSB-GWO is superior. Furthermore, Table 5 presents the solutions obtained by HSB-GWO and the other algorithms on the composition functions (F21–F30). HSB-GWO won or tied against all seven other algorithms on F22–F24, F26, F29, and F30 with $D = 10, 30, 50$. Although HSB-GWO lost to AO on F21 with $D = 10$ and to IGWO on F25 with $D = 10$, F27 with $D = 30$, and F28 with $D = 10, 50$, its MFEs on these functions are close to the best. Table 6 summarizes the overall effectiveness of the eight algorithms. HSB-GWO wins or draws against the seven comparative algorithms in 82.76% of the optimization tasks. These results indicate that the proposed algorithm strikes a favorable balance between exploration and exploitation and searches effectively on complex optimization tasks.

4.2.2. Results of IEEE CEC 2019

To further demonstrate the performance of the proposed HSB-GWO, the benchmark functions of IEEE CEC 2019 were also adopted. The results are summarized in Table 7. For F1, both HSB-GWO and CBOA found the optimal solution in all trials. Although AGWO and CBOA obtained the best results on F2 and F3, respectively, HSB-GWO performed competitively, ranking second on both F2 and F3. For F4–F10, HSB-GWO outperformed the other seven algorithms, demonstrating a strong search ability. The rank results based on the Wilcoxon signed-rank test further support the superiority of HSB-GWO, which achieved seven wins, one tie, and two losses on the IEEE CEC 2019 benchmark suite.

Remark 5.

In addition to verifying the balance of exploitation and exploration behaviors through the experimental results, it can also be measured indirectly [47,48] (diversity-based, entropy-based, fitness-based) or directly [49] (attraction basin-based).

4.2.3. Statistical Analysis by Non-Parametric Friedman Test

To comprehensively evaluate the performance of HSB-GWO against the other algorithms, the Friedman test described in [38] is adopted, and the results are presented in Table 8. First, according to the “Overall Rank”, HSB-GWO achieved the top rank (rank 1) in all dimensions ($D = 10, 30, 50$). In contrast, the other algorithms have worse overall rankings or perform inconsistently across dimensions. According to the “Avg. Rank”, HSB-GWO has the lowest average rank among all algorithms for each dimension: 1.66 for $D = 10$, 1.17 for $D = 30$, and 1.14 for $D = 50$. Lower average ranks signify better overall performance in the Friedman test, which evaluates the relative performance of the algorithms on the 29 test functions. Moreover, on the individual test functions (F1, F3–F30), HSB-GWO consistently delivers the best or near-best results across dimensions. For example, HSB-GWO obtained the lowest scores among all algorithms on F3–F7, F9, F11–F15, F22, F26, and F29 in all three dimensions ($D = 10, 30, 50$). On other test functions such as F10 ($D = 10, 30, 50$), F17 ($D = 30$), and F23 ($D = 30$), HSB-GWO also achieved rankings close to the best. In summary, the Friedman test results show that HSB-GWO outperformed the comparative algorithms in terms of both exploration and exploitation, exhibiting better overall search performance.

The Friedman test results of the eight algorithms on IEEE CEC 2019 are shown in Table 9. HSB-GWO achieved the best performance, with an average rank of only 1.17, indicating that it ranked at or near the top on most test functions. For example, HSB-GWO obtained the lowest scores on F4, F5, F6, F7, F9, and F10. In contrast, the other algorithms obtained worse or inconsistent average ranks on the benchmark functions. For example, AOA received an average rank of 2.8 on F1 but 7.05 on F2 and F6, which demonstrates its unstable search performance.

Moreover, Figure 5 shows the critical difference (CD) results for the Friedman tests (Table 8 and Table 9). For $D = 10$ of IEEE CEC 2017, HSB-GWO performs significantly better than all other algorithms except IGWO. For $D = 30$ and $D = 50$ of IEEE CEC 2017, the superiority of HSB-GWO in search performance is statistically significant. Figure 5d suggests that HSB-GWO significantly outperforms AO, AGWO, AOA, and NOA on IEEE CEC 2019.
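The text does not state how the critical difference is computed. If, as an assumption, the CD diagrams follow the Nemenyi post-hoc test that commonly accompanies the Friedman test, the critical difference for $k$ algorithms compared on $n$ functions is $CD = q_{\alpha}\sqrt{k(k+1)/(6n)}$, and two algorithms differ significantly when their average ranks differ by more than $CD$. The sketch below evaluates this expression for $k = 8$; the critical value $q_{0.05} = 3.031$ is the commonly tabulated Nemenyi value for eight groups, and both the significance level and the choice of test are assumptions, not details confirmed by the paper.

```python
import math

# Assumption: Nemenyi post-hoc test at alpha = 0.05 with k = 8 algorithms.
Q_005_K8 = 3.031


def critical_difference(k, n, q_alpha=Q_005_K8):
    """CD = q_alpha * sqrt(k (k + 1) / (6 n)) for k algorithms, n functions."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


cd_2017 = critical_difference(k=8, n=29)  # 29 CEC 2017 functions
cd_2019 = critical_difference(k=8, n=10)  # 10 CEC 2019 functions
print(round(cd_2017, 2), round(cd_2019, 2))  # 1.95 3.32
```

Under these assumptions the threshold is much larger for the ten-function CEC 2019 suite, which would explain why fewer pairwise differences reach significance there than on CEC 2017.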

4.3. Cooperative Path Planning for Multiple UAVs

Eight algorithms were applied to two cooperative path planning tasks for multiple UAVs. For all algorithms, 10 trials were conducted, the maximum number of fitness evaluations (MaxFEs) was set to 2400, and the population size was $N_{p} = 60$. The parameter settings and threat designs for the cooperative path planning simulations are summarized in Table 10, Table 11, Table 12 and Table 13.

4.3.1. Task 1

Table 14 summarizes the fitness results of the eight algorithms over 10 trials each in Task 1 of the cooperative path planning of multiple UAVs. HSB-GWO achieved the lowest mean value of 30.81 among all algorithms, which indicates its better path planning performance, since a lower fitness implies a path with a shorter distance or lower cost. The p-values show that the difference between HSB-GWO and each of the compared algorithms in this task is statistically significant. Therefore, the superiority of HSB-GWO is demonstrated. The distances, flight time, velocity, and collisions of each UAV in the best trial obtained by each algorithm are provided in Table 15.

Figure 6 presents the 3D flight trajectory maps of multiple UAVs planned by each algorithm, where the trajectory of the best trial of each algorithm (the trial boxed in Table 14 with the lowest fitness value) is selected for representation. Flying in as straight a line as possible from the starting points to the target points is a potentially optimal path, because the shorter flight distance decreases the flight time and saves energy. However, this path passes through multiple threat areas, and the algorithms need to guide the UAVs away from the threat areas as much as possible while satisfying the constraints of time and space coordination, which increases the search difficulty. AOA, CBOA, and NOA therefore tended to select paths that bypassed threat-dense regions, which reduced the extra cost incurred by traversing threat zones. Nevertheless, such paths increased flight distance and energy consumption, rendering them locally optimal solutions. The path obtained by AO is a trade-off solution, which shortens the path in the vicinity of threat-dense regions by traversing some threat areas. However, as indicated by its fitness, this path also corresponds to a locally optimal solution. GWO, IGWO, AGWO, and HSB-GWO discovered paths with similar characteristics, yet a comparison of their fitness shows that the path determined by HSB-GWO incurs the lowest cost for the multiple UAVs. Three views and a 3D graph of the best trial result for HSB-GWO are shown in Figure 7.

Mean fitness convergences of the eight algorithms are shown in Figure 8. The curves show that HSB-GWO quickly converges to a local optimal solution in the early stages of iteration and gradually approaches the global optimal solution in the subsequent search. The curves of AOA and NOA show convergence stagnation during the search process, indicating that their populations were trapped in a local optimal solution. For the other comparative algorithms, the fitness stagnated in the middle stage of iteration and converged again in the later stage. These similar convergence characteristics are caused by their hybrid search strategies, which dynamically adjust the exploitation and exploration search behavior. Although these strategies can help the population escape from local optima, they do not solve the problem of convergence stagnation.

4.3.2. Task 2

For Task 2, the fitness results are summarized in Table 16. HSB-GWO achieved the lowest mean value of 2.11 among all algorithms, indicating the best path planning performance. The p-values verify that the difference between HSB-GWO and each of the comparative algorithms is statistically significant. Therefore, the superiority of HSB-GWO is demonstrated. The distances, flight time, velocity, and collisions of each UAV in the best trial of each algorithm are provided in Table 17.

Figure 9 presents the best flight trajectories of multiple UAVs planned by each algorithm. In Task 2, the UAVs need to depart from the initial points and bypass the threat zones to reach the goal positions. Because the UAVs at the top of the queue ($z_{0} = 15$) must reach target points at the bottom ($z_{G} = 10$), while the UAVs departing from below ($z_{0} = 10$) must reach endpoints above ($z_{G} = 15$), they need to avoid collisions as much as possible while exchanging queue positions, which increases the difficulty of the path planning.

The eight algorithms adopted different strategies. For example, AO reduced collisions by exchanging the positions of the UAVs right at the starting positions. AOA and CBOA generated similar paths that increased the difference in flight distance between the upper and lower UAVs, reducing the possibility of collision through the resulting time difference. However, both strategies yield locally optimal paths because of the increased flight distances and energy consumption. The other five algorithms adopted the same strategy, which is to exchange positions while moving towards the target positions. Among them, GWO and IGWO designed conservative routes in which some UAVs take a detour to decrease the possibility of collisions. Conversely, AGWO and HSB-GWO achieved coordinated flight by setting different flight velocities for the UAVs. Three views and a 3D graph of the best trial result for HSB-GWO are shown in Figure 10.

Mean fitness convergences of the eight algorithms are shown in Figure 11. HSB-GWO exhibited remarkable convergence behavior: it quickly reduced its mean fitness in the early stage of optimization and had converged to the lowest fitness when the search finished. It is worth noting that AGWO, IGWO, and GWO converged slowly for a period of time during the search. Although these algorithms ultimately found better solutions, this slow phase indicates a shortcoming in their search. Although AOA, CBOA, and NOA achieved a rapid decrease in average fitness in the early stages of the search, they fell into local optima after $FEs = 400$.

4.4. Architecture Effectiveness Analysis of HSB-GWO

An ablation simulation was conducted to verify the architecture effectiveness of the proposed HSB-GWO. The simulation configuration is the same as in Section 4.3, and three ablation algorithms are designed: HSB-GWO without DLH (A-1), HSB-GWO without Aquila exploration (A-2), and HSB-GWO without adaptive search (A-3). The simulation results and mean fitness convergences are shown in Table 18 and Figure 12, respectively. The curve of A-1 shows that the hybrid search mode composed of Aquila exploration and adaptive search can effectively address the problem of convergence stagnation, but without DLH, A-1 cannot further approach the global optimal solution and reaches a mean fitness of 40.86. Conversely, the curves of A-2 and A-3 indicate that fitness convergence stagnation is unavoidable, which affects search efficiency. In addition, the p-values of the three ablation algorithms presented in Table 18 demonstrate that the designed architecture has a statistically significant effect on improving the search performance of HSB-GWO.

5. Conclusions

This study proposes a hybrid search behavior-based adaptive grey wolf optimizer (HSB-GWO) to address the limitations of traditional metaheuristic algorithms in complex optimization tasks and multi-UAV cooperative path planning. Through theoretical analysis, benchmark function experiments, and multi-UAV path planning simulations, the effectiveness and superiority of HSB-GWO are systematically verified. According to the results on the benchmark functions, HSB-GWO exhibits excellent search capabilities across different types of optimization tasks. On the IEEE CEC 2017 benchmark suite (including unimodal, multimodal, hybrid, and composition functions), HSB-GWO outperformed seven metaheuristic algorithms (AO, AOA, CBOA, NOA, GWO, IGWO, and AGWO). Ablation experiments show that the three core components of HSB-GWO (DLH, Aquila exploration, and adaptive search) play crucial roles in improving search performance: DLH enhances the search ability of the algorithm, and the hybrid search mode (Aquila exploration and adaptive search) effectively avoids premature convergence and enhances the ability to escape local optima. Statistical analysis confirms that the performance improvement of HSB-GWO is statistically significant. In the multi-UAV cooperative path planning scenario with 10 UAVs and 13 threats, HSB-GWO achieves the lowest mean fitness (30.81) among all algorithms. Additionally, the convergence curve of HSB-GWO shows fast early convergence and stable late-stage optimization, avoiding convergence stagnation.

6. Future Work

In this work, HSB-GWO was proposed for global optimal path planning. However, in practical applications, in order to better cope with environmental changes, a hybrid architecture of global and local path planning is adopted to achieve collaborative control of UAVs. In the design of local path planning algorithms, it is necessary to consider issues such as slow communication speed, limited sensors, or restrictions on UAV movement due to external interference. Combining sensor data and control technology is an effective solution for designing efficient local path planning algorithms [50,51,52]. Therefore, in future work, we will further introduce relevant control techniques based on the existing HSB-GWO to implement a hybrid (global + local) path planning algorithm, improving the algorithm’s adaptability to dynamic obstacles and unexpected situations.

Author Contributions

Conceptualization, Z.Z. and X.W.; methodology, Z.Z.; software, H.H.; validation, Z.Z., C.L. and H.H.; formal analysis, Z.Z.; investigation, X.H.; resources, Y.Y. and X.H.; data curation, S.H.; writing—original draft preparation, Z.Z. and H.H.; writing—review and editing, C.L. and S.H.; visualization, X.W.; supervision, Y.Y., J.C. and X.H.; project administration, J.C.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Natural Science Foundation of China, grant number 62276055, and in part by the Sichuan Science and Technology Program, grant numbers 23ZDYF0755, 24NSFSC1476 and 2024YFFK0109. The APC was funded by Chengdu Pvirtech Technology Company Ltd.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

Authors Xi Huang and Songbo Hu were employed by the company Chengdu Pvirtech Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Pan, M.; Chen, C.; Yin, X.; Huang, Z. UAV-aided emergency environmental monitoring in infrastructure-less areas: LoRa mesh networking approach. IEEE Internet Things J. 2021, 9, 2918–2932.
2. Wan, Y.; Zhong, Y.; Ma, A.; Zhang, L. An accurate UAV 3-D path planning method for disaster emergency response based on an improved multiobjective swarm intelligence algorithm. IEEE Trans. Cybern. 2022, 53, 2658–2671.
3. Xi, M.; Dai, H.; He, J.; Li, W.; Wen, J.; Xiao, S.; Yang, J. A lightweight reinforcement-learning-based real-time path-planning method for unmanned aerial vehicles. IEEE Internet Things J. 2024, 11, 21061–21071.
4. Zhou, R.; Huang, C.; Wei, Z.; Zhao, K. Application of mp-gwo algorithm in multiple cooperating UCAV path planning. J. Air Force Eng. Univ. 2017, 18, 24–29.
5. Li, W.; Xiong, Y.; Xiong, Q. Reinforcement Learning-Guided Particle Swarm Optimization for Multi-Objective Unmanned Aerial Vehicle Path Planning. Symmetry 2025, 17, 1292.
6. Liu, C.; Wang, X.; Liu, C.; Wu, H. Three-dimensional route planning for unmanned aerial vehicle based on improved grey wolf optimizer algorithm. J. Huazhong Univ. Sci. Technol. (Nat. Sci. Ed.) 2017, 45, 38–42.
7. Santoso, A.; Hage, G.; Einsthan, A.; Sahal, M.; Jazidie, A. Adaptive Task Allocation and Missile Threat Management for Multi-UAV Systems with Fuzzy State-Feedback Control in Complex Environments. Int. J. Intell. Eng. Syst. 2025, 18.
8. Alpdemir, M.N. Tactical UAV path optimization under radar threat using deep reinforcement learning. Neural Comput. Appl. 2022, 34, 5649–5664.
9. Lin, Y.; Saripalli, S. Sampling-Based Path Planning for UAV Collision Avoidance. IEEE Trans. Intell. Transp. Syst. 2017, 18, 3179–3192.
10. Hart, P.E.; Nilsson, N.J.; Raphael, B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 1968, 4, 100–107.
11. Dijkstra, E.W. A note on two problems in connexion with graphs. In Edsger Wybe Dijkstra: His Life, Work, and Legacy; Morgan & Claypool: San Rafael, CA, USA, 2022; pp. 287–290.
12. Xu, W.; Zhang, T.; Mu, X.; Liu, Y.; Wang, Y. Trajectory planning and resource allocation for multi-UAV cooperative computation. IEEE Trans. Commun. 2024, 72, 4305–4318.
13. Wang, M.; Zhang, D.; Wang, B.; Li, L. Dynamic Trajectory Planning for Multi-UAV Multi-Mission Operations Using a Hybrid Strategy. IEEE Trans. Aerosp. Electron. Syst. 2025, 61, 7369–7386.
14. He, C.; Ouyang, H.; Huang, W.; Li, S.; Zhang, C.; Ding, W.; Zhan, Z.H. An adaptive heuristic algorithm with a collaborative search framework for multi-UAV inspection planning. Appl. Soft Comput. 2025, 174, 112969.
15. Li, K. A Survey of Multi-objective Evolutionary Algorithm Based on Decomposition: Past and Future. IEEE Trans. Evol. Comput. 2024. Early Access.
16. Ma, X.; Li, X.; Zhang, Q.; Tang, K.; Liang, Z.; Xie, W.; Zhu, Z. A Survey on Cooperative Co-Evolutionary Algorithms. IEEE Trans. Evol. Comput. 2019, 23, 421–441.
17. Cao, S.; Feng, X.; Chang, J.; Yu, Y.; Wang, X.; Cai, J.; Lai, Y.; Wang, H. A hybrid operator-based multifactorial evolutionary algorithm for inverse-engineering design of soft network materials. Thin-Walled Struct. 2024, 198, 111655.
18. Feng, X.; Lai, Y.; Yang, X.; Yu, Y.; Li, F.; Wang, X.; Liang, J.; Cai, J.; Cao, S. A competitive coevolution-based evolutionary algorithm for the parallel inverse design of multiple soft network materials. Mater. Des. 2025, 256, 114359.
19. Xie, S.; Zhong, H.; Li, Y.; Xu, S.; Liu, W.; Bian, S.; Zhang, S. Predictive Control for An Ankle Rehabilitation Robot Using Differential Evolution Optimization Algorithm-Based Fuzzy NARX Model. IEEE Trans. Neural Syst. Rehabil. Eng. 2025, 33, 1886–1895.
20. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
21. Yu, X.; Jiang, N.; Wang, X.; Li, M. A hybrid algorithm based on grey wolf optimizer and differential evolution for UAV path planning. Expert Syst. Appl. 2023, 215, 119327.
22. Luo, Y.; Qin, Q.; Hu, Z.; Zhang, Y. Path planning for unmanned delivery robots based on EWB-GWO algorithm. Sensors 2023, 23, 1867.
23. Jarray, R.; Al-Dhaifallah, M.; Rezk, H.; Bouallègue, S. Parallel cooperative coevolutionary grey wolf optimizer for path planning problem of unmanned aerial vehicles. Sensors 2022, 22, 1826.
24. Ma, C.; Huang, H.; Fan, Q.; Wei, J.; Du, Y.; Gao, W. Grey wolf optimizer based on Aquila exploration method. Expert Syst. Appl. 2022, 205, 117629.
25. Nadimi-Shahraki, M.H.; Taghian, S.; Mirjalili, S. An improved grey wolf optimizer for solving engineering problems. Expert Syst. Appl. 2021, 166, 113917.
26. Zhu, C.; Bouteraa, Y.; Khishe, M.; Martín, D.; Hernando-Gallego, F.; Vaiyapuri, T. Enhancing unmanned marine vehicle path planning: A fractal-enhanced chaotic grey wolf and differential evolution approach. Knowl.-Based Syst. 2025, 317, 113481.
27. Liu, X.; Li, G.; Yang, H.; Zhang, N.; Wang, L.; Shao, P. Agricultural UAV trajectory planning by incorporating multi-mechanism improved grey wolf optimization algorithm. Expert Syst. Appl. 2023, 233, 120946.
28. Zhou, S. Gwo-ga-xgboost-based model for Radio-Frequency power amplifier under different temperatures. Expert Syst. Appl. 2025, 278, 127439.
29. Feng, X.; Yu, Y.; Wang, X.; Cai, J.; Zhong, S.; Wang, H.; Han, X.; Wang, J.; Shi, K. A hybrid search mode-based differential evolution algorithm for auto design of the interval type-2 fuzzy logic system. Expert Syst. Appl. 2024, 236, 121271.
30. Zhang, F.; Li, R.; Gong, W. Deep reinforcement learning-based memetic algorithm for energy-aware flexible job shop scheduling with multi-AGV. Comput. Ind. Eng. 2024, 189, 109917.
31. Debnath, D.; Vanegas, F.; Sandino, J.; Hawary, A.F.; Gonzalez, F. A review of UAV path-planning algorithms and obstacle avoidance methods for remote sensing applications. Remote Sens. 2024, 16, 4019.
32. Camacho Villalón, C.L.; Stützle, T.; Dorigo, M. Grey wolf, firefly and bat algorithms: Three widespread algorithms that do not contain any novelty. In Proceedings of the International Conference on Swarm Intelligence, Barcelona, Spain, 26–28 October 2020; pp. 121–133.
33. Kaur, N.; Kaur, L.; Cheema, S.S. An enhanced version of Harris Hawks optimization by dimension learning-based hunting for breast cancer detection. Sci. Rep. 2021, 11, 21933.
34. Lévy, P. L’addition des variables aléatoires définies sur une circonférence. Bull. Soc. Math. Fr. 1939, 67, 1–41.
35. Jiang, W.; Liu, Z.; Wang, Y.; Lin, Y.; Li, Y.; Bi, F. Enhancing jamming source tracking capability via adaptive grey wolf optimization mechanism for passive radar network. Signal Process. 2025, 235, 110026.
36. Mantegna, R.N.; Stanley, H.E. Stochastic process with ultraslow convergence to a Gaussian: The truncated Lévy flight. Phys. Rev. Lett. 1994, 73, 2946.
37. Wu, G.; Mallipeddi, R.; Suganthan, P.N. Problem Definitions and Evaluation Criteria for the CEC 2017 Competition on Constrained Real-Parameter Optimization; Technical Report; National University of Defense Technology: Changsha, China; Kyungpook National University: Daegu, Republic of Korea; Nanyang Technological University: Singapore, 2017; Volume 9, p. 2017.
38. Derrac, J.; García, S.; Molina, D.; Herrera, F. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol. Comput. 2011, 1, 3–18.
39. Veček, N.; Mernik, M.; Filipič, B.; Črepinšek, M. Parameter tuning with Chess Rating System (CRS-Tuning) for meta-heuristic algorithms. Inf. Sci. 2016, 372, 446–469.
40. Birattari, M.; Yuan, Z.; Balaprakash, P.; Stützle, T. F-Race and iterated F-Race: An overview. In Experimental Methods for the Analysis of Optimization Algorithms; Springer: Berlin/Heidelberg, Germany, 2010; pp. 311–336.
41. Nannen, V.; Eiben, A.E. Efficient relevance estimation and value calibration of evolutionary algorithm parameters. In Proceedings of the 2007 IEEE Congress on Evolutionary Computation, Singapore, 25–28 September 2007; pp. 103–110.
42. Hutter, F.; Hoos, H.H.; Leyton-Brown, K.; Stützle, T. ParamILS: An automatic algorithm configuration framework. J. Artif. Intell. Res. 2009, 36, 267–306.
43. Abualigah, L.; Yousri, D.; Abd Elaziz, M.; Ewees, A.A.; Al-qaness, M.A.; Gandomi, A.H. Aquila Optimizer: A novel meta-heuristic optimization algorithm. Comput. Ind. Eng. 2021, 157, 107250.
44. Abualigah, L.; Diabat, A.; Mirjalili, S.; Abd Elaziz, M.; Gandomi, A.H. The Arithmetic Optimization Algorithm. Comput. Methods Appl. Mech. Eng. 2021, 376, 113609.
45. Trojovská, E.; Dehghani, M. A new human-based metahurestic optimization method based on mimicking cooking training. Sci. Rep. 2022, 12, 14861.
46. Abdel-Basset, M.; Mohamed, R.; Jameel, M.; Abouhawwash, M. Nutcracker optimizer: A novel nature-inspired metaheuristic algorithm for global optimization and engineering design problems. Knowl.-Based Syst. 2023, 262, 110248.
47. Črepinšek, M.; Liu, S.H.; Mernik, M. Exploration and exploitation in evolutionary algorithms: A survey. ACM Comput. Surv. (CSUR) 2013, 45, 1–33.
48. Reddy, A.J.; Geng, X.; Herschl, M.; Kolli, S.; Kumar, A.; Hsu, P.; Levine, S.; Ioannidis, N. Designing cell-type-specific promoter sequences using conservative model-based optimization. Adv. Neural Inf. Process. Syst. 2024, 37, 93033–93059.
49. Jerebic, J.; Mernik, M.; Liu, S.H.; Ravber, M.; Baketarić, M.; Mernik, L.; Črepinšek, M. A novel direct measure of exploration and exploitation based on attraction basins. Expert Syst. Appl. 2021, 167, 114353.
50. Zhou, X.; Yu, X.; Zhang, Y.; Luo, Y.; Peng, X. Trajectory planning and tracking strategy applied to an unmanned ground vehicle in the presence of obstacles. IEEE Trans. Autom. Sci. Eng. 2020, 18, 1575–1589.
51. Tripicchio, P.; Unetti, M.; D’Avella, S.; Avizzano, C.A. Smooth coverage path planning for UAVs with model predictive control trajectory tracking. Electronics 2023, 12, 2310.
52. Fabris, M.; Cenedese, A.; Hauser, J. Optimal time-invariant formation tracking for a second-order multi-agent system. In Proceedings of the 2019 18th European Control Conference (ECC), Naples, Italy, 25–28 June 2019; pp. 1556–1561.

Figure 1. Calculation flow for the HSB-GWO.

Figure 2. Example for the trajectory generation.

Figure 3. Collaborative track chromosome encoding.

Figure 4. Convergence of nonlinear decay (Equation (22)) and linear decay.

Figure 5. Critical difference (CD) results for the Friedman test of IEEE CEC 2017 and 2019.

Figure 6. Three-dimensional diagrams of the optimal trial results of 8 algorithms for Task 1.

Figure 7. Three views and 3D graph of the best trial result of HSB-GWO for Task 1.

Figure 8. Mean fitness convergences of 8 algorithms in 10 trials for Task 1.

Figure 9. Three-dimensional diagrams of the optimal trial results of 8 algorithms for Task 2.

Figure 10. Three views and 3D graph of the best trial result of HSB-GWO for Task 2.

Figure 11. Mean fitness convergences of 8 algorithms in 10 trials for Task 2.

Figure 12. Mean fitness convergences of three ablation algorithms and HSB-GWO in 10 trials using Task 1.

Table 1. Mean ranks of HSB-GWOs with different control parameter settings using IEEE CEC 2017.

Functions (F)	Dimension (D)	$r_{E} = 0.2$	$r_{E} = 0.3$	$r_{E} = 0.4$	$r_{E} = 0.5$	$r_{E} = 0.6$	$r_{E} = 0.7$	$r_{E} = 0.8$
F1, F3	10	1.00	2.00	3.50	3.50	5.50	5.50	7.00
	30	2.00	5.50	5.50	1.50	2.50	6.00	5.00
	50	1.50	2.00	3.00	5.50	5.50	4.50	6.00
F4–F10	10	1.57	2.57	2.86	3.14	5.00	6.29	6.57
	30	1.43	1.57	4.00	3.86	5.00	6.14	6.00
	50	2.29	2.43	3.86	3.43	4.43	5.71	5.86
F11–F20	10	3.20	3.20	3.90	2.60	4.30	4.90	5.90
	30	3.30	2.90	4.10	3.20	4.70	5.20	4.60
	50	2.40	3.50	4.00	4.50	4.40	4.50	4.70
F21–F30	10	3.30	4.00	3.90	3.50	3.20	4.70	5.40
	30	2.20	2.70	4.30	3.30	4.00	5.40	6.10
	50	3.20	3.40	3.70	4.00	4.30	5.10	4.30
Overall	10	2.69	3.24	3.62	3.10	4.17	5.21	5.97
	30	2.38	2.69	4.24	3.28	4.38	5.55	5.48
	50	2.59	3.10	3.79	4.14	4.45	5.00	4.93
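
The Overall rows of Table 1 appear to be averages of the four group rows weighted by the number of functions in each group (2, 7, 10, and 10, for 29 functions in total). This is an inference from the table values, not something the caption states. A minimal Python sketch that checks this reading for D = 10, with all variable and function names being illustrative:

```python
# Group sizes inferred from the row labels (an assumption, not stated in the table):
# F1 and F3 -> 2, F4-F10 -> 7, F11-F20 -> 10, F21-F30 -> 10 functions.
GROUP_SIZES = [2, 7, 10, 10]

# Control parameter values, in the column order of Table 1.
R_E_VALUES = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

# Mean ranks for D = 10, copied from Table 1 (one row per function group).
GROUP_RANKS_D10 = [
    [1.00, 2.00, 3.50, 3.50, 5.50, 5.50, 7.00],  # F1, F3
    [1.57, 2.57, 2.86, 3.14, 5.00, 6.29, 6.57],  # F4-F10
    [3.20, 3.20, 3.90, 2.60, 4.30, 4.90, 5.90],  # F11-F20
    [3.30, 4.00, 3.90, 3.50, 3.20, 4.70, 5.40],  # F21-F30
]

# "Overall" row for D = 10, copied from Table 1.
REPORTED_OVERALL_D10 = [2.69, 3.24, 3.62, 3.10, 4.17, 5.21, 5.97]


def weighted_overall(group_ranks, group_sizes):
    """Average the group mean ranks, weighting each group by its function count."""
    total = sum(group_sizes)
    n_cols = len(group_ranks[0])
    return [
        sum(size * row[j] for size, row in zip(group_sizes, group_ranks)) / total
        for j in range(n_cols)
    ]


overall_d10 = weighted_overall(GROUP_RANKS_D10, GROUP_SIZES)

for r_e, computed, reported in zip(R_E_VALUES, overall_d10, REPORTED_OVERALL_D10):
    print(f"r_E = {r_e}: computed {computed:.2f}, reported {reported:.2f}")
```

Under this reading, the computed values match the reported Overall row for D = 10 to two decimals. In all three Overall rows, $r_{E} = 0.2$ has the lowest mean rank (2.69, 2.38, and 2.59 for D = 10, 30, and 50).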