Multi-UAV Trajectory Planning Based on a Two-Layer Algorithm Under Four-Dimensional Constraints

Yang, Yong; Fu, Yujie; Xin, Runpeng; Feng, Weiqi; Xu, Kaijun

doi:10.3390/drones9070471


by Yong Yang ^1,2,*, Yujie Fu ^1,2, Runpeng Xin ^1,2, Weiqi Feng ^1,2 and Kaijun Xu ^1,2

^1 School of Flight Technology, Civil Aviation Flight University of China, Guanghan 618307, China

^2 Sichuan Provincial Engineering Research Center of Domestic Civil Aircraft Flight and Operation Support, Chengdu 610021, China

^* Author to whom correspondence should be addressed.

Drones 2025, 9(7), 471; https://doi.org/10.3390/drones9070471

Submission received: 23 May 2025 / Revised: 29 June 2025 / Accepted: 30 June 2025 / Published: 1 July 2025


Abstract

With the rapid development of the low-altitude economy and smart logistics, unmanned aerial vehicles (UAVs), as core low-altitude platforms, have been widely applied in urban delivery, emergency rescue, and other fields. Although path planning in complex environments has become a research hotspot, optimization and scheduling of UAVs under time window constraints and task assignments remain insufficiently studied. To address this issue, this paper proposes an improved algorithmic framework based on a two-layer structure to enhance the intelligence and coordination efficiency of multi-UAV path planning. In the lower layer path planning stage, considering the limitations of the whale optimization algorithm (WOA), such as slow convergence, low precision, and susceptibility to local optima, this study integrates a backward learning mechanism, nonlinear convergence factor, random number generation strategy, and genetic algorithm principle to construct an improved IWOA. These enhancements significantly strengthen the global search capability and convergence performance of the algorithm. For upper layer task assignment, the improved ALNS (IALNS) addresses local optima issues in complex constraints. It integrates K-means clustering for initialization and a simulated annealing mechanism, improving scheduling rationality and solution efficiency. Through the coordination between the upper and lower layers, the overall solution flexibility is improved. Experimental results demonstrate that the proposed IALNS-IWOA two-layer method outperforms the conventional IALNS-WOA approach by 7.30% in solution quality and 7.36% in environmental adaptability, effectively improving the overall performance of UAV trajectory planning.

Keywords:

trajectory planning; time window; task allocation; two-layer structure; improved whale optimization algorithm

1. Introduction

The Outline of the Strategic Plan for Expanding Domestic Demand (2022–2035), issued by the Central Committee of the Communist Party of China and the State Council, explicitly proposes “supporting the application of technologies such as autonomous driving and unmanned delivery”, highlighting the development of new types of consumption. This marks the formal entry of the low-altitude economy into industrialization and standardization [1]. Low-altitude airspace policies are being continuously refined, and low-altitude aircraft technologies are rapidly advancing. As a result, drone applications in low-altitude airspace have expanded, particularly in military reconnaissance, emergency rescue, and logistics delivery applications [2]. However, traditional single-drone models can no longer meet the demands of large-scale applications. Ensuring the safety, feasibility, and optimality of multi-drone cooperative flight paths under multiple constraints has become a core challenge in the field of intelligent path planning.

The drone task assignment problem with time windows can be abstracted as a dynamic scheduling problem involving multiple agents under spatio-temporal constraints. Essentially, it is an extension of the vehicle routing problem with time windows (VRP-TW) [3]. With ongoing technological advancements, research on VRP-TW has become increasingly diversified, ranging from traditional single-depot route planning to more complex scenarios involving multiple depots and time window constraints. Zhao et al. [4] developed an adaptive genetic algorithm with dynamic crossover and mutation probabilities, improving solution efficiency and quality. Pan [5] used an LSTM network to predict logistics demands and integrated the forecasts into the VRP for routing optimization. For problems involving hard time windows and heterogeneous vehicle types, a BPC model with capacity inequalities that reduce the relaxation gap was introduced to improve urban waste collection efficiency [6]. Christian et al. [7] developed a mathematical model with a backtracking approach to address a variable-location delivery problem with multiple time windows, using hospital data to assess the impact of delivery locations and cost functions and thereby enhancing delivery flexibility. Liu et al. [8] recognized the need for dynamic route adjustments in emergency rescue scenarios and argued that soft time windows are more appropriate. They constructed a mathematical model for optimizing multi-rescue-vehicle routes with soft time windows in a multi-depot scenario and proposed a hybrid genetic algorithm incorporating a nearest-neighbor heuristic to solve it. While VRP and its variants have been widely studied, particularly for optimizing multi-depot vehicle scheduling to reduce logistics costs and enhance transportation efficiency, research on applying VRP to drone systems has lagged. With the rapid development of drone technologies, more researchers are now applying VRP methodologies to drone path planning. Unlike traditional vehicle delivery problems, drone routing must consider additional physical constraints such as ground obstacles, flight altitude, and range limitations, as well as aerial no-fly zones, payload capacities, and multi-drone coordination.

In recent years, with the rapid development of deep reinforcement learning (DRL) and distributed control technology, multi-UAV formation path planning and scheduling has become a research hotspot. Zhang et al. [9] achieved dynamic obstacle avoidance and cooperative formation in complex urban environments using DRL methods, which significantly improved planning efficiency and robustness. Li et al. [10] proposed a hybrid algorithm combining the genetic algorithm and particle swarm optimization for real-time multi-UAV task scheduling and achieved better performance in scenarios containing hard and soft time window constraints. In addition, Chen et al. [11] proposed a distributed consensus-based coordination strategy for heterogeneous UAV formations, which effectively mitigates cooperative degradation when communication bandwidth is limited. However, most of the above works focus on single-layer path or scheduling optimization and fail to simultaneously consider the coupled optimization of path planning and task allocation under spatio-temporal constraints. Chen et al. [12] proposed a distributed scheduling approach for the networked UAV group scheduling problem. Kim et al. [13] introduced a path prediction algorithm based on graph neural networks in a multi-UAV collaborative environment, realizing real-time obstacle avoidance in dense urban airspace. Sato et al. [14] optimized airspace sharing among different operators by combining multi-agent reinforcement learning and game theory, achieving real-time obstacle avoidance in high-density urban airspace.
Although these studies differ significantly in algorithm design, simulation scenarios, and application focus, they all face a trade-off between coordination efficiency and real-time performance. Examples include DRL-assisted path planning based on adaptive policy networks [15]; cellular-connected UAV trajectory planning using communication-aware DRL optimization [16]; DRL solutions with quantum-inspired experience replay mechanisms for UAV trajectory planning [17] or for the optimal UAV-mounted network path planning problem [18]; and networks that integrate connectivity and physical constraints [19]. DRL-based approaches emphasize learning efficiency but are limited by sample inefficiency or high training costs in large-scale tasks; communication-aware models often assume static network conditions or depend on perfect information; visual DRL navigation approaches that combine CNNs and visual attention [20] struggle with truly complex environments and depend heavily on their data environments; and distributed MARL-based collaborative frameworks [21] are prone to scalability issues as the number of agents increases. In addition, a recent review [22] systematically summarizes these approaches and their application scenarios. Despite many studies, most of the existing work focuses on task allocation or trajectory planning in isolation and lacks a unified framework that can jointly optimize these two dimensions under complex spatio-temporal and environmental constraints. Therefore, there is an urgent need for a unified hierarchical framework that integrates global task assignment and local trajectory planning, incorporates multi-dimensional constraints, and balances performance, robustness, and adaptability across different task scenarios.

Based on the identified research gaps, this paper proposes a two-layer collaborative optimization framework, IALNS-IWOA, aimed at improving performance and generalization in multi-UAV path planning under spatio-temporal and environmental constraints. The framework addresses the inefficient coupling between task assignment and trajectory planning, the tendency to fall into local optima, and the insufficient handling of multi-dimensional constraints such as time windows, payload, endurance, and terrain. In the upper-level task allocation stage, an improved adaptive large neighborhood search (IALNS) algorithm is employed, which incorporates K-means clustering to generate high-quality initial solutions and utilizes a simulated annealing acceptance criterion to enhance solution diversity and avoid premature convergence. In the lower-level trajectory planning stage, an improved whale optimization algorithm (IWOA) is developed, featuring a reverse learning initialization strategy, nonlinear convergence control, and embedded genetic operators to balance global exploration and local refinement. The two layers are interconnected through iterative feedback: upper-level allocation results guide lower-level trajectory generation, while cost evaluations from trajectory planning inform the adaptive refinement of task allocation. This bidirectional interaction ensures efficient global coordination and significantly improves optimization performance in complex multi-UAV scheduling scenarios. To better highlight the novelties of this study, Table 1 presents a concise comparison between three mainstream UAV path planning paradigms (traditional algorithms, deep reinforcement learning methods, and our proposed dual-layer metaheuristic framework), focusing on their respective strengths and limitations in practical multi-UAV scheduling scenarios. Compared with these prior works, our proposed IALNS-IWOA framework offers a hierarchical, jointly optimized solution that integrates multi-drone coordination, spatio-temporal constraints, and terrain-aware path planning into a unified structure.

The main contributions of the current work are summarized as follows:

A novel two-tier optimization framework is proposed that combines the improved adaptive large neighborhood search for global task allocation with the improved whale optimization algorithm for local trajectory planning to achieve coordinated scheduling in complex multi-UAV scenarios.
Multi-dimensional constraints are integrated into a unified MDRP-TW formulation, and algorithmic components such as backward learning initialization, nonlinear convergence, and gene mutation are improved to enhance the stability and accuracy of the optimization.
The robustness and scalability of the proposed IALNS-IWOA framework are verified through numerous simulations under various task settings, and superior performance is demonstrated in comparison with baseline approaches such as IALNS-PSO, IALNS-WOA, and IALNS-ACO.

2. Modeling

The multi-drone routing problem with time windows (MDRP-TW) extends the classical vehicle routing problem with time windows (VRP-TW) to multiple aerial vehicles operating in 3D space. In MDRP-TW, a fleet of UAVs must service spatially distributed tasks—each with its own time window and demand—while respecting additional aerial constraints such as altitude limits, no-fly zones, endurance and payload capacities. The objective is to jointly assign tasks to drones and generate collision-free and energy-efficient 3D trajectories that minimize a composite cost (e.g., time, distance, or energy) under these spatio-temporal constraints. To ensure the alignment of the planned routing scheme with real-world application scenarios while maintaining feasibility and effectiveness, it is necessary to establish constraint conditions for the MDRP-TW model. This section provides the mathematical formulation of the multi-drone routing problem with time window (MDRP-TW). The model incorporates various real-world constraints such as no-fly zones, altitude restrictions, time windows, and UAV payload capabilities. The objective functions are detailed in Equations (1)–(5).

$$\min D = R + F \tag{1}$$

$$R = \sigma_{1} R_{t} + \sigma_{2} R_{c} \tag{2}$$

$$F = \omega_{1} f_{Length} + \omega_{2} f_{Safe} + \omega_{3} f_{Height} + \omega_{4} f_{Smooth} + \omega_{5} f_{Collision} \tag{3}$$

$$\sigma_{1} + \sigma_{2} = 1 \tag{4}$$

$$\omega_{1} + \omega_{2} + \omega_{3} + \omega_{4} + \omega_{5} = 1 \tag{5}$$

where $D$ in Equation (1) represents the objective function of the MDRP-TW model, $R$ denotes the objective function of the task allocation model, and $F$ represents the objective function of the trajectory optimization model. Specifically, $R_{t}$ in Equation (2) denotes the cost associated with time window constraints, $R_{c}$ denotes the cost related to payload constraints, and $\sigma_{1}$ and $\sigma_{2}$ are the weighting coefficients of the task allocation model. In Equation (3), $f_{Length}$ is the cost of trajectory length constraints with a weighting factor of $\omega_{1}$; $f_{Safe}$ indicates the cost due to no-fly zone constraints with a weighting factor of $\omega_{2}$; $f_{Height}$ denotes the cost related to flight altitude constraints with a weighting factor of $\omega_{3}$; $f_{Smooth}$ refers to the cost of flight smoothness constraints with a weighting factor of $\omega_{4}$; and $f_{Collision}$ represents the cost arising from spatial conflict constraints with a weighting factor of $\omega_{5}$. The weight coefficients of each layer sum to 1, as required by Equations (4) and (5), so the relative importance of the terms can be adjusted without changing the overall scale of the objective. Table 2 defines all the mathematical symbols used in the model of this study and their meanings, while Figure 1 shows a schematic diagram of the no-fly zone model, including the coordinates of the trajectory points and of the center axis of the no-fly zone, and distinguishes the different distances between the UAV and the no-fly zone.
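As a rough illustration of how Equations (1)–(5) fit together, the following Python sketch assembles $D$ from the two layer costs. The function name `composite_cost`, the tuple layout, and all numerical values are illustrative assumptions rather than anything taken from the paper; only the weighted-sum structure follows the equations above.

```python
import math

def composite_cost(r_terms, f_terms, sigma, omega):
    """Return D = R + F, with R and F as weighted sums (Equations (1)-(3)).

    r_terms: (R_t, R_c)
    f_terms: (f_Length, f_Safe, f_Height, f_Smooth, f_Collision)
    sigma, omega: weight tuples; each must sum to 1 (Equations (4)-(5)).
    """
    if len(r_terms) != len(sigma) or len(f_terms) != len(omega):
        raise ValueError("each cost term needs exactly one weight")
    if not math.isclose(sum(sigma), 1.0, abs_tol=1e-9):
        raise ValueError("sigma weights must sum to 1")
    if not math.isclose(sum(omega), 1.0, abs_tol=1e-9):
        raise ValueError("omega weights must sum to 1")
    R = sum(w * c for w, c in zip(sigma, r_terms))  # task allocation cost
    F = sum(w * c for w, c in zip(omega, f_terms))  # trajectory cost
    return R + F

# Hypothetical cost values and weights, chosen only to keep the arithmetic simple.
D = composite_cost(
    r_terms=(10.0, 0.0),
    f_terms=(100.0, 0.0, 20.0, 5.0, 0.0),
    sigma=(0.5, 0.5),
    omega=(0.4, 0.2, 0.1, 0.2, 0.1),
)
print(D)
```

Validating the weight sums up front is a design choice of this sketch: it makes a mistyped weight fail loudly instead of silently rescaling the objective.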

In this study, the no-fly zone is modeled as an infinitely high cylinder. Let the coordinates of the $i$th trajectory point be denoted as $(x_{i}, y_{i})$, and a point on the central axis of the no-fly zone be denoted as $(x_{o}, y_{o})$.

2.1. Mathematical Model of the Trajectory Planning Layer

To solve the MDRP-TW problem, a dual-layer framework is constructed in this study. The drone flight path is represented by a set of discrete trajectory points, denoted as set $N$. The first step is to determine the optimal trajectory between each pair of task points, which defines the trajectory optimization layer [23,24,25]. The specific formulation is provided in Equations (6)–(13).

$$f_{Length} = \sum_{i=1}^{l} L_{i} \tag{6}$$

$$d_{i} = \sqrt{(x_{i} - x_{o})^{2} + (y_{i} - y_{o})^{2}} \tag{7}$$

$$r_{collision} = r + d_{1} \tag{8}$$

$$r_{threat} = r + d_{2} \tag{9}$$

$$f_{Height} = \sum_{i=1}^{N-1} F_{Height,i}, \quad F_{Height,i} = \begin{cases} \left| h_{i} - \frac{h_{max} + h_{min}}{2} \right| & h_{min} \leq h_{i} \leq h_{max} \\ 1000 & \text{otherwise} \end{cases} \tag{10}$$

$$f_{Smooth} = \sum_{i=2}^{N} \delta_{i} + \sum_{i=1}^{N-1} \gamma_{i} \tag{11}$$

$$f_{Collision} = \begin{cases} 0 & d_{p,q} > 2 d_{2} \\ d_{2} + d_{1} - d_{p,q} & 2 d_{1} < d_{p,q} < 2 d_{2} \\ 1000 & d_{p,q} < d_{1} \end{cases} \tag{12}$$

$$d_{p,q} = \sqrt{(x_{p} - x_{q})^{2} + (y_{p} - y_{q})^{2} + (z_{p} - z_{q})^{2}} \tag{13}$$

All symbols in Equations (6)–(13) are defined as before. The schematic of Equation (13) is shown in Figure 2, which shows the UAV flying a distance $d_{p,q}$ over two different lines of operation.
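To make the geometric cost terms more concrete, the sketch below evaluates Equations (6), (7), (10), (12), and (13) for individual waypoints. The function names and sample coordinates are hypothetical. Equation (12) does not state what happens for $d_{1} \leq d_{p,q} \leq 2 d_{1}$ or exactly at the boundaries, so this sketch assumes that every separation of at most $2 d_{1}$ receives the full penalty.

```python
import math

PENALTY = 1000.0  # large constant that appears in Equations (10) and (12)

def path_length(points):
    """Equation (6): sum of segment lengths along (x, y, z) waypoints."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def axis_distance(point, center):
    """Equation (7): horizontal distance from a waypoint to the no-fly-zone axis."""
    return math.hypot(point[0] - center[0], point[1] - center[1])

def height_cost(h, h_min, h_max):
    """Per-waypoint term of Equation (10): deviation from the mid-altitude,
    or a fixed penalty when the altitude leaves the allowed band."""
    if h_min <= h <= h_max:
        return abs(h - (h_max + h_min) / 2.0)
    return PENALTY

def collision_cost(p, q, d1, d2):
    """Equation (12), using the 3D separation of Equation (13).
    Assumption: separations of at most 2*d1 are treated as violations."""
    d_pq = math.dist(p, q)
    if d_pq > 2.0 * d2:
        return 0.0
    if d_pq > 2.0 * d1:
        return d2 + d1 - d_pq
    return PENALTY

# Hypothetical waypoints: a 3-4-5 leg followed by a vertical climb of 12.
print(path_length([(0, 0, 0), (3, 4, 0), (3, 4, 12)]))
print(collision_cost((0, 0, 0), (12, 0, 0), d1=5, d2=10))
```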

Having modeled the assessment of each UAV trajectory based on spatial and safety constraints, the focus will now be on a tasking model that determines which UAV is assigned to which task point, considering the time window and payload constraints.

2.2. Mathematical Model of the Task Allocation Layer

The task allocation layer assigns tasks based on the information provided by the lower-level path planning model, considering time windows, task loads, and drone payload capacities. Let $n$ denote the total number of task points and $K$ represent the number of drones used. Task allocation is determined according to task loads. The specific formulations are given in Equations (14)–(16).

$$R_{t} = 1000 \times \sum_{i=1}^{n} \left[ \max(T\omega_{i,1} - t_{i}, 0) + \max(t_{i} - T\omega_{i,2}, 0) \right] \tag{14}$$

$$R_{c} = 100 \times \max\left( \sum_{i=1}^{n} q_{i} - P_{c}, 0 \right) \tag{15}$$

$$K = \left\lceil \frac{\sum_{i=1}^{n} q_{i}}{P_{c}} \right\rceil \tag{16}$$

Here, the constraints for both $R_{t}$ and $R_{c}$ are enforced through large penalties that grow with the degree of constraint violation. The number of UAVs $K$ is back-calculated from the total task load using Equation (16), so it varies with the set of tasks.
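The penalty terms of Equations (14)–(16) can be sketched in a few lines of Python. The function names and sample data are hypothetical, and the payload penalty is clamped at zero in the same way as the time window terms of Equation (14), which is an assumption of this sketch.

```python
import math

def time_window_penalty(arrivals, windows, scale=1000.0):
    """Equation (14): penalize arriving before a window opens or after it closes."""
    return scale * sum(
        max(start - t, 0.0) + max(t - end, 0.0)
        for t, (start, end) in zip(arrivals, windows)
    )

def payload_penalty(demands, capacity, scale=100.0):
    """Equation (15): penalize the load carried beyond the payload capacity."""
    return scale * max(sum(demands) - capacity, 0.0)

def fleet_size(demands, capacity):
    """Equation (16): smallest number of drones whose combined capacity covers demand."""
    return math.ceil(sum(demands) / capacity)

# Hypothetical data: the second task is reached 3 time units early,
# the third 5 time units late.
print(time_window_penalty([5, 12, 30], [(0, 10), (15, 20), (10, 25)]))
print(payload_penalty([40, 50, 30], capacity=110))
print(fleet_size([40, 50, 30, 60, 70], capacity=110))
```

Because both penalties are linear in the size of the violation, a slightly late arrival costs far less than a grossly late one, which gives the upper-layer search a gradient toward feasibility.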

Having defined the mathematical formulation of the MDRP-TW problem, we next describe the solution strategy. A two-tier metaheuristic algorithm, consisting of IALNS for task assignment and IWOA for path planning, is designed to solve the problem efficiently.

3. IALNS-IWOA Algorithm

Single-layer optimization methods are often insufficient for MDRP-TW problems, as they cannot simultaneously handle spatial-temporal trajectory constraints and combinatorial task assignment. Therefore, a hierarchical framework is adopted, where the upper layer handles assignment feasibility, and the lower layer focuses on trajectory cost minimization. The dual-layer architecture adopted in this study offers distinct structural advantages, as it decouples the task allocation and path optimization processes, thereby providing greater flexibility and scalability when addressing complex constraint problems. First, an improved whale optimization algorithm (IWOA) is employed to plan the optimal flight path between each pair of task points. Then, an improved adaptive large neighborhood search algorithm (IALNS) is used to optimize task allocation by determining the visiting sequence of task points for each drone. This approach reduces conflicts and idle time during the task assignment process, thereby improving the overall efficiency of multi-drone cooperative operations. IALNS and IWOA each possess unique optimization capabilities: the former demonstrates strong local search performance in solving combinatorial optimization problems [26], while the latter offers superior global search ability, effectively avoiding entrapment in local optima [27]. The overall dual-layer architecture of the proposed framework is illustrated in Figure 3.

A two-tier solution framework is designed based on the MDRP-TW mathematical formulation in Section 2. The lower layer focuses on optimal trajectory generation between task points, while the upper layer determines the distribution of these tasks among available UAVs. These two algorithmic layers are not isolated but are tightly coupled together through iterative information exchange. Specifically, in each iteration of the IALNS procedure, a candidate solution corresponds to a set of UAV task assignments. To reduce computational overhead, all point-to-point trajectory planning results are pre-computed by the lower layer and stored in a centralized data structure. Each entry includes key metrics such as total path cost, interpolated 3D waypoints, altitude profiles, etc. These pre-calculated values are directly retrieved by the upper layer at mission assignment time to quickly assess the cost of any mission sequence without the need to invoke the path planner in real time. The path costs of all UAV routes are then aggregated to compute the overall suitability of the solution. In turn, IALNS update operators (e.g., destroy and repair) generate new task sequences, which are then re-evaluated using stored trajectory data.
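The pre-computed lookup described above might be organized as in the following sketch. The dictionary keyed by node pairs, the function names, and the assumption that each UAV returns to the depot after its last task are all illustrative choices that the paper does not specify.

```python
def route_cost(route, cost_table, depot=0):
    """Sum pre-computed leg costs for depot -> tasks in order -> depot."""
    if not route:
        return 0.0  # an unused UAV contributes no trajectory cost
    stops = [depot, *route, depot]
    return sum(cost_table[(a, b)] for a, b in zip(stops, stops[1:]))

def solution_cost(routes, cost_table, depot=0):
    """Aggregate the cost of every UAV route in a candidate assignment."""
    return sum(route_cost(r, cost_table, depot) for r in routes)

# Hypothetical leg costs, as if stored by the lower layer; node 0 is the depot.
legs = {(0, 1): 4.0, (1, 2): 3.0, (0, 2): 5.0, (0, 3): 6.0}
cost_table = {**legs, **{(b, a): c for (a, b), c in legs.items()}}

print(solution_cost([[1, 2], [3]], cost_table))
```

With this layout, evaluating a new task sequence produced by a destroy and repair step costs only a handful of dictionary lookups, which is the point of pre-computing the trajectories.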

3.1. Adaptive Large Neighborhood Search

The adaptive large neighborhood search (ALNS) algorithm [28] has been widely applied to complex combinatorial optimization problems. Its core idea is to iteratively perform destruction–repair operations to evolve the solution. Traditional ALNS generates an initial solution and treats it as the current best solution. During the iterative process, new solutions are produced through destruction and repair operators and are accepted based on a roulette wheel mechanism, continuing until the maximum number of iterations is reached or other termination criteria are satisfied. However, such an initialization is prone to falling into local optima and is difficult to escape. Moreover, the operator selection and weight adjustment mechanisms lack adaptability, resulting in low computational efficiency, especially for large-scale problems. To address these issues, this study incorporates a K-means clustering strategy [29] during the algorithm initialization phase. During the solution update stage, simulated annealing is employed to determine whether the new solution should be accepted within the IALNS framework. The algorithm flow is illustrated in Figure 4.

3.1.1. K-Means Clustering

The K-means clustering method divides the given dataset into $K$ different clusters such that the data points in the same cluster show high similarity, while the data points in different clusters have relatively low similarity [30]. The core idea is to iteratively search for $K$ cluster centers such that the sum of distances from each data point to the center of its assigned cluster is minimized. In this study, the number of clusters is set equal to the number of drones deployed. The clustering formulation for K-means is shown in Equation (17).

$$J = \sum_{j=1}^{K} \sum_{i \in C_{j}} d(s_{i}, \mu_{j})^{2} \tag{17}$$

where $J$ is the sum of the squared errors within the clusters, $C_{j}$ denotes the $j$th cluster, $s_{i}$ represents a data point belonging to cluster $C_{j}$, and $\mu_{j}$ is the center of that cluster. During the iterative optimization process, the cluster centers $\mu_{j}$ are continuously updated to gradually reduce the objective function value $J$ until convergence is achieved. The use of K-means clustering during initialization enables the generation of solutions with better structure and rationality, facilitating faster convergence of the algorithm toward near-optimal solutions. The clustering results are shown in Figure 5, where the three colors of the task points indicate the initial solution classification.
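A minimal sketch of this initialization, assuming plain Lloyd iterations with randomly sampled starting centers, is given below. The function names, the seeding scheme, and the toy coordinates are hypothetical; the paper does not describe its implementation at this level of detail.

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iterations: assign points to the nearest center, then re-center."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def initial_routes(points, k, seed=0):
    """Group task indices by cluster: one initial task list per drone."""
    labels, _ = kmeans(points, k, seed=seed)
    routes = [[] for _ in range(k)]
    for idx, lab in enumerate(labels):
        routes[lab].append(idx)
    return routes

# Hypothetical task coordinates forming two well-separated groups.
tasks = [(0, 0), (1, 0), (0, 1), (100, 100), (101, 100), (100, 101)]
print(initial_routes(tasks, k=2))
```

Setting `k` to the number of drones mirrors the choice stated above; the visiting order inside each cluster is left to the subsequent search.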

3.1.2. Simulated Annealing Algorithm

Simulated annealing (SA) is a general probabilistic algorithm commonly used to search for the global optimum within a large solution space [31]. Its core lies in the acceptance criterion, which is defined in Equation (18).

$$P = \begin{cases} 1 & \Delta f < 0 \\ \exp\left( -\frac{\Delta f}{T} \right) & \Delta f \geq 0 \end{cases} \tag{18}$$

where $P$ represents the probability of accepting a new solution, $\Delta f$ is the difference in objective function value between the new solution and the current solution, and $T$ denotes the current temperature. The ALNS algorithm is prone to becoming trapped in local optima during the search process. To address this, IALNS integrates the concept of simulated annealing, allowing the algorithm to probabilistically accept inferior solutions under certain conditions. This facilitates escape from local optima, broadens the search space, and increases the likelihood of finding a global optimum. In summary, the pseudocode of IALNS is presented in Algorithm 1.
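The acceptance criterion of Equation (18) is compact enough to state directly in code. In the sketch below, the function names and the injectable `rng` argument are hypothetical conveniences; the rule itself is the standard Metropolis criterion.

```python
import math
import random

def acceptance_probability(delta_f, temperature):
    """Equation (18): improvements are always accepted; a worse solution is
    accepted with a probability that shrinks as delta_f / temperature grows."""
    if delta_f < 0:
        return 1.0
    return math.exp(-delta_f / temperature)

def accept(delta_f, temperature, rng=random):
    """Draw a uniform number and compare it with the acceptance probability."""
    return rng.random() < acceptance_probability(delta_f, temperature)

# The same worsening step is far more likely to be accepted at high temperature.
print(acceptance_probability(5.0, temperature=100.0))
print(acceptance_probability(5.0, temperature=1.0))
```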

Algorithm 1: Improved adaptive large neighborhood search (IALNS)

Input: clustering parameters; feasible initial solution $x$ from K-means clustering
$x^{b} = x$; $\rho^{-} = (1, \ldots, 1)$; $\rho^{+} = (1, \ldots, 1)$
Repeat
    Update annealing temperature; update $\rho^{-}$ and $\rho^{+}$; select $d \in \Omega^{-}$, $r \in \Omega^{+}$
    $x^{t} = r(d(x))$
    Use the simulated annealing rule to determine whether to accept the new solution
    If accepted, set $x = x^{t}$
    End if
    Determine whether the new solution is accepted as the current best solution
    If accepted, set $x^{b} = x^{t}$ and update $\rho^{-}$ and $\rho^{+}$
    End if
Until stopping criterion is met
Return the best solution $x^{b}$
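The loop of Algorithm 1 can be sketched on a toy problem as follows. Everything here is a simplified assumption: the weight update adds a fixed reward whenever an operator pair finds a new best solution, the cooling schedule is geometric, and the cost function, operators, and parameter values are hypothetical stand-ins rather than the operators used in the paper.

```python
import math
import random

def ialns(x0, cost, destroy_ops, repair_ops, iters=200, t0=10.0,
          cooling=0.98, seed=0):
    rng = random.Random(seed)
    x, best = x0, x0
    w_d = [1.0] * len(destroy_ops)  # destroy operator weights (rho^-)
    w_r = [1.0] * len(repair_ops)   # repair operator weights (rho^+)
    temp = t0
    for _ in range(iters):
        # Roulette-wheel selection of one destroy and one repair operator.
        i = rng.choices(range(len(destroy_ops)), weights=w_d)[0]
        j = rng.choices(range(len(repair_ops)), weights=w_r)[0]
        x_t = repair_ops[j](destroy_ops[i](x, rng), rng)
        # Simulated annealing acceptance of the candidate.
        delta = cost(x_t) - cost(x)
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            x = x_t
        # Track the best solution and reward the operators that found it.
        if cost(x_t) < cost(best):
            best = x_t
            w_d[i] += 1.0
            w_r[j] += 1.0
        temp *= cooling  # geometric cooling (assumed schedule)
    return best

# Toy problem: order tasks on a line so that consecutive jumps are short.
def cost(route):
    return sum(abs(a - b) for a, b in zip(route, route[1:]))

def random_removal(route, rng):
    k = rng.randrange(len(route))
    return route[:k] + route[k + 1:], route[k]

def greedy_insert(state, rng):
    partial, item = state
    candidates = [partial[:p] + (item,) + partial[p:]
                  for p in range(len(partial) + 1)]
    return min(candidates, key=cost)

start = (5, 1, 4, 2, 3)
best = ialns(start, cost, [random_removal], [greedy_insert])
print(best, cost(best))
```

Because `best` is replaced only by strictly cheaper candidates, the returned solution is never worse than the starting one, even though the current solution `x` is allowed to deteriorate temporarily.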

While the IWOA module provides high-quality inter-point trajectory costs, it requires an effective upper-layer strategy to assign tasks to UAVs. We thus construct an improved adaptive large neighborhood search (IALNS) framework to explore the assignment space under multiple constraints.

3.2. Improved Whale Optimization Algorithm

The whale optimization algorithm (WOA) is a swarm intelligence optimization algorithm inspired by the foraging behavior of humpback whales, and it has been widely applied to solve nonlinear optimization problems. However, WOA tends to fall into local optima, has limited global search capability, and struggles with complex constraint scenarios. To address these issues, this study introduces a reverse learning mechanism, a nonlinear convergence factor, a random number generation strategy, and evolutionary algorithm techniques to propose an improved whale optimization algorithm (IWOA) for optimizing drone trajectory planning. Reverse learning enhances the diversity of the initial population, while the convergence factor is modified from linear to nonlinear decay. Additionally, an enhanced random number generation mechanism is integrated into the iteration process to coordinate the algorithm’s global exploration and local exploitation capabilities [32]. The flowchart of the proposed IWOA is shown in Figure 6.
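The reverse learning initialization can be illustrated under the assumption that it corresponds to standard opposition-based learning, where each random individual $x$ is paired with its opposite $lb + ub - x$ and the fitter half of the combined pool is kept. The paper does not give the formula, so the function name, the selection rule, and the sphere test function below are all hypothetical.

```python
import random

def opposition_init(pop_size, lower, upper, fitness, seed=0):
    """Build a random population plus its opposite points, keep the fittest."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(pop_size)
    ]
    # The opposite of x within [lo, hi] is lo + hi - x (assumed definition).
    opposites = [
        [lo + hi - x for x, lo, hi in zip(ind, lower, upper)]
        for ind in population
    ]
    # Minimization: sort the combined pool and keep the pop_size best.
    return sorted(population + opposites, key=fitness)[:pop_size]

def sphere(x):
    """Hypothetical test function with its minimum at the origin."""
    return sum(v * v for v in x)

pop = opposition_init(10, lower=[-5.0, -5.0], upper=[5.0, 5.0], fitness=sphere)
print(len(pop), sphere(pop[0]))
```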

Genetic Algorithm

The core of the genetic algorithm (GA) lies in crossover and mutation operations. By applying these operations to parent individuals, GA generates offspring and searches the solution space to find better solutions. In this study, solution updates are guided by this principle: a new solution is generated by comparing the current solution with another solution produced through crossover and mutation, and the better one is selected as the optimal solution for the current generation [33]. The crossover operation simulates genetic information exchange between two individuals. The corresponding formulations are shown in Equations (19) and (20).

$$g_{new} = \frac{1}{2} \left[ (1 + \beta) g_{1} + (1 - \beta) g_{2} \right] \tag{19}$$

$$\beta_{i} = \begin{cases} (2 \mu_{i})^{\frac{1}{1 + \eta}} & \mu_{i} \leq 0.5 \\ (2 (1 - \mu_{i}))^{-\frac{1}{1 + \eta}} & \mu_{i} > 0.5 \end{cases} \tag{20}$$

where $g_{1}$ and $g_{2}$ represent the parent individuals, $g_{new}$ denotes the offspring individual, $\beta$ is the crossover coefficient, $\mu_{i} \sim U(0, 1)$ is the crossover control parameter, and $\eta$ is a binary variable $r_{i} \in \{0, 1\}$ used to control crossover directionality.
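Equations (19) and (20) can be sketched as follows. The function names are hypothetical, and `eta` is treated here as a plain numeric parameter passed by the caller, which is a simplifying assumption of this sketch.

```python
import random

def crossover_beta(mu, eta):
    """Equation (20): crossover coefficient computed from a uniform draw mu."""
    if mu <= 0.5:
        return (2.0 * mu) ** (1.0 / (1.0 + eta))
    return (2.0 * (1.0 - mu)) ** (-1.0 / (1.0 + eta))

def crossover(g1, g2, eta=1.0, rng=random):
    """Equation (19), applied gene by gene with an independent beta per gene."""
    child = []
    for a, b in zip(g1, g2):
        beta = crossover_beta(rng.random(), eta)
        child.append(0.5 * ((1.0 + beta) * a + (1.0 - beta) * b))
    return child

# beta = 1 reproduces the first parent; beta < 1 pulls the child between the parents.
print(crossover_beta(0.5, 1.0))
print(crossover([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], rng=random.Random(0)))
```

Note that a draw of `mu` above 0.5 yields a coefficient greater than 1, so the child can land outside the interval spanned by the parents; an implementation would typically clip it to the variable range afterward.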

The mutation operation is applied to adjust individuals after crossover, helping to prevent premature convergence to local optima. Based on a multipoint mutation strategy, the mutation update is formulated, as shown in Equation (21).

$$g_{i}^{new} = \begin{cases} g_{i} + \left[ 2 \mu + (1 - 2 \mu) (1 - \delta_{1})^{\eta_{m} + 1} \right]^{\frac{1}{\eta_{m} + 1}} \cdot (g_{\max} - g_{\min}) & \mu \leq 0.5 \\ g_{i} + \left[ 1 - \left( 2 (1 - \mu) + 2 (\mu - 0.5) (1 - \delta_{2})^{\eta_{m} + 1} \right)^{\frac{1}{\eta_{m} + 1}} \right] \cdot (g_{\max} - g_{\min}) & \mu > 0.5 \end{cases} \tag{21}$$

where $g_{i}$ denotes the value of the $i$th decision variable of the current individual, and $g_{i}^{new}$ represents the value of the $i$th decision variable after mutation. $g_{\max}$ and $g_{\min}$ are the allowable maximum and minimum values of the normalized decision variable. $\delta_{1}$ and $\delta_{2}$ represent the distances between the current solution and the upper and lower bounds, respectively. In summary, the pseudocode for the improved whale optimization algorithm (IWOA) is provided in Algorithm 2.

Algorithm 2: Improved whale optimization algorithm (IWOA)

Input: population size, maximum number of generations, search probability, variable range
Execute reverse learning initialization to enhance population diversity
Evaluate the fitness value of each individual and determine the current best solution
Repeat
    Calculate the nonlinear convergence factor and spiral coefficient
    Generate a random number $p$
    If $p < 0.5$
        If $|A| \geq 1$
            Perform random search
        Else
            Perform encircling prey
        End if
    Else
        Update position using GA-based strategy
    End if
    Update solution
Until the termination condition is met
Return the best solution found
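The branch structure of Algorithm 2 can be sketched for a single individual as shown below. The position formulas for the random search and encircling branches follow the standard WOA update with scalar coefficients $A$ and $C$, which is an assumption here because this section does not restate them; the GA-based branch is left as a caller-supplied callback, and all names and numbers are hypothetical.

```python
import random

def iwoa_step(x, best, population, a, ga_update, rng=random):
    """One position update following the branches of Algorithm 2.

    a: current value of the convergence factor (its nonlinear schedule is
       not reproduced here and must be supplied by the caller).
    ga_update: callback implementing the GA-based update of the else branch.
    """
    p = rng.random()
    if p < 0.5:
        A = 2.0 * a * rng.random() - a  # standard WOA coefficient (assumed)
        C = 2.0 * rng.random()
        # |A| >= 1: explore around a random whale; otherwise encircle the best.
        ref = rng.choice(population) if abs(A) >= 1.0 else best
        return [r - A * abs(C * r - xi) for r, xi in zip(ref, x)]
    return ga_update(x, best)

def midpoint(x, best):
    """Placeholder for the GA-based strategy: move halfway toward the best."""
    return [(xi + bi) / 2.0 for xi, bi in zip(x, best)]

population = [[0.0, 0.0], [2.0, 2.0], [-1.0, 3.0]]
print(iwoa_step([1.0, -1.0], [0.5, 0.5], population, a=1.0,
                ga_update=midpoint, rng=random.Random(1)))
```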

This section outlines the overall algorithmic structure. The IWOA and IALNS components of the architecture operate in a coordinated manner, with IWOA providing the trajectory costs of the task combinations and IALNS using this information to adaptively re-assign tasks. The interplay between these two layers enables flexible yet coordinated optimization. In the following section, we validate the proposed method through simulation experiments under various mission settings.

4. Simulation Experiments

4.1. Simulation Environment and Mapping to Mathematical Model

All simulations and algorithm implementations were executed on a desktop computer equipped with an Intel Core i7-12700 CPU (16-core, 2.10 GHz) and 32 GB of RAM, running the Windows 10 operating system. The algorithms were developed and tested using MATLAB R2024a. No GPU acceleration was used during the experiments to ensure a fair comparison of computational efficiency across all methods. The simulation environment was set as a 3D space of size $500 \times 500 \times 300$, containing 2 no-fly zones, 1 starting point, and 15 task points with varying task loads. The maximum payload capacity of a single drone was set to 110 [34]. Using the analytic hierarchy process (AHP) [35], and after consistency verification, the weight coefficients were determined as follows: $\sigma_{1} = 0.5$ and $\sigma_{2} = 0.5$; $\omega_{1}$, $\omega_{2}$, $\omega_{3}$, $\omega_{4}$, and $\omega_{5}$ were set to 0.4357, 0.1875, 0.0625, 0.2500, and 0.0625, respectively. In the simulation code, these were used as fixed input parameters during fitness evaluation. As illustrated in Figure 7, the terrain environment was modeled in the $500 \times 500 \times 300$ 3D space using curved surfaces and cylindrical volumes to simulate obstacles and topographical constraints. These elements were considered infinitely high and impassable. The information from this map was used as input for the lower trajectory planning layer. The corresponding data are listed in Table 3.

The distribution of task points in the terrain is shown in Figures 7 and 8; together with the no-fly zone data in Table 3, these are the inputs to the lower-level objective function $F$. The term $f_{Safe}$ in Equation (3) uses the distance between the UAV path and the no-fly zones, $f_{Height}$ penalizes flights that leave the allowable flight altitude range, and $f_{Collision}$ calculates the spacing between UAVs to avoid conflicts. Each of these terms is calculated from the 3D environment and geometry using Equations (6)–(13).
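As an illustration of the geometric quantity behind $f_{Safe}$, the sketch below computes the horizontal clearance between one straight path segment and the cylindrical no-fly zones of Table 3. It is a simplified stand-in, not a reproduction of Equations (6)–(13); the function name and the sample waypoints are hypothetical.

```python
import math

def segment_clearance(p, q, center, radius):
    """Horizontal distance from segment p-q to the cylinder surface.

    Negative values mean the segment cuts through the no-fly zone.
    The cylinder is treated as infinitely high, so only x and y matter.
    """
    (px, py), (qx, qy), (cx, cy) = p, q, center
    dx, dy = qx - px, qy - py
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0
    else:
        # Projection of the center onto the segment, clamped to [0, 1].
        t = max(0.0, min(1.0, ((cx - px) * dx + (cy - py) * dy) / seg_len_sq))
    nearest = (px + t * dx, py + t * dy)
    return math.hypot(nearest[0] - cx, nearest[1] - cy) - radius

# No-fly zones from Table 3: (center, radius).
NO_FLY_ZONES = [((250, 370), 40), ((140, 250), 35)]

# Hypothetical segment between two waypoints (x, y).
a, b = (200, 300), (300, 300)
for center, radius in NO_FLY_ZONES:
    print(center, round(segment_clearance(a, b, center, radius), 2))
```

A penalty term could then be built from negative or small clearances, for example by summing them over all segments of a candidate path.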

The mission point information is shown in Table 4, which supplies the inputs to the upper-level objective function $R$. Here, $R_{t}$ is the penalty for violating the time window, as shown in Equation (14); it is computed by comparing the arrival time at each mission point with its allowed service interval. $R_{c}$ represents the mission overload, as shown in Equation (15); it penalizes a UAV whose total task volume exceeds the payload limit. The number of drones used is derived from the cumulative task volume, as shown in Equation (16). These values are calculated during the mission assignment phase using time and load data extracted from the solution.
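The two penalty terms can be sketched as follows. Because Equations (14) and (15) are not reproduced in this section, the sketch assumes simple linear penalties on the time-window violation and on the excess load; the function names and the coefficient `coeff` are hypothetical.

```python
def time_window_penalty(arrival, left, right, coeff=1.0):
    """Linear penalty for arriving outside [left, right]; zero inside."""
    if arrival < left:
        return coeff * (left - arrival)
    if arrival > right:
        return coeff * (arrival - right)
    return 0.0

def overload_penalty(demands, capacity, coeff=1.0):
    """Linear penalty on the load carried beyond the payload capacity."""
    return coeff * max(0.0, sum(demands) - capacity)

# Task point 5 in Table 4 has the time window [16, 90].
print(time_window_penalty(arrival=100, left=16, right=90))  # late by 10
# Route 0-8-7-1-0 in Table 7 carries demands 30, 40, 40 (capacity 110).
print(overload_penalty([30, 40, 40], capacity=110))
```

Whether early arrivals are penalized or simply wait until the window opens depends on the model; the symmetric treatment here is only one possible choice.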

Figure 8 shows the spatial distribution of the mission points in the side and top views. These mission locations are represented as basic inputs to the mission assignment framework defined in Equations (1)–(3), where UAVs are assigned based on their spatio-temporal distance and capacity constraints.

The total objective function $D$ is shown in Equation (1), where $R$ denotes the mission assignment cost and $F$ denotes the trajectory cost. In the simulation, the environmental parameters are converted into model inputs; the IALNS algorithm is used to calculate $R$, IWOA is used to calculate $F$, and the two are combined using the AHP-derived weights to output $D$.

4.2. Experimental Results

4.2.1. Algorithm Base Performance Validation

To validate the effectiveness of the proposed IWOA, this study compares it with WOA, PSO, and ACO on the five benchmark test functions listed in Table 5. Among them, $F_{1}(x)$ and $F_{2}(x)$ are unimodal test functions, $F_{3}(x)$ is a multimodal test function, and $F_{4}(x)$ and $F_{5}(x)$ are fixed-dimension multimodal test functions. Each algorithm was run for 100 iterations. The average value and standard deviation of the results were used as evaluation metrics. Detailed test results are listed in Table 6.
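For readers who want to reproduce the benchmark setup, the sketch below implements two of the test functions, the Sphere function and the Branin function, in their commonly used textbook forms, and evaluates Branin at one of its known minimizers, $(π, 2.275)$. The function names are illustrative, and the code is independent of the authors' MATLAB implementation.

```python
import math

def sphere(x):
    """Sphere function: sum of squares, minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def branin(x1, x2):
    """Branin function in its standard form, minimum about 0.397887."""
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s

print(sphere([0.0] * 30))
print(round(branin(math.pi, 2.275), 6))
```

At $(π, 2.275)$ the squared term vanishes, so the value reduces to $10/(8π)$, which agrees with the theoretical optimum of 0.39788735 quoted for the Branin function.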

As shown in Table 6, based on the mean and best values of the search results, IWOA exhibits slightly lower search precision than PSO on the $F_{3}$ test function. On the other test functions, the best values obtained by IWOA are no larger than those of the other algorithms, and its mean values are the smallest except on $F_{4}$, where the mean for WOA (0.6112) is slightly lower than that for IWOA (0.655569). Moreover, the best values obtained by IWOA are close to the theoretical optima in the $F_{1}$ and $F_{4}$ tests, indicating that the improved algorithm is more stable.

Table 6 shows that IWOA approaches the theoretical optimal value on the unimodal function $F_{1}$ and the fixed-dimension multimodal function $F_{4}$ with lower standard deviation than the traditional algorithms. This indicates stability and resistance to local optima in complex optimization problems and provides the algorithmic basis for UAV trajectory optimization.

4.2.2. Simulated Annealing Algorithm

The number of drones used was set to three. The algorithms compared include IALNS-IWOA, IALNS-WOA, IALNS-PSO, and IALNS-ACO, each with a population size of 90 and a maximum of 300 generations. Multi-drone cooperative path planning based on the dual-layer algorithm was simulated, and the experimental results are shown in Figures 9–12.

The fitness value represents the total cost, combining the task assignment penalty and the trajectory planning objective. As shown in Figure 13, the proposed IALNS-IWOA algorithm converges the fastest and has the lowest final fitness value. All algorithms exhibit exploration fluctuations during the first 50 iterations; however, IALNS-IWOA converges significantly earlier, at around 120 iterations. The other algorithms either converge slowly or fluctuate around a local optimum. This performance improvement stems from the synergy of the two-layer architecture, where IWOA enhances global trajectory exploration using nonlinear convergence and adaptive mutation, while IALNS dynamically adjusts task allocation using adaptive destroy–repair operators and simulated annealing. Together, they enable efficient navigation of large solution spaces and robust escape from local optima. The smoother convergence profile of IALNS-IWOA also reflects a better balance between exploration and exploitation, confirming its robustness and efficiency in solving MDRP-TW with complex constraints.
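The simulated annealing mechanism referred to above relies on an acceptance rule that occasionally keeps a worse assignment in order to escape local optima. The following is a generic Metropolis-style sketch; the starting temperature, cooling rate, and cost values are hypothetical and are not the settings used in the experiments.

```python
import math
import random

def accept(current_cost, candidate_cost, temperature, rng=random):
    """Metropolis rule: always accept improvements, sometimes accept worse."""
    if candidate_cost <= current_cost:
        return True
    if temperature <= 0.0:
        return False
    probability = math.exp(-(candidate_cost - current_cost) / temperature)
    return rng.random() < probability

# Hypothetical schedule: start at T = 100 and cool geometrically by 0.95.
rng = random.Random(0)
temperature = 100.0
for step in range(3):
    print(step, round(temperature, 2), accept(4100.0, 4150.0, temperature, rng))
    temperature *= 0.95
```

As the temperature falls, the probability of accepting a worse candidate shrinks, so the search gradually shifts from exploration to exploitation.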

The visualization of the experimental results clearly demonstrates the superiority of IALNS-IWOA in solving trajectory planning for MDRP-TW. Detailed information on the dual-layer structures of the different algorithms is presented in Table 7.

The comparison results indicate that the proposed dual-layer framework effectively addresses the multi-drone coordination problem with time windows. Because the fitness value is a cost, lower values are better. Specifically, the total fitness of the IALNS-IWOA model is 7.36% lower than that of the IALNS-WOA model, 3.08% lower than that of the IALNS-PSO model, and 39.13% lower than that of the IALNS-ACO model. In terms of total flight distance, the IALNS-IWOA model achieves reductions of 7.30%, 3.34%, and 16.43% compared with the IALNS-WOA, IALNS-PSO, and IALNS-ACO models, respectively. These results validate both the effectiveness of the improved algorithm and the superiority of the proposed dual-layer framework.
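The percentages quoted above can be recomputed directly from the totals in Table 7. The short sketch below does so, taking each baseline as the reference for the relative reduction; the helper name is illustrative.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Totals from Table 7: (total flight distance, total fitness).
RESULTS = {
    "IALNS-IWOA": (2961.5315, 4019.38625),
    "IALNS-WOA": (3194.65192, 4338.738),
    "IALNS-PSO": (3063.74593, 4147.02121),
    "IALNS-ACO": (3543.66885, 6603.29729),
}

dist_ref, fit_ref = RESULTS["IALNS-IWOA"]
for name, (dist, fit) in RESULTS.items():
    if name == "IALNS-IWOA":
        continue
    print(f"{name}: distance -{relative_reduction(dist, dist_ref):.2f}%, "
          f"fitness -{relative_reduction(fit, fit_ref):.2f}%")
```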

4.2.3. Extended Experiments Under Varying Mission Scenarios

To further evaluate the scalability, adaptability, and robustness of the proposed IALNS-IWOA algorithm, this study conducted two additional simulation experiments under mission scenarios of varying complexity. These extended cases aim to validate the performance of the algorithm when the number of mission points, the presence of obstacles, and the number of UAVs required are varied. The generalization capability of the two-layer planning framework proposed in this paper is demonstrated by analyzing both a lightly loaded scenario and a complex, obstacle-rich scenario.

The lightweight experiment was conducted first. In this minimal configuration, five mission points were randomly placed in a flat 3D environment of size 500 × 500 × 300, with no terrain obstacles or no-fly zones. The mission point information is given by mission points 1 to 5 in Table 4. The payload and flight parameters of the UAV are consistent with the main experiment. The experimental environment is shown in Figure 14.

From Equation (16), two UAVs are required in this experiment, and the algorithm assigns the tasks to the two UAVs after completing the task evaluation and trajectory planning. The planned trajectories are shown in Figure 15.

The flight trajectories generated by IALNS-IWOA are compact and spatially separated from each other, which ensures efficient mission coverage and minimal overlap. The resulting total fitness value and the missions performed by each UAV are shown in Table 8.

The results of this experiment show that the IALNS-IWOA algorithm can efficiently determine balanced assignments and short and feasible flight paths without unnecessary complexity, even in the case of sparse tasks.

The second is the complexity experiment, in which the complexity of the environment and the density of the tasks were increased. Twenty task points were distributed in the 3D space: task points 1 to 15 from Table 4, plus five additional task points whose information is shown in Table 9. Meanwhile, three cylindrical no-fly zones were used to simulate a complex environment with dense obstacles: cylinders 1 and 2 from Table 3, plus one additional cylinder whose information is shown in Table 10. This experimental environment is shown in Figure 16.

The algorithm automatically calculates that four UAVs are required for optimal coverage, owing to the cumulative mission requirements and the spatial complexity of the layout. The planned trajectories are shown in Figure 17. As can be seen from the figure, the trajectories avoid the no-fly zones while maintaining reasonable distances and smooth transitions. The UAV task assignments, coverage task sequences, total distance flown, and final fitness values are detailed in Table 11. Despite the increased difficulty, the IALNS-IWOA framework maintains the feasibility of the trajectories, achieves good load balancing, and demonstrates strong coordination of task assignments under complex constraints.
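The fleet sizes used in the three experiments are consistent with a simple capacity bound. Equation (16) is not reproduced in this section, so the sketch below assumes it amounts to the total demand divided by the payload capacity, rounded up; with the demands from Tables 4 and 9 and a capacity of 110, this assumption yields three, two, and four UAVs for the main, lightweight, and complexity experiments, matching the reported numbers.

```python
import math

CAPACITY = 110  # maximum payload of a single UAV

# Demands of task points 1-15 (Table 4) and 16-20 (Table 9).
DEMANDS_1_TO_15 = [40, 10, 40, 10, 20, 10, 40, 30, 10, 5, 17, 3, 16, 23, 31]
DEMANDS_16_TO_20 = [12, 25, 28, 14, 9]

def min_uav_count(demands, capacity=CAPACITY):
    """Lower bound on fleet size: total demand divided by capacity, rounded up."""
    return math.ceil(sum(demands) / capacity)

print(min_uav_count(DEMANDS_1_TO_15))                     # main experiment
print(min_uav_count(DEMANDS_1_TO_15[:5]))                 # lightweight experiment
print(min_uav_count(DEMANDS_1_TO_15 + DEMANDS_16_TO_20))  # complexity experiment
```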

These two additional experiments further demonstrate the effectiveness and flexibility of the proposed IALNS-IWOA framework, which adapts to varying numbers of UAVs and task points and maintains high-quality solutions in both obstacle-free and obstacle-rich environments. Its ability to balance task allocation, satisfy constraints, and generate smooth, feasible trajectories under diverse mission conditions highlights its robustness and practical applicability in real-world UAV planning scenarios. In summary, the IALNS-IWOA algorithm consistently achieves efficient task allocation and trajectory optimization across both sparse and densely constrained UAV scenarios.

The computational scalability and resource efficiency of the proposed method are further illustrated in Figure 18, which compares the computation time and total memory usage of the IALNS-IWOA framework with those of the other variants (IALNS-ACO, IALNS-PSO, and IALNS-WOA) for the three experimental cases in this paper.

To ensure consistency, each experiment was repeated three times and the mean values are reported. Figure 18 shows that both the computation time and the memory used increase with the number of task points. In terms of the overall trend, IALNS-PSO and IALNS-WOA take the least time, followed by IALNS-IWOA and then IALNS-ACO. The additional time cost of IALNS-IWOA is considered acceptable given the improvement in its solution accuracy.

5. Future Research Directions

Future work will prioritize extending the proposed framework to dynamic and uncertain environments, which requires addressing issues such as computational cost and reconstruction of the mathematical model. In real UAV deployments, tasks may be generated in real time or vary rapidly due to unexpected events such as weather conditions and urgent needs. Integrating real-time data-driven scheduling and event-triggered replanning will increase the flexibility of the system. Another promising direction is to extend the model to heterogeneous fleets of UAVs with different speeds, capacities, and endurance. This will require redesigning the tasking strategy to be capacity-aware and adaptable to UAV-specific constraints. From an engineering perspective, we also plan to deploy the algorithm on an embedded platform such as an onboard edge processor and validate its performance through hardware-in-the-loop simulations or small-scale flight experiments. These steps will bridge the gap between theoretical modeling and actual deployment, bringing the system closer to practical applications. In addition, an important future direction is to improve the modeling of obstacles and terrain within the framework. This study used convex shapes such as cylinders or curves to simplify computations, but real-world environments often have concave or irregular obstacles that can lead to local minima in path planning [36,37]. Future research will work on integrating more complex obstacle geometries and evaluating the robustness of the proposed framework under such realistic and challenging conditions.

6. Conclusions

The experimental results validate the robustness and effectiveness of the proposed IALNS-IWOA framework. Compared with baseline algorithms such as IALNS-PSO, IALNS-WOA, and IALNS-ACO, our method consistently achieves better convergence behavior and lower overall fitness. This performance gain is attributed to the dual-layer coordination: IWOA improves global trajectory search, while IALNS efficiently adjusts task assignments with adaptive operators. The algorithm maintains strong generalization capability across scenarios with different numbers of UAVs, task points, and obstacle configurations. This confirms its potential for deployment in practical UAV mission environments such as urban logistics, post-disaster mapping, and dynamic area surveillance.

Despite its advantages, the proposed method has certain limitations. First, all simulation scenarios assume static task locations and predefined obstacle regions, which may differ from dynamic real-world missions. Second, the current model handles only homogeneous UAVs with uniform payload and flight parameters. Scalability to large-scale missions beyond 50 task points may also require further optimization, such as parallel computing or heuristic pruning.

In summary, this study proposed a dual-layer UAV trajectory planning framework that integrates improved whale optimization (IWOA) and adaptive large neighborhood search (IALNS) to solve the MDRP-TW problem. The model incorporates time windows, no-fly zones, altitude, payload constraints, and environmental obstacles. Extensive experiments confirmed that the proposed method is competitive, adaptive, and applicable to varied UAV scheduling tasks.

Author Contributions

Writing—review and editing, Y.Y.; writing—original draft preparation, Y.F.; visualization, R.X.; investigation, W.F.; data curation, K.X. All authors have read and agreed to the published version of the manuscript.

Funding

This study has been supported by the Open Fund Project of the Key Laboratory of Civil Aviation Flight Technology and Flight Safety (Grant No. FZ2021ZZ06). This work has also been supported by Central University Basic Research Projects (Grant No. 24CAFUC04002) and the Sichuan Provincial Engineering Research Center for Smart Operation and Maintenance of Civil Aviation Airports (Grant No. JCZX2024ZZ25 and Grant No. JCZX2023ZZ07). This work has also been supported by the Sichuan Flight Engineering Technology Research Center Project (Grant No. GY2024-30D) and the Student Innovation and Entrepreneurship Training Program (Project Name: Research on critical technologies for multi-drone distribution in mountainous and complex environments).

Data Availability Statement

The original contributions presented in the manuscript are included in the article; any further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. The Chinese Communist Party (CCP) Central Committee; The State Council. China’s Foreign Economic Relations and Trade Bulletin. In Outline of the Strategic Plan for Expanding Domestic Demand (2022–2035); The Central Committee of the Communist Party of China and the State Council: Beijing, China, 2023; Volume 17, pp. 3–17.
2. Li, H.; Wang, T.; Du, X. Analysis of collaborative navigation algorithms for multi-UAV swarm. Tactical Missile Technol. 2024, 6, 118–126.
3. Fang, K.; An, Y.; Zhu, N.; Huang, D. The vehicle routing problem with drone stations. J. Manag. Sci. China 2025, 28, 61–76.
4. Zhao, W.; Bian, X.; Mei, X. An Adaptive Multi-Objective Genetic Algorithm for Solving Heterogeneous Green City Vehicle Routing Problem. Appl. Sci. 2024, 14, 6594.
5. Pan, C. Research on Vehicle Routing Problem Based on Deep Reinforcement Learning. In Proceedings of the 2023 8th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 21–23 April 2023; pp. 2116–2119.
6. Huang, N.; Zhu, J.; Zhu, W.; Qin, H. The multi-trip vehicle routing problem with time windows and unloading queue at depot. Transp. Res. Part E Logist. Transp. Rev. 2021, 152, 102370.
7. Christian, M.M.F.; Alexander, J.; Markus, F.; Rainer, K. The vehicle routing problem with time windows and flexible delivery locations. Eur. J. Oper. Res. 2023, 308, 1142–1159.
8. Liu, T.; Xu, W.; Wu, Q. Modeling of Multi-vehicle Route Searching with Soft Time Windows Under Sudden-onset Disaster. J. Tongji Univ. (Nat. Sci.) 2012, 40, 109.
9. Zhang, L.; Wang, J.; Liu, X. Deep Reinforcement Learning for Dynamic Multi-UAV Path Planning in Urban Environments. IEEE Trans. Autom. Sci. Eng. 2024, 21, 712–725.
10. Li, Y.; Chen, M.; Xu, J. A Hybrid GA-PSO Algorithm for Real Time Multi Drone Task Scheduling with Time Windows. J. Intell. Robot. Syst. 2023, 101, 34.
11. Chen, Q.; Zhao, S.; Huang, H. Distributed Consensus Based Coordination for Heterogeneous UAV Swarms under Communication Constraints. Int. J. Robot. Res. 2025, 44, 58–75.
12. Chen, R.; Li, J.; Chen, Y.; Huang, Y. A Distributed Scheduling Method for Networked UAV Swarm based on Computing for Communication. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 11071–11078.
13. Kim, S.; Lee, H. Real time Obstacle Avoidance for Multi UAV Systems Using Graph Neural Networks. J. Field Robot. 2024, 41, 295–312.
14. Sato, Y.; Tanaka, K.; Watanabe, M. Multi Agent Reinforcement Learning and Game Theoretic Co-ordination for UAV Airspace Sharing. In Proceedings of the International Conference on Intelligent Unmanned Systems, Changzhou, China, 18–19 September 2025; Volume 12, pp. 88–97.
15. Jonas, W.; Julius, R.; Marija, P. Multi-UAV Adaptive Path Planning Using Deep Reinforcement Learning. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 649–656.
16. Tang, J.; Liang, Y.; Li, K. Dynamic Scene Path Planning of UAVs Based on Deep Reinforcement Learning. Drones 2024, 8, 60.
17. Li, Y.; Hamid, A.A.; Dong, D. Path Planning for Cellular-Connected UAV: A DRL Solution With Quantum-Inspired Experience Replay. IEEE Trans. Wirel. Commun. 2022, 21, 7897–7912.
18. Li, Y.; Aghvami, A.H.; Dong, D. Intelligent Trajectory Planning in UAV-Mounted Wireless Networks: A Quantum-Inspired Reinforcement Learning Perspective. IEEE Wirel. Commun. Lett. 2021, 10, 1994–1998.
19. Xu, Y.; Wei, Y.; Jiang, K.; Wang, D.; Deng, H. Multiple UAVs Path Planning Based on Deep Reinforcement Learning in Communication Denial Environment. Mathematics 2023, 11, 405.
20. Hans, H.; Dirk, H.; Felix, W.; Rank, K. Quantum Deep Reinforcement Learning for Robot Navigation Tasks. IEEE Access 2022, 12, 87217–87236.
21. Zheng, L.; Li, J.; Wang, Y. Multi-UAV Autonomous Obstacle Avoidance Based on Reinforcement Learning. In Proceedings of the Chinese Control Conference, Tianjin, China, 24–26 July 2023; pp. 8657–8661.
22. Liu, X.; Du, X.; Zhang, X.; Zhu, Q.; Mohsen, G. Evolutionary computation for unmanned aerial vehicle path planning: A survey. Artif. Intell. Rev. 2024, 57, 267.
23. Feng, O.; Zhang, H.; Tang, W.; Wang, F.; Feng, D.; Zhong, G. Digital Low-Altitude Airspace Unmanned Aerial Vehicle Path Planning and Operational Capacity Assessment in Urban Risk Environments. Drones 2025, 9, 320.
24. Merei, A.; Mcheick, H.; Ghaddar, A.; Rebaine, D. A Survey on Obstacle Detection and Avoidance Methods for UAVs. Drones 2025, 9, 203.
25. Guo, J.; Gan, M.; Hu, K. Cooperative Path Planning for Multi-UAVs with Time-Varying Communication and Energy Consumption Constraints. Drones 2024, 8, 654.
26. Lei, Q.; Gao, Y.; Zhou, Y.; Wu, Z. Multi-delivery option path planning based on improved ALNS algorithm. Syst. Eng. Electron. 2025, 47, 173–181.
27. Wang, X.; Zhang, Q.; Jiang, S.; Dong, Y. Dynamic UAV path planning based on modified whale optimization algorithm. J. Comput. Appl. 2025, 45, 928–936.
28. Cherfi, S.; Boulaiche, A.; Lemouari, A. Exploring the ALNS method for improved cybersecurity: A deep learning approach for attack detection in IoT and IIoT environments. Internet Things 2024, 28, 101421.
29. Lu, Z.; Wu, K.; Bai, E.; Li, Z. Optimization of Multi-Vehicle Cold Chain Logistics Distribution Paths Considering Traffic Congestion. Symmetry 2025, 17, 89.
30. Li, J.; Cui, W.; Kong, X. DMR Kmeans: Identifying Differentially Methylated Regions Based on k-means Clustering and Read Methylation Haplotype Filtering. Curr. Bioinform. 2024, 19, 490–501.
31. Xi, F.; Lin, F. Research on dynamic collaborative path planning combining simulated annealing algorithm and genetic algorithm. Ship Sci. Technol. 2024, 46, 161–164.
32. Yang, Y.; Fu, Y.; Lu, D.; Xu, K. Three-dimensional unmanned aerial vehicle trajectory planning based on the improved whale optimization algorithm. Symmetry 2024, 16, 1561.
33. Li, Z.; Xu, X. Journal of Safety and Environment. J. Saf. Environ. 2025, 25, 237–249.
34. Xu, J.; Shang, S.; Wang, W. Statics Analysis and Optimization Design of Heavy Load Agricultural UAV. J. Agric. Mech. Res. 2023, 45, 16–23.
35. Ma, X.; Zhang, J. Characteristic Gene Analysis of Mini-tiller Product Family Based on AHP and SPSS. J. Agric. Mech. Res. 2024, 46, 34–38.
36. Phadke, A.; Medrano, F.A.; Chu, T.; Sekharan, C.N.; Starek, M.J. Modeling Wind and Obstacle Disturbances for Effective Performance Observations and Analysis of Resilience in UAV Swarms. Aerospace 2024, 11, 237.
37. Pham, H.; Bestaoui, Y.; Mammar, S. Aerial robot coverage path planning approach with concave obstacles in precision agriculture. In Proceedings of the 2017 Workshop on Research, Education and Development of Unmanned Aerial Systems (RED-UAS), Linköping, Sweden, 3–5 October 2017; pp. 43–48.

Figure 1. Schematic diagram of the UAV trajectory, collision avoidance buffer, and cylindrical no-fly zone model. The UAV flight path is represented as a series of 3D points $(x_{i}, y_{i}, z_{i})$, while the red ellipsoid denotes the safety buffer for collision avoidance. The right-side vertical cylinder illustrates a no-fly zone with center $(x_{o}, y_{o})$ and radius $r$. Distances $d_{1}$, $d_{2}$, and $d_{i}$ represent the UAV’s proximity to restricted airspace boundaries and are used in constraint formulations.

Figure 2. Schematic of the spacing between the two drones.

Figure 3. Overall structure of the proposed dual-layer IALNS-IWOA framework for multi-UAV trajectory planning.

Figure 4. Flow chart of IALNS.

Figure 5. K-means clustering diagram.

Figure 6. IWOA flow chart.

Figure 7. Schematic diagram of the terrain environment.

Figure 8. Spatial distribution of mission points and depot locations over 3D terrain. (a) Side view of the terrain surface showing altitude changes and UAV start points. (b) Top view showing the 2D distribution of the 15 mission points on the terrain heat map. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. Black square icons indicate depots and blue circles indicate numerically labeled mission points.

Figure 9. Flight path planning diagram. Trajectory planning for the MDRP-TW problem under the IALNS-ACO algorithm. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. The black square indicates the distribution centre, the starting point of the UAV. The blue circle indicates the mission point to be visited. Pink cylinders indicate no-fly zones. Red, black and green lines indicate different UAV flight paths connecting the mission points in a specified order.

Figure 10. Flight path planning diagram. Trajectory planning for the MDRP-TW problem under the IALNS-PSO algorithm. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. The black square indicates the distribution centre, the starting point of the UAV. The blue circle indicates the mission point to be visited. Pink cylinders indicate no-fly zones. Red, black and green lines indicate different UAV flight paths connecting the mission points in a specified order.

Figure 11. Flight path planning diagram. Trajectory planning for the MDRP-TW problem under the IALNS-WOA algorithm. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. The black square indicates the distribution centre, the starting point of the UAV. The blue circle indicates the mission point to be visited. Pink cylinders indicate no-fly zones. Red, black and green lines indicate different UAV flight paths connecting the mission points in a specified order.

Figure 12. Flight path planning diagram. Trajectory planning for the MDRP-TW problem under the IALNS-IWOA algorithm. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. The black square indicates the distribution centre, the starting point of the UAV. The blue circle indicates the mission point to be visited. Pink cylinders indicate no-fly zones. Red, black and green lines indicate different UAV flight paths connecting the mission points in a specified order.

Figure 13. Convergence comparison of different dual-layer algorithms (IALNS-IWOA, IALNS-WOA, IALNS-PSO, IALNS-ACO) over 300 iterations, evaluated by total fitness.

Figure 14. Lightweight experimental simulation environment diagram.

Figure 15. Lightweight experimental drone trajectory planner. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. The black square indicates the distribution centre, the starting point of the UAV. Blue circles indicate the mission points to be visited. The red and black lines indicate the different UAV flight paths connecting the mission points in a specified order.

Figure 16. Complexity experimental simulation environment diagram. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. The black square indicates the distribution centre, the starting point of the UAV. The blue circle indicates the mission point to be visited. Pink cylinders indicate no-fly zones.

Figure 17. Complexity experimental drone trajectory planner. The surface map shows the terrain elevation, with the surface colour changing from blue to yellow as the height increases. The black square indicates the distribution centre, the starting point of the UAV. The blue circle indicates the mission point to be visited. Pink cylinders indicate no-fly zones. The red, black, blue and green lines indicate the different UAV flight paths connecting the mission points in a specified order.

Figure 18. Algorithm performance demonstration chart.

Table 1. Comparison of representative UAV path planning methods.

Methodology	Traditional Algorithms	Deep RL-Based Path Planning	IALNS-IWOA Dual-Layer
Advantages	Simple and efficient, suitable for static environments, easy to model; widely studied, e.g., A*, GA, PSO	Suitable for complex environments, adaptive and supportive of online updates, new technologies	Simultaneous consideration of global task assignment and local path optimization; flexible constraint handling, module expandability
Limitations	Difficulty in adapting to high-dimensional dynamic environments; lack of real-time feedback mechanisms and synergies	High learning costs, inefficient samples, difficult to interpret strategy results, not easy to migrate	Dependent on preprocessing path cost matrix, lack of real-time feedback mechanisms and synergies

Table 2. Description of variables and parameters used in the MDRP-TW optimization model.

Symbol	Explanation
$L_{i}$	Euclidean distance between the $i t h$ and $(i + 1) t h$ waypoints of the UAV
$R_{i j}$	Constraint from the $i t h$ waypoint to the $j t h$ no-fly zone
$h_{i}$	Flight altitude of the UAV at the $i t h$ waypoint
$h_{m a x}$ $h_{\min}$	Maximum and minimum flight altitudes of the UAV
$δ_{i}$	Turning angle
$γ_{i}$	Pitch angle
$d_{p, q}$	Distance between the $p t h$ and $q t h$ UAVs
$t_{i}$	Time when the UAV arrives at the $i t h$ task point
$T ω_{i, 1}$ $T ω_{i, 2}$	Left and right time windows of the $i t h$ task point
$q_{i}$	Task volume of the $i t h$ task point
$P_{c}$	Maximum task volume that the UAV can complete

Table 3. Data information table of the no-fly zone.

Zone Number	Center of the Bottom Circle	Radius
Cylinder no-fly zone 1	(250,370)	40
Cylinder no-fly zone 2	(140,250)	35

Table 4. Detailed attributes of task points including coordinates, demands, and time windows.

Point Number	X-Coordinate	Y-Coordinate	Demand	Left Time Window	Right Time Window	Service Time
Distribution Center	25	30	/	0	1260	/
1	50	50	40	480	945	20
2	380	50	10	456	900	20
3	50	450	40	48	225	20
4	450	220	10	432	855	20
5	250	250	20	16	90	20
6	100	100	10	384	780	20
7	120	110	40	128	300	20
8	130	120	30	176	405	20
9	300	100	10	368	750	20
10	300	120	5	240	495	20
11	400	350	17	312	660	20
12	420	360	3	392	825	20
13	150	380	16	72	225	20
14	80	420	23	344	753	20
15	480	80	31	272	600	20

Table 5. Test functions.

Category	Function Name	Expression	Theoretical Optimal Value
Unimodal Test Function	Sphere Function	$F_{1} (x) = \sum_{i = 1}^{n} x_{i}^{2}$	0
Unimodal Test Function	Quartic Function	$F_{2} (x) = \sum_{i = 1}^{n} [x_{i}^{2} - 10 \cos (2 π x_{i}) + 10]$	0
Multimodal Test Function	Ackley’s Function	$\begin{array}{l} F_{3} (x) = 0.1 \{\sin^{2} (3 π x_{1}) + \sum_{i = 1}^{n} {(x_{i} - 1)}^{2} [1 + \sin^{2} (3 π x_{i + 1})] + {(x_{n} - 1)}^{2} [1 + \sin^{2} (2 π x_{n})]\} \\ + \sum_{i = 1}^{n} u (x_{i}, 5, 100, 4) \end{array}$	0
Fixed-dimension Multimodal Test Function	Branin Function	$F_{4} (x) = {(x_{2} - \frac{5.1}{4 π^{2}} x_{1}^{2} + \frac{5}{π} x_{1} - 6)}^{2} + 10 (1 - \frac{1}{8 π}) \cos x_{1} + 10$	0.39788735
Fixed-dimension Multimodal Test Function	Kowalik Function	$F_{5} (x) = \sum_{i = 1}^{11} {[a_{i} - \frac{x_{1} (b_{i}^{2} + b_{i} x_{2})}{b_{i}^{2} + b_{i} x_{3} + x_{4}}]}^{2}$	0.0003075

Table 6. Performance comparison of IWOA, PSO, ACO, and standard WOA on benchmark test functions (the bold is to highlight the performance of different algorithms in the test function).

Test Function	Data Type	ACO	PSO	WOA	IWOA
$F_{1}$	Mean Value	100,222.22	12,324.9345	0.6090	0.00077
$F_{1}$	Best Value	100,222.22	3874.1177	0.05230	1.64983 × 10⁻⁵
$F_{2}$	Mean Value	5,605,894,559	202,601,216.1	33.5324	0.119037
$F_{2}$	Best Value	4,579,640,494	37,290,487.49	1.2149	0.01651
$F_{3}$	Mean Value	21.7181	19.9999	20.4272	20.27560
$F_{3}$	Best Value	21.7180	19.9999	20.1486	20.05561
$F_{4}$	Mean Value	15.9554	1.1616	0.6112	0.655569
$F_{4}$	Best Value	14.5972	0.3986	0.3979	0.3979
$F_{5}$	Mean Value	0.1170	0.0063	0.0024	0.0013
$F_{5}$	Best Value	0.0156	0.0011	0.0010	0.0009

Table 7. Optimal task allocation, flight distance, and total fitness values for different dual-layer algorithms in UAV path planning. This table summarizes the optimal results obtained from four dual-layer algorithms: IALNS-IWOA, IALNS-WOA, IALNS-PSO, and IALNS-ACO. For each algorithm, the task sequences, flight distances, task volumes, total distances, and overall fitness values are presented across three UAVs.

Algorithm	Drone	Task Points	Flight Distance	Task Volume	Total Flight Distance	Total Fitness
IALNS-IWOA	1	0-8-7-1-0	461.30792	110	2961.5315	4019.38625
IALNS-IWOA	2	0-5-13-3-14-0	1054.25558	99	2961.5315	4019.38625
IALNS-IWOA	3	0-10-9-2-15-4-12-11-6-0	1445.96800	96	2961.5315	4019.38625
IALNS-WOA	1	0-13-3-14-10-9-0	1227.17871	94	3194.65192	4338.738
IALNS-WOA	2	0-7-8-1-0	462.48449	110	3194.65192	4338.738
IALNS-WOA	3	0-5-11-12-4-15-2-6-0	1504.98872	101	3194.65192	4338.738
IALNS-PSO	1	0-7-8-1-0	462.48449	110	3063.74593	4147.02121
IALNS-PSO	2	0-5-13-3-14-0	1054.25558	99	3063.74593	4147.02121
IALNS-PSO	3	0-9-10-2-15-4-12-11-6-0	1547.00586	96	3063.74593	4147.02121
IALNS-ACO	1	0-7-1-6-0	566.87832	90	3543.66885	6603.29729
IALNS-ACO	2	0-3-13-14-8-0	1238.45150	109	3543.66885	6603.29729
IALNS-ACO	3	0-5-10-2-9-15-4-12-11-0	1738.33903	106	3543.66885	6603.29729
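
The relative gain of IALNS-IWOA over IALNS-WOA can be recomputed from the totals in Table 7. The sketch below assumes the improvement is measured as the relative reduction with respect to the IALNS-WOA value; the helper name `improvement_pct` is illustrative. The two results appear to correspond to the 7.30% and 7.36% figures quoted in the abstract.

```python
def improvement_pct(baseline: float, improved: float) -> float:
    """Relative reduction of `improved` versus `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


# Totals copied from Table 7.
dist_woa, dist_iwoa = 3194.65192, 2961.5315   # total flight distance
fit_woa, fit_iwoa = 4338.738, 4019.38625      # total fitness

print(f"{improvement_pct(dist_woa, dist_iwoa):.2f}%")  # 7.30%
print(f"{improvement_pct(fit_woa, fit_iwoa):.2f}%")    # 7.36%
```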

Table 8. UAV flight path planning results for the lightweight experiment.

Algorithm	Drone	Task Points	Flight Distance	Task Volume	Total Flight Distance	Total Fitness
IALNS-IWOA	1	0-3-1-0	852.76203	80	1973.14747	998.83682
IALNS-IWOA	2	0-5-4-2-0	1120.38544	40	1973.14747	998.83682

Table 9. Attributes of the additional task points.

Point Number	X-Coordinate	Y-Coordinate	Demand	Left Time Window	Right Time Window	Service Time
16	280	180	12	256	540	20
17	350	260	25	144	330	20
18	190	300	28	320	660	20
19	410	140	14	192	420	20
20	200	230	9	400	810	20

Table 10. Parameters of the added no-fly zone.

Center of the Bottom Circle	Radius
(400,100)	20
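
One simple way to test a straight flight leg against the no-fly zone in Table 10 is a point-to-segment distance check in the horizontal plane. The sketch below is an illustration under stated assumptions, not the paper's constraint handling: it ignores altitude (so a leg passing above the cylinder is still flagged) and treats each leg as a straight line. The function name `segment_hits_circle` is hypothetical; the endpoints are task points 2, 15 (Table 4) and 19 (Table 9).

```python
import math


def segment_hits_circle(a, b, center, radius):
    """True if the 2D segment a-b passes within `radius` of `center`."""
    ax, ay = a
    bx, by = b
    cx, cy = center
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project the center onto the segment and clamp to its endpoints.
        t = ((cx - ax) * dx + (cy - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
    px, py = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy) <= radius


NO_FLY = ((400, 100), 20)  # center and radius from Table 10
print(segment_hits_circle((380, 50), (410, 140), *NO_FLY))  # True  (point 2 -> point 19)
print(segment_hits_circle((380, 50), (480, 80), *NO_FLY))   # False (point 2 -> point 15)
```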

Table 11. UAV flight path planning results for the extended experiment with the additional task points.

Algorithm	Drone	Task Points	Flight Distance	Task Volume	Total Flight Distance	Total Fitness
IALNS-IWOA	1	0-8-7-6-0	372.88136	80	4130.72609	1953.13468
IALNS-IWOA	2	0-5-19-17-11-12-16-10-0	1366.03646	96	4130.72609	1953.13468
IALNS-IWOA	3	0-13-3-14-18-0	1085.99370	107	4130.72609	1953.13468
IALNS-IWOA	4	0-9-2-15-4-20-1-0	1305.81457	110	4130.72609	1953.13468
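
The allocation in Table 11 can be sanity-checked with a short script: every task point from 1 to 20 should be served exactly once, and the per-drone distances should add up to the reported total. The sketch below copies the routes and distances from the table; the names `ROUTES`, `ROUTE_DISTANCES`, and `visited_points` are illustrative.

```python
# Routes and per-drone flight distances copied from Table 11.
ROUTES = ["0-8-7-6-0", "0-5-19-17-11-12-16-10-0",
          "0-13-3-14-18-0", "0-9-2-15-4-20-1-0"]
ROUTE_DISTANCES = [372.88136, 1366.03646, 1085.99370, 1305.81457]


def visited_points(routes):
    """Return all non-depot point numbers, in visiting order, across routes."""
    return [int(p) for r in routes for p in r.split("-") if p != "0"]


visited = visited_points(ROUTES)
# Each of the 20 task points should appear exactly once.
print(sorted(visited) == list(range(1, 21)))  # True
# The per-drone distances should sum to the reported total of 4130.72609.
print(abs(sum(ROUTE_DISTANCES) - 4130.72609) < 1e-6)  # True
```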

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
