A Comprehensive Review of Path-Planning Algorithms for Multi-UAV Swarms
Highlights
- A scenario-conditioned taxonomy (mission × environment dynamics) is established for UAV swarms, mapping centralized, decentralized, and hybrid planners to nine common settings with summary tables and quantitative evidence.
- Cross-scenario trade-offs among responsiveness, safety, scalability, and energy are synthesized, identifying best-fit choices and typical failure modes under real-world constraints (limited computing, unstable links, imperfect sensing).
- Deployment-oriented guidance is provided for selecting planners under given mission constraints and resources, together with a concise evaluation checklist that enables reproducible comparisons.
- Near-term R&D priorities are outlined: adaptive planners on resource-constrained platforms; scalable multi-objective planning with safety guarantees; sim-to-real benchmarks and digital twins; energy-aware hierarchical planning; and coupling offline pre-planning with online replanning.
Abstract
1. Introduction
- (1) A scenario-conditioned taxonomy for multi-UAV swarm planning is established and applied consistently across the paper. The taxonomy covers twelve mission–planning–environment cells, nine of which are populated by recent work.
- (2) For each populated cell, representative algorithm families are summarized together with their reported limitations, and cross-scenario trade-offs (e.g., scalability versus energy efficiency, centralized versus decentralized or hybrid architectures) are discussed using a common set of evaluation lenses.
- (3) The review provides deployment-oriented guidance by linking offline and online planning through architecture choices, digital-twin-based validation, and safety-aware collision avoidance, and by outlining scenario-specific research directions for disaster response, public-safety surveillance, urban logistics, and other application domains.

The structure of this paper is as follows. Section 2 defines the classification criteria and method families. Section 3 reviews the algorithms across the nine scenarios. Section 4 discusses cross-cutting issues (architecture selection, digital twins, collision avoidance). Section 5 synthesizes trade-offs and scenario-conditioned guidance. Section 6 formulates open scientific problems. Section 7 concludes.
2. Classification and Analysis of Path-Planning Algorithms for Multi-UAV Swarms
2.1. Classification Criteria
2.1.1. Mission Types
- Path Missions: Path missions require multiple UAVs to launch from the same or different sites and then either converge on a common target or maintain designated relative formations during flight. For instance, roundup and surveillance operations typically demand convergence at the target to guarantee mission effectiveness. Path missions commonly optimize either minimal flight time or shortest path length to maximize execution efficiency.
- Distribution Missions: Distribution missions involve UAVs departing from the same or different launch points and navigating to distinct spatial locations to execute independent tasks. In transport and target-strike operations, UAV swarms must match mission demands with available vehicles, optimizing both allocation and routing to maximize efficiency and accuracy. The core challenge lies in developing algorithms with sufficient computational efficiency and real-time responsiveness to accommodate dynamic distribution requirements.
- Coverage Missions: Coverage missions feature uncertain objectives; search and reconnaissance operations, for example, require UAV swarms to explore every point within the designated area. During execution, planners must minimize mission cost while avoiding omissions of critical regions and redundant coverage of low-value zones. Because coverage environments are typically cluttered and dynamic, algorithms must offer high computational efficiency and real-time adaptability.
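As noted later in Section 2.2, coverage of a known area is often reduced to a deterministic sweep. A minimal lawnmower (boustrophedon) sketch, with the grid dimensions purely illustrative:

```python
def boustrophedon(rows, cols):
    """Lawnmower sweep visiting every cell of a rows x cols area exactly once.

    Alternating row directions avoid redundant transits back to the row start,
    addressing the 'no omissions, no redundant coverage' requirement above.
    """
    path = []
    for r in range(rows):
        cells = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in cells)
    return path
```

Every cell is visited once and consecutive waypoints are always adjacent, so the sweep length equals the cell count; real planners add cell decomposition around obstacles on top of this core.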
2.1.2. Planning Methods
- Online Planning: Online planning dynamically revises a UAV swarm’s flight path during execution based on real-time environmental data and mission updates. As environmental or mission conditions evolve, the path is continually replanned. Consequently, online-planning algorithms must deliver high real-time performance and robust stability to cope with dynamic scenarios.
- Offline Planning: By contrast, offline planning pre-computes a UAV swarm’s flight path using known environmental data and mission requirements before deployment. Because the path remains fixed during execution, offline-planning algorithms must guarantee global optimality to ensure efficient mission completion.
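The offline/online distinction can be made concrete with a toy one-dimensional corridor; the world model, planner, and sensing callback below are all hypothetical illustrations, not any surveyed algorithm:

```python
def plan(start, goal, blocked):
    """Greedy straight-line plan that climbs (+1 'altitude') over blocked cells."""
    path, x = [], start
    while x != goal:
        x += 1 if goal > x else -1
        path.append((x, 1 if x in blocked else 0))
    return path

def offline_mission(start, goal, known_blocked):
    # Offline: plan once from the pre-flight map; the path stays fixed in flight.
    return plan(start, goal, known_blocked)

def online_mission(start, goal, known_blocked, sense):
    # Online: after each step, fold newly sensed obstacles into the map
    # and replan the remaining segment.
    blocked, pos, executed = set(known_blocked), start, []
    while pos != goal:
        step = plan(pos, goal, blocked)[0]
        pos = step[0]
        executed.append(step)
        blocked |= sense(pos)  # new information arrives mid-flight
    return executed
```

An obstacle that appears after launch is missed by the offline path but handled by the online loop, which is exactly the real-time-performance requirement stated above; the price is continual replanning cost.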
2.1.3. Environment Types
- Static Environments: A static environment features fixed obstacles and terrain throughout the planning process. Under such conditions, UAV swarms may acquire complete environmental information and pre-plan paths before deployment. Because frequent real-time computation is unnecessary, algorithms for static environments can trade computation time for solution quality, generating globally optimal routes offline at acceptable cost.
- Dynamic Environments: A dynamic environment contains moving obstacles or time-varying conditions (e.g., other aircraft or emerging threats). In such settings, UAV swarms must perceive environmental changes and replan paths in real time; hence, algorithms for dynamic environments require high real-time performance and flexibility to cope with complex conditions.
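Combining the three criteria yields the cell labels used throughout Section 3 (e.g., PND = Path, oNline, Dynamic). A short sketch enumerating the 3 × 2 × 2 taxonomy, with the letter scheme inferred from the section titles:

```python
from itertools import product

# First letter = mission (Path / Distribution / Coverage),
# second = planning method (oNline / oFfline),
# third = environment (Static / Dynamic).
MISSIONS = {"P": "Path", "D": "Distribution", "C": "Coverage"}
PLANNING = {"N": "Online", "F": "Offline"}
ENVIRONMENT = {"S": "Static", "D": "Dynamic"}

def taxonomy_cells():
    """Enumerate all twelve mission-planning-environment cells."""
    return ["".join(c) for c in product(MISSIONS, PLANNING, ENVIRONMENT)]

# Per Section 2.2, three online-static cells are unpopulated in recent work.
EMPTY = {"PNS", "DNS", "CNS"}
POPULATED = [c for c in taxonomy_cells() if c not in EMPTY]
```

This reproduces the paper's count: twelve cells in total, nine populated once the online-static cells (PNS, DNS, CNS) are removed.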
2.2. Algorithm Classification
- (1) This review systematically surveys domestic and international literature from the past two years on UAV-swarm path-planning algorithms and summarizes the eligible algorithms according to the proposed classification criteria.
- (2) The absence of algorithms in some categories (e.g., PNS, DNS, CNS) does not imply that these scenarios are unsolvable; the gaps arise for the following reasons:
- Marginal benefit vs. cost. In static, fully known maps, continuous re-planning provides little improvement over a high-quality offline plan, yet incurs persistent onboard computing, communication, and energy overhead; most pipelines therefore pre-compute (PFS/DFS/CFS) and keep only a lightweight safety layer online.
- Problem reformulation. PNS typically degenerates to offline global routing with reactive near-field safety (e.g., APF/MPC) rather than full online re-planning. DNS is commonly handled as offline static scheduling (DFS), with periodic batch updates if tasks arrive incrementally; once meaningful exogenous changes appear, it is re-cast as DND. CNS reduces to deterministic coverage path planning (CFS) via decomposition or spanning-tree methods, since the environment does not change; adding online decision making brings little gain without dynamics.
- Benchmarking and applicability. “Online–static” lacks clear trigger events for re-planning, so widely used benchmarks emphasize offline–static or online–dynamic settings. In regulated airspace, certification also favors pre-flight plans, further steering research away from online–static cells.
- When these cells matter. They become meaningful only if “static” maps hide latent uncertainty (e.g., intermittent sensing or map incompleteness) or if long-duration missions impose periodic re-optimization; such cases are routinely labeled online–dynamic in practice and therefore appear among the populated cells.
- (3) Each publication is assigned to a single classification category. When an algorithm spans multiple mission forms, it is placed in the most representative category to simplify classification and highlight its primary features.
- (4) To avoid redundancy, algorithms applicable to multiple scenarios are discussed only within the scenario most relevant to their core contribution.
3. Research Status of Path-Planning Algorithms for Multi-UAV Swarms
- Reinforcement Learning (RL):
- RRT family (RRT, RRT*):
- Artificial Potential Field (APF):
- Model Predictive Control (MPC):
- Ant Colony Optimization (ACO):
- Meta-heuristic and bio-inspired algorithms:
- Supervised Learning (SL):
- Graph-search algorithms:
- Genetic Algorithm (GA):
- Particle Swarm Optimization (PSO):
- Unsupervised Learning (UL):
- Area-segmentation algorithms:
- Differential Evolution (DE):
3.1. PND Problem (Path, Online, Dynamic)
3.1.1. Reinforcement-Learning Algorithms
3.1.2. Rapidly Exploring Random Tree (RRT) Algorithms
3.1.3. Artificial Potential Field (APF) Methods
3.1.4. Model Predictive Control Algorithm
3.1.5. Ant Colony Optimization Algorithm
3.1.6. Meta-Heuristic and Bio-Inspired Algorithms
| Reference | Method (Family → Specific) | Scenario and Problem | Limitation |
|---|---|---|---|
| Hu J, Fan L, Lei Y, et al. [13] | RL → PPO; formation/APF | Low-altitude, radar-evading path via virtual leader; Role: global path; Arch: hybrid; Assump: known radar map; Quant: Mission success rate ≈ 95%; Average path length ≈ 121.7 km | Sensitive to radar-model fidelity and reward weights; on-policy training cost; robustness to unknown sensors/wind unclear |
| Westheider J, Rückin J, Popović M. [14] | RL(MARL) → COMA | Online 3D informative coverage; Role: coverage planning; Arch: CTDE/decentralized; Assump: flat grid terrain; Quant: Coverage ratio at mission end ≈ 79% (4 agents); Coverage ratio at mission end with zero communication ≈ 76% | Needs global features at training; critic-heavy; no explicit obstacle/kinodynamic safety |
| Kong X, Zhou Y, Li Z, et al. [15] | RL → TD3 + target-assignment network (Hungarian labels) | Joint target assignment and collision-free path in 3D dynamic obstacles; Role: per-step centralized assignment + decentralized TD3 motion; Arch: hybrid; Assump: local sensing, spherical obstacles, grid world; Quant: Mission success rate ≈ 84%; Targets reached = 5/5 | Per-step Hungarian adds cubic cost; relies on stable Q aggregation and engineered reward; no explicit kinodynamic safety |
| Arranz R, Carramiñana D, Miguel G, et al. [16] | RL → PPO sub-agents + deterministic swarm controller | Online ground surveillance (search, track, avoid); Role: centralized tasking + on-board learned behaviors; Arch: centralized controller with per-UAV sub-agents; Assump: fixed altitude, benign weather, ideal sensors; Quant: Time to first target ≈ 2.8 s; Tracking continuity ≥ 95% | Relies on stable comms and perfect sensing; fixed-altitude envelope; training cost; no explicit kinodynamic safety—needs added safety/comm-loss handling for cluttered airspace |
| Wang X, Gursoy M C. [17] | RL → D3QN (dueling double DQN); decentralized multi-agent | IoT data-collection path planning with non-cooperative UAVs (and jammer); Role: per-UAV on-board planner; Arch: decentralized with low-level neighbor exchange; Assump: local sensing, TDMA connectivity, velocity-set actions; Quant: Mission success rate ≥ 99%; Collision rate < 0.6% | Needs reliable sensing/limited comms; motion discretized; safety via reward shaping—add explicit kinodynamic/safety layer for cluttered, certified airspace |
| Cheng Y, Li D, Wong W E, et al. [18] | RL → MAXQ + SA | Cooperative path planning in 2D grids; Role: hierarchical subtask planning; Arch: centralized training/execution; Assump: point-mass UAVs, fixed altitude, ideal sensing; Quant: Average planning steps lower than MAXQ (proxy for time) | 2D abstraction; tuned cooling schedule; safety via penalties (no kinodynamic constraints) |
| Niu Y, Yan X, Wang Y, et al. [19] | Meta-heuristic → AEO + MEAEO-RL | Global optimization for multi-UCAV 3D paths with timing/collision constraints; Role: global optimizer with subpopulations; Arch: centralized coop. via shared costs; Assump: DEM terrain, known threat hemispheres, point-mass, speed bounds; Quant: Mission time ≈ 7.69 h (cooperative arrival); Average path length ≈ 122 km | Needs known threat maps and weight tuning; grid/point-mass abstraction; safety via penalties (no kinodynamic guarantees) |
| Wu W, Zhang X. [20] | RL → DDQN; virtual leader + APF local avoidance | Swarm navigation with static/dynamic obstacles; Role: global heading (leader) + local APF; Arch: hybrid; Assump: fixed-wing model, reliable sensing/comms; | Safety via penalties (no kinodynamic guarantees); discrete heading actions; relies on sensing/comms stability |
| Azzam R, Boiko I, Zweiri Y. [21] | RL(MARL) → Actor–Critic (CTDE); curriculum learning | Cooperative navigation to simultaneous arrival; Role: central critic (train) + decentralized actors (exec); Arch: CTDE; Assump: fixed altitude, planar model, local observations; Quant: Completion time ≈ 34 s (10 UAVs) | Requires reliable sensing/links; safety via reward shaping (no kinodynamic guarantees); fixed-altitude planar abstraction |
| Wang W, You M, Sun L, et al. [22] | RL → MASAC-Discrete (multi-agent SAC); CTDE | Unknown-environment cooperative exploration; Role: online coverage planning; Arch: CTDE with decentralized execution; Assump: fixed-altitude 2D grid, local sensing + map sharing; Quant: Task success rate ≈ 93%; Average steps per episode ≈ 121 | Depends on reliable sensing/sharing; planar abstraction; safety via reward shaping (no kinodynamic guarantees) |
| Liu Q. [23] | Game theory + Q-learning with adaptive exploration; Apollonius-circle capture | Cooperative roundup of an intelligent evader; Role: geometric capture test + payoff-matrix game + tabular RL; Arch: centralized game solves with per-UAV execution; Assump: ideal sensing/comms, 2D fixed altitude, point-mass kinematics; Quant: Capture time ≈ 22 s; Steps reduced ≈ 51% vs. standard Q-learning | 2D abstraction; relies on synchronous sensing/comms; safety via penalties (no kinodynamic guarantees); scalability to 3D/noisy maps untested |
| Wu Q, Liu K, Chen L, et al. [24] | RL(CTDE) → MADDPG + MPC-style multi-step value convergence; CTPDE/CTFDE + distance-weighted mean field | Stochastic hazards MAPF; Role: RL waypoints + fluid-field controller; Arch: CTDE (central critic) with CTPDE/CTFDE execution; Quant: reported collision count = 0 in tests; real-robot deployment: 3-UAV demo | Fixed-altitude, point-mass; ideal sensing/links; parameter-sensitive (mean-field, controller, horizon); training overhead from centralized critics; geometric (non-kinodynamic) safety |
| Zhao X, Yang R, Zhong L, et al. [25] | RL → SAC (LiDAR) + AIT* follow-points; parameter-sharing, off-policy; no comms | Multi-UAV path planning and following; Role: SAC end-to-end planner with AIT* tracking; Arch: shared replay; no inter-agent comms; Quant: 3-UAV success (1000 rounds): 829 vs. 705 (baseline SAC) | Fixed altitude; no kinodynamic safety; parameter-sensitive (LiDAR/range, rewards) |
| Liu Y, Li X, Wang J, Wei F, et al. [26] | RL → AM-MAPPO + action-mask CA + rule-based target capture + FoV encoding | 3D moving-target cooperative search; Role: high–low collaboration (sweep → descend) + masked CTDE policy; Quant: captured targets increase with team size (~2.58 → 3.77 → 3.65 for 3/5/8 UAVs); avg uncertainty decreases to ~0.156 at 8 UAVs | Three fixed altitude bands; grid/ideal sensing; no kinodynamic safety; parameter-sensitive (FoV/range, rewards/clip) |
| Burzyński W, Stecz W. [39] | Sampling → Spacetime RRT* (Multiplatform Spacetime RRT*) | Time-aware multi-UAV trajectory planning in dynamic, obstacle-dense environments; Role: global spacetime tunnel + per-UAV spacetime RRT planning; Arch: centralized backbone + per-UAV planning; Quant: Average planning time ≈ 1.6 s (3000 samples); Mission success rate ≈ 100% (≥2000 samples) | 2D → spacetime abstraction; sensitive to σ/D/sample choices; assumes reliable sensing/comm for dynamic collision checks; needs dynamics-aware safety for certified 3D deployments |
| Kelner J M, Burzynski W, Stecz W. [40] | Sampling → RRT* (FANET-aware, spacetime) | Swarm trajectory planning that enforces multi-hop connectivity (FANET) and collision-free time-stamped paths; Role: global spacetime planner + dynamic MST for connectivity; Arch: centroid backbone + informed sampling | Assumes accurate radio-propagation/LOS models and reliable sensing/comm; sensitive to sampling/MST parameters and latency of MST rebuilds; 2D → spacetime simplifications in parts may limit direct transfer to certified 3D airspace |
| Xiang L, Wang F, Xu W, et al. [41] | Joint → CETA (clustering) + JSSCT-APF + JA-MFG | Adaptive sub-swarm assignment, jammer-aware trajectory and power control; Role: association + jamming-sensitive path + jamming-aware power; Quant: Average total interference ≈ 28%; Tracking steps ≈ 33% vs. baselines | Requires propagation-model fidelity and reliable sensing/comm; needs tuning of weights/thresholds and frequent topology/SINR updates; direct transfer to cluttered/certified 3D airspace needs extra robustness work |
| Kang C, Xu J, Bian Y. [42] | Virtual leader + APF (second-order differentiable virtual-force-field); affine formation maneuvering | Formation-keeping obstacle avoidance and continuous configuration change; Role: virtual-leader global reference + local APF-based trajectory replanning; Assump: tested in 2D fixed-altitude formation maneuvers; Quant: Tracking accuracy ≈ 89.5%; Completion time ≈ 450 s | Assumes ideal sensing/communication, 2D fixed-altitude, small team (N = 7); higher computational cost and sensitivity to APF/spring-constant tuning; scalability/kinodynamic certifiability untested |
| Zhao W, Li L, Wang Y, et al. [43] | Theta* + APF heuristic (Theta–APF): omni-directional Theta* with APF-guided heuristic for formation path planning | 3D/formation path planning in cluttered voxel maps; Role: reduce node expansions and smooth paths while keeping formation via virtual-leader control; Quant: Search time ≈ 10.38 s (vs. 21.67 s A*); Average path length ≈ 119.21 (grid units) | Relies on grid/voxel discretization (scales poorly with resolution); inherits APF local-minima and parameter sensitivity; assumes reliable maps/sensing/communication and limited kinodynamic testing |
| Chen G, Yuan S, Zhu X, et al. [44] | ESC + APF (normalized ESC; combined navigation function); swarm source seeking | Unknown-environment source seeking with obstacle and inter-UAV avoidance; Role: normalized ESC for gradient following + APF for collision avoidance; Arch: onboard sensing + PX4 velocity loop; Quant: leader time-to-target ≈ 100 s; minimum follower–follower distance ≈ 0.4 m (no collision) | Assumes reliable onboard sensing and fixed-altitude envelope; safety via potentials (no kinodynamic guarantees); performance sensitive to gains/perturbation frequencies; robustness to sensing/comm loss needs validation |
| Li J, Zi S, Lu X, et al. [45] | APF (improved) + bee-colony control; bounded goal-repulsion for loiter-attack | Swarm path planning in complex island-reef terrain; Role: target attraction + inter-UAV repulsion + loiter at goal; Arch: APF planner with swarm-intelligence coordination; Quant: mission success rate 100%; attack time window ≈ 2.738 s (with bee-colony control, vs. ≈ 5.946 s for improved APF alone) | Parameter sensitivity and APF local minima; assumes accurate target localization and ideal sensing; safety via potentials (no kinodynamic guarantees) |
| Kallies C, Gasche S, Karásek R. [46] | Optimal control → MPC (MILP; ECM); energy-aware cooperative planning | Dynamic obstacles and waypoint coverage; Role: MILP planner with energy return; Arch: centralized solver with receding horizon; Quant: covered waypoints 30/36 in 47 s (Scenario 1); 32/36 in 62 s with low-energy return | Linearized 2D indoor model; depends on fast MILP solver and warm starts; parameter sensitivity (horizon, big-M); safety not certified for 3D cluttered airspace |
| Fan X, Li H, Chen Y, et al. [47] | Optimal control + Deep learning → MPC + LSTM (weather forecast); threat-field replan | Routing under wind and mobile severe weather; Role: LSTM predicts atmosphere, MPC replans with threat field each step; Arch: receding-horizon solver + forecast; Quant: mission success rate ≈ 99%; average planning time ≈ 67 s under wind + moving threats | Assumes fixed-altitude 2D, reliable sensing/comm; depends on solver speed and forecast fidelity; no hard kinodynamic safety |
| Wang Y, Zhang T, Cai Z, et al. [48] | Meta-heuristic + Optimal control → CGWO + distributed MPC (event-triggered) | Neighbor-sharing MPC with no-fly constraints; Role: CGWO global search + MPC constraint handling; Arch: distributed with event triggers; Quant: tracking-error convergence time ≈ 32 s (CGWO) vs. ≈ 39 s (PSO); total event-triggered solver calls ≈ 217 vs. 429 without event-trigger | 2D fixed-altitude, ideal sensing/links; depends on solver speed and threshold tuning; safety via penalties (no kinodynamic guarantees) |
| Xian B, Song N. [49] | Optimal control + Reactive → MPC (offline) + improved APF (online); event-triggered change/recovery | Global smooth path via MPC; local dynamic-obstacle avoidance via APF; Role: MPC global guidance + APF local reaction | Assumes fixed-altitude 2D and ideal sensing/comms; requires solver/APF gain tuning; safety via penalties (no kinodynamic guarantees) |
| Wee L B, Paw Y C. [50] | SLAM + ant-foraging (decentralized revisit/cost maps); exploration and return | GNSS-denied search-and-rescue; Role: decentralized coverage + built-in return-home planning; Arch: local sensing + light map sharing; Quant: coverage ratio at mission end ≈ 99% (50-agent case); average search time ≈ 697 s (50-agent Monte Carlo mean) | 2D fixed-altitude grid and ideal sensing/links; safety via penalties (no kinodynamic guarantees); path optimality mainly benchmarked vs. iterative A* |
| Guo J, Gao Y, Liu Y. [51] | Clustering + Meta-heuristic → SOM + ACO (adaptive evaporation); joint allocation–routing | Multi-UAV task allocation + collision-aware routing; Role: SOM assignment + ACO path; Arch: centralized pipeline | Assumes known targets, mostly static maps; parameter sensitivity (population, evaporation); safety via penalties (no kinodynamic guarantees); dynamic re-tasking and large-swarm scalability untested |
| Wang Q, Xu M, Hu Z. [55] | Meta-heuristic → SL-TSO (Sine–Lévy TSO with elite opposition and golden-sine) | Offline global optimizer for 3D UAV paths with altitude/threat constraints; Role: global path synthesis (B-spline smoothing); Arch: centralized; Assump: known terrain/threat maps, point-mass kinematics | Parameter-sensitive; requires known maps; safety via penalties (no kinodynamic guarantees) |
| Gu G, Li H, Zhao C. [56] | Meta-heuristic → MEMPA (random-spiral; H–V crossover; centroid boundary; refined eddy/FADs) | Offline global optimizer for 3D swarm paths; Role: global path synthesis; Arch: centralized; Assump: known terrain/threat maps, point-mass kinematics; Quant: path cost improved ≈ 10% vs. MPA; ≈10% vs. NMPA | Parameter-sensitive; relies on known maps; safety via penalties (no kinodynamic guarantees) |
| Xu N, Zhu H, Sun J. [57] | Meta-heuristic → Krill-swarm planner (forage/evade/cruise + B-spline) | Offline global planner for 3D plant-protection terrain; Role: global path synthesis; Arch: centralized; Assump: known 3D terrain, ideal sensing; Quant: path length reduced by 1.1–17.5%; operation time reduced by 27.56–75.15% (vs. swarm-intelligence baselines) | Heuristic behavior-switch thresholds (step/perception/crowding) sensitive; point-mass abstraction; safety via distance penalties (no kinodynamic guarantees) |
| Fu S, Li K, Huang H, et al. [58] | Meta-heuristic → RBMO (small/large group foraging + storage) | Offline global UAV path planning; Role: global path synthesis (B-spline smoothing); Arch: centralized; Assump: known terrain/threat maps; Quant: average path cost ≈ 214 (3D planning); best cost ≈ 214 | Parameter-sensitive; point-mass abstraction; safety via penalties (no kinodynamic guarantees) |
| Liu P, Sun N, Wan H, et al. [59] | Meta-heuristic → SOEA (elite adversarial + adaptive threshold) | Offline global 3D path planning; Role: global path synthesis (smoothed); Arch: centralized; Assump: known terrain/threat maps, point-mass kinematics; | Parameter-sensitive (elite ratio, perturbation, threshold); relies on known maps; safety via penalties (no kinodynamic guarantees) |
| Liu L, Lu Y, Yang B, et al. [60] | Meta-heuristic → MISCSO (multi-population; distribution-estimation; elite pool; Cauchy perturb.) | Offline global optimizer for 3D UAV paths (length/threat/altitude/smoothness); Role: global path synthesis (B-spline); Arch: centralized; Assump: known terrain/threat, ideal sensing; | Sensitive to subpopulation ratios/Gaussian model/Cauchy step; point-mass abstraction; safety via penalties (no kinodynamic guarantees) |
| Yin S, Yang J, Ma L, et al. [61] | Meta-heuristic → QREWOA (quasi-opposition; real-time boundary; adversarial/history-guided) | Offline 3D path planning with length/threat/altitude/turning constraints; Role: centralized global path synthesis; Arch: centralized; Assump: known DEM and threat map; Quant: planning success rate increased by ~50%; coverage of feasible paths reported as 100% | Single-objective weight surrogate; relies on known maps; point-mass abstraction; safety via penalties (no kinodynamic guarantees) |
| Chen F, Tang Y, Li N, et al. [62] | Bionic flocking + RRT (Dubins smoothing) | Cooperative 3D path with formation maintenance in rugged terrain; Role: local flocking safety + global RRT; Arch: decentralized flocking + centralized/global planning; Quant: cluster sizes tested = 12/16/20 UAVs | Fixed neighbor/spacing radii and tuned gains; assumes ideal sensing/comms for escape broadcast; safety via geometric penalties (no kinodynamic guarantees) |
| Xiang H, Han Y, Pan N, et al. [63] | Meta-heuristic → MNRW-LSA + greedy RRT | Cooperative urban-patrol paths (energy/risk constraints); Role: RRT seeding + offline global optimizer; Arch: centralized; Assump: known 3D city model, fixed bounds; Quant: optimal path length ≈ 1.50 km; average running time ≈ 5.9 s | Needs known maps/weight tuning; partial fixed-altitude/point-mass abstraction; parameter-sensitive; safety via penalties (no kinodynamic guarantees) |
| Wu X, Xu L, Zhen R, et al. [64] | Meta-heuristic → GLMFO (chaos init; adaptive weighted update; crossover/mutation) | Offline global formation path planning over mountains; Role: centralized global path synthesis; Arch: centralized; Assump: known terrain, ideal sensing; Quant: average run time reduced by ~36%; total average iterations reduced by ~35% (vs. baselines) | Parameter-sensitive (weight schedule, crossover/mutation); point-mass abstraction; safety via penalties (no kinodynamic guarantees) |
| Zhang X, Zhang X, Miao Y. [65] | Meta-heuristic → HDEFWA (DE-sparks; chaotic init; min-radius; info-sharing) | Offline cooperative global paths (length/threat/separation); Role: centralized global optimizer (per-UAV groups); Arch: centralized with cooperative cost; Assump: known terrain/threat, point-mass; Quant: average path cost ≈ 1065.6 (Case I); ≈957.6 (Case II) | Although the algorithm incorporates DE to improve FWA, the introduction of additional operators (mutation, crossover, selection) inevitably increases computation, which may hinder real-time multi-UAV applications |
| Hou J, Zhou X, Pan N, et al. [66] | Sampling → time-optimal primitive library + environment-to-trajectory collision masks; asynchronous decentralized selector | Online path/coverage in unknown maps; Role: library → mask → min-cost selection; Arch: decentralized, asynchronous; Assump: fixed-altitude, onboard sensing, short-horizon broadcast; Quant: per-agent local planning ≈ 0.427 ms; real-time 1000-UAV sim | Geometric (no kinodynamic) safety; parameter-sensitive; needs high-level guidance for maze-like scenes |
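Many PND entries above attach an APF local-avoidance layer to a global planner. A minimal 2-D sketch of the classic attractive/repulsive force, with all gains (k_att, k_rep, d0) hypothetical tuning parameters rather than values from any surveyed paper:

```python
import math

def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=2.0):
    """Attractive pull toward the goal plus repulsion from obstacles within d0."""
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        d = math.hypot(pos[0] - ox, pos[1] - oy)
        if 1e-9 < d < d0:
            # Classic repulsive gradient, growing sharply as the UAV nears the obstacle.
            gain = k_rep * (1.0 / d - 1.0 / d0) / (d ** 3)
            fx += gain * (pos[0] - ox)
            fy += gain * (pos[1] - oy)
    return fx, fy

def apf_step(pos, goal, obstacles, dt=0.05):
    """Integrate one Euler step along the combined force field."""
    fx, fy = apf_force(pos, goal, obstacles)
    return (pos[0] + dt * fx, pos[1] + dt * fy)
```

The local-minima trap cited repeatedly in the tables is visible here: whenever the attractive and repulsive terms cancel away from the goal, the step length vanishes, which is why the surveyed methods pair APF with a global planner or escape heuristic.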
3.2. PFS Problem (Path, Offline, Static)
3.2.1. Supervised-Learning Models
3.2.2. Graph-Search Algorithms
3.2.3. The Genetic Algorithm
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Chen Y, Pi D, Wang B, et al. [71] | Meta-heuristic → MGOEO (EO + generalized opposition; crossover/mutation) | Offline multi-UAV path planning on static 3D maps; Role: centralized global path synthesis; Arch: centralized; Assump: known obstacle map, point-mass model; Quant: average runtime on Map-1 ≈ 60.5 s; average runtime on Map-2 ≈ 80.1 s | Extra runtime from opposition sampling; relies on known maps/weight tuning; point-mass abstraction; weaker on some fixed-dimension multimodal cases |
| Du Y. [72] | Graph search → Enhanced A* (query table; task-allocation partition) | Offline 3D grid planning with workspace partition for multi-UAV SAR; Role: centralized partition + per-UAV local A*; Arch: centralized partitioning; Quant: planning time ≈ 29.65 s (enhanced) vs. 51.53 s (A*); max open-list size reduced 5751 → 3286 (representative region) | Static known map; grid abstraction; dynamics not enforced |
| Bashir N, Boudjit S, Dauphin G. [73] | Graph search → Dijkstra (connectivity-aware) + layered control | Offline urban path with fleet/backhaul connectivity; Role: ground path finding + onboard formation tracking; Arch: centralized planning + per-UAV navigation; Quant: max reaction delay to leader speed change ≈ 0.58 s; minimum UAV–UAV spacing ≈ 15 m during mission | Assumes known static obstacles and radio thresholds; no explicit kinodynamic constraints; relies on sensing/comms stability for handover/topology updates |
| Xie J, Zhang G, Zhang W, et al. [74] | Graph search → improved A* + JPS; Traj. opt. → L-BFGS | Static 3D grid; formation-aware motion planning; Role: A* seed + JPS pruning + L-BFGS refinement; Arch: centralized; Assump: voxel map, point-mass | The integration of improved A*, JPS-based path simplification, and L-BFGS optimization increases algorithmic complexity, which may reduce real-time applicability for large UAV swarms |
| Kladis G P, Doitsidis L, Tsourveloudis N C. [75] | GA (multi-objective) → Bézier-curve planner | Offline static 3D map; minimize energy + path length; Role: centralized global path synthesis; Arch: centralized; Assump: DTED terrain, weather, and no-fly zones | Relies on weight/payoff-table tuning; point-mass and geometric safety (no kinodynamic guarantees) |
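The graph-search PFS entries above (enhanced A*, connectivity-aware Dijkstra, A* + JPS) all build on the same best-first expansion. A minimal 4-connected grid A* sketch with a Manhattan heuristic; the grid encoding is illustrative:

```python
import heapq

def astar(grid, start, goal):
    """grid[r][c] == 1 marks an obstacle; returns a cell list or None if blocked."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible heuristic
    open_set = [(h(start), 0, start, None)]  # (f, g, cell, parent)
    came, g_best = {}, {start: 0}
    while open_set:
        _, g, cur, parent = heapq.heappop(open_set)
        if cur in came:          # already expanded with a better cost
            continue
        came[cur] = parent
        if cur == goal:          # reconstruct by walking parents back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = came[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and not grid[nxt[0]][nxt[1]]:
                ng = g + 1
                if ng < g_best.get(nxt, float("inf")):
                    g_best[nxt] = ng
                    heapq.heappush(open_set, (ng + h(nxt), ng, nxt, cur))
    return None
```

The surveyed improvements act on exactly the bottlenecks visible here: query tables and JPS prune the open list (cf. the 5751 → 3286 open-list reduction reported in [72]), while L-BFGS refinement in [74] smooths the grid path afterward.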
3.3. PFD Problem (Path, Offline, Dynamic)
Particle Swarm Optimization Algorithms
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Liu Y, Zhu X, Zhang X Y, et al. [83] | PSO + RGG (variable-length) + divide-and-conquer | Offline 2D grid; RGG candidates + PSO sub-path refinement; Quant: average normalized path length ≈ 0.248 (vs. 0.872 PSO); iterations to first feasible path ≈ 4.1 (vs. 14.0 PSO) | Static known map; parameter-sensitive (radius/samples/waypoint cap); geometric safety only. |
| Meng Q, Chen K, Qu Q. [84] | PSO (hybrid) + RRT* + priority planning → PPSwarm | Cooperative routing; Role: RRT* seeding + per-UAV PSO with dynamic-obstacle list; Arch: centralized high-level + per-UAV refinement; Quant: average runtime ≈ 111.7 s; mean path cost ≈ 153,744 (Scenario-2) | Known DEM/cylindrical obstacles; point-mass abstraction; safety via geometric distances; performance depends on priority/params; replans needed when map/links change |
| Huang H, Li Y, Song G, et al. [85] | RL → DP-MATD3 (MATD3 + PSO tuning + dual replay) | AoI-aware multi-UAV data collection; Role: per-UAV policy with CTDE; Arch: centralized training/decentralized exec.; Assump: fixed altitude, local sensing/sharing; Quant: weighted average Age of Information reduced by ~33.3% (5 m/s) and ~27.5% (10 m/s) vs. MATD3 | DP-MATD3 integrates PSO optimization and dual experience pools into MATD3, which increases computational overhead and may hinder real-time deployment in large-scale UAV networks |
| Cao Z, Li D, Zhang B. [86] | Weighted Voronoi partition + PSO (real-time updates) | Dynamic cluttered airspace; Role: team-level partition + per-UAV local PSO; Arch: hybrid (central partition, onboard refinement); Assump: planar/altitude-band, reliable sensing/links | Sensitive to weights/update rate (boundary oscillation, load imbalance); point-mass abstraction; geometric safety (no kinodynamic guarantees) |
| Shao Z, Zhou Z, Qu G, et al. [87] | PH-curve parametrization + MHPSGA (multi-population PSO–GA) | Formation-constrained 3D planning in cluttered terrain; Role: PH geometry + hybrid optimizer; Arch: centralized; Assump: known static map, point-mass, curvature bounds | Parameter-sensitive (pop/migration/crossover/mutation); fixed formation templates; geometric safety only (no kinodynamic guarantees) |
| Wang C, Zhang L, Gao Y, et al. [88] | SPSO + DE (Nash bargaining) → GSPSODE | Offline inspection paths in urban pipe corridors; Role: hybrid global search with bargaining; Arch: centralized; Quant: average runtime (Scene 2) ≈ 122 s; best path cost (Scene 2) ≈ 46,900 | Despite the game-theoretic hybridization, GSPSODE may still fall into local optima in highly complex environments |
| Sheng L, Li H, Qi Y, et al. [89] | Improved PSO + similarity screening (TDOA) | Online passive localization + trajectory optimization; Role: select 4 UAVs + optimize their next positions; Arch: centralized screening + per-UAV refinement; Quant: average positioning error ≈ 1.39 km | Although similarity screening reduces redundant calculations, the improved PSO with large populations (e.g., 6000 particles) and inheritance mechanisms still imposes heavy computational overhead, limiting real-time scalability |
| Tan L, Zhang H, Shi J, et al. [90] | PSO with Nash-equilibrium tuning | Offline 3D grid path planning; Role: PSO search with on-the-fly coefficient balance; Quant: average convergence time reduced by ~32%; average flight distance reduced by ~34% (vs. PSO) | Static known map; point-mass, fixed altitude; relies on Nash reaction-function assumptions; parameter-sensitive; penalty-based safety (no kinodynamic guarantees) |
| Wang L, Luan Y, Xu L, et al. [91] | DCPSO (dynamic clusters + Tent-chaos) on APF + receding horizon | Online waypoint selection with APF scene; Role: horizon model + clustered PSO search; Quant: mean path length ≈ 108.98 km; best path length ≈ 108.23 km (30 runs) | 2D fixed altitude; ideal sensing/links; clustering overhead; parameter-sensitive; penalty safety (no kinodynamic guarantees) |
| Li Y, Zhang L, Cai B, et al. [92] | FP-GPSO (Fermat-point grouping PSO) | Unified ALP + three-segment routes in mountains; Role: geometry-informed ALP + PSO segment routing; Quant: mean path cost ≈ 1.45; feasible-path rate = 100% | Needs known DEM/risk–safety maps; point-mass/fixed envelope; sensitive to group sizes and coefficient/weight schedules; penalty-based safety (no kinodynamic guarantees) |
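Most planners in this family share the same PSO core: encode a candidate path as a particle, score it with a length-plus-penalty cost, and iterate velocity/position updates toward personal and global bests. The following minimal sketch is illustrative only, not a reconstruction of any cited method; it assumes a 2D point mass, fixed endpoints, circular obstacles, and geometric safety enforced by a penalty term.

```python
import numpy as np

def path_cost(flat, start, goal, obstacles, penalty=1e3):
    """Length of the start->waypoints->goal polyline plus obstacle penalties."""
    pts = np.vstack([start, flat.reshape(-1, 2), goal])
    cost = np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))
    # Penalize waypoints inside circular obstacles (geometric safety only).
    for cx, cy, r in obstacles:
        d = np.linalg.norm(pts - np.array([cx, cy]), axis=1)
        cost += penalty * np.sum(np.maximum(0.0, r - d))
    return cost

def pso_plan(start, goal, obstacles, n_way=5, swarm=30, iters=200, seed=0):
    """Classic global-best PSO over stacked (x, y) waypoint coordinates."""
    rng = np.random.default_rng(seed)
    dim = 2 * n_way
    x = rng.uniform(0.0, 10.0, (swarm, dim))   # particle positions
    v = np.zeros((swarm, dim))                 # particle velocities
    pbest = x.copy()
    pcost = np.array([path_cost(p, start, goal, obstacles) for p in x])
    g = pbest[np.argmin(pcost)].copy()         # global best
    for _ in range(iters):
        r1, r2 = rng.random((swarm, dim)), rng.random((swarm, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = x + v
        cost = np.array([path_cost(p, start, goal, obstacles) for p in x])
        better = cost < pcost
        pbest[better], pcost[better] = x[better], cost[better]
        g = pbest[np.argmin(pcost)].copy()
    return np.vstack([start, g.reshape(-1, 2), goal]), pcost.min()
```

The variants surveyed above differ mainly in the particle encoding (variable-length RGG sub-paths, PH-curve parameters, polar coordinates) and in the hybrid operators (GA crossover, DE bargaining, Nash-equilibrium coefficient tuning) layered onto this loop.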
3.4. DND Problem (Distribution, Online, Dynamic)
3.4.1. Ant Colony Optimization Algorithms
3.4.2. Reinforcement-Learning Algorithms
3.4.3. Unsupervised Learning Algorithms
3.4.4. Meta-Heuristic and Bio-Inspired Algorithms
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Li K, Yan X, Han Y. [52] | BACOHBA (bidirectional ACO + discrete HBA) + HBAFOA (HBA + FOA) | Power-line inspection with multi-wind fields; Role: wind-aware task allocation + path planning; Quant: average path cost ≈ 7307; average runtime ≈ 378 s | Needs DEM/wind model; higher complexity/longer wall time; geometric (non-kinodynamic) safety |
| Luo X. [53] | Integrated ACO framework (coverage → surveillance → strike; GA-enhanced ACO for allocation) | Unknown environments: distributed sensing map; pheromone-based surveillance; GA–ACO strike allocation; Quant: coverage achieved = 100% in search demo; min inter-UAV spacing ≥ 150 m during monitoring | Grid/ideal sensing and comm; hand-tuned pheromone/GA–ACO weights; geometric (non-kinodynamic) safety |
| Zhang H, Ma H, Mersha B W, et al. [54] | Distributed ACO → DCS-UC (pheromone and position consensus + collision avoidance) | Online cooperative search with unstable links; Role: ACO waypointing + consensus + avoidance; Arch: decentralized on switching connected graphs; Quant: coverage completion ≈ 250 s (4 UAVs, fixed topology); collisions = 0 | 2D fixed altitude; obstacles not modeled; relies on consensus frequency/latency; ideal sensing assumed; no kinodynamic guarantees |
| Dhuheir M A, Baccour E, Erbad A, et al. [27] | RL → PPO (joint trajectory + distributed CNN inference) | Online surveillance with collaborative inference; Role: central agent picks layer-to-UAV and next move; Quant: avg per-request latency ≈ 0.26–0.53 s | 2D fixed altitude; fixed BW/power; no Doppler; ideal sensing; penalty-based safety; periodic re-opt. needed |
| Li M, Ma Q, Wu G. [28] | Attention-based RL (with GNN) + Gurobi/greedy | Dynamic task allocation with solver-based routing; Role: RL allocation + optimal/greedy path; Quant: task completion ↑ ≈ 27–60%; decision time < 1 s (multi-scale dynamic tests) | Grid/known obstacles; central coordination; solver runtime dominates at scale; geometric (non-kinodynamic) safety; frequent re-optimization under heavy dynamics. |
| Du J. [29] | MADDPG, MAPPO | Partial-observability routing + cooperative–competitive allocation; Role: MADDPG routing + MAPPO allocation | 2D fixed altitude; ideal sensing/links; parameter-sensitive (reward/game weights); critic/replay overhead at scale; geometric/penalty safety (no kinodynamic guarantees). |
| Chen H C, Yen L H. [30] | RL → DQN/DRQN (CTDE) | Distributed serving–charging scheduling with limited visibility; Role: per-UAV on-board policy; Arch: CTDE/decentralized execution; Quant: average residual energy ≈ 25–55% (DQN/DRQN runs); fails when charging rate = 0.5 or 1.0 with 3 slots/station; works with 9 slots or rate = 2.0 | 2D fixed altitude; fixed BW/power; ideal sensing; penalty safety (no kinodynamic guarantees); retraining/periodic re-opt. under changing demand. |
| Yu S. [109] | Enhanced Contract Net (hybrid) + Q-learning routing | Multi-target tracking with unknown map; Role: hierarchical clustering → hybrid auction → Q-learning paths | Ideal sensing/links; parameter-sensitive (pheromone/auction/Q-rates); auction/message overhead at scale; geometric (non-kinodynamic) safety; frequent re-optimization as targets change. |
| Wang X, Zhang X, Lu Y, et al. [110] | LSTM–Kalman prediction + Hungarian allocation + 3D Dubins + proportional guidance | Airport bird-dispersion; Role: predict → assign → route (centralized base-station; radar-tracked target); Quant: eviction completed ≈ 71 s; optimal formation size = 5 UAVs per group by cost–benefit analysis | Centralized computation and radar dependence; assumes single-bird (no flock dynamics); ideal links; geometric (non-kinodynamic) safety |
| Tan C, Liu X. [67] | Improved two-stage auction (ML-tuned bidding + re-auction) | Dynamic, resource-constrained allocation; Role: learned bidding + secondary auction; Quant: task-completion improvement (ITCR) ≈ 6–7%; degradation count NPD ≈ 1/100 runs (function + mechanism) | Sensitive to k and re-auction rate; central auctioneer overhead; geometric (non-kinodynamic) safety. |
| Wang G, Wang F, Wang J, et al. [68] | Two-Stage Greedy Auction (TSGAA) | Large-scale naval target allocation; Role: entropy-weighted initial auction + effectiveness-based reassignment; Quant: avg runtime ≈ 0.0004 s (20 UAVs/10 targets); avg runtime ≈ 0.52–0.66 s | Assumes static known map and Dubins surrogate; no comm/range constraints; weight tuning; geometric (non-kinodynamic) safety |
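The auction-based allocators above ([67,68,109]) share a market skeleton: UAVs bid their marginal cost for each open task and the best bid wins each round. A minimal greedy-auction sketch follows; it is an illustrative baseline, not the cited TSGAA or contract-net variants, and it assumes Euclidean travel cost as the bid and ideal communication.

```python
import math

def greedy_auction(uavs, tasks):
    """One-task-per-round greedy auction: every round, each unassigned task
    is auctioned and the globally cheapest (UAV, task) bid wins.
    Returns {task_index: uav_index} and the total travel cost."""
    pos = list(uavs)                 # current position of each UAV
    unassigned = set(range(len(tasks)))
    assignment, total = {}, 0.0
    while unassigned:
        best = None
        for t in unassigned:
            for u, p in enumerate(pos):
                bid = math.dist(p, tasks[t])     # bid = marginal travel cost
                if best is None or bid < best[0]:
                    best = (bid, u, t)
        bid, u, t = best
        assignment[t] = u
        pos[u] = tasks[t]            # winner relocates; later bids reflect it
        total += bid
        unassigned.remove(t)
    return assignment, total
```

The cited mechanisms extend this loop with learned bid functions, secondary re-auctions for degraded assignments, and entropy-weighted scoring, but the round structure is the same.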
3.5. DFD Problem (Distribution, Offline, Dynamic)
Particle Swarm Optimization Algorithms
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Yan X, Chen R. [93] | PSO path planning + ABC fire assessment/control | Offline multi-point routing on mountain DEM; Role: PSO for reachability, ABC for severity-driven allocation; Quant: iterations to convergence ≈ 62; per-path computation time ≈ 7.9–13.3 s | Needs known DEM/ideal sensing; PSO local-optimum/parameter sensitivity; ABC heuristic rules; geometric (non-kinodynamic) safety. |
| Beishenalieva A, Yoo S J. [94] | PSO with grid-cell 3D model; multi-objective utility (VSI/time/energy) | Offline multi-UAV planning on static WSN maps; Role: asynchronous next-cell selection with FANET connectivity; Quant: fitness evaluations per movement ≈ 250 vs. 22,832; cumulative sensing value ≈ 95% of full-search | Needs known DEM/sensor stats; free-space/beam assumptions; point-mass and grid-cell abstraction; weight tuning; geometric (non-kinodynamic) safety |
| Zhang J, Cui Y, Ren J. [95] | Enhanced PSO (particle coding + competitive co-evolution) | Distributed planning for TSTs; Role: alliance + allocation + per-UAV paths; Quant: avg fitness (Scenario-1) ≈ 2.87 × 10²; avg calc. time (Scenario-1) ≈ 2.5 s (200 runs) | Fixed-altitude, known DEM/obstacles; relies on coding/update parameters; penalty-based safety; comm model assumes low-hop, stable links |
| Deng M, Yao Z, Li X, et al. [96] | DMOAWPSO (DT-assisted adaptive-weighted multi-objective PSO) | Dynamic multi-objective task allocation with DT change-response; Role: DT monitors scene → MO-PSO updates | Depends on DT fidelity and latency; parameter-sensitive (subgroups/mutation/weight schedule); added compute from response rounds; penalty-based safety (no kinodynamic guarantees) |
| Yu Y. [97] | Discrete PSO (matrix-coded + BAS) + extended CBBA (partial reset) | Static pre-allocation + dynamic re-tasking with heterogeneous loads and time windows; Quant: overall gain highest with partial-reset CBBA; runtime shorter than full-reset and closer to no-reset | Grid and ideal sensing/links; weight/evaporation and reset-rate sensitive; geometric (non-kinodynamic) safety |
| Tang G, Xiao T, Du P, et al. [98] | Improved PSO (inferior-solution mutation + selective crossover); time slicing | Offline multi-weapon/multi-target assignment; Role: fuzzy threat assess. → VNS-IBPSO allocation; Quant: convergence time ≈ 4.08 s; fitness variance ≈ 1.0 × 10⁻⁴ (best among compared) | Needs known hit probs and single-shot execution; parameter-sensitive (penalty/weights); added VNS overhead; geometric (non-kinodynamic) safety |
| Li Y, Chen W, Liu S, et al. [99] | VNS-IBPSO + intuitionistic-fuzzy MADM | Multi-weapon, multi-target assignment under evolving threat assessments | The VNS-IBPSO integrates intuitionistic fuzzy threat assessment, improved BPSO update rules, and variable neighborhood search. This multi-layer design enhances performance but significantly increases computational cost, limiting real-time deployment in large-scale air combat |
| Han D, Jiang H, Wang L, et al. [100] | PSOGWO (PSO + GWO; nonlinear factor; dynamic weighting) | Post-earthquake SAR VRPTW: minimize fleet/cost/penalties; Role: centralized offline allocation + routing; Quant: min rescue cost lower than PSO/GWO; UAV routes satisfy capacity and time windows | Static known map; homogeneous UAVs; ideal sensing/links; parameter-sensitive; geometric/penalty safety (no kinodynamic guarantees) |
3.6. DFS Problem (Distribution, Offline, Static)
Genetic Algorithms
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Wu Y, Liang T, Gou J, et al. [76] | HGSA + OAEC + MPSO-C + GA | Heterogeneous formation mission: GA/SA allocation → OAEC formation → MPSO-C waypointing → GA departure deconfliction; Quant: formation time (OAEC) ≈ 274.6 s; path length (MPSO) ≈ 7.54 × 10⁴ vs. PSO ≈ 1.15 × 10⁵ | Needs known DEM/no-fly/radar; fixed-altitude point-mass; parameter-sensitive (temp-point, step, PSO); safety via penalties (no kinodynamic guarantees); scalability tied to HGSA/GA search |
| Xiong T, Liu F, Liu H, et al. [77] | AGA (adaptive crossover/mutation) + SCPSO (sine–cosine scheduling) | Offline assignment + 3D routing on static DEM with threats; Role: AGA allocate → SCPSO route; Quant: mean path length ≈ 115.8 vs. 129.6 (SCPSO vs. PSO, 100 iters); mean path length ≈ 114.1 vs. 125.7 (200 iters) | Known DEM/threats; fixed-altitude point-mass; parameter-sensitive; geometric (non-kinodynamic) safety |
| Pan H, Liu Y, Sun G, et al. [78] | NSGA-II-KV (power/hovering) + PSO-NGDP (3D trajectory) | UAV-WPCN with obstacles; Role: MO power/hovering → spacetime waypoint PSO routing; Quant: coverage up to 18.03%; flight energy up to 25.30% (vs. baselines) | Needs known DEM/charging and channel models; partial fixed-altitude/point-mass; parameter-sensitive; penalty safety |
| Du P, He X, Cao H, et al. [79] | GA + LNS → IGCPA | Energy-aware logistics routing with mixed time windows; Role: GA global search + LNS local repair; Quant: energy cost ≈ 17% vs. GA; ≈10% vs. PSO (100-customer case) | Needs known customers/time windows/energy model; parameter-sensitive (GA/LNS); hybrid adds wall-time on large cases; point-mass and penalty safety (no kinodynamic guarantees) |
| Jia Z, Xiao B, Qian H. [80] | Discrete PSO → IM-DPSO (priority matrix; GA cross-mutation; adaptive weights) | Offline multi-task assignment on a static map; Role: centralized allocator with ordered task lists; Quant: execution time ≈ 18 min 27 s; total route length ≈ 396.6 km | 2D fixed altitude; known map; parameter-sensitive (weights/schedules); no dynamic re-allocation; geometric/penalty safety (no kinodynamic guarantees) |
| Li Y, Chen W, Liu S, et al. [81] | VNS-IBPSO + intuitionistic-fuzzy dynamic threat assessment | Offline multi-weapon/multi-target assignment; Role: fuzzy threat assess. → VNS-IBPSO allocation; Quant: convergence time ≈ 4.08 s; fitness variance ≈ 1.0 × 10⁻⁴ | Needs known hit probs/DEM; single-shot assumption; parameter-sensitive (penalties/weights/VNS depth); penalty-based safety (no kinodynamic guarantees) |
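The GA-based allocators in this cell share a common skeleton: a population of assignment vectors evolved by selection, crossover, and mutation against a mission cost. The sketch below is a deliberately simple, hypothetical instance (uniform crossover, random-reset mutation, elitism, makespan fitness); the cited hybrids (HGSA, AGA, IGCPA) add annealing schedules, adaptive operator rates, and local search on top of this loop.

```python
import random

def ga_assign(costs, pop=40, gens=200, pmut=0.2, seed=1):
    """Tiny GA over task->UAV assignment vectors.
    costs[u][t] = cost of UAV u doing task t; fitness = max per-UAV load
    (makespan).  Uniform crossover + random-reset mutation + elitism."""
    rng = random.Random(seed)
    n_uav, n_task = len(costs), len(costs[0])

    def makespan(a):
        loads = [0.0] * n_uav
        for t, u in enumerate(a):
            loads[u] += costs[u][t]
        return max(loads)

    popn = [[rng.randrange(n_uav) for _ in range(n_task)] for _ in range(pop)]
    best = min(popn, key=makespan)
    for _ in range(gens):
        popn.sort(key=makespan)
        elite = popn[: pop // 4]                       # keep best quarter
        children = []
        while len(elite) + len(children) < pop:
            p, q = rng.sample(elite, 2)
            child = [p[i] if rng.random() < 0.5 else q[i] for i in range(n_task)]
            if rng.random() < pmut:                    # random-reset mutation
                child[rng.randrange(n_task)] = rng.randrange(n_uav)
            children.append(child)
        popn = elite + children
        cand = min(popn, key=makespan)
        if makespan(cand) < makespan(best):
            best = cand
    return best, makespan(best)
```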
3.7. CND Problem (Coverage, Online, Dynamic)
3.7.1. Reinforcement-Learning Algorithms
3.7.2. Area-Segmentation Algorithms
3.7.3. Meta-Heuristic and Bio-Inspired Algorithms
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Demir K, Tumen V, Kosunalp S, et al. [31] | RL → DDQN with risk-map grid | Wildfire reconnaissance coverage; Role: risk-prioritized patrol; Arch: per-UAV agent + replay/target net; Assump: fixed altitude, grid map, multi-hop to GS; Quant: point-collection ratio ≈ 30%; boundary violations → 0 after training | Grid/known risk weights; ideal sensing/links; long training; point-mass and geometric safety (no kinodynamic guarantees) |
| Cheng X, Jiang R, Sang H, et al. [32] | RL → MADDPG + trace-pheromone (TP-EDC) | Energy-aware dynamic coverage with stigmergy; Role: pheromone-guided policy; Arch: per-UAV actors + shared replay; Quant: average coverage rate ≈ 0.92 (6 UAVs); normalized average energy consumption ≈ 0.61 | Grid and fixed altitude; ideal sensing/links; parameter-sensitive (evaporation/bounds); geometric (non-kinodynamic) safety |
| Dhuheir M, Erbad A, Al-Fuqaha A, et al. [33] | Meta-RL for EH + WIT (disaster zones) | Energy-harvesting + data collection with time/energy/data-rate constraints; Role: central learner + on-board actors; fixed-altitude grid; Quant: total harvested energy +25% vs. DQN; +32% vs. PSO (also higher than greedy) | Grid and fixed altitude; ideal links; no Doppler/interference; penalty-based safety (no kinodynamic guarantees); training cost/episodes. |
| Baccour E, Erbad A, Hamdi M, et al. [34] | PPO for adaptive UAV clustering + RIS phase + association + trajectory + BS power | 6G urban anti-jamming: joint association/trajectory/beam/power with dynamic swarm sizing | Centralized PPO at BS; LoS/propagation assumptions; RIS energy neglected; ZF/SIC to ignore IUI; training cost; geometric (non-kinodynamic) safety |
| Puente-Castro A, Rivero D, Pedrosa E, et al. [35] | RL → Q-Learning + 2-layer ANN (replay, ε-greedy) | Obstacle-rich grid coverage; Role: local vs. global ANN controllers; Quant: point-collection/action counts reported per map and team size; boundary violations drop to ~0 after training | Fixed-altitude grid; ideal sensing/links; point-mass abstraction; reward/ANN/replay sensitivity; geometric (non-kinodynamic) safety |
| Zou L. [36] | Improved QMIX (state-space fusion + masked-highway mixer) | Cooperative area search on fixed-altitude grids; Role: fused-grid observation + decentralized actions + joint value mixing; Quant: steps to full coverage ≈ 74 (2 UAVs); ≈35 (5 UAVs) | Grid and fixed altitude; ideal sensing/links; sensitive to mask weight λ and reward settings; geometric (non-kinodynamic) safety |
| He J. [37] | RL (CTDE) → improved QMIX + multi-agent SAC (attention) | Cooperative coverage with conflict mitigation; Role: grid-cell planning + obstacle-aware motion; Quant: average coverage time reduced ≥11.6% vs. Anti-Flocking; mean coverage time (4 UAVs) ≈ 109 s | Fixed-altitude grid; ideal sensing/links; point-mass model; geometric (non-kinodynamic) safety; centralized critics/training cost |
| Hou Y, Zhao J, Zhang R, et al. [38] | RL → MADDPG (CTDE) with CNN map features | Large-scale cooperative target search; Role: decentralized actors + centralized critics; Quant: overall steps to find all targets ≈ 2631 vs. 3012 (DQN)/3150 (ACO); success rate ≈ 10% higher than DQN/ACO | Fixed-altitude grid; ideal sensing/links; point-mass model; reward/hyper-parameter sensitivity; centralized-critic training cost; geometric (non-kinodynamic) safety |
| Szklany M, Cohen A, Boubin J. [111] | Online partitioning + wavefront traversal (“Tsunami”) | Fault-tolerant swarm coverage with dynamic reassignment; Role: offline waypoint grid → online dispatch via drone pool; Quant: coverage time 1.6× faster (ideal); 1.91× faster with faults vs. cellular decomposition (SCoPP) | Requires known polygonal map/NFZs; fixed-altitude, waypoint flight; ideal sensing/links; geometric (non-kinodynamic) safety; parameter-sensitive (corridor/σ/samples) |
| Yu Y, Lee S. [112] | MBS-MUCCPPAFOA; MUAV-CCPPAFOA-AS (A* avoidance) | Energy-aware multi-UAV coverage with no-fly zones; Role: multi-base-station global plan/four-region segmentation + per-UAV execution; Quant: proposed methods complete coverage at 950 × 950 m where baseline MUSCPP fails; lower completion times across 600–950 m sizes and varied aspect ratios | Fixed-altitude grid; known polygonal map/NFZs; partition/load sensitivity; geometric (non-kinodynamic) safety; ideal sensing/links |
| Gui J, Yu T, Deng B, et al. [113] | DCAS + NBV/RRT (decentralized) | Dynamic centroid area-segmentation + per-UAV NBV in partitions; Role: partition (load-balanced) + local RRT; Arch: decentralized; Quant: mean completion time (5 UAVs, outdoor) ≈ 376.7 s; indoor ≈ 86.2 s | Needs known initial poses, ideal sensing/links, fixed-altitude OctoMap; sampling variance/partition oscillation; geometric (non-kinodynamic) safety |
| Swain S, Khilar P M, Senapati B R. [114] | Voronoi + camera-footprint waypointing + online path and collision (FGM-I + cone rules) | Multi-UAV coverage with static and moving obstacles; Role: offline partition/waypoints + online path/avoidance; Quant: path length 16.88 m vs. 20.43 m and collision-avoidance events 5 vs. 7 | Fixed-altitude 2D, ideal sensing/links, parameter sensitivity (Rsense/footprint); geometric (non-kinodynamic) safety |
| Bakirci M, Ozer M M. [115] | Distributed Swarm Scanning (DSS): ID/altitude segmentation + FoV footprints + boundary counters; connector UAV relays | Post-disaster area scanning; Role: per-UAV sector scan + comm relay; Arch: decentralized scanning + relay layer | Assumes known polygon map and fixed-altitude waypoints; ideal sensing/links; parameter-sensitive (partition and comm); geometric (non-kinodynamic) safety |
| Aljalaud F, Kurdi H, Youcef-Toumi K. [69] | Booby-inspired heuristic inspection (default + ARS modes; role switching) | Indoor pipe inspection; Role: zone assignment + local scanning; Arch: lightweight heuristic; Quant: mean defect-detection time ↓ ≥13% vs. random; runtime speedup ≥ 3× vs. random | Fixed-altitude grid; ideal sensing/links; parameter-sensitive (zone/thresholds/clusters); geometric (non-kinodynamic) safety |
| Saadi A A, Soukane A, Meraihi Y, et al. [70] | IMRFO-TS (MRFO + Tabu; tangential non-linear control) | Smart-city UAV placement optimizing coverage/connectivity/energy/load; Role: MRFO exploration + TS local refinement | Needs known DEM/LoS and fixed power model; parameter-sensitive; TS adds runtime/space; geometric (non-kinodynamic) safety |
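A recurring building block in the area-segmentation entries ([113,114,115]) is partitioning the workspace among UAVs before any local planning. A minimal discrete Voronoi (nearest-UAV) partition over a rectangular fixed-altitude grid can be sketched as follows; the load counts it returns are what the cited methods rebalance when a partition becomes uneven.

```python
import numpy as np

def partition_grid(width, height, uav_positions):
    """Assign each grid cell to its nearest UAV (a discrete Voronoi
    partition), returning the label map and per-UAV cell counts."""
    ys, xs = np.mgrid[0:height, 0:width]
    cells = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    sites = np.asarray(uav_positions, dtype=float)
    # Distance from every cell to every UAV; label = index of nearest UAV.
    d = np.linalg.norm(cells[:, None, :] - sites[None, :, :], axis=2)
    labels = d.argmin(axis=1).reshape(height, width)
    counts = np.bincount(labels.ravel(), minlength=len(sites))
    return labels, counts
```

Dynamic variants re-run this assignment as UAVs move or fail, then hand each partition to an onboard coverage or next-best-view planner.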
3.8. CFD Problem (Coverage, Offline, Dynamic)
Particle Swarm Optimization Algorithms
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Cheng K, Hu T, Wu D, et al. [101] | PSO-RHC + polar-coord. model + multi-layer map | Heterogeneous-swarm dynamic-target search; Role: horizon-based online replanning guided by pheromone/TPM; Quant: avg detected targets = 8/8; avg steps to finish ≈ 313.5 | 2D fixed altitude; ideal sensing/links; collision ignored (altitude separation); extra compute from horizon; parameter-sensitive (λ, σ, swarm/Q) |
| Pehlivanoglu V Y, Pehlivanoğlu P. [102] | PSO (FCM + ACO seeding; waypoint repair; mutation; prediction) | Offline multi-UAV checkpoint coverage on static DEMs; Role: seeded PSO + obstacle repair; Quant: mean utility (rural-1, 2 UAVs) 153 vs. 10,091 (PSO-3 vs. PSO-1); iterations ≥ 45% fewer than PSO-2 | Static known map; fixed-altitude point-mass; parameter-sensitive (σ/D/samples); extra-waypoint repair may constrain spline turns; geometric (non-kinodynamic) safety |
| Li Y, Chen W, Fu B, et al. [103] | Cooperative-Coevolution, Motion-Encoded PSO (CC-MPSO) | Dynamic-target search; Role: per-UAV sub-swarms + cross-swarm fitness coupling; Arch: centralized evaluation with distributed sub-swarms; Quant: optimized 52/75 statistical items across six scenarios; e.g., Scenario-2 (UAV2) mean detection prob. ≈ 0.153 vs. PSO ≈ 0.125 | 2D fixed-altitude grid; ideal sensing/links; no explicit collision model (geometric spacing only); parameter-sensitive (ω, c1/c2, clamp, ε, swarm size); extra compute from cross-swarm fitness coupling |
| Tang Y, Huang K, Tan Z, et al. [104] | MSC-PSO (fitness-based sub-swarms + level-based learning + adaptive inertia) | Plant-protection paths on static DEM; Role: multi-subswarm PSO + B-spline smoothing; Quant: convergence iterations ≈ 52; total non-spraying time ≈ 12.3 min in sim field | Fixed-altitude known map; parameter-sensitive (sub-swarm bounds/weights); overhead from sub-swarm coordination; geometric (non-kinodynamic) safety |
| Wang Y, Li X, Zhuang X, et al. [105] | DNBPT: multi-step gain sampling + improved DBPSO | Distributed exploration of unknown grids; Role: path–terminal co-evaluation + rolling execution; Quant: mean exploration time ≈ 65.9 s (Scene II, fixed); ≈205.4 s (Scene III, fixed) | Fixed-altitude 2D grid; ideal sensing/links; geometric (non-kinodynamic) safety; parameter-sensitive (horizon/weights/DBPSO) |
| Yan X, Chen R, Jiang Z. [106] | PSOHAC: O-FCM zoning + PSO–ACO routing | Offline coverage on mountain DEM; Role: balanced zoning + hybrid routing; Quant: total path length ≈ 1348.6 m; flight time ≈ 78.2 s | Needs known DEM/sensor model; fixed-altitude point-mass; parameter-sensitive (O-FCM/pheromone/PSO); added hybrid overhead; geometric (non-kinodynamic) safety |
| Chen Y, Qin D, Yang X, et al. [107] | IPSO + Reuleaux-tiling DCA + Roman-domination + convex association | Offline deployment and association for WSN data collection; Role: two-stage IPSO deploy → convex sensor–UAV assignment | Fixed-altitude 2D, ideal sensing/links; parameter-sensitive (λ/μ/ν, ω, c1–c3); geometric (non-kinodynamic) safety |
| Yan Y, Sun Z, Hou Y, et al. [108] | Clustering + ACO(TSP) seeding + PSO/CSO refinement | Offline fleet sizing + routing on static map; Role: 2D assignment → 3D track shaping; Quant: number of UAVs after optimization = 23; mean path length ≈ 64.34 km | Static known map; sensitive to cluster diameter/migration and ACO/PSO/CSO params; geometric (non-kinodynamic) safety |
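Seeding a metaheuristic with a cheap constructive tour, as in the ACO(TSP) seeding stage of [108], is a common pattern in this cell. A minimal nearest-neighbor seed tour is sketched below; it is illustrative only, and the cited pipelines replace it with ACO construction and then refine the tour with PSO/CSO.

```python
import math

def nearest_neighbor_tour(depot, targets):
    """Greedy nearest-neighbor TSP tour from a depot through all targets
    and back — a cheap seed tour that metaheuristic refinement can improve."""
    remaining = list(targets)
    tour, cur, length = [depot], depot, 0.0
    while remaining:
        nxt = min(remaining, key=lambda t: math.dist(cur, t))
        length += math.dist(cur, nxt)
        tour.append(nxt)
        remaining.remove(nxt)
        cur = nxt
    length += math.dist(cur, depot)   # close the tour at the depot
    tour.append(depot)
    return tour, length
```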
3.9. CFS Problem (Coverage, Offline, Static)
3.9.1. Genetic Algorithms
3.9.2. Differential Evolution Algorithms
| Reference | Algorithm/Method | Applicable Scenario and Problem Addressed | Limitation |
|---|---|---|---|
| Jasim A N, Fourati L C. [82] | GA + GLS → GGA (memetic CVRP solver) | Offline multi-UAV CPP-VRP (min battery + tank); Role: GA global search + GLS best-solution refinement; Quant: mean objective (256 nodes, 4 UAVs) ≈ 859.7; best ≈ 852.0; runtime ≈ 18.0 s at 256/4 vs. ≈3.5 s at 25/1 | Static known map; constant-speed point-mass model; FIFO bias; parameter-sensitive; geometric (non-kinodynamic) safety |
| Fan X, Li H, Chen Y, et al. [116] | Adaptive DE + POC/POD/POS model; variable-angle sweep (polygon) + sector/expanded-square/spiral (circle) | Offline multi-UAV disaster-area search with endurance/comms/safety constraints; Quant: total POS ≈ 0.874 in the time-limited 6-UAV case; POSC ≈ 0.940 for the polygon-partition global search | Fixed-altitude grid; requires prior probability map/DEM; parameter-sensitive (mutation schedule, spacing/region setup); geometric (non-kinodynamic) safety; comms/topology switching idealized |
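The sweep patterns in [116] generalize the basic boustrophedon decomposition. For reference, an axis-aligned lawnmower sweep over a single rectangular cell can be sketched as below; the cited work adds variable sweep angles, sector/expanded-square/spiral patterns for circular regions, and endurance and communication constraints.

```python
def sweep_path(width, height, spacing):
    """Axis-aligned boustrophedon (lawnmower) sweep over a width x height
    rectangle: parallel tracks `spacing` apart, alternating direction."""
    waypoints, y, left_to_right = [], 0.0, True
    while y <= height:
        if left_to_right:
            waypoints += [(0.0, y), (width, y)]
        else:
            waypoints += [(width, y), (0.0, y)]
        left_to_right = not left_to_right
        y += spacing
    return waypoints
```

Track spacing is normally set from the sensor footprint at the flight altitude, which is why most entries in this table assume a fixed altitude.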
4. Technique Selection
4.1. Architecture Selection Across the Nine Scenarios: Trade-Offs Among Centralized, Decentralized, and Hybrid Planning
4.2. Digital Twins Across the Nine Scenarios: A Bridge for Pre-Deployment Synthesis and Validation
4.3. Safety-Aware Collision Avoidance Across the Nine Scenarios
4.4. Technique Selection
4.5. Benchmarking
5. Discussion
6. Future Research
6.1. Risks and Challenges
6.1.1. Real-Time Computation and Scalability
6.1.2. Task Allocation, Cooperation, and Multi-Objective Scheduling
6.1.3. Communication, Safety, and Sensing Robustness
6.1.4. System-Level Concerns: Security, Privacy, and Energy/Lifecycle Constraints
6.2. Open Scientific Problems
6.2.1. Compute-Aware, Safety-Critical Online Planning (PND/CND/DND)
6.2.2. Scalable Task Allocation with Coupled Motion (PND/CND/DND)
6.2.3. Bandwidth-Aware Decentralized Planning with Reliability Guarantees (PND/CND/DND/PFD/DFD/CFD)
6.2.4. Certifiable Planning in Dense Low-Altitude Corridors (PND/CND/DND)
6.2.5. Energy-Aware Hierarchical Planning with Lifecycle Constraints (PFD/PND/CFS/CFD)
6.3. Application-Oriented Outlook
- a. Disaster response (wildfire reconnaissance; post-disaster SAR)—(CND/CFD + PND)
- b. Public safety surveillance (urban patrol, border monitoring, large-event security)—(PND/DND/CND)
- c. Urban low-altitude corridors/UAM-UTM—(PND/CND; certification focus)
- d. Smart logistics (multi-drop delivery with dynamic orders and airspace rules)—(PFD/CFD + DND/DFD)
- e. Linear-asset inspection (power lines, pipelines, rail)—(CFD/CFS + DND)
- f. Environmental monitoring and precision agriculture—(CFS/CND)
- g. Airport bird-dispersion and airfield safety—(DND/PND)
- h. Indoor warehouse swarms (GPS-denied)—(PND; static maps, dynamic agents)
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Acronyms and Abbreviations | |
|---|---|
| 6G | Sixth-generation cellular network (used as a communication context in some studies). |
| A* | A-star graph-search algorithm for shortest-path planning on grids/graphs. |
| ACO | Ant colony optimization. |
| ANN | Artificial neural network. |
| AoI | Age of Information; a timeliness metric for sensed/communicated data. |
| APF | Artificial potential field; a reactive method for obstacle/formation avoidance and guidance. |
| CFD | Coverage mission + offline planning + dynamic environment (scenario code used in this review). |
| CFS | Coverage mission + offline planning + static environment (scenario code used in this review). |
| CND | Coverage mission + online planning + dynamic environment (scenario code used in this review). |
| CNN | Convolutional neural network. |
| COMA | Counterfactual Multi-Agent policy gradients; a CTDE multi-agent RL method. |
| CTDE | Centralized training with decentralized execution (multi-agent learning paradigm). |
| DDPG | Deep Deterministic Policy Gradient (continuous-control RL algorithm). |
| DDQN | Double Deep Q-Network. |
| DE | Differential evolution. |
| DEM | Digital elevation model. |
| DFD | Distribution mission + offline planning + dynamic environment (scenario code used in this review). |
| DFS | Distribution mission + offline planning + static environment (scenario code used in this review). |
| DND | Distribution mission + online planning + dynamic environment (scenario code used in this review). |
| DQN | Deep Q-Network. |
| DT | Digital twin; a virtual replica/simulator used for validation, monitoring, or optimization. |
| FANET | Flying ad hoc network; a self-organized UAV communication network. |
| GA | Genetic algorithm. |
| GIS | Geographic information system. |
| GNSS | Global Navigation Satellite System. |
| LoS | Line of sight (communication or sensing). |
| LSTM | Long short-term memory network. |
| MADDPG | Multi-Agent DDPG (a CTDE multi-agent RL algorithm). |
| MAPPO | Multi-Agent PPO (a CTDE multi-agent RL algorithm). |
| MDP | Markov decision process. |
| MILP | Mixed-integer linear programming. |
| MPC | Model predictive control; receding-horizon optimization with constraints. |
| PFD | Path mission + offline planning + dynamic environment (scenario code used in this review). |
| PFS | Path mission + offline planning + static environment (scenario code used in this review). |
| PH curve | Pythagorean-hodograph curve; a smooth parametric curve used for trajectory generation. |
| PND | Path mission + online planning + dynamic environment (scenario code used in this review). |
| PPO | Proximal Policy Optimization (RL algorithm). |
| PSO | Particle swarm optimization. |
| QMIX | Value-decomposition method for cooperative multi-agent RL. |
| RGG | Random geometric graph. |
| RIS | Reconfigurable intelligent surface (communication technology). |
| RRT | Rapidly exploring random tree (sampling-based planner). |
| RRT* | Asymptotically optimal variant of RRT. |
| SAC | Soft Actor-Critic (RL algorithm). |
| SINR | Signal-to-interference-plus-noise ratio. |
| TD3 | Twin delayed DDPG (RL algorithm). |
| TDOA | Time difference of arrival (used for localization). |
| VNS | Variable neighborhood search. |
| VRP | Vehicle routing problem. |
| VRPTW | Vehicle routing problem with time windows. |
| WSN | Wireless sensor network. |
| Key Terms | |
|---|---|
| Centralized planning | Planning/decision making is computed at a central node with (near-)global information and then dispatched to individual UAVs. |
| Decentralized planning | Each UAV plans using local observations and limited messages; coordination emerges through communication and local rules. |
| Hybrid architecture | Combines centralized components (e.g., global assignment or map fusion) with decentralized local planning/execution. |
| Kinodynamic constraints | Motion constraints that account for both kinematics and dynamics (e.g., speed/acceleration limits, turn rate, and actuator bounds). |
| Motion primitives | A precomputed library of short, feasible maneuvers used to compose longer trajectories with low online computation. |
| Receding-horizon planning | Repeatedly optimizes over a finite future horizon as new observations arrive (e.g., MPC-style replanning). |
| Safety layer (shield) | An additional mechanism that enforces collision avoidance or constraint satisfaction, even when the high-level planner is imperfect. |
| Static vs. dynamic environment | Static: obstacles/threats are fixed during planning; Dynamic: obstacles, targets, or threats change over time and require replanning. |
| Scenario codes (e.g., PND) | Three-letter codes used in this review: the first letter indicates mission type (P = Path, D = Distribution, C = Coverage); the second indicates planning mode (N = online, F = offline); and the third indicates environment type (S = static, D = dynamic). |
| Online vs. offline planning | Offline: plans are computed before execution; Online: plans are updated during execution based on sensed changes or new tasks. |
References
- Yang, X.; Wang, R.; Zhang, T. Review of unmanned aerial vehicle swarm path planning based on intelligent optimization. Control Theory Appl. 2020, 37, 2291–2302.
- Aggarwal, S.; Kumar, N. Path planning techniques for unmanned aerial vehicles: A review, solutions, and challenges. Comput. Commun. 2020, 149, 270–299.
- Yang, L.; Qi, J.; Xiao, J.; Yong, X. A literature review of UAV 3D path planning. In Proceedings of the 11th World Congress on Intelligent Control and Automation, Shenyang, China, 29 June–4 July 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 2376–2381.
- Cetinsaya, B.; Reiners, D.; Cruz-Neira, C. From PID to swarms: A decade of advancements in drone control and path planning—A systematic review (2013–2023). Swarm Evol. Comput. 2024, 89, 101626.
- Rahman, M.; Sarkar, N.I.; Lutui, R. A survey on multi-UAV path planning: Classification, algorithms, open research problems, and future directions. Drones 2025, 9, 263.
- Wu, Q.; Su, Y.; Tan, W.; Zhan, R.; Liu, J.; Jiang, L. UAV path planning trends from 2000 to 2024: A bibliometric analysis and visualization. Drones 2025, 9, 128.
- Ait Saadi, A.; Soukane, A.; Meraihi, Y.; Benmessaoud Gabis, A.; Mirjalili, S.; Ramdane-Cherif, A. UAV path planning using optimization approaches: A survey. Arch. Comput. Methods Eng. 2022, 29, 4233–4284.
- Zhang, H.; Xin, B.; Dou, L.; Chen, J.; Hirota, K. A review of cooperative path planning of an unmanned aerial vehicle group. Front. Inf. Technol. Electron. Eng. 2020, 21, 1671–1694.
- Ghambari, S.; Golabi, M.; Jourdan, L.; Lepagnot, J.; Idoumghar, L. UAV path planning techniques: A survey. RAIRO-Oper. Res. 2024, 58, 2951–2989.
- Chen, W.; Chi, W.; Ji, S.; Ye, H.; Liu, J.; Jia, Y.; Yu, J.; Cheng, J. A survey of autonomous robots and multi-robot navigation: Perception, planning and collaboration. Biomim. Intell. Robot. 2025, 5, 100203.
- Bui, H. A survey of multi-robot motion planning. arXiv 2023, arXiv:2310.08599.
- Athira, K.A.; Udayan, D.J.; Subramaniam, U. A systematic literature review on multi-robot task allocation. ACM Comput. Surv. 2024, 57, 6801–6828. [Google Scholar] [CrossRef]
- Hu, J.; Fan, L.; Lei, Y.; Xu, Z.; Fu, W.; Xu, G. Reinforcement learning-based low-altitude path planning for UAS swarm in diverse threat environments. Drones 2023, 7, 567. [Google Scholar] [CrossRef]
- Westheider, J.; Rückin, J.; Popović, M. Multi-UAV adaptive path planning using deep reinforcement learning. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 649–656. [Google Scholar]
- Kong, X.; Zhou, Y.; Li, Z.; Wang, S. Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments. Front. Neurorobot. 2024, 17, 1302898. [Google Scholar] [CrossRef]
- Arranz, R.; Carramiñana, D.; Miguel, G.; Besada, J.A.; Bernardos, A.M. Application of deep reinforcement learning to UAV swarming for ground surveillance. Sensors 2023, 23, 8766. [Google Scholar] [CrossRef]
- Wang, X.; Gursoy, M.C. Robust and decentralized reinforcement learning for UAV path planning in IoT networks. arXiv 2023, arXiv:2312.06250. [Google Scholar] [CrossRef]
- Cheng, Y.; Li, D.; Wong, W.; Zhao, M.; Mo, D. Multi-UAV collaborative path planning using hierarchical reinforcement learning and simulated annealing. Int. J. Perform. Eng. 2022, 18, 463–474. [Google Scholar] [CrossRef]
- Niu, Y.; Yan, X.; Wang, Y.; Niu, Y. Three-dimensional collaborative path planning for multiple UCAVs based on improved artificial ecosystem optimizer and reinforcement learning. Knowl.-Based Syst. 2023, 276, 110782. [Google Scholar] [CrossRef]
- Wu, W.; Zhang, X. Reinforcement learning-based swarm control for UAVs in static and dynamic multi-obstacle environments. In Proceedings of the 2023 China Automation Congress (CAC), Qingdao, China, 3–5 November 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1387–1392. [Google Scholar]
- Azzam, R.; Boiko, I.; Zweiri, Y. Swarm cooperative navigation using centralized training and decentralized execution. Drones 2023, 7, 193. [Google Scholar] [CrossRef]
- Wang, W.L.; You, M.; Sun, L.; Zhang, X.; Zong, Q. Intelligent cooperative exploration and path planning for UAV swarms in unknown environments. Chin. J. Eng. Sci. 2024, 46, 1197–1206. [Google Scholar]
- Liu, J. Research on UAV Swarm Capture Method Based on Game Learning. Master’s Thesis, Xi’an Technological University, Xi’an, China, 2023. [Google Scholar]
- Wu, Q.; Liu, K.; Chen, L.; Lv, J. Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment. arXiv 2023, arXiv:2310.16659. [Google Scholar]
- Zhao, X.; Yang, R.; Zhong, L.; Hou, Z. Multi-UAV Path Planning and Following Based on Multi-Agent Reinforcement Learning. Drones 2024, 8, 18. [Google Scholar] [CrossRef]
- Liu, Y.; Li, X.; Wang, J.; Wei, F.; Yang, J. Reinforcement-Learning-Based Multi-UAV Cooperative Search for Moving Targets in 3D Scenarios. Drones 2024, 8, 378. [Google Scholar] [CrossRef]
- Dhuheir, M.A.; Baccour, E.; Erbad, A.; Al-Obaidi, S.S.; Hamdi, M. Deep reinforcement learning for trajectory path planning and distributed inference in resource-constrained UAV swarms. IEEE Internet Things J. 2022, 10, 8185–8201. [Google Scholar] [CrossRef]
- Li, M.; Ma, Q.; Wu, G. UAV swarm dynamic task planning algorithm based on reinforcement learning. Syst. Simul. Technol. 2023, 19, 193–204. [Google Scholar]
- Du, J. Path Planning and Task Assignment of UAV Swarm Under Incomplete Information. Master’s Thesis, Harbin Engineering University, Harbin, China, 2023. [Google Scholar]
- Chen, H.C.; Yen, L.H. DRL-based distributed joint serving and charging scheduling for UAV swarm. In Proceedings of the 2024 International Conference on Information Networking (ICOIN), Ho Chi Minh City, Vietnam, 17–20 January 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 587–592. [Google Scholar]
- Demir, K.; Tumen, V.; Kosunalp, S.; Iliev, T. A deep reinforcement learning algorithm for trajectory planning of swarm UAV fulfilling wildfire reconnaissance. Electronics 2024, 13, 2568. [Google Scholar] [CrossRef]
- Cheng, X.; Jiang, R.; Sang, H.; Li, G.; He, B. Trace pheromone-based energy-efficient UAV dynamic coverage using deep reinforcement learning. IEEE Trans. Cogn. Commun. Netw. 2024, 10, 1063–1074. [Google Scholar] [CrossRef]
- Dhuheir, M.; Erbad, A.; Al-Fuqaha, A.; Seid, A.M. Meta reinforcement learning for UAV-assisted energy harvesting IoT devices in disaster-affected areas. IEEE Open J. Commun. Soc. 2024, 5, 2145–2163. [Google Scholar] [CrossRef]
- Baccour, E.; Erbad, A.; Hamdi, M.; Guizani, M. RL-based adaptive UAV swarm formation and clustering for secure 6G wireless communications in dynamic dense environments. IEEE Access 2024, 12, 125609–125628. [Google Scholar] [CrossRef]
- Puente-Castro, A.; Rivero, D.; Pedrosa, E.; Pereira, A.; Lau, N.; Fernandez-Blanco, E. Q-learning based system for path planning with unmanned aerial vehicles swarms in obstacle environments. Expert Syst. Appl. 2024, 235, 121240. [Google Scholar] [CrossRef]
- Zou, L. Research on UAV Cooperative Area Search Planning Based on Reinforcement Learning. Master’s Thesis, Huazhong University of Science and Technology, Wuhan, China, 2023. [Google Scholar]
- He, J. Multi-Agent Reinforcement Learning Regional Coverage Method for Real Tasks and Constraints. Ph.D. Thesis, University of Electronic Science and Technology of China, Chengdu, China, 2023. [Google Scholar]
- Hou, Y.; Zhao, J.; Zhang, R.; Cheng, X.; Yang, L. UAV swarm cooperative target search: A multi-agent reinforcement learning approach. IEEE Trans. Intell. Veh. 2023, 9, 568–578. [Google Scholar] [CrossRef]
- Burzyński, W.; Stecz, W. Trajectory planning with multiplatform spacetime RRT. Appl. Intell. 2024, 54, 9524–9541. [Google Scholar] [CrossRef]
- Kelner, J.M.; Burzyński, W.; Stecz, W. Modeling UAV swarm flight trajectories using Rapidly-exploring Random Tree algorithm. J. King Saud Univ.-Comput. Inf. Sci. 2024, 36, 101909. [Google Scholar] [CrossRef]
- Xiang, L.; Wang, F.; Xu, W.; Zhang, T.; Pan, M.; Han, Z. Dynamic UAV swarm collaboration for multi-targets tracking under malicious jamming: Joint power, path and target association optimization. IEEE Trans. Veh. Technol. 2023, 73, 5410–5425. [Google Scholar] [CrossRef]
- Kang, C.; Xu, J.; Bian, Y. Affine formation maneuver control for multi-agent based on optimal flight system. Appl. Sci. 2024, 14, 2292. [Google Scholar] [CrossRef]
- Zhao, W.; Li, L.; Wang, Y.; Zhan, H.; Fu, Y.; Song, Y. Research on a global path-planning algorithm for unmanned aerial vehicle swarm in three-dimensional space based on Theta*–artificial potential field method. Drones 2024, 8, 125. [Google Scholar] [CrossRef]
- Chen, G.; Yuan, S.; Zhu, X.; Zhou, G.; Zhang, Z. Path planning for fast swarm source seeking in unknown environments. Int. J. Adapt. Control Signal Process. 2024, 38, 360–377. [Google Scholar] [CrossRef]
- Li, J.; Zi, S.; Lu, X. Combat strategy of UAV swarm based on improved artificial potential field method. Radio Eng. 2024, 54, 1970–1977. [Google Scholar]
- Kallies, C.; Gasche, S.; Karásek, R. Multi-agent cooperative path planning via model predictive control. In Proceedings of the 2024 Integrated Communications, Navigation and Surveillance Conference (ICNS), Herndon, VA, USA, 23–25 April 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–7. [Google Scholar]
- Fan, X.; Li, H.; Chen, Y.; Dong, D. A path-planning method for UAV swarm under multiple environmental threats. Drones 2024, 8, 171. [Google Scholar] [CrossRef]
- Wang, Y.; Zhang, T.; Cai, Z.; Zhao, J.; Wu, K. Multi-UAV coordination control by chaotic grey wolf optimization based distributed MPC with event-triggered strategy. Chin. J. Aeronaut. 2020, 33, 2877–2897. [Google Scholar] [CrossRef]
- Xian, B.; Song, N. Multi-UAV path planning based on model predictive control and improved artificial potential field method. Control Decis. 2024, 39, 2133–2141. [Google Scholar]
- Wee, L.B.; Paw, Y.C. Simultaneous mapping localization and path planning for UAV swarm. In Proceedings of the 2023 IEEE Aerospace Conference, Big Sky, MT, USA, 4–11 March 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6. [Google Scholar]
- Guo, J.; Gao, Y.; Liu, Y. Task assignment and path planning algorithm for multiple fixed-wing UAVs. J. Taiyuan Univ. Technol. 2025, 56, 348–355. [Google Scholar]
- Li, K.; Yan, X.; Han, Y. Multi-mechanism swarm optimization for multi-UAV task assignment and path planning in transmission line inspection under multi-wind field. Appl. Soft Comput. 2024, 150, 111033. [Google Scholar] [CrossRef]
- Luo, X. Research on Multi-UAV Cooperative Task Decision-Making and Planning Method Based on Ant Colony Algorithm in Unknown Environments. Master’s Thesis, University of Electronic Science and Technology of China, Chengdu, China, 2023. [Google Scholar]
- Zhang, H.; Ma, H.; Mersha, B.W.; Zhang, X.; Jin, Y. Distributed cooperative search method for multi-UAV with unstable communications. Appl. Soft Comput. 2023, 148, 110592. [Google Scholar] [CrossRef]
- Wang, Q.; Xu, M.; Hu, Z. Path planning of unmanned aerial vehicles based on an improved bio-inspired tuna swarm optimization algorithm. Biomimetics 2024, 9, 388. [Google Scholar] [CrossRef]
- Gu, G.; Li, H.; Zhao, C. A multi-strategy enhanced marine predator algorithm for global optimization and UAV swarm path planning. IEEE Access 2024, 12, 112095–112115. [Google Scholar] [CrossRef]
- Xu, N.; Zhu, H.; Sun, J. Bionic 3D path planning for plant protection UAVs based on swarm intelligence algorithms and krill swarm behavior. Biomimetics 2024, 9, 353. [Google Scholar] [CrossRef]
- Fu, S.; Li, K.; Huang, H.; Ma, C.; Fan, Q.; Zhu, Y. Red-billed blue magpie optimizer: A novel metaheuristic algorithm for 2D/3D UAV path planning and engineering design problems. Artif. Intell. Rev. 2024, 57, 134. [Google Scholar] [CrossRef]
- Liu, P.; Sun, N.; Wan, H.; Zhang, C.; Zhao, J.; Wang, G. Improved adaptive snake optimization algorithm with application to multi-UAV path planning. Trans. Inst. Meas. Control 2024, 47, 1639–1650. [Google Scholar] [CrossRef]
- Liu, L.; Lu, Y.; Yang, B.; Yang, L.; Zhao, J.; Chen, Y.; Li, L. Research on a multi-strategy improved sand cat swarm optimization algorithm for three-dimensional UAV trajectory path planning. World Electr. Veh. J. 2024, 15, 244. [Google Scholar] [CrossRef]
- Yin, S.; Yang, J.; Ma, L.; Fu, M.; Xu, K. An enhanced whale algorithm for three-dimensional path planning for meteorological detection of the unmanned aerial vehicle in complex environments. IEEE Access 2024, 12, 60039–60057. [Google Scholar] [CrossRef]
- Chen, F.; Tang, Y.; Li, N.; Wang, T.; Hu, Y. A study of collaborative trajectory planning method based on starling swarm bionic algorithm for multi-unmanned aerial vehicle. Appl. Sci. 2023, 13, 6795. [Google Scholar] [CrossRef]
- Xiang, H.; Han, Y.; Pan, N.; Zhang, M.; Wang, Z. Study on multi-UAV cooperative path planning for complex patrol tasks in large cities. Drones 2023, 7, 367. [Google Scholar] [CrossRef]
- Wu, X.J.; Xu, L.; Zhen, R.; Wu, X. Global and local moth-flame optimization algorithm for UAV formation path planning under multi-constraints. Int. J. Control Autom. Syst. 2023, 21, 1032–1047. [Google Scholar] [CrossRef]
- Zhang, X.; Zhang, X.; Miao, Y. Cooperative global path planning for multiple unmanned aerial vehicles based on improved fireworks algorithm using differential evolution operation. Int. J. Aeronaut. Space Sci. 2023, 24, 1346–1362. [Google Scholar] [CrossRef]
- Hou, J.; Zhou, X.; Pan, N.; Li, A.; Guan, Y.; Xu, C.; Gan, Z.; Gao, F. Primitive-Swarm: An Ultra-lightweight and Scalable Planner for Large-scale Aerial Swarms. arXiv 2025, arXiv:2502.16887. [Google Scholar] [CrossRef]
- Tan, C.; Liu, X. Improved two-stage task allocation of distributed UAV swarms based on an improved auction mechanism. Int. J. Mach. Learn. Cybern. 2024, 15, 5119–5128. [Google Scholar] [CrossRef]
- Wang, G.; Wang, F.; Wang, J.; Li, M.; Gai, L.; Xu, D. Collaborative target assignment problem for large-scale UAV swarm based on two-stage greedy auction algorithm. Aerosp. Sci. Technol. 2024, 149, 109146. [Google Scholar] [CrossRef]
- Aljalaud, F.; Kurdi, H.; Youcef-Toumi, K. Autonomous multi-UAV path planning in pipe inspection missions based on booby behavior. Mathematics 2023, 11, 2092. [Google Scholar] [CrossRef]
- Saadi, A.A.; Soukane, A.; Meraihi, Y.; Gabis, A.B.; Ramdane-Cherif, A. A hybrid improved manta ray foraging optimization with Tabu search algorithm for solving the UAV placement problem in smart cities. IEEE Access 2023, 11, 24315–24342. [Google Scholar] [CrossRef]
- Chen, Y.; Pi, D.; Wang, B.; Mohamed, A.W.; Chen, J.; Wang, Y. Equilibrium optimizer with generalized opposition-based learning for multiple unmanned aerial vehicle path planning. Soft Comput. 2024, 28, 6185–6198. [Google Scholar] [CrossRef]
- Du, Y. Multi-UAV search and rescue with enhanced A∗ algorithm path planning in 3D environment. Int. J. Aerosp. Eng. 2023, 2023, 8614117. [Google Scholar] [CrossRef]
- Bashir, N.; Boudjit, S.; Dauphin, G. A connectivity aware path planning for a fleet of UAVs in an urban environment. IEEE Trans. Intell. Transp. Syst. 2023, 24, 10537–10552. [Google Scholar] [CrossRef]
- Xie, J.; Zhang, G.; Zhang, W. A swarm motion planning algorithm for multi-UAV cooperative tasks. In Proceedings of the 7th National Conference on Swarm Intelligence and Cooperative Control, Harbin, China, 24–27 September 2023; China Command and Control Society, Harbin Institute of Technology Simulation Center: Harbin, China, 2023; p. 7. [Google Scholar]
- Kladis, G.P.; Doitsidis, L.; Tsourveloudis, N.C. Energy-efficient path-planning for UAV swarm based missions: A genetic algorithm approach. In Proceedings of the 2024 International Conference on Unmanned Aircraft Systems (ICUAS), Chania, Greece, 11–14 June 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 458–463. [Google Scholar]
- Wu, Y.; Liang, T.; Gou, J.; Tao, C.; Wang, H. Heterogeneous mission planning for multiple UAV formations via metaheuristic algorithms. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 3924–3940. [Google Scholar] [CrossRef]
- Xiong, T.; Liu, F.; Liu, H.; Ge, J.; Li, H.; Ding, K.; Li, Q. Multi-drone optimal mission assignment and 3D path planning for disaster rescue. Drones 2023, 7, 394. [Google Scholar] [CrossRef]
- Pan, H.; Liu, Y.; Sun, G.; Fan, J.; Liang, S.; Yuen, C. Joint power and 3-D trajectory optimization for UAV-enabled wireless powered communication networks with obstacles. IEEE Trans. Commun. 2023, 71, 2364–2380. [Google Scholar] [CrossRef]
- Du, P.; He, X.; Cao, H.; Garg, S.; Kaddoum, G.; Hassan, M.M. AI-based energy-efficient path planning of multiple logistics UAVs in intelligent transportation systems. Comput. Commun. 2023, 207, 46–55. [Google Scholar] [CrossRef]
- Jia, Z.; Xiao, B.; Qian, H. Improved mixed discrete particle swarms based multi-task assignment for UAVs. In Proceedings of the 2023 IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS), Harbin, China, 5–7 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 442–448. [Google Scholar]
- Wu, J.; Zhang, N.; Li, D.; Bi, J.; Han, G. A context-aware feature fusion method for multi-UAV cooperative air combat. IEEE Trans. Intell. Transp. Syst. 2025, 26, 7197–7210. [Google Scholar] [CrossRef]
- Jasim, A.N.; Fourati, L.C. Guided genetic algorithm for solving capacitated vehicle routing problem with unmanned-aerial-vehicles. IEEE Access 2024, 12, 106333–106358. [Google Scholar] [CrossRef]
- Liu, Y.; Zhu, X.; Zhang, X.; Xiao, J.; Yu, X. RGG-PSO+: Random geometric graphs based particle swarm optimization method for UAV path planning. Int. J. Comput. Intell. Syst. 2024, 17, 127. [Google Scholar] [CrossRef]
- Meng, Q.; Chen, K.; Qu, Q. PPSwarm: Multi-UAV path planning based on hybrid PSO in complex scenarios. Drones 2024, 8, 192. [Google Scholar] [CrossRef]
- Huang, H.; Li, Y.; Song, G.; Gai, W. Deep reinforcement learning-driven UAV data collection path planning: A study on minimizing AoI. Electronics 2024, 13, 1871. [Google Scholar] [CrossRef]
- Cao, Z.; Li, D.; Zhang, B. Dynamic trajectory planning for UAV cluster by weighted Voronoi diagram with particle swarm optimization. In Proceedings of the International Conference on Autonomous Unmanned Systems, Xi’an, China, 23–25 September 2022; Springer Nature: Singapore, 2022; pp. 3479–3490. [Google Scholar]
- Shao, Z.; Zhou, Z.; Qu, G.; Zhu, X. Reference path planning for UAVs formation flight based on PH curve. In Proceedings of the Asia-Pacific International Symposium on Aerospace Technology, Jeju, Republic of Korea, 15–17 November 2021; Springer Nature: Singapore, 2021; pp. 155–168. [Google Scholar]
- Wang, C.; Zhang, L.; Gao, Y.; Zheng, X.; Wang, Q. A cooperative game hybrid optimization algorithm applied to UAV inspection path planning in urban pipe corridors. Mathematics 2023, 11, 3620. [Google Scholar] [CrossRef]
- Sheng, L.; Li, H.; Qi, Y.; Shi, M. Real-time screening and trajectory optimization of UAVs in cluster based on improved particle swarm optimization algorithm. IEEE Access 2023, 11, 81838–81851. [Google Scholar] [CrossRef]
- Tan, L.; Zhang, H.; Shi, J.; Liu, Y.; Yuan, T. A robust multiple unmanned aerial vehicles 3D path planning strategy via improved particle swarm optimization. Comput. Electr. Eng. 2023, 111, 108947. [Google Scholar] [CrossRef]
- Wang, L.; Luan, Y.; Xu, L. UAV swarm path planning method based on dynamic cluster particle swarm optimization. Comput. Appl. 2023, 43, 3816–3823. [Google Scholar]
- Li, Y.; Zhang, L.; Cai, B.; Liang, Y. Unified path planning for composite UAVs via Fermat point-based grouping particle swarm optimization. Aerosp. Sci. Technol. 2024, 148, 109088. [Google Scholar] [CrossRef]
- Yan, X.; Chen, R. Application strategy of unmanned aerial vehicle swarms in forest fire detection based on the fusion of particle swarm optimization and artificial bee colony algorithm. Appl. Sci. 2024, 14, 4937. [Google Scholar] [CrossRef]
- Beishenalieva, A.; Yoo, S.J. Multiobjective 3-D UAV movement planning in wireless sensor networks using bioinspired swarm intelligence. IEEE Internet Things J. 2022, 10, 8096–8110. [Google Scholar] [CrossRef]
- Zhang, J.; Cui, Y.; Ren, J. Dynamic mission planning algorithm for UAV formation in battlefield environment. IEEE Trans. Aerosp. Electron. Syst. 2022, 59, 3750–3765. [Google Scholar] [CrossRef]
- Deng, M.; Yao, Z.; Li, X.; Wang, H.; Nallanathan, A.; Zhang, Z. Dynamic multi-objective AWPSO in DT-assisted UAV cooperative task assignment. IEEE J. Sel. Areas Commun. 2023, 41, 3444–3460. [Google Scholar] [CrossRef]
- Yu, Y. Research on UAV Swarm Cooperative Task Assignment Technology in Complex Constrained Environments. Ph.D. Thesis, Xidian University, Xi’an, China, 2023. [Google Scholar]
- Tang, G.; Xiao, T.; Du, P.; Zhang, P.; Liu, K.; Tan, L. Improved PSO-based two-phase logistics UAV path planning under dynamic demand and wind conditions. Drones 2024, 8, 356. [Google Scholar] [CrossRef]
- Li, Y.; Chen, W.; Liu, S.; Yang, G.; He, F. Multi-UAV cooperative air combat target assignment method based on VNS-IBPSO in complex dynamic environment. Int. J. Aerosp. Eng. 2024, 2024, 9980746. [Google Scholar] [CrossRef]
- Han, D.; Jiang, H.; Wang, L.; Zhu, X.; Chen, Y.; Yu, Q. Collaborative task allocation and optimization solution for unmanned aerial vehicles in search and rescue. Drones 2024, 8, 138. [Google Scholar] [CrossRef]
- Cheng, K.; Hu, T.; Wu, D.; Li, T.; Wang, S.; Liu, K.; Yi, D. Heterogeneous UAV swarm collaborative search mission path optimization scheme for dynamic targets. Int. J. Aerosp. Eng. 2024, 2024, 6643424. [Google Scholar] [CrossRef]
- Pehlivanoglu, V.Y.; Pehlivanoğlu, P. An efficient path planning approach for autonomous multi-UAV system in target coverage problems. Aircr. Eng. Aerosp. Technol. 2024, 96, 690–706. [Google Scholar] [CrossRef]
- Li, Y.; Chen, W.; Fu, B.; Wu, Z.; Hao, L.; Yang, G. Research on dynamic target search for multi-UAV based on cooperative coevolution motion-encoded particle swarm optimization. Appl. Sci. 2024, 14, 1326. [Google Scholar] [CrossRef]
- Tang, Y.; Huang, K.; Tan, Z.; Fang, M.; Huang, H. Multi-subswarm cooperative particle swarm optimization algorithm and its application. Inf. Sci. 2024, 677, 120887. [Google Scholar] [CrossRef]
- Wang, Y.; Li, X.; Zhuang, X.; Li, F.; Liang, Y. A sampling-based distributed exploration method for UAV cluster in unknown environments. Drones 2023, 7, 246. [Google Scholar] [CrossRef]
- Yan, X.; Chen, R.; Jiang, Z. UAV cluster mission planning strategy for area coverage tasks. Sensors 2023, 23, 9122. [Google Scholar] [CrossRef]
- Chen, Y.; Qin, D.; Yang, X.; Zhang, G.; Zhang, X.; Ma, L. A deployment strategy for UAV-aided data collection in unknown environments. IEEE Sens. J. 2024, 24, 27017–27028. [Google Scholar] [CrossRef]
- Yan, Y.; Sun, Z.; Hou, Y.; Zhang, B.; Yuan, Z.; Zhang, G.; Ma, X. UAV swarm mission planning and load sensitivity analysis based on clustering and optimization algorithms. Appl. Sci. 2023, 13, 12438. [Google Scholar] [CrossRef]
- Yu, S. Research on Task Assignment and Path Planning Methods of UAV Swarm for Target Tracking. Master’s Thesis, Shenyang University of Technology, Shenyang, China, 2023. [Google Scholar]
- Wang, X.; Zhang, X.; Lu, Y.; Zhang, H.; Li, Z.; Zhao, P.; Wang, X. Target trajectory prediction-based UAV swarm cooperative for bird-driving strategy at airport. Electronics 2024, 13, 3868. [Google Scholar] [CrossRef]
- Szklany, M.; Cohen, A.; Boubin, J. Tsunami: Scalable, fault tolerant coverage path planning for UAV swarms. In Proceedings of the 2024 International Conference on Unmanned Aircraft Systems (ICUAS), Chania, Greece, 11–14 June 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 711–717. [Google Scholar]
- Yu, Y.; Lee, S. Multi-UAV coverage path assignment algorithm considering flight time and energy consumption. IEEE Access 2024, 12, 26150–26162. [Google Scholar] [CrossRef]
- Gui, J.; Yu, T.; Deng, B.; Zhu, X.; Yao, W. Decentralized multi-UAV cooperative exploration using dynamic centroid-based area partition. Drones 2023, 7, 337. [Google Scholar] [CrossRef]
- Swain, S.; Khilar, P.M.; Senapati, B.R. An efficient path planning algorithm for 2D ground area coverage using multi-UAV. Wirel. Pers. Commun. 2023, 132, 361–407. [Google Scholar] [CrossRef]
- Bakirci, M.; Ozer, M.M. Post-disaster area monitoring with swarm UAV systems for effective search and rescue. In Proceedings of the 2023 10th International Conference on Recent Advances in Air and Space Technologies (RAST), Istanbul, Turkey, 8–10 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6. [Google Scholar]
- Zhao, X.; Zhang, W.; Zhang, H.; Zheng, C.; Ma, J.; Zhang, Z. ITD-YOLOv8: An infrared target detection model based on YOLOv8 for unmanned aerial vehicles. Drones 2024, 8, 161. [Google Scholar] [CrossRef]
- Li, Z.; Lei, L.; Shen, G.; Liu, X.; Liu, X. Digital Twin-Enabled Deep Reinforcement Learning for Safety-Guaranteed Flocking Motion of UAV Swarm. Trans. Emerg. Telecommun. Technol. 2024, 35, e70011. [Google Scholar] [CrossRef]
- Shen, G.; Lei, L.; Zhang, X.; Li, Z.; Cai, S.; Zhang, L. Multi-UAV Cooperative Search Based on Reinforcement Learning with a Digital Twin Driven Training Framework. IEEE Trans. Veh. Technol. 2023, 72, 8354–8368. [Google Scholar] [CrossRef]
- Sun, Y.; Fazli, P. Real-Time Policy Distillation in Deep Reinforcement Learning. arXiv 2019, arXiv:1912.12630. [Google Scholar] [CrossRef]
- Stulp, F.; Schaal, S. Hierarchical Reinforcement Learning with Movement Primitives. In Proceedings of the 11th IEEE-RAS International Conference on Humanoid Robots, Bled, Slovenia, 26–28 October 2011; pp. 231–238. [Google Scholar]
- Joint Authorities for Rulemaking on Unmanned Systems (JARUS). SORA—Specific Operations Risk Assessment for Unmanned Aircraft Systems, v2.5; Main Body (JAR Doc 25); Joint Authorities for Rulemaking on Unmanned Systems (JARUS): Vienna, Austria, 2024. [Google Scholar]
- European Commission. Commission Implementing Regulation (EU) 2021/664 of 22 April 2021 on a Regulatory Framework for the U-Space. Off. J. Eur. Union 2021, 139, 161–183. [Google Scholar]
- ASTM F3269-21; Standard Practice for Methods to Safely Bound Behavior of Aircraft Systems Containing Complex Functions Using Run-Time Assurance. ASTM International: West Conshohocken, PA, USA, 2021.
- ASTM F3548-21; Standard Specification for UAS Traffic Management (UTM) UAS Service Supplier (USS) Interoperability. ASTM International: West Conshohocken, PA, USA, 2021.


| Framework | Primary Axis/Taxonomy Style | Scope/Focus | What It Captures Well | Gaps for Multi-UAV Swarms (Re Scenario Cells) |
|---|---|---|---|---|
| Aggarwal & Kumar [2] | Method lineage: classical/heuristic/meta-heuristic/ML/hybrid | UAV path planning (broad) | Clear family catalog; quick orientation by “how it works” | No explicit mission (Path/Distribution/Coverage) × planning (offline/online) × environment (static/dynamic) mapping; weak deployment guidance |
| Yang et al. [3] | Algorithmic paradigm: sampling/node/model-based/bio-inspired/multi-fusion | UAV 3D path planning (classical emphasis) | Nuanced split of classical planners | Platform-/scenario-agnostic; when to add MPC/APF/RL safety or which architecture to pick is unclear |
| Ait Saadi et al. [7] | Optimization approach: classical/heuristic/meta-heuristic/ML/hybrid | UAV path via optimization | Broad optimization view; pros/cons per family | Limited linkage to online vs. offline and dynamic vs. static missions; few cross-scenario rules |
| Zhang et al. [8] | Cooperative path planning; optimization-oriented synthesis | Multi-UAV cooperative path planning | Cooperation aspects summarized | Lacks scenario-conditioned mapping across Path/Distribution/Coverage and environment dynamics |
| Cetinsaya et al. [4] | Systematic review of control + path (2013–2023) | UAVs/UAS and swarms (controls + planning) | Decade-wide sweep; challenges and trends | Platform-level view; not tied to scenario cells or deployment bridges |
| Rahman et al. [5] | Family buckets: meta-heuristic/classical/heuristic/ML/hybrid; comparative criteria | Multi-UAV path planning | Family usage stats; criteria (time, cost, complexity, convergence, adaptability) | Comparisons not anchored to mission/planning/environment contexts |
| Wu et al. [6] | Bibliometric mapping (MKD) of 2000–2024 | Trend and cluster analysis | Macro trends; surge post-2018; scenario-agnostic map | Method-agnostic; no operational scenario guidance |
| Ghambari et al. [9] | Taxonomy + environment modeling; optimality/completeness | General UAV path planning | Modeling choices and criteria well covered | Stops short of dynamic coverage, allocation–routing coupling, bandwidth-limited teams across scenario cells |
| Chen et al. [10] | Navigation stack: perception/planning/collaboration/control | Autonomous and multi-robot navigation (ground and aerial platforms) | System-level view of the navigation stack; links perception, planning and coordination | Not UAV-swarm-specific; no mission (Path/Distribution/Coverage) × planning-mode × environment grid; limited treatment of low-altitude airspace, energy and bandwidth constraints |
| Bui [11] | Four-axis taxonomy: robot model/environment type/communication mode/planner type | Multi-robot motion planning (platform-agnostic) | Makes communication modes and planner centralization explicit; clear gap analysis for different planner types | Focuses on general multi-robot systems; not tailored to low-altitude UAV corridors or swarm-scale missions; no mapping to Path/Distribution/Coverage cells |
| Athira et al. [12] | PRISMA-based task-assignment taxonomy (objectives, constraints, solution methods) | Multi-robot task allocation (mainly ground robots) | Detailed classification of task-assignment formulations and solvers; thorough PRISMA synthesis | Covers allocation but not 3D routing/coverage coupling; little UAV-swarm evidence; no mission/planning/environment mapping across scenario cells |
| Algorithm Family | Primary Fit | Secondary Fit | Runtime Compute | Offline Cost | Typical Role/Key Applicability Notes |
|---|---|---|---|---|---|
| Graph search (A*/Dijkstra/JPS) | PFS | PND, PFD, DND, DFD, DFS, CND, CFD, CFS | Medium–High | Low | Reproducible offline backbone on known maps; grid/voxel dependence; expensive global replans on very large graphs |
| RRT/RRT* (incl. spatiotemporal) | PND | PFS, PFD, DND, DFD, DFS, CND, CFD, CFS | Medium–High | Low | Anytime kinodynamic routing; replan-friendly; slow in narrow passages; sensitive to collision-check budget |
| APF/reactive safety layers | PND | DND, CND | Low | Low | Near-field safety wrapper; low latency; local minima/oscillation; gain tuning sensitive; needs global backbone |
| MPC (incl. distributed MPC) | PND | PFS, PFD, DND, DFD, CND, CFD | High | Medium | Constraint-aware receding horizon; strong tracking; solver/model sensitivity; best for moderate team sizes |
| RL/MARL (CTDE, PPO/TD3/COMA, etc.) | PND, CND | PFD, DND, DFD, CFD | Medium (inference) | High | Online adaptivity, rich objectives; reward/safety design + sim-to-real issues; typically needs safety wrapper/monitor |
| PSO/PSO-hybrids | PFD, DFD, CFD | PND, PFS, DND, DFS, CND, CFS | Medium (rolling) | Medium–High | Time-varying cost optimization; parameter sensitivity; improved by graph seeding/clustering/decomposition hybrids |
| GA/DE (incl. hybrids) | PFS, DFS, CFS | PND, PFD, DFD, CND, CFD | Low (runtime) | Medium–High | Offline multi-objective optimization/scheduling/capacitated tours; stochastic variability; no deterministic optimality |
| ACO (incl. SOM/FCM seeding) | DND | PND, PFS, PFD, DFD, DFS, CND, CFD, CFS | Low–Medium | Medium | Constructive allocation–routing under uncertainty; pheromone tuning; large-graph scalability limits; comm/pheromone sharing often needed |
| Unsupervised (clustering/partition embeds) | DND | DFD, DFS, CND, CFD | Low–Medium | Medium | Structure discovery for partitioning and load/bandwidth balancing; metric/cluster sensitivity; needs periodic rebalancing |
| Area segmentation (Voronoi/decomposition) | CND, CFD, CFS | DND | Low | Medium | Deterministic spatial structure for coverage; reduces overlap; requires stitching and dynamic reweighting for balance |
| Auction/market-based tasking | DND | PND, DFD, DFS, CND, CFD | Low–Medium | Medium | Online (re)assignment via bids/consensus; communication-heavy; pairs with local planners for motion feasibility |
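To make the graph-search row above concrete, the following is a minimal A* sketch on a 4-connected occupancy grid. All names and the toy grid are illustrative, not drawn from any cited study; deployed swarm planners layer kinodynamic smoothing and incremental replanning on top of such a backbone.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid (1 = obstacle, 0 = free).

    Manhattan distance is an admissible heuristic for unit-cost moves,
    so the returned path is shortest in step count.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start, None)]   # (f, g, node, parent)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, node, parent = heapq.heappop(open_set)
        if node in came_from:                 # already expanded
            continue
        came_from[node] = parent
        if node == goal:                      # reconstruct path
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    heapq.heappush(open_set, (ng + h(nxt), ng, nxt, node))
    return None                               # goal unreachable
```

The "expensive global replans on very large graphs" caveat in the table corresponds to the priority-queue growth here: every map change can invalidate `g_cost` and force a fresh search, which is why incremental variants (D* Lite, JPS) are preferred online.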
| Scenario | Best-Suited Algorithm Family | Rationale |
|---|---|---|
| PND | RL (PPO/COMA/TD3/DQN) | Online learning, strong adaptability and scalability in dynamic scenes |
| PFS | Graph-search (A*/Dijkstra); GA | Optimality and speed in static maps; GA effective for offline multi-objective paths |
| PFD | PSO/PSO-hybrids | Adapts to time-varying costs; improved convergence and scale |
| DND | Auction/market-based | Real-time scalability; re-bid/utilization benefits |
| DFD | PSO/PSO-hybrids | Scales to large assignments; hybrids escape local optima |
| DFS | GA (and GA-SA hybrids) | Effective constraint-aware offline assignment; integrates with routing |
| CND | RL | Drives exploration; reduces revisits; energy-aware adaptation |
| CFD | PSO/PSO-hybrids | Coverage efficiency, balanced load in dynamic targets/environments |
| CFS | GA; DE | Widely used baselines; hybrids improve efficiency and route quality |
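Several rows above recommend PSO for time-varying (PFD/DFD/CFD) settings. A minimal sketch of the underlying optimizer is given below; in a rolling-horizon planner it would simply be re-run each cycle as the cost field (threats, targets, wind) changes. All parameter values and names are illustrative assumptions, not taken from the cited works.

```python
import random

def pso_minimize(cost, dim, bounds, n_particles=30, iters=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO minimizing `cost` over a box domain."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]               # personal bests
    pbest_val = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                # inertia + cognitive + social velocity update
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            v = cost(pos[i])
            if v < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], v
                if v < gbest_val:
                    gbest, gbest_val = pos[i][:], v
    return gbest, gbest_val
```

The parameter sensitivity noted in the tables is visible here: `w`, `c1`, and `c2` directly trade exploration against convergence speed, which is what the PSO hybrids (graph seeding, clustering) aim to stabilize.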
| Scenario | Metrics Reported in the Cited Studies | Representative Sources |
|---|---|---|
| PND | Mission/coverage efficiency; obstacle-avoidance success; robustness and scalability across swarm sizes | DQN hybrid improves task-completion efficiency and obstacle-avoidance success under dynamic obstacles; COMA outperforms non-learning baselines across sizes/conditions (robustness and scalability) |
| PFS | Energy and path-length (multi-objective); runtime/memory; connectivity compliance | GA minimizes energy and path length for swarm trajectories; enhanced A* reduces the heavy runtime/memory cost of large search spaces; connectivity-aware path planning validated in urban airspace |
| PFD | Shorter paths; faster convergence; lower runtime; AoI for data collection; planning efficiency in dynamic airspace | PPSwarm: shorter paths, faster convergence, lower runtime; DP-MATD3 lowers the Age of Information in multi-UAV data collection; weighted-Voronoi + PSO shows high planning efficiency and robust avoidance in dynamic environments |
| DND | Flight time; energy consumption; completion rate; resource consumption | Power-line inspection under heterogeneous wind fields: flight time, energy; two-stage auction boosts completion rate and reduces resource consumption |
| DFD | Information value vs. flight time/energy (multi-objective); convergence/diversity under dynamics; re-allocation efficiency | WSN planning maximizes sensing information while minimizing time and energy; DT-assisted PSO improves convergence/diversity in dynamic conditions; dynamic re-allocation improves execution flexibility/efficiency |
| DFS | Execution efficiency; convergence speed; route cost | Disaster-relief AGA/SCPSO shows strong efficiency, convergence, and route cost performance on benchmarks |
| CND | Coverage efficiency; task efficiency; energy consumption; exploration time; load balance | Tsunami dynamic reassignment improves coverage efficiency (fault-tolerant); multi-base-station/area-segmentation planners improve task efficiency and reduce energy; DCAS shortens exploration time and improves load balance |
| CFD | Minimum tour distance; rapid, energy-aware exploration; energy/load balance; coverage efficiency | Checkpoint coverage focuses on minimum distance tours; DNBPT targets maximum gain with minimal energy and yields faster exploration and higher coverage efficiency; PSOHAC raises coverage, trims energy, and balances load |
| CFS | Battery/spray consumption; overall planning efficiency in waypoint-dense fields | GGA reduces battery and spray-tank usage, improving overall planning efficiency |
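The PFD and data-collection rows report Age of Information (AoI). Under the common simplifying assumptions that age is zero at the start of the horizon and resets to zero at each (instantaneous) visit, the time-averaged AoI is a sum of triangular areas; a sketch of that computation follows (function name is ours, not from the cited studies).

```python
def average_aoi(visit_times, horizon):
    """Time-averaged Age of Information for one sensor over [0, horizon].

    Age grows linearly between visits and resets to zero at each visit,
    so the integral of age is a sum of triangle areas gap^2 / 2.
    """
    ts = sorted(t for t in visit_times if 0 <= t <= horizon)
    area, last = 0.0, 0.0
    for t in ts:
        gap = t - last
        area += gap * gap / 2.0   # triangle since the previous reset
        last = t
    tail = horizon - last
    area += tail * tail / 2.0     # unvisited tail of the horizon
    return area / horizon
```

For example, a single visit at the midpoint of a 10 s horizon gives an average AoI of 2.5 s, while evenly spaced visits drive the average down quadratically with their count, which is why AoI-driven planners favor frequent, balanced revisits over minimum-length tours.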
| Task/Scenario Pattern | Brief Task Description | Reported Evaluation Setup |
|---|---|---|
| Forest-fire missions (CND/CFD) | Coverage/search and ignition-source localization; trajectory planning for reconnaissance. | Rugged/complex terrain; dynamic fire-risk maps; simulation-based assessment of coverage/time/energy. |
| Transmission-line inspection under multi-wind fields (DND) | Joint task allocation and path planning targeting safety, time, energy, and connectivity compliance. | Heterogeneous wind-field profiles; environmental constraints along transmission corridors; online reallocation effects. |
| Airport bird-dispersion (DND) | Trajectory prediction + assignment + curvature-constrained interception (Dubins). | Parametric simulations over fleet size, exclusion radius, and intrusion cases; LSTM–Kalman prediction + Hungarian assignment; curvature-limited paths. |
| Urban pipeline corridors inspection (PFD/CFD) | 3D inspection-route optimization with connectivity/safety constraints. | Corridor geometry and obstacle density specified; smooth, collision-free 3D trajectories; connectivity maintained along the route. |
| Post-disaster SAR (DFS/CND) | Cooperative allocation + routing; distributed swarm scanning for large areas. | Case studies with dynamic reassignment/failure recovery; polygonal maps discretized into GPS waypoints; runtime wavefront dispatch. |
| Plant-protection coverage (CFD/CFS) | Multi-sub-swarm cooperative coverage minimizing round-trip distance and non-spraying time. | Field tiling/area partition; target-density and operation-speed constraints; evaluation by distance/non-spraying time. |
| AoI-driven data collection (PFD/CND) | Multi-UAV tours minimizing the Age of Information. | Time-varying sensing tasks; AoI computed from visit times; communication/scheduling constraints considered. |
| Urban patrol/corridor navigation (PND/PFS/CFD) | Patrol/route planning under urban geofences and curvature limits. | Low-altitude urban corridors; connectivity and minimum-separation constraints; route smoothness/feasibility checks. |
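Several DND patterns above (power-line inspection, bird dispersion, SAR reassignment) rely on market-based tasking. A minimal sequential single-item auction is sketched below; the additive load penalty is a toy stand-in for the consensus and re-bid rules of real market-based planners, and all names are illustrative assumptions.

```python
def sequential_auction(cost, n_uavs, n_tasks):
    """Sequential single-item auction.

    cost[u][t] is UAV u's base cost for task t. Each task goes to the
    UAV with the lowest marginal bid (base cost plus accumulated load),
    which spreads work across the team.
    """
    assignment = {}
    load = [0.0] * n_uavs
    for t in range(n_tasks):
        # each UAV bids its base cost inflated by its current load
        bids = [(cost[u][t] + load[u], u) for u in range(n_uavs)]
        _, winner = min(bids)
        assignment[t] = winner
        load[winner] += cost[winner][t]
    return assignment
```

The communication cost flagged in the algorithm-family table shows up here as one bid round per task; decentralized variants (e.g., consensus-based bundle auctions) amortize this by exchanging bundles of bids, at the price of slower convergence.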
Li, J.; Li, J.; Zhang, J.; Meng, W. A Comprehensive Review of Path-Planning Algorithms for Multi-UAV Swarms. Drones 2026, 10, 11. https://doi.org/10.3390/drones10010011

