Article

RG-HDP-VD: A Physics-Aware Cooperative Trajectory Planning Framework for Heterogeneous Multi-UAVs

1
College of Aviation Electronic and Electrical Engineering, Civil Aviation Flight University of China, Chengdu 641419, China
2
Institute of Electrical Engineering, Chinese Academy of Sciences, Beijing 100190, China
3
University of Chinese Academy of Sciences, Beijing 100049, China
4
Qingdao Air Traffic Management Station, Civil Aviation of China, Qingdao 266300, China
*
Author to whom correspondence should be addressed.
Drones 2026, 10(3), 192; https://doi.org/10.3390/drones10030192
Submission received: 21 January 2026 / Revised: 24 February 2026 / Accepted: 4 March 2026 / Published: 10 March 2026

Highlights

What are the main findings?
  • A physics-aware cooperative planning framework (RG-HDP-VD) was developed and validated in real-world flights, integrating mass-augmented energy topology, regret-guided arbitration, and velocity decomposition.
  • The framework demonstrates superior scalability in saturated airspace, maintaining a 95% success rate where baselines fail, while reducing average planning time by ~45% and lowering total system energy by 6.7%.
What are the implications of the main findings?
  • Physics-consistent right-of-way allocation mitigates energy-inefficient congestion by prioritizing high-penalty platforms, providing a highly scalable alternative to conventional methods that prevents deadlocks.
  • Mapping rigid time-window constraints into length-feasible regions via velocity envelopes offers a robust way to maintain spatiotemporal feasibility and prevent timing failures without relying on terminal loitering.

Abstract

This paper presents Regret-Guided Heuristic Decentralized Prioritized Planning with Velocity Decomposition (RG-HDP-VD), a physics-aware cooperative trajectory planning framework for relief delivery by heterogeneous unmanned aerial vehicles (UAVs) in post-earthquake, non-convex canyon environments. RG-HDP-VD addresses two prevalent failure modes: energy-inefficient congestion caused by ignoring time-varying payload dynamics, and the collapse of feasible sets due to strict arrival windows in fixed-speed planning. We construct a mass-augmented energy topology and use a mass-augmented energy-aware A* search to extract baseline physical metrics (path length, total energy, and unit-distance energy) for each UAV. Regret-Guided (RG) arbitration then quantifies the relative energy cost of waiting versus detouring at conflicts and grants right-of-way to heavy-load, high-cost platforms. These priorities are embedded into Heuristic Decentralized Prioritized Planning (HDP), which maintains a global spatiotemporal occupancy map and serializes planning to eliminate deadlocks. To satisfy tight time windows, Velocity Decomposition (VD) maps 4D temporal constraints into a 3D path-length feasible interval and is realized via an improved VD-TSRRT* sampling-based planner. In high-fidelity simulations, RG-HDP-VD demonstrates superior scalability over conventional methods, maintaining high success rates (up to 100%) in saturated scenarios, while reducing average planning time by ~45% and total system energy by 6.7%. Finally, real-world flight demonstrations using a heterogeneous quadrotor team validate the framework’s practical feasibility and robust hardware execution.

1. Introduction

Post-earthquake rescue operations often take place in extreme conditions, where transportation networks are disrupted, communication infrastructure is degraded, and terrain exhibits drastic elevation variations. In unstructured and constrained airspace formed by canyons, ridges, and fragmented slopes, aerial delivery has become a key means of achieving relief-supply coverage within the golden hour. Consequently, heterogeneous multi-unmanned aerial systems (HMUASs)—leveraging complementary payload capabilities and coordinated task execution—have emerged as an important paradigm for post-disaster UAV operations [1]. As a cornerstone of autonomous HMUAS missions, multi-UAV cooperative path planning (MUCPP) aims to generate collision-free trajectories satisfying physical feasibility and inter-agent safety [2,3,4]. While recent surveys [5,6] have cataloged the algorithmic landscape of MUCPP, they consistently identify scalability, strong constraint coupling, and physical consistency as the most persistent barriers to deployable collaboration.
However, the strong spatiotemporal coupling inherent to post-disaster delivery poses significant challenges, a point also emphasized in recent reviews [6]. First, severe occlusions and bottleneck passages induced by non-convex terrain lead to dense trajectory interleaving in terminal airspace. Second, many missions require simultaneous arrival (SAT) to enable saturated service at target regions [7,8]. This evolves the planning problem into a joint optimization in a high-dimensional configuration space, where complexity grows rapidly with environmental clutter. Some studies [9,10] emphasize that introducing continuous-time domains and variable action durations substantially increases the difficulty of conflict resolution. More critically, post-disaster delivery introduces variable-mass dynamics, a key physical factor often overlooked in conventional MUCPP: the step change in mass due to payload release can cause nonlinear variations in propulsion energy consumption and hovering cost, and such effects differ significantly across heterogeneous platforms. Recent data-driven energy modeling [11] suggests that ignoring such state-dependent power variations can significantly distort cost evaluation, leading to suboptimal or even failed coordination.
Existing MUCPP approaches can generally be categorized into reactive, coupled, and decoupled paradigms [4]. Reactive methods (e.g., Optimal Reciprocal Collision Avoidance (ORCA) [12] and artificial potential fields [13]) generate collision-avoidance behaviors through local rules, offering strong online responsiveness and low computational overhead. However, their reliance on local information limits system-level energy efficiency and completeness guarantees, and makes it difficult to enforce mission-level temporal constraints such as SAT. Coupled methods (e.g., Conflict-Based Search (CBS) [14] and its bounded-suboptimal extension Explicit Estimation Conflict-Based Search (EECBS) [15]) jointly resolve conflicts under a global constraint framework, providing stronger theoretical guarantees but often suffering from scalability and real-time limitations as the number of agents and conflicts increases [14,15,16,17]. Decoupled methods (e.g., prioritized planning (PP) [18], decentralized prioritized planning (DPP) [19], and dynamic-priority sampling planners [20]) improve scalability by serializing multi-agent planning through priority assignment, but their performance can be highly sensitive to priority order and may fail on solvable instances under unfavorable sequencing [18,19,20,21]. While learning-assisted approaches [22,23,24] have gained traction for their adaptability in dense airspace, ensuring strict constraint satisfaction and physical interpretability remains challenging. Specifically, if right-of-way arbitration relies solely on geometric distance or random ordering, it ignores the nonlinear dependence of energy cost on payload mass, potentially leading to physically inefficient outcomes [11,25].
In summary, post-disaster canyon delivery can be formulated as a cooperative planning problem characterized by platform heterogeneity, time-varying payloads, and strong spatiotemporal coupling. This setting exposes two key gaps:
  • Insufficient modeling of physical heterogeneity and payload changes can trigger energy imbalance and deadlocks. Heavy-lift and light UAVs exhibit strong asymmetry in hovering power. In bottlenecks, distance-based arbitration underestimates the waiting penalty of heavy-load platforms, forcing them to idle where hovering dominates expenditure. This may trigger energy-inefficient congestion. Consistent with the need for realistic cost modeling highlighted in recent surveys [5,6], this gap calls for a physically consistent, regret-guided arbitration mechanism.
  • Rigid coupling between time windows and fixed-speed assumptions leads to rapid feasible-set shrinkage and cascading failures. SAT missions with tight windows often suffer from solution-space contraction. Fixed-speed assumptions rigidly couple arrival time to path length, meaning any detour directly erodes time margins. While spacetime planning variants (e.g., spacetime RRT* [26]) and learning-based conflict resolution [23,24,27] attempt to handle these constraints, they often lack the explicit velocity elasticity needed for robust rhythm modulation. To prevent single-agent infeasibility from propagating through the fleet, time feasibility should be transformed into distributed velocity envelopes along flight segments, structurally mitigating the contraction of the feasible set.
To address these challenges, we propose a physics-aware cooperative planning framework for heterogeneous multi-UAVs, termed RG-HDP-VD. The framework integrates physical constraints into a layered perception–decision–execution pipeline. First, a mass-augmented energy topology is constructed to extract baseline physical metrics via mass-augmented energy-aware A*. Second, Regret-Guided (RG) dynamic arbitration quantifies the asymmetric energy costs of waiting versus detouring and assigns right-of-way to platforms with higher waiting penalties. This arbitration is embedded into HDP [28] to maintain spatiotemporal consistency and guarantee deadlock-free serialized planning. Finally, Velocity Decomposition (VD) relaxes rigid time-window constraints into an elastic path-length feasible set, enabling efficient planning under tight deadlines when combined with an improved VD-TSRRT* algorithm.
The main contributions are summarized as follows:
  • Regret-Guided dynamic right-of-way arbitration. A physically grounded fairness mechanism that explicitly captures energy asymmetry across heterogeneous platforms to prevent energy-inefficient congestion in bottleneck airspace.
  • Velocity-decomposition-based elastic spatiotemporal planning. By introducing velocity envelopes, rigid fixed-speed constraints are decoupled into an elastic path-length feasible set, expanding feasibility under tight time windows in non-convex terrain.
  • A physics-aware layered cooperative planning framework (RG-HDP-VD). The framework integrates mass-augmented energy topology, dynamic arbitration, and executable 4D trajectory generation with adaptive smoothing (B-spline or PCHIP) and continuous collision checking.
The remainder of this paper is organized as follows: Section 2 formalizes the post-disaster heterogeneous cooperative delivery task as a joint optimization problem; Section 3 details the layered algorithmic implementation of RG-HDP-VD; Section 4 validates the effectiveness and robustness of the proposed framework through high-fidelity simulation experiments; Section 5 presents real-world flight demonstrations; and Section 6 concludes the paper.

2. Problem Formulation and System Modeling

This section models the cooperative material delivery task of heterogeneous multi-UAVs in complex fractal canyon environments as a joint optimization problem constrained by multiple physical limits and strong spatiotemporal coupling. We explicitly characterize the time-varying mass induced by payload release and the resulting differences in energy-consumption topology. In addition, an elastic velocity envelope is introduced to describe the physical feasibility region for temporal coordination, providing a mathematical constraint basis for subsequent algorithm design. The essential mathematical notations used in this framework are detailed in Appendix A.

2.1. System Modeling and Performance Constraints

Consider a multi-UAV system $\mathcal{U} = \{U_1, \ldots, U_N\}$ composed of $N$ heterogeneous UAVs executing a cooperative delivery task in a 3D workspace $\mathcal{W} \subset \mathbb{R}^3$. The planner must generate an executable trajectory $\sigma_i$ for each UAV from a start point $s_i$ to a goal region $G_i$. The environment consists of fractal terrain obstacles $O_{terrain}$ (non-convex height fields) and a cylindrical threat zone $O_{threat}$, denoted as $O_{env} = O_{terrain} \cup O_{threat}$. The heterogeneous characteristics of UAV $U_i$ are described by the parameter tuple $P_i$.

2.1.1. Dynamic Mass Model

Material delivery causes a sudden change in UAV mass at the drop moment. The total mass is modeled as a piecewise-constant function with respect to the drop time $t_{drop,i}$:
$$ m_i(t) = m_{empty,i} + \mathbb{I}(t < t_{drop,i})\, m_{payload,i}, \tag{1} $$
where $\mathbb{I}(\cdot)$ is the indicator function. This model enables the planner to distinguish the physical cost differences between heavy-load and light-load states.

2.1.2. Mass-Augmented Energy Cost Model

To address the issue of energy fairness among heterogeneous UAVs, the system moves beyond Euclidean distance and constructs a cost model $J_{eng}$ based on physical work. The total energy consumption of any trajectory $\sigma_i$ consists of motion work and hovering (waiting) energy:
$$ J_{eng}(\sigma_i) = \int_0^T \big( P_{motion}(v, \dot{z}, m_i(t)) + P_{hover}(m_i(t)) \big)\, dt, \tag{2} $$
where $\dot{z}$ denotes the vertical velocity of the UAV. In implementation, this integral is approximated by discrete path segment accumulation. The model incorporates two key physical attributes:
  • Asymmetric Motion Power: $P_{motion}$ distinguishes between climbing and descending efficiency, penalizing aggressive maneuvering under heavy-load states;
  • Nonlinear Hovering Power: $P_{hover} = \alpha_h m^{1.5}$, where $\alpha_h$ is the hovering power coefficient, implying that the cost for heavy-load platforms to wait at bottlenecks is significantly higher, providing a physical basis for subsequent right-of-way arbitration.
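The discrete accumulation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the hovering term uses the $\alpha_h m^{1.5}$ form stated in the text, whereas `motion_power` and its coefficients `c_level`, `c_climb`, and `c_descend` are hypothetical placeholders chosen only to show climb/descent asymmetry. Mass is evaluated at the start of each segment.

```python
from typing import List, Tuple

G = 9.81  # gravitational acceleration, m/s^2


def mass_at(t: float, m_empty: float, m_payload: float, t_drop: float) -> float:
    """Piecewise-constant mass: the payload counts only before the drop time."""
    return m_empty + (m_payload if t < t_drop else 0.0)


def hover_power(m: float, alpha_h: float) -> float:
    """Nonlinear hovering power P_hover = alpha_h * m**1.5 (form given in the text)."""
    return alpha_h * m ** 1.5


def motion_power(v: float, z_dot: float, m: float,
                 c_level: float = 0.5, c_climb: float = 1.0,
                 c_descend: float = 0.3) -> float:
    """Hypothetical asymmetric motion power: climbing costs more than descending."""
    if z_dot >= 0:
        vertical = c_climb * m * G * z_dot
    else:
        vertical = c_descend * m * G * (-z_dot)
    return c_level * m * v + vertical


def trajectory_energy(segments: List[Tuple[float, float, float]],
                      m_empty: float, m_payload: float,
                      t_drop: float, alpha_h: float) -> float:
    """Sum (P_motion + P_hover) * dt over segments given as (duration, speed, z_dot)."""
    t, energy = 0.0, 0.0
    for dt, v, z_dot in segments:
        m = mass_at(t, m_empty, m_payload, t_drop)  # mass at segment start
        energy += (motion_power(v, z_dot, m) + hover_power(m, alpha_h)) * dt
        t += dt
    return energy
```

Because the mass switches at `t_drop`, two hovering segments of equal duration cost different amounts of energy before and after the payload is released, which is the effect the arbitration layer later exploits.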

2.1.3. Kinematic Feasibility Under a Velocity Envelope

Each UAV is modeled as a point mass subject to speed limits, with dynamics $\dot{p}_i(t) = v_i(t)$. Unlike fixed-speed assumptions, we introduce an elastic velocity envelope, allowing the aircraft to adjust its speed within physical limits to satisfy temporal coordination. The speed constraint is defined as:
$$ v_{min,i} \le \|v_i(t)\| \le v_{max,i}, \quad \forall t \in [0, T]. \tag{3} $$
The feasible interval $[v_{min,i}, v_{max,i}]$ forms the physical foundation for temporal coordination.

2.2. Spatiotemporal Cooperative Constraints

Cooperative planning requires heterogeneous UAVs to satisfy static obstacle avoidance in non-convex environments, maintain inter-agent safety separation in continuous time, and achieve temporal coordination under physical feasibility. We therefore provide computable constraint definitions from both spatial and temporal coupling perspectives.

2.2.1. Continuous Inter-Agent Safety Constraint

To avoid the tunneling effect caused by discrete time-step detection during high-speed relative motion, we employ continuous occupancy sets to characterize inter-agent separation. Let the spatial occupancy of UAV $U_i$ at time $\tau$ be the Minkowski sum of its centroid trajectory and a safety sphere:
$$ V_i(\tau) = p_i(\tau) \oplus B(0, r_{safe}), \tag{4} $$
where $p_i(\tau) \in \mathbb{R}^3$ is the position, and $r_{safe}$ integrates body scale and control uncertainty. The global temporal safety constraint is defined as:
$$ \mathrm{dist}(V_i(\tau), V_j(\tau)) \ge D_{min}, \quad \forall \tau \in [0, T_{com}],\ \forall i \ne j, \tag{5} $$
where $T_{com}$ is the unified mission timeline length. When a UAV reaches the target area first, it is treated as continuing to occupy the target point, so that $V_i(\tau)$ remains well defined over the entire time domain.
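One way to evaluate this constraint without tunneling is to compute the closest approach analytically on each time interval instead of only comparing sampled positions. The sketch below assumes, purely for illustration, that both trajectories are sampled at the same timestamps, that motion between samples is linear, and that both UAVs share the same `r_safe`; the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def min_center_distance(p_i0: Vec3, p_i1: Vec3, p_j0: Vec3, p_j1: Vec3) -> float:
    """Closest approach of two points moving linearly over the same time interval."""
    d0 = tuple(a - b for a, b in zip(p_i0, p_j0))   # relative position at start
    d1 = tuple(a - b for a, b in zip(p_i1, p_j1))   # relative position at end
    delta = tuple(b - a for a, b in zip(d0, d1))    # change in relative position
    denom = sum(c * c for c in delta)
    if denom == 0.0:
        s = 0.0  # constant relative position over the interval
    else:
        # Minimizer of |d0 + s * delta| clamped to the interval [0, 1]
        s = min(1.0, max(0.0, -sum(a * c for a, c in zip(d0, delta)) / denom))
    closest = tuple(a + s * c for a, c in zip(d0, delta))
    return math.sqrt(sum(c * c for c in closest))


def is_separated(traj_i: Sequence[Vec3], traj_j: Sequence[Vec3],
                 r_safe: float, d_min: float) -> bool:
    """Check dist(V_i, V_j) >= d_min on every interval (spheres of radius r_safe)."""
    for k in range(len(traj_i) - 1):
        center = min_center_distance(traj_i[k], traj_i[k + 1],
                                     traj_j[k], traj_j[k + 1])
        if center - 2.0 * r_safe < d_min:  # gap between the two safety spheres
            return False
    return True
```

For two UAVs crossing at right angles within a single interval, the endpoint positions are far apart, yet the closest approach is zero; an endpoint-only check would miss this conflict.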

2.2.2. Cooperative Arrival Time Window Constraint

With a velocity envelope, we derive the feasible travel-time interval of each geometric path and evaluate its compliance with individual arrival windows.
For any geometric path $\pi_i$ of UAV $U_i$, based on its elastic velocity envelope $\mathcal{V}_i = [v_{min,i}, v_{max,i}]$, the physical reachable time domain mapped by path length $L(\sigma_i)$ is:
$$ T_{phys}(L(\sigma_i)) = \left[ \frac{L(\sigma_i)}{v_{max,i}},\ \frac{L(\sigma_i)}{v_{min,i}} \right]. \tag{6} $$
The existence condition for a global cooperative time anchor $t_{co}$ is:
$$ \exists\, t_{co} \in \mathbb{R}^{+} \quad \text{s.t.} \quad t_{co} \in \bigcap_{i=1}^{N} T_{phys}(L(\sigma_i)). \tag{7} $$
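The existence condition amounts to intersecting closed intervals: an anchor exists exactly when the largest lower bound does not exceed the smallest upper bound. A minimal sketch, with hypothetical function names and paths given as `(length, v_min, v_max)` tuples:

```python
from typing import List, Optional, Tuple


def travel_time_interval(length: float, v_min: float, v_max: float) -> Tuple[float, float]:
    """Reachable travel times [L / v_max, L / v_min] for one path."""
    return (length / v_max, length / v_min)


def common_anchor_interval(
        paths: List[Tuple[float, float, float]]) -> Optional[Tuple[float, float]]:
    """Intersect all per-UAV travel-time intervals; None if the intersection is empty."""
    intervals = [travel_time_interval(L, v_min, v_max) for L, v_min, v_max in paths]
    lo = max(a for a, _ in intervals)  # latest of the earliest possible arrivals
    hi = min(b for _, b in intervals)  # earliest of the latest possible arrivals
    return (lo, hi) if lo <= hi else None
```

For example, paths of 100 m and 150 m flown with speeds in [5, 10] m/s give intervals [10, 20] s and [15, 30] s, so any anchor in [15, 20] s is feasible, whereas pairing the 100 m path with a 500 m path leaves no common anchor.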

2.3. Cooperative Path Planning Problem Definition

Based on the aforementioned kinematic models, velocity envelope constraints, and spatiotemporal coupling constraints, this paper formalizes the cooperative material delivery task for heterogeneous UAVs as a joint optimization problem.
Problem 1 (Physics-Aware Cooperative Planning).
Given a heterogeneous multi-UAV system $\mathcal{U}$, start and goal sets $S$, $G$, a fractal environment $O_{env}$ (where terrain obstacles $O_{terrain}$ are hard constraints and threat zones involve soft risk penalties via potential fields $R(p)$), and mission-specified individual arrival timing biases $\{\delta_i\}_{i=1}^{N}$, find an optimal set of trajectories $\Sigma^{*} = \{\sigma_1, \ldots, \sigma_N\}$ and a global cooperative time anchor $t_{co}$ such that the following weighted cost is minimized:
$$ \min_{\Sigma,\, t_{co}} J_{global} = \sum_{i=1}^{N} \big( \lambda_{eng}\, \hat{J}_{eng}(\sigma_i) + \lambda_{time}\, \hat{J}_{sync}(\sigma_i, t_{co}, \delta_i) + \gamma\, \hat{J}_{risk}(\sigma_i) \big). \tag{8} $$
The weights $\lambda_{eng}$ and $\lambda_{time}$ control the trade-off between energy efficiency and time synchronization in Equation (8) and are also used in the time-anchoring objective in Equation (12). We tune $\lambda_{eng}$ and $\lambda_{time}$ using a lexicographic criterion: (i) maximize the mission success rate, and (ii) among solutions achieving full success, minimize total energy and synchronization error. Following this rule, we perform structured parameter sweeps in the baseline Scenario A to identify a feasible region and choose practical defaults (details in Section 4.4). The risk weight $\gamma$ is fixed ($\gamma = 0.2$) as a soft safety margin, and $\epsilon$ is fixed ($\epsilon = 1 \times 10^{-6}$) as a numerical stabilizer. Here, the individual target arrival time is defined as $t_{target,i} = t_{co} + \delta_i$ and is used to measure synchronization error.
Cost Function Analysis:
  • Energy Efficiency ($\hat{J}_{eng}$): A normalized energy term based on Equation (2), guiding the planner to generate energy-saving paths consistent with heterogeneous energy efficiency characteristics;
  • Synchronization Accuracy ($\hat{J}_{sync}$): Quantifies the deviation relative to the target arrival time:
    $$ \hat{J}_{sync}(\sigma_i, t_{co}, \delta_i) = \frac{|t_{arr,i} - t_{target,i}|}{T_{norm}}, \quad t_{target,i} = t_{co} + \delta_i, \tag{9} $$
    where $t_{arr,i}$ is the final arrival time of the trajectory, and $T_{norm}$ is a normalization scale (can be taken as $\max(t_{target,i}, \epsilon)$ or uniformly $\max(t_{co}, \epsilon)$ to avoid division by zero). $\delta_i$ is given by mission rank and arrival interval (e.g., $\delta_i = (\mathrm{rank}(i) - 1)\, \Delta t_{gap}$). Under strict constraints, this term should converge within the allowable time window error range;
  • Threat Exposure ($\hat{J}_{risk}$): Conditional Value-at-Risk (CVaR) is employed to assess spatial safety. Let $R(p)$ be the threat potential field value at position $p$, and define the random variable $X_i$ as the risk distribution along trajectory $\sigma_i$ (e.g., constituted by time-domain sampling of $R(\sigma_i(t))$). Then, $\eta$-CVaR [29] is defined as the expected value of the worst $\eta$ fraction of high-risk segments:
    $$ \hat{J}_{risk}(\sigma_i) = \mathrm{CVaR}_{\eta}(X_i) = \inf_{a \in \mathbb{R}} \left\{ a + \frac{1}{\eta}\, \mathbb{E}\big[ (X_i - a)^{+} \big] \right\}, \tag{10} $$
    Here, $a$ is an auxiliary scalar (VaR-like threshold) introduced in the standard CVaR reformulation; the infimum over $a$ yields $\mathrm{CVaR}_{\eta}(X_i)$, where $(x)^{+} = \max(x, 0)$. This metric enhances robustness against non-deterministic disturbances by heavily penalizing the highest-risk segments of the path rather than the average risk [30].
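As an illustration of the threat-exposure term, the CVaR reformulation can be evaluated directly on sampled risk values. The sketch below assumes equally weighted samples of $R(\sigma_i(t))$ and treats $\eta$ as a tail fraction in $(0, 1]$; for such an empirical distribution the objective is piecewise linear and convex in $a$ with kinks at the samples, so it suffices to try each sample as the threshold. The function name `cvar` is hypothetical.

```python
from typing import Sequence


def cvar(samples: Sequence[float], eta: float) -> float:
    """Empirical CVaR via inf_a { a + E[(X - a)^+] / eta } over equally weighted samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 < eta <= 1.0:
        raise ValueError("eta must lie in (0, 1]")
    n = len(samples)

    def objective(a: float) -> float:
        # a + (1 / eta) * mean of the positive parts (x - a)^+
        return a + sum(max(x - a, 0.0) for x in samples) / (eta * n)

    # The minimum of a convex piecewise-linear function is attained at a kink,
    # and the kinks are exactly the sample values.
    return min(objective(a) for a in samples)
```

With risk samples 1, 2, 3, 4 and $\eta = 0.5$, the result is the mean of the two worst samples, 3.5, whereas $\eta = 1$ recovers the plain average of 2.5.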

3. The RG-HDP-VD Physics-Aware Cooperative Planning Framework

3.1. System Architecture and Problem Decomposition

This paper proposes a hierarchical cooperative planning framework, termed RG-HDP-VD, which constructs a layered computable closed-loop as illustrated in Figure 1. In this framework, the RG mechanism handles priority arbitration using consistent physical costs. HDP manages decentralized sequential coordination and spatiotemporal occupancy. Finally, VD decouples time-window feasibility into geometric length-feasible regions, which are then embedded into the underlying planner.
Physics-Aware and Topology Abstraction Layer (L0–L1): Acting as the perception frontend, this layer aims to establish a unified physical metric baseline. It performs nonlinear fusion of heterogeneous physical parameters (from L0) and environmental constraints, mapping differences in payload time-variance and aerodynamic efficiency into an anisotropic energy cost field. Specifically, rather than employing explicit fluid-dynamics equations, these aerodynamic effects are introduced through the empirical power-consumption models in Equation (2) via $P_{motion}(v, \dot{z}, m_i(t))$ and $P_{hover}(m_i(t))$. The corresponding model coefficients are included in the platform parameter tuple $P_i$ (e.g., the hovering power coefficient $\alpha_h$ in Table 1), enabling the cost field to capture macroscopic phenomena such as asymmetric vertical-motion efficiency and nonlinear hovering penalties under heavy load. L1 extracts baseline topological features for each UAV using a Mass-Augmented Energy A* algorithm, outputting a triplet $(L_i, E_i, k_i)$ containing geometric path length, total energy consumption, and energy consumption per unit distance. This process maps differences in motion capability across heterogeneous airframes in unstructured environments into comparable physical cost metrics. These metrics then provide an objective baseline for the subsequent game-theoretic layer.
Cooperative Strategic Decision Layer (L2): This layer performs strategic resource allocation based on physical topology information. Addressing resource contention deadlocks among heterogeneous UAVs, the Regret-Guided (RG) mechanism utilizes the ratio between “waiting” and “detouring” energy costs to dynamically generate conflict-free priority sequences, strategically establishing right-of-way advantages for heavy-lift, high-energy platforms. Simultaneously, the Global Time Anchoring block searches for an optimal cooperative tempo ($t_{co}$) within the intersection of physical feasibility regions. This step eliminates rhythm mismatches across different aircraft types in the time domain and outputs rigid spatiotemporal constraint directives to downstream layers. On this basis, HDP acts as the coordination kernel to maintain the global spatiotemporal occupancy map $H_{all}$: it records determined high-priority trajectory occupancies into $H_{all}$ and treats them as dynamic spatiotemporal obstacles for low-priority planning.
Elastic Execution and Validation Layer (L3–L4): This layer addresses feasibility issues under tight time windows, grounding discrete decisions into continuous trajectories. To counter the solution space contraction caused by fixed-speed assumptions in non-convex terrain, L3 introduces Velocity Decomposition (VD). Using the velocity envelope $[v_{min}, v_{max}]$, rigid time window constraints are decoupled and mapped into elastic path-length feasible regions. This implies that the planner (VD-TSRRT*) needs only to search for paths satisfying geometric length requirements under $H_{all}$ constraints to implicitly restore temporal feasibility. Finally, L4 performs temporal smoothing via B-spline or PCHIP, continuous capsule-swept volume detection, and cooperative error auditing. In this way, it generates 4D trajectories $\sigma_i(t)$ that satisfy physical feasibility constraints, remain collision-free, and complete the overall “Perception–Decision–Execution” closed loop.
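The mapping from a time window to a length-feasible region can be made concrete under a simplifying assumption that is not stated in the text: the UAV flies the whole path at one constant speed chosen from its envelope. A path of length $L$ can then arrive within $[t^{low}, t^{high}]$ exactly when $L / v_{max} \le t^{high}$ and $L / v_{min} \ge t^{low}$, that is, when $L \in [v_{min} t^{low},\ v_{max} t^{high}]$. The function names below are hypothetical.

```python
from typing import Optional, Tuple


def length_feasible_interval(t_low: float, t_high: float,
                             v_min: float, v_max: float) -> Tuple[float, float]:
    """Path lengths that can meet the window with some constant speed in the envelope."""
    if t_low > t_high or v_min <= 0.0 or v_min > v_max:
        raise ValueError("invalid window or velocity envelope")
    return (v_min * t_low, v_max * t_high)


def speed_for_path(length: float, t_low: float, t_high: float,
                   v_min: float, v_max: float) -> Optional[float]:
    """Return a constant speed that lands inside the window, or None if none exists."""
    l_lo, l_hi = length_feasible_interval(t_low, t_high, v_min, v_max)
    if not l_lo <= length <= l_hi:
        return None
    # Aim for the middle of the window, then clamp the speed into the envelope.
    t_target = 0.5 * (t_low + t_high)
    return min(v_max, max(v_min, length / t_target))
```

With a window of [10, 12] s and speeds in [5, 10] m/s, any path between 50 m and 120 m is admissible. A fixed-speed planner at 10 m/s would instead accept only lengths between 100 m and 120 m, which illustrates how the envelope enlarges the feasible set after a detour.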

3.2. Mass-Augmented Energy Topology

In the post-disaster fractal canyon environment $O_{env} = O_{terrain} \cup O_{threat}$, simple geometric distance cannot reflect the costs of heterogeneous aircraft types. We employ the Mass-Augmented Energy A* algorithm to extract a baseline topology for each UAV $U_i$ that satisfies terrain and threat constraints, outputting the triplet $(L_i, E_i, k_i)$ as physical tokens for the subsequent cooperative game.
Regarding the topology extraction mechanism, the algorithm constructs a payload-aware cost evaluation model in which instantaneous mass is explicitly embedded into the cost function. The energy for each displacement segment is dynamically weighted based on $m_{seg} = m_{empty,i} + m_{payload,i}$. To avoid the computational explosion associated with high-dimensional search while still ensuring physical fidelity, the algorithm does not expand payload as an explicit fourth dimension. Instead, it attaches payload as a node attribute that participates in the $g(n)$ update, so the skeletal search remains in 3D space.
The output of this layer characterizes only the motion work baseline in a static environment. Hovering waiting costs, which are directly related to dynamic coordination, are explicitly modeled in L2 via regret values and time anchoring. This design of “dynamic–static separation” effectively avoids strong coupling assumptions regarding cooperative strategies during the pre-planning phase.
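A compact sketch of an energy-weighted A* on a 3D grid is given below to illustrate how the triplet $(L_i, E_i, k_i)$ could be produced. It is an assumption-laden toy, not the paper's algorithm: the edge cost `mass * (c_level * distance + G * climb)`, the coefficient `c_level`, the 26-connected grid, and the fixed segment mass are all hypothetical choices, and threat zones are simply treated as blocked cells. The heuristic uses the level-flight cost over the straight-line distance, which never overestimates the remaining energy under this edge cost.

```python
import heapq
import math
from typing import Optional, Set, Tuple

Cell = Tuple[int, int, int]
G = 9.81  # m/s^2


def edge_energy(a: Cell, b: Cell, mass: float, c_level: float) -> float:
    """Hypothetical segment energy: level-flight term plus work against gravity."""
    climb = max(b[2] - a[2], 0)
    return mass * (c_level * math.dist(a, b) + G * climb)


def energy_astar(start: Cell, goal: Cell, blocked: Set[Cell], size: Cell,
                 mass: float, c_level: float = 2.0
                 ) -> Optional[Tuple[float, float, float]]:
    """Return (path length L, energy E, unit-distance energy k = E / L), or None."""
    moves = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
             for dz in (-1, 0, 1) if (dx, dy, dz) != (0, 0, 0)]

    def h(c: Cell) -> float:
        return mass * c_level * math.dist(c, goal)  # admissible lower bound

    g = {start: 0.0}        # best known energy to each cell
    length = {start: 0.0}   # geometric length along that best path
    heap = [(h(start), start)]
    closed: Set[Cell] = set()
    while heap:
        _, cur = heapq.heappop(heap)
        if cur in closed:
            continue  # stale heap entry
        if cur == goal:
            L, E = length[cur], g[cur]
            return L, E, (E / L if L > 0 else 0.0)
        closed.add(cur)
        for dx, dy, dz in moves:
            nxt = (cur[0] + dx, cur[1] + dy, cur[2] + dz)
            if any(not 0 <= nxt[i] < size[i] for i in range(3)) or nxt in blocked:
                continue
            cand = g[cur] + edge_energy(cur, nxt, mass, c_level)
            if cand < g.get(nxt, math.inf):
                g[nxt] = cand
                length[nxt] = length[cur] + math.dist(cur, nxt)
                heapq.heappush(heap, (cand + h(nxt), nxt))
    return None
```

Note that mass enters only through the cost function, so the search state stays three-dimensional, in line with the design described above.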

3.3. Regret-Guided Arbitration and Time Anchoring

Based on the physical topological features submitted by L1, this layer focuses on resource allocation across two strategic dimensions: establishing the global cooperative time tempo and adjudicating passage priority in conflict zones. By generating a set of physically feasible rigid time windows $[t_i^{low}, t_i^{high}]$ and a deadlock-free planning priority sequence $U_{sorted}$, the framework transforms the complex multi-agent game into a serialized single-agent spatiotemporal constrained planning problem.

3.3.1. Global Cooperative Time Anchoring

In heterogeneous multi-UAV systems, the natural rhythms of different aircraft types are difficult to synchronize. L2 executes global anchoring on the time axis to search for an optimal cooperative moment $t_{co}$ that satisfies mission requirements while aligning with collective physical energy efficiency.
The algorithm first calculates the baseline arrival time for each UAV based on cruising speed: $t_{min,i} = L_i / v_{cruise,i}$. Subsequently, it searches for the optimal solution within the interval $t \in [\max_i t_{min,i},\ \rho \max_i t_{min,i}]$ by constructing a weighted objective function $J(t)$ that incorporates time efficiency and energy cost:
$$ J(t) = \sum_{i=1}^{N} \big( \lambda_{time}\, \Delta_i(t) + \lambda_{eng}\, k_i^{norm}\, \Delta_i(t) \big), \tag{12} $$
where $\Delta_i(t) = (t - t_{min,i}) / \max(t_{min,i}, \epsilon)$ is the normalized time deviation and $\rho$ ($\rho \ge 1$) is a time relaxation factor defining the upper bound multiplier for the feasible arrival time search space. In this function, the unit distance energy consumption $k_i^{norm}$ serves as a key weighting factor, explicitly amplifying the time deviation cost for heavy-load models. This means that the optimization naturally biases the solution toward “high-energy” platforms. As a result, the selected $t_{co}$ better accommodates the energy-efficient operating regime of heavy-load aircraft.
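The anchoring search can be sketched as a one-dimensional grid search over the stated interval. The function names, the default `rho`, and the grid resolution `steps` are hypothetical. One observation worth stating: with the objective exactly as written and non-negative weights, every $\Delta_i(t)$ grows with $t$ on the search interval, so this sketch returns the lower bound $\max_i t_{min,i}$; the grid search is kept only so that further terms could be added.

```python
from typing import List, Sequence, Tuple


def anchor_objective(t: float, t_min: Sequence[float], k_norm: Sequence[float],
                     lam_time: float, lam_eng: float, eps: float = 1e-6) -> float:
    """J(t) = sum_i (lam_time + lam_eng * k_i_norm) * (t - t_min_i) / max(t_min_i, eps)."""
    return sum((lam_time + lam_eng * k) * (t - tm) / max(tm, eps)
               for tm, k in zip(t_min, k_norm))


def search_anchor(lengths: Sequence[float], v_cruise: Sequence[float],
                  k_norm: Sequence[float], lam_time: float, lam_eng: float,
                  rho: float = 1.5, steps: int = 200) -> Tuple[float, List[float]]:
    """Grid-search t_co on [max_i t_min_i, rho * max_i t_min_i]; returns (t_co, t_min)."""
    t_min = [L / v for L, v in zip(lengths, v_cruise)]  # cruise-speed arrival times
    lo = max(t_min)
    hi = rho * lo
    candidates = [lo + (hi - lo) * s / steps for s in range(steps + 1)]
    t_co = min(candidates,
               key=lambda t: anchor_objective(t, t_min, k_norm, lam_time, lam_eng))
    return t_co, t_min
```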
After determining the baseline time $t_{co}$, the system generates rigid time windows to be issued to the execution layer. To prevent top-level directives from exceeding the physical feasibility boundaries of the airframes, a strict physical limit truncation mechanism is introduced. First, the physical lower bound for time under maximum thrust is calculated for each aircraft type:
$$ t_{min,i}^{phys} = \frac{L_i}{v_{cruise,i}\, \rho_{max}}, \tag{13} $$
where $\rho_{max}$ (i.e., $v_{max\_ratio}$) is the maximum allowable overspeed ratio. Subsequently, a mandatory check is performed when generating the time window lower bound $t_i^{low}$:
$$ t_i^{low} = \max\{ t_{co} - \Delta T_{early},\ t_{min,i}^{phys} \}, \quad t_i^{high} = t_i^{low} + \delta. \tag{14} $$
Here, $\Delta T_{early} > 0$ (in s) is a user-specified early-arrival tolerance that limits how much earlier than the cooperative anchor time $t_{co}$ a vehicle may be scheduled to arrive. This mechanism ensures that, even if $t_{co}$ is set aggressively, the lower bound of the time window $[t_i^{low}, t_i^{high}]$ issued to each UAV never falls below its physical minimum travel time. It therefore prevents planning failures where high-level spatiotemporal constraints violate low-level dynamic limits.
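The truncation step is small enough to state directly in code. The sketch assumes the physical lower bound is the path length divided by cruise speed times the overspeed ratio, which is how the flattened equation is read here; `time_window` and the parameter name `width` (for the window width $\delta$) are hypothetical.

```python
from typing import Tuple


def time_window(t_co: float, length: float, v_cruise: float,
                rho_max: float, dt_early: float, width: float) -> Tuple[float, float]:
    """Rigid window whose lower bound is clipped to the physical minimum travel time."""
    t_phys = length / (v_cruise * rho_max)  # fastest physically possible arrival
    t_low = max(t_co - dt_early, t_phys)    # never schedule earlier than t_phys
    return t_low, t_low + width
```

For an anchor of 20 s with a 2 s early tolerance, a 100 m path at 10 m/s with a 1.25 overspeed ratio keeps the nominal lower bound of 18 s, while a 300 m path is pushed back to its physical minimum of 24 s.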

3.3.2. Regret-Guided Dynamic Right-of-Way Arbitration

Time anchoring solves “when to arrive”, while right-of-way arbitration solves “who goes first”. To achieve energy-aware, fair passage allocation in conflict scenarios, this layer proposes a Regret-Guided (RG) dynamic arbitration mechanism. This mechanism uses the unit distance energy consumption $k_i$ (unit: J/m) output by L1 as a strategic proxy metric. It compares the physical cost differences induced by two types of actions, hovering (waiting) and maneuvering (detouring), at the conflict point to calculate a regret value $R_i$, thereby generating a right-of-way priority sequence.
Specifically, for the $i$-th UAV, define the waiting energy cost $C_i^{wait}$ and the detouring energy cost $C_i^{detour}$ (both in J) as:
$$ C_i^{wait} = P_i^{hover}\, \Delta t_i^{wait} \approx ( k_i\, v_{cruise,i}\, \eta_{hover} )\, \Delta t_i^{wait}, \tag{15} $$
$$ C_i^{detour} = e_i^{unit}\, \Delta L_i^{detour} \approx k_i\, \eta_{reroute}\, \Delta L_i^{detour}. \tag{16} $$
Here, $v_{cruise,i}$ is the cruising speed (m/s). Since $k_i v_{cruise,i}$ has units of watts (W), it serves as a baseline proxy for power (energy consumption per unit time). It is mapped to the hovering power proxy $P_i^{hover}$ (W) via the coefficient $\eta_{hover}$. $\eta_{reroute}$ characterizes the additional energy gain caused by detouring, mapping $k_i$ to the detour unit distance energy proxy $e_i^{unit}$ (J/m). $\Delta t_i^{wait}$ is determined by the cooperative time difference and conflict severity, while $\Delta L_i^{detour}$ is mapped from the conflict zone’s geometric scale with a lower bound set to avoid degenerate comparisons caused by “zero detour.”
On this basis, the regret value is defined as the ratio of the costs of the two actions:
$$ R_i = \frac{C_i^{wait}}{C_i^{detour} + \epsilon}, \tag{17} $$
where $\epsilon$ is a small positive number to prevent numerical instability. A larger $R_i$ indicates that waiting carries a higher energy penalty relative to detouring, so forcing that platform to hover at the conflict is costly and it should be granted passage ahead of platforms with smaller $R_i$.
For heterogeneous platforms in bottleneck games, the hovering power of heavy-lift transports increases nonlinearly with mass ($P_{hover} \propto m^{1.5}$), whereas level-flight propulsion energy grows more moderately (approximately proportional to $m$). Consequently, their $C_i^{wait}$ is often significantly larger than $C_i^{detour}$, resulting in a large $R_i$. Conversely, light platforms have lower hovering and maneuvering costs, yielding a smaller $R_i$, making them more suitable as “yielders/regulators” in the system. The system allocates right-of-way in descending order of $R_i$, prioritizing the passage of high-energy-consumption platforms through bottleneck areas, thereby suppressing energy-inefficient congestion and reducing total system energy consumption in the sense of “physical fairness”.
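The arbitration rule can be sketched as a regret computation followed by a descending sort. Several details here are assumptions made for illustration: the `Agent` record and its field names are hypothetical, `eta_hover` is treated as a per-platform coefficient (with a value shared by all platforms, $k_i$ would cancel out of the ratio), and `dl_floor` stands in for the lower bound on the detour length mentioned in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Agent:
    name: str
    k: float          # unit-distance energy from L1, J/m
    v_cruise: float   # cruising speed, m/s
    eta_hover: float  # hover-power mapping coefficient (assumed per platform)
    dt_wait: float    # expected waiting time at the conflict, s
    dl_detour: float  # expected detour length, m


def regret(agent: Agent, eta_reroute: float,
           eps: float = 1e-6, dl_floor: float = 1.0) -> float:
    """R = C_wait / (C_detour + eps), using the proxy costs from the text."""
    c_wait = agent.k * agent.v_cruise * agent.eta_hover * agent.dt_wait
    c_detour = agent.k * eta_reroute * max(agent.dl_detour, dl_floor)
    return c_wait / (c_detour + eps)


def priority_order(agents: List[Agent], eta_reroute: float) -> List[Agent]:
    """Right-of-way sequence: highest regret (costliest to keep waiting) first."""
    return sorted(agents, key=lambda a: regret(a, eta_reroute), reverse=True)
```

For instance, a heavy platform with `k=40`, `v_cruise=8`, and `eta_hover=1.5` facing a 5 s wait or a 20 m detour has a regret of about 2.5 when `eta_reroute=1.2`, while a light platform with `k=10`, `v_cruise=12`, and `eta_hover=0.5` scores about 1.25, so the heavy platform passes first.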

3.3.3. HDP Decentralized Priority Coordination and Occupancy Closed-Loop

To ground the physics-aware priority sequence $\mathcal{U} = \langle U_1, \ldots, U_N \rangle$ obtained from L2 into executable collision-free trajectories, we employ HDP as the cooperative execution backbone. Unlike traditional HDP, which relies on heuristics to determine right-of-way, this framework uses the priority output by the RG mechanism as input, achieving one-way conflict resolution through an asynchronous “Plan–Broadcast–Update” closed loop.
The system executes serialized coordination according to $U_{sorted}$: high-priority UAVs generate trajectories $\sigma_i(t)$ first, and their results are treated as occupancy declarations for spatiotemporal resources. Low-priority UAVs, upon receiving these trajectories, convert them into dynamic constraints and solve within the remaining feasible region. The result is a priority-ordered coordination mode in which higher-priority UAVs claim spatiotemporal occupancy first, and lower-priority UAVs yield accordingly.
To ensure continuous inter-agent safety and avoid deadlocks, the system maintains a global spatiotemporal occupancy set $H_{all}$. When $u_i$ completes planning and broadcasts its occupancy envelope, any low-priority UAV $u_k$ ($k > i$) immediately updates its local constraints:
$$ O_{local}^{k} \leftarrow O_{static} \cup H_k, \quad H_k = \bigcup_{j < k} \mathrm{SpatioTemporal}(\sigma_j). \tag{18} $$
This update rule ensures that conflict constraints are propagated unidirectionally along the priority sequence: low-priority individuals must adapt to high-priority trajectories, mechanistically eliminating cooperative deadlocks caused by circular waiting.
Proposition 1 (deadlock-free w.r.t. circular waiting).
Let $U = \{U_1, \ldots, U_N\}$ be a total order produced by the RG module. Under the HDP protocol in which agent $u_k$ plans while treating $\{\sigma_j\}_{j<k}$ as fixed spatiotemporal obstacles (Equation (18)), the constraint-dependency graph is acyclic; hence, the circular-wait condition is absent and cooperative circular-wait deadlocks cannot occur.
Proof (sketch).
By Equation (18), any “yield/wait due to conflict” relation can only point from a lower-priority agent to a higher-priority one (from $k$ to some $j < k$). Therefore, all directed edges follow the strict order of $U_{sorted}$, so the dependency graph is a Directed Acyclic Graph (DAG) and contains no directed cycle. Circular waiting requires a directed cycle; hence, it is impossible. □
Remark (deadlock vs. planning failure).
Proposition 1 addresses cooperative deadlocks in the sense of circular waiting. This is strictly different from planning failure, where a low-priority agent finds no feasible solution under the remaining constraints (an incompleteness issue common to decoupled prioritized planning). Layer L3 (Velocity Decomposition) is explicitly designed to mitigate such feasibility collapse by enlarging the length-feasible set.
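The one-way “Plan–Broadcast–Update” loop can be sketched in Python on a toy problem. The sketch below assumes a single shared bottleneck whose occupancy is a time interval, standing in for the full spatiotemporal envelopes; `plan_with_constraints` is a hypothetical stand-in planner, not the planner used in the framework.

```python
def plan_with_constraints(request, occupied):
    """Toy planner: earliest start >= requested time that avoids all
    occupied (start, end) intervals of higher-priority agents."""
    start, duration = request
    for lo, hi in sorted(occupied):
        if start < hi and lo < start + duration:  # overlaps a fixed obstacle
            start = hi                            # yield until it clears
    return (start, start + duration)


def prioritized_coordination(requests_in_priority_order):
    """Agent k plans against H_k, the occupancies of agents j < k only,
    so constraints propagate one way along the priority order."""
    h_all = []   # global occupancy set (here: time intervals at the bottleneck)
    plans = []
    for request in requests_in_priority_order:
        h_k = list(h_all)                          # occupancies of agents j < k
        sigma_k = plan_with_constraints(request, h_k)
        plans.append(sigma_k)
        h_all.append(sigma_k)                      # broadcast occupancy declaration
    return plans


# (requested entry time, traversal duration), already sorted by priority
requests = [(0, 5), (2, 4), (3, 2)]
plans = prioritized_coordination(requests)
```

Because each agent only reads occupancies declared earlier in the sequence, no agent ever waits on a lower-priority one, which is the acyclic dependency structure used in Proposition 1. A lower-priority agent can still be pushed arbitrarily late, which corresponds to the planning-failure case discussed in the Remark.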

3.4. Elastic Execution Based on Velocity Decomposition

Layer L3 introduces the Velocity Decomposition (VD) technique, which converts time feasibility under rigid 4D spatiotemporal constraints into feasible path-length intervals in 3D space. This logic is embedded into the extension, pruning, and cost evaluation mechanisms of RRT*, constituting the VD-TSRRT* planner.

3.4.1. Principle of Velocity Decomposition

For the $i$-th UAV, given the cruising speed $v_{cruise,i}$ and the allowable speed adjustment ratio interval $[\rho_{min,i}, \rho_{max,i}]$, the physical velocity envelope available at the execution layer is defined as:
$$v_{min,i} = v_{cruise,i}\,\rho_{min,i}, \quad v_{max,i} = v_{cruise,i}\,\rho_{max,i}.$$
For any geometric path $\sigma_i$ in 3D space, let its length be $L(\pi_i)$. Under this velocity envelope, the physically reachable time interval $T_{phys}$ corresponding to this path is given by Equation (6).
Therefore, the necessary and sufficient condition for path $\sigma_i$ to satisfy the time window $T_{win} = [t_i^{low}, t_i^{high}]$ issued by L2 is that there exists a non-empty intersection between the two intervals, i.e., $T_{phys}(L(\pi_i)) \cap T_{win} \neq \emptyset$. This judgment condition can be equivalently rewritten as a feasible region constraint on the path length:
$$L(\pi_i) \in [v_{min,i}\, t_i^{low},\ v_{max,i}\, t_i^{high}] \triangleq L_{feas,i}.$$
Consequently, the strong spatiotemporal coupling constraints that originally required handling in the $(x, y, z, t)$ joint configuration space are transformed into interval constraints on the geometric length $L(\pi_i)$ in the 3D search space. The planner only needs to find a collision-free path in space satisfying $L(\pi_i) \in L_{feas,i}$ to provide a physically feasible geometric skeleton for the precise time alignment in the subsequent L4 layer, thereby trading velocity elasticity for geometric freedom.
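The equivalence between the time-window intersection test and the length-interval test can be checked with a short Python sketch. It assumes the physically reachable time interval is $[L/v_{max}, L/v_{min}]$; the function names and numeric values are hypothetical.

```python
def velocity_envelope(v_cruise, rho_min, rho_max):
    """Physical speed bounds (v_min, v_max) from the speed-adjustment ratios."""
    return v_cruise * rho_min, v_cruise * rho_max


def length_feasible_interval(v_min, v_max, t_low, t_high):
    """Length-feasible region [v_min * t_low, v_max * t_high]."""
    return v_min * t_low, v_max * t_high


def time_windows_intersect(length, v_min, v_max, t_low, t_high):
    """True if [L / v_max, L / v_min] overlaps [t_low, t_high]."""
    t_fast, t_slow = length / v_max, length / v_min
    return t_fast <= t_high and t_slow >= t_low


# Illustrative values: 10 m/s cruise, +/-20 % speed elasticity, window [50, 60] s
v_min, v_max = velocity_envelope(v_cruise=10.0, rho_min=0.8, rho_max=1.2)
l_lo, l_hi = length_feasible_interval(v_min, v_max, t_low=50.0, t_high=60.0)

# For each candidate length: (length test, time-interval test)
checks = {
    length: (l_lo <= length <= l_hi,
             time_windows_intersect(length, v_min, v_max, 50.0, 60.0))
    for length in (350.0, 500.0, 750.0)
}
```

For these values the length-feasible region is roughly [400 m, 720 m]; the 500 m path passes both tests, while the 350 m and 750 m paths fail both, so the planner can screen candidates using path length alone.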

3.4.2. Implementation of VD-TSRRT* Planning Algorithm

VD-TSRRT* does not search for precise trajectories directly in 4D spatiotemporal space; instead, it generates geometric paths $\sigma_i$ in 3D space that satisfy velocity decomposition feasibility, using a cost function to guide the search toward shorter (more energy-efficient) solutions. Its core improvements include three mechanisms:
1. Physical Pre-check: Utilizing the topological prior of the baseline path $\sigma_{base}$ from L1, if its length $L_{base}$ satisfies the intersection condition between the physically reachable time domain and the target window, sampling is skipped, and the path is output directly:
$$\left[\frac{L_{base}}{v_{max,i}},\ \frac{L_{base}}{v_{min,i}}\right] \cap [t_i^{low},\ t_i^{high}] \neq \emptyset.$$
2. Pruning & Adaptive Bias: A maximum physical length upper bound $L_{max}$ is introduced, including a time tolerance $\epsilon_t$ and a fixed margin $\Delta L$. For a tree node $n$, if its total length estimate $L_{est} > L_{max}$, it is judged as “inevitably late” and forcibly pruned:
$$L_{max} = v_{max,i}\,(t_i^{high} + \epsilon_t) + \Delta L.$$
Simultaneously, if the current optimal path is too short and leads to an “inevitably early” status (i.e., $L_{est}/v_{min,i} < t_i^{low} - \epsilon_t$), the algorithm automatically decreases the goal bias. This forces the random tree to grow more circuitously into the surrounding free space so that it can enter the feasible length interval.
3. Feasibility-Aware Cost and Asymptotic Optimality: A step-type cost function is constructed, treating the time feasibility intersection as a hard threshold:
$$Cost(n) = \begin{cases} L_{est}, & \text{if } \left[\dfrac{L_{est}}{v_{max,i}},\ \dfrac{L_{est}}{v_{min,i}}\right] \cap [t_i^{low} - \epsilon_t,\ t_i^{high} + \epsilon_t] \neq \emptyset \\ +\infty, & \text{otherwise} \end{cases}$$
On this basis, the algorithm explicitly executes Rewire Child and triggers Subtree Cost Propagation. This ensures asymptotic optimality within the feasible region under standard RRT* assumptions [31]. This optimality is conditional on a fixed priority order and the induced spatiotemporal constraints, and does not imply the global optimality of the coupled multi-UAV problem. The complete procedure of the proposed VD-TSRRT* planner is summarized in Algorithm 1.
Algorithm 1. VD-TSRRT*(σ_base, t_low, t_high, v_min, v_max)
L_max ← v_max · (t_high + ε_t) + ΔL
if Intersect([L(σ_base)/v_max, L(σ_base)/v_min], [t_low, t_high]) then
    return σ_base
end if
Initialize tree T with x_start; optionally insert a prefix of σ_base into T
best_node ← null; best_feas ← +∞; best_gap ← +∞
for iter = 1 … maxIter do
    x_new ← SampleAndExtend(T)
    if g(x_new) + h(x_new) > L_max then continue end if    (pruning; Equation (23))
    Rewire-Parent(T, x_new); Rewire-Child(T, x_new)
    if NearGoal(x_new) then
        L_curr ← PathLength(T, x_new)
        if Intersect([L_curr/v_max, L_curr/v_min], [t_low, t_high]) then    (feasible; Equation (19))
            if L_curr < best_feas then best_feas ← L_curr; best_node ← x_new end if
        else
            gap ← TimeGap(L_curr, t_low, t_high, v_min, v_max, ε_t)    (Equation (23))
            if best_feas = +∞ and gap < best_gap then best_gap ← gap; best_node ← x_new end if    (fallback candidate never overrides a feasible one)
        end if
    end if
    if best_node ≠ null and L_est(best_node)/v_min < t_low − ε_t then decrease goalBias end if
end for
return BacktrackPath(T, best_node)
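The three VD-specific ingredients of the planner (length-based pruning, the step-type cost, and the goal-bias adjustment) can be sketched as stand-alone Python helpers. This is a minimal sketch rather than the authors' implementation: the tree search itself is omitted, and the `decay` and `floor` parameters of the goal-bias rule are hypothetical, since the amount by which the bias is decreased is not specified.

```python
import math


def l_max(v_max, t_high, eps_t, delta_l):
    """Upper bound on path length: longer paths are inevitably late."""
    return v_max * (t_high + eps_t) + delta_l


def should_prune(g, h, bound):
    """g: length from the start to the node; h: estimate of remaining length."""
    return g + h > bound


def step_cost(l_est, v_min, v_max, t_low, t_high, eps_t):
    """Path length if the reachable time interval meets the tolerated
    window [t_low - eps_t, t_high + eps_t]; +inf otherwise."""
    t_fast, t_slow = l_est / v_max, l_est / v_min
    feasible = t_fast <= t_high + eps_t and t_slow >= t_low - eps_t
    return l_est if feasible else math.inf


def adapt_goal_bias(goal_bias, l_best, v_min, t_low, eps_t,
                    decay=0.5, floor=0.01):
    """Lower the goal bias when the best path is inevitably early.
    decay and floor are hypothetical tuning values."""
    if l_best / v_min < t_low - eps_t:
        return max(floor, goal_bias * decay)
    return goal_bias


# Illustrative values: speeds in m/s, times in s, lengths in m
bound = l_max(v_max=12.0, t_high=60.0, eps_t=1.0, delta_l=5.0)
cost_ok = step_cost(500.0, 8.0, 12.0, 50.0, 60.0, 1.0)
cost_early = step_cost(300.0, 8.0, 12.0, 50.0, 60.0, 1.0)
new_bias = adapt_goal_bias(0.2, 300.0, 8.0, 50.0, 1.0)
```

In this example a 300 m path is too short to arrive inside the window even at minimum speed, so its cost is infinite and the goal bias is reduced, which pushes the tree toward longer paths.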

3.4.3. Fallback Strategy and Sequential Avoidance

Under extremely tight time windows or strong terrain constraints, a path satisfying the intersection condition may temporarily not exist. To prevent individual planning failure from triggering a collapse of multi-UAV coordination, VD-TSRRT* introduces a robust fallback mechanism: when no perfectly feasible solution exists, it returns the path with the minimum “Time Gap” distance to the feasible interval. Introducing $\epsilon_t$, the gap is calculated as:
$$gap = \begin{cases} \dfrac{L}{v_{max,i}} - (t_i^{high} + \epsilon_t), & \text{if } \dfrac{L}{v_{max,i}} > t_i^{high} + \epsilon_t \ \ (\text{inevitably late}) \\ (t_i^{low} - \epsilon_t) - \dfrac{L}{v_{min,i}}, & \text{if } \dfrac{L}{v_{min,i}} < t_i^{low} - \epsilon_t \ \ (\text{inevitably early}) \end{cases}$$
The algorithm selects the candidate path with the minimum $gap$ for output. This design ensures that the system can consistently return an executable solution that minimizes physical-constraint violation, thereby preserving the maximum available slack for Layer L4 temporal fine-tuning or subsequent replanning.
Using the totally ordered priority $U_{sorted}$ output by L2, the system plans sequentially and maintains the global spatiotemporal occupancy map $H_{all}$. For any low-priority UAV, its planner maps the confirmed high-priority trajectory occupancies as dynamic spatiotemporal obstacles and superimposes them onto the constraint set. This achieves strict inter-agent safety separation under a unified continuous swept-volume collision detection logic. When an upstream trajectory updates, local replanning is triggered via broadcasting and occupancy updates, forming a decentralized closed-loop coordination.
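The fallback selection can be sketched in Python as follows. The sketch assumes a gap of zero for paths that already fall inside the tolerated window, a case the piecewise definition leaves implicit; the function names and candidate lengths are hypothetical.

```python
def time_gap(length, v_min, v_max, t_low, t_high, eps_t):
    """Time distance (s) from the reachable interval to the tolerated window."""
    t_fast = length / v_max   # earliest possible arrival
    t_slow = length / v_min   # latest possible arrival
    if t_fast > t_high + eps_t:            # inevitably late
        return t_fast - (t_high + eps_t)
    if t_slow < t_low - eps_t:             # inevitably early
        return (t_low - eps_t) - t_slow
    return 0.0                             # assumption: no violation


def fallback_path(candidate_lengths, v_min, v_max, t_low, t_high, eps_t):
    """Return the candidate length with the smallest time gap."""
    return min(candidate_lengths,
               key=lambda l: time_gap(l, v_min, v_max, t_low, t_high, eps_t))


# Illustrative candidates (m), none of which is feasible for window [50, 60] s
params = dict(v_min=8.0, v_max=12.0, t_low=50.0, t_high=60.0, eps_t=1.0)
candidates = [300.0, 380.0, 800.0]
gaps = {l: time_gap(l, **params) for l in candidates}
chosen = fallback_path(candidates, **params)
```

Here the 380 m candidate is only 1.5 s too early at minimum speed, so it is preferred over the 300 m path (11.5 s early) and the 800 m path (about 5.7 s late).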

3.5. Trajectory Realization and Continuous Verification

Layer L4 aims to transform the discrete path skeleton output by L3 into a 4D executable trajectory $\sigma_i(t)$ that satisfies physical feasibility constraints and continuous safety. First, a cascaded hybrid smoothing strategy is adopted: the Ramer–Douglas–Peucker (RDP) algorithm is used for geometric denoising of the original path [32], and B-Spline [33] or PCHIP interpolation [34] is adaptively selected based on obstacle distance to generate smooth curves. Subsequently, physical boundary projection is executed to map the cooperative time anchor $t_{co}$ issued by L2 into path timestamps, enforcing the velocity bounds within the physical envelope $[v_{min}, v_{max}]$ whenever path deformation causes boundary violations. Furthermore, to eliminate discrete detection blind spots (tunneling effects) during high-speed flight, capsule-swept volume detection is introduced [35]. By constructing the Minkowski Sum of the continuous geometry moving along the trajectory and environmental obstacles, strict collision verification is achieved across the entire time domain ($t \in [0, T]$) [35,36].
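The difference between discrete sampling and a capsule-swept check can be illustrated with a minimal Python sketch. It assumes a spherical UAV safety volume moving along one straight trajectory segment and a spherical obstacle, which is simpler than the terrain and cylindrical threat zones used in the experiments; all names and values are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b in 3D."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:                       # degenerate segment
        return math.dist(p, a)
    t = sum(ap[i] * ab[i] for i in range(3)) / denom
    t = max(0.0, min(1.0, t))              # clamp to the segment
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def capsule_hits_sphere(a, b, r_uav, center, r_obs):
    """Swept check: the capsule around a-b intersects the sphere if the
    centre-to-segment distance is at most the sum of the radii."""
    return point_segment_distance(center, a, b) <= r_uav + r_obs


def endpoints_hit_sphere(a, b, r_uav, center, r_obs):
    """Discrete check at the two sampled waypoints only."""
    return any(math.dist(q, center) <= r_uav + r_obs for q in (a, b))


# One fast segment that passes an obstacle between two waypoints
a, b = (0.0, 0.0, 0.0), (10.0, 0.0, 0.0)
obstacle, r_obs, r_uav = (5.0, 0.5, 0.0), 1.0, 0.5
swept = capsule_hits_sphere(a, b, r_uav, obstacle, r_obs)
sampled = endpoints_hit_sphere(a, b, r_uav, obstacle, r_obs)
```

The waypoint-only check reports no collision because both samples are far from the obstacle, while the swept check detects the conflict along the segment; this is the tunneling effect that continuous verification is meant to remove.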

4. Experimental Evaluation and Analysis

This section provides a comprehensive evaluation of the cooperative planning performance of the RG-HDP-VD framework for heterogeneous multi-UAVs in complex non-convex terrain through high-fidelity simulation experiments.

4.1. Experimental Setup

4.1.1. Simulation Environment Deployment and Heterogeneous Physical Models

All experiments were executed on a workstation equipped with an AMD Ryzen 9 9950X processor (16 cores, 4.30 GHz) and 64 GB RAM. The high-fidelity 3D simulation environment and the proposed RG-HDP-VD framework were implemented in MATLAB R2024a. To ensure a fair and reproducible evaluation, a paired Monte Carlo protocol was adopted: in each trial, all compared methods shared identical terrain seeds, start/goal configurations, and obstacle/threat-zone placements.
The experimental workspace was set as a 3D restricted airspace of 500 m × 300 m × 150 m. Non-convex canyon terrain was generated using the Diamond-Square algorithm. Environmental constraints were modeled as follows: (i) rigid terrain obstacles $O_{terrain}$, requiring the trajectory to satisfy $\sigma_i \cap O_{terrain} = \emptyset$; and (ii) soft cylindrical threat zones $O_{threat}$ with radii $r_{obs}$ (m). These threat zones were penalized via a CVaR-based risk term $J_{risk}$ mapped by a risk potential field $R(p)$.
We considered a fleet of $N = 8$ heterogeneous UAVs, comprising heavy-lift and light models. Heavy-lift UAVs followed the piecewise variable-mass model defined in Equation (1), while light UAVs utilized a constant-mass model. The physical parameters are summarized in Table 1, encompassing maximum speeds, hovering power coefficients, safety radii, and the elastic velocity envelopes.

4.1.2. Mission Scenario Design

Two high-pressure mission scenarios were designed to evaluate spatiotemporal coordination and energy-aware arbitration (see Figure 2 and Figure 3).
Scenario A (Saturation Convergence): As shown in Figure 2, UAVs departed from distributed start locations and were required to converge to a single target under sequential arrival constraints. Each UAV $i$ was assigned a target arrival time center $t_i^* = t_{base} + (i - 1) \cdot \mathrm{Gap}$. The actual arrival time was required to fall strictly within the corresponding window, with continuous inter-agent safety separation enforced throughout the flight.
Scenario B (Group Delivery): As illustrated in Figure 3, the UAVs were divided into two groups departing from diagonal start regions. They were tasked with traversing a central bottleneck obstructed by cylindrical threat zones to reach their respective targets. This layout explicitly induced strongly coupled decisions regarding detouring, waiting/hovering, and risk exposure.

4.1.3. Experimental Design and Evaluation Metrics

To comprehensively evaluate the proposed RG-HDP-VD framework, we conducted both baseline comparisons and ablation studies. For the baseline comparisons in Scenario A, the framework was evaluated against two representative methods: ECBS and ORCA. Their implementations were adapted from the established open-source codebases libMultiRobotPlanning and Python-RVO2, respectively, to incorporate unified map/task interfaces and dynamic constraint checking. In these comparative evaluations, the team size $N$ was varied as depicted in Figure 4, while other parameters remained consistent with the standard Scenario A settings. For fairness under strict arrival windows, ECBS and ORCA were evaluated with an additional unified speed-retiming layer that modulates execution speed within the same velocity envelope and enforces the same continuous-time safety and timing checks, while keeping their original coordination logic unchanged.
Furthermore, to isolate the individual contributions of the proposed mechanisms, customized ablation studies were conducted with a fixed team size of $N = 8$. Specifically, a Regret-Guided (RG) ablation was performed in Scenario B by comparing the RG-HDP variant against Baseline-Geo, a geometric priority strategy lacking heterogeneous energy awareness. Additionally, a Velocity Decomposition (VD) ablation was conducted in Scenario A by comparing the VD-Enabled variant against Baseline-FixedV. To ensure a fair comparison, Baseline-FixedV locked the cruising speed but permitted hovering or loitering in safe airspace to attempt to satisfy the strict arrival time windows.
Unless otherwise specified in the subsequent subsections, each experimental configuration was rigorously evaluated over 15–20 Monte Carlo trials using the paired protocol. Overall performance was assessed and reported across three primary dimensions: physical efficiency (encompassing total energy consumption and the heavy-lift energy reduction ratio), spatiotemporal robustness (characterized by the planning success rate and time synchronization error), and computational efficiency (measured by the average planning time required to generate executable trajectories).

4.2. Baseline Comparison

To assess scalability under stringent spatiotemporal coupling, we compare the proposed RG-HDP-VD framework with two representative baselines, ECBS (a coupled MAPF solver) and ORCA (a reactive collision-avoidance method), in Scenario A (Saturation Convergence). For each team size $N \in \{4, 8, 12, 16\}$, we conduct 20 paired Monte Carlo trials with identical seeds, start–goal configurations, and environment instances across methods. For fairness, all outputs are evaluated under the same continuous-time safety verification and timing constraints, and performance is reported in terms of mission success rate (SR), synchronization error $\Delta T_{sync}$, and total energy consumption $E_{total}$ (Table 2).
Figure 4 and Table 2 summarize the results. Figure 4 is divided into three subfigures, each illustrating a key performance metric:
Figure 4a reports the mission success rate (SR) as the team size increases. RG-HDP-VD remains highly robust, achieving 100% success up to $N = 12$ and 95% at $N = 16$. In contrast, ECBS and ORCA degrade rapidly with density, and both fail completely at $N = 16$ (0% SR), indicating that their coordination mechanisms cannot reliably satisfy simultaneous continuous-time separation and tight arrival-window constraints in saturated terminal airspace.
Figure 4b shows the synchronization error $\Delta T_{sync}$. Across all scales, RG-HDP-VD yields the smallest timing deviation (0.8–4.1 s). At $N = 16$, its error (≈4.1 s) is roughly 4.4 times lower than that of ECBS (≈18.2 s) and 8.5 times lower than that of ORCA (≈34.8 s). This advantage is consistent with the core design of RG-HDP-VD, which enforces time-window feasibility through velocity decomposition (i.e., mapping rigid temporal requirements into length-feasible regions) rather than relying on terminal waiting/loitering, which becomes infeasible or unsafe under congestion.
Figure 4c compares total system energy $E_{total}$. RG-HDP-VD consistently consumes less energy than both baselines for every $N$, and the gap widens as $N$ grows. The trend suggests that, under increasing interaction complexity, ECBS and ORCA incur higher energy due to stop-and-go behaviors, redundant avoidance maneuvers, and/or prolonged hovering, whereas RG-HDP-VD preserves smoother, rhythm-consistent trajectories that reduce both unnecessary detours and high-cost waiting.
In summary, these results show that the physics-aware integration of trajectory generation and temporal alignment in RG-HDP-VD yields robust scalability and coordination efficiency. The proposed framework maintains high mission success rates, precise timing, and low energy usage even under saturated traffic conditions, where conventional coupled (ECBS) or reactive (ORCA) methods break down.

4.3. Core Mechanism Ablation and Mechanism Analysis

4.3.1. RG Mechanism Ablation: Energy Efficiency Gains and Waiting Suppression Mechanism

This subsection focuses on evaluating the energy efficiency optimization performance of the Regret-Guided (RG) arbitration mechanism in the heterogeneous multi-UAV group delivery task (Scenario B). It addresses the issue where traditional geometric priority strategies often lead to system-level energy efficiency degradation by ignoring the nonlinear coupling between airframe mass and energy consumption, causing heavy-lift UAVs to be passively detained at bottlenecks.
In 15 paired Monte Carlo trials, RG-HDP demonstrated a stable advantage in system total energy consumption ($E_{total}$). As shown in Figure 5, compared to the geometric baseline strategy (Baseline-Geo), which had a mean of $1.14 \times 10^{8}$ J, RG-HDP reduced the system total energy consumption to $1.06 \times 10^{8}$ J, a reduction of 6.7%. The paired connecting lines in the box plot reveal strong consistency: in all test samples, RG-HDP corresponded to a lower energy consumption level. This uniform improvement indicates that the gain stems not from accidental advantages in specific terrains but from structural improvements in arbitration and resource allocation logic.
To reveal the physical mechanism behind the energy savings, Figure 6 decomposes the system energy consumption. The data show that the core benefit of RG-HDP originates from the effective suppression of high-cost hovering: the system total Hover Energy decreased from $7.69 \times 10^{7}$ J to $6.54 \times 10^{7}$ J, a reduction of 15.0%. Correspondingly, the Path Energy increased from $3.68 \times 10^{7}$ J to $4.07 \times 10^{7}$ J. This pattern is consistent with the decision logic of the RG mechanism: the planner actively guides some low-regret-value platforms to undertake detouring and yielding tasks (causing an increase in path energy) in exchange for the relief of bottleneck congestion and a larger reduction in expensive hovering energy. This strategy of “trading space for energy efficiency” ultimately realized a net gain in system energy consumption.
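As a cross-check of the reported means, and assuming that total energy is the sum of the hover and path components, the decomposition reproduces the overall reduction:

```latex
\begin{aligned}
E_{total}^{\text{Baseline-Geo}} &\approx 7.69\times10^{7} + 3.68\times10^{7} = 1.137\times10^{8}\ \mathrm{J},\\
E_{total}^{\text{RG-HDP}} &\approx 6.54\times10^{7} + 4.07\times10^{7} = 1.061\times10^{8}\ \mathrm{J},\\
\text{Hover reduction} &\approx \frac{7.69 - 6.54}{7.69} \approx 15.0\%,\qquad
\text{Path increase} \approx \frac{4.07 - 3.68}{3.68} \approx 10.6\%,\\
\text{Net reduction} &\approx \frac{1.137 - 1.061}{1.137} \approx 6.7\%.
\end{aligned}
```

The hover saving of about $1.15 \times 10^{7}$ J outweighs the added path energy of about $0.39 \times 10^{7}$ J, which accounts for the net gain.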
Figure 7 and Figure 8 further reveal how the RG mechanism achieves “physical fairness” by reshaping the right-of-way. From the perspective of energy consumption distribution (Figure 7), the energy-saving benefit presents a clear asymmetric distribution: the energy consumption of heavy-lift UAVs decreased significantly by 8.2% ($9.13 \times 10^{7}$ J → $8.37 \times 10^{7}$ J), while the energy change for light UAVs was negligible (0.5%) and statistically insignificant (n.s.). This difference directly corresponds to the reallocation of passage rights (Figure 8): the RG mechanism compressed the average waiting time of heavy-lift UAVs by 13.5% (675.1 s → 583.8 s), while the waiting time for light aircraft remained basically flat (1.0%, n.s.). This indicates that the system successfully identified the cost asymmetry in the heterogeneous game: granting priority passage to heavy-lift UAVs with high waiting costs (i.e., high regret values), while light platforms with lower maneuvering and waiting costs undertake the necessary system regulation tasks. Compared to geometric priority, this differentiated scheduling based on real physical costs achieves substantive fairness and improves system-level energy efficiency.
The trial-by-trial comparison (Figure 9) shows that, across all 15 paired Monte Carlo trials, RG-HDP was consistently lower than the baseline in system total energy, hovering energy, and heavy-platform energy. The two curves rose and fell together with changes in terrain difficulty, indicating that the paired design effectively controlled for confounding differences in environmental difficulty. Facing identical terrain challenges and random perturbations, RG-HDP maintained a stable energy-saving margin, indicating that its advantage is statistically consistent and robust.
In summary, the ablation experiment in Scenario B validated the necessity of the regret-guided arbitration mechanism. Compared with the geometric-priority baseline, RG-HDP achieved a 6.7% reduction in system total energy consumption. This improvement arises from a cost-aware right-of-way mechanism that trades a modest increase in path (detour) effort for a larger reduction in expensive hovering, effectively shifting regulation actions to lower-cost agents while prioritizing heavy platforms at bottlenecks. By cutting system hover energy by 15.0%, compressing the waiting time of heavy platforms by 13.5%, and reducing their energy consumption by 8.2%, it effectively alleviated bottleneck congestion. This physics-aware game mechanism provides a solution that balances fairness and efficiency for resource allocation of heterogeneous multi-UAVs in extreme environments.

4.3.2. VD Mechanism Ablation: Spatiotemporal Feasibility and Synchronization Accuracy Gains of Velocity Decomposition

This subsection aims to verify the core contribution of the Velocity Decomposition (VD) mechanism in high-density spatiotemporal conflict scenarios (Scenario A: Saturation Convergence). By comparing the performance of enabling VD (Ours) versus disabling VD (Baseline-FixedV), this experiment focuses on demonstrating the necessity of decoupling rigid time window constraints into elastic velocity envelopes (i.e., feasible path-length intervals) to avoid high-dimensional spatiotemporal deadlocks.
The experimental results show that RG-HDP-VD achieved a 100% planning success rate in all 20 rounds of simulation: all 160 UAV sorties were able to precisely hit the preset time windows while satisfying continuous obstacle avoidance and inter-agent safety separation. In contrast, the mission-level success rate of Baseline-FixedV was 0%, manifesting as systemic failure rather than random error. This failure is exacerbated by the non-convex canyon geometry, which severely restricts feasible detour and terminal loitering space; consequently, the feasible set for late/low-priority UAVs rapidly contracts and can collapse to near-empty under continuous safety constraints.
Further analysis of the “Distribution of Successful Sorties per Round” (Figure 10) reveals the degradation mode of the Baseline: its successful quantity was mainly concentrated in 1–2 aircraft. As priority decreased, subsequent UAVs, unable to adjust speed to maintain the necessary Arrival Gap, caused temporal conflicts to cascade, ultimately triggering mission-level cooperative collapse. This observation indicates that, under stringent spatiotemporal constraints, a fixed-speed planner without a speed-adjustment degree of freedom effectively reduces the problem to repeated single-agent feasibility checking and therefore cannot resolve the high-dimensional, tightly coupled constraints required for multi-UAV coordination.
To reveal the nature of the failure of Baseline-FixedV, Figure 11 displays the Cumulative Distribution Function (CDF) of arrival time errors. The error is defined as $\Delta t = t_{arr} - t^*$, where $t_{arr}$ is the actual arrival time output by the planner, and $t^*$ is the center of the target time window. The results show that the error of RG-HDP-VD converges near 0, indicating its stable time window locking capability; in contrast, the error of the Baseline shows a significant negative offset, mainly distributed in the $[-30, -10]$ s interval, and a large number of samples crossed the “Early Arrival Failure” boundary ($\Delta T_{early} = -10$ s). This indicates that although the Baseline was allowed to insert loitering to consume time, under terminal high-density and continuous safety constraints, feasible loitering space was extremely scarce, causing “safe feasible solutions” to often degenerate into early-arrival trajectories or be directly judged as unsolvable.
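The arrival-error statistics behind such a CDF plot can be computed with a few lines of Python. This sketch assumes the early-arrival boundary lies at $\Delta t = -10$ s, in line with the negative offset described above; the arrival times are made-up illustrative values, not experimental data.

```python
def arrival_errors(t_arr, t_star):
    """Signed error: negative means early, positive means late."""
    return [a - s for a, s in zip(t_arr, t_star)]


def empirical_cdf(errors):
    """Sorted (error, cumulative fraction) pairs."""
    xs = sorted(errors)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def early_failure_rate(errors, dt_early=-10.0):
    """Fraction of sorties arriving earlier than the early-arrival boundary."""
    return sum(e < dt_early for e in errors) / len(errors)


# Hypothetical fixed-speed sorties: later queue positions arrive too early
t_star = [100.0, 110.0, 120.0, 130.0]   # window centres (s)
t_arr = [99.5, 95.0, 100.0, 104.0]      # planned arrival times (s)
errors = arrival_errors(t_arr, t_star)
cdf = empirical_cdf(errors)
rate = early_failure_rate(errors)
```

In this toy sample three of the four sorties arrive more than 10 s early, so the early-failure rate is 0.75 and most of the CDF mass sits left of the boundary.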
Under the fixed-speed assumption, the nominal flight time is rigidly locked by the path length ($T_{flight} \approx L / v_{fixed}$). When the queue order requires a low-priority UAV to significantly postpone its arrival, the only time adjustment means for the Baseline is to insert loitering segments near the terminal to “compensate” for the time difference. However, in the saturation convergence scenario, this strategy triggers three types of chain failures:
  • Terminal Airspace Saturation: Free space near the target area is extremely limited, and terrain undulations restrict available loitering radii. Once the Baseline attempts to loiter at a bottleneck, the detained airframe quickly transforms into a “long-duration dynamic obstacle,” causing the continuous collision detection and inter-agent separation constraints in the terminal airspace to fail simultaneously;
  • Occupancy Cascade: The loitering strategy under fixed speed possesses extremely high spatial exclusivity. The loitering behavior of high-priority UAVs generates long-duration dynamic occlusion in the Spatiotemporal Occupancy Map ($H$-Map), compressing the feasible region for low-priority UAVs and forcing them to seek more distant loitering points, leading to a sharp increase in energy consumption and risk costs;
  • Cost Divergence: To avoid channels occupied by loitering aircraft, subsequent UAVs are forced to execute large-scale detours. This detouring induced by “passive loitering” causes surges in path length and Risk Cost, making the underlying sampling planner unable to converge to a solution satisfying cost constraints within limited iterations.
The arrival time scatter plot in Figure 12 provides further intuitive verification of this structural contradiction. RG-HDP-VD absorbs timing discrepancies during transit via speed modulation, thereby achieving consistent adherence to the scheduled arrival order. Conversely, the data divergence of Baseline-FixedV (orange dots) indicates that as the queue order progresses, the exhaustion of “loitering space” renders the planner unable to match the target time via geometric means (waiting/detouring), leaving it only able to output physically feasible “early arrival” paths.
Furthermore, regarding computational efficiency (Figure 13), although RG-HDP-VD introduces an extra speed search dimension, the average planning time of the Baseline (~460 s) was significantly higher than that of RG-HDP-VD (~250 s). The reason lies in the fact that the Baseline needs to repeatedly attempt to find feasible loitering locations/durations at the terminal under tight constraints. However, this feasible set converges approximately to an empty set in congested airspace, causing the underlying VD-TSRRT* algorithm to frequently trigger constraint relaxation and repeated iterations, reaching the maximum iteration count (MaxIter) in an attempt to find non-existent feasible solutions, thereby incurring a substantial computational overhead. In contrast, RG-HDP-VD, by mapping time windows to length feasible regions and incorporating feasibility screening during the spatial search phase, achieves rapid search convergence, thereby reducing total time consumption.
Combining the analysis of success rate, error distribution, and time consumption, it is evident that the value of Velocity Decomposition (VD) lies in transforming time adjustment from local loitering at the terminal into feasible region search and distributed rhythm control along the entire flight segment. This elevates the mission success rate from 0% to 100% and reduces the average planning time by approximately 45%.
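The planning-time figure follows directly from the approximate averages reported for the two variants (about 460 s for Baseline-FixedV and about 250 s for RG-HDP-VD):

```latex
\frac{T_{plan}^{\text{FixedV}} - T_{plan}^{\text{VD}}}{T_{plan}^{\text{FixedV}}}
\approx \frac{460 - 250}{460} \approx 0.457 \approx 45\%.
```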

4.4. Parameter Sensitivity Analysis

In evaluating parameter sensitivity, a lexicographic performance criterion was adopted: first, the mission success rate was maximized, and only among solutions with full success were total energy use and synchronization error subsequently minimized. The key coefficients $\lambda_{time}$ and $\lambda_{eng}$ (the time- and energy-weighting factors in the global objective in Equation (8), reused in the time-anchoring objective in Equation (12)) were chosen through structured sensitivity sweeps under our baseline Scenario A (Saturation Convergence). In contrast, the risk weight $\gamma$ and numerical stabilizer $\epsilon$ were held fixed ($\gamma = 0.2$, $\epsilon = 1 \times 10^{-6}$), since $\gamma$ merely enforces a soft safety margin and $\epsilon$ is a small computational conditioner with negligible effect on coordination feasibility. Finally, each experiment was repeated 10 times under identical initial conditions, and all reported metrics are averages across these trials.
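The lexicographic criterion can be expressed as a sort key, as in the following Python sketch. The tie-breaking order between energy and synchronization error (energy first) is an assumption, and the sweep results are made-up placeholders rather than values from Table 3.

```python
def lexicographic_key(result):
    """Success rate first (higher is better), then total energy,
    then synchronization error (both lower is better)."""
    return (-result["sr"], result["e_total"], result["dt_sync"])


def select_best(results):
    """Pick the parameter setting that is best under the lexicographic order."""
    return min(results, key=lexicographic_key)


# Hypothetical sweep over lambda_time (energies normalized, errors in s)
sweep = [
    {"lam_time": 0.2, "sr": 0.2, "e_total": 1.00, "dt_sync": 9.0},
    {"lam_time": 0.5, "sr": 1.0, "e_total": 1.05, "dt_sync": 2.0},
    {"lam_time": 2.0, "sr": 1.0, "e_total": 1.12, "dt_sync": 1.2},
]
best = select_best(sweep)
```

With these placeholder numbers the setting with the lowest energy is rejected because it does not reach full success, and among the fully successful settings the one with lower energy is chosen even though its synchronization error is slightly larger.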

4.4.1. Experiment A: Synchronization Pressure ($\lambda_{time}$)

Table 3 shows that the synchronization weight $\lambda_{time}$ has a clear success threshold. When $\lambda_{time}$ is below a critical value (for example, $\lambda_{time} < 0.5$ with $\lambda_{eng} = 0.2$), the success rate remains low because the planner prioritizes path economy over timing, causing many missed arrival windows. Once $\lambda_{time}$ reaches this threshold (around 0.5 for $\lambda_{eng} = 0.2$ and around 1.0 for $\lambda_{eng} = 0.5$), the success rate jumps abruptly to 100%. Increasing $\lambda_{time}$ further yields only marginal synchronization gains ($\Delta T_{sync}$ decreases) but incurs an energy penalty. In our data, forcing very high $\lambda_{time}$ raises the total energy $E_{total}$ by about 5–10%. Thus, in practice $\lambda_{time}$ should be set just above the value needed for full success, to balance guaranteed feasibility with energy efficiency.
For example, at $\lambda_{eng} = 0.2$ the success rate jumps from 20% at $\lambda_{time} = 0.2$ to 100% at $\lambda_{time} = 0.5$, after which $\Delta T_{sync}$ continues to shrink (improving synchronization) at only a modest energy cost. This confirms the threshold and diminishing-return behavior described above.

4.4.2. Experiment B: Energy-Fairness Anchor ($\lambda_{eng}$)

Varying the energy-fairness weight $\lambda_{eng}$ shifts waiting time between heavy-lift and light UAVs (Figure 14). At $\lambda_{eng} = 0$ the scheduler ignores vehicle weight, so heavy UAVs suffer long waits: on average 18.5 s versus 8.2 s for light UAVs ($T_{heavy}/T_{light} \approx 2.3$). As $\lambda_{eng}$ increases, heavy-UAV waiting decreases and light-UAV waiting increases. By $\lambda_{eng} \approx 1.0$ the waits invert: heavy 9.1 s, light 11.8 s (ratio $\approx 0.77$). At $\lambda_{eng} \approx 1.2$ heavy UAVs wait 8.2 s while light UAVs wait 15.5 s (ratio $\approx 0.53$). Beyond $\lambda_{eng} \approx 1.2$ the heavy-UAV waiting time plateaus (7.5 s at $\lambda_{eng} = 2.0$) while light UAVs absorb the remainder (22.1 s at $\lambda_{eng} = 2.0$), so the fairness metric $T_{heavy}/T_{light}$ steadily falls. This trend matches the intuition (and prior analysis) that higher $\lambda_{eng}$ increasingly favors heavy-lift vehicles, forcing lightweight UAVs to take on extra loitering.
Crucially, the total mission energy $E_{total}$ follows a convex profile. From $\lambda_{eng} = 0$ to 1.2, $E_{total}$ decreases (from 1.14 to 1.06, normalized) because prioritizing heavy UAVs avoids costly idling for those high-power platforms. Increasing $\lambda_{eng}$ beyond 1.2 then raises $E_{total}$ again (to 1.15 at $\lambda_{eng} = 2.0$) because light UAVs begin taking longer detours. In other words, an intermediate $\lambda_{eng}$ (about 1.2) balances the heavy/light waiting costs and minimizes total energy. This U-shaped energy response (optimal near $\lambda_{eng} \approx 1.2$) was also observed in our earlier analysis. In summary, $\lambda_{eng} \approx 1.2$ provides a practical trade-off: it balances heavy/light waiting costs and yields the lowest mission energy, without over-penalizing light UAVs.
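The fairness ratios and the location of the energy minimum quoted above can be checked directly from the reported numbers. The snippet below is only an arithmetic check in Python; the dictionary names are hypothetical, and the keys 1.0 and 1.2 stand for the approximate $\lambda_{eng}$ values given in the text.

```python
# Mean waiting times reported in the text: lambda_eng -> (heavy_s, light_s).
waits = {
    0.0: (18.5, 8.2),
    1.0: (9.1, 11.8),
    1.2: (8.2, 15.5),
    2.0: (7.5, 22.1),
}

# Fairness metric T_heavy / T_light for each setting.
ratios = {lam: heavy / light for lam, (heavy, light) in waits.items()}

# Normalized total mission energy at the three settings quoted in the text.
energy = {0.0: 1.14, 1.2: 1.06, 2.0: 1.15}
lam_best = min(energy, key=energy.get)

for lam, r in ratios.items():
    print(f"lambda_eng={lam}: T_heavy/T_light={r:.2f}")
print("lowest-energy setting:", lam_best)
```

The ratio falls monotonically across the four settings, while the energy is lowest at the intermediate one, which is the U-shaped behavior described in the paragraph above.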

4.4.3. Discussion of Secondary Parameters

The remaining tuning parameters (risk weight $\gamma$ and numerical stabilizer $\epsilon$) are held constant because they have no significant effect on core outcomes. In all experiments we fixed $\gamma = 0.2$ and $\epsilon = 1 \times 10^{-6}$ (as in our algorithm setup). The risk weight $\gamma$ simply enforces a soft safety margin on trajectories; adjusting $\gamma$ shifts how aggressively UAVs avoid risk but does not fundamentally change mission feasibility or coordination logic. Likewise, the constant $\epsilon$ is used for numerical conditioning (e.g., to avoid division by zero) and has negligible impact on the high-level scheduling results. By keeping $\gamma$ and $\epsilon$ fixed, we focus our analysis on the primary trade-offs governed by $\lambda_{time}$ and $\lambda_{eng}$ without loss of generality.

5. Real-World Flight Demonstration

To validate the practical feasibility of the proposed RG-HDP-VD framework, a real-world flight experiment was conducted in a 6 m × 6 m indoor arena featuring a constricted bottleneck passage created by cylindrical obstacles. The mission required a heterogeneous team of three Alpha-type quadrotors manufactured by Differential Intelligence Fly (Hangzhou) Technology Co., Ltd., Hangzhou, China (base mass ≤ 1.9 kg, 400 × 370 × 178 mm), to navigate the gap and reach a common target beyond the obstacles. Each vehicle was equipped with an NVIDIA Jetson Orin NX (16 GB) for high-level computation and an STM32H743 flight controller running PX4 for real-time stabilization. To introduce physical heterogeneity, UAV 1 was augmented with a 0.5 kg payload, increasing its weight by approximately 26% and significantly elevating its hover energy penalty, thereby creating the specific asymmetric cost condition that the RG module is designed to address.
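As a rough worked estimate of the asymmetry this payload creates, assume the hover power scales as $P_{hover}(m) = \alpha_h m^{1.5}$, which is the form suggested by the unit of the hovering power coefficient in Table 1 (W/kg$^{1.5}$), and take the base mass at its stated upper bound of 1.9 kg. Both are assumptions made for illustration; the measured power draw of the test vehicles is not reported here.

$$
\frac{m_1}{m_0} = \frac{1.9 + 0.5}{1.9} \approx 1.26,
\qquad
\frac{P_{hover}(m_1)}{P_{hover}(m_0)} = \left(\frac{m_1}{m_0}\right)^{1.5} \approx 1.26^{1.5} \approx 1.42 .
$$

Under these assumptions, a 26% mass increase corresponds to roughly 42% more hover power, so each second UAV 1 spends waiting costs noticeably more energy than the same second for UAV 2 or UAV 3.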
The cooperative 4D trajectories were computed offline using the full RG-HDP-VD pipeline and subsequently uploaded to the onboard systems for real-time tracking. During the planning phase, the RG module automatically prioritized UAV 1 because of its higher loitering penalty, assigning it the first right-of-way through the bottleneck. Simultaneously, the VD module provided elastic timing envelopes that allowed the follower drones (UAV 2 and 3) to modulate their velocities en route rather than resorting to inefficient loitering or terminal hovering. This integrated approach successfully serialized the passage order without requiring any vehicle to come to a full stop.
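The ordering outcome described above can be mimicked with a deliberately simplified Python sketch. It is not the RG module: the full regret computation is defined earlier in the paper and also involves detour costs. Here the right-of-way is ranked only by an assumed waiting cost $P_{hover}(m) \cdot t_{wait}$ with $P_{hover}(m) = \alpha_h m^{1.5}$. The function names, the shared coefficient `alpha_h=80.0` (borrowed from the light-UAV row of Table 1 as a placeholder), and the 1 s waiting time are all hypothetical.

```python
def hover_power(mass_kg: float, alpha_h: float) -> float:
    """Assumed hover power model P = alpha_h * m^1.5 (unit as in Table 1)."""
    return alpha_h * mass_kg ** 1.5


def right_of_way_order(masses_kg: dict, alpha_h: float = 80.0,
                       wait_s: float = 1.0) -> list:
    """Rank UAVs by the energy they would burn hovering for `wait_s` seconds.

    The UAV with the largest waiting cost goes first. Ties keep their input
    order because Python's sort is stable.
    """
    wait_cost = {uav: hover_power(m, alpha_h) * wait_s
                 for uav, m in masses_kg.items()}
    return sorted(wait_cost, key=wait_cost.get, reverse=True)


# Flight-test masses: UAV 1 carries the 0.5 kg payload on a 1.9 kg base.
masses = {"UAV1": 1.9 + 0.5, "UAV2": 1.9, "UAV3": 1.9}
order = right_of_way_order(masses)
print(order)
```

Because all three vehicles share the same placeholder coefficient, the ranking depends only on mass, so the payload-carrying UAV 1 is placed first regardless of the value chosen for `alpha_h`.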
Experimental results demonstrate high consistency between the planned trajectories and real-world execution. As shown in the planned paths (Figure 15) and the time-lapse sequence of the actual flight (Figure 16), UAV 1 navigated the central bottleneck first, while the follower drones adjusted their flight rhythm within the VD-prescribed velocity bounds to queue smoothly behind the leader. No stop-and-go behavior or hovering deadlocks were observed; instead, the fleet maintained safe separation via continuous velocity modulation. All three vehicles cleared the constricted area without collision and converged at the goal area as scheduled. This successful demonstration validates that the RG-HDP-VD framework effectively translates theoretical physics-aware coordination into robust hardware execution, bridging the gap between high-fidelity simulation and practical multi-UAV operations in complex environments.

6. Conclusions

This paper proposed and validated a physics-aware cooperative planning framework (RG-HDP-VD) to address energy imbalance and rigid spatiotemporal constraints in heterogeneous multi-UAV missions. The framework integrates a Regret-Guided (RG) arbitration mechanism, which allocates right-of-way based on physical cost, with a Velocity Decomposition (VD) approach, which creates elastic time windows for tight-deadline tasks. Simulation results show that RG-HDP-VD mitigates energy-inefficient congestion, reducing total system energy consumption by 6.7% (including a 15% reduction in heavy-lift UAV hovering energy), and that the velocity-decomposition approach raises the success rate of tight time-window tasks from 0% to 100% while improving computational efficiency by avoiding infeasible searches. These results demonstrate that the physics-grounded strategy significantly enhances the robustness and efficiency of heterogeneous multi-UAV coordination under complex constraints. Future work will extend the framework to distributed communication settings and dynamic environments, and will pursue real-world closed-loop validation and system-level energy-efficiency improvements.

Author Contributions

Conceptualization, D.H.; methodology, D.H. and Z.H.; software, Z.H.; validation, X.Z. and L.L.; formal analysis, Z.H. and L.L.; investigation, X.Z. and H.J.; resources, H.J. and L.W.; data curation, Z.H.; writing—original draft preparation, Z.H.; writing—review and editing, D.H.; visualization, L.L. and Z.H.; supervision, L.W. and H.J.; project administration, L.W.; funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Civil Aviation Flight University of China, grant numbers 25CAFUC03022 and 25CAFUC03085, and by the National Natural Science Foundation of China (General Program), grant number 6247071842.

Data Availability Statement

A partial implementation of the proposed RG-HDP-VD framework is publicly available on IEEE Dataport (https://doi.org/10.21227/qsdc-1291).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Definition |
|---|---|
| UAV | Unmanned Aerial Vehicle |
| HMUAS | Heterogeneous Multi-Unmanned Aerial Systems |
| MUCPP | Multi-UAV Cooperative Path Planning |
| SAT | Simultaneous Arrival Task |
| RG-HDP-VD | Regret-Guided Heuristic Decentralized Prioritized Planning with Velocity Decomposition |
| RG | Regret-Guided (right-of-way arbitration) |
| HDP | Heuristic Decentralized Prioritized Planning |
| VD | Velocity Decomposition |
| A* | A-star Search |
| RRT* | Rapidly-exploring Random Tree Star |
| VD-TSRRT* | Velocity-Decomposition Time–Space RRT* |
| CDF | Cumulative Distribution Function |
| CVaR | Conditional Value-at-Risk |
| APF | Artificial Potential Field |
| ORCA | Optimal Reciprocal Collision Avoidance |
| CBS | Conflict-Based Search |
| EECBS | Explicit Estimation Conflict-Based Search |
| RDP | Ramer–Douglas–Peucker (path simplification) |
| PCHIP | Piecewise Cubic Hermite Interpolating Polynomial |

Appendix A

Table A1 lists the essential symbols used in the problem formulation and the proposed RG-HDP-VD framework. Symbols are grouped by system/environment, trajectory-time-safety constraints, and objective/cost terms.
Table A1. Essential notation used in the formulation and RG-HDP-VD framework.

| Symbol | Meaning |
|---|---|
| $U = \{U_i\}_{i=1}^{N}$ | Set of UAVs (heterogeneous multi-UAV system); $N$ is the number of UAVs |
| $s_i$, $G_i$ | Start state and goal region of UAV $i$ |
| $O_{env}$ | Environment constraints set |
| $O_{terrain}$ | Terrain obstacles (hard constraints) |
| $O_{threat}$ | Threat regions (risk exposure) |
| $\sigma_i$ | Trajectory/path of UAV $i$ (recommended: use $\pi_i$ for geometric path and $\sigma_i(t)$ for timed trajectory) |
| $p_i(t)$ | Position of UAV $i$ at time $t$ |
| $v_{min,i}$, $v_{max,i}$ | Min/max speed bounds of UAV $i$ |
| $T_{com}$ | Common mission timeline length |
| $t_{co}$ | Global cooperative time anchor |
| $\delta_i$ | Arrival time bias/offset for UAV $i$ |
| $t_{target,i} = t_{co} + \delta_i$ | Target arrival time of UAV $i$ |
| $[t_{low,i}, t_{high,i}]$ | Assigned arrival time window for UAV $i$ |
| $r_{safe}$ | Safety radius (inflation for body size and uncertainty) |
| $V_i(t) = p_i(t) \oplus B(0, r_{safe})$ | Occupancy set of UAV $i$ at time $t$ (Minkowski sum) |
| $D_{min}$ | Minimum safe separation threshold |
| $H_{all}$ | Global spatiotemporal occupancy/constraint map used in HDP updates |
| $m_i(t)$ | Time-varying mass of UAV $i$ (payload drop causes piecewise change) |
| $t_{drop,i}$ | Payload drop time of UAV $i$ |
| $J_{eng}(\sigma_i)$ | Energy cost along $\sigma_i$ (motion + hover) |
| $P_{motion}(\cdot)$, $P_{hover}(m_i(t))$ | Motion power and hover power terms |
| $J_{global}$ | Global weighted objective for multi-UAV planning |
| $\hat{J}_{eng}$, $\hat{J}_{sync}$, $\hat{J}_{risk}$ | Normalized energy/synchronization/risk components |
| $\lambda_{eng}$, $\lambda_{time}$, $\gamma$ | Weights for the three objective components |
| $R(p)$ | Threat-field risk value at position $p$ |
| $CVaR$ | Tail-risk metric penalizing high-risk exposure segments |
| $R_i$ | Regret value for UAV $i$ (RG-based prioritization) |
| $C_i^{wait}$, $C_i^{detour}$ | Energy costs of "wait/hover" vs. "detour" actions in RG arbitration |
| $L(\sigma_i)$ | Geometric path length used in VD feasibility check |
| $L_{feas,i}$ | Feasible length interval induced by $[t_{low,i}, t_{high,i}]$ and $[v_{min,i}, v_{max,i}]$ |
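The last two entries of Table A1 describe the VD feasibility check: a time window and a speed envelope together induce an interval of admissible path lengths. A minimal Python sketch of that mapping follows. It assumes departure at $t = 0$, a speed that can be held anywhere within the envelope, and no hovering, so that $L_{feas,i} = [v_{min,i}\, t_{low,i},\; v_{max,i}\, t_{high,i}]$. The function names and the 100–110 s window are hypothetical; the 7–15 m/s envelope is the light-UAV entry of Table 1.

```python
def feasible_length_interval(t_low, t_high, v_min, v_max):
    """Path lengths that can be flown so that arrival falls in [t_low, t_high]."""
    if not (0 < t_low <= t_high and 0 < v_min <= v_max):
        raise ValueError("expected 0 < t_low <= t_high and 0 < v_min <= v_max")
    # Shortest admissible path: slowest speed, earliest arrival.
    # Longest admissible path: fastest speed, latest arrival.
    return v_min * t_low, v_max * t_high


def is_length_feasible(path_len, t_low, t_high, v_min, v_max):
    lo, hi = feasible_length_interval(t_low, t_high, v_min, v_max)
    return lo <= path_len <= hi


def constant_speed_range(path_len, t_low, t_high, v_min, v_max):
    """Constant speeds that meet the window on a path of length `path_len`."""
    lo = max(v_min, path_len / t_high)   # slower than this arrives too late
    hi = min(v_max, path_len / t_low)    # faster than this arrives too early
    return (lo, hi) if lo <= hi else None


window = (100.0, 110.0)      # hypothetical arrival window in seconds
envelope = (7.0, 15.0)       # light-UAV velocity envelope from Table 1, m/s

print(feasible_length_interval(*window, *envelope))
print(is_length_feasible(1000.0, *window, *envelope))
print(is_length_feasible(600.0, *window, *envelope))
print(constant_speed_range(1000.0, *window, *envelope))
```

In this example a 600 m path is rejected because even at the minimum speed the vehicle would arrive before the window opens, which is the early-arrival failure mode that a length-based check can filter out before any trajectory search is attempted.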

References

  1. Chung, S.-J.; Paranjape, A.A.; Dames, P.; Shen, S.; Kumar, V. A Survey on Aerial Swarm Robotics. IEEE Trans. Robot. 2018, 34, 837–855. [Google Scholar] [CrossRef]
  2. Mellinger, D.; Kumar, V. Minimum Snap Trajectory Generation and Control for Quadrotors. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, 9–13 May 2011; pp. 2520–2525. [Google Scholar] [CrossRef]
  3. Richter, C.; Bry, A.; Roy, N. Polynomial Trajectory Planning for Aggressive Quadrotor Flight in Dense Indoor Environments. In Robotics Research; Springer: Cham, Switzerland, 2016; pp. 649–666. [Google Scholar] [CrossRef]
  4. Kumar, P.; Pal, K.; Govil, M.C. Comprehensive Review of Path Planning Techniques for Unmanned Aerial Vehicles (UAVs). ACM Comput. Surv. 2025, 58, 1–44. [Google Scholar] [CrossRef]
  5. Rahman, M.; Sarkar, N.I.; Lutui, R. A Survey on Multi-UAV Path Planning: Classification, Algorithms, Open Research Problems, and Future Directions. Drones 2025, 9, 263. [Google Scholar] [CrossRef]
  6. Wang, L.; Huang, W.; Li, H.; Li, W.; Chen, J.; Wu, W. A Review of Collaborative Trajectory Planning for Multiple Unmanned Aerial Vehicles. Processes 2024, 12, 1272. [Google Scholar] [CrossRef]
  7. Babel, L. Coordinated Target Assignment and UAV Path Planning with Timing Constraints. J. Intell. Robot. Syst. 2019, 94, 857–869. [Google Scholar] [CrossRef]
  8. Yan, F.; Zhu, X.; Zhou, Z.; Chu, J. A Hierarchical Mission Planning Method for Simultaneous Arrival of Multi-UAV Coalition. Appl. Sci. 2019, 9, 1986. [Google Scholar] [CrossRef]
  9. Andreychuk, A.; Yakovlev, K.; Surynek, P.; Atzmon, D.; Stern, R. Multi-Agent Pathfinding with Continuous Time. Artif. Intell. 2022, 305, 103662. [Google Scholar] [CrossRef]
  10. Phillips, M.; Likhachev, M. SIPP: Safe Interval Path Planning for Dynamic Environments. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, 9–13 May 2011; pp. 5628–5635. [Google Scholar] [CrossRef]
  11. Ait Saadi, A.; Bhuyan, B.P.; Ramdane-Cherif, A. Power Consumption Model for Unmanned Aerial Vehicles Using Recurrent Neural Network Techniques. Aerosp. Sci. Technol. 2025, 157, 109819. [Google Scholar] [CrossRef]
  12. van den Berg, J.; Guy, S.J.; Lin, M.-C.; Manocha, D. Reciprocal n-Body Collision Avoidance. In Robotics Research; Springer: Berlin, Germany, 2011; pp. 3–19. [Google Scholar] [CrossRef]
  13. Khatib, O. Real-Time Obstacle Avoidance for Manipulators and Mobile Robots. In Proceedings of the 1985 IEEE International Conference on Robotics and Automation (ICRA), St. Louis, MO, USA, 25–28 March 1985; pp. 500–505. [Google Scholar] [CrossRef]
  14. Sharon, G.; Stern, R.; Felner, A.; Sturtevant, N.R. Conflict-Based Search for Optimal Multi-Agent Path Finding. Artif. Intell. 2015, 219, 40–66. [Google Scholar] [CrossRef]
  15. Li, J.; Ruml, W.; Koenig, S. EECBS: A Bounded-Suboptimal Search for Multi-Agent Path Finding. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Virtual Event, 2–9 February 2021; pp. 12353–12362. [Google Scholar] [CrossRef]
  16. Liu, X.; Su, Y.; Wu, Y.; Guo, Y. Multi-Conflict-Based Optimal Algorithm for Multi-UAV Cooperative Path Planning. Drones 2023, 7, 217. [Google Scholar] [CrossRef]
  17. Semiz, F.; Polat, F. Incremental Multi-Agent Path Finding. Future Gener. Comput. Syst. 2021, 116, 220–233. [Google Scholar] [CrossRef]
  18. Silver, D. Cooperative Pathfinding. In Proceedings of the First Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE), Marina del Rey, CA, USA, 1–5 June 2005; pp. 117–122. [Google Scholar] [CrossRef]
  19. Velagapudi, P.; Sycara, K.P.; Scerri, P. Decentralized Prioritized Planning in Large Multirobot Teams. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, 18–22 October 2010; pp. 4603–4609. [Google Scholar] [CrossRef]
  20. Guo, Y.; Liu, X.; Jiang, W.; Zhang, W. Collision-Free 4D Dynamic Path Planning for Multiple UAVs Based on Dynamic Priority RRT* and Artificial Potential Field. Drones 2023, 7, 180. [Google Scholar] [CrossRef]
  21. Ma, H.; Harabor, D.; Stuckey, P.J.; Li, J.; Koenig, S. Searching with Consistent Prioritization for Multi-Agent Path Finding. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Honolulu, HI, USA, 27 January–1 February 2019; pp. 7643–7650. [Google Scholar] [CrossRef]
  22. Chagas, F.S.; Ruseno, N.; Bechina, A.A.A. Artificial Intelligence Approaches for UAV Deconfliction: A Comparative Review and Framework Proposal. Automation 2025, 6, 54. [Google Scholar] [CrossRef]
  23. Zhang, M.; Yan, C.; Dai, W.; Xiang, X.; Low, K.H. Tactical Conflict Resolution in Urban Airspace for Unmanned Aerial Vehicles Operations Using Attention-Based Deep Reinforcement Learning. Green Energy Intell. Transp. 2023, 2, 100107. [Google Scholar] [CrossRef]
  24. Kong, X.; Zhou, Y.; Li, Z.; Wang, S. Multi-UAV Simultaneous Target Assignment and Path Planning Based on Deep Reinforcement Learning in Dynamic Multiple Obstacles Environments. Front. Neurorobotics 2024, 17, 1302898. [Google Scholar] [CrossRef]
  25. Yan, H.; Chen, Y.; Yang, S.-H. New Energy Consumption Model for Rotary-Wing UAV Propulsion. IEEE Wirel. Commun. Lett. 2021, 10, 2009–2012. [Google Scholar] [CrossRef]
  26. Burzyński, W.; Stecz, W. Trajectory Planning with Multiplatform Spacetime RRT*. Appl. Intell. 2024, 54, 9524–9541. [Google Scholar] [CrossRef]
  27. Guo, Y.; Liu, X.; Jiang, W.; Zhang, W. HDP-TSRRT*: A Time–Space Cooperative Path Planning Algorithm for Multiple UAVs. Drones 2023, 7, 170. [Google Scholar] [CrossRef]
  28. Rockafellar, R.T.; Uryasev, S. Optimization of Conditional Value-at-Risk. J. Risk 2000, 2, 21–42. [Google Scholar] [CrossRef]
  29. Hakobyan, A.; Kim, G.C.; Yang, I. Risk-Aware Motion Planning and Control Using CVaR-Constrained Optimization. IEEE Robot. Autom. Lett. 2019, 4, 3924–3931. [Google Scholar] [CrossRef]
  30. Karaman, S.; Frazzoli, E. Sampling-Based Algorithms for Optimal Motion Planning. Int. J. Robot. Res. 2011, 30, 846–894. [Google Scholar] [CrossRef]
  31. Douglas, D.H.; Peucker, T.K. Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or Its Caricature. Can. Cartogr. 1973, 10, 112–122. [Google Scholar] [CrossRef]
  32. Zhou, B.; Gao, F.; Wang, L.; Liu, C.; Shen, S. Robust and Efficient Quadrotor Trajectory Generation for Fast Autonomous Flight. IEEE Robot. Autom. Lett. 2019, 4, 3529–3536. [Google Scholar] [CrossRef]
  33. Fritsch, F.N.; Carlson, R.E. Monotone Piecewise Cubic Interpolation. SIAM J. Numer. Anal. 1980, 17, 238–246. [Google Scholar] [CrossRef]
  34. Redon, S.; Kheddar, A.; Coquillart, S. Fast Continuous Collision Detection between Rigid Bodies. Comput. Graph. Forum 2002, 21, 279–287. [Google Scholar] [CrossRef]
  35. Pan, J.; Chitta, S.; Manocha, D. FCL: A General Purpose Library for Collision and Proximity Queries. In Proceedings of the 2012 IEEE International Conference on Robotics and Automation (ICRA), St. Paul, MN, USA, 14–18 May 2012; pp. 3859–3866. [Google Scholar] [CrossRef]
  36. Fournier, A.; Fussell, D.; Carpenter, L. Computer Rendering of Stochastic Models. Commun. ACM 1982, 25, 371–384. [Google Scholar] [CrossRef]
Figure 1. The RG-HDP-VD Cooperative Planning Framework.
Figure 2. Schematic of Saturation Convergence Scenario. Different colored lines indicate the trajectories of different UAVs.
Figure 3. Schematic of Group Delivery Scenario. Different colored lines indicate the trajectories of different UAVs.
Figure 4. Performance trends of RG-HDP-VD, ECBS, and ORCA under Scenario A.
Figure 5. Comparison of System Total Energy Consumption Distribution.
Figure 6. Decomposition of System Energy Components. The dark and light orange colors represent the path energy and hover energy for RG-HDP, respectively, while the dark and light gray colors represent those for the Baseline.
Figure 7. Comparison of Heterogeneous Platform Energy Optimization.
Figure 8. Statistical Analysis of Waiting Time.
Figure 9. Trial-by-Trial Robustness Analysis.
Figure 10. Distribution of Successful Sorties per Round.
Figure 11. Cumulative Distribution Function of Flight Time Errors. The green shaded region marks the feasible arrival-error interval, and the vertical boundary indicates the early-arrival failure threshold.
Figure 12. Arrival Time Scatter Plot.
Figure 13. Computation Time per Round. Different colors denote RG-HDP-VD and Baseline-FixedV, respectively.
Figure 14. Sensitivity analysis of the energy-fairness anchor $\lambda_{eng}$. In the right panel, the pink curve denotes the normalized total mission energy as a function of $\lambda_{eng}$, showing a minimum near $\lambda_{eng} \approx 1.2$.
Figure 15. Planned 3D and top-down trajectories for the three-UAV heterogeneous team in the bottleneck indoor scenario.
Figure 16. Time-lapse sequence of the real-world flight experiment.
Table 1. Heterogeneous UAV Physical Parameter Settings.

| Parameter | Symbol | Heavy-Lift UAVs | Light UAVs | Unit |
|---|---|---|---|---|
| Empty/Payload Mass | $m_{empty}/m_{payload}$ | 20.0/10.0 | 2.0/0.5 | kg |
| Max Flight Speed | $v_{max}$ | 12 | 15 | m/s |
| Hovering Power Coeff. | $\alpha_h$ | 280 | 80 | W/kg$^{1.5}$ |
| Safety Collision Radius | $r_{safe}$ | 5.0 | 3.0 | m |
| Elastic Velocity Envelope | $V_i$ | 7–12 | 7–15 | m/s |
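Table 2. Comparative results under Scenario A.

| $N$ | Method | Success Rate (%) | $\Delta T_{sync}$ (s) | $E_{total}$ ($\times 10^{8}$ J) |
|---|---|---|---|---|
| 4 | OURS | 100 | 0.8 | 0.56 |
| 4 | ECBS | 90 | 4.2 | 0.63 |
| 4 | ORCA | 55 | 11.5 | 0.69 |
| 8 | OURS | 100 | 1.12 | 1.1 |
| 8 | ECBS | 55 | 8.3 | 1.32 |
| 8 | ORCA | 15 | 21.5 | 1.46 |
| 12 | OURS | 100 | 2.8 | 1.7 |
| 12 | ECBS | 30 | 13.6 | 2.17 |
| 12 | ORCA | 5 | 27.4 | 2.36 |
| 16 | OURS | 95 | 4.1 | 2.31 |
| 16 | ECBS | 0 | 18.2 | 3.03 |
| 16 | ORCA | 0 | 34.8 | 3.21 |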
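Table 3. $\lambda_{time}$ threshold and trade-off under two $\lambda_{eng}$ values.

| $\lambda_{eng}$ | $\lambda_{time}$ | Success Rate | $E_{total}$ ($\times 10^{8}$ J) | $\Delta T_{sync}$ (s) |
|---|---|---|---|---|
| 0.2 | 0 | 0% | 1.02 | 32.51 |
| 0.2 | 0.2 | 20% | 1.04 | 16.25 |
| 0.2 | 0.5 | 100% | 1.07 | 5.52 |
| 0.2 | 1 | 100% | 1.10 | 1.12 |
| 0.2 | 2 | 100% | 1.18 | 0.32 |
| 0.5 | 0 | 0% | 0.98 | 33.33 |
| 0.5 | 0.2 | 20% | 1.03 | 18.98 |
| 0.5 | 0.5 | 80% | 1.06 | 10.38 |
| 0.5 | 1 | 100% | 1.09 | 5.54 |
| 0.5 | 2 | 100% | 1.15 | 2.37 |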
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Han, D.; Hua, Z.; Zhu, X.; Luo, L.; Jiang, H.; Wang, L. RG-HDP-VD: A Physics-Aware Cooperative Trajectory Planning Framework for Heterogeneous Multi-UAVs. Drones 2026, 10, 192. https://doi.org/10.3390/drones10030192
