1. Introduction
A small pilotless drone navigates between skyscrapers in a windy urban corridor, buffeted by gusts, intermittently losing and regaining GPS lock, and maneuvering to avoid traffic, pedestrian bridges, and other airborne vehicles. Its mission is to deliver urgent medication to an elderly resident while complying with aviation regulations and safety requirements. This scenario illustrates a central challenge in modern uncrewed aerial vehicle (UAV) research: transitioning path planning algorithms from controlled laboratory environments to the uncertainty, complexity, and safety-critical demands of real-world deployment.
Over the past two decades, substantial progress has been made in UAV pathfinding under simplified assumptions. Classical planning approaches have demonstrated strong performance in static or partially known environments. Notably, Koenig and Likhachev introduced D* Lite, enabling efficient replanning in response to dynamic changes in traversal cost [1]. Karaman and Frazzoli later established a theoretical foundation for optimal motion planning with respect to user-defined cost functions, providing asymptotic optimality guarantees while accommodating penalties associated with distance, risk, or energy consumption [2]. These algorithms remain foundational components of contemporary UAV navigation research.
As UAV applications continue to expand, recent surveys have emphasized the increasing complexity of real-world operating environments. Meng et al. provided a comprehensive review of artificial intelligence and machine learning–based approaches for UAV path planning, with particular attention to three-dimensional urban environments and real-time operational constraints [3]. Ghambari et al. analyzed simulation platforms and evaluation methodologies, highlighting persistent gaps between algorithmic performance in simulated environments and operational deployability in real-world scenarios [4]. Constraint-based techniques, such as those proposed by Tayal et al., address collision avoidance in dynamic settings but remain sensitive to perception delays, modeling uncertainty, and computational limitations [5]. Despite these advances, fully autonomous and reliable UAV operation in complex real-world environments remains an open research challenge.
This paper advances a unifying research direction for UAV pathfinding termed Risk-Calibrated, Certifiably Safe, Resource-Aware (RCSR) pathfinding. Rather than introducing a new planning algorithm, RCSR serves as a conceptual framework that integrates fragmented research efforts into a coherent research agenda for deployable autonomy. More concretely, RCSR can function as a design pipeline directing planners to explicitly address each dimension in sequence, an evaluation rubric for determining which RCSR pillars a given method satisfies, or a research taxonomy for identifying deployment-critical gaps in current planning approaches. The framework synthesizes four interdependent dimensions: uncertainty-aware planning under imperfect sensing and dynamic environments; formal safety assurance and runtime verification; coordination among multiple UAVs operating in shared airspace under regulatory constraints; and resource-aware implementation subject to strict latency, energy, and computational limits.
For clarity, this survey uses three related but distinct terms throughout. Pathfinding refers to the algorithmic computation of a feasible or optimal route from a start state to a goal state, often at the graph-search or motion-planning level. Path planning is used in a broader sense to include route generation together with relevant objectives and constraints such as safety, risk, energy, dynamics, and regulatory feasibility. Navigation refers to the execution-level problem of following or adapting the planned path online using perception, state estimation, control, and replanning in response to environmental changes. Because these concepts are tightly coupled in real UAV systems, all three terms remain relevant in this manuscript, but path planning is used as the broadest umbrella term unless a narrower distinction is specifically intended.
Three conceptually related but operationally distinct modifiers appear throughout this survey and in the broader literature. Risk-aware planning refers to general recognition of environmental hazards—such as collision zones, terrain, or air traffic—without necessarily quantifying their probability or magnitude. Risk-calibrated planning involves explicit probabilistic or quantitative models of uncertainty distributions and threat exposure that are traded directly against path efficiency within the planning objective. Certifiably safe planning goes further by providing formal, verifiable guarantees on safety metrics—such as provable bounds on collision probability or constraint-satisfaction certificates—that hold under specified operating assumptions. Understanding these distinctions is essential for interpreting the algorithmic trade-offs surveyed in subsequent sections and for situating each method within the RCSR framework.
Reliable UAV path planning is critical across a wide range of current and emerging applications. Existing civilian uses include aerial imaging, precision agriculture, surveying and mapping, infrastructure inspection, public safety operations, and environmental monitoring [3, 4, 6]. Emerging applications envision urban air mobility, medical and humanitarian delivery, coordinated multi-UAV systems, and high-altitude communication platforms [7]. Many of these scenarios require beyond visual line of sight (BVLOS) operation in dense and regulated airspace, imposing stringent demands on robustness, safety, and efficiency.
Current planning paradigms address these requirements only partially. Classical search-based methods such as A* and D* variants support fast replanning but typically rely on deterministic models and limited representations of uncertainty [1, 8, 9]. Sampling-based planners, including RRT* and BIT*, explore continuous spaces effectively but often struggle to provide formal safety guarantees within strict real-time constraints [2]. Optimization-based approaches, such as model predictive control and mixed-integer formulations, explicitly encode vehicle dynamics and operational constraints but frequently suffer from computational fragility under sensing delays and model mismatch [10, 11]. Across these paradigms, risk calibration is rarely modeled explicitly, safety is frequently assumed rather than formally certified, and resource constraints are often treated as secondary considerations [5, 12].
RCSR pathfinding reframes these limitations by explicitly coupling uncertainty modeling, certifiable safety, coordination, and resource awareness within a unified research perspective. By doing so, it emphasizes the requirements necessary for UAV systems that are not only effective in simulation but also reliable in deployment. The contributions of this survey are as follows:
1. An integrated review of UAV path planning algorithms, simulation and testing methodologies, uncertainty-aware planning, and formal safety assurance, with explicit focus on real-world operational constraints.
2. A forward-looking research agenda based on the RCSR framework, identifying concrete research directions aimed at bridging the gap between laboratory prototypes and deployable UAV autonomy.
The remainder of this paper is organized as follows. Section 2 reviews prior survey studies and highlights the gaps that motivate the proposed synthesis. Section 3 reviews classical and modern UAV path planning paradigms. Section 4 surveys simulation platforms and experimental testbeds. Section 5 examines uncertainty-aware planning and formal safety assurance. Section 6 discusses environmental, computational, and energy constraints encountered in real-world deployment. Section 7 outlines future research directions through the RCSR framework, and Section 8 concludes the paper. A detailed taxonomy illustrating the overall structure of this paper and the surveyed research domains is presented in Figure 1.
2. Related Studies
Research on UAV path planning has expanded rapidly over the past decade, resulting in a growing body of survey papers that analyze planning algorithms, mission requirements, and environmental constraints. Most existing reviews, however, primarily emphasize algorithmic taxonomies and performance characteristics, while treating uncertainty, safety assurance, airspace regulation, and real-world deployment constraints as secondary or isolated concerns.
Jones et al. presented one of the most comprehensive recent taxonomies of UAV path-planning methods, emphasizing the influence of environmental complexity and map representation on algorithmic performance [13]. Meng et al. provided a broad review spanning deterministic, sampling-based, evolutionary, learning-based, and hybrid planners, identifying scalability, real-time feasibility, and robustness as persistent challenges for operational deployment [3]. Debnath et al. focused on remote-sensing and agricultural missions, reviewing planning and obstacle-avoidance techniques with emphasis on environmental constraints and sensing payloads [14]. Gugan and Haque analyzed the limitations of widely used planners, including poor adaptability in cluttered three-dimensional environments, weak integration between perception and planning modules, and limited robustness to uncertainty [15]. A complementary mission-oriented classification by Luo et al. organized UAV planners by operational context, highlighting trade-offs between computational cost, optimality, and adaptability [16]. While these surveys provide valuable algorithmic insight, they largely stop short of integrating safety guarantees, regulatory considerations, and deployment realism into a unified planning perspective.
Several studies narrow their focus to specific operational subdomains, including multi-UAV coordination, autonomous swarms, and urban airspace integration. Puente-Castro et al. surveyed artificial intelligence–based approaches for multi-UAV path planning, covering reinforcement learning, swarm intelligence, and evolutionary algorithms, with emphasis on distributed optimization, communication constraints, and energy balancing across fleets [17]. Davidović and Urošević examined planning constraints arising from UAV integration into dense urban airspace, surveying separation requirements, airspace structure, and regulatory barriers that directly influence feasible trajectory design [18]. Ghambari et al. provided a mission-centric taxonomy of UAV planning problems, highlighting the growing importance of multimodal sensing, energy constraints, and autonomous operation in dynamic environments [4]. Collectively, these studies identify environmental uncertainty, coordination complexity, and mission-specific constraints as central limitations of current planning frameworks. However, these challenges are often addressed in isolation rather than within an integrated deployment-oriented framework.
A growing body of research focuses explicitly on risk-aware UAV path planning. Primatesta et al. proposed a risk-aware planning strategy for dense urban environments by integrating static and dynamic risk maps into a modified A* algorithm, demonstrating improved safety performance in populated areas [19]. Tang et al. extended this line of work by introducing a third-party risk model that quantifies obstacle risk, fatality risk, and infrastructure vulnerability, embedding these metrics within a multi-objective A* planner designed for urban missions [20]. Zhou et al. incorporated crash probability estimates and economic risk into a unified cost formulation that adapts to complex air–ground environments, demonstrating substantial reductions in expected operational risk [21]. While these approaches successfully integrate risk into planning objectives, they are primarily evaluated offline and typically lack formal safety guarantees or runtime enforcement mechanisms.
Parallel advances in formal safety assurance offer complementary mechanisms for enabling reliable UAV autonomy. Ames et al. surveyed control barrier functions (CBFs), demonstrating how safety constraints can be encoded as forward-invariant sets within optimization-based controllers [22]. Hobbs et al. reviewed runtime assurance (RTA) architectures, including Simplex-style supervisory control and optimization-based safety filters capable of enforcing safety constraints during online operation, even when primary planners are learning-based or unverified [23]. Sciancalepore et al. introduced ORION, a framework that leverages Remote ID broadcasts to verify UAV trajectories for regulatory compliance, bridging trajectory verification with emerging airspace management requirements [24]. Despite their importance, these safety-focused studies typically treat planning as an upstream component and therefore do not provide a unified perspective linking risk modeling, uncertainty, regulatory constraints, and real-time computational limitations.
Table 1 summarizes representative studies across key dimensions relevant to deployable UAV path planning. The comparison highlights that while existing surveys and frameworks address individual aspects such as algorithmic design, risk modeling, coordination, or safety assurance, none provide an integrated treatment spanning uncertainty-aware planning, certifiable safety, regulatory compliance, and real-world resource constraints.
Overall, the literature demonstrates substantial progress across individual research directions, including algorithm design, risk assessment, swarm coordination, energy-aware planning, and formal safety mechanisms. However, these themes are often developed independently, with limited consideration of their interaction in real-world deployment scenarios. This survey addresses this gap by adopting a constraint-centric perspective and synthesizing prior work through the RCSR framework, unifying risk modeling, safety assurance, regulatory considerations, and resource-aware implementation into a coherent research agenda for next-generation UAV autonomy.
3. Pathfinding Algorithms for Real-World UAVs
Many pathfinding algorithms widely used in UAV navigation were originally developed for abstract graph search problems or ground-based robotic systems. Despite their origins, several algorithmic families have proven highly effective for aerial navigation in complex three-dimensional environments. In particular, algorithms that support explicit constraints, dynamic replanning, and partial environmental knowledge have become central to modern UAV autonomy.
Rather than presenting a purely historical survey, this section focuses on algorithmic families that serve as foundational building blocks for RCSR pathfinding in real-world UAV deployments. These approaches underpin many contemporary navigation systems and continue to influence how uncertainty, safety guarantees, and resource constraints are incorporated into practical UAV planning frameworks. Readers should note that the text shifts between two analytical modes: a survey perspective that documents algorithmic properties, capabilities, and representative implementations, and a framework perspective that evaluates how each family aligns with or falls short of the RCSR pillars. Subsections labeled ‘Implications for RCSR’ or ‘RCSR Relevance’ explicitly signal these analytical transitions.
Classical path-planning algorithms remain particularly important because they provide well-understood guarantees regarding optimality, completeness, and computational behavior. Many modern planners, including learning-based approaches, still rely on these classical methods either directly or as underlying components. Reinforcement learning–based path-planning approaches represent a complementary and more recent paradigm and are discussed later in the paper (Section 3.2).
3.1. Classical Pathfinding Algorithms for UAV Navigation
Classical pathfinding algorithms provide the conceptual and algorithmic foundation for many UAV navigation systems. These methods typically operate on graph representations of the environment or continuous configuration spaces and are designed to compute collision-free paths while optimizing cost functions such as distance, time, or energy consumption.
Figure 2 illustrates a taxonomy of classical pathfinding algorithms relevant to UAV navigation. The taxonomy highlights four major categories: deterministic grid-based search, incremental graph search methods designed for dynamic environments, sampling-based planners operating in continuous spaces, and trajectory generation approaches that combine search with motion constraints and optimization techniques.
3.1.1. Deterministic Grid-Based Pathfinding
Classical grid-based path planning models the environment as a weighted graph $G = (V, E, c)$, where $V$ denotes discrete states (e.g., grid cells), $E \subseteq V \times V$ represents feasible transitions, and $c : E \to \mathbb{R}_{\ge 0}$ encodes traversal costs such as distance, time, or energy consumption. A path $\pi = (v_0, v_1, \dots, v_k)$ has a total cost
$$J(\pi) = \sum_{i=0}^{k-1} c(v_i, v_{i+1}),$$
and the canonical planning problem is to determine a minimum-cost path between a start state $s$ and a goal state $g$ in a static, fully known environment.
Dijkstra and A*
Dijkstra’s algorithm computes optimal shortest paths on graphs with non-negative edge costs by iteratively expanding the node with the smallest cost-to-come value [25]. While optimal, the algorithm performs an exhaustive exploration of the search space, which becomes computationally expensive on large three-dimensional grids typical of urban UAV environments.
A* improves efficiency by introducing a heuristic estimate $h(n)$ of the remaining cost-to-go and expanding nodes according to
$$f(n) = g(n) + h(n),$$
where $g(n)$ is the cost-to-come of node $n$ (not to be confused with the goal state $g$). When $h$ is admissible and consistent, A* guarantees optimality while significantly reducing the number of explored nodes [8]. In UAV applications, heuristics commonly incorporate Euclidean distance, altitude change penalties, or approximate energy expenditure. From an RCSR perspective, heuristics provide a natural mechanism for encoding prior knowledge about cost structure. However, classical A* assumes deterministic edge costs and does not explicitly represent uncertainty or probabilistic risk.
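The expansion rule above can be illustrated with a minimal A* sketch on a 6-connected three-dimensional grid. The function name `astar`, the `climb_penalty` parameter, and the toy map (a wall that forces the vehicle to climb over it) are assumptions made for illustration and are not taken from any surveyed system. The Euclidean heuristic is consistent here because every move costs at least its geometric length.

```python
import heapq
import math

def astar(blocked, size, start, goal, climb_penalty=0.5):
    """A* on a 6-connected 3D grid; returns (cost, path) or (inf, [])."""
    def h(n):
        # Euclidean distance never overestimates: each unit move costs >= 1.
        return math.dist(n, goal)

    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    g = {start: 0.0}                      # cost-to-come
    parent = {start: None}
    open_heap = [(h(start), start)]       # ordered by f = g + h
    closed = set()
    while open_heap:
        _, n = heapq.heappop(open_heap)
        if n in closed:
            continue                      # stale duplicate entry
        if n == goal:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return g[goal], path[::-1]
        closed.add(n)
        for dx, dy, dz in moves:
            m = (n[0] + dx, n[1] + dy, n[2] + dz)
            if not all(0 <= m[i] < size[i] for i in range(3)) or m in blocked:
                continue
            # Illustrative energy model: climbing costs extra.
            step = 1.0 + (climb_penalty if dz > 0 else 0.0)
            new_g = g[n] + step
            if new_g < g.get(m, math.inf):
                g[m] = new_g
                parent[m] = n
                heapq.heappush(open_heap, (new_g + h(m), m))
    return math.inf, []

# Toy map: a wall at x = 2 spanning all y, two cells high (z = 0 and z = 1).
size = (5, 5, 3)
blocked = {(2, y, z) for y in range(5) for z in range(2)}
start, goal = (0, 2, 0), (4, 2, 0)
cost, path = astar(blocked, size, start, goal)
print(cost, path)
```

In this toy map the only feasible routes pass over the wall at z = 2, so the altitude penalty directly shapes the returned cost. Geofenced regions could be represented the same way, as additional entries in `blocked`.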
Weighted and Bounded-Suboptimal Variants
Weighted A* and related bounded-suboptimal algorithms trade optimality for reduced computational effort by inflating the heuristic $h(n)$ relative to the cost-to-come $g(n)$:
$$f(n) = g(n) + \varepsilon\, h(n), \qquad \varepsilon \ge 1.$$
For admissible heuristics, the cost $J(\pi)$ of the returned path $\pi$ satisfies
$$J(\pi) \le \varepsilon\, J(\pi^*),$$
where $\pi^*$ denotes an optimal path; that is, the path cost returned by Weighted A* is guaranteed not to exceed the optimal cost by more than a factor of $\varepsilon$. This explicit bound on solution quality is particularly useful for real-time UAV navigation, where computational resources are limited and rapid responses are required. It aligns naturally with the RCSR emphasis on resource-aware planning, where bounded deviations from optimality can be justified by strict latency or energy constraints. Nevertheless, these planners remain fundamentally deterministic and static, requiring extensions to handle dynamic obstacles, environmental uncertainty, or strict real-time constraints.
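The suboptimality bound follows from a short standard argument, sketched here under the assumptions that the heuristic $h$ is admissible, that $h$ vanishes at the goal, and that the search may re-expand a node whose cost-to-come improves. The symbols $g^*(n)$ (optimal cost-to-come), $h^*(n)$ (true cost-to-go), $C^*$ (optimal path cost), and $n_{\text{goal}}$ are notation introduced only for this sketch. When the goal is selected for expansion, some node $n'$ on an optimal path is still on the open list with $g(n') = g^*(n')$, so:

```latex
\begin{align*}
g(n_{\text{goal}}) = f(n_{\text{goal}})
  &\le f(n') = g^*(n') + \varepsilon\, h(n') \\
  &\le g^*(n') + \varepsilon\, h^*(n')
     && \text{(admissibility: } h \le h^* \text{)} \\
  &\le \varepsilon \left( g^*(n') + h^*(n') \right) = \varepsilon\, C^*
     && \text{(since } \varepsilon \ge 1 \text{)}.
\end{align*}
```

Setting $\varepsilon = 1$ recovers the usual optimality guarantee of A*.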
3.1.2. Incremental Graph Search with Temporal and Environmental Constraints
Real-world UAV operations rarely permit single-shot planning. Environmental conditions, obstacle locations, and sensing information often evolve during flight. Incremental and anytime search algorithms are therefore particularly relevant, as they reuse prior computation and adapt efficiently to changing environments.
LPA* and D* Lite
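Lifelong Planning A* (LPA*) and D* Lite update shortest paths incrementally when edge costs change, avoiding the need for full replanning. Both algorithms maintain a cost estimate $g(s)$ for each state (cost-to-come from the start in LPA*; cost-to-goal in D* Lite, which searches backward from the goal) and a one-step lookahead estimate $\mathrm{rhs}(s)$, and enforce consistency by driving $g(s)$ toward $\mathrm{rhs}(s)$. Nodes are prioritized using a lexicographically ordered key
$$k(s) = \bigl[\min\bigl(g(s), \mathrm{rhs}(s)\bigr) + h(s);\ \min\bigl(g(s), \mathrm{rhs}(s)\bigr)\bigr],$$
to which D* Lite adds a key modifier $k_m$ in the first component to account for vehicle motion, enabling efficient updates when obstacles, weather conditions, or energy costs change during flight [1, 26].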
From an RCSR perspective, incremental planners provide three key advantages: reduced computational overhead through reuse of prior search effort, online adaptability to newly sensed environmental information, and formal guarantees of optimality under admissible heuristics. However, these algorithms still treat costs as deterministic, and risk must therefore be incorporated indirectly through expected cost models or external runtime safety mechanisms.
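The reuse of prior search effort can be seen in a compact LPA*-style sketch on a small directed graph. The class name `LPAStar`, the `set_cost` method, the lazy-deletion priority queue, and the four-node example are assumptions made for illustration. The sketch follows the published bookkeeping of $g$, $\mathrm{rhs}$, and keys but omits the key modifier that D* Lite uses for a moving vehicle.

```python
import heapq
import math

class LPAStar:
    def __init__(self, edges, start, goal, h):
        self.cost = dict(edges)                  # (u, v) -> directed edge cost
        self.succ, self.pred = {}, {}
        for (u, v) in self.cost:
            self.succ.setdefault(u, set()).add(v)
            self.pred.setdefault(v, set()).add(u)
        self.start, self.goal, self.h = start, goal, h
        self.g, self.rhs = {}, {start: 0.0}      # missing entries mean infinity
        self.queue, self.queued = [], {}         # heap plus current key per node
        self.expansions = 0
        self._push(start)

    def _key(self, s):
        m = min(self.g.get(s, math.inf), self.rhs.get(s, math.inf))
        return (m + self.h(s), m)

    def _push(self, s):
        self.queued[s] = self._key(s)
        heapq.heappush(self.queue, (self.queued[s], s))

    def _top(self):
        # Drop heap entries that no longer match the node's current key.
        while self.queue and self.queued.get(self.queue[0][1]) != self.queue[0][0]:
            heapq.heappop(self.queue)
        return self.queue[0] if self.queue else ((math.inf, math.inf), None)

    def _update_vertex(self, u):
        if u != self.start:
            self.rhs[u] = min((self.g.get(p, math.inf) + self.cost[(p, u)]
                               for p in self.pred.get(u, ())), default=math.inf)
        self.queued.pop(u, None)
        if self.g.get(u, math.inf) != self.rhs.get(u, math.inf):
            self._push(u)                        # only inconsistent nodes are queued

    def compute(self):
        while True:
            key, u = self._top()
            goal_ok = self.g.get(self.goal, math.inf) == self.rhs.get(self.goal, math.inf)
            if u is None or (key >= self._key(self.goal) and goal_ok):
                break
            heapq.heappop(self.queue)
            del self.queued[u]
            self.expansions += 1
            if self.g.get(u, math.inf) > self.rhs[u]:      # overconsistent
                self.g[u] = self.rhs[u]
                for s in self.succ.get(u, ()):
                    self._update_vertex(s)
            else:                                          # underconsistent
                self.g[u] = math.inf
                for s in list(self.succ.get(u, ())) + [u]:
                    self._update_vertex(s)
        return self.g.get(self.goal, math.inf)

    def set_cost(self, u, v, c):
        self.cost[(u, v)] = c
        self._update_vertex(v)

edges = {("S", "A"): 1.0, ("A", "G"): 1.0, ("S", "B"): 2.0, ("B", "G"): 2.0}
planner = LPAStar(edges, "S", "G", h=lambda s: 0.0)   # zero heuristic is consistent
first = planner.compute()
initial_expansions = planner.expansions
planner.set_cost("A", "G", 10.0)                      # e.g. a newly sensed hazard
second = planner.compute()
repair_expansions = planner.expansions - initial_expansions
print(first, second, initial_expansions, repair_expansions)
```

After the edge cost changes, only the affected goal node is re-examined, so the repair touches fewer nodes than the initial search did.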
Anytime Variants (ARA*, AD*)
Anytime Repairing A* (ARA*) rapidly computes a bounded-suboptimal solution using an inflated heuristic and incrementally improves solution quality as time permits [27]. Anytime Dynamic A* (AD*) extends this capability to dynamic environments by combining bounded-suboptimal search with incremental replanning [28]. At any point, the current solution is guaranteed to lie within a known factor $\varepsilon$ of optimality. These guarantees are particularly attractive for UAVs operating under strict real-time constraints, since planners can produce feasible trajectories quickly and refine them as additional computation becomes available.
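The anytime behavior can be approximated with a deliberately simplified loop that restarts weighted A* with a decreasing inflation factor. This is not ARA* itself, which reuses search effort between iterations; the function names `weighted_astar` and `anytime_search`, the inflation schedule, and the grid are assumptions chosen for illustration.

```python
import heapq

GRID = [
    "..........",
    ".####.....",
    "....#.###.",
    ".##.#...#.",
    ".#..###.#.",
    ".#......#.",
]

def weighted_astar(grid, start, goal, eps):
    """Weighted A* on a 4-connected grid ('#' = obstacle); returns (cost, expansions)."""
    rows, cols = len(grid), len(grid[0])

    def h(n):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(n[0] - goal[0]) + abs(n[1] - goal[1])

    g = {start: 0}
    heap = [(eps * h(start), start)]
    expanded = 0
    while heap:
        f, n = heapq.heappop(heap)
        if f > g[n] + eps * h(n):
            continue                      # stale entry: a cheaper route was found
        if n == goal:
            return g[n], expanded
        expanded += 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            m = (n[0] + dr, n[1] + dc)
            if 0 <= m[0] < rows and 0 <= m[1] < cols and grid[m[0]][m[1]] != "#":
                if g[n] + 1 < g.get(m, float("inf")):
                    g[m] = g[n] + 1
                    heapq.heappush(heap, (g[m] + eps * h(m), m))
    return None, expanded

def anytime_search(grid, start, goal, schedule=(3.0, 2.0, 1.5, 1.0)):
    """Return the best cost found and a log of (eps, cost, expansions) per pass."""
    best, history = None, []
    for eps in schedule:                  # a real system would stop at its deadline
        cost, expanded = weighted_astar(grid, start, goal, eps)
        if cost is not None and (best is None or cost < best):
            best = cost
        history.append((eps, cost, expanded))
    return best, history

best, history = anytime_search(GRID, (0, 0), (5, 7))
optimal = history[-1][1]                  # the eps = 1.0 pass is plain A*
print(best, history)
```

Each pass yields a usable path whose cost is within the current factor of optimal, so a planner interrupted after any pass still holds a solution with a known quality bound.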
Safe Interval Path Planning (SIPP)
Dynamic obstacle avoidance can be addressed by augmenting spatial states with temporal constraints. Naïve time-expanded grid formulations quickly become computationally intractable, motivating the Safe Interval Path Planning (SIPP) framework. SIPP represents each state as $(v, I)$, where $I = [t_{\mathrm{start}}, t_{\mathrm{end}}]$ denotes a contiguous time interval during which state $v$ is guaranteed to remain collision-free. Transitions are permitted only if the arrival time $t_{\mathrm{arr}}$ lies within the successor state’s safe interval [9]. Subsequent extensions generalize SIPP to any-angle motion and real-time search settings suitable for dynamic robotic environments [29, 30]. While effective for temporal collision avoidance, SIPP often requires preprocessing of obstacle trajectories and can remain computationally demanding in dense environments.
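The interval bookkeeping can be sketched in a few lines. The helpers `safe_intervals` and `sipp_successors` are hypothetical names, and the sketch makes two simplifying assumptions: obstacle occupancy of a cell is already known as a list of time intervals, and collisions during the transition itself are ignored.

```python
import math

def safe_intervals(collision_intervals, horizon=math.inf):
    """Complement of the (possibly overlapping) collision intervals on [0, horizon)."""
    safe, t = [], 0.0
    for lo, hi in sorted(collision_intervals):
        if lo > t:
            safe.append((t, lo))
        t = max(t, hi)
    if t < horizon:
        safe.append((t, horizon))
    return safe

def sipp_successors(t_now, source_interval, travel_time, target_intervals):
    """One successor per reachable safe interval of the target cell.

    The vehicle may wait at the source, but only until the source's own
    safe interval ends.
    """
    out = []
    for s_lo, s_hi in target_intervals:
        arrive = max(t_now + travel_time, s_lo)   # wait until the target is free
        depart = arrive - travel_time
        if arrive < s_hi and depart <= source_interval[1]:
            out.append((arrive, (s_lo, s_hi)))
    return out

# Target cell is occupied by moving obstacles during [2, 4) and [7, 8).
target_safe = safe_intervals([(7, 8), (2, 4)])
# The vehicle sits in a source cell that is safe on [0, 6], at time 1.5,
# and needs 1.0 time unit to move into the target cell.
succ = sipp_successors(1.5, (0, 6), 1.0, target_safe)
print(target_safe, succ)
```

The first safe interval is missed because the vehicle would arrive at 2.5, after it has closed, and the last is unreachable because waiting that long would violate the source cell's own interval. Only the middle interval yields a successor, with arrival at time 4.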
Summary and Limitations
Grid-based planners provide a flexible mechanism for encoding constraints relevant to real-world UAV operations. Regulatory restrictions and geofenced regions can be represented as forbidden vertices or edges, while energy and time budgets naturally appear as edge costs. Dynamic obstacles may be incorporated through incremental cost updates or temporal safe intervals. Despite these strengths, classical grid-based methods scale poorly to high-dimensional state spaces and dense three-dimensional maps. These limitations motivate the transition toward planners operating in continuous spaces, which are discussed in subsequent sections.
3.1.3. Sampling-Based Planning in Continuous 3D Airspace
Sampling-based motion planners are well suited for UAV navigation in cluttered three-dimensional environments, where grid discretizations can become prohibitively large or too coarse to capture feasible kinodynamic motion. Let the continuous state space be $\mathcal{X} \subset \mathbb{R}^n$, with obstacle region $\mathcal{X}_{\mathrm{obs}}$ and collision-free region $\mathcal{X}_{\mathrm{free}} = \mathcal{X} \setminus \mathcal{X}_{\mathrm{obs}}$. UAV motion is typically modeled by continuous-time dynamics
$$\dot{x}(t) = f\bigl(x(t), u(t)\bigr),$$
and a trajectory $x(\cdot)$ over $[0, T]$ is evaluated by a cost functional
$$J(x, u) = \int_0^T L\bigl(x(t), u(t)\bigr)\, dt,$$
where $L$ may encode travel time, energy consumption, trajectory smoothness, or multi-objective penalties. Exact optimal planning for realistic UAV dynamics is generally computationally intractable, which motivates randomized planners that seek feasible, and in some cases asymptotically optimal, solutions through sampling of the configuration space.
RRT and RRT*
The rapidly exploring random tree (RRT) algorithm constructs a search tree rooted at the start state by repeatedly sampling a state $x_{\mathrm{rand}}$, selecting the nearest tree node $x_{\mathrm{near}}$, and extending toward $x_{\mathrm{rand}}$ using a steering operator that respects system constraints while performing local collision checking [31]. RRT is probabilistically complete: if a feasible path exists, the probability of finding one approaches one as the number of samples increases. However, the solutions produced by RRT are typically suboptimal.
RRT* extends the original RRT algorithm by introducing a rewiring step that selects parents and reconnects nearby nodes to improve the cost-to-come of the tree. Under standard assumptions, RRT* is asymptotically optimal, meaning that
$$\mathbb{P}\Bigl(\lim_{n \to \infty} c_n = c^*\Bigr) = 1,$$
where $c_n$ is the cost of the best solution found after $n$ samples and $c^*$ denotes the optimal cost [2]. From an RCSR perspective, RRT* provides a principled mechanism to trade computational effort (e.g., number of samples, neighbor radius, and rewiring budget) for solution quality while retaining convergence guarantees. In practice, however, finite-sample performance can vary significantly in large three-dimensional environments, and classical collision checking typically assumes deterministic obstacle representations without explicit modeling of uncertainty or risk.
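The choose-parent and rewiring steps can be sketched for a point vehicle in a two-dimensional workspace with circular obstacles. Everything named here (`rrt_star`, `segment_clear`, the fixed neighbor radius, the goal bias, the obstacle layout) is an assumption for illustration. In particular, a fixed radius replaces the shrinking radius used in the asymptotic-optimality analysis, and straight-line steering stands in for a dynamics-aware steering operator.

```python
import math
import random

def segment_clear(p, q, obstacles):
    """True if the segment p-q stays outside every circular obstacle (cx, cy, r)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    len2 = dx * dx + dy * dy
    for cx, cy, r in obstacles:
        t = 0.0 if len2 == 0 else max(0.0, min(1.0, ((cx - p[0]) * dx + (cy - p[1]) * dy) / len2))
        if math.hypot(p[0] + t * dx - cx, p[1] + t * dy - cy) <= r:
            return False
    return True

def rrt_star(start, goal, obstacles, bounds=(0.0, 10.0), n_iter=1000,
             step=0.7, radius=1.5, goal_tol=0.5, goal_bias=0.05, seed=0):
    rng = random.Random(seed)
    nodes, parent = [start], {0: None}

    def cost(i):
        # Cost-to-come, recomputed by walking to the root so it is never stale.
        c = 0.0
        while parent[i] is not None:
            c += math.dist(nodes[i], nodes[parent[i]])
            i = parent[i]
        return c

    for _ in range(n_iter):
        sample = goal if rng.random() < goal_bias else (rng.uniform(*bounds), rng.uniform(*bounds))
        near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        d = math.dist(nodes[near], sample)
        if d == 0.0:
            continue
        s = min(1.0, step / d)                       # straight-line steering
        new = (nodes[near][0] + s * (sample[0] - nodes[near][0]),
               nodes[near][1] + s * (sample[1] - nodes[near][1]))
        if not segment_clear(nodes[near], new, obstacles):
            continue
        nbrs = [i for i in range(len(nodes))
                if math.dist(nodes[i], new) <= radius and segment_clear(nodes[i], new, obstacles)]
        # Choose the parent that minimizes cost-to-come of the new node.
        best = min(nbrs, key=lambda i: cost(i) + math.dist(nodes[i], new))
        k = len(nodes)
        nodes.append(new)
        parent[k] = best
        new_cost = cost(k)
        # Rewire neighbors that become cheaper when routed through the new node.
        for i in nbrs:
            if new_cost + math.dist(new, nodes[i]) < cost(i):
                parent[i] = k

    reached = [i for i in range(len(nodes)) if math.dist(nodes[i], goal) <= goal_tol]
    if not reached:
        return None, math.inf
    i = min(reached, key=cost)
    best_cost, path = cost(i), []
    while i is not None:
        path.append(nodes[i])
        i = parent[i]
    return path[::-1], best_cost

START, GOAL = (1.0, 1.0), (9.0, 9.0)
OBSTACLES = [(5.0, 5.0, 1.5)]
path, path_cost = rrt_star(START, GOAL, OBSTACLES)
print(len(path) if path else None, path_cost)
```

Increasing `n_iter` or `radius` spends more computation for a shorter path, which is the effort-for-quality trade described above.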
Implications for RCSR
Sampling-based planners offer several properties that are directly relevant for real-world UAV pathfinding: (i) scalability to continuous three-dimensional airspace, (ii) flexibility in handling geometric constraints through collision checking and steering functions, and (iii) compatibility with multi-objective cost formulations through the running cost $L$. Within an RCSR framework, these planners can be extended by shaping sampling distributions using risk or turbulence maps, incorporating risk-sensitive objective functions, and pruning candidate solutions using probabilistic safety constraints rather than purely deterministic collision checks.
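As one hedged illustration of a probabilistic safety constraint, the deterministic test "clearance greater than zero" can be replaced by a bound on collision probability. The sketch below assumes zero-mean Gaussian position error along the obstacle normal and treats the obstacle boundary as locally flat; the function names and the threshold `delta` are hypothetical.

```python
import math

def collision_probability(clearance, sigma):
    """P(position error along the obstacle normal exceeds the clearance).

    Assumes zero-mean Gaussian error with standard deviation sigma and a
    locally flat (half-space) obstacle boundary.
    """
    return 0.5 * math.erfc(clearance / (sigma * math.sqrt(2.0)))

def passes_chance_constraint(clearance, sigma, delta=0.01):
    """Keep a candidate state only if its collision probability is at most delta."""
    return collision_probability(clearance, sigma) <= delta

# Two candidate states that a deterministic check would both accept,
# evaluated with a 1 m position uncertainty.
print(passes_chance_constraint(3.0, 1.0))   # ample clearance relative to sigma
print(passes_chance_constraint(1.0, 1.0))   # clearance comparable to sigma
```

A planner could call such a check wherever it currently performs a geometric collision test, pruning samples whose estimated risk exceeds the allowed budget.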
Despite these advantages, classical sampling-based planners provide limited hard real-time guarantees and typically rely on simplified dynamic models during planning. Producing aerodynamically feasible and trackable UAV trajectories therefore often requires an additional smoothing or optimization stage, and safety under perception uncertainty is frequently enforced through downstream safety filters or runtime assurance mechanisms discussed later in this survey.
3.2. Reinforcement Learning–Based Path Planning
While classical graph search, sampling-based planning, and trajectory optimization provide strong algorithmic foundations for UAV pathfinding, they typically rely on explicit environmental models and carefully engineered cost functions. Reinforcement learning (RL) offers a complementary paradigm in which navigation policies are learned directly through interaction with the environment. By optimizing behavior through trial-and-error experience, RL-based planners can adapt to complex disturbances, partial observability, and mission objectives that may be difficult to encode analytically [35, 36].
Recent studies demonstrate that RL-based approaches can achieve effective online navigation, energy-aware flight behavior, and cooperative multi-UAV coordination, particularly when trained in high-fidelity simulation environments [3, 12, 37, 38, 39]. Figure 3 summarizes the major categories of reinforcement learning–based pathfinding approaches relevant to the RCSR framework.
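A common formulation models UAV decision-making as a Markov decision process (MDP), $\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, r, \gamma)$, where $s \in \mathcal{S}$ denotes the state (e.g., position, velocity, battery state, and local map features), $a \in \mathcal{A}$ denotes actions (e.g., heading commands or continuous thrust/attitude setpoints), $P(s' \mid s, a)$ is the transition model, $r(s, a)$ is the reward, and $\gamma \in [0, 1)$ is a discount factor. The objective is to learn a policy $\pi(a \mid s)$ that maximizes the expected discounted return
$$J(\pi) = \mathbb{E}_{\pi}\Bigl[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Bigr].$$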
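For a single finite episode, the discounted return inside the expectation reduces to a weighted sum of rewards. The helper below is a minimal sketch with a hypothetical name, `discounted_return`, and an arbitrary reward sequence.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one episode, accumulated from the last step back."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total

# Three steps with reward 1 each and gamma = 0.5: 1 + 0.5 + 0.25.
episode_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
print(episode_return)
```

Averaging this quantity over many episodes generated by a policy gives a Monte Carlo estimate of the objective that the policy is trained to maximize.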
3.2.1. Single-UAV Navigation and Obstacle Avoidance
RL has been widely explored for single-UAV local navigation and obstacle avoidance in partially known environments. Typical observations include occupancy grids, distance-field features, or onboard camera/LiDAR inputs combined with UAV kinematics [3]. For discrete actions, deep Q-learning updates the action-value estimate $Q(s, a)$ using the temporal-difference target
$$y = r + \gamma \max_{a'} Q(s', a'),$$
while continuous-control settings commonly use actor–critic algorithms such as DDPG, TD3, SAC, and PPO [40, 41, 42, 43]. These methods can learn reactive behaviors that avoid obstacles, improve smoothness, and reduce energy consumption, and they can be trained under randomized disturbances (e.g., wind fields and sensor noise) to improve robustness in deployment. Attention-based and recurrent architectures further improve performance under partial observability and perception latency by leveraging temporal context [44].
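The Q-learning update can be made concrete with a tabular stand-in for the neural network. The sketch below is not deep Q-learning: it uses a lookup table, a five-cell corridor, and a fixed sweep over all transitions instead of exploration, all of which are assumptions chosen so that the example is deterministic. Because Q-learning is off-policy, updating from such a fixed batch of transitions is still a valid instance of the rule.

```python
N_STATES, GOAL = 5, 4          # corridor cells 0..4, goal at the right end
LEFT, RIGHT = 0, 1

def step(s, a):
    """Deterministic corridor dynamics: -1 per move, +10 on reaching the goal."""
    s_next = max(0, s - 1) if a == LEFT else s + 1
    done = s_next == GOAL
    reward = 10.0 if done else -1.0
    return s_next, reward, done

def q_update(Q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One Q-learning step toward the target y = r + gamma * max_a' Q(s', a')."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(200):                       # repeated sweeps over every transition
    for s in range(GOAL):
        for a in (LEFT, RIGHT):
            s_next, r, done = step(s, a)
            q_update(Q, s, a, r, s_next, done)

greedy = [max((LEFT, RIGHT), key=lambda a: Q[s][a]) for s in range(GOAL)]
print(greedy, Q[3][RIGHT])
```

The greedy policy extracted from the table moves right in every cell, which is the shortest route to the goal in this corridor.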
From an RCSR standpoint, RL is most compelling as a fast local policy that can adapt to uncertainty and changing conditions with modest onboard inference cost. However, most approaches still rely on hand-designed rewards and do not provide formal guarantees of collision avoidance or constraint satisfaction, limiting their direct use in safety-critical UAV operations [37, 39].
3.2.2. Multi-UAV Cooperation and Deep Multi-Agent RL
RL has also been extended to cooperative multi-UAV settings, where coordination, task allocation, and collision avoidance must be addressed jointly. Such problems are often modeled as decentralized partially observable MDPs (Dec-POMDPs), in which each UAV acts based on local observations while the team optimizes a shared objective [45, 46]. A widely used paradigm is centralized training with decentralized execution (CTDE), where a centralized critic has access to joint state/action information during training, but each UAV executes a local policy at deployment. Multi-agent DDPG (MADDPG) is a canonical CTDE approach that learns decentralized actors with centralized critics [47]. Related methods have been applied to cooperative monitoring, tracking, and data-harvesting missions, improving coverage and robustness relative to heuristic baselines [12, 45]. Value-factorization approaches (e.g., QMIX) and multi-agent actor–critic variants have also been used to couple task assignment with collision-aware path planning in dynamic environments [48, 49, 50, 51].
These approaches relate directly to the RCSR agenda because they can incorporate resource limits (battery, bandwidth, team size) and adapt to changing mission geometry or agent failures. However, training is computationally expensive and typically performed offline. Ensuring that learned coordination policies remain safe, interpretable, and certifiable under realistic sensing and airspace constraints remains an open challenge [38, 52].
3.2.3. Safe RL and Runtime Assurance
Standard RL optimizes expected return and does not inherently satisfy hard safety or resource constraints. Safe RL addresses this limitation by introducing constraint costs $c_i(s, a)$ and enforcing bounds on their expected discounted sums. A constrained MDP can be written as
$$\max_{\pi}\ \mathbb{E}_{\pi}\Bigl[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Bigr] \quad \text{subject to} \quad \mathbb{E}_{\pi}\Bigl[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\Bigr] \le d_i, \quad i = 1, \dots, m,$$
where $d_i$ are the constraint thresholds; the constraints may represent collision probability, minimum separation, or energy budget. Constrained policy optimization (CPO) and Lagrangian actor–critic approaches enforce these constraints approximately during training by optimizing a Lagrangian objective with dual variables [53, 54].
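The Lagrangian treatment can be written out as a generic primal-dual scheme. The block below is a sketch of the standard construction rather than the exact update of any cited method. The symbols $J_r(\pi)$ and $J_{c_i}(\pi)$ (expected discounted reward and constraint cost under $\pi$), the multipliers $\lambda_i$, and the dual step size $\eta$ are notation introduced here for illustration.

```latex
\begin{align*}
\mathcal{L}(\pi, \lambda)
  &= J_r(\pi) - \sum_{i=1}^{m} \lambda_i \bigl( J_{c_i}(\pi) - d_i \bigr),
  \qquad \lambda_i \ge 0, \\
\pi
  &\leftarrow \arg\max_{\pi}\ \mathcal{L}(\pi, \lambda)
  \quad \text{(approximated in practice by policy-gradient steps)}, \\
\lambda_i
  &\leftarrow \max\Bigl( 0,\ \lambda_i + \eta \bigl( J_{c_i}(\pi) - d_i \bigr) \Bigr).
\end{align*}
```

Each multiplier grows while its constraint is violated and decays toward zero once the constraint is satisfied, which is why constraint satisfaction is only approximate during training.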
A complementary and often more deployable strategy is to combine RL with shielding or runtime assurance: a safety filter monitors proposed actions and intervenes whenever the learned policy would violate certified constraints [55]. For UAVs, shields can be derived from control barrier functions or reachability analysis and wrapped around PPO- or SAC-trained controllers. More broadly, recent hybrid architectures combine otherwise unverified planners, especially learning-based policies, with certified supervisory layers that enforce hard safety constraints only when necessary. In such systems, the nominal planner provides adaptive or high-performance behavior, while runtime assurance, shielding, control barrier function filters, or reachability-based safety monitors ensure collision avoidance, safe separation, and control feasibility during execution. These approaches directly strengthen the “certifiably safe” pillar of RCSR by preserving the adaptability of RL-based planners while moving safety enforcement into a certifiable supervisory layer. However, systematic evaluation on real UAV platforms and integration with certification workflows remain limited [52, 56].
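A one-dimensional example shows how such a filter wraps an unverified policy. The sketch assumes single-integrator dynamics (velocity is commanded directly), a wall at a known position, and a nominal policy that always flies forward; the names `cbf_filter` and `simulate` and all numerical values are hypothetical. With the barrier function $h(x) = x_{\text{wall}} - \text{margin} - x$, the condition $\dot{h} \ge -\alpha h$ reduces to $u \le \alpha h$, so the minimally invasive safe command has a closed form.

```python
def cbf_filter(u_nominal, x, x_wall, margin, alpha=1.0):
    """Return the command closest to u_nominal that satisfies the CBF condition."""
    h = x_wall - margin - x          # h >= 0 defines the safe set
    u_max = alpha * h                # from dh/dt = -u >= -alpha * h
    return min(u_nominal, u_max)

def simulate(steps=200, dt=0.05, x0=0.0, x_wall=5.0, margin=1.0):
    """Euler simulation of x' = u with the filter in the loop."""
    x, min_h, interventions = x0, float("inf"), 0
    for _ in range(steps):
        u_nom = 2.0                  # stand-in for an unverified learned policy
        u = cbf_filter(u_nom, x, x_wall, margin)
        interventions += int(u < u_nom)
        x += dt * u
        min_h = min(min_h, x_wall - margin - x)
    return x, min_h, interventions

x_final, min_h, interventions = simulate()
print(x_final, min_h, interventions)
```

The filter leaves the nominal command untouched while the vehicle is far from the wall and scales it back only near the boundary, so the vehicle approaches the safety margin without crossing it. In the discrete-time simulation this relies on `alpha * dt` being at most one.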
A deeper challenge concerns the gap between academic safe RL methods and formal aviation certification standards such as DO-178C [57]. The methods discussed in this section, including PPO- and SAC-trained controllers, are deep reinforcement learning approaches that represent their learned policies as neural networks (typically actor and critic networks). Certification standards of this kind require deterministic software artifacts with verified coverage and requirements traceability, properties that are difficult to establish for such learned policies, even when augmented with shielding or control barrier functions. Shielding and CBF-based filters can reduce the frequency of unsafe actions during deployment, but they do not by themselves produce a certifiable safety case for the underlying learned policy. The internal representations of these neural network components are not amenable to exhaustive formal verification under current tools, making it difficult to provide the level of assurance demanded by aviation regulators. Bridging this gap will require closer engagement with certification authorities, development of interpretable or formally verifiable representations of learned policies, and integration of verification methods capable of reasoning about neural network components within broader certified system architectures. In the near term, the most credible path to deployment in safety-critical airspace remains a hybrid architecture in which a fully certified supervisory layer, whose behavior can be independently validated to applicable aviation standards, retains authority to override any unverified learned policy.
3.2.4. Evolutionary and Bio-Inspired Methods
Evolutionary and bio-inspired optimizers—including genetic algorithms (GA), particle swarm optimization (PSO), and ant colony optimization (ACO)—have also been explored for UAV path planning. These approaches evolve candidate paths to optimize multi-objective criteria such as path length, threat exposure, and fuel or energy consumption [
58]. They are well-suited to complex nonconvex objective landscapes and are frequently used for offline route design or to tune parameters of other planners. However, their computational cost and limited online reactivity generally restrict their role in RCSR settings to offline optimization or hybrid warm-starting rather than standalone onboard planning [
3].
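To make the multi-objective formulation concrete, the following Python sketch applies a basic particle swarm optimizer to a single intermediate waypoint between a start and a goal. The geometry (a circular threat zone on the straight-line route), the weighted-sum cost, the swarm hyperparameters, and every name in the code are illustrative assumptions rather than a method from the cited works.

```python
import math
import random

START, GOAL = (0.0, 0.0), (10.0, 0.0)
THREAT_CENTER, THREAT_RADIUS = (5.0, 0.0), 2.0  # hypothetical threat zone
SAMPLES_PER_LEG = 21


def path_cost(waypoint, threat_weight=10.0):
    """Weighted sum of path length and mean threat exposure for START -> waypoint -> GOAL."""
    length = math.dist(START, waypoint) + math.dist(waypoint, GOAL)
    exposure = 0.0
    for a, b in ((START, waypoint), (waypoint, GOAL)):
        for k in range(SAMPLES_PER_LEG):
            t = k / (SAMPLES_PER_LEG - 1)
            p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
            # Penetration depth into the threat zone at this sample point.
            exposure += max(0.0, THREAT_RADIUS - math.dist(p, THREAT_CENTER))
    return length + threat_weight * exposure / (2 * SAMPLES_PER_LEG)


def pso(cost, n_particles=20, iters=100, seed=0, w=0.7, c1=1.5, c2=1.5):
    """Basic global-best PSO over 2D waypoints; seeded for repeatability."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0.0, 10.0), rng.uniform(-5.0, 5.0)) for _ in range(n_particles)]
    vel = [(0.0, 0.0)] * n_particles
    pbest, pbest_cost = list(pos), [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Inertia plus attraction toward personal and global bests.
            vel[i] = tuple(w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
                           for v, x, pb, gb in zip(vel[i], pos[i], pbest[i], gbest))
            pos[i] = tuple(x + v for x, v in zip(pos[i], vel[i]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i], c
    return gbest, gbest_cost


best_waypoint, best_cost = pso(path_cost)
straight_cost = path_cost((5.0, 0.0))  # flying straight through the threat zone
print(best_waypoint, best_cost, straight_cost)
```

Each cost evaluation samples the whole candidate path, and the swarm needs many such evaluations. This illustrates why methods of this kind are usually run offline or used to warm-start a faster onboard planner.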
3.2.5. Hybrid Learning–Planning Architectures and RCSR Alignment
Given challenges in sample efficiency, safety during exploration, and hard constraint enforcement, RL is increasingly deployed as part of hybrid architectures that combine learning with classical planning or optimization [
3,
52]. A common structure is to use a global planner (e.g., A*, RRT*, or mixed-integer optimization) to generate a constraint-respecting route, while a learned local policy adapts online to wind, moving obstacles, or model mismatch. RL has also been used to bias sampling in motion planners, propose warm-start trajectories for local optimization, or adjust waypoints and velocity profiles to reduce energy consumption [
37,
52,
59,
60]. In multi-UAV settings, learning can be combined with distributed optimization and edge/cloud offloading to handle computation-heavy policies under tight onboard resource budgets [
38].
From an RCSR perspective, RL-based techniques are most promising when embedded within a broader safety and certification architecture. Learned policies provide adaptive behavior under uncertainty and resource variability, but they typically require additional layers—safe RL formulations, runtime assurance, control barrier functions, or reachability-based shields—to provide certifiable guarantees. Bridging high-performing RL policies with risk-calibrated, certifiably safe, resource-aware UAV operation therefore remains a central research challenge and a key opportunity for RCSR-oriented pathfinding.
3.3. Summary and RCSR Relevance
The algorithm families reviewed in this section each offer distinct advantages and limitations, and their suitability depends on the demands of the target application.
Table 2 summarizes this perspective by providing a scenario-oriented selection guide that relates common UAV operating conditions to appropriate planning families.
Overall, classical algorithm families provide well-understood building blocks for RCSR by offering structure, heuristics, and—in some cases—formal optimality or bounded suboptimality guarantees. Their main gaps for deployable autonomy, addressed in later sections, include (i) calibrated models of uncertainty and risk, (ii) formal runtime safety layers, and (iii) systematic treatment of compute and energy budgets. Consequently, the most promising practical direction is toward hybrid planners that combine search or sampling with dynamically feasible trajectory generation and safety-enforcing runtime mechanisms, evaluated under realistic environmental and regulatory constraints. Taken together, the reviewed algorithmic families lay the groundwork for RCSR-compliant planning but individually satisfy only subsets of its three core pillars—risk calibration, certifiable safety, and resource awareness—highlighting the need for the integrated architectures and formal safety mechanisms discussed in subsequent sections.
4. Simulation Frameworks, Testbeds and Datasets
This section provides an integrated view of the experimental ecosystem supporting RCSR path planning for UAVs. The ecosystem spans simulation platforms, physical testbeds, and datasets, each addressing complementary aspects of deployable UAV autonomy. No single tool captures all dimensions of risk calibration, safety assurance, and resource constraints, making a combined evaluation pipeline essential.
Simulation enables controlled experimentation under uncertainty and resource limits, testbeds expose real-world system effects, and datasets support perception-driven risk estimation and benchmarking. Together, these components form the foundation for validating end-to-end RCSR pipelines.
4.1. Simulation Frameworks Supporting RCSR Path Planning
Simulation platforms vary in physical realism, sensor fidelity, uncertainty modeling capabilities, autopilot integration, scalability, and suitability for safety-critical benchmarking. Rather than relying on a single standard tool, the field uses a layered simulation ecosystem in which different platforms support different aspects of RCSR path planning.
Algorithm-centric simulators, such as MATLAB/Simulink (The MathWorks, Inc., Natick, MA, USA) [
61], enable controlled evaluation of risk-aware cost functions, optimization strategies, and resource trade-offs. They are widely used in studies involving 2D/3D grid maps, meta-heuristic optimization, and comparative evaluation of planners such as A*, RRT [
62], TSO [
63], PSO [
64], and related methods under risk-aware and resource-constrained objectives [
65,
66,
67]. These environments typically employ simplified kinematic or dynamic UAV models, with trajectory outcomes visualized in 2D or 3D. Their primary advantage lies in rapid prototyping and fine-grained control over risk-weighted cost functions, safety constraints, and resource budgets, whereas their realism in aerodynamics, sensing, and vehicle physics is limited. As a result, MATLAB/Simulink is best suited for algorithmic benchmarking, convergence analysis, ablation studies, and exploration of risk-calibrated objective functions.
Physics-based robotics simulators, such as Gazebo [
68], PX4 [
69], and ArduPilot SITL [
70], form a robotics-grade simulation stack for RCSR UAV path-planning research in safety-critical and resource-constrained 3D environments. Gazebo supports physics modeling, 3D environments, wind fields, and sensor plugins, while SITL executes the actual autopilot firmware. Together, these tools support end-to-end validation from planner to autopilot to dynamics, enabling evaluation under realistic flight dynamics, sensor noise, autopilot behavior, and environmental disturbances such as wind, while incorporating explicit safety and risk constraints [
71,
72,
73,
74].
Photorealistic simulators, such as AirSim [
75], enable perception-driven RCSR path planning, particularly for vision-based navigation and deep reinforcement learning, by exposing planners to realistic sensing uncertainty and complex environments. AirSim provides APIs for Python and C++ and optional PX4-SITL integration, making it a useful platform for risk-aware vision-based navigation and for evaluating perception–planning pipelines, obstacle avoidance, and high-level autonomy under realistic camera and LiDAR conditions with explicit risk and safety constraints [
76,
77,
78].
Flexible and RL-oriented environments, including Unity/Unity3D [
79], Flightmare [
80], and gym-pybullet-drones [
81], support learning-based and multi-agent planning, emphasizing scalability and rapid experimentation under constrained resources. Unity/Unity3D [
79] is a popular engine for building custom UAV simulation environments due to its flexibility, high-quality 3D rendering, and support for ML-Agents. It is especially attractive when custom and visually rich environments are required for RCSR experimentation. Unity/Unity3D is often used for swarm-based, heuristic, and learning-based RCSR path-planning research under safety-constrained objectives [
82,
83,
84]. RL-oriented simulators like Flightmare [
85] and gym-pybullet-drones are optimized for deep RL and multi-agent training with fast physics, simple APIs, and parallel environments. They enable efficient learning of safe, high-speed navigation and risk-aware path planning under safety constraints in RCSR settings [
86,
87,
88,
89].
Lightweight and autopilot-integrated tools, including jMAVSim [
90] and Paparazzi NPS [
91], enable efficient validation of control behavior, waypoint tracking, and onboard resource constraints. Although jMAVSim is less physically detailed than Gazebo or AirSim, it is widely used for validating controller performance, waypoint-following, and motion-planning algorithms under realistic onboard limitations [
92,
93]. Paparazzi NPS supports fixed-wing and rotorcraft platforms with realistic sensor noise, making it well suited for evaluating plan-level logic and waypoint-generation strategies for RCSR deployments under calibrated risk [
94,
95,
96].
Multi-UAV and cloud-enabled platforms, including OpenUAV [
97,
98], UavSim [
99], and SkyRover [
100,
101], enable large-scale coordination, mission-level evaluation, and resource-aware planning under shared environments. OpenUAV emphasizes scalability and system-level integration through containerized 3D simulation with PX4/ArduPilot, supporting swarm deployment and emerging capabilities such as vision-language navigation and risk-aware planning [
102,
103]. In contrast, UavSim focuses on algorithmic flexibility, offering plug-and-play planning modules for cooperative missions, with strengths in small-object detection and comparative evaluation of path-planning performance [
99]. SkyRover targets cross-domain coordination by integrating UAVs and AGVs in ROS2–Gazebo environments, enabling standardized MAPF benchmarking with explicit modeling of constraints and collision dynamics [
100].
Grid-based and custom simulators support algorithmic benchmarking, multi-agent conflict resolution, and controlled safety analysis by abstracting away full system complexity while preserving key planning constraints. Platforms such as V-REP/CoppeliaSim [
104] provide moderate physics realism and motion-planning libraries for benchmarking coverage and indoor navigation performance [
105,
106], while MORSE [
107] emphasizes sensor-driven simulation and conceptual design for perception and multi-UAV missions [
108,
109,
110,
111]. Aviones [
112,
113] focuses on fixed-wing dynamics and energy-aware planning with hardware-in-the-loop extensions. In contrast, grid-based simulators [
114,
115] prioritize discrete, structured environments to evaluate efficiency and conflict resolution in multi-agent settings. Additionally, other custom research simulators [
82,
116,
117,
118] target specific RCSR problems, ranging from swarm navigation and heuristic routing to RL-based collision avoidance and large-scale task allocation under explicit safety and resource constraints.
Overall,
Table 3 highlights that no single simulator satisfies all RCSR requirements, reflecting an inherent trade-off between fidelity and scalability. High-fidelity platforms support safety validation under realistic conditions, whereas lightweight environments enable large-scale benchmarking and rapid experimentation. Consequently, the literature adopts a layered and complementary simulation ecosystem, where platforms are selected based on the maturity of the planning approach and the specific RCSR dimension being evaluated. Within this pipeline, simulators can be broadly categorized into three roles: (i) rapid algorithm prototyping environments (e.g., MATLAB), (ii) high-fidelity physics and perception simulators (e.g., AirSim, Gazebo), and (iii) scalable multi-agent or reinforcement learning (RL) training environments.
Qualitative ratings in
Table 3 reflect the level of physical and sensing realism. Realism/Physics refers to how accurately the simulator models UAV dynamics, environmental interactions, and physical constraints. It ranges from low (simplified or discrete motion without aerodynamic effects), to moderate (basic rigid-body dynamics and collision handling), to high (physically grounded dynamics with environmental effects and controller integration). Sensor Fidelity reflects the realism and diversity of sensor outputs available for perception-driven planning. The values include minimal (no or abstract sensing), low (idealized outputs), moderate (configurable sensors with partial realism), high (realistic multi-modal sensing with noise and environmental interaction), and very high (photorealistic, physically consistent sensing suitable for perception learning and sim-to-real transfer).
4.2. UAV Testbeds Supporting RCSR Validation
While simulation is indispensable for scalable experimentation, physical UAV testbeds are critical for validating RCSR path-planning algorithms under real sensing, actuation, timing, communication, and resource constraints. Hardware testbeds expose planners to unmodeled dynamics, latency, packet loss, and sensor noise that are difficult to capture faithfully in simulation, and therefore play a key role in demonstrating robustness, risk calibration, certifiable safety, and deployability.
A large class of experimental platforms relies on indoor motion-capture systems that provide high-precision ground-truth pose estimates for small multirotors. Early influential examples include the GRASP multi-UAV testbed, which enabled coordinated flight, formation control, and cooperative transport under centralized planning. The Crazyswarm framework extends this paradigm using nano-quadrotors, supporting dense indoor swarms and rapid prototyping of multi-agent risk-aware and safety-constrained planning algorithms in safe, repeatable conditions. Such testbeds are particularly valuable for evaluating collision avoidance, formation changes, and coverage planning under bounded energy and communication budgets without the regulatory and weather constraints of outdoor flight.
Dedicated indoor facilities further extend these capabilities. TU Delft’s Cyberzoo and Swarming Lab provide configurable obstacle-rich environments for long-duration swarm autonomy, resource-aware coordination, and persistent coverage experiments. Related work on swarms of miniature drones highlights the importance of physical testbeds for validating exploration- and mapping-driven planning under severe sensing, computation, and risk-bound constraints.
Outdoor multi-UAV testbeds form a second major category, supporting search-and-rescue, inspection, and large-area coordination tasks. Platforms such as RISCuer and MUAVET enable evaluation of cooperative planning, task allocation, and path execution under realistic constraints including limited battery capacity, communication range, GNSS uncertainty, and airspace restrictions. These testbeds bridge the gap between laboratory-scale validation and safety-critical RCSR deployment.
4.3. Datasets for Risk-Aware and Safety-Critical Path Planning
Publicly available datasets play a central role in evaluating RCSR UAV path-planning, risk-aware navigation, safety-critical SLAM, and obstacle-avoidance algorithms under realistic sensing conditions. These datasets differ in sensing modality, environment type, and intended application, supporting different stages of the risk-calibrated and resource-aware planning pipeline.
As illustrated in
Figure 4, we group influential datasets into five functional categories: (i) geospatial and aerial mapping datasets, (ii) vision-based perception datasets (with red boxes denoting object detection), (iii) SLAM and visual–inertial navigation datasets, (iv) disaster and forest inspection datasets, and (v) high-speed indoor navigation datasets. Sample images extracted from representative real-world UAV datasets are shown in
Figure 5.
4.3.1. Geospatial and Aerial Image Datasets for Global Risk-Aware Path Planning
OpenStreetMap (OSM) [
119] provides open geospatial data (e.g., roads, buildings, land use) widely used to construct realistic urban environments for high-level UAV path planning. Although not a UAV benchmark, it supports generation of georeferenced scenarios for urban routing, multi-UAV coordination, and safety-constrained missions [
120,
121,
122,
123,
124]. Benchmarks such as SAREnv derive standardized search-and-rescue environments directly from OSM layers [
125], enabling risk-aware planning under structured urban constraints. The Massachusetts Road Dataset (MRD) is a key aerial-imagery dataset for road extraction and traversability segmentation, supporting perception-driven RCSR path planning. Beyond mapping, it is widely used to train segmentation models (e.g., U-Net, SegNet, LinkNet, D-LinkNet) that produce semantic cost maps feeding the perception layer of risk-calibrated planning pipelines [
126]. These representations enable planners (e.g., URA* [
127], multi-objective D* Lite [
128], and A*/RRT*-based methods [
127,
129]) to define feasible corridors, encode safety margins, and incorporate environmental risk directly into path selection. pNEUMA provides large-scale drone-recorded urban traffic trajectories with high-resolution motion data for vehicles, pedestrians, and cyclists [
130]. It is widely used for traffic-aware path planning, dynamic obstacle prediction, and risk-sensitive routing in dense urban environments [
131,
132,
133]. Extensions such as pNEUMA Vision [
134] further support integrated perception–planning pipelines for safety-critical RCSR evaluation.
4.3.2. Vision-Based Perception Datasets Supporting Navigation and Obstacle Avoidance
Vision-based datasets play a critical role in developing perception modules that support navigation and obstacle avoidance in RCSR UAV systems. VisDrone [
135] provides large-scale urban aerial imagery for object detection and tracking under diverse conditions and is widely used to generate perception outputs (e.g., detection and tracking) that inform cost maps, dynamic obstacle prediction, and safety-aware trajectory generation. Similarly, UAVDT [
136] offers annotated vehicle-centric aerial data with environmental attributes such as altitude, view angle, and weather, enabling perception-driven traffic monitoring and coordination pipelines that support risk-calibrated planning and multi-UAV routing [
137,
138,
139]. The Stanford Drone Dataset (SDD) [
140] provides detailed trajectories of humans and objects in complex outdoor environments, supporting research in dynamic-scene navigation, social-aware path planning, and prediction-based obstacle avoidance under explicit safety constraints [
141,
142]. In addition, KITTI [
143], although originally collected from ground vehicles, is widely reused for UAV perception tasks such as depth estimation, obstacle detection, and scene understanding, with outputs frequently integrated into navigation, collision avoidance, and local planning modules [
144,
145,
146].
4.3.3. Visual–Inertial and SLAM-Focused Datasets Used in UAV Navigation
Visual–inertial and SLAM-focused datasets provide essential perception and localization inputs for risk-aware path planning in RCSR UAV systems. EuRoC MAV [
147,
148] offers synchronized stereo imagery, IMU data, and precise motion-capture ground truth, making it a standard benchmark for evaluating VIO, SLAM, and pose estimation modules that support safe trajectory generation under uncertainty. The Zurich Urban MAV Dataset [
149] captures UAV flights in dense urban environments, including narrow streets and GPS-denied areas, enabling research in global 3D path planning, skyline-based localization, and safety-constrained navigation in complex outdoor settings. PennCOSYVIO [
150] provides visual–inertial sequences across challenging indoor–outdoor environments with complex geometries and varying lighting, supporting evaluation of VIO-guided navigation, global risk-aware planning, and reactive obstacle avoidance under safety constraints.
4.3.4. Synthetic Datasets and Simulation-Based Benchmarks
Synthetic datasets enable controlled, large-scale evaluation of planning algorithms in RCSR settings, particularly for DRL and multi-UAV systems where real-world experimentation is costly or unsafe. AirSim synthetic worlds [
151] provide photorealistic environments for DRL navigation, obstacle avoidance, and risk-aware policy validation, while MATLAB/Simulink-generated environments support optimization-based planning through geometric safety constraints such as desired and forbidden regions [
66,
67]. Unity-generated environments enable flexible simulation of swarm navigation and collision avoidance under adjustable conditions [
82], whereas Gazebo-based environments [
100] support safety- and resource-aware validation pipelines in structured indoor/outdoor scenarios. OpenUAV [
97] and CityNav-style [
152] datasets provide scalable synthetic environments and trajectory datasets, including vision-language navigation under operational constraints. Additionally, custom synthetic scenarios support specialized evaluations such as spatiotemporal coordination, stochastic risk modeling, and large-scale multi-agent task allocation under safety and resource constraints [
115,
117,
118].
4.3.5. High-Speed, Agile, and Indoor Navigation Datasets
High-speed and indoor navigation datasets support evaluation of agile flight, time-critical planning, and collision avoidance under constrained RCSR conditions. The UZH-FPV Drone Racing Dataset [
153] provides high-speed FPV flights with precise ground truth in cluttered indoor tracks, enabling research in time-optimal planning and DRL-based collision avoidance. The MIT Blackbird Dataset [
154] offers high-frequency multi-camera and IMU data from aggressive quadrotor maneuvers, supporting perception-aware trajectory optimization and high-speed control with calibrated safety margins. Mini-drone indoor datasets [
155,
156] capture navigation in tight indoor spaces such as corridors and warehouses, enabling evaluation of DRL-based obstacle avoidance, waypoint tracking, and constrained navigation.
4.3.6. Disaster, Forest, and Inspection Datasets
Disaster, forest, and inspection datasets support environment-specific evaluation of risk-aware planning in RCSR UAV applications. Disaster and SAR aerial datasets [
157,
158] capture scenarios such as wildfires, collapsed structures, and victim search, enabling research in coverage-based search, uncertainty-aware replanning, and DRL-based search strategies. Forest navigation datasets [
159,
160] provide sequences in dense vegetation, supporting low-altitude obstacle avoidance, navigation through narrow gaps, and environment-aware replanning under perception and safety constraints. DroneDeploy mapping datasets [
161] offer high-resolution orthomosaics and elevation models from real-world environments, enabling evaluation of coverage planning, inspection trajectory design, and altitude-constrained navigation under resource-aware conditions.
Table 4 consolidates representative publicly available UAV datasets by data content, indoor/outdoor coverage, and capture method. Taken together, Figure 4, Figure 5, and Table 4 illustrate how datasets support different layers of the RCSR autonomy pipeline: global routing and map priors, perception modules for detection/tracking, localization/mapping prerequisites, and high-speed reactive navigation under explicit safety and risk constraints.
4.4. Evaluation Methods and Metrics
Evaluation in RCSR UAV path-finding research varies widely depending on each study’s focus—from multi-agent coordination and optimization to autonomy, risk calibration, efficiency, or realism under resource constraints. The reviewed works demonstrate a blend of algorithmic performance metrics, mission-specific indicators, and, in some cases, real-world validation to assess scalability, reliability, and computational feasibility. As summarized in
Figure 6, RCSR UAV path-planning metrics can be categorized into five major groups based on what aspect of performance they measure, while
Table 5 provides concise definitions to support consistent reporting and comparison.
1. Path Optimality Metrics: assess how efficiently UAVs navigate from start to goal, considering distance, trajectory quality, and energy efficiency. Examples include path quality [82], path length [66,67,116,118], and goal distance/distance to goal [66,67].
2. Computational Performance Metrics: measure how efficiently algorithms compute paths and handle system resources. Examples include computation time/runtime [82,100,116,118], computation efficiency [117], memory use [82], and convergence [67,117].
3. Safety and Collision Metrics: evaluate whether UAVs maintain safe separation and avoid obstacles or restricted regions. Examples include collision rate/collision avoidance [100,117], conflict rate [118], and forbidden region (FR) penalty [66,67].
4. Information- and Mission-Based Metrics: evaluate how well UAVs accomplish domain-related goals beyond path geometry. Examples include information gain/collected information [66,67], average waiting distance (AWD) [116], charging count [116], total delay [115], and number of rejections [115].
5. System-Level and Scalability Metrics: capture overall robustness, fairness, adaptability, and ability to scale across environments or agents. Examples include scalability [82,100,118], fairness [115], success rate [100], and transferability [117].
Figure 6.
Categories of Metrics Used in RCSR Path-Finding Simulation.
Table 5.
Definition of key metrics.
| Metric | Definition/Purpose |
|---|---|
| Path Quality | Total cost or efficiency of the planned path (often shortest or smoothest). |
| Path Length | Sum of distances between all consecutive waypoints, reflecting total travel distance. |
| Goal Distance | Euclidean distance between UAV’s final position and target endpoint, indicating path accuracy. |
| Computation Time | Time required to generate a valid or optimal solution. |
| Computation Efficiency | Inverse of average computation time; measures algorithmic responsiveness in dynamic environments. |
| Memory Use | Total memory consumed during the path computation process. |
| Convergence | Number of iterations needed for optimization or learning algorithms to stabilize. |
| Collision Avoidance | Number or rate of UAV collisions in simulation or physical tests. |
| Conflict Rate | Fraction of conflicting or overlapping trajectories among UAV pairs. |
| Forbidden Region Penalty | Quantifies penalties or violations for entering restricted or unsafe zones. |
| Average Waiting Distance | Average distance between UAV and assigned service target during waiting periods (e.g., rescue or delivery missions). |
| Charging Count | Number of visits to charging stations during a mission, indicating operational endurance. |
| Total Delay | Cumulative delay experienced by UAVs due to traffic conflicts or rescheduling. |
| Number of Rejections | Count of denied flight operations resulting from airspace congestion. |
| Scalability | Evaluates system performance as the number of UAVs or tasks increases. |
| Fairness | Assesses the equitable distribution of costs or delays across service providers or agents. |
| Information Gain | Quantifies the amount of sensor data or imagery collected during the mission. |
| Success Rate | Percentage of successful missions or path completions without collisions. |
| Transferability | Measures how well a trained simulation model performs in real-world conditions. |
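Several of the definitions in Table 5 translate directly into short computations. The Python sketch below shows one possible reading of path length, goal distance, conflict rate, and success rate. The function names, the time-aligned trajectory format, and the minimum-separation threshold are assumptions made for illustration, and individual studies may define these metrics differently.

```python
import math
from itertools import combinations


def path_length(waypoints):
    """Sum of Euclidean distances between consecutive waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


def goal_distance(waypoints, goal):
    """Euclidean distance between the final waypoint and the target endpoint."""
    return math.dist(waypoints[-1], goal)


def conflict_rate(trajectories, min_separation):
    """Fraction of UAV pairs whose time-aligned positions ever come closer
    than min_separation (trajectories are assumed to share one time grid)."""
    pairs = list(combinations(trajectories, 2))
    if not pairs:
        return 0.0
    conflicts = sum(
        any(math.dist(p, q) < min_separation for p, q in zip(a, b))
        for a, b in pairs
    )
    return conflicts / len(pairs)


def success_rate(outcomes):
    """Percentage of trials that reached the goal without a collision.
    Each outcome is a (reached_goal, collided) pair of booleans."""
    successes = sum(1 for reached, collided in outcomes if reached and not collided)
    return 100.0 * successes / len(outcomes)


# Hypothetical example data.
path = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
fleet = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(2.0, 0.0), (1.0, 0.5), (0.0, 0.0)],    # passes within 0.5 of the first UAV
    [(0.0, 10.0), (1.0, 10.0), (2.0, 10.0)],
]
trials = [(True, False), (True, True), (False, False), (True, False)]

print(path_length(path))                      # 17.0
print(goal_distance(path, (3.0, 4.0, 10.0)))  # 2.0
print(conflict_rate(fleet, min_separation=1.0))
print(success_rate(trials))                   # 50.0
```

Reporting the separation threshold and the time discretization alongside the conflict rate helps keep results comparable across studies, since both choices change the value obtained.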
4.5. Summary and RCSR Relevance
Across simulation frameworks, testbeds, datasets, and evaluation metrics, the RCSR ecosystem reflects a layered validation pipeline that spans controlled abstraction to real-world deployment. Simulation platforms enable systematic evaluation of risk calibration, safety constraints, and resource trade-offs under controlled and repeatable conditions, while datasets support perception, localization, and uncertainty modeling that directly influence planning decisions. In contrast, physical testbeds expose planners to tightly coupled system effects—including sensing noise, control limitations, communication delays, and environmental disturbances—that are often abstracted in simulation. Evaluation metrics further provide a unified framework for quantifying performance across path optimality, safety, robustness, computational efficiency, and transferability.
From an RCSR perspective, these components collectively support the three core pillars of deployable UAV autonomy. Simulation and synthetic environments enable controlled risk modeling and scalable benchmarking, while high-fidelity and SITL/HIL platforms allow validation of certifiable safety under realistic dynamics and control. Datasets provide measurable uncertainty in perception and mapping that propagates into risk-aware planning, and testbeds verify whether these objectives remain valid under real operational constraints. However, across the reviewed literature, validation remains predominantly simulation-based, with relatively few studies demonstrating comprehensive real-world flight experiments. Many works rely on simulation or intermediate validation stages (e.g., SIL/HIL or indoor testbeds), which, while valuable, do not fully capture operational conditions.
Despite strong performance in simulation, several studies report challenges in real-world deployment, including failures due to perception noise, GPS drift, and unmodeled environmental disturbances. In particular, RL-based planners often struggle under real-world uncertainty and distribution shifts, where assumptions made during training (e.g., ideal sensing or full state observability) no longer hold. Prior work shows that policies trained in simulation can degrade significantly when deployed with real sensor inputs or under out-of-distribution conditions, highlighting a fundamental sim-to-real gap [
162,
163].
This imbalance highlights a critical gap between algorithmic performance and deployable autonomy, underscoring the need for more systematic real-world validation to ensure that risk-aware, certifiably safe, and resource-aware planning objectives hold under practical flight conditions.
5. Uncertainty-Aware Planning and Formal Safety Assurance
Real-world UAV operations face uncertainty that directly impacts both safety and mission performance. Three major sources dominate practical deployments. First, imperfect perception means that onboard sensors (vision, LiDAR, GPS/IMU, radar) provide incomplete, noisy, or delayed information about the vehicle state and surrounding obstacles due to sensing limitations (e.g., field of view, range) and adverse environments (e.g., occlusion, low texture, lighting variation). Second, dynamic environments introduce time-varying obstacles and traversability: people, vehicles, animals, and other UAVs move, while environmental elements such as vegetation and water can change the effective free space over time. Third, stochastic disturbances (wind, turbulence, temperature and humidity effects, precipitation) perturb the UAV dynamics and can also degrade sensing, creating coupled uncertainty sources.
These factors are rarely independent. For example, wind can deflect the UAV state while simultaneously causing motion blur and localization drift, increasing perception uncertainty. As a result, deterministic planners that assume static, fully known maps and disturbance-free dynamics can be unreliable in real deployments, especially when uncertainty sources compound. Modern UAV path-planning therefore integrates probabilistic state estimation, prediction, and constraint-handling mechanisms, and often pairs planning with runtime safety layers to maintain safety under model mismatch. The following subsections summarize the dominant technical approaches used to address each uncertainty source.
Figure 7 represents these categories and the principal methods used to address them.
5.1. Imperfect Perception
A core question in safe planning is where the UAV and surrounding obstacles actually are at any instant. In practice, localization and mapping must be inferred from noisy sensors, motivating probabilistic state estimation pipelines [
164,
165,
166]. Rather than relying solely on a point estimate, many systems maintain a distribution (or at least a covariance) over the vehicle state and propagate this uncertainty into planning decisions.
Accounting for uncertainty magnitude is critical. Ignoring sensor noise, mapping error, and disturbance-induced drift can produce trajectories that appear safe under nominal estimates but are unsafe in the true state [
167]. Accordingly, modern planners incorporate uncertainty through mechanisms such as inflating obstacles, enforcing chance constraints (e.g., bounding collision probability), and planning in belief space [
167,
168], where actions are chosen to both advance toward objectives and reduce future uncertainty.
A widely used backbone for state estimation is Kalman-style filtering, including the extended Kalman filter (EKF) and unscented Kalman filter (UKF) [169]. A standard negative log-likelihood objective for the measurement update can be written, up to an additive constant, as
$$J_k = \tfrac{1}{2}\,(z_k - \hat{z}_k)^\top S_k^{-1}\,(z_k - \hat{z}_k) + \tfrac{1}{2}\ln\det S_k,$$
where $z_k$ is the measurement at time step $k$, $\hat{z}_k$ is the predicted measurement mean, and $S_k$ is the residual covariance. The EKF uses Jacobian linearization and is typically appropriate when local linear approximations are adequate, while the UKF uses sigma points to better capture nonlinearities, often at higher computational cost.
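As a minimal illustration of the measurement update and its Gaussian negative log-likelihood, the following Python sketch handles the scalar, linear case. The function name `kalman_update` and all numerical values are assumptions chosen for illustration and are not taken from the cited works; an EKF would replace the constant gain `h` with a Jacobian evaluated at the predicted state.

```python
import math


def kalman_update(x_pred, p_pred, z, h, r):
    """Scalar Kalman measurement update.

    x_pred, p_pred: predicted state mean and variance.
    z: measurement; h: measurement gain (z ~ h * x); r: sensor noise variance.
    Returns the posterior mean, posterior variance, and measurement NLL.
    """
    z_hat = h * x_pred                  # predicted measurement mean
    s = h * p_pred * h + r              # residual (innovation) variance
    residual = z - z_hat
    gain = p_pred * h / s               # Kalman gain
    x_post = x_pred + gain * residual
    p_post = (1.0 - gain * h) * p_pred
    # Gaussian negative log-likelihood of the measurement, constant included.
    nll = 0.5 * (residual ** 2 / s + math.log(2.0 * math.pi * s))
    return x_post, p_post, nll


x_post, p_post, nll = kalman_update(x_pred=10.0, p_pred=4.0, z=12.0, h=1.0, r=1.0)
print(x_post, p_post, nll)
```

A larger residual relative to the residual variance yields a larger negative log-likelihood, which is the quantity that residual-based planning costs penalize.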
Uncertainty-aware planning also depends on estimating the environment. Probabilistic mapping methods such as occupancy grid mapping [
170] have been extended with multi-sensor fusion (IMU, GPS, LiDAR, cameras) to improve robustness under partial observability [
171]. These maps then support downstream cost fields and collision checking for planning.
Belief-space planning extends classical planning by optimizing over distributions. A common objective minimizes a scalar function of covariance across a horizon, for example its trace:
$$\min_{u_{0:T-1}} \; \sum_{k=1}^{T} \operatorname{tr}(\Sigma_k),$$
where $\Sigma_k$ is the predicted state covariance at time $k$, $T$ is the planning horizon, and $u_{0:T-1}$ is the control sequence. More recent belief-space formulations explicitly incorporate measurement informativeness and sensing conditions [172]. One representative form replaces covariance-only penalties with residual-based costs:
$$\min_{u_{0:T-1}} \; \sum_{k=1}^{T} \nu_k^\top S_k^{-1}\,\nu_k,$$
with
$$\nu_k = z_k - \hat{z}_k,$$
where $\nu_k$ is the measurement residual and $S_k$ is the residual covariance. In this view, routes that preserve measurement quality (e.g., maintaining informative visual features rather than flying through low-light or texture-poor regions) may be preferred over purely shortest-distance paths.
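The preference for measurement-preserving routes can be illustrated with a small Python sketch that scores two hypothetical routes by path length plus a covariance penalty. It uses a scalar Kalman recursion, so the covariance reduces to a single variance; the route data, the weight, and the function names `covariance_cost` and `route_cost` are all assumptions made for illustration.

```python
def covariance_cost(p0, process_var, meas_vars):
    """Sum of posterior variances along a route (scalar Kalman recursion)."""
    p, total = p0, 0.0
    for r in meas_vars:
        p_pred = p + process_var          # prediction step inflates uncertainty
        p = p_pred * r / (p_pred + r)     # measurement update shrinks it
        total += p
    return total


def route_cost(length, meas_vars, weight=2.0, p0=1.0, process_var=0.4):
    """Path length plus a weighted uncertainty penalty (weight is assumed)."""
    return length + weight * covariance_cost(p0, process_var, meas_vars)


# Hypothetical routes: the short one crosses a texture-poor region where
# visual measurements are very noisy (variance 25 instead of 0.5).
short_cost = route_cost(8.0, [0.5, 25.0, 25.0, 25.0, 0.5])
long_cost = route_cost(10.0, [0.5] * 6)
print(f"short route: {short_cost:.2f}, long route: {long_cost:.2f}")
```

Under these assumed values the longer route obtains the lower total cost, because the uncertainty accumulated in the poorly observed region outweighs the extra distance.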
5.2. Dynamic Environments
Even with accurate state estimation, real-world UAVs operate in environments where obstacles and traversability change over time. Humans, vehicles, animals, and other UAVs can enter or exit the flight corridor, while environmental elements such as foliage and rocks can alter the effective free space. Modern approaches commonly address dynamic environments using a combination of (i) belief-space reasoning over obstacle states, (ii) receding-horizon re-planning via model predictive control (MPC), (iii) explicit obstacle prediction, and (iv) chance-constrained safety envelopes.
Belief-space methods extend naturally to dynamic obstacles by maintaining uncertainty over obstacle states (position, velocity, intent) and penalizing proximity according to uncertainty and predicted motion [
168]. In practice, predictable entities are assigned tighter safety margins than uncertain or erratic ones, reflecting different risk levels in the planning objective.
A second widely used mechanism is model predictive control (MPC), which repeatedly solves a short-horizon planning or trajectory-optimization problem using the latest state estimate and environment model, executes only the first control action, then re-plans after receiving new measurements [
173,
174]. This receding-horizon structure enables rapid response to changes but requires balancing re-planning frequency against onboard compute limits and sensing latency. MPC is particularly effective in cluttered or densely populated scenes, where stale plans quickly become invalid.
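The receding-horizon structure can be sketched in a few lines of Python. The example below is a deliberately simplified one-dimensional illustration: exhaustive enumeration over a small control set stands in for a real trajectory optimizer, and the control set, horizon, time step, and goal schedule are all assumed values.

```python
from itertools import product

CONTROLS = (-1.0, 0.0, 1.0)   # admissible velocity commands (assumed)
HORIZON = 3                   # planning horizon in steps (assumed)
DT = 1.0                      # time step in seconds (assumed)


def plan(position, goal):
    """Return the control sequence minimizing summed distance to the goal."""
    def cost(sequence):
        pos, total = position, 0.0
        for u in sequence:
            pos += u * DT
            total += abs(pos - goal)
        return total
    return min(product(CONTROLS, repeat=HORIZON), key=cost)


def run(goal_at, steps=8, position=0.0):
    """Receding-horizon loop: re-plan every step, execute only the first action."""
    trajectory = []
    for k in range(steps):
        goal = goal_at(k)                      # latest information about the target
        first_action = plan(position, goal)[0]
        position += first_action * DT          # remaining actions are discarded
        trajectory.append(position)
    return trajectory


# The goal jumps from 5 to 2 at step 3, e.g., because a corridor became blocked.
trajectory = run(lambda k: 5.0 if k < 3 else 2.0)
print(trajectory)
```

Because only the first action of each plan is executed, the vehicle reverses direction as soon as the goal changes instead of completing a stale plan.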
Explicit prediction of dynamic obstacles produces forecasts of obstacle occupancy over a time horizon. A common pipeline first classifies an observed entity (e.g., pedestrian, vehicle, UAV) and then applies an appropriate motion model (e.g., constant-velocity/acceleration) with filtered state estimates. Kalman-style predictors are frequently used for this purpose by leveraging the same residual/covariance machinery as in (19). More advanced predictors use learned models (e.g., recurrent networks or Gaussian processes) to capture nonlinear or context-dependent motion patterns, often improving empirical performance in crowded scenes.
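A constant-velocity predictor with covariance propagation can be sketched as follows for a single axis. The function name, the noise level, and the initial values are assumptions for illustration; the covariance recursion is the standard $P \leftarrow F P F^\top + Q$ written out for a two-state (position, velocity) model with process noise on velocity only.

```python
import math


def predict_constant_velocity(pos, vel, var_pos, cov_pos_vel, var_vel,
                              vel_noise_var, dt, steps):
    """Forecast 1-D obstacle position mean and standard deviation.

    State is (position, velocity) with transition F = [[1, dt], [0, 1]].
    Returns a list of (predicted position, position standard deviation).
    """
    forecast = []
    for _ in range(steps):
        pos = pos + vel * dt
        # P <- F P F^T + Q, expanded element by element for the 2x2 case.
        new_var_pos = var_pos + 2.0 * dt * cov_pos_vel + dt * dt * var_vel
        new_cov = cov_pos_vel + dt * var_vel
        new_var_vel = var_vel + vel_noise_var
        var_pos, cov_pos_vel, var_vel = new_var_pos, new_cov, new_var_vel
        forecast.append((pos, math.sqrt(var_pos)))
    return forecast


# Hypothetical pedestrian at 0 m moving at 1.5 m/s, forecast over 2 s.
forecast = predict_constant_velocity(0.0, 1.5, 0.04, 0.0, 0.01, 0.01, 0.5, 4)
for position, std in forecast:
    print(f"position {position:.2f} m, std {std:.3f} m")
```

The growing standard deviation is what allows a planner to widen its safety margin for predictions further into the future.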
Other strategies for handling dynamic obstacles include sensor-aware planning, which keeps dynamic obstacles within the sensor field of view so that their motion can be monitored [175], and sensor training optimization, which streamlines object recognition for specific sensors so that it is both accurate and computationally efficient enough for real-time operation [176].
Finally, chance-constrained planning encodes safety requirements probabilistically, enforcing constraints such as collision probability below a threshold rather than purely geometric clearance. Chance constraints are most commonly applied as an added safety layer on top of belief-space planning or MPC, where spatial margins adapt based on obstacle uncertainty, classification, and predicted motion. This probabilistic framing provides a principled way to trade path efficiency against risk in time-varying environments.
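A one-dimensional chance constraint can be made concrete with the sketch below, which assumes that the distance to an obstacle is Gaussian with known mean and standard deviation. This Gaussian assumption, the function names, and the numerical values are illustrative and are not drawn from the cited works.

```python
from statistics import NormalDist


def collision_probability(mean_dist, std_dist, safety_radius):
    """P(distance < safety_radius) for a Gaussian distance estimate."""
    return NormalDist(mean_dist, std_dist).cdf(safety_radius)


def required_clearance(std_dist, safety_radius, max_risk):
    """Smallest mean distance that keeps collision probability <= max_risk."""
    z = NormalDist().inv_cdf(1.0 - max_risk)   # standard normal quantile
    return safety_radius + z * std_dist


# A well-tracked obstacle versus an erratic one, both with 1% allowed risk.
tight = required_clearance(std_dist=0.5, safety_radius=1.0, max_risk=0.01)
loose = required_clearance(std_dist=1.5, safety_radius=1.0, max_risk=0.01)
print(f"clearance: {tight:.2f} m (predictable), {loose:.2f} m (erratic)")
```

The required clearance scales with the standard deviation of the estimate, so the spatial margin adapts to obstacle uncertainty rather than being a fixed geometric buffer.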
5.3. Stochastic Disturbances
The third category of uncertainty sources is stochastic disturbances: random or otherwise unpredictable effects that alter the UAV’s motion, sensing, or surroundings. “Stochastic” here implies that these effects are probabilistic and are therefore best represented using probability distributions. Because UAVs must react to such effects in real time, planners must be able to adjust to them. The most common examples of disturbances, along with popular ways to mitigate them, are addressed below.
The most conceptually familiar form of disturbance is one that directly affects the UAV’s motion in flight and thus its trajectory to the destination. Wind gusts and turbulence are the most typical and critical motion disturbances, as a sudden gust can change not only the position of the UAV but also its heading [177]. Even weaker but persistent turbulence can gradually pull the actual trajectory away from the planned one. Other less common but still important motion disturbances include motor and sensor noise. As with turbulence, incorrect position or bearing readings can cause deviation from the planned path.
There are several measures taken to mitigate motion disturbances, but the most common approach is to introduce additional parameters into belief-space calculation to account for the potential impact of each motion effect. Under specific circumstances, these parameters may be computed using sophisticated approaches that take into account the characteristics of the UAV and the expected wind direction/magnitude. However, when this computation is too expensive or unavailable, the more general-purpose approach is to employ a standard distribution to model the disturbance. The most common of these relies on a traditional Gaussian process [
178,
179], which is normally modeled as
where
w and
are preset values or computed using basic information about the UAV and environment (UAV mass, expected wind gust speeds, etc.).
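The effect of a Gaussian disturbance model on trajectory deviation can be explored with a short Monte Carlo sketch. The version below makes a strong simplifying assumption: the cross-track wind velocity at each step is an independent Gaussian sample with a preset mean and standard deviation, which is simpler than a full Gaussian process with temporal correlation. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_drift(wind_mean, wind_std, dt, steps, trials, seed=0):
    """Sample cross-track drift caused by i.i.d. Gaussian wind velocity."""
    rng = random.Random(seed)              # fixed seed for repeatability
    finals = []
    for _ in range(trials):
        offset = 0.0
        for _ in range(steps):
            offset += rng.gauss(wind_mean, wind_std) * dt
        finals.append(offset)
    return statistics.mean(finals), statistics.pstdev(finals)


mean_drift, std_drift = simulate_drift(
    wind_mean=0.5, wind_std=2.0, dt=0.1, steps=20, trials=2000)
# Closed-form values under the same assumptions, for comparison:
# mean = steps * dt * wind_mean, std = sqrt(steps) * dt * wind_std.
print(f"mean drift {mean_drift:.2f} m, std {std_drift:.2f} m")
```

The sampled spread of the final offset is the kind of quantity that can be fed into belief-space planning as additional process noise.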
A second category of disturbances involves the UAV’s sensors. Here, disturbances reduce the UAV’s ability to reliably collect information about its position and environment. Examples include atmospheric effects such as humidity, which can degrade virtually any sensor type (visual, electronic, pressure-based, etc.). Effects specific to individual sensors are also a concern, including electronic noise, which disproportionately affects LiDAR and GPS-based sensors, and photonic or light-based noise, which disproportionately affects sensors reliant on conventional imaging of the environment [179].
As sensor noise effects are highly varied and affect different sensor types differently, there is no single solution to these disturbances. Mitigation may be preventative (deployed during the planning phase of flight) or ameliorative (deployed in real time to account for diminished sensor reliability). The former approach might include additional parameters in belief-space planning that incorporate sensor disturbances. For example, a path that passes through a region with known electronic noise or interference would carry a heavy objective function penalty. Real-time mitigation might instead reduce the weight attached to sensors known to be operating at diminished capacity. For example, bright light or glare may reduce the weight given to traditional visual sensors mid-flight in comparison to GPS-based sensors.
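Real-time sensor re-weighting can be illustrated with inverse-variance fusion, one simple way (among several) to combine position estimates. In this sketch, glare is modeled by inflating the variance assigned to the visual estimate; the inflation factor, the sensor values, and the function name `fuse` are assumptions.

```python
def fuse(estimates):
    """Inverse-variance fusion of (value, variance) pairs.

    Returns the fused value and its variance.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


vision = (10.2, 0.04)    # precise under nominal lighting (assumed values)
gps = (11.0, 1.0)        # coarser but unaffected by glare (assumed values)

nominal, _ = fuse([vision, gps])
# Under glare, inflate the vision variance by an assumed factor of 100.
glare, _ = fuse([(vision[0], vision[1] * 100.0), gps])
print(f"nominal: {nominal:.2f}, under glare: {glare:.2f}")
```

With the inflated variance, the fused estimate moves toward the GPS reading, mirroring the mid-flight down-weighting of visual sensors described above.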
The last category of disturbances comprises those that affect the UAV’s surroundings, as opposed to the UAV or its sensors. These are most often referred to as environmental stochastic disturbances and are among the most difficult to handle in practice. Examples include variable surfaces, such as bodies of water that shift in response to wind, and terrain elements that move in response to stimuli, such as tree branches disturbed by a nearby entity.
While environmental disturbances overlap with the dynamic environments discussed previously, the term disturbance generally indicates a phenomenon that cannot be predicted by projecting a path of motion. For that reason, when such disturbances are incorporated into path planning under the belief-space model, they are far more likely to be represented by parameters with uncertainty, such as those in (25), or handled with stochastic model predictive control (SMPC) [180]. In brief, SMPC is an approach to real-time UAV control that incorporates uncertain parameters into the belief space while also adjusting the re-planning interval to account for proximity to disturbances. For example, a UAV traveling close to a water surface may re-plan its flight path much more frequently than one traveling in open space.
Another recent approach to environmental disturbances applies a corrective strategy during real-time planning. Prominent among these is the backstepping sliding mode method [181], which seeks to approximate unknown parameters attached to environmental disturbances and produce a so-called anti-interference link that counteracts the impact of the external environment on the UAV’s projected flight path. The goal is to produce, over time, an adaptive correction that stabilizes UAV flight without prior knowledge of the disturbance parameters or effects.
5.4. Summary and RCSR Relevance
There are three principal uncertainty sources in UAV path planning: imperfect perception, dynamic environments, and stochastic disturbances. Imperfect perception arises from noisy and limited sensors, requiring probabilistic state estimation methods such as EKF and UKF, together with belief-space planning, to propagate uncertainty into planning decisions. Dynamic environments, where obstacles and traversability change over time, can invalidate otherwise effective flight paths and are commonly addressed through receding-horizon replanning via MPC, obstacle prediction models, and chance-constrained safety envelopes. Stochastic disturbances include wind and turbulence affecting motion, atmospheric and electronic effects degrading sensors, and unpredictable environmental changes such as shifting water surfaces. These are often mitigated through Gaussian disturbance models, real-time sensor re-weighting, and stochastic MPC that adapts replanning rates to nearby disturbance sources.
Across all three categories, a recurring trade-off is that making the planner more uncertainty-aware generally improves safety and robustness but also increases computational and sensing burden on platforms with limited onboard resources. In addition, not all solutions apply equally well to every uncertainty source, particularly in the case of stochastic disturbances, which are often too broad and varied to be addressed by a single universal approach.
Table 6 provides a consolidated summary of the principal uncertainty sources, the trade-offs they introduce, and representative mitigation strategies.
These uncertainty sources correspond directly to the core pillars of the RCSR framework. Risk calibration depends on probabilistic state estimation and belief-space planning, since without quantified uncertainty, risk cannot be meaningfully computed or enforced. Certifiable safety is supported by chance-constrained planning and runtime safety layers that help manage dynamic environments and stochastic disturbances through formal probabilistic guarantees such as bounded collision probability. Resource awareness is relevant across all three uncertainty categories, because every improvement in uncertainty handling, including richer state distributions, more frequent MPC replanning, or disturbance parameter estimation, tends to increase computational and sensing demands on limited onboard platforms. Finally, the interdependence of these uncertainty sources reinforces the central premise of the RCSR framework: uncertainty, safety, and resource constraints must be addressed jointly within an integrated planning architecture.
6. Real-World Environment Constraints
We previously examined how path planners can address uncertainty through probabilistic state estimates to counter imperfect perception, dynamic obstacle prediction, and stochastic disturbance modeling. Importantly, these techniques do not operate in isolation. Each uncertainty-aware method must ultimately contend with the practical realities of deployment: the sensors that feed state estimators have finite range and resolution, the onboard processors running belief-space optimization have limited compute budgets, energy constraints bound how long and how aggressively a UAV can maneuver, and airspace regulations limit where a UAV can legally fly. In other words, the counters to uncertainty developed previously are only as strong as the real-world resources and operational boundaries that support them. These relationships are often circular: unreliable perception degrades the accuracy of maps produced during flight, and poor map quality in turn increases reliance on real-time sensing to compensate for unreliable information.
We must therefore consider not just the mathematical models underpinning planning under uncertainty but also the engineering and regulatory constraints that shape what planners can actually execute in practice. This section focuses on the latter: UAVs must operate with mapping constraints and limits, time-varying disturbances, strict energy and onboard compute budgets, and evolving airspace rules. These constraints are interdependent: mapping limitations can increase risk margins and energy use; wind and turbulence can amplify localization error; and regulatory compliance can restrict feasible corridors, thereby increasing planning complexity. This section reviews four practical constraint dimensions: (i) perception and mapping, (ii) environmental effects and resource limits, (iii) robustness and adaptation, and (iv) airspace and regulatory compliance, together with mitigation strategies reported in the literature.
6.1. Perception and Mapping Constraints
Perception and mapping underpin real-world UAV planning because representation fidelity and update rates directly influence feasibility, optimality, and safety. Accurate maps support obstacle avoidance and enable energy-aware and certifiably safe trajectory generation under limited onboard resources.
6.1.1. Occupancy Grids and Their Limits
Occupancy grid mapping remains widely used because of its probabilistic formulation and computational tractability. The environment is discretized into voxels $m_i$, each assigned an occupancy probability $p(m_i \mid z_{1:t})$ updated via Bayesian filtering:
$$p(m_i \mid z_{1:t}) = \frac{p(z_t \mid m_i)\, p(m_i \mid z_{1:t-1})}{p(z_t \mid z_{1:t-1})},$$
where $z_t$ is the observation at time $t$. Such representations have been used in applications ranging from autonomous pollination [182] to cooperative UAV–UGV navigation under degraded perception [183]. However, occupancy grids impose a resolution trade-off: coarse grids can yield overly conservative routes, whereas fine grids increase memory and computational demands that may exceed the capabilities of small UAV platforms.
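In practice, the Bayesian occupancy update is often implemented in log-odds form, which turns the recursion into an addition per observation. The sketch below assumes an inverse sensor model that reports an occupancy probability for each observation; the values 0.7 (hit) and 0.4 (miss) and the function names are illustrative assumptions.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def update_cell(log_odds, p_occupied_given_z, prior=0.5):
    """Add the evidence from one observation to a cell's log-odds."""
    return log_odds + logit(p_occupied_given_z) - logit(prior)


def probability(log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


cell = logit(0.5)                      # unknown cell starts at the prior
for p_hit in (0.7, 0.7, 0.7):          # three returns suggesting "occupied"
    cell = update_cell(cell, p_hit)
after_hits = probability(cell)
cell = update_cell(cell, 0.4)          # one pass-through suggesting "free"
after_miss = probability(cell)
print(f"after hits: {after_hits:.3f}, after miss: {after_miss:.3f}")
```

Storing one log-odds value per voxel keeps the per-observation cost constant, although the total memory still grows with grid resolution, which is the trade-off noted above.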
6.1.2. Distance Fields and Local Replanning
To address the limitations of grid resolution, continuous distance-field representations are commonly used for local planning and trajectory optimization. Truncated signed distance fields (TSDFs) and Euclidean signed distance fields (ESDFs) encode the distance from a query point $x$ to the nearest obstacle:
$$d(x) = \min_{o \in \mathcal{O}} \lVert x - o \rVert_2,$$
where $\mathcal{O}$ is the obstacle set. ESDFs provide smooth gradients that can be directly exploited by optimization-based planners to generate dynamically feasible, collision-free trajectories. These fields are especially valuable in cluttered or dynamic environments, where fast local replanning must respond to evolving obstacle boundaries. Recent work has also integrated semantic segmentation into mapping pipelines to distinguish traversable regions from semantically meaningful obstacles such as vegetation, buildings, and restricted structures [184].
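The role of the distance-field gradient can be illustrated with a brute-force sketch over point obstacles. It is not an ESDF implementation: real systems precompute the field on a voxel grid and handle signed distances, whereas this example evaluates an unsigned distance directly and estimates the gradient by central differences. The obstacle coordinates and function names are assumptions.

```python
import math


def distance(x, obstacles):
    """Unsigned Euclidean distance from point x to the nearest obstacle point."""
    return min(math.dist(x, o) for o in obstacles)


def gradient(x, obstacles, eps=1e-5):
    """Central-difference gradient of the distance field at x."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((distance(hi, obstacles) - distance(lo, obstacles)) / (2 * eps))
    return grad


obstacles = [(0.0, 0.0), (4.0, 0.0)]
query = (1.0, 0.0)
g = gradient(query, obstacles)
# Stepping along the gradient moves the query point away from the nearest obstacle.
pushed = tuple(q + 0.5 * gi for q, gi in zip(query, g))
print(distance(query, obstacles), g, distance(pushed, obstacles))
```

An optimization-based planner uses the same gradient direction inside its cost function to push trajectory points toward larger clearance.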
In many practical planning systems, obstacles are still represented using simplified geometric abstractions such as spheres, boxes, or occupancy voxels because these models are computationally efficient and integrate naturally with collision checking. However, real-world environments often contain obstacles with irregular geometry and semantic meaning, such as trees, buildings, poles, or restricted infrastructure, which may imply different levels of risk and different operational constraints. Incorporating semantic information into planning allows obstacle classes to be associated with class-specific safety margins, traversal penalties, or exclusion zones, while richer geometric representations such as meshes, signed distance fields, and semantic occupancy maps can better capture complex obstacle boundaries. These capabilities are especially important in cluttered urban and natural environments, where safe and deployable planning depends not only on obstacle location but also on obstacle type, structural complexity, and operational meaning. Hybrid pipelines, such as B-spline trajectory refinement layered over occupancy maps, further improve smoothness and robustness in the presence of dynamic obstacles [
185].
6.1.3. Sensor Modality and Compute Constraints
Mapping choices are shaped by sensor payload, power availability, and onboard compute capacity. LiDAR provides accurate 3D structure but is heavier and more power intensive than camera-only configurations. Visual–inertial odometry and monocular depth estimation offer lightweight alternatives, but they are more sensitive to lighting, texture, and motion blur. Event-based cameras can provide low-latency perception in high-dynamic-range or low-light conditions, thereby supporting agile flight. Multi-modal fusion of LiDAR, camera, and inertial sensing is increasingly used to mitigate single-sensor failures in complex terrain [
186].
6.1.4. Scalability and Adaptive Mapping
Scalable mapping remains a bottleneck for long-duration missions and dense 3D environments. Adaptive mapping strategies adjust voxel resolution, update frequency, or region-of-interest updates to balance representation fidelity against limited CPU/GPU budgets while preserving real-time feasibility on embedded platforms [
13]. Recent surveys emphasize the need to evaluate perception constraints jointly with environment complexity, resource limits, and certification requirements [
14]. Distributed mapping across multiple UAVs has also been explored, although bandwidth, synchronization, and regulatory acceptance remain open challenges.
6.2. Environmental Effects and Resource Limits
Real-world UAV deployments are constrained by both environmental disturbances and limited onboard resources. Wind, turbulence, and adverse weather perturb nominal trajectories and typically increase energy expenditure, motivating disturbance-aware models and compensation strategies. For example, Fan et al. [
59] incorporated dynamic wind and extreme weather effects into swarm planning to reduce mission failures, while learning-based approaches can predict disturbance patterns and adapt trajectories online [
37].
Energy limitations are a dominant constraint because UAV endurance depends on payload, climb rate, and maneuver aggressiveness. Energy-aware planning often incorporates explicit consumption models to ensure safe return-to-launch or diversion to a contingency landing site. A common formulation expresses energy use over a flight of duration $t_f$ as
$$E = \int_{0}^{t_f} P(t)\,dt, \qquad P(t) = T(t)\,v(t),$$
where $P(t)$ is instantaneous power, $T(t)$ is thrust, and $v(t)$ is velocity. Surveys emphasize embedding such models into planning costs to maintain energy-efficient operation in complex environments [187,188]. This consideration is especially relevant for delivery missions, where recharge or battery-swap stops may need to be integrated into long-range routing [187].
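An energy model of this kind can be evaluated numerically from sampled flight data. The sketch below assumes that instantaneous power is approximated as thrust times speed and integrates it with the trapezoidal rule; the reserve check, its 20% threshold, and all sample values are hypothetical.

```python
def energy_used(times, thrusts, speeds):
    """Trapezoidal integral of P(t) = thrust(t) * speed(t), in joules."""
    powers = [f * v for f, v in zip(thrusts, speeds)]
    return sum(
        0.5 * (powers[i] + powers[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )


def has_reserve(battery_j, used_j, return_cost_j, reserve_fraction=0.2):
    """True if returning to launch still leaves the required battery reserve."""
    return battery_j - used_j - return_cost_j >= reserve_fraction * battery_j


times = [0.0, 10.0, 20.0, 30.0]       # s
thrusts = [15.0, 15.0, 18.0, 15.0]    # N
speeds = [0.0, 8.0, 10.0, 8.0]        # m/s
used = energy_used(times, thrusts, speeds)
print(used, has_reserve(battery_j=10000.0, used_j=used, return_cost_j=3000.0))
```

A planner can call such a check at each replanning step and trigger return-to-launch or diversion when it fails.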
Compute constraints further limit algorithmic choices. Many UAVs rely on embedded processors, such as Jetson-class platforms, which cannot support frequent global replanning using heavy nonlinear optimization or large-scale reinforcement learning. Consequently, hierarchical architectures are common: lightweight global planners provide coarse, constraint-respecting routes, while local modules refine trajectories in real time [
3,
15]. Offloading and distributed strategies allow swarms to share compute resources or use edge/cloud servers for updates [
38], but they introduce latency, bandwidth dependence, and additional safety concerns, particularly in urban or contested airspace [
189].
6.3. Robustness and Adaptation Strategies
Robust path planning requires mechanisms that adapt to uncertainty, disturbances, and unmodeled changes during flight. Offline plans can become unsafe or inefficient when exposed to real-world variability, thereby motivating online replanning and adaptation strategies that explicitly respect safety and resource constraints.
A major direction combines adaptive mapping with hierarchical planning. Frontier-based exploration can adjust voxel-map resolution and update rates in response to compute load, maintaining real-time feasibility without sacrificing critical environmental detail [
190]. Similarly, hybrid planners decompose the problem into a global planner and a local replanner that reacts to dynamic regions [
3].
Disturbance-aware planning further improves robustness. Real-time wind estimation integrated into optimization reduces deviation and energy use [
59]. Reinforcement learning can complement model-based methods by learning policies that compensate for disturbances during execution [
37]. At the swarm level, adaptive coordination is used to maintain safety and mission progress under multiple threats, including weather, interference, and adversarial conditions [
49].
Regulatory and safety compliance during adaptation is often enforced through runtime monitors that impose constraints such as no-fly zones, Remote ID requirements, minimum battery reserves, and safe separation. Runtime assurance architectures enable switching to certified fallback controllers when the nominal planner fails or resource limits are exceeded [
191].
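The switching logic of a runtime monitor can be sketched as a pure function over a status snapshot. The field names, thresholds, and the two-controller structure below are assumptions for illustration and do not represent a certified design or the architecture of any cited system.

```python
from dataclasses import dataclass


@dataclass
class Status:
    battery_fraction: float     # remaining battery, 0.0 to 1.0
    min_separation_m: float     # distance to the closest other aircraft
    inside_no_fly_zone: bool
    planner_ok: bool            # nominal planner returned a valid plan in time


def select_controller(status, min_battery=0.25, min_separation_m=5.0):
    """Return ('nominal' or 'fallback', list of violated constraints)."""
    violations = []
    if status.battery_fraction < min_battery:
        violations.append("battery reserve")
    if status.min_separation_m < min_separation_m:
        violations.append("separation")
    if status.inside_no_fly_zone:
        violations.append("no-fly zone")
    if not status.planner_ok:
        violations.append("planner failure")
    return ("fallback" if violations else "nominal"), violations


print(select_controller(Status(0.60, 12.0, False, True)))
print(select_controller(Status(0.20, 3.0, False, True)))
```

Keeping the monitor free of internal state and independent of the nominal planner makes it easier to test exhaustively, which is one motivation for runtime assurance designs.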
Machine learning also supports robust adaptation through multi-agent learning and meta-heuristic strategies that generalize across environments and reallocate tasks under uncertainty [
38,
192]. Recent approaches such as RAPID incorporate robust reward design and inverse reinforcement learning to improve safety under distribution shift [
56]. Overall, robustness emerges from combining online replanning, disturbance compensation, compliance monitoring, and learning-based adaptation in a unified architecture.
6.4. Airspace and Regulatory Constraints
UAV path planning must satisfy airspace regulations that constrain altitude, geography, access permissions, and allowable modes of operation. No-fly zones (NFZs), geofencing boundaries, and controlled airspace classifications impose hard feasibility constraints that planners must respect to ensure both legality and operational safety. Earlier approaches often modeled these restrictions as static forbidden regions, whereas more recent work has emphasized regulation-aware planning strategies that update constraints during flight and incorporate them directly into decision-making [
193].
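A static no-fly-zone constraint can be checked with a standard ray-casting point-in-polygon test, as in the sketch below. It works in a flat two-dimensional coordinate frame and tests waypoints only; checking the segments between waypoints, altitude limits, and geodetic coordinates are omitted. The zone geometry and function names are hypothetical.

```python
def inside(point, polygon):
    """Ray-casting test: True if point lies inside the polygon."""
    x, y = point
    result = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge crosses the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                            # crossing lies to the right
                result = not result
    return result


def violating_waypoints(route, no_fly_zones):
    """Return the waypoints of a route that fall inside any no-fly zone."""
    return [p for p in route if any(inside(p, zone) for zone in no_fly_zones)]


zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]   # hypothetical NFZ
route = [(-2.0, 2.0), (2.0, 2.0), (5.0, 2.0)]
print(violating_waypoints(route, [zone]))
```

For regulation-aware planning with in-flight updates, the list of zones would be refreshed from an external source and the check repeated at each replanning step.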
Regulatory requirements also vary substantially across jurisdictions, which complicates standardized planning, validation, and certification [
194]. In some settings, the primary emphasis remains on altitude limits, visual line-of-sight rules, and restricted-zone avoidance. In others, growing attention is being directed toward emerging operational requirements such as UAS Traffic Management (UTM) and the integration of UAVs into urban air mobility corridors [
195]. These developments are especially important for large-scale and shared-airspace operations, where planners must account not only for static restrictions but also for coordinated traffic management and evolving operational policies.
Digital low-altitude airspace infrastructures increasingly combine geofencing, obstacle mapping, traffic coordination, and policy enforcement to support scalable UAV deployment in urban environments [
18,
196]. Within this context, Remote ID and UTM-related mechanisms are becoming central to practical UAV operations because they support identification, traceability, conflict management, and compliance monitoring during flight. As a result, planners must be capable of responding to dynamic regulatory updates in near real time rather than relying solely on precomputed constraint maps. Palmerius et al. [
197] illustrated this shift through route planning in flexible airspace designs where regulatory constraints are embedded directly into mission planning.
Risk-aware planning models further extend this perspective by incorporating regulatory compliance into optimization objectives through estimates of collision risk, communication reliability, and violation probability [
198]. Urban airspace monitoring and infrastructure-aware planning likewise support routine UAV operations under these constraints [
185]. In parallel, safety architectures that maximize throughput while enforcing regulatory limits suggest that efficiency and compliance can be optimized jointly rather than treated as competing objectives [
199]. Overall, these developments show that deployable UAV path planning must address not only collision-free navigation but also real-time compliance, traceability, and scalable coordination under evolving airspace management requirements.
6.5. Summary and RCSR Relevance
Real-world UAV path planning is shaped not only by algorithmic capability but also by sensing fidelity, environmental disturbances, resource availability, adaptive robustness, and regulatory compliance. Perception and mapping determine how accurately the environment can be represented, while wind, energy, and compute limitations constrain what can be executed safely in practice. Robustness and adaptation mechanisms help planners remain effective under uncertainty, and airspace regulations define the legal and operational boundaries of deployment.
These practical considerations introduce recurring trade-offs across the literature. Higher-fidelity perception and richer environmental awareness can improve safety and planning accuracy, but they often increase memory, energy, and computational demands. Similarly, more adaptive and learning-enabled methods can enhance responsiveness under uncertainty, yet they are typically harder to validate and certify for safety-critical deployment.
Table 7 provides a consolidated summary of the principal real-world constraints affecting UAV path planning, the trade-offs they introduce, and the representative mitigation strategies adopted in prior work.
Together, these constraints correspond directly to the core pillars of the RCSR framework. Risk calibration depends on accurate perception, environment modeling, and disturbance awareness, since uncertainty in sensing and operating conditions directly affects how risk can be estimated and managed. Certifiable safety is reflected in the need for robust adaptation, runtime monitoring, fallback mechanisms, and compliance with operational constraints such as no-fly zones, Remote ID, and UTM-related requirements. Resource awareness remains central because improvements in mapping fidelity, replanning frequency, and adaptive capability often increase onboard computational load, sensing demands, and energy consumption. Finally, the interaction among perception limits, environmental effects, adaptation needs, and regulatory constraints reinforces the central premise of the RCSR framework: deployable UAV planners must be designed not only for nominal optimality but also for resilience, safety, efficiency, and regulatory compatibility in real operational environments.
7. Future Research Directions
Despite substantial progress in UAV path planning, a persistent gap remains between algorithmic advances and reliable real-world deployment. Much of the literature optimizes individual objectives such as path optimality, collision avoidance, or computational speed without jointly addressing uncertainty, safety guarantees, multi-agent coordination, regulatory compliance, and onboard resource limits [
3,
4]. Consequently, methods that perform well in controlled simulations often degrade when exposed to real-world disturbances, sensing errors, and operational constraints.
The proposed Risk-Calibrated, Certifiably Safe, Resource-Aware (RCSR) framework provides a structured perspective for identifying research directions that prioritize deployable autonomy rather than purely benchmark-oriented performance improvements.
Table 8 summarizes the primary limitations observed in current UAV planning systems and outlines key research directions required to enable safe, scalable, and deployable UAV autonomy.
7.1. Risk Calibration Under Uncertainty
Many path planners still assume deterministic or fully known environments and compensate for uncertainty using heuristic safety margins [1,2]. In operational deployments, however, uncertainty arises from perception errors, localization drift, wind disturbances, actuator saturation, and intermittent communications. These uncertainties propagate through the planning pipeline and can lead to unsafe trajectories if not explicitly modeled.
Future research should therefore emphasize quantitative risk calibration, where uncertainty is explicitly modeled and translated into interpretable and enforceable risk metrics. Belief-space planning and probabilistic roadmaps with uncertainty propagation provide promising foundations for such formulations [201]. Similarly, chance-constrained optimization can enforce probabilistic safety guarantees when the uncertainty distribution is known, and distributionally robust optimization extends such guarantees to distributions known only to lie within a bounded ambiguity set. Another important direction is the incorporation of semantic obstacle understanding into risk calibration, so that planners can distinguish among obstacle classes with different geometric complexity, operational meaning, and risk profiles rather than relying only on simplified geometric abstractions.
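To make the idea of an enforceable risk metric concrete, the following minimal sketch tightens a single linear obstacle constraint so that it holds with probability at least 1 - delta under Gaussian position uncertainty. The Gaussian model, the half-plane obstacle, and all names (chance_constraint_margin, satisfies_chance_constraint) are assumptions made for illustration; they are not taken from the surveyed methods.

```python
import math
from statistics import NormalDist


def chance_constraint_margin(a, sigma, delta):
    """Margin m such that a.mean + m <= b implies P(a.x <= b) >= 1 - delta.

    Assumes x ~ N(mean, sigma). The scalar a.x is then Gaussian with
    variance a^T sigma a, so the margin is a quantile of that scalar.
    """
    n = len(a)
    variance = sum(a[i] * sigma[i][j] * a[j] for i in range(n) for j in range(n))
    return NormalDist().inv_cdf(1.0 - delta) * math.sqrt(variance)


def satisfies_chance_constraint(mean, a, b, sigma, delta):
    """Check the tightened deterministic constraint at the estimated position."""
    nominal = sum(ai * mi for ai, mi in zip(a, mean))
    return nominal + chance_constraint_margin(a, sigma, delta) <= b


if __name__ == "__main__":
    # Hypothetical wall at x = 10 m, 0.5 m position standard deviation per axis.
    a, b = [1.0, 0.0], 10.0
    sigma = [[0.25, 0.0], [0.0, 0.25]]
    print(chance_constraint_margin(a, sigma, delta=0.01))
    print(satisfies_chance_constraint([8.5, 0.0], a, b, sigma, delta=0.01))
    print(satisfies_chance_constraint([9.0, 0.0], a, b, sigma, delta=0.01))
```

In this formulation the heuristic safety margin is replaced by one that scales with the localization covariance, so the planner becomes more conservative exactly when sensing degrades.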
A key open problem is the development of lightweight uncertainty-aware planners that remain tractable under strict real-time and onboard compute constraints. Another important research direction is the integration of perception quality into planning risk models so that risk budgets adapt dynamically to sensing reliability and environmental conditions.
7.2. Certifiable Safety and Runtime Assurance
Safe UAV operation in shared airspace requires verifiable guarantees that trajectories satisfy collision avoidance, separation, and regulatory constraints. In many current systems, safety is enforced indirectly through conservative planning heuristics or large safety margins [10]. While such approaches reduce risk, they often degrade efficiency and fail to provide formal guarantees.
Recent work in control theory provides promising mechanisms for certifiable safety, including control barrier functions, reachability analysis, and invariant set methods [5,11]. However, these techniques remain insufficiently integrated with high-level mission planners and learning-based navigation systems.
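As an illustration of how a control barrier function can sit between a planner and the actuators, the sketch below filters a nominal velocity command for a one-dimensional single-integrator model. The model, the barrier h(x) = x_limit - x, and the function names are assumptions chosen to keep the example small; real UAV dynamics require higher-order barriers and a numerical QP solver.

```python
def cbf_filter(u_nom, x, x_limit, alpha):
    """Return the command closest to u_nom that satisfies the barrier condition.

    Assumed model: x_dot = u, with safe set h(x) = x_limit - x >= 0.
    The condition h_dot >= -alpha * h reduces to u <= alpha * h(x), so the
    usual quadratic program has the closed-form solution below.
    """
    h = x_limit - x
    return min(u_nom, alpha * h)


def simulate(x0=0.0, x_limit=9.0, u_nom=2.0, alpha=1.0, dt=0.05, steps=400):
    """Forward-Euler rollout of the filtered system; returns visited positions."""
    xs = [x0]
    for _ in range(steps):
        u = cbf_filter(u_nom, xs[-1], x_limit, alpha)
        xs.append(xs[-1] + dt * u)
    return xs


if __name__ == "__main__":
    xs = simulate()
    print("final position:", xs[-1], "max position:", max(xs))
```

The nominal command is passed through unchanged while the vehicle is far from the boundary and is reduced smoothly as the boundary is approached. With this Euler discretization the boundary is not crossed as long as alpha * dt <= 1, which is the kind of implementation-level condition a certified deployment would have to state explicitly.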
Future research should therefore investigate compositional safety architectures that combine high-level planning, formally constrained local motion generation, and runtime assurance layers that detect violations and switch to certified fallback behaviors. Lightweight runtime verification and monitor synthesis will be particularly important for embedded deployment where latency constraints are strict.
Cybersecurity should also be treated as a core requirement for certifiable safety in real-world UAV deployment. Threats such as GPS spoofing, signal jamming, and communication interference can corrupt localization, invalidate environment assumptions, and undermine formal safety guarantees even when the nominal planner is correct. Future runtime assurance layers should therefore incorporate integrity monitoring and attack-aware detection mechanisms that trigger certified fallback behaviors when sensing, localization, or communication reliability is compromised.
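A runtime assurance layer of this kind can be sketched as a small latching monitor that checks health signals each cycle and switches to a fallback mode on any violation. Everything in the sketch is hypothetical: the signal names, the threshold values, and the choice to latch the fallback until an external reset are illustrative assumptions, not requirements drawn from a standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Mode(Enum):
    NOMINAL = "nominal"
    FALLBACK = "fallback"


@dataclass
class Health:
    gps_innovation_m: float       # GPS fix vs. inertial prediction mismatch
    link_age_s: float             # time since last command-link heartbeat
    predicted_clearance_m: float  # closest predicted distance to an obstacle


@dataclass
class Limits:
    # Hypothetical thresholds; real values would come from a safety case.
    max_gps_innovation_m: float = 5.0
    max_link_age_s: float = 2.0
    min_clearance_m: float = 3.0


class RuntimeMonitor:
    def __init__(self, limits: Limits):
        self.limits = limits
        self.mode = Mode.NOMINAL

    def violations(self, h: Health) -> List[str]:
        """List which monitored conditions are currently violated."""
        found = []
        if h.gps_innovation_m > self.limits.max_gps_innovation_m:
            found.append("gps_integrity")
        if h.link_age_s > self.limits.max_link_age_s:
            found.append("link_timeout")
        if h.predicted_clearance_m < self.limits.min_clearance_m:
            found.append("clearance")
        return found

    def step(self, h: Health) -> Mode:
        """Latch into FALLBACK on any violation; never return on its own."""
        if self.violations(h):
            self.mode = Mode.FALLBACK
        return self.mode
```

The monitor is deliberately simpler than the planner it supervises, since a small, deterministic component is easier to verify. A large GPS innovation, which is one possible symptom of spoofing, is handled through the same path as a clearance violation.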
7.3. Multi-UAV Coordination and Airspace Compliance
As UAV deployments scale toward multi-vehicle operations, coordination becomes increasingly challenging. Many existing approaches assume centralized coordination or reliable communication channels [3,200]. In real deployments, communication bandwidth may be limited, latency may vary, and connectivity may be intermittent.
Future work should prioritize decentralized coordination strategies that operate under partial observability and unreliable communications. Mechanisms such as intent sharing, decentralized conflict resolution, and negotiation-based coordination can enable cooperative planning while maintaining scalability.
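One simple form of intent sharing with decentralized conflict resolution is sketched below: each vehicle broadcasts a position and velocity, every vehicle computes the closest point of approach with its neighbors, and a deterministic rule decides who yields. The constant-velocity intent model and the lower-ID-has-priority rule are assumptions for illustration only; negotiation-based schemes would replace the fixed rule.

```python
import math


def closest_approach(p1, v1, p2, v2, horizon_s):
    """Time and distance of closest approach within [0, horizon_s].

    Assumes both vehicles hold their broadcast velocity (constant-velocity intent).
    """
    rel_p = [a - b for a, b in zip(p1, p2)]
    rel_v = [a - b for a, b in zip(v1, v2)]
    speed_sq = sum(c * c for c in rel_v)
    if speed_sq == 0.0:
        t = 0.0  # identical velocities: separation never changes
    else:
        t = -sum(a * b for a, b in zip(rel_p, rel_v)) / speed_sq
        t = min(max(t, 0.0), horizon_s)
    dist = math.sqrt(sum((a + t * b) ** 2 for a, b in zip(rel_p, rel_v)))
    return t, dist


def must_yield(own_id, other_id, own_intent, other_intent, sep_m, horizon_s):
    """True if a conflict is predicted and this vehicle has lower priority.

    Both vehicles evaluate the same rule on the same shared intents, so they
    reach consistent decisions without a central arbiter.
    """
    _, dist = closest_approach(*own_intent, *other_intent, horizon_s)
    return dist < sep_m and own_id > other_id


if __name__ == "__main__":
    uav1 = ((0.0, 0.0, 50.0), (5.0, 0.0, 0.0))
    uav2 = ((100.0, 0.0, 50.0), (-5.0, 0.0, 0.0))
    print(closest_approach(*uav1, *uav2, horizon_s=30.0))
    print(must_yield(2, 1, uav2, uav1, sep_m=20.0, horizon_s=30.0))
```

Because the decision depends only on the most recently received intents, the rule degrades gracefully under intermittent connectivity, although stale intents would need an explicit age check in practice.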
Another important direction involves integrating policy and regulatory constraints directly into planning algorithms. Rather than representing regulatory restrictions as static obstacles, planners should incorporate dynamic airspace rules such as geofencing updates, traffic corridors, and priority regulations associated with UAS traffic management systems [7]. This requires tight coupling between planning algorithms and compliance monitoring systems.
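The difference between a static obstacle and a dynamic airspace rule can be shown with a geofence that carries a validity window, so that the same waypoint is legal at one time and illegal at another. The Geofence structure, the planar polygon representation, and the function names below are hypothetical; they do not reflect the data model of any particular UTM service.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Geofence:
    polygon: List[Tuple[float, float]]  # vertices in a local planar frame (m)
    start_s: float                      # restriction becomes active
    end_s: float                        # restriction expires


def point_in_polygon(x: float, y: float, polygon) -> bool:
    """Ray-casting test: count edge crossings of a ray going in +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def waypoint_allowed(x: float, y: float, t_s: float, fences) -> bool:
    """A waypoint is allowed unless it lies inside a fence active at time t_s."""
    return not any(
        f.start_s <= t_s <= f.end_s and point_in_polygon(x, y, f.polygon)
        for f in fences
    )


if __name__ == "__main__":
    fence = Geofence([(0, 0), (10, 0), (10, 10), (0, 10)], start_s=100, end_s=200)
    print(waypoint_allowed(5, 5, 150, [fence]))  # inside while active
    print(waypoint_allowed(5, 5, 50, [fence]))   # inside before activation
```

A planner that evaluates waypoints against the time at which they will be reached, rather than the time at which the plan is computed, can route through airspace that will have reopened by arrival.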
7.4. Resource-Aware Planning and Implementation
Real UAV platforms operate under strict resource limitations including limited compute, constrained memory, finite battery capacity, and communication bandwidth restrictions [10]. Algorithms that assume abundant computational resources often fail to scale to embedded systems used in operational UAV platforms.
Future research should therefore develop resource-aware planning frameworks that explicitly incorporate computational cost, energy consumption, and communication requirements into planning objectives. Anytime planning algorithms with bounded suboptimality guarantees can provide useful tradeoffs between planning quality and computational cost.
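The tradeoff offered by anytime planning with bounded suboptimality can be illustrated with weighted A* on a small occupancy grid: an inflated heuristic returns a path quickly, and the inflation factor eps is lowered while time remains. This is a restart-based simplification, assuming a 4-connected grid with unit costs and a Manhattan heuristic; it does not reuse search effort between iterations as ARA*-style planners do, and the function names are illustrative.

```python
import heapq
import time


def weighted_astar(grid, start, goal, eps):
    """Weighted A* on a 4-connected grid (0 = free, 1 = blocked).

    With the consistent Manhattan heuristic, the returned path costs at most
    eps times the optimal cost. Returns a list of cells or None.
    """
    rows, cols = len(grid), len(grid[0])
    h = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])
    g = {start: 0}
    parent = {start: None}
    open_heap = [(eps * h(start), start)]
    closed = set()
    while open_heap:
        _, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        closed.add(cur)
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] != 0 or nxt in closed:
                continue
            new_g = g[cur] + 1
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                parent[nxt] = cur
                heapq.heappush(open_heap, (new_g + eps * h(nxt), nxt))
    return None


def anytime_plan(grid, start, goal, budget_s, eps_schedule=(3.0, 2.0, 1.5, 1.0)):
    """Return (best_path, eps_bound); the bound tightens while budget remains.

    The budget is only checked between searches, so one search may overrun it.
    """
    t0 = time.monotonic()
    best, bound = None, None
    for eps in eps_schedule:
        if best is not None and time.monotonic() - t0 >= budget_s:
            break
        path = weighted_astar(grid, start, goal, eps)
        if path is None:
            return None, None  # goal unreachable at any eps
        if best is None or len(path) < len(best):
            best = path
        bound = eps
    return best, bound
```

A caller with a tight control-loop deadline can execute the first path immediately and treat the returned bound as a statement of how far from optimal the current plan might be.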
Hardware-aware planning implementations also represent an important research direction. Leveraging heterogeneous computing architectures, embedded GPUs, and specialized accelerators may enable complex algorithms to operate within practical energy and latency budgets.
7.5. Integrated Evaluation and Benchmarking
A persistent limitation in UAV planning research is the lack of standardized evaluation frameworks that capture real-world operational complexity. Many studies rely heavily on simplified simulation environments that fail to represent real disturbances, sensing errors, and regulatory constraints [3].
Future research should focus on integrated evaluation pipelines that combine simulation, software-in-the-loop, hardware-in-the-loop, and real-world testing. Benchmark scenarios should incorporate uncertainty sources such as sensor noise, wind disturbances, GPS denial, and evolving airspace constraints.
Evaluation metrics should also extend beyond geometric path optimality to include safety violations, risk exposure over time, energy consumption, and computational resource usage. Establishing such benchmarks will enable meaningful comparison of UAV planning algorithms based on deployability rather than purely algorithmic performance.
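Metrics of this kind can be computed directly from a flight or simulation log. The sketch below assumes a hypothetical log format in which each sample records clearance, an estimated risk rate, electrical power, and planner latency; the field names, the trapezoidal integration, and the specific metrics are illustrative choices rather than an established benchmark definition.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    t_s: float           # timestamp
    clearance_m: float   # distance to nearest obstacle
    risk_rate: float     # estimated collision probability per second
    power_w: float       # electrical power draw
    planner_ms: float    # planning latency for this cycle


def deployability_metrics(log: List[Sample], min_clearance_m: float,
                          deadline_ms: float) -> Dict[str, float]:
    """Summarize safety, risk, energy, and compute usage over one run."""
    risk_exposure = 0.0
    energy_j = 0.0
    # Trapezoidal integration of risk rate and power over time.
    for prev, cur in zip(log, log[1:]):
        dt = cur.t_s - prev.t_s
        risk_exposure += 0.5 * (prev.risk_rate + cur.risk_rate) * dt
        energy_j += 0.5 * (prev.power_w + cur.power_w) * dt
    return {
        "safety_violations": sum(s.clearance_m < min_clearance_m for s in log),
        "risk_exposure": risk_exposure,
        "energy_j": energy_j,
        "deadline_misses": sum(s.planner_ms > deadline_ms for s in log),
        "worst_planner_ms": max(s.planner_ms for s in log),
    }


if __name__ == "__main__":
    log = [
        Sample(0.0, 5.0, 0.00, 100.0, 10.0),
        Sample(1.0, 2.0, 0.02, 200.0, 60.0),
        Sample(2.0, 6.0, 0.00, 100.0, 20.0),
    ]
    print(deployability_metrics(log, min_clearance_m=3.0, deadline_ms=50.0))
```

Reporting these quantities alongside path length makes it visible when a planner achieves a shorter route only by flying closer to obstacles or missing its compute deadline.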
8. Conclusions
This survey examined UAV pathfinding from the perspective of real-world deployment. Although decades of research have produced strong algorithmic results under controlled assumptions, achieving reliable autonomy in complex, dynamic, and regulated environments remains challenging. As the motivating urban delivery scenario in the Introduction illustrates, practical UAV operation requires planning systems that reason under uncertainty, maintain verifiable safety guarantees, coordinate with other airspace users, and operate within strict computational and energy constraints.
Across major planning paradigms, a recurring pattern emerges: many methods demonstrate strong performance when evaluated in isolation but encounter limitations when integrated into complete autonomy stacks. Classical graph-based planners can efficiently replan in changing environments but typically lack explicit uncertainty modeling. Sampling-based planners provide desirable asymptotic properties yet may struggle under strict latency constraints and complex operational restrictions. Optimization-based planners can incorporate rich objectives and constraints but are often sensitive to model mismatch and limited onboard computational resources. In many existing studies, risk is not quantitatively calibrated, safety guarantees are assumed rather than formally verified, and resource constraints receive limited attention during evaluation.
To organize research around deployable UAV autonomy, this paper introduced the Risk-Calibrated, Certifiably Safe, Resource-Aware (RCSR) framework. Instead of proposing a single algorithmic solution, the RCSR perspective emphasizes four complementary requirements: (i) calibrated risk reasoning under uncertainty, (ii) formal safety assurance supported by runtime verification mechanisms, (iii) scalable multi-UAV coordination with explicit regulatory compliance, and (iv) resource-aware algorithm design and evaluation. Viewing existing work through these dimensions clarifies both the progress achieved in the field and the remaining challenges that must be addressed for trustworthy real-world deployment.
Looking forward, meaningful progress will require integration across traditionally separate research areas. Planning systems must tightly couple perception reliability, uncertainty propagation, risk calibration, safety assurance mechanisms, coordination strategies, and resource constraints. In addition, evaluation methodologies must extend beyond simulation to include software-in-the-loop testing, hardware-in-the-loop experimentation, and real-world flight trials that capture operational complexity.
In summary, bridging the gap between theoretical path planning advances and operational UAV autonomy requires a shift toward integrated, certifiable, and resource-conscious navigation architectures. By framing UAV pathfinding research through the RCSR perspective, this survey provides a structured foundation for future work aimed at enabling safe, reliable, and scalable UAV operations in the environments they are ultimately designed to serve.