Review

Artificial Intelligence Approaches for UAV Deconfliction: A Comparative Review and Framework Proposal †

by Fabio Suim Chagas 1, Neno Ruseno 2,* and Aurilla Aurelie Arntzen Bechina 2

1 Laboratory of Artificial Intelligence, Robotics and Cybernetics (LIARC), Military Institute of Engineering (IME), Praça Gen. Tibúrcio, 80—Urca, Rio de Janeiro 22290-270, RJ, Brazil
2 Department of Science and Industry Systems, Faculty of Technology, Natural Sciences and Maritime Sciences, University of South-Eastern Norway, Kongsberg Campus, 3616 Kongsberg, Norway
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published in Nguyen, X.-P.P.; Ruseno, N.; Chagas, F.S.; Bechina, A.A.A. A Survey of AI-based Models for UAVs’ Intelligent Control for Deconfliction. In Proceedings of the 2024 10th International Conference on Control, Decision and Information Technologies (CoDIT), Valletta, Malta, 2024; pp. 2705–2710. https://doi.org/10.1109/CoDIT62066.2024.10708243.
Automation 2025, 6(4), 54; https://doi.org/10.3390/automation6040054
Submission received: 29 June 2025 / Revised: 2 September 2025 / Accepted: 26 September 2025 / Published: 11 October 2025

Abstract

The increasing capabilities of Unmanned Aerial Vehicles (UAVs), or drones, are opening up diverse business opportunities. Innovations in drones, U-space, and UAS Traffic Management (UTM) systems are driving the rapid development of new air mobility applications, often outpacing current regulatory frameworks. These applications now span multiple sectors, from infrastructure monitoring to urban parcel delivery, resulting in a projected increase in drone traffic within shared airspace. This growth introduces significant safety concerns, particularly in managing the separation between drones and manned aircraft. Although various research efforts have addressed this deconfliction challenge, a critical need remains for improved automated solutions at both strategic and tactical levels. In response, our SESAR-funded initiative, AI4HyDrop, investigates the application of machine learning to develop an intelligent system for UAV deconfliction. As part of this effort, we conducted a comprehensive literature review to assess the application of Artificial Intelligence (AI) in this domain. The AI algorithms used in drone deconfliction can be categorized into three types: deep learning, reinforcement learning, and bio-inspired learning. The findings lay a foundation for identifying the key requirements of an AI-based deconfliction system for UAVs.

1. Introduction

The rapid expansion of unmanned aerial systems (UAS), or drones, is unlocking a wide range of applications, including photography, package delivery, infrastructure inspection, search and rescue missions, agriculture, and surveillance [1,2]. However, this growth also brings significant challenges, particularly concerning the safe and efficient management of drone operations in increasingly congested airspace.
Among these challenges, one of the most pressing is deconfliction, which is the process of preventing collisions [3,4]. Drones must navigate not only around fixed obstacles, such as buildings, power lines, and towers, but also respond to dynamic hazards, including birds, adverse weather, and other airborne vehicles.
While avoiding static obstacles requires accurate navigation, the presence of moving objects adds further complexity to the task. Drones must be equipped with advanced systems that can detect and adapt to environmental changes in real-time to ensure operational safety.
As drone usage increases, so does the risk of mid-air collisions. It is therefore essential to implement robust conflict detection and resolution mechanisms that can proactively identify and address potential threats between multiple drones. Ensuring safe integration into shared airspace demands comprehensive solutions that can scale with drone traffic [5].
To meet these needs, advanced technologies like artificial intelligence (AI) emerge as vital enablers. Within this context, the SESAR-funded AI4HyDrop project is introduced. Its overarching goal is to ensure the safe integration of drones in U-space by developing AI-based tools that support automated flight plan approval, dynamic airspace structuring, and real-time surveillance for safe and efficient drone operations [6].
This paper presents a literature review of recent research on AI-based approaches to drone deconfliction, analyzing their advantages and limitations. It also introduces a preliminary framework for intelligent deconfliction control. The following sections cover foundational concepts of AI algorithms for drone deconfliction in Section 2, discussion of key findings and their role in enabling intelligent automation in Section 3, and the conclusion of the study in Section 4.

2. AI Algorithms for Drone Deconfliction

In this review, we adopted a structured approach to categorize and analyze relevant literature based on defined criteria, with a particular focus on the types of algorithms utilized, as shown in Figure 1 [7]. The literature search was conducted using keywords such as ‘drone deconfliction’ or ‘UAV conflict detection’ in combination with ‘machine learning’ or ‘artificial intelligence.’ We then narrowed the results to publications from the last six years, from 2020 to 2025, that were written in English and accessible online. Through this process, we identified three primary categories of AI algorithms applied to drone deconfliction: deep learning, reinforcement learning, and bio-inspired learning. The review below is organized according to the categories defined in this process.
The results of the AI algorithm categories and the number of publications per year are shown in Figure 2. In total, deep learning (DL) has 37 publications, reinforcement learning (RL) has 21, and bio-inspired learning (BL) has 19.

2.1. Deep Learning

Deep Learning (DL) has emerged as a transformative technology, playing an important role in areas such as image recognition, target tracking, obstacle avoidance, and sophisticated route planning [8,9]. Furthermore, DL significantly contributes to operational efficiency by facilitating faster computational processes and improving decision-making capabilities for air traffic management systems [10]. Beyond traditional Convolutional and Recurrent Neural Networks, the field now encompasses Transformer-based architectures with attention mechanisms [11], Graph Neural Networks for relational reasoning on graph-structured data [12], and emergent Large Language Models that demonstrate impressive few-shot learning capabilities [13]. A comparative table summarizing the datasets, performance metrics, and application constraints of each deep learning study reviewed is included in Appendix A.

2.1.1. Convolutional Neural Networks (CNNs) for Vision-Based Deconfliction

Convolutional Neural Networks (CNNs) have long been recognized for their efficacy in conflict avoidance systems that rely on visual data, drawing inspiration from the intricate connectivity patterns observed in the human brain’s visual cortex. In recent developments, the CNN-based You Only Look Once (YOLO) algorithm has become popular in vision-based deconfliction. The typical schematic flow diagram of YOLO is shown in Figure 3.
The YOLO pipeline begins with an input image, which may be an RGB (Red-Green-Blue) image, event-based vision data, or an Infra-Red (IR) image, preprocessed through resizing, normalization, or augmentation to match the network requirements. The image then passes through the backbone, typically a CNN architecture (e.g., Darknet, Cross Stage Partial (CSP) networks, or Transformer-based modules), which extracts hierarchical feature maps. These features are further refined in the neck, using architectures such as the Feature Pyramid Network (FPN), Path Aggregation Network (PAN), or Bidirectional Feature Pyramid Network (BiFPN) to enhance multi-scale feature representation. The detection head then predicts bounding boxes, object classes, and confidence scores, employing either anchor-based or anchor-free mechanisms depending on the YOLO variant. Finally, post-processing steps such as Non-Maximum Suppression (NMS) or its variants are applied to remove redundant detections, resulting in the final outputs of bounding boxes, class labels, and confidence scores. The algorithm's performance is typically measured using metrics such as mean Average Precision (mAP), inference speed in frames per second (FPS), latency, and computational complexity in terms of parameter count and floating-point operations (FLOPs) [14].
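To make this pipeline concrete, the sketch below runs a single detection pass with the open-source ultralytics package, which handles preprocessing and NMS internally. The weights file, frame path, and thresholds are illustrative assumptions and do not correspond to any specific cited study.

```python
# Minimal YOLO detection pass for airborne-object deconfliction,
# using the open-source `ultralytics` package. Weights, input frame,
# and thresholds are illustrative, not taken from the cited studies.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # lightweight variant suited to onboard use

# One frame in, detections out: the library resizes/normalizes the
# image and applies Non-Maximum Suppression (NMS) before returning.
results = model.predict("frame.jpg", conf=0.25, iou=0.45)

for r in results:
    for box in r.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()  # bounding box corners
        cls = int(box.cls[0])                  # predicted class index
        score = float(box.conf[0])             # confidence score
        print(f"{model.names[cls]}: {score:.2f} "
              f"at ({x1:.0f},{y1:.0f})-({x2:.0f},{y2:.0f})")
```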
For instance, studies employed the YOLO version 2 model for tactical deconfliction of non-cooperative flying objects, or intruders. This involved using a pre-trained CNN, ResNet-50, for feature extraction, with detection handled by YOLOv2 layers [14]. Due to limitations in determining intruder range, the line-of-sight rate was utilized to assess conflict conditions. Similarly, the YOLOv3 model was applied for tactical drone deconfliction with static obstacles, where obstacles were initially detected, bounding boxes were localized around them, and clustering techniques, such as the standard deviation (SD) filter and K-means, were used to handle multiple obstacles, followed by decision-making for avoidance [15]. Lightweight CNN models, often leveraging transfer learning from pre-trained networks such as MobileNet and validated on the ColANet dataset, were developed to classify images into “no-collision” and “collision” categories by extracting and filtering important features through convolution and pooling layers [16].
In recent years, CNN-based methods, particularly iterations of the YOLO series, have made significant advances in drone detection by enhancing both spatial and temporal feature extraction to address the challenges of detecting small objects under varied environmental conditions. For instance, the motion-guided YOLOMG detector, built on YOLOv5, fuses a pixel-level motion difference map with RGB imagery via a bimodal fusion module, achieving a 22-point average precision improvement on the ARD100 dataset—comprising 202,467 frames with an average object size of ~1% of frame area—under complex urban, low-light, and abrupt motion scenarios [17]. More recently, the DEAL-YOLO framework enhances YOLOv8 by integrating multi-objective loss functions—Wise IoU for dynamic non-monotonic focusing and Normalized Wasserstein Distance for tiny object localization—alongside a Scaled Sequence Feature Fusion (SSFF) module and linear deformable convolutions, reducing model parameters by up to 69.5% while maintaining state-of-the-art detection performance [18]. Architectural refinements to YOLOv8, such as appending a P2-stride detection head in the feature pyramid, further preserve high-resolution spatial details, boosting recall for sub-32-pixel objects without impacting inference latency [19,20]. In parallel, deployment-focused research has shown that lightweight YOLOv8-nano sustains 52 FPS in FP32 and 65 FPS under INT8 quantization on the NVIDIA Jetson Orin NX, underscoring its viability for low-latency, onboard UAV applications [21]. A comprehensive AirSim-based benchmark across YOLOv8 to YOLOv11—including transfer learning with EfficientNet backbones—reveals that EfficientNet-augmented YOLOv8-small elevates recall from 0.6318 to 0.7804 and mAP@50 from 0.7228 to 0.8447, while identifying YOLOv8-medium and YOLOv10-big as the optimal trade-off models for real-time UAV deployment [22,23]. To tackle extreme lighting, a six-stage pipeline combining Multi-Scale Retinex with Color Restoration (MSRCR), OPTICS segmentation, YOLOv10 detection, GLOH and Dense-SIFT feature extraction, Whale Optimization Algorithm–driven feature selection, and a Swin Transformer classifier achieves 91.5% mAP@50 and up to 95.5% classification accuracy on the UAVDT and VisDrone nighttime benchmarks, outperforming prior methods by up to 5.10% under challenging illumination, occlusion, and noise conditions [24]. Collectively, these innovations in motion integration, loss optimization, multi-scale spatial enhancement, hardware-aware quantization, and domain-adaptive preprocessing reflect a concerted effort to surmount the intrinsic hurdles of UAV-based visual deconfliction and pave the way for highly reliable, real-time drone detection systems.

2.1.2. Recurrent Neural Networks (RNNs) and LSTMs for Trajectory Prediction

Recurrent neural networks (RNNs) utilize cyclic connections to maintain an internal state across time steps, making them well-suited for modeling UAV trajectory sequences and anticipating potential conflicts in UTM environments [25]. Early work trained RNNs on historical flight data and BlueSky ATM simulator outputs to predict future UAV routes and prevent collisions in low-altitude U-Space [26]. Long Short-Term Memory (LSTM) variants further improved temporal retention, with grouping-based conflict detection methods leveraging LSTM cells on preprocessed ADS-B data to anticipate high-risk encounter points [27]. Hybrid CNN–LSTM frameworks then integrated spatial inputs—such as static no-fly-zone maps, emergency UTM constraints, and weather features—using encoder–decoder LSTM architectures and rule-based decision trees driven by PSO-simulated trajectories to refine airspace capacity estimates [28].
More recent RNN-based models emphasize hybrid, context-rich architectures and dynamic cues. The CNN-Attention-LSTM (CA-LSTM) model combines convolutional feature extraction with attention-weighted LSTM to achieve coarse-to-fine feature fusion, reducing mean absolute percentage error by 9.43% and mean squared error by 23.81% relative to vanilla LSTM networks on infrared UAV swarm datasets [29,30]. The VECTOR framework employs a velocity-enhanced GRU network that incorporates historical velocity estimates and first-order dynamics for real-time 3D trajectory prediction, achieving MSEs as low as 2 × 10⁻⁸ on both synthetic and real-world UAV datasets, outperforming traditional RNN and transformer baselines [31]. Optimized GRU pipelines using look-back and forward-length labeling capture complex temporal patterns for long-horizon forecasts, delivering RMSE improvements (e.g., latitude RMSE: 5 × 10⁻⁶ vs. 5.3 × 10⁻³ for FlightBERT++) with per-sample inference times around 33 ms, demonstrating their onboard UTM viability. Finally, Temporal Kolmogorov–Arnold Networks (TKAN), tuned via Particle Swarm Optimization, excel at modeling nonlinear, multiagent UAV interactions in urban settings—achieving R² scores of 0.98, RMSE of 39.85, and MSE of 1588.5—surpassing LSTM and GRU benchmarks in complex urban traffic forecasting [32,33].
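As a concrete reference point for these temporal models, the following PyTorch sketch shows the basic look-back-to-next-position pattern shared by the LSTM and GRU predictors above. The layer sizes, window length, and random data are illustrative assumptions.

```python
# Minimal LSTM trajectory predictor: given a look-back window of 3D
# positions, predict the next position. Sizes are illustrative.
import torch
import torch.nn as nn

class TrajectoryLSTM(nn.Module):
    def __init__(self, input_dim=3, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, input_dim)  # next (x, y, z)

    def forward(self, history):          # history: (batch, look_back, 3)
        out, _ = self.lstm(history)
        return self.head(out[:, -1])     # prediction from last time step

model = TrajectoryLSTM()
history = torch.randn(8, 20, 3)          # 8 tracks, 20-step look-back
next_pos = model(history)                # (8, 3) predicted positions
loss = nn.functional.mse_loss(next_pos, torch.randn(8, 3))  # training signal
```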

2.1.3. Transformer Neural Networks for Complex Interaction Modeling

Transformer Neural Networks, initially designed for sequence modeling, are increasingly being applied in drone deconfliction for their ability to capture long-range dependencies and model complex interactions. A notable development is ASPILin, a model for trajectory prediction that enhances interpretability by replacing traditional Transformer attention scores with a “physical correlation coefficient.” This coefficient is derived from factors like distance and speed of approach, providing a clear, physically meaningful basis for interaction weighting. ASPILin also employs heuristic agent selection based on road topology, which improves prediction performance, substantially reduces computational costs, and filters out irrelevant agents, thereby increasing explainability [34]. The adoption of Transformers in multi-UAV systems signals a progression towards models capable of understanding and predicting complex, interdependent behaviors among drones. The emphasis on interpretability, as seen in ASPILin, is crucial for safety-critical applications, as it allows human operators or auditors to understand the rationale behind a deconfliction decision, which is vital for trust and certification.
Transformers are also being integrated into broader multiagent systems. For instance, Transformer models are combined with Graph Neural Networks (GNNs) and Generative Adversarial Networks (GANs) for intelligent logistics management robot path planning. In this context, the Transformer encodes warehouse environment information and desired optimal paths, leveraging its encoder–decoder structure and multi-head attention to extract features and optimize path prediction sequences [35]. Furthermore, supervised neural network time series classification (NN TSC), which includes Transformer models with attention mechanisms, is being used to predict key attributes and tactics of swarming autonomous agents for military contexts. These neural networks can predict swarm behaviors with 97% accuracy using short observation windows of 20 time steps and demonstrate scalability to swarm sizes ranging from 10 to 100 agents [36]. This capability directly informs counter-maneuvers, moving beyond individual drone deconfliction to managing collective behaviors.
While Transformer-based architectures offer superior performance in capturing global context, they traditionally incur prohibitive computational costs for UAV deployment. However, recent research is actively addressing this challenge. Optimized Transformer-based neural networks are being developed for onboard aerial image classification in real-time disaster management, achieving high accuracy with reduced inference latency and memory usage on resource-constrained devices. This results in performance that is up to 3 to 5 times faster than the original models [37]. Similarly, a Swin Transformer-based classifier is part of a novel framework for nighttime intelligent UAV-based vehicle detection and classification. This system utilizes hierarchical attention mechanisms to achieve robust performance and has been demonstrated to outperform CNNs on complex aerial imagery. These efforts indicate a future where the benefits of Transformers, such as global context understanding and robust feature extraction, can be realized in real-time, onboard deconfliction systems, potentially leading to more sophisticated and adaptive collision avoidance strategies.
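The core mechanism behind these models is scaled dot-product attention over neighboring agent states, which ASPILin replaces with its physical correlation coefficient. The sketch below shows the standard form; the state layout and toy projections are assumptions for illustration only.

```python
# Scaled dot-product attention over neighboring UAV states: the ego
# drone weights each neighbor's influence before predicting motion.
# State dimensions and random projections are illustrative.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d, d_k = 5, 6, 16                      # 5 neighbors, 6-dim states
Wq, Wk, Wv = (torch.randn(d, d_k) for _ in range(3))  # toy projections

ego = torch.randn(1, d)                   # own position + velocity
neighbors = torch.randn(n, d)             # nearby UAV states

q = ego @ Wq                              # query from own state
k, v = neighbors @ Wk, neighbors @ Wv     # keys/values from neighbors
scores = (q @ k.T) / d_k ** 0.5           # pairwise interaction scores
weights = F.softmax(scores, dim=-1)       # attention weights (sum to 1)
context = weights @ v                     # weighted neighbor summary
```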

2.1.4. Graph Neural Networks (GNNs) for Cooperative Conflict Resolution

Graph Neural Networks (GNNs) represent a scalable framework for multi-UAV deconfliction by explicitly modeling air traffic as a graph, where UAVs are nodes and pairwise conflicts form edges that evolve dynamically as agents move [38]. Early graph-convolutional reinforcement learning (GCRL) work introduced the Deep Graph Network (DGN), comprising a shared observation encoder (MLP/CNN), a graph convolutional layer invariant to node ordering, and a Q-value network—augmented with a learned communication mechanism that lets connected UAVs exchange messages before action selection [39]. Trained tabula rasa on simulated 3- and 4-UAV scenarios, these agents converge to cooperative resolution policies, slashing both the number and duration of Losses of Separation and eliminating Near-Mid-Air Collisions—emergent behaviors that are difficult to handcraft [38]. As UAV traffic density rises, pairwise methods falter; the Multi-Scale Graph Reinforcement Learning (MS-GRL) framework scales these ideas to dense networks of up to 100 UAVs by combining graph embedding with safety-constrained maneuver strategies, outperforming prior GCRL baselines under high-density conditions [40].
Beyond aerial deconfliction, GNNs are being fused with Transformer architectures for intelligent route optimization in logistics and multiagent coordination. In smart logistics robot path planning, a GNN-Transformer-GAN fusion utilizes graph representations of maps, cargo allocations, and robot states, alongside attention-based global feature extraction, to achieve 15% shorter paths, 20% faster completion times, and 10% lower energy consumption on real-world datasets [35]. In multiagent UAV coordination, Graph-Based Deep Reinforcement Learning augmented with Transformer-style message-passing achieves 90% service provisioning and 100% grid coverage, while cutting the average steps per episode by two-thirds compared to PSO and DQN baselines, highlighting the power of GNN-Transformer hybrids for large-scale, communication-constrained environments [41]. These developments demonstrate the versatility of GNNs in modeling complex inter-agent dependencies, facilitating emergent cooperative behaviors, and enabling robust, real-time conflict resolution and path planning across aerial and ground robotics.
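A minimal version of the graph step these systems share is sketched below: UAVs are nodes, conflicts are edges, and each node aggregates its conflicting peers' encoded states before a downstream policy or Q-network picks a maneuver. The adjacency construction and feature sizes are assumptions.

```python
# One message-passing step over a UAV conflict graph. Each node mixes
# its own encoding with the mean of its conflicting neighbors'.
# Adjacency and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class ConflictGraphLayer(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.msg = nn.Linear(dim, dim)      # message transform
        self.upd = nn.Linear(2 * dim, dim)  # node update

    def forward(self, x, adj):           # x: (n, dim), adj: (n, n) of 0/1
        deg = adj.sum(1, keepdim=True).clamp(min=1)
        agg = (adj @ self.msg(x)) / deg  # mean over conflicting peers
        return torch.relu(self.upd(torch.cat([x, agg], dim=-1)))

n = 4
states = torch.randn(n, 16)                     # encoded UAV states
adj = (torch.rand(n, n) > 0.5).float()          # predicted conflicts
adj = ((adj + adj.T) > 0).float()               # make edges symmetric
adj.fill_diagonal_(0)
embeddings = ConflictGraphLayer()(states, adj)  # feed to a Q-network
```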

2.1.5. The Emerging Role of Large Language Models (LLMs) in UAV Operations

Large Language Models (LLMs) are rapidly transforming UAV operations by enabling contextual reasoning, in-context learning for data scheduling, natural-language mission planning, anomaly detection, and edge deployment through lightweight model compression techniques, thus paving the way for adaptive, human-interpretable, and scalable UAV autonomy in dynamic and safety-critical environments [42,43].
Recent advances demonstrate that LLMs can adapt to dynamic environments with far less task-specific training than deep reinforcement learning (DRL), benefiting from their ability to perform In-Context Learning (ICL) from a handful of examples or prompts [44]. For instance, the ICL-based Data Collection Scheduling (ICLDC) framework utilizes an LLM to generate and iteratively refine natural-language task descriptions for UAV-assisted sensor networks, resulting in a 56% reduction in cumulative packet loss compared to maximum channel-gain methods in emergency scenarios [43]. Beyond scheduling, LLM-augmented decision models, combining Retrieval-Augmented Generation (RAG), have proven capable of fusing mission logs, telemetry, and environmental factors to produce context-aware commands, achieving BLEU scores of 0.82 and cosine similarity of 0.87, which enables real-time Internet-of-Drones operations [45]. Moreover, LLMs support anomaly detection in communication streams, automatically identifying inconsistent sensor data or potential spoofing events through language-based pattern recognition [42].
LLMs are also redefining high-level UAV mission planning and control. Frameworks like FLUC translate natural-language operator commands into executable autopilot code, enabling energy-aware UAV positioning and multi-step reasoning with models such as Qwen 2.5 and LLaMA 3.2 [46]. The LEVIOSA system converts text and speech inputs into synchronized 3D waypoints for UAV swarms via multimodal LLMs, improving coordination and collision avoidance in Search & Rescue (SAR), agriculture, and infrastructure inspection [47]. In rapid SAR deployments, the UAV-VLRR pipeline combines Vision-Language Models (VLMs) with ChatGPT-4 for scene interpretation and Nonlinear Model Predictive Control (NMPC), resulting in a 33.75% reduction in response times compared to off-the-shelf autopilots and a 54.6% reduction compared to human pilots [48]. Emerging multimodal LLM-enabled swarm architectures leverage unified perception and reasoning to coordinate hundreds of UAVs in dynamic missions, demonstrating the scalability of LLM integration in dense airspace operations [49].
To meet onboard constraints, researchers are compressing LLMs—via pruning, quantization, and knowledge distillation—into lightweight variants capable of edge inference on UAVs without sacrificing decision-making quality [50]. In public-safety UAVs, in-context LLM frameworks deployed at the network edge reduce latency and preserve data privacy compared to cloud-based inference, making them suitable for real-time, mission-critical scenarios [44]. These developments mark a paradigm shift toward adaptive, human-interpretable, and context-aware UAV autonomy, setting the stage for truly intelligent, collaborative, and resilient unmanned aerial operations.
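The in-context-learning pattern behind frameworks such as ICLDC can be summarized in a few lines: a task description plus a handful of solved examples is assembled into a prompt, and the model's completion is read back as the next decision. The sketch below is schematic; the prompt wording, sensor fields, and the query_llm stand-in are all hypothetical.

```python
# Schematic in-context-learning prompt for UAV data-collection
# scheduling, in the spirit of ICLDC. All field names and examples
# are hypothetical; `query_llm` stands in for any chat-completion API.
FEW_SHOT_EXAMPLES = """\
Sensors: A(queue=9, gain=0.2) B(queue=3, gain=0.8) -> visit A
Sensors: A(queue=2, gain=0.9) B(queue=8, gain=0.3) -> visit B"""

def build_prompt(sensor_states: str) -> str:
    return (
        "You schedule a UAV collecting data from ground sensors.\n"
        "Pick the sensor visit that minimizes packet loss.\n\n"
        f"{FEW_SHOT_EXAMPLES}\n"
        f"Sensors: {sensor_states} -> visit"
    )

def query_llm(prompt: str) -> str:  # hypothetical stand-in
    raise NotImplementedError("plug in a chat-completion client here")

prompt = build_prompt("A(queue=7, gain=0.4) B(queue=5, gain=0.6)")
# decision = query_llm(prompt)  # expected completion: "A" or "B"
```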

2.2. Reinforcement Learning

Reinforcement Learning (RL) represents a powerful paradigm in artificial intelligence, enabling an AI-driven agent to learn optimal behaviors through trial and error, guided by rewards and punishments that drive the discovery of strategies that maximize cumulative long-term returns [51]. Deep Reinforcement Learning (DRL) emerges as a specialized variant of RL, employing deep neural networks to tackle more intricate challenges by extracting compact state representations from raw sensory inputs and refining policies through environmental feedback [52]. Both RL and DRL are increasingly adopted in drone deconfliction applications, offering a robust framework for autonomous obstacle avoidance and navigation in complex, dynamic environments—thereby eliminating the need for manual rule design or parameter tuning [51,53]. The escalating complexity of drone operations, particularly with the proliferation of Unmanned Aerial Vehicles (UAVs) in shared airspace, necessitates such adaptive and intelligent conflict management solutions [54]. A comparative table summarizing the agents/actions, environments/states, and application constraints of each reinforcement learning study reviewed is included in Appendix B.

2.2.1. Traditional Reinforcement Learning (RL) for Single-Agent Deconfliction

Early applications of Reinforcement Learning in drone deconfliction laid foundational groundwork by enabling distributed conflict resolution policies that guarantee minimum separation via RL-controlled heading, speed, and altitude maneuvers—evaluated in the BlueSky ATM simulator and compared against the Modified Voltage Potential method using a global reward based on cumulative losses of separation [55]. Another study restructured the state space using the novel concept of risk sectors and introduced an Estimated Time of Arrival (ETA)-based temporal reward to balance collision avoidance with timely waypoint arrival, achieving a 40.59% increase in mission success rates over traditional tactical conflict methods [56]. Probabilistic risk-based operational safety bounds were integrated into RL frameworks as dynamic airspace reservations, defining buffer zones that account for UAS performance, weather uncertainty, and positioning errors. New reward functions were devised to penalize incursions, enabling collision-free trajectory learning in simulated environments [57]. In parallel, hybrid approaches combined geometric conflict resolution algorithms (e.g., Modified Voltage Potential) with RL agents that optimized maneuver parameters—such as look-ahead time and degrees of freedom—to reduce both the number and duration of losses of separation and to address emergent secondary conflicts [58]. These pioneering methods paved the way for more advanced RL-driven deconfliction, including self-prioritizing multiagent RL frameworks that minimize action overhead via learned priority levels [59], distributed swarm control approaches that scale to large UAV fleets through decentralized policies [60], Proximal Policy Optimization-based continuous control for obstacle avoidance in UAS [61], and counterfactual credit assignment schemes for cooperative multi-UAV collision avoidance [62]. Subsequent work extended RL-based deconfliction to adaptive collision avoidance in urban mUAV scenarios—formulating two-layer frameworks for speed adjustments and rerouting strategies [63]—and AI-based control surveys have underscored RL’s scalability and efficacy in UAV path planning and collision management.
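Common to these single-agent formulations is a shaped reward that trades separation assurance against timely arrival. The toy function below illustrates the pattern; all weights and thresholds are assumptions rather than values from the cited works.

```python
# Illustrative per-step reward for single-agent RL deconfliction:
# penalize losses of separation, charge a small time cost, and pay an
# arrival bonus that decays as the ETA overshoots the plan.
# All constants are assumptions, not values from [55-58].
def step_reward(sep_m: float, min_sep_m: float, eta_s: float,
                deadline_s: float, reached: bool) -> float:
    reward = 0.0
    if sep_m < min_sep_m:        # loss of separation with an intruder
        reward -= 100.0
    reward -= 0.1                # per-step time penalty
    if reached:                  # waypoint reached: ETA-scaled bonus
        reward += 50.0 * min(1.0, deadline_s / max(eta_s, 1e-6))
    return reward

print(step_reward(sep_m=40.0, min_sep_m=50.0, eta_s=120.0,
                  deadline_s=100.0, reached=False))   # -100.1
```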

2.2.2. Multiagent Reinforcement Learning (MARL) for Cooperative Deconfliction

Recent advances in Multiagent Reinforcement Learning (MARL) have enabled decentralized, cooperative conflict resolution strategies for dense UAV operations. Graph-based MARL models such as the Deep Graph Network (DGN) represent each UAV as a node and detected pairwise conflicts as edges, allowing agents to exchange messages via graph convolutions and jointly generate resolution maneuvers; in 3- and 4-agent scenarios, these models achieve complete separation avoidance with zero Near Mid-Air Collisions and significant reductions in Losses of Separation [38]. Strategic–tactical integrated frameworks combine Demand–Capacity Balancing (DCB) for macro-level traffic conditioning with multiagent RL for tactical deconfliction, showing that pre-conditioning traffic density via DCB significantly enhances RL-based safety separation performance in Urban Air Mobility simulations [64]. Asynchronous Advantage Actor-Critic variants, such as MAA3C, incorporate recurrent networks to address separation conflicts and block unavailability due to wind turbulence by autonomously selecting ground delays, speed adjustments, and cancelations [65].
A mean-field graph reinforcement learning approach for large-scale UAV path planning enables each vehicle to aggregate local neighborhood information into a mean-field embedding that serves as a global attention weight, achieving robust collision avoidance and coverage performance in swarms of up to 120 UAVs and outperforming prior graph-based MARL schemes under high traffic densities [66]. Energy-aware MARL models integrate onboard battery constraints and mission priorities into the reward structure, achieving over 80% mission success rates across varying swarm sizes in mission-oriented drone networks, marking the first work to jointly model battery capacity and task length in cooperative UAV networks [67].
Foundational decentralized training algorithms such as Multiagent Deep Deterministic Policy Gradient (MADDPG)—with decentralized actors and a centralized critic—and value-decomposition methods like QMIX underpin these cooperative policies, effectively handling mixed cooperative–competitive environments and complex interaction topologies [68]. Hierarchical MARL frameworks for UAV swarms decompose guidance tasks into curricula inspired by shepherding, enabling mixed aerial–ground coordination with robust real-world transfer [69,70]. As Urban Air Mobility and UTM initiatives anticipate dense, dynamic low-altitude traffic, these MARL innovations offer a scalable, adaptive foundation for autonomous, distributed UAV deconfliction and coordination.
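The centralized-critic, decentralized-actor split that underpins MADDPG can be stated compactly: each actor maps only its own observation to an action, while a critic used only during training scores the joint observation-action vector. The sketch below shows that wiring; all dimensions are assumptions.

```python
# Conceptual MADDPG wiring: decentralized actors, centralized critic.
# The critic sees the joint state-action only during training; at
# execution each UAV runs its own actor. Sizes are illustrative.
import torch
import torch.nn as nn

n_agents, obs_dim, act_dim = 3, 10, 2

actors = [nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                        nn.Linear(64, act_dim), nn.Tanh())
          for _ in range(n_agents)]

critic = nn.Sequential(                  # joint input, scalar Q-value
    nn.Linear(n_agents * (obs_dim + act_dim), 64), nn.ReLU(),
    nn.Linear(64, 1))

obs = torch.randn(n_agents, obs_dim)                      # local views
acts = torch.stack([a(o) for a, o in zip(actors, obs)])   # decentralized
q = critic(torch.cat([obs.flatten(), acts.flatten()]))    # centralized
```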

2.2.3. Deep Reinforcement Learning (DRL) for Advanced Deconfliction

Deep reinforcement learning (DRL) has emerged as a powerful end-to-end paradigm for advanced UAV deconfliction, benefiting from its ability to learn policies directly from high-dimensional sensory inputs without the need for handcrafted rules. Early work introduced probabilistic-DRL hybrids for onboard collision avoidance, which jointly optimize safety and energy consumption in dense, unstructured UAV environments, demonstrating robust performance without requiring any prior scene models [71]. In 2023, tactical conflict resolution was recast as a Markov decision process and solved via a Double Deep Q-Network enhanced with an attention mechanism over neighbor states, enabling each drone to generate conflict-free maneuvers on the fly—scaling gracefully to arbitrary swarm sizes and suppressing cascading “domino” conflicts [72]. To tackle multi-objective requirements such as formation maintenance and dynamic obstacle avoidance, a two-stage training pipeline was developed: the first stage searches for a linear utility (reward) balancing key objectives in simplified settings, and the second stage applies curriculum learning in richer scenarios, achieving superior collision-free formation control in mixed static/dynamic environments [73]. Extending DRL beyond autonomy, the DroneARchery system couples learned swarm-control policies with augmented-reality and haptic interfaces, allowing human operators to intuitively “shoot” and guide drone formations while the underlying DRL policies automatically ensure intra-swarm collision avoidance during dynamic deployment [74]. Collectively, these advances underscore DRL’s capacity to deliver scalable, energy-efficient, and even human-centric deconfliction solutions for next-generation UAV operations.
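At the heart of the Double Deep Q-Network approach above is a target that decouples action selection from action evaluation, curbing the overestimation bias of vanilla DQN. A minimal sketch of that target computation follows, without the attention encoder of [72]; the networks, shapes, and discount factor are illustrative assumptions.

```python
# Double DQN target: the online network picks the next action, the
# target network evaluates it. Toy networks and gamma are assumptions;
# the attention-based state encoder of [72] is omitted for brevity.
import torch

def double_dqn_target(reward, next_state, done, online_net, target_net,
                      gamma=0.99):
    with torch.no_grad():
        best_a = online_net(next_state).argmax(dim=1, keepdim=True)
        q_next = target_net(next_state).gather(1, best_a).squeeze(1)
        return reward + gamma * q_next * (1.0 - done)

online = torch.nn.Linear(8, 5)   # toy Q-net: 8-dim state, 5 maneuvers
target = torch.nn.Linear(8, 5)   # separate, periodically synced copy
y = double_dqn_target(torch.zeros(4), torch.randn(4, 8),
                      torch.zeros(4), online, target)   # (4,) targets
```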

2.3. Bio-Inspired Learning

To achieve truly autonomous, adaptive deconfliction, researchers have turned to bio-inspired algorithms—metaheuristics that mimic natural processes such as swarm foraging, collective flocking, and evolutionary selection—to solve complex UAV coordination tasks without relying on precise analytic models [75]. Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), and Genetic Algorithms (GA) leverage stochastic, population-based searches to explore high-dimensional trajectory spaces, escape local optima, and self-adapt to evolving airspace conditions [76]. Hybrid schemes—such as multi-colony ACO combined with Differential Evolution (MMACO-DE)—have demonstrated efficient multi-UAV path planning in dynamic environments by updating pheromone intensities based on collision risk and energy cost, yielding shorter, safer routes under time-varying constraints [77]. Recent extensions incorporate multi-objective formulations (e.g., safety vs. fuel efficiency vs. no-fly-zone compliance) and dynamic parameter tuning—drawing on inspiration from gray-wolf hierarchies and firefly attraction—to further boost robustness and scalability. These bio-inspired frameworks operate in parallel on distributed UAV swarms, providing real-time, resilient deconfliction strategies that are ideally suited to the uncertainty and density of future urban air mobility. A comparative table summarizing the algorithm, performance metrics, and application constraints of each bio-inspired learning study reviewed is included in Appendix C.

2.3.1. Swarm Intelligence Algorithms for Collision Avoidance and Coordination

Swarm-intelligence algorithms draw inspiration from the collective behaviors of social organisms—such as bird flocks, fish schools, and ant colonies—to provide fully distributed, adaptive collision-avoidance and coordination strategies for UAV swarms. Particle Swarm Optimization (PSO) treats each drone as a “particle” in a search space, dynamically balancing personal best and global best positions to converge on energy-efficient, collision-free trajectories; such PSO-based schemes have enabled real-time target detection in occluded forests [78], decentralized energy-aware collision avoidance via artificial potential fields and PSO fusion (E2CoPre) [79], and enhanced 3D path planning through PSO variants optimized for rapid convergence in dense swarms [80]. Ant Colony Optimization (ACO) mimics pheromone-mediated foraging to solve cooperative inspection and coverage tasks: by formulating multi-UAV inspection as an extended Traveling Salesman Problem and guiding waypoint selection through virtual pheromone trails, ACO yields up to 29.5% shorter infrastructure-inspection paths compared to classical heuristics [81] and supports multi-objective formulations that balance safety, energy, and data requirements in dynamic environments [82].
Beyond PSO and ACO, other bio-inspired heuristics have demonstrated superior performance in challenging deconfliction scenarios. Improved Grey Wolf Optimizers (IGWO) incorporate advanced cooperative predation strategies and lens-opposition learning to plan the shortest paths in obstacle-rich maps, outperforming standard GWO, PSO, and Whale Optimization Algorithm by 1.7–2.0 m on average [83]. Artificial Bee Colony (ABC) algorithms—with modifications for intelligent search and specialized division—optimize communication-aware trajectories in UAV-aided networks, enhancing data-rate metrics while implicitly deconflicting flight paths [84]. Meta-heuristic hybrids, such as multi-colony ACO combined with differential evolution (MMACO-DE), leverage pheromone updates based on collision risk and energy cost to generate robust, shortest-path solutions under time-varying constraints [77].
A notable quantum-inspired PSO variant, MDQPSO-ASA, encodes particle positions via quantum-inspired rotation gates and integrates multi-swarm diversification with adaptive simulated annealing, enabling the formation of dynamic UAV clusters of up to 100 vehicles for multi-target localization without any quantum hardware; this method demonstrates fast convergence and high solution quality in large-scale scenarios [85].
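The shared skeleton of these swarm optimizers is easiest to see in plain PSO, sketched below for a single waypoint choice with an obstacle-clearance penalty. The bounds, inertia and acceleration coefficients, and cost function are assumptions for illustration.

```python
# Minimal PSO for one 3D waypoint: minimize distance to goal plus an
# obstacle-clearance penalty. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
goal = np.array([90.0, 90.0, 30.0])
obstacle, clearance = np.array([50.0, 50.0, 25.0]), 15.0

def cost(p):
    penalty = max(0.0, clearance - np.linalg.norm(p - obstacle)) * 50.0
    return np.linalg.norm(p - goal) + penalty

pos = rng.uniform(0, 100, (30, 3))       # 30 particles in 3D space
vel = np.zeros_like(pos)
pbest, pbest_cost = pos.copy(), np.array([cost(p) for p in pos])
gbest = pbest[pbest_cost.argmin()]       # swarm-wide best position

for _ in range(100):
    r1, r2 = rng.random((2, 30, 1))      # stochastic pulls
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    costs = np.array([cost(p) for p in pos])
    improved = costs < pbest_cost
    pbest[improved], pbest_cost[improved] = pos[improved], costs[improved]
    gbest = pbest[pbest_cost.argmin()]

print("best waypoint:", gbest.round(1), "cost:", round(cost(gbest), 2))
```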

2.3.2. Evolutionary Algorithms for Adaptive Deconfliction

Evolutionary algorithms (EAs) draw on the principles of natural selection—mutation, crossover, and fitness-based selection—to iteratively evolve high-quality solutions for complex, nonlinear, and high-dimensional optimization tasks where traditional methods often fail. Genetic Algorithms (GAs), a prominent EA variant, have been applied to real-time UAV path planning and collision avoidance; for instance, a GA-based planner achieves optimized 3D trajectories in both static and dynamic environments by tuning mutation rates, crossover strategies, and population sizes to balance exploration and exploitation [86]. Co-evolutionary frameworks further decompose multi-UAV cooperative path planning into interacting single-UAV subproblems, exchanging path information across populations to reduce computational complexity while preserving high-quality joint solutions [87]. In highly competitive settings such as autonomous drone racing, game-theoretic planners leveraging dynamic near-potential functions compute approximate Nash equilibria—capturing overtaking and blocking maneuvers under real-time constraints—to ensure collision avoidance and optimal positioning in head-to-head scenarios [88].
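The evolutionary loop these planners share (selection, crossover, mutation over candidate paths) is sketched below for a path encoded as a flat vector of intermediate waypoints, with fitness reduced here to path length alone. Population size, operators, and rates are illustrative assumptions.

```python
# Toy GA for UAV path planning: evolve intermediate waypoints between
# fixed start and goal, favoring shorter paths. A collision term would
# be added to `fitness` in practice. All parameters are assumptions.
import numpy as np

rng = np.random.default_rng(1)
start, goal = np.zeros(3), np.array([100.0, 100.0, 40.0])
n_way, pop_size = 4, 40                  # 4 waypoints, 40 individuals

def fitness(flat):
    pts = np.vstack([start, flat.reshape(n_way, 3), goal])
    return -np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()  # shorter = fitter

pop = rng.uniform(0, 100, (pop_size, n_way * 3))
for _ in range(200):
    fit = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(fit)[-pop_size // 2:]]   # keep fittest half
    cut = rng.integers(1, n_way * 3, pop_size // 2)   # crossover points
    kids = np.array([np.concatenate([a[:c], b[c:]])   # one-point crossover
                     for a, b, c in zip(parents, np.roll(parents, 1, 0), cut)])
    kids += rng.normal(0, 2.0, kids.shape) * (rng.random(kids.shape) < 0.1)
    pop = np.vstack([parents, kids])                  # next generation

best = pop[np.argmax([fitness(ind) for ind in pop])].reshape(n_way, 3)
```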

2.3.3. Neural-Inspired Algorithms: Spiking Neural Networks (SNNs)

Spiking Neural Networks (SNNs) emulate the brain’s event-driven, spike-based communication to enable energy-efficient, low-latency collision avoidance and coordination in UAV swarms. A reward-modulated SNN (RSNN) self-organizes collision-avoidance behaviors via local interactions and spike-timing-dependent plasticity, outperforming ANN baselines in both simulation and bounded-space flight tests [89]. Event-based SNNs integrating neuromorphic vision sensors (DVS) have achieved real-time obstacle avoidance on the Parrot Bebop2, leveraging shallow networks for low-latency reactive control in dynamic indoor environments [90]. Neuromorphic digital-twin architectures equip each UAV with an SNN that reproduces cloud-generated control signals, enabling up to 15 drones to coordinate in cluttered indoor spaces with guaranteed separation and robustness against communication interruptions [91]. On ultra-low-power platforms like the Crazyflie, modular SNNs map raw IMU inputs to motor commands at 500 Hz, providing attitude estimation and control with millisecond-scale latency and disturbance rejection [92]. Lobula Giant Movement Detector (LGMD)-inspired SNN controllers optimized via differential evolution reliably predict looming obstacles in the presence of sensor noise, triggering timely evasive maneuvers [93]. Deep SNN frameworks trained with reinforcement learning and surrogate-gradient methods achieve ANN-level obstacle avoidance accuracy while reducing inference energy by over 75% [94]. Binocular Bi-LGMD variants extract depth through disparity, enhancing collision prediction robustness across varied motion scenarios [95]. Comprehensive toolkits, such as BrainCog, offer supervised and unsupervised training pipelines for rapid SNN development, thereby accelerating the deployment of neuromorphic UAV controllers [96]. Together, these advances underscore SNNs’ promise for fully neuromorphic deconfliction systems that meet the stringent energy, latency, and robustness demands of next-generation autonomous UAV operations.
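The event-driven primitive underlying these controllers is the leaky integrate-and-fire (LIF) neuron, sketched below: input current charges a leaky membrane potential that emits a spike and resets on crossing a threshold. The time constants and threshold are illustrative, not taken from any cited system.

```python
# Leaky integrate-and-fire (LIF) neuron, the building block of the SNN
# controllers above. Constants are illustrative assumptions.
import numpy as np

def lif(input_current, dt=1e-3, tau=20e-3, v_th=1.0, v_reset=0.0):
    v, spikes = 0.0, []
    for i in input_current:
        v += dt / tau * (-v + i)   # leaky membrane integration
        if v >= v_th:              # threshold crossing emits a spike
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return np.array(spikes)

# A looming obstacle produces a rising input current and thus a rising
# spike rate, which a controller can map to an evasive maneuver.
current = np.linspace(0.0, 3.0, 200)
print("spike count:", int(lif(current).sum()))
```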

3. Discussion

This section discusses the findings from each of the AI algorithm categories used in drone deconfliction, the trade-offs between those algorithms, and considerations for selecting an algorithm for drone deconfliction. It also discusses the approach to AI-based drone deconfliction proposed in the AI4HyDrop project.

3.1. Key Findings

Our survey reveals three dominant AI paradigms for UAV deconfliction—deep learning, reinforcement learning, and bio-inspired learning—each addressing different facets of the problem. Deep learning excels at perception and prediction through CNN, RNN/LSTM, Transformer, and graph-based models. Reinforcement learning (RL and DRL) enables adaptive, reward-driven avoidance policies in both single-agent and multiagent settings. Bio-inspired methods draw on natural heuristics to achieve decentralized and robust coordination. Below, we summarize the core strengths and limitations of each.

3.1.1. Deep Learning

Convolutional Neural Networks (CNNs) power high-accuracy vision-based deconfliction. Tactical detectors, such as YOLOv2, leverage ResNet-50 feature extractors for intruder detection [14], while YOLOMG fuses pixel-level motion with RGB imagery to boost mAP by 22 points on the ARD100 benchmark under low-light and abrupt-motion scenarios [17]. The DEAL-YOLO framework further integrates multi-objective loss functions and deformable convolutions, cutting parameters by nearly 70% with no loss in accuracy [18].
Temporal models based on RNNs/LSTMs achieve precise trajectory forecasts: CA-LSTM reduces MAPE by 9.4% and MSE by 23.8% versus vanilla LSTMs [29,30], while the VECTOR GRU network attains MSEs as low as 2 × 10⁻⁸ on both synthetic and real-world UAV data [31]. Transformer architectures, such as ASPILin, introduce physically grounded attention coefficients for interpretable interaction modeling, thereby cutting computational cost without sacrificing performance [34].
However, these models demand extensive labeled data and often exceed onboard resource budgets; quantization and pruning become essential to achieve real-time inference on edge devices [21].

3.1.2. Reinforcement Learning

RL frameworks let UAVs learn collision-avoidance maneuvers by maximizing long-term rewards. Single-agent RL methods that partition airspace into risk sectors and utilize ETA-based temporal rewards increase mission success by 40.6% compared to rule-based tactics [56], while probabilistic safety bounds incorporate weather and positioning uncertainty into dynamic buffer zones [57].
Graph-based multiagent RL (MARL) models—exemplified by the Deep Graph Network (DGN)—treat UAVs as nodes and conflicts as edges, achieving zero near-mid-air collisions and significant reductions in separation losses through learned message passing [69]. Mean-field MARL scales to swarms of 120+ UAVs, maintaining robust avoidance and over 80% mission success in high-density scenarios [66,67]. Attention-enhanced DRL agents (e.g., Double DQN with neighbor-state attention) generate on-the-fly conflict-free maneuvers that gracefully suppress cascading “domino” conflicts in arbitrarily large swarms [72].
The main drawbacks are long training times, complex reward shaping, and substantial computational demands for safe real-world operation.

3.1.3. Bio-Inspired Learning

Bio-inspired heuristics mimic natural processes for fully distributed deconfliction. Particle Swarm Optimization (PSO) variants—such as E2CoPre, which combines potential-field methods with PSO—offer energy-aware collision avoidance in dynamic environments [79]. Ant Colony Optimization (ACO) solves the multi-UAV inspection problem as a pheromone-guided Traveling Salesman Problem, yielding paths that are up to 29.5% shorter than those of classical heuristics [81].
Evolutionary algorithms, such as genetic algorithms, optimize 3D trajectories through mutation and crossover; however, scaling to large swarms requires co-evolutionary decompositions to manage complexity [86]. Spiking Neural Networks (SNNs) deliver neuromorphic, event-driven control: DVS-integrated SNNs achieve real-time obstacle avoidance on platforms like the Parrot Bebop2, and reward-modulated SNNs self-organize collision-free behaviors, outperforming ANN baselines in bounded-space tests [90,94].
While highly resilient and low-latency, bio-inspired methods often require careful parameter tuning and may converge more slowly than learning-based approaches, suggesting that hybrid integration could be a fruitful avenue.

3.2. Trade-Offs and Considerations

Each category of algorithm comes with its own strengths and limitations. Deep learning methods are renowned for their high accuracy, but they require substantial computational power and large datasets for effective training. Reinforcement Learning (RL) is highly adaptable and capable of dynamic decision-making, yet it typically requires extensive real-world testing to ensure its safety and reliability in practice. Meanwhile, bio-inspired algorithms bring innovative and nature-inspired approaches but often fall short in terms of robustness and precision when applied to complex deconfliction scenarios, necessitating further refinement and real-world validation.
In addition, Deep Learning excels at obstacle and terrain recognition but is constrained by its dependence on large datasets and its opaque, black-box nature. Reinforcement Learning is well-suited for collision avoidance and route optimization, but faces limitations related to sample efficiency and reward modeling. Bio-inspired Learning provides flexibility and resilience but may suffer from slow convergence and limited scalability.
Our review acknowledges that the evidence base across the three AI categories remains uneven in terms of datasets, simulation platforms, and real-world validation. Deep learning approaches benefit from the extensive use of publicly available datasets (e.g., vision-based benchmarks for obstacle detection and collision prediction); yet, most deconfliction studies still rely on custom or limited datasets, which restricts comparability and generalizability. Reinforcement learning methods are often validated in high-fidelity simulation platforms such as Gazebo, AirSim, or custom-built environments, which allow safe exploration of large action spaces but cannot fully capture the uncertainties and failures encountered in real-world airspace operations. Similarly, bio-inspired algorithms are predominantly assessed through numerical simulations or small-scale laboratory experiments, with relatively few implementations tested in live multi-UAV scenarios.
As a result, a noticeable gap exists between simulated performance and operational deployment. While simulation studies demonstrate strong potential, the lack of systematic real-world trials limits the generalizability of current findings, particularly in terms of explainability, sensitivity to uncertainties, deployment challenges, and ethical considerations.
AI-driven deconfliction solutions face a critical challenge in terms of explainability, as aviation safety regulations require traceability and interpretability of decision-making processes. Deep learning models, while powerful, are often criticized for their “black box” nature. Recent studies have explored methods such as attention mechanisms, saliency mapping, and rule extraction to improve transparency; however, these are not yet widely adopted in UAV deconfliction. In contrast, reinforcement learning and bio-inspired algorithms often allow more interpretable policy rules or heuristic mappings. For the safe adoption of AI, hybrid models where deterministic safety rules serve as a transparent baseline and AI acts as an adaptive layer may be essential for compliance with aviation certification requirements.
AI-based deconfliction methods remain sensitive to uncertainties inherent in real-world operations. GPS errors can undermine trajectory prediction accuracy, communication delays affect timely conflict detection and resolution, and adversarial spoofing or jamming can trigger false alarms or prevent conflict avoidance altogether. While simulation studies occasionally incorporate noise models or latency assumptions, few explicitly address the issue of adversarial resilience. This highlights the need for robust testing under degraded sensing, communication loss, and cyber-attacks, alongside the adoption of sensor fusion and probabilistic reasoning to enhance reliability in uncertain environments.
The transition from research to operational deployment is constrained by certification, interoperability, and integration barriers. Certification remains a significant hurdle, as regulatory bodies such as the EASA and FAA require assurance evidence, which is challenging to generate for adaptive, non-deterministic AI systems. Interoperability with UTM/U-space services presents another challenge: deconfliction algorithms must be able to consume standardized data streams (e.g., Remote ID, traffic information, and geo-awareness) and exchange resolution advisories in real-time. Furthermore, seamless integration into existing ATM frameworks requires not only technical interoperability but also alignment with established separation minima, contingency procedures, and human-in-the-loop operations.
Ethical and legal issues also shape the design requirements of AI-driven deconfliction systems. Accountability in the event of accidents is unclear when decisions are partially delegated to autonomous algorithms, raising questions about operator liability versus manufacturer responsibility. In urban environments, the widespread use of onboard sensing (cameras, microphones, and Radio Frequencies) introduces privacy concerns that must be addressed through data minimization, secure storage, and compliance with regulations such as the GDPR (General Data Protection Regulation). Ethical considerations further extend to fairness in airspace allocation and avoidance of bias in AI models that may disadvantage certain operators.
Despite the issues and challenges, the AI techniques demonstrate potential for application within the AI4HyDrop project, particularly in areas such as flight plan approval, route optimization, task distribution, and sensor data integration, all of which aim to improve drone deconfliction.

3.3. AI4HyDrop Approach to AI-Based Drone Deconfliction

The AI4HyDrop project is primarily focused on developing AI solutions to address drone deconfliction. This process involves ensuring safe separation between multiple drones operating in shared airspace and encompasses both strategic and tactical dimensions. Strategic deconfliction typically focuses on pre-flight or early in-flight planning, where the objective is to generate conflict-free routes before execution. In this context, reinforcement learning and bio-inspired learning algorithms are widely applied, as they are effective in large-scale optimization, airspace allocation, and trajectory planning problems. By contrast, tactical deconfliction addresses conflicts that emerge in real time due to uncertainties, dynamic environments, or non-cooperative traffic. Here, deep learning methods, particularly those based on computer vision and sensor fusion, are more prominent, as they enable rapid detection, intent prediction, and collision avoidance.
Across these categories, AI techniques exhibit complementary strengths. Reinforcement learning and bio-inspired learning demonstrate maturity in strategic planning by efficiently balancing fairness, efficiency, and environmental considerations in trajectory design. Meanwhile, deep learning approaches are particularly well-suited to the tactical domain, where the real-time processing of multimodal sensory data is essential for collision prediction and the generation of evasive maneuvers. This distinction suggests that hybrid frameworks, which combine strategic optimization with tactical responsiveness, offer a promising pathway for robust deconfliction in complex airspace environments.
Figure 4 illustrates our proposed intelligent control framework for managing drone deconfliction. In this system, each drone will have access to updated data about its own status and that of nearby drones. Additionally, low-cost sensors will provide continuous data on drone positions and velocities. The system will also incorporate operator input, such as predefined flight plans. Machine learning algorithms will then be used to fuse this sensor and operational data for effective conflict resolution. Rather than relying on a single technique, we intend to implement a hybrid AI model that integrates the various approaches identified in the literature review.
The proposed intelligent control framework for drone deconfliction has strong practical applicability in real-world operations. By combining onboard sensor fusion with cooperative data such as Remote ID and U-space traffic information services, the system can support both visual line-of-sight and beyond-visual-line-of-sight operations. Low-cost sensors such as cameras, microphones, or lightweight LiDARs provide continuous updates on nearby drones, while hybrid AI models with machine learning-based prediction enable conflict detection and resolution in real time.
From a regulatory perspective, the framework aligns well with European U-space regulations (EU 2021/664) that require network identification, geo-awareness, flight authorization, and traffic information services [97]. It can be positioned as an advisory tactical deconfliction aid, complementing certified detect-and-avoid systems such as ACAS Xu. Compliance with ISO 21384-3 for operational procedures and ASTM F3411 for Remote ID interoperability further strengthens regulatory compatibility [98].
Maintenance and updating costs are expected to be concentrated on the software and AI lifecycle. Hardware costs involve regular calibration and occasional replacement of low-cost sensors, while communications depend on cellular data subscriptions for U-space services. More significantly, model updates require continuous data collection, training, and verification, with costs driven by storage, annotation, and GPU training resources. To remain compliant with regulatory expectations, each update must undergo verification, regression testing, and configuration control; substantial changes may trigger reassessment.
Reliability and safety assurance are critical considerations. The framework should incorporate redundancy by integrating cooperative and non-cooperative sensing, while fallback behaviors, such as return-to-home or loiter, are triggered if confidence in conflict detection falls below specified thresholds. Performance should be measured through conflict detection accuracy, false alarm rates, latency, and robustness under degraded sensing or communication conditions. Additionally, cybersecurity and integrity safeguards, including secure updates, signed models, and anomaly detection, are crucial for establishing trust in real-world applications.

4. Conclusions

This study presents a review of recent research on AI-based drone deconfliction within the context of the AI4HyDrop project, highlighting a predominant emphasis on tactical rather than strategic solutions. Among the AI methods examined, deep learning emerged as the most applied, followed by reinforcement learning and bio-inspired techniques. Our analysis also reveals a gap in comprehensive reviews specifically focused on AI approaches to drone deconfliction, underscoring the novelty of this contribution. As part of this work, we propose a hybrid AI approach to enhance deconfliction strategies. This approach could combine deep learning, reinforcement learning, and bio-inspired methods, leveraging their complementary strengths in perception, adaptive decision-making, and resilient coordination to improve robustness in dynamic airspace scenarios.
Furthermore, future research could develop a taxonomy or roadmap of AI techniques tailored to specific operational contexts. Such integration and systematic mapping would not only address current limitations but also provide practical guidance for deploying AI-driven deconfliction systems in real-world UTM and ATM frameworks. Another promising research direction is to compare AI methods with rule-based or optimization-based approaches.
Finally, ethical considerations such as fairness, explainability, transparency, and privacy must be carefully addressed to ensure the responsible implementation of AI systems, particularly in areas related to data security and potential algorithmic bias.

Author Contributions

Conceptualization, N.R. and A.A.A.B.; methodology, N.R. and F.S.C.; formal analysis, F.S.C. and A.A.A.B.; investigation, N.R. and F.S.C.; resources, A.A.A.B.; writing—original draft preparation, N.R. and F.S.C.; writing—review and editing, A.A.A.B.; visualization, N.R.; supervision, A.A.A.B.; project administration, A.A.A.B.; funding acquisition, A.A.A.B. All authors have read and agreed to the published version of the manuscript.

Funding

The research is part of the AI4HyDrop project, supported by the SESAR 3 Joint Undertaking and its founding members, and co-funded by the EU’s research and innovation programme, Horizon Europe, under Grant Agreement No. 101114805.

Data Availability Statement

No new data were created in this study.

Acknowledgments

The authors would like to express their sincere gratitude to Xuan-Phuc Phan Nguyen for his valuable contributions to searching and summarizing some of the literature used in this research, as well as his contributions to the earlier conference paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Comparative Tables for Deep Learning

Table A1. CNN algorithms.
Ref. No. | Dataset | Performance Metrics | Application Constraints
[14] | Realistic vision data obtained with a self-developed simulator; test flights using two unmanned aerial vehicles. | Effectiveness in real-time detection and avoidance; ability to isolate moving aerial objects and classify them; detection of all incoming obstacles along their trajectories. | SWaP-constrained mini-UAVs; MATLAB Deep Learning Toolbox; ResNet-50; 150 layers; detection system cannot determine intruder range.
[15] | Datasets used in the field of visual SLAM and object detection. | Dynamic point removal, data association, point cloud segmentation, robustness, and accuracy. | General robotics and autonomous driving applications, with a focus on dynamic environments.
[16] | ColANet dataset (for transfer learning customization); DJI Tello drone camera (for real-time deployment). | Accuracy (validation accuracy, training accuracy), model size, inference time, and power consumption. | Drones with limited computing resources; real-time deployment on a DJI Tello drone; evaluated on an AMD EPYC processor, 1.7 TB RAM, and a Tesla A100 GPU.
[17] | ARD100 dataset (newly proposed; 100 videos, 202,467 frames, smallest average object size); NPS-Drones dataset; untrained Drone-vs-Bird dataset. | Average Precision (AP). | Detection of tiny objects; complex urban backgrounds; abrupt camera movement; low-light conditions; focus on tiny drones; potential for deployment on mobile platforms.
[18] | UAV-based imagery for wildlife. | Multi-objective loss functions (Wise IoU (WIoU) and Normalized Wasserstein Distance (NWD)); parameter count reduction. | Small object detection in UAV imagery; real-time processing and low latency are critical for UAV applications.
[19] | VisDrone2019 dataset; DOTA v1.0 dataset. | mAP@0.5; parameter count. | UAV remote sensing images; challenges with small object size, multi-scale variations, and background interference.
[20] | SIMD dataset; NWPU VHR-10 dataset. | Precision, accuracy. | Remote sensing applications.
[21] | Real-time UAV image processing pipeline data. | Inference speed (FPS), power consumption, GPU memory consumption, mean average precision (mAP@50), and end-to-end processing times. | Constrained computational edge devices (Jetson Orin NX, Raspberry Pi 5); real-time UAV image processing; drone applications.
[22] | Various datasets used across the YOLO series. | Speed, accuracy, and computational efficiency. | General object detection applications; focus on speed, accuracy, and computational efficiency.
[23] | Simulated forest environments. | Not explicitly named. | Real-time UAV obstacle detection; simulated forest environments.
[24] | UAVDT dataset; VisDrone dataset. | Detection mAP@0.5 scores; classification accuracies. | Nighttime operations; challenges with poor illumination, noise, occlusions, and dynamic lighting artifacts.
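Many of the CNN studies above build on YOLO-family detectors. As a purely illustrative sketch of such a pipeline, the Python listing below runs a pretrained YOLOv8 nano model on a single camera frame using the open-source ultralytics package; the weights file, frame path, and confidence threshold are placeholder assumptions, not settings reported in the cited works.

# pip install ultralytics
from ultralytics import YOLO

# Nano variant suits SWaP-constrained platforms; weights file is a placeholder.
model = YOLO("yolov8n.pt")

# Run inference on one onboard camera frame (placeholder path).
results = model.predict("frame.jpg", conf=0.25, imgsz=640)

for r in results:
    for box in r.boxes:
        cls_name = model.names[int(box.cls)]
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(f"{cls_name}: conf={float(box.conf):.2f} "
              f"bbox=({x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f})")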
Table A2. RNN algorithms.
Ref. No. | Dataset | Performance Metrics | Application Constraints
[25] | Historical flight data (simulated drone failure at Hannover Airport, Germany). | Not explicitly quantified. | U-space environment; BlueSky ATM simulator used for simulation; focus on system failure scenarios (hardware failure, broken communication).
[26] | Not explicitly named. | Not explicitly quantified. | Not explicitly explained.
[27] | Preprocessed Automatic Dependent Surveillance-Broadcast (ADS-B) dataset. | Prediction performance (especially for long-term trajectories); conflict detection results (conflicts detected within seconds). | Internet of Aerial Vehicles; Air-Ground Integrated Vehicle Networks (AGIVN); addresses limitations of traditional surveillance technology for intensive air traffic management (ATM).
[28] | Not explicitly named. | Not explicitly quantified. | Not explicitly explained.
[29] | ADS-B information. | Not explicitly quantified. | Relies on ADS-B information.
[30] | Real experimental drone flight data. | Comparison of prediction performance between Long Short-Term Memory (LSTM) and Nonlinear Autoregressive with Exogenous Inputs (NARX) models. | Enhancing UAV control capabilities; circumventing the complexity of traditional flight control systems; minimizing risk during actual flights (implying a simulation environment for testing).
[31] | Synthetic and real-world 3D UAV trajectory data (UZH-FPV, Mid-Air). | Mean Squared Error (MSE), ranging from 2 × 10⁻⁸ to 2 × 10⁻⁷. | Real-time 3D UAV trajectory prediction.
[32] | Hourly notional amounts traded on various cryptocurrency assets from Binance (1 January 2020–31 December 2022). | Root Mean Square Error (RMSE) as loss function; average R-squared (R²) as primary metric; TKAN achieved R² at least 25% higher than GRU for longer time steps and demonstrated better model stability. | Numerical prediction problem; compared against GRU and LSTM; focus on layer performance rather than full architecture; preprocessing involves two-stage scaling (moving median, MinMaxScaling).
[33] | UAV-collected urban traffic data (TUMDOT-MUC dataset). | R² score, RMSE, MSE (R²: 0.98; RMSE: 39.85; MSE: 1588.5); outperforms LSTM, GRU, and other ML/DL models in its context. | Urban environments; multiagent trajectory prediction; optimized with Particle Swarm Optimization (PSO) for hyperparameter tuning.
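Most RNN entries above frame deconfliction as trajectory prediction. The following minimal PyTorch sketch, with illustrative layer sizes and random toy data, shows the typical pattern: an LSTM encodes a window of past 3D positions and a linear head regresses the next position.

import torch
import torch.nn as nn

class TrajectoryLSTM(nn.Module):
    """Predicts the next 3D position from a window of past positions."""
    def __init__(self, input_dim=3, hidden_dim=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_dim, input_dim)

    def forward(self, x):               # x: (batch, window, 3)
        out, _ = self.lstm(x)           # out: (batch, window, hidden)
        return self.head(out[:, -1])    # position at t+1

# Toy usage: a batch of 8 windows, each with 20 past (x, y, z) fixes.
model = TrajectoryLSTM()
past = torch.randn(8, 20, 3)
next_pos = model(past)                  # (8, 3)
loss = nn.functional.mse_loss(next_pos, torch.randn(8, 3))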
Table A3. TNN algorithms.
Ref. No. | Dataset | Performance Metrics | Application Constraints
[34] | INTERACTION, highD, and CitySim datasets. | Prediction performance, computational costs, and inference latency; outperforms other state-of-the-art methods; achieved better prediction performance with lower inference latency. | Autonomous driving applications; need for interpretability and explainability; real-time decision-making for intelligent planning systems.
[35] | Authentic logistics datasets. | Path length, time efficiency, energy consumption; achieves a 15% reduction in travel distance, 20% boost in time efficiency, and 10% decrease in energy consumption. | Logistics robots; real-time responsiveness; interpretability; may incur higher computational costs for large-scale environments.
[36] | Simulated swarm-vs-swarm engagements; enriched dataset with variations in defender numbers, motions, and measurement noise levels. | Accuracy, noise robustness, and scalability to swarm size; predicts swarm behaviors with 97% accuracy (20 time steps); graceful degradation to 80% accuracy (50% noise); scalable to 10–100 agents. | Military contexts; real-time decision-making support; short observation windows (20 time steps).
[37] | DisasterEye (custom dataset: 2751 images across eight categories, sourced from UAVs and on-site individuals); DFAN; AIDER. | Accuracy, inference latency, memory usage, and performance speed; achieved high accuracy with lowered inference latency and memory use; 3–5× faster than the original models with nearly equal accuracy. | Resource-constrained UAV platforms (e.g., Jetson Nano); onboard processing; real-time performance; addresses privacy, connectivity, and latency issues in disaster-prone areas.
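The transformer models summarized above follow a similar encode-then-regress pattern, replacing recurrence with self-attention. The sketch below uses illustrative dimensions and omits positional encodings for brevity; it encodes a window of past agent states and predicts the next 2D position.

import torch
import torch.nn as nn

class TrajectoryTransformer(nn.Module):
    """Encodes a window of past agent states with self-attention,
    then regresses the next 2D position from the final token."""
    def __init__(self, state_dim=4, d_model=32, nhead=4, layers=2):
        super().__init__()
        self.embed = nn.Linear(state_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head = nn.Linear(d_model, 2)

    def forward(self, x):                  # x: (batch, window, state_dim)
        h = self.encoder(self.embed(x))    # contextualized sequence
        return self.head(h[:, -1])         # predicted (x, y) at t+1

model = TrajectoryTransformer()
past = torch.randn(8, 20, 4)               # 8 agents, 20 past (x, y, vx, vy) states
print(model(past).shape)                    # torch.Size([8, 2])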
Table A4. GNN algorithms.
Ref. No. | Dataset | Performance Metrics | Application Constraints
[38] | Not explicitly named. | Convergence; reduction in losses of separation (LoS); consistent avoidance of near mid-air collisions (NMACs). | Multiagent reinforcement learning (MARL) problem; cooperative agents; compound conflicts.
[39] | Not explicitly named. | Not explicitly quantified. | General GCRL framework.
[40] | Not explicitly named. | Not explicitly quantified. | Dense UAV networks; conflict resolution.
[41] | Not explicitly named. | Not explicitly quantified. | Multiagent cooperation.
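At the core of the GNN approaches above is graph-convolutional message passing over a conflict graph. The following NumPy sketch implements one propagation step with the symmetric normalization of Kipf and Welling [12] on a toy four-UAV conflict graph; the features and weights are random placeholders, not learned parameters.

import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy conflict graph: 4 UAVs, an edge marks a pairwise conflict.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(4, 5))   # per-UAV state features
W = np.random.default_rng(1).normal(size=(5, 8))   # weights (random here)
print(gcn_layer(A, H, W).shape)                    # (4, 8) node embeddings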
Table A5. LLM algorithms.
Ref. No. | Dataset | Performance Metrics | Application Constraints
[42] | Not explicitly named. | Not explicitly quantified. | Focus on suitability for UAV integration; identifying novel opportunities for LLM embedding within UAV frameworks.
[43] | Data from UAV-assisted sensor networks. | Minimizing the average Age of Information (AoI) across ground sensors; optimizing data collection schedules and velocities. | LLM-enabled In-Context Learning (ICL) for onboard flight resource allocation; UAV-assisted sensor networks.
[44] | Case study on data collection scheduling. | Significantly reduced packet loss compared to conventional approaches; mitigates potential jailbreaking vulnerabilities. | Deployment of LLMs at the network edge to reduce latency and preserve data privacy; suitable for real-time, mission-critical public safety UAVs.
[45] | 150,000 telemetry log entries. | Decision accuracy; cosine similarity (retrieved context vs. input data); BLEU score (linguistic similarity of generated commands to expert commands); response time (decision latency). | LLaMA 3.2 1B Instruct (lightweight model); quantized (INT8 precision) for edge deployment; cloud processing hub for centralized processing and scalability; resource-constrained edge hardware.
[46] | Evaluated across scenarios involving code generation and mission planning. | Qwen 2.5 excels in multi-step reasoning; Gemma 2 balances accuracy and latency; LLaMA 3.2 offers faster responses with lower logical coherence. | Integrates open-source LLMs (Qwen 2.5, Gemma 2, LLaMA 3.2) with UAV autopilot systems (ArduPilot stack); supports simulation and real-world deployment; operates offline via the Ollama runtime.
[47] | Simulated environment for testing swarm behavior. | Multi-critic consensus mechanism to evaluate trajectory quality; hierarchical prompt structuring for improved task execution. | Multimodal LLMs (interpret text and audio inputs); generates 3D waypoints for UAV movements.
[48] | Real-world SAR scenarios; tested on two different scenarios. | Time to complete missions; faster on average by 33.75% compared with an off-the-shelf autopilot and by 54.6% compared with a human pilot. | Multimodal system (LLM: ChatGPT-4o; VLM: quantized Molmo-7B-D BnB 4-bit model); onboard computer (e.g., OrangePi) for deployment.
[49] | Real-time sensing and environmental feedback data. | Enhanced performance and scalability of UAV systems. | MLLMs (LLMs, VFMs, VLMs); communication systems (WiFi, cellular, ground radio links, satellite communication); UAV platforms (multirotor, fixed-wing, rotary-wing, hybrid, flapping-wing).
[50] | Not explicitly named. | Not explicitly quantified. | State-of-the-art LLM technology, multimodal data resources for UAVs, and key tasks/application scenarios where UAVs and LLMs converge.
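Several of the LLM studies above run lightweight open models offline at the edge, for example via the Ollama runtime [46]. The sketch below shows, under the assumption of a locally served llama3.2 model and a hypothetical in-context-learning prompt, how a data-collection schedule in the spirit of [43] might be requested; the prompt contents and sensor ages are invented for illustration.

# pip install ollama  (assumes a local Ollama server with the model pulled)
import ollama

# Hypothetical ICL prompt: a solved example followed by the current state.
prompt = (
    "You schedule a UAV's sensor visits to minimize average Age of Information.\n"
    "Example: sensors with ages [30, 5, 12] s -> visit order: 1, 3, 2\n"
    "Now: sensors with ages [8, 41, 17] s -> visit order:"
)

response = ollama.chat(
    model="llama3.2",  # lightweight model suited to edge deployment
    messages=[{"role": "user", "content": prompt}],
)
print(response["message"]["content"])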

Appendix B. Comparative Tables for Reinforcement Learning

Table A6. Traditional Reinforcement Learning algorithms.
Ref. No. | Agent/Actions | Environment/States | Application Constraints
[55] | Agent guarantees minimum separation distance; exhibits centralized learning and distributed policy. Actions: (1) RL controlling heading and speed variation; (2) RL managing heading, speed, and altitude variation. | High traffic densities; rewards based on global information (cumulative losses of all aircraft). | Distributed conflict resolution.
[56] | The agent is the drone itself; utilizes a novel Estimated Time of Arrival (ETA)-based temporal reward system; baseline algorithm includes greedy search, delayed learning, and multi-step learning. | Restructured state space using the risk-sectors concept; tactical conflict resolution for air logistics transportation. | Air logistics transportation.
[57] | The agent's primary objective is to avoid obstacles within a defined buffer zone. | Airspace reservation with a risk-based operational safety bound; considers UAS performance, weather conditions, and uncertainties in UAS operations (including positioning errors); new reward function devised for reinforcement learning. | UAS conflict resolution.
[58] | Agent configures parameters for computing conflict resolution (CR) maneuvers using a geometric CR method; RL selected for its capacity to comprehend and execute complete action sequences. | CR scenarios that can lead to secondary conflicts; reward design adept at detecting emergent secondary conflicts. | Hybrid method combining geometric CR techniques with RL.
[59] | Agents learn to self-prioritize for conflict resolution. | Air traffic control (ATC) with limited instructions. | Air traffic control.
[60] | Distributed reinforcement learning; agents are UAVs in a swarm. | UAV swarm control. | Flexible and efficient UAV swarm control.
[61] | Deep reinforcement learning (DRL) agent for obstacle avoidance; actions in a continuous action space. | UAS obstacle avoidance. | Continuous action space.
[62] | Reinforcement learning with counterfactual credit assignment for multi-UAV collision avoidance. | Multi-UAV collision avoidance scenarios. | Multi-UAV systems.
[63] | DRL model with continuous state and action spaces; dynamically chooses a resolution strategy for pairs of UAVs; the second layer is a collaborative UAV collision avoidance model integrating a three-dimensional conflict detection and resolution pool. | Urban environments; multiple UAVs; two-layer resolution framework involving speed adjustments and rerouting strategies. | Urban environments; multiple UAVs.
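The traditional RL entries above share a common skeleton: a discretized conflict state, a small set of avoidance maneuvers, and a temporal-difference update. The sketch below implements tabular Q-learning with an epsilon-greedy policy; the state discretization, heading-change actions, reward, and hyperparameters are illustrative assumptions rather than values from the cited studies.

import numpy as np

# Illustrative discretization: 12 relative-bearing sectors x 5 range bins.
N_STATES = 12 * 5
ACTIONS = [-20, -10, 0, 10, 20]        # commanded heading change (degrees)

Q = np.zeros((N_STATES, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.95, 0.1     # learning rate, discount, exploration
rng = np.random.default_rng(0)

def select_action(s):
    """Epsilon-greedy over the discrete heading-change actions."""
    if rng.random() < eps:
        return int(rng.integers(len(ACTIONS)))
    return int(Q[s].argmax())

def q_update(s, a, reward, s_next):
    """One temporal-difference backup; the reward would, e.g., penalize
    losses of separation and large deviations from the flight plan."""
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

# Toy interaction: one transition with an invented penalty.
s = 0
a = select_action(s)
q_update(s, a, reward=-1.0, s_next=5)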
Table A7. Multi-agent Reinforcement Learning algorithms.
Ref. No. | Agent/Actions | Environment/States | Application Constraints
[38] | Cooperative agents (UAVs) using a Graph Convolutional Reinforcement Learning (DGN) model; actions: jointly generate resolution maneuvers. | Multi-UAV conflicts conceptualized as compound ecosystems; air traffic modeled as a graph (UAVs as nodes, a conflict creates an edge); scenarios: 3-agent and 4-agent. | Multiagent reinforcement learning (MARL) problem; cooperative agents; compound conflicts.
[64] | Multiagent approach overseeing individual aircraft behavior; actions: learning-based tactical deconfliction. | Urban Air Mobility (UAM) environment; strategic conflict management with demand-capacity balancing (DCB). | Integrated conflict management for UAM.
[65] | Agents are flights with conflict status; actions: ground delay, speed adjustment, flight cancelation (selected autonomously by recurrent actor-critic networks). | Dynamic UAM environments considering uncertainties; strategic conflict management. | Urban Air Mobility operations; addresses challenges related to DCB, separation conflicts, and block unavailability caused by wind turbulence.
[66] | UAVs in a large-scale swarm; actions: path planning based on mean-field reinforcement learning. | Large-scale UAV swarm. | Large-scale UAV swarms.
[67] | Multiagent reinforcement learning; agents are UAVs in a mission-oriented drone network; actions: collaborative execution, energy-aware decisions. | Mission-oriented drone networks. | Energy-aware collaborative execution.
[68] | Multi-UAVs; actions: cooperative search for moving targets. | 3D scenarios with moving targets. | Multi-UAV systems; 3D scenarios.
[69] | Agents are UAVs in a swarm; actions: confrontation/combat maneuvers. | UAV swarm confrontation. | UAV swarm confrontation.
[70] | Multi-UAVs using a leader-follower strategy within a hierarchical reinforcement learning framework; actions: combat maneuvers. | Multi-UAV combat scenarios. | Multi-UAV combat.
Table A8. Deep Reinforcement Learning algorithms.
Ref. No. | Agent/Actions | Environment/States | Application Constraints
[71] | DRL-based algorithms; actions: prevent collisions while optimizing energy consumption; operate without pre-existing knowledge. | UAV environment; challenging setting with numerous UAVs moving randomly in a confined area without correlation. | Deployment onboard the UAV or at Multi-Access Edge Computing (MEC); diverse environments.
[72] | Host drone using a Double Deep Q-Network (DDQN) framework; actions: generate conflict-free maneuvers at each time step. | Urban airspace for UAV operations; tactical conflict resolution framed as a sequential decision-making problem. | Urban airspace; unmanned aerial vehicles (UAVs).
[73] | Multi-UAV formation control; actions: maintain formation; avoid static and dynamic obstacles. | Multi-UAV formation control scenarios with static and dynamic obstacles. | Multi-UAV systems.
[74] | Swarm UAVs; actions: intuitive and immersive control of UAV swarm formation; multi-UAV collision avoidance. | Swarm formation of UAVs; augmented reality (AR) human-drone interaction. | DroneARchery system; haptic interface (LinkGlide) for tactile sensation.
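A recurring ingredient in the DRL entries, notably the DDQN framework of [72], is the double-Q bootstrap target, in which the online network selects the next action and the target network evaluates it. The PyTorch sketch below uses toy dimensions and random networks purely to illustrate the target computation.

import torch
import torch.nn as nn

def ddqn_target(online, target, reward, next_state, done, gamma=0.99):
    """Double-DQN bootstrap target: the online net selects the action,
    the target net evaluates it, reducing Q-value overestimation."""
    with torch.no_grad():
        best_a = online(next_state).argmax(dim=1, keepdim=True)    # selection
        q_next = target(next_state).gather(1, best_a).squeeze(1)   # evaluation
        return reward + gamma * (1.0 - done) * q_next

# Toy networks over a 6-dim conflict state and 5 discrete maneuvers.
make_net = lambda: nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 5))
online, target = make_net(), make_net()
s_next = torch.randn(32, 6)
y = ddqn_target(online, target, torch.zeros(32), s_next, torch.zeros(32))
print(y.shape)  # torch.Size([32])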

Appendix C. Comparative Tables for Bio-Inspired Learning

Table A9. Swarm Intelligence algorithms.
Ref. No. | Algorithm | Performance Metrics | Application Constraints
[78] | Drone swarm strategy for detection and tracking. | Not specified. | Complex environments; occluded targets.
[79] | Particle Swarm Optimization (PSO) with Artificial Potential Field (APF); extensive simulation experiments. | Energy saving, average tracking error, and task time. | 3D space; UAV swarms; LiDAR for obstacle detection; sharing environmental data within the swarm.
[80] | Inspired by bird flocking and fish schooling; simulation datasets. | Total number of collisions with obstacles (NoCs), duration time (DT), path length, and energy efficiency. | IoD swarms; limited detection range; multiple static and dynamic obstacles; 3D dynamic environment; battery constraints.
[81] | Inspired by the foraging behavior of ant colonies; dataset of three-dimensional models of real structures. | Path length. | Multiple UAVs; cooperative inspection tasks.
[82] | Inspired by the flashing behavior of fireflies. | Shortest collision-free path length; ability to converge; ability to discover optimal solutions. | Mobile robot path planning; 2D space (static obstacles).
[83] | Inspired by the social hierarchy and cooperative hunting behavior of grey wolves. | Optimization performance on benchmark functions; path length (m). | UAV path planning; obstacle-laden environments.
[84] | Inspired by the foraging behavior of honeybees; simulation datasets. | Effectiveness for epistasis detection (for SFMOABC); convergence capabilities; susceptibility to local optima. | Multi-drone path planning; addresses drawbacks of traditional ant colony algorithms.
[85] | Multi-swarm discrete quantum-inspired particle swarm optimization with adaptive simulated annealing. | Localization accuracy, computational efficiency, and adaptability to varying UAV/target scales. | Complex 3D environments; multi-target localization; resource-constrained collaborative localization tasks.
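To make the swarm-intelligence pattern concrete, the NumPy sketch below applies standard PSO to a toy 2D waypoint-placement problem with a single circular no-fly zone penalized in the cost. All coordinates, penalty weights, and PSO coefficients are illustrative assumptions, and the penalty checks waypoints only, not full path segments.

import numpy as np

rng = np.random.default_rng(0)
START, GOAL = np.array([0.0, 0.0]), np.array([100.0, 100.0])
OBSTACLE, R = np.array([50.0, 50.0]), 15.0       # circular no-fly zone
N_WP, N_PARTICLES, ITERS = 3, 30, 200

def cost(flat):
    """Path length plus a penalty for waypoints inside the no-fly zone."""
    wps = np.vstack([START, flat.reshape(N_WP, 2), GOAL])
    length = np.sum(np.linalg.norm(np.diff(wps, axis=0), axis=1))
    pen = np.sum(np.maximum(0, R - np.linalg.norm(wps - OBSTACLE, axis=1))) * 50
    return length + pen

pos = rng.uniform(0, 100, (N_PARTICLES, N_WP * 2))
vel = np.zeros_like(pos)
pbest, pbest_cost = pos.copy(), np.array([cost(p) for p in pos])
gbest = pbest[pbest_cost.argmin()].copy()

w, c1, c2 = 0.7, 1.5, 1.5                        # inertia, cognitive, social
for _ in range(ITERS):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos += vel
    costs = np.array([cost(p) for p in pos])
    improved = costs < pbest_cost
    pbest[improved], pbest_cost[improved] = pos[improved], costs[improved]
    gbest = pbest[pbest_cost.argmin()].copy()

print("best waypoints:", gbest.reshape(N_WP, 2))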
Table A10. Evolutionary algorithms.
Ref. No. | Algorithm | Performance Metrics | Application Constraints
[86] | GA with a seeded initial population using an ACO path, Voronoi vertices, and clustering/collision-center seeding. | ≥70% fewer objective evaluations with the best seeding; faster convergence vs. a plain GA. | 3D terrain; target coverage (visiting checkpoints); terrain collision avoidance.
[87] | Co-evolutionary multi-UAV cooperative path planning (one sub-population per UAV); penalty terms for constraints; two information-sharing strategies. | Compared against two evolutionary baselines on cost (length + threat), constraint satisfaction, and efficiency; effective in complex rendezvous scenarios. | Known environment with a threat map; multi-UAV non-collision and time coordination enforced.
[88] | α-RACER: learns a near-potential function offline and maximizes it online for an approximate Nash equilibrium; nonlinear bicycle model with Pacejka tires; MPC-style policy with overtake/block maneuvers. | Wins most races vs. baselines; near-potential gap ≤10% (median ~2%); Nash regret ≤3%; computation comparable to tuned IBR, aimed at real time. | Simulated track; three cars; full state; proximity slow-down for collision handling; bounded throttle/steering; discrete time steps.
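The evolutionary entries above all follow the selection-crossover-mutation loop. The sketch below shows a bare-bones genetic algorithm minimizing waypoint path length (truncation selection, one-point crossover, Gaussian mutation); unlike [86], the initial population here is random rather than seeded, and the problem setup is a toy assumption with no obstacles.

import numpy as np

rng = np.random.default_rng(1)
START, GOAL, N_WP = np.array([0.0, 0.0]), np.array([100.0, 100.0]), 3
POP, GENS, MUT_STD = 40, 150, 5.0

def path_length(genome):
    """Total length of the polyline START -> waypoints -> GOAL."""
    wps = np.vstack([START, genome.reshape(N_WP, 2), GOAL])
    return np.sum(np.linalg.norm(np.diff(wps, axis=0), axis=1))

pop = rng.uniform(0, 100, (POP, N_WP * 2))
for _ in range(GENS):
    fit = np.array([path_length(g) for g in pop])
    parents = pop[np.argsort(fit)[:POP // 2]]        # truncation selection
    # One-point crossover between random parent pairs.
    idx = rng.integers(len(parents), size=(POP - len(parents), 2))
    cut = rng.integers(1, N_WP * 2, size=len(idx))
    children = np.array([np.concatenate([parents[i][:c], parents[j][c:]])
                         for (i, j), c in zip(idx, cut)])
    children += rng.normal(0, MUT_STD, children.shape)  # Gaussian mutation
    pop = np.vstack([parents, children])

best = pop[np.argmin([path_length(g) for g in pop])]
print("best path length:", path_length(best))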
Table A11. Neural-Inspired Algorithms.
Ref. No. | Algorithm | Performance Metrics | Application Constraints
[89] | Reward-modulated SNN for swarm collision avoidance. | Collision rate, success rate, inter-drone distance. | Decentralized; onboard SNN hardware; local sensing only.
[90] | Event-based vision + neuromorphic planning on a Bebop2. | Latency, navigation accuracy, FPS, success rate. | Indoor; Bebop2 hardware limits.
[91] | Neuromorphic digital twin for multi-UAV indoor control. | Tracking error, coordination success, and latency. | Indoor GPS-denied environments; communication stability.
[92] | Neuromorphic SNN attitude estimator and controller. | Estimation error, stability, and power use. | Small UAVs; neuromorphic IMU needed.
[93] | SNN for obstacle avoidance with GA optimization. | Success rate, energy, latency. | SpiNNaker hardware; simulated sensing.
[94] | SNN + DRL (actor-critic). | Rewards, sample efficiency, and task completion. | Robotics simulation; high training costs.
[95] | LGMD binocular model. | Accuracy, FPR/FNR, reaction time. | Stereo cameras; vision-based.
[96] | BrainCog large-scale SNN cognitive engine. | Accuracy, adaptability, energy efficiency. | General platform; scaling cost.
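The neuromorphic approaches above are built from spiking units such as the leaky integrate-and-fire (LIF) neuron. The following self-contained sketch simulates a single LIF neuron with explicit Euler integration; the time constant, threshold, and drive current are illustrative values, not parameters from the cited works.

import numpy as np

def lif_neuron(input_current, dt=1e-3, tau=0.02, v_rest=0.0,
               v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire: returns the membrane trace and spike times."""
    v, trace, spikes = v_rest, [], []
    for t, i_t in enumerate(input_current):
        # Euler step of dv/dt = (-(v - v_rest) + i) / tau
        v += dt * (-(v - v_rest) + i_t) / tau
        if v >= v_thresh:              # emit a spike and reset
            spikes.append(t * dt)
            v = v_reset
        trace.append(v)
    return np.array(trace), spikes

# A constant supra-threshold drive produces a regular spike train.
trace, spikes = lif_neuron(np.full(1000, 1.5))
print(f"{len(spikes)} spikes in 1 s")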

References

  1. Pavithra, S.; Kachroo, D.; Kadam, V.; Padala, H.; Purbey, R. Drone-Based Weed and Disease Detection in Agricultural Fields to Maximize Crop Health Using a Yolov8 Approach. In Proceedings of the 2023 IEEE 7th Conference on Information and Communication Technology, CICT 2023, Jabalpur, India, 15–17 December 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
  2. Bernardo, R.M.; da Silva, L.C.B.; Rosa, P.F.F. UAV Embedded Real-Time Object Detection by a DCNN Model Trained on Synthetic Dataset. In Proceedings of the 2023 International Conference on Unmanned Aircraft Systems, ICUAS 2023, Warsaw, Poland, 6–9 June 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023; pp. 580–585. [Google Scholar] [CrossRef]
  3. Chour, K.; Pradeep, P.; Munishkin, A.A.; Kalyanam, K.M. Aerial Vehicle Routing and Scheduling for UAS Traffic Management: A Hybrid Monte Carlo Tree Search Approach. In Proceedings of the AIAA/IEEE Digital Avionics Systems Conference-Proceedings, Barcelona, Spain, 1–5 October 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
  4. Bilgin, Z.; Bronz, M.; Yavrucuk, I. Automatic in Flight Conflict Resolution for Urban Air Mobility using Fluid Flow Vector Field based Guidance Algorithm. In Proceedings of the AIAA/IEEE Digital Avionics Systems Conference-Proceedings, Barcelona, Spain, 1–5 October 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
  5. Pourjabar, M.; Rusci, M.; Bompani, L.; Lamberti, L.; Niculescu, V.; Palossi, D.; Benini, L. Multi-sensory Anti-collision Design for Autonomous Nano-swarm Exploration. In Proceedings of the ICECS 2023-2023 30th IEEE International Conference on Electronics, Circuits and Systems: Technosapiens for Saving Humanity, Istanbul, Turkiye, 4–7 December 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
  6. SESAR 3 JU, “AI4HyDrop”. Available online: https://ai4hydrop.eu/ (accessed on 11 April 2024).
  7. Pati, D.; Lorusso, L.N. How to Write a Systematic Review of the Literature. Health Environ. Res. Des. J. 2018, 11, 15–30. [Google Scholar] [CrossRef]
  8. Lin, C.; Han, G.; Wu, Q.; Wang, B.; Zhuang, J.; Li, W.; Hao, Z.; Fan, Z. Improving Generalization in Collision Avoidance for Multiple Unmanned Aerial Vehicles via Causal Representation Learning. Sensors 2025, 25, 3303. [Google Scholar] [CrossRef]
  9. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  10. Koul, P. A review of machine learning applications in aviation engineering. Adv. Mech. Mater. Eng. 2025, 42, 16–40. [Google Scholar] [CrossRef]
  11. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  12. Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
  13. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Amodei, D. Language Models are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
  14. Opromolla, R.; Fasano, G. Visual-based obstacle detection and tracking, and conflict detection for small UAS sense and avoid. Aerosp. Sci. Technol. 2021, 119, 107167. [Google Scholar] [CrossRef]
  15. Peng, J.; Chen, D.; Yang, Q.; Yang, C.; Xu, Y.; Qin, Y. Visual SLAM Based on Object Detection Network: A Review. Comput. Mater. Contin. 2023, 77, 3209–3236. [Google Scholar] [CrossRef]
  16. Zufar, R.N.; Banjerdpongchai, D. Selection of Lightweight CNN Models with Limited Computing Resources for Drone Collision Prediction. ECTI Trans. Electr. Eng. Electron. Commun. 2024, 22. [Google Scholar] [CrossRef]
  17. Guo, H.; Lin, X.; Zhao, S. YOLOMG: Vision-based Drone-to-Drone Detection with Appearance and Pixel-Level Motion Fusion. arXiv 2025, arXiv:2503.07115. [Google Scholar]
  18. Naidu, A.P.; Gosalia, H.; Gakhar, I.; Rathore, S.S.; Didwania, K.; Verma, U. DEAL-YOLO: Drone-based Efficient Animal Localization using YOLO. arXiv 2025, arXiv:2503.04698. [Google Scholar]
  19. Wu, Y.; Mu, X.; Shi, H.; Hou, M. An object detection model AAPW-YOLO for UAV remote sensing images based on adaptive convolution and reconstructed feature fusion. Sci. Rep. 2025, 15, 1–20. [Google Scholar] [CrossRef]
  20. Wu, T.; Dong, Y. YOLO-SE: Improved YOLOv8 for Remote Sensing Object Detection and Recognition. Appl. Sci. 2023, 13, 12977. [Google Scholar] [CrossRef]
  21. Rey, L.; Bernardos, A.M.; Dobrzycki, A.D.; Carramiñana, D.; Bergesio, L.; Besada, J.A.; Casar, J.R. A Performance Analysis of You Only Look Once Models for Deployment on Constrained Computational Edge Devices in Drone Applications. Electronics 2025, 14, 638. [Google Scholar] [CrossRef]
  22. Sapkota, R.; Flores-Calero, M.; Qureshi, R.; Badgujar, C.; Nepal, U.; Poulose, A.; Zeno, P.; Vaddevolu, U.B.P.; Khan, S.; Shoman, M.; et al. YOLO advances to its genesis: A decadal and comprehensive review of the You Only Look Once (YOLO) series. Artif. Intell. Rev. 2024, 58, 1–83. [Google Scholar] [CrossRef]
  23. Partheepan, S.; Sanati, F.; Hassan, J. Evaluating YOLO Variants with Transfer Learning for Real-Time UAV Obstacle Detection in Simulated Forest Environments. IEEE Access 2025, 13, 99266–99290. [Google Scholar] [CrossRef]
  24. Alazeb, A.; Hanzla, M.; Al Mudawi, N.; Alshehri, M.; Alhasson, H.F.; AlHammadi, D.A.; Jalal, A. Nighttime Intelligent UAV-Based Vehicle Detection and Classification Using YOLOv10 and Swin Transformer. Comput. Mater. Contin. 2025, 84, 4677–4697. [Google Scholar] [CrossRef]
  25. Komatsu, R.; Bechina, A.A.A.; Güldal, S.; Şaşmaz, M. Machine Learning Attempt to Conflict Detection for UAV with System Failure in U-Space: Recurrent Neural Network, RNN. In Proceedings of the 2022 International Conference on Unmanned Aircraft Systems, ICUAS 2022, Dubrovnik, Croatia, 21–24 June 2022; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2022; pp. 78–85. [Google Scholar] [CrossRef]
  26. Nguyen, X.-P.P.; Ruseno, N.; Chagas, F.S.; Bechina, A.A.A. A Survey of AI-based Models for UAVs’ Intelligent Control for Deconfliction. In Proceedings of the 10th 2024 International Conference on Control, Decision and Information Technologies, CoDIT, Vallette, Malta, 1–4 July 2024; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2024; pp. 2705–2710. [Google Scholar] [CrossRef]
  27. Cheng, C.; Guo, L.; Wu, T.; Sun, J.; Gui, G.; Adebisi, B.; Gacanin, H.; Sari, H. Machine-Learning-Aided Trajectory Prediction and Conflict Detection for Internet of Aerial Vehicles. IEEE Internet Things J. 2022, 9, 5882–5894. [Google Scholar] [CrossRef]
  28. Olive, X.; Sun, J.; Murça, M.C.R.; Krauth, T. A Framework to Evaluate Aircraft Trajectory Generation Methods. In Proceedings of the Fourteenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2021), Virtual, 20–23 September 2021. [Google Scholar]
  29. Zhang, Y.; Jia, Z.; Dong, C.; Liu, Y.; Zhang, L.; Wu, Q. Recurrent LSTM-based UAV Trajectory Prediction with ADS-B Information. In Proceedings of the GLOBECOM 2022-2022 IEEE Global Communications Conference, Rio de Janeiro, Brazil, 4–8 December 2022. [Google Scholar]
  30. Dong, S.; Das, S.; Townley, S. Drone motion prediction from flight data: A nonlinear time series approach. Syst. Sci. Control Eng. 2024, 12, 2409098. [Google Scholar] [CrossRef]
  31. Nacar, O.; Abdelkader, M.; Ghouti, L.; Gabr, K.; Al-Batati, A.; Koubaa, A. VECTOR: Velocity-Enhanced GRU Neural Network for Real-Time 3D UAV Trajectory Prediction. Drones 2025, 9, 8. [Google Scholar] [CrossRef]
  32. Genet, R.; Inzirillo, H. TKAN: Temporal Kolmogorov-Arnold Networks. arXiv 2024, arXiv:2405.07344. [Google Scholar] [CrossRef]
  33. Mohebbi, M.; Kafash, E.; Döller, M. Multiagent Trajectory Prediction for Urban Environments with UAV Data Using Enhanced Temporal Kolmogorov-Arnold Networks with Particle Swarm Optimization. In Proceedings of the International Conference on Agents and Artificial Intelligence, Porto, Portugal, 23–25 February 2025; Science and Technology Publications, Lda: Setúbal, Portugal, 2025; pp. 586–597. [Google Scholar] [CrossRef]
  34. Huang, S.; Ye, L.; Chen, M.; Luo, W.; Wang, D.; Xu, C.; Liang, D. Interpretable Interaction Modeling for Trajectory Prediction via Agent Selection and Physical Coefficient. arXiv 2024, arXiv:2405.13152. [Google Scholar] [CrossRef]
  35. Luo, H.; Wei, J.; Zhao, S.; Liang, A.; Xu, Z.; Jiang, R. Intelligent logistics management robot path planning algorithm integrating transformer and GCN network. arXiv 2025, arXiv:2501.02749. [Google Scholar] [CrossRef]
  36. Peltier, D.W.; Kaminer, I.; Clark, A.H.; Orescanin, M. Swarm Characteristics Classification Using Neural Networks. IEEE Trans. Aerosp. Electron. Syst. 2024, 61, 389–400. [Google Scholar] [CrossRef]
  37. Jankovic, B.; Jangirova, S.; Ullah, W.; Khan, L.U.; Guizani, M. UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model. arXiv 2025, arXiv:2501.12087. [Google Scholar] [CrossRef]
  38. Isufaj, R.; Omeri, M.; Piera, M.A. Multi-UAV Conflict Resolution with Graph Convolutional Reinforcement Learning. Appl. Sci. 2022, 12, 610. [Google Scholar] [CrossRef]
  39. Jiang, J.; Dun, C.; Huang, T.; Lu, Z. Graph Convolutional Reinforcement Learning. arXiv 2018, arXiv:1810.09202. [Google Scholar] [CrossRef]
  40. Li, Y.; Li, J.; Wang, J.; Zhang, X.; Ding, H.; Du, W. Multi-Scale Graph Enhanced Reinforcement Learning for Conflict Resolution in Dense UAV Networks. IEEE Internet Things J. 2025, 1. [Google Scholar] [CrossRef]
  41. Elrod, M.; Mehrabi, N.; Amin, R.; Kaur, M.; Cheng, L.; Martin, J.; Razi, A. Graph Based Deep Reinforcement Learning Aided by Transformers for Multiagent Cooperation. arXiv 2025, arXiv:2504.08195. [Google Scholar] [CrossRef]
  42. Javaid, S.; Fahim, H.; He, B.; Saeed, N. Large Language Models for UAVs: Current State and Pathways to the Future. IEEE Open J. Veh. Technol. 2024, 5, 1166–1192. [Google Scholar] [CrossRef]
  43. Emami, Y.; Zhou, H.; Nabavirazani, S.; Almeida, L. LLM-Enabled In-Context Learning for Data Collection Scheduling in UAV-assisted Sensor Networks. arXiv 2025, arXiv:2504.14556. [Google Scholar] [CrossRef]
  44. Emami, Y.; Zhou, H.; Gaitan, M.G.; Li, K.; Almeida, L.; Han, Z. From Prompts to Protection: Large Language Model-Enabled In-Context Learning for Smart Public Safety UAV. arXiv 2025, arXiv:2506.02649. [Google Scholar] [CrossRef]
  45. Sezgin, A. Scenario-Driven Evaluation of Autonomous Agents: Integrating Large Language Model for UAV Mission Reliability. Drones 2025, 9, 213. [Google Scholar] [CrossRef]
  46. Nunes, D.; Amorim, R.; Ribeiro, P.; Coelho, A.; Campos, R. A Framework Leveraging Large Language Models for Autonomous UAV Control in Flying Networks. arXiv 2025, arXiv:2506.04404. [Google Scholar] [CrossRef]
  47. Aikins, G.; Dao, M.P.; Moukpe, K.J.; Eskridge, T.C.; Nguyen, K.-D. LEVIOSA: Natural Language-Based Uncrewed Aerial Vehicle Trajectory Generation. Electronics 2024, 13, 4508. [Google Scholar] [CrossRef]
  48. Yaqoot, Y.; Mustafa, M.A.; Sautenkov, O.; Lykov, A.; Serpiva, V.; Tsetserukou, D. UAV-VLRR: Vision-Language Informed NMPC for Rapid Response in UAV Search and Rescue. arXiv 2025, arXiv:2503.02465. [Google Scholar] [CrossRef]
  49. Ping, Y.; Liang, T.; Ding, H.; Lei, G.; Wu, J.; Zou, X.; Zhang, T. Multimodal Large Language Models-Enabled UAV Swarm: Towards Efficient and Intelligent Autonomous Aerial Systems. arXiv 2025, arXiv:2506.12710. [Google Scholar] [CrossRef]
  50. Tian, Y.; Lin, F.; Li, Y.; Zhang, T.; Zhang, Q.; Fu, X.; Huang, J.; Dai, X.; Wang, Y.; Tian, C.; et al. UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility. Inf. Fusion 2025, 122, 103158. [Google Scholar] [CrossRef]
  51. Shakya, A.K.; Pillai, G.; Chakrabarty, S. Reinforcement learning algorithms: A brief survey. Expert Syst. Appl. 2023, 231, 120495. [Google Scholar] [CrossRef]
  52. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533. [Google Scholar] [CrossRef] [PubMed]
  53. Amodu, O.A.; Althumali, H.; Hanapi, Z.M.; Jarray, C.; Mahmood, R.A.R.; Adam, M.S.; Bukar, U.A.; Abdullah, N.F.; Luong, N.C. A Comprehensive Survey of Deep Reinforcement Learning in UAV-Assisted IoT Data Collection. Veh. Commun. 2025, 55, 100949. [Google Scholar] [CrossRef]
  54. Abdalla, A.S.; Marojevic, V. Machine Learning-Assisted UAV Operations with UTM: Requirements, Challenges, and Solutions. arXiv 2020. [Google Scholar] [CrossRef]
  55. Ribeiro, M.; Ellerbroek, J.; Hoekstra, J. Distributed Conflict Resolution at High Traffic Densities with Reinforcement Learning. Aerospace 2022, 9, 472. [Google Scholar] [CrossRef]
  56. Li, C.; Gu, W.; Zheng, Y.; Huang, L.; Zhang, X. An ETA-Based Tactical Conflict Resolution Method for Air Logistics Transportation. Drones 2023, 7, 334. [Google Scholar] [CrossRef]
  57. Hu, J.; Liu, Y.; Tyagi, A.; Wieland, F.; Toussaint, S.; Luxhoj, J.T.; Maroney, D.; Lacher, A.; Erzberger, H.; Goebel, K.; et al. Uas conflict resolution integrating a risk-based operational safety bound as airspace reservation with reinforcement learning. In Proceedings of the AIAA Scitech 2020 Forum, Orlando, FL, USA, 6–10 January 2020; American Institute of Aeronautics and Astronautics Inc., AIAA: Reston, VA, USA, 2020; pp. 1–10. [Google Scholar] [CrossRef]
  58. Ribeiro, M.; Ellerbroek, J.; Hoekstra, J. Improving Algorithm Conflict Resolution Manoeuvres with Reinforcement Learning. Aerospace 2022, 9, 847. [Google Scholar] [CrossRef]
  59. Nilsson, J.; Unger, J.; Eilertsen, G. Self-Prioritizing Multiagent Reinforcement Learning for Conflict Resolution in Air Traffic Control with Limited Instructions. Aerospace 2025, 12, 88. [Google Scholar] [CrossRef]
  60. Venturini, F.; Mason, F.; Pase, F.; Chiariotti, F.; Testolin, A.; Zanella, A.; Zorzi, M. Distributed Reinforcement Learning for Flexible and Efficient UAV Swarm Control. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 955–969. [Google Scholar] [CrossRef]
  61. Hu, J.; Yang, X.; Wang, W.; Wei, P.; Ying, L.; Liu, Y. Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning. IEEE Access 2022, 10, 90623–90634. [Google Scholar] [CrossRef]
  62. Huang, S.; Zhang, H.; Huang, Z. Multi-UAV Collision Avoidance using Multiagent Reinforcement Learning with Counterfactual Credit Assignment. arXiv 2022. [Google Scholar] [CrossRef]
  63. Zhang, J.; Zhang, H.; Zhou, J.; Hua, M.; Zhong, G.; Liu, H. Adaptive Collision Avoidance for Multiple UAVs in Urban Environments. Drones 2023, 7, 491. [Google Scholar] [CrossRef]
  64. Chen, S.; Evans, A.D.; Brittain, M.; Wei, P. Integrated Conflict Management for UAM with Strategic Demand Capacity Balancing and Learning-based Tactical Deconfliction. IEEE Trans. Intell. Transp. Syst. 2024, 25, 10049–10061. [Google Scholar] [CrossRef]
  65. Huang, C.; Petrunin, I.; Tsourdos, A. Strategic Conflict Management using Recurrent Multiagent Reinforcement Learning for Urban Air Mobility Operations Considering Uncertainties. J. Intell. Robot. Syst. Theory Appl. 2023, 107, 21. [Google Scholar] [CrossRef]
  66. Zhang, Y.; Ding, M.; Yuan, Y.; Zhang, J.; Yang, Q.; Shi, G.; Jiang, J. Large-scale UAV swarm path planning based on mean-field reinforcement learning. Chin. J. Aeronaut. 2025, 38, 103484. [Google Scholar] [CrossRef]
  67. Li, Y.; Li, C.; Chen, J.; Roinou, C. Energy-Aware Multiagent Reinforcement Learning for Collaborative Execution in Mission-Oriented Drone Networks. In Proceedings of the 2022 International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, USA, 25–28 July 2022. [Google Scholar] [CrossRef]
  68. Liu, Y.; Li, X.; Wang, J.; Wei, F.; Yang, J. Reinforcement-Learning-Based Multi-UAV Cooperative Search for Moving Targets in 3D Scenarios. Drones 2024, 8, 378. [Google Scholar] [CrossRef]
  69. Wang, B.; Li, S.; Gao, X.; Xie, T. UAV Swarm Confrontation Using Hierarchical Multiagent Reinforcement Learning. Int. J. Aerosp. Eng. 2021, 2021, 1–12. [Google Scholar] [CrossRef]
  70. Pang, J.; He, J.; Mohamed, N.M.A.A.; Lin, C.; Zhang, Z.; Hao, X. A Hierarchical Reinforcement Learning Framework for Multi-UAV Combat Using Leader-Follower Strategy. Knowledge-Based Syst. 2025, 316, 113387. [Google Scholar] [CrossRef]
  71. Ouahouah, S.; Bagaa, M.; Prados-Garzon, J.; Taleb, T. Deep-Reinforcement-Learning-Based Collision Avoidance in UAV Environment. IEEE Internet Things J. 2022, 9, 4015–4030. [Google Scholar] [CrossRef]
  72. Zhang, M.; Yan, C.; Dai, W.; Xiang, X.; Low, K.H. Tactical conflict resolution in urban airspace for unmanned aerial vehicles operations using attention-based deep reinforcement learning. Green Energy Intell. Transp. 2023, 2, 100107. [Google Scholar] [CrossRef]
  73. Xie, Y.; Yu, C.; Zang, H.; Gao, F.; Tang, W.; Huang, J.; Wang, Y. Multi-UAV Formation Control with Static and Dynamic Obstacle Avoidance via Reinforcement Learning. arXiv 2024. [Google Scholar] [CrossRef]
  74. Dorzhieva, E.; Baza, A.; Gupta, A.; Fedoseev, A.; Cabrera, M.A.; Karmanova, E.; Tsetserukou, D. DroneARchery: Human-Drone Interaction through Augmented Reality with Haptic Feedback and Multi-UAV Collision Avoidance Driven by Deep Reinforcement Learning. In Proceedings of the 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Singapore, 17–21 October 2022. [Google Scholar]
  75. Aljalaud, F.; Kurdi, H.; Youcef-Toumi, K. Bio-Inspired Multi-UAV Path Planning Heuristics: A Review. Mathematics 2023, 11, 2356. [Google Scholar] [CrossRef]
  76. Poudel, S.; Arafat, M.Y.; Moh, S. Bio-Inspired Optimization-Based Path Planning Algorithms in Unmanned Aerial Vehicles: A Survey. Sensors 2023, 23, 3051. [Google Scholar] [CrossRef]
  77. Ali, Z.A.; Zhangang, H.; Zhengru, D. Path planning of multiple UAVs using MMACO and DE algorithm in dynamic environment. Meas. Control 2020, 56, 459–469. [Google Scholar] [CrossRef]
  78. Nathan, R.J.A.A.; Kurmi, I.; Bimber, O. Drone swarm strategy for the detection and tracking of occluded targets in complex environments. Commun. Eng. 2023, 2, 12. [Google Scholar] [CrossRef]
  79. Huang, S.; Zhang, H.; Huang, Z. E2CoPre: Energy Efficient and Cooperative Collision Avoidance for UAV Swarms with Trajectory Prediction. IEEE Trans. Intell. Transp. Syst. 2024, 25, 6951–6963. [Google Scholar] [CrossRef]
  80. Ahmed, G.; Sheltami, T.; Mahmoud, A.; Yasar, A. IoD swarms collision avoidance via improved particle swarm optimization. Transp. Res. Part A Policy Pract. 2020, 142, 260–278. [Google Scholar] [CrossRef]
  81. Bui, D.N.; Duong, T.N.; Phung, M.D. Ant Colony Optimization for Cooperative Inspection Path Planning Using Multiple Unmanned Aerial Vehicles. In Proceedings of the 2024 IEEE/SICE International Symposium on System Integration (SII), Ha Long, Vietnam, 8–11 January 2024. [Google Scholar] [CrossRef]
  82. Ab Wahab, M.N.; Nazir, A.; Khalil, A.; Bhatt, B.; Noor, M.H.M.; Akbar, M.F.; Mohamed, A.S.A. Optimised path planning using Enhanced Firefly Algorithm for a mobile robot. PLoS ONE 2024, 19, e0308264. [Google Scholar] [CrossRef]
  83. Teng, Z.; Dong, Q.; Zhang, Z.; Huang, S.; Zhang, W.; Wang, J.; Chen, X. An Improved Grey Wolf Optimizer Inspired by Advanced Cooperative Predation for UAV Shortest Path Planning. arXiv 2025. [Google Scholar] [CrossRef]
  84. Wang, Z.-C.; Xu, T.-L.; Liu, F.; Wei, Y.-P. Artificial bee colony based optimization algorithm and its application on multi-drone path planning. AIP Adv. 2025, 15, 055306. [Google Scholar] [CrossRef]
  85. Gong, W.; Lou, S.; Deng, L.; Yi, P.; Hong, Y. Efficient Multi-Target Localization Using Dynamic UAV Clusters. Sensors 2025, 25, 2857. [Google Scholar] [CrossRef]
  86. Pehlivanoglu, Y.V.; Pehlivanoglu, P. An enhanced genetic algorithm for path planning of autonomous UAV in target coverage problems. Appl. Soft Comput. 2021, 112, 107796. [Google Scholar] [CrossRef]
  87. Wu, Y.; Nie, M.; Ma, X.; Guo, Y.; Liu, X. Co-Evolutionary Algorithm-Based Multi-Unmanned Aerial Vehicle Cooperative Path Planning. Drones 2023, 7, 606. [Google Scholar] [CrossRef]
  88. Kalaria, D.; Maheshwari, C.; Sastry, S. α-RACER: Real-Time Algorithm for Game-Theoretic Motion Planning and Control in Autonomous Racing using Near-Potential Function. arXiv 2024. [Google Scholar] [CrossRef]
  89. Zhao, F.; Zeng, Y.; Han, B.; Fang, H.; Zhao, Z. Nature-inspired self-organizing collision avoidance for drone swarm based on reward-modulated spiking neural network. Patterns 2022, 3, 100611. [Google Scholar] [CrossRef]
  90. Joshi, A.; Sanyal, S.; Roy, K. Real-Time Neuromorphic Navigation: Integrating Event-Based Vision and Physics-Driven Planning on a Parrot Bebop2 Quadrotor. arXiv 2024. [Google Scholar] [CrossRef]
  91. Ahmadvand, R.; Sharif, S.S.; Banad, Y.M. Neuromorphic Digital-Twin-based Controller for Indoor Multi-UAV Systems Deployment. IEEE J. Indoor Seamless Position Navig. 2025, 3, 165–174. [Google Scholar] [CrossRef]
  92. Stroobants, S.; De Wagter, C.; de Croon, G.C.H.E. Neuromorphic Attitude Estimation and Control. IEEE Robot. Autom. Lett. 2025, 10, 4858–4865. [Google Scholar] [CrossRef]
  93. Salt, L.; Howard, D.; Indiveri, G.; Sandamirskaya, Y. Parameter Optimization and Learning in a Spiking Neural Network for UAV Obstacle Avoidance Targeting Neuromorphic Processors. IEEE Trans. Neural Networks Learn. Syst. 2019, 31, 3305–3318. [Google Scholar] [CrossRef]
  94. Zanatta, L.; Barchi, F.; Manoni, S.; Tolu, S.; Bartolini, A.; Acquaviva, A. Exploring spiking neural networks for deep reinforcement learning in robotic tasks. Sci. Rep. 2024, 14, 1–15. [Google Scholar] [CrossRef]
  95. Zheng, Y.; Wang, Y.; Wu, G.; Li, H.; Peng, J. Enhancing LGMD-based model for collision prediction via binocular structure. Front. Neurosci. 2023, 17, 1247227. [Google Scholar] [CrossRef]
  96. Zeng, Y.; Zhao, D.; Zhao, F.; Shen, G.; Dong, Y.; Lu, E.; Zhang, Q.; Sun, Y.; Liang, Q.; Zhao, Y.; et al. BrainCog: A spiking neural network based, brain-inspired cognitive intelligence engine for brain-inspired AI and brain simulation. Patterns 2023, 4, 100789. [Google Scholar] [CrossRef]
  97. EU 2021/664; A Regulatory Framework for the U-Space. Official Journal of the European Union: Luxembourg, 2021.
  98. ISO 21384-3; Unmanned Aircraft Systems Part 3: Operational Procedures. ISO: Geneva, Switzerland, 2023.
Figure 1. Literature review process.
Figure 2. Number of publications per year for each AI algorithm category (DL: Deep Learning, RL: Reinforcement Learning, BL: Bio-inspired Learning).
Figure 3. Schematic flow diagram of CNN-based YOLO algorithm.
Figure 4. AI4HyDrop’s framework for drone deconfliction.
