Article

Intelligent UAV Navigation in Smart Cities Using Phase-Field Deep Neural Networks: A Comprehensive Simulation Study

1 Department of Computer Engineering, Near East University, Nicosia 99138, Turkey
2 Department of Computer Engineering, Institute of Applied Artificial Intelligence, Near East University, Nicosia 99138, Turkey
* Author to whom correspondence should be addressed.
Vehicles 2026, 8(1), 6; https://doi.org/10.3390/vehicles8010006
Submission received: 31 August 2025 / Revised: 24 December 2025 / Accepted: 27 December 2025 / Published: 2 January 2026
(This article belongs to the Special Issue Air Vehicle Operations: Opportunities, Challenges and Future Trends)

Abstract

This paper proposes the integration of the phase-field method (PFM) with deep neural networks (DNNs) for UAV navigation in smart city environments. Using the proposed approach, simulations of an intelligent navigation and obstacle avoidance framework for drones in complex urban environments are presented. Within the unified PFM-DNN model, phase-field modeling provides a continuous spatial representation, allowing for highly accurate characterization of boundaries between free space and obstacles. In parallel, the deep neural network component offers semantic perception and intelligent classification of environmental features. The proposed model was tested using the 3DCity dataset, which comprises 50,000 urban scenes under diverse environmental conditions, including fog, low light, and motion blur. The results demonstrate that the proposed system achieves high performance in classification and segmentation tasks, outperforming modern models such as DeepLabV3+, Mask R-CNN, and HRNet, while exhibiting high robustness to sensor noise and partial obstructions. The framework was evaluated within a simulated environment, and no real-world UAV flight tests were performed. These results indicate that the framework is a promising solution for intelligent drone navigation in future cities, thanks to its ability to adapt and respond in dynamic environments.

1. Introduction

Drones have recently become an essential component of smart cities amid rapid urbanization, offering innovative solutions in areas such as surveillance, delivery services, traffic management, precision agriculture, aerial photography, emergency rescue, power grid inspection, and disaster response. However, the effective deployment of drones in complex urban environments poses significant challenges, including navigating environments [1] with dynamic obstacles, maintaining efficiency, and ensuring real-time decision-making capabilities [2,3,4]. The most important elements for the safe and effective deployment of drones are obstacle avoidance and navigation. Traditional algorithms often depend on preprogrammed routes and static algorithms that do not consider the dynamic and unpredictable conditions of the urban environments, as shown in Figure 1. The emergence of smart cities has brought about a major transformation in urban infrastructure, integrating advanced technologies to improve the quality of life, optimize resource management, and enhance overall efficiency [5].
In recent years, autonomous drones (UAVs) have emerged as a key component, providing a versatile platform for various applications. The phase-field method, a mathematical framework traditionally used to model complex systems, has demonstrated promising effectiveness in addressing these challenges [6]. By representing obstacles and navigation paths as connected fields, the phase-field method enables seamless integration of environmental data into the navigation process. This method is leveraged to develop an advanced algorithm that has the capability of real-time decision making and adaptive path planning when combined with deep neural networks (DNNs). Therefore, this paper integrated a phase-field deep neural network into drone navigation and obstacle avoidance systems in smart cities, where environmental data is dynamically updated via a cloud-based infrastructure.
The phase-field method models the environment by representing the drone’s position relative to obstacles, goals, and boundaries. Obstacles and boundaries are encoded as repulsive regions characterized by high gradients, smoothly “pushing” the drone away from potential collisions. Conversely, goals are represented as attractive regions or local minima that draw the drone toward its target. The drone’s motion follows the gradient of this field, allowing it to navigate along smooth, optimal paths while avoiding obstacles. By generating continuous trajectories, the PFM helps maintain stable and efficient flight, preventing abrupt changes in direction or speed often seen in graph-based or discrete path planners. Additionally, unlike hard-coded constraints, the PFM gracefully manages soft boundaries and environmental uncertainties—such as wind or turbulence—by dynamically adapting the field to guide the drone while accounting for these uncertainties.
The PFM represents the obstacles and paths as smooth fields continuously and updates the data via cloud infrastructure dynamically, leveraging the deep neural network to make decisions in real-time navigation, offering a robust, smooth, and adaptable framework for complex environments’ navigation. The PFM could be effective for UAV navigation. It can be explained by the following arguments: The PFM method generated continuous fields, while the gradient points toward the target and away from obstacles and avoids collisions. Navigation decisions can be made from these phase-field gradients, ensuring smooth and stable trajectories for the UAV, which is crucial for flight safety in the dynamic or congested urban environment. In navigation, obstacles are symbolized in the phase field using repulsive values, which the UAV can naturally avoid without having to explicitly model them as obstacles, improving real-time decision-making. The phase field can be dynamically updated as it receives new sensor information, allowing the UAV to adapt in real time to changes such as moving obstacles or newly discovered areas. Navigation is guided by following gradient descent in the phase domain, providing a computationally efficient, mathematically driven approach to motion planning. The drone traffic management (DMS) system integrates the roles of environment representation, path planning, and motion control into a single framework. This simplifies the overall architecture of drone navigation. The DMS system generates a continuous potential field that guides the drone without intermediate steps to simplify the path, unlike other methods, such as RRT, which require a discrete path, waypoints, or trees. Additionally, the integration of cloud computing not only facilitates real-time data collection but also enhances the scalability and flexibility of the system, making it ideal for use in large urban environments. The continuous flow of real-time data to the deep neural network (DNN) enables adaptive adjustment of navigation strategies, ensuring that drones can maneuver efficiently and safely in ever-changing urban environments [7], which include moving vehicles, pedestrians, infrastructure development, and unpredictable weather conditions.
Drones in operation generate massive amounts of information through their sensors, such as high-resolution images and LiDAR point-cloud scans. These data are transmitted to remote servers for real-time processing, facilitating rapid decision-making. Cloud infrastructure supports the continuous updating and retraining of navigation algorithms, ensuring that drones adapt to urban environments and smart cities [8]. In smart cities, cloud computing enables drones to coordinate and share data, improving efficiency and collision avoidance. Additionally, cloud computing enables remote control and monitoring, allowing operators to access data in real time, and it provides secure storage for the sensitive information generated by drones, which is critical for smart city applications. Autonomous drone navigation in urban environments, especially in smart cities, nevertheless faces significant challenges due to the dynamic nature of the environment. Drones must handle obstacles such as tall buildings, narrow alleys, moving vehicles, pedestrians, traffic lights, and changing weather conditions to ensure safe and efficient operations. This makes it difficult to adapt traditional navigation algorithms to drones, increasing the risk of collision and decreasing efficiency [8]. Static maps do not reflect real-time urban changes, and poor integration between algorithms and urban environments leads to poorly planned routes and ineffective obstacle avoidance. Therefore, it has become necessary to develop navigation systems that operate drones safely and efficiently while adapting to the surrounding environment, especially with the development of today's smart cities [9].
The primary goal is to develop and implement an efficient and robust navigation framework that integrates real-time environmental data and efficiently processes the drone’s trajectory to avoid obstacles while maintaining optimal performance. Additionally, scalability and real-time responsiveness are crucial for the large-scale deployment of autonomous systems. Therefore, the phase-field method has been employed, in which obstacles and routes are represented as a continuous field with smooth boundaries, unlike traditional algorithms that rely on static logic and predefined paths and are therefore ill-suited to urban environments. Combining the PFM with deep neural networks (DNNs) allows environmental data to be incorporated directly into the navigation process, creating real potential for smarter, real-time decision-making algorithms that can adapt to continuously updated cloud-based environmental data. Since the layout of urban environments is highly dynamic and constantly changing, navigation remains a major challenge.
Uncertainty models have recently become an essential component of UAV navigation, especially in urban environments; Bayesian networks and related probabilistic methods provide a principled approach to reasoning about uncertainty by explicitly modeling conditional dependencies [10]. However, the performance of these networks often deteriorates in high-dimensional visual domains, such as drone perception, where inference becomes computationally expensive and scalability in real-time navigation is limited [11].
Second, probabilistic graphical models (PGMs) offer interpretability and structured representations of uncertainty. However, their reliance on iterative inference procedures and large-scale graph construction limits their applicability in environments with moving obstacles that change quickly.
Third, attention-based deep learning methods, including transformer architectures and attention-augmented convolutional neural networks (CNNs), have demonstrated promising results in improving feature selection for perception tasks [12]. However, these models usually handle uncertainty implicitly rather than explicitly and lack physical grounding in continuous spatial modeling.
Consequently, their robustness diminishes under conditions such as occlusion, sensor noise, and adverse weather, all of which are common in real-world UAV operations.
In contrast, the proposed PFM, integrated with DNNs (PFM-DNN), provides a unique balance between physical interpretability and computational efficiency. Unlike Bayesian or graphical models, the PFM represents the environment as a continuous spatial field, inherently capturing uncertainty in obstacle boundaries, soft constraints, and dynamic conditions. Compared to attention-based methods, the PFM explicitly encodes spatial uncertainty while maintaining adaptability through DNN-based learning. This integration creates a mathematically grounded yet scalable framework that directly addresses the challenges of UAV navigation in complex urban environments. Table 1 presents a comparison of the models.
Through our review of the previous studies—such as DeepLabV3+ [13], Mask R-CNN [14], and HRNet [15] models applied to drone navigation—we identified that these traditional models rely on discrete representations, which do not offer smooth, spatially connected fields to effectively guide drone motion. In contrast, the phase-field method (PFM) represents obstacles and paths as continuous, smooth fields. By leveraging cloud infrastructure, the PFM allows for dynamic updates of environmental data and enables real-time decision-making, resulting in a navigation framework that is robust, smooth, and highly adaptable to complex environments. While the phase-field method has been widely applied in material science, its recent adoption in robotics shows great promise for modeling complex spatial interfaces relevant to drone navigation.
Table 1. Simple model comparison.
Framework | Strengths | Limitations | Suitability for UAV Navigation
Bayesian Networks [16] | Explicit probabilistic reasoning; interpretable | Poor scalability in high-dimensional data; computationally expensive | Limited in real-time, vision-heavy UAV tasks
Probabilistic Graphical Models (PGMs) [17] | Structured uncertainty modeling; interpretable | Slow inference; large graphs are impractical in dynamic environments | Limited real-time adaptability
Attention-based DNNs [18] | Strong feature extraction; scalable | Implicit uncertainty handling; lacks physical interpretability | Good for perception, weaker for explicit navigation uncertainty
PFM-DNN (Proposed) | Continuous spatial uncertainty; interpretable; scalable | New integration requires cloud/DNN support | Highly suitable for dynamic UAV navigation
Therefore, our work presents a novel integration of the phase-field method with deep neural networks and cloud infrastructure for drone navigation in smart cities. Unlike traditional models that rely on static and preprogrammed maps—which often struggle to adapt to dynamic environments—our approach is designed to handle the evolving conditions typical of smart cities.
This paper seeks to develop an adaptive navigation and obstacle avoidance framework for drones in smart cities based on the phase-field method with deep neural networks (PFM-DNNs) and real-time data integration supported by cloud computing. This framework aims to enhance the safety, efficiency, and adaptability of drones in changing urban environments, contributing to the key objectives of smart city development, providing continuous fields and collision-free paths in urban deployment, and addressing the gaps identified for the previous studies. The main contributions of this paper are summarized as follows.
- Proposed PFM-DNN Framework for UAV Navigation: This study proposes an innovative framework that integrates phase-field modeling with deep neural networks (PFM-DNN) to generate continuous-field representations, enabling drones to plan smooth and collision-free trajectories and effectively avoid obstacles, thereby enhancing autonomous navigation in complex urban environments.
- Adaptive and Efficient Navigation via Semantic Segmentation and Uncertainty Estimation: The proposed PFM-DNN framework enables real-time perception and adaptive decision-making by integrating semantic segmentation and obstacle classification with uncertainty estimation. This integration enhances UAV stability and safety under dynamically varying environmental conditions, such as fog, occlusions, and low illumination.
- Comprehensive Evaluation and Benchmarking against State-of-the-Art Models: Extensive simulations and quantitative analysis were conducted to validate the effectiveness of the proposed PFM-DNN framework. The results demonstrate superior performance in terms of accuracy and convergence rate (mIoU) compared to state-of-the-art models, including DeepLabV3+, Mask R-CNN, and HRNet, confirming its suitability for dynamic and complex navigation environments.
The paper is organized as follows. Section 2 presents a literature review on drone navigation; Section 3 describes the design of the PFM-DNN-based navigation system; Section 4 presents the simulation analysis and results; and Section 5 concludes the paper.

2. Literature Review

The development and evaluation of UAV navigation systems have gained significant attention recently, particularly in smart cities and urban environments, and have become a broad topic for many researchers. Various methods have been explored to enhance the ability of drones to navigate complex environments, avoid obstacles, and operate efficiently within the constraints of smart cities. The paper [19] conducted a comprehensive study on deep learning techniques for vehicle detection from drone images, focusing on their effectiveness in improving processing speed and accuracy. Ref. [6] proposed a real-time penetration path-planning algorithm for stealth UAVs in complex three-dimensional environments, demonstrating the potential of deep neural networks in adaptive navigation. However, the computational requirements of these techniques often exceed the capabilities of UAV systems, limiting their real-time application. Ref. [20] analyzed pilot-induced oscillations in UAVs using bifurcation theory, contributing insights into the stability and control challenges of UAV flights. Ref. [21] developed UAV navigation using a Markov decision process and historical data, using recorded trajectories to train the drone's action policy. Ref. [22] developed a UAV navigation system using images taken by a monocular camera. Ref. [4] proposed a path-planning algorithm for multi-UAV systems that optimizes energy consumption and data updating and incorporates mechanisms for recharging batteries when they are depleted. Ref. [23] worked on swarm intelligence, network optimization, and learning mechanisms for the communication and control of cooperative drones, and addressed the associated challenges. Ref. [24] introduced a deep CNN, AlexNet, that revolutionized image and object classification, influencing UAV vision systems. Ref. [25] introduced deep residual learning (ResNet) for image recognition, a framework widely used in UAV-based visual perception tasks. Ref. [26] utilized deep learning techniques for object detection by drones, improving accuracy in real-time object recognition. Ref. [27] used a fuzzy PID gain-scheduling controller to make a quadrotor drone more stable and responsive during real-time control. Ref. [28] synthesized routing algorithms and heuristics for the Internet of Drones and defined open problems together with directions for further investigation. Ref. [29] proposed squeeze-and-excitation networks (SENet) with the specific aim of enhancing feature representation in deep networks, which broadened the scope of UAV vision. Ref. [30] proposed UNet, a deep learning model with improved inter-layer information flow, which can be used to analyze UAV-based scenes. Ref. [31] developed a spacecraft attitude control strategy with L2-gain stabilization that extends to UAVs in difficult flight conditions. Ref. [32] addresses connectivity issues in mountainous regions through path-planning algorithms for drone relays. The paper [33] investigated cooperative guidance methods, and the paper [34] presented cooperative localization of UAVs as a form of complementary navigation. Ref. [35] considered collaborative navigation of aerial vehicles in cluttered environments.
Due to the challenges in detecting small objects, the study in [36] presents three modules, namely, SKBlock, LKBlock, and CTBlock, to address this issue. Ref. [37] employs the U-Net architecture for image classification and observation tasks. In [11], Yang et al. tackle the problem of blurred boundaries by using multi-scale object detection and edge segmentation. Ref. [38] discusses key challenges in autonomous driving, including recognition, tracking, motion estimation, and learning. Ref. [39] applies noise injection techniques to UAV navigation to enhance performance. In [40], the authors demonstrate the effectiveness of HRNet across diverse applications, testing it on object detection and semantic segmentation tasks. Ref. [41] evaluates proposed models on the PASCAL VOC 2012 and Cityscapes datasets, achieving performance scores of 89% and 82.1%, respectively, without any post-processing. Ref. [42] develops a dynamic Bayesian network (DBN) model for preparing a predictive maintenance plan; such Bayesian network representations allow the system reliability to be monitored over a given planning horizon and the system state to be predicted under different replacement plans [11]. Bayesian networks have also been used for decision-making in intelligent autonomous vehicles.

3. Design of PFM-DNN Based Navigation System

3.1. Drone Navigation System

This methodology demonstrates the implementation approach of an adaptive navigation for UAVs using a phase-field approach with deep neural networks and real-time data updating via cloud computing. The algorithms optimize cloud-based infrastructure using deep learning-based perception and environmental adaptation. The main components of the implementation include training a deep neural network (DNN) using the Cityscapes dataset and simulating UAV motion in a 3DCity model using the 3DCity dataset. The basic modules used for drone navigation are presented in Figure 2.
This structure represents a comprehensive approach to urban drone navigation that combines deep learning (specifically PFM-DNN) simulation, cloud computing, and adaptive navigation techniques to create a framework that can be effectively deployed in smart city environments. The “Data Collection: Cityscapes” module contributes to the model training process using previously collected urban environment data. The collected cityscape data leads to “Train PFM-DNN Model”. In parallel, a “Simulate in 3D Model” component connects to the training process. This suggests the system uses both real urban data and simulated environments for training. “Cloud-enabled Real-time Processing” updates data according to changes in the real-time environment, and the system leverages cloud computing for handling these changes and complex calculations. The “Adaptive Navigation System” uses a real-time trained system in navigation to avoid obstacles. “Deploy in Smart City” is the final stage where the trained system is implemented in real urban environments. The flow shows logical progression from data collection through model training and simulation to cloud processing and adaptive navigation and, finally, to deployment.
Currently, the work focuses on simulation-based evaluation of the PFM-DNN integration. While the framework was designed for deployment in smart city environments, no experiments using physical drones in real-world conditions were included.

3.2. DNN Algorithm

Drones (UAVs) are considered reliable systems suitable for a wide range of applications, particularly in classification and detection tasks. Significant advancements have been made in object detection using deep learning techniques [3], with Deep Neural Network (DNN) algorithms playing a crucial role in enabling autonomous navigation for urban drones. They support decision-making, real-time scene understanding, and predictive analytics. By processing various sensory inputs like images and LiDAR data, the DNN refines raw inputs into usable information for path planning and obstacle detection. The DNN is composed of key components such as input preprocessing, feature extraction using a convolutional neural network (CNN), multi-scale feature fusion, and phase-field deep neural network (PFM-DNN) optimization. The preprocessing includes data normalization and augmentation for model generalization. Feature extraction utilizes a CNN to identify road, obstacles, traffic, and pedestrian trajectories consisting of the following:
  • The model has three convolutional layers.
  • Each layer uses a 3 × 3 filter size with padding set to “same”, which ensures that the output feature maps have the same spatial dimensions as the input.
  • The first convolutional layer has 16 filters, while the remaining layers have 64 filters.
  • All convolutions use a stride of 1.
  • Each convolutional layer is followed by batch normalization and a ReLU activation function.
  • Max pooling layers with 2 × 2 windows and a stride of 2 follow each layer block, halving the spatial resolution.
  • The output of the final max-pooling layer is passed to a fully connected layer for classification (a minimal sketch of this feature extractor is given after this list).
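A minimal PyTorch sketch of a feature extractor matching the layer description above is shown below; the input resolution, number of output classes, and module organization are assumptions for illustration rather than the exact network used in this work.

```python
import torch
import torch.nn as nn

class FeatureExtractorCNN(nn.Module):
    """Minimal sketch of the CNN feature extractor described above.
    Three 3x3 'same'-padded convolutions (16, 64, 64 filters), each followed by
    batch normalization, ReLU, and 2x2 max pooling, then a fully connected head."""

    def __init__(self, in_channels: int = 3, num_classes: int = 10):
        super().__init__()
        channels = [16, 64, 64]            # first layer 16 filters, remaining layers 64
        layers, prev = [], in_channels
        for out in channels:
            layers += [
                nn.Conv2d(prev, out, kernel_size=3, stride=1, padding=1),  # 'same' padding
                nn.BatchNorm2d(out),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2, stride=2),  # halve the spatial resolution
            ]
            prev = out
        self.features = nn.Sequential(*layers)
        self.classifier = nn.LazyLinear(num_classes)    # fully connected classification head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.features(x)
        return self.classifier(torch.flatten(f, start_dim=1))

# Example: a batch of two 128x128 RGB crops mapped to class logits
logits = FeatureExtractorCNN(num_classes=10)(torch.randn(2, 3, 128, 128))
```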
Multi-scale feature fusion combines information from different receptive fields to optimize obstacle detection. PFM-DNN optimization includes phase-field energy constraints to improve segmentation accuracy in challenging conditions such as occlusions and sensor noise. Overall, the DNN algorithm is vital for the efficient and accurate navigation of urban drones. Figure 3 depicts a flowchart of the deep neural network (DNN) implementation process, which includes three main sections: data acquisition, learning process, and output results. The data acquisition phase collects urban imagery and sets up the train–test datasets. The learning process phase builds a DNN with L hidden layers, initializes and updates its parameters, and the results phase computes the output, indicating the next position of the drone.

3.3. Phase-Field Deep Neural Network (PFM-DNN)

The phase-field deep neural network (PFM-DNN) is an advanced deep learning architecture aimed at improving semantic segmentation, obstacle detection, and uncertainty modeling in dynamic urban environments. Unlike traditional convolutional neural networks (CNNs), the PFM-DNN effectively addresses challenges such as occlusions, sensor noise, and boundary ambiguity by incorporating phase-field energy formulations. This enables the network to provide probabilistic uncertainty estimation and adaptive learning, which are essential for real-time scene understanding in autonomous drone navigation. The flowchart (Figure 4) illustrates how the phase-field deep neural network system navigates a drone. First, the system collects multi-modal sensor data (cameras, LiDAR, IMU, GPS) and processes it into usable formats for both environmental mapping and state estimation. A GNSS-IMU combines a GPS receiver with an IMU (inertial measurement unit). This integration ensures reliability and position precision in urban environments [43]. The IMU helps GPS work in areas where GPS signals are blocked, like tunnels or buildings. This data is used for feature extraction and estimation of the drone state, with a CNN applied for feature extraction. The result feeds phase-field generation: environmental data is converted into phase fields, mathematical representations that encode obstacle distances and potential navigation paths. This creates smooth, differentiable fields that help with collision avoidance. The phase-field data, combined with the drone’s current state, feeds into a deep neural network architecture that includes the following:
  • Convolutional layers for spatial feature extraction;
  • Recurrent layers for temporal sequence learning;
  • Fully connected layers for control command generation.
The network outputs navigation commands, including velocity vectors, thrust commands, attitude control, and waypoint updates that are sent to the flight controller. The system continuously monitors the drone’s response and updates the phase-field representation, creating a closed-loop control system with emergency handling capabilities. The phase-field approach is particularly effective for drone navigation because it provides smooth, continuous representations of the environment that are well-suited for gradient-based optimization in neural networks while naturally encoding obstacle avoidance constraints.
The PFM-DNN, as described by [44], combines multi-scale attention models with phase-field uncertainty estimation layers for semantic segmentation and urban object recognition using a deep learning model. Following this, we present the equations that define the mathematical structure of multi-feature extraction through attention mechanisms, uncertainty modeling via phase fields, and multi-class classification. Since drones operate in 3D environments (e.g., urban canyons, indoor areas), the PFM naturally extends to 3D space, enabling it to handle multi-dimensional navigation challenges.

3.4. Mathematical Models

1. Feature extraction and convolutional processing.
The input to the PFM-DNN model consists of RGB images I, depth maps D, and LiDAR point clouds L. These inputs are processed through a series of convolutional layers that extract both spatial and depth-aware features [45]. Let X represent the input tensor:
X = [ I , D , L ]
where I ∈ ℝ^(H × W × 3), D ∈ ℝ^(H × W × 1), and L ∈ ℝ^(H × W × C), with C being the number of LiDAR channels. Feature extraction is performed using convolutional layers:
F_l = \sigma(W_l \ast X + b_l), \quad l = 1, 2, \ldots, L
where Wl and bl are the weight and bias terms for layer l, * denotes the convolution operator, and σ represents the non-linear activation function, typically ReLU:
\sigma(x) = \max(0, x)
The extracted feature maps F are then processed through multi-scale attention mechanisms to enhance representation learning [44].
2. Multi-scale attention mechanism.
To capture both local and global dependencies, PFM-DNN employs a multi-scale attention mechanism [45]. The attention module enhances feature importance by dynamically weighting feature maps across different spatial resolutions. The attention mechanism can be expressed as follows:
A = \mathrm{softmax}\left( \frac{Q K^{T}}{\sqrt{s}} \right)
where Q = WQF, K = WKF, and V = WVF are the query, key, and value projections of the feature map; WQ, WK, WV are learnable weight matrices; s is the feature dimension (size); and A represents the attention scores computed using the scaled dot-product formulation.
The final attention-enhanced features are computed as follows:
F_{att} = A V
These features are then used for segmentation and uncertainty estimation.
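The following PyTorch sketch implements the scaled dot-product step defined by the two equations above, treating the feature map as a set of spatial tokens; the single-scale formulation, tensor shapes, and random projection matrices are simplifying assumptions rather than the full multi-scale module.

```python
import torch

def attention_enhanced_features(feat, W_q, W_k, W_v):
    """feat: (N, s) matrix of N spatial feature vectors of dimension s.
    Returns F_att = A V, where A = softmax(Q K^T / sqrt(s))."""
    Q, K, V = feat @ W_q, feat @ W_k, feat @ W_v                   # query, key, value projections
    s = Q.shape[-1]
    A = torch.softmax(Q @ K.transpose(-1, -2) / s ** 0.5, dim=-1)  # attention scores
    return A @ V

# Example: 64 spatial tokens with 32-dimensional features
N, s = 64, 32
feat = torch.randn(N, s)
W_q, W_k, W_v = (torch.randn(s, s) for _ in range(3))
F_att = attention_enhanced_features(feat, W_q, W_k, W_v)
```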
3. Phase-field uncertainty estimation.
One of the key novelties of PFM-DNN is its uncertainty estimation using phase-field modeling, which quantifies prediction confidence. The phase-field function ϕ(x) defines the segmentation boundary and is modeled as a continuous field:
\phi(x) = \tanh\left( \frac{d(x)}{\epsilon} \right)
where d(x) is the signed distance function from the segmentation boundary and ϵ is a small positive parameter controlling boundary sharpness.
This research study establishes the boundaries associated with the UAV, its environment, and navigation platform with an integrated phase-field function. The phase-field function ϕ(x) segregates between spatial domains, where ϕ(x) = 1 is the UAV’s domain, ϕ(x) = −1 is the obstacle domain, and the floating transition domain (−1 < ϕ(x) < 1) is the interaction or boundary layer. Thus, the boundary is not a sharp threshold but a differentiable transition region whose evolution is governed by the phase-field energy functional and the Allen–Cahn equation. This allows the UAV–environment interaction to be modeled continuously in space. The UAV’s decision-making is directly derived from the value and gradient of the phase-field function. When ϕ > 0 and approaches 1, an attractive force is generated to guide the UAV toward the goal region, whereas when ϕ < 0 and approaches −1, a repulsive force is generated to push the UAV away from obstacles. The boundary information is detected through ∇ϕ, enabling the UAV to anticipate proximity to obstacles and generate smooth, collision-free motion as ϕ decreases below a predefined safety margin. The continuous representation yields a differentiable and stable interface between the UAV and its environment, which facilitates accurate perception and collision avoidance. Furthermore, the proposed PFM-DNN framework integrates the segmentation output of the perception module into the same phase-field representation. In this way, the perception platform and navigation module share a unified boundary model, ensuring consistency between environment understanding and motion generation in real time. In contrast to traditional obstacle-avoidance methods that rely on hard distance thresholds—where the UAV abruptly switches behavior when a threshold is crossed—our approach provided a smooth transition between motion states. This avoided rigid, discontinuous control actions and enabled gradual adaptation as the UAV moved between free space and obstacle-influenced regions.
Parameter ϵ plays a significant role in controlling the sharpness of the interface between obstacle zones and free space. A smaller epsilon value results in a sharper interface, potentially leading to numerical instability, while larger values result in smoother transitions and greater numerical stability. In this study, the epsilon value was experimentally calibrated using sensitivity analysis to ensure stable phase-field evolution while maintaining spatial accuracy. The adopted range for epsilon—typically between 1 and 3 pixels depending on the network resolution—falls within the accepted values for phase-field modeling, thus giving the drone navigation interface meaningful physical thickness.
The uncertainty map U(x) is computed using the Laplacian of the phase-field function.
U(x) = \nabla^{2}\phi(x) = \frac{\partial^{2}\phi}{\partial x^{2}} + \frac{\partial^{2}\phi}{\partial y^{2}}
where ∇2ϕ(x) represents the second-order spatial derivatives, highlighting regions where segmentation is uncertain.
The model is trained to minimize the phase-field energy functional, given by
L_{\mathrm{uncertainty}} = \int_{\Omega} \left[ \frac{\epsilon}{2} |\nabla\phi|^{2} + \frac{1}{\epsilon} W(\phi) \right] d\Omega
where W(ϕ) = ϕ2 (1 − ϕ)2 is a double-well potential function that enforces smoothness in uncertainty maps.
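To make these quantities concrete, the NumPy sketch below evaluates the tanh phase field, its Laplacian-based uncertainty map, and a discrete approximation of the energy functional on a toy 2D grid with a single circular obstacle; the grid size, ε, and obstacle geometry are assumptions for illustration only.

```python
import numpy as np

# Toy 2D grid with one circular obstacle of radius 10 at the center (assumed geometry)
H, W_grid, eps = 128, 128, 2.0
y, x = np.mgrid[0:H, 0:W_grid]
d = np.sqrt((x - W_grid / 2) ** 2 + (y - H / 2) ** 2) - 10.0   # signed distance to the obstacle boundary

phi = np.tanh(d / eps)                                          # phase field phi(x) = tanh(d(x)/eps)

# Uncertainty map U(x) = Laplacian of phi, via repeated finite-difference gradients
U = (np.gradient(np.gradient(phi, axis=0), axis=0)
     + np.gradient(np.gradient(phi, axis=1), axis=1))

# Discrete phase-field energy: sum over the grid of (eps/2)|grad phi|^2 + (1/eps) W(phi)
gy, gx = np.gradient(phi)
W_phi = phi ** 2 * (1.0 - phi) ** 2                             # double-well potential from the text
energy = np.sum(0.5 * eps * (gx ** 2 + gy ** 2) + W_phi / eps)

print(f"max |U| (peaks near the interface): {np.abs(U).max():.3f}")
print(f"discrete phase-field energy: {energy:.1f}")
```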
4. Semantic segmentation loss function.
To optimize the segmentation accuracy, PFM-DNN employs a hybrid loss function, combining cross-entropy loss and IoU loss:
L_{seg} = -\sum_{i} y_{i} \log \hat{y}_{i} + \lambda_{IoU} \left( 1 - \frac{|Y \cap \hat{Y}|}{|Y \cup \hat{Y}|} \right)
where yi is the ground truth label, ŷi is the predicted probability, λIoU is a weighting factor for the IoU term, and Y and Ŷ represent the ground truth and predicted segmentation masks, respectively.
The total training objective for segmentation is as follows:
L = L_{seg} + \lambda_{\mathrm{uncertainty}} L_{\mathrm{uncertainty}}
where λuncertainty balances the impact of uncertainty estimation in training.
5. Classification with softmax activation.
For object classification, the final layer outputs a class probability vector [44]:
P(c \mid F) = \frac{\exp(W_{c} F + b_{c})}{\sum_{j} \exp(W_{j} F + b_{j})}
where Wc and bc are the classification layer parameters for class c and P(c|F) represents the probability of class c. The denominator ensures that the output is a normalized probability distribution (softmax). The classification loss function is defined as the categorical cross-entropy loss:
L_{cls} = -\sum_{c} y_{c} \log P(c \mid F)
The final total loss function for the PFM-DNN framework is as follows:
L_{\mathrm{total}} = L_{seg} + \lambda_{\mathrm{uncertainty}} L_{\mathrm{uncertainty}} + \lambda_{cls} L_{cls}
The total free energy functional includes a weighting factor λcls for classification loss. The PFM-DNN architecture integrates deep learning feature extraction, multi-scale attention mechanisms, and phase-field uncertainty modeling into a single perception framework. The incorporation of segmentation, uncertainty estimation, and classification objectives guarantees PFM-DNN optimal accuracy, fast inference, and robustness to environmental changes. The PFM-DNN framework topology is novel, which improves the performance and efficiency of autonomous drones in real-time urban navigation.
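A minimal PyTorch sketch of the combined objective is given below; the weighting factors, tensor shapes, and the treatment of the uncertainty term as a precomputed scalar are assumptions made for brevity, not the exact loss implementation of this study.

```python
import torch
import torch.nn.functional as F

def pfm_dnn_total_loss(seg_logits, seg_target, cls_logits, cls_target,
                       loss_uncertainty, lam_iou=0.5, lam_unc=0.1, lam_cls=1.0):
    """L_total = L_seg + lam_unc * L_uncertainty + lam_cls * L_cls, where L_seg
    combines cross-entropy with a soft-IoU term (weighting factors are assumed)."""
    ce = F.cross_entropy(seg_logits, seg_target)                 # pixel-wise cross-entropy

    # Soft IoU term: 1 - |Y ∩ Ŷ| / |Y ∪ Ŷ| on predicted class probabilities
    probs = seg_logits.softmax(dim=1)
    one_hot = F.one_hot(seg_target, seg_logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(2, 3))
    union = (probs + one_hot - probs * one_hot).sum(dim=(2, 3))
    iou_loss = (1.0 - inter / union.clamp(min=1e-6)).mean()

    l_seg = ce + lam_iou * iou_loss
    l_cls = F.cross_entropy(cls_logits, cls_target)              # categorical cross-entropy
    return l_seg + lam_unc * loss_uncertainty + lam_cls * l_cls

# Example: 2-image batch, 5 segmentation classes, 4 object classes, precomputed uncertainty term
seg_logits = torch.randn(2, 5, 64, 64, requires_grad=True)
seg_target = torch.randint(0, 5, (2, 64, 64))
cls_logits = torch.randn(2, 4, requires_grad=True)
cls_target = torch.randint(0, 4, (2,))
loss = pfm_dnn_total_loss(seg_logits, seg_target, cls_logits, cls_target,
                          loss_uncertainty=torch.tensor(0.05))
loss.backward()
```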
To provide a comprehensive integration of the phase-field method into the collision avoidance logic, we must expand on how the energy-based formulation governs the evolution of the drone’s decision space. The phase-field model defines the environment using a scalar-valued function ϕ(x,t), where x = (x,y,z) ∈ Ω ⊂ R3, representing the 3D domain of the environment, and t is time. The field ϕ assigns continuous values to space, distinguishing free space (ϕ ≈ 1) from obstacle regions (ϕ ≈ 0), with smooth transitions across the interface between them. F[ϕ], which governs the behavior of ϕ, is defined as follows:
F[\phi] = \int_{\Omega} \left[ \frac{\epsilon^{2}}{2} |\nabla\phi|^{2} + W(\phi) + \lambda V(\phi, x) \right] d\Omega
Each term in this equation has a physical and computational role:
(a) Gradient energy term.
\frac{\epsilon^{2}}{2} |\nabla\phi|^{2}
This term penalizes large spatial gradients of the phase field, enforcing smooth transitions between regions. The parameter ϵ controls the thickness of the interface between the obstacle and free space.
(b) Potential energy term.
W(\phi) = \frac{1}{4} \left( \phi^{2} - 1 \right)^{2}
This double-well potential drives ϕ towards either 0 (inside an obstacle) or 1 (free space), discouraging intermediate, ambiguous values. It ensures phase separation and stabilizes the boundary behavior.
(c) External potential term.
λ V ( ϕ , x )
This term couples the phase field with external forces (e.g., building locations, sensor maps, or drone dynamics). It allows the environment or learned features (e.g., from a deep neural network) to influence the energy landscape. The parameter λ controls the strength of this external influence.
To evolve the phase field ϕ, we minimize the total energy by solving the Allen–Cahn equation, a reaction-diffusion PDE:
\frac{\partial \phi}{\partial t} = -\frac{\delta F}{\delta \phi} = \epsilon^{2} \nabla^{2}\phi - W'(\phi) - \lambda \frac{\partial V}{\partial \phi}
This evolution equation drives the scalar field to reach a configuration where energy is minimized—i.e., where the interface between free space and obstacles is well-formed and reacts dynamically to external data. In the context of drone navigation, this phase-field is integrated into the path-planning logic. For each position xt, the drone uses the gradient of the phase-field ∇ϕ(xt) to guide motion away from obstacles. The path update rule becomes
x_{t+1} = x_{t} + \eta \nabla\phi(x_{t})
where η denotes a small step size or learning rate. When the ϕ function drops below a certain collision threshold, in this example ϕ < 0.3, it indicates a strong possibility of being inside or near an obstacle and triggers a repulsive maneuver. This smooth repulsion is in stark contrast to brittle nearest-distance threshold checks, which, owing to their binary nature, can lead to erratic and unsmooth corrections.
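As an illustration of the two update rules above, the following NumPy sketch evolves a toy 2D phase field with an explicit Allen–Cahn step and then takes one gradient-based path step; the grid size, time step, ε, step size η, and periodic boundaries are illustrative assumptions rather than the exact numerical scheme used in this study.

```python
import numpy as np

def allen_cahn_step(phi, dV_dphi, eps=0.02, lam=1.0, dt=0.01, h=1.0):
    """One explicit Euler step of d(phi)/dt = eps^2 * Laplacian(phi) - W'(phi) - lam * dV/dphi,
    with the double-well W(phi) = 0.25 * (phi^2 - 1)^2, so W'(phi) = phi^3 - phi."""
    lap = (np.roll(phi, 1, axis=0) + np.roll(phi, -1, axis=0)
           + np.roll(phi, 1, axis=1) + np.roll(phi, -1, axis=1) - 4.0 * phi) / h ** 2
    return phi + dt * (eps ** 2 * lap - (phi ** 3 - phi) - lam * dV_dphi)

def path_update(x_t, phi, eta=0.5, collision_threshold=0.3):
    """x_{t+1} = x_t + eta * grad(phi)(x_t); also flags when phi drops below the threshold."""
    i, j = int(round(x_t[0])), int(round(x_t[1]))
    grad = np.array([(phi[i + 1, j] - phi[i - 1, j]) / 2.0,     # finite-difference gradient
                     (phi[i, j + 1] - phi[i, j - 1]) / 2.0])
    near_obstacle = phi[i, j] < collision_threshold             # triggers a repulsive maneuver
    return x_t + eta * grad, near_obstacle

# Example: relax a random field toward a well-formed interface, then take one path step
rng = np.random.default_rng(0)
phi = rng.uniform(-0.1, 0.1, size=(64, 64))
for _ in range(1000):
    phi = allen_cahn_step(phi, dV_dphi=np.zeros_like(phi))
x_next, warn = path_update(np.array([32.0, 32.0]), phi)
```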
The phase-field gradient ∇ϕ(x_t) is transformed into high-level motion commands by defining the desired velocity v(x_t) as follows:
v(x_{t}) = v_{max} \, s(\phi(x_{t})) \, \frac{\nabla\phi(x_{t})}{\|\nabla\phi(x_{t})\| + \epsilon}
where v_max represents the maximum flight speed, ϵ is a small constant to avoid division by zero, and s(ϕ) ∈ [0, 1] is a scaling function that reduces speed near obstacles (ϕ < ϕ_th). This velocity vector steers the UAV toward areas of free space and prevents it from colliding with obstacles. The commanded velocity is tracked to generate the desired acceleration a* = (v* − v)/τ, which is then mapped to low-level thrust and attitude commands (T_d, ϕ_d, θ_d). This ensures a smooth, physically interpretable mapping from ϕ to control actions.
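A complementary sketch of the velocity-command rule is given below; the speed-scaling function s(ϕ), the numeric constants, and the finite-difference gradient are assumptions introduced only to make the mapping from the field to a velocity concrete.

```python
import numpy as np

V_MAX, PHI_TH, EPS_DIV = 5.0, 0.3, 1e-6   # max speed, obstacle threshold, divide-by-zero guard (assumed)

def speed_scale(phi_val):
    """s(phi) in [0, 1]: commanded speed shrinks as phi approaches the obstacle threshold."""
    return float(np.clip((phi_val - PHI_TH) / (1.0 - PHI_TH), 0.0, 1.0))

def velocity_command(phi, i, j):
    """v = v_max * s(phi) * grad(phi) / (||grad(phi)|| + eps), steering toward free space."""
    grad = np.array([(phi[i + 1, j] - phi[i - 1, j]) / 2.0,
                     (phi[i, j + 1] - phi[i, j - 1]) / 2.0])
    return V_MAX * speed_scale(phi[i, j]) * grad / (np.linalg.norm(grad) + EPS_DIV)

def desired_acceleration(v_star, v_current, tau=0.5):
    """a* = (v* - v) / tau, later mapped to thrust and attitude commands."""
    return (v_star - v_current) / tau

# Example on a toy field that increases toward free space along one axis
phi = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
v_star = velocity_command(phi, 16, 16)
a_star = desired_acceleration(v_star, v_current=np.zeros(2))
```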
Furthermore, in the PFM-DNN framework, the segmentation maps output by the deep neural network can be encoded as phase-field initial conditions or potential functions V(ϕ,x). This bridges learning-enabled perception with energy-based reasoning for adaptive navigation in uncertain environments. While the network senses the visual features of obstacles, such as RGB images, the phase field responds by actively deforming dynamic safety buffers around the obstacles in real time. In short, the phase-field model recasts the drone's collision avoidance as a physics-informed, differentiable boundary representation: its energy-based formulation yields smooth, well-defined interfaces and gradient-driven path corrections over a continuously updated spatial potential field, enabling stable control in densely built urban environments.

4. Simulation Analysis

4.1. Dataset

The Cityscapes dataset used in the simulation is a well-established and high-resolution collection of urban scene images, developed to support research in understanding complex city environments through semantic segmentation. It has become a valuable benchmark for evaluating the performance of deep learning models, particularly those applied in autonomous systems, such as phase-field deep neural networks (PFM-DNNs). The dataset includes a total of 25,000 images—5000 of which are finely annotated with high accuracy and 20,000 that are more coarsely labelled. All images are captured at a resolution of 2048 by 1024 pixels, providing detailed visual information, as illustrated in Figure 5.
The training data undergoes several preprocessing steps that make the model more reliable and accurate:
  • Image normalization: pixel values are normalized to the 0–1 range to ensure stable and fast convergence.
  • Data augmentation: techniques such as random cropping, horizontal flipping, rotations, brightness adjustments, and Gaussian noise injection are applied to enhance the model's ability to generalize across diverse environmental conditions.
  • Resolution adjustment: images are resized to 1024 × 512 pixels to balance computational efficiency with segmentation accuracy.
  • LiDAR fusion: depth maps and LiDAR point clouds are incorporated to exploit 3D spatial data and enhance obstacle detection and path estimation.
The Cityscapes dataset was divided into validation, training, and testing. Specifically, 4000 finely annotated images (80%) were used for training, 500 images (10%) for validation, and 500 images (10%) for testing. Data augmentation techniques—including random cropping, horizontal flipping, rotations, brightness adjustments, and Gaussian noise injection—were applied to the training set to enhance generalization across diverse urban environments.
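One way to realize this preprocessing and split with torchvision-style transforms is sketched below; the specific transform parameters, the noise level, and the index-based split are assumptions consistent with, but not identical to, the pipeline described above.

```python
import torch
from torchvision import transforms

# Training-time preprocessing (assumed parameter values)
train_transform = transforms.Compose([
    transforms.Resize((512, 1024)),                 # 1024 x 512 working resolution
    transforms.RandomCrop((448, 896)),              # random cropping
    transforms.RandomHorizontalFlip(p=0.5),         # horizontal flipping
    transforms.RandomRotation(degrees=5),           # small rotations
    transforms.ColorJitter(brightness=0.3),         # brightness adjustment
    transforms.ToTensor(),                          # scales pixel values to the 0-1 range
    transforms.Lambda(lambda x: x + 0.01 * torch.randn_like(x)),  # Gaussian noise injection
])

# 80/10/10 split of the 5000 finely annotated images (index-based placeholder)
image_indices = list(range(5000))
train_set, val_set, test_set = torch.utils.data.random_split(
    image_indices, [4000, 500, 500], generator=torch.Generator().manual_seed(42))
```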
The phase-field deep neural network, PFM-DNN, was developed and trained on the Cityscapes dataset to enhance urban scene precision and identify and track dynamic objects under diverse conditions. Leveraging high-quality, rich data, PFM-DNN was designed to perform accurate real-time obstacle detection and accurate depth perception and to adapt the drone to navigate in complex urban environments.

Computing Environment

To evaluate the computational efficiency and practical application potential of the proposed PFM-DNN framework, the model was tested on three hardware configurations:
  • Basic workstation: Intel Xeon E5-2699 v4 processor (22 cores, 2.2 GHz) with 64 GB of RAM, without any hardware acceleration.
  • GPU acceleration environment: NVIDIA RTX 3090 graphics card (24 GB VRAM, CUDA 11.3) for real-time inference and high-speed processing acceleration.
  • Edge AI computing: NVIDIA Jetson AGX Xavier (512 Volta cores, octa-core ARM processor, 32 GB LPDDR4x RAM) to evaluate real-time, low-power performance and deployment on edge devices.
Each configuration was evaluated based on frame rate (FPS), power usage, and inference time. The results demonstrate the proposed system’s scalability and computational efficiency across different hardware environments.
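Inference time and frame rate can be measured with a simple procedure such as the sketch below; the warm-up count, synthetic input resolution, and toy stand-in model are assumptions, and power draw would be read from external tools (e.g., nvidia-smi or tegrastats) rather than from this script.

```python
import time
import torch

def benchmark(model, device, n_frames=100, warmup=10, shape=(1, 3, 512, 1024)):
    """Return (mean inference time in ms, frames per second) for one hardware configuration."""
    model = model.eval().to(device)
    x = torch.randn(*shape, device=device)
    with torch.no_grad():
        for _ in range(warmup):                  # warm-up passes (cuDNN autotuning, caching)
            model(x)
        if device.type == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(n_frames):
            model(x)
        if device.type == "cuda":
            torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / n_frames, n_frames / elapsed

# Example on CPU with a toy stand-in model and a reduced input size
ms_per_frame, fps = benchmark(torch.nn.Conv2d(3, 19, 3, padding=1),
                              torch.device("cpu"), n_frames=10, warmup=2,
                              shape=(1, 3, 128, 256))
```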

4.2. PFM-DNN Navigation

A phase-field deep neural network (PFM-DNN) was developed specifically to support autonomous drone navigation in fast-changing urban environments. The key strength of this approach lies in its ability to process data in real-time while adapting to the complexities of urban settings—such as moving vehicles, pedestrians, and unpredictable lighting. To evaluate the system’s performance, the model was tested using the 3DCity dataset, which includes a rich variety of data: raw RGB images, depth maps, and LiDAR point clouds. These different data types help improve the accuracy of both semantic segmentation and object classification tasks. The PFM-DNN was put through a series of trials in diverse computing environments and tested under difficult conditions, including low-light scenarios, fog, glare from reflective surfaces, and high-speed motion. These tests were designed to assess how reliably the model could detect and classify obstacles as well as predict safe and efficient paths for navigation. Figure 6 shows the simulation process, which outlines a comprehensive evaluation and simulation pipeline for a phase-field deep neural network (PFM-DNN) system. This flowchart illustrates the sequential process for testing and validating the model of an autonomous drone.
The process begins with multiple data sources providing rich environmental information, including RGB images, depth maps, and LiDAR point clouds.
“PFM-DNN Inference Pipeline” branches into two parallel operations:
The first operation is segmentation and classification—identifying and categorizing objects in the scene.
The second is trajectory prediction—forecasting movement paths.
In the “Ground Truth Comparison” module, the outputs of both branches are validated against known correct data.
In “Accuracy, Speed, Reliability Assessment”, an evaluation of the model’s performance across multiple metrics is carried out.
In the “Robustness Testing” module, the system is evaluated along two branches:
The first is challenging conditions: occlusions, lighting, noise—testing the system under difficult environmental factors.
The second is uncertainty analysis and adversarial testing—evaluating how the system responds to edge cases and potential attacks.
The “Scenario-based Drone Simulation” combines the results of both testing branches to create realistic drone operation scenarios.
The “Performance Metrics & KPI Analysis” measures key performance indicators.
“Benchmarking Against SOTA Models” compares the results with state-of-the-art systems.
“Model Refinement & Optimization” makes iterative improvements based on test results.
The “Real-world Deployment Readiness” module assesses the system’s preparedness for actual deployment.
This systematic approach ensures comprehensive validation of the PFM-DNN model, progressing from fundamental testing to advanced simulations and, finally, to real-world deployment considerations. The process particularly emphasizes robustness under challenging conditions and benchmarking against existing solutions, which are crucial for autonomous systems operating in complex environments.

4.3. Simulation Setup

This simulation demonstrated the integration of image classification using a deep neural network (DNN) within a 3D drone simulation in an urban environment, such as a smart city. The process involved loading an image dataset using an image datastore, splitting it into training and test sets, and resizing the images to match the input size required by the DNN.
For obstacle detection, semantic segmentation, and uncertainty estimation in smart cities and urban environments, an effective approach is to design an urban scene segmentation model based on a phase-field deep neural network (PFM-DNN). Training the PFM-DNN is a critical step in developing models for semantic segmentation, obstacle detection and avoidance, and uncertainty estimation in urban environments. The model was designed for urban semantic segmentation, obstacle classification, and obstacle prediction, taking into account sensor noise, occlusion, and environmental variability. The training phase included data preprocessing, loss optimization, supervised learning, and model validation, ensuring that the neural network achieves high accuracy and strong generalizability across diverse urban mobility scenarios.
The PFM-DNN model was trained using hybrid loss functions, combining cross-entropy loss for classification with IoU-based loss to enhance segmentation accuracy. The training process employed a mini-batch stochastic gradient descent (SGD) optimizer with the following hyperparameters: batch size of 16, learning rate of 0.001 (reduced by 50% every 10 epochs using a step decay scheduler), weight decay of 0.0005, and momentum of 0.9. Training was conducted for 100 epochs on the 3DCity dataset, leveraging data parallelism across multiple GPUs to speed up the learning process (Figure 7). The final model weights were selected based on validation loss minimization, ensuring optimal generalization performance.
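The optimizer and schedule described above can be configured as in the following sketch; the toy stand-in network, synthetic mini-batch, and plain cross-entropy criterion are placeholders, while the optimizer hyperparameters follow the values stated in the text.

```python
import torch
import torch.nn as nn

# Toy stand-in network and one synthetic mini-batch of size 16 (placeholders)
model = nn.Sequential(nn.Conv2d(3, 19, kernel_size=3, padding=1))
images = torch.randn(16, 3, 64, 128)
targets = torch.randint(0, 19, (16, 64, 128))
criterion = nn.CrossEntropyLoss()            # stands in for the hybrid cross-entropy + IoU loss

# Hyperparameters from the text: lr 0.001, momentum 0.9, weight decay 0.0005
optimizer = torch.optim.SGD(model.parameters(), lr=0.001,
                            momentum=0.9, weight_decay=0.0005)
# Step decay: halve the learning rate every 10 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(100):                     # 100 training epochs
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()                         # learning-rate schedule stepped once per epoch
```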
Testing the phase-field deep neural network (PFM-DNN) is an important step in validating its performance in realistic urban navigation scenarios. Although the model is trained on the Cityscapes dataset, which provides high-resolution images of urban environments with pixel-level annotations, its generalization capability and robustness must be evaluated on an independent dataset to ensure reliable deployment under diverse conditions. For this purpose, the 3DCity dataset is used as the primary testing benchmark, as shown in Figure 8. Simulation results illustrating the training process and real-life applications are provided in the Supplementary Materials.

4.4. Obstacle Classification

The PFM-DNN framework ensures accurate classification of urban entities using integrated RGB, depth, and LiDAR data. After testing on 50,000 scenes from the 3DCity dataset, it achieved an overall accuracy of 92.5%, with per-class accuracies of 96.3% for vehicles, 88.4% for pedestrians, and 85.9% for cyclists (Figure 9). Performance varied across conditions: 94.7% in daylight, 89.2% at night, 86.5% in fog, and 90.3% under motion blur. The model maintained a precision of 94.1% and a recall of 90.8%, with the highest false-negative rates for pedestrians (12.7%) and cyclists (14.1%). Environmental obstacles had the highest false-positive rate (7.6%).
To rigorously examine the system’s performance under persistent sensor-level uncertainties common in urban airspace, we simulated LiDAR signal dropout by randomly removing approximately 10–15% of LiDAR points in each frame, thereby modeling partial occlusions and temporary sensor failures [38]. Camera noise was modeled by adding Gaussian noise and slight motion blur to the RGB input channels, resulting in degraded image quality under challenging lighting conditions [39]. Additionally, sensor latency and data misalignment were modeled by introducing a 100 ms data fusion delay between the LiDAR and camera frames.
The object classification results show that PFM-DNN achieves a high level of accuracy for diverse types of objects, particularly large objects such as vehicles with distinct structural features, as demonstrated in Table 2. However, it exhibits a significant drop in accuracy for relatively small objects such as pedestrians and cyclists, which pose a particular challenge under occlusion and in noisy sensor environments where detection is difficult.
The PFM-DNN model achieved high accuracy and demonstrated strong performance in vehicle detection. However, it showed limitations in detecting pedestrians and bicyclists due to occlusions and sensor noise. The false-negative rates for these classes indicate the need to optimize multi-scale attention and motion-aware features. Classification accuracy also declined under adverse conditions; for example, fog reduced accuracy to 86.5% due to impaired depth estimation, while nighttime conditions lowered it to 89.2% due to decreased visibility. Optimizing sensor fusion and enhancing low-light processing can improve model performance. Environmental obstacles display a false-positive rate of 7.6%, affecting navigation. Possible remedies include temporal consistency and robustness checks to minimize misclassification errors, as shown in Figure 10.
Despite these challenges, the PFM-DNN method outperforms traditional segmentation and classification networks such as Mask R-CNN and DeepLabV3+ on real-time classification tasks. Furthermore, the method achieves a good balance between accuracy, inference speed, and precision. Performance can be further enhanced by optimizing the uncertainty-aware classification layers, increasing the training data for challenging scenarios, and fine-tuning hyperparameters.

4.5. Robustness to Environmental Conditions

The performance of the PFM-DNN model was validated under four challenging environmental conditions (nighttime low light, fog, intense glare, and motion blur) to determine its potential for autonomous drone navigation in real-world scenarios. PFM-DNN achieved an average classification accuracy of 89.6% and an mIoU of 74.3%, exhibiting excellent robustness with the assistance of phase-field uncertainty estimation and multi-scale attention mechanisms. Performance was lowest in foggy scenarios (mIoU: 66.7%, due to difficulty in depth estimation) and at night (accuracy: 85.2%, due to sensor noise and low contrast), which also showed higher uncertainty scores (14.6% and 12.8%, respectively). In contrast, glare conditions produced more false positives, particularly for infrastructure entities, while motion blur caused only slight degradation across consecutive frames (90.4% accuracy, 72.9% mIoU) thanks to the temporal attention mechanisms, as shown in Figure 11.
We also evaluated the model’s robustness under realistic sensor-level uncertainties often encountered by autonomous drones. To simulate LiDAR outages, 15–20% of the point cloud data was randomly removed from each frame, simulating occlusions and temporary sensor failures.
By adding Gaussian noise to the RGB channels, camera noise was modeled, reflecting the effects of low contrast of nighttime conditions. A 100 ms delay was introduced into the fusion pipeline across the RGB, depth, and LiDAR inputs to replicate the arrival of asynchronous sensor data. Additionally, a 2° rotational offset and a 5 cm translational offset were applied between the LiDAR and camera frames to represent real-world external calibration errors.
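A sketch of how such sensor-level perturbations could be injected into the test data is shown below; the array shapes, noise level, and single-axis calibration offset are illustrative assumptions consistent with the percentages and offsets stated above.

```python
import numpy as np

rng = np.random.default_rng(7)

def drop_lidar_points(points, drop_frac=0.2):
    """Randomly remove a fraction of LiDAR points to emulate dropout and occlusion."""
    keep = rng.random(len(points)) > drop_frac
    return points[keep]

def add_camera_noise(rgb, sigma=0.02):
    """Additive Gaussian noise on normalized RGB channels (low-contrast degradation)."""
    return np.clip(rgb + rng.normal(0.0, sigma, rgb.shape), 0.0, 1.0)

def miscalibrate(points, yaw_deg=2.0, dxyz=(0.05, 0.0, 0.0)):
    """Apply a 2-degree rotational and 5 cm translational extrinsic offset to the point cloud."""
    a = np.deg2rad(yaw_deg)
    R = np.array([[np.cos(a), -np.sin(a), 0.0],
                  [np.sin(a),  np.cos(a), 0.0],
                  [0.0,        0.0,       1.0]])
    return points @ R.T + np.asarray(dxyz)

def delay_stream(frames, delay_steps=1):
    """Shift a sensor stream by one fusion step (standing in for a 100 ms delay)."""
    return [frames[0]] * delay_steps + list(frames[:-delay_steps])

# Example usage on synthetic data
lidar = rng.uniform(-20.0, 20.0, size=(5000, 3))
rgb = rng.random((64, 128, 3))
lidar_perturbed = miscalibrate(drop_lidar_points(lidar))
rgb_perturbed = add_camera_noise(rgb)
```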
Without data augmentation, PFM-DNN outperformed DeepLabV3+ and Mask R-CNN, and the only architecture to come close (HRNet) started to overfit in extremely low-light conditions. Possible improvements include radar-based (or infrared-based) depth refinement, adaptive contrast enhancement for low-light scenes, and glare suppression through polarization filters. The broader conclusion is that PFM-DNN demonstrates relatively constant performance across various environmental conditions, making it a robust candidate for practical autonomous applications, as shown in the following table (Table 3).
The comparative analysis shows that PFM-DNN significantly outperforms DeepLabV3+ and Mask R-CNN in robustness, especially in fog and nighttime conditions, where both baseline models show over 12% greater performance decay. HRNet maintains a slight advantage in low-light conditions, with a classification accuracy of 87.6% compared to PFM-DNN's 85.2%, likely due to its enhanced feature extraction capabilities. Nonetheless, PFM-DNN shows superior adaptability across multiple environmental conditions, maintaining a better balance between speed, accuracy, and robustness.
Despite the strong performance of the PFM-DNN, several limitations need to be addressed to improve efficiency, robustness, and scalability. A major issue is the low segmentation and classification accuracy in harsh conditions, especially in dense fog (mIoU 66.7%) and at night (mIoU 70.5%). Future work should improve depth estimation by integrating additional sensors (thermal, radar, LiDAR) to improve perception in low-visibility environments. The model also needs improvements in motion handling, temporal consistency, and monitoring and attention mechanisms to operate reliably in harsh conditions. Further generalization testing is needed in urban environments beyond the 3DCity dataset; testing on diverse datasets and realistic field experiments with autonomous drones would enhance adaptability. Finally, robustness and adaptability remain open challenges. While PFM-DNN incorporates uncertainty estimation, more research is needed on self-supervised learning and adversarially robust training to improve resilience to attacks, sensor noise, and domain shifts.

4.6. Comparison with State-of-the-Art Models

To validate the ability of PFM-DNN to outperform other models, we conducted comparisons against three state-of-the-art models: DeepLabV3+ [37], Mask R-CNN [46], and HRNet [11], using the 3DCity dataset of 50,000 annotated urban scenes. These models are widely used across a broad range of applications (Table 4). Each model was tested under the same conditions (normal daylight, nighttime, foggy weather, and motion blur) using the metrics reported in Table 4.
Based on the evaluation results, PFM-DNN demonstrates the most balanced performance among the compared models, achieving the highest segmentation accuracy with a mean IoU of 81.5%, an overall classification accuracy of 94.9%, precision of 95.1%, and recall of 91.4%, making it a reliable model for drone navigation. HRNet achieved slightly higher precision (95.3%) and recall (91.6%) and also ran faster at 40.1 frames per second. In contrast, DeepLabV3+ and Mask R-CNN exhibited lower accuracy and efficiency, with mean IoU scores of 76.8% and 74.5%, respectively, and longer inference times of 53.2 ms and 62.7 ms.
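The metrics in Table 4 are standard segmentation and runtime statistics; the sketch below shows, under our own assumptions (placeholder model_fn, frame and label lists, class count), how mean IoU, accuracy, macro precision/recall, FPS, and per-frame inference time could be computed from pixel-level predictions. It is a generic harness, not the evaluation code used in this study.

```python
import time
import numpy as np

def confusion_matrix(pred, gt, num_classes):
    """Accumulate a pixel-wise confusion matrix for one frame (rows: ground truth)."""
    valid = (gt >= 0) & (gt < num_classes)
    idx = num_classes * gt[valid].astype(int) + pred[valid].astype(int)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def summarize(cm):
    """Derive mean IoU, overall accuracy, and macro precision/recall from a confusion matrix."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    return {
        "miou": float(np.mean(tp / np.maximum(tp + fp + fn, 1.0))),
        "accuracy": float(tp.sum() / cm.sum()),
        "precision": float(np.mean(tp / np.maximum(tp + fp, 1.0))),
        "recall": float(np.mean(tp / np.maximum(tp + fn, 1.0))),
    }

def benchmark(model_fn, frames, labels, num_classes):
    """Run `model_fn` over all frames, timing inference to estimate FPS and latency."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    start = time.perf_counter()
    for frame, gt in zip(frames, labels):
        cm += confusion_matrix(model_fn(frame), gt, num_classes)
    elapsed = time.perf_counter() - start
    stats = summarize(cm)
    stats["fps"] = len(frames) / elapsed
    stats["inference_time_ms"] = 1000.0 * elapsed / len(frames)
    return stats
```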
Overall, PFM-DNN proved to be an effective and reliable framework for drone navigation in urban scene environments. Figure 12 depicts a visual comparison of the simulation results across all models. The comparative analysis shows that PFM-DNN offers significant improvements over the baseline model by maintaining a strong balance between robustness, classification accuracy, and real-time adaptability in dynamic urban scenarios.

4.7. Limitations

Despite the strong performance of the PFM-DNN framework, some limitations and potential failure scenarios remain. Model accuracy drops when dealing with small or partially occluded objects, such as pedestrians and cyclists, especially under motion blur or in low-light conditions. Harsh environmental conditions, including fog, nighttime lighting, and glare, also significantly degrade segmentation and classification results. Additionally, sensor issues, such as LiDAR signal loss, camera noise, and data misalignment or lag, can affect navigation reliability. Furthermore, reliance on cloud updates and high computational demands may limit immediate deployment on small drones. While the system performed exceptionally well on the 3DCity dataset, broader evaluation is needed to ensure the generalizability of the results across diverse urban environments. Future work will focus on enhancing the system’s resilience and adaptability by integrating multiple sensors, improving temporal consistency, and strengthening edge and cloud processing, as well as conducting extensive field testing to ensure reliable and effective performance in real-world conditions.

5. Conclusions

This paper introduces a novel framework for autonomous drone navigation and obstacle avoidance in smart cities. The proposed framework integrates the phase-field method with deep neural networks and utilizes cloud-based infrastructure for real-time data updates. By integrating the phase-field method with DNNs, the framework provides a continuous field representation of obstacles and navigable paths, allowing real-time decision-making that enhances the safety and efficiency of drone operations. The cloud-based infrastructure ensures rapid data synchronization, enabling the system to adapt promptly to dynamic changes in the urban environment. Through extensive testing in 3D simulated environments and on benchmark datasets, the effectiveness of the framework was demonstrated across various scenarios. This study contributes to the advancement of smart city technologies by offering a scalable and adaptable solution for drone navigation in urban environments. Despite the robust performance of PFM-DNN, several gaps must still be addressed to make it more efficient, reliable, and capable of handling more challenging tasks. One major challenge is identifying and classifying objects in adverse conditions, especially in dense fog (where the mean IoU drops to 66.7%) and at night (70.5%). To address these issues, future work should focus on improving depth perception by incorporating additional sensors such as thermal cameras, radar, or LiDAR, enabling the system to better perceive its surroundings in low-visibility environments. The network also requires further improvement in tracking motion over time, maintaining consistency across frames, and incorporating monitoring and attention mechanisms to ensure reliable performance under challenging conditions.
Although the initial evaluation results are promising, actual deployments will require further robustness and optimization to address sensor shortcomings. The initial evaluations focused on environmental conditions such as foggy weather and low-light scenarios. Drone operations in urban areas also face LiDAR signal interruptions caused by reflective surfaces, camera noise in low light, latency due to communication delays, and misaligned multi-sensor setups. If left unaddressed, these factors can significantly degrade perception and navigation accuracy.
Future work will focus on designing and validating the cloud-based component of the framework, including a full analysis of communication protocols, latency, bandwidth, and fault-tolerance mechanisms, to support scalable real-time drone navigation.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/vehicles8010006/s1.

Author Contributions

Conceptualization, R.H.A.; Methodology, L.A. and R.H.A.; Software, L.A.; Validation, L.A. and R.H.A.; Formal analysis, R.H.A.; Investigation, L.A.; Data curation, L.A.; Writing—original draft, L.A.; Writing—review & editing, R.H.A.; Supervision, R.H.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are openly available at https://github.com/3dcitydb and https://www.kaggle.com/datasets/shuvoalok/cityscapes.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Abiyev, R.H.; Akkaya, N.; Aytac, E.; Ibrahim, D. Behaviour-tree Based Control for Efficient Navigation of Holonomic Robots. Int. J. Robot. Autom. 2014, 29, 44–57. [Google Scholar] [CrossRef]
  2. Maguire-Day, J.; Al-Rubaye, S.; Warrier, A.; Sen, M.A.; Whitworth, H.; Samie, M. Emerging Decision-Making for Transportation Safety: Collaborative Agent Performance Analysis. Vehicles 2025, 7, 4. [Google Scholar] [CrossRef]
  3. Xiao, J.; Zhang, R.; Zhang, Y.; Feroskhan, M. Vision-based Learning for Drones: A Survey. IEEE Trans. Neural Networks Learn. Syst. 2023, 36, 15601–15621. [Google Scholar] [CrossRef] [PubMed]
  4. Eldeeb, E.; Sant’Ana, J.M.d.S.; Pérez, D.E.; Shehab, M.; Mahmood, N.H.; Alves, H. Multi-UAV Path Learning for Age and Power Optimization in IoT with UAV Battery Recharge. IEEE Trans. Veh. Technol. 2023, 72, 5356–5360. [Google Scholar] [CrossRef]
  5. Panico, A.; Fragonara, L.Z.; Al-Rubaye, S. Adaptive detection tracking system for autonomous UAV maritime patrolling. In Proceedings of the 2020 IEEE International Workshop on Metrology for AeroSpace, MetroAeroSpace, Pisa, Italy, 22–24 June 2020; pp. 539–544. [Google Scholar] [CrossRef]
  6. Zhang, Z.; Wu, J.; Dai, J.; He, C. A Novel Real-Time Penetration Path Planning Algorithm for Stealth UAV in 3D Complex Dynamic Environment. IEEE Access 2020, 8, 122757–122771. [Google Scholar] [CrossRef]
  7. Bahabry, A.; Wan, X.; Ghazzai, H.; Menouar, H.; Vesonder, G.; Massoud, Y. Low-Altitude Navigation for Multi-Rotor Drones in Urban Areas. IEEE Access 2019, 7, 87716–87731. [Google Scholar] [CrossRef]
  8. Mao, Y.; Chen, M.; Wei, X.; Chen, B. Obstacle recognition and avoidance for UAVs under resource-constrained environments. IEEE Access 2020, 8, 169408–169422. [Google Scholar] [CrossRef]
  9. Yang, S.; Meng, Z.; Chen, X.; Xie, R. Real-time obstacle avoidance with deep reinforcement learning: Three-dimensional autonomous obstacle avoidance for UAV. In Proceedings of the 2019 International Conference on Robotics, Intelligent Control and Artificial Intelligence; ACM International Conference Proceeding Series; ACM: New York, NY, USA, 2019; pp. 324–329. [Google Scholar] [CrossRef]
  10. Neapolitan, R.E. Learning Bayesian Networks; Pearson Prentice Hall: Upper Saddle River, NJ, USA, 2004; Volume 38. [Google Scholar]
  11. Torres, R.D.; Molina, M.; Campoy, P. Survey of Bayesian Network applications to Intelligent Autonomous Vehicles (IAVs). arXiv 2019, arXiv:1901.05517. [Google Scholar]
  12. de Santana, C.A.; Colombini, E.L. Attention, please! A survey of neural attention models in deep learning. Artif. Intell. Rev. 2022, 55, 6037–6124. [Google Scholar] [CrossRef]
  13. Li, X.; Li, Y.; Ai, J.; Shu, Z.; Xia, J.; Xia, Y. Semantic segmentation of UAV remote sensing images based on edge feature fusing and multi-level upsampling integrated with Deeplabv3. PLoS ONE 2023, 18, e0279097. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
  14. Yang, X.; Fan, X.; Peng, M.; Guan, Q.; Tang, L. Semantic segmentation for remote sensing images based on an AD-HRNet model. Int. J. Digit. Earth 2022, 15, 2376–2399. [Google Scholar] [CrossRef]
  15. Hou, T.; Li, J. Application of mask R-CNN for building detection in UAV remote sensing images. Heliyon 2024, 10, e38141. [Google Scholar] [CrossRef]
  16. Alali, M.; Imani, M. Bayesian reinforcement learning for navigation planning in unknown environments. Front. Artif. Intell. 2024, 7, 1308031. [Google Scholar] [CrossRef]
  17. Huang, Y.; Xiang, X.; Yan, C.; Xu, H.; Zhou, H. Density-Based Probabilistic Graphical Models for Adaptive Multi-Target Encirclement of AAV Swarm. IEEE Robot. Autom. Lett. 2025, 10, 8228–8235. [Google Scholar] [CrossRef]
  18. Ren, Y.; Dong, G.; Zhang, T.; Zhang, M.; Chen, X.; Xue, M. UAVs-Based Visual Localization via Attention-Driven Image Registration Across Varying Texture Levels. Drones 2024, 8, 739. [Google Scholar] [CrossRef]
  19. Srivastava, S.; Narayan, S.; Mittal, S. A survey of deep learning techniques for vehicle detection from UAV images. J. Syst. Archit. 2021, 117, 102152. [Google Scholar] [CrossRef]
  20. Bucolo, M.; Buscarino, A.; Fortuna, L.; Gagliano, S. Bifurcation scenarios for pilot induced oscillations. Aerosp. Sci. Technol. 2020, 106, 106194. [Google Scholar] [CrossRef]
  21. Wang, C.; Wang, J.; Shen, Y.; Zhang, X. Autonomous Navigation of UAVs in Large-Scale Complex Environments: A Deep Reinforcement Learning Approach. IEEE Trans. Veh. Technol. 2019, 68, 2124–2136. [Google Scholar] [CrossRef]
  22. Fu, C.; Xu, X.; Zhang, Y.; Lyu, Y.; Xia, Y.; Zhou, Z.; Wu, W. Memory-enhanced deep reinforcement learning for UAV navigation in 3D environment. Neural Comput. Appl. 2022, 34, 14599–14607. [Google Scholar] [CrossRef]
  23. Javadi, S.; Dahl, M.; Pettersson, M.I. Vehicle Detection in Aerial Images Based on 3D Depth Maps and Deep Neural Networks. IEEE Access 2021, 9, 8381–8391. [Google Scholar] [CrossRef]
  24. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  25. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  26. Wu, X.; Li, W.; Hong, D.; Tao, R.; Du, Q. Deep Learning for Unmanned Aerial Vehicle-Based Object Detection and Tracking: A survey. IEEE Geosci. Remote Sens. Mag. 2022, 10, 91–124. [Google Scholar] [CrossRef]
  27. Abiyev, R.H.; Akkaya, N.; Aytac, E.; Abizada, S. Fuzzy Gain Scheduling Controller for Quadrotor. In Proceedings of the 11th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions and Artificial Intelligence—ICSCCW 2021; Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2022; Volume 362, pp. 375–383. [Google Scholar] [CrossRef]
  28. Haider, S.K.; Nauman, A.; Jamshed, M.A.; Jiang, A.; Batool, S.; Kim, S.W. Internet of Drones: Routing Algorithms, Techniques and Challenges. Mathematics 2022, 10, 1488. [Google Scholar] [CrossRef]
  29. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar] [CrossRef]
  30. Yan, W.; Chen, C.; Zhang, D. U-Net-based medical image segmentation algorithm. In Proceedings of the 13th International Conference on Wireless Communications and Signal Processing, WCSP 2021, Changsha, China, 20–22 October 2021. [Google Scholar] [CrossRef]
  31. Liu, Y.; Ma, G.; Lyu, Y.; Wang, P. Neural network-based reinforcement learning control for combined spacecraft attitude tracking maneuvers. Neurocomputing 2022, 484, 67–78. [Google Scholar] [CrossRef]
  32. El Debeiki, M.; Al-Rubaye, S.; Perrusquía, A.; Conrad, C.; Flores-Campos, J.A. An Advanced Path Planning and UAV Relay System: Enhancing Connectivity in Rural Environments. Future Internet 2024, 16, 89. [Google Scholar] [CrossRef]
  33. Liu, S.; Lin, Z.; Huang, W.; Yan, B. Technical development and future prospects of cooperative terminal guidance based on knowledge graph analysis: A review. J. Zhejiang Univ.-Sci. A Appl. Phys. Eng. 2025, 26, 605–634. [Google Scholar] [CrossRef]
  34. Shahkar, S. Cooperative Localization of Multi-Agent Autonomous Aerial Vehicle (AAV) Networks in Intelligent Transportation Systems. IEEE Open J. Intell. Transp. Syst. 2025, 6, 49–66. [Google Scholar] [CrossRef]
  35. Zafar, M.; Khan, R.A.; Fedoseev, A.; Jaiswal, K.K.; Sujit, P.B.; Tsetserukou, D. HetSwarm: Cooperative Navigation of Heterogeneous Swarm in Dynamic and Dense Environments Through Impedance-Based Guidance. In Proceedings of the 2025 International Conference on Unmanned Aircraft Systems (ICUAS), Charlotte, NC, USA, 14–17 May 2025; pp. 309–315. [Google Scholar] [CrossRef]
  36. Xing, W.; Cui, Z.; Qi, J. HRCTNet: A hybrid network with high-resolution representation for object detection in UAV image. Complex Intell. Syst. 2023, 9, 6437–6457. [Google Scholar] [CrossRef]
  37. Siddique, N.; Paheding, S.; Elkin, C.P.; Devabhaktuni, V. U-Net and its variants for medical image segmentation: A review of theory and applications. Int. J. Remote Sens. 2021, 42, 5510–5546. [Google Scholar] [CrossRef]
  38. Janai, J.; Güney, F.; Behl, A.; Geiger, A. Computer vision for autonomous vehicles: Problems, datasets and state of the art. Found. Trends Comput. Graph. Vis. 2020, 12, 1–308. [Google Scholar] [CrossRef]
  39. Villanueva, A.; Fajardo, A. UAV Navigation System with Obstacle Detection using Deep Reinforcement Learning with Noise Injection. In Proceedings of the 2019 International Conference on ICT for Smart Society (ICISS), Bandung, Indonesia, 19–20 November 2019; pp. 1–6. [Google Scholar] [CrossRef]
  40. Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep High-Resolution Representation Learning for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 3349–3364. [Google Scholar] [CrossRef]
  41. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2018; Volume 11211. [Google Scholar] [CrossRef]
  42. Ozgür-Unlüakin, D.; Bilgiç, T. Predictive maintenance using dynamic probabilistic networks. In Proceedings of the 3rd European Workshop on Probabilistic Graphical Models PGM’06 2006, Prague, Czech Republic, 12–15 September 2006; pp. 141–148. [Google Scholar]
  43. Mousa, M.; Al-Rubaye, S.; Inalhan, G. Unmanned Aerial Vehicle Positioning using 5G New Radio Technology in Urban Environment. In Proceedings of the AIAA/IEEE Digital Avionics Systems Conference, Barcelona, Spain, 1–5 October 2023. [Google Scholar] [CrossRef]
  44. Wang, F.; Long, L. Event-triggered adaptive NN control for MIMO switched nonlinear systems with non-ISpS unmodeled dynamics. J. Franklin Inst. 2022, 359, 1457–1485. [Google Scholar] [CrossRef]
  45. Zhang, X.; Han, N.; Zhang, J. Comparative analysis of VGG, ResNet, and GoogLeNet architectures evaluating performance, computational efficiency, and convergence rates. Appl. Comput. Eng. 2024, 44, 172–181. [Google Scholar] [CrossRef]
  46. Huang, Q.; Yang, X.; Wang, L.; Wang, Z.; Liu, Y. Research on U-Net-based medical image segmentation method. Math. Biosci. Eng. 2020, 17, 2792–2807. [Google Scholar] [CrossRef]
Figure 1. Obstacle avoidance in vision-based drones.
Figure 2. An outline of the proposed model.
Figure 3. DNN algorithm design architecture.
Figure 4. The integration of phase-field methods with deep learning for autonomous control.
Figure 5. High-resolution urban scenes from the Cityscapes dataset.
Figure 6. Proposed simulation process steps.
Figure 7. Visualization of DNN training.
Figure 8. Three-dimensional city data that is used for the testing.
Figure 9. UAV navigation in a simulated city environment.
Figure 10. Obstacle classification accuracy.
Figure 11. Robustness to environmental conditions—performance metrics.
Figure 12. A comparative analysis of PFM-DNN vs. state-of-the-art models: HRNet [11], DeepLabV3+ [37], and Mask R-CNN [46].
Table 2. Object classification performance metrics for PFM-DNN.

Condition/Object Category | Accuracy (%) | Precision (%) | Recall (%) | False Negative Rate (%) | False Positive Rate (%)
Overall Performance | 92.5 | 94.1 | 90.8 | 9.2 | 5.8
Standard Daylight | 94.7 | 95.6 | 92.9 | 7.1 | 4.5
Nighttime (Low-Light) | 89.2 | 91.8 | 87.4 | 12.6 | 6.7
Foggy Weather | 86.5 | 89.5 | 84.3 | 15.7 | 7.3
Motion Blur Scenario | 90.3 | 92.2 | 88.5 | 11.5 | 6.1
Vehicles | 96.3 | 97.8 | 94.6 | 5.4 | 3.7
Pedestrians | 88.4 | 90.6 | 86.3 | 12.7 | 5.9
Cyclists | 85.9 | 88.1 | 83.6 | 14.1 | 6.5
Infrastructure (Signs, Buildings) | 91.8 | 94.0 | 89.2 | 10.8 | 5.2
Environmental Obstacles (Trees, Construction) | 87.6 | 89.9 | 85.1 | 14.9 | 7.6
DeepLabV3+ (Vehicles Comparison) | 82.7 | 73.3 | N/A | N/A | N/A
Mask R-CNN (Vehicles Comparison) | 37.1 | 65.0 | N/A | N/A | N/A
HRNet (Vehicles Comparison) | 54.9 | 75.1 | N/A | N/A | N/A
Table 3. Performance metrics under different environmental conditions.

Model/Environmental Condition | Mean IoU (%) | Classification Accuracy (%) | Uncertainty Score (%) | False Positive Rate (%)
PFM-DNN, Standard Daylight (Baseline) | 85.4 | 94.7 | 3.2 | 4.5
PFM-DNN, Nighttime (Low-Light) | 70.5 | 85.2 | 12.8 | 6.7
PFM-DNN, Foggy Weather | 66.7 | 82.9 | 14.6 | 7.9
PFM-DNN, High-Glare Sunlight | 72.4 | 87.1 | 9.3 | 7.2
PFM-DNN, Motion Blur Scenario | 72.9 | 90.4 | 8.3 | 6.1
DeepLabV3+, Foggy Weather (Comparison) | 58.2 | 79.6 | 16.4 | 9.2
Mask R-CNN, Nighttime (Comparison) | 65.7 | 80.9 | 13.7 | 8.4
HRNet, Nighttime (Comparison) | 72.8 | 87.6 | 11.5 | 6.2
Table 4. Comparative models: performance metrics.

Model | Mean IoU (%) | Accuracy (%) | Precision (%) | Recall (%) | FNR (%) | FPS | Inference Time (ms)
PFM-DNN | 81.5 | 94.9 | 95.1 | 91.4 | 9.2 | 36.3 | 27.5
HRNet [11] | 81.4 | 94.7 | 95.3 | 91.6 | 8.4 | 40.1 | 24.9
DeepLabV3+ [37] | 76.8 | 89.2 | 91.5 | 88.2 | 11.8 | 18.8 | 53.2
Mask R-CNN [46] | 74.5 | 87.6 | 90.1 | 87.6 | 12.4 | 15.9 | 62.7