Multi-Tier 3D Trajectory Planning for Cellular-Connected UAVs in Complex Urban Environments

Luo, Xiling; Zhang, Tianyi; Xu, Wenxiang; Fang, Chao; Lu, Tongwei; Zhou, Jialiu

doi:10.3390/sym15091628

Open AccessEditor’s ChoiceArticle

Multi-Tier 3D Trajectory Planning for Cellular-Connected UAVs in Complex Urban Environments

by

Xiling Luo

^1,2,

Tianyi Zhang

^1,2,

Wenxiang Xu

^2,*,

Chao Fang

^3,4

,

Tongwei Lu

⁴

and

Jialiu Zhou

²

¹

School of Electronic and Information Engineering, Beihang University, Beijing 100191, China

²

Hangzhou Innovation Institute, Beihang University, Hangzhou 310051, China

³

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

⁴

Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology), Wuhan 430205, China

^*

Author to whom correspondence should be addressed.

Symmetry 2023, 15(9), 1628; https://doi.org/10.3390/sym15091628

Submission received: 25 July 2023 / Revised: 18 August 2023 / Accepted: 21 August 2023 / Published: 23 August 2023

(This article belongs to the Section Computer)

Download

Browse Figures

Versions Notes

Abstract

:

Cellular-connected unmanned aerial vehicles (UAVs) present a viable solution to address communication and navigation limitations by leveraging base stations for air–ground communication. However, in complex urban scenarios with stringent communication requirements, achieving asymmetrical control becomes crucial to strike a balance between communication reliability and flight safety. Moreover, existing cellular-connected UAV trajectory planning algorithms often struggle to handle real scenes with sudden and intricate obstacles. To address the aforementioned challenges, this paper presents the multi-tier trajectory planning method (MTTP), which takes into account air–ground communication service assurance and collision avoidance in intricate urban environments. The proposed approach establishes a flight risk model that accounts for both the outage probability of UAV-ground base station (GBS) communication and the complexity of flight environments, and transforms the inherently complex three-dimensional (3D) trajectory optimization problem into a risk distance minimization model. To optimize the flight trajectory, a hierarchical progressive solution approach is proposed, which combines the strengths of the heuristic search algorithm (HSA) and deep reinforcement learning (DRL) algorithm. This innovative fusion of techniques empowers MTTP to efficiently navigate complex scenarios with sudden obstacles and communication challenges. Simulations show that the proposed MTTP method achieves a more superior performance of trajectory planning than the conventional communication-based solution, which yields a substantial reduction in flight distance of at least 8.49% and an impressive 10% increase in the mission success rate. Furthermore, a real-world scenario is chosen from the Yuhang District, Hangzhou (a southern Chinese city), to validate the practical applicability of the MTTP method in highly complex operating scenarios.

Keywords:

unmanned aerial vehicle (UAV); trajectory planning; asymmetrical network; deep reinforcement learning; urban environment

1. Introduction

Unmanned aerial vehicles (UAVs) have undergone significant and noteworthy advancements in various technical domains, encompassing control technology, data transmission, navigation and communication systems, and payload equipment. These progressive developments have greatly facilitated their extensive utilization across a diverse array of industries, such as cargo delivery, traffic control, aerial inspections, and data collection [1,2,3,4].

Nevertheless, navigation-related challenges, such as signal instability and imprecise positioning arising from GPS signal interference, have emerged as primary factors impeding the successful trajectory planning of UAVs. These issues hinder the attainment of safety and efficiency prerequisites essential for actual UAV operations, significantly restricting their applicability in complex urban environments [5,6,7]. To address these challenges, researchers have proposed UAV trajectory planning methods based on urban cellular network systems, which have gained prominence within the academic community. This approach relies on a dense distribution of ground base stations (GBSs) to furnish economical communication links and an extended control range [8]. Meanwhile, the asymmetric nature of GBS bidirectional communication enables UAVs to receive commands and transmit data efficiently, providing strong support for precise trajectory planning and dynamic flight. The positioning method based on GBS signals effectively mitigates communication interruptions caused by unstable GPS signals and complex environments, thereby enhancing positioning accuracy for trajectory planning in more intricate scenarios [9]. Moreover, the emergence of 5G and higher versions of mobile networks is anticipated to bring significant performance enhancements to cellular-connected UAVs, encompassing increased data transmission rates and reduced latency [10].

However, challenges remain. Firstly, several researchers primarily focus on global trajectory planning schemes from the perspective of GBS coverage, overlooking the influence of complex obstacles within the signal coverage area, which make the planning results not feasible in real scenarios [11,12,13]. Secondly, existing methods rely on globally deterministic map information and fail to consider temporary flight zone restrictions in the environment, rendering them inadequate for solving trajectory planning problems in low-altitude dynamic urban environments [14,15]. In such scenarios, the implementation of asymmetric control between communication reliability and flight safety is of paramount importance. This paper endeavors to address the aforementioned challenges by proposing a novel approach to enhance the efficacy of trajectory planning through the GBS-UAV link while concurrently ensuring flight safety in intricate and dynamic urban settings. To achieve this objective, a multi-tier trajectory planning (MTTP) method is introduced and specifically tailored to cellular-connected UAVs. This method seamlessly integrates air–ground communication service assurance with advanced collision avoidance mechanisms, designed to navigate complex environments effectively. This method comprises a connectivity-aware global planning tier and a collision-free local planning tier. In the connectivity-aware global planning tier, the GBS’s signal coverage is a planning unit used to obtain the globally optimal path by integrating the distance and flight risk of each GBS. In the collision-free local planning tier, the optimal 3D trajectory is further solved within the GBSs in the 3D urban area. The contributions of this paper can be summarized as follows.

This paper introduces the MTTP method tailored for cellular-connected UAVs, which effectively addresses the optimization challenge of determining the optimal flight path in complex dynamic environments. This method achieves integration between air–ground communication service assurance and collision avoidance while considering the mixed constraints imposed by communication reliability and environmental complexity;
This paper presents a flight risk model that takes into account the communication outage probability of the GBS-UAV link and the complexity of the flight environment. Leveraging this model, the complex 3D trajectory optimization problem is formulated as a risk distance minimization problem;
This paper proposes a hierarchical progressive solution approach that combines a heuristic search algorithm (HSA) with deep reinforcement learning (DRL). The algorithm factors in communication conditions, statically known obstacles, and unexpected obstacles detected by the UAV’s sensors to devise an optimal flight strategy. Additionally, the proposed DRL algorithm enhances accuracy and stability by integrating the double deep Q-Network (DDQN) with the dueling network structure.

The subsequent sections of this paper are structured as follows. In Section 2, related work is introduced. In Section 3, the system model is presented and the optimization problem is formulated. The proposed MTTP method and detailed design and implementation information are described in Section 4. Section 5 presents the experimental results, and finally, Section 6 provides the concluding remarks for the paper.

2. Related Work

2.1. Trajectory Planning Research

In recent years, researchers have made significant progress in developing various approaches to tackle the trajectory planning problem for cellular-connected UAVs [11,12,13,14,15]. Zhang et al. [11] conducted a systematic review in 2018, focusing on the communication connectivity constraints in UAV trajectory design. They proposed a shortest trajectory planning scheme with a minimum signal-to-noise ratio (SNR) as the connectivity constraint. Expanding on this work, Zhang et al. [12] developed a comprehensive channel gain map for GBSs to characterize the large-scale channel gains in a 3D space environment. To optimize the communication performance, they ingeniously incorporated load factors into the system, resulting in a signal-to-interference-plus-noise ratio (SINR) map. By leveraging graph theory methods, they successfully designed the shortest communication path while adhering to a specified minimum SINR constraint. Bulut et al. [13] explored the trajectory optimization problem of cellular-connected UAVs with time disconnection constraints, applying dynamic programming theory to find the optimal path considering blackout duration conditions. Similar problems were also investigated in references [14,15].

However, these studies have predominantly overlooked the impact of complex obstacles on trajectory planning algorithms. In urban scenarios, UAVs frequently encounter intricate obstacles formed by various types of buildings and traffic environments during mission execution. Therefore, it is crucial to consider UAV trajectory planning problems that encompass both static and unexpected obstacles to provide a more realistic representation. To address this challenge, scholars have conducted in-depth research on offline and online combinative trajectory planning algorithms. Yin et al. [16] present a multi-objective trajectory planning framework for exploring UAV operation paths in dynamic urban environments. Their method captured static obstacles and unexpected obstacles in geographic maps through a safety index map (SIM). Based on this map, they planned UAV flight paths to avoid static obstacles and navigate swiftly around unexpected obstacles. Wu et al. [17] designed an adaptive path replanning method using a discrete rapidly exploring random tree (DRRT) algorithm to solve path problems by leveraging dynamic characteristics of urban environments.

It is worth noting that existing research has mainly focused on the use of cellular networks in achieving various achievements. Related research includes UAV monitoring and management, communication relay, and cellular positioning [7,9,10,18]. Consequently, the path planning challenge concerning cellular-connected UAVs, which entails utilizing cellular networks, has garnered considerable attention in the research community. As a response to this, the present study concentrates on the operational mode of cellular-connected UAVs, with the objective of augmenting the efficacy of trajectory planning through the GBS-UAV link while ensuring safe flight within complex dynamic urban environments at low altitudes.

2.2. Trajectory Planning Algorithm

Researchers have devoted significant efforts to studying methods for generating collision-free paths in obstacle-laden environments, which hold great importance for UAVs. Among the various trajectory planning algorithms, the geometry method, velocity space method, and graph search method stand out as three frequently employed approaches. The geometry method leverages the geometric information of obstacles for path searching, including commonly used techniques, such as the artificial potential field [19], vector field histogram [20], and Voronoi diagram [21]. The velocity space method includes the curvature velocity [22] and the dynamic window method [23]. Its core idea is to consider the dynamic model of the vehicle. The graph search method is often used to find the shortest path between two given points, and the A* algorithm is widely known for its effectiveness in achieving obstacle avoidance and pathfinding concurrently. Wu et al. [24] described a multi-step A* search method, which uses the successor operator to find a path with the lowest cost by changing the parameters of the track. Moreover, Hernandez et al. [25] designed a multi-objective A* algorithm that takes flight distance, area deviation, and power consumption into account to determine the UAV’s path from the starting point to the destination. Furthermore, Guglieri et al. [26] propose a modified A* algorithm called RA* that accounts for the risk of flying over a specific area.

Despite addressing pathfinding problems in complex dynamic environments, these studies require real-time environment modeling and lack generalization learning capabilities [27]. Consequently, the algorithmic complexity increases as the number of obstacle objects grows, making it challenging to apply these methods to pathfinding in low-altitude complex environments [28]. Fortunately, advancements in the realm of reinforcement learning (RL) have offered an alternative approach to tackle trajectory planning issues in intricate environments [29]. DRL, which represents an advanced fusion of RL and deep learning (DL), employs UAVs as agents, enabling them to acquire navigation strategies by continuously interacting with their surroundings and receiving feedback information [30]. These strategies can be generalized to a wide parameter space of scenarios without the need for expensive retraining when the scenarios change. Bayerlein et al. [31] demonstrated the effectiveness of their DQN-based trajectory planning in generalizing to different scenarios, effectively balancing data collection and flight time efficiency objectives. Additionally, Theile et al. [32] proposed a DQN-based 2D city environment trajectory planning algorithm that utilizes uncompressed local maps and compressed global maps for collaborative planning near the UAV. Xu et al. [33] introduced a policy using the gravity-aware DQN approach, which provides an online strategy to avoid unexpected obstacles. It incorporates a reward scheme with safety counts to guide the agent with global information. Behzard et al. [34] presented a connectivity-constrained UAV trajectory optimization problem based on DDQN, considering the influence of the communication outage time and static obstacles on UAV trajectory planning. They also addressed the overestimation issue of DQN through decoupling.

While current trajectory planning algorithms based on DRL effectively tackle UAV trajectory planning challenges in complex dynamic obstacle environments, most of the existing research predominantly focuses on 2D spatial scenarios. Additionally, the collaborative planning capability that utilizes both offline and online environmental information is insufficient to achieve global optimality. In response to these limitations, this paper proposes the MTTP method specially designed for cellular-connected UAVs, which seamlessly integrates air–ground communication service assurance with collision avoidance mechanisms in complex 3D urban environments. In the connectivity-aware global planning tier, we optimize the flight path between GBSs by utilizing the HSA that takes into consideration communication quality, environmental complexity, and flight distance. Meanwhile, in the collision-free local planning tier, we address the flight path problem within the GBS area and present a local optimal 3D path algorithm based on dueling DDQN (D3QN).

3. System Model and Problem Formulation

In this section, we present the scenario, UAV mobility, and flight risk models, and formulate the 3D trajectory optimization problem as a minimal risk distance model to safeguard the flight safety of UAVs in complex urban environments.

3.1. Scenario and UAV Mobility Model

As shown in Figure 1, our study focuses on a cellular-connected UAV system that operates in an urban low-altitude airspace. Buildings in this area are densely distributed, and there are temporary flight restrictions (TFRs) and the deployment of several GBSs. The urban environment of interest is defined within a Cartesian coordinate system, denoted as

D \times D \times H m^{3}

. Here, D signifies the horizontal extent of the city, and H represents the city’s vertical extent. The maximum height of the building in the environment is represented by

h_{b d}

. In our system model, we define the GBS-UAV link based on the area and height of the building, following the recommendation of the International Telecommunication Union (ITU) [35].

We consider that the UAV flight trajectory includes a set of grid points, represented as

N = \{1, 2, \dots, N\}

, where N is the total number of grid points. The UAV’s position in 3D space at any given time t is denoted as

q_{n} (t) = (x_{n}, y_{n}, h_{n})

, where

n \in N

represents the current grid number. Here,

x_{n} \in [0, D]

and

y_{n} \in [0, D]

are the horizontal coordinates of the nth grid point, while

h_{n} \in [h_{m i n}, h_{m a x}]

is used to represent the UAV’s flight height of the region. During the flight, we assume that the UAV traveling at a constant speed, denoted by

V (m / s)

, and that the total flight time, represented as T, can be divided into k discrete steps such that

T = k Δ_{t}

. Therefore, the UAV’s 3D trajectory, represented as

\{q (t)\}

, can be considered a sequence

{\{q_{n}\}}_{n = 1}^{N}

, where

q_{s} = (x_{s}, y_{s}, h_{s})

is the starting point of the mission and

q_{f} = (x_{f}, y_{f}, h_{f})

is the ending point. According to the 3D discrete grid space, we model the UAV’s motion space based on adjacent grids and define the shortest distance between adjacent grid points as

Δ_{d}

. By using the knowledge of geometry, we can obtain other optional distances, such as

\sqrt{2} Δ_{d}

, and limit the maximum distance to

\sqrt{3} Δ_{d}

, as illustrated in Figure 2. Moreover, the UAV in our model can obtain real-time information about their surroundings through remote sensing systems, such as onboard cameras, infrared detection, and hyperspectral sensor [36].

3.2. Flight Risk Model

In this section, we describe models that quantify the communication outage probability of the GBS-UAV link and the complexity of the environmental conditions. These elements are crucial for maintaining effective communication during the UAV’s flight in an urban scenario. We then incorporate these models to establish an integrated flight risk model. This composite model aids in formulating an efficient trajectory planning strategy for the UAV, thus ensuring its safe and effective operation in complex urban environments.

3.2.1. Communication Outage Probability

We utilize a communication outage probability model to assess the stability of the UAV-GBS link. Consider a set of GBSs in the urban environment denoted by

M = {1, 2, \dots, M}

, where M is the total number of GBSs.

h_{m} (t)

denotes the channel power gain from the position of the mth GBS to the UAV at time t, which depends on the antenna gain of GBS, large-scale channel power gain, and small-scale fading. Based on [18], the received signal power by the UAV from the mth GBS at time t is given as

\begin{matrix} p_{m} (t) = P_{m} {| h_{m} (t) |}^{2} = P_{m} β_{m} (q_{n} (t)) {\bar{h}}_{m} (q_{n} (t)) {\tilde{h}}_{m} (t), \forall m \in M, \end{matrix}

(1)

where

P_{m}

is the constant transmit power of the mth GBS,

β_{m} (q_{n} (t))

and

{\bar{h}}_{m} (q_{n} (t))

represent the antenna gain of the GBS and large-scale channel power gain dependent on the current UAV position

q_{n} (t)

, and

{\tilde{h}}_{m} (t)

represents small-scale fading. The large-scale channel power gain

{\bar{h}}_{m} (q_{n} (t))

is significantly affected by the building’s position between the UAV and GBS. Based on the urban microcellular (UMI) model, as specified in the 3GPP specification [37], we assume that the power gain caused by static obstacles is constant and the large-scale channel power gain can be separated into line-of-sight (LoS) and non-line-of-sight (NLoS) links. For LoS, the large-scale channel power gain is expressed as

\begin{matrix} {\bar{h}}_{m}^{L o S} (q_{n} (t)) = max {h_{m}^{F S P L}, 30.9 + (22.25 - 0.5 {log}_{10} h_{n}) {log}_{10} d_{m} (t) + 20 {log}_{10} f_{c}}, \end{matrix}

(2)

where

h_{m}^{F S P L}

is the free-space path loss,

d_{m} (t)

is the 3D distance between the UAV and the mth GBS, and

f_{c}

is the carrier frequency. For NLoS links, the large-scale channel power gain is expressed as

\begin{matrix} {\bar{h}}_{m}^{N L o S} (q_{n} (t)) = max {{\bar{h}}_{m}^{L o S} (q_{n} (t)), 32.94 + (43.2 - 7.6 {log}_{10} h_{n}) {log}_{10} d_{m} (t) + 20 {log}_{10} f_{c}} . \end{matrix}

(3)

Under LoS and NLoS conditions, the small-scale fading

{\tilde{h}}_{m} (t)

follows Rayleigh and Rician fading, respectively. Consider a set of GBSs that have provided services for UAV denoted by

G = {1, 2, \dots, G} \subseteq M

, where G denotes the sum of those GBSs. Then, we denote the GBS serving the UAV at time t as

g (t) \in G

. According to [12], the instantaneous signal-to-interference-plus-noise ratio (SINR) of the downlink is expressed as

\begin{matrix} SINR (t) = \frac{P_{g (t)} {| h_{g (t)} (t) |}^{2}}{\sum_{m \neq g (t)} P_{m} {| h_{m} (t) |}^{2} + σ^{2}} . \end{matrix}

(4)

Due to the effect of small-scale fading,

SINR (t)

is a random variable for any given flight position

q_{n} (t)

and serving GBS

g (t)

. Thus, the communication outage probability can be defined as the probability of

SINR (t)

being less than a given communication outage threshold

γ_{t h}

, which is expressed as

\begin{matrix} P_{o u t} (q_{n} (t), g (t)) = \Pr {SINR (t) < γ_{t h}} . \end{matrix}

(5)

It is challenging to predict the actual communication outage probability in the operational environment during trajectory planning. We can obtain the real-time SINR by measuring or querying the reference signal receiving power (RSRP) and reference signal receiving quality (RSRQ) reports multiple times within a short duration, and calculate the empirical communication outage probability to estimate the actual outage probability

\begin{matrix} {\hat{P}}_{o u t} (q_{n} (t), g (t)) = \frac{1}{J} \sum_{j = 1}^{J} I (q_{n} (t), g (t)), \end{matrix}

(6)

where

{\hat{P}}_{o u t} (q_{n} (t), g (t))

is the empirical outage probability, J is the number of measurements, and

I (q_{n} (t), g (t))

is the communication outage decision function, which is defined as

I (q_{n} (t), g (t)) = \{\begin{matrix} 1, & SINR (t) < γ_{th}, \\ 0, & otherwise . \end{matrix}

(7)

According to the law of large numbers, when J is large enough, the actual communication outage probability approximates the empirical outage probability, as follows

\begin{matrix} lim_{J \to \infty} {\hat{P}}_{o u t} (q_{n} (t), g (t)) = P_{o u t} (q_{n} (t), g (t)) . \end{matrix}

(8)

Consequently, the UAV’s optimal connection strategy at any given location, referred to as the connection cell

g (t)

, corresponds to the GBS with the lowest empirical disruption probability among the M available GBSs.

3.2.2. Environmental Complexity

To quantify the impact of buildings and TFR for UAV trajectory planning, we introduce the environmental complexity model to assess the area’s safety level [16]. Particularly, buildings are considered geographically known static obstacles, and temporary flight area restrictions are treated as unknown unexpected obstacles in the case of path planning. The static obstacle complexity model, involving the number, area, and height of buildings, can be formulated as

\begin{matrix} E_{s} (q_{n} (t)) = \frac{\sum_{i \in B} S_{i} h_{i}}{S H}, \end{matrix}

(9)

where S is the size of the evaluation area centered on flight position

q_{n} (t)

.

B = {1, 2, \dots, B}

is the set of buildings in the area,

S_{i}

is the area occupied by the ith building and

h_{i}

is the height of the ith building. Additionally, the unexpected obstacle complexity model on the basis of the restricted airspace area and occurrence probability can be expressed as

\begin{matrix} E_{d} (q_{n} (t)) = \frac{τ_{d} S_{d}}{S}, \end{matrix}

(10)

where

τ_{d}

denotes the probability of the TFR occurrence, and

S_{d}

is the area occupied by the restricted airspace. The environmental complexity for any flight position is then given by

\begin{matrix} E_{c} (q_{n} (t)) = δ_{1} E_{s} (q_{n} (t)) + δ_{2} E_{d} (q_{n} (t)), \end{matrix}

(11)

where

δ_{1}

and

δ_{2}

are the corresponding weighting coefficients. Based on the above analysis, the UAV’s flight risk at any position can be written as

\begin{matrix} α_{q_{n} (t)} = μ {\hat{P}}_{o u t} (q_{n} (t), g (t)) + (1 - μ) E_{c} (q_{n} (t)), \end{matrix}

(12)

where

μ

is the corresponding impact factor.

3.3. Problem Formulation

In this paper, the following two objectives are considered by manipulating the UAV’s 3D trajectory

{\{q_{n}\}}_{n = 1}^{N}

, based on the flight risk model discussed earlier:

To minimize the cumulative flight risk from $q_{s}$ to $q_{f}$ ;
To minimize the flight distance between $q_{s}$ and $q_{f}$ .

The optimal strategy for the first objective requires the UAV to avoid as many areas with flight risks as possible to guarantee the safety of the operation; however, this can lead to excessively long flight distances that are even more difficult to solve. The second objective entails the UAV being tasked with flying the shortest distance possible from its starting point to its destination. Nevertheless, this may not always be feasible as it may cross areas with high flight risk, thus hindering the successful completion of the UAV mission. As such, this problem necessitates a synergistic consideration of both optimization objectives to obtain the optimal trajectory by taking the flight distance as the crucial factor in the solution while ensuring the safety of the UAV flights. Hereby, the optimization problem can be expressed as a minimal risk distance model in Equation (13), where

η (i)

represents the set of ith neighboring nodes and

X_{q_{i}, q_{j}}

is a Boolean variable, indicating whether the UAV moves from

q_{i}

to

q_{j}

.

\begin{matrix} min_{N, {\{q_{n}\}}_{n = 1}^{N}} \sum_{i \in N} \sum_{j \in η (i)} X_{q_{i}, q_{j}} α_{q_{j}} ∥ q_{j} - q_{i} ∥ \\ s . t . & C_{1} : \sum_{j \in η (i)} X_{q_{i}, q_{j}} = \sum_{j \in η (i)} X_{q_{j}, q_{i}}, \forall i \in N ∖ {s \cup f}, \\ C_{2} : \sum_{j \in η (i)} X_{q_{i}, q_{j}} \leq 1, \forall i \in N, \\ C_{3} : α_{q_{i}} \leq 1, \forall i \in N, \\ C_{4} : ∥ q_{j} - q_{i} ∥ \leq \sqrt{3} Δ_{d}, \forall i \in N, j \in η (i), \\ C_{5} : q_{1} = q_{s}; q_{N} = q_{f}, \\ C_{6} : 0 \leq x_{n} \leq D, \forall n \in N, \\ C_{7} : 0 \leq y_{n} \leq D, \forall n \in N, \\ C_{8} : h_{m i n} \leq h_{n} \leq h_{m a x}, \forall n \in N, \end{matrix}

(13)

Constraints

C_{1} - C_{2}

ensure that the UAV flight path is uninterrupted and only has a single path between adjacent points. Constraint

C_{3}

requires that the flight risk of any grid is no greater than 1. Constraint

C_{4}

specifies that the maximum distance between any two adjacent grid points should not exceed

\sqrt{3} Δ_{d}

. Constraint

C_{5}

specifies the start point and end point of the UAV’s trajectory. Constraints

C_{6} - C_{8}

mean that the flight coordinates cannot exceed the coordinate system limits.

However, the minimal risk distance model is a mixed-integer nonlinear programming (MINLP) problem, which is considerably challenging to solve directly due to its complexity. Moreover, considering the temporal variability of urban environments, the corresponding trajectory planning problem also is hard to solve and even impossible to obtain the optimal solutions with single-stage solving. Thus, it is very important to design an efficient hierarchical planning scheme from the perspective of engineering implementation. To address the above challenges, the joint optimization problem of flight risk and distance can be decomposed into two sub-problems based on the operational features of cellular-connected UAVs: global planning problem between GBSs and local planning problem within GBSs. And the corresponding MTTP method with lower computational complexity is proposed to solve the minimum risk distance model formulation (Equation (13)) in Section 4.

4. Description of the Proposed Method

In this section, we describe the MTTP method to tackle the above optimization problem, which consists of an HSA-based connectivity-aware global planning tier and a DRL-based collision-free local planning tier. Specifically, the global planning subproblem between GBSs is solved by the HSA-based connectivity-aware global planning tier, while the local planning subproblem within GBSs is solved by the DRL-based collision-free local planning tier. In the global planning tier, we first abstract the urban environment in the form of a directed graph based on the average flight risk of GBS’s signal coverage and voyage between GBSs, aiming to obtain the globally optimal path. Then, the HSA that integrates distance size and flight risk is proposed in Section 4.1, which can obtain the optimal path with high efficiency. In the local planning tier, the optimal 3D trajectory planning subproblem within GBS is solved in Section 4.2 by using the DRL algorithm to consider both static known obstacles and unexpected obstacles. In brief, the proposed MTTP method first uses the global planning tier to solve the optimal path of routed GBSs, and based on this, it utilizes the local planning tier to cyclically solve the 3D flight trajectories within each GBS until the UAV reaches the destination. The specific workflow is shown in Algorithm 1.

Algorithm 1: Workflow of the MTTP method.

4.1. HSA-Based Connectivity-Aware Global Planning

In the connectivity-aware global planning tier, we consider the GBSs’ signal coverage as the planning unit and integrate the distance and flight risk of each GBS, so that the inter-GBS global planning subproblem can be modeled in Equation (14), where

η_{G} (i)

denotes the set of ith GBS neighboring GBSs and

{\bar{α}}_{q_{j}}

is the average flight risk for the nth GBS signal range area.

\begin{matrix} min_{G, {\{q_{n}\}}_{n = 1}^{G}} \sum_{i \in G} \sum_{j \in η_{G} (i)} X_{q_{i}, q_{j}} {\bar{α}}_{q_{j}} ∥ q_{j} - q_{i} ∥ \\ s . t . & C_{1} : \sum_{j \in η (i)} X_{q_{i}, q_{j}} = \sum_{j \in η (i)} X_{q_{j}, q_{i}}, \forall i \in G, \\ C_{2} : \sum_{j \in η (i)} X_{q_{i}, q_{j}} \leq 1, \forall i \in G, \\ C_{3} : {\bar{α}}_{q_{i}} \leq 1, \forall i \in G, \\ C_{4} : 0 \leq x_{n} \leq D, \forall n \in N, \\ C_{5} : 0 \leq y_{n} \leq D, \forall n \in N, \\ C_{6} : h_{m i n} \leq h_{n} \leq h_{m a x}, \forall n \in N . \end{matrix}

(14)

In this case, the environment can be characterized as a directed graph

D = (M, E)

, where E is the GBS neighborhood relation. And each directed graph node represents the grid area of the corresponding signal coverage area of the GBS, while each graph edge represents the adjacency between GBSs. To address the task of finding the optimal global flight path efficiently, we propose employing an improved heuristic search algorithm (IHSA), which is suitable for performing the solution. The A* algorithm is a well-known HSA that enhances the efficiency of searching by utilizing heuristic information to guide decision-making while ensuring the superiority of the path. Its cost-heuristic function comprises the foregone cost and the estimated cost, expressed as follows

\begin{matrix} f (n) = g (n) + h (n) . \end{matrix}

(15)

In the equation, n denotes the current extended node,

g (n)

denotes the actual cost from the beginning node to the current node n, and

h (n)

is the estimation cost from node n to the target node. The heuristic function

h (n)

provides an estimate of the cost to reach the destination from node n. The total cost function

f (n)

combines the actual cost

g (n)

and the estimated cost

h (n)

, yielding the minimal cost path estimation from the beginning node to the target node while passing through node n. The A* algorithm uses this cost heuristic to efficiently explore the search space and find the optimal path in a search problem.

Our solution inherits inspiration from the A* algorithm. Unlike traditional algorithms. the IHSA utilizes a different approach. It incorporates the knowledge derived from the combined value of flight risk and distance size to control the search direction, instead of relying solely on the straight-line distance as the heuristic. By considering both the flight risk and the distance size, IHSA can make more informed decisions during the search process, leading to potentially better and safer paths for the UAV in its trajectory planning. Thus, the actual cost from

q_{n}

to

q_{m}

that is used by IHSA can be denoted as

\begin{matrix} f (q_{n}, q_{m}) = \sum_{i = n}^{m} {\bar{α}}_{q_{i + 1}} H (q_{i}, q_{i + 1}) + \sum_{i = m}^{G} {\bar{α}}_{q_{i + 1}} H (q_{i}, q_{i + 1}), \end{matrix}

(16)

where

H (q_{i}, q_{j})

is the voyage between two nodes. In detail, the input of IHSA is the directed graph

D = (M, E)

. We specialize the algorithm by considering a 2D grid map, where each location corresponds to a directed graph node, and each path between two nodes corresponds to a graph edge. Moreover, the starting cost of each grid point is calculated by Equation (16). The output of IHSA is a reconstructed path, which is the result of the back-pointer path reversal obtained by the algorithm, starting from the beginning to the end.

The HSA-based connectivity-aware global planning is solved by Algorithm 2, where

cost_between ()

and

get_neighbors ()

are two functions that calculate the actual cost and obtain the neighboring nodes of the current node, respectively. The complexity of this algorithm can be calculated as

O (N log (N))

.

Algorithm 2: Improved heuristic search algorithm (IHSA).

4.2. DRL-Based Collision-Free Local Planning

In the event of having access to a globally optimized flight path

{\{q_{n}\}}_{n = 1}^{G}

, our attention in the collision-free local planning tier shifts toward addressing the subproblem of 3D trajectory planning within GBSs. More explicitly, once the global optimum flight path is known, the local planning subproblem can be formulated as finding the optimal 3D trajectory within the GBSs. Thus, this subproblem can be mathematically modeled in Equation (17), where

N_{i}

represents the set of grid points that the UAV passes through in the ith GBS. However, the optimization subproblem outlined in Equation (17) is inherently non-convex, with multiple constraints that, together with the presence of static and unexpected obstacles within the urban environment, make it complex and challenging to solve. The difficulty in non-convex optimization problems mainly lies in the existence of numerous local optimal solutions, which make the discovery of a global optimal solution particularly elusive. Moreover, static and unexpected obstacles exacerbate the complexity of the problem as they impose additional constraints and in certain instances, render existing solutions unfeasible.

\begin{matrix} min_{N_{i}, {\{q_{n}\}}_{n = 1}^{N_{i}}} \sum_{i \in G} \sum_{j \in N_{i}} \sum_{k \in η (j)} X_{q_{j}, q_{k}} α_{q_{k}} ∥ q_{k} - q_{j} ∥ \\ s . t . & C_{1} : \sum_{k \in η (j)} X_{q_{j}, q_{k}} = \sum_{k \in η (j)} X_{q_{k}, q_{j}}, \forall j \in N_{i} ∖ {s \cup f}, \forall i \in G, \\ C_{2} : \sum_{k \in η (j)} X_{q_{j}, q_{k}} \leq 1, \forall j \in N_{i}, \forall i \in G, \\ C_{3} : α_{q_{i}} \leq 1, \forall i \in N, \\ C_{4} : ∥ q_{j} - q_{i} ∥ \leq \sqrt{3} Δ_{d}, \forall i \in N, j \in η (i), \\ C_{5} : q_{1} = q_{s}; q_{N} = q_{f}, \\ C_{6} : 0 \leq x_{n} \leq D, \forall n \in N, \\ C_{7} : 0 \leq y_{n} \leq D, \forall n \in N, \\ C_{8} : h_{m i n} \leq h_{n} \leq h_{m a x}, \forall n \in N . \end{matrix}

(17)

Fortunately, the Markov decision process (MDP) is suitable for modeling the UAV trajectory planning problem [38]. Within the MDP framework, the state transition relies only on the current state and the action taken, independent of past states or actions. Thus, each step of the UAV’s flight can be viewed as a decision-making process, with the current location and the target location of the UAV forming the state space. From this perspective, we present a DRL-based approach to address the optimal pathfinding problem in dynamic environments. Combining the strengths of DL and RL, DRL can handle high-dimensional continuous states, and action spaces, and can automatically extract essential features from raw inputs. By utilizing the cutting-edge D3QN, we first propose a D3QN-based algorithm for navigation. To address the limitation of D3QN algorithms becoming stuck in local optima when dealing with dynamic environments, we introduce a collaborative offline and online strategy subsequently.

The evaluation network and the target network are two deep neural networks (DNNs) used in the D3QN architecture. The primary role of the evaluation network is to produce the action value

Q (s_{t}, a_{t}; ω)

by taking the state

s_{t}

as input at a specific time slot t, where

ω

represents the weights of the evaluation network. On the other hand, the target network, with its network weights denoted as

ω^{-}

, provides the target value

max_{a t + 1} Q^{'} (s t + 1, a_{t + 1}; ω^{-})

. This target value is used to guide the learning process and improve stability. At every K step, the target network is updated periodically to correspond to the weights of the evaluation network. To enhance the exploration capabilities of D3QN, the action

a_{t}

is decided by the

ϵ

-greedy policy,

a_{t} = \{\begin{matrix} arg max_{a_{t}} Q (s_{t}, a_{t}; ω) & with probability of ϵ, \\ r a n d o m l y s e l e c t a n a c t i o n & otherwise . \end{matrix}

(18)

Furthermore, different from DQN, the network architecture of D3QN contains two independent streams, one for determining the state value function and the other for the state-dependent action advantage function. The outputs of these two streams are then aggregated to generate the Q-values for each action. This separation allows for a more precise estimation of the action values. Thus, The Q-value can be estimated by

\begin{matrix} Q (s, a; ω, α, β) = V (s; ω, α) + A (s, a; ω, β) - \frac{1}{| A |} \sum_{a^{'}} A (s, a^{'}; ω, β), \end{matrix}

(19)

where

α

and

β

represent the network parameters used to determine the state-value function

V (s; ω, α)

and the advantage function

A (s, a; ω, β)

, respectively. The

\frac{1}{| A |} \sum_{a^{'}} A (s, a^{'}; ω, β)

subtracts the average advantage of all possible actions from the advantage of the current action, thus assisting in the decomposition of the Q-function into state-value and advantage-value functions. By using the dueling network architecture, the algorithm can efficiently allocate its learning resources to focus on understanding the value of different states. This allows the agent to make more informed decisions and prioritize actions in states that have a meaningful impact on the overall task.

Moreover, the D3QN policy employs backpropagation (BP) and gradient descent algorithms for updating the evaluation network’s parameters. To solve the overestimation problem, the loss function of D3QN is changed to

\begin{matrix} L (ω) = E [{(r_{t} + γ Q^{'} (s_{t + 1}, arg max_{a_{t + 1}} Q (s_{t + 1}, a_{t + 1}; ω), ω^{-}) - Q (s_{t}, a_{t}; ω))}^{2}] . \end{matrix}

(20)

To better cope with static and unexpected obstacles in complex urban environments, we propose a collaborative offline and online strategy. In our proposed strategy, we incorporate two D3QN architectures, forming the comprehensive decision-making network framework, as shown in Figure 3. The long-term decision network, which processes static obstacle data and learns the corresponding flight strategy, is the first component of this dual structure. Secondly, the short-term decision network handles data related to unexpected obstacles, providing guidance for the UAV’s emergency collision avoidance. Within each discrete time step, the UAV identifies the presence of obstacles through its sensory apparatus. Upon detection of any obstacle information, the short-term policy network is activated, calculating the urgent obstacle information based on the current UAV position and consequently outputting the Q-value associated with the short-term policy decision network. Simultaneously, the long-term policy network generates the Q-value of the long-term decision network, informed by the end position, the current UAV location, and data on static structures. Subsequently, a comprehensive assessment of the Q-values from both decision networks is performed, and the decision yielding the maximum Q-value is selected as the recommended action for the current state. Finally, through continuous iterations, we generate the optimal collision-free flight path of UAVs within the coverage range of the GBS.

Algorithm 3: D3QN with the collaborative offline and online strategy.

In the proposed DRL-based method, a comprehensive description of the state space

s_{t}

, action space

a_{t}

, and reward function

r_{t}

at time t is provided as follows.

State space: In the DRL-based local planning model, state space $s_{t}$ consists of coordinates of the UAV’s current position $q_{n}$ , the destination $q_{f}$ , the set of relative static obstacle distances $R_{s}$ , and the set of relative unexpected obstacle distances $R_{d}$ . Thus, the state space is written as $s_{t} = \{q_{n}, q_{f}, R_{s}, R_{d}, \forall n \in N\}$ .
Action space: The action space $a_{t}$ comprises 26 directions, allowing the UAV to select any action to navigate to an adjacent grid point.
Reward function: The reward $r_{t + 1}$ is acquired upon executing action $a_{t}$ in state $s_{t}$ . The reward function of a long-term decision network is defined as follows:

$r_{t + 1} = \{\begin{matrix} r_{a}, if arrived at q_{f} \\ r_{c}, if collided with obstacles \\ θ_{1} ∥ q_{n + 1} - q_{n} ∥ + θ_{2} (h_{p r e} - h_{c u r}) \\ - θ_{3} \sum_{r_{s} (i) \in R_{s}} \frac{1}{r_{s} (i)}, otherwise \end{matrix}$

(21)

In addition, the short-term decision network reward function is set as follows:

$r_{t + 1} = \{\begin{matrix} r_{c}, if collided with obstacles \\ θ_{1} ∥ q_{n + 1} - q_{n} ∥ + θ_{2} (h_{p r e} - h_{c u r}) \\ - θ_{4} \sum_{r_{d} (i) \in R_{d}} \frac{1}{r_{d} (i)}, otherwise \end{matrix}$

(22)

where

h_{p r e} = ∥ q_{n} - q_{f} ∥

and

h_{c u r} = ∥ q_{n + 1} - q_{f} ∥

represent the previous and current relative distances between

q_{n}

and

q_{f}

, respectively. And these distances are utilized to expedite the UAV’s arrival at the destination, thereby aiding in faster navigation.

θ_{1}

,

θ_{2}

,

θ_{3}

, and

θ_{4}

are weighting factors.

r_{s} (i)

is the relative distance to the ith static obstacle, and

r_{d} (i)

is the relative distance to the ith unexpected obstacle.

The DRL-based collision-free local planning is solved by Algorithm 3. This algorithm chooses the optimal pathfinding decision in the given state space by maximizing the expected weighted cumulative reward. The complexity of this algorithm is

O (L_{e} N)

.

5. Numerical Experiments

In this section, numerical results are demonstrated to evaluate the performance of the proposed MTTP in complex urban environments, which is evaluated in the simulation from the perspectives of trajectory planning, flight successful rate, and convergence performance, respectively, to demonstrate its merits in different environments.

5.1. Experiment Settings

We conducted a simulation experiment on the proposed MTTP framework. We utilized Python 3.8 and PyTorch 1.1, operating on the AMD computing platform with 32GB of RAM and a 3.3 GHz AMD Ryzen processor. In this experiment, virtual and real urban experimental areas were established, respectively. Based on the experience of previous studies [39,40,41], we arrange 123 distributed buildings in a 500 m × 500 m × 25 m area, with the length and width range of 8 to 18 m and height range of 5 to 22 m to simulate a virtual city environment. We set up 25 GBSs that are evenly distributed in the considered area, and their signal coverage radius is denoted by 70 m. Similar to [18,38], the transmit power of each GBS is equal to

P_{m}

= 30 dBm, and the carrier frequency is 2 GHz. The outage SINR threshold

γ_{t h}

is 3 dB. The impact factor of flight risk is

μ = 0.5

. Moreover, the average flight risk of the environment is used to indicate the difficulty of path planning. We choose a total of three environments with different average flight risks to illustrate our results, which are equipped with 0, 10, and 17 TFRs in the same urban area. The lengths and widths of TFRs are 10 m and their heights are 25 m.

Additionally, we choose a 1400 m × 400 m × 150 m zone in Yuhang District, Hangzhou (

120.155 °

E,

30.274 °

N), as the real-world experiment area, which has a complex urban architecture of heights ranging from 5 to 150 m. Specifically, the 3D data of buildings and topography are extracted from OpenStreetMap [42]. Latitude and longitude coordinates obtained from online geolocation systems are converted into the Cartesian coordinate system, assuming a flat Earth model. Moreover, there are 9 TFRs with lengths and widths ranging from 10 to 110 m and their heights are 150 m in this area. We set 14 GBSs with a signal coverage radius of 100 m in the considered area. Other parameters are the same as in the virtual city.

For simulation parameters of UAV mobility, the minimum single movement distance is

Δ_{d}

= 1 m [43]. The allowable altitude range for UAVs is from 5 to 20 m in the virtual city, and in a realistic city, it is from 15 to 60 m [44]. The parameters in the reward function of the decision network were set to

r_{a} = 200

,

r_{c}

= −200,

θ_{1}

= −5,

θ_{2} = 20

,

θ_{3} = 0.5

, and

θ_{4} = 8

.

In the D3QN algorithm, we configure two identical DNNs. Each DNN consists of four fully connected hidden layers, each containing 16 neurons. Hidden layers use PReLU as the activation function, and update the DNN weights with the specified learning rate according to the AdamOptimizer method. In our approach, we set the initial coefficient of the

ϵ

-greedy exploration strategy to

0.7

. For the chosen learning rate for training, the DNN is

0.0004

, and the discount factor used for future rewards is defined as

0.99

. To manage memory efficiently, the replay memory size is

200, 000

, and we utilize a batch size of 128 for mini-batch training. Additionally, to improve the convergence speed of the algorithm, we set the update interval of the target network to every 10 episodes.

To validate the effectiveness of the MTTP, we use the communication-based trajectory planning scheme for cellular-connected UAV [38] as benchmarks. This scheme is a highly representative study, which focuses on the trajectory planning problem of UAVs based on communication perceptions in complex urban environments.

5.2. Results/Discussion

Figure 4 presents the main view and top view of trajectory planning in environments with average flight risks of 30%, 60%, and 90%. The blue line represents the trajectory obtained using the proposed MTTP method, while the green line corresponds to the trajectory generated by the communication-based trajectory planning algorithm [38]. In the 3D view results, both solutions enable the UAV to fly straight when encountering low-rise buildings and navigating around high-rise buildings. Moreover, both approaches effectively avoid risks when encountering sudden restricted areas. The top view results and Table 1 demonstrate that the MTTP achieves global optimization in trajectory selection. Specifically, the MTTP selects suitable flight areas during the global planning tier and realizes the shortest flight trajectory during the local planning tier. In contrast, the communication-based trajectory planning algorithm [38] struggles to guarantee global optimality, especially in complex flight environments, leading to increased collision risks. The proposed MTTP method consistently maintains optimal trajectory planning results and exhibits strong robustness even as the average flight risk of the environment increases. Additionally, as shown in Table 1, the MTTP reduces flight distance by at least 8.49% compared to the communication-based trajectory planning solution [38] in complex environments. The analysis of the trajectory planning results of the virtual urban environments indicates that during the MTPP’s overall planning, the connectivity-aware global planning algorithm based on HSA can minimize the flight risk by selectively choosing appropriate routes based on environmental complexity, thereby reducing the difficulty associated with subsequent local planning. Further, the collision-free local planning algorithm based on D3QN is capable of achieving optimal 3D flight paths through real-time obstacle avoidance.

Figure 5 shows the simulation results of the flight’s successful rate changing with the average flight risk of the environment. The average flight risk of the environment is 0%, indicating that there are no static obstacles and sudden obstacles in the environment. The average flight risk of the environment is 100%, indicating that the environment is full of obstacles and even cannot fly. The blue line represents the MTTP, and the green line represents the communication-based trajectory planning solution [38]. The results indicate that the flight success rate decreases as the average flight risk of the environment increases, and the blue line consistently outperforms the green line. These results demonstrate that the MTTP method, compared to the communication-based trajectory planning solution [38], selects more suitable flight areas and increases the flight’s success rate by up to 10%, thereby enhancing the safety and reliability of the UAV flight.

Figure 6 presents the average reward value of the proposed D3QN algorithm under various average flight risks. The results demonstrate that the reward value quickly converges and remains stable during the training process. Furthermore, lower average flight risks correspond to simpler UAV operating environments, resulting in higher reward values.

To validate the proposed MTTP method in realistic urban areas, we conducted tests in the Yuhang District of Hangzhou and compared the results with the communication-based trajectory planning solution [38]. As shown in Figure 7, the communication-based trajectory planning [38] failed to generate a complete path. In contrast, the proposed MTTP successfully generated a complete path under the same starting and ending point settings, with a total path length of 1930.62 m. These results highlight the superior performance of the proposed MTTP method compared to the communication-based trajectory planning [38] in real-world scale problem instances. This experiment validates the applicability of the MTTP method in highly complex real working environments.

6. Conclusions

This paper introduces the MTTP method to address the challenges of air–ground communication service assurance and collision avoidance in complex urban environments. By transforming the 3D trajectory optimization problem into a minimum risk distance problem and combining connection-aware global planning using HSA algorithms with collision-free local planning using DRL, the MTTP method effectively optimizes UAV flight trajectories, ensuring both reliable communication and safe navigation. Simulation results show the superiority of the MTTP method over the communication-based trajectory planning approach, with a significant reduction in flight distance of at least 8.49% and an impressive 10% increase in the mission’s success rate. Furthermore, real-world experiments validate the practical applicability of the MTTP method in highly complex operating environments. The experimental results above demonstrate that hierarchical progressive solutions composed of connection-aware global path planning based on HSA and collision-free local path planning based on DRL can effectively enhance the trajectory planning performances of UAVs in low-altitude urban areas using the proposed MTTP method. Global path planning can select suitable flight paths at the macroscopic level, reducing the solution complexity of subsequent local path planning. On the other hand, local trajectory planning acquires the optimal 3D flight path by using real-time environmental information to achieve the overall flight trajectory. To enhance the path planning effectiveness further, future work will mainly focus on incorporating the energy consumption of the UAV as an optimization variable into the overall problem. This enhancement aims to contribute to overall mission efficiency and endurance while ensuring reliable communication and collision avoidance. Moreover, we considered constant UAV speeds and various types of common buildings as obstacles in this study. To make the MTTP method even more versatile and adaptable, in the future, we may expand the current framework to dynamically adjust the flight speeds of UAVs in different operating areas to achieve further performance improvement. For instance, regions with high communication interruption probability could benefit from higher UAV speeds to minimize interruption times. Additionally, incorporating building point-of-interest (POI) indices and crowd distributions will make the trajectory planning process more aligned with realistic scenarios.

Author Contributions

Conceptualization, X.L. and T.Z.; methodology, X.L., T.Z. and W.X.; software, T.Z. and J.Z.; validation, X.L., T.Z., W.X. and C.F.; formal analysis, T.Z., W.X. and C.F.; investigation, X.L., T.Z. and W.X.; resources, X.L., T.Z., W.X. and C.F.; data curation, X.L., T.Z., W.X. and J.Z.; writing—original draft preparation, X.L., T.Z. and W.X.; writing—review and editing, X.L., T.Z., W.X., C.F. and T.L.; visualization, T.Z., C.F., T.L. and J.Z.; supervision, X.L, W.X., C.F., T.L. and J.Z.; project administration, X.L., T.Z. and W.X.; funding acquisition, X.L, W.X., C.F., T.L. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Zhejiang ‘JIANBING’ R&D project (No.2022C01055), the Postdoctoral Research Foundation of China and the Open Fund of the Hubei Key Laboratory of Intelligent Robots project (Wuhan Institute of Technology).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Li, D.; Qiang, Y.; Mott, J.H. Hazard Analysis of Large Cargo Delivery UAVs Under the Chinese Air Traffic Control System. In Proceedings of the 2021 Systems and Information Engineering Design Symposium (SIEDS), Virtual, 29–30 April 2021; pp. 1–6. [Google Scholar] [CrossRef]
Elloumi, M.; Dhaou, R.; Escrig, B.; Idoudi, H.; Saidane, L.A. Monitoring road traffic with a UAV-based system. In Proceedings of the 2018 IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain, 15–18 April 2018; pp. 1–6. [Google Scholar] [CrossRef]
He, B.; Huang, B.; Lin, Y.; Wu, L. Intelligent unmanned aerial vehicle (UAV) system for aircraft surface inspection. In Proceedings of the 2020 7th International Forum on Electrical Engineering and Automation (IFEEA), Hefei, China, 25–27 September 2020; pp. 316–321. [Google Scholar] [CrossRef]
Liu, W.; Wang, C.; Zang, Y.; Lai, S.H.; Weng, D.; Bian, X.; Lin, X.; Shen, X.; Li, J. Ground Camera Images and UAV 3D Model Registration for Outdoor Augmented Reality. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 1050–1051. [Google Scholar] [CrossRef]
Van der Bergh, B.; Chiumento, A.; Pollin, S. LTE in the sky: Trading off propagation benefits with interference costs for aerial nodes. IEEE Commun. Mag. 2016, 54, 44–50. [Google Scholar] [CrossRef]
Al-Hourani, A.; Gomez, K. Modeling cellular-to-UAV path-loss for suburban environments. IEEE Wirel. Commun. Lett. 2017, 7, 82–85. [Google Scholar] [CrossRef]
Zeng, Y.; Lyu, J.; Zhang, R. Cellular-connected UAV: Potential, challenges, and promising technologies. IEEE Wirel. Commun. 2018, 26, 120–127. [Google Scholar] [CrossRef]
Lin, X.; Yajnanarayana, V.; Muruganathan, S.D.; Gao, S.; Asplund, H.; Maattanen, H.L.; Bergstrom, M.; Euler, S.; Wang, Y.P.E. The sky is not the limit: LTE for unmanned aerial vehicles. IEEE Commun. Mag. 2018, 56, 204–210. [Google Scholar] [CrossRef]
Zeng, Y.; Zhang, R.; Lim, T.J. Wireless communications with unmanned aerial vehicles: Opportunities and challenges. IEEE Commun. Mag. 2016, 54, 36–42. [Google Scholar] [CrossRef]
Zeng, Y.; Wu, Q.; Zhang, R. Accessing from the sky: A tutorial on UAV communications for 5G and beyond. Proc. IEEE 2019, 107, 2327–2375. [Google Scholar] [CrossRef]
Zhang, S.; Zeng, Y.; Zhang, R. Cellular-enabled UAV communication: A connectivity-constrained trajectory optimization perspective. IEEE Trans. Commun. 2018, 67, 2580–2604. [Google Scholar] [CrossRef]
Zhang, S.; Zhang, R. Radio map-based 3D path planning for cellular-connected UAV. IEEE Trans. Wirel. Commun. 2020, 20, 1975–1989. [Google Scholar] [CrossRef]
Bulut, E.; Guevenc, I. Trajectory optimization for cellular-connected UAVs with disconnectivity constraint. In Proceedings of the 2018 IEEE International Conference on Communications Workshops (ICC Workshops), Kansas City, MO, USA, 20–24 May 2018; IEEE: New York, NY, USA, 2018; pp. 1–6. [Google Scholar]
Yang, H.; Zhang, J.; Song, S.; Lataief, K.B. Connectivity-aware UAV path planning with aerial coverage maps. In Proceedings of the 2019 IEEE Wireless Communications and Networking Conference (WCNC), Marrakesh, Morocco, 15–18 April 2019; IEEE: New York, NY, USA, 2019; pp. 1–6. [Google Scholar]
Khamidehi, B.; Sousa, E.S. Federated learning for cellular-connected UAVs: Radio mapping and path planning. In Proceedings of the GLOBECOM 2020-2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; IEEE: New York, NY, USA, 2020; pp. 1–6. [Google Scholar]
Yin, C.; Xiao, Z.; Cao, X.; Xi, X.; Yang, P.; Wu, D. Offline and online search: UAV multiobjective path planning under dynamic urban environment. IEEE Internet Things J. 2017, 5, 546–558. [Google Scholar] [CrossRef]
Wu, Y.; Low, K.H. An adaptive path replanning method for coordinated operations of drone in dynamic urban environments. IEEE Syst. J. 2020, 15, 4600–4611. [Google Scholar] [CrossRef]
Zeng, Y.; Xu, X.; Jin, S.; Zhang, R. Simultaneous navigation and radio mapping for cellular-connected UAV with deep reinforcement learning. IEEE Trans. Wirel. Commun. 2021, 20, 4205–4220. [Google Scholar] [CrossRef]
Khatib, O. Real-time obstacle avoidance for manipulators and mobile robots. Int. J. Robot. Res. 1986, 5, 90–98. [Google Scholar] [CrossRef]
Ulrich, I.; Borenstein, J. VFH/sup*: Local obstacle avoidance with look-ahead verification. In Proceedings of the 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No. 00CH37065), San Francisco, CA, USA, 24–28 April 2000; IEEE: New York, NY, USA, 2000; Volume 3, pp. 2505–2511. [Google Scholar]
Rao, N.S.; Stoltzfus, N.; Iyengar, S.S. A’retraction’method for learned navigation in unknown terrains for a circular robot. IEEE Trans. Robot. Autom. 1991, 7, 699–707. [Google Scholar] [CrossRef]
Minguez, J.; Montano, L. Extending collision avoidance methods to consider the vehicle shape, kinematics, and dynamics of a mobile robot. IEEE Trans. Robot. 2009, 25, 367–381. [Google Scholar] [CrossRef]
Fox, D.; Burgard, W.; Thrun, S. The dynamic window approach to collision avoidance. IEEE Robot. Autom. Mag. 1997, 4, 23–33. [Google Scholar] [CrossRef]
Wu, P.P.Y.; Campbell, D.; Merz, T. Multi-objective four-dimensional vehicle motion planning in large dynamic environments. IEEE Trans. Syst. Man Cybern. Part (Cybern.) 2010, 41, 621–634. [Google Scholar] [CrossRef]
Hernandez-Hernandez, L.; Tsourdos, A.; Shin, H.S.; Waldock, A. Multi-objective UAV routing. In Proceedings of the 2014 International Conference on Unmanned Aircraft Systems (ICUAS), Orlando, FL, USA, 27–30 May 2014; IEEE: New York, NY, USA, 2014; pp. 534–542. [Google Scholar]
Guglieri, G.; Lombardi, A.; Ristorto, G. Operation oriented path planning strategies for rpas. Am. J. Sci. Technol. 2015, 2, 1–8. [Google Scholar]
Liu, X.; Liu, Y.; Chen, Y. Machine learning empowered trajectory and passive beamforming design in UAV-RIS wireless networks. IEEE J. Sel. Areas Commun. 2020, 39, 2042–2055. [Google Scholar] [CrossRef]
Wan, K.; Gao, X.; Hu, Z.; Wu, G. Robust motion control for UAV in dynamic uncertain environments using deep reinforcement learning. Remote. Sens. 2020, 12, 640. [Google Scholar] [CrossRef]
Hu, Z.; Wan, K.; Gao, X.; Zhai, Y.; Wang, Q. Deep reinforcement learning approach with multiple experience pools for UAV’s autonomous motion planning in complex unknown environments. Sensors 2020, 20, 1890. [Google Scholar] [CrossRef]
Liu, X.; Liu, Y.; Chen, Y. Reinforcement learning in multiple-UAV networks: Deployment and movement design. IEEE Trans. Veh. Technol. 2019, 68, 8036–8049. [Google Scholar] [CrossRef]
Bayerlein, H.; Theile, M.; Caccamo, M.; Gesbert, D. UAV path planning for wireless data harvesting: A deep reinforcement learning approach. In Proceedings of the GLOBECOM 2020-2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; IEEE: New York, NY, USA, 2020; pp. 1–6. [Google Scholar]
Theile, M.; Bayerlein, H.; Nai, R.; Gesbert, D.; Caccamo, M. UAV path planning using global and local map information with deep reinforcement learning. In Proceedings of the 2021 20th International Conference on Advanced Robotics (ICAR), Ljubljana, Slovenia, 6–10 December 2021; IEEE: New York, NY, USA, 2021; pp. 539–546. [Google Scholar]
Xu, Z.; Wang, Q.; Kong, F.; Yu, H.; Gao, S.; Pan, D. Ga-DQN: A Gravity-aware DQN Based UAV Path Planning Algorithm. In Proceedings of the 2022 IEEE International Conference on Unmanned Systems (ICUS), Guangzhou, China, 28–30 October 2022; IEEE: New York, NY, USA, 2022; pp. 1215–1220. [Google Scholar]
Khamidehi, B.; Sousa, E.S. A double Q-learning approach for navigation of aerial vehicles with connectivity constraint. In Proceedings of the ICC 2020-2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; IEEE: New York, NY, USA, 2020; pp. 1–6. [Google Scholar]
Series, P. Propagation Data and Prediction Methods Required for the Design of Terrestrial Broadband Radio Access Systems Operating in a Frequency Range from 3 to 60 GHz. Recommendation ITU-R. 2013, pp. 1410–1415. Available online: https://scholar.google.com/scholar?as_q=Propagation+data+and+prediction+methods+required+for+the+design+of+terrestrial+broadband+radio+access+systems+operating+in+a+frequency+range+from+3+to+60+GHz&as_occt=title&hl=en&as_sdt=0%2C31 (accessed on 20 August 2023).
Xiang, H.; Tian, L. Development of a low-cost agricultural remote sensing system based on an autonomous unmanned aerial vehicle (UAV). Biosyst. Eng. 2011, 108, 174–190. [Google Scholar] [CrossRef]
3GPP. TR 36.777, Technical Specification Group Radio Access Network: Study on Enhanced LTE Support for Aerial Vehicles. Technical Report V15.0.0. 2017. Available online: https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3231 (accessed on 20 August 2023).
Xie, H.; Yang, D.; Xiao, L.; Lyu, J. Connectivity-aware 3D UAV path design with deep reinforcement learning. IEEE Trans. Veh. Technol. 2021, 70, 13022–13034. [Google Scholar] [CrossRef]
Delamer, J.A.; Watanabe, Y.; Chanel, C.P. Safe path planning for UAV urban operation under GNSS signal occlusion risk. Robot. Auton. Syst. 2021, 142, 103800. [Google Scholar] [CrossRef]
Yao, W.; Chen, Y.; Fu, J.; Qu, D.; Wu, C.; Liu, J.; Sun, G.; Xin, L. Evolutionary utility prediction matrix-based mission planning for unmanned aerial vehicles in complex urban environments. IEEE Trans. Intell. Veh. 2022, 8, 1068–1080. [Google Scholar] [CrossRef]
Wu, Y.; Low, K.H.; Pang, B.; Tan, Q. Swarm-based 4D path planning for drone operations in urban environments. IEEE Trans. Veh. Technol. 2021, 70, 7464–7479. [Google Scholar] [CrossRef]
Costa Fonte, C.; Fritz, S.; Olteanu-Raimond, A.M.; Antoniou, V.; Foody, G.; Mooney, P.; See, L. A review of OpenStreetMap data. In Mapping and the Citizen Sensor; Ubiquity Press Ltd.: London, UK, 2017; pp. 37–59. [Google Scholar]
Bertram, J.; Zambreno, J.; Wei, P. Efficient Unmanned Aerial Systems Navigation With Collision Avoidance in Dense Urban Environments. IEEE Trans. Intell. Transp. Syst. 2023, 24, 8163–8173. [Google Scholar] [CrossRef]
Ladosz, P.; Oh, H.; Zheng, G.; Chen, W.H. A hybrid approach of learning and model-based channel prediction for communication relay UAVs in dynamic urban environments. IEEE Robot. Autom. Lett. 2019, 4, 2370–2377. [Google Scholar] [CrossRef]

Figure 1. Scenario model of the cellular-connected UAV system, where GBS and TFR are the abbreviations of ground base stations and temporary flight restrictions, respectively.

Figure 2. Mobility model of the cellular-connected UAV, where available 3D grid points and the flight distance per step are demonstrated.

Figure 3. The framework of the decision-making network.

Figure 4. Illustration of the proposed MTTP under different average flight risk of virtual urban area.

Figure 5. Flight’s successful rate versus different average flight risks.

Figure 6. Reward versus different average flight risks.

Figure 7. Illustration of the proposed MTTP in a realistic urban area.

Table 1. Comparisons of flight distances between the MTTP method and communication-based approach.

Average Flight Risk	MTTP Method	Communication-Based Approach	Improvement
30%	859.74	1050.18	18.13%
60%	1018.44	1161.96	12.35%
90%	1101.24	1203.36	8.49%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Luo, X.; Zhang, T.; Xu, W.; Fang, C.; Lu, T.; Zhou, J. Multi-Tier 3D Trajectory Planning for Cellular-Connected UAVs in Complex Urban Environments. Symmetry 2023, 15, 1628. https://doi.org/10.3390/sym15091628

AMA Style

Luo X, Zhang T, Xu W, Fang C, Lu T, Zhou J. Multi-Tier 3D Trajectory Planning for Cellular-Connected UAVs in Complex Urban Environments. Symmetry. 2023; 15(9):1628. https://doi.org/10.3390/sym15091628

Chicago/Turabian Style

Luo, Xiling, Tianyi Zhang, Wenxiang Xu, Chao Fang, Tongwei Lu, and Jialiu Zhou. 2023. "Multi-Tier 3D Trajectory Planning for Cellular-Connected UAVs in Complex Urban Environments" Symmetry 15, no. 9: 1628. https://doi.org/10.3390/sym15091628

APA Style

Luo, X., Zhang, T., Xu, W., Fang, C., Lu, T., & Zhou, J. (2023). Multi-Tier 3D Trajectory Planning for Cellular-Connected UAVs in Complex Urban Environments. Symmetry, 15(9), 1628. https://doi.org/10.3390/sym15091628

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Multi-Tier 3D Trajectory Planning for Cellular-Connected UAVs in Complex Urban Environments

Abstract

1. Introduction

2. Related Work

2.1. Trajectory Planning Research

2.2. Trajectory Planning Algorithm

3. System Model and Problem Formulation

3.1. Scenario and UAV Mobility Model

3.2. Flight Risk Model

3.2.1. Communication Outage Probability

3.2.2. Environmental Complexity

3.3. Problem Formulation

4. Description of the Proposed Method

4.1. HSA-Based Connectivity-Aware Global Planning

4.2. DRL-Based Collision-Free Local Planning

5. Numerical Experiments

5.1. Experiment Settings

5.2. Results/Discussion

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI