Uplink Throughput Maximization in UAV-Aided Mobile Networks: A DQN-Based Trajectory Planning Method
Abstract
1. Introduction
- We formulate an optimization problem to maximize the uplink throughput by optimizing the UAV’s trajectory, under the constraints of the UAV’s available energy and the QoS requirements of GMUs. To efficiently characterize GMU’s mobility, the Gauss–Markov mobility model (GM) is employed;
- To solve the formulated problem, we propose a DQN-based UAV trajectory optimization method in which the reward function is designed based on the GMU offload gain and the energy consumption penalty, and the -greedy method is employed to balance the exploration and exploitation;
- Simulation results show that the proposed DQN-based method outperforms traditional Q-Learning-based one in terms of convergence and network throughput, and the larger battery capacity the UAV has, the higher uplink throughput that can be achieved. In addition, compared with other specific trajectories, the DQN-based method can increase the total offload data amount of the system by more than about 19%.
2. Related Work
3. System Model
3.1. User Mobile Model
3.2. UAV Energy Consumption Model
3.3. Problem Formulation
4. The DQN-Based Method for UAV Trajectory Planning
4.1. Reinforcement Learning and DQN
4.2. The DQN-Based Method for UAV Trajectory Planning
4.3. Action Space
4.4. State
4.5. Reward
4.6. Algorithm Framework
| Algorithm 1: The DQN-based trajectory planning algorithm. | 
| Input: Markov decision process , replay storage M, number of cycles N, explore probability , deep learning networks Q, number of steps to update the target network , and learning rate . Output: The optimal strategy . for to Ndo According to , choose any action in action space with the probability , and choose with probability . Execute action and observe reward from the environment to obtain the next state . Store from the environmental transfer process into replay storage M. Sample from a small batch of individual samples in replay storage M. Calculate the target value in each conversion process . Updates parameters of the Q-network. Execute gradient descent algorithm. Updates to the target network: after steps, updates. end return The optimal strategy ; | 
5. Numerical Results and Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Mignardi, S.; Marini, R.; Verdone, R.; Buratti, C. On the Performance of a UAV-Aided Wireless Network Based on NB-IoT. Drones 2021, 5, 94. [Google Scholar] [CrossRef]
- Liu, L.; Xiong, K.; Cao, J.; Lu, Y.; Fan, P.; Letaief, K.B. Average AoI minimization in UAV-assisted data collection with RF wireless power transfer: A deep reinforcement learning scheme. IEEE Internet Things J. 2022, 9, 5216–5228. [Google Scholar] [CrossRef]
- Liu, Y.; Xiong, K.; Ni, Q.; Fan, P.; Letaief, K.B. UAV-assisted wireless powered cooperative mobile edge computing: Joint offloading, CPU control, and trajectory optimization. IEEE Internet Things J. 2020, 7, 2777–2790. [Google Scholar] [CrossRef]
- Wu, Q.; Zeng, Y.; Zhang, R. Joint trajectory and communication design for multi-UAV enabled wireless networks. IEEE Trans. Wireless Commun. 2018, 17, 2109–2121. [Google Scholar] [CrossRef]
- Zhang, S.; Zeng, Y.; Zhang, R. Cellular-enabled UAV communication: A connectivity-constrained trajectory optimization perspective. IEEE Trans. Commun. 2019, 67, 2580–2604. [Google Scholar] [CrossRef]
- AlJubayrin, S.; Al-Wesabi, F.N.; Alsolai, H.; Duhayyim, M.A.; Nour, M.K.; Khan, W.U.; Mahmood, A.; Rabie, K.; Shongwe, T. Energy Efficient Transmission Design for NOMA Backscatter-Aided UAV Networks with Imperfect CSI. Drones 2022, 6, 190. [Google Scholar] [CrossRef]
- Xiong, K.; Liu, Y.; Zhang, L.; Gao, B.; Cao, J.; Fan, P.; Letaief, K.B. Joint optimization of trajectory, task offloading and CPU control in UAV-assisted wireless powered fog computing networks. IEEE Trans. Green Commun. Netw. 2022, 6, 1833–1845. [Google Scholar] [CrossRef]
- Mao, C.; Liu, J.; Xie, L. Multi-UAV Aided Data Collection for Age Minimization in Wireless Sensor Networks. In Proceedings of the 2020 International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China, 21–23 October 2020; pp. 80–85. [Google Scholar]
- Hou, M.-C.; Deng, D.-J.; Wu, C.-L. Optimum aerial base station deployment for UAV networks: A reinforcement learning approach. In Proceedings of the IEEE Globecom Workshops (GC Workshops), Waikoloa, HI, USA, 9–13 December 2019; pp. 1–6. [Google Scholar]
- Saxena, V.; Jalden, J.; Klessig, H. Optimal UAV base station trajectories using flow-level models for reinforcement learning. IEEE Trans. Cogn. Commun. Netw. 2019, 5, 1101–1112. [Google Scholar] [CrossRef]
- Liu, Y.; Xiong, K.; Lu, Y.; Ni, Q.; Fan, P.; Letaief, K.B. UAV-aided wireless power transfer and data collection in Rician fading. IEEE J. Select. Areas Commun. 2021, 39, 3097–3113. [Google Scholar] [CrossRef]
- Qin, Z.; Zhang, X.; Zhang, X.; Lu, B.; Liu, Z.; Guo, L. The UAV Trajectory Optimization for Data Collection from Time-Constrained IoT Devices: A Hierarchical Deep Q-Network Approach. Appl. Sci. 2022, 12, 2546. [Google Scholar] [CrossRef]
- Zhang, T.; Lei, J.; Liu, Y.; Feng, C.; Nallanathan, A. Trajectory optimization for UAV emergency communication with limited user equipment energy: A safe-DQN approach. IEEE Trans. Green Commun. Netw. 2021, 5, 1236–1247. [Google Scholar] [CrossRef]
- Lee, W.; Jeon, Y.; Kim, T.; Kim, Y.-I. Deep Reinforcement Learning for UAV Trajectory Design Considering Mobile Ground Users. Sensors 2021, 21, 8239. [Google Scholar] [CrossRef] [PubMed]
- Batabyal, S.; Bhaumik, P. Mobility models, traces and impact of mobility on opportunistic routing algorithms: A survey. IEEE Commun. Surv. Tutor. 2015, 17, 1679–1707. [Google Scholar] [CrossRef]
- Zeng, Y.; Xu, J.; Zhang, R. Energy Minimization for Wireless Communication With Rotary-Wing UAV. IEEE Trans. Wireless Commun. 2019, 18, 2329–2345. [Google Scholar] [CrossRef]
- Kiumarsi, B.; Vamvoudakis, K.G.; Modares, H.; Lewis, F.L. Optimal and autonomous control using reinforcement learning: A survey. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 2042–2062. [Google Scholar] [CrossRef]
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529. [Google Scholar] [CrossRef] [PubMed]
- Dong, Y.; Hossain, M.J.; Cheng, J. Performance of wireless powered amplify and forward relaying over nakagami-m fading channels with nonlinear energy harvester. IEEE Commun. Lett. 2016, 20, 672–675. [Google Scholar] [CrossRef]











| Abbreviation | Meaning | Page | 
|---|---|---|
| Aerial base station | 1 | |
| Deep Q-network | 2 | |
| Experience relay | 7 | |
| Fixed perceptual access points | 3 | |
| Ground mobile user | 1 | |
| Line of Sight | 1 | |
| Quality of service | 6 | |
| Unmanned aerial vehicle | 1 | 
| Methods | Optimization Objectives | 
|---|---|
| Block coordinate descent and successive convex optimization method | Maximize the minimum data rate of all ground users [4]. | 
| Graph theory and convex optimization | Minimize the UAV’s mission completion time [5]; Minimize the the sensor nodes’ maximal age of information subject to the limited energy capacity [8]. | 
| Sub-gradient method | Minimize the power consumption of a UAV system while ensuring the minimum data rate of IoT [6]. | 
| Particle swarm optimization and genetics method | Minimize the transmit power required by the UAV subject to the users’ minimum data rate [7]. | 
| DQN-based method | Minimize the average data buffer length [9]; Maximize the residual battery level of the system [10]; Minimize the delay of the network [11]; Maximize the quality of service based on the freshness of data [12]; Maximize the uplink throughput [13]; Maximize the mean opinion score for users [14]. | 
| Parameter | Meaning | Value | 
|---|---|---|
| Flight height of the UAV | 100 m | |
| Flight velocity of UAV | 20 m/s | |
| Blade profile power | 79.9 W | |
| Induced power | 88.6 W | |
| Tip speed of the rotor blade of the UAV | 120 m/s | |
| The mean rotor induced velocity of the UAV | 9 m/s | |
| Fuselage drag ratio | 0.48 | |
| Density of air | 1.225 kg/m | |
| s | Rotor solidity | 0.002 | 
| A | Rotor disc area | 0.503 m | 
| Path loss for a distance of 1 m | −50 dB | |
| User transmission power | 0.1 W | |
| Noise | −130 dB | 
| Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. | 
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lu, Y.; Xiong, G.; Zhang, X.; Zhang, Z.; Jia, T.; Xiong, K. Uplink Throughput Maximization in UAV-Aided Mobile Networks: A DQN-Based Trajectory Planning Method. Drones 2022, 6, 378. https://doi.org/10.3390/drones6120378
Lu Y, Xiong G, Zhang X, Zhang Z, Jia T, Xiong K. Uplink Throughput Maximization in UAV-Aided Mobile Networks: A DQN-Based Trajectory Planning Method. Drones. 2022; 6(12):378. https://doi.org/10.3390/drones6120378
Chicago/Turabian StyleLu, Yuping, Ge Xiong, Xiang Zhang, Zhifei Zhang, Tingyu Jia, and Ke Xiong. 2022. "Uplink Throughput Maximization in UAV-Aided Mobile Networks: A DQN-Based Trajectory Planning Method" Drones 6, no. 12: 378. https://doi.org/10.3390/drones6120378
APA StyleLu, Y., Xiong, G., Zhang, X., Zhang, Z., Jia, T., & Xiong, K. (2022). Uplink Throughput Maximization in UAV-Aided Mobile Networks: A DQN-Based Trajectory Planning Method. Drones, 6(12), 378. https://doi.org/10.3390/drones6120378
 
        





 
       