# UAV Trajectory Optimization in a Post-Disaster Area Using Dual Energy-Aware Bandits

## Abstract


## 1. Introduction

#### 1.1. Prior Works and Motivations

#### 1.2. Contributions and Organization

- We consider a UAV that collects user data in a disaster-affected region as part of a wireless emergency communication network. Natural-disaster damage takes the ground BSs out of service, but ground UEs inside the UAV coverage area can still upload their data over the aerial link that the UAV provides. Under these assumptions, we formulate an online optimization problem that maximizes the uplink throughput of the UAV emergency wireless communication network by optimizing the UAV flight trajectory, subject to the limited energy available to both the UAV and the ground UEs in the post-disaster region.
- We recast the optimization problem as a constrained MAB problem in which the action is the flight direction, the reward is the uploaded data throughput, and the cost is the energy dissipated by both the UAV and the UEs.
- Numerical analysis shows that the proposed framework considerably increases long-term throughput at the price of a slight increase in UE energy consumption in the post-disaster area, and therefore achieves better energy efficiency than the benchmark UAV trajectory optimization methods.
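
To make the action/reward/cost mapping concrete, the following is a minimal sketch of a generic budget-constrained bandit loop. It is not the DEA-MAB algorithm of Section 3, whose details are not reproduced here. The function name `run_budgeted_bandit`, the four-direction arm set, the Gaussian feedback model, and the ratio index (optimistic reward divided by optimistic cost) are all illustrative assumptions.

```python
import math
import random


def run_budgeted_bandit(reward_means, cost_means, budget, seed=0):
    """Generic budgeted bandit (illustrative, not the paper's DEA-MAB).

    Each arm stands for a flight direction, the reward for normalized
    uploaded throughput, and the cost for normalized dissipated energy.
    """
    rng = random.Random(seed)
    n_arms = len(reward_means)
    pulls = [0] * n_arms
    reward_sum = [0.0] * n_arms
    cost_sum = [0.0] * n_arms
    total_reward, spent, t = 0.0, 0.0, 0

    def index(arm):
        # Optimistic reward (UCB) over optimistic cost (LCB, floored above 0).
        bonus = math.sqrt(2.0 * math.log(t) / pulls[arm])
        ucb_reward = reward_sum[arm] / pulls[arm] + bonus
        lcb_cost = max(cost_sum[arm] / pulls[arm] - bonus, 1e-3)
        return ucb_reward / lcb_cost

    while True:
        t += 1
        # Try every arm once, then follow the ratio index.
        arm = t - 1 if t <= n_arms else max(range(n_arms), key=index)
        # Simulated feedback (assumed model): clipped Gaussian reward and cost.
        reward = min(max(rng.gauss(reward_means[arm], 0.1), 0.0), 1.0)
        cost = min(max(rng.gauss(cost_means[arm], 0.05), 0.01), 1.0)
        if spent + cost > budget:
            break  # the energy budget cannot cover this action
        pulls[arm] += 1
        reward_sum[arm] += reward
        cost_sum[arm] += cost
        total_reward += reward
        spent += cost
    return pulls, total_reward, spent


# Hypothetical example: four flight directions with equal mean energy cost.
pulls, total_reward, spent = run_budgeted_bandit(
    reward_means=[0.2, 0.8, 0.5, 0.3],
    cost_means=[0.5, 0.5, 0.5, 0.5],
    budget=20.0,
)
print(pulls, round(total_reward, 2), round(spent, 2))
```

The loop stops as soon as the next action would exceed the budget, which mirrors the idea that the trajectory ends when the available energy runs out. A dual-energy formulation would track separate budgets for the UAV and the UEs rather than the single scalar used here.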

## 2. Network Architecture and Problem Formulation

#### 2.1. UAV Flying Model

#### 2.2. Wireless Communication Channel Model

#### 2.3. Data Transmission Model

#### 2.4. Energy Model

#### 2.5. Problem Formulation

## 3. Dual-Energy-Aware MAB-Based UAV Trajectory Optimization Approach

#### 3.1. General MAB Framework

#### 3.2. The DEA-MAB Approach

Algorithm 1: The proposed DEA-MAB algorithm.

#### 3.3. Complexity Analysis of the Proposed Approach

## 4. Simulation Results

- Post-disaster area spiral scanning (PASS) method: The UAV starts at the center of the post-disaster area and scans the whole area along a spiral path. The UAV antenna's radiation angle projects a circular footprint onto the ground, and this footprint sweeps the post-disaster area from the center out to the borders.
- Shortest flight path (SFP) method: The UAV starts at the center of the post-disaster area and selects the nearest UE. It flies toward this UE and hovers above it while the UE offloads its data. The UAV then searches for the next closest UE and flies toward it, repeating this step until the last UE has been served.
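
The SFP benchmark described above amounts to a greedy nearest-neighbor visiting order. The sketch below illustrates that reading under simplifying assumptions: positions are 2D ground coordinates in meters, distance is Euclidean, and hovering time and energy are ignored. The function name `shortest_flight_path` and the sample coordinates are hypothetical.

```python
import math


def shortest_flight_path(start, ue_positions):
    """Greedy nearest-neighbor visiting order (one reading of SFP).

    Returns the visiting order and the total horizontal flight distance.
    """
    remaining = list(ue_positions)
    order = []
    current = start
    total_distance = 0.0
    while remaining:
        # Pick the closest UE that has not been served yet.
        nearest = min(remaining, key=lambda p: math.dist(current, p))
        total_distance += math.dist(current, nearest)
        order.append(nearest)
        remaining.remove(nearest)
        current = nearest  # the UAV hovers here before moving on
    return order, total_distance


# Hypothetical layout: UAV starts at the center of a 500 m x 500 m area.
center = (250.0, 250.0)
ues = [(250.0, 260.0), (100.0, 250.0), (250.0, 300.0)]
order, total_distance = shortest_flight_path(center, ues)
print(order, round(total_distance, 2))
```

A greedy order like this is simple to compute but is not guaranteed to give the shortest overall tour, which is one reason it serves as a baseline rather than an optimum.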

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
BS | Base station |
UAV | Unmanned aerial vehicle |
ML | Machine learning |
RL | Reinforcement learning |
MAB | Multi-armed bandit |
DPG | Deterministic policy gradient |
MDP | Markov decision process |
CRN | Cognitive radio network |
UCB | Upper confidence bound |
TS | Thompson sampling |
RIS | Reconfigurable intelligent surface |
SUTOA | State-action-reward-state-action-based UAV-trajectory optimization algorithm |
QUTOA | Q-learning-based UAV-trajectory optimization algorithm |
UE | User equipment |
GPS | Global positioning system |
3GPP | 3rd Generation Partnership Project |
LOS | Line-of-sight |
NLOS | Non-line-of-sight |
SNR | Signal-to-noise ratio |
AWGN | Additive white Gaussian noise |
EXP3 | Exponential-weight algorithm for exploration and exploitation |
LCB | Lower confidence bound |
DEA | Dual-energy-aware |
PASS | Post-disaster area spiral scanning |
SFP | Shortest flight path |
PoI | Point of interest |
MILP | Mixed-integer linear programming |


Parameter | Value |
---|---|
Simulation area | 500 m × 500 m |
Number of UEs in the simulation area ($M$) | 20, 30, 40, 50 |
UAV flight speed ($\nu$) | 20 km/h |
UAV flight altitude ($H$) | 100 m |
UAV antenna radiation angle ($\phi$) | $\pi/8$ rad |
Carrier frequency ($f$) | 2.4 GHz |
Channel bandwidth ($B$) | 10 MHz |
Data transmission duration ($\tau$) | 1 s |
UE transmission power ($P_{m}^{\mathrm{Tx}}$) | 23 dBm |
AWGN spectral density ($\sigma_{0}$) | $-130$ dBm/Hz |
UAV battery capacity ($E_{0}$) | 20, 30, 40 Wh |
UAV flying power ($\Xi$) | 120 W |
UE battery capacity ($e_{0}$) | 1 Wh |
UE energy dissipation in idle mode ($e^{\mathrm{idle}}$) | 0.01 J |
Data rate feasibility region factor ($\delta$) | 0.6 |
Critical power feasibility region factor ($\epsilon$) | 0.5 |
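
A few back-of-the-envelope quantities follow directly from the parameters listed above. The sketch below computes them under simplifying assumptions that are not taken from the paper's energy model: the whole UAV battery is spent on flying power, and a UE spends only its radiated transmission power while sending. The variable and function names are illustrative.

```python
import math

# Values copied from the simulation parameter table.
FLYING_POWER_W = 120.0
SPEED_KMH = 20.0
BANDWIDTH_HZ = 10e6
NOISE_DENSITY_DBM_HZ = -130.0
UE_TX_POWER_DBM = 23.0
TAU_S = 1.0
UE_BATTERY_WH = 1.0


def dbm_to_watt(power_dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((power_dbm - 30.0) / 10.0)


def flight_budget(battery_wh):
    """Flight time (minutes) and range (km), assuming the battery only feeds flying power."""
    hours = battery_wh / FLYING_POWER_W
    return hours * 60.0, hours * SPEED_KMH


# Total noise power over the channel bandwidth, in dBm.
noise_power_dbm = NOISE_DENSITY_DBM_HZ + 10.0 * math.log10(BANDWIDTH_HZ)

# Radiated energy of one UE transmission of duration tau, in joules.
ue_tx_energy_j = dbm_to_watt(UE_TX_POWER_DBM) * TAU_S

# UE battery capacity expressed in joules (1 Wh = 3600 J).
ue_battery_j = UE_BATTERY_WH * 3600.0

for capacity_wh in (20.0, 30.0, 40.0):
    minutes, km = flight_budget(capacity_wh)
    print(f"{capacity_wh:.0f} Wh -> {minutes:.1f} min, {km:.2f} km")
print(f"noise power: {noise_power_dbm:.1f} dBm")
print(f"UE energy per transmission: {ue_tx_energy_j:.3f} J of {ue_battery_j:.0f} J")
```

Under these assumptions a 20 Wh battery gives about 10 minutes of flight (roughly 3.3 km at 20 km/h), which is short relative to a 500 m × 500 m area and helps explain why the flight direction has to be chosen with the energy budget in mind.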


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Amrallah, A.; Mohamed, E.M.; Tran, G.K.; Sakaguchi, K. UAV Trajectory Optimization in a Post-Disaster Area Using Dual Energy-Aware Bandits. *Sensors* **2023**, *23*, 1402. https://doi.org/10.3390/s23031402
