# Drone Deep Reinforcement Learning: A Review


## Abstract


## 1. Introduction

## 2. Overview of Deep Reinforcement Learning

Algorithm 1: Pseudocode for DQN-Based Training Method.
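
The pseudocode listing itself did not survive extraction. As an illustrative stand-in (not the paper's listing), the following minimal sketch shows the core DQN ingredients named in Algorithm 1: an $\epsilon$-greedy behaviour policy, an experience-replay buffer, and a periodically synchronized target network. The toy chain environment and the linear Q-function are assumptions chosen only to keep the example self-contained and runnable.

```python
import random
from collections import deque

import numpy as np

# Illustrative 5-state chain: action 1 moves right, action 0 moves left;
# reward 1 is given on reaching the terminal right-most state.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, float(s2 == GOAL), s2 == GOAL      # (next state, reward, done)

def one_hot(s):
    x = np.zeros(N_STATES)
    x[s] = 1.0
    return x

rng = np.random.default_rng(0)
random.seed(0)
W = rng.normal(scale=0.1, size=(N_STATES, N_ACTIONS))  # online Q "network" (linear)
W_target = W.copy()                                    # target network
replay = deque(maxlen=1000)                            # experience-replay buffer
gamma, alpha, eps = 0.9, 0.1, 0.3

for episode in range(300):
    s = 0
    for _ in range(30):
        # epsilon-greedy action selection from the online network
        if rng.random() < eps:
            a = int(rng.integers(N_ACTIONS))
        else:
            a = int(np.argmax(one_hot(s) @ W))
        s2, r, done = step(s, a)
        replay.append((s, a, r, s2, done))
        # sample a minibatch; TD update bootstraps on the frozen target network
        for bs, ba, br, bs2, bdone in random.sample(replay, min(len(replay), 8)):
            target = br if bdone else br + gamma * np.max(one_hot(bs2) @ W_target)
            td_err = target - (one_hot(bs) @ W)[ba]
            W[:, ba] += alpha * td_err * one_hot(bs)
        s = s2
        if done:
            break
    if episode % 10 == 0:
        W_target = W.copy()     # periodic hard sync of the target network

greedy = [int(np.argmax(one_hot(s) @ W)) for s in range(N_STATES)]
```

A full DQN replaces the linear Q-function with a deep network trained by gradient descent on the squared TD error; the replay-buffer and target-network logic above is unchanged.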

Algorithm 2: Pseudocode for Proximal Policy Optimization (PPO).
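
Since the listing was lost in extraction, here is a minimal runnable sketch of the PPO idea: collect a batch with the current policy, then take several ascent steps on the clipped surrogate objective, whose gradient vanishes wherever the importance ratio is clipped. The two-armed bandit task and all hyperparameters are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Illustrative two-armed bandit: action 1 yields reward 1, action 0 yields 0.
def reward(a):
    return float(a == 1)

theta = np.zeros(2)            # policy logits
clip_eps, lr = 0.2, 0.1

for _ in range(100):
    # (1) roll out a batch of one-step episodes with the current (old) policy
    pi_old = softmax(theta)
    actions = rng.choice(2, size=32, p=pi_old)
    rewards = np.array([reward(a) for a in actions])
    adv = rewards - rewards.mean()              # baseline-subtracted advantage
    # (2) several epochs of clipped-surrogate ascent on the same batch
    for _ in range(4):
        pi = softmax(theta)
        ratio = pi[actions] / pi_old[actions]   # importance ratio pi_new / pi_old
        # the clipped branch is flat, so its gradient contribution is zero
        active = ~(((ratio > 1 + clip_eps) & (adv > 0)) |
                   ((ratio < 1 - clip_eps) & (adv < 0)))
        grad = np.zeros(2)
        for a, A, r_t, ok in zip(actions, adv, ratio, active):
            if ok:
                dlogp = -pi                    # d log pi(a)/d theta = onehot(a) - pi
                dlogp[a] += 1.0
                grad += A * r_t * dlogp
        theta += lr * grad / len(actions)

pi_final = softmax(theta)
```

The clipping keeps each policy update close to the data-collecting policy, which is PPO's cheaper alternative to TRPO's explicit trust-region constraint.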

Algorithm 3: Pseudocode for Deep Deterministic Policy Gradient (DDPG).
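
In place of the lost listing, this sketch illustrates the DDPG structure: a deterministic actor explored with additive noise, a critic fit by TD regression from a replay buffer, and an actor update that ascends the critic's action-gradient. The one-step continuous-action task, the scalar actor, and the quadratic critic are assumptions made so the example stays tiny and runnable.

```python
import random
from collections import deque

import numpy as np

rng = np.random.default_rng(2)
random.seed(2)

# Illustrative one-step continuous-action task: reward -(a - 2)^2,
# so the optimal deterministic action is a = 2.
def reward(a):
    return -(a - 2.0) ** 2

w = 0.0                          # deterministic actor: mu() = w
q = np.zeros(3)                  # critic Q(a) = q0 + q1*a + q2*a^2
replay = deque(maxlen=1000)
lr_actor, lr_critic, noise_std = 0.05, 0.01, 0.5

for _ in range(2000):
    a = w + rng.normal(0.0, noise_std)   # exploration noise on the deterministic policy
    replay.append((a, reward(a)))
    # critic: TD regression on replayed transitions (episodes end after one step,
    # so the bootstrap target is just the observed reward)
    for ba, br in random.sample(replay, min(len(replay), 16)):
        feats = np.array([1.0, ba, ba * ba])
        td_err = br - feats @ q
        q += lr_critic * td_err * feats
    # actor: ascend the critic's action-gradient dQ/da evaluated at mu()
    w += lr_actor * (q[1] + 2.0 * q[2] * w)
```

A full DDPG adds target copies of both networks and conditions actor and critic on the state; the actor-ascends-the-critic pattern above is the algorithm's defining step.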

Algorithm 4: Pseudocode for a single Advantage Actor Critic Agent.
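
The listing for the single advantage actor-critic agent was also lost; as an illustrative stand-in, this sketch pairs a tabular softmax actor with a state-value critic, using the one-step TD error as the advantage estimate for both updates. The toy chain environment and learning rates are assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative 5-state chain: action 1 moves right, action 0 moves left;
# reward 1 is given on reaching the terminal right-most state.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, float(s2 == GOAL), s2 == GOAL      # (next state, reward, done)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

theta = np.zeros((N_STATES, N_ACTIONS))   # actor: per-state logits
V = np.zeros(N_STATES)                    # critic: state values
gamma, lr_pi, lr_v = 0.9, 0.2, 0.1

for _ in range(500):
    s = 0
    for _ in range(20):
        pi = softmax(theta[s])
        a = rng.choice(N_ACTIONS, p=pi)
        s2, r, done = step(s, a)
        # the one-step TD error doubles as the advantage estimate
        td_err = r + (0.0 if done else gamma * V[s2]) - V[s]
        V[s] += lr_v * td_err                    # critic update
        grad_logp = -pi                          # d log pi(a|s)/d theta[s]
        grad_logp[a] += 1.0
        theta[s] += lr_pi * td_err * grad_logp   # actor update
        s = s2
        if done:
            break

policy = [int(np.argmax(theta[s])) for s in range(N_STATES - 1)]
```

A3C simply runs many such agents asynchronously against shared actor and critic parameters; A2C synchronizes their updates into batches.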

## 3. Reinforcement Learning for UAV Path Planning

## 4. Reinforcement Learning for UAV Navigation

## 5. Reinforcement Learning for UAV Control

## 6. Analysis and Insights

## 7. Discussion

## 8. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Narayanan, R.G.L.; Ibe, O.C. Joint Network for Disaster Relief and Search and Rescue Network Operations. In Wireless Public Safety Networks 1; Elsevier: Amsterdam, The Netherlands, 2015; pp. 163–193.
- Suli, F. Electronic Enclosures, Housings and Packages; Woodhead Publishing: Cambridge, UK, 2018.
- Tsiatsis, V.; Karnouskos, S.; Holler, J.; Boyle, D.; Mulligan, C. Internet of Things: Technologies and Applications for a New Age of Intelligence; Academic Press: Cambridge, MA, USA, 2018.
- Castellano, G.; Castiello, C.; Mencar, C.; Vessio, G. Crowd detection in aerial images using spatial graphs and fully-convolutional neural networks. IEEE Access **2020**, 8, 64534–64544.
- Kim, I.; Shin, S.; Wu, J.; Kim, S.D.; Kim, C.G. Obstacle avoidance path planning for UAV using reinforcement learning under simulated environment. In Proceedings of the IASER 3rd International Conference on Electronics, Electrical Engineering, Computer Science, Sapporo, Japan, 13–17 May 2017; pp. 34–36.
- Custers, B. Drones here, there and everywhere: Introduction and overview. In The Future of Drone Use; Springer: Berlin/Heidelberg, Germany, 2016; pp. 3–20.
- Samanta, S.; Mukherjee, A.; Ashour, A.S.; Dey, N.; Tavares, J.M.R.S.; Abdessalem, K.W.; Taiar, R.; Azar, A.T.; Hassanien, A.E. Log Transform Based Optimal Image Enhancement Using Firefly Algorithm for Autonomous Mini Unmanned Aerial Vehicle: An Application of Aerial Photography. Int. J. Image Graph. **2018**, 18, 1850019.
- Najm, A.A.; Ibraheem, I.K.; Azar, A.T.; Humaidi, A.J. Genetic Optimization-Based Consensus Control of Multi-Agent 6-DoF UAV System. Sensors **2020**, 20, 3576.
- Azar, A.T.; Serrano, F.E.; Kamal, N.A.; Koubaa, A. Leader-Follower Control of Unmanned Aerial Vehicles with State Dependent Switching. In Proceedings of the International Conference on Advanced Intelligent Systems and Informatics; Springer: Berlin/Heidelberg, Germany, 2021; pp. 862–872.
- Azar, A.T.; Serrano, F.E.; Kamal, N.A.; Koubaa, A. Robust Kinematic Control of Unmanned Aerial Vehicles with Non-holonomic Constraints. In Proceedings of the International Conference on Advanced Intelligent Systems and Informatics; Springer: Berlin/Heidelberg, Germany, 2021; pp. 839–850.
- Azar, A.T.; Serrano, F.E.; Koubaa, A.; Kamal, N.A. Backstepping H-Infinity Control of Unmanned Aerial Vehicles with Time Varying Disturbances. In Proceedings of the 2020 First International Conference of Smart Systems and Emerging Technologies (SMARTTECH), Riyadh, Saudi Arabia, 15–17 March 2020; pp. 243–248.
- Dalamagkidis, K. Definitions and terminology. In Handbook of Unmanned Aerial Vehicles; Springer: Berlin/Heidelberg, Germany, 2015; pp. 43–55.
- Valavanis, K.P.; Vachtsevanos, G.J. Handbook of Unmanned Aerial Vehicles; Springer: Berlin/Heidelberg, Germany, 2015.
- Dalamagkidis, K.; Valavanis, K.P.; Piegl, L.A. On Integrating Unmanned Aircraft Systems into the National Airspace System: Issues, Challenges, Operational Restrictions, Certification, and Recommendations; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2011; Volume 54.
- Weibel, R.; Hansman, R.J. Safety considerations for operation of different classes of UAVs in the NAS. In Proceedings of the AIAA 4th Aviation Technology, Integration and Operations (ATIO) Forum, Chicago, IL, USA, 20–22 September 2004; p. 6244.
- Huang, H.M. Autonomy levels for unmanned systems (ALFUS) framework: Safety and application issues. In Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems, Washington, DC, USA, 11–13 October 2007; pp. 48–53.
- Clough, B.T. Unmanned aerial vehicles: Autonomous control challenges, a researcher’s perspective. In Cooperative Control and Optimization; Springer: Berlin/Heidelberg, Germany, 2002; pp. 35–52.
- Protti, M.; Barzan, R. UAV Autonomy-Which Level Is Desirable?-Which Level Is Acceptable? Alenia Aeronautica Viewpoint; Technical Report; Alenia Aeronautica SPA Torino: Torinese, Italy, 2007.
- Tüllmann, R.; Arbinger, C.; Baskcomb, S.; Berdermann, J.; Fiedler, H.; Klock, E.; Schildknecht, T. On the Implementation of a European Space Traffic Management System-I. A White Paper. 2017. Available online: https://www.semanticscholar.org/paper/On-the-Implementation-of-a-European-Space-Traffic-A-Tuellmann-Arbinger/6ac686ded55171072aa719c7c383e55c3cd059e2 (accessed on 5 January 2021).
- Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep reinforcement learning: A brief survey. IEEE Signal Process. Mag. **2017**, 34, 26–38.
- Poole, D.L.; Mackworth, A.K. Artificial Intelligence: Foundations of Computational Agents; Cambridge University Press: Cambridge, UK, 2010.
- François-Lavet, V.; Henderson, P.; Islam, R.; Bellemare, M.G.; Pineau, J. An introduction to deep reinforcement learning. Found. Trends Mach. Learn. **2018**, 11, 219–354.
- Zhang, H.; Yu, T. Taxonomy of Reinforcement Learning Algorithms. In Deep Reinforcement Learning; Springer: Berlin/Heidelberg, Germany, 2020; pp. 125–133.
- Huang, H.; Yang, Y.; Wang, H.; Ding, Z.; Sari, H.; Adachi, F. Deep reinforcement learning for UAV navigation through massive MIMO technique. IEEE Trans. Veh. Technol. **2019**.
- Cao, W.; Huang, X.; Shu, F. Unmanned rescue vehicle navigation with fused DQN algorithm. In Proceedings of the 2019 International Conference on Robotics, Intelligent Control and Artificial Intelligence, Shenyang, China, 8–11 August 2019; pp. 556–561.
- Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv **2015**, arXiv:1509.02971.
- Shin, S.Y.; Kang, Y.W.; Kim, Y.G. Automatic Drone Navigation in Realistic 3D Landscapes using Deep Reinforcement Learning. In Proceedings of the 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), Paris, France, 23–26 April 2019; pp. 1072–1077.
- Wang, Z.; Schaul, T.; Hessel, M.; Van Hasselt, H.; Lanctot, M.; De Freitas, N. Dueling network architectures for deep reinforcement learning. arXiv **2015**, arXiv:1511.06581.
- Bøhn, E.; Coates, E.M.; Moe, S.; Johansen, T.A. Deep reinforcement learning attitude control of fixed-wing UAVs using proximal policy optimization. In Proceedings of the 2019 International Conference on Unmanned Aircraft Systems (ICUAS), Atlanta, GA, USA, 11–14 June 2019; pp. 523–533.
- Guo, S.; Zhang, X.; Zheng, Y.; Du, Y. An autonomous path planning model for unmanned ships based on deep reinforcement learning. Sensors **2020**, 20, 426.
- Xu, D.; Hui, Z.; Liu, Y.; Chen, G. Morphing control of a new bionic morphing UAV with deep reinforcement learning. Aerosp. Sci. Technol. **2019**, 92, 232–243.
- Lee, S.; Bang, H. Automatic Gain Tuning Method of a Quad-Rotor Geometric Attitude Controller Using A3C. Int. J. Aeronaut. Space Sci. **2019**, 21, 469–478.
- Hardin, P.J.; Jensen, R.R. Small-scale unmanned aerial vehicles in environmental remote sensing: Challenges and opportunities. GIScience Remote Sens. **2011**, 48, 99–111.
- Pham, H.X.; La, H.M.; Feil-Seifer, D.; Nguyen, L.V. Autonomous UAV navigation using reinforcement learning. arXiv **2018**, arXiv:1801.05086.
- Lin, Y.; Wang, M.; Zhou, X.; Ding, G.; Mao, S. Dynamic spectrum interaction of UAV flight formation communication with priority: A deep reinforcement learning approach. IEEE Trans. Cogn. Commun. Netw. **2020**, 6, 892–903.
- Li, B.; Wu, Y. Path planning for UAV ground target tracking via deep reinforcement learning. IEEE Access **2020**, 8, 29064–29074.
- Koch, W.; Mancuso, R.; West, R.; Bestavros, A. Reinforcement learning for UAV attitude control. ACM Trans. Cyber Phys. Syst. **2019**, 3, 1–21.
- Dhargupta, S.; Ghosh, M.; Mirjalili, S.; Sarkar, R. Selective opposition based grey wolf optimization. Expert Syst. Appl. **2020**, 151, 113389.
- Qu, C.; Gai, W.; Zhong, M.; Zhang, J. A novel reinforcement learning based grey wolf optimizer algorithm for unmanned aerial vehicles (UAVs) path planning. Appl. Soft Comput. **2020**, 89, 106099.
- Jiang, S.; Jiang, C.; Jiang, W. Efficient structure from motion for large-scale UAV images: A review and a comparison of SfM tools. ISPRS J. Photogramm. Remote Sens. **2020**, 167, 230–251.
- He, L.; Aouf, N.; Whidborne, J.F.; Song, B. Deep reinforcement learning based local planner for UAV obstacle avoidance using demonstration data. arXiv **2020**, arXiv:2008.02521.
- Bayerlein, H.; Theile, M.; Caccamo, M.; Gesbert, D. UAV path planning for wireless data harvesting: A deep reinforcement learning approach. arXiv **2020**, arXiv:2007.00544.
- Hasheminasab, S.M.; Zhou, T.; Habib, A. GNSS/INS-Assisted structure from motion strategies for UAV-Based imagery over mechanized agricultural fields. Remote Sens. **2020**, 12, 351.
- Singla, A.; Padakandla, S.; Bhatnagar, S. Memory-based deep reinforcement learning for obstacle avoidance in UAV with limited environment knowledge. IEEE Trans. Intell. Transp. Syst. **2019**.
- Bouhamed, O.; Ghazzai, H.; Besbes, H.; Massoud, Y. Autonomous UAV navigation: A DDPG-based deep reinforcement learning approach. In Proceedings of the 2020 IEEE International Symposium on Circuits and Systems (ISCAS), Seville, Spain, 10–21 October 2020; pp. 1–5.
- Challita, U.; Saad, W.; Bettstetter, C. Interference management for cellular-connected UAVs: A deep reinforcement learning approach. IEEE Trans. Wirel. Commun. **2019**, 18, 2125–2140.
- Yan, C.; Xiang, X.; Wang, C. Towards Real-Time Path Planning through Deep Reinforcement Learning for a UAV in Dynamic Environments. J. Intell. Robot. Syst. **2019**, 98, 297–309.
- Wang, Y.M.; Peng, D.L. A simulation platform of multi-sensor multi-target track system based on STAGE. In Proceedings of the 2010 8th World Congress on Intelligent Control and Automation, Jinan, China, 6–9 July 2010; pp. 6975–6978.
- Shin, S.Y.; Kang, Y.W.; Kim, Y.G. Obstacle Avoidance Drone by Deep Reinforcement Learning and Its Racing with Human Pilot. Appl. Sci. **2019**, 9, 5571.
- Muñoz, G.; Barrado, C.; Çetin, E.; Salami, E. Deep Reinforcement Learning for Drone Delivery. Drones **2019**, 3, 72.
- Hii, M.S.Y.; Courtney, P.; Royall, P.G. An evaluation of the delivery of medicines using drones. Drones **2019**, 3, 52.
- Pham, H.X.; La, H.M.; Feil-Seifer, D.; Van Nguyen, L. Reinforcement learning for autonomous UAV navigation using function approximation. In Proceedings of the 2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Philadelphia, PA, USA, 6–8 August 2018; pp. 1–6.
- Kahn, G.; Villaflor, A.; Pong, V.; Abbeel, P.; Levine, S. Uncertainty-aware reinforcement learning for collision avoidance. arXiv **2017**, arXiv:1702.01182.
- Altawy, R.; Youssef, A.M. Security, privacy, and safety aspects of civilian drones: A survey. ACM Trans. Cyber Phys. Syst. **2016**, 1, 1–25.
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature **2015**, 518, 529–533.
- Bamburry, D. Drones: Designed for product delivery. Des. Manag. Rev. **2015**, 26, 40–48.
- Li, J.; Li, Y. Dynamic analysis and PID control for a quadrotor. In Proceedings of the 2011 IEEE International Conference on Mechatronics and Automation, Beijing, China, 7–10 August 2011; pp. 573–578.
- Liu, Y.; Nejat, G. Robotic urban search and rescue: A survey from the control perspective. J. Intell. Robot. Syst. **2013**, 72, 147–165.
- Tomic, T.; Schmid, K.; Lutz, P.; Domel, A.; Kassecker, M.; Mair, E.; Grixa, I.L.; Ruess, F.; Suppa, M.; Burschka, D. Toward a fully autonomous UAV: Research platform for indoor and outdoor urban search and rescue. IEEE Robot. Autom. Mag. **2012**, 19, 46–56.
- McClelland, J.L.; McNaughton, B.L.; O’Reilly, R.C. Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. **1995**, 102, 419.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 1998.
- Tai, L.; Liu, M. A robot exploration strategy based on Q-learning network. In Proceedings of the 2016 IEEE International Conference on Real-Time Computing and Robotics (RCAR), Angkor Wat, Cambodia, 6–10 June 2016; pp. 57–62.
- Xu, J.; Du, T.; Foshey, M.; Li, B.; Zhu, B.; Schulz, A.; Matusik, W. Learning to fly: Computational controller design for hybrid UAVs with reinforcement learning. ACM Trans. Graph. (TOG) **2019**, 38, 1–12.
- Wan, K.; Gao, X.; Hu, Z.; Wu, G. Robust Motion Control for UAV in Dynamic Uncertain Environments Using Deep Reinforcement Learning. Remote Sens. **2020**, 12, 640.
- Passalis, N.; Tefas, A. Continuous drone control using deep reinforcement learning for frontal view person shooting. Neural Comput. Appl. **2019**, 32, 4227–4238.
- Polvara, R.; Patacchiola, M.; Sharma, S.; Wan, J.; Manning, A.; Sutton, R.; Cangelosi, A. Toward end-to-end control for UAV autonomous landing via deep reinforcement learning. In Proceedings of the 2018 International Conference on Unmanned Aircraft Systems (ICUAS), Dallas, TX, USA, 12–15 June 2018; pp. 115–123.
- Tožička, J.; Szulyovszky, B.; de Chambrier, G.; Sarwal, V.; Wani, U.; Gribulis, M. Application of deep reinforcement learning to UAV fleet control. In Proceedings of the SAI Intelligent Systems Conference, London, UK, 5–6 September 2018; pp. 1169–1177.
- Liu, C.H.; Chen, Z.; Tang, J.; Xu, J.; Piao, C. Energy-efficient UAV control for effective and fair communication coverage: A deep reinforcement learning approach. IEEE J. Sel. Areas Commun. **2018**, 36, 2059–2070.
- Yang, J.; You, X.; Wu, G.; Hassan, M.M.; Almogren, A.; Guna, J. Application of reinforcement learning in UAV cluster task scheduling. Future Gener. Comput. Syst. **2019**, 95, 140–148.
- Koch, W. Flight controller synthesis via deep reinforcement learning. arXiv **2019**, arXiv:1909.06493.
- Song, Y.; Steinweg, M.; Kaufmann, E.; Scaramuzza, D. Autonomous Drone Racing with Deep Reinforcement Learning. arXiv **2021**, arXiv:2103.08624.
- Fujimoto, S.; Hoof, H.; Meger, D. Addressing function approximation error in actor-critic methods. In Proceedings of the International Conference on Machine Learning, Jinan, China, 26–28 May 2018; pp. 1587–1596.
- Wang, C.; Wang, J.; Zhang, X.; Zhang, X. Autonomous navigation of UAV in large-scale unknown complex environment with deep reinforcement learning. In Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Montreal, QC, Canada, 14–16 November 2017; pp. 858–862.
- Imanberdiyev, N.; Fu, C.; Kayacan, E.; Chen, I.M. Autonomous navigation of UAV by using real-time model-based reinforcement learning. In Proceedings of the 2016 14th International Conference on Control, Automation, Robotics and Vision (ICARCV), Phuket, Thailand, 13–15 November 2016; pp. 1–6.
- Bou-Ammar, H.; Voos, H.; Ertel, W. Controller design for quadrotor UAVs using reinforcement learning. In Proceedings of the 2010 IEEE International Conference on Control Applications, Yokohama, Japan, 8–10 September 2010; pp. 2130–2135.
- Duvall, T.; Green, A.; Langstaff, M.; Miele, K. Air-Mobility Solutions: What They’ll Need to Take off; Technical Report; McKinsey: New York, NY, USA, 2019.

**Figure 2.** Taxonomy of Reinforcement Learning Algorithms. DP (Dynamic Programming), TD (Temporal Difference), MC (Monte Carlo), I2A (Imagination-Augmented Agent), DQN (Deep Q-Network), TRPO (Trust Region Policy Optimization), ACKTR (Actor Critic using Kronecker-Factored Trust Region), AC (Actor-Critic), A2C (Advantage Actor Critic), A3C (Asynchronous Advantage Actor Critic), DDPG (Deep Deterministic Policy Gradient), TD3 (Twin Delayed DDPG), SAC (Soft Actor-Critic).

**Figure 7.** Step response of the best trained RL agents compared to PID. The target angular velocity ${\Omega}^{*}=[2.20,-5.14,-1.81]$ rad/s is shown by the dashed black line [70].

**Figure 8.** Average reward for the hover, land, random-waypoint, and target-following tasks over 5000 iterations.

| Algorithm | Agent Type | Policy | Policy Type | MC or TD | Action Space | State Space |
|---|---|---|---|---|---|---|
| State-action-reward-state-action (SARSA), SARSA(λ) | Value-based | On-policy | Pseudo-deterministic ($\epsilon$-greedy) | TD | Discrete only | Discrete only |
| Deep Q Network (DQN), Double DQN, Noisy DQN, Prioritized Replay DQN, Dueling DQN, Categorical DQN, Distributed DQN (C51) | Value-based | Off-policy | Pseudo-deterministic ($\epsilon$-greedy) | TD | Discrete only | Discrete or Continuous |
| Normalized Advantage Functions (NAF), i.e., Continuous DQN | Value-based | Off-policy | Pseudo-deterministic | TD | Continuous | Continuous |
| REINFORCE (vanilla policy gradient) | Policy-based | On-policy | Stochastic | MC | Discrete or Continuous | Discrete or Continuous |
| Policy Gradient | Policy-based | On-policy | Stochastic | | Discrete or Continuous | Discrete or Continuous |
| TRPO | Actor-critic | On-policy | Stochastic | | Discrete or Continuous | Discrete or Continuous |
| PPO | Actor-critic | On-policy | Stochastic | | Discrete or Continuous | Discrete or Continuous |
| A2C/A3C | Actor-critic | On-policy | Stochastic | TD | Discrete or Continuous | Discrete or Continuous |
| DDPG | Actor-critic | Off-policy | Deterministic | TD | Continuous | Discrete or Continuous |
| TD3 | Actor-critic | Off-policy | Deterministic | TD | Continuous | Discrete or Continuous |
| SAC | Actor-critic | Off-policy | Stochastic | TD | Continuous | Discrete or Continuous |
| ACER | Actor-critic | Off-policy | Stochastic | TD | Discrete | Discrete or Continuous |
| ACKTR | Actor-critic | On-policy | Stochastic | | Discrete or Continuous | Discrete or Continuous |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Azar, A.T.; Koubaa, A.; Ali Mohamed, N.; Ibrahim, H.A.; Ibrahim, Z.F.; Kazim, M.; Ammar, A.; Benjdira, B.; Khamis, A.M.; Hameed, I.A.;
et al. Drone Deep Reinforcement Learning: A Review. *Electronics* **2021**, *10*, 999.
https://doi.org/10.3390/electronics10090999
