Article

Reinforcement Learning-Based PD Controller Gains Prediction for Quadrotor UAVs

by Serhat Sönmez 1,*,†, Luca Montecchio 2,†, Simone Martini 3,†, Matthew J. Rutherford 4, Alessandro Rizzo 2, Margareta Stefanovic 5 and Kimon P. Valavanis 5
1 Department of Electrical & Electronics Engineering, Istanbul Medeniyet University, 34700 Istanbul, Turkey
2 Department of Electronics & Telecommunications, Politecnico di Torino, 10129 Torino, Italy
3 Aerospace Engineering & Engineering Mechanics Department, University of Cincinnati, Cincinnati, OH 45221, USA
4 Department of Computer Sciences, University of Denver, Denver, CO 80208, USA
5 Department of Electrical & Computer Engineering, University of Denver, Denver, CO 80210, USA
* Author to whom correspondence should be addressed.
† Co-first authors; these authors contributed equally to this work.
Drones 2025, 9(8), 581; https://doi.org/10.3390/drones9080581 (registering DOI)
Submission received: 8 June 2025 / Revised: 6 August 2025 / Accepted: 11 August 2025 / Published: 16 August 2025

Abstract

This paper presents a reinforcement learning (RL)-based methodology for the online fine-tuning of PD controller gains, with the goal of bridging the gap between simulation-trained controllers and real-world quadrotor applications. As a first step toward real-world implementation, the proposed approach applies a Deep Deterministic Policy Gradient (DDPG) algorithm—an off-policy actor–critic method—to adjust the gains of a quadrotor attitude PD controller during flight. The RL agent was initially trained offline in a simulated environment, using MATLAB/Simulink 2024a and the UAV Toolbox Support Package for PX4 Autopilots v1.14.0. The trained controller was then validated through both simulation and experimental flight tests. Comparative performance analyses were conducted between the hand-tuned and RL-tuned controllers. Our results demonstrate that the RL-based tuning method successfully adapts the controller gains in real time, leading to improved attitude tracking and reduced steady-state error. This study constitutes the first stage of a broader research effort investigating RL-based PID, LQR, MRAC, and Koopman-integrated RL-based PID controllers for real-time quadrotor control.
Keywords: reinforcement learning; multirotor UAVs; PD controller
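
To make the idea in the abstract concrete, the following is a minimal single-axis sketch of a PD attitude loop whose gains are adjusted online by a policy. It is an illustration under stated assumptions, not the authors' implementation: the names `PDGains`, `adjust_gains`, `pd_torque`, `simulate`, and `placeholder_policy` are hypothetical, the bounded multiplicative gain adjustment is an assumed action parameterization, the rigid-body model and its numeric values are made up, and the placeholder policy stands in for a trained DDPG actor, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class PDGains:
    kp: float  # proportional gain on attitude error
    kd: float  # derivative gain on angular rate


def adjust_gains(base: PDGains, action: Tuple[float, float],
                 max_frac: float = 0.2) -> PDGains:
    """Scale the baseline gains by a per-gain action clipped to [-1, 1].

    Assumption: the action is a bounded multiplicative correction; the
    action design used in the paper is not given in the abstract.
    """
    a_p = max(-1.0, min(1.0, action[0]))
    a_d = max(-1.0, min(1.0, action[1]))
    return PDGains(base.kp * (1.0 + max_frac * a_p),
                   base.kd * (1.0 + max_frac * a_d))


def pd_torque(gains: PDGains, angle_ref: float, angle: float, rate: float) -> float:
    """PD law: proportional on angle error, derivative on measured rate."""
    return gains.kp * (angle_ref - angle) - gains.kd * rate


def placeholder_policy(err: float, rate: float) -> Tuple[float, float]:
    """Stand-in for a trained actor network; always leaves the gains unchanged."""
    return (0.0, 0.0)


def simulate(policy: Callable[[float, float], Tuple[float, float]],
             base: PDGains, inertia: float = 0.02, dt: float = 0.002,
             steps: int = 2500, angle_ref: float = 0.2) -> float:
    """Simulate one rotational axis (inertia * angle'' = torque).

    Returns the final angle in radians. All numeric values are illustrative.
    """
    angle, rate = 0.0, 0.0
    for _ in range(steps):
        # The policy observes the tracking error and rate, then nudges the gains.
        gains = adjust_gains(base, policy(angle_ref - angle, rate))
        torque = pd_torque(gains, angle_ref, angle, rate)
        # Semi-implicit Euler integration of the rigid-body dynamics.
        rate += (torque / inertia) * dt
        angle += rate * dt
    return angle


if __name__ == "__main__":
    final_angle = simulate(placeholder_policy, PDGains(kp=2.0, kd=0.3))
    print(f"final angle: {final_angle:.4f} rad")
```

In a setup like the one the abstract describes, `placeholder_policy` would be replaced by the trained actor, so the gains change at every control step while the PD structure itself stays fixed.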
