# OCTUNE: Optimal Control Tuning Using Real-Time Data with Algorithm and Experimental Results


## Abstract


## 1. Introduction

- An online, model-free optimal auto-tuning algorithm for generic LTI controllers, called OCTUNE, is developed and demonstrated using realistic simulations of a quadrotor system.
- A convergence proof of the OCTUNE algorithm is derived in order to guarantee safe control learning/tuning.
- We provide our implementation of OCTUNE as an open-source software package to facilitate the use and adaptation of the presented algorithm for different applications. The links to the open-source software are provided in the Supplementary Materials section.

## 2. Problem Statement

## 3. Control Tuning Algorithm

#### Optimization Using Backpropagation

**Algorithm 1.** Pseudocode of the OCTUNE algorithm.

## 4. Convergence Analysis

**Theorem 1.**

**Proof.**

## 5. Validation: Quadrotor Tuning

#### 5.1. Simulation Setup

- Gazebo simulator: An open-source robot simulator [29] that accurately and efficiently simulates several types of robots in complex indoor and outdoor environments, with a choice of several robust physics engines. It also integrates closely with the robot operating system (ROS), which facilitates software development and integration. The robot model simulated in this work is an actual quadrotor UAV called Iris; see Figure 5. The quadrotor model has several plugins to simulate the onboard sensors (inertial measurement unit, GPS, and magnetometer) and the propulsion system. The model also captures the mechanical structure of the drone, including its mass and inertial characteristics.
- PX4 autopilot: This is the autopilot firmware that interfaces with Gazebo to receive the simulated sensor readings, perform control and operations, and send motor commands to the motor plugins of the Gazebo quadrotor model. The PX4 autopilot firmware implements the PID control loops tuned using the OCTUNE algorithm. The autopilot firmware in simulation (called software in the loop, SITL) is the same as the one on actual autopilot hardware, except that the actual sensors and motors are replaced with simulated ones.
- MAVROS: A software package that interfaces between the PX4 autopilot and the robot operating system (ROS) [30]. Interfacing PX4 with ROS streamlines software development and integration, and the software used in the simulation can be deployed on actual hardware with almost no modifications. MAVROS communicates the required signals (target, controller output, and actual) and the PID controller gains between the OCTUNE application and the PX4 autopilot.
- OCTUNE: This is the implementation of the OCTUNE algorithm as a ROS software package (node in ROS terminology) for real-time tuning. The OCTUNE node receives the quadrotor signals (target, actual, and controller output) and the PID gains from the MAVROS node in real time. After the OCTUNE node computes the updated PID gains from the signals and the current gains, the new gains are sent to the PX4 autopilot via the MAVROS node.

#### 5.2. Controller

#### 5.3. Algorithm Implementation and State Machine

- IDLE state: In the IDLE state, the tuning application waits for the operator to send a start signal. Upon receiving the start signal, the application transitions to the next state, the Initial Gain State.
- Initial Gain State: In this state, the initial (current) PID gains are requested from the autopilot. If there are no failures in receiving the initial gains, the application transitions to the next state, the Get Data State. Otherwise, it returns to the IDLE state.
- Get Data State: In this state, the data required for the tuning process, such as the target, actual, and control output signals, are stored in buffers in real time over a predefined time period or number of samples. Once sufficient data samples are received, they are post-processed to align the data samples according to their time stamps and up-sampled to reduce the high-frequency noise in the acquired signals. If data post-processing is successful, the application transitions to the next state, the Optimization state. Otherwise, the tuning process is stopped, and the application transitions to the IDLE state.
- Optimization state: In this state, an update step of the OCTUNE algorithm, Equation (16), is performed using the data collected and prepared in the Get Data State. The optimal learning rate ${\alpha}^{*}$ in Equation (23) is also computed in this state. If the update step is completed successfully, the application transitions back to the Get Data State to prepare a new set of signals for a new update iteration. If a termination condition is reached, such as the maximum number of optimization iterations or the maximum optimization time, the state machine is terminated, and the application transitions to the IDLE state to be ready for a new tuning cycle.
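The transitions described in this list can be sketched as a small transition function. This is a minimal illustration, not the OCTUNE package's actual code: the names `State` and `next_state` and the flags `start`, `ok`, and `terminate` are hypothetical. It assumes that IDLE hands over to the Initial Gain State first (that state returns to IDLE on failure), and that a failed update step also returns to IDLE, which the text does not state explicitly.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    INITIAL_GAIN = auto()
    GET_DATA = auto()
    OPTIMIZATION = auto()


def next_state(state, *, start=False, ok=True, terminate=False):
    """Return the state that follows `state` in the tuning state machine.

    start:     the operator sent the start signal (only used in IDLE)
    ok:        the work of the current state succeeded
    terminate: a termination condition (max iterations or max time) was reached
    """
    if state is State.IDLE:
        # Wait until the operator starts the tuning process.
        return State.INITIAL_GAIN if start else State.IDLE
    if state is State.INITIAL_GAIN:
        # Gains received -> collect data; otherwise go back to IDLE.
        return State.GET_DATA if ok else State.IDLE
    if state is State.GET_DATA:
        # Data buffered and post-processed -> run one update step.
        return State.OPTIMIZATION if ok else State.IDLE
    if state is State.OPTIMIZATION:
        # Assumption: a failed update is treated like a termination condition.
        if terminate or not ok:
            return State.IDLE
        return State.GET_DATA
    raise ValueError(f"unknown state: {state}")


# One successful tuning iteration followed by a termination condition.
trace = [State.IDLE]
trace.append(next_state(trace[-1], start=True))
trace.append(next_state(trace[-1]))
trace.append(next_state(trace[-1]))
trace.append(next_state(trace[-1]))
trace.append(next_state(trace[-1]))
trace.append(next_state(trace[-1], terminate=True))
print(" -> ".join(s.name for s in trace))
```

Keeping the transition logic in a pure function separates it from the ROS callbacks that would set the flags, so the transitions can be checked without a running simulator.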

#### 5.4. Simulation Results with a Static Learning Rate, $\alpha $

1. The quadrotor was commanded to take off in position stabilization mode and hover at 2 m above the ground. The quadrotor was initially stable.
2. The proportional gain of the pitch rate PID control loop was increased from 0.2 to 0.6 in order to introduce high-frequency oscillations.
3. The OCTUNE algorithm was started to tune the PID gains.
4. At the end of the tuning process, the tuning performance was plotted, as shown in Figure 7.
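A toy problem can illustrate why a fixed learning rate may leave the response worse than before tuning. The sketch below is not the OCTUNE cost function or update law. It assumes a hypothetical one-dimensional quadratic cost $V(\theta)=\frac{1}{2}c\,\theta^{2}$ with an illustrative curvature $c=2500$, and compares plain gradient descent with the fixed rate $\alpha=0.001$ against the step $1/c$, which is the exact minimizing step for this toy cost only.

```python
def gradient_descent(curvature, theta0, alpha, steps):
    """Run fixed-step gradient descent on V(theta) = 0.5 * curvature * theta**2.

    Returns the cost before the first step and after every step.
    """
    theta = theta0
    costs = [0.5 * curvature * theta ** 2]
    for _ in range(steps):
        grad = curvature * theta      # dV/dtheta for the toy quadratic
        theta -= alpha * grad         # gradient-descent update
        costs.append(0.5 * curvature * theta ** 2)
    return costs


curv = 2500.0  # hypothetical curvature, chosen so that alpha * curv > 2

fixed_costs = gradient_descent(curv, theta0=1.0, alpha=0.001, steps=5)
exact_costs = gradient_descent(curv, theta0=1.0, alpha=1.0 / curv, steps=5)

print("fixed alpha = 0.001:", [round(v, 1) for v in fixed_costs])
print("alpha = 1/curvature:", [round(v, 1) for v in exact_costs])
```

With the fixed rate, each update multiplies $\theta$ by $1-\alpha c=-1.5$, so the cost grows at every iteration; with the step matched to the curvature, the cost drops to (numerically) zero after the first update. The same step size can therefore be safe for one cost landscape and destabilizing for another, which is the motivation for adapting the learning rate during tuning.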

#### 5.5. Simulation Results with an Optimal Learning Rate, ${\alpha}^{*}$

#### 5.6. Hardware Experiments

#### 5.6.1. Experiment 1

1. The drone is started on the ground with disarmed motors. The PID gains of the roll/pitch speed control loops are left at their default values ($P=0.15$, $I=0.2$, $D=0.003$).
2. The pilot flies the quadcopter to a hover position.
3. The OCTUNE process is started.
4. The pilot performs some maneuvers with the quadcopter in order to excite the system.
5. The OCTUNE process is stopped automatically after the indicated maximum optimization time, 120 s, is reached, and the logs and plots are saved.

#### 5.6.2. Experiment 2

1. Initially, the drone is on the ground, and the motors are disarmed. The PID gains of the roll/pitch speed control loops are left at their default values ($P=0.15$, $I=0.2$, $D=0.003$).
2. The pilot flies the quadcopter to a hover position.
3. The P gains of the roll/pitch speed control loops are set to high values (from $0.15$ to $0.6$) to introduce high-frequency oscillations.
4. The OCTUNE process is started during the flight.
5. The pilot tries to keep the quadcopter in a hover position while tuning is running.
6. After the quadcopter stabilizes, the pilot performs some maneuvers with the quadcopter in order to excite the system and make sure the system is tuned well.
7. The OCTUNE process is stopped automatically after the maximum optimization time indicated in the table below is reached, and the logs and plots are saved.

## 6. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
OCTUNE | Optimal control tuning algorithm |
PID | Proportional, integral, and derivative controller |
UAV | Unmanned aerial vehicle |
ROS | Robot operating system |

## References

1. Garriga, J.L.; Soroush, M. Model predictive control tuning methods: A review. Ind. Eng. Chem. Res. **2010**, 49, 3505–3515.
2. Borase, R.P.; Maghade, D.; Sondkar, S.; Pawar, S. A review of PID control, tuning methods and applications. Int. J. Dyn. Control **2021**, 9, 818–827.
3. Meier, L.; Honegger, D.; Pollefeys, M. PX4: A node-based multithreaded open source robotics framework for deeply embedded platforms. Proc. IEEE Int. Conf. Robot. Autom. **2015**, 2015, 6235–6240.
4. ArduPilot Open Source Autopilot System. Available online: https://ardupilot.org/ (accessed on 15 March 2022).
5. Lindquist, A. On feedback control of linear stochastic systems. SIAM J. Control **1973**, 11, 323–343.
6. Whittle, P. Risk-sensitive linear/quadratic/Gaussian control. Adv. Appl. Probab. **1981**, 13, 764–777.
7. Bansal, A.; Sharma, V. Design and analysis of robust H-infinity controller. Control. Theory Inform. **2013**, 3, 7–14.
8. Åström, K.J. Theory and applications of adaptive control—A survey. Automatica **1983**, 19, 471–486.
9. Tao, G. Adaptive Control Design and Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2003; Volume 37.
10. Fliess, M.; Join, C. Model-free control. Int. J. Control **2013**, 86, 2228–2252.
11. Xu, D.; Jiang, B.; Shi, P. A novel model-free adaptive control design for multivariable industrial processes. IEEE Trans. Ind. Electron. **2014**, 61, 6391–6398.
12. Hou, Z.; Jin, S. Model Free Adaptive Control; CRC Press: Boca Raton, FL, USA, 2013.
13. Kang, J.; Meng, W.; Abraham, A.; Liu, H. An adaptive PID neural network for complex nonlinear system control. Neurocomputing **2014**, 135, 79–85.
14. Qiu, J.; Ma, M.; Wang, T.; Gao, H. Gradient descent-based adaptive learning control for autonomous underwater vehicles with unknown uncertainties. IEEE Trans. Neural Netw. Learn. Syst. **2021**, 32, 5266–5273.
15. Lyu, F.; Xu, X.; Zha, X. An adaptive gradient descent attitude estimation algorithm based on a fuzzy system for UUVs. Ocean. Eng. **2022**, 266, 113025.
16. Yu, C.C. Autotuning of PID Controllers: A Relay Feedback Approach; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006.
17. Uren, K.; van Schoor, G. Genetic Algorithm based PID Tuning for Optimal Power Control of a Three-shaft Brayton Cycle based Power Conversion Unit. IFAC Proc. Vol. **2012**, 45, 685–690.
18. Maddi, D.; Sheta, A.; Davineni, D.; Al-Hiary, H. Optimization of PID Controller Gain Using Evolutionary Algorithm and Swarm Intelligence. In Proceedings of the 2019 10th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 11–13 June 2019; pp. 199–204.
19. Narendra, K.S.; Parthasarathy, K. Neural networks and dynamical systems. Int. J. Approx. Reason. **1992**, 6, 109–131.
20. Forgione, M.; Piga, D. dynoNet: A neural network architecture for learning dynamical systems. Int. J. Adapt. Control. Signal Process. **2021**, 35, 612–626.
21. Peng, J.; Dubay, R. Identification and adaptive neural network control of a DC motor system with dead-zone characteristics. ISA Trans. **2011**, 50, 588–598.
22. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: tensorflow.org (accessed on 30 October 2022).
23. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: La Jolla, CA, USA, 2019; pp. 8024–8035.
24. Yongquan, Y.; Ying, H.; Bi, Z. A PID neural network controller. In Proceedings of the International Joint Conference on Neural Networks, Istanbul, Turkey, 26–29 June 2003; Volume 3, pp. 1933–1938.
25. Patel, R.; Kumar, V. Multilayer neuro PID controller based on back propagation algorithm. Procedia Comput. Sci. **2015**, 54, 207–214.
26. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Available online: http://www.deeplearningbook.org (accessed on 30 October 2022).
27. Khalil, H. Nonlinear Systems; Pearson Education, Prentice Hall: Hoboken, NJ, USA, 2002.
28. PX4 Control Architecture. Available online: http://docs.px4.io/master/en/flight_stack/controller_diagrams.html (accessed on 7 April 2022).
29. Koenig, N.; Howard, A. Design and Use Paradigms for Gazebo, An Open-Source Multi-Robot Simulator. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, 28 September–2 October 2004; pp. 2149–2154.
30. Quigley, M.; Conley, K.; Gerkey, B.; Faust, J.; Foote, T.; Leibs, J.; Wheeler, R.; Ng, A. ROS: An open-source Robot Operating System. ICRA Workshop Open Source Softw. **2009**, 3, 5.
31. Åström, K.J.; Wittenmark, B. Computer-Controlled Systems: Theory and Design, 2nd ed.; Prentice-Hall, Inc.: Hoboken, NJ, USA, 1990.

**Figure 1.** A feedback system with an unknown discrete-time plant, $P\left(z\right)$, and a discrete-time LTI controller, $C\left(z\right)$.

**Figure 2.** A feedback system with controller $C\left(z\right)$ coefficients updated by the OCTUNE algorithm. The OCTUNE algorithm receives the reference, actual, and controller output signals and performs update operations to update the controller parameters.

**Figure 3.** Forward and backward operations are used to compute the partial derivatives. Solid arrows represent forward propagation, and backpropagation is represented by dashed arrows.

**Figure 4.** Overview of the software architecture used in the simulations. Bold and underlined text represents software packages that are explained in Section 5.1.

**Figure 7.** The pitch-rate tuning process during hovering. A fixed learning rate, $\alpha =0.001$, was used. The quadrotor started with an oscillatory angular pitch response and ended with a worse response after tuning due to the use of a non-optimal fixed learning rate. (**a**) signals before tuning, (**b**) signals after tuning, (**c**) pitch rate PID gains, (**d**) performance error $V\left(E\right)$ over iterations.

**Figure 8.** The tuning process of the roll-rate PID controller during hovering. The quadrotor starts with an oscillating behavior due to poorly tuned PID gains. Eventually, the angular rate loops are stabilized after the real-time tuning process. (**a**) signals before tuning, (**b**) signals after tuning, (**c**) performance error $V\left(E\right)$ over iterations, (**d**) learning rate $\alpha $ over tuning iterations, (**e**) PID gains.

**Figure 9.** The tuning process of the pitch-rate PID controller during hovering. The quadrotor starts with an oscillating behavior due to poorly tuned PID gains. Eventually, the angular rate loops are stabilized after the real-time tuning process. (**a**) signals before tuning, (**b**) signals after tuning, (**c**) performance error $V\left(E\right)$ over iterations, (**d**) learning rate $\alpha $ over tuning iterations, (**e**) PID gains.

**Figure 11.** The results of the tuning process of the pitch-rate PID controller in Experiment 1. (**a**) signals before tuning, (**b**) signals after tuning, (**c**) performance error $V\left(E\right)$ over iterations, (**d**) learning rate $\alpha $ over tuning iterations, (**e**) PID gains.

**Figure 12.** The results of the tuning process of the roll-rate PID controller in Experiment 1. (**a**) signals before tuning, (**b**) signals after tuning, (**c**) performance error $V\left(E\right)$ over iterations, (**d**) learning rate $\alpha $ over tuning iterations, (**e**) PID gains.

**Figure 13.** The results of the tuning process of the pitch-rate PID controller in Experiment 2. (**a**) signals before tuning, (**b**) signals after tuning, (**c**) performance error $V\left(E\right)$ over iterations, (**d**) learning rate $\alpha $ over tuning iterations, (**e**) PID gains.

**Figure 14.** The results of the tuning process of the roll-rate PID controller in Experiment 2. (**a**) signals before tuning, (**b**) signals after tuning, (**c**) performance error $V\left(E\right)$ over iterations, (**d**) learning rate $\alpha $ over tuning iterations, (**e**) PID gains.

**Table 1.** The mean squared error (MSE) for the simulation results with the optimal learning rate, ${\alpha}^{*}$.

Experiment | MSE before Tuning | MSE after Tuning |
---|---|---|
Roll rate tuning | 0.59 | 0.03 |
Pitch rate tuning | 1.21 | 0.01 |

Experiment | MSE before Tuning | MSE after Tuning |
---|---|---|
Roll rate tuning | 0.17 | 0.01 |
Pitch rate tuning | 0.16 | 0.007 |
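To put the MSE values in the two tables above on a common scale, the relative reduction can be computed directly from the reported numbers. The sketch below is only illustrative arithmetic on those reported values. The helper names `mse` and `reduction_percent` are hypothetical, and `mse` assumes the error is taken between the target and actual rate signals, sample by sample.

```python
def mse(target, actual):
    """Mean squared error between two equally long sample sequences."""
    if len(target) != len(actual):
        raise ValueError("target and actual must have the same length")
    return sum((t - a) ** 2 for t, a in zip(target, actual)) / len(target)


def reduction_percent(before, after):
    """Relative reduction of the MSE, in percent of the value before tuning."""
    return 100.0 * (before - after) / before


# (MSE before tuning, MSE after tuning), copied from the two tables above.
first_table = {"roll rate": (0.59, 0.03), "pitch rate": (1.21, 0.01)}
second_table = {"roll rate": (0.17, 0.01), "pitch rate": (0.16, 0.007)}

for name, table in (("first table", first_table), ("second table", second_table)):
    for loop, (before, after) in table.items():
        print(f"{name}, {loop}: {reduction_percent(before, after):.1f}% lower MSE")
```

For the reported values, this gives reductions of roughly 94.9% (roll) and 99.2% (pitch) for Table 1, and roughly 94.1% (roll) and 95.6% (pitch) for the second table.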


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Abdelkader, M.; Mabrok, M.; Koubaa, A.
OCTUNE: Optimal Control Tuning Using Real-Time Data with Algorithm and Experimental Results. *Sensors* **2022**, *22*, 9240.
https://doi.org/10.3390/s22239240
