Addendum published on 10 October 2019, see Algorithms 2019, 12(10), 212.
Open Access Article

Learning Output Reference Model Tracking for Higher-Order Nonlinear Systems with Unknown Dynamics

Department of Automation and Applied Informatics, Politehnica University of Timisoara, 2 Bd. V. Parvan, 300223 Timisoara, Romania
* Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in the 27th Mediterranean Conference on Control and Automation (MED 2019).
Algorithms 2019, 12(6), 121; https://doi.org/10.3390/a12060121
Received: 1 May 2019 / Revised: 7 June 2019 / Accepted: 9 June 2019 / Published: 12 June 2019
(This article belongs to the Special Issue Algorithms for PID Controller 2019)
This work proposes a solution to the output reference model (ORM) tracking control problem based on approximate dynamic programming. General nonlinear systems are included in a control system (CS) and subjected to state feedback. Selecting a linear ORM achieves indirect feedback linearization of the CS, leading to favorable linear CS behavior. The Value Iteration (VI) algorithm learns a nonlinear state-feedback controller in a model-free manner, without relying on the process dynamics. Across linear and nonlinear parameterizations, a reliable approximate VI implementation in continuous state-action spaces depends on several key factors: the problem dimension, the exploration of the state-action space, the size of the state-transition dataset, and a suitable selection of the function approximators. Herein, we find that, given a transition sample dataset and a general linear parameterization of the Q-function, the ORM tracking performance obtained with an approximate VI scheme can reach the performance level of a more general implementation using neural networks (NNs). Although the NN-based implementation takes more time to learn due to its higher complexity (more parameters), it is less sensitive to the exploration settings, the number of transition samples, and the selected hyper-parameters; hence, it is recommended as the de facto practical implementation. Contributions of this work include the following: a guarantee of VI convergence under general function approximators; a case study on a low-order linear system that is then generalized to the more complex ORM tracking validation on a real-world nonlinear multivariable aerodynamic process; comparisons with an offline deep deterministic policy gradient solution; and implementation details and further discussion of the obtained results.
Keywords: approximate dynamic programming; reinforcement learning; data-driven control; model-free control; reference trajectory tracking; output reference model; multivariable control; aerodynamic rotor system; neural networks; learning systems
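
To make the idea of approximate VI with a linearly parameterized Q-function concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy scalar linear process with a quadratic regulation cost rather than the ORM tracking setup and aerodynamic process studied in the paper. The process parameters a and b, the input weight rho, the discount factor gamma, the dataset size, and all function names are hypothetical choices made for illustration. The learner only sees transition samples (x, u, x_next, cost) and never uses a or b directly.

```python
import numpy as np

def features(x, u):
    """Quadratic basis: Q(x, u) = theta @ [x^2, x*u, u^2] (linear in theta)."""
    return np.stack([x * x, x * u, u * u], axis=-1)

def min_q(theta, x):
    """Closed-form minimum over u of the quadratic Q at state(s) x."""
    t_xx, t_xu, t_uu = theta
    if t_uu <= 0.0:
        # Degenerate case; in this sketch it occurs only for the all-zero
        # initial guess, where Q is identically zero.
        return np.zeros_like(x)
    return (t_xx - t_xu ** 2 / (4.0 * t_uu)) * x * x

def fitted_value_iteration(x, u, x_next, cost, gamma, iters=200):
    """Batch approximate VI: repeat a least-squares fit to Bellman targets."""
    phi = features(x, u)
    theta = np.zeros(3)
    for _ in range(iters):
        targets = cost + gamma * min_q(theta, x_next)
        theta, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return theta

# Hypothetical toy process and cost (assumptions, not taken from the paper).
a, b, rho, gamma = 0.9, 0.5, 0.1, 0.9

# Transition dataset collected with uniform random exploration.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 500)
u = rng.uniform(-1.0, 1.0, 500)
x_next = a * x + b * u           # used only to generate data
cost = x ** 2 + rho * u ** 2     # stage cost to be minimized

theta = fitted_value_iteration(x, u, x_next, cost, gamma)
gain = theta[1] / (2.0 * theta[2])   # greedy controller: u = -gain * x
print("learned Q parameters:", theta)
print("learned feedback gain:", gain)
```

Because the Q-function of this toy problem is exactly quadratic, the three-feature linear parameterization is sufficient here. For a nonlinear process the basis would have to be chosen with care or replaced by a richer approximator such as a neural network, which is the trade-off the abstract describes.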

MDPI and ACS Style

Radac, M.-B.; Lala, T. Learning Output Reference Model Tracking for Higher-Order Nonlinear Systems with Unknown Dynamics. Algorithms 2019, 12, 121.

