Robust Tracking as Constrained Optimization by Uncertain Dynamic Plant: Mirror Descent Method and ASG—Version of Integral Sliding Mode Control

A class of controlled objects is considered whose dynamics are governed by a vector system of ordinary differential equations with a partially known right-hand side. It is presumed that the state variables and their velocities can be measured. The research goal is the design of a robust tracking controller under constraints on the admissible state variables. This construction, which extends the results of the average subgradient technique (ASG) and updates the subgradient descent method (SDM) and the integral sliding mode (ISM) approach, is realized using the Legendre–Fenchel transform. A two-link robot manipulator with three revolute joints, powered by individual PMDC motors, is presented as an illustrative example of the suggested approach.


Brief Survey
Constrained optimization is the process of optimizing an objective function with respect to some variables, subject to constraints on those variables. The objective function is either a cost or energy function to be minimized, or a reward or utility function to be maximized. Constraints can be either hard constraints, which set conditions on the variables that must be satisfied, or soft constraints, under which some variable values are penalized in the objective function depending on the extent to which the conditions on the variables are violated (see, for example, [1][2][3][4][5][6]).
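The distinction between hard and soft constraints can be illustrated with a minimal numerical sketch (not from the paper; the objective, step sizes, and penalty weight below are illustrative choices): a hard constraint is enforced by projection onto the feasible set, while a soft constraint only penalizes violation, so the soft solution may sit slightly outside the feasible region.

```python
# Minimize F(x) = (x - 3)^2 subject to x <= 1, two ways.
# Hard constraint: project onto {x <= 1} after each gradient step.
# Soft constraint: penalize violation with mu * max(0, x - 1)^2.
# All numeric values here are illustrative, not from the paper.

def hard_constrained(steps=500, lr=0.05):
    x = 0.0
    for _ in range(steps):
        x -= lr * 2.0 * (x - 3.0)   # gradient step on F
        x = min(x, 1.0)             # projection keeps x feasible
    return x

def soft_constrained(steps=2000, lr=0.005, mu=100.0):
    x = 0.0
    for _ in range(steps):
        grad = 2.0 * (x - 3.0) + 2.0 * mu * max(0.0, x - 1.0)
        x -= lr * grad
    return x
```

The hard-constrained iterate lands exactly on the boundary x = 1, while the soft-constrained one converges to a point slightly violating the constraint; the violation shrinks as the penalty weight mu grows.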
In most publications, control strategies treated as static optimization methods (SOM) in continuous time may be represented in the following form: minimize a convex (not necessarily strongly convex) mapping F : R^n → R over the admissible convex set X_adm, where the process x_t is generated by the simple ordinary differential equation (ODE)
ẋ_t = u_t, (1)
with any initial conditions x_0 ∈ R^n. The relation (1) is referred to hereafter as a static plant. All known SOM procedures differ only in the design of the control action u_t (the optimization algorithm) as a function of the current state x_t (a Markov strategy) or of a more profound available history, namely, u_t = u(t, x_τ |_{τ∈[0,t]}).
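For the static plant, the simplest SOM choice is the gradient-flow control u_t = −∇F(x_t). The following sketch (assuming a quadratic F with a minimizer x* of our own choosing; step size and horizon are illustrative) integrates the closed loop with the Euler method:

```python
import numpy as np

# Euler simulation of the static plant x' = u_t with the gradient-flow
# control u_t = -grad F(x_t), for F(x) = 0.5 * ||x - x_star||^2.
# x_star, dt, and T are illustrative choices, not values from the paper.

def simulate(dt=1e-3, T=10.0):
    x_star = np.array([1.0, -2.0])
    x = np.zeros(2)                  # initial condition x_0
    for _ in range(int(T / dt)):
        u = -(x - x_star)            # control = negative gradient of F
        x = x + dt * u               # plant dynamics x' = u
    return x, x_star
```

For this strongly convex F the tracking error decays exponentially, so after T = 10 the state is numerically indistinguishable from the minimizer.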
Here we consider a more general, and hence more complex, situation in which the process x_t is generated by the dynamic plant
ẍ_t = f(t, x_t, ẋ_t) + u_t, (2)
where the vector function f on the right-hand side is supposed to be unknown but belongs to some class C of nonlinearities. This problem is close to the so-called extremum seeking problem [7][8][9][10], which involves first-order derivatives only. Thus, in [11], several optimization schemes are considered, and it is shown that under appropriate conditions these schemes achieve the extremum point from an arbitrarily large domain of initial conditions if the parameters in the controller are appropriately adjusted. This approach was applied in [12] to two levels of a plant's economic optimization. Many advanced process control systems use some form of the model predictive control approach [13,14]. Article [15] describes a new algorithm for finding an extremum using stochastic online gradient estimation. Article [16] considers the constrained optimization problem in dynamic linear time-invariant (LTI) systems characterized by the dimension of the control vector being smaller than the dimension of the system state vector; convergence in finite time to a neighborhood of order ε of the optimal equilibrium point is proven. Ref. [17] presents variable structure convex programming control for a class of linear uncertain systems with accessible state.
It is also crucial to discuss the relationship between the problem under consideration and the Model Predictive Control (MPC) approach, commonly referred to as moving or receding horizon control; see the two well-known surveys [18,19] as well as some recent papers on Robust MPC (RMPC) [20][21][22][23]. MPC is a family of model-based control methods that anticipate system behavior using linear or nonlinear process models. The MPC control performance depends on the quality of the open-loop predictions, which in turn depends on the accuracy of the process models. The predicted trajectory may not match the actual behavior of the plant. The performance of the control system may be sluggish or unstable due to the mismatch between the plant and the model, usually referred to as model uncertainty. Robust predictive controllers are those that explicitly take the process and model uncertainty into account while establishing the control strategies. Similar to H-infinity controllers, the core principle behind these controllers is to minimize the effect of the worst possible disturbance on the behavior of the process. This idea is very different from the one suggested in this paper:

- The prediction process cannot be accomplished precisely, since the right-hand side of the ODE representing the object model is considered to be unknown (only the dimensions of the states and control are available);
- Because the control action should be implemented online in real time using feedback (not open-loop control), it is difficult to test and repeat the produced trajectories for the various potential uncertainties.
In this paper, we consider a class of controlled plants with dynamics governed by a vector system of second-order ordinary differential equations (ODE) with an unknown right-hand side. All mechanical Lagrange models belong to this class. The state variables and their velocities are assumed to be measurable. We design a controller minimizing a loss function subject to a set of constraints on the state of the controlled plant. The designed control action is admitted to be a function only of the current subgradients of the loss function and constraints, which are also supposed to be measurable online. The control is designed based on the SDM (subgradient descent method) version [24,25] of the integral sliding mode (ISM) concept [26][27][28], aimed at minimizing "on average" a given convex (not necessarily strongly convex) cost function of the current state under a set of given constraints. An optimization-type algorithm is developed and analyzed using ideas from the SDM technique [1]. We prove the reachability of the "desired regime" (a nonstationary analogue of the sliding surface) [27] from the beginning of the process and obtain an explicit upper bound for the cost function decrement; that is, the convergence is proven and the rate of convergence is estimated as O(t^{-1}). This paper generalizes the approach suggested in [29] for unconstrained dynamic optimization to the constrained optimization problem realized by an uncertain second-order dynamic plant.

• The robust tracking problem is reformulated as a constrained optimization realized by a dynamic plant with an unknown (but bounded) right-hand side. When we refer to "robust tracking", we imply two distinct characteristics connected with imperfect a priori knowledge: while the exact control plant model and the tracked trajectory are unavailable, a robust controller should nevertheless be able to operate successfully; it is only necessary to measure the states and corresponding velocities online.
• The cost as well as the constraints are admitted to be convex but not necessarily strictly or strongly convex.
• The mirror descent method (MDM) and the ASG version of sliding mode control are suggested and realized.
• The convergence of the obtained trajectories of the controlled uncertain plant to the corresponding admissible zone close to the minimal point is established.

Dynamic Model
The second-order dynamic model (2) can be represented in the following extended format:
ẋ_1,t = x_2,t,  ẋ_2,t = f(t, x_1,t, x_2,t) + u_t. (3)
Here the extended state variables x_1,t = x_t and x_2,t = ẋ_t are the current coordinates and their velocities at time t ≥ 0. The function f(t, x_1,t, x_2,t) is piecewise continuous in all arguments and admits to being unknown, but is bounded as
‖f(t, x_1,t, x_2,t)‖ ≤ c_0 + c_1‖x_1,t‖ + c_2‖x_2,t‖ (4)
with finite positive constants c_0, c_1, and c_2. Hereafter the symbol ‖·‖ means the Euclidean norm.
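One concrete (and entirely illustrative) member of the class C can be written down to make the growth condition of type (4) tangible; the particular nonlinearity and the constants below are our own assumptions, chosen so that the bound holds for any state:

```python
import numpy as np

# Sketch of the extended plant x1' = x2, x2' = f(t, x1, x2) + u, where f is
# "unknown" to the controller but obeys a linear growth bound
#   ||f(t, x1, x2)|| <= c0 + c1*||x1|| + c2*||x2||.
# The specific f and the constants are illustrative assumptions.

c0, c1, c2 = 2.0, 0.5, 0.5

def f(t, x1, x2):
    # one admissible nonlinearity: each component is bounded by 1,
    # so ||f|| <= sqrt(n) <= c0 for moderate n
    return 0.5 * np.sin(x1) + 0.3 * np.tanh(x2) + 0.2 * np.cos(t)

def bound_ok(t, x1, x2):
    lhs = np.linalg.norm(f(t, x1, x2))
    rhs = c0 + c1 * np.linalg.norm(x1) + c2 * np.linalg.norm(x2)
    return lhs <= rhs
```

A controller in the framework of this paper may use only the bound, never the expression for f itself.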

Reference Trajectory, Tracking Error Dynamics, and Admissible Zone
The aim of the controller (which will be exactly formulated below) is to realize the tracking of the state x_t for the given reference trajectory {x*_t}_{t≥0}. Define the tracking error δ_1,t as
δ_1,t := x_1,t − x*_1,t, (5)
where x*_1,t is the continuously differentiable trajectory to be tracked, satisfying
ẋ*_1,t = x*_2,t. (6)
In view of that, the tracking error dynamics can be represented as follows:
δ̈_1,t = f(t, x_1,t, x_2,t) + u_t − ẍ*_1,t. (7)
Let us require that the dynamics of δ_1,t be realized, after a time t_0 ≥ 0, within a bounded admissible zone D_adm. This paper's primary objective is to build a control that minimizes the tracking error δ_1; this can be represented as the minimization of an assumed convex loss function F(δ_1). The class of convex loss functions under consideration includes, for example, the functions listed in (8). We accept the following assumptions:
A2. The function f(t, x_1,t, x_2,t) in (4) is piecewise continuous in all arguments and admits to being unknown.
A3. The current states (x*_t, ẋ*_t) of the reference trajectory are also supposed to be available online for any t ≥ 0.
A4. The subgradient a(δ_1,t) of F (recall that a vector a(x) ∈ R^n is a subgradient of F at x if it satisfies the inequality F(y) ≥ F(x) + a(x)^⊤(y − x) for all y) is available online for the current time t ≥ 0, and the set of minimizers δ*_1 of F(·) on the set D_adm includes the origin δ*_1 = 0.
A5. The admissible set D_adm is a nonempty convex compact set, D_adm ≠ ∅.

Mirror Descent Method in Continuous Time
Let us apply the mirror descent approach using the Legendre–Fenchel transformation [30] as follows. For any ζ ∈ R^n define
U*(ζ) := max_{δ ∈ D_adm} [⟨ζ, δ⟩ − U(δ)], (9)
so that (see, for instance, [31,32])
∇U*(ζ) = arg max_{δ ∈ D_adm} [⟨ζ, δ⟩ − U(δ)]. (10)
Define the dynamics of the vector function ζ_t ∈ R^n by the pair of differential equations (11). Remark 1. The second differential equation in (11) can be integrated explicitly as (12). Therefore, δ_1,t ∈ D_adm for all t ≥ t_0 because of convexity and due to (9) and (10).
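The Legendre–Fenchel construction can be checked numerically in the simplest setting: taking U ≡ 0 on a Euclidean r-ball (an assumption of ours, not the paper's choice), the conjugate reduces to the support function of the ball and its gradient to the arg max, which has the closed form r·ζ/‖ζ‖. The brute-force maximization below only serves to verify that closed form:

```python
import numpy as np

# Numerical sketch of the Legendre-Fenchel conjugate for U = 0 on the
# Euclidean r-ball D_adm. Then
#   U*(z)      = max_{||d|| <= r} <z, d>      (support function),
#   grad U*(z) = argmax           = r*z/||z||  for z != 0.
# Function names and sampling sizes are our own illustrative choices.

def grad_U_star(z, r):
    nz = np.linalg.norm(z)
    return r * z / nz if nz > 0 else np.zeros_like(z)

def brute_force_argmax(z, r, n_samples=200000, seed=0):
    rng = np.random.default_rng(seed)
    # sample points uniformly in the r-ball
    d = rng.normal(size=(n_samples, z.size))
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    d = d * (r * rng.random((n_samples, 1)) ** (1.0 / z.size))
    return d[np.argmax(d @ z)]
```

The arg max always lies on the r-sphere, which is exactly why ∇U* loses differentiability there, as noted in Remark 3 below.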

Why the Dynamics δ_1,t Are Desired
The following theorem explains why the dynamics δ 1,t may be considered as desired.

Example 1. Assume the setting of (13). To calculate δ*_1, according to (13), it is sufficient to solve the corresponding optimization problem explicitly.

Auxiliary Sliding Variable and Its Dynamics
Introduce a new auxiliary (sliding) variable s_t. Notice that the function s_t is measurable online, and that the situation s_t = 0 corresponds exactly to the desired regime (11), starting from the moment t_0. Then, for V(s_t) = ½‖s_t‖², in view of (7) and the first equation in (11), we obtain the corresponding estimate of V̇(s_t). Here Sign(s_t) := (sign(s_1,t), . . . , sign(s_n,t))^⊤.
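The reaching behavior produced by a discontinuous term −k Sign(s) can be sketched numerically (a toy disturbance and illustrative gains of our own, not the paper's values): with a matched bounded perturbation of magnitude below k, each component of s reaches zero in finite time and then chatters in a band of order dt:

```python
import numpy as np

# Euler sketch of the reaching law s' = d(t) - k*Sign(s) with an unknown
# bounded disturbance |d_i| <= 1 < k. Finite-time reaching follows from
# V = 0.5*||s||^2, V' <= -(k-1)*||s||_1. Gains dt, T, k are illustrative.

def simulate_reaching(k=2.0, dt=1e-4, T=3.0):
    s = np.array([1.5, -0.8])
    hit_time = None
    for i in range(int(T / dt)):
        t = i * dt
        d = np.array([np.sin(5 * t), np.cos(3 * t)])  # unknown disturbance
        s = s + dt * (d - k * np.sign(s))
        if hit_time is None and np.max(np.abs(s)) < 5 * k * dt:
            hit_time = t                               # reaching detected
    return s, hit_time
```

Each component decreases at rate at least k − 1, so the reaching time is bounded by ‖s_0‖_∞/(k − 1); afterwards the discrete-time chattering stays within an O(dt) band around s = 0.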
Since δ*_1(η) ∈ D_adm, we may conclude that the parameters θ > 0 and η and the initial conditions (δ_1,0, δ_2,0) should be consistent in the sense that θδ_2,0 + δ_1,0 ∈ D_adm. Remark 3. For example, with the Euclidean r-ball in R^n as the admissible set D_adm, from (9) and (10) one has
∇U*(ζ) = arg max_{‖δ‖≤r} [⟨ζ, δ⟩ − U(δ)].
From (18), the explicit formulas (20) and (23) follow. Notice that the function ∇U*(·) in (10) is nondifferentiable at the points of the r-sphere (the boundary of the ball) and continuously differentiable at all other points of R^n. The formulas in (20) and (23) are presented as their continuous versions on the ball, including the r-sphere.

Main Result
We are ready to formulate the main result.
Proof. In view of the relation (19) between the parameter η and the initial conditions δ_1,0, δ_2,0, the auxiliary variable s_t = 0 for all t ≥ 0, starting from the beginning of the control process. Using Formula (12) with t_0 = 0, we obtain (24).

3. Equation (22) holds for a nonzero vector η with a sufficiently small norm ‖η‖ and for θ > 0 (see, as an example, the second item in the loss function (8)).

Numerical Example
A two-link robot manipulator with three revolute joints powered by individual PMDC motors is presented below as an illustrative example of the suggested approach.

Model Description
A dynamic model of a Lagrangian mechanical system with n degrees of freedom in standard form, driven by n independent Permanent Magnet Direct Current (PMDC) motors [33], is defined by the system of differential equations (25), where q_t, q̇_t ∈ R^n are the state and velocity vectors; τ_t ∈ R^n is the vector of external torques; I_{a,t} ∈ R^n is the armature current vector; W ∈ R^{n×n} is the matrix of electromotive force constants (possibly taking into account engine gear ratios); K_a ∈ R^{n×n} is the matrix of direct electromotive force constants; D(q_t) = M(q_t) + W^⊤JW is a positive definite inertia matrix, that is, D(q_t) = D^⊤(q_t) ≥ d_− I_{n×n}, d_− > 0, and is therefore invertible for all q_t; J = diag{J_1, J_2, . . . , J_n} is the rotor inertia matrix; M(q_t) is the matrix of the Lagrangian system corresponding to the armature inertia matrix in the original coordinates; C(q_t, q̇_t) ∈ R^{n×n} is the matrix corresponding to the generalized nonpotential forces C(q_t, q̇_t)q̇_t, which can describe friction, hysteresis, Coriolis, damping, centripetal, and other effects; G(q_t) ∈ R^n is the vector of generalized potential forces; K_e = diag{K_e1, K_e2, . . . , K_en} is the matrix of back electromotive force constants; L_a = diag{L_a1, L_a2, . . . , L_an} and R_a = diag{R_a1, R_a2, . . . , R_an} are the positive definite armature inductance and resistance matrices, respectively; ϑ_t ∈ R^n is the disturbance (or uncertainty) vector; and v_{a,t} ∈ R^n is the armature voltage vector, which is treated below as the control designed to achieve the desired behavior. In fact, the third equation in (25) describes the dynamics of the actuator implementing the applied control action v_{a,t}. Equation (25) assumes fully actuated control.
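A rough single-joint illustration of the motor-plus-link chain described above (not the paper's two-link manipulator) can be simulated directly; all numerical parameters below are invented for the sketch, and the model keeps only the essential structure: a mechanical equation driven by the motor torque K_a I_a and an armature equation driven by the voltage v_a with back-EMF K_e q̇:

```python
import numpy as np

# Minimal single-joint sketch of a PMDC motor driving a pendulum-like
# link, mirroring the structure of the actuator chain: mechanical
# dynamics fed by motor torque, armature dynamics fed by voltage.
# All constants are illustrative assumptions, not identified parameters.

J, b = 0.02, 0.05          # total rotor+link inertia, viscous friction
m, l, g = 1.0, 0.5, 9.81   # link mass, length, gravity
Ka, Ke = 0.8, 0.8          # torque and back-EMF constants
La, Ra = 5e-3, 1.0         # armature inductance and resistance

def step(q, dq, ia, va, dt):
    tau = Ka * ia                                        # motor torque
    ddq = (tau - b * dq - m * g * l * np.sin(q)) / (J + m * l * l)
    dia = (va - Ra * ia - Ke * dq) / La                  # armature dynamics
    return q + dt * dq, dq + dt * ddq, ia + dt * dia
```

Under a constant voltage the link settles at the angle where the motor torque balances gravity, with the steady-state current v_a/R_a, which is a quick sanity check of the coupled equations.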
We assume that q_t, q̇_t, and I_{a,t} are available online. From (25) it follows that relation (26) holds for any fixed t_0 ≥ 0; selecting the control component v^(1) (and neglecting the Joule effect, related to the dependence of the motor winding resistance on temperature), the relation (26) takes the form (29). Substituting (29) into (25) gives (30), where θ_t := W K_a I_{a,t_0} + ϑ_t.
Note that in the standard matrix format (3), with the new state vector x_1 = q ∈ R^n and the velocity vector x_2 = q̇ ∈ R^n, the Lagrange dynamics (30) under consideration take the corresponding first-order form. Thus, in this representation, the dimension of the control vector u is n and the extended state x = (x_1^⊤, x_2^⊤)^⊤ has dimension 2n.

Intended Moving Point
The considered mechanical construction is depicted in Figure 1.
To implement the simulation, the immeasurable nonpotential forces (friction, hysteresis, Coriolis, damping, centripetal effects, and others) were modeled as C(q_t, q̇_t)q̇_t = −k_res Sign(q̇_t)‖q̇_t‖, with k_res > 0.

Applied Robust Controller Structure
In this particular case, the suggested robust controller (17) and (18) takes the following form: the compensating control part u_comp,t, where a(δ_1,t) is defined in (34), and the discontinuous control u_disc,t designed as
u_disc,t := −k_t Sign(s_t). (38)
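A scalar toy version of such a two-part control law, a continuous compensating term plus a discontinuous term −k Sign(s), can be sketched on a double integrator with an unknown bounded right-hand side; the sliding variable, gains, reference, and "unknown" term below are illustrative choices of ours, not the paper's controller:

```python
import numpy as np

# Toy two-part controller u = u_comp + u_disc on the scalar plant
#   x'' = f_unknown(t, x) + u,   tracking x*(t) = sin(t).
# Sliding variable s = de + lam*e gives s' = f_unknown - k*sign(s),
# so |f_unknown| <= 1.3 < k guarantees reaching. Gains are illustrative.

def simulate(k=3.0, lam=2.0, dt=1e-4, T=20.0):
    x, dx = 1.0, 0.0
    for i in range(int(T / dt)):
        t = i * dt
        xs, dxs, ddxs = np.sin(t), np.cos(t), -np.sin(t)
        e, de = x - xs, dx - dxs
        s = de + lam * e                               # sliding variable
        f_unknown = 0.8 * np.sin(3 * t) + 0.5 * np.tanh(x)
        u = ddxs - lam * de - k * np.sign(s)           # u_comp + u_disc
        ddx = f_unknown + u
        x, dx = x + dt * dx, dx + dt * ddx
    return x - np.sin(T), dx - np.cos(T)               # final errors
```

After the finite-time reaching phase the tracking error decays at the rate lam and then stays within a chattering band of order dt, despite the controller never using the expression for f_unknown.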

Parameters of Simulation
The computer simulation parameters are listed below. As one can see, the suggested method demonstrates successful workability in the presence of essential model uncertainties and external perturbations.

Conclusions

- The constrained optimization problem is addressed in this study using a second-order differential controlled plant with an unknown (but bounded) right-hand side of the model.
- The desired dynamics in the tracking error variables are designed based on the mirror descent method.
- The continuous-time convergence to the set of minimizing points is established, and the associated rate of convergence is analytically evaluated.
- The robust controller, containing both the continuous (compensating) part u_comp and the discontinuous part u_disc, is proposed using the ASG version of the integral sliding mode approach.
- The suggested controller, under special relations between its parameters and the initial conditions, is proven to provide the desired regime from the very beginning of the control process.