Model Predictive Regulation on Manifolds in Euclidean Space

One of the crucial problems in control theory is the tracking of exogenous signals by controlled systems. In general, such exogenous signals are generated by exosystems. These tracking problems are formulated as optimal regulation problems for designing optimal tracking control laws. For such a class of optimal regulation problems, we derive a reduced set of novel Francis–Byrnes–Isidori partial differential equations that achieve output regulation asymptotically and are computationally efficient. Moreover, the optimal regulation for systems on Euclidean space is generalized to systems on manifolds. In the proposed technique, the system dynamics on a manifold is stably embedded into Euclidean space, and an optimal feedback control law is designed by employing well-studied output regulation techniques in Euclidean space. The proposed technique is demonstrated with two representative examples: the quadcopter tracking control and the rigid body tracking control. It is concluded from the numerical studies that the proposed technique achieves output regulation asymptotically, in contrast to classical approaches.


Introduction
One of the fundamental problems in control theory is to regulate the output of the plant, which is known in the literature as model predictive regulation (MPR) [1]. Consider a controlled nonlinear system on a manifold $M$, $\dot{x} = f(x, u, w)$, with an exosystem $\dot{w} = a(w)$ and the system output $y = h(x, u, w)$, where $x \in M$, $u \in \mathbb{R}^m$, $w \in \mathbb{R}^d$, and $y \in \mathbb{R}^p$. The optimal MPR problem is to determine a feedback control law $u = \kappa(x, w)$ that regulates the system output asymptotically to zero, i.e., $\lim_{t \to \infty} y(t) = 0$.
An optimal regulation control law is synthesized from an infinite horizon optimal control problem, i.e., $\min_{u} \int_0^\infty l(x(t), u(t), w(t))\,dt$ subject to the controlled dynamics and a fixed initial condition $(x(0), w(0)) = (x_0, w_0)$, where the control Lagrangian, $l : M \times \mathbb{R}^m \times \mathbb{R}^d \to \mathbb{R}_{\geq 0}$, is sufficiently well-behaved and is zero if and only if $y = 0$.
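As a minimal sketch of what such an infinite horizon problem looks like in its simplest instance, the following Python snippet treats a scalar linear plant without an exosystem and a quadratic Lagrangian. The plant $\dot{x} = ax + bu$, the weights `q` and `r`, and the function names are illustrative assumptions and do not come from the paper; the point is only that the optimal feedback comes from a Riccati equation and that the closed-loop cost equals the value function.

```python
import math


def scalar_lqr(a, b, q, r):
    """Solve min over u of the integral of q*x^2 + r*u^2 subject to x' = a*x + b*u.

    Returns (p, k): the optimal cost is p*x0^2 and the optimal law is u = -k*x.
    p is the positive root of the scalar Riccati equation
    2*a*p - (b^2 / r)*p^2 + q = 0.
    """
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return p, k


def closed_loop_cost(a, b, q, r, k, x0):
    """Cost of the law u = -k*x in closed form, valid when a - b*k < 0."""
    rate = a - b * k
    if rate >= 0:
        raise ValueError("feedback gain does not stabilize the plant")
    # x(t) = x0*exp(rate*t), so the integrand is (q + r*k^2)*x0^2*exp(2*rate*t).
    return (q + r * k * k) * x0 * x0 / (-2.0 * rate)


# Hypothetical data: an unstable plant with unit weights.
p, k = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
print(p, k, closed_loop_cost(1.0, 1.0, 1.0, 1.0, k, x0=1.0))
```

Here the Lagrangian vanishes only at $(x, u) = (0, 0)$, which plays the role of the condition that $l$ is zero if and only if the output is zero.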
The output regulation problem for linear multivariate systems is formulated and its solvability conditions are derived, for the first time, by Francis in [2]. The regulation

Optimal Output Regulation of Nonlinear Systems
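Consider a nonlinear dynamic system
$$\dot{x} = f(x, u, w), \tag{1}$$
$$\dot{w} = a(w), \tag{2}$$
$$y = h(x, u, w), \tag{3}$$
with the following data: (a) plant state $x \in \mathbb{R}^n$, plant input $u \in \mathbb{R}^m$, exosystem state $w \in \mathbb{R}^d$ and system output $y \in \mathbb{R}^p$, (b) $f : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^d \to \mathbb{R}^n$ is a map depicting the plant dynamics (1) on $\mathbb{R}^n$, (c) $a : \mathbb{R}^d \to \mathbb{R}^d$ is a map depicting the exosystem dynamics (2) on $\mathbb{R}^d$, (d) $h : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^d \to \mathbb{R}^p$ accounts for the system output (3).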
An output regulation problem is to find a control law that steers, for any set of initial conditions (x 0 , w 0 ) ∈ R n × R d , the system output of the nonlinear dynamics (1)-(3) asymptotically to zero, i.e., lim t→∞ y(t) = 0.
Output regulation problems are commonly formulated for disturbance rejection and reference tracking. The exosystem dynamics is, in general, designed to generate a reference signal or a modeled disturbance signal. Consider first the case of reference tracking, in which the plant output $\bar{h}(x, u) \in \mathbb{R}^p$ for $(x, u) \in \mathbb{R}^n \times \mathbb{R}^m$, that is, the output of the plant dynamics (1) that does not include the exosystem, needs to track the reference signal $q(w) \in \mathbb{R}^p$ for $w \in \mathbb{R}^d$ generated by the exosystem. With the system output taken as the tracking error $\bar{h}(x, u) - q(w)$, the system output asymptotically converging to zero ensures that the plant output asymptotically tracks the reference signal. Similarly, consider the case of disturbance rejection, in which the plant output is to be stabilized asymptotically to zero while the plant dynamics (1) is subjected to the disturbance generated by the exosystem. Here, the system output is regulated to zero under the influence of the disturbance introduced in the plant dynamics.
Before we discuss necessary and sufficient conditions for the solvability of the output regulation problem, let us state the standard assumptions considered in the literature:

Assumption 1. The following assumptions hold for the nonlinear dynamics (1)-(3):
(a) The vector fields $f$ and $a$ and the map $h$ are smooth.
(b) For the control input $u = 0$, the system dynamics (1)-(3) has an equilibrium point $(x, w) = (0, 0)$ such that the system output is zero, i.e., $h(0, 0, 0) = 0$.
(c) The equilibrium exosystem state $w = 0$ of the exosystem (2) is stable and there exists a neighborhood $W_0$ containing zero such that every initial condition $w(0) \in W_0$ is Poisson stable.
(d) The linear approximation of the plant dynamics (1) is stabilizable at the equilibrium point $(x, u, w) = (0, 0, 0)$, i.e., the pair $\left(\frac{\partial f}{\partial x}(0, 0, 0), \frac{\partial f}{\partial u}(0, 0, 0)\right)$ is stabilizable.

The output regulation problem with such a generality is difficult to solve in general. Therefore, the state feedback control law (4) is designed in an open neighborhood $O \subset \mathbb{R}^n \times \mathbb{R}^d$ of the origin such that, for any initial condition $(x(0), w(0)) = (x_0, w_0) \in O$, the system output (3) of the dynamics (1)-(3) converges to zero. The solvability condition for the output regulation problem is established by the following theorem:

Theorem 1. Under Assumption 1, there exist a neighborhood $O \subset \mathbb{R}^n \times \mathbb{R}^d$ of 0 and a $C^k$ ($k \geq 2$) state feedback, $u = u(x, w) \in \mathbb{R}^m$ for $(x, w) \in O$, that asymptotically stabilizes the output of the system dynamics (1)-(3) to zero if and only if there exist $C^k$ mappings $x = \theta(w)$ with $\theta(0) = 0$, and $u = \lambda(w)$ with $\lambda(0) = 0$, both defined in a neighborhood $W \subset \mathbb{R}^d$ of 0, such that
$$\frac{\partial \theta}{\partial w}(w)\, a(w) = f(\theta(w), \lambda(w), w), \tag{5}$$
$$h(\theta(w), \lambda(w), w) = 0 \tag{6}$$
for all $w \in W$.
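Thus, the feedback control law for the output regulation can be designed as in (7), that is, as the sum of the feedforward term $\lambda(w)$ and a feedback term $\kappa$, where the feedback term $\kappa$ with $\kappa(0) = 0$ is derived to make the so-called output regulation manifold $M_R = \{(x, w) : x = \theta(w)\}$ locally attractive.

The problem of synthesizing optimal feedback control laws for output regulation was first proposed by Krener [4,14] and was later generalized to model predictive regulation [15]. The feedback control law (7) using Krener's method is designed in two steps. First, the pair $(\theta, \lambda)$ is obtained by solving the FBI equations (5) and (6). Second, the error coordinates $z = x - \theta(w)$ and $v = u - \lambda(w)$ are introduced, in which the dynamics (1) and (2) read
$$\dot{z} = f(z + \theta(w), v + \lambda(w), w) - \frac{\partial \theta}{\partial w}(w)\, a(w), \qquad \dot{w} = a(w). \tag{8}$$
Under these new coordinate changes, the output regulation problem (4) is posed as an optimal stabilization problem for asymptotic stabilization of the dynamics (8) to zero as
$$\min_{v} \int_0^\infty l(z(t), v(t))\,dt, \tag{9}$$
where $(z_0, w_0)$ is fixed and the smooth control Lagrangian satisfies $l(z, v) = 0$ if and only if $(z, v) = (0, 0)$. Then, the feedback term $\kappa$ in (7) is the optimal feedback control law $v$ obtained by solving the optimal control problem (9).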
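The two-step design above can be sketched on a toy problem. The snippet below is only an illustration under stated assumptions: a hypothetical scalar plant $\dot{x} = Ax + Bu$, a harmonic exosystem $\dot{w}_1 = -w_2$, $\dot{w}_2 = w_1$, and the output $y = x - w_1$, none of which come from the paper. For this plant, $\theta(w) = w_1$ and $\lambda(w) = (-w_2 - A w_1)/B$ satisfy (5) and (6) exactly, and the feedback gain is taken from a scalar linear-quadratic problem rather than from a full nonlinear solution of (9).

```python
import math

A, B = 1.0, 1.0  # hypothetical unstable scalar plant x' = A*x + B*u


def feedforward(w1, w2):
    """Return (theta(w), lambda(w)) solving (5)-(6) for the output y = x - w1."""
    theta = w1
    lam = (-w2 - A * w1) / B
    return theta, lam


def lqr_gain(q=1.0, r=1.0):
    """Gain of the scalar linear-quadratic regulator for the error z = x - theta(w)."""
    p = r * (A + math.sqrt(A * A + B * B * q / r)) / (B * B)
    return B * p / r


def vector_field(state, gain):
    x, w1, w2 = state
    theta, lam = feedforward(w1, w2)
    u = lam - gain * (x - theta)  # feedforward plus feedback on the error
    return (A * x + B * u, -w2, w1)


def rk4_step(state, gain, dt):
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = vector_field(state, gain)
    k2 = vector_field(shifted(k1, 0.5 * dt), gain)
    k3 = vector_field(shifted(k2, 0.5 * dt), gain)
    k4 = vector_field(shifted(k3, dt), gain)
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def final_output(x0, w0, t_final=10.0, dt=1e-3):
    """Integrate the closed loop and return the output y = x - w1 at t_final."""
    gain = lqr_gain()
    state = (x0, w0[0], w0[1])
    for _ in range(int(round(t_final / dt))):
        state = rk4_step(state, gain, dt)
    return state[0] - state[1]


print(final_output(x0=2.0, w0=(1.0, 0.0)))
```

In this sketch the error $z = x - \theta(w)$ obeys $\dot{z} = (A - Bk)z$, so the output decays exponentially while the exosystem keeps oscillating.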

Remark 1.
Note that the PDE (5) along with the algebraic constraints (6) is often solved approximately via finite series solutions [4,5]. Assume that the solution $(\theta, \lambda)$ of the PDE is approximated by polynomials of degree $r$ of the form
$$\theta^{(r)}(w) = \theta^{[1]}(w) + \cdots + \theta^{[r]}(w), \qquad \lambda^{(r)}(w) = \lambda^{[1]}(w) + \cdots + \lambda^{[r]}(w), \tag{11}$$
where $\gamma^{[i]}(\alpha)$ denotes a polynomial homogeneous of degree $i$ in $\alpha$. Then, under the change of coordinates $z = x - \theta^{(r)}(w)$ and $v = u - \lambda^{(r)}(w)$, the optimal stabilization problem (9) leads to the feedback control law (12), which in turn ensures that the state-action pair $(x, u)$ converges asymptotically to $\left(\theta^{(r)}(w), \lambda^{(r)}(w)\right)$.
It is worth noting that $h\left(\theta^{(r)}(w), \lambda^{(r)}(w), w\right)$ may not be zero due to the approximation of the feedforward control $u = \lambda(w)$ and the output regulation manifold $M_R$. Therefore, the series solution (12) does not guarantee asymptotic convergence of the system output (3) to zero. However, the output approximation error

Equipped with a sufficient understanding of output regulation, let us design a feedback law for a class of nonlinear systems that leads to an asymptotic convergence of the system output to zero.
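The effect described in Remark 1 can be made concrete with a small numerical sketch. It assumes a hypothetical output $h(x, u, w) = x - \sin(w_1)$, so that the exact regulation manifold is $\theta(w) = \sin(w_1)$ and its degree-$r$ series approximation is the Taylor polynomial of the sine; these choices are illustrative and are not taken from the paper. The residual $h(\theta^{(r)}(w), \lambda^{(r)}(w), w)$ is then small but nonzero for every finite $r$.

```python
import math


def theta_truncated(w1, r):
    """Taylor polynomial of sin(w1) up to degree r, standing in for theta^(r)(w)."""
    total = 0.0
    for i in range(1, r + 1, 2):  # only odd powers appear in the sine series
        total += (-1) ** ((i - 1) // 2) * w1 ** i / math.factorial(i)
    return total


def output_residual(w1, r):
    """Value of h on the approximate manifold: theta^(r)(w) - sin(w1)."""
    return theta_truncated(w1, r) - math.sin(w1)


for degree in (1, 3, 5):
    print(degree, output_residual(0.5, degree))
```

The residual shrinks as the degree grows, but a state that converges to the approximate manifold still leaves a nonzero output.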

Problem Statement
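Consider a nonlinear system
$$\dot{x}_1 = f_1(x_1, x_2, u, w), \tag{13}$$
$$\dot{x}_2 = f_2(x_1, x_2, u, w), \tag{14}$$
$$\dot{w} = a(w), \tag{15}$$
$$y = x_1 - \tilde{h}(w), \tag{16}$$
where $x = (x_1, x_2) \in \mathbb{R}^p \times \mathbb{R}^{n-p}$ is the plant state and $u \in \mathbb{R}^m$ is the plant input with vector fields $f_1$ and $f_2$ governing the plant dynamics, $w \in \mathbb{R}^d$ is the exosystem state with vector field $a$ governing the exosystem dynamics, and $y \in \mathbb{R}^p$ is the system output.
Note that the system dynamics (13)-(16) is in standard form, and therefore Theorem 1 leads to the following necessary and sufficient condition for the solvability of the output regulation problem for the system dynamics (13)-(16):

Theorem 2. Under Assumption 1, there exist a neighborhood $O \subset \mathbb{R}^n \times \mathbb{R}^d$ of 0 and a $C^k$ ($k \geq 2$) state feedback that asymptotically stabilizes the output of the system dynamics (13)-(16) to zero if and only if there exist $C^k$ mappings $x_2 = \tilde{\theta}(w)$ with $\tilde{\theta}(0) = 0$ and $u = \tilde{\lambda}(w)$ with $\tilde{\lambda}(0) = 0$, both defined in a neighborhood $W \subset \mathbb{R}^d$ of 0, such that
$$\frac{\partial \tilde{\theta}}{\partial w}(w)\, a(w) = f_2(\tilde{h}(w), \tilde{\theta}(w), \tilde{\lambda}(w), w), \tag{17}$$
$$\frac{\partial \tilde{h}}{\partial w}(w)\, a(w) = f_1(\tilde{h}(w), \tilde{\theta}(w), \tilde{\lambda}(w), w) \tag{18}$$
for all $w \in W$.

Proof. We know that the dynamics (13)-(16) with the choice of $x = (x_1, x_2)$, $f = (f_1, f_2)$ and $h(x, u, w) = x_1 - \tilde{h}(w)$ is in the form (1)-(3). Hence, applying Theorem 1 to the dynamics (13)-(16) gives: There exist a neighborhood $O \subset \mathbb{R}^n \times \mathbb{R}^d$ of 0 and a $C^k$ ($k \geq 2$) state feedback that asymptotically stabilizes the output of the system dynamics (13)-(16) to zero if and only if there exist $C^k$ mappings $(x_1, x_2) = (\theta_1(w), \theta_2(w))$ and $u = \lambda(w)$ such that
$$\frac{\partial \theta_1}{\partial w}(w)\, a(w) = f_1(\theta_1(w), \theta_2(w), \lambda(w), w), \tag{19}$$
$$\frac{\partial \theta_2}{\partial w}(w)\, a(w) = f_2(\theta_1(w), \theta_2(w), \lambda(w), w), \tag{20}$$
$$\theta_1(w) - \tilde{h}(w) = 0. \tag{21}$$
The algebraic constraint (21) is satisfied if and only if $\theta_1(w) = \tilde{h}(w)$. Therefore, substituting $\theta_1 = \tilde{h}$ in (19) leads to (18) and (20) leads to (17), with $\tilde{\theta} = \theta_2$ and $\tilde{\lambda} = \lambda$. This proves the assertion.

Remark 2. Note that the FBI equations (17) and (18), a PDE with an algebraic constraint, are in the same form as (5) and (6); however, the dimension of the PDE in (17) and (18) is reduced. Therefore, the reduced-order PDE (17) and (18) is computationally efficient.

We now turn to designing an optimal feedback control law using Krener's method that locally regulates the system output (16) of the dynamics (13)-(16) asymptotically to zero.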
First, a feedforward control law is designed by solving the FBI Equations (17) and (18) using HJB series solutions [4,5]. Let the series solution of the FBI Equations (17) and (18) be given by $\tilde{\theta}^{(r)}(w)$ and $\tilde{\lambda}^{(r)}(w)$ as in (22), where $\gamma^{(r)}(w)$ denotes a polynomial in $w$ of degree up to $r$. Second, the error dynamics is defined under the change of coordinates $y = x_1 - \tilde{h}(w)$, $z = x_2 - \tilde{\theta}^{(r)}(w)$ and $v = u - \tilde{\lambda}^{(r)}(w)$, and the output regulation problem is translated to a stabilization problem as
$$\min_{v} \int_0^\infty l(y(t), z(t), v(t))\,dt \tag{24}$$
subject to the error dynamics, where $(y_0, z_0, w_0)$ is fixed and the smooth control Lagrangian satisfies $l(y, z, v) = 0$ if and only if $(y, z, v) = 0$. The infinite horizon optimal control problem (24) is solved using Al'brekht's method, and a feedback control law is designed that locally stabilizes $(y, z)$ asymptotically to zero [4] (Theorem 4.2). Therefore, the optimal feedback control locally regulates the output of the dynamics (13)-(16) to zero asymptotically. The system output $y$ converges asymptotically to zero because $y$ is itself one of the stabilized error coordinates: the truncation of the PDE series solutions (22) does not affect the component $x_1 = \tilde{h}(w)$ of the output regulation manifold, which is known exactly.
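A minimal sketch of this reduced design, under assumptions that are not from the paper: the plant is a double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$, the exosystem is the harmonic oscillator $\dot{w}_1 = -w_2$, $\dot{w}_2 = w_1$, and the reference is the cubic $w_1^3$. In that case the algebraic constraint gives the velocity component of the manifold directly as the time derivative of the reference, and the remaining PDE gives the feedforward input as its second derivative. The gains `K1` and `K2` are plain pole-placement values standing in for the optimal feedback obtained with Al'brekht's method.

```python
K1, K2 = 4.0, 4.0  # hypothetical gains placing both error poles at -2


def reference(w1, w2):
    """Reference to be tracked by x1 (a cubic polynomial in w)."""
    return w1 ** 3


def theta(w1, w2):
    """Velocity on the regulation manifold: d/dt of the reference along w' = a(w)."""
    return -3.0 * w1 ** 2 * w2


def lam(w1, w2):
    """Feedforward input: d/dt of theta along w' = a(w)."""
    return 6.0 * w1 * w2 ** 2 - 3.0 * w1 ** 3


def vector_field(state):
    x1, x2, w1, w2 = state
    y = x1 - reference(w1, w2)  # output error, known exactly
    z = x2 - theta(w1, w2)      # error in the remaining state
    u = lam(w1, w2) - K1 * y - K2 * z
    return (x2, u, -w2, w1)


def rk4_step(state, dt):
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = vector_field(state)
    k2 = vector_field(shifted(k1, 0.5 * dt))
    k3 = vector_field(shifted(k2, 0.5 * dt))
    k4 = vector_field(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def final_output(state, t_final=10.0, dt=1e-3):
    """Integrate the closed loop and return y = x1 - reference(w) at t_final."""
    for _ in range(int(round(t_final / dt))):
        state = rk4_step(state, dt)
    x1, _, w1, w2 = state
    return x1 - reference(w1, w2)


print(final_output((0.0, 0.0, 1.0, 0.0)))
```

Because the output error appears directly as a state of the error dynamics, it is driven to zero even though the reference is nonlinear in the exosystem state.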

Computational Complexity
The feedback regulation problem for the system (13)-(16) can be solved in two ways: a feedback control law is obtained by solving either the FBI (5) and (6) or the FBI (17) and (18). As the dimension of the PDE in the FBI (17) and (18) is reduced by $p$ as compared to the FBI (5) and (6), it leads to a reduction in computation time. Moreover, the regulation manifold of the system (13)-(16) is explicitly known in its output component, and therefore the feedback regulation law obtained from the FBI (17) and (18) is more accurate than the one obtained from the FBI (5) and (6). A series solution of degree $r$ of the FBI (5) requires the solving of a linear system of order $O\left((n + m)d^j\right)$ recursively for each degree $j = 1, \ldots, r$. Therefore, the computation time for solving the FBI (5) and (6) using series solutions up to degree $r$ is of order $O\left((n + m)^3 d^{3r}\right)$, and for the FBI (17) and (18) it is of order $O\left((n - p + m)^3 d^{3r}\right)$. It can be concluded from the computation time analysis that the reduction in computation time is most pronounced when the degree of the approximate series solution is large.
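The operation counts above can be tabulated with a short sketch. It assumes, as in the estimate above, a dense solve whose cost is cubic in the number of unknowns at each degree, and it counts homogeneous monomials of degree $j$ in $d$ variables exactly as $\binom{d + j - 1}{j}$, which is bounded by $d^j$. The dimensions used in the example ($n = 18$, $m = 4$, $p = 3$, $d = 2$) are an illustrative reading of the extended quadcopter model and are an assumption of this sketch.

```python
from math import comb


def num_monomials(d, j):
    """Number of monomials of exact degree j in d variables."""
    return comb(d + j - 1, j)


def series_cost(rows, d, r):
    """Cubic-cost estimate for solving the degree-1..r linear systems.

    rows is the number of unknown functions: n + m for the full equations,
    n - p + m for the reduced ones.
    """
    return sum((rows * num_monomials(d, j)) ** 3 for j in range(1, r + 1))


def cost_ratio(n, m, p, d, r):
    """Estimated reduced-to-full cost ratio."""
    return series_cost(n - p + m, d, r) / series_cost(n + m, d, r)


for degree in (1, 2, 3):
    print(degree, series_cost(22, 2, degree), cost_ratio(18, 4, 3, 2, degree))
```

Under this model the ratio is $((n - p + m)/(n + m))^3$ for every degree, so the absolute saving grows with $r$ while the relative saving stays fixed.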
Let us now generalize the output regulation problem to manifolds. Many robotics and aerospace systems evolve on manifolds. The optimal stabilization theory developed by Krener [4] cannot be directly applied to systems evolving on manifolds. An intuitive approach is to extend the system to the ambient Euclidean space and design the controller in that ambient space; however, such extensions may not preserve the stabilizability of the linearized system, which is one key assumption for the FBI Equations (5) and (6). This hurdle is circumvented by stably embedding the system dynamics into the ambient Euclidean space [8].

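Consider a class of nonlinear systems on a manifold $M$,
$$\dot{x} = f(x, u, w), \tag{27}$$
$$\dot{w} = a(w), \tag{28}$$
$$y = h(x, u, w), \tag{29}$$
where plant state $x \in M$, plant input $u \in \mathbb{R}^m$, exosystem state $w \in \mathbb{R}^d$ and system output $y \in N$, such that the manifold $M$ is embedded in $\mathbb{R}^n$ and the manifold $N$ is embedded in $\mathbb{R}^p$ with $p \leq n$.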
The output regulation problem on the manifold is solved by stably embedding the system dynamics (27)-(29) into an appropriate Euclidean space such that the linearized system in the ambient Euclidean space is stabilizable. We stress that the stabilizability of the linearized dynamics is one of the key assumptions for the existence of an output regulating feedback control law; see Assumption 1.
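A stabilizable extension of the dynamics (27)-(29) on the ambient Euclidean space $\mathbb{R}^n$ is conducted in two steps [8]:
1. The plant dynamics (27) is extended to $\mathbb{R}^n$ and the system output (29) is extended to the ambient space, which gives the extended plant dynamics (30), $\dot{x} = f_e(x, u, w)$, where the extended vector field $f_e$ coincides with $f$ on $M$.
2. A function $V \geq 0$ is chosen on an open neighborhood $U$ of $M$ in $\mathbb{R}^n$ such that $M = V^{-1}(0)$ and $\nabla V(x) \cdot f_e(x, u, w) = 0$ for all $(x, u, w) \in U \times \mathbb{R}^m \times \mathbb{R}^d$.
Therefore, the extended plant dynamics (30) is stably extended and that leads to the following linearly stabilizable extension of (27)-(29) on $U \times \mathbb{R}^m \times \mathbb{R}^d$:
$$\dot{x} = f_e(x, u, w) - \alpha \nabla V(x), \tag{33}$$
$$\dot{w} = a(w), \tag{34}$$
$$y = h_e(x, u, w), \tag{35}$$
where $h_e$ denotes the extended output and $\alpha > 0$. Here, instead of the number $\alpha > 0$, one can more generally use an $n \times n$ positive definite symmetric matrix-valued function. A detailed discussion on the transversal stability of $M$ in the stably extended dynamics (33)-(35) may be found in [8].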
The system dynamics (33)-(35) is defined in Euclidean space and therefore, Krener's method for designing feedback control for the output regulation problem is directly applicable without any modification.
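For the sake of clarity, let us consider an example of a single-axis rotation of a rigid body. The state space of the dynamics is $SO(2) \times \mathbb{R}$, where $SO(2)$ (the set of $2 \times 2$ orthogonal matrices with determinant 1) accounts for the attitude of the rigid body and the angular velocity of the body about the rotation axis lies in $\mathbb{R}$. The manifold $SO(2)$ is a Lie group and the set $so(2)$ (the set of $2 \times 2$ real skew-symmetric matrices) is its Lie algebra. The attitude dynamics for single-axis rotation of the rigid body is given by
$$\dot{R} = R\hat{\Omega}, \tag{36}$$
$$\dot{\Omega} = \tau / J, \tag{37}$$
where $(R, \Omega) \in SO(2) \times \mathbb{R}$, $R$ determines the attitude of the rigid body and $\Omega$ determines the angular velocity of the rigid body, $J \in \mathbb{R}_{>0}$ is the moment of inertia, $\tau \in \mathbb{R}$ is the torque applied along the axis of rotation, and the hat map $\wedge : \mathbb{R} \to so(2)$ is the vector space isomorphism defined as follows: for $\beta \in \mathbb{R}$,
$$\hat{\beta} = \begin{pmatrix} 0 & -\beta \\ \beta & 0 \end{pmatrix}.$$
Note that the manifold $SO(2)$ is embedded in $\mathbb{R}^{2 \times 2}$ and therefore, the system dynamics (36) and (37) is naturally extended to the ambient space $\mathbb{R}^{2 \times 2} \times \mathbb{R}$. However, such natural extensions may not guarantee the stabilizability of the linearized dynamics around an equilibrium point of interest. Let us define a stable extension of the dynamics (36) and (37) in the neighborhood $GL^+(2) \times \mathbb{R}$ of $SO(2) \times \mathbb{R}$, where $GL^+(2)$ denotes the set of $2 \times 2$ real matrices with positive determinant. To this end, let us define a Lyapunov-like function, $V : GL^+(2) \times \mathbb{R} \to \mathbb{R}_{\geq 0}$,
$$V(R, \Omega) = \tfrac{1}{4}\,\|R^{\top} R - I\|^2$$
for $(R, \Omega) \in GL^+(2) \times \mathbb{R}$ with the usual Euclidean norm $\|\cdot\|$ on $\mathbb{R}^{2 \times 2}$, which satisfies $V^{-1}(0) = SO(2) \times \mathbb{R}$ and $\nabla_R V \cdot (R\hat{\Omega}) = 0$.
It leads to a stable extension of the dynamics (36) and (37) on $\mathbb{R}^{2 \times 2} \times \mathbb{R}$ as
$$\dot{R} = R\hat{\Omega} - \alpha R (R^{\top} R - I), \tag{38}$$
$$\dot{\Omega} = \tau / J, \tag{39}$$
where $\alpha > 0$. Let us consider an output regulation problem on the manifold $SO(2)$, where the exosystem $\dot{w} = a(w)$ generates attitude signals, $\tilde{h}(w) \in SO(2)$ with $w \in \mathbb{R}^d$, for the dynamics (36) and (37) to track. The system dynamics with an exosystem for the output regulation is defined as
$$\dot{R} = R\hat{\Omega} - \alpha R (R^{\top} R - I), \tag{40}$$
$$\dot{\Omega} = \tau / J, \tag{41}$$
$$\dot{w} = a(w), \tag{42}$$
$$y = R - \tilde{h}(w). \tag{43}$$
The dynamics (40)-(43) are defined in Euclidean space and therefore, Krener's method [4] for optimal regulation is readily applied to find a feedforward and feedback control law. Using Theorem 2, the feedforward control law, $\tau = \lambda(w)$ with $w \in \mathbb{R}^d$, which makes the manifold $\Omega = \tilde{\theta}(w)$ invariant, needs to satisfy the following FBI equations:
$$\frac{\partial \tilde{\theta}}{\partial w}(w)\, a(w) = \lambda(w) / J, \tag{44}$$
$$\frac{\partial \tilde{h}}{\partial w}(w)\, a(w) = \tilde{h}(w)\, \widehat{\tilde{\theta}(w)}, \tag{45}$$
where the term multiplied by $\alpha$ vanishes in (45) because $\tilde{h}(w)^{\top} \tilde{h}(w) = I$.

Remark 3. Note that the FBI (44) and (45) is a PDE in $\mathbb{R}$ with algebraic constraints in $\mathbb{R}^{2 \times 2}$, in contrast to the FBI obtained using Theorem 1, which is a PDE in $\mathbb{R}^{2 \times 2} \times \mathbb{R}$ with algebraic constraints in $\mathbb{R}^{2 \times 2}$. This simple example demonstrates that the PDE dimension is reduced from five to one, which contributes to fast computation.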
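The transversal stability provided by the stable extension can be checked numerically with a short sketch. It assumes the Lyapunov-like function $V(R) = \tfrac{1}{4}\|R^{\top}R - I\|^2$, whose gradient is $R(R^{\top}R - I)$, together with a constant angular velocity and the hypothetical value $\alpha = 2$; these choices are assumptions of the sketch. Starting from a matrix that is not a rotation, the extended flow should drive $V$ to zero, that is, pull the state back to $SO(2)$.

```python
ALPHA = 2.0  # hypothetical gain of the stabilizing term


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def hat(beta):
    """Hat map from the reals to 2x2 skew-symmetric matrices."""
    return [[0.0, -beta], [beta, 0.0]]


def gram_error(R):
    """R^T R - I, which vanishes exactly on the orthogonal matrices."""
    S = matmul(transpose(R), R)
    return [[S[i][j] - (1.0 if i == j else 0.0) for j in range(2)]
            for i in range(2)]


def lyapunov(R):
    E = gram_error(R)
    return 0.25 * sum(E[i][j] ** 2 for i in range(2) for j in range(2))


def r_dot(R, omega):
    """Extended attitude kinematics: R*hat(omega) - ALPHA * R * (R^T R - I)."""
    drift = matmul(R, hat(omega))
    pull = matmul(R, gram_error(R))
    return [[drift[i][j] - ALPHA * pull[i][j] for j in range(2)]
            for i in range(2)]


def combine(R, scale, D):
    return [[R[i][j] + scale * D[i][j] for j in range(2)] for i in range(2)]


def simulate(R, omega=1.0, t_final=5.0, dt=1e-3):
    """Integrate the extended kinematics with a classical RK4 scheme."""
    for _ in range(int(round(t_final / dt))):
        k1 = r_dot(R, omega)
        k2 = r_dot(combine(R, 0.5 * dt, k1), omega)
        k3 = r_dot(combine(R, 0.5 * dt, k2), omega)
        k4 = r_dot(combine(R, dt, k3), omega)
        R = [[R[i][j] + dt / 6.0 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
              for j in range(2)] for i in range(2)]
    return R


R0 = [[1.2, 0.1], [-0.1, 0.9]]  # positive determinant, but not a rotation
print(lyapunov(R0), lyapunov(simulate(R0)))
```

With the stabilizing term switched off, the value of $V$ would stay constant along the flow, which is why the plain extension does not pull trajectories back to the manifold.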

Remark 4.
Embedding $SO(2)$ into $\mathbb{R}^{2 \times 2}$ increases the dimension of the state space by 3; however, one can identify $SO(2)$ with the unit circle and embed the unit circle in $\mathbb{R}^2$ (the ambient space), which only increases the dimension of the state space by 1.
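The lower-dimensional embedding mentioned in Remark 4 can be sketched in the same spirit. The snippet assumes the attitude is represented by a point $q \in \mathbb{R}^2$ on the unit circle, the function $V(q) = \tfrac{1}{4}(\|q\|^2 - 1)^2$ with gradient $(\|q\|^2 - 1)q$, a constant angular velocity, and the hypothetical value $\alpha = 2$; all of these are illustrative assumptions.

```python
import math

ALPHA = 2.0  # hypothetical gain of the stabilizing term


def vector_field(q, omega):
    """Rotation at rate omega plus the pull -ALPHA * (|q|^2 - 1) * q toward the circle."""
    q1, q2 = q
    gap = q1 * q1 + q2 * q2 - 1.0
    return (-omega * q2 - ALPHA * gap * q1, omega * q1 - ALPHA * gap * q2)


def simulate(q, omega=1.0, t_final=5.0, dt=1e-3):
    """Integrate the extended kinematics on the plane with a classical RK4 scheme."""
    for _ in range(int(round(t_final / dt))):
        k1 = vector_field(q, omega)
        k2 = vector_field((q[0] + 0.5 * dt * k1[0], q[1] + 0.5 * dt * k1[1]), omega)
        k3 = vector_field((q[0] + 0.5 * dt * k2[0], q[1] + 0.5 * dt * k2[1]), omega)
        k4 = vector_field((q[0] + dt * k3[0], q[1] + dt * k3[1]), omega)
        q = (q[0] + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             q[1] + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return q


q_final = simulate((1.5, 0.0))
print(math.hypot(*q_final))
```

In this representation the extended state $(q, \Omega)$ has dimension 3, compared with 5 for the matrix embedding $(R, \Omega)$.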

Remark 5.
Note that the output regulation technique by Krener [4] does not incorporate state and control constraints. For output regulation of safety-critical systems, where state and control constraints must be considered at the controller design stage, a model predictive control approach is proposed by Krener [15]. The model predictive control approach extends directly to manifolds by stably extending the system dynamics to an ambient Euclidean space.

Simulation Results
Let us solve the output regulation problem for the bi-directional quadcopter [16] and the rigid body attitude motion with Krener's Matlab-based Nonlinear Systems Toolbox [6]. We demonstrate with the quadcopter example that fairly complex problems can be handled using this approach.

Quadcopter
A bidirectional quadcopter is an unmanned aerial vehicle that is fitted with four rotors to generate bidirectional (upward and downward) thrust and a torque to orient the quadcopter. The system dynamics for the quadcopter is given by (46)-(48), where $R \in SO(3)$ (the set of $3 \times 3$ rotation matrices) denotes the attitude, $\Omega \in \mathbb{R}^3$ denotes the body angular velocity, and $x \in \mathbb{R}^3$ defines the position of the quadcopter. The control inputs are $f$ and $\tau$, where $f \in \mathbb{R}$ accounts for the upward thrust generated by the rotors, and $\tau \in \mathbb{R}^3$ is the torque applied on the body. The parameter $m$ is the quadcopter mass, $e_3 = (0, 0, 1)$, and $I$ is the $3 \times 3$ moment of inertia matrix. The hat map $\wedge : \mathbb{R}^3 \to so(3)$, where $so(3)$ denotes the set of $3 \times 3$ skew-symmetric matrices, is a vector space isomorphism defined as $\hat{x} y = x \times y$ for all $x, y \in \mathbb{R}^3$.

Consider a position tracking problem in which the quadcopter traces a path that is generated by the exosystem (50). The tracking problem is to regulate the output (51) to zero subject to the quadcopter dynamics (46)-(48) and the exosystem dynamics (50). Note that the output regulation problem for the quadcopter dynamics (46)-(48) with the exosystem (50) and the output (51) is in the standard form. Therefore, Krener's method extended to manifolds, as described in Section 3, is employed to design a feedback control law for regulating the output asymptotically to zero. In order to apply Krener's method, let us first stably extend the quadcopter dynamics (46)-(48) to Euclidean space in an identical manner as in (38) and (39). The stably extended dynamics with the exosystem and output is defined by (52)-(57), where $R \in \mathbb{R}^{3 \times 3}$, $\Omega \in \mathbb{R}^3$, $x \in \mathbb{R}^3$, $v \in \mathbb{R}^3$, $w \in \mathbb{R}^2$, and $y \in \mathbb{R}^3$.

The dynamics (52)-(57) is in standard form and therefore, a feedback control law (12) up to degree $r$ is given by $\bar{u}^{(r)}$ in (58), where the feedforward $\lambda^{(r)}$ and the stabilizing manifold $\bar{z}^{(r)} = (R, \Omega, x, v) - \theta^{(r)}(w)$ are computed by solving the FBI (5) and (6) and the feedback $\kappa$ is computed by solving the stabilization problem (9) using Al'brekht's method. On the other hand, in our technique, the feedback control (26) up to degree $r$ is given by $\tilde{u}^{(r)}$ in (59), where the feedforward $\tilde{\lambda}^{(r)}$ and the stabilizing manifold $\tilde{z}^{(r)} = (R, \Omega, v) - \tilde{\theta}^{(r)}(w)$ are computed by solving the FBI (17) and (18) and the feedback $\kappa$ is computed by solving the stabilization problem (24) using Al'brekht's method. Let $\bar{x}\{r\}$ and $\tilde{x}\{r\}$ be the positions that are traced by the quadcopter (46)-(48) under the feedback $\bar{u}^{(r)}$ and $\tilde{u}^{(r)}$, respectively. Then, the corresponding tracking errors are denoted by $\bar{y}\{r\}$ and $\tilde{y}\{r\}$, respectively. The following parameters have been considered for the numerical experiments:

We can infer from the phase portrait in Figure 1 that the position trajectory $\tilde{x}\{r\}$ traced under the control law $\tilde{u}^{(r)}$ follows the exosystem trajectory $\tilde{h}$ more closely than the position trajectory $\bar{x}\{r\}$ traced under the control law $\bar{u}^{(r)}$. The tracking errors $\tilde{y}\{r\}$ eventually converge to zero as shown in Figure 2, but $\bar{y}\{r\}$ does not converge to zero in Figure 3, which supports the claim that the tracking performance of the control law $\tilde{u}^{(2)}$ is better than that of $\bar{u}^{(2)}$. As shown in Figure 4, under the optimal tracking control law $\tilde{u}^{(2)}$ the quadcopter rotors produce a negative thrust to catch up with the exosystem trajectory, and then the torque and thrust eventually go to zero as the tracking error $\tilde{y}\{r\}$ goes to zero.

Remark 6.
Note that the reference $\tilde{h}(w)$ generated by the exosystem (50) is a cubic polynomial in $w$, and therefore, the feedback controls $\tilde{u}^{(3)}$ and $\bar{u}^{(3)}$ are identical. However, the feedback control $\tilde{u}^{(2)}$ provides good performance and is computationally less intensive as compared to the feedback control $\bar{u}^{(3)}$.

Rigid Body Attitude Control
The rigid body attitude dynamics is given by
$$\dot{R} = R\hat{\Omega}, \qquad J\dot{\Omega} = J\Omega \times \Omega + \tau,$$
where $R \in SO(3)$ denotes the attitude, $\Omega \in \mathbb{R}^3$ denotes the body angular velocity, the control input $\tau \in \mathbb{R}^3$ is the torque applied on the body, and $J$ is the $3 \times 3$ moment of inertia matrix. The hat map $\wedge : \mathbb{R}^3 \to so(3)$ is the vector space isomorphism mentioned for the quadcopter example. Consider a rigid body attitude tracking problem in which the rigid body is tracking an attitude profile,
$$\tilde{h}(w) = \exp(\hat{e}_s w_1) \in SO(3) \subset \mathbb{R}^{3 \times 3}, \tag{60}$$
where $e_s = (1, 1, 1)$ and $w = (w_1, w_2) \in \mathbb{R}^2$ is generated by an exosystem on $\mathbb{R}^2$ defined by $\dot{w} = aw$ with
$$a = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
The output regulation problem for the rigid body tracking problem is to regulate the system output $y = R - \tilde{h}(w)$ to zero asymptotically. We adopt the same procedure as in the case of quadcopter control to derive the feedback controls $\bar{\tau}^{(r)}$ and $\tilde{\tau}^{(r)}$ as discussed in (58) and (59), respectively, and the corresponding tracking errors are given by $\bar{y}\{r\}$ and $\tilde{y}\{r\}$. Note that the rigid body tracking case is different from the quadcopter tracking in the sense that we are tracking a transcendental function (60). Therefore, the feedback control $\bar{\tau}^{(r)}$ cannot achieve regulation asymptotically for any $r$; however, the feedback control $\tilde{\tau}^{(r)}$ achieves regulation asymptotically for $r = 2$; see Figure 5.
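The difficulty with a transcendental reference can be illustrated by a minimal sketch, which is not part of the reported experiments. It compares the attitude profile (60), evaluated exactly with Rodrigues' formula, against its Taylor polynomial of degree $r$ in $w_1$, and measures how far the truncated matrix is from being orthogonal. The helper names and the test value $w_1 = 0.5$ are assumptions of the sketch.

```python
import math

E_S = (1.0, 1.0, 1.0)


def hat(v):
    """Hat map from R^3 to 3x3 skew-symmetric matrices."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def exact_reference(w1):
    """exp(hat(E_S) * w1) via Rodrigues' formula."""
    norm = math.sqrt(sum(c * c for c in E_S))
    K = hat(tuple(c / norm for c in E_S))
    K2 = matmul(K, K)
    angle = norm * w1
    eye = identity()
    return [[eye[i][j] + math.sin(angle) * K[i][j] + (1.0 - math.cos(angle)) * K2[i][j]
             for j in range(3)] for i in range(3)]


def truncated_reference(w1, r):
    """Taylor polynomial of exp(hat(E_S) * w1) up to degree r in w1."""
    A = [[w1 * entry for entry in row] for row in hat(E_S)]
    term, total = identity(), identity()
    for i in range(1, r + 1):
        term = [[entry / i for entry in row] for row in matmul(term, A)]
        total = [[total[p][q] + term[p][q] for q in range(3)] for p in range(3)]
    return total


def orthogonality_defect(R):
    """Frobenius norm of R^T R - I."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    S = matmul(Rt, R)
    return math.sqrt(sum((S[i][j] - (1.0 if i == j else 0.0)) ** 2
                         for i in range(3) for j in range(3)))


for degree in (2, 4, 8):
    print(degree, orthogonality_defect(truncated_reference(0.5, degree)))
print("exact", orthogonality_defect(exact_reference(0.5)))
```

Every finite degree leaves a nonzero defect, so a regulation manifold built from the polynomial approximation of the reference does not lie on the rotation group, whereas a design that uses the reference itself is not affected by this truncation.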

Conclusions and Future Works
This article presents a technique for designing an optimal feedback controller that achieves regulation asymptotically for a class of controlled systems. Moreover, we have generalized the optimal regulation problems on Euclidean spaces to manifolds with the embedding technique, and demonstrated its applicability by designing optimal tracking feedback control laws for the bi-directional quadcopter system and the rigid body control system. As future work, we plan to investigate the case of model predictive regulation on Lie groups in the framework developed in this paper. We also plan to apply this framework to reinforcement learning.