A convex data-driven approach for nonlinear control synthesis

We consider a class of nonlinear control synthesis problems where the underlying mathematical models are not explicitly known. We propose a data-driven approach to stabilize the systems when only sample trajectories of the dynamics are accessible. Our method is founded on the density function based almost everywhere stability certificate that is dual to the Lyapunov function for dynamic systems. Unlike Lyapunov based methods, density functions lead to a convex formulation for a joint search of the control strategy and the stability certificate. This type of convex problem can be solved efficiently by invoking the machinery of the sum of squares (SOS). For the data-driven part, we exploit the fact that the duality results in the stability theory of the dynamical system can be understood using linear Perron-Frobenius and Koopman operators. This connection allows us to use data-driven methods developed to approximate these operators combined with the SOS techniques for the convex formulation of control synthesis. The efficacy of the proposed approach is demonstrated through several examples.


I. INTRODUCTION
The celebrated Lyapunov theory lays the foundation for stability analysis of nonlinear dynamical systems. Lyapunov functions provide stability certificates for a nonlinear system. For a given system, searching for a proper Lyapunov function can often be formulated as a convex optimization problem and thus easy to address. For instance, for polynomial dynamics, this is achieved through the sum of squares (SOS). Regardless of its similarity to stability analysis, the problem of nonlinear controller synthesis is more challenging. Other than a few special cases such as linear quadratic control problems, the joint search for Lyapunov stability certificate and control strategy can no longer be cast as convex optimization problems. This is exacerbated by the fact that in many applications, the underlying mathematical models are not available. Our objective in this paper is to establish a principled approach for nonlinear control synthesis when the mathematical models of the underlying dynamics are not explicitly given.
We provide a systematic approach for data-driven control synthesis for a class of control affine nonlinear systems of the formẋ The objective is to design state feedback controller u = u(x) such that the closed-loop system is asymptotically stable. To achieve this objective, we use density function-based dual stability formulation introduced by Rantzer for almost everywhere stability analysis and synthesis for nonlinear control systems [1]. Unlike Lyapunov function-based approach for control design, the co-design problem of simultaneously finding the density function and almost everywhere stabilizing controller is a convex optimization problem. We exploit this convexity property for data-driven control synthesis. In [2], [3], it was shown that the duality between density and Lyapunov function in the stability theory could be understood using linear operator theoretic framework. In particular, the duality between Koopman and Perron-Frobenius operators is at the heart of the duality in the stability theory. This linear operator theoretic framework is also exploited for the datadriven control design [4], [5]. The recent advances in the data-driven approximation of the Koopman operator are used to discover a data-driven approach for the nonlinear control synthesis. In Koopman theory, a nonlinear system is lifted to, albeit infinite-dimensional, a linear system. This lifting can be approximated using data generated from the underlying nonlinear dynamics by the wellknown Extended Dynamic Mode Decomposition (EDMD) algorithm [6]. These tools have been successfully applied in many domains, such as fluid dynamics [7], power systems [8], [9], to understand the principle components/modes of given nonlinear dynamics [10]. Recently, Koopman theory has been introduced to the control synthesis tasks, hoping that the controller designed in the lifted space could be easier than that in the original state space. It turns out to be a challenging problem since the lifting argument in the presence of control is no longer valid. Regardless of the progress that has been made in this direction during the last few years [11]- [14], a principle data-driven approach for nonlinear control synthesis is not yet available. We use the EDMD algorithm combined with the duality results for the data-driven approximation of the Perron-Frobenius (P-F) operator corresponding to the control system. This linear P-F operator for the control system is used to formulate a convex optimization problem for control synthesis. This optimization is over polynomials and can be solved using the SOS solvers. The complexity of the resulting optimization problem depends on the polynomial basis used to approximate the linear operators. Since control often doesn't require high fidelity models, we expect to construct a reliable controller using a relatively small number of basis functions. We envision that this method can be applied to low dimensional and medium dimensional dynamical systems (e.g. robotics, distributed power-electronics control applications).
The rest of the paper is organized as follows. In Section II, we provide a review on density function methods, SOS, and Koopman theory; these are the ingredients of our approach. Problem formulation and the details of our method are presented in Section III. This is followed by several numerical examples in Section IV and a short concluding remark in Section V.

II. BACKGROUND
Our proposed method for control synthesis utilizes density function method for control design, SOS for polynomial optimization and Koopman theory for data-driven approximations. Necessary background on these components is discussed in this section.

A. Density function approach for control synthesis
Consider control-affine system (1) with feedback control u(x) and x ∈ R n . This closed-loop system is asymptotically stable with respect to the origin x = 0 if there exists a Lyapunov function V such that Thus, for the purpose of control synthesis, one seeks a pair (V, u) such that (2) holds. Note that this inequality is bilinear with respect to V, u and is thus a non-convex problem. This is the major obstacle preventing Lyapunov theory being widely used in control synthesis. In [1], a dual to Lyapunov's stability theorem was established. Theorem 1 ( [1]): Given the systemẋ = F(x), where F is continuous differentiable and F(0) = 0, suppose there exists a nonnegative ρ is continuous differentiable for Then, for almost all initial states x(0), the trajectory x(t) tends to zero as t → ∞. Moreover, if the equilibrium x = 0 is stable, then the conclusion remains valid even if ρ takes negative values. The density ρ serves as a stability certificate and can be viewed as a dual to the Lyapunov function [1]. Applying Theorem 1 to the closed-loop system we arrive at The control synthesis becomes searching for a pair (ρ, u) of functions such that (4) holds. Even though (4) is again bilinear, it becomes linear in terms of (ρ, ρu). Thus, the density function based method for control synthesis is a convex problem.

B. Sum of squares
SOS optimization [15]- [18] is a relaxation of positive polynomial constraints appearing in polynomial optimization problems which are generally difficult to solve. SOS polynomials are in a set of polynomials which can be described as a finite linear combinations of monomials, i.e., where p is a SOS polynomial; p i are monomials; and d i are coefficients. Hence, SOS is a sufficient condition for nonnegativity of a polynomial and thus SOS relaxation provides a lower bound on the minimization problems of polynomial optimizations. Using the SOS relaxation, any polynomial optmization problems with positive constraints can be formulated as SOS optimization as follows: where Σ[x] denotes SOS set; w is weighting coefficients; p s and p e are polynomials with coefficients d. The problem in (5) is translated into Semidefinite Programming (SDP) [16], [19].

C. Linear Koopman and Perron-Frobenius Operators
For a dynamical system,ẋ = F(x), there are two different ways of linearly lifting the finite dimensional nonlinear dynamics from state space to infinite dimension space of functions, F , namely Koopman and Perron-Frobenius operators. Denote the solution of system (1) by φ t (x). The definitions of these operators along with the infinitesimal generators of these operators are defined as follows.
Definition 1 (Koopman Operator): K t : F → F for dynamical system (1) is defined as The infinitesimal generator for the Koopman operator is Definition 2 (Perron-Frobenius Operator): P t : F → F for dynamical system (1) is defined as where |·| stands for the determinant. The infinitesimal generator for the P-F operator is given by These two operators are dual to each other where the duality is expressed as follows.
We are interested in data-driven control synthesis for multivariate nonlinear dynamics 1 : where state x ∈ R n and control inputs u; and F represents open-loop dynamics; and G(x) = (G 1 (x), . . . , G m (x)) constitutes feedback control loop corresponding to control inputs u = [u 1 , . . . , u m ] ⊤ . The explicit description of F and G are not available, but we have access to a set of sample trajectories generated from this system (9). Our goal is a state feedback strategy u that globally stabilizes (9).

A. Density function approach reformulation
Based on the density function method, [22] proposed an implementable algorithm using SOS. In particular, the parameterization where a and c = [c 1 , . . . , c m ] ⊤ are polynomials, b is a positive polynomial (positive at x = 0), and α is a sufficiently large number such that the integrability condition in Theorem (1) holds.
With this parametrization (10), (4) becomes The positive polynomial b can be chosen as a quadratic control Lyapunov function for the linearized dynamics at the origin x = 0 [22]. The control synthesis then becomes finding polynomials a and c such that (1 + α)b∇ · (Fa + Gc) − α∇ · (bFa + bGc) > 0, (11) which is clearly a standard SOS problem.

B. Data-driven approximation of linear operators
The fundamental object of interest in the data-driven control synthesis is the approximation of the infinitesimal generator of P-F operator shown in (7) corresponding to vector fields F and G affine in control system (9). For the finite dimensional approximate representation of inequality (11), we will approximate the divergence terms, i.e., ∇·(F · ) and ∇·(G i · ) for i = 1, . . . , m, using Koopman and P-F generators. We adopt the technique from [12], [23] for the approximation of these two generators. In particular, data generated from the control system (9) with zero input and unit step inputs for each control input is used for the approximation of the generators P F and P F+Gi respectively. Using linearity property, the infinitesimal generator for G i i.e., P Gi is approximated from 1 we use bold symbols to denote column vectors unless it is specified as a row vector or a matrix. P F+Gi − P F = P Gi . Using similar argument, it also follows that Furthermore, we notice that the P-F generator for vector field F can be written as (13) This allows us to approximate the P-F generator using algorithm known for the approximation of Koopman operator such as Extended Dynamics Mode Decomposition (EDMD). We show that the multiplication operator corresponding to ∇ · F in (13) can also be approximated using the approximate Koopman operator.
For the data-driven approximation, let φ(t, x; u) denote a solution of (9) at time t starting from x with control input u. First, we collect time-series data from the dynamical system in (9) by injecting different control inputs: i) zero control inputs (i.e., u = 0), and ii) unit step control inputs, i.e., u = e j for j = 1, . . . , m for a finite time horizon with sampling step δt, where e j ∈ R m denotes unit vectors (i.e., jth entry of e j is 1, otherwise 0). The time-series data of the system responses corresponding to each control input case are collected in: with i = 0, 1, . . . , m for zero and unity control inputs, where y = φ(t + δt, x; u); and T i are the number of time-series data points collected for each input case. The samples in X i do not have to be from a single trajectory; X i can be a concatenation of multiple experiment/simulation trajectories. We construct a polynomial basis denoted by as a vector of monomials up to qth order. The total number of monomials in the basis, Q = n+q q . Using the EDMD algorithm, the Koopman operator, K i 2 for i = 0, 1, . . . , m corresponding to zero input and step inputs u = e j for j = 1, . . . , m is approximated as [6]: where K i are the estimated Koopman operator matrices for K i , and X i,ℓ and Y i,ℓ denote ℓth column of X i and Y i , respectively. The solution of (16) is explicitly known, K i = A † i B i , where † stands for pseudo-inverse. The Koopman generator for vector field, F, can now be approximated as Fig. 1: Summary of the steps in our proposed algorithm described in Section III.
We approximate the multiplication operator corresponding to the divergence of vector field F as follows where C x is a coefficient vector corresponding to the original states in the basis function Ψ i.e., x = C ⊤ x Ψ. Since, Ψ are assumed to be monomials basis, we can extract x from Ψ. Using linearity property of the generator in (12), we can approximate the Koopman generator corresponding to vector field G j for j = 1, . . . , m as Similarly following (18), the multiplication operator corresponding to the divergence of vector fields G j are approximated as Finally, combining (13), and (17)- (20), we have the following approximation for the infinitesimal generators for the P-F operators corresponding to the vector fields F and G j , ∀j:

C. Convex Control Synthesis: Combining SOS with Koopman
In this section, we formulate convex control synthesis using SOS optimization and Koopman operator described in previous sections.
First, we create polynomials a(x), and c(x) = [c 1 (x), . . . , c m (x)] T with degrees up to q a , q c1 , . . . , q cm , respectively. Coefficients of those polynomials are denoted by z a = [â 1 , . . . ,â Qa ] ⊤ , z cj = [ĉ j,1 . . . ,ĉ j,Qc j ] ⊤ , j = 1, . . . , m, where Q a = n+qa qa and Q cj = n+qc j qc j . Subsequently, we manipulate z a and z cj algebraically to create coefficient vectors C a and C cj in terms of Ψ such that a(x) = C ⊤ a Ψ(x), c j (x) = C ⊤ cj Ψ(x), j = 1, . . . , m. Similarly, let C ab , C bc1 , . . . , C bcm denote coefficient vectors of multiplications of polynomials, a(x)b(x), b(x)c 1 (x), . . . , b(x)c m (x), namely, is an arbitrary positive polynomial appearing in (11). Note that the degree of the monomial basis Ψ(x) in (15) should be larger than any other polynomials described above, Note that there is no systematic way to optimally choose the degree of polynomials, however we require higher order polynomials for c(x) than a(x) depending on the complexity of the underlying dynamics. Now, using the approximation of the infinitesimal PF generators in (21), we restate the left-hand side of (11) as below: The polynomial in (22) is linear in terms of the coefficients of the polynomials, a(x), c j (x), j = 1, . . . , m, hence we can solve SOS problem with (22) as a SOS constraint, given as below: where d = [z ⊤ a , z ⊤ c ] ⊤ and the objective function is ℓ 1norm minimization to promote sparsity and robustness of solution. The last term in (23) reflects the constraint, ρ > 0. Subsequent to solving (23), we can construct a controller u j (x) = c j (x)/a(x), j = 1, . . . , m to stabilize the dynamical system in (9). The steps of the proposed method described here in Section III is summarized in Fig. 1.

A. Van der Pol Oscillator
Dynamics of Van der Pol Oscillator is given as below [24]: We collect time-series data points of zero and unit step input responses in X 1∼2 and Y 1∼2 shown in (14) In this case, number of data points for each input response case, T 1 = 9968, T 2 = 9970. We choose b(x) = x ⊤ x, a(x) = 1, α = 6, and also c(x) to be a polynomial with degree from 1 to 4. Following the control synthesis described in Section III, we have the solution, c(x) = 0.9015x 2 1 x 2 + 0.0251x 3 2 +0.0241x 2 2 −1.2505x 2 . Following this, a synthesized control, u(x) = c(x). Results of the control synthesis are shown in Fig. 2 where trajectories starting from some initial points converge to the origin.

B. Non-Polynomial System Example: Inverted Pendulum
Dynamics of a simple two-dimensional inverted pendulum is given as below: which is non-polynomial due to a sinusoidal function. We collect time-series data points for zero and unit step inputs in X 1∼2 and Y 1∼2 , by doing repeated simulations, from 0 to 0.001 [s] with time step δt = 0.001 [s], starting from 10 4 uniformly-distributed random initial points from [x 1 , x 2 ] = [−π, π] 2 . Number of data points for both input response cases, T 1 = T 2 = 10 4 . We choose α = 4, b(x) = x ⊤ x, a(x) = 1, and c(x) to be a polynomial with degree from 1 to 3. Following the proposed algorithm in Section III, a control solution is computed, u(x) = c(x) = 0.1553x 3 1 − 1.9884x 1 . Figure 3 shows trajectories of the dynamics with the synthesized control, starting from some initial points, demonstrating that the control solution from the proposed method can effectively stabilizes non-polynomial dynamical systems.

C. Lorenz System Dynamics
Dynamics of Lorenz attractor is given by [23]: where ρ = 28, σ = 10, and β = 8 3 . We sample the timeseries data points from repeated simulations, from 0 to 0.001 [s], with time step δt = 0.001 [s], and uniformly distributed initial points collected from [x 1 , x 2 , x 3 ] = [−5 × 5] 3 . The data points collected for all input cases, T 1 = T 2 = 9945. For the parameters of stability conditions, we choose α = 4, b(x) = x ⊤ x, a(x) = 1, and c(x) to be a polynomial with degree from 1 to 3. Following the proposed method described in Section III, we get the solution, u(x) = c(x) = −26.9591x 1 − 6x 2 , and the result of the control synthesis is depicted in Fig. 4, showing trajectories of the open-loop dynamics as well as the controlled dynamics, starting from different initial conditions. We can see that chaotic dynamics of the Lorenz attractor is stabilized to the origin by the control synthesized by our proposed method.

D. Rigid Body Control
In this case study, we investigate the dynamics of a rigid body system, which consists of six dynamical states and three control inputs [22]: where the angular velocity vector, ω ∈ R 3 ; Rodrigues parameter vector, ψ ∈ R 3 ; and control torque, u ∈ R 3 . We follow the same parameters J, S, and H, as shown in [22]. Time-series data points are sampled from repeated timedomain simulations for four control input cases, i.e., u = 0, u = e 1∼3 . Simulation time spans from 0 to 0.001 [s] with time step δt = 0.001 [s], starting from uniformly distributed random initial points, [ω ⊤ , ψ ⊤ ] = [−3×3] 6 . Each data matrix, X 1∼4 , Y 1∼4 has 9986 time-series data points. The parameters of stability formulation are chosen as α = 4, a(x) = 1, and c j (x) to be a polynomial with degree from 1 to 3. Also, b(x) = ||ω+ψ|| 2 +||ψ|| 2 which is known to be a CLF of the linearized dynamics of (24) from [22]. The resulting control u j = c j (x), j = 1, ..., 3. Figure 5 shows trajectories of the states ω 1∼3 and ψ 1∼3 , starting from some initial points, which demonstrates that the proposed method can deal with higher dimensional dynamical systems.

V. CONCLUDING REMARK
A systematic convex optimization-based framework is provided for data-driven stabilization of control affine nonlinear systems. The proposed approach relies on a combination of SOS optimization methods and recent advances in the data-driven computation of the Koopman operator. Future research efforts will focus on data-driven optimal control of the nonlinear system and the robust counterpart of this work by exploiting the sample complexity of Koopman and P-F operator [25].