Stability Margin of Data-Driven LQR and Its Application to Consensus Problem

Abstract: In contrast with traditional control input design techniques based on mathematical models of the system, data-driven control approaches, which have recently gained substantial attention, derive the controller directly from data collected from experiments or observations of the target system. In particular, several data-driven optimal control and model predictive control (MPC) techniques have been proposed. In this paper, it is shown that the recently proposed data-driven LQR (Linear Quadratic Regulator) has a stability margin, that is, a set of uncertainties in the control input channels under which closed-loop stability is maintained. As an application of the derived stability margin of the data-driven LQR, the consensus problem is considered. Since the control design for the consensus of multi-agent systems can be reformulated as the robust stabilization of a linear system with uncertainty in the input channel, it is demonstrated that the derived stability margin can be used to design a controller for the consensus of multi-agent systems.


Introduction
Over the past few decades, research on model-based control design, a control input design technique based on mathematical models of the system under consideration, has dominated the control community. Using mathematical models of these systems, it has been possible to design control techniques that take into account various performance indicators of the system, such as optimality of the control performance expressed by a quadratic cost with respect to the control and state [1,2]. At the same time, it should be noted that the application of these control techniques requires the construction of an accurate mathematical model of the system.
However, constructing exact mathematical models is a difficult, expensive, and time-consuming task [3]. To begin with, the data observed from a running system are inseparable from noise. In the system identification field, researchers have attempted to design methods to address the influence of noise on both input and output data [4][5][6]. However, even when the system is successfully modeled based on the data, there is no guarantee that the fitted model is a perfect mathematical description for the control task due to uncertainties [7]. Even after an exhaustive effort to optimally identify the underlying dynamics from the data, the actual plant still requires a robust controller to compensate for the above problems.
In contrast with model-based control design, data-driven control design aims to eliminate the need for system identification and directly tackle control problems when formulating the control input. For instance, one recent study [8] employs Petersen's lemma through a linear matrix inequality formulation for data generated by linear and polynomial systems.
An earlier study [9] constructs a quadratic matrix inequality (QMI) for a linear model with noise that is implied, through the S-lemma, by another QMI composed of noisy data. In another recent study [10], the authors exploit prior knowledge of the uncertainty or disturbance to obtain a non-conservative data-driven control design. These efforts are mainly motivated by the desire to reduce the complexity and shorten the time required to build control systems while also dealing with problems that exist in the model-based control method.
The driving force of the transition to the data-driven control method is well outlined by Willems et al. [11]: a persistently exciting input and the corresponding output data can represent the behavior of a dynamic system. In light of this, two recent studies have presented a framework to design closed-loop data-driven control [12,13]. The notion of data informativity relaxes the requirement of persistently exciting data for some data-driven control problems [14]. The framework has since been extended to directly formulate optimal control methods from the observed data, such as an LQR (Linear Quadratic Regulator) [15] and MPC (Model Predictive Control) [16,17].
In order to apply control inputs to real-world systems successfully, it is important to design control inputs that ensure robustness of the system, quantified, for example, by stability margins. It is well known that a continuous-time LQR provides the gain margin [1/2, ∞) and the phase margin (−π/3, π/3) [2,18,19]. Such a property is very useful in practice and quite special compared with other model-based control designs, which are usually sensitive to model uncertainties. In the case of a discrete-time LQR (DLQR), the stability margin is investigated by Kim et al. and Lee et al. [20,21]. Because of these robustness guarantees, the LQR and DLQR have received a lot of attention over the years. A generalized stability margin, called the disc margin, has been investigated in the transfer function setting [22]. Furthermore, in the state-space setting, a disc margin is proposed for a DLQR using Lyapunov stability theory [21]. To derive the disc margin of a DLQR, the closed-loop system including the uncertainty in the input channels is represented as x(k + 1) = Ax(k) + Bψu(k), where ψ is a complex number denoting the uncertainty and u(k) is the DLQR input. Then, the region for ψ is derived using Lyapunov stability theory such that closed-loop stability is maintained. Since the region is a disc in the complex plane, it is called a disc margin. Since ψ is a complex number, the disc margin captures both the gain and phase margins simultaneously.
Recently, the idea of a data-driven LQR (DDLQR) has been presented [23] using the closed-form solution of the Riccati equation and Markov parameter estimation [24]. Given that the LQR and DLQR have received attention for their robustness guarantees, a natural question is whether the DDLQR provides a stability margin or not. In this paper, as the first contribution, a disc margin (or stability margin) of a DDLQR is derived using Lyapunov stability theory and an explicit solution of the Riccati equation. In other words, when the control u(k) is designed according to the DDLQR, a region for ψ is derived in which the closed-loop stability of x(k + 1) = Ax(k) + Bψu(k) is maintained. The result can be understood as a data-driven version of the result in Lee et al. [21]. In particular, it is shown that the disc margin of the DDLQR becomes almost the same as those of the DLQR and LQR [2,21] if the system is asymptotically null controllable [25].
It is known that designing a controller for consensus of a MAS (Multi-Agent System) is equivalent to the robust control design for one agent in the MAS when the agent has uncertainties in its input channels [26][27][28]. Hence, a control design method guaranteeing a stability margin can be used for the controller design for consensus of the MAS [26,29]. Therefore, as the second contribution of this paper, it is shown that the derived stability margin of the DDLQR can be used to design a control for the consensus problem of a multi-agent system. Simulation results show that the DDLQR indeed provides a stability margin and that DDLQR-based control leads to consensus of the MAS if the agents in the MAS are asymptotically null controllable.
Notation: I_k denotes the k × k identity matrix, and x⁺ and x denote x(k + 1) and x(k), respectively, where k is the discrete-time index. σ̄(·) and σ̲(·) represent the maximum and minimum singular values, respectively. The system x(k + 1) = Ax(k) + Bu(k) is said to be stabilizable if there exists a matrix K such that A + BK is Schur stable, i.e., all of its eigenvalues lie strictly inside the unit circle. Furthermore, the system x(k + 1) = Ax(k) + Bu(k), y(k) = Cx(k) is said to be detectable if there exists a matrix F such that A + FC is Schur stable.

Preliminaries
In this section, preliminary results required for deriving the main results are presented.

Guaranteed Stability Margin of Model-Based Discrete-Time LQR
In this subsection, the stability margin of the DLQR (Discrete-Time Linear Quadratic Regulator) is briefly reviewed. To this end, consider the discrete-time LTI (Linear Time-Invariant) system (1): x⁺ = Ax + Bu, y = Cx, where A ∈ R^{n×n} is the system matrix, B ∈ R^{n×p} is the input matrix, and C ∈ R^{q×n} is the output matrix. Furthermore, x ∈ R^n, u ∈ R^p, and y ∈ R^q are the state, control input, and output, respectively. It is assumed that (A, B) is stabilizable. Then, the DLQR is derived by minimizing the infinite-horizon quadratic cost function (2), in which the output weighting Q and the control input weighting R are positive definite. It is assumed that the pair (A, Q) is detectable. The minimizing DLQR is the state feedback (3): u = −Kx, where the optimal feedback gain K is given by (4) in terms of the positive definite solution P of the discrete-time algebraic Riccati equation (DARE) (5). In contrast with other model-based controls, it is well known that the DLQR provides a stability margin, which is important for practical applications [2,21]. To be specific, if there is input channel uncertainty in the real system such as x⁺ = Ax + Bψu, where ψ denotes the input channel uncertainty, the stability of the closed-loop system with a DLQR is maintained to some extent although the DLQR is designed for x⁺ = Ax + Bu. This is formally presented in the following theorem.
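For orientation, the standard textbook form of the output-weighted DLQR problem is collected below. The exact expressions of the cost (2), the gain (4), and the Riccati equation (5) are not reproduced in this text, so the block is a sketch under the assumption that they take this usual form; with C = I it reduces to the familiar state-weighted DLQR.

```latex
\begin{align}
x(k+1) &= A x(k) + B u(k), \qquad y(k) = C x(k), \\
J &= \sum_{k=0}^{\infty} \left( y(k)^{\top} Q\, y(k) + u(k)^{\top} R\, u(k) \right), \\
u(k) &= -K x(k), \\
K &= \left( R + B^{\top} P B \right)^{-1} B^{\top} P A, \\
P &= A^{\top} P A - A^{\top} P B \left( R + B^{\top} P B \right)^{-1} B^{\top} P A + C^{\top} Q C .
\end{align}
```

The quantity B⊤PB appearing in the gain is the same matrix whose largest singular value δ determines the size of the disc margin.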

Theorem 1 ([21]). Consider the closed-loop system consisting of (1) and (3) with the input channel uncertainty Ψ, i.e., (6): x⁺ = Ax − BΨKx, where Ψ = diag{ψ_1, . . ., ψ_p} ∈ C^{p×p}. The closed-loop system (6) is asymptotically stable if, for all i, ψ_i belongs to the set Ω_i in (7), called the disc margin, where ψ_i = σ_i + jω_i and j denotes the imaginary unit, i.e., j = √(−1). In particular, if the matrix R is diagonal such that R = diag{r_1, . . ., r_p}, the disc margin is given by (8), where δ := σ̄(B⊤PB).
Note that ψ_i can be interpreted as the uncertainty in the ith control input channel and is used only for analysis. In other words, the LQR gain K is computed using the system model (A, B), and the applied control input is u = −Kx. Then, in theory, the closed-loop system becomes x⁺ = Ax − BKx. However, in this paper, the stability of x⁺ = Ax − BΨKx is analyzed in order to investigate the stability margin. This method is used again later for the stability margin of the data-driven LQR. Since Ω_i can be viewed as the set of ψ_i such that closed-loop stability of (6) is maintained, it captures both the gain and phase margins simultaneously. For instance, the projection of the disc Ω_i in the complex plane onto the real axis corresponds to the gain margin. Theorem 1 means that the closed-loop stability is guaranteed with the DLQR if ψ_i belongs to the disc margin (i.e., stability margin) in (7). Note that the DLQR is a model-based design, i.e., it needs the system information (A, B).
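The analysis idea above can be tried on a single-input toy problem. The sketch below is illustrative only: the scalar plant and the helper names `scalar_dare` and `guaranteed_disc` are hypothetical, and the disc used is the standard Lyapunov-based sufficient condition for a single input with V = x⊤Px, namely (r + δ)(1 − 2σ) + δ(σ² + ω²) < 0, which is a disc with center (r + δ)/δ and radius √(r(r + δ))/δ. It is assumed, not confirmed by this text, that (7) and (8) have this form.

```python
import cmath
import math

def scalar_dare(a, b, q, r, iters=5000):
    """Solve the scalar discrete-time Riccati equation by fixed-point iteration."""
    p = q
    for _ in range(iters):
        p = a * a * p - (a * b * p) ** 2 / (r + b * b * p) + q
    return p

def guaranteed_disc(b, p, r):
    """Center and radius of the guaranteed disc for one input channel (assumed form)."""
    delta = b * b * p                      # delta = B'PB, a scalar here
    center = (r + delta) / delta
    radius = math.sqrt(r * (r + delta)) / delta
    return center, radius

a, b, q, r = 1.0, 1.0, 0.8, 1.3            # hypothetical marginally stable scalar plant
p = scalar_dare(a, b, q, r)
k = a * b * p / (r + b * b * p)            # DLQR gain for u = -k x
center, radius = guaranteed_disc(b, p, r)

def closed_loop_pole(psi):
    """Pole of x+ = (a - b*psi*k) x for a complex input-channel uncertainty psi."""
    return a - b * psi * k

# Sample a circle just inside the guaranteed disc and test every point.
samples = [center + 0.999 * radius * cmath.exp(2j * math.pi * i / 360) for i in range(360)]
all_stable = all(abs(closed_loop_pole(psi)) < 1.0 for psi in samples)
print(f"center={center:.4f} radius={radius:.4f} all sampled points stable: {all_stable}")
```

The gain k is designed for the nominal plant only; ψ enters solely in `closed_loop_pole`, which mirrors the remark that the uncertainty is used for analysis and not for design.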

Data-Driven LQR (DDLQR)
This subsection summarizes the idea of the data-driven LQR [23], which is the main reference for the proposed result. In data-driven control design, the most important problem is how to obtain the feedback gain using measured data from the system rather than a system model (A, B). The approach in [23] relies mainly on a model-based closed-form solution to the DARE (5) derived by Lewis and by Furuta and Wongsaisuwan [30,31], given by (9), where N is sufficiently large and denotes the amount of measured data from the system. Then, considering (4) and (9), the DLQR gain can be equivalently written as (10), which is expressed in terms of the matrices M, O, and S and the weighting matrices R_N and Q_N. The main result of da Silva et al. [23] is a data-driven version of (10), which is explained below. In view of the structure of the gain (10), since R_N and Q_N are weighting matrices, it is possible to design a data-driven LQR if M, O, and S can be estimated using the measured data from the system. To this end, consider the Hankel matrix of a signal z(k), in which L_D is a constant satisfying the rank condition L_D ≥ q(N + 1) [12,32]. Furthermore, let U_p, U_f, Y_p, and Y_f be matrices comprised of the historical input and output data, where it is assumed that (N + 1)q ≥ n is satisfied [23]. In order to construct the matrices U_p, U_f, Y_p, and Y_f, arbitrary input signals u are chosen and applied to the system, and the corresponding outputs y are measured. From the main results in Overschee and de Moor and Lim et al. [33,34], a predictor of the system output can be written as (11), in which F is a solution of A^{N+1} + FO = 0. Then, for the data-driven design, Ŝ (i.e., the estimate of S) is the last pN columns of Ŵ, which is the least-squares solution of Equation (11) given in (12), where Φ† stands for the pseudoinverse of Φ. Using Ŝ, the estimate Ô of O can be obtained. Therefore, the data-driven LQR gain K_D is given by (15), where M̂ denotes the estimate of M, and Ô_{q−} is the matrix obtained by removing the first q rows of Ô. It should be noted that as more and more data are used, i.e., N → ∞, the gain K_D given in (15) converges to the gain K_DLQR, which is the optimal control gain for the DLQR problem [23,30]. Hence, strictly speaking, there is no guarantee that K_D is optimal when N is not sufficiently large. In what follows, we assume that a sufficiently large amount of data is used to compute K_D so that the computed gain, called the data-driven LQR gain, is optimal. Note that none of the quantities in (15) depends on the system model, i.e., K_D is computed without using model information, which means that the gain K_D is a data-driven LQR gain.
In contrast with the conventional LQR, which is based on the system model (A, B), the DDLQR design computes the feedback gain K D relying only on the measured data from the system under the assumption that the amount of data N is sufficiently large.
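The data matrices used above are built from Hankel matrices of the recorded signals. The sketch below shows one common construction for a scalar signal. The function name `hankel`, the entry convention H[i][j] = z(i + j), and the split of the rows into "past" and "future" halves are assumptions borrowed from the usual subspace-identification convention, not details confirmed by this text.

```python
def hankel(signal, depth, columns):
    """Hankel matrix of a scalar signal whose (i, j) entry is signal[i + j]."""
    needed = depth + columns - 1
    if len(signal) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(signal)}")
    return [[signal[i + j] for j in range(columns)] for i in range(depth)]

# Hypothetical recorded input: 7 samples give a depth-4 matrix with 4 columns.
u = [1, -1, -1, 1, 1, -1, 1]
H = hankel(u, depth=4, columns=4)

# Assumed convention: the first half of the rows is the "past" block,
# the second half is the "future" block.
U_p, U_f = H[:2], H[2:]
for row in H:
    print(row)
```

With a depth of 2N rows and L_D columns, this construction consumes 2N + L_D − 1 samples, which matches the length of the excitation experiment used in the simulation study.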

Guaranteed Stability Margin of Data-Driven LQR
Considering the information provided in the previous section, it is natural to ask whether the DDLQR for (1) provides a stability margin or not.
To answer this question, consider the closed-loop system (16) with the DDLQR gain K_D in (15), x⁺ = Ax − BΨK_D x, where Ψ = diag{ψ_1, . . ., ψ_p} ∈ C^{p×p} again denotes the uncertainties in the control input channels. It turns out that the DDLQR guarantees a stability margin similar to that of the model-based design of Lee et al. [21].
Theorem 2. The closed-loop system (16) with input channel uncertainty Ψ = diag(ψ_i) is asymptotically stable if, for all i, ψ_i belongs to the set Ω_i (i.e., the disc margin) given in (17).
In particular, if the matrix R is diagonal, i.e., R = diag{r_1, . . ., r_p}, the disc margin is given by (18).
Proof. See Appendix A.
Since ψ_i is a complex number, the disc margin in (17) (or (18)) represents both the gain and phase margins simultaneously. In view of Theorem 2, the DDLQR for system (1) also results in a stability margin.
It is well known that the gain and phase margins of a continuous-time LQR (CLQR) are given by [1/2, ∞) and (−π/3, π/3), respectively. In the case of the discrete-time LQR, it is proved that the same gain and phase margins can be derived if the discrete-time system is asymptotically null controllable with bounded controls (ANCBC), which means that the system (A, B) is controllable and the eigenvalues of A are inside or on the unit circle [21]. The following theorem states that the guaranteed gain and phase margins of the DDLQR are also [1/2, ∞) and (−π/3, π/3), respectively, if the system is ANCBC. This means that the stability margin of a DDLQR is the same as that of a DLQR or CLQR if the system is ANCBC.
Theorem 3. Suppose that (A, B) is ANCBC, and that the DDLQR gain denoted by K_{D,ϵ} is obtained according to (15) with weighting matrices Q = ϵI and R = diag{r_1, · · · , r_p}, where ϵ is a small positive constant. Consider the set Ω_{i,ϵ} in (19), which corresponds to the set Ω_i in (18) with γ = 0. Then, as ϵ converges to zero, the set Ω_{i,ϵ} approaches the set H := {s ∈ C : Re(s) > 1/2}.
Since the proof of the theorem is quite similar to that of the corresponding theorem of Lee et al. [21], it is omitted here. However, it should be noted that the most important points in the proof are that lim_{ϵ→0} P_ϵ = 0, where P_ϵ is the solution of the discrete-time Riccati equation with Q = ϵI; that δ_ϵ = σ̄(B⊤P_ϵB) consequently converges to zero as ϵ → 0 if the system is ANCBC [25]; and that the leftmost point of the disc margin converges to 1/2 as ϵ → 0. The limit lim_{ϵ→0} P_ϵ = 0 holds in view of the structure of the explicit solution P in (9).
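The limiting behavior can be illustrated numerically on the simplest ANCBC plant, the scalar integrator x⁺ = x + u. The sketch below is not the DDLQR computation: it uses the model-based scalar Riccati solution as a stand-in for the large-N limit, and it assumes that the disc (with γ = 0) has the standard single-input form whose real-axis end points are 1/(1 + s) and 1/(1 − s) with s = √(r/(r + δ)). The function name `gain_margin_integrator` is hypothetical.

```python
import math

def gain_margin_integrator(eps, r=1.0):
    """Real-axis end points of the guaranteed disc for x+ = x + u, Q = eps, R = r."""
    # Positive root of p^2 - eps*p - eps*r = 0 (scalar Riccati equation, A = B = 1).
    p = (eps + math.sqrt(eps * eps + 4.0 * eps * r)) / 2.0
    delta = p                              # delta = B'PB with B = 1
    s = math.sqrt(r / (r + delta))
    return 1.0 / (1.0 + s), 1.0 / (1.0 - s)

eps_values = (1.0, 1e-2, 1e-4, 1e-6)
margins = [gain_margin_integrator(eps) for eps in eps_values]
for eps, (left, right) in zip(eps_values, margins):
    print(f"eps={eps:g}  leftmost={left:.5f}  rightmost={right:.1f}")
```

As ϵ shrinks, P_ϵ and hence δ_ϵ shrink, the leftmost point decreases toward 1/2, and the rightmost point grows without bound, which is the half-plane limit stated in Theorem 3.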
In view of the derivation in this section, the DDLQR also provides a stability margin. The next section presents how to use the stability margin in the control design for a consensus problem.

Application of Disc Margin to a Consensus Problem
This section presents how the disc margin described in the previous section can be used in controller design for a consensus problem.

Consensus Problem
Consider a MAS (Multi-Agent System) consisting of N_a identical discrete-time ANCBC systems written as (20): x_i⁺ = Ax_i + Bu_i, i = 1, . . ., N_a, where x_i ∈ R^n and u_i ∈ R^p stand for the state and input of the ith agent in the MAS, A ∈ R^{n×n}, and B ∈ R^{n×p}. The initial conditions of the agents are different, and the agents communicate with each other by exchanging state information between neighboring agents. It is assumed that the communication (i.e., the data exchange) is modeled by a graph Laplacian matrix L ∈ R^{N_a×N_a} [21,27,28,35]. The Laplacian matrix is constructed from the degree of each agent and the adjacency matrix, which indicates whether pairs of agents are adjacent or not in the graph. The diagonal element of the Laplacian matrix, ℓ_ii = −Σ_{j≠i} ℓ_ij, expresses the degree of agent i. For i ≠ j, ℓ_ij = −1 if agent i receives information from agent j; otherwise, ℓ_ij = 0.
The following facts about the Laplacian are instrumental in solving the consensus problem of the MAS:
• Fact 1. The eigenvalues of L lie in the closed disc in the complex plane whose center and radius are (N_a − 1, 0) and N_a − 1, respectively [36];
• Fact 2. At least one eigenvalue of the Laplacian matrix is located at the origin of the complex plane, and the associated eigenvector is 1_{N_a}, the vector of ones [27];
• Fact 3. The zero eigenvalue is simple if the communication graph includes a directed spanning tree, which means that at least one node in the graph has a directed information path to all the other nodes in the graph [27,37].
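The Laplacian construction and the first two facts can be checked on a small example. The sketch below builds L from a list of directed edges using the convention stated above (ℓ_ij = −1 when agent i receives information from agent j). The function name `laplacian` and the four-agent directed cycle are hypothetical choices for illustration.

```python
def laplacian(n_agents, edges):
    """Graph Laplacian; an edge (i, j) means agent i receives information from agent j."""
    L = [[0] * n_agents for _ in range(n_agents)]
    for i, j in edges:
        if i != j:
            L[i][j] = -1
    for i in range(n_agents):
        # Diagonal entry = in-degree of agent i = minus the sum of the off-diagonal row entries.
        L[i][i] = -sum(L[i][j] for j in range(n_agents) if j != i)
    return L

edges = [(0, 3), (1, 0), (2, 1), (3, 2)]   # hypothetical directed 4-cycle
L = laplacian(4, edges)

# Fact 2: every row sums to zero, so the vector of ones is an eigenvector for eigenvalue 0.
row_sums = [sum(row) for row in L]

# Fact 1 via Gershgorin: each row disc has center l_ii and radius l_ii <= N_a - 1,
# so it sits inside the disc with center (N_a - 1, 0) and radius N_a - 1.
gershgorin_ok = all(0 <= L[i][i] <= len(L) - 1 for i in range(len(L)))
print(L, row_sums, gershgorin_ok)
```

For this cycle, the graph contains a directed spanning tree, so by Fact 3 the zero eigenvalue is simple; its spectrum is {0, 1 + j, 1 − j, 2}.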
For such a MAS, the consensus problem is to design a control achieving the consensus condition (21), i.e., agreement of the states of all agents as k → ∞. The control of the ith agent is built from K_c, the control gain for consensus; ℓ_ij, the (i, j)th element of the Laplacian matrix L; and N_i, the set of the neighbors of the ith agent. Note that the same control gain K_c is used for each agent, and that the consensus problem requires only the design of the gain K_c achieving (21). Considering these properties from algebraic graph theory and the definition of the consensus problem, the following assumption is necessary in order to solve the consensus problem.

Assumption 1. A directed spanning tree is included in the communication graph modeling the information exchange of the MAS under consideration.
In designing K_c for consensus of the MAS (20), Theorem 3 can be employed. Hence, only ANCBC agents are considered in this paper.
Assumption 2. The agent dynamics (A, B) in (20) are ANCBC.

DDLQR-Based Consensus Algorithm
This subsection describes how the derived stability margin of the DDLQR can be effectively applied to the consensus problem for a multi-agent system (MAS).
The design of the control gain K c for consensus of the MAS relies on so-called simultaneous stabilization.
Theorem 4 ([21,26,29,38]). Suppose that Assumptions 1 and 2 hold. The MAS (20) reaches consensus if the control gain K_c is designed such that x⁺ = (A − λ_j BK_c)x is asymptotically stable for all j = 1, · · · , N_a − 1, where λ_j denotes the jth nonzero eigenvalue of the graph Laplacian matrix L modeling the information exchange.
Note that the eigenvalues λ_j of the Laplacian matrix are unknown in a distributed control setting. However, even though λ_j is unknown, consensus is reached if K_c is designed according to the DDLQR and all nonzero λ_j can be made to belong to the stability margin of K_c [21,26,29].
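The simultaneous-stabilization condition can be illustrated by testing whether every nonzero Laplacian eigenvalue falls inside the guaranteed disc. The sketch below makes several assumptions that should be read as such: the agent is the scalar integrator x⁺ = x + u, the model-based scalar Riccati solution stands in for the DDLQR gain, the disc has the standard single-input form (center (r + δ)/δ, radius √(r(r + δ))/δ), and the spectrum is that of a hypothetical directed 4-cycle.

```python
import math

def guaranteed_disc(eps, r=1.0):
    """Center and radius of the assumed guaranteed disc for x+ = x + u, Q = eps, R = r."""
    p = (eps + math.sqrt(eps * eps + 4.0 * eps * r)) / 2.0   # scalar Riccati solution
    delta = p                                                 # B'PB with B = 1
    return (r + delta) / delta, math.sqrt(r * (r + delta)) / delta

def all_nonzero_eigs_in_disc(eigs, eps, tol=1e-9):
    """True if every nonzero eigenvalue lies strictly inside the guaranteed disc."""
    center, radius = guaranteed_disc(eps)
    return all(abs(lam - center) < radius for lam in eigs if abs(lam) > tol)

# Laplacian spectrum of a directed 4-cycle: 1 - exp(2*pi*j*m/4), m = 0..3.
eigs = [0j, 1 + 1j, 1 - 1j, 2 + 0j]
for eps in (10.0, 0.01):
    print(f"eps={eps:g}: all nonzero eigenvalues inside the disc -> "
          f"{all_nonzero_eigs_in_disc(eigs, eps)}")
```

With a large state weight the disc is too small to contain the complex pair 1 ± j, while a small weight enlarges it enough to cover the whole nonzero spectrum.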
In order to design the control for the consensus, it is assumed that the real part of the smallest nonzero eigenvalue of the Laplacian matrix L is known. Note that there are existing methods in the literature to estimate the second-smallest eigenvalue of the Laplacian matrix [39,40]. Let ρ be this real part. Then, the real part of the smallest nonzero eigenvalue of L′ := (1/ρ)L is 1. As a result, the nonzero eigenvalues of L′ belong to the area Θ in Figure 1 in light of Fact 1, Fact 3, and the scaling by ρ. In view of Theorem 3, the disc margin becomes larger as ϵ gets smaller. Thanks to this tunable disc margin Ω_ϵ, there exists ϵ* such that Θ ⊂ Ω_ϵ for any ϵ < ϵ*. Therefore, the control gain K_c for consensus can be designed as K_c = (1/ρ)K_{D,ϵ}. In other words, the control for the consensus is given by (23). Then, the entire MAS can be written as (24), where ξ denotes the stacked state of all agents and ⊗ denotes the Kronecker product. Using the coordinate transformation v := (V⁻¹ ⊗ I_n)ξ, where V is the matrix whose columns are the eigenvectors of L, it is possible to convert (24) into (25). For a detailed explanation, see Lee et al. [21]. Furthermore, it can be shown that all nonzero eigenvalues λ_1, · · · , λ_{N_a−1} of L′ belong to Θ in Figure 1 due to the scaling constant ρ, and Θ ⊂ Ω_ϵ is satisfied due to the sufficiently small ϵ. In other words, since all of these eigenvalues belong to the disc margin of the DDLQR, the condition of Theorem 4 holds. Hence, the consensus problem is solved owing to Theorem 3, Theorem 4, and the disc margin of the DDLQR. The design procedure for the DDLQR-based consensus controller can be summarized as follows.
1. Estimate the real part of the second-smallest eigenvalue of the graph Laplacian L and set the value as ρ;
2. Collect input, state, and output data;
3. Choose a sufficiently small ϵ such that the imaginary part of −ω_{Ω_ϵ} is smaller than that of −ω_Θ, which means Θ ⊂ Ω_ϵ;
4. Calculate K_c = (1/ρ)K_{D,ϵ}, and let each agent implement the controller (23).
Note that the proposed DDLQR-based consensus controller is a data-driven distributed controller in the sense that it uses neither the agent's mathematical model nor any global information of the graph.
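The procedure above can be sketched end to end on a toy network. Everything specific in the block is an assumption made for illustration: the agents are scalar integrators x_i⁺ = x_i + u_i, the gain is the model-based scalar LQR gain used as a stand-in for K_{D,ϵ} (step 2, the data collection, is therefore skipped), the graph is a directed 4-cycle with ρ = 1, and the control law is taken as u_i = −K_c Σ_j ℓ_ij x_j, a sign convention chosen to be consistent with u = −Kx.

```python
import math

def integrator_gain(eps, r=1.0):
    """Scalar LQR gain K = (R + B'PB)^-1 B'PA for A = B = 1, Q = eps, R = r."""
    p = (eps + math.sqrt(eps * eps + 4.0 * eps * r)) / 2.0
    return p / (r + p)

# Laplacian of a hypothetical directed 4-cycle; its nonzero eigenvalues are 1+j, 1-j, 2.
L = [[1, 0, 0, -1],
     [-1, 1, 0, 0],
     [0, -1, 1, 0],
     [0, 0, -1, 1]]
rho = 1.0                                   # step 1: smallest nonzero real part
k_c = integrator_gain(eps=0.01) / rho       # steps 3 and 4: small eps, then scale by 1/rho

x = [4.0, -1.0, 2.5, 0.0]                   # different initial conditions
n = len(x)
for _ in range(400):
    # Each agent only uses its own state and the states it receives from its neighbors.
    u = [-k_c * sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    x = [x[i] + u[i] for i in range(n)]

spread = max(x) - min(x)
print(f"k_c={k_c:.6f}  final states={x}  spread={spread:.2e}")
```

Because this particular graph is balanced, the sum of the states is preserved at every step, so the agents agree on the average of the initial conditions, here 1.375.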

Simulation Study
In this section, a numerical example of the stability margin of the DDLQR is shown and its application to the consensus problem is presented.

Numerical Example for the Disc Margin of Marginally Stable Systems with DDLQR
Since the DDLQR is heavily dependent on the amount of data, the effect of N on the accuracy of the DDLQR gain is presented first, viewing the DLQR gain as the ground truth. For this purpose, two cases with N = 3 and N = 30 are considered, and L_D is chosen as L_D = 3q(N + 1). In addition, the gain margin of the DDLQR is presented by considering several uncertainty levels for an ANCBC system. To obtain rich data with which the DDLQR gain can be computed, a PRBS (Pseudo Random Binary Sequence) is applied to the system for (2N + L_D − 1) steps, and the resulting state, input, and output data depicted in Figures 2 and 3 are stored. Then, the weighting matrices of the DDLQR are set to Q = 0.8I_n and R = 1.3I_m. The estimate Ŝ is obtained by solving (12) using the stored input and output data. In addition, the computed Ŝ is employed to calculate Ô and M̂ together with the stored data. Finally, the DDLQR gain is obtained from (10).
Next, the derived stability margin is validated for various input channel uncertainties with N = 30. The minimum and maximum real parts of the gain margin (19) for the system are 0.5194 and 13.4183, respectively. Five input channel uncertainty levels p ∈ {−5.0, 0.6, 7.0, 13.0, 18.0}, shown in Figure 4, are considered. In other words, the uncertainty p multiplies the input channel of the system, where B ∈ R^{2×1}. Figure 5 shows that the closed-loop system under the DDLQR is asymptotically stable when the input channel uncertainty p belongs to the disc margin depicted by the red circle in Figure 4. Conversely, it can be seen that the closed-loop system becomes unstable when the uncertainty p is not in the stability margin.
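The classification of the five test points follows directly from the reported gain margin. The short sketch below only restates the numbers given above; the function name `inside_gain_margin` is hypothetical. Note that the disc margin is a sufficient condition, so a point outside the interval is not guaranteed to be unstable in general, although in this experiment the two outside points are reported to destabilize the loop.

```python
GAIN_MARGIN = (0.5194, 13.4183)            # real-axis end points reported for N = 30
TEST_POINTS = [-5.0, 0.6, 7.0, 13.0, 18.0]

def inside_gain_margin(p, margin=GAIN_MARGIN):
    """True if the real uncertainty p lies strictly inside the guaranteed gain margin."""
    low, high = margin
    return low < p < high

predicted = {p: inside_gain_margin(p) for p in TEST_POINTS}
for p, ok in predicted.items():
    print(f"p = {p:5.1f}: {'inside, stability guaranteed' if ok else 'outside, no guarantee'}")
```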

Application of the Disc Margin to the Consensus Problem
To present a DDLQR-based control design for consensus of a MAS consisting of ANCBC agents, consider an ANCBC agent with x_i ∈ R² and u_i ∈ R. It is assumed that the agents communicate with each other using the communication graph in Figure 6. The Laplacian matrix L = [ℓ_ij] of the directed graph in Figure 6 is constructed as defined above.
To collect the data required for calculating the DDLQR gain, a PRBS is applied to an agent for (2N + L_D − 1) steps, and all input, output, and state data are saved. Thereafter, the procedure for calculating the DDLQR gain is similar to that of the previous example. One important difference is that an arbitrarily large disc margin Ω_ϵ can be obtained by setting the weighting matrices as Q = ϵI and R = I with sufficiently small ϵ. As can be seen in Figure 7, the disc margin gets larger as ϵ becomes smaller. To satisfy the requirement that the disc margin Ω_ϵ has to include Θ, this simulation sets ϵ = 0.01 and ρ = 0.5 for scaling. The resulting sets Ω_ϵ and Θ are shown in Figure 1. Figure 8a shows that the proposed consensus control (23) leads to consensus of the MAS consisting of the nominal ANCBC agents. Since the proposed data-driven consensus control is based on the stability margin, it is expected that robust consensus is achieved. To validate this, the proposed consensus control is applied to a MAS in which each agent has a different input channel uncertainty p_i, with [p_1, . . ., p_5] = [15, 14.9, 14, 13.5, 12.5]. Figure 8b shows that the designed consensus control based on the stability margin of the DDLQR results in consensus of the MAS even when there are uncertainties in the input channels. In other words, although the agents have different uncertainties in their input channels, as in (6), consensus is achieved by the proposed controller as long as the uncertainties are in the disc margin.
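Both simulation examples excite the plant with a PRBS during data collection. A minimal sketch of one way to generate such a signal is given below. The generator polynomial x⁷ + x⁶ + 1 (PRBS7), the ±1 amplitude, the function name `prbs7`, and the choice q = 1 are all assumptions, since the text does not specify which PRBS is used; only the experiment length 2N + L_D − 1 with L_D = 3q(N + 1) is taken from the text.

```python
def prbs7(n_samples, seed=0x01):
    """PRBS7 sequence (polynomial x^7 + x^6 + 1, period 127) mapped to +1/-1."""
    if not 0 < seed < 128:
        raise ValueError("seed must be a non-zero 7-bit value")
    state = seed
    out = []
    for _ in range(n_samples):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of the two tapped bits
        state = ((state << 1) | new_bit) & 0x7F       # shift and keep 7 bits
        out.append(1.0 if new_bit else -1.0)
    return out

N, q = 30, 1                        # q = 1 output channel is an assumption
L_D = 3 * q * (N + 1)               # number of Hankel columns, as chosen in the text
u = prbs7(2 * N + L_D - 1)          # one excitation experiment
print(len(u), u[:10])
```

A longer shift register would be needed if the experiment length exceeded the 127-sample period by a wide margin, since the sequence repeats after one period.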

Conclusions
In this paper, it is shown that the data-driven LQR (Linear Quadratic Regulator) provides a stability margin. To achieve this, the closed-loop system is written such that the input channel is affected by uncertainties. Then, Lyapunov stability theory is used to investigate how much the uncertainty can vary while closed-loop stability is maintained. Since the uncertainty is expressed as a complex number, the set of possible uncertainties guaranteeing the closed-loop stability is given by a disc in the complex plane, which is the stability margin of the data-driven LQR. Furthermore, it is shown that the derived disc margin can be used to design a control for consensus of a MAS consisting of asymptotically null controllable linear agents.
Future work includes a similar analysis of data-driven MPC (Model Predictive Control). Since the stability analysis of MPC is done a posteriori, after computing the control input, using the optimal cost in MPC, it might be possible to apply the proposed method if the prediction horizon is short. Furthermore, it would be beneficial to devise data-driven disturbance rejection methods.

Appendix A
It has to be noted that all terms on the left-hand side of (A8) are diagonal matrices, which means that inequality (A8) can be understood element-wise. Hence, each diagonal element of (A8) yields a condition on σ_i + jω_i, the ith diagonal element of Ψ. This proves the first part of the theorem. The case where R is diagonal is simply an application of the first part and is omitted here.
For the numerical example with N = 3 and N = 30, the computed DDLQR gains are K_{D_3} = [0.2319 0.2039 0.0046 0.0057] and K_{D_30} = [0.7247 0.6993 0.0299 0.0311], where K_{D_i} denotes K_D computed with N = i. For comparison, the model-based DLQR gain is K_DLQR = [0.7247 0.6993 0.0299 0.0311]. In view of K_{D_3}, K_{D_30}, and K_DLQR, the DDLQR gain K_{D_i} converges to the model-based DLQR gain K_DLQR as i increases. In other words, the DDLQR gain converges to the DLQR gain as the amount of stored data increases. Hence, with sufficiently large N, the DDLQR gain K_D coincides with the DLQR gain K_DLQR.
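The convergence claim can be quantified from the reported numbers alone. The sketch below computes the relative Euclidean error of each data-driven gain with respect to the model-based gain; the gain entries are copied from the text in the order given, and the function name `relative_error` is hypothetical.

```python
import math

K_D3 = [0.2319, 0.2039, 0.0046, 0.0057]     # DDLQR gain with N = 3
K_D30 = [0.7247, 0.6993, 0.0299, 0.0311]    # DDLQR gain with N = 30
K_DLQR = [0.7247, 0.6993, 0.0299, 0.0311]   # model-based DLQR gain

def relative_error(k_est, k_ref):
    """Euclidean norm of the gain error divided by the norm of the reference gain."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(k_est, k_ref)))
    den = math.sqrt(sum(b * b for b in k_ref))
    return num / den

err_3 = relative_error(K_D3, K_DLQR)
err_30 = relative_error(K_D30, K_DLQR)
print(f"relative error: N=3 -> {err_3:.3f}, N=30 -> {err_30:.3f}")
```

At the four-decimal precision reported, the N = 30 gain is indistinguishable from the DLQR gain, while the N = 3 gain is off by roughly two thirds of the reference norm.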

Figure 4. Gain margin of the DDLQR and uncertainty test points.

Figure 5. DDLQR simulation result with applied input channel uncertainty.

Figure 6. Graph for information exchange.

Figure 7. Sufficiently large disc margin with sufficiently small ϵ.