Abstract
In this paper, we are interested in the numerical aspects of a class of generalized Riccati difference equations arising in linear quadratic (LQ) stochastic difference games. More specifically, we address the numerical computation of the stabilizing solutions of this class of nonlinear difference equations. We propose an iterative deterministic algorithm for the computation of such a global solution. The performance of the proposed algorithm is illustrated with numerical examples.
Keywords:
stochastic Riccati equations; stochastic control; iterative computation; deterministic approach
MSC:
93E03; 93C05; 93E20; 60J27; 60J10; 91A05; 91A15
1. Introduction
In this paper, we address the problem of the numerical computation of the stabilizing solutions of a class of generalized Riccati difference equations. The considered nonlinear matrix equation occurs in connection with zero-sum LQ stochastic difference game control problems (see [] for more details on this aspect). One of the particularities of such equations lies in the sign indefiniteness of their quadratic terms. This sign indefiniteness makes the characterization (as well as the numerical computation) of global solutions to such nonlinear matrix difference equations far more challenging than in the sign-definite case. Even though some interesting results have already been reported in the literature (see [,] and the references therein), there are still substantial open problems in this field.
In [], we addressed some theoretical aspects related to the nonlinear difference equations under consideration. The present paper can be viewed as the numerical counterpart of []. We propose a globally convergent iterative algorithm for the computation of the stabilizing solutions to this class of Riccati equations. To the best of the authors’ knowledge, the numerical algorithms developed in the literature for stochastic Riccati equations are mainly based on stochastic approaches, which transform the original problem into a sequence of coupled stochastic Riccati equations (see [,] and the references therein) whose numerical solution again relies on iterative procedures. One of the most remarkable features of the proposed algorithm is its deterministic nature, in the sense that at each main iteration, one has to solve a system of uncoupled deterministic Riccati equations. This allows us to use direct methods (invariant or deflating subspace-based methods; see []) for the numerical solution of such deterministic equations. We believe that this fundamental difference in the construction of what we call deterministic and stochastic algorithms has an important impact on the computation time. This is illustrated via numerical experiments.
We mention here that in [], we proposed a deterministic iterative algorithm for the numerical computation of the stabilizing solutions to a class of generalized Riccati equations related to the so-called continuous-time, full-information stochastic control. The discrete-time counterpart of this type of Riccati equation is a particular case of the more general class of Riccati equations considered in the present paper. We have recently shown (see []) that the proof of the existence and uniqueness of the stabilizing solution for this more general class of Riccati equations presents substantial differences when compared with the full-information -type Riccati equations, even though we followed a similar philosophy in the proof procedure. We believe that we have a similar situation from the numerical computation point of view. The results reported in the present paper are more general and contain substantial differences when compared with [].
This paper is organized as follows. In Section 2, we describe the problem that we address. In Section 3, we introduce the main results of the paper. Some numerical experiments are included in Section 4.
Notations: , where is a fixed natural number. stands for the transpose of the matrix A, and denotes the trace of a matrix A. The notation (), where X and Y are symmetric matrices, means that is positive semi-definite (positive definite). In block matrices, ★ indicates symmetric terms, where . The expression is equivalent to , while is equivalent to . Consider the following space of matrices: . In the case where , we shall write instead of .
We introduce the following convention of notations:
- If and , then , where , , .
- .
- If with , , then.
As usual, denotes the subspace of symmetric matrices of a size , and . is a finite-dimensional real Hilbert space with respect to the inner product:
for all . Throughout this paper, stands for the mathematical expectation and denotes the conditional expectation with respect to the event .
2. Problem Setting
2.1. Problem Description
Consider the following nonlinear difference equation in the space :
where with an unknown function .
Here, () are defined by
where for all . In (2) , , and . Regarding the coefficients of Equation (2), we make the following assumption:
- (H1)
- (a)
- , (), , , and for all are periodic matrix-valued sequences of a period . with is also assumed to be a periodic matrix-valued sequence of a period .
- (b)
- For each , is a strong nondegenerate stochastic matrix (i.e., ,, for all ).
The discrete-time backward nonlinear equation (Equation (2)) will be called a generalized discrete-time Riccati equation (GDTRE) in the rest of this paper.
The GDTRE (Equation (2)) plays a key role in the solution of a zero-sum LQ stochastic difference game control problem described by the controlled system
and the quadratic performance criterion
where is the solution to the initial value problem (IVP) (Equation (7)), , and . In the first equation (Equation (7)), , is a sequence of independent random vectors, and the triple is a time non-homogeneous Markov chain defined in a given probability space with the finite states set and the sequence of transition probability matrices . Regarding processes and , the following assumptions are made:
- (H2)
- is a sequence of independent random vectors with the following properties: , , and , with being the identity matrix of a size r.
- (H3)
- (a)
- For each , the algebra is independent of the algebra , where and .
- (b)
- for all .
The following assumption regarding the weight matrices and is made:
- (H4)
- For each , we have
Let
In [], we considered two different types of admissible strategies, namely the full-state feedback and full-information feedback strategies. We succeeded in showing that for both strategies, the solution to the LQ game relies on the unique bounded and stabilizing solution to the GDTRE (Equation (2)) satisfying a sign condition of the form
for all , , , and being constants.
The sign conditions in Equations (13) and (14) mean that the quadratic part of the GDTRE (Equation (2)) is of an indefinite sign. This sign indefiniteness makes the characterization and the numerical computation of the global solutions to the GDTRE (Equation (2)) much more intricate than in the sign-definite case.
Remark 1.
We derived in [] the conditions for the existence and uniqueness of the stabilizing solution to Equation (2). In the present paper, we are interested in the numerical aspects of the GDTRE (Equation (2)). Our objective here is to propose a globally convergent algorithm for the computation of the unique stabilizing solution to Equation (2) with the sign (indefinite) conditions in Equations (13) and (14). We will propose an iterative deterministic algorithm which is based on the numerical computation of the bounded and stabilizing solutions of a sequence of Riccati difference equations arising in the deterministic framework. In order to accomplish this, we consider the following sequence of uncoupled Riccati difference equations (which are specific to the deterministic framework):
where
By taking , we may construct the inductive sequences , , , which are the unique bounded and stabilizing solution to the Riccati difference equation (Equation (16)). The aim of this study is to provide a set of conditions which guarantee that is well defined for all and for all and .
Remark 2.
Note that
if is such that , where , , and , . This follows by noticing that Equation (18) could be rewritten as
Remark 3.
In the Numerical Experiments section, we will clarify the deterministic nature of the proposed algorithm and highlight the contribution of such a paradigm.
2.2. Some Intermediate Results
Let us formally set . Hence, Equations (7) and (8) are rewritten as follows:
where is the solution to Equation (20) corresponding to and
With the above system (Equation (20)) and the corresponding quadratic functional in Equation (21), we associate the following Riccati-type difference equation of the type in Equation (2):
where
for all .
In the following, we associate to the GDTRE (Equation (2)) the set , which consists of all pairs of feedback gains , where and are -periodic matrix-valued sequences having the following properties:
- (i)
- The zero solution of the stochastic linear system
- (ii)
- The corresponding GDTRE (Equation (23)) has a unique bounded and stabilizing solution satisfying the sign condition
The following result gives a necessary and sufficient condition which helps us to decide if the set is empty or not:
Proposition 1.
Under the considered assumptions, the following two assertions are equivalent:
- (i)
- is not empty;
- (ii)
- There exist -periodic sequences , and solving the following matrix inequalities
Proof.
One can apply Theorem 5.6 in [] to the Riccati difference equation
obtained from Equation (23) by taking , . □
We end this section by giving the existence conditions for the unique bounded and stabilizing solution to Equation (2). To this end, we introduce the following auxiliary system:
where
and is obtained from the factorization for all , .
Theorem 1.
Assume the following:
- (a)
- Assumptions (H1–H4) are fulfilled;
- (b)
- The set is not empty;
- (c)
- The auxiliary system in Equation (29) is exactly detectable at a time instant ;
Remark 4.
For the definition of the notion of exact detectability at the time instant , one can refer to [].
Remark 5.
Note that the above theorem was proven in [] under the assumption of stochastic detectability of the system in Equation (29) instead of exact detectability at the time instant . One can show that the concept of exact detectability at the time instant is more general than that of stochastic detectability. Hence, the above result can be applied to a larger class of stochastic systems than the one reported in []. From the technical point of view, the improvement reported in this paper consists of the modification of Lemma 4.7 from [], which is proven here under the assumption of exact detectability at the time instant . For the reader’s convenience, we include a sketch of the proof of this lemma in Appendix A.
3. Main Results
For each , , the Riccati difference equation (Equation (16)) may be regarded as a special case of Equation (2). Hence, the Riccati difference equation (Equation (16)) is related to the deterministic LQ control problem described by the controlled system
where , , as well as the cost functional
where is the solution to the IVP described by the controlled system in Equation (31), , , and
with , , and being defined in Equation (17).
We formally set . Hence, Equations (31) and (32) are rewritten as follows:
where is the solution to Equation (34) corresponding to and
With the above system (Equation (34)) and the corresponding quadratic functional in Equation (35), we associate the following Riccati difference equation:
The notion of a stabilizing solution for Equation (37) is defined in the same way as for Equation (2).
In the following, we denote with the set of all pairs of feedback gains , where and are -periodic matrix-valued sequences having the following properties:
- (i)
- The zero solution of the closed-loop system
- (ii)
- The corresponding Riccati difference equation (Equation (37)) has a unique stabilizing and -periodic solution satisfying the sign condition
Following similar arguments to those in the proof of Proposition 1, the following result is deduced:
Proposition 2.
Under the considered assumptions, the following two assertions are equivalent:
- (i)
- is not empty;
- (ii)
- There exist -periodic sequences , , and solving the following matrix inequalities:
We are now in a position to prove the main result of this paper. To this end, we introduce the following auxiliary system:
where
and is obtained from the factorization
for all , .
Theorem 2.
Assume the following:
- (a)
- Assumptions (H1–H4) are fulfilled;
- (b)
- The set is not empty;
- (c)
- The auxiliary system in Equation (29) is stochastically detectable.
Under these conditions, if we take , where , then for each , is well defined as the unique minimal and positive semi-definite solution to the Riccati difference equation (Equation (16)), and we have the following:
- (i)
- (ii)
- for all and as the unique stabilizing and -periodic solution to Equation (2);
- (iii)
- (iv)
- for all .
Proof.
Since is not empty, it follows from Proposition 1 that there exist -periodic sequences , , and solving the matrix inequalities in Equation (27). Note that Equation (27) can be rewritten as
where
because for all . This allows us to deduce that
Hence, under Proposition 2, it follows that is not empty. If , then the Riccati difference equation (Equation (16)) reduces to
where . From Proposition 4.4 in [], we deduce that if is the solution to Equation (46) which satisfies , then it is well defined for all and , where , and for each , with defined by
This is the unique minimal positive semi-definite solution to Equation (46). Moreover, is a periodic sequence of a period .
Let us notice that the Riccati difference equation (Equation (2)) satisfied by its stabilizing solution may be rewritten as
where
Since , we deduce from Equation (18) that
Hence, by applying Theorem 4.2 in [] in the special case of the Riccati difference equation (Equations (46) and (48)), we may infer that for all , , and . By taking the limit for , we obtain for all . From the matrix inequality
we deduce, via Lemma 4.5 in [], that for each , satisfies the sign conditions in Equations (13) and (14). Thus, assertions (i) and (ii) from the statement are fulfilled for .
By using the Lyapunov-type characterization of the stochastic detectability of linear stochastic systems (see, for example, Chapter 4 in []), one can show that the stochastic detectability of the auxiliary system (Equation (29)) implies the detectability of the deterministic system
for each , where . Therefore, under assumption (c) in the statement, it follows that is just the bounded and stabilizing solution of the Riccati difference equation (Equation (16)) in the special case , which confirms the validity of assertion (iii) from the statement for .
Let us assume that for and for any and , the functions are well defined as unique minimal and positive semi-definite solutions of the Riccati difference equation (Equation (16)) (written for k and replaced by l) and have properties (i–iii) from the statement. We now show that for and , the Riccati difference equation (Equation (16)) has a minimal solution which is positive semi-definite, and it is a -periodic sequence satisfying the sign conditions in Equations (13) and (14). Moreover, we have
.
If , then we rewrite Equation (27) in the form
in which , where and is computed as in Equation (44) with replaced by .
Recalling that stochastic detectability implies exact detectability at time instant (see Remark 5), it follows from Proposition 4.4 in [] and Theorem 1 that for all . Note also that by using similar arguments to those in Chapter 5 from [], one can show that for all . Hence, we deduce that for all . Thus, . This allows us to conclude that the matrix-valued sequences satisfy
Therefore, we may conclude that is not empty for all if is not empty. Thus, we deduce that the solutions to the difference equation (Equation (16)) which satisfy the condition are well defined for all , , and . By applying Proposition 4.4 from [] in the special case of the Riccati difference equation (Equation (16)), we infer that , defined by , is the minimal positive semi-definite and -periodic solution of the Riccati difference equation (Equation (16)).
By again invoking the inequalities and , we may obtain . By applying Theorem 4.2 in [] in the special case of Equations (16) and (48), we deduce that for all , , and . By taking the limit for , we deduce that
for all . On the other hand, Equation (17) yields
Since and , one obtains , where . This allows us to apply Theorem 4.2 from [] in the special case of the Riccati difference equation (Equation (16)) to deduce that , , , and . By letting , we obtain
. Thus, Equations (55) and (56) confirm the validity of Equation (51).
Furthermore, Equation (55) yields
. These matrix inequalities, together with Lemma 4.5 from [], allow us to conclude that satisfies the sign conditions in Equations (13) and (14).
Finally, let us remark that if the auxiliary system in Equation (41) is detectable, then the minimal solution coincides with the bounded and stabilizing solution of Equation (16) for any . Thus, we have shown inductively that can be constructed for any and which satisfies properties (i–iii) from the statement. Now we remark that Equation (51) allows us to conclude that the sequences , , and are convergent. Let , . By taking the limit for in Equation (16), we obtain that is a positive semi-definite and -periodic solution of Equation (2). Based on the minimality property of the stabilizing solution of the Riccati equation (Equation (2)), we deduce that , and hence
Thus, the proof is complete. □
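The role played by minimal solutions in the proof can be illustrated on a toy problem. The following Python sketch assumes the standard construction in which the minimal positive semi-definite solution is the limit of solutions of the backward recursion started from a zero terminal value. It uses a scalar, sign-definite Riccati difference equation with periodic coefficients, which is only a simplified stand-in for Equation (16). The function name `backward_riccati` and all numerical data are hypothetical and do not come from the paper.

```python
def backward_riccati(a, b, q, r, horizon):
    """Solve a scalar periodic Riccati difference equation backward in time.

    Illustrative recursion (an assumption, not the paper's Equation (16)):
        x(t) = q(t) + a(t)^2 * r(t) * x(t+1) / (r(t) + b(t)^2 * x(t+1)),
    with zero terminal value x(horizon) = 0. The coefficient lists hold one
    period of data, so the period equals len(a).
    """
    theta = len(a)
    x = [0.0] * (horizon + 1)
    for t in range(horizon - 1, -1, -1):
        i = t % theta  # periodic coefficients
        nxt = x[t + 1]
        x[t] = q[i] + a[i] ** 2 * r[i] * nxt / (r[i] + b[i] ** 2 * nxt)
    return x


# Hypothetical period-2 data with q > 0 and r > 0.
a, b, q, r = [1.2, 0.8], [1.0, 0.5], [1.0, 2.0], [1.0, 1.0]
theta = len(a)

# Value at t = 0 for increasing horizons (multiples of the period).
values_at_zero = [backward_riccati(a, b, q, r, theta * m)[0] for m in range(1, 30)]

# A long horizon approximates the limiting (periodic) solution.
long_run = backward_riccati(a, b, q, r, theta * 40)
```

In this toy setting, `values_at_zero` is nondecreasing in the horizon and settles to a limit, and the first entries of `long_run` repeat with period 2, which mirrors the monotonicity and periodicity properties used in the proof.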
4. Numerical Experiments
The time-invariant case is considered in this section. We refer to the algorithm proposed here as Algo_Deter. In order to evaluate its performance, we compare it with an algorithm that belongs to the class of stochastic algorithms (see Section 1 for a description of this class). The stochastic algorithm used here was adapted from [] to our setting and is referred to as Algo_Stoch. We recall that for solving the deterministic Riccati equations appearing in Algo_Deter, one can use direct methods (invariant or deflating subspace-based methods). We refer the reader interested in direct methods to [,,] and the references therein. We also recall that at each main iteration of Algo_Stoch, one has to use iterative methods. We will show that, from the computation time point of view, Algo_Deter outperforms Algo_Stoch, which is due to the use of direct rather than iterative methods.
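To make the distinction between direct and iterative methods concrete, the following Python sketch treats a scalar, time-invariant, sign-definite Riccati equation. It is a toy analogue only and implements neither Algo_Deter nor Algo_Stoch. In the scalar case, the "direct" solver reduces to a closed-form root of a quadratic, whereas the "iterative" solver is a plain fixed-point recursion. The function names and the numerical data are hypothetical.

```python
import math


def dare_direct(a, b, q, r):
    """Positive root of b^2 x^2 + (r - q b^2 - a^2 r) x - q r = 0.

    This quadratic is the scalar algebraic Riccati equation
        x = q + a^2 x - a^2 b^2 x^2 / (r + b^2 x)
    after clearing the denominator (assumes b != 0, q > 0, r > 0).
    """
    c = r - q * b * b - a * a * r
    return (-c + math.sqrt(c * c + 4.0 * b * b * q * r)) / (2.0 * b * b)


def dare_iterative(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration on the same equation, started from x = 0."""
    x = 0.0
    for k in range(1, max_iter + 1):
        x_new = q + a * a * r * x / (r + b * b * x)
        if abs(x_new - x) <= tol:
            return x_new, k
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def closed_loop(a, b, r, x):
    """Closed-loop coefficient a - b*k with gain k = a b x / (r + b^2 x)."""
    return a * r / (r + b * b * x)


# Hypothetical data: an unstable open-loop system (|a| > 1).
a, b, q, r = 1.2, 1.0, 1.0, 1.0
x_direct = dare_direct(a, b, q, r)
x_iter, n_iter = dare_iterative(a, b, q, r)
```

Both solvers return the same stabilizing solution here, but the direct one needs a single evaluation while the iterative one needs `n_iter` passes. For matrix equations, the direct approach would instead rely on invariant or deflating subspace computations.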
We will use the following simulation protocol:
- 1.
- Set the example numbers n_good = 0, n_Deter = 0, and n_Stoch = 0, where n_good represents the number of examples for which both Algo_Deter and Algo_Stoch converge, n_Deter is the number of examples for which Algo_Deter converges but not Algo_Stoch, and n_Stoch is the number of examples for which Algo_Stoch converges but not Algo_Deter;
- 2.
- Choose n, , and randomly and uniformly among the integers from 1 to 10 and fix ;
- 3.
- Generate randomly the corresponding system matrices;
- 4.
- If the assumptions in Theorem 2 are not verified, then go back to step 2;
- 5.
- Use Algo_Deter and Algo_Stoch to solve the corresponding generalized Riccati equation. Let the stabilizing solution obtained using Algo_Deter be and the solution obtained using Algo_Stoch be , with CPU_time_1 and CPU_time_2 being the respective CPU running times;
- (a)
- If neither algorithm converges, then go back to step 2;
- (b)
- If Algo_Deter converges but not Algo_Stoch, then set n_Deter = n_Deter + 1 and go back to step 2;
- (c)
- If Algo_Deter does not converge but Algo_Stoch does, then set n_Stoch = n_Stoch + 1 and go back to step 2;
- (d)
- If both algorithms converge, then set n_good = n_good + 1 and compute the error and the coefficient ;
- 6.
- Repeat steps 2–5 until .
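The bookkeeping of the protocol above can be sketched as follows in Python. This is a minimal skeleton under stated assumptions: the four callables (`generate_example`, `assumptions_hold`, `algo_deter`, `algo_stoch`), the stopping parameter `n_target`, and the safeguard `max_draws` are hypothetical placeholders, and each solver is assumed to return a pair `(converged, solution)`. The error and the coefficient of step 5(d) are not computed here. Only the raw solutions and CPU times are stored.

```python
import random
import time


def run_protocol(generate_example, assumptions_hold, algo_deter, algo_stoch,
                 n_target, max_draws=10_000, seed=0):
    """Run the comparison loop and return the counters and per-example records."""
    rng = random.Random(seed)
    counts = {"n_good": 0, "n_Deter": 0, "n_Stoch": 0}  # step 1
    records = []
    for _ in range(max_draws):
        if counts["n_good"] >= n_target:  # stopping rule of step 6
            break
        example = generate_example(rng)  # steps 2 and 3
        if not assumptions_hold(example):  # step 4
            continue
        # Step 5: run both solvers and measure their CPU times.
        start = time.process_time()
        ok_deter, x_deter = algo_deter(example)
        cpu_time_1 = time.process_time() - start
        start = time.process_time()
        ok_stoch, x_stoch = algo_stoch(example)
        cpu_time_2 = time.process_time() - start
        if ok_deter and ok_stoch:  # case (d)
            counts["n_good"] += 1
            records.append((x_deter, x_stoch, cpu_time_1, cpu_time_2))
        elif ok_deter:  # case (b)
            counts["n_Deter"] += 1
        elif ok_stoch:  # case (c)
            counts["n_Stoch"] += 1
        # case (a): neither converged, so simply draw a new example
    return counts, records
```

Passing the solvers as callables keeps the loop independent of how the Riccati equations are actually solved, so the same skeleton can be reused with any pair of algorithms.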
We generated random test samples with a specified level of accuracy for both algorithms.
The obtained results are listed in Table 1 and Figure 1. In Table 1, O is the order of magnitude of , and “Number of Examples” indicates the number of examples corresponding to the same order of magnitude of . It follows from the obtained results that when Algo_Deter and Algo_Stoch converged, the obtained stabilizing solutions were computed with comparable accuracies.

Table 1.
Accuracy comparison for 100 random examples.

Figure 1.
Plot of the quantity .
As expected, and thanks to the use of direct resolution methods instead of iterative ones, one can see clearly from Figure 1 the improvement brought about by Algo_Deter from the computation time point of view.
During this experiment, we also obtained the following results: and . This shows that Algo_Deter still worked well in cases where Algo_Stoch failed. We believe that this was due partly to the fact that in Algo_Stoch, the computation of the sequence of approximations of the stabilizing solution relies on the computation of a vanishing matrix sequence , while in Algo_Deter, one directly computes the sequence of approximations . The vanishing nature of the matrix sequence could induce ill conditioning in its computation.
5. Conclusions
In this paper, we addressed the problem of the numerical computation of the stabilizing solution for a class of generalized Riccati difference equations. We proposed an iterative deterministic algorithm for the computation of such a global solution. The performance of the proposed algorithm was illustrated via a comparison with an existing algorithm from the literature. Our ongoing efforts are twofold. On the one hand, we are interested in the numerical computation of global solutions to Riccati equations arising in stochastic Nash and Stackelberg games. Numerical methods for this purpose are far less mature than their deterministic counterparts. On the other hand, we are also interested in generalized Riccati equations arising in mean field LQ games. Such equations present a coupling that makes this problem very challenging.
Author Contributions
Conceptualization, S.A. and V.D.; Methodology, S.A. and V.D.; Software, S.A.; Validation, S.A. and V.D.; Formal analysis, S.A. and V.D.; Investigation, S.A. and V.D. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
Lemma A1.
Assume that the assumptions of Theorem 1 hold. If is bounded on , a positive semi-definite solution to Equation (2), then the system
is exponentially stable in the mean square (ESMS), where is defined as in Lemma 4.7 from [], , and , as introduced in Remark 1.
Proof.
Using similar arguments to those in [], one can show that Equation (2) can be rewritten as
where and .
Let us associate with Equation (A2) the system
Note that the first equation in Equation (A3) is simply Equation (A1). Hence, the conclusion may be obtained by applying Theorem 3.2 from [] in the case of the system in Equation (A3). To this end, we have to show that the system in Equation (A3) is exactly detectable at the time instant .
Let be a solution to the system in Equation (A3) with the property that the corresponding output satisfies
This means that
and
References
- Aberkane, S.; Dragan, V. On the existence of the stabilizing solution of generalized Riccati equations arising in zero-sum stochastic difference games: The time-varying case. J. Differ. Equ. Appl. 2020, 26, 913–951.
- McAsey, M.; Mou, L. Generalized Riccati equations arising in stochastic games. Linear Algebra Its Appl. 2006, 416, 710–723.
- Yu, Z. An Optimal Feedback Control-Strategy Pair for Zero-sum Linear-Quadratic Stochastic Differential Game: The Riccati Equation Approach. SIAM J. Control Optim. 2015, 53, 2141–2167.
- Feng, Y.; Anderson, B. An iterative algorithm to solve state-perturbed stochastic algebraic Riccati equations in LQ zero-sum games. Syst. Control Lett. 2010, 59, 50–56.
- Ivanov, I.G. Iterations for solving a rational Riccati equation arising in stochastic control. Comput. Math. Appl. 2007, 53, 977–988.
- Bini, D.A.; Iannazzo, B.; Meini, B. Numerical Solution of Algebraic Riccati Equations; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2012.
- Dragan, V.; Aberkane, S. Computing The Stabilizing Solution of a Large Class of Stochastic Game Theoretic Riccati Differential Equations: A Deterministic Approximation. SIAM J. Control Optim. 2017, 55, 650–670.
- Dragan, V.; Morozan, T.; Stoica, A.M. Mathematical Methods in Robust Control of Discrete-Time Linear Stochastic Systems; Springer: New York, NY, USA, 2010.
- Freiling, G.; Hochhaus, A. Properties of the solutions of rational matrix difference equations. Adv. Differ. Equ. IV Comput. Math. Appl. 2003, 45, 1137–1154.
- Dragan, V.; Aberkane, S.; Ivanov, I. On computing the stabilizing solution of a class of discrete-time periodic Riccati equations. Int. J. Robust Nonlinear Control 2015, 25, 1066–1093.
- Mehrmann, V. The Autonomous Linear Quadratic Control Problem. Theory and Numerical Solution; Series Lecture Notes in Control and Information Sciences; Springer: Berlin, Germany, 1991; Volume 163.
- Sima, V. Algorithms for Linear-Quadratic Optimization; Series Pure and Applied Mathematics: A Series of Monographs and Textbooks; Marcel Dekker, Inc.: New York, NY, USA, 1996; Volume 200.
- Dragan, V.; Costa, E.F.; Popa, I.L.; Aberkane, S. Exact detectability: Application to generalized Lyapunov and Riccati equations. Syst. Control Lett. 2021, 157, 105032.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).