Article

Mean-Field Stochastic Linear Quadratic Optimal Control for Jump-Diffusion Systems with Hybrid Disturbances

1
College of Computer Science, Chengdu University, Chengdu 610106, China
2
School of Electrical Engineering, Southwest Jiaotong University, Chengdu 610031, China
3
Department of Construction Engineering, Chengdu Aeronautic Polytechnic, Chengdu 610100, China
*
Author to whom correspondence should be addressed.
Symmetry 2024, 16(6), 642; https://doi.org/10.3390/sym16060642
Submission received: 20 March 2024 / Revised: 7 May 2024 / Accepted: 12 May 2024 / Published: 22 May 2024
(This article belongs to the Section Engineering and Materials)

Abstract

This paper is concerned with a mean-field stochastic linear quadratic (MF-SLQ for short) optimal control problem with hybrid disturbances and cross terms in a finite horizon. The state equation is a system driven by the Wiener process and the Poisson random martingale measure and disturbed by stochastic perturbations. The cost functional is also disturbed, so more general cases can be characterized, especially when extra environmental perturbations exist. The well-posedness of the jump-diffusion system is obtained by the fixed point theorem, and the solvability of the MF-SLQ problem is established as well. By virtue of adjoint variables, classic variational calculus, and a dual representation, an optimality condition is derived. In order to connect the optimal control and the state directly, two Riccati differential equations, a BSDE with random jumps and an ordinary differential equation (ODE for short) on the disturbance terms are obtained by a decoupling technique, which together provide an optimal feedback regulator. Meanwhile, the relationship between the two Riccati equations and the so-called mean-field stochastic Hamilton system is established. Consequently, the optimal value is characterized by the initial state, the disturbances, and the initial values of the Riccati equations. Finally, an example is provided to illustrate our theoretical results.

1. Introduction

1.1. Formulation of the Optimal Control Problem

Throughout the paper, we suppose that $(\Omega, \mathcal{F}, P)$ is a complete probability space and $T$ is a given final time. $\{W(t), 0 \le t \le T\}$ is the Wiener process, and the compensated Poisson random measure $\tilde{\nu}(dt, dz) = \nu(dt, dz) - dt\,\pi(dz)$ is independent of the Wiener process $W(t)$, where $\nu(t, Z)$, with bounded characteristic measure $\pi(dz)$, is a Poisson random measure on $Z$. For notational convenience, we denote by $\mathbb{F} = \{\mathcal{F}_t\}_{0 \le t \le T}$ the completed natural filtration of $W(t)$ and $\nu(t, Z)$.
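For any $t \in [0, T]$, consider the following quadratic cost functional with stochastic perturbations and cross-terms in a finite time horizon $[0, T]$:
$$
\begin{aligned}
J(u_t, h) = \frac{1}{2} E \Big\{ & \langle Q x_T, x_T \rangle + 2 \langle \iota, x_T \rangle + \langle \hat{Q} E[x_T], E[x_T] \rangle + 2 \langle \hat{\iota}, E[x_T] \rangle \\
& + \int_0^T \Big( \langle R_t x_t, x_t \rangle + 2 \langle \rho_t, x_t \rangle + \langle \hat{R}_t E[x_t], E[x_t] \rangle + 2 \langle \hat{\rho}_t, E[x_t] \rangle \\
& \quad + 2 \langle S_t u_t, x_t \rangle + 2 \langle \hat{S}_t E[u_t], E[x_t] \rangle + \langle N_t u_t, u_t \rangle + 2 \langle \varrho_t, u_t \rangle \\
& \quad + \langle \hat{N}_t E[u_t], E[u_t] \rangle + 2 \langle \hat{\varrho}_t, E[u_t] \rangle \Big) dt \Big\},
\end{aligned}
\tag{1}
$$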
where $\iota, \hat{\iota}, \rho_t, \hat{\rho}_t$ and $\varrho_t, \hat{\varrho}_t$ are perturbations: $\rho_t, \varrho_t$ are $\mathbb{F}$-progressively measurable processes, $\hat{\rho}_t, \hat{\varrho}_t$ are deterministic functions, $\iota$ is an $\mathcal{F}_T$-measurable random vector and $\hat{\iota}$ is a deterministic vector. Moreover, $R_t, \hat{R}_t, N_t, \hat{N}_t$ are bounded deterministic symmetric matrix-valued functions, the cross-term weights $S_t, \hat{S}_t$ are bounded deterministic matrix-valued functions, and $Q, \hat{Q}$ are symmetric matrices. Throughout the paper, we consider the following jump-diffusion system with stochastic perturbations:
$$
\begin{aligned}
d x_t = {} & \big( A_t x_t + \hat{A}_t E[x_t] + B_t u_t + \hat{B}_t E[u_t] + b_t \big) dt \\
& + \big( C_t x_t + \hat{C}_t E[x_t] + D_t u_t + \hat{D}_t E[u_t] + \sigma_t \big) dW(t) \\
& + \int_Z \big( E_{t,z} x_t + \hat{E}_{t,z} E[x_t] + F_{t,z} u_t + \hat{F}_{t,z} E[u_t] + \gamma_{t,z} \big) \tilde{\nu}(dt, dz), \qquad x_0 = h \in \mathbb{R}^n.
\end{aligned}
\tag{2}
$$
Here, we assume that the matrix-valued functions $A_t, \hat{A}_t, B_t, \hat{B}_t, C_t, \hat{C}_t, D_t, \hat{D}_t, E_{t,z}, \hat{E}_{t,z}, F_{t,z}, \hat{F}_{t,z}$ are bounded and deterministic, and the vector-valued $b_t, \sigma_t, \gamma_{t,z}$ are $\mathbb{F}$-progressively measurable processes. The admissible control set is the space
$$
L^2_{\mathbb{F}}(0, T; \mathbb{R}^m) \triangleq \Big\{ u : [0, T] \times \Omega \to \mathbb{R}^m \ \Big|\ E \int_0^T |u_s|^2 ds < \infty \Big\},
$$
which consists of all $\mathbb{R}^m$-valued square integrable $\{\mathcal{F}_t, 0 \le t \le T\}$-adapted processes. If $u_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$, then $u_t$ is called an admissible control.
Our purpose is to find, for any $h \in \mathbb{R}^n$, an admissible control $u_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$ such that
$$
J(u_t, h) = \inf_{v_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)} J(v_t, h).
$$
For simplicity, we call the above problem the MF-SLQ problem (1) and (2). If we find an admissible control $u_t$ satisfying $J(u_t, h) = \inf_{v_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)} J(v_t, h)$, then the MF-SLQ problem (1) and (2) is solved with the optimal pair $(u_t, x_t)$, where $x_t$ is the corresponding state of the jump-diffusion system.
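To make the formulation concrete, the following is a minimal simulation sketch of the state equation (2) and the cost (1) in the scalar case. Everything numerical in it is an assumption made for illustration and is not taken from the paper: the constant coefficient values, the single-mark jump measure $\pi = \lambda \delta_1$, the Bernoulli approximation of the Poisson increments, the particle average used in place of $E[\cdot]$, the omission of the cross terms and linear cost perturbations, and the fixed linear feedback `u = -gain * x`, which is not the optimal control.

```python
import math
import random


def simulate_cost(h=1.0, T=1.0, steps=200, particles=2000, gain=0.5, seed=0):
    """Euler scheme for a scalar version of (2); returns (cost estimate, E[x_T])."""
    rng = random.Random(seed)
    dt = T / steps
    # Hypothetical constant coefficients (illustrative values only).
    A, A_hat, B, B_hat = -0.5, 0.2, 1.0, 0.0
    C, C_hat, D, D_hat = 0.3, 0.0, 0.1, 0.0
    E, E_hat, F, F_hat = 0.2, 0.0, 0.0, 0.0
    b, sigma, gamma = 0.1, 0.05, 0.02      # disturbances b_t, sigma_t, gamma_{t,z}
    lam = 2.0                              # jump intensity, pi = lam * delta_1
    Q, Q_hat, R, R_hat, N, N_hat = 1.0, 0.5, 1.0, 0.5, 1.0, 0.5

    x = [h] * particles
    running = 0.0
    for _ in range(steps):
        mx = sum(x) / particles            # particle approximation of E[x_t]
        u = [-gain * xi for xi in x]       # fixed feedback, NOT the optimal one
        mu = sum(u) / particles            # particle approximation of E[u_t]
        # Running cost, left-endpoint rule (S, rho, varrho terms set to zero).
        quad = sum(R * xi * xi + N * ui * ui for xi, ui in zip(x, u)) / particles
        running += dt * (quad + R_hat * mx * mx + N_hat * mu * mu)
        new_x = []
        for xi, ui in zip(x, u):
            drift = A * xi + A_hat * mx + B * ui + B_hat * mu + b
            diff = C * xi + C_hat * mx + D * ui + D_hat * mu + sigma
            jump = E * xi + E_hat * mx + F * ui + F_hat * mu + gamma
            dW = rng.gauss(0.0, math.sqrt(dt))
            # Bernoulli approximation of a Poisson increment (needs lam*dt << 1).
            dN = 1.0 if rng.random() < lam * dt else 0.0
            # Compensated jump increment: dN - lam*dt has mean zero.
            new_x.append(xi + drift * dt + diff * dW + jump * (dN - lam * dt))
        x = new_x
    mx = sum(x) / particles
    terminal = sum(Q * xi * xi for xi in x) / particles + Q_hat * mx * mx
    return 0.5 * (terminal + running), mx
```

Calling `simulate_cost(gain=g)` for several values of `g` gives a rough picture of how the cost depends on the feedback gain; the estimate carries both time-discretization and Monte Carlo error.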

1.2. Brief History and Contributions of This Paper

Kac [1] introduced mean-field systems from the Vlasov kinetic equation of plasma, and they were first studied by McKean [2]. Since then, the topic has attracted much attention, and many contributions have been made in related fields, including well-posedness results. Dawson [3] and Chan [4] mainly discussed the dynamics and fluctuations of the McKean–Vlasov equation. Ahmed [5] investigated the optimal control problem in Hilbert space with non-linear diffusion. Buckdahn et al. [6] studied the mean-field BSDE by a limit approach. For more details on the mean-field model, see the above works and the references therein. In recent years, the mean-field optimal control problem, which contains the terms $E[x_t]$, $E[u_t]$ in both the state equation and the cost functional, has been widely investigated. Abundant theoretical results and applications have been established, including the stochastic maximum principle, differential games, finance, numerical methods and engineering applications. Li [7] provided a stochastic maximum principle for mean-field control systems. Chala [8] studied a relaxed optimal control problem for mean-field SDEs. Benamou et al. [9] considered mean-field games with variational methods. Wang [10] investigated the mean-field time-inconsistent optimal control problem. Because of its structure and its applications in engineering and finance, the SLQ optimal control problem of mean-field type has attracted many researchers' attention. By using a variational method and a decoupling technique, Yong (2013) [11] systematically studied the definite MF-SLQ optimal control problem in a finite time horizon. There, the MF-SLQ problem was solved via the solutions of two decoupled Riccati differential equations, through which the optimal feedback control was provided.
Ni [12] considered the indefinite mean-field stochastic linear-quadratic optimal control problem in the discrete case. Wei et al. [13] studied infinite horizon forward–backward SDEs, which are used to obtain the open-loop optimal control. Tang et al. [14,15] investigated the solvability of the indefinite mean-field LQ problem and the robustness of mean-field LQ games, respectively. For more details about the stochastic optimal control problem, we refer to [16].
It is known that a Poisson random measure can be described as the counting measure associated with a jump process, and that the solution of a Poisson jump-diffusion system is discontinuous because the stochastic perturbations in the equation come from both the Poisson jumps and the Wiener process. Hence, system (2) applies to more complicated random environments; see [17,18] for stochastic differential equations with Poisson measure. On this basis, Shen and Siu [19] proved a mean-field type maximum principle for systems with Poisson jumps and applied it to the mean-variance problem. Hafayed [20] studied the forward–backward mean-field type SLQ problem for jump-diffusion systems and established a stochastic maximum principle. Tang and Meng [21] generalized the results in [11] to stochastic systems with Poisson jumps, without stochastic perturbations in the controlled state equation or in the cost functional. The goal of our work is to further generalize these results: the controlled system considered in this paper almost coincides with that in [21] when the weight matrix-valued functions satisfy $S = \hat{S} \equiv 0$, the stochastic perturbations satisfy $b_t, \sigma_t, \gamma_{t,z} \equiv 0$, and the perturbations $\iota, \hat{\iota}, \rho_t, \hat{\rho}_t, \varrho_t, \hat{\varrho}_t \equiv 0$ in the cost functional. The main contributions are summarized as follows:
1. We establish the existence and uniqueness of the solution of the jump-diffusion system with hybrid disturbances by the classic fixed point theorem, and the well-posedness of the optimal control problem by the classic convex variational principle.
2. By the variational method, the open-loop control strategy is established, which is equivalent to the solvability of a mean-field FBSDE (12), and the closed-loop case is provided by the decoupling technique, which is equivalent to the solvability of two Riccati Equations (16) and (17).
It is worth noting that the generalization in this paper is meaningful not only from the mathematical point of view but also for describing more complicated cases in a general random environment.
The paper is organized as follows. In Section 2, by virtue of the classic fixed point theorem, the well-posedness of the controlled mean-field stochastic differential equation (MF-SDE) with jumps and stochastic perturbations is obtained, and we prove that the MF-SLQ problem established in this paper is well defined. In Section 3, a necessary and sufficient condition for the optimal control of the MF-SLQ problem (1) and (2) is provided, which leads to a fully coupled stochastic Hamilton system. Although optimal controls for the MF-SLQ problem are obtained from the condition established in Section 3, the resulting control policy is not implementable. Thus, in Section 4, in order to obtain a closed-loop control strategy, we use the decoupling technique to show the relationship between the two Riccati equations and the mean-field stochastic Hamilton system; meanwhile, the state feedback representation is studied via the solutions of two Riccati differential equations and two perturbation differential equations. Finally, we obtain the optimal value function of the problem, which is uniquely solvable, by a direct verification.

2. Preliminary

In this part, we prove the existence and uniqueness of the solution of the controlled MF-SDE with random jumps and the solvability of problem (1) and (2). First, we introduce the following notation and spaces.
Suppose $H = \mathbb{R}^n, \mathbb{R}^{n \times m}, \mathbb{S}^n, \mathbb{S}^n_+$ are Banach spaces with norm $\| \cdot \|_H$. ($\mathbb{S}^n$ denotes the set of all $(n \times n)$ symmetric matrices, and $\mathbb{S}^n_+$ is the subset of all non-negative matrices of $\mathbb{S}^n$.) $C([0, T]; H)$ is the space of all $H$-valued continuous functions on $[0, T]$. $L^p$ is the set of all $H$-valued and $L^p$-integrable functions, $p \in [1, \infty]$.
Let the following hold:
$$
\begin{aligned}
L^{\pi,2}([0,T] \times Z, H) &= \Big\{ r : [0,T] \times Z \to H \ \Big|\ r(t,z) \text{ is measurable},\ \sup_{0 \le t \le T} \int_Z |r(t,z)|^2 \pi(dz) < \infty \Big\}, \\
L^2_{\mathbb{F}}([0,T], H) &= \Big\{ f : [0,T] \times \Omega \to H \ \Big|\ f(t,\omega) \text{ is } \mathcal{F}_t\text{-adapted},\ E \int_0^T |f(t)|^2 dt < \infty \Big\}, \\
L^2_{\mathbb{F}}(\Omega, L^1([0,T], H)) &= \Big\{ f : [0,T] \times \Omega \to H \ \Big|\ f(t,\omega) \text{ is } \mathcal{F}_t\text{-progressively measurable},\ E \Big( \int_0^T |f(t)| dt \Big)^2 < \infty \Big\}, \\
S^2_{\mathbb{F}}([0,T], H) &= \Big\{ f : [0,T] \times \Omega \to H \ \Big|\ f(t,\omega) \text{ is an } \mathcal{F}_t\text{-adapted càdlàg process},\ E \sup_{0 \le t \le T} |f(t)|^2 < \infty \Big\}, \\
S^{\pi,2}_{\mathbb{F}}([0,T] \times Z, H) &= \Big\{ r : [0,T] \times Z \times \Omega \to H \ \Big|\ r(t,z,\omega) \text{ is } \mathcal{F}_t\text{-predictable},\ E \Big[ \sup_{0 \le t \le T} \int_Z |r(t,z)|^2 \pi(dz) \Big] < \infty \Big\}.
\end{aligned}
$$
In order to achieve our goal, we now make the following assumptions:
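(A1) $A_t, \hat{A}_t, C_t, \hat{C}_t \in L^\infty([0,T], \mathbb{R}^{n \times n})$, $E_{t,z}, \hat{E}_{t,z} \in L^{\pi,2}([0,T] \times Z, \mathbb{R}^{n \times n})$, $B_t, \hat{B}_t, D_t, \hat{D}_t \in L^\infty([0,T], \mathbb{R}^{n \times m})$, $F_{t,z}, \hat{F}_{t,z} \in L^{\pi,2}([0,T] \times Z, \mathbb{R}^{n \times m})$, $b_t \in L^2_{\mathbb{F}}(\Omega, L^1([0,T], \mathbb{R}^n))$, $\sigma_t \in L^2_{\mathbb{F}}([0,T], \mathbb{R}^n)$, $\gamma_{t,z} \in S^{\pi,2}_{\mathbb{F}}([0,T] \times Z, \mathbb{R}^n)$.
(A2) $R_t, \hat{R}_t \in L^\infty([0,T], \mathbb{S}^n)$, the cross-terms $S_t, \hat{S}_t \in L^\infty([0,T], \mathbb{R}^{n \times m})$, the weight coefficients $N_t, \hat{N}_t \in L^\infty([0,T], \mathbb{S}^m)$, and the matrices $Q, \hat{Q} \in \mathbb{S}^n$; furthermore, $N_t > 0$, $N_t + \hat{N}_t > 0$, $R - S N^{-1} S^\top \ge 0$, $\hat{R} - \hat{S} \hat{N}^{-1} \hat{S}^\top \ge 0$, and $Q, Q + \hat{Q} \ge 0$. In addition, $\iota \in L^2_{\mathcal{F}_T}(\Omega, \mathbb{R}^n)$, $\hat{\iota} \in \mathbb{R}^n$, $\rho_t \in L^2_{\mathbb{F}}(\Omega, L^1([0,T], \mathbb{R}^n))$, $\hat{\rho}_t \in L^2([0,T], \mathbb{R}^n)$, $\varrho_t \in L^2_{\mathbb{F}}([0,T], \mathbb{R}^m)$, $\hat{\varrho}_t \in L^2([0,T], \mathbb{R}^m)$.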
Remark 1.
Here, we suppose that the coefficients are definite, which differs from [14]. The reason is that we want to show the design process of the open-loop and closed-loop control strategies directly instead of providing equivalent conditions, which may help the reader see how to obtain the feedback regulator for the mean-field LQ control problem with hybrid disturbances by elementary methods.
Remark 2.
In [22], the authors propose an adaptive sliding-mode control for a buck converter supplying a resistive load and a constant power load in a DC microgrid, focusing mainly on the stability of the system and the robustness of the control strategy. Our purpose is different: we seek an admissible control that minimizes the cost functional through the variational method. Therefore, the methods in the two papers differ because the goals differ. In our view, the two approaches could be combined, for example by generalizing the mathematical formulation of the buck converter to a stochastic model, in which it is natural to add a stochastic perturbation, such as white noise, to the voltage. It would then be meaningful to consider the stability and the robustness of the resulting stochastic system.
The Itô formula with Poisson jumps plays a key role in what follows (see B. Øksendal and A. Sulem [18]):
Lemma 1.
(i) Let $\alpha_t, \beta_t$ be two adapted processes, and let $g_{t,z}$ be $\mathcal{F}$-measurable for each $z$, such that
$$
\int_0^T \Big( |\alpha_s| + |\beta_s|^2 + \int_Z |g_{s,z}|^2 \pi(dz) \Big) ds < \infty, \quad P\text{-a.s.}
$$
Let $X$ be an $n$-dimensional process taking the following form:
$$
X_t = X_0 + \int_0^t \alpha_s ds + \int_0^t \beta_s dW(s) + \int_0^t \int_Z g_{s,z} \tilde{\nu}(ds, dz), \quad 0 \le t \le T.
$$
Then, for any $V \in C^2(\mathbb{R}^n)$, it holds that
$$
\begin{aligned}
V(X_t) - V(X_0) = {} & \int_0^t \langle \nabla V(X_s), \alpha_s \rangle ds + \int_0^t \langle \nabla V(X_s), \beta_s dW(s) \rangle + \frac{1}{2} \int_0^t \mathrm{tr} \big[ \nabla^2 V(X_s) \beta_s \beta_s^\top \big] ds \\
& + \int_0^t \int_Z \big[ V(X_{s-} + g_{s,z}) - V(X_{s-}) \big] \tilde{\nu}(ds, dz) \\
& + \int_0^t \int_Z \big[ V(X_s + g_{s,z}) - V(X_s) - \langle \nabla V(X_s), g_{s,z} \rangle \big] \pi(dz) ds, \quad 0 \le t \le T.
\end{aligned}
$$
(ii) Let $X^1$ and $X^2$ be two processes of the following form:
$$
X^j_t = X^j_0 + \int_0^t \alpha^j_s ds + \int_0^t \beta^j_s dW(s) + \int_0^t \int_Z g^j_{s,z} \tilde{\nu}(ds, dz), \quad j = 1, 2, \quad 0 \le t \le T,
$$
where the coefficients are as previously stated. Then, it holds that
$$
\langle X^1_t, X^2_t \rangle - \langle X^1_0, X^2_0 \rangle = \int_0^t \langle X^1_s, dX^2_s \rangle + \int_0^t \langle X^2_s, dX^1_s \rangle + \int_0^t \mathrm{tr} \big[ \beta^1_s (\beta^2_s)^\top \big] ds + \int_0^t \int_Z \langle g^1_{s,z}, g^2_{s,z} \rangle \nu(ds, dz), \quad 0 \le t \le T.
$$
Proof. 
Part (ii) can be proved by taking $V(X^1_t, X^2_t) = \langle X^1_t, X^2_t \rangle$ in (i); for the details, the reader can refer to Section 1 of B. Øksendal and A. Sulem [18], so we omit them here. □
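As a quick worked instance of Lemma 1, and assuming $W$ is scalar so that $\beta_s$ is an $\mathbb{R}^n$-valued vector, one may take $V(x) = |x|^2$ in part (i). Then $\nabla V(x) = 2x$, $\nabla^2 V(x) = 2I$ and $V(x + g) - V(x) = 2 \langle x, g \rangle + |g|^2$, which gives

```latex
\begin{aligned}
|X_t|^2 - |X_0|^2 = {} & \int_0^t 2 \langle X_s, \alpha_s \rangle \, ds
 + \int_0^t 2 \langle X_s, \beta_s \, dW(s) \rangle
 + \int_0^t |\beta_s|^2 \, ds \\
& + \int_0^t \int_Z \big( 2 \langle X_{s-}, g_{s,z} \rangle + |g_{s,z}|^2 \big) \, \tilde{\nu}(ds, dz)
 + \int_0^t \int_Z |g_{s,z}|^2 \, \pi(dz) \, ds .
\end{aligned}
```

The same identity follows from part (ii) with $X^1 = X^2 = X$, after splitting $\nu(ds, dz) = \tilde{\nu}(ds, dz) + \pi(dz) ds$ in the last term, which is a useful consistency check. Estimates of $E|x_t|^2$ of the kind used below are typically built on this formula.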
Proposition 1.
Let assumption (A1) hold. For any initial state $h \in \mathbb{R}^n$ and admissible control $u_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$, the jump-diffusion Equation (2) admits a unique solution $x \in S^2_{\mathbb{F}}([0,T], \mathbb{R}^n)$, whose trajectories are right-continuous almost surely.
Proof. 
We consider the space of stochastic processes $x_t$, $0 \le t \le T$, such that for any $t$, $x_t$ is $\mathcal{F}_t$-measurable and $\sup_{0 \le t \le T} E|x_t|^2 < \infty$. Let $\hat{S}^2_{\mathbb{F}}$ be the quotient of this space by the subspace of processes equivalent to the zero process. Then $\hat{S}^2_{\mathbb{F}}$ is a Banach space with the norm given by $\|x\|^2 = \sup_{0 \le t \le T} E|x(t)|^2$.
Now, we define the following operators: for any $x_s \in \hat{S}^2_{\mathbb{F}}$, $u_s \in L^2_{\mathbb{F}}([0,T], \mathbb{R}^m)$, $t \in [0,T]$,
$$
[\mathcal{A} x_s](t) = \int_0^t \big( A_s x_s + \hat{A}_s E[x_s] + b_s \big) ds + \int_0^t \big( C_s x_s + \hat{C}_s E[x_s] + \sigma_s \big) dW(s) + \int_0^t \int_Z \big( E_{s,z} x_s + \hat{E}_{s,z} E[x_s] + \gamma_{s,z} \big) \tilde{\nu}(ds, dz),
$$
$$
[\mathcal{B} u_s](t) = \int_0^t \big( B_s u_s + \hat{B}_s E[u_s] \big) ds + \int_0^t \big( D_s u_s + \hat{D}_s E[u_s] \big) dW(s) + \int_0^t \int_Z \big( F_{s,z} u_s + \hat{F}_{s,z} E[u_s] \big) \tilde{\nu}(ds, dz).
$$
With the operators $\mathcal{A}$ and $\mathcal{B}$, the controlled state Equation (2) can be rewritten as $x_t = h + [\mathcal{A} x_s](t) + [\mathcal{B} u_s](t)$. Thus,
$$
E \sup_{0 \le t \le T} |x_t|^2 \le 3K \big[ |h|^2 + \| \mathcal{A} x_s \|^2 + \| \mathcal{B} u_s \|^2 \big]. \tag{3}
$$
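By the Cauchy–Schwarz inequality, Doob's martingale inequality and the Burkholder–Davis–Gundy inequality, we have
$$
\begin{aligned}
\| \mathcal{A} x_s \|^2 = {} & E \sup_{0 \le t \le T} \big| [\mathcal{A} x_s](t) \big|^2 \\
\le {} & K E \Big\{ \Big( \int_0^T \big| A_s x_s + \hat{A}_s E[x_s] + b_s \big| ds \Big)^2 + \sup_{0 \le t \le T} \Big| \int_0^t \big( C_s x_s + \hat{C}_s E[x_s] + \sigma_s \big) dW(s) \Big|^2 \\
& \quad + \sup_{0 \le t \le T} \Big| \int_0^t \int_Z \big( E_{s,z} x_s + \hat{E}_{s,z} E[x_s] + \gamma_{s,z} \big) \tilde{\nu}(ds, dz) \Big|^2 \Big\} \\
\le {} & K E \Big\{ T \int_0^T \big| A_s x_s + \hat{A}_s E[x_s] + b_s \big|^2 ds + \int_0^T \big| C_s x_s + \hat{C}_s E[x_s] + \sigma_s \big|^2 ds \\
& \quad + \int_0^T \int_Z \big| E_{s,z} x_s + \hat{E}_{s,z} E[x_s] + \gamma_{s,z} \big|^2 \pi(dz) ds \Big\} \\
\le {} & 3K \int_0^T \Big( T |A_s|^2 + |C_s|^2 + \int_Z |E_{s,z}|^2 \pi(dz) \Big) E|x_s|^2 ds \\
& + 3K \int_0^T \Big( T |\hat{A}_s|^2 + |\hat{C}_s|^2 + \int_Z |\hat{E}_{s,z}|^2 \pi(dz) \Big) \big| E[x_s] \big|^2 ds \\
& + 3K E \int_0^T \Big( T |b_s|^2 + |\sigma_s|^2 + \int_Z |\gamma_{s,z}|^2 \pi(dz) \Big) ds \\
\le {} & 3K \big( T \|A_s\|^2 + T \|\hat{A}_s\|^2 + \|C_s\|^2 + \|\hat{C}_s\|^2 + \|E_{s,z}\|^2 + \|\hat{E}_{s,z}\|^2 \big) E \int_0^T |x_s|^2 ds \\
& + 3K \big( T \|b_s\|^2 + \|\sigma_s\|^2 + \|\gamma_{s,z}\|^2 \big) \\
\le {} & 3KT \big( T \|A_s\|^2 + T \|\hat{A}_s\|^2 + \|C_s\|^2 + \|\hat{C}_s\|^2 + \|E_{s,z}\|^2 + \|\hat{E}_{s,z}\|^2 \big) \sup_{0 \le t \le T} E|x_t|^2 \\
& + 3K \big( T \|b_s\|^2 + \|\sigma_s\|^2 + \|\gamma_{s,z}\|^2 \big).
\end{aligned}
$$
That is,
$$
\| \mathcal{A} x_s \|^2 \le 3KT \lambda_1 \sup_{0 \le t \le T} E|x_t|^2 + 3K \lambda_2, \tag{4}
$$
where, here and hereafter, $K > 0$ represents a generic constant and $\lambda_1, \lambda_2$ are two bounded constants, all of which may differ from line to line.
Similarly, we can obtain
$$
\| \mathcal{B} u_s \|^2 \le 3K \lambda_3 \| u_s \|^2 < \infty. \tag{5}
$$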
From (3)–(5), we find
$$
E \sup_{0 \le t \le T} |x_t|^2 \le 3K \big( |h|^2 + \lambda_3 \| u_s \|^2 + \lambda_2 \big) + 3K \lambda_1 E \int_0^T |x_s|^2 ds.
$$
Let $\beta = 3K ( |h|^2 + \lambda_3 \| u_s \|^2 + \lambda_2 )$; then
$$
E \Big( \sup_{0 \le t \le T} |x_t|^2 \Big) \le \beta + 3K \lambda_1 E \int_0^T |x_s|^2 ds.
$$
This implies $E|x_T|^2 \le \beta + 3K \lambda_1 E \int_0^T |x_s|^2 ds$. Since the same argument applies on $[0, t]$ for every $t \in [0, T]$, we have
$$
E|x_t|^2 \le \beta + 3K \lambda_1 E \int_0^t |x_s|^2 ds.
$$
By Gronwall's inequality, one obtains $E|x_t|^2 \le \beta e^{3K \lambda_1 t}$. That means
$$
E \sup_{0 \le t \le T} |x_t|^2 \le \beta e^{3K \lambda_1 T} < \infty,
$$
and thus, we obtain $x \in S^2_{\mathbb{F}}([0,T], \mathbb{R}^n)$. The right-continuity of the trajectories then follows from the boundedness.
Moreover, if $x \in S^2_{\mathbb{F}}([0,T], \mathbb{R}^n)$, then
$$
\sup_{0 \le t \le T} E|x_t|^2 \le E \sup_{0 \le t \le T} |x_t|^2.
$$
Due to the linear structure of the system, we only need to deal with the operator $\mathcal{A}$ when constructing a contraction. From (4), we have
$$
\| \mathcal{A} x_s \|^2 \le 3K \lambda_1 T \sup_{0 \le t \le T} E|x_t|^2 + 3K \lambda_2 \le 3K \lambda T\, E \sup_{0 \le t \le T} |x_t|^2,
$$
with $0 < 3K \lambda T < 1$ when $K \lambda T$ is small enough. Then, $\mathcal{A}$ has a unique fixed point, so we obtain the well-posedness of the state Equation (2). This completes the proof. □
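A useful by-product of Proposition 1 can be written out explicitly. Assuming, as the square-integrability in (A1) and the bound above suggest, that the stochastic integrals in (2) are true martingales with zero mean, taking expectations in (2) gives a deterministic linear ODE for the mean, and subtracting it from (2) gives the dynamics of the fluctuation $x_t - E[x_t]$:

```latex
\begin{aligned}
dE[x_t] &= \big( (A_t + \hat{A}_t) E[x_t] + (B_t + \hat{B}_t) E[u_t] + E[b_t] \big) \, dt,
\qquad E[x_0] = h, \\
d\big( x_t - E[x_t] \big) &= \big( A_t (x_t - E[x_t]) + B_t (u_t - E[u_t]) + b_t - E[b_t] \big) \, dt \\
&\quad + \big( C_t x_t + \hat{C}_t E[x_t] + D_t u_t + \hat{D}_t E[u_t] + \sigma_t \big) \, dW(t) \\
&\quad + \int_Z \big( E_{t,z} x_t + \hat{E}_{t,z} E[x_t] + F_{t,z} u_t + \hat{F}_{t,z} E[u_t] + \gamma_{t,z} \big) \, \tilde{\nu}(dt, dz).
\end{aligned}
```

This split into a mean part and a zero-mean part is the reason two separate Riccati equations appear later: one acts on $x_t - E[x_t]$ and the other on $E[x_t]$.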
From the proof of boundedness for x t , we obtain
Corollary 1.
Let $u_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$ be any admissible control and $h \in \mathbb{R}^n$. If $x_t$ is the unique solution of the state Equation (2), then $\mathcal{A} x \in S^2_{\mathbb{F}}([0,T], \mathbb{R}^n)$, and the mapping $(x_0, u_t) \mapsto x_t$ is continuous from $\mathbb{R}^n \times L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$ into $S^2_{\mathbb{F}}([0,T], \mathbb{R}^n)$.
Furthermore, we have the solvability result for MF-SLQ problem (1) and (2).
Proposition 2.
Let assumptions (A1) and (A2) hold. The problem MF-SLQ (1) and (2) has a unique optimal control.
Proof. 
From the proof of Proposition 1, the mapping $u_t \mapsto x_t$ is bounded and continuous, so $J(h, u_t)$ is continuous; moreover, the set of admissible controls $L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$ is a reflexive Banach space. Since $N_t$ and $N_t + \hat{N}_t$ are uniformly positive, i.e., $\langle N_t u_t, u_t \rangle \ge \alpha |u_t|^2$ and $\langle (N_t + \hat{N}_t) u_t, u_t \rangle \ge \alpha |u_t|^2$ for some constant $\alpha > 0$, $J(h, u_t)$ is strictly convex, and $J(h, u_t) \to \infty$ when $\| u_t \| \to \infty$. This implies that $J(h, u_t)$ has a minimizer, and the minimizer is unique. □
Remark 3.
The existence and uniqueness of the solution to the state equation and the solvability of the MF-SLQ problem were studied by Tang and Meng [21] along the lines of Yong [11]. There, the existence of the solution to the state equation follows from a contraction mapping theorem, and the solvability is established by a classic convex variational principle. Our results extend their arguments to the more general case with hybrid disturbances.

3. Optimality Conditions

In the following, we study the MF-SLQ problem with bounded and observable coefficients. Necessary and sufficient conditions are derived for an optimal control $u_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$.
Theorem 1.
Suppose assumptions (A1) and (A2) are satisfied. Then,
$$
N_t u_t + \hat{N}_t E[u_t] + S_t^\top x_t + \hat{S}_t^\top E[x_t] + B_t^\top p_t + \hat{B}_t^\top E[p_t] + D_t^\top q_t + \hat{D}_t^\top E[q_t] + \int_Z \big( F_{t,z}^\top r_{t,z} + \hat{F}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \varrho_t + \hat{\varrho}_t = 0 \tag{7}
$$
is a necessary and sufficient condition for $u_t$ to be an optimal control. Here, $(p_t, q_t, r_{t,z})$ (called the adjoint processes) is the unique solution (see Shen and Siu (2013) [19]) of the following BSDE:
$$
\begin{aligned}
d p_t = {} & - \Big\{ A_t^\top p_t + \hat{A}_t^\top E[p_t] + C_t^\top q_t + \hat{C}_t^\top E[q_t] + R_t x_t + \hat{R}_t E[x_t] + S_t u_t + \hat{S}_t E[u_t] \\
& \quad + \int_Z \big( E_{t,z}^\top r_{t,z} + \hat{E}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \rho_t + \hat{\rho}_t \Big\} dt + q_t dW(t) + \int_Z r_{t,z} \tilde{\nu}(dt, dz), \\
p_T = {} & Q x_T + \hat{Q} E[x_T] + \iota + \hat{\iota}.
\end{aligned}
\tag{8}
$$
Proof. 
First, we prove the sufficiency. Let $u_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$ satisfy condition (7), let $v_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$ be any admissible control, and let $x^u_t, x^v_t$ be the corresponding system states with the same initial value. We obtain
$$
\begin{aligned}
J(h, v_t) - J(h, u_t) = \frac{1}{2} E \Big\{ & \int_0^T \Big[ \langle R_t (x^v_t - x^u_t), x^v_t - x^u_t \rangle + 2 \langle R_t (x^v_t - x^u_t), x^u_t \rangle + 2 \langle \rho_t, x^v_t - x^u_t \rangle \\
& + \langle \hat{R}_t (E[x^v_t] - E[x^u_t]), E[x^v_t] - E[x^u_t] \rangle + 2 \langle \hat{R}_t (E[x^v_t] - E[x^u_t]), E[x^u_t] \rangle + 2 \langle \hat{\rho}_t, E[x^v_t] - E[x^u_t] \rangle \\
& + 2 \langle x^v_t - x^u_t, S_t u_t \rangle + 2 \langle v_t - u_t, S_t^\top x^v_t \rangle + 2 \langle E[x^v_t] - E[x^u_t], \hat{S}_t E[u_t] \rangle + 2 \langle E[v_t] - E[u_t], \hat{S}_t^\top E[x^v_t] \rangle \\
& + \langle N_t (v_t - u_t), v_t - u_t \rangle + 2 \langle N_t (v_t - u_t), u_t \rangle + 2 \langle \varrho_t, v_t - u_t \rangle \\
& + \langle \hat{N}_t (E[v_t] - E[u_t]), E[v_t] - E[u_t] \rangle + 2 \langle \hat{N}_t (E[v_t] - E[u_t]), E[u_t] \rangle + 2 \langle \hat{\varrho}_t, E[v_t] - E[u_t] \rangle \Big] dt \\
& + \langle Q (x^v_T - x^u_T), x^v_T - x^u_T \rangle + 2 \langle Q (x^v_T - x^u_T), x^u_T \rangle + 2 \langle \iota, x^v_T - x^u_T \rangle \\
& + \langle \hat{Q} (E[x^v_T] - E[x^u_T]), E[x^v_T] - E[x^u_T] \rangle + 2 \langle \hat{Q} (E[x^v_T] - E[x^u_T]), E[x^u_T] \rangle + 2 \langle \hat{\iota}, E[x^v_T] - E[x^u_T] \rangle \Big\}.
\end{aligned}
$$
Since assumptions (A1) and (A2) hold, one obtains
$$
\begin{aligned}
J(h, v_t) - J(h, u_t) \ge {} & E \int_0^T \Big\{ \langle x^v_t - x^u_t, R_t x^u_t + \hat{R}_t E[x^u_t] + \rho_t + \hat{\rho}_t \rangle + \langle x^v_t - x^u_t, S_t u_t + \hat{S}_t E[u_t] \rangle \\
& + \langle v_t - u_t, S_t^\top x^v_t + \hat{S}_t^\top E[x^v_t] \rangle + \langle v_t - u_t, N_t u_t + \hat{N}_t E[u_t] + \varrho_t + \hat{\varrho}_t \rangle \Big\} dt \\
& + E \langle x^v_T - x^u_T, Q x^u_T + \hat{Q} E[x^u_T] + \iota + \hat{\iota} \rangle.
\end{aligned}
\tag{9}
$$
Note that $p_T = Q x_T + \hat{Q} E[x_T] + \iota + \hat{\iota}$. In order to apply version (ii) of the Itô formula (Lemma 1), we take $V(X^1_t, X^2_t) = \langle x^v_t - x^u_t, p_t \rangle$, where $X^1_t = x^v_t - x^u_t$, $X^2_t = p_t$, $x^v_t$ is the solution of (2) when the control is $v$, and $p_t$ is the solution of (8). Then, taking the expectation on both sides, we have
$$
\begin{aligned}
& E \langle x^v_T - x^u_T, Q x^u_T + \hat{Q} E[x^u_T] + \iota + \hat{\iota} \rangle \\
& = E \int_0^T \Big[ \langle p_t, A_t (x^v_t - x^u_t) + \hat{A}_t (E[x^v_t] - E[x^u_t]) + B_t (v_t - u_t) + \hat{B}_t (E[v_t] - E[u_t]) \rangle \\
& \quad - \Big\langle x^v_t - x^u_t, A_t^\top p_t + \hat{A}_t^\top E[p_t] + C_t^\top q_t + \hat{C}_t^\top E[q_t] + R_t x^u_t + \hat{R}_t E[x^u_t] + S_t u_t + \hat{S}_t E[u_t] \\
& \qquad + \int_Z \big( E_{t,z}^\top r_{t,z} + \hat{E}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \rho_t + \hat{\rho}_t \Big\rangle \\
& \quad + \langle q_t, C_t (x^v_t - x^u_t) + \hat{C}_t (E[x^v_t] - E[x^u_t]) + D_t (v_t - u_t) + \hat{D}_t (E[v_t] - E[u_t]) \rangle \\
& \quad + \int_Z \langle r_{t,z}, E_{t,z} (x^v_t - x^u_t) + \hat{E}_{t,z} (E[x^v_t] - E[x^u_t]) + F_{t,z} (v_t - u_t) + \hat{F}_{t,z} (E[v_t] - E[u_t]) \rangle \pi(dz) \Big] dt.
\end{aligned}
$$
Furthermore,
$$
\begin{aligned}
E \langle x^v_T - x^u_T, Q x^u_T + \hat{Q} E[x^u_T] + \iota + \hat{\iota} \rangle = E \int_0^T \Big\{ & \Big\langle v_t - u_t, B_t^\top p_t + \hat{B}_t^\top E[p_t] + D_t^\top q_t + \hat{D}_t^\top E[q_t] + \int_Z \big( F_{t,z}^\top r_{t,z} + \hat{F}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) \Big\rangle \\
& - \langle x^v_t - x^u_t, R_t x^u_t + \hat{R}_t E[x^u_t] + S_t u_t + \hat{S}_t E[u_t] + \rho_t + \hat{\rho}_t \rangle \Big\} dt.
\end{aligned}
\tag{10}
$$
Combining (7), (9) and (10), we arrive at $J(h, v_t) - J(h, u_t) \ge 0$; thus, $u_t$ is an optimal control.
Next, we show the necessity.
Let $(u_t, x^u_t)$ be an optimal pair of the MF-SLQ problem (1) and (2). For any control $v_t \in L^2_{\mathbb{F}}(0, T; \mathbb{R}^m)$, let $x^v_t$ be the corresponding state process. By the definition of the Fréchet derivative, we obtain
$$
\begin{aligned}
& E \int_0^T \Big[ \langle R_t x^u_t + \rho_t, x^v_t \rangle + \langle \hat{R}_t E[x^u_t] + \hat{\rho}_t, E[x^v_t] \rangle + \langle S_t u_t, x^v_t \rangle + \langle S_t v_t, x^u_t \rangle + \langle \hat{S}_t E[u_t], E[x^v_t] \rangle \\
& \quad + \langle \hat{S}_t E[v_t], E[x^u_t] \rangle + \langle N_t u_t + \varrho_t, v_t \rangle + \langle \hat{N}_t E[u_t] + \hat{\varrho}_t, E[v_t] \rangle \Big] dt \\
& \quad + E \big[ \langle Q x^u_T + \iota, x^v_T \rangle + \langle \hat{Q} E[x^u_T] + \hat{\iota}, E[x^v_T] \rangle \big] = 0.
\end{aligned}
$$
That is,
$$
\begin{aligned}
& E \int_0^T \Big[ \langle R_t x^u_t + \hat{R}_t E[x^u_t] + \rho_t + \hat{\rho}_t, x^v_t \rangle + \langle S_t u_t + \hat{S}_t E[u_t], x^v_t \rangle + \langle S_t^\top x^u_t + \hat{S}_t^\top E[x^u_t], v_t \rangle \\
& \quad + \langle N_t u_t + \hat{N}_t E[u_t] + \varrho_t + \hat{\varrho}_t, v_t \rangle \Big] dt + E \langle Q x^u_T + \hat{Q} E[x^u_T] + \iota + \hat{\iota}, x^v_T \rangle = 0.
\end{aligned}
\tag{11}
$$
In order to apply version (ii) of the Itô formula (Lemma 1), we take $V(X^1_t, X^2_t) = \langle p_t, x^v_t \rangle$, where $X^1_t = p_t$, $X^2_t = x^v_t$, $x^v_t$ is the solution of (2) when the control is $v$, and $p_t$ is the solution of (8). Then, taking the expectation on both sides, one obtains
$$
\begin{aligned}
& E \langle Q x^u_T + \hat{Q} E[x^u_T] + \iota + \hat{\iota}, x^v_T \rangle \\
& = E \int_0^T \Big[ \langle p_t, A_t x^v_t + \hat{A}_t E[x^v_t] + B_t v_t + \hat{B}_t E[v_t] \rangle \\
& \quad - \Big\langle x^v_t, A_t^\top p_t + \hat{A}_t^\top E[p_t] + C_t^\top q_t + \hat{C}_t^\top E[q_t] + R_t x^u_t + \hat{R}_t E[x^u_t] + S_t u_t + \hat{S}_t E[u_t] \\
& \qquad + \int_Z \big( E_{t,z}^\top r_{t,z} + \hat{E}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \rho_t + \hat{\rho}_t \Big\rangle \\
& \quad + \langle q_t, C_t x^v_t + \hat{C}_t E[x^v_t] + D_t v_t + \hat{D}_t E[v_t] \rangle + \int_Z \langle r_{t,z}, E_{t,z} x^v_t + \hat{E}_{t,z} E[x^v_t] + F_{t,z} v_t + \hat{F}_{t,z} E[v_t] \rangle \pi(dz) \Big] dt \\
& = E \int_0^T \Big[ - \langle x^v_t, R_t x^u_t + \hat{R}_t E[x^u_t] + S_t u_t + \hat{S}_t E[u_t] + \rho_t + \hat{\rho}_t \rangle \\
& \quad + \Big\langle v_t, B_t^\top p_t + \hat{B}_t^\top E[p_t] + D_t^\top q_t + \hat{D}_t^\top E[q_t] + \int_Z \big( F_{t,z}^\top r_{t,z} + \hat{F}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) \Big\rangle \Big] dt.
\end{aligned}
$$
With (11), it holds that
$$
0 = E \int_0^T \Big\langle v_t, N_t u_t + \hat{N}_t E[u_t] + S_t^\top x_t + \hat{S}_t^\top E[x_t] + B_t^\top p_t + \hat{B}_t^\top E[p_t] + D_t^\top q_t + \hat{D}_t^\top E[q_t] + \int_Z \big( F_{t,z}^\top r_{t,z} + \hat{F}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \varrho_t + \hat{\varrho}_t \Big\rangle dt.
$$
Since $v_t$ is arbitrary, this implies that (7) holds. The proof is complete. □
From the above, the so-called MF-stochastic Hamilton system is given by
$$
\left\{
\begin{aligned}
& d x_t = \big( A_t x_t + \hat{A}_t E[x_t] + B_t u_t + \hat{B}_t E[u_t] + b_t \big) dt + \big( C_t x_t + \hat{C}_t E[x_t] + D_t u_t + \hat{D}_t E[u_t] + \sigma_t \big) dW(t) \\
& \qquad\ + \int_Z \big( E_{t,z} x_t + \hat{E}_{t,z} E[x_t] + F_{t,z} u_t + \hat{F}_{t,z} E[u_t] + \gamma_{t,z} \big) \tilde{\nu}(dt, dz), \\
& d p_t = - \Big\{ A_t^\top p_t + \hat{A}_t^\top E[p_t] + C_t^\top q_t + \hat{C}_t^\top E[q_t] + R_t x_t + \hat{R}_t E[x_t] + S_t u_t + \hat{S}_t E[u_t] \\
& \qquad\ + \int_Z \big( E_{t,z}^\top r_{t,z} + \hat{E}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \rho_t + \hat{\rho}_t \Big\} dt + q_t dW(t) + \int_Z r_{t,z} \tilde{\nu}(dt, dz), \\
& N_t u_t + \hat{N}_t E[u_t] + S_t^\top x_t + \hat{S}_t^\top E[x_t] + B_t^\top p_t + \hat{B}_t^\top E[p_t] + D_t^\top q_t + \hat{D}_t^\top E[q_t] \\
& \qquad\ + \int_Z \big( F_{t,z}^\top r_{t,z} + \hat{F}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \varrho_t + \hat{\varrho}_t = 0, \\
& x_0 = h \in \mathbb{R}^n, \qquad p_T = Q x_T + \hat{Q} E[x_T] + \iota + \hat{\iota}.
\end{aligned}
\right.
\tag{12}
$$
It is a system of mean-field forward–backward stochastic differential equations (MF-FBSDE). If an $\mathbb{F}$-adapted 5-tuple $(x_t, u_t, p_t, q_t, r_{t,z})$ satisfies the above system, then it is called an adapted solution of (12).
Theorem 2.
Suppose assumptions (A1) and (A2) hold. The stochastic Hamilton system (12) admits a unique adapted solution $(x_t, u_t, p_t, q_t, r_{t,z})$ for any initial state $h \in \mathbb{R}^n$. Here, $(x_t, u_t)$ is the unique optimal pair of problem (1) and (2).
Proof. 
The proof is straightforward: existence follows from Proposition 1 and Theorem 1, and uniqueness follows from a standard argument. For simplicity, we omit the details. □
Remark 4.
From the above statement, we can obtain an explicit expression of the open-loop optimal control. Taking the expectation in (7), we obtain
$$
0 = (N_t + \hat{N}_t) E[u_t] + (S_t + \hat{S}_t)^\top E[x_t] + (B_t + \hat{B}_t)^\top E[p_t] + (D_t + \hat{D}_t)^\top E[q_t] + \int_Z (F_{t,z} + \hat{F}_{t,z})^\top E[r_{t,z}] \pi(dz) + E[\varrho_t] + \hat{\varrho}_t.
$$
Thus,
$$
E[u_t] = - (N_t + \hat{N}_t)^{-1} \Big[ (S_t + \hat{S}_t)^\top E[x_t] + (B_t + \hat{B}_t)^\top E[p_t] + (D_t + \hat{D}_t)^\top E[q_t] + \int_Z (F_{t,z} + \hat{F}_{t,z})^\top E[r_{t,z}] \pi(dz) + E[\varrho_t] + \hat{\varrho}_t \Big]. \tag{13}
$$
Substituting (13) into (7) and pre-multiplying by $N_t^{-1}$, we have
$$
\begin{aligned}
u_t = {} & - N_t^{-1} \hat{N}_t E[u_t] - N_t^{-1} \Big[ S_t^\top x_t + \hat{S}_t^\top E[x_t] + B_t^\top p_t + \hat{B}_t^\top E[p_t] + D_t^\top q_t + \hat{D}_t^\top E[q_t] \\
& \quad + \int_Z \big( F_{t,z}^\top r_{t,z} + \hat{F}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \varrho_t + \hat{\varrho}_t \Big] \\
= {} & N_t^{-1} \hat{N}_t (N_t + \hat{N}_t)^{-1} \Big[ (S_t + \hat{S}_t)^\top E[x_t] + (B_t + \hat{B}_t)^\top E[p_t] + (D_t + \hat{D}_t)^\top E[q_t] \\
& \quad + \int_Z (F_{t,z} + \hat{F}_{t,z})^\top E[r_{t,z}] \pi(dz) + E[\varrho_t] + \hat{\varrho}_t \Big] \\
& - N_t^{-1} \Big[ S_t^\top x_t + \hat{S}_t^\top E[x_t] + B_t^\top p_t + \hat{B}_t^\top E[p_t] + D_t^\top q_t + \hat{D}_t^\top E[q_t] + \int_Z \big( F_{t,z}^\top r_{t,z} + \hat{F}_{t,z}^\top E[r_{t,z}] \big) \pi(dz) + \varrho_t + \hat{\varrho}_t \Big].
\end{aligned}
$$
The above equality gives the open-loop optimal control; that is, the fully coupled stochastic Hamilton system (12) determines the optimal control of problem (1) and (2). However, the optimal policy in Remark 4 is not implementable, because it depends on the adjoint processes $(p_t, q_t, r_{t,z})$ rather than on the state alone. To fill this gap, it is natural to turn to Riccati equations.
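The two-step algebra of Remark 4 (first solve for $E[u_t]$, then for $u_t$) can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's procedure: it works in the scalar case at one fixed time, replaces the expectation by an average over equally weighted sample scenarios, and drops the jump term. The function name `open_loop_control` and all argument names are hypothetical.

```python
def open_loop_control(x, p, q, varrho, N, N_hat, S, S_hat,
                      B, B_hat, D, D_hat, varrho_hat):
    """Solve the scalar stationarity condition (7) for u, scenario by scenario.

    x, p, q, varrho are lists of equal length holding sample values of
    x_t, p_t, q_t and varrho_t; expectations are plain sample averages.
    The jump integral is omitted (assumption made for brevity).
    """
    def mean(values):
        return sum(values) / len(values)

    Ex, Ep, Eq, Evarrho = mean(x), mean(p), mean(q), mean(varrho)

    # Step 1: expectation of (7), solved for E[u] as in (13).
    Eu = -((S + S_hat) * Ex + (B + B_hat) * Ep + (D + D_hat) * Eq
           + Evarrho + varrho_hat) / (N + N_hat)

    # Step 2: substitute E[u] back into (7) and solve for u in each scenario.
    u = [-(N_hat * Eu + S * xi + S_hat * Ex + B * p_i + B_hat * Ep
           + D * qi + D_hat * Eq + vi + varrho_hat) / N
         for xi, p_i, qi, vi in zip(x, p, q, varrho)]
    return u, Eu
```

By construction the sample average of the returned `u` equals `Eu`, which is the consistency requirement that makes the two-step solution legitimate. The inputs `p` and `q` still have to come from the backward equation (8), which is why this representation alone is not a feedback law.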

4. Decoupling the MF-FBSDE and Representation of Optimal Feedback Regulator

In the following, in order to obtain an explicit representation of the optimal feedback regulator, we use a decoupling technique to derive two Riccati differential equations, a BSDE with random jumps and an ODE. As a result, the relationship between the two Riccati equations and the mean-field stochastic Hamilton system is established; the two are different but equivalent tools for the MF-SLQ problem.
Theorem 3.
Suppose assumptions (A1) and (A2) are satisfied and ( x t , u t , p t , q t , r t , z ) is the unique adapted solution of Hamilton system (12). Thus, the optimal control u t of the MF-SLQ (1) and (2) problem has the following closed-loop form:
u t = K 0 1 M 0 ( x t E [ x t ] ) K 1 1 M 1 E [ x t ] K 0 1 { ϱ t E [ ϱ t ] + B t ( ξ t E [ ξ t ] ) + D t [ P t ( σ t E [ σ t ] ) + η t E [ η t ] ] + Z F t , z [ P t ( γ t , z E [ γ t , z ] ) + ζ t , z E [ ζ t , z ] ] π ( d z ) } K 1 1 { E [ ϱ t ] + ϱ ^ t + ( B t + B ^ t ) χ t + ( D t + D ^ t ) ( P t E [ σ t ] + E [ η t ] ) + Z ( F t , z + F ^ t , z ) ( P t E [ γ t , z ] + E [ ζ t , z ] ) π ( d z ) } .
Furthermore, we have the following relationship:
p t = P t ( x t E [ x t ] ) + Π t E [ x t ] + ξ t + χ t E [ ξ t ] , q t = P t ( C t x t + C ^ t E [ x t ] + D t u t + D ^ t E [ u t ] + σ t ) + η t , r t , z = P t ( E t , z x t + E ^ t , z E [ x t ] + F t , z u t + F ^ t , z E [ u t ] + γ t , z ) + ζ t , z .
Here,
K 0 = N t + D t P t D t + Z F t , z P t F t , z π ( d z ) , K 1 = N t + N ^ t + ( D t + D ^ t ) P t ( D t + D ^ t ) + Z ( F t , z + F ^ t , z ) P t ( F t , z + F ^ t , z ) π ( d z ) , M 0 = S t + B t P t + D t P t C t + Z F t , z P t E t , z π ( d z ) , M 1 = S t + S ^ t + Π t ( B t + B ^ t ) + ( C t + C ^ t ) P t ( D t + D ^ t ) + Z ( E t , z + E ^ t , z ) P t ( F t , z + F ^ t , z ) π ( d z ) ,
and P t , Π t are solutions of the following Riccati equations, respectively:
P t + P t A t + A t P t + C t P t C t + Z E t , z P t E t , z π ( d z ) + R t M 0 K 0 1 M 0 = 0 P T = Q ,
Π t + Π t ( A t + A ^ t ) + ( A t + A ^ t ) Π t + ( C t + C ^ t ) P t ( C t + C ^ t ) + Z ( E t , z + E ^ t , z ) P t ( E t , z + E ^ t , z ) π ( d z ) + R t + R ^ t M 1 K 1 1 M 1 = 0 Π T = Q + Q ^ ,
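and $(\xi_t, \eta_t, \zeta_{t,z})$ satisfies the BSDE with random jumps
$$
\begin{cases}
d\xi_t = -\Big\{ P_t b_t + \rho_t + A_t^{\top}\xi_t + C_t^{\top}(P_t\sigma_t + \eta_t) + \displaystyle\int_Z E_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z})\,\pi(dz)\\
\qquad\quad - M_0^{\top} K_0^{-1}\Big[\varrho_t + B_t^{\top}\xi_t + D_t^{\top}(P_t\sigma_t + \eta_t) + \displaystyle\int_Z F_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z})\,\pi(dz)\Big]\Big\}dt\\
\qquad\quad + \eta_t\,dW(t) + \displaystyle\int_Z \zeta_{t,z}\,\tilde{\nu}(dt,dz),\\
\xi_T = \iota,
\end{cases}
\tag{18}
$$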
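and $\chi_t$ is the solution to the ODE
$$
\begin{cases}
-\dot{\chi}_t = \Big\{\Pi_t\mathbb{E}[b_t] + \mathbb{E}[\rho_t] + \hat{\rho}_t + (A_t + \hat{A}_t)^{\top}\chi_t + (C_t + \hat{C}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
\qquad\quad + \displaystyle\int_Z (E_{t,z} + \hat{E}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big\}\\
\qquad\quad - M_1^{\top} K_1^{-1}\Big\{\mathbb{E}[\varrho_t] + \hat{\varrho}_t + (B_t + \hat{B}_t)^{\top}\chi_t + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
\qquad\quad + \displaystyle\int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big\},\\
\chi_T = \mathbb{E}[\iota] + \hat{\iota}.
\end{cases}
\tag{19}
$$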
Remark 5.
In the above theorem, to decouple the MF-FBSDE and represent the optimal feedback regulator, two additional disturbance equations are introduced, which help to handle the stochastic perturbations. Moreover, the optimal feedback form (14) is obtained through the solvability of the two Riccati equations, which differs from [14].
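To make the structure of the Riccati pair (16) and (17) concrete, the following is a minimal numerical sketch, not the authors' method. It assumes scalar, time-invariant coefficients and no jump terms ($E = \hat{E} = F = \hat{F} = 0$), and integrates both equations backward from the terminal values with a classical Runge-Kutta scheme. The function names, the dictionary keys, and the step count are illustrative choices.

```python
import math


def rk4_backward(f, y_T, T, n_steps):
    """Integrate y'(t) = f(t, y) from t = T back to t = 0 with classical RK4."""
    h = T / n_steps
    t, y = T, list(y_T)
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t - h / 2, [a - h / 2 * k for a, k in zip(y, k1)])
        k3 = f(t - h / 2, [a - h / 2 * k for a, k in zip(y, k2)])
        k4 = f(t - h, [a - h * k for a, k in zip(y, k3)])
        y = [a - h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t -= h
    return y  # value at t = 0


def riccati_rhs(c):
    """(dP/dt, dPi/dt) from (16)-(17): scalar, constant coefficients, no jumps (assumption)."""
    g = lambda name: c.get(name, 0.0)  # unspecified coefficients default to 0
    A, Ah, B, Bh = g("A"), g("A_hat"), g("B"), g("B_hat")
    C, Ch, D, Dh = g("C"), g("C_hat"), g("D"), g("D_hat")
    R, Rh, S, Sh = g("R"), g("R_hat"), g("S"), g("S_hat")
    N, Nh = g("N"), g("N_hat")

    def f(t, y):
        P, Pi = y
        K0 = N + D * P * D
        K1 = N + Nh + (D + Dh) * P * (D + Dh)
        M0 = S + B * P + D * P * C
        M1 = S + Sh + (B + Bh) * Pi + (D + Dh) * P * (C + Ch)
        dP = -(2 * A * P + C * P * C + R - M0 * M0 / K0)
        dPi = -(2 * (A + Ah) * Pi + (C + Ch) ** 2 * P + R + Rh - M1 * M1 / K1)
        return [dP, dPi]

    return f


# Illustrative data: A_hat = 1, B = 1, B_hat = -1, D_hat = 1, N = N_hat = 1, Q = Q_hat = 1.
example = {"A_hat": 1.0, "B": 1.0, "B_hat": -1.0, "D_hat": 1.0, "N": 1.0, "N_hat": 1.0}
P0, Pi0 = rk4_backward(riccati_rhs(example), [1.0, 2.0], T=1.0, n_steps=1000)
P0_exact, Pi0_exact = 0.5, 2.0 * math.exp(2.0)  # closed forms 1/(2 - t), 2 e^{2(1 - t)} at t = 0
```

For this illustrative data, $M_1 = 0$, so $K_1$ does not influence $\Pi_t$; the equations reduce to $\dot{P}_t = P_t^2$ and $\dot{\Pi}_t = -2\Pi_t$, which is why closed forms are available for comparison.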
Proof. 
In view of the terminal condition of (8), we make the a priori assumption that, for some deterministic differentiable functions $P_t$, $\hat{P}_t$, $\hat{\xi}_t$ and some stochastic process $\xi_t$ such that $P_T = Q$, $\hat{P}_T = \hat{Q}$, $\xi_T = \iota$, and $\hat{\xi}_T = \hat{\iota}$, we have $p_t = P_t x_t + \hat{P}_t\mathbb{E}[x_t] + \xi_t + \hat{\xi}_t$, $t \in [0,T]$, and we let $d\xi_t = \Delta_t\,dt + \eta_t\,dW(t) + \int_Z \zeta_{t,z}\,\tilde{\nu}(dt,dz)$.
Next, we calculate the differential of $p_t$. We set $V_1(X_t^1, X_t^2) = P_t x_t$, where $X_t^1 = P_t$, $X_t^2 = x_t$, $x_t$ is the solution of (2) and $P_t$ is the solution of (16), and $V_2(X_t) = \xi_t$, where $\xi_t$ is the solution of (18). We apply version (ii) of the Itô formula to $V_1$ and $V_2$, and also calculate the differentials of $\hat{P}_t\mathbb{E}[x_t]$ and $\hat{\xi}_t$. Adding the results together gives
$$
\begin{aligned}
dp_t ={}& d\big(P_t x_t + \hat{P}_t\mathbb{E}[x_t] + \xi_t + \hat{\xi}_t\big)\\
={}& \Big[\dot{P}_t x_t + P_t\big(A_t x_t + \hat{A}_t\mathbb{E}[x_t] + B_t u_t + \hat{B}_t\mathbb{E}[u_t] + b_t\big) + \dot{\hat{P}}_t\mathbb{E}[x_t]\\
&\quad + \hat{P}_t\big((A_t + \hat{A}_t)\mathbb{E}[x_t] + (B_t + \hat{B}_t)\mathbb{E}[u_t] + \mathbb{E}[b_t]\big) + \Delta_t + \dot{\hat{\xi}}_t\Big]dt\\
& + \big[P_t\big(C_t x_t + \hat{C}_t\mathbb{E}[x_t] + D_t u_t + \hat{D}_t\mathbb{E}[u_t] + \sigma_t\big) + \eta_t\big]dW(t)\\
& + \int_Z\big[P_t\big(E_{t,z} x_t + \hat{E}_{t,z}\mathbb{E}[x_t] + F_{t,z} u_t + \hat{F}_{t,z}\mathbb{E}[u_t] + \gamma_{t,z}\big) + \zeta_{t,z}\big]\tilde{\nu}(dt,dz),
\end{aligned}
$$
and comparing the diffusion and jump terms with those of the following BSDE in the Hamilton system (12),
$$
\begin{aligned}
dp_t ={}& -\Big[A_t^{\top} p_t + \hat{A}_t^{\top}\mathbb{E}[p_t] + C_t^{\top} q_t + \hat{C}_t^{\top}\mathbb{E}[q_t] + R_t x_t + \hat{R}_t\mathbb{E}[x_t] + S_t u_t + \hat{S}_t\mathbb{E}[u_t]\\
&\quad + \int_Z\big(E_{t,z}^{\top} r_{t,z} + \hat{E}_{t,z}^{\top}\mathbb{E}[r_{t,z}]\big)\pi(dz) + \rho_t + \hat{\rho}_t\Big]dt + q_t\,dW(t) + \int_Z r_{t,z}\,\tilde{\nu}(dt,dz),
\end{aligned}
$$
we obtain the relationship (15) by letting $\Pi_t = P_t + \hat{P}_t$ and $\chi_t = \mathbb{E}[\xi_t] + \hat{\xi}_t$.
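Next, we prove (14), that is, the closed-loop form of the optimal control $u_t$. Substituting the relationship (15) into the optimality condition (7), one has
$$
\begin{aligned}
0 ={}& N_t u_t + \hat{N}_t\mathbb{E}[u_t] + S_t^{\top} x_t + \hat{S}_t^{\top}\mathbb{E}[x_t] + B_t^{\top}\big(P_t x_t + \hat{P}_t\mathbb{E}[x_t] + \xi_t + \hat{\xi}_t\big)\\
& + \hat{B}_t^{\top}\big[(P_t + \hat{P}_t)\mathbb{E}[x_t] + \mathbb{E}[\xi_t] + \mathbb{E}[\hat{\xi}_t]\big] + \varrho_t + \hat{\varrho}_t\\
& + D_t^{\top}\big[P_t\big(C_t x_t + \hat{C}_t\mathbb{E}[x_t] + D_t u_t + \hat{D}_t\mathbb{E}[u_t] + \sigma_t\big) + \eta_t\big]\\
& + \hat{D}_t^{\top}\big\{P_t\big[(C_t + \hat{C}_t)\mathbb{E}[x_t] + (D_t + \hat{D}_t)\mathbb{E}[u_t] + \mathbb{E}[\sigma_t]\big] + \mathbb{E}[\eta_t]\big\}\\
& + \int_Z F_{t,z}^{\top}\big[P_t\big(E_{t,z} x_t + \hat{E}_{t,z}\mathbb{E}[x_t] + F_{t,z} u_t + \hat{F}_{t,z}\mathbb{E}[u_t] + \gamma_{t,z}\big) + \zeta_{t,z}\big]\pi(dz)\\
& + \int_Z \hat{F}_{t,z}^{\top}\big\{P_t\big[(E_{t,z} + \hat{E}_{t,z})\mathbb{E}[x_t] + (F_{t,z} + \hat{F}_{t,z})\mathbb{E}[u_t] + \mathbb{E}[\gamma_{t,z}]\big] + \mathbb{E}[\zeta_{t,z}]\big\}\pi(dz).
\end{aligned}
$$
Recalling the definitions of $K_0$ and $M_0$, we obtain
$$
\begin{aligned}
0 ={}& K_0 u_t + \Big[\hat{N}_t + D_t^{\top} P_t\hat{D}_t + \hat{D}_t^{\top} P_t (D_t + \hat{D}_t) + \int_Z\big(F_{t,z}^{\top} P_t\hat{F}_{t,z} + \hat{F}_{t,z}^{\top} P_t (F_{t,z} + \hat{F}_{t,z})\big)\pi(dz)\Big]\mathbb{E}[u_t]\\
& + M_0 x_t + \Big[\hat{S}_t^{\top} + B_t^{\top}\hat{P}_t + \hat{B}_t^{\top}(P_t + \hat{P}_t) + D_t^{\top} P_t\hat{C}_t + \hat{D}_t^{\top} P_t (C_t + \hat{C}_t)\\
&\quad + \int_Z\big(F_{t,z}^{\top} P_t\hat{E}_{t,z} + \hat{F}_{t,z}^{\top} P_t (E_{t,z} + \hat{E}_{t,z})\big)\pi(dz)\Big]\mathbb{E}[x_t]\\
& + \varrho_t + \hat{\varrho}_t + B_t^{\top}(\xi_t + \hat{\xi}_t) + \hat{B}_t^{\top}\big(\mathbb{E}[\xi_t] + \mathbb{E}[\hat{\xi}_t]\big) + D_t^{\top}(P_t\sigma_t + \eta_t) + \hat{D}_t^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
& + \int_Z\big[F_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z}) + \hat{F}_{t,z}^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\big]\pi(dz).
\end{aligned}
\tag{21}
$$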
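Taking the expectation, and noticing that
$$
K_1 = K_0 + \hat{N}_t + D_t^{\top} P_t\hat{D}_t + \hat{D}_t^{\top} P_t (D_t + \hat{D}_t) + \int_Z\big(F_{t,z}^{\top} P_t\hat{F}_{t,z} + \hat{F}_{t,z}^{\top} P_t (F_{t,z} + \hat{F}_{t,z})\big)\pi(dz),
$$
and
$$
M_1 = M_0 + \hat{S}_t^{\top} + B_t^{\top}\hat{P}_t + \hat{B}_t^{\top}(P_t + \hat{P}_t) + D_t^{\top} P_t\hat{C}_t + \hat{D}_t^{\top} P_t (C_t + \hat{C}_t) + \int_Z\big(F_{t,z}^{\top} P_t\hat{E}_{t,z} + \hat{F}_{t,z}^{\top} P_t (E_{t,z} + \hat{E}_{t,z})\big)\pi(dz),
$$
we have
$$
\begin{aligned}
0 ={}& K_1\mathbb{E}[u_t] + M_1\mathbb{E}[x_t] + \mathbb{E}[\varrho_t + \hat{\varrho}_t] + (B_t + \hat{B}_t)^{\top}\big(\mathbb{E}[\xi_t] + \mathbb{E}[\hat{\xi}_t]\big)\\
& + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big) + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz).
\end{aligned}
$$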
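Furthermore,
$$
\begin{aligned}
\mathbb{E}[u_t] ={}& -K_1^{-1} M_1\mathbb{E}[x_t] - K_1^{-1}\Big[\mathbb{E}[\varrho_t + \hat{\varrho}_t] + (B_t + \hat{B}_t)^{\top}\big(\mathbb{E}[\xi_t] + \mathbb{E}[\hat{\xi}_t]\big)\\
& + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big) + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big].
\end{aligned}
\tag{22}
$$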
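Putting (22) into (21) and noticing the forms of $K_1 - K_0$ and $M_1 - M_0$, one obtains
$$
\begin{aligned}
0 ={}& K_0 u_t + M_0 x_t + (M_1 - M_0)\mathbb{E}[x_t] + \varrho_t + \hat{\varrho}_t + B_t^{\top}(\xi_t + \hat{\xi}_t) + \hat{B}_t^{\top}\big(\mathbb{E}[\xi_t] + \mathbb{E}[\hat{\xi}_t]\big)\\
& + D_t^{\top}(P_t\sigma_t + \eta_t) + \hat{D}_t^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
& + \int_Z\big[F_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z}) + \hat{F}_{t,z}^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\big]\pi(dz)\\
& - (K_1 - K_0) K_1^{-1} M_1\mathbb{E}[x_t] - (K_1 - K_0) K_1^{-1}\Big[\mathbb{E}[\varrho_t + \hat{\varrho}_t] + (B_t + \hat{B}_t)^{\top}\big(\mathbb{E}[\xi_t] + \mathbb{E}[\hat{\xi}_t]\big)\\
&\quad + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big) + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big].
\end{aligned}
$$
Thus, we have
$$
\begin{aligned}
0 ={}& K_0 u_t + M_0 x_t - M_0\mathbb{E}[x_t] + K_0 K_1^{-1} M_1\mathbb{E}[x_t] + \varrho_t - \mathbb{E}[\varrho_t] + B_t^{\top}\big(\xi_t - \mathbb{E}[\xi_t]\big)\\
& + D_t^{\top}\big[P_t(\sigma_t - \mathbb{E}[\sigma_t]) + \eta_t - \mathbb{E}[\eta_t]\big] + \int_Z F_{t,z}^{\top}\big[P_t(\gamma_{t,z} - \mathbb{E}[\gamma_{t,z}]) + \zeta_{t,z} - \mathbb{E}[\zeta_{t,z}]\big]\pi(dz)\\
& + K_0 K_1^{-1}\Big\{\mathbb{E}[\varrho_t + \hat{\varrho}_t] + (B_t + \hat{B}_t)^{\top}\big(\mathbb{E}[\xi_t] + \mathbb{E}[\hat{\xi}_t]\big) + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
&\quad + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big\}.
\end{aligned}
$$
Pre-multiplying both sides of the above equality by $K_0^{-1}$, and noticing that $\mathbb{E}[\hat{\varrho}_t] = \hat{\varrho}_t$, $\mathbb{E}[\hat{\xi}_t] = \hat{\xi}_t$ and $\chi_t = \mathbb{E}[\xi_t] + \hat{\xi}_t$, we obtain (14).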
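For the rest of this proof, we give the formal derivation of the two Riccati Equations (16) and (17) and the two perturbation Equations (18) and (19).
From the drift terms of (20) and the relationship (15), one obtains
$$
\begin{aligned}
0 ={}& \dot{P}_t x_t + P_t\big(A_t x_t + \hat{A}_t\mathbb{E}[x_t] + B_t u_t + \hat{B}_t\mathbb{E}[u_t] + b_t\big) + \dot{\hat{P}}_t\mathbb{E}[x_t]\\
& + \hat{P}_t\big((A_t + \hat{A}_t)\mathbb{E}[x_t] + (B_t + \hat{B}_t)\mathbb{E}[u_t] + \mathbb{E}[b_t]\big) + \Delta_t + \dot{\hat{\xi}}_t + \rho_t + \hat{\rho}_t\\
& + A_t^{\top}\big(P_t x_t + \hat{P}_t\mathbb{E}[x_t] + \xi_t + \hat{\xi}_t\big) + \hat{A}_t^{\top}\big[(P_t + \hat{P}_t)\mathbb{E}[x_t] + \mathbb{E}[\xi_t + \hat{\xi}_t]\big]\\
& + C_t^{\top}\big[P_t\big(C_t x_t + \hat{C}_t\mathbb{E}[x_t] + D_t u_t + \hat{D}_t\mathbb{E}[u_t] + \sigma_t\big) + \eta_t\big]\\
& + \hat{C}_t^{\top}\big\{P_t\big[(C_t + \hat{C}_t)\mathbb{E}[x_t] + (D_t + \hat{D}_t)\mathbb{E}[u_t] + \mathbb{E}[\sigma_t]\big] + \mathbb{E}[\eta_t]\big\}\\
& + R_t x_t + \hat{R}_t\mathbb{E}[x_t] + S_t u_t + \hat{S}_t\mathbb{E}[u_t]\\
& + \int_Z\Big\{E_{t,z}^{\top}\big[P_t\big(E_{t,z} x_t + \hat{E}_{t,z}\mathbb{E}[x_t] + F_{t,z} u_t + \hat{F}_{t,z}\mathbb{E}[u_t] + \gamma_{t,z}\big) + \zeta_{t,z}\big]\\
&\quad + \hat{E}_{t,z}^{\top}\big[P_t\big((E_{t,z} + \hat{E}_{t,z})\mathbb{E}[x_t] + (F_{t,z} + \hat{F}_{t,z})\mathbb{E}[u_t] + \mathbb{E}[\gamma_{t,z}]\big) + \mathbb{E}[\zeta_{t,z}]\big]\Big\}\pi(dz).
\end{aligned}
$$
Furthermore,
$$
\begin{aligned}
0 ={}& \Big(\dot{P}_t + P_t A_t + A_t^{\top} P_t + C_t^{\top} P_t C_t + \int_Z E_{t,z}^{\top} P_t E_{t,z}\,\pi(dz) + R_t\Big)x_t\\
& + \Big[\dot{\hat{P}}_t + P_t\hat{A}_t + \hat{P}_t(A_t + \hat{A}_t) + A_t^{\top}\hat{P}_t + \hat{A}_t^{\top}(P_t + \hat{P}_t) + C_t^{\top} P_t\hat{C}_t + \hat{C}_t^{\top} P_t (C_t + \hat{C}_t)\\
&\quad + \int_Z\big(E_{t,z}^{\top} P_t\hat{E}_{t,z} + \hat{E}_{t,z}^{\top} P_t (E_{t,z} + \hat{E}_{t,z})\big)\pi(dz) + \hat{R}_t\Big]\mathbb{E}[x_t]\\
& + \Big(S_t + P_t B_t + C_t^{\top} P_t D_t + \int_Z E_{t,z}^{\top} P_t F_{t,z}\,\pi(dz)\Big)u_t\\
& + \Big[\hat{S}_t + P_t\hat{B}_t + \hat{P}_t(B_t + \hat{B}_t) + C_t^{\top} P_t\hat{D}_t + \hat{C}_t^{\top} P_t (D_t + \hat{D}_t)\\
&\quad + \int_Z\big(E_{t,z}^{\top} P_t\hat{F}_{t,z} + \hat{E}_{t,z}^{\top} P_t (F_{t,z} + \hat{F}_{t,z})\big)\pi(dz)\Big]\mathbb{E}[u_t]\\
& + P_t b_t + \hat{P}_t\mathbb{E}[b_t] + \Delta_t + \dot{\hat{\xi}}_t + \rho_t + \hat{\rho}_t + A_t^{\top}(\xi_t + \hat{\xi}_t) + \hat{A}_t^{\top}\mathbb{E}[\xi_t + \hat{\xi}_t]\\
& + C_t^{\top}(P_t\sigma_t + \eta_t) + \hat{C}_t^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
& + \int_Z\big[E_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z}) + \hat{E}_{t,z}^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\big]\pi(dz).
\end{aligned}
$$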
Putting (14) and (22) into the above equality and collecting the terms in $x_t$, we have
$$
\begin{aligned}
0 ={}& \Big\{\dot{P}_t + P_t A_t + A_t^{\top} P_t + C_t^{\top} P_t C_t + \int_Z E_{t,z}^{\top} P_t E_{t,z}\,\pi(dz) + R_t\\
& - \Big(S_t + P_t B_t + C_t^{\top} P_t D_t + \int_Z E_{t,z}^{\top} P_t F_{t,z}\,\pi(dz)\Big) K_0^{-1}\Big(S_t^{\top} + B_t^{\top} P_t + D_t^{\top} P_t C_t + \int_Z F_{t,z}^{\top} P_t E_{t,z}\,\pi(dz)\Big)\Big\}x_t.
\end{aligned}
$$
This implies that the first Riccati equation has the form (16); under assumptions (A1) and (A2), Riccati Equation (16) has a unique solution (see [16]).
At the same time, collecting the terms in $\mathbb{E}[x_t]$, we have the following Riccati equation:
$$
\begin{aligned}
0 ={}& \Big\{\Big[\dot{\hat{P}}_t + P_t\hat{A}_t + \hat{P}_t(A_t + \hat{A}_t) + A_t^{\top}\hat{P}_t + \hat{A}_t^{\top}(P_t + \hat{P}_t) + C_t^{\top} P_t\hat{C}_t + \hat{C}_t^{\top} P_t (C_t + \hat{C}_t)\\
&\quad + \int_Z\big(E_{t,z}^{\top} P_t\hat{E}_{t,z} + \hat{E}_{t,z}^{\top} P_t (E_{t,z} + \hat{E}_{t,z})\big)\pi(dz) + \hat{R}_t\Big]\\
& + \Big[S_t + P_t B_t + C_t^{\top} P_t D_t + \int_Z E_{t,z}^{\top} P_t F_{t,z}\,\pi(dz)\Big] K_0^{-1}\Big[S_t^{\top} + B_t^{\top} P_t + D_t^{\top} P_t C_t + \int_Z F_{t,z}^{\top} P_t E_{t,z}\,\pi(dz)\Big]\\
& - \Big[S_t + \hat{S}_t + (P_t + \hat{P}_t)(B_t + \hat{B}_t) + (C_t + \hat{C}_t)^{\top} P_t (D_t + \hat{D}_t) + \int_Z (E_{t,z} + \hat{E}_{t,z})^{\top} P_t (F_{t,z} + \hat{F}_{t,z})\,\pi(dz)\Big]\\
&\quad\cdot K_1^{-1}\Big[S_t^{\top} + \hat{S}_t^{\top} + (B_t + \hat{B}_t)^{\top}(P_t + \hat{P}_t) + (D_t + \hat{D}_t)^{\top} P_t (C_t + \hat{C}_t)\\
&\quad + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top} P_t (E_{t,z} + \hat{E}_{t,z})\,\pi(dz)\Big]\Big\}\mathbb{E}[x_t].
\end{aligned}
\tag{23}
$$
Since $\hat{Q}$ is only assumed to be symmetric, the solvability of this Riccati equation is not obvious, so we construct another one. Adding (16) and (23) and noticing that $\Pi_t = P_t + \hat{P}_t$, we obtain Riccati Equation (17), whose form agrees with that in [11]. Under assumptions (A1) and (A2), Riccati Equation (17) admits a unique solution.
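Meanwhile, the remaining terms give
$$
\begin{aligned}
0 ={}& P_t b_t + \hat{P}_t\mathbb{E}[b_t] + \Delta_t + \dot{\hat{\xi}}_t + \rho_t + \hat{\rho}_t + A_t^{\top}(\xi_t + \hat{\xi}_t) + \hat{A}_t^{\top}\mathbb{E}[\xi_t + \hat{\xi}_t]\\
& + C_t^{\top}(P_t\sigma_t + \eta_t) + \hat{C}_t^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
& + \int_Z\big[E_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z}) + \hat{E}_{t,z}^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\big]\pi(dz)\\
& - M_0^{\top} K_0^{-1}\Big\{\varrho_t - \mathbb{E}[\varrho_t] + B_t^{\top}\big(\xi_t - \mathbb{E}[\xi_t]\big) + D_t^{\top}\big[P_t(\sigma_t - \mathbb{E}[\sigma_t]) + \eta_t - \mathbb{E}[\eta_t]\big]\\
&\quad + \int_Z F_{t,z}^{\top}\big[P_t(\gamma_{t,z} - \mathbb{E}[\gamma_{t,z}]) + \zeta_{t,z} - \mathbb{E}[\zeta_{t,z}]\big]\pi(dz)\Big\}\\
& - M_1^{\top} K_1^{-1}\Big\{\mathbb{E}[\varrho_t] + \hat{\varrho}_t + (B_t + \hat{B}_t)^{\top}\big(\mathbb{E}[\xi_t] + \hat{\xi}_t\big) + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
&\quad + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big\},
\end{aligned}
$$
and separating the stochastic and deterministic terms, we obtain the following equation:
$$
\begin{aligned}
0 ={}& P_t b_t + \Delta_t + \rho_t + A_t^{\top}\xi_t + C_t^{\top}(P_t\sigma_t + \eta_t) + \int_Z E_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z})\,\pi(dz)\\
& - M_0^{\top} K_0^{-1}\Big\{\varrho_t + B_t^{\top}\xi_t + D_t^{\top}(P_t\sigma_t + \eta_t) + \int_Z F_{t,z}^{\top}(P_t\gamma_{t,z} + \zeta_{t,z})\,\pi(dz)\Big\},
\end{aligned}
\tag{24}
$$
and the ordinary differential equation
$$
\begin{aligned}
0 ={}& \hat{P}_t\mathbb{E}[b_t] + \dot{\hat{\xi}}_t + \hat{\rho}_t + A_t^{\top}\hat{\xi}_t + \hat{A}_t^{\top}\mathbb{E}[\xi_t + \hat{\xi}_t] + \hat{C}_t^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
& + \int_Z \hat{E}_{t,z}^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\\
& + M_0^{\top} K_0^{-1}\Big\{\mathbb{E}[\varrho_t] + B_t^{\top}\mathbb{E}[\xi_t] + D_t^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big) + \int_Z F_{t,z}^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big\}\\
& - M_1^{\top} K_1^{-1}\Big\{\mathbb{E}[\varrho_t] + \hat{\varrho}_t + (B_t + \hat{B}_t)^{\top}\big(\mathbb{E}[\xi_t] + \hat{\xi}_t\big) + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
&\quad + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big\}.
\end{aligned}
\tag{25}
$$
From (24), we obtain the BSDE (18) for the stochastic perturbation.
Taking the expectation in (18) and adding equality (25), we obtain Equation (19), since $\chi_t = \mathbb{E}[\xi_t] + \hat{\xi}_t$. This completes the proof. □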
By virtue of a direct verification, the following theorem holds.
Theorem 4.
Suppose assumptions (A1) and (A2) hold. Let $P_t$, $\Pi_t$, $(\xi_t, \eta_t, \zeta_{t,z})$ and $\chi_t$ be the solutions of the Riccati Equations (16) and (17) and the perturbation Equations (18) and (19), respectively. Then, (14) is the closed-loop optimal control of problem (1) and (2). Moreover, for any $h \in \mathbb{R}^n$,
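$$
\begin{aligned}
\inf_{u \in L^2_{\mathcal{F}}(0,T;\mathbb{R}^m)} J(h, u) ={}& \mathbb{E}\Big(\tfrac{1}{2}\langle\Pi_0 h, h\rangle + \langle\chi_0, h\rangle\Big) + \mathbb{E}\int_0^T\Big\{\langle P_t\sigma_t, \sigma_t\rangle + \int_Z\langle P_t\gamma_{t,z}, \gamma_{t,z}\rangle\pi(dz)\\
& + \langle\xi_t, b_t - \mathbb{E}[b_t]\rangle + \langle\eta_t, \sigma_t\rangle + \int_Z\langle\zeta_{t,z}, \gamma_{t,z}\rangle\pi(dz) + \langle\chi_t, \mathbb{E}[b_t]\rangle\\
& - \Big|K_0^{-1}\Big[B_t^{\top}\big(\xi_t - \mathbb{E}[\xi_t]\big) + D_t^{\top}\big(\eta_t - \mathbb{E}[\eta_t]\big) + \int_Z F_{t,z}^{\top}\big(\zeta_{t,z} - \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\\
&\quad + D_t^{\top} P_t\big(\sigma_t - \mathbb{E}[\sigma_t]\big) + \int_Z F_{t,z}^{\top} P_t\big(\gamma_{t,z} - \mathbb{E}[\gamma_{t,z}]\big)\pi(dz) + \varrho_t - \mathbb{E}[\varrho_t]\Big]\Big|^2\\
& - \Big|K_1^{-1}\Big[\mathbb{E}[\varrho_t] + \hat{\varrho}_t + (B_t + \hat{B}_t)^{\top}\chi_t + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
&\quad + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big]\Big|^2\Big\}dt.
\end{aligned}
$$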
Proof. 
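Note that $\Pi_0 = P_0 + \hat{P}_0$ and $\chi_0 = \mathbb{E}[\xi_0] + \hat{\xi}_0$, so that
$$
\begin{aligned}
& J(h, u) - \tfrac{1}{2}\langle\Pi_0 h, h\rangle - \langle\chi_0, h\rangle\\
={}& \tfrac{1}{2}\mathbb{E}\Big[\int_0^T\Big(\langle R_t x_t, x_t\rangle + 2\langle\rho_t, x_t\rangle + \langle\hat{R}_t\mathbb{E}[x_t], \mathbb{E}[x_t]\rangle + 2\langle\hat{\rho}_t, \mathbb{E}[x_t]\rangle + 2\langle S_t u_t, x_t\rangle + 2\langle\hat{S}_t\mathbb{E}[u_t], \mathbb{E}[x_t]\rangle\\
&\quad + \langle N_t u_t, u_t\rangle + 2\langle\varrho_t, u_t\rangle + \langle\hat{N}_t\mathbb{E}[u_t], \mathbb{E}[u_t]\rangle + 2\langle\hat{\varrho}_t, \mathbb{E}[u_t]\rangle\Big)dt\Big]\\
& + \tfrac{1}{2}\mathbb{E}\Big[\langle Q x_T, x_T\rangle + 2\langle\iota, x_T\rangle + \langle\hat{Q}\mathbb{E}[x_T], \mathbb{E}[x_T]\rangle + 2\langle\hat{\iota}, \mathbb{E}[x_T]\rangle\Big]\\
& - \tfrac{1}{2}\langle(P_0 + \hat{P}_0)h, h\rangle - \langle\mathbb{E}[\xi_0], h\rangle - \langle\hat{\xi}_0, h\rangle.
\end{aligned}
\tag{26}
$$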
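Recall that $P_T = Q$, $\hat{P}_T = \hat{Q}$, $\xi_T = \iota$, and $\hat{\xi}_T = \hat{\iota}$. Next, we calculate the differentials of the terms $\langle P_t x_t, x_t\rangle$, $\langle\hat{P}_t\mathbb{E}[x_t], \mathbb{E}[x_t]\rangle$, $\langle\xi_t, x_t\rangle$ and $\langle\hat{\xi}_t, \mathbb{E}[x_t]\rangle$. We set $V_3(X_t^1, X_t^2) = \langle P_t x_t, x_t\rangle$, where $X_t^1 = P_t x_t$, $X_t^2 = x_t$, $x_t$ is the solution of (2) and $P_t$ is the solution of (16), and $V_4(X_t^1, X_t^2) = \langle\xi_t, x_t\rangle$, where $X_t^1 = \xi_t$, $X_t^2 = x_t$ and $\xi_t$ is the solution of (18). We apply version (ii) of the Itô formula to $V_3$ and $V_4$, and also calculate the differentials of $\langle\hat{P}_t\mathbb{E}[x_t], \mathbb{E}[x_t]\rangle$ and $\langle\hat{\xi}_t, \mathbb{E}[x_t]\rangle$. Taking expectations on both sides, one obtains
$$
\begin{aligned}
\mathbb{E}\langle Q x_T, x_T\rangle - \langle P_0 h, h\rangle ={}& \mathbb{E}\int_0^T\Big\{\langle\dot{P}_t x_t, x_t\rangle + 2\big\langle P_t x_t, A_t x_t + \hat{A}_t\mathbb{E}[x_t] + B_t u_t + \hat{B}_t\mathbb{E}[u_t] + b_t\big\rangle\\
& + \big\langle P_t\big(C_t x_t + \hat{C}_t\mathbb{E}[x_t] + D_t u_t + \hat{D}_t\mathbb{E}[u_t] + \sigma_t\big), C_t x_t + \hat{C}_t\mathbb{E}[x_t] + D_t u_t + \hat{D}_t\mathbb{E}[u_t] + \sigma_t\big\rangle\\
& + \int_Z\big\langle P_t\big(E_{t,z} x_t + \hat{E}_{t,z}\mathbb{E}[x_t] + F_{t,z} u_t + \hat{F}_{t,z}\mathbb{E}[u_t] + \gamma_{t,z}\big),\\
&\qquad E_{t,z} x_t + \hat{E}_{t,z}\mathbb{E}[x_t] + F_{t,z} u_t + \hat{F}_{t,z}\mathbb{E}[u_t] + \gamma_{t,z}\big\rangle\pi(dz)\Big\}dt,
\end{aligned}
\tag{27}
$$
and
$$
\langle\hat{Q}\mathbb{E}[x_T], \mathbb{E}[x_T]\rangle - \langle\hat{P}_0 h, h\rangle = \int_0^T\Big\{\langle\dot{\hat{P}}_t\mathbb{E}[x_t], \mathbb{E}[x_t]\rangle + 2\big\langle\hat{P}_t\mathbb{E}[x_t], (A_t + \hat{A}_t)\mathbb{E}[x_t] + (B_t + \hat{B}_t)\mathbb{E}[u_t] + \mathbb{E}[b_t]\big\rangle\Big\}dt.
\tag{28}
$$
On the other hand, with $\Delta_t$ denoting the drift of $\xi_t$,
$$
\begin{aligned}
\mathbb{E}\langle\iota, x_T\rangle - \langle\mathbb{E}[\xi_0], h\rangle ={}& \mathbb{E}\int_0^T\Big\{\big\langle\xi_t, A_t x_t + \hat{A}_t\mathbb{E}[x_t] + B_t u_t + \hat{B}_t\mathbb{E}[u_t] + b_t\big\rangle + \langle x_t, \Delta_t\rangle\\
& + \big\langle\eta_t, C_t x_t + \hat{C}_t\mathbb{E}[x_t] + D_t u_t + \hat{D}_t\mathbb{E}[u_t] + \sigma_t\big\rangle\\
& + \int_Z\big\langle\zeta_{t,z}, E_{t,z} x_t + \hat{E}_{t,z}\mathbb{E}[x_t] + F_{t,z} u_t + \hat{F}_{t,z}\mathbb{E}[u_t] + \gamma_{t,z}\big\rangle\pi(dz)\Big\}dt,
\end{aligned}
\tag{29}
$$
and
$$
\langle\hat{\iota}, \mathbb{E}[x_T]\rangle - \langle\hat{\xi}_0, h\rangle = \int_0^T\Big\{\big\langle\hat{\xi}_t, (A_t + \hat{A}_t)\mathbb{E}[x_t] + (B_t + \hat{B}_t)\mathbb{E}[u_t] + \mathbb{E}[b_t]\big\rangle + \langle\mathbb{E}[x_t], \dot{\hat{\xi}}_t\rangle\Big\}dt.
\tag{30}
$$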
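Here, $P_t$, $\xi_t$, $\hat{P}_t$ and $\hat{\xi}_t$ satisfy (16), (18), (23) and (25), respectively. From (26) and the equalities (27)–(30), combined with the expressions (14) and (22),
$$
\begin{aligned}
& J(h, u) - \tfrac{1}{2}\langle\Pi_0 h, h\rangle - \langle\chi_0, h\rangle\\
\geq{}& \mathbb{E}\int_0^T\Big\{\langle P_t\sigma_t, \sigma_t\rangle + \int_Z\langle P_t\gamma_{t,z}, \gamma_{t,z}\rangle\pi(dz) + \langle\xi_t, b_t - \mathbb{E}[b_t]\rangle + \langle\eta_t, \sigma_t\rangle + \int_Z\langle\zeta_{t,z}, \gamma_{t,z}\rangle\pi(dz) + \langle\chi_t, \mathbb{E}[b_t]\rangle\\
& - \Big|K_0^{-1}\Big[B_t^{\top}\big(\xi_t - \mathbb{E}[\xi_t]\big) + D_t^{\top}\big(\eta_t - \mathbb{E}[\eta_t]\big) + \int_Z F_{t,z}^{\top}\big(\zeta_{t,z} - \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\\
&\quad + D_t^{\top} P_t\big(\sigma_t - \mathbb{E}[\sigma_t]\big) + \int_Z F_{t,z}^{\top} P_t\big(\gamma_{t,z} - \mathbb{E}[\gamma_{t,z}]\big)\pi(dz) + \varrho_t - \mathbb{E}[\varrho_t]\Big]\Big|^2\\
& - \Big|K_1^{-1}\Big[\mathbb{E}[\varrho_t] + \hat{\varrho}_t + (B_t + \hat{B}_t)^{\top}\chi_t + (D_t + \hat{D}_t)^{\top}\big(P_t\mathbb{E}[\sigma_t] + \mathbb{E}[\eta_t]\big)\\
&\quad + \int_Z (F_{t,z} + \hat{F}_{t,z})^{\top}\big(P_t\mathbb{E}[\gamma_{t,z}] + \mathbb{E}[\zeta_{t,z}]\big)\pi(dz)\Big]\Big|^2\Big\}dt.
\end{aligned}
$$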
Then, our claim follows. □
Remark 6.
It is worth noting that, following our technique, one might take the Riccati Equations (16) and (23) and the perturbation Equations (19) and (25) as the decoupling result. In that case, however, the solvability of Riccati Equation (23) is not obvious, so we construct Riccati Equation (17) instead. If Assumption (A1) is strengthened, for example to $Q \geq 0$, $\hat{Q} \geq 0$, then Riccati Equation (23) is well posed and the perturbation Equation (25) is also solvable, but this is unnecessary because the well-posedness of Riccati Equation (17) can be obtained under weaker assumptions. From the above proof, we can see that our derivation of the two Riccati Equations (16) and (17) differs from that of Tang and Meng [21], which is based on an experimental method; our derivation is arguably more intrinsic.

5. An Example

In this section, an example inspired by [15] is provided to illustrate our main results. We consider the following one-dimensional system:
$$
\begin{cases}
dx_t = \big\{\mathbb{E}[x_t] + u_t - \mathbb{E}[u_t]\big\}dt + \big\{\mathbb{E}[u_t] + \sigma_t\big\}dW(t) + \dfrac{\sqrt{2}}{2}\displaystyle\int_{[\varepsilon,1]}(e^{z} - 1)\,\mathbb{E}[u_t]\,\tilde{\nu}(dt,dz),\\
x_0 = h,\quad t \in [0,1],\quad \text{for any } \varepsilon \in (0,1),
\end{cases}
$$
and the cost functional
$$
J(0, h; u) = \mathbb{E}\Big[|x(1)|^2 + |\mathbb{E}[x(1)]|^2 + 2\mathbb{E}[x(1)] + \int_0^1\big(|u_t|^2 + |\mathbb{E}[u_t]|^2\big)dt\Big].
$$
Here, we assume that there are no jumps smaller than $\varepsilon$ and that $\pi(dz)$ is a positive finite measure. In this example, we take
$$
\hat{A}_t = 1,\quad B_t = 1,\quad \hat{B}_t = -1,\quad \hat{D}_t = 1,\quad \hat{F}_{t,z} = \frac{\sqrt{2}}{2}(e^{z} - 1),\quad Q = 1,\quad \hat{Q} = 1,\quad N_t = \hat{N}_t = 1,\quad \hat{\iota} = 1,\quad \sigma_t = e^{\sqrt{2}W(t) - t}.
$$
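It is easily seen that $\sigma_t \in L^2_{\mathcal{F}}([0,T]; \mathbb{R}^n)$. Indeed,
$$
\mathbb{E}\int_0^1|\sigma_t|^2\,dt = \mathbb{E}\int_0^1\big|e^{\sqrt{2}W(t) - t}\big|^2\,dt \leq \mathbb{E}\Big(\sup_{0\leq t\leq 1} e^{\sqrt{2}W(t) - t}\Big)^2.
$$
Since $\{e^{\sqrt{2}W(t) - t};\ t \geq 0\}$ is a square-integrable martingale, it follows from Doob's maximal inequality that
$$
\mathbb{E}\Big(\sup_{0\leq t\leq 1} e^{\sqrt{2}W(t) - t}\Big)^2 \leq 4\,\mathbb{E}\big(e^{2\sqrt{2}W(1) - 2}\big) \leq 4e^{2}.
$$
Hence, $\mathbb{E}\int_0^1|\sigma_t|^2\,dt \leq 4e^{2}$.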
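As a quick sanity check of this bound, the second moment can also be evaluated directly. The sketch below assumes only the standard identity $\mathbb{E}[e^{aW(t)}] = e^{a^2 t/2}$, which gives $\mathbb{E}|\sigma_t|^2 = e^{2t}$, and compares the resulting integral with the Doob bound $4e^2$. The function names and the Simpson step count are illustrative.

```python
import math


def second_moment(t):
    """E|sigma_t|^2 for sigma_t = exp(sqrt(2) W(t) - t), via E exp(a W(t)) = exp(a^2 t / 2)."""
    a = 2.0 * math.sqrt(2.0)  # |sigma_t|^2 = exp(a W(t) - 2 t)
    return math.exp(a * a * t / 2.0 - 2.0 * t)  # equals exp(2 t)


def integral_second_moment(n=1000):
    """Composite Simpson rule for the integral of E|sigma_t|^2 over [0, 1] (n must be even)."""
    h = 1.0 / n
    total = second_moment(0.0) + second_moment(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * second_moment(i * h)
    return total * h / 3.0


integral = integral_second_moment()
exact = (math.exp(2.0) - 1.0) / 2.0   # closed form of the integral of exp(2 t) over [0, 1]
doob_bound = 4.0 * math.exp(2.0)      # bound obtained from Doob's maximal inequality
```

The exact value $(e^2 - 1)/2$ lies well below $4e^2$, so the estimate in the text is sufficient for square integrability but far from tight.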
By Equations (16) and (17), the Riccati equations for this problem are
$$
\dot{P}_t - P_t^2 = 0,\quad t \in [0,1],\quad P(1) = 1,
$$
and
$$
\dot{\Pi}_t + 2\Pi_t = 0,\quad t \in [0,1],\quad \Pi(1) = 2,
$$
and, by (19), the ODE is
$$
\dot{\chi}_t + \chi_t = 0,\quad t \in [0,1],\quad \chi(1) = 1.
$$
It is easy to see that $P_t = \dfrac{1}{2 - t}$, $\Pi_t = e^{-2t + \ln 2 + 2}$ and $\chi_t = e^{-t + 1}$.
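These closed forms can be checked by substituting them back into the three equations. The following sketch does so numerically with central finite differences; the helper names, the grid, and the step size are illustrative choices.

```python
import math

# Closed-form candidates stated in the text.
P = lambda t: 1.0 / (2.0 - t)
Pi = lambda t: math.exp(-2.0 * t + math.log(2.0) + 2.0)
chi = lambda t: math.exp(-t + 1.0)


def ddt(f, t, h=1e-5):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def max_residuals(n=50):
    """Largest absolute residual of each ODE on an interior grid of [0, 1]."""
    ts = [i / n for i in range(1, n)]
    r_P = max(abs(ddt(P, t) - P(t) ** 2) for t in ts)        # dP/dt - P^2 = 0
    r_Pi = max(abs(ddt(Pi, t) + 2.0 * Pi(t)) for t in ts)    # dPi/dt + 2 Pi = 0
    r_chi = max(abs(ddt(chi, t) + chi(t)) for t in ts)       # dchi/dt + chi = 0
    return r_P, r_Pi, r_chi


residuals = max_residuals()
terminal_values = (P(1.0), Pi(1.0), chi(1.0))  # should match (1, 2, 1)
```

All three residuals are at the level of the finite-difference error, and the terminal values agree with $P(1) = 1$, $\Pi(1) = 2$ and $\chi(1) = 1$.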
Therefore, $(P_t, \Pi_t)$ solves the Riccati equations, and by Theorem 3 the closed-loop optimal strategy is
$$
u_t^{*} = \frac{1}{t - 2}\big\{x_t^{*} - \mathbb{E}[x_t^{*}]\big\} - \frac{1}{\kappa + 5 - 2t},\quad t \in [0,1],
$$
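with $x_t^{*}$ being the solution to the following closed-loop system, obtained by substituting $u_t^{*}$ and $\mathbb{E}[u_t^{*}] = -\dfrac{1}{\kappa + 5 - 2t}$ into the state equation:
$$
\begin{cases}
dx_t^{*} = \Big\{\dfrac{1}{t - 2}\,x_t^{*} + \dfrac{t - 3}{t - 2}\,\mathbb{E}[x_t^{*}]\Big\}dt + \Big\{e^{\sqrt{2}W(t) - t} - \dfrac{1}{\kappa + 5 - 2t}\Big\}dW(t) - \dfrac{\sqrt{2}}{2}\displaystyle\int_{[\varepsilon,1]}\dfrac{e^{z} - 1}{\kappa + 5 - 2t}\,\tilde{\nu}(dt,dz),\quad t \in [0,1],\\
x_0^{*} = h,
\end{cases}
$$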
where κ = 1 2 [ ε , 1 ] ( e z 1 ) 2 π ( d z ) 1 by some suitable ε .
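The two coefficients in $u_t^{*}$ can be traced back to the gains of Theorem 3. The sketch below evaluates $K_0$, $K_1$, $M_0$ and $M_1$ for the example's scalar data and compares the results with $\frac{1}{t-2}$ and $-\frac{1}{\kappa + 5 - 2t}$. It relies on several assumptions that are not stated explicitly in the text: $\kappa$ is read as $\frac{1}{2}\int_{[\varepsilon,1]}(e^{z} - 1)^2\pi(dz)$, so that $\int_{[\varepsilon,1]}\hat{F}_{t,z}^2\,\pi(dz) = \kappa$; all coefficients not listed in the example are zero; $\xi = \eta = \zeta = 0$, which is what (18) gives for zero terminal data; and $\mathbb{E}[\sigma_t] = 1$. The numerical value of $\kappa$ used is hypothetical.

```python
import math


def example_gains(t, kappa):
    """Feedback coefficients of (14) for the one-dimensional example (scalar case)."""
    P = 1.0 / (2.0 - t)                    # solution of the first Riccati equation
    Pi = 2.0 * math.exp(2.0 * (1.0 - t))   # solution of the second Riccati equation
    B, B_hat, D, D_hat, N, N_hat = 1.0, -1.0, 0.0, 1.0, 1.0, 1.0

    K0 = N + D * P * D                                       # F = 0, so no jump term
    K1 = N + N_hat + (D + D_hat) ** 2 * P + kappa * P        # kappa = integral of F_hat^2 (assumption)
    M0 = B * P                                               # S = C = E = 0
    M1 = (B + B_hat) * Pi                                    # vanishes because B + B_hat = 0

    gain_deviation = -M0 / K0        # multiplies x - E[x]
    gain_mean = -M1 / K1             # multiplies E[x]
    E_sigma = 1.0                    # E exp(sqrt(2) W(t) - t) = 1
    feedforward = -(D + D_hat) * P * E_sigma / K1  # disturbance term with xi = eta = zeta = 0
    return gain_deviation, gain_mean, feedforward


kappa_demo = 0.7  # hypothetical value, for illustration only
gains_at_half = example_gains(0.5, kappa_demo)
```

Under these assumptions, the deviation gain is $-P_t = \frac{1}{t-2}$, the mean gain vanishes, and the feedforward term reduces to $-P_t/K_1 = -\frac{1}{\kappa + 5 - 2t}$, in agreement with the stated $u_t^{*}$.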
Remark 7.
In this example, both the state equation and the cost functional contain disturbance terms, which makes it more general than the numerical example in [14]. It illustrates how our theoretical results yield a feedback control directly by introducing the disturbance equations.

6. Conclusions

In this paper, we obtained systematic results on the MF-SLQ optimal control problem for jump-diffusion systems with hybrid disturbances. The optimal closed-loop feedback control was represented by the solutions of two Riccati equations and two disturbance equations over a finite horizon; the case of an infinite horizon may be considered in the future.
Under our assumptions, all of the coefficients are deterministic; the case of random coefficients is still an open problem. Furthermore, when the weighting matrices in the cost functional are indefinite, the main problem is to establish the equivalence between the solvability of two appropriate Riccati equations and the solvability of the indefinite MF-SLQ problem. We will continue our research and provide some new results in our later works.

Author Contributions

Conceptualization, C.T.; Writing—original draft, X.L.; Investigation, Q.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kac, M. Foundations of kinetic theory. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 1 October 1956; Volume 3, pp. 171–197.
  2. McKean, H.P. A class of Markov processes associated with nonlinear parabolic equations. Proc. Natl. Acad. Sci. USA 1967, 56, 1907–1911.
  3. Dawson, D.A. Critical dynamics and fluctuations for a mean-field model of cooperative behavior. J. Statist. Phys. 1983, 31, 29–85.
  4. Chan, T. Dynamics of the McKean-Vlasov equation. Ann. Probab. 1994, 22, 431–441.
  5. Ahmed, N.U. Nonlinear diffusion governed by McKean-Vlasov equation on Hilbert space and optimal control. SIAM J. Control. Optim. 2007, 46, 356–378.
  6. Buckdahn, R.; Djehiche, B.; Li, J.; Peng, S. Mean-field backward stochastic differential equations: A limit approach. Ann. Probab. 2007, 37, 1524–1565.
  7. Li, J. Stochastic maximum principle in the mean-field controls. Automatica 2012, 48, 366–373.
  8. Chala, A. The relaxed optimal control problem for Mean-Field SDEs systems and application. Automatica 2014, 50, 924–930.
  9. Benamou, J.D.; Carlier, G.; Santambrogio, F. Variational Mean Field Games; Springer International Publishing: Cham, Switzerland, 2017.
  10. Wang, T. Equilibrium controls in time inconsistent stochastic linear quadratic problems. Appl. Math. Optim. 2020, 81, 591–619.
  11. Yong, J. A linear-quadratic optimal control problem for mean-field stochastic differential equations. SIAM J. Control Optim. 2013, 51, 2809–2838.
  12. Ni, Y.H.; Zhang, J.F.; Li, X. Indefinite mean-field stochastic linear-quadratic optimal control. IEEE Trans. Autom. Control 2015, 60, 1786–1800.
  13. Wei, Q.; Yu, Z. Infinite horizon forward-backward SDEs and open-loop optimal controls for stochastic linear-quadratic problems with random coefficients. SIAM J. Control Optim. 2021, 59, 2594–2623.
  14. Tang, C.; Li, X.; Huang, T. Solvability for indefinite mean-field stochastic linear quadratic optimal control with random jumps and its applications. Optim. Control Appl. Meth. 2020, 41, 2320–2348.
  15. Tang, C.; Liu, J. The equivalence conditions of optimal feedback control-strategy operators for zero-sum linear quadratic stochastic differential game with random coefficients. Symmetry 2023, 15, 1726.
  16. Yong, J.; Zhou, X. Stochastic Controls: Hamiltonian Systems and HJB Equations; Springer: New York, NY, USA, 1999.
  17. Gikhman, I.I.; Skorokhod, A.V. Stochastic Differential Equations; Springer: Berlin/Heidelberg, Germany, 1972.
  18. Øksendal, B.; Sulem, A. Applied Stochastic Control of Jump Diffusions, 3rd ed.; Springer Nature: Cham, Switzerland, 2019.
  19. Shen, Y.; Siu, T.K. The maximum principle for a jump-diffusion mean-field model and its application to the mean-variance problem. Nonlinear Anal. Theory, Methods Appl. 2013, 86, 58–73.
  20. Hafayed, M. A mean-field maximum principle for optimal control of forward-backward stochastic differential equations with Poisson jump processes. Int. J. Dyn. Control 2013, 1, 300–315.
  21. Tang, M.; Meng, Q. Linear-Quadratic Optimal Control Problems for Mean-Field Stochastic Differential Equations with Jumps. Asian J. Control 2018, 21, 809–823.
  22. Mustafa, G.; Ahmad, F.; Zhang, R.; Haq, E.; Hussain, M. Adaptive sliding mode control of buck converter feeding resistive and constant power load in DC microgrid. Energy Rep. 2022, 9 (Suppl. S1), 1026–1035.
Tang, C.; Li, X.; Wang, Q. Mean-Field Stochastic Linear Quadratic Optimal Control for Jump-Diffusion Systems with Hybrid Disturbances. Symmetry 2024, 16, 642. https://doi.org/10.3390/sym16060642