1. Introduction
Since the seminal papers by Friedman [1] and Fleming and Souganidis [2], the study of two-player stochastic zero-sum differential games (SZSDGs) and non-zero-sum stochastic differential games (SDGs) has grown rapidly in various aspects; see [3,4,5,6,7,8,9,10,11,12,13,14,15] and the references therein. Specifically, Friedman in [1] considered SDGs with classical (or smooth) solutions of the associated partial differential equation (PDE) from dynamic programming to prove the existence of the Nash equilibrium and the game value. Fleming and Souganidis in [2] studied SZSDGs in the Markovian framework with non-anticipative strategies. They proved that the lower and upper value functions are unique viscosity solutions (in the sense of [16]) of the lower and upper Hamilton–Jacobi–Isaacs (HJI) equations obtained from dynamic programming, which are nonlinear second-order PDEs. They also showed the existence of the game value under the Isaacs condition. Later, the results of [2] were extended by Buckdahn and Li in [7], who defined the objective functional via a backward stochastic differential equation (BSDE). They used the backward semigroup associated with the BSDE introduced in [17] to generalize the results of [2]. The weak formulation of SZSDGs and SDGs with random coefficients was considered in [10,11], where the existence of the open-loop-type saddle-point (Nash) equilibrium as well as the game value was established. Note also that SZSDGs and non-zero-sum SDGs have been studied in several other directions, including the minimax solution approach [18], the characterization of multiple Nash equilibria [19], the choice of the associated probability measure [20], the optimal contracting problem [21], the risk-sensitive SZSDG [22], the SZSDG with delay [23], and the SZSDG on the probability space [24]. Regarding other recent progress and applications of SDGs, see [25,26] and the references therein.
Recently, state path-dependent SZSDGs have been studied extensively in the literature, covering a general class of SZSDGs including those with delay in the state variable. This extends the results of [2,7] to the non-Markovian framework. Unlike [2,7], in the path-dependent or non-Markovian case, the associated (lower and upper) HJI equations obtained from dynamic programming are so-called (state) path-dependent PDEs (PPDEs) defined on a space of continuous functions, which is an infinite-dimensional Banach space. Hence, the Hilbert-space approach of [27,28] cannot be applied to show the existence (and uniqueness) of viscosity solutions. In [29,30], state path-dependent SZSDGs in a weak formulation were studied, where the players are restricted to observing state feedback information. The existence of viscosity solutions for state path-dependent HJI equations was shown in [29,30] in the sense of [31,32,33], which involves a nonlinear expectation in the corresponding semi-jets. For uniqueness, [29] imposed an assumption on the maximum dimension of the state space and a nondegeneracy condition on the diffusion coefficient (see Section 6 of [29] and Remark 3.7 of [30]). Note that [30] did not prove the uniqueness of viscosity solutions. As mentioned in [30] (p. 10), the major motivation for studying SZSDGs in a weak formulation is to establish the existence of the saddle-point equilibrium; however, this requires more stringent assumptions on the coefficients than SZSDGs in a strong formulation. Zhang in [34] studied path-dependent SZSDGs in a strong formulation, where the existence of the game value and of an approximate saddle-point equilibrium, both under the Isaacs condition, was established via approximation techniques for the (lower and upper) state path-dependent HJI equations. (Note that [34] did not consider the existence and uniqueness of classical or viscosity solutions of the state path-dependent HJI equations.)
In this paper, we consider the two-player state and control path-dependent stochastic zero-sum differential game. In our setup, the state process, which is controlled by the players, depends on the (current and past) paths of the state and control processes of the players. Furthermore, the running cost of the objective functional depends on both the state and control paths of the players. The problem formulation and the results of this paper can be viewed as nontrivial generalizations of those in the existing literature mentioned above to the state and control path-dependent SZSDG. A detailed statement of the main contributions of this paper is given below. Moreover, in Examples 1–3 of Section 3, motivating and practical applications of the SZSDG of this paper are discussed; these examples can be solved by the main results of this paper.
Note that our paper can be viewed as a generalization of [34] to the state and control path-dependent case, and of [35] to the two-player SZSDG framework. In particular, reference [34] studied the SZSDG for the state (not state and control) path-dependent problem, where the state process and the associated objective functional depend on the state path only. In fact, unlike our paper, reference [34] did not consider the viscosity solution property of the state path-dependent HJI equations. Moreover, reference [35] considered the state and control path-dependent stochastic control problem and (non-zero-sum) differential games, where, differently from our paper, the existence of classical (smooth) solutions of the corresponding state and control path-dependent Hamilton–Jacobi–Bellman (HJB) equation was assumed in order to establish the verification theorem. Note also that the SZSDGs in a weak formulation in [29,30] did not consider the control path-dependent case. We mention that our paper studies the DPP and the viscosity solution analysis for the state and control path-dependent SZSDG, which have not been addressed in the existing literature.
The main contributions of this paper and comparisons with the existing literature are as follows:
- (i)
The first main objective of this paper is to obtain the dynamic programming principle (DPP) for the value functionals (see Theorem 1). Specifically, using the notion of non-anticipative strategies, the lower and upper value functionals are defined as functions of the initial state and control paths of the players. Using the semigroup operator associated with the BSDE objective functional, we prove that the (lower and upper) value functionals satisfy the DPP (Theorem 1), which is a recursive-type value iteration. Regarding the comparison with the existing literature, we note that the proof of the DPP in Theorem 1 has to differ from that for the problem depending on the state path only in [34], for the one-player control case in [35], and for the classical Markovian (path-independent) case (e.g., [2,7]). Specifically, unlike the existing literature mentioned above, in the proof of the DPP we cannot use the supremum norm for càdlàg paths, due to the lack of separability of the càdlàg spaces of state and control paths (Section 15 of [36]). Hence, we adopt the Skorohod metric (see Section 12 of [36], pp. 12–13) to maintain the separability of these càdlàg spaces. This is the essential step in obtaining the inequalities between the (lower and upper) value functionals and the DPP in Theorem 1, which has not been considered in the existing literature. We also note that the DPP in Theorem 1 leads to the continuity of the (lower and upper) value functionals in their arguments (see Proposition 1), which has not been studied in the existing literature.
- (ii)
The second main objective is to prove that the (lower and upper) value functionals are viscosity solutions of the associated lower and upper state and control path-dependent Hamilton–Jacobi–Isaacs (PHJI) equations (see Theorem 2). Specifically, the lower and upper state and control PHJI equations obtained from the DPP in Theorem 1 are a class of state and control path-dependent nonlinear second-order PDEs (PPDEs), whose structures are fundamentally different from those of path-independent PDEs or the state path-dependent HJI equations in [29,30,31,32,33,37,38,39]. In particular, differently from the existing literature, the time derivative term also depends on the control path of the players, in both the lower and the upper PHJI equations (see (32) and (33)). We apply the functional Itô calculus introduced in [40,41,42] to prove that the lower and upper value functionals are viscosity solutions of the (lower and upper) PHJI equations (see Theorem 2), where the notion of viscosity solutions is defined on a compact κ-Hölder space similar to [39] (see Definition 7). Specifically, in Definition 7, the notion of viscosity solutions is defined on a compact set of Hölder-continuous paths, which provides precise estimates between the initial state path and its perturbed version (see Lemma 6). This initial state path perturbation is essential to prevent starting the DPP at the boundary of the compact set (see Remark 6 of [39]). In addition, the compactness guarantees the existence of minimum and maximum points between the (lower and upper) value functionals and the test functions (see (34) and (49)). Then, using the functional Itô calculus and the DPP in Theorem 1, we show that the (lower and upper) value functionals are viscosity solutions of the corresponding PHJI equations (see Theorem 2). In our definition of viscosity solutions, the predictable dependence condition on test functions is essential to handle the control path-dependent nature of the problem; a similar condition was also introduced in [35,40]. As for the comparison with the existing literature, we note that [34,35] did not consider the viscosity solution analysis of the corresponding HJI (or HJB) equations. Our technique for proving Theorem 2 can be viewed as an extension of that of Theorem 4.3 of [39], which considered only the one-player state (not state and control) path-dependent control problem. We should mention that this extension is not straightforward, since our paper considers the two-player SZSDG framework, where the SZSDG is state and control path-dependent and the players interact with each other via non-anticipative strategies. Hence, the techniques in the existing literature (e.g., [34,35,39]) cannot be used directly to prove Theorem 2.
- (iii)
The third main objective of this paper is to show the existence of the game value using Theorems 1 and 2 (see Theorem 3). In particular, we show that if the state and control path-dependent Isaacs condition and the uniqueness of viscosity solutions hold, then the game admits a value, i.e., the lower and upper value functionals coincide (see Theorem 3). We should mention that the proof of Theorem 3 is simpler than that for the problem depending on the state path only in [34]: unlike [34], the proof of Theorem 3 does not need the technique of approximating the (lower and upper) PHJI equations by state-dependent (not path-dependent) HJI equations.
- (iv)
As the last main objective of this paper, we provide the uniqueness of classical solutions for the (lower and upper) state path-dependent HJI equations (see Proposition 2). The general uniqueness of viscosity solutions will be investigated in future research. In Proposition 2, under an additional assumption (see Assumption 3), we prove the comparison principle for classical sub- and super-solutions of the lower and upper state path-dependent HJI equations, which further implies the uniqueness of classical solutions (for the state path-dependent case). We note that the proof of Proposition 2 requires establishing an equivalent classical solution structure as well as an appropriate contradiction argument, which have not been studied in the existing literature.
The paper is organized as follows. In Section 2, we provide the notation and preliminary results on the functional Itô calculus introduced in [35,40,41,42]. The problem formulation is given in Section 3, where some potential practical applications of the SZSDG of this paper are also discussed (see Examples 1–3). In Section 4, we establish the dynamic programming principle and then prove the continuity of the (lower and upper) value functionals. In Section 5, we introduce the lower and upper PHJI equations and prove that the value functionals are viscosity solutions of the corresponding PHJI equations. We conclude the paper in Section 6, where several potential future research problems are also discussed.
2. Notation and Preliminaries
The n-dimensional Euclidean space is denoted by $\mathbb{R}^n$, and the transpose of a vector $x$ by $x^\top$. The inner product of $x, y \in \mathbb{R}^n$ is denoted by $\langle x, y \rangle$, and the Euclidean norm of $x$ by $|x|$. Let $\operatorname{Tr}(A)$ denote the trace of a square matrix $A$, $\mathbb{1}_E$ the indicator function of a set $E$, and $\mathbb{S}^n$ the set of $n \times n$ symmetric matrices.
We introduce the calculus of path-dependent functionals in [40,41,42]; see also [31,35,39]. We follow the notation in [35,42]. For a fixed and , let be the set of -valued continuous functions on , and the set of -valued càdlàg functions on . Let and for . Let , , and . For any function in , the capital letter stands for the path, and the lowercase letter denotes the value of the function at a specific time. Specifically, stands for the value of A at , and for , we denote by the path of the corresponding function up to time . A similar notation applies to . Note that .
For and , we introduce the following notation: Note that is the flat extension, and is the vertical extension of the path A. The metric on is defined, for with and , by Note that is the norm on defined by , by which is the metric induced by .
Note that is a complete metric space, and is a Banach space. The same results hold for and . Unfortunately, the càdlàg space is not separable under the uniform metric. Therefore, we introduce the Skorohod metric (Section 12 of [36], pp. 12–13), defined by with being the class of strictly increasing and continuous mappings of onto itself such that and , allowing a deformation of the time scale to define a distance between A and B. We define the metric . As shown in Section 12 of [36], is a metric, and so is . Then is separable under the Skorohod metric (Theorem 12.2 of [36]). We can easily see that the Skorohod metric is dominated by the uniform metric, which implies that uniform convergence implies convergence under the Skorohod metric.
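For concreteness, writing $D$ for the càdlàg space and $\Lambda$ for the class of time changes above, the Skorohod metric of Section 12 of [36] takes the following standard form (the symbols here are ours, since the paper's displayed formulas are not reproduced above):

```latex
d_S(A,B) \;=\; \inf_{\lambda \in \Lambda} \max\Big\{\, \sup_{s\in[0,T]} |\lambda(s)-s|,\;
\sup_{s\in[0,T]} |a(\lambda(s)) - b(s)| \,\Big\}.
```

Choosing the identity time change $\lambda(s) = s$ shows that $d_S(A,B)$ is dominated by the uniform distance between $A$ and $B$, which is the comparison used above to transfer continuity and separability between the two metrics.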
Definition 1. A functional is any function . The functional f is said to be continuous at , if for each , there exists such that for each , implies . The continuity under implies the continuity under . Let be the set of real-valued continuous functionals for every path under . The set is defined similarly.
Next, we introduce the concept of time and space derivatives of the functional f.
Definition 2.
- (i)
Let be the functional. The time derivative (or horizontal derivative) of f at is defined by . If the limit exists for all , the functional is called the time derivative of f.
- (ii)
The space derivative (equivalently, vertical derivative) of f at is defined by , where for , , with being a coordinate unit vector of , . If the limit exists for all and , the functional is called the space derivative of f. Note that the second-order space derivative (Hessian) can be defined in a similar way, where .
Remark 1. If a functional f is differentiable in the sense of Definition 2 and depends only on a function (not its path), i.e., , then the notion of derivatives in Definition 2 is equivalent to those for the classical ones.
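In the standard Dupire notation (our choice of symbols $\partial_t f$, $\partial_\omega f$; the paper's own displays are not reproduced above), the derivatives of Definition 2 and the functional Itô formula of Lemma 2 below read:

```latex
% Time (horizontal) derivative: extend the path flatly by h
\partial_t f(A_t) \;=\; \lim_{h \downarrow 0} \frac{f(A_{t,h}) - f(A_t)}{h},
% Space (vertical) derivative: bump the endpoint value a(t) by h e_i
\partial_\omega f(A_t) \;=\; \Big( \lim_{h \to 0} \frac{f(A_t^{h e_i}) - f(A_t)}{h} \Big)_{1 \le i \le n},
% Functional Ito formula for a continuous semimartingale x:
f(X_t) \;=\; f(X_0) + \int_0^t \partial_s f(X_s)\,\mathrm{d}s
  + \int_0^t \big\langle \partial_\omega f(X_s), \mathrm{d}x(s) \big\rangle
  + \frac{1}{2}\int_0^t \operatorname{Tr}\big( \partial^2_{\omega\omega} f(X_s)\, \mathrm{d}\langle x\rangle(s) \big).
```

Here $A_{t,h}$ denotes the flat extension and $A_t^{h e_i}$ the vertical perturbation introduced above, and the second-order derivative $\partial^2_{\omega\omega} f$ is obtained by iterating the vertical derivative.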
From Definition 2, let be the set of functionals such that, for , f is k times time-differentiable and l times space-differentiable in , with all derivatives continuous in the sense of Definition 1. The set is defined similarly. We mention that these sets are well defined in view of Definition 2.4 and Remark 2 of [39] (see also Theorem 2.4 of [31] and [40,41,42]).
Definition 3. Let . For any , A is a κ-Hölder continuous path if the following limit exists: , where we call the κ-Hölder modulus of . The κ-Hölder space is defined by . The κ-Hölder space with is defined by . The κ-Hölder space with and is defined by . We can easily see that . The space has the following topological property (Proposition 1 of [39]):
Lemma 1. For , is a compact subset of .
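Under the usual convention (our notation; the displays of Definition 3 are not reproduced above), the κ-Hölder modulus and the associated space are

```latex
\langle A \rangle_\kappa \;=\; \sup_{0 \le s < t \le T} \frac{|a(t)-a(s)|}{(t-s)^{\kappa}},
\qquad
C^{\kappa} \;=\; \big\{\, A \in C([0,T];\mathbb{R}^n) \;:\; \langle A \rangle_\kappa < \infty \,\big\}.
```

Lemma 1 then follows from the Arzelà–Ascoli theorem: a set of paths with a uniform bound on $\langle \cdot \rangle_\kappa$ (and on the initial value) is equicontinuous and pointwise bounded, hence relatively compact in the uniform topology.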
Definition 4. Let be the functional. For , f is Hölder continuous if the following limit exists: . Assume that . We define and . The set of functionals such that (1) is finite is denoted by .

Let be a complete probability space satisfying the usual conditions [43]. Let B be the standard p-dimensional Brownian motion defined on it. Let be the natural filtration generated by the Brownian motion B, augmented by all the -null sets of . Let be the set of -valued -measurable random vectors such that . Let be the set of -valued -adapted stochastic processes such that . Let be the set of -valued continuous and -adapted stochastic processes such that .
Let be the n-dimensional -adapted stochastic process defined on . Note that x can be viewed as a mapping from to . With this notation, for , is the path of x up to time , and is the value of x at time . We can see that for any functional , is an -adapted stochastic process. We now state the functional Itô formula of [40,41,42].
Lemma 2. Suppose that x is a continuous semimartingale and . Then for any , f satisfies the following result:

3. Problem Formulation
This section provides the precise problem formulation of state and control path-dependent SZSDGs. The state and control path-dependent problem was first introduced in [35] to solve the stochastic control problem and the (non-zero-sum) differential game; here, we study the (state and control path-dependent) problem in the SZSDG framework.
Let be the set of -valued -progressively measurable and càdlàg processes, where , which is the set of control processes for Player 1. The set of control processes for Player 2, , is defined similarly with . It is assumed that and are compact metric spaces with the standard Euclidean norm. The precise definitions of and are given later.
The state and control path-dependent stochastic differential equation (SDE) is given by where is the whole path of the controlled state process from time 0 to s, and and are the paths of the control processes of Players 1 and 2, respectively. In (2), is the initial condition, which is a continuous path starting from time . Let and .
The state and control path-dependent backward stochastic differential equation (BSDE) is given by where the pair is the solution of the BSDE. Note that the BSDE in (3) is coupled with the (forward) SDE in (2). Below, the BSDE in (3) is used to define the objective functional of Players 1 and 2.
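Since the displays (2) and (3) are not reproduced above, the following generic forms, consistent with the surrounding description, may help fix ideas (the symbols $b$, $\sigma$, $l$, $m$ are our placeholders for the drift, diffusion, running cost, and terminal cost):

```latex
% State and control path-dependent SDE, cf. (2):
\mathrm{d}x(s) \;=\; b(s, X_s, U_s, V_s)\,\mathrm{d}s + \sigma(s, X_s, U_s, V_s)\,\mathrm{d}B(s),
\qquad s \in [t,T],
% Coupled BSDE with solution pair (y,q), cf. (3):
y(s) \;=\; m(X_T, U_T, V_T) + \int_s^T l\big(r, X_r, U_r, V_r, y(r), q(r)\big)\,\mathrm{d}r
  - \int_s^T \big\langle q(r), \mathrm{d}B(r) \big\rangle.
```

Here $X_s$, $U_s$, $V_s$ denote the paths of the state and of the two controls up to time $s$, so both the dynamics and the running cost may depend on the entire history.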
Remark 2. Note that (3) is a class of backward stochastic differential equations (BSDEs), whose solution is the pair (see Lemma 3 below). While the first component of the solution () coincides with the standard solution concept for SDEs, the second component () is required by the structure of the BSDE [44,45,46]. Specifically, the second component is essential to ensure that the first component of the solution to the BSDE is an -adapted stochastic process, via the martingale representation theorem [44,45,46].

We introduce the following assumption:
Assumption 1. In (2), the coefficients and are bounded. Furthermore, the running and terminal costs in (3), and , respectively, are bounded. There exists a constant such that for and , , the following conditions hold:

Based on [34,39,44,45], we have the following result (see Lemmas 3.1 and 3.2 of [39] and Lemma 2.3 of [34]):
Lemma 3. Suppose that Assumption 1 holds. Then, the following hold:
- (i)
For , and , the SDE in (2) and the BSDE in (3) admit unique strong solutions, with and , respectively.
- (ii)
For , , with , , and , , there exists a constant , dependent on the Lipschitz constant L in Assumption 1, such that
- (iii)
For , , with , , and , , there exists a constant , dependent on the Lipschitz constant L in Assumption 1, such that
- (iv)
Suppose that and are coefficients of the BSDE in (3) satisfying Assumption 1, and and are the corresponding terminal conditions. Let and be the solutions of the BSDE in (3) with and , respectively (note that and ). If and , then , a.s., for .
The objective functional of Players 1 and 2 is given by where y is the first component of the solution of the BSDE in (3). Note that . For the SZSDG of this paper, Player 1 minimizes the objective functional in (4) by choosing U, while Player 2 maximizes the same objective functional by selecting V. Hence, our problem can be regarded as a two-player state and control path-dependent SZSDG, due to the inherent dependence of the SDE in (2) and the objective functional in (4) on the (past and current) state and control paths of the players.
Remark 3.
- (i)
The motivation for using the BSDE-type objective functional in (3) and (4) is closely related to recursive-type differential games, where "recursive" means that the objective functional itself includes a dynamic structure. In fact, with recursive-type stochastic differential games, we are able to consider a general dynamic structure of the objective functional. For example, the wealth process of investors in mathematical finance, the utility maximization model in economics, and the (continuous-time) principal–agent problem in economics can be formulated using the framework of recursive-type BSDE objective functionals, which describe the general dynamic behavior of the investors (agents); see [44,47,48,49,50] and the references therein.
- (ii)
Note that by (4), the objective functional of the SZSDG depends on the state and control paths of the players. In Definitions 5 and 6, the notions of admissible controls and non-anticipative strategies for Players 1 and 2 are then defined to formulate the state and control path-dependent SZSDG of this paper. In particular, with the notion of admissible controls in Definition 5, it is possible to combine the past control path with the current control process of the players via (5). The notion of non-anticipative strategies in Definition 6 is then applied to define the lower and upper value functionals in (7) and (8).
Remark 4. When l in (3) is independent of y and q, (4) becomes

The admissible controls of Players 1 and 2 are defined as follows:
Definition 5. For , the admissible control for Player 1 (respectively, Player 2) is defined such that (respectively, ) is a -valued (respectively, -valued) -progressively measurable and càdlàg process in (respectively, ). The set of admissible controls of Player 1 (respectively, Player 2) is denoted by (respectively, ). We identify two admissible control processes of Player 1 (respectively, Player 2) u and in (respectively, v and in ) and write (respectively, ) on , if (respectively, ).
Given the definition of the admissible controls for Players 1 and 2 in Definition 5, we introduce the concept of non-anticipative strategies for Players 1 and 2.
Definition 6. For , a non-anticipative strategy for Player 1 (respectively, Player 2) is a mapping (respectively, ) such that for any -stopping time and any with on (respectively, with on ), it holds that on (respectively, on ). The set of admissible strategies for Player 1 (respectively, Player 2) is denoted by (respectively, ).
The following notation captures control path-dependent SZSDGs: for , where , , , and . Note that and .
With the help of the notation in (5), the objective functional in (4), which includes the paths of the controls of Players 1 and 2, can be written as follows: Then, for and , the lower value functional of (6) for the state and control path-dependent SZSDG can be defined by where the last equality follows from (5). Moreover, for and , the upper value functional of (6) is defined by Note that .
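Conventions for lower and upper values vary across the literature; one common rendering consistent with the strategy-against-control setup of Definition 6 (our notation, with Player 1 minimizing) is

```latex
L(t, X_t, Z_t, W_t) \;=\;
\operatorname*{ess\,inf}_{\alpha \in \mathcal{A}_{t,T}} \;
\operatorname*{ess\,sup}_{v \in \mathcal{V}_{t,T}} \;
J\big(t, X_t, Z_t \otimes_t \alpha(v), W_t \otimes_t v\big),
\qquad
U(t, X_t, Z_t, W_t) \;=\;
\operatorname*{ess\,sup}_{\beta \in \mathcal{B}_{t,T}} \;
\operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,T}} \;
J\big(t, X_t, Z_t \otimes_t u, W_t \otimes_t \beta(u)\big),
```

where $\otimes_t$ denotes the concatenation in (5) of the initial control path with the control selected on $[t,T]$.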
We state some remarks on various formulations of (path-dependent) SZSDGs.
Remark 5.
- (1)
One might formulate SZSDGs as control against control, in which the players select admissible controls individually. Although this formulation is quite similar to stochastic optimal control and can therefore be used to define the saddle-point equilibrium, the dynamic programming principle cannot be established and the value of the game may fail to exist; see Appendix E of [29] and Example 2.1 of [30]. Note that under this formulation, the necessary condition for the existence of the saddle-point equilibrium in terms of the (stochastic) maximum principle was studied in [13].
- (2)
The notion of non-anticipative strategies in Definition 6 is used in various zero-sum differential games; see [2,3,4,5,6,7,12,15,34]. This is the strong formulation with strategy against control. Under this formulation, it is possible to establish the dynamic programming principle, to show the existence of viscosity solutions of the Hamilton–Jacobi–Isaacs (HJI) equations, and to establish the existence of the game value under the Isaacs condition. We also note that, instead of the strong formulation with strategy against control, one can use the notion of non-anticipative strategies with delay, which still involves asymmetric information between the players but allows showing the existence of the (approximate) saddle-point equilibrium and the game value [6,8,34].
- (3)
Instead of the strong formulation with strategy against control, SZSDGs can be considered in a weak formulation [10,11,29,30,51]. Note that in [29,30], the players are restricted to observing state feedback information. Since the information is symmetric, it is convenient to define the saddle-point equilibrium and show the existence of the game value, and the dynamic programming principle can also be obtained. However, the notion of viscosity solutions of the HJI equation requires a nonlinear expectation, and additional assumptions are needed to show the existence and uniqueness of viscosity solutions in the sense of [31,32,33]; see Remark 3.7 of [30] (p. 10) and Section 6 of [29].
The next remark is on the (lower and upper) value functionals.
Remark 6. We can see that the value functionals in (7) and (8) depend on the initial paths of both the state and the controls of the players. Consider the situation when the path-dependence is only in the state variable, i.e., , , and . Then, the value functionals can be written independently of Z and W: This is a special case of the SZSDG in this paper, which was studied in [34]. In addition, for the state and control path-independent case, i.e., the SZSDG in the Markovian formulation, the value functionals reduce, for any initial state and , to the Markovian value functions; see [2,7] and the references therein.

Below, we discuss some motivating and practical examples of the SZSDG in this paper.
Example 1.
- (i)
As mentioned in [35], one main example of the SZSDG considered in this paper is the delay problem with delay . In particular, the SDE with delay is given by The objective functional with delay is as follows: Notice that there are two players in (9) and (10), where is the initial path of the state process, and and are the initial paths of the controls of the players. In addition, we observe that the objective functional in (10) also depends on the state and control paths of the players. This example can thus be viewed as an SZSDG with delay due to the presence of the delay in (9) and (10). Equivalently, since both the SDE in (9) and the objective functional in (10) depend on the state and control paths of the players, the SZSDG with delay can be formulated as the state and control path-dependent SZSDG studied in this paper.
- (ii)
Regarding the SZSDG with delay mentioned in (i), we can consider the following simplified problem: where , , are deterministic constants. Moreover, the objective functional is given by where , , and , , are deterministic constants. Note that this simplified case satisfies Assumption 1. Hence, we can apply the main results of this paper to solve the above state and control path-dependent SZSDG.
- (iii)
Stochastic control problems and differential games with delay can be solved by infinite-dimensional approaches [52,53,54]. (Note that [53,54] considered the one-player stochastic control problem with delay; it would be interesting to study the approach of [53,54] in the SZSDG framework.) However, these approaches are applicable only to delay-type problems and cannot be used to solve the general path-dependent problem. There are various applications of stochastic differential games and optimal control with delay in mathematical finance, economics, science, and engineering; see [55,56,57,58,59,60,61] and the references therein.
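The simplified linear delay dynamics of item (ii) can be sketched as follows (the constants $a_i$, $b_i$, $c_i$, the diffusion $\sigma$, and the single delay $\delta > 0$ are illustrative placeholders, not the paper's actual data):

```latex
\mathrm{d}x(s) \;=\; \big( a_1 x(s) + a_2 x(s-\delta) + b_1 u(s) + b_2 u(s-\delta)
  + c_1 v(s) + c_2 v(s-\delta) \big)\,\mathrm{d}s + \sigma\,\mathrm{d}B(s).
```

Since $x(s-\delta)$, $u(s-\delta)$, and $v(s-\delta)$ are functionals of the paths $X_s$, $U_s$, $V_s$, such delay dynamics are a special case of the state and control path-dependent SDE (2).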
Example 2. Based on Example 4.5 of [39] and [62], the SZSDG of this paper can be converted into a stochastic zero-sum differential game with random coefficients, in which the coefficients of (2) and (3) are random. In fact, the purpose of allowing random coefficients in stochastic control problems and their applications is to obtain general modeling frameworks and to capture random parameter variations due to imprecision, such as inaccurate modeling, environmental changes, random disturbances, and high sensitivity of dynamical systems. The reader is referred to [63,64,65,66,67] and the references therein for applications of stochastic control with random coefficients in diverse fields, such as mathematical finance, economics, science, and engineering. Specifically, the optimization of FitzHugh–Nagumo communication networks was considered in [66,67], whose problems can be generalized to the state and control path-dependent recursive-type SZSDG studied in this paper. Moreover, various mathematical finance problems with random coefficients were considered in [63,64,68], which can be studied from different aspects using the approach of this paper.

Example 3. Another application of the SZSDG in this paper is the two-player optimal consumption game in a delayed and path-dependent financial market, which can be regarded as a generalization of [69,70,71]. In particular, assume that is the consumption rate of the investor, whereas corresponds to the worst-case situation of the financial market. Then the investor's wealth process with delay, subject to non-risky and risky assets, can be described by where with and indicate the sliding average and the instantaneous delay, respectively. The objective functional for the two players is the path-dependent terminal wealth given by (with and )

4. Dynamic Programming Principle
This section establishes the dynamic programming principle for the lower and upper value functionals.
In view of Assumption 1 and the estimates in Lemma 3 and (5), the following result holds:
Lemma 4. Suppose that Assumption 1 holds. For any , and , , there exists a constant such that the following estimates hold for :

Remark 7. Lemma 4 implies that the (lower and upper) value functionals are continuous with respect to , where is the metric induced by . Since (see Section 2), in view of Definition 1, we can easily see that the (lower and upper) value functionals are continuous with respect to .

Before stating the dynamic programming principle of the lower and upper value functionals, we introduce the
backward semigroup associated with the BSDE in (3). For any with and , we define where is the first component of the pair that is the solution of the following BSDE:
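A standard form of such a truncated BSDE (the paper's (12); shown here in generic notation that may differ in details from the original display) is

```latex
y^{t,\tau}(s) \;=\; \eta
  + \int_s^{\tau} l\big(r, X_r, U_r, V_r, y^{t,\tau}(r), q^{t,\tau}(r)\big)\,\mathrm{d}r
  - \int_s^{\tau} \big\langle q^{t,\tau}(r), \mathrm{d}B(r) \big\rangle,
\qquad s \in [t,\tau],
```

with the backward semigroup defined by $G^{t,\tau}[\eta] := y^{t,\tau}(t)$: the terminal time $T$ and the terminal cost of (3) are replaced by $\tau \le T$ and the terminal condition $\eta$, which is the sense in which (12) truncates (3).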
Note that (12) can be regarded as a truncated BSDE in terms of the terminal time and the terminal condition. The superscripts t and indicate the initial and terminal times, respectively. By definition, we have
Now, we state the dynamic programming principle of the lower and upper value functionals in (7) and (8).
Theorem 1. Suppose that Assumption 1 holds. Then for any with , and for any and , the lower and upper value functionals in (7) and (8), respectively, satisfy the following dynamic programming principles:

Proof. We prove (
14) only, as the proof for (15) is similar to that for (
14).
Below, we show
and
. We adapt the proof of [
34] to the state and control path-dependent case.
Part (i):
We first show that given
and
, for any
, there exists
, such that
In view of Theorem A.3 of [
72], there exists
with
, such that
Let
,
. To construct a disjoint partition of
with
, let
and
for
. Let
. Then, in view of the uniqueness of the solution to the BSDE and (
17), we have
which shows (
16). In fact, to show the first equality in (
18), for any
, let
, where
, in which
and
correspond to
and
, respectively. Based on this construction, we have
On the other hand, from Theorem A.3 of [
72], there exists
and
with
, such that
Then, (
19) and (
20) imply (
18); hence, (
16) holds.
Given
, let
. Then
is the set of continuous functions, which together with the metric
induced by the norm
, implies that
is a complete separable metric space. Recall that
is the Skorohod metric for
(see the notation in
Section 2). In view of Theorem 12.2 of [
36],
is a complete separable metric space, and from [
73],
is a complete separable metric space with the metric
. Hence, there exists a countable dense subset, denoted by
[
74], and for any
and
, there exist
,
, such that
. For
, we define the neighborhood of
by
In view of this construction, , and by a slight abuse of notation, with and , , we still have , where is the disjoint partition of .
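The disjoint partition built from the countable dense subset can be illustrated by the following toy sketch (this is a hypothetical illustration of the set construction, not the paper's construction itself): given centers x_i and a radius eps, set B_1 = Ball(x_1, eps) and B_i = Ball(x_i, eps) minus the union of the earlier B_j, so the B_i are pairwise disjoint and cover every point within eps of some center.

```python
import numpy as np

def partition_index(points, centers, eps):
    """First-come assignment of each point to a center within distance eps,
    mimicking the disjoint partition B_i = Ball(x_i, eps) \\ (B_1 U ... U B_{i-1})."""
    idx = np.full(len(points), -1, dtype=int)   # -1 means "not yet assigned"
    for i, c in enumerate(centers):
        mask = (idx == -1) & (np.abs(points - c) < eps)
        idx[mask] = i                           # earlier centers win => disjointness
    return idx

pts = np.linspace(0.0, 1.0, 101)
ctrs = np.linspace(0.0, 1.0, 11)                # stand-in for a countable dense subset
labels = partition_index(pts, ctrs, eps=0.06)   # every point is within eps of a center
```

Since the centers are 0.1 apart and eps exceeds half that spacing, every point receives a label, and each point receives exactly one, which is the disjointness used in the proof.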
For any
,
and
, with the above construction, together with Lemma 4 and Remark 7, for each
, there exists a constant
, such that
Note that (
16) implies that there exists
, such that for
,
Hence, from (
21), for any
, we have
where
and
.
Note that
and
. In view of (
5), we have
where it can be verified that
.
Then from the comparison principle in (iv) of Lemma 3, (
12), and (
13), we have
The arbitrariness of
and
, together with the definition of
and (
13), yield
By letting , we have the desired result.
Part (ii):
We first note that for any fixed
with
, its restriction to
is still non-anticipative and independent of any special choice of
, i.e.,
for
, due to the non-anticipative property of
. Recall the definition of
; then with the restriction of
to
, we have
Furthermore, similar to (
17), there exists
, with
, such that
Then by using (
23) and the approach of (
16), for each
, there exists
, such that
Similar to the argument and the notation introduced in (
21) and (
22), there exists
,
, such that
where
.
Note that
and
. Then, from (
5),
where
. From (iv) of Lemma 3, (
12), and (
13), we have
The arbitrariness of
and the definition of
imply,
and by taking ess inf with respect to
and then
, we have the desired result. Hence, parts (i) and (ii) show the dynamic programming principle of the lower value functional
in (
14). This completes the proof of the theorem. □
From Lemma 4, the (lower and upper) value functionals are continuous with respect to the initial state and control paths. We next state the continuity of the (lower and upper) value functionals in .
Lemma 5. Suppose that Assumption 1 holds. Then, the lower and upper value functionals are continuous in t. In particular, there exists a constant , such that for any and with , ()

Proof. We prove the case of the lower value functional only, since the proof for the upper value functional is similar. Without loss of generality, for any
with
, we need to prove
In view of the dynamic programming principle (
14) in Theorem 1 and (
16), for any
, there exists
, such that for any
,
The definition of
implies
where
Note that for .
Now, Lemmas 3 and 4, the definition of
, and Jensen’s inequality imply that there exists a constant
, such that
Moreover, from the definition of
,
is equivalent to
Then the Hölder inequality, Assumption 1, and Lemma 3 imply that
Furthermore, in view of the definitions of the lower value functional in (
7) and the objective functional in (
6), we have
From (iii) of Lemmas 3, Lemma 4, (
5), and the definition of
, we have
By substituting (
27)–(
29) into (
26),
Hence, the arbitrariness of
implies the first inequality part in (
25). The second inequality part in (
25) can be shown in a similar way. We complete the proof. □
Based on Lemmas 4 and 5, the (lower and upper) value functionals satisfy the following continuity result:
Proposition 1. Suppose that Assumption 1 holds. Then there exists a constant , such that for with and any and , , ()

5. State and Control Path-Dependent Hamilton–Jacobi–Isaacs Equations and Viscosity Solutions
In this section, we introduce the lower and upper state and control path-dependent Hamilton–Jacobi–Isaacs (PHJI) equations that are path-dependent nonlinear second-order PDEs (PPDEs). We show that the (lower and upper) value functionals are viscosity solutions of the corresponding PHJI equations.
The Hamiltonian,
, is defined by
where
With (
30), we introduce the lower PHJI equation
and the upper PHJI equation
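For orientation, in the Markovian special case (see Remark 6) the Hamiltonian in (30) and the two equations reduce to familiar forms with classical derivatives in place of the path derivatives; schematically (the symbols below are generic stand-ins for the paper's notation),

```latex
H(t,x,u,v,p,P)
= \langle b(t,x,u,v),\,p\rangle
+ \tfrac{1}{2}\operatorname{tr}\!\big(\sigma\sigma^{\!\top}(t,x,u,v)\,P\big)
+ f(t,x,u,v),
```

so that the lower equation reads $-\partial_t W - \inf_{u\in U}\sup_{v\in V} H(t,x,u,v,\partial_x W,\partial_{xx}W) = 0$ with terminal condition $W(T,x)=g(x)$, and the upper equation interchanges $\inf$ and $\sup$.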
Remark 8. - (1)
In (32), . From Section 2, the time derivative of in (if it exists) can be written as follows:where is induced due to the definition of ⊗ in (5) (see also [35]). Moreover, the space derivative of with respect to is given by (if it exists), where Note that the definitions in (5), and imply that satisfies the predictable dependence condition in the sense of [40]; hence, the space derivative of with respect to the control of the players is zero; see Remark 4 of [40] and Remark 2.3 of [35]. The same argument applies to (33). - (2)
If there is a functional in in the sense of Definition 2, which solves (32), then it is a classical solution of (32). Moreover, similar to [31,37], the classical sub-solution (respectively, super-solution) is defined if the “=0” in (32) is replaced by “≥0” (respectively, “≤0”). When there is a classical solution of (32), it means that it is both classical sub- and super-solutions. The same argument can be applied to the upper PHJI equation in (33).
Remark 9. For the state path-dependent case (see Remark 6), (32) and (33) reduce to the state path-dependent HJI equations in (53) and (54). In addition, in the Markovian formulation (see Remark 6), the (lower and upper) PHJI equations are equivalent to those in Sections 4.1 and 4.2 of [7].

We fix
in the
-Hölder modulus. The notion of the viscosity solution is given as follows, which was first introduced in [
39] for the state path-dependent case.
Definition 7. - (i)
A real-valued functional is said to be a viscosity sub-solution of the lower PHJI equation in (32) if for and , and for all test functions satisfying the predictable dependence in the sense of [40], i.e., and , where , the following inequality holds: - (ii)
A real-valued functional is said to be a viscosity super-solution of the lower PHJI equation in (32) if for and , and for all test functions satisfying the predictable dependence in the sense of [40], i.e., and , where , the following inequality holds: - (iii)
A real-valued functional is said to be a viscosity solution if it is both a viscosity sub-solution and super-solution of (32). - (iv)
The viscosity sub-solution, super-solution, and solution of the upper PHJI equation in (33) are defined in similar ways.
Remark 10. - (1)
In Definition 7, in view of Remark 8, . For the Markovian case, Definition 7 is equivalent to that of the classical one in [7,16,45]. Moreover, we can easily check that if the viscosity solution of (32) further belongs to , satisfying the predictable dependence, then it is also the classical solution of (32). The same argument applies to (33). This implies that when the (lower and upper) value functionals are in , they are classical solutions of the (lower and upper) PHJI equations. - (2)
The definition of viscosity solutions in Definition 7 differs from that in [31,32,33], which was applied to SZSDGs in the weak formulation in [29,30]. In particular, in [31,32,33], a nonlinear expectation was included in the corresponding semi-jets, which is closely related to a certain class of BSDEs via the Feynman–Kac formula. It would be interesting to investigate the relationship (or equivalence) between Definition 7 and the definition in [31,32,33]. As noted in Section 1 and Remark 3.7 of [30] (p. 10), the general uniqueness question in the sense of [31,32,33] has not been completely resolved, and the SZSDG in the weak formulation requires more stringent assumptions on the coefficients than the SZSDG in the strong formulation. Since we consider the SZSDG in the strong formulation, we modified the notion of viscosity solutions in [39], which was applied to the state path-dependent (one-player) stochastic control problem in the strong formulation. A similar definition was also introduced in [62] to study a class of stochastic HJB equations (in the strong formulation). Recently, [38] studied the uniqueness of the viscosity solution for the (one-player) stochastic control problem in the strong formulation under a notion similar to that of the finite-dimensional (Markovian) case of [16], where [38] requires several assumptions different from those of this paper.
Remark 11. This remark will be used in the proof of Theorem 2 given below. The condition of the predictable dependence for the test function ϕ in Definition 7 is introduced due to the control path-dependent nature of the SZSDG with (5). Specifically, from the predictable dependence of ϕ with respect to the control of the players in the sense of [40], i.e., , and the definition in (31) and (5), it holds that and (see also (1) of Remark 8). Therefore, the (space) derivative of ϕ with respect to the control of the players is zero, i.e., , , and . Similar discussions can be found in Remark 4 of [40] and Remark 2.3 of [35]. We should also mention that for the state path-dependent case (see Remarks 6 and 9), the predictable dependence condition is not needed.

We state the main result of this section.
Theorem 2. Suppose that Assumption 1 holds. Then the lower value functional in (7) is the viscosity solution of the lower PHJI equation in (32). The upper value functional in (8) is the viscosity solution of the upper PHJI equation in (33).

Before proving the theorem, for
,
,
, and
, let
be the perturbed version of
defined by
Note that
. The perturbation is essential to prove Theorem 2; see Remark 6 of [
39].
We state the following lemma, whose proof is given in Lemma 5.1 of [
39].
Lemma 6. Let . Assume that , , i.e., , and . Then we have
- (i)
.
- (ii)
.
- (iii)
There exists a constant , independent of μ, such that for any d with and , .
The proof of Theorem 2 is given as follows.
Proof of Theorem 2. We first prove that the lower value functional in (
7) is the viscosity super-solution of the lower PHJI equation in (
32). Note that in view of Lemmas 4 and 5, it is clear that
. Furthermore, from (
7), we have
.
From the definition of the viscosity super-solution in (ii) of Definition 7 and Lemma 1, for
,
and
,
where
. By definition of
, let
.
For any
, in view of (i) and (ii) in Lemma 6, we have
. Consider the following
-stopping time:
By definition, for any
,
and for a small
with
,
Hence, from (iii) of Lemma 6, we have
and by (ii) of Lemma 3 and the Markov inequality,
Now, from the dynamic programming principle in (
14) of Theorem 1,
Then similar to (
24), for any
, there exists
, such that for any
,
where in view of the definition of
,
in the above expression can be rewritten as (superscript
is omitted)
On the other hand, by using the functional Itô formula in Lemma 2, we have
where
Here, we used the fact that the (space) derivative of with respect to the control of the players is zero as stated in Remark 11.
From (
38) and (
39),
where
and
due to Assumption 1. We have from (
40),
Notice that (
42) is a linear BSDE; hence, by using Lemma 2 and Proposition 4.1.2 of [
44], its explicit unique solution can be written as follows:
where
is the scalar-valued state transition process given by
,
, with
, i.e.,
.
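For reference, the explicit representation invoked here is the standard one for linear BSDEs (see Proposition 4.1.2 of [44]); in generic notation (the symbols below stand in for the elided notation), if $\mathrm{d}Y_s = -(a_s Y_s + b_s Z_s + c_s)\,\mathrm{d}s + Z_s\,\mathrm{d}B_s$ with terminal condition $Y_T = \xi$, then

```latex
Y_s = \mathbb{E}\Big[\,\Gamma_{s,T}\,\xi + \int_s^{T}\Gamma_{s,r}\,c_r\,\mathrm{d}r \;\Big|\;\mathcal{F}_s\Big],
\qquad
\mathrm{d}\Gamma_{s,r} = \Gamma_{s,r}\big(a_r\,\mathrm{d}r + b_r\,\mathrm{d}B_r\big),\quad \Gamma_{s,s}=1,
```

where $\Gamma$ plays the role of the scalar-valued state transition process above.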
From (
36) and (
37), together with (
43) and the predictable dependence of
,
where
In view of (ii) in Lemma 3 and the fact that
solves a linear SDE, we have
Furthermore, due to the property of
and Assumption 1, for
and
,
Then, from the definition of the viscosity super-solution (ii) in Definition 7, Lemmas 4 and 5, and the property of
, we have
Note that by (
40), (
45), the Hölder inequality, Lemma 3, and (
35),
From (ii) of Lemma 3, we also have
and
Hence, by substituting (
45)–(
48) into (
44), we have
Let
. Then the arbitrariness of
v and
, and the definition of
F imply that
This shows that (
7) is the viscosity super-solution of (
32).
Next, we prove that (
7) is the viscosity sub-solution of the lower PHJI equation in (
32). From (i) in Definition 7 and Lemma 1, for
,
,
where
. This implies that
, and for
,
.
From Lemmas 4 and 5,
, and due to the definition of the lower value functional,
. Then it is necessary to prove that
Now, suppose that this is not true, i.e., there exists a finite
, such that for some
,
where by definition of
F,
. Note that
and
are compact; hence, there exists a measurable function
, such that for any
with
,
On the other hand, from (
14) of Theorem 1, we have
By defining
for
, we have
and
. This, together with the definition of
and the comparison principle in (iv) of Lemma 3, implies
For each
, similar to (
24), we can choose
, such that
Note (
40) and (
41). Then, similar to (
43), by Lemma 2, we have
With the same technique as in the super-solution case and the definition of
, by letting
, the arbitrariness of
and (
50) imply that
. This induces
, which leads to a contradiction. Hence, (
7) is the viscosity sub-solution of the lower PHJI equation in (
32).
The proof for the upper value functional
being a viscosity solution to the upper PHJI equation in (
33) is similar. We complete the proof of the theorem. □
We now discuss the existence of the game value of the SZSDG under the Isaacs condition. Specifically, we introduce the
state and control path-dependent Isaacs condition: for
,
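In the Markovian special case this is the classical Isaacs (minimax) condition on the Hamiltonian; schematically (with generic symbols standing in for the paper's notation),

```latex
\inf_{u\in U}\sup_{v\in V} H(t,x,u,v,p,P)
= \sup_{v\in V}\inf_{u\in U} H(t,x,u,v,p,P)
\qquad \text{for all } (t,x,p,P).
```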
Then the existence of the game value can be stated as follows:
Theorem 3. Suppose that Assumption 1 and the uniqueness of the viscosity solutions of (32) and (33) hold. Under the Isaacs condition in (51), the game has a value, i.e., , where is the unique viscosity solution of the following PHJI equation:

Proof. In view of Theorem 2 and the uniqueness assumption, the lower value functional
and the upper value functional
are the unique viscosity solutions of (
32) and (
33), respectively. Then, the Isaacs condition in (
51) implies
, which is the unique solution to the PHJI equation in (
52). We complete the proof. □
Before concluding the paper, we will discuss the state path-dependent case, which is a special case of the SZSDG in this paper and was studied in [
34]. (As mentioned in
Section 1, [
34] considered the existence of the game value and the approximated saddle point equilibrium, both under the Isaacs condition, but did not study the existence and uniqueness of (viscosity or classical) solutions of state path-dependent HJI equations). Specifically, as stated in Remarks 6 and 9, we need to assume that
Assumption 2. , , .
Remark 12. With Assumption 2, (51) becomes the state path-dependent Isaacs condition in Section 3.2 of [34]. Hence, Theorem 3 reduces to Theorem 3.1 of [34] when Assumption 2 holds.

Under Assumption 2, the lower and upper PHJI equations in (
32) and (
33) are reduced to the state path-dependent HJI equations (see Remark 9):
and
Assumption 3. Let and . For any , and with and ,

We state the uniqueness of classical solutions of (
53) and (
54).
Proposition 2. Assume that Assumptions 1–3 hold. Suppose that and are classical sub- and super-solutions of the lower PHJI equation in (53), respectively. Then we have for . The same result holds for the upper PHJI equation in (54). Consequently, there is a unique classical solution to (53) and (54).

Proof. Let
, where
. Then we can easily see that
is a classical sub-solution of the following PDE (see (2) of Remark 8):
Since
follows from
in the limit
, it suffices to prove the result under the following additional assumption:
where
and
uniformly on
.
Assume that this is not true; that is, there exists
with
, such that
. In view of Lemma 1, there exists
with
such that
. Then from Lemma 9 of [
37], we have
and
. This, together with Assumption 3 and the fact that
is the classical super-solution, implies that
which induces a contradiction. Hence,
for
. Suppose that
and
are classical solutions of (
53). Then we have
and
, which implies
. Hence, the uniqueness follows. This completes the proof. □
6. Conclusions
We considered the two-player state and control path-dependent stochastic zero-sum differential game (SZSDG), where the state process and objective functionals of the players are dependent on (current and past) paths of state and control processes. The notion of non-anticipative strategies has been used to define lower and upper value functionals of the SZSDG, which are dependent on the initial state and control path of the players. We have shown that the (lower and upper) value functionals satisfy the dynamic programming principle (DPP), where the Skorohod metric is necessary in the proof to maintain the separability of the càdlàg (state and control) spaces. Then we have shown that the lower and upper value functionals are viscosity solutions of (lower and upper) state and control path-dependent HJI equations, where the notion of viscosity solutions is defined on a compact -Hölder space to use several important estimates and to guarantee the existence of minimum and maximum points between the (lower and upper) value functionals and the test functions. These two results, together with the Isaacs condition and the uniqueness of viscosity solutions, imply the existence of the game value. Finally, we have shown the uniqueness of classical solutions for the (state path-dependent) HJI equations in the state path-dependent case.
Some model limitations based on the results of this paper can be stated as follows. First, we need to develop numerical techniques to solve the state and control path-dependent (lower and upper) HJI equations in (
32) and (
33), which lead to the characterization of the value of the SZSDG in this paper via Theorems 2 and 3. Unfortunately, until now, there have been no notable results on this topic. We may extend (finite-difference and learning-based) numerical techniques of state-dependent problems in [
71,
75,
76] to the state and control path-dependent case studied in this paper. However, this extension is not trivial, as our (lower and upper) state and control path-dependent HJI equations and their time derivatives are dependent on the control paths of the players (see Remark 8), and their control paths are coupled with each other through the inf and sup operations (see (
32) and (
33)). Hence, we have to develop a new direction of numerical techniques to solve our (lower and upper) state and control path-dependent HJI equations. This is one of our primary research topics; we hope to present some notable results in an upcoming paper. Another limitation is the study of a class of state and control path-dependent non-zero-sum differential games; in this case, the notion of non-anticipative strategies cannot be used directly, and a technique from [
6] is needed to convert the original SZSDG into an equivalent game form.
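As a concrete, deliberately simplified illustration of the finite-difference direction mentioned above, the following sketch steps a one-dimensional Markovian lower Isaacs equation backward in time over small discrete control sets. The dynamics b, diffusion sigma, running cost f, terminal cost g, grid, and control sets are hypothetical stand-ins, not the model of this paper, and the genuinely path-dependent equations (32) and (33) are outside the scope of this scheme.

```python
import numpy as np

def solve_lower_isaacs(b, sigma, f, g, U, V, x, T, n_steps):
    """Explicit finite-difference scheme for the 1-D Markovian lower Isaacs
    equation -V_t - inf_u sup_v {b V_x + 0.5 sigma^2 V_xx + f} = 0, V(T,x) = g(x)."""
    dx, dt = x[1] - x[0], T / n_steps
    W = g(x).astype(float)                      # terminal condition V(T, x) = g(x)
    for _ in range(n_steps):
        Wx = np.gradient(W, dx)                 # central first derivative
        Wxx = np.gradient(Wx, dx)               # crude second derivative
        H = np.full_like(W, np.inf)
        for u in U:                             # inf over player 1's controls
            Hu = np.full_like(W, -np.inf)
            for v in V:                         # sup over player 2's controls
                Hu = np.maximum(Hu, b(x, u, v) * Wx
                                + 0.5 * sigma(x) ** 2 * Wxx + f(x, u, v))
            H = np.minimum(H, Hu)
        W = W + dt * H                          # explicit backward Euler step
        W[0], W[-1] = g(x[0]), g(x[-1])         # frozen Dirichlet boundary values
    return W

# Illustrative data: linear dynamics, quadratic costs, three controls per player.
x = np.linspace(-2.0, 2.0, 81)
W0 = solve_lower_isaacs(
    b=lambda x, u, v: u - v,
    sigma=lambda x: 0.2 + 0.0 * x,
    f=lambda x, u, v: 0.5 * (u ** 2 - v ** 2),
    g=lambda x: x ** 2,
    U=(-1.0, 0.0, 1.0), V=(-1.0, 0.0, 1.0),
    x=x, T=0.5, n_steps=2000)
```

The small time step keeps the explicit scheme stable for the chosen diffusion and grid; a path-dependent analogue would additionally have to discretize the state and control histories, which is precisely the open difficulty discussed above.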
There are several interesting potential future research problems.
First, one important problem is the uniqueness of the viscosity solutions of the (lower and upper) PHJI equations in (
32) and (
33). As mentioned in
Section 1, the uniqueness has not been shown even in the case of (strong and weak formulation) state path-dependent SZSDGs [
29,
30,
34].
Second, we may study the existence of the (approximated) saddle point equilibrium using the notion of non-anticipative strategies with delay as mentioned in (2) of Remark 5. For the state-dependent case (with Assumption 2), this was shown in Theorem 4.13 of [
34] under the Isaacs condition, where the key step involves approximating the PHJI equation in (
53) and (
54) to the state-dependent (not state path-dependent) HJI equations. Note that there is a unique viscosity solution of the approximated (lower and upper) state-dependent HJI equations in view of Theorem 5.3 of [
7]. Then, the existence of the (approximated) saddle point equilibrium can be shown using the property of non-anticipative strategies with delay Lemma 2.4 of [
6]. We speculate that the approach of Theorem 4.13 of [
34] can be applied to the state and control path-dependent SZSDG of this paper.
Third, we can consider the problem in a weak formulation. As noted in (3) of Remark 5, one major feature of this formulation is the symmetric feedback information between the players, which is convenient to show the existence of the saddle point equilibrium and the game value.
Fourth, the forward–backward stochastic differential equation given in (
2) and (
3) is not fully coupled in the sense that the BSDE in (
3) is not included in the (forward) SDE in (
2). This can be extended to the fully-coupled FBSDE, where (
2) is also dependent on (
3). This can be viewed as a generalization of [
77], where the major challenge is the case when the diffusion term of (
2) depends on the second component of the solution of the BSDE, since the associated PHJI equation should require an additional algebraic equation.
Finally, we may study applications of the state and control path-dependent SZSDG of this paper, where some motivating and practical examples are given in Examples 1–3. Note that Examples 1–3 can be treated by the main results of this paper. In fact, by Theorems 2 and 3, the optimal game value of Examples 1–3 can be obtained by solving the corresponding state and control path-dependent (lower and upper) HJI equations. However, as mentioned above, numerical techniques to solve the state and control path-dependent (lower and upper) HJI equations have to be studied, which is our primary research topic. The numerical techniques for solving (
32) and (
33) lead to studying various applications (see Examples 1–3) of the (state and control) path-dependent SZSDG in this paper.