1. Introduction
A differential game is concerned with the problem in which multiple players make decisions, according to their own advantages and trade-offs with their peers, in the context of dynamic systems. More precisely, suppose that there are N players acting on a dynamic system through their decisions $u_1, \ldots, u_N$, and each of them has an individual cost functional $J_i(u_1, \ldots, u_N)$, $i = 1, \ldots, N$. The goal is to find a Nash equilibrium $(u_1^*, \ldots, u_N^*)$ for the N players, which satisfies $J_i(u_1^*, \ldots, u_N^*) \le J_i(u_1^*, \ldots, u_{i-1}^*, u_i, u_{i+1}^*, \ldots, u_N^*)$ for $i = 1, \ldots, N$, where $u_i$ is an arbitrary decision for Player i. This property of $(u_1^*, \ldots, u_N^*)$ means that when the N players act with $(u_1^*, \ldots, u_N^*)$, it is advantageous for all of them to maintain their decisions throughout: if one of them unilaterally changes its decision while the others maintain theirs, it is penalized. The problem described above is called a nonzero-sum differential game. In this paper, we assume $N = 2$ and focus our attention on the two-player case. In particular, if the dynamic system is given by a stochastic differential equation, then the game is called a nonzero-sum stochastic differential game, and there is a rich literature on this subject; see Hamadène et al. [1], Hamadène [2,3], Wu [4], and Sun and Yong [5]. On the other hand, the theory of optimal control and differential games has recently been found to be very useful in the field of human–machine interaction systems; see [6,7,8].
Optimal switching is concerned with determining a sequence of stopping times at which to shift the mode of the controlled process from one to another. When there are more than two modes, one needs to decide not only when to switch, but also where to switch. As an important branch of control theory, optimal switching has been extensively investigated by means of variational inequalities (see Yong [9,10], Tang and Yong [11], Pham [12], and Song et al. [13]) or backward stochastic differential equations (see Hamadène and Jeanblanc [14], El Asri and Hamadène [15], Hamadène and Zhang [16], Hu and Tang [17], and El Asri [18,19]). In addition, zero-sum switching game problems have been discussed by Tang and Hou [20], Hamadène et al. [21], and El Asri and Mazid [22]; nonzero-sum switching game problems, however, have not yet been investigated. Apart from its intrinsic mathematical interest, optimal switching enjoys a wide range of applications, such as resource extraction (Brekke and Øksendal [23]), investment decisions (Duckworth and Zervos [24]), and electricity production (Carmona and Ludkovski [25]).
In this paper, we consider, for the first time in the literature, a two-player nonzero-sum stochastic differential game in which both players use switching controls. The main contribution of this paper is to establish a verification theorem (see Theorem 1) as a sufficient criterion for a Nash equilibrium between the two players, in which a set of variational inequalities is given, together with a regularity condition on each player's solution over the opponent's continuation region and away from the boundary of its own continuation region (taking Player 1 as an example). It turns out that the Nash equilibrium strategies for the two players can be constructed from these variational inequalities, and the solutions coincide with the corresponding value functions (or Nash equilibrium payoffs, in the terminology of Buckdahn et al. [26]) of the two players.
On the one hand, we emphasize that if our only goal were to prove the verification theorem, the regularity condition imposed here would be stricter than what is actually needed; however, this seemingly superfluous regularity condition becomes necessary once we apply the so-called smooth-fit principle to solve specific examples, since otherwise we cannot obtain enough pasting equations for the undetermined parameters. On the other hand, we would like to mention that in this paper, the verification theorem is proved in a piecewise, or stage-by-stage, manner along the sequence of total decision times of the two players (here, a stage means a period between two adjacent decision times), which, to the best of our knowledge, is new in the nonzero-sum switching game literature.
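For illustration, in generic notation (with $\varphi(\cdot\,;i)$ a candidate value function in mode $i$, $c_{ik}$ the cost of switching from mode $i$ to mode $k$, and $x^{*}$ a free-boundary point, none of which is the paper's own notation), the smooth-fit principle typically supplies pasting equations of the following form:

```latex
% Generic value-matching and smooth-fit (C^1 pasting) conditions at a
% hypothetical switching boundary point x^* between mode i and mode k.
\begin{equation*}
  \varphi(x^{*};i) = \varphi(x^{*};k) - c_{ik},
  \qquad
  \varphi'(x^{*};i) = \varphi'(x^{*};k).
\end{equation*}
```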
The rest of this paper is organized as follows. Section 2 formulates the nonzero-sum game problem under consideration. Section 3 states and proves the verification theorem. Finally, Section 4 concludes the paper with some further remarks.
  2. Problem Formulation
Let $\mathbb{R}$ be the one-dimensional real space. For a subset $D \subseteq \mathbb{R}$, let $C(D)$, $C^1(D)$, and $C^2(D)$ denote the spaces of all real-valued continuous, continuously differentiable, and twice continuously differentiable functions on $D$, respectively. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space on which a one-dimensional standard Brownian motion $W = (W_t)_{t \ge 0}$ is defined, and let $(\mathcal{F}_t)_{t \ge 0}$ denote the natural filtration of $W$ augmented by all the null sets.
Let the two players in the game be labeled Player 1 and Player 2. In order to formulate the problem precisely, we first provide the definition of admissible switching controls for the two players.
Definition 1. Let  and  be two finite sets of possible modes for Player 1 and Player 2, respectively. An admissible switching control for Player 1 is a sequence of pairs , where  is a sequence of stopping times with  and  as , representing the decisions on “when to switch,” and  is a sequence of -valued random variables with each  being -measurable, representing the decisions on “where to switch.” The collection of all admissible switching controls for Player 1 is denoted by .
An admissible switching control  for Player 2 is defined similarly. The collection of all admissible switching controls for Player 2 is denoted as .
Remark 1. Here, an example of electricity production management is provided to illustrate the meaning of “possible modes for players” in Definition 1. Typically, a power plant has multiple modes, such as operating at full capacity, operating at partial capacity, or even shutting down all generators. Suppose that two managers run the power plant by switching its mode from one to another, and that fixed costs are associated with these switchings. Both managers make decisions to maximize their own payoffs and eventually reach a Nash equilibrium. In this example, the multiple modes of the power plant that can be chosen by the two managers are the so-called “possible modes for players”.
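To fix ideas, an admissible switching control as in Definition 1 can be thought of as a list of decision times paired with the chosen modes. The following Python sketch is purely illustrative: it uses a discrete, pathwise representation with hypothetical names rather than the stopping-time formulation of Definition 1.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List

@dataclass
class SwitchingControl:
    """Illustrative stand-in for an admissible switching control: increasing
    decision times ("when to switch") paired with the modes chosen at those
    times ("where to switch").  A pathwise, discrete sketch only."""
    times: List[float]   # decision times, assumed sorted increasingly
    modes: List[int]     # chosen modes, values in the player's mode set
    initial_mode: int    # mode in force before the first decision time

    def mode_at(self, t: float) -> int:
        """Mode in force at time t under this control."""
        k = bisect_right(self.times, t)
        return self.initial_mode if k == 0 else self.modes[k - 1]

# Example in the spirit of Remark 1: a plant starts at partial capacity
# (mode 1), ramps up to full capacity (mode 2) at t = 1.5, and shuts down
# (mode 0) at t = 4.0.
u1 = SwitchingControl(times=[1.5, 4.0], modes=[2, 0], initial_mode=1)
print(u1.mode_at(0.5), u1.mode_at(2.0), u1.mode_at(5.0))  # -> 1 2 0
```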
The state process  is described by the following:

where $b$ and $\sigma$ are two given functions satisfying the usual Lipschitz condition, so that (1) admits a unique strong solution. Note that the functions $b$ and $\sigma$, as well as the function $f$ and the switching cost and gain functions appearing in the payoff functionals below, are deterministic, since in this paper we adopt the verification theorem approach, associated with a set of ordinary differential equations in the form of variational inequalities, to deal with the game problem under consideration.
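For intuition, the state process can be simulated as follows, assuming the diffusion form $\mathrm{d}X_t = b(X_t)\,\mathrm{d}t + \sigma(X_t)\,\mathrm{d}W_t$ suggested by the discussion above; the concrete drift and volatility in the example are arbitrary illustrative choices.

```python
import numpy as np

def simulate_state(b, sigma, x0, T, n_steps, rng=None):
    """Euler-Maruyama sketch of a state process of the assumed form
    dX_t = b(X_t) dt + sigma(X_t) dW_t (the text only requires b and sigma
    to be Lipschitz, so that a unique strong solution exists)."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)  # Brownian increments
    for k in range(n_steps):
        x[k + 1] = x[k] + b(x[k]) * dt + sigma(x[k]) * dw[k]
    return x

# Illustrative coefficients: mean-reverting drift and constant volatility.
path = simulate_state(b=lambda x: 0.5 * (1.0 - x), sigma=lambda x: 0.2,
                      x0=1.0, T=5.0, n_steps=1000)
```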
The payoff functionals for Player 1 and Player 2 to maximize are given, respectively, by

and

where $f$ is a given function, two further functions specify the switching costs for Player 1 and Player 2, respectively, two more specify the corresponding gains for Player 1 and Player 2 due to the opponent's actions, respectively, and a constant discount factor discounts future payoffs. It is emphasized that there are no specific conditions imposed on $f$, the switching costs, or the gains at this stage; all of the conditions we need are listed in the verification theorem.
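Along a single simulated path, a payoff of the kind described above (a discounted running reward, reduced by the player's own switching costs and increased by gains triggered by the opponent's switches) can be evaluated as in the following sketch; the additive structure and all names are illustrative assumptions rather than the paper's functionals.

```python
import numpy as np

def discounted_payoff(x_path, dt, r, f, mode_path,
                      own_switch_times, switch_cost,
                      opp_switch_times, gain):
    """Single-path sketch of one player's payoff under an assumed additive
    structure: discounted running reward f (assumed vectorized over arrays),
    minus discounted costs at the player's own switching times, plus
    discounted gains at the opponent's switching times."""
    t = np.arange(len(x_path)) * dt
    running = np.sum(np.exp(-r * t) * f(x_path, mode_path) * dt)
    costs = sum(np.exp(-r * s) * switch_cost(s) for s in own_switch_times)
    gains = sum(np.exp(-r * s) * gain(s) for s in opp_switch_times)
    return running - costs + gains
```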
The objective is to find a Nash equilibrium , i.e.,

If such a Nash equilibrium exists, then we denote

as the corresponding value functions. Note that these value functions are not uniquely defined, but depend on the Nash equilibrium under consideration.
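In generic notation (writing $J_1$, $J_2$ for the two payoff functionals and $u_1$, $u_2$ for admissible switching controls of the two players), the Nash equilibrium requirement takes the form:

```latex
\begin{equation*}
  J_1(x; u_1^{*}, u_2^{*}) \ge J_1(x; u_1, u_2^{*})
  \quad\text{and}\quad
  J_2(x; u_1^{*}, u_2^{*}) \ge J_2(x; u_1^{*}, u_2)
  \qquad \text{for all admissible } u_1,\, u_2.
\end{equation*}
```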
Remark 2. In this paper, we assume the Brownian motion and the state process to be one-dimensional just for simplicity of presentation. There is no essential difficulty in generalizing the results to the multi-dimensional case, albeit with more complex notation.
   3. Verification Theorem
In this section, we establish a verification theorem as a sufficient criterion that can be used to obtain a Nash equilibrium.
Remark 3. The definitions of  and  have an immediate explanation: If Player 1 (respectively, Player 2) makes a switching from i to k (respectively, j to l), then the present Nash equilibrium payoff can be written as  (respectively, ); we have considered the payoff in the present mode of the switching control and the switching cost. The maximum point of  (respectively, ) is actually the best new mode that Player 1 (respectively, Player 2) would choose in case it wants to switch; otherwise, it would be in its interest to deviate by the definition of the Nash equilibrium.
Similarly,  (respectively, ) represents the payoff for Player 1 (respectively, Player 2) when Player 2 (respectively, Player 1) takes the best switching action and behaves optimally afterward.
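In formula form, with generic notation ($\varphi_1(\cdot\,;i,j)$ for Player 1's candidate value at the mode pair $(i,j)$ and $c^{1}_{ik}$ for Player 1's cost of switching from mode $i$ to mode $k$, neither of which is the paper's own notation), a generic version of the quantity discussed in Remark 3 reads:

```latex
% If Player 1 switches from mode i to mode k while Player 2 stays in mode j,
% the payoff becomes the candidate value at the new mode pair minus the
% switching cost; the maximizing k is the best new mode.
\begin{equation*}
  \mathcal{M}_1\varphi_1(x;i,j) \;=\; \max_{k \neq i}\bigl[\varphi_1(x;k,j) - c^{1}_{ik}\bigr].
\end{equation*}
```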
Let

and

In fact,  (respectively, ) is the so-called continuation (or no-switching) region for Player 1 (respectively, Player 2), in which it is better for Player 1 (respectively, Player 2) to do nothing than to make a switch; see Figure 1 for a graphical representation of the two regions. Their boundaries are denoted by  and , respectively.
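With the same generic notation as above, and for a fixed mode pair $(i,j)$, a typical description of Player 1's continuation region is the set on which doing nothing strictly dominates the best immediate switch; Player 2's region is analogous, with the roles of the two players exchanged:

```latex
\begin{equation*}
  \bigl\{\, x \in \mathbb{R} \;:\; \varphi_1(x;i,j) \,>\, \max_{k \neq i}\bigl[\varphi_1(x;k,j) - c^{1}_{ik}\bigr] \,\bigr\}.
\end{equation*}
```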
Denote

where  and  are the first-order and second-order derivatives of an arbitrary twice differentiable function , respectively.
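For a one-dimensional diffusion with coefficients $b$ and $\sigma$, an operator of the kind used here typically takes the form below (possibly combined with a discount term); this is an assumed form for illustration only.

```latex
\begin{equation*}
  \mathcal{L}\varphi(x) \;=\; b(x)\,\varphi'(x) \;+\; \tfrac{1}{2}\,\sigma^{2}(x)\,\varphi''(x).
\end{equation*}
```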
Now, we state and prove the verification theorem for the nonzero-sum stochastic differential game with switching controls.
Theorem 1. Let  and  be real-valued functions such that:

(i)
(ii) For  and ,
and
(iii) For  and ,
(iv) For  and ,
(v) For  and ,
(vi) For  and ,

Define  and  inductively as

and

Then,  is a Nash equilibrium for the two players, and  are the corresponding value functions of the game.

Remark 4. It should be noticed that, along the boundaries of  (respectively, ), the function  (respectively, ), , only belongs to , but does not necessarily belong to  on  (respectively, ). In this situation, we can apply the smooth approximation argument introduced by Øksendal [27] (Theorem 10.4.1 and Appendix D) to supply the smoothness needed for Itô's formula. Here, in the proof, for convenience, we simply take  (respectively, ), , to be  on  (respectively, ); in this connection, see also Guo and Zhang [28] (Theorem 3.1), [29] (Theorem 2), and Aïd et al. [30] (Theorem 1).

Proof. We only show the part for Player 1; the counterpart for Player 2 is symmetric. We first prove that
        
where  is an arbitrary switching control for Player 1. Denote  (with ) as the sequence of total switching times of the two players in the game: at each , we have either  for some m or  for some n. Then, based on , the payoff functional for Player 1 can be rewritten as
        
Applying Itô's formula to  between  and , , we have
        
where the inequality follows from condition (iii), as  when , and the last equality is due to the fact that no switching occurs between  and ; thus, we have  and .
Summing over the indices , we have
        
Combining (8) and (9) yields
        
        In the following, the analysis is divided into three cases:
(a) If  for some m, then
        
(b) If  for some n, then
        
(c) If  simultaneously for some m and n, then
        
From (10)–(13), we have
        
        On the other hand, the proof of
        
        is the same as above, but with all inequalities becoming equalities.    □
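The inductive definition of the equilibrium strategies in Theorem 1 (switch the first time the state leaves one's own continuation region, into the best available mode, and then repeat from the new pair of modes) can be sketched in discrete time as follows; the helpers `in_continuation` and `best_mode` are hypothetical stand-ins for the regions and maximizers determined by the variational inequalities, not objects defined in the paper.

```python
def play_equilibrium(x_path, dt, in_continuation, best_mode, init_modes):
    """Discrete-time sketch of the inductive strategy construction
    (hypothetical helpers):
      in_continuation(player, x, i, j) -> True if doing nothing is optimal
                                          for `player` at state x, modes (i, j);
      best_mode(player, x, i, j)       -> the mode to switch to otherwise.
    Each player switches the first time the state leaves its own continuation
    region, and the construction then restarts from the new pair of modes."""
    i, j = init_modes                      # current modes of Players 1 and 2
    decisions = []                         # recorded (time, player, new_mode)
    for k, x in enumerate(x_path):
        t = k * dt
        if not in_continuation(1, x, i, j):
            i = best_mode(1, x, i, j)      # Player 1's switching decision
            decisions.append((t, 1, i))
        if not in_continuation(2, x, i, j):
            j = best_mode(2, x, i, j)      # Player 2's switching decision
            decisions.append((t, 2, j))
    return decisions
```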
Remark 5. Here, we provide some comments on the conditions (i)–(vi) given in the verification theorem. First, condition (i) is the regularity requirement on the solutions of the variational inequalities; it is important when, in specific cases, we solve the variational inequalities and obtain analytical solutions by means of the so-called smooth-fit principle. Condition (ii) is a typical assumption in optimal switching control theory, which comes from the dynamic programming principle. Regarding condition (iii), if Player 2 does not make a switch (i.e., ), then the problem for Player 1 becomes a classical one-player optimal switching control problem, so we have (2). On the contrary, if Player 2 makes a switch (i.e., ), then, by the definition of the Nash equilibrium, we expect that Player 1 does not lose anything; this is equivalent to (4) in condition (v), as otherwise it would be in Player 1's interest to deviate. Finally, conditions (iv) and (vi) on  are imposed for the same reason.

Remark 6. The proof for the case with three or more players can be given in a similar way to that of Theorem 1. Note that in the proof of Theorem 1, the two defining properties of the Nash equilibrium, namely that  and , are shown separately and independently. The argument therefore generalizes naturally to the case with three or more players; one needs only to modify accordingly the conditions (i)–(vi) imposed on  and  to  for  in Theorem 1.