Inverse Stackelberg Solutions for Games with Many Followers

Averboukh, Yurii

doi:10.3390/math6090151


by Yurii Averboukh ^{1,2}

¹ Department of Control Systems, Krasovskii Institute of Mathematics and Mechanics, 16, S. Kovalevskoi str., Yekaterinburg 620990, Russia

² Department of Applied Mathematics and Mechanics, Ural Federal University, 19, Mira str., Yekaterinburg 620002, Russia

Mathematics 2018, 6(9), 151; https://doi.org/10.3390/math6090151

Submission received: 30 July 2018 / Revised: 23 August 2018 / Accepted: 27 August 2018 / Published: 30 August 2018

(This article belongs to the Special Issue Mathematical Game Theory)


Abstract

The paper is devoted to inverse Stackelberg games with many players. We consider both static and differential games. The main assumption of the paper is the compactness of the strategy sets. We obtain a characterization of inverse Stackelberg solutions and, under additional concavity conditions, establish an existence theorem.

Keywords:

inverse Stackelberg games; incentives; differential games

1. Introduction

The paper is concerned with the inverse Stackelberg game, also known as the incentive problem. In ordinary Stackelberg games, one player (called the leader) announces his strategy while the other players (called followers) maximize their payoffs using this information. In inverse Stackelberg games, the leader announces an incentive strategy, i.e., a reaction to the followers’ strategies ([1,2,3,4,5] and references therein). For dynamic cases, the reaction should be nonanticipative.

Inverse Stackelberg games appear in several models (see, for example, [6,7,8]). In games with many followers, it is often assumed that the followers play a Nash game ([6,9,10]). If the strategy sets are normed spaces, then the incentive strategy can be constructed in affine form (Ref. [11] for static games and Ref. [12] for differential games).

In this paper, we consider the case where the control spaces of the players are compact metric spaces. We consider both static and dynamic cases. Moreover, for the dynamic case, we apply punishment strategies. The concept of punishment strategies was first used for the analysis of Stackelberg games in the class of feedback strategies in Ref. [13]. The inverse Stackelberg solutions of two-person differential games were studied via punishment strategies in the paper by Averboukh and Baklanov [14]. In that paper, the authors described the set of inverse Stackelberg solutions and derived an existence result. In particular, the set of inverse Stackelberg payoffs is equal to the set of feedback Stackelberg payoffs. Note that the incentive strategies considered in Ref. [14] use full memory, i.e., the leader plays with the nonanticipating strategies proposed in the papers by Elliot and Kalton [15] and Varaiya and Lin [16] for zero-sum differential games. The use of strategies that depend only on the current control of the follower decreases the payoffs.

In this paper, punishment strategies are applied to static inverse Stackelberg games and to differential inverse Stackelberg games with many followers. We obtain a characterization of the inverse Stackelberg solution and, under additional concavity conditions, establish an existence theorem.

The paper is organized as follows. Section 2 is concerned with the static inverse Stackelberg game for a case with n followers. The differential game case is considered in Section 3. In Section 4, we prove the existence theorem for the inverse Stackelberg solution of a differential game.

2. Static Games

We denote the leader by 0. Further, we designate the followers by

1, \dots, n

. Player i has a set of strategies (

P_{i}

) and a payoff function (

J_{i} : P_{0} \times P_{1} \times \dots \times P_{n} \to R

). We assume that the sets (

P_{i}

) are compact and the functions (

J_{i}

) are continuous.

The incentive strategy of the leader is a mapping:

α : \times_{i = 1}^{n} P_{i} \to P_{0} .

To define the inverse Stackelberg game, we specify the solution concept used by followers. We suppose that the followers play the Nash game. Let

P = \times_{i = 1}^{n} P_{i} .

An element (

u = (u_{1}, \dots, u_{n})

) of P is a profile of the followers’ strategies. If

u_{i}^{'} \in P_{i}

then

(u_{i}^{'}, u_{- i})

is the profile of strategies

(u_{1}, \dots, u_{i - 1}, u_{i}^{'}, u_{i + 1}, \dots, u_{n})

. For simplification, we write

J_{i} (u_{0}, u)

to denote

J_{i} (u_{0}, u_{1}, \dots, u_{n})

. Furthermore, we put

J_{i} (u_{0}, u_{i}^{'}, u_{- i}) ≜ J_{i} (u_{0}, (u_{i}^{'}, u_{- i}))

. If

α

is an incentive strategy of the leader and u is a profile of strategies of the followers, then we denote

J_{i} [α, u] ≜ J_{i} (α [u], u)

,

J_{i} [α, u_{i}^{'}, u_{- i}] ≜ J_{i} [α, (u_{i}^{'}, u_{- i})]

. Further, let

E (α)

be the set of the followers’ Nash equilibria for the case where the leader uses the incentive strategy

α

:

E (α) ≜ \{u \in P : J_{i} [α, u] \geq J_{i} [α, u_{i}^{'}, u_{- i}] \text{ for any } i = \overline{1, n} \text{ and any } u_{i}^{'} \in P_{i}\} .

Definition 1.

The pair

(α^{*}, u^{*})

is an inverse Stackelberg solution in the game with one leader and n followers playing the Nash equilibrium if

(1): $u^{*} \in E (α^{*})$ ;
(2): $J_{0} [α^{*}, u^{*}] = \max_{α} \max_{u \in E (α)} J_{0} [α, u]$ .
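For finite strategy sets, the set $E (α)$ can be computed by direct enumeration. The following is a minimal sketch under that assumption; the function name `follower_nash_set` and the toy payoffs are hypothetical and are not taken from the paper. The point it illustrates is that a deviation of follower i changes the leader's action as well, because the leader reacts through α.

```python
from itertools import product

def follower_nash_set(strategy_sets, payoffs, alpha):
    """Return E(alpha) for finite follower strategy sets.

    strategy_sets: list of finite lists P_1, ..., P_n (followers only).
    payoffs: list of functions J_i(u0, u), one per follower.
    alpha: incentive strategy, maps a follower profile u to a leader action.
    """
    equilibria = []
    for u in product(*strategy_sets):
        stable = True
        for i, P_i in enumerate(strategy_sets):
            current = payoffs[i](alpha(u), u)
            for dev in P_i:
                v = u[:i] + (dev,) + u[i + 1:]
                # The leader's reaction alpha(v) changes with the deviation.
                if payoffs[i](alpha(v), v) > current:
                    stable = False
        if stable:
            equilibria.append(u)
    return equilibria

# Hypothetical toy game: two followers with binary strategies.
P = [[0, 1], [0, 1]]
J1 = lambda u0, u: 1.0 if u[0] == u0 else 0.0    # follower 1 wants to match the leader
J2 = lambda u0, u: 1.0 if u[1] == u[0] else 0.0  # follower 2 wants to match follower 1
alpha = lambda u: u[1]                           # the leader copies follower 2

print(follower_nash_set(P, [J1, J2], alpha))
```

In this toy game the two coordinated profiles are the only equilibria, since a follower who breaks coordination is no longer matched by the leader's reaction.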

The structure of the inverse Stackelberg solution is given in the following statements. Denote

B ≜ \{(u_{0}^{♮}, u^{♮}) \in P_{0} \times P : J_{i} (u_{0}^{♮}, u^{♮}) \geq \max_{u_{i} \in P_{i}} \min_{u_{0} \in P_{0}} J_{i} (u_{0}, u_{i}, u_{- i}^{♮}) \text{ for any } i = \overline{1, n}\} .

Lemma 1.

The following properties hold true:

(1): If $u^{♮} \in E (α)$ , then $(α [u^{♮}], u^{♮}) \in B$ ;
(2): If the strategy of the leader $u_{0}^{♮}$ and the profile of the followers’ strategies $u^{♮}$ satisfy $(u_{0}^{♮}, u^{♮}) \in B$ , then an incentive strategy of the leader α exists such that $u^{♮} \in E (α)$ .

Proof.

To prove the first statement of the lemma, for each i we pick

{\hat{u}}_{i} \in P_{i}

attaining the maximum in

\max_{u_{i} \in P_{i}} \min_{u_{0} \in P_{0}} J_{i} (u_{0}, u_{i}, u_{- i}^{♮}) .

Using the definition of the set

E (α)

, for

u_{0}^{♮} ≜ α [u^{♮}]

and each

i = 1, \dots, n

, we have

\begin{aligned} J_{i} (u_{0}^{♮}, u^{♮}) &= J_{i} [α, u^{♮}] \geq J_{i} [α, {\hat{u}}_{i}, u_{- i}^{♮}] = J_{i} (α [{\hat{u}}_{i}, u_{- i}^{♮}], {\hat{u}}_{i}, u_{- i}^{♮}) \\ &\geq \min_{u_{0} \in P_{0}} J_{i} (u_{0}, {\hat{u}}_{i}, u_{- i}^{♮}) = \max_{u_{i} \in P_{i}} \min_{u_{0} \in P_{0}} J_{i} (u_{0}, u_{i}, u_{- i}^{♮}) . \end{aligned}

Thus,

(α [u^{♮}], u^{♮}) \in B

.

Now, let us prove the second statement of the lemma.

For

u_{i} \in P_{i}

let

β_{i} [u_{i}] \in \operatorname{Argmin} \{J_{i} (u_{0}, u_{i}, u_{- i}^{♮}) : u_{0} \in P_{0}\}

. Further, an arbitrary

{\bar{u}}_{0} \in P_{0}

is picked.

Put

α [u_{1}, \dots, u_{n}] ≜ \begin{cases} u_{0}^{♮}, & u_{i} = u_{i}^{♮}, \ i = 1, \dots, n, \\ β_{i} [u_{i}], & u_{i} \neq u_{i}^{♮}, \ u_{j} = u_{j}^{♮}, \ j \neq i, \\ {\bar{u}}_{0}, & \text{otherwise} . \end{cases}

First, notice that

α [u^{♮}] = u_{0}^{♮}

. Further, if

u \in P

is such that

u_{i} \neq u_{i}^{♮}

for some i and, for all other j,

u_{j} = u_{j}^{♮}

, then

J_{i} (α [u], u) = J_{i} (β_{i} [u_{i}], u_{i}, u_{- i}^{♮}) = \min_{u_{0} \in P_{0}} J_{i} (u_{0}, u_{i}, u_{- i}^{♮}) \leq \max_{u_{i}^{'} \in P_{i}} \min_{u_{0} \in P_{0}} J_{i} (u_{0}, u_{i}^{'}, u_{- i}^{♮}) \leq J_{i} (u_{0}^{♮}, u^{♮}) = J_{i} [α, u^{♮}] .

This proves the second statement of the lemma. ☐
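The construction used in the proof can be sketched in code for finite strategy sets. This is an illustration only: the helper names `punishment_strategy` and `is_equilibrium` are hypothetical, and the payoffs are patterned on the battle-of-the-sexes tables of Example 1 with the assumed values a = 2, b = 1, reading rows as follower 1 and columns as follower 2 (also an assumption).

```python
def punishment_strategy(P0, u0_nat, u_nat, payoffs, fallback):
    """Incentive strategy built as in the proof of Lemma 1 (finite P0).

    payoffs[i](u0, u) is the payoff of follower i + 1;
    fallback is an arbitrary element of P0 used after several deviations.
    """
    def alpha(u):
        deviators = [i for i, (x, y) in enumerate(zip(u, u_nat)) if x != y]
        if not deviators:
            return u0_nat                  # no deviation: play the agreed action
        if len(deviators) == 1:
            i = deviators[0]
            # a single deviator is punished: minimize that follower's payoff
            return min(P0, key=lambda u0: payoffs[i](u0, u))
        return fallback                    # several deviators: any action
    return alpha

def is_equilibrium(alpha, u, payoffs, S):
    """Check that no follower gains by a unilateral deviation under alpha."""
    for i in range(len(u)):
        base = payoffs[i](alpha(u), u)
        for dev in S:
            v = u[:i] + (dev,) + u[i + 1:]
            if payoffs[i](alpha(v), v) > base:
                return False
    return True

a, b = 2.0, 1.0  # assumed values with a > b > 0
# rewards[u0][u1][u2] = (payoff of follower 1, payoff of follower 2)
rewards = {0: [[(a, b), (0.0, 0.0)], [(0.0, 0.0), (b, a)]],
           1: [[(0.0, 0.0), (a, b)], [(b, a), (0.0, 0.0)]]}
payoffs = [lambda u0, u, i=i: rewards[u0][u[0]][u[1]][i] for i in range(2)]

u0_nat, u_nat = 1, (0, 0)   # a pair from the set B of this toy game
alpha = punishment_strategy([0, 1], u0_nat, u_nat, payoffs, fallback=0)
print(alpha(u_nat), is_equilibrium(alpha, u_nat, payoffs, [0, 1]))
```

The check confirms the two properties used in the proof: the strategy returns the agreed leader action on the agreed profile, and a single deviator cannot obtain more than the guaranteed level.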

Theorem 1.

(1) If

(α^{*}, u^{*})

is an inverse Stackelberg solution, then the pair

(u_{0}^{*}, u^{*})

with

u_{0}^{*} = α^{*} [u^{*}]

maximizes

J_{0}

over the set

B

. (2) If the pair

(u_{0}^{*}, u^{*})

maximizes

J_{0}

over the set

B

, then an incentive strategy

α^{*}

exists such that

α^{*} [u^{*}] = u_{0}^{*}

, and

(α^{*}, u^{*})

is an inverse Stackelberg solution. (3) If the function

u_{i}^{'} \mapsto J_{i} (u_{0}, u_{i}^{'}, u_{- i})

is quasi-concave for all

u_{0}

,

u_{- i}

, and

i = 1, \dots, n

, then at least one inverse Stackelberg solution exists.

Proof.

The proof of the first two statements directly follows from Lemma 1.

Let us prove the third statement of the theorem. Put

K_{i} (u_{1}, \dots, u_{n}) ≜ \min_{u_{0} \in P_{0}} J_{i} (u_{0}, u_{1}, \dots, u_{n}) .

The functions

u_{i}^{'} \mapsto K_{i} (u_{i}^{'}, u_{- i})

are quasi-concave for all

u_{- i}

. Therefore, a profile of followers’ strategies

u^{♮}

exists such that, for all i and all

u_{i} \in P_{i}

,

K_{i} (u^{♮}) \geq K_{i} (u_{i}, u_{- i}^{♮})

. Hence, any pair

(u_{0}, u^{♮})

belongs to

B

. Consequently,

B

is nonempty. Moreover, the set

B

is compact. This proves the existence of the pair

(u_{0}^{*}, u^{*})

maximizing

J_{0}

over the set

B

. The existence of an inverse Stackelberg solution follows directly from the second statement of the theorem. ☐

Example 1.

Consider a game with two followers. Let the set of strategies of the players be equal to

{0, 1}

. In addition, let the followers’ rewards for

u_{0} = 0

be

a, b	0, 0
0, 0	b, a

where

a > b > 0

. Further, let the followers’ rewards for

u_{0} = 1

be given by

0, 0	a, b
b, a	0, 0

Finally, we assume that the leader’s reward is equal to 1 when the followers’ rewards are

(0, 0)

and 0 otherwise. One can consider this game as a variant of the battle of the sexes with a leader who can switch the roles of the players and wins when the followers fail to coordinate.

It is easy to check that the set

B

is equal to the set of all strategy profiles

{0, 1}^{3}

. Maximizing the leader’s payoff over this set, we obtain, for example, the outcome

(1, 0, 0)

.

It is instructive to compare the result with the case where the leader declares his strategy first. Clearly, in this case, whatever the leader’s strategy is, the leader’s outcome is 0, whereas the followers’ Nash equilibrium payoffs are

(a, b)

and

(b, a)

.
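The computation in Example 1 can be reproduced by enumeration. The sketch below assumes the concrete values a = 2, b = 1 and reads the rows of the reward tables as the strategy of follower 1 and the columns as the strategy of follower 2; both choices are assumptions made for illustration, and the helper names are hypothetical.

```python
from itertools import product

a, b = 2.0, 1.0  # any values with a > b > 0 (assumed for illustration)
S = [0, 1]

# rewards[u0][u1][u2] = (payoff of follower 1, payoff of follower 2)
rewards = {
    0: [[(a, b), (0.0, 0.0)], [(0.0, 0.0), (b, a)]],
    1: [[(0.0, 0.0), (a, b)], [(b, a), (0.0, 0.0)]],
}

def J(i, u0, u):
    """Payoff of follower i (i = 1 or 2)."""
    return rewards[u0][u[0]][u[1]][i - 1]

def J0(u0, u):
    """The leader gets 1 exactly when both followers get 0."""
    return 1.0 if rewards[u0][u[0]][u[1]] == (0.0, 0.0) else 0.0

def guaranteed(i, u):
    """max over u_i of min over u_0 of J_i, the other follower fixed as in u."""
    best = float("-inf")
    for dev in S:
        v = list(u)
        v[i - 1] = dev
        best = max(best, min(J(i, u0, tuple(v)) for u0 in S))
    return best

# The set B: every follower receives at least the guaranteed level.
B = [(u0, u) for u0 in S for u in product(S, S)
     if all(J(i, u0, u) >= guaranteed(i, u) for i in (1, 2))]

best_value = max(J0(u0, u) for u0, u in B)
maximizers = [(u0,) + u for u0, u in B if J0(u0, u) == best_value]
print(len(B), best_value, maximizers)
```

Under these assumptions the set B contains all eight profiles, the leader's maximal payoff is 1, and (1, 0, 0) is one of several maximizers.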

3. Inverse Stackelberg Solution for Differential Games

As above, we assume that player 0 is the leader while players

1, \dots, n

are followers. The dynamics of the system is given by the equation

\dot{x} = f (t, x, u_{0}, u_{1}, \dots, u_{n}), t \in [0, T], x \in R^{d}, x (0) = x_{0}, u_{i} \in P_{i} .

(1)

Player i wishes to maximize the payoff

σ_{i} (x (T)) + \int_{0}^{T} g_{i} (t, x, u_{0}, u_{1}, \dots, u_{n}) d t .

The set

U_{i} = \{u_{i} : [0, T] \to P_{i} \text{ measurable}\}

is the set of open-loop strategies of player i. As above, the n-tuple of open-loop strategies of followers (

u = (u_{1}, \dots, u_{n})

) is called the profile of strategies. To simplify notations, denote

f (t, x, u_{0}, u) ≜ f (t, x, u_{0}, u_{1}, \dots, u_{n}), \quad g_{i} (t, x, u_{0}, u) ≜ g_{i} (t, x, u_{0}, u_{1}, \dots, u_{n}) .

Further, put

U = \times_{i = 1}^{n} U_{i} .

If

u_{0} \in U_{0}

,

u = (u_{1}, \dots, u_{n}) \in U

,

(t_{*}, x_{*}) \in [0, T] \times R^{d}

, then denote by

x (\cdot, t_{*}, x_{*}, u_{0}, u)

the solution of initial value problem

\dot{x} (t) = f (t, x (t), u_{0} (t), u_{1} (t), \dots, u_{n} (t)), x (t_{*}) = x_{*} .

Put

z_{i} (t, t_{*}, x_{*}, u_{0}, u) = \int_{t_{*}}^{t} g_{i} (s, x (s), u_{0} (s), u_{1} (s), \dots, u_{n} (s)) d s, \quad x (s) = x (s, t_{*}, x_{*}, u_{0}, u) .

If

t_{*} = 0

,

x_{*} = x_{0}

we omit the arguments

t_{*}

and

x_{*}

. Let

z (\cdot, t_{*}, x_{*}, u_{0}, u) = (z_{0} (\cdot, t_{*}, x_{*}, u_{0}, u), z_{1} (\cdot, t_{*}, x_{*}, u_{0}, u), \dots, z_{n} (\cdot, t_{*}, x_{*}, u_{0}, u))

. We assume that the set of motions is closed, i.e., for all

(t_{*}, x_{*}) \in [0, T] \times R^{d}

,

\begin{aligned} & cl \{(x (\cdot, t_{*}, x_{*}, u_{0}, u), z (\cdot, t_{*}, x_{*}, u_{0}, u)) : u_{0} \in U_{0}, u \in U\} \\ & \quad = \{(x (\cdot, t_{*}, x_{*}, u_{0}, u), z (\cdot, t_{*}, x_{*}, u_{0}, u)) : u_{0} \in U_{0}, u \in U\} . \end{aligned}

Here,

cl

stands for the closure in the space of continuous functions from

[0, T]

to

R^{d}

.

We assume that the followers use open-loop strategies (

u_{i} \in U_{i}

) while the leader’s strategy is a nonanticipative strategy (

α : U \to U_{0}

). The nonanticipation property means that

α [u] (τ) = α [u^{'}] (τ)

for any u and

u^{'}

coinciding on

[0, τ]

.
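The nonanticipation property can be tested mechanically once time is discretized. The sketch below is a discrete-time analogue, which is an assumption and not part of the paper's setting: controls are finite sequences, there is a single follower, and the name `is_nonanticipative` is hypothetical.

```python
from itertools import product

def is_nonanticipative(alpha, controls, horizon):
    """Check on a finite time grid that alpha(u)[k] depends only on u[0..k].

    alpha maps a tuple of follower controls of length `horizon` to a tuple
    of leader controls of the same length; `controls` is a finite control set.
    """
    seqs = list(product(controls, repeat=horizon))
    for u, v in product(seqs, seqs):
        for k in range(horizon):
            # u and v coincide up to step k, so the reactions at k must coincide
            if u[:k + 1] == v[:k + 1] and alpha(u)[k] != alpha(v)[k]:
                return False
    return True

# A causal map: running maximum of the follower's past controls.
causal = lambda u: tuple(max(u[:k + 1]) for k in range(len(u)))
# A map that looks at the final control value, hence anticipates the future.
peeking = lambda u: tuple(u[-1] for _ in u)

print(is_nonanticipative(causal, [0, 1], 3))
print(is_nonanticipative(peeking, [0, 1], 3))
```

The first map passes the check and the second fails it, which mirrors the requirement that the leader's reaction at time τ may use the followers' controls only on [0, τ].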

For

u_{0} \in U_{0}

,

u \in U

,

(t_{*}, x_{*})

define

J_{i} (t_{*}, x_{*}, u_{0}, u) ≜ σ_{i} (x (T, t_{*}, x_{*}, u_{0}, u)) + z_{i} (T, t_{*}, x_{*}, u_{0}, u) .

Further, put

J_{i} [t_{*}, x_{*}, α, u] ≜ J_{i} (t_{*}, x_{*}, α [u], u) .

We omit the arguments

t_{*}

and

x_{*}

if

t_{*} = 0

,

x_{*} = x_{0}

.
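For given open-loop controls, the payoffs can be approximated numerically. The following is a rough sketch using the explicit Euler scheme with a scalar state; the scheme, the function name `payoffs_euler`, and the toy dynamics are assumptions made for illustration and do not come from the paper.

```python
def payoffs_euler(f, g, sigma, x0, controls, T, steps):
    """Approximate J_i = sigma_i(x(T)) + integral of g_i over [0, T].

    f(t, x, u): right-hand side of the dynamics, u = (u_0(t), ..., u_n(t));
    g(t, x, u): list of running payoffs, one per player;
    sigma(x): list of terminal payoffs, one per player;
    controls(t): tuple of all players' control values at time t.
    """
    dt = T / steps
    x = x0
    z = [0.0] * len(sigma(x0))
    for k in range(steps):
        t = k * dt
        u = controls(t)
        running = g(t, x, u)
        z = [zi + dt * gi for zi, gi in zip(z, running)]  # integral payoffs z_i
        x = x + dt * f(t, x, u)                            # Euler step for x
    return [s + zi for s, zi in zip(sigma(x), z)]

# Hypothetical toy data: one leader, one follower, scalar state.
f = lambda t, x, u: u[0] + u[1]
g = lambda t, x, u: [-u[0] ** 2, -u[1] ** 2]
sigma = lambda x: [x, -x]
controls = lambda t: (1.0, -0.5)

values = payoffs_euler(f, g, sigma, 0.0, controls, T=1.0, steps=100)
print(values)
```

With constant controls the state grows linearly, so the leader's payoff is close to 0.5 - 1 = -0.5 and the follower's payoff is close to -0.5 - 0.25 = -0.75.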

We assume that the followers’ solution concept is Nash equilibrium. Let

E_{d} (α)

denote the set of Nash equilibria in the case when the leader plays with the nonanticipating strategy

α

:

E_{d} (α) ≜ \{u \in U : J_{i} [α, u] \geq J_{i} [α, u_{i}^{'}, u_{- i}] \text{ for all } u_{i}^{'} \in U_{i} \text{ and any } i = \overline{1, n}\} .

Denote the set of nonanticipating strategies by

Γ^{*}

.

Definition 2.

The pair consisting of a nonanticipative strategy of the leader (

α^{*}

) and

u^{*} \in U

is an inverse Stackelberg solution of the differential game if

(1): $u^{*} \in E_{d} (α^{*})$ ;
(2): $J_{0} [α^{*}, u^{*}] = \max_{α} \max_{u \in E_{d} (α)} J_{0} [α, u]$ .

The proposed definition is analogous to the definition of the inverse Stackelberg solution for static games. The characterization in the differential game case is close to the characterization in the static game case.

For a fixed profile

u_{- i}

of strategies of all followers but the i-th, one can consider the zero-sum differential game of player 0 and player i. In this case, we assume that player 0 uses nonanticipating strategies on

[t_{*}, T]

which are mappings (

β_{i} : U_{i} \to U_{0}

) that satisfy the feasibility condition: if

u_{i}^{'} = u_{i}^{″}

on

[t_{*}, τ]

, then

β_{i} [u_{i}^{'}] = β_{i} [u_{i}^{″}]

on

[t_{*}, τ]

. Denote the set of feasible mappings

β_{i} : U_{i} \to U_{0}

by

Γ_{i} [t_{*}]

. The lower value of this game is

V_{i}^{-} (t_{*}, x_{*}, u_{- i}) ≜ min_{β_{i} \in Γ_{i} [t_{*}]} max_{u_{i}^{'} \in U_{i}} J_{i} (t_{*}, x_{*}, β_{i} [u_{i}^{'}], u_{i}^{'}, u_{- i}) .

Let

C = \{(u_{0}, u) \in U_{0} \times U : J_{i} (t, x (t), u_{0}, u) \geq V_{i}^{-} (t, x (t), u_{- i}) \text{ for any } i = \overline{1, n}, \ t \in [0, T], \text{ where } x (\cdot) = x (\cdot, u_{0}, u)\} .

Lemma 2.

Let α be an incentive strategy of the leader. If

u^{♮} \in E_{d} (α)

, then

(α [u^{♮}], u^{♮}) \in C

.

Proof.

Denote

u_{0}^{♮} ≜ α [u^{♮}] .

We claim that

J_{i} (t, x^{♮} (t), u_{0}^{♮}, u^{♮}) \geq J_{i} (t, x^{♮} (t), α [u_{i}^{'}, u_{- i}^{♮}], u_{i}^{'}, u_{- i}^{♮})

(2)

for any

t \in [0, T]

and

u_{i}^{'} \in U_{i}

, where

x^{♮} = x (\cdot, u_{0}^{♮}, u^{♮})

. Assume the converse. This means that, for some

u_{i}^{'}

and

τ

,

J_{i} (τ, x^{♮} (τ), u_{0}^{♮}, u^{♮}) < J_{i} (τ, x^{♮} (τ), α [u_{i}^{'}, u_{- i}^{♮}], u_{i}^{'}, u_{- i}^{♮}) .

(3)

Let us introduce the control (

u_{i}^{♭}

) by the following rule:

u_{i}^{♭} (t) ≜ \begin{cases} u_{i}^{♮} (t), & t \in [0, τ], \\ u_{i}^{'} (t), & t \in (τ, T] . \end{cases}

Further, denote

u_{0}^{♭} ≜ α [u_{i}^{♭}, u_{- i}^{♮}],

x^{♭} (\cdot) = x (\cdot, u_{0}^{♭}, (u_{i}^{♭}, u_{- i}^{♮})) .

We have

J_{i} [α, u_{i}^{♭}, u_{- i}^{♮}] = σ_{i} (x^{♭} (T)) + \int_{0}^{T} g_{i} (t, x^{♭} (t), u_{0}^{♭} (t), (u_{i}^{♭} (t), u_{- i}^{♮} (t))) d t .

Since, for

t \in [0, τ]

,

u_{i}^{♭} (t) = u_{i}^{♮} (t), \quad u_{0}^{♭} (t) = u_{0}^{♮} (t) = α [u^{♮}] (t), \quad x^{♭} (t) = x^{♮} (t),

and, for

t \in [τ, T]

,

x^{♭} (t) = x (t, τ, x^{♮} (τ), u_{0}^{♭}, (u_{i}^{'}, u_{- i}^{♮})),

inequality (3) implies the following inequality:

J_{i} [α, u_{i}^{♭}, u_{- i}^{♮}] > \int_{0}^{τ} g_{i} (t, x^{♮} (t), u_{0}^{♮} (t), u^{♮} (t)) d t + J_{i} (τ, x^{♮} (τ), u_{0}^{♮}, u^{♮}) = J_{i} [α, u^{♮}] .

This contradicts the assumption that

u^{♮} \in E_{d} (α)

.

The inequality (2) yields the inequality

J_{i} [t, x^{♮} (t), u_{0}^{♮}, u^{♮}] \geq V_{i}^{-} (t, x^{♮} (t), u_{- i}^{♮})

. ☐

Lemma 3.

For any

(u_{0}^{♮}, u^{♮}) \in C

, a nonanticipative strategy of the leader (α) exists so that

α [u^{♮}] = u_{0}^{♮}

and

u^{♮} \in E_{d} (α)

.

Proof.

Denote

x^{♮} (\cdot) = x (\cdot, u_{0}^{♮}, u^{♮})

.

Pick

u \in U

. Let

i_{1}, i_{2}, \dots, i_{n} \in \overline{1, n}

, and let

τ_{i_{1}}, \dots, τ_{i_{n}} \in [0, T]

satisfy the following properties

(1): $i_{1}, \dots, i_{n}$ is a permutation of $1, \dots, n$ ;
(2): $τ_{i_{1}} \leq τ_{i_{2}} \leq \dots \leq τ_{i_{n}}$ ;
(3): for each k, $τ_{i_{k}}$ is the greatest time such that $u_{i_{k}} = u_{i_{k}}^{♮}$ on $[0, τ_{i_{k}}]$ .

Let

y_{i_{1}} = x^{♮} (τ_{i_{1}})

. The mapping

β^{i_{1}} \in Γ_{i_{1}} [τ_{i_{1}}]

exists such that

V_{i_{1}}^{-} (τ_{i_{1}}, y_{i_{1}}, u_{- i_{1}}^{♮}) = \max_{u_{i_{1}} \in U_{i_{1}}} J_{i_{1}} (τ_{i_{1}}, y_{i_{1}}, β^{i_{1}} [u_{i_{1}}], u_{i_{1}}, u_{- i_{1}}^{♮}) .

Further, pick

{\bar{u}}_{0} \in U_{0}

arbitrarily.

Put

α [u] (t) ≜ \begin{cases} u_{0}^{♮} (t), & t \in [0, τ_{i_{1}}]; \\ β^{i_{1}} [u_{i_{1}}] (t), & t \in (τ_{i_{1}}, τ_{i_{2}}]; \\ {\bar{u}}_{0} (t), & t \in (τ_{i_{2}}, T] . \end{cases}

Notice that

α [u^{♮}] = u_{0}^{♮}

. Now let

u = (u_{i}^{'}, u_{- i}^{♮})

. Denote by

τ

the greatest time such that

u_{i}^{'} = u_{i}^{♮}

on

[0, τ]

. In this case,

i_{1} = i

,

τ_{i_{1}} = τ

,

τ_{i_{k}} = T

for

k = 2, \dots, n

. By construction, we have

\begin{aligned} J_{i} [α, u_{i}^{'}, u_{- i}^{♮}] &= \int_{0}^{τ} g_{i} (t, x^{♮} (t), u_{0}^{♮} (t), u^{♮} (t)) d t + J_{i} [τ, x^{♮} (τ), α, u_{i}^{'}, u_{- i}^{♮}] \\ &\leq \int_{0}^{τ} g_{i} (t, x^{♮} (t), u_{0}^{♮} (t), u^{♮} (t)) d t + V_{i}^{-} (τ, x^{♮} (τ), u_{- i}^{♮}) \\ &\leq \int_{0}^{τ} g_{i} (t, x^{♮} (t), u_{0}^{♮} (t), u^{♮} (t)) d t + J_{i} (τ, x^{♮} (τ), u_{0}^{♮}, u^{♮}) = J_{i} [α, u^{♮}] . \end{aligned}

☐

Theorem 2.

(1) If the pair

(α^{*}, u^{*})

is an inverse Stackelberg solution, then

(u_{0}^{*}, u^{*}) \in C

, and

(u_{0}^{*}, u^{*})

maximizes

J_{0}

over the set

C

for

u_{0}^{*} = α^{*} [u^{*}]

. (2) Conversely, if the pair

(u_{0}^{*}, u^{*})

maximizes

J_{0}

over the set

C

, then an incentive strategy of the leader

α^{*}

exists such that

α^{*} [u^{*}] = u_{0}^{*}

and

(α^{*}, u^{*})

is an inverse Stackelberg solution.

The theorem follows directly from Lemmas 2 and 3.

4. Existence of the Inverse Stackelberg Solution for Differential Game

In this section, we consider the differential game in mixed strategies. This means that we replace system (1) with the control system described by the following equation:

\dot{x} (t) = \int_{P_{0}} \int_{P_{1}} \dots \int_{P_{n}} f (t, x (t), u_{0}, u_{1}, \dots, u_{n}) μ_{n} (t, d u_{n}) \dots μ_{1} (t, d u_{1}) μ_{0} (t, d u_{0}) .

(4)

Here,

μ_{i} (t, \cdot)

are probability measures on

P_{i}

.
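When every measure has finite support, the right-hand side of (4) reduces to a weighted sum over the product of the supports. The sketch below illustrates this special case only; the finite-support assumption, the scalar state, and the name `relaxed_rhs` are not taken from the paper.

```python
from itertools import product

def relaxed_rhs(f, t, x, supports, weights):
    """Average f over the product measure mu_0(t) x mu_1(t) x ... x mu_n(t).

    supports[i]: finite list of control values of player i;
    weights[i]: matching probabilities mu_i(t, {value}), summing to 1.
    """
    total = 0.0
    for idx in product(*(range(len(s)) for s in supports)):
        prob = 1.0
        for i, j in enumerate(idx):
            prob *= weights[i][j]            # the players randomize independently
        u = tuple(supports[i][j] for i, j in enumerate(idx))
        total += prob * f(t, x, u)
    return total

# Hypothetical data: a leader and one follower with two control values each.
supports = [[-1.0, 1.0], [0.0, 1.0]]
weights = [[0.5, 0.5], [0.25, 0.75]]

additive = lambda t, x, u: x + u[0] + u[1]
bilinear = lambda t, x, u: u[0] * u[1]

print(relaxed_rhs(additive, 0.0, 1.0, supports, weights))
print(relaxed_rhs(bilinear, 0.0, 1.0, supports, weights))
```

The product structure matters for the bilinear example: the average is the product of the two mean controls, which is zero here because the leader's mean control is zero.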

The relaxation means that we replace the control spaces

P_{i}

with the spaces

rpm (P_{i})

of probability measures on

P_{i}

. Therefore, the open-loop strategy of the i-th player is a weakly measurable function

μ_{i} : [0, T] \to rpm (P_{i})

. This means that the mapping

t \mapsto \int_{P_{i}} φ (u_{i}) μ_{i} (t, d u_{i})

is measurable for any continuous function

φ \in C (P_{i})

. The set of open-loop strategies of the i-th player is denoted by

M_{i}

.

Further, we use the following designations. Put

P ≜ \times_{j = 1}^{n} P_{j}, P_{- i} ≜ \times_{j \neq i} P_{j} .

If

m_{j} \in rpm (P_{j})

,

j = 1, \dots, n

, then denote

m (d u) = m_{1} (d u_{1}) \dots m_{n} (d u_{n})

with a slight abuse of notation. Further, for

φ \in C (P)

,

\int_{P} φ (u) m (d u) = \int_{P_{1}} \dots \int_{P_{n}} φ (u_{1}, \dots, u_{n}) m_{1} (d u_{1}) \dots m_{n} (d u_{n}) .

Analogously, we assume that

m_{- i} (d u_{- i}) ≜ \times_{j \neq i} m_{j} (d u_{j})

. Thus,

\begin{matrix} \int_{P_{- i}} φ (u_{- i}) m_{- i} (d u_{- i}) = \int_{P_{1}} \dots \int_{P_{i - 1}} \int_{P_{i + 1}} \dots \int_{P_{n}} φ (u_{1}, \dots, u_{i - 1}, u_{i + 1}, \dots, u_{n}) \\ m_{1} (d u_{1}) \dots m_{i - 1} (d u_{i - 1}) m_{i + 1} (d u_{i + 1}) \dots m_{n} (d u_{n}) . \end{matrix}

If

(t_{*}, x_{*}) \in [0, T] \times R^{d}

,

μ_{0} \in M_{0}

,

μ_{1} \in M_{1}

,…,

μ_{n} \in M_{n}

, then we denote the solution of the initial value problem for equation (4) and the position

(t_{*}, x_{*})

by

x (\cdot, t_{*}, x_{*}, μ_{0}, μ_{1}, \dots, μ_{n})

.

As above, we call the n-tuple

μ = (μ_{1}, \dots, μ_{n})

the profile of followers’ mixed strategies. Denote the set of such profiles by

M

. Put

x (\cdot, t_{*}, x_{*}, μ_{0}, μ) = x (\cdot, t_{*}, x_{*}, μ_{0}, μ_{1}, \dots, μ_{n})

,

x (\cdot, t_{*}, x_{*}, μ_{0}, μ_{i}^{'}, μ_{- i}) = x (\cdot, t_{*}, x_{*}, μ_{0}, (μ_{i}^{'}, μ_{- i}))

.

For the given position

(t_{*}, x_{*}) \in [0, T] \times R^{d}

and measures

μ_{0} \in M_{0}

,

μ \in M

, the corresponding payoff of player i is equal to

J_{i} (t_{*}, x_{*}, μ_{0}, μ) = σ_{i} (x (T, t_{*}, x_{*}, μ_{0}, μ)) + \int_{t_{*}}^{T} \int_{P_{0}} \int_{P} g_{i} (t, x (t, t_{*}, x_{*}, μ_{0}, μ), u_{0}, u) μ_{0} (t, d u_{0}) μ (t, d u) d t .

As above, the mapping

α : M \to M_{0}

satisfying the condition of feasibility (the equality of

μ^{'}

and

μ^{″}

on

[0, τ]

yields the equality

α [μ^{'}] (t, \cdot) = α [μ^{″}] (t, \cdot)

on

[0, τ]

) is called the nonanticipative strategy. We denote the set of nonanticipating strategies by

Γ^{*}

. Analogously, the set of mappings

β_{i} : M_{i} \to M_{0}

satisfying the feasibility property on

[t_{*}, T]

is denoted by

Γ_{i} [t_{*}]

.

Further, we use nonanticipating strategies of player i. Such a strategy is a mapping

γ_{i} : M_{0} \to M_{i}

satisfying the feasibility property on

[t_{*}, T]

: if

μ_{0}^{'} = μ_{0}^{″}

on

[t_{*}, τ]

, then

γ_{i} [μ_{0}^{'}] = γ_{i} [μ_{0}^{″}]

on

[t_{*}, τ]

. Let

N_{i}

stand for the set of nonanticipating strategies of player i on

[t_{*}, T]

. By using these strategies, one can introduce the upper value function by the rule: if

(t_{*}, x_{*}) \in [0, T] \times R^{d}

,

μ_{1} \in M_{1}

, …,

μ_{i - 1} \in M_{i - 1}

,

μ_{i + 1} \in M_{i + 1}

,…,

μ_{n} \in M_{n}

, then

V_{i}^{+} (t_{*}, x_{*}, μ_{- i}) ≜ \max_{γ_{i} \in N_{i}} \min_{μ_{0} \in M_{0}} J_{i} (t_{*}, x_{*}, μ_{0}, γ_{i} [μ_{0}], μ_{- i}) .

Generally,

V_{i}^{+} (t_{*}, x_{*}, μ_{- i}) \geq V_{i}^{-} (t_{*}, x_{*}, μ_{- i}) .

(5)
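A static analogue may help to see why inequality (5) is natural. In a finite matrix game, letting the leader react to the follower's action gives the value max-min, while letting the follower react to the leader's action gives min-max, and max-min never exceeds min-max. The sketch below checks this on a hypothetical payoff matrix; it illustrates the static analogue only and is not a computation of the differential game values.

```python
def informed_values(J):
    """J[u0][ui]: payoff of follower i (maximizer) against the leader (minimizer).

    Returns (lower, upper): `lower` is the value when the leader reacts to u_i,
    `upper` is the value when the follower reacts to u_0.
    """
    rows, cols = range(len(J)), range(len(J[0]))
    lower = max(min(J[r][c] for r in rows) for c in cols)  # max over u_i of min over u_0
    upper = min(max(J[r][c] for c in cols) for r in rows)  # min over u_0 of max over u_i
    return lower, upper

J = [[3, 0],
     [1, 2]]  # hypothetical payoff matrix
lower, upper = informed_values(J)
print(lower, upper)
```

For this matrix the two values differ, so the player who reacts second has a strict advantage.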

Theorem 3.

Assume that the following conditions hold true for each

i = \overline{1, n}

:

(1): $x \mapsto σ_{i} (x)$ is concave;
(2): $g_{i} (t, x, u_{0}, u) = g_{i}^{0} (t, x, u_{- i}) + g_{i}^{1} (t, u_{0}, u_{- i}) + g_{i}^{2} (t, u)$ and the function $x \mapsto g_{i}^{0} (t, x, u_{- i})$ is concave.

Then, an inverse Stackelberg solution

(α^{*}, μ^{*})

in mixed strategies exists.

Proof.

Let us prove that the set

C

is nonempty.

Define the multivalued map

G : M_{0} \times M ⊸ M_{0} \times M

by the rule

(μ_{0}^{'}, μ^{'}) \in G (μ_{0}, μ)

if, for each

t \in [0, T]

and each

i = \overline{1, n}

,

J_{i} (t, x_{i} (t), μ_{0}^{'}, μ_{i}^{'}, μ_{- i}) \geq V_{i}^{-} (t, x_{i} (t), μ_{- i}) .

Here,

x_{i} (\cdot) = x (\cdot, μ_{0}, μ_{i}, μ_{- i})

.

The assumptions of the theorem imply that the set

G (μ_{0}, μ)

is convex for all

μ_{0} \in M_{0}

,

μ \in M

. Moreover,

G

has a closed graph. Let us prove the nonemptiness of

G (μ_{0}, μ)

.

Put

μ_{0}^{'} = μ_{0}

. From the Bellman principle, it follows that

\begin{aligned} V_{i}^{+} (t_{*}, x_{*}, μ_{- i}) = \max_{γ_{i} \in N_{i}} \min_{ν_{0} \in M_{0}} \Big[ & V_{i}^{+} (t_{+}, x (t_{+}, t_{*}, x_{*}, ν_{0}, γ_{i} [ν_{0}], μ_{- i}), μ_{- i}) \\ & + \int_{t_{*}}^{t_{+}} \int_{P_{0}} \int_{P_{i}} \int_{P_{- i}} g_{i} (t, x (t, t_{*}, x_{*}, ν_{0}, γ_{i} [ν_{0}], μ_{- i}), u_{0}, u_{i}, u_{- i}) \\ & \quad μ_{- i} (t, d u_{- i}) γ_{i} [ν_{0}] (t, d u_{i}) ν_{0} (t, d u_{0}) d t \Big] . \end{aligned}

(6)

Let N be a natural number. Put

t_{N}^{k} = T k / N

. Let

γ_{i, N}^{k}

maximize the right-hand side of (6) for

t_{*} = t_{N}^{k}

,

t_{+} = t_{N}^{k + 1}

,

x_{*} = y_{i, N}^{k}

. Here

y_{i, N}^{k}

is defined inductively by the rule

y_{i, N}^{0} = x_{0}, \quad y_{i, N}^{k} = x (t_{N}^{k}, t_{N}^{k - 1}, y_{i, N}^{k - 1}, μ_{0}, γ_{i, N}^{k - 1} [μ_{0}], μ_{- i}) .

Put

{\tilde{μ}}_{i, N} (t, \cdot) = γ_{i, N}^{k} [μ_{0}] (t, \cdot)

for

t \in [t_{N}^{k}, t_{N}^{k + 1})

. Denote

x_{i, N} (\cdot) = x (\cdot, t_{0}, x_{0}, μ_{0}, {\tilde{μ}}_{i, N}, μ_{- i})

. Notice that

y_{i, N}^{k} = x_{i, N} (t_{N}^{k})

. We have, for

k < l

, the inequality

\begin{matrix} V_{i}^{+} (t_{N}^{k}, x_{i, N} (t_{N}^{k}), μ_{- i}) \leq V_{i}^{+} (t_{N}^{l}, x_{i, N} (t_{N}^{l}), μ_{- i}) \\ + \int_{t_{N}^{k}}^{t_{N}^{l}} \int_{P_{0}} \int_{P_{i}} \int_{P_{- i}} g_{i} (t, x_{i, N} (t), u_{0}, u_{i}, u_{- i}) μ_{- i} (t, d u_{- i}) {\tilde{μ}}_{i, N} (t, d u_{i}) μ_{0} (t, d u_{0}) d t . \end{matrix}

Note that

V_{i}^{+} (t_{N}^{N}, y_{i, N}^{N}, μ_{- i}) = σ_{i} (y_{i, N}^{N})

.

Using the continuity of function

V_{i}^{+}

, we get

\begin{matrix} V_{i}^{+} (t_{*}, x_{i, N} (t_{*}), μ_{- i}) \leq V_{i}^{+} (T, x_{i, N} (T), μ_{- i}) \\ + \int_{t_{*}}^{T} \int_{P_{0}} \int_{P_{i}} \int_{P_{- i}} g_{i} (t, x_{i, N} (t), u_{0}, u_{i}, u_{- i}) μ_{- i} (t, d u_{- i}) {\tilde{μ}}_{i, N} (t, d u_{i}) μ_{0} (t, d u_{0}) d t + δ_{N} . \end{matrix}

(7)

Here,

δ_{N} \to 0

, as

N \to \infty

.

Some subsequence

{{\tilde{μ}}_{i, N_{r}}}

converges to some

μ_{i}^{'} \in M_{i}

, as

r \to \infty

. Therefore,

x_{i, N_{r}} (\cdot) = x (\cdot, t_{0}, x_{0}, μ_{0}, {\tilde{μ}}_{i, N_{r}}, μ_{- i})

tends to

x_{i} (\cdot) = x (\cdot, t_{0}, x_{0}, μ_{0}, μ_{i}^{'}, μ_{- i})

. This and inequalities (5), (7) yield the inequality for any

t_{*} \in [t_{0}, T]

:

\begin{matrix} V_{i}^{-} (t_{*}, x_{i} (t_{*}), μ_{- i}) \leq V_{i}^{+} (t_{*}, x_{i} (t_{*}), μ_{- i}) \leq V_{i}^{+} (T, x_{i} (T), μ_{- i}) \\ + \int_{t_{*}}^{T} \int_{P_{0}} \int_{P_{i}} \int_{P_{- i}} g_{i} (t, x_{i} (t), u_{0}, u_{i}, u_{- i}) μ_{- i} (t, d u_{- i}) μ_{i}^{'} (t, d u_{i}) μ_{0} (t, d u_{0}) d t . \end{matrix}

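Since

V_{i}^{+} (T, x, μ_{- i}) = σ_{i} (x)

, the right-hand side is equal to

J_{i} (t_{*}, x_{i} (t_{*}), μ_{0}, μ_{i}^{'}, μ_{- i})

. Put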

μ^{'} ≜ (μ_{1}^{'}, \dots, μ_{n}^{'})

. We have

(μ_{0}, μ^{'}) \in G (μ_{0}, μ)

.

Since

M_{0} \times M

is compact, and

G

is an upper semicontinuous multivalued map with nonempty convex compact values,

G

admits the fixed point

(μ_{0}^{*}, μ^{*})

. Obviously, it belongs to

C

. The conclusion of the theorem follows from this and Theorem 2. ☐

Funding

This research was funded by the Russian Foundation for Basic Research (grant No. 17-01-00069).

Conflicts of Interest

The author declares no conflict of interest.

References

1. Ho, Y.C.; Luh, P.; Muralidharan, R. Information structure, Stackelberg games, and incentive controllability. IEEE Trans. Autom. Control. 1981, 26, 454–460.
2. Ho, Y.C.; Luh, P.; Olsder, G. A control-theoretic view on incentives. Automatica 1982, 18, 167–179.
3. Ho, Y.C. On incentive problems. Syst. Control Lett. 1983, 3, 63–68.
4. Olsder, G. Phenomena in Inverse Stackelberg Games, Part 1: Static Problems. J. Optim. Theory Appl. 2009, 143, 589–600.
5. Olsder, G. Phenomena in Inverse Stackelberg Games, Part 2: Dynamic Problems. J. Optim. Theory Appl. 2009, 143, 601–618.
6. Martín-Herrán, G.; Taboubi, S. Incentive Strategies for Shelf-Space Allocation in Duopolies. In Dynamic Games: Theory and Applications; Haurie, A., Zaccour, G., Eds.; Springer: Berlin, Germany, 2005; pp. 231–253.
7. Staňková, K.; Olsder, G.; Bliemer, M. Bilevel optimal toll design problem solved by the inverse Stackelberg games approach. Urban Transp. 2006, 12, 871–880.
8. Ferrara, M.; Khademi, M.; Salimi, M.; Sharifi, S. A Dynamic Stackelberg Game of Supply Chain for a Corporate Social Responsibility. Discret. Dyn. Nat. Soc. 2017, 2017.
9. Başar, T.; Olsder, G. Dynamic Noncooperative Game Theory; Academic Press: Philadelphia, PA, USA, 1999.
10. Martín-Herrán, G.; Taboubi, S.; Zaccour, G. A time-consistent open-loop Stackelberg equilibrium of shelf-space allocation. Automatica 2005, 41, 971–982.
11. Zheng, Y.; Başar, T. Existence and derivation of optimal affine incentive schemes for Stackelberg games with partial information: a geometric approach. Int. J. Control 1982, 35, 997–1011.
12. Ehtamo, H.; Hämäläinen, R. Incentive strategies and equilibria for dynamic games with delayed information. J. Optim. Theory Appl. 1989, 63, 355–369.
13. Kleimonov, A. Nonantagonistic Positional Differential Games; Nauka, Ural’skoe Otdelenie: Ekaterinburg, Russia, 1993.
14. Averboukh, Y.; Baklanov, A. Stackelberg Solutions of Differential Games in the Class of Nonanticipative Strategies. Dyn. Games Appl. 2014, 4, 1–9.
15. Elliot, R.; Kalton, N. The Existence of Value for Differential Games. J. Differ. Equ. 1972, 12, 504–523.
16. Varaiya, P.; Lin, J. Existence of Saddle Points in differential game. SIAM J. Control Optim. 1967, 7, 141–157.

© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
