Inverse Stackelberg Solutions for Games with Many Followers

The paper is devoted to inverse Stackelberg games with many players. We consider both static and differential games. The main assumption of the paper is the compactness of the strategy sets. We obtain the characterization of inverse Stackelberg solutions and under additional concavity conditions establish the existence theorem.


Introduction
The paper is devoted to inverse Stackelberg games, also known as incentive problems. In ordinary Stackelberg games one player (called the leader) announces his strategy, while the other players (called followers) maximize their payoffs using this information. In inverse Stackelberg games the leader announces an incentive strategy, i.e. a reaction to the followers' strategies (see [5], [6], [7], [11], [12] and references therein). In the dynamic case the reaction should be nonanticipative.
Inverse Stackelberg games appear in several models (see, for example, [9], [13]). In games with many followers it is often assumed that the followers play a Nash game (see [2], [9], [10]). If the strategy sets are normed spaces, then the incentive strategy can be constructed in affine form (see [16] for static games and [3] for differential games).
In this paper we consider static and differential games with many followers. The main assumption of the paper is the compactness of the strategy sets. In this case the most efficient tool is discontinuous incentive strategies that realize the concept of punishment. [8] first applied punishment strategies to feedback differential Stackelberg games. The inverse Stackelberg solutions of two-person differential games were studied via punishment strategies in [1]. In that paper the authors described the set of inverse Stackelberg solutions and showed that it is nonempty. In particular, the set of inverse Stackelberg payoffs is equal to the set of feedback Stackelberg payoffs. Note that the incentive strategies considered in [1] use full memory, i.e. the leader plays with the nonanticipating strategies proposed in [4] and [14] for zero-sum differential games. The use of strategies depending only on the follower's current control decreases the payoffs.
In this paper punishment strategies are applied to static inverse Stackelberg games and to differential inverse Stackelberg games with many followers. We obtain a characterization of inverse Stackelberg solutions and, under additional concavity conditions, establish an existence theorem.
The paper is organized as follows. Section 2 starts with the two-player static inverse Stackelberg game, in which there is only one follower. We give the characterization of the solutions in this case and compare it with the ordinary Stackelberg solutions. Then we consider the static inverse Stackelberg game for the case of n followers. The differential game case is considered in Section 3. In Section 4 we prove the existence theorem for the inverse Stackelberg solution of the differential game.

Inverse Stackelberg Solutions for Two-Player Games
We assume that the set of players is $\{0, 1\}$. Let $P_i$ be the set of strategies of player $i$, and let $J_i(u_0, u_1)$ be the utility (payoff) function of player $i$. We assume that the sets $P_i$ are compact and the functions $J_i$ are continuous. Each player wants to maximize his payoff.
For definiteness let player 0 be the leader and player 1 the follower. In the inverse Stackelberg game the leader uses an incentive strategy $\alpha[u_1]$. Here $\alpha[\cdot]$ is an arbitrary map from $P_1$ to $P_0$. The incentive strategy chosen by the leader is known to the follower.
Let $\alpha$ be an incentive strategy of the leader. We say that $u_1^*$ is an optimal strategy of the follower if
$$J_1(\alpha[u_1^*], u_1^*) = \max_{u_1 \in P_1} J_1(\alpha[u_1], u_1).$$
Denote the set of optimal strategies of the follower by $F(\alpha)$.
Definition 1. The pair consisting of an incentive strategy of the leader $\alpha^*$ and a strategy of the follower $u_1^*$ is said to be an inverse Stackelberg solution if
1. $u_1^* \in F(\alpha^*)$;
2. for any incentive strategy of the leader $\alpha$ and any $u_1 \in F(\alpha)$ the following inequality holds:
$$J_0(\alpha^*[u_1^*], u_1^*) \ge J_0(\alpha[u_1], u_1).$$
The second condition means, in particular, that we consider the team solution.
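The inverse Stackelberg solution can be described by means of the lower value of the auxiliary zero-sum game in which player 1 wishes to maximize his payoff $J_1$ and player 0 wishes to minimize it:
$$V^- = \max_{u_1 \in P_1} \min_{u_0 \in P_0} J_1(u_0, u_1). \quad (1)$$
Let $A$ be the set of pairs of strategies $(u_0, u_1)$ such that $J_1(u_0, u_1) \ge V^-$.
Lemma 1. If $u_1^* \in F(\alpha)$ for an incentive strategy $\alpha$, then $(\alpha[u_1^*], u_1^*) \in A$.
Proof. Let $\hat u_1$ maximize the right-hand side of (1). We have that $J_1(u_0, \hat u_1) \ge V^-$ for all $u_0 \in P_0$. Therefore,
$$J_1(\alpha[u_1^*], u_1^*) \ge J_1(\alpha[\hat u_1], \hat u_1) \ge V^-.$$
The converse statement is also true.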
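Lemma 2. Let $(u_0^\natural, u_1^\natural) \in A$. Then there exists an incentive strategy of the leader $\alpha$ such that $\alpha[u_1^\natural] = u_0^\natural$ and $u_1^\natural \in F(\alpha)$.
Proof. Put $\alpha[u_1^\natural] = u_0^\natural$ and, for $u_1 \ne u_1^\natural$, let $\alpha[u_1]$ be a minimizer of $u_0 \mapsto J_1(u_0, u_1)$ over $P_0$ (a punishment). Then for every $u_1 \ne u_1^\natural$
$$J_1(\alpha[u_1], u_1) = \min_{u_0 \in P_0} J_1(u_0, u_1) \le V^- \le J_1(u_0^\natural, u_1^\natural) = J_1(\alpha[u_1^\natural], u_1^\natural).$$
Therefore, $u_1^\natural \in F(\alpha)$.
The definition of the inverse Stackelberg solution and Lemmas 1 and 2 yield the following theorem.
Theorem 1.
1. If $(\alpha^*, u_1^*)$ is an inverse Stackelberg solution, then the pair $(\alpha^*[u_1^*], u_1^*)$ maximizes the value of $J_0$ over the set $A$.
2. Conversely, if the pair $(u_0^*, u_1^*)$ maximizes the value of $J_0$ over the set $A$, then there exists an incentive strategy of the leader $\alpha^*$ such that $\alpha^*[u_1^*] = u_0^*$ and $(\alpha^*, u_1^*)$ is an inverse Stackelberg solution.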
3. There exists at least one inverse Stackelberg solution.
Proof. The first two statements follow directly from the definition of the inverse Stackelberg solution and Lemmas 1 and 2.
The third statement follows from the second one and the compactness of A.
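Now let us compare the payoffs given by the inverse and ordinary Stackelberg solutions. Recall the definition of the Stackelberg solution. Let $F(u_0)$ be the set of strategies $u_1^\natural$ such that $u_1^\natural$ maximizes the function $u_1 \mapsto J_1(u_0, u_1)$ over $P_1$. The pair $(u_0^\dagger, u_1^\dagger)$ is a Stackelberg solution if it maximizes the value of $J_0$ over the set $\{(u_0, u_1) : u_1 \in F(u_0)\}$. If $u_1 \in F(u_0)$, then
$$J_1(u_0, u_1) = \max_{u_1' \in P_1} J_1(u_0, u_1') \ge V^+ \ge V^-.$$
Here $V^+ = \min_{u_0 \in P_0} \max_{u_1 \in P_1} J_1(u_0, u_1)$ is the upper value of the auxiliary zero-sum game. If $(u_0^\dagger, u_1^\dagger)$ is a Stackelberg solution, and $(\alpha^*, u_1^*)$ is an inverse Stackelberg solution, then
$$J_0(u_0^\dagger, u_1^\dagger) \le J_0(\alpha^*[u_1^*], u_1^*). \quad (3)$$
Indeed, denote $u_0^* = \alpha^*[u_1^*]$. By Theorem 1 we have that $(u_0^*, u_1^*)$ maximizes the value of $J_0$ over the set $A$. The pair $(u_0^\dagger, u_1^\dagger)$ maximizes the value of $J_0$ over the set $\{(u_0, u_1) : u_1 \in F(u_0)\}$. Inequality (3) follows from this and the inclusion $\{(u_0, u_1) : u_1 \in F(u_0)\} \subset A$.
The following example shows that the inequality in (3) can be strict even in the case when Note that the pair $(1, -1)$ maximizes the value of $J_0$ over the set $\{(u_0, u_1) :$ Consequently, in this example the inverse Stackelberg solution gives a larger payoff than the Stackelberg solution: $J_0(\alpha_0^*[-1], -1) = 2 > 0 = J_0(1, 1)$.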
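The comparison above can be checked numerically on a small finite game. The following Python sketch is only an illustration under stated assumptions: the grids `P0`, `P1` and the payoffs `J0`, `J1` are hypothetical and are not the example of the paper, and the set $A$ is read as the set of pairs with $J_1(u_0, u_1) \ge V^-$, where $V^-$ is the lower value (max over $u_1$ of min over $u_0$). The incentive strategy is built by punishment: the leader plays the desired response to the desired follower strategy and minimizes the follower's payoff otherwise.

```python
from itertools import product

def lower_value(P0, P1, J1):
    """Lower value V^-: the follower maximizes, the leader minimizes J1."""
    return max(min(J1(u0, u1) for u0 in P0) for u1 in P1)

def inverse_stackelberg(P0, P1, J0, J1):
    """Maximize J0 over A and build a punishment incentive strategy."""
    v_minus = lower_value(P0, P1, J1)
    # A: pairs that give the follower at least the lower value.
    A = [(u0, u1) for u0, u1 in product(P0, P1) if J1(u0, u1) >= v_minus]
    u0_star, u1_star = max(A, key=lambda pair: J0(*pair))

    def alpha(u1):
        # Desired response on the desired strategy, punishment otherwise.
        if u1 == u1_star:
            return u0_star
        return min(P0, key=lambda u0: J1(u0, u1))

    return alpha, u1_star, J0(u0_star, u1_star)

def stackelberg_payoff(P0, P1, J0, J1):
    """Ordinary Stackelberg payoff: maximize J0 over pairs with u1 in F(u0)."""
    best = float("-inf")
    for u0 in P0:
        top = max(J1(u0, u1) for u1 in P1)
        replies = [u1 for u1 in P1 if J1(u0, u1) == top]
        best = max(best, max(J0(u0, u1) for u1 in replies))
    return best

# Hypothetical data (not the example of the paper).
P0 = P1 = [-1, 0, 1]
J0 = lambda u0, u1: u0 - u1          # leader's payoff
J1 = lambda u0, u1: -(u1 - u0) ** 2  # follower wants to match the leader

alpha, u1_star, inverse_payoff = inverse_stackelberg(P0, P1, J0, J1)
ordinary_payoff = stackelberg_payoff(P0, P1, J0, J1)
print(inverse_payoff, ordinary_payoff)  # prints "1 0"
```

In this toy game the follower's best reply to a fixed $u_0$ is $u_1 = u_0$, so the ordinary Stackelberg payoff of the leader is 0, while the threat of punishment lets the leader obtain 1. The grids are finite only to keep the search exhaustive; the compact continuous case of the text is not covered by this sketch.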

Case of One Leader and Many Followers
Let player 0 be the leader, and let players $1, \dots, n$ be followers. Player $i$ has a set of strategies $P_i$ and a payoff function $J_i$. As above, we assume that the sets $P_i$ are compact and the functions $J_i$ are continuous.
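The incentive strategy of the leader is a mapping $\alpha : P \to P_0$, where $P = P_1 \times \dots \times P_n$. To define the inverse Stackelberg game we should specify the solution concept used by the followers. We suppose that the followers play a Nash game. Let $u = (u_1, \dots, u_n) \in P$ be a profile of strategies of the followers, and let $u_{-i}$ denote the strategies of all followers but the $i$-th one. If $\alpha$ is an incentive strategy of the leader and $u$ is a profile of strategies of the followers, then denote $J_i[\alpha, u] = J_i(\alpha[u], u)$. Further, let $E(\alpha)$ be the set of followers' Nash equilibria in the case when the leader plays with the incentive strategy $\alpha$:
$$E(\alpha) = \{u \in P : J_i[\alpha, u] \ge J_i[\alpha, (u_i', u_{-i})] \text{ for all } u_i' \in P_i,\ i = 1, \dots, n\}.$$
Definition 2. The pair $(\alpha^*, u^*)$ is an inverse Stackelberg solution in the game with one leader and $n$ followers playing Nash equilibrium if
1. $u^* \in E(\alpha^*)$;
2. $J_0[\alpha^*, u^*] \ge J_0[\alpha, u]$ for any incentive strategy of the leader $\alpha$ and any $u \in E(\alpha)$.
The structure of the inverse Stackelberg solution is given in the following statements. Denote
$$B = \Big\{(u_0, u) : J_i(u_0, u) \ge \max_{u_i' \in P_i} \min_{u_0' \in P_0} J_i(u_0', u_i', u_{-i}),\ i = 1, \dots, n\Big\}.$$
Lemma 3.
1. If $u^* \in E(\alpha)$ for an incentive strategy $\alpha$, then $(\alpha[u^*], u^*) \in B$.
2. If the strategy of the leader $u_0^\natural$ and the profile of the followers' strategies $u^\natural$ are such that $(u_0^\natural, u^\natural) \in B$, then there exists an incentive strategy of the leader $\alpha$ such that $\alpha[u^\natural] = u_0^\natural$ and $u^\natural \in E(\alpha)$.
The proof of this lemma is the same as the proofs of Lemmas 1 and 2.
Theorem 2.
1. If $(\alpha^*, u^*)$ is an inverse Stackelberg solution, then the pair $(\alpha^*[u^*], u^*)$ maximizes the value of $J_0$ over the set $B$.
2. Conversely, if the pair $(u_0^*, u^*)$ maximizes the value of $J_0$ over the set $B$, then there exists an incentive strategy of the leader $\alpha^*$ such that $\alpha^*[u^*] = u_0^*$ and $(\alpha^*, u^*)$ is an inverse Stackelberg solution.
3. If the function $u_i \mapsto J_i(u_0, u_i, u_{-i})$ is quasiconcave for all $u_0$, $u_{-i}$, and $i = 1, \dots, n$, then there exists at least one inverse Stackelberg solution.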
Proof. The first two statements follow directly from Lemma 3.
Let us prove the third statement of the theorem. Define
$$K_i(u_i, u_{-i}) = \min_{u_0 \in P_0} J_i(u_0, u_i, u_{-i}).$$
The functions $u_i' \mapsto K_i(u_i', u_{-i})$ are quasiconcave for all $u_{-i}$. Therefore there exists a profile of followers' strategies $u^\natural$ that is a Nash equilibrium of the auxiliary game with payoffs $K_i$, i.e. for all $u_i \in P_i$ and $i = 1, \dots, n$
$$K_i(u_i, u_{-i}^\natural) \le K_i(u_i^\natural, u_{-i}^\natural).$$
Hence, any pair $(u_0, u^\natural)$ belongs to $B$. Consequently, $B$ is nonempty. Moreover, the set $B$ is compact. This proves the existence of a pair $(u_0^*, u^*)$ maximizing $J_0$ over the set $B$. The existence of an inverse Stackelberg solution follows directly from the second statement of the theorem.
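The construction for many followers can be sketched in the same way on finite strategy sets. The code below is a hedged illustration, not the method of the paper: it assumes that $B$ is the set of pairs $(u_0, u)$ in which every follower $i$ receives at least $\max_{u_i'} \min_{u_0} J_i(u_0, u_i', u_{-i})$, and that the leader punishes a single deviating follower by minimizing that follower's payoff. All names (`solve`, `is_nash`, the payoffs) are hypothetical. On finite grids $B$ may be empty, in which case `solve` returns `None`.

```python
from itertools import product

def solve(P0, P, J0, J):
    """P: list of follower strategy lists; J[i](u0, u) with u a tuple."""
    n = len(P)

    def swap(u, i, v):
        # Profile u with the strategy of follower i replaced by v.
        return u[:i] + (v,) + u[i + 1:]

    def guaranteed(i, u):
        # max over u_i' of min over u0 of J_i(u0, u_i', u_{-i}).
        return max(min(J[i](u0, swap(u, i, v)) for u0 in P0) for v in P[i])

    B = [(u0, u) for u0 in P0 for u in product(*P)
         if all(J[i](u0, u) >= guaranteed(i, u) for i in range(n))]
    if not B:
        return None
    u0_star, u_star = max(B, key=lambda pair: J0(pair[0], pair[1]))

    def alpha(u):
        if u == u_star:
            return u0_star
        # Punish the first deviating follower (arbitrary if several deviate).
        i = next(j for j in range(n) if u[j] != u_star[j])
        return min(P0, key=lambda u0: J[i](u0, u))

    return alpha, u_star, J0(u0_star, u_star)

def is_nash(alpha, u, P, J):
    """Check that no follower gains by a unilateral deviation under alpha."""
    for i in range(len(P)):
        base = J[i](alpha(u), u)
        for v in P[i]:
            d = u[:i] + (v,) + u[i + 1:]
            if J[i](alpha(d), d) > base:
                return False
    return True

# Hypothetical data: two followers choose effort 0 or 1, the leader
# chooses whether to grant a benefit (1) or not (0).
P0 = [0, 1]
P = [[0, 1], [0, 1]]
J = [lambda u0, u, i=i: u0 - 0.5 * u[i] for i in range(2)]
J0 = lambda u0, u: sum(u) - 0.25 * u0

alpha, u_star, leader_payoff = solve(P0, P, J0, J)
print(u_star, leader_payoff, is_nash(alpha, u_star, P, J))
```

In this toy game both followers exert effort because a unilateral deviation is answered by withdrawal of the benefit. Profiles in which several followers deviate at once are irrelevant for the Nash property, so the response to them is chosen arbitrarily.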

Inverse Stackelberg Solution for Differential Games
As above we assume that player 0 is the leader, while players $1, \dots, n$ are followers. The dynamics of the system is given by the equation
$$\dot x(t) = f(t, x(t), u_0(t), u_1(t), \dots, u_n(t)). \quad (4)$$
Player $i$ wishes to maximize the payoff The set is the set of open-loop strategies of player $i$. As above, the $n$-tuple of open-loop strategies of the followers $u = (u_1, \dots, u_n)$ is called the profile of strategies. For notational simplicity denote If $t_* = 0$, $x_* = x_0$ we omit the arguments $t_*$ and $x_*$. Let $z(\cdot, t_*, x_*, u_0, u) = (z_0(\cdot, t_*, x_*, u_0, u), z_1(\cdot, t_*, x_*, u_0, u), \dots, z_n(\cdot, t_*, x_*, u_0, u))$. We assume that the set of motions is closed, i.e. for all $(t_*,$ Here $\mathrm{cl}$ denotes the closure in the space of continuous functions on $[0, T]$. We assume that the followers use open-loop strategies $u_i \in U_i$, while the leader's strategy is a nonanticipative strategy $\alpha : U \to U_0$. The nonanticipation property means that $\alpha[u](\tau) = \alpha[u'](\tau)$ for any $u$ and $u'$ coinciding on $[0, \tau]$.
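The nonanticipation property can be illustrated by a discrete-time analogue. The sketch below is an assumption-laden simplification: controls are finite lists indexed by time steps, there is a single follower control instead of a profile, and `punish` is a hypothetical placeholder for whatever punishing response the leader would use. The leader follows a nominal plan as long as the follower's control coincides with the nominal one and switches to punishment from the first deviation on, so the response at step `k` depends only on the follower's control up to step `k`.

```python
def make_punishment_strategy(leader_plan, follower_plan, punish):
    """Build a nonanticipative map alpha from follower controls to leader controls."""
    def alpha(u):
        response, deviated = [], False
        for k, u_k in enumerate(u):
            # Only u[0..k] has been inspected when the k-th response is chosen.
            deviated = deviated or (u_k != follower_plan[k])
            response.append(punish(k, u_k) if deviated else leader_plan[k])
        return response
    return alpha

# Hypothetical plans on four time steps; the punishing control is -1.
leader_plan = [1, 1, 1, 1]
follower_plan = [0, 0, 0, 0]
alpha = make_punishment_strategy(leader_plan, follower_plan, lambda k, u_k: -1)

u = [0, 0, 1, 0]        # deviates at step 2
u_prime = [0, 0, 1, 1]  # coincides with u on steps 0..2
print(alpha(u))          # prints "[1, 1, -1, -1]"
print(alpha(u)[:3] == alpha(u_prime)[:3])  # prints "True"
```

On the nominal follower control the strategy returns the nominal leader plan, which is the discrete counterpart of the requirement that the incentive strategy reproduces the desired leader control along the desired profile.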
For $u_0 \in U_0$, $u \in U$, $(t_*, x_*)$ define Further, put We omit the arguments $t_*$ and $x_*$ if $t_* = 0$, $x_* = x_0$. We assume that the followers' solution concept is Nash equilibrium. Let $E_d(\alpha)$ denote the set of Nash equilibria in the case when the leader plays with the nonanticipating strategy $\alpha$:
Definition 3. The pair consisting of a nonanticipative strategy of the leader $\alpha^*$ and $u^* \in U$ is an inverse Stackelberg solution of the differential game if
The proposed definition is analogous to the definition of the inverse Stackelberg solution for static games. The characterization in the differential game case is also close to the characterization in the static game case.
For a fixed profile of strategies $u_{-i}$ of all players but the $i$-th one, one can consider the zero-sum differential game of player 0 and player $i$. The lower value of this game is
Lemma 4. Let $\alpha$ be an incentive strategy of the leader. If
Proof. We claim that for any Assume the converse. This means that for some $u_i'$ and $\tau$ Consider the control This contradicts the assumption $u^\natural \in E_d(\alpha)$.
The inequality (5) yields the inequality
Lemma 5. For any $(u_0^\natural, u^\natural) \in C$ there exists a nonanticipative strategy of the leader $\alpha$ such that $\alpha[u^\natural] = u_0^\natural$ and $u^\natural \in E_d(\alpha)$.
Proof. Let $u_i \in U$, and let $\tau_i$ be the greatest time so that u . There exists a nonanticipative strategy of the leader
Theorem 3.
1. Let $\alpha^*$ be a nonanticipative strategy of the leader so that $(\alpha^*, u^*)$ is an inverse Stackelberg solution. Then the pair $(\alpha^*[u^*], u^*)$ maximizes the value of $J_0$ over the set $C$.
2. Conversely, if the pair $(u_0^*, u^*)$ maximizes the value of $J_0$ over the set $C$, then there exists an incentive strategy of the leader $\alpha^*$ such that $\alpha^*[u^*] = u_0^*$ and $(\alpha^*, u^*)$ is an inverse Stackelberg solution.
The theorem follows directly from Lemmas 4 and 5.

Existence of Inverse Stackelberg Solution for Differential Game
In this section we consider the differential game in mixed strategies. This means that we replace the system (4) with the control system described by the equation
$$\dot x(t) = \int_{P_0} \int_{P_1} \dots \int_{P_n} f(t, x(t), u_0, u_1, \dots, u_n)\, \mu_n(t, du_n) \dots \mu_1(t, du_1)\, \mu_0(t, du_0). \quad (6)$$
Here $\mu_i(t, \cdot)$ are probability measures on $P_i$. We denote the solution of the initial value problem for equation (6) with the initial position $(t_*, x_*)$ by $x(\cdot, t_*, x_*, \mu_0, \mu_1, \dots, \mu_n)$. Further, let $M_i$ be the set of functions $\mu_i(t, du_i)$ such that for all $t$ $\mu_i(t, \cdot)$ is a probability measure on $P_i$ and $t \mapsto \mu_i(t, \cdot)$ is weakly measurable, i.e. the function
$$t \mapsto \int_{P_i} \varphi(u_i)\, \mu_i(t, du_i)$$
is measurable for any continuous function $\varphi$.
As above we call the $n$-tuple $\mu = (\mu_1, \dots, \mu_n)$ the profile of followers' mixed strategies. Denote the set of followers' strategies by $M = M_1 \times \dots \times M_n$. If $m = (m_1, \dots, m_n) \in M$ then denote, with a slight abuse of notation, $m(du) = m_1(du_1) \dots m_n(du_n)$. Further, $\int_P \varphi(u)\, m(du)$ means the integral with respect to the measure $m_1(du_1) \dots m_n(du_n)$ over the set $P = P_1 \times \dots \times P_n$. Analogously, if $m_{-i}$ is an $(n - 1)$-tuple of measures $(m_j)_{j \ne i}$ then we assume that $m_{-i}(du_{-i}) = \times_{j \ne i} m_j(du_j)$. Thus, $\int_{P_{-i}} \varphi(u_{-i})\, m_{-i}(du_{-i})$ designates the integral with respect to the measure $\times_{j \ne i} m_j(du_j)$ over the set $P_{-i}$.
As above the mapping $\alpha : M \to M_0$ satisfying condition of feasibility (the equality µ ′ and µ ′′ on [0, τ ] yields the equality α
Theorem 4. Assume that the following conditions hold true for each $i = 1, \dots, n$ Then there exists an inverse Stackelberg solution in mixed strategies $(\alpha^*, \mu^*)$.
Proof. Let us prove that the set $C$ is nonempty. Since the players use mixed strategies, the Isaacs condition holds for each $i = 1, \dots, n$, i.e. for every profile of measures $m_{-i}$ and any vector $s \in \mathbb{R}^d$ the following equality is valid min Here $\beta_i$ denotes a mapping $M_0 \to M_i$ satisfying the feasibility property. Define the multivalued map G : Here Moreover, $G$ has a closed graph. Let us prove the nonemptiness of $G(\mu_0, \mu)$.
Since $M_0 \times M$ is compact, and $G$ is an upper semicontinuous multivalued map with nonempty convex compact values, we get that $G$ admits a fixed point $(\mu_0^*, \mu^*)$. Obviously, it belongs to $C$. The conclusion of the theorem follows from this and Theorem 3.