2. Definition of ESS for Two-Person Games
Following [2], the symmetric extensive-form 2-person game is a pair (Γ, T), where Γ is an extensive-form game and T is a symmetry of Γ. If are the behavior strategies of player 1 in Γ and (behavior strategies of player 2) are the symmetric images of , respectively, then the probability that the endpoint z is reached when is played is equal to the probability that is reached when is played. Therefore, the expected payoff of player 1 when is played is equal to player 2's expected payoff when is played [10]:
Equation (1), restricted to pure strategies, defines the symmetric normal form of Γ.
Definition 1. A direct ESS in is a behavior strategy of player 1 that satisfies
We propose a refinement of this definition. Let be the probability measure generated over the set of endpoints of the game if the players choose behavior strategies , , respectively.
Definition 2. The behavior strategy is called ESS in if satisfies
Note that the key condition in Definition 2 is weaker than in Definition 1. Under the biological interpretation of ESS, we have to take into account that biological populations may not react to every change of strategy in an extensive-form game, since a strategy in such a game has a very complicated structure. "Animals" cannot observe a deviation in the strategy itself; they can react only to changes in the probability measure on the final positions of the game (on the set of outcomes). Thus, deviations that do not affect the measure on the endpoints should not be taken into account when considering ESS.
Example 1. We revisit the Hawk and Dove game [11]. This game is a two-person bimatrix game Γ with payoff matrices: If V > C, the strategy H is ESS in Γ. Consider now a two-stage version of this game, which is represented in Figure 1. A strategy of player I (II) in this game is a rule that selects one of the two alternatives, H or D, at each information set of the player. Player I (II) has 5 information sets (one at the first stage and one for each of the four possible first-stage outcomes), and thus each player has 2^5 = 32 strategies, each of which can be represented as a sequence . Denote this strategy of player as .
Consider the strategy , which is composed of the ESS (case V > C) of each stage game. One would expect this strategy to be ESS in our two-stage game [12,13]. Unfortunately, it does not satisfy Definition 1, which is the reason we replace that definition with Definition 2 in this paper. It is easily seen that condition (2) holds, since is NE in the game Γ. However, there exists a strategy for which the payoff . Since and , this shows that condition (3) is not satisfied. According to Definition 2, however, the strategy is ESS, since the strategy giving the same payoff against as itself is excluded from consideration by condition (5) of Definition 2.
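The argument of Example 1 can be checked by brute force. The following is a minimal sketch, not the paper's own computation. It assumes the standard textbook Hawk and Dove payoffs ((V − C)/2 for H against H, V for H against D, 0 for D against H, V/2 for D against D) and the illustrative values V = 4, C = 2, none of which are taken from the source. It also assumes that the classical stability condition requires a strictly higher payoff against every alternative best reply, and that the refined condition disregards alternative best replies that leave the realized path unchanged. All names in the code are hypothetical.

```python
from itertools import product

# Assumed illustrative values with V > C (not taken from the paper).
V, C = 4.0, 2.0

# Standard textbook Hawk-Dove payoffs to the row player (assumed form).
STAGE = {
    ("H", "H"): (V - C) / 2,
    ("H", "D"): V,
    ("D", "H"): 0.0,
    ("D", "D"): V / 2,
}

# First-stage outcomes seen from a player's own side: (own move, opponent move).
HISTORIES = [("H", "H"), ("H", "D"), ("D", "H"), ("D", "D")]

# A pure strategy is a 5-tuple: the first-stage move, then one move for each
# of the four second-stage information sets (one per first-stage outcome).
STRATEGIES = list(product("HD", repeat=5))


def play(own, other):
    """Return (payoff to `own`, realized path) when `own` meets `other`."""
    a, b = own[0], other[0]
    a2 = own[1 + HISTORIES.index((a, b))]
    b2 = other[1 + HISTORIES.index((b, a))]
    return STAGE[(a, b)] + STAGE[(a2, b2)], ((a, b), (a2, b2))


def payoff(own, other):
    return play(own, other)[0]


def path(own, other):
    return play(own, other)[1]


u = ("H",) * 5  # play H at every information set

# Equilibrium condition: no strategy does better against u than u itself.
is_nash = all(payoff(v, u) <= payoff(u, u) for v in STRATEGIES)

# Alternative best replies: other strategies that tie with u against u.
ties = [v for v in STRATEGIES if v != u and payoff(v, u) == payoff(u, u)]

# Classical stability: u must do strictly better against every tie.
classical_ok = all(payoff(u, v) > payoff(v, v) for v in ties)

# Path-based refinement: ties that reproduce the path of (u, u) are ignored.
relevant_ties = [v for v in ties if path(v, u) != path(u, u)]
refined_ok = all(payoff(u, v) > payoff(v, v) for v in relevant_ties)

print(len(STRATEGIES), is_nash, len(ties), classical_ok, refined_ok)
```

Under these assumptions, the ties are exactly the strategies that differ from the all-H strategy only at information sets that are never reached, which is why the classical condition fails while the path-based one holds.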
Remark: In our example, the ESS is in pure strategies, and thus in conditions (2)–(5) the mathematical expectation of the payoff coincides with the payoff itself [14].
Suppose now that is the n-stage repeated bimatrix game. Let G be a symmetric stage bimatrix game. The strategies in G are alternatives in . To each strategy i of player 1 in , we associate the strategy of player 2 in G with the same index i. Each alternative in is a strategy (index) in some stage game G in . The mapping assigns to the alternative c (strategy) of player 1 in stage game G the alternative (strategy) of player 2 in the same stage game (the strategy with the same index). To each information set of player 1, the mapping T assigns the information set of player 2 in the same stage game (a bimatrix game can be represented as a game in extensive form with two moves and two successive information sets, one for player 1 and one for player 2).
Theorem 1. If is an ESS in G, then the behavior strategy prescribing the behavior to the alternatives of each information set ( is an ESS in the stage game G) is an ESS in .
3. Definition of ESS for n-Person Games
There are many different approaches to how the ESS should be extended to the n-person case. We shall follow the definition given in [15]. Suppose we have a game G in normal form:
where is the set of players, is the set of strategies of player i, and is the payoff function of player i. We suppose for simplicity that the sets are finite.
The strategy profile is an ESS in G [16] if it is a strict Nash equilibrium, i.e., if
It has been proved that condition (6) protects the strategy against the invasion of a few mutants playing another strategy .
It is also clear that (6) cannot be used to define ESS in multistage games since there is always a large number of strategies such that for any strategy profile, .
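The strict Nash equilibrium requirement can be verified mechanically in any finite normal-form game. The sketch below is only an illustration under our own assumptions: the function name `is_strict_nash`, the way the game is encoded, and the three-player coordination game are all hypothetical and do not come from the source.

```python
from itertools import product


def is_strict_nash(strategy_sets, payoff, profile):
    """True if every unilateral deviation strictly lowers the deviator's payoff.

    strategy_sets: list of finite strategy lists, one per player.
    payoff: function (player index, profile tuple) -> number.
    profile: tuple with one strategy per player.
    """
    for i, strategies in enumerate(strategy_sets):
        base = payoff(i, profile)
        for s in strategies:
            if s == profile[i]:
                continue
            deviated = profile[:i] + (s,) + profile[i + 1:]
            if payoff(i, deviated) >= base:
                return False
    return True


# Hypothetical 3-player coordination game: a player's payoff is the number
# of other players who choose the same strategy.
SETS = [["a", "b"]] * 3


def coordination_payoff(i, profile):
    return sum(1 for j, s in enumerate(profile) if j != i and s == profile[i])


strict_equilibria = [p for p in product(*SETS)
                     if is_strict_nash(SETS, coordination_payoff, p)]
print(strict_equilibria)
```

In this toy game only the two fully coordinated profiles pass the test. A profile such as ("a", "a", "b") fails because the first player can switch to "b" without losing payoff, which is the kind of tie that the strict inequality rules out.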
Following the ideas of the previous section, we try to refine the ESS concept specified in (6) in such a way that it is also useful for n-person multistage games.
Note that (6) automatically excludes mixed strategy profiles from consideration. Accordingly, the refinement of this concept will also deal only with pure strategy profiles.
Let Γ be a multistage n-person game. Denote by the strategy set of player i in Γ; is a strategy of player i, and is the payoff function of player i.
Definition 3. The strategy profile in Γ is called ESS if and if for some , then the paths corresponding to and necessarily coincide.
From Definition 3, it follows that the strict inequality in (7) holds for all deviations whose resulting paths differ from the path generated by the ESS strategy profile.
5. ESS for Metagames
A finite multistage game , at each stage of which some n-person game G is played, is called a metagame; the game realized at each stage depends on the players' choices in previous games.
On the strategy profiles x of the stage game G, a mapping is defined that assigns to each stage game G and strategy profile x the next stage game .
Suppose that in the metagame , on the first stage, the stage game is played. If in , the players choose strategy profile , then on the second stage, the game is played. If on stage k, the players playing the stage game choose strategy profile , then on the next stage, the game is played. The metagame ends on stage m. The payoff of player i in the metagame is equal to the sum of their payoffs in the stage games. Denote by the payoff of player i in stage game ; then the payoff of player i in the metagame is equal to
It is important that after each stage, the players know the entire prehistory, i.e., all players' choices before the current stage of the metagame.
A strategy of player i in is a mapping that determines the choice of strategy in each stage game G as a function of the strategy profiles of all players in the stage games realized before G.
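The structure of a metagame described above (a transition mapping between stage games, strategies that depend on the prehistory, and total payoffs equal to the sum of stage payoffs) can be sketched as a short simulation. This is an illustration under our own assumptions: the function `run_metagame`, the stage games "G1" and "G2", and their payoff tables are hypothetical and are not the games used in the paper.

```python
def run_metagame(first_game, stage_payoffs, next_game, strategies, m):
    """Play m stages and return (total payoffs, path of stage games).

    stage_payoffs[g][profile] -> tuple of payoffs, one per player.
    next_game(g, profile)     -> name of the stage game played next.
    strategies[i](g, history) -> move of player i in game g, where history
                                 lists the (game, profile) pairs played so far.
    """
    game, history = first_game, []
    totals = [0.0] * len(strategies)
    for _ in range(m):
        profile = tuple(s(game, tuple(history)) for s in strategies)
        for i, x in enumerate(stage_payoffs[game][profile]):
            totals[i] += x
        history.append((game, profile))
        game = next_game(game, profile)
    return totals, [g for g, _ in history]


# Hypothetical two-player stage games with moves 1 and 2.
PAYOFFS = {
    "G1": {(1, 1): (3, 3), (1, 2): (0, 2), (2, 1): (2, 0), (2, 2): (1, 1)},
    "G2": {(1, 1): (1, 1), (1, 2): (0, 0), (2, 1): (0, 0), (2, 2): (0, 0)},
}


def transition(game, profile):
    # Play G1 next after the profile (1, 1); otherwise play G2 next.
    return "G1" if profile == (1, 1) else "G2"


def always_one(game, history):
    return 1


def deviate_at_second_stage(game, history):
    # Plays 2 only at the second stage (exactly one stage in the prehistory).
    return 2 if len(history) == 1 else 1


on_path = run_metagame("G1", PAYOFFS, transition, [always_one, always_one], 3)
deviation = run_metagame("G1", PAYOFFS, transition,
                         [deviate_at_second_stage, always_one], 3)
print(on_path)
print(deviation)
```

In this toy instance the deviation changes the path of stage games from G1, G1, G1 to G1, G1, G2 and lowers the deviator's total payoff, which is the kind of comparison between paths and payoffs that the definitions in this section formalize.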
Suppose that the stage game G has an ESS (strict Nash equilibrium). Let be an ESS strategy profile in G, and let be the payoff of player i in G under the strategy profile .
Suppose that under the strategy profile , the sequence of stage games is realized. We shall call this sequence of stage games the path corresponding to the strategy profile .
Now consider the stage game . Note that the game also depends on the choices made by the players in the previous stage game . This means that on stage k, depending on previous strategy choices, different games of type can be realized. For each stage game , denote by the zero-sum game between player i as the first player and the subset as the second player, with sets of strategies , respectively, in which the payoff of player i is given by . (The payoff of the second player is given by − .)
Denote by the corresponding mixed-strategy profile in the saddle point of and by the value of . Fix some strategy profile in as and suppose that
Consider .
Definition 4. The strategy profile is ESS in the metagame if for all i and all , and if for some , then the paths corresponding to and coincide. This definition parallels the definition of ESS for n-person games.
Construct the strategy of player i in the metagame as follows: in games , the players choose strategies , and at the last stage, in , the strategy . Then, the strategy profile realizes a sequence of stage games in the metagame , which we will call the optimal trajectory. Denote by the payoff of player i in .
Suppose that player i deviates from at some stage ; then, beginning from stage , the players from choose (see Figure 2). Define , satisfying . After stage t, the players choose strategy , optimal in the zero-sum game .
Denote . Suppose that
Theorem 3. If there exist strategies in games such that (9) holds, then the strategy profile , mentioned above, is ESS in metagame Γ.
Proof. The payoff of player i when the strategy profile is used is
It is important to note that are pure strategies.
Suppose that player i deviates from , and this happens at stage t of the metagame . Denote by this new strategy of player i. Then we obtain a new strategy profile in , which realizes a path different from the optimal trajectory. Consider the payoff of player i under the strategy profile , which realizes a path different from the optimal trajectory. From (9), we obtain
Thus, is ESS (see Definition 4).
The theorem is proved. □
Example 2. Consider a metagame Γ, in which one of two possible games and is played on each stage. and are two-player games with strategy sets , in of players I and II, and strategy sets , in of players I and II, respectively. The payoffs in are defined in Table 1. In both games, the Nash equilibrium is , with payoffs and . Suppose . In both stage games, if player i deviates from (or ), they can obtain at most . Similarly, .
The metagame Γ proceeds as follows. On the first stage, the players play the game (), and if in they choose strategy profile (1, 1) or (1, 2), then on the next stage the game is repeated . In the other case (if strategy profiles (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), and (3, 3) are chosen), on the next stage, the game is played . If on stage k the game is played, the next stage game is defined as in the first stage. If on stage k the game is played, then on the next stage the game is played if in stage game the strategy profiles (1, 1) or (1, 2) are chosen. In other cases, on stage , the game is played . The metagame ends on stage m.
In each case, when one of the players (player i) deviates from the strategy profile , the other player will choose strategy 2 on the next stages of the metagame. Hence, the payoff of the deviating player in all future stage games will be equal to 0. We see that the condition is satisfied, and the strategy profile constructed above is strong NE and, thus, ESS.