1. Introduction
Classical game theory assumes that economic agents are perfectly rational and have unlimited reasoning resources, which enable them to make correct decisions at all times. In reality, reasoning resources are scarce and the economic environment is complex, with agents interacting in many settings or games. If games played by agents are similar in some dimensions, distinguishing all games at all times requires undue reasoning effort. This paper examines whether an evolutionary process could weed out such a waste of resources.
Formally, it is assumed that players repeatedly play games drawn from some pre-specified family. The family of games remains fixed for the duration of play, but in every period, a new game is drawn. Due to the scarcity of reasoning resources, the family of games is partitioned into analogy classes, i.e., the subsets of games that players do not distinguish. While players choose strategies in the repeated games induced by the partitions rationally, the choice of how to partition the game set is beyond their control and is, instead, governed by some evolutionary process.
In this setting, an example is constructed demonstrating that when games are played a finite number of times, not knowing exactly which game is currently being played might make all players better off. Intuitively, coarse partitioning of the game set is a commitment device that allows players to play strategies that are not incentive-compatible when players know perfectly which game is currently being played. Such commitment can result in a Pareto improvement. Furthermore, in some instances, the new equilibrium payoffs are also immune to evolutionary pressure that operates at a partition selection level (also referred to as a partition selection meta-game).
In the context of this paper, introducing an appropriate notion of evolutionary viability poses a conceptual challenge. Ideally, the evolutionary pressure should operate on the level of partition selection only, without affecting the strategies players follow in the resulting dynamic game. Such a dichotomy between decision-making and the evolution of cognition is natural. On the one hand, in everyday interaction, people make conscious choices, taking their reasoning and cognition constraints as given. On the other hand, the process that shapes human reasoning operates on a different time scale, and it takes countless generations for substantial changes to occur. However, in repeated games, such separation is difficult, as the partition determines the knowledge of players about the play of the game so far. Hence, an evolutionarily viable strategy has to specify a partition and an equilibrium strategy in the resulting game.
The adopted definition of evolutionary viability is related to the notion of an evolutionarily stable strategy and attempts to identify partition-strategy pairs that are immune to one-off invasion by a small number of mutants with a different partition. It is assumed that players in the population are (boundedly) rational, but do not know much about the workings of the evolutionary process and, hence, are unaware of the possibility of mutation.
The idea that reasoning resources are scarce and that people use analogies to make decisions in their day-to-day interactions is not new. For example, [1] advocates that players, rather than being action rational, i.e., consciously optimizing in each decision situation, are rule rational and apply “rules of thumb”. These rules of thumb emerge as a result of an evolutionary process and work well on average. For example, leaving a tip at a restaurant is rational only if one intends to come back to that restaurant and would like to incentivize staff to provide good service in future. However, customers routinely leave generous tips even when dining at restaurants where they are unlikely to come back in the near future. The rule “always tip generously” works well on average and spares its users the anguish of routinely estimating the likelihood of coming back.
Similarly, [2] suggest that the use of vague, imprecise language can be explained only if people have a vague view of the world. This paper models interactions of a player who has evolved to have an imprecise view of the world and their day-to-day interactions.
The setup of this paper is closely related to [3], who studies the learning process carried out by agents who are involved in many games. [3] builds a dynamic model in which players simultaneously learn how to partition the game set and which actions to choose in one-shot two-player games. Instead of being rational, players adapt their behavior through reinforcement learning. Unlike [3], this paper assumes that players are engaged in repeated games and that evolutionary pressure operates only at the partition selection stage, while subsequent decisions of players are rational.
This paper is also distantly related to the literature on commitment in games, e.g., [4]. This strand of literature assumes that in the first stage, players voluntarily and credibly commit to a subset of their pure strategies, and in the second stage, they play the game induced by their commitment. In this paper, players are limited in what they know about the game and not in what they can do in the game. More importantly, here, each player’s partition of the game set is not publicly observable and, thus, the explicit commitment mechanism is lacking.
The rest of the paper is organized as follows: Section 2 outlines an example and characterizes equilibrium payoffs under exogenously given partitions, Section 3 analyzes the evolutionary viability of coarse partitioning, and Section 4 concludes.
2. Fixed Partitions
There are two players interacting over two periods. In every period
they play a normal form game that is randomly and independently drawn from the set
according to probabilities
Each game
is a finite simultaneous move game and both games in Γ share the same action space.
has the payoff matrix:
Note that
has three Nash equilibria:
and
Outcomes
and
are also Pareto-efficient in this game.
has the following payoff matrix:
The unique Nash equilibrium of this game is However, payoffs of the Nash action profile are Pareto-dominated by outcome
Players are rational, but they have limited reasoning resources, which makes distinguishing games costly. Thus, players may partition Γ into subsets of games they do not distinguish. These subsets are called analogy classes. The collection of player i’s analogy classes is referred to as i’s partition of Γ, and it is denoted by With only two games in Γ, player i can either distinguish the games and have the finest partition or not distinguish the games and have the coarsest partition .
In this section, it is assumed that each player is endowed with an exogenously given partition Π and both players partition the game set Γ in the same way. Thus, whenever both players do not distinguish the games, they perceive that they are playing an “average” game
with the following payoff matrix:
Partition Π divides the histories of the dynamic game into equivalence classes, and players who partition Γ according to Π can condition their continuation play only on these equivalence classes. This implies that, in all respects, the fixed partition game is a standard dynamic game, with the exception that the strategies available to players are restricted to some class.
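The construction of the “average” game perceived under the coarsest partition can be sketched as a probability-weighted average of the stage game payoffs. The snippet below is only an illustration: the payoff tables `G1` and `G2`, the two-action space and the equal draw probabilities are hypothetical and are not the matrices of the example in this section.

```python
from itertools import product


def average_game(games, probs):
    """Probability-weighted average of payoff tables with a shared action space.

    games: list of dicts mapping (row_action, col_action) -> (payoff_1, payoff_2)
    probs: probability with which each game is drawn (must sum to one)
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return {
        profile: tuple(
            sum(p * g[profile][i] for g, p in zip(games, probs)) for i in range(2)
        )
        for profile in games[0]
    }


def pure_nash(game, actions):
    """Pure-strategy Nash equilibria of a two-player simultaneous move game."""
    equilibria = []
    for r, c in product(actions, repeat=2):
        row_ok = all(game[(r, c)][0] >= game[(d, c)][0] for d in actions)
        col_ok = all(game[(r, c)][1] >= game[(r, d)][1] for d in actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Hypothetical payoffs chosen for illustration only (assumption, not from the paper).
ACTIONS = ["A", "B"]
G1 = {("A", "A"): (2, 2), ("A", "B"): (0, 0), ("B", "A"): (0, 0), ("B", "B"): (4, 4)}
G2 = {("A", "A"): (1, 1), ("A", "B"): (5, 0), ("B", "A"): (0, 5), ("B", "B"): (4, 4)}

AVG = average_game([G1, G2], [0.5, 0.5])
print(pure_nash(G2, ACTIONS))   # equilibria when the second game is distinguished
print(pure_nash(AVG, ACTIONS))  # equilibria of the perceived average game
```

In this hypothetical pair, the profile (B, B) is not an equilibrium of `G2` taken alone, yet it is an equilibrium of the averaged game, which is the sense in which not distinguishing the games can change the set of outcomes players are willing to play.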
Let
denote the expected average per period payoff of player
i when players partition Γ according to
and follow strategy profile
where
is a sequence of stage game payoffs associated with a realized outcome path and expectation is taken with respect to the measure over outcomes induced by
s and the stochastic process governing the draws of the games. While playing games from
player
i aims to maximize
In the current setting, the idea of sequential rationality is captured by the notion of subgame perfect equilibrium. Let denote the set of subgame perfect equilibrium payoffs in the fixed partition game, where all players partition Γ according to Π. The aim of this section is to compare to .
Below, it is demonstrated that, in the finitely repeated setting, coarse partitioning of Γ can introduce new equilibrium payoffs, and thus, is not necessarily included in Moreover, both players may be strictly better off in the best subgame perfect equilibrium when they partition Γ according to , as compared to the best equilibrium with partition .
Suppose players are endowed with partition . Our interest lies in finding the highest subgame perfect equilibrium payoff of the induced dynamic game. By the logic of backwards induction, in the second period, a Nash equilibrium should be played in any realized stage game. However, since game has more than one Nash equilibrium, it is possible to condition the second period play on the outcome of . This could potentially allow the construction of inter-temporal incentives to support non-Nash outcomes in the first period. Nevertheless, here, flexibility in the second period choice of Nash equilibrium in does not suffice to sustain Pareto-efficient play in A Pareto-efficient outcome in involves playing either or if is realized and playing if is drawn. Suppose that in game is realized, and consider the interim incentive of player i to comply with Pareto-efficient play of B in . If player i deviates and plays A instead, he improves his immediate payoff by three. This deviation triggers a punishment in which costs one if is drawn and nothing, otherwise. Hence, the expected magnitude of the future punishment is which is less than three, the myopic incentive to deviate.
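The incentive comparison in the preceding paragraph can be checked numerically. The sketch below uses only the figures stated in the text (a myopic gain of three, a punishment that costs one in one game and nothing in the other); the draw probability `q` of the game in which the punishment bites is treated as a free parameter, and the function names are hypothetical.

```python
def expected_punishment(cost_by_game, probs):
    """Expected second-period loss from being punished, over the game draw."""
    return sum(p * c for p, c in zip(probs, cost_by_game))


def deviation_is_deterred(gain, cost_by_game, probs):
    """True if the expected future punishment is at least the immediate gain."""
    return expected_punishment(cost_by_game, probs) >= gain


GAIN = 3          # immediate gain from deviating, as stated in the text
COSTS = [1, 0]    # punishment costs one in one game and nothing in the other

# Sweep the (assumed) probability q of the game in which punishment is costly.
results = [
    deviation_is_deterred(GAIN, COSTS, [q / 10, 1 - q / 10]) for q in range(11)
]
print(any(results))  # False: the punishment never outweighs the gain
```

Because the punishment is worth at most one even when the relevant game is drawn with certainty, no draw probability can deter a deviation that gains three.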
Since in Pareto-efficient play is impossible, the average expected payoff in the best subgame perfect equilibrium is four. This payoff is attained by playing one of the "good" Nash equilibria, or whenever game is drawn and playing the only Nash equilibrium, whenever game is drawn.
Suppose players are endowed with partition
. Then, it is as if in each
t, they are playing the average game
which has two Nash equilibria:
and
However, the Pareto-efficient outcome is
This outcome can be sustained as a subgame perfect equilibrium outcome in
by the following strategy for
Play B in if is played in the first period, play C in otherwise, play A in
This strategy prescribes playing a Nash equilibrium in
in the subgames after a deviation, as well as in the subgame where there was no deviation. In
the best deviation from
B yields an immediate gain of
, but triggers reversion to the unfavorable Nash equilibrium in
which costs
Hence, player
i finds it optimal to follow the prescribed strategy,
, in
Thus, when players do not distinguish games, the best average expected payoff is:
which exceeds four, the best expected payoff attainable when players partition the set of games according to
.
3. Partition Selection
The aim of this section is to identify partitions that are viable under evolutionary pressure. To this end, a notion distantly related to an evolutionarily stable strategy (ESS) is deployed.
A strategy is evolutionarily stable if there exists an ϵ̄ > 0 such that for all ϵ ∈ (0, ϵ̄), the population playing the native strategy can resist any invasion of ϵ mutants. This definition is equivalent to requiring the native strategy to be a best response to itself, as well as to satisfy an additional stability condition. Thus, the ESS notion is a refinement of a symmetric Nash equilibrium.
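For pure strategies in a symmetric two-player matrix game, the two conditions just mentioned (best response to itself, plus the stability condition against alternative best responses) can be sketched as follows. This is the textbook ESS check, not the partition deviation stability notion introduced later in this section, and the payoff matrix is a hypothetical prisoner's dilemma used only for illustration.

```python
def is_pure_ess(u, s):
    """Check whether pure strategy s is an ESS of a symmetric matrix game.

    u[i][j] is the payoff to a player using strategy i against strategy j.
    """
    for m in range(len(u)):
        if m == s:
            continue
        # Equilibrium condition: the mutant must not do strictly better against s.
        if u[m][s] > u[s][s]:
            return False
        # Stability condition: if the mutant ties against s, the native strategy
        # must do strictly better against the mutant than the mutant does.
        if u[m][s] == u[s][s] and u[m][m] >= u[s][m]:
            return False
    return True


# Hypothetical prisoner's dilemma: strategy 0 = cooperate, strategy 1 = defect.
PD = [[3, 0],
      [5, 1]]

print(is_pure_ess(PD, 1))  # True: defection resists invasion by cooperators
print(is_pure_ess(PD, 0))  # False: cooperation is not a best response to itself
```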
In the present setting, by assumption, the evolutionary pressure operates only at the level of partitioning of Γ. Hence, the interest lies in identifying the evolutionarily stable partitions. However, the viability of a partition depends on the strategies the population and the invading mutants subsequently follow in the dynamic game, and it is difficult to separate the evolutionary selection of viable partitions of Γ from the selection of strategies in the dynamic game.
Perhaps an obvious way to define an evolutionarily stable partition of Γ is to consider the following two-stage game. In the first stage, players commit to a partition of Γ and their choice becomes public knowledge. In the second stage, players play a subgame perfect equilibrium of the dynamic game induced by the chosen partition. An ESS could be defined as a partition profile that constitutes a symmetric (and stable) Nash equilibrium in the first stage of the game. The equilibrium partitions depend on the continuation payoffs of players, and a strategy of a player has to specify the course of actions for every, including out-of-equilibrium, selection of the partition. It could be assumed that after a deviation in the partition selection stage, the play of the dynamic game proceeds to the subgame perfect equilibrium with the lowest payoff for the deviator. This setting resembles the commitment games considered by [4].
However, it is somewhat unnatural to assume that the choice of partition is public. If the chosen partition remains private, the only proper subgame of the two-stage game is the game itself and subgame perfection has no bite. In this case, the appropriate equilibrium concept needs to specify the beliefs players hold in every period.
To abstract from the issues related to learning of opponents’ partitions, as well as to avoid the evolutionary selection at the level of the dynamic game strategies, the following simplifying assumptions are made. The evolutionarily viable strategy is defined as a tuple consisting of a partition, Π, and a subgame perfect equilibrium strategy, s, such that, when agents in the population partition Γ according to Π and, in the induced game, play s, no mutant with a different partition can do better than incumbents in a pairing with incumbents. While playing the induced game, agents from the population believe that with probability 1 all other agents partition Γ according to Π and play strategy s.
The standard definition of ESS applies to symmetric strategies. Here, the symmetry requirement is imposed on the way agents partition Γ, but not on the strategy s. Hence, for a strategy (Π, s) to be evolutionarily stable, it is necessary that a mutant in the role of any player cannot do better than an agent from the population in the same role.
In the partition selection meta-game, the overall utility of a player
where
with
and
takes into account the payoffs this player derives from playing the dynamic game induced by partition profile Π,
as well as the cost of sustaining
. It is assumed that
is lexicographic, firstly increasing in the payoffs derived from playing the games from Γ and, secondly, decreasing in the cardinality of the partition.
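The lexicographic ordering described above (first the payoff from playing the games in Γ, then a preference for fewer analogy classes) can be sketched with tuple comparison. The function name `meta_utility`, the game labels and the payoff values are hypothetical and serve only to illustrate the ordering.

```python
def meta_utility(game_payoff, partition):
    """Lexicographic utility in the partition selection meta-game.

    Tuples compare element by element, so a higher payoff from the dynamic
    game always wins, and ties are broken in favour of the partition with
    fewer analogy classes (lower cardinality).
    """
    return (game_payoff, -len(partition))


# Hypothetical labels for the two games in the game set.
COARSE = [frozenset({"G1", "G2"})]             # games are not distinguished
FINE = [frozenset({"G1"}), frozenset({"G2"})]  # games are distinguished

# Equal payoffs: the coarser (cheaper) partition is preferred.
print(meta_utility(4.0, COARSE) > meta_utility(4.0, FINE))   # True
# A strictly higher payoff dominates any difference in cardinality.
print(meta_utility(4.5, FINE) > meta_utility(4.0, COARSE))   # True
```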
Definition. In the partition selection game, strategy profile is partition deviation stable if and only if is a subgame perfect equilibrium in the dynamic game induced by where Π
is such that both players partition Γ
in the same way, i.e., and for each player :
or:
and:
for all alternative strategies of i,
Section 2 demonstrates that players endowed with coarse partition
could be better off than players who partition Γ more finely. It turns out that a population of agents endowed with a coarse partition might be immune to one-off invasion by mutants with finer partitions.
Consider the partition selection meta-game and suppose that in this game, strategy
is partition deviation stable. Strategy
prescribes choosing
and, in the induced dynamic game, playing according to
defined in the previous section. Suppose player
i is a mutant who can distinguish the games. By assumption, this player should behave optimally at every stage of the dynamic game. If in
this player does not exploit his superior information and mimics the play prescribed by
, in
he has no profitable deviation either as
is a Nash equilibrium strategy profile in both games. Thus, for the deviation to be profitable, it has to involve a deviation from the play prescribed by
in
In
the best response to
B is to play
B in
and to play
A in
. In
the best response to
C is
C in both games, while the best response to
A is to play
A in
and to play
C in
. Thus, mutant
i, who responds optimally to
in every stage of the game and does not mimic the play prescribed by
, obtains the expected payoff of:
from the repeated play, which is less than
the payoff from following
. Thus, a mutant with a finer partition obtains lower payoff than the population, and the strategy that prescribes not distinguishing the games and then playing according to
is partition deviation stable in the partition selection meta-game.
The example indicates that a necessary condition for a strategy profile to be partition deviation stable is that, in the last period on the equilibrium path, prescribes playing a Nash equilibrium in every game, even if games are not distinguished. Otherwise, mutant i with a finer partition definitely has a profitable deviation in
4. Final Remarks
This paper studies the interaction of substantially rational agents who are involved in a repeated play of normal form games drawn from some fixed family. Reasoning resources are assumed to be costly, and hence, players do not necessarily distinguish all games. The primary interest lies in identifying equilibrium payoffs that are consistent with the evolutionary pressure that shapes the constraints on the reasoning of agents in the long run.
In this setting, the most striking result is that, when games are played a finite number of times, coarse partitioning of the game set might shift outwards the Pareto frontier of the dynamic game, thus introducing new symmetric equilibrium payoffs that Pareto dominate the best equilibrium outcomes with distinguished games. This implies that coarse partitioning of the game set eliminates an unnecessary waste of reasoning resources as well as commits players to act in their common interest. Given this, it does not come as a surprise that the new equilibrium payoffs could be immune to evolutionary pressure at the partition selection stage.
Deriving the general conditions under which coarse partitioning of the game set is not only beneficial for society as a whole, but also stable in an evolutionary sense is a challenging task. The necessary conditions can be broadly summarized as follows. In order to generate a Pareto improvement through coarse partitioning, it is necessary that in some games, the Pareto-efficient action profile is not a Nash equilibrium. Furthermore, in these games, it should be impossible to provide inter-temporal incentives for achieving Pareto efficiency in early periods when games are distinguished. In contrast, coarse partitioning should provide the flexibility in the choice of future Nash equilibria that is sufficient for sustaining Pareto-efficient play in early periods. Finally, a necessary condition for coarse partitioning to be evolutionarily stable is that, in the last period on the equilibrium path, players play a Nash equilibrium in every game, even if games are not distinguished. If this last condition does not hold, a mutant with a finer partition can improve his last period payoff relative to the payoff of the population.