1. Introduction
This paper generalizes quantal response equilibrium (QRE) using the standard Nash equilibrium concept via the class of concave perturbed utility games (PUGs). QRE, as studied in [1,2], assumes that individuals maximize expected utility, but that there is an unobserved additive random shock to a base utility index that drives deviations from Nash equilibrium behavior. In contrast, PUGs specify a deterministic non-expected utility function with a preference for randomization that is additively separable from a base utility index. Here, the preference for randomization can be viewed as a reduced form of limited attention. While these approaches seem different, any QRE can be modeled as a Nash equilibrium of a concave perturbed utility game. PUGs also generalize the control costs approach of van Damme [6,7].
Using the Nash equilibrium concept with the more general concave PUGs is useful compared to QRE since it is often difficult to select a tractable distribution of additive random shocks. PUGs facilitate estimation of model parameters via standard maximum likelihood methods (without numerical integration). Moreover, PUGs allow complementarity between strategies, which is not allowed in any QRE. PUGs can also easily incorporate the possibility that the probability an individual chooses a strategy as a best response depends on how likely opponents are to play their strategies. Importantly, any concave PUG has a Nash equilibrium as an immediate corollary of Debreu [8]. This existence result allows us to generate different flavors of the logit best response, derive a nested logit best response function, and discuss quadratic perturbed utility games.
The result that QRE can be represented as Nash equilibria of a non-expected utility game matters for interpretation and has practical implications. Concerning interpretation, violations of standard Nash play can be interpreted either as errors of individual perception or as the manifestation of a non-expected utility preference. These two interpretations have different implications for welfare analysis. For example, one may not want perceptual errors to enter welfare calculations but may want to include all facets of a non-expected utility preference in welfare calculations. On the practical side, it may be easier to place restrictions on perturbation functions than to restrict the additive error term of QRE, since analytical results for integrating error distributions are known only in some special cases. In contrast, rich classes of analytical perturbation functions are studied in [9,10]. These perturbation functions can be seen as generalizations of the entropy function that is often used to study limited attention.
We give a non-exhaustive summary of related work. A perturbed utility game is similar to the control costs approach developed in [6,7] whereby there is a cost associated with the ability to control a “tremble.” Control costs are assumed to be additively separable, whereas perturbed utility games have costs that are nonseparable and can depend on the opponents’ strategies. Rosenthal [11] considered a related approach of bounded rationality where the probability of choosing an action is monotone with utility. Voorneveld [12] showed that the solution concept in [11] can be represented with quadratic control costs or a quantal response equilibrium with uniform errors. Stahl [13] looked at a game with trembles and an entropic cost to control the trembles. Mattsson and Weibull [14] gave axioms for entropic control functions for individual decisions and dealt with a continuum of alternatives. With regard to QRE, the regular quantal response equilibrium of Goeree et al. [15] places additional assumptions on QRE, and a textbook treatment of QRE is available in [16]. We also note that recent work by Melo [18] examines uniqueness properties of quantal response equilibrium while using a perturbed utility game as an intermediate step.
We also briefly describe some history regarding non-expected utility games and QRE. Shortly after the study of individual non-expected utility functions in [19], there was some interest in studying non-expected utility in strategic games following [20,21]. For example, equilibrium concepts, existence of equilibria, and dynamic consistency properties when individuals do not satisfy the independence axiom are studied in [22,23]. However, little applied work resulted from this research. In contrast, quantal response equilibrium was developed a few years later by the authors of [1,2], and has been extensively used in applications to account for deviations from Nash equilibrium. By linking these two approaches, we show applied researchers that Nash equilibrium with non-expected utility preferences provides a rich and tractable avenue to account for deviations from Nash equilibrium with expected utility preferences.
Finally, PUGs model individuals with a deterministic preference for randomization following Machina [24] and relate more broadly to the stochastic choice literature. We say an individual has a preference for randomization when the individual plays the game as if they randomize their play according to a most-preferred distribution of pure strategies. We assume throughout that individuals commit to randomize according to their most-preferred distribution. While a preference for randomization may seem foreign, there is growing experimental evidence that supports this interpretation [34,35,36,37,38,39,40,41]. The most important finding for our purposes is that of Agranov et al. [41], who found that individuals who randomize choices in individual decision problems also randomize their choices in strategic environments. Therefore, it may be important to account for a preference for randomization in games.
The rest of this paper is organized as follows. Section 2 describes the structure of a concave perturbed utility game and shows existence of equilibria for all such games. Section 3 derives the logit best response function from a concave perturbed utility game with entropy costs and discusses various flavors of the logit best response. Section 4 examines other forms of concave perturbed utility games. In particular, we examine nested logit equilibria and discuss quadratic perturbed utility games. Section 5 provides an example game and graphs best responses for various perturbed utility functions. We give our final remarks in Section 6. The proofs are mathematically simple, but are included in Appendix A for completeness.
2. Concave Perturbed Utility Games and Existence
Consider a finite N-player game with a set of players $\mathcal{N} = \{1, \ldots, N\}$. Each player $n \in \mathcal{N}$ has a set of pure strategies given by $S_n = \{s_{n,1}, \ldots, s_{n,J_n}\}$ where $J_n$ is the number of pure strategies for the nth player. Let a pure strategy profile be defined by $s = (s_1, \ldots, s_N) \in S$ where S is the set of all pure strategy profiles. Occasionally, it is useful to represent a strategy profile by $s = (s_n, s_{-n})$ where $s_n$ is the strategy of the nth player while $s_{-n}$ contains the pure strategies taken by all other players where $s_{-n} \in S_{-n} = \times_{m \neq n} S_m$.
Let $\Delta_n$ be the set of probability measures on $S_n$. We represent elements of $\Delta_n$ by $p_n = (p_{n,1}, \ldots, p_{n,J_n})$. The element $p_n$ is a mixed strategy for the nth player. Here, $p_{n,j}$ is the probability the nth agent plays their jth strategy $s_{n,j}$. Further, let all mixed strategy profiles be given by $\Delta = \times_{n \in \mathcal{N}} \Delta_n$ where a mixed strategy profile is given by $p = (p_1, \ldots, p_N)$. We use the shorthand $(p_n', p_{-n})$ to denote a mixed strategy profile where the nth player plays the mixed strategy $p_n'$ and all other agents play their corresponding mixed strategy in $p$.
We assume there are observed outcomes for each player $n \in \mathcal{N}$ that depend on the strategy profile $s \in S$ denoted by $x_n(s) \in X_n$. For example, $x_n(s) \in \mathbb{R}$ could be monetary outcomes that depend on the strategies of all players. If an individual has social preferences and cares about other individuals, then $x_n(s) \in \mathbb{R}^N$ could be the monetary allocations to all individuals. This means that a motive for cooperation can be modeled in this framework. Finally, $x_n(s)$ could be a consumption bundle. For example, if each pure strategy is to bring an item to a picnic, then $X_n$ could be the space of ordered tuples of consumption goods. Thus, we view the outcome of a game as covariates, similar to [42,43,44].
We let $x_n = (x_n(s))_{s \in S}$ denote the vector of the nth player’s outcomes for all strategy profiles. Let $x = (x_1, \ldots, x_N)$ denote the collection of all individual observable outcomes. We use the notation $x_{n,j} = (x_n(s_{n,j}, s_{-n}))_{s_{-n} \in S_{-n}}$ to denote the vector of outcomes the nth player can obtain for all combinations of opponent pure strategies when their jth strategy is played. The different values of the outcomes are mapped into a utility index that depends on opponent mixed strategies.
We consider non-expected utility preferences for each player $n \in \mathcal{N}$ given by the class of concave perturbed utility preferences. In particular, the nth player has preferences represented by the non-expected utility function given by
$$\sum_{j=1}^{J_n} p_{n,j}\, U_{n,j}(x_{n,j}, p_{-n}) + D_n(p_n, p_{-n}).$$
Here, $U_{n,j}(x_{n,j}, p_{-n})$ is a utility index that captures the attractiveness of the nth player’s jth strategy that is assumed continuous in $p_{-n}$ and depends on the outcomes associated with the nth player’s jth strategy. The function $D_n(p_n, p_{-n})$ is assumed concave in $p_n$ for every $p_{-n}$ and jointly continuous in $(p_n, p_{-n})$. We call $D_n$ the nth individual’s perturbation function. The perturbed utility approach differs from the control function approach of van Damme [7] since the attractiveness of a pure strategy can depend on the play of opponents and the preference for randomization can also depend on the play of opponents.
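The usual expected utility conditions arise when the nth individual evaluates the jth strategy with a utility index given by conditional expected utility (CEU)
$$U^{CEU}_{n,j}(x_{n,j}, p_{-n}) = \sum_{s_{-n} \in S_{-n}} p_{-n}(s_{-n})\, u_n\big(x_n(s_{n,j}, s_{-n})\big)$$
where $u_n$ is a sub-utility index that maps outcomes directly to utility numbers for the nth player. Here, $p_{-n}(s_{-n})$ is the probability of the pure strategy of all players except the nth player for a strategy $s_{-n}$ in the mixed strategy $p_{-n}$, so $p_{-n}(s_{-n}) = \prod_{m \neq n} p_m(s_m)$, where $p_m(s_m)$ is the probability that $p_m$ assigns to $s_m$. However, the utility indices can be more general. For example, the utility index can depend on actions and outcomes of other players as in [45]. When there are monetary outcomes so $x_n(s) \in \mathbb{R}$, an individual can have a probability weighting function [46] or use rank-dependent utility as studied in [47]. Here, an individual weights probabilities of receiving monetary outcomes by a function $\pi_n : [0,1] \to [0,1]$. For example, a probability weighting (PW) utility index is expressed
$$U^{PW}_{n,j}(x_{n,j}, p_{-n}) = \sum_{z} \pi_n\Big(\sum_{s_{-n} \in S_{-n}} p_{-n}(s_{-n})\, 1\{x_n(s_{n,j}, s_{-n}) = z\}\Big)\, v_n(z)$$
where the outer sum runs over the distinct monetary outcomes $z$ in $x_{n,j}$, $v_n$ is a utility function over money, and $1\{x_n(s_{n,j}, s_{-n}) = z\}$ is one when $x_n(s_{n,j}, s_{-n}) = z$ and zero otherwise.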
For the nth player, let $U_n = (U_{n,1}, \ldots, U_{n,J_n})$ be the vector of the utility index functions for all pure strategies. Let $U = (U_1, \ldots, U_N)$ be the vector of all utility indices for all players. We also let the collection of all perturbation functions be given by $D = (D_1, \ldots, D_N)$. When all individuals have concave perturbed utility preferences, we call this a concave perturbed utility game.
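As a minimal sketch of how these objects fit together, the following Python snippet evaluates a conditional expected utility index and the resulting perturbed utility for one player in a two-player game. The function names (`ceu_index`, `perturbed_utility`, `zero_perturbation`) and the payoff numbers are hypothetical and chosen only for illustration; the perturbation is passed in as an arbitrary function of own and opponent mixed strategies.

```python
import numpy as np

def ceu_index(payoffs, q):
    """Conditional expected utility of each own pure strategy.

    payoffs[j, k] is the sub-utility of own strategy j against opponent
    strategy k; q is the opponent's mixed strategy.
    """
    return np.asarray(payoffs, dtype=float) @ np.asarray(q, dtype=float)

def perturbed_utility(p, q, payoffs, perturbation):
    """Utility-index term (linear in p) plus the perturbation D(p, q)."""
    p = np.asarray(p, dtype=float)
    return float(p @ ceu_index(payoffs, q) + perturbation(p, q))

def zero_perturbation(p, q):
    # D = 0 recovers ordinary expected utility.
    return 0.0

# Hypothetical 2x2 payoffs for the row player (illustration only).
payoffs = np.array([[3.0, 0.0],
                    [1.0, 2.0]])
q = np.array([0.5, 0.5])    # opponent's mixed strategy
p = np.array([0.25, 0.75])  # own mixed strategy

index = ceu_index(payoffs, q)                                # [1.5, 1.5]
value = perturbed_utility(p, q, payoffs, zero_perturbation)  # 1.5
```

Replacing `zero_perturbation` with any function that is concave in `p` gives a concave perturbed utility preference in the sense used here.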
Definition 1. A concave perturbed utility game is the tuple $(\mathcal{N}, S, x, U, D)$ of players with perturbed utility preferences where pure strategies are in S, outcomes are defined by x, utility indices are defined by U, and concave perturbation functions are defined by D.
This setup nests expected utility when for every player $n \in \mathcal{N}$ the perturbation function satisfies $D_n(p_n, p_{-n}) = 0$ and the utility index for every pure strategy is given by the conditional expected utility (CEU)
$$U^{CEU}_{n,j}(x_{n,j}, p_{-n}) = \sum_{s_{-n} \in S_{-n}} p_{-n}(s_{-n})\, u_n\big(x_n(s_{n,j}, s_{-n})\big)$$
where $u_n$ is a sub-utility index that maps outcomes directly to utility numbers for the nth individual. We later show that Nash equilibria exist for any continuous utility indices. Recall that utility indices can depend on the mixed strategies of opponents. Thus, concave perturbed utility games can apply the Nash equilibrium concept beyond the conditional expected utility restriction that has been common following Nash [20] and Von Neumann and Morgenstern [21].
We focus on the standard definition of Nash equilibrium when studying concave perturbed utility games.
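Definition 2. A mixed strategy profile $p^* \in \Delta$ is a Nash equilibrium of a concave perturbed utility game if for all $n \in \mathcal{N}$ it holds that
$$\sum_{j=1}^{J_n} p^*_{n,j}\, U_{n,j}(x_{n,j}, p^*_{-n}) + D_n(p^*_n, p^*_{-n}) \geq \sum_{j=1}^{J_n} p_{n,j}\, U_{n,j}(x_{n,j}, p^*_{-n}) + D_n(p_n, p^*_{-n})$$
for all $p_n \in \Delta_n$.
The definition of Nash equilibrium requires that mixed strategies be a best response. The above definition is exactly this condition translated to a concave perturbed utility game. We now state that Nash equilibria exist for every concave perturbed utility game.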
Corollary 1 (Existence). For every concave perturbed utility game there exists a Nash equilibrium.
The above result is an immediate corollary of the main theorem in [8]. The result of Debreu [8] was also used by Crawford [22] to show a Nash equilibrium exists for any concave and jointly continuous non-expected utility function. We view the contribution of this paper as providing a tractable model that can be taken to data. In addition, the model can separate how an individual values outcomes from playing a particular pure strategy and the preference for randomization. As we show in the next section, this class of models can produce the logit best response function that is popular in applied work following quantal response equilibria of McKelvey and Palfrey [1]. Thus, it is a natural springboard to explore other forms of strategic behavior.
We note that Nash equilibria may not exist when $D_n(p_n, p_{-n})$ is not concave in $p_n$ for every $p_{-n}$. Crawford [22] provided one example of a game with quasi-convex non-expected utility preferences that has no Nash equilibrium. While we focus on concave perturbed utility games, there are equilibrium concepts which can be used for non-concave perturbed utility games and more generally any non-expected utility preference. In particular, Crawford [22] defined a notion of equilibrium in beliefs for any non-expected utility game and showed existence of equilibrium without requiring the concavity assumption.
3. Entropy Perturbations and Logit Best Response
We now consider a concave perturbed utility game when each player has an entropy perturbation function. For this game, every player $n \in \mathcal{N}$ has a perturbation function of Shannon entropy so that
$$D_n(p_n, p_{-n}) = -\eta_n \sum_{j=1}^{J_n} p_{n,j} \ln p_{n,j}$$
with $\eta_n > 0$. Here, $p_{n,j} \ln p_{n,j}$ is 0 when $p_{n,j} = 0$. Stahl [13] used Shannon entropy in a control function approach with trembles. Cominetti et al. [49] studied how the limit of certain learning procedures can be represented with entropy costs. Outside of game theory, the Shannon entropy function is used extensively in discrete choice analysis [50], information economics [51], to motivate games with learning [52], and for route choice [53]. The function $D_n$ is concave and continuous, and thus Nash equilibria exist. When all individuals have entropy perturbations in a concave perturbed utility game, we call it an entropy perturbed utility game. Below, we characterize the best response function of individuals for entropy perturbed utility games. This result is mathematically straightforward and similar computations are found in [13,50,52].
Proposition 1. The best response function of the nth agent in an entropy perturbed utility game is given by
$$p_{n,j}(p_{-n}) = \frac{\exp\big(U_{n,j}(x_{n,j}, p_{-n}) / \eta_n\big)}{\sum_{k=1}^{J_n} \exp\big(U_{n,k}(x_{n,k}, p_{-n}) / \eta_n\big)}$$
where $\eta_n > 0$ is the weight on the entropy term and $j \in \{1, \ldots, J_n\}$.
When the utility index takes the conditional expected utility form $U^{CEU}_{n,j}$, the best response function in Proposition 1 is the same as that from logit equilibrium in [1] when all $\eta_n$ take the same value. The Nash equilibria thus have the same comparative statics as quantal response equilibria with respect to the $\eta_n$ term. For example, as $\eta_n \to \infty$, an individual will uniformly randomize among all of their pure strategies. When $\eta_n = 0$ for all individuals, the perturbation vanishes and we return to the standard Nash equilibria for a normal form game with expected utility preferences when utility indices follow $U^{CEU}_{n,j}$. A convenient feature of representing the logit equilibrium in this format is that it bypasses integrating over a distribution of random shocks. Instead, this best response is found by solving a constrained optimization problem.
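The constrained optimization view can be checked numerically. The sketch below, with hypothetical names and utility values, computes the logit (softmax) response for a fixed vector of utility indices and compares its entropy-perturbed objective with the closed-form optimum and with randomly drawn mixed strategies. It assumes the entropy term enters with a positive scale `eta`, and is an illustration rather than the authors' code.

```python
import numpy as np

def logit_best_response(u, eta):
    """Maximizer of p @ u - eta * sum(p * log p) over the simplex (eta > 0)."""
    z = np.asarray(u, dtype=float) / eta
    z -= z.max()                      # stabilize the exponentials
    w = np.exp(z)
    return w / w.sum()

def entropy_objective(p, u, eta):
    """Utility-index term plus the scaled Shannon entropy perturbation."""
    p = np.asarray(p, dtype=float)
    safe = np.where(p > 0, p, 1.0)    # convention: p * log(p) = 0 at p = 0
    return float(p @ u - eta * np.sum(p * np.log(safe)))

u = np.array([1.0, 2.0, 0.5])         # hypothetical utility indices
eta = 0.7                             # hypothetical entropy weight
p_star = logit_best_response(u, eta)

# Compare against random points of the simplex; none should do better.
rng = np.random.default_rng(0)
candidates = rng.dirichlet(np.ones(3), size=2000)
best_random = max(entropy_objective(p, u, eta) for p in candidates)
optimum = entropy_objective(p_star, u, eta)
```

At the optimum the objective equals `eta * log(sum(exp(u / eta)))`, which gives a quick consistency check that requires no numerical integration.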
Variations of Entropy Perturbation
There are several variations of the logit best response function that are similar to logit quantal response equilibria. First, one could consider different utility indices $U_{n,j}$. For example, when the utility is over money, one could use rank-dependent preferences [47]. Alternatively, one could use the geometric mean rather than the arithmetic mean to aggregate probabilities and the outcome into a utility index.
Variations of the logit best response can also occur by introducing unique weighting terms for each pure strategy. For example, one can consider the class of weighted entropy (WE) perturbed utility games where each individual has a continuous weighting function $w_{n,j}(p_{-n}) \geq 0$ and the perturbation function takes the form
$$D^{WE}_n(p_n, p_{-n}) = -\sum_{j=1}^{J_n} w_{n,j}(p_{-n})\, p_{n,j} \ln p_{n,j}.$$
Here, $w_{n,j}(p_{-n})$ weights how desirable it is for the nth player to play strategy j when opponents play $p_{-n}$. Note that $-p_{n,j} \ln p_{n,j} \geq 0$ since $p_{n,j} \in [0,1]$. Thus, a higher weight means the player can potentially obtain more utility by choosing this strategy with a probability close to $1/e$, i.e., the value that maximizes $-p_{n,j} \ln p_{n,j}$. Since the weights are all nonnegative, the perturbation is concave and equilibria exist by Corollary 1.
We consider some examples of weighting functions. Suppose an individual has ex-ante beliefs about opponent play given by $\hat{p}_{-n}$. One example of a continuous weighting function that is everywhere nonnegative is
$$w_{n,j}(p_{-n}) = \alpha_{n,j}\, d_{n,j}(p_{-n}, \hat{p}_{-n})$$
where $d_{n,j}$ is a jointly continuous distance function for the nth player and jth strategy and $\alpha_{n,j} \geq 0$ describes the weight of the discrepancy. This means that an individual only has a preference to randomize when opponents play strategies that differ from the player’s beliefs. Another natural weighting function is the constant $w_{n,j}(p_{-n}) = \eta_{n,j} \geq 0$. In this case, the weighting function on the randomization term for the nth player’s jth strategy does not depend on what others are playing.
Lastly, we can specialize so that $w_{n,j}(p_{-n}) = w_n(p_{-n})$ for every $j$ and for all $p_{-n}$. This makes the preference for randomization symmetric in own-probabilities. One example is
When $w_n$ does not depend on j, the best response has a simple analytic form following the logit equilibria, except that the desire to randomize depends on the probability opponents play various strategies.
Proposition 2. Suppose for every $j$ and for all $p_{-n}$ that $w_{n,j}(p_{-n}) = w_n(p_{-n})$ and $w_n(p_{-n}) > 0$. The best response function of the nth agent in a weighted entropy perturbed utility game is given by
$$p_{n,j}(p_{-n}) = \frac{\exp\big(U_{n,j}(x_{n,j}, p_{-n}) / w_n(p_{-n})\big)}{\sum_{k=1}^{J_n} \exp\big(U_{n,k}(x_{n,k}, p_{-n}) / w_n(p_{-n})\big)}$$
where $j \in \{1, \ldots, J_n\}$.
As mentioned above, related mathematical results are well-known. We highlight this as a proposition due to its conceptual novelty; to the best of our knowledge, a logit best response where weights depend on opponents’ mixed strategies has not been studied. However, it seems intuitively sensible. For example, an individual may have a higher desire to randomize when other individuals choose disparate pure strategies with low probability.
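To illustrate a logit response whose weight depends on opponents' play, the sketch below uses a purely hypothetical weighting function, a constant plus the Shannon entropy of the opponent's mixed strategy, so that the desire to randomize is higher when opponent play is more dispersed. The weight specification, function names, and numbers are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def opponent_entropy_weight(q, base=0.2, scale=1.0):
    """Hypothetical weight: base + scale * (Shannon entropy of opponent play)."""
    q = np.asarray(q, dtype=float)
    safe = np.where(q > 0, q, 1.0)    # convention: q * log(q) = 0 at q = 0
    return base - scale * float(np.sum(q * np.log(safe)))

def weighted_logit_best_response(u, q, weight=opponent_entropy_weight):
    """Logit response with a common weight that depends on opponent play q."""
    w = weight(q)                     # strictly positive by construction
    z = np.asarray(u, dtype=float) / w
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

# Same utility indices, two different opponent mixed strategies.
u = np.array([1.0, 0.0])
br_concentrated = weighted_logit_best_response(u, [1.0, 0.0])  # weight 0.2
br_dispersed = weighted_logit_best_response(u, [0.5, 0.5])     # weight 0.2 + ln 2
```

Holding the utility indices fixed isolates the effect of the weight: the response to dispersed opponent play is closer to uniform than the response to concentrated play.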
4. Other Perturbed Utility Games
While the analysis above shows how entropy perturbation functions are related to logit quantal response equilibria, there are other games of interest. In particular, one of the features of concave perturbed utility games is that the desirability of mixing with one strategy can depend on opponents’ probabilistic play. We consider two types of games that have this feature.
4.1. Mixed Entropy Perturbations and Nested Logit Best Response
Here, we derive a nested logit best response using a mixed entropy perturbation function. The nested logit model of discrete choice is treated at a textbook level in [54]. The main idea of nested logit models in discrete choice is that there are alternatives that share similar qualities. For example, when choosing among a car, a red bus, and a blue bus, one might group the two buses together into a nest. This kind of similarity is also plausible for strategies in games. For example, consider a prisoner’s dilemma game with two players where each player can say nothing, deny involvement, or accuse the other player of the crime. Here, a natural partition of actions is to treat saying nothing and denying involvement as having the feature of loyalty to the co-conspirator, while accusing the other player of the crime has the feature of disloyalty.
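To formalize the mixed entropy perturbation function, we require a partition of the pure strategies of each agent. For every nth agent with $n \in \mathcal{N}$, we partition the pure strategies into $K_n$ sets given by $\{B_{n,1}, \ldots, B_{n,K_n}\}$. Thus, for $k \neq k'$, this means $B_{n,k} \cap B_{n,k'} = \emptyset$, for all $k$ it follows that $B_{n,k} \neq \emptyset$, and $\bigcup_{k=1}^{K_n} B_{n,k} = S_n$. We refer to the sets which define the partitions as nests. Each nest is assigned a weight $\lambda_{n,k} \in (0,1]$.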
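We consider the mixed entropy (ME) perturbation function for each player $n \in \mathcal{N}$ given by
$$D^{ME}_n(p_n, p_{-n}) = -\sum_{k=1}^{K_n} \lambda_{n,k} \sum_{s_{n,j} \in B_{n,k}} p_{n,j} \ln p_{n,j} - \sum_{k=1}^{K_n} (1 - \lambda_{n,k}) \Big(\sum_{s_{n,j} \in B_{n,k}} p_{n,j}\Big) \ln \Big(\sum_{s_{n,j} \in B_{n,k}} p_{n,j}\Big)$$
where $B_{n,k}$ denotes the kth nest of the nth player and $\lambda_{n,k}$ its weight. The first summation term is a weighted entropy function while the second summation term is an entropy-like cost function that now depends on the probability that all items in a nest are chosen. Note that the above is a sum of functions that are all concave in $p_n$ since $\lambda_{n,k} \in (0,1]$. Thus, existence of equilibria is immediate from Corollary 1. We characterize the nested logit best response function below.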
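Proposition 3. For the nth player, let $k(j)$ be the nest associated to the jth strategy so that $B_{n,k(j)}$ is the nest that contains the strategy $s_{n,j}$ and $\lambda_{n,k(j)}$ is the corresponding nesting parameter. The best response function of the nth agent in a mixed entropy perturbed utility game is given by
$$p_{n,j}(p_{-n}) = \frac{\exp\big(U_{n,j}(x_{n,j}, p_{-n}) / \lambda_{n,k(j)}\big)}{\exp\big(I_{n,k(j)}\big)} \cdot \frac{\exp\big(\lambda_{n,k(j)} I_{n,k(j)}\big)}{\sum_{l=1}^{K_n} \exp\big(\lambda_{n,l} I_{n,l}\big)}$$
where $I_{n,k} = \ln \sum_{s_{n,i} \in B_{n,k}} \exp\big(U_{n,i}(x_{n,i}, p_{-n}) / \lambda_{n,k}\big)$.
To the best of our knowledge, nested logit equilibria have not been considered in the literature. The result above can also be further generalized to allow weighting functions that depend on the opponents’ mixed strategies. For example, consider the weighted mixed entropy (WME) perturbation function of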
where
is a continuous weighting function for the
jth strategy that does not depend on the nest and
is a continuous weighting function for the
jth strategy that depends on its nest. This is a concave perturbation function so equilibria exist by Corollary 1.
Now, consider the restrictions that the weights for every player satisfy that , for all that , and for all and that . Under these conditions, the best response function takes the same form as the nested logit best response except with weights that depend on opponents’ mixed strategies.
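For fixed utility indices and constant nesting parameters, the nested logit best response discussed above can be sketched in a few lines. The snippet assumes the standard nested logit form (a logit choice within each nest times a logit-type choice across nests) and checks it against a mixed entropy objective with nest weights in $(0,1]$. The function names, the three-strategy example, and all numbers are hypothetical.

```python
import numpy as np

def nested_logit_best_response(u, nests, lam):
    """Nested logit probabilities for fixed utility indices.

    u: utility indices of own pure strategies (opponent play held fixed).
    nests: list of index lists that partition the strategies.
    lam: nesting parameter in (0, 1] for each nest.
    """
    u = np.asarray(u, dtype=float)
    p = np.zeros_like(u)
    within, nest_terms = [], []
    for idx, l in zip(nests, lam):
        e = np.exp(u[idx] / l)
        within.append(e / e.sum())        # choice within the nest
        nest_terms.append(e.sum() ** l)   # attractiveness of the nest
    nest_prob = np.array(nest_terms) / np.sum(nest_terms)
    for idx, cond, prob in zip(nests, within, nest_prob):
        p[idx] = prob * cond
    return p

def xlogx(t):
    """t * log(t) with the convention 0 * log(0) = 0."""
    t = np.asarray(t, dtype=float)
    return t * np.log(np.where(t > 0, t, 1.0))

def mixed_entropy_objective(p, u, nests, lam):
    """Utility-index term plus the mixed entropy perturbation."""
    p = np.asarray(p, dtype=float)
    value = float(p @ np.asarray(u, dtype=float))
    for idx, l in zip(nests, lam):
        value -= l * float(np.sum(xlogx(p[idx])))         # within-nest entropy
        value -= (1.0 - l) * float(xlogx(p[idx].sum()))   # nest-level term
    return value

# Hypothetical example: strategies 0 and 1 share a nest, strategy 2 is alone.
u = np.array([1.0, 1.2, 0.8])
nests = [[0, 1], [2]]
lam = [0.5, 1.0]
p_star = nested_logit_best_response(u, nests, lam)
```

Setting every nesting parameter to one collapses the nest-level term and returns the ordinary logit response, which is a convenient sanity check.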
Proposition 4. In a weighted mixed entropy game, let the weights for every player satisfy that , for all that , and for all and that . For the nth player, let be the nest associated to the jth strategy so that is the nest that contains the strategy and is the corresponding weighting function. The best response function of the nth agent in a weighted mixed entropy perturbed utility game is given by where
4.2. Quadratic Perturbations
Finally, we consider concave quadratic perturbed utility games where the perturbation function takes the form
$$D_n(p_n, p_{-n}) = -(p_n - r_n)^\top A_n(p_{-n})\, (p_n - r_n)$$
where $A_n(\cdot)$ is continuous, $A_n(p_{-n})$ is positive semidefinite for every $p_{-n}$, and $r_n \in \Delta_n$ is a reference probability. This specification makes the utility obtained from the perturbation lower for probabilities further away from the reference probability. We know that equilibria of concave quadratic perturbed utility games exist from Corollary 1. While these games do not yield analytical solutions in general, one can quickly compute the best response with quadratic programming for each $p_{-n}$. One example of a quadratic perturbation function is the diagonal weighting (DW) perturbation function, where for all $j$ entries on the diagonal are given by $[A_n(p_{-n})]_{j,j} = w_{n,j}(p_{-n})$ where $w_{n,j}$ is a continuous weighting function and $[A_n(p_{-n})]_{j,k} = 0$ for $j \neq k$. Thus, one can model best response functions that are linear in the utility index, but non-linear in their opponents’ strategies. We also note that quadratic perturbations are flexible enough to allow individuals to express complementarities between strategies when $J_n \geq 3$ following Allen and Rehbeck [29].
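In one special case the quadratic program has a direct solution. The sketch below assumes the perturbation is the negative squared distance to the reference probability with a single positive scalar weight `a` (a diagonal weighting with equal entries), so that completing the square turns the best response into a Euclidean projection onto the simplex. The names, the sort-based projection routine, and the numbers are illustrative assumptions; a general positive semidefinite weighting matrix would require a quadratic programming solver instead.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    v = np.asarray(v, dtype=float)
    srt = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(srt) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(srt - css / k > 0)[0][-1]  # last index kept positive
    tau = css[rho] / (rho + 1)                  # common shift
    return np.maximum(v - tau, 0.0)

def quadratic_best_response(u, r, a):
    """Maximizer of p @ u - a * ||p - r||^2 over the simplex, for a > 0.

    Completing the square shows this equals the projection of r + u / (2a).
    """
    u = np.asarray(u, dtype=float)
    r = np.asarray(r, dtype=float)
    return project_to_simplex(r + u / (2.0 * a))

def quadratic_objective(p, u, r, a):
    p = np.asarray(p, dtype=float)
    return float(p @ u - a * np.sum((p - r) ** 2))

u = np.array([1.0, 0.5, 0.0])   # hypothetical utility indices
r = np.ones(3) / 3              # uniform reference probability
a = 1.0                         # hypothetical scalar weight
p_star = quadratic_best_response(u, r, a)   # [7/12, 1/3, 1/12]
```

Unlike the logit response, this best response is piecewise linear in the utility indices and can place zero probability on a strategy when the weight `a` is small.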
5. Example Game
In this section, we apply perturbation functions to a simple game. The example shows how the perturbed utility approach can allow complementarities that are not present for QRE or control costs. We consider the game in Table 1. This is the minimal setup to permit complementarity.
We consider the case where the row player is an expected utility maximizer and the column player is a perturbed utility player. We plot the best response probabilities for the column player in Figure 1 for four types of perturbation functions and two values of payoffs. Details on payoffs and perturbation functions are given in the code, available on the authors’ websites.
The left column corresponds to low payoffs for the high action of the column player and the right column corresponds to higher payoffs. The top row of Figure 1 is a quadratic perturbation function. Importantly, it allows complementarity between the medium and high strategies as the payoff to the high action increases. Here, as payoffs to the high action increase, the probability of playing the high action as a best response increases as expected, but the probability of playing the medium action also increases, indicating complementarity.
The next three rows do not allow complementarity. Figure 1c,d in the second row correspond to (quadratic) control costs, Figure 1e,f are logit perturbations, and Figure 1g,h are nested logit perturbations.
In this paper, we focus on best responses because a detailed theoretical analysis of equilibrium is beyond the scope of the paper. With that said, a simple equilibrium analysis is possible by supposing the row player has the high action as a dominant strategy. Then complementarity shows up for equilibrium comparative statics in Figure 1a,b by analogous reasoning as before. Specifically, the payoff of the high action increased, and both the medium and high probabilities increased in response.