The first experimental study of the Prisoner’s Dilemma game, published originally as a Rand Corporation Research Memorandum [1], and later, slightly abbreviated, as a journal article [2], included small financial incentives associated with the payoffs to motivate the players’ strategy choices. The payoffs in the game ranged from −1 to +2, and these were converted into pennies (US cents) after 100 repetitions of the game. This was the first published experimental game of any kind, apart from an experimental market study of Chamberlin [3]. Most of the experimental games that were published from that time until the late 1980s were performed by psychologists, and few of them included task-related financial incentives [4]. Subjects were often incentivized to participate by being offered flat-rate show-up fees or course credits, but the payoffs for which they played in the experimental games were in most cases simply notional points or imaginary money.
The growth of experimental economics in the 1970s was accompanied by the introduction of substantial task-related financial incentives in the vast majority of experiments published in economics journals, and the issue of incentives has continued to divide psychologists and economists. While experiments can be traced back to the 1940s, it might be said that the formalization of principles and “best practice” was not well articulated until the 1970s and 1980s, with Smith’s paper [6] often cited as an influential flagship contribution in this respect. An increasing number of psychologists favor incentives, and while some psychologists and economists, notably [7], have raised questions about whether they are always necessary or even desirable, it is nevertheless the case that literally every experimental article published in the American Economic Review from 1970 to 1997 included task-related financial incentives [9] (p. 31), whereas only 26% of those published in the more psychologically oriented Journal of Behavioral Decision Making between 1988 and 1997 included them [10] (p. 391).
A general consensus has evolved in economics, as a consequence of which experiments on both individual and interactive decision making are very difficult to publish in economics journals unless they include substantial financial incentives [11] (Chapter 6). However, there is no consensus about the necessary or desirable magnitudes of those incentives. According to the influential capital-labor-production framework of Camerer and Hogarth, an extended version of the simpler labor theory of Smith and Walker [12], “experimental subjects do not work for free and work harder, more persistently, and more effectively, if they earn more money for better performance” [9] (p. 7). This seems to imply that, when it comes to incentive payments, bigger is better. In terms of this framework, a subject in an experiment applies costly effort (cognitive labor) and brings expertise or procedural knowledge (cognitive capital) to the performance of the experimental task (production). Camerer and Hogarth acknowledge that “ultimately, the effect of incentives is an empirical question” [9] (p. 8). However, investigating these effects is not straightforward, because task performance is generally difficult to measure. Most of the 74 experimental studies reviewed by Camerer and Hogarth showed no effect on mean performance of increased task-related incentives, and where incentive magnitude effects were found, they were most often simply reductions in the variance of responses rather than increased or improved performance.
Some production tasks have the property that increased effort does generally improve performance, and evidence from studies of such tasks suggests that subjects appear to work harder if they are incentivized than if they are not; but provided that they are paid something, the magnitudes of the incentives do not seem to make any difference. For example, Libby and Lipe [13] provided experimental evidence of a 21% step increase in total time allocated to accounting judgment tasks in subjects paid task-related incentives plus flat-rate show-up fees when compared to those paid flat-rate show-up fees only. In the light of this and similar studies, Camerer and Hogarth [9] concluded from their review that when effects of task-related incentives on cognitive effort are found, they appear to differentiate zero incentives from positive incentives; but once some level of task-related reward exists, “raising incentives from some modest level L to a higher level H is more likely to have no effect” (p. 21). According to Bardsley et al. [11], “The message seems to be that in terms of the impact on cognitive effort allocation, the presence of task related incentives matters more than their level” (p. 253). However, in more recent research, comparisons of incentivized and non-incentivized individual decisions have most often found no significant differences [14
The effects of incentives are sometimes negative. There are circumstances in which financial incentives have the effect of “crowding out” intrinsic motivation. For example, Gneezy and Rustichini [24] reported an experiment using a psychometric task in which subjects were randomly assigned to four incentive treatment conditions: no payment, low payment, medium payment, and high payment. In the conditions in which some payment was made, higher payment elicited better performance; but subjects who were offered no financial incentives performed significantly better than those who were paid. According to the researchers, “we may conclude that the monetary compensation produces a reduction in the performance” (p. 802). This suggests that extrinsic rewards can undermine or crowd out intrinsic motivation, an idea first suggested by Deci [25]. In one of Deci’s key experiments, undergraduate students were either paid or not paid to work for a certain time on an interesting puzzle. In a later unrewarded “free-time” period, subjects in the no-reward condition played with the puzzle significantly more than those who were paid, and the unrewarded subjects also reported a greater interest in the task. This effect has been replicated many times, and a meta-analysis [26] suggests that the phenomenon is fairly widespread (see also [27
The effects of incentive magnitude, even in individual decision making, are evidently not well understood. Furthermore, a Web of Science search suggests that only a few studies have investigated such effects in interactive decisions or games. For example, both cooperation and punishment were studied in a one-shot public goods experiment and neither was affected by stake size [28]. Amir, Rand, and Gal [29] studied four types of games (Dictator Game, Trust Game, Ultimatum Game, and Public Goods Game), and concluded that $1 stakes produced results that were consistent with those obtained with no stakes, and that the results from online experiments were consistent with those from in-person testing sessions. Karagözoğlu and Urhan [30] recently reviewed the evidence on incentive magnitude effects in bargaining and distribution games and found only a small number of studies, mostly focused on the Ultimatum game, that had investigated such effects. They concluded that the number of published studies is not sufficient to justify any definite conclusions regarding incentive effects in games and that the existing findings do not provide any clear or consistent picture, at least for that class of games. The limited number of experimental normal-form games included in the earlier reviews of Smith and Walker [12] and Camerer and Hogarth [9] does not clarify the picture greatly for games of the type investigated in the experiment reported below.
Our experiment was designed to provide much-needed rigorous evidence on incentive magnitude effects of task-related payments in standard normal-form games. Our focus is on twelve experimental games reported by Colman, Pulford, and Lawrence [31] that were designed to disentangle cognitive hierarchy [32], team reasoning [33], and strong Stackelberg [31] theories in games without obvious, payoff-dominant solutions. The patterns of strategic choice in that study, and the reasons given by players for the choices they made, appeared to suggest that relatively low-effort thinking (simple heuristics and Level-1 cognitive hierarchy strategies) accounted for rather more behavior than more deliberative higher-level reasoning. One possible explanation for this is that the incentives in the Colman et al. [31] study were insufficient to induce the effort involved in greater deliberation. To examine that possibility, in the present study subjects were randomly assigned to treatment conditions in which they were incentivized with task-related payments either of the original magnitude or else five times larger. In all other respects, the two treatment conditions were identical. In both conditions, subjects made one-off strategy choices in the same range of 3 × 3 and 4 × 4 games. We compared the strategy choices of subjects in the two conditions to determine whether incentive magnitude had any significant effects on strategy choices, and we also compared their self-reported reasons for their choices.
2. Materials and Methods
The subjects were 94 students and employees at the University of Leicester (55 female, 39 male), aged 18–54 years (M = 27.68, SD = 9.72), recruited from the School of Psychology’s subject panel and the university’s weekly online newsletter, an approximate sample size of 40 for each condition having been determined in advance. These 94 subjects participated in the role of Player 1. A further 19 subjects were recruited to play the role of Player 2 in each condition in order to avoid deception and to enable calculations of payoffs for subjects in the role of Player 1. (Data from subjects in the role of Player 2 were too few to be included in the analyses reported in this article.) Participants were randomly allocated to conditions irrespective of whether they were students or employees of the university, and all volunteered to take part. The average age of subjects in the control group (M = 27.21 years, SD = 9.99) did not differ significantly from that of subjects in the experimental group (M = 28.15 years, SD = 9.52), t(92) = 0.465, p = 0.64. All of the subjects gave their informed consent before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the University of Leicester Psychology Research Ethics Committee (PREC) (identification code bdp5-f797).
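The age comparison above can be approximately reproduced from the reported summary statistics. The following Python sketch computes a pooled-variance (Student's) t statistic. The equal split of 47 subjects per condition is an assumption inferred from the 92 degrees of freedom, not a figure stated in the text, and the function name is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an independent-samples t test from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / standard_error, df


# ASSUMPTION: 47 subjects per condition (94 in total, consistent with df = 92).
N_PER_GROUP = 47
t_stat, df = pooled_t(27.21, 9.99, N_PER_GROUP, 28.15, 9.52, N_PER_GROUP)
print(f"t({df}) = {t_stat:.3f}")
```

Because the means and standard deviations are rounded to two decimal places, the statistic obtained this way should be close to the reported value but need not match it exactly.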
We paid each subject a show-up fee of £3.00 (US $5.00) plus an additional amount of up to either £5.00 (US $8.60) or £25.00 (US $43.00), according to which condition they were assigned to and depending on how their decision in a single randomly selected game interacted with the strategy chosen by Player 2 in that game. We did not mention the show-up fee until the experiment was over because we wished subjects to focus on the link between their decisions and their earnings. The average incentive-linked payout was £3.44 ($5.74) in the control condition and £17.45 ($29.09) in the experimental condition (currency conversion rates at July 2014). The task-related incentives were implemented according to the random lottery incentive system, a technique that avoids problems that are associated with other incentive payment schemes [36] and that has been shown to elicit true preferences [37]. When collecting data from experimental subjects on multiple rounds it is effective to pay a subset of subjects, a subset of rounds, or both, to incentivize the subjects [39]. Accordingly, after one game was randomly selected at the end of an experimental session, each player was randomly matched with one of the players in the other role for the purpose of calculating their payoff in that game.
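The payment procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in the study: the two small payoff tables, the subject identifiers, and the function name `settle_session` are all hypothetical, and the `multiplier` argument stands in for the control (×1) and experimental (×5) conditions.

```python
import random


def settle_session(p1_choices, p2_choices, payoffs, rng, multiplier=1):
    """Pay every Player 1 for one randomly selected game (random lottery incentive).

    p1_choices / p2_choices: subject id -> {game id -> chosen strategy label}
    payoffs: game id -> {(row label, column label) -> Player 1 payoff in pounds}
    """
    game = rng.choice(sorted(payoffs))  # one game is drawn for the whole session
    earnings = {}
    for subject, choices in p1_choices.items():
        partner = rng.choice(sorted(p2_choices))  # random match with a Player 2
        row, col = choices[game], p2_choices[partner][game]
        earnings[subject] = multiplier * payoffs[game][(row, col)]
    return game, earnings


# Hypothetical toy data: two 2 x 2 games, two Player 1 subjects, one Player 2.
payoffs = {
    "G1": {("A", "A"): 3, ("A", "B"): 1, ("B", "A"): 0, ("B", "B"): 2},
    "G2": {("A", "A"): 5, ("A", "B"): 0, ("B", "A"): 2, ("B", "B"): 4},
}
p1_choices = {"s1": {"G1": "A", "G2": "B"}, "s2": {"G1": "B", "G2": "A"}}
p2_choices = {"r1": {"G1": "B", "G2": "A"}}

game, earnings = settle_session(p1_choices, p2_choices, payoffs, random.Random(1))
print(game, earnings)
```

Drawing the paid game only after all choices have been recorded is what gives every game an equal chance of determining earnings, so each decision can be treated by the subject as if it were the only one.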
We used a between-subjects design, with subjects randomly allocated to either a control or a ×5 experimental treatment condition. In the control condition, the payoffs in every game ranged from zero to 5. In the ×5 experimental condition, these payoffs were all multiplied by five. The subjects were told that, at the end of the session, the payoffs would be converted to pounds sterling for the game that was randomly selected for payment. In every other respect, subjects were treated identically in both conditions.
The experimental games used in the control condition were eight 3 × 3 and four 4 × 4 games originally used by Colman et al. [31]. The 12 games are displayed in Figure 1. They represent a diverse range of two-player normal-form games, extending considerably the range covered in earlier reviews [9], and there are no strongly or weakly dominant strategies in any game. Multiplying all the payoffs by 5 has no effect whatsoever on a game’s strategic properties. We did not drop any variables, conditions, or games from our analyses.
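The claim that multiplying all payoffs by 5 leaves a game's strategic properties unchanged can be checked mechanically for pure-strategy Nash equilibria. The Python sketch below does so for a hypothetical 3 × 3 game: its first row follows the Game 1 example quoted in the instructions, but the remaining entries are invented for illustration and are not taken from Figure 1.

```python
def pure_nash(p1, p2):
    """Return all pure-strategy Nash equilibria (row, col) of a bimatrix game."""
    rows, cols = range(len(p1)), range(len(p1[0]))
    equilibria = []
    for r in rows:
        for c in cols:
            # Player 1 cannot gain by switching rows, given column c.
            row_is_best = all(p1[r][c] >= p1[k][c] for k in rows)
            # Player 2 cannot gain by switching columns, given row r.
            col_is_best = all(p2[r][c] >= p2[r][k] for k in cols)
            if row_is_best and col_is_best:
                equilibria.append((r, c))
    return equilibria


def scale(matrix, factor):
    """Multiply every payoff in a matrix by a positive constant."""
    return [[factor * x for x in row] for row in matrix]


# HYPOTHETICAL payoffs (rows = Player 1's strategies, columns = Player 2's).
p1 = [[3, 1, 0],
      [0, 2, 1],
      [1, 0, 4]]
p2 = [[3, 0, 2],
      [1, 2, 0],
      [0, 1, 4]]

control = pure_nash(p1, p2)
times_five = pure_nash(scale(p1, 5), scale(p2, 5))
print(control == times_five)
```

The comparison holds for any positive multiplier, because scaling preserves the ordering of every pair of payoffs on which best responses depend.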
The experiment was conducted over five 40-min testing sessions, with 20–25 subjects per session, approximately half being randomly allocated to the control condition and half to the ×5 experimental condition. The subjects sat at computer monitors and logged on to the SurveyGizmo website, where they read the following instructions. These were the same for both conditions, apart from adjustments to the payoff information.
You will be presented with a series of 12 grids. For some grids you will be asked to choose between A, B, and C, and for others you will be asked to choose between A, B, C, and D. The numbers in the grids represent pounds sterling (e.g., “5” = £5). You will be paired with another randomly selected participant in this room for each of your 12 decisions. In each case, the other participant will be presented with the identical grid and will also be choosing between A, B, and C, or A, B, C, and D. At the end of the experiment, one of the grids will be chosen randomly from the 12. The amount of money that you scored in that grid will be paid in cash at the end of today’s session. When you are making your choices, you will not know who you are paired with or what choices they are making. For each grid, please indicate your choice by selecting either A, B, C, or D.
The subjects were given the opportunity to ask questions (in practice, no one did), after which the payoff matrices were presented in succession in a random order on their monitors, with Player 1’s labels and payoffs shown in blue and Player 2’s in red. In each session, at least one subject was assigned the role of Player 2 and was presented with the same games, but the instructions were slightly rewritten from the perspective of the red player.
For the subjects in the role of Player 1, the following text was displayed below each payoff matrix to help them interpret the game: “You are the Blue decision maker, choosing between the rows marked A, B, or C (or D). The person you have been paired with is the Red decision maker, choosing between columns A, B, or C (or D). Depending on what you and the other decision maker choose, you will get one of the blue payoffs, and the red decision maker will get one of the red payoffs.” Subjects thus knew that their choices were liable to impact their own and their co-player’s payoffs. A summary of the information shown in the payoff matrix was then presented, as follows (this example relates to Game 1 in the control condition):
If you choose A, then:
If Red chooses A, you will get 3, and Red will get 3
If Red chooses B, you will get 1, and Red will get 0
If Red chooses C, you will get 0, and Red will get 2
(and so on …)
The subjects then made one-off strategy choices in each of the 12 games by clicking radio buttons marked A, B, C, or D. They were able to change their strategy choice at any time until they clicked Next. Returning to previous games was not possible and no feedback was provided before they progressed to the next game.
After the subjects had recorded their decisions for all 12 games, they were presented with a randomized list of ten possible reasons (see Table 1) that might have influenced their decisions, and they were asked to indicate on a seven-point Likert scale the extent to which they agreed or disagreed with each reason (Strongly disagree; Moderately disagree; Slightly disagree; Neutral; Slightly agree; Moderately agree; Strongly agree). The reasons were based on a qualitative pilot study [31] in which subjects had been asked to describe their reasons for choices in games similar to the ones used in the present study. The instructions to subjects read: “Listed below are 10 possible reasons that may have influenced the decisions you made in choosing between A, B, C, and D in the decision task you have just completed. For each reason, please indicate to what extent you agree or disagree with the statement, taking into account all 12 decisions that you have just made.” These reasons for choices were elicited only after all of the games had been played because we (like Colman et al. [31]) did not want to influence subjects’ thinking and strategy use while they were playing the games.
Finally, one game was selected at random. Subjects were reminded of their strategy choice in that game, were informed what strategy the other player had chosen, and were then paid what they had earned in that game.
The results of this experiment help to fill a gap arising from the fact that there is limited published evidence on incentive magnitude effects in experimental games. Karagözoğlu and Urhan [30] attempted to review the relevant literature on bargaining and distribution games, but found few relevant studies, except on the Ultimatum game. Our findings provide no evidence of incentive magnitude effects in a wide range of normal-form experimental games. This corroborates a conclusion of Camerer and Hogarth [9] from their review, and the results of most subsequent studies of individual decision making, which have also failed to find evidence of such incentive effects.
The magnitudes of the incentives in our control condition were roughly in line with typical payments for experiments in our laboratory and are generally considered to be adequate to motivate thoughtful participation. The payoffs in the ×5 experimental condition were very much larger than our subjects are accustomed to or than they expect. The manipulation of the independent variable left all strategic aspects of the games (Nash equilibria, Stackelberg strategies, strategic dominance, payoff dominance, and so on) unchanged, and, from a game-theoretic point of view, should not affect strategy choices. There is certainly little evidence that incentive magnitude influenced strategy choices in our experiment, and the analysis of self-reported reasons for choices suggests that the incentive manipulation hardly influenced the way players thought about the games. How subjects’ emotions are influenced by incentives was not something we studied, but this could be examined in future research on incentives in games, as recent research [40] has shown in a power-to-take game (PTTG) that emotions and incentives interact.
A compelling interpretation of the non-significant incentive magnitude effects in our data and in many previous studies of individual decision making is provided by the results of an important study of risky individual choice [41]. Taking the capital-labor-production framework of Camerer and Hogarth [9] as a starting point, Moffatt used experimental data previously published by Hey [42] to estimate a fully parametric stochastic model of risky choice. In the model, the logarithm of decision time is used as a proxy measure of cognitive effort (labor) and it turns out that the monetary values of payoffs in risky choices have only a small positive effect on the amount of cognitive effort that subjects allocate to decision tasks. The incentive elasticity of effort is estimated to be +0.028, meaning that if incentives were doubled, then response times would be expected to increase by 2.8%, compared to almost 40% if task complexity were doubled. In other words, Moffatt’s parametric model suggests that the complexity of a decision task has vastly more effect than incentive magnitude on the amount of cognitive effort that decision makers allocate to the task.
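The arithmetic behind the elasticity interpretation can be made explicit. The sketch below shows two ways of translating an elasticity of +0.028 into the effect of doubling incentives: a first-order (linear) approximation, which gives the 2.8% figure quoted above, and an exact calculation under the assumption of a constant-elasticity (log-log) relationship. The functional form is an assumption made here for illustration, since the text does not specify it, and the function names are hypothetical.

```python
ELASTICITY = 0.028  # incentive elasticity of effort reported in the text


def linear_change(elasticity, proportional_increase):
    """First-order approximation: proportional change in effort.

    A doubling of incentives corresponds to proportional_increase = 1.0 (+100%).
    """
    return elasticity * proportional_increase


def constant_elasticity_change(elasticity, ratio):
    """Exact proportional change if effort is proportional to incentive ** elasticity."""
    return ratio ** elasticity - 1


approx = linear_change(ELASTICITY, 1.0)            # incentives up by 100%
exact = constant_elasticity_change(ELASTICITY, 2)  # incentives multiplied by 2
print(f"linear approximation: {approx:.1%}")
print(f"constant-elasticity form: {exact:.1%}")
```

Under either reading the implied change in response time is only a few percent, which is small relative to the effect attributed to task complexity.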