Of course, even among those subscribing to this view, the precise meaning of ‘a higher share of the tax burden’ may be highly contested. Answers could range from (mildly) regressive to very progressive tax schedules.
This may be a reason why obvious tax concessions such as rate cuts often apply only to foreigners, as in the Danish example above.
Strictly speaking, the experimental design we introduce below models situations in which high-income individual taxpayers’ ‘responsiveness’ consists of leaving the tax jurisdiction. If one is willing to disregard differences between entities and income types, or between actual relocation and avoiding taxation without relocating, one may consider alternative motivations and a broader applicability. For instance, capital income may not only be more concentrated and more mobile than labor income; taxes on it may also be easier to avoid or evade, which at times has been the explicit rationale for lowering tax rates (e.g., the introduction of the capital income withholding tax in Germany in 2009). Tax competition for businesses (see [8
], for literature surveys) may also be a source of public concern, although the link with equity issues is not as direct.
We deliberately implemented a regressive tax schedule that many participants would perceive as ‘unfair’.
] for an attempt to rebut external validity concerns regarding tax compliance experiments.
Allingham and Sandmo recognized that non-financial factors such as stigma costs or social norms could enter the decision maker’s problem and in fact incorporated a reduced-form representation of such elements in one version of their model. Ref. [22
] points out that when audit risks are appropriately considered, e.g., a very high detection risk for misreporting third-party reported income, the simple model of deterrence goes a long way in explaining real-world tax compliance.
] found a lower proportion of evaders in a taxation context than when the compliance task was represented as a pure gamble. In contrast, neither [19
] nor [25
] found an overall difference in compliance between taxation and neutral framing, although [25
] reports that the interaction of tax framing and income source affected compliance dynamics. Ref. [36
] reports that income source (earned vs. endowed income) affected post-audit compliance dynamics.
At first glance, it may appear odd to vary knowledge about the outside option instead of directly varying its existence. It will become clear that in our experimental design, notably given the particular parametrization of the tax schedule, this makes no difference. We chose the former variant with a view to potential policy implications: knowledge about outside options seems more readily addressable by policy interventions than the existence of outside options itself.
Note that low-income taxpayers in the performance-based allocation differ from low-income taxpayers in the random allocation treatment in performance by construction. This will be taken into account in the analysis in Section 5
below. Note also that, unlike the performance-based allocation treatment, the random allocation treatment did not involve feedback about relative performance.
Reading out the instructions renders them common knowledge. The outside option could not be mentioned to those for whom not knowing about it was the intended treatment.
A translated version of the instructions (originally in German) is available in Appendix C.
The questions were taken from [41
] and translated to German.
In a full session of 31 participants, the number of players allocated to role A was seven (in general: (n − 3)/4 for a session of n participants).
That is, the minimum earnings for a fully compliant high-income taxpayer in the tax game were also 475 points.
With a higher tax rate on high-income taxpayers, the financial attractiveness of the outside option depends on beliefs about low-income taxpayers’ contributions and hence on higher-order beliefs about social preferences. The parametrization also ensured that most high-income taxpayers would in fact not take the outside option, facilitating the collection of four-player group observations.
To avoid deception with regard to group size, each session exogenously contained at least one group of three players in the tax game, independent of the outside option decisions of the high-income taxpayers. This exogenous group of three players remained uninformed about the outside option. As a result, the number of participants per session had to be a multiple of four plus three. Table A1
in Appendix A
presents the actual number of participants and their partition into roles A and B in each session.
Except for the exogenous group of three low-income taxpayers, which always remained uninformed about the outside option (compare footnote 16).
Beliefs and hypothetical actions were elicited without monetary incentives.
Setting the audit risk for high-income taxpayers to 100% made it unambiguous that the tax rate on high-income taxpayers was just low enough to make the outside option unattractive. Interior audit probabilities for the high-income taxpayer would have meant that reasoning about the attractiveness of the outside option requires assumptions about high-income taxpayers’ risk preferences.
Unlike the tax liability payments, the fine was not added to the public account. This prevents participants—in particular high-income taxpayers—from non-complying for fairness or efficiency reasons.
In particular, all participants had to indicate their beliefs about others’ average contributions in the tax game (after they had made their own decision). As mentioned previously, informed low-income taxpayers also had to indicate their beliefs about the opt out decision of the high-income taxpayer in their group (while this decision was made). All beliefs were elicited without monetary incentives.
See Table A1
in Appendix A
for an overview of the number of participants in each session. Due to subject pool depletion, the number of participants fell short of the full capacity of 31 participants in three random allocation treatment sessions.
For low-income taxpayers, for instance, the expected value function reads . The relevant coefficient on is and therefore negative for both and .
With CRRA utility functions, the coefficient of relative risk aversion must exceed 2.4 for an individual who maximizes expected utility to contribute a positive amount. See Appendix B
for a calibration exercise with CRRA utility.
If relative performance-based role allocation per se already makes differences in income and tax rates acceptable, there is no more room for a ‘justification effect’ of the outside option.
Six high-income taxpayers in the random allocation sessions instead chose to opt out despite the fact that this lowered their own payoffs as well as those of the low-income taxpayers in their group.
Informed low-income taxpayers are classified as ‘confused’ if they indicate a belief that the high-income taxpayers opt out, or a hypothetical intention to do so themselves if they were in the position of high-income taxpayers. Of the 114 informed low-income taxpayers, 34 are classified as ‘confused’ according to this definition. For uninformed taxpayers, no corresponding classification is available because the existence of an outside option, by design, could not be mentioned to them. Of the 216 observations of main interest (low-income taxpayers in groups of four), ‘confused’ participants make up 15%.
The compliance choices of taxpayers in groups of three (shown in Figure A1 in Appendix A) rest on few observations (54) and are of minor interest. A direct comparison of compliance choices between three- and four-player groups is affected by the fact that the marginal per capita return of tax compliance is higher in groups of three. This may lead to higher contributions [46
]. Because of the certainty of being audited, high-income taxpayers always contributed in full when participating in the tax game interaction, as mentioned in Section 5.1.
Compliance rates are defined as pre-audit contributions relative to the tax liability, i.e., . In light of the public good incentive structure of the experimental tax game, we use the terms ‘compliance’ and (pre-audit) ‘contributions’ interchangeably.
Two alternative interpretations exist. First, participants may have been confused about the financial incentives. This is an unlikely explanation in the light of the evidence that most informed low-income taxpayers understood the incentive structure (see Section 5.1
). In fact, in the subset of informed low-income taxpayers classified as ‘confused’, a smaller
share than overall chose positive contributions (56% vs. 62%). Second, positive contribution levels may arise in a standard expected utility maximization framework due to risk aversion. As mentioned in Section 4
, however, aversion to the financial risk of being fined would have to be implausibly high to rationalize positive contributions.
Across all treatments, 38% of low-income taxpayers (in groups of four) chose zero contributions and 40% contributed fully.
For ease of interpretation, we focus on OLS results in the main text. Tobit regressions that account for the fact that possible compliance rates are censored at 0 and 100—available in the appendix—generally yield qualitatively similar results, as do non-parametric tests.
The p-values on two-sided Mann-Whitney tests of differences in the compliance rate by information are (overall), (PBA) and (RA). The corresponding p-values on Kolmogorov-Smirnov tests are , and respectively.
For example, with an effect of the size observed empirically (≈0.08 standard deviations), an empirical standard deviation approaching 46 in both the information and no-information conditions, a test size of 0.05 (one-tailed) and a desired power of 0.8, around 3000–4000 new observations would be needed (depending on the underlying distribution). This constitutes an upper limit compared to OLS with controls.
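The order of magnitude of this figure can be checked with a textbook normal-approximation formula for comparing two group means. The sketch below is illustrative only: the function name `n_per_group` is hypothetical, and the formula assumes two equally sized groups and approximately normal outcomes, which need not match the calculation actually used for the footnote.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_sd, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a one-tailed two-sample z-test.

    effect_size_sd is the difference in means expressed in standard
    deviations (assumption: equal variances and equal group sizes).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    z_beta = NormalDist().inv_cdf(power)       # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size_sd ** 2)


per_group = n_per_group(0.08)
total = 2 * per_group
print(per_group, total)
```

Under these assumptions the total lands within the 3000–4000 range stated above. Because the required sample grows with the inverse square of the effect size, halving the effect would roughly quadruple it.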
See Subfigures (c) and (d) of Figure 3
. Median ratings of agreement (1 = “Do not agree at all”, …, 7 = “Completely agree”) with the statement “Taxing participants in role A at 10% and participants in role B at 50% is fair” were 1 (for random allocation treatments) and 2 (for performance allocation treatments), respectively, regardless of information about the outside option. The median rating of agreement (1 = “Do not agree at all”, …, 7 = “Completely agree”) with the statement “It is not justified that participants in role A have to pay a lower tax rate” was 7 in all treatments except the random treatment without information about the outside option (where the median rating was 6, but the distribution was not significantly different from the random treatment with information,
, Mann-Whitney U test).
Such preferences were elicited as statements of agreement or disagreement with certain propositions. As such, we cannot, of course, rule out that participants were responding to the perceived social desirability of agreement or disagreement rather than revealing their true judgments or preferences.
The corresponding nonparametric tests for differences in compliance choices, available in Table A6
in the appendix, yield the same pattern of an overall significant difference between performance-based and random allocation stemming from the subset of informed taxpayers. Tobit regressions, which like the OLS regressions can control for performance in the earnings stage, also yield similar results (see Table A8
and Table A9).
See Figure 3
a, which shows low-income participants’ agreement with the statement that the ‘role allocation and resulting income differential are fair’, separately for performance-based and random allocation treatments. This difference in agreement between random and performance-based allocation is also statistically significant overall (Mann-Whitney U test:
) as well as within information treatments (
informed). Some participants also explicitly appraised the performance-based player type allocation in open questions of the questionnaire.
Alternatively, feelings of entitlement (‘house money effect’), i.e., the idea that participants value earned money more than money provided without any effort, may also be at play in the PBA treatments. However, as all participants had to complete the earnings stage regardless of treatment, income was in some sense always ‘earned’. Moreover, prior evidence in this context is not clear-cut. In particular, there has been disagreement about the existence of ‘house money effects’. In a repeated linear public goods game, Ref. [47
] reported the absence of a ‘house money effect’ on contributions, a finding that was later disputed by [48
] on serial correlation arguments. Ref. [49
] reported an ‘inverse found money effect’: participants who had to earn their income based on absolute performance on GMAT questions contributed more in a one-shot linear public goods game (compared to participants who were endowed with incomes of the same amount). Other studies found no statistically significant differences in overall contributions between random and GMAT performance-based income allocation in public goods games [50
]. Ref. [51
], however, noted that with earned income, low-income participants contributed more and high-income participants less than with randomly allocated endowments. Ref. [52
], the only one of these studies to consider relative performance, reports the absence of significant differences in contributions between subjects who had to earn their participation rights in an effort tournament and those who participated without any prior effort in a repeated linear public goods game task. That study differs from the present one in several other dimensions, however, e.g., in that the worst-performing subjects were excluded from participation while income was kept homogeneous and the game was repeated.
In their experiment, all participants that were randomly assigned to the role of a dictator had to answer GMAT questions and earned income according to their absolute performance. Treatment variation lay in the feedback that was provided about relative performance.
In this experiment, participants received feedback about both absolute and relative performance in both allocation treatments.
To the extent that the performance differences actually depend on ‘circumstances’ that are beyond the individual’s control (e.g., genetic differences in cognitive ability), such an allocation mechanism may also be considered unfair.
Indirect evidence for such an emotional mechanism comes from the study on taking behavior by [55
]. Consistent with the idea that random income inequalities are considered undeserved, they report that participants subtracted more from each other when they had earned their income in a prior game of luck rather than a prior game of skill. However, they also indicate that only in the skill game condition did subtractions increase with a relative performance gap, arguing that “a larger size of the unfavorable gap from the others increased the unpleasantness of poor performance, which in turn motivated larger subtraction”.
For simplicity, we treat as continuous.
2.4 is the threshold value for groups of four players where it is assumed that only the high-income taxpayer rationally contributes fully, i.e., . For three-player groups, with , the corresponding threshold value in r is 1.6. If we consider the belief that all other taxpayers contribute fully ( for and for ), the threshold values increase to 4.0 and 3.6, respectively. With out-of-lab wealth considered and assumed to amount to €1000, the thresholds exceed 100 in all cases.
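As a reference point for these thresholds, the standard CRRA form with background wealth can be sketched as follows. The symbols x (experimental earnings) and w (out-of-lab wealth) are introduced here for illustration only, and the exact specification used in the calibration may differ.

```latex
u(x) =
\begin{cases}
  \dfrac{(w + x)^{1-r}}{1-r}, & r \neq 1, \\[1.5ex]
  \ln(w + x), & r = 1,
\end{cases}
\qquad
-\frac{x\,u''(x)}{u'(x)} = r\,\frac{x}{w + x}.
```

With w = 0, the curvature over experimental earnings equals r itself. When w is large relative to x, the effective curvature over lab stakes shrinks toward zero, which is consistent with much higher thresholds in r once out-of-lab wealth is taken into account.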