1. Introduction
In this paper, a theoretical learning model for tax compliance lab experiments is proposed. Moreover, data from a tax lab experiment are used to show that learning seems to occur in such experiments.
The most elementary structure of tax compliance lab experiments is as follows. First, participants are recruited, mostly students. They receive a detailed description of the experiment's design, which runs over a certain number of rounds (20, for instance). In each round of the experiment, the participants receive an income that is taxed at a constant tax rate. They are asked to decide how much income to declare for taxation; the declaration is audited by a tax authority with an announced probability. If the declaration is not identical to the income, i.e., if the tax is partially or fully evaded, the full tax and an additional penalty must be paid. Each participant's own results (audits, payoffs) are communicated to that participant after each round. After the announced number of rounds, the experiment ends and the participants receive the remaining payoff.
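The round structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the protocol of any particular experiment: the function name `round_payoff` and, in particular, the penalty rule (a hypothetical `penalty_rate` applied to the evaded tax) are introduced here only for illustration.

```python
import random

def round_payoff(income, declared, tax_rate, audit_prob, penalty_rate, rng=random):
    """Return (net_income, audited) for one round of the experiment.

    Assumption: the penalty equals penalty_rate times the evaded tax;
    the text only states that an additional penalty must be paid.
    """
    if not 0 <= declared <= income:
        raise ValueError("declared income must lie between 0 and income")
    tax_due = tax_rate * income
    tax_paid = tax_rate * declared
    audited = rng.random() < audit_prob
    if audited and tax_paid < tax_due:
        evaded = tax_due - tax_paid
        # Detected evasion: the full tax plus the penalty is paid
        return income - tax_due - penalty_rate * evaded, True
    # No audit, or an audit of a correct declaration
    return income - tax_paid, audited
```

With an income of 100, a tax rate of 0.2, and a penalty rate of 1, full evasion yields a net income of 100 if no audit takes place and 60 if the declaration is audited.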
What may be learned in tax compliance lab experiments? The crucial question is whether there is a systematic change in the behavior of participants during the experiment. Since a tax must be paid in every round, participants can choose between paying the tax due and evading the tax partially or fully. Tax audits and fines for tax evasion are interventions intended to enhance tax compliance. If successful, these interventions initiate a learning process that increases tax compliance. However, this is not the only possibility. Participants may instead learn that they can successfully deceive tax authorities even when facing audits and penalties. In both cases, a learning effect might occur.
The observation in many tax lab experiments is that participants indeed react to tax stimuli, but often in a rather unexpected way: they pay either no tax or the full tax. Neither the level of tax compliance nor the actual tax payments are in line with the results expected on the basis of the well-known Allingham–Sandmo–Yitzhaki (ASY) standard model of tax evasion [1,2]. According to this theory, participants should evade more taxes than they actually do. Although a number of additional behavioral assumptions have been proposed to bring the theoretical results closer to actual compliance behavior (in particular, tax morale is considered an important factor), the behavior of participants in tax lab experiments is highly idiosyncratic. About three quarters of participants decide to pay either no tax or the full amount of the tax [3,4]. In addition, taxpaying behavior seems to meander over the rounds of the experiments, with few detectable patterns. Although there are several informative reviews on the theory of tax evasion and on tax compliance experiments [4,5,6,7,8], the behavior of taxpayers, as well as of tax experiment participants, does not seem to be sufficiently well understood.
In the following, the brief literature review focuses on papers in which learning in tax experiments plays a role. Papers on tax morale are not reviewed here since this is not the topic of the paper; moreover, there are a number of reviews on tax compliance experiments that include tax morale topics [5,6,7,8].
In a review of research on tax compliance behavior, Pickhardt and Prinz [7] classified models as belonging mainly to two strands, namely, the “Simple Model of Rational Crime” (SMORC; Reference [9], Chapter 1), from economics, and the “Simple Model of Emotional Balancing” (SMOEB; Reference [7], based on References [9,10]), from psychology. These classes of models cover aspects of tax compliance both in general and in tax compliance experiments. Nonetheless, the dynamics of decision making in tax compliance experiments requires further investigation. The crucial aspect is whether participants learn to (not) evade the tax during the experiments. As all experiments run over a number of rounds, the information on the dynamics of behavior in the data can be used to test whether participants’ behavior can be described by a learning model.
Although the theory of learning in games is well established in economics (e.g., [11,12,13,14,15,16]) and also in other sciences (e.g., [17,18]), it has not yet been applied very often or rigorously to tax compliance experiments. Moreover, the theoretical models applied to design experiments and to interpret the results are rather static (see Vale [19] for a recent dynamic economic model of tax evasion).
Learning in the context of tax compliance that is enforced by tax audits and penalties is explicitly considered by Soliman et al. [20]. The authors show that learning is an important factor in explaining the difference between the theoretically predicted behavior of participants in a tax compliance experiment and their actual behavior.
In a paper by Kastlunger et al. [21], learning theory is applied to design tax audit patterns in such a way that they reinforce correct tax payments. A formal analysis is not presented; nonetheless, learning is explicitly discussed in the design as well as in the interpretation of the results. Among other things, it was found that early audits increase tax compliance, but that later audits are required to sustain the effect (see Reference [21]).
The duration of tax audit effects on tax compliance was studied by DeBacker et al. [22] with data from randomized audits by the Internal Revenue Service (IRS). The effects they found were usually short-lived, and persons with heavily fluctuating incomes returned quickly to their former compliance behavior. It seems that a one-shot intervention is not an appropriate measure to increase compliance, probably because learning is a process that is driven by intermittent stimuli.
Other timing effects of audits have also been studied. Kogler et al. [23] reported that, in lab experiments, tax compliance increased when information about the results of tax audits was deferred rather than communicated immediately. A similar compliance effect was found by Muehlbacher et al. [24] when participants of an experiment had to wait for weeks after filing tax returns until they were informed whether an audit would be conducted.
Supervision and commitment may also increase compliance, perhaps as substitutes for learning processes. No compliance effect was detected by Gangl et al. [25] in a field experiment in which start-up firms were supervised by the tax authority with regard to the timely payment of taxes. In contrast, in a lab experiment by Mittone and Saredi [26], a long-lasting commitment to compliance did indeed increase compliance.
Two further papers seem to be relevant for the topic of this paper. Mittone [27] analyzed the problems of repetitive decision making in experiments over a large number of rounds. The main result relevant to this paper was that participants’ behavior resulted from a complicated mixture of risk attitudes and psychological factors. In the paper by Maciejovsky et al. [28], participants’ compliance declined after a tax audit. The authors found empirical evidence that this effect was mainly driven by incorrect perceptions of audits: participants believed that the probability of another audit decreased directly after an audit.
Another dynamic aspect of tax compliance, in combination with audits and penalties, was analyzed by Kirchler et al. [29], as well as formally by Prinz et al. [30]. Audits and penalties are instruments of enforced tax compliance, whereas taxpayers may also pay taxes voluntarily. The dynamic interactions of both versions of compliance are studied in the so-called slippery-slope framework [29,30]. However, this framework is not considered further here, as this paper is restricted to enforced tax compliance.
Although learning effects are addressed in the tax compliance literature, a dynamic and formalized theoretical approach seems to be lacking. To provide such a theoretical model, two versions of a learning model of (enforced) tax compliance behavior are presented in this paper: a deterministic and a stochastic one. In the empirical part of the paper, it is tested econometrically whether participants’ behavior in a tax experiment is consistent with learning.
The remainder of this paper is structured as follows. In Section 2, a deterministic learning model for tax compliance experiments is presented. In Section 3, a stochastic version of the learning model is developed. In Section 4, a tax compliance experiment that was already studied by Kastlunger et al. [31], as well as by Krauskopf and Prinz [32], is analyzed with respect to the question of whether there are learning effects. Section 5 concludes the paper.
2. A Deterministic Approach to Tax Experiment Learning
Suppose that the aim of a participant in a tax evasion experiment is to maximize her or his expected income, as in the tax evasion model of Srinivasan [33]. This approach is chosen here instead of the standard ASY version of tax evasion models because it does not require a specific utility function. Since utility functions are individual characteristics that are represented by several parameters, they cannot be observed directly in tax experiments. The model used here does not rely on unknown utility functions. Instead, a variable for the aspiration level of individuals is introduced to capture in one variable all relevant differences among individuals with respect to the adjustment to income changes. One such relevant difference, in the context of this paper, is (among others) tax morale. Although the aspiration level cannot be observed either, it makes the dynamic model simpler than a utility function-based model.
The crucial concept of the model presented in this paper is as follows. Suppose that individuals in a lab experiment receive in each round of the experiment a certain fixed windfall income of Y that is taxed with a flat income tax, T(Y). Individuals have to declare their income to a tax authority that may audit the respective tax payment, Td, with probability p. If the tax is not fully paid, the respective taxpaying individual is punished by a monetary penalty F(Td). The tax compliance probability is defined as the tax paid on declared income divided by the tax due for the correct income: pc = Td/T(Y).
In this way, a game between the individual taxpayer and the tax authority is initiated. The taxpayer is assumed to maximize the expected net income (i.e., the income after tax and penalty payments), while the tax authority applies tax audits at a previously fixed probability. It is assumed (as is the case in tax compliance experiments) that the taxpayers know the audit probability. However, the tax authority is a passive player as it does not change its strategy in response to the behavior of taxpayers. In this framework, learning may occur as follows. Starting with a certain tax payment, taxpayers may increase (decrease) tax payments if they were (not) audited in the previous round of the experiment. In terms of reinforcement learning (Reference [12]), the respective taxpaying behavior is either reinforced or changed, with respect to the taxpayer’s objective to maximize net income. This means that the compliance probability pc is time-dependent and may change during the experiment. Hence, observing the development of the compliance probability over time indicates in which direction tax payments are adjusted in response to tax audits.
Formally, the variables of the model are defined as:
Y: income (exogenously fixed);
T(Y): tax tariff;
Td,t: tax due on the basis of the declared income with Td ≤ T(Y);
t: time;
p: probability of a tax audit (detection probability of tax evasion), 0 < p < 1;
F(Td): penalty in the case of detected tax evasion, F(Td) > 1;
pc: degree of tax compliance, pc = Td/T(Y);
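As a static benchmark for the variables listed above, the expected net income of a participant can be sketched as follows. This is only an illustration under stated assumptions and not the learning equation of the paper: the audit probability is written `p`, and the penalty is assumed to be a hypothetical rate `f` times the evaded tax; both the names and the proportional penalty rule are introduced here.

```python
def expected_net_income(Y, T, Td, p, f):
    """Expected net income for a tax payment Td with 0 <= Td <= T.

    Assumption: the penalty is f times the evaded tax (T - Td).
    """
    evaded = T - Td
    not_audited = Y - Td            # evasion remains undetected
    audited = Y - T - f * evaded    # full tax plus penalty is paid
    return (1 - p) * not_audited + p * audited

def best_payment(Y, T, p, f, steps=100):
    """Grid search over Td; the objective is linear in Td, so a corner wins."""
    grid = [T * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda Td: expected_net_income(Y, T, Td, p, f))
```

Under these assumptions, the expected net income changes by p(1 + f) − 1 per unit of tax paid, so an income-maximizing participant pays either nothing or the full tax, depending on the sign of this expression.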
The deterministic reinforcement learning equation is defined as follows:
In Equation (1), A is the aspiration level with Y − A > 0, and λ is a (deterministic) learning rate, λ > 0. A learning rate is included since it may take time to adjust behavior. Moreover, the extent of the adaptation may differ across individuals. In addition, an individual aspiration level is included, as indicated above. According to Güth [34], incentives in experiments also trigger aspirations, besides choices and expectations. This is captured in the model by the variable A. However, it should be clear that neither the aspiration level nor the learning rate can be observed directly in experiments. Nevertheless, the aspiration level in this model is the decisive parameter for individually different features of behavior (attitudes to taxpaying, to risk, etc.).
Because of pc = Td/T, Equation (1) can also be written as:
The learning Equations (1) and (2) can be motivated as follows. Tax payments are assumed to start with a value that implies some tax evasion, i.e., Td < T(Y). If there is no tax audit in the next period (round) of the experiment, which occurs with probability 1 − p, the tax payment is reduced. The reason is that tax evasion is reinforced when it goes undetected, combined with the motivation to maximize the expected income.
However, if there is a tax audit in the next period, which occurs with probability p, tax evasion is detected. The detection of tax evasion leads to a loss of income, consisting of the full tax payment and a penalty based on the evaded tax. This income loss reinforces tax compliance via the income effect. Hence, the tax payment increases.
The fixed points of the learning Equation (2) describe the longer-term behavior of tax experiment participants. The condition for a fixed point is that the compliance probability no longer changes from one round to the next. Applied to Equation (2), this yields:
Hence, the first fixed point is at Td = 0, i.e., pc = 0. For Td ≠ 0, a second fixed point exists at Td/T(Y) = 1, and hence pc = 1.
To check the stability of the fixed points, Equation (1) is rewritten and differentiated with respect to the tax payment. Which of the two fixed points, pc = 0 or pc = 1, is stable depends on the resulting condition, Equation (6).
Hence, depending on the expected value of the penalty on the left-hand side of Equation (6), either full tax compliance or zero tax compliance results (see Figure 1 for the respective tax compliance dynamics). If the expected penalties are high, they render full compliance a stable fixed point; in contrast, low values of the expected penalties render zero compliance a stable fixed point. The decisive implication here is that deterministic learning implies an all-or-nothing tax compliance decision of participants.
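The all-or-nothing implication can be illustrated with a small simulation. The update rule below is a hypothetical stand-in, not Equation (2) of this paper: it merely shares the two fixed points at zero and full compliance, and the parameter `drift` is an assumed summary of whether the expected penalty is high (positive) or low (negative).

```python
def simulate_compliance(pc0, drift, rate=0.1, rounds=200):
    """Iterate pc <- pc + rate * drift * pc * (1 - pc).

    Hypothetical stand-in dynamics with fixed points at pc = 0 and pc = 1;
    the sign of `drift` decides which of the two fixed points is stable.
    """
    pc = pc0
    path = [pc]
    for _ in range(rounds):
        pc = pc + rate * drift * pc * (1 - pc)
        pc = min(1.0, max(0.0, pc))  # keep pc a valid compliance share
        path.append(pc)
    return path
```

Starting from partial compliance, for example `simulate_compliance(0.5, 1.0)`, the path approaches full compliance, whereas `simulate_compliance(0.5, -1.0)` approaches zero compliance; interior values are not stable under this stand-in rule.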
However, it seems rather unlikely that all or a majority of participants in an experiment behave deterministically. Therefore, a generalization of the learning model is suggested in the following section.
5. Conclusions
In this paper, a theory of learning in games was applied to lab experiments on tax compliance. A theoretical model was provided that can be simulated and tested (at least in principle) with data from experiments.
In the theoretical analysis, two versions of a decision model were developed and analyzed which incorporated learning over time. In the discrete decision model, based on stochastic learning, participants choose the amount of the tax they pay in monetary units. In contrast, an all-or-nothing tax decision model with deterministic learning implies that either no tax or the full tax is paid.
The results of the stochastic model are consistent with experimental data. For example, in the experiment by Kastlunger et al. [31], in 29.9% of all tax payment decisions (5160 cases), the tax was completely evaded, whereas in 46.1% of all decisions, the tax was fully paid. Hence, in 76% of all decisions either no tax or the entire tax was paid. The empirical analysis of the tax payment distribution indicates that the distribution seems to be bimodal, but with considerable variance.
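The arithmetic behind these shares can be checked directly. The counts computed below are only approximations implied by the rounded percentages; they are not figures reported in the paper, and the variable names are chosen here for illustration.

```python
total_decisions = 5160
share_zero = 0.299   # tax completely evaded
share_full = 0.461   # tax fully paid

corner_share = share_zero + share_full
# Implied counts are approximate because the shares are rounded
approx_zero = round(total_decisions * share_zero)
approx_full = round(total_decisions * share_full)

print(f"corner decisions: {corner_share:.1%}")
print(f"approx. counts: {approx_zero} evaded, {approx_full} paid in full")
```

The two shares add up to 76.0%, which corresponds to roughly 1543 decisions with full evasion and 2379 decisions with full payment.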
The results of the econometric analysis showed that there was a tendency for participants to learn to evade tax payments over the rounds of the experiment. Controlling for idiosyncratic and erratic individual behavior by individual fixed effects, a statistically significant negative trend in payments over the rounds was detected. The learning interpretation of this trend was contrasted with a tiring or “ego depletion” effect. Although participants in the first treatment showed the negative payment trend only in the last third of the rounds, the participants in the third (rewarded) treatment reduced their tax payments over two-thirds of the rounds. The conclusion in this paper is, therefore, that the negative payment trend over the rounds may indicate a systematic learning effect.
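The idea of a round trend estimated with individual fixed effects can be sketched as a simple within estimator. This is a minimal illustration under stated assumptions and not the econometric specification used in the paper: the function `within_trend`, the balanced panel layout, and the synthetic data with an assumed trend of −0.5 per round are all introduced here.

```python
import random

def within_trend(panel):
    """Slope of payment on round number after removing individual means.

    panel: list of per-participant payment lists, all of the same length.
    """
    n_rounds = len(panel[0])
    rounds = range(1, n_rounds + 1)
    mean_round = sum(rounds) / n_rounds
    num = den = 0.0
    for payments in panel:
        mean_pay = sum(payments) / n_rounds  # participant fixed effect
        for t, pay in zip(rounds, payments):
            num += (t - mean_round) * (pay - mean_pay)
            den += (t - mean_round) ** 2
    return num / den

# Synthetic data (assumed, not from the experiment): individual payment
# levels plus a common downward trend of 0.5 per round and random noise.
rng = random.Random(0)
panel = []
for _ in range(50):
    level = rng.uniform(0, 100)
    panel.append([level - 0.5 * t + rng.gauss(0, 5) for t in range(1, 21)])
slope = within_trend(panel)
```

Because each participant's mean payment is subtracted first, large differences in payment levels between individuals do not affect the estimated trend; only changes within individuals over the rounds do.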
As a general conclusion, the idiosyncratic behavior of participants in lab experiments creates a substantial level of statistical noise that makes it difficult to interpret the data of tax experiments. Nonetheless, a stochastic learning approach seems adequate to analyze these data.