Negative Feedback Punishment Approach Helps Sanctioning Institutions Achieve Stable, Time-Saving and Low-Cost Performances

Jun Qian; Xiao Sun; Ziyang Wang; Yueting Chai

doi:10.3390/math10152823

National Engineering Laboratory for E-Commerce Technologies, Department of Automation, Tsinghua University, Beijing 100084, China

* Author to whom correspondence should be addressed.

Mathematics 2022, 10(15), 2823; https://doi.org/10.3390/math10152823

This article belongs to the Special Issue Uncertain System Optimization and Games

Abstract

Sanctioning institutions widely exist in human society. Although these institutions play an important role in the management of social affairs, sanctions are often seen to be costly in terms of both time and money. To enable sanctioning institutions to develop effective sanctions, we propose a negative feedback punishment approach for these institutions that combines the feedback control principle and the negative correlation principle. In the negative feedback punishment approach, the punishment intensity imposed on the group is negatively correlated with the current group cooperation proportion. Through evolutionary simulation and theoretical analysis, we found that the negative feedback punishment approach facilitates more stable, time-saving and low-cost performance by sanctioning institutions than other punishment methods. This work offers a feasible solution for sanctioning institutions to solve social dilemmas and provides a possible theoretical starting point for investigating effective pool punishment measures.

Keywords:

negative feedback; sanctioning institution; PGG; evolutionary games; evolutionary dynamics; social dynamics

MSC:

91A22

1. Introduction

Sanctioning institutions play fundamental roles in societies, such as safeguarding people’s daily lives [1,2], protecting natural resources [3,4] and implementing foreign policy [5,6]. These institutions punish wrong-doers to promote cooperation in everyday life [1,7,8,9], but their frequently poor performance [10,11,12,13] shows that sanctions do not always work. For example, the U.S. tariffs imposed on imports from China since 2018 have resulted in U.S. consumers paying higher prices for goods and services imported from China. Judging from the effect so far, however, it is uncertain whether the tariffs will resolve the trade dispute between China and the United States. Additionally, the well-known U.S. sanctions on Cuba have been in place for decades, causing economic losses of more than $1 trillion. Moreover, a series of rigorous analyses indicates that sanction senders achieved their objectives “only” around 30% of the time [14]. In general, although sanctioning institutions have their own intentions about how to sanction, it would be unwise to take the intended results for granted.

Existing research on sanctioning institutions is mainly conducted from two perspectives. On the one hand, evolutionary game studies of social dilemmas treat sanctioning institutions as enforcers of pool punishment [15,16,17,18,19]. Since pool punishment can enhance cooperation [15,20,21,22], sanctioning institutions have been studied to uncover the mechanisms by which pool punishment works [16,17,18,19,23,24,25,26,27,28,29,30]. On the other hand, empirical research on sanctioning institutions has been devoted to revealing the impacts of their operation [3,5,31,32,33,34,35,36,37], which informs the application of sanctioning institutions in real life. Despite the richness of these theoretical and practical studies, few address “how sanctioning institutions execute punishment in order to achieve group cooperation at a small cost.” The cases above show that significant time loss, substantial economic costs and uncertainty about outcomes all contribute to the poor performance of sanctions. Thus, in this paper, we (1) abstract the operation of sanctioning institutions into a basic model and (2) explore a time-saving, low-cost and stable sanctioning approach that helps sanctioning institutions perform well.

We extended the standard public goods game (PGG) with pool punishment as the basic model to explore the operation of sanctioning institutions. The public goods game, which abstracts the trade-off between public and individual interests, embodies the cooperative social affairs in daily life. Pool punishment, on the other hand, corresponds to the sanctions of sanctioning institutions. Integrating PGG with pool punishment, we can observe the evolutionary process of people’s behavior under the dual pressure of social dilemma and sanctioning institutions. As for the sanctioning institution, the operation of a sanctioning institution is like guiding people to walk. The goal of the sanctioning institution indicates where and how people go, and sanctions are the tool by which the institution guides people. The good performance of a sanctioning institution then requires efficiently guiding people to their destinations at a small cost. Here, we focus on helping sanctioning organizations enforce sanctions to achieve good performance.

There are many punishment methods that are used to transform punishment intentions into punishment intensity in real life. The constant punishment intensity approach, where the punishment intensity is entirely determined by the punishment intention, is the most commonly employed punishment method in everyday life, such as data breach penalties in the EU General Data Protection Regulation and anti-dumping duties in trade wars. Punishment within a certain percentage range is an extension of the constant punishment intensity approach. For instance, in China, tax evaders are subject to penalties of one to five times the amount of tax evaded. Additionally, penalties within a certain range are widely used in enforcement; for example, drunk drivers in Japan are fined up to 1 million yen. Considering the abuse of discretion by law enforcement officers, punishing with a constant amount is also very common, such as the fixed penalty of HK$5000 for violations of the Prevention of Disease Regulation in Hong Kong in December 2020. In addition to these common sanctioning methods, we propose a negative feedback sanctioning method that we hope will provide a time-saving, low-cost and stable sanctioning method for sanctioning institutions.

We propose the negative feedback punishment approach for sanctioning institutions by combining the feedback control principle and the negative correlation principle. The feedback control principle [38] refers to making the next move based on the comparison of the current state with the goal. Inspired by this principle, a sanctioning institution can implement performance-triggered sanctions to ensure people behave correctly while avoiding unnecessary consumption. Specifically, the sanctioning institution does not have to punish people when they are going in the right direction; only when they go off course does the institution enforce sanctions. In contrast to institutions with regular time intervals, a performance-triggered institution is more cost effective. With the feedback control principle, the negative correlation principle is used to deliver sanctions at a small cost. In this principle, the group’s cooperation proportion is negatively correlated with the punishment intensity, which determines the amount of punishment. As people become better behaved under the guidance of the sanctioning institution, the cooperation proportion becomes larger. The negative correlation principle implies that the punishment intensity becomes smaller, and then people suffer less monetary loss. In addition, this negative correlation puts a constraint between punishment intensity and group performance, resulting in a one-to-one mapping between these two. Thus, the sanctioning outcomes at the same punishment intensity are expected to have less variation and be accordingly more stable. The negative feedback punishment approach is a combination of the feedback control principle and the negative correlation principle. In this paper, we explore whether and why the negative feedback punishment approach is a time-saving, low-cost and stable sanctioning method.

Through evolutionary simulation and theoretical analysis, we show that our proposed negative feedback punishment approach can help sanctioning institutions achieve more stable, time-saving and low-cost performances. The operation of sanctioning institutions is modeled by the PGG with pool punishment to analyze different sanctioning rules. Simulation results show that people’s performance with the negative feedback punishment approach varies less and is hence more stable than under a constant punishment intensity. In addition, the operation of sanctioning institutions based on the negative feedback punishment approach is less costly in both time and money than under other punishment methods. Moreover, theoretical analysis suggests that group performance is stable under the negative feedback punishment approach because the negative correlation between punishment intensity and group cooperation proportion constrains group behavior. Further comparisons illustrate the generality of the negative feedback punishment approach by showing group performances with different negative correlation forms. Overall, our proposed negative feedback punishment approach provides a more feasible and effective punishment method for real-life sanctions, and may be instructive for the operation of government departments and the management of various programs.

2. Materials and Methods

2.1. Model

2.1.1. PGG with Pool Punishment

There are n people playing a game as a group. In the game, they invest together and receive pooled punishment from sanctioning institutions. Initially, each individual has his/her strategy $s_{i}^{(0)}$, which determines the amount of their investment in the game ($s_{i}^{(0)} \in \{C, D\}$). Those who have the cooperation strategy C invest $r_{c}$ resources, and those who hold the defection strategy D invest $r_{d}$ resources ($r_{c} > r_{d}$). Individuals play round after round of games. Each round consists of two sessions: one PGG session and one pool punishment session.

In a PGG session, individuals invest into a public pool and receive revenues depending on their investments. In round p, each individual is given R units of resources and then simultaneously invests resources into the public pool based on his/her strategy. After the investment, all resources in the public pool are multiplied by a factor $μ > 1$ and then distributed equally to each person as the revenue.

As for the pool punishment, the sanctioning institution identifies wrong-doers based on people’s investments in the PGG and enforces punishments on them. In round p, the sanctioning institution applies a punishment method to determine the punishment intensity $t^{(p)}$ ($0 \leq t^{(p)} \leq 1$), and then punishes all those who invest less. Here $t^{(p)}$ refers to the intensity of the punishment in round p, that is, the punishment rate of the sanctioning institution in round p. Specifically, for individual i who has resources $r_{i}^{(p)}$ after the PGG, if his strategy is $s_{i}^{(p)} = D$, the institution takes $r_{i}^{(p)} \times t^{(p)}$ resources from him as punishment, and he has $r_{i}^{(p)} \times (1 - t^{(p)})$ resources left. As an example, with $t = 0.3$, a defector who currently has 10 units of resources is penalized $10 \times 0.3 = 3$ units of resources by the sanctioning institution and keeps the remaining $10 - 3 = 7$ units of resources for himself.
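The two sessions of a round can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function name `play_round` and the parameter values (`R=9`, `r_c=2`, `r_d=0`, `mu=4`) are hypothetical, and each player is assumed to keep the part of R that was not invested.

```python
def play_round(strategies, R, r_c, r_d, mu, t):
    """Resources of each player after one PGG session and one pool punishment session.

    strategies: list of "C" or "D"; t: punishment intensity in [0, 1].
    """
    n = len(strategies)
    investments = [r_c if s == "C" else r_d for s in strategies]
    # The public pool is multiplied by mu and shared equally among all n players.
    share = mu * sum(investments) / n
    resources = []
    for s, inv in zip(strategies, investments):
        after_pgg = R - inv + share  # assumption: uninvested resources are kept
        # Only defectors are punished; they lose a fraction t of what they hold.
        resources.append(after_pgg * (1 - t) if s == "D" else after_pgg)
    return resources


# Hypothetical parameter values, chosen only for illustration.
payoffs = play_round(["C", "C", "D", "D"], R=9, r_c=2, r_d=0, mu=4, t=0.3)
print(payoffs)
```

With these illustrative values each cooperator ends the round with 11 units, while each defector holds 13 units after the PGG and keeps 70% of them after punishment.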

After these two sessions, people learn from others with a probability based on their current resources. For i and j who have resources $r_{i}$ and $r_{j}$, respectively, the probability of i learning from j is $p_{i j}$:

$$p_{i j} = \frac{1}{1 + e^{\frac{r_{i} - r_{j}}{β}}} \qquad (1)$$

where $β$ indicates how difficult it is to learn. The larger $β$ is, the less likely i is to learn from a better-off j, because $p_{i j}$ then moves toward $1/2$ regardless of the resource gap. If i learns from j, i copies j’s strategy in this round; otherwise, i keeps his own strategy. In addition to learning, a mutation with a small chance $α$ is also introduced into our model. Each mutated person changes his/her strategy from C to D or vice versa. People undergo round after round of the above evolution, culminating in a group-level evolutionarily stable strategy.
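A short sketch of the learning and mutation step may help make Equation (1) concrete. The helper names below are hypothetical, and applying mutation after imitation within the same step is an assumption, since the text does not fix the order.

```python
import math
import random


def learn_probability(r_i, r_j, beta):
    """Probability that i copies j's strategy, following Equation (1)."""
    return 1.0 / (1.0 + math.exp((r_i - r_j) / beta))


def update_strategy(s_i, s_j, r_i, r_j, beta, alpha, rng):
    """One imitation step followed by a possible mutation for player i.

    The order (imitate first, then mutate) is an assumption of this sketch.
    """
    if rng.random() < learn_probability(r_i, r_j, beta):
        s_i = s_j  # i adopts j's strategy
    if rng.random() < alpha:
        s_i = "D" if s_i == "C" else "C"  # mutation flips the strategy
    return s_i


# A poorer player (7 units) facing a richer one (10 units):
p_small_beta = learn_probability(7.0, 10.0, beta=1.0)
p_large_beta = learn_probability(7.0, 10.0, beta=10.0)
print(round(p_small_beta, 3), round(p_large_beta, 3))
```

For a poorer player facing a richer one, the imitation probability is close to 1 when β is small and drifts toward 1/2 as β grows, which is the sense in which a larger β makes learning harder.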

In the PGG with pool punishment, how the punishment institution determines the punishment intensity has an important effect on group cooperation, and the way to determine the punishment intensity is where the punishment method comes into play. Consequently, we conducted a study on punishment methods to investigate whether the negative feedback punishment approach we propose can help sanctioning institutions achieve good performance.

2.1.2. Punishment Methods

Figure 1 depicts how the sanctioning institution influences the group performance through the punishment method. Considering the group as a system, the sanctioning institution with a punishment method can be seen as the environment, and the environment influences the system by changing the system’s input (i.e., the punishment intensity). As two parts of the environment, the sanctioning institution and the punishment method coordinate to change the system’s input. Specifically, the sanctioning institution tells the punishment method its punishment intention, and the punishment method converts that intention into the punishment intensity. Thus, the punishment method is the pivot that turns thinking into doing. When the sanctioning institution alters its punishment intention, the punishment intensity is changed accordingly and then fed directly to the group as a system input. Influenced by different punishment intensities, groups exhibit different performances as the output of the system.

Figure 1. Sanctioning institutions steer the group through punishment methods.

The sanctioning institution’s punishment intention, which is denoted by k, reflects whether the institution expects the punishment to be severe or light. In our model, $k \in [0, 1]$. A strong punishment intention yields a large punishment intensity. Although a large punishment intensity often leads to good group performance, it also results in significant monetary losses for people. The aim, then, is not “the harsher the punishment, the better”, but rather to achieve the sanctioning goal at the lowest possible cost.

To examine the negative feedback punishment approach in depth, we chose one of the most common punishment methods, the constant punishment intensity approach, as a comparison to our proposed negative feedback punishment approach. The constant punishment intensity approach implies that the punishment intensity $t^{(p)}$ is determined by the sanctioning institution’s punishment intention, and is not influenced by other factors, such as group performance or group size. In other words, this punishment method converts the punishment intention into a punishment intensity through the mapping $t^{(p)} = k$.

As for the negative feedback punishment approach, following the earlier introduction of feedback control and negative correlation, the punishment intensity is adjusted according to the system performance and is negatively correlated with that performance. That means $t^{(p)} = g (f_{c}^{(p)}, k)$ with $\frac{\partial g (f_{c}^{(p)}, k)}{\partial f_{c}^{(p)}} < 0$ and $\frac{\partial g (f_{c}^{(p)}, k)}{\partial k} > 0$, where $f_{c}^{(p)}$ is the proportion of cooperators in the group in round p and $g (f_{c}^{(p)}, k)$ is the punishment intensity function. Specifically, the feedback control principle is implemented through the group performance $f_{c}^{(p)}$: any change in group performance modifies the punishment intensity $t^{(p)}$, which in turn steers the group performance further. The idea of negative correlation, on the other hand, is embodied by the function $g (f_{c}^{(p)}, k)$ with $\frac{\partial g (f_{c}^{(p)}, k)}{\partial f_{c}^{(p)}} < 0$. For a given $g (f_{c}^{(p)}, k)$, the punishment intention k influences the value of $\frac{\partial g (f_{c}^{(p)}, k)}{\partial f_{c}^{(p)}}$, which affects how fast the punishment intensity changes when group performance varies. Thus, the intention k from the sanctioning institution and the performance $f_{c}$ of the group jointly determine the value of the punishment intensity. We apply $g (x, k) = k \times 5^{- x}$ as the punishment intensity function in the negative feedback punishment approach to compare it with the constant punishment intensity approach. Please see Table 1 for an explanation of each notation in our model.
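The two mappings from intention to intensity can be written side by side. The function names below are hypothetical; the formulas are $t = k$ and $g (x, k) = k \times 5^{- x}$ as stated in the text. For this choice of g, $\partial g / \partial x = - \ln 5 \times g < 0$ and $\partial g / \partial k = 5^{- x} > 0$, so both sign conditions hold whenever $k > 0$.

```python
def constant_intensity(k, f_c=None):
    """Constant punishment intensity approach: t = k, whatever the cooperation level."""
    return k


def negative_feedback_intensity(k, f_c):
    """Negative feedback approach with g(x, k) = k * 5**(-x)."""
    return k * 5.0 ** (-f_c)


k = 0.6  # illustrative punishment intention
for f_c in (0.0, 0.5, 1.0):
    print(f_c, constant_intensity(k, f_c), round(negative_feedback_intensity(k, f_c), 3))
```

At the same intention, the negative feedback intensity equals k when nobody cooperates and shrinks to k/5 under full cooperation, whereas the constant approach keeps charging k.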

Table 1. Notation table.

3. Results

3.1. Punishment Intention Affects Group Performance

The constant punishment intensity approach and the negative feedback punishment approach are compared by analyzing evolutionarily stable strategies of the group under different punishment intentions. This comparison helps us sort out the characteristics of these two punishment methods. We found that the negative feedback punishment approach can help groups achieve more stable performances than the constant punishment intensity approach.

Figure 2a shows the results under the constant punishment intensity approach. The red line represents $t = k$, and the standard deviation (std) of t (the shaded part) is 0. Red and gray dots represent the punishment intensity and cooperation proportion at the end of each evolution, respectively. The solid lines and shading indicate the mean and std over 10 repetitions for each punishment intention. The optimal point is marked by a red cross. The group performance is measured as the proportion of cooperators in the group. In terms of the mean, the cooperation proportion jumps from 0 to 1 around a punishment intention of $0.2$. Additionally, the std around the intention value $0.2$ is particularly large compared to other values. Taken together, the group performance is highly unstable for intentions around $0.2$. Using this range as the separation, we divided the whole range of k into three regions from small to large: the completely uncooperative region ($k \leq 0.12$), the extremely unstable region ($0.12 < k \leq 0.23$) and the completely cooperative region ($k > 0.23$). The blue dotted line in Figure 2a marks the division between the three regions. The completely uncooperative region and the completely cooperative region correspond to stable group performances with cooperation proportions of 0 and 1, respectively. The extremely unstable region exhibits two characteristics. On the one hand, the group ends up either completely cooperative or completely defective, so that the std is extremely large. On the other hand, this region is very narrow, so the group performance changes sharply over a small range of k.

Figure 2. A group under different punishment methods. (a,b) The effect of punishment intention under the constant punishment intensity approach and the negative feedback punishment approach, respectively. The sampling interval of punishment intention is $0.01$ in our simulation. (c,d) The 50 operation processes of the sanctioning institution under the constant punishment intensity approach and the negative feedback punishment approach, respectively. Each colored line depicts one operation, and the black line shows an average over 50 repetitions. Based on the operations, we compare the two methods in terms of time and cost in (e,f), respectively. Bar heights in (e,f) indicate the mean evolutionary rounds and the average of the cumulative punishment intensity, respectively. Error bars represent standard errors. The detailed setup of the simulation environment is shown in Appendix A.1.

In contrast to the constant punishment intensity approach, group performance under the negative feedback punishment approach can be more stable and less costly. The punishment intensity and group performance under the negative feedback punishment approach are depicted in Figure 2b. The points and lines in Figure 2b have the same meaning as in Figure 2a. As the punishment intention k increases from 0 to 1, the punishment intensity first increases and then follows a U-shape around $0.2$; the proportion of cooperation remains at 0 at the beginning and then increases continuously until it reaches 1. Importantly, the small std values of the punishment intensity and the cooperation proportion suggest that both the punishment from sanctioning institutions and the performances of groups are more stable under the negative feedback punishment approach than under the constant punishment intensity approach. Another salient feature is that the maximum punishment intensity only goes up to about $0.2$ as k varies from 0 to 1. The punishment intensity stays low because the negative feedback punishment approach specifies a negative correlation between the cooperation proportion of the group and the punishment intensity. Due to this negative correlation, when the group performs well, the punishment intensity t does not rise as high as under other punishment methods, and the group bears lower monetary costs.

3.2. Operation of the Sanctioning Institution

Sanctioning institutions are designed to lead groups to a high level of cooperation at a low cost. The higher the punishment intensity, the greater the cost to the group. Thus, the punishment intention that achieves a high level of cooperation with a small punishment intensity is the optimal strategy that enables the best group performance, as marked in Figure 2a,b with red crosses. The role of the sanctioning institution is then to constantly input the punishment intention k into the punishment method until locating the optimal point over the entire k range. The binary search in computer science helps the institution continuously determine k (as introduced in Methods), and on this basis, different punishment methods are analyzed through the comparison of corresponding group performances.
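The search for a suitable intention can be illustrated with a generic bisection. This is only a sketch of the idea, not the procedure from the paper's Methods: the function `find_minimal_intention`, the `target` and `tol` parameters, and the deterministic toy oracle are all hypothetical, and the search assumes that cooperation does not decrease as k increases. In the simulations each query is a stochastic evolution, so a noisy answer can send such a search in the wrong direction.

```python
def find_minimal_intention(cooperation_at, target=0.99, tol=0.01):
    """Bisection on k in [0, 1] for the smallest intention whose cooperation reaches target.

    cooperation_at(k) is a hypothetical oracle returning the group's final
    cooperation proportion; it is assumed to be non-decreasing in k.
    """
    lo, hi = 0.0, 1.0
    if cooperation_at(hi) < target:
        return None  # even the strongest intention does not reach the target
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cooperation_at(mid) >= target:
            hi = mid  # mid is strong enough; try weaker intentions
        else:
            lo = mid  # mid is too weak; try stronger intentions
    return hi


# Toy deterministic oracle: full cooperation once k reaches 0.6 (illustrative only).
k_star = find_minimal_intention(lambda k: 1.0 if k >= 0.6 else 0.0)
print(k_star)
```

Each iteration halves the interval, so reaching a resolution of 0.01 on [0, 1] takes seven queries when the oracle is reliable.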

We explored the constant punishment intensity approach and the negative feedback punishment approach by repeating the institution’s operation 50 times, as shown in Figure 2c,d. Each colored line depicts one operation, and the black line shows an average over 50 repetitions. The maximum value of the x-axis is determined by the maximum evolutionary rounds among all these operations. The constant punishment intensity approach shows significant fluctuations in Figure 2c, which can be attributed to the extremely unstable region where $k \in (0.12, 0.23)$. Specifically, the group exhibits either all cooperation or all defection when k is in this region. Once the sanctioning institution gets feedback from the group that they are currently all cooperative or all defective, the institution narrows the search interval accordingly. Since the group is in neither the completely uncooperative region nor the completely cooperative region, the updated search interval is likely to incorrectly exclude the optimal point, and then the search needs to be restarted. This is why the whole process fluctuates so much. These reversals and restarts mean that “actions” from the sanctioning institution have no clear direction and often change dramatically.

For the negative feedback punishment approach, the punishment intentions rapidly converge to about $0.6$ and are then fine-tuned. Despite the long process of fine-tuning, the proportion of cooperation around the intention of $0.6$ is stable and close to 1, as can be seen from Figure 2b. Compared to the search processes in Figure 2c, sanctioning institutions operate with less volatility, and the group shows a higher proportion of cooperation under the negative feedback punishment approach in Figure 2d. Thus, people under the negative feedback punishment approach can perceive a purposeful and reliable sanctioning institution, and the group performance remains harmonious and stable over time.

The sanctioning institution’s operation is compared in two dimensions: time and cost. Time refers to the mean number of evolutionary rounds from the beginning of the search to the end. Cost means the money or resources people lose in the search process. Since people’s penalties depend on the punishment intensity, we use the average of the cumulative punishment intensity over 50 repetitions as a proxy for cost. The time and money losses during operations are presented in Figure 2e,f. Dark gray bars correspond to the constant punishment intensity approach, and light gray bars represent the negative feedback punishment approach. Groups under the negative feedback punishment approach spend significantly less time and money than groups under the constant punishment intensity approach. This demonstrates that the negative feedback punishment approach is a time-saving and low-cost method for sanctioning institutions compared to the constant punishment intensity approach.

In addition to the constant punishment intensity approach, we also compare the time losses and monetary losses of several other common punishment approaches, including punishing within a certain percentage range, penalizing within a certain amount range and punishing with a constant amount. Punishing within a certain percentage range means that the punishment intensity varies within 98–102% of the punishment intention k. Punishing by amount means that the punishment intensity solely determines the amount of the fine, rather than the punishment rate. When the sanctioning institution penalizes within a certain amount range, the punishment intensity $t \in [0.98 k \times R, 1.02 k \times R]$. When punishing with a constant amount, $t = k \times R$. The operations of the sanctioning institution under these punishment methods are compared in Figure 3. Error bars indicate standard errors. From left to right, the punishment methods are: punishing within a certain percentage range, penalizing within a certain amount range, punishing with a constant amount and the negative feedback punishment approach. We can see that the negative feedback punishment approach shows advantages in terms of both time and money compared to other punishment methods.

Figure 3. Comparison of the operation of sanctioning institutions under different punishment methods. Based on 50 operations of the institution, we compare the time and money consumption of several punishment methods. Bar heights in (a,b) indicate the mean evolutionary rounds and the average of the cumulative punishment intensity over 50 repetitions, respectively. The detailed setup of the simulation environment is shown in Appendix A.1.

3.3. Theoretical Analysis

A theoretical analysis based on replicator dynamics was performed to reveal why the two methods, the negative feedback punishment approach and the constant punishment intensity approach, produce the above results. There are two strategies in the group: cooperation and defection. We analyzed the evolutionarily stable strategies of the group through the expected utilities of these two strategies.

The proportion of cooperation in the group is denoted here by x. For a cooperator, the expected utility is $U_{c} = R - r^{c} + μ \times (r^{c} \times x + r^{d} \times (1 - x))$, and the expected utility for a defector is $U_{d} = (R - r^{d} + μ \times (r^{c} \times x + r^{d} \times (1 - x))) \times (1 - t)$. Then, the average utility of an individual is $\bar{U} = x \times U_{c} + (1 - x) \times U_{d}$. The replicator dynamics of the group are given by $F (x) = x \times (U_{c} - \bar{U})$. The evolutionarily stable strategies correspond to the values of x that satisfy $F (x) = 0$ and ${F (x)}^{'} \leq 0$.
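Substituting $\bar{U}$ into $F (x)$ makes the structure of the dynamics explicit. The short derivation below uses only the definitions above; the shorthand $\Delta$ and $A$ is introduced here for brevity and is not the paper's notation.

```latex
\begin{aligned}
F(x) &= x\,(U_{c} - \bar{U}) = x\,(1 - x)\,(U_{c} - U_{d}),\\
U_{c} - U_{d} &= t\,\bigl(A + \mu\,\Delta\,x\bigr) - \Delta,
\qquad \Delta = r^{c} - r^{d},\quad A = R - r^{d} + \mu\,r^{d}.
\end{aligned}
```

Hence $F (x) = 0$ at $x = 0$, at $x = 1$, and at any interior x where the punishment intensity satisfies $t \times (A + μ \times \Delta \times x) = \Delta$, that is, where a defector's loss to punishment exactly offsets the saving from investing less.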

For the method with constant punishment intensity, the punishment intensity t is a constant and is independent of x. There are three possible equilibrium points that satisfy $F (x) = 0$: $x_{1} = 0$, $x_{2} = 1$ and $x_{3} = \frac{r^{c} - r^{d} + r^{d} \times t - R \times t - r^{d} \times μ \times t}{(r^{c} - r^{d}) \times μ \times t}$. For each of these three points, let us analyze the conditions for being an evolutionarily stable strategy:

For $x_{1}$, ${F (x)}^{'} \leq 0$ when $h_{1} = (R - r^{d} + r^{d} \times μ) \times t + r^{d} - r^{c} \leq 0$, so $x_{1} = 0$ is a stable equilibrium point when $h_{1} \leq 0$.
For $x_{2}$, ${F (x)}^{'} \leq 0$ when $h_{2} = r^{c} - r^{d} + (r^{d} - R - r^{c} \times μ) \times t \leq 0$, so $x_{2} = 1$ is a stable equilibrium point when $h_{2} \leq 0$.
For $x_{3}$, it is impossible to satisfy both $0 \leq t \leq 1$ and ${F (x)}^{'} \leq 0$ in any simulation settings. Accordingly, $x_{3}$ is an unstable equilibrium point in the constant punishment intensity approach.
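The three conditions above translate into two threshold intensities, which can be evaluated numerically. The sketch below is illustrative: the function names are hypothetical, and the parameter values (`R=9`, `r_c=2`, `r_d=0`, `mu=4`) are assumptions picked so that the upper threshold equals the value $\frac{2}{9}$ that the text quotes for the simulation setup; they are not taken from the paper's appendix.

```python
def constant_thresholds(R, r_c, r_d, mu):
    """Punishment intensities that bound the range where both x = 0 and x = 1 are stable."""
    delta = r_c - r_d
    t_low = delta / (R - r_d + mu * r_c)   # h2 <= 0  <=>  t >= t_low  (x = 1 stable)
    t_high = delta / (R - r_d + mu * r_d)  # h1 <= 0  <=>  t <= t_high (x = 0 stable)
    return t_low, t_high


def interior_equilibrium(t, R, r_c, r_d, mu):
    """x3 from F(x) = 0. It is unstable for constant t because
    F'(x3) = x3 * (1 - x3) * t * mu * (r_c - r_d) > 0 whenever 0 < x3 < 1."""
    delta = r_c - r_d
    return (delta - t * (R - r_d + mu * r_d)) / (delta * mu * t)


params = dict(R=9, r_c=2, r_d=0, mu=4)  # hypothetical values
t_low, t_high = constant_thresholds(**params)
x3 = interior_equilibrium(0.15, **params)
print(round(t_low, 4), round(t_high, 4), round(x3, 4))
```

Under these assumed values, intensities between roughly 0.118 and 0.222 leave both full defection and full cooperation stable, with an unstable interior point between them, so the outcome of a run depends on which side of that point the group happens to fall.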

$x_{1}$ and $x_{2}$ correspond to the completely uncooperative and completely cooperative regions in Figure 2a, respectively. Under conditions satisfying $0 \leq x_{3} \leq 1$, the group eventually evolves to either $x_{1}$ or $x_{2}$, which explains the existence of the extremely unstable region.

For the negative feedback punishment approach, $t = g (x, k)$ has a direct impact on ${F (x)}^{'}$, and this is the biggest difference between these two punishment methods. With $\frac{\partial g (x, k)}{\partial x} < 0$ in this case, there are also three possible equilibrium points that satisfy $F (x) = 0$: $x_{1} = 0$, $x_{2} = 1$ and $x_{3}$. Here $x_{3}$ refers to all points satisfying $p (x) = g (x, k)$, where $p (x) = \frac{r^{c} - r^{d}}{x \times μ (r^{c} - r^{d}) + R - r^{d} + μ \times r^{d}}$. All three points are analyzed below:

For $x_{1}$, ${F (x)}^{'} \leq 0$ when $h_{1} = (R - r^{d} + r^{d} \times μ) \times g (0, k) + r^{d} - r^{c} \leq 0$, so $x_{1} = 0$ is a stable equilibrium point when $h_{1} \leq 0$. In our simulation setup, this means that $g (0, k) \leq \frac{r^{c} - r^{d}}{R - r^{d} + r^{d} \times μ} = \frac{2}{9}$. Thus, $x_{1}$ corresponds to the part in Figure 2b where the cooperation proportion equals 0.
For $x_{2}$, ${F (x)}^{'} \leq 0$ when $h_{2} = r^{c} - r^{d} + (r^{d} - R - r^{c} \times μ) \times g (1, k) \leq 0$, so $x_{2} = 1$ is a stable equilibrium point when $h_{2} \leq 0$. In our simulation, $x_{2}$ corresponds to the part in Figure 2b where the cooperation proportion equals 1.
For $x_{3}$, all the x satisfying $p (x) = g (x, k)$ and ${F (x)}^{'} \leq 0$ are stable equilibrium points. The conditions for such a point depend on the punishment intensity function $g (x, k)$. In our simulation, as shown in Figure 4a, for any given $x^{'} \in (0, 1)$, there always exists one $k^{'}$ such that $p (x^{'}) = g (x^{'}, k^{'})$ holds. For $(x^{'}, k^{'})$ that satisfies $p (x^{'}) = g (x^{'}, k^{'})$, Figure 4b depicts the points meeting ${F (x^{'})}^{'} \leq 0$. Thus, along with the corresponding k, any $x \in (0, 1)$ can be a stable equilibrium point. These equilibrium points between 0 and 1 correspond to the part where the group’s cooperation proportion is between 0 and 1 in Figure 2b.

Figure 4. Numerical analysis of the possible equilibrium point $x_{3}$ in the negative feedback punishment approach. (a) All $(x, k)$ that satisfy $p (x) = g (x, k)$. For each x there is one and only one k that makes $(x, k)$ meet the condition. (b) This shows that ${F (x)}^{'} \leq 0$ for every $(x, k)$ satisfying $p (x) = g (x, k)$.
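The kind of numerical check summarized in Figure 4 can be sketched as follows. For $g (x, k) = k \times 5^{- x}$, the condition $p (x) = g (x, k)$ gives $k = p (x) \times 5^{x}$, and at an interior equilibrium the sign of ${F (x)}^{'}$ equals the sign of the derivative of $g (x, k) \times (R - r^{d} + μ \times r^{d} + μ \times (r^{c} - r^{d}) \times x)$ with respect to x. The function names and parameter values (`R=9`, `r_c=2`, `r_d=0`, `mu=4`) are hypothetical; the values are chosen only to reproduce the $\frac{2}{9}$ threshold quoted in the text.

```python
import math


def p(x, R, r_c, r_d, mu):
    """Intensity at which cooperators and defectors earn the same: p(x) in the text."""
    delta = r_c - r_d
    return delta / (x * mu * delta + R - r_d + mu * r_d)


def intention_for(x, R, r_c, r_d, mu):
    """The unique k with p(x) = g(x, k) when g(x, k) = k * 5**(-x)."""
    return p(x, R, r_c, r_d, mu) * 5.0 ** x


def stability_slope(x, k, R, r_c, r_d, mu):
    """Derivative in x of g(x, k) * (a + mu*delta*x); F'(x) has this sign at an interior equilibrium."""
    delta = r_c - r_d
    a = R - r_d + mu * r_d
    g = k * 5.0 ** (-x)
    return g * (mu * delta - math.log(5.0) * (a + mu * delta * x))


params = dict(R=9, r_c=2, r_d=0, mu=4)  # hypothetical values
grid = [i / 10 for i in range(1, 10)]
ks = [intention_for(x, **params) for x in grid]
slopes = [stability_slope(x, k, **params) for x, k in zip(grid, ks)]
print([round(k, 3) for k in ks])
print(all(s < 0 for s in slopes))
```

Under these assumed values every grid point has a negative slope, so each interior x is stable for its matching k, and the matching intentions rise from about 0.22 at $x = 0$ to about 0.59 at $x = 1$.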

Comparing the two methods, the reason $x_{3}$ can be an evolutionarily stable strategy (ESS) in the negative feedback punishment approach is that $g (x, k)$ adds a constraint between punishment intensity and group performance. Specifically, $p (x)$ is a monotonic function. As shown in Figure 4a, there is only one k such that $p (x) = g (x, k)$ for every x, which means that each evolutionarily stable point x corresponds one-to-one to a value of $g (x, k)$. Thus, we can say that negative correlation makes the negative feedback punishment approach more stable than the constant punishment intensity approach.

Since the negative feedback punishment approach performs well, it is natural to ask whether a positive feedback punishment approach also works. The positive feedback punishment approach means that the punishment intensity is positively correlated with the group performance. The equilibrium point $x_{3}$ under the negative feedback punishment approach implies that each x can be a stable point when $x \in (0, 1)$, so the cooperation proportion can change continuously from 0 to 1 while remaining stable. We test whether $x_{3}$ is an ESS under the positive feedback punishment approach to reveal the performance of this approach. It turns out that there is no stable equilibrium point when the cooperation proportion is between 0 and 1 under the positive feedback punishment approach. Therefore, the theoretical analysis suggests that group performance under the positive feedback punishment approach may also be unstable, just like the group performance under the constant punishment intensity approach. A detailed simulation comparison between the positive and negative feedback punishment approaches is shown in Appendix A.3. Overall, group performances with the negative feedback punishment approach are better than those under both the constant punishment intensity approach and the positive feedback punishment approach.

4. Conclusions

The negative feedback punishment approach is a blend of feedback control and negative correlation, both of which work together to help the sanctioning institution achieve good performance. The punishment method specifies how to convert the sanctioning institution’s punishment intention into punishment intensity. Inspired by feedback control, we note that the group cooperation rate can be used as an input to the punishment method to determine the punishment intensity. Based on this, we constructed a negative correlation between cooperation proportion and punishment intensity to specify the transformation of punishment intention into punishment intensity. We demonstrated that the negative feedback punishment approach is a stable, time-saving, and low-cost punishment method that helps sanctioning institutions develop punishment intensity and further promote high-level group cooperation.

5. Discussion

It is obvious that a time-saving and low-cost punishment method can help people reduce their losses. What, then, are the benefits of stability? For an enforcement agency, unstable group performance can easily lead to misjudgment of the current sanction, which in turn leads to misformulation of the next sanction. Being inaccurately sanctioned from time to time can be a disaster for the community. If people are sanctioned harshly one moment and punished only lightly the next, the agency appears unpredictable and fickle in people’s minds, and people will lose trust in the agency. This is how the group feels under the constant punishment intensity approach. On the contrary, if the group performance is stable, as it is under the negative feedback punishment approach, the sanctioning institution is able to quickly narrow the range of intentions to the neighborhood of a certain value and then fine-tune it. During the long fine-tuning process, the percentage of group cooperation remains high. For people, the group becomes increasingly cooperative, and the sanctioning agency is competent and reliable. The society is then positive and harmonious. Thus, the stability of the punishment method is critical for both the sanctioning institution and the group.

In addition to the practical inspiration for sanctioning institutions to achieve stable, time-saving and low-cost performance, the negative feedback punishment approach also has theoretical implications for the further study of punishment methods. Many common punishment methods act as open-loop controls in which group performance affects only the punishment intention and not the punishment method. In contrast, in the negative feedback punishment approach, the cooperation proportion of the group has a direct impact on both the punishment intention and the punishment method. In this case, the sanctioning institution, the punishment method and the group together form a system. Stability, time loss and monetary cost then become the system performance indicators, and real-life limitations such as jurisdiction and law serve as institutional constraints. Good performance by the institutions requires a high level of cooperation at low cost during the whole operation, so helping sanctioning institutions achieve good performance can be understood as an optimal control problem. Developing a good punishment method then becomes a matter of finding the control strategy that optimizes the system performance indicators under the given constraints.

Although we designed the negative feedback punishment approach for sanctioning institutions, the deployment of this approach is not limited to those enforcement agencies that impose sanctions. A variety of management departments can also apply negative feedback methods. For example, HR departments typically use employee performance as the “input” to employee evaluations. On this basis, we recommend applying employee performance to guide the development of employee evaluation rules, just as the negative feedback punishment approach uses group performance to develop the punishment intensity. Furthermore, the “input” of various policies in real life is mainly the performance of individuals or groups. However, it has been witnessed that many policies still fail to achieve the desired results. Inspired by the negative feedback punishment approach, in addition to using group performance as a policy input, we should also consider the application of performance to policy development. Commonly, many institutions actually use their perceptions of system performance as a basis for intervening in the system. However, the negative feedback punishment approach allows us to have new perceptions of system performance in policy making, so that we can be more efficient and harvest good results.

Author Contributions

J.Q. and X.S. conceived the experiments; J.Q. conducted the experiments and analyzed the results; J.Q., X.S. and Z.W. wrote the paper. All authors reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Key R&D Program of China No. 2021YFF0900800.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data in this paper are available at https://datadryad.org/stash/share/cP22Z1eqABzluVH9hZkBXPULcsKg-j2dVGHv_-qMtm4 (accessed on 25 July 2022).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Simulation Setup

The group consisted of 100 individuals. Before the evolutionary simulation began, each individual randomly chose a strategy from the strategy set $\{C, D\}$ with equal probability as their initial strategy. In our PGG with pool punishment, the initial resource was $R = 5$. Individuals with strategy C invested $r_{c} = 4$, and individuals whose strategy was D invested $r_{d} = 2$. After the investment, the resources in the public pool were multiplied by $μ = 3$. As for learning and mutation, the learning noise was $β = 0.1$ and the mutation rate was $α = 0.05$. The evolution consisted of 300 rounds under each punishment intention k. We report the evolutionary rounds, punishment intensity and cooperation proportion during the evolution as our simulation results.
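
The setup above can be collected into a small configuration object. The following is only a sketch of how the stated parameters and the random initial strategies might be encoded; the names `SimulationSetup`, `initial_strategies` and `cooperation_proportion` are hypothetical and do not come from the authors' code, and the learning and mutation dynamics themselves are not reproduced here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationSetup:
    """Parameter values quoted in Appendix A.1 (field names are assumptions)."""
    n: int = 100         # population size
    R: float = 5.0       # resource allocated to each individual in each round
    r_c: float = 4.0     # investment of an individual playing C
    r_d: float = 2.0     # investment of an individual playing D
    mu: float = 3.0      # multiplier applied to the public pool
    beta: float = 0.1    # learning noise
    alpha: float = 0.05  # mutation rate
    rounds: int = 300    # evolutionary rounds per punishment intention k


def initial_strategies(setup, rng):
    """Each individual picks C or D with equal probability."""
    return [rng.choice(("C", "D")) for _ in range(setup.n)]


def cooperation_proportion(strategies):
    """Fraction of individuals currently playing C."""
    return strategies.count("C") / len(strategies)


setup = SimulationSetup()
strategies = initial_strategies(setup, random.Random(0))
print(cooperation_proportion(strategies))
```

Passing an explicit `random.Random` instance keeps a run reproducible, which is convenient when averaging over repeated operations.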

Appendix A.2. Binary Search by Sanctioning Institutions

Regardless of the punishment method applied, the sanctioning institution used the same means, the binary search, to find the optimal punishment intention k in our simulation. Based on the binary search in computer science [39], here we introduce the way that sanctioning institutions find the optimal point, that is, the operation process of these institutions.

We define $\mathit{start}$ and $\mathit{end}$ as the lower and upper bounds of the searching interval, respectively. Additionally, the punishment intention k equals the midpoint of the current searching interval, $k = \frac{\mathit{start} + \mathit{end}}{2}$. The whole searching process of institutions is a continuous updating of the searching interval $[\mathit{start}, \mathit{end}]$ and the intention k based on the current cooperation proportion of the group, $f_{c}^{(p)}$.

The searching interval is initialized to the entire range of values for k, which in our simulation is $[\mathit{start}, \mathit{end}] = [0, 1]$ for all the punishment methods. The initial intention k is randomly generated from the searching interval $[0, 1]$. Updating the search interval is intended to narrow the interval so as to find the optimal point as soon as possible. Specifically, the evolutionarily stable group exhibiting $f_{c}^{(p)} > θ$ implies that the current punishment intention k is too large, so the searching interval will be updated to $[\mathit{start}, k]$; otherwise, the new searching interval is $[k, \mathit{end}]$. Here, $θ = 0.95$ refers to the institution’s goal for the group cooperation proportion. The new searching interval is accompanied by the update of k. The next punishment intention k is taken in the same way as described above, being the midpoint of the new interval.

The whole search stops when the searching interval is small enough, that is, when $(\mathit{end} - \mathit{start}) < ε$ ($ε = 0.01$). In addition, the case $(\mathit{end} - \mathit{start}) < 2 ε$ and $g (x) < (θ - 0.05)$ means that the searching interval is already very small but the group performance is still far from the expectation. To ensure the robustness of the search, if this case occurs, the searching interval is reset to the initial space, and the punishment intention k is generated randomly in this interval.
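
The search procedure described in this appendix can be sketched as follows. This is an illustrative reading, not the authors' implementation: `search_intention` and `toy_response` are hypothetical names, `evolve` stands in for a full 300-round evolution that returns the stable cooperation proportion, and the reset test is interpreted here as the cooperation proportion falling more than 0.05 short of θ, in line with the description "the group performance is still far from the expectation". The `max_steps` guard is also an added assumption.

```python
import random


def search_intention(evolve, theta=0.95, eps=0.01, rng=None, max_steps=10_000):
    """Bisect the punishment intention k on [0, 1].

    evolve(k) is a hypothetical callable returning the evolutionarily
    stable cooperation proportion reached under intention k.
    Returns the final interval and the list of (k, proportion) tried.
    """
    rng = rng or random.Random()
    start, end = 0.0, 1.0
    k = rng.uniform(start, end)          # the first intention is random
    history = []
    for _ in range(max_steps):
        f_c = evolve(k)
        history.append((k, f_c))
        if f_c > theta:
            end = k                      # k is too large: keep [start, k]
        else:
            start = k                    # otherwise keep [k, end]
        # Robustness rule (assumed reading): the interval is already tiny
        # but cooperation is far below the goal, so restart from [0, 1].
        if (end - start) < 2 * eps and f_c < theta - 0.05:
            start, end = 0.0, 1.0
            k = rng.uniform(start, end)
            continue
        if (end - start) < eps:          # stopping criterion
            break
        k = (start + end) / 2            # next intention is the midpoint
    return start, end, history


def toy_response(k):
    """Made-up monotone group response, used only to exercise the search."""
    return min(1.0, 2.0 * k)


start, end, history = search_intention(toy_response, rng=random.Random(42))
print(start, end, len(history))
```

With the toy response the goal θ = 0.95 is crossed at k = 0.475, so the returned interval should bracket that value. A noisy or non-monotone response, such as the one the text attributes to the constant punishment intensity approach, would make the bisection far less reliable, which is the situation the reset rule is meant to handle.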

Appendix A.3. Further Comparison

We use different expressions of negative correlation to support the effectiveness and generality of the negative feedback punishment approach. We apply two representative functions, the exponential function and the linear function, as different negative correlation expressions. Group performances with the exponential function $g (x, k) = k \times e^{- x}$ and the linear function $g (x, k) = k \times (1 - x)$ are shown in Figure A1a,c. The results demonstrate that although the expressions are different, the group performances are still characterized by stability and continuous change, like that observed in Figure 2b.

From the above, we notice that k is the key to enabling negative feedback to function well in both the theoretical analysis and the simulation comparison. We summarize a few notes about k in the following. First, the minimum value of k needs to ensure that $g (x, k)$ is small enough when $x \in [0, 1]$, so that there is room to increase $g (x, k)$ and thereby facilitate cooperation. This mirrors the condition for $x_{1}$ to be a stable point, namely $g (0, k) \leq \frac{2}{9}$. Second, the form of the negative correlation and k together determine how a change in k affects the punishment intensity and thus the cooperation proportion. Figure A1a,c shows the cases where changes in k cause weak and strong variations in punishment intensity. The punishment intensity functions in (a) and (c) are $g (x, k) = k \times e^{- x}$ and $g (x, k) = k \times (1 - x)$, respectively. Third, the form of the negative correlation and k together also limit the changes in the cooperation proportion. For example, the maximum cooperation proportion is about 1 in Figure 2b, and it is about $0.8$ in Figure A1c.
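
To make these notes concrete, the boundary quantities $h_{1}$ and $h_{2}$ from the theoretical analysis can be evaluated for the two negative correlation forms used here. The sketch below assumes that the boundary conditions of Section 3.3 carry over unchanged to these alternative forms, and the function names are hypothetical. Under that assumption the linear form gives $g (1, k) = 0$ for every k, so $h_{2}$ stays positive and full cooperation is never a stable point; this is one possible reading of why the cooperation proportion tops out below 1 in Figure A1c, not a claim made in the text.

```python
import math

# Simulation values from Appendix A.1.
R, r_c, r_d, mu = 5.0, 4.0, 2.0, 3.0


def g_exp(x, k):
    """Exponential negative correlation: g(x, k) = k * e^(-x)."""
    return k * math.exp(-x)


def g_lin(x, k):
    """Linear negative correlation: g(x, k) = k * (1 - x)."""
    return k * (1.0 - x)


def h1(g, k):
    """x = 0 is a stable point when h1 <= 0."""
    return (R - r_d + r_d * mu) * g(0.0, k) + r_d - r_c


def h2(g, k):
    """x = 1 is a stable point when h2 <= 0."""
    return r_c - r_d + (r_d - R - r_c * mu) * g(1.0, k)


for name, g in (("exponential", g_exp), ("linear", g_lin)):
    for k in (0.1, 0.5, 1.0):
        print(f"{name:11s} k={k:.1f}  h1={h1(g, k):+.3f}  h2={h2(g, k):+.3f}")
```

Both forms satisfy $g (0, k) = k$, so the all-defect state stops being stable once k exceeds $\frac{2}{9}$ in either case; the two forms differ only in how much punishment remains as cooperation approaches 1.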

In addition to the form of negative correlation, we also use the positive feedback punishment approach as a comparison to the negative one. Similarly, the exponential and linear functions are employed to observe the group performance under positive feedback. In both cases of $g (x, k) = k \times (2^{x} - 1)$ (shown in Figure A1b) and $g (x, k) = k \times x$ (shown in Figure A1d), the punishment intensity exhibits continuous fluctuations, resulting in a large variation in the proportion of group cooperation. Considering that the effects of the intention on groups are the same under the positive feedback punishment approach and the constant punishment intensity approach, we speculate that the operation of sanctioning institutions under the positive feedback punishment approach would cause significant time loss and cost loss, just as it does under the constant punishment intensity approach. Taken together, group performances with the negative feedback punishment approach are better than those under the constant punishment intensity approach and the positive feedback punishment approach.
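
For reference, the two positive feedback forms can be written down and checked numerically. This is a minimal sketch with hypothetical function names; it only confirms that both forms increase with the cooperation proportion, vanish when nobody cooperates, and reach k when everyone cooperates. It does not reproduce the evolutionary simulation behind Figure A1b,d.

```python
def g_pos_exp(x, k):
    """Exponential positive feedback: g(x, k) = k * (2^x - 1)."""
    return k * (2.0 ** x - 1.0)


def g_pos_lin(x, k):
    """Linear positive feedback: g(x, k) = k * x."""
    return k * x


def is_increasing(g, k, steps=100):
    """Check on a grid over [0, 1] that g(., k) strictly increases in x."""
    values = [g(i / steps, k) for i in range(steps + 1)]
    return all(later > earlier for earlier, later in zip(values, values[1:]))


for name, g in (("exponential", g_pos_exp), ("linear", g_pos_lin)):
    print(name, is_increasing(g, 0.5), g(0.0, 0.5), g(1.0, 0.5))
```

Because the intensity is zero at $x = 0$ in both forms, defectors face no punishment exactly when cooperation has collapsed, which is the opposite of the negative feedback design.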

Figure A1. Comparison of different functions in negative and positive feedback punishment approaches. (a,c) The effects of punishment intention on group performance under the negative feedback punishment approach; (b,d) the results under the positive feedback punishment approach. Red and gray dots represent the punishment intensity and cooperation proportion at the end of each evolution, respectively. The solid line and shading indicate the mean and standard deviation over 10 repetitions for each punishment intention. The sampling interval of punishment intention was $0.01$.

References

1. Boockmann, B.; Thomsen, S.L.; Walter, T. Intensifying the use of benefit sanctions: An effective tool to increase employment? IZA J. Labor Policy 2014, 3, 21.
2. Schunk, D.; Wagner, V. What determines the willingness to sanction violations of newly introduced social norms: Personality traits or economic preferences? Evidence from the COVID-19 crisis. J. Behav. Exp. Econ. 2021, 93, 101716.
3. Schnegg, M.; Linke, T. Living institutions: Sharing and sanctioning water among pastoralists in Namibia. World Dev. 2015, 68, 205–214.
4. Cory, D.C.; Germani, A.R. Criminal sanctions for agricultural violations of the Clean Water Act. Water Policy 2002, 4, 491–514.
5. Felbermayr, G.; Morgan, T.C.; Syropoulos, C.; Yotov, Y.V. Understanding economic sanctions: Interdisciplinary perspectives on theory and evidence. Eur. Econ. Rev. 2021, 135, 103720.
6. Bown, C.P. The US–China trade war and Phase One agreement. J. Policy Model. 2021, 43, 805–843.
7. Midgette, G.; Kilmer, B.; Nicosia, N.; Heaton, P. A Natural Experiment to Test the Effect of Sanction Certainty and Celerity on Substance-Impaired Driving: North Dakota’s 24/7 Sobriety Program. J. Quant. Criminol. 2021, 37, 647–670.
8. Arguedas, C. To comply or not to comply? Pollution standard setting under costly monitoring and sanctioning. Environ. Resour. Econ. 2008, 41, 155–168.
9. Putterman, L.; Tyran, J.R.; Kamei, K. Public goods and voting on formal sanction schemes. J. Public Econ. 2011, 95, 1213–1222.
10. Lashkaripour, A. The cost of a global tariff war: A sufficient statistics approach. J. Int. Econ. 2021, 131, 103419.
11. Gowin, K.D.; Wang, D.; Jory, S.R.; Houmes, R.; Ngo, T. Impact on the firm value of financial institutions from penalties for violating anti-money laundering and economic sanctions regulations. Financ. Res. Lett. 2020, 40, 101675.
12. Besedeš, T.; Goldbach, S.; Nitsch, V. Cheap talk? Financial sanctions and non-financial firms. Eur. Econ. Rev. 2021, 134, 103688.
13. Crozet, M.; Hinz, J.; Stammann, A.; Wanner, J. Worth the pain? Firms’ exporting behaviour to countries under sanctions. Eur. Econ. Rev. 2021, 134, 103683.
14. Hufbauer, G.; Oegg, B. Economic Sanctions for Foreign Policy Purposes: A Survey of the Twentieth Century. In Handbook on International Trade Policy; Kerr, W.A., Gaisford, J.D., Eds.; Edward Elgar: Cheltenham, UK, 2007; Chapter 47.
15. Sigmund, K.; Silva, H.D.; Traulsen, A.; Hauert, C. Social learning promotes institutions for governing the commons. Nature 2010, 466, 861–863.
16. Hilbe, C.; Traulsen, A.; Röhl, T.; Milinski, M. Democratic decisions establish stable authorities that overcome the paradox of second-order punishment. Proc. Natl. Acad. Sci. USA 2014, 111, 752–756.
17. Schoenmakers, S.; Hilbe, C.; Blasius, B.; Traulsen, A. Sanctions as honest signals—The evolution of pool punishment by public sanctioning institutions. J. Theor. Biol. 2014, 356, 36–46.
18. Alventosa, A.; Antonioni, A.; Hernández, P. Pool punishment in public goods games: How do sanctioners’ incentives affect us? J. Econ. Behav. Organ. 2021, 185, 513–537.
19. Cobo-Reyes, R.; Katz, G.; Meraglia, S. Endogenous sanctioning institutions and migration patterns: Experimental evidence. J. Econ. Behav. Organ. 2019, 158, 575–606.
20. Yamagishi, T. The Provision of a Sanctioning System as a Public Good. J. Personal. Soc. Psychol. 1986, 51, 110–116.
21. Sigmund, K.; Hauert, C.; Traulsen, A.; de Silva, H. Social control and the social contract: The emergence of sanctioning systems for collective action. Dyn. Games Appl. 2011, 1, 149–171.
22. Nowak, M.A. Five Rules for the Evolution of Cooperation. Science 2006, 314, 1560–1563.
23. Gürdal, M.Y.; Gürerk, Ö.; Yahşi, M. Culture and prevalence of sanctioning institutions. J. Behav. Exp. Econ. 2021, 92, 101692.
24. Ozono, H.; Kamijo, Y.; Shimizu, K. Punishing second-order free riders before first-order free riders: The effect of pool punishment priority on cooperation. Sci. Rep. 2017, 7, 14379.
25. Sasaki, T.; Uchida, S.; Chen, X. Voluntary rewards mediate the evolution of pool punishment for maintaining public goods in large populations. Sci. Rep. 2015, 5, 8917.
26. Perc, M. Sustainable institutionalized punishment requires elimination of second-order free-riders. Sci. Rep. 2012, 2, 344.
27. Zhang, B.; Li, C.; Silva, H.D.; Bednarik, P.; Sigmund, K. The evolution of sanctioning institutions: An experimental approach to the social contract. Exp. Econ. 2014, 17, 285–303.
28. Safarzynska, K. Collective punishment promotes resource conservation if it is not enforced. For. Policy Econ. 2020, 113, 102121.
29. Nockur, L.; Pfattheicher, S.; Keller, J. Different punishment systems in a public goods game with asymmetric endowments. J. Exp. Soc. Psychol. 2021, 93, 104096.
30. Baldassarri, D.; Grossman, G. Centralized sanctioning and legitimate authority promote cooperation in humans. Proc. Natl. Acad. Sci. USA 2011, 108, 11023–11027.
31. Earl, A. Methodological issues in examining sanctions: Reflections on conducting research in Russia. Tour. Manag. Perspect. 2021, 39, 100858.
32. Li, J.; Xiao, E.; Houser, D.; Montague, P.R. Neural responses to sanction threats in two-party economic exchange. Proc. Natl. Acad. Sci. USA 2009, 106, 16835–16840.
33. Pogarsky, G.; Piquero, A.R.; Paternoster, R. Modeling Change in Perceptions about Sanction Threats: The Neglected Linkage in Deterrence Theory. J. Quant. Criminol. 2004, 20, 343–369.
34. Meyer, K.E.; Thein, H.H. Business under adverse home country institutions: The case of international sanctions against Myanmar. J. World Bus. 2014, 49, 156–171.
35. Kim, O. The impact of economic sanctions on audit pricing. J. Contemp. Account. Econ. 2021, 17, 100257.
36. Hagen, A.; Schneider, J. Trade sanctions and the stability of climate coalitions. J. Environ. Econ. Manag. 2021, 109, 102504.
37. Moor, T.D.; Farjam, M.; van Weeren, R.; Bravo, G.; Forsman, A.; Ghorbani, A.; Dehkordi, M.A.E. Taking sanctioning seriously: The impact of sanctions on the resilience of historical commons in Europe. J. Rural Stud. 2021, 87, 181–188.
38. Doyle, J.C.; Francis, B.A.; Tannenbaum, A.R. Feedback Control Theory; Macmillan Publishing Company: New York, NY, USA, 1990.
39. Anderson, A. A note on searching a binary search tree. Software 1991, 21, 1125–1128.

Figure 1. Sanctioning institutions steer the group through punishment methods.

Figure 2. A group under different punishment methods. (a,b) The effect of punishment intention under the constant punishment intensity approach and the negative feedback punishment approach, respectively. The sampling interval of punishment intention is $0.01$ in our simulation. (c,d) The 50 operation processes of the sanctioning institution under the constant punishment intensity approach and the negative feedback punishment approach, respectively. One colorful line depicts one operation, and the black line shows an average over 50 repetitions. Based on the operations, we compare the two methods in terms of time and cost in (e,f), respectively. Bar heights in (e,f) indicate the mean evolutionary rounds and the average of the cumulative punishment intensity, respectively. Error bars represent standard errors. The detailed setup of the simulation environment is shown in Appendix A.1.

Figure 3. Comparison of the operation of sanctioning institutions under different punishment methods. Based on 50 operations of the institution, we compare the time and money consumption of several punishment methods. Bar heights in (a,b) indicate the mean evolutionary rounds and the average of the cumulative punishment intensity over 50 repetitions, respectively. The detailed setup of the simulation environment is shown in Appendix A.1.

Table 1. Notation table.

Notation	Explanation
n	population size
$s_{i}^{(0)}$	initial strategy of individual i
$s_{i}^{(p)}$	individual i’s strategy in round p
C	Cooperation, one of two strategies
D	Defection, the other of two strategies
$r_{c}$	amount of resources contributed by individuals with the cooperation strategy
$r_{d}$	amount of resources contributed by individuals with the defection strategy
R	amount of resources allocated to each individual in each round
$μ$	synergy coefficient in the public goods game
$t^{(p)}$	punishment intensity in round p
$β$	difficulty factor of learning
$α$	mutation rate
k	punishment intention of the sanctioning institution
$f_{c}^{(p)}$	percentage of cooperation strategy individuals in the group in round p
$g (f_{c}^{(p)}, k)$	punishment intensity function of the negative feedback punishment approach

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
