Repeated Interaction and Its Impact on Cooperation and Surplus Allocation—An Experimental Analysis

Abstract: This paper investigates how the possibility of affecting group composition, combined with the possibility of repeated interaction, impacts cooperation within groups and surplus distribution. We developed and experimentally tested a Surplus Allocation Game where cooperation of four agents is needed to produce surplus, but only two have the power to allocate it among the group members. Three matching procedures (corresponding to three separate experimental treatments) were used to test the impact of the variables of interest. A total of 400 subjects participated in our research, which was computer-based and conducted in a laboratory. Our results show that allowing for repeated interaction with the same partners leads to a self-selection of agents into groups with different life spans, whose duration is correlated with the behavior of both distributors and receivers. While behavior at the group level is diverse in terms of surplus allocation and amount of cooperation, aggregate behavior is similar whether or not repeated interaction is allowed. We developed a behavioral model that captures the dynamics observed in the experimental data and sheds light on the rationales that drive the agents’ individual behavior, suggesting that the most generous distributors are those acting out of fear of rejection rather than true generosity, while the groups lasting the longest are those composed of this type of distributor and “undemanding” receivers.


Introduction
One defining aspect of human life is cooperation. Cooperation is at the core of the relations between family members and friends, just as cooperation between co-workers is crucial for any employment relationship. While cooperation fulfils many needs of human beings, one main reason for its paramount importance is the production of surplus. A cooperating group can achieve more than the sum of what each individual can achieve on his/her own. This holds true for co-workers assembling a car as well as for the founders of a modern startup or scientists collaborating on a research project.
Even though the production of surplus provides an incentive to cooperate, it also has the potential to cause conflicts, which can lead to an efficiency loss. It is well known that subjects care about allocative fairness and may even give up part of their own payoff in order to prevent unfair outcomes (for an overview of the experimental evidence see, e.g., [1]). In our context this implies that if the distribution of the surplus is not satisfactory for all cooperating agents, some agents might stop cooperating, and if an unsatisfactory surplus distribution is foreseen, cooperation might be prevented from the very beginning. Successful cooperation requires a split of the surplus that is satisfactory for all agents involved in the generation of the surplus [2].
Obviously, if enforceable contracts can be signed a priori, the problem of a satisfactory surplus distribution can be easily solved. However, such contracts are often not feasible, e.g., because it might be impossible to foresee all the contingencies that can occur during the surplus production. Without binding a priori contracts, the final surplus distribution is determined by the bargaining power the agents have after the surplus is already produced. In cases where agents have very uneven bargaining power, cooperation might be refused a priori by those agents who expect to have low ex-post bargaining power.
Repeated cooperation and information about past surplus distribution may appear to solve or reduce the problem of uneven bargaining power, leading to efficient cooperation and satisfactory surplus allocation. There are two reasons for this. On the one hand, repeated surplus distribution might allow the weak agents to get information about the specific behavior of each individual strong agent. Hence, the weak agents can cooperate or not depending on the past behavior of their strong partners. This provides the strong agents with an incentive not to abuse their bargaining power, leading to more cooperation and more equal surplus distribution (see, e.g., [3]). If information on prior behavior is provided, this mechanism holds even when the group composition changes exogenously. Take, as an example, a re-assignment of workers within the same firm: it is likely that, despite the re-assignment, workers have some information about the past behavior of their new teammates, and that this influences their cooperation.
On the other hand, repeated interaction with the same partners provides an additional reason for enhanced cooperation and a more equal surplus allocation: the weak agent can directly threaten the strong one with refusing to cooperate in the future, forcing the strong one to accept a surplus distribution that is satisfactory for all agents involved. Take, for example, a group of workers who get a joint bonus if their joint production fulfils some criteria, and assume that one of the co-workers (e.g., the foreman) has a decisive influence on the distribution of the bonus between himself and the others. If the foreman secured the lion's share of the bonus for himself, his co-workers would refrain from cooperating next time, and this threat forces the foreman to find a fair distribution (whatever might be perceived as fair in the particular context).
It has long been observed in the theoretical literature that repeated interactions may improve efficiency, fostering cooperation (see, e.g., [4,5]). A typical example is the relational contract literature (see, e.g., [6]). However, this analysis does not take into account that, in many contexts, agents might not only refuse to cooperate with given partners (and get an exogenously determined value of an outside option), but they might also switch partners altogether. Again, take the example of the working group. Unfairly treated workers might decide to quit and change teams or look for another job, possibly making it harder for the remaining group members to fulfil the criteria for the bonus.
In this paper we analyze experimentally how the possibility of affecting group composition impacts the cooperation level of groups and the surplus distribution within them. With this design, we capture all those working situations where team members collaborate toward a common goal, but where ex-post bargaining power differs across members (such as in the presence of seniority or hierarchy). As information about past behavior is public, we can think of workers within a firm who collaborate on different projects. Once a new project is presented, workers can maintain the group (team) structure if it has been successful previously or ask to be re-allocated to a different team.
The rest of the paper is organized as follows: In the next section (Section 2) we introduce the literature that is more relevant for our research. Section 3 describes the experimental design. In Section 4, we specify the hypotheses to be tested, while in Section 5, we describe the experimental results. Section 6 discusses the main features of a model we developed to investigate the rationales that guide the subjects' behavior and the relevance of our results. The last section concludes.

Related Literature
As cooperation and group composition are relevant topics in many fields of research, our results are linked to previous research from fields that differ widely in both focus and goals.
In economics, the theoretical and experimental investigation of cooperation focuses mainly on prisoner's dilemma games, starting with the classical study of [7]. As in our paper, much of the existing analysis of prisoner's dilemma games focused on the impact of repetition on cooperation (see, e.g., [8,9]). However, in most of this literature, repeated interaction is exogenously given [10]. Recently attention has been devoted to the study of behavior in prisoner's dilemma games where partnerships can be endogenously terminated [11]. On this topic, the most relevant article is [12], which focuses on the impact on cooperation of different costs of quitting an unsatisfactory relationship.
Interesting insights on how to promote cooperation in social dilemmas come from social psychology. According to a vast literature, cooperation can be promoted using different tools, such as punishment [13,14], reputation and gossip [14][15][16], and communication [17]. Of particular relevance is the work of [18], which focuses on cooperation in situations with power asymmetry, testing some strategies that might help promote cooperation (such as gossip and punishment). The authors show that, despite the difference in power, group members do not show significantly different levels of cooperation. These results are not aligned with previous literature [19] and indicate that further investigation is needed.
As information on past behavior is available in our framework, our work touches the field of reputation. Previous results on reputation within various fields of research suggest that showing an other-regarding behavior in the past triggers trust and increases cooperation [20][21][22][23][24][25][26]. In some of these experiments [21,22,25,26], matching is exogenous, and subjects are allowed to choose the preferred earnings allocation between fixed options, favoring either one or the other participant. Other experiments instead (such as [20,23,24]), investigate how reputation affects cooperation in dynamic networks, in a prisoner's dilemma framework. Such situations are however very different from the one we are focusing on, as actions are symmetric and the bargaining power is equal across agents.
Evolutionary game theory has also addressed the question of how changing links in social networks promotes cooperation. Results in this field illustrate that allowing subjects to update their network often (e.g., cutting ties with defectors) indeed increases cooperation, both theoretically [27][28][29] and experimentally [30,31]. This experimental research is often based on the prisoner's dilemma game, where bargaining power is symmetric across agents. Nonetheless, these results suggest that in a framework where bargaining power is very uneven, allowing for repeated interaction and for punishing uncooperative partners by changing links boosts trust and cooperation.
Our paper is also linked to the extensive literature on bargaining experiments, particularly, ultimatum and dictator game experiments (see, e.g., [32,33]; for an overview of the experimental results of these games see [34]). Experimental investigations of these games have been guiding the analysis of other-regarding preferences since the beginning of this literature (see, e.g., [35,36]; for more recent contributions see, e.g., [37,38]). Typically, this literature focuses on the type and the intensity of individual other-regarding preferences that can be concluded from the experimental results of these games.
Results in dictator games show that subjects do not use all their bargaining power if this would lead to a very unequal allocation. However, in these experiments, both surplus and matching are exogenously given. [39] show that allowing partner selection in a trust game and a modified dictator game boosts trust and altruism. However, there partner selection is based on personal characteristics (such as age and gender) and not on past behavior. Hence, dictator game experiments so far cannot answer the question of whether the endogenous possibility of repeated cooperation (based on the history of donations) impacts the cooperation level and the distribution of the surplus.
The ultimatum game is closer to our surplus allocation game insofar as, in the ultimatum game, subjects with low bargaining power can refuse to cooperate. The experimental results show that this possibility leads to more equal allocations than in dictator games. However, contrary to our experiment, in ultimatum games the decision to cooperate is made by receivers after they already know how the surplus is shared in case of cooperation. Hence, the ultimatum game models a situation where binding a priori contracts are feasible. Furthermore, in most ultimatum game experiments, the matching is exogenous and therefore not connected to the cooperation decision. Like dictator games, ultimatum game experiments cannot investigate how the endogenous possibility of repeated cooperation impacts the cooperation level and the distribution of the surplus.
A closer match to our paper is the research studying the yes/no games; i.e., ultimatum games where respondents blindly decide whether to accept/reject a distribution (they are not told the distribution before deciding, see [40,41]). However, in this literature, authors do not allow groups to build a common history based on repeated cooperation.
As information on past behavior is presented to the agents in our framework before they choose their action, a relevant literature is the one investigating other-regarding preferences and expectations, in particular in frameworks where punishment is possible (such as the ultimatum game). Experimental studies show that providing information about proposers' past behavior affects the likelihood of offer-specific rejections [42,43], indicating that something similar might emerge in our context when group members suspect that the future allocation might be unfair (based on their own evaluations). Inequity aversion and frustration might play a role too in such a context. It has been shown that frustration decreases the likelihood of accepting what is perceived as an unfair proposal [38]. This suggests that sharing information about prior behavior might lead to severe punishment of forecasted allocations perceived as particularly unfair, but also that agents might become more prone to punish when trust has been betrayed (for example, when trust from a receiver has been repaid by a low offer from a distributor).
Finally, our paper is connected to the extensive experimental work on group formation. Group formation is investigated mainly in the context of public good games (see [44] for a review). Some papers investigate what happens to voluntary contributions to a public good when groups are formed according to some exogenously given criteria (e.g., [45], where group members were exogenously matched according to their past contributions to a public good). Experiments on endogenous group formation focus on particular aspects of group formation: restricted vs. free entry and exit [46], costly entry [47], direct selection from a pool of possible partners [48], groups formed on the basis of stated preferences [49] or of previous donations to charitable organizations [50], the possibility of exclusion [51,52], and mobility between groups [53]. Recent research has investigated the impact of repeated cooperation in endogenously formed groups; however, the settings considered were rather different from ours, studying, among others, the use of punishment to enhance cooperation [54] and the stability of cooperation in public projects with stochastic outcomes, imperfect monitoring, and an exit option [55].
In contrast to previous literature, our experiment studies a simple yet crucial mechanism whose dynamics remain unclear: how the possibility to impact group composition affects the willingness to cooperate and surplus allocation. Understanding how group dynamics unravel in such a framework is crucial, as this context is easily encountered in many working situations involving teamwork. To our knowledge, the only paper connecting partner choice with surplus distribution is [56]. However, unlike our paper, they focus on the impact of competition within stable groups characterized by excess supply of either distributors or responders.

Experimental Design
We developed a "surplus allocation game" where groups consisting of two distributors and two receivers were formed. Since we were interested in group behavior, we focused on groups of four subjects equally split into distributors and receivers, as this was the smallest symmetric group possible (excluding pairs, which display a different behavior than groups; see, e.g., [57]). (As we are interested in the behavioral mechanisms that drive group behavior in such a framework, we opted for a small group: a larger group (e.g., six members) would naturally be subject to more confounds, making the data harder to interpret, and would not provide more information for answering the questions this paper addresses.) The experiment consisted of 30 rounds. At the beginning of the session, participants were randomly assigned to the role of distributor or receiver. The role was maintained throughout the experiment. In each round of the game, subjects were first allocated to a group, then the potential group members had to decide individually whether to cooperate; acceptance by all members was needed for surplus production. Participants were aware of how many other players were part of a group (but not who they were) and knew how many rounds the experiment lasted. Full anonymity of group members was guaranteed. The group surplus amounted to 20 Experimental Currency Units (henceforth ECUs) per round. If at least one subject refused to cooperate, no surplus was produced and all potential group members earned nothing in that round. If all members decided to cooperate, surplus was produced and had to be allocated among group members.
To model a situation with unequal ex-post bargaining power, each of the distributors received half of the produced surplus (10 ECUs) that she (for the sake of readability, we stick to the convention that distributors are female and receivers male) could then freely distribute between herself and the receivers. The contributions of the two distributors were then summed up and divided equally between the two receivers. A minimum contribution of 1 ECU was set, to avoid multiple equilibria in the game. Before choosing whether to cooperate or not, all subjects were informed about how the matched distributors had allocated surplus in the last three rounds. (No information was provided about the rounds in which matching had been refused and distributors had therefore not allocated surplus. A pilot study showed no effect of adding information on past refusals. Similarly, no difference was observed when subjects received the average of the last three contributions instead of the three single values.) Before the experiment started, participants were randomly allocated to cubicles, instructions were read aloud by a lab assistant, and participants had to answer a control questionnaire to ensure that they understood the mechanisms of the experiment. Once everyone had answered all the questions correctly (explanations were repeated if necessary), the experiment started. After the experiment was concluded, participants had to fill in a brief questionnaire, and then they were paid privately in cash.
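The per-round payoff rule described above can be sketched as follows. This is a minimal illustration, not the software used in the experiment: the function name `round_payoffs` is hypothetical, and the sketch assumes integer contributions with an upper bound of 10 ECUs (the text only states the minimum of 1 ECU).

```python
from typing import Tuple

ENDOWMENT_PER_DISTRIBUTOR = 10  # ECUs: half of the 20-ECU group surplus
MIN_CONTRIBUTION = 1            # ECUs: minimum contribution stated in the design


def round_payoffs(all_cooperate: bool, c1: int, c2: int) -> Tuple[float, float, float]:
    """Return (distributor 1, distributor 2, each receiver) payoffs in ECUs.

    c1 and c2 are the contributions of the two distributors.
    The upper bound of 10 ECUs per contribution is an assumption of this sketch.
    """
    if not all_cooperate:
        # A single refusal means no surplus is produced and nobody earns anything.
        return (0.0, 0.0, 0.0)
    for c in (c1, c2):
        if not MIN_CONTRIBUTION <= c <= ENDOWMENT_PER_DISTRIBUTOR:
            raise ValueError("contribution must lie between 1 and 10 ECUs")
    # Contributions are pooled and split equally between the two receivers.
    receiver_share = (c1 + c2) / 2
    return (ENDOWMENT_PER_DISTRIBUTOR - c1, ENDOWMENT_PER_DISTRIBUTOR - c2, receiver_share)


payoffs = round_payoffs(True, 3, 5)  # distributors keep 7 and 5, each receiver gets 4
```

In this example the four payoffs (7, 5, 4, 4) add up to the full group surplus of 20 ECUs, as they should whenever all members cooperate.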
To test how the possibility to impact group composition affects cooperation and surplus distribution, we designed three treatments: a baseline treatment, a re-matching treatment (exogenous-matching), and an endogenous-matching treatment. In this setting, accepting the proposed group coincides with the intention to cooperate, as subjects can only choose whether to cooperate (accept the group) or not (refuse the group). Figure 1 illustrates the experimental setup.

Baseline Treatment (BT)
In the baseline treatment (BT), we imposed cooperation on all four subjects. Groups were forcibly formed and re-matched every round. Receivers were mere observers, while the two distributors decided how to divide the surplus. Hence, BT was equivalent to a dictator game but with two dictators (i.e., distributors) and two receivers forming a match.
This treatment was used as a benchmark for the analysis of behavior in the next two treatments.

Re-Match Treatment (RT)
In the re-match treatment (RT), cooperation was not enforced. Groups were exogenously re-matched after each round and this was known to the participants. In each round the four subjects decided whether to cooperate or not. If all members decided to cooperate, the group proceeded as in BT. If at least one of the members decided not to cooperate, no surplus was produced and nobody in that proposed group earned anything in that round. In order to build a history of contributions for the distributors, RT consisted of two phases. The first phase lasted three rounds and was identical to BT: cooperation was exogenously enforced, the surplus was distributed by the distributors, and the subjects were re-matched every round. In the second phase (from round 4 to round 30) groups were also randomly re-matched every round, but cooperation was no longer enforced. All subjects were informed about the three previous contributions of the distributors they were matched with; with this information, each group member (both distributors and receivers) decided whether to cooperate or not.
Comparing results from BT and RT allowed us to study how the possibility of refusing to cooperate affected surplus allocation.

Endogenous-Match Treatment (ET)
The endogenous-match treatment (ET) was similar to RT, except that subjects maintained the same group composition as long as all members agreed to cooperate. When at least one member refused to cooperate, the group was dissolved. Like RT, ET was composed of two phases. For the first three rounds (phase 1), groups were maintained, cooperation was exogenously enforced, and the distributors made unilateral decisions about their contributions. This phase was designed to build a history of contributions. At the beginning of round 4 (phase 2), groups were randomly re-matched and information about past behavior was provided. From round 4 onwards, subjects decided about cooperation. More specifically, at the beginning of each round (denoted round t) each member decided whether to cooperate or not. If all four members of a group decided to cooperate, the surplus was produced, the distributors allocated the surplus, and in the following round the group was re-proposed and participants had to decide again whether to cooperate or not. If at least one member decided not to cooperate, no surplus was produced, all members of the group earned nothing in this round, and the group was dissolved. At the beginning of the next round (round t + 1), all subjects whose groups were dissolved in round t were randomly re-matched among the available subjects. The newly matched group members were informed about the last three contributions of their distributors, decided whether to cooperate, and so on.
Comparing results from ET and RT allowed us to study how the possibility of building a long-lasting relationship and a common history affects cooperation and surplus allocation. ET introduces a single variation to RT: a randomly formed group is maintained as long as all group members cooperate. Therefore, the difference between the two treatments is that accepting (or refusing) to cooperate impacts group composition.
In all three treatments, a distributor's payment was the sum of all the ECUs she had kept for herself in all those rounds where all members of her group cooperated. A receiver earned the sum of the ECUs he had received in those rounds where all members of his group cooperated. ECUs were converted into Euros at a 10 to 1 exchange rate, and 2.5 Euros were added as a show-up fee and 2.5 Euros as payment for filling in an optional questionnaire on personal information at the end of the experimental session.
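The ET matching rule for phase 2 can be sketched in a few lines. This is an illustrative sketch only: the function `next_round_groups`, the subject labels, and the pairing scheme used for re-matching are assumptions made here, not a description of the laboratory software.

```python
import random
from typing import Dict, List, Tuple

Group = Tuple[str, str, str, str]  # (distributor, distributor, receiver, receiver)


def next_round_groups(groups: List[Group], cooperates: Dict[str, bool],
                      rng: random.Random) -> List[Group]:
    """Return the groups proposed in round t + 1 under the ET rule.

    Groups in which all four members cooperated in round t are re-proposed
    unchanged; all other groups are dissolved and their members are randomly
    re-matched, keeping two distributors and two receivers per group.
    """
    kept: List[Group] = []
    free_distributors: List[str] = []
    free_receivers: List[str] = []
    for group in groups:
        if all(cooperates[member] for member in group):
            kept.append(group)
        else:
            free_distributors.extend(group[:2])
            free_receivers.extend(group[2:])
    # Random re-matching among the available subjects (assumed: uniform shuffle).
    rng.shuffle(free_distributors)
    rng.shuffle(free_receivers)
    rematched = [
        (free_distributors[i], free_distributors[i + 1],
         free_receivers[i], free_receivers[i + 1])
        for i in range(0, len(free_distributors), 2)
    ]
    return kept + rematched


# Hypothetical round: receivers R3 and R5 refuse, so two groups are dissolved.
groups = [("D1", "D2", "R1", "R2"), ("D3", "D4", "R3", "R4"), ("D5", "D6", "R5", "R6")]
cooperates = {member: True for group in groups for member in group}
cooperates["R3"] = False
cooperates["R5"] = False
new_groups = next_round_groups(groups, cooperates, random.Random(0))
```

Under RT the same function would apply with every group treated as dissolved after each round, which is exactly the single difference between the two treatments. Note that with random re-matching, former group members can meet again by chance.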
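As a small worked example of the payment rule, the conversion can be written as below. The function name is hypothetical, the sketch reads the exchange rate as 10 ECUs per Euro, and it does not model the guaranteed minimum earning mentioned later in the text.

```python
ECUS_PER_EURO = 10        # assumed reading of the "10 to 1" exchange rate
SHOW_UP_FEE = 2.5         # Euros
QUESTIONNAIRE_FEE = 2.5   # Euros, paid only if the optional questionnaire is filled in


def payment_in_euros(ecus_earned: float, filled_questionnaire: bool = True) -> float:
    """Convert a subject's total ECUs over all cooperative rounds into Euros."""
    bonus = QUESTIONNAIRE_FEE if filled_questionnaire else 0.0
    return ecus_earned / ECUS_PER_EURO + SHOW_UP_FEE + bonus


example_payment = payment_in_euros(150)  # 15 Euros from ECUs plus the two 2.5-Euro fees
```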

Hypothesis
It is easy to see that in any subgame perfect equilibrium the distributors would contribute the minimum amount of 1 ECU in every round of every treatment where surplus can be distributed, assuming all subjects to be purely selfish and fully rational. In RT and ET, collaboration of all subjects is required for surplus production. This implies that in these treatments there exist subgame perfect equilibria where subjects do not cooperate in some or all rounds, since in these rounds each player expects the other members of the group to refrain from cooperation, which in turn implies that the individual player has no incentive to cooperate himself/herself in these rounds. However, these "implausible" equilibria do not survive any refinement of subgame-perfection like trembling-hand perfection or properness. If all other players play every pure strategy with a strictly positive minimum probability, each player's unique best response is to cooperate in all rounds (and for a distributor to contribute the minimum amount of 1 in all rounds). Hence, in any perfect or proper equilibrium all subjects cooperate in all rounds of RT and ET. Consequently, all ET subjects would stay in the same group during all rounds.
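The selfish benchmark can be made explicit with a short derivation. The notation is introduced here for illustration and is not the paper's: $c_{i,t}$ denotes the contribution of distributor $i \in \{1,2\}$ in round $t$, and $\pi^{D}_{i,t}$ and $\pi^{R}_{t}$ denote the round payoffs (in ECUs) of distributor $i$ and of each receiver when all four members cooperate.

```latex
% Round payoffs under full cooperation (payoffs are zero if anyone refuses)
\pi^{D}_{i,t} = 10 - c_{i,t}, \qquad
\pi^{R}_{t} = \frac{c_{1,t} + c_{2,t}}{2}, \qquad
c_{i,t} \ge 1 .

% Selfish benchmark: both distributors contribute the minimum, c_{1,t} = c_{2,t} = 1
\pi^{D}_{i,t} = 10 - 1 = 9, \qquad
\pi^{R}_{t} = \frac{1 + 1}{2} = 1 .
```

Since a receiver still earns 1 ECU rather than 0 by cooperating, cooperation remains a best response in this benchmark, which is why the refined equilibria predict full cooperation in every round.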

Hypothesis 1 (H1). How the possibility of refusing cooperation affects contributions.
Previous experimental results on dictator games suggest that distributors would contribute more than the minimum. Furthermore, ultimatum game results suggest that the possibility to refuse cooperation should lead to higher contributions in RT and ET than in BT. This implies higher distributors' earnings in BT than in the other two treatments, due to both less money contributed by the distributors and no possibility to refuse cooperation. Concerning receivers' earnings, two opposing effects are possible. On the one hand, larger contributions would imply larger receivers' earnings in RT and ET than in BT. On the other hand, the refusal to cooperate could lead to smaller earnings of receivers in RT and ET than in BT. Since distributors are aware of the risk of being rejected if they propose too low a contribution, we expect them to raise the contribution enough for the first effect to dominate. These considerations lead to Hypothesis 1.
If the possibility of long-term cooperation overcomes the potentially detrimental effect of unsatisfactory surplus distribution on cooperation levels, we would expect a higher efficiency level, i.e., higher cooperation rates, in ET than in RT. Furthermore, we should observe larger contributions in ET than in RT, since in ET receivers are able to punish distributors who contribute little in round t directly by withholding cooperation in round t + 1. These considerations are aligned with results presented by [13], which show that punishment is particularly effective in frameworks where groups are not forcibly re-matched. Concerning earnings, note that more cooperation and larger contributions imply higher earnings for the receivers in ET than in RT. For the distributors' earnings, the higher contributions and the larger cooperation rates have opposing effects, and we cannot form ex ante a clear prediction of which effect will dominate. These considerations lead to Hypothesis 2: (i) Cooperation rates are higher in ET than in RT. (ii) Contributions are higher in ET than in RT. (iii) Receivers' earnings are higher in ET than in RT.

Hypothesis 3 (H3). Testing individual differences in cooperation and contributions.
Hypotheses 1 and 2 focus on agents' aggregate behavior. Obviously, we expect to observe individual differences in the cooperation behavior and in the contribution levels of the distributors. These individual differences should lead to a variation in the life spans of the endogenously formed groups in ET. Since distributors have no reason to refuse cooperation, receivers determine group duration (in fact, we observed hardly any refusal of cooperation by distributors: in ET, only 2.55% of all refusals to cooperate came from distributors). We expect that receivers' likelihood to cooperate and to maintain the group should depend on two factors. First, receivers' cooperation should be more likely the larger the distributors' past contributions are. Hence, the larger the distributors' past contributions, the longer a group should stay together. Second, receivers might differ in how demanding they are, i.e., in the minimum amount they require in order to agree to cooperate. Therefore, for given past distributors' contributions, the likelihood of cooperation is larger for the less demanding receivers. Hence, we expected less demanding receivers to belong to longer-lived groups. These assumptions are aligned with results in a vast literature on cooperation in social dilemmas, where it is shown that the positive correlation between trust and cooperation is stronger in situations where the degree of conflict is higher [58]. Lastly, due to the increase in cooperation and contribution levels, the payoffs of both types of subjects should increase with group duration. These considerations led to Hypothesis 3.
Testing Hypothesis 3 allows us to observe individual differences in receivers' cooperation behavior and in the underlying reasons for these differences. However, while we can observe the differences in contributions, we can only infer the rationales behind the distributors' choices. 
Moreover, different combinations of receivers and distributors (demanding or undemanding) would lead to different group durations that are not predictable ex-ante. To investigate the rationales that drive distributors and to model groups' behaviors, we developed a simple behavioral model which we briefly present in the Discussion section, and in detail in the Supplementary Materials (see Supplementary Material A).

Experimental Results
We ran the experiments at the Cologne Laboratory for Economic Research, University of Cologne. The BT sessions lasted less than 40 min and the RT and the ET sessions about one hour. The average earning was 20 € and a minimum earning aligned with the lab policies was guaranteed.
Overall, 400 subjects took part in 13 experimental sessions, where each session had either 28 or 32 participants, forming 7 or 8 groups, respectively. (We ran two additional sessions that had to be discarded due to technical problems.) We have personal data for 356 participants. Of these 356 subjects, 157 were male, and the average age was 24.1 years (SD 3.7). The number of independent observations (sessions) is aligned with the standards in the field [46,47,51]. Table 1 summarizes the average contributions, earnings, and cooperation rates in each treatment. The average contribution was 20.1% in BT, which is in line with the results of standard dictator game experiments. In the other treatments, the threat of refusing cooperation causes distributors to raise their contributions to a 30-40% share, similar to what is observed in ultimatum games. We first turn to Hypothesis 1 (see Table 2). The statistical analysis is performed at the session level (using session averages), since each session is an independent observation. Average contributions and receivers' earnings are significantly lower in BT than in the other treatments, while distributors' earnings are significantly larger (see Table 2). Figure 2 shows the evolution of the average contributions over time in the three treatments. In all but the last round the contributions were lower in BT than in the other treatments. To test for time effects, we compared the sessions' averages of contributions and earnings of rounds 4 to 13 with those of rounds 20 to 29 for each treatment, excluding the first three rounds, where non-cooperation was not possible, and the last round, where we observed the well-known "end of the experiment effect". Overall, we can conclude that Hypothesis 1 is supported by the data.
Hypothesis 2 was instead rejected by the data. Table 2 as well as Figure 2 reveal that there is no significant difference in average contributions and average earnings between RT and ET. Furthermore, average cooperation rates (calculated as the number of times a receiver or a distributor has cooperated, averaged by session) in RT and ET are indistinguishable (two sample t test: t(6.04) = 0, p = 1). However, within the ET treatment we observe substantial differences in the endogenous life span of groups. In Figure 3, we plot the number of groups by duration, weighted by the duration of the group. (Weighting by the duration of the group is necessary to avoid misleading results. To see this, take a hypothetical session with 16 subjects and 27 rounds. In this session, half of the subjects always stay in the same group (i.e., two completely stable groups), while the other groups never cooperate and hence always split after one round. In this case we have 2 groups with a duration of 27 rounds each, and 54 groups with a duration of 1. Taking only the number of groups with the different durations would give the impression that groups of duration 1 completely dominate the session, while in fact the actual distribution of subjects into the short- and the long-term groups is half/half.) Since during the first three rounds groups had to cooperate, we take only rounds 4 to 30 into account, implying that the minimum group duration is 1 and the maximum 27. As can be seen from this figure, 141 groups broke up immediately, since in these cases at least one member of the group refused to cooperate in the very first round the group was together (groups of duration 1). A total of 41 groups cooperated once and refused cooperation in the second round of their existence (duration of 2), 30 groups cooperated twice and refused cooperation in the third round of their existence (duration of 3), etc. 
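The weighting argument in the hypothetical session above can be checked numerically. The sketch below uses only the numbers of that hypothetical example (2 groups of duration 27 and 54 groups of duration 1), not experimental data; the function names are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable


def duration_counts(durations: Iterable[int]) -> Dict[int, int]:
    """Unweighted histogram: number of groups observed for each duration."""
    return dict(Counter(durations))


def weighted_duration_counts(durations: Iterable[int]) -> Dict[int, int]:
    """Histogram in which each group counts as many times as the rounds it lasted."""
    weighted: Counter = Counter()
    for duration in durations:
        weighted[duration] += duration
    return dict(weighted)


# Hypothetical session from the text: two stable groups and 54 one-round groups.
durations = [27, 27] + [1] * 54
unweighted = duration_counts(durations)          # 2 long-lived groups vs. 54 short-lived ones
weighted = weighted_duration_counts(durations)   # 54 group-rounds on each side
```

Unweighted, duration-1 groups outnumber the stable groups 54 to 2; weighted by duration, both account for 54 group-rounds, which matches the half/half split of subjects.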
Recall that during these rounds a refusal to cooperate ends the group relationship and the members get randomly re-matched in the next round (unless they were already in round 30). For example, if a group was matched in round 15, cooperated from round 15 to round 17, and at least one member refused to cooperate in round 18, the group duration in that span is 4 rounds. Overall, Figure 3 reveals a large heterogeneity of group durations, with many short-lived groups but also a fair number of long-term relationships. These differences in group duration translated into substantial differences in the number of groups subjects belonged to. To see this, we calculated for each subject the number of groups she or he belonged to during the whole experiment (again excluding the first three rounds). For example, if a subject spent 26 rounds with the same group and 1 round with another group, the number of groups she belonged to is 2. The same holds if a subject spent 13 rounds with one group and the remaining 14 rounds with another group. Figure 4 shows the distribution of the number of groups subjects belonged to. Eight distributors and eight receivers were members of only one group, i.e., four groups (out of 39 possible) stayed together for the whole experiment.
On the other hand, many subjects switched groups quite often (e.g., eight distributors and seven receivers belonged to seven groups), and a few subjects even belonged to more than 20 groups. These differences in the number of groups subjects belonged to are linked to subjects' behavior. First, we look at distributors' contributions. As can be seen from Figure 5, distributors' average contributions differ substantially, ranging from a minimum of 1 to a maximum of 6 ECUs. Furthermore, Figure 5 suggests a negative correlation between a distributor's average contributions and the number of different groups she belonged to, which suggests that groups with more generous distributors are accepted more frequently. This impression of a negative correlation is confirmed by the statistical analysis (Pearson's product-moment correlation = −0.809, t(76) = −12.00, p < 0.001). This evidence supports Hypothesis 3i.
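The link between the reported correlation coefficient and its t statistic can be checked with a short sketch. It uses the standard relation t = r·sqrt(n − 2)/sqrt(1 − r²) with n = 78 distributors (so 76 degrees of freedom); the function names are illustrative, and the individual-level data are not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Reported values: r = -0.809 for 78 distributors.
t = t_statistic(-0.809, 78)
print(round(t, 2))
```

The result is approximately −12.0, in line with the reported t(76) = −12.00.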
Considering that cooperation was nearly never refused by distributors, it follows that receivers belonging to more groups refuse to cooperate on average more often than receivers belonging to fewer groups. As already explained, cooperation is driven by (a combination of) two different factors. First, the higher a receiver's expectation of what to get from a particular pair of distributors, the more likely she is to cooperate. Since one can expect that distributors' past contributions are correlated with receivers' expectations about future contributions, higher past contributions should lead to a larger likelihood of cooperation. Second, for given past contributions of the distributors, receivers might differ in their "acceptance" threshold: some receivers are more demanding than others. To distinguish between these two different reasons for cooperation, we categorized receivers into three categories of roughly equal size: "multi-group receivers" who stayed in at least 9 different, relatively short-lived groups (32 receivers); "few-group receivers" who stayed in at most 4 different, relatively long-lived groups (24 receivers); and "some-group receivers" who belonged to a medium number of groups (22 receivers). The thresholds of four and nine groups were chosen in order to have roughly the same number of subjects in all categories. To test how past contributions affect cooperation rates in our three categories, we run a probit regression with a cooperation dummy as the dependent variable. The average past contributions (in the last three played rounds) of the distributors and the dummies for the multi-group and the few-group receivers are the independent variables.
Figure 5. Average contribution of individual distributors, given the number of groups each belonged to. The vertical lines divide distributors into three sub-groups of roughly equal size: "few-group distributors", i.e., distributors belonging to 4 groups or less (24 distributors), "multi-group distributors" belonging to 9 groups or more (27 distributors), and "some-group distributors" belonging to a medium number of different groups (27 distributors).
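The categorization by number of groups can be sketched as follows. The sketch assumes that a subject's history in rounds 4 to 30 is stored as one boolean per round (True if the subject's group dissolved after that round); this layout and the function names are hypothetical, while the thresholds of 4 and 9 groups are those stated in the text.

```python
def count_groups(dissolved):
    """Number of groups a subject belonged to.

    `dissolved` has one boolean per round; a dissolution in the final
    round does not lead to a new group, so it is not counted.
    """
    return 1 + sum(dissolved[:-1])

def categorize(n_groups, few_max=4, multi_min=9):
    """Map a number of groups to the three categories used in the text."""
    if n_groups <= few_max:
        return "few-group"
    if n_groups >= multi_min:
        return "multi-group"
    return "some-group"

# A subject who spent 26 rounds with one group and 1 round with another.
history = [False] * 25 + [True] + [False]
n = count_groups(history)
print(n, categorize(n))
```

A subject who never changes group is counted as belonging to one group, and a subject whose group dissolves in every round is counted as belonging to 27 groups.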
As expected (see Table 3), the likelihood of cooperation increases with the average past contribution of the distributors (Hypothesis 3i). For a given past contribution level, the likelihood of cooperation is significantly lower for receivers that were in many groups than for those that were in a small or medium number of groups. We can conclude that receivers who are members of many short-lasting groups have higher acceptance thresholds than the other receivers, which supports Hypothesis 3ii.
Table 3. Probit regression of the impact of average past contributions and number of groups a receiver belonged to on cooperative behavior. The dependent variable equals 1 if the receiver decides to cooperate and 0 otherwise. * indicates significance at the 1% level.
Figure 6. Total earnings in Euros of individual distributors and receivers, given the number of groups they belonged to. The vertical lines divide subjects into three sub-groups of roughly equal size: "few-group" subjects, i.e., subjects belonging to 4 groups or less, "multi-group" subjects belonging to 9 groups or more, and "some-group" subjects belonging to a medium number of different groups.
Since receivers belonging to few groups are confronted with distributors contributing more, receivers' earnings should be negatively correlated with the number of different groups they belonged to. Concerning distributors' earnings, there are two opposing effects at play. On the one hand, higher contributions have a direct negative effect on distributors' earnings. On the other hand, higher contributions are connected to fewer groups and more cooperation. Figure 6 indicates that for distributors too, earnings decrease as a function of the number of different groups they belonged to, showing that the latter effect prevails.

To test for this, we also categorize distributors into "few-group distributors", i.e., distributors belonging to 4 groups or less (24 distributors), "multi-group distributors" belonging to 9 groups or more (27 distributors), and "some-group distributors" belonging to a medium number of different groups (27 distributors). The thresholds of four and nine groups are chosen so as to have roughly the same number of subjects in all categories. Using this categorization of the distributors and the similar one introduced for receivers above, we indeed find that subjects' earnings are decreasing in the number of groups they belonged to (receivers: two sample t test, Multi-Few Group, t(38.595) = 12.28, p < 0.001; distributors: two sample t test, Multi-Few Group, t(43.762) = 6.31, p < 0.001). There is also a negative correlation between a subject's earnings and the number of different groups he/she belonged to (receivers: Pearson's product-moment correlation = −0.936, t(76) = −18.14, p < 0.001; distributors: Pearson's product-moment correlation = −0.807, t(76) = −11.95, p < 0.001). This confirms Hypothesis 3iii.
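The fractional degrees of freedom reported for the two sample t tests (e.g., t(38.595)) are consistent with Welch's unequal-variance t test, in which the degrees of freedom follow the Welch-Satterthwaite approximation. That this variant was used is an assumption; the sketch below only illustrates how such non-integer degrees of freedom arise, using made-up numbers rather than the experimental data.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Illustrative samples with unequal sizes and variances (not real data).
sample_a = [1, 2, 3, 4, 5]
sample_b = [2, 4, 6, 8, 10, 12]
t, df = welch_t(sample_a, sample_b)
print(round(t, 3), round(df, 3))
```

The resulting degrees of freedom always lie between the smaller sample size minus one and the pooled value n₁ + n₂ − 2, which is why the reported values fall between 23 and 54 for the receivers and between 23 and 49 for the distributors.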
Since in ET we found significant differences between subjects belonging to few and many groups, we compare the behavior of these different ET subjects with the behavior found in RT. Table 4 shows that distributors belonging to many groups in the endogenous-match treatment contribute significantly less than RT distributors, whereas those belonging to few ET groups contribute significantly more. Earnings are larger in ET than in RT for both types of subjects whenever they are in few different groups, and smaller in ET whenever they are in many different groups. These differences are all significant.
Table 4. Values for the 95% family-wise confidence level, Tukey's "Honest Significant Difference" method. For RT we used the contributions/the total earnings of all distributors/subjects (distributors: 74, receivers: 74) of RT; for the multi-group ET the contributions/total earnings of all distributors/subjects that were in at least 9 groups (distributors: 27, receivers: 32); and for few-group ET the contributions/total earnings of all distributors/subjects that were in at most 4 groups (distributors: 24, receivers: 24).
Concerning cooperation, Table 5 shows the result of a probit regression with the combined data of the endogenous-match treatment and the re-match treatment. Here the probit regression is run with a cooperation dummy as the dependent variable. The average past contributions of the distributors (in the last three played rounds) and the dummies for the multi-group, some-group, and few-group receivers are the independent variables. Again, higher average past contributions increase the likelihood of cooperation, and multi-group endogenous-match treatment receivers have higher acceptance thresholds than few-group and some-group receivers. They also have higher acceptance thresholds than the average receivers in the re-match treatment. This is further evidence that the possibility of staying together leads to self-selection of distributors and receivers.

Discussion
The experimental results show that average contributions and cooperation rates do not differ between ET and RT. We have also seen that in ET agents self-select into groups whose life spans differ widely and that receivers' expectations have a large impact on their cooperation rates. What we do not yet know is what drives distributors to contribute differently and how individual differences affect group duration. While it is not possible to reconstruct the real motivations driving the subjects' behavior in this context, we develop a behavioral model that aims at replicating the observed data and at shedding light on which rationales might drive subjects' behavior.
In contrast to the experimental design, the model investigates a variant of the surplus allocation game with one receiver and one distributor forming a group, and with only receivers deciding about cooperation. We believe that this simplified model allows us to capture the dynamics of our more complex experimental design. While it is true that in the model the choice of the receiver to dissolve the group impacts only himself and the distributor (deemed worthy of punishment), in the experiment the choice of one receiver impacts the other receiver too. However, we assume that one receiver expects the other receiver to share his preferences, and as such to support the dissolution of the group. We designed the model as a two-stage game, played repeatedly. In the first stage, receivers decide whether to cooperate or not. If a receiver refuses to cooperate, no surplus is produced and both group members earn nothing in that round. In case of cooperation, a surplus is produced. In the second stage of the game, the distributor decides unilaterally about her contribution to the payoff of "her" receiver. To allow for informed cooperation decisions, each receiver is informed about the previous contribution of "his" distributor before the decision about cooperation is taken. Overall, this game is played for T rounds. As in the experiment, we analyze two different matching protocols: the re-match (RT) and the endogenous-match (ET). We also look at the case where cooperation is exogenously enforced and agents get re-matched every round (BT). The model is described in detail in Supplementary Material A.
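The mechanics of one round of this two-stage game and of the three matching protocols can be sketched as follows. The surplus of 10 units, the function names, and the string labels for the protocols are assumptions made for illustration; the actual specification is the one given in Supplementary Material A.

```python
def play_round(protocol, receiver_cooperates, contribution, surplus=10.0):
    """Payoffs (distributor, receiver) for one round of the two-stage game.

    Stage 1: the receiver decides whether to cooperate (enforced in BT).
    Stage 2: if a surplus is produced, the distributor passes on
    `contribution` and keeps the rest.
    """
    if not 0 <= contribution <= surplus:
        raise ValueError("contribution must lie between 0 and the surplus")
    cooperates = True if protocol == "BT" else receiver_cooperates
    if not cooperates:
        return 0.0, 0.0  # no surplus is produced
    return surplus - contribution, contribution

def stays_together(protocol, cooperated):
    """Whether the pair is kept for the next round.

    Only the endogenous-match protocol (ET) keeps a pair, and only after
    cooperation; RT and BT re-match every round.
    """
    return protocol == "ET" and cooperated

print(play_round("ET", True, 3.0))   # cooperation: surplus is split
print(play_round("ET", False, 3.0))  # refusal: nobody earns anything
print(play_round("BT", False, 3.0))  # BT: cooperation is enforced
```

Keeping the payoff rule separate from the matching rule makes it explicit that the three treatments differ only in whether refusal is possible and in whether a cooperating pair stays together.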
The model captures the experimental results well, showing that in BT, contributions are smaller than in the other treatments, leading to higher distributors' payoffs and lower receivers' payoffs in BT than in the other treatments. On average, cooperation rates, contribution levels, and earnings are the same in ET and RT. However, in ET group durations differ, and so does the number of groups each agent is a member of. Multi-group ET receivers cooperate less than RT receivers, while few-group ET receivers cooperate more, and multi-group ET distributors contribute less than RT distributors, while few-group ET distributors contribute more.
Most importantly, the model proposes an explanation of the rationales that drive distributors and receivers. We suggest that subjects expect others to have the same preferences as they have, and that the amount distributors decide to offer coincides with the minimum donation they would be willing to accept as receivers. This implies that distributors that offer the most are those who would be greedy receivers, and that they act generously for fear of rejection. Similarly, the receivers that accept the lowest amounts would be greedy distributors, as they would offer a lower amount based on their own acceptance threshold. These conclusions are in contrast with a large literature on cooperation in social dilemmas, where prosocial agents expect higher cooperation rates than individualists and competitors [25,58]. This difference might be due to our specific framework, where anti-social behavior is punishable, while pro-social behavior is expected to be rewarded. Looking at group behavior, our model suggests that the more efficient groups (lasting the longest) are those composed of modest receivers and greedy distributors, while the groups with the shortest life span are those composed of modest distributors and greedy receivers. Our results align with results on belief-dependent preferences and reputation presented in [59]. In that paper, within the framework of a modified repeated trust game (called trust minigame), the authors show that sharing information about guilt-averse trustees leads to higher trust and to longer cooperative paths, while the opposite is observed for selfish trustees, all else equal. This is similar to what we observe in our framework, where, based on distributors' prior behavior, receivers can loosely classify them as generous or greedy and react accordingly.
However, in our context, negative other-regarding preferences seem to be socially desirable, as generous behavior (in distributors) is the product of greediness, and cooperative behavior (in receivers) is the product of selfishness.
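The behavioral rule suggested by the model can be illustrated with a deliberately simplified sketch. It assumes that each agent is described by a single acceptance threshold (here called `theta`, a hypothetical name), that a distributor always offers her own threshold, and that a receiver cooperates whenever the distributor's last offer is at least his own threshold. Thresholds are fixed over time, which is a simplification and not necessarily a feature of the model in Supplementary Material A.

```python
def offer(theta_distributor):
    """A distributor offers the minimum she would accept as a receiver."""
    return theta_distributor

def accepts(theta_receiver, last_offer):
    """A receiver cooperates if the last offer meets his threshold."""
    return last_offer >= theta_receiver

def group_duration(theta_distributor, theta_receiver, rounds_left):
    """Rounds a pair spends together under endogenous matching.

    The round in which the receiver refuses counts toward the duration.
    """
    duration = 0
    for _ in range(rounds_left):
        duration += 1
        if not accepts(theta_receiver, offer(theta_distributor)):
            break  # refusal dissolves the group
    return duration

# "Greedy" = high threshold, "modest" = low threshold (illustrative values).
GREEDY, MODEST = 5, 2
print(group_duration(GREEDY, MODEST, 27))  # greedy distributor, modest receiver
print(group_duration(MODEST, GREEDY, 27))  # modest distributor, greedy receiver
```

Under these assumptions, a pair with a high-threshold distributor and a low-threshold receiver stays together for all 27 rounds, while the opposite pairing dissolves after a single round, which mirrors the group-level pattern described above.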
Our results are interesting from a managerial perspective. According to our results, the most successful group is composed of greedy distributors and modest receivers, which could be translated as a team composed of greedy leaders and modest employees collaborating on a project. In our context, a greedy leader could be interpreted as a leader who attributes high importance to monetary incentives and expects others to do the same; such a leader would guarantee high bonuses to the employees in an attempt to keep them from undermining the productivity of the team. On the other hand, modest employees would be satisfied with (almost) any bonus level and would contribute to the productivity of the team most of the time. The opposite behavior would be observed in teams formed by modest leaders and greedy employees. These teams would be highly unproductive and unstable, as the leaders would not provide high monetary incentives to the employees because they themselves would not value bonuses much (for example, leaders who have a high commitment to their job or a high intrinsic motivation), while the greedy employees would demand high bonuses in order to complete the task. As the leader would not bestow high bonuses, employees would not collaborate, and this type of team would not produce surplus most of the time.

Conclusions
This paper investigates the impact of repeated interaction on cooperation levels and surplus distribution when ex-post bargaining power differs across group members. As expected, the opportunity to refuse cooperation restricts the possibility of the strong agents to take the lion's share of the surplus produced by cooperation, leading to more equal contributions compared to those observed when cooperation is enforced. However, contrary to what is suggested by previous literature, we show that repeated interaction alone does not improve efficiency. Because of the heterogeneity in groups' behavior, we observe no impact of the possibility of repeated interaction on the aggregate contributions and cooperation levels, compared to those observed when repeated interaction is not possible. Instead, the possibility of repeated interaction with the same partners leads to a self-selection of agents into groups with different life spans; long-lived, cooperative groups with high cooperation levels and contribution rates exist together with short-lived, uncooperative groups. Interestingly, our model suggests that the most generous (and most efficient) distributors are those who act from fear of rejection, rather than from true generosity.
Our results cast doubt on whether the possibility of repeated interaction can unequivocally lead to cooperative and efficient outcomes when the ex-post bargaining power about the surplus distribution is very unequal. Rather, it seems to amplify differences in cooperation and distribution behavior across groups. These results are interesting both from a theoretical and an applied perspective. Showing that repeated interaction with the possibility of punishment (in this case refusing to cooperate) does not increase efficiency on average is an important result, as is the observation that the most efficient (long-lasting) teams are not made up entirely of other-regarding members. From a managerial perspective, this gives some insights on the best composition of groups/teams.
Our model is based on the strong assumption that distributors believe receivers to be of their own type. We believe this simplifying assumption is justified for the analysis of certain scenarios, e.g., for situations where co-workers share a strong workplace culture. However, in situations where social preferences play a role, it is common to observe asymmetric information and heterogeneous beliefs about others' social preferences (see, e.g., [60]), which in turn affect choice behavior [61]. In the context of Bayesian Psychological Games, [62] analyze these issues theoretically. Such a theoretical analysis would be interesting in our context as well. It would allow distributors to update their beliefs about the receivers' acceptance thresholds, and heterogeneous ex-ante beliefs as well as heterogeneous cooperation experiences would influence distributors' choices. To tackle these issues, a new, more complex theoretical analysis as well as additional experiments with explicit belief elicitation would be required. This investigation is left to future research.
Other issues left for future research are the effect on cooperation of being matched with distributors with very unequal donations (e.g., one offering a small amount while the other offers a large amount), and the influence of other distributors' donations on one's own donations. Let us add that, like most experimental results, ours have to be interpreted as qualitative; we do not aim at describing a real-life scenario in detail, but rather at providing a guideline on how to interpret specific situations.
Onderzoek-FWO, and the grant nr. 0166-00005B of the "Danmarks Frie Forskningsfond". None of the funding sources were involved in designing the study, in the collection, analysis, and interpretation of the data, in the writing of the report, or in the decision to submit the article for publication.

Institutional Review Board Statement: The study was conducted in accordance with the Declaration of Helsinki. The design was also in accordance with the umbrella ethical agreement of the Cologne Laboratory for Economic Research, University of Cologne (where the experimental data were collected), and of the Université Libre de Bruxelles (the institution where the experiment was designed and the research conducted). Further ethical review and approval were waived for this study because, at the time of the data collection, no ethics committee had been established at the Université Libre de Bruxelles.
Informed Consent Statement: All subjects gave their informed consent for inclusion before they participated in the study.