# Categorization and Cooperation across Games


## Abstract


## 1. Introduction

## 2. The Model

#### 2.1. The Game Space

#### 2.2. Discriminating Games

|   | C | D |
|---|---|---|
| C | $\frac{1}{2},\frac{1}{2}$ | $-\frac{1}{2},\frac{1}{2}$ |
| D | $\frac{1}{2},-\frac{1}{2}$ | $0,0$ |
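The payoff structure of a game ${G}_{x}$ can be sketched as a small function. This is our reading of the payoff tables and of the payoffs reported in Section 6 (mutual cooperation pays $1/2$, unilateral cooperation pays $-1/2$, the unilateral defector earns the temptation $x$, and mutual defection pays 0); the function name is ours, not code from the paper.

```python
def payoff_Gx(x, own, opp):
    # Payoff to a player choosing `own` against `opp` in game G_x.
    # G_x is a stag hunt for x < 1/2 and a prisoners' dilemma for x > 1/2.
    if own == "C":
        return 0.5 if opp == "C" else -0.5   # reward or sucker's payoff
    return x if opp == "C" else 0.0          # temptation or punishment
```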

#### 2.3. Strategies and Binary Categorizations

## 3. Equilibria over Thresholds

**Proposition 1.**

**Proof.**

## 4. Evolutionary and Stochastic Stability

**Proposition 2.**

**Proof.**

**Proposition 3.**

**Proof.**

## 5. Social Learning for Categorizers

#### 5.1. Learning by Imitation

- a.threshold: the current threshold that separates a’s categories (ℓ and h);
- a.action: the last action (C or D) played by a;
- a.xValue: the value x for the last game ${G}_{x}$ played by a;
- a.payoff: the last payoff received by a after playing ${G}_{x}$.
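The four attributes above can be collected in a minimal record type; this is a sketch (the class name and default values are ours, not the paper's code).

```python
from dataclasses import dataclass

@dataclass
class Agent:
    threshold: float        # current threshold separating categories l and h
    action: str = "C"       # last action played, C or D
    xValue: float = 0.0     # value x of the last game G_x played
    payoff: float = 0.0     # last payoff received
```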

- **line 7:** the agents in $\mathcal{A}$ are randomly matched in pairs;
- **lines 8–10:** each pair of agents $({a}_{j},{a}_{k})$ plays a game ${G}_{x}$ from $\mathcal{G}$, where x is an independent uniform draw from $(0,1)$;
- **lines 13–16:** for each agent a,
    - the agent’s game value a.xValue is set equal to x (line 12);
    - the agent’s played action a.action is set by the strategy ${C}_{t}D$ to be C if a.xValue < a.threshold and D otherwise (lines 13–16);
- **lines 17–18:** the two paired agents play their chosen actions in the game ${G}_{x}$, and the payoff a.payoff is computed for each of them.
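The interaction round described above (lines 7–18) can be sketched as follows. The function name is ours, and the convention that the defector earns the temptation $x$ against a cooperator follows our reading of Section 6.

```python
import random

def play_round(agents, rng=random):
    # One interaction round (a sketch): random matching, one uniform
    # temptation draw x per pair, threshold play, payoff bookkeeping.
    pool = list(agents)
    rng.shuffle(pool)                          # random matching in pairs
    for a_j, a_k in zip(pool[::2], pool[1::2]):
        x = rng.uniform(0.0, 1.0)              # independent uniform draw
        for a in (a_j, a_k):
            a.xValue = x
            a.action = "C" if x < a.threshold else "D"
        for me, opp in ((a_j, a_k), (a_k, a_j)):
            if me.action == "C":
                me.payoff = 0.5 if opp.action == "C" else -0.5
            else:                              # defector earns temptation x
                me.payoff = x if opp.action == "C" else 0.0
```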

- suppose ${a}_{i}$ faced situation ℓ and thus played C. Because ${a}_{e}$ played the other action D, she knows that ${a}_{e}$ must have played a game that ${a}_{e}$ perceives as situation h. Thus ${a}_{i}$ infers ${a}_{e}$.xValue$>{a}_{e}$.threshold. On the other hand, because ${a}_{e}$ played a game that ${a}_{i}$ perceives as situation ℓ, ${a}_{i}$ knows that ${a}_{e}$.xValue$<{a}_{i}$.threshold. Combining the two inequalities gives ${a}_{e}$.threshold$<{a}_{e}$.xValue$<{a}_{i}$.threshold. Therefore, ${a}_{i}$ knows that decreasing her threshold from ${a}_{i}$.threshold to ${a}_{e}$.xValue moves her categorization closer to ${a}_{e}$’s.
- conversely, suppose ${a}_{i}$ faced situation h and played D. Because ${a}_{e}$ played C, she knows that ${a}_{e}$ faced a game that ${a}_{e}$ perceives as situation ℓ. Then ${a}_{i}$ infers ${a}_{e}$.xValue$<{a}_{e}$.threshold. In addition, because ${a}_{e}$ played a game that ${a}_{i}$ perceives as situation h, ${a}_{i}$ knows that ${a}_{e}$.xValue$>{a}_{i}$.threshold. This yields ${a}_{e}$.threshold$>{a}_{e}$.xValue$>{a}_{i}$.threshold. Hence, ${a}_{i}$ knows that increasing her threshold to ${a}_{e}$.xValue makes her categorization closer to ${a}_{e}$’s.
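Both cases above reduce to the same update rule: when the exemplar played the other action, the imitator adopts the exemplar's game value as her new threshold. A minimal sketch (the function name is ours, and collapsing the two cases into one assignment is our reading of Section 5.1):

```python
def imitate_threshold(a_i, a_e):
    # Threshold-imitation step: if a_i played C the assignment lowers her
    # threshold to a_e.xValue; if she played D it raises it. In either
    # case her categorization moves toward a_e's.
    if a_i.action != a_e.action:
        a_i.threshold = a_e.xValue
```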

#### 5.2. Simulations

#### 5.2.1. Experiment I: Only Suckers Imitate

- ${R}_{I}\left(\mathcal{A}\right)=random\_member\left({\mathcal{A}}^{\prime}\right)$, where ${\mathcal{A}}^{\prime}=\{a\in \mathcal{A}\mid a$.payoff$<0\}$

#### 5.2.2. Experiment II: All Imitate

- ${R}_{I}\left(\mathcal{A}\right)=random\_member\left(\mathcal{A}\right)$

#### 5.2.3. Experiment III: Agents Imitate When below Average

- ${R}_{I}\left(\mathcal{A}\right)=random\_member\left({\mathcal{A}}^{\prime}\right)$, where ${\mathcal{A}}^{\prime}=below\_average\_same\_situation\left(\mathcal{A}\right)$
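The three selection rules ${R}_{I}$ used in Experiments I–III can be sketched directly from the definitions above. The function names are ours; `below_average_same_situation` is the Appendix A.3 routine, assumed to be in scope.

```python
import random

def R_I_suckers(agents, rng=random):
    # Experiment I: only agents with a negative payoff (suckers) imitate.
    pool = [a for a in agents if a.payoff < 0]
    return rng.choice(pool) if pool else None

def R_I_all(agents, rng=random):
    # Experiment II: any agent may be picked as imitator.
    return rng.choice(list(agents))

def R_I_below_average(agents, rng=random):
    # Experiment III: imitators score below the same-situation average.
    pool = below_average_same_situation(agents)
    return rng.choice(pool) if pool else None
```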

## 6. Analysis of the Results

- sucker (S-agent): unilateral cooperation—agent played C, the opponent chose D;
- rewarded (R-agent): mutual cooperation—both played C;
- punished (P-agent): mutual defection—both played D;
- tempted (T-agent): unilateral defection—agent played D, the opponent chose C.

- Experiment I concerns the case where only S-agents (suckers) may imitate: the entire population learns to share the threshold ${t}^{\ast}=0$. This follows from two simple reasons. First, an S-agent played C when facing situation ℓ and, as discussed in Section 5.1, imitation may only decrease the threshold. Second, an S-agent received the lowest possible payoff, so any agent who is not a sucker is a potential exemplar. In sum, under “suckers imitate”, a player who experiences the pain of being a sucker expands her category h, increasing her chance to defect later and to turn other (more cooperative) agents into suckers themselves.
- Suppose that only R-agents may imitate. Then imitation stops once all agents’ thresholds lie below $1/2$. The reason is again twofold. First, an R-agent played C when facing situation ℓ: then imitation can only decrease his threshold. Moreover, the realized game ${G}_{x}$ must have a temptation value smaller than the imitator’s threshold ${t}_{i}$, so $x<{t}_{i}$. Second, because an R-agent received the payoff $1/2$, his only viable exemplars have a payoff above $1/2$. Any of these exemplars must be a T-agent who scored against an S-agent with a threshold $t>x>1/2$. Once all agents’ thresholds are below $1/2$, no R-agent finds viable exemplars to imitate and learning stops.
- Assume that only P-agents imitate. Then the population learns to share the highest threshold in the initial population. First, a P-agent played D when facing situation h: then imitation can only increase her threshold. Second, a P-agent received the payoff 0, so R-agents (who score $1/2$) are viable exemplars. As threshold values increase, cooperation becomes more likely and thus there are more and more R-agents for P-agents to imitate.
- Finally, suppose that only T-agents imitate. Then imitation stops once all agents’ thresholds are above $1/2$. First, a T-agent played D when facing situation h: imitation can only increase his threshold. Second, because a T-agent received a payoff above 0, his only viable exemplars are R-agents who score $1/2$. (A T-agent never imitates another T-agent because both of them have played the same action.) Once all agents’ thresholds are above $1/2$, a T-agent defects only when facing a game ${G}_{x}$ with $x>1/2$ and receives a payoff $x>1/2$, which is higher than what any R-agent receives. Therefore, a T-agent finds no agent to imitate and learning stops.
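The first case ("only suckers imitate") can be checked with a toy simulation. The details below are our reconstruction, not the authors' exact implementation: an S-agent imitates a random same-situation defector with a higher payoff and adopts that exemplar's xValue as her new threshold, so by construction thresholds can only fall.

```python
import random
from types import SimpleNamespace

def run_suckers_imitate(n_agents=20, rounds=300, seed=7):
    # Toy replication of Experiment I: play, find the suckers, let each
    # imitate a same-situation exemplar who played D and scored higher.
    rng = random.Random(seed)
    agents = [SimpleNamespace(threshold=rng.uniform(0, 1)) for _ in range(n_agents)]
    init = sorted(a.threshold for a in agents)
    for _ in range(rounds):
        rng.shuffle(agents)                       # random matching in pairs
        for a, b in zip(agents[::2], agents[1::2]):
            x = rng.uniform(0, 1)
            for ag in (a, b):
                ag.xValue = x
                ag.action = "C" if x < ag.threshold else "D"
            for me, opp in ((a, b), (b, a)):
                me.payoff = (0.5 if opp.action == "C" else -0.5) \
                    if me.action == "C" else (x if opp.action == "C" else 0.0)
        for s in [ag for ag in agents if ag.payoff < 0]:   # the S-agents
            pool = [e for e in agents
                    if e.payoff > s.payoff and e.action == "D"
                    and e.xValue < s.threshold]   # same situation, other action
            if pool:
                s.threshold = rng.choice(pool).xValue      # strictly lower
    return init, sorted(a.threshold for a in agents)
```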

## 7. The Stability of Learning by Imitation

#### 7.1. Long-Term Behavior

#### 7.2. Resilience after Invasions

## 8. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| ES | evolutionarily stable |
| SS | stochastically stable |
| TI | threshold imitation |
| ALL | all agents |
| AVG | agents that scored below the average score of all agents in the same situation |
| S-agent | sucker agent |
| R-agent | rewarded agent |
| P-agent | punished agent |
| T-agent | tempted agent |

## Appendix A. Implemented Functions

#### Appendix A.1. random_member_with_higher_payoff

    Input: set of n agents A = {a_1, a_2, ..., a_n},
           imitator agent a_i

    O = {}
    for a in A:
        if a_i.payoff < a.payoff:
            O = O ∪ {a}
    return random_member(O)

#### Appendix A.2. is_in_same_situation

    Input: imitator agent a_i,
           example agent a_e

    c = false
    if a_i.xValue < a_i.threshold and a_e.xValue < a_i.threshold:
        c = true
    if a_i.xValue > a_i.threshold and a_e.xValue > a_i.threshold:
        c = true
    return c

#### Appendix A.3. below_average_same_situation

    Input: set of n agents A = {a_1, a_2, ..., a_n}

    M = {}
    for a in A:
        accumulated_payoff = 0.0
        num_opponents = 0
        for a' in A:
            if is_in_same_situation(a, a'):
                accumulated_payoff = accumulated_payoff + a'.payoff
                num_opponents = num_opponents + 1
        avg_payoff = accumulated_payoff / num_opponents
        if a.payoff < avg_payoff:
            M = M ∪ {a}
    return M
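The three appendix routines translate directly into Python. This is a sketch faithful to the pseudocode, with two small liberties noted in the comments: an empty exemplar set returns `None`, and a guard avoids dividing by zero at the (measure-zero) boundary case xValue = threshold.

```python
import random

def random_member_with_higher_payoff(A, a_i, rng=random):
    # Appendix A.1: random exemplar whose payoff beats the imitator's
    # (returns None if no such agent exists; our addition).
    O = [a for a in A if a_i.payoff < a.payoff]
    return rng.choice(O) if O else None

def is_in_same_situation(a_i, a_e):
    # Appendix A.2: both last games fall on the same side of a_i's threshold.
    below = a_i.xValue < a_i.threshold and a_e.xValue < a_i.threshold
    above = a_i.xValue > a_i.threshold and a_e.xValue > a_i.threshold
    return below or above

def below_average_same_situation(A):
    # Appendix A.3: agents scoring below the average payoff of all agents
    # in the same situation (each agent counts itself, as in the pseudocode).
    M = []
    for a in A:
        same = [b.payoff for b in A if is_in_same_situation(a, b)]
        if same and a.payoff < sum(same) / len(same):   # guard: same may be empty
            M.append(a)
    return M
```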

## References

1. Denzau, A.; North, D. Shared mental models: Ideologies and institutions. Kyklos **1994**, 47, 3–31.
2. Hoff, K.; Stiglitz, J.E. Striving for balance in economics: Towards a theory of the social determination of behavior. J. Econ. Behav. Organ. **2016**, 126, 25–57.
3. Hobbes, T. Leviathan; Shapiro, I., Ed.; Yale University Press: New Haven, CT, USA, 2010.
4. Nowak, M.; Highfield, R. Supercooperators: Altruism, Evolution, and Why We Need Each Other to Succeed; Free Press: New York, NY, USA, 2011.
5. Rousseau, J.J. Discours sur l’origine et les fondements de l’inégalité parmi les hommes; Bedford/St. Martins: New York, NY, USA, 2011.
6. Skyrms, B. The Stag Hunt and the Evolution of Social Structure; Cambridge University Press: Cambridge, UK, 2004.
7. Binmore, K. Do conventions need to be common knowledge? Topoi **2008**, 27, 17–27.
8. Van Huyck, J.; Stahl, D. Conditional behavior and learning in similar stag hunt games. Exp. Econ. **2018**, 21, 513–526.
9. Cohen, H.; Lefebvre, C. Handbook of Categorization in Cognitive Science; Elsevier: Amsterdam, The Netherlands, 2005.
10. Gibbons, R.; LiCalzi, M.; Warglien, M. What Situation Is This? Coarse Cognition and Behavior over a Space of Games; Working Paper 9/2017; Department of Management, Università Ca’ Foscari Venezia: Venice, Italy, 2017.
11. Steiner, J.; Stewart, C. Contagion through learning. Theor. Econ. **2008**, 3, 431–458.
12. Peysakhovich, A.; Rand, D.G. Habits of virtue: Creating norms of cooperation and defection in the laboratory. Manag. Sci. **2016**, 62, 631–647.
13. Heller, Y.; Winter, E. Rule rationality. Int. Econ. Rev. **2016**, 57, 997–1026.
14. Mengel, F. Learning across games. Games Econ. Behav. **2012**, 74, 601–619.
15. Macy, M.W.; Flache, A. Learning dynamics in social dilemmas. Proc. Natl. Acad. Sci. USA **2002**, 99, 7229–7236.
16. Rankin, F.W.; van Huyck, J.B.; Battalio, R.C. Strategic similarity and emergent conventions: Evidence from similar stag hunt games. Games Econ. Behav. **2000**, 32, 315–337.
17. LiCalzi, M. Fictitious play by cases. Games Econ. Behav. **1995**, 11, 64–89.
18. Gracia-Lázaro, C.; Floría, L.M.; Moreno, Y. Cognitive hierarchy theory and two-person games. Games **2017**, 8, 1.
19. Cressman, R.; Apaloo, J. Evolutionary game theory. In Handbook on Dynamic Game Theory; Başar, T., Zaccour, G., Eds.; Springer: Cham, Switzerland, 2018; pp. 461–510.
20. Kandori, M.; Mailath, G.; Rob, R. Learning, mutation and long run equilibrium in games. Econometrica **1993**, 61, 29–56.
21. Morris, S.; Rob, R.; Shin, H. p-dominance and belief potential. Econometrica **1995**, 63, 145–157.
22. Ellison, G. Basins of attraction, long-run stochastic stability, and the speed of step-by-step evolution. Rev. Econ. Stud. **2000**, 67, 17–45.
23. Boyd, R.; Richerson, P.J. Culture and the evolution of human cooperation. Philos. Trans. R. Soc. Lond. B Biol. Sci. **2009**, 364, 3281–3288.
24. Ellison, G.; Holden, R. A theory of rule development. J. Law Econ. Organ. **2014**, 30, 649–682.
25. Rusch, H.; Luetge, C. Spillovers from coordination to cooperation: Evidence for the interdependence hypothesis? Evolut. Behav. Sci. **2016**, 10, 284–296.
26. Knez, M.; Camerer, C. Increasing cooperation in prisoner’s dilemmas by establishing a precedent of efficiency in coordination games. Organ. Behav. Hum. Decis. Process. **2000**, 82, 194–216.
27. Bacharach, M. Beyond Individual Choice: Teams and Frames in Game Theory; Princeton University Press: Princeton, NJ, USA, 2006.
28. Bednar, J.; Page, S. Can game(s) theory explain culture? Rational. Soc. **2007**, 19, 65–97.
29. Hume, D. A Treatise of Human Nature; Selby-Bigge, L.A., Nidditch, P., Eds.; Oxford University Press: Oxford, UK, 1978.

1. See Cohen and Lefebvre [9] for extensive surveys on the cognitive primacy of categorization processes.
2. See Steiner and Stewart [11] for a formal model of spillovers due to learning across games.
3. This remains true at the null event $x=1/2$, when Player 1 gets the same payoff under $(C,C)$ and $(D,C)$.
4. Technically, thresholds define partitions up to zero-measure sets because cells overlap at the boundary.
5. An implementation of the model (using Python) is available at the GitHub repository: https://github.com/muehlenbernd/temptation_threshold_imitation.
6. See Bacharach [27] for a different approach.

**Figure 3.** Categorization thresholds that determine where to cooperate (light gray) and defect (dark gray) over a space of stag hunt games (SH, $x<1/2$) and prisoners’ dilemmas (PD, $x>1/2$).

**Figure 5.** Percentages of the cumulative number of update steps from R-, S-, T-, and P-agents for “ALL imitate” (left) and “AVG imitate” (right).

**Figure 6.** Distribution of shared thresholds under persistent invasions for “ALL imitate” (left) and “AVG imitate” (right).

**low temptation (ℓ)**

|   | C | D |
|---|---|---|
| C | $\frac{1}{2},\frac{1}{2}$ | $-\frac{1}{2},\frac{t}{2}$ |
| D | $\frac{t}{2},-\frac{1}{2}$ | $0,0$ |

**high temptation (h)**

|   | C | D |
|---|---|---|
| C | $\frac{1}{2},\frac{1}{2}$ | $-\frac{1}{2},\frac{1+t}{2}$ |
| D | $\frac{1+t}{2},-\frac{1}{2}$ | $0,0$ |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

LiCalzi, M.; Mühlenbernd, R. Categorization and Cooperation across Games. *Games* **2019**, *10*, 5.
https://doi.org/10.3390/g10010005
