# Testing Game Theory of Mind Models for Artificial Intelligence

## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Game Structure

- All: all choices are pooled without differentiating between the games in which they were made.
- Common Interest: there is one option that is best for both players, e.g., ${f}_{3}>{f}_{1}>{f}_{2}$ and ${s}_{3}>{s}_{2}>{s}_{1}$.
- Safe Shot: “In” is the optimal choice for player 1, e.g., ${f}_{2}>{f}_{3}>{f}_{1}$ and ${s}_{3}>{s}_{2}>{s}_{1}$.
- Strategic Dummy: player 2 cannot affect the payoffs, e.g., ${f}_{1}>{f}_{2}={f}_{3}$ and ${s}_{2}={s}_{3}>{s}_{1}$.
- Near Dictator: the best payoff for player 1 is independent of player 2’s choice, e.g., ${f}_{1}>{f}_{3}>{f}_{2}$ and ${s}_{2}>{s}_{1}>{s}_{3}$.
- Free Punish: player 2 can punish player 1’s “In” choice with no cost, e.g., ${f}_{2}>{f}_{1}>{f}_{3}$ and ${s}_{1}>{s}_{2}={s}_{3}$.
- Rational Punish: punishing player 1’s “In” choice maximises player 2’s payoff, e.g., ${f}_{3}>{f}_{1}>{f}_{2}$ and ${s}_{1}>{s}_{2}>{s}_{3}$.
- Costly Punish: punishing player 1’s “In” choice is costly, e.g., ${f}_{3}>{f}_{1}>{f}_{2}$ and ${s}_{1}>{s}_{3}>{s}_{2}$.
- Free Help: improving the other’s payoff is not costly, e.g., ${f}_{1}={f}_{2}>{f}_{3}$ and ${s}_{2}>{s}_{1}>{s}_{3}$.
- Costly Help: improving the other’s payoff is costly for the helper, e.g., ${f}_{3}>{f}_{1}>{f}_{2}$ and ${s}_{2}>{s}_{1}={s}_{3}$.
- Trust Game: choosing “In” improves player 2’s payoff, but reciprocation is irrational for player 2, e.g., ${f}_{2}>{f}_{1}>{f}_{3}$ and ${s}_{3}>{s}_{2}>{s}_{1}$ (Trust is a subset of Costly Help).
- Conflicting Interest: player 1’s reward is maximised only if player 2 chooses suboptimally after player 1 plays “In”, i.e., ${f}_{2}>{f}_{3}$ but ${s}_{2}<{s}_{3}$, while “Out” is neither minimal nor maximal for player 1, e.g., ${f}_{3}>{f}_{1}>{f}_{2}$ and ${s}_{2}>{s}_{1}>{s}_{3}$.
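Each game type above is defined purely by an ordering over the two players' payoffs. As a minimal illustration (this is our sketch, not the authors' code; the function names and tuple representation are assumptions), a payoff profile can be tested against a type's defining inequalities:

```python
# Encode each player's payoffs as tuples f = (f1, f2, f3) and s = (s1, s2, s3),
# then check the defining inequalities of a game type.

def is_common_interest(f, s):
    # One option is best for both players: f3 > f1 > f2 and s3 > s2 > s1.
    return f[2] > f[0] > f[1] and s[2] > s[1] > s[0]

def is_safe_shot(f, s):
    # "In" is the optimal choice for player 1: f2 > f3 > f1 and s3 > s2 > s1.
    return f[1] > f[2] > f[0] and s[2] > s[1] > s[0]

# The example orderings from the list above:
print(is_common_interest((2, 1, 3), (1, 2, 3)))  # Common Interest ordering holds
print(is_safe_shot((1, 3, 2), (1, 2, 3)))        # Safe Shot ordering holds
```

Only ordinal comparisons are used, so any payoff values that respect the stated ranking will classify the same way.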

#### 2.2. Experimental Data and Decision Model

#### 2.3. Expected Utility Models

- Linear Model: $$U(x) = \lambda x$$
- Linear Loss Aversion Behaviour: $$U(x) = \begin{cases} x & \text{if } x \ge 0, \\ \lambda x & \text{if } x < 0. \end{cases}$$
- General Linear Loss Aversion Behaviour: $$U(x) = \begin{cases} \alpha x & \text{if } x \ge 0, \\ \lambda \beta x & \text{if } x < 0. \end{cases}$$
- Normalised Exponential Loss Aversion Behaviour: $$U(x, \alpha) = \begin{cases} 1 - e^{-\alpha x} & \text{if } x \ge 0,\ \alpha > 0, \\ x & \text{if } x \ge 0,\ \alpha = 0, \\ e^{-\alpha x} - 1 & \text{if } x \ge 0,\ \alpha < 0, \\ -\lambda \left(1 - e^{-\alpha x}\right) & \text{if } x < 0,\ \alpha > 0, \\ -\lambda x & \text{if } x < 0,\ \alpha = 0, \\ -\lambda \left(e^{-\alpha x} - 1\right) & \text{if } x < 0,\ \alpha < 0. \end{cases}$$
- Power Law Loss Aversion Behaviour: $$U(x) = \begin{cases} x^{\alpha} & \text{if } x \ge 0, \\ -\lambda (-x)^{\beta} & \text{if } x < 0. \end{cases}$$
- General Power Loss Aversion Behaviour: $$U(x) = \begin{cases} \beta x^{\alpha} & \text{if } x \ge 0, \\ -\lambda (-\delta x)^{\gamma} & \text{if } x < 0. \end{cases}$$
- Exponential Power Loss Aversion Behaviour: $$U(x) = \begin{cases} \gamma - e^{-\beta x^{\alpha}} & \text{if } x \ge 0, \\ -\lambda \left(\gamma - e^{-\beta (-x)^{\alpha}}\right) & \text{if } x < 0. \end{cases}$$
- Quadratic Loss Aversion Behaviour: $$U(x) = \begin{cases} \alpha x - x^{2} & \text{if } x \ge 0, \\ -\lambda \left(-\beta x - x^{2}\right) & \text{if } x < 0. \end{cases}$$
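These piecewise utilities are straightforward to implement. The sketch below (illustrative only, not the authors' implementation; `lam` denotes the loss-aversion coefficient $\lambda$) covers the Linear and Quadratic Loss Aversion forms:

```python
def linear_loss_aversion(x, lam):
    # U(x) = x for gains, lambda * x for losses.
    return x if x >= 0 else lam * x

def quadratic_loss_aversion(x, alpha, beta, lam):
    # U(x) = alpha*x - x^2 for gains, -lambda * (-beta*x - x^2) for losses.
    return alpha * x - x**2 if x >= 0 else -lam * (-beta * x - x**2)

# With lam > 1, a loss of 4 hurts more than a gain of 4 helps:
print(linear_loss_aversion(4.0, lam=2.25))   # 4.0
print(linear_loss_aversion(-4.0, lam=2.25))  # -9.0
```

The remaining forms follow the same pattern: one branch for gains ($x \ge 0$) and a $\lambda$-scaled branch for losses ($x < 0$).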

#### 2.4. Prospect Theory Models

- None: $\pi(p) = p$
- Kahneman–Tversky: $\pi(p) = p^{\alpha} \left(p^{\alpha} + (1-p)^{\alpha}\right)^{-1/\alpha}$
- Log-Odds Linear: $\pi(p) = \beta p^{\alpha} \left(\beta p^{\alpha} + (1-p)^{\alpha}\right)^{-1}$
- Power Law: $\pi(p) = \beta p^{\alpha}$
- NeoAdditive: $\pi(p) = \beta + \alpha p$
- Hyperbolic Log: $\pi(p) = (1 - \alpha \log p)^{-\beta/\alpha}$
- Exponential Power: $\pi(p) = \exp\left(-\frac{\alpha}{\beta}\left(1 - p^{\beta}\right)\right)$
- Compound Invariance: $\pi(p) = \exp\left(-\beta (-\log p)^{\alpha}\right)$
- Constant Relative Sensitivity: $\pi(p) = \beta^{1-\alpha} p^{\alpha}$
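As a minimal sketch (our own illustration, not the authors' code), two of these probability weighting functions can be written directly from their definitions:

```python
def kahneman_tversky_weight(p, alpha):
    # pi(p) = p^a * (p^a + (1 - p)^a)^(-1/a)
    return p**alpha * (p**alpha + (1 - p)**alpha) ** (-1 / alpha)

def power_law_weight(p, alpha, beta):
    # pi(p) = beta * p^a
    return beta * p**alpha

# With alpha = 1 (and beta = 1) both reduce to the identity weighting pi(p) = p;
# with alpha < 1 the Kahneman-Tversky form overweights small probabilities.
print(kahneman_tversky_weight(0.5, 1.0))         # 0.5
print(kahneman_tversky_weight(0.1, 0.61) > 0.1)  # True
```

The other weighting functions slot into the same interface: each maps an objective probability $p \in (0, 1)$ to a decision weight.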

## 3. Results

#### 3.1. Average Root Mean Square Error of Utility Models

#### 3.2. Performance: Individual Models by Game Type

## 4. Discussion

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References


**Figure 1.** Sequential interaction between Player 1 and Player 2 and the payoffs for their joint choices. Player 1 chooses between “Out” and “In”; if Player 1 chooses “In”, Player 2 then chooses between “Left” and “Right”.

**Figure 2.** Root mean square error (RMSE) results for each utility and neural network model on the training data. The neural network models clearly perform well relative to the other structural decision models.

**Figure 3.** Root mean square error results for each utility and neural network model on the test data (out-of-sample); the ordering is the same as in Figure 2. Note that performance categorised by model type shows no significant patterns in either the spreads or the means of the RMSE values.

**Figure 4.** RMSE performance of the top 10 and bottom 10 EU-PT models. There is significantly less variation in out-of-sample performance than in in-sample performance, with the unconstrained neural network and the unconstrained neural network with feature engineering performing noticeably better and being highly comparable to each other.

**Figure 5.** Log${}_{10}$-transformed RMSE performance distribution of all EU-PT models, disaggregated by game type. The log transform highlights the low-value tails, i.e., ‘best in class’ data points with low RMSE values.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Harré, M.S.; El-Tarifi, H.
Testing Game Theory of Mind Models for Artificial Intelligence. *Games* **2024**, *15*, 1.
https://doi.org/10.3390/g15010001
