# Estimating Case-Based Learning


## Abstract


## 1. Introduction

## 2. Applying Case-Based Learning to Experiments

## 3. Learning Algorithms

#### 3.1. Case-Based Learning

#### 3.1.1. Definition of Case-Based Attraction

#### 3.1.2. The Functional Form of Similarity

#### 3.1.3. Comparision to RL And EWA

#### 3.2. Reinforcement Learning

#### 3.3. Self-Tuning Experience Weighted Attraction

#### 3.4. Relationship between RL and CBL

#### 3.5. Initial Attractions

#### 3.6. Stochastic Choice Probabilities

## 4. Description of the Data

## 5. Measuring Goodness of Fit

## 6. Results

## 7. Case-Based Parameters

#### 7.1. Memory

#### 7.2. Definition of Similarity

## 8. Empirical Comparison of Learning Models

## 9. Discussion

## 10. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## Appendix A. Proofs

**Proposition A1.**

## Appendix B. Definition of the Problem

| | Mean Quadratic Score |
|---|---|
| One lag | 0.677 |
| Two lags | 0.684 |
| Three lags | 0.682 |
| MA-3 | 0.678 |
| MA-5 | 0.678 |
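The rows above compare alternative definitions of the "problem" a case-based learner conditions on: the last one, two, or three observations of play, or a three- or five-period moving average. A minimal sketch of how such alternative problem descriptions could be constructed (the function name and encoding are hypothetical illustrations, not the paper's actual definition, which is given in Appendix B):

```python
def problem_description(history, spec):
    """Build the 'problem' (context) a case-based learner conditions on.
    history: past observations of play, most recent last.
    spec: 'lag-k' keeps the last k observations as-is;
          'ma-k' summarizes them by their moving average."""
    k = int(spec.split("-")[1])
    window = history[-k:]
    if spec.startswith("lag"):
        return tuple(window)
    if spec.startswith("ma"):
        return (sum(window) / len(window),)
    raise ValueError(f"unknown spec: {spec}")

hist = [1, 0, 1, 1, 0]
assert problem_description(hist, "lag-2") == (1, 0)   # last two plays
assert problem_description(hist, "ma-3") == (2 / 3,)  # mean of last three
```

Under this reading, the table reports how little the mean quadratic score varies across these context definitions (0.677 to 0.684).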

## Appendix C. Normalization of Weights

| | $W_1$ | $W_2$ |
|---|---|---|
| Constant sum games | 0.544 | 2.435 |
| Non-constant sum games | 14.245 | 1.539 |
| All games | 0.622 | 1.414 |
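The weights above enter the similarity function that governs how strongly past cases influence current attractions. As an illustration only, one common exponential form (axiomatized by Billot, Gilboa, and Schmeidler) weights each dimension of the distance between two problems; the paper's own functional form is defined in Section 3.1.2, and this sketch should not be read as that definition:

```python
import math

def exponential_similarity(p, q, weights):
    """Illustrative exponential similarity between problems p and q:
    s(p, q) = exp(-sqrt(sum_i w_i * (p_i - q_i)^2)).
    Larger weights make similarity decay faster along that dimension."""
    d = math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))
    return math.exp(-d)

# identical problems are maximally similar
assert exponential_similarity((0.5, 0.2), (0.5, 0.2), (1.0, 2.0)) == 1.0
# a larger weight on the first dimension shrinks similarity
assert exponential_similarity((0.5, 0.2), (0.9, 0.2), (4.0, 2.0)) < \
       exponential_similarity((0.5, 0.2), (0.9, 0.2), (1.0, 2.0))
```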

## Appendix D. Individual Game Results

**Table A3.** In-sample Fit by Game: Mean Quadratic Score. Note: * denotes the best-fitting model based on the mean quadratic score. CBL has five estimated parameters, RL has four estimated parameters, and ST EWA has one estimated parameter. ST EWA refers to the self-tuning EWA, RL refers to reinforcement learning, and CBL refers to case-based learning.

| | CBL | ST EWA | RL |
|---|---|---|---|
| Game 1 | 0.809 * | 0.785 | 0.808 |
| Game 2 | 0.667 * | 0.661 | 0.667 |
| Game 3 | 0.768 * | 0.750 | 0.767 |
| Game 4 | 0.654 * | 0.636 | 0.650 |
| Game 5 | 0.630 | 0.608 | 0.633 * |
| Game 6 | 0.594 | 0.568 | 0.598 * |
| Game 7 | 0.741 | 0.735 | 0.751 * |
| Game 8 | 0.637 * | 0.629 | 0.636 |
| Game 9 | 0.743 * | 0.706 | 0.738 |
| Game 10 | 0.639 * | 0.607 | 0.637 |
| Game 11 | 0.616 * | 0.597 | 0.615 |
| Game 12 | 0.593 * | 0.573 | 0.590 |

**Table A4.** Out-of-sample Fit by Game. Note: * denotes the best-fitting model based on the mean quadratic score. ST EWA refers to the self-tuning EWA, RL refers to reinforcement learning, and CBL refers to case-based learning.

**A: Out-of-Sample: Predict Last 60%**

| | CBL | ST EWA | RL |
|---|---|---|---|
| Game 1 | 0.827 | 0.811 | 0.838 * |
| Game 2 | 0.674 | 0.669 | 0.675 * |
| Game 3 | 0.793 * | 0.771 | 0.793 |
| Game 4 | 0.658 * | 0.637 | 0.653 |
| Game 5 | 0.636 * | 0.607 | 0.632 |
| Game 6 | 0.601 * | 0.568 | 0.600 |
| Game 7 | 0.768 | 0.766 | 0.783 * |
| Game 8 | 0.641 | 0.638 | 0.644 * |
| Game 9 | 0.743 | 0.715 | 0.747 * |
| Game 10 | 0.632 | 0.605 | 0.638 * |
| Game 11 | 0.616 | 0.598 | 0.617 * |
| Game 12 | 0.593 * | 0.576 | 0.593 |

**B: Out-of-Sample: Predict Last 50%**

| | CBL | ST EWA | RL |
|---|---|---|---|
| Game 1 | 0.845 * | 0.817 | 0.844 |
| Game 2 | 0.671 | 0.671 | 0.677 * |
| Game 3 | 0.782 | 0.778 | 0.798 * |
| Game 4 | 0.659 * | 0.637 | 0.655 |
| Game 5 | 0.641 * | 0.607 | 0.634 |
| Game 6 | 0.603 * | 0.567 | 0.599 |
| Game 7 | 0.784 | 0.769 | 0.785 * |
| Game 8 | 0.637 | 0.637 | 0.642 * |
| Game 9 | 0.746 | 0.711 | 0.747 * |
| Game 10 | 0.645 * | 0.610 | 0.641 |
| Game 11 | 0.618 | 0.601 | 0.624 * |
| Game 12 | 0.579 | 0.574 | 0.593 |

**C: Out-of-Sample: Predict Last 40%**

| | CBL | ST EWA | RL |
|---|---|---|---|
| Game 1 | 0.843 | 0.818 | 0.847 |
| Game 2 | 0.665 | 0.672 | 0.680 * |
| Game 3 | 0.787 | 0.779 | 0.800 * |
| Game 4 | 0.661 * | 0.636 | 0.655 |
| Game 5 | 0.641 * | 0.607 | 0.634 |
| Game 6 | 0.594 | 0.569 | 0.600 * |
| Game 7 | 0.789 | 0.775 | 0.792 * |
| Game 8 | 0.635 | 0.635 | 0.640 * |
| Game 9 | 0.750 * | 0.715 | 0.748 |
| Game 10 | 0.651 * | 0.613 | 0.650 |
| Game 11 | 0.621 | 0.605 | 0.626 * |
| Game 12 | 0.596 * | 0.574 | 0.594 |

**D: Out-of-Sample: Predict Last 30%**

| | CBL | ST EWA | RL |
|---|---|---|---|
| Game 1 | 0.854 | 0.820 | 0.852 |
| Game 2 | 0.669 | 0.671 | 0.680 * |
| Game 3 | 0.792 | 0.780 | 0.802 * |
| Game 4 | 0.664 * | 0.641 | 0.659 |
| Game 5 | 0.643 * | 0.611 | 0.635 |
| Game 6 | 0.606 * | 0.574 | 0.603 |
| Game 7 | 0.791 | 0.777 | 0.795 * |
| Game 8 | 0.639 | 0.634 | 0.639 * |
| Game 9 | 0.737 | 0.714 | 0.748 * |
| Game 10 | 0.643 | 0.614 | 0.651 * |
| Game 11 | 0.621 | 0.603 | 0.624 * |
| Game 12 | 0.599 | 0.579 | 0.599 * |

**E: Out-of-Sample: Predict Last 20%**

| | CBL | ST EWA | RL |
|---|---|---|---|
| Game 1 | 0.866 * | 0.821 | 0.862 |
| Game 2 | 0.694 | 0.686 | 0.697 * |
| Game 3 | 0.804 | 0.789 | 0.806 * |
| Game 4 | 0.660 | 0.642 | 0.661 * |
| Game 5 | 0.637 * | 0.612 | 0.636 |
| Game 6 | 0.609 * | 0.574 | 0.604 |
| Game 7 | 0.794 * | 0.776 | 0.794 |
| Game 8 | 0.634 | 0.637 | 0.643 * |
| Game 9 | 0.746 | 0.726 | 0.757 * |
| Game 10 | 0.663 | 0.626 | 0.666 * |
| Game 11 | 0.620 | 0.608 | 0.629 * |
| Game 12 | 0.591 | 0.575 | 0.595 * |

## References

1. Gilboa, I.; Schmeidler, D. Case-based decision theory. Q. J. Econ. **1995**, 110, 605–639.
2. Von Neumann, J.; Morgenstern, O. Theory of Games and Economic Behavior; Princeton University Press: Princeton, NJ, USA, 1944.
3. Savage, L.J. The Foundations of Statistics; Wiley: New York, NY, USA, 1954.
4. Pape, A.D.; Kurtz, K.J. Evaluating case-based decision theory: Predicting empirical patterns of human classification learning. Games Econ. Behav. **2013**, 82, 52–65.
5. Guilfoos, T.; Pape, A.D. Predicting human cooperation in the Prisoner's Dilemma using case-based decision theory. Theory Decis. **2016**, 80, 1–32.
6. Luce, R.D. Individual Choice Behavior; Dover Publications Inc.: Mineola, NY, USA, 1959.
7. Erev, I.; Roth, A.E. Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. Am. Econ. Rev. **1998**, 88, 848–881.
8. Selten, R.; Chmura, T. Stationary concepts for experimental 2 × 2-games. Am. Econ. Rev. **2008**, 98, 938–966.
9. Chmura, T.; Goerg, S.J.; Selten, R. Learning in experimental 2 × 2 games. Games Econ. Behav. **2012**, 76, 44–73.
10. Ho, T.H.; Camerer, C.F.; Chong, J.K. Self-tuning experience weighted attraction learning in games. J. Econ. Theory **2007**, 133, 177–198.
11. Gayer, G.; Gilboa, I.; Lieberman, O. Rule-based and case-based reasoning in housing prices. BE J. Theor. Econ. **2007**, 7.
12. Kinjo, K.; Sugawara, S. Predicting empirical patterns in viewing Japanese TV dramas using case-based decision theory. BE J. Theor. Econ. **2016**, 16, 679–709.
13. Golosnoy, V.; Okhrin, Y. General uncertainty in portfolio selection: A case-based decision approach. J. Econ. Behav. Organ. **2008**, 67, 718–734.
14. Guerdjikova, A. Case-based learning with different similarity functions. Games Econ. Behav. **2008**, 63, 107–132.
15. Ossadnik, W.; Wilmsmann, D.; Niemann, B. Experimental evidence on case-based decision theory. Theory Decis. **2013**, 75, 1–22.
16. Grosskopf, B.; Sarin, R.; Watson, E. An experiment on case-based decision making. Theory Decis. **2015**, 79, 639–666.
17. Bleichrodt, H.; Filko, M.; Kothiyal, A.; Wakker, P.P. Making case-based decision theory directly observable. Am. Econ. J. Microecon. **2017**, 9, 123–151.
18. Radoc, B.; Sugden, R.; Turocy, T.L. Correlation neglect and case-based decisions. J. Risk Uncertain. **2019**, 59, 23–49.
19. Bhui, R. Case-Based Decision Neuroscience: Economic Judgment by Similarity. In Goal-Directed Decision Making; Elsevier: Amsterdam, The Netherlands, 2018; pp. 67–103.
20. Gayer, G.; Gilboa, I. Analogies and theories: The role of simplicity and the emergence of norms. Games Econ. Behav. **2014**, 83, 267–283.
21. Bordalo, P.; Gennaioli, N.; Shleifer, A. Memory, Attention, and Choice. Q. J. Econ. **2020**, 135, 1399–1442.
22. Argenziano, R.; Gilboa, I. Similarity & Nash Equilibria in Statistical Games. Available online: https://www.researchgate.net/publication/334635842_Similarity-Nash_Equilibria_in_Statistical_Games (accessed on 23 July 2019).
23. Gilboa, I.; Lieberman, O.; Schmeidler, D. Empirical Similarity. Rev. Econ. Stat. **2006**, 88, 433–444.
24. Billot, A.; Gilboa, I.; Schmeidler, D. Axiomatization of an Exponential Similarity Function. Math. Soc. Sci. **2008**, 55, 107–115.
25. Matsui, A. Expected Utility and Case-Based Reasoning. Math. Soc. Sci. **2000**, 39, 1–12.
26. Charness, G.; Gneezy, U. What's in a name? Anonymity and social distance in dictator and ultimatum games. J. Econ. Behav. Organ. **2008**, 68, 29–35.
27. Guilfoos, T.; Kurtz, K.J. Evaluating the role of personality trait information in social dilemmas. J. Behav. Exp. Econ. **2017**, 68, 119–129.
28. Cerigioni, F. Dual Decision Processes: Retrieving Preferences When Some Choices are Intuitive; Economics Working Papers; Department of Economics and Business, Universitat Pompeu Fabra: Barcelona, Spain, 2016.
29. Fudenberg, D.; Levine, D.K. The Theory of Learning in Games; MIT Press: Cambridge, MA, USA, 1998; Volume 2.
30. Camerer, C.; Ho, T.H. Experience-weighted attraction learning in coordination games: Probability rules, heterogeneity, and time-variation. J. Math. Psychol. **1998**, 42, 305–326.
31. Camerer, C.; Ho, T.H. Experience-weighted attraction learning in normal form games. Econometrica **1999**, 67, 827–874.
32. Camerer, C.F.; Ho, T.H.; Chong, J.K. Sophisticated experience-weighted attraction learning and strategic teaching in repeated games. J. Econ. Theory **2002**, 104, 137–188.
33. Shepard, R. Toward a Universal Law of Generalization for Psychological Science. Science **1987**, 237, 1317.
34. Dayan, P.; Niv, Y. Reinforcement learning: The good, the bad and the ugly. Curr. Opin. Neurobiol. **2008**, 18, 185–196.
35. Glimcher, P.W. Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis. Proc. Natl. Acad. Sci. USA **2011**, 108, 15647–15654.
36. Harley, C.B. Learning the evolutionarily stable strategy. J. Theor. Biol. **1981**, 89, 611–633.
37. Roth, A.E.; Erev, I. Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games Econ. Behav. **1995**, 8, 164–212.
38. Brier, G.W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. **1950**, 78, 1–3.
39. Selten, R. Axiomatic characterization of the quadratic scoring rule. Exp. Econ. **1998**, 1, 43–61.
40. Vuong, Q.H. Likelihood ratio tests for model selection and non-nested hypotheses. Econom. J. Econom. Soc. **1989**, 57, 307–333.
41. Andreoni, J.; Miller, J.H. Rational cooperation in the finitely repeated prisoner's dilemma: Experimental evidence. Econ. J. **1993**, 103, 570–585.

1. The learning models from Chmura et al. [9] establish that these stationary concepts and the other learning models they consider fit the data worse than the models considered here. We replicate their findings for self-tuning EWA and find a better fit for reinforcement learning by estimating a greater number of free parameters.

2. It is worth noting that there is a mapping between expected utility and case-based decision theory [25], which implies that, in a formal sense, replacing the state space with the problem space is not 'easier' if one requires that the decision-maker must ex-ante judge the similarity between all possible pairs of problems.

3. But see Section 7.2. In fact, Erev and Roth [7] discuss a similar notion of similarity between situations, used to define a subject's experimentation when choosing strategies.

4. Similarity is used in a way that maps closely to how learning models work in general: repeating successful choices under certain conditions. Cerigioni [28] uses similarity to model choices that are automated through the dual decision processes familiar from psychology.

5. We discuss the functional form of the probability distribution in Section 3.6.

6. As discussed in the introduction of this section, our implementation uses attractions, so choice is not deterministic but stochastic, with the probability of choosing an action increasing in its CBU.
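This stochastic-choice mechanism can be sketched as follows: under a logit response, the probability of an action grows with its attraction, scaled by a sensitivity parameter λ. This is an illustrative sketch under that assumption, not the authors' estimation code; the function name and inputs are hypothetical.

```python
import math

def choice_probabilities(attractions, lam):
    """Logit (softmax) response: P(j) = exp(lam * A_j) / sum_k exp(lam * A_k).
    The probability of choosing action j increases in its attraction A_j,
    with sensitivity governed by lam."""
    m = max(attractions)                     # subtract max for numerical stability
    weights = [math.exp(lam * (a - m)) for a in attractions]
    total = sum(weights)
    return [w / total for w in weights]

# Two actions, as in the 2 x 2 games: higher attraction -> higher probability
probs = choice_probabilities([1.0, 0.5], lam=2.0)
assert abs(sum(probs) - 1.0) < 1e-12
assert probs[0] > probs[1]
```

As λ grows, choice concentrates on the highest-attraction action; as λ approaches zero, choice approaches uniform randomization.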

7. Moreover, the similarity function can also be dynamic, which further allows for reconsideration of past events in a way that RL/EWA accumulation does not.

8. The bucket analogy is also apropos because Erev and Roth [7] describe a spillover effect, in which buckets can slosh over to neighboring buckets. We do not investigate the spillover effect in this paper, since with only two actions (in 2 × 2 games) the spillover effect washes out.

9. In addition to logit response, we also estimate a power logit function, but find that it does not change the conclusions or generally improve the fit of the learning models estimated here.

10. We use Stata to estimate the maximum likelihood functions using variations of the Newton–Raphson and Davidon–Fletcher–Powell algorithms, depending on success in estimation. Code is available upon request.
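As an illustration of this estimation step (not the authors' Stata code), the sketch below recovers the logit sensitivity parameter by maximum likelihood on simulated data, using SciPy's BFGS routine, a quasi-Newton method in the same family as the Newton–Raphson and Davidon–Fletcher–Powell variants mentioned above. All names and the data-generating process are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(lam, attractions, choices):
    """Negative log-likelihood of observed choices under logit response
    with sensitivity lam; attractions is (T, K), choices is (T,) of ints."""
    z = lam * attractions
    z = z - z.max(axis=1, keepdims=True)          # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(choices)), choices].sum()

# simulate choices from a known lambda, then recover it
rng = np.random.default_rng(0)
A = rng.normal(size=(500, 2))                     # toy attraction series
true_lam = 3.0
p1 = 1.0 / (1.0 + np.exp(-true_lam * (A[:, 1] - A[:, 0])))
choices = (rng.random(500) < p1).astype(int)

res = minimize(neg_log_likelihood, x0=np.array([1.0]),
               args=(A, choices), method="BFGS")
lam_hat = res.x[0]                                # close to true_lam
```

In practice the attractions themselves depend on the learning model's other parameters, so the full estimation jointly maximizes over all free parameters rather than λ alone.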

11.

12. Other measures of goodness of fit generally provide the same qualitative results, but the ordering of preferred learning models can be reversed by employing the log-likelihood when model fitness is relatively close. We prefer the quadratic scoring rule and use it throughout.

13. The mean squared error is 0.1618 for RL, 0.1715 for self-tuning EWA, and 0.1603 for CBL; the ordering of models is the same as under the quadratic scoring rule.
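For reference, the quadratic scoring rule of Brier and Selten that underlies these comparisons can be sketched as follows (the function name is a hypothetical illustration):

```python
def quadratic_score(probs, realized):
    """Quadratic scoring rule for one prediction:
    QS = 2 * p_realized - sum_k p_k^2.
    It equals 1 for a certain, correct prediction and falls as the
    predicted distribution moves away from the realized action."""
    return 2.0 * probs[realized] - sum(p * p for p in probs)

# A uniform prediction over two actions scores 0.5 regardless of outcome
assert quadratic_score([0.5, 0.5], 0) == 0.5
# A certain, correct prediction scores 1
assert quadratic_score([1.0, 0.0], 0) == 1.0
```

The mean quadratic score reported in the tables averages this quantity over all observed choices.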

14. We estimate the initial attractions in our self-tuning EWA model while Chmura et al. [9] do not, which does not appear to make much of a difference in goodness of fit. They assume a random action initially for all learning models investigated. Chmura et al. [9] also estimate a one-parameter RL model, which underperforms self-tuning EWA.

**Figure 2.** In-sample Fit of Learning Models. Note: The red line represents the quadratic score of the baseline model, which is the predicted score of a learning model picking strategies at random. ST EWA refers to self-tuning EWA, RL refers to reinforcement learning, and CBL refers to case-based learning.

**Figure 3.** Out-of-sample Fit of Learning Models. Note: Each model is estimated using a portion of the data, while goodness of fit is measured on the remaining data. ST EWA refers to the self-tuning EWA, RL refers to reinforcement learning, and CBL refers to case-based learning.

**Figure 5.** Convergence of RL and CBL. Note: The red line denotes an OLS regression line of round on percent difference in predictions.

**Table 1.** CBL Parameter Estimates. Note: ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively. Clustered standard errors by subject are in parentheses. MQS is the mean quadratic score. † Clustered standard errors could not be computed for this estimate; the standard error is instead calculated using the outer product of the gradient (OPG) method.

**A: Combined Models**

| | $\lambda$ | $A_0^L$ | $A_0^U$ | $W_1$ | $W_2$ | N | MQS |
|---|---|---|---|---|---|---|---|
| Constant sum games | 10.896 *** | −1.596 *** | 1.775 *** | 0.043 *** | 15.917 *** | 115,200 | 0.688 |
| | (0.331) | (0.211) | (0.250) | (0.002) | (0.002) | | |
| Non-constant sum games | 37.831 *** | −1.024 *** | 1.023 *** | 1.136 *** | 10.259 *** | 57,600 | 0.653 |
| | (1.748) | (0.148) | (0.143) | (0.000) | (1.006) † | | |
| All games | 10.179 *** | −1.946 *** | 1.553 *** | 0.049 *** | 9.365 *** | 172,800 | 0.679 |
| | (0.186) | (0.195) | (0.204) | (0.000) | (0.364) † | | |

**B: Individual Game Models**

| | $\lambda$ | $A_0^L$ | $A_0^U$ | $W_1$ | $W_2$ | N | MQS |
|---|---|---|---|---|---|---|---|
| Game 1 | 15.543 *** | −1.397 *** | 4.821 *** | 0.205 *** | 4.146 *** | 19,200 | 0.809 |
| | (0.823) | (0.437) | (1.021) | (0.016) † | (0.953) † | | |
| Game 2 | 6.655 *** | −0.730 * | 2.395 ** | 0.019 *** | 5.316 *** | 19,200 | 0.667 |
| | (0.925) | (0.444) | (1.179) | (0.007) | (1.376) | | |
| Game 3 | 15.175 *** | −1.753 | 2.465 *** | 0.127 | 3.929 *** | 19,200 | 0.768 |
| | (0.584) † | (3.794) | (0.269) † | (0.134) | (0.398) | | |
| Game 4 | 13.758 *** | −2.683 *** | 2.139 *** | 0.139 *** | 3.167 *** | 19,200 | 0.654 |
| | (0.832) | (0.471) | (0.352) | (0.005) | (0.028) | | |
| Game 5 | 110.704 *** | −0.416 *** | 0.377 *** | 3.204 *** | 7.261 *** | 19,200 | 0.630 |
| | (8.723) | (0.071) | (0.078) | (0.000) | (2.352) † | | |
| Game 6 | 65.058 *** | −0.406 *** | 0.114 | 1.535 *** | 3.246 ** | 19,200 | 0.594 |
| | (5.276) | (0.125) | (0.090) | (0.003) | (1.507) † | | |
| Game 7 | 33.815 *** | −0.431 | 2.135 *** | 1.005 *** | 8.853 *** | 9600 | 0.741 |
| | (3.764) | (0.316) | (0.618) | (0.104) † | (0.875) | | |
| Game 8 | 5.300 * | −3.121 ** | 3.856 ** | 0.020 | 5.155 * | 9600 | 0.637 |
| | (3.092) | (1.475) | (1.941) | (0.041) | (2.783) | | |
| Game 9 | 10.058 *** | −4.345 *** | 2.512 *** | 0.080 *** | 2.784 *** | 9600 | 0.743 |
| | (0.724) | (1.155) | (0.695) | (0.000) | (0.004) | | |
| Game 10 | 5.837 *** | −4.364 ** | −0.563 | 0.020 *** | 1.499 *** | 9600 | 0.639 |
| | (0.836) | (2.063) | (1.025) | (0.008) | (0.013) | | |
| Game 11 | 20.525 *** | −1.588 *** | 1.444 *** | 0.394 *** | 0.767 *** | 9600 | 0.616 |
| | (1.638) | (0.487) | (0.416) | (0.000) | (0.004) | | |
| Game 12 | 7.184 *** | −1.864 | 0.604 | 0.012 ** | 8.471 *** | 9600 | 0.593 |
| | (1.078) | (1.162) | (0.890) | (0.005) | (2.254) | | |

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | $S$, $d$ | $S^2$, $d$ | $S$, $d^2$ | $S^2$, $d^2$ |
| MQS | 0.679 | 0.686 | 0.681 | 0.682 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Guilfoos, T.; Pape, A.D. Estimating Case-Based Learning. *Games* **2020**, *11*, 38.
https://doi.org/10.3390/g11030038
