# The Prior Can Often Only Be Understood in the Context of the Likelihood

## Abstract

## 1. The Role of the Prior Distribution in a Bayesian Analysis

#### 1.1. The Practical Consequences of a Prior Can Depend on the Data

#### 1.2. Existing Methods for Setting Priors Already Depend on the Likelihood

#### 1.3. The Role of the Prior in Generative and Predictive Modeling

#### 1.4. Coherence and Cheating

## 2. A Simple Motivating Example

#### 2.1. Bayesian Analysis under Different Priors

#### 2.2. Understanding the Problem

## 3. When Exactly Is the Prior Irrelevant in Practice?

#### 3.1. Uniform Priors Are Not a Panacea and Can Do Unbounded Damage

#### 3.2. Asymptotics: So Close, Yet So Far Away

#### 3.3. For Complex Models, Certain Aspects of the Prior Will Always Be Relevant

## 4. A Prior Is More Than Just a Probability Measure, So We Need to Start Thinking Generatively

#### 4.1. When Is a Probability Distribution a Prior?

#### 4.2. Prior Choice Is Especially Important in High Dimensions

#### 4.3. Sensitivity of the Marginal Likelihood to the Prior

## 5. Generative Priors Need to Be Prediction Focused

#### 5.1. In the Sea of Complex Models, the Leviathan Is Overfitting

#### 5.2. Overfitting Leads to Poor Posterior Predictive Performance

#### 5.3. Don’t Forget Your Roots: Predictive Priors Aren’t Always Generative

## 6. Discussion

## Acknowledgments

## Author Contributions

## Conflicts of Interest

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Gelman, A.; Simpson, D.; Betancourt, M. The Prior Can Often Only Be Understood in the Context of the Likelihood. *Entropy* **2017**, *19*, 555.
https://doi.org/10.3390/e19100555