Adaptations on the Use of p-Values for Statistical Inference: An Interpretation of Messages from Recent Public Discussions
Abstract
1. Introduction
- p-values can indicate how incompatible the data are with a specified statistical model.
- p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
- Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
- Proper inference requires full reporting and transparency.
- A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
- By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
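The point that a p-value does not measure the size of an effect can be illustrated with a small numerical sketch. It assumes a one-sample z-test with unit variance and a hypothetical observed standardized effect of 0.1 that is held fixed while the sample size grows; the function name and the chosen sample sizes are illustrative assumptions, not part of the original analysis.

```python
import math

def two_sided_z_p(effect: float, n: int) -> float:
    """Two-sided p-value of a one-sample z-test with unit variance.

    `effect` is the observed standardized mean difference, so z = effect * sqrt(n).
    """
    z = abs(effect) * math.sqrt(n)
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))

effect = 0.1  # hypothetical value, identical for every sample size below
for n in (25, 400, 10_000):
    print(f"n = {n:>6}: effect = {effect}, p = {two_sided_z_p(effect, n):.3g}")
```

The effect is the same in all three runs, yet the p-value moves from clearly above 0.05 to far below it as n increases, which is why a p-value alone says little about the magnitude or importance of a result.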
2. Materials and Methods
3. Revisiting a Real-World Published Application
4. Discussion
- Is hypothesis testing (or some other approach to binary decision making) unsuitable as a methodological backbone of the empirical, inductive sciences?
- Should p-values (or Bayesian analogs of them) be banned as a basic tool of statistical inference?
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. R-Code for the Application
| | AUC (95% CI) | p-Value | s-Value (95% CI) | rri |
|---|---|---|---|---|
| sTREM-1 | 0.733 (0.585, 0.882) | 0.005 | 7.644 (1.543, 16.478) | 0.815 |
| IL-6 | 0.892 (0.808, 0.976) | | 21.575 (11.679, 25.336) | 1.000 |
| sTREM-1 vs. IL-6 | | 0.0534 | 4.227 (0.264, 11.810) | 0.441 |
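The s-value column of the table is consistent with the base-2 surprisal transform of the p-value, s = -log2(p), which expresses a p-value as bits of information against the tested model. The following is a minimal sketch of that transform, not the authors' R code from Appendix A; the function name `s_value` is an assumption made for illustration, and only the two p-values actually reported in the table are used.

```python
import math

def s_value(p: float) -> float:
    """Surprisal (s-value) in bits for a p-value: s = -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)

# p-values reported in the table
for label, p in [("sTREM-1", 0.005), ("sTREM-1 vs. IL-6", 0.0534)]:
    print(f"{label}: p = {p}, s = {s_value(p):.3f}")
```

Read this way, p = 0.005 carries about as much information against the tested hypothesis as seeing between seven and eight heads in a row from a coin assumed to be fair, while p = 0.0534 corresponds to roughly four heads in a row.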
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Verykouki, E.; Nakas, C.T. Adaptations on the Use of p-Values for Statistical Inference: An Interpretation of Messages from Recent Public Discussions. Stats 2023, 6, 539-551. https://doi.org/10.3390/stats6020035