Maximum Likelihood and Calibrating Prior Prediction Reliability Bias Reference Charts
Abstract
1. Introduction
1.1. Parameter Uncertainty
1.2. Background on Maximum Likelihood Reliability Bias
1.3. Background on Objective Bayesian Prediction
1.4. Study Objectives
1.5. Study Limitations
2. Methods
2.1. Reliability Testing Simulations
- Assign a model, a sample size n, and values for the model parameters, which we refer to as the true parameters.
- Given the true parameters, generate 100,000 sets of training data, each of the given sample size, using a random number generator for the model.
- From each of the training data sets, generate predictive quantiles at a number of NPs.
- Generate 100,000 testing values, using the same random number generator.
- Count the number of testing values that exceed each quantile prediction.
- Calculate the relative frequency of the number of values that exceeded each quantile prediction as a Monte Carlo estimate of the PCP and compare that estimate with the NP for that prediction.
- 4. Substitute each quantile prediction into the known distribution function based on the true parameters to calculate how often it will be exceeded. This gives the PCP for that quantile, for that single set of training data.
- 5. Average together the PCPs for all sets of training data to give an estimate of the overall PCP for each quantile.
- 6. Compare the estimated overall PCP with the NP.
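The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: it assumes the exponential distribution as the model, uses the variant in which each quantile prediction is substituted into the true distribution function, and uses 20,000 training sets rather than 100,000 to keep the run time short. The function name `simulate_ml_pcp` and its arguments are hypothetical.

```python
import math
import random


def simulate_ml_pcp(n, nominal_probs, n_train_sets=20000, true_mean=1.0, seed=0):
    """Estimate the PCP of maximum likelihood quantile predictions.

    Assumed model (for illustration only): exponential with mean `true_mean`.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(nominal_probs)
    for _ in range(n_train_sets):
        # Generate one training set of size n from the true model.
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
        # Maximum likelihood estimate of the exponential mean is the sample mean.
        mean_hat = sum(sample) / n
        for i, p in enumerate(nominal_probs):
            # Plug-in quantile that the fitted model says is exceeded with probability p.
            quantile = -mean_hat * math.log(p)
            # True exceedance probability of that quantile under the true parameters.
            totals[i] += math.exp(-quantile / true_mean)
    # Average over training sets to estimate the overall PCP for each NP.
    return [t / n_train_sets for t in totals]


if __name__ == "__main__":
    nps = [0.1, 0.01]
    for p, pcp in zip(nps, simulate_ml_pcp(10, nps)):
        print(f"NP = {p}: estimated PCP = {pcp:.4f}")
```

Comparing each estimated PCP with its NP gives the reliability bias for that sample size. Changing `true_mean` should leave the estimates essentially unchanged, since the exponential is a homogeneous model.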
2.2. Dependence on Parameter Values
2.2.1. Homogeneous Models
2.2.2. Inhomogeneous Models
2.3. Sampling Bias on Probabilities and Quantiles
2.4. Prediction Methods
2.4.1. Calibrating Prior Prediction
2.4.2. Evaluation of Bayesian Predictions
2.4.3. Parameter Ranges
3. Results
3.1. Investigating Point Estimate Prediction Reliability Biases
3.1.1. Normal Distribution Reliability Bias
3.1.2. Exponential Distribution Reliability Bias
3.1.3. Impact of Sample Size
3.1.4. Impact of Predictors
3.1.5. Impact of Model
3.1.6. Impact of GPD Shape Parameter
3.1.7. Impact of GEVD Shape Parameter
3.1.8. Impact of Estimator: Normal Distribution
3.1.9. Impact of Estimator: GEVD
3.2. Point Estimate Prediction Reliability Bias Charts
3.2.1. Chart Type 1: PCP vs. NP and Sample Size
3.2.2. Chart Type 1: Usage
- If a study with sample size n returns a maximum likelihood assessment of the probability of an event (or quantile), which is the NP, then we can use the appropriate chart to read off the PCP (i.e., the true probability) for that event from the contours, as a function of n (on the horizontal axis) and the NP (on the vertical axis). The PCPs are typically higher than the NPs because the tail of maximum likelihood predictions is typically too thin.
- Alternatively, if we wish to find the event with a given PCP from a maximum likelihood predictive distribution based on a sample size n, then we can locate that PCP on the contours at sample size n and read the corresponding NP from the vertical axis. The event with the given PCP is the event with that NP.
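For one simple case, the two directions of chart use can be mimicked with a closed-form expression instead of a chart. The sketch below is an illustration under stated assumptions, not a substitute for the charts: it assumes the exponential distribution fitted by maximum likelihood, for which the sample mean divided by the true mean follows a gamma distribution with shape n and scale 1/n, giving PCP = (1 - ln(NP)/n)^(-n). This formula is derived here for illustration and is not quoted from the study. The function names `pcp_from_np` and `np_from_pcp` are hypothetical.

```python
import math


def pcp_from_np(nominal_prob, n):
    """PCP of a maximum likelihood exponential prediction made at a given NP.

    Assumes an exponential model; uses PCP = (1 - ln(NP)/n) ** (-n).
    """
    a = -math.log(nominal_prob)
    return (1.0 + a / n) ** (-n)


def np_from_pcp(target_pcp, n):
    """NP at which to read the maximum likelihood quantile to achieve a target PCP.

    Inverts the formula used in pcp_from_np.
    """
    a = n * (target_pcp ** (-1.0 / n) - 1.0)
    return math.exp(-a)


if __name__ == "__main__":
    n = 20
    # First usage: the study reports NP = 0.01; what is the PCP?
    print("PCP for NP = 0.01:", pcp_from_np(0.01, n))
    # Second usage: we want PCP = 0.01; which NP should we use?
    print("NP for PCP = 0.01:", np_from_pcp(0.01, n))
```

In this assumed case the PCP exceeds the NP for every finite n, and the NP needed to achieve a given PCP is smaller than that PCP, which matches the direction of the bias described in the list above.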
3.2.3. Chart Type 1: Interpretation
3.2.4. Chart Type 2: PCP vs. NP and True Parameter
3.2.5. Chart Type 2: Interpretation
3.2.6. Chart Type 3: Max. Exceedance vs. True Parameter and Sample Size
3.3. Calibrating Prior Prediction Reliability Bias Charts
3.3.1. Chart Type 1: PCP vs. NP and Sample Size
3.3.2. Chart Type 2: PCP vs. NP and True Parameter
4. Discussion
5. Conclusions
Supplementary Materials
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Abbreviations Used in This Study
| Abbreviation | Meaning | Description |
| --- | --- | --- |
| CDF | Cumulative distribution function | |
| CP | Calibrating prior | A prior chosen so that resulting predictions are reliable, a.k.a. well calibrated |
| CP-DMGS | Calibrating prior and DMGS | A specification of the prior and numerical scheme used to evaluate the Bayesian prediction equation |
| DMGS equations | Datta–Mukerjee–Ghosh–Sweeting | A set of equations that give approximate Bayesian predictions |
| EDF | Exceedance distribution function | One minus the CDF, also known as the survival function |
| GEV | Generalized extreme value | A distribution, used for maxima |
| GEVD | Generalized extreme value distribution | |
| GP | Generalized Pareto | A distribution, used for tails |
| GPD | Generalized Pareto distribution | |
| NP | Nominal probability | The exceedance probability at which exceedance quantiles are to be calculated |
| PCP | Predictive coverage probability | The probability with which quantiles at a certain NP are exceeded |
| PDF | Probability density function | |
| PWM | Probability weighted moments | A point estimate method for fitting distributions, often used for the GEVD |
| RHP | Right Haar prior | A prior with certain beneficial mathematical properties |
Appendix B. References to Studies Using Point Estimate Prediction
- Eden et al. [18] used the GEVD to analyze extreme precipitation.
- Gudmundsson and Seneviratne [19] used the Gamma distribution to analyze extreme precipitation.
- Kew et al. [20] used the normal distribution to model extreme soil moisture, and the GPD to model extreme precipitation.
- Otto et al. [21] used the GPD to model extreme temperatures.
- Philip et al. [22] used the normal distribution to model extreme precipitation.
- Siswanto et al. [23] used the GEVD to model extreme precipitation.
- Stott et al. [24] used the GEVD to model extreme temperatures.
- van den Brink and Können [25] used the Gumbel distribution to model extreme wind speeds.
- van der Wiel et al. [26] used the GEVD to model extreme precipitation.
- Vautard et al. [27] used the Gumbel distribution to model precipitation.
- Thompson et al. [28] used the GEVD to model extreme temperature.
- Wehner et al. [29] used the GEVD to model extreme temperature and precipitation.
Appendix C. Details for the Statistical Models Considered
Appendix C.1. Exponential Distribution/Pareto Distribution with One Parameter
- The same charts apply to both models. Since both models are homogeneous, the results are independent of the parameter.
- Figure S1 shows the type 1 maximum likelihood reliability bias chart.
- Figure S83 shows the type 1 CP-Analytic reliability bias chart. Since both models are homogeneous, and we use an analytic solution, the CP-Analytic predictions are perfectly reliable and the contours are horizontal. The chart is only included for completeness.
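The perfect reliability described above can be checked numerically. The following is a minimal sketch, assuming that the calibrating prior for the exponential distribution is the right Haar prior for a scale parameter, proportional to one over the mean. That prior choice is an assumption of this example. Under it, the Bayesian predictive exceedance probability is (1 + x/S)^(-n), where S is the sum of the training data, so the predictive quantile at NP p is S (p^(-1/n) - 1). The function names are hypothetical.

```python
import math
import random


def cp_quantile(sample, nominal_prob):
    """Predictive quantile exceeded with probability `nominal_prob`.

    Assumes an exponential model with prior proportional to 1/mean
    (taken here to be the calibrating prior; an assumption of this sketch).
    """
    n = len(sample)
    total = sum(sample)
    return total * (nominal_prob ** (-1.0 / n) - 1.0)


def estimate_cp_pcp(n, nominal_prob, n_train_sets=20000, true_mean=1.0, seed=1):
    """Monte Carlo estimate of the PCP of the Bayesian quantile prediction."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_train_sets):
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
        q = cp_quantile(sample, nominal_prob)
        # True exceedance probability of the predicted quantile.
        acc += math.exp(-q / true_mean)
    return acc / n_train_sets


if __name__ == "__main__":
    for n in (5, 20):
        print(f"n = {n}: NP = 0.01, estimated PCP = {estimate_cp_pcp(n, 0.01):.4f}")
```

Up to Monte Carlo noise, the estimated PCP should match the NP at every sample size, which is why the contours on such a chart are horizontal.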
Appendix C.2. Exponential Regression/Pareto Regression
- The same charts apply to both models. Since both models are homogeneous, the results are independent of the parameters.
- Figure S2 shows the type 1 maximum likelihood reliability bias chart.
- Figure S84 shows the type 1 CP-DMGS reliability bias chart. Since both models are homogeneous, the slight deviations from perfect reliability for small sample sizes and low probabilities are due to the approximation in the DMGS equations.
Appendix C.3. Normal Distribution/Log-Normal Distribution
- The same charts apply to both models. Since both models are homogeneous, the results are independent of the parameters.
- Figure S3 shows the type 1 maximum likelihood reliability bias chart.
- Figure S85 shows the type 1 CP-Analytic reliability bias chart. Since both models are homogeneous, and we use an analytic solution, the predictions are perfectly reliable. The chart is only included for completeness.
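The statement that results for a homogeneous model are independent of the parameters can be illustrated by simulation. The sketch below is an assumption-laden example rather than the study's code: it fits a normal distribution by maximum likelihood (standard deviation with divisor n), evaluates the true exceedance probability of the plug-in quantile, and repeats the calculation for two different sets of true parameters using the same random seed. The function name `estimate_ml_normal_pcp` and the parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist


def estimate_ml_normal_pcp(n, nominal_prob, mu, sigma, n_train_sets=20000, seed=2):
    """Monte Carlo estimate of the PCP of maximum likelihood normal predictions."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard normal quantile exceeded with probability nominal_prob.
    z_p = std_normal.inv_cdf(1.0 - nominal_prob)
    acc = 0.0
    for _ in range(n_train_sets):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean_hat = sum(sample) / n
        # Maximum likelihood estimate of the standard deviation (divisor n).
        sd_hat = math.sqrt(sum((x - mean_hat) ** 2 for x in sample) / n)
        q = mean_hat + sd_hat * z_p
        # True exceedance probability of the predicted quantile.
        acc += 1.0 - std_normal.cdf((q - mu) / sigma)
    return acc / n_train_sets


if __name__ == "__main__":
    a = estimate_ml_normal_pcp(10, 0.01, mu=0.0, sigma=1.0)
    b = estimate_ml_normal_pcp(10, 0.01, mu=50.0, sigma=7.0)
    print("PCP with mu=0, sigma=1: ", a)
    print("PCP with mu=50, sigma=7:", b)
```

Because the same seed produces the same underlying standard normal draws, the two estimates should agree to within floating-point rounding, so a single chart can serve for all parameter values.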
Appendix C.4. Normal Distribution/Log-Normal Distribution, for Method of Moments
- Figure S4 shows the type 1 reliability bias chart for the normal and log-normal distributions fitted using the method of moments.
Appendix C.5. Gaussian Linear Regression with 1 Predictor/Log-Normal Regression with 1 Predictor
- The same charts apply to both models. Since both models are homogeneous, the results are independent of the parameters.
- Figure S5 shows the type 1 maximum likelihood reliability bias chart.
- Figure S87 shows the type 1 CP-Analytic reliability bias chart. Since both models are homogeneous, and since we use an analytic solution, the predictions are perfectly reliable. The chart is only included for completeness.
Appendix C.6. Logistic Distribution
- Since this model is homogeneous, the results are independent of the parameters.
- Figure S6 shows the type 1 maximum likelihood reliability bias chart.
- Figure S88 shows the type 1 CP-DMGS reliability bias chart. Since this model is homogeneous, the slight deviations from perfect reliability for small sample sizes and low probabilities are due to the approximation in the DMGS equations.
Appendix C.7. Gumbel Distribution
- Since this model is homogeneous, the results are independent of the parameters.
- Figure S7 shows the type 1 maximum likelihood reliability bias chart.
- Figure S89 shows the type 1 CP-DMGS reliability bias chart for the Gumbel distribution. Since this model is homogeneous, the slight deviations from perfect reliability for small sample sizes and low probabilities are due to the approximation in the DMGS equations.
Appendix C.8. Gumbel Linear Regression
- Since this model is homogeneous, the results are independent of the parameters.
- Figure S8 shows the type 1 maximum likelihood reliability bias chart.
- Figure S90 shows the type 1 CP-DMGS reliability bias chart. Since this model is homogeneous, the slight deviations from perfect reliability for small sample sizes and low probabilities are due to the approximation in the DMGS equations.
Appendix C.9. Fréchet Distribution with Fixed Location Parameter
- Since this model is homogeneous, the results are independent of the parameters.
- Figure S9 shows the type 1 maximum likelihood reliability bias chart.
- Figure S91 shows the type 1 CP-DMGS reliability bias chart. Since this model is homogeneous, the slight deviations from perfect reliability for small sample sizes and low probabilities are due to the approximation in the DMGS equations.
Appendix C.10. Weibull Distribution
- Since this model is homogeneous, the results are independent of the parameters.
- Figure S10 shows the type 1 maximum likelihood reliability bias chart.
- Figure S92 shows the type 1 CP-DMGS reliability bias chart. Since this model is homogeneous, the slight deviations from perfect reliability for small sample sizes and low probabilities are due to the approximation in the DMGS equations.
Appendix C.11. Generalized Extreme Value Distribution
- Since this model is inhomogeneous, the results depend on the shape parameter.
- Type 1 charts for maximum likelihood prediction are shown in Figures S11 to S19 for a range of values of the shape parameter.
- Type 2 charts for maximum likelihood prediction are shown in Figures S47 to S54 for a range of sample sizes.
- A type 3 chart for maximum likelihood prediction is shown in Figure S79.
- The calibrating prior GEVD results are based on predictions generated using the prior recommended as a calibrating prior by Jewson et al. [11].
- Type 1 CP-DMGS reliability bias charts are shown in Figures S93 to S101 for a range of values of the shape parameter.
- Type 2 CP-DMGS charts are shown in Figures S129 to S136 for a range of sample sizes.
- Type 3 charts are not relevant for Bayesian predictions, as there is no upper limit to the random variable in the predictions.
Appendix C.12. Generalized Extreme Value 1 Predictor Regression
- Since this model is inhomogeneous, the results depend on the shape parameter.
- Type 1 charts for maximum likelihood prediction are shown in Figures S20 to S28 for a range of values of the shape parameter.
- Type 2 charts for maximum likelihood prediction are shown in Figures S55 to S62 for a range of sample sizes.
- A type 3 chart for maximum likelihood prediction is shown in Figure S80.
- The calibrating prior GEVD-with-1-predictor results are based on predictions generated using the prior recommended as a calibrating prior by Jewson et al. [11].
- Type 1 CP-DMGS reliability bias charts are shown in Figures S102 to S110 for a range of values of the shape parameter.
- Type 2 CP-DMGS charts are shown in Figures S137 to S144 for a range of sample sizes.
Appendix C.13. Generalized Extreme Value 2 Predictor Regression
- Since this model is inhomogeneous, the results depend on the shape parameter.
- Type 1 charts for maximum likelihood prediction are shown in Figures S29 to S37 for a range of values of the shape parameter.
- Type 2 charts for maximum likelihood prediction are shown in Figures S63 to S70 for a range of sample sizes.
- A type 3 chart for maximum likelihood prediction is shown in Figure S81.
- The calibrating prior GEVD-with-2-predictors results are based on predictions generated using the prior recommended as a calibrating prior by Jewson et al. [11].
- Type 1 CP-DMGS reliability bias charts are shown in Figures S111 to S119 for a range of values of the shape parameter.
- Type 2 CP-DMGS reliability bias charts are shown in Figures S145 to S152 for a range of sample sizes.
Appendix C.14. Generalized Pareto Distribution
- Since this model is inhomogeneous, the results depend on the shape parameter.
- Type 1 charts for maximum likelihood prediction are shown in Figures S38 to S46 for a range of values of the shape parameter.
- Type 2 charts for maximum likelihood prediction are shown in Figures S71 to S78 for a range of sample sizes.
- A type 3 chart for maximum likelihood prediction is shown in Figure S82.
- The calibrating prior GPD results are based on predictions generated using the prior recommended as a calibrating prior by Jewson et al. [11].
- Type 1 CP-DMGS reliability bias charts are shown in Figures S120 to S128 for a range of values of the shape parameter.
- Type 2 CP-DMGS charts are shown in Figures S153 to S160 for a range of sample sizes.
References
1. Priestley, M.D.K.; Stephenson, D.B.; Scaife, A.A.; Bannister, D.; Allen, C.J.T.; Wilkie, D. Return levels of extreme European windstorms, their dependency on the North Atlantic Oscillation, and potential future risks. Nat. Hazards Earth Syst. Sci. 2023, 23, 3845–3861.
2. Barnes, C.; Jain, P.; Keeping, T.R.; Gillett, N.; Boucher, J.; Gachon, P.; Heinrich, D.; Kirchmeier-Young, M.; Boulanger, Y. Disentangling the roles of natural variability and climate change in Canada’s 2023 fire season. Environ. Res. Clim. 2025, 4, 035013.
3. Cho, E.; Ahmadisharaf, E.; Villarini, G.; AghaKouchak, A. Historical changes in overtopping probability of dams in the United States. Nat. Commun. 2025, 16, 6693.
4. Quilcaille, Y.; Gudmundsson, L.; Schumacher, D.L.; Gasser, T.; Heede, R.; Heri, C.; Lejeune, Q.; Nath, S.; Naveau, P.; Thiery, W.; et al. Systematic attribution of heatwaves to the emissions of carbon majors. Nature 2025, 645, 392–398.
5. Bernardo, J.; Smith, A. Bayesian Theory; Wiley: Hoboken, NJ, USA, 1993.
6. Geisser, S. Predictive Inference: An Introduction; Chapman and Hall: New York, NY, USA, 1993.
7. Root, H.E. Probability Statements in Weather Forecasting. J. Appl. Meteorol. Climatol. 1962, 1, 163–168.
8. Niculescu-Mizil, A.; Caruana, R. Predicting good probabilities with supervised learning. In Proceedings of the 22nd International Conference on Machine Learning, ICML ’05, Bonn, Germany, 7–11 August 2005; Association for Computing Machinery: New York, NY, USA, 2005; pp. 625–632.
9. Gerrard, R.; Tsanakas, A. Failure Probability Under Parameter Uncertainty. Risk Anal. 2011, 31, 727–744.
10. Blanco, D.; Weng, A. Practical aspects of modelling parameter uncertainty for risk capital calculation. Z. Für Die Gesamte Versicherungswissenschaft 2019, 108, 43–62.
11. Jewson, S.; Sweeting, T.; Jewson, L. Reducing reliability bias in assessments of extreme weather risk using calibrating priors. ASCMO 2025, 11, 1–22.
12. Severini, T.; Mukerjee, R.; Ghosh, M. On an exact probability matching property of right-invariant priors. Biometrika 2002, 89, 952–957.
13. Fraser, D. The Fiducial Method and Invariance. Biometrika 1961, 48, 261–280.
14. Hora, R.B.; Buehler, R.J. Fiducial Theory and Invariant Estimation. Ann. Math. Stat. 1966, 37, 643–656.
15. Datta, G.; Mukerjee, R.; Ghosh, M.; Sweeting, T. Bayesian prediction with approximate frequentist validity. Ann. Stat. 2000, 28, 1414–1426.
16. Coles, S.G.; Dixon, M.J. Likelihood-Based Inference for Extreme Value Models. Extremes 1999, 2, 5–23.
17. Jewson, S. Fitdistcp, R package version 0.1.1; 2025. Available online: https://cran.r-project.org/web/packages/fitdistcp/index.html (accessed on 29 October 2025).
18. Eden, J.M.; Kew, S.F.; Bellprat, O.; Lenderink, G.; Manola, I.; Omrani, H.; van Oldenborgh, G.J. Extreme precipitation in the Netherlands: An event attribution case study. Weather Clim. Extrem. 2018, 21, 90–101.
19. Gudmundsson, L.; Seneviratne, S.I. Anthropogenic climate change affects meteorological drought risk in Europe. Environ. Res. Lett. 2016, 11, 044005.
20. Kew, S.; Philip, S.; van Oldenborgh, G.J.; van der Schrier, G.; Otto, F.E.L.; Vautard, R. The Exceptional Summer Heat Wave in Southern Europe 2017. Bull. Am. Meteorol. Soc. 2019, 100, S49–S53.
21. Otto, F.E.L.; Massey, N.; van Oldenborgh, G.J.; Jones, R.G.; Allen, M.R. Reconciling two approaches to attribution of the 2010 Russian heat wave. Geophys. Res. Lett. 2012, 39, L04702.
22. Philip, S.; Kew, S.F.; van Oldenborgh, G.J.; Otto, F.; O’Keefe, S.; Haustein, K.; King, A.; Zegeye, A.; Eshetu, Z.; Hailemariam, K.; et al. Attribution Analysis of the Ethiopian Drought of 2015. J. Clim. 2018, 31, 2465–2486.
23. Siswanto; van Oldenborgh, G.J.; van der Schrier, G.; Lenderink, G.; van den Hurk, B. Trends in High-Daily Precipitation Events in Jakarta and the Flooding of January 2014. Bull. Am. Meteorol. Soc. 2015, 96, S131–S135.
24. Stott, P.A.; Christidis, N.; Otto, F.E.L.; Sun, Y.; Vanderlinden, J.P.; van Oldenborgh, G.J.; Vautard, R.; von Storch, H.; Walton, P.; Yiou, P.; et al. Attribution of extreme weather and climate-related events. WIREs Clim. Chang. 2016, 7, 23–41.
25. van den Brink, H.W.; Können, G.P. Estimating 10000-year return values from short time series. Int. J. Climatol. 2011, 31, 115–126.
26. van der Wiel, K.; Kapnick, S.B.; van Oldenborgh, G.J.; Whan, K.; Philip, S.; Vecchi, G.A.; Singh, R.K.; Arrighi, J.; Cullen, H. Rapid attribution of the August 2016 flood-inducing extreme precipitation in south Louisiana to climate change. Hydrol. Earth Syst. Sci. 2017, 21, 897–921.
27. Vautard, R.; Yiou, P.; van Oldenborgh, G.J.; Lenderink, G.; Thao, S.; Ribes, A.; Planton, S.; Dubuisson, B.; Soubeyroux, J.M. Extreme Fall 2014 Precipitation in the Cévennes Mountains. Bull. Am. Meteorol. Soc. 2015, 96, S56–S60.
28. Thompson, V.; Mitchell, D.; Hegerl, G.; Collins, M.; Leach, N.; Slingo, J. The most at-risk regions in the world for high-impact heatwaves. Nat. Commun. 2023, 14, 2152.
29. Wehner, M.; Gleckler, P.; Lee, J. Characterization of long period return values of extreme daily temperature and precipitation in the CMIP6 models: Part 1, model evaluation. Weather Clim. Extrem. 2020, 30, 100283.














© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jewson, S. Maximum Likelihood and Calibrating Prior Prediction Reliability Bias Reference Charts. Stats 2025, 8, 109. https://doi.org/10.3390/stats8040109

