Article

Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures

Benjamin Kompa, Jasper Snoek and Andrew L. Beam
1 Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
2 Google Research, Cambridge, MA 02142, USA
3 Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA
* Author to whom correspondence should be addressed.
Academic Editors: Eric Nalisnick and Dustin Tran
Entropy 2021, 23(12), 1608; https://doi.org/10.3390/e23121608
Received: 1 October 2021 / Revised: 17 November 2021 / Accepted: 24 November 2021 / Published: 30 November 2021
(This article belongs to the Special Issue Probabilistic Methods for Deep Learning)
Uncertainty quantification for complex deep learning models is increasingly important as these techniques see growing use in high-stakes, real-world settings. Currently, the quality of a model’s uncertainty is evaluated using point-prediction metrics, such as the negative log-likelihood (NLL), expected calibration error (ECE) or the Brier score on held-out data. Marginal coverage of prediction intervals or sets, a well-known concept in the statistical literature, is an intuitive alternative to these metrics but has yet to be systematically studied for many popular uncertainty quantification techniques for deep learning models. With marginal coverage and the complementary notion of the width of a prediction interval, downstream users of deployed machine learning models can better understand uncertainty quantification both on a global dataset level and on a per-sample basis. In this study, we provide the first large-scale evaluation of the empirical frequentist coverage properties of well-known uncertainty quantification techniques on a suite of regression and classification tasks. We find that, in general, some methods do achieve desirable coverage properties on in-distribution samples, but that coverage is not maintained on out-of-distribution data. Our results demonstrate the failings of current uncertainty quantification techniques as dataset shift increases and reinforce coverage as an important metric in developing models for real-world applications.
Keywords: uncertainty quantification; coverage; Bayesian methods; dataset shift
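
A minimal sketch (in Python; not the authors' code, and the function name and example arrays are illustrative) of the two quantities the abstract pairs together: empirical marginal coverage, i.e., the fraction of held-out targets that fall inside their prediction intervals, and the average width of those intervals.

import numpy as np

def empirical_coverage_and_width(y_true, lower, upper):
    """Return (fraction of targets inside [lower, upper], mean interval width)."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    covered = (y_true >= lower) & (y_true <= upper)
    return covered.mean(), (upper - lower).mean()

# Intervals could come from any uncertainty quantification method
# (ensembles, MC dropout, Gaussian processes, etc.); values here are made up.
y = np.array([1.2, 0.7, 3.4, 2.1])
lo = np.array([0.9, 0.1, 2.0, 2.3])
hi = np.array([1.5, 1.0, 4.0, 3.0])
coverage, width = empirical_coverage_and_width(y, lo, hi)
print(f"coverage = {coverage:.2f}, mean width = {width:.2f}")  # coverage = 0.75, mean width = 1.05

A well-calibrated 90% prediction interval should achieve empirical coverage close to 0.90 on in-distribution data; the paper's central finding is that this property degrades as dataset shift increases.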

MDPI and ACS Style

Kompa, B.; Snoek, J.; Beam, A.L. Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures. Entropy 2021, 23, 1608. https://doi.org/10.3390/e23121608
