An approach to the assessment of probabilistic inference is described which quantifies the performance on the probability scale. From both information and Bayesian theory, the central tendency of an inference is proven to be the geometric mean of the probabilities reported for the actual outcome and is referred to as the “Accuracy”. Upper and lower error bars on the accuracy are provided by the arithmetic mean and the −2/3 mean. The arithmetic is called the “Decisiveness” due to its similarity with the cost of a decision and the −2/3 mean is called the “Robustness”, due to its sensitivity to outlier errors. Visualization of inference performance is facilitated by plotting the reported model probabilities versus the histogram calculated source probabilities. The visualization of the calibration between model and source is summarized on both axes by the arithmetic, geometric, and −2/3 means. From information theory, the performance of the inference is related to the cross-entropy between the model and source distribution. Just as cross-entropy is the sum of the entropy and the divergence; the accuracy of a model can be decomposed into a component due to the source uncertainty and the divergence between the source and model. Translated to the probability domain these quantities are plotted as the average model probability versus the average source probability. The divergence probability is the average model probability divided by the average source probability. When an inference is over/under-confident, the arithmetic mean of the model increases/decreases, while the −2/3 mean decreases/increases, respectively.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited