Symmetry Properties of Bi-Normal and Bi-Gamma Receiver Operating Characteristic Curves are Described by Kullback-Leibler Divergences

Receiver operating characteristic (ROC) curves have application in analysis of the performance of diagnostic indicators used in the assessment of disease risk in clinical and veterinary medicine and in crop protection. For a binary indicator, an ROC curve summarizes the two distributions of risk scores obtained by retrospectively categorizing subjects as cases or controls using a gold standard. An ROC curve may be symmetric about the negative diagonal of the graphical plot, or skewed towards the left-hand axis or the upper axis of the plot. ROC curves with different symmetry properties may have the same area under the curve. Here, we characterize the symmetry properties of bi-Normal and bi-gamma ROC curves in terms of the Kullback-Leibler divergences (KLDs) between the case and control distributions of risk scores. The KLDs describe the known symmetry properties of bi-Normal ROC curves, and newly characterize the symmetry properties of constant-shape and constant-scale bi-gamma ROC curves. It is also of interest to note an application of KLDs where their asymmetry—often an inconvenience—has a useful interpretation.


Introduction
Receiver operating characteristic (ROC) curve analysis provides a basis for describing the performance of a diagnostic indicator when deployed in a binary diagnostic test. ROC curve analysis has found application in clinical medicine, veterinary medicine and crop protection (e.g., [1,2,3]). For a comprehensive overview of the methodology, see [4,5].
For the purpose of the present work, an outline description of the process by which an ROC curve may be derived allows us to introduce our terminology and notation. We refer generically to data provided by the diagnostic indicator as "risk scores". During the process of characterizing a diagnostic indicator, a risk score is recorded for each of a number of experimental subjects. Each subject is also classified definitively as either a "case" (e.g., subject is diseased) or a "control" (e.g., subject is healthy) by a gold standard assessment (independent of the putative indicator). The ultimate goal of the experimental procedure as described is to provide a basis for decision-making in practice that does not require reference to the gold standard. When the decision in question is binary, an ROC curve is a useful summary of the performance of the diagnostic indicator [6].
We now have a number of subjects, and two values for each: a risk score provided by means of the diagnostic indicator and the true status (case or control) provided by the gold standard. We can present the results graphically as frequency distributions of risk scores plotted separately for cases and controls. It is normal practice to calibrate the output of the diagnostic indicator so that higher risk scores tend to be associated with case status, and lower risk scores tend to be associated with control status. Typically, then, the mean of the distribution of risk scores for cases is larger than the mean of the distribution of risk scores for controls.
An ROC curve is, in essence, a summary of the (normalized) frequency distributions of risk scores for cases and controls. In this article, we are concerned with the properties of ROC curves based on continuous parametric models for the distributions of risk scores (e.g., [5,7]). In practice, then, model parameters must be estimated from the experimental data; we do not describe this part of the analysis. For a continuous indicator variable X we refer to the resulting probability density functions (pdfs) as $f_1(x)$ (for cases) and $f_2(x)$ (for controls). The corresponding cumulative distribution functions (cdfs) are $F_1(x)$ and $F_2(x)$, respectively. Now, consider the graphical plot of the pdfs of risk scores plotted separately for cases and controls. A diagnostic indicator and a threshold risk score together constitute a diagnostic test. In the process of developing a diagnostic test, our task is to characterize a threshold on the risk score scale such that subjects with risk scores above the threshold will be treated, and subjects with risk scores at or below the threshold will not be treated. The problem is that, typically, the distributions of risk scores for cases and controls overlap, so that there is no unequivocal "best" threshold risk score. Consider a particular choice of threshold risk score, and recall that we are working with the pdfs of risk scores for cases and controls. The proportion of cases correctly classified is the true positive proportion (TPP) and the proportion of controls correctly classified is the true negative proportion (TNP). The false negative proportion is FNP = 1 − TPP and the false positive proportion is FPP = 1 − TNP. The values of these proportions change with the choice of threshold risk score.
An ROC curve is a graphical plot of TPP against FPP, with pairs of TPP and FPP values obtained by allowing a single threshold risk score to vary over the range of the indicator variable. Thus, points along the curve represent potential thresholds on the scale of the indicator variable, from each of which a binary test may be characterized. An ROC curve can therefore provide a useful summary of the characteristics of an indicator variable used as the basis for a binary test. Depending on the choice of model for risk scores for cases and controls, it may be possible to write down an analytical equation for the ROC curve, but this is immaterial in the present context. ROC curves that are monotone increasing above the main diagonal of the plot over the whole domain are sometimes referred to as "proper" ROC curves (see, e.g., Section 4.6 in [1]). Some continuous parametric ROC curves are proper, some are not; for example, it is well known that the bi-Normal ROC curve is not in general proper, while the bi-gamma ROC curve is proper [8].
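The threshold sweep described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure used in any of the cited studies: risk scores are assumed to be Normally distributed (using the standard-library `statistics.NormalDist`), and the function name `roc_points`, the threshold grid and the parameter values are hypothetical choices made for the example.

```python
from statistics import NormalDist

def roc_points(cases, controls, thresholds):
    """Return one (FPP, TPP) pair per threshold risk score."""
    points = []
    for t in thresholds:
        tpp = 1.0 - cases.cdf(t)     # proportion of cases scoring above t
        fpp = 1.0 - controls.cdf(t)  # proportion of controls scoring above t
        points.append((fpp, tpp))
    return points

# Hypothetical (assumed) distributions of risk scores.
cases = NormalDist(mu=3.4, sigma=1.0)
controls = NormalDist(mu=2.0, sigma=1.0)

# Thresholds from -3.0 to 9.0 in steps of 0.1.
thresholds = [i / 10 for i in range(-30, 91)]
curve = roc_points(cases, controls, thresholds)
```

Each element of `curve` is one candidate binary test. Raising the threshold moves the point towards the origin of the plot, lowering both FPP and TPP.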
While on the one hand the ROC curve represents a summary of the distributions of risk scores for cases and controls, on the other there are methods by which a summary of the ROC curve itself is sought [7,9]. By far the most common single-figure ROC curve summary measure in use is the area under the curve (AUC) as an index of diagnostic accuracy (e.g., [10]). Briefly, the idea is that diagnostic indicators with ROC curves which pass close to the top left-hand corner of the graphical plot of TPP against FPP (high AUC) provide tests for which TPP and TNP are both high, offering good discrimination between cases and controls. Diagnostic indicators with ROC curves close to the main diagonal of the plot of TPP against FPP (low AUC) have little to offer in terms of discrimination between cases and controls. However, in the present context, the AUC is unsuitable for use in the description of the symmetry properties of ROC curves. It is not difficult to see that ROC curves with the same AUC may have different symmetry properties (e.g., Figure 2A in [11]; Figure 2 in [12]).
This article describes the symmetry properties of some parametric ROC curves based on continuous distributions. The article is set out as follows. The generic symmetry properties of ROC curves are described graphically. The application of the Kullback-Leibler divergence is outlined within the context of the present work. Some useful properties of the Pareto distribution are illustrated. The symmetry properties of bi-Normal, bi-exponential and bi-gamma ROC curves are analyzed. A general discussion is provided.

Geometric Symmetry of ROC Curves
Geometric symmetry of ROC curves refers to an axis of symmetry that is the negative diagonal of the ROC plot (i.e., the line TPP = TNP). Green and Swets [13], Killeen and Taylor [14] and Hughes [15] have discussed the conditions for symmetry of ROC curves. However, ROC curves may be asymmetrical (skewed); for example, the curve may "cling to the left edge of the ROC space longer than it does to the top" [16]. We refer to this kind of skew as TPP-asymmetry, and to the kind of skew where the curve clings to the top edge of the ROC space longer than it does to the left as TNP-asymmetry [17]. Figure 1 provides graphical definitions of these symmetry and asymmetry properties.
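The graphical definitions can be turned into a simple numerical check. The sketch below is an illustration only: it assumes Normal risk scores, and the helper names `binormal_roc` and `symmetry_gap` and all parameter values are hypothetical. For a point A = (a, b) on the curve, lying to the left of the negative diagonal, it evaluates the curve at FPP = 1 − b and compares the result with 1 − a. Following the definitions in Figure 1, a value of zero corresponds to symmetry, a negative value to TPP-asymmetry and a positive value to TNP-asymmetry.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def binormal_roc(fpp, mu1, sigma1, mu2, sigma2):
    """TPP at a given FPP (0 < fpp < 1) for Normal cases (1) and controls (2)."""
    # Find the threshold t with 1 - F2(t) = fpp, then return TPP = 1 - F1(t).
    threshold = mu2 + sigma2 * STD_NORMAL.inv_cdf(1.0 - fpp)
    return 1.0 - NormalDist(mu1, sigma1).cdf(threshold)

def symmetry_gap(roc, a):
    """Return roc(1 - b) - (1 - a) for the point (a, b) = (a, roc(a)).

    The gap is zero when the mirror image (1 - b, 1 - a) is also on the curve.
    """
    b = roc(a)
    return roc(1.0 - b) - (1.0 - a)

a = 0.1  # assumed FPP of point A, to the left of the negative diagonal

# Equal standard deviations, then cases more dispersed, then controls more dispersed.
gap_sym = symmetry_gap(lambda t: binormal_roc(t, 3.4, 1.0, 2.0, 1.0), a)
gap_tpp = symmetry_gap(lambda t: binormal_roc(t, 3.4, 2.0, 2.0, 1.0), a)
gap_tnp = symmetry_gap(lambda t: binormal_roc(t, 3.4, 1.0, 2.0, 2.0), a)
```

The sign convention applies only to points to the left of the negative diagonal (a < a* in Figure 1); for points on the other side the sign of the gap is reversed.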

Kullback-Leibler Divergences
For a continuous indicator variable X we denote pdfs $f_1(x)$ (for cases) and $f_2(x)$ (for controls). Then the Kullback-Leibler divergences (KLDs) [18] are $I(f_1,f_2)$ (with cases as the comparison distribution and controls as the reference distribution):

$$I(f_1,f_2) = \int_D f_1(x)\,\ln\frac{f_1(x)}{f_2(x)}\,dx$$

and $I(f_2,f_1)$ (with controls as the comparison distribution and cases as the reference distribution):

$$I(f_2,f_1) = \int_D f_2(x)\,\ln\frac{f_2(x)}{f_1(x)}\,dx$$

where D is the common support of $f_1$ and $f_2$.

From Cover and Thomas [19] (who refer to continuous KLDs as differential relative entropies) we note that $I(f_1,f_2) \ge 0$ and $I(f_2,f_1) \ge 0$, with equality only if $f_1(x)$ and $f_2(x)$ are identical. Typically, $I(f_1,f_2) \ne I(f_2,f_1)$ [10], although for an ROC curve based on $f_1(x)$ (for cases) and $f_2(x)$ (for controls) that is symmetric about the negative diagonal, $I(f_1,f_2) = I(f_2,f_1)$ [17]. A KLD can be interpreted as a kind of distance between probability distributions [19], although the asymmetry in its arguments (apart from some special cases) clearly indicates it is not a distance in the Euclidean sense. We will work in natural logarithms, so the KLDs are denominated in nits [20]. For a discussion of measures of distance between distributions as used in summarizing ROC curves, see Section 4.3.4 in [1].
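As a rough numerical illustration of the definitions, a KLD can be approximated by direct integration of $f_1(x)\ln[f_1(x)/f_2(x)]$. The sketch below is not a method from the article: the function name `kld`, the midpoint rule, the integration limits and the two Normal distributions are all assumptions made for the example.

```python
import math
from statistics import NormalDist

def kld(f1, f2, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of f1 * ln(f1 / f2)."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width
        p, q = f1(x), f2(x)
        if p > 0.0 and q > 0.0:  # skip points where a density underflows to zero
            total += p * math.log(p / q)
    return total * width  # natural logarithms, so the result is in nits

# Assumed example: Normal cases and controls with equal standard deviations.
cases = NormalDist(3.4, 1.0)
controls = NormalDist(2.0, 1.0)

i_12 = kld(cases.pdf, controls.pdf, -10.0, 15.0)  # cases as comparison distribution
i_21 = kld(controls.pdf, cases.pdf, -10.0, 15.0)  # controls as comparison distribution
```

With equal standard deviations the two divergences coincide, which is one of the special cases in which the KLD is symmetric in its arguments.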

The Pareto Distribution
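Now, without for the moment invoking any ROC-related context, consider the Pareto densities:

$$f_1(x) = \frac{\alpha_1 \beta^{\alpha_1}}{x^{\alpha_1+1}}, \qquad f_2(x) = \frac{\alpha_2 \beta^{\alpha_2}}{x^{\alpha_2+1}}, \qquad x \ge \beta > 0;\ \alpha_1, \alpha_2 > 0.$$

Writing $z = \alpha_1/\alpha_2$ (see (3) in [21]) we obtain KLDs for two Pareto distributions, as follows:

$$I(f_1,f_2) = \ln z + \frac{1}{z} - 1, \qquad I(f_2,f_1) = z - 1 - \ln z.$$

Since $\ln z \le z - 1$, with equality only if z = 1 (lemma 6.1 in [21]), we have both $I(f_1,f_2) \ge 0$ and $I(f_2,f_1) \ge 0$, with equality only if $f_1(x)$ and $f_2(x)$ are identical (i.e., if $\alpha_1 = \alpha_2$), as required.
Figure 2A shows the graphical plots of the two Pareto KLDs as functions of z, from which it appears that (for z > 0):

$$I(f_1,f_2) > I(f_2,f_1) \ \text{for } z < 1; \qquad I(f_1,f_2) = I(f_2,f_1) \ \text{for } z = 1; \qquad I(f_1,f_2) < I(f_2,f_1) \ \text{for } z > 1.$$

It turns out to be easier to characterize the inequality portrayed in Figure 2A if we calculate the derivatives:

$$\frac{dI(f_1,f_2)}{dz} = \frac{z-1}{z^2}, \qquad \frac{dI(f_2,f_1)}{dz} = \frac{z-1}{z}.$$

We then note that

$$\frac{dI(f_2,f_1)}{dz} - \frac{dI(f_1,f_2)}{dz} = \left(\frac{z-1}{z}\right)^2 \ge 0,$$

with equality only if z = 1 (see also Figure 2B), and that this inequality describes the relationship between $I(f_1,f_2)$ and $I(f_2,f_1)$ shown in Figure 2. We will use these results on the Pareto distribution in the following sections.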
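Figure 2. Graphical plots of the two Pareto KLDs (A) and of their derivatives (B) as functions of z, one drawn as a solid line and the other as a dashed line.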
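The behaviour plotted in Figure 2 can be explored numerically. The sketch below rests on an assumption: it takes two classical Pareto densities with shape parameters α1 and α2 and a common scale, and sets z = α1/α2, in which case the two KLDs reduce to ln z + 1/z − 1 and z − 1 − ln z. The function names `pareto_klds` and `derivative_gap` are hypothetical. The second function estimates the derivative of the difference between the two KLDs by central differences, for comparison with the closed form (1 − 1/z)².

```python
import math

def pareto_klds(z):
    """Return (I(f1,f2), I(f2,f1)) for z = alpha1 / alpha2 > 0 (assumed parameterization)."""
    i_12 = math.log(z) + 1.0 / z - 1.0
    i_21 = z - 1.0 - math.log(z)
    return i_12, i_21

def derivative_gap(z, h=1e-5):
    """Central-difference estimate of d/dz [I(f2,f1) - I(f1,f2)]."""
    def diff(t):
        i_12, i_21 = pareto_klds(t)
        return i_21 - i_12
    return (diff(z + h) - diff(z - h)) / (2.0 * h)

for z in (0.25, 0.5, 1.0, 2.0, 4.0):
    i_12, i_21 = pareto_klds(z)
    print(f"z={z:<5} I(f1,f2)={i_12:.4f}  I(f2,f1)={i_21:.4f}  "
          f"gap'={derivative_gap(z):.4f}  (1-1/z)^2={(1 - 1 / z) ** 2:.4f}")
```

Because the derivative of the difference is a square, it is never negative, so the difference I(f2,f1) − I(f1,f2) increases through zero at z = 1. This is why the two curves cross exactly once.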

The Bi-Normal ROC Curve
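For a continuous indicator variable X we have $X \sim N(\mu_1, \sigma_1^2)$ for cases and $X \sim N(\mu_2, \sigma_2^2)$ for controls, where $\mu_1$ and $\sigma_1^2$ denote the mean and variance, respectively, of $f_1(x)$; and $\mu_2$ and $\sigma_2^2$ denote the mean and variance, respectively, of $f_2(x)$. The indicator variable is calibrated so that $\mu_1 > \mu_2$.
First we consider the symmetric bi-Normal ROC curve. For such curves, the standard deviations of the case and control distributions are equal, $\sigma_1 = \sigma_2 = \sigma$ [1]. Also, for symmetric ROC curves in general, $I(f_1,f_2) = I(f_2,f_1)$, and for the symmetric bi-Normal ROC curve in particular [23]:

$$I(f_1,f_2) = I(f_2,f_1) = \frac{(\mu_1-\mu_2)^2}{2\sigma^2}.$$

For a numerical example, consider Killeen and Taylor's Figure 1 (top) in [14]. In this example, the distribution of risk scores for cases $f_1(x)$ is Normal with mean $\mu_1 = 3.4$ and standard deviation $\sigma_1 = 1$ and the distribution of risk scores for controls $f_2(x)$ is Normal with mean $\mu_2 = 2$ and standard deviation $\sigma_2 = 1$. The resulting ROC curve is geometrically symmetric [14] and $I(f_1,f_2) = I(f_2,f_1) = 0.980$ nits.
Asymmetric bi-Normal ROC curves are discussed by Green and Swets [13], Pepe [1] and Marzban [24]. In the terminology of the present article, bi-Normal ROC curves are TPP-asymmetric when $\sigma_2/\sigma_1 < 1$ and TNP-asymmetric when $\sigma_2/\sigma_1 > 1$. Writing in the context of applications of bi-Normal indicators in clinical epidemiology, Pepe [1] notes that the distribution of risk scores for controls is typically less dispersed than the distribution of risk scores for cases, in which case a typical bi-Normal ROC curve would be TPP-asymmetric (e.g., Figure 4.1 in [1], where $\sigma_2/\sigma_1 = 0.85$). The KLDs are now (e.g., [15]):

$$I(f_1,f_2) = \ln\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1-\mu_2)^2}{2\sigma_2^2} - \frac{1}{2}, \qquad I(f_2,f_1) = \ln\frac{\sigma_1}{\sigma_2} + \frac{\sigma_2^2 + (\mu_1-\mu_2)^2}{2\sigma_1^2} - \frac{1}{2},$$

and, with $z = \sigma_2^2/\sigma_1^2$, we can write these as:

$$I(f_1,f_2) = \frac{1}{2}\left[\ln z + \frac{1}{z} - 1\right] + \frac{(\mu_1-\mu_2)^2}{2\sigma_2^2}, \qquad I(f_2,f_1) = \frac{1}{2}\left[z - 1 - \ln z\right] + \frac{(\mu_1-\mu_2)^2}{2\sigma_1^2}.$$

We compare this with the situation when $\sigma_1 = \sigma_2$ (so that z = 1), and the ROC curve is symmetric. For TPP-asymmetry, we have z < 1 and (referring to Figure 2A) we can see that $\ln z + 1/z - 1 > z - 1 - \ln z$. For TNP-asymmetry, we have z > 1 and (referring again to Figure 2A) $\ln z + 1/z - 1 < z - 1 - \ln z$. The inclusion of the factor ½ does not affect the inequality portrayed. The remaining terms are ordered in the same direction, because $(\mu_1-\mu_2)^2/(2\sigma_2^2)$ exceeds $(\mu_1-\mu_2)^2/(2\sigma_1^2)$ exactly when $\sigma_1 > \sigma_2$. Thus, for TPP-asymmetry, we have $I(f_1,f_2) > I(f_2,f_1)$ and for TNP-asymmetry, we have $I(f_1,f_2) < I(f_2,f_1)$. This is illustrated in Figure 3, using values of $\mu_1$ and $\mu_2$ from Killeen and Taylor (Figure 1 in [14]). We note also from Figure 3 that the point where the two curves intersect characterizes the symmetric ROC curve with $I(f_1,f_2) = I(f_2,f_1) = 0.980$ nits.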
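The three bi-Normal cases can be checked with the standard closed-form KLD between two Normal distributions. In the sketch below the function name `normal_klds` is hypothetical, and the ratio z is taken to be the control variance divided by the case variance, which is an assumed convention. The first call uses the Killeen and Taylor values quoted above; the standard deviations in the asymmetric calls are illustrative assumptions.

```python
import math

def normal_klds(mu1, sigma1, mu2, sigma2):
    """Return (I(f1,f2), I(f2,f1)) in nits for Normal cases (1) and controls (2)."""
    z = (sigma2 / sigma1) ** 2  # assumed convention: control variance / case variance
    d2 = (mu1 - mu2) ** 2
    i_12 = 0.5 * (math.log(z) + 1.0 / z - 1.0) + d2 / (2.0 * sigma2 ** 2)
    i_21 = 0.5 * (z - 1.0 - math.log(z)) + d2 / (2.0 * sigma1 ** 2)
    return i_12, i_21

symmetric = normal_klds(3.4, 1.0, 2.0, 1.0)   # equal standard deviations
tpp_asym = normal_klds(3.4, 1.0, 2.0, 0.85)   # controls less dispersed (assumed value)
tnp_asym = normal_klds(3.4, 0.85, 2.0, 1.0)   # cases less dispersed (assumed value)

for label, (i_12, i_21) in (("symmetric", symmetric),
                            ("TPP-asymmetric", tpp_asym),
                            ("TNP-asymmetric", tnp_asym)):
    print(f"{label:15s} I(f1,f2)={i_12:.3f}  I(f2,f1)={i_21:.3f}")
```

In the symmetric case both divergences come out at 0.980 nits. In the other two cases the larger divergence is the one whose reference distribution is the less dispersed of the two.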

The Bi-Exponential ROC Curve
We deal with the bi-exponential ROC curve in passing, since it turns out to be a special case of the constant-shape bi-gamma ROC curve, below. Here, we have exponential densities for cases and controls (e.g., [25]), respectively:

$$f_1(x) = \frac{1}{\lambda_1} e^{-x/\lambda_1}, \qquad f_2(x) = \frac{1}{\lambda_2} e^{-x/\lambda_2}, \qquad x \ge 0;\ \lambda_1, \lambda_2 > 0.$$

The indicator variable is calibrated so that the mean of the case distribution is larger than the mean of the control distribution, which requires $\lambda_1 > \lambda_2$. A graphical plot of $1-F_1(x)$ against $1-F_2(x)$ then provides the ROC curve. Such ROC curves are TPP-asymmetric (as described in Figure 1; see, e.g., Figure 1 in [25]).
Asadi et al. (Table 13.1 in [26]) provide a table of distributions related to the Pareto distribution by fixed transformation, expressed in their own notation for the Pareto parameters. Then, following Asadi et al. [26], and writing $z = \lambda_2/\lambda_1$, we obtain KLDs for two exponential distributions as follows:

$$I(f_1,f_2) = \ln z + \frac{1}{z} - 1, \qquad I(f_2,f_1) = z - 1 - \ln z,$$

and refer to Figure 2A. For a useful ROC curve we require $\lambda_1 > \lambda_2$, so we are only interested in the part of Figure 2A where z < 1, and here we have $I(f_1,f_2) > I(f_2,f_1)$. When z = 1, $I(f_1,f_2) = I(f_2,f_1) = 0$, and the corresponding ROC curve follows the main diagonal of the plot; such diagnostic indicators offer no discrimination between cases and controls.
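For the bi-exponential model both the ROC curve and the KLDs have simple closed forms, sketched below. To avoid depending on a particular parameterization, the sketch works with the means of the case and control distributions; the function names and numerical values are hypothetical. With survival functions exp(−x/mean), eliminating the threshold gives TPP = FPP^(mean of controls / mean of cases).

```python
import math

def biexponential_roc(fpp, mean_cases, mean_controls):
    """TPP at a given FPP: 1 - F1 = (1 - F2) ** (mean_controls / mean_cases)."""
    return fpp ** (mean_controls / mean_cases)

def exponential_klds(mean_cases, mean_controls):
    """Return (I(f1,f2), I(f2,f1)) in nits for exponential cases (1) and controls (2)."""
    z = mean_controls / mean_cases  # z < 1 when cases have the larger mean
    return math.log(z) + 1.0 / z - 1.0, z - 1.0 - math.log(z)

# Hypothetical values: cases have twice the mean risk score of controls.
mean_cases, mean_controls, threshold = 2.0, 1.0, 1.5

# Cross-check the closed-form curve against a direct threshold calculation.
fpp = math.exp(-threshold / mean_controls)  # proportion of controls above the threshold
tpp = math.exp(-threshold / mean_cases)     # proportion of cases above the threshold
tpp_from_curve = biexponential_roc(fpp, mean_cases, mean_controls)

i_12, i_21 = exponential_klds(mean_cases, mean_controls)
```

The exponent of the power-law curve is the same ratio z that determines the two KLDs, so a single number fixes both the shape of the curve and the size of the divergences.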

The Bi-Gamma ROC Curve
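We start by writing a general gamma density:

$$f(x) = \frac{x^{r-1} e^{-x/\lambda}}{\Gamma(r)\,\lambda^{r}}, \qquad x > 0;\ r > 0,\ \lambda > 0.$$

We refer to r as a shape parameter and λ as a scale parameter. Mathiassen et al. [27], Faraggi and Reiser [28], Faraggi et al. [29], and Hussain [30] use the same format. For two such gamma densities, respectively $f_1(x)$ and $f_2(x)$, we have X ~ gamma(x, $r_1$, $\lambda_1$) and X ~ gamma(x, $r_2$, $\lambda_2$) and the corresponding KLDs (e.g., [27]) are:

$$I(f_1,f_2) = (r_1 - r_2)\,\Psi(r_1) - \ln\Gamma(r_1) + \ln\Gamma(r_2) + r_2 \ln\frac{\lambda_2}{\lambda_1} + r_1\,\frac{\lambda_1 - \lambda_2}{\lambda_2}$$

$$I(f_2,f_1) = (r_2 - r_1)\,\Psi(r_2) - \ln\Gamma(r_2) + \ln\Gamma(r_1) + r_1 \ln\frac{\lambda_1}{\lambda_2} + r_2\,\frac{\lambda_2 - \lambda_1}{\lambda_1}$$

in which Γ(·) is the gamma function and Ψ(·) is the digamma function (the derivative of the logarithm of the gamma function, [31]). Here, we describe separately a constant-shape ROC curve and a constant-scale ROC curve.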

The Constant-Shape Bi-Gamma ROC Curve
Here, $r_1 = r_2 = r > 0$; $\lambda_1, \lambda_2 > 0$. For $f_1(x)$ and $f_2(x)$ respectively, X ~ gamma(x, r, $\lambda_1$) and X ~ gamma(x, r, $\lambda_2$). The indicator variable is calibrated so that the mean of the case distribution is larger than the mean of the control distribution, which requires $\lambda_1 > \lambda_2$. A graphical plot of $1-F_1(x)$ against $1-F_2(x)$ then provides the ROC curve. Such curves are TPP-asymmetric (as described in Figure 1). For example, see Dorfman et al. [8]. If r = 1, $f_1(x)$ and $f_2(x)$ are the same as for the bi-exponential ROC curve (above), and the symmetry properties then follow. Otherwise, writing $z = \lambda_2/\lambda_1$, the general gamma KLDs above simplify to:

$$I(f_1,f_2) = r\left[\ln z + \frac{1}{z} - 1\right], \qquad I(f_2,f_1) = r\left[z - 1 - \ln z\right],$$

and again, we can refer to Figure 2A (the inclusion of the (constant) factor r does not affect the inequality portrayed). For a useful ROC curve we require $\lambda_1 > \lambda_2$, so we are only interested in the part of Figure 2A where z < 1, and here we have $I(f_1,f_2) > I(f_2,f_1)$. When z = 1, $I(f_1,f_2) = I(f_2,f_1) = 0$, and the corresponding ROC curve follows the main diagonal of the plot; such diagnostic indicators offer no discrimination between cases and controls.
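The constant-shape simplification can be checked against direct numerical integration. The sketch below assumes a gamma density written with a shape parameter and a scale parameter (so that the mean is shape × scale); the function names, the integration settings and the parameter values are hypothetical, and the midpoint rule is only one convenient choice of quadrature.

```python
import math

def gamma_log_pdf(x, shape, scale):
    """Log of the gamma density with the given shape and scale, for x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def kld_numeric(shape, scale_a, scale_b, upper=100.0, steps=100000):
    """Midpoint-rule estimate of I(fa, fb) for two gamma densities with a common shape."""
    width = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * width
        log_a = gamma_log_pdf(x, shape, scale_a)
        log_b = gamma_log_pdf(x, shape, scale_b)
        total += math.exp(log_a) * (log_a - log_b)
    return total * width

def kld_closed(shape, scale_a, scale_b):
    """Closed form of I(fa, fb) for a common shape: shape * (ln z + 1/z - 1)."""
    z = scale_b / scale_a
    return shape * (math.log(z) + 1.0 / z - 1.0)

# Hypothetical values: common shape 2.5, cases with the larger scale (larger mean).
r, scale_cases, scale_controls = 2.5, 2.0, 1.0
i_12 = kld_closed(r, scale_cases, scale_controls)
i_21 = kld_closed(r, scale_controls, scale_cases)
i_12_check = kld_numeric(r, scale_cases, scale_controls)
```

Swapping the two scale arguments replaces z by 1/z, which turns ln z + 1/z − 1 into z − 1 − ln z; a single closed-form function therefore serves for both divergences.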

The Constant-Scale Bi-Gamma ROC Curve
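Here, $\lambda_1 = \lambda_2 = \lambda > 0$; $r_1, r_2 > 0$. For simplicity, we follow Hanley [32] and Tang et al. [33] who have λ = 1. For $f_1(x)$ and $f_2(x)$ respectively, X ~ gamma(x, $r_1$, λ) and X ~ gamma(x, $r_2$, λ). The indicator variable is calibrated so that the mean of the case distribution is larger than the mean of the control distribution, which requires $r_1 > r_2$. A graphical plot of $1-F_1(x)$ against $1-F_2(x)$ then provides the ROC curve. Such curves are TNP-asymmetric (as described in Figure 1). The general gamma KLDs above simplify to:

$$I(f_1,f_2) = (r_1 - r_2)\,\Psi(r_1) - \ln\Gamma(r_1) + \ln\Gamma(r_2), \qquad I(f_2,f_1) = (r_2 - r_1)\,\Psi(r_2) - \ln\Gamma(r_2) + \ln\Gamma(r_1).$$

Now, we set $r_2 = 1$ and $r_1 = \zeta$, ζ > 0. Figure 4A shows the graphical plots of:

$$I(f_1,f_2) = (\zeta - 1)\,\Psi(\zeta) - \ln\Gamma(\zeta), \qquad I(f_2,f_1) = (1 - \zeta)\,\Psi(1) + \ln\Gamma(\zeta),$$

from which it appears that (for ζ > 0):

$$I(f_1,f_2) > I(f_2,f_1) \ \text{for } \zeta < 1; \qquad I(f_1,f_2) = I(f_2,f_1) \ \text{for } \zeta = 1; \qquad I(f_1,f_2) < I(f_2,f_1) \ \text{for } \zeta > 1.$$

On calculating the derivatives, we obtain

$$\frac{dI(f_1,f_2)}{d\zeta} = (\zeta - 1)\,\Psi_1(\zeta),$$

in which $\Psi_1(\cdot)$ is the trigamma function (the first derivative of the digamma function, [31]) and

$$\frac{dI(f_2,f_1)}{d\zeta} = \Psi(\zeta) - \Psi(1) = \Psi(\zeta) + \gamma,$$

in which γ is Euler's constant (= 0.5772…), since Ψ(1) = −γ (see also Figure 4B). The inequality portrayed in Figure 4 appears to have the same characteristics as that shown in Figure 2. Recall that $r_1 > r_2$, so we are only interested in the part of Figure 4A where ζ > 1, and here we have $I(f_1,f_2) < I(f_2,f_1)$. When ζ = 1, $I(f_1,f_2) = I(f_2,f_1) = 0$, and the corresponding ROC curve follows the main diagonal of the plot; such diagnostic indicators offer no discrimination between cases and controls.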
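The constant-scale KLDs involve the digamma function, which is not in the Python standard library. The sketch below therefore includes a small approximation of its own, using the recurrence Ψ(x) = Ψ(x + 1) − 1/x followed by a truncated asymptotic series; this is an illustrative implementation with an error of roughly 1e-8, not a library routine. The function names are hypothetical. The KLDs use the standard result for two gamma densities sharing a scale parameter, in which the scale cancels.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def digamma(x):
    """Approximate digamma for x > 0: recurrence up to x >= 6, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def constant_scale_klds(r1, r2):
    """Return (I(f1,f2), I(f2,f1)) for gamma shapes r1 (cases) and r2 (controls), common scale."""
    i_12 = (r1 - r2) * digamma(r1) - math.lgamma(r1) + math.lgamma(r2)
    i_21 = (r2 - r1) * digamma(r2) - math.lgamma(r2) + math.lgamma(r1)
    return i_12, i_21

# Control shape fixed at 1, case shape zeta varied on either side of 1.
for zeta in (0.5, 1.0, 2.0, 4.0):
    i_12, i_21 = constant_scale_klds(zeta, 1.0)
    print(f"zeta={zeta:<4} I(f1,f2)={i_12:.4f}  I(f2,f1)={i_21:.4f}")
```

For ζ = 2 the two values reduce to 1 − γ and γ, which gives a convenient hand check on the ordering of the divergences for ζ > 1.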

Discussion
For continuous parametric ROC curves, we can define symmetry conditions. Notwithstanding, it is sometimes rather difficult to tell from a graphical plot whether an empirical ROC curve is actually symmetrical or only approximately so (e.g., Figure 2 in [24]). It is harder to define asymmetry conditions for continuous parametric ROC curves, although often relatively easy to tell from a graphical plot when an empirical ROC curve is asymmetric (e.g., Figure 2 in [10]). Marzban [24] asks if asymmetry can be explained in terms of the underlying case and control distributions, and concludes that asymmetry in an ROC curve "can be attributed to unequal widths of the underlying distributions". What is lacking is an independent assessment of asymmetry for comparison with the statistical assessment based on the relative dispersion of the case and control distributions. Here, we bring together a graphical definition of asymmetry (Figure 1) with an analysis of the KLDs for the case and control distributions for some examples of continuous parametric ROC curves.
The main findings are as follows. Bi-Normal ROC curves may be symmetric, TPP-asymmetric or TNP-asymmetric. For symmetric bi-Normal curves, we have $I(f_1,f_2) = I(f_2,f_1)$; for TPP-asymmetric curves, $I(f_1,f_2) > I(f_2,f_1)$; and for TNP-asymmetric curves, $I(f_1,f_2) < I(f_2,f_1)$. Of particular interest is the point of intersection of the two curves in Figure 3. The fact that at this point we have $I(f_1,f_2) = I(f_2,f_1) > 0$ indicates the existence of symmetric curves that lie above the main diagonal of the bi-Normal ROC plot. This in itself is not surprising, of course, but it is noted here for reference below.
Bi-exponential ROC curves may only be TPP-asymmetric. For these TPP-asymmetric curves, $I(f_1,f_2) > I(f_2,f_1)$. In this case the KLDs are equal only when $I(f_1,f_2) = I(f_2,f_1) = 0$ (referring to Figure 2), indicating that (unlike the bi-Normal case) there is no symmetric curve that lies above the main diagonal of the bi-exponential ROC plot.
Bi-gamma ROC curves may be TPP-asymmetric or TNP-asymmetric. A constant-shape bi-gamma ROC curve is always TPP-asymmetric and $I(f_1,f_2) > I(f_2,f_1)$ (Figure 2A). For the constant-scale bi-gamma ROC curve we considered the case of λ = 1. This is always TNP-asymmetric and $I(f_1,f_2) < I(f_2,f_1)$ (Figure 4A). In both cases (i.e., constant-shape and constant-scale) the KLDs are equal only when $I(f_1,f_2) = I(f_2,f_1) = 0$ (Figures 2A and 4A), indicating that (unlike the bi-Normal case) there is no symmetric curve that lies above the main diagonal of the bi-gamma ROC plot.
The choice of operational threshold on an ROC curve amounts to specification of the error rates (FPP and FNP = 1 − TPP) of the resulting diagnostic test. Recalling Figure 1 (for example), we can see that the symmetry properties of an ROC curve influence the trade-off between these error rates that is of interest in the process of choosing a threshold. ROC curve symmetry and both kinds of asymmetry are observed empirically in the study of disease diagnostics. This is beyond the scope of summaries based on area under curve calculations. As noted by Marzban [24], the ROC curve is a two-dimensional representation of a diagnostic indicator, so a single-figure summary measure cannot characterize all its properties. Further difficulties with the area under the ROC curve as a summary measure of performance of a diagnostic indicator are discussed in [34]. Our work so far, relating to continuous parametric ROC curves, indicates the following. First, although the KLD is usually not a symmetric quantity [35], it is noteworthy that for an ROC curve based on $f_1(x)$ (for cases) and $f_2(x)$ (for controls) that is symmetric about the negative diagonal, $I(f_1,f_2) = I(f_2,f_1)$ [17]. Second, although the lack of symmetry of the KLD has been referred to as a nuisance in applications [36], in this particular study we find that the asymmetry of the KLD usefully characterizes the asymmetry of bi-Normal and bi-gamma ROC curves.

Figure 1. Graphical description of symmetric and asymmetric ROC curves. The dotted lines show, for reference, TPP = 1 − FPP (the negative diagonal) and the lines FPP = a (vertical) and TPP = 1 − a (horizontal). The FPP coordinate of point A = a, and the FPP coordinate of point C = a*, such that a < a*. The solid line is a symmetric ROC curve passing through the points A (a, b) and B ($a_1$, $b_1$) (such that $a_1$ = 1 − b, $b_1$ = 1 − a). Point C (a*, 1 − a*) also lies on the symmetric ROC curve. Asymmetries are defined by reference to the symmetric curve passing through point A, as follows. The dashed line is a TPP-asymmetric ROC curve passing through the points A (a, b) and D ($a_2$, $b_2$) (such that $a_2$ > 1 − b, $b_2$ = 1 − a). The dot-dashed line is a TNP-asymmetric ROC curve passing through the points A (a, b) and E ($a_3$, $b_3$) (such that $a_3$ < 1 − b, $b_3$ = 1 − a).

Figure 4. Analysis of a constant-scale bi-gamma ROC curve. (A) Graphical plots of the Kullback-Leibler divergences $I(f_1,f_2)$ and $I(f_2,f_1)$ against ζ. (B) The derivatives $dI(f_1,f_2)/d\zeta$ (the solid line) and $dI(f_2,f_1)/d\zeta$ (the dashed line).