Refinement Orders for Quantitative Information Flow and Differential Privacy



Introduction
The enormous growth in the use of internet-connected devices and the big-data revolution have created serious privacy concerns, and motivated an intensive area of research aimed at devising methods to protect the users' sensitive information. During the last decade, two main frameworks have emerged in this area: Differential Privacy (DP) and Quantitative Information Flow (QIF).
Differential privacy (DP) [1] was originally developed in the area of statistical databases, and it aims at protecting the individuals' data while allowing the release of aggregate information through queries. This is obtained by obfuscating the result of the query via the addition of controlled noise. Naturally, we need to assume that the curator, namely the entity collecting and storing the data and handling the queries, is honest and capable of protecting the data from security breaches. Since this assumption cannot always be guaranteed, a variant has been proposed: local differential privacy (LDP) [2], where the data are obfuscated individually before they are collected.
Both DP and LDP are subsumed by d-privacy [3], and, in this paper, we will use the latter as a unifying framework. The definition of d-privacy assumes an underlying metric structure on the data domain X. An obfuscation mechanism K for X is a probabilistic mapping from X to some output domain Y, namely a function from X to probability distributions over Y. We will use the notation K_{x,y} to represent the probability that K on input x gives output y. The mechanism K is ε·d-private, where ε is a parameter representing the privacy level, if

K_{x1,y} ≤ e^{ε d(x1,x2)} K_{x2,y}   for all x1, x2 ∈ X, y ∈ Y.    (1)

Standard DP is obtained from this definition by taking X to be the set of all datasets and d the Hamming distance between two datasets, seen as vectors of records (i.e., the number of positions in which the two datasets differ). Note that the more common definition of differential privacy assumes that x1, x2 are adjacent, i.e., that their Hamming distance is 1, and requires K_{x1,y} ≤ e^ε K_{x2,y}. It is easy to prove that the two definitions are equivalent. As for LDP, it is obtained by considering the so-called discrete metric, which assigns distance 0 to identical elements and 1 otherwise.
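To make the definition concrete, here is a small sketch (our own illustration, not part of the paper's development) that checks the ε·d-privacy condition numerically for a channel given as a stochastic matrix; the function name and the toy randomized-response matrix are ours.

```python
import math

def is_eps_d_private(K, d, eps, tol=1e-9):
    """Check the eps*d-privacy condition
    K[x1][y] <= exp(eps * d(x1, x2)) * K[x2][y]
    for all pairs of secrets x1, x2 and all observations y.
    K is a stochastic matrix given as a list of rows, one per secret."""
    for x1 in range(len(K)):
        for x2 in range(len(K)):
            bound = math.exp(eps * d(x1, x2))
            for y in range(len(K[0])):
                if K[x1][y] > bound * K[x2][y] + tol:
                    return False
    return True

# Randomized response on two values: report the truth w.p. 2/3, lie w.p. 1/3.
rr = [[2/3, 1/3],
      [1/3, 2/3]]
discrete = lambda x1, x2: 0 if x1 == x2 else 1  # the metric used for LDP

print(is_eps_d_private(rr, discrete, math.log(2)))    # True: (log 2)-LDP
print(is_eps_d_private(rr, discrete, math.log(1.5)))  # False: log 1.5 is too small
```

With the discrete metric, this check instantiates to LDP; with the Hamming distance on datasets, it instantiates to standard DP.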
The other framework for the protection of sensitive information, quantitative information flow (QIF), focuses on the potentialities and the goals of the attacker, and the research in this area has developed rigorous foundations based on information theory [4,5]. The idea is that a system processing some sensitive data from a random variable X and releasing some observable data as a random variable Y can be modelled as an information-theoretic channel with input X and output Y. The leakage is then measured in terms of the correlation between X and Y. There are, however, many different ways to define such a correlation, depending on the notion of adversary. In order to provide a unifying approach, Ref. [6] has proposed the theory of g-leakage, in which an adversary is characterized by a functional parameter g representing its gain for each possible outcome of the attack.
One issue that arises in both frameworks is how to compare systems from the point of view of their privacy guarantees. It is important to have a rigorous and effective way to establish whether a mechanism is better or worse than another one, in order to guide the design and the implementation of mechanisms for information protection. This is not always an obvious task. To illustrate the point, consider the following examples.
Example 1. Let P1, P2, P3 and P4 be the programs illustrated in Table 1, where H is a "high" (i.e., secret) input and L is a "low" (i.e., public) output. We assume that H is a uniformly distributed 32-bit integer with range 0 ≤ H < 2^32. All these programs leak information about H via L, in different ways: P1 reveals H whenever it is a multiple of 8 (H mod 8 denotes the remainder of the integer division of H by 8), and reveals nothing otherwise. P2 does the same thing whenever H is a multiple of 4. P3 reveals the last 8 bits of H (note that H & 0^24 1^8 represents the bitwise conjunction between H and a string of 24 bits "0" followed by 8 bits "1"). Analogously, P4 reveals the last 4 bits of H. Now, it is clear that P2 leaks more than P1, and that P4 leaks more than P3, but how to compare, for instance, P1 and P3? It is debatable which one is worse because their behavior is very different: P1 reveals nothing in most cases, but when it does reveal something, it reveals everything. P3, on the other hand, always reveals part of the secret. Clearly, we cannot decide which situation is worse unless we have some more information about the goals and the capabilities of the attacker. For instance, if the adversary has only one attempt at his disposal (and no extra information), then the program P3 is better because, even after the output of L, there are still 24 bits of H that are unknown. On the other hand, if the adversary can repeat the attack on programs similar to P3, then it will eventually uncover the secret entirely.

Table 1. Programs that take in input a secret H and leak information about H via the output L.

Example 2. Consider the domain of the integer numbers between 0 and 100, and consider a geometric mechanism (cf. Definition 9) on this domain, with ε = log(27/25). Then, consider a randomized response mechanism (cf. Definition 13), still on the same domain, with ε = log 2. The two mechanisms are illustrated in Figure 1.
They both satisfy d-privacy, but for different d: in the first case, d is the standard distance between numbers, while in the second case it is the discrete metric. Clearly, it does not make sense to compare these mechanisms on the basis of their respective privacy parameters ε, because they represent different privacy properties, and it is not obvious how to compare them in general: the geometric mechanism tends to make the true value indistinguishable from its immediate neighbors, but it separates it from the values far away. The randomized response introduces the same level of confusion between the true value and any other value of the domain. Thus, which mechanism is more private depends on the kind of attack we want to mitigate: if the attacker is trying to guess an approximation of the value, then the randomized response is better. If the attacker is only interested in identifying the true value among the immediate neighbors, then the geometric is better. Indeed, note that in the subdomain of the numbers between 40 and 60 the geometric mechanism also satisfies d-privacy with d being the discrete metric and ε = log 2, and in any subdomain smaller than that it satisfies discrete-metric d-privacy with ε < log 2.
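Example 2 is easy to reproduce numerically. The sketch below is our own illustration (the constructions follow the standard truncated-geometric and randomized-response recipes; function names and the numeric check are ours, and correspond only loosely to the paper's Definitions 9, 10 and 13): it builds both mechanisms on {0, …, 100} and recovers the smallest ε for which each one satisfies ε·d-privacy wrt its own metric.

```python
import math

def geometric(n, eps):
    """Truncated geometric mechanism on {0,...,n}: probabilities
    proportional to exp(-eps*|x-y|), with the tail mass of y < 0 and
    y > n folded onto the boundary outputs 0 and n."""
    a = math.exp(-eps)
    c = (1 - a) / (1 + a)
    K = []
    for x in range(n + 1):
        row = [c * a**abs(x - y) for y in range(n + 1)]
        row[0] += c * a**x * a / (1 - a)        # mass of outputs y < 0
        row[n] += c * a**(n - x) * a / (1 - a)  # mass of outputs y > n
        K.append(row)
    return K

def randomized_response(n, eps):
    """Randomized response on {0,...,n}: truth with weight e^eps,
    any other value with weight 1, normalized per row."""
    e = math.exp(eps)
    return [[(e if x == y else 1) / (e + n) for y in range(n + 1)]
            for x in range(n + 1)]

def max_log_ratio_per_unit(K, d):
    """Smallest eps such that K is eps*d-private (numerically):
    the largest log(K[x1][y]/K[x2][y]) / d(x1,x2) over all pairs."""
    worst = 0.0
    for x1 in range(len(K)):
        for x2 in range(len(K)):
            if d(x1, x2) == 0:
                continue
            for y in range(len(K[0])):
                worst = max(worst, math.log(K[x1][y] / K[x2][y]) / d(x1, x2))
    return worst

euclid = lambda x1, x2: abs(x1 - x2)
disc = lambda x1, x2: 0 if x1 == x2 else 1

TG = geometric(100, math.log(27/25))
RR = randomized_response(100, math.log(2))
print(max_log_ratio_per_unit(TG, euclid))  # ~ log(27/25), approx. 0.077
print(max_log_ratio_per_unit(RR, disc))    # = log 2, approx. 0.693
```

The two ε values are not comparable as numbers: they bound log-likelihood ratios per unit of two different metrics, which is exactly the point of the example.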
In this respect, the QIF approach has led to an elegant theory of a refinement (pre)order ⊑_G^avg (in this paper, we call ⊑_G^avg and the other refinement relations "orders", although, strictly speaking, they are preorders), which provides strong guarantees: A ⊑_G^avg B means that B is safer than A in all circumstances, in the sense that the expected gain of an attack on B is less than on A, for whatever kind of gain the attacker may be seeking. This means that we can always substitute the component A by B without compromising the security of the system. An appealing aspect of this particular refinement order is that it is characterized by a precise structural relation between the stochastic channels associated with A and B [6,7], which makes it easy to reason about and relatively efficient to verify. It is important to remark that this order is based on an average notion of adversarial gain (vulnerability), defined by averaging over all possible observations and their probabilities. We call this perspective average-case.
At the other end of the spectrum, DP, LDP and d-privacy are max-case measures. In fact, by applying the Bayes theorem to (1), we obtain:

p(x1 | y) / p(x2 | y) ≤ e^{ε d(x1,x2)} · π(x1) / π(x2)    (2)

where, for i ∈ {1, 2}, π(x_i) is the prior probability of x_i and p(x_i | y) is the posterior probability of x_i given y. We can interpret π(x1)/π(x2) and p(x1|y)/p(x2|y) as knowledge about X: they represent how much more likely x1 is with respect to x2, before (prior) and after (posterior) observing y, respectively. Thus, property (2) expresses a bound on how much the adversary can learn from each individual outcome of the mechanism. (Property (2) is also the semantic interpretation of the guarantees of the Pufferfish framework (cf. [8], Section 3.1). A ratio such as π(x1)/π(x2) is known as an odds ratio.)
In the literature on DP, LDP and d-privacy, mechanisms are usually compared on the basis of their ε-value (in DP and LDP, ε is a parameter that usually appears explicitly in the definition of the mechanism; in d-privacy, it is an implicit scaling factor), which controls a bound on the log-likelihood ratio of an observation y given two "secrets" x1 and x2: a smaller ε means more privacy. In DP and LDP, the bound is ε itself, while in d-privacy it is ε × d(x1, x2). We remark that the relation induced by ε in d-privacy is fragile, in the sense that the definition of d-privacy assumes an underlying metric structure d on the data, and whether a mechanism B is "better" than A depends in general on the metric considered.
Average-case and max-case are different principles, suitable for different scenarios: the former represents the point of view of an organization, for instance an insurance company providing coverage for risks related to credit cards, which for its cost-benefit analysis is interested in reasoning in terms of expectation (expected cost of an attack). The max-case represents the point of view of an individual, who is interested in limiting the cost of any attack. As such, the max-case seems particularly suitable for the domain of privacy.
In this paper, we combine the max-case perspective with the robustness of the QIF approach, and we introduce two refinement orders:

• ⊑_Q^max, based on the max-case leakage introduced in [9]. This order takes into account all possible privacy breaches caused by any observable (as in the DP world), but it quantifies over all possible quasi-convex vulnerability functions (in the style of the QIF world).

• ⊑_M^prv, based on d-privacy (as in the DP world), but quantified over all metrics d.

To underline the importance of a robust order, let us consider the case of the oblivious mechanisms for differential privacy. These mechanisms are of the form K = H ∘ f, where f : X → Y is a query, namely a function from datasets in X to some answer domain Y, and H is a probabilistic mechanism implementing the noise. The idea is that the system first computes the result y ∈ Y of the query (true answer), and then applies H to y to obtain a reported answer z. In general, if we want K to be ε-DP, we need to tune the mechanism H in order to take into account the sensitivity of f, which is the maximum distance between the results of f on two adjacent databases, and as such depends on the metric on Y. However, if we know that K = H ∘ f is ε-DP, and that H ⊑_M^prv H′ for some other mechanism H′, then we can safely substitute H by H′ as it is, because one of our results (cf. Theorem A5) guarantees that K′ = H′ ∘ f is also ε-DP. In other words, H ⊑_M^prv H′ implies that we can substitute H by H′ in an oblivious mechanism for whatever query f and whatever metric on Y, without the need to know the sensitivity of f and without the need to do any tuning of H′. Thanks to Theorems 3 and 4, we know that this is the case also for ⊑_G^avg and ⊑_Q^max. We illustrate this with the following example.
Example 3. Consider datasets x ∈ X of records containing the age of people, expressed as natural numbers from 0 to 100, and assume that each dataset in X contains at least 100 records. Consider two queries, f(x) and g(x), which give the rounded average age and the minimum age of the people in x, respectively. Finally, consider the truncated geometric mechanism TG_ε (cf. Definition 10), and the randomized response mechanism R_ε (cf. Definition 13). It is easy to see that K_1 = TG_ε ∘ f is ε-DP, and it is possible to prove that TG_ε ⊑_M^prv R_ε (cf. Theorem 14). We can then conclude that K_2 = R_ε ∘ f is ε-DP as well, and that in general it is safe to replace TG_ε by R_ε for whatever query. On the other hand, R_ε ⋢_M^prv TG_ε, so we cannot expect it to be safe to replace R_ε by TG_ε in an arbitrary context. In fact, consider K_3 = R_ε ∘ g and K_4 = TG_ε ∘ g: both mechanisms are constructed using the same privacy parameter ε, yet K_3 is ε-DP while K_4 is 100 × ε-DP. Hence, we can conclude that a refinement relation based only on the comparison of the ε parameters would not be robust, at least not for a direct replacement in an arbitrary context. In order to make K_4 ε-DP, we should divide the parameter ε by the sensitivity of g (with respect to the ordinary distance on natural numbers), which is 100, i.e., use TG_{ε/100}. For R_ε, this is not necessary because it is defined using the discrete metric on {0, . . ., 100}, and the sensitivity of g with respect to this metric is 1.
The robust orders allow us to take into account different kinds of adversaries. The following example illustrates the idea.
Example 4. Consider the following three LDP mechanisms, represented by their stochastic matrices (where each element is the conditional probability of the outcome of the mechanism, given the secret value).The secrets are three possible economic situations of an individual, p, a and r, standing for "poor", "average" and "rich", respectively.The observable outcomes are r and n, standing for "rich" and "not rich".
Let us assume that the prior distribution π on the secrets is uniform. We note that A is (log 3)-LDP while B is (log 2)-LDP. Hence, if we only looked at the value of ε, we would think that B is better than A from the privacy point of view. However, there are attackers that gain more from B than from A (which means that, with respect to those attackers, the privacy of B is worse). For instance, this is the case when the attacker is only interested in discovering whether the person is rich or not. In fact, if we consider a gain 1 when the attacker guesses the right class (r versus either p or a) and 0 otherwise, we have that the highest possible gain in A is (3/4)π(p) + (1/2)π(a) = 5/12, while in B it is (2/3)π(p) + (2/3)π(a) = 4/9, which is higher than 5/12. This is consistent with our orders: it is possible to show that none of the three orders holds between A and B, and that therefore we should not expect B to be better (for privacy) than A with respect to all possible adversaries.
On the other hand, the mechanism C is also (log 2)-LDP, and in this case we have that the relation A ⊑_Q^max C holds, implying that we can safely replace A by C. We can also prove that the reverse does not hold, which means that C is strictly better than A.
A fundamental issue is how to prove that these robust orders hold: since ⊑_Q^max and ⊑_M^prv involve universal quantifications, it is important to devise finitary methods to verify them. To this purpose, we will study their characterizations as structural relations between stochastic matrices (representing the mechanisms to be compared), along the lines of what was done for ⊑_G^avg. We will also study the relations between the three orders (the two above and ⊑_G^avg), and their algebraic properties. Finally, we will analyze various mechanisms for DP, LDP, and d-privacy to see in which cases the order induced by ε is consistent with the three orders above.

Contribution
The main contributions of this paper are the following:

•
We introduce two refinement orders for the max case, ⊑_Q^max and ⊑_M^prv, that are robust with respect to a large class of adversaries.

•
We give structural characterizations of both ⊑_Q^max and ⊑_M^prv in terms of relations on the stochastic matrices of the mechanisms under comparison. These relations help the intuition and open the way to verification.

•
We study efficient methods to verify the structural relations above. Furthermore, these methods are such that, when the verification fails, they produce counterexamples. In this way, it is possible to pin down what the problem is and try to correct it.

•
We show that ⊑^avg ⊂ ⊑^max ⊂ ⊑^prv.

•
We apply the three orders (⊑_G^avg, ⊑_Q^max, and ⊑_M^prv) to the comparison of some well-known families of d-private mechanisms: geometric, exponential and randomized response. We show that, in general, A ⊑_G^avg B (and thus all the refinement orders between A and B) holds within the same family whenever the ε of B is smaller than that of A.

•
We show that, if A and B are mechanisms from different families, then, even if the ε of B is smaller than that of A, the relations A ⊑_G^avg B and A ⊑_Q^max B do not hold, and in most cases A ⊑_M^prv B does not hold either. We conclude that a comparison based only on the value of the ε's is not robust across different families, at least not for the purposes illustrated above.

•
We study lattice properties of these orders. In contrast to ⊑_G^avg, which was shown not to form a lattice, we prove that suprema and infima exist for ⊑_Q^max and ⊑_M^prv, and that therefore these orders form lattices.

Related Work
We are not aware of many studies on refinement relations for QIF. Yasuoka and Terauchi [10] and Malacaria [11] have explored strong orders on deterministic mechanisms, focusing on the fact that such mechanisms induce partitions on the space of secrets. They showed that the orders produced by min-entropy leakage [5] and Shannon leakage [12,13] are the same and, moreover, that they coincide with the partition refinement order in the Lattice of Information [14]. This order was extended to the probabilistic case in [6], resulting in the relation ⊑^avg mentioned in Section 2. The same paper [6] proposed the theory of g-leakage and introduced the corresponding order ⊑_G^avg. Furthermore, [6] proved that ⊑^avg ⊆ ⊑_G^avg and conjectured that the reverse should also hold. This conjecture was then proved valid in [7]. The max-case leakage, on which the relation ⊑_Q^max is based, was introduced in [9], but ⊑_Q^max and its properties were not investigated there. Finally, ⊑^prv is a novel notion introduced in this paper.
In the field of differential privacy, on the other hand, there have been various works aimed at understanding the operational meaning of the privacy parameter ε and at providing guidelines for the choice of its values. We mention, for example, [15,16], which consider the value of ε from an economical point of view, in terms of cost. We are not aware, however, of studies aimed at establishing orders between the levels of privacy of different mechanisms, except the one based on the comparison of the ε's.
The relation between QIF and DP, LDP, and d-privacy is based on the so-called semantic interpretation of these privacy notions, which regards these properties as expressing a bound on the increase of knowledge (from prior to posterior) due to the answer reported by the mechanism. For d-privacy, the semantic interpretation is expressed by (2). To the best of our knowledge, this interpretation was first pointed out (for the location-privacy instance) in [17]. The seminal paper on d-privacy, [3], also proposed a semantic interpretation, with a rather different flavor, although formally equivalent. As for DP, as explained in the Introduction, (2) instantiated to databases and the Hamming distance corresponds to the odds ratio on which the semantic interpretation provided in [8] is based. Before that, another version of the semantic interpretation was presented in [1] and proved equivalent to a form of DP called ε-indistinguishability. Essentially, in this version, an adversary who queries the database, and knows all the database except one record, cannot infer too much about this record from the answer to the query reported by the mechanism. Later on, an analogous version of the semantic interpretation was reformulated in [18] and proved equivalent to DP. A different interpretation of DP, called semantic privacy, was proposed by [19]. This interpretation is based on a comparison between two posteriors (rather than between the posterior and the prior), and the authors show that, within certain limits, it is equivalent to DP.
A short version of this paper, containing only some of the proofs, appeared in [20].

Plan of the Paper
In the next three sections, Sections 2-4, we define the refinement orders ⊑_G^avg, ⊑_Q^max and ⊑_M^prv, respectively, and we study their properties. In Section 5.1, we investigate methods to verify them. In Section 6, we consider various mechanisms for DP and its variants, and we investigate the relation between the parameter ε and the orders introduced in this paper. In Section 7, we show that ⊑_Q^max and ⊑_M^prv form lattices. Finally, Section 8 provides conclusions.
Note: In order to make the reading of the paper more fluid, we have moved all the proofs to the appendix at the end of the paper.

Average-Case Refinement
We recall here some basic concepts from the literature. Table 2 lists the symbols used for the main concepts throughout the paper.
Table 2. Definitions and symbols used in this paper.

Vulnerability, Channels, and Leakage
Quantitative Information Flow studies the problem of quantifying the information leakage of a system (e.g., a program, or an anonymity protocol). A common model in this area is to consider that the user has a secret x from a finite set of possible secrets X, about which the adversary has some probabilistic knowledge π: DX (DX denoting the set of probability distributions over X). A function V: DX → R≥0 is then employed to measure the vulnerability of our system: V(π) quantifies the adversary's success in achieving some desired goal, when his knowledge about the secret is π.
Various such functions can be defined (e.g., employing well-known notions of entropy), but it quickly becomes apparent that no single vulnerability function is meaningful for all systems. The family of g-vulnerabilities [6] tries to address this issue by parametrizing V in an operational scenario: first, the adversary is assumed to possess a set of actions W; second, a gain function g(w, x) models the adversary's gain when choosing action w and the real secret is x. g-vulnerability can then be defined as the expected gain of an optimal guess:

V_g(π) = max_{w ∈ W} ∑_{x ∈ X} π_x g(w, x).

Different adversaries can be modelled by proper choices of W and g. We denote by GX the set of all gain functions.
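For concreteness, the definition of V_g can be transcribed directly into code (our own sketch; the names are ours). The identity gain function below recovers the well-known Bayes vulnerability: the probability of guessing the secret in one try.

```python
def V_g(pi, g, W):
    """g-vulnerability: expected gain of an optimal one-shot choice,
    V_g(pi) = max over actions w of sum_x pi[x] * g(w, x)."""
    return max(sum(p * g(w, x) for x, p in enumerate(pi)) for w in W)

# Bayes vulnerability: W = X and g is the identity gain (1 if the guess
# equals the secret, 0 otherwise).
gid = lambda w, x: 1 if w == x else 0
print(V_g([0.5, 0.3, 0.2], gid, range(3)))  # 0.5: guess the likeliest secret
```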
A system is then modelled as a channel: a probabilistic mapping from the (finite) set of secrets X to a finite set of observations Y, described by a stochastic matrix C, where C_{x,y} is the probability that secret x produces the observation y. When the adversary observes y, he can transform his initial knowledge π into a posterior knowledge δ_y: DX. Since each observation y is produced with some probability a_y, it is sometimes conceptually useful to consider that the result of running a channel C, on the initial knowledge π, is a "hyper" distribution [π, C]: a probability distribution on posteriors δ_y, each having probability a_y.
It is then natural to define the (average-case) posterior vulnerability of the system by applying V to each posterior δ_y, then averaging by its probability a_y of being produced:

V[π, C] = ∑_y a_y V(δ_y).

When defining vulnerability in this way, it can be shown [9] that V has to be convex on π; otherwise, fundamental properties (such as the data-processing inequality) are violated. Any continuous and convex function V can be written as V_g for a properly chosen g, so, when studying average-case leakage, we can safely restrict to using g-vulnerability.
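The hyper-distribution and the average-case posterior vulnerability can likewise be computed directly from their definitions. This is our own sketch (names ours); the vulnerability plugged in below is Bayes vulnerability, i.e., the maximum component of the posterior, which is convex as required.

```python
def hyper(pi, C):
    """Push prior pi through channel C: returns a list of pairs
    (a_y, delta_y) - output probability and posterior - for every
    output y produced with non-zero probability."""
    out = []
    for y in range(len(C[0])):
        a_y = sum(pi[x] * C[x][y] for x in range(len(pi)))
        if a_y > 0:
            out.append((a_y, [pi[x] * C[x][y] / a_y for x in range(len(pi))]))
    return out

def posterior_V(pi, C, V):
    """Average-case posterior vulnerability: V[pi, C] = sum_y a_y * V(delta_y)."""
    return sum(a * V(d) for a, d in hyper(pi, C))

pi = [0.5, 0.5]
C = [[0.75, 0.25],   # a channel leaking some information about a binary secret
     [0.25, 0.75]]
# Bayes vulnerability: V(delta) = max component of the posterior.
print(posterior_V(pi, C, max))  # 0.75: the adversary guesses right 75% of the time
```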
Leakage can finally be defined by comparing the prior and posterior vulnerabilities, e.g., additively as V[π, C] − V(π). (Comparing vulnerabilities "multiplicatively" is also possible, but is orthogonal to the goals of this paper.)

Refinement
A fundamental question arises in the study of leakage: can we guarantee that a system B is no less safe than a system A? Having a family of vulnerability functions, we can naturally define a strong order ⊑_G^avg on channels by explicitly requiring that B leaks no more than A (note that comparing the leakage of A and B is equivalent to comparing their posterior vulnerabilities, so we choose the latter for simplicity), for all priors π and all gain functions g: GX (note also that quantifying over g: GX is equivalent to quantifying over all continuous and convex vulnerabilities):

A ⊑_G^avg B iff V_g[π, B] ≤ V_g[π, A] for all g: GX, π: DX.

Although ⊑_G^avg is intuitive and provides clear leakage guarantees, the explicit quantification over vulnerability functions makes it hard to reason about and verify. Thankfully, this order can be characterized in a "structural" way, that is, as a direct property of the channel matrices. We first define the refinement order ⊑^avg on channels by requiring that B can be obtained by post-processing A by some other channel R, that is:

A ⊑^avg B iff AR = B for some channel R.

A fundamental result [6,7] states that ⊑^avg and ⊑_G^avg coincide. We read A ⊑^avg B as "A is refined by B", or "B is as safe as A". When A ⊑^avg B holds, we have a strong privacy guarantee: we can safely replace A by B without decreasing the privacy of the system, independently of the adversary's goals and knowledge. However, refinement can also be useful in case A ⋢^avg B; namely, we can conclude that some adversary must exist, modelled by a gain function g, and some initial knowledge π, such that the adversary actually prefers to interact with B rather than with A. (Whether this adversary is of practical interest or not is a different issue, but we know that one exists.) Moreover, we can actually construct such a "counter-example" gain function; this is discussed in Section 5.
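The structural condition "AR = B for some channel R" is decidable by linear programming, since R's entries are nonnegative unknowns subject to linear equalities. The sketch below is our own illustration (assuming NumPy and SciPy are available; the function name avg_refines is ours, not the paper's), checking feasibility of the post-processing equation.

```python
import numpy as np
from scipy.optimize import linprog

def avg_refines(A, B):
    """Structural characterization of average-case refinement:
    A is refined by B iff B = A R for some channel (row-stochastic
    matrix) R.  Solved as an LP feasibility problem."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    n, mA, mB = A.shape[0], A.shape[1], B.shape[1]
    # Unknowns: R flattened row-major, R[i][j] = x[i*mB + j] >= 0.
    # Equalities: (A R)[x][j] = B[x][j], and each row of R sums to 1.
    A_eq = np.zeros((n * mB + mA, mA * mB))
    b_eq = np.zeros(n * mB + mA)
    for x in range(n):
        for j in range(mB):
            for i in range(mA):
                A_eq[x * mB + j, i * mB + j] = A[x, i]
            b_eq[x * mB + j] = B[x, j]
    for i in range(mA):  # row-stochasticity of R
        A_eq[n * mB + i, i * mB:(i + 1) * mB] = 1
        b_eq[n * mB + i] = 1
    res = linprog(np.zeros(mA * mB), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (mA * mB), method="highs")
    return res.status == 0  # 0 = feasible, 2 = infeasible

A = [[0.75, 0.25],
     [0.25, 0.75]]
B = [[1.0], [1.0]]          # the channel that reveals nothing
print(avg_refines(A, B))    # True: merging A's outputs is a post-processing
print(avg_refines(B, A))    # False: A leaks, the null channel does not
```

When the LP is infeasible, the dual solution is the starting point for extracting a counter-example gain function, in the spirit of the verification methods the paper discusses later.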

Max-Case Refinement
Although ⊑^avg and ⊑_G^avg provide a strong and precise way of comparing systems, one could argue that average-case vulnerability might underestimate the threat of a system. More precisely, imagine that there is a certain observation y such that the corresponding posterior δ_y is highly vulnerable (e.g., the adversary can completely infer the real secret), but y happens with very small probability a_y.
In this case, the average-case posterior vulnerability V[π, C] can be relatively small, although V(δ y ) is large for that particular y.
If such a scenario is considered problematic, we can naturally quantify leakage using a max-case variant of posterior vulnerability (sometimes called "worst-case", although the latter term is more ambiguous, since "worst" can refer to a variety of factors), where all observations are treated equally regardless of their probability of being produced:

V_max[π, C] = max_y V(δ_y).

Under this definition, it can be shown [9] that V has to be quasi-convex on π (instead of convex) in order to satisfy fundamental properties (such as the data-processing inequality). Hence, in the max-case, we no longer restrict to g-vulnerabilities (which are always convex), but we can use any vulnerability V: QX, where QX denotes the set of all continuous quasi-convex functions DX → R≥0.
Inspired by ⊑_G^avg, we can now define a corresponding max-case leakage order.
Definition 1. The max-case leakage order is defined as

A ⊑_Q^max B iff V_max[π, B] ≤ V_max[π, A] for all V: QX, π: DX.

Similarly to its average-case variant, ⊑_Q^max provides clear privacy guarantees by explicitly requiring that B leaks no more than A for all adversaries (modelled as a vulnerability V). However, this explicit quantification makes the order hard to reason about and verify. We would thus like to characterize ⊑_Q^max by a refinement order that depends only on the structure of the two channels. Given a channel C from X to Y, we denote by C̃ the channel obtained by normalizing C's columns (if a column consists of only zeroes, it is simply removed) and then transposing:

C̃_{y,x} = C_{x,y} / ∑_{x′} C_{x′,y}.

Note that the row y of C̃ can be seen as the posterior distribution δ_y obtained by C under the uniform prior. Note also that C̃ is non-negative and its rows sum up to 1, so it is a valid channel from Y to X. The average-case refinement order required that B can be obtained by post-processing A. We define the max-case refinement order by requiring that B̃ can be obtained by pre-processing Ã.

Definition 2. The max-case refinement order is defined as A ⊑^max B iff R Ã = B̃ for some channel R.
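The max-case order admits the same linear-programming treatment as the average case: compute Ã and B̃ by normalizing columns and transposing, then ask whether a stochastic R with R Ã = B̃ exists. The code below is our own sketch (NumPy and SciPy assumed; the names tilde and max_refines are ours).

```python
import numpy as np
from scipy.optimize import linprog

def tilde(C, tol=1e-12):
    """Normalize the columns of C (dropping all-zero columns) and
    transpose: row y of tilde(C) is the posterior delta_y under a
    uniform prior."""
    C = np.asarray(C, float)
    cols = C.sum(axis=0)
    keep = cols > tol
    return (C[:, keep] / cols[keep]).T

def max_refines(A, B):
    """Structural max-case refinement: tilde(B) = R tilde(A) for some
    channel R, i.e., tilde(B) is a pre-processing of tilde(A)."""
    At, Bt = tilde(A), tilde(B)
    p, q, n = Bt.shape[0], At.shape[0], At.shape[1]
    # Unknowns: R[i][j] = x[i*q + j] >= 0, R row-stochastic, R At = Bt.
    A_eq = np.zeros((p * n + p, p * q))
    b_eq = np.zeros(p * n + p)
    for i in range(p):
        for k in range(n):
            for j in range(q):
                A_eq[i * n + k, i * q + j] = At[j, k]
            b_eq[i * n + k] = Bt[i, k]
        A_eq[p * n + i, i * q:(i + 1) * q] = 1
        b_eq[p * n + i] = 1
    res = linprog(np.zeros(p * q), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (p * q), method="highs")
    return res.status == 0

A = [[0.75, 0.25],
     [0.25, 0.75]]
N = [[1.0], [1.0]]        # the null channel
print(max_refines(A, N))  # True: the null channel is max-case safer
print(max_refines(N, A))  # False
```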
Our goal now is to show that ⊑_Q^max and ⊑^max are different characterizations of the same order. To do so, we start by giving a "semantic" characterization of ⊑^max, that is, expressing it not in terms of the channel matrices A and B, but in terms of the posterior distributions that they produce. Thinking of [π, C] as a ("hyper") distribution on the posteriors produced by π and C, its support supp [π, C] is the set of all posteriors produced with non-zero probability. We also denote by ch S the convex hull of S.

Theorem 1. If A ⊑^max B, then, for all priors π:

supp [π, B] ⊆ ch supp [π, A].    (A1)

Moreover, if (A1) holds and π is full support, then A ⊑^max B.
Theorem 2. The orders ⊑^max and ⊑_Q^max coincide.
Similarly to the average case, A ⊑^max B gives us a strong leakage guarantee: we can safely replace A by B, knowing that, for any adversary, the max-case leakage of B can be no larger than that of A. Moreover, in case A ⋢^max B, we can always find an adversary, modelled by a vulnerability function V, who prefers (wrt the max-case) interacting with B rather than with A. Such a function is discussed in Section 5.
Finally, we resolve the question of how ⊑^max and ⊑^avg are related.
Theorem 3. ⊑^avg is strictly stronger than ⊑^max.
This result might appear counter-intuitive at first; one might expect A ⊑^max B to imply A ⊑^avg B. To understand why it does not, note that the former only requires that, for each output y_B of B, there exists some output y_A of A that is at least as vulnerable, regardless of how likely y_A and y_B are to happen (this is the max-case, after all). We illustrate this with the following example.

Example 6. Consider the following systems:

A    y1   y2   y3   y4        B    y1   y2   y3
x1   3/4  0    1/4  0         x1   1/2  0    1/2
x2   3/4  1/4  0    0         x2   1/2  1/2  0
x3   0    1/4  1/4  1/2       x3   0    1/2  1/2

Under the uniform prior, the y1, y2, y3 posteriors for both channels are the same, namely (1/2, 1/2, 0), (0, 1/2, 1/2) and (1/2, 0, 1/2), respectively. Thus, the knowledge that can be obtained by each output of B can also be obtained by some output of A (albeit with a different probability). Hence, from Theorem A1, we get that A ⊑^max B. However, we can check (see Section 5.1) that B cannot be obtained by post-processing A, that is, A ⋢^avg B.
The other direction might also appear tricky: if B leaks no more than A in the average case, it must also leak no more than A in the max case. The quantification over all gain functions in the average case is powerful enough to "detect" differences in max-case leakage. The above result also means that ⊑^avg can be useful even if we are interested in the max-case, since it gives us ⊑^max for free.

Privacy-Based Refinement
Thus far, we have compared systems based on their (average-case or max-case) leakage. In this section, we turn our attention to the model of differential privacy and discuss new ways of ordering mechanisms based on that model.

Differential Privacy and d-Privacy
Differential privacy relies on the observation that some pairs of secrets need to be indistinguishable from the point of view of the adversary in order to provide some meaningful notion of privacy; for instance, databases differing in a single individual should not be distinguishable, otherwise the privacy of that individual is violated. At the same time, other pairs of secrets can be allowed to be distinguishable in order to provide some utility; for instance, distinguishing databases differing in many individuals allows us to answer a statistical query about those individuals.
This idea can be formalized by a distinguishability metric d. (To be precise, d is an extended pseudo-metric, namely one in which distinct secrets can have distance 0, and distance +∞ is allowed.) Intuitively, d(x, x′) models how distinguishable we allow these secrets to be. A value 0 means that we require x and x′ to be completely indistinguishable to the adversary, while +∞ means that she can distinguish them completely.
In this context, a mechanism is simply a channel (the two terms will be used interchangeably), mapping secrets X to some observations Y. Denote by MX the set of all metrics on X. Given d ∈ MX, we say that a mechanism C satisfies d-privacy iff

C_{x,y} ≤ e^{d(x,x′)} C_{x′,y}   for all x, x′ ∈ X, y ∈ Y.

Intuitively, this definition requires that the closer x and x′ are (as measured by d), the more similar (probabilistically) the outputs of the mechanism on these secrets should be.

Remark 1.
Note that the definition of d-privacy given in (1) is slightly different from the one above because of the presence of ε in the exponent. Indeed, it is common to scale d by a privacy parameter ε ≥ 0, in which case d can be thought of as the "kind" and ε as the "amount" of privacy. In other words, the structure determined by d on the data specifies how we want to distinguish each pair of data, and ε specifies (uniformly) the degree of the distinction. Note that ε • d is itself a metric, so the two definitions are equivalent.
Using a generic metric d in this definition allows us to express different scenarios, depending on the domain X on which the mechanism is applied and the choice of d. For instance, in the standard model of differential privacy, the mechanism is applied to a database x (i.e., X is the set of all databases), and produces some observation y (e.g., a number). The Hamming metric d_H, defined as the number of individuals in which x and x′ differ, captures standard differential privacy.

Oblivious Mechanisms
In the case of an oblivious mechanism, a query f : X → Y is first applied to the database x, and a noise mechanism H from Y to Z is applied to y = f(x), producing an observation z. In this case, it is useful to study the privacy of H wrt some metric d_Y on Y. Then, to reason about the d_X-privacy of the whole mechanism H • f, we can first compute the sensitivity of f wrt d_X, d_Y:

∆f = max_{x ≠ x′} d_Y(f(x), f(x′)) / d_X(x, x′)

and then use the following property [3]: if H satisfies ε • d_Y-privacy, then H • f satisfies (ε • ∆f) • d_X-privacy. For instance, the geometric mechanism G_ε satisfies ε • d_E-privacy (where d_E denotes the Euclidean metric), hence it can be applied to any numeric query f; the resulting mechanism G_ε • f satisfies (ε • ∆f) • d_H-privacy. The sensitivity wrt the Hamming and Euclidean metrics reduces to ∆f = max over adjacent x, x′ (i.e., d_H(x, x′) = 1) of |f(x) − f(x′)|.
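The sensitivity computation can be sketched as follows (plain Python; the counting query and the metrics in the example are our own illustration, not taken from the paper):

```python
def sensitivity(f, xs, dX, dY):
    """Delta-f = max over x != x' of dY(f(x), f(x')) / dX(x, x')."""
    return max(
        dY(f(x1), f(x2)) / dX(x1, x2)
        for x1 in xs for x2 in xs if dX(x1, x2) > 0
    )

# Illustration (ours): a counting query over 3-bit datasets, with the
# Hamming metric on inputs and the Euclidean metric on outputs.
hamming = lambda a, b: sum(u != v for u, v in zip(a, b))
euclid = lambda y1, y2: abs(y1 - y2)
datasets = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
count = sum  # f(x) = number of 1s in the dataset
# sensitivity(count, datasets, hamming, euclid) == 1: changing k records
# changes the count by at most k.
```

With ∆f = 1, an ε • d_E-private noise mechanism applied to this query yields an ε • d_H-private oblivious mechanism.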

Applying Noise to the Data of a Single Individual
There are also scenarios in which a mechanism C is applied directly to the data of a single individual (that is, X is the set of possible values). For instance, in the local model of differential privacy [2], the value of each individual is obfuscated before being sent to an untrusted curator. In this case, C should satisfy d_D-privacy, where d_D is the discrete metric, since any change in the individual's value, however large, should have only a bounded effect on the output distribution.
Moreover, in the context of location-based services, a user might want to obfuscate his location before sending it to the service provider. In this context, it is natural to require that locations that are geographically close are indistinguishable, while far-away ones are allowed to be distinguished (in order to provide the service). In other words, we wish to provide d_E-privacy, for the Euclidean metric on R², called geo-indistinguishability in [17].
Given a metric d, denote by Priv_d(C) the privacy level of C wrt d, namely the smallest ε ≥ 0 such that C satisfies ε • d-privacy; note that Priv_d(C) = +∞ iff there is no such ε. We then order mechanisms by setting A prv B iff Priv_d(A) ≥ Priv_d(B) for every metric d ∈ MX. Similarly to the other leakage orders, the drawback of prv is that it quantifies over an uncountable family of metrics. As a consequence, our first goal is to characterize it as a property of the channel matrix alone, which would make it much easier to reason about or verify.
To do so, we start by recalling an alternative way of thinking about d-privacy. Consider the multiplicative total variation distance between probability distributions μ, μ′ ∈ DY, defined as:

tv⊗(μ, μ′) = sup_y |ln μ(y) − ln μ′(y)|

We can think of C as a function X → DY, mapping every x to the distribution C_{x,−}. Then, we introduce the concept of the distinguishability metric d_C ∈ MX induced by the channel C, defined as

d_C(x, x′) = tv⊗(C_{x,−}, C_{x′,−})

Intuitively, d_C(x, x′) expresses exactly how much the channel distinguishes (wrt tv⊗) the secrets x, x′. It is easy to see that d_C is the smallest metric for which C is private; in other words, for any d, C satisfies d-privacy iff d ≥ d_C. The order prv can then be characterized by comparing the induced metrics: A prv B iff d_A ≥ d_B. We now turn our attention to the question of how these orders relate to each other. Theorem 4. max is strictly stronger than prv, which is strictly stronger than prv_d.
The fact that max is stronger than prv is due to the fact that Priv_d can be seen as a max-case information leakage, for a properly constructed vulnerability function V_d. This is discussed in detail in Section 4.7. This implication means that avg and max can be useful even if we "only" care about d-privacy.
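The induced metric d_C and the resulting comparison of channels can be sketched as follows (Python/NumPy; the function names are ours):

```python
import numpy as np

def tv_mult(mu1, mu2):
    """Multiplicative total variation: sup_y |ln mu1(y) - ln mu2(y)|."""
    d = 0.0
    for p, q in zip(mu1, mu2):
        if p == 0 and q == 0:
            continue                # both zero: no contribution
        if p == 0 or q == 0:
            return float("inf")     # one-sided zero: fully distinguishing
        d = max(d, abs(np.log(p) - np.log(q)))
    return d

def induced_metric(C):
    """d_C(x, x') = tv_mult applied to the rows of the channel matrix C."""
    n = C.shape[0]
    return np.array([[tv_mult(C[x1], C[x2]) for x2 in range(n)]
                     for x1 in range(n)])

def prv_refines(A, B):
    """A prv B iff d_A >= d_B pointwise."""
    return bool(np.all(induced_metric(A) >= induced_metric(B) - 1e-9))
```

For example, a binary randomized response with row ratio 4 induces the metric log 4, and it is prv-refined by one with row ratio 2 (induced metric log 2), but not vice versa.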

Application to Oblivious Mechanisms
We conclude the discussion on privacy-based refinement by showing the usefulness of our strong prv order in the case of oblivious mechanisms. Theorem 5. Let f : X → Y be any query and A, B be two mechanisms on Y. If A prv B, then A • f prv B • f.
This means that replacing A by B in the context of an oblivious mechanism is always safe, regardless of the query (and its sensitivity) and regardless of the metric by which the privacy of the composed mechanism is evaluated.
Assume, for instance, that we care about standard differential privacy, and we have properly constructed A such that A • f satisfies ε-differential privacy for some ε. If we know that A prv B (several such cases are discussed in Section 6), we can replace A by B without even knowing what f does. The mechanism B • f is guaranteed to also satisfy ε-differential privacy.
Note also that the above theorem fails for the weaker order prv_d: the fact that B improves on A wrt one particular metric d does not, in general, tell us anything about B • f wrt some other metric of interest d_X ∈ MX. It is possible that replacing A by B in that case is not safe (one would need to re-examine the behavior of B, and possibly reconfigure it to the sensitivity of f).
Table 3 summarizes the relations between the various orderings.

Privacy as Max-Case Capacity
In this section, we show that d-privacy can be expressed as a (max-case) information leakage. Note that this provides an alternative proof that max is stronger than prv. We start by defining a suitable vulnerability function V_d; note the difference between V_d(π) (a vulnerability function on distributions) and Priv_d(C) (a "leakage" measure on channels).
A fundamental notion in QIF is that of capacity: the maximization of leakage over all priors. It turns out that, for V_d, the capacity-realizing prior is the uniform one. In the following, L^{+,max}_d denotes the additive max-case d-leakage, and ML^{+,max}_d denotes the additive max-case d-capacity. Theorem 6. ML^{+,max}_d is always achieved on a uniform prior π_u. This finally brings us to our goal of expressing Priv_d in terms of information leakage (for a proper vulnerability function).

Verifying the Refinement Orders
We now turn our attention to the problem of checking whether the various orders hold, given explicit representations of two channels A and B (in terms of their matrices). We show that, for all orders, this question can be answered in time polynomial in the size of the matrices. Moreover, when one of the orders fails, we discuss how to obtain a counter-example (e.g., a gain function g or a vulnerability function V) demonstrating this fact. All the methods discussed in this section have been implemented in a publicly available library, and have been used in the experimental results of Section 6.

Average-Case Refinement
Verifying A avg B can be done in polynomial time (in the size of A, B) by solving the system of equations AR = B, with variables R, under the linear constraints that R is a channel matrix (non-negative, with rows summing to 1). However, if the system has no solution (i.e., A avg B does not hold), this method does not provide us with a counter-example gain function g.
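As an illustration of this check in the special case where A is square and invertible (an assumption we make for the sketch; the general case requires solving the constrained linear system just described), the remainder R is uniquely determined:

```python
import numpy as np

def avg_refines_square(A, B, tol=1e-9):
    """Check A avg B for square, invertible A.

    Here A R = B forces R = A^{-1} B, so the refinement holds iff that
    unique R is a valid channel (non-negative, rows summing to 1)."""
    R = np.linalg.solve(A, B)
    return bool(np.all(R >= -tol) and np.allclose(R.sum(axis=1), 1.0))
```

For instance, a binary channel with row ratio 2 is avg-refined by a noisier one with row ratio 11/9, but not the other way around, since the unique candidate R in the reverse direction has negative entries.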
We now show that there is an alternative efficient method: define C↑ = {CR | R is a channel}, the set of all channels obtainable by post-processing C. The idea is to compute the projection of B on A↑. Clearly, the projection is B itself iff A avg B; otherwise, the projection can be directly used to construct a counter-example g. Theorem 8. Let B* be the projection of B on A↑.

1.
If B = B*, then A avg B.

2.
Otherwise, let G = B − B*. The gain function g(w, x) = G_{x,w} provides a counter-example to A avg B, that is, V_g(π_u, A) < V_g(π_u, B) for the uniform prior π_u.
Since ‖x − y‖²₂ = xᵀx − 2xᵀy + yᵀy, the projection of y onto a convex set {x | Ax ≤ b} can be written as min_x xᵀx − 2xᵀy subject to Ax ≤ b. This is a quadratic program whose matrix Q is the identity, which is positive definite; hence, it can be solved in polynomial time.
Note that the proof that the gain-function-based order is stronger than the factorization-based order avg (the "coriaceous" theorem of [7]) uses the hyperplane-separation theorem to show the existence of a counter-example g when A avg B fails. The above technique essentially computes such a separating hyperplane.

Max-Case Refinement
Similarly to avg, we can verify A max B directly from its definition, by solving the system R Ã = B under the constraint that R is a channel.
In contrast to avg, when A max B fails, the proof of Theorem 2 directly gives us a counter-example vulnerability function, defined in terms of the set S = ch supp [π, A], where π is any full-support prior. For this vulnerability function, it holds that V_max(π, A) < V_max(π, B).

Privacy-Based Refinement
The prv order can be verified directly from its definition, by checking that d_A ≥ d_B. This can be done in time O(|X|²|Y|), by computing tv⊗(C_{x,−}, C_{x′,−}) for each pair of secrets. If A prv B does not hold, then d = d_B provides an immediate counter-example metric, since B satisfies d_B-privacy but A does not.

Application: Comparing DP Mechanisms
In differential privacy, it is common to compare the privacy guarantees provided by different mechanisms by 'comparing the epsilons'. However, it is interesting to ask to what extent ε-equivalent mechanisms are comparable wrt the other leakage measures defined here, or whether reducing ε in a mechanism also corresponds to a refinement of it. This could be useful if, for example, it is important to understand the privacy properties of a mechanism with respect to any max-case leakage measure, and not just the DP measure given by ε.
Since the ε-based order given by prv_d is (strictly) the weakest of the orders considered here, it cannot be the case that we always get a refinement (wrt the other orders). However, it may be true that, for particular families of mechanisms, some (or all) of the refinement orders hold.
We investigate three families of mechanisms commonly used in DP or LDP: geometric, exponential and randomized response mechanisms.

Preliminaries
We define each family of mechanisms in terms of their channel construction. We assume that mechanisms operate on a set of inputs (denoted by X) and produce a set of outputs (denoted by Y).
In this sense, our mechanisms can be seen as oblivious (as in standard DP) or as LDP mechanisms. (We use the term 'mechanism' in either sense.) We denote by M_ε a mechanism parametrized by ε, where ε is defined to be the same as Priv_d(M). For the purposes of comparison, we make sure that we use the best possible ε for each mechanism. In order to compare mechanisms, we restrict our input and output domains of interest to sequences of non-negative integers. We assume X, Y are finite unless specified otherwise. In addition, as we are operating in the framework of d-privacy, it is necessary to provide an appropriate metric over X; here, it makes sense to use the Euclidean metric d_E.
Definition 9. A geometric mechanism is a channel (X, Z, G_ε), parametrized by ε ≥ 0, constructed as follows:

G_ε x,y = ((1 − α)/(1 + α)) α^{d_E(x,y)}

where α = e^{−ε} and d_E(x, y) = |x − y|. Such a mechanism satisfies ε • d_E-privacy.
In practice, the truncated geometric mechanism is preferred to the infinite geometric one. We define the truncated geometric mechanism as follows.
Definition 10. A truncated geometric mechanism is a channel (X, Y, TG_ε), parametrized by ε ≥ 0 with X ⊆ Y, obtained from the geometric mechanism by summing, into each of the two extreme outputs of Y, all the probability mass lying beyond it; here α = e^{−ε} and d_E(x, y) = |x − y|. Such a mechanism satisfies ε • d_E-privacy.
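A sketch of the truncated construction, under our reading of Definition 10 with X = Y = {0, …, n−1} (NumPy; the function name is ours):

```python
import numpy as np

def truncated_geometric(eps, n):
    """Truncated geometric channel on inputs/outputs {0, ..., n-1}.

    Interior outputs keep the geometric mass (1-a)/(1+a) * a^|x-y|;
    the two extreme outputs absorb the tails: a^|x-y| / (1+a)."""
    a = np.exp(-eps)
    C = np.empty((n, n))
    for x in range(n):
        for y in range(n):
            if y in (0, n - 1):
                C[x, y] = a ** abs(x - y) / (1 + a)
            else:
                C[x, y] = (1 - a) / (1 + a) * a ** abs(x - y)
    return C
```

One can check that each row sums to 1 (the absorbed tails are geometric series) and that adjacent rows differ by a factor of at most e^ε, as required by ε • d_E-privacy.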
It is also possible to define the 'over-truncated' geometric mechanism, whose input space is not entirely included in the output space. Definition 11. An over-truncated geometric mechanism is a channel (X, Y, OTG_ε), parametrized by ε ≥ 0 with X ⊄ Y, constructed as follows: 1.
Start with the truncated geometric mechanism (X , X ∪ Y, TG ε ).

2.
Sum up the columns at each end until the output domain is reached.
Such a mechanism satisfies ε • d E -privacy.
For example, the set of inputs to an over-truncated geometric mechanism could be the integers in the range [0..100], but the output space may have a range of [0..50] or perhaps [−50..50]. In either of these cases, the mechanism has to 'over-truncate' the inputs to accommodate the output space.
We remark that we do not consider the over-truncated mechanism particularly useful in practice. However, we provide results on this mechanism for completeness, since its construction is possible, if unusual. Definition 12. An exponential mechanism is a channel (X, Y, E_ε), parametrized by ε ≥ 0, constructed as follows:

E_ε x,y = λ_x e^{−(ε/2) d_E(x,y)}

where the λ_x are normalizing constants ensuring ∑_y E_ε x,y = 1. Such a mechanism satisfies α • d_E-privacy for some α ≥ ε/2 (which can be calculated exactly from the channel construction).
Note that the construction presented in Definition 12 uses the Euclidean metric, since we only consider integer domains. The general construction of the exponential mechanism uses arbitrary domains and arbitrary metrics. Note that its parameter ε does not correspond to the true (best-case) ε-DP guarantee that it provides. We will denote by E_ε the exponential mechanism with 'true' privacy parameter ε, rather than the reported one, as our intention is to capture the privacy guarantee provided by the channel in order to make reasonable comparisons.
Definition 13. A randomized response mechanism is a channel (X, Y, R_ε), parametrized by ε ≥ 0, constructed as follows:

R_ε x,y = e^ε/(e^ε + n) if y = x, and 1/(e^ε + n) otherwise

where n = |Y| − 1 and d_D is the discrete metric (that is, d_D(x, x′) = 0 if x = x′ and 1 otherwise). Such a mechanism satisfies ε • d_D-privacy; we note that it also satisfies ε • d_E-privacy. Intuitively, the randomized response mechanism returns the true answer with high probability, and all other responses with equal probability. In the case where the input x lies outside Y (that is, in 'over-truncated' mechanisms), all of the outputs (corresponding to the outlying inputs) have equal probability.
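A sketch of the randomized response construction of Definition 13, assuming X = Y (NumPy; the function name is ours):

```python
import numpy as np

def randomized_response(eps, outputs):
    """R[x, y] = e^eps / (e^eps + n) if y == x, else 1 / (e^eps + n),
    with n = |Y| - 1, assuming the input and output spaces coincide."""
    m = len(outputs)
    e, n = np.exp(eps), m - 1
    R = np.full((m, m), 1.0 / (e + n))
    np.fill_diagonal(R, e / (e + n))
    return R
```

With ε = log 2 and two outputs, this yields the matrix with rows (2/3, 1/3) and (1/3, 2/3); the ratio between the diagonal and off-diagonal entries is always e^ε.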

Example 7.
The following are examples of each of the mechanisms described above, represented as channel matrices. For this example, we set ε = log 2 for the geometric and randomized response mechanisms, while, for the exponential mechanism, we use ε = log 4.
Note that the exponential mechanism here actually satisfies log(16/7) • d_E-privacy, even though it is specified by ε = log 4.
We now have three families of mechanisms which we can characterize by channels, and which satisfy ε • d_E-privacy. For the remainder of this section, we will refer only to the ε parameter and take d_E as given, as we wish to understand the effect of changing ε (for a fixed metric) on the various leakage measures.

Refinement Order within Families of Mechanisms
We first ask which refinement orders hold within a family of mechanisms. That is, when does reducing ε for a particular mechanism produce a refinement? Since we have the convenient order avg ⊂ max ⊂ prv, it is useful to first check whether avg holds, as we then get the other refinements 'for free'.
For the (infinite) geometric mechanism, we have the following result.
This means that reducing ε in an infinite geometric mechanism is safe against any adversary that can be modelled using, for example, max-case or average-case vulnerabilities.
For the truncated geometric mechanism, we get the same result.
However, the over-truncated geometric mechanism does not behave so well.
We can think of this last class of geometrics as 'skinny' mechanisms, that is, corresponding to a channel with a smaller output space than input space.
Intuitively, this theorem means that we can always find some (average-case) adversary who prefers the over-truncated geometric mechanism with the smaller ε.
We remark that the gain function we found can be easily calculated by treating the columns of channel A as vectors, and finding a vector orthogonal to both of them. This follows from the results in Section 5.1. Since the columns of A cannot span the space R³, it is always possible to find such a vector, and, when this vector is not orthogonal to the column space of B, it can be used to construct a gain function preferring B to A.
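The orthogonal-vector construction just described can be sketched as follows (NumPy; the SVD-based helper is our own illustration):

```python
import numpy as np

def orthogonal_gain(A, B, tol=1e-9):
    """Find a vector g orthogonal to the columns of A.

    The left null space of A is spanned by the columns of U (from the
    SVD A = U S V^T) beyond rank(A). If some null vector g is not
    orthogonal to the column space of B, it can be used as a gain
    function (a single guess w, with g(w, x) = g[x]) preferring B to A."""
    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    null_space = U[:, rank:]          # vectors orthogonal to col(A)
    for g in null_space.T:
        if not np.allclose(B.T @ g, 0, atol=tol):
            return g
    return None
```

For a 3 × 2 channel A, the null space is one-dimensional, so the candidate gain vector is unique up to scaling.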
Even though the avg refinement does not hold, we can check whether the other refinements are satisfied. Theorem 12. Let OTG_ε be an over-truncated geometric mechanism. Then, reducing ε does not produce a max refinement; however, it does produce a prv refinement.
This means that, although a smaller ε does not provide safety against all max-case adversaries, it does produce a safer mechanism wrt d-privacy for any choice of metric we like.
Intuitively, the prv order relates mechanisms based on how they distinguish inputs. Specifically, if A prv B, then, for any pair of inputs x, x′, the corresponding output distributions are 'further apart' in channel A than in channel B, and thus the inputs are more distinguishable using channel A. When prv fails to hold, it means that some inputs are more distinguishable in A than in B, and vice versa. This means that an adversary who is interested in distinguishing some particular pair of inputs would prefer one mechanism to the other.
We now consider the exponential mechanism. In this case, we do not have a theoretical result but, experimentally, the exponential mechanism appears to respect refinement, so we present the following conjecture.
Conjecture 1. Let E_ε be an exponential mechanism. Then, decreasing ε in E_ε produces a refinement. Finally, we consider the randomized response mechanism.
In conclusion, we can say that, in general, the usual DP families of mechanisms are 'well-behaved' wrt all of the refinement orders. This means that it is safe (wrt any adversary we model here) to replace a mechanism from a particular family with another mechanism from the same family with a lower ε.

Refinement Order between Families of Mechanisms
Now, we explore whether it is possible to compare mechanisms from different families. We first ask: can we compare mechanisms that have the same ε? We assume that the input and output domains are the same, and the intention is to decide whether to replace one mechanism with another. Theorem 14. Let R be a randomized response mechanism, E an exponential mechanism and TG a truncated geometric mechanism. Then, TG_ε prv R_ε and TG_ε prv E_ε. However, prv does not hold between E_ε and R_ε.
Proof. We present a counter-example showing that neither E_ε prv R_ε nor R_ε prv E_ε holds. The remainder of the proof is in Appendix D.
Intuitively, the randomized response mechanism maintains the same (ε) distinguishability level between all inputs, whereas the exponential mechanism causes some inputs to be less distinguishable than others. This means that, for the same (true) ε, an adversary who is interested in certain inputs could learn more from the randomized response than from the exponential. In the above counter-example, the points x_2, x_3 in the exponential mechanism (channel A) are less distinguishable than the corresponding points in the randomized response mechanism (channel B).
As an example, let's say the mechanisms are to be used in geo-location privacy, and the inputs represent adjacent locations (such as addresses along a street). Then, an adversary (your boss) may be interested in how far you are from work, and therefore wants to be able to distinguish between points distant from x_1 (your office) and points within the vicinity of your office, without requiring your precise location. Your boss chooses channel A as the most informative. However, another adversary (your suspicious partner) is more concerned about where exactly you are, and is particularly interested in distinguishing between your expected position (x_2, the boulangerie) and your suspected position (x_3, the brothel). Your partner chooses channel B as the most informative.
Regarding the other refinements, we find (experimentally) that, in general, they do not hold between families of mechanisms. (Recall that we only need to produce a single counter-example to show that a refinement does not hold, and this can be done using the methods presented in Section 5.) We next check what happens when we compare mechanisms with different epsilons. We note the following. Theorem 15. Let M_1, M_2 be any of our (truncated geometric, randomized response, exponential) mechanisms. If M_2 with parameter ε refines M_1 wrt any of our refinements (avg, max, prv), then so does M_2 with any parameter ε′ ≤ ε. Proof. This follows directly from the transitivity of the refinement relations and from our results on refinement within families of mechanisms. (We recall, however, that our result for the exponential mechanism is only a conjecture.) This tells us that, once we have a refinement between mechanisms, it continues to hold for reduced ε in the refining mechanism.
Corollary 1. Let G, TG, R, E be the geometric, truncated geometric, randomized response and exponential mechanisms, respectively. Then, for all ε′ ≤ ε, we have that TG_ε prv R_ε′ and TG_ε prv E_ε′.
Thus, it is safe to 'compare epsilons' wrt prv if we want to replace a geometric mechanism with either a randomized response or an exponential mechanism. (As with the previous theorem, the results for the exponential mechanism rely on Conjecture 1, which is assumed in the statement of this corollary.) What this means is that if, for example, we have a truncated geometric mechanism TG that operates on databases with distance measured using the Hamming metric d_H and satisfies ε • d_H-privacy, then any randomized response mechanism R parametrized by ε′ ≤ ε will also satisfy ε • d_H-privacy. Moreover, if we decide we would rather use the Manhattan metric d_M to measure distance between the databases, then we only need to check that TG also satisfies ε • d_M-privacy, as this implies that R will too.
The following Tables 4-6 summarize the refinement relations with respect to the various families of mechanisms.
Comparing different families of mechanisms with respect to the different refinements under the same ε.

Refinements across Families with Same ε
Comparing different families of mechanisms with differing ε. We recall that the results in the grey cells are based on Conjecture 1.

Asymptotic Behavior
We now consider the behavior of the relations when ε approaches 0, which represents the absence of leakage. We start with the following result: Theorem 16. Every (truncated geometric, randomized response, exponential) mechanism is 'the safest possible mechanism' when parametrized by ε = 0. That is, L_ε avg M_0 for all mechanisms L, M (possibly from different families) and ε > 0.
While this result may be unsurprising, it means that refinement must eventually occur as we reduce ε. It is interesting, then, to ask just when this refinement occurs. We examine this question experimentally, by considering different mechanisms and investigating for which values of ε average-case refinement holds. For simplicity of presentation, we show results for 5 × 5 matrices, noting that we observed similar results across different matrix dimensions, at least wrt the coarse-grained comparison of plots that we do here. The results are plotted in Figure 3.
The plots show the relationship between ε_1 (x-axis) and ε_2 (y-axis), where ε_1 parametrizes the mechanism being refined and ε_2 parametrizes the refining mechanism. For example, the blue line on the top graph represents TG_ε1 avg E_ε2. We fix ε_1 and ask for what value of ε_2 we get an avg refinement. Notice that the line ε_1 = ε_2 corresponds to the same mechanism on both axes (since every mechanism refines itself). We can see that refining the randomized response mechanism requires much smaller values of epsilon in the other mechanisms. For example, from the middle graph, we can see that R_4 avg TG_1 (approximately) whereas, from the top graph, we have TG_4 avg R_4. This means that the randomized response mechanism is very 'safe' against average-case adversaries compared with the other mechanisms, as it is much more 'difficult' to refine.
We also notice that, for 'large' values of ε_1, the exponential and geometric mechanisms refine each other for approximately the same ε_2 values. This suggests that, for these values, the epsilons are comparable (that is, the mechanisms are equally 'safe' for similar values of ε). However, smaller values of ε_1 require a (relatively) large reduction in ε_2 to obtain a refinement.

Discussion
At the beginning of this section, we asked whether it is safe to compare differential privacy mechanisms by 'comparing the epsilons'. We found that it is safe to compare epsilons within families of mechanisms (except in the unusual case of the over-truncated geometric mechanism). However, when comparing different mechanisms, it is not safe to just compare the epsilons, since none of the refinements hold in general. Once a 'safe' pair of epsilons has been calculated, then reducing epsilon in the refining mechanism is always safe. However, computing safe epsilons relies on the ability to construct a channel representation, which may not always be feasible.

Lattice Properties
The orders avg, max and prv are all reflexive and transitive (i.e., preorders), but not anti-symmetric (i.e., not partial orders). This is due to the fact that there exist channels that have "syntactic" differences but the same semantics; e.g., two channels that differ only by swapping columns. However, if we are only interested in a specific type of leakage, then all channels that refine each other (wrt any one of avg, max, prv) have identical leakage, so we can view them as the "same channel" (either by working on the induced equivalence classes or by writing all channels in some canonical form).
Viewing each refinement now as a partial order, the natural question is whether it forms a lattice, that is, whether suprema and infima exist. If it exists, the supremum A ∨ B has an interesting property: it is the "least safe" channel that is safer than both A and B (any channel C refining both A and B necessarily refines A ∨ B). If we wanted a channel that is safer than both A and B, A ∨ B would be a natural choice.
In this section, we briefly discuss this problem and show that, in contrast to avg, both max and prv do have suprema and infima (i.e., they form lattices).

Average-Case Refinement
In the case of avg, "equivalent" channels are those producing the exact same hypers. However, even if we identify such channels, it is known [7] that two channels A, B do not necessarily have a least upper bound wrt avg; hence, avg does not form a lattice.

Max-Case Refinement
In the case of max, "equivalent" channels are those producing the same posteriors (or, more generally, the same convex hull of posteriors). However, in contrast to avg, if we identify such channels, that is, if we represent a channel only by the convex hull of its posteriors, then max becomes a lattice.
First, note that, given a finite set of posteriors {δ^y}_y and coefficients a_y such that π = ∑_y a_y δ^y (i.e., π ∈ ch {δ^y}_y), it is easy to construct a channel C producing each posterior δ^y with output probability a_y: it suffices to take C_{x,y} := δ^y_x a_y / π_x. Thus, A ∨max B can be simply constructed by taking the intersection of the convex hulls of the posteriors of A and B. This intersection is itself a convex polytope, so it has finitely many extreme points, and we can construct A ∨max B as the channel having exactly those as posteriors. A ∧max B, on the other hand, can be constructed as the channel having as posteriors the union of those of A and B.
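The construction C_{x,y} = δ^y_x a_y / π_x can be sketched as follows (NumPy; the function name is ours):

```python
import numpy as np

def channel_from_posteriors(posteriors, weights):
    """Build a channel producing posterior delta^y with probability a_y.

    posteriors: list of posterior distributions delta^y (one per output);
    weights:    outer probabilities a_y, summing to 1.
    The implied prior is pi = sum_y a_y * delta^y, and the construction
    is C[x, y] = delta^y[x] * a_y / pi[x]."""
    P = np.array(posteriors, dtype=float).T      # shape (|X|, |Y|)
    a = np.array(weights, dtype=float)
    pi = P @ a                                   # pi_x = sum_y a_y delta^y_x
    return P * a / pi[:, None]
```

By construction, pushing the prior π through the resulting channel reproduces exactly the given posteriors with the given outer probabilities.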
Note that computing the intersection of polytopes is NP-hard in general [21], so A ∨max B might be hard to construct. However, efficient special cases do exist [22]; we leave the study of the hardness of ∨max as future work.

Privacy-Based Refinement
In the case of prv, "equivalent" channels are those producing the same induced metric, i.e., d_A = d_B.
Representing channels only by their induced metric, we can use the fact that MX forms a lattice under ≥. We first show that any metric can be turned into a corresponding channel. Then, A ∨prv B is simply the channel whose metric is d_A ∨ d_B, where ∨ is the supremum in the lattice of metrics, and similarly for ∧prv.
Note that the infimum of two metrics d_1, d_2 is simply their max (which is always a metric). The supremum, however, is trickier, since the min of two metrics is not always a metric: the triangle inequality might be violated. Thus, we first take the min of d_1, d_2, and then compute its "triangle closure" by finding the shortest path between all pairs of elements, for instance using the well-known Floyd-Warshall algorithm.
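A sketch of this supremum computation (pointwise min followed by the Floyd-Warshall triangle closure; NumPy, function name ours):

```python
import numpy as np

def metric_sup(d1, d2):
    """Supremum of two metrics in the lattice: pointwise min of d1, d2,
    followed by the triangle (shortest-path) closure via Floyd-Warshall."""
    d = np.minimum(d1, d2).astype(float)
    n = d.shape[0]
    for k in range(n):
        # relax every pair (i, j) through the intermediate point k
        d = np.minimum(d, d[:, k:k+1] + d[k:k+1, :])
    return d
```

For example, if the pointwise min gives d(x_1, x_2) = d(x_2, x_3) = 1 but d(x_1, x_3) = 5, the closure repairs the triangle inequality by setting d(x_1, x_3) = 2.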

Conclusions
We have investigated various refinement orders for mechanisms for information protection, combining the max-case perspective typical of DP and its variants with the robustness of the QIF approach. We have provided structural characterizations of these preorders and methods to verify them efficiently. Then, we have considered various DP mechanisms, and investigated the relation between the ε-based measurement of privacy and our orders. We have shown that, while, within the same family of mechanisms, a smaller ε implies the refinement order, this is almost never the case for mechanisms belonging to different families.
Theorem A5. Let f : X → Y be any query and A, B be two mechanisms on Y. If A prv B, then A • f prv B • f.
Proof sketch: the induced metric of A • f satisfies d_{A•f}(x, x′) = d_A(f(x), f(x′)), and similarly for d_{B•f}; the claim then follows from d_A ≥ d_B. Theorem A6. ML^{+,max}_d is always achieved on a uniform prior π_u. Proof. Fix π, C, and let (a, δ^y) and (b, ρ^y) be the outers and inners of [π, C] and [π_u, C], respectively; the result follows by comparing the leakage under the two hypers.

Appendix D. Proofs of Results about Refinement Comparison
We call a truncated geometric mechanism 'square' if it has the same input and output space (that is, its channel representation is a square matrix).
We first show that geometric mechanisms and truncated geometric mechanisms are equivalent to square mechanisms under avg. Lemma A1. Let G_ε be a geometric mechanism. Then, the reduced (abstract) channel form of G_ε is the square channel (X, X, TG_ε).
Proof. First, note that the square channel is obtained from the (infinite) geometric by summing up all the 'extra' columns (i.e., those columns in Y \ X). Now, note that these 'extra' columns are scalar multiples of each other: the column for output y has entries proportional to α^{y−x}, so successive columns differ by a factor of α. Thus, the 'summing up' operation is a valid reduction operation, and so the infinite geometric is reducible to the corresponding square channel.
Lemma A2. Let TG_ε be a truncated geometric mechanism. Then, the reduced (abstract) channel form of TG_ε is the square channel (X, X, TG_ε).
Proof. First, note that the truncated geometric is obtained from the infinite geometric by summing up columns at the ends of the matrix. This is exactly the 'reduction' step noted above. We can continue, as above, to sum up 'extra' columns until we get a square matrix.
Corollary A1. Any avg refinement that holds for a square geometric mechanism (X, X, G_ε) also holds for any truncated geometric mechanism, or for the (infinite) geometric mechanism G_ε, having domain X.
Note that we only define truncation as far as the square geometric matrix, since at this point the columns of the matrix are linearly independent and can no longer be merged via matrix reduction operations. We now show that refinement holds for square geometric mechanisms. Lemma A3. Let TG_ε be a square geometric mechanism. Then, decreasing ε produces a refinement of it. That is, TG_ε avg TG_ε′ iff ε ≥ ε′.
Proof. The square geometric mechanism TG_ε has the form described above, with α = e^{−ε}, and similarly for TG_ε′ with α′ = e^{−ε′}. Now, this matrix is invertible. To show that TG_ε avg TG_ε′, i.e., that TG_ε R = TG_ε′ for some channel R, we can construct a suitable R using (TG_ε)^{−1}, namely R = (TG_ε)^{−1} • TG_ε′. It suffices to show that R is a valid channel. It is clear that the rows of R sum to 1, since it is the product of matrices whose rows sum to 1.
Multiplying out the matrix R yields entries expressed in terms of α = e^{−ε} and β = e^{−ε′}. The only way that any of these entries can be negative is if α > β, i.e., ε < ε′. Thus, R is a valid channel precisely when ε ≥ ε′, and so TG_ε avg TG_ε′, as required.
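The argument can be checked numerically. The following sketch (NumPy; the channel construction reflects our own reading of the square geometric definition) builds R = (TG_ε)^{−1} • TG_ε′ and tests whether it is a channel:

```python
import numpy as np

def square_geometric(eps, n):
    """Square (truncated) geometric channel on {0, ..., n-1}: interior
    outputs get (1-a)/(1+a) * a^|x-y|, the endpoints absorb the tails."""
    a = np.exp(-eps)
    C = np.empty((n, n))
    for x in range(n):
        for y in range(n):
            if y in (0, n - 1):
                C[x, y] = a ** abs(x - y) / (1 + a)
            else:
                C[x, y] = (1 - a) / (1 + a) * a ** abs(x - y)
    return C

def is_channel(R, tol=1e-9):
    return bool(np.all(R >= -tol) and np.allclose(R.sum(axis=1), 1.0))

# R = (TG_eps)^{-1} TG_eps' should be a valid channel exactly when eps >= eps'
A = square_geometric(2 * np.log(2), 4)   # eps  = 2 log 2
B = square_geometric(np.log(2), 4)       # eps' = log 2
```

Here `is_channel(np.linalg.solve(A, B))` witnesses the refinement, while the reverse direction produces a matrix with negative entries, in line with the "iff" of Lemma A3.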
The following theorems now follow from the previous lemmas.
Proof. Using Lemma A1, we can express G_ε as a square channel, and from Lemma A3 it follows that the refinement holds.
Theorem A12. Let OTG_ε be an over-truncated geometric mechanism. Then reducing ε does not produce a ⊑max refinement; however, it does produce a ⊑prv refinement.
Proof. We show the first part using a counter-example. Consider the following channels:

  A_ε       y1     y2         B_ε′      y1     y2
  x1       4/5    1/5         x1       2/3    1/3
  x2       1/5    4/5         x2       1/3    2/3
  x3      1/20  19/20         x3       1/6    5/6        (A23)

Channels A_ε and B_ε′ are over-truncated geometric mechanisms using ε = 2 log 2 and ε′ = log 2, respectively. We take the (prior) vulnerability V to be the usual (convex) g-vulnerability. Then, under a uniform prior π_u, the gain function given by

  g(w, x1) = 1/5        (A24)
  g(w, x2) = −1         (A25)
  g(w, x3) = 4/5        (A26)

yields V_max[π_u, A] = 0 and V_max[π_u, B] = 2/55. Thus, the refinement A ⊑max B does not hold.

For the second part, we first note that, for any square geometric channel A_ε, we have d_A(x, x′) = ε exactly when x, x′ are adjacent rows in the matrix (this can be seen from the construction of the square channel). Now, the over-truncated geometric is obtained by summing columns of the square geometric. By construction, the square geometric A has adjacent elements A_{x,y}, A_{x′,y} satisfying A_{x,y}/A_{x′,y} = e^ε when x > x′ and x is above (or on) the diagonal of the channel matrix; otherwise, A_{x,y}/A_{x′,y} = e^−ε. This means that each (over-)truncation step maintains the A_{x,y}/A_{x′,y} ratio, except when (x, y) and (x′, y) lie on diagonal elements, in which case the ratio of the summed entries lies between e^−ε and e^ε. Since this affects only two elements in each row, we still have that d_A(x, x′) = ε (until the final truncation step, which produces a single column of 1s). Therefore, since ⊑prv holds for the square matrix, and it holds under truncation, it must hold for over-truncated geometric mechanisms. Thus, reducing ε corresponds to refinement under ⊑prv, as required.
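The counter-example can be verified numerically. A sketch, under the assumption (consistent with the proof) that V_max is the max-case posterior g-vulnerability: evaluate the single-action gain function on each posterior distribution and take the maximum over outputs.

```python
import numpy as np

A = np.array([[4/5, 1/5], [1/5, 4/5], [1/20, 19/20]])
B = np.array([[2/3, 1/3], [1/3, 2/3], [1/6, 5/6]])
g = np.array([1/5, -1, 4/5])      # single-action gain g(w, x_i)
prior = np.full(3, 1/3)           # uniform prior pi_u

def v_max(pi, C, g):
    """Max-case posterior g-vulnerability: V_g evaluated on each
    posterior (column-normalised joint), maximised over outputs."""
    J = pi[:, None] * C                 # joint distribution
    posteriors = J / J.sum(axis=0)      # one posterior per column
    return max(post @ g for post in posteriors.T)

print(v_max(prior, A, g))   # ~ 0 (both posteriors of A score exactly 0)
print(v_max(prior, B, g))   # 2/55, from the second posterior of B
```

Since V_max[π_u, B] = 2/55 > 0 = V_max[π_u, A], the channel B is strictly more vulnerable, refuting A ⊑max B.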
We show that the randomized response mechanism behaves well with respect to ⊑avg by considering three cases.
Firstly, we consider the case where X = Y. We use the following lemmas:

Lemma A4. Let R_α, R_β be 'square' randomized response mechanisms. Then B = R_α · R_β is a randomized response mechanism with parameter

  ε = log( (e^{α+β} + k) / (e^α + e^β + k − 1) ),

where k + 1 is the dimension of R_α, R_β and B.
Proof. Observe that R_α can be factorised as 1/(e^α + k) · R, where R has the form:

  R        x1    x2   ...  x_{k+1}
  x1      e^α    1    ...    1
  x2       1    e^α   ...    1
  ...
  x_{k+1}  1     1    ...   e^α

(Note that the constant coefficient factored out the front does not affect the ε calculation for the channel.) This is exactly the randomized response mechanism required.

Moreover, any solution R for S_ε is a solution for R_ε. Thus, for the reverse direction, if ε ≥ ε′, we can always find an R such that R_ε ⊑avg R_ε′.
For the forward direction, we prove the contrapositive. Note that in this case we cannot assume a partial order relation for ⊑avg, since there may be columns of R_ε which are identical. If ε < ε′, we want to show that R_ε ⊑avg R_ε′ does not hold. To do this, we need to find a gain function and prior π such that V_g(π, R_ε) < V_g(π, R_ε′). The min-entropy leakage will do: this is simply the sum of the column maxima of the channel matrix. For a randomized response channel with dimensions a × (a + b), this sum is an increasing function of ε (for a ≥ 1); in fact, its derivative is strictly positive for a > 1. Hence ε < ε′ implies V(R_ε) < V(R_ε′), and thus R_ε ⊑avg R_ε′ does not hold.
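Both ingredients of the square case can be checked numerically. A sketch, assuming the standard square randomized response form (e^ε on the diagonal, 1 elsewhere, normalised rows); the composition formula is Lemma A4, and the monotone column-maxima sum is the quantity used in the forward direction.

```python
import numpy as np

def rr(eps, n):
    """Square randomized response channel on n elements: e^eps on the
    diagonal, 1 elsewhere, rows normalised by e^eps + n - 1."""
    M = np.full((n, n), 1.0)
    np.fill_diagonal(M, np.exp(eps))
    return M / (np.exp(eps) + n - 1)

# Lemma A4: composing two RR channels gives an RR channel with the
# stated parameter (here n = k + 1).
a, b, n = 0.7, 1.3, 6
k = n - 1
eps_comp = np.log((np.exp(a + b) + k) / (np.exp(a) + np.exp(b) + k - 1))
composed = rr(a, n) @ rr(b, n)

# Forward direction: the sum of column maxima (min-entropy vulnerability
# under a uniform prior, up to a 1/n factor) grows with eps.
sums = [rr(e, n).max(axis=0).sum() for e in (0.1, 0.5, 1.0, 2.0)]
```

For the square channel the column maxima are the diagonal entries, so the sum is n·e^ε/(e^ε + n − 1), which is strictly increasing in ε, matching the derivative claim in the proof.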
We can now conclude the following theorem from the main body of the paper:

Theorem A13. Let R_ε be a randomized response mechanism. Then decreasing ε in R produces a refinement. That is, R_ε ⊑avg R_ε′ iff ε ≥ ε′.
Theorem A14. Let R be a randomized response mechanism, E an exponential mechanism and TG a truncated geometric mechanism. Then TG_ε ⊑prv R_ε and TG_ε ⊑prv E_ε. However, ⊑prv does not hold between E_ε and R_ε.
Proof. We first show that TG_ε ⊑prv R_ε, which is equivalent to showing that d_R ≤ d_TG. For the geometric mechanism, we have d_TG(x, x′) = ε·d(x, x′) for all x, x′ ∈ X. For the randomized response mechanism, we have d_R(x, x′) = ε, or d_R(x, x′) = 0 (when x, x′ ∈ Y). Thus d_R ≤ d_TG, and so TG_ε ⊑prv R_ε. We now show TG_ε ⊑prv E_ε. Recall that we parametrize the exponential mechanism by the smallest possible ε such that it satisfies ε·d-privacy. In this case, we find that, for any pair x, x′, we have d_E(x, x′) ≤ ε·d(x, x′), whereas for the geometric mechanism we have d_TG(x, x′) = ε·d(x, x′). Therefore d_E ≤ d_TG, and so TG_ε ⊑prv E_ε.
The proof of prv not holding between E ε and R ε was provided in the main body of the paper.
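The metric comparison d_R ≤ d_TG can be observed numerically. A sketch, under the assumption that the induced metric of a channel is d_C(x, x′) = max_y |ln C[x, y] − ln C[x′, y]| (the smallest multiplicative bound between rows), with the standard square forms of the two mechanisms as above.

```python
import numpy as np

def induced_metric(C):
    # d_C(x, x') = max_y |ln C[x,y] - ln C[x',y]|
    L = np.log(C)
    return np.abs(L[:, None, :] - L[None, :, :]).max(axis=2)

def rr(eps, n):
    M = np.full((n, n), 1.0)
    np.fill_diagonal(M, np.exp(eps))
    return M / (np.exp(eps) + n - 1)

def truncated_geometric(n, eps):
    a = np.exp(-eps)
    C = np.array([[(1 - a) / (1 + a) * a ** abs(x - y) for y in range(n)]
                  for x in range(n)])
    for x in range(n):
        C[x, 0], C[x, -1] = a ** x / (1 + a), a ** (n - 1 - x) / (1 + a)
    return C

n, eps = 5, np.log(2)
d_TG = induced_metric(truncated_geometric(n, eps))
d_R = induced_metric(rr(eps, n))
```

Here d_TG(x, x′) = ε·|x − x′| while d_R(x, x′) = ε for every pair of distinct rows, so d_R ≤ d_TG pointwise, illustrating TG_ε ⊑prv R_ε.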
Theorem A15. Every (truncated geometric, randomized response, or exponential) mechanism is 'the safest possible mechanism' when parametrized by ε = 0. That is, L_ε ⊑avg M_0 for all mechanisms L, M (possibly from different families) and ε > 0.
Proof. The intuition is that all channels parametrized by ε = 0 are equivalent to the 1 channel (that is, the m × 1 channel consisting only of 1s). Indeed, the exponential and randomized response mechanisms parametrized by ε = 0 have every element equal to 1/n, where n = |Y|. These clearly reduce to the 1 channel. The truncated geometric mechanism contains all 0s except for the first and last columns, which contain 1/2. Again, this reduces to the 1 channel. Since the 1 channel refines everything (that is, L ⊑avg 1 for any channel L), the result follows.

Proof. Letting d: MX, we first show that, for any x0: X, we can construct a channel C_{x0} whose induced metric is below d but coincides with it on all distances to x0, that is:

  d_{C_{x0}} ≤ d and d_{C_{x0}}(x0, x) = d(x0, x) for all x: X   (A33)

To construct C_{x0}, we use just two outputs (i.e., Y = {y1, y2}) and the fact that tv⊗ on DY admits a geodesic, that is, a curve γ: [0, +∞] → DY such that

  tv⊗(γ(t), γ(t′)) = |t − t′| for all t, t′: [0, +∞]   (A34)

For instance, we can check that γ(t) = (e^{−t−1}, 1 − e^{−t−1}) is such a geodesic.
We can now use the geodesic to assign a probability distribution to each secret such that the properties (A33) are satisfied. Concretely, define each row x of C_{x0} as γ(d(x0, x)); then (A33) follows from the geodesic property (A34) together with the triangle inequality for d. Finally, note that the visible choice adds the columns of all mechanisms, so the constructed channel has 2|X| columns. However, the equality of distances in (A33) is given by the first column of each C_x (this is because of the way γ is constructed), hence we can merge all second columns together, finally giving a simple construction for C_d with Y = X ∪ {⊥}.
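The geodesic claim (A34) for γ(t) = (e^{−t−1}, 1 − e^{−t−1}) can be checked numerically. A sketch, under the assumption that tv⊗ is the multiplicative total variation tv⊗(μ, ν) = max_y |ln μ(y) − ln ν(y)|: the first component contributes exactly |t − t′|, while the second is strictly smaller since e^{−t−1} < 1/2.

```python
import numpy as np

def gamma(t):
    # Candidate geodesic on distributions over two outputs.
    return np.array([np.exp(-t - 1), 1 - np.exp(-t - 1)])

def tv_mult(mu, nu):
    # Multiplicative total variation (assumed definition of tv⊗).
    return np.max(np.abs(np.log(mu) - np.log(nu)))

ts = np.linspace(0, 8, 50)
max_err = max(abs(tv_mult(gamma(t), gamma(s)) - abs(t - s))
              for t in ts for s in ts)
```

The maximum deviation from |t − t′| over the sampled grid is at floating-point level, consistent with γ being an exact geodesic.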

Figure 1. Comparison between the geometric (red) and the randomized response (blue) mechanisms. The area between 40 and 60, delimited by the green lines, represents the sub-domain on which the geometric mechanism also satisfies d-privacy for the discrete metric with ε = log 2.

Figure 2. The simplex and the convex hulls of the posterior distributions in Example 5.

4.4. Comparing Mechanisms by Their "Smallest ε" (For Fixed d)

Scaling d by a privacy parameter ε allows us to turn d-privacy (for some fixed d) into a quantitative "leakage" measure, by associating each channel to the smallest ε by which we can scale d without violating privacy.

Definition 4. The privacy-based leakage (wrt d) of a channel C is defined as Priv_d(C) = min { ε ≥ 0 | C satisfies ε·d-privacy }.

Note that Priv_d(C) ≤ 1 iff C satisfies d-privacy. It is then natural to compare two mechanisms A and B based on their "smallest ε".

Definition 5. Define A ⊑prv_d B iff Priv_d(A) ≥ Priv_d(B).

For instance, A ⊑prv_{d_H} B means that B satisfies standard differential privacy for an ε at least as small as that of A.

4.5. Privacy-Based Leakage and Refinement Orders

When discussing the average- and max-case leakage orders ⊑avg_G and ⊑max_Q, we obtained strong leakage guarantees by quantifying over all vulnerability functions. It is thus natural to investigate a similar quantification in the context of d-privacy. Namely, we define a stronger privacy-based "leakage" order by comparing mechanisms not on a single metric d, but on all metrics simultaneously.

Definition 6. The privacy-based leakage order is defined as A ⊑prv_M B iff A ⊑prv_d B for all d ∈ MX.

Definition 7. The privacy-based refinement order is defined as A ⊑prv B iff d_A ≥ d_B, or equivalently, iff B satisfies d_A-privacy.

The following proposition achieves our goal of characterizing ⊑prv_M.

Proposition 1. The orders ⊑prv_M and ⊑prv coincide.

Figure 3. Refinement of mechanisms under ⊑avg for 5 × 5 channels. The x-axis represents the ε on the LHS of the relation, and the y-axis represents the one on the RHS. The top graph shows refinement of the truncated geometric mechanism (that is, TG ⊑avg TG), the middle graph refinement of randomized response (R ⊑avg R), and the bottom graph refinement of the exponential mechanism (E ⊑avg E).

Theorem 17. For any metric d: MX, we can construct a channel C_d such that d_{C_d} = d.

Appendix E. Proofs of the Results about the Lattice Properties

Theorem A16. For any metric d: MX, we can construct a channel C_d such that d_{C_d} = d.

Table 3. Comparison of leakage and refinement orders. All implications are strict.

Table 4. The refinements respected by families of mechanisms for decreasing ε. We recall that the results in the grey cells are based on Conjecture 1.