Calibrating the Attack to Sensitivity in Differentially Private Mechanisms

This work studies the power of adversarial attacks against machine learning algorithms that use differentially private mechanisms as their weapon. In our setting, the adversary aims to modify the content of a statistical dataset via insertion of additional data without being detected, by using differential privacy to her/his own benefit. The goal of this study is to evaluate how easy it is to detect such attacks (anomalies) when the adversary makes use of Gaussian and Laplacian perturbation, using both statistical and information-theoretic tools. To this end, firstly via hypothesis testing, we characterize statistical thresholds for the adversary in various settings, which balance the privacy budget and the impact of the attack (the modification applied on the original data) in order to avoid being detected. In addition, we establish the privacy-distortion trade-off in the sense of the well-known rate-distortion function for the Gaussian mechanism by using an information-theoretic approach. Accordingly, we derive an upper bound on the variance of the attacker's additional data as a function of the sensitivity and the original data's second-order statistics. Lastly, we introduce a new privacy metric based on Chernoff information for anomaly detection under differential privacy as a stronger alternative to $(\epsilon, \delta)$-differential privacy in Gaussian mechanisms. Analytical results are supported by numerical evaluations.


Introduction
The major issue in terms of data privacy in today's world stems from the fact that machine learning (ML) algorithms strongly depend on the use of large datasets to work efficiently and accurately. Along with the highly increased deployment of ML, its privacy aspect rightfully became a cause of concern, since the collection of such large datasets makes users vulnerable to fraudulent use of personal, (possibly) sensitive information. Privacy-enhancing technologies, which are designed to protect the data privacy of users, aim to mitigate this vulnerability.
Differential privacy (DP) has been proposed to address this vulnerability and has furthermore been used to develop practical methods for protecting private user data. Dwork's original definition of DP in [1] emanates from a notion of statistical indistinguishability of two different probability distributions, which is achieved through randomization of the data prior to their publication. The outputs of a differentially private mechanism are indistinguishable for two datasets that differ only in one user's data, i.e., neighbors. In other words, DP guarantees that the output of the mechanism is statistically insensitive to changes made in a single row of the dataset, to a degree governed by its privacy budget. The reader is referred to [2][3][4] for surveys of results.
Let us imagine a scenario in which adversaries weaponize privacy protection methods in order to avoid being detected by the defender. Adversarial classification (or anomaly detection) applies statistical classification, an ML approach, to detect misclassification attacks in which adversaries shield themselves by using DP to remain undetected. This paper studies adversarial classification in differentially private mechanisms to establish the trade-off between the probability distribution of the noise and the impact of the attack to remain indistinguishable. This is achieved by employing both statistical and information-theoretic tools. In this setting, we consider an adversary who not only aims to discover the information of a dataset but also wants to harm it by inserting data into the original dataset. Accordingly, we establish stochastic and information-theoretic relations between the impact of the adversary's attack and the privacy budget of the DP mechanism.

Related Work and Methodology
This part is reserved for a discussion on related work and background of the addressed problem emphasizing the differences between the existing literature and the current paper along with the methodology that is used in this paper.
The addressed problem in this work differs from existing work on DP which considers an adversary model where the goal of the attacker is to solely discover some information about the dataset. For instance, the assumption in [5] is that the adversary has the knowledge of the entire dataset except for one entry. This translates to the implicit strong adversary assumption. In this paper, our aim is to extend this model with a stronger adversary who also wants to harm the dataset and the output of the mechanism. We consider an adversary who is able to modify (add, replace, delete, etc.) the published information from a differentially private mechanism which is a noisy version of the output. The adversary's goal in this model is to maximize the possible damage (the induced bias or additional variance) while remaining undetected. Thus, there are two sides of what the adversary wants to achieve: (i) s/he gives false data with the biggest possible difference from the real data, (ii) this modification has to be achieved without being detected. On the defender's end, the mechanism wants to preserve DP and correctly detect the attack.
A simpler version of the described problem is addressed by [6] from an adversarial perspective, and the two conflicting goals of the adversary are formulated as an optimization problem where maximizing the bias induced by the adversary is the objective function. However, the privacy parameter does not take part in the formulation of this optimization problem; instead, DP is used in conjunction with anomaly detection for preserving privacy afterward. We seek a characterization of the trade-off between the attack (the change in the output induced by the adversary) and the privacy parameter. On the other hand, in [7], the authors show that the sensitivity of a mechanism also has an impact on the differentially private output. The noise to be added on the output is calibrated accordingly as a function of the noise distribution. Such a characterization of the problem described in this paper introduces a third element, the value of the attack, to be included in this adjustment of the DP noise with respect to (w.r.t.) the sensitivity of the system. This will allow us to determine a threshold for detecting the attacker or, alternatively, for the attacker to remain undetected.
As for the methodology, we will use the framework of statistical hypothesis testing in a similar vein to [8] where the authors determine an appropriate value of the privacy parameter as a function of false alarm and mis-detection probabilities in deciding on the presence or absence of a particular record in a dataset. Similarly, in [9], the author studies the differentially private hypothesis testing in the local setting where users locally add the DP noise on their personal data before submitting them to the dataset. In this paper, we tailor this approach as a first attempt for a solution for anomaly detection in Laplace and Gaussian mechanisms under global DP where the personal sensitive data are transmitted to a central server by the users and the server applies DP noise on the data before their release. The major difference from the existing literature that employs statistical inference to differential privacy lies in our new attacker model which considers an adversary who not only aims to discover but also wants to alter the information in the dataset. We present a statistical threshold of detecting the attacker as a function of the impact of the attack (the effect of the additional data on the overall dataset) and the privacy parameter(s).
Additionally, in the case of Laplace mechanism, we propose an interval for the privacy budget, so that the defender detects the attack.
For the case of the Gaussian mechanism, besides the aforementioned statistical approach, we also derive the mutual information between the datasets before and after the attack (considered as neighbors) in order to bound the second-order statistics of the additional data. This yields an information-theoretic threshold for correctly detecting the attack. Originally, the lossy source-coding approach in the information-theoretic DP literature has mostly been used to quantify the privacy guarantee [10] or the leakage [11,12]. Ref. [13] stands out in the way that the rate-distortion perspective is applied to DP, where various fidelity criteria are set to determine how fast the empirical distribution converges to the actual source distribution. We present an adaptation of the so-called Kullback-Leibler (KL)-DP [5] for detecting misclassification attacks in Laplace and Gaussian mechanisms, where the corresponding distributions in relative entropy were considered as the differentially private noise with and without the adversary's advantage. Lastly, this work introduces a novel DP metric based on Chernoff information along with its application to adversarial classification.
Aside from statistical and information-theoretic approaches as employed in this paper, the literature on adversarial examples and attempts to correctly classify and detect them is rather rich. For instance, ref. [14] offers a game-theory-based risk analysis approach that was originally introduced by [15], whereas [16] introduces efficient algorithms for reverse engineering linear classifiers for adversarial classification. Adversarial classification dates back to [17], which assumes (somewhat unrealistically) that the adversary has perfect knowledge of the classifier and attempts to detect these attacks by computing the adversary's optimal strategy. The novelty of the current paper lies in its methodology, which makes use of information-theoretic quantities to solve a privacy and security problem.

Contributions and Outline
Our contributions are summarized in the following list.

• We consider a new attacker model whereby the adversary takes advantage of the underlying differentially private mechanism in order to remain undetected.
• We derive a trade-off between the privacy-protected adversary's advantage and the security of the system for the adversary to remain undetected while giving as much damage as possible to the system or, alternatively, for the defender to preserve the privacy of the system and detect the attacker. This trade-off is defined in the framework of statistical hypothesis testing similarly to [8].
• We adapt the Kullback-Leibler DP definition of [5] to the addressed problem for adversarial classification in differentially private mechanisms and present numerical comparisons of different cases where the sensitivity of the system is less and greater than the bias induced by the adversary on the published information.
• We apply a source-coding approach to anomaly detection under differential privacy to bound the variance of the additional data by the sensitivity of the mechanism and the original data's statistics by deriving the mutual information between the neighboring datasets.
• We introduce a new DP metric, called Chernoff DP, as a stronger alternative to the well-known $(\epsilon, \delta)$-DP and KL-DP for the Gaussian mechanism. Chernoff DP is also adapted for adversarial classification and numerically shown to outperform KL-DP.
The outline of the paper is as follows. In the upcoming section, we remind the reader of some important preliminaries from the DP literature which will be used throughout this paper along with the detailed problem definition and performance criteria. In Sections 3 and 4, we present statistical and information-theoretic thresholds for anomaly detection in Laplace and Gaussian mechanisms, respectively. Section 5 introduces divergence-based definitions of DP adapted for anomaly detection. We present numerical evaluation results in Section 6 and draw our final conclusions in Section 7.

System Model and Its Components
In this part, we revisit certain notions from the literature on DP which will also be employed in this paper. These preliminaries will be followed by a detailed definition of the addressed problem. We begin with defining the notion of neighborhood between datasets and sensitivity of DP.
Definition 1 (Neighboring datasets). Any two datasets that differ only in one row are called neighbors [4]. For two neighboring datasets $x$ and $\tilde{x}$, the following equality holds
$$d(x, \tilde{x}) = 1$$
where $d(\cdot, \cdot)$ denotes the Hamming (or $\ell_1$) distance between two datasets.
Definition 2 ($L_1$ norm sensitivity [7]). Global sensitivity, denoted by $s$, of a function (or a query) $q: D \to \mathbb{R}^k$ is the smallest possible upper bound on the distance between the images of $q$ when applied to two neighboring datasets, i.e., the $\ell_1$ distance is bounded by $\|q(x) - q(\tilde{x})\|_1 \le s$.
Basically, the sensitivity of a DP mechanism is the smallest possible upper bound on the distance between the images of a query function for neighbors. Hence it depends on the type of the query and works against privacy: a higher sensitivity of the query imposes a stronger requirement for the privacy guarantee, and consequently more noise is needed to achieve that guarantee.
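A randomized mechanism $Y$ is $(\epsilon, \delta)$-differentially private if, for every measurable set $S$ of outputs and for all neighboring datasets $x$ and $\tilde{x}$ within the domain of $Y$, the following inequality holds.
$$\Pr[Y(x) \in S] \le e^{\epsilon} \Pr[Y(\tilde{x}) \in S] + \delta$$
Setting $\delta = 0$ yields $(\epsilon, 0)$-differential privacy.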
Next, we remind the reader of the Laplace distribution and the Laplace mechanism. A differentially private system is named after the probability distribution of the perturbation applied onto the query output in the global setting. The Laplace distribution, also known as the double exponential distribution, is defined as
$$\mathrm{Lap}(x; \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$$
with the location parameter equal to its mean $\mu \in \mathbb{R}$ and variance $2b^2$, where $b > 0$ denotes the scale parameter.
We will refer to the parameters $\epsilon$ and $\delta$ as the privacy budget throughout the paper. The next definition reminds the reader of the $L_2$ norm global sensitivity.
Definition 5. The $L_2$ norm sensitivity, denoted $s$, refers to the smallest possible upper bound on the $L_2$ distance between the images of a query $q: D \to \mathbb{R}^k$ when applied to two neighboring datasets $X$ and $\tilde{X}$ as
$$\|q(X) - q(\tilde{X})\|_2 \le s. \quad (5)$$
Definition 6. The Gaussian mechanism [7] is defined for a function (or a query) $q: D \to \mathbb{R}^k$ as follows
$$M(X, q(\cdot), \epsilon, \delta) = q(X) + (Z_1, \cdots, Z_k) \quad (6)$$
where $Z_i \sim \mathcal{N}(0, \sigma_z^2)$, $i = 1, \cdots, k$, denote independent and identically distributed (i.i.d.) Gaussian random variables with the variance $\sigma_z^2 = \frac{2 s^2 \log(1.25/\delta)}{\epsilon^2}$.
Theorem 1 ([4]). The Laplace mechanism satisfies $(\epsilon, 0)$-differential privacy.
Application of Gaussian noise results in a more relaxed privacy guarantee, contrary to the Laplace mechanism, which brings about $(\epsilon, 0)$-DP.
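The two mechanisms recalled above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the Laplace scale is taken as $b = s/\epsilon$ (the scale used for the null hypothesis later in the paper), and the Gaussian standard deviation follows the variance stated in Definition 6.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(query_output, sensitivity, epsilon, rng):
    # Assumed calibration: scale b = s / epsilon.
    return query_output + laplace_noise(sensitivity / epsilon, rng)


def gaussian_sigma(sensitivity, epsilon, delta):
    # Standard deviation implied by sigma_z^2 = 2 s^2 log(1.25/delta) / epsilon^2.
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def gaussian_mechanism(query_output, sensitivity, epsilon, delta, rng):
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return query_output + rng.gauss(0.0, sigma)
```

For a fixed sensitivity, a smaller privacy parameter in either function produces a wider noise distribution, which is exactly the slack an adversary can try to hide in.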

Problem Definition
Within the scope of this paper, we use two different approaches to study adversarial classification under differential privacy, namely a statistical approach to bound the first-order statistics of the additional data and an information-theoretic approach to characterize the second-order statistics of the attack. We define the original dataset in the following form: $X = X^n = \{X_1, \cdots, X_n\}$. The query function takes the aggregation of this dataset as $q(X) = \sum_{i=1}^{n} X_i$, and the DP mechanism adds Laplacian or Gaussian noise $Z$ on the query output, leading to the noisy output $M(X, q(\cdot), \epsilon, \delta) = Y = \sum_{i=1}^{n} X_i + Z$. This public information is altered by an adversary, who adds a single record denoted $X_a$ to this dataset. The modified output of the DP mechanism becomes $\sum_{i=1}^{n} X_i + X_a + Z$. The reader should note that we do not make any assumptions on the value of $X_a$.
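The setting above can be made concrete with a short simulation. The sketch below is illustrative only: the function names and the toy dataset in the usage note are assumptions, and Laplace noise with scale $s/\epsilon$ is used as one of the two noise choices mentioned in the text.

```python
import random


def sum_query(dataset):
    # Aggregation query q(X) = sum of all records.
    return sum(dataset)


def laplace_noise(scale, rng):
    # Difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release(dataset, sensitivity, epsilon, rng, injected=None):
    # Noisy output Y = q(X) + Z; if `injected` is given, the adversary's
    # single record X_a is appended before the query is evaluated.
    records = list(dataset)
    if injected is not None:
        records.append(injected)
    return sum_query(records) + laplace_noise(sensitivity / epsilon, rng)
```

Calling `release` twice with identically seeded generators, once with and once without `injected`, returns outputs that differ by exactly the injected value. From the defender's side the attack therefore looks like a shift of the noise location by $X_a$.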

First-Order Statistics of X a
Our first approach is inspired by [8], where the authors determine statistical thresholds for the adversary's hypothesis problem, which is set to decide whether a given dataset entry is included in a dataset $D$ or its neighbor $\tilde{D}$. This approach is adapted to the problem of detecting a strong adversary who not only wants to discover all the entries of a dataset but also wants to harm it. Accordingly, we set the following hypotheses where the null and alternative hypotheses are respectively translated into DP noise distribution with and without the bias induced by the attacker.
$$H_0: \text{defender fails to detect the attack} \qquad H_1: \text{defender detects the attack} \quad (7)$$
The hypothesis testing problem defined above in (7) can be translated into deciding on the DP noise distribution with its parameters. Here $H_0$ and $H_1$ correspond to DP noise following the probability distributions $p_0$ with mean $\mu_0$ and $p_1$ with mean $\mu_1$, respectively. Therefore, the decision boils down to choosing between
$$H_0: Z \sim p_0 \text{ with mean } \mu_0 \qquad H_1: Z \sim p_1 \text{ with mean } \mu_1.$$
Hence the shift in the location due to the addition of $X_a$ to the dataset is $\Delta\mu = \mu_1 - \mu_0$. The corresponding likelihood ratio for this problem yields
$$\Lambda = \frac{L(H_1)}{L(H_0)} \ge \kappa$$
where $L(\cdot)$ denotes the likelihood function for the corresponding hypothesis and $\kappa$ is some positive number to be determined. Such a threshold defines the critical region in statistical hypothesis tests where the null hypothesis is rejected. This approach results in a precise trade-off between the attacker's advantage (or the bias induced by the adversary) $\Delta\mu$, the sensitivity $s$ and the privacy parameter $\epsilon$ of the differentially private mechanism to characterize the threshold for rejecting the null hypothesis, i.e., detecting the attack, as a function of the error probabilities.
$\alpha$ and $\beta$ respectively denote the type I and type II error probabilities, which are defined for the hypothesis testing problem in (7) as follows:
$$\alpha = \Pr[\text{reject } H_0 \mid H_0 \text{ is true}] \quad (9)$$
$$\beta = \Pr[\text{fail to reject } H_0 \mid H_1 \text{ is true}] \quad (10)$$
Based on the definition of $\alpha$, also called the probability of false alarm, we denote its complement by $\bar{\alpha} = 1 - \alpha$. Similarly, due to (10), the complement of the type II error probability (or the probability of mis-detection) is denoted by $\bar{\beta} = 1 - \beta$. The probability of detection $\bar{\beta}$ (i.e., correctly deciding $H_1$) is also called the power of the test in statistics or the recall in machine learning terminology.
According to the Neyman-Pearson Theorem [18], the likelihood ratio compared against some positive constant defines the best critical region of size $\alpha$ for testing a simple hypothesis against an alternative simple hypothesis, i.e., the region with the largest (or equally largest) power of the test. An extension of this result to testing against a composite alternative hypothesis is also possible. Such an extension is called a uniformly most powerful test, since a test with the best critical region of size $\alpha$ is conducted for each possible value of the alternative hypothesis. Once we define the critical region for deciding between $H_0$ and $H_1$ in (7) as a function of $\Delta\mu$, the privacy parameter $\epsilon$ and the sensitivity $s$, we will derive the error probabilities and the power of the test analytically as well as compute and depict them numerically.

Second-Order Statistics of X a -Information-Theoretic Approach
Our second approach is inspired by rate-distortion theory. For the Gaussian mechanism, we employ the biggest possible difference between the images of the query for the datasets with and without the additional data $X_a$ (i.e., neighboring inputs) as the fidelity criterion (Definition 5). Accordingly, we derive the mutual information between the original dataset and its neighbor in order to bound the additional data's second-order statistics so that the defender fails to detect the attack. We assume that $X_a$ follows a normal distribution with the variance $\sigma^2_{X_a}$. To simplify our derivations, we also assume that the original dataset $X^n = \{X_1, X_2, \cdots, X_i, \cdots, X_n\}$ and its neighbor $\tilde{X}^n = \{X_1, X_2, \cdots, X_i, \cdots, X_n + X_a\}$ have the same dimension $n$. Alternatively, the attack would change the size of the dataset to $n + 1$, where the additional data are not added to any of the $X_i$'s.

Adversarial Classification in Laplace Mechanisms
We separate our results into two main groups for $(\epsilon, 0)$-DP in Laplace mechanisms: one-sided and two-sided hypothesis tests.

One-Sided Test
We will investigate both cases of setting the alternative hypothesis $H_1$ as either $\mu_1 > \mu_0$ (i.e., $\Delta\mu > 0$) or $\mu_1 < \mu_0$ (i.e., $\Delta\mu < 0$). This corresponds to a one-sided hypothesis testing problem. The decision of choosing between the hypotheses in (7) becomes a choice between $H_0: Z \sim \mathrm{Lap}(\mu_0, s/\epsilon)$ and $H_1: Z \sim \mathrm{Lap}(\mu_1, \theta s/\epsilon)$, where $\theta \ge 1$ is the measure of the change in the privacy budget of the system whereas $s$ and $\epsilon$ denote the sensitivity and privacy parameter, respectively. It should be noted that setting $\theta = 1$ translates the hypothesis test in (7) into testing only the location parameter of the Laplacian DP noise. Our goal is to derive a relationship between the privacy parameter, the significance level (or the probability of false alarm) and the type II error probability (or the probability of mis-detection) for the attacker to be successful, i.e., to fail to reject $H_0$, as a function of the bias $\Delta\mu$. The corresponding likelihood ratio to (7) is given by
$$\Lambda = \frac{\frac{1}{2b_1} \exp\left(-\frac{|z - \mu_1|}{b_1}\right)}{\frac{1}{2b_0} \exp\left(-\frac{|z - \mu_0|}{b_0}\right)} \ge \kappa \quad (11)$$
where $\kappa$ is some positive number to be determined and $(\mu_i, b_i)$ for $i = 0, 1$ represent the location and scale parameters of the distributions to be tested. The next theorem states our first main result, which presents a threshold of correctly detecting the adversary for a given level of privacy budget, sensitivity and type I error probability.
Theorem 3. The threshold $k$ of the best critical region of size $\alpha$ defined in (9) for deciding between the null hypothesis and its alternative of the one-sided hypothesis testing problem in (7) for a Laplace mechanism with the largest power $\bar{\beta}$ is given by the piecewise function (12) of the probability of false alarm $\alpha$, privacy parameter $\epsilon$ and global sensitivity $s$. Then according to the adversary's hypothesis testing problem, the defender detects the attack for $\Delta\mu > 0$ if the output of the Laplace mechanism $Y_0$ exceeds $k + q(x)$, where $q(\cdot)$ is the noiseless query output. Similarly, for $\Delta\mu < 0$, the attack is detected if $Y_0 < q(x) + k$.
Remark 1. The decision rule given by Theorem 3 is equivalent to comparing the Laplace noise to the threshold $k$, as will be shown by the following proof. For positive bias, the critical region becomes $(k, \infty)$. By analogy, if $\Delta\mu < 0$, the critical region for the Laplace noise becomes $(-\infty, k)$.
Proof. According to the Neyman-Pearson theorem [18], each point where Λ ≥ κ composes the best critical region of size α as defined in (9) for this simple hypothesis testing problem. Using the ratio in (11), we will determine the threshold k as a function of the best critical region, the power of the test, the privacy budget and lastly, the attack.
We expand Λ as follows.
The likelihood ratio in (13) can be summarized by the following piecewise function based on the possible relationships between µ 1 and z due to the absolute value in the exponent of the probability distribution for µ 1 < µ 0 .
Equivalently, Λ I is confined in the interval On the other hand, for µ 1 > µ 0 , the corresponding likelihood ratio for the hypotheses in (7) yields To be able to determine a threshold for deciding between the hypotheses in (7), we compute the false alarm rate α and the mis-detection error β (and the power of the test, that is 1 − β) applying the Neyman-Pearson lemma that guarantees maximizing the power of the hypothesis test for a given false alarm rate α.

Derivation of α:
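Based on the definition in (9), for $\Delta\mu > 0$ the probability of raising a false alarm is derived by integrating the following probability distribution over the critical region
$$\alpha = \int_{k}^{\infty} \frac{\epsilon}{2s} \exp\left(-\frac{\epsilon |z - \mu_0|}{s}\right) dz$$
which is further expanded in two possible ways. First, for $k < \mu_0$, we get
$$\alpha = 1 - \frac{1}{2} \exp\left(-\frac{\epsilon (\mu_0 - k)}{s}\right). \quad (20)$$
Second, we have for $k \ge \mu_0$
$$\alpha = \frac{1}{2} \exp\left(-\frac{\epsilon (k - \mu_0)}{s}\right). \quad (22)$$
Rewriting (20) and (22) as an equality for $k$, we obtain the piecewise function (12) as the threshold in Theorem 3 as a function of $\alpha$. If the bias induced by the adversary is negative, i.e., $\Delta\mu < 0$, then the conditions to obtain (20) and (22) are swapped. For $\Delta\mu < 0$ and $k < \mu_0$, we get (22) for the probability of false alarm.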
How to determine $\kappa$?: According to the piecewise expansions of the likelihood ratio functions in (16) and (15), respectively for $\Delta\mu < 0$ and $\Delta\mu > 0$, $\kappa$ is confined to the intervals given by (23) and (24). The null hypothesis is rejected when the likelihood ratio falls in the critical region, and the value of $\kappa$ then follows from the threshold of the critical region defined in Theorem 3.
Derivation of the power of the test:
The power of the hypothesis test is the probability of rejecting the null hypothesis $H_0$ given that the alternative hypothesis $H_1$ is true. Letting $\bar{\beta}$ denote the complement of the type II error $\beta$, we have, using the definition in (10), for $\Delta\mu > 0$ and $k < \mu_1$
$$\bar{\beta} = 1 - \frac{1}{2} \exp\left(-\frac{\epsilon (\mu_1 - k)}{\theta s}\right). \quad (29)$$
As for $k > \mu_1$, the power function becomes
$$\bar{\beta} = \frac{1}{2} \exp\left(-\frac{\epsilon (k - \mu_1)}{\theta s}\right). \quad (32)$$
On the contrary, for negative bias $\Delta\mu < 0$, the conditions based on $k$ and $\mu_1$ to obtain (29) and (32) are swapped. In Section 6, we present numerical evaluation results for Theorem 3 using the probability of false alarm $P_{FA}$ and the power of the test $1 - P_{MD} = \bar{\beta}$ to draw receiver operating characteristic (ROC) curves as performance analysis.
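The one-sided test for positive bias can be sketched numerically as follows. This is an illustration under stated assumptions, not the paper's exact expression (12): the null noise is taken as $\mathrm{Lap}(\mu_0, s/\epsilon)$, the alternative as $\mathrm{Lap}(\mu_1, \theta s/\epsilon)$, and the critical region as $(k, \infty)$. The threshold is obtained by inverting the Laplace tail probability, and all function names are hypothetical.

```python
import math


def laplace_sf(x, loc, scale):
    # Tail probability Pr[Z > x] for Z ~ Laplace(loc, scale).
    if x < loc:
        return 1.0 - 0.5 * math.exp((x - loc) / scale)
    return 0.5 * math.exp(-(x - loc) / scale)


def one_sided_threshold(alpha, mu0, sensitivity, epsilon):
    # Threshold k such that Pr[Z > k | H0] = alpha, with H0: Lap(mu0, s/eps).
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    b0 = sensitivity / epsilon
    if alpha > 0.5:
        # Threshold falls below the null location (k < mu0).
        return mu0 + b0 * math.log(2.0 * (1.0 - alpha))
    # Threshold falls at or above the null location (k >= mu0).
    return mu0 - b0 * math.log(2.0 * alpha)


def detection_power(alpha, mu0, mu1, sensitivity, epsilon, theta=1.0):
    # Power = Pr[Z > k | H1], with H1: Lap(mu1, theta * s / eps).
    k = one_sided_threshold(alpha, mu0, sensitivity, epsilon)
    return laplace_sf(k, mu1, theta * sensitivity / epsilon)
```

Sweeping `alpha` over (0, 1) and plotting `detection_power` against it traces an ROC curve for a given bias. A larger $\Delta\mu$ or a larger $\epsilon$ pushes the curve toward the upper-left corner, i.e., the attack becomes easier to detect.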

Two-Sided Test
As an alternative to the one-sided hypothesis test for detecting the attacker through the shifts and changes in the location and deviation of the DP noise, a two-sided test could provide a more realistic solution where it is not possible to assume the direction of the shift induced by the adversary. Hence the hypothesis test in (7) can be conducted for determining the (possible) change in the distribution of the DP noise in both directions, where the null hypothesis remains the same as $H_0: Z \sim \mathrm{Lap}(\mu_0, s/\epsilon)$ to test against the alternative $H_1: Z \sim \mathrm{Lap}(\mu_1, \theta s/\epsilon)$. This translates to choosing between
$$H_0: \mu = \mu_0, \ b = s/\epsilon \quad (34)$$
$$H_1: \text{at least one of the equalities does not hold} \quad (35)$$
where $\mu$ denotes the location parameter and $b$ denotes the scale parameter of any Laplace distribution. The alternative hypothesis can also be stated with the parameters $\mu = \mu_1$, $b = \theta s/\epsilon$ where $\theta \ge 1$.
In this two-sided test, there are two thresholds on each side of the origin to be determined for the critical region each with a size of α/2. Let k 1 and k 2 denote the threshold greater and smaller than the origin, respectively. The next theorem presents the thresholds for detecting the attack as a function of the probability of false-alarm and the privacy budget of the differentially private mechanism as its one-sided counterpart given by Theorem 3.

Theorem 4.
The threshold of the best critical region of size $\alpha$ defined in (9) for choosing between the null hypothesis and its alternative of the two-sided hypothesis testing problem in (34) and (35) for a Laplace mechanism with the largest power $\bar{\beta}$ is
$$k_1 = \mu_0 - \frac{s}{\epsilon} \log \alpha, \qquad k_2 = \mu_0 + \frac{s}{\epsilon} \log \alpha. \quad (37)$$
Then according to the adversary's hypothesis testing problem, the defender fails to detect the attack when the output of the Laplace mechanism $Y_0$ is confined in $(q(x) + k_2, q(x) + k_1)$, where $q(\cdot)$ is the noiseless query output.
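Proof. The null hypothesis cannot be rejected if the noisy output of the Laplace mechanism is confined in the interval $(k_2, k_1)$. First, we begin with the derivation of the threshold for the output of the DP mechanism. The probability of raising a false alarm or having a type I error is derived as follows.
$$\alpha = \int_{-\infty}^{k_2} \frac{\epsilon}{2s} \exp\left(-\frac{\epsilon |z - \mu_0|}{s}\right) dz + \int_{k_1}^{\infty} \frac{\epsilon}{2s} \exp\left(-\frac{\epsilon |z - \mu_0|}{s}\right) dz$$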
Each addend of α corresponds to one half of the probability of false-alarm. Equating each integral to α/2 and rewriting the equalities in terms of k 1 and k 2 , we get the thresholds in (37).
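The two-sided thresholds can be checked with a small Monte Carlo experiment. The sketch below assumes null noise $\mathrm{Lap}(\mu_0, s/\epsilon)$ and a tail of mass $\alpha/2$ on each side, as in the proof above; the function names and the sampling routine are hypothetical.

```python
import math
import random


def two_sided_thresholds(alpha, mu0, sensitivity, epsilon):
    # Each tail carries alpha / 2 under H0: Lap(mu0, s/eps), so
    # 0.5 * exp(-(k1 - mu0) * eps / s) = alpha / 2 and symmetrically for k2.
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    half_width = -(sensitivity / epsilon) * math.log(alpha)
    return mu0 + half_width, mu0 - half_width  # (k1, k2)


def empirical_false_alarm(alpha, mu0, sensitivity, epsilon, trials, rng):
    # Fraction of null-hypothesis noise samples that land in the critical region.
    k1, k2 = two_sided_thresholds(alpha, mu0, sensitivity, epsilon)
    scale = sensitivity / epsilon
    hits = 0
    for _ in range(trials):
        z = mu0 + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        if z <= k2 or z >= k1:
            hits += 1
    return hits / trials
```

With enough trials, `empirical_false_alarm` should settle near the nominal `alpha`, which gives a quick sanity check on the thresholds before they are used to study the attacker's bias.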

A Trade-off between $\mu_1$, $s$ and $\epsilon$ for Detecting the Attacker-Two-Sided Test
Using the threshold presented in Theorem 4, we can determine an interval to confine the mean of the attacker's advantage to be detected by the DP mechanism, i.e., for the null hypothesis H 0 to be rejected. Alternatively, such an interval can be converted for the privacy parameter as a function of error probabilities, the attack and the sensitivity. The following result, Corollary 1, presents upper and lower bounds on the attacker's advantage so that the defender detects the attack.
There are two possible cases w.r.t. the relationship between $\mu_0$ and $\mu_1$. The alternative hypothesis in this two-sided test also states that these two parameters are unequal. As we have discussed earlier in the derivation of the threshold for determining the critical region in Laplace mechanisms, whether $\mu_0 > \mu_1$ or $\mu_1 > \mu_0$ directly affects the likelihood ratio function, and thus the condition to reject the null hypothesis. Let us then consider the first possible case of $\mu_0 < \mu_1$. In this case, we have either $k_2 < \mu_0 < k_1 < \mu_1$ or $k_2 < \mu_0 < \mu_1 < k_1$. On the contrary, for $\mu_1 < \mu_0$, we have for the thresholds either of the cases $\mu_1 < k_2 < \mu_0 < k_1$ or $k_2 < \mu_1 < \mu_0 < k_1$. These different cases can be used for deriving an interval to include $\Delta\mu$ as a function of the error probabilities, privacy budget and the sensitivity.
Corollary 1. The absolute bias $|\Delta\mu| = |\mu_1 - \mu_0|$ induced by the adversary is confined in the following interval so that the defender detects $X_a$ and preserves $(\epsilon, 0)$-DP
$$\frac{s}{\epsilon} \log\left(\alpha \bar{\beta}^{\theta}\right) < |\Delta\mu| < \frac{s}{\epsilon} \log \frac{1}{\alpha \bar{\beta}^{\theta}} \quad (40)$$
for $\theta \ge 1$, where $\alpha$ and $\bar{\beta}$ respectively are the significance level and the power of the test of (35).

Proof.
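We begin with deriving the power of the two-sided test (35) as a function of the thresholds of the critical region. The probability of correctly detecting the attacker is as follows.
$$\bar{\beta} = \int_{-\infty}^{k_2} \frac{\epsilon}{2\theta s} \exp\left(-\frac{\epsilon |z - \mu_1|}{\theta s}\right) dz + \int_{k_1}^{\infty} \frac{\epsilon}{2\theta s} \exp\left(-\frac{\epsilon |z - \mu_1|}{\theta s}\right) dz \quad (41)$$
Each addend in (41) corresponds to $\bar{\beta}/2$ and can be rewritten for the thresholds as functions of the power of the test as $k_1 = \mu_1 - \frac{\theta s \log \bar{\beta}}{\epsilon}$ and $k_2 = \mu_1 + \frac{\theta s \log \bar{\beta}}{\epsilon}$. Combining this with $k_2 < k_1$ for the case $\mu_0 < \mu_1$, the bias is lower bounded as follows
$$\Delta\mu > \frac{s}{\epsilon} \log\left(\alpha \bar{\beta}^{\theta}\right).$$
As for the upper bound we have
$$\Delta\mu < \frac{s}{\epsilon} \log \frac{1}{\alpha \bar{\beta}^{\theta}}.$$
By analogy, we get the swapped upper and lower bound for $-\Delta\mu$ for the second case of $\mu_1 < \mu_0$. Finally, we get the interval for the absolute bias as given by (40). This concludes the proof of the corollary.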
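The interval of Corollary 1 is easy to evaluate numerically. The sketch below assumes that the interval reads $\frac{s}{\epsilon} \log(\alpha \bar{\beta}^{\theta}) < |\Delta\mu| < \frac{s}{\epsilon} \log \frac{1}{\alpha \bar{\beta}^{\theta}}$, which is the form obtained by combining the false-alarm thresholds with the power thresholds in the proof. The function name is hypothetical.

```python
import math


def bias_interval(alpha, power, sensitivity, epsilon, theta=1.0):
    # Assumed form of the interval:
    #   (s/eps) * log(alpha * power**theta) < |bias| < (s/eps) * log(1 / (alpha * power**theta))
    if not (0.0 < alpha <= 1.0 and 0.0 < power <= 1.0):
        raise ValueError("alpha and power must lie in (0, 1]")
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    upper = (sensitivity / epsilon) * math.log(1.0 / (alpha * power ** theta))
    return -upper, upper
```

Because $\alpha \bar{\beta}^{\theta} \le 1$, the lower end is non-positive, so in practice the upper end is the binding constraint on the absolute bias. It grows as $\epsilon$ shrinks, meaning that stronger privacy leaves more room for the injected bias.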

Adversarial Classification in Gaussian Mechanisms
Next, we apply a source-coding approach to anomaly detection under DP, which results in an upper bound on the variance of the additional data X a as a function of the sensitivity of the mechanism and the original data's statistics. Additionally, we present a statistical trade-off between the probability of false alarm, privacy budget and the impact of the attack for the first-order statistics of the data in Section 4.2.

Privacy-Distortion Trade-off for Second-Order Statistics
The idea applied here is to render the problem of adversarial classification under DP as a lossy source-coding problem. Instead of using the mutual information between the input and output (or the input's estimate obtained via the output), considering the adversary's conflicting goals we derive the mutual information between the datasets before and after the attack. We present the main result for Gaussian mechanism by the following theorem.
Theorem 5. The privacy-distortion function for a dataset $X^n$ and the Gaussian mechanism as defined by (6) is given as a function of $s$, $f_n$, $\sigma_{X_i}$ and $\sigma_{X_a}$ for $s \in \left[0, \prod_{i}^{n} \sigma^2_{X_i}\right]$ and is zero elsewhere. Here $\sigma_{X_i}$ denotes the standard deviation of $X_i$ for $i = 1, \cdots, n$, $f_n$ is some constant dependent on the size of the dataset $n$, and $\sigma_{X_a}$ is the standard deviation of the additional data.
Proof. The first expansion of $I(X^n; \tilde{X}^n)$ proceeds as follows. In (51), we apply the property $h(g(x)) \le h(x)$ for any function $g(\cdot)$, due to the concavity of the entropy function, and introduce the lower bound since conditioning reduces entropy. In (52), we plug Definition 5 into the second term after bounding it by the Gaussian entropy.
It is worth noting that the additional factor 2πe appears here as opposed to the original rate-distortion function due to the choice of the query function that aggregates the entire dataset and returns an output of size 1.

Corollary 2.
The second-order statistics of the additional data inserted into the dataset by the adversary are upper bounded by a function of the privacy budget $(\epsilon, \delta)$ and the statistics of the original dataset as follows 2 log(1.25/δ) due to Definition 6 for $n \ge 2$.
Proof. For the second expansion of $I(X^n; \tilde{X}^n)$, we have the following, considering that the neighbor that includes $X_a$ now has $(n + 1)$ entries over $n$ rows as $\tilde{X}^n = \{X_1, X_2, \cdots, X_n + X_a\}$.
Due to the adversary's attack, in the first term of (56), we add up the variances of (n + 1) X i 's including X a . Since (58) ≥ (53), global sensitivity is bounded as follows in terms of the second-order statistics of the original data and those of the additional data X a .
Alternatively, the lower bound on the sensitivity of the Gaussian mechanism can be used as an upper bound on σ 2 X a to yield a threshold in terms of the additional data X a 's variance as a function of the privacy budget and the original data X n 's statistics to guarantee that the adversary avoids being detected.

Remark 3.
The second expansion of the mutual information between neighboring datasets derived in (53) can be related to the well-known rate-distortion function of the Gaussian source, which originally provides the minimum possible transmission rate for a given distortion by balancing (mostly for the Gaussian case) the squared-error distortion against the source variance. This is in line with the adversary's goal in our setting, where the adversary aims to maximize the damage that s/he inflicts on the DP-mechanism. At the same time, to avoid being detected, the attack is calibrated according to the sensitivity, which here replaces the distortion. Thus, similar to the classical rate-distortion theory, the mutual information between the neighbors is minimized for a given sensitivity to simultaneously satisfy the adversary's conflicting goals for the problem of adversarial classification under the Gaussian DP-mechanism.

A Statistical Threshold-First-Order Statistics
Next, we present a statistical trade-off between the privacy budget of the Gaussian mechanism and the adversary's advantage.
Theorem 6. The adversary avoids being correctly detected by the defender with the largest possible power of the test $\bar{\beta} = 1 - \beta$ and the best critical region of size $\alpha = 1 - \bar{\alpha}$ for positive bias, if the following inequality holds where Q(.) denotes the Gaussian Q-function defined as Pr[T > t] and for σ z = √ 2·s·.5·log(1.25/δ) .
By analogy, for negative bias, we have
Proof. The likelihood ratio function $\Lambda$ to choose between $Y - \sum_{i}^{n} X_i$ and $Y - \sum_{i}^{n} X_i - X_a$ results in $z > \tilde{k}$, where $\tilde{k} = \frac{\sigma_z^2 \log k}{\Delta\mu} + \frac{\mu_1 + \mu_0}{2}$, by setting $p_0$ and $p_1$ as Gaussian distributions with respective location parameters $\mu_0$ and $\mu_1$ and the common scale parameter $\sigma_z$. The probability of rejecting $H_0$ in case of an attack is derived using this condition as where Q(.) denotes the Gaussian Q-function defined as Pr[T > t] for standard Gaussian random variables. The threshold of the critical region $k$ for $\Delta\mu > 0$ is obtained as a function of the probability of false alarm as $k = \exp\left\{\frac{\Delta\mu}{\sigma_z}\left(Q^{-1}(\alpha) - \frac{\Delta\mu}{2\sigma_z}\right)\right\}$. The second threshold for negative bias can be obtained similarly. The defender fails to detect the attack if $Y < k + q(X)$, where q(.) is the noiseless query output. By analogy, for $\Delta\mu < 0$, the attack is not detected if the DP output exceeds $\tilde{k} + q(X)$ where $\tilde{k}$ = exp ∆µ
The power of the test for both cases is obtained as follows
Rewriting (62) and (63), we obtain (60) and (61).
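The positive-bias case of the proof can be checked numerically with a short sketch. It assumes zero-mean Gaussian noise under $H_0$, so that a false-alarm probability $\alpha$ gives the threshold $\mu_0 + \sigma_z Q^{-1}(\alpha)$ on $z$ and the power $Q(Q^{-1}(\alpha) - \Delta\mu/\sigma_z)$. The helper names and the noise calibration $\sigma_z = s\sqrt{2\log(1.25/\delta)}/\epsilon$ (the classical Gaussian-mechanism choice) are assumptions made for illustration and may differ from the exact expressions in (60)-(63).

```python
from math import log, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard Gaussian N(0, 1)


def q_func(t):
    """Gaussian Q-function, Pr[T > t] for a standard Gaussian T."""
    return 1.0 - _STD.cdf(t)


def q_inv(p):
    """Inverse Q-function; p must lie strictly between 0 and 1."""
    return _STD.inv_cdf(1.0 - p)


def gaussian_sigma(s, eps, delta):
    """Noise standard deviation of the Gaussian mechanism.

    ASSUMPTION: classical calibration s * sqrt(2 log(1.25/delta)) / eps,
    used here only for illustration.
    """
    return s * sqrt(2.0 * log(1.25 / delta)) / eps


def detection_power(alpha, delta_mu, sigma_z):
    """Power of the one-sided test for delta_mu > 0.

    The test rejects H0 when z > mu0 + sigma_z * Q^{-1}(alpha), which has
    false-alarm probability alpha under H0 and, under H1 (mean shifted by
    delta_mu), detection probability Q(Q^{-1}(alpha) - delta_mu / sigma_z).
    """
    return q_func(q_inv(alpha) - delta_mu / sigma_z)


if __name__ == "__main__":
    s, delta = 1.0, 1e-5  # hypothetical sensitivity and delta
    for eps in (0.1, 1.0, 5.0):
        sigma_z = gaussian_sigma(s, eps, delta)
        roc = [(a, round(detection_power(a, 2.0 * s, sigma_z), 4))
               for a in (0.01, 0.05, 0.1, 0.5)]
        print(f"eps={eps}: {roc}")
```

Under these assumptions, a larger $\epsilon$ shrinks $\sigma_z$ and pushes the power toward 1 for a fixed bias, while a small $\epsilon$ leaves the power close to $\alpha$.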

Kullback-Leibler DP and Chernoff DP for Adversarial Classification
This part is reserved for adaptation of existing quantities from information theory such as the relative entropy or Kullback-Leibler (KL) divergence and Chernoff information to adversarial classification under DP. In [5], KL-DP is defined as follows.
Definition 7 (KL-DP, [5]). A randomized mechanism $P_{Y|X}$ guarantees $\epsilon$-KL-DP if the following inequality holds for all its neighboring datasets $x$ and $\tilde{x}$.
In [5] (Theorem 1), KL-DP is proven to satisfy the following chain of inequalities
In the upcoming part, we derive KL-DP for Laplace mechanisms. Additionally, in Section 5.2, we introduce a new metric of DP based on Chernoff information for adversarial classification under Gaussian mechanisms.

Laplace Mechanisms
This section is dedicated to the derivation of the relative entropy, or Kullback-Leibler (KL) divergence, between two Laplace distributions and its adaptation to adversarial classification through KL-DP. For the problem and the associated model described in Section 2.1, the neighboring datasets can be viewed as those where the output of the query is $\sum_{i=1}^{n} X_i$ before the attack and $\left(\sum_{i=1}^{n} X_i + X_a\right)$ after the attack, for both Laplace and Gaussian mechanisms. The corresponding distributions are considered as the DP noise with and without the value of $X_a$ induced by the attacker, as in our original hypothesis testing problem in (7). To be consistent with the hypotheses in (7), we set $P_{Y|X=x} \sim \mathrm{Lap}(\mu_0, s/\epsilon)$ and, for the neighbor, we have $\mathrm{Lap}(\mu_1, \theta s/\epsilon)$.
In step (a), we substituted $E_{p_0}[|z - \mu_0|]$ by $b_0$, since for $z \sim \mathrm{Lap}(\mu, b)$ we have $|z - \mu| \sim \mathrm{Exp}(1/b)$ and the mean of the exponential random variable is the inverse of its parameter. For the last term, $\frac{1}{b_1} E_{p_0}[|z - \mu_1|]$, we must consider two different cases due to the absolute value in the exponent of the Laplace distribution. In the following first expansion, the two distributions are centered around $\mu_0$ and $\mu_1$ where $\mu_0 < \mu_1$.
Remark 4. The authors of [6] also seek the maximum bias induced by the adversary, where the objective function is the minimum relative entropy between the probability distribution of the dataset before ($p_0$) and after the attack ($p_1$). Nevertheless, the choice of the objective function is set as $D(p_1 \| p_0) \leq \gamma$ for some $\gamma$. For the Laplace distribution, KL divergence is not symmetric, hence $D(p_0 \| p_1) \neq D(p_1 \| p_0)$. Therefore, due to Stein's lemma [19], (72) and (76) should be used instead.
In Section 6, we present numerical evaluation results of (73) for different values of the privacy parameter as well as various levels of attack.
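As an illustration of the derivation above, the sketch below evaluates the standard closed form of the KL divergence between two Laplace densities, $D(p_0 \| p_1) = \log\frac{b_1}{b_0} - 1 + \frac{|\mu_0 - \mu_1|}{b_1} + \frac{b_0}{b_1}\exp\left(-\frac{|\mu_0 - \mu_1|}{b_0}\right)$, and compares it with a direct numerical integration. The closed form is the textbook expression and is not claimed to be typeset identically to (72), (73) or (76); the function names and the parameter values (`s`, `eps`, `theta`) are hypothetical.

```python
from math import exp, log


def laplace_kl(mu0, b0, mu1, b1):
    """Closed-form D(p0 || p1) for p0 = Lap(mu0, b0) and p1 = Lap(mu1, b1).

    Uses E_p0|z - mu0| = b0 and E_p0|z - mu1| = d + b0 * exp(-d / b0),
    where d = |mu0 - mu1|.
    """
    d = abs(mu0 - mu1)
    return log(b1 / b0) - 1.0 + d / b1 + (b0 / b1) * exp(-d / b0)


def laplace_kl_numeric(mu0, b0, mu1, b1, steps=200_000, width=40.0):
    """Trapezoidal approximation of the same divergence (sanity check only)."""
    lo, hi = mu0 - width * b0, mu0 + width * b0
    h = (hi - lo) / steps

    def integrand(z):
        p0 = exp(-abs(z - mu0) / b0) / (2.0 * b0)
        log_ratio = log(b1 / b0) - abs(z - mu0) / b0 + abs(z - mu1) / b1
        return p0 * log_ratio

    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h


if __name__ == "__main__":
    s, eps, theta = 1.0, 1.0, 1.5     # hypothetical sensitivity, budget, scale factor
    b0, b1 = s / eps, theta * s / eps  # scales of the two hypotheses
    mu0, mu1 = 0.0, 2.0 * s            # attack bias of twice the sensitivity
    print("D(p0||p1) closed form:", laplace_kl(mu0, b0, mu1, b1))
    print("D(p0||p1) numeric    :", laplace_kl_numeric(mu0, b0, mu1, b1))
    print("D(p1||p0) closed form:", laplace_kl(mu1, b1, mu0, b0))
```

With $\theta \neq 1$ the two directions of the divergence differ, which is the asymmetry that Remark 4 relies on.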

Chernoff DP for Gaussian Mechanism
In the classical approach, the best error exponent in hypothesis testing for choosing between two probability distributions is the Kullback-Leibler divergence between these two distributions due to Stein's lemma [19]. In the Bayesian setting, however, prior probabilities are assigned to each of the hypotheses in a binary hypothesis testing problem, and the best error exponent when the weighted sum probability of error, i.e., $\pi = a\alpha + b\beta$ for $b = 1 - a$ and $a \in (0, 1)$, is minimized corresponds to the Chernoff information/divergence. The Chernoff information between two probability distributions $f_0$ and $f_1$ with prior probabilities $a$ and $b$ is defined as
The Rényi divergence, denoted $D_a(f_0 \| f_1)$, between two Gaussian distributions $\mathcal{N}(\mu_0, \sigma_0^2)$ and $\mathcal{N}(\mu_1, \sigma_1^2)$ is given in [20] by where $(\sigma^2)^*_a = a\sigma_1^2 + b\sigma_0^2$. Using the relation $D_a(f_0 \| f_1) = \frac{1}{1-a} C_a(f_0 \| f_1)$ between Chernoff information and Rényi divergence, we obtain the Gaussian univariate Chernoff information with different standard deviations $\sigma_i$ for $i = 0, 1$ as follows.
On the other hand, KL divergence between two Gaussian distributions denoted
The next definition provides an adaptation of Chernoff information to quantify the DP guarantee as a stronger alternative to KL-DP of Definition 7 and $(\epsilon, \delta)$-DP for Gaussian mechanisms. We apply this to our problem setting for adversarial classification under Gaussian mechanisms, where the query outputs before and after the attack are $\sum_{i}^{n} X_i$ and $\sum_{i}^{n} X_i + X_a$, respectively. The corresponding distributions are considered as the DP noise with and without the value of $X_a$ induced by the attacker, as in our original hypothesis testing problem in (7) in Section 2.1.1.

Definition 8 (Chernoff DP).
A randomized mechanism $P_{Y|X}$ guarantees $\epsilon$-Chernoff-DP if the following inequality holds for all its neighboring datasets $x$ and $\tilde{x}$ where $C_a(\cdot \| \cdot)$ is defined by (78).
Ref. [5] (Theorem 1) proves that KL-DP defined in Definition 7 is a stronger privacy metric than the $(\epsilon, \delta)$-DP that is achieved by the Gaussian mechanism. Accordingly, the following chain of inequalities is proven to hold for various definitions of DP where MI-DP refers to the mutual information DP defined by $\sup_{i, P_{X^n}} I(X_i; Y | X^{-i}) \leq \epsilon$ nats for a dataset $X^n = \{X_1, \cdots, X_n\}$ with the corresponding output $Y$ according to the randomized mechanism represented by $P_{Y|X^n}$, where $X^{-i}$ denotes the dataset entries excluding $X_i$. $\delta$-DP represents the case when $\epsilon = 0$ in $(\epsilon, \delta)$-DP.
The Chernoff-information-based definition of DP is a stronger privacy metric than KL-DP, and thus than $(\epsilon, \delta)$-DP, for the Gaussian mechanism due to prior probabilities. Such a comparison is presented numerically in Section 6. For the special case of equal standard deviation of both distributions, the Chernoff information $C(f_0 \| f_1)$ is exactly $a \cdot b \cdot D_{KL}(f_0 \| f_1)$.
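The relation between the Gaussian Chernoff information and KL divergence can be checked with a small sketch. It assumes the univariate Gaussian Rényi divergence in the form commonly attributed to [20], $D_a(f_0 \| f_1) = \ln\frac{\sigma_1}{\sigma_0} + \frac{1}{2(a-1)}\ln\frac{\sigma_1^2}{(\sigma^2)^*_a} + \frac{a(\mu_0 - \mu_1)^2}{2(\sigma^2)^*_a}$, together with the relation $C_a = (1 - a) D_a$ stated above. The function names are hypothetical and the expressions are not guaranteed to match the typesetting of (78) and the equations that follow it.

```python
from math import log


def renyi_gaussian(a, mu0, sigma0, mu1, sigma1):
    """Renyi divergence of order a in (0, 1) between N(mu0, sigma0^2) and N(mu1, sigma1^2).

    ASSUMPTION: closed form with (sigma^2)*_a = a*sigma1^2 + (1 - a)*sigma0^2.
    """
    var_star = a * sigma1 ** 2 + (1.0 - a) * sigma0 ** 2
    return (log(sigma1 / sigma0)
            + log(sigma1 ** 2 / var_star) / (2.0 * (a - 1.0))
            + a * (mu0 - mu1) ** 2 / (2.0 * var_star))


def chernoff_gaussian(a, mu0, sigma0, mu1, sigma1):
    """Chernoff information with prior a, via C_a = (1 - a) * D_a."""
    return (1.0 - a) * renyi_gaussian(a, mu0, sigma0, mu1, sigma1)


def kl_gaussian(mu0, sigma0, mu1, sigma1):
    """KL divergence D(N(mu0, sigma0^2) || N(mu1, sigma1^2))."""
    return (log(sigma1 / sigma0)
            + (sigma0 ** 2 + (mu0 - mu1) ** 2) / (2.0 * sigma1 ** 2)
            - 0.5)


if __name__ == "__main__":
    a = 0.3
    # Equal standard deviations: C_a equals a * (1 - a) * D_KL.
    print(chernoff_gaussian(a, 0.0, 1.0, 2.0, 1.0),
          a * (1.0 - a) * kl_gaussian(0.0, 1.0, 2.0, 1.0))
    # Unequal standard deviations: C_a stays below D_KL.
    print(chernoff_gaussian(0.5, 0.0, 1.0, 1.0, 2.0),
          kl_gaussian(0.0, 1.0, 1.0, 2.0))
```

Since the Rényi divergence is non-decreasing in its order and $1 - a < 1$, the Chernoff quantity computed this way never exceeds the KL divergence for $a \in (0, 1)$, which is consistent with the comparison described above.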

ROC Curves for Laplace Mechanism
Figures 1 and 2 present the numerical evaluation results of the one-sided hypothesis test for the Laplace DP noise parameters. The plots depict two different possible scenarios where the bias induced by the adversary is above and below the sensitivity of the system. $\mu_0$ is set equal to 0, hence $\Delta\mu = \mu_1$. As highlighted in the legend, we plot the ROC curves for different values of $\epsilon$ and $\theta$. We observe that when the privacy parameter is very small (e.g., $\epsilon = 0.015$), the test is no longer accurate and detecting the adversary can be considered similar to random guessing. On the other hand, when the privacy parameter is very large, the accuracy of the test becomes higher at the expense of the privacy guarantee. Furthermore, as opposed to [8] (Theorem 5), we notice that the ROC curves strongly depend on the sensitivity $s$, hence on the mapping function (query) applied to the input. Indeed, when $\mu_1 > s$ the accuracy of the test becomes less important as the adversary is trying to harm the system. Figures 1 and 2 also show that the choice of $\theta$ affects the power of the test. When $\theta = 1$, the test only consists in choosing between two location parameters. With respect to the choice of $\theta$, numerical evaluation shows that the power of the test on the y-axis decreases with $\theta$. For each value of $\epsilon$, the ROC curves that correspond to $\theta = 1$ outperform those with bigger variance as of a certain level of $\alpha$, and as the privacy is decreased (or equivalently when $\epsilon$ is increased) this flip in performance occurs for much smaller choices of the probability of false alarm.
The ROC curves corresponding to the two-sided hypothesis test (35) are depicted in Figures 3 and 4 for the same values of privacy budget and $\theta$ used in the previous case. As expected, the ROC curves for the two-tailed test show the same behavior as in Figures 1 and 2 with respect to the effect of the change in the privacy budget on the accuracy of the test (β increases with $\epsilon$). On the other hand, we observe that in the second case the test is less accurate. This is justified by the lack of knowledge on the sign of $\Delta\mu$. Indeed, the previous test is considered as being more precise ($\Delta\mu > 0$).
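The qualitative behavior described for the one-sided case can be reproduced with a simplified sketch. It assumes a plain threshold test on the noise $z$ (reject $H_0$ when $z > t$) with $p_0 = \mathrm{Lap}(0, s/\epsilon)$ and $p_1 = \mathrm{Lap}(\Delta\mu, \theta s/\epsilon)$; this is a stand-in for illustration and is not necessarily the exact critical region behind Figures 1 and 2. All function names and parameter values are hypothetical.

```python
from math import exp, log


def laplace_sf(t, mu, b):
    """Survival function Pr[z > t] for z ~ Lap(mu, b)."""
    if t >= mu:
        return 0.5 * exp(-(t - mu) / b)
    return 1.0 - 0.5 * exp((t - mu) / b)


def laplace_isf(alpha, mu, b):
    """Threshold t such that Pr[z > t] = alpha, for 0 < alpha < 1."""
    if alpha <= 0.5:
        return mu - b * log(2.0 * alpha)
    return mu + b * log(2.0 * (1.0 - alpha))


def one_sided_power(alpha, delta_mu, s, eps, theta=1.0):
    """Power of the threshold test with false-alarm probability alpha."""
    b0, b1 = s / eps, theta * s / eps
    t = laplace_isf(alpha, 0.0, b0)      # threshold calibrated under H0 (mu0 = 0)
    return laplace_sf(t, delta_mu, b1)   # rejection probability under H1


if __name__ == "__main__":
    s, delta_mu = 1.0, 1.0  # hypothetical sensitivity and attack bias
    for eps in (0.015, 1.0, 5.0):
        for theta in (1.0, 1.5):
            roc = [(a, round(one_sided_power(a, delta_mu, s, eps, theta), 4))
                   for a in (0.01, 0.1, 0.3)]
            print(f"eps={eps}, theta={theta}: {roc}")
```

In this simplified setting, a very small $\epsilon$ yields a power that is nearly equal to $\alpha$ for $\theta = 1$, matching the random-guessing regime noted above, whereas a large $\epsilon$ drives the power toward 1.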

KL-DP for Adversarial Classification
KL-DP (73) derived in Section 5.1 is numerically evaluated in Figure 5 for different levels of attack in comparison to the sensitivity of the system for both $\theta = 1$ and $\theta = 1.5$.
Accordingly, the effect of the attack is compared with the upper bound $\exp\{\epsilon\}$ in (64). Figure 5 shows that increasing the impact of the attack with respect to the sensitivity closes the gap with the upper bound, and for the case $|\Delta\mu| = 4 \cdot s$ the KL-DP upper bound is violated for moderate privacy budgets.

Numerical Evaluation Results for the Gaussian Mechanism
Figures 6 and 7 present ROC curves computed using the threshold of (60) for adversarial classification under Gaussian DP for two different scenarios, where the impact of the attack is greater and less than the $L_2$ norm global sensitivity (in this order), for various levels of privacy budget. We observe that in the low privacy regime (i.e., when $\epsilon$ is large) the accuracy of the test is high, which comes at the expense of the privacy guarantee. As the privacy budget is decreased (higher privacy), the test is no longer accurate and the adversary cannot be correctly detected with high probability. Another observation can be made based on the relationship between the attack and the sensitivity. Unsurprisingly, increasing the bias $\Delta\mu$ relative to $s$ also increases the probability of correctly detecting the attacker.

Privacy-Distortion Trade-Off
The upper bound (54) on the additional data's variance presented in Corollary 2 is tested for the two opposing hypotheses in (7), and the corresponding thresholds of the critical region (to be compared to the chi-square table values) are depicted in Figure 8. Here the null hypothesis, which states that the defender fails to detect the attack, corresponds to the case where $\sigma^2_{X_a}$ respects the upper bound (54), whereas the alternative hypothesis claims that the variance of $X_a$ exceeds the proposed bound by the factors stated in the legend of the figure.
Increasing the privacy budget also increases the threshold, and $\theta\sigma^2_{X_a}$ violates the upper bound for $\theta > 1$. This is consistent with Figure 8.

KL-DP vs. Chernoff DP
Figure 9 depicts Chernoff DP and KL-DP for various levels of privacy and of the impact of the attack, which were set as a function of the global sensitivity. Accordingly, the attack is compared to the privacy constraint in Definition 8, which is referred to as the upper bound in the legend. Due to prior probabilities, Chernoff information is tighter than KL divergence; consequently, it provides a stricter privacy constraint. Figure 9 confirms that increasing the impact of the attack as a function of the sensitivity closes the gap with the upper bound for Chernoff-DP. Additionally, the KL-DP does not violate the upper bound of the privacy budget only in the high privacy regime (when $\epsilon$ is small) for the cases of $\Delta\mu = 2 \cdot s$ and $\Delta\mu = 4 \cdot s$.

Conclusions
We characterized statistical trade-offs between the security of the Laplace mechanism and the privacy-protected adversary's advantage for adversarial classification using one- and two-tailed hypothesis testing. In both settings, we established trade-offs between the sensitivity of the system, the privacy parameter and the damage caused by the attack (that is, the bias due to the attack) using the threshold(s) of the critical region in choosing between the hypotheses of whether or not the defender detects the attack. Such trade-offs are presented as functions of the corresponding error probabilities. Numerical evaluation results show that increasing the privacy parameter also increases the accuracy of the hypothesis test. Additionally, we derived KL-DP for adversarial classification in the Laplace mechanism. According to the numerical evaluation results, increasing the impact of the attack closes the gap with the DP upper bound $\exp\{\epsilon\}$ and in some cases even violates it for moderate privacy budgets.
We established statistical and information-theoretic trade-offs between the security of the Gaussian DP-mechanism and the advantage of the adversary, who aims to trick the classifier that detects anomalies. Accordingly, we determined a statistical threshold that offsets the DP-mechanism's privacy budget against the impact of the adversary's attack to remain undetected, and introduced the privacy-distortion function, which we used for bounding the impact of the adversary's modification on the original data. We introduced Chernoff DP and its application to adversarial classification, which turned out to be a stronger privacy metric than KL-DP and $(\epsilon, \delta)$-DP for the Gaussian mechanism.