Given the above-mentioned issue, we provide ARA solutions to AC. We focus first on modelling the adversary’s problem in the operation phase. We present the classification problem faced by C as a Bayesian decision analysis problem in Figure 3, derived from Figure 2. In it, A’s decision appears as random to the classifier, since she does not know how the adversary will attack the data. For notational convenience, when necessary we distinguish between random variables and their realisations using upper and lower case letters, respectively; in particular, we denote by X the random variable referring to the original instance (before the attack) and by ${X}^{\prime}$ that referring to the possibly attacked instance. $\widehat{z}$ will indicate an estimate of z.
4.1. The Case of Generative Classifiers
Suppose first that a generative classifier is required. As training data is clean by assumption, we can estimate ${p}_{C}\left(y\right)$ (modelling the classifier’s beliefs about the class distribution) and ${p}_{C}(X=x\mid y)$ (modelling her beliefs about the feature distribution given the class when A is not present). In addition, assume that when C observes ${X}^{\prime}={x}^{\prime}$, she can estimate the set ${\mathcal{X}}^{\prime}$ of original instances x potentially leading to the observed ${x}^{\prime}$. As later discussed, in most applications this will typically be a very large set. When the feature space is endowed with a metric d, an approach to approximate ${\mathcal{X}}^{\prime}$ would be to consider ${\mathcal{X}}^{\prime}=\{x:d(x,{x}^{\prime})<\rho \}$ for a certain threshold $\rho $.
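As an illustration of the metric-based approximation of ${\mathcal{X}}^{\prime}$, the following Python sketch enumerates the candidate set in a small binary feature space under the Hamming distance. The feature space, the distance and the function name are illustrative assumptions, not part of the method above.

```python
import itertools

def candidate_originals(x_prime, rho):
    """Approximate X' = {x : d(x, x') < rho}, the set of instances that
    could have originated the observed x'.  Assumes, for illustration,
    binary features and the Hamming distance d."""
    n = len(x_prime)
    return [x for x in itertools.product([0, 1], repeat=n)
            if sum(a != b for a, b in zip(x, x_prime)) < rho]

# Example: for observed x' = (1, 0, 1) and rho = 2, the set contains x'
# itself plus its three Hamming-distance-1 neighbours.
X_prime_set = candidate_originals((1, 0, 1), rho=2)
```

In realistic dimensions this exhaustive enumeration becomes infeasible, in line with the remark above that ${\mathcal{X}}^{\prime}$ is typically very large; the sketch only conveys the construction.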
Given the above, when observing ${x}^{\prime}$ the classifier should choose the class with maximum posterior expected utility (7). Applying Bayes formula, and ignoring the denominator, which is irrelevant for optimisation purposes, she must find the class

$${y}_{c}^{*}\left({x}^{\prime}\right)=\underset{{y}_{c}}{\mathrm{arg\,max}}\sum_{i=1}^{2}{u}_{C}({y}_{c},{y}_{i})\,{p}_{C}({y}_{i})\sum_{x\in {\mathcal{X}}^{\prime}}{p}_{C}({X}^{\prime}={x}^{\prime}\mid X=x,{y}_{i})\,{p}_{C}(X=x\mid {y}_{i}).\qquad (8)$$
In such a way, A’s modifications are taken into account through the probabilities ${p}_{C}({X}^{\prime}={x}^{\prime}\mid X=x,y)$. At this point, recall that the focus is restricted to integrity violation attacks, so that innocent instances are left unperturbed. Then, ${p}_{C}({X}^{\prime}={x}^{\prime}\mid X=x,{y}_{2})=\delta ({x}^{\prime}-x)$ and problem (8) becomes

$${y}_{c}^{*}\left({x}^{\prime}\right)=\underset{{y}_{c}}{\mathrm{arg\,max}}\left[{u}_{C}({y}_{c},{y}_{1})\,{p}_{C}({y}_{1})\sum_{x\in {\mathcal{X}}^{\prime}}{p}_{C}({x}^{\prime}\mid x,{y}_{1})\,{p}_{C}(x\mid {y}_{1})+{u}_{C}({y}_{c},{y}_{2})\,{p}_{C}({y}_{2})\,{p}_{C}({x}^{\prime}\mid {y}_{2})\right].\qquad (9)$$
Note that should we assume full common knowledge, we would know A’s beliefs and preferences and, therefore, we would be able to solve his problem exactly: when A receives an instance x from class ${y}_{1}$, we could compute the transformed instance. In this case, ${p}_{C}({X}^{\prime}={x}^{\prime}\mid X=x,{y}_{1})$ would be 1 only for the x whose transformed instance coincides with that observed by the classifier, and 0 otherwise. Inserting this in (9), we would recover Dalvi’s formulation (5). However, common knowledge about beliefs and preferences does not hold. Thus, when solving A’s problem we have to take into account our uncertainty about his elements and, given that he receives an instance x with label ${y}_{1}$, we will not be certain about the attacked output ${x}^{\prime}$. This will be reflected in our estimate ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$, which will not be 0 or 1 as in Dalvi’s approach (stage 3). With this estimate, we would solve problem (9), summing ${p}_{C}(x\mid {y}_{1})$ over all possible originating instances, with each element weighted by ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$.
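To fix ideas, problem (9) can be evaluated mechanically once its ingredients are available. The sketch below does so for a small discrete feature set; the container names and the helper `p_attack`, standing in for ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$, are hypothetical, and classes are encoded as 1 (malicious, ${y}_{1}$) and 2 (innocent, ${y}_{2}$).

```python
def adversary_aware_class(x_prime, X_set, p_y, p_x_given_y1,
                          p_xprime_given_y2, p_attack, u_C):
    """Class maximising the adversary-aware expected utility of problem (9).
    All probability estimates are assumed supplied; p_attack(xp, x) plays
    the role of p_C(x' | x, y1)."""
    eu = {}
    for y_c in (1, 2):
        # Malicious branch: marginalise over originating instances x in X'.
        malicious = u_C[(y_c, 1)] * p_y[1] * sum(
            p_attack(x_prime, x) * p_x_given_y1[x] for x in X_set)
        # Innocent branch: integrity attacks leave innocent data untouched.
        innocent = u_C[(y_c, 2)] * p_y[2] * p_xprime_given_y2[x_prime]
        eu[y_c] = malicious + innocent
    return max(eu, key=eu.get)

# Toy example with 0-1 utilities and a no-op attack model.
u_C = {(1, 1): 1, (1, 2): 0, (2, 1): 0, (2, 2): 1}
p_y = {1: 0.5, 2: 0.5}
decision = adversary_aware_class(
    1, [0, 1], p_y, {0: 0.1, 1: 0.9}, {0: 0.9, 1: 0.1},
    lambda xp, x: 1.0 if xp == x else 0.0, u_C)
```

With these toy numbers the malicious branch dominates for the observation $1$, so the sketch declares the instance malicious.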
To estimate these last distributions, we resort to A’s problem, assuming that this agent aims at modifying x to maximise his expected utility by making C classify malicious instances as innocent. The decision problem faced by A is presented in Figure 4, derived from Figure 2. In it, C’s decision appears as an uncertainty to A.
To solve the problem, we need ${p}_{A}({y}_{c}^{*}({x}^{\prime})\mid {x}^{\prime})$, which models A’s beliefs about C’s decision when she observes ${x}^{\prime}$. Let p be the probability ${p}_{A}({y}_{c}^{*}(a(x))={y}_{1}\mid a(x))$ that A assigns to C declaring the instance malicious when she observes ${x}^{\prime}=a(x)$. Since A will have uncertainty about it, let us model its density through ${f}_{A}(p\mid {x}^{\prime}=a(x))$ with expectation ${p}_{{x}^{\prime}=a(x)}^{A}$. Then, upon observing an instance x of class ${y}_{1}$, A would choose the data transformation maximising his expected utility:

$${a}^{*}(x)=\underset{a}{\mathrm{arg\,max}}\left[{u}_{A}({y}_{1},{y}_{1})\,{p}_{{x}^{\prime}=a(x)}^{A}+{u}_{A}({y}_{2},{y}_{1})\left(1-{p}_{{x}^{\prime}=a(x)}^{A}\right)\right],\qquad (10)$$

where ${u}_{A}({y}_{i},{y}_{j})$ is the attacker’s utility when the defender classifies an instance of class ${y}_{j}$ as one of class ${y}_{i}$.
However, the classifier knows neither the adversary’s utilities ${u}_{A}$ nor his probabilities ${p}_{z=a(x)}^{A}$. Let us model such uncertainty through a random utility function ${U}_{A}$ and a random expectation ${P}_{z=a(x)}^{A}$. Then, we could solve for the random attack, optimising the random expected utility

$${X}^{\prime}(x,{y}_{1})=\underset{a}{\mathrm{arg\,max}}\left[{U}_{A}({y}_{1},{y}_{1})\,{P}_{z=a(x)}^{A}+{U}_{A}({y}_{2},{y}_{1})\left(1-{P}_{z=a(x)}^{A}\right)\right].\qquad (11)$$

We then use such distribution and make (assuming that the set of attacks is discrete, and similarly in the continuous case) ${p}_{C}({x}^{\prime}\mid x,{y}_{1})=Pr({X}^{\prime}(x,{y}_{1})={x}^{\prime})$, which was the missing ingredient in problem (9). Observe that it could be the case that $Pr({X}^{\prime}(x,{y}_{1})=x)>0$, i.e., the attacker does not modify the instance.
Now, without loss of generality, we can associate utility 0 with the worst consequence and 1 with the best one, the remaining consequences having intermediate utilities. In A’s problem, his best consequence holds when the classifier accepts a malicious instance as innocent (he has opportunities to continue with his operations), while the worst consequence appears when the defender stops an instance (he has wasted effort in a lost opportunity), other consequences being intermediate. Therefore, we adopt ${U}_{A}({y}_{1},{y}_{1})\sim {\delta}_{0}$ and ${U}_{A}({y}_{2},{y}_{1})\sim {\delta}_{1}$. Then, the Attacker’s random optimal attack would be

$${X}^{\prime}(x,{y}_{1})=\underset{z}{\mathrm{arg\,min}}\,{P}_{z}^{A}.$$
Modelling ${P}_{z=a(x)}^{A}$ is more delicate. It entails strategic thinking and could lead to a hierarchy of decision making problems, described in [43] in a simpler context. A heuristic to assess it is based on using the probability $r=P{r}_{C}({y}_{c}^{*}(z)={y}_{1}\mid z)$ that C assigns to the received instance being malicious, assuming that she observed z, with some uncertainty around it. As it is a probability, r ranges in $[0,1]$ and we could make ${P}_{z=a(x)}^{A}\sim \mathcal{B}e({\delta}_{1},{\delta}_{2})$, with mean ${\delta}_{1}/({\delta}_{1}+{\delta}_{2})=r$ and variance $\left({\delta}_{1}{\delta}_{2}\right)/\left[{({\delta}_{1}+{\delta}_{2})}^{2}({\delta}_{1}+{\delta}_{2}+1)\right]=var$ as perceived. $var$ has to be tuned depending on the amount of knowledge C has about A. Details on how to estimate r are problem dependent.
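Given the mean and variance expressions above, the Beta hyperparameters $({\delta}_{1},{\delta}_{2})$ matching a perceived mean r and variance $var$ can be recovered in closed form; the following is a convenience sketch (the helper name `beta_params` is our own).

```python
def beta_params(r, var):
    """Hyperparameters (d1, d2) of a Beta distribution with mean r and
    variance var, used to model P^A_{z=a(x)}.  Requires var < r(1 - r)."""
    assert 0 < var < r * (1 - r), "variance too large for a Beta with this mean"
    s = r * (1 - r) / var - 1          # s = d1 + d2
    return r * s, (1 - r) * s

# Example: mean r = 0.3 with variance 0.01 gives (d1, d2) = (6, 14).
d1, d2 = beta_params(r=0.3, var=0.01)
```

Larger $var$ (flatter Beta) encodes that C knows little about A, matching the tuning remark above.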
In general, to approximate ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$ we use Monte Carlo (MC) simulation, drawing K samples ${P}_{z}^{A,k}$, $k=1,\cdots ,K$, from ${P}_{z}^{A}$, finding ${X}_{k}^{\prime}(x,{y}_{1})=\mathrm{arg}\,{\mathrm{min}}_{z}\,{P}_{z}^{A,k}$ and estimating ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$ through the proportion of times in which the result of the random optimal attack coincides with the instance actually observed by the defender:

$${\widehat{p}}_{C}({x}^{\prime}\mid x,{y}_{1})=\frac{\#\left\{k:{X}_{k}^{\prime}(x,{y}_{1})={x}^{\prime}\right\}}{K}.\qquad (12)$$

It is easy to prove, using arguments in [44], that (12) converges almost surely to ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$. In this, and other MC approximations considered, recall that the sample sizes are essentially dictated by the required precision. Based on the Central Limit Theorem [45], MC sums approximate integrals with probabilistic bounds of order $\sqrt{var/N}$, where N is the MC sum size. To obtain a variance estimate, we run a few iterations, estimate the variance, and then choose the required size based on such bounds.
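The MC scheme just described can be sketched as follows, assuming a discrete set of candidate attack outputs z and Beta-distributed random expectations ${P}_{z}^{A}$ as above; `attacks` and `beta_params_of` are hypothetical helpers supplying, respectively, the reachable instances and the Beta hyperparameters of each ${P}_{z}^{A}$.

```python
import random

def estimate_attack_probs(attacks, beta_params_of, K=1000, seed=0):
    """MC estimate of p_C(x' | x, y1): draw p^A_z ~ Beta(P^A_z) for each
    candidate z, take the random optimal attack argmin_z p^A_z (the
    attacker's random expected utility reduces to minimising detection
    probability), and return the proportion of each outcome."""
    rng = random.Random(seed)
    counts = {z: 0 for z in attacks}
    for _ in range(K):
        draws = {z: rng.betavariate(*beta_params_of(z)) for z in attacks}
        z_star = min(draws, key=draws.get)   # random optimal attack
        counts[z_star] += 1
    return {z: c / K for z, c in counts.items()}
```

For instance, with two candidate outputs whose random detection probabilities concentrate around 0.2 and 0.8, the estimator assigns nearly all mass to the less detectable one, as expected.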
Once we have an approach to estimate the required probabilities, we implement the scheme described through Algorithm 1, which reflects an initial training phase to estimate the classifier and an operational phase that performs the above once a (possibly perturbed) instance ${x}^{\prime}$ is received by the classifier.
Algorithm 1 General adversarial risk analysis (ARA) procedure for AC. Generative 
Input: Training data $D$, test instance ${x}^{\prime}$.
Output: A classification decision ${y}_{c}^{*}\left({x}^{\prime}\right)$.
Training
    Train a generative classifier to estimate ${p}_{C}\left(y\right)$ and ${p}_{C}(x\mid y)$.
End Training
Operation
    Read ${x}^{\prime}$.
    Estimate ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$ for all $x\in {\mathcal{X}}^{\prime}$.
    Solve problem (9).
    Output ${y}_{c}^{*}\left({x}^{\prime}\right)$.
End Operation

4.2. The Case of Discriminative Classifiers
With discriminative classifiers, we cannot use the previous approach as we lack an estimate of ${p}_{C}(X=x\mid y)$. Alternatively, assume for the moment that the classifier knows the attack that she has suffered and that it is invertible, in the sense that she may recover the original instance $x={a}^{-1}\left({x}^{\prime}\right)$. Then, rather than classifying based on (6), as an adversary unaware classifier would do, she should classify based on

$${y}_{c}^{*}({x}^{\prime})=\underset{{y}_{c}}{\mathrm{arg\,max}}\sum_{i=1}^{2}{u}_{C}({y}_{c},{y}_{i})\,{p}_{C}({y}_{i}\mid {a}^{-1}({x}^{\prime})).$$
However, she actually has uncertainty about the attack a, which induces uncertainty about the originating instance x. Suppose we model our uncertainty about the origin x of the attack through a distribution ${p}_{C}(X=x\mid {X}^{\prime}={x}^{\prime})$ with support over the set ${\mathcal{X}}^{\prime}$ of reasonable originating features x. Now, marginalising over all possible originating instances, the expected utility that the classifier would get for her classification decision ${y}_{c}$ would be

$$\sum_{i=1}^{2}{u}_{C}({y}_{c},{y}_{i})\sum_{x\in {\mathcal{X}}^{\prime}}{p}_{C}({y}_{i}\mid x)\,{p}_{C}(x\mid {x}^{\prime}),\qquad (13)$$

and we would solve for

$${y}_{c}^{*}({x}^{\prime})=\underset{{y}_{c}}{\mathrm{arg\,max}}\sum_{i=1}^{2}{u}_{C}({y}_{c},{y}_{i})\sum_{x\in {\mathcal{X}}^{\prime}}{p}_{C}({y}_{i}\mid x)\,{p}_{C}(x\mid {x}^{\prime}).$$
Typically, the expected utilities (13) are approximated by MC using a sample ${\left\{{x}_{n}\right\}}_{n=1}^{N}$ from ${p}_{C}(x\mid {x}^{\prime})$. Algorithm 2 summarises a general procedure.
Algorithm 2 General ARA procedure for AC. Discriminative 
Input: Monte Carlo size N, training data $D$, test instance ${x}^{\prime}$.
Output: A classification decision ${y}_{c}^{*}\left({x}^{\prime}\right)$.
Training
    Based on $D$, train a discriminative classifier to estimate ${p}_{C}(y\mid x)$.
End Training
Operation
    Read ${x}^{\prime}$.
    Estimate ${\mathcal{X}}^{\prime}$ and ${p}_{C}(x\mid {x}^{\prime})$, $x\in {\mathcal{X}}^{\prime}$.
    Draw sample ${\left\{{x}_{n}\right\}}_{n=1}^{N}$ from ${p}_{C}(x\mid {x}^{\prime})$.
    Find ${y}_{c}^{*}\left({x}^{\prime}\right)=\mathrm{arg}\,{\mathrm{max}}_{{y}_{c}}\frac{1}{N}{\sum}_{i=1}^{2}\left({u}_{C}({y}_{c},{y}_{i})\left[{\sum}_{n=1}^{N}{p}_{C}({y}_{i}\mid {x}_{n})\right]\right)$.
    Output ${y}_{c}^{*}\left({x}^{\prime}\right)$.
End Operation
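The operation phase above reduces to a short MC computation. A minimal sketch, assuming the trained classifier is available as a (hypothetical) callable `p_y1_given_x` returning ${p}_{C}({y}_{1}\mid x)$, with classes encoded as 1 (malicious) and 2 (innocent):

```python
def discriminative_classify(x_samples, p_y1_given_x, u_C):
    """Operation step of the discriminative scheme: given a sample {x_n}
    from p_C(x | x'), approximate each class's expected utility by MC
    averaging and return the maximiser."""
    N = len(x_samples)
    # MC estimate of the posterior class probabilities under attack.
    post = {1: sum(p_y1_given_x(x) for x in x_samples) / N}
    post[2] = 1.0 - post[1]
    # Expected utility of each classification decision y_c.
    eu = {y_c: sum(u_C[(y_c, y)] * post[y] for y in (1, 2)) for y_c in (1, 2)}
    return max(eu, key=eu.get)
```

With 0-1 utilities this declares the instance malicious exactly when the averaged ${p}_{C}({y}_{1}\mid {x}_{n})$ exceeds 1/2, matching the formula in Algorithm 2.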

To implement this approach, we need to be able to estimate ${\mathcal{X}}^{\prime}$ and ${p}_{C}(x\mid {x}^{\prime})$ or, at least, sample from such distribution. A powerful approach samples from ${p}_{C}(X\mid {X}^{\prime}={x}^{\prime})$ by leveraging approximate Bayesian computation (ABC) techniques [46]. This requires being able to sample from ${p}_{C}\left(X\right)$ and ${p}_{C}({X}^{\prime}\mid X=x)$, which we address first.
Estimating ${p}_{C}\left(x\right)$ is done using the training data, untainted by assumption. For this, we can employ an implicit generative model, such as a generative adversarial network [47] or an energy-based model [48]. In turn, sampling from ${p}_{C}({x}^{\prime}\mid x)$ entails strategic thinking. Notice first that

$${p}_{C}({x}^{\prime}\mid x)={p}_{C}({x}^{\prime}\mid x,{y}_{1})\,{p}_{C}({y}_{1}\mid x)+\delta ({x}^{\prime}-x)\,{p}_{C}({y}_{2}\mid x).$$

We easily generate samples from ${p}_{C}(y\mid x)$, as we can estimate those probabilities based on training data as in Section 2.1. Then, we can obtain samples from ${p}_{C}({x}^{\prime}\mid x)$ by first sampling $y\sim {p}_{C}(y\mid x)$; next, if $y={y}_{2}$, return x or, otherwise, sample ${x}^{\prime}\sim {p}_{C}({x}^{\prime}\mid x,{y}_{1})$. To sample from the distribution ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$, we model the problem faced by the attacker when he receives an instance x with label ${y}_{1}$. The attacker will maximise his expected utility by transforming instance x into the ${x}^{\prime}$ given by (10). Again, associating utility 0 with the attacker’s worst consequence and 1 with the best one, and modelling our uncertainty about the attacker’s probabilities ${p}_{z=a\left(x\right)}^{A}$ using random expectations ${P}_{z=a\left(x\right)}^{A}$, we would look for random optimal attacks ${X}^{\prime}(x,{y}_{1})$ as in (11). By construction, if we sample ${p}_{z=a\left(x\right)}^{A}\sim {P}_{z=a\left(x\right)}^{A}$ and solve

$${x}^{\prime}=\underset{z}{\mathrm{arg\,min}}\,{p}_{z}^{A},$$

${x}^{\prime}$ is distributed according to ${p}_{C}({x}^{\prime}\mid x,{y}_{1})$, which was the last ingredient required.
With these two sampling procedures in place, we generate samples from ${p}_{C}(X\mid {X}^{\prime}={x}^{\prime})$ with ABC techniques. This entails generating $x\sim {p}_{C}\left(X\right)$ and ${\tilde{x}}^{\prime}\sim {p}_{C}({X}^{\prime}\mid X=x)$, and accepting x if $\varphi ({\tilde{x}}^{\prime},{x}^{\prime})<\delta $, where $\varphi $ is a distance function defined in the space of features and $\delta $ is a tolerance parameter. The x generated in this way is approximately distributed according to the desired ${p}_{C}(x\mid {x}^{\prime})$. However, the probability of generating samples for which $\varphi ({\tilde{x}}^{\prime},{x}^{\prime})<\delta $ decreases with the dimension of ${x}^{\prime}$. One possible solution replaces ${x}^{\prime}$ by $s\left({x}^{\prime}\right)$, a set of summary statistics that capture the relevant information in ${x}^{\prime}$; then, the acceptance criterion would be replaced by $\varphi (s\left({\tilde{x}}^{\prime}\right),s\left({x}^{\prime}\right))<\delta $. The choice of summary statistics is problem specific. We sketch the complete ABC sampling procedure in Algorithm 3, to be integrated within Algorithm 2 in its sample-drawing step.
Algorithm 3 ABC scheme to sample from ${p}_{C}(x\mid {x}^{\prime})$ within Algorithm 2

Input: Observed instance ${x}^{\prime}$, data model ${p}_{C}\left(x\right)$, classifier ${p}_{C}(y\mid x)$, random probabilities ${P}_{z}^{A}$, family of summary statistics s.
Output: A sample approximately distributed according to ${p}_{C}(x\mid {x}^{\prime})$.
while $\varphi (s\left({x}^{\prime}\right),s\left({\tilde{x}}^{\prime}\right))>\delta $ do
    Sample $x\sim {p}_{C}\left(x\right)$.
    Sample $y\sim {p}_{C}(y\mid x)$.
    if $y={y}_{2}$ then
        ${\tilde{x}}^{\prime}=x$
    else
        Sample ${p}_{z}^{A}\sim {P}_{z}^{A}$.
        Compute ${\tilde{x}}^{\prime}=\mathrm{arg}\,{\mathrm{min}}_{z}\,{p}_{z}^{A}$.
    end if
    Compute $\varphi (s\left({x}^{\prime}\right),s\left({\tilde{x}}^{\prime}\right))$.
end while
Output x.
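Algorithm 3 translates into a short rejection loop. The sketch below assumes hypothetical helpers: `sample_x` draws from ${p}_{C}(x)$, `p_y1_given_x` returns ${p}_{C}({y}_{1}\mid x)$, `attack` returns the attacker’s random optimal transformation of a malicious instance (internally drawing ${p}_{z}^{A}\sim {P}_{z}^{A}$ and taking the argmin), and `phi` is the distance; summary statistics are omitted for brevity.

```python
import random

def abc_sample(x_obs, sample_x, p_y1_given_x, attack, phi, delta, rng=None):
    """One draw approximately from p_C(x | x') by ABC rejection sampling,
    following Algorithm 3.  Innocent instances pass through unattacked."""
    rng = rng or random.Random()
    while True:
        x = sample_x(rng)                               # x ~ p_C(x)
        malicious = rng.random() < p_y1_given_x(x)      # y ~ p_C(y | x)
        x_tilde = attack(x, rng) if malicious else x    # forward simulation
        if phi(x_tilde, x_obs) < delta:                 # accept if close
            return x
```

As noted above, tight tolerances $\delta$ and high-dimensional ${x}^{\prime}$ make acceptance rare, which is precisely what the summary-statistics variant mitigates.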
