Minimum Adversarial Examples

Deep neural networks used in information security face a severe threat from adversarial examples (AEs). Existing AE generation methods use two optimization models: (1) taking the successful attack as the objective function and limiting the perturbations as the constraint; (2) taking the minimum of the adversarial perturbations as the objective and the successful attack as the constraint. Both involve two fundamental problems of AEs: what the minimum boundary for constructing AEs is, and whether that boundary is reachable, i.e., whether AEs that successfully attack the model exist exactly at that boundary. Previous optimization models give no complete answer to these problems. Therefore, in this paper, for the first problem, we propose the definition of minimum AEs and give a theoretical lower bound on the amplitude of the minimum AEs. For the second problem, we prove that generating the minimum AEs is an NP-complete (NPC) problem, and, based on this computational inaccessibility, we establish a new, third optimization model. This model is general and can adapt to any constraint. To verify the model, we devise two concrete methods for generating controllable AEs under the two widely used distance metrics for adversarial perturbations, namely the L_p constraint and the SSIM (structural similarity) constraint. The model limits the amplitude of the AEs, reduces the search cost of the solution space, and thereby improves efficiency. In theory, the AEs generated by the new model are closer to the actual minimum adversarial boundary; they overcome the blindness with which existing methods set the adversarial amplitude and further improve the attack success rate. In addition, the model can generate accurate AEs with controllable amplitude under different constraints, which suits different application scenarios.
Extensive experiments demonstrate a better attack ability than other baseline attacks under the same constraints. For all the datasets tested in the experiments, the attack success rate of our method improves on the baseline methods by approximately 10%.


Introduction
With the wide deployment of systems based on DNNs, their security has become a focus of concern. Researchers have found that adding subtle perturbations to the input of a deep neural network causes the model to give a wrong output with high confidence; such deliberately constructed inputs are called adversarial examples (AEs), and attacking DNNs with AEs is called an adversarial attack. These low-cost adversarial attacks can severely damage applications based on DNNs. Adding adversarial patches to traffic signs can cause auto-driving system errors [1]. Adding adversarial logos to the surface of goods can impede automatic check-out in automated retail [2]. Generated adversarial master prints can defeat deep fingerprint identification models [3]. In each of these scenarios, AEs can cause great inconvenience and harm people's lives. AEs have therefore become an urgent issue in AI security.
In research on generating AEs, two fundamental problems exist: (1) What is the minimum boundary of the amplitude of adversarial perturbations? All generation methods try to produce AEs with smaller perturbations; their objective is to add as little perturbation to the clean example as is necessary to achieve the attack. (2) Is that minimum boundary reachable? Reachability asks whether examples whose perturbation amplitude equals the minimum bound can still attack successfully, i.e., whether AEs exist exactly at that boundary.
To answer these two problems, traditional AE generation can be divided into two main optimization models.

(1) Taking the successful attack as the objective function and the limitation of the perturbations as the constraint, where the limitation is usually an upper bound v, as shown in Equation (1). For a neural network F, an input distribution ℵ ⊂ R^n, a point X_0 ∈ ℵ and X ∈ R^n, X is an adversarial example of X_0 under the constraint v, where D is the distance metric:

find X s.t. F(X) ≠ F(X_0), D(X, X_0) ≤ v.  (1)

(2) Taking the minimum of the adversarial perturbations as the objective and the successful attack as the constraint:

min_X D(X, X_0) s.t. F(X) ≠ F(X_0).  (2)

However, neither model solves the two problems well. (1) In the first model, whether a solution exists depends on the limit value v: when v is too small, the model may have no solution, while when v is larger, the constraint on the AEs is too relaxed and the gap between the solution and the minimum AE grows. (2) In the second model, with the perturbation in the objective function, the perturbation decreases throughout the optimization until it lands in a local optimum of the objective; the model easily falls into local optima, so the solution is not the minimum adversarial example. Moreover, as this paper proves, finding the minimum AE is an NPC problem, so this model cannot find the true minimum AE either. Therefore, in this paper we focus on answering the problems mentioned above. For the first problem, we propose the concept of minimum AEs and give a theoretical lower bound on the amplitude of the minimum adversarial perturbations. For the second problem, we prove that generating the minimum adversarial example is an NPC problem, which means that the minimum boundary of the adversarial amplitude is computationally unreachable.
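The tension between the two classical models can be seen concretely on a linear classifier, where the minimum L_2 perturbation has a closed form. The sketch below is our own toy illustration (not a method from this paper): model (2)'s optimum is the projection distance to the decision hyperplane, and model (1) with a fixed budget v is feasible only when v reaches that distance, which is exactly the "blind choice of v" problem described above.

```python
import numpy as np

# Toy linear classifier f(x) = sign(w.x + b); weights chosen for illustration.
w, b = np.array([3.0, 4.0]), -1.0
x0 = np.array([2.0, 1.0])            # clean point: w.x0 + b = 9 > 0

# Model (2): min D(X, X0) s.t. F(X) != F(X0).
# For a linear boundary the optimum is the projection onto the hyperplane.
margin = (w @ x0 + b) / np.linalg.norm(w)     # exact minimum L2 perturbation
x_min = x0 - margin * w / np.linalg.norm(w)   # point on the decision boundary

# Model (1): fix a budget v and ask whether any X within the v-ball flips the
# label. Feasible iff v >= margin, so a blindly chosen v may yield no solution.
def model1_feasible(v):
    return v >= margin

print(margin)                  # 1.8
print(model1_feasible(1.0))    # False: budget too small, model has no solution
print(model1_feasible(2.0))    # True: budget large enough, but looser than 1.8
```

For deep networks no such closed form exists, which is why the paper instead works with a certified lower bound ε_NNS plus a controllable increment.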
Therefore, we generate a controllable approximation of the minimum AEs. We use the certified lower bound of the minimum adversarial distortion to constrain the adversarial perturbations and transform the traditional optimization problem into a new, third model: (3) taking the successful attack as the objective while fixing the adversarial perturbation to the lower bound of the minimum adversarial distortion plus a controllable increment, as shown in Equation (3), where ε_NNS is the lower bound of the minimum adversarial distortion and δ_ε is a controllable constant:

find X s.t. F(X) ≠ F(X_0), D(X, X_0) = ε_NNS + δ_ε.  (3)

This model has two advantages over the existing methods. (1) A better attack success rate under the same amplitude of adversarial perturbations: starting from the theoretical lower bound of the minimum perturbation amplitude and controlling the increment above it, the generated AEs overcome the blindness of existing methods and achieve a higher attack success rate. (2) A more precisely controlled perturbation amplitude under different constraints: the amplitude of the adversarial perturbations affects the visual quality of the AEs, and different application scenarios impose different visual-quality requirements, some strict and some relaxed. Two common scenarios are: (a) collaborative evaluation by humans and machines, where AEs must deceive both human oracles and DNN-based classifiers; for example, in auto-driving, if adversarial patches draw human attention too easily, the adversarial signs will be removed and lose their effect; (b) evaluation by machines alone, where only DNN-based classifiers and models need to be bypassed; massive electronic data filtering, for instance, involves humans with low probability.
When filtering and testing harmful data involving violence and terrorism, the process may depend heavily on machines, so the requirements on visual quality are lower. Therefore, to adapt to these two entirely different scenarios, we need to be able to generate AEs controllably.
Meanwhile, generating controllable AEs also brings additional benefits, from two viewpoints with different implications. (1) Attackers can adaptively and dynamically adjust the amplitude of the perturbations. As described above, defense technologies against adversarial attacks are mainly detection methods. From the attacker's point of view, when the target is a combined network or system with detectors in front of the target classifier, as Figure 1 shows, the attacker will want to estimate the probability of successfully attacking the combined network before launching the attack. For example, if the attacker knows in advance, from prior knowledge, the probability that AEs with a fixed perturbation bypass the detector, then they can purposefully generate AEs with larger perturbations, or with smaller perturbations and better visual quality to human eyes. (2) Defenders can actively defend against attacks with the help of controllable AEs. From the defender's point of view, controllable AEs help evaluate a defense's resistance to AEs of different modification amplitudes. By feeding the model AEs with fixed perturbation amplitudes, defenders can evaluate their anti-attack capability from the outputs on these unclean examples and then decide whether to add defense strategies that emphasize the current setting. Continuing the example above, if the defender has prior knowledge of the attacker's average perturbation amplitude, they can judge whether additional defensive measures are necessary. In this paper, we first give the definitions of minimum adversarial perturbations and minimum AEs and the theorem that generating minimum AEs is an NPC problem, and then propose a new model for generating adversarial examples. Furthermore, we give two algorithms for generating an approximation of the minimum AEs under the L_p and SSIM constraints.
We perform experiments on widely used datasets and models. For all the datasets tested in the experiments, the attack success rate of our method improves on the other baseline methods by approximately 10%.
Our contributions are as follows: • We first prove that generating minimum AEs is an NPC problem. We then analyze the existence of AEs with the help of the definition of the lower bound of the minimum adversarial perturbations, and, based on this analysis, propose a general framework for generating an approximation of the minimum AEs. • We propose methods for generating AEs with a controllable amplitude under the L_2 and SSIM constraints, and further improve the visual quality in the case of larger perturbations.

• The experiments demonstrate that our method achieves a better attack success rate than other widely used baseline methods under the same constraint, and also controls the perturbation amplitude more precisely under different constraints.
The rest of this paper is organized as follows. In Section 2, we briefly review the related work. In Section 3, we describe the basic definition, theorem and model of our algorithm in detail and prove the theorem. In Section 4, we give the transformed model of the basic model under two constraints and provide the efficient solution algorithm of the two models, respectively, in the two subsections. In Section 6, we present our experimental results and compare them with other baseline methods. Finally, we conclude our paper in Section 7.

Adversarial Attack
There are two main pursuits in generating AEs: smaller perturbations and a successful attack. Previous works transform these two pursuits into two main optimization models. One takes the successful attack as the objective function and the limitation of the perturbations as the constraint; such works include L-BFGS [4], C&W [5], DF [6] and HCA [7]. The other takes the minimum of the perturbations as the objective and the successful attack as the constraint; such works include UAP [8], BPDA [9] and SA [10]. Other works, including FGSM [11], JSMA [12], BIM [13] and PGD [14], do not directly use an optimization model. Instead, these methods convert the successful attack into a loss function, move along the direction of decreasing or increasing loss to find the AEs, and constrain the perturbation with a fixed value at each step. From a method-based point of view, they can be classified under the second optimization model.
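The loss-gradient family named above (FGSM and its iterative variants) can be sketched in a few lines. The following is our own toy reproduction of the FGSM idea on a logistic-regression "network" with made-up weights, not an implementation from any of the cited works: one signed gradient step of size eps on the input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy one-layer "network": p(label 1 | x) = sigmoid(w.x + b).
w, b = np.array([1.0, -2.0]), 0.5
x0, y = np.array([2.0, 0.5]), 1.0      # clean input, true label 1

# Gradient of the cross-entropy loss w.r.t. the *input* is (p - y) * w.
p = sigmoid(w @ x0 + b)
grad_x = (p - y) * w

eps = 0.6
x_adv = x0 + eps * np.sign(grad_x)     # FGSM: one signed step up the loss

print(sigmoid(w @ x0 + b) > 0.5)       # True: clean input classified as 1
print(sigmoid(w @ x_adv + b) > 0.5)    # False: the step flipped the prediction
```

The per-step budget eps here plays the role of the fixed constraint value mentioned above: it is chosen blindly, which is exactly the limitation the paper's third model addresses.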
However, these works cannot truly find the minimum AEs with the minimum amplitude of adversarial perturbations. For the first model, there may be no solution when the limit value is set too small. For the second model, it is easy to fall into local optima.
Meanwhile, considering the constraint function of the adversarial perturbations, works on adversarial example generation can be divided into two main classes. One is AE generation under the widely used L_p constraint, including the L_0 constraint [14,15], the L_2 constraint [14] and the L_∞ constraint [11,13,14]. In addition to the L_p constraint, other constraints appear in previous studies. In [16], the authors argued that the commonly used L_p constraint fails to completely capture the perceptual quality of AEs in image classification, and used the structural similarity index SSIM [17] to replace it. Two other works [18,19] also used perceptual distance measures to generate AEs: [18] used SSIM, while [19] used perceptual color distance for the same purpose.
However, the constraints in those works are not strict. For AE generation under the L_p constraint, it is hard to control the amplitude of the perturbations, and the generated AEs deviate from the minimum. The other constraints cannot strictly control perceptual visual quality either, whether measured by SSIM value or perceptual color distance. Therefore, in this paper, we search for the minimum AEs with the minimum amplitude of perturbations, prove that generating them is an NPC problem, and transform that problem into a new optimization model that generates a controllable approximation of the minimum AEs. We generate AEs with a controllable amplitude of adversarial perturbations under the L_p constraint and the SSIM constraint, respectively.

Certified Robustness
Research on certified robustness focuses on searching for the lower and upper bounds of the robustness of neural networks. The lower bound of robustness is a radius within which no AEs exist: adding adversarial perturbations less than or equal to that boundary cannot produce an AE. The upper bound of robustness is a radius beyond which AEs can always be found: adding perturbations larger than or equal to that bound always yields AEs. CLEVER [20] and CLEVER++ [21] were the first robustness evaluation scores for neural networks; they use extreme value theory to estimate the Lipschitz constant from samples. However, that estimation requires many samples to be accurate, so these two methods only estimate the lower bound of the robustness of neural networks and cannot provide certification. Subsequently, Fast-Lin and Fast-Lip [22], CROWN [23] and CNN-Cert [24] are methods that certify the robustness of neural networks. Fast-Lin and Fast-Lip [22] apply only to networks with the ReLU activation function; CROWN [23] extends to networks with general activation functions; and CNN-Cert [24] further handles general convolutional neural networks (CNNs). The basic idea is to construct linear functions bounding each layer, use the upper and lower bounds of those functions as the upper and lower bounds of the layer's output, and thereby constrain the whole network layer by layer in an iterative process. However, these algorithms do not indicate how to compute AEs from the calculated lower bound, and the reachability of AEs at that lower bound remains an open problem. Therefore, in this paper, we calculate an approximation of the minimum AEs based on the lower bound.
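The layer-by-layer bounding idea can be illustrated with a much cruder relative of Fast-Lin/CROWN: plain interval bound propagation. The sketch below is our own simplification with made-up weights, not the cited algorithms; it pushes an L_∞ ball through a tiny ReLU network and returns certified output bounds valid for every input in the ball.

```python
import numpy as np

def interval_affine(l, u, W, b):
    # Split W into positive and negative parts to propagate a box exactly
    # through an affine layer: each output bound picks the worst-case end.
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    return Wp @ l + Wn @ u + b, Wp @ u + Wn @ l + b

# Tiny two-layer ReLU network (illustrative weights).
W1, b1 = np.array([[1.0, -1.0], [2.0, 1.0]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.zeros(1)

x0, eps = np.array([1.0, 0.0]), 0.1
l, u = x0 - eps, x0 + eps                  # the L_inf ball around x0

l, u = interval_affine(l, u, W1, b1)       # first affine layer
l, u = np.maximum(l, 0), np.maximum(u, 0)  # ReLU is monotone: apply to both ends
l, u = interval_affine(l, u, W2, b2)       # output layer

print(l, u)   # certified output interval, roughly [2.5] [3.5]
```

Linear-relaxation methods like CROWN tighten these intervals considerably, but the certification logic (bound each layer, compose the bounds) is the same.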

Basic Definition, Theorem and Modeling
Definition 1. (AEs, Adversarial Perturbations). Given a neural network F, a distribution ℵ ⊂ R^n, a distance measurement D : R^n × R^n → R between X and X_0, a point X_0 ∈ ℵ and a point X ∈ R^n, we say that X is an adversarial example of X_0 under constraint ε_0 if F(X) ≠ F(X_0) and D(X, X_0) = ε_0.

Definition 2. (Minimum AEs, Minimum Adversarial Perturbations). Given a neural network F, a distribution ℵ ⊂ R^n, a distance measurement D : R^n × R^n → R between X and X_0, and a point X_0 ∈ ℵ, we say that X* ∈ R^n is a minimum adversarial example of X_0 if X* is an adversarial example of X_0 under constraint ε* and ε* = min ε_0 such that an adversarial example of X_0 under constraint ε_0 exists. ε* is the minimum adversarial perturbation of X_0 under the D constraint.

Theorem 1. Given a neural network F, a distribution ℵ ⊂ R^n, a distance measurement D : R^n × R^n → R between X and X_0 and a point X_0 ∈ ℵ, searching for a minimum adversarial example of X_0 is an NPC problem.
Proof. The proof of Theorem 1 is shown in Appendix A.
Although it is an NPC problem, researchers have calculated non-trivial lower bounds of the robustness of neural networks [23][24][25]. We can thus calculate the non-trivial lower bound ε_NNS of the minimum adversarial perturbation of X_0 based on the exact meaning of the two bounds.
We thus model the problem of calculating the non-trivial lower bound ε_NNS of the minimum adversarial perturbation of X_0. For an input distribution ℵ ⊂ R^n, a clean input X_0, a perturbed input X of X_0 under the ε constraint, X ∈ B(X_0, ε), B = {X : D(X, X_0) ≤ ε}, a neural network F : R^n → R^k, the original label y of X_0 with F(X_0) = y, and a target label y* with y* ≠ y, we define the non-trivial lower bound ε_NNS of the minimum adversarial perturbation of X_0 as shown in Equation (4):

ε_NNS = min_{y* ≠ y} ε*_{y*}  (4)

and:

ε*_{y*} = max{ε : γ^U_y(X) − γ^L_{y*}(X) > 0, ∀X ∈ B(X_0, ε)}  (5)

In Equation (4), ε*_{y*} is the minimum adversarial perturbation of X_0 under the target label y*. In Equation (5), ε is a perturbation radius of X_0 within which no X satisfies F(X) = y*; γ^U_y(X) is the upper bound of the network output under the label y for the input X, and γ^L_{y*}(X) is the lower bound of the network output under the other label y*. These bounds are calculated as in [23][24][25].
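Equation (4) reduces to taking the smallest certified per-target bound. A toy numeric sketch (the per-target values below are made up for illustration, not computed from a real network):

```python
# eps*_{y*} for each target label y* != y, e.g. as returned by a certifier.
eps_per_target = {1: 0.32, 2: 0.18, 3: 0.45}   # hypothetical values

# Equation (4): the overall lower bound is the minimum over target labels.
eps_nns = min(eps_per_target.values())
weakest = min(eps_per_target, key=eps_per_target.get)

print(eps_nns)    # 0.18: no AE of any target exists within this radius
print(weakest)    # 2: the easiest target class to reach
```

The minimizing label also identifies the most vulnerable attack direction, which is useful when a targeted attack is intended.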
Theorem 2. Given a neural network F, a distribution ℵ ⊂ R n , a distance measurement D : R n × R n → R between X and X 0 , a point X 0 ∈ R n , the non-trivial lower bounds ε NNS ∈ R of the minimum adversarial perturbations of X 0 , if X is the perturbed example of X 0 under constraint ε NNS and X ∈ B(X 0 , ε NNS ), then F(X) ≡ F(X 0 ).
Proof. According to the definition and meaning of the ε NNS , we can obtain Theorem 2.

Definition 3. (N-order tensor [26]). In deep learning, a tensor extends a vector or matrix to a higher-dimensional space and can be defined by a multi-dimensional array. The dimension of a tensor is also called its order: an N-dimensional tensor is also known as an N-order tensor. For example, when N = 0, the tensor is a 0-order tensor, i.e., a single number; when N = 1, it is a 1-order tensor, i.e., a 1-dimensional array; when N = 2, it is a 2-order tensor, i.e., a matrix.

Definition 4. (Hadamard product [26]). The Hadamard product is the element-wise matrix product. Given N-order tensors A, B ∈ R^{I_1×I_2×...×I_N}, the Hadamard product A × B is the product of the elements at the same positions of the tensors. The product C is a tensor with the same order and size as A and B. That is:

C_{i_1,i_2,...,i_N} = A_{i_1,i_2,...,i_N} · B_{i_1,i_2,...,i_N}.
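In NumPy terms, the Hadamard product of Definition 4 is plain element-wise multiplication, the `*` operator on same-shaped arrays:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # a 2-order tensor (matrix)
B = np.array([[5.0, 6.0], [7.0, 8.0]])

C = A * B                                 # C[i, j] = A[i, j] * B[i, j]

print(C)                    # [[ 5. 12.] [21. 32.]]
print(C.shape == A.shape)   # True: same order and size as A and B
```

Note the contrast with `A @ B`, the ordinary matrix product, which contracts an index instead of operating position-wise.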
Definition 5. (+*). For a real number λ ∈ R and an N-order tensor X ∈ R^{I_1×I_2×...×I_N}, we define λ +* X as the sum of X and the Hadamard product of λ with another tensor Ψ ∈ R^{I_1×I_2×...×I_N}. That is:

λ +* X = X + Ψ × λ,

where Ψ and X have the same order. Specifically, in the field of AEs, given a clean input X_0 ∈ R^N and a perturbation r ∈ R, the adversarial example is X = X_0 +* r = X_0 + Ψ × r. The physical meaning of Ψ is the proportionality factor of r that is added to each feature X_{0,i_1,i_2,...,i_n}.
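The "+*" operation of Definition 5 is a one-liner in NumPy. The values of Ψ below are made up for illustration; Ψ is exactly the per-feature proportionality factor described in the definition:

```python
import numpy as np

x0 = np.array([[0.2, 0.8], [0.5, 0.1]])     # clean 2-order input X0
psi = np.array([[1.0, 0.0], [0.5, 0.25]])   # Psi: share of r per feature (toy)
r = 0.4                                      # scalar perturbation amplitude

x_adv = x0 + psi * r     # X = X0 +* r = X0 + Psi x r

print(x_adv)             # ≈ [[0.6 0.8] [0.7 0.2]]
```

Features where Ψ is 0 are left untouched, so Ψ simultaneously encodes where perturbation is added and how much of r each location receives.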
Definition 6. (ε ∼ τ approximation of minimum AEs, ε ∼ τ approximation of minimum adversarial perturbations). Given a neural network F, a distribution ℵ ⊂ R^n, a point X_0 ∈ ℵ, the non-trivial lower bound ε_NNS ∈ R of the minimum adversarial perturbation of X_0, and a constraint ε*_τ = ε_NNS + τ, where τ > 0 is a constant, we say that X*_τ is the ε ∼ τ approximation of the minimum AE of X_0 and that ε*_τ is the ε ∼ τ approximation of the minimum adversarial perturbation. Here τ is a constant set by humans according to the actual situation: generating an adversarial example for a specific input imposes different requirements on the adversarial perturbations in different settings, scenarios and samples.
(a) The more demanding the scenario is, the smaller the constant τ should be. In the extreme scenario of digital AE generation, the AEs face careful filtering and a strict invisibility requirement, so τ should be small [15]. However, most physical AE generation has a relaxed invisibility requirement and only needs to keep semantic consistency, so τ can be set considerably larger than in the digital setting [27].
(b) The simpler the sample is, the smaller the constant τ should be. When a sample is simple, its information is sparse, and people are more sensitive to perturbations on it than on complex samples; it is easier to recognize the difference between clean and perturbed inputs. For example, τ for the MNIST dataset [28] should be smaller than for the CIFAR-10 dataset [29].
We model the problem of generating AEs under the distance metric D as follows. For a deep neural network F, an input distribution ℵ ⊂ R^n, a point X_0 ∈ ℵ and a given distance value d under the metric D, the problem of generating controllable AEs of amplitude d can be modeled as:

find X s.t. F(X) ≠ F(X_0), D(X, X_0) = d.  (8)

We discuss the problem under two settings: the constraint of the L_p norm, and a perceptually constrained metric D. We use the widely used structural similarity (SSIM) as the perceptual constraint in perceptually constrained AE generation. The two constraints are discussed in the following sections, respectively.

Analysis of the Existence of AEs
According to Theorem 2, we reach the following conclusions concerning the existence of AEs, as shown in Figure 2.
As Figure 2a shows, we have the following analysis. When adding adversarial perturbations lower than ε NNS , no AEs of X 0 exist.
When adding adversarial perturbations larger than ε_NNS, AEs of X_0 exist. The gray shadow between the red circle and the blue line is the space where AEs exist. However, whether AEs can be found depends on the direction in which the ε perturbation is added. As the figure shows, the perturbations of X_A and X_B both equal ε, and both points lie on the boundary of the ε-ball; however, X_A is inside the gray shadow while X_B is not.
Therefore, some conclusions that were previously well-known hypotheses can now be proven. Different AE generation methods generate AEs with varying accuracy: for a clean input X_0, when adding perturbations of the same size, method A acquires the adversarial input X_A, while method B obtains X_B, which lies inside the blue line and is still correctly classified by the network. Hence, the key to generating AEs is finding a direction in which AEs exist. As shown in Figure 2a, along the path of X, the added perturbations are smallest.
Meanwhile, different clean samples require different perturbations to generate AEs. When a specific perturbation ε > ε_NNS is fixed, different clean samples yield different perturbed examples after adding it. As shown in Figure 2b, the blue boundary and the yellow boundary are the classification boundaries of two different samples, respectively. The perturbed examples obtained by adding the same perturbation ε fall outside the blue border but within the yellow boundary; hence they are AEs for the sample constrained by the blue boundary, yet are still correctly classified under the yellow boundary. Thus, adversarial examples need to be studied per sample. From this analysis of the existence of AEs, we draw the following conclusions. To generate practical AEs, the added perturbation must satisfy ε > ε_NNS and must cross the classification boundary in its direction. At the same time, due to the invisibility requirement on AEs, it should be as small as possible, so the generation direction should be as close as possible to the direction of the minimum AE.
According to Theorem 1, searching for the minimum AEs of a sample X_0 is an NPC problem. In this paper, we instead generate the minimum AEs under an ε ∼ τ numerical approximation, as in Definition 6.
According to Figure 2a, to generate an effective adversarial example, the perturbation should be larger than the lower bound ε_NNS and must cross the boundary of the classifier. Fixing ε*_τ defines a ball with center X_0 and radius ε*_τ. As shown in Figure 2a, not all points on the ball are AEs. Using the plain + operation amounts to selecting a random direction, which generates perturbed examples that are highly unlikely to be adversarial. Therefore, it is necessary to calculate the direction of adding ε*_τ and make F(X_0 +* ε*_τ) ≠ F(X_0); Ψ is the direction tensor of effective AEs.

Model of L p Constraint
We model the problem of generating the ε ∼ τ approximation of the minimum AEs. For a neural network F, an input distribution ℵ ⊂ R^n, a point X_0 ∈ ℵ and a given ε ∼ τ approximation ε*_τ of the minimum adversarial perturbation, the problem of generating the ε ∼ τ approximation X*_τ of the minimum adversarial example can be modeled as:

find X*_τ s.t. F(X*_τ) ≠ F(X_0), D(X*_τ, X_0) = ε*_τ.  (9)

According to the analysis of the existence of AEs and Theorem 2, when the added adversarial perturbation satisfies ε > ε_NNS, AEs certainly exist and the model must have a solution.

Framework of AE Generation under L p Constraint
According to Definition 6, we transform the problem of calculating the ε ∼ τ approximation X*_τ of the minimum adversarial example into searching for the direction tensor Ψ. For a neural network F, an input distribution ℵ ⊂ R^n, a point X_0 ∈ ℵ and the given ε ∼ τ approximation ε*_τ of the minimum adversarial perturbation, by Definition 5 the ε ∼ τ approximation of the minimum adversarial example is X*_τ = X_0 +* ε*_τ = X_0 + Ψ × ε*_τ, which must satisfy Equation (10). This model must have solutions, and we can exhibit a special one: set one element of Ψ ∈ R^{I_1×I_2×...×I_N} to 1 and the others to 0, fulfilling Equation (10). When the clean input is an image, this means modifying one channel of one pixel, as proposed in [15]. However, this attack only has a 20.61% success rate on VGG-16 [30] on CIFAR-10 [29], and the perturbation of that single pixel is too large to be set as τ.
It is difficult to calculate Ψ directly; thus, to solve Equation (10), we decompose ε*_τ × Ψ into two tensors δ × Λ, with δ, Λ ∈ R^{I_1×I_2×...×I_N}, whose elements are δ_{i_1,i_2,...,i_n} and Λ_{i_1,i_2,...,i_n}, respectively. The N-order tensor δ determines the locations of the added perturbations and their importance for the target label, while the N-order tensor Λ determines the size of the added perturbation at each location, that is, its percentage of the total perturbation. Substituting into Equation (10), we obtain X*_τ = X_0 + δ × Λ, and therefore Equation (13):

D(X_0 + δ × Λ, X_0) = ε*_τ.  (13)

However, in Equation (13) both tensors are unknown and each has n elements, so it is a multivariate n-order equation and still unsolvable in general. Nevertheless, we can exhibit a trivial solution: we can certify that the choice in Equation (14) satisfies Equation (13). The proof is shown as follows.

Proof.
We only need to find one solution of model (9); that is, we only need to generate one ε ∼ τ approximation of a minimum adversarial example satisfying the requirements of (9). The trivial solution in Equation (14) is therefore a valid result.
Therefore, the problem of generating the ε ∼ τ approximation of the minimum AEs is transformed by Definition 6 into generating the tensor Ψ, then by Equation (13) into calculating the two tensors δ and Λ, and finally into calculating the tensor δ alone. However, this remains an unsolvable problem: in the real world δ is an n-order tensor, so n elements remain unknown. According to Equation (13), once δ is known, the multivariate n-order equation becomes a multivariate 1-order equation; solving it would require n equations, but we only have one, namely Equation (13). Therefore, this paper proposes a solution framework for generating the ε ∼ τ approximation of the minimum AEs together with a heuristic method to solve the problem.

Method of Generating Controllable AEs under L p Constraint
According to the definition of AEs, we decompose the tensor δ as δ = Φ + αξ, with Φ, ξ ∈ R^{I_1×I_2×...×I_N}, whose elements are Φ_{i_1,i_2,...,i_n} and ξ_{i_1,i_2,...,i_n}, respectively.

Because the N-order tensor δ determines the positions of the added perturbations and the importance of each position to the target label, it contains two factors that restrict the values of the AEs. One is the invisibility of the AEs: the added perturbations should be insensitive to human eyes. The other is the effectiveness of the AEs: the added perturbations should push the sample away from the original classification boundary (in a non-targeted attack) or towards the target classification boundary (in a targeted attack). Balancing these two factors is a key problem in the study of AEs. We therefore decompose δ into Φ and ξ, where Φ is the tensor determining the effectiveness of the AEs and ξ is the tensor determining their invisibility. According to Equation (13), the perturbation added to each element is (Φ_{i_1,...,i_n} + α ξ_{i_1,...,i_n}) Λ_{i_1,...,i_n}. According to the above analysis, we transform the generation of the ε ∼ τ approximation of the minimum AEs into calculating Φ and ξ.

Calculating Φ
According to the analysis of Equation (5), when γ^U_y(X) is lower than γ^L_{y*}(X), the input X is an adversarial example: the upper bound of the network output under the original label of X is below the lower bound under the other label. Therefore, we let:

h̄(X) = γ^U_y(X) − γ^L_{y*}(X).

In the initial update steps, the perturbed examples are not yet in the shadow space, so they are still correctly recognized by the model and h̄(X) > 0. At this point, we need to bring the examples closer to the boundary by reducing h̄(X), so the update direction is opposite to the gradient. When the value of h̄(X) drops below 0, its absolute value needs to become larger, but h̄(X) itself still needs to decrease, so the update direction remains opposite to the gradient.
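The update rule above can be sketched on a toy two-logit linear network (our own illustration, not the paper's full algorithm): here a plain margin h(X) = logit_y(X) − logit_y*(X) stands in for γ^U_y(X) − γ^L_{y*}(X), and every step moves against its gradient until the target label wins.

```python
import numpy as np

W = np.array([[1.0, 0.0], [0.0, 1.0]])   # logits = W @ x (toy network)
x = np.array([1.0, 0.0])                 # original label y = 0, target y* = 1

def h(x):
    # Margin of the original label over the target label.
    return (W @ x)[0] - (W @ x)[1]

grad_h = W[0] - W[1]                     # gradient of h: constant for a linear net
step = 0.25

steps = 0
while h(x) >= 0:                         # not yet adversarial
    x = x - step * grad_h                # always step opposite the gradient of h
    steps += 1

print(steps)    # 3: the margin of 1.0 shrinks by 0.5 per step, then crosses 0
print(h(x))     # negative: the target label now dominates
```

With certified bounds in place of raw logits, driving h̄(X) negative guarantees, rather than merely observes, that the example is adversarial.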

Calculating ξ
According to the definition of ξ, ξ is the tensor that determines the invisibility of the AEs. The DCT transformation [31] maps data from the spatial domain to the frequency domain, where data in the time or space domain become easy to analyze and process. For image data, after the transformation, most of the crucial visual information of the image is concentrated in a small part of the DCT coefficients. The high-frequency signal corresponds to the non-smooth regions of the image, while the low-frequency signal corresponds to the smoother regions.
According to the human visual system (HVS) [17]: (1) human eyes are more sensitive to noise in the smooth areas of an image than to noise in non-smooth or textured areas; (2) human eyes are more sensitive to the edge information of an image, and this information is easily affected by external noise.
Therefore, according to the definition of the DCT, we can distinguish the features of each region of the image and selectively add perturbations. The N-order input tensor X_0 ∈ R^{I_1×I_2×...×I_N} can be seen as a superposition of (I_1 × I_2 × ... × I_N)/(I_i × I_j) two-order tensors X_0^Π ∈ R^{I_i×I_j}, each transformed as

F(k, l) = c(k) c(l) Σ_{m=0}^{I_i−1} Σ_{n=0}^{I_j−1} X_0^Π(m, n) cos[π(2m + 1)k / (2I_i)] cos[π(2n + 1)l / (2I_j)],

where m, k ∈ {0, 1, . . . , I_i − 1}, n, l ∈ {0, 1, . . . , I_j − 1}, c(k) = √(1/I_i) for k = 0 and √(2/I_i) otherwise, and c(l) is defined analogously with I_j.
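The per-block transform follows the standard 2-D DCT-II definition, which can be sketched directly; this is an illustrative reimplementation with a hypothetical name `dct2`, not the paper's code.

```python
import numpy as np

def dct2(X):
    # 2-D DCT-II with orthonormal scaling: X is one I_i x I_j slice of
    # the N-order input tensor, transformed coefficient by coefficient.
    Ii, Ij = X.shape

    def c(k, N):
        # Orthonormal scaling factor of the DCT-II.
        return np.sqrt(1.0 / N) if k == 0 else np.sqrt(2.0 / N)

    F = np.zeros_like(X, dtype=float)
    for k in range(Ii):
        for l in range(Ij):
            s = 0.0
            for m in range(Ii):
                for n in range(Ij):
                    s += (X[m, n]
                          * np.cos(np.pi * (2 * m + 1) * k / (2 * Ii))
                          * np.cos(np.pi * (2 * n + 1) * l / (2 * Ij)))
            F[k, l] = c(k, Ii) * c(l, Ij) * s
    return F
```

With orthonormal scaling, the transform preserves the Frobenius norm, and a constant (perfectly smooth) block has all of its energy in the single low-frequency coefficient F(0, 0), matching the smooth/non-smooth distinction above.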
In this paper, ξ is constructed according to its tensor definition given above. Combining the above, we present the algorithm that generates the ε ∼ τ approximation of minimum AEs under the Lp constraint in Algorithm 1.

AEs Generation under SSIM Constraint
We model the problem of generating AEs under the SSIM [17] metric as follows, using SSIM to replace the distance metric D in Equation (8). For a neural network F, an input distribution ℵ ⊂ R^n, and a point X_0 ∈ ℵ, the problem of generating controllable AEs under SSIM can be modeled accordingly. According to the definition of the similarity measurement SSIM, for gray-scale images x, y ∈ R^n,

SSIM(x, y) = l(x, y)^ς · c(x, y)^θ · s(x, y)^ι,

where l(x, y) = (2 μ_x μ_y + C_1)/(μ_x² + μ_y² + C_1) defines the luminance comparison function, c(x, y) = (2 σ_x σ_y + C_2)/(σ_x² + σ_y² + C_2) defines the contrast comparison function, and s(x, y) = (σ_xy + C_3)/(σ_x σ_y + C_3) defines the structure comparison function. Furthermore, μ_x, μ_y are the mean values of the inputs x, y, respectively, σ_x, σ_y are their standard deviations, and σ_xy is the covariance between x and y. C_1, C_2, C_3 > 0 and ς, θ, ι > 0 are constants. According to [17], when setting ς = θ = ι = 1 and C_3 = C_2/2, Equation (24) simplifies to

SSIM(x, y) = (2 μ_x μ_y + C_1)(2 σ_xy + C_2) / ((μ_x² + μ_y² + C_1)(σ_x² + σ_y² + C_2)).

Furthermore, using a Lagrangian relaxation, we formulate Equation (26) with a Lagrangian variable, where t is the one-hot tensor of the target label and loss is the cross-entropy loss function, as shown in Equation (27). Cross-entropy measures the difference between two probability distributions over the same random variable; in machine learning, it expresses the difference between the target probability distribution t and the predicted probability distribution F(X*_τ).
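The simplified SSIM formula can be sketched as a single-window (global) computation over whole images; `ssim` and the constant values C1, C2 are illustrative choices (practical SSIM implementations use local sliding windows and data-range-dependent constants).

```python
import numpy as np

def ssim(x, y, C1=1e-4, C2=9e-4):
    # Global (single-window) SSIM using the simplified form with
    # sigma = theta = iota = 1 and C3 = C2/2. C1 and C2 are small
    # stabilizing constants; the values here are illustrative.
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```

By construction, SSIM(x, x) = 1 and the measure is symmetric in its two arguments, which is why it can serve as a perceptual constraint in place of an L_p distance.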

Experimental Setting
Dataset: In this work, we evaluate our methods on two widely used datasets. MNIST is a handwritten digit recognition dataset (digits 0–9) containing 70,000 gray-scale images: 60,000 for training and 10,000 for testing. CIFAR-10 [32] contains 60,000 images in ten classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck.
Threat model: In our paper, we generate AEs against trained threat models. Due to limited computational resources, we train feed-forward networks with p layers and q neurons per layer, using the ReLU activation function throughout. We denote such a network as p × [q]. For the MNIST dataset, we train a 3 × [1024] network as the threat model. For the CIFAR-10 dataset, we train 6 × [1024], 7 × [1024] and 6 × [2048] networks as threat models.
Baseline attacks: To compare our method with other adversarial attacks, we generate AEs with several attack methods. Our method can adapt to different L_p constraint measurements; in this part, due to limited computational resources, we adopt the L_2 measurement. Therefore, we use other L_2-constrained attack methods as baselines, including SA-L_2 [10], FGSM-L_2 [11], BIM-L_2 [33], PGD-L_2 [14] and DF-L_2 [6]. We compare the performance of these attacks with our method under different ε*_τ constraints.

Results of Attack Ability
We calculate the success rates of the attacks to compare their attack ability. Because the baseline attacks cannot precisely control the magnitude of their perturbations, we first set ε*_τ to 0.4, 0.8 and 1.2 for the MNIST dataset and to 20, 25, 30 and 37 for the CIFAR-10 dataset, and we obtain the average perturbations of the baseline attacks under the L_2 constraint, as shown in Tables 1 and 2. We then use those average perturbations as the ε*_τ of our method under the same constraint and compare the success rates.
The criterion for selecting the benchmark values for each dataset is that each value is adequate for the baseline attack: under that value, the baseline attack neither terminates its attack loop early because the value is excessively large (which would leave the measured average perturbations weakly correlated with the value) nor suffers a low success rate because the value is too small. Specifically, because the baseline attacks cannot control their average perturbations, we search the range (0, 100] with a step of five, testing the attack success rate and average perturbations of each baseline attack at each value. We then remove the points at which either the difference between the average perturbation and the value is too large or the success rate is too low, i.e., the points at which the value overflows or is insufficient. Because the PGD-L_2 and BIM-L_2 attacks yield the same average perturbations, we show their results in one table: Table 3 for MNIST and Table 4 for CIFAR. The comparison of the FGSM-L_2 attack and our method is shown in Table 5 for MNIST and Table 6 for CIFAR. As the four tables show, under the same L_2 (ε*_τ) constraint, our attack performs better than the PGD-L_2(ε*_τ), BIM-L_2(ε*_τ) and FGSM-L_2(ε*_τ) attacks. In addition to the attacks with a fixed ε*_τ, we also compare against attacks without a value constraining the perturbations, namely the SA-L_2 and DF attacks. We likewise calculate the average perturbations of those attacks, use them as the ε*_τ of our method, and make a comparison in Tables 7 and 8. For MNIST, our method performs better than the DF and SA attacks.
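The calibration procedure can be sketched as a filter over candidate benchmark values. `calibrate_eps`, the `attack_fn` interface, and the thresholds `min_success` and `max_gap` are all hypothetical names and values, since the paper does not give exact cutoffs.

```python
def calibrate_eps(attack_fn, candidates=range(5, 101, 5),
                  min_success=0.5, max_gap=10.0):
    # attack_fn(eps) -> (success_rate, avg_perturbation): hypothetical
    # interface standing in for running a baseline attack at budget eps.
    kept = []
    for eps in candidates:
        sr, avg = attack_fn(eps)
        # Drop overflowing points (average perturbation far below the
        # budget) and insufficient points (success rate too low).
        if sr >= min_success and abs(eps - avg) <= max_gap:
            kept.append((eps, sr, avg))
    return kept
```

The surviving values are those whose measured average perturbations remain correlated with the budget while still attacking successfully, which is the selection criterion described above.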
In addition to the above two small datasets, we also evaluate the performance of the algorithm on the larger and more complex Tiny ImageNet dataset, which has 200 classes of 500 images each; we extract 200 images as the experimental data. For this dataset, we select a seven-layer CNN, denoted 'CNN-7layer' [34], as the threat model, and we set ε*_τ to 1.0, 2.0, 4.0 and 6.0. The experiment first measures the average perturbation of the baseline attack under the selected ε*_τ and then sets that average perturbation as the ε*_τ of our algorithm to compare the success rates under the same value. The average perturbations of the baseline attack are shown in Table 9 and the comparison of attack ability in Table 10. As Table 10 shows, our algorithm performs better than the FGSM attack under the same ε*_τ. We also evaluate the attack ability of our algorithm on more complex models. We select Wide-ResNet, ResNeXt and DenseNet as target models and train them on the CIFAR dataset; the details are the same as in [34]. The benchmark values we select are 1.0, 5.0, 10.0, 30.0, 60.0 and 80.0. Similarly, we first calculate the average perturbations of the baseline attack under those values and then compare the success rates of our algorithm and the baseline attack under the same ε*_τ, set to the average perturbations calculated beforehand. Table 11 shows the average perturbations, and Table 12 compares the attack ability. From Table 12, we find that under the benchmark values 5.0, 10.0, 30.0, 60.0 and 80.0, our algorithm performs better than the FGSM attack; however, under 1.0, on Wide-ResNet and ResNeXt, the FGSM attack performs better. Next, we evaluate our method described in Section 5.
Since no prior work is devised for the same purpose as our method, we show its results without comparison to others. We demonstrate the controllability of our method under the SSIM constraint and record its success rate in Table 13; adversarial images under different SSIM constraints are shown in Figure 3. We then discuss the results of our method under the L_2 constraint with varying α. Through different α, we can not only generate controllable AEs but also improve the perceptual visual quality under the same ε*_τ constraint. When ε*_τ under the L_2 constraint is large, the perceptual visual quality is poor; to adapt to this situation, our paper devises α to improve the perceptual visual quality. However, there is a trade-off between the visual quality of the AEs and their success rate. Figure 4 shows the SSIM value of AEs under different ε*_τ with different α. As it shows, the SSIM value increases with increasing α under the same ε*_τ constraint. Furthermore, with increasing ε*_τ, the SSIM value tends to decrease under the same α. This means the visual quality becomes poorer when more perturbation is added to the inputs, which is in line with intuition about AEs. Meanwhile, under the same ε*_τ constraint, the SSIM value rises rapidly before α = 1.0 and flattens afterward. Figure 5 shows the success rate of AEs under different ε*_τ with different α. As it shows, the success rate decreases with increasing α under the same ε*_τ constraint, and increases with increasing ε*_τ under the same α. This is also consistent with the general nature of AEs: when more perturbation is added, the probability of a successful attack is greater.
Furthermore, α = 1.0 still tends to be a boundary: before α = 1.0 the success rate decreases more slowly and afterward more quickly when ε*_τ = 3.00 and ε*_τ = 2.50, whereas the decrease is nearly uniform for ε*_τ = 1.00, 1.50 and 2.00. This means that when the perturbations remain small, excessive attention to visual quality leads to a greater loss of attack success rate; accordingly, consistent with the intended meaning of the parameter, α only needs to be set to a nonzero value when ε*_τ is large. We set α = 1 and compare the results between α = 0 and α = 1 in Table 14. Figure 6 shows the time needed to generate AEs under different ε*_τ with different α. As it shows, the time decreases with increasing α under the same ε*_τ constraint, and increases with increasing ε*_τ under the same α.

Conclusions
Aiming at the two fundamental problems of generating the minimum AEs, we first define the concept of the minimum AEs and prove that generating the minimum AEs is an NPC problem. Based on this conclusion, we establish a new, third kind of optimization model that takes the successful attack as the target and sets the adversarial perturbations equal to the lower bound of the minimum adversarial distortion plus a controllable approximation. This model generates the controllable approximation of the minimum AEs, and we give a heuristic solution method for it. From the theoretical analysis and experimental verification, our model's AEs have a better attack ability, and the model can generate more accurate and controllable AEs that adapt to different environmental settings. However, our heuristic method does not exactly solve the model, which will be the focus of future research.

Definition (3-SAT). Given a set of Boolean variables B = {B_1, B_2, . . . , B_n}, |B| = n, where each variable takes the value 0 or 1, and a set of clauses C = {C_1, C_2, . . . , C_k}, |C| = k, let ψ = C_1 ∧ C_2 ∧ . . . ∧ C_k, where each C_i is a disjunction of three literals, that is, Z_1 ∨ Z_2 ∨ Z_3. Question: given the Boolean variable set B and clause set C, is there a truth assignment that makes ψ true, i.e., makes each clause true?
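The 3-SAT question stated above can be sketched with a brute-force satisfiability check (exponential in the number of variables, for illustration only); the clause encoding and function names are our own.

```python
from itertools import product

def eval_clause(clause, assignment):
    # clause: tuple of literals, each literal a pair (var_index, negated).
    # A clause (a disjunction) is true if any of its literals is true.
    return any(assignment[i] != neg for i, neg in clause)

def satisfiable(clauses, n):
    # Brute-force check over all 2^n assignments (fine for toy n);
    # psi = C_1 AND ... AND C_k is true iff every clause is true.
    return any(all(eval_clause(c, a) for c in clauses)
               for a in product([False, True], repeat=n))
```

The exponential enumeration here is exactly what NP-completeness says we cannot avoid in the worst case, which motivates the reduction argument that follows.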
Theorem A1. Given a neural network F, a distribution ℵ ⊂ R n , and a distance measurement D : R n × R n → R between X and X 0 , a point X 0 ∈ R n , searching for a minimum adversarial example of X 0 is an NPC problem.
Proof. We now prove Theorem A1. We first reduce the problem into a decision problem, and then according to the definition of an NPC problem, we prove that the problem belongs to the type of NP problem, and finally, we prove that a known NPC problem can be reduced to the decision problem in polynomial time.
We first reduce the problem of finding the minimum AEs to a series of decision problems. Though many important problems are not decision problems in their most natural form, they can be reduced to a series of decision problems that are easier to study, for example, the graph coloring problem: when coloring the vertices of a graph so that any two adjacent vertices have different colors, we can ask "can we color the vertices with no more than m colors, m ∈ N+?", and the smallest m for which this decision problem is solvable is the optimal solution of the coloring problem. Similarly, we can transform the optimization problem of finding the minimum AEs into the following series of decision problems: given a perturbation precision δε and an initial perturbation ε_1, with ε_i = ε_{i−1} + δε, can we use a perturbation ε ∈ {ε_1, ε_2, . . . , ε_i, . . .} to make the inequality F(X_0 + ε_i) ≠ F(X_0) true? The first value in the sequence that makes the inequality true is the optimal solution of the optimization problem.
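The decision-problem sequence ε_i = ε_{i−1} + δε can be sketched on a toy linear classifier; `toy_classify`, the fixed weights, and the perturbation direction are illustrative stand-ins for the threat model and the perturbation construction.

```python
import numpy as np

def toy_classify(x):
    # Hypothetical threat model: a fixed linear two-class classifier.
    w = np.array([1.0, -1.0])
    return int(w @ x > 0)

def min_eps_search(x0, direction, d_eps=0.05, max_eps=5.0):
    # Walk the sequence eps_i = eps_{i-1} + d_eps and return the first
    # eps whose perturbation flips the prediction: F(x0 + eps*d) != F(x0).
    y0 = toy_classify(x0)
    eps = d_eps
    while eps <= max_eps:
        if toy_classify(x0 + eps * direction) != y0:
            return eps
        eps += d_eps
    return None  # no flip within the budget
```

The first ε in the sequence that flips the label is the answer to the optimization problem up to the precision δε, mirroring the graph-coloring analogy above.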
This is clearly an NP problem. In the guessing stage, any perturbation ε is taken as a candidate solution of the decision problem. In the verification stage, since feeding the perturbation ε and the sample X_0 to the neural network and reading its output can be completed in polynomial time, verification is polynomial. Therefore, the decision problem can be solved by a nondeterministic polynomial algorithm and, by the definition of the NP class, it is an NP problem.
Finally, we prove that any problem in NP can be reduced to this decision problem in polynomial time. By the transitivity of polynomial reduction, it suffices to show that a known NPC problem, the 3-SAT problem, can be transformed into the decision problem in polynomial time. Since 3-SAT is an NPC problem, by definition any problem in NP can be reduced to 3-SAT in polynomial time; hence, if 3-SAT can be reduced to the aforementioned decision problem of searching for AEs, then by transitivity any problem in NP can be reduced to that decision problem in polynomial time, proving that the decision problem is NP-hard. We now show how the 3-SAT problem is Turing-reduced to the decision problem.
According to the definition of the 3-SAT problem, given the ternary satisfiability formula ψ = C_1 ∧ C_2 ∧ . . . ∧ C_k over the variable set X = {x_1, x_2, . . . , x_k}, each clause is a disjunction of three literals, q_i^1 ∨ q_i^2 ∨ q_i^3, where q_i^1, q_i^2 and q_i^3 are variables from X or their negations. The problem becomes determining whether there is an assignment α : X → {0, 1} that satisfies ψ, that is, an assignment α that makes all clauses C_i hold at the same time.
To simplify, we first assume that the input nodes q_i^1, q_i^2 and q_i^3 take the discrete values 0 or 1 when the sub-statement is constructed. We will then explain how to relax this restriction so that the only restriction on the input nodes is that they lie in the range [0, 1].
First, we introduce the disjunction gadget: given nodes q_1, q_2, q_3 ∈ {0, 1}, the output node is Y_i, with Y_i = 1 when q_1 + q_2 + q_3 ≥ 1 and Y_i = 0 otherwise. Figure A2 shows the situation when q_i is the variable itself (that is, not the negation of the variable). The disjunction gadget can be seen as computing Equation (A1): Y_i = 1 − max(0, 1 − (q_1 + q_2 + q_3)). If at least one input variable is 1, then Y_i = 1; if all input variables are 0, then Y_i = 0. The key of the gadget is that the ReLU function ensures the output Y_i remains exactly 1 even if multiple inputs are set to 1.
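A minimal sketch of the disjunction gadget, assuming the single-ReLU form of Equation (A1) read from the surrounding text (the exact form of (A1) is our reconstruction):

```python
def relu(z):
    # ReLU activation, the only nonlinearity used by the gadget.
    return max(0.0, z)

def disjunction_gadget(q1, q2, q3):
    # Y = 1 - ReLU(1 - (q1 + q2 + q3)): the ReLU clips the output at
    # exactly 1 even when several inputs are 1, so Y behaves as a
    # Boolean OR on {0, 1} inputs.
    return 1.0 - relu(1.0 - (q1 + q2 + q3))
```

This form also matches the relaxed analysis later in the proof: when all inputs lie in [0, ε], the sum is at most 3ε and the output lies in [0, 3ε].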
To handle a negative literal q_i^j = 1 − x_j ≡ ¬x_j, we introduce a negation gadget before feeding the literal into the disjunction gadget, as shown in Figure A3a. This gadget computes 1 − x_j, and its output is then passed to the aforementioned disjunction gadget. The last step involves a conjunction gadget, as shown in Figure A3b.
Assuming all nodes Y_1, . . . , Y_n take values in {0, 1}, we require the node Y to lie in the range [n, n]. Obviously, this requirement holds only if all Y_i nodes are 1.
Lastly, to check whether all the clauses C_1, . . . , C_n are satisfied at the same time, we construct a disjunction gadget for each clause (using the negation gadget on its inputs as needed) and combine them with a conjunction gadget, as shown in Figure A4. The input variables are mapped to each t_i node according to the definition of clause C_i, that is, t_i → C_i. According to the above discussion, if clause C_i is satisfied, then Y_i = 1; otherwise, Y_i = 0. Therefore, the node Y lies in the range [n, n] if and only if all the clauses are satisfied at the same time. Thus, an input assignment α : X → {0, 1} satisfies the constraint between the input and the output of the neural network if and only if that assignment also satisfies the original formula ψ = C_1 ∧ C_2 ∧ . . . ∧ C_n.
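The whole construction, negation gadgets feeding disjunction gadgets whose outputs are summed by the conjunction gadget, can be sketched as follows; the clause encoding and function names are ours, and the disjunction gadget uses our reconstruction of Equation (A1).

```python
def relu(z):
    return max(0.0, z)

def negation_tool(x):
    # Negation gadget: computes 1 - x_j for a negative literal.
    return 1.0 - x

def disjunction(q1, q2, q3):
    # Disjunction gadget (reconstructed Equation (A1)).
    return 1.0 - relu(1.0 - (q1 + q2 + q3))

def formula_network(clauses, assignment):
    # clauses: list of triples of (var_index, negated) literals.
    # Returns the value of the conjunction node Y, which equals the
    # number of clauses n iff every clause is satisfied.
    ys = []
    for clause in clauses:
        qs = [negation_tool(assignment[i]) if neg else assignment[i]
              for i, neg in clause]
        ys.append(disjunction(*qs))
    return sum(ys)
```

Requiring Y ∈ [n, n] (i.e., Y = n) on this network therefore encodes satisfiability of ψ, exactly as the proof argues.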
The above construction is based on the assumption that the input nodes take the discrete values {0, 1}, that is, α : X → {0, 1}. However, this does not accord with the assumption that ψ_1(X) is a conjunction of linear constraints. We will now show how to relax this restriction so that the original proposition remains true.
Let ε be a very small number. We suppose that each variable X_i lies in the range [0, 1] but must ensure that any feasible solution satisfies X_i ∈ [0, ε] or X_i ∈ [1 − ε, 1]. We therefore add an auxiliary gadget to each input variable X_i, using ReLU nodes to compute Equation (A2):

max(0, ε − X) + max(0, X − 1 + ε)    (A2)

The output node of Equation (A2) is required to lie within the range [0, ε]. For X ∈ [0, 1], this requirement is satisfied when X ∈ [0, ε] or X ∈ [1 − ε, 1].
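The auxiliary gadget of Equation (A2), with the elided ε symbols restored, can be sketched directly; `aux_gadget` is an illustrative name.

```python
def relu(z):
    return max(0.0, z)

def aux_gadget(x, eps=0.1):
    # Equation (A2) as reconstructed from the text:
    # ReLU(eps - X) + ReLU(X - 1 + eps). At X = 0 or X = 1 the output
    # is exactly eps, and it stays within [0, eps] on the two end
    # intervals [0, eps] and [1 - eps, 1].
    return relu(eps - x) + relu(x - 1.0 + eps)
```

This is the per-variable constraint that lets the input nodes range over [0, 1] while still encoding a near-Boolean assignment.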
The disjunction expression in our construction is Equation (A1), and its value changes with the inputs. If all inputs are in [0, ε] or [1 − ε, 1], the final output node of each disjunction gadget Y_i no longer takes discrete values in {0, 1}: in particular, if all inputs of a gadget are in [0, ε], its output Y_i lies in [0, 3ε].
If each clause has at least one input node in the range [1 − ε, 1], then all Y_i nodes lie in [1 − ε, 1] and Y lies in [n(1 − ε), n]. However, if at least one clause has no node in the range [1 − ε, 1], then Y is less than n(1 − ε) (when ε < 1/(n + 3)). Therefore, the requirement Y ∈ [n(1 − ε), n] is satisfiable if and only if ψ is satisfiable, and a satisfying assignment can be constructed by mapping each X_i ∈ [0, ε] to 0 and each X_i ∈ [1 − ε, 1] to 1.