Applied Sciences
  • Article
  • Open Access

18 January 2023

A Self-Adaptive Approximated-Gradient-Simulation Method for Black-Box Adversarial Sample Generation

1 School of Computer and Big Data Science, JiuJiang University, Jiujiang 332005, China
2 School of Computer Information & Communication Engineering, Kunsan National University, Gunsan 54150, Republic of Korea
3 Jiangxi Institute of Science and Technology Information, Nanchang 330046, China
* Authors to whom correspondence should be addressed.
This article belongs to the Special Issue Future Information & Communication Engineering 2022

Abstract

Deep neural networks (DNNs) have been widely applied to a variety of everyday tasks. However, DNNs are sensitive to adversarial attacks which, by adding imperceptible perturbations to an original image, can easily alter the output. State-of-the-art white-box attack methods generate perturbation samples that successfully fool DNNs by exploiting the network gradient. However, they construct perturbation samples using only the sign of the gradient and discard its magnitude. Accordingly, gradients of different magnitudes may adopt the same sign to construct perturbation samples, resulting in inefficiency. Unfortunately, it is often impractical to acquire the gradient in real-world scenarios. Consequently, we propose a self-adaptive approximated-gradient-simulation method for black-box adversarial attacks (SAGM) to generate efficient perturbation samples. Our proposed method uses knowledge-based differential evolution to simulate gradients and a self-adaptive momentum gradient to generate adversarial samples. To assess the efficiency of the proposed SAGM, a series of experiments were carried out on two datasets, namely MNIST and CIFAR-10. Compared to state-of-the-art attack techniques, our proposed method can quickly and efficiently search for perturbation samples that cause the original samples to be misclassified. The results reveal that the SAGM is an effective and efficient technique for generating perturbation samples.

1. Introduction

Deep neural networks (DNNs) have achieved impressive success in a variety of everyday tasks, such as image classification [1], autonomous driving [2], and object recognition [3]. With the expansion of application domains, DNN robustness, which refers to the sensitivity of the output to tiny changes in the input image, has become an important research field [4]. Clean images with tiny, imperceptible perturbations, known as adversarial examples, may cause DNNs to misclassify the label of an image [5]. Researchers have shown that this makes DNNs very vulnerable to adversarial examples. In real-world scenarios, adversarial examples pose a security risk in contexts such as self-driving vehicles [6], text classification [7], and medical diagnostic tests [8]. Adversarial examples are dangerous for DNNs, but they are also a strategy for enhancing DNN robustness.
The method of generating adversarial examples that cause DNNs to give an incorrect output with high confidence is defined as an adversarial attack [9]. Adversarial attacks generally fall into two types, namely white-box [10] and black-box attacks [11]. The adversarial examples generated by white-box attacks can fool DNNs with high success rates, mainly because such attacks have access to the internal knowledge of the DNN and the gradient obtained by backpropagation. However, in real-world scenarios, it is impractical to have complete access to the internal information of a DNN. Generating adversarial examples from only the input image and the model output, without knowledge of the internal details of the DNN, is the new challenge posed by black-box attacks [12].
In state-of-the-art white-box attacks, the adversarial image is obtained mainly by adding or subtracting a perturbation at each pixel of the clean image along a sensitive direction. Researchers have proposed generating adversarial examples by searching for the sensitive gradient direction [13]. As a result, most attack methods keep only the gradient direction without considering the gradient magnitude. The sensitive direction can be estimated directly from the loss of the DNN. However, black-box attackers cannot obtain the gradient directly and may simulate it through other schemes, such as zeroth-order optimization [14], evolution strategies [15], and random search [16]. A black-box attack can be implemented by solving an optimization problem for the minimum perturbation. Normally, solving such a high-dimensional optimization problem is time-consuming, and the mapping from probabilities to categories is extremely non-linear.
Based on the above considerations, we propose a self-adaptive approximated-gradient-simulation method for black-box adversarial attacks (SAGM) to generate perturbation samples. The SAGM utilizes self-adaptive differential evolution (DE) to simulate the gradient of each pixel and leverages the momentum gradient to improve its efficiency. In the process of gradient simulation, the DE parameters are self-adaptively adjusted according to the DNN confidence score. Finally, the perturbation image is continuously adjusted by approximating the gradient to achieve the attack purpose.
We compare the SAGM with other state-of-the-art attack methods on two datasets, namely MNIST [17] and CIFAR-10 [18]. Extensive experiments show that our proposed SAGM achieves high attack efficiency comparable with that of other attack methods and can generate an adversarial example faster and more efficiently. Finally, the SAGM is applied to strengthen DNN robustness.
The paper can be summarized by providing the following main contributions:
  • We construct an evaluation model based on similarity and confidence score to validate the effectiveness of adversarial samples.
  • We propose a method to simulate the approximated gradient using differential evolution.
  • We propose a novel parameter-adaptation scheme in which F and CR are adaptively adjusted, using the feedback of the evaluation model, to explore sensitive gradient directions.
  • We propose an adversarial-sample-generation method based on quantized gradients, preserving the magnitude and direction of the gradient, to generate an efficient sample.
The remainder of this paper is organized as follows. Section 2 reviews related work on recent white-box attacks, black-box attacks, and DE. The proposed SAGM is presented in detail in Section 3. Section 4 gives the experimental results and analysis. The last section, Section 5, is devoted to conclusions and future work.

3. Our Proposed Algorithm

In this section, our approach, the self-adaptive approximated-gradient-simulation method (SAGM), is proposed to improve adversarial attacks. Its main feature is to enhance the generation of adversarial samples. Our proposed framework is depicted in Figure 1. The SAGM mainly consists of two processes, simulating the approximate gradient and self-adaptively generating adversarial samples, in which the gradient of each pixel is approximated by evolutionary algorithms (EAs) to self-adaptively perturb the original image. These two processes are performed iteratively until the perturbed image successfully attacks the model or the maximum number of runs is reached. In the following subsections, we describe these two processes in detail.
Figure 1. The SAGM framework.

3.1. Model Construction

In a DNN, an image x is fed into the hidden layers to extract its features, and the network finally outputs a confidence vector over the n categories, $p(x, n) \in [0, 1]$. The SAGM is a black-box attack method that does not rely on the internal configuration of the DNN; therefore, the goal is to find an adversarial image x′ that is similar to the image x belonging to category t but is identified as another category t′ (t′ ≠ t). To achieve this goal, the SAGM needs to solve the constrained optimization problem presented in Equation (2).
$\mathrm{minimize}\ D(x') = \|x' - x\|_2 \quad \mathrm{s.t.}\quad C(x') = t',\ \ x' = x + \delta,\ \ \delta \in [0,1]^n$
The constrained optimization problem searches for a similar image that validly implements misclassification. $C(x') = t'$ is an extremely non-linear constraint whose purpose is to misclassify the image into category t′. In other words, the confidence that the image belongs to category t′, i.e., $p(x', t')$, is the highest among all categories. Based on this consideration, the non-linear constraint is converted into the function in Equation (3).
$l(x') = p(x', t) - \max_{k \neq t} p(x', k)$
where t is the category of the original image x, $p(x', t)$ is the confidence in category t, and $\max_{k \neq t} p(x', k)$ is the maximum confidence over all categories k other than t.
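For reference, the check in Equation (3) amounts to a one-line comparison of confidence scores. The following minimal NumPy sketch (the helper name and the assumption that the model exposes a softmax confidence vector are ours) illustrates it:

import numpy as np

def misclassification_loss(confidences: np.ndarray, true_label: int) -> float:
    # Equation (3): l(x') = p(x', t) - max_{k != t} p(x', k).
    # A value <= 0 means the perturbed image is already misclassified.
    others = np.delete(confidences, true_label)
    return float(confidences[true_label] - others.max())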
When $l(x')$ is less than or equal to 0, the image is misclassified into another category. Therefore, the converted function has the same purpose as the original constraint, namely to misclassify perturbed images. The constrained optimization problem (2) is thus transformed into the new constrained optimization problem in Equation (4).
$\mathrm{minimize}\ D(x') = \|x' - x\|_2 \quad \mathrm{s.t.}\quad p(x', t) - \max_{k \neq t} p(x', k) \le 0,\ \ x' = x + \delta,\ \ \delta \in [0,1]^n$
The main purpose of Equation (4) is to explore the perturbed vector δ and add it to the original image x to fool the DNN. The relationship between the three parameters in Equation (4) is depicted in Figure 2.
Figure 2. The relationship between the adversarial image and the original image.
Figure 2 shows that there is a gap between the regions of the two categories. The red and blue dashed circles represent the feasible region $X = \{x' \mid p(x', t) \le \max_{k \neq t} p(x', k)\}$ and the initial solution set $X_0 = \{x \mid t = \arg\max_k p(x, k)\}$ of model (4), respectively. The black dashed line indicates the gap δ between the two categories.
Based on the characteristics of the model, model (4) is converted into an unconstrained non-linear optimization problem using the exterior point method, which constructs a penalty function to penalize points in the infeasible region, as shown in Equation (5).
$P(x') = \begin{cases} 0, & x' \in X \\ M, & x' \notin X \end{cases}$
where M is a large positive number.
The penalty function (5) transforms (4) into an unconstrained problem, but it jumps at the boundary of the feasible region. Therefore, to avoid this problem, the penalty function of (4) is constructed as shown in Equation (6).
$P_{c_g}(\delta) = c_g \big\{ \big[\max\big(p(x+\delta, t) - \max_{k \neq t} p(x+\delta, k),\, 0\big)\big] + \big[\max(-\delta,\, 0)\big] + \big[\max(\delta - 1,\, 0)\big] \big\}$
In (6), $c_g$ is the penalty parameter, taken from an ascending sequence, which makes x′ gradually approach the optimal solution from outside the feasible region. Therefore, (4) is converted into the augmented objective function in Equation (7).
$\mathrm{minimize}\ f(\delta) = \|\delta\|_2 + P_{c_g}(\delta)$
In Equation (7), the first term on the right measures the similarity between the perturbed image x′ and the image x, and minimizing it keeps the two images as close as possible. The second term drives infeasible points toward the feasible region by penalizing them. Therefore, solving for the adversarial image x′ amounts to solving for the perturbed vector δ.
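As an illustration, the augmented objective of Equation (7) can be evaluated from the model's confidence scores alone, which is what makes it usable in the black-box setting. The sketch below is our own minimal NumPy rendering; the function names, the summation of the element-wise penalty terms over pixels, and the sign convention of the box-constraint terms reflect our reading of Equations (6) and (7) rather than the authors' exact implementation.

import numpy as np

def penalty(delta, confidences, true_label, c_g):
    # Exterior-point penalty of Equation (6) for one perturbation vector delta.
    others_max = np.delete(confidences, true_label).max()
    violation = max(float(confidences[true_label]) - others_max, 0.0)  # misclassification constraint
    lower = np.maximum(-delta, 0.0).sum()       # delta >= 0 (aggregated over pixels, our choice)
    upper = np.maximum(delta - 1.0, 0.0).sum()  # delta <= 1
    return c_g * (violation + lower + upper)

def objective(delta, confidences, true_label, c_g):
    # Equation (7): ||delta||_2 plus the penalty P_{c_g}(delta).
    return float(np.linalg.norm(delta)) + penalty(delta, confidences, true_label, c_g)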

3.2. Self-Adaptive Two-Step Perturbation

Perturbation based on sign gradients is an effective method for adversarial attacks. However, the sign-gradient method discards the relative magnitudes of the raw gradients, so that all components share the same coefficient. Therefore, after G iterations, perturbation with a fixed step size α may result in either of the two cases shown in Figure 3.
Figure 3. Self-adaptive two-step perturbation.
In Figure 3, the blue and red dotted circles represent the probability contours of x and x′, the two black lines represent the perturbation distances α and β along the gradient direction, and $x_g$ is the perturbed image after the g-th generation. Figure 3a shows a perturbed image that is not converted into an adversarial image because the step size is too small. Figure 3b shows a perturbed image that is converted into an adversarial image but whose perturbation is easily recognizable because the step size is too large.
The main reason for this condition is that each pixel of the original image is adjusted at a fixed learning rate to achieve the purpose of the attack. Hence, pixels with different gradients are updated at the same learning rate, which does not help improve the efficiency of the attack. To deal with this problem, we propose an improved two-step adaptive perturbation method. The first step is to adaptively adjust α to generate the temporary perturbation image $x_{g+1}$, and the second step is to fine-tune $x_{g+1}$ by adding a step size β to move toward the boundary of the adversarial image.
In the first step, we leverage the optimal gradient in the past to guide the next-generation adaptive change of the pixel (Equation (8)).
$\delta_i = \delta_{i-1} + \frac{\delta_{i,best}}{\|\delta_{i,best}\|_2}$
where $\delta_{i,best}$ is the optimal gradient of generation i.
Assuming that the initial optimal gradient $\delta_0 = 0$, the cumulative gradient of generation g can be written as Equation (9).
$\delta_g = \sum_{i=0}^{g-1} \frac{\delta_{i,best}}{\|\delta_{i,best}\|_2}$
Afterward, $\delta_{g-1}$ and $\delta_g$ are linearly combined to adjust the perturbation direction and step size of the current generation (Equation (10)).
$\delta'_g = \theta_1 \delta_{g-1} + \theta_2 \delta_g$
where $\theta_1$ and $\theta_2$ are hyperparameters that control the exponential decay rate, and $\theta_1 + \theta_2 = 1$.
To preserve both the direction and the magnitude of the gradient, the perturbed vector δ is calculated by Equation (11).
$\delta' = \lambda \frac{\delta'_g}{\max(\delta'_g)}, \qquad \delta_g = \begin{cases} \mathrm{sign}(\delta'), & \text{if } |\delta'| < 1 \\ \mathrm{round}(\delta'), & \text{otherwise} \end{cases}$
where λ is a fixed parameter and max(·) returns the gradient maximum. The function round(x) rounds the variable x to the nearest integer. The main purpose of Equation (11) is to set a different step size α for each perturbation.
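The quantization in Equation (11) can be written compactly. The sketch below is ours; dividing by the maximum absolute component reflects the normalization to [−λ, λ] described for Algorithm 1 and is an assumption on our part.

import numpy as np

def quantized_step(momentum_grad: np.ndarray, lam: float) -> np.ndarray:
    # Equation (11): scale the combined gradient to [-lam, lam], then keep the
    # sign of small components and round the larger ones, so that both the
    # direction and a coarse magnitude survive.
    scaled = lam * momentum_grad / np.max(np.abs(momentum_grad))
    return np.where(np.abs(scaled) < 1.0, np.sign(scaled), np.round(scaled))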
To further improve the success rate of adversarial images, the second step fine-tunes the perturbed image so that it approaches the boundary of the target image $x_{g+2}$. Therefore, β is updated by step-size derivation, as shown in Equation (12).
$\beta_g = \frac{\langle \Delta x, \Delta p \rangle}{\langle \Delta p, \Delta p \rangle}$
where $\Delta x = x_{g+2} - x_{g+1}$, $\Delta p = p_{g+2} - p_{g+1}$, and $\langle a, b \rangle$ denotes the scalar product of the vectors a and b. $x_{g+2}$ is the target image belonging to the category with the highest confidence other than the category t of the original image.
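The fine-tuning step size of Equation (12) is a ratio of two scalar products and can be computed directly from flattened arrays. The helper below is an illustrative sketch; the variable names are ours.

import numpy as np

def fine_tune_beta(x_target, x_temp, p_target, p_temp):
    # Equation (12): beta = <dx, dp> / <dp, dp>, with dx the image difference
    # and dp the difference of the corresponding perturbation directions.
    dx = (x_target - x_temp).ravel()
    dp = (p_target - p_temp).ravel()
    return float(np.dot(dx, dp) / np.dot(dp, dp))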

3.3. Self-Adaptive Approximated Gradient Search

It is well known that black-box attacks only require access to the output, not to the network structure. Because the value of the loss function cannot be obtained, the gradient cannot be computed directly. The perturbed image x′ is generated from the original image x by the iterative algorithm in Equation (13).
$x_{g+1} = x_g + \delta_g$
Assuming $p_g$ is the unit vector of $\delta_g$, there exists a variable $\alpha_g > 0$ such that $\delta_g = \alpha_g \cdot p_g$. In the iterative algorithm, $p_g$ represents the perturbation direction of $\delta_g$, and $\alpha_g$ represents the step of the g-th generation along the search direction $p_g$. Therefore, the iterative function is written as Equation (14).
$x_{g+1} = x_g + \alpha_g\, \mathrm{sign}(\delta_g)$
Here, the initialization value of $x_0$ is the clean image. The exploration process of x′ is shown in Figure 4.
Figure 4. The exploration process of perturbed image.
In Figure 4, the solid blue and red circles represent the probability contours of x and x′, respectively. The dotted concentric circles on the left indicate that the probability of belonging to class t decreases with increasing radius. The clean image is perturbed by the vector δ, which lowers this probability, while the probability of the other categories gradually increases. When the probability of belonging to class t drops below a certain threshold, an adversarial image may be generated.
Based on the above analysis, a self-adaptive DE is proposed to simulate the perturbed vectors δ g . Differential evolution (DE) [28], proposed by Storn and Price, has been widely used to solve numerical optimization problems because of its simplicity, robustness, convergence, and ease of use. The three steps in DE, namely mutation, crossover, and selection, are computed repeatedly until the termination criterion is satisfied. Mutation produces a mutant vector through mutation strategies, the crossover is the process of recombination between the mutant vector and the target vector, and selection is to preserve the optimal individuals for the next generation according to the greedy choice method.
In the SAGM, we utilize adaptive DE to explore the perturbed vectors. The initial perturbed vector, with elements between −1 and 1, is generated using a uniform random distribution, and its dimension is consistent with that of the input image. Each element of the initial vector $\delta_0$ is generated by Equation (15).
$\delta_0 = \min + \mathrm{rand}(0,1)\,(\max - \min)$
where min and max are the minimum and maximum of the gradient, respectively, and rand(0, 1) is a random number between 0 and 1.
After initialization, a mutant vector is produced with respect to the target vector. In each generation of mutation, the mutant vector can be generated by the following mutation strategy (Equation (16)),
$V_{i,g} = \delta_{r_1,g} + F_g (\delta_{r_2,g} - \delta_{r_3,g})$
where $r_1$, $r_2$, and $r_3$ are three mutually distinct integers different from the index i. The scaling factor F is a positive control parameter that scales the difference vector.
The scaling factor in the mutation strategy enhances the diversity of the population by scaling the difference vectors. In the process of population evolution, the diversity of the population gradually decreases. To maintain population diversity in the later stage of evolution, an adaptive mutation strategy based on fitness values is proposed. For individuals with poor fitness values, a larger F is used to stretch the difference vector and explore a larger space; otherwise, a smaller value is used. The F for each individual is adjusted according to Equation (17).
$F_{i,g} = \gamma\, \frac{f_{i,g}}{\max(f_g)}$
where $F_{i,g}$ is the scaling factor of the i-th individual in generation g, $f_g$ is the vector of fitness values in generation g, $f_{i,g}$ is the fitness value of the i-th individual in generation g, and γ is a constant between 0 and 1.
After the mutant vector has been generated by mutation, a trial vector is produced by performing a binomial crossover between the target vector and the mutant vector (Equation (18)).
$U_g = \begin{cases} V_g, & \text{if } \mathrm{rand}(0,1) \le CR \ \text{or}\ j = j_{rand} \\ \delta_g, & \text{otherwise} \end{cases}$
where rand(0, 1) is a uniformly distributed random number, $j_{rand}$ is a randomly chosen index, the condition $j = j_{rand}$ ensures that the trial vector inherits at least one variable from the mutant vector, and $CR \in [0, 1]$ is the crossover rate.
The crossover operation improves the inherent diversity of the population by exchanging individual elements. To preserve the optimal individuals for the next generation with the highest probability, a fitness-value-based method is used to adjust the CR of each individual. In other words, individuals with better fitness values are given a smaller CR value for the crossover operation. The CR of each individual is computed according to the normalization method in Equation (19).
$CR_{i,g} = \varepsilon\, \frac{f_{i,g} - \min(f_g)}{\max(f_g) - \min(f_g)}$
where $CR_{i,g}$ is the crossover probability of the i-th individual in generation g, $f_{i,g}$ is the fitness value of the i-th individual in generation g, and ε is a constant between 0 and 1.
Finally, a greedy selection scheme is used to choose the better vector to survive into the next generation, as described in Equation (20).
$\delta_{i,g+1} = \begin{cases} U_{i,g}, & \text{if } f(U_{i,g}) \le f(\delta_{i,g}) \\ \delta_{i,g}, & \text{otherwise} \end{cases}$
where f(·) is the objective function of Equation (7). Therefore, if the new trial vector yields an equal or lower value of the objective function, it replaces the corresponding target vector in the next generation; otherwise, the target vector is retained.
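One generation of the self-adaptive DE of Equations (16)-(20) can be sketched as follows. This is our own minimal NumPy rendering: the per-individual F and CR schedules follow Equations (17) and (19), while the vectorized layout, the fitness callback, and the default gamma/epsilon values are illustrative assumptions rather than the paper's settings.

import numpy as np

rng = np.random.default_rng(0)

def adaptive_de_generation(pop, fitness, objective, gamma=0.9, eps=0.9):
    # pop: (N, D) candidate perturbation vectors; fitness: their Equation (7) values (NumPy array).
    # Assumes N >= 4 so that three distinct partners can be drawn for each target vector.
    n, d = pop.shape
    f_max, f_min = fitness.max(), fitness.min()
    f_range = max(f_max - f_min, 1e-12)                      # guard against a flat population
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(n):
        F_i = gamma * fitness[i] / max(f_max, 1e-12)         # Eq. (17): worse individuals explore more
        CR_i = eps * (fitness[i] - f_min) / f_range          # Eq. (19): better individuals change less
        r1, r2, r3 = rng.choice([j for j in range(n) if j != i], size=3, replace=False)
        mutant = pop[r1] + F_i * (pop[r2] - pop[r3])         # Eq. (16)
        take_mutant = rng.random(d) <= CR_i
        take_mutant[rng.integers(d)] = True                  # j = j_rand in Eq. (18)
        trial = np.where(take_mutant, mutant, pop[i])
        f_trial = objective(trial)
        if f_trial <= fitness[i]:                            # Eq. (20): greedy selection
            new_pop[i], new_fit[i] = trial, f_trial
    return new_pop, new_fit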
In the SAGM, we use self-adaptive DE to simulate the approximate gradient, and then the clean image is perturbed in the gradient direction with an adaptive step size. The pseudo-code of the self-adaptive approximated gradient search (SaGS) is shown in Algorithm 1.
Algorithm 1: SaGS algorithm
Input: image: I; initial perturbed vector: δ0; parameters: F, CR, α0, and λ; the objective function: f(·).
Output: the adversarial image IG.
1. I0 = I + α0 · sign(δ0);
2. f0 ← f (I0);
3. for g ← 1 to G do
4.     fmax = max(fg−1) and fmin = min(fg−1);
5.     F ← F · fg−1/fmax
6.     CR ← CR · (fg−1fmin)/(fmaxfmin)
7.     Generate vector δg according to Equations (16) and (18);
8.     k ← sort(fg−1(Ig−1))
9.      δg ← δg−1 + δ(k(1))/‖δ(k(1))‖2;
10.    δ′ = θ1 · δg−1 + θ2 · δg
11.   α = λ·δ′/max(δ′);
12.   if |α| < 1 then
13.      Ig = Ig−1 + α0 · sign(α)
14.   else
15.      Ig = Ig−1 + α0 · round(α)
16.   end if
17.   fg ← f(Ig);
18.   Compare fg and fg−1 and preserve δg
19. end for
20. Return IG, δG, and fG;
In Algorithm 1, DE is used to continuously adjust the direction and step size of the perturbation and to generate a temporary perturbation image. In line 1, the clean image is perturbed in the δ0 direction with step size α0. The perturbed image I0 is fed to the DNN for testing, and its confidence and distance from I are obtained by Equation (7) (line 2). The offspring vectors are generated according to Equations (16) and (18) with adaptive F and CR. The effectiveness of the previous perturbations is sorted in ascending order in line 8, and k(1) is the index of the best perturbation. In lines 9 to 10, empirically excellent gradients are used to guide the current perturbation. In line 11, the gradients are normalized to lie between −λ and λ. In lines 12 to 16, Ig−1 is perturbed by different step sizes. The optimal gradients are preserved for the next generation in line 18. After G generations, the temporary perturbed image IG is returned.

3.4. Description of the Proposed Method

In this subsection, a self-adaptive approximated-gradient-based adversarial attack is presented. The framework consists of two main steps: self-adaptive approximated gradient simulation and boundary exploration. The pseudo-code of the SAGM is summarized in Algorithm 2.
Algorithm 2: The SAGM algorithm
Input: image: original image I0, parameters: α, β, λ, θ1, θ2, the objective function: f(.), F,
            CR, T, G.
Output: perturbation image IT.
1.   Initialize perturbed vector δ0;
2.   for i ← 1 to T do
3.      IG, δG, fG ← SaGS(I0, δ0, G, F, CR, α, λ, f(·));
4.      k ← sort(fG)
5.      ∆I′ ← I′(k(2)) − I′(k(1))
6.      ∆δ ← δG(k(2)) − δG(k(1))
7.         β = ⟨∆I, ∆δ⟩/⟨∆δ, ∆δ⟩;
8.      I0I′(k(1)) + β · sign(δG(k(1)));
9.   end for
10. Return the perturbation image I0;
In Algorithm 2, the perturbed vector δ0 is initialized between −1 and 1 according to Equation (15). δG is the optimal gradient after G generations. Retaining the former gradient helps escape local optima and accelerates convergence. In line 3, the SaGS method simulates the approximate gradient of the image and generates a set of temporary perturbation samples which may fool the DNN with a high success rate. The effects of the perturbations are sorted in ascending order in line 4. In lines 5 to 7, β is updated by Equation (12). In line 8, the permanent perturbation sample is generated from the optimal temporary perturbation sample. After T generations, the adversarial image is returned.

4. Experiment and Evaluation

4.1. Experimental Setting

To evaluate the performance of the proposed SAGM, experiments were conducted on two image datasets, MNIST and CIFAR-10, which have different image dimensions and numbers of categories. MNIST is a dataset of 28 × 28 gray-scale handwritten digits from 0 to 9, containing 60,000 training samples and 10,000 testing samples. CIFAR-10 is a dataset of 32 × 32 color images in 10 categories, with 6000 images per category. One DNN model, LeNet5 [44], was pre-trained on the MNIST dataset, while three DNN models, ResNet18 [45], AlexNet [46], and GoogLeNet [47], were selected for the CIFAR-10 dataset. For MNIST and CIFAR-10, all training samples were used to train the DNNs used in this paper, and all test samples were used to verify the robustness of the DNNs.
Moreover, for these DNN models, the parameters were kept consistent with the original frameworks. All experiments were carried out on a computer with an Intel Core i7 CPU and 16 GB RAM. LeNet5 achieved an accuracy of 98.44% on MNIST, and ResNet18, AlexNet, and GoogLeNet achieved accuracies of 98.90%, 96.46%, and 98.38% on CIFAR-10, respectively. Furthermore, four attack algorithms, FGM [48], PGD, CW, and BMI-FGSM, were selected for comparison with our attack algorithm. The curves of training accuracy and loss are shown in Figure 5 and Figure 6, respectively.
Figure 5. Network training accuracy.
Figure 6. Network training loss.

4.2. Measurement

4.2.1. Comparison

In this subsection, comprehensive metrics are used to evaluate the different adversarial attack techniques.
The attack success rate (ASR) is the ratio of the number of successfully attacked images to the total number of test images. The larger the value, the lower the accuracy of the DNN under attack. The calculation is shown in Equation (21).
$ASR = \frac{Num_{success}}{Num_{total}}$
where Numsuccess is the number of successfully attacked images and Numtotal is the total number of test images. For the MNIST dataset, the ASRs of the five attack methods on LeNet5 are presented in Table 1. The results in Table 1 show that the SAGM has the highest ASR; therefore, the proposed SAGM has better performance.
Table 1. Attack Success Rate on the MNIST Dataset.
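Computed from the attack outcomes, Equation (21) is a simple ratio; the following helper (our sketch, assuming arrays of adversarial predictions and true labels) makes that explicit.

import numpy as np

def attack_success_rate(adv_predictions, true_labels):
    # Equation (21): fraction of test images whose adversarial version is
    # misclassified, i.e. Num_success / Num_total.
    return float(np.mean(adv_predictions != true_labels))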
For the CIFAR-10 dataset, the ASRs of the five attack methods on ResNet18 and AlexNet are shown in Table 2. PGD and the SAGM achieved a 100% success rate on ResNet18, but PGD was slightly lower than the SAGM on AlexNet. Therefore, the SAGM and BMI-FGSM have better performance among these attack algorithms.
Table 2. Attack Success Rate on the CIFAR10 Dataset.
Transferability: adversarial examples (AEs) generated for one ML model can be used to cause misclassification in another model, even if the two models have different architectures and training data. This property is called transferability. The adversarial examples of the SAGM for MNIST and CIFAR-10 are shown in Figure 7 and Figure 8, respectively. The proposed algorithm was found to have a good effect.
Figure 7. The adversarial examples of SAGM for MNIST.
Figure 8. The adversarial examples of SAGM for Cifar10.
Robustness is a metric used to evaluate the resilience of DNNs to adversarial examples. The robustness of a model to adversarial perturbations is defined in Equation (22).
$\Delta_{adv}(x, p) = \min_{\delta \in [0,1]^n} \|\delta\|_p \quad \mathrm{s.t.}\quad p(x+\delta, t) - \max_{k \neq t} p(x+\delta, k) \le 0$
The robustness of the three DNNs under the three norms is shown in Table 3. The smaller the value, the smaller the distance between the adversarial examples and the real samples. The best mean values are shown in bold. The results show that the samples generated by the white-box attacks have a better adversarial distance than those of the black-box attacks because they exploit the DNN gradient.
Table 3. Robustness of DNNs.
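In practice, the robustness reported per norm is estimated from the perturbations actually found by each attack. A small helper such as the following (ours, assuming the three norms are L1, L2, and L-infinity) computes the distances for one image pair.

import numpy as np

def perturbation_norms(x_clean, x_adv):
    # Distance between an adversarial example and the original image,
    # used to report the per-norm robustness of Equation (22) empirically.
    delta = (x_adv - x_clean).ravel()
    return {"L1": float(np.abs(delta).sum()),
            "L2": float(np.linalg.norm(delta)),
            "Linf": float(np.abs(delta).max())}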
To further discuss the sensitivity of each class of images, we selected the testing samples from the CIFAR-10 dataset on AlexNet and ResNet18 to verify the sensitivity. The adversarial examples are shown in Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14, Figure 15, Figure 16, Figure 17 and Figure 18. From these figures, we found that the SAGM algorithm has a better adversarial effect and higher clarity.
Figure 9. The adversarial examples of FGM for Cifar10 and AlexNet.
Figure 10. The adversarial examples of CW for Cifar10 and AlexNet.
Figure 11. The adversarial examples of PGD for Cifar10 and AlexNet.
Figure 12. The adversarial examples of BMI-FGSM for Cifar10 and AlexNet.
Figure 13. The adversarial examples of SAGM for Cifar10 and AlexNet.
Figure 14. The adversarial examples of FGM for Cifar10 and ResNet18.
Figure 15. The adversarial examples of CW for Cifar10 and ResNet18.
Figure 16. The adversarial examples of PGD for Cifar10 and ResNet18.
Figure 17. The adversarial examples of BMI-FGSM for Cifar10 and ResNet18.
Figure 18. The adversarial examples of SAGM for Cifar10 and ResNet18.
Structural similarity (SSIM), which combines comparisons of luminance, contrast, and structure, was proposed to evaluate the similarity between two images. The SSIM between images x and x′ is modeled in Equation (23).
$SSIM(x, x') = \frac{1}{m} \sum_{n=1}^{m} \big[ L(x_n, x'_n)^{\alpha}\, C(x_n, x'_n)^{\beta}\, S(x_n, x'_n)^{\gamma} \big]$
where m is the number of pixels, L, C, and S are the luminance, contrast, and structure comparisons of the image, respectively, and α = β = γ = 1. The comparison of structural similarity is shown in Table 4, and the best mean values are shown in bold. The results show that CW has a clear advantage on LeNet5 but that the methods are almost indistinguishable on the other two DNNs.
Table 4. SSIM of DNNs.
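SSIM with α = β = γ = 1 matches the default weighting of scikit-image, so values of this kind can be reproduced with a loop such as the one below. This is our sketch; it assumes float images in [0, 1], HxWxC color layout, and scikit-image >= 0.19 for the channel_axis argument.

import numpy as np
from skimage.metrics import structural_similarity

def mean_ssim(clean_batch, adv_batch):
    # Average SSIM over a batch of (clean, adversarial) image pairs.
    scores = [structural_similarity(c, a, data_range=1.0, channel_axis=-1)
              for c, a in zip(clean_batch, adv_batch)]
    return float(np.mean(scores))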

4.2.2. Effectiveness of Adaptive Step

To verify the effect of the adaptive step, Equation (11) was embedded in FGM and PGD. The effectiveness of the adaptive step is shown in Table 5. The results show that the adaptive step size helps improve the ASR.
Table 5. Effect of Adaptive Steps.
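As an illustration of how Equation (11) can be dropped into a gradient-based attack, the sketch below replaces the fixed sign step of an FGM update with the adaptive quantized step. The gradient here is the white-box loss gradient, shown only to illustrate the drop-in change, and eps and lam are illustrative values rather than the settings used in the experiments.

import numpy as np

def fgm_with_adaptive_step(x, grad, eps=1.0 / 255, lam=8.0):
    # Normalize the gradient to [-lam, lam], quantize it as in Equation (11),
    # then take one pixel-space step and clip back to the valid image range.
    scaled = lam * grad / np.max(np.abs(grad))
    quantized = np.where(np.abs(scaled) < 1.0, np.sign(scaled), np.round(scaled))
    return np.clip(x + eps * quantized, 0.0, 1.0)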

4.2.3. Evaluation

To sum up, we compared the SAGM with existing adversarial attack algorithms, and the results show that the proposed algorithm performs well on the attack evaluation metrics, namely attack success rate, transferability, robustness, sensitivity, and structural similarity (SSIM). The attack success rate is an important evaluation index of an attack algorithm, comparable to the accuracy of a recognition algorithm; Table 1 and Table 2 show that the SAGM reaches a very high success rate on this index and can carry out accurate attacks with few misses. Figure 7 and Figure 8 show the transferability of the SAGM. Table 3 compares the robustness of the SAGM with that of other existing algorithms. Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14, Figure 15, Figure 16, Figure 17 and Figure 18 show the sensitivity results; the SAGM has a better adversarial effect and higher clarity. Table 4 shows the SSIM results of the SAGM compared with other existing algorithms.
At the same time, the self-adaptive step is also effective for existing adversarial algorithms. When added to other adversarial algorithms, it improves their ASR, as shown in Table 5.

5. Conclusions

In this paper, we proposed a gradient-simulation method for black-box adversarial attacks, the SAGM, for the score-based generation of adversarial examples. Without access to internal gradients, it is very difficult to identify adversarial examples that successfully fool DNNs on MNIST and CIFAR-10. Two operations, namely self-adaptive approximated-gradient simulation and momentum-gradient-based perturbation-sample generation, were proposed to explore the sensitive gradient direction and to generate efficient perturbation samples that attack DNNs. Extensive experiments showed that the SAGM achieves excellent performance comparable to other attack algorithms in success rate, perturbation distance, and transferability. In the future, we will focus on improving the evaluation model to generate more efficient adversarial examples and achieve a balance between adversarial effectiveness and the human visual system. In addition, we believe that sensitive pixels can be identified so that only part of the pixels of a clean image need be perturbed, improving the robustness of adversarial examples. Furthermore, another direction is the application of other intelligent optimization algorithms to adversarial attacks.

Author Contributions

Conceptualization, Y.Z. and S.-Y.S.; methodology, B.X.; writing—original draft preparation Y.Z. and X.T.; supervision, Y.Z., S.-Y.S. and X.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially supported by Institute of Information and Telecommunication Technology of KNU.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Junior, F.E.F.; Yen, G.G. Particle swarm optimization of deep neural networks architectures for image classification. Swarm Evol. Comput. 2019, 49, 62–74. [Google Scholar] [CrossRef]
  2. Cococcioni, M.; Rossi, F.; Ruffaldi, E.; Saponara, S.; Dupont de Dinechin, B. Novel arithmetics in deep neural networks signal processing for autonomous driving: Challenges and opportunities. IEEE Signal Process. Mag. 2020, 38, 97–110. [Google Scholar] [CrossRef]
  3. Janai, J.; Güney, F.; Behl, A.; Geiger, A. Computer vision for autonomous vehicles: Problems, datasets and state of the art. Found. Trends® Comput. Graph. Vis. 2020, 12, 1–308. [Google Scholar] [CrossRef]
  4. Lee, S.; Song, W.; Jana, S.; Cha, M.; Son, S. Evaluating the robustness of trigger set-based watermarks embedded in deep neural networks. IEEE Trans. Dependable Secur. Computing 2022, 1–15. [Google Scholar] [CrossRef]
  5. Ren, H.; Huang, T.; Yan, H. Adversarial examples: Attacks and defenses in the physical world. Int. J. Mach. Learn. Cybern. 2021, 12, 3325–3336. [Google Scholar] [CrossRef]
  6. Shen, J.; Robertson, N. Bbas: Towards large scale effective ensemble adversarial attacks against deep neural network learning. Inf. Sci. 2021, 569, 469–478. [Google Scholar] [CrossRef]
  7. Garg, S.; Ramakrishnan, G. Bae: Bert-based adversarial examples for text classification. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Punta Cana, Dominican Republic, 8 May 2020. [Google Scholar] [CrossRef]
  8. Rahman, A.; Hossain, M.S.; Alrajeh, N.A.; Alsolami, F. Adversarial examples—security threats to COVID-19 deep learning systems in medical IoT devices. IEEE Internet Things J. 2020, 8, 9603–9610. [Google Scholar] [CrossRef]
  9. Finlayson, S.G.; Bowers, J.D.; Ito, J.J.; Zittrain, L.; Beam, A.L.; Kohane, I.S. Adversarial attacks on medical machine learning. Science 2019, 363, 1287–1289. [Google Scholar] [CrossRef]
  10. Prinz, K.; Flexer, A.; Widmer, G. On end-to-end white-box adversarial attacks in music information retrieval. Trans. Int. Soc. Music. Inf. Retr. 2021, 4, 93–104. [Google Scholar] [CrossRef]
  11. Guo, C.; Gardner, J.; You, Y.; Wilson, A.G.; Weinberger, K. Simple black-box adversarial attacks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 2484–2493. [Google Scholar] [CrossRef]
  12. Wang, Y.; Tan, Y.; Zhang, W.; Zhao, Y.; Kuang, X. An adversarial attack on dnn-based black-box object detectors. J. Netw. Comput. Appl. 2020, 161, 102634. [Google Scholar] [CrossRef]
  13. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and harnessing adversarial examples. arXiv 2014, arXiv:1412.6572. [Google Scholar] [CrossRef]
  14. Liu, S.; Chen, P.Y.; Kailkhura, B.; Zhang, G.; Hero, A.; Varshney, P.K. A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications. IEEE Signal Process. Mag. 2020, 37, 43–54. [Google Scholar] [CrossRef]
  15. Cai, X.; Zhao, H.; Shang, S.; Zhou, Y.; Deng, W.; Chen, H.; Deng, W. An improved quantum-inspired cooperative co-evolution algorithm with muli-strategy and its application. Expert Syst. Appl. 2021, 171, 114629. [Google Scholar] [CrossRef]
  16. Mohammadi, H.; Soltanolkotabi, M.; Jovanović, M.R. On the linear convergence of random search for discrete-time LQR. IEEE Control. Syst. Lett. 2020, 5, 989–994. [Google Scholar] [CrossRef]
  17. LeCun, Y. The Mnist Database of Handwritten Digits. 1998. Available online: http://yann.lecun.com/exdb/mnist/ (accessed on 20 September 2018).
  18. Hinton, G.E. Training products of experts by minimizing contrastive divergence. Neural Comput. 2002, 14, 1771–1800. [Google Scholar] [CrossRef] [PubMed]
  19. Kumar, B.; Dikshit, O.; Gupta, A.; Singh, M.K. Feature extraction for hyperspectral image classification: A review. Int. J. Remote Sens. 2020, 41, 6248–6287. [Google Scholar] [CrossRef]
  20. Kurakin, A.; Goodfellow, I.J.; Bengio, S. Adversarial examples in the physical world. In Artificial Intelligence Safety and Security; Chapman Hall/CRC: Boca Raton, FL, USA, 2018; pp. 99–112. [Google Scholar] [CrossRef]
  21. Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards deep learning models resistant to adversarial attacks. Stat 2017, 1050, 9. [Google Scholar] [CrossRef]
  22. Dong, Y.; Liao, F.; Pang, T.; Su, H.; Zhu, J.; Hu, X.; Li, J. Boosting adversarial attacks with momentum. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9185–9193. [Google Scholar] [CrossRef]
  23. Carlini, N.; Wagner, D. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (sp). IEEE, San Jose, CA, USA, 22–26 May 2017; pp. 39–57. [Google Scholar] [CrossRef]
  24. Chen, P.Y.; Zhang, H.; Sharma, Y.; Yi, J.; Hsieh, C.J. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA, 3 November 2017; pp. 15–26. [Google Scholar] [CrossRef]
  25. Su, J.; Vargas, D.V.; Sakurai, K. One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Computation 2019, 23, 828–841. [Google Scholar] [CrossRef]
  26. Lin, J.; Xu, L.; Liu, Y.; Zhang, X. Black-box adversarial sample generation based on differential evolution. J. Syst. Softw. 2020, 170, 110767. [Google Scholar] [CrossRef]
  27. Li, C.; Wang, H.; Zhang, J.; Yao, W.; Jiang, T. An approximated gradient sign method using differential evolution for black-box adversarial attack. IEEE Trans. Evol. Comput. 2022, 26, 976–990. [Google Scholar] [CrossRef]
  28. Storn, R.; Price, K. Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359. [Google Scholar] [CrossRef]
  29. Xu, Z.; Han, G.; Liu, L.; Martínez-García, M.; Wang, Z. Multi-energy scheduling of an industrial integrated energy system by reinforcement learning-based differential evolution. IEEE Trans. Green Commun. Netw. 2021, 5, 1077–1090. [Google Scholar] [CrossRef]
  30. Jana, R.K.; Ghosh, I.; Das, D. A differential evolution-based regression framework for forecasting Bitcoin price. Ann. Oper. Res. 2021, 306, 295–320. [Google Scholar] [CrossRef] [PubMed]
  31. Njock, P.G.A.; Shen, S.L.; Zhou, A.; Modoni, G. Artificial neural network optimized by differential evolution for predicting diameters of jet grouted columns. J. Rock Mech. Geotech. Eng. 2021, 13, 1500–1512. [Google Scholar] [CrossRef]
  32. Luo, G.; Zou, L.; Wang, Z.; Lv, C.; Ou, J.; Huang, Y. A novel kinematic parameters calibration method for industrial robot based on Levenberg-Marquardt and Differential Evolution hybrid algorithm. Robot. Comput. Integr. Manuf. 2021, 71, 102165. [Google Scholar] [CrossRef]
  33. Sun, Y.; Song, C.; Yu, S.; Liu, Y.; Pan, H.; Zeng, P. Energy-efficient task offloading based on differential evolution in edge computing system with energy harvesting. IEEE Access 2021, 9, 16383–16391. [Google Scholar] [CrossRef]
  34. Singh, D.; Kaur, M.; Jabarulla, M.Y.; Kumar, V.; Lee, H.N. Evolving fusion-based visibility restoration model for hazy remote sensing images using dynamic differential evolution. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–14. [Google Scholar] [CrossRef]
  35. Biswas, S.; Manicka, S.; Hoel, E.; Levin, M. Gene regulatory networks exhibit several kinds of memory: Quantification of memory in biological and random transcriptional networks. iScience 2021, 24, 102131. [Google Scholar] [CrossRef]
  36. Tan, X.; Lee, H.; Shin, S.Y. Cooperative Coevolution Differential Evolution Based on Spark for Large-Scale Optimization Problems. J. Inf. Commun. Converg. Eng. 2021, 19, 155–160. [Google Scholar] [CrossRef]
  37. Pant, M.; Zaheer, H.; Garcia-Hernandez, L.; Abraham, A. Differential Evolution: A review of more than two decades of research. Eng. Appl. Artif. Intell. 2020, 90, 103479. [Google Scholar] [CrossRef]
  38. Baioletti, M.; Di Bari, G.; Milani, A.; Poggioni, V. Differential evolution for neural networks optimization. Mathematics 2020, 8, 69. [Google Scholar] [CrossRef]
  39. Zhang, J.; Sanderson, A.C. JADE: Adaptive differential evolution with optional external archive. IEEE Trans. Evol. Comput. 2009, 13, 945–958. [Google Scholar] [CrossRef]
  40. Tan, X.; Shin, S.Y.; Shin, K.S.; Wang, G. Multi-Population Differential Evolution Algorithm with Uniform Local Search. Appl. Sci. 2022, 12, 8087. [Google Scholar] [CrossRef]
  41. Georgioudakis, M.; Plevris, V. A comparative study of differential evolution variants in constrained structural optimization. Front. Built Environ. 2020, 6, 102. [Google Scholar] [CrossRef]
  42. Ronkkonen, J.; Kukkonen, S.; Price, K.V. Real-parameter optimization with differential evolution. IEEE Congr. Evol. Comput. 2005, 1, 506–513. [Google Scholar] [CrossRef]
  43. Ali, M.M.; Torn, A. Population set-based global optimization algorithms: Some modifications and numerical studies. Comput. Oper. Res. 2004, 31, 1703–1725. [Google Scholar] [CrossRef]
  44. LeCun, Y.; Jackel, L.; Bottou, L.; Brunot, A.; Cortes, C.; Denker, J.; Drucker, H.; Guyon, I.; Muller, U.; Sackinger, E.; et al. Comparison of learning algorithms for handwritten digit recognition. Int. Conf. Artif. Neural Netw. 1995, 60, 53–60. [Google Scholar]
  45. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  46. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  47. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar] [CrossRef]
  48. Kurakin, A.; Goodfellow, I.; Bengio, S. Adversarial machine learning at scale. arXiv 2016, arXiv:1611.01236. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
