Recent studies have shown that deep neural networks are susceptible to adversarial examples, which poses significant challenges to the robustness and application security of deep learning models. To explore and understand these potential vulnerabilities, Szegedy et al. [6] first proposed the concept of adversarial examples in 2013 and developed an optimization-based method for generating them: L-BFGS. This approach introduces meticulously designed minor perturbations into the input data, causing the model to make incorrect predictions even though the perturbations are imperceptible to humans. Over time, the study of adversarial examples has gradually shifted from theoretical exploration to practical application, such as improving the robustness and security of deep learning models. Extending from computer vision to other areas, such as speech and natural language processing [22,23,24,25,26], research on adversarial examples now focuses not only on the effectiveness of attacks but also on exploring more covert and practically valuable attack methods [27,28,29,30,31]. These studies offer valuable insights and guidance for improving the security of deep learning models. Existing adversarial example generation algorithms can be categorized from multiple perspectives: according to the attack objective, they can be classified as targeted or nontargeted attacks; based on the attacker's knowledge of the deep neural network, they can be categorized as white-box, black-box, or grey-box attacks. In this paper, adversarial example generation algorithms are further classified by the extent of the input perturbation, namely global perturbation attacks and local perturbation attacks. This section introduces these two types of attack methods, distinguished by the degree of input perturbation during the adversarial example generation process.
2.1. Global Perturbation Attacks
A global perturbation attack is a method of adversarial example generation designed to mislead a model's prediction by applying subtle yet comprehensive changes across the entire input. The essence of this attack is to influence the model's interpretation of the whole input rather than targeting specific areas or features. Research on global perturbation attacks not only exposes the vulnerabilities of deep learning models but also promotes continuous improvement in model robustness. The strategies for global perturbation attacks can be broadly divided into several categories according to the principles and techniques used to execute the attack.
Gradient-based attack methods: This type of method generates adversarial perturbations by computing the gradient of the model's loss function with respect to the input and then adds these perturbations to the original samples to create adversarial examples. In 2014, Goodfellow et al. proposed the fast gradient sign method (FGSM) [7], the first gradient-based global perturbation attack. It generates adversarial examples by backpropagating the loss function and modifying the image along the sign of the resulting gradient. Soon after, the FGSM was introduced into adversarial attacks on SAR image samples [32,33,34,35,36]: perturbations are added in the direction of the greatest gradient change within the network model to rapidly increase the loss function, ultimately leading to incorrect classification by the model. The original FGSM requires only a single gradient update to produce adversarial examples; however, as a single-step attack with relatively high perturbation intensity that relies on a linear approximation of the target function, it achieves a comparatively low attack success rate. To address this issue, Kurakin et al. extended FGSM and proposed the basic iterative method (BIM) [37], which perturbs the image over multiple iterations and recomputes the gradient direction after each step, alleviating the low attack success rate of FGSM. Huang et al. [38] designed a variant of BIM that moves from untargeted to targeted attacks by replacing the ground-truth label in the loss function with a target label. However, adversarial examples generated by BIM-like methods can easily become trapped in poor local maxima due to the limited step size, which affects their transferability. To address this, Yin et al. improved upon BIM and proposed the momentum iterative fast gradient sign method (MIFGSM) [39], which uses momentum to stabilize the direction of the gradient updates, mitigating the tendency of BIM to become trapped in poor local maxima during the generation of adversarial examples. Furthermore, to increase the attack success rate, Madry et al. proposed the projected gradient descent (PGD) algorithm [40], another extension of BIM. This method generates adversarial examples by repeatedly perturbing the input along the sign of the gradient and projecting the result back onto the allowed perturbation set. PGD has been widely applied in adversarial attacks on SAR imagery [33,36,41], enhancing attack effectiveness by increasing the number of iterations and adding random initialization. As an alternative to FGSM and its variants, Dong et al. proposed a translation-invariant attack method (TIM) [42], whose adversarial examples are less sensitive to the discriminative regions of the attacked white-box model and therefore transfer better. The approach can be combined with any gradient-based attack: the gradient is convolved with a predefined kernel before being applied to the original image, and the resulting perturbations are smoother than those of FGSM.
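To make the relationship between these gradient-based attacks concrete, the following is a minimal PyTorch sketch rather than code from any of the cited papers: fgsm implements the single-step signed-gradient update, and iterative_attack covers the iterative variants described above (momentum = 0 without a random start corresponds to BIM, momentum > 0 to MIFGSM, and a random start with the per-step projection to PGD). The function names, the assumption that model maps images in [0, 1] to logits, and all hyperparameter values are illustrative.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    # Single-step attack: move each pixel by eps along the sign of the loss gradient.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return torch.clamp(x_adv + eps * x_adv.grad.sign(), 0.0, 1.0).detach()

def iterative_attack(model, x, y, eps, alpha, steps, momentum=0.0, random_start=False):
    # Covers BIM (momentum=0, random_start=False), MIFGSM (momentum>0),
    # and PGD (random_start=True, with projection onto the eps-ball at every step).
    x = x.detach()
    x_adv = x + eps * (2 * torch.rand_like(x) - 1) if random_start else x.clone()
    g = torch.zeros_like(x)
    for _ in range(steps):
        x_adv = x_adv.clamp(0.0, 1.0).detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # MIFGSM-style momentum: accumulate a normalized gradient to stabilize the update direction.
        g = momentum * g + grad / grad.abs().sum().clamp_min(1e-12)
        x_adv = x_adv.detach() + alpha * g.sign()
        # Projection step: keep the perturbation within the L-infinity eps-ball around x.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
    return x_adv.clamp(0.0, 1.0).detach()
```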
Optimization-based attack methods: The process of generating adversarial examples can be viewed as finding the optimal perturbation that produces an effective adversarial example, so adversarial example generation can be described as solving a constrained optimization problem. The C&W [43] algorithm is a classic optimization-based global perturbation attack method; it iteratively optimizes perturbations measured under the infinity-norm, the 0-norm, or the 2-norm. By adjusting the parameters of the objective function, the algorithm significantly enlarges the solution space, thereby greatly enhancing the success rate of the adversarial examples. The C&W algorithm has been widely applied in adversarial attacks on SAR imagery [32,33]. However, it suffers from drawbacks such as slow training speed and poor transferability: because the adversarial perturbation for each test sample must be optimized iteratively over a long period, the approach is unsuitable for adversarial attack tasks that require an immediate response. Chen et al. proposed the elastic-net attacks to DNNs (EAD) [44] method for generating attack perturbations, which can be viewed as an extension of the C&W method to the 1-norm distance. By applying elastic-net regularization, the approach addresses the issue of high-dimensional feature selection associated with the 1-norm, thereby finding more effective adversarial perturbations and significantly improving the transferability of global adversarial perturbations. In the SAR domain, to address the slow generation of SAR image adversarial examples by the C&W algorithm, Du et al. proposed the fast C&W [45] adversarial example generation algorithm. This algorithm builds a deep encoder network to learn the forward mapping from the original SAR image space to the adversarial example space, so that adversarial perturbations can be generated more quickly during an attack through fast forward mapping.
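As an illustration of this optimization-based formulation, the sketch below shows a basic untargeted C&W-style 2-norm attack in PyTorch: the distortion term and a weighted margin term are optimized in tanh space so the adversarial image stays in [0, 1]. It is a minimal rendition of the general objective, not the exact implementation of [43] or of the fast C&W encoder in [45]; the function name, the trade-off constant c, the confidence margin kappa, and the optimizer settings are assumptions made for the example.

```python
import torch

def cw_l2(model, x, y, c=1.0, kappa=0.0, steps=200, lr=0.01):
    # Optimize the perturbation in tanh space so the adversarial image stays in [0, 1].
    x = x.detach()
    w = torch.atanh(x.clamp(1e-6, 1 - 1e-6) * 2 - 1).detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        x_adv = 0.5 * (torch.tanh(w) + 1)
        logits = model(x_adv)
        true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        # Strongest competing class: mask out the true class before taking the maximum.
        other_logit = logits.scatter(1, y.unsqueeze(1), float('-inf')).amax(dim=1)
        # Margin term: positive as long as the true class still wins by at least kappa.
        margin = torch.clamp(true_logit - other_logit + kappa, min=0)
        # Trade off the 2-norm distortion against the margin term, weighted by c.
        l2 = ((x_adv - x) ** 2).flatten(1).sum(dim=1)
        loss = (l2 + c * margin).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (0.5 * (torch.tanh(w) + 1)).detach()
```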
Decision-based attack methods: These methods exploit the geometry of the classification hyperplanes, determining the size of the perturbation by estimating the minimum distance from the original sample to the model's decision boundary. The resulting perturbation vector is then added to the original sample to generate the adversarial example. The DeepFool [46] algorithm and the HSJA [47] algorithm are classic decision-based global adversarial attack methods. DeepFool iteratively generates a perturbation vector pointing towards the nearest decision boundary, based on a local linearization of the classifier, until the adversarial example crosses that boundary. In contrast, HSJA repeatedly performs gradient-direction estimation, a geometric progression search for the step size, and a binary search towards the estimated decision boundary to generate global adversarial examples. To improve the generalization ability of adversarial perturbations, Moosavi-Dezfooli et al. proposed universal adversarial perturbations (UAP) [48], which constructs an image-agnostic perturbation by aggregating, over many samples, the minimal perturbation that pushes each sample across the classification boundary; such perturbations also generalize well across different target models.
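The following single-image DeepFool-style sketch in PyTorch is included only to make the "step to the nearest linearized boundary" idea concrete; the batch-of-one assumption, the explicit loop over classes, and the overshoot value are illustrative simplifications of the method in [46].

```python
import torch

def deepfool_single(model, x, num_classes, max_iter=50, overshoot=0.02):
    # x: one image of shape (1, C, H, W); model returns logits of shape (1, num_classes).
    x_adv = x.clone().detach()
    orig_label = model(x_adv).argmax(dim=1).item()
    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        logits = model(x_adv)[0]
        if logits.argmax().item() != orig_label:
            break  # the decision boundary has been crossed
        # Gradient of each class logit with respect to the input (linearization of the classifier).
        grads = [torch.autograd.grad(logits[k], x_adv, retain_graph=True)[0]
                 for k in range(num_classes)]
        best_dist, best_step = float('inf'), None
        for k in range(num_classes):
            if k == orig_label:
                continue
            w = grads[k] - grads[orig_label]             # normal of the linearized boundary k vs. original class
            f = (logits[k] - logits[orig_label]).item()  # logit gap to class k
            dist = abs(f) / (w.norm().item() + 1e-12)
            if dist < best_dist:
                best_dist = dist
                best_step = (abs(f) + 1e-4) / (w.norm().item() ** 2 + 1e-12) * w
        # Move towards the nearest boundary, slightly overshooting so the label actually flips.
        x_adv = (x_adv + (1 + overshoot) * best_step).detach()
    return x_adv
```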
2.2. Local Perturbation Attacks
Local perturbation attacks are attacks in which the attacker perturbs only specific areas or parts of the input sample to generate adversarial examples. Unlike global perturbation attacks, they make minor modifications to specific regions of the input data, reducing the overall interference with the background speckle while maintaining the effectiveness of the attack. The key to local perturbation attack methods lies in precisely controlling the perturbation area so as to effectively deceive the model while keeping the adversarial perturbation natural. Local perturbation attacks can be categorized into several types according to the perturbation optimization strategy, with each method employing different principles and techniques to execute the attack.
Gradient-based attack methods: As with global perturbation attacks, a series of local perturbation attack methods optimize perturbations using gradient strategies. Papernot et al. proposed the Jacobian-based saliency map attack (JSMA) [49] for generating adversarial examples against specific target classes. The algorithm uses the Jacobian matrix and a saliency map to identify the pair of pixels with the greatest influence on the model's classification result within the input and then modifies those pixels to generate adversarial examples. Dong et al. proposed the superpixel-guided attention (SGA) [50] algorithm, which adds perturbations to similar areas of an image through superpixel segmentation and then converts the global problem into a local one using class activation mapping information. Additionally, Lu et al. introduced the DFool method, which adds perturbations to 'stop' signs and facial images to mislead the corresponding detectors; this was the first work to propose generating adversarial examples for object detection. DAG [10] is a classic method for adversarial attacks on object detection; it yields effective results in practice but is time-consuming because it must iteratively attack each candidate box. Li et al. proposed the RAP attack [51] for two-stage networks, designing a loss function that combines classification and localization losses. Compared to DAG, this method exploits the bounding-box information in object detection for the attack; however, its actual attack performance is moderate, and its transferability to RPN attacks is poor. In the field of SAR, researchers have drawn on gradient-based attack concepts to design a series of local perturbation attack methods. For instance, the SAR sticker [52] creates perturbations in specific areas of SAR images, maintaining the effectiveness of the attack while enhancing its stealth. Peng et al. proposed the speckle variant attack (SVA) [53], a method for adversarial attacks on SAR remote sensing images that consists of two main modules: a gradient-based perturbation generator and an object area extractor. The perturbation generator transforms the background SAR coherent speckle, disrupting the original noise pattern and continuously reconstructing the speckle noise at each iteration; this prevents the generated adversarial examples from overfitting to noise features and thereby achieves robust transferability. The object area extractor ensures the feasibility of adding adversarial perturbations in real-world scenarios by restricting the area of the perturbations.
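To illustrate the general idea shared by these local gradient-based methods, rather than the specific SVA or SAR-sticker implementations, the sketch below restricts an iterative signed-gradient attack to a binary object mask, so the perturbation is confined to the target region and the background speckle is left untouched. The function name, the mask convention (1 inside the object region, 0 elsewhere), and the hyperparameters are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def masked_pgd(model, x, y, mask, eps, alpha, steps):
    # Local perturbation attack: iterative signed-gradient updates applied only inside a binary mask.
    x, mask = x.detach(), mask.detach()   # mask: 1 inside the object region, 0 elsewhere
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(torch.clamp(x + delta * mask, 0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        # Update the perturbation and keep it within the eps budget; the mask zeroes it outside the object.
        delta = torch.clamp(delta.detach() + alpha * grad.sign(), -eps, eps)
    return torch.clamp(x + delta * mask, 0, 1)
```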
Optimization-based attack methods: Su et al. proposed the one-pixel attack [54], which manipulates a single pixel to alter the classification of the entire image, deceiving the classification model into assigning a specified label and thereby, to some extent, achieving a targeted attack. Xu et al. [32] drew inspiration from the one-pixel algorithm to generate local perturbation attacks on SAR images, transforming the creation of adversarial examples into a constrained optimization problem. The method only needs to identify the pixel location to be modified and then uses a differential evolution algorithm to perturb that pixel value for a successful attack. Compared with gradient-based adversarial example generation methods, the perturbations created by optimization-based methods are smaller in magnitude and more precise. Furthermore, inspired by sparse adversarial perturbation methods, Meng et al. [55] proposed the TRPG method for generating locally perturbed adversarial examples for SAR remote sensing images. The method extracts the object mask position in SAR images through segmentation, thereby concentrating the perturbations of the SAR adversarial examples in the object area; adversarial examples that are more consistent with SAR image characteristics are then generated via the optimization-based C&W method. Moreover, considering the practicality of local perturbations in the physical world, a category of local perturbation attack methods known as 'adversarial patches' has emerged. These local perturbations are applied within a certain area of the input image to attack network models and can be effectively deployed in real-world scenarios. The earliest concept of adversarial patches was proposed by Brown et al. [18] in 2017, whose locally generated perturbations could achieve universal, targeted attacks on real-world objects. Subsequently, Karmon et al. introduced the localized and visible adversarial noise (LaVAN) [56] method, which focuses more on exploiting the model's vulnerabilities to cause misclassification, with perturbation sizes far smaller than those designed by Brown et al. In the field of object detection, a series of adversarial patch attack methods have also been developed. The Dpatch method [16] generates local perturbations and uses them as detection boxes to interfere with detectors, while the Obj-hinder method [15] disrupts detectors by minimizing their class loss. Wang et al. [57] proposed a black-box attack on object detectors based on particle swarm optimization, named EA, which uses natural optimization algorithms to guide the generation of perturbations at appropriate positions; however, this method is time-consuming.
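As a concrete example of this gradient-free, optimization-based formulation, the following sketch searches for a few pixel modifications with SciPy's differential evolution, in the spirit of the one-pixel attack discussed above; it is not the implementation of [54] or [32]. The assumption of a single-image batch with values in [0, 1], the integer class index y, the rounding of pixel coordinates, and all search parameters are illustrative.

```python
import torch
from scipy.optimize import differential_evolution

def one_pixel_style_attack(model, x, y, pixels=1, maxiter=30, popsize=20):
    # x: one image of shape (1, C, H, W) with values in [0, 1]; y: integer true-class index.
    _, _, h, w = x.shape

    def apply(z):
        # z encodes (row, col, value) for each modified pixel; coordinates are rounded to integers.
        x_adv = x.clone()
        for i in range(pixels):
            r, c, v = z[3 * i: 3 * i + 3]
            x_adv[0, :, int(r), int(c)] = float(v)   # set all channels of the chosen pixel
        return x_adv

    def objective(z):
        # Fitness for differential evolution: the model's confidence in the true class (to be minimized).
        with torch.no_grad():
            probs = torch.softmax(model(apply(z)), dim=1)
        return probs[0, y].item()

    bounds = [(0, h - 1), (0, w - 1), (0.0, 1.0)] * pixels
    result = differential_evolution(objective, bounds, maxiter=maxiter,
                                    popsize=popsize, tol=1e-3, seed=0)
    return apply(result.x)
```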
A deeper examination of these diverse global and local perturbation attack methods shows that research on adversarial examples has made significant progress in multiple fields. However, when transferred to SAR image processing, these methods face significant limitations. First, the coherent speckle background in SAR images changes dynamically, which makes it more challenging to maintain the stability and effectiveness of adversarial examples under different conditions. Second, the size adaptability of local perturbations remains a problem, especially when effective attacks on objects of different sizes and shapes are required. These challenges call for further optimization and adjustment of current adversarial example methods to suit the characteristics and diversity of scenarios in SAR imagery.