Review

From 2D-Patch to 3D-Camouflage: A Review of Physical Adversarial Attack in Object Detection

by Guojia Li 1,†, Mingyue Cao 1,†, Yihong Zhang 1, Simin Xu 1 and Yan Cao 1,2,*
1 School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450002, China
2 Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou 450001, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Electronics 2025, 14(21), 4236; https://doi.org/10.3390/electronics14214236
Submission received: 28 September 2025 / Revised: 25 October 2025 / Accepted: 27 October 2025 / Published: 29 October 2025
(This article belongs to the Section Computer Science & Engineering)

Abstract

Deep neural networks have demonstrated remarkable performance in object detection tasks; however, they remain highly susceptible to adversarial attacks. Previous surveys in computer vision have provided considerable coverage of physical adversarial attacks, yet the aspect of systematically categorizing and evaluating their physical deployment methods for object detection has not received commensurate focus. To address this gap, we categorize physical adversarial attacks into three primary classes based on the physical deployment of adversarial patterns: manipulating 2D physical objects, injecting adversarial signals, and placing 3D adversarial camouflage. These categories are further analyzed and compared across nine key attributes. Furthermore, we elucidate the relationship between physical adversarial attacks against object detection models and two critical properties: transferability and perceptibility. Our findings indicate that while attacks involving the manipulation of 2D physical objects are relatively straightforward to deploy, their adversarial patterns are often perceptible to human observers. Similarly, 3D adversarial camouflage tends to lack stealthiness, whereas adversarial signal injection offers stronger imperceptibility. However, all three attack types exhibit limited transferability across different models and modalities. Finally, we discuss current challenges and propose actionable directions for future research, aiming to foster the development of more robust object detection systems.

1. Introduction

In recent years, Deep Neural Networks (DNNs) have been extensively utilized in object detection tasks [1,2]. These tasks often serve as the foundation of fields such as autonomous driving [3,4] and video surveillance [5]. Existing research has highlighted the vulnerability of DNNs [6,7]. Adversarial examples pose a significant security threat to object detection models, and this threat remains a key area of research [8,9].
For example, in autonomous driving systems, object detectors are employed to recognize traffic signs in order to support safe operation [10,11]. By pasting a physical adversarial patch on a sign (such as a stop sign), adversaries can mislead the object detector or cause it to ignore the sign once the sign is captured by the onboard camera, which may lead to vehicle collisions and traffic rule violations [12,13].
In physical adversarial attacks against object detection models, adversaries need to place physical adversarial patterns in the physical environment so that the image perception device captures the target carrying the adversarial pattern, turning it into a digital adversarial example that serves as the input to the victim model [14,15,16]. In real environments, physical adversarial patterns are subject to various physical constraints, such as shape, size, color, lighting, distance, viewing angle, and limitations of image perception devices. Deploying physical adversarial patterns is therefore more challenging than adding adversarial perturbations in digital adversarial attacks [4,17]. For this reason, this work offers a thorough summary of physical adversarial attacks in object detection and clarifies the relationship between physical adversarial attacks and their transferability and perceptibility.

1.1. Literature Selection

This review employs a systematic literature review methodology [18] to evaluate the state-of-the-art academic and industrial research on physical adversarial attacks against object detection, including pedestrian and vehicle detection, as well as models for infrared detection. In this review, the following keywords were used as search strings: physical adversarial attack, object detection, person detection, vehicle detection.
Our literature search was conducted up to July 2025, encompassing various scholarly search engines and platforms, including the ACM Digital Library (https://dl.acm.org/, accessed on 15 July 2025), IEEE Xplore (https://ieeexplore.ieee.org/Xplore/home.jsp, accessed on 12 July 2025), Elsevier (https://www.elsevier.com/en-in, accessed on 12 July 2025), Springer (https://www.springer.com/gp, accessed on 12 July 2025), Web of Science (https://webofscience.clarivate.cn/wos/woscc/basic-search, accessed on 12 July 2025), Scopus (https://www.scopus.com/pages/home?display=basic#basic, accessed on 12 July 2025), Google Scholar (https://scholar.google.com/, accessed on 12 July 2025), and mainly from the top journals and conferences, such as CCS, S&P, USENIX, NDSS, AAAI, CVPR, ICCV, ICML, TIFS, TDSC, TGRS, TPAMI, IJCV, TIP, NeurIPS, IoT, and ICLR. The selection criteria were as follows:
(1)
Physical deployment. The attack strategies and methods in this review had to be implementable in a physical environment.
(2)
Victim tasks. This review focuses on physical adversarial attacks against object detection models, including related tasks such as person detection, vehicle detection, and traffic sign detection.
(3)
Conceptual cohesiveness. Articles were selected based on their relevance and cohesive contribution to the field of physical adversarial attacks.
We employed the OR operator to incorporate synonymous terms for core concepts, such as “physical adversarial attack” OR “physical adversarial example” OR “adversarial patch” OR “adversarial camouflage”. We then applied the AND operator to combine different conceptual groups and refine the search results, for example, (“physical adversarial attack” OR “physical adversarial example”) AND (“object detection” OR “vehicle detection”).
This systematic categorization and organization help identify strengths and weaknesses, recurring trends, and research gaps in physical adversarial attacks against object detection, thereby achieving a comprehensive understanding of existing work. Subsequently, the top 200 articles were read individually to analyze their scope and contributions.
Although the domain of physical adversarial attacks has received considerable attention in computer vision surveys, the aspect of systematically categorizing and evaluating their physical deployment methods for object detection has not received commensurate focus. To address this gap, this review systematically applies and evaluates the taxonomy of physical deployment methods, transferability, and perceptibility to organize and assess the progress in generating 2D adversarial patterns and 3D adversarial camouflage for object detection.
Wei et al. [19] focused on physical adversarial attacks, organizing the current physical attacks from attack tasks, attack forms, and attack methods. Wang et al. [20] mainly reviewed physical adversarial attacks in image recognition and object detection tasks, covering some physical adversarial attacks in object tracking and semantic segmentation. Guesmi et al. [21] explored physical adversarial attacks for various camera-based computer vision tasks. Wei et al. [22] introduced the adversarial medium to summarize physical adversarial attacks in computer vision and proposed six evaluation metrics. Nguyen et al. [23] specifically studied physical adversarial attacks for four surveillance tasks. Additionally, Zhang et al. [24] conducted a comparative analysis of datasets and victim models in digital adversarial attacks. A comparative analysis of existing physical adversarial attack review literature is shown in Table 1. Our work systematically reviews the development of physical adversarial attacks in object detection, from 2D patches to 3D adversarial camouflages, and assesses key properties such as transferability and perceptibility, filling important gaps in current research.
Table 1. A comparative analysis of several existing surveys about physical adversarial attacks.
Survey | 3D Adversarial Camouflage / Physical Deployment / Transferability / Perceptibility | Year
Wei et al. [19] | × × × × | 2023
Wang et al. [20] | × × × × | 2023
Guesmi et al. [21] | × × | 2023
Wei et al. [22] | × × | 2024
Nguyen et al. [23] | × × × × | 2024
Zhang et al. [24] | partial × | 2024
Ours | ✓ ✓ ✓ ✓ | 2025

1.2. Our Contributions

Under the various physical constraints of the real world, how can we establish a systematic framework to categorize, evaluate, and understand the physical deployment of adversarial patterns (both 2D and 3D) against object detection systems, and what are the key trade-offs and relationships among transferability, perceptibility, and the physical deployment manner of such attacks? Our work aims to fill this gap by introducing a novel taxonomy of physical adversarial attacks from 2D adversarial patterns to 3D adversarial camouflage, and we propose a graded evaluation framework for transferability and perceptibility of adversarial attacks. This review offers the following contributions:
  • This review identifies, explores, and evaluates physical adversarial attacks targeting object detection systems, including pedestrian and vehicle detection.
  • We propose a novel taxonomy for physical adversarial attacks against object detection, categorizing them into three classes based on the generation and deployment of adversarial patterns: manipulating 2D physical objects, injecting adversarial signals, and placing 3D adversarial camouflage. These categories are systematically compared on nine key attributes.
  • The transferability of adversarial attacks and the perceptibility of adversarial examples are further analyzed and evaluated. Transferability is classified into two levels: cross-model and cross-modal. Perceptibility is divided into four levels: visible, natural, stealthy, and imperceptible.
  • We discuss the challenges of adversarial attacks against object detection models, propose potential improvements, and suggest future research directions. Our goal is to inspire further work that emphasizes security threats affecting the entire object detection pipeline.
This review is organized as follows: Section 1 introduces the scope of the reviewed literature and the academic databases used for the literature search. Section 2 presents a number of preliminaries. Section 3 delineates the taxonomy of physical adversarial attacks against object detection (Section 3.1), the deployment of physical adversarial patterns (Section 3.2), the transferability (Section 3.3) of adversarial attacks, and the perceptibility of adversarial examples (Section 3.4). Section 4, Section 5, and Section 6 introduce three types of attack in turn: manipulating 2D physical objects, injecting adversarial signals, and placing 3D adversarial camouflage. Section 7 offers detailed discussion of the trade-offs between transferability, perceptibility, and deployment manners, along with their limitations and future trends. Finally, Section 8 presents the conclusion. The entire structure of this article is shown in Figure 1.

2. Preliminaries

We elucidate the threat model from four dimensions: the adversarial attack environment, the adversary’s intention, the adversary’s knowledge and capabilities, and the victim tasks and model architecture. We first formally define adversarial examples and then introduce each component of the threat model.

2.1. Adversarial Examples

Adversarial examples refer to carefully crafted modifications or perturbations applied to input data with the intention of misleading or causing errors in DNN-based models. Adversarial perturbations are often designed to be imperceptible to human observers and can cause significant errors in the decision-making process of the model.
In computer vision, given a DNN-based model f: X → Y characterized by a parameter set θ, we formulate the model as follows:
ŷ = f(θ, x)
where, for any input x ∈ X, the well-trained model f(·) predicts a value ŷ that closely approximates the corresponding ground truth y ∈ Y.
The addition of small, imperceptible, and carefully crafted perturbations or noise to an image can cause the model f(·) to produce an incorrect prediction y′. Note that the attacker’s modification is limited to the input data. The input example x, after the addition of a minimal perturbation δ learned by an adversarial algorithm, is denoted as the adversarial example x′ = x + δ, which can be indistinguishable from the original input x while leading to y′ ≠ y. Mathematically, this is expressed as follows:
y′ = f(θ, x + δ)
Implementing an adversarial attack first requires generating an appropriate adversarial example x′.
Generating an adversarial example x′ is the cornerstone of the adversarial attack, and its quality directly impacts the attack’s effectiveness and success rate [25]. To ensure that x′ effectively induces incorrect behavior in the target model while remaining invisible to human observers, adversaries must take into account the environment, their intention, their knowledge and capabilities, and the victim model.
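To make the formulation above concrete, the following is a minimal sketch of a gradient-based perturbation in the style of FGSM, assuming a generic pre-trained PyTorch classifier; the function and variable names are illustrative and not tied to any specific method reviewed here.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=8 / 255):
    """Generate x' = x + delta with an l_inf budget of epsilon."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)     # loss w.r.t. the true label y
    loss.backward()
    delta = epsilon * x.grad.sign()         # single step that increases the loss
    x_adv = (x + delta).clamp(0.0, 1.0)     # keep a valid image in [0, 1]
    return x_adv.detach()
```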

2.2. Adversarial Attack Environment

Based on the deployment scenarios of the adversarial examples, existing adversarial attacks can be categorized as digital adversarial attacks and physical adversarial attacks.

2.2.1. Digital Adversarial Attack

In digital adversarial attacks, adversaries can directly manipulate digital images [26,27]. By making precise pixel modifications to digital images to add adversarial perturbations, digital adversarial examples are generated to serve as direct input to the model, misleading the target model to make incorrect predictions. Researchers have proposed numerous methods for generating adversarial perturbations or adversarial patterns in digital adversarial attacks [28,29].
In the digital domain, adversaries can directly manipulate pixel-level input images to implant carefully crafted perturbations; hence, digital adversarial attacks can exhibit a high success rate [30,31,32]. However, due to the impact of dynamic physical conditions (such as varying shooting angles and distances) and optical imaging [10], it is challenging to perfectly transfer digital adversarial perturbations to the physical environment.

2.2.2. Physical Adversarial Attack

Physical adversarial attacks involve manipulating objects in the physical world [21]. Adversaries can place or attach adversarial patterns to targets in the physical environment, causing image perception devices (e.g., cameras) to capture adversarial images.
Figure 2 shows a schematic diagram of physical adversarial attacks: light reflected from the physical object first enters the image perception device, which converts the light signals into electrical signals; these are then processed into a digital image and fed into the victim model. The schematic can be conceptualized as a three-layer structure, from bottom to top. The first layer represents the adversary’s background knowledge, encompassing the adversary’s attack intentions, knowledge, and capabilities, as well as potential attack methodologies. The second layer involves the generation of adversarial patterns, which are categorized into physical and digital adversarial attacks.
Following the generation process of digital adversarial patterns, the digital attack consists of six key steps: digital object manipulation, digital image processing, attack scenario construction, attack simulation, effectiveness assessment, and selection of optimization methods. The objective is to generate optimized digital adversarial patterns.
Adversaries may employ substitute models or generative model-based attack methods (Section 4.2), leveraging background knowledge about the target model to select appropriate substitute or generative models, simulate attack scenarios, and then use the generated adversarial patterns to conduct simulated attacks. They can quantify the attack effectiveness and further refine the adversarial patterns. The ultimate goal is to produce adversarial patterns that can be transferred to unknown target models or systems in black-box settings.
The left side of the second layer illustrates the deployment of physical adversarial patterns, derived from their digital counterparts, which can be deployed in the physical world by manipulating 2D physical objects (Section 4), injecting adversarial signals (Section 5), and placing 3D adversarial camouflage (Section 6). These patterns, when embedded in a physical medium and combined with the target, form physical adversarial examples.
The third layer pertains to the generation and transformation of adversarial examples. The physical adversarial examples generated in the second layer must be converted into digital adversarial examples through image acquisition, which are then input into the object detection model to execute the attack.
A substantial amount of research has been conducted exploring adversarial attacks against various DNN models or tasks, resulting in the creation of various forms of physical adversarial patterns, such as adversarial patch [33], adversarial sticker [34], adversarial camouflage [35], adversarial light projection [36], and adversarial perturbations against cameras using ultrasonic waves or laser signals [16,37].
In complex real-world environments, small perturbations may go unnoticed or be challenging to capture accurately by the camera. In addition, physical adversarial attacks also involve issues concerning the physical deployment of adversarial perturbations. Physical adversarial patterns must be transformed into deployable physical forms through printing, attaching materials, and light projection, then captured by image perception devices to mislead object detection models. Adversarial attacks in the physical environment present substantial threats that raise serious concerns, particularly in safety-critical applications such as autonomous driving and video surveillance. Given that object detection is a core component of these systems, this work focuses on the security threats of physical adversarial examples targeting object detection.

2.3. Adversary’s Intention

Adversarial examples aim to maliciously mislead the model’s output with the objective of undermining the model’s integrity [8]. Based on incorrect output of the victim model, the adversary’s intention can be categorized as an untargeted attack or a targeted attack.
Untargeted attacks are designed to mislead the target model into producing any incorrect prediction, applicable in scenarios where the adversary only needs the model to make an error without concern for the specific category of error. In object detection tasks, untargeted attacks can cause the target to be hidden (Hiding Attack) [4], or they can cause the model to misidentify objects as something else entirely (Appearing Attack) [38].
Targeted attacks, on the other hand, aim to induce the target model to output a specific incorrect prediction for a given example; these are used in scenarios requiring precise control over the model’s erroneous behavior. A targeted attack on an object detection model can cause the adversarial example to be recognized as a specific object [39].
Targeted and untargeted attacks can be easily interchanged by modifying the objective function [24]. Typically, the attack success rate of targeted attacks is lower than that of untargeted attacks [40,41].
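As a minimal illustration of how the objective function determines the attack type, the following sketch contrasts the two intentions for a generic classifier; `model`, `x_adv`, `y_true`, and `y_target` are assumed placeholders rather than components of any cited attack.

```python
import torch.nn.functional as F

def untargeted_objective(model, x_adv, y_true):
    # push the prediction away from the true label:
    # maximize the loss, i.e., minimize its negative
    return -F.cross_entropy(model(x_adv), y_true)

def targeted_objective(model, x_adv, y_target):
    # pull the prediction toward an attacker-chosen label: minimize the loss
    return F.cross_entropy(model(x_adv), y_target)
```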

2.4. Adversary’s Knowledge and Capabilities

Based on the level of access to information about the target model and training data, adversaries’ knowledge and capabilities can be categorized into three attack scenarios, ranging from weak to strong: black-box, gray-box, and white-box.
(1)
White-box
In a white-box scenario, it is assumed that the adversary has full access to the target model, including all parameters, gradients, architectural information, and even substitute data used for training the target model. Therefore, the adversary can meticulously craft adversarial examples by acquiring complete knowledge of the target model [42].
(2)
Black-box
In a black-box scenario, the adversary has no knowledge of the target model’s parameters, architecture, or training data, and it can only obtain the model’s outputs, such as confidence scores and predicted class labels, by submitting query samples to the target model [43,44]. Additionally, the adversary may be subject to query limitations and may be unaware of the state-of-the-art (SOTA) defense mechanisms deployed in the target model [45].
(3)
Gray-box
In a gray-box scenario, the adversary’s knowledge and capabilities fall between those of white-box and black-box settings. It is assumed that the adversary has partial knowledge of the target model, such as its architecture, but lacks access to the model’s parameters and gradient information [46].
In real-world physical environments, it is difficult for adversaries to gain knowledge of the target model, but they can utilize substitute models from white-box scenarios to produce optimized, transferable adversarial examples for deployment in black-box physical environments.

2.5. Victim Model Architecture

In the domain of object detection, neural network models are primarily categorized into two-stage detectors and one-stage detectors. According to the architecture of the model, they can be further classified into CNN-based models and Transformer-based models [47].
(1)
CNN-based model
To date, DNN models based on convolutional neural network (CNN) architectures have been the most widely applied in object detection tasks [2,48]. A wealth of research has been conducted on adversarial attacks against CNN-based model architectures [2,47,49,50,51], covering both digital and physical adversarial attacks across a range of tasks.
(2)
Transformer-based model
In recent years, vision models based on Transformer architectures have also been widely applied across multiple vision tasks [52], and adversarial attacks against Transformer-based models have likewise garnered attention [53,54,55,56,57]. Almost all of the tasks implemented with the CNN-based models introduced earlier can also be implemented with Transformer models, and they face the same adversarial attack threats under the Transformer architecture.
(3)
Real-world systems
Previous studies have simulated and tested adversarial attacks against tasks such as object detection, object tracking, and traffic sign recognition in open-source autonomous driving platforms or real vehicle autonomous driving systems, such as Apollo [58,59,60], Autoware [3,58,61], Autopilot [62], openpilot [61], DataSpeed [63], Tesla Model X and Mobileye 630 [64], and Tesla Model 3 [65].

3. Physical Adversarial Attacks Against Object Detection

3.1. Taxonomy of Physical Adversarial Attacks

In object detection tasks, physical adversarial examples are created by adding specific physical adversarial patterns to real-world objects, causing object detection models to produce errors during detection. These examples are converted into digital adversarial examples through image acquisition. Alternatively, digital adversarial patterns can be applied directly to digital images to obtain digital adversarial examples. We focus on physical adversarial examples.
The role of adversarial patterns in generating physical adversarial examples can be categorized into three main scenarios: (1) the adversarial pattern is applied to the target object, such as traffic signs, pedestrians, or vehicles on the road in object detection tasks; (2) the adversarial pattern is introduced into the signal processing, such as altering various lighting conditions in the physical environment; (3) the adversarial pattern is placed on the image perception device, such as the camera lens and the sensors.
Based on where physical adversarial patterns are deployed, physical adversarial attacks against object detection could include the following types:
(1) Manipulating 2D Physical Objects; (2) Injecting Adversarial Signals; (3) Placing 3D Adversarial Camouflage.
Adversaries employ diverse physical adversarial patterns and media via physical deployment to generate physical adversarial examples against object detection models. Figure 3 presents a taxonomy that organizes physical adversarial attacks on object detection into three categories.

3.2. Deployment of Physical Adversarial Patterns

Based on the deployment of physical adversarial patterns in real environments, we categorize the deployment manners of 2D and 3D physical adversarial patterns into two types, printed deployment and injected signal, as shown in Table 2.
(1)
Printed deployment involves printing adversarial patterns or examples in 2D or 3D and placing them on or around the target surface. This can be further classified into four categories: printed and attached, printed and photographed, printed on non-rigid surface and attached, and 3D-printed and placed, which are primarily used for the deployment of adversarial patches, stickers, and camouflage for physical object manipulation.
(2)
The injected signal involves introducing an external signal source (light or sound signals) to project adversarial patterns onto the target surface or image perception devices. This can be categorized into two types: injected light signal and injected acoustic signal.
Additionally, the simulated scene deployment of adversarial patterns is utilized in autonomous driving and object detection (vehicle detection) simulated scenarios to verify the effectiveness of adversarial patterns [35,53,66,67,68,69,70].
Table 2. List of physical adversarial pattern deployments and corresponding attacks.
Type | Physical Deployment Manner | Representative Work
Printed Deployment | Printed and Attached | [11,15,38,57,62,71,72,73,74]
Printed Deployment | Printed and Photographed | [75,76]
Printed Deployment | Printed on NRS | [49,77,78,79,80,81,82,83,84,85,86]
Printed Deployment | 3D-Printed and Placed | [39,53,55,69,87]
Printed Deployment | Other-Printed and Deployed | [16,67,68,70]
Injected Signal | Light Signal | [16,63,64,65,88,89,90,91]
Injected Signal | Acoustic Signal | [42,59,92]

3.3. Transferability of Adversarial Attack

The transferability of adversarial attacks is evaluated based on the ability of adversarial patterns, which are optimized and generated on source datasets and models, to attack new datasets and models. We consider the transferability of adversarial attacks as the ability of these patterns to transfer across datasets, models, or tasks.
According to the adversary’s intent, the transferability of adversarial attacks is related to the victim model and data modality. Thus, the transferability of adversarial patterns can also be measured from two aspects: victim model and data modality. We categorize the transferability of physical adversarial patterns into two levels from weak to strong: cross-model and cross-modal.

3.3.1. Cross-Model Transferability

The cross-model transferability (CMT) of adversarial patterns refers to the ability of adversarial patterns to transfer across multiple models and tasks. Currently, DNN model architectures in computer vision can be divided into three categories: CNN-based architectures, Transformer-based architectures, and other architectures. For example, models such as VGG, ResNet, DenseNet, GoogleNet, Xception, Inception, EfficientNet, YOLO, Faster R-CNN, and SSD are CNN-based architectures [47,48]. Models like the ViT family, DeiT family, DETR, Twin DETR, Deformable DETR, and Swin DETR are Transformer-based architectures [93,94]. Based on the victim model’s backbone architecture, we categorize the cross-model transferability of adversarial patterns into two types: cross same-bone model transferability and cross diverse-bone model transferability.
Cross same-bone model transferability (CSMT) refers to the scenario where the victim models share the same backbone architecture, and the types of datasets are the same or diverse (one or multiple tasks). Adversarial patterns can be transferred across different models of the same-bone architecture or between data samples within different types of datasets. For example, adversarial patterns can transfer from image classification tasks (ImageNet, WideResNet50) to face recognition (PubFig, VGG-Face2 model), scene classification (CIFAR-10, 6 Conv + 2 Dense CNN model), and traffic sign recognition (GTSRB, 7 Conv + 2 Dense CNN model) tasks against the CNN-based architectures of victim models [46].
Cross diverse-bone model transferability (CDMT) refers to the scenario where the victim models have different backbone architectures and the types of datasets are the same or diverse (one or multiple tasks). Adversarial patterns can be transferred between models of diverse-bone architectures or between data samples within datasets of different types. For example, adversarial patterns can be transferred between CNN-based models and Transformer-based models [55].
Cross-task transferability of adversarial patterns [95] can be considered a special case of cross-model transferability of adversarial patterns targeting diverse dataset types. It involves transferring from image classification to object detection tasks [16,57]. Prior research indicates that the CSMT of adversarial patterns is stronger than CDMT [69], and it is more difficult to transfer from CNN-based models to Transformer-based models [96].
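As a rough sketch of how cross-model transferability is typically quantified, the snippet below evaluates a set of adversarial images (optimized on a source or surrogate model) against unseen detectors and reports how often the attacked class is no longer found. The torchvision detectors and the COCO "person" label are assumptions chosen for illustration, not models used by the cited works.

```python
import torch
import torchvision

# unseen target detectors used only for evaluation (illustrative choice)
detectors = {
    "faster_rcnn": torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval(),
    "retinanet": torchvision.models.detection.retinanet_resnet50_fpn(weights="DEFAULT").eval(),
}

@torch.no_grad()
def attack_success_rate(detector, adv_images, target_label=1, conf_thresh=0.5):
    """Fraction of adversarial images in which the target class is no longer detected."""
    hidden = 0
    for img in adv_images:                      # img: float tensor in [0, 1], shape (3, H, W)
        preds = detector([img])[0]
        found = ((preds["labels"] == target_label) &
                 (preds["scores"] > conf_thresh)).any()
        hidden += int(not found)
    return hidden / len(adv_images)

# asr = {name: attack_success_rate(det, adv_images) for name, det in detectors.items()}
```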

3.3.2. Cross-Modal Transferability

The cross-modal transferability (CM) of adversarial patterns refers to the ability of adversarial patterns to transfer across different data modalities and multimodal tasks. Currently, the CM of adversarial patterns is an emerging research focus [55,97]. Based on the modalities of the datasets, we categorize CM into two types: cross diverse data-modal transferability and cross multimodal-task transferability.
Cross diverse data-modal transferability (CDDM) refers to the scenario where the victim models have the same or different backbone architectures and the victim datasets are of diverse types, while each is single-modal. Adversarial patterns can take effect on two or more single-modal datasets simultaneously, causing the models to produce incorrect outputs across different modalities. For example, adversarial 3D-printed objects can simultaneously attack LiDAR and camera perception sources in autonomous driving [58], and cropped warming and cooling pastes can simultaneously attack object detection in the visible-light and infrared modalities [98,99].
Cross multimodal-task transferability (CMTT) refers to the scenario in which the victim models have the same or different backbone architectures and the victim datasets are multimodal (covering one or more multimodal tasks). Adversarial patterns are transferable across multiple multimodal tasks or between different modalities within a multimodal dataset. For example, Zhang and Jha [100] proposed adversarial illusions, which perturb an input of one modality (e.g., an image or a sound) to make its embedding close to another target modality (e.g., (image, text) or (audio, text)), thereby misleading image generation, text generation, zero-shot classification, and audio retrieval and captioning.

3.4. Perceptibility of Adversarial Examples

To provide a qualitative measure of perceptibility, we integrate the characteristics of physical adversarial attacks and decompose perceptibility into three complementary perspectives: the human observer, the image perception device, and the digital representation. This yields a four-level scale ranging from most to least perceptible: visible, natural, stealthy, and imperceptible.
(1)
Visible. Physical adversarial patterns are visible to the human observer, capturable by image perception devices, and clearly distinguishable from the target in the digital representation. Adversarial patterns in this category, such as patches or stickers, have noticeable visual differences from normal examples and are easily detected by human observers. We use the term visible to denote this category.
(2)
Natural. Physical adversarial patterns are visible to the human observer and capturable by image perception devices, but hard to distinguish from the target in the digital representation. Natural adversarial examples emphasize whether the adversarial patterns appear as naturally occurring objects. For example, adversarial patterns that use media such as natural lighting, projectors, shadows, and camouflage are classified as natural. This category emphasizes the natural appearance of adversarial patterns in relation to the target environment. We use the term natural to denote this category.
(3)
Stealthy. Physical adversarial patterns are hard to observe with the naked eye, can be captured by image perception devices as anomalies, and are distinguishable from the original images in the digital domain. Stealthy adversarial examples are visually imperceptible to human observers or similar to the original examples, but they can be physically perceived by the targeted device or system. Adversarial examples using media such as thermal insulation materials, infrared light, or lasers are classified as stealthy. This category emphasizes that such adversarial examples are not easily detected from the perspective of human observers.
(4)
Imperceptible. Physical adversarial patterns are invisible to the naked eye, hard to capture using image perception devices, and hard to distinguish from original images in the digital domain. Adversarial examples in this category include ultrasound-based attacks on camera sensors. The term imperceptible highlights that the differences between digital adversarial examples and their benign counterparts are challenging to measure.
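For the digital-representation perspective of the scale above, the difference between an adversarial example and its benign counterpart is often summarized with simple distance metrics. The sketch below is one possible way to compute such metrics and is illustrative only; it is not a measurement procedure proposed by the reviewed works.

```python
import torch

def perturbation_metrics(x, x_adv):
    """x, x_adv: float image tensors in [0, 1] with identical shape."""
    delta = x_adv - x
    return {
        "linf": delta.abs().max().item(),        # peak per-pixel change
        "l2": delta.norm(p=2).item(),            # overall energy of the change
        "psnr_db": (10 * torch.log10(1.0 / delta.pow(2).mean())).item(),
    }
```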

4. Manipulating 2D Physical Objects

Adversaries can introduce adversarial patterns by manipulating physical objects to cause object detection models to produce incorrect outputs. Typically, adversaries place the crafted adversarial patterns on the target surface or around the target. Depending on the physical deployment method of the adversarial patterns, a 2D physical object can be manipulated by placing printed adversarial patterns, such as patches, stickers, image patches, and cloth patches, thus generating physical adversarial examples. Furthermore, as shown in Table 3, we have organized a comprehensive list of adversarial methods for physical object manipulation, sorted by the category of adversarial pattern and chronological order.

4.1. 2D-Printed Adversarial Patterns

Placing adversarial patterns on target surfaces or around a target is a common and practical approach in physical adversarial attacks. To generate physically realized adversarial examples, adversaries strategically place printed adversarial patterns onto the surfaces of physical objects. When adversarial examples are captured by image perception devices, the target image carrying the adversarial pattern is fed into object detection models, thereby enabling the execution of the attack.
(1)
Patch and sticker
The printed adversarial patterns mainly include two types: patch and sticker. The digitally crafted adversarial patches or stickers are printed on paper (or printed on the physical surfaces of clothing or objects), then attached to an object’s surface or placed around it.
Adversarial patches typically confine perturbations to a small, localized area without imposing any intensity constraints. Their deployment usually only requires simple printing and pasting onto target surfaces, making this method both simple and commonly used for various attack tasks. The generation methods for adversarial stickers and patches are similar, and they perform equally in terms of attack effectiveness. Patches are usually regular in shape (e.g., circular), while stickers are irregular and can adapt to non-rigid surfaces (e.g., cars, traffic signs, and persons).
(2)
Image and cloth patch
An image patch is holistically printed or fabricated, wherein the adversarial examples carrying the adversarial pattern are entirely printed or produced, such as a cloth patch that is printed on non-rigid surfaces such as clothing. Such adversarial patterns are also essentially patches but differ in their generation methods and physical deployment.
Generally, both patches and stickers have distinct visual features that differ from the target surface and are visible to human observers. Adversaries can enhance the stealthiness of such attacks by making sticker patterns resemble natural textures or images, by reducing patch size, by applying ℓ2-norm, ℓ∞-norm, or ℓp-norm constraints, by adjusting transparency, or by dispersing the deployment [33,101,102], as sketched below. However, these measures might compromise the effectiveness and robustness of adversarial attacks.
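The following is a minimal sketch of two of the stealthiness measures just mentioned, combining an ℓ∞-style constraint toward a natural base texture with alpha-blending to adjust transparency; all tensor shapes, names, and default values are illustrative assumptions rather than settings from the cited works.

```python
import torch

def apply_stealthy_patch(image, patch, base_texture, x0, y0, alpha=0.7, epsilon=16 / 255):
    """image: (3,H,W); patch/base_texture: (3,h,w); (x0, y0): top-left corner."""
    # constrain the patch to stay close to a natural-looking base texture
    delta = (patch - base_texture).clamp(-epsilon, epsilon)
    stealthy_patch = (base_texture + delta).clamp(0.0, 1.0)

    adv = image.clone()
    h, w = stealthy_patch.shape[1:]
    region = adv[:, y0:y0 + h, x0:x0 + w]
    # alpha-blend: lower alpha -> more transparent, less perceptible patch
    adv[:, y0:y0 + h, x0:x0 + w] = alpha * stealthy_patch + (1 - alpha) * region
    return adv
```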
Table 3. List of attributes of manipulating printed 2D adversarial patterns (patches and stickers). They are arranged according to adversarial pattern and chronological order.
AP | Method | Akc | VMa | VT | AInt | Venue
Patch | extended-RP2 [103] | WB | CNN | OD | UT | 2018 WOOT
Patch | Nested-AE [38] | WB | CNN | OD | UT | 2019 CCS
Patch | OBJ-CLS [72] | WB | CNN | PD | UT | 2019 CVPRW
Patch | AP-PA [15] | WB, BB | CNN, TF | AD | UT | 2022 TGRS
Patch | T-SEA [73] | WB, BB | CNN | OD, PD | UT, TA | 2023 CVPR
Patch | SysAdv [11] | WB | CNN | OD, PD | UT | 2023 ICCV
Patch | CBA [74] | WB, BB | CNN, TF | OD, AD | UT | 2023 TGRS
Patch | PatchGen [57] | BB | CNN, TF | OD, AD, IC | UT | 2025 TIFS
Cloth-P | NAP [77] | WB | CNN | OD, PD | UT | 2021 ICCV
Cloth-P | LAP [78] | WB | CNN | OD, PD | UT | 2021 MM
Cloth-P | DAP [79] | WB | CNN | OD, PD | UT | 2024 CVPR
Cloth-P | DOEPatch [80] | WB, BB | CNN | OD | UT | 2024 TIFS
Sticker | TLD [62] | BB | RS | LD | UT | 2021 USENIX
Image-P | ShapeShifter [75] | WB | CNN | TSD | UT, TA | 2019 ECML PKDD
Image-P | LPAttack [76] | WB | CNN, RS | OD | UT | 2020 AAAI

4.2. Generation and Optimization of 2D Adversarial Patterns

4.2.1. Generation and Optimization Methods

Based on the generation and optimization methods of adversarial patterns, physical adversarial attacks can be categorized into gradient-based methods, substitute (surrogate) model methods, black-box query methods, and generative model methods.
Gradient-based attack methods utilize the gradient information of the target model to guide the generation of adversarial examples [38]. By computing the gradient of the loss function with respect to the input data, these methods identify which pixel changes in the input data can maximize the model’s loss, making it one of the earliest and most widely studied methods of adversarial attacks.
Substitute model attack methods utilize known models (substitute models or ensemble models) and datasets to generate adversarial examples that are then transferred to attack an unknown target model. Several findings have been made to enhance the transferability of adversarial examples [104].
Black-box query attacks query the target model repeatedly and can be further categorized into score-based attacks [105] and decision-based attacks [41] according to the model feedback. Adversaries obtain only the predicted labels or confidence scores of the victim model, within a limited query budget and under different perturbation constraints, to perform adversarial attacks. The adversary’s focus lies in reducing the number of queries by optimizing the size of the adversarial search space and refining the query direction and gradient within that space.
Generative model attack methods utilize generative models (such as generative adversarial networks, diffusion models, etc.) to learn the data distribution and perturb the input or latent space to create adversarial examples [32].

4.2.2. Gradient-Based Adversarial Patch

The adversarial patch, printed and affixed to the target surface, represents one of the first adversarial patterns used for physical adversarial attacks [106]. Adv-Examples [107] constructed the first adversarial examples in the physical world: adversaries print the generated adversarial examples on paper, use a camera to capture the printed images, and feed the resulting adversarial examples into the subsequent image model.
For example, Extended-RP2 [103] improves the gradient-based RP2 [71] attack to achieve both a disappearance attack, in which physical objects are ignored by detectors, and a creation attack, in which innocuous physical stickers fool a model into detecting nonexistent objects. Nested-AE [38] generates two adversarial examples for long and short distances, combined in a nested fashion to deceive multi-scale object detectors. OBJ-CLS [72] generates adversarial patches to hide a person from a person detector; the patches, printed on small cardboard plates, were held by individuals in front of their bodies and oriented towards surveillance cameras. SysAdv [11] crafts adversarial patches by considering the system-level nature of the autonomous driving system.
For image patches, LPAttack [76] attacks license plate detection by manufacturing real metal license plates with carefully crafted perturbations against object detection models, yet the approach only targeted North American customized vehicle license plates, and perturbations to other license plates with fixed patterns are easily detected by the human observer.
For cloth patches, LAP [78] generates adversarial patches from cartoon images to evade object detection models via the two-stage patch training process. The patches look natural and are hard to distinguish from general T-shirts. DAP [79] generates dynamic adversarial patches against person detection by accounting for non-rigid deformations caused by changes in a person’s pose.
The generation of gradient-based adversarial patches typically occurs under a white-box scenario, where adversaries employ a physical deployment method involving printing and affixing to the target surface. The cloth patch is printed on fabric, while the image patch is printed in its entirety and placed in the real world. These patches are visible to observers, and they exhibit primarily cross same-bone model transferability [11,38,79,80].
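A minimal sketch of the gradient-based patch optimization shared by the attacks above is given below, framed as a hiding attack that suppresses the detector's confidence for the attacked object. The helpers `detection_loss` and `paste_patch` are hypothetical placeholders for a model-specific loss term and a patch application function, not components of any cited method.

```python
import torch

def optimize_patch(detection_loss, paste_patch, images, patch, steps=200, lr=0.01):
    """detection_loss(adv_image) -> scalar to minimize (e.g., max target-class confidence);
    paste_patch(image, patch) -> image with the patch applied at the chosen location."""
    patch = patch.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        # accumulate the detection loss over a small batch of training images
        loss = sum(detection_loss(paste_patch(img, patch)) for img in images)
        opt.zero_grad()
        loss.backward()
        opt.step()
        patch.data.clamp_(0.0, 1.0)   # keep valid, printable pixel values
    return patch.detach()
```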

4.2.3. Surrogate Models for Generating Adversarial Patches

Existing research on generating adversarial patches based on substitute models generally employs CNN-based or Transformer-based architectures as substitute models [57,74,80]. Adversaries, with knowledge of the substitute models, design multiple loss terms to optimize the loss function of the DNN model, such as total variation (TV) loss, non-printability score (NPS) loss, and adversarial detection loss for object detection tasks, and they use Expectation over Transformation (EOT) to compensate for the digital-to-physical transformation. They also jointly optimize the size, position, shape, and transparency of the patches to maintain attack performance in the physical world.
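The sketch below illustrates how these loss terms are commonly composed, under simplifying assumptions: EOT is approximated by random scale and brightness jitter, TV loss penalizes high-frequency patterns, and NPS loss measures the distance of each patch pixel to a small palette of printable colors. It does not reproduce the exact losses of any cited paper.

```python
import torch

def tv_loss(patch):                                    # total variation: smoother = easier to print
    dh = (patch[:, 1:, :] - patch[:, :-1, :]).abs().mean()
    dw = (patch[:, :, 1:] - patch[:, :, :-1]).abs().mean()
    return dh + dw

def nps_loss(patch, printable_colors):                 # printable_colors: (K, 3) in [0, 1]
    pix = patch.permute(1, 2, 0).reshape(-1, 1, 3)     # (H*W, 1, 3)
    dists = (pix - printable_colors.reshape(1, -1, 3)).abs().sum(-1)
    return dists.min(dim=1).values.mean()              # distance to nearest printable color

def eot_transform(patch):                              # crude stand-in for EOT sampling
    scale = 1.0 + 0.1 * (torch.rand(1) - 0.5)
    brightness = 0.1 * (torch.rand(1) - 0.5)
    return (patch * scale + brightness).clamp(0.0, 1.0)

def total_loss(detection_loss, paste_patch, image, patch, colors, w_tv=0.1, w_nps=0.1):
    adv = paste_patch(image, eot_transform(patch))     # randomly transformed patch on the image
    return detection_loss(adv) + w_tv * tv_loss(patch) + w_nps * nps_loss(patch, colors)
```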
For example, AP-PA [15] generates an adversarial patch that can be printed and then placed on or outside aerial targets. The work of the same authors, CBA [74], continues AP-PA, misdirecting target detection by placing adversarial patches in the background region of the target (the airplane in the aerial image). T-SEA [73] proposes a transfer-based self-ensemble attack on object detection systems. PatchGen [57] uses YOLOv3 + Faster R-CNN, YOLOv3 + DETR, and Faster R-CNN + DETR as known white-box models, based on the ensemble model idea, to generate adversarial patches on the DOTA dataset and then transfer them to attack multiple target models (YOLOv3, YOLOv5, Faster R-CNN, DETR, and Deformable DETR) in aircraft detection tasks.
Additionally, clothing patches generated based on substitute models are used to attack object detection models. DOEPatch [80] uses an optimized ensemble model to dynamically adjust the weight parameters of the target models, and it provides an explanatory analysis of the adversarial patch attacks using energy-based analysis and Grad-CAM visualization.
Adversarial patches generated based on substitute models generally exhibit better transferability across same-bone models [80] but weaker transferability across diverse-bone models [57,74]. In practical scenarios, because adversaries seldom possess comprehensive knowledge of the target system or model, generating adversarial patterns based on substitute models for black-box attacks has proven to be an effective attack strategy.

4.2.4. Black-Box Query for Generating Adversarial Patches

In adversarial patch and adversarial sticker attacks based on black-box query, adversaries leverage the decision (classification) information or confidence scores from the target model, employ optimization methods such as differential evolution, genetic algorithms, particle swarm optimization, or Bayesian optimization to determine the search direction and magnitude of adversarial perturbations, and ultimately generate adversarial samples that effectively mislead the target model.
For example, TLD [62] conducts a comprehensive study on the security of lane detection modules in real vehicles. Through a meticulous reverse engineering process, the adversarial examples are generated by adding perturbations to the camera images based on physical coordinates that are obtained by accessing the lane detection system’s hardware. These perturbations are designed to be imperceptible to human vision while simultaneously causing the Tesla Autopilot lane detection module to generate a fake lane output. The natural appearance of stickers maintains stealthiness and reduces the need for complex transformations.
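A minimal sketch of the score-based black-box query loop described above is shown below, using simple random search in place of differential evolution or particle swarm optimization; the query interface `query_confidence` is an assumed placeholder for the victim model's feedback.

```python
import torch

def random_search_patch(query_confidence, patch, queries=1000, step=4 / 255):
    """query_confidence(patch) -> victim's confidence for the target class (lower is better)."""
    best_patch = patch.clone()
    best_score = query_confidence(best_patch)
    for _ in range(queries):
        # propose a small random modification and query the victim once
        candidate = (best_patch + step * torch.randn_like(best_patch)).clamp(0.0, 1.0)
        score = query_confidence(candidate)
        if score < best_score:                 # keep the candidate only if it helps
            best_patch, best_score = candidate, score
    return best_patch
```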

4.2.5. Generative Models for Generating Adversarial Patches

In addition to the aforementioned attack methods, existing research has also utilized various generative adversarial networks to generate transferable adversarial patches. In the context of adversarial attacks, GANs are trained on additional (clean) data, using a pre-trained image classifier as the discriminator for image classification tasks. Unlike traditional GAN training, the GAN only updates the generator while keeping the discriminator unchanged. The training objective of the generator is to cause the discriminator to misclassify the generated images (adversarial examples). For example, NAP [77] leverages the learned image manifold of pre-trained generative adversarial networks (BigGAN and StyleGAN2) to generate natural-looking adversarial clothing patches.
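The following sketch captures the training scheme described above: a frozen, pre-trained classifier serves as the fixed discriminator, and only the generator is updated so that its outputs (candidate adversarial patterns) are driven toward an attacker-chosen label. `generator` and `classifier` are assumed pre-trained modules and are not components of NAP [77] or any other cited method.

```python
import torch
import torch.nn.functional as F

def train_adversarial_generator(generator, classifier, target_label,
                                steps=500, lr=1e-4, z_dim=128, batch=16):
    classifier.eval()
    for p in classifier.parameters():
        p.requires_grad_(False)                        # discriminator/classifier stays fixed
    opt = torch.optim.Adam(generator.parameters(), lr=lr)
    for _ in range(steps):
        z = torch.randn(batch, z_dim)                  # sample latent codes
        fake = generator(z).clamp(0.0, 1.0)            # candidate adversarial patterns
        logits = classifier(fake)
        # update the generator so the frozen classifier misclassifies its outputs
        loss = F.cross_entropy(logits, torch.full((batch,), target_label, dtype=torch.long))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator
```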

4.3. Attributes of Manipulating 2D Physical Objects

We have organized a comprehensive list of adversarial attack methods targeting physical object manipulation, categorized by types of adversarial patterns and arranged chronologically. Each method is analyzed and characterized according to its key attributes. The attributes of methods employing 2D printed adversarial patterns are summarized in Table 3.
(1)
Adversarial Pattern (AP).
(2)
Adversary’s Knowledge and Capabilities (Akc).
(3)
Victim Model Architecture (VMa).
(4)
Victim Tasks (VT).
(5)
Adversary’s Intention (AInt).
Furthermore, for each adversarial attack targeting object detection tasks, we present the generation and optimization methods (Gen) of the corresponding adversarial patterns, along with their physical deployment manners. We analyze and summarize the transferability of each method across different datasets or models. Additionally, we evaluate the perceptibility of each adversarial pattern to both human vision and image perception devices using a graded perceptibility level. The specific details are presented in Table 4.
(1)
Generation and Optimization Methods (Gen).
(2)
Physical Deployment Manner (PDM).
(3)
Adversarial Attack’s Transferability.
(4)
Adversarial Example’s Perceptibility.

4.4. Limitations of 2D Physical Object Manipulation

Physical object manipulation is the most common way to deploy adversarial patterns, but it also has some issues:
(1)
Various printed-deployed adversarial patterns (such as patches and stickers) involve complex digital-to-physical environment transitions (such as non-printability score, expectation over transformation, total variation) to handle printing noise and cannot address noise from dynamic physical environments. From the perspective of generation methodology, these approaches primarily rely on gradient-based optimization (e.g., AP-PA [15], DAP [79]) and surrogate model methods (e.g., PatchGen [57]) to simulate physical conditions during training. However, such simulation remains inadequate for modeling complex environmental variables like lighting changes and viewing angles, leading to performance degradation in real-world scenarios. Furthermore, while these methods effectively constrain perturbations to localized areas, they lack adaptive optimization mechanisms for dynamic physical conditions.
(2)
Printed-deployed adversarial patterns (patches, stickers) visually have a significant difference from the background environment of the target, with strong perceptibility that can be easily discovered by the naked eye. This visual distinctiveness stems from their generation process, where gradient-based methods often produce high-frequency patterns that conflict with natural textures. Although generative models have been employed to create more natural-looking patterns (such as NAP [77]), they still struggle to achieve seamless environmental integration. The fundamental challenge stems from the inherent trade-off between attack success and visual stealth, as most optimization algorithms must balance the pursuit of high success rates against preserving the visual naturalness of the pattern.
(3)
Adversarial patterns that resemble natural objects tend to have weaker attack effects, possibly because the training of models focuses on similar natural objects that have been seen before, and the target model has gained better robustness during training. This phenomenon can be attributed to the model’s prior exposure to similar patterns during training. The semantic consistency of natural-object-like patterns may actually trigger the model’s built-in mechanisms, making them less effective compared to other patterns.
(4)
Distributed adversarial patch attacks increase the difficulty of defense detection, with weaker perceptibility, but are relatively challenging to deploy. Their generation typically employs gradient-based optimization across multiple spatially distributed locations, requiring careful coordination of perturbation distribution. While generative models offer potential for creating coherent distributed patterns, they face practical deployment challenges in maintaining precise spatial relationships. This type of attack is highly dependent on the precise spatial configuration of multiple components during deployment, making it susceptible to environmental factors and deployment methods. Furthermore, attackers must balance distribution density with attack effectiveness, which often requires specialized optimization strategies distinct from traditional patch generation methods.

5. Injecting Adversarial Signals

In general, artificial lights reflected from physical objects or ultrasound signals received by camera image sensors are the main physical signals that affect the image output. The adversary projects adversarial patterns to generate adversarial perturbations by manipulating external light sources or acoustic signals to implement physical-world adversarial attacks against object detection models.

5.1. Adversarial Light and Acoustic Patterns

The adversary uses adversarial light or acoustic signals as attack media and does not directly modify the appearance of the object’s surface. Instead, adversarial examples are generated by optimizing various parameters of the projected light or acoustic signals, such as position, shape, color, intensity, wavelength, distance, angle, and projection pattern. These adversarial perturbations generated using visible light or optical devices are common in physical environments and are often selectively ignored by people, thus providing a certain level of stealthiness.

5.1.1. Projecting Adversarial Patterns with Projectors

Various projectors are used to project adversarial patterns onto objects. Image-capturing devices capture the light of the adversarial pattern superimposed on the target object and transmit the example carrying the adversarial pattern to the DNN model. SLAP [88] generates short-lived adversarial perturbations using a projector to shine specific light patterns onto stop signs. Vanishing Attack [90] utilizes a drone with a portable projector to project an adversarial light pattern onto the rear windshield of a vehicle to obstruct the object detector (OD) in autonomous driving systems. In follow-up work by the same research team, OptiCloak [91] further optimizes the adversarial light pattern through random focal color filtering (RFCF) and gradient-free optimization via the ZO-AdaMM algorithm, enhancing the attack effectiveness of adversarial examples under long-distance and low-light conditions. DETSTORM [63] investigates physical latency attacks against camera-based perception pipelines in autonomous driving systems, using projector perturbations to create a large number of adversarial objects and cause perception delays in dynamic physical environments.
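Digitally, projector-based attacks of this kind are often simulated before deployment by compositing the projected pattern onto the captured image. The sketch below uses a simple additive-light approximation whose parameters (projection gain, ambient term, footprint mask) are assumptions for illustration rather than the optical model used by any cited attack.

```python
import torch

def project_pattern(image, pattern, mask, gain=0.6, ambient=0.05):
    """image, pattern: (3,H,W) in [0,1]; mask: (1,H,W) projection footprint in [0,1]."""
    # projected light adds to, rather than replaces, the surface reflectance
    lit = image + mask * (gain * pattern + ambient)
    return lit.clamp(0.0, 1.0)
```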

5.1.2. Adversarial Laser Signal

Some studies also use lasers or infrared lasers as the attack medium. For example, rolling shutter [89] further demonstrates the rolling shutter attack for object detection tasks by irradiating a blue laser onto seven different CMOS cameras, ranging from cheap IoT cameras to semi-professional surveillance cameras, to highlight the wide applicability of the attack. ICSL [65] generates adversarial examples using invisible infrared LEDs, which are deployed in various ways to attack autonomous vehicles’ cameras, effectively causing errors in their environment perception and SLAM processes.
Other works exploit the differing responses of human eyes and cameras to varying-frequency signals to attack the rolling shutter of autonomous vehicles’ cameras. L-HAWK [16] presents a novel physical adversarial patch activated by laser signals at long distances against mainstream object detectors and image classifiers. Under normal circumstances, L-HAWK remains harmless, but it can manipulate the driving decisions of a targeted AV via specific image distortion caused by laser signal injection towards the camera.
Additionally, GlitchHiker [108] investigates different adversarial light patterns in the camera image signal transmission phase, inducing controlled glitch images by injecting intentional electromagnetic interference (IEMI) into the image transmission lines. The effectiveness of the glitch injection is dependent on the strength of the coupled signal in the victim camera system.

5.1.3. Adversarial Acoustic Signals

Adversaries can inject resonant ultrasonic signals into the camera’s inertial sensors, causing the image stabilization system to overcompensate and induce image distortion, which in turn produces erroneous outputs. For example, the Poltergeist attack (PG attack) [59] injects acoustic signals into the camera’s inertial sensors (via frequency scanning to find the acoustic resonance frequency of the target sensors’ MEMS gyroscopes) to interfere with the camera’s image stabilization system, which results in blurring of the output image even when the camera is stable. Cheng et al. [92] used the same approach as the PG attack and additionally validated it on two lane detectors. TPatch [42] builds on the work of ref. [59] by designing a physical adversarial patch triggered by acoustic signals targeting the camera’s inertial sensor. TPatch remains benign under normal circumstances, but when the selected acoustic signal is injected into the camera’s inertial sensor as a trigger, the resulting image-blurring distortion turns the patch into an adversarial image that launches hiding, creating, or altering attacks.

5.2. Attributes of Injecting Adversarial Signal

In physical adversarial attacks that inject adversarial light and acoustic signals to attack object detection, signal sources primarily include projectors, laser signals, and ultrasonic signals. Projectors are used to project adversarial patterns onto the surface of target objects, while laser signals are emitted to illuminate the sensors of image perception devices (such as the camera’s rolling shutter or image processing pipeline). Ultrasonic signals are mainly employed to interfere with the inertial sensors of cameras. In Table 5, we summarize the adversarial light and acoustic signals, along with the attribute list of the corresponding attack methods.
Furthermore, for each adversarial attack utilizing adversarial light and acoustic signals, we also summarize the generation and optimization methods of the adversarial patterns, categorizing them into three types of signal sources while briefly describing the physical deployment approaches. We analyze and summarize the transferability of each method across different datasets or models. Additionally, we evaluate the perceptibility of each adversarial pattern to both human vision and perception devices. The specific details are presented in Table 6.

5.3. Limitations of Adversarial Signals

For adversarial patterns involving artificial or natural light sources, the lack of training images of targets captured under such specific optical conditions may be a primary reason for the weak robustness of DNN models against these attacks.
(1)
Injecting adversarial signals requires precise devices to reproduce specific optical or acoustic conditions so that digital attacks translate effectively into the physical world. The signal sources of such attacks can be categorized into energy radiation types, such as the blue lasers in the rolling shutter attack [89] and the infrared LEDs employed in ICSL [65], and field interference types, such as the intentional electromagnetic interference (IEMI) in GlitchHiker [108] and the acoustic signals in PG attacks [59]. Their physical deployment is directly linked to the stealth and feasibility of the attacks. For instance, lasers require precise calibration of the incident angle, acoustic signals must match the sensor’s resonant frequency, and electromagnetic interference relies on near-field coupling effects. Since such attacks do not rely on physical attachments, they offer greater deployment flexibility than patch-based attacks; however, their effectiveness is highly sensitive to even minor environmental deviations.
(2)
Projector-based adversarial patterns share mechanisms with adversarial patches by introducing additional patterns captured by cameras. However, projected attacks (e.g., Phantom Attack [64], Vanishing Attack [90]) face unique deployment challenges, including calibration for surface geometry and compensation for ambient light. A key reason for DNN models’ vulnerability to such attacks is the lack of training data representing targets under specific optical conditions, such as dynamic projections or strong ambient light. In contrast, laser injection methods like L-HAWK [16] bypass some environmental interference by directly targeting sensors but face deployment complexities related to synchronization triggering and optical calibration.

6. Placing 3D Adversarial Camouflage

6.1. Adversarial Camouflage

Early adversarial camouflage methods targeted 2D rigid or non-rigid surfaces, generating adversarial textures that were printed and attached to the target surface [14,49,81,82] to attack the target model. In contrast, 3D adversarial camouflage attacks require placing the adversarial pattern across the entire surface of the target, which may be a non-rigid structure. Therefore, 3D adversarial camouflage attacks involve two key aspects: texture generation methods and rendering techniques applied to the target surface. The literature on placing 3D adversarial camouflage is listed in Table 7.
There are two existing methods for generating 3D adversarial camouflage using neural renderers: one optimizes 2D texture patterns, mapping them repeatedly onto the target via texture overlay, known as the world-aligned method [53,66,69]. The other optimizes 3D textures that cover the entire surface of the target in the form of UV mapping, known as the UV map method [35,67,68,70,84]. The world-aligned method cannot ensure that the texture is mapped onto the target in the same way during both generation and evaluation, leading to discrepancies in the adversarial camouflage effects. The UV map methods are unable to render complex environmental features such as lighting and weather, which can diminish the effectiveness of adversarial camouflage attacks in physical deployment.

6.2. Generating 3D Adversarial Camouflage

Based on whether neural rendering tools are used in the adversarial camouflage generation process, we categorize 3D adversarial camouflage into two types: adversarial camouflage with 3D-rendering, and adversarial camouflage without 3D-rendering.
Adversarial camouflage with 3D-rendering employs neural renderers to generate adversarial textures or patterns. Among these, the 3D neural renderers include two categories: differentiable neural renderers (DNRs) and non-differentiable neural renderers (NNRs). PyTorch3D 0.7.6, Nvdiffrast 0.3.3, Mitsuba 2.0, and Mitsuba 3.0 are differentiable neural renderers, and MeshLab 1.3.2.0, Blender 4.5, Unity 1.6.0, and Unreal Engine 5.6 are non-differentiable neural renderers. The generation and optimization methods, the neural renderer, and the deployment manner of placing 3D adversarial camouflage are shown in Table 8.
3D differentiable rendering techniques address non-rigid deformations and the need for extensive labeled data, ensuring that adversarial information is preserved when mapped onto the 3D object. This enables practical deployment of the 3D adversarial camouflage in a sticker mode while maintaining the necessary adversarial properties and robustness.

6.2.1. Adversarial Camouflage Without Neural Renderers

Early adversarial camouflage methods generated adversarial samples by repeatedly pasting adversarial patterns onto target surfaces without using neural renderers, as demonstrated by [14,49,81,82,83].
For example, UPC [81] generates adversarial patterns that are optimized through a combination of semantic constraints and a joint-attack strategy targeting the region proposal network (RPN), classifier, and regressor of the detector. AdvCam [14] generates stealthy adversarial examples by combining neural style transfer with adversarial attack techniques. Adv-Tshirt [49] was the first non-rigid clothing-based adversarial patch targeting person detection systems, realized as an adversarial T-shirt. Its adversarial patterns are optimized using a thin plate spline (TPS) transformation to model the non-rigid deformations of the T-shirt, combined with a min–max optimization framework over an ensemble loss function to enhance robustness. Invisibility Cloak [82] concurrently uses a similar method to generate clothing adversarial patterns and has been verified on more models and datasets.

6.2.2. Adversarial Camouflage with Non-Differentiable Neural Renderers

Previous works employed non-differentiable neural renderers in adversarial camouflage attacks to generate adversarial textures or patterns through search-based or iterative optimization techniques. While effective, these methods typically require more computational resources and time compared to approaches using differentiable neural renderers. Under the Expectation Over Transformation (EOT) framework, Athalye et al. [110] pioneered the creation of the first 3D physical-world adversarial objects, addressing the challenge that adversarial examples often fail to maintain their adversarial nature under real-world image transformations.
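For reference, the EOT idea can be sketched in a few lines of PyTorch: at every optimization step the patch is passed through randomly sampled transformations (rescaling, brightness change, placement) before the detector loss is computed, so the patch is optimized in expectation over those transformations. The transformation set and the detector_objectness placeholder below are illustrative assumptions rather than the configuration used in [110].

```python
import random
import torch
import torch.nn.functional as F

def random_transform(patch):
    """Sample one physical-world transformation: random rescaling and brightness."""
    scale = random.uniform(0.8, 1.2)
    size = max(8, int(patch.shape[-1] * scale))
    warped = F.interpolate(patch.unsqueeze(0), size=(size, size),
                           mode="bilinear", align_corners=False).squeeze(0)
    brightness = random.uniform(0.7, 1.3)
    return (warped * brightness).clamp(0.0, 1.0)

def paste(scene, patch, top, left):
    """Compose the patch onto the scene at (top, left) without in-place writes."""
    h, w = patch.shape[-2:]
    H, W = scene.shape[-2:]
    pad = (left, W - left - w, top, H - top - h)      # (left, right, top, bottom)
    patch_canvas = F.pad(patch, pad)
    mask_canvas = F.pad(torch.ones_like(patch), pad)
    return scene * (1 - mask_canvas) + patch_canvas

def detector_objectness(image):
    # Placeholder for the victim detector's objectness/class score on the attacked object.
    return image.mean()

scene = torch.rand(3, 416, 416)
patch = torch.rand(3, 64, 64, requires_grad=True)
optimizer = torch.optim.Adam([patch], lr=0.01)

for step in range(300):
    samples = []
    for _ in range(8):                                 # Monte Carlo estimate of the expectation
        t_patch = random_transform(patch.clamp(0, 1))
        adv = paste(scene, t_patch,
                    top=random.randint(0, 300), left=random.randint(0, 300))
        samples.append(detector_objectness(adv))
    loss = torch.stack(samples).mean()                 # hiding attack: minimize expected objectness
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```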
Building on these ideas, CAMOU [66] employs a simulation environment (Unreal Engine) that enables the creation of realistic 3D vehicle models and diverse environmental conditions to learn camouflage patterns that reduce the detectability of vehicles by object detectors. DAS [67] proposes a dual attention suppression attack based on the observation that different models share similar attention patterns for the same object; its adversarial patterns are rendered with a neural mesh renderer (NMR). CAC [39] leverages Unity for rendering and employs a dense-proposals attack strategy, significantly enhancing the transferability and robustness of adversarial examples. However, the method’s applicability to Transformer-based models remains unexplored. FCA [68] generates robust adversarial texture patterns by utilizing the CARLA simulator to paint the texture on the entire vehicle surface.

6.2.3. Adversarial Camouflage with Differentiable Neural Renderers

Differentiable neural renderers bridge the gap between 3D models and 2D images by enabling gradient-based optimization of 3D model parameters directly from the pixel space.
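As a concrete illustration of this pixel-to-texture gradient path, the sketch below optimizes a UV texture map through PyTorch3D’s differentiable mesh renderer. The mesh file (vehicle.obj), camera pose, and detector_loss are hypothetical placeholders, and real camouflage pipelines additionally sample environments and viewpoints EOT-style on top of this skeleton.

```python
import torch
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.renderer import (
    FoVPerspectiveCameras, MeshRasterizer, MeshRenderer, PointLights,
    RasterizationSettings, SoftPhongShader, TexturesUV, look_at_view_transform,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Assumed UV-mapped asset; any .obj with texture coordinates works.
mesh = load_objs_as_meshes(["vehicle.obj"], device=device)

# The UV texture map is the only optimization variable.
texture_map = torch.full((1, 512, 512, 3), 0.5, device=device, requires_grad=True)

R, T = look_at_view_transform(dist=4.0, elev=15.0, azim=45.0)
cameras = FoVPerspectiveCameras(device=device, R=R, T=T)
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(cameras=cameras,
                              raster_settings=RasterizationSettings(image_size=416)),
    shader=SoftPhongShader(device=device, cameras=cameras,
                           lights=PointLights(device=device)),
)

def detector_loss(rendered_rgb):
    # Placeholder for the victim detector's objectness score on the vehicle.
    return rendered_rgb.mean()

optimizer = torch.optim.Adam([texture_map], lr=0.01)
for step in range(100):
    mesh.textures = TexturesUV(
        maps=texture_map.clamp(0, 1),
        faces_uvs=mesh.textures.faces_uvs_padded(),
        verts_uvs=mesh.textures.verts_uvs_padded(),
    )
    rendered = renderer(mesh)[..., :3]      # (1, H, W, 3), differentiable w.r.t. texture_map
    loss = detector_loss(rendered)          # hiding attack: suppress the detector's score
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```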
Among published methods, ACTIVE [53] provides a differentiable texture rendering process, using a neural texture renderer (NTR) to preserve various physical characteristics of the rendered images. AT3D [40] develops adversarial textured 3D meshes that can effectively deceive commercial face recognition systems by optimizing adversarial meshes in the low-dimensional coefficient space of the 3D Morphable Model (3DMM). AdvCaT [84] uses a Voronoi diagram to parameterize the adversarial camouflage textures and the Gumbel–softmax trick to enable differentiable sampling of discrete colors, followed by applying TopoProj and TPS to simulate the physical deformation and movement of clothes. RAUCA [70] utilizes a Neural Renderer Plus (NRP) that can project vehicle textures onto 3D models and render images incorporating a variety of environmental characteristics, such as lighting and weather conditions, leading to more robust and realistic adversarial camouflage. Wang et al. [86] proposed a unified adversarial attack framework that can generate physically printable adversarial patches with different attack goals, including instance-level hiding and scene-level object creation.
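The discrete-color parameterization described for AdvCaT can be prototyped with PyTorch’s built-in Gumbel–softmax. The sketch below is a generic illustration rather than AdvCaT’s actual code: it samples one color per texture cell from a small, assumed printable palette while keeping the selection differentiable via the straight-through estimator.

```python
import torch
import torch.nn.functional as F

# A small printable palette (RGB in [0, 1]); the colors are illustrative.
palette = torch.tensor([
    [0.20, 0.25, 0.15],   # dark green
    [0.45, 0.40, 0.25],   # khaki
    [0.10, 0.10, 0.10],   # near black
    [0.60, 0.55, 0.45],   # sand
])                          # (K, 3)

grid_h, grid_w, num_colors = 32, 32, palette.shape[0]
logits = torch.zeros(grid_h, grid_w, num_colors, requires_grad=True)

def sample_texture(logits, tau=1.0):
    """Differentiably pick one palette color per cell (straight-through Gumbel-softmax)."""
    one_hot = F.gumbel_softmax(logits, tau=tau, hard=True)       # (H, W, K)
    texture = one_hot @ palette                                   # (H, W, 3)
    return texture.permute(2, 0, 1)                               # (3, H, W)

def detector_loss(texture):
    # Placeholder: in a full pipeline the texture is mapped onto the garment or vehicle,
    # rendered, and scored by the victim detector.
    return texture.mean()

optimizer = torch.optim.Adam([logits], lr=0.05)
for step in range(200):
    loss = detector_loss(sample_texture(logits))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```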
Additionally, some studies have used both differentiable and non-differentiable neural renderers. For example, DTA [69] introduces the Differentiable Transformation Attack, which leverages a neural rendering model known as the Differentiable Transformation Network (DTN). The DTN is designed to learn the representation of various scene properties from a legacy photo-realistic renderer. TT3D [55] generates adversarial examples in a digital format and then converts them into physical objects using 3D printing techniques to achieve the targeted adversarial attack. The attack performs adversarial fine-tuning in the grid-based Neural Radiance Field (NeRF) space and conducts transferability tests across various renderers, including Nvdiffrast, MeshLab, and Blender.

6.3. Limitations of 3D Adversarial Camouflage

Physically, adversarial camouflage is typically deployed via 3D printing or via 2D printing followed by attachment to the target object. Since this form of attack manipulates a large area of the target’s surface, it is generally visible to human observers. However, 3D adversarial camouflage, particularly in the context of pedestrian and vehicle detection, is increasingly evolving toward a more natural appearance.
The adoption of differentiable neural renderers has been instrumental in this evolution, as they allow gradients to flow from the 2D rendered output back to the 3D model, enabling gradient-based optimization of both texture and shape. This facilitates the generation of highly effective and natural-looking adversarial patterns that are fine-tuned to maximize attack success while minimizing perceptibility. Furthermore, placing 3D adversarial camouflage has demonstrated notable cross-model transferability. For example, existing work [35,40,53,55,70,84,86,111] exhibits transferability across diverse-bone models.

7. Discussion and Future Trends

In this section, we provide a detailed discussion of the trade-offs between transferability, perceptibility, and deployment manners, along with their limitations and future trends.

7.1. Physical Deployment Manner and Transferability

The transferability of physical adversarial attacks can be qualitatively assessed by the cross-dataset, cross-model, and cross-task capabilities of adversarial patterns. For object detection tasks, the reduction in mean average precision (mAP) serves as the primary evaluation metric instead of attack success rate (ASR) [16], quantifying the performance degradation when adversarial patterns are transferred from source datasets and models to new models or tasks.
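A minimal sketch of this evaluation protocol is given below using torchmetrics (assuming a recent version that exposes MeanAveragePrecision); the detector, the frame/target lists, and the patch-application function are placeholders for whatever model and dataset are under evaluation.

```python
import torch
from torchmetrics.detection import MeanAveragePrecision

def evaluate_map(detector, frames, targets, apply_attack=None):
    """Return COCO-style mAP of `detector` on `frames`, optionally after applying an attack."""
    metric = MeanAveragePrecision()
    detector.eval()
    with torch.no_grad():
        for frame, target in zip(frames, targets):
            if apply_attack is not None:
                frame = apply_attack(frame)       # e.g., paste a patch or composite a light pattern
            preds = detector([frame])             # torchvision-style API: list of prediction dicts
            metric.update(preds, [target])
    return metric.compute()["map"].item()

# Typical usage (detector, frames, targets, and paste_patch are assumed to exist):
# clean_map = evaluate_map(detector, frames, targets)
# adv_map   = evaluate_map(detector, frames, targets, apply_attack=paste_patch)
# print(f"mAP drop: {clean_map - adv_map:.3f}")   # larger drop = stronger (transferred) attack
```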
We systematically summarize the transferability of adversarial attacks against object detection, focusing specifically on cross same-bone model and cross diverse-bone model transferability, along with the two corresponding physical deployment manners: printed deployment and injected signals. The transferability of these attacks and their corresponding physical deployment manners are presented in Table 9.
Adversarial patterns generally exhibit stronger transferability across models with similar architectures (referred to as CSMT) than across those with different architectures (referred to as CDMT). This performance gap primarily stems from discrepancies in model architectures, variations in training data distributions, and other contributing factors such as differences in feature representations and decision boundaries.
As shown in Table 9, and considering the publication years of the various methods in previous Table 3, Table 5 and Table 7, early adversarial attacks primarily demonstrated cross same-bone model transferability (particularly across CNN architectures). More recently, research demonstrating cross diverse-bone model transferability has also emerged. For instance, 3D adversarial camouflage methods [35,53,55,70,84,86] exhibit transferability across both CNN and Transformer architectures. Similarly, adversarial patches [15,57,74] also demonstrate transferability across different model architectures. From a practical deployment perspective, transferability represents a critical metric for assessing the effectiveness of attacks against unknown real-world systems. Furthermore, compared to adversarial patches or large-area 3D adversarial camouflage, adversarial signals, whether acoustic or optical (such as laser or infrared), are less perceptible to human observers and offer superior visual stealth [63,92]. In addition, their physical deployment is often more straightforward to implement. These advantages position adversarial signals as a promising vector for future attacks.

7.2. Weakly Perceptible Physical Adversarial Patterns

Adversarial patches that manipulate 2D physical targets cover a small area of the target surface, but their patterns are often unnatural and easily detectable by the naked eye. In real-world deployments, adversarial patterns must be readily captured by imaging sensors while remaining inconspicuous to human observers, all while maintaining a high attack success rate. Consequently, future work should prioritize designing weakly perceptible physical adversarial examples.
Researchers may design weakly perceptible physical adversarial patterns that combine multi-source illumination with deformable materials (e.g., colored thin-film plastics) to occlude key regions of the target, thereby inducing object hiding or object appearing effects in object detection models. Additionally, 3D-printed adversarial patterns constrained by semantic priors could be leveraged to deceive object detection systems.

7.3. Transferability Across Real Object Detection Systems

Real-world systems are typically black-box, meaning adversaries possess limited or no knowledge of the target system’s internal workings. To perform successful attacks under such constraints, they often resort to strategies such as surrogate models or black-box queries to achieve transferable adversarial effects in practical black-box settings. Enhancing the black-box transferability of physical adversarial patterns is therefore a critical direction for future research.
In the surrogate model method, adversarial patterns generated under white-box conditions demonstrate stronger transferability when the surrogate model’s architecture closely matches that of the target victim model. When the architecture of the target system is unknown, combining adversarial patterns generated from multiple surrogate models with diverse architectures may improve the feasibility and robustness of the attack.
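A common way to realize this multi-surrogate strategy is to average the adversarial objective over an ensemble of white-box surrogate detectors with different backbones. The sketch below uses two torchvision detectors as stand-ins for the surrogate pool; the patch placement and the thresholded objectness objective are simplified assumptions, and it relies on the detectors’ post-processed scores remaining differentiable with respect to the input.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn, ssd300_vgg16

# Surrogate pool with diverse backbones (stand-ins for whatever surrogates are available);
# weights="DEFAULT" downloads COCO-pretrained weights.
surrogates = [fasterrcnn_resnet50_fpn(weights="DEFAULT"), ssd300_vgg16(weights="DEFAULT")]
for model in surrogates:
    model.eval()

def apply_patch(image, patch):
    # Placeholder: overlay the patch at a fixed location on the target object.
    out = image.clone()
    out[:, 100:100 + patch.shape[1], 100:100 + patch.shape[2]] = patch
    return out

def objectness_score(model, image):
    """Sum of post-NMS detection confidences above a threshold (simplified hiding objective).

    Assumes the detector's output scores stay differentiable w.r.t. the input image.
    """
    scores = model([image])[0]["scores"]
    kept = scores[scores > 0.3]
    return kept.sum() if kept.numel() else image.sum() * 0.0

scene = torch.rand(3, 416, 416)
patch = torch.rand(3, 80, 80, requires_grad=True)
optimizer = torch.optim.Adam([patch], lr=0.01)

for step in range(100):
    adv = apply_patch(scene, patch.clamp(0, 1))
    # Averaging the objective over all surrogates encourages cross-model transferability.
    loss = torch.stack([objectness_score(m, adv) for m in surrogates]).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```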
In the black-box query method, adversaries in real-world scenarios often face strict limitations on the number of queries they can make to the target system. Efficiently incorporating prior knowledge about the target system into the adversarial pattern optimization process can significantly enhance query efficiency. For example, object detection mechanisms used in autonomous driving systems from different manufacturers can be analyzed using pre-trained deep neural networks based on open-source autonomous driving architectures, modules, and algorithms. By leveraging such prior knowledge, adversaries can better guide the optimization of adversarial patterns, thereby improving both query efficiency and attack effectiveness against unknown real-world systems.
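One standard query-efficient tool for this setting is a zeroth-order (NES-style) gradient estimate, which spends a fixed query budget on randomly perturbed copies of the current pattern. The sketch below is generic: query_target_system is a placeholder for whatever scalar feedback the real system exposes (e.g., the reported confidence of the targeted object), and the budget, step size, and smoothing parameters are illustrative.

```python
import torch

def query_target_system(image):
    # Placeholder for the only feedback the adversary can obtain from the
    # black-box detector, e.g., the confidence of the targeted object.
    return image.mean().item()

def apply_patch(image, patch):
    out = image.clone()
    out[:, 50:50 + patch.shape[1], 50:50 + patch.shape[2]] = patch.clamp(0, 1)
    return out

def nes_gradient(image, patch, sigma=0.05, samples=20):
    """Estimate the gradient of the black-box score w.r.t. the patch (2 queries per sample)."""
    grad = torch.zeros_like(patch)
    for _ in range(samples):
        noise = torch.randn_like(patch)
        score_plus = query_target_system(apply_patch(image, patch + sigma * noise))
        score_minus = query_target_system(apply_patch(image, patch - sigma * noise))
        grad += (score_plus - score_minus) * noise
    return grad / (2 * sigma * samples)

image = torch.rand(3, 416, 416)
patch = torch.rand(3, 64, 64)
query_budget, queries_per_step = 2000, 40      # hard limit imposed by the real system

while query_budget >= queries_per_step:
    grad = nes_gradient(image, patch, samples=queries_per_step // 2)
    patch = (patch - 0.05 * grad).clamp(0, 1)  # hiding attack: descend on the reported confidence
    query_budget -= queries_per_step
```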

7.4. Security Threats of Full Object Detection Pipeline

The deployment of adversarial patterns against object detection models in real-world environments involves numerous influencing factors, such as ambient lighting, distance, perspective, target velocity, camera sensor characteristics, and weather conditions, all of which affect the attack’s performance.
In the object detection modules of autonomous vehicles, when operating at high speeds or relying on multi-sensor fusion for decision-making, the robustness of existing adversarial patterns remains insufficiently validated. Many studies still lack evaluation under highway speed scenarios or testing on actual vehicle platforms, indicating a need for further verification. Other approaches use lasers to interfere with camera sensors, rolling shutters, or optical channels, thereby attacking the visual perception systems of autonomous vehicles. However, these efforts often focus on a single attack surface and lack comprehensive system-level adversarial evaluations of the entire object detection pipeline in autonomous driving. As a result, the overall systemic threat posed by such attacks remains poorly understood and requires further investigation.

7.5. Defense of Adversarial Patterns

Given the complexity of the physical world, we categorize defense strategies from the perspective of the model pipeline into four types according to the stage of model processing: model training, model input, model inference, and result output. These four defense strategies and their corresponding objectives are presented in Table 10.
(1)
Adversarial Training. This method is applied during the model training stage by proactively introducing adversarial perturbations into the training datasets [112]. This process enhances the model’s inherent robustness, enabling it to resist the interference of adversarial patterns.
(2)
Input Transformation. This defense is deployed at the input stage, before the image is processed by the target model, by filtering, purifying, or otherwise transforming the input. For example, SpectralDefense [124] transforms the input image from the spatial domain to the frequency (Fourier) domain in order to detect adversarial perturbations before they can affect the model. The objective of this category is to disrupt, neutralize, or flag adversarial patterns, thereby preventing them from compromising the model’s inference.
(3)
Adversarial Purging. During the model testing/inference stage, adversarial purging relies on preparing the target model in advance so that it can identify and detect adversarial patches and perturbations; representative methods include DetectorGuard [132], PatchGuard [133], PatchCleanser [134], PatchZero [137], PAD [139], and PatchCURE [140]. During inference, the enhanced model can recognize these threats and employ cleansing techniques that significantly reduce or even eliminate the effectiveness of adversarial attacks.
(4)
Multi-modal Fusion. This strategy operates at the final output stage by aggregating and reconciling outputs from multiple data modalities [143] or independent models to reach a consensus decision. Since adversarial patterns are typically effective against only a single modality or model, fusion mechanisms such as majority voting or consistency verification can mitigate the impact of unilateral attacks. A typical example is fusing results from infrared and visible light modalities in object detection tasks to defend against adversarial light signal injection attacks.
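A minimal sketch of such a consistency check is given below. It matches per-frame detections from a visible-light detector and an infrared (or LiDAR-projected) detector by IoU and label agreement, keeps cross-modal consensus detections, and flags single-modality detections as suspicious unless they are highly confident; all thresholds are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    box: tuple          # (x1, y1, x2, y2) in a shared image/ground frame
    label: str
    score: float

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def fuse(rgb_dets: List[Detection], ir_dets: List[Detection],
         iou_thr=0.5, solo_score_thr=0.9):
    """Keep detections confirmed by both modalities; accept single-modality
    detections only if they are very confident, otherwise flag them as suspicious."""
    fused, suspicious = [], []
    matched_ir = set()
    for d in rgb_dets:
        partner = next((j for j, e in enumerate(ir_dets)
                        if j not in matched_ir and e.label == d.label
                        and iou(d.box, e.box) >= iou_thr), None)
        if partner is not None:
            matched_ir.add(partner)
            fused.append(d)                      # cross-modal consensus
        elif d.score >= solo_score_thr:
            fused.append(d)
        else:
            suspicious.append(d)                 # possible light-injection artifact
    fused += [e for j, e in enumerate(ir_dets)
              if j not in matched_ir and e.score >= solo_score_thr]
    suspicious += [e for j, e in enumerate(ir_dets)
                   if j not in matched_ir and e.score < solo_score_thr]
    return fused, suspicious
```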
Among the four defense strategies mentioned above, adversarial training is applicable to all three types of adversarial attack deployment methods proposed in this paper. However, it performs relatively weakly against unseen adversarial patterns and requires a large number of adversarial samples during the training phase. Input transformation is primarily used against digital adversarial attacks. Adversarial purging, on the other hand, is suitable for defending against adversarial attacks that involve manipulating 2D physical objects, such as physically deployed adversarial patches. Lastly, the multi-modal fusion defense strategy can be applied in real-world scenarios, such as in perception systems for autonomous driving or unmanned aerial vehicles. In autonomous driving, multi-modal adversarial patterns can be injected into the data streams of in-vehicle perception sensors (e.g., cameras, LiDAR, radar) to perform targeted adversarial training, thereby enhancing the robustness of downstream modules such as object detection and lane detection at the system level.

8. Conclusions

Physical adversarial attacks pose a serious threat to the security of deep neural network models. From the perspective of a novel taxonomy—which includes manipulating 2D physical objects, injecting adversarial light and acoustic signals, and deploying 3D adversarial camouflage—we review existing literature on physical adversarial attacks against object detection. For each attack type, we further introduce subcategories based on the deployment of adversarial patterns to systematically organize current research. We provide a systematic evaluation of the transferability and perceptibility of physical adversarial attacks in object detection.
We find that physical adversarial attacks against object detection generally exhibit weak transferability across diverse-bone architectures. Furthermore, physical adversarial patterns often demonstrate high perceptibility and are easily detected by human observers. For object detection tasks in applications such as autonomous driving, where multi-sensor fusion is employed, generating adversarial patterns that achieve cross-modal transferability remains a critical and unresolved research challenge. Additionally, creating adversarial latency attacks that generate non-existent objects represents an emerging and noteworthy research direction.

Author Contributions

Formal analysis, methodology, validation, writing—original draft, G.L.; conceptualization, investigation, writing—review and editing, M.C.; visualization, writing—review and editing, S.X.; writing—review and editing, Y.Z.; funding acquisition, project administration, resources, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Open Foundation of Key Laboratory of Cyberspace Security, Ministry of Education of China (No. KLCS20240211). The APC was funded by Yan Cao.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
OD: Object Detection; PD: Person Detection
VD: Vehicle Detection; AD: Aerial Detection
LD: Lane Detection; TSD: Traffic Sign Detection
IC: Image Classification; FR: Face Recognition
OT: Object Tracking; TSR: Traffic Sign Recognition
RS: Real Systems
BB: Black-box; GyB: Gray-box
WB: White-box
UT: Untargeted Attack; TA: Targeted Attack
BQ: Black-Box Query; GM: Generative Model
GB: Gradient-based; SM: Surrogate Model
PDM: Physical Deployment Manner; PAA: Physical Adversarial Attack
Per: Perceptibility; Tra: Transferability
LAP: Adversarial Light Pattern; AAR: Adversarial Acoustic Pattern
VT: Victim Tasks; VMa: Victim Model Architecture
AInt: Adversary’s Intention
CNN: CNN-based bone architecture
TF: Transformer-based bone architecture
Akc: Adversary’s knowledge and capabilities
Gen: Generation and optimization method
DNR: Differentiable Neural Renderer
NNR: Non-differentiable Neural Renderer
AC (w/): Adversarial Camouflage with 3D-Rendering
AC (w/o): Adversarial Camouflage without 3D-Rendering
✓ means covered; × means not covered; – means not mentioned

References

  1. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 4 May 2021. [Google Scholar]
  2. Zhao, X.; Wang, L.; Zhang, Y.; Han, X.; Deveci, M.; Parmar, M. A Review of Convolutional Neural Networks in Computer Vision. Artif. Intell. Rev. 2024, 57, 99. [Google Scholar] [CrossRef]
  3. Kobayashi, R.; Nomoto, K.; Tanaka, Y.; Tsuruoka, G.; Mori, T. WIP: Shadow Hack: Adversarial Shadow Attack against LiDAR Object Detection. In Proceedings of the Symposium on Vehicle Security & Privacy, San Diego, CA, USA, 26 February 2024. [Google Scholar] [CrossRef]
  4. Zhou, M.; Zhou, W.; Huang, J.; Yang, J.; Du, M.; Li, Q. Stealthy and Effective Physical Adversarial Attacks in Autonomous Driving. IEEE Trans. Inf. Forensics Secur. 2024, 19, 6795–6809. [Google Scholar] [CrossRef]
  5. Ullah, F.U.M.; Obaidat, M.S.; Ullah, A.; Muhammad, K.; Hijji, M.; Baik, S.W. A Comprehensive Review on Vision-Based Violence Detection in Surveillance Videos. ACM Comput. Surv. 2023, 55, 1–44. [Google Scholar] [CrossRef]
  6. Nguyen, K.N.T.; Zhang, W.; Lu, K.; Wu, Y.H.; Zheng, X.; Li Tan, H.; Zhen, L. A Survey and Evaluation of Adversarial Attacks in Object Detection. IEEE Trans. Neural Netw. Learn. Syst. 2025, 36, 15706–15722. [Google Scholar] [CrossRef]
  7. Wang, N.; Xie, S.; Sato, T.; Luo, Y.; Xu, K.; Chen, Q.A. Revisiting Physical-World Adversarial Attack on Traffic Sign Recognition: A Commercial Systems Perspective. In Proceedings of the Network and Distributed System Security (NDSS) Symposium 2025, San Diego, CA, USA, 24–28 February 2025. [Google Scholar] [CrossRef]
  8. Li, Y.; Xie, B.; Guo, S.; Yang, Y.; Xiao, B. A Survey of Robustness and Safety of 2D and 3D Deep Learning Models Against Adversarial Attacks. arXiv 2023, arXiv:2310.00633. [Google Scholar] [CrossRef]
  9. Wu, B.; Zhu, Z.; Liu, L.; Liu, Q.; He, Z.; Lyu, S. Attacks in Adversarial Machine Learning: A Systematic Survey from the Life-Cycle Perspective. arXiv 2024, arXiv:2302.09457. [Google Scholar] [CrossRef]
  10. Jia, W.; Lu, Z.; Zhang, H.; Liu, Z.; Wang, J.; Qu, G. Fooling the Eyes of Autonomous Vehicles: Robust Physical Adversarial Examples Against Traffic Sign Recognition Systems. In Proceedings of the 2022 Network and Distributed System Security Symposium, San Diego, CA, USA, 24–28 April 2022. [Google Scholar] [CrossRef]
  11. Wang, N.; Luo, Y.; Sato, T.; Xu, K.; Chen, Q.A. Does Physical Adversarial Example Really Matter to Autonomous Driving? Towards System-Level Effect of Adversarial Object Evasion Attack. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 4389–4400. [Google Scholar] [CrossRef]
  12. Sato, T.; Bhupathiraju, S.H.V.; Clifford, M.; Sugawara, T.; Chen, Q.A.; Rampazzi, S. Invisible Reflections: Leveraging Infrared Laser Reflections to Target Traffic Sign Perception. In Proceedings of the 2024 Network and Distributed System Security Symposium, San Diego, CA, USA, 29 February–1 March 2024. [Google Scholar] [CrossRef]
  13. Guo, D.; Wu, Y.; Dai, Y.; Zhou, P.; Lou, X.; Tan, R. Invisible Optical Adversarial Stripes on Traffic Sign against Autonomous Vehicles. In Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services, Tokyo, Japan, 3–7 June 2024; pp. 534–546. [Google Scholar] [CrossRef]
  14. Duan, R.; Ma, X.; Wang, Y.; Bailey, J.; Qin, A.K.; Yang, Y. Adversarial Camouflage: Hiding Physical-World Attacks With Natural Styles. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 997–1005. [Google Scholar] [CrossRef]
  15. Lian, J.; Mei, S.; Zhang, S.; Ma, M. Benchmarking Adversarial Patch against Aerial Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5634616. [Google Scholar] [CrossRef]
  16. Liu, T.; Liu, Y.; Ma, Z.; Yang, T.; Liu, X.; Li, T.; Ma, J. L-HAWK: A Controllable Physical Adversarial Patch against a Long-Distance Target. In Proceedings of the 2025 Network and Distributed System Security Symposium, San Diego, CA, USA, 24–28 February 2025. [Google Scholar]
  17. Biton, D.; Shams, J.; Koda, S.; Shabtai, A.; Elovici, Y.; Nassi, B. Towards an End-to-End (E2E) Adversarial Learning and Application in the Physical World. arXiv 2025, arXiv:2501.08258. [Google Scholar] [CrossRef]
  18. Badjie, B.; Cecílio, J.; Casimiro, A. Adversarial Attacks and Countermeasures on Image Classification-Based Deep Learning Models in Autonomous Driving Systems: A Systematic Review. ACM Comput. Surv. 2024, 57, 1–52. [Google Scholar] [CrossRef]
  19. Wei, X.; Pu, B.; Lu, J.; Wu, B. Visually Adversarial Attacks and Defenses in the Physical World: A Survey. arXiv 2023, arXiv:2211.01671. [Google Scholar] [CrossRef]
  20. Wang, D.; Yao, W.; Jiang, T.; Tang, G.; Chen, X. A Survey on Physical Adversarial Attack in Computer Vision. arXiv 2023, arXiv:2209.14262. [Google Scholar] [CrossRef]
  21. Guesmi, A.; Hanif, M.A.; Ouni, B.; Shafique, M. Physical Adversarial Attacks for Camera-Based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook. IEEE Access 2023, 11, 109617–109668. [Google Scholar] [CrossRef]
  22. Wei, H.; Tang, H.; Jia, X.; Wang, Z.; Yu, H.; Li, Z.; Satoh, S.; Gool, L.V.; Wang, Z. Physical Adversarial Attack Meets Computer Vision: A Decade Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 9797–9817. [Google Scholar] [CrossRef] [PubMed]
  23. Nguyen, K.; Fernando, T.; Fookes, C.; Sridharan, S. Physical Adversarial Attacks for Surveillance: A Survey. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 17036–17056. [Google Scholar] [CrossRef]
  24. Zhang, C.; Xu, X.; Wu, J.; Liu, Z.; Zhou, L. Adversarial Attacks of Vision Tasks in the Past 10 Years: A Survey. arXiv 2024, arXiv:2410.23687. [Google Scholar] [CrossRef]
  25. Byun, J.; Cho, S.; Kwon, M.J.; Kim, H.S.; Kim, C. Improving the Transferability of Targeted Adversarial Examples through Object-Based Diverse Input. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 15223–15232. [Google Scholar] [CrossRef]
  26. Ren, Y.; Zhao, Z.; Lin, C.; Yang, B.; Zhou, L.; Liu, Z.; Shen, C. Improving Integrated Gradient-Based Transferable Adversarial Examples by Refining the Integration Path. arXiv 2024, arXiv:2412.18844. [Google Scholar] [CrossRef]
  27. Wei, X.; Ruan, S.; Dong, Y.; Su, H.; Cao, X. Distributionally Location-Aware Transferable Adversarial Patches for Facial Images. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 2849–2864. [Google Scholar] [CrossRef]
  28. Zhou, Z.; Li, B.; Song, Y.; Yu, Z.; Hu, S.; Wan, W.; Zhang, L.Y.; Yao, D.; Jin, H. NumbOD: A Spatial-Frequency Fusion Attack against Object Detectors. In Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025. [Google Scholar] [CrossRef]
  29. Li, C.; Jiang, T.; Wang, H.; Yao, W.; Wang, D. Optimizing Latent Variables in Integrating Transfer and Query Based Attack Framework. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 161–171. [Google Scholar] [CrossRef]
  30. Song, Y.; Zhou, Z.; Li, M.; Wang, X.; Zhang, H.; Deng, M.; Wan, W.; Hu, S.; Zhang, L.Y. PB-UAP: Hybrid Universal Adversarial Attack for Image Segmentation. In Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; pp. 1–5. [Google Scholar] [CrossRef]
  31. Sarkar, S.; Babu, A.R.; Mousavi, S.; Gundecha, V.; Ghorbanpour, S.; Naug, A.; Gutierrez, R.L.; Guillen, A. Reinforcement Learning Platform for Adversarial Black-Box Attacks with Custom Distortion Filters. Proc. AAAI Conf. Artif. Intell. 2025, 39, 27628–27635. [Google Scholar] [CrossRef]
  32. Chen, J.; Chen, H.; Chen, K.; Zhang, Y.; Zou, Z.; Shi, Z. Diffusion Models for Imperceptible and Transferable Adversarial Attack. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 961–977. [Google Scholar] [CrossRef]
  33. He, C.; Ma, X.; Zhu, B.B.; Zeng, Y.; Hu, H.; Bai, X.; Jin, H.; Zhang, D. DorPatch: Distributed and Occlusion-Robust Adversarial Patch to Evade Certifiable Defenses. In Proceedings of the 2024 Network and Distributed System Security Symposium, San Diego, CA, USA, 26 February–1 March 2024. [Google Scholar] [CrossRef]
  34. Wei, X.; Guo, Y.; Yu, J. Adversarial Sticker: A Stealthy Attack Method in the Physical World. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 2711–2725. [Google Scholar] [CrossRef] [PubMed]
  35. Li, Y.; Tan, W.; Wang, T.; Liang, X.; Pan, Q. Flexible Physical Camouflage Generation Based on a Differential Approach. arXiv 2024, arXiv:2402.13575. [Google Scholar] [CrossRef]
  36. Zhang, Q.; Guo, Q.; Gao, R.; Juefei-Xu, F.; Yu, H.; Feng, W. Adversarial Relighting Against Face Recognition. IEEE Trans. Inf. Forensics Secur. 2024, 19, 9145–9157. [Google Scholar] [CrossRef]
  37. Wang, Y.; Liu, Z.; Luo, B.; Hui, R.; Li, F. The Invisible Polyjuice Potion: An Effective Physical Adversarial Attack against Face Recognition. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, Salt Lake City, UT, USA, 14–18 October 2024; pp. 3346–3360. [Google Scholar] [CrossRef]
  38. Zhao, Y.; Zhu, H.; Liang, R.; Shen, Q.; Zhang, S.; Chen, K. Seeing Isn’t Believing: Practical Adversarial Attack Against Object Detectors. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; pp. 1989–2004. [Google Scholar] [CrossRef]
  39. Duan, Y.; Chen, J.; Zhou, X.; Zou, J.; He, Z.; Zhang, J.; Zhang, W.; Pan, Z. Learning Coated Adversarial Camouflages for Object Detectors. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, Austria, 23–29 July 2022; pp. 891–897. [Google Scholar] [CrossRef]
  40. Yang, X.; Liu, C.; Xu, L.; Wang, Y.; Dong, Y.; Chen, N.; Su, H.; Zhu, J. Towards Effective Adversarial Textured 3D Meshes on Physical Face Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 4119–4128. [Google Scholar] [CrossRef]
  41. Wan, J.; Fu, J.; Wang, L.; Yang, Z. BounceAttack: A Query-Efficient Decision-Based Adversarial Attack by Bouncing into the Wild. In Proceedings of the IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2024; pp. 1270–1286. [Google Scholar] [CrossRef]
  42. Zhu, W.; Ji, X.; Cheng, Y.; Zhang, S.; Xu, W. TPatch: A Triggered Physical Adversarial Patch. In Proceedings of the 32nd USENIX Conference on Security Symposium, Anaheim, CA, USA, 9–11 August 2023; pp. 661–678. [Google Scholar]
  43. Hu, C.; Shi, W.; Jiang, T.; Yao, W.; Tian, L.; Chen, X.; Zhou, J.; Li, W. Adversarial Infrared Blocks: A Multi-View Black-Box Attack to Thermal Infrared Detectors in Physical World. Neural Netw. 2024, 175, 106310. [Google Scholar] [CrossRef]
  44. Hu, C.; Shi, W.; Yao, W.; Jiang, T.; Tian, L.; Chen, X.; Li, W. Adversarial Infrared Curves: An Attack on Infrared Pedestrian Detectors in the Physical World. Neural Netw. 2024, 178, 106459. [Google Scholar] [CrossRef]
  45. Yang, C.; Kortylewski, A.; Xie, C.; Cao, Y.; Yuille, A. PatchAttack: A Black-Box Texture-Based Attack with Reinforcement Learning. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; Volume 12371, pp. 681–698. [Google Scholar] [CrossRef]
  46. Doan, B.G.; Xue, M.; Ma, S.; Abbasnejad, E.; Ranasinghe, C.D. TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep Neural Network Systems. IEEE Trans. Inf. Forensics Secur. 2022, 17, 3816–3830. [Google Scholar] [CrossRef]
  47. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object Detection in 20 Years: A Survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
  48. Cheng, G.; Yuan, X.; Yao, X.; Yan, K.; Zeng, Q.; Xie, X.; Han, J. Towards Large-Scale Small Object Detection: Survey and Benchmarks. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 13467–13488. [Google Scholar] [CrossRef]
  49. Xu, K.; Zhang, G.; Liu, S.; Fan, Q.; Sun, M.; Chen, H.; Chen, P.Y.; Wang, Y.; Lin, X. Adversarial T-Shirt! Evading Person Detectors in a Physical World. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; Volume 12350, pp. 665–681. [Google Scholar] [CrossRef]
  50. Zhu, X.; Li, X.; Li, J.; Wang, Z.; Hu, X. Fooling Thermal Infrared Pedestrian Detectors in Real World Using Small Bulbs. Proc. AAAI Conf. Artif. Intell. 2021, 35, 3616–3624. [Google Scholar] [CrossRef]
  51. Wei, X.; Yu, J.; Huang, Y. Physically Adversarial Infrared Patches with Learnable Shapes and Locations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 12334–12342. [Google Scholar] [CrossRef]
  52. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A Survey on Vision Transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 87–110. [Google Scholar] [CrossRef]
  53. Suryanto, N.; Kim, Y.; Larasati, H.T.; Kang, H.; Le, T.T.H.; Hong, Y.; Yang, H.; Oh, S.Y.; Kim, H. ACTIVE: Towards Highly Transferable 3D Physical Camouflage for Universal and Robust Vehicle Evasion. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 4282–4291. [Google Scholar] [CrossRef]
  54. Wang, R.; Guo, Y.; Wang, Y. AGS: Affordable and Generalizable Substitute Training for Transferable Adversarial Attack. Proc. AAAI Conf. Artif. Intell. 2024, 38, 5553–5562. [Google Scholar] [CrossRef]
  55. Huang, Y.; Dong, Y.; Ruan, S.; Yang, X.; Su, H.; Wei, X. Towards Transferable Targeted 3D Adversarial Attack in the Physical World. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024. [Google Scholar] [CrossRef]
  56. Chen, J.; Feng, Z.; Zeng, R.; Pu, Y.; Zhou, C.; Jiang, Y.; Gan, Y.; Li, J.; Ji, S. Enhancing Adversarial Transferability with Adversarial Weight Tuning. Proc. AAAI Conf. Artif. Intell. 2025, 39, 2061–2069. [Google Scholar] [CrossRef]
  57. Li, K.; Wang, D.; Zhu, W.; Li, S.; Wang, Q.; Gao, X. Physical Adversarial Patch Attack for Optical Fine-Grained Aircraft Recognition. IEEE Trans. Inf. Forensics Secur. 2024, 20, 436–448. [Google Scholar] [CrossRef]
  58. Cao, Y.; Wang, N.; Xiao, C.; Yang, D.; Fang, J.; Yang, R.; Chen, Q.A.; Liu, M.; Li, B. Invisible for Both Camera and LiDAR: Security of Multi-Sensor Fusion Based Perception in Autonomous Driving Under Physical-World Attacks. In Proceedings of the IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27 May 2021; pp. 176–194. [Google Scholar] [CrossRef]
  59. Ji, X.; Cheng, Y.; Zhang, Y.; Wang, K.; Yan, C.; Xu, W.; Fu, K. Poltergeist: Acoustic Adversarial Machine Learning against Cameras and Computer Vision. In Proceedings of the IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27 May 2021; pp. 160–175. [Google Scholar] [CrossRef]
  60. Chen, Y.; Xu, Z.; Yin, Z.; Ji, X. Rolling Colors: Adversarial Laser Exploits against Traffic Light Recognition. In Proceedings of the 31st USENIX Conference on Security Symposium, Boston, MA, USA, 10–12 August 2022; pp. 1957–1974. [Google Scholar]
  61. Song, R.; Ozmen, M.O.; Kim, H.; Muller, R.; Celik, Z.B.; Bianchi, A. Discovering Adversarial Driving Maneuvers against Autonomous Vehicles. In Proceedings of the 32nd USENIX Conference on Security Symposium, Anaheim, CA, USA, 9–11 August 2023; pp. 2957–2974. [Google Scholar]
  62. Jing, P.; Tang, Q.; Du, Y. Too Good to Be Safe: Tricking Lane Detection in Autonomous Driving with Crafted Perturbations. In Proceedings of the 30th USENIX Conference on Security Symposium, Vancouver, BC, Canada, 11–13 August 2021; pp. 3237–3254. [Google Scholar]
  63. Muller, R.; Song, R.; Wang, C. Investigating Physical Latency Attacks against Camera-Based Perception. In Proceedings of the 2025 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 12–15 May 2025; pp. 4588–4605. [Google Scholar] [CrossRef]
  64. Nassi, B.; Mirsky, Y.; Nassi, D.; Ben-Netanel, R.; Drokin, O.; Elovici, Y. Phantom of the ADAS: Securing Advanced Driver-Assistance Systems from Split-Second Phantom Attacks. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 9–13 November 2020; pp. 293–308. [Google Scholar] [CrossRef]
  65. Wang, W.; Yao, Y.; Liu, X.; Li, X.; Hao, P.; Zhu, T. I Can See the Light: Attacks on Autonomous Vehicles Using Invisible Lights. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 15–19 November 2021; pp. 1930–1944. [Google Scholar] [CrossRef]
  66. Zhang, Y.; Foroosh, H.; David, P.; Gong, B. Camou: Learning a Vehicle Camouflage for Physical Adversarial Attack on Object Detectors in the Wild. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
  67. Wang, J.; Liu, A.; Yin, Z.; Liu, S.; Tang, S.; Liu, X. Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8561–8570. [Google Scholar] [CrossRef]
  68. Wang, D.; Jiang, T.; Sun, J.; Zhou, W.; Gong, Z.; Zhang, X.; Yao, W.; Chen, X. FCA: Learning a 3D Full-Coverage Vehicle Camouflage for Multi-View Physical Adversarial Attack. Proc. AAAI Conf. Artif. Intell. 2022, 36, 2414–2422. [Google Scholar] [CrossRef]
  69. Suryanto, N.; Kim, Y.; Kang, H.; Larasati, H.T.; Yun, Y.; Le, T.T.H.; Yang, H.; Oh, S.Y.; Kim, H. DTA: Physical Camouflage Attacks Using Differentiable Transformation Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 15284–15293. [Google Scholar] [CrossRef]
  70. Zhou, J.; Lyu, L.; He, D.; Li, Y. RAUCA: A Novel Physical Adversarial Attack on Vehicle Detectors via Robust and Accurate Camouflage Generation. In Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 21–27 July 2024; Volume 235, pp. 62076–62087. [Google Scholar] [CrossRef]
  71. Eykholt, K.; Evtimov, I.; Fernandes, E.; Li, B.; Rahmati, A.; Xiao, C.; Prakash, A.; Kohno, T.; Song, D. Robust Physical-World Attacks on Deep Learning Visual Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1625–1634. [Google Scholar] [CrossRef]
  72. Thys, S.; Ranst, W.V.; Goedeme, T. Fooling Automated Surveillance Cameras: Adversarial Patches to Attack Person Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 49–55. [Google Scholar] [CrossRef]
  73. Huang, H.; Chen, Z.; Chen, H.; Wang, Y.; Zhang, K. T-SEA: Transfer-Based Self-Ensemble Attack on Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 20514–20523. [Google Scholar] [CrossRef]
  74. Lian, J.; Wang, X.; Su, Y.; Ma, M.; Mei, S. CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5606616. [Google Scholar] [CrossRef]
  75. Chen, S.T.; Cornelius, C.; Martin, J.; Chau, D.H. ShapeShifter: Robust Physical Adversarial Attack on Faster R-CNN Object Detector. In Proceedings of the Machine Learning and Knowledge Discovery in Databases, Dublin, Ireland, 10–14 September 2019; Volume 11051, pp. 52–68. [Google Scholar] [CrossRef]
  76. Yang, K.; Tsai, T.; Yu, H.; Ho, T.Y.; Jin, Y. Beyond Digital Domain: Fooling Deep Learning Based Recognition System in Physical World. Proc. AAAI Conf. Artif. Intell. 2020, 34, 1088–1095. [Google Scholar] [CrossRef]
  77. Hu, Y.C.T.; Chen, J.C.; Kung, B.H.; Hua, K.L.; Tan, D.S. Naturalistic Physical Adversarial Patch for Object Detectors. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 7828–7837. [Google Scholar] [CrossRef]
  78. Tan, J.; Ji, N.; Xie, H.; Xiang, X. Legitimate Adversarial Patches: Evading Human Eyes and Detection Models in the Physical World. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual, 20–24 October 2021; pp. 5307–5315. [Google Scholar] [CrossRef]
  79. Guesmi, A.; Ding, R.; Hanif, M.A.; Alouani, I.; Shafique, M. DAP: A Dynamic Adversarial Patch for Evading Person Detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 24595–24604. [Google Scholar]
  80. Tan, W.; Li, Y.; Zhao, C.; Liu, Z.; Pan, Q. DOEPatch: Dynamically Optimized Ensemble Model for Adversarial Patches Generation. IEEE Trans. Inf. Forensics Secur. 2024, 19, 9039–9054. [Google Scholar] [CrossRef]
  81. Huang, L.; Gao, C.; Zhou, Y.; Xie, C.; Yuille, A.L.; Zou, C.; Liu, N. Universal Physical Camouflage Attacks on Object Detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 717–726. [Google Scholar] [CrossRef]
  82. Wu, Z.; Lim, S.N.; Davis, L.S.; Goldstein, T. Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; Volume 12349, pp. 1–17. [Google Scholar] [CrossRef]
  83. Hu, Z.; Huang, S.; Zhu, X.; Sun, F.; Zhang, B.; Hu, X. Adversarial Texture for Fooling Person Detectors in the Physical World. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 13297–13306. [Google Scholar] [CrossRef]
  84. Hu, Z.; Chu, W.; Zhu, X.; Zhang, H.; Zhang, B.; Hu, X. Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 16975–16984. [Google Scholar] [CrossRef]
  85. Sun, J.; Yao, W.; Jiang, T.; Wang, D.; Chen, X. Differential Evolution Based Dual Adversarial Camouflage: Fooling Human Eyes and Object Detectors. Neural Netw. 2023, 163, 256–271. [Google Scholar] [CrossRef]
  86. Wang, J.; Li, F.; He, L. A Unified Framework for Adversarial Patch Attacks against Visual 3D Object Detection in Autonomous Driving. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 4949–4962. [Google Scholar] [CrossRef]
  87. Xiao, C.; Yang, D.; Li, B.; Deng, J.; Liu, M. MeshAdv: Adversarial Meshes for Visual Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 6891–6900. [Google Scholar] [CrossRef]
  88. Lovisotto, G.; Turner, H.; Sluganovic, I.; Strohmeier, M.; Martinovic, I. SLAP: Improving Physical Adversarial Examples with Short-Lived Adversarial Perturbations. In Proceedings of the 30th USENIX Conference on Security Symposium, Vancouver, BC, Canada, 11–13 August 2021; pp. 1865–1882. [Google Scholar]
  89. Köhler, S.; Lovisotto, G.; Birnbach, S.; Baker, R.; Martinovic, I. They See Me Rollin’: Inherent Vulnerability of the Rolling Shutter in CMOS Image Sensors. In Proceedings of the 37th Annual Computer Security Applications Conference, Virtual, 6–10 December 2021; pp. 399–413. [Google Scholar] [CrossRef]
  90. Wen, H.; Chang, S.; Zhou, L. Light Projection-Based Physical-World Vanishing Attack against Car Detection. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar] [CrossRef]
  91. Wen, H.; Chang, S.; Zhou, L.; Liu, W.; Zhu, H. OptiCloak: Blinding Vision-Based Autonomous Driving Systems Through Adversarial Optical Projection. IEEE Internet Things J. 2024, 11, 28931–28944. [Google Scholar] [CrossRef]
  92. Cheng, Y.; Ji, X.; Zhu, W.; Zhang, S.; Fu, K.; Xu, W. Adversarial Computer Vision via Acoustic Manipulation of Camera Sensors. IEEE Trans. Dependable Secur. Comput. 2024, 21, 3734–3750. [Google Scholar] [CrossRef]
  93. Khan, A.; Rauf, Z.; Sohail, A.; Khan, A.R.; Asif, H.; Asif, A.; Farooq, U. A Survey of the Vision Transformers and Their CNN-Transformer Based Variants. Artif. Intell. Rev. 2023, 56, 2917–2970. [Google Scholar] [CrossRef]
  94. Papa, L.; Russo, P.; Amerini, I.; Zhou, L. A Survey on Efficient Vision Transformers: Algorithms, Techniques, and Performance Benchmarking. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 7682–7700. [Google Scholar] [CrossRef] [PubMed]
  95. Lu, Y.; Jia, Y.; Wang, J.; Li, B.; Chai, W.; Carin, L.; Velipasalar, S. Enhancing Cross-Task Black-Box Transferability of Adversarial Examples with Dispersion Reduction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 937–946. [Google Scholar] [CrossRef]
  96. Zhao, Z.; Zhang, H.; Li, R.; Sicre, R.; Amsaleg, L.; Backes, M.; Li, Q.; Shen, C. Revisiting Transferable Adversarial Image Examples: Attack Categorization, Evaluation Guidelines, and New Insights. arXiv 2023, arXiv:2310.11850. [Google Scholar] [CrossRef]
  97. Fu, J.; Chen, Z.; Jiang, K.; Guo, H.; Wang, J.; Gao, S.; Zhang, W. Improving Adversarial Transferability of Vision-Language Pre-Training Models through Collaborative Multimodal Interaction. arXiv 2024, arXiv:2403.10883. [Google Scholar] [CrossRef]
  98. Kim, T.; Lee, H.J.; Ro, Y.M. Map: Multispectral Adversarial Patch to Attack Person Detection. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 4853–4857. [Google Scholar] [CrossRef]
  99. Wei, X.; Huang, Y.; Sun, Y.; Yu, J. Unified Adversarial Patch for Cross-Modal Attacks in the Physical World. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 4422–4431. [Google Scholar] [CrossRef]
  100. Zhang, T.; Jha, R. Adversarial Illusions in Multi-Modal Embeddings. In Proceedings of the 33rd USENIX Security Symposium, Philadelphia, PA, USA, 14–16 August 2024; pp. 3009–3025. [Google Scholar]
  101. Williams, P.N.; Li, K. CamoPatch: An Evolutionary Strategy for Generating Camouflaged Adversarial Patches. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS 2023), New Orleans, LA, USA, 10–16 December 2023; Volume 36, pp. 67269–67283. [Google Scholar]
  102. Liu, F.; Zhang, C.; Zhang, H. Towards Transferable Unrestricted Adversarial Examples with Minimum Changes. In Proceedings of the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), Raleigh, NC, USA, 8–10 February 2023; pp. 327–338. [Google Scholar] [CrossRef]
  103. Eykholt, K.; Evtimov, I.; Fernandes, E.; Li, B.; Rahmati, A.; Tramer, F.; Prakash, A.; Kohno, T.; Song, D. Physical Adversarial Examples for Object Detectors. In Proceedings of the 12th USENIX Workshop on Offensive Technologies, Baltimore, MD, USA, 13–14 August 2018. [Google Scholar]
  104. Xia, H.; Zhang, R.; Kang, Z.; Jiang, S.; Xu, S. Enhance Stealthiness and Transferability of Adversarial Attacks with Class Activation Mapping Ensemble Attack. In Proceedings of the 2024 Network and Distributed System Security Symposium, San Diego, CA, USA, 26 February–1 March 2024. [Google Scholar] [CrossRef]
  105. Williams, P.N.; Li, K. Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 12291–12301. [Google Scholar] [CrossRef]
  106. Brown, T.B.; Mané, D.; Roy, A.; Abadi, M.; Gilmer, J. Adversarial Patch. In Proceedings of the Advances in Neural Information Processing Systems Workshop, Montreal, QC, Canada, 3–8 December 2018. [Google Scholar] [CrossRef]
  107. Kurakin, A.; Goodfellow, I.; Bengio, S. Adversarial Examples in the Physical World. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017. [Google Scholar] [CrossRef]
  108. Jiang, Q.; Ji, X.; Yan, C.; Xie, Z.; Lou, H.; Xu, W. GlitchHiker: Uncovering Vulnerabilities of Image Signal Transmission with IEMI. In Proceedings of the 32nd USENIX Conference on Security Symposium, Anaheim, CA, USA, 9–11 August 2023; pp. 7249–7266. [Google Scholar]
  109. Lin, G.; Niu, M.; Zhu, Q.; Yin, Z.; Li, Z.; He, S.; Zheng, Y. Adversarial Attacks on Event-Based Pedestrian Detectors: A Physical Approach. In Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025. [Google Scholar] [CrossRef]
  110. Athalye, A.; Engstrom, L.; Ilyas, A.; Kwok, K. Synthesizing Robust Adversarial Examples. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 284–293. [Google Scholar]
  111. Xie, T.; Han, H.; Shan, S.; Chen, X. Natural Adversarial Mask for Face Identity Protection in Physical World. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 47, 2089–2106. [Google Scholar] [CrossRef]
  112. Addepalli, S.; Jain, S.; Babu, R.V. Efficient and Effective Augmentation Strategy for Adversarial Training. In Proceedings of the Advances in Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022; Volume 35, pp. 1488–1501. [Google Scholar]
  113. Dong, Y.; Ruan, S.; Su, H.; Kang, C.; Wei, X.; Zhu, J. ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), New Orleans, LA, USA, 28 November–9 December 2022; Volume 35, pp. 36789–36803. [Google Scholar] [CrossRef]
  114. Wei, Z.; Wang, Y.; Guo, Y.; Wang, Y. CFA: Class-Wise Calibrated Fair Adversarial Training. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 8193–8201. [Google Scholar] [CrossRef]
  115. Li, L.; Spratling, M. Data Augmentation Alone Can Improve Adversarial Training. In Proceedings of the International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar] [CrossRef]
  116. Xu, Y.; Sun, Y.; Goldblum, M.; Goldstein, T.; Huang, F. Exploring and Exploiting Decision Boundary Dynamics for Adversarial Robustness. In Proceedings of the International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar] [CrossRef]
  117. Latorre, F.; Krawczuk, I.; Dadi, L.; Pethick, T.; Cevher, V. Finding Actual Descent Directions for Adversarial Training. In Proceedings of the International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
  118. Yuan, X.; Zhang, Z.; Wang, X.; Wu, L. Semantic-Aware Adversarial Training for Reliable Deep Hashing Retrieval. IEEE Trans. Inf. Forensics Secur. 2023, 18, 4681–4694. [Google Scholar] [CrossRef]
  119. Ruan, S.; Dong, Y.; Su, H.; Peng, J.; Chen, N.; Wei, X. Towards Viewpoint-Invariant Visual Recognition via Adversarial Training. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 4686–4696. [Google Scholar] [CrossRef]
  120. Li, Q.; Shen, C.; Hu, Q.; Lin, C.; Ji, X.; Qi, S. Towards Gradient-Based Saliency Consensus Training for Adversarial Robustness. IEEE Trans. Dependable Secur. Comput. 2024, 21, 530–541. [Google Scholar] [CrossRef]
  121. Liu, Y.; Yang, C.; Li, D.; Ding, J.; Jiang, T. Defense against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 25554–25563. [Google Scholar] [CrossRef]
  122. Bai, Y.; Ma, Z.; Chen, Y.; Deng, J.; Pang, S.; Liu, Y.; Xu, W. Alchemy: Data-Free Adversarial Training. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, Salt Lake City, UT, USA, 14–18 October 2024; pp. 3808–3822. [Google Scholar] [CrossRef]
  123. He, C.; Zhu, B.B.; Ma, X.; Jin, H.; Hu, S. Feature-Indistinguishable Attack to Circumvent Trapdoor-Enabled Defense. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 15–19 November 2021; pp. 3159–3176. [Google Scholar] [CrossRef]
  124. Harder, P.; Pfreundt, F.J.; Keuper, M.; Keuper, J. SpectralDefense: Detecting Adversarial Attacks on CNNs in the Fourier Domain. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–8. [Google Scholar] [CrossRef]
  125. Nie, W.; Guo, B.; Huang, Y.; Xiao, C.; Vahdat, A.; Anandkumar, A. Diffusion Models for Adversarial Purification. arXiv 2022, arXiv:2205.07460. [Google Scholar] [CrossRef]
  126. Shi, X.; Peng, Y.; Chen, Q.; Keenan, T.; Thavikulwat, A.T.; Lee, S.; Tang, Y.; Chew, E.Y.; Summers, R.M.; Lu, Z. Robust Convolutional Neural Networks against Adversarial Attacks on Medical Images. Pattern Recognit. 2022, 132, 108923. [Google Scholar] [CrossRef]
  127. Zhu, H.; Zhang, S.; Chen, K. AI-Guardian: Defeating Adversarial Attacks Using Backdoors. In Proceedings of the 2023 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–25 May 2023; pp. 701–718. [Google Scholar] [CrossRef]
  128. Wang, Y.; Li, X.; Yang, L.; Ma, J.; Li, H. ADDITION: Detecting Adversarial Examples with Image-Dependent Noise Reduction. IEEE Trans. Dependable Secur. Comput. 2024, 21, 1139–1154. [Google Scholar] [CrossRef]
  129. Wu, S.; Wang, J.; Zhao, J.; Wang, Y.; Liu, X. NAPGuard: Towards Detecting Naturalistic Adversarial Patches. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 24367–24376. [Google Scholar] [CrossRef]
  130. Diallo, A.F.; Patras, P. Sabre: Cutting through Adversarial Noise with Adaptive Spectral Filtering and Input Reconstruction. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2024; pp. 2901–2919. [Google Scholar] [CrossRef]
  131. Zhou, Z.; Li, M.; Liu, W.; Hu, S.; Zhang, Y.; Wan, W.; Xue, L.; Zhang, L.Y.; Yao, D.; Jin, H. Securely Fine-Tuning Pre-Trained Encoders against Adversarial Examples. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2024; pp. 3015–3033. [Google Scholar] [CrossRef]
  132. Xiang, C.; Mittal, P. DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 15–19 November 2021; pp. 3177–3196. [Google Scholar] [CrossRef]
  133. Xiang, C.; Sehwag, V.; Mittal, P. PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Online, 11–13 August 2021. [Google Scholar]
  134. Xiang, C.; Mahloujifar, S.; Mittal, P. PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 2065–2082. [Google Scholar]
  135. Liu, J.; Levine, A.; Lau, C.P.; Chellappa, R.; Feizi, S. Segment and Complete: Defending Object Detectors against Adversarial Patch Attacks with Robust Patch Detection. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 14953–14962. [Google Scholar] [CrossRef]
  136. Cai, J.; Chen, S.; Li, H.; Xia, B.; Mao, Z.; Yuan, W. HARP: Let Object Detector Undergo Hyperplasia to Counter Adversarial Patches. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 2673–2683. [Google Scholar] [CrossRef]
  137. Xu, K.; Xiao, Y.; Zheng, Z.; Cai, K.; Nevatia, R. PatchZero: Defending against Adversarial Patch Attacks by Detecting and Zeroing the Patch. In Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2–7 January 2023; pp. 4621–4630. [Google Scholar] [CrossRef]
  138. Lin, Z.; Zhao, Y.; Chen, K.; He, J. I Don’t Know You, but I Can Catch You: Real-Time Defense against Diverse Adversarial Patches for Object Detectors. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, Salt Lake City, UT, USA, 14–18 October 2024; pp. 3823–3837. [Google Scholar] [CrossRef]
  139. Jing, L.; Wang, R.; Ren, W.; Dong, X.; Zou, C. PAD: Patch-Agnostic Defense against Adversarial Patch Attacks. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 24472–24481. [Google Scholar] [CrossRef]
  140. Xiang, C.; Wu, T.; Dai, S.; Mittal, P. PatchCURE: Improving Certifiable Robustness Model Utility and Computation Efficiency of Adversarial Patch Defense. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 14–16 August 2024; pp. 3675–3692. [Google Scholar]
  141. Feng, J.; Li, J.; Miao, C.; Huang, J.; You, W.; Shi, W.; Liang, B. Fight Fire with Fire: Combating Adversarial Patch Attacks Using Pattern-Randomized Defensive Patches. In Proceedings of the IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 12–15 May 2025; pp. 2133–2151. [Google Scholar] [CrossRef]
  142. Wei, X.; Kang, C.; Dong, Y.; Wang, Z.; Ruan, S.; Chen, Y.; Su, H. Real-World Adversarial Defense against Patch Attacks Based on Diffusion Model. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 1–17. [Google Scholar] [CrossRef]
  143. Yu, Z.; Li, A.; Wen, R.; Chen, Y.; Zhang, N. PhySense: Defending Physically Realizable Attacks for Autonomous Systems via Consistency Reasoning. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, Salt Lake City, UT, USA, 14–18 October 2024; pp. 3853–3867. [Google Scholar] [CrossRef]
  144. Kobayashi, R.; Mori, T. Invisible but Detected: Physical Adversarial Shadow Attack and Defense on LiDAR Object Detection. In Proceedings of the 34th USENIX Security Symposium (USENIX Security 25), Seattle, WA, USA, 13–15 August 2025; pp. 7369–7386. [Google Scholar]
Figure 1. Article structure. Physical adversarial attacks against object detection (Section 3) are divided into three categories: manipulating 2D physical objects (Section 4), injecting adversarial signals (Section 5), and placing 3D adversarial camouflage (Section 6). For each category, we systematically explain the adversarial patterns, detail the generation and optimization methods, evaluate the nine key attributes, and discuss specific limitations.
Figure 2. A schematic diagram of physical adversarial attacks, covering the manipulation of physical adversarial patterns to generate physical adversarial examples, as well as the generation and optimization of digital adversarial patterns. This side-by-side view makes it easy to compare where existing contributions lie and what is still missing for building effective physical or digital adversarial attacks.
Figure 3. The taxonomy of physical adversarial attacks against object detection: (1) Manipulating 2D physical objects; (2) Injecting adversarial signals; (3) Placing 3D adversarial camouflage. The blue dashed boxes represent the physical deployment methods of adversarial patterns; the black dashed boxes represent the types of physical adversarial attacks; the brown solid arrows indicate the correspondence between the deployment manners and the physical attacks.
Table 4. The adversarial attacks of manipulating printed 2D adversarial patterns: generation and optimization, deployment manners, transferability, and perceptibility.
AP | Method | Gen | Physical Deployment | Transferability | Perceptibility
Patch | extended-RP2 [103] | GB | Printed, placed | - | Visible
Patch | Nested-AE [38] | GB | Printed, placed | Same-bone | Visible
Patch | OBJ-CLS [72] | GB | Printed, placed | - | Visible
Patch | AP-PA [15] | GB, SM | Printed, placed (on, around) | Diverse-bone | Visible
Patch | T-SEA [73] | SM | Displayed on iPad | Same-bone | Visible
Patch | SysAdv [11] | GB | Printed, placed | Same-bone | Visible
Patch | CBA [74] | GB, SM | Printed, placed (around) | Diverse-bone | Visible
Patch | PatchGen [57] | SM | Printed, placed | Diverse-bone | Visible
Cloth-P | NAP [77] | GM | Printed, worn | Same-bone | Natural
Cloth-P | LAP [78] | GB | Printed, worn | - | Natural
Cloth-P | DAP [79] | GB | Printed, worn | Same-bone | Natural
Cloth-P | DOEPatch [80] | SM | Printed, worn | Same-bone | Visible
Sticker | TLD [62] | BQ | Printed, placed | - | Covert
Image-P | ShapeShifter [75] | GB | Printed, photographed | - | Visible
Image-P | LPAttack [76] | GB | Printed, photographed | Same-bone | Visible
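To make the gradient-based (GB) generation process summarized in Table 4 concrete, the following minimal sketch optimizes a printable patch against a detector under random brightness and rotation changes that loosely mimic physical deployment conditions. The wrapper `detection_confidence` (returning the detector’s confidence for the attacked object) is a hypothetical placeholder, not an API of any cited method, and the sketch omits refinements such as printability losses, perspective transforms, and scene-specific placement used by published attacks.

```python
import torch
import torchvision.transforms.functional as TF

# Minimal sketch of gradient-based (GB) patch optimization with random
# physical-style transforms. `detection_confidence` is a hypothetical,
# differentiable wrapper around a victim detector that returns the object
# confidence for the attacked class; it is NOT a real library API.

def optimize_patch(detection_confidence, images, steps=200, lr=0.03, patch_size=64):
    patch = torch.rand(3, patch_size, patch_size, requires_grad=True)
    optimizer = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        # Random brightness/rotation to mimic lighting and viewpoint changes
        # (expectation-over-transformation-style augmentation).
        p = TF.adjust_brightness(patch.clamp(0, 1), 0.8 + 0.4 * torch.rand(1).item())
        p = TF.rotate(p, angle=float(torch.empty(1).uniform_(-15, 15)))
        adv = images.clone()
        adv[:, :, :patch_size, :patch_size] = p      # paste patch at a fixed region
        loss = detection_confidence(adv).mean()      # hiding attack: minimize confidence
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        patch.data.clamp_(0, 1)                      # keep the patch in valid, printable RGB
    return patch.detach()
```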
Table 5. The injection of adversarial light and acoustic signal, and the attribute list of adversarial attack methods.
AP | Method | Akc | VMa | VTA | Int | Venue
ALP | Phantom Attack [64] | BB | RS | PD, TSR | UT | 2020 CCS
ALP | SLAP [88] | WB, BB | CNN | OD, TSR | UT, TA | 2021 USENIX
AAP | PG attack [59] | BB | CNN | OD | UT, TA | 2021 S&P
ALP | Rolling Shutter [89] | BB | CNN | OD | UT | 2021 ACSAC
ALP | ICSL [65] | BB | RS | OD, TSD | UT | 2021 CCS
ALP | Vanishing Attack [90] | WB, BB | CNN | VD | UT | 2023 ICASSP
AAP | TPatch [42] | WB, BB | CNN | OD, IC | TA | 2023 USENIX
AEP | GlitchHiker [108] | GrB | CNN | OD, FR | UT | 2023 USENIX
ALP | OptiCloak [91] | WB, BB | CNN | VD | UT, TA | 2024 IoTJ
AAP | Cheng et al. [92] | BB | CNN | OD, LD | UT, TA | 2024 TDSC
ALP | L-HAWK [16] | WB | CNN | OD, IC | TA | 2025 NDSS
ALP | DETSTORM [63] | WB | CNN, Other | OD, OT | UT | 2025 S&P
Table 6. Adversarial light and acoustic signal attacks: generation and optimization, signal source, deployment manners, transferability, and perceptibility.
Method | Gen | Signal Source | Physical Deployment | Transferability | Perceptibility
Phantom Attack [64] | - | Projector or screen | Projected or displayed | - | Natural
SLAP [88] | GB, SM | Projector (RGB) | Projected short-lived light | Same-bone | Visible
PG attack [59] | BQ | Acoustic signals | Injected into the inertial sensors | Same-bone | Imperceptible
Rolling Shutter [89] | BQ | Blue laser signal | Injected into rolling shutter | Same-bone | Covert
ICSL [65] | BQ | Infrared laser LED | Deployed and injected into camera | Diverse-bone | Covert
Vanishing Attack [90] | GB, SM | Projector | Projected | Same-bone | Visible
TPatch [42] | GB, SM | Acoustic signals, patch | Injected, affixed patch | Same-bone | Covert
GlitchHiker [108] | BQ | Intentional electromagnetic interference (IEMI) | Injected signal | Same-bone | Covert
OptiCloak [91] | GB, BQ | Projector | Projected | Same-bone | Visible
Cheng et al. [92] | BQ | Acoustic signals | Injected into the inertial sensors | Same-bone | Imperceptible
L-HAWK [16] | GB | Laser signal, patch | Injected, affixed patch | Same-bone | Covert
DETSTORM [63] | GB | Projector | Projected into camera perception pipeline | Diverse-bone | Covert
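For the projector-based entries in Tables 5 and 6, the optimization differs from patch attacks mainly in the constraint set: projected light can only add bounded, non-negative energy to the scene. The sketch below illustrates this constraint only; `capture` (a differentiable scene/camera model) and `detector_score` are hypothetical placeholders, and real attacks must additionally model projector gamma, ambient light, and surface reflectance.

```python
import torch

# Minimal sketch of optimizing a projected-light perturbation.
# `capture` and `detector_score` are hypothetical stand-ins, not APIs of the cited works.

def optimize_projection(capture, detector_score, steps=300, lr=0.02,
                        shape=(3, 128, 128), max_intensity=0.5):
    # Projected light can only ADD energy, so the perturbation is constrained
    # to be non-negative and bounded by the projector's brightness budget.
    light = torch.zeros(shape, requires_grad=True)
    optimizer = torch.optim.Adam([light], lr=lr)
    for _ in range(steps):
        frame = capture(light.clamp(0, max_intensity))  # scene as seen by the victim camera
        loss = detector_score(frame)                    # hiding attack: suppress detection
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        light.data.clamp_(0, max_intensity)
    return light.detach()
```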
Table 7. Attribute list of placing 3D adversarial camouflage and corresponding adversarial attack methods.
AP | Method | Akc | VMa | VTA | Int | Venue
AC (w/o) | UPC [81] | WB | CNN | OD | UT | 2020 CVPR
AC (w/o) | Invisibility Cloak [82] | WB | CNN | PD | UT | 2020 ECCV
AC (w/o) | AdvT-shirt [49] | WB | CNN | PD | UT | 2020 ECCV
AC (w/o) | TC-EGA [83] | WB | CNN | PD | UT | 2022 CVPR
AC (w/) | MeshAdv [87] | WB | CNN | OD, IC | TA | 2019 CVPR
AC (w/) | CAMOU [66] | BB | CNN | VD | UT | 2019 ICML
AC (w/) | DAS [67] | BB | CNN | OD, IC | UT | 2021 CVPR
AC (w/) | CAC [39] | WB | CNN | VD | TA | 2022 IJCAI
AC (w/) | FCA [68] | BB | CNN | VD | UT | 2022 AAAI
AC (w/) | DTA [69] | WB | CNN | VD | UT | 2022 CVPR
AC (w/) | AdvCaT [84] | WB | CNN, TF | PD | UT | 2023 CVPR
AC (w/) | ACTIVE [53] | WB | CNN, TF | VD, IS | UT | 2023 ICCV
AC (w/) | DAC [85] | BB | CNN | PD | UT | 2023 NN
AC (w/) | TT3D [55] | BB | CNN, TF | OD, IC, ICap | TA | 2024 CVPR
AC (w/) | FPA [35] | WB | CNN, TF | OD, VD | UT | 2024 arXiv
AC (w/) | RAUCA [70] | BB | CNN, TF | VD | UT | 2024 ICML
AC (w/) | Lin et al. [109] | WB | TF | PD | UT | 2025 AAAI
AC (w/) | Wang et al. [86] | WB | CNN, TF | OD (3D) | UT | 2025 TCSVT
Table 8. Placing 3D adversarial camouflage: generation and optimization, neural renderer, deployment manner, transferability, and perceptibility.
Method | Gen | Renderer | Physical Deployment | Transferability | Perceptibility
UPC [81] | GB | - | Printed, attached | Same-bone | Visible
Invisibility Cloak [82] | GB | - | Printed, worn | Same-bone | Visible
AdvT-shirt [49] | GB | - | Printed, worn | Same-bone | Visible
TC-EGA [83] | GB, GM | - | Printed, tailored | Same-bone | Visible
MeshAdv [87] | GB | NNR | 3D-printed, placed | Same-bone | Visible
CAMOU [66] | SM | NNR | Painted on target | Same-bone | Visible
DAS [67] | SM | NNR | Printed, attached to a toy car | Same-bone | Visible
CAC [39] | GB | NNR | 3D-printed car model | Same-bone | Visible
FCA [68] | SM | NNR | Printed, attached to a toy car | Same-bone | Visible
DTA [69] | GB | DNR, NNR | 3D-printed car model | Same-bone | Visible
AdvCaT [84] | GB | DNR | Printed, tailored | Diverse-bone | Natural
ACTIVE [53] | GB | DNR | 3D-printed car model | Diverse-bone | Visible
DAC [85] | SM | NNR | Printed, worn | Same-bone | Natural
TT3D [55] | SM | DNR, NNR | 3D-printed, placed | Diverse-bone | Natural
FPA [35] | GB | DNR, NNR | Printed, attached to a toy car | Diverse-bone | Visible
RAUCA [70] | SM | DNR | Printed, attached to a toy car | Diverse-bone | Visible
Lin et al. [109] | GB | DNR | Printed, assembled | - | Natural
Wang et al. [86] | GB | DNR | Printed, attached | Diverse-bone | Visible
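The renderer column in Table 8 reflects that 3D camouflage is usually optimized through a neural or differentiable renderer (NNR/DNR) so that gradients can flow from the detector back to the texture map. The sketch below shows the overall loop under randomly sampled viewpoints; `render` and `detector_score` are hypothetical placeholders, not the APIs of the cited works.

```python
import torch

# Minimal sketch of 3D camouflage texture optimization. `render` maps a
# texture map and camera pose to an image of the textured object through a
# differentiable renderer; `detector_score` returns the victim detector's
# object confidence. Both are hypothetical stand-ins.

def optimize_camouflage(render, detector_score, steps=500, lr=0.01, tex_hw=(512, 512)):
    texture = torch.rand(3, *tex_hw, requires_grad=True)   # UV texture map
    optimizer = torch.optim.Adam([texture], lr=lr)
    for _ in range(steps):
        # Sample a random viewpoint so the camouflage stays adversarial
        # across distances and angles (multi-view optimization).
        pose = {"azimuth": float(torch.empty(1).uniform_(0, 360)),
                "elevation": float(torch.empty(1).uniform_(0, 40)),
                "distance": float(torch.empty(1).uniform_(3, 10))}
        rendered = render(texture.clamp(0, 1), pose)        # image of the camouflaged object
        loss = detector_score(rendered)                     # hiding attack: suppress detection
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        texture.data.clamp_(0, 1)
    return texture.detach()
```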
Table 9. The transferability of adversarial physical attacks and corresponding physical deployment manner.
Transferability | Physical Deployment | Method
Same-bone | Printed Deployment | [11,38,39,49,67,68,69,76,77,79,80,81,82,83,85,87]
Same-bone | Injected Signal | [16,42,59,88,89,90,91,92,108]
Diverse-bone | Printed Deployment | [15,35,53,55,57,70,74,84,86]
Diverse-bone | Injected Signal | [63,65]
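A minimal way to obtain the transferability summarized in Table 9 is to evaluate the same physical adversarial examples against several detectors and compare attack success rates between models that share the surrogate’s backbone (same-bone) and those that do not (diverse-bone). In the sketch below, `detects_target` and the detector dictionary keys are illustrative assumptions rather than a standardized benchmark.

```python
# Minimal sketch of a cross-model transferability check for hiding attacks.
# `detects_target(detector, img)` is a hypothetical predicate that returns
# True when the detector still finds the attacked object in the image.

def attack_success_rate(detects_target, detector, adversarial_images):
    missed = sum(1 for img in adversarial_images if not detects_target(detector, img))
    return missed / max(len(adversarial_images), 1)

def transferability_report(detects_target, detectors, adversarial_images):
    # detectors: dict mapping a name (e.g., "yolov5", "faster_rcnn") to a loaded model
    return {name: attack_success_rate(detects_target, det, adversarial_images)
            for name, det in detectors.items()}
```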
Table 10. The taxonomy of adversarial defense strategies: defense objectives and key limitations across the four pipeline stages.
Pipeline Stage | Defense Strategy | Defense Objective | Cost and Limitations | Defense Method
Model Training | Adversarial Training | Enhancing model’s intrinsic robustness | Poor defense against novel or unseen adversarial patterns | [112,113,114,115,116,117,118,119,120,121,122]
Data Input | Input Transformation | Neutralizing adversarial features in input | Potential damage to valid image information, degrading primary task performance | [123,124,125,126,127,128,129,130,131]
Model Inference | Adversarial Purging | Locating and cleansing adversarial patterns | Ineffective against subtle or natural adversarial patterns | [132,133,134,135,136,137,138,139,140,141,142]
Result Output | Multi-Modal Fusion | Achieving output consensus from multiple modalities or models | High computational overhead and poor real-time performance | [143,144]
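As a concrete illustration of the “Input Transformation” row in Table 10, the sketch below applies a fixed Gaussian low-pass filter before detection to weaken high-frequency adversarial patterns. It is an illustrative stand-in rather than a reimplementation of the cited defenses, and it also exhibits the limitation listed in the table: aggressive smoothing can degrade benign detection accuracy.

```python
import torch
import torch.nn.functional as F

# Minimal sketch of an input-transformation defense: depthwise Gaussian
# blurring applied to the image batch before it reaches the detector.

def gaussian_kernel(size=5, sigma=1.5):
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    k = k / k.sum()
    return k.view(1, 1, size, size).repeat(3, 1, 1, 1)  # one kernel per RGB channel

def defended_detect(detector, images, size=5, sigma=1.5):
    kernel = gaussian_kernel(size, sigma).to(images.device)
    smoothed = F.conv2d(images, kernel, padding=size // 2, groups=3)  # depthwise blur
    return detector(smoothed)
```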
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
