Article

Efficient Ensemble Adversarial Attack for a Deep Neural Network (DNN)-Based Unmanned Aerial Vehicle (UAV) Vision System

1 School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
2 Michtom School of Computer Science, Brandeis University, Waltham, MA 02453, USA
* Author to whom correspondence should be addressed.
Drones 2024, 8(10), 591; https://doi.org/10.3390/drones8100591
Submission received: 3 September 2024 / Revised: 10 October 2024 / Accepted: 16 October 2024 / Published: 17 October 2024

Abstract

In recent years, unmanned aerial vehicle (UAV) vision systems based on deep neural networks (DNNs) have made remarkable advancements, demonstrating impressive performance. However, due to the inherent characteristics of DNNs, these systems have become increasingly vulnerable to adversarial attacks. Traditional black-box attack methods typically require a large number of queries to generate adversarial samples successfully. In this paper, we propose a novel adversarial attack technique designed to achieve efficient black-box attacks with a minimal number of queries. We define a perturbation generator that first decomposes the image into four frequency bands using wavelet decomposition and then searches for adversarial perturbations across these bands by minimizing a weighted loss function on a set of fixed surrogate models. For the target victim model, the perturbation images generated by the perturbation generator are used to query and update the weights in the loss function, as well as the weights for different frequency bands. Experimental results show that, compared to state-of-the-art methods on various image classifiers trained on ImageNet (such as VGG-19, DenseNet-121, and ResNeXt-50), our method achieves a success rate of over 98% for targeted attacks and nearly a 100% success rate for non-targeted attacks with only 1–2 queries per image.

1. Introduction

In recent years, unmanned aerial vehicles (UAVs) have seen widespread use across various fields such as agricultural monitoring [1], express delivery [2], and emergency medical services [3], owing to their exceptional mobility and rich sensing capabilities. Concurrently, the rapid development of deep learning (DL) technologies has significantly advanced image recognition techniques based on deep neural networks (DNNs) [4,5]. These techniques now exhibit remarkable performance and substantial adaptability in data-driven decision-making processes, becoming integral to enhancing the autonomous operational capabilities of UAVs [6,7,8]. By incorporating DNNs into their visual systems, UAVs can process and analyze images from onboard cameras in real time [9]. This ability allows them to accurately identify crucial details such as ground features, obstacles, and navigational markers. Such advancements significantly improve their capacity to perceive, analyze, and respond to their environment during flight.
However, the emergence of adversarial examples reveals the vulnerability of DNNs [10]. These meticulously crafted malicious samples, created by adding imperceptible perturbations to normal images, can induce misclassifications in unmanned aerial vehicle (UAV) target recognition models, posing a severe threat to their reliability. Consequently, exploring adversarial attack mechanisms and developing highly robust DNN-based UAV vision systems becomes particularly critical. Adversarial attacks are generally categorized into two types based on the attacker's knowledge of the target model: white-box and black-box attacks [11,12]. In white-box attacks, the target model is fully exposed to the adversary, so it is straightforward to design perturbations using gradients. In contrast, under the black-box assumption, attackers cannot directly access internal model information but can only query the model through input data to obtain outputs, such as probabilities or labels, without direct access to gradient information. This makes black-box attacks more challenging yet more aligned with the needs of real-world application scenarios [13].
Black-box attacks can be further divided into query-based and transfer-based attacks. Query-based attacks rely on feedback from model queries to guide the generation of adversarial samples, requiring a large number of queries to ensure effectiveness [14]. Although frequent querying can increase the success rate of attacks, it also significantly raises the cost and risk of detection. In contrast, transfer-based attacks generate adversarial samples on a substitute model before transferring them to the target model, generally requiring fewer queries but often resulting in lower success rates [15]. Thus, designing a black-box adversarial attack method that combines high success rates with minimal querying presents a challenging task. While recent research combining query and transfer attack strategies has achieved significant breakthroughs [16,17,18], solving this problem still faces many challenges.
Typically, low-frequency components contain the basic information of an image, such as overall shape and areas of constant color, which form the core visual content [19]. High-frequency components, on the other hand, represent finer details and potential noise, including edges, texture, and rapid intensity variations, which are critical for high-quality image perception [20]. Specifically, in digital imaging, high-frequency components are those features that exhibit significant variations from surrounding pixels, commonly represented by sharp edges or pixel mutations. Adversarial perturbations also belong to high-frequency noise due to pixel mutation phenomena. In some studies, low-frequency perturbations have been shown to efficiently attack models [21,22], while other studies have demonstrated that high-frequency signals significantly influence the predictions of deep neural networks (DNNs) [23,24]. Although the frequency distribution of adversarial perturbations remains unclear, previous studies have demonstrated that implementing adversarial attacks in the frequency domain can be highly effective.
In this paper, we propose an efficient ensemble black-box adversarial attack method that combines query-based and transfer-based approaches in the frequency domain. This method constructs a perturbation generator to search for adversarial perturbations within the frequency domain. We minimize a weighted loss function on a set of fixed surrogate models to effectively implement the attack. Additionally, perturbation images generated by the generator are used to query and update the weights in the loss function and across different frequency bands. Experimental results demonstrate that this method achieves state-of-the-art performance across multiple image classifiers trained on ImageNet.
The main contributions of this paper are summarized as follows.
(1)
We propose a novel frequency decomposition-based perturbation generator that first decomposes the image into different frequency bands using wavelet decomposition. Then, it refines the weight updates within these frequency bands based on historical query information to generate adversarial perturbations that more effectively mislead the model. Furthermore, in addition to using the models' gradient information when adding perturbations, we also introduce randomly generated square Gaussian noise perturbations in high-gradient areas to enhance the attack's success rate.
(2)
We propose an efficient ensemble black-box adversarial attack method that combines query-based and transfer-based attack strategies. This method effectively performs the attack by minimizing a weighted loss function on a set of fixed surrogate models. Additionally, adversarial images generated by the perturbation generator are used to query and update the weights in the loss function. Experimental results demonstrate that, compared to the latest methods on various image classifiers trained on ImageNet (such as VGG-19, DenseNet-121, and ResNeXt-50), our approach achieves a success rate of over 98% for targeted attacks and a nearly 100% success rate for non-targeted attacks with only 1–2 queries per image.

2. Related Work

2.1. Adversarial Example

Szegedy et al. [10] are the pioneers in proposing adversarial perturbations. They cleverly introduced subtle perturbations into original images by leveraging gradient information. These minor modifications transform clean samples into adversarial samples. Although these changes are typically imperceptible to the human eye, they are sufficient to mislead classifiers into making incorrect predictions. Assume $x \in \mathbb{R}^d$ represents a clean image sample, and $x_{adv}$ represents its corresponding adversarial example. This process of creating adversarial examples can be framed as an optimization problem:
$$\arg\max_{x_{adv}} J(x_{adv}, y) \quad \mathrm{s.t.} \quad \|x_{adv} - x\|_{l_p} \leq \epsilon \tag{1}$$
where $y$ is the class label of the clean image $x$ and $J(\cdot)$ represents the loss function, used to measure the error of the adversarial sample $x_{adv}$ relative to the label $y$. The term $\|x_{adv} - x\|_p$ denotes the $l_p$ distance between $x_{adv}$ and $x$, commonly using $p \in \{0, 2, \infty\}$. The parameter $\epsilon$ is a threshold that limits the size of the perturbation, ensuring that the difference between the adversarial and original samples remains within a narrowly defined range, making it difficult for the human eye to detect. Adversarial sample attacks can be categorized into black-box and white-box attacks, with black-box attacks posing more challenges but also more closely aligning with real-world application requirements. Black-box attacks can further be subdivided into transfer-based attacks, query-based attacks, and query-transfer-based strategies [11].
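As a hedged illustration of the constrained optimization in Equation (1), the following minimal PyTorch sketch performs iterative sign-based gradient ascent on the loss under an $l_\infty$ bound; the cross-entropy loss, step sizes, and the assumption that images lie in $[0, 1]$ are placeholders rather than the specific attack studied in this paper.

```python
import torch

def linf_attack(model, x, y, eps=16 / 255, alpha=2 / 255, steps=10):
    """Minimal l_inf-bounded attack in the spirit of Eq. (1):
    maximize J(x_adv, y) subject to ||x_adv - x||_inf <= eps."""
    loss_fn = torch.nn.CrossEntropyLoss()
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)               # J(x_adv, y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()       # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)             # keep a valid image
    return x_adv.detach()
```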

2.2. Transfer-Based Black Box Attacks

Given the transferable properties of adversarial perturbations across different victim models, adversarial samples can be generated for a substitute model and then applied to the target model to execute attacks. Zhou et al. [15] proposed a Data-free Substitute Training (DaST) method for training substitute models. However, this method cannot accurately recover data distributions and decision boundaries, both crucial for the transferability of adversarial samples. Dong et al. [25] introduced momentum into the iterative process, proposing the Momentum Iterative Fast Gradient Sign Method (MI-FGSM) to enhance the effectiveness of iterative attacks. Xie et al. [26] employed data augmentation, applying random image transformations with a given probability to increase the diversity of the inputs, and presented the Diverse Inputs Iterative Fast Gradient Sign Method (DI²-FGSM). Additionally, to improve the effectiveness of transfer attacks, Liu et al. [27] integrated the probability scores of multiple proxy models. While these methods are effective, a more natural approach involves combining the losses of multiple proxy models. For example, Yuan et al. [28] iteratively selected a set of proxy models from an ensemble and performed meta-training and meta-testing steps to minimize the difference between the white-box and black-box gradient directions. Furthermore, Ma et al. [29] utilized several proxy models to train a generalized substitute model, which serves as a stand-in for victim models to leverage the transferability of the attack on the target model. However, training such simulators is computationally expensive and difficult to scale to large datasets.

2.3. Query-Based Black Box Attacks

Query-based techniques address black-box optimization problems by iteratively querying the victim model, demonstrating higher attack success rates compared to transfer-based methods, though requiring a larger number of queries. Brendel et al. [14] initially generated adversarial samples with significant perturbations, subsequently refining them to closely resemble the original images. Brunner et al. [30] improved this strategy, enhancing query efficiency. Chen et al. [31] introduced a zeroth-order optimization attack method, which relies on extensive queries to the target model and employs a dimensionality estimation-based finite difference method to approximate gradient values. Subsequent enhancements to this technique include Tu et al.’s AutoZOOM [32] and Ilyas et al.’s Bandits-TD [33]. However, the query costs for these methods across the entire image space can be prohibitively high, requiring thousands or even tens of thousands of queries to mount an effective attack [31,32]. To reduce the complexity of queries, Guo et al. [22] explored searching for adversarial perturbations in the low-frequency domain after a discrete cosine transformation. A common drawback of these studies is the necessity to access the victim model to obtain complete inference scores. As mentioned earlier, such attacks could be effectively defended against if the target model restricts access.

2.4. Transfer-Query-Based Black Box Attacks

Combining query-based and transfer-based methods integrates the advantages of both techniques, aiming to significantly reduce the number of queries while enhancing the attack success rate. One strategy involves updating adversarial samples by implementing white-box attacks on proxy models under the guidance of query feedback. For instance, Guo et al. [16] employed a method for gradient estimation within a lower-dimensional subspace defined by prior gradients of several substitute models. Cheng et al. [17] utilized proxy gradients as a prior for transfer-based attacks and introduced the P-RGF method by extracting random vectors for gradient estimation within a low-dimensional subspace. Tashiro et al. [34] proposed the ODS method, which increases the diversity of output-space perturbations by optimizing in the logit space. The GFCS method [35] improved upon ODS, primarily searching along the direction of substitute gradients and reverting to ODS if that direction fails. Cai et al. [36] generated adversarial perturbations by minimizing a weighted loss function across a series of fixed substitute models. Other approaches involve learning the adversarial distribution by leveraging clean images and their adversarial samples obtained from attacking proxy models, such as TREMBA [18], which trained a small perturbation generator and searched for adversarial perturbations in a low-dimensional latent space; CGA [37] partially transferred conditional adversarial distribution parameters of substitute models and learned the untransferred parameters based on queries to the target model, achieving a higher transfer success rate and significantly improved query efficiency, although targeted attack queries on ImageNet still exceeded 3000. Mohaghegh et al. [38] used the clean data distribution to approximate the adversarial distribution, then utilized the learned mapping of adversarial distribution features as a space to search for adversarial samples. Furthermore, some methods continuously optimize substitute models using query feedback and generate adversarial samples based on these models. For example, Al-Dujaili et al. [39] extracted generalized priors from previous attacks and used these priors to infer attack patterns. Yin et al. [40] pre-trained a meta-generator and performed rapid fine-tuning based on historical feedback information to produce effective perturbations. However, while these methods help enhance attack efficiency, they demand extensive computational resources and are time-consuming.

3. Attack Scenario

In practical UAV adversarial scenarios, the DNN-based vision systems of hostile UAVs are typically unknown. Attackers cannot directly access the internal model parameters, weights, or training data. Instead, we can only infer the behavioral patterns of the vision systems by observing the UAVs' responses to specific inputs and using this information to design and generate adversarial examples. Therefore, the attack process can be described as follows. Consider a DNN-based UAV vision model $F: X \to Y$, which represents a mapping function from the input space $X$ to the output space $Y$. Given an image $x \in X$, we obtain $F(x; \theta) = y$ along with the predicted confidence or output probability $p(y|x)$. The objective of the attack is to find a small perturbation $\delta$ that generates an adversarial example $x_{adv}$, which can be described as Equation (2).
$$d(x_{adv}, x) = \|x_{adv} - x\|_p, \quad p \in \{0, 2, \infty\}, \qquad \min\, d(x_{adv}, x) \quad \mathrm{s.t.} \quad F(x_{adv}; \theta) \neq F(x; \theta) \tag{2}$$
In a black-box attack scenario, since one is limited to obtaining the output probabilities $p$ by querying the model $F(x; \theta)$, it is necessary to find an adversarial sample $x_{adv}$ that is very close to $x$ with as few queries as possible.
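To make this query-limited setting concrete, the short sketch below assumes the victim model is reachable only through a function that returns output probabilities; the names query_victim and is_adversarial are illustrative and not part of any deployed system.

```python
import torch

def query_victim(model, x):
    """Black-box access: only the output probabilities are returned,
    never gradients or internal parameters."""
    with torch.no_grad():
        return torch.softmax(model(x), dim=1)

def is_adversarial(model, x_adv, y_true):
    """Success condition of Eq. (2): the prediction changes away from y_true."""
    pred = query_victim(model, x_adv).argmax(dim=1)
    return bool((pred != y_true).all())
```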
In practice, our attack process involves generating adversarial samples through a proxy model ensemble search on the backend server. Subsequently, these samples are transmitted and imported into the UAV for querying the black-box target model deployed within its onboard vision system.

4. The Proposed Attacks

This section begins by introducing the overall architecture of the proposed algorithm, followed by an explanation of its main components.

4.1. Overall Architecture

The algorithm initially constructs a perturbation generator that integrates $n$ substitute models. This generator uses the wavelet transform to decompose the original input image into four frequency bands: $(cA, cH, cV, cD)$. Initial weights $W$ for the substitute models and $W_{fc}$ for the different frequency bands are assigned separately. Based on these weights and gradient information, we add gradient-based perturbations and random block perturbations across the four bands. Then, the perturbed frequency bands are reconstructed into an adversarial example through the inverse wavelet transform for use in attacking the target model. If the attack succeeds, the algorithm terminates; if it fails, the weights $W$ of the substitute models and the weights $W_{fc}$ for the different frequency bands are updated based on the loss information from the target model, and the weights are reallocated for further iterations. This process continues until the attack is successful or the maximum number of queries is reached. The architecture of the algorithm is illustrated in Figure 1.

4.2. Construction of Frequency-Based Perturbation Generator for Surrogate Ensemble Models

Inspired by ensemble attack methods [35,36], we design a frequency-based perturbation generator to create adversarial perturbations and conduct attacks. The generator takes an image as input and produces perturbations capable of deceiving all substitute models. To reduce the number of queries during the attack, we introduce a weighted adversarial loss function. By minimizing the loss across a collection of proxy models, the perturbations are guided in a direction that more effectively deceives the models.
The perturbation generator consists of $n$ substitute models, denoted as $\mathbb{F} = [F_1, F_2, \ldots, F_n]$, where each model is assigned an initial weight $W = [w_1, w_2, \ldots, w_n]$ such that $\sum_{i=1}^{n} w_i = 1$. Given a set of weights $W$, an image $x$, a label $y$, and frequency bands $F_c$ obtained from frequency decomposition ($FD$), the goal of the attack is to search for the perturbation $\delta$ and determine the model weights $W$, as well as the weights $W_{fc}$ of the frequency bands $F_c$, ensuring that the perturbed frequency bands $F_c'$ can be used to reconstruct the image $x_{adv}$ via the Inverse Frequency Decomposition ($IFD$) method. The reconstructed image is designed to mislead all substitute models (i.e., for the target model $F$, $F(x_{adv}) = y' \neq y$). The adversarial example $x_{adv}$ generated through the ensemble of substitute models can be represented as follows:
$$F_c = FD(x), \qquad F_c' = F_c(W, W_{fc}, \delta), \qquad x_{adv} = IFD(F_c') \tag{3}$$
Thus, the problem can be decomposed into four sub-problems: (1) frequency decomposition ($FD$), (2) weighted ensemble loss function optimization ($W$), (3) frequency band weights optimization ($W_{fc}$), and (4) perturbation addition.

4.2.1. Frequency Decomposition

Inspired by frequency domain operations [22], this paper attempts to decompose images in the frequency domain using Discrete Wavelet Transform (DWT) and to search for adversarial perturbations in various frequency bands. Compared to the Discrete Cosine Transform (DCT) used in other attack methods, DWT not only provides multi-resolution analysis capabilities, allowing for more detailed capture of local features and edge information in images, but its localization characteristics in both spatial and frequency domains make it more efficient and accurate in handling local variations.
For a two-dimensional signal $x(k, l) \in L^2(\mathbb{R}^2)$, let $cA_j x$ represent the two-dimensional discrete wavelet approximation of the signal $x(k, l)$ at resolution $j$. The decomposition formula for multiresolution analysis is provided in Equation (4) [41].
$$\begin{aligned} cA_{j+1}x &= \sum_{k,l \in \mathbb{Z}} \big(h^{1*}_{k-2n}\, h^{2*}_{l-2m}\big)\, cA_j x, \\ cH_{j+1}x &= \sum_{k,l \in \mathbb{Z}} \big(h^{1*}_{k-2n}\, g^{2*}_{l-2m}\big)\, cA_j x, \\ cV_{j+1}x &= \sum_{k,l \in \mathbb{Z}} \big(g^{1*}_{k-2n}\, h^{2*}_{l-2m}\big)\, cA_j x, \\ cD_{j+1}x &= \sum_{k,l \in \mathbb{Z}} \big(g^{1*}_{k-2n}\, g^{2*}_{l-2m}\big)\, cA_j x, \end{aligned} \qquad n, m \in \mathbb{Z},\; j \in \{0, 1, 2, \ldots, J\} \tag{4}$$
where $(cH_{j+1}x, cV_{j+1}x, cD_{j+1}x)$ represent the detail coefficients in the horizontal, vertical, and diagonal directions, respectively, at resolution $j+1$. The filters $h^1$ and $h^2$ denote low-pass filters, while $g^1$ and $g^2$ denote high-pass filters, determined by the chosen wavelet basis functions. The symbol $*$ indicates the complex conjugate of the filter coefficients, used to ensure energy preservation and correct reconstruction during the inverse transformation. $m$ and $n$ indicate the displacements of the filter coefficients in the vertical and horizontal directions, respectively. Additionally, the decomposed coefficients can be reconstructed through the Inverse Discrete Wavelet Transform (IDWT), as shown in Equation (5).
$$cA_j x = \sum_{k,l \in \mathbb{Z}} \big(h^{1}_{n-2k} h^{2}_{m-2l}\, cA_{j+1}x + h^{1}_{n-2k} g^{2}_{m-2l}\, cH_{j+1}x + g^{1}_{n-2k} h^{2}_{m-2l}\, cV_{j+1}x + g^{1}_{n-2k} g^{2}_{m-2l}\, cD_{j+1}x\big), \qquad n, m \in \mathbb{Z} \tag{5}$$
To reduce the computational cost of the attack, we select the four frequency bands $(cA_1, cH_1, cV_1, cD_1)$ obtained from a single decomposition as the search space for perturbations. For ease of representation, we denote the process described in Equation (4) as Equation (6). Given an input image $x$, a single level of decomposition separates it into a low-frequency sub-band $\{cA_1\}$ and high-frequency sub-bands $\{cH_1, cV_1, cD_1\}$. The decomposition process is as follows:
$$(cA_1, cH_1, cV_1, cD_1) = DWT(x) \tag{6}$$
Similarly, we can reconstruct the frequency bands back into an image $x$ using the IDWT, as illustrated in Equation (7):
$$x = IDWT(cA_1, cH_1, cV_1, cD_1) \tag{7}$$
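As a minimal sketch of Equations (6) and (7), a single-level 2D DWT and its inverse can be computed with the PyWavelets library; the Haar basis and the 224×224 single-channel input are illustrative assumptions, since the paper does not state which wavelet basis is used.

```python
import numpy as np
import pywt

# x: a single-channel image as a 2D array in [0, 1] (illustrative size).
x = np.random.rand(224, 224)

# Single-level 2D decomposition, Eq. (6): one low-frequency band cA and
# three high-frequency detail bands (cH, cV, cD).
cA, (cH, cV, cD) = pywt.dwt2(x, wavelet='haar')

# ... adversarial perturbations would be added to the four bands here ...

# Reconstruction, Eq. (7): the inverse DWT maps the bands back to an image.
x_rec = pywt.idwt2((cA, (cH, cV, cD)), wavelet='haar')
assert np.allclose(x, x_rec)
```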

4.2.2. Weighted Ensemble Loss Function Optimization

For the attack optimization problem based on a weighted ensemble loss function, the first option is to minimize the cross-entropy loss defined on the weighted combination of probability vectors from the ensemble models. This combines the prediction probabilities of multiple models and minimizes the loss for the target label.
$$x_{adv}(W) = \arg\min_x \, -\log\!\Big(\mathbf{1}_y^{\top} \sum_{i=1}^{N} w_i\, \mathrm{softmax}(F_i(x))\Big) \tag{8}$$
Equation (8) weights the probability vectors of all models by $w_i$, calculates the logarithmic loss for the target label $y$, and optimizes the adversarial perturbation based on this calculation [27].
The second method combines the logit outputs of multiple models by weighted aggregation to effectively integrate the outputs of multiple models, as shown in Equation (9). This approach minimizes the loss relative to the target label to generate adversarial perturbations. It effectively consolidates information from multiple models to enhance the efficacy and generalizability of the adversarial samples [25].
$$x_{adv}(W) = \arg\min_x \, L\Big(\sum_{i=1}^{N} w_i F_i(x),\; y\Big) \tag{9}$$
Additionally, another method involves optimizing adversarial perturbations by minimizing a weighted adversarial loss function, addressing the issue of synthesizing losses from multiple models to generate more effective adversarial perturbations [42], as shown in Equation (10).
$$x_{adv}(W) = \arg\min_x \, \sum_{i=1}^{N} w_i\, L\big(F_i(x),\, y\big) \tag{10}$$
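The three weighting schemes of Equations (8)–(10) can be sketched in PyTorch as follows; models is a list of surrogate classifiers and w a weight vector, both placeholders, and cross-entropy stands in for a generic loss $L$.

```python
import torch
import torch.nn.functional as F

def prob_ensemble_loss(models, w, x, y):
    """Eq. (8): weight the softmax probabilities, then take -log of the
    probability assigned to the target label y."""
    p = sum(wi * F.softmax(m(x), dim=1) for wi, m in zip(w, models))
    return -torch.log(p.gather(1, y.view(-1, 1)) + 1e-12).mean()

def logit_ensemble_loss(models, w, x, y):
    """Eq. (9): weight the logits first, then apply a single loss."""
    logits = sum(wi * m(x) for wi, m in zip(w, models))
    return F.cross_entropy(logits, y)

def weighted_loss_ensemble(models, w, x, y):
    """Eq. (10): weight the per-model losses (the variant selected in the
    ablation study of Section 5.5)."""
    return sum(wi * F.cross_entropy(m(x), y) for wi, m in zip(w, models))
```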

4.2.3. Frequency Band Weights Optimization

For the frequency band weight optimization problem, to achieve adaptive weight adjustment across different frequency bands, we adopt the concept of the Adam optimization algorithm [43]. The update of each weight depends not only on the current gradient but also on the estimates of the first and second moments of past gradients. Specifically, each weight and its corresponding gradient are identified by a unique index. Upon receiving a gradient $g$, the iteration count $t$ for that weight is first updated. This iteration count is used for subsequent bias correction to ensure effective gradient estimates even in the early iterations. Subsequently, the mean of the current gradient, $g_{mean}$, is calculated and used in Equation (11) to update the first moment $m_t$ and the second moment $v_t$.
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_{mean}, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_{mean}^2 \tag{11}$$
where $\beta_1$ and $\beta_2$ are decay factors that control the extent to which historical information influences the current estimates. Additionally, to correct biases caused by initialization and early iterations, we apply bias correction (Equation (12)) to adjust the first and second moments.
$$\hat{m} = \frac{m_t}{1 - \beta_1^{t}}, \qquad \hat{v} = \frac{v_t}{1 - \beta_2^{t}} \tag{12}$$
Using these adjusted moment estimates, the weight update formula is shown in Equation (13).
$$w = w - \alpha \frac{\hat{m}}{\sqrt{\hat{v}} + \epsilon} \tag{13}$$
Finally, we ensure that the updated weight values $w$ remain within preset minimum and maximum limits, which helps prevent excessive weight updates, thus maintaining the stability and effectiveness of the algorithm.
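A minimal sketch of the Adam-style band-weight update of Equations (11)–(13) is given below; the learning rate, decay factors, and clipping bounds are illustrative assumptions rather than the values used in the experiments.

```python
import numpy as np

class BandWeightAdam:
    """Adam-style update for the frequency-band weights W_fc (Eqs. (11)-(13))."""
    def __init__(self, n_bands=4, lr=0.01, beta1=0.9, beta2=0.999,
                 eps=1e-8, w_min=0.0, w_max=1.0):
        self.w = np.full(n_bands, 1.0 / n_bands)   # one weight per band
        self.m = np.zeros(n_bands)                 # first-moment estimate
        self.v = np.zeros(n_bands)                 # second-moment estimate
        self.t = 0
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.w_min, self.w_max = w_min, w_max

    def step(self, grads):
        """grads: per-band gradients of the weighted loss; returns updated W_fc."""
        self.t += 1
        g_mean = np.array([float(np.mean(g)) for g in grads])    # current gradient mean
        self.m = self.b1 * self.m + (1 - self.b1) * g_mean       # Eq. (11), first moment
        self.v = self.b2 * self.v + (1 - self.b2) * g_mean ** 2  # Eq. (11), second moment
        m_hat = self.m / (1 - self.b1 ** self.t)                  # Eq. (12), bias correction
        v_hat = self.v / (1 - self.b2 ** self.t)
        self.w -= self.lr * m_hat / (np.sqrt(v_hat) + self.eps)   # Eq. (13)
        self.w = np.clip(self.w, self.w_min, self.w_max)          # keep within preset limits
        return self.w
```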

4.2.4. Perturbation Addition

Inspired by AutoAttack [44], we first introduce perturbations across different frequency bands using gradient information. To further enhance attack efficiency, we integrate both the gradient and square attack strategies. This involves identifying regions of significant gradients by computing the percentile threshold of absolute gradient values. Subsequently, positions are randomly selected within these high-gradient regions, where randomly generated square Gaussian noise or uniformly distributed perturbations are applied. This strategy aims to introduce random perturbations to increase the success rate of attacks. The specific perturbation generation algorithm is shown in Algorithm 1.
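The perturbation step described above (a gradient-sign update plus random square blocks placed in high-gradient regions) can be sketched as follows; the quantile threshold, the number of blocks, and the block placement are simplified relative to Algorithm 1 and are illustrative assumptions.

```python
import torch

def add_perturbation(delta, grad, lam=1.0, theta=0.9, s=8):
    """One perturbation step on a detached band tensor: a gradient-sign update
    everywhere, plus random square blocks of side s stamped at positions whose
    gradient magnitude exceeds the theta-quantile."""
    delta = delta - lam * grad.sign()                      # gradient-based step
    thresh = torch.quantile(grad.abs().flatten(), theta)   # high-gradient threshold
    idx = (grad.abs() > thresh).nonzero()                  # candidate positions
    if len(idx) > 0:
        # stamp a few random square blocks at randomly chosen candidate positions
        for pos in idx[torch.randperm(len(idx))[:4]]:
            h, w = int(pos[-2]), int(pos[-1])
            block = lam * torch.sign(torch.randn(s, s))
            patch = delta[..., h:h + s, w:w + s]           # view into delta
            patch -= block[:patch.shape[-2], :patch.shape[-1]]
    return delta
```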

4.3. Frequency-Based Surrogate Ensemble Black-Box Attack

4.3.1. Model Weight Optimization

To further enhance the efficiency of black-box attacks, it is necessary to address the optimization of the weights $W$, which guides the perturbation generator to produce adversarial perturbations that can quickly deceive the target model. Assuming the target model is $F$ and its adversarial loss is $L_m$, this optimization can be expressed as Equation (14).
$$x_{adv}(W, W_{fc}) = FPG(x), \qquad W^{*} = \arg\min_W \, L_m\big(F(x_{adv}(W, W_{fc})),\, y\big) \tag{14}$$

4.3.2. Frequency-Based Surrogate Ensemble Black-Box Attack (FSEBA)

We initialize the weights of all surrogate models to the same value and generate the initial adversarial image $x_{adv}(W, W_{fc})$. The attack halts if it succeeds against the target model $F$; otherwise, we update the weights $W$ and generate a new adversarial image. Following the method described in [36], we select the $n$-th model from the surrogate models and create two sets of weights, $W^{+}$ and $W^{-}$, using the learning rate $\eta$. Algorithm 1 is used to generate two adversarial samples $x_{adv}(W^{+}, W_{fc})$ and $x_{adv}(W^{-}, W_{fc})$, which are then queried against the target model $F$ to obtain the outputs $\{p^{+}, p^{-}\}$ and compute the model losses $L_F^{+}$ and $L_F^{-}$. If the attack succeeds, it stops; otherwise, the weights are iteratively updated based on the $W$ corresponding to the minimal loss. It is important to note that the weights $W$ are normalized to ensure they remain non-negative and sum to 1. The frequency-based surrogate ensemble black-box attack is summarized in Algorithm 2.
Algorithm 1 Frequency-Based Perturbation Generator (FPG)
Input: Clean image $x$ with label $y$, target label $y'$ (for untargeted attacks $y' \neq y$), ensemble surrogate models $\mathbb{F} = \{F_1, F_2, \ldots, F_n\}$, initial ensemble weights $W = \{w_1, w_2, \ldots, w_n\}$, initial band weights $W_{fc} = [w_{cA_1}, w_{cH_1}, w_{cV_1}, w_{cD_1}]$, initial perturbation $\delta_0$, step size $\lambda$, max perturbation $\epsilon$, maximum iterations $T$, perturbed block size $s$, threshold $\theta$.
Output: Adversarial image $x_{adv}(W, W_{fc})$.
1: $F_c = [cA_1, cH_1, cV_1, cD_1] \leftarrow DWT(x)$
2: $F_c.requires\_grad = \mathrm{True}$
3: $\delta \leftarrow \delta_0$    ▹ Initial perturbation.
4: $F_c(W, W_{fc}) \leftarrow F_c(W, W_{fc}) + \delta_0$
5: for $t = 0$ to $T$ do
6:   $L_e \leftarrow \sum_{i=1}^{n} w_i L_i(F_c(W, W_{fc}), y)$    ▹ Calculate the weighted loss.
7:   $W_{fc} \leftarrow adjust\_weight(W_{fc}, L_e)$    ▹ Update the band weights.
8:   $\delta \leftarrow \delta - \lambda \cdot \mathrm{sign}(\nabla_{\delta} L_e)$    ▹ Update the perturbation with the gradient.
9:   $I \leftarrow \{i \mid |\nabla_{\delta} L_e|_i > \mathrm{quantile}(|\nabla_{\delta} L_e|, \theta)\}$    ▹ Select the gradient positions above the threshold $\theta$.
10:  for $i \in I$ do
11:    $\delta_i \leftarrow \delta_i - \lambda \cdot \mathrm{sign}(\mathrm{random}(s, s))$    ▹ Add random block perturbations of size $s$.
12:  end for
13:  $\delta \leftarrow \Pi_{\epsilon}(\delta)$    ▹ Keep the perturbation within the norm constraint.
14:  $F_c(W, W_{fc}) \leftarrow F_c(W, W_{fc}) + \delta$
15: end for
16: $x_{adv}(W, W_{fc}) \leftarrow IDWT(F_c(W, W_{fc}))$
17: return $x_{adv}(W, W_{fc})$
Algorithm 2 Frequency-Based Surrogate Ensemble Black-Box Attack (FSEBA)
Input: Clean image $x$ with label $y$, target label $y'$ (for untargeted attacks $y' \neq y$), maximum query times $Q$, target model $F$, ensemble surrogate models $\mathbb{F} = \{F_1, F_2, \ldots, F_N\}$, learning rate $\eta$, FPG.
Output: Adversarial image $x_{adv}(W, W_{fc})$.
1: Initialize $\delta_0 = 0$, $q = 0$, $W = \{w_i = \frac{1}{N} \mid i = 1, 2, \ldots, N\}$, $W_{fc}$
2: $x_{adv}(W, W_{fc}) = FPG(x, W, W_{fc}, \delta_0)$
3: $p = F(x_{adv}(W, W_{fc}))$
4: $q = q + 1$
5: if $\arg\max p = y'$ then
6:   break
7: end if
8: while $q < Q$ do
9:   for $n = 0$ to $N$ do
10:    $W^{+} \leftarrow w_n + \eta$, $W^{-} \leftarrow w_n - \eta$
11:    $x_{adv}(W^{\pm}, W_{fc}) \leftarrow FPG(x, W^{\pm}, W_{fc}, \delta_0)$
12:    $L_F^{\pm}, p^{\pm} \leftarrow F(x_{adv}(W^{\pm}, W_{fc}))$
13:    $q = q + 2$
14:    $W \leftarrow \{W \mid \min\{L_F^{+}(W^{+}), L_F^{-}(W^{-})\}\}$
15:    if $\arg\max p^{\pm} = y'$ then
16:      break
17:    end if
18:  end for
19: end while
20: return $x_{adv}(W, W_{fc})$
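As a hedged Python sketch of the outer search in Algorithm 2, the loop below perturbs one surrogate weight at a time by ±η, queries the target model with both candidate images, and keeps the weight vector with the smaller loss; query_victim, victim_loss, and fpg are placeholders for black-box access, a loss on the returned probabilities (e.g. a C&W-style margin), and Algorithm 1, respectively.

```python
import torch

def normalize(w):
    """Keep the surrogate weights non-negative and summing to one."""
    w = torch.clamp(w, min=0.0)
    return w / w.sum()

def fseba(x, y_target, query_victim, victim_loss, fpg,
          n_models, eta=0.005, max_queries=50):
    """Coordinate-wise search over the surrogate weights W (cf. Algorithm 2)."""
    W = torch.full((n_models,), 1.0 / n_models)
    W_fc = torch.full((4,), 0.25)                # one weight per frequency band
    x_adv = fpg(x, W, W_fc)
    probs = query_victim(x_adv)
    queries = 1
    if int(probs.argmax(dim=1)) == y_target:     # success on the first query
        return x_adv, queries
    while queries < max_queries:
        for n in range(n_models):
            W_plus, W_minus = W.clone(), W.clone()
            W_plus[n] += eta
            W_minus[n] -= eta
            cands = [fpg(x, normalize(Wn), W_fc) for Wn in (W_plus, W_minus)]
            outs = [query_victim(c) for c in cands]
            queries += 2
            losses = [victim_loss(o, y_target) for o in outs]
            best = 0 if losses[0] <= losses[1] else 1
            W = normalize((W_plus, W_minus)[best])    # keep the better weights
            if int(outs[best].argmax(dim=1)) == y_target:
                return cands[best], queries
            if queries >= max_queries:
                break
    return x_adv, queries
```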

5. Experiments

5.1. Dataset and Target Model

To effectively evaluate and compare the performance of different algorithms, we utilized the NeurIPS-17 dataset [45]. This dataset comprises high-quality images that are similar to those commonly found in the ImageNet dataset. Each image is accompanied by detailed category labels and target labels, spanning a wide variety of common object categories, which enhances the diversity and challenge of the tasks. In this study, we followed the experimental setup of [36], and utilized a subset of 1000 images from the NeurIPS-17 dataset for our evaluations. These 1000 images provide a comprehensive basis for testing the robustness of different algorithms in adversarial settings.
Additionally, we selected three target models—VGG19 [46], DenseNet-121 [47], and ResNeXt-50 [48]—for simulating the attacks. These models were chosen due to their distinct architectures and varying complexities, which offer a broad range of scenarios for evaluating the effectiveness of adversarial attacks. The top-1 accuracy, top-5 accuracy, and number of parameters of these models are presented in Table 1, providing an overview of their performance and computational requirements under our experimental conditions.

5.2. Ensemble of Surrogate Models

When selecting surrogate models, we followed the setup in [36] and chose models other than the target models from PyTorch Torchvision [49] to construct our surrogate model set. This library offers a variety of open-source models pre-trained on large datasets such as ImageNet [50], including classic architectures like ResNet [51], VGG [46], Inception [52], and DenseNet [47], as well as emerging architectures such as EfficientNet [53], MobileNet [54], and Vision Transformer [55]. In recent years, with increased computing power, we have been able to use these models more effectively to enhance the capability of adversarial attacks. By integrating these diverse models, we aim to improve the adaptability and robustness of our perturbation generator, thereby better addressing the challenges posed by different image features.
Specifically, we utilize a total of 78 models as the surrogate model set, including [alexnet, vgg11, vgg13, vgg16, vgg19, vgg11_bn, vgg13_bn, vgg16_bn, vgg19_bn, resnet18, resnet34, resnet50, resnet101, resnet152, resnext50_32x4d, resnext101_32x8d, resnext101_64x4d, wide_resnet50_2, wide_resnet101_2, squeezenet1_0, squeezenet1_1, densenet121, densenet161, densenet169, densenet201, inception_v3, googlenet, shufflenet_v2_x0_5, shufflenet_v2_x1_0, shufflenet_v2_x1_5, shufflenet_v2_x2_0, mobilenet_v2, mobilenet_v3_large, mobilenet_v3_small, mnasnet0_5, mnasnet0_75, mnasnet1_0, mnasnet1_3, efficientnet_b0, efficientnet_b1, efficientnet_b2, efficientnet_b3, efficientnet_b4, efficientnet_b5, efficientnet_b6, efficientnet_b7, efficientnet_v2_s, efficientnet_v2_m, efficientnet_v2_l, regnet_y_400mf, regnet_y_800mf, regnet_y_1_6gf, regnet_y_3_2gf, regnet_y_8gf, regnet_y_16gf, regnet_y_32gf, regnet_x_400mf, regnet_x_800mf, regnet_x_1_6gf, regnet_x_3_2gf, regnet_x_8gf, regnet_x_16gf, regnet_x_32gf, vit_b_16, vit_b_32, vit_l_16, vit_l_32, convnext_tiny, convnext_small, convnext_base, convnext_large, swin_t, swin_s, swin_b, swin_v2_t, swin_v2_s, swin_v2_b, maxvit_t]. All of these models are derived from pre-trained models in PyTorch, and we do not conduct any additional training. In the experiments, once the target model is selected, it is removed from the surrogate model set to ensure the implementation of black-box attacks.
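A minimal sketch of assembling the surrogate set from torchvision pre-trained weights and excluding the chosen target model is shown below; only a handful of model names are listed for brevity, and torchvision.models.get_model assumes a recent torchvision release (0.14 or later).

```python
import torchvision.models as tvm

# A few representative names; the full surrogate set in the paper lists 78 models.
surrogate_names = ['alexnet', 'vgg19', 'resnet50', 'densenet121',
                   'inception_v3', 'mobilenet_v2', 'vit_b_16']
target_name = 'densenet121'

# Load each pre-trained model in evaluation mode and drop the chosen target
# so that the attack on it remains strictly black-box.
surrogates = [tvm.get_model(name, weights='DEFAULT').eval()
              for name in surrogate_names if name != target_name]
```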

5.3. Experimental Environment

We design our experiments to generate adversarial samples on a backend server, which are used to attack a DNN-based UAV vision system. The server is equipped with an Intel(R) Xeon(R) Silver 4310 CPU @ 2.10 GHz, an NVIDIA A100 GPU, and 251 GB of DDR4 RAM, and it runs Ubuntu 20.04.5 LTS. The software setup includes Python 3.10.14 and PyTorch 1.13.1.
The UAV used is a DJI Matrice 300 (Figure 2), which carries the DJI Manifold 2 onboard computing system. This system utilizes an Intel(R) Core(TM) i7-8550U CPU and an NVIDIA Jetson TX2 GPU. We deployed the pre-trained target model directly onto the UAV for operational testing.

5.4. Evaluation Indicators

We evaluate our method under the $L_{\infty}$ norm bound, with a common perturbation budget of $L_{\infty} \leq 16$ on a pixel scale of 0–255. The algorithm's superiority is assessed using the Attack Success Rate (ASR) and the Average Query Count (AQC) as performance metrics, as described in Equation (15).
$$ASR = \frac{N_{suc}}{N}, \qquad AQC = \frac{1}{N_{suc}} \sum_{i=1}^{N_{suc}} q_i \tag{15}$$
where $q_i$ represents the number of queries required for the $i$-th successful attack, $N_{suc}$ represents the number of samples for which the attack is successful, and $N$ is the total number of samples.
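For reference, the two metrics of Equation (15) can be computed as in the short sketch below, where the list of per-attack query counts and the totals are illustrative values.

```python
def attack_metrics(queries_of_successes, n_total):
    """ASR = N_suc / N;  AQC = average query count over successful attacks."""
    n_suc = len(queries_of_successes)
    asr = n_suc / n_total
    aqc = sum(queries_of_successes) / n_suc if n_suc > 0 else float('nan')
    return asr, aqc

# Illustrative values: 98 successes out of 100 images, each needing 1-2 queries.
asr, aqc = attack_metrics([1] * 60 + [2] * 38, 100)
print(f"ASR = {asr:.1%}, AQC = {aqc:.2f}")
```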

5.5. Ablation Experiment

We conduct ablation experiments on the adversarial loss ($Loss$), weight loss ($WLoss$), learning rate ($lr$), and step size ($\lambda$) parameters of the algorithm. This paper presents the impact of different hyperparameter choices on the proposed algorithm under targeted and untargeted attack scenarios, as illustrated in Figure 3 and Figure 4. Notably, we use the first 100 images as the dataset and restrict the maximum number of queries to 20. For the target model, we choose DenseNet-121.
Adversarial loss ($Loss$). As shown in Figure 3a and Figure 4a, the adversarial losses Cross-Entropy ($Loss_{ce}$) and C&W ($Loss_{cw}$) exhibit similar performance in both attack settings. To facilitate the identification of attack success, we selected $Loss_{cw}$.
Weight loss ($WLoss$). When analyzing the impact of the weight loss on the ASR under targeted (Figure 3b) and untargeted attacks (Figure 4b), we find that $Loss$ (Equation (10)) maintains a high ASR with fewer queries in both scenarios. In contrast, while the $logits$ (Equation (9)) and $prob$ (Equation (8)) loss functions gradually improve the ASR with more queries, their success rates are lower in the initial few queries. Therefore, to achieve a high ASR with the minimum number of queries, using the $Loss$ function is the optimal strategy.
Learning rate ($lr$). As shown in Figure 3c and Figure 4c, different learning rates achieve optimal results after multiple queries (5–6). When $lr = 0.005$, our method quickly reaches near-perfect success rates with just 1–2 queries and then maintains stability. Compared to other learning rates, this setting offers a significant advantage by achieving better performance with fewer queries.
Step size ($\lambda$). The results indicate that the impact of different step sizes on the ASR is most significant with fewer queries. When the step size is set to $3\lambda$, it demonstrates the most stable performance, achieving 100% ASR with just two queries for targeted attacks (Figure 3d) and one query for untargeted attacks (Figure 4d). In contrast, $\lambda$ performs the worst, and performance declines beyond $5\lambda$.
Moreover, we conduct a comparative experiment to compare the efficacy of image-space versus frequency-space attacks. Using DenseNet-121 as the target model, we selected 100 images and applied the same perturbation technique in both the image space and the frequency domain. The results, illustrated in Figure 5, show the relationship between the number of queries and the ASR for untargeted and targeted attacks. The findings indicate that both approaches yield a high ASR; however, attacks implemented in the frequency domain are notably more efficient, requiring only one or two queries.

5.6. Comparison Experiment

We compared our method with several state-of-the-art black-box attack techniques, including BASES [36], GFCS [35], TREMBA [18], ODS [34], and P-RGF [17]. BASES [36] is a powerful ensemble surrogate attack method capable of successfully executing attacks with very few queries. TREMBA [18] seeks effective perturbations by adjusting the latent code of the surrogate model generator, while GFCS [35] uses surrogate gradient directions to probe the target model. ODS [34] and P-RGF [17] are two attack methods based on transferable prior knowledge.
In Table 2, our method clearly excels under untargeted attack conditions on a range of deep learning models such as VGG-19, DenseNet-121, and ResNeXt-50, where our ASR approaches a near-perfect 100% across all tested architectures. This is a significant leap over other methods; for instance, the P-RGF method achieves a lower ASR of 93.5% with 156 queries on VGG-19, highlighting our method’s superior effectiveness. More impressively, our method achieves this high efficiency with an exceptionally low AQC of only 1.0, compared to other approaches that require much higher query counts such as ODS with 38 queries on VGG-19 to achieve a 99.9% ASR. This drastic reduction in the number of queries not only highlights the efficiency of our approach but also enhances the feasibility of conducting attacks discreetly without triggering defensive mechanisms that could mitigate the attack’s impact. This efficiency in query usage combined with the high success rates positions our method as a highly effective and practical solution in adversarial scenarios.
Targeted attacks represent a more challenging scenario, as they require specific outcomes, making the high performance of our method in such situations particularly noteworthy. As detailed in Table 3, our method achieves an impressive ASR of 98.7% on VGG-19 with a minimal AQC of only 1.2. This efficiency extends across other models as well, with nearly perfect ASRs on DenseNet-121 and ResNeXt-50, paired with an equally low AQC, demonstrating robust performance across diverse architectures. This contrasts sharply with other methods like BASES and GFCS, which, despite achieving high ASRs, require significantly more queries, such as 76 for GFCS on DenseNet-121 to reach a 95.2% ASR. Our method’s ability to maintain high accuracy with fewer queries not only enhances the stealthiness and speed of the attack but also positions our approach as an optimal solution in black-box settings where every query increases the risk of detection and countermeasures by defensive systems.

5.7. Effect of Attacking Robust Models

We evaluate the efficacy of our method against models equipped with adversarial defense mechanisms. Specifically, we targeted the robust models ResNet-50$_{rob}$ [56] and Wide-ResNet-101$_{rob}$ [57] from RobustBench [58], which are outfitted with defense strategies. We also compare our approach to the advanced BASES method. These experiments were conducted in an untargeted attack setting, and the results are depicted in Figure 6. To fully assess the impact of attacks, we also tested these models without defenses. The findings indicate that models with adversarial defenses are relatively more difficult to compromise, generally exhibiting a decline in ASR. Compared to the BASES method, our approach demonstrated superior attack capabilities on the defended models, with a less pronounced reduction in ASR.

5.8. Visualization of Adversarial Examples

In this section, we use DenseNet-121 as the target model for targeted attacks and present various adversarial samples and their class activation map [59] (CAM) visualizations, as shown in Figure 7. Analyzing the differences in adversarial samples and CAM visualizations, as well as the performance of our method, reveals the following. Firstly, the goal of generating adversarial samples is typically to mislead the model without significantly altering the image appearance. In these cases, adversarial samples generated by our method are visually very similar to the original images, indicating that their perturbations are less perceptible to the human eye. Although perturbations from ODS, GFCS, and BASES also exhibit some level of stealthiness, the TREMBA method produces samples with more noticeable changes, which might be more easily detected in practical applications. For Figure 7b, the CAM visualizes the focal areas of the image that influence the model’s decision. The CAM for adversarial samples generated by our method shows a significant shift in the focal area, suggesting that adversarial perturbations fine-tune pixel-level features and may alter the model’s structural understanding. For instance, while the model might focus on the windows of a building or the wings of a butterfly in the original image, the focus might shift to the building’s edges or the butterfly’s body in the adversarial samples. These CAM differences reveal the strategies and effectiveness of our method in deceiving the model.

6. Limitations

The proposed adversarial attack technique for UAV vision systems is highly efficient in reducing the number of queries needed for successful attacks, but it also presents two manageable limitations.
Firstly, while the method does necessitate significant computational resources due to the use of a large set of surrogate models for generating adversarial perturbations, it is important to note that this computation is primarily performed as white-box calculations on the surrogate models and does not involve the target model.
Secondly, although generating adversarial perturbations on these surrogate models can be time-consuming (each query takes close to 50 s), leading to longer durations per attack, the critical advantage lies in the minimal interaction required with the target model. Each image typically necessitates only one or two queries to execute a successful attack, substantially reducing the risk of detection by defense mechanisms that monitor query volumes. This efficiency in querying the target model ensures effectiveness in adversarial environments.
These characteristics highlight a strategic trade-off between computational intensity and operational stealthiness, making the method particularly suitable for applications where reducing the detectability of attacks is crucial. Despite its limitations, the technique’s ability to efficiently manage computational tasks and minimize interactions with the target model suggests substantial potential for further refinement and application in dynamic or resource-constrained contexts.

7. Discussion

The results of this paper highlight significant advancements in adversarial attacks on DNNs specifically designed for UAV vision systems. Our innovative method employs a frequency decomposition-based perturbation generator and integrates both query-based and transfer-based attack strategies, achieving considerable enhancements over previous methods.
Our strategy effectively exploits both high-frequency and low-frequency perturbations, using frequency domain analysis to accurately disrupt model predictions. This demonstrates that strategic perturbations across various frequency bands can significantly influence model performance, supporting and expanding on previous studies in frequency-based adversarial attacks [23,24]. By combining query-based and transfer-based approaches, our hybrid attack strategy not only achieves high success rates but does so with fewer queries. This represents a critical improvement over earlier methods that struggled with balancing query costs against attack effectiveness [14,15], marking a significant advancement for practical applications where minimizing queries is essential.
As we advance our research, we are aware of the ethical and legal implications of adversarial attacks, particularly in sensitive applications such as UAV systems. The exploration of adversarial attacks is crucial, not to exploit vulnerabilities maliciously but to understand potential weaknesses within DNNs employed in UAVs. This knowledge is imperative for developing more robust models that can withstand malicious attempts to compromise their functionality, thereby ensuring safer and more reliable UAV operations in various fields, from surveillance to delivery systems.
In the future, we plan to investigate the specific effects of different frequency bands on attack success to further refine our perturbation strategies. We also intend to develop adaptive defenses capable of countering adversarial perturbations across various frequency ranges, a crucial step for enhancing model robustness. Additionally, future work will extend the validation of our approach to include datasets containing UAV-specific imagery to verify the adaptability and effectiveness of our proposed adversarial attacks in more targeted real-world scenarios. This effort will help substantiate the practical applicability of our findings, particularly in UAV visual systems, while continuously addressing the broader ethical considerations by promoting the creation of more secure AI systems.

8. Conclusions

In this paper, we introduce a novel black-box adversarial attack method targeting DNNs in drone vision systems. This method integrates multiple surrogate models and employs a frequency-decomposition-based perturbation generator to search for and create adversarial samples in the frequency domain. Compared to existing techniques, this method significantly enhances attack effectiveness with the same number of queries. By combining query-based and transfer-based attack strategies, our method achieves high success rates with minimal queries. In targeted attacks, it reaches over 98% ASR, and in non-targeted attacks, it approaches a 100% success rate, outperforming state-of-the-art methods. Each image requires only 1-2 queries to successfully execute the attack, significantly reducing the risk of detection by query-based defenses. We highlight the importance of frequency domain analysis in designing robust adversarial attacks and point out potential areas for further improvement. Future research may explore the impact of different frequency bands on adversarial effectiveness and develop adaptive defenses to counter these complex attack methods.

Author Contributions

Conceptualization, Z.Z. and Q.L.; Methodology, Z.Z., Q.L., Z.W. and S.Q.; Validation, Z.Z. and Q.L.; Formal analysis, Z.Z. and Z.W.; Investigation, Z.Z. and W.D.; Data curation, Z.Z. and W.D.; Writing—original draft, Z.Z. and Q.L.; Writing—review and editing, Z.Z. and S.Z.; Visualization, Z.Z.; Supervision, S.Z.; Project administration, Q.L., S.Z. and S.Q.; Funding acquisition, S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (62272089), the Open Project of the Intelligent Terminal Key Laboratory of Sichuan Province (SCITLAB-30003), and the Sichuan Province Science and Technology Department key research and development project (2020YFG0472).

Data Availability Statement

Data are contained within the article.

DURC Statement

Current research is limited to addressing robustness-related risks faced by intelligent models of DNN-based UAV vision systems, which is beneficial for building more robust target recognition models and does not pose a threat to public health or national security. Authors acknowledge the dual-use potential of the research involving DNN-based UAV vision systems and confirm that all necessary precautions have been taken to prevent potential misuse. As an ethical responsibility, authors strictly adhere to relevant national and international laws about DURC. Authors advocate for responsible deployment, ethical considerations, regulatory compliance, and transparent reporting to mitigate misuse risks and foster beneficial outcomes.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the DURC Statement. This change does not affect the scientific content of the article.

References

  1. Istiak, M.A.; Syeed, M.M.; Hossain, M.S.; Uddin, M.F.; Hasan, M.; Khan, R.H.; Azad, N.S. Adoption of Unmanned Aerial Vehicle (UAV) imagery in agricultural management: A systematic literature review. Ecol. Inform. 2023, 78, 102305. [Google Scholar] [CrossRef]
  2. Engesser, V.; Rombaut, E.; Vanhaverbeke, L.; Lebeau, P. Autonomous delivery solutions for last-mile logistics operations: A literature review and research agenda. Sustainability 2023, 15, 2774. [Google Scholar] [CrossRef]
  3. Roberts, N.B.; Ager, E.; Leith, T.; Lott, I.; Mason-Maready, M.; Nix, T.; Gottula, A.; Hunt, N.; Brent, C. Current summary of the evidence in drone-based emergency medical services care. Resusc. Plus 2023, 13, 100347. [Google Scholar] [CrossRef]
  4. Li, Y.; Fan, Q.; Huang, H.; Han, Z.; Gu, Q. A modified YOLOv8 detection network for UAV aerial image recognition. Drones 2023, 7, 304. [Google Scholar] [CrossRef]
  5. Chen, K.; Chen, B.; Liu, C.; Li, W.; Zou, Z.; Shi, Z. Rsmamba: Remote sensing image classification with state space model. IEEE Geosci. Remote Sens. Lett. 2024, 21, 8002605. [Google Scholar] [CrossRef]
  6. Zeng, L.; Chen, H.; Feng, D.; Zhang, X.; Chen, X. A3D: Adaptive, Accurate, and Autonomous Navigation for Edge-Assisted Drones. IEEE/Acm Trans. Netw. 2023, 32, 713–728. [Google Scholar] [CrossRef]
  7. Hadi, H.J.; Cao, Y.; Li, S.; Xu, L.; Hu, Y.; Li, M. Real-time fusion multi-tier DNN-based collaborative IDPS with complementary features for secure UAV-enabled 6G networks. Expert Syst. Appl. 2024, 252, 124215. [Google Scholar] [CrossRef]
  8. Akshya, J.; Neelamegam, G.; Sureshkumar, C.; Nithya, V.; Kadry, S. Enhancing UAV Path Planning Efficiency through Adam-Optimized Deep Neural Networks for Area Coverage Missions. Procedia Comput. Sci. 2024, 235, 2–11. [Google Scholar]
  9. Dutta, A.; Das, S.; Nielsen, J.; Chakraborty, R.; Shah, M. Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 22678–22690. [Google Scholar]
  10. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and harnessing adversarial examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
  11. Chakraborty, A.; Alam, M.; Dey, V.; Chattopadhyay, A.; Mukhopadhyay, D. A survey on adversarial attacks and defences. CAAI Trans. Intell. Technol. 2021, 6, 25–45. [Google Scholar] [CrossRef]
  12. Long, T.; Gao, Q.; Xu, L.; Zhou, Z. A survey on adversarial attacks in computer vision: Taxonomy, visualization and future directions. Comput. Secur. 2022, 121, 102847. [Google Scholar] [CrossRef]
  13. Baniecki, H.; Biecek, P. Adversarial attacks and defenses in explainable artificial intelligence: A survey. Inf. Fusion 2024, 107, 102303. [Google Scholar] [CrossRef]
  14. Brendel, W.; Rauber, J.; Bethge, M. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
  15. Zhou, M.; Wu, J.; Liu, Y.; Liu, S.; Zhu, C. Dast: Data-free substitute training for adversarial attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 234–243. [Google Scholar]
  16. Guo, Y.; Yan, Z.; Zhang, C. Subspace attack: Exploiting promising subspaces for query-efficient black-box attacks. Adv. Neural Inf. Process. Syst. 2019, 32, 3820–3829. [Google Scholar]
  17. Cheng, S.; Dong, Y.; Pang, T.; Su, H.; Zhu, J. Improving black-box adversarial attacks with a transfer-based prior. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar] [CrossRef]
  18. Huang, Z.; Zhang, T. Black-Box Adversarial Attack with Transferable Model-based Embedding. In Proceedings of the 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020. [Google Scholar]
  19. Xiang, S.; Liang, Q. Remote sensing image compression based on high-frequency and low-frequency components. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5604715. [Google Scholar] [CrossRef]
  20. Lin, Y.; Xie, Z.; Chen, T.; Cheng, X.; Wen, H. Image privacy protection scheme based on high-quality reconstruction DCT compression and nonlinear dynamics. Expert Syst. Appl. 2024, 257, 124891. [Google Scholar] [CrossRef]
  21. Sharma, Y.; Ding, G.W.; Brubaker, M.A. On the effectiveness of low frequency perturbations. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, 10–16 August 2019; pp. 3389–3396. [Google Scholar]
  22. Guo, C.; Gardner, J.; You, Y.; Wilson, A.G.; Weinberger, K. Simple black-box adversarial attacks. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 2484–2493. [Google Scholar]
  23. Wang, H.; Wu, X.; Huang, Z.; Xing, E.P. High-frequency component helps explain the generalization of convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8684–8694. [Google Scholar]
  24. Yin, D.; Gontijo Lopes, R.; Shlens, J.; Cubuk, E.D.; Gilmer, J. A fourier perspective on model robustness in computer vision. Adv. Neural Inf. Process. Syst. 2019, 32, 13276–13286. [Google Scholar]
  25. Dong, Y.; Liao, F.; Pang, T.; Su, H.; Zhu, J.; Hu, X.; Li, J. Boosting adversarial attacks with momentum. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9185–9193. [Google Scholar]
  26. Xie, C.; Zhang, Z.; Zhou, Y.; Bai, S.; Wang, J.; Ren, Z.; Yuille, A.L. Improving transferability of adversarial examples with input diversity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2730–2739. [Google Scholar]
  27. Liu, Y.; Chen, X.; Liu, C.; Song, D. Delving into Transferable Adversarial Examples and Black-box Attacks. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017. [Google Scholar]
  28. Yuan, Z.; Zhang, J.; Jia, Y.; Tan, C.; Xue, T.; Shan, S. Meta gradient adversarial attack. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 7748–7757. [Google Scholar]
  29. Ma, C.; Chen, L.; Yong, J.H. Simulating unknown target models for query-efficient black-box attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 11835–11844. [Google Scholar]
  30. Brunner, T.; Diehl, F.; Le, M.T.; Knoll, A. Guessing smart: Biased sampling for efficient black-box adversarial attacks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 4958–4966. [Google Scholar]
  31. Chen, P.Y.; Zhang, H.; Sharma, Y.; Yi, J.; Hsieh, C.J. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA, 3 November 2017; pp. 15–26. [Google Scholar]
  32. Tu, C.C.; Ting, P.; Chen, P.Y.; Liu, S.; Zhang, H.; Yi, J.; Hsieh, C.J.; Cheng, S.M. Autozoom: Autoencoder-based zeroth order optimization method for attacking black-box neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 742–749. [Google Scholar]
  33. Ilyas, A.; Engstrom, L.; Madry, A. Prior Convictions: Black-box Adversarial Attacks with Bandits and Priors. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
  34. Tashiro, Y.; Song, Y.; Ermon, S. Diversity can be transferred: Output diversification for white-and black-box attacks. Adv. Neural Inf. Process. Syst. 2020, 33, 4536–4548. [Google Scholar]
  35. Lord, N.A.; Mueller, R.; Bertinetto, L. Attacking deep networks with surrogate-based adversarial black-box methods is easy. arXiv 2022, arXiv:2203.08725. [Google Scholar]
  36. Cai, Z.; Song, C.; Krishnamurthy, S.; Roy-Chowdhury, A.; Asif, S. Blackbox attacks via surrogate ensemble search. Adv. Neural Inf. Process. Syst. 2022, 35, 5348–5362. [Google Scholar]
  37. Feng, Y.; Wu, B.; Fan, Y.; Liu, L.; Li, Z.; Xia, S.T. Boosting black-box attack with partially transferred conditional adversarial distribution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 15095–15104. [Google Scholar]
  38. Mohaghegh Dolatabadi, H.; Erfani, S.; Leckie, C. Advflow: Inconspicuous black-box adversarial attacks using normalizing flows. Adv. Neural Inf. Process. Syst. 2020, 33, 15871–15884. [Google Scholar]
  39. Al-Dujaili, A.; O’Reilly, U.M. Sign bits are all you need for black-box attacks. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020. [Google Scholar]
  40. Yin, F.; Zhang, Y.; Wu, B.; Feng, Y.; Zhang, J.; Fan, Y.; Yang, Y. Generalizable black-box adversarial attack with meta learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 46, 1804–1818. [Google Scholar] [CrossRef]
  41. Antonini, M.; Barlaud, M.; Mathieu, P.; Daubechies, I. Image coding using wavelet transform. IEEE Trans. Image Process. 1992, 1, 205–220. [Google Scholar] [CrossRef]
  42. Wu, Z.; Lim, S.N.; Davis, L.S.; Goldstein, T. Making an invisibility cloak: Real world adversarial attacks on object detectors. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part IV 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 1–17. [Google Scholar]
  43. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  44. Croce, F.; Hein, M. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 13–18 July 2020; pp. 2206–2216. [Google Scholar]
  45. Google Brain. Neurips 2017: Targeted Adversarial Attack. 2017. Available online: https://www.kaggle.com/competitions/nips-2017-targeted-adversarial-attack/data (accessed on 15 October 2024).
  46. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  47. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  48. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500. [Google Scholar]
  49. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32, 8026–8037. [Google Scholar]
  50. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Kyoto, Japan, 29 September–2 October 2009; IEEE: New York, NY, USA, 2009; pp. 248–255. [Google Scholar]
  51. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  52. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
  53. Tan, M.; Le, Q.V. Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv 2019, arXiv:1905.11946. [Google Scholar]
  54. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520. [Google Scholar]
  55. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations, Virtual Event, 3–7 May 2021. [Google Scholar]
  56. Salman, H.; Ilyas, A.; Engstrom, L.; Kapoor, A.; Madry, A. Do adversarially robust imagenet models transfer better? Adv. Neural Inf. Process. Syst. 2020, 33, 3533–3545. [Google Scholar]
  57. Peng, S.; Xu, W.; Cornelius, C.; Hull, M.; Li, K.; Duggal, R.; Phute, M.; Martin, J.; Chau, D.H. Robust principles: Architectural design principles for adversarially robust cnns. arXiv 2023, arXiv:2308.16258. [Google Scholar]
  58. Croce, F.; Andriushchenko, M.; Sehwag, V.; Debenedetti, E.; Flammarion, N.; Chiang, M.; Mittal, P.; Hein, M. RobustBench: A standardized adversarial robustness benchmark. arXiv 2020, arXiv:2010.09670. [Google Scholar]
  59. Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2921–2929. [Google Scholar]
Figure 1. Architecture of our attack. We first design a frequency-based perturbation generator that decomposes images into the frequency domain and searches for and adds perturbations across the frequency bands to generate adversarial samples. Subsequently, by querying the target model to obtain loss feedback, we update the frequency-band weights W_fc and the loss weights W.
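To make the frequency-decomposition step of Figure 1 concrete, the following is a minimal sketch, assuming the PyWavelets (pywt) and NumPy libraries and a single-channel image; the perturbation tensors deltas and the band weights band_weights are illustrative placeholders, not the paper's actual generator or weight-update rule.

```python
# Minimal sketch of perturbing an image in the wavelet domain (cf. Figure 1).
# Assumptions: PyWavelets (pywt) and NumPy are available; the image is a
# single-channel array in [0, 1]; deltas and band_weights are placeholders.
import numpy as np
import pywt

def perturb_in_wavelet_domain(image, deltas, band_weights, wavelet="haar"):
    # One-level 2D DWT yields four bands: approximation (LL) and details (LH, HL, HH).
    LL, (LH, HL, HH) = pywt.dwt2(image, wavelet)
    bands = [LL, LH, HL, HH]
    # Add a weighted perturbation to each band (the weights play the role of W_fc).
    perturbed = [b + w * d for b, d, w in zip(bands, deltas, band_weights)]
    LLp, LHp, HLp, HHp = perturbed
    # Inverse DWT maps the perturbed bands back to image space.
    adv = pywt.idwt2((LLp, (LHp, HLp, HHp)), wavelet)
    return np.clip(adv, 0.0, 1.0)

# Example with a random 224x224 "image" and small random band perturbations.
img = np.random.rand(224, 224).astype(np.float32)
LL, (LH, HL, HH) = pywt.dwt2(img, "haar")
deltas = [0.01 * np.random.randn(*b.shape).astype(np.float32) for b in (LL, LH, HL, HH)]
band_weights = [0.25, 0.25, 0.25, 0.25]
adv = perturb_in_wavelet_domain(img, deltas, band_weights)
```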
Figure 2. The DJI M300 UAV with the DJI Manifold 2 onboard computer system deployed.
Figure 3. Impact analysis of different parameters under targeted attacks.
Figure 4. Impact analysis of different parameters under untargeted attacks.
Figure 5. Comparative experiments of attacks in the frequency domain and in image space. fre denotes the frequency-domain attack and pic denotes the image-space attack. All other parameters are identical; only the space in which the perturbation is added differs.
Figure 6. Effect of attacking robust models. rob indicates that the target model is equipped with adversarial defense mechanisms.
Figure 7. Adversarial examples and their class activation map (CAM) visualizations. (a) shows the samples, and (b) shows the corresponding CAM for each sample. The first row displays the normal samples, each subsequent row shows adversarial samples generated by a different attack method, and the final row shows the results of our method.
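For readers who want to reproduce visualizations like those in Figure 7, the following is a hedged sketch of class activation mapping (CAM) [59] on a ResNet-50 backbone; the backbone choice, layer names, and placeholder input are assumptions for illustration, not necessarily the exact model or pipeline used for the figure.

```python
# Sketch of class activation mapping (CAM) [59] on a torchvision ResNet-50.
# The backbone and layer names are illustrative assumptions; in practice the
# input images should be ImageNet-normalized rather than random tensors.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights="IMAGENET1K_V1").eval()
feats = {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(out=o))

def cam(images, class_idx):
    """Return a CAM heatmap, upsampled to the input resolution, for one class."""
    logits = model(images)                       # forward pass fills feats["out"]
    fmap = feats["out"]                          # (B, 2048, 7, 7) conv features
    w = model.fc.weight[class_idx]               # (2048,) classifier weights for the class
    heat = torch.einsum("c,bchw->bhw", w, fmap)  # weighted sum over channels
    heat = F.interpolate(heat.unsqueeze(1), size=images.shape[-2:],
                         mode="bilinear", align_corners=False).squeeze(1)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)  # normalize to [0, 1]
    return heat, logits

x = torch.rand(1, 3, 224, 224)                   # placeholder input image
heatmap, logits = cam(x, class_idx=0)            # class index chosen arbitrarily
```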
Table 1. Black-box target models with their top-1 accuracy, top-5 accuracy, and number of parameters.
Models          Top-1     Top-5     Parameters
VGG-19          72.37%    90.87%    143.7 M
DenseNet-121    74.43%    91.97%    8.0 M
ResNeXt-50      81.19%    95.34%    25.0 M
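As a point of reference, the three target classifiers in Table 1 correspond to standard torchvision architectures [46,47,48,49]; the snippet below is a sketch of loading their ImageNet-pretrained weights and counting parameters, assuming a recent torchvision release (the string weight identifiers assume torchvision >= 0.13, and the released checkpoints may report slightly different accuracies).

```python
# Sketch: loading ImageNet-pretrained versions of the Table 1 target models
# with torchvision [49] and counting their parameters.
from torchvision import models

targets = {
    "VGG-19": models.vgg19(weights="IMAGENET1K_V1"),
    "DenseNet-121": models.densenet121(weights="IMAGENET1K_V1"),
    "ResNeXt-50": models.resnext50_32x4d(weights="IMAGENET1K_V1"),
}

for name, model in targets.items():
    model.eval()  # black-box victim: only outputs are queried, never gradients
    n_params = sum(p.numel() for p in model.parameters()) / 1e6
    print(f"{name}: {n_params:.1f} M parameters")
```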
Table 2. Comparison of attack success rate (ASR) and average query count (AQC) of different methods under an untargeted attack. The best-performing results are shown in bold.
Method          VGG-19            DenseNet-121      ResNeXt-50
                AQC     ASR       AQC     ASR       AQC     ASR
P-RGF [17]      156     93.5%     164     92.9%     166     92.5%
ODS [34]        38      99.9%     52      99.0%     54      98.4%
GFCS [35]       14      100.0%    16      99.9%     15      99.7%
TREMBA [18]     2.4     99.7%     5.9     99.5%     7.5     98.9%
BASES [36]      1.2     99.8%     1.2     99.9%     1.2     100.0%
Ours            1.0     99.9%     1.0     100.0%    1.0     100.0%
Table 3. Comparison of attack success rate (ASR) and average query count (AQC) of different methods under a targeted attack. The best-performing results are shown in bold.
Methods         VGG-19            DenseNet-121      ResNeXt-50
                AQC     ASR       AQC     ASR       AQC     ASR
P-RGF * [17]    -       -         -       -         -       -
ODS [34]        261     49.0%     266     49.7%     270     42.7%
GFCS [35]       101     89.1%     76      95.2%     86      92.9%
TREMBA [18]     92      89.2%     70      90.5%     100     85.1%
BASES [36]      3.0     95.9%     1.8     99.4%     1.8     99.7%
Ours            1.2     98.7%     1.1     99.7%     1.0     99.9%
* For targeted attacks under lower query budgets, P-RGF is almost ineffective (0% ASR).
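As a reading aid for Tables 2 and 3, the sketch below shows one way the two metrics could be aggregated from per-image attack logs; averaging query counts over successful attacks only is an assumption made here for illustration, not a convention stated in the tables.

```python
# Hedged sketch of aggregating ASR and AQC from per-image attack records.
# Assumption: AQC averages query counts over successful attacks only.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AttackRecord:
    success: bool   # did the adversarial example fool the target model?
    queries: int    # number of queries issued to the target model

def summarize(records: List[AttackRecord]) -> Tuple[float, float]:
    succ = [r for r in records if r.success]
    asr = 100.0 * len(succ) / len(records)                  # attack success rate (%)
    aqc = sum(r.queries for r in succ) / max(len(succ), 1)  # average query count
    return asr, aqc

# Example: 999 of 1000 images attacked successfully with a single query each.
logs = [AttackRecord(True, 1)] * 999 + [AttackRecord(False, 10)]
asr, aqc = summarize(logs)
print(f"ASR = {asr:.1f}%, AQC = {aqc:.1f}")
```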
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
