Article

Invisible Backdoor Attack Based on Dual-Frequency-Domain Transformation

1 School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450002, China
2 Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou 450001, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2025, 14(19), 3753; https://doi.org/10.3390/electronics14193753
Submission received: 2 September 2025 / Revised: 19 September 2025 / Accepted: 21 September 2025 / Published: 23 September 2025

Abstract

Backdoor attacks are recognized as a significant security threat to deep learning. Such attacks can induce models to behave abnormally on inputs that contain predefined triggers, while maintaining state-of-the-art (SOTA) performance on clean data. Research indicates that existing backdoor attacks in the spatial domain suffer from poor stealthiness and limited effectiveness. Motivated by the observation that perturbations added in the frequency domain are dispersed across the whole image, and by the idea that multiple frequency-domain transformations can achieve different levels of feature fusion, we propose a dual-frequency-domain transformation backdoor attack method called DFDT (dual-frequency-domain transformation). DFDT applies a dual-frequency-domain transformation to both clean samples and a trigger image, then conducts feature fusion in the frequency domain to enhance the stealthiness of the poisoned samples. In addition, we introduce regularization samples to reduce the latent separability of clean and poisoned samples. We thoroughly evaluate DFDT on three image datasets: CIFAR-10, GTSRB, and CIFAR-100. The experimental results show that DFDT achieves greater stealthiness and effectiveness, reaching an attack success rate (ASR) close to 100% and a benign accuracy (BA) of nearly 94%. Furthermore, we show that DFDT successfully evades state-of-the-art defenses, including STRIP, NC, and I-BAU.

1. Introduction

Deep neural networks (DNNs) are widely used in our lives [1,2,3]. The security of DNNs, particularly their vulnerability to backdoor attacks, cannot be overlooked, as it significantly endangers the protection of users' lives and property [4,5] and undermines the trustworthiness of DNNs. Backdoor attacks implant hidden behaviors during the training phase so that specific input patterns induce targeted erroneous outputs from the model, while the model maintains normal performance on benign inputs [6,7]. Typically, the performance of backdoor attacks can be evaluated from two aspects: (i) effectiveness, which refers to how reliably the backdoor attack causes the targeted model to produce erroneous outputs; and (ii) stealthiness, which refers to the extent to which the poisoned training samples are indistinguishable from the benign samples.
Due to the serious threat posed by backdoor attacks on DNNs, various strategies and techniques have been explored. Early backdoor attacks used visible patterns as triggers [8,9]. To increase the stealthiness of these triggers, recent backdoor attacks have utilized techniques such as image transformations to create invisible dynamic triggers [10,11]. Although existing backdoor attacks performed in the spatial domain perform relatively well in terms of effectiveness, their triggers typically involve direct modifications to image pixels, often manifesting as anomalous patterns or noise within the image. These patterns or noise are particularly evident in localized regions of the image. For instance, an attacker might introduce a specific pattern or texture in a corner of the image to activate the backdoor behavior. From the perspective of visual perception, the human visual system is sensitive to anomalies in local image regions. Triggers for spatial-domain attacks can also induce changes in local statistical properties of the image, such as variations in local brightness, contrast, or color distribution. This concentration facilitates the detection and removal of backdoors by defense methods [12,13,14] that analyze local image characteristics. Recent research has therefore begun exploring methods for inserting triggers in the frequency domain to further enhance their stealthiness; common frequency-domain transformations include the Discrete Cosine Transform (DCT) [15], the Discrete Wavelet Transform (DWT) [16], and Wavelet Packet Decomposition (WPD) [17]. Frequency-domain transformations disperse image information across multiple frequency components, preventing the concentration of trigger information in localized regions. By embedding triggers in the frequency domain, the trigger signal spreads throughout the entire image space, significantly reducing the probability of detection by defensive methods. However, most frequency-domain backdoor attacks rely on a single-frequency-domain transform [18,19]. Although such methods outperform traditional attacks in terms of spatial-domain stealthiness, their trigger energy remains concentrated within specific frequency bands, which leads to anomalies in frequency-domain statistical characteristics and makes them easily detectable by defense methods (e.g., STRIP [20]). Furthermore, single-frequency-domain trigger features exhibit linear separability, enabling defenders to reconstruct the trigger patterns through reverse-engineering methods such as Neural Cleanse [13]. Even with quantization compression applied to the triggers, this vulnerability remains difficult to mitigate. In addition, the relatively rigid embedding of triggers under a single-frequency-domain transformation reduces their effectiveness: because the trigger energy is concentrated within a single frequency band, such methods may not reliably activate the backdoor behavior in all scenarios, especially when facing models with strong defensive capabilities. This reduction in effectiveness further limits the potential of single-frequency-domain methods in practical applications.
To address these limitations, we propose a method based on dual-frequency-domain transformation. Unlike the linear feature mapping in single-frequency-domain attacks, DFDT disperses trigger energy across multi-scale frequency spectra through nonlinear cross-domain interactions. This cross-domain fusion reduces the correlation between trigger information and any single transformation basis, significantly increasing the complexity of adversarial reverse engineering for defenders. Figure 1 shows an overview of the attack. The dual-frequency-domain transform combines the DCT and the DWT. The selection of DCT and DWT as frequency-domain transformation tools stems from their complementary strengths: DCT effectively concentrates image energy, reduces redundancy, and preserves semantic integrity, while DWT offers multi-resolution analysis, locality, and directional selectivity. Together they improve the trigger's stealthiness and the attack's effectiveness. DCT extracts high-frequency information from an image, which mainly characterizes edge and texture features, and DWT further extracts the local details of the image. With this combination, triggers can be distributed more evenly throughout the frequency domain of the image. This distribution makes the triggers harder for defense methods to detect, because the trigger features are fused into more complex frequency-domain information, thus achieving a high degree of stealthiness. Furthermore, we introduce regularization samples to reduce the latent separability of clean and poisoned samples [21]. In summary, the main contributions of this paper are as follows:
  • We propose an invisible backdoor attack method based on dual-frequency-domain transformation, termed DFDT, which is characterized by its efficiency and stealthiness.
  • We experimentally demonstrate that the DFDT attack can effectively enhance the stealthiness of the trigger and that the poisoned samples are visually invisible to the observer.
  • Our comprehensive experiments, which compared the proposed attack with established backdoor attacks, show that DFDT not only achieves a high ASR but also significantly decreases the efficacy of certain state-of-the-art defenses.
Figure 1. The overview framework of dual-frequency-domain transformation (DFDT).

2. Related Work

2.1. Backdoor Attacks

Backdoor attacks involve embedding a concealed backdoor within deep neural networks, enabling the model to perform adequately on clean inputs but produce malicious outputs when activated by specific triggers. BadNets [8] represents pioneering research in the domain of backdoor attacks for image classification, where a white patch is introduced into benign samples to serve as a trigger. However, this method is susceptible to detection by the naked eye due to the conspicuousness of the triggers.
Thus, it is recommended to adopt a more natural trigger. Liu et al. [22] used a commonly used natural reflection phenomenon as a trigger, where the poisoning images were generated from a combination of normal and reflected images. Turner et al. [23] implanted a backdoor by interfering with the image pixels instead of replacing the image. Nguyen et al. [10] proposed a warping-based trigger to enhance stealth.
With the ongoing advancement in the field of backdoor attacks, defense strategies against backdoor attacks are also gaining attention, which makes attacking neural networks progressively harder. To further improve the success rate of the attack, Chen et al. [24] mixed noisy images into a very small portion of the input data. Bagdasaryan et al. [25] implanted a backdoor into the original model by exploiting the aggregation mechanism of federated learning and submitting poisoned data to the aggregator during the final aggregation of the local model. Chen et al. [26] designed a backdoor attack method for federated meta-learning. Wang et al. [18] proposed a backdoor attack called FTrojan, which injects triggers into high-frequency components of the UV channels via DCT. WABA [27] generates triggers in the high-frequency components obtained through DWT. DFDT represents the first trigger generated under a dual-frequency-domain transformation. Unlike existing backdoor attack methods, DFDT subjects clean samples to two sequential frequency-domain transformations, enabling the generated trigger to be more uniformly dispersed across the entire frequency spectrum of the image. Therefore, DFDT not only achieves effective backdoor attacks but also attains a high degree of stealthiness in terms of both image quality and latent subspaces.

2.2. Backdoor Defense

Based on the objectives of defense strategies, defenses against backdoor attacks can be categorized into three types: data-based defenses, model-based defenses, and trigger-based defenses. Data-based defenses aim to remove poisoned data from the training set. Liu et al. [28] proposed the first pre-processing-based defense against backdoor attacks, incorporating a pre-processing module that precedes the input to the deep neural network (DNN). This module alters the trigger patterns within the compromised samples, rendering the original triggers ineffective. Inspired by the idea that the trigger region contributes the most to the prediction, Doan et al. [12] proposed using a GAN to process potentially triggered portions of an image. Udeshi et al. [29] devised a square trigger interceptor that locates and removes backdoor triggers by using the dominant colors in an image. Model-based defenses are suitable for situations where the user does not have access to the data and can only inspect the model directly. These defenses identify anomalies within a model, enabling the prevention of its deployment to counteract backdoor threats. Kolouri et al. [30] first discussed how to diagnose a given model to prevent backdoor threats. Trigger-based defenses face greater challenges as backdoor attacks continue to evolve; for this reason, the academic community has put forth defense strategies predicated on trigger synthesis. Wang et al. [13] proposed the first defense based on trigger synthesis, which has become the most widely adopted approach.

3. Preparation

3.1. Notations

We consider a DNN image classification model. We denote a deep learning image classifier by $f_\theta: \mathcal{X} \rightarrow \mathcal{Y}$, where $\mathcal{X} \subset \mathbb{R}^3$ and $\mathcal{Y} = \{1, 2, 3, \ldots, c\}$ denote the input and output spaces, respectively. Given a $c$-class image dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathcal{X}$ denotes a benign image and $y_i \in \mathcal{Y}$ is its true label, we choose a subset of $D$ with poisoning rate $p_p$ as the poisoning dataset $D_p = \{(x_i', y_t) \mid x_i' = H(x_i), x_i \in \mathcal{X}\}$, where $H(\cdot)$ is the backdoor transformation function and $y_t$ is the target label. We use another subset of $D$ with rate $p_r$ as the regularization dataset $D_r = \{(x_i', y_i) \mid x_i' = H(x_i), x_i \in \mathcal{X}\}$, in which the transformed samples keep their true labels. The remaining subset of $D$ is treated as the clean dataset $D_c = \{(x_i, y_i) \mid x_i \in \mathcal{X}\}$. The poisoned training dataset is then $\hat{D} = D_p \cup D_r \cup D_c$.

3.2. Attack Model

In this study, we investigate the data-poisoning-based backdoor attack, where the attacker can only manipulate the benign training dataset $D$ to obtain the poisoned training dataset $\hat{D}$; the attacker has no control over the training and deployment process of the model and knows nothing about the trained model. The victim then trains the model $f_\theta$ on the poisoned dataset $\hat{D}$. In the inference phase, inputs containing the trigger activate the backdoor in the model, i.e., $f_\theta(x_i') = y_t$. The specific backdoor attack process is shown in Figure 2.

3.3. Adversarial Goal

During the attack process, adversaries are primarily driven by two core objectives: effectiveness and stealthiness. Effectiveness indicates the adversaries’ efforts to train backdoored models that achieve a high attack success rate (ASR), while ensuring that any decrease in benign accuracy (BA) is imperceptible. Stealthiness, on the other hand, indicates that samples with triggers maintain high fidelity, and there is no discernible separation between poisoned and clean samples in the latent space.

4. Method

4.1. Implementation of DFDT

This section delineates the implementation of DFDT. Figure 1 shows an overview of DFDT.

4.1.1. Apply Color Channel Transformations to Clean Samples and Trigger Image

The color space of the clean samples is converted from RGB to YUV, where the YUV color space represents an image with a luminance component (the Y channel) and chrominance components (the U and V channels). Compared to the RGB color space, YUV better matches human visual perception. This conversion is denoted as $S(\cdot)$, and its inverse transformation is denoted as $S^{-1}(\cdot)$. The human visual system is sensitive to alterations in luminance (the Y component) but relatively insensitive to changes in chrominance (the U and V components). Therefore, adding triggers to the chrominance channels allows the introduction of more information variation without significantly affecting the visual quality of the image.
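For concreteness, a minimal sketch of this color-space step is given below; the BT.601 YUV coefficients and the pure-NumPy implementation are assumptions for illustration, since the paper does not specify which YUV variant or library is used.

```python
import numpy as np

# Sketch of the RGB<->YUV conversion S(.) and S^-1(.). The BT.601 coefficients
# below are an assumption; the paper does not state which YUV variant is used.
RGB2YUV = np.array([[ 0.299,    0.587,    0.114  ],
                    [-0.14713, -0.28886,  0.436  ],
                    [ 0.615,   -0.51499, -0.10001]])

def rgb_to_yuv(img):            # img: H x W x 3, float in [0, 1]
    return img @ RGB2YUV.T

def yuv_to_rgb(img_yuv):        # inverse transform S^-1(.)
    return img_yuv @ np.linalg.inv(RGB2YUV).T
```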

4.1.2. Application of Discrete Cosine Transform

DCT is a widely used image-processing transform that converts an image from the spatial domain to the frequency domain. It is particularly adept at capturing high-frequency information in images, which mainly comprises detailed features such as edges and textures. DCT is applied to the UV channels of the image to transform them from the spatial domain to the frequency domain, separating the high-frequency and low-frequency information. High-frequency information captures fine details, such as edges and textures, whereas low-frequency information represents the primary structure and overall content of the image. By extracting high-frequency details from an image and integrating them into the trigger, the semantic integrity of the image is maintained while the trigger's stealthiness is enhanced. The attacker applies the DCT to both the clean samples and the trigger image, thereby extracting the high- and low-frequency information from each. The DCT is given in Equation (1), where $f(x,y)$ is the pixel value in the spatial domain, $F(u,v)$ is the coefficient in the frequency domain, and $C(u)$ and $C(v)$ are normalization coefficients.
$$F(u,v) = \frac{2}{N}\, C(u)\, C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\!\left[\frac{(2x+1)u\pi}{2N}\right] \cos\!\left[\frac{(2y+1)v\pi}{2N}\right] \qquad (1)$$
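The following sketch illustrates how Equation (1) can be applied to a chrominance channel and how the coefficients might be split into low- and high-frequency parts. The SciPy `dctn`/`idctn` calls are standard; the square `cutoff` block used to separate the bands is an assumption, as the paper does not specify the splitting rule.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_split(channel, cutoff=8):
    """Transform one chrominance channel with a 2-D DCT and split the
    coefficient matrix into low- and high-frequency parts. The square
    low-frequency block of size `cutoff` is an assumption."""
    F = dctn(channel, norm='ortho')          # spatial domain -> frequency domain
    low_mask = np.zeros_like(F, dtype=bool)
    low_mask[:cutoff, :cutoff] = True        # top-left coefficients = low frequencies
    F_low  = np.where(low_mask, F, 0.0)
    F_high = np.where(low_mask, 0.0, F)
    return F_low, F_high

def dct_merge(F_low, F_high):
    """Inverse DCT over the recombined coefficients."""
    return idctn(F_low + F_high, norm='ortho')
```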

4.1.3. Application of Discrete Wavelet Transform

DWT provides a multi-resolution representation of the image, decomposing it into subbands of different frequency bands. Unlike the global frequency representation provided by DCT, DWT allows for more detailed manipulation of local features in the image. To further decompose the high-frequency information, we employ a biorthogonal Discrete Wavelet Transform to decompose it into a low-frequency component (approximation coefficients $LL$) and high-frequency components (detail coefficients $LH$, $HL$, $HH$). $LH$ represents the high-frequency details in the horizontal direction, $HL$ the high-frequency details in the vertical direction, and $HH$ the high-frequency details in the diagonal direction. The diagonal high-frequency component ($HH$) primarily corresponds to the diagonal edges and fine textures of an image and contributes the least to human visual perception. Figure 3 illustrates the impact of adding the same noise to different frequency-domain components. We observe that, compared to the other three poisoned images (Figure 3b–d), the difference between the image poisoned on the $HH$ component (Figure 3e) and the original image is significantly smaller. Therefore, we inject the trigger into the $HH$ component to better achieve the stealthiness of the backdoor attack. The decomposition formula for the $HH$ subband is given in Equation (2), where $g(\cdot)$ denotes the high-pass filter coefficients and $x(i,j)$ is the pixel value of the input image.
$$HH(i,j) = \sum_{m}\sum_{n} g(m)\, g(n)\, x(2i+m,\, 2j+n) \qquad (2)$$
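A minimal sketch of this decomposition using PyWavelets is shown below; the specific biorthogonal wavelet (`bior1.3`) is an assumption, since the paper only states that a biorthogonal DWT is used.

```python
import pywt  # PyWavelets

def dwt_decompose(F_high, wavelet='bior1.3'):
    """One-level 2-D biorthogonal DWT of the high-frequency DCT coefficients.
    Returns the approximation (LL) and detail (LH, HL, HH) subbands."""
    LL, (LH, HL, HH) = pywt.dwt2(F_high, wavelet)
    return LL, LH, HL, HH

def dwt_reconstruct(LL, LH, HL, HH, wavelet='bior1.3'):
    """Inverse DWT that rebuilds the high-frequency coefficients from the subbands."""
    return pywt.idwt2((LL, (LH, HL, HH)), wavelet)
```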

4.1.4. Trigger Generation in the Frequency Domain

The clean samples and the trigger image are processed using DCT and DWT, respectively, to obtain their $HH$ components (denoted $HH_m$ and $HH_M$, respectively). We add $\alpha$ times $HH_M$ to $HH_m$ to obtain a new $HH$ component for the clean image, where $\alpha$ is the trigger strength that controls the prominence of the trigger in the poisoned image. Subsequently, an averaging transformation $T$ is applied to simplify the trigger information, reduce visual artifacts, and further enhance concealment. The poisoned component $HH_m'$ is generated as follows:
$$HH_m' = T\!\left(HH_m + \alpha \cdot HH_M\right)$$
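The blending step can be sketched as follows; modeling the averaging transformation T as a small local mean filter is an assumption, as the paper does not define T precisely.

```python
from scipy.ndimage import uniform_filter

def blend_hh(HH_m, HH_M, alpha=0.6, kernel=3):
    """Add alpha * trigger HH to the clean HH and apply an averaging
    transformation T. Modeling T as a small local mean filter
    (uniform_filter with `kernel`) is an assumption; the paper only states
    that T averages/simplifies the trigger information."""
    # Resize/crop handling is omitted; HH_M is assumed to match HH_m in shape.
    blended = HH_m + alpha * HH_M
    return uniform_filter(blended, size=kernel)
```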

4.1.5. Transform from Frequency Domain to Spatial Domain

The high-frequency component of the poisoned image is processed with an Inverse Discrete Wavelet Transform (IDWT), after which an Inverse Discrete Cosine Transform (IDCT) is applied to the full set of frequency-domain coefficients. This process yields the poisoned image in the spatial domain, represented in the YUV color space.
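A sketch of this inverse path, reusing the same SciPy/PyWavelets primitives and the `bior1.3` wavelet assumption from above:

```python
import pywt
from scipy.fft import idctn

def frequency_to_spatial(F_low, LL, LH, HL, HH_poisoned, wavelet='bior1.3'):
    """IDWT over the modified subbands, then IDCT over the recombined low- and
    high-frequency DCT coefficients; returns the poisoned channel in the
    spatial domain."""
    F_high = pywt.idwt2((LL, (LH, HL, HH_poisoned)), wavelet)     # IDWT
    F_high = F_high[:F_low.shape[0], :F_low.shape[1]]             # guard against size mismatch
    return idctn(F_low + F_high, norm='ortho')                    # IDCT
```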

4.1.6. Color Channels Transform from YUV to RGB

Ultimately, the poisoned image is converted from the YUV color space back to the RGB color space to obtain a poisoned sample that can be used for the backdoor attack.

4.2. Optimization Objectives

Our objective is to optimize a backdoored classifier $f$ for the effectiveness and stealthiness of DFDT. The classifier is trained with the cross-entropy loss on $D_p$, $D_r$, and $D_c$. The optimization objective is defined as follows:
$$\mathcal{L} = \lambda\, \mathcal{L}\big(f(x_c, \omega), y\big) + \beta\, \mathcal{L}\big(f(x_p, \omega), y_t\big) + \eta\, \mathcal{L}\big(f(x_r, \omega), y\big)$$
where $\omega$ denotes the classifier parameters, $x_c \in D_c$, $x_p \in D_p$, and $x_r \in D_r$. $\mathcal{L}_c = \mathcal{L}(f(x_c, \omega), y)$ is the loss term for clean samples, $\mathcal{L}_p = \mathcal{L}(f(x_p, \omega), y_t)$ is the loss term for poisoned samples, and $\mathcal{L}_r = \mathcal{L}(f(x_r, \omega), y)$ is the loss term for regularization samples. $\lambda$, $\beta$, and $\eta$ control the mixing strength of the loss signals from the clean, poisoned, and regularization samples during classifier training. In experiments, we observe that the larger these weights are, the more quickly the corresponding samples converge to their best performance in the classifier. So that the backdoored classifier converges to the same optimal performance for clean, poisoned, and regularization samples, we set $\lambda$, $\beta$, and $\eta$ to 1 in the remainder of the paper.
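A minimal PyTorch sketch of this objective is given below; the function and argument names are illustrative, and the use of ground-truth labels for the regularization samples follows Algorithm 1.

```python
import torch
import torch.nn.functional as F

def dfdt_loss(model, x_clean, y_clean, x_poison, x_reg, y_reg, y_target,
              lam=1.0, beta=1.0, eta=1.0):
    """Weighted cross-entropy over clean, poisoned and regularization batches.
    As in the paper, lam, beta and eta default to 1; poisoned samples are
    paired with the target label, regularization samples keep their labels."""
    target = torch.full((x_poison.size(0),), y_target,
                        dtype=torch.long, device=x_poison.device)
    loss_c = F.cross_entropy(model(x_clean), y_clean)
    loss_p = F.cross_entropy(model(x_poison), target)
    loss_r = F.cross_entropy(model(x_reg), y_reg)
    return lam * loss_c + beta * loss_p + eta * loss_r
```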

4.3. Algorithm Flow

In Algorithm 1, line 3 determines the numbers of poisoned, regularization, and clean samples. Lines 4–11 represent the process of modifying the samples by injecting triggers into the high-frequency components. Line 4 decomposes the clean samples and the trigger image into the high- and low-frequency components of the frequency domain by DCT. Line 5 records the label of the clean sample. Lines 6–7 decompose the high-frequency portions of the clean sample and the trigger image into four frequency subbands by DWT. Line 8 adds the $HH$ component of the trigger image to the $HH$ component of the clean sample and performs the averaging transformation. Line 9 reconstructs the high-frequency part of the poisoned sample from the four frequency subbands by IDWT. Line 10 reconstructs the poisoned sample from its frequency-domain components using the IDCT. Line 11 records the label of the poisoned sample. Lines 12–15 compute the optimization objective. Line 18 places the poisoned sample into the backdoor dataset.
Algorithm 1 Training of DFDT
Input: (i) D, benign training dataset; (ii) ω, randomly initialized classifier parameters; (iii) p_p, rate of poisoning samples; (iv) p_r, rate of regularization samples; (v) M, trigger image; (vi) y_t, target label; (vii) E, number of training epochs; (viii) D̂, backdoor dataset.
Output: ω, well-trained classifier model.
1:  for e = 1, …, E do
2:    for (x_m, y) in D do
3:      D_p ← p_p × D, D_r ← p_r × D, D_c ← (1 − p_p − p_r) × D
4:      X_low, X_high ← DCT(x_m);  M_low, M_high ← DCT(M)
5:      m_label ← label(x_m)
6:      LL_m, LH_m, HL_m, HH_m ← DWT(X_high)
7:      LL_M, LH_M, HL_M, HH_M ← DWT(M_high)
8:      HH_m′ ← T(HH_m + α · HH_M)
9:      X_high′ ← IDWT(LL_m, LH_m, HL_m, HH_m′)
10:     x_m′ ← IDCT(X_high′, X_low)
11:     m_label′ ← label(x_m′)
12:     L_p ← L((D_p, y_t), ω)
13:     L_r ← L((D_r, y), ω)
14:     L_c ← L((D_c, y), ω)
15:     L ← L_p + L_r + L_c
16:     L.backward()
17:     update(ω)
18:     Append the poisoned sample pair (x_m′, y) into D̂
19:   end for
20: end for
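As a complement to Algorithm 1, the following sketch shows how the poisoned training set D̂ = {D_p, D_r, D_c} could be assembled; `poison_image` is a hypothetical stand-in for the backdoor transformation H(·) of Section 4.1, and the random index split is an assumption.

```python
import random

def build_backdoor_dataset(dataset, poison_image, p_p=0.01, p_r=0.01, y_target=0):
    """Assemble D_hat = {D_p, D_r, D_c}. `poison_image` stands for the backdoor
    transformation H(.) from Section 4.1 (its exact signature is an assumption).
    Poisoned samples are relabeled to the target class; regularization samples
    keep their true labels."""
    n = len(dataset)
    indices = list(range(n))
    random.shuffle(indices)
    n_p, n_r = int(p_p * n), int(p_r * n)
    poison_idx = set(indices[:n_p])
    reg_idx = set(indices[n_p:n_p + n_r])

    d_hat = []
    for i, (x, y) in enumerate(dataset):
        if i in poison_idx:
            d_hat.append((poison_image(x), y_target))   # D_p: trigger + target label
        elif i in reg_idx:
            d_hat.append((poison_image(x), y))          # D_r: trigger + true label
        else:
            d_hat.append((x, y))                        # D_c: untouched clean sample
    return d_hat
```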

5. Experiment

5.1. Experimental Settings

5.1.1. Datasets and Model Architectures

In this study, we evaluate attacks on three datasets used in the field of backdoor learning: CIFAR-10 [31], GTSRB [32], and CIFAR-100 [31]. CIFAR-10 is a widely utilized computer vision dataset comprising 60,000 images, categorized into 10 classes. GTSRB is a German benchmark dataset for traffic sign recognition. The CIFAR-100 dataset encompasses 100 distinct categories. It is divided into a training set comprising 50,000 images and a test set containing 10,000 images. These datasets have a wide range of influence in the fields of computer vision and security, facilitating a thorough evaluation of adaptive attacks across various scenarios. The details are summarized in Table 1. In constructing the baseline model, we selected four deep learning architectures, ResNet-18 [33], VGG-16 [34], ResNet-50 [33], and VGG-19 [34], for comprehensive performance evaluation.

5.1.2. Attack Configurations

To evaluate the performance of DFDT against state-of-the-art (SOTA) attack methods, we considered five SOTA backdoor attacks for comparison. BadNets [8] and Blend [24] represent typical dirty-label attacks with patch-based and blending-based triggers, respectively. For BadNets, we used a grid trigger placed in the bottom-right corner of the image. For Blend, a 'Hello Kitty' trigger is applied to the dataset. TaCT [35] is a source-specific backdoor attack, and WABA [27] is a backdoor attack on remote sensing data based on the wavelet transform. The Multi-Trigger Backdoor Attack (MTBA) [37] refers to the scenario where multiple adversaries utilize different types of triggers to poison the same dataset. In this experiment, we conducted training on ResNet-18, the target label was associated with the trigger image, and the regularization sample rate and poisoning rate were both set to 1%. Our experimental findings indicate that DFDT achieves an optimal equilibrium between stealthiness and effectiveness when the parameter α is set to 0.6. Consequently, for the remainder of this paper, we set α to 0.6.

5.1.3. Defense Configurations and Evaluation Metrics

To assess the resilience of DFDT against backdoor defenses, we implemented three representative defenses: NC [13], STRIP [20], and I-BAU [36]. We implemented these defenses using the default hyperparameters described in the original papers.
We evaluated the effectiveness of all attack methods using two metrics, attack success rate (ASR) and benign accuracy (BA).
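A minimal sketch of the two metrics is given below, assuming a PyTorch classifier and data loaders; excluding triggered samples whose original class already equals the target label from the ASR count is a common convention assumed here.

```python
import torch

@torch.no_grad()
def evaluate_asr_ba(model, clean_loader, poisoned_loader, y_target):
    """BA is top-1 accuracy on clean test data; ASR is the fraction of
    triggered test samples (original class != target) that the model
    classifies as the target label."""
    model.eval()
    correct = total = hit = poisoned_total = 0
    for x, y in clean_loader:
        pred = model(x).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    for x_p, y in poisoned_loader:           # x_p already carries the trigger
        keep = y != y_target                 # drop samples already of the target class
        pred = model(x_p[keep]).argmax(dim=1)
        hit += (pred == y_target).sum().item()
        poisoned_total += keep.sum().item()
    return correct / total, hit / poisoned_total   # BA, ASR
```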

5.2. Effectiveness Evaluation

5.2.1. Effectiveness Comparison with SOTA Attack Methods

To evaluate the effectiveness of DFDT, we compared its ASR and BA on clean samples with those of five state-of-the-art (SOTA) attack methods. To ensure the robustness and reproducibility of our findings, we repeated each experiment five times using different random seeds, and the experimental results are presented as mean ± standard deviation. As Table 2 shows, DFDT achieves a high ASR without significantly reducing the BA. In particular, on the CIFAR-10 and GTSRB datasets, DFDT achieves the best ASR and BA among the compared SOTA attack methods.

5.2.2. Effectiveness on Different Networks

This paper evaluates the effectiveness of DFDT across various networks by conducting experiments on CIFAR-10 with several models: ResNet-18 [33], VGG-16 [34], ResNet-50 [33], and VGG-19 [34]. Table 3 reveals that DFDT not only successfully embeds backdoors into various networks to induce malicious effects, but also sustains BA, thereby demonstrating the versatility of DFDT.

5.2.3. Effectiveness on Different Datasets and Poisoning Rates

To ascertain the effectiveness of DFDT at different poisoning rates, we tested its performance on three datasets, namely CIFAR-10, CIFAR-100, and GTSRB, using the ResNet-18 model. As shown in Table 4, DFDT achieves a high ASR on all datasets: the ASR exceeds 97.4% even at a poisoning rate of 0.5% and reaches at least 99.82% at poisoning rates of 1% and above, while the BA drops by no more than 1.52 percentage points relative to the clean baseline. This experiment demonstrates that DFDT reliably misclassifies poisoned samples on multiple datasets while having little impact on the accuracy of clean data.

5.3. Performance with Different Trigger Intensities α

In the process of proportionally blending the diagonal detail subbands ($HH$), the parameter $\alpha$ controls the blending intensity between the original image and the trigger image. As $\alpha$ increases, the proportion of the original image decreases. We used three metrics (SSIM, PSNR, and LPIPS) to evaluate the stealthiness of the DFDT-generated triggers and the ASR to evaluate their effectiveness. Table 5 shows that the ASR increases with increasing $\alpha$, while the stealthiness of the trigger gradually decreases. To achieve the optimal balance between stealthiness and efficacy, a value of 0.6 is chosen.

5.4. Stealthiness Evaluation

5.4.1. Stealthiness Results from the Perspective of Latent Space

To evaluate the stealthiness of DFDT, this paper employs a support vector machine (SVM) to identify linear boundaries that optimally separate poisoned and clean samples in the latent representation space. As shown in Figure 4, our attack (e) brings the poisoned and clean samples much closer together than (a), (b), (c), and (d); purple indicates clean samples and red indicates poisoned samples. Consequently, DFDT exhibits enhanced stealthiness within the latent representation space.
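A sketch of this separability probe with scikit-learn is shown below; extraction of penultimate-layer features for the two groups is assumed to have been done beforehand.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

def latent_separability(latent_clean, latent_poisoned):
    """Fit a linear SVM on latent features of clean vs. poisoned samples;
    accuracy near 0.5 indicates the two groups are hard to separate
    (i.e., the attack is stealthy in the latent space)."""
    X = np.concatenate([latent_clean, latent_poisoned], axis=0)
    y = np.concatenate([np.zeros(len(latent_clean)), np.ones(len(latent_poisoned))])
    scores = cross_val_score(LinearSVC(max_iter=5000), X, y, cv=5)
    return scores.mean()
```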

5.4.2. Stealthiness Results from GradCAM Vision Capture

In this paper, we employ Grad-CAM [38] to visualize the poisoned samples and evaluate the behavior of different attack methods. As shown in Figure 5, Grad-CAM successfully identifies the anomalous trigger regions generated by BadNets, Blend, and TaCT. When the backdoor is activated, these three attack methods force the model to focus on the specific location of the trigger, which differs greatly from the regions attended to by a clean model and thus leaks the attack behavior. In contrast, because DFDT injects triggers in the frequency domain, it does not introduce anomalous activations in specific spatial regions and exhibits behavior similar to that of a clean model.

5.4.3. Visualization of Features by t-SNE

To more intuitively and thoroughly verify the stealthiness of DFDT, we employed the t-SNE algorithm to conduct a visual analysis of clean samples and poisoned samples with backdoor triggers added. Figure 6 visualizes the distribution of feature representations of poisoned and benign samples under four types of attacks. Blue represents clean samples, while red represents poisoned samples. In Figure 6a–c, the existence of two distinct clusters indicates that the poisoned and clean samples can be separated, implying that the stealthiness of the poisoned samples is not strong. However, in Figure 6d, the clean and poisoned samples are intermingled, suggesting that the poisoned samples of DFDT possess strong stealthiness.
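A sketch of this visualization with scikit-learn and Matplotlib follows; the perplexity value and PCA initialization are illustrative defaults, not settings reported in the paper.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(latent_clean, latent_poisoned):
    """Embed clean and poisoned latent features jointly with t-SNE;
    intermingled point clouds indicate high stealthiness."""
    X = np.concatenate([latent_clean, latent_poisoned], axis=0)
    emb = TSNE(n_components=2, perplexity=30, init='pca').fit_transform(X)
    n = len(latent_clean)
    plt.scatter(emb[:n, 0], emb[:n, 1], s=4, c='blue', label='clean')
    plt.scatter(emb[n:, 0], emb[n:, 1], s=4, c='red', label='poisoned')
    plt.legend()
    plt.show()
```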

5.4.4. Stealthiness Results from Metrics (SSIM, PSNR, LPIPS)

In this study, to comprehensively evaluate the stealth performance of the proposed dual-frequency-domain transformation backdoor attack method, we employed three widely recognized image quality assessment metrics: the Structural Similarity Index (SSIM), the Peak Signal-to-Noise Ratio (PSNR), and the Learned Perceptual Image Patch Similarity (LPIPS). SSIM measures the similarity between two images, comprehensively evaluating it from three aspects: luminance, contrast, and structure. It ranges from 0 to 1, with values closer to 1 indicating greater similarity between the two images. PSNR is a commonly used metric for assessing the degree of image distortion; it is computed as the logarithm of the ratio between the squared maximum pixel value and the mean squared error between the original and distorted images. A higher PSNR value indicates better image quality and less distortion. Unlike traditional similarity metrics based on pixels or low-level features, LPIPS leverages deep learning models to extract high-level features from images and computes similarity based on these features; a lower LPIPS value indicates greater perceptual similarity between the two images. These metrics quantitatively analyze changes in the images before and after the attack from different perspectives. As shown in Table 6, compared to other methods, DFDT exhibits significant advantages in maintaining high image quality and perceptual similarity.
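The three metrics can be computed for an image pair as sketched below, assuming scikit-image ≥ 0.19 and the third-party `lpips` package; the image layout and value range noted in the code are assumptions.

```python
import torch
import lpips                                   # pip install lpips
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

_lpips_fn = lpips.LPIPS(net='alex')            # perceptual metric backbone

def image_quality(clean, poisoned):
    """Compute SSIM, PSNR and LPIPS for one image pair.
    `clean` and `poisoned` are H x W x 3 float arrays in [0, 1]."""
    ssim = structural_similarity(clean, poisoned, channel_axis=2, data_range=1.0)
    psnr = peak_signal_noise_ratio(clean, poisoned, data_range=1.0)
    to_t = lambda a: torch.from_numpy(a).permute(2, 0, 1).float().unsqueeze(0) * 2 - 1
    lp = _lpips_fn(to_t(clean), to_t(poisoned)).item()   # LPIPS expects inputs in [-1, 1]
    return ssim, psnr, lp
```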

5.5. Impacts over Defenses

We evaluated the robustness of DFDT against four backdoor defense methods: STRIP, NC, I-BAU, and Fine-Pruning.

5.5.1. STRIP

STRIP [20] is a defense method that assesses whether an input contains a trigger by examining the randomness of the predicted classes for perturbed copies of that input. Owing to the substantial disparity in entropy distribution between poisoned and clean samples, STRIP can effectively differentiate between samples containing triggers and clean samples. Figure 7 illustrates that the entropy distribution of the model poisoned by DFDT closely resembles that of a benign model on the CIFAR-10 dataset, exhibiting significant overlap. Consequently, DFDT can effectively circumvent the STRIP defense.
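A sketch of the STRIP statistic for a single input is shown below; the number of superimposed images and the 0.5 blend weight follow the usual STRIP setup and are assumptions here.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def strip_entropy(model, x, overlay_pool, n=64, alpha=0.5):
    """Blend x with n randomly chosen clean images and average the prediction
    entropy over the blends; inputs whose entropy stays low are flagged as
    trigger-carrying."""
    idx = torch.randint(len(overlay_pool), (n,))
    blends = alpha * x.unsqueeze(0) + (1 - alpha) * overlay_pool[idx]
    probs = F.softmax(model(blends), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return entropy.mean().item()
```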

5.5.2. Neural Cleanse

Neural Cleanse (NC) [13], a representative backdoor defense strategy, first reverse-engineers a candidate trigger pattern for each class label through an optimization process. Subsequently, NC employs an anomaly index to detect the presence of a backdoor within a deep neural network (DNN); if the anomaly index exceeds 2, the DNN is considered to be backdoored. The results of defending against NC are shown in Figure 8, where "Clean" denotes the model trained on the benign dataset and "Poison" denotes the model trained on the poisoned dataset in this paper. As observed in Figure 8, the anomaly indices on all three datasets are less than 2, so DFDT is able to bypass NC detection.
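For reference, a sketch of the MAD-based anomaly index used by NC is given below; the reverse-engineering of per-class trigger masks is assumed to have been run already.

```python
import numpy as np

def nc_anomaly_index(trigger_l1_norms):
    """Given the L1 norms of the reverse-engineered trigger masks (one per
    class label), compute the MAD-based anomaly index of the smallest norm.
    An index above 2 is treated as evidence of a backdoor."""
    norms = np.asarray(trigger_l1_norms, dtype=float)
    median = np.median(norms)
    mad = 1.4826 * np.median(np.abs(norms - median))   # consistency constant for normal data
    return np.abs(norms.min() - median) / mad
```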

5.5.3. I-BAU

I-BAU [36] is a novel defense method that alternates iteratively between trigger generation and backdoor removal. It formulates backdoor removal as a minimax optimization problem and employs implicit hypergradients to solve it. Table 7 reports the defensive performance of I-BAU against our attack. I-BAU reduces the attack success rate on CIFAR-10, GTSRB, and CIFAR-100 by less than 0.04 percentage points in every case, indicating that it performs poorly in defending against DFDT.

5.5.4. Fine-Pruning

Fine-Pruning is a backdoor defense method that combines model pruning with fine-tuning. It first analyzes the activations of the model on clean data to identify suspicious neurons that remain largely dormant on clean inputs and are therefore likely reserved for potential backdoor triggers, and then prunes these neurons. Afterwards, it fine-tunes the model on clean data to weaken the impact of the backdoor while maintaining normal performance. As shown in Figure 9, when the pruning rate of Fine-Pruning reaches 50%, the attack success rate (ASR) of DFDT still remains above 80%, while the performance on clean data declines significantly. Thus, DFDT can withstand Fine-Pruning.

6. Conclusions

We propose an invisible backdoor attack method, DFDT. We apply a dual-frequency-domain transformation to both clean image samples and the trigger image, followed by feature fusion within the frequency domain. This approach not only significantly enhances the stealthiness of the backdoor but also makes it more difficult to detect, because operating in the frequency domain minimizes the impact on spatial-domain features. Extensive experimental evaluations show that DFDT outperforms current state-of-the-art backdoor attack methods on several image classification datasets. The experimental results show that DFDT maintains a high attack success rate while remaining stable against multiple defenses.
Currently, the DFDT (dual-frequency-domain transformation) method still has certain limitations. The effectiveness of the DFDT backdoor attack method is closely tied to the poisoning rate. A low poisoning rate may result in insufficient embedding of the backdoor trigger into the training data, thereby reducing the success rate of the attack when the model is deployed. On the other hand, an excessively high poisoning rate will heighten the risk of detection by anomaly-based defense methods. As the proportion of poisoned samples in the training set increases, the statistical characteristics of the data may deviate significantly from the normal distribution, making it easier for defense measures to identify and flag malicious samples. The parameter α , which plays a pivotal role in the dual-frequency-domain transformation process, has a substantial impact on the performance of the DFDT method. α determines the balance between the contributions of the original image features and the backdoor trigger features in the frequency domain. An inappropriate value of α may weaken the effectiveness of the backdoor trigger, leading to attack failure, or make the trigger overly conspicuous, thus reducing the stealth of the poisoned samples.
Our evaluation of the attack is confined to image classification tasks. The performance of such attacks on other learning tasks, such as natural language processing and video understanding, remains unclear; we intend to investigate these areas in future research. To defend against the proposed attack, we also plan to develop more robust defenses that extend beyond the current assumptions regarding spatial-domain backdoor attacks. These defenses may focus on detecting the abnormal frequency-domain patterns introduced by the DFDT transformation or on model-training methods that offer stronger resistance against such backdoor attacks.

Author Contributions

Formal analysis, investigation, methodology, software, validation, writing—original draft, M.C.; conceptualization, writing—review and editing, G.L.; visualization, S.X.; funding acquisition, project administration, resources, Y.C.; data curation, writing—review and editing, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Open Foundation of Key Laboratory of Cyberspace Security, Ministry of Education of China (Project No. KLCS20240211). The APC was funded by Yan Cao.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS 2012), Lake Tahoe, NV, USA, 3–6 December 2012. [Google Scholar]
  2. Collobert, R.; Weston, J.; Bottou, L.; Karlen, M.; Kavukcuoglu, K.; Kuksa, P. Natural language processing (almost) from scratch. J. Mach. Learn. Res. 2011, 12, 2493–2537. [Google Scholar]
  3. Dahl, G.E.; Yu, D.; Deng, L.; Acero, A. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 2011, 20, 30–42. [Google Scholar] [CrossRef]
  4. Wang, Z.; Liu, K.; Hu, J.; Ren, J.; Guo, H.; Yuan, W. Attrleaks on the edge: Exploiting information leakage from privacy-preserving co-inference. Chin. J. Electron. 2023, 32, 1–12. [Google Scholar] [CrossRef]
  5. Ding, Y.; Wang, Z.; Qin, Z.; Zhou, E.; Zhu, G.; Qin, Z.; Choo, K.K.R. Backdoor attack on deep learning-based medical image encryption and decryption network. IEEE Trans. Inf. Forensics Secur. 2023, 19, 280–292. [Google Scholar] [CrossRef]
  6. Li, Y.; Jiang, Y.; Li, Z.; Xia, S.T. Backdoor learning: A survey. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 5–22. [Google Scholar] [CrossRef] [PubMed]
  7. Wu, B.; Chen, H.; Zhang, M.; Zhu, Z.; Wei, S.; Yuan, D.; Shen, C. Backdoorbench: A comprehensive benchmark of backdoor learning. Adv. Neural Inf. Process. Syst. 2022, 35, 10546–10559. [Google Scholar] [CrossRef]
  8. Gu, T.; Liu, K.; Dolan-Gavitt, B.; Garg, S. Badnets: Evaluating backdooring attacks on deep neural networks. IEEE Access 2019, 7, 47230–47244. [Google Scholar] [CrossRef]
  9. Yamaguchi, S.; Saito, S.; Nagano, K.; Zhao, Y.; Chen, W.; Olszewski, K.; Morishima, S.; Li, H. High-fidelity facial reflectance and geometry inference from an unconstrained image. ACM Trans. Graph. (TOG) 2018, 37, 1–14. [Google Scholar] [CrossRef]
  10. Nguyen, A.; Tran, A. Wanet–imperceptible warping-based backdoor attack. arXiv 2021, arXiv:2102.10369. [Google Scholar]
  11. Xu, Z.Q.J.; Zhang, Y.; Luo, T.; Xiao, Y.; Ma, Z. Frequency principle: Fourier analysis sheds light on deep neural networks. arXiv 2019, arXiv:1901.06523. [Google Scholar] [CrossRef]
  12. Doan, B.G.; Abbasnejad, E.; Ranasinghe, D.C. Februus: Input purification defense against trojan attacks on deep neural network systems. In Proceedings of the 36th Annual Computer Security Applications Conference, Virtual, 7–11 December 2020; pp. 897–912. [Google Scholar]
  13. Wang, B.; Yao, Y.; Shan, S.; Li, H.; Viswanath, B.; Zheng, H.; Zhao, B.Y. Neural cleanse: Identifying and mitigating backdoor attacks in neural networks. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; pp. 707–723. [Google Scholar]
  14. Liu, Y.; Lee, W.C.; Tao, G.; Ma, S.; Aafer, Y.; Zhang, X. Abs: Scanning neural networks for back-doors by artificial brain stimulation. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; pp. 1265–1282. [Google Scholar]
  15. Ahmed, N.; Natarajan, T.; Rao, K.R. Discrete cosine transform. IEEE Trans. Comput. 2006, 100, 90–93. [Google Scholar] [CrossRef]
  16. Shensa, M.J. The discrete wavelet transform: Wedding the a trous and Mallat algorithms. IEEE Trans. Signal Process. 2002, 40, 2464–2482. [Google Scholar] [CrossRef]
  17. Xiong, Z.; Ramchandran, K.; Orchard, M.T. Wavelet packet image coding using space-frequency quantization. IEEE Trans. Image Process. 1998, 7, 892–898. [Google Scholar] [CrossRef]
  18. Wang, T.; Yao, Y.; Xu, F.; An, S.; Tong, H.; Wang, T. An invisible black-box backdoor attack through frequency domain. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 396–413. [Google Scholar]
  19. Xu, Z.Q.J.; Zhang, Y.; Xiao, Y. Training behavior of deep neural network in frequency domain. In Proceedings of the Neural Information Processing: 26th International Conference, ICONIP 2019, Sydney, Australia, 12–15 December 2019; Proceedings, Part I 26. Springer: Berlin/Heidelberg, Germany, 2019; pp. 264–274. [Google Scholar]
  20. Gao, Y.; Xu, C.; Wang, D.; Chen, S.; Ranasinghe, D.C.; Nepal, S. Strip: A defence against trojan attacks on deep neural networks. In Proceedings of the 35th Annual Computer Security Applications Conference, San Juan, PR, USA, 9–13 December 2019; pp. 113–125. [Google Scholar]
  21. Qi, X.; Xie, T.; Li, Y.; Mahloujifar, S.; Mittal, P. Revisiting the assumption of latent separability for backdoor defenses. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
  22. Liu, Y.; Ma, X.; Bailey, J.; Lu, F. Reflection backdoor: A natural backdoor attack on deep neural networks. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part X 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 182–199. [Google Scholar]
  23. Turner, A.; Tsipras, D.; Madry, A. Label-consistent backdoor attacks. arXiv 2019, arXiv:1912.02771. [Google Scholar] [CrossRef]
  24. Chen, X.; Liu, C.; Li, B.; Lu, K.; Song, D. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv 2017, arXiv:1712.05526. [Google Scholar] [CrossRef]
  25. Bagdasaryan, E.; Veit, A.; Hua, Y.; Estrin, D.; Shmatikov, V. How to backdoor federated learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics (PMLR), Online, 26–28 August 2020; pp. 2938–2948. [Google Scholar]
  26. Chen, C.L.; Golubchik, L.; Paolieri, M. Backdoor attacks on federated meta-learning. arXiv 2020, arXiv:2006.07026. [Google Scholar] [CrossRef]
  27. Dräger, N.; Xu, Y.; Ghamisi, P. Backdoor attacks for remote sensing data with wavelet transform. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–15. [Google Scholar] [CrossRef]
  28. Liu, Y.; Xie, Y.; Srivastava, A. Neural trojans. In Proceedings of the 2017 IEEE International Conference on Computer Design (ICCD), Boston, MA, USA, 5–8 November 2017; pp. 45–48. [Google Scholar]
  29. Udeshi, S.; Peng, S.; Woo, G.; Loh, L.; Rawshan, L.; Chattopadhyay, S. Model agnostic defence against backdoor attacks in machine learning. IEEE Trans. Reliab. 2022, 71, 880–895. [Google Scholar] [CrossRef]
  30. Kolouri, S.; Saha, A.; Pirsiavash, H.; Hoffmann, H. Universal litmus patterns: Revealing backdoor attacks in cnns. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 301–310. [Google Scholar]
  31. Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. 2009. Available online: https://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf (accessed on 1 July 2025).
  32. Stallkamp, J.; Schlipsing, M.; Salmen, J.; Igel, C. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Netw. 2012, 32, 323–332. [Google Scholar] [CrossRef] [PubMed]
  33. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  34. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  35. Tang, D.; Wang, X.; Tang, H.; Zhang, K. Demon in the variant: Statistical analysis of DNNs for robust backdoor contamination detection. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Online, 11–13 August 2021; pp. 1541–1558. [Google Scholar]
  36. Zeng, Y.; Chen, S.; Park, W.; Mao, Z.M.; Jin, M.; Jia, R. Adversarial unlearning of backdoors via implicit hypergradient. arXiv 2021, arXiv:2110.03735. [Google Scholar]
  37. Li, Y.; He, J.; Huang, H.; Sun, J.; Ma, X. Shortcuts Everywhere and Nowhere: Exploring Multi-Trigger Backdoor Attacks. arXiv 2024, arXiv:2401.15295. [Google Scholar] [CrossRef]
  38. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar]
Figure 2. Schematic of a backdoor attack based on poisoning.
Figure 3. Comparison of poisoned images obtained by adding the same noise to different frequency-domain components.
Figure 4. Graphical representation of the feature spaces identified through SVM.
Figure 5. Visualization of the influence area captured by Grad-CAM.
Figure 6. Visualization of features by t-SNE.
Figure 7. The detection results of STRIP.
Figure 8. The detection results of Neural Cleanse.
Figure 9. The detection results of Fine-Pruning.
Table 1. Dataset information.

Dataset | Image Size | # of Labels | # of Training Images | # of Test Images
CIFAR-10 | 3 × 32 × 32 | 10 | 50,000 | 10,000
GTSRB | 3 × 32 × 32 | 43 | 39,209 | 12,630
CIFAR-100 | 3 × 32 × 32 | 100 | 50,000 | 10,000
Table 2. Attack performance comparison between DFDT and five SOTA attack methods.

Method | CIFAR-10 BA (%) | CIFAR-10 ASR (%) | GTSRB BA (%) | GTSRB ASR (%) | CIFAR-100 BA (%) | CIFAR-100 ASR (%)
No attack | 94.25 ± 0.34 | - | 97.42 ± 0.13 | - | 75.30 ± 0.25 | -
BadNets [8] | 93.38 ± 0.51 | 100 ± 0 | 97.31 ± 0.31 | 100 ± 0 | 74.78 ± 0.78 | 100 ± 0
Blend [24] | 94.02 ± 0.21 | 99.12 ± 0.17 | 96.84 ± 0.38 | 98.00 ± 0.15 | 75.02 ± 0.47 | 99.79 ± 0.12
TaCT [35] | 93.71 ± 0.47 | 100 ± 0 | 97.16 ± 0.19 | 99.96 ± 0.03 | 74.92 ± 0.55 | 99.27 ± 0.13
WABA [27] | 94.12 ± 0.29 | 99.95 ± 0.03 | 97.20 ± 0.16 | 100 ± 0 | 75.16 ± 0.74 | 99.54 ± 0.21
MTBA [37] | 93.93 ± 0.21 | 99.99 ± 0.01 | 97.28 ± 0.13 | 100 ± 0 | 74.83 ± 0.82 | 99.64 ± 0.21
DFDT (ours) | 94.18 ± 0.17 | 100 ± 0 | 97.34 ± 0.31 | 100 ± 0 | 75.06 ± 0.56 | 99.82 ± 0.10
Table 3. Effectiveness on different DNNs.

Network | No Attack BA (%) | Ours BA (%) | Ours ASR (%)
ResNet-18 [33] | 94.25 | 94.18 | 100
ResNet-50 [33] | 95.46 | 95.40 | 100
VGG-16 [34] | 93.42 | 93.30 | 99.87
VGG-19 [34] | 94.78 | 94.84 | 100
Table 4. Effectiveness of DFDT attacks with different datasets and poisoning rates.

Dataset | Poisoning Rate (%) | BA (%) | ASR (%)
CIFAR-10 [31] | 0 | 94.25 | -
CIFAR-10 [31] | 0.5 | 94.20 | 98.45
CIFAR-10 [31] | 1 | 94.18 | 100
CIFAR-10 [31] | 3 | 93.46 | 100
GTSRB [32] | 0 | 97.42 | -
GTSRB [32] | 0.5 | 97.36 | 97.94
GTSRB [32] | 1 | 97.34 | 100
GTSRB [32] | 3 | 96.45 | 100
CIFAR-100 [31] | 0 | 75.30 | -
CIFAR-100 [31] | 0.5 | 75.28 | 97.46
CIFAR-100 [31] | 1 | 75.06 | 99.82
CIFAR-100 [31] | 3 | 73.78 | 99.96
Table 5. The comparison of the impact of mixture ratio α on the stealthiness of attacks.

α | CIFAR-10: ASR (%) | SSIM | PSNR | LPIPS | GTSRB: ASR (%) | SSIM | PSNR | LPIPS | CIFAR-100: ASR (%) | SSIM | PSNR | LPIPS
1.0 | 100 | 0.9657 | 36.23 | 0.0978 | 100 | 0.9634 | 35.68 | 0.1236 | 100 | 0.9735 | 39.35 | 0.1067
0.8 | 100 | 0.9820 | 40.44 | 0.0626 | 100 | 0.9804 | 36.54 | 0.0915 | 100 | 0.9875 | 41.68 | 0.0653
0.6 | 100 | 0.9921 | 41.58 | 0.0204 | 100 | 0.9913 | 38.37 | 0.0609 | 99.82 | 0.9946 | 44.46 | 0.0467
0.4 | 99.59 | 0.9963 | 42.78 | 0.0125 | 99.76 | 0.9951 | 40.67 | 0.0326 | 99.87 | 0.9974 | 45.38 | 0.0183
0.2 | 99.34 | 0.9975 | 43.45 | 0.0036 | 98.58 | 0.9963 | 41.87 | 0.0062 | 98.64 | 0.9987 | 48.74 | 0.0085
Table 6. Stealthiness results from metrics (SSIM, PSNR, LPIPS).

Method | SSIM | PSNR | LPIPS
BadNets [8] | 0.9845 | 26.89 | 0.198
Blend [24] | 0.8876 | 23.75 | 1.13
TaCT [35] | 0.9285 | 35.56 | 0.763
WABA [27] | 0.9876 | 39.79 | 0.047
DFDT (ours) | 0.9921 | 41.58 | 0.029
Table 7. Results of the I-BAU defense experiment.

Defense | CIFAR-10 CA (%) | CIFAR-10 ASR (%) | GTSRB CA (%) | GTSRB ASR (%) | CIFAR-100 CA (%) | CIFAR-100 ASR (%)
No Defense | 94.1812 | 100 | 97.3445 | 100 | 75.0635 | 99.8216
I-BAU [36] | 94.1688 | 99.9882 | 97.3352 | 99.9821 | 75.0211 | 99.7854
Deviation | 0.0124 | 0.0118 | 0.0093 | 0.0179 | 0.0424 | 0.0362