AI-Based Steganography Method to Enhance the Information Security of Hidden Messages in Digital Images

Huynh, Nhi Do Ngoc; Jiang, Jiajun; Chen, Chung-Hao; Yang, Wen-Chao

doi:10.3390/electronics14224490


¹ Electrical and Computer Engineering Department, Old Dominion University, Norfolk, VA 23529, USA

² Department of Forensic Science, Central Police University, Taoyuan City 333322, Taiwan

* Author to whom correspondence should be addressed.

Electronics 2025, 14(22), 4490; https://doi.org/10.3390/electronics14224490 (registering DOI)

Submission received: 29 September 2025 / Revised: 10 November 2025 / Accepted: 15 November 2025 / Published: 17 November 2025

(This article belongs to the Special Issue Advanced Machine Learning, Pattern Recognition, and Deep Learning Technologies: Methodologies and Applications, 2nd Edition)


Abstract

With the increasing sophistication of Artificial Intelligence (AI), traditional digital steganography methods face a growing risk of being detected and compromised. Adversarial attacks, in particular, pose a significant threat to the security and robustness of hidden information. To address these challenges, this paper proposes a novel AI-based steganography framework designed to enhance the security of concealed messages within digital images. Our approach introduces a multi-stage embedding process that utilizes a sequence of encoder models, including a base encoder, a residual encoder, and a dense encoder, to create a more complex and secure hiding environment. To further improve robustness, we integrate Wavelet Transforms with various deep learning architectures, namely Convolutional Neural Networks (CNNs), Bayesian Neural Networks (BNNs), and Graph Convolutional Networks (GCNs). We conducted a comprehensive set of experiments on the FashionMNIST and MNIST datasets to evaluate our framework’s performance against several adversarial attacks. The results demonstrate that our multi-stage approach significantly enhances resilience. Notably, while CNN architectures provide the highest baseline accuracy, BNNs exhibit superior intrinsic robustness against gradient-based attacks. For instance, under the Fast Gradient Sign Method (FGSM) attack on the MNIST dataset, our BNN-based models maintained an accuracy of over 98%, whereas the performance of comparable CNN models dropped sharply to between 10% and 18%. This research provides a robust and effective method for developing next-generation secure steganography systems.

Keywords: artificial intelligence; steganography; machine learning; deep learning; mathematical optimization; wavelet transform

1. Introduction

In the current digital era, the security of transmitted information is of paramount importance. With the rapid advancement of digital technologies, the risk of data breaches and unauthorized access by malicious actors has increased significantly [1]. Consequently, robust methods are required to ensure the confidentiality and integrity of sensitive data during transmission. Steganography, the art and science of hiding secret messages within an ordinary-looking cover medium such as an image, audio, or text file, has emerged as a key technology for covert communication [2,3]. Unlike cryptography, which encrypts the content of a message, steganography aims to conceal the very existence of the message, making it an attractive tool for enhancing information security [4].

However, traditional steganography techniques, such as the Least Significant Bit (LSB) method [5,6], often exhibit vulnerabilities. They can be detected by modern steganalysis tools, particularly those powered by Artificial Intelligence (AI) and deep learning [7]. Furthermore, the rise of adversarial attacks poses a significant threat to even more advanced, AI-based steganography methods [8,9]. These attacks can introduce small, often imperceptible perturbations to the cover medium, causing the hidden information to be lost or incorrectly extracted, thereby compromising the entire security framework. This highlights a critical research gap: the need for a steganography method that is not only effective at hiding data but also robust against sophisticated adversarial attacks [10].

To address these challenges, this paper proposes a novel AI-based steganography framework that enhances the security and robustness of hidden messages in digital images. Our approach integrates the analytical power of Wavelet Transforms (WT) with various deep learning architectures, including Convolutional Neural Networks (CNNs), Bayesian Neural Networks (BNNs), and Graph Convolutional Networks (GCNs). The core of our method lies in a multi-stage embedding process, which includes a primary encoder, a residual encoder, and a dense encoder. This layered approach is designed to increase the complexity for potential attackers and improve the resilience of the hidden data. Recent deep-learning frameworks such as SteganoGAN, HiDDeN, and U-Net-based models achieve high embedding capacity but remain vulnerable to gradient-based adversarial attacks. Unlike these single-stage approaches, the proposed multi-stage wavelet-integrated framework focuses on enhancing robustness and reversibility, forming a complementary direction to existing GAN- or U-Net-based designs.

This work addresses two primary gaps in current AI-based steganography methods: (1) insufficient robustness to gradient-based perturbations that can easily reveal or destroy hidden information, and (2) the absence of probabilistic modeling to estimate uncertainty and improve resilience. Our multi-stage Bayesian–Wavelet framework is designed to close these gaps by integrating frequency-domain embedding with probabilistic feature learning.

The main contributions of this work are threefold:

We design and implement a novel, multi-layered steganography framework that progressively embeds secret images, enhancing the overall security and capacity.
We systematically investigate the integration of Wavelet Transforms with different deep learning models to improve the robustness of the steganographic system against detection and adversarial manipulations.
We conduct a comprehensive experimental evaluation on the MNIST [11] and FashionMNIST [12] datasets, testing our method against a variety of adversarial attacks (e.g., FGSM, RNI) and defense mechanisms to demonstrate its superior performance and resilience compared to baseline models.

The remainder of this paper is organized as follows: Section 2 reviews related work. Section 3 details the proposed methods, including the framework architecture and the underlying models. Section 4 presents the experimental setup and analyzes the results. Finally, Section 5 concludes the paper and discusses potential directions for future research.

2. Related Work

2.1. Traditional and Reversible Steganography

Steganography, or data hiding, involves embedding secret data within a cover medium to achieve covert communication [2,13]. A widely known technique is Least Significant Bit (LSB) steganography, which modifies the least significant bits of the cover image’s pixels. While simple, LSB is often susceptible to statistical analysis. To address the need for perfect data recovery, Reversible Data Hiding (RDH) techniques have been developed [14,15]. Unlike irreversible methods where the cover medium is permanently altered [6,16], RDH allows for the complete restoration of the original cover image after the secret data has been extracted [17].

2.2. Deep Learning and Wavelets in Steganography

With the rise of deep learning, researchers have applied various neural network architectures to steganography and steganalysis to improve embedding capacity and security. Convolutional Neural Networks (CNNs) have been a popular choice due to their powerful feature extraction capabilities for image data [18,19]. More recently, other architectures, such as Graph Convolutional Networks (GCNs), which model images as graphs to capture non-local relationships, have been explored [20,21]. Bayesian Neural Networks (BNNs) offer a probabilistic approach, which can provide inherent robustness against certain types of analysis [22].

In parallel, Wavelet Transforms (WT) have been recognized as a powerful tool for signal analysis [23,24]. In steganography, applying WT allows for embedding secret data in the frequency domain of an image rather than the spatial domain [25,26]. This can make the hidden data more resilient to common image processing operations and statistical attacks. Combining deep learning with wavelet analysis is a promising direction for creating more robust steganographic systems [27].

2.3. Adversarial Attacks and Defenses in Image Security

The security of deep learning models is a major concern in many fields, including multimedia forensics [28]. Adversarial attacks are designed to fool deep learning models by introducing carefully crafted, often imperceptible, perturbations into the input data [8,9]. The Fast Gradient Sign Method (FGSM) is a well-known example of such an attack [8]. Other approaches may involve more complex optimization or knowledge of the target model’s architecture [29]. To counter these threats, various defense mechanisms have been proposed, including adversarial training, input transformation, and the use of generative models to purify the input [30,31]. Evaluating the robustness of any AI-based security system against these attacks is therefore essential [10].

In recent years, several studies have further explored frequency-domain analysis as an effective means of improving adversarial robustness in both communication and image-security systems. For example, Zhang et al. [32] introduced a homomorphic filtering-based adversarial defense (HFAD) that suppresses high-frequency components to enhance model stability against perturbations. Meng et al. [33] proposed a frequency-domain feature enhancement and hybrid adversarial training framework (EH-AT) to improve the robustness of deep models against transferable attacks. Similarly, Zhang et al. [34] presented a meta adversarial transfer attack and adaptive defense strategy that enhances domain-invariant representations to resist complex perturbations. By integrating these perspectives, our study extends current research toward a unified Bayesian–Wavelet formulation for resilient steganography.

3. Proposed Methods

To enhance the security of hidden information in digital images against advanced detection and adversarial attacks, we propose a multi-stage steganographic framework powered by deep learning. This section details the overall architecture of our framework, the core deep learning models utilized, the integration of Wavelet Transforms, and the setup for adversarial testing.

3.1. Overall Steganographic Framework

The central concept of our method is a layered embedding and extraction process designed to increase security through complexity. Instead of a single encoding step, we introduce a sequence of encoders, an idea motivated by the potential for enhanced security through multiple-image steganography [35]. The framework consists of an encoding and a decoding phase, as illustrated in Figure 1.

3.1.1. Encoding Phase

This phase embeds secret information through a series of three specialized encoders:

Base Encoder: Embeds the first secret image into a cover image, producing an initial Encoder Image.
Residual Encoder: Takes the Encoder Image as a new cover and embeds a second secret image, creating a Residual Encoder Image.
Dense Encoder: Further embeds another layer of information into the Residual Encoder Image, resulting in the final Dense Encoder Image for transmission. This dense, multi-layered structure significantly complicates the statistical analysis for any potential attacker.

Mathematical Formulation. Let $x \in \mathbb{R}^{H \times W}$ denote the cover image and $s \in \mathbb{R}^{H \times W}$ the secret image. The Base Encoder performs a nonlinear mapping:

$$E_{\mathrm{base}}(x, s) = \phi(W_1 * [x, s] + b_1), \tag{1}$$

where $*$ denotes convolution and $\phi(\cdot)$ is the Rectified Linear Unit (ReLU) activation. The Residual Encoder refines the embedding through residual learning:

$$E_{\mathrm{res}} = E_{\mathrm{base}} + \phi(W_2 * E_{\mathrm{base}} + b_2). \tag{2}$$

The Dense Encoder concatenates previous feature maps for enhanced reuse:

$$E_{\mathrm{dense}} = \phi(W_3 * [E_{\mathrm{base}}, E_{\mathrm{res}}] + b_3), \tag{3}$$

producing the final stego image $I_{\mathrm{stego}} = E_{\mathrm{dense}}(x, s)$.
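As a minimal sketch of Eqs. (1) to (3), the three stages can be chained as below. Several choices here are assumptions made only for illustration and are not stated in the text: the convolutions are simplified to 1x1 (per-pixel channel mixing), the weights are random rather than trained, the hidden width is 8 channels, and the helper names (`conv1x1`, `multi_stage_encode`, `init_params`) are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def conv1x1(feat, weight, bias):
    """1x1 convolution: feat is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, feat) + bias[:, None, None]

def multi_stage_encode(x, s, params):
    """Chain the base, residual, and dense stages of Eqs. (1)-(3)."""
    w1, b1, w2, b2, w3, b3 = params
    xs = np.stack([x, s])                              # [x, s], shape (2, H, W)
    e_base = relu(conv1x1(xs, w1, b1))                 # Eq. (1)
    e_res = e_base + relu(conv1x1(e_base, w2, b2))     # Eq. (2)
    both = np.concatenate([e_base, e_res], axis=0)     # [E_base, E_res]
    e_dense = relu(conv1x1(both, w3, b3))              # Eq. (3)
    return e_dense[0]                                  # single-channel stego image

def init_params(channels=8, seed=0):
    """Random, untrained weights (illustrative only)."""
    rng = np.random.default_rng(seed)
    return (
        rng.normal(0, 0.1, (channels, 2)), np.zeros(channels),
        rng.normal(0, 0.1, (channels, channels)), np.zeros(channels),
        rng.normal(0, 0.1, (1, 2 * channels)), np.zeros(1),
    )

rng = np.random.default_rng(1)
cover = rng.random((28, 28))    # stand-in for a 28 x 28 cover image
secret = rng.random((28, 28))   # stand-in for a 28 x 28 secret image
stego = multi_stage_encode(cover, secret, init_params())
```

The output keeps the spatial size of the cover image, which is what allows each stage's result to serve as the input of the next.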

3.1.2. Decoding Phase

This is the reverse process designed to extract the secret images layer by layer:

Base Decoder: Extracts the Residual Encoder Image from the received Dense Encoder Image.
Residual Decoder: Recovers the Encoder Image from the Residual Encoder Image.
Dense Decoder: Finally extracts the original first secret image from the Encoder Image.

Mathematically, the decoder network reconstructs the hidden image from $I_{\mathrm{stego}}$ as

$$D_{\mathrm{base}} = \phi(W_4 * I_{\mathrm{stego}} + b_4), \quad \hat{s} = \sigma(W_5 * D_{\mathrm{base}} + b_5), \tag{4}$$

where $\sigma(\cdot)$ is the sigmoid activation and $\hat{s}$ denotes the recovered secret image. This hierarchical design ensures that secret information is deeply concealed, making it more resilient to steganalysis.
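The two-layer mapping of Eq. (4) can be sketched as follows. This is an illustration under assumptions: a single channel, 3x3 kernels with zero padding, and random untrained weights; the helper `conv3x3_same` is hypothetical and not part of the described implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv3x3_same(image, kernel, bias):
    """Zero-padded 3x3 cross-correlation of a single-channel image."""
    h, w = image.shape
    padded = np.pad(image, 1)
    out = np.zeros((h, w))
    for m in range(3):
        for n in range(3):
            out += kernel[m, n] * padded[m:m + h, n:n + w]
    return out + bias

def decode(stego, w4, b4, w5, b5):
    """Eq. (4): ReLU feature layer followed by a sigmoid output layer."""
    d_base = np.maximum(conv3x3_same(stego, w4, b4), 0.0)
    return sigmoid(conv3x3_same(d_base, w5, b5))

rng = np.random.default_rng(0)
stego = rng.random((28, 28))    # stand-in for a received stego image
s_hat = decode(stego, rng.normal(0, 0.5, (3, 3)), 0.0,
               rng.normal(0, 0.5, (3, 3)), 0.0)
```

The sigmoid keeps every recovered pixel in the unit interval, matching images normalized to [0, 1].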

3.2. Backbone Deep Learning Architectures

Our framework is flexible and can be implemented with various deep learning models as its “backbone”. In this study, we evaluated three distinct types of architectures:

Convolutional Neural Networks (CNN): As the standard for image processing tasks, CNNs [18] are highly effective at capturing spatial hierarchies. We implement several established architectures, including EfficientNet [36], GoogLeNet [37], and ResNet [38], as powerful feature extractors.
Bayesian Neural Networks (BNN): Unlike deterministic models, BNNs [22] treat model weights as probability distributions, allowing them to quantify uncertainty. We hypothesize this can make the system inherently more robust against adversarial attacks.
Graph Convolutional Networks (GCN): GCNs [20,21] operate on graph-structured data. We model an image as a graph of pixels to capture non-local dependencies, offering a different approach to feature extraction.

For the CNN and GCN formulation, a convolutional layer with filters $W \in \mathbb{R}^{K \times C \times h \times w}$ computes

$$Y_k(i, j) = \sum_{c=1}^{C} \sum_{m=1}^{h} \sum_{n=1}^{w} X_c(i + m, j + n)\, W_{k,c}(m, n) + b_k, \tag{5}$$

followed by ReLU and max pooling over a window $\Omega$:

$$X'(i, j) = \max_{(m, n) \in \Omega} Y(i + m, j + n). \tag{6}$$
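Eqs. (5) and (6) translate directly into array operations. The sketch below assumes no padding, unit stride for the convolution, and a non-overlapping 2 x 2 pooling window (the text does not specify the window); indices are zero-based in code, whereas the equations count from 1.

```python
import numpy as np

def conv_layer(x, weights, bias):
    """Eq. (5) without padding: x is (C, H, W), weights is (K, C, h, w)."""
    k_out, c_in, kh, kw = weights.shape
    _, height, width = x.shape
    out_h, out_w = height - kh + 1, width - kw + 1
    y = np.zeros((k_out, out_h, out_w))
    for m in range(kh):
        for n in range(kw):
            patch = x[:, m:m + out_h, n:n + out_w]     # shifted input window
            y += np.einsum("kc,chw->khw", weights[:, :, m, n], patch)
    return y + bias[:, None, None]

def max_pool(y, size=2):
    """Eq. (6) with a non-overlapping size x size window."""
    k, h, w = y.shape
    h2, w2 = h // size, w // size
    trimmed = y[:, :h2 * size, :w2 * size]             # drop ragged border
    return trimmed.reshape(k, h2, size, w2, size).max(axis=(2, 4))

x = np.arange(16, dtype=float).reshape(1, 4, 4)
weights = np.ones((1, 1, 2, 2))                        # 2 x 2 summing filter
y = np.maximum(conv_layer(x, weights, np.zeros(1)), 0.0)   # ReLU
pooled = max_pool(y)
```

With the all-ones filter, each output is the sum of a 2 x 2 input patch, which makes the result easy to check by hand.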

For the GCN-based encoder, each pixel is treated as a node in a graph with adjacency $A \in \mathbb{R}^{N \times N}$:

$$\hat{A} = D^{-\frac{1}{2}} A D^{-\frac{1}{2}}, \quad D_{ii} = \sum_{j} A_{ij}, \tag{7}$$

and node aggregation is performed as

$$H = \hat{A} X W, \quad y = \mathrm{Softmax}(W_{fc}^{\top} H + b_{fc}). \tag{8}$$
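The normalization in Eq. (7) and the aggregation step of Eq. (8) can be sketched as follows. The 4-neighbour pixel connectivity is an assumption (the text does not say how edges are chosen), and the sketch follows Eq. (7) as written, without added self-loops; it also assumes every node has at least one neighbour so that the degree is non-zero. The classification head of Eq. (8) is omitted.

```python
import numpy as np

def grid_adjacency(height, width):
    """4-neighbour adjacency for an image whose pixels are graph nodes."""
    n = height * width
    adj = np.zeros((n, n))
    for i in range(height):
        for j in range(width):
            node = i * width + j
            if j + 1 < width:                          # right neighbour
                adj[node, node + 1] = adj[node + 1, node] = 1.0
            if i + 1 < height:                         # bottom neighbour
                adj[node, node + width] = adj[node + width, node] = 1.0
    return adj

def normalize_adjacency(adj):
    """Eq. (7): symmetric normalization D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def gcn_layer(a_hat, x, w):
    """Aggregation step of Eq. (8): H = A_hat X W."""
    return a_hat @ x @ w

adj = grid_adjacency(2, 2)
a_hat = normalize_adjacency(adj)
features = np.ones((4, 1))                             # one feature per pixel
hidden = gcn_layer(a_hat, features, np.array([[1.0]]))
```

Note that a full 28 x 28 image gives N = 784 nodes, so a dense adjacency matrix has 784 x 784 entries; a sparse representation would be the natural choice at scale.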

For the Bayesian Neural Encoder, each weight and bias is modeled as a Gaussian distribution, allowing the network to capture uncertainty through stochastic parameter sampling. Specifically,

$$W \sim \mathcal{N}(\mu_W, \sigma_W^2), \quad b \sim \mathcal{N}(\mu_b, \sigma_b^2), \tag{9}$$

with $\sigma = \log(1 + e^{\rho})$ through softplus reparameterization. Training maximizes the evidence lower bound (ELBO), or equivalently minimizes its negative:

$$\mathcal{L}_{\mathrm{ELBO}} = \mathbb{E}_{q(W, b)}[\log p(y \mid x, W, b)] - \mathrm{KL}(q(W, b) \,\|\, p(W, b)), \tag{10}$$

providing regularization and improved robustness against gradient-based perturbations.
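The sampling step of Eq. (9) and the KL term of Eq. (10) can be sketched as below. The standard-normal prior is an assumption (the text does not state the prior), and the closed-form KL divergence shown applies to diagonal Gaussians only.

```python
import numpy as np

def softplus(rho):
    """sigma = log(1 + exp(rho)), which keeps sigma positive."""
    return np.log1p(np.exp(rho))

def sample_weights(mu, rho, rng):
    """Reparameterized draw W = mu + sigma * eps with eps ~ N(0, 1)."""
    sigma = softplus(rho)
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

def kl_gaussian(mu_q, sigma_q, mu_p=0.0, sigma_p=1.0):
    """KL(q || p) for diagonal Gaussians, summed over all elements."""
    return np.sum(
        np.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )

rng = np.random.default_rng(0)
mu = np.zeros((3, 3))      # variational means
rho = np.zeros((3, 3))     # unconstrained scale parameters
w_sample = sample_weights(mu, rho, rng)
kl = kl_gaussian(mu, softplus(rho))
```

Because each forward pass draws fresh weights, the gradient seen by an attacker differs from draw to draw, which is the mechanism the text hypothesizes as a source of robustness.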

3.3. Wavelet Transform Integration

To further enhance feature representation and robustness, we employ Wavelet Transforms (WT) as a pre-processing step [23,24]. WT decomposes an image into different frequency sub-bands. By training our models on these wavelet coefficients, we aim to:

Capture textural and frequency-based features that are less susceptible to simple statistical analysis.
Improve resilience to noise and adversarial perturbations that primarily affect specific frequency bands [25].

In our experiments, we evaluated a range of wavelet techniques, including the 2D Haar Wavelet Transform [19].

For the Wavelet-based Consistency Loss, the discrete wavelet transform $W(\cdot)$ is applied to encourage similarity between the cover and stego domains. The corresponding loss function is defined as:

$$\mathcal{L}_{\mathrm{wavelet}} = \| W(x) - W(\hat{x}) \|_1. \tag{11}$$
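A single-level 2D Haar transform and the L1 loss of Eq. (11) can be sketched in plain NumPy. The hand-written transform below is an illustrative stand-in, not the implementation used in the experiments; it assumes even image dimensions and the orthonormal Haar scaling.

```python
import numpy as np

def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform of an even-sized image."""
    a = image[0::2, 0::2]   # top-left pixel of each 2 x 2 block
    b = image[0::2, 1::2]   # top-right
    c = image[1::2, 0::2]   # bottom-left
    d = image[1::2, 1::2]   # bottom-right
    approx = (a + b + c + d) / 2.0      # low-pass (approximation) band
    col_diff = (a - b + c - d) / 2.0    # difference across columns
    row_diff = (a + b - c - d) / 2.0    # difference across rows
    diag = (a - b - c + d) / 2.0        # diagonal detail
    return np.stack([approx, col_diff, row_diff, diag])

def wavelet_l1_loss(x, x_hat):
    """Eq. (11): L1 distance between wavelet coefficients."""
    return float(np.abs(haar_dwt2(x) - haar_dwt2(x_hat)).sum())

rng = np.random.default_rng(0)
cover = rng.random((28, 28))
coeffs = haar_dwt2(cover)
flat = haar_dwt2(np.full((4, 4), 0.5))  # constant image: no detail energy
```

Each 28 x 28 image yields four 14 x 14 sub-bands, and the orthonormal scaling preserves total energy, so the loss weights all sub-bands on a common scale.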

3.4. Adversarial Attack and Defense Scenarios

A key goal of this research is to evaluate the framework’s robustness against adversarial examples [9]. We tested the trained models under several conditions.

3.4.1. Adversarial Attacks

We employed three common types of attacks to generate adversarial examples:

Fast Gradient Sign Method (FGSM): A classic white-box attack that creates adversarial perturbations based on the model’s loss function gradient [8].
Random Noise Injection (RNI): A simpler baseline attack that adds random noise to the input image.
Know The Knowledge of The Model’s Architecture (KTKOTMA): A powerful white-box attack scenario where the adversary has full knowledge of the model’s architecture.
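To make the first two attacks concrete, the sketch below applies FGSM and RNI to a toy logistic-regression classifier, chosen because its input gradient has a closed form. The classifier, its random weights, the uniform noise model for RNI, and the budget `eps = 0.1` are all assumptions for illustration; the KTKOTMA scenario is not sketched because the text does not specify its procedure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(x, y, w, b):
    """Binary cross-entropy in a numerically stable form."""
    z = x @ w + b
    return np.logaddexp(0.0, z) - y * z

def fgsm(x, y, w, b, eps):
    """x_adv = clip(x + eps * sign(dL/dx)) for logistic regression."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w                 # analytic gradient of the loss w.r.t. x
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def random_noise_injection(x, eps, rng):
    """Baseline attack: bounded uniform noise, independent of the model."""
    noise = rng.uniform(-eps, eps, size=x.shape)
    return np.clip(x + noise, 0.0, 1.0)

rng = np.random.default_rng(0)
w = 0.5 * rng.standard_normal(16)        # toy classifier weights
b = 0.0
x = np.full(16, 0.5)                     # mid-grey input, away from the clip range
y = 1.0
x_adv = fgsm(x, y, w, b, eps=0.1)
x_rni = random_noise_injection(x, eps=0.1, rng=rng)
```

Both perturbations respect the same per-pixel budget, but only FGSM aligns every pixel change with the loss gradient, which is why it is typically far more damaging than random noise of equal magnitude.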

3.4.2. Adversarial Defenses

We also implemented and tested three defense strategies to measure their effectiveness:

Inject Random Patterns to Disrupt the Adversarial Perturbation (IRPDAP): A method that adds structured noise to disrupt adversarial patterns.
Convert Input Images to Binary Images (CIIBI): A pre-processing defense that simplifies the input space, potentially removing subtle perturbations.
Combined Two Defense Methods (CTDM): A hybrid approach that combines both IRPDAP and CIIBI for a multi-faceted defense.
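The three defenses can be sketched as simple input transformations. The details below are assumptions, since the text describes the defenses only at a high level: the 0.5 binarization threshold, the random sign pattern used as the injected noise, its strength, and the order in which the combined method applies the two steps.

```python
import numpy as np

def ciibi(image, threshold=0.5):
    """Binarize: pixels above the threshold become 1, the rest 0."""
    return (image > threshold).astype(float)

def irpdap(image, strength, rng):
    """Add a random +/- strength pattern (assumed form of the injected noise)."""
    pattern = strength * rng.choice([-1.0, 1.0], size=image.shape)
    return np.clip(image + pattern, 0.0, 1.0)

def ctdm(image, strength, rng, threshold=0.5):
    """Combined defense: random pattern first, then binarization (assumed order)."""
    return ciibi(irpdap(image, strength, rng), threshold)

clean = np.array([[0.1, 0.9], [0.9, 0.1]])
perturbed = clean + 0.05 * np.array([[1.0, -1.0], [-1.0, 1.0]])  # small perturbation
rng = np.random.default_rng(0)
randomized = irpdap(perturbed, 0.1, rng)
combined = ctdm(perturbed, 0.1, rng)
```

In this toy case the perturbation never pushes a pixel across the threshold, so binarization maps the perturbed and clean images to the same output; perturbations near the threshold would not be removed this way.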

For the overall training objective, the total loss jointly optimizes reconstruction fidelity, adversarial robustness, and wavelet-domain consistency, and is defined as:

$$\mathcal{L}_{\mathrm{total}} = \alpha \mathcal{L}_{\mathrm{rec}} + \beta \mathcal{L}_{\mathrm{adv}} + \gamma \mathcal{L}_{\mathrm{wavelet}}, \tag{12}$$

where

$$\mathcal{L}_{\mathrm{rec}} = \| x - \hat{x} \|_2^2, \tag{13}$$

$$\mathcal{L}_{\mathrm{adv}} = -\mathbb{E}_{x}[\log D(G(x))], \tag{14}$$

$$\mathcal{L}_{\mathrm{wavelet}} = \| W(x) - W(\hat{x}) \|_1. \tag{15}$$

Here, $\alpha$, $\beta$, and $\gamma$ are balancing coefficients controlling the trade-off between image quality and robustness.
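The weighted sum of Eqs. (12) to (15) can be sketched as a single function. The coefficient values below are placeholders, since the text does not report the values of the balancing coefficients, and `d_score` is a hypothetical array standing in for the discriminator outputs D(G(x)) in (0, 1].

```python
import numpy as np

def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform (even-sized input)."""
    a, b = image[0::2, 0::2], image[0::2, 1::2]
    c, d = image[1::2, 0::2], image[1::2, 1::2]
    return np.stack([a + b + c + d, a - b + c - d,
                     a + b - c - d, a - b - c + d]) / 2.0

def total_loss(x, x_hat, d_score, alpha=1.0, beta=0.1, gamma=0.5):
    """Eq. (12) with placeholder coefficients alpha, beta, gamma."""
    l_rec = np.sum((x - x_hat) ** 2)                          # Eq. (13)
    l_adv = -np.mean(np.log(d_score))                         # Eq. (14)
    l_wav = np.sum(np.abs(haar_dwt2(x) - haar_dwt2(x_hat)))   # Eq. (15)
    return float(alpha * l_rec + beta * l_adv + gamma * l_wav)

x = np.zeros((2, 2))
x_hat = np.ones((2, 2))
perfect_score = np.array([1.0])   # discriminator fully fooled, so L_adv = 0
loss_mismatch = total_loss(x, x_hat, perfect_score)
loss_match = total_loss(x, x, perfect_score)
```

Raising `alpha` or `gamma` favours fidelity in the pixel or wavelet domain, while raising `beta` favours the adversarial term.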

The complete mathematical formulation, including the encoder–decoder mappings and the joint optimization objective, provides a clear basis for reproducing the proposed framework.

4. Experimental Results

This section presents a comprehensive evaluation of our proposed AI-based steganography framework. We begin by detailing the experimental setup, followed by a systematic analysis of the performance of different model architectures, the effectiveness of our multi-stage framework, and its robustness against various adversarial attacks and defenses.

4.1. Experimental Setup

Datasets: We used two standard benchmark datasets: MNIST [11] and FashionMNIST [12]. The MNIST dataset consists of 60,000 training and 10,000 testing grayscale images of handwritten digits (28 × 28 pixels) across 10 classes. The FashionMNIST dataset contains grayscale images of 10 fashion product categories of the same size and training/testing split, offering a more complex classification task. These grayscale datasets were selected to provide a controlled evaluation environment that isolates the effects of the embedding and adversarial processes without the confounding factors introduced by color channels.

Implementation Details: All experiments were conducted on Google Colab using NVIDIA Tesla T4 GPU instances with 16 GB of memory. The models were implemented in Python 3.6 and PyTorch 1.2.0, along with supporting libraries such as torch-geometric 1.3.2 and PyWavelets 1.1.1. All models were trained for 10 epochs with a batch size of 256 using the Adam optimizer and a learning rate of $1 \times 10^{-4}$. Early stopping with a patience of 20 epochs was employed to prevent overfitting. Unless otherwise stated, the random seed was fixed to 42 for all runs to ensure reproducibility. Each experiment was repeated three times, and the average performance and standard deviation are reported in the corresponding tables.
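The reported hyperparameters can be gathered in one configuration object, as sketched below. The class and function names are hypothetical, the accuracies passed to `summarize_runs` are placeholders rather than results from the paper, and the use of the sample standard deviation is an assumption.

```python
import random
import statistics
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as stated in the implementation details."""
    epochs: int = 10
    batch_size: int = 256
    learning_rate: float = 1e-4
    early_stopping_patience: int = 20
    seed: int = 42
    repeats: int = 3

def summarize_runs(accuracies):
    """Mean and sample standard deviation over repeated runs."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)

config = TrainConfig()
random.seed(config.seed)                      # fix the seed before each run

placeholder_runs = [0.98, 0.99, 0.97]         # illustrative values only
mean_acc, std_acc = summarize_runs(placeholder_runs)
```

With these values the patience exceeds the epoch budget, so early stopping would not trigger within a single 10-epoch run.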

Evaluation Metrics: The primary metric for evaluation is classification accuracy. For robustness analysis, we measure the accuracy of the models on adversarially perturbed images, both with and without defense mechanisms.

In addition to classification accuracy, we have further included quantitative image-quality metrics, including Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Mean Squared Error (MSE), to assess the fidelity and imperceptibility of the embedded images. While ROC and AUC are commonly used in binary detection tasks, the chosen metrics are more directly related to the performance of steganographic embedding and reconstruction.
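The three image-quality metrics can be computed as sketched below for images scaled to [0, 1]. The SSIM shown is a simplified global variant computed over the whole image with the commonly used constants 0.01 and 0.03; the standard metric averages the same expression over local windows, and the text does not say which variant was used.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images."""
    return float(np.mean((a - b) ** 2))

def psnr(a, b, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

def global_ssim(a, b, max_value=1.0):
    """Simplified SSIM using whole-image statistics instead of local windows."""
    c1, c2 = (0.01 * max_value) ** 2, (0.03 * max_value) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return float(((2 * mu_a * mu_b + c1) * (2 * cov + c2))
                 / ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)))

cover = np.zeros((4, 4))
stego = np.full((4, 4), 0.1)     # uniform offset of 0.1 from the cover
rng = np.random.default_rng(0)
texture = rng.random((8, 8))
```

A uniform offset of 0.1 gives an MSE of 0.01 and therefore a PSNR of 20 dB, a convenient hand-checkable reference point.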

Additional numerical results, tables, and complementary visual examples are provided in the Supplementary Materials.

4.2. Baseline Performance of Backbone Architecture

Before evaluating the steganography framework, we first established the baseline performance of the different deep learning architectures (CNNs, BNN, GCN) on the original, “clean” image classification task. This allows us to understand the inherent capabilities and limitations of each model.

The results are summarized in Table 1. As expected, the CNN models (EfficientNet, GoogLeNet, ResNet) achieved the highest baseline accuracy on both datasets, with accuracies often exceeding 98% on MNIST and around 90% on FashionMNIST. The GCN architecture showed respectable but lower performance, while the BNN model performed competitively with the CNNs. The training dynamics, showing the convergence of accuracy and loss for each architecture using the tensor transform, are illustrated in Figure 2.

The superior performance of CNNs is attributable to their architectural design, which is inherently suited for processing grid-like data such as images. Their use of convolutional filters and pooling layers effectively captures spatial hierarchies and local patterns. The GCN, which treats pixels as nodes in a graph, provides a non-standard but interesting approach; its slightly lower performance may be due to the loss of explicit grid structure. The BNN’s strong performance indicates that its probabilistic nature does not compromise its ability to learn complex features effectively.

4.3. Performance of the Multi-Stage Steganography Framework

We next evaluated the performance of our multi-stage framework. The key question was whether the models could successfully hide and reveal secret images without significantly degrading classification accuracy. To answer it, we analyzed the Base Encoder, Residual Encoder, and Dense Encoder stages, and we additionally measured stego-image quality using PSNR, SSIM, and MSE to assess imperceptibility and reconstruction fidelity.

As shown in Table 2, the accuracy of the steganography-enabled models on clean data remains high, especially for the CNN and BNN architectures. For instance, the ResNet-based encoder maintained an accuracy of 99.0% on MNIST and 93.1% on FashionMNIST, demonstrating the framework’s high fidelity. However, the performance under adversarial attack reveals the true benefit of the multi-stage design. The Dense Encoder stage consistently showed greater resilience to attacks compared to the Base Encoder stage across all architectures. For example, in Table 3, the defended accuracy of the ResNet-based Dense Encoder on MNIST reached 78.0%, a significant improvement over a single-stage encoder.

The results suggest that adding more encoding layers (from Base to Residual to Dense) improves security. Each additional stage applies a non-linear transformation to its input; this cascade further masks the statistical traces of the embedded message and makes it harder for an adversary to craft effective perturbations. While the multi-stage process increases computational overhead, it offers a clear trade-off for robustness.

Although classification accuracy primarily measures recognition, in this setting, it also indicates information preservation after embedding and decoding: high accuracy implies that the steganographic pipeline maintains essential visual and semantic cues. The accompanying PSNR/SSIM metrics further validate imperceptibility and fidelity.

Table 4 reports PSNR, SSIM, and MSE for the BNN backbone at the Encoder stage under five attack types, comparing Tensor and 2D Haar preprocessing. The residual and dense stages show similar trends (Table 5). Extended per-transform/per-stage results are provided in the Supplementary Materials. For reference, SteganoGAN was also evaluated using the same metrics. While it achieves reasonable PSNR and SSIM scores, its adversarial robustness is noticeably lower than that of the proposed multi-stage framework.

As defined in Figure 1, the Hybrid integrates Binary Conversion, Random Noise, and SteganoGAN perturbations.

The inclusion of PSNR, SSIM, and MSE provides direct evidence of visual fidelity and imperceptibility. As the payload size is fixed across experiments, the embedding rate remains constant and is therefore omitted for brevity. Additional visual examples of original, stego, and decoded images under different attacks and defenses are provided in Supplementary Materials S2 (PDF).

Furthermore, quantitative analysis of Table 4 and Table 5 reveals consistent trends across both datasets. Relative to the Tensor baseline, 2D Haar wavelet preprocessing slightly decreases MSE and increases both PSNR and SSIM, indicating that frequency-domain representations enhance imperceptibility and image fidelity. Among the multi-stage configurations, the Dense Encoder yields the highest visual quality while maintaining competitive accuracy, validating its design choice for high-security embedding.

To better illustrate the trade-off between robustness, fidelity, and computational cost, an overall summary table is provided later in Section 4.6. This table consolidates performance trends across both datasets and reports the $\Delta R$ values that quantify accuracy degradation under adversarial attacks.

4.4. Robustness Against Adversarial Attacks

The central goal of this research is to build a robust steganography system. To systematically evaluate this, we designed a comprehensive testing workflow, as illustrated in Figure 3. This process outlines the steps for assessing the performance of our framework under various adversarial conditions. An input image, either in its original clean state or as a stego-image, is first subjected to an adversarial attack. The resulting perturbed image can then be processed by an optional defense mechanism before it is fed into one of the backbone classifiers (CNN, BNN, or GCN). The final classification accuracy is then used to compute key robustness metrics, allowing for a thorough comparison.
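The workflow described above (attack, optional defense, classifier, accuracy) can be sketched as a small evaluation harness. Everything concrete here is a toy assumption: the mean-brightness classifier, the brightening attack, and the thresholding defense stand in for the real components, and reading the accuracy drop as clean accuracy minus attacked accuracy is only one plausible robustness metric, since the text does not give a formula.

```python
import numpy as np

def accuracy(classifier, images, labels):
    """Fraction of images whose predicted label matches the true label."""
    preds = np.array([classifier(img) for img in images])
    return float(np.mean(preds == labels))

def evaluate(classifier, images, labels, attack=None, defense=None):
    """Apply an optional attack, then an optional defense, then classify."""
    processed = images
    if attack is not None:
        processed = [attack(img) for img in processed]
    if defense is not None:
        processed = [defense(img) for img in processed]
    return accuracy(classifier, processed, labels)

# Toy stand-ins for the real components.
classifier = lambda img: int(img.mean() > 0.5)            # bright vs dark
attack = lambda img: np.clip(img + 0.2, 0.0, 1.0)         # uniform brightening
defense = lambda img: (img > 0.7).astype(float)           # thresholding

images = [np.full((4, 4), 0.4), np.full((4, 4), 0.9)]
labels = np.array([0, 1])

clean_acc = evaluate(classifier, images, labels)
attacked_acc = evaluate(classifier, images, labels, attack=attack)
defended_acc = evaluate(classifier, images, labels, attack=attack, defense=defense)
accuracy_drop = clean_acc - attacked_acc                  # assumed robustness metric
```

Passing the attack and defense as callables keeps the harness independent of the backbone, so the same loop can be reused for CNN, BNN, and GCN classifiers.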

Following this workflow, we tested the framework against three distinct adversarial attacks: Fast Gradient Sign Method (FGSM), Random Noise Injection (RNI), and a white-box scenario where the attacker knows the model’s architecture (KTKOTMA). The comparative performance of the architectures is summarized in Table 6.

The CNN models, particularly ResNet, demonstrated strong overall robustness, maintaining high accuracy under the RNI attack and recovering well with defenses. However, they were more vulnerable to the gradient-based FGSM attack. In contrast, the BNN architecture showed remarkable resilience to FGSM and KTKOTMA attacks, with accuracy remaining as high as 98% on MNIST. This suggests that its probabilistic nature is highly effective at resisting gradient-based attacks. The GCN architecture was found to be the most vulnerable, with its accuracy dropping to near 0% under FGSM and KTKOTMA attacks.

The differing vulnerabilities can be explained by the models’ core mechanisms. CNNs’ reliance on well-defined spatial feature gradients makes them a clear target for FGSM. BNNs, by sampling weights from a distribution, effectively create a stochastic gradient landscape during training, making it difficult for an attacker to find a single, consistent gradient direction to exploit. GCN’s vulnerability may stem from its graph structure; an attack that perturbs a few high-centrality pixels (nodes) could have a cascading effect across the graph, leading to a complete failure in classification.

4.5. Effectiveness of Defense Mechanisms

Finally, we evaluated the effectiveness of the IRPDAP, CIIBI, and their combined (CTDM) defense mechanisms in restoring model accuracy after an attack.

The results, presented in Table 7, show that all defense methods helped recover performance, but their effectiveness varied by model and dataset. For the MNIST dataset, the CIIBI defense was particularly effective for the BNN-based Dense Encoder, recovering accuracy to 74.0%. For FashionMNIST, the IRPDAP defense worked best, restoring the BNN model's accuracy to 60.0%. The combined defense (CTDM) generally yielded the highest recovery rates for CNN models, reaching up to 97% on MNIST. These results indicate that different defenses exploit distinct mechanisms and are therefore suited to different data complexities and model architectures. The CIIBI defense achieves higher recovery on simpler datasets such as MNIST, since binary quantization effectively removes small gradient-based perturbations without significantly damaging image structure. Conversely, IRPDAP performs better on complex datasets like FashionMNIST, because structured random patterns can disrupt adversarial noise while retaining essential textures. The combined method (CTDM) benefits from both effects, offering a balanced trade-off between noise suppression and information preservation, which explains its superior performance across most scenarios.

The defense mechanisms work through different principles. CIIBI defends by quantizing the input space, effectively destroying the subtle, low-magnitude perturbations that characterize many adversarial attacks. However, this can also lead to a loss of useful information, especially in more complex datasets like FashionMNIST. IRPDAP works by introducing structured noise that disrupts the adversarial pattern without completely erasing the original image features. The success of the combined approach for CNNs suggests that simultaneously simplifying the input space and disrupting adversarial patterns provides a multi-faceted and highly effective defense.

Our empirical findings align with prior defenses that suppress small perturbations via quantization or bit-depth reduction (a core idea behind our CIIBI) [39,40], as well as with randomized or noise-injection mechanisms that disrupt gradient alignment (related to IRPDAP) [41]. They also resonate with frequency-domain preprocessing that preserves semantic content while damping adversarial high-frequency components [42]. Representative examples include input transformations such as bit-depth reduction, median filtering, and JPEG compression, as well as randomized smoothing for certified robustness, and JPEG/DCT-based defenses.

4.6. Decoder Performance and Information Reversibility

A critical aspect of a steganography system is its ability to accurately recover the hidden secret information. We evaluated the performance of our decoding process, which corresponds to the Base, Residual, and Dense Decoder stages. The primary goal was to ensure that the multi-stage embedding did not corrupt the secret images to a point where they were irrecoverable.

The results, detailed in the Supplementary Materials, confirm the high reversibility and fidelity of our framework. For clean, non-attacked stego-images, the decoders were able to reconstruct the hidden images successfully, with classification accuracy on these recovered images remaining nearly identical to the baseline performance. For example, the ResNet-based decoder consistently achieved over 98% accuracy on recovered MNIST images and around 89% on FashionMNIST images. The framework’s robustness extends to post-attack scenarios; even after applying defenses, the decoder could still recover intelligible information, although accuracy decreased. For instance, after an IRPDAP defense, the CNN-based residual decoder still achieved accuracy scores above 90% for MNIST in many cases.

The high accuracy of the classification task on the decoded images serves as a strong proxy for the successful and reversible nature of the steganography process. It indicates that our deep learning models learned not only to hide information but also to preserve its essential features for accurate reconstruction. The robustness of the decoder, even after attacks, suggests that the learned transformations are resilient and can tolerate a degree of perturbation without catastrophic information loss. This reversibility is a fundamental requirement for a practical steganography system. Additional visual results, extended tables, and quality metrics are provided in the Supplementary Materials.

4.7. Overall Performance Comparison with State-of-the-Art Architectures

To provide a holistic view, this section summarizes the performance of our proposed steganography framework when implemented with different state-of-the-art (SOTA) architectures. Figure 4 visually encapsulates the key performance trade-offs between these implementations across clean, attack, and defense scenarios. The results allow us to directly compare the effectiveness of using CNN-based models (EfficientNet, GoogLeNet, ResNet), BNNs, and GCNs as the backbone for our security framework.

The bar chart in Figure 4 illustrates the dominance of CNN-based models (EfficientNet, GoogLeNet, and ResNet) in terms of accuracy on clean data and their strong recovery capability with defense mechanisms. It also highlights the unique strength of the BNN architecture, which, despite a slightly lower baseline, shows remarkable resilience under adversarial attacks compared to the other models. Conversely, the GCN's vulnerability is evident, reinforcing its unsuitability for this application. This comparison supports our central conclusion: while CNNs offer the best general performance, BNNs provide a superior option when security and robustness against specific adversarial threats are the primary concern.

To evaluate computational efficiency, we compared parameter counts and inference times across different backbone architectures. The multi-stage framework increases the total computation by approximately 30% compared with a single-stage encoder, yet maintains practical runtime for typical cloud-based experimental environments (average inference time around 0.12 s per image on a standard GPU instance in Google Colab). This moderate overhead represents a reasonable trade-off for the achieved improvements in robustness and imperceptibility. In future implementations, model compression techniques such as pruning and knowledge distillation could be applied to further reduce latency and memory usage without compromising robustness.

Table 8 summarizes the overall performance metrics, including fidelity (PSNR, SSIM, MSE), robustness loss ($\Delta R$), and average runtime per image for each encoder type. The Dense Encoder with Wavelet preprocessing demonstrates the best balance between fidelity, robustness, and computational efficiency.
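As a quick consistency check on the fidelity metrics, MSE and PSNR are linked by PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes pixel values scaled to [0, 1] (so MAX = 1), which is an assumption about the preprocessing and not something stated in Table 8; the helper names are illustrative. SSIM requires a windowed computation and is omitted here.

```python
import numpy as np

def mse(reference, test):
    """Mean squared error between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    return float(np.mean((ref - tst) ** 2))

def psnr_from_mse(mse_value, max_value=1.0):
    """PSNR in dB for a given MSE; max_value = 1.0 assumes [0, 1] pixels."""
    return float(10.0 * np.log10(max_value ** 2 / mse_value))

def psnr(reference, test, max_value=1.0):
    """PSNR in dB computed directly from two images."""
    return psnr_from_mse(mse(reference, test), max_value)

# MSE values reported in Table 8 (our models, then the SteganoGAN baseline).
for reported_mse in (0.0054, 0.0062):
    print(f"MSE = {reported_mse:.4f} -> PSNR = {psnr_from_mse(reported_mse):.2f} dB")
```

Under this assumption an MSE of 0.0054 corresponds to roughly 22.7 dB, close to the 22.66 to 22.67 dB reported. A small gap is expected if PSNR is averaged per image rather than derived from the mean MSE.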

5. Discussion

This study investigated how deep learning architectures and wavelet-domain processing can enhance the robustness of digital steganography against AI-driven attacks. The experiments demonstrate that both the network design and the frequency-domain representation play critical roles in determining security and resilience.

5.1. Effectiveness of the Multi-Stage Framework

The multi-stage embedding and decoding process significantly increases the difficulty for potential attackers. Among the tested encoders, the Dense Encoder produced the most robust results, balancing reconstruction fidelity and resistance to adversarial perturbations. This layered embedding strategy effectively disperses hidden information across multiple feature hierarchies, making gradient-based extraction or disruption far more challenging than in conventional single-stage approaches.

5.2. Impact of Backbone Architectures

Comparing different backbone models revealed clear trade-offs. CNN-based backbones (especially ResNet and EfficientNet) achieved the highest accuracy on clean data and provided stable performance under moderate attack. BNNs demonstrated slightly lower nominal accuracy but much stronger intrinsic robustness to gradient-based attacks such as FGSM and KTKOTMA, due to stochastic weight sampling that diffuses gradient directionality. GCNs, although capable of modeling relational structures, were less suited to pixel-level embedding and showed vulnerability to perturbations.
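The effect of stochastic weight sampling on gradient direction can be illustrated with a toy model. The sketch below is not the BNN used in this study: it assumes a linear logistic classifier with an independent Gaussian distribution over each weight (`mean` and `std` are made-up values) and compares the FGSM direction computed under one weight sample with the direction under a second, independent sample.

```python
import numpy as np

def fgsm_direction(weights, x, y):
    """Sign of d(loss)/dx for the logistic loss log(1 + exp(-y * w.x)), y in {-1, +1}."""
    margin = y * (weights @ x)
    # d(loss)/dx = -y * w / (1 + exp(margin)); logaddexp keeps this numerically stable.
    grad = -y * weights * np.exp(-np.logaddexp(0.0, margin))
    return np.sign(grad)

rng = np.random.default_rng(0)
dim = 784                                   # flattened 28 x 28 input
mean = rng.standard_normal(dim)             # assumed mean of the weight distribution
std = 1.0                                   # assumed standard deviation
x, y = rng.random(dim), 1

# Deterministic model: the attacker and the deployed model share the same weights.
deterministic = float(np.mean(fgsm_direction(mean, x, y) == fgsm_direction(mean, x, y)))

# Stochastic model: the attack gradient uses one weight sample, inference uses another.
attack_w = mean + std * rng.standard_normal(dim)
infer_w = mean + std * rng.standard_normal(dim)
stochastic = float(np.mean(fgsm_direction(attack_w, x, y) == fgsm_direction(infer_w, x, y)))

print(f"sign agreement, deterministic weights: {deterministic:.2f}")
print(f"sign agreement, sampled weights:       {stochastic:.2f}")
```

With fixed weights the attack direction matches the model exactly. With sampled weights a sizeable share of coordinates point the wrong way for any given forward pass, which is one plausible reading of how sampling diffuses gradient directionality.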

5.3. Role of Wavelet Transforms

Integrating Wavelet Transforms (WT) further improved the imperceptibility and stability of embedded images. By emphasizing low-frequency components and reducing sensitivity to high-frequency noise, WT-based preprocessing helped preserve essential image structures under attack. In particular, the 2D Wavelet Filtering configuration achieved near-perfect recovery accuracy in certain cases, confirming its ability to extract robust, attack-resistant features. Across attacks, 2D Haar preprocessing yields slightly lower MSE and modestly higher PSNR/SSIM than plain Tensor inputs (Table 4 and Table 5). This indicates that frequency-domain representations help preserve content fidelity under perturbations, likely by concentrating salient structures into more stable sub-bands and attenuating high-frequency noise components introduced by attacks.
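The sub-band argument can be made concrete with a single-level 2D Haar transform. The following sketch is a self-contained NumPy version written for illustration; it is not the preprocessing code used in the experiments, the sub-band names follow one common convention (conventions vary), and the smooth ramp image and sign-noise perturbation are synthetic stand-ins.

```python
import numpy as np

def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform of an image with even height and width."""
    a, b = image[0::2, 0::2], image[0::2, 1::2]
    c, d = image[1::2, 0::2], image[1::2, 1::2]
    ll = (a + b + c + d) / 2.0      # local averages (low frequency)
    lh = (a + b - c - d) / 2.0      # differences between the two rows of each block
    hl = (a - b + c - d) / 2.0      # differences between the two columns of each block
    hh = (a - b - c + d) / 2.0      # diagonal differences
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

def ll_energy_fraction(x):
    """Share of the signal energy that lands in the low-frequency (LL) sub-band."""
    return float(np.sum(haar_dwt2(x)[0] ** 2) / np.sum(x ** 2))

rng = np.random.default_rng(0)
ramp = np.linspace(0.0, 1.0, 28)
image = np.outer(ramp, ramp)                               # smooth synthetic content
noise = 0.05 * rng.choice([-1.0, 1.0], size=image.shape)   # sign noise, FGSM-like

print("content energy in LL:", round(ll_energy_fraction(image), 3))
print("noise energy in LL:  ", round(ll_energy_fraction(noise), 3))
```

Smooth content keeps almost all of its energy in the LL sub-band, whereas pixel-wise sign noise spreads its energy roughly evenly over the four sub-bands, so a representation that emphasizes LL retains structure while discarding most of the perturbation.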

5.4. Interpretation and Practical Implications

The observed robustness gains can be explained by complementary mechanisms: BNN-induced stochasticity lowers attack linearity, while WT-based decomposition attenuates adversarial perturbations in the frequency domain. These findings have practical implications for secure image communication, forensic watermarking, and covert Internet of Things (IoT) telemetry, where reliable and imperceptible message embedding is essential.

The average inference time was approximately 0.07–0.12 s per image on an NVIDIA T4 GPU, representing about a 30% increase over a single-stage encoder but remaining within practical runtime limits. The payload was fixed at 1 bit per pixel (bpp) across all models to ensure fair comparison. The accuracy degradation rates ($\Delta R = Acc_{clean} - Acc_{adv}$) reported in Table 6 quantify the robustness loss under adversarial perturbations. These results indicate that the added complexity of the multi-stage architecture yields moderate computational overhead while substantially improving robustness.
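As a worked example of these quantities, the snippet below computes the accuracy degradation for the clean and attack accuracies listed in the MNIST columns of Table 2, and the payload implied by 1 bpp on a 28 x 28 image. The helper names are illustrative, and the single-attack figures here are not the across-attack averages reported in Table 8.

```python
def robustness_loss(acc_clean, acc_adv):
    """Delta R = Acc_clean - Acc_adv, in percentage points."""
    return acc_clean - acc_adv

def payload_bits(height, width, bits_per_pixel=1.0):
    """Embedding capacity in bits for a single-channel image."""
    return int(height * width * bits_per_pixel)

# (clean, attack) accuracy in % from Table 2, MNIST columns.
table2_mnist = {
    "GoogLeNet": (98.7, 68.4),
    "EfficientNet": (99.2, 71.5),
    "ResNet": (99.0, 74.2),
}

for model, (clean, adv) in table2_mnist.items():
    print(f"{model}: Delta R = {robustness_loss(clean, adv):.1f} points")

bits = payload_bits(28, 28)
print(f"payload at 1 bpp on 28 x 28: {bits} bits ({bits // 8} bytes)")
```

For these entries the loss is 30.3, 27.7, and 24.8 percentage points respectively, and the fixed payload corresponds to 784 bits (98 bytes) per image.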

6. Limitations and Future Work

Despite encouraging results, several limitations remain. First, the increased security of the dense, multi-stage architecture introduces additional computational cost. Runtime measurements (Section 4.7) indicate roughly a 30% overhead compared with a single-stage model, which may hinder real-time use. Second, the current experiments focus on simple grayscale datasets (MNIST and FashionMNIST). Future research should therefore extend the framework to color and high-resolution datasets such as CIFAR-10 and CelebA to verify scalability in more complex visual domains.

Future work will also explore:

The trade-off between embedding capacity, security, and computational efficiency;
Lightweight architectures via knowledge distillation and network pruning;
Applications to video and multimodal steganography;
Integration of differential privacy and uncertainty calibration for enhanced confidentiality and interpretability.

Additional high-resolution visual results, extended tables, and quantitative quality metrics are provided in the Supplementary Materials.

7. Conclusions

This work presented a novel AI-based multi-stage steganography framework that integrates Wavelet Transforms with multiple deep-learning backbones to enhance robustness against modern adversarial attacks. Extensive experiments confirmed that the proposed approach significantly improves both the imperceptibility and the resilience of hidden data. By combining probabilistic modeling through BNNs with frequency-domain feature extraction via Wavelet Transforms, the framework establishes a solid foundation for next-generation secure image communication systems.

Overall, the results demonstrate that carefully designed multi-layer deep architectures can substantially advance the robustness of steganographic systems, bridging the gap between image-quality preservation and defense against AI-based attacks.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/electronics14224490/s1.

Author Contributions

Methodology, W.-C.Y. and N.D.N.H.; Writing—original draft preparation, N.D.N.H.; Writing—review and editing, N.D.N.H., J.J. and C.-H.C.; Supervision, W.-C.Y. and C.-H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable. This study did not involve human or animal subjects. The datasets used, MNIST and FashionMNIST, are publicly available and consist of anonymized images of handwritten digits and fashion articles, respectively.

Data Availability Statement

The MNIST and FashionMNIST datasets used in this study are publicly available. They can be downloaded from Kaggle at https://www.kaggle.com/datasets/hojjatk/mnist-dataset and https://www.kaggle.com/datasets/zalando-research/fashionmnist, respectively (both accessed on 14 November 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Tewera, D.; Zhou, M.; Gavai, V.P. Enhancing IoT Security for Socio-Economic Development in the Mirror of Challenges, Emerging Technologies, and Holistic Solutions. In Proceedings of the 2024 3rd Zimbabwe Conference of Information and Communication Technologies (ZCICT), Bulawayo, Zimbabwe, 28–29 November 2024; pp. 1–8.
2. Bohra, S.; Naik, C.; Batra, R.; Popat, K.; Kaur, H. Advancements in Modern Steganography Techniques for Enhanced Data Security: A Comprehensive Review. In Proceedings of the 2024 11th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 28 February–1 March 2024; pp. 941–944.
3. Allasasmh, O.; Laila, A.D.; Aljaidi, M.; Alsarhan, A.; Samara, G. Integrated Approaches to Steganography: Embedding Static Information Across Audio, Visual, and Textual Formats. In Proceedings of the 2024 International Jordanian Cybersecurity Conference (IJCC), Amman, Jordan, 17–18 December 2024; pp. 33–39.
4. Bhardwaj, J.; Panwar, R. Safeguarding Information in QR Codes Through Steganographic Technique. In Proceedings of the 2024 International Conference on Communication, Control, and Intelligent Systems (CCIS), Mathura, India, 6–7 December 2024; pp. 1–6.
5. Rafat, K.F.; Sajjad, M.S. Advancing Reversible LSB Steganography: Addressing Imperfections and Embracing Pioneering Techniques for Enhanced Security. IEEE Access 2024, 12, 143434–143457.
6. Rafat, F.K.; Sajjad, M.S. Reversing the Irreversible LSB Steganography: Transformative Advances in Reversible Data Hiding. In Proceedings of the 2025 6th International Conference on Advancements in Computational Sciences (ICACS), Lahore, Pakistan, 18–19 February 2025; pp. 1–8.
7. Ruan, F.; Zhang, X.; Zhu, D.; Xu, Z.; Wan, S.; Qi, L. Deep Learning for real-time image steganalysis: A survey. J. Real-Time Image Process. 2020, 17, 149–160.
8. Kim, Y.; Jung, J.; Kim, H.; So, H.; Ko, Y.; Shrivastave, A. Adversarial Defense on Harmony: Reverse Attack for Robust AI Models Against Adversarial Attacks. IEEE Access 2024, 12, 176485–176497.
9. Peng, D.; Dong, J.; Zhang, M.; Yang, J.; Wang, Z. TCSFAdv: Critical Semantic Fusion Guided Least-Effort Adversarial Example Attacks. IEEE Trans. Inf. Forensics Secur. 2024, 19, 5940–5955.
10. Sehgal, V.; Sharma, S.; Pathak, S.; Ahuja, K. Navigating The Battleground: An Analysis Of Adversarial Threats And Protections In Deep Neural Networks. In Proceedings of the 2024 IEEE 4th International Conference on ICT in Business Industry & Government (ICTBIG), Indore, India, 13–14 December 2024; pp. 1–9.
11. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
12. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv 2017, arXiv:1708.07747.
13. Ramadhan, F.I.; Anandha, D.A.R.; D’Layla, W.C.A.; Croix, J.D.N.; Ahmad, T. Image Steganography using Customized Differences between the Neighboring Pixels. In Proceedings of the 2024 7th International Conference on Informatics and Computational Sciences (ICICoS), Semarang, Indonesia, 17–18 July 2024; pp. 496–501.
14. Shetty, N.R. A Study and Analysis of Reversible Data Hiding Techniques. In Proceedings of the 2024 Second International Conference on Advances in Information Technology (ICAIT), Chikkamagaluru, India, 24–27 July 2024; pp. 1–6.
15. Daiyrbayeva, E.; Merzlyakova, E.; Yerimbetova, A.; Mukhitova, A. Reversible Steganographic System for the Transmission of Personal Medical Data. In Proceedings of the 2024 9th International Conference on Computer Science and Engineering (UBMK), Antalya, Turkiye, 26–28 October 2024; pp. 1–6.
16. Jung, S.; On, B. An Advanced Reversible Data Hiding Algorithm Using Local Similarity, Curved Surface Characteristics, and Edge Characteristics in Images. Appl. Sci. 2020, 10, 836.
17. Zhang, T.; Ye, X.; Xiao, X.; Xiang, T.; Li, H.; Cao, X. A Reversible Framework for Efficient and Secure Visual Privacy Protection. IEEE Trans. Inf. Forensics Secur. 2023, 18, 3334–3349.
18. Navaprakash, N.; Reddy, V.S.; Dakshinesh, S. Improving accuracy in Text Extraction From images using Region-Based Convolutional Neural Networks algorithm compared to Convolutional Neural Network algorithm. In Proceedings of the 2025 International Conference on Artificial Intelligence and Data Engineering (AIDE), Nitte, India, 6–7 February 2025; pp. 706–710.
19. Srusti, R.; Shruthi, M.L.J. Implementation and Comparative Analysis of CNN and Discrete Haar Wavelet Transform in Image Steganography. In Proceedings of the 2024 IEEE 16th International Conference on Computational Intelligence and Communication Networks (CICN), Indore, India, 22–23 December 2024; pp. 912–916.
20. Zhao, J.; Xing, H. Fault Section Location in Active Distribution Networks Based on Traveling Wave and Graph Convolutional Network. In Proceedings of the 2025 2nd International Conference on Smart Grid and Artificial Intelligence (SGAI), Changsha, China, 21–23 March 2025; pp. 706–710.
21. Hasan, N.M.; Saha, N.; Rahman, A.M. Language Prediction of Twitch Streamers using Graph Convolutional Network. In Proceedings of the 2025 IEEE 6th International Conference on Image Processing, Applications and Systems (IPAS), Lyon, France, 9–11 January 2025; pp. 1–6.
22. Wang, C.; Xu, H. Pruning-Optimized Bayesian Neural Networks for Image Classification. IEEE Access 2025, 10, 142–149.
23. Martinez-Ríos, A.E.; Bustamante-Bello, R.; Navarro-Tuch, S.; Perez-Meana, H. Applications of the Generalized Morse Wavelets: A Review. IEEE Access 2023, 11, 667–688.
24. Guo, T.; Zhang, T.; Lim, E.; Lopez-Benitez, M.; Ma, F.; Yu, L. A Review of Wavelet Analysis and Its Applications: Challenges and Opportunities. IEEE Access 2022, 10, 58869–58903.
25. Aymen, F.; Hussein, W. Application of spatial and Wavelet transforms for improved Deep Fake Detection. In Proceedings of the 2024 5th International Conference on Artificial Intelligence, Robotics and Control (AIRC), Cairo, Egypt, 22–24 April 2024; pp. 13–17.
26. Abdallatif, H.M.; Rgibi, E.A.; Alarbish, K.A.; Abdaljlil, A.S.; Oshah, A.A. Adaptive Wavelet Techniques for Pattern Analysis. In Proceedings of the 2024 IEEE 4th International Maghreb Meeting of the Conference on Sciences and Techniques of Automatic Control and Computer Engineering (MI-STA), Tripoli, Libya, 19–21 May 2024; pp. 459–464.
27. Imai, Y.; Komatsu, M.; Matsumoto, H. Two-scale Sequence Generation Method Using Machine Learning for Discrete Wavelet Transform. In Proceedings of the 2024 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Kaohsiung, Taiwan, 10–13 December 2024; pp. 1–5.
28. Siam, A.A.; Alazab, M.; Awajan, A.; Faruqui, N. A Comprehensive Review of AI’s Current Impact and Future Prospects in Cybersecurity. IEEE Access 2025, 13, 14029–14050.
29. Zheng, S.; Wang, Y. Multi-network Ensembling for GAN Training and Adversarial Attacks. In Proceedings of the 2024 IEEE 26th International Workshop on Multimedia Signal Processing (MMSP), West Lafayette, IN, USA, 2–4 October 2024; pp. 1–6.
30. Zhang, S.; Song, Y.; Wang, S. FA-GAN: Defense Against Adversarial Attacks in Automatic Modulation Recognition. In Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; pp. 1–5.
31. Zhao, D.; Guo, G.; Lu, X.; Song, C. Privacy-Preserving Detection and Defense of Adversarial Examples. In Proceedings of the 2025 28th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Compiegne, France, 5–7 May 2025; pp. 2121–2126.
32. Zhang, S.; Lin, Y.; Yu, J.; Zhang, J.; Xuan, Q.; Xu, D.; Wang, J.; Wang, M. HFAD: Homomorphic Filtering Adversarial Defense Against Adversarial Attacks in Automatic Modulation Classification. IEEE Trans. Cogn. Commun. Netw. 2024, 10, 880–892.
33. Meng, Y.; Qi, P.; Zheng, S.; Cai, Z.; Zhou, X.; Jiang, T. Adversarial Attack and Reliable Defense Based on Frequency Domain Feature Enhancement for Automatic Modulation Classification. IEEE Trans. Inf. Forensics Secur. 2025, 20, 3731–3744.
34. Zhang, Z.; Ma, L.; Liu, M.; Chen, Y.; Zhao, N. Adversarial Transfer Attack Against and Adaptive Defense for Intelligent Modulation Recognition. IEEE Trans. Cogn. Commun. Netw. 2025, 1.
35. Benedict, G.A. Improved File Security System Using Multiple Image Steganography. In Proceedings of the 2019 International Conference on Data Science and Communication (IconDSC), Bangalore, India, 1–2 March 2019; pp. 1–5.
36. Tan, M.; Le, Q.V. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114.
37. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
39. Xu, W.; Evans, D.; Qi, Y. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. arXiv 2017, arXiv:1704.01155.
40. Guo, C.; Rana, M.; Cissé, M.; Van Der Maaten, L. Countering Adversarial Images Using Input Transformations. arXiv 2017, arXiv:1711.00117.
41. Cohen, J.; Rosenfeld, E.; Kolter, J.Z. Certified Adversarial Robustness via Randomized Smoothing. In Proceedings of the International Conference on Machine Learning (ICML), Long Beach, CA, USA, 9–15 June 2019; pp. 1310–1320.
42. Ferrari, C.; Becattini, F.; Galteri, L.; Del Bimbo, A. (Compress and Restore) N: A Robust Defense Against Adversarial Attacks on Image Classification. ACM Trans. Multimed. Comput. Commun. Appl. 2023, 19, 1–16.

Figure 1. Overview of the proposed multi-stage steganographic framework, including datasets, transforms, architectures, attack/defense scenarios, and the encoder/decoder phases.

Figure 2. Accuracy and loss curves for baseline models (CNN, BNN, GCN) on clean FashionMNIST and MNIST datasets. Two-per-row panels are kept where clarity is sufficient; previously blurry panels are shown one-per-row for readability.

Figure 3. Experimental workflow for evaluating model robustness. An input image is optionally attacked and defended before classification (FGSM: Fast Gradient Sign Method; RNI: Random Noise Injection; IRPDAP: Inject Random Patterns to Disrupt the Adversarial Perturbation; CIIBI: Convert Input Images to Binary Images). Accuracy under clean/attack/defense is used to compute $\Delta R$ and $G_{def}$.

Figure 4. Comparative accuracy of all evaluated architectures across MNIST and FashionMNIST datasets under clean, adversarial attack, and defense conditions. Solid bars represent MNIST results, while hatched bars represent FashionMNIST results.

Table 1. Baseline accuracy of backbone architectures on clean datasets.

Model Architecture	FashionMNIST Accuracy (%)	MNIST Accuracy (%)
BNN	88.68	98.10
GCN	80.43	91.59
EfficientNet	89.85	98.92
GoogLeNet	89.36	98.91
ResNet	89.14	98.61

Table 2. Accuracy (%) of CNN variants in the Base Encoder stage.

Model	MNIST Clean	MNIST Attack	MNIST Defense	FashionMNIST Clean	FashionMNIST Attack	FashionMNIST Defense
GoogLeNet	98.7	68.4	94.5	92.1	62.9	88.3
EfficientNet	99.2	71.5	95.4	93.6	65.7	89.7
ResNet	99.0	74.2	96.1	93.1	67.8	91.0

Table 3. Accuracy (%) of CNN variants in the Dense Encoder stage.

Dataset & Methods	EfficientNet	ResNet	GoogLeNet
FashionMNIST
No Attack (Baseline)	95.0	96.5	94.7
Adversarial Attack	15.5	12.2	11.8
Defense (IRPDAP)	61.0	58.0	55.0
Defense (CIIBI)	48.0	50.0	46.0
MNIST
No Attack (Baseline)	99.2	99.5	99.0
Adversarial Attack	40.0	34.0	33.5
Defense (IRPDAP)	65.0	60.0	58.0
Defense (CIIBI)	75.0	78.0	72.0

Table 4. Image-quality metrics (mean ± std) of the BNN backbone at the Encoder stage under five attack types. We report MSE (lower is better), PSNR (dB; higher is better), and SSIM (higher is better). Tensor vs. 2D Haar shows the effect of wavelet preprocessing.

Method	FGSM	Random	Gaussian	Hybrid	SteganoGAN Att.
Tensor (no wavelet)
MSE	$0.0060 \pm 0.0005$	$0.0054 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0087 \pm 0.0019$	$0.0087 \pm 0.0019$
PSNR	$22.26 \pm 0.34$	$22.66 \pm 0.34$	$22.66 \pm 0.33$	$20.72 \pm 1.02$	$20.72 \pm 1.02$
SSIM	$0.684 \pm 0.124$	$0.697 \pm 0.122$	$0.697 \pm 0.122$	$0.705 \pm 0.107$	$0.705 \pm 0.107$
2D Haar Wavelet
MSE	$0.0059 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0087 \pm 0.0019$	$0.0087 \pm 0.0019$
PSNR	$22.29 \pm 0.32$	$22.66 \pm 0.33$	$22.66 \pm 0.34$	$20.72 \pm 1.02$	$20.72 \pm 1.02$
SSIM	$0.685 \pm 0.123$	$0.697 \pm 0.122$	$0.697 \pm 0.122$	$0.705 \pm 0.107$	$0.705 \pm 0.107$

Table 5. Image-quality metrics (mean ± std) of the BNN backbone at Residual Encoder and Dense Encoder stages under five attack types (Tensor vs. 2D Haar).

Stage	Method	FGSM	Random	Gaussian	Hybrid	SteganoGAN Att.
Residual Enc.	MSE (Tensor)	$0.0055 \pm 0.0002$	$0.0054 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0087 \pm 0.0019$	$0.0087 \pm 0.0019$
	PSNR (Tensor)	$22.59 \pm 0.17$	$22.66 \pm 0.34$	$22.67 \pm 0.34$	$20.72 \pm 1.02$	$20.72 \pm 1.02$
	SSIM (Tensor)	$0.688 \pm 0.122$	$0.697 \pm 0.122$	$0.697 \pm 0.122$	$0.705 \pm 0.107$	$0.705 \pm 0.107$
	MSE (Haar)	$0.0055 \pm 0.0002$	$0.0054 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0087 \pm 0.0019$	$0.0087 \pm 0.0019$
	PSNR (Haar)	$22.57 \pm 0.15$	$22.66 \pm 0.33$	$22.66 \pm 0.34$	$20.72 \pm 1.02$	$20.72 \pm 1.02$
	SSIM (Haar)	$0.688 \pm 0.123$	$0.697 \pm 0.122$	$0.697 \pm 0.122$	$0.705 \pm 0.107$	$0.705 \pm 0.107$
Dense Enc.	MSE (Tensor)	$0.0060 \pm 0.0005$	$0.0054 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0087 \pm 0.0019$	$0.0087 \pm 0.0019$
	PSNR (Tensor)	$22.26 \pm 0.34$	$22.67 \pm 0.33$	$22.67 \pm 0.34$	$20.72 \pm 1.02$	$20.72 \pm 1.02$
	SSIM (Tensor)	$0.684 \pm 0.124$	$0.697 \pm 0.122$	$0.697 \pm 0.122$	$0.705 \pm 0.107$	$0.705 \pm 0.107$
	MSE (Haar)	$0.0059 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0054 \pm 0.0004$	$0.0087 \pm 0.0019$	$0.0087 \pm 0.0019$
	PSNR (Haar)	$22.29 \pm 0.32$	$22.66 \pm 0.34$	$22.66 \pm 0.34$	$20.72 \pm 1.02$	$20.72 \pm 1.02$
	SSIM (Haar)	$0.685 \pm 0.123$	$0.697 \pm 0.122$	$0.697 \pm 0.122$	$0.705 \pm 0.107$	$0.705 \pm 0.107$

Table 6. Detailed summary of attack and defense performance results (%).

Methods	CNN (E/G/R-Net)	BNN	GCN
Clean–MNIST	99% ↑	98%	92% ↓
Clean–FashionMNIST	90% ↑	88%	80% ↓
FGSM–MNIST	10–18% ↓	98% ↑	∼0% ↓
FGSM–FashionMNIST	2–4% ↓	88% ↑	∼0% ↓
RNI–MNIST	96–99% ↑	99% ↑	35–40% ↓
RNI–FashionMNIST	65–86%	91% ↑	20–30% ↓
KTKOTMA–MNIST	91–97% ↑	98% ↑	∼0% ↓
KTKOTMA–FashionMNIST	15–85%	88% ↑	∼0% ↓
Defense–IRPDAP (MNIST)	45–51%	99% ↑	1–2% ↓
Defense–IRPDAP (F-MNIST)	18–23% ↓	50% ↑	2–6% ↓
Defense–CIIBI (MNIST)	64–74% ↑	52–57%	28–74%
Defense–CIIBI (F-MNIST)	28–32%	23% ↓	14–28% ↓
Defense–Combined (MNIST)	64–68% ↑	52–57%	28–74%
Defense–Combined (F-MNIST)	23–32% ↓	23–32% ↓	60% ↑

Note: ↑ indicates improvement relative to the Tensor baseline, while ↓ indicates degradation.

Table 7. Summary of attack and defense performance (%) for CNN, BNN, and GCN models on both datasets.

Dataset	Method	CNN	BNN	GCN
FashionMNIST	No Attack (Baseline)	92.0	90.5	93.5
	Adversarial Attack	12.0	18.5	15.0
	Defense (IRPDAP)	56.0	60.0	54.0
	Defense (CIIBI)	45.0	47.0	44.0
MNIST	No Attack (Baseline)	99.0	98.2	99.3
	Adversarial Attack	30.0	25.0	35.0
	Defense (IRPDAP)	62.0	65.0	67.0
	Defense (CIIBI)	70.0	74.0	72.0

Table 8. Summary of comparative performance across models and stages on MNIST and FashionMNIST datasets. $\Delta R$ denotes the average accuracy degradation rate ($Acc_{clean} - Acc_{adv}$) across attacks.

Model/Stage	Dataset	PSNR (dB)	SSIM	MSE	$\Delta R$ (%)	Runtime (s/img)
CNN (ResNet)	MNIST	22.66	0.697	0.0054	5.9	0.07
BNN (Dense + WT)	MNIST	22.67	0.697	0.0054	5.4	0.12
GCN	MNIST	22.66	0.697	0.0054	8.0	0.08
CNN (ResNet)	F-MNIST	22.66	0.697	0.0054	6.8	0.07
BNN (Dense + WT)	F-MNIST	22.67	0.697	0.0054	5.9	0.12
GCN	F-MNIST	22.66	0.697	0.0054	9.0	0.08
SteganoGAN (baseline)	MNIST/F-MNIST	21.9	0.680	0.0062	—	0.10

Note: SteganoGAN serves as a visual-quality baseline; accuracy and $\Delta R$ are not applicable.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
