Double-Flow-Based Steganography Without Embedding for Image-to-Image Hiding

Dong, Yunyun; Wang, Zhen; Song, Bingbing; Zhou, Wei

doi:10.3390/electronics14214270


¹ School of Information Science and Engineering, Yunnan University, Kunming 650000, China

² School of Software and AI, Yunnan University, Kunming 650000, China

³ Engineering Research Center of Cyberspace, Yunnan University, Kunming 650000, China

⁴ Yunnan-Malaya Institute (School of Engineering), Yunnan University, Kunming 650000, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(21), 4270; https://doi.org/10.3390/electronics14214270

Submission received: 25 September 2025 / Revised: 23 October 2025 / Accepted: 29 October 2025 / Published: 30 October 2025

(This article belongs to the Special Issue AI and Cybersecurity: Emerging Trends and Key Challenges)


Abstract

As an emerging concept, steganography without embedding (SWE) hides a secret message without directly embedding it into a cover. Thus, SWE has the unique advantage of being immune to typical steganalysis methods and can better protect the secret message from being exposed. However, existing SWE methods are generally criticized for their poor payload capacity and the low fidelity of recovered secret messages. In this paper, we propose a novel steganography-without-embedding technique, named DF-SWE, which addresses the aforementioned drawbacks and produces diverse and natural stego images. Specifically, DF-SWE employs a reversible circulation of double flow to build a reversible bijective transformation between the secret image and the generated stego image. Hence, it provides a way to directly generate stego images from secret images without a cover image. Moreover, by leveraging this invertibility, DF-SWE can recover a secret image from a generated stego image in a nearly lossless manner, which increases the fidelity of extracted secret images. To the best of our knowledge, DF-SWE is the first SWE method that can hide multiple images in one image of the same size, significantly enhancing the payload capacity. According to the experimental results, the payload capacity of DF-SWE reaches 24–72 BPP, which is 8000 to 16,000 times higher than that of its competitors, while producing diverse images to minimize the exposure risk. Importantly, DF-SWE can be applied to the steganography of secret images in various domains without requiring training data from the corresponding domains. This domain-agnostic property suggests that DF-SWE can (1) be applied to hiding private data and (2) be deployed in resource-limited systems.

Keywords: image steganography; steganography without embedding; encryption; flow-based model; security

1. Introduction

Deep image steganography aims to conceal secret messages in cover images imperceptibly. The secret messages can be recovered only by the informed receiver and remain invisible to others, which secures their transmission without being noticed [1,2]. Consequently, image steganography has been applied in various domains, such as information security [3], data communication [1], and copyright protection [4].

In the image steganography task, the primary requirements are capacity, extraction error, and security. Embedded steganography (ES) generally selects an existing image as a cover and then embeds secret information into the cover image with a slight modification. However, traditional ES methods [4,5] have limited payload capacity. To further increase payload capacity, deep learning-based ES methods have recently been proposed to achieve both acceptable imperceptibility and a small extraction error of the secret message [6]. However, since all these ES methods need to modify the cover image, the modified cover image always contains a subtle pseudo-shadow of the secret message, especially under a high hiding payload. This creates a risk that steganalysis tools applied to the modified cover image will expose the secret message.

Instead of directly embedding the secret message into a cover image, steganography without embedding (SWE) is an emerging concept that hides a secret message without a cover image, which eliminates the modification traces observed in ES methods. Thus, SWE has the unique advantage of reducing the risk of secret message breaches from typical steganalysis [7]. Although current SWE approaches have achieved remarkable results, serious drawbacks remain. There are two types of SWE techniques. (1) Mapping-based methods transform the secret message into a sequence of image hashes selected from an existing image set [8,9]. These mapping-based methods require the construction of fixed image mapping rules, which do not accommodate the dynamic growth of images. (2) Alternatively, generating-based methods synthesize images by passing the secret message into a deep generator network, e.g., a generative adversarial network (GAN) [10,11]. However, due to the instability of the generative network and the irreversibility of the generative process, a critical weakness is that the payload capacity is extremely limited, especially for hiding large secret images. As shown in Table 1, the maximum hiding capacity of the existing works without embedding is 4, and they can hide only bits. To achieve image-to-image steganography without embedding, the hiding capacity must be at least 24 BPP, and multi-image hiding requires an even higher capacity. Moreover, it is difficult to minimize the message extraction error while keeping the visual quality of the generated stego images [11].

In this paper, we propose a novel DF-SWE approach to tackle the above issues of current SWE methods. Unlike conventional GANs that rely on implicit and non-invertible mappings between latent and data spaces, our framework leverages OpenAI’s Glow [16], a flow-based generative model that explicitly constructs an invertible transformation between the input space and the latent space. Specifically, Glow introduces invertible 1 × 1 convolutions and affine coupling layers, ensuring that every transformation within the model is mathematically bijective and that the Jacobian determinant of each layer is tractable (i.e., can be computed exactly and efficiently). This property allows Glow to directly evaluate the exact log-likelihood of data samples, unlike GANs that rely on adversarial training without a likelihood objective. In steganographic tasks, flow-based models are advantageous because they offer stable and reversible mappings, while traditional GANs often suffer from training instability and non-invertible transformations. The bijective structure of Glow enables accurate forward and inverse processes, allowing information embedding and recovery with high fidelity, something difficult to achieve with the implicit, one-directional mappings of GANs. Our approach significantly enhances the payload capacity and can hide large images without cover images. To the best of our knowledge, DF-SWE can hide multiple secret images at one time, which greatly extends the capability of SWE-based methods. In addition, DF-SWE reduces the extraction error, which is attributed to the reversibility of the hiding and restoring processes by the invertible bijective mapping. Furthermore, DF-SWE guarantees the quality of the generated stego images to enhance the imperceptibility of the secret images.

In sum, our novel DF-SWE method achieves state-of-the-art steganographic performance in the payload capacity, extraction error, and stealthiness of hiding large images. Intriguingly, DF-SWE shows a capability of domain generalization, which makes it applicable to privacy-critical, resource-limited scenarios.

The detailed contributions are as follows:

High payload capacity: DF-SWE works towards image-to-image generative steganography. Its payload capacity reaches 24–72 BPP, which is 8000 to 16,000 times higher than that of existing SWE methods. Moreover, DF-SWE is the first method to hide multiple secret images without embedding.
Low extraction error: We propose the reversible circulation of double flow to build a reversible bijective transformation between secret images and generated stego images. Because the reversible circulation of double flow is an invertible process, we can recover a secret image from a stego image in a nearly lossless manner.
Enhanced stealthiness: According to the experimental results, DF-SWE shows better hiding performance than prior steganography works, providing diverse and realistic images that minimize the exposure risk. DF-SWE also achieves better security performance against steganalysis detection.
Domain generalization: Our experiments show that, once trained, DF-SWE can be applied in the steganography of secret images from different domains without further model training or fine-tuning. This property makes DF-SWE the first domain-agnostic steganography method that can be applied to unseen private data and executed on resource-limited systems.

This paper is organized as follows. Section 2 introduces the related work. Section 3 briefly describes the Glow model used as the backbone network. Section 4 elaborates the proposed DF-SWE method. Section 5 presents and discusses the experimental results. Section 6 provides a discussion and outlines future work.

2. Related Work

Most existing steganographic approaches are embedded steganography (ES), which embeds the secret information imperceptibly into a cover image by slightly modifying its content. However, the modification traces of embedded steganography cause some distortion in the stego image, especially when embedding color image data that usually contains thousands of bits, which makes the stego images easy to detect by steganalysis. Steganography without embedding, which does not need to modify a cover image, was proposed to improve security.

2.1. Embedded Steganography

Traditional ES methods: Least Significant Bit (LSB) steganography [17] modifies only the last few bits of each pixel, so it does not cause visible changes in the pixel values of the picture. LSB also has many variations [18,19]. For example, an information hiding technique [20] utilizes the least significant bits (LSBs) of each pixel of a grayscale image and adopts XOR features of the host image pixels. Additionally, HUGO [21] was proposed, whose main design principle is to minimize a properly defined distortion through an efficient coding algorithm. Steganographic algorithms exist not only in the spatial domain but also in the frequency domain, such as J-UNIWARD [22], UED [23], I-UED [24], and UERD [25].

Deep learning-based ES methods: Baluja [6] proposed an autoencoder architecture that places a full-size image in another image of the same size. After this, Wu et al. [26] proposed an encoder–decoder architecture, where the cover image and the secret image were concatenated using Separable Convolution (SCR) with a residual block. Additionally, Zhang et al. [27] incorporated adversarial examples into steganography. Replacing the encoder–decoder architecture, CycleGAN-based methods [28,29] were proposed for image steganography. Furthermore, Zhang et al. [30] proposed ISGAN, which improved the invisibility by hiding the secret image only in the Y channel of the cover image. Wang et al. [5] designed a multi-level feature fusion procedure based on GAN to capture texture information and semantic features. Recently, invertible networks were proposed for image hiding. Owing to the reversible nature of invertible networks, HiNet [31] significantly improves the restored quality of the secret image. Based on this, DeepMIH [32] was proposed to hide multiple images and achieved excellent performance compared with other ES methods.

2.2. Steganography Without Embedding (SWE)

Mapping-based SWE methods: In 2016, a bag-of-words (BOW) model was proposed to construct the mapping relationship between the dictionary and the words [8]. Furthermore, Zheng et al. [9] proposed robust image hashing, which calculated the scale-invariant feature transform (SIFT) points in 9 sub-images. Cao et al. [33] divided the pixel values from 0 to 255 into 16 intervals, and built a mapping relationship with the bit string of length 4. After this, Qiu et al. [34] first hashed the local binary pattern (LBP) features of the cover image and the secret image, and then the hashes were matched to create the hidden image. Additionally, a CIS algorithm based on DenseNet feature mapping was proposed [35], which introduced deep learning to extract high-dimensional CNN features mapped into hash sequences. Based on GAN, a Star Generative Adversarial Network (StarGAN) was proposed to construct a high-quality stego image with the mapping relationship [36].

Generating-based SWE methods: Stego-ACGAN was proposed to generate new meaningful normal images for hiding and extracting information [10]. In 2018, Hu et al. [12] mapped secret information into noise vectors and used DCGAN to generate a stego image. After this, Zhu et al. [37] proposed a coverless image steganography method based on the orthogonal generative adversarial network, adding constraints to the objective function to make the model training more stable. To improve the steganography capacity and image quality, a GAN-based steganography method without embedding that incorporates adversarial training techniques was proposed [38]. Then, the attention-GAN model was proposed for steganography without embedding [11]. Additionally, Liu et al. [14] proposed IDEAS based on GAN, which disentangled an image into two representations for structure and texture and utilized the structure representation to improve secret message extraction. Different from GAN-based approaches, Generative Steganographic Flow (GSF) [39] built a reversible bijective mapping between the input secret data and the generated stego images and treated the stego image generation and secret data recovery processes as an invertible transformation. After this, Zhou et al. [15] proposed a secret-to-image reversible transformation (S2IRT), where a large number of elements of the given secret message were arranged into the corresponding positions to construct a high-dimensional vector, which was then mapped to a generated image. However, the aforementioned methods are all limited to single-image steganography, whereas our DF-SWE is capable of multi-image steganography.

2.3. Comparison with DF-SWE

Unlike these SWE methods, we propose DF-SWE, which can hide multiple secret images into one image with the same size, bringing higher hiding capacity without losing the naturalness of the stego images. Meanwhile, we build a reversible bijective transformation between the secret images and the generated stego images, reducing the extraction error of the secret images.

3. Backbone Network

In this paper, we propose a double-flow-based model to build a reversible bijective transformation between secret images and a generated stego image. Our flow-based backbone network relies on Glow [16]. The flow-based model is commonly used in image generation tasks by learning a bijective mapping between the latent space of simple distributions and the image space with complex distributions.

In flow-based generative models, the generative process is defined as follows:

z \sim p_{θ}(z),

(1)

x = g_{θ}(z),

(2)

where $z$ is the latent variable and $p_{θ}(z)$ is usually a multivariate Gaussian distribution $N(z; 0, I)$. The function $g_{θ}(\cdot)$ is invertible, such that for a given datapoint $x$, latent-variable inference is performed by $z = f_{θ}(x) = g_{θ}^{-1}(x)$. For brevity, we omit the subscript $θ$ from $f_{θ}$ and $g_{θ}$. The function $f$ is composed of a sequence of transformations, $f = f_{1} \circ f_{2} \circ \dots \circ f_{K}$, such that the relationship between $x$ and $z$ can be written as follows:

x \overset{f_{1}}{⟷} h_{1} \overset{f_{2}}{⟷} h_{2} \dots \overset{f_{K}}{⟷} z,

(3)

where $f_{i}$ is a reversible transformation function and $h_{i}$ is the output of $f_{i}$.

Under the change of variables of Equation (2), the log-density of the model for a given datapoint can be written as

\log p_{θ}(x) = \log p_{θ}(z) + \log \left| \det (dz / dx) \right| .

(4)
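As an illustration of Equations (1)–(4), the following is a minimal NumPy sketch of a toy flow built from affine coupling layers. It is not the Glow implementation: the layer parameterisation (`Ws`, `Wt`), the feature flip between layers, and all function names are assumptions made for this example. It shows that the forward pass, the exact inverse, and the log-determinant term of Equation (4) are all cheap to compute.

```python
import numpy as np

def coupling_forward(x, p):
    """Affine coupling: keep the first half, scale-and-shift the second half."""
    d = x.shape[-1] // 2
    xa, xb = x[..., :d], x[..., d:]
    log_s = np.tanh(xa @ p["Ws"])   # log-scale, bounded for numerical stability
    t = xa @ p["Wt"]                # shift
    hb = xb * np.exp(log_s) + t
    # The Jacobian is triangular, so log|det| is the sum of the log-scales.
    return np.concatenate([xa, hb], axis=-1), log_s.sum(axis=-1)

def coupling_inverse(h, p):
    """Exact inverse of coupling_forward."""
    d = h.shape[-1] // 2
    ha, hb = h[..., :d], h[..., d:]
    log_s = np.tanh(ha @ p["Ws"])
    t = ha @ p["Wt"]
    xb = (hb - t) * np.exp(-log_s)
    return np.concatenate([ha, xb], axis=-1)

def flow_forward(x, layers):
    """z = f(x) for a batch x of shape (N, D); also returns log|det(dz/dx)|."""
    h, logdet = x, np.zeros(x.shape[0])
    for p in layers:
        h, ld = coupling_forward(h, p)
        h = h[..., ::-1]            # reverse feature order (|det| = 1)
        logdet = logdet + ld
    return h, logdet

def flow_inverse(z, layers):
    """x = g(z) = f^{-1}(z): undo the layers in reverse order."""
    h = z
    for p in reversed(layers):
        h = coupling_inverse(h[..., ::-1], p)
    return h

def log_prob(x, layers):
    """Equation (4) with a standard Gaussian prior on z."""
    z, logdet = flow_forward(x, layers)
    log_pz = -0.5 * np.sum(z ** 2, axis=-1) - 0.5 * z.shape[-1] * np.log(2.0 * np.pi)
    return log_pz + logdet

def make_layers(dim, num_layers, seed=0):
    """Random (untrained) parameters, for illustration only."""
    rng = np.random.default_rng(seed)
    half = dim // 2
    return [{"Ws": 0.5 * rng.normal(size=(half, half)),
             "Wt": 0.5 * rng.normal(size=(half, half))} for _ in range(num_layers)]
```

With `layers = make_layers(4, 3)` and a batch `x` of shape `(N, 4)`, `flow_inverse(flow_forward(x, layers)[0], layers)` returns `x` up to floating-point error, which is the property the rest of the method relies on.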

The network architecture of Glow comprises three modules, namely, the squeeze module, the flow module, and the split module. The squeeze module downsamples the feature maps, and the flow module performs feature processing. The split module divides the image features into halves along the channel dimension, and one half is output as the latent tensor.
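The squeeze and split modules can be sketched in a few lines of NumPy. The sketch below is an assumption-laden illustration rather than Glow's code: the flow steps are omitted (treated as the identity), the choice of which half is emitted as the latent is arbitrary, and the last level is assumed not to split. The function names are hypothetical.

```python
import numpy as np

def squeeze(x):
    """(C, H, W) -> (4C, H/2, W/2): move each 2x2 spatial block into channels."""
    c, h, w = x.shape
    x = x.reshape(c, h // 2, 2, w // 2, 2)
    x = x.transpose(0, 2, 4, 1, 3)
    return x.reshape(c * 4, h // 2, w // 2)

def unsqueeze(x):
    """Exact inverse of squeeze."""
    c4, h2, w2 = x.shape
    c = c4 // 4
    x = x.reshape(c, 2, 2, h2, w2)
    x = x.transpose(0, 3, 1, 4, 2)
    return x.reshape(c, h2 * 2, w2 * 2)

def split(x):
    """Halve along channels: (half kept for the next level, half emitted as latent)."""
    c = x.shape[0] // 2
    return x[:c], x[c:]

def multiscale_latents(x, num_levels):
    """Return [z_1, ..., z_L] for an image x of shape (C, H, W)."""
    latents, h = [], x
    for level in range(num_levels):
        h = squeeze(h)
        # The flow steps would transform h here; they are omitted in this sketch.
        if level < num_levels - 1:
            h, z = split(h)
            latents.append(z)
    latents.append(h)               # the last level emits everything that remains
    return latents

def multiscale_reconstruct(latents):
    """Invert multiscale_latents by concatenating and unsqueezing level by level."""
    h = latents[-1]
    for level in range(len(latents) - 1, -1, -1):
        if level < len(latents) - 1:
            h = np.concatenate([h, latents[level]], axis=0)
        h = unsqueeze(h)
    return h
```

Under these assumptions, a 3 × 64 × 64 input with four levels yields latents of shapes (6, 32, 32), (12, 16, 16), (24, 8, 8), and (96, 4, 4), whose element counts sum to that of the input, so no information is discarded.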

4. Methodology

DF-SWE builds a reversible circulation of double flow to generate stego images and hide secret images. The reversible circulation of double flow uses three strategies, i.e., prior knowledge sampling, high-dimensional space replacement, and distribution consistency transformation. In the following subsections, we first present the problem definition and the threat model, and then explain the reversible circulation of double flow and the hiding and restoring processes of DF-SWE in detail.

4.1. Problem Definition and Threat Model

Given a set $I_{se} := \{I_{se}\}^{k}$ of $k$ secret images, an SWE encoder $f_{se}(\cdot) : I_{se} \to z_{se}$ transforms the secret images into random noise $z_{se}$, and $z_{se} \overset{t}{⟷} z_{st}^{'}$ is a transformation from $z_{se}$ to $z_{st}^{'}$ for secret image hiding. Finally, a generator $f_{st}(\cdot) : z_{st}^{'} \to I_{st}$ produces a stego image $I_{st}$ from the noise $z_{st}^{'}$. To maximize the reconstruction performance of the secret images, we propose using an invertible function for both $f_{se}(\cdot)$ and $f_{st}(\cdot)$. That is, after taking the inverse $f_{st}^{-1}(\cdot) : I_{st} \to z_{st}^{'}$ and applying the transformation $z_{st}^{'} \overset{t^{-1}}{⟷} z_{se}$ from $z_{st}^{'}$ to $z_{se}$, the secret images can be revealed through $f_{se}^{-1}(\cdot) : z_{se} \to I_{se}$.

In our threat model, the attacker has access to a public training dataset for training the steganography model. During the attacking phase, an attacker gathers the secret images and generates the stego image by a composition $f(\cdot) := f_{se} \circ f_{st}(\cdot)$ of $f_{se}$ and $f_{st}$. Once the stego image is delivered to the recipient, the recipient recovers the secret images by the inverse of the same stego model, $f^{-1}(\cdot)$. Moreover, the trained $f(\cdot)$ can be reused for various secret images, even those coming from different domains.

Inspired by Glow [16], DF-SWE uses the double-flow-based model to build a reversible bijective transformation between secret images and generated stego images. The DF-SWE network takes a secret image as its input to generate a realistic stego image. Later on, it can directly recover the hidden secret image from the stego image via the reversible transformation.

As illustrated in Figure 1, the key components of our DF-SWE network are the double-flow-based models and the reversible circulation of double flow. One flow-based model ($Model_{se}$) can be regarded as an encoder that encodes secret images $I_{se}$ into multivariate Gaussian distributions, while the other ($Model_{st}$) can be seen as a generator that generates stego images $I_{st}$ from multivariate Gaussian distributions. Owing to the invertibility of the flow model, the two flow models can also serve as decoders to extract secret images. If we construct a reversible circulation of double flow, $I_{st}$ can be generated from $I_{se}$, and $I_{se}$ can be extracted from $I_{st}$ through our strategy. More specifically, the latent tensors are $z_{se} = \{z_{1}, z_{2}, \dots, z_{L}\}$ and $z_{st}^{'} = \{z_{1}^{'}, z_{2}^{'}, \dots, z_{L}^{'}\}$, where $L$ is the depth of the architecture.

We use two Glow models to learn multivariate Gaussian distributions of the secret images $I_{se}$ and the stego image $I_{st}$ separately. Given functions $f_{se} := f_{1} \circ f_{2} \circ \dots \circ f_{K}$ and $f_{st} := f_{1}^{'} \circ f_{2}^{'} \circ \dots \circ f_{n}^{'}$, we have $I_{se} \overset{f_{1}}{⟷} h_{1} \dots h_{K-1} \overset{f_{K}}{⟷} z_{se}$ and $I_{st} \overset{f_{1}^{'}}{⟷} h_{1}^{'} \dots h_{n-1}^{'} \overset{f_{n}^{'}}{⟷} z_{st}^{'}$.

The existing flow model (Glow) implements a mapping between the distribution of $z$ and that of the generated image. In contrast, large-image steganography without embedding is a generative task from one image to another. Hence, the core task of image-to-image steganography without embedding is to construct a mapping between the secret image $I_{se}$ and the stego image $I_{st}$ while ensuring that the mapping is reversible, which enhances the extraction quality of $I_{se}$. This task can be formulated as follows:

I_{se} \overset{f_{1}}{⟷} h_{1} \dots \overset{f_{K}}{⟷} z_{se} \overset{t}{⟷} z_{st}^{'} \overset{f_{n}^{'}}{⟷} h_{n-1}^{'} \dots \overset{f_{1}^{'}}{⟷} I_{st} .

(5)

Here, $z_{se} \overset{t}{⟷} z_{st}^{'}$ is a transformation from one multivariate Gaussian distribution $z_{se}$ to another multivariate Gaussian distribution $z_{st}^{'}$. Consequently, the core task is for the transformation $t$ to construct a reversible circulation in the double flow model, which hides the secret image in the generated stego image while keeping the process reversible. It should be noted that since $t$ is an invertible transformation, the information can be regarded as lossless under ideal conditions. Moreover, before applying the reversible transformation, one can incorporate a decoupled encryption scheme [40] to first encrypt $z_{se}$ and then convert it into $z_{st}^{'}$. In this way, even if an attacker has access to the public training dataset and attempts to invert $I_{st}^{'}$ back to $z_{st}^{'}$, the cryptographic protection ensures that no valid information can be obtained.

4.2. Reversible Circulation of Double Flow

To transmit $z_{se}$ to $z_{st}^{'}$ while keeping the process reversible, we divide the task $z_{se} \overset{t}{⟷} z_{st}^{'}$ into three questions that need to be answered:

How to initialize $z_{s t}^{'}$ ?
How to transmit $z_{s e}$ to $z_{s t}^{'}$ ?
How to reduce the distortions on generated stego images?

To address the above issues, we propose three techniques named prior knowledge sampling, high-dimensional space replacement, and distribution consistency transformation. We use the latent variable $z$ and its variants (e.g., $z_{st}^{'}$ and $\hat{z}^{st}$) to describe the circulation of the two flows at different stages after different operations.

4.2.1. Prior Knowledge Sampling (PKS)

To initialize $z_{st}^{'}$, we utilize the prior knowledge of the Glow generator. First, $z$ is sampled from $N(0, I)$, and the image $I_{ge}$ is generated by a Glow model $Glow_{g_{st}}(z)$. The process can be formulated as follows:

\begin{aligned} I_{ge} &= Glow_{g_{st}}(z), \\ z &\sim N(0, I) . \end{aligned}

(6)

During the generation of $I_{ge}$, $Glow_{g_{st}}$ utilizes the prior knowledge in the Glow parameters to generate an image, and the generation is irreversible. Next, we obtain the initialized $z_{st}^{'}$ by a sequence of invertible transformations, which can be formulated as follows:

I_{ge} \overset{f_{1}^{'}}{⟷} h_{1}^{'} \dots h_{n-1}^{'} \overset{f_{n}^{'}}{⟷} z_{st}^{'} .

(7)

4.2.2. High-Dimensional Space Replacement (HDSR)

To transmit $z_{se}$ to $z_{st}^{'}$ and reduce the distortion of the generated stego image, we propose high-dimensional space replacement.

In the backbone network (Glow), the feature maps at each of the $L$ levels of $Model_{se}$ are divided into halves along the channel dimension. One half is output as the latent tensor $\{z_{i}\}_{i=1}^{L}$, and the other half is cycled into the squeeze module. Hence, $z_{se}$ contains different levels of information about the image. As shown in Figure 1, $z_{se} = \{z_{1}, \dots, z_{L-1}, z_{L}\}$ and $z_{st}^{'} = \{z_{1}^{'}, \dots, z_{L-1}^{'}, z_{L}^{'}\}$. In particular, we find that the latent tensors from the shallow layers of $Model_{se}$ have a greater effect on the reversibility of the image. If $z_{st}^{'}$ is replaced with $z_{se}$ directly, the stego image will be distorted because of the distribution differences between $z_{st}^{'}$ and $z_{se}$.

Since different latent tensors $z_{i}$ have different effects on the reconstruction of the image, we propose high-dimensional space replacement, which replaces the high-dimensional distribution of generated images with the low-dimensional distribution of secret images. Our technique follows the principle of minimum information loss. As shown in Figure 2, $z_{L}^{'}$ is replaced with the concatenated $\{z_{1}, \dots, z_{L-1}\}$. For brevity, we describe this process as replacing $\hat{z}^{st}$ with $\hat{z}^{se}$. In this way, the $z_{se}$ of the secret image is circulated to the $z_{st}^{'}$ of the stego image, reducing the impact of the secret image on stego image generation. During the secret image extraction phase, $\hat{z}^{se}$ is replaced with $\hat{z}^{st}$.
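The replacement step can be pictured as overwriting one slot of the stego latents with a packed payload and reading it back later. The NumPy sketch below is only an illustration under stated assumptions: latents are held in Python lists, the replaced slot is the last one, and the payload is assumed to have exactly as many elements as that slot, which the text above does not spell out. All function names are hypothetical.

```python
import numpy as np

def pack(latents):
    """Flatten and concatenate a list of latent tensors; remember their shapes."""
    flat = np.concatenate([z.ravel() for z in latents])
    shapes = [z.shape for z in latents]
    return flat, shapes

def unpack(flat, shapes):
    """Inverse of pack: cut the flat vector back into tensors."""
    out, start = [], 0
    for shp in shapes:
        n = int(np.prod(shp))
        out.append(flat[start:start + n].reshape(shp))
        start += n
    return out

def replace_slot(z_st, payload, slot=-1):
    """Return a copy of the stego latents with one slot overwritten by the payload."""
    if payload.size != z_st[slot].size:
        raise ValueError("payload must have as many elements as the replaced slot")
    out = [z.copy() for z in z_st]
    out[slot] = payload.reshape(z_st[slot].shape)
    return out

def read_slot(z_st, slot=-1):
    """Extraction side: read the payload back out of the same slot."""
    return z_st[slot].ravel()
```

Because the overwrite copies values without modifying them, `unpack(read_slot(...), shapes)` returns the secret latents exactly; any loss in the full pipeline would have to come from other steps, such as numerical precision.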

4.2.3. Distribution Consistency Transformation (DCT)

High-dimensional space replacement circulates the $z_{se}$ of the secret image to the $z_{st}^{'}$ of the stego image and reduces the distortion of the generated stego image. To further improve the quality of image generation and reduce the distortion, we propose distribution consistency transformation, which decreases the distribution discrepancy between $\hat{z}^{se}$ and $\hat{z}^{st}$.

As shown in Figure 2, the distribution consistency transformation is implemented within the high-dimensional space replacement. Because flow-based generative models learn a reversible bijective transformation between images and a multivariate Gaussian, $\hat{z}^{se}$ and $\hat{z}^{st}$ obey Gaussian distributions. A Gaussian distribution is characterized by its mean and variance.

Based on this, the proposed distribution consistency transformation maintains the consistency of the mean and variance between the two distributions. It is defined as follows:

Std = \frac{\sum_{i=1}^{n} \left( \hat{z}_{i}^{st} - \frac{1}{n} \sum_{j=1}^{n} \hat{z}_{j}^{st} \right)^{2}}{\sum_{i=1}^{n} \left( \hat{z}_{i}^{se} - \frac{1}{n} \sum_{j=1}^{n} \hat{z}_{j}^{se} \right)^{2}},

(8)

Mean = \frac{\sum_{i=1}^{n} \left( Std \times \hat{z}_{i}^{st} - \hat{z}_{i}^{se} \right)}{n},

(9)

\hat{z}^{se} = \hat{z}^{se} \times Std + Mean .

(10)

Equations (8)–(10) reduce the distribution discrepancy between $\hat{z}^{se}$ and $\hat{z}^{st}$. During the secret image extraction phase, the inverse of the distribution consistency transformation is expressed as Equation (11):

\hat{z}^{st} = \frac{\hat{z}^{st} - Mean}{Std} .

(11)
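A minimal NumPy sketch of an invertible scale-and-shift of this kind follows. It is not a transcription of Equations (8) and (9): as an assumption, it uses the common moment-matching choice, in which the scale is the ratio of standard deviations and the shift is the difference of means after scaling. The function names are hypothetical, and the sketch simply hands the scale and shift to the inverse; how the receiver obtains them is not specified in this section.

```python
import numpy as np

def moment_matching_params(z_se, z_st):
    """Scale and shift that give z_se the mean and standard deviation of z_st.

    Assumption: standard moment matching, not the exact form printed in Eqs. (8)-(9).
    """
    scale = z_st.std() / z_se.std()
    shift = z_st.mean() - scale * z_se.mean()
    return scale, shift

def apply_consistency(z_se, scale, shift):
    """Forward transform, same affine form as Equation (10)."""
    return z_se * scale + shift

def invert_consistency(z, scale, shift):
    """Inverse transform, same affine form as Equation (11)."""
    return (z - shift) / scale
```

Because the transform is affine with a non-zero scale, it is exactly invertible in real arithmetic; in floating point the round trip is accurate to rounding error.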

4.3. Hiding and Restoring Processes

In this section, we will describe the secret image hiding and restoring processes in detail. As shown in Figure 3, DF-SWE comprises two stages: a secret image hiding phase and an extracting phase.

4.3.1. Hiding Process

Figure 3a describes the hiding phase, which can hide large images without embedding. $Model_{se}$ and $Model_{st}$ are two different Glow models. First, as shown in Step 1, $Model_{st}$ randomly samples $z$ from a Gaussian distribution and generates an image $I_{ge}$ utilizing the prior knowledge of $Model_{st}$. Based on the generated image $I_{ge}$, we use the reversible operation of $Model_{st}$ to obtain an initialized distribution $z_{st}^{'}$ that can better carry the secret flow. Second, as shown in Step 2, $Model_{se}$ encodes the secret image as $z_{se}$ by its reversible operation. Notably, Steps 1 and 2 can run in parallel or in either order. Through high-dimensional space replacement and distribution consistency transformation in Step 3, $z_{se}$ is passed to $z_{st}^{'}$ to generate a stego image. Meanwhile, the hiding phase maintains reversibility so that the secret images can be extracted. Finally, $I_{st}$ is generated from $z_{st}^{'}$ by $Model_{st}$ in Step 4.

4.3.2. Restoring Process

As shown in Figure 3b, the extracting phase is the inverse process of the hiding phase. Hence, DF-SWE can extract the secret image with high quality because we construct an invertible mapping between the secret and stego images. First, the stego image is decoded as $z_{st}^{'}$ by the reversible operation of $Model_{st}$ in Step 5. Then, through the reverse operations of high-dimensional space replacement and distribution consistency transformation in Step 6, $z_{st}^{'}$ is passed to $z_{se}$ to extract the secret image. These reverse operations are described in detail in Section 4.2.2 and Section 4.2.3. Finally, $Model_{se}$ extracts the secret image $I_{se}^{'}$ with high quality in Step 7.
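The seven steps can be traced end to end with toy stand-ins. In the sketch below, each Glow model is replaced by a hypothetical `ToyFlow` (a random orthogonal matrix, which is exactly invertible), the secret latent is written into the trailing entries of the stego latent, and the scale and shift use standard moment matching. All of these are assumptions chosen to keep the example short; the vector sizes are illustrative and say nothing about the real models.

```python
import numpy as np

class ToyFlow:
    """Hypothetical stand-in for a trained Glow model: an orthogonal linear map."""
    def __init__(self, dim, seed):
        q, _ = np.linalg.qr(np.random.default_rng(seed).normal(size=(dim, dim)))
        self.q = q
        self.dim = dim

    def encode(self, image):
        """Image (flattened) -> latent."""
        return self.q @ image

    def decode(self, latent):
        """Latent -> image; exact inverse of encode because q is orthogonal."""
        return self.q.T @ latent

def hide(secret, model_se, model_st, rng):
    d = secret.size
    # Step 1: sample z, generate I_ge, and re-encode it to initialise z_st'.
    z = rng.normal(size=model_st.dim)
    i_ge = model_st.decode(z)
    z_st = model_st.encode(i_ge)
    # Step 2: encode the secret image as z_se.
    z_se = model_se.encode(secret)
    # Step 3: match z_se to the statistics of the slot it overwrites, then replace.
    slot = z_st[-d:]
    scale = slot.std() / z_se.std()
    shift = slot.mean() - scale * z_se.mean()
    z_st[-d:] = z_se * scale + shift
    # Step 4: generate the stego image from the modified latent.
    return model_st.decode(z_st), (scale, shift)

def reveal(stego, d, model_se, model_st, scale, shift):
    z_st = model_st.encode(stego)        # Step 5: stego image -> z_st'
    z_se = (z_st[-d:] - shift) / scale   # Step 6: undo the replacement and scaling
    return model_se.decode(z_se)         # Step 7: z_se -> secret image
```

In this toy setting the recovered vector equals the secret up to floating-point error, which mirrors the nearly lossless extraction the method aims for.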

5. Experimental Results

5.1. Experimental Setup

To demonstrate the superiority of DF-SWE, we compare it with six state-of-the-art SWE methods, namely DCGAN-Steg [12], SAGAN-Steg [11], SSteGAN [13], WGAN-Steg [41], CycleGAN [42], and CRoSS [43]. To verify the extraction quality of the secret images, we also compare DF-SWE with ES methods, including 4 bit-LSB, Baluja [2], Weng et al. [44], and HiDDeN [1].

DF-SWE and the baseline models are trained on Bedroom (a subset of LSUN, including 3,033,042 color images) [45], LFW [46] (including 13,234 color images), and CelebA [47] (including 202,599 color images). We train DF-SWE with the hyper-parameter $L = 4$, where $L$ is the depth of the model. The greater the depth of the model, the better the quality of the generated images, but the number of model parameters and the computational cost also increase. Therefore, $L$ can be set according to actual requirements. Additionally, with $L = 4$, the steganography process completes in less than a second on an RTX 3090 GPU, which makes the method suitable for real-time applications.

We evaluate the hiding capacity of DF-SWE in bits per pixel (BPP), $BPP = \frac{Len(secret)}{H \times W}$, which is the number of message bits hidden per pixel of the encoded image, where $H$ and $W$ are the height and width of the stego image. We also evaluate the detection error, $Pe = \frac{1}{2}(P_{FA} + P_{MD})$, where $P_{FA}$ and $P_{MD}$ represent the probabilities of false alarm and missed detection, respectively. $Pe$ ranges in $[0, 1]$, and its optimal value is 0.5. As a proxy for secrecy, we also measure the secret image extraction performance using peak signal-to-noise ratio (PSNR), root mean square error (RMSE), and structural similarity index measure (SSIM). Larger PSNR and SSIM values and a smaller RMSE value indicate higher image quality. These metrics are formulated as follows:

RMSE: Root Mean Square Error (RMSE) measures the difference between two images. Given two images X and Y with width W and height H, RMSE is formulated as follows:

$R M S E = \sqrt{M S E},$

(12)

$M S E = \frac{1}{W * H} \sum_{i = 1}^{W} \sum_{j = 1}^{H} {(X_{i, j} - Y_{i, j})}^{2},$

(13)

where $X_{i, j}$ and $Y_{i, j}$ indicate the pixels at position $(i, j)$ of images X and Y, respectively.
PSNR: Peak signal-to-noise ratio (PSNR) is a widely used metric to measure the quality of an image. PSNR is defined as follows:

$PSNR = 10 \log_{10} \frac{R^{2}}{MSE},$

(14)

where R represents the maximum possible pixel value, which is usually set to 255.
SSIM: Structural similarity index measure (SSIM) is another commonly used image quality metric based on the degradation of structural information [48]. SSIM is computed from the means $μ_{X}$ and $μ_{Y}$, the variances $σ_{X}^{2}$ and $σ_{Y}^{2}$, and the covariance $σ_{XY}$, as follows:

$SSIM = \frac{(2 μ_{X} μ_{Y} + C_{1}) (2 σ_{XY} + C_{2})}{(μ_{X}^{2} + μ_{Y}^{2} + C_{1}) (σ_{X}^{2} + σ_{Y}^{2} + C_{2})},$

(15)

where $C_{1} = (k_{1} L)^{2}$, $C_{2} = (k_{2} L)^{2}$, and L here denotes the dynamic range of the pixel values (not the model depth). The default values are $k_{1} = 0.01$ and $k_{2} = 0.03$.
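The three metrics can be computed with a few lines of NumPy. The following is a minimal sketch rather than the evaluation code used in the paper: the function names are hypothetical, images are assumed to be 8-bit arrays of equal shape, and `global_ssim` evaluates the SSIM expression once over the whole image, whereas the reference formulation in [48] averages the index over local windows.

```python
import numpy as np


def rmse(x, y):
    """Root mean square error between two images of equal shape."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.sqrt(np.mean((x - y) ** 2)))


def psnr(x, y, peak=255.0):
    """PSNR in dB; `peak` is the maximum pixel value R."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))


def global_ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole image (a simplification of [48])."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * dynamic_range) ** 2  # stabilizing constants as defined in [48]
    c2 = (k2 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)


# Toy data (assumption): a random image and a copy that is off by one
# gray level at every pixel.
rng = np.random.default_rng(0)
x = rng.integers(0, 255, size=(64, 64, 3))  # values 0..254, so x + 1 <= 255
y = x + 1

rmse_val = rmse(x, y)
psnr_val = psnr(x, y)
ssim_val = global_ssim(x, y)
```

On this toy pair every pixel differs by exactly one gray level, so the RMSE is 1 and the PSNR is $20 \log_{10} 255$, roughly 48.1 dB, which gives a feel for the scale of the values reported in the tables.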

5.2. Evaluation by Image Hiding Quality

Figure 4 compares DF-SWE with existing SWE methods on the Bedroom dataset (a subset of LSUN). Since SWE methods hide secret messages without embedding modifications and are immune to typical steganalysis tools, visual quality is crucial. Figure 4 shows that DF-SWE achieves a higher capacity and, as measured by the Fréchet Inception Distance (FID), generates more realistic images than its competitors. FID is a metric for image generation quality, and a lower FID score means that the generated images are more realistic. In contrast, the stego images generated by the competing SWE methods contain noticeable distortions.

Examples of stego images generated by DF-SWE are given in Figure 5 and Figure 6, which show the hiding quality for images of sizes $64 \times 64 \times 3$ and $128 \times 128 \times 3$, respectively. It can be observed that the stego images leak no visible information about the secret images. The secret image can be extracted from the stego image only through $Model_{se}$, $Model_{st}$, and the reversible circulation of double flow. $Model_{se}$ and $Model_{st}$ have hundreds of millions of parameters and different network structures, which makes recovering the secret images without them difficult.

Once trained, DF-SWE generalizes to hiding images from various domains. Figure 5 and Figure 6 show secret images and generated stego images from different domains. For example, the label LFW-CelebA signifies that the secret image is randomly selected from the LFW dataset and that the generated stego image follows the style of the CelebA dataset. From Figure 4, we can see that the images generated by DF-SWE are more realistic, and the extracted secret images are nearly lossless.

5.3. The Extraction Quality Compared with Prevalent Methods

Table 2 lists the information extraction accuracy of different steganographic approaches, i.e., DCGAN-Steg, SAGAN-Steg, SSteGAN, WGAN-Steg, IDEAS, and S2IRT, as the hiding payload increases. DF-SWE matches or exceeds the extraction accuracy of the other SWE approaches at every reported payload and is the only method with results above 4 BPP. The extraction accuracy of DF-SWE stays at 1 when the hiding payload ranges from 1 BPP to 4 BPP. Additionally, the proposed generative steganographic approach simultaneously achieves a high hiding capacity (up to 12 BPP) and accurate extraction of the secret message (an accuracy rate of almost $100\%$). Even when hiding images (BPP = 24), the extraction accuracy reaches 0.5124, and the pixel errors of the extracted images mostly fall within $\pm 1$. This is because DF-SWE builds an image-to-image reversible bijective mapping, which reduces the extraction error. In contrast, the extraction accuracy of the other SWE methods decreases as the hiding payload increases. At high hiding capacity, existing SWE methods either cannot hide the secret message or generate twisted and distorted stego images. Thus, existing SWE methods cannot achieve accurate information extraction under high hiding payloads.
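To make the relation between an extraction accuracy near 0.5 and pixel errors within $\pm 1$ concrete, the sketch below computes two rates on a toy image. It rests on an assumption that the text does not state: for image payloads, accuracy is taken to be the fraction of 8-bit pixel values recovered exactly. The function names and the simulated extraction noise are hypothetical.

```python
import numpy as np


def exact_match_rate(secret, extracted):
    """Fraction of pixel values recovered exactly (assumed accuracy definition)."""
    return float(np.mean(secret == extracted))


def within_tolerance_rate(secret, extracted, tol=1):
    """Fraction of pixel values whose absolute error is at most `tol`."""
    diff = np.abs(secret.astype(np.int16) - extracted.astype(np.int16))
    return float(np.mean(diff <= tol))


# Toy data (assumption): an 8-bit RGB secret image and an extraction
# whose per-pixel error is drawn uniformly from {-1, 0, +1}.
rng = np.random.default_rng(0)
secret = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
noise = rng.integers(-1, 2, size=secret.shape)
extracted = np.clip(secret.astype(np.int16) + noise, 0, 255).astype(np.uint8)

exact = exact_match_rate(secret, extracted)
near = within_tolerance_rate(secret, extracted, tol=1)
```

In this toy setting only about a third of the pixel values match exactly, yet every error stays within one gray level, so a moderate exact-match rate can coexist with a visually faithful reconstruction.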

The extraction metrics of the different methods are given in Table 3, which describes the extraction quality of secret images in terms of PSNR, SSIM, and RMSE. The LSUN, CelebA, and LFW columns report the results on the respective datasets. Unlike DF-SWE, which builds an image-to-image reversible bijective mapping, existing SWE methods write the secret message directly into a latent space and generate the image from that latent space, which makes it difficult to balance hiding capacity and generation quality. Because existing SWE methods have a low hiding capacity and cannot hide secret images of the same size as the stego image, we compare DF-SWE with ES methods to verify the extraction quality of the secret images. ES methods usually have better extraction performance than SWE methods, because they hide the secret image in a cover image and do not need to consider generation quality. In contrast, SWE methods require plausible visual quality for both the generated stego image and the recovered secret image. Table 3 shows that DF-SWE nevertheless outperforms all other listed methods and provides better secret image extraction quality.

5.4. Security Evaluation by Steganalysis

We compared DF-SWE with other steganographic methods under the YeNet steganalyzer [49] in Table 4 and evaluated DF-SWE against further steganalyzers (i.e., DFNet [50], ESNet [51], LWENet [52], and SiaStegNet [53]) in Table 5. To ensure experimental fairness, for a given steganalyzer we used models trained with identical hyperparameters (e.g., learning rate, number of training epochs, and dataset), and across steganalyzers the same evaluation metric was employed. The performance of the steganalysis was quantified using the detection error (Pe). The optimal value of Pe is 0.5, at which point the steganalyzer (e.g., YeNet [49]) fails to distinguish the source of the images and can only guess at random. Most embedded steganography (ES) methods exhibit inadequate security, with Pe values close to 0, whereas most steganography without embedding (SWE) methods achieve Pe values much closer to 0.5. Compared with existing SWE schemes, DF-SWE also offers a substantially larger payload, over 8000 times higher than that of its counterparts. As shown in Table 4 and Table 5, DF-SWE achieves Pe values that are closer to 0.5 than those of most other methods and remains robust under the different steganalyzers.
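As a worked illustration of why 0.5 is the optimal detection error, the following sketch computes Pe from the decisions of a binary steganalyzer. The function name, argument names, and counts are hypothetical and are not taken from the experiments.

```python
def detection_error(false_alarms, num_clean, missed_detections, num_stego):
    """Pe = (P_FA + P_MD) / 2 for a binary steganalyzer.

    false_alarms:      clean images wrongly flagged as stego
    missed_detections: stego images wrongly passed as clean
    """
    p_fa = false_alarms / num_clean
    p_md = missed_detections / num_stego
    return 0.5 * (p_fa + p_md)


# A detector that guesses at random errs on half of each class.
pe_random = detection_error(500, 1000, 500, 1000)

# A detector that separates the two classes well has a Pe close to 0.
pe_detected = detection_error(30, 1000, 50, 1000)
```

Here `pe_random` is 0.5 and `pe_detected` is about 0.04. A Pe far above 0.5 is not more secure: inverting such a detector's decisions would yield a Pe far below 0.5.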

5.5. Multiple Image Hiding

Most image hiding work can hide only a single secret image within a cover image. This is insufficient when several secret images form an integrated or sequentially related set that cannot be separated and must be hidden in one image. In particular, to the best of our knowledge, no prior work on image steganography without embedding supports multiple image hiding, and our method is the first to achieve multi-image hiding without embedding.

In this section, we present the experimental results of DF-SWE for hiding multiple images of sizes $64 \times 64 \times 3$ and $128 \times 128 \times 3$ in Figure 7 and Figure 8, respectively. $Secret_{i}$ is the i-th secret image, and $Extract_{i}$ is the i-th image extracted from the stego image (Stego), corresponding to $Secret_{i}$. It can be observed that, even though three images (i.e., $BPP = 72$) are hidden in the same stego image, the generated stego images remain natural. Moreover, the recovered secret images are nearly lossless. The corresponding extraction metrics for multiple image hiding are given in Table 6.
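The payload figures of 24 and 72 BPP follow directly from the BPP definition. The sketch below reproduces the arithmetic under the assumption that each secret image is an 8-bit RGB image of the same height and width as the stego image; the helper names are hypothetical.

```python
def image_payload_bits(height, width, channels=3, bit_depth=8, num_images=1):
    """Number of secret bits carried by `num_images` images of the given size."""
    return num_images * height * width * channels * bit_depth


def bits_per_pixel(num_secret_bits, stego_height, stego_width):
    """BPP = Len(secret) / (H * W), with H and W taken from the stego image."""
    return num_secret_bits / (stego_height * stego_width)


# One 64 x 64 x 3 secret image hidden in a 64 x 64 stego image.
single_image_bpp = bits_per_pixel(image_payload_bits(64, 64), 64, 64)

# Three such secret images hidden in the same stego image.
three_image_bpp = bits_per_pixel(image_payload_bits(64, 64, num_images=3), 64, 64)
```

With three channels of eight bits each, one hidden image contributes 3 × 8 = 24 bits per stego pixel regardless of resolution, so three hidden images give 72 BPP.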

5.6. Domain Generalization

We define domain generalization as the ability of a steganographic model to preserve its hiding–extraction functionality when applied to images drawn from distributions that differ from the training domain. Current image steganography usually requires that the secret images to be hidden come from the same domain as the samples used to train the steganography model. However, it is expensive to train individual steganography models for images from new domains. Furthermore, collecting training data from particular domains can be difficult due to data privacy or other concerns. Therefore, existing methods cannot achieve image steganography when access to images from the same domain as the secret images is prohibited.

In contrast, as depicted in the workflow of Figure 3, the entire process of DF-SWE involves two models, used in Step 2 and Step 4. Specifically, $Model_{se}$ encodes the secret image $I_{se}$, while $Model_{st}$ generates the stego image $I_{st}$. The two components, hiding the secret information and generating the stego image, are therefore decoupled. Additionally, when $Model_{st}$ generates images, the adopted noise $z_{st}^{'}$ differs from the noise z (sampled from random noise) used for generating the normal image $I_{ge}$ only in the part containing low-density information, which is replaced in Step 3. For $Model_{st}$, the distribution of $z_{st}^{'}$ is nearly identical to that of z, so the generated image $I_{st}$ is almost consistent with $I_{ge}$. Based on these characteristics, DF-SWE inherently possesses domain generalization capability.

The extraction metrics of DF-SWE on a dataset outside the training domains (i.e., Stanford Dogs) are presented in Table 7. Although these metrics are lower than those in Table 3, the method still maintains favorable visual performance. As shown in Figure 9, the images in the first three columns from the left are from the Stanford Dogs dataset, and the other images are randomly selected from the Internet. All of these images have distributions that differ substantially from those of the images used to train DF-SWE. According to Figure 9, DF-SWE can successfully hide and recover these images with satisfactory visual quality. This property greatly extends the applicability of DF-SWE and makes it the first domain-agnostic steganography method.

5.7. Ablation Experiment

Figure 10 presents an ablation analysis of the three tactics employed by DF-SWE: prior knowledge sampling, high-dimensional space replacement, and distribution consistency transformation. The first three, middle three, and last three columns of images show the effect of the different tactics on the LFW, CelebA, and LSUN datasets, respectively.

In the first row, the generated stego image is abnormal, particularly in the first three columns. This row uses the direct replacement tactic, in which z is replaced with $z_{se}$ directly without utilizing the prior knowledge of $Model_{st}$. The second row shows our proposed high-dimensional space replacement, which uses the low-dimensional space of $\hat{z}^{se}$ to replace the high-dimensional space of $\hat{z}^{st}$. High-dimensional space replacement effectively generates realistic images, but it is not sufficient to correct the abnormal images in the first three columns. The third row shows our proposed prior knowledge sampling, in which $z_{st}^{'}$ is replaced with $z_{se}$, utilizing the prior knowledge of $Model_{st}$ and a multivariate Gaussian latent variable z. The distribution consistency transformation is proposed to reduce the distortion caused by the difference between the two distributions. In the fourth row, z is replaced with $z_{se}$ after $z_{se}$ has been modified by the distribution consistency transformation. In the last three columns, the generated stego images look more normal than those in the first row.

The fifth row combines our proposed prior knowledge sampling and distribution consistency transformation: $z_{st}^{'}$ is replaced with $z_{se}$ after $z_{se}$ has been modified by the distribution consistency transformation. In the last three columns, the quality of the generated images improves significantly compared with the first row. In the sixth row, the first three columns indicate that high-dimensional space replacement and distribution consistency transformation alone cannot generate a realistic image. Comparing the second and seventh rows, the first three columns clearly show that prior knowledge sampling effectively improves the quality of the generated stego images.
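The caption of Figure 2 describes the distribution consistency transformation in terms of a mean, a standard deviation, a multiplication, and an addition. One way to picture such a transformation is as an invertible affine map on the latent tensor. The sketch below illustrates that reading only and is not the authors' implementation: the function names, the tensor shape, and the way `mean` and `std` are obtained are all assumptions.

```python
import numpy as np


def consistency_forward(z_se, mean, std):
    """Shift and scale a secret latent toward the target statistics."""
    return z_se * std + mean


def consistency_inverse(z, mean, std):
    """Undo the affine map; requires the same mean and std as the forward pass."""
    return (z - mean) / std


# Toy latent and toy statistics (assumptions, not values from the paper).
rng = np.random.default_rng(0)
z_se = rng.standard_normal((4, 8, 8))
mean = rng.normal(0.0, 0.1, size=z_se.shape)
std = rng.uniform(0.5, 1.5, size=z_se.shape)

z_matched = consistency_forward(z_se, mean, std)
z_back = consistency_inverse(z_matched, mean, std)
max_err = float(np.max(np.abs(z_back - z_se)))
```

Because the map is affine with a nonzero scale, it can be undone as long as the receiver holds the same statistics. In floating-point arithmetic the round trip is exact only up to rounding error, which `max_err` measures.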

In summary, the ablation experiments verify the effectiveness of our proposed method to circulate two latent flows and guarantee reversibility.

6. Discussion and Future Work

In this paper, we propose a novel double-flow-based steganography without embedding (DF-SWE) method for hiding large images. Specifically, we propose the reversible circulation of double flow to build a reversible bijective transformation between secret images and generated stego images. The reversible circulation ensures a small extraction error for the secret images and high quality for the generated stego images. Importantly, DF-SWE is the first SWE method that enables hiding multiple large images in one stego image: its payload capacity reaches 24–72 BPP, which is 8000–16,000 times that of the other SWE methods. DF-SWE thus provides a way to directly generate stego images without a cover image, which greatly improves the security of the secret images. According to the experimental results, DF-SWE shows better hiding and recovery performance than the compared methods. Notably, DF-SWE generalizes to hiding secret images from domains different from that of the training dataset. This property indicates that DF-SWE can be deployed in privacy-critical scenarios in which the secret images are hidden from the provider of DF-SWE.

Despite the strong performance of our method in secret image recovery, there remains room for improvement. For instance, the method depends heavily on the Glow model and is thus unable to accomplish steganography for high-resolution images (e.g., 256 × 256 or 512 × 512). Additionally, it lacks robustness against Gaussian noise and JPEG lossy compression, and the extraction process is not completely lossless.

In the future, it is of great significance to further explore the potential of SWE in lossless secret image recovery and investigate effective countermeasures against common perturbations such as noise attacks and JPEG lossy compression. We can attempt to adopt other reversible models similar to Glow or conduct in-depth research on the internal mechanism of the Glow model to develop solutions, all of which are promising research directions.

Author Contributions

Conceptualization, Y.D.; Data curation, W.Z.; Methodology, Y.D.; Project administration, Y.D.; Software, B.S.; Writing—original draft preparation, Y.D.; Writing—review and editing, B.S. and W.Z.; Investigation, Z.W.; Supervision, W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Yunnan Province expert workstations (Grant No: 202305AF150078), National Natural Science Foundation of China (Grant No: 62162067), Yunnan Fundamental Research Project (Grant No: 202401AT070474, 202501AU070059), Yunnan Province Special Project (Grant No: 202403AP140021), and Yunnan Provincial Department of Education Science Research Project (Grant No: 2025J0006).

Data Availability Statement

This study exclusively used publicly available benchmark datasets that were collected prior to the research. The datasets mainly include subsets of LSUN (containing 3,033,042 color images), LFW (containing 13,234 color images) and CelebA (containing 202,599 color images). No new data or human subject interaction was involved. The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Correction Statement

This article has been updated to include a Data Availability Statement. This change does not affect the scientific content of the article.

References

1. Zhu, J.; Kaplan, R.; Johnson, J.; Fei-Fei, L. HiDDeN: Hiding Data with Deep Networks. In Proceedings of the Computer Vision – ECCV 2018 – 15th European Conference, Munich, Germany, 8–14 September 2018; Proceedings, Part XV, Lecture Notes in Computer Science. Volume 11219, pp. 682–697.
2. Baluja, S. Hiding Images within Images. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 1685–1697.
3. Liu, M.; Zhang, M.; Liu, J.; Zhang, Y.; Ke, Y. Coverless Information Hiding Based on Generative adversarial networks. arXiv 2017, arXiv:1712.06951.
4. van Schyndel, R.; Tirkel, A.Z.; Osborne, C.F. A digital watermark. In Proceedings of the International Conference on Image Processing, Austin, TX, USA, 13–16 November 1994.
5. Wang, Z.; Zhang, Z.; Jiang, J. Multi-Feature Fusion based Image Steganography using GAN. In Proceedings of the IEEE International Symposium on Software Reliability Engineering, ISSRE, Wuhan, China, 25–28 October 2021; pp. 280–281.
6. Baluja, S. Hiding Images in Plain Sight: Deep Steganography. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 2069–2079.
7. Zhou, Z.; Sun, H.; Harit, R.; Chen, X.; Sun, X. Coverless Image Steganography Without Embedding. In Proceedings of the Cloud Computing and Security – First International Conference, ICCCS 2015, Nanjing, China, 13–15 August 2015; Revised Selected Papers, Lecture Notes in Computer Science. Volume 9483, pp. 123–132.
8. Zhou, Z.L.; Cao, Y.; Sun, X.M. Coverless Information Hiding Based on Bag-of-Words Model of Image. J. Appl. Sci. 2016, 34, 527–536.
9. Zheng, S.; Wang, L.; Ling, B.; Hu, D. Coverless Information Hiding Based on Robust Image Hashing. In Proceedings of the Intelligent Computing Methodologies – 13th International Conference, ICIC, Liverpool, UK, 7–10 August 2017; Lecture Notes in Computer Science. Volume 10363, pp. 536–547.
10. Zhang, Z.; Fu, G.; Liu, J.; Fu, W. Generative Information Hiding Method Based on Adversarial Networks. In Proceedings of the 8th International Conference on Computer Engineering and Networks (CENet2018), Shanghai, China, 17–18 August 2018; Springer: Cham, Switzerland, 2018; Volume 905, pp. 261–270.
11. Yu, C.; Hu, D.; Zheng, S.; Jiang, W.; Li, M.; Zhao, Z. An improved steganography without embedding based on attention GAN. Peer-to-Peer Netw. Appl. 2021, 14, 1446–1457.
12. Hu, D.; Wang, L.; Jiang, W.; Zheng, S.; Li, B. A Novel Image Steganography Method via Deep Convolutional Generative Adversarial Networks. IEEE Access 2018, 6, 38303–38314.
13. Wang, Z.; Gao, N.; Wang, X.; Qu, X.; Li, L. SSteGAN: Self-learning Steganography Based on Generative Adversarial Networks. In Proceedings of the Neural Information Processing – 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, 13–16 December 2018; Proceedings, Part II, Lecture Notes in Computer Science. Volume 11302, pp. 253–264.
14. Liu, X.; Ma, Z.; Ma, J.; Zhang, J.; Schaefer, G.; Fang, H. Image Disentanglement Autoencoder for Steganography without Embedding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; IEEE: New York, NY, USA, 2022; pp. 2293–2302.
15. Zhou, Z.; Su, Y.; Wu, Q.M.J.; Fu, Z.; Shi, Y. Secret-to-Image Reversible Transformation for Generative Steganography. arXiv 2022.
16. Kingma, D.P.; Dhariwal, P. Glow: Generative Flow with Invertible 1 × 1 Convolutions. In Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, Montreal, QC, Canada, 3–8 December 2018; pp. 10236–10245.
17. Chan, C.; Cheng, L. Hiding data in images by simple LSB substitution. Pattern Recognit. 2004, 37, 469–474.
18. Mielikäinen, J. LSB matching revisited. IEEE Signal Process. Lett. 2006, 13, 285–287.
19. Elharrouss, O.; Almaadeed, N.; Al-Máadeed, S. An image steganography approach based on k-least significant bits (k-LSB). In Proceedings of the IEEE International Conference on Informatics, IoT, and Enabling Technologies, ICIoT, Doha, Qatar, 2–5 February 2020; pp. 131–135.
20. Sahu, A.K.; Gutub, A. Improving grayscale steganography to protect personal information disclosure within hotel services. Multim. Tools Appl. 2022, 81, 30663–30683.
21. Pevný, T.; Filler, T.; Bas, P. Using High-Dimensional Image Models to Perform Highly Undetectable Steganography. In Proceedings of the Information Hiding – 12th International Conference, IH, Calgary, AB, Canada, 28–30 June 2010; Lecture Notes in Computer Science. Volume 6387, pp. 161–177.
22. Holub, V.; Fridrich, J.J.; Denemark, T. Universal distortion function for steganography in an arbitrary domain. EURASIP J. Inf. Secur. 2014, 2014, 1.
23. Guo, L.; Ni, J.; Shi, Y. Uniform Embedding for Efficient JPEG Steganography. IEEE Trans. Inf. Forensics Secur. 2014, 9, 814–825.
24. Pan, Y.; Ni, J.; Su, W. Improved Uniform Embedding for Efficient JPEG Steganography. In Proceedings of the Cloud Computing and Security – Second International Conference, ICCCS, Nanjing, China, 29–31 July 2016; Lecture Notes in Computer Science. Volume 10039, pp. 125–133.
25. Guo, L.; Ni, J.; Su, W.; Tang, C.; Shi, Y. Using Statistical Image Model for JPEG Steganography: Uniform Embedding Revisited. IEEE Trans. Inf. Forensics Secur. 2015, 10, 2669–2680.
26. Wu, P.; Yang, Y.; Li, X. Image-into-Image Steganography Using Deep Convolutional Network. In Proceedings of the Advances in Multimedia Information Processing – PCM 2018 – 19th Pacific-Rim Conference on Multimedia, Hefei, China, 21–22 September 2018; Lecture Notes in Computer Science. Volume 11165, pp. 792–802.
27. Zhang, Y.; Zhang, W.; Chen, K.; Liu, J.; Liu, Y.; Yu, N. Adversarial Examples Against Deep Neural Network based Steganalysis. In Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security, Innsbruck, Austria, 20–22 June 2018; pp. 67–72.
28. Kuppusamy, P.G.; Ramya, K.C.; Rani, S.S.; Sivaram, M.; Vigneswaran, D. A Novel Approach Based on Modified Cycle Generative Adversarial Networks for Image Steganography. Scalable Comput. Pract. Exp. 2020, 21, 63–72.
29. Porav, H.; Musat, V.; Newman, P. Reducing Steganography In Cycle-consistency GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR, Long Beach, CA, USA, 16–17 June 2019; pp. 78–82.
30. Zhang, R.; Dong, S.; Liu, J. Invisible steganography via generative adversarial networks. Multim. Tools Appl. 2019, 78, 8559–8575.
31. Jing, J.; Deng, X.; Xu, M.; Wang, J.; Guan, Z. HiNet: Deep Image Hiding by Invertible Network. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, 10–17 October 2021; IEEE: New York, NY, USA, 2021; pp. 4713–4722.
32. Guan, Z.; Jing, J.; Deng, X.; Xu, M.; Jiang, L.; Zhang, Z.; Li, Y. DeepMIH: Deep Invertible Network for Multiple Image Hiding. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 372–390.
33. Cao, Y.; Zhou, Z.; Sun, X.; Gao, C. Coverless information hiding based on the molecular structure images of material. Comput. Mater. Contin. 2018, 54, 197–207.
34. Qiu, A.; Chen, X.; Sun, X.; Wang, S.; Wei, G. Coverless Image Steganography Method Based on Feature Selection. J. Inf. Hiding Priv. Prot. 2019, 1, 12.
35. Liu, Q.; Xiang, X.; Qin, J.; Tan, Y.; Qiu, Y. Coverless image steganography based on DenseNet feature mapping. EURASIP J. Image Video Process. 2020, 2020, 39.
36. Chen, X.; Zhang, Z.; Qiu, A.; Xia, Z.; Xiong, N.N. Novel Coverless Steganography Method Based on Image Selection and StarGAN. IEEE Trans. Netw. Sci. Eng. 2022, 9, 219–230.
37. Zhu, Y.; Chen, F.; He, H. Orthogonal GAN information hiding model based on secret information driven. J. Appl. Sci. 2019, 37, 721–732.
38. Jiang, W.; Hu, D.; Yu, C.; Li, M.; Zhao, Z. A New Steganography Without Embedding Based on Adversarial Training. In Proceedings of the ACM TUR-C’20: ACM Turing Celebration Conference, Hefei, China, 22–24 May 2020; pp. 219–223.
39. Wei, P.; Luo, G.; Song, Q.; Zhang, X.; Qian, Z.; Li, S. Generative Steganographic Flow. In Proceedings of the IEEE International Conference on Multimedia and Expo, ICME 2022, Taipei, Taiwan, 18–22 July 2022; pp. 1–6.
40. Pawar, P.P.; Femy, F.F.; Rajkumar, N.; Jeevitha, S.; Bhuvanesh, A.; Kumar, D. Blockchain-enabled Cybersecurity for IoT Using Elliptic Curve Cryptography and Black Winged Kite Model. Int. J. Inf. Tecnol. 2025.
41. Li, J.; Niu, K.; Liao, L.; Wang, L.; Liu, J.; Lei, Y.; Zhang, M. A Generative Steganography Method Based on WGAN-GP. In Proceedings of the Artificial Intelligence and Security, Hohhot, China, 17–20 July 2020; pp. 386–397.
42. Chu, C.; Zhmoginov, A.; Sandler, M. Cyclegan, a master of steganography. arXiv 2017, arXiv:1712.02950.
43. Yu, J.; Zhang, X.; Xu, Y.; Zhang, J. CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography. arXiv 2023, arXiv:2305.16936.
44. Weng, X.; Li, Y.; Chi, L.; Mu, Y. High-Capacity Convolutional Video Steganography with Temporal Residual Modeling. In Proceedings of the 2019 on International Conference on Multimedia Retrieval, ICMR 2019, Ottawa, ON, Canada, 10–13 June 2019; pp. 87–95.
45. Yu, F.; Zhang, Y.; Song, S.; Seff, A.; Xiao, J. LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop. arXiv 2015, arXiv:1506.03365.
46. Huang, G.B.; Learned-Miller, E. Labeled Faces in the Wild: Updates and New Reporting Procedures; Technical Report UM-CS-2014-003; University of Massachusetts: Amherst, MA, USA, 2014.
47. Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep Learning Face Attributes in the Wild. In Proceedings of the International Conference on Computer Vision (ICCV), Santiago, Chile, 13–16 December 2015.
48. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
49. Ye, J.; Ni, J.; Yi, Y. Deep Learning Hierarchical Representations for Image Steganalysis. IEEE Trans. Inf. Forensics Secur. 2017, 12, 2545–2557.
50. Fu, T.; Chen, L.; Jiang, Y.; Jia, J.; Fu, Z. Image Steganalysis Based on Dual-Path Enhancement and Fractal Downsampling. IEEE Trans. Inf. Forensics Secur. 2025, 20, 1–16.
51. He, J.; Weng, S.; Yu, L.; Chen, D. Steganalysis Network with Two-Branch Preprocessing for Spatial and JPEG Domains. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 1451–1463.
52. Weng, S.; Chen, M.; Yu, L.; Sun, S. Lightweight and Effective Deep Image Steganalysis Network. IEEE Signal Process. Lett. 2022, 29, 1888–1892.
53. You, W.; Zhang, H.; Zhao, X. A Siamese CNN for Image Steganalysis. IEEE Trans. Inf. Forensics Secur. 2021, 16, 291–306.

Figure 1. The network architecture of DF-SWE for image-to-image hiding.

Figure 2. High-dimensional space replacement and distribution consistency transformation. Std and Mean are the standard deviation and mean of $z_{L}^{'}$. ⊗ and ⊕ are the matrix operations of multiplication and addition, respectively. DCT is the distribution consistency transformation, and HDSR is the high-dimensional space replacement.

Figure 3. Hiding and restoring processes. (a) is the hiding phase, including Steps 1–4. (b) is the extracting phase, including Steps 5–7.

Figure 4. Hiding evaluation with steganography without embedding.

Figure 5. The quality of generated images with the size $64 \times 64$ using DF-SWE.

Figure 6. The quality of generated images with the size $128 \times 128$ using DF-SWE.

Figure 7. Multiple image hiding.

Figure 8. Multiple image hiding with $128 \times 128 \times 3$.

Figure 9. Data domain generalization.

Figure 10. Ablation experiment with 3 tactics.

Table 1. The statistics of hidden capacity.

Methods	Year	Hiding Type	Max Payloads (BPP)
DCGAN-Steg [12]	2018	bit	9.1 × 10⁻³
SSteGAN [13]	2018	bit	2.9 × 10⁻¹
SAGAN-Steg [11]	2021	bit	4 × 10⁻¹
IDEAS [14]	2022	bit	2.3 × 10⁻²
S2IRT [15]	2022	bit	4

Table 2. The information extraction accuracy of different methods under different hiding payloads (BPP).

Methods	Type	1 BPP	2 BPP	4 BPP	6 BPP	12 BPP	24 BPP
DCGAN-Steg [12]	bit	0.7134	0.712	0.7122	-	-	-
SAGAN-Steg [11]	bit	0.7245	0.7232	0.723	-	-	-
SSteGAN [13]	bit	0.7139	0.7126	0.7124	-	-	-
WGAN-Steg [41]	bit	0.7122	0.7114	0.7113	-	-	-
IDEAS [14]	bit	0.7552	0.755	0.7546	-	-	-
S2IRT [15]	bit	1	1	0.9942	-	-	-
Ours	image	1	1	1	0.9921	0.9836	0.5124

Table 3. Extraction metrics of secret images compared with prevalent methods.

Methods	Type	LSUN PSNR ↑	LSUN SSIM ↑	LSUN RMSE ↓	CelebA PSNR ↑	CelebA SSIM ↑	CelebA RMSE ↓	LFW PSNR ↑	LFW SSIM ↑	LFW RMSE ↓
4bit-LSB	ES	23.06	0.7638	18.05	23.13	0.7518	17.86	23.13	0.7668	17.92
Baluja [2]	ES	32.41	0.9242	6.31	32.94	0.9325	6.11	32.61	0.9392	6.03
Weng et al. [44]	ES	33.74	0.9518	5.25	34.51	0.9582	4.98	34.23	0.9657	4.98
HiDDeN [1]	ES	34.49	0.9536	4.32	36.34	0.9629	4.07	36.24	0.9682	3.96
CycleGAN [42]	SWE	23.36	0.8005	17.63	23.98	0.8459	16.71	22.67	0.7951	19.65
CRoSS [43]	SWE	18.13	0.5659	32.68	18.44	0.5724	31.42	18.67	0.5982	30.42
Ours	SWE	34.51	0.9542	4.24	37.85	0.9675	2.51	38.13	0.9697	3.21

Table 4. Security evaluation by steganalysis.

Type	Methods	Hiding Type	Payload (BPP) ↑	Pe → 0.5
ES	Baluja [2]	image	24	0.04
ES	Weng et al. [44]	image	24	0.04
ES	HiDDeN [1]	image	24	0.03
SWE	DCGAN-Steg [12]	Binary	1.5 × 10⁻³	0.48
SWE	SAGAN-Steg [11]	Binary	3.2 × 10⁻³	0.47
SWE	SSteGAN [13]	Binary	1.5 × 10⁻³	0.52
SWE	WGAN-Steg [41]	Binary	1.5 × 10⁻³	0.52
SWE	CycleGAN [42]	image	24	0.86
SWE	CRoSS [43]	image	24	0.74
SWE	Ours	image	24	0.51

Table 5. Our DF-SWE in different steganalyzers.

Steganalyzer	Payload (BPP) ↑	Pe → 0.5
DFNet [50]	24	0.57
ESNet [51]	24	0.5
LWENet [52]	24	0.53
SiaStegNet [53]	24	0.51

Table 6. Extraction metrics of multiple hiding.

Methods	LSUN PSNR ↑	LSUN SSIM ↑	LSUN RMSE ↓	CelebA PSNR ↑	CelebA SSIM ↑	CelebA RMSE ↓	LFW PSNR ↑	LFW SSIM ↑	LFW RMSE ↓
DF-SWE (multiple images)	29.967	0.875	8.3556	31.654	0.908	6.768	33.204	0.927	5.629

Table 7. Metrics on domain generalization.

Method	PSNR ↑	SSIM ↑	RMSE ↓
Our DF-SWE	24.72	0.755	15.295

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
