Secure Steganographic Cover Generation via a Noise-Optimization Stacked StyleGAN2

Abstract: Recently, the style-based generative adversarial network StyleGAN2 has yielded state-of-the-art performance on unconditional high-quality image synthesis. From the perspective of steganography, however, image security is not guaranteed during image synthesis. Relying on the optimal properties of StyleGAN2, this paper proposes a noise-optimization stacked StyleGAN2 named NOStyle to generate a secure, high-quality cover (an image used for data hiding). In our proposed scheme, we decompose image synthesis into two stages with a symmetrical mode. In stage I, StyleGAN2 is preserved to generate a high-quality benchmark image. In the stage-II generator, based on the progressive mechanism and shortcut connections, we design a noise secure optimization network by which the stochastic variation (noise map) at different scales is automatically adjusted according to the results of the stage-II discriminator. After injecting the stochastic variation into different resolutions of the synthesis network, the stage-II generator obtains an intermediate image. For the symmetrical stage-II discriminator, we combine the image secure loss and the fidelity loss to construct the noise loss, which is used to evaluate the difference between the two images generated by the stage-I and stage-II generators. Taking the outputs of the stage-II discriminator as inputs, by iteration, the stage-II generator finally creates the optimal image. Extensive experiments show that the generated image is not only secure but also of high quality. Moreover, we conclude that the security of the generated image is inversely proportional to its fidelity.


Introduction
Steganography is a technology that embeds additional data into digital media through slight alterations to achieve covert communication without drawing suspicion [1]. Generally, the original digital media (cover) is a spatial/JPEG image that can be chosen from standard image sets or downloaded from the Internet [2]. For all steganographic algorithms, the effect of data embedding can be viewed as adding a string of independent pseudo-noise to the cover, and the modified image is called a stego image [3]. Therefore, after data embedding, the steganographic changes are concealed by the cover content. To measure the distortion caused by the embedding operation, each element (a pixel for an uncompressed image or a non-zero AC coefficient for a JPEG image) is assigned a distortion value computed by a predefined distortion function, and the total embedding distortion over all cover elements can be theoretically minimized with the aid of Syndrome-Trellis codes (STC), which nearly reach the payload-distortion bound [4-6]. To achieve high security, a group of novel content-adaptive embedding algorithms has been developed, such as wavelet obtained weights (WOW) [7], spatial universal wavelet relative distortion (S-UNIWARD) [8], high-pass, low-pass, and low-pass (HILL) [9,10], minimizing the power of optimal detector (MiPOD) [11], JPEG universal wavelet relative distortion (J-UNIWARD), uniform embedding distortion (UED), and uniform embedding revisited distortion (UERD) [10].

The first stage includes the mapping network (MN) and a pair of stage-I generator (SI-G) and stage-I discriminator (SI-D). The second stage includes a pair of new stage-II generator (SII-G) and discriminator (SII-D). Figure 1 shows the detailed framework. MN first takes a latent code z as the input and outputs an intermediate latent code w. Then, with w and progressive growing, SI-G generates an image. Meanwhile, comparing it with a real image, SI-D judges whether the generated image is vivid. After iteration and parameter optimization, SI-G creates the high-quality benchmark image. At the second stage, based on the basic architecture of StyleGAN2, we design a new generator SII-G composed of a secure noise optimization network (SNON) and a synthesis network (SyN).

Considering disentanglement and progressive growing, SNON aims to control the image details by adjusting the noise and injecting the optimized noise into the finer layers (128 × 128 and 256 × 256) of SyN. In SII-D, we design a noise loss, comprising the image secure loss and the fidelity loss, and compute the difference between the outputs of SI-G and SII-G. By minimizing the noise loss, SNON outputs multi-scale optimized noise maps which are injected into the corresponding scales of SyN, and we finally obtain the secure, high-quality image (cover). Therefore, after image synthesis, the proposed architecture generates a vivid image while the security of the generated image is enhanced.
In summary, the whole training can be separated into two stages. In the first stage, with a given dataset, we obtain a typical StyleGAN2 that accomplishes the image generation task and outputs a benchmark high-quality image. Then, applying SNON and the noise loss, we achieve noise optimization and image evaluation. Finally, the proposed architecture generates a vivid image. The contributions of this article are listed as follows:

•
We hope to make a tight connection between image generation and steganography. Considering the image synthesis, we hypothesize that noise injection can be seen as another type of steganography. Hence, by optimizing the injected noise map, the security of the generated image can be enhanced and guaranteed.


•
We propose the architecture NOStyle, which balances the security and quality of the generated image. To achieve this goal, combining the image secure loss and the fidelity loss, we design the noise loss that evaluates the complexity and fidelity of the generated image.

•
We try to draw a conclusion about the relationship between the security and fidelity of the image. To give a clearer explanation, we calculate the Fréchet inception distance (FID) and perform various security tests with multiple steganographic algorithms. According to the experimental results, it is clear that, for style-based image synthesis, the security of the generated image is inversely proportional to its fidelity.
The rest of this article is organized as follows. In Section 2, we present the basic notation, the basic theory of GAN, the concept of secure steganography, and a typical steganographic distortion. The detailed architecture and the training process of our proposed scheme are described in Section 3. In Section 4, we give extensive experiments and detailed analysis of the security and quality of the generated images. Finally, Section 5 concludes the whole paper and provides further discussion.

Notation
Throughout the whole paper, capital letters are used for random variables and boldface symbols stand for matrices and vectors. The symbol X = (x_1, ..., x_n) represents the cover image (spatial/JPEG) and Y = (y_1, ..., y_n) is the corresponding stego image. According to the image format, the dynamic range of each element of the cover or stego image is {0, ..., 255} or {−1024, ..., 1024}. Meanwhile, M represents the embedded data. If the range of the embedding change |I| is set to 2 or 3, the embedding operation is called binary or ternary embedding.

Generative Adversarial Networks
Generative adversarial networks (GAN) alternately train the generator G and the discriminator D by a competitive mechanism. The optimized G aims to capture the true data distribution, represented by fake images which are hard for the optimized D to differentiate from real images. The terminal point of the competition is the Nash equilibrium [22] of the value function V(G, D). Overall, the procedure is defined as a min-max game between G and D:

min_G max_D V(G, D) = E_{x∼p_data}[log D(x)] + E_{z∼p_z}[log(1 − D(G(z)))], (1)

where p_data is the true data distribution from which a real sample x is drawn. In contrast, z is a noise vector sampled from the distribution p_z. Meanwhile, E denotes the expectation used to evaluate the difference between p_data and p_z. Generally, p_z is a uniform or Gaussian distribution.
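To make the min-max value function concrete, the toy sketch below estimates V(G, D) by Monte Carlo for a hand-picked 1-D case. The distributions, the identity generator, and the fixed logistic discriminator are all illustrative assumptions, not part of the paper's training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D setting (assumed): p_data = N(2, 1), p_z = N(0, 1),
# a deliberately poor generator G(z) = z, and a hand-fixed discriminator D.
def G(z):
    return z

def D(x):
    # logistic score: samples near the data mode (x = 2) score close to 1
    return 1.0 / (1.0 + np.exp(-(x - 1.0)))

# Monte Carlo estimate of
# V(G, D) = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 - D(G(z)))]
x = rng.normal(2.0, 1.0, size=100_000)   # samples from p_data
z = rng.normal(0.0, 1.0, size=100_000)   # samples from p_z
V = np.log(D(x)).mean() + np.log(1.0 - D(G(z))).mean()
print(f"estimated V(G, D) = {V:.3f}")
```

Both expectations are logarithms of probabilities, so the estimate is always negative; training D raises V, while training G lowers it.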

Concept of Secure Steganography
Content secure steganography is achieved by the combination of STC and a distortion function. Generally, the distortion function is designed to measure the distortion caused by embedding changes. Given the cover X = (x_1, ..., x_n) and stego Y = (y_1, ..., y_n), the differences between X and Y are measured by assigning an embedding cost ρ = (ρ_1, ..., ρ_n) to the image elements, where ρ_i > 0, i ∈ {1, ..., n}. Suppose the target payload of the embedded message M is C bits and the embedding operations are mutually independent. After data embedding, the expected overall distortion is the sum of the individual distortion values,

E[D(X, Y)] = Σ_{i=1}^{n} p_i ρ_i, (2)

where p_i is defined as the probability of modifying x_i. For binary embedding, the optimal modification probabilities have the Gibbs form

p_i = exp(−λρ_i) / (1 + exp(−λρ_i)), (3)

subject to the payload constraint

C = Σ_{i=1}^{n} H(p_i), (4)

where H(·) is the binary entropy function. Here, λ is a positive parameter chosen so that Equation (4) is satisfied.

Symmetry 2023, 15, 979

Using this deduction, payload-limited steganography is obtained by solving the constrained optimization problem

min_{Y ∈ 𝒴} D(X, Y), (5)

where 𝒴 is the set of stego objects; the sender embeds a fixed average payload of C bits while minimizing the average distortion with STC.
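In practice, the payload-limited sender described above is simulated by giving the modification probabilities a Gibbs form and binary-searching the parameter λ until the total entropy of the probabilities matches the target payload C. A minimal numpy sketch under these standard assumptions (the function names and search bounds are our own choices):

```python
import numpy as np

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def optimal_change_probs(rho, payload_bits, iters=60):
    """Binary-search lambda > 0 so that sum_i H(p_i) = payload_bits, with
    p_i = exp(-lambda*rho_i) / (1 + exp(-lambda*rho_i)) (binary embedding)."""
    lo, hi = 1e-6, 1e3          # assumed search bounds for lambda
    for _ in range(iters):
        lam = np.sqrt(lo * hi)  # geometric midpoint: lambda spans many decades
        e = np.exp(-lam * rho)
        p = e / (1.0 + e)
        if binary_entropy(p).sum() > payload_bits:
            lo = lam            # capacity too high -> increase lambda
        else:
            hi = lam            # capacity too low  -> decrease lambda
    return p

# Example: 1000 pixels, costs rising from 0.5 to 5, target payload 0.4 bpp
rho = np.linspace(0.5, 5.0, 1000)
p = optimal_change_probs(rho, payload_bits=0.4 * 1000)
```

STC then realizes an embedding whose empirical change rates approach these probabilities; the sketch only computes the theoretical optimum.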

Distortion Function in WOW
There are many classical spatial/JPEG distortion functions; typical examples are WOW, S-UNIWARD, and J-UNIWARD. These distortion functions use a set of directional filter banks to evaluate the smoothness of an image along the horizontal, vertical, and diagonal directions. Given three linear shift-invariant wavelet filters V = {K^(1), K^(2), K^(3)}, we obtain three directional residuals

W^(k) = K^(k) ∗ X, (6)

where '∗' is a mirror-padded convolution operation and k ∈ {1, 2, 3}. Given a pair of cover and stego images X and Y, applying the same convolution, we compute W_i^(k)(X) and W_i^(k)(Y) for i ∈ {1, ..., n}. The distortion is then defined through the weighted absolute differences of the filtered residuals of X and Y (with only one difference at position i), and the distortion value is computed as

ρ_i = Σ_{k=1}^{3} |W_i^(k)(X) − W_i^(k)(Y)| / (σ + |W_i^(k)(X)|), (7)

where σ > 0 is a constant stabilizing the numerical calculations. When the directional residual is larger, the distortion is smaller. To minimize the embedding distortion, data are therefore embedded into complex, non-modellable regions where the computed residuals are large.
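The residual-and-cost computation can be sketched as follows. Note two assumptions: the two-tap kernels below are simple directional stand-ins for the wavelet filter bank, and the per-pixel cost uses the common simplification ρ_i = Σ_k 1/(σ + |W_i^(k)|), which preserves the stated property that larger residuals yield smaller distortion.

```python
import numpy as np

def conv2d_mirror(img, k):
    """'Same'-sized correlation with mirror padding, built from shifted slices."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="reflect")
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

# Simple stand-in directional high-pass kernels (the paper uses wavelet banks)
K = [np.array([[-1.0, 1.0]]),               # horizontal
     np.array([[-1.0], [1.0]]),             # vertical
     np.array([[-1.0, 0.0], [0.0, 1.0]])]   # diagonal

def uniward_like_cost(X, sigma=1.0):
    """rho_i = sum_k 1 / (sigma + |W_i^(k)|): large residuals -> small cost."""
    W = [conv2d_mirror(X, k) for k in K]
    return sum(1.0 / (sigma + np.abs(w)) for w in W)
```

On a flat region every residual is zero, so the cost is maximal (3/σ per pixel); textured regions get cheaper, which is exactly the content-adaptive behaviour described above.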

Basic Idea
In the proposed work, based on StyleGAN2, we design a secure cover generation architecture. Specifically, the mapping network takes a latent code z as the input to generate an intermediate latent code w. Then, using z and w, the generator outputs a high-quality image from low resolution to high resolution with progressive growing and stochastic variation (noise maps).
Considering the distinguishing characteristics of the typical StyleGAN2 architecture and the demand for security, the main goal of our proposed architecture is to enhance the security of the generated image by optimizing the stochastic variation, while keeping the fidelity of the created image as vivid as possible [27-29]. Unlike the previous works StyleGAN and StyleGAN2, the stochastic variation represented as noise maps is not random; based on progressive growing and shortcut connections [30-32], we design a secure noise optimization network (SNON) which aims to optimize the noise maps. After optimization, we obtain proper noise maps which are involved in the synthesis of the high-resolution, secure image. Apart from the network structure, the convergence of SNON relies on the output of SII-D, in which we combine the predefined steganographic distortion function and the learned perceptual image patch similarity (LPIPS) to construct the noise loss employed to evaluate the difference between the results of SI-G and SII-G. Here, we hypothesize that, for image generation, security and fidelity are two contradictory goals. Therefore, our proposed scheme is a strategy that tries to make a tradeoff between security and fidelity.

Proposed Architecture Overview
According to the methodology described above, we now give a detailed description of the architecture of NOStyle which is shown in Figure 2 and the details are described as follows.

Structural Design
As discussed above, the proposed architecture mainly consists of MN, SNON, SyN, and SII-D. The individual parts are described as follows.

•
Mapping network (MN) accepts a non-linear 512 × 1 latent code z ∈ Z as the input, where Z is the corresponding density in the training data. The original z is represented as a combination of many factors of variation. According to the theory of disentanglement, the optimal latent code should be a combination of linear subspaces, each of which controls one factor of variation. Then, after normalization and eight fully connected layers (FC), z is disentangled and we obtain a more linear 512 × 1 intermediate latent code w ∈ W.

•
Synthesis network (SyN) takes the latent code w to generate a vivid image with progressive growing and noise. During training, this architecture first creates low-resolution images and then, step by step, outputs higher-resolution images. Therefore, the features at different resolutions do not affect each other. Meanwhile, without affecting the overall features, the injected noise adjusts local changes to make the image more vivid.

•
Generator network contains two networks, the stage-II SyN and SNON. The first (Figure 3, right) is the synthesis network and the other (Figure 3, left and middle) is the noise optimization network. Considering the optimal characteristics of the disentangled latent code w, both networks use w as the input.
Our noise optimization network is mainly inspired by PGGAN and ResNet [33-35]. Two simple design rules are used to optimize the injected noise. First, we introduce the progressive mechanism to generate noise of the same size as each resolution of the image. Second, the disentangled latent code w is indirectly utilized to form the secure noise through a shortcut connection. The progressive model is formed by three blocks, each containing a fully connected layer, two 3 × 3 convolution layers, and an average pooling layer with stride 1. Each block promotes a lower resolution to a higher one, so after three promotions the latent code w is turned into a 512 × 512 feature map. Here, we want to adjust the injected noise at two resolutions (128 × 128 and 256 × 256). Hence, we reduce the 512 × 512 feature map to resolution 128 × 128 with three 1 × 1 convolution kernels and obtain three corresponding feature maps denoted as R_n1, R_n2, and R_n3, respectively.
We deduce that the disentangled latent code w is useful for constructing the secure noise. To fully utilize w, we introduce an underlying mapping H(·) to represent w as H(w). Motivated by previous works, we adopt three 1 × 1 convolution kernels to achieve the mapping H(·) and output three feature maps R_w1, R_w2, and R_w3 of size 128 × 128, 128 × 128, and 256 × 256, respectively. Therefore, we obtain six feature maps in total: R_n1, R_n2, R_n3, R_w1, R_w2, and R_w3.
We deem that the merging operation can enhance the effectiveness of the feature maps. Therefore, we merge four different feature maps into two groups, denoted T_1 = {R_n1, R_w1} and T_2 = {R_n2, R_w2}. After applying the activation function (leaky ReLU) to T_1 and T_2, two 128 × 128 noise maps are created and injected into the 128 × 128 layers of the synthesis network. Moreover, by up-sampling, we double the size of the feature map R_w3 to 256 × 256. Combining R_w3 and the same leaky ReLU activation, the third group T_3 = {R_n3, R_w3} is turned into the third noise map, which is injected into the 256 × 256 layer of the synthesis network. For the other layers of the stage-II SyN, the injected noise maps are kept unchanged.
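The merging described above can be sketched as follows. The text leaves the exact merge operation and the pre-merge shapes slightly ambiguous, so this illustration assumes elementwise addition and nearest-neighbour up-sampling of both members of T_3 to 256 × 256.

```python
import numpy as np

def leaky_relu(x, slope=1e-4):
    # slope matches the hyperparameter reported in Section 4
    return np.where(x >= 0, x, slope * x)

def upsample2x(x):
    # nearest-neighbour up-sampling: doubles both spatial dimensions
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

rng = np.random.default_rng(0)
# Stand-ins for the six feature maps produced by the two 1x1-conv branches
R_n1, R_n2, R_n3 = (rng.normal(size=(128, 128)) for _ in range(3))
R_w1, R_w2, R_w3 = (rng.normal(size=(128, 128)) for _ in range(3))

# Merge T1 and T2 into two 128x128 noise maps, T3 into one 256x256 map
noise_128_a = leaky_relu(R_n1 + R_w1)                        # T1 -> 128x128 layer
noise_128_b = leaky_relu(R_n2 + R_w2)                        # T2 -> 128x128 layer
noise_256 = leaky_relu(upsample2x(R_n3) + upsample2x(R_w3))  # T3 -> 256x256 layer
```

In the actual network these maps replace the random per-layer noise of StyleGAN2 at the 128 × 128 and 256 × 256 synthesis layers; the other layers keep their noise unchanged.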

Loss Function
The SII-D and SII-G are trained according to the noise loss, which is the combination of the image secure loss L_sl and the fidelity loss L_fl.

The image secure loss L_sl is used to measure the complexity of the image [36]. As discussed in Section 2.4, with the filter banks V = {K^(1), K^(2), K^(3)}, three directional residuals W^(k) are obtained, where k ∈ {1, 2, 3}. If we directly used W^(k) as the image secure loss, the quality of the generated image would be dominated by L_sl, and the created image could lack fidelity. In our scheme, we therefore use the "ln" operation to turn the larger residuals into smaller ones. Combining the three converted residuals, L_sl is written as

L_sl = sum(ln W^(1) + ln W^(2) + ln W^(3)). (8)

The fidelity loss L_fl makes the synthetic image more vivid. Inspired by the optimal characteristics of LPIPS, we adopt the LPIPS metric as our feature-level loss to evaluate the quality of the generated image; it averages the normalized, extracted features over all layers. Assume the reference and distorted patches are b_0 and b_1 of size p × q. Given a network F with L layers, we compute the normalized and scaled embeddings b̂_0^l, b̂_1^l ∈ R^{P_l × Q_l × C_l} of layer l. Collecting the parameters of all layers, the l_2 distance between b_0 and b_1 is computed as

d(b_0, b_1) = Σ_l (1/(P_l Q_l)) Σ_{p,q} ‖h_l ⊙ (b̂_{0,pq}^l − b̂_{1,pq}^l)‖_2^2, (9)

where setting the scale parameter h_l = 1 ∀l, h_l ∈ R^{C_l}, is equivalent to computing the cosine distance. L_fl is defined as

L_fl = d(b_0, b_1). (10)

We hypothesize that security and fidelity are contradictory goals: if L_sl is higher, the generated image may be more secure but less vivid; when L_sl is lower, the quality of the created image could be higher. Therefore, the final loss should make a tradeoff between L_sl and L_fl. Following this hypothesis, the noise loss L is defined as the weighted sum of the secure loss and the fidelity loss:

L = β L_sl + γ L_fl, (11)

where β and γ are the tunable parameters.
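A rough numpy illustration of how the two terms combine into the noise loss. The small ε inside the logarithm is our own guard against zero residuals, and the LPIPS distance is treated as an externally computed scalar (e.g., from an LPIPS implementation).

```python
import numpy as np

def secure_loss(residuals, eps=1e-8):
    # L_sl: sum of log-magnitudes of the three directional residuals;
    # eps is an assumed guard so ln stays defined at zero residuals
    return sum(np.log(np.abs(W) + eps).sum() for W in residuals)

def noise_loss(residuals, lpips_distance, beta=1e-4, gamma=1e-2):
    # L = beta * L_sl + gamma * L_fl, with beta and gamma from Section 4
    return beta * secure_loss(residuals) + gamma * lpips_distance
```

During training, the stage-II generator is updated to minimize this scalar, trading image complexity against perceptual closeness to the stage-I benchmark image.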
To give a clear explanation, the processing of the proposed scheme is described in Algorithm 1.

Algorithm 1
(1) Use w and SI-G to output the image X(R).
(2) Introduce w and N as the inputs of SNON and the stage-II SyN to generate the synthesized image X(S).
(3) Compute the noise distortion between X(S) and X(R).
(4) Update the tunable parameters β and γ to minimize L.
(5) Use the optimal parameters to output the optimal X(S).

Experimental Results and Discussion
In this section, we show the extensive experimental results for the performance evaluation and image quality analysis.

Settings

Image Sets
Three image sets are used in our experiments. The first is LSUN, containing around one million labeled images for each of ten scene categories and twenty object categories. We choose LSUN Cat as the training set, which is used in both stages of NOStyle. Here, due to the high demand on GPU resources and energy consumption, we adopt a pre-trained model for the first stage of our architecture. The second image set, named GSI (generated secure images), contains 80,000 256 × 256 gray images created by StyleGAN2, NOStyle-SLA, NOStyle-SLB, and NOStyle. NOStyle-SLA and NOStyle-SLB are monolayer versions of NOStyle. The third image set includes 10,000 256 × 256 images, which are the down-sampled version of BOSSbase ver. 1.01 produced by the Matlab "imresize" function [37].

Steganographic Methods
In total, four steganographic methods are used as the testing algorithms: the spatial method S-UNIWARD, two JPEG methods J-UNIWARD and UED, and a deep-learning steganographic method SGAN. S-UNIWARD and J-UNIWARD are based on the directional high-pass filter groups discussed in Section 2.4. For these methods, the steganographic distortions rely on the directional residuals computed from the spatial/decompressed JPEG image. Based on the intra/inter-block neighborhood coefficients, another typical steganographic method, UED, aims to minimize the overall statistical changes of DCT coefficients by modifying the non-zero quantized DCT coefficients with equal probability. Apart from the classical methods, there exist many GAN-based and CNN-based schemes. Among these methods, SGAN utilizes a GAN-based architecture to achieve better security.
Generally, the amount of embedded data is measured by the payload, which is represented as the ratio of the capacity of the embedded data to the number of available elements (pixels or non-zero JPEG coefficients). According to the format of the cover, the payload is measured in bits per pixel (bpp) or bits per non-zero AC coefficient (bpnzAC). For example, if the capacity of the embedded data is C and the number of available pixels is N, the relative payload is α = C/N. Applying STC, the message is embedded into a cover with minimized distortion to achieve undetectability.
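As a quick check of the definition, a one-line helper (purely illustrative; the example message size is our own):

```python
def relative_payload(capacity_bits, n_elements):
    """alpha = C / N, in bpp for spatial covers or bpnzAC for JPEG covers."""
    return capacity_bits / n_elements

# A 256 x 256 grayscale cover carrying a 26,214-bit message: alpha is about 0.4 bpp
alpha = relative_payload(26_214, 256 * 256)
```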

Steganalyzers
Three novel steganalyzers, DCTR, JRM, and SRMQ1, are employed to evaluate the security performance of the generated images. Depending on the mutual position of two adjacent/nonadjacent coefficients, SRMQ1 and JRM use co-occurrences to capture the correlation and statistical dependency of coefficients. DCTR uses the first-order statistics of quantized noise residuals calculated from the decompressed JPEG image with the 64 kernels of the discrete cosine transform.

Security Evaluation
The security evaluation is carried out on two databases, GSI and BOSSbase ver. 1.01. The chosen classifier is an ensemble classifier in which, based on subspaces of the original feature space, a series of sub-classifiers is constructed; the final decision is made by fusing the individual decisions of the sub-classifiers, each a Fisher linear discriminant (FLD).
The whole experimental process is divided into two stages, training and testing. At the training stage, using the designated steganography algorithm and cover dataset, we construct the corresponding stego images. Then, from the cover and stego sets, we randomly choose one half of each, in equal numbers, to create the training set. Finally, based on the statistical differences between the selected cover and stego images, we obtain a trained ensemble classifier which can be employed to judge whether an image is a cover or a stego.
Combining the remaining cover and stego images, we construct the testing set, on which the performance is evaluated. In the testing stage, there are two kinds of errors: a cover judged as a stego, and a stego judged as a cover. These two errors, respectively, stand for the false alarm and the missed detection, abbreviated P_FA and P_MD. Finally, the classification error is defined as the minimal average error under equal priors of the two classes,

P_E = min_{P_FA} (1/2)(P_FA + P_MD). (12)

The security of the generated cover is evaluated by P_E; a higher P_E means the cover has better security.
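The error P_E can be computed from classifier scores by scanning decision thresholds. The sketch below assumes higher scores indicate "stego"; the scoring convention is our own, not tied to a particular ensemble implementation.

```python
import numpy as np

def detection_error(cover_scores, stego_scores):
    """P_E = min over decision thresholds of (P_FA + P_MD) / 2,
    assuming higher scores mean 'more likely stego'."""
    thresholds = np.concatenate([cover_scores, stego_scores, [np.inf]])
    best = 1.0
    for t in thresholds:
        p_fa = np.mean(cover_scores >= t)  # false alarm: cover judged stego
        p_md = np.mean(stego_scores < t)   # missed detection: stego judged cover
        best = min(best, 0.5 * (p_fa + p_md))
    return best
```

Well-separated score distributions give P_E near 0 (easily detectable steganography), while indistinguishable distributions give P_E near 0.5 (perfect security).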

Image Secure Loss
As discussed in Section 3.4, the image secure loss aims to guarantee the security of the generated image. According to Equation (8), we use the directional residuals to design the image secure loss. Suppose the directional residuals are large; if we directly use them to build the image secure loss L_sl, the final noise loss could be dominated by L_sl and the effect of L_fl may be ignored. In this case, the fidelity of the generated image cannot be guaranteed. To give a clear explanation, we compare a normally generated image with an abnormal image created without the "ln" operation. According to the results shown in Figure 4, compared with the normal image (left), it is clear that the abnormal image (right) looks like random noise. Therefore, it is necessary to use the "ln" operation to turn the larger directional residuals into smaller ones.

Fidelity Loss
As analyzed in Section 3.3, fidelity is another key part of the noise loss. Inspired by the optimal characteristics of LPIPS, we use LPIPS to measure the fidelity of the generated image. We find that using only the image secure loss may make the image less vivid; even when the "ln" operation is used, the fidelity of the image cannot be guaranteed. To reveal the effect of the fidelity loss, we present a group of comparison images in Figure 5. The left image is generated by StyleGAN2 and the right one is created by NOStyle without using LPIPS. Compared with StyleGAN2, the generated image is blurry. Hence, according to the comparative results of Figures 4 and 5, the final noise loss should be the optimal combination of the image secure loss and the fidelity loss.

Hyperparameters
Based on the analysis above, L_sl and L_fl are both essential for generating a secure, high-quality image. Considering the values of the residuals and LPIPS, the two tunable parameters were set to β = 10^−4 and γ = 10^−2. Meanwhile, we use leaky ReLU with slope φ = 10^−4 and the equalized learning rate for all layers. To enhance the quality of the image, we follow some valuable prior conclusions and use the truncation trick to capture the area of high density; the truncation parameter is set to 0.5. An Adam optimizer with learning rate 0.1 is used to train our network.
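The truncation trick mentioned above can be written in one line; w_avg denotes the running average of the intermediate latents (the variable names are our own):

```python
import numpy as np

def truncate_latent(w, w_avg, psi=0.5):
    """Truncation trick: pull the latent toward the average latent,
    w' = w_avg + psi * (w - w_avg); psi = 0.5 as in our setting."""
    return w_avg + psi * (w - w_avg)
```

With psi = 1 the latent is unchanged; with psi = 0 it collapses to w_avg, trading sample diversity for higher average fidelity.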

Ablation Experiment
According to the discussion in Section 3.2, by iteration, the proposed architecture generates and adjusts the three injected noise map groups T_1, T_2, and T_3 to enhance the security of the generated image. The sizes of the three noise map groups are 128 × 128, 128 × 128, and 256 × 256, respectively. To evaluate the effect of each noise map group, we individually inject T_1 and T_2 into the synthesis network to create images; the resulting generative methods are named NOStyle-SLA and NOStyle-SLB.
To verify the security of the above two methods, we choose two image datasets, BOSSbase and GSI, as the cover sets. First, all spatial images are compressed into JPEG versions with quality factors 75 and 95. Then, we employ the steganographic methods J-UNIWARD and UED to create stego images. After extracting the DCTR feature and applying the ensemble classifier, the results are given in Tables 1-4. We observe that the security of the images generated by NOStyle-SLA and NOStyle-SLB outperforms the standard image set BOSSbase and the image set created by StyleGAN2.
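The detection results in Tables 1-4 are reported as the minimal total probability of error under equal priors, P E = min over thresholds of (P FA + P MD)/2. A self-contained sketch of this metric, given detector scores (not the actual DCTR plus ensemble-classifier pipeline):

```python
import numpy as np

def detection_error(cover_scores, stego_scores):
    """Minimal total-probability-of-error P_E for a score-based detector.

    P_E = min over thresholds of (P_FA + P_MD) / 2, the figure of merit
    commonly reported for ensemble-classifier steganalysis (sketch only).
    """
    thresholds = np.unique(np.concatenate([cover_scores, stego_scores]))
    best = 0.5  # random guessing is the worst useful detector
    for t in thresholds:
        p_fa = np.mean(cover_scores >= t)   # covers flagged as stego
        p_md = np.mean(stego_scores < t)    # stego images missed
        best = min(best, 0.5 * (p_fa + p_md))
    return best
```

A P_E near 0.5 means the steganalyzer cannot distinguish covers from stegos, i.e., the cover source is more secure.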
Across six relative embedding rates, the average improvements of NOStyle-SLA and NOStyle-SLB over StyleGAN2 are about 0.44% and 1.21%. Moreover, the results show that, at most relative payloads, NOStyle-SLB is more secure than NOStyle-SLA. Therefore, we conclude that, compared with T 1 , the noise map group T 2 can more effectively enhance the security of the generated images. Since T 1 and T 2 both show the ability to raise the security of the generated image, in our final scheme, T 1 and T 2 are both employed to create the secure image (cover). The results show that the overall structure of each image is almost the same and the stochastic details of the generated image are precisely represented. Therefore, we can see that the quality of the images generated by NOStyle is rather high. However, if we carefully observe the images, we find some tiny differences distributed in the detailed regions. The corresponding analysis of the comparison results will be given in the next subsection.
Besides the visual characteristics, we also discuss the feature representation of the generated image. As discussed in [25], the Fréchet inception distance (FID) is an excellent measure of image quality: a lower FID score indicates higher-quality images, and vice versa. Suppose p(•) and p w (•) represent the distributions of generated images and real images, m and C are the mean and covariance of p(•), and m w and C w are the mean and covariance of p w (•). FID is defined as
FID = ||m − m w || 2 + Tr(C + C w − 2(CC w ) 1/2 ).
Here, for 80,000 generated 256 × 256 images, we also calculate FID to measure image quality. The corresponding FIDs are listed in Table 5. The results in Table 5 show that the FID of NOStyle is the highest and, conversely, the corresponding value of StyleGAN2 is the lowest. Since a lower FID means higher image quality, we conclude that the quality of the images generated by NOStyle is lower than that of the other three generative networks. However, the gap between the four FID values is quite small, and we can infer that, for the given four image generative methods, the difference in image quality is rather small. Meanwhile, comparing the FIDs of NOStyle-SLA and NOStyle-SLB, we see that the FID of NOStyle-SLB is slightly higher than that of NOStyle-SLA. Hence, the quality of the images generated by NOStyle-SLA is higher than that of NOStyle-SLB. Combining the detection results and FID, for unconditional high-quality image synthesis, we conclude that a higher FID value implies higher image security. Here, we suppose that there is a tight connection between FID and image security. To give more explanations, the corresponding analysis will be discussed in Section 4.6.
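The FID between two Gaussians can be computed directly from their moments. A NumPy sketch, using the symmetric form C^(1/2) C w C^(1/2), whose matrix square root has the same trace as that of CC w for positive semi-definite covariances:

```python
import numpy as np

def psd_sqrtm(mat):
    # Matrix square root of a symmetric positive semi-definite matrix
    # via eigendecomposition: V diag(sqrt(lambda)) V^T.
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.T

def fid(m, C, m_w, C_w):
    """Frechet inception distance between Gaussians (m, C) and (m_w, C_w)."""
    diff = m - m_w
    half = psd_sqrtm(C)
    covmean = psd_sqrtm(half @ C_w @ half)  # symmetric, PSD-safe form
    return float(diff @ diff + np.trace(C + C_w - 2 * covmean))
```

In practice the moments come from Inception features of generated and real images; identical distributions give FID 0.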

Detail Comparison of Various Methods
According to the experimental results in [24] and the analysis above, StyleGAN2 shows excellent performance in generating high-quality images. Compared with StyleGAN2, NOStyle keeps some key architectures, including MN and SyN. The major differences between the two style-based generative networks lie in SNON and the stage-II discriminator. Therefore, comparing multiple images generated by the different generative models, we assert that the styles corresponding to the coarse and middle spatial resolutions are the same, while the details distributed in the complex regions have minor differences.
To show the local differences of the various generated images, we focus on the same complex region of four images created by StyleGAN2, NOStyle-SLA, NOStyle-SLB, and NOStyle. The comparison results are given in Figure 7. Clearly, for the given four methods, the chosen regions look almost the same. However, careful observation of the four comparison results reveals some tiny differences distributed in the complex region. The reason is that we only adjust the high-resolution noise maps. In fact, these spatial differences bring about changes in security and fidelity. Figure 8 gives examples of the generated covers, the corresponding stego images, and the modification maps. The stego images are generated by J-UNIWARD at 0.2 bpnzAC for JPEG quality factor 85. Although the four stego images look almost the same, the modification maps show that the embedding changes in the DCT domain are quite different. From the viewpoint of steganography, the embedding differences cause the difference in security, and we conclude that there is a strong connection between image synthesis and security.


Security Performance
In this part, we compare the security performance of the original image set BOSSbase and the image sets generated by the different generative models, including StyleGAN2, NOStyle-SLA, NOStyle-SLB, and NOStyle. The experiments are carried out in the spatial and JPEG domains. To construct the JPEG image sets, the original gray image set GSI is compressed into JPEG images with quality factors 75, 85, and 95. After the compression operation, we obtain 40,000 spatial images and 120,000 JPEG images in total. Finally, the experiments are executed on the 160,000 images, and the payloads for each image set are 0.05, 0.1, 0.2, 0.3, 0.4, and 0.5.

For the spatial cover sets, we choose three steganographic schemes, S-UNIWARD, HILL, and SGAN, to generate stego images. Meanwhile, for the JPEG cover sets, two JPEG steganographic schemes, J-UNIWARD and UED, are used to create stego images. Then, the original image set and the corresponding stego image sets are divided into two parts of equal size. Finally, with FLD, we obtain the corresponding detection results, which are shown in Tables 6-8 and Figures 9-12. According to these testing results, we can see that, compared with the other four image sets, BOSSbase, StyleGAN2, NOStyle-SLA, and NOStyle-SLB, NOStyle achieves the best security performance at almost every payload against SRMQ1, regardless of the typical spatial and GAN-based steganographic schemes. On the other hand, for the two JPEG steganographic methods UED and J-UNIWARD, NOStyle is more secure than StyleGAN2 against JRM and DCTR.
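As a sketch of the FLD step, here is a minimal two-class Fisher linear discriminant on extracted feature vectors; this is an illustration of the principle, not the exact classifier configuration used in the experiments:

```python
import numpy as np

def fld_direction(X0, X1, eps=1e-8):
    """Fisher linear discriminant direction for two feature classes.

    X0, X1: (n_samples, n_features) cover / stego feature matrices.
    Returns the projection vector w = Sw^{-1} (mu1 - mu0), onto which
    samples are projected and then thresholded for detection.
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter, regularized so it is always invertible.
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    Sw += eps * np.eye(Sw.shape[0])
    return np.linalg.solve(Sw, mu1 - mu0)
```

Usage: project the cover and stego halves of the test set via X @ w and pick the threshold minimizing the total detection error.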

Connection between Security and Fidelity
In this section, we hope to establish the connection between security and quality. To achieve this goal, we first select 8000 images from each of the four image sets created by StyleGAN2, NOStyle, NOStyle-SLA, and NOStyle-SLB, with equal size. For convenience, we refer to the four schemes as SG2, NS, NSA, and NSB. The experiments are carried out on all given images, and the three experimental relative payloads for each image set are 0.2, 0.3, and 0.4. Meanwhile, we use two JPEG steganographic methods (J-UNIWARD and UED) to generate stego images. Finally, the security testing is carried out on the extracted features DCTR and JRM. For the two steganographic methods, three relative payloads, and two quality factors, we obtain many combination schemes. For example, if we use DCTR to test J-UNIWARD at embedding rate 0.2 for JPEG quality factor 75, this category is abbreviated as "D-J-75-2".
Suppose we fix a testing combination; then, for the four methods, we obtain four detection errors. Here, we define a value P SF computed as the ratio of each P E to the maximum of the detection errors of the four generative models:
P SF = P E / max{P E (SG2), P E (NS), P E (NSA), P E (NSB)}.
Additionally, we apply the same operation to the corresponding FID values in Table 5. Across all the parameter combinations and FID, we obtain 13 ratios in total, which are listed in Table 9. According to the results shown in Table 9, in nearly all cases, the P SF of NOStyle is 1. This means that the detection error of NOStyle is the highest and the security of the corresponding generated images is the highest. Moreover, we see that the tendency of P SF across the various combinations and FID is consistent. Therefore, we assert that the security of the generated image is inversely proportional to the fidelity of the created image. This implies that, under the generative mechanism, when the fidelity is lower, the corresponding image security is higher, and vice versa for the case of higher fidelity.
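The ratio can be computed per testing combination; a minimal sketch with hypothetical detection errors for the four models:

```python
import numpy as np

def p_sf(detection_errors):
    """Ratio P_SF = P_E / max(P_E) over the four generative models,
    for one testing combination (e.g. "D-J-75-2").

    detection_errors: detection errors of SG2, NS, NSA, NSB (any order).
    """
    pe = np.asarray(detection_errors, dtype=float)
    return pe / pe.max()
```

The model with the largest detection error always receives P SF = 1, so comparing P SF rows across combinations (and against the FID row) directly exposes the security-fidelity trend.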
We aim to describe the relationship between the security of the generated image and its fidelity. As discussed previously, in the classical framework of image generation, the key task is to maintain the fidelity of the generated image through the disentanglement and style mixing mechanisms. Generally, the stochastic control of image detail is achieved by injecting stochastic noise at different layers of the synthesis network, which is trained based on the noise loss. From the perspective of steganography, combining the security loss and fidelity loss, we redesign the network loss, retrain the synthesis network, and obtain the stochastic map. However, for pure image generation, the resulting stochastic map is not optimal and, during image synthesis, the fidelity of the generated image is diminished. Indeed, according to the results of Table 5, the FID of NOStyle is slightly higher than the FID of StyleGAN2. This means that the fidelity of the images generated by NOStyle is slightly worse than that of the corresponding images created by StyleGAN2. Therefore, the above analysis verifies that the security of the generated image is inversely proportional to the fidelity. However, we see that the difference in fidelity between the two methods is tiny. Experiments show that, compared with StyleGAN2, NOStyle makes much bigger progress in image security.
Based on the experimental results and analysis, we see that, by designing the secure noise optimization network and an optimal noise loss, we achieve the optimization of the injected noise, which can be used to generate a secure and high-quality image. Finally, the proposed scheme makes a tradeoff between security and fidelity.

Computational Complexity
For model-based image synthesis, computational complexity is a key point in making the proposed approach applicable. We execute a set of experiments to evaluate the computational complexity of the three methods. To train our model, we choose a subset of the LSUN Cat dataset as the training set for our proposed image synthesis network NOStyle. As discussed previously, our network architecture mainly consists of two stages, denoted as stage-I and stage-II. The stage-I NOStyle generator is inherited from the pretrained StyleGAN2, and stage-II NOStyle is used to optimize the injected noise and generate a high-quality, secure image. Therefore, the computational complexity of the proposed scheme mainly depends on that of stage-II. The experiments are run on a server with a 2.2 GHz CPU, 16 GB of memory, and a 2080 Ti GPU. The computational complexity, represented as training time (h), is shown in Figure 13, in which NS, NSA, and NSB stand for the three comparative methods.
According to the results in Figure 13, the computational complexity of NS is higher than the corresponding values of NSA and NSB. Meanwhile, due to the similar mechanisms of NSA and NSB, the training times of the two generative methods are almost the same. However, the differences between NS and the other two methods are small; on average, the difference in training time is about half an hour. As discussed previously, considering the high demand on GPUs and energy consumption, we directly use the pretrained model of StyleGAN2 and, therefore, the training time of StyleGAN2 is not represented in Figure 13. Obviously, the computational complexity of StyleGAN2 is higher than that of the other three methods NS, NSA, and NSB. With this low computational complexity, the practicality of the proposed approach is rather high.

Stochastic Variation
Let us consider how style-based methods implement stochastic variation. Given the designed network, the stochastic realizations (noise maps) are achieved by adding per-pixel noise after each convolution of the network. According to the comparative results in Figures 6 and 7 and the related discussion in [25], the noise only affects the stochastic aspects of the generated image, such as hairs, fur, or freckles, while the overall composition of the different generated images remains unchanged. For our proposed architecture NOStyle, the injected noise is not totally random; it is adjusted to maintain image fidelity and security. Therefore, the pseudo-random noise indeed affects the security of the generated image. Our proposed architecture optimizes the given noise and makes an ideal tradeoff between image security and fidelity.
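The per-pixel noise injection described above can be sketched as follows; this is a simplified stand-alone version of the style-based mechanism, with the learned per-channel strength passed in explicitly and the surrounding convolution omitted:

```python
import numpy as np

def inject_noise(feature_map, noise_map, strength):
    """Per-pixel noise injection as in style-based synthesis (sketch).

    feature_map: (C, H, W) activations after a convolution.
    noise_map:   (H, W) single-channel noise map (the object NOStyle optimizes).
    strength:    (C,) learned per-channel scaling of the noise.
    """
    return feature_map + strength[:, None, None] * noise_map[None, :, :]
```

Setting the strength to zero recovers the noiseless activations, which is why the noise only perturbs stochastic detail and leaves the overall composition intact.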

Conclusions and Future Work
Traditional steganography usually uses a given image to achieve secure communication and undetectability. In this article, we propose a noise-optimization stacked StyleGAN2 called NOStyle to generate a secure and vivid image which can be used as an optimal cover. The proposed image synthesis is decomposed into two stages. The architecture of stage-I is the same as the original StyleGAN2, which is used to generate a high-quality benchmark image. Considering the optimal characteristics of StyleGAN2, the mapping network and synthesis network are kept for the stage-II generator. The highlight is that, based on progressive growing and the shortcut connection, we design a secure noise optimization network to output the optimal noise maps, which are used as the stochastic variation of the synthesis network. Relying on disentanglement, in the image synthesis phase, the optimized noise maps are injected into the finer layers to achieve detail control of the generated image. To train our proposed network, combining the steganographic distortion and the LPIPS metric, we design a noise loss which is used to evaluate the difference between the benchmark image and the results of SII-G. Taking the result of SII-D and the optimized noise, SII-G finally outputs the secure and high-quality image. Across multiple steganographic methods and steganalyzers, extensive results indicate that the security of the image set generated by NOStyle outperforms the standard image set BOSSbase and the generative model StyleGAN2. Meanwhile, NOStyle also shows an excellent ability to generate high-quality images. Moreover, comparing FID and the detection error, we conclude that the security of the generated image is inversely proportional to the fidelity. In the future, the secure noise adjustment can be spread into other layers of the synthesis network, and other spatial/JPEG distortion functions could be used to construct the discriminator to measure the quality of the generated image.

Figure 1 .
Figure 1. Sketch of the proposed architecture.


Figure 2 .
Figure 2. Details of the proposed architecture. Our proposed architecture mainly consists of five parts: MN, SNON, stage-I SyN, stage-II SyN, and SII-D. The stage-I NOStyle generator is inherited from the original StyleGAN2, including MN and stage-I SyN. In the first stage, we apply StyleGAN2 to generate a high-quality image which is used as a benchmark image and is injected into SII-G. Taking the SI-G results as input, stage-II NOStyle optimizes the injected noise and generates a high-quality, secure image. The architecture of stage-II NOStyle is composed of SNON, stage-II SyN, and SII-D. The design principle of SNON is motivated by progressive growing and the shortcut connection, by which the different-scale noise maps are optimized and injected into the finer layers of the stage-II SyN. Employing the optimized noise map, the random noise z, and an intermediate latent code w, stage-II SyN finally outputs the high-quality and secure image. Here, the architectures of stage-I SyN and stage-II SyN are the same, and the differences lie in the inputs of the network. Generally, the per-pixel added noise map is sampled from the Gaussian distribution N. Suppose the added noise map is N(R) ∈ N; combining w, the synthesis network stage-I SyN outputs a random image X(R). With the same latent code w, stage-II SyN generates a secure image X(S). Then, X(R) and X(S) are entered into SII-D. Using the wavelet filter banks and LPIPS, SII-D constructs the noise loss (NL) to evaluate the complexity and fidelity of the generated image. By minimizing NL, we can adjust the injected noise maps. The details are illustrated in Section 3.3.

Figure 3 .
Figure 3. Detailed description of the generator architecture.


Algorithm 1
Secure Image (Cover) Generation. Input: a pre-trained StyleGAN2 generator SI-G; a latent code w; SNON; stage-II SyN; a discriminator SII-D; a random noise map N ∈ N; the noise loss L. Output: the secure synthesized image (cover) X(S).

4.4. Quality of Generated Images
4.4.1. Comparison of Macroscopic Architecture
Based on LSUN Cat and the optimal parameters, we obtain three image synthesis methods: NOStyle-SLA, NOStyle-SLB, and NOStyle. Together with StyleGAN2, we have four methods in total. Using the different image generation methods with the same non-linear 512 × 1 latent code z, we can create similar images with the same scene. In Figure 6, we give a set of comparison examples. Each comparison image includes two sub-images generated by StyleGAN2 (left sub-image) and NOStyle (right sub-image). The results show that the overall structure of each image is almost the same and the stochastic details of the generated image are precisely represented.


Figure 6 .
Figure 6. Macroscopic comparison of the two methods. The four sub-figures (a-d) are generated by StyleGAN2 and NOStyle. Each sub-figure contains two images: the left one is a 256 × 256 image generated by StyleGAN2 and the right 256 × 256 image is created by NOStyle.

Figure 7 .
Figure 7. Detail comparison of the four generative models. For the four sub-figures (a-d), each sub-figure contains five sub-images generated by StyleGAN2, NOStyle-SLA, NOStyle-SLB, and NOStyle. For each sub-figure, the largest sub-image is a 256 × 256 benchmark image generated by StyleGAN2. For the other four smaller sub-images, under the same scene, the top left and top right sub-images are generated by StyleGAN2 and NOStyle-SLA, and the bottom left and bottom right sub-images are created by NOStyle-SLB and NOStyle, respectively.

Figure 8 .
Figure 8. Illustrations of four 256 × 256 cover images, stego images, and the corresponding modification maps of J-UNIWARD at 0.2 bpnzAC for StyleGAN2, NOStyle-SLA, NOStyle-SLB, and NOStyle. (a-d) are the cover images generated by the above four generative methods; (e-h) are the corresponding stego images; (i-l) are the modification maps in the JPEG domain.

Figure 10 .
Figure 10. Detection performance comparison of 256 × 256 BOSSbase, StyleGAN2, NOStyle-SLA, and NOStyle-SLB for detecting UED with DCTR for JPEG quality factors 75, 85, and 95. (a) Detection error for UED at QF 75. (b) Detection error for UED at QF 85. (c) Detection error for UED at QF 95.
According to the above testing results, we can see that, compared with the other four image sets, BOSSbase, StyleGAN2, NOStyle-SLA, and NOStyle-SLB, NOStyle achieves the best security performance at almost every payload against SRMQ1, regardless of the typical spatial and GAN-based steganographic schemes. On the other hand, for the two JPEG steganographic methods UED and J-UNIWARD, NOStyle is more secure than StyleGAN2 against JRM and DCTR. On average, across the six payloads, the improvements of NOStyle over StyleGAN2 are 1.19%, 0.94%, 1.32%, 1.02%, 1.28%, and 0.71%, respectively. The experiments indicate that, compared with the typical image generation scheme StyleGAN2, NOStyle can optimize the injected noise maps and enhance the security performance of the generated images. Comparing the spatial and JPEG detection results, we observe that NOStyle gains a bigger improvement on the JPEG steganographic schemes.

Table 9 .
Detection error ratios P SF of four image generative methods.