Coverless Image Steganography Based on Generative Adversarial Network
Abstract
1. Introduction
- (1) We propose a GAN-based method for steganography tasks that achieves a relative payload of 2.36 bits per pixel.
- (2) We propose a measurement method for evaluating the image quality of deep-learning-based steganography algorithms, which allows comparison with traditional methods.
2. Image Steganography Based on GAN
3. Method
- (1) An Encoder network $\epsilon $, which receives a coverless image and a binary secret message and generates a steganographic image;
- (2) A Decoder network G, which receives a steganographic image and attempts to recover the secret message;
- (3) A Discriminator network D, which evaluates the quality of vectors and steganographic images S.
3.1. Encoder Network
- (1) Use a convolutional block $Conv$ to process the cover image C and obtain the tensor a of size $(32\times W\times H)$: $$a={Conv}_{3\to 32}\left(C\right).$$
- (2) Concatenate the message M with a, then process the result with a convolutional block $Conv$ to obtain the tensor b of size $(32\times W\times H)$: $$b={Conv}_{32+Depth\to 32}(ConCat(a,M)).$$
- (i) Basic model: We apply two convolutional blocks $Conv$ to tensor b in succession to generate the steganographic image S. Formally: $${\mathcal{E}}_{b}(C,M)={Conv}_{32\to 3}\left({Conv}_{32\to 32}\left(b\right)\right).$$
- (ii) Dense model: We use skip connections [18] to map the features f generated by the former Dense Block to the features l generated by the latter Dense Block, as shown in Figure 1. We hypothesize that skip connections improve the embedding rate. Formally: $$\left\{\begin{array}{l}f={Conv}_{64+Depth\to 32}(ConCat(a,b,M))\\ l={Conv}_{96+Depth\to 3}(ConCat(a,b,f,M))\\ {\mathcal{E}}_{l}(C,M)=C+l\end{array}\right.$$
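The channel arithmetic in the steps above can be checked with a minimal NumPy sketch. It assumes each $Conv$ block is replaced by a random pointwise (1×1) projection, a stand-in chosen only to keep the example short; the kernel size, normalization, and activation of the real block are not given here. The names `conv_block`, `shared_features`, `encode_basic`, and `encode_dense` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_block(x, out_channels):
    """Stand-in for a Conv block: a random 1x1 projection over channels.

    Assumption: the real block uses learned spatial kernels. Only the
    channel arithmetic (in_channels -> out_channels) is modelled here.
    """
    in_channels = x.shape[0]
    weight = rng.standard_normal((out_channels, in_channels)) * 0.01
    return np.einsum("oc,chw->ohw", weight, x)

def shared_features(cover, message):
    a = conv_block(cover, 32)                              # Conv_{3 -> 32}
    b = conv_block(np.concatenate([a, message]), 32)       # Conv_{32+Depth -> 32}
    return a, b

def encode_basic(cover, message):
    _, b = shared_features(cover, message)
    return conv_block(conv_block(b, 32), 3)                # Conv_{32->3}(Conv_{32->32}(b))

def encode_dense(cover, message):
    a, b = shared_features(cover, message)
    f = conv_block(np.concatenate([a, b, message]), 32)    # Conv_{64+Depth -> 32}
    l = conv_block(np.concatenate([a, b, f, message]), 3)  # Conv_{96+Depth -> 3}
    return cover + l                                       # residual output C + l

depth, w, h = 4, 16, 16
cover = rng.random((3, w, h))
message = rng.integers(0, 2, size=(depth, w, h)).astype(float)
stego_basic = encode_basic(cover, message)
stego_dense = encode_dense(cover, message)
print(stego_basic.shape, stego_dense.shape)
```

Both variants return a tensor with the same shape as the cover image. In the dense variant the concatenations grow from 32 + Depth to 64 + Depth to 96 + Depth channels, which is where the subscripts in the formulas come from.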
3.2. Decoder Network
3.3. Discriminator Network
3.4. The Objective Function
3.4.1. Encoder-Decoder Loss
- (1) The cross-entropy loss is used to evaluate the decoding accuracy of the decoder network, that is $${L}_{G}={E}_{X\sim {P}_{C}}\,\mathrm{CrossEntropy}\left(G\left(\epsilon (\mathbf{X},\mathbf{M})\right),\mathbf{M}\right).$$
- (2) The mean square error is used to measure the similarity between the steganographic image and the cover image, where W is the width and H is the height of the image, that is $${L}_{s}={E}_{X\sim {P}_{C}}\frac{1}{3\times W\times H}{\parallel \mathbf{X}-\epsilon (\mathbf{X},\mathbf{M})\parallel}_{2}^{2}.$$
- (3) The discriminator is used to evaluate the realness of the steganographic image, that is $${L}_{r}={E}_{X\sim {P}_{C}}D\left(\epsilon (\mathbf{X},\mathbf{M})\right).$$
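The three loss terms above can be written out for a single image as a minimal NumPy sketch. It assumes that the decoder emits one logit per message bit, so the cross entropy is the binary form, and that the discriminator returns one scalar score per image; neither detail is stated in this section. The function names are hypothetical.

```python
import numpy as np

def decoding_loss(logits, message):
    """L_G: binary cross entropy between decoder logits and message bits.

    Uses the numerically stable form of
    -[m * log(sigmoid(z)) + (1 - m) * log(1 - sigmoid(z))].
    """
    per_bit = (np.maximum(logits, 0) - logits * message
               + np.log1p(np.exp(-np.abs(logits))))
    return per_bit.mean()

def similarity_loss(cover, stego):
    """L_s: squared L2 distance divided by 3 * W * H."""
    channels, w, h = cover.shape  # channels is 3 for an RGB image
    return np.sum((cover - stego) ** 2) / (channels * w * h)

def realness_loss(scores):
    """L_r: mean discriminator score over steganographic images."""
    return np.mean(scores)

rng = np.random.default_rng(0)
cover = rng.random((3, 8, 8))
stego = cover + 0.01 * rng.standard_normal((3, 8, 8))
message = rng.integers(0, 2, size=(2, 8, 8)).astype(float)
logits = rng.standard_normal((2, 8, 8))
total = (decoding_loss(logits, message)
         + similarity_loss(cover, stego)
         + realness_loss(np.array([0.3])))
print(total)
```

The unweighted sum `total` corresponds to the combined objective ${L}_{G}+{L}_{s}+{L}_{r}$ used in step 2 of Algorithm 1.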
Algorithm 1 Steganographic training algorithm based on GAN
Input: Encoder $\epsilon $, Decoder $G$, Discriminator $D$; threshold${}_{G}\leftarrow 0.9$, threshold${}_{D}\leftarrow 0.85$.
Output: val${}_{G}\leftarrow $ CrossEntropy of G.
1. while val${}_{G}<$ threshold${}_{G}$ do
2. Update $\epsilon $ and G using ${L}_{G}+{L}_{s}+{L}_{r}$.
3. for $n$ training epochs do
4. if val${}_{G}<$ threshold${}_{G}$ then
5. Update $\epsilon $ using ${L}_{s}$, G using ${L}_{G}$
6. else if val${}_{D}<$ threshold${}_{D}$ then
7. else
8. Update $\epsilon $ using ${L}_{s}+{L}_{r}$, G using ${L}_{G}$
9. Get val${}_{G}\leftarrow $ CrossEntropy of G
10. Get val${}_{D}\leftarrow $ Cross validation accuracy of D
11. end if
12. end for
13. done
14. return val${}_{G}$
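The branching in steps 4 to 8 of Algorithm 1 can be read as a small selection rule over the loss terms. The sketch below is one possible reading, not the authors' implementation: `select_losses` is a hypothetical helper, and because the listing gives no update for the `else if` branch, that case returns `None` instead of guessing.

```python
def select_losses(val_g, val_d, threshold_g=0.9, threshold_d=0.85):
    """Return the loss terms that drive the encoder and decoder for one epoch.

    Mirrors steps 4-8 of Algorithm 1. The thresholds default to the values
    listed in the algorithm's input line.
    """
    if val_g < threshold_g:
        # Step 5: decoder not yet good enough, so leave out the realness term.
        return {"encoder": ("L_s",), "decoder": ("L_G",)}
    if val_d < threshold_d:
        # Step 6: the source lists no body for this branch.
        return None
    # Step 8: add the realness term to the encoder objective.
    return {"encoder": ("L_s", "L_r"), "decoder": ("L_G",)}

print(select_losses(val_g=0.5, val_d=0.5))
print(select_losses(val_g=0.95, val_d=0.9))
```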
3.4.2. Structural Similarity Index
4. Experimental Results and Analysis
4.1. Evaluation Metrics
4.1.1. Reed Solomon Bits Per Pixel
4.1.2. Peak Signal-to-Noise Ratio
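As a reference point for this metric, the standard definition $\mathrm{PSNR}=10\,{\log }_{10}\left({\mathrm{MAX}}^{2}/\mathrm{MSE}\right)$ can be sketched as follows. The sketch assumes 8-bit images, so the peak value defaults to 255; the function name `psnr` and the test images are illustrative only.

```python
import math
import numpy as np

def psnr(cover, stego, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    # Cast to float first so that uint8 subtraction cannot wrap around.
    diff = np.asarray(cover, dtype=np.float64) - np.asarray(stego, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(3, 32, 32), dtype=np.uint8)
noise = rng.integers(-2, 3, size=cover.shape)
stego = np.clip(cover.astype(np.int64) + noise, 0, 255).astype(np.uint8)
print(psnr(cover, stego))
```

A higher value means the steganographic image is closer to the cover image.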
4.2. Training
4.3. Experimental Results
5. Discussion and Conclusions
Layers | Name | Output Size |
---|---|---|
Input | / | $3\times 256\times 256$ |
Layer1 | ConvBlock1 | $8\times 128\times 128$ |
Layer2 | ConvBlock1 | $8\times 128\times 128$ |
Layer3 | ConvBlock2 | $16\times 64\times 64$ |
Layer4 | ConvBlock2 | $32\times 32\times 32$ |
Layer5 | ConvBlock3 | $128\times 8\times 8$ |
Layer6 | SPPBlock | $2688\times 1$ |
Layer7 | FC | $128\times 1$ |
Layer8 | FC | $2\times 1$ |
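The SPPBlock output size of 2688 in the table above is consistent with 128 × (1 + 4 + 16), that is, 128 channels pooled over 1×1, 2×2, and 4×4 grids. The sketch below illustrates that arithmetic with NumPy. The pyramid levels and the use of max pooling are assumptions inferred from the sizes, not details stated in the table, and `spatial_pyramid_pool` is a hypothetical name.

```python
import numpy as np

def spatial_pyramid_pool(features, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) map over n x n grids and concatenate the results.

    Assumes H and W are divisible by every level, as with 8 x 8 inputs and
    levels (1, 2, 4).
    """
    channels, height, width = features.shape
    pooled = []
    for n in levels:
        # Split the map into an n x n grid of cells, then take each cell's max.
        cells = features.reshape(channels, n, height // n, n, width // n)
        pooled.append(cells.max(axis=(2, 4)).reshape(channels * n * n))
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
layer5_output = rng.random((128, 8, 8))   # matches the Layer5 row
vector = spatial_pyramid_pool(layer5_output)
print(vector.shape)
```

Because the pooled length depends only on the channel count and the pyramid levels, the fully connected layers that follow receive a fixed-size input regardless of the spatial size of the feature map.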
Dataset | Depth | Ours, Basic Model: PSNR (dB) | Ours, Basic Model: MS-SSIM | Ours, Dense Model: PSNR (dB) | Ours, Dense Model: MS-SSIM | Zhang’s, Basic Model: PSNR (dB) | Zhang’s, Basic Model: MS-SSIM | Zhang’s, Dense Model: PSNR (dB) | Zhang’s, Dense Model: MS-SSIM |
---|---|---|---|---|---|---|---|---|---|
Div2k | 1 | 39.80 | 0.91 | 37.27 | 0.90 | 34.71 | 0.86 | 34.33 | 0.85 |
Div2k | 2 | 36.03 | 0.87 | 36.09 | 0.88 | 34.21 | 0.84 | 34.32 | 0.85 |
Div2k | 3 | 34.74 | 0.84 | 34.65 | 0.84 | 33.14 | 0.80 | 33.00 | 0.80 |
Div2k | 4 | 35.59 | 0.86 | 35.35 | 0.85 | 33.73 | 0.83 | 33.99 | 0.83 |
Div2k | 5 | 35.88 | 0.87 | 36.47 | 0.88 | 34.17 | 0.84 | 34.36 | 0.84 |
Div2k | 6 | 36.61 | 0.88 | 36.78 | 0.89 | 34.97 | 0.86 | 34.71 | 0.85 |
Dataset | Depth | Ours, Basic Model: RS-BPP | Ours, Dense Model: RS-BPP | Zhang’s, Basic Model: RS-BPP | Zhang’s, Dense Model: RS-BPP |
---|---|---|---|---|---|
Div2k | 1 | 0.96 | 0.96 | 0.93 | 0.93 |
Div2k | 2 | 1.82 | 1.83 | 1.76 | 0.93 |
Div2k | 3 | 2.36 | 2.36 | 2.18 | 2.22 |
Div2k | 4 | 2.30 | 2.30 | 2.20 | 2.23 |
Div2k | 5 | 2.28 | 2.31 | 2.15 | 2.19 |
Div2k | 6 | 2.24 | 2.27 | 2.17 | 2.18 |
Dataset | Depth | Ours, Basic Model: Accuracy of Recovery | Ours, Dense Model: Accuracy of Recovery | Zhang’s, Basic Model: Accuracy of Recovery | Zhang’s, Dense Model: Accuracy of Recovery |
---|---|---|---|---|---|
Div2k | 1 | 0.98 | 0.98 | 0.97 | 0.96 |
Div2k | 2 | 0.96 | 0.96 | 0.94 | 0.96 |
Div2k | 3 | 0.89 | 0.89 | 0.86 | 0.87 |
Div2k | 4 | 0.79 | 0.79 | 0.77 | 0.78 |
Div2k | 5 | 0.73 | 0.73 | 0.72 | 0.72 |
Div2k | 6 | 0.67 | 0.69 | 0.68 | 0.68 |
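The recovery accuracies and the RS-BPP values reported above can be related by a short calculation. The sketch assumes the Reed-Solomon bits-per-pixel definition used in SteganoGAN (Zhang et al.), $\mathrm{RS\text{-}BPP}=Depth\times (1-2p)$, where $p$ is the bit error rate, that is, one minus the recovery accuracy. Whether this document uses exactly that definition is an assumption, and `rs_bpp` is a hypothetical helper.

```python
def rs_bpp(accuracy, depth):
    """Reed-Solomon bits per pixel, assuming RS-BPP = depth * (1 - 2p)."""
    error_rate = 1.0 - accuracy  # p, the fraction of wrongly decoded bits
    return depth * (1.0 - 2.0 * error_rate)

# Accuracies for the basic model at depths 1 and 3, taken from the table above.
print(round(rs_bpp(0.98, depth=1), 2))
print(round(rs_bpp(0.89, depth=3), 2))
```

For depth 1 this gives 0.96, which matches the reported RS-BPP. For depth 3 it gives 2.34 against a reported 2.36, a gap within what two-decimal rounding of the accuracy allows. The relation also shows why the payload stops growing at larger depths: each extra bit plane adds capacity, but the falling accuracy takes more of it back.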
