Deep Multi-Image Steganography with Private Keys

Abstract: In this paper, we propose deep multi-image steganography with private keys. Recently, several deep CNN-based algorithms have been proposed to hide multiple secret images in a single cover image. However, conventional methods are prone to leaking secret information because they do not provide access to an individual secret image and often decrypt the entire hidden information all at once. To tackle this problem, we introduce the concept of private keys for secret images. Our method conceals multiple secret images in a single cover image and generates a visually similar container image that holds the encrypted secret information. In addition, a private key corresponding to each secret image is generated simultaneously. Each private key provides access to only a single secret image while keeping the other hidden images and private keys unrevealed. Specifically, our model consists of deep hiding and revealing networks. The hiding network takes a cover image and secret images as inputs, extracts high-level features of the cover image, and generates private keys. The extracted features and private keys are then concatenated and used to generate a container image. The revealing network, in turn, extracts high-level features of the container image and decrypts a secret image using the extracted features and the corresponding private key. Experimental results demonstrate that the proposed algorithm effectively hides and reveals multiple secret images while achieving high security.


Introduction
Steganography is a technique for concealing information within an object while keeping the object that contains the hidden information indistinguishable from the original one. The main purpose of steganography is to grant access to the hidden information only to authorized clients while keeping both its content and its presence unrevealed to others. Various kinds of carriers, such as physical objects, texts, sounds, and network packets, have been utilized to safely conceal and deliver confidential data. Among them, the digital image is one of the most widely used carriers in recent digital steganographic algorithms (i.e., image steganography).
Conventional image steganography methods usually aim at hiding secret messages within a cover image. To this end, various studies including spatial domain-based methods [1,2] and frequency domain-based methods [3][4][5][6][7] have been actively conducted, and remarkable results have been achieved. Despite this progress, conventional image steganography is still limited in the amount of data it can hide. Recently, several studies have tried to hide full-size secret images inside a cover image using deep CNNs [8][9][10]. These methods are completely different from the conventional image steganography approaches. A deep learning-based steganography method usually consists of a hiding network and a revealing network. The hiding network takes a cover image and a secret image as inputs and creates a container image by hiding the secret image in the cover image. The revealing network extracts the hidden secret image from the container image.
Deep learning-based works have demonstrated that full-size images can be concealed in a single cover image with minimal quality loss. However, most studies are designed to hide one secret image in the cover image. Baluja [10] extended the image steganography model to hide multiple images simultaneously, as shown in Figure 1a. Unfortunately, it contains only a single revealing network, which may cause a critical security problem. Specifically, the revealing network does not support extracting a subset of the secret images; it simply extracts all hidden images together at the decoding stage. Therefore, it is not possible to assign specific access permissions for a subset of hidden images depending on the security level. A simple workaround is to prepare a separate revealing network for each secret image, as shown in Figure 1b; however, this increases memory usage and can expose the number of hidden images, which must be kept unrevealed. In [11], secret messages are hidden in a cover image using a private key, and those secret messages can only be accessed with the correct private key. In this study, inspired by [11], we introduce the concept of the private key to multi-image steganography. As shown in Figure 1c, our steganography model consists of a hiding network and a revealing network. In the hiding step, the hiding network conceals multiple secret images in a cover image and produces a private key for each secret image. Each private key is passed along with the container image to an account with access rights. In the revealing step, the target secret image can be obtained by feeding the container image and a private key to the revealing network. As a result, the proposed steganography model can achieve a high level of security and efficiency while successfully hiding multiple secret images in a single cover image. Experimental results demonstrate that our method is very effective and versatile compared to previous image steganography models.
To the best of our knowledge, this paper is the first attempt to utilize the concept of the private key in the field of image steganography that conveys multiple images, not messages.
Our main contributions can be summarized as follows: (i) We introduce the concept of the private key to the multi-image steganography problem. (ii) Unlike previous image steganography methods, our model can grant access to a secret image to authorized people only. (iii) In addition, our model makes it possible to extract only the queried secret image without revealing the other hidden images.

Related Works
In this section, we review conventional steganography methods based on both spatial domain and frequency domain, and recent deep learning-based methods.
As a pioneering work, Mielikainen [1] proposed an LSB (Least Significant Bit)-based method that adjusts the value of the least significant bit in the spatial domain of the cover image to hide secret messages. Wu and Tsai [2] designed a PVD (Pixel Value Differencing)-based method that inserts secret data according to the difference in pixel values. However, these methods are vulnerable to well-designed steganalysis techniques [12][13][14]. Therefore, there have been various efforts, including HUGO (Highly Undetectable steGO) [3], UNIWARD (UNIversal WAvelet Relative Distortion) [4], and WOW (Wavelet Obtained Weights) [5], to utilize the LSB of high-frequency components that are difficult to detect. Moreover, Chen and Lin [6] proposed a method that manipulates coefficients of the DWT (Discrete Wavelet Transform), and Kaur et al. [7] suggested a steganographic approach that embeds data in the mid-frequency band of the DCT blocks.
With the recent progress of deep learning, there have been significant research achievements in various computer vision fields. Although still fewer than in other fields, attempts to apply deep learning to steganography problems have recently increased. Baluja [8] introduced a deep steganography model that produces a container image by hiding a full-size image in a cover image. The hidden image can be extracted from the container image with a revealing network. Duan et al. [9] also proposed a new image steganography scheme based on a U-Net structure. Furthermore, Baluja [10] has shown that a deep CNN-based steganography model can hide multiple images in a single cover image, as shown in Figure 1a. However, this method extracts all the hidden images at once. In other words, it cannot be used in situations where we want to extract only one queried hidden image and keep the remaining hidden images concealed. To overcome this limitation, we propose the concept of private keys for the multi-image steganography task. With our method, it is possible to access only one hidden image without touching or revealing the other images.
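To make the spatial-domain idea concrete, the following is a minimal sketch of plain LSB replacement, the simplest member of this family. It is an illustration only, not the specific scheme of [1] or [2]; the function names and the toy pixel values are hypothetical.

```python
def embed_lsb(pixels, bits):
    """Write one message bit into the least significant bit of each pixel."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read the message back from the lowest bit of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


cover = [120, 37, 255, 64, 13, 200]  # hypothetical 8-bit grey values
message = [1, 0, 0, 1]
stego = embed_lsb(cover, message)
print(stego)                             # [121, 36, 254, 65, 13, 200]
print(extract_lsb(stego, len(message)))  # [1, 0, 0, 1]
```

Each pixel changes by at most one grey level, which is why the change is hard to see but, as noted above, still detectable by statistical steganalysis.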

Methods
In this section, we detail our steganography model consisting of a hiding network and a reveal network. We then describe the training processes of the proposed model. The overall pipeline of the proposed steganography model is illustrated in Figure 2.

Hiding Network
The hiding network takes a cover image $C \in \mathbb{R}^{256 \times 256 \times 3}$ and $N$ secret images $S = \{S_n\}_{n=1}^{N}$, $S_n \in \mathbb{R}^{256 \times 256 \times 3}$, as inputs, and produces a container image $C' \in \mathbb{R}^{256 \times 256 \times 3}$ and private keys $K = \{K_n\}_{n=1}^{N}$, one for each secret image $S_n$, as follows:
$$(C', K) = f(C, S; \phi_f), \tag{1}$$
where $\phi_f$ denotes the trainable parameters of the hiding network $f(\cdot)$. The container image $C'$ is visually indistinguishable from the input cover image $C$ and holds the hidden information of the secret images $S$. Although the method is not sensitive to the architectural design, we adopt the U-Net structure [15] for the hiding network. We note that a single U-Net based hiding network is shared across the input images (i.e., the cover image and the multiple secret images). The encoder consists of seven strided 4 × 4 convolution layers [16] with LeakyReLU [17] and batch normalization [18], while the decoder is composed of seven 4 × 4 deconvolution layers [19] with ReLU [20] and batch normalization. The last layers of the encoder and decoder use ReLU and Sigmoid, respectively, instead of LeakyReLU and ReLU. The visual features of each image are extracted by forwarding the cover image $C$ and each secret image in $S$ through the encoder separately. Then all extracted visual features from the encoder are concatenated along the channel dimension and passed into the decoder to produce the container image $C'$. Note that the visual features of the secret image $S_n$ from the encoder are used as its private key $K_n$. Architecture details of the hiding network are presented in Table 1.
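As a quick check on the encoder geometry, the sketch below computes the spatial size after each of the seven strided 4 × 4 convolutions. The stride of 2 and padding of 1 are assumptions (the text only says the layers are strided); they are chosen because they reproduce the 4 × 4 and 2 × 2 key sizes reported in the Model Analysis section.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output length of a convolution along one spatial axis (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_sizes(size=256, n_layers=7):
    """Spatial size after each strided encoder layer."""
    sizes = []
    for _ in range(n_layers):
        size = conv_out(size)  # assumed stride 2, padding 1
        sizes.append(size)
    return sizes


sizes = encoder_sizes()
print(sizes)  # [128, 64, 32, 16, 8, 4, 2]
```

Under these assumptions the penultimate encoder layer outputs a 4 × 4 map and the last layer a 2 × 2 map.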

Revealing Network
In the revealing stage, the revealing network extracts the secret image corresponding to a query private key from the container image $C'$ as follows:
$$S'_n = g(C', K_n; \phi_g), \tag{2}$$
where $\phi_g$ denotes the trainable parameters of the revealing network $g(\cdot)$. Specifically, the revealing network consists of six 3 × 3 convolutional blocks without downsampling, so the spatial dimension is maintained throughout the network. As shown in Figure 2, the query private key is resized with nearest-neighbor interpolation to the spatial size of the output of the third convolution layer, and then concatenated with the intermediate activations of the container image obtained from the revealing network. Finally, the hidden secret image $S'_n \in \mathbb{R}^{256 \times 256 \times 3}$ corresponding to the query private key $K_n$ is reconstructed. Architecture details of the revealing network are presented in Table 2.
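The key-injection step can be illustrated with a small pure-Python sketch: a key is resized by nearest-neighbor interpolation to the feature-map size and appended along the channel axis. The nested-list layout (channels, then rows, then columns), the floor-based index rule, and the toy sizes are assumptions made for illustration; the actual model operates on 256 × 256 feature maps.

```python
def resize_nearest(channel, out_h, out_w):
    """Nearest-neighbor resize of one H x W channel stored as nested lists."""
    in_h, in_w = len(channel), len(channel[0])
    return [
        [channel[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def concat_channels(features, key, out_h, out_w):
    """Resize every key channel to the feature size and append it channel-wise."""
    return features + [resize_nearest(ch, out_h, out_w) for ch in key]


# Toy sizes: 2 feature channels at 4 x 4 and a 1-channel 2 x 2 key.
features = [[[0] * 4 for _ in range(4)] for _ in range(2)]
key = [[[1, 2], [3, 4]]]
merged = concat_channels(features, key, 4, 4)
print(len(merged))  # 3
print(merged[2])    # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Because nearest-neighbor resizing only repeats values, the enlarged key carries no more information than the original small key.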

Training
Both the hiding and revealing networks are trained in an end-to-end manner. The loss function of our steganography model, given in (3), combines two terms that are balanced by a weighting coefficient β. We set β = 0.75 in all our experiments.
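The two terms of the loss are not spelled out in this text. A minimal sketch, assuming the form commonly used in deep steganography (a container-to-cover error plus β times a secret reconstruction error), might look as follows. The use of mean squared error, the averaging over secret images, and the choice of which term β scales are all assumptions rather than the authors' stated definition; only β = 0.75 comes from the text.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def stego_loss(cover, container, secrets, revealed, beta=0.75):
    """Assumed form: container fidelity + beta * mean secret fidelity."""
    cover_term = mse(cover, container)
    secret_term = sum(mse(s, r) for s, r in zip(secrets, revealed)) / len(secrets)
    return cover_term + beta * secret_term


# Toy example with two-pixel "images" in [0, 1].
loss = stego_loss(
    cover=[0.0, 1.0],
    container=[0.0, 0.5],
    secrets=[[1.0, 1.0]],
    revealed=[[0.0, 1.0]],
)
print(loss)  # 0.5
```

In this toy case the container term is 0.125 and the secret term is 0.5, so the total is 0.125 + 0.75 × 0.5 = 0.5.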

Experimental Results
We randomly collected 20,000 training images from the MS-COCO train set, 5000 validation images from the MS-COCO validation set, and 5000 test images from the MS-COCO test set [21]. All the images are resized to 256 × 256 using bicubic interpolation. Our method is implemented using PyTorch on Ubuntu 18.04 with a Titan RTX GPU. The proposed model is trained for 100 epochs using the Adam optimizer [22] with a learning rate of $1 \times 10^{-3}$. The learning rate is multiplied by 0.2 when the loss has stopped decreasing.
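The learning-rate rule can be sketched as a simple plateau schedule. The initial rate and the factor of 0.2 come from the text; the `patience` value and the exact definition of "stopped decreasing" are assumptions, and the source does not say which scheduler was used (PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau` with `factor=0.2` would be one possible choice).

```python
def plateau_schedule(losses, lr=1e-3, factor=0.2, patience=2):
    """Return the learning rate used at each epoch.

    The rate is multiplied by `factor` once the loss has failed to improve
    on its best value for more than `patience` consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in losses:
        history.append(lr)
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs > patience:  # assumed plateau criterion
                lr *= factor
                bad_epochs = 0
    return history


# Hypothetical loss curve that stalls at 0.8 for several epochs.
rates = plateau_schedule([1.0, 0.8, 0.8, 0.8, 0.8, 0.7])
print(rates)
```

With this loss curve the rate stays at 1e-3 for the first five epochs and drops to about 2e-4 for the sixth.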

Model Analysis
We first investigated which feature layer is useful for private key generation. For this purpose, we compare the performance with private keys taken from the output features of the penultimate ($K_p \in \mathbb{R}^{4 \times 4 \times 512}$) and the last ($K_l \in \mathbb{R}^{2 \times 2 \times 512}$) layers of the encoder. We trained two networks, one with each private key, to hide 1-5 images. Each network is evaluated by comparing the PSNR [23] and SSIM [24] of the container and retrieved images against the corresponding original images, using correct and random keys. As shown in Table 3, $K_p$ and $K_l$ show similar performance with $N = 1$ regardless of the key size. As the number of hidden images increases, however, $K_p$ still shows reliable performance while $K_l$ suffers from severe performance degradation. This behavior is expected because, for $K_l$, the same amount of information must be reconstructed from a smaller private key than for $K_p$. Also, as reported in the third column of Table 3, we obtain poor quantitative results when random keys are used. This shows that only the authorized private key can access the hidden image.
Table 3. Quantitative results for both $K_p$ and $K_l$ according to $N$, including results for secret images revealed with random keys.
Figure 3 shows encryption and decryption results of the proposed algorithm with $N = 3$. In most cases, we successfully generate container images with minimal distortion compared to the cover images, regardless of the private key size. For the extraction of secret images, each private key accurately reconstructs the corresponding original hidden image. In particular, $K_p$ extracts all of the secret images with higher quality than $K_l$ due to the higher capacity of its private keys (i.e., 4× the capacity). Note that images decrypted with $K_l$ are often slightly blurry and noisy. In addition, visual artifacts originating from the other secret images are often observed.
Furthermore, we verify the robustness of our algorithm by feeding random private keys to the revealing network together with the container image. Regardless of the capacity of the private keys, secret images are not correctly extracted with random private keys, as shown in Figure 3. In detail, the extracted images suffer severely from mixed textures and noise, which makes the hidden information indistinguishable. These results indicate that the proposed algorithm can grant access to a certain hidden image to an authorized person only.
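For reference, the PSNR used in this comparison can be computed as in the following sketch. It uses the standard definition, 10 log10(MAX² / MSE); the flat-list image representation and the peak value of 255 are assumptions about the data format, not details given in the text.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [10, 20, 30, 40]          # hypothetical pixel values
close_copy = [11, 21, 31, 41]        # every pixel off by one grey level
print(round(psnr(original, close_copy), 2))  # 48.13
```

A revealed image that matches the original closely yields a high PSNR, whereas an image extracted with a wrong key differs strongly and yields a low one.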

Robustness to Steganalysis
As in [8], we investigate the robustness of our method against StegExpose [25], which is a popular LSB-based steganalysis method. If an existing steganalysis technique cannot reliably distinguish the container image from the cover image, the proposed method can be regarded as hiding the secret images well. For the experiments, we use 200 pairs of cover and container images to obtain receiver operating characteristic (ROC) curves with varying threshold values. As shown in Figure 4a, the ROC curves for both $K_p$ and $K_l$ are close to the case of a random guess (i.e., a straight diagonal line). This means that the proposed method hides secret images well, because the performance of the steganalysis is similar to random guessing. Also, the area under the curve (AUC) values for both $K_p$ and $K_l$ remain very low as $N$ varies, as illustrated in Figure 4b. These experimental results show that the proposed method is quite robust to this common steganalysis method regardless of the size of the private key.
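The AUC of such a detector can be computed directly from its scores, as in the sketch below. It uses the rank interpretation of the AUC (the probability that a container image receives a higher score than a cover image, with ties counted as one half). The score lists are hypothetical placeholders; the code does not call StegExpose itself.

```python
def roc_auc(cover_scores, container_scores):
    """AUC as the probability that a container outscores a cover (ties count half)."""
    wins = 0.0
    for s in container_scores:
        for c in cover_scores:
            if s > c:
                wins += 1.0
            elif s == c:
                wins += 0.5
    return wins / (len(cover_scores) * len(container_scores))


# Hypothetical detector scores: higher means "more likely to contain hidden data".
print(roc_auc([0.1, 0.2], [0.8, 0.9]))  # 1.0, a perfect detector
print(roc_auc([0.1, 0.4], [0.2, 0.3]))  # 0.5, no better than chance
```

An AUC near 0.5 corresponds to the diagonal ROC curve of a random guess.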

Effects of Noise
To assess the susceptibility of our algorithm to private key contamination, we examine how well the revealing network extracts hidden images when noise is added to the private key. Specifically, we added Gaussian or salt-and-pepper (S&P) noise to the private keys at various noise levels and extracted secret images with the contaminated keys. Quantitatively, performance decreases as the noise level increases, as shown in Figure 5. In particular, in the case of Gaussian noise, $K_p$ is more susceptible to high noise levels than $K_l$. Meanwhile, there is little visual distortion at low noise levels, as shown in Figure 6c,f. In other words, the private key can withstand a low level of noise, but it is difficult to extract the original secret image when the noise is severe.
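The two kinds of key contamination can be sketched as follows. The flattened-list key, the noise parameterization (`sigma` for Gaussian noise, `amount` for the fraction of replaced values), and the `low`/`high` replacement values are assumptions; the text does not state how the noise levels were defined.

```python
import random


def add_gaussian(key, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each value."""
    return [v + rng.gauss(0.0, sigma) for v in key]


def add_salt_and_pepper(key, amount, rng, low=0.0, high=1.0):
    """Replace a fraction `amount` of values by low or high with equal chance."""
    noisy = []
    for v in key:
        if rng.random() < amount:
            noisy.append(low if rng.random() < 0.5 else high)
        else:
            noisy.append(v)
    return noisy


rng = random.Random(0)   # fixed seed for repeatability
key = [0.5] * 8          # hypothetical flattened private key
gaussian_key = add_gaussian(key, 0.1, rng)
sp_key = add_salt_and_pepper(key, 0.25, rng)
print(gaussian_key)
print(sp_key)
```

The contaminated keys would then be fed to the revealing network in place of the clean key, and the extracted images compared with the originals.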

Conclusions
In this paper, we extended the concept of the private key to multi-image steganography, which hides multiple secret images within a single cover image. Our steganography model takes a stack of secret images and a cover image as inputs and produces a container image and a private key for each secret image. To extract a secret image from the container image, the corresponding private key is required. The proposed model provides a hidden image only when the proper private key is supplied, and does not disclose information about the other hidden images. Through extensive experiments, we verified the effectiveness of our method under various conditions (i.e., random keys and noisy keys).

Conflicts of Interest:
The authors declare no conflict of interest.