In current research, JPEG compression is widely used as the simulated channel in laboratory experiments. However, the variety of complicated attacks on real social platforms causes some robust steganography algorithms that perform well in the laboratory to fail. Of the two categories of robust steganographic algorithms, we believe that methods relying on a robust domain of the image are more robust when applied to real social platforms. Finding stable relationships in the image in which to embed the message is the superior approach.
To the best of our knowledge, most current studies on robust steganography tend to sacrifice embedding capacity to improve the security of covert communication. Furthermore, image steganography studies typically use four metrics to measure the overall performance of different algorithms:
Most current robust steganography algorithms have not been tested on real networks. In fact, some steganographic algorithms are robust in the laboratory but fail to meet expectations on real social platforms. Performance on real platforms is also part of “Robustness”, so we conducted experiments on social platforms such as Facebook, Twitter, and WeChat.
In the following, we give a general overview of several typical robust steganography methods and describe in detail the Flipping of the Sign of DCT Coefficients algorithm proposed by Zhu et al.
2.2. Prior Work
The algorithm proposed in this paper improves on the algorithm presented by Zhu et al. [20], who proposed a method for constructing embedding domains based on Flipping the Sign of DCT Coefficients (FSDC). In this section, we introduce the FSDC algorithm, including its principle, the cost function design, the location feature generation, and the embedding method. These components are also used in the methods proposed in this paper.
The main idea of the FSDC algorithm is shown in Figure 1. The gray blocks represent the DCT coefficients of a certain block of pixels of the cover image, and the blue blocks represent the DCT coefficients that become zero after JPEG pre-compression. The red blocks are the DCT coefficients with strong robustness, which are modified to generate the stego elements. In addition, the non-zero AC coefficients and their positions are saved, according to the cover elements, as a preliminary location feature.
The first step in constructing the robust domain is to obtain the cover elements from the cover image, which has a quality factor of Q. Because JPEG compression can turn some DCT coefficients into zero, we need to keep only those non-zero AC coefficients of the cover image that do not become zero after the JPEG compression attack. Therefore, before extracting the cover elements, we pre-compress the cover image with a quality factor smaller than Q to obtain a compressed image. The compressed image serves only as a reference for the initial selection of the cover elements and takes no further part in generating the stego image. Through this pre-compression operation, we obtain the cover elements as shown in Equation (1):
where
represents the DCT coefficients of the
ith row and the
jth column in the image
,
and
represent the cover elements of the
ith row and the
jth column and position coordinates, respectively. We select the coefficients that are non-zero DCT coefficients in the image
and record them in the location feature
.
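The selection step described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `select_cover_elements`, the representation of the coefficients as nested lists, and the convention that the DC coefficient sits at the top-left corner of each 8 × 8 block are all assumptions introduced here. The sketch keeps an AC coefficient of the cover image only if it is non-zero and its counterpart in the pre-compressed image is also non-zero.

```python
def select_cover_elements(cover_dct, precompressed_dct, block=8):
    """Return (elements, positions) of AC coefficients that survive pre-compression.

    cover_dct and precompressed_dct are 2D lists of quantized DCT coefficients
    with identical shape (hypothetical representation, assumed for this sketch).
    """
    elements, positions = [], []
    for i, (row_c, row_p) in enumerate(zip(cover_dct, precompressed_dct)):
        for j, (c, p) in enumerate(zip(row_c, row_p)):
            # Assumption: the DC coefficient is the top-left entry of each block.
            is_dc = (i % block == 0) and (j % block == 0)
            if not is_dc and c != 0 and p != 0:
                elements.append(c)        # cover element
                positions.append((i, j))  # entry of the location feature
    return elements, positions


# Toy data: a 2 x 4 corner of one block, before and after pre-compression.
cover = [[52, -3, 0, 1], [4, 0, -1, 0]]
precompressed = [[26, -1, 0, 0], [2, 0, 0, 0]]
elements, positions = select_cover_elements(cover, precompressed)
```

In this toy input, the coefficients 1 and -1 are dropped because pre-compression sets them to zero, so only the two coefficients that remain non-zero are kept as cover elements.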
To ensure that the stego image differs as little as possible from the cover image, the authors use a distortion-minimizing steganographic algorithm to obtain the cost of each cover element. The embedding distortion of the universal distortion function S-UNIWARD is the sum of the relative changes in the wavelet coefficients between the cover image and the stego image, as shown in the equation:
where
and
are the cover image and the stego image in the spatial domain;
n is the total number of DCT coefficients of
.
,
k = {1, 2, 3},
i∈ {1, …,
n} is the
ith wavelet coefficient of the
kth subband of the first decomposition level of the cover image filtered by the wavelet filter.
is the corresponding wavelet coefficient of the stego image;
is a stabilizing constant and
.
FSDC embeds the message based on the stability of the signs of the DCT coefficients, so we use the J-UNIWARD algorithm, which decompresses the image into the spatial domain, as in Equation (3):
where
, and
denotes the decompression of the image into the spatial domain. To minimize embedding distortion, the embedding region should be selected according to the texture complexity of the image, so J-UNIWARD filters the image in multiple directions to identify textured areas, where modifications generally cost less.
The distortion can be approximated as the sum of the additive costs of each pixel, and we use
to represent the cost of modifying the coefficient
to
, so that Equation (3) can be expressed as:
where
represents the sum of the costs.
Based on the invariance of the signs of the DCT coefficients, we flip the signs of some coefficients to embed the message. Because modifying non-zero coefficients with smaller absolute values causes less distortion, we select DCT coefficients from the cover elements with absolute values less than or equal to
. At the same time, we also need to determine the range of the cost of this location with
, where a threshold value of
is set. We need to select coefficients whose cost is less than
, indicating that this location is in a complex textured region of the spatial image. The equation of the steganographic algorithm that determines the minimal embedding distortion cost is as follows:
where
and
represent threshold values. We use the absolute value of the DCT coefficients as the new cost of this position. When the absolute value of the coefficients is less than or equal to
while
is less than
, the element is suitable for embedding. For elements that lie in a textured area but whose absolute value exceeds the threshold, we use
to increase the cost of the element. Similarly, when the element lies in a smooth region, we retain its large distortion function cost.
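The three cases of this cost adjustment can be sketched as below. This is a minimal sketch under assumptions, not the authors' exact rule: the names `adjust_cost`, `coeff_threshold`, and `cost_threshold` are introduced here for illustration, and the `penalty` callable is a hypothetical placeholder for the term used to raise the cost of large coefficients in textured areas, which the sketch does not attempt to reproduce.

```python
def adjust_cost(coeff, cost, coeff_threshold, cost_threshold, penalty):
    """Return the adjusted embedding cost of one cover element.

    coeff:  DCT coefficient of the cover element
    cost:   its original distortion cost (e.g. from J-UNIWARD)
    penalty: hypothetical callable (coeff, cost) -> increased cost
    """
    if cost < cost_threshold:
        # Low cost: the element lies in a textured region.
        if abs(coeff) <= coeff_threshold:
            # Small coefficient: its absolute value becomes the new cost.
            return float(abs(coeff))
        # Large coefficient in a textured region: raise its cost.
        return penalty(coeff, cost)
    # Smooth region: keep the (large) original distortion cost.
    return cost


def example_penalty(coeff, cost):
    # Placeholder only; the actual increase used by FSDC is not assumed here.
    return cost * abs(coeff)


small_textured = adjust_cost(2, 0.5, 3, 1.0, example_penalty)
large_textured = adjust_cost(7, 0.5, 3, 1.0, example_penalty)
smooth = adjust_cost(1, 4.0, 3, 1.0, example_penalty)
```

Passing the penalty as a parameter keeps the sketch honest about what is and is not specified: only the case structure (suitable element, penalized element, retained cost) follows the description above.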
To minimize the average embedding distortion in the cover image, we use STCs (syndrome-trellis codes) encoding once the cover elements have been acquired. STCs encoding approaches the theoretical bound on additive average embedding distortion and is the preferred solution for many steganographic algorithms. Equation (6) for embedding a message into a cover image using STCs encoding is as follows:
where
represents the secret message,
is the corresponding coset of
and the parity matrix
is built from copies of a submatrix
stitched together in a diagonal cascade with the matrix dimension
. The submatrix
serves as a key parameter shared by the sender and the receiver. Using the distortion function and the weighted cost adjustment method, we obtain the cost of each DCT coefficient in the cover element. STCs encoding then minimizes the distortion between the cover image and the stego image within the limits set by both.
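To make the syndrome-coding idea concrete, the following sketch builds a small parity-check matrix by cascading copies of a submatrix along the diagonal and then finds the minimum-cost stego vector whose syndrome equals the message. Several assumptions apply: the cover elements are taken to be already mapped to bits, the function names are hypothetical, the shifting pattern (copy k starts at row k) follows the usual STC construction rather than anything stated here, and the exhaustive search stands in for the Viterbi algorithm that real STCs use, so it is only practical for toy sizes.

```python
from itertools import product


def build_parity_matrix(sub, num_rows):
    """Cascade copies of `sub` (h x w) diagonally: copy k starts at row k, column k*w."""
    h, w = len(sub), len(sub[0])
    H = [[0] * (num_rows * w) for _ in range(num_rows)]
    for k in range(num_rows):
        for r in range(h):
            if k + r >= num_rows:
                break  # copies near the bottom edge are truncated
            for c in range(w):
                H[k + r][k * w + c] = sub[r][c]
    return H


def syndrome(H, y):
    """Message extraction: H * y over GF(2)."""
    return [sum(h * b for h, b in zip(row, y)) % 2 for row in H]


def embed_bruteforce(H, x, costs, message):
    """Search the coset {y : H y = message} for the cheapest y (toy sizes only)."""
    best, best_cost = None, float("inf")
    for y in product((0, 1), repeat=len(x)):
        if syndrome(H, y) != message:
            continue
        total = sum(rho for xi, yi, rho in zip(x, y, costs) if xi != yi)
        if total < best_cost:
            best, best_cost = list(y), total
    return best, best_cost


sub = [[1, 0], [1, 1]]             # shared key submatrix (toy example)
H = build_parity_matrix(sub, num_rows=3)
x = [1, 0, 1, 1, 0, 0]             # cover bits
costs = [1.0, 2.0, 0.5, 3.0, 1.5, 1.0]
message = [0, 1, 1]
y, cost = embed_bruteforce(H, x, costs, message)
```

The receiver needs only the submatrix to rebuild the parity-check matrix and recover the message with `syndrome(H, y)`; the costs are used by the sender alone.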
After obtaining the cover element
, the embedding cost
, the secret message
, and the key matrix
, Equation (7) is used to obtain the stego element
:
Then, we flip the signs of the DCT coefficients in the cover image
according to the stego element
to generate the stego image
:
where
and
are the DCT coefficients in the
ith row and the
jth column of the cover image and the stego image, respectively. Using Equation (8), we flip the signs of the DCT coefficients of the cover image to produce the stego image. When
, the DCT coefficients in this position become positive; otherwise, they become negative.
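The sign-flipping step and the corresponding extraction can be sketched as follows. The mapping of bit 1 to a positive sign and bit 0 to a negative sign is an assumption made for this sketch, as are the function names and the nested-list representation. Halving each coefficient and truncating toward zero is only a crude stand-in for recompression, used to illustrate why the sign is a stable carrier as long as the coefficient does not become zero.

```python
def flip_signs(cover_dct, positions, stego_bits):
    """Write each stego bit into the sign of the coefficient at its position."""
    stego = [row[:] for row in cover_dct]  # leave the cover untouched
    for (i, j), bit in zip(positions, stego_bits):
        magnitude = abs(stego[i][j])
        # Assumed mapping: bit 1 -> positive, bit 0 -> negative.
        stego[i][j] = magnitude if bit == 1 else -magnitude
    return stego


def extract_bits(dct, positions):
    """Read the bits back from the coefficient signs."""
    return [1 if dct[i][j] > 0 else 0 for (i, j) in positions]


cover = [[52, -3, 0, 1], [4, 0, -1, 0]]
positions = [(0, 1), (1, 0)]   # location feature of the selected elements
stego_bits = [1, 0]            # stego elements produced by the encoder
stego = flip_signs(cover, positions, stego_bits)

# Crude stand-in for recompression: shrink magnitudes, truncating toward zero.
recompressed = [[int(v / 2) for v in row] for row in stego]
recovered = extract_bits(recompressed, positions)
```

Only the signs at the recorded positions change; the magnitudes of all coefficients are left as they were in the cover image.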