Reversible Data Hiding in JPEG Images Using Quantized DC

Reversible data hiding in JPEG images has become an important topic due to the prevalence and overwhelming support of the JPEG image format. Much of the existing work embeds in the quantized AC (alternating current) coefficients to maximize the embedding capacity while minimizing the distortion and the file size increase. Traditionally, the quantized DC (direct current) coefficients are not used for embedding, under the assumption that embedding in DCs causes more distortion than embedding in ACs. However, for data analytics that extract fine details as features, distortion in the ACs is not acceptable, because they represent the fine details of the image. In this paper, we propose a novel reversible data hiding method which efficiently embeds in the DC coefficients. The proposed method uses a novel DC prediction scheme to decrease the entropy of the prediction error histogram. The embedded image has higher PSNR, higher embedding capacity, and a smaller file size increase. Furthermore, the proposed method preserves all the fine details of the image.


Introduction
The JPEG image file format has proved its dominance year after year. Even with newer image standards that support higher efficiency and compression, it remains the default standard for storing images on smartphones and computers. With unparalleled support across a wide variety of devices and software, reversible data hiding in JPEG images has become an important topic.
Reversible data hiding is an interoperable data hiding method which preserves the original file format and can recover the original image from the embedded (watermarked) image. It differs from robust watermarking schemes, such as the one proposed by Liu et al. [1], which focus on recovering the embedded message under image processing attacks.
The embedded image should be as close as possible to the original image, and the PSNR value is measured against different payload sizes to compare data hiding performance.
On the other hand, reversible data hiding in JPEG has not been extensively researched. There are three main approaches: the first approach modifies the quantization table to artificially increase the quantized DCT coefficients and embed [51,52]. Although this approach has high embedding capacity, the file size increase is much larger than the obtained embedding capacity. The second approach modifies the Huffman table [53][54][55]. Although this approach preserves the file size, the embedding capacity is quite limited. The third approach modifies the quantized DCT coefficients for embedding [56][57][58][59][60][61][62][63][64]. It is the most logical approach, given that it directly modifies the visual features without modifying other image parameters (such as the Huffman table and the quantization table).
The proposed method belongs to the third approach and uses a prediction scheme for the quantized DC coefficients. The proposed prediction provides improved accuracy, which decreases the entropy of the quantized DC prediction error histogram, yielding a large embedding capacity with lower distortion. Furthermore, the proposed method only modifies the quantized DC values.
The main goal of reversible data hiding in JPEG is to achieve high PSNR without causing much change in the file size, but there are specific cases where preserving the quantized AC coefficients is required and is more important than achieving high PSNR. Consider a case where embedded images are used in image data analysis to search for specific patterns or objects. Image data analysis relies on feature extraction of fine image details, which the AC coefficients represent (see Figure 1 for an example). Without AC coefficient preservation, data analytics may not perform as well.
The proposed method provides a novel reversible data hiding technique which embeds in the quantized DC coefficients. Unlike existing work, all the fine details are preserved, since none of the AC coefficients are modified. Quantized DC coefficients preserve the overall intensity, whereas quantized AC coefficients preserve the fine details of the image, making the latter less suitable for embedding when fine details matter.

Brief Introduction to JPEG Baseline Encoding and Decoding
This section briefly describes the JPEG baseline encoding and decoding steps to aid with understanding of the proposed method. JPEG is a block-based lossy compression technique which transforms 8 by 8 pixel blocks into 8 by 8 quantized DCT coefficient blocks. The transformation consists of normalization by subtracting 128 from each pixel, the discrete cosine transform (DCT), division by the quantization table, which is usually scaled using a scaling factor called the quantization factor (QF) to control the strength of the compression and the image quality, and rounding to make the values integral.
The DCT is defined as follows:

DCT_{u,v} = (1/4) C(u) C(v) Σ_{i=0..7} Σ_{j=0..7} p_{i,j} cos[(2i+1)uπ/16] cos[(2j+1)vπ/16], where C(0) = 1/√2 and C(k) = 1 for k > 0,

where p_{i,j} represents the pixel value at position (i, j) and DCT_{u,v} represents the DCT value at position (u, v).
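As an illustration, the normalize, transform, quantize, and round steps above can be sketched as follows (a minimal NumPy sketch; the function names are ours, not part of any JPEG library):

```python
import numpy as np

def dct2_8x8(block):
    """2-D DCT-II of an 8x8 block with the JPEG scaling above."""
    u = np.arange(8)
    # basis[u, i] = cos((2i + 1) u pi / 16)
    basis = np.cos((2 * u[None, :] + 1) * u[:, None] * np.pi / 16)
    alpha = np.ones(8)
    alpha[0] = 1 / np.sqrt(2)  # C(0) = 1/sqrt(2)
    return 0.25 * np.outer(alpha, alpha) * (basis @ block @ basis.T)

def quantize_block(pixels, qtable):
    """Normalize by 128, transform, divide by the quantization table, round."""
    return np.round(dct2_8x8(pixels.astype(float) - 128) / qtable).astype(int)

# A flat block of value 130: the DC is 8 * (130 - 128) = 16, all ACs are 0.
q = quantize_block(np.full((8, 8), 130), np.ones((8, 8)))
```

With a quantization table of all ones, the flat block illustrates that the unquantized DC equals 8 times the mean normalized pixel value.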
Once the quantized DCT coefficients are obtained, they are compressed losslessly. Quantized DCT coefficients consist of quantized DC and quantized AC coefficients. The quantized DC coefficient is the first DCT coefficient value and represents a form of mean pixel value. The quantized AC coefficients are the remaining DCT coefficients and represent the fine details of the pixel block. Differential pulse code modulation (DPCM), where the difference between two consecutive values is encoded, is used to compress the quantized DC coefficients of two adjacent blocks. The DPCM values are then losslessly compressed using a variant of Huffman coding. For the quantized AC coefficients, run-length coding is applied, followed by a variant of Huffman coding.
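For instance, the DPCM step on the DC sequence can be sketched as follows (illustrative helper names, not part of any JPEG library):

```python
def dpcm_encode(dc_values):
    """Replace each quantized DC by its difference from the previous block's DC."""
    diffs, prev = [], 0  # the predictor starts at 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def dpcm_decode(diffs):
    """Undo DPCM by cumulative summation."""
    dcs, prev = [], 0
    for d in diffs:
        prev += d
        dcs.append(prev)
    return dcs
```

The differences cluster near zero for smooth images, which is what makes them compress well under Huffman coding.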
To reconstruct the pixels from the quantized DCT coefficients, the quantized DCT coefficients are multiplied by the quantization table, and then the inverse DCT is applied. Then, 128 is added to each value to undo the normalization, and finally, a rounding function is used to make the result integral. More information about JPEG encoding and decoding can be found in the ISO document [65].
From here on, the quantized DCT coefficients (including quantized AC and DC coefficients) are denoted by bold font: DCT, AC, and DC.

Proposed Method
The proposed method uses a reversible data hiding technique called histogram shifting, whose performance is highly dependent on the prediction accuracy. The embedding capacity and distortion of histogram shifting depend on the prediction error: smaller prediction errors mean higher embedding capacity with lower distortion.
Previous work does not utilize advanced prediction in JPEG reversible data hiding, because of the assumption that prediction in the DCT domain is difficult and not worth the effort. Although this assumption is true in general, it is not true for all DCT coefficients. The first coefficient in the DCT transformed block, also referred to as the DC, represents (8/Q_1) × the mean normalized pixel value of the block, where Q_1 is the quantization table entry for quantizing the DC value. The proposed method builds upon this idea to propose a DC prediction method, which increases the embedding capacity and decreases the distortion. Figure 2 shows histograms of the image Lena (QF = 50). The histogram of DC does not have a peak and has the largest entropy; the histogram of the differential pulse code modulated (DPCM) DC is better for histogram shifting than the DC histogram, because it has a high peak around 0 and lower entropy; and finally, the proposed DC prediction error histogram has the highest peak and the smallest entropy, making it ideal for histogram shifting. Before explaining the proposed prediction, we define 'T', or 'target block', as the block whose DC we want to predict. Furthermore, the superscript AC is used to denote that a block is reconstructed using only the AC components.
Let T^AC denote the partially reconstructed pixel target block using only the ACs from the target block (the DC value is set to 0). Then, because the DCT is a linear transform, T can be approximately decomposed as follows:

T ≈ T^AC + (DC × Q_1)/8, (3)

where Q_1 is the quantization value which divided the DC coefficient to obtain DC, and (DC × Q_1)/8 represents the effect of the inverse DCT on DC.
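The decomposition can be checked numerically. The sketch below uses our own toy values and an orthonormal 8x8 DCT matrix; before the +128 shift and rounding, the pixel-wise difference between the full reconstruction and the AC-only reconstruction is exactly the constant DC × Q_1/8 (the "approximately" in the text comes from the rounding steps):

```python
import numpy as np

u = np.arange(8)
alpha = np.ones(8)
alpha[0] = 1 / np.sqrt(2)
# Orthonormal 8x8 DCT matrix: M[u, i] = 0.5 * alpha[u] * cos((2i+1) u pi / 16)
M = 0.5 * alpha[:, None] * np.cos((2 * u[None, :] + 1) * u[:, None] * np.pi / 16)

def idct2(coeffs):
    """Inverse 2-D DCT (transpose of the forward transform)."""
    return M.T @ coeffs @ M

rng = np.random.default_rng(0)
qdct = rng.integers(-5, 6, size=(8, 8)).astype(float)  # toy quantized block
qtable = np.full((8, 8), 4.0)
Q1 = 16.0
qtable[0, 0] = Q1

full = idct2(qdct * qtable)        # T (before adding 128 and rounding)
ac_only = qdct * qtable
ac_only[0, 0] = 0.0                # drop the DC contribution
t_ac = idct2(ac_only)              # T^AC
```

Here `full - t_ac` is a constant array equal to DC × Q_1/8 in every pixel.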
The proposed prediction uses the neighboring blocks and the partially reconstructed blocks using only the AC components. The neighboring DCT blocks, "North", "East", "South", and "West", are first transformed to pixel blocks. Figure 4 shows a graphical view of the division. The red area represents the parts of the neighbors which are closest to the target block, and the yellow area represents the parts of the target block which are closest to the neighboring blocks. Using this setup, we assume that a pixel from a neighboring block and a pixel from the target block which are exactly one position away from each other should have similar values:

Pixel_red ≈ Pixel_yellow, (4)

where Pixel_red, the pixel from the neighboring block, and Pixel_yellow, the pixel from the target block, are exactly one position away from each other. In Figure 5, W_8 is exactly one position away from T_1, but is two positions away from N_57 and T_9. Combining Equations (3) and (4), we get the following approximation:

Pixel_red ≈ Pixel^AC_yellow + (DC × Q_1)/8.

By rearranging the above equation, \hat{DC}, the predicted value of DC, is:

\hat{DC} = [8 × (Pixel_red − Pixel^AC_yellow)/Q_1],

where [·] is the rounding function. Finally, since there are multiple pairs of Pixel_red and Pixel^AC_yellow, the mean over all pairs is used as the estimator of \hat{DC}:

\hat{DC} = [(8/Q_1) × mean(Pixel_red − Pixel^AC_yellow)].
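A sketch of the predictor, assuming the bordering Pixel_red and Pixel^AC_yellow values have already been collected from the four neighbors (the helper name and its inputs are ours):

```python
import numpy as np

def predict_dc(pixels_red, pixels_yellow_ac, q1):
    """Predict the quantized DC of the target block.

    pixels_red:       neighbor-block pixels one position from the target block
    pixels_yellow_ac: the facing target-block pixels, reconstructed from ACs only
    q1:               quantization table entry for the DC coefficient
    """
    diff = np.asarray(pixels_red, float) - np.asarray(pixels_yellow_ac, float)
    return int(np.round(8 * diff.mean() / q1))

# If the true DC is 5 and Q1 = 16, each red pixel sits about 5 * 16 / 8 = 10
# above its AC-only counterpart, so the predictor recovers 5.
```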

Embedding
The histogram shifting technique is used to embed in the prediction error values e = DC − \hat{DC}. The embedded DC is denoted by DC′ and is obtained using the following rule:

DC′ = DC − 1 if e < −1; DC − b if e = −1; DC + b if e = 0; DC + 1 if e > 0,

where b ∈ {0, 1} is the payload bit. Histogram shifting shifts the DCs with prediction errors less than −1 by −1, so that the prediction error valued −1 can be embedded using the values −1 and −2: the error valued −1 is changed to −2 if the payload bit is "1", and left as −1 if the payload bit is "0". Similar logic applies to DCs with prediction errors greater than 0. Since the shifted and embedded ranges never overlap, the process can be reversed. The next subsection discusses the extraction of the payload and the recovery of the original DC.
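The shifting and embedding rule can be sketched over a sequence of DCs as follows (our own sequence-level helper, assuming the prediction errors −1 and 0 carry the payload as described above):

```python
def embed(dcs, preds, bits):
    """Histogram-shifting embed over a sequence of quantized DCs."""
    marked, i = [], 0
    for dc, pred in zip(dcs, preds):
        e = dc - pred
        if e < -1:
            dc -= 1                    # shift the left tail by -1
        elif e > 0:
            dc += 1                    # shift the right tail by +1
        elif i < len(bits):
            if e == -1:
                dc -= bits[i]          # -1 -> -2 for "1", stays -1 for "0"
            else:                      # e == 0
                dc += bits[i]          # 0 -> 1 for "1", stays 0 for "0"
            i += 1
        marked.append(dc)
    return marked
```

For example, with all predictions equal to 10, the DCs [9, 10, 7, 13, 9, 10] have errors [−1, 0, −3, 3, −1, 0]: the two tail errors are shifted outward and the four expandable errors carry the payload bits.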

Extraction and Recovery
The extraction of the payload and the recovery of the original DC are straightforward, as they simply reverse the embedding steps.
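As a sketch, assuming the embedding mapped the prediction errors −1 and 0 to {−2, −1} and {0, 1} and shifted larger errors outward by one (consistent with the embedding section; the helper is ours):

```python
def extract(marked, preds, n_bits):
    """Read the payload back and restore the original quantized DCs."""
    bits, orig = [], []
    for dc, pred in zip(marked, preds):
        e = dc - pred
        if e <= -3:
            orig.append(dc + 1)        # undo the left shift
        elif e >= 2:
            orig.append(dc - 1)        # undo the right shift
        else:                          # e in {-2, -1, 0, 1} carried a bit
            if len(bits) < n_bits:
                bits.append(1 if e in (-2, 1) else 0)
            orig.append(pred - 1 if e in (-2, -1) else pred)
    return bits, orig
```

Because the predictions use only the unmodified AC coefficients, the decoder recomputes exactly the same \hat{DC} values as the encoder.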

Block Selection
Block selection is important when the full embedding capacity is not utilized. To have the smallest impact on the PSNR, blocks are sorted by their smoothness and embedded sequentially, only until the entire payload is embedded.
To ensure that the smoothest blocks are embedded first, the block selection algorithm proposed by Huang et al. [62] is used. In their algorithm, the number of zero DCT coefficients is used as a smoothness measure to sort the blocks. The assumption is that DCT blocks with more zero coefficients have fewer details and are thus smoother.
In the proposed method, zero DC coefficients are not used to measure smoothness; only the zero AC coefficients are used. DC coefficients are excluded because they may change after embedding.
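A sketch of the block ordering under this measure (our own helper; blocks are 8 × 8 quantized coefficient arrays):

```python
def sort_blocks_by_smoothness(blocks):
    """Return block indices, smoothest first (most zero AC coefficients).

    The DC at position (0, 0) is excluded from the count, because it may
    change after embedding.
    """
    def zero_ac_count(block):
        return sum(1 for i in range(8) for j in range(8)
                   if (i, j) != (0, 0) and block[i][j] == 0)
    return sorted(range(len(blocks)), key=lambda k: -zero_ac_count(blocks[k]))
```

A block whose only nonzero coefficient is the DC is ranked ahead of a block full of nonzero ACs.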

Encoder and Decoder
This section summarizes the encoding and decoding methods of the proposed reversible data hiding scheme. Each subsection describes the implementation steps and includes minor implementation details to aid understanding. The decoding steps are as follows:

3. Use the block selection method to sort the black set of DC′.
4. Predict the black set of DC′, extract the payload length and half of the payload, and recover the original DCs for the black set.
5. Use the block selection method to sort the white set of DC′.
6. Predict the white set of DC′, extract the first half of the payload, and recover the original DCs for the white set. (The prediction uses the original DCs recovered in step 4.)

Experiment
The performance of the proposed method is verified by comparing PSNR (between the original JPEG and the embedded JPEG) and the file size gain (due to embedding). However, there is no existing work that focuses only on embedding in the DC. Thus, a variant of Huang et al.'s method [62], the state-of-the-art algorithm in terms of PSNR and file size gain, is compared with the proposed method.
Methods such as Wang et al. [66], Xuan et al. [58], and Sakai et al. [59] are not compared here. Quantization factor (QF) values of 50 and 80 are chosen for the comparison. The QF = 50 quantization table is the recommended base table in the standard document, which is scaled to obtain the other quantization tables, making it a good benchmark to test against. The QF = 80 quantization table is known to achieve a good ratio of compression to visual degradation.
The code for the proposed method will be available at https://github.com/suahnkim/jpegrwdc (accessed on 23 August 2019).

Discussion and Analysis
For testing, two image sets are chosen. For comparison using well-known images, the first set includes six images, "Lena", "Boat", "Barbara", "Baboon", "Peppers", and "F16", from the USC-SIPI image data set (http://sipi.usc.edu/database/) (accessed on 23 August 2019). The original bmp images are JPEG compressed at a size of 512 × 512. These images are chosen specifically for comparing specific unique features. For a more general comparison, the second set includes 1000 images from the Alaska image data set (https://alaska.utt.fr/) (accessed on 23 August 2019). The original raw .cr2 images are JPEG compressed at a size of 432 × 648.

Comparison Using USC-SIPI Image Data Set
In this section, the USC-SIPI database is used to compare specific features. The PSNR value relative to the size of the embedded payload, and the file size gain due to embedding the payload, are compared. Both methods preserve the AC coefficients as required, but the proposed method has higher PSNR and embedding capacity for almost all images except F16. The proposed method can consistently embed in more than 15% of the total blocks, and for a few of the images in more than 50%, whereas Huang et al.'s method can only embed in between 5% and 25%.
Upon close inspection of the prediction error histograms of F16, DPCM is a better predictor for the first few payload sizes. This is not surprising, because F16 is a very smooth image. However, it is a single outlier, and the proposed prediction is generally more accurate and consistent than DPCM. More importantly, the proposed method can consistently embed more than Huang et al.'s.

File Size Gain Comparison Using USC-SIPI Image Data Set
File size gain due to embedding is an important measure in JPEG reversible data hiding. JPEG is designed to offer good compression relative to image quality, so reversible data hiding should not increase the file size significantly. Ideally, the file size gain due to embedding should be smaller than the embedded payload size. Figure 7 summarizes the comparison of the file size gain, measured against the size of the embedded payload. To show clearly whether the embedding caused more file size gain than the payload size, a straight line through the middle is drawn. The proposed method consistently has a smaller file size gain than Huang et al.'s for both QFs. Furthermore, its file size gain is smaller than the payload size in all cases. This implies that when the proposed method is used for embedding, the file size will not increase by more than the size of the payload.

Comparison Using Alaska Image Data Set
In this section, the performance comparison using the Alaska image data set is summarized. Unlike the earlier comparison, which compared specific images, the comparison here uses 1000 images to obtain more robust statistical results. The PSNR gain, file size difference, and payload gain are compared.
Applying reversible data hiding to 1000 JPEG images produces many data points: each image has a different maximum payload size, and each payload size yields a different PSNR value and file size. A fair comparison is made by only comparing results for the same image and the same payload size.

File Size Difference Comparison Using Alaska Image Data Set
File size difference is the difference between the file sizes of the embedded images produced by the proposed method and by Huang et al.'s. It is used to show that the proposed method produces embedded images with smaller file sizes. Figure 8b summarizes the average file size difference when different payload sizes are embedded. A negative value means that the file size of the embedded image using the proposed method is smaller than Huang et al.'s, and vice versa for positive values. Clearly, the proposed method has a smaller file size gain due to embedding on average.

Payload Gain Comparison Using Alaska Image Data Set
Payload gain is the difference between the maximum payload sizes of Huang et al.'s method and the proposed method. It is compared to show the difference in the maximum embeddable payload size for each image. Figure 8c summarizes the result. A positive payload gain means that the proposed method can embed that much more, and vice versa for negative values. In all cases except one, the proposed method embeds more than Huang et al.'s for both QF = 50 and QF = 80. On average, the proposed method can embed 1570 more bits for QF = 50, and 1163 more bits for QF = 80.

Effectiveness of the Block Sorting Algorithm in General
In this section, we analyze and discuss the effect of the block sorting algorithm with respect to the prediction error histogram of the proposed method. In reversible data hiding, the block sorting algorithm is used to control the embedding rate so that the smallest number of DCs is used for embedding, minimizing the impact on the PSNR.
In the proposed method, only the prediction error values 0 and −1 are used for embedding. However, the method can be modified to embed in other prediction error values, such as 1 and −1, or even multiple pairs. Naturally, it is not possible to measure the performance of the block sorting algorithm with respect to all possible modifications.
Instead of measuring the performance for all possible modifications, we propose analyzing the prediction error histogram using its entropy to measure the general effectiveness of the block sorting algorithm. Ideally, the block sorting algorithm should sort the blocks such that the blocks most likely to be embeddable are prioritized for embedding. In other words, if the block sorting algorithm is well designed, the entropy should increase as more of the less likely embeddable blocks are used for embedding. Figure 9 ((a) QF = 50; (b) QF = 80: average prediction error entropy) shows the average prediction error entropy of the 1000 images from the Alaska image data set. Each point is measured as every 10 percent of the total blocks is used for embedding. From the figure, it is clear that the entropy increases smoothly as more blocks are used for embedding for both methods. Furthermore, the proposed method has a much lower average entropy than Huang et al.'s method.
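The entropy used here is the Shannon entropy of the prediction error histogram; a minimal sketch (our own helper):

```python
import math
from collections import Counter

def prediction_error_entropy(errors):
    """Shannon entropy (in bits) of a list of prediction errors."""
    n = len(errors)
    counts = Counter(errors)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

A sharply peaked histogram (most errors at 0) gives low entropy, so lower entropy corresponds to a larger share of embeddable blocks.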

Conclusions
Reversible data hiding in JPEG images is becoming an important and highly researched topic. With existing work mostly focusing on embedding using the quantized AC coefficients, there is no good embedding method that uses only the DC (quantized DC coefficients). This is a problem for data analytics applications where the AC (quantized AC coefficients) are used to extract fine-detail features. This paper proposes a novel reversible data hiding method in the DC. It divides the JPEG blocks into two non-overlapping sets, and an accurate predictor is proposed which uses the four neighboring blocks and the ACs. The experimental results show that the proposed method performs better than the existing work.

Abbreviations
The following abbreviations are used in this manuscript:

DCT              Discrete Cosine Transformation
DCT (bold font)  Quantized DCT coefficients
DC (bold font)   Quantized DC coefficient
AC (bold font)   Quantized AC coefficients
PSNR             Peak signal-to-noise ratio
DPCM             Differential pulse code modulation
JPEG             Joint Photographic Experts Group
QF               Quantization factor