1. Introduction
Recently, the concept of reversible data hiding in encrypted images (RDHEI) has gained importance due to its usefulness in confidential areas such as cloud, military, and medical applications. For example, a content owner may want to store an original image in the cloud, which needs to be encrypted to prevent unauthorized access [
1]. RDHEI can be used to add additional data to the encrypted image for easier management without compromising the original content.
As shown in
Figure 1, the RDHEI framework includes three users: the content owner, data-hider, and receiver. The content owner encrypts the image content, the data-hider embeds additional data into the encrypted image, and the receiver can extract the hidden data and/or recover the original image. A good RDHEI method should achieve reversibility, security, and high embedding capacity (EC).
RDHEI techniques can be classified into two categories based on the encryption order: vacating room after encryption (VRAE) and reserving room before encryption (RRBE). Furthermore, the data extraction and image recovery processes can be carried out jointly or separately. RRBE methods rely on exploiting the data redundancy present in the original image before encryption to create space that can be used for data hiding. However, in the VRAE techniques, the data hider processes the encrypted image in order to insert its secret information.
The initial VRAE method was suggested by Puech et al. [
2]. It is a joint method where each
$4\times 4$ block of pixels in the original image is encrypted using the Advanced Encryption Standard (AES). After encryption, one bit of the confidential data is embedded into each encrypted block using bit substitution. The outcome is known as the marked encrypted image, and the confidential data can be extracted by acquiring the bits in the substituted positions. During decryption, the local standard deviation analysis is executed to regain the original image. The payload of this approach is
$0.0625$ bpp, which is very small.
Zhang [
3] suggested encrypting the image by applying the bit-XOR operator between the original bits and the pseudo-random bits produced by a standard stream cipher. Following that, a data-hider can modify the encrypted image slightly to embed extra data without knowing the original image content. In the data embedding process, the encrypted image is first segmented into non-overlapping blocks. Next, each block’s pixels are pseudo-randomly divided into two sets based on a data-hiding key. Afterward, the three least significant bits (LSBs) of each encrypted pixel in the first or the second set are flipped according to the value of the embedded bit. The average embedding rate (ER) is
$0.033$ bpp.
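The bit-XOR encryption step described above can be sketched as follows. This is only an illustration: the seeded `random.Random` generator stands in for the stream cipher and is not cryptographically secure, and the names `keystream` and `xor_encrypt` are hypothetical.

```python
import random


def keystream(key, length):
    """Pseudo-random bytes from a seeded PRNG (illustrative stand-in for a stream cipher)."""
    rng = random.Random(key)
    return [rng.randrange(256) for _ in range(length)]


def xor_encrypt(pixels, key):
    """XOR each 8-bit pixel with the matching keystream byte."""
    return [p ^ k for p, k in zip(pixels, keystream(key, len(pixels)))]


pixels = [52, 55, 61, 66, 70, 61, 64, 73]
encrypted = xor_encrypt(pixels, key=2024)
# XOR is its own inverse, so applying the same keystream again decrypts.
decrypted = xor_encrypt(encrypted, key=2024)
```

Because encryption and decryption are the same operation, any bits the data-hider flips in the encrypted domain appear as flipped bits in the same positions after decryption.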
Zhang [
4] introduced the first separable approach, in which the encrypted image’s LSBs are compressed using a data-hiding key to create a sparse space to accommodate the additional data. The embedded data can be retrieved at the receiver side using the data-hiding key. As the data embedding only affects the LSBs, decrypting the encrypted image with the encryption key yields an image similar to the original. Using both the encryption and data-hiding keys enables the receiver to extract the embedded data and perfectly recover the original image by exploiting the spatial correlation present in natural images.
While Puech et al. [
2] was the pioneer in proposing a VRAE-based RDHEI method, Ma et al. [
5] introduced an innovative and practical approach based on the RRBE framework. They proposed the following three steps: image partition, self-reversible embedding, and image encryption. In order to accommodate messages, they proposed to substitute some LSBs in the encrypted image. Compared to the techniques described in [
3,
4,
6], this method is likely to accommodate payloads that are more than 10 times larger while maintaining an acceptable PSNR close to 40 dB.
Yi et al. [
7] suggested embedding message bits using a binary-block embedding method. This method embeds the binary bits of several LSB planes of the original image into its MSB planes, then encrypts the image and hides the secret data inside its LSB planes. After embedding but before encryption, a bit-level scrambling process is employed to ensure resistance against noise and data loss attacks. The maximum ER over 10,000 images in the BOWSBase dataset is
$2.2034$ bpp, which outperforms many of the existing RDHEI methods.
A new RDHEI method was suggested by Yin et al. [
8] with higher capacity and error-free data extraction and image decryption, based on multi-MSB prediction and Huffman coding. Specifically, a label value is obtained for each pixel from its predicted value, and the total data embedding capacity of the image is determined. Huffman coding is used to compress the label map, which frees more space for embedding information. In this method, the average ERs obtained on two well-known image datasets, BOSSbase and BOWS-2, reached
$3.361$ bpp and
$3.246$ bpp, respectively, leading to an increment in terms of the net payload of approximately 1 bpp compared to the previous works.
In [
9], Wang et al. proposed an RDHEI method based on a multi-MSB embedding strategy. Through experimentation, they demonstrated that their approach can achieve an average ER of
$1.721$ bpp for the BOWS-2 database.
Yin et al. [
10] introduced an RDHEI algorithm that incorporates pixel prediction and multi-MSB plane rearrangement. The prediction errors are calculated through the median edge detector (MED) predictor. They are then separated into two components: the sign of the errors represented by one bit plane, and the absolute values of the errors represented by other bit planes. These bit planes are further divided into small blocks of
$4\times 4$ pixels and categorized as uniform or non-uniform blocks. This categorization enables the rearrangement of the blocks to accommodate auxiliary information and additional data. It is important to note that using small blocks of pixels increases the amount of auxiliary information, which, in turn, decreases the ER.
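As a rough illustration of the prediction step, the sketch below implements the standard MED predictor and splits each error into a sign bit and a magnitude. The function names, the choice to skip the first row and column, and the convention that a sign bit of 1 marks a negative error are assumptions made for this example and are not taken from [10].

```python
def med_predict(left, above, upper_left):
    """Median edge detector: predict a pixel from its three causal neighbours."""
    if upper_left >= max(left, above):
        return min(left, above)
    if upper_left <= min(left, above):
        return max(left, above)
    return left + above - upper_left


def prediction_errors(image):
    """Errors for pixels that have all three neighbours (first row/column skipped)."""
    errors = []
    for r in range(1, len(image)):
        row = []
        for c in range(1, len(image[0])):
            pred = med_predict(image[r][c - 1], image[r - 1][c], image[r - 1][c - 1])
            row.append(image[r][c] - pred)
        errors.append(row)
    return errors


def split_sign_magnitude(error):
    """Return (sign_bit, magnitude); sign_bit = 1 for negative errors (assumed convention)."""
    return (1 if error < 0 else 0, abs(error))


image = [
    [100, 102, 104],
    [101, 103, 105],
    [102, 104, 107],
]
errors = prediction_errors(image)
```

On smooth regions the errors stay small, so the high-order bit planes of the magnitudes are mostly zero, which is what makes them easy to compact.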
An RDHEI method with extended parametric binary tree labeling was introduced in [
11]. The experimental results demonstrated that the ERs for commonly used datasets, including BOSSbase, BOWS-2, and UCID, are
$3.2305$ bpp,
$3.1619$ bpp, and
$2.8113$ bpp, respectively.
Additional methods [
12,
13,
14,
15,
16,
17,
18,
19,
20] for RDHEI and many others can be found in [
21], which is a survey that presents the birth and evolution of RDHEI methods over 12 years.
The motivation for our research in RDHEI stems from the critical need for solutions that balance data security with the ability to hide and retrieve data within encrypted content. This paper presents a secure and efficient RDHEI technique that holds substantial potential across various domains, including secure communication, healthcare, and cloud computing.
From the exploration of the existing RDHEI methods, it is evident that various approaches have been developed to balance data hiding capacity against error-free reversibility. Among the methods under consideration, the RRBE approach consistently stands out, with superior ER and error-free reversibility compared to VRAE methods. VRAE methods face significant challenges in achieving a satisfactory EC because encrypted images lack redundant space.
The main contributions of this article, provided with the proposed solution, are:
RRBE approach: One of the primary contributions of our method is the utilization of the RRBE approach. This strategy leverages the data redundancy present in the original image before encryption to create space for data hiding. This approach ensures security as it maximizes the embedding capacity without altering the encryption process.
Integration of pixel prediction: Our method introduces pixel prediction in the RDHEI process. By predicting the error image, the efficiency of data embedding is enhanced.
Quadtree decomposition: Quadtree decomposition is applied to each bit plane of the mapped prediction error image. This step identifies homogeneous blocks within the image, which can be rearranged to create room for data embedding. Quadtree decomposition allows for a more fine-grained allocation of embedding space, potentially increasing the embedding capacity.
Bit plane reordering: The method incorporates bit plane reordering as part of the data hiding process. This technique likely enhances the efficiency of embedding data by optimizing the allocation of space within each bit plane.
Error-free reversibility: One of the most important contributions of our method is its ability to achieve high ERs while maintaining error-free reversibility. This means that the hidden data can be extracted without any loss or distortion, and the original image can be fully restored after decryption. Error-free reversibility is a crucial requirement in RDHEI to ensure that the quality and integrity of the image are preserved.
Privacy preservation: Encryption is applied to all images being transmitted or stored, ensuring the privacy and confidentiality of sensitive information.
The rest of this paper is organized as follows. While
Section 2 details the proposed separable high-capacity RDHEI scheme,
Section 3 presents the obtained results for different image datasets, as well as a comparison with the state-of-the-art RDHEI methods. In addition, the reversibility, security, and embedding capacity of the proposed scheme are analyzed and discussed. Finally, general conclusions of this paper are drawn in
Section 4.
3. Experiment Results and Analysis
This section provides an in-depth evaluation of the proposed method, including experimental results and analysis. The experimental results are obtained using five well known and frequently used test images, namely Lena, Man, Jetplane, Baboon, and Tiffany, as shown in
Figure 14. In addition, tests are also conducted on the following two image datasets:
BOSSbase [
39]: a public image dataset from the Break Our Steganographic System! competition [
39]. It consists of 9074 cover images with a size of
$512\times 512$ pixels.
Additionally, the section encompasses an evaluation of security, performance, and a comparative analysis with other contemporary techniques in terms of embedding capacity.
3.1. Security Analysis
Figure 15 and
Figure 16 illustrate the proposed RDHEI method for two test images, Lena and Jetplane, respectively.
Figure 15a showcases the original Lena image. After reordering the bit planes in the mapped prediction error image, the allocation of space for data embedding, and the insertion of auxiliary information in the corresponding bit planes, the resulting image displayed in
Figure 15b is obtained. The encrypted image, obtained by the content-owner using their encryption key with a net payload of
$3.6$ bpp, is depicted in
Figure 15c. Subsequently, the data-hider incorporates additional data into the encrypted image, resulting in the creation of a marked encrypted image shown in
Figure 15d. The disparity between the encrypted image and the marked-encrypted image is presented in
Figure 15e. Finally, the recovered Lena image is showcased in
Figure 15f, with a Mean Square Error (MSE) value of 0, affirming its matching with the original one. The same outcomes are observed for the Jetplane image illustrated in
Figure 16.
Additionally, this section aims to demonstrate the robustness of the proposed encryption scheme against different attacks. To achieve this goal, several metrics are employed, namely the key space, the histogram, the ${\chi}^{2}$ statistic, the correlation coefficient, and the information entropy. Together, these metrics provide evidence of how well the encryption scheme resists brute-force and statistical attacks.
- (a)
The key space analysis
The key space denotes the total count of distinct keys available for utilization in the encryption process. Within our proposed algorithm, the secret keys encompass the control parameters and the initial values of the two used PWLCM maps
${k}_{e}=({x}_{0},{\eta}_{1},{y}_{0},{\eta}_{2})$. Assuming double-precision arithmetic, with a precision of approximately
${10}^{-16}$, the key spaces
${H}_{{x}_{0}}$ and
${H}_{{y}_{0}}$ are each approximately
${10}^{16}$, while
${H}_{{\eta}_{1}}$ and
${H}_{{\eta}_{2}}$ are around
$0.5\times {10}^{16}$. Consequently, the total key space
H is calculated as
$H={H}_{{x}_{0}}\times {H}_{{y}_{0}}\times {H}_{{\eta}_{1}}\times {H}_{{\eta}_{2}}$, yielding a value of approximately
$0.25\times {10}^{64}$. The key space equates to approximately
$1.519\times {2}^{210}$. For an encryption system to be considered effective and resilient against brute-force attacks, the size of the key space should not fall below
${2}^{100}$, making such attacks infeasible [
40]. Consequently, the key space of our encryption scheme is large enough to ensure robust protection against all kinds of brute-force attacks.
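The key space arithmetic above can be checked with a few lines of Python. The variable names are illustrative; the per-parameter sizes are the ones quoted in the text.

```python
import math

# Number of distinguishable values per key parameter, as stated in the text.
h_x0 = h_y0 = 1e16
h_eta1 = h_eta2 = 0.5e16

key_space = h_x0 * h_y0 * h_eta1 * h_eta2  # 0.25e64

# Express the key space as mantissa * 2**exponent.
bits = math.log2(key_space)
exponent = math.floor(bits)
mantissa = 2 ** (bits - exponent)

print(f"key space = {key_space:.3e} = {mantissa:.3f} x 2^{exponent}")
print("exceeds 2^100:", bits > 100)
```

The result is roughly 110 bits above the $2^{100}$ threshold, which is why exhaustive key search is considered infeasible here.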
- (b)
The histogram analysis
We have analyzed the histograms of several test images and their corresponding encrypted and marked-encrypted images.
Figure 17 presents the histograms of the encrypted and marked-encrypted images for Lena and Jetplane.
The histograms shown in
Figure 17 indicate that the encrypted and the marked-encrypted images follow a uniform distribution. Therefore, both the encrypted and marked-encrypted images exhibit distinct statistical characteristics from the original image. Similar results for the histogram analysis were found for all the test images.
- (c)
The ${\chi}^{2}$ test
The
${\chi}^{2}$ values are computed for the original, the encrypted, and the marked-encrypted images. The
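${\chi}^{2}$ value is defined as follows:
$${\chi}^{2}=\sum_{i=0}^{255}\frac{{\left({f}_{i}-\frac{n\times m}{256}\right)}^{2}}{\frac{n\times m}{256}}$$
where
i indexes the gray levels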
$(0,\cdots ,255)$,
${f}_{i}$ is the observed frequency count of each gray level, and
$n\times m$ is the image size. It should be noted that the
${\chi}^{2}$ test is a measure of how much the observed frequencies deviate from the expected frequencies, which, in our case, follow the uniform distribution. Assuming a significance level of
$0.05$, the critical value for this significance level and 255 degrees of freedom is
${\chi}^{2}(255,0.05)=293$. The
${\chi}^{2}$ values of the different images for Lena and Jetplane are presented in
Table 1. These values, the
${\chi}^{2}$ statistics, are compared to the critical value to determine whether to reject the null hypothesis that the image is uniformly distributed or not. If the calculated
${\chi}^{2}$ statistic for an image is greater than the critical value, the null hypothesis is rejected, indicating that the image is not uniformly distributed. If the calculated
${\chi}^{2}$ statistic is less than or equal to the critical value, the null hypothesis is not rejected, and the image is considered uniformly distributed. From
Table 1, we observe that for the original Lena and Jetplane images the
${\chi}^{2}$ values are very high: 242,173 and 715,669, respectively. However, the
${\chi}^{2}$ values obtained by the encrypted and the marked-encrypted versions of Lena and Jetplane images are less than the critical value 293, which implies that the distributions of these images are uniform, thus indicating that the encryption scheme is of good quality.
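A minimal sketch of this test is given below, assuming the standard Pearson statistic against a uniform expectation of $n\times m/256$ counts per gray level. The function name and the toy pixel lists are illustrative; only the critical value 293 comes from the text.

```python
def chi_square_uniform(pixels, levels=256):
    """Pearson chi-square statistic of a gray-level histogram against a uniform one."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    expected = len(pixels) / levels
    return sum((f - expected) ** 2 / expected for f in counts)


CRITICAL_VALUE = 293  # chi-square critical value quoted in the text (255 d.o.f., 0.05)

# Toy data: a perfectly flat histogram and a single-gray-level "image".
flat = [v for v in range(256) for _ in range(4)]
constant = [128] * 1024

chi_flat = chi_square_uniform(flat)
chi_constant = chi_square_uniform(constant)

print(chi_flat, chi_flat <= CRITICAL_VALUE)          # uniformity not rejected
print(chi_constant, chi_constant <= CRITICAL_VALUE)  # uniformity rejected
```

A well-encrypted image is expected to behave like the flat case, with a statistic below the critical value, while natural images behave like the second case.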
- (d)
The correlation test
Real images typically exhibit a high correlation between adjacent pixels in the horizontal, vertical, or diagonal directions. However, an effective encryption scheme should produce a ciphertext with a low correlation between adjacent pixel values [
41].
Table 1 presents the correlation coefficients of the original images, Lena and Jetplane, as well as their respective encrypted and marked-encrypted versions obtained using our proposed RDHEI method. The correlation coefficients of the original images are close to 1, indicating a high correlation among adjacent pixels. Specifically, the correlation coefficients for Lena and Jetplane are
$0.9710$ and
$0.9357$, respectively. However, the encrypted and marked-encrypted images exhibit significantly low correlation coefficients near 0, indicating effective suppression of the correlation between adjacent pixel values. Similar results were obtained for all the test images.
Furthermore, the correlation distributions [
42] in the horizontal direction of the original images, Lena and Jetplane, along with their respective encrypted and marked-encrypted images, are illustrated in
Figure 18. In this figure, the
x-axis corresponds to the pixel intensity at coordinates
$(x,y)$ within the image, while the
y-axis represents the pixel intensity of the horizontally adjacent pixel at coordinates
$(x+1,y)$. Each point on the plot reflects the relationship between the intensity of a pixel and the intensity of its immediate neighbor in the horizontal direction. A higher density of points around a particular region on the plot means a stronger correlation between the pixel at
$(x,y)$ and its adjacent counterpart at
$(x+1,y)$. As depicted in
Figure 18, the original images exhibit a high correlation between horizontal pixels, whereas the encrypted and marked-encrypted images show no correlation between adjacent pixels. It is worth noting that the correlation distribution is used to assess the degree of similarity between pixel values at nearby locations in the image.
Both the correlation coefficients and the figures show that the proposed RDHEI method effectively decorrelates the neighboring pixels of the plain images.
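The horizontal correlation coefficient can be sketched as the Pearson correlation between each pixel and its right-hand neighbour. This is an assumption about the exact estimator; the function name and the two toy images are illustrative.

```python
import math


def horizontal_correlation(image):
    """Pearson correlation between each pixel and its right-hand neighbour."""
    xs = [row[c] for row in image for c in range(len(row) - 1)]
    ys = [row[c + 1] for row in image for c in range(len(row) - 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


# A smooth gradient: every neighbour is predictable from the pixel itself.
smooth = [[10, 20, 30, 40], [12, 22, 32, 42]]
# A checkerboard: neighbours are perfectly anti-correlated.
checker = [[0, 255, 0, 255], [255, 0, 255, 0]]

r_smooth = horizontal_correlation(smooth)
r_checker = horizontal_correlation(checker)
```

Values near 1 correspond to the behaviour of natural images, whereas an encrypted image should give a value near 0. The same function applies to the vertical or diagonal directions once the neighbour offset is changed.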
- (e)
The information entropy test
The information entropy was first defined by Shannon [
43]. When an image is encrypted, its entropy should ideally be equal to 8 bpp. If the entropy of the encrypted image is noticeably less than 8 bpp, there exists a certain degree of predictability, which threatens the security of the encryption algorithm. The entropies of the encrypted and marked-encrypted images for Lena and Jetplane are shown in
Table 1. While the encrypted and marked-encrypted Lena images have entropies of
$7.9994$ bpp and
$7.9995$ bpp, respectively, the encrypted and marked-encrypted Jetplane images have entropies of
$7.9993$ bpp and
$7.9994$ bpp, respectively. These values are very close to the theoretical maximum of 8 bpp, indicating that the proposed encryption scheme is secure against entropy attacks.
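For reference, the Shannon entropy of an 8-bit image can be computed from its gray-level histogram as in the sketch below. The function name and toy inputs are illustrative.

```python
import math


def shannon_entropy(pixels, levels=256):
    """Shannon entropy in bits per pixel of a gray-level histogram."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    # Empty bins contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


uniform = [v for v in range(256) for _ in range(4)]  # every gray level equally likely
constant = [128] * 1024                              # fully predictable

h_uniform = shannon_entropy(uniform)
h_constant = shannon_entropy(constant)
```

The uniform case reaches the 8 bpp maximum for 256 gray levels, and the constant case gives 0, so an encrypted image close to 8 bpp is close to the least predictable histogram possible.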
3.2. Performance Analysis
Table 2 shows the ER in bpp using different predictors on the five test images. From
Table 2, we can see that for the five test images, the best ERs are obtained when using the GAP predictor. The average ER for these five test images is more than 3 bits per pixel.
The total embedding capacity, the auxiliary information, and the net payload for the five test images are calculated, and the results using the GAP predictor are shown in
Table 3.
Figure 19 shows the ERs obtained for the five test images when using different methods. It can be seen that our method outperforms the other five methods on the first four images, achieving the highest ER.
Puteaux et al. [
12] proposed to embed additional data into one-MSB only, resulting in a very low ER. For instance, the ERs of their method for the five test images were between
$0.838$ bpp and
$0.993$ bpp. Wu et al. [
19] proposed an improvement on Yi et al.’s [
18] work by dividing the image pixels into two categories according to a parametric binary tree labeling scheme. By dividing the pixels into embeddable and non-embeddable sets, the overall embedding capacity may be affected. The ERs of the five test images are significantly higher than those obtained by the previous methods and range from
$0.969$ bpp to
$2.6726$ bpp. In their work, Yin et al. [
8] compressed the auxiliary information by using Huffman coding and embedded multiple bits adaptively by multi-MSB substitution. The scheme requires a large amount of auxiliary information, which can reduce the overall embedding capacity. As an example, the auxiliary information required for the Lena image in Yin et al.’s [
8] approach is 793,304 bits, which is over six times greater than that needed in our method, as presented in
Table 3. In [
10], an RDHEI method is proposed based on a pixel prediction scheme and multi-MSB plane reordering. The bit planes are divided into non-overlapping blocks, and the best ERs were found with blocks of size
$4\times 4$. The blocks are classified as uniform or non-uniform, with a binary label map generated for each bit plane to indicate their positions. The data-embedding method used depends on whether the block is uniform or non-uniform. In uniform blocks, one bit is left unchanged to represent the block’s value and aid in image recovery. However, this approach fails to exploit the large uniform blocks in the MSB bit planes, resulting in an augmented amount of the auxiliary information.
The ERs obtained by our method for the five images are better than those achieved by previous methods. For the Lena image the ER is
$3.606$ bpp, which is approximately
$26\%$ better than the best result obtained with the method of [
10]. By mapping the prediction errors, using quadtree decomposition to find as many large homogeneous blocks as possible, and reordering the embeddable bit planes, we achieved significantly higher ERs than those obtained with state-of-the-art methods.
Finally,
Table 4 shows the ERs of our proposed method and compares them with state-of-the-art methods on the two image datasets. The average ERs for the image datasets BOSSbase and BOWS-2 are
$3.813$ bpp and
$3.696$ bpp, respectively. In addition, it can be seen from
Table 4 that our proposed method has a higher ER than the other methods. In
Table 4, “Gain” represents the percentage improvement in ER achieved by our approach when compared to other methods. In comparison to the most effective competing method, that of Yin et al. [
10], our results exhibit an improvement of up to
$9\%$ and
$8.9\%$ for BOSSbase and BOWS-2 datasets, respectively. Comparatively larger improvements are observed when our results are compared to the other methods.
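The sketch below shows one plausible way to compute such a gain, assuming it is the relative ER improvement over the competing method, which is not spelled out in the text. The ER values are the averages quoted in this paper for the proposed method and for Yin et al. [8], and the function name is illustrative.

```python
def gain_percent(er_ours, er_other):
    """Relative ER improvement over another method, in percent (assumed definition)."""
    return (er_ours - er_other) / er_other * 100


# Average ERs in bpp quoted in the text.
ours = {"BOSSbase": 3.813, "BOWS-2": 3.696}
yin_2020 = {"BOSSbase": 3.361, "BOWS-2": 3.246}  # Yin et al. [8]

gains = {name: gain_percent(ours[name], yin_2020[name]) for name in ours}
for name, value in gains.items():
    print(f"{name}: {value:.1f}% higher ER than [8]")
```

Under this definition, a method with a lower baseline ER yields a larger gain, which matches the observation that improvements are larger against the weaker methods.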
3.3. Time Complexity Analysis
To assess the time complexity of our RDHEI method, we conducted a detailed analysis of its key operations, which are presented in Algorithms 3 and 4. We considered factors such as the size of the input image $n\times m$ and the number of bit planes processed. Algorithm 3 involves initiating the $mappedPredictionError()$ function, followed by sequential calls to $quadtreeDecomposition()$, $calculateAuxInfo()$, and $bitPlaneReordering()$ for the embedded bit planes (up to 8 in the worst case), and finally calling the $encrypt()$ function. The complexity for most functions is $\mathcal{O}(nm)$, except for $quadtreeDecomposition()$, which has a time complexity of $\mathcal{O}(nm\log (nm))$. Consequently, the overall time complexity of the $processImage$ algorithm is deduced to be $\mathcal{O}(nm\log (nm))$, as the quadtree decomposition process imposes the most significant computational load compared to the other processes. In the data embedding Algorithm 4, three functions, $encrypt()$, $readPartialInf()$, and $embedBits()$, are employed, each having a time complexity upper-bounded by $\mathcal{O}(nm)$. Considering both algorithms, it can be inferred that the overall time complexity of our RDHEI method is $\mathcal{O}(nm\log (nm))$.
Comparing the complexity of the studied algorithms, a significant portion of them exhibit a time complexity of approximately $\mathcal{O}(nm)$. Our proposed approach, with a time complexity of $\mathcal{O}(nm\log (nm))$, is slightly more expensive, since it adds only a logarithmic factor. This complexity offers a favorable trade-off between computational cost and the ER gains reported in Section 3.2 when compared to the $\mathcal{O}(nm)$ alternatives, and the computational cost remains acceptable.
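To make the origin of the logarithmic factor concrete, the following is a minimal sketch of a quadtree decomposition of one binary bit plane into homogeneous blocks. It is not the $quadtreeDecomposition()$ routine of Algorithm 3: the function name, the output format, and the restriction to square planes whose side is a power of two are assumptions made for this example.

```python
def quadtree_blocks(plane, top=0, left=0, size=None):
    """Split a square binary plane into homogeneous blocks.

    Returns a list of (top, left, size, bit) tuples. Assumes the side length
    is a power of two; a 1x1 block is always homogeneous, so recursion ends.
    """
    if size is None:
        size = len(plane)
    first = plane[top][left]
    homogeneous = all(
        plane[r][c] == first
        for r in range(top, top + size)
        for c in range(left, left + size)
    )
    if homogeneous:
        return [(top, left, size, first)]
    half = size // 2
    blocks = []
    for dr in (0, half):
        for dc in (0, half):
            blocks += quadtree_blocks(plane, top + dr, left + dc, half)
    return blocks


plane = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
blocks = quadtree_blocks(plane)
for block in blocks:
    print(block)
```

Each recursion level scans every pixel at most once, and the number of levels grows with the logarithm of the side length, which gives the $\mathcal{O}(nm\log (nm))$ bound. Large homogeneous blocks are described by a single tuple, which illustrates why smooth high-order bit planes need little auxiliary information.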