Separable Reversible Data Hiding in Encryption Image with Two-Tuples Coding

Abstract: Separable Reversible Data Hiding in Encryption Image (RDH-EI) has become widely used in clinical and military applications, social cloud and security surveillance in recent years, contributing significantly to preserving the privacy of digital images. Recent works have pursued a high embedding rate at the expense of image quality, security, and the reversible and separable properties. To address these shortcomings, we propose a two-tuples coding method that exploits the intrinsic characteristics of adjacent pixels in the carrier image, whose high-order bits are highly redundant. Subsequently, we construct an RDH-EI scheme from high-order bits compression, low-order bits combination, vacancy filling, data embedding and pixel diffusion. Unlike conventional RDH-EI practices, which suffer from deterioration of the original image while embedding additional data, the content owner in our scheme generates the embeddable space in advance, thus lessening the risk of image destruction on the data hider side. The experimental results indicate the effectiveness of our scheme: the carrier images are effectively compressed at a ratio of 28.91%, and the embedding rate increases to 1.753 bpp with a higher image quality, measured by a PSNR of 45.76 dB.


Introduction
In recent years, the problem of information security has become increasingly prominent, and privacy has attracted much attention. As a significant branch of data hiding, Reversible Data Hiding (RDH) technology enables secret transmission and content authentication by embedding additional data in the carrier image. RDH has been widely used in military and medical applications, remote sensing, and commercial image processing [1][2][3][4]. However, the dramatic growth of digital images and the continuous expansion of their use in emerging technology fields have heralded a revolution in traditional RDH schemes [5][6][7][8].
Unlike traditional RDH practices, the content owners of clinical, military, security, and social cloud service providers are urged to hide the sensitive details of original images, which led to the birth of Reversible Data Hiding in Encryption Image (RDH-EI) [9]. In RDH-EI schemes, additional confidential data can be embedded into the encrypted image without destroying the carrier image. For example, today's healthcare providers put patient privacy at risk when they outsource the processing and storage of patients' sensitive data to third parties, and they must do so without falling foul of ever-changing data protection and privacy legislation. To secure these sensitive and confidential data and to minimize exposure in the event of a data breach, it is necessary to embed the patients' sensitive data in the encrypted medical image so that the encrypted data can still be processed, classified and searched by third parties. The rule of thumb is that the embedded sensitive data must not affect the process of accurately recovering

• A new two-tuples coding compression method is proposed, which can be utilized in the field of data hiding, yet it also serves as an alternative method of lossless image compression. The proposed two-tuples coding effectively compresses the high-order bits of the carrier image and produces a large independent redundant space, resulting in a higher compression ratio.
• Our RDH-EI scheme eliminates (i) the conventional process of integrating additional image encryption algorithms by directly generating an encrypted image during image coding, and (ii) the pre-requisite for the content owner to prepare the embeddable space in advance. Consequently, the proposed RDH-EI scheme significantly lessens the complexity of recent RDH-EI schemes and minimizes the risk of destroying the image on the data hider side when embedding additional data.
• In our RDH-EI scheme, the additional data are embedded independently of the carrier image in the redundancy filling bits, and this ensures that image restoration and additional data extraction are separable and reversible.
• Our RDH-EI scheme has a high generality property, which can accommodate any form and feasible size of the carrier image.
• The proposed RDH-EI scheme is constructed based on Vacating Room by Encryption (VRBE). Therefore, it enjoys a higher security level compared to the conventional Vacating Room after Encryption (VRAE) and Reserving Room before Encryption (RRBE).
The rest of this paper is organized as follows. Section 2 reviews the recent works of RDH-EI schemes. Section 3 introduces our two-tuples coding method. Section 4 describes our RDH-EI scheme. Next, Section 5 analyzes and discusses experimental results. Lastly, Section 6 concludes.

RDH-EI and Related Works
Numerous RDH-EI schemes have been proposed in recent years to improve the embedding rate, security, reversibility, and separability. In general, the existing RDH-EI schemes can be classified into three methods, namely: Vacating Room after Encryption (VRAE) [19][20][21][22][23][24][25][26][27][28][29][30][31], Reserving Room before Encryption (RRBE) [32][33][34][35][36][37][38], and Vacating Room by Encryption (VRBE) [39][40][41][42]. The frameworks of VRAE, RRBE and VRBE and their characteristics are illustrated in Figure 1. Let A be the carrier image and AD the additional data; for each method, E(A) denotes the encrypted image and E(A)_AD the output of embedding AD into the encrypted image. Figure 1A shows the construction of an RDH-EI scheme by using the VRAE method. In the VRAE method, the content owner first encrypts the carrier image, A, to get the encrypted image, E(A), and passes the encrypted image to the data hider. The data hider tries to vacate as much space as possible in E(A) in order to embed more additional data, AD, and obtain E(A)_AD. Figure 1B illustrates the RDH-EI construction with the RRBE method. In the RRBE method, the content owner first reserves part of the space in the carrier image, A, by applying a traditional RDH method, then encrypts it to get the encrypted image, E(A). The data hider embeds the additional data, AD, into the encrypted part corresponding to the reserved space to generate E(A)_AD.
The application of the VRBE method in constructing the RDH-EI framework is demonstrated in Figure 1C. In the VRBE method, the content owner designs the image encryption method so that the encrypted image E(A) contains a redundant space. The data hider uses a traditional RDH method to embed additional data, AD, in this redundant space, which is generally sizeable. Details of the mathematical notations and abbreviations are given in Appendix A (Tables A1 and A2).

• Vacating Room After Encryption (VRAE) Method
The concept of VRAE was first proposed by Zhang [9] in 2011. Zhang splits the encrypted image, E(A), into nonoverlapping blocks. The pixels of each block are pseudorandomly divided into two sets by using a data-hiding key, and three Least Significant Bits (LSBs) are then flipped according to the additional data to be embedded. However, recovering the carrier image, A, in Zhang's scheme [9] requires an extra fluctuation function and contains some errors. Liao et al. [19] argued that Zhang's [9] fluctuation function ignores four pixels on the boundary of each block, and subsequently improved the fluctuation function to reduce its error rate. While Hong et al. [20] considered the boundary pixels in constructing RDH-EI, their approach is limited to the use of two pixels in calculating the smoothness of the image block; using three or four adjacent pixels in calculating the image block's complexity would significantly increase the accuracy of the recovered image. Qin et al. [21] still rely on the LSB-flipping technique in essence, but improve the selection of the flipped positions and adopt an adaptive judgment function that restores the image based on the local content distribution of the image. Thus, Qin et al.'s [21] technique effectively reduces the error rate of the VRAE method. Wang et al. [22] also follow the blueprint of Zhang's method [9] by proposing a block-level image encryption method. With their self-embedding method, the embedding position can be located quickly, and the error rate of the fluctuation function is reduced. On the other hand, Zhang [23] attempted to achieve the reversible and separable properties by proposing a separable reversible data hiding scheme for the encrypted image. In Zhang's RDH-EI scheme [23], the content owner first encrypts the carrier image, A; the data hider then compresses the LSBs of the encrypted image, thus creating a sparse space in which to embed the additional data, AD.
Xu and Wang [24] encrypt the carrier image, A, with a stream cipher, and the interpolation errors of the other nonsample pixels are encrypted. Improved histogram shift and differential expansion techniques are used to embed the additional data into the interpolation error. Meanwhile, a group of researchers [25][26][27][28][29][30][31][32] has aimed to improve the embedding rate and computational complexity of the VRAE method. Agrawal and Kumar [25] aimed to lessen the computational complexity of RDH-EI by using addition modulo 256 to encrypt the carrier image and preserved mean values to extract the information and restore the image with an accuracy of 100 percent. However, their scheme suffers from a low embedding rate because it involves bit-by-bit encryption. Singh and Raman [26] apply the Chinese Remainder Theorem (CRT) to encrypt the carrier information and distribute it to multiple shares, then embed the additional data in some of these shares with the data hiding key. Qin et al. [27] encrypt the pixel blocks with analogous stream ciphers and divide the encrypted blocks into two groups corresponding to the carrier image's smooth and complex areas. Qin et al. [27] then improve the embedding rate by compressing the smooth area to generate redundant space. Zhang et al. [28] further enhanced Qin et al.'s [27] scheme by encrypting the blocks with a stream cipher and displacement; the additional data, AD, are then embedded in the Most Significant Bit (MSB) layer of some pixels in the smooth area blocks. As a result, this method can embed data twice and achieves satisfactory quality of the decrypted image.
Unlike the stream cipher approach in the RDH-EI schemes of Xu and Wang [24] and Qin et al. [27], Yi et al. [29] used pixel correlation in the image block and obtained redundant space by using block-level prediction error. They then encrypted the image by block replacement so that the original redundant space is retained, in order to obtain a higher embedding rate and better visual quality of the image. Di et al. [30] divided the encrypted image, E(A), into two sub-images by using bit plane-level operations and embedded additional data, AD, with an adaptive embedding strategy. Although this method improves the embedding rate, the image quality measured in Peak Signal-to-Noise Ratio (PSNR) of the encrypted image E(A) is low, and it is only suitable for some specific types of carrier images. Yu et al. [31] split the carrier image into nonoverlapping blocks, arranged the created image blocks, scanned each block to generate a one-dimensional pixel sequence with a closed Hilbert curve, and transformed it into an encrypted image, E(A). Then, the additional data, AD, were embedded by transforming the histogram, thus effectively improving the embedding capability and security level. To enhance the prediction of the current pixel, Xu and Su [32] used the linear weighting of three adjacent pixels in the row, then applied a modular operation to encrypt the pixel value in each row and embedded the additional data by changing the differential histogram.

• Reserving Room Before Encryption (RRBE) Method
To overcome the low embedding rate of the VRAE method and its incomplete restoration of the carrier image, the content owner in RRBE methods [33][34][35][36][37][38][39] employs preprocessing to reserve the embedding space of the carrier image, A, in the plaintext domain. Cao et al. [33] used sparse coding to generate ample compression space and achieved significant embedding capacity by mining the correlation between neighboring pixels, while keeping data extraction and carrier image restoration separable. To obtain a higher embedding rate, Han et al. [34] employed Huffman coding to compress the error between the original image and the estimated image. Qian et al. [35] embedded various amounts of additional data, AD, into the three channels of the encrypted image; however, the embedding rate relies heavily on the chosen parameters. A higher embedding rate can be achieved by selecting appropriate parameters, whereas inappropriate parameters can result in a lower embedding rate and in additional data, AD, that cannot be reliably extracted from the carrier image, A. Subsequently, Li et al. [36] extended the application of the stream cipher in encrypting the original carrier image, A, by combining it with the block substitution technique, thus improving the image quality and embedding rate. Wu et al. [37] achieved the separable properties by splitting the carrier image, A, into multiple image blocks with different scales, and adding redundancy for the pixels to vacate more allocated space based on the difference between the average and the block pixel value. Nasrullah et al. [38] employed the lifting-based Integer Wavelet Transform (IWT) and Set Partition in Hierarchical Tree (SPIHT) coding to perform lossless compression in the encryption domain. The Kd-tree approach was applied to reserve more embedding room for hiding secret data. Yu et al. [39] divided the carrier image, A, into reference pixels and nonreference pixels. The process of embedding additional data involves the prediction error of the nonreference pixels and the replacement of the original nonreference pixels with the prediction error. The use of stream cipher and shuffling strategies in encrypting the image results in a higher security level.

• Vacating Room By Encryption (VRBE) Method
Over the last couple of years, VRBE has emerged as a modern approach that differs from the conventional VRAE and RRBE methods. The VRAE method creates redundant space after the encryption process: the content owner focuses on encrypting the carrier image, A, while the data hider needs to create redundant space in the encrypted image without affecting image recovery accuracy. The entropy of the encrypted image E(A), however, tends towards its limit. Theoretically speaking, hiding data in the encrypted image is therefore difficult; it imposes higher requirements on the data hider and leads to a low embedding rate and poor reversible and separable properties.
In contrast, the RRBE method first generates redundant space in the carrier image, A, before the encryption process, and the original redundant space has to be retained throughout the encryption process. However, the generation of redundant space often depends on RDH methods. Therefore, the content owner must complete both RDH and image encryption, imposing heavy duties on content owners and putting forward high RDH-EI technology requirements. On the other hand, the VRBE method typically generates redundant space during the encryption process; the content owner only needs to encrypt the carrier image and transmit the redundant space generated during the encryption process directly to the data hider.
As the redundant space is independent of the effective carrier image data in the VRBE method, it can ensure separable properties and has apparent advantages over VRAE and RRBE in embedding rate. Several RDH-EI schemes [40][41][42][43] have been constructed based on VRBE methods recently. Liu et al. [40] generated the encrypted image, E(A), with redundant space by disordering bit-planes and sub-blocks, then used the Arnold transform to embed data with a general RDH algorithm, and lastly transmitted it to the data hider. This lessened the complexity for the content owner but resulted in a low embedding rate of 1.600 bpp. Yi et al. [41] classified pixels using a binary tree labelling technique, providing different label classification strategies according to different parameters. On this basis, their scheme tolerates significant changes in pixels compared to the conventional data embedding method and achieved an embedding rate of 1.752 bpp. However, the embedding rate is influenced by the selection of parameters. Tang et al. [42] proposed a new block-based image encryption scheme that transfers the spatial correlation of adjacent pixels of the carrier image into the encrypted image, and combines a purpose-designed differential compression method with improved Huffman coding, which results in a higher embedding rate and better image quality. Chuan et al. [43] proposed a new RDH-EI scheme based on redundant transmission and sparse block coding. The content owner scrambles and encrypts the bit-planes, blocks and pixels, then transfers the redundant information from MSB to LSB. By applying sparse coding, the data hider embeds additional data, AD, into different types of encrypted blocks to ensure the separation of data extraction, image decryption and recovery. However, this method is not universal and produces additional marker bits under some parameters, which reduces the embedding rate.

Limitations of Recent RDH-EI Methods
The above review of existing RDH-EI schemes highlights the contributions and deficiencies of the three methods in RDH-EI. In the VRAE method, the data hider needs to embed additional data, AD, in the encrypted image without affecting the restoration of the carrier image. However, it is challenging to obtain redundant space in the encrypted image, resulting in low embedding capacity, and errors might occur in data extraction and image restoration. The RRBE method follows the blueprint of the RDH algorithm; content owners can choose an appropriate method from the existing RDH methods to generate redundant space and then encrypt the image. The data hider can directly embed additional data, AD, in the reserved space. However, the achieved embedding rate and separability are still low due to RDH's intrinsic structure. On the other hand, the algorithm design of the VRBE method can generate a larger embedding space and efficiently meet the reversible and separable properties. However, we found that recent VRBE methods [40][41][42][43] improved the embedding rate and achieved perfect recovery at the sacrifice of the separable and generality properties and of image quality.
This paper proposes a new RDH-EI scheme to address the limitations of VRBE methods, which can improve the embedding rate without sacrificing image quality and achieve the reversible, separable and generality properties. Given that adjacent pixels in an original carrier image, A, have a high correlation between their high-order bits, this paper designs a binary two-tuples coding method that utilizes the redundancy of these high-order bits to effectively compress their length, thus enabling the generation of the encrypted image, E(A), directly through image reconstruction. The construction of the proposed RDH-EI scheme consists of five processes: high-order bits compression, low-order bits recombination, vacancy filling, data embedding, and pixel diffusion.

Two-Tuples Coding
Definition. A combination of one or more selected bits in a binary sequence is denoted by Element, and the number of consecutive occurrences of the Element in the sequence is denoted by Number. The Element and its corresponding Number form a two-tuple, denoted as (Element, Number). Any given binary sequence can be expressed as n two-tuples, $(Element_1, Number_1), (Element_2, Number_2), \cdots, (Element_n, Number_n)$. The Element and Number are uniformly represented in binary, and the new binary sequence generated in this way is called two-tuples coding, as shown in Figure 2. If the bit-length differs from one two-tuple to the next, the Element and Number values cannot be identified and differentiated after continuous coding. Definite bit-lengths must therefore be given before two-tuples coding so that the encoded two-tuples can be restored accurately: the bit-length of Element is denoted by $b_{Ele} = Length(Element)$ and the bit-length of Number by $b_{Num} = Length(Number)$, where $Length(\cdot)$ is the bit-length of the sequence "$\cdot$". A sequence must then be encoded with $L_{Ele}$ and $L_{Num}$ given in advance and uniquely determined. If the binary representation of Number is shorter than the predetermined $L_{Num}$, redundant "0" bits are added on its left so that the bit-lengths $b_{Ele}$ and $b_{Num}$ remain unique and definite. This gives two-tuples coding its reducibility property, as illustrated in Figure 3, where the coding length is reduced by $\Delta Length = Length(Q) - Length(Q') = 31 - 21 = 10$ bits. However, in a real-world coding application, an exceptional case may occur when the bit-length of Number is limited. Let $\max\{Number\} = 2^{b_{Num}} - 1$ and let $num$ be the number of consecutive occurrences of an Element; then $num > \max\{Number\}$ may occur.
Given a binary sequence $Q = \{101010101010101011111101\}$ with $Length(Q) = 24$, the shortest two-tuples coding of Q should be $Q_1 = \{("10", 8), ("11", 3), ("01", 1)\} = \{("10", 1000), ("11", 11), ("01", 1)\} = \{1010001111011\}$. However, when $b_{Ele} = 2$ and $b_{Num} = 2$, $\max\{Number\} = 2^{b_{Num}} - 1 = 2^2 - 1 = 3$, which makes the two-tuple ("10", 8) invalid. To address this, it is necessary to disassemble the two-tuple and increase the grouping: ("10", 8) is encoded as {("10", 3), ("10", 3), ("10", 2)}. The valid two-tuples of the sequence Q are then $Q_{(2,2)} = \{("10", 11), ("10", 11), ("10", 10), ("11", 11), ("01", 01)\}$, the two-tuples coding is $Q_{(2,2)} = \{10111011101011110101\}$, and $Length(Q_{(2,2)}) = 20$. The values of $b_{Ele}$ and $b_{Num}$ directly affect the coding efficiency. For instance, given $b_{Ele} = 4$ and $b_{Num} = 3$, $Q_{(4,3)} = \{("1010", 100), ("1111", 001), ("1101", 001)\} = \{101010011110011101001\}$, and $Length(Q_{(4,3)}) = 21$. The same binary sequence with a different set of values $b_{Ele}$ and $b_{Num}$ thus results in different lengths of two-tuples coding, as illustrated in Figure 4.
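The coding rule above, including the disassembly of a two-tuple whose Number exceeds $2^{b_{Num}} - 1$, can be illustrated with a minimal Python sketch. The function names `two_tuples_encode` and `two_tuples_decode` are hypothetical and chosen only for this illustration, and the sketch assumes that the sequence length is a multiple of $b_{Ele}$; it is not the authors' implementation.

```python
from itertools import groupby


def two_tuples_encode(bits: str, b_ele: int, b_num: int) -> str:
    """Encode a binary string as fixed-width (Element, Number) two-tuples."""
    if len(bits) % b_ele != 0:
        # Assumption of this sketch: no partial Element at the end.
        raise ValueError("sequence length must be a multiple of b_ele")
    max_number = 2 ** b_num - 1
    elements = [bits[i:i + b_ele] for i in range(0, len(bits), b_ele)]
    out = []
    for element, run in groupby(elements):
        count = len(list(run))
        # Disassemble runs longer than max{Number} into several two-tuples.
        while count > 0:
            number = min(count, max_number)
            out.append(element + format(number, f"0{b_num}b"))
            count -= number
    return "".join(out)


def two_tuples_decode(code: str, b_ele: int, b_num: int) -> str:
    """Restore the original binary string from its two-tuples coding."""
    step = b_ele + b_num
    out = []
    for i in range(0, len(code), step):
        element = code[i:i + b_ele]
        number = int(code[i + b_ele:i + step], 2)
        out.append(element * number)
    return "".join(out)


Q = "101010101010101011111101"
print(two_tuples_encode(Q, 2, 2))
print(two_tuples_encode(Q, 4, 3))
```

Because both bit-lengths are fixed in advance, the decoder only needs $b_{Ele}$ and $b_{Num}$ to split the code back into two-tuples, which is what makes the coding restorable.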

Enhanced Two-Tuples for Image Compression
Adjacent pixels in the original carrier image often share identical grayscale values. Conventionally, binary coding can exploit this to compress the carrier image, A, without introducing errors, but only to a certain extent: the compression rate is low because exactly identical grayscale values occur in relatively small numbers. Groups of adjacent pixels in a neighborhood, however, tend to have nearly similar grayscale values, which becomes especially apparent when the pixels are decomposed into high-level and low-level fragmentations. This study exploits the feature of nearly similar grayscale values to compress the carrier image, thereby ensuring perfect recovery and achieving separable space with a higher embedding rate.
The carrier image generally exists as an array of bytes, and each grayscale pixel typically consists of 8 bits (1 byte). For a pixel value $p \in [0, 255]$, every p in the range [0, 15] has the identical high-order 4 bits "0000". Similarly, every p in the range [16, 31] has the identical high-order 4 bits "0001". As illustrated in Figure 5, the nearly similar grayscale pixel values of 163, 162, 167, 164 and 161 have the identical high-order 4 bits "1010", while the grayscale pixel values of 15, 7, 9 and 12 have the identical high-order 4 bits "0000". This feature can be further utilized to increase the compression ratio by separating the carrier image's high-order bits and low-order bits. Let $High_x$ be the high-order x bits of the binary pixel coding and $Low_y$ be the low-order y bits, where $y = 8 - x$. Given a Number, the two-tuples encoding of the high-level fragmentation is defined as $(High_x, Number)$, and the low-level fragmentation is $Low_y$. As illustrated in Figure 6, assume that $b_{Ele} = 4$ and $b_{Num} = 3$, so that the image two-tuples coding is $(High_4, 3)$; we encode the $High_4$ part with $(High_4, 3)$ and merge the $Low_4$ part sequentially. For a carrier image of size $m \times n$, let $L_{coding}$ be the length of the conventional binary coding and $L'_{coding}$ the length of the enhanced two-tuples coding with the (Element, Number) code. With $t$ denoting the number of two-tuples generated from the high-order bits, then:

$$L_{coding} = m \times n \times 8, \qquad L'_{coding} = t \times (b_{Ele} + b_{Num}) + m \times n \times (8 - b_{Ele}),$$

and the reduced bits can be calculated as:

$$\Delta L = L_{coding} - L'_{coding}.$$

For example, the length of the conventional binary coding of the ten adjacent pixels in Figure 6 is $L_{coding} = 10 \times 8 = 80$ bits. With the two-tuples coding $(High_4, 3)$, the high-order bits yield three two-tuples, so the length of the enhanced two-tuples coding is calculated as:

$$L'_{coding} = 3 \times (4 + 3) + 10 \times 4 = 61 \text{ bits},$$

and the reduced bits can be computed as:

$$\Delta L = 80 - 61 = 19 \text{ bits}.$$
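As a rough check of the length calculation, the following Python sketch counts the coded bits for a list of 8-bit pixels. It assumes that the enhanced code length is the number of two-tuples times $(b_{Ele} + b_{Num})$ plus $(8 - b_{Ele})$ low-order bits per pixel; the function name `coding_lengths` is hypothetical, and the nine pixel values are those quoted from Figure 5.

```python
def coding_lengths(pixels, b_ele, b_num):
    """Return (conventional bits, enhanced two-tuples bits, reduced bits)."""
    max_number = 2 ** b_num - 1
    shift = 8 - b_ele              # number of low-order bits per pixel
    tuples = 0                     # number of (High, Number) two-tuples
    run = 0
    prev = None
    for p in pixels:
        high = p >> shift
        if high == prev and run < max_number:
            run += 1               # extend the current run
        else:
            tuples += 1            # new Element, or run reached max{Number}
            run = 1
            prev = high
    conventional = 8 * len(pixels)
    enhanced = tuples * (b_ele + b_num) + len(pixels) * shift
    return conventional, enhanced, conventional - enhanced


figure5_pixels = [163, 162, 167, 164, 161, 15, 7, 9, 12]
print(coding_lengths(figure5_pixels, b_ele=4, b_num=3))
```

For these nine pixels the high-order 4 bits form only two runs ("1010" five times and "0000" four times), so the saving comes almost entirely from replacing nine 4-bit prefixes with two 7-bit two-tuples.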

Evaluation of Compression Efficiency
The compression ratio of the same carrier image is evaluated under different values of $b_{Ele}$ and $b_{Num}$ to determine the compression efficiency of the enhanced two-tuples coding. With $b_{Ele}$ in the range [1, 6] and $b_{Num}$ in the range [1, 7], the experimental compression ratios of the carrier images Lena, Peppers and Zelda are summarized in Tables 1-3, respectively. As can be seen in Tables 1-3, the compression ratio is strongly correlated with the values of $b_{Ele}$ and $b_{Num}$. The compression ratio is relatively small, or the image cannot be compressed at all, if $b_{Ele}$ and $b_{Num}$ are too small or too large. When $b_{Ele} = 3$ and $b_{Num} = 4$, the compression ratio reaches its optimum for both the Lena and Peppers images, at 21.91% and 20.74%, as shown in Tables 1 and 2, respectively, whereas the optimal compression ratio of the Zelda image is 28.90% with $b_{Ele} = 4$ and $b_{Num} = 4$. The other data in the tables also show that the two-tuples coding method can effectively compress the carrier image.
As content owners may choose various carrier images, A, with different values of $b_{Ele}$ and $b_{Num}$, the compression ratio will differ. However, from the experimental data we infer that when $b_{Ele} = \{3, 4\}$ and $b_{Num} = \{3, 4\}$, the corresponding two-tuples (Element, Number) = (3, 3), (3, 4), (4, 3), (4, 4) have a relatively high compression ratio and can serve as optimal parameters. To verify this conjecture, we conducted extensive experimental tests on different carrier images. The results revealed that most carrier images reach their maximum compression ratio with $b_{Ele} = \{3, 4\}$ and $b_{Num} = \{3, 4\}$. Therefore, content owners can choose the parameters from $b_{Ele} = \{3, 4\}$ and $b_{Num} = \{3, 4\}$ to achieve an optimal embedding rate.
Of course, testing can determine the $b_{Ele}$ and $b_{Num}$ that correspond to the optimal compression ratio for a particular carrier image, A. However, this causes extra workload for the image encryptor and affects the algorithm's generality. Therefore, we suggest using the parameters (Element, Number) = (3, 3), (3, 4), (4, 3), (4, 4) to achieve the optimal compression ratio of the carrier image in our RDH-EI scheme.

Coding Structure of the Encrypted Image
Generally, to realize data hiding in the encryption domain, the carrier image, A, needs to be encrypted first, and redundant space is then generated in the encrypted image to embed additional data. Nonetheless, encryption destroys the correlation between image pixels, and the information entropy of the encrypted image, E(A), is close to the maximum. Therefore, it is challenging to generate a sizeable redundant space in the encrypted image, E(A). This is also the main reason for the low embedding rates of the VRAE and RRBE methods. The content owner has a wide choice of image encryption methods; if the content owner can generate a larger redundant space when encrypting the carrier image, A, then the data hider can directly embed a large amount of additional data, AD, in the space reserved by the content owner, which is the advantage of the VRBE method used in this paper. The two-tuples coding method proposed in this paper addresses this problem: with two-tuples coding, a large redundant space can be reserved in the encrypted image, E(A), directly during the image encryption process. The proposed RDH-EI scheme with the two-tuples coding method consists of five processes: high-order bits compression with the two-tuples coding method, low-order bits combination, vacancy filling, data embedding, and pixel diffusion. The coding structure of the encrypted image, E(A), is illustrated in Figure 8.

Calculate the High-Order Bits and Low-Order Bits of the Image
Setup: Let A be a carrier image of size $m \times n$. The high-order x bits and low-order y bits ($y = 8 - x$) of each pixel $A_{ij}$ can be obtained by the functions bitshift() and mod(), respectively:

$$A^{High}_{ij} = bitshift(A_{ij}, -y), \qquad A^{Low}_{ij} = mod(A_{ij}, 2^{y}) \qquad (1)$$

where $bitshift(A_{ij}, k)$ returns $A_{ij}$ shifted by k bits: if $A_{ij} > 0$ and $k > 0$, it shifts the bits to the left and inserts k 0-bits on the right, and, if $A_{ij} > 0$ and $k < 0$, it shifts the bits to the right and inserts $|k|$ 0-bits on the left. For example, if the value of a grayscale pixel is 13 and its corresponding binary value is 00001101, the high-order 6 bits are $bitshift(13, -2) = 3 = (000011)_2$ and the low-order 2 bits are $mod(13, 2^2) = 1 = (01)_2$. This implies that if the pixel's grayscale value is 13, then the high-order 6 bits are 000011, and the remaining low-order 2 bits are 01.
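The split into high-order and low-order bits can be mirrored in Python with a right shift and a modulo operation. This is only an illustrative equivalent of the bitshift() and mod() calls described above, and the helper name `split_pixel` is hypothetical.

```python
def split_pixel(p: int, x: int):
    """Split an 8-bit pixel into its high-order x bits and low-order 8 - x bits."""
    if not (0 <= p <= 255 and 1 <= x <= 7):
        raise ValueError("expected an 8-bit pixel and 1 <= x <= 7")
    y = 8 - x
    high = p >> y           # right shift by y bits, like bitshift(p, -y)
    low = p % (1 << y)      # remainder modulo 2^y, like mod(p, 2^y)
    return high, low


high, low = split_pixel(13, 6)
print(format(high, "06b"), format(low, "02b"))
```

Shifting the high part back and adding the low part, `(high << y) + low`, restores the original pixel, which is why the separation loses no information.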

High-Order Bits Compression with Two-Tuples Coding Method
Step 1: Calculate the high-order $b_{Ele}$ bits of the carrier image, A, by using the bitshift() function and record it as:

$$A^{High}_{i,j} = bitshift(A_{i,j}, -(8 - b_{Ele})) \qquad (2)$$

Step 2: Transform $A^{High}_{i,j}$ into a one-dimensional sequence of double data type, $A^{High}_{d(i,j)}$, by using the double() function.
Step 3: Use the find() function to record each run of identical consecutive values in $A^{High}_{d(i,j)}$: the starting subscript of each run, $count_{(i,j)}$, and the number of consecutive occurrences of the value, $count2_{(i,j)}$.
Step 4: When $count2_{(i,j)} > 2^{b_{Num}} - 1$, $count2$ needs to be disassembled so that every entry satisfies $count2_{(i,j)} \le 2^{b_{Num}} - 1$. The extended sequence of $count2_{(i,j)}$ is recorded as $COUNT_{extend(i,j)}$, and its length is counted.
Step 5: Each element of $A^{High}_{d(i,j)}$ and its corresponding number in $COUNT_{extend(i,j)}$, written in binary with $b_{Ele}$ and $b_{Num}$ bits respectively, are sequentially connected into a new one-dimensional sequence, $A^{High}_{connect}$, by using the strcat() function:

$$A^{High}_{connect} = strcat(A^{High}_{d}(count_{(i,j)}), COUNT_{extend(i,j)}) \qquad (7)$$

Step 6: Split $A^{High}_{connect}$ into 8-bit blocks, $B_{High(i)}$. If the last block has only $k < 8$ bits, append $8 - k$ "0" bits on the right of $B_{High(i)}$.
Step 7: Transform $B_{High(i)}$ to decimal by using the function bin2dec() to obtain $B_H$. $B_H$ is the second part of the encrypted image E(A); its total number of bits is the length of $A^{High}_{connect}$ after padding.

Low-Order Bits Combination
Step 1: Calculate the low-order $(8 - b_{Ele})$ bits of the carrier image, A, of size $m \times n$, and record it as $A^{Low}_{i,j} = mod(A_{i,j}, 2^{8 - b_{Ele}})$.
Step 2: Convert every $A^{Low}_{i,j}$ to a binary string $A^{LOW2}_{i,j}$ with the function dec2bin(). Every $A^{LOW2}_{i,j}$ must have exactly $(8 - b_{Ele})$ digits; if it has fewer, add '0' digits in front of $A^{LOW2}_{i,j}$. Then, perform fragmentation by using the function cellstr() to partition $A^{LOW2}_{i,j}$ into $(8 - b_{Ele})$-bit cells, $A^{LowCell}_{i,j}$.
Step 3: Concatenate the $m \times n$ scattered $A^{LowCell}_{i,j}$ into a new continuous one-dimensional binary sequence, $A^{LowConnect}$, with the strcat() function.
Step 4: Change $A^{LowConnect}$ to character type, $A^{LowChar}$, with the function char():

$$A^{LowChar} = char(A^{LowConnect}) \qquad (14)$$

and group $A^{LowChar}$ by 8 bits. If the last group has only $k < 8$ bits, append $8 - k$ 0-bits on the right to obtain $B_{Low}(i)$.
Step 5: Convert $B_{Low}(i)$ to a decimal sequence by using the function bin2dec() to obtain $B_L$. $B_L$ is the first part of the encrypted image, E(A), and the total number of bits is $m \times n \times (8 - b_{Ele})$.
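The low-order combination can likewise be sketched in a few lines of Python. The sketch assumes that each pixel contributes its low-order $8 - b_{Ele}$ bits, consistent with the total of $m \times n \times (8 - b_{Ele})$ bits stated above; the name `combine_low_bits` is hypothetical and the pixel values are again those quoted from Figure 5.

```python
def combine_low_bits(pixels, b_ele):
    """Concatenate the low-order 8 - b_ele bits of all pixels and pack them into bytes (B_L)."""
    if not 1 <= b_ele <= 7:
        raise ValueError("b_ele must be between 1 and 7")
    y = 8 - b_ele
    # Fixed-width binary string per pixel, zero-padded on the left
    stream = "".join(format(p % (1 << y), f"0{y}b") for p in pixels)
    # Pad the last group with 0-bits on the right up to a multiple of 8
    stream += "0" * (-len(stream) % 8)
    return [int(stream[i:i + 8], 2) for i in range(0, len(stream), 8)]


figure5_pixels = [163, 162, 167, 164, 161, 15, 7, 9, 12]
print(combine_low_bits(figure5_pixels, b_ele=4))
```

With $b_{Ele} = 4$, every two input pixels contribute one output pixel, so the low-order part always occupies about half of the image regardless of its content.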

Compressed Redundancy Space-Filling Bits
In this part, we use the logistic mapping of a chaotic system to generate random sequences for filling the remaining space of the compressed image. The logistic mapping is described as follows:
$$x_{k+1} = \mu x_k (1 - x_k)$$
where $3.569945 \le \mu \le 4$ and $x_k \in [0, 1]$. The logistic mapping is deterministic yet pseudorandom, nonperiodic and nonconvergent, sensitive to its initial value, unpredictable and fast to generate, thus ensuring the randomness and security of sequence generation.
$B_L$ is obtained by combining the low-order bits in Section 4.4, and $B_H$ is obtained by high-order bits compression in Section 4.3. The total length of the two sequences is
$$m \times n \times (8 - b_{Ele}) + (b_{Num} + b_{Ele}) \times b_{count}.$$
The remaining space of the image is marked as space, and the total number of bits of space is $b_{space}$, then:
$$b_{space} = 8 \times m \times n - \left[ m \times n \times (8 - b_{Ele}) + (b_{Num} + b_{Ele}) \times b_{count} \right].$$
Next, given the initial values $x_0$ and $\mu$, the sequence $x_i$ is generated, and the binary sequence $p_i$ is obtained from $x_i$. The random sequence $p_i$ is divided into groups of eight bits, and each group is converted to decimal to get the sequence $B_P$. $B_P$ is the third part of the encrypted image $E(A)$. Finally, all the pixels of the encrypted image are formed by connecting $B_L$, $B_H$ and $B_P$ in sequence, and these pixels are rewritten into an image of size $m \times n$ to get the encrypted image $E(A)$.
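The generation of the filling bits can be sketched as below, using the standard logistic map $x_{k+1} = \mu x_k (1 - x_k)$. The text does not state how $p_i$ is derived from $x_i$, so the threshold at 0.5 used here is purely an assumption, and the names `logistic_sequence` and `filling_bytes` are hypothetical.

```python
def logistic_sequence(x0, mu, length):
    """Iterate x_{k+1} = mu * x_k * (1 - x_k) and return `length` values."""
    xs, x = [], x0
    for _ in range(length):
        x = mu * x * (1.0 - x)
        xs.append(x)
    return xs


def filling_bytes(x0, mu, b_space):
    """Produce filling pixels for b_space bits of redundant space."""
    # ASSUMPTION: p_i = 1 when x_i >= 0.5, else 0 (rule not given in the text).
    bits = "".join("1" if x >= 0.5 else "0"
                   for x in logistic_sequence(x0, mu, b_space))
    usable = len(bits) - len(bits) % 8    # keep whole 8-bit groups only
    return [int(bits[i:i + 8], 2) for i in range(0, usable, 8)]


if __name__ == "__main__":
    # x0 and mu play the role of the secret key; these values are illustrative.
    print(filling_bytes(x0=0.3, mu=3.99, b_space=32))
```

Because the map is deterministic, anyone holding the same $x_0$ and $\mu$ regenerates exactly the same filling sequence.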
Next, we use the example illustrated in Figure 9 to demonstrate the coding and compression process. Let the carrier image $A$ of size $m \times n$ be given by $m = 14$ and $n = 1$, with the grayscale values (163, 162, 168, 166, 164, 160, 124, 126, 15, 13, 8, 7, 9, 5). Set $b_{Ele} = 4$ and $b_{Num} = 4$, and generate the fragments High 4 and Low 4 by using formula (1). First, transform all the Low 4 fragments into a one-dimensional binary space, $B_{LowConnect}$ = {0011001010000110 . . . 10010101}. Then regroup each 8-bit block of $B_{LowConnect}$, in order, into a new pixel $B_{Low}(i)$. The resulting $B_L$ consists of seven new pixels, (50, 134, 118, 64, 207, 135, 149), and serves as the first part of the encrypted image $E(A)$; the generated $B_L$ is positioned at the first pixel to the eighth pixel. Next, apply the proposed two-tuples coding (4,4) to all the High 4 bits, which outputs $A_{HighConnect}$ = {("1010", 6), ("0111", 2), ("0000", 6)}. Then split the binary coding $A_{HighConnect}$ = {101001100111001000000110} into 8-bit blocks, $B^{High}_{(i)}$. Finally, $B_H$ is generated with three new pixels, (166, 114, 6), which are taken as the second part of $E(A)$ and positioned from the ninth pixel to the 11th pixel. Consequently, the 12th to 16th pixels are the compressed redundant space, which can be used for embedding additional data.

Pixel Diffusion
To further improve the security of the encrypted image, we apply pixel diffusion, which spreads the influence of the plaintext image across the ciphertext domain. Pixel diffusion changes the statistical characteristics of the image in the ciphertext domain and prevents an attacker from obtaining valuable information by comparing the plaintext image with the encrypted image.
We again use formula (1) to generate a random sequence, $D$, of length $m \times n$, and rewrite it into a pixel diffusion matrix $D$ of size $m \times n$, which is then applied to the encrypted image to give the image after pixel diffusion, $E(A)$. This completes the whole image encryption process, and the final image $E(A)$ with redundant space is obtained. To prevent the data hider from destroying useful information of the carrier image, the content owner must let the data hider know which part is the redundant space by specifying its starting position. The content owner can either select a starting position, start_point, in the redundant space and pass it directly to the data hider, or replace the last $T$ bytes of the image $E(A)$ with start_point.
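The diffusion operation itself is not recoverable from the text above, so the following sketch only illustrates the general idea under explicit assumptions: the diffusion matrix $D$ is taken to be a keystream quantised from the logistic map, and it is combined with the pixels by bitwise XOR. Both choices, and the names `keystream` and `diffuse`, are illustrative and are not claimed to be the authors' method.

```python
def keystream(x0, mu, length):
    """Quantise logistic-map iterates into byte values (assumed rule)."""
    x, out = x0, []
    for _ in range(length):
        x = mu * x * (1.0 - x)
        out.append(int(x * 256) % 256)    # ASSUMPTION: simple 8-bit quantisation
    return out


def diffuse(pixels, x0, mu):
    """XOR every pixel with the key-dependent stream D (assumed operation)."""
    d = keystream(x0, mu, len(pixels))
    return [p ^ k for p, k in zip(pixels, d)]


if __name__ == "__main__":
    encrypted = [50, 134, 64, 166, 114, 6]
    diffused = diffuse(encrypted, x0=0.3, mu=3.99)
    # XOR is its own inverse, so the same call with the same key restores the input.
    print(diffused, diffuse(diffused, x0=0.3, mu=3.99) == encrypted)
```

An XOR-based choice keeps the step exactly reversible for a receiver who knows $x_0$ and $\mu$, which is the property the recovery procedure relies on.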

Embedding Additional Data
After receiving the encrypted image, the data hider needs to know the starting position, $t$, of the embeddable space in order to obtain the embeddable space $C_{Space}$.
Suppose the additional data are $W = w_1, w_2, \cdots, w_k$ with $w_i \in \{0, 1\}$, where the length $k$ must satisfy $k \le b_{space}$. Given the initial values $x_0$ and $\mu$, a sequence $Q$ of length $k$ with mutually unequal elements is generated by the logistic mapping; generation ends once $Q$ contains $k$ elements that satisfy the condition. We then use $w_i$ to replace the value $C_{Space}(q_i)$ at position $q_i$ of the sequence $C_{Space}$, which embeds the additional data and yields $C_S$. For example, let $C_{Space}$ = 1 0 1 1 0 1 0 0 1 0 1 1 1, $W$ = 1 0 0 1 1 0 0 and $Q$ = 2, 11, 5, 9, 7, 4, 8, as shown in Figure 10. After data embedding, the encrypted image $E(A)_{AD}$ with additional data is obtained.
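The embedding step can be illustrated with the values of the example above. In this sketch the positions in $Q$ are treated as 1-based, matching the example. The helper `position_sequence` shows one possible way to draw distinct positions from the logistic map; its mapping from $x_i$ to a position is an assumption, since the text does not give the rule, and both function names are hypothetical.

```python
def position_sequence(x0, mu, k, b_space, max_iter=100000):
    """Draw k distinct 1-based positions in [1, b_space] from the logistic map."""
    if k > b_space:
        raise ValueError("additional data longer than the embeddable space")
    q, seen, x = [], set(), x0
    for _ in range(max_iter):
        x = mu * x * (1.0 - x)
        # ASSUMPTION: scale x to a position; repeated positions are skipped.
        pos = min(int(x * b_space), b_space - 1) + 1
        if pos not in seen:
            seen.add(pos)
            q.append(pos)
            if len(q) == k:
                return q
    raise RuntimeError("could not find k distinct positions")


def embed(c_space, w, q):
    """Replace C_Space(q_i) by w_i for every i and return C_S."""
    c_s = list(c_space)
    for bit, pos in zip(w, q):
        c_s[pos - 1] = bit                # positions are 1-based
    return c_s


if __name__ == "__main__":
    c_space = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
    w = [1, 0, 0, 1, 1, 0, 0]
    q = [2, 11, 5, 9, 7, 4, 8]
    print(embed(c_space, w, q))
```

Only the bits at the positions listed in $Q$ change; all other filling bits keep their pseudorandom values.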

Recovery of the Carrier Image
Step 1: Input the password to get $x_0$ and $\mu$, obtain the pixel diffusion matrix $D$ by formulas (21) and (22), and use it to reverse the pixel diffusion.
Step 2: The first $m \times n \times (8 - b_{Ele})$ bits of the image $E(A)$ are taken as a one-dimensional binary space, $B_L$. Every $(8 - b_{Ele})$ bits of the sequence $B_L$ form one group, which gives the $m \times n$ low-order $(8 - b_{Ele})$-bit parts of the carrier image $A$.
Step 3: Starting from bit $\lceil m \times n \times (8 - b_{Ele}) / 8 \rceil \times 8 + 1$ of the encrypted image $E(A)$, the next $(b_{Num} + b_{Ele}) \times b_{count}$ bits are arranged into a binary sequence, $B_H$, and every $(b_{Num} + b_{Ele})$ bits of $B_H$ are taken as a group. In each group, construct the two-tuple (Element, Number) from the parameters $b_{Ele}$ and $b_{Num}$. Expand each Element by its Number, using the widths $b_{Ele}$ and $b_{Num}$, to get the expanded sequence $B_H$.
Step 4: Every $b_{Ele}$ bits of the expanded sequence $B_H$ form a group, which gives the $m \times n$ high-order $b_{Ele}$-bit parts of the image $A$.
Step 5: The high-order $b_{Ele}$ bits and the low-order $(8 - b_{Ele})$ bits are combined in turn to get all the pixels of image $A$, and the image $A$ is accurately restored by rewriting them into an image of size $m \times n$.
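Steps 2 to 5 of the recovery can be sketched as follows, assuming the pixel diffusion has already been undone and that $b_{count}$ is known to the receiver (how it is conveyed is not specified here, so it is simply passed as a parameter). The function name `recover_pixels` is hypothetical, and the low-order width is taken as $8 - b_{Ele}$ bits.

```python
def recover_pixels(b_l, b_h, n_pixels, b_count, b_ele=4, b_num=4):
    """Rebuild the carrier pixels from the B_L and B_H byte sequences."""
    low_width = 8 - b_ele

    # Step 2: regroup the B_L bit stream into one low-order part per pixel.
    low_bits = "".join(format(v, "08b") for v in b_l)
    lows = [int(low_bits[i * low_width:(i + 1) * low_width], 2)
            for i in range(n_pixels)]

    # Steps 3-4: read (Element, Number) groups and expand each Element.
    high_bits = "".join(format(v, "08b") for v in b_h)
    width = b_ele + b_num
    highs = []
    for t in range(b_count):
        group = high_bits[t * width:(t + 1) * width]
        element = int(group[:b_ele], 2)
        number = int(group[b_ele:], 2)
        highs.extend([element] * number)
    if len(highs) != n_pixels:
        raise ValueError("tuple counts do not add up to the pixel count")

    # Step 5: recombine the high-order and low-order parts.
    return [(h << low_width) | l for h, l in zip(highs, lows)]


if __name__ == "__main__":
    # B_H holds the tuples (1010, 6), (0111, 2), (0000, 6); B_L is all zeros here.
    print(recover_pixels([0] * 7, [166, 114, 6], n_pixels=14, b_count=3))
```

The length check guards against a mismatch between the decoded run lengths and the image size, which would indicate a wrong key or corrupted data.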

• Extracting Additional Data
Step 1: The initial values x 0 and µ are obtained by entering the password. Then the sequence Q is generated by Equation (3).
Step 2: Read start_point from the last $T$ bytes of image $E(A)_{AD}$. Starting from bit start_point of $E(A)_{AD}$, the remaining bits are arranged into a binary sequence $BS$, and the sequence $W$ is obtained by reading, for each position in $Q$, the corresponding value in $BS$:
$$w_i = BS(Q(i))$$
Step 3: By restoring all binary $w_i$ to the original file format, the additional data are extracted accurately. The process of extracting additional data is straightforward and fast.
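Extraction is the mirror image of embedding: each $w_i$ is read back from position $Q(i)$ of $BS$. The sketch below assumes 1-based positions and a bit sequence that already starts at start_point; the function names `extract` and `bits_to_bytes` are hypothetical.

```python
def extract(bs, q):
    """Return W with w_i = BS(Q(i)), treating positions as 1-based."""
    return [bs[pos - 1] for pos in q]


def bits_to_bytes(bits):
    """Pack extracted bits into bytes (whole 8-bit groups only)."""
    usable = len(bits) - len(bits) % 8
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, usable, 8))


if __name__ == "__main__":
    bs = [1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    q = [2, 11, 5, 9, 7, 4, 8]
    print(extract(bs, q))
```

Since $Q$ depends only on $x_0$ and $\mu$, extraction needs the data-hiding key but not the original carrier image.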

Embedded Capacity
The embedding capacity depends on the compression ratio of the image. As mentioned in Section 3.3, the image compression ratio of our RDH-EI scheme depends on: (i) the core parameters $b_{Ele}$ and $b_{Num}$ of (Element, Number); (ii) the number of Element types (NET), i.e., the total number of types of elements without changing the arrangement order; and (iii) the number of true identities of Element ($b_{count}$), i.e., the real quantity needed to identify the NET; that is, when the number of consecutive occurrences of an Element exceeds $2^{b_{Num}} - 1$, the run needs to be disassembled. Table 4 shows the relevant parameters of the Lena greyscale image sized 512 × 512 with $b_{Ele} = b_{Num}$. The maximum redundant space is the number of bits of the $m \times n$ image that are not occupied by $B_L$ and $B_H$. The compression rate is calculated as follows:
$$\text{Compression rate} = \frac{\text{Reduced bits}}{\text{Total number of bits of the carrier image}}$$
The embedding rate (ER) of additional data is expressed in bits per pixel (bpp), i.e., the number of embedded bits divided by the $m \times n$ pixels of the image.
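The capacity quantities can be computed with a short helper. This sketch assumes that the redundant space is what remains of the $8 \times m \times n$ image bits after the low-order part, $m \times n \times (8 - b_{Ele})$ bits, and the coded high-order part, $(b_{Num} + b_{Ele}) \times b_{count}$ bits. It ignores padding and any bytes reserved for start_point, so the figures it returns are upper bounds rather than the values reported in the paper. The function name `capacity` is hypothetical.

```python
def capacity(m, n, b_ele, b_num, b_count):
    """Estimate redundant space, compression rate and maximum embedding rate."""
    total_bits = 8 * m * n
    low_bits = m * n * (8 - b_ele)            # size of B_L
    high_bits = (b_ele + b_num) * b_count     # size of B_H
    redundant = total_bits - low_bits - high_bits
    return {
        "redundant_bits": redundant,
        "compression_rate": redundant / total_bits,   # reduced bits / total bits
        "max_er_bpp": redundant / (m * n),            # bits per pixel
    }


if __name__ == "__main__":
    # Toy setting: 14 pixels, b_Ele = b_Num = 4, three (Element, Number) tuples.
    print(capacity(m=14, n=1, b_ele=4, b_num=4, b_count=3))
```

The helper makes the trade-off visible: a larger $b_{Ele}$ shrinks the low-order part but usually increases $b_{count}$, because longer high-order patterns repeat less often.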

Experimental Results
The experiments were conducted on a Windows 10 desktop with an AMD Ryzen 5 CPU at 2.10 GHz and 8 GB RAM, using Matlab version 2018a. The experimental image presented in this paper is the standard greyscale Lena image of size 512 × 512. The embedded data are presented in binary form and can be text, picture, sound or other multimedia data. To avoid experimental bias, the embedded data were randomly generated as a binary sequence. The experimental results are summarized in Figures 11 and 12. Figure 11a shows the original Lena image serving as the carrier image, $A$. Figure 11b–d show, respectively, the encrypted image $E(A)$ without additional data AD, the encrypted image after pixel diffusion, and the encrypted image with additional data, $E(A)_{AD}$, all with the parameters $b_{Ele} = 3$, $b_{Num} = 3$. Figure 11e–h show the corresponding histograms, including those of the encrypted image without additional data, the encrypted image after pixel diffusion, and the encrypted image with additional data. Figure 12a–h illustrate the experimental results for a different set of parameters, $b_{Ele} = 3$, $b_{Num} = 4$. The histograms in Figures 11e–h and 12e–h indicate that the proposed RDH-EI scheme achieves a good encryption effect; in particular, the two-tuples coding method combined with pixel diffusion makes the histogram of the encrypted carrier image tend towards a uniform distribution. Figures 11 and 12 also show that the original carrier image $A$ and its encrypted versions are visually distinct, and that the histogram of $A$ differs markedly from those of its encrypted versions.
Furthermore, in Figures 11d and 12d, the embedding of additional data has little effect on the histogram of the encrypted image, which suggests that our scheme can effectively resist statistical analysis and segmentation attacks. The proposed RDH-EI scheme therefore effectively changes the distribution of pixel values in the ciphertext domain and achieves a higher security level than recent works.

Correlation Analysis
The correlation coefficient reflects the degree of linear correlation between two data sets, and we use it to identify the similarity between two images. Its value lies between −1 and 1; the closer the correlation coefficient of two images is to 1, the more similar the two images are. The correlation coefficients for the image $A$ and its encrypted versions are calculated as follows:
$$r_{XY} = \frac{Cov(X, Y)}{\sqrt{D(X)\,D(Y)}}$$
where $Cov(X, Y)$ is the covariance of $X$ and $Y$, and $D(X)$, $D(Y)$ are the variances of $X$ and $Y$, respectively. Typically, the adjacent pixels of the original carrier image $A$ are highly correlated, and the corresponding correlation values are close to 1. After image encryption, on the other hand, the adjacent pixels are decorrelated and the resulting correlation value decreases. The data in Tables 5 and 6 show that the correlation coefficient of the original carrier image $A$ is close to 1, whereas the correlation coefficients of the encrypted image versions are mostly less than 0. These experimental results indicate that our RDH-EI scheme can effectively destroy the correlation of carrier images, thus increasing the complexity of statistical analysis and security attacks.
Table 5. Image adjacent pixel correlation (Figure 11a–d).
Table 6. Image adjacent pixel correlation (Figure 12a–d).
Table 7 summarizes the impactful RDH-EI schemes of the most recent three years. Overall, the RDH-EI schemes built on the VRBE method have improved the embedding rate significantly compared with the RDH-EI schemes built on the VRAE and RRBE methods.
Notably, the proposed scheme and the other VRBE methods [40,41,43] achieve the highest embedding rates, on average more than 1.600 bpp, compared with the VRAE methods [22,27–32] and RRBE methods [36–39], which are limited to the ranges [0.120 bpp, 0.720 bpp] and [0.180 bpp, 1.192 bpp], respectively. Our scheme and the other VRBE-based RDH-EI schemes achieve a higher embedding rate without sacrificing image quality, as the Peak Signal-to-Noise Ratio (PSNR) is generally more than 40 dB. Furthermore, the content owner in our scheme and in the VRBE-based RDH-EI schemes [40,41,43] generates the embeddable space in advance, which relieves the data hider of that responsibility. The data hider only embeds additional data in the reserved space and need not worry about destroying the original carrier image $A$ when embedding additional data AD into the encrypted carrier image $E(A)$. As carrier image encryption and additional data embedding are completely independent, the separability of our RDH-EI scheme and the other VRBE-based schemes is better than that of the VRAE and RRBE methods.
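For reference, the adjacent-pixel correlation of the kind reported in Tables 5 and 6 can be computed as $Cov(X, Y) / \sqrt{D(X) D(Y)}$ over pairs of neighbouring pixels. The following is a minimal pure-Python sketch for the horizontal direction; the function names are hypothetical, and the pairing of every pixel with its right-hand neighbour is an assumption about how the pairs are sampled.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient Cov(X, Y) / sqrt(D(X) * D(Y))."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    # Raises ZeroDivisionError for a constant sequence (zero variance).
    return cov / sqrt(var_x * var_y)


def horizontal_correlation(image):
    """Correlate each pixel with its right-hand neighbour (assumed pairing)."""
    xs = [row[j] for row in image for j in range(len(row) - 1)]
    ys = [row[j + 1] for row in image for j in range(len(row) - 1)]
    return correlation(xs, ys)


if __name__ == "__main__":
    smooth = [[0, 1, 2, 3], [10, 11, 12, 13]]   # neighbours differ by exactly 1
    print(horizontal_correlation(smooth))
```

A smooth natural image gives a value near 1, while a well-encrypted image should give a value near 0.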

Correlation
Next, we compare our scheme more closely with the VRBE-based RDH-EI schemes of Liu et al. [40], Yi and Zhou [41] and Qin et al. [43]. Table 7 shows that our scheme and the schemes in [40,41,43] all belong to the VRBE category, which inherently achieves the reversible and separable properties. However, each adopts a different approach to maximize the redundant space. Our scheme employs the two-tuples coding method proposed in Section 3 and achieves the highest embedding rate, 1.753 bpp, which is 8.7%, 1.77% and 9.87% higher than those of Liu et al. [40], Yi and Zhou [41] and Qin et al. [43], respectively. The sparse block coding of Qin et al. [43] has the lowest embedding rate, 1.580 bpp, compared with the Arnold transform-based encryption method of Liu et al. [40] and the parametric binary tree labelling of Yi and Zhou [41], whose embedding rates are 1.600 bpp and 1.722 bpp, respectively.
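The quoted percentages appear to be computed relative to the embedding rate of the proposed scheme (1.753 bpp) rather than relative to each competing scheme. Under that assumption, the arithmetic can be checked as follows:

```latex
\begin{align*}
\frac{1.753 - 1.600}{1.753} &\approx 0.0873 \approx 8.7\%  && \text{(Liu et al. [40])} \\
\frac{1.753 - 1.722}{1.753} &\approx 0.0177 \approx 1.77\% && \text{(Yi and Zhou [41])} \\
\frac{1.753 - 1.580}{1.753} &\approx 0.0987 \approx 9.87\% && \text{(Qin et al. [43])}
\end{align*}
```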
On the other hand, PSNR is the most commonly used measure of the quality of the reconstructed image. Generally, an acceptable PSNR in RDH-EI schemes is higher than 30 dB. PSNR is closely correlated with the embedding rate: the higher the embedding rate, the smaller the PSNR. From Table 7, our scheme scores a PSNR of 45.76 dB and outperforms Liu et al.'s scheme [40], which scores only 41.49 dB. Compared with Qin et al. [43], who achieve 46.53 dB, our scheme is slightly lower, by 1.68%; however, it enjoys a higher embedding rate than Qin et al. [43]. Therefore, on the four essential elements of performance benchmarking in RDH-EI schemes, i.e., embedding rate, reversibility, separability and image quality (PSNR), the proposed scheme has clear advantages over the recent comparative literature [27–32,36–41,43].
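For readers who wish to reproduce the image-quality figures, PSNR for 8-bit images is conventionally defined as $10 \log_{10}(255^2 / \mathrm{MSE})$, where MSE is the mean squared error between two images. The sketch below uses this standard definition; which pair of images the paper compares is not restated here, so the arguments are simply called `original` and `restored`, and the function name `psnr` is hypothetical.

```python
from math import inf, log10


def psnr(original, restored, peak=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel lists."""
    if len(original) != len(restored):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    # Identical images have zero error, i.e. an unbounded PSNR.
    return inf if mse == 0 else 10 * log10(peak ** 2 / mse)


if __name__ == "__main__":
    a = [163, 162, 168, 166]
    b = [163, 163, 168, 165]
    print(round(psnr(a, b), 2))
```

Because the scale is logarithmic, a gap of about 0.8 dB, as between 45.76 dB and 46.53 dB, corresponds to a modest difference in mean squared error.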

Complexity Analysis
In addition to comparing the embedding capacity, reversibility, separability and visual quality of the different methods, we also analyzed the computational complexity of recent VRBE methods during image encryption and data embedding. To avoid experimental bias, the time measurement was taken when the embedding rate reached its maximum. Generally, the VRBE method embeds additional data, AD, directly in the reserved space of the encrypted image, resulting in a faster data embedding process than the VRAE and RRBE methods. However, the different methods of generating encrypted images with redundant space may affect the computing efficiency, as summarized in Table 8.

Security Analysis
The security analysis of RDH-EI is closely related to the performance of the image histogram, correlation coefficient and PSNR. The results and discussion in Sections 5.2–5.4 showed that the proposed scheme achieves a high security level compared with recent RDH-EI schemes. This section focuses on analyzing the security of the carrier image $A$ and the additional data AD.
For a carrier image $A$ of size $m \times n$, the encrypted image generated after high-order bits compression, low-order bits recombination, vacancy filling and data embedding is still of size $m \times n$. However, the carrier image information can be determined only from the high-order bits compression and the low-order bits recombination, and the scale is $(b_{Num} + b_{Ele}) \times b_{count}$. The rest of the encryption process is unrelated to the carrier image. As $(b_{Num} + b_{Ele}) \times b_{count} < m \times n$, it is impossible to obtain all the information of the carrier image $A$ from so few pieces of information by a puzzle solver or brute-force attack. Moreover, the two-tuples coding completely breaks the structure and encoding mode of the original image pixels and can resist statistical analysis and differential analysis. Our scheme therefore has strong resistance and performs better than the VRAE and RRBE methods in securing the carrier image $A$.
For the security of the additional data, first, it is difficult for attackers to accurately obtain the starting position of the secret information. Second, the embedding positions of the additional data AD are randomly selected from the carrier's filling-bits part, which is a group of random numbers and has no direct relationship with the carrier image $A$. The content owner can accurately restore the carrier image $A$ and knows what the filling-bits part is. However, this does not help in extracting the additional data AD, as it is difficult to identify which bits in the filling-bits part belong to the additional data. In addition, the additional data AD can be encrypted before embedding, which further improves its security. The security analysis of the encrypted image $E(A)$ and the additional data AD, together with the performance analysis of the image histogram, correlation coefficient and PSNR in Sections 5.2–5.4, shows that our RDH-EI scheme enjoys a higher security level than recent RDH-EI works.

Conclusions
This paper presents an RDH-EI scheme with two-tuples coding that aims to improve the low embedding rate and security level of recent VRBE-based RDH-EI schemes. Our RDH-EI scheme consists of five processes: high-order bits compression with the two-tuples coding method, low-order bits combination, vacancy filling, data embedding, and pixel diffusion. The main advantages of our scheme are as follows. (i) The two-tuples coding method effectively compresses the high-order bits of the carrier image with an optimal compression ratio of 28.90%, resulting in a larger redundant space for embedding more additional data. The proposed two-tuples coding is not limited to the information hiding field; it is also an alternative method for lossless image compression. (ii) The proposed RDH-EI scheme improves the embedding rate without sacrificing image quality. Our scheme achieves the highest embedding rate, 1.753 bpp, with a PSNR of 45.76 dB, compared with recent VRBE-based RDH-EI schemes. (iii) The proposed two-tuples coding method achieves computing efficiency almost 50% better than recent VRBE-based RDH-EI schemes, and this result is comparable to sparse block coding. (iv) Our scheme enjoys a higher security level and can defend against differential analysis and statistical attacks. The correlation coefficients of the encrypted image in our scheme are mostly less than 0, e.g., −0.00024 in the horizontal direction of the encrypted image. The histograms and the appearance of the carrier image and its encrypted versions are distinct. The two-tuples coding method combined with pixel diffusion makes the histogram of the encrypted carrier image tend towards a uniform distribution. (v) The proposed RDH-EI scheme satisfies both the reversible and separable properties, in contrast to the VRAE and RRBE methods. Whereas the data hider in recent RDH-EI schemes still risks destroying the original carrier image when embedding additional data, in our scheme the content owner generates the embeddable space in advance; our scheme therefore has high generality for preserving image privacy in different fields and applications.
Future work will concentrate on optimizing the security parameters to further improve the algorithm efficiency. Subsequently, a prototype will be developed to support medical image processing and outsourced cloud storage applications.
Data Availability Statement: All data are presented in the main text.

Conflicts of Interest:
The authors declare no conflict of interest.