1. Introduction
Data hiding has a long history and has been used for information security for centuries. A review of recent work appeared in [1], which classified modern data-hiding techniques and cryptography as two distinct domains of information security and gave a comprehensive analysis of information-hiding techniques. An interactive buyer-seller watermarking protocol has been proposed to prevent invisibly watermarked copies [2], and a secure comparison protocol in the encrypted domain has also been proposed [3].
New techniques for hiding information in encrypted images have drawn much interest from researchers, who have extended their applications. Data hiding in encrypted images raises several distinct problems because the contents of the encrypted data cannot be used directly. Applications of data hiding in encrypted images have therefore been discussed in a number of papers [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23]. Using an encrypted image as the cover medium for concealing data is not only a relatively new research subject but also an interesting one, because it requires simultaneous consideration of cryptography and steganography, which have been developed independently. An encrypted image is protected when it is transmitted through a public channel, which increases the difficulty of analyzing the embedded data.
Among data-hiding technologies, reversibility is one of the major research topics, since an encrypted image should be recovered to the original image after the hidden data are extracted. Reversible data-hiding (RDH) systems are designed to embed data in a cover medium and to allow the original image to be recovered without distortion after extraction. In particular, for military and medical images, which are important applications of data hiding, the recovered image must contain no distortion caused by data hiding.
In RDH systems based on flipping least significant bits (LSBs), the embedding procedure first divides an encrypted image into blocks and then flips the LSBs of pixels in each block according to the embedded data. At the receiver, the spatial correlation of the natural digital image is calculated to estimate the embedded data. Zhang [4] first proposed an RDH system with LSB flipping and suggested a fluctuation function that uses the four neighboring pixels to measure the spatial correlation between pixels in each block. Hong’s system [5] calculated the horizontal and vertical fluctuations separately. Moreover, Hong [5] proposed a side-match technique in which the updated borders of neighboring blocks are used when calculating the correlation, whereas the marginal pixels of a block are not used in [4]. Modified fluctuation functions that depend on the position of a pixel in the block, composed of three different equations, were proposed in [6]. RDH systems using a new embedding pattern and multiple judgments were proposed in [8]. These RDH systems recover the original image perfectly if there is no error in the extracted data.
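The block-wise embedding and correlation-based extraction described above can be sketched as follows. This is a minimal illustration, not the method of [4] or [5] and not the proposed functions: the function names, the boolean `mask` that splits a block into two pixel sets, the default of w = 3 flipped LSBs and the exact form of the fluctuation measure (absolute difference between a pixel and the mean of its four neighbors) are all assumptions made for the example. The sketch operates on an already decrypted block, which is where the receiver evaluates spatial correlation.

```python
import numpy as np

def fluctuation(block):
    """Sum over interior pixels of |pixel - mean of its four neighbours|."""
    b = block.astype(np.float64)
    centre = b[1:-1, 1:-1]
    neigh = (b[:-2, 1:-1] + b[2:, 1:-1] + b[1:-1, :-2] + b[1:-1, 2:]) / 4.0
    return float(np.abs(centre - neigh).sum())

def embed_bit(block, mask, bit, w=3):
    """Flip the w LSBs of one pixel set: mask for bit 1, its complement for bit 0."""
    sel = mask if bit == 1 else ~mask
    out = block.copy()
    out[sel] ^= (1 << w) - 1
    return out

def extract_bit(marked, mask, w=3):
    """Undo both hypotheses and keep the smoother (lower-fluctuation) block."""
    h0 = embed_bit(marked, mask, 0, w)   # flipping twice restores the pixels
    h1 = embed_bit(marked, mask, 1, w)
    bit = 0 if fluctuation(h0) < fluctuation(h1) else 1
    return bit, (h0 if bit == 0 else h1)

# Hypothetical smooth 8 x 8 block and pseudo-random pixel split.
block = (100 + np.add.outer(np.arange(8), np.arange(8))).astype(np.uint8)
mask = np.random.default_rng(0).random((8, 8)) < 0.5

marked = embed_bit(block, mask, 1)
bit, recovered = extract_bit(marked, mask)
```

On a smooth block the correct hypothesis restores the original pixels and yields the lower fluctuation; on weakly correlated blocks the decision can fail, which is the kind of extraction error that an error-correcting code is meant to repair.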
RDH systems based on lossless data compression compress the data to create spare room in which the data hider can embed data. Research in this direction has aimed at efficient compression that reduces image distortion. In [7], a separable RDH scheme for encrypted images was proposed that compresses the encrypted data using a source coding scheme with side information, making data extraction independent of encryption. An efficient data compression method based on low-density parity-check codes was proposed, and a new data extraction method using side information was discussed in [9] to enhance rate-distortion performance. In [10,11], the recursive code construction for binary covers was improved, and it was shown that the construction can achieve the rate-distortion bound. To improve RDH schemes based on distortion matrices, a system that estimates the optimal transition probability matrix for a general distortion matrix was proposed in [12].
In RDH systems based on histogram shifting, a histogram is first built from the errors between the original pixels and their estimated values. The bins of this error histogram are then shifted, according to the embedded data and the error values, to create room for data embedding. In [13], an RDH scheme based on the histogram of the prediction vector quantization-compressed image was proposed, in which the index of the image was used for embedding. Similarly, Ma [14] proposed RDH systems for encrypted images by reserving room: the original image was divided into two partitions, and the LSBs of one partition were embedded into the other partition by a traditional RDH algorithm [18]. In [15], RDH codes were generated according to the theoretical expressions of RDH [19], which were determined by the differences between the original pixel values and the corresponding values estimated from the neighbors. An efficient RDH method that estimates pairwise prediction errors and modifies the 2D histogram of the prediction errors was proposed in [16]. Recently, Zhang [17] proposed a method combining histogram modification with lossless data compression using an entropy coder.
Recently, modified RDH systems have been proposed to achieve better performance. Data hiding using a zero-coefficient quantization table was proposed for JPEG images [20]. Hussain et al. [21] introduced a hybrid data-hiding method for digital images that combines right-most digit replacement (RMDR) with adaptive least significant bit (ALSB) embedding. Kumar et al. [22] proposed RDH based on prediction error and expansion using adjacent pixels. Hong et al. [23] proposed a new data-hiding technique for absolute moment block truncation coding (AMBTC) compressed images based on quantization level modification. An improved embedding pattern and a new measurement function were proposed for encrypted images with a high payload to enhance transmission performance [24]. A vacating-room-before-encryption scheme was proposed for encrypted images, in which the content owner creates room for embedding data in the cover image before encryption [25]. Qian et al. [26] proposed a novel RDH scheme for encrypted images that uses distributed source coding to protect the secrecy of the system. Xiao et al. [27] proposed a separable RDH technique for encrypted images based on pixel value ordering, in which homomorphic encryption is used for image encryption.
Reed–Solomon (RS) codes are named for their inventors, who published the codes in 1960. RS codes have good error-correction capability for burst errors since they are non-binary cyclic maximum distance separable codes. They have therefore been widely used in consumer electronics, data transmission technologies, broadcast systems, optical communications and image processing systems. Over the past decades, research on RS codes has advanced quickly. The Berlekamp–Massey (BM) algorithm [28] is an efficient hard-decision decoding algorithm for RS codes that uses a linear feedback shift register (LFSR) to determine the error locator polynomial of smallest degree; the error locations are then obtained by finding the roots of this polynomial. A new run-length-limited decoding algorithm that increases the performance of RS codes in a visible light communication system has also been proposed [29].
In this paper, efficient reversible data-hiding systems are proposed that use new fluctuation functions and rate-matched RS codes. The bits estimated by the fluctuation functions are more accurate because boundary pixels and the average distance are used in the calculation. With the help of the error-correcting capability of RS codes, the embedded message can be recovered correctly. To efficiently correct the errors caused by weak spatial correlation, RS codes decoded with the BM algorithm are considered. In the bit error rate (BER) results for three different images, the proposed system performed better than the referenced data-hiding systems. Peak signal-to-noise ratio (PSNR) performance is also shown in the experimental results. To check the performance enhancements, fundamental RDH systems are used as referenced systems, but the proposed fluctuation functions and coding schemes can also be applied, with modifications, to recently published RDH systems.
4. Experimental Results
In this paper, the three gray-scale Lena, Peppers and Jet images, shown in
Figure 3, are considered. The size of the three gray images is
. The range of the block size
s is from six to 38, which was already considered in [
4,
5]. The function
to flip
w LSBs is considered as
, which is shown in (
6).
To compare the proposed data-hiding systems with the referenced systems, the error pattern of the data-hiding system is shown in
Figure 4. For a Lena image and
, error positions of Zhang’s system [
4], Hong’s system [
5], the proposed system with fluctuation functions in (
15) and (
16) and the proposed system with the fluctuation function and RS(15,11) codes are shown in (a), (b), (c) and (d) in
Figure 4, respectively. The small black square in a large square in
Figure 4 denotes
error pixels in a Lena image. From
Figure 4a to
Figure 4c, the recovery performance of the system with the proposed fluctuation functions in (
15) and (
16) is better than that of Zhang’s system [
4] and Hong’s system [
5] for
s = 8 and the Lena image. Moreover, there is no error pattern in (d) of
Figure 4 when the proposed system with RS(15,11) codes is used.
To investigate how RS codes can help recover the error in RDH systems, an example for the proposed system with RS codewords is shown in
Figure 5.
Figure 5a is similar to
Figure 4c. One error in
Figure 5a is located at
. It is assumed that the codewords of RS(15,11) start from
and are assigned to the right sequentially. Then, the error at
belongs to the 11th codeword, which starts at
and ends at
. The gray rectangle in
Figure 5a denotes the 11th codeword, and the codewords are also shown in
Figure 5b. The codeword is composed of 15 symbols, which corresponds to 60 bits. A blue rectangle in
Figure 5b denotes a symbol of codewords. From
Figure 5b, one symbol of the codeword is in error, and RS(15,11) can correct up to two symbol errors. Therefore, the error pattern at
can be recovered by the RS code. Similarly, the other errors can be recovered with the help of RS codes.
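The codeword bookkeeping in this example can be sketched as follows. The sketch assumes that each block carries one codeword bit and that blocks are numbered in the order in which codewords are assigned; the function names and the zero-based indexing are illustrative and not taken from the paper. It does not decode anything: it only checks whether a given set of bit errors stays within the correction capability of every codeword.

```python
def rs_params(n, k):
    """Bits per symbol m (n = 2**m - 1) and correctable symbol errors t."""
    m = (n + 1).bit_length() - 1
    t = (n - k) // 2
    return m, t

def locate(bit_index, n, k):
    """Zero-based (codeword, symbol) position of a zero-based bit index."""
    m, _ = rs_params(n, k)
    bits_per_codeword = n * m
    codeword = bit_index // bits_per_codeword
    symbol = (bit_index % bits_per_codeword) // m
    return codeword, symbol

def correctable(error_bits, n, k):
    """True if no codeword contains more than t erroneous symbols."""
    _, t = rs_params(n, k)
    bad = {}
    for b in error_bits:
        cw, sym = locate(b, n, k)
        bad.setdefault(cw, set()).add(sym)
    return all(len(symbols) <= t for symbols in bad.values())

m, t = rs_params(15, 11)        # 4 bits per symbol, t = 2
cw, sym = locate(605, 15, 11)   # hypothetical error position in bit 605
```

For RS(15,11) this gives 15 x 4 = 60 bits per codeword, so a single bit error corrupts one symbol of one codeword and is within the two-symbol limit, while three errors falling in three different symbols of the same codeword are not.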
In
Figure 6, the BER performances of the referenced systems and proposed system without RS codes are shown according to the size of
s when the Lena, Peppers and Jet images in
Figure 3 are considered. The ‘Ref. Zhang [
4]’ and ‘Ref. Hong [
5]’ in these figures denote BERs for Zhang’s scheme in [
4] and Hong’s scheme in [
5], respectively. The ‘Pro. Fluc.’ in these figures stands for BERs for the proposed fluctuation functions in (
15) and (
16). In
Figure 6, as block size
s increases, the BERs of the two referenced systems and the proposed system without RS codes improve. The BER performances of the two referenced systems and the proposed system without RS codes in
Figure 6a,b are better than BER performances in
Figure 6c, since the Jet image is weakly correlated. In
Figure 6a,b, BER performances of the proposed fluctuation function are always better than the referenced systems. It is shown that the proposed fluctuation functions are effective for the RDH system without the help of RS codes. In
Figure 6c, the BER of the proposed system without RS codes is the same as or better than the BER of Zhang’s system. For small values of
s, the BERs of the three systems do not decrease monotonically, because the fluctuation function is less reliable in a weakly correlated image. For some values of
s, the BER of Zhang’s system is better than that of the proposed system without RS codes, since the embedding process is different and the fluctuation functions in Zhang’s system are based on the difference of neighboring pixels.
To investigate the performance of the proposed systems with RS codes, four RS codes are used in this simulation: RS(15, 11), RS(15, 7), RS(31, 23) and RS(31, 15). Their code rates are 0.73, 0.47, 0.74 and 0.48, respectively, i.e., approximately 3/4 or 1/2. The code lengths are 15 and 31. These codes make it possible to assess how performance changes with the length and rate of the RS code.
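The quoted code rates follow directly from k/n, and the number of correctable symbol errors from the standard relation t = (n - k)/2 for RS codes. The short check below is only an arithmetic illustration; the variable names are arbitrary.

```python
# (n, k) pairs of the four RS codes used in the simulation.
codes = [(15, 11), (15, 7), (31, 23), (31, 15)]

summary = []
for n, k in codes:
    rate = k / n          # code rate
    t = (n - k) // 2      # correctable symbol errors per codeword
    summary.append((n, k, round(rate, 2), t))

for n, k, rate, t in summary:
    print(f"RS({n},{k}): rate = {rate:.2f}, t = {t}")
```

The two high-rate codes correct two and four symbol errors per codeword, and the two low-rate codes correct four and eight, which is the trade-off between payload and robustness examined in the following results.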
In
Figure 7, BER performances of the referenced systems and the proposed systems are shown according to the size of
s when the Lena, Peppers, and Jet images in
Figure 3 are considered. The ‘Ref. Zhang [
4]’ and ‘Ref. Hong [
5]’ in these figures also denote BERs for Zhang’s scheme in [
4] and Hong’s scheme in [
5], respectively. The notations ‘Pro. RS(15, 11),’ ‘Pro. RS(15, 7),’ ‘Pro. RS(31, 23),’ and ‘Pro. RS(31, 15)’ represent the proposed systems with RS(15, 11), RS(15, 7), RS(31, 23) and RS(31, 15) codes, respectively. For most values of
s, the BER performances of the proposed systems with four RS codes are better than those of the referenced systems [
4,
5]. In
Figure 7c, it is shown that the BER performance of RS(15, 11) fluctuates as
s increases, since the RS codes can correct up to two error symbols. For
s = 14, 18 and 20, there are fewer than three error symbols in every codeword of the RS(15, 11) code, and the BER is zero. In
Figure 7a,b, the BER of the proposed system is 0 for
, though the BER of the referenced systems [
4,
5] is non-zero. In
Figure 7c, BERs of the proposed systems with RS(15,7) and RS(31,15) codes are better than the two referenced systems. If
s is larger than or equal to 10, BERs of the two proposed systems are zero over the weakly correlated image. For high rate RS codes, such as RS(15,11) and RS(31, 23), BER performances in
Figure 7c turn around as
s increases, since the image is weakly correlated and the error-correcting capability of the codes is small.
Since the proposed systems use RS codes with code rates of about 0.75 and 0.5, their transmission efficiency for a given block size is lower than that of the referenced systems. Therefore, for a fixed BER, the effective length of the embedded message must be considered. For a fair comparison, the number of messages at BER = 0 is considered, where BER = 0 means that the embedded messages are extracted without error. The main results of the proposed systems and the referenced systems [
4,
5] are listed in
Table 1,
Table 2 and
Table 3 for BER = 0 when the Lena, Peppers, and Jet images are considered, respectively. The ‘Ref. Zhang [
4],’ ‘Ref. Hong [
5],’ and ‘Pro. RS(15, 11), Pro. RS(15, 7), Pro. RS(31, 23), Pro. RS(31, 15),’ in
Table 1,
Table 2 and
Table 3 correspond to the referenced systems [
4,
5] and the proposed system with RS(15, 11), RS(15, 7), RS(31, 23) and RS(31, 15) codes, respectively. The ‘Rate’ in these tables denotes the code rates of the considered systems; since the referenced systems have no RS codes, their code rates are one. The ‘Min.
s’ in these tables stands for the minimum size of
s that guarantees BER = 0. The ‘No. messages’ in these tables denotes the number of actual embedded messages corresponding to minimum
s. The ‘Gain’ in these tables denotes the ratio of message length of the proposed systems to the message length of the referenced system and is written as a percentage. The ‘G1’ and ‘G2’ are gains when the referenced systems are those in [
4,
5], respectively.
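The relationship between block size, code rate and gain can be sketched as follows. The counting rule is an assumption made for illustration: one codeword bit per non-overlapping s x s block, with only whole codewords counted, which may differ from how the table entries were obtained. The image side of 240 pixels and the two block sizes are placeholder values, not the paper's image size or the minimum s values reported in the tables.

```python
def uncoded_message_bits(side, s):
    """One message bit per non-overlapping s x s block (code rate one)."""
    return (side // s) ** 2

def rs_message_bits(side, s, n, k):
    """Message bits when each block carries one bit of an RS(n, k) codeword."""
    blocks = (side // s) ** 2
    m = (n + 1).bit_length() - 1     # bits per symbol, n = 2**m - 1
    codewords = blocks // (n * m)    # only whole codewords are counted
    return codewords * k * m

def gain_percent(proposed_bits, reference_bits):
    """Ratio of message lengths, written as a percentage."""
    return 100.0 * proposed_bits / reference_bits

side = 240                                        # placeholder image side
proposed = rs_message_bits(side, 8, 15, 11)       # placeholder minimum s = 8
reference = uncoded_message_bits(side, 12)        # placeholder minimum s = 12
gain = gain_percent(proposed, reference)
```

The example shows why a coded system can carry more message bits than an uncoded one at BER = 0: the rate loss of the code is outweighed when error correction allows a sufficiently smaller block size.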
As can be seen in
Table 1,
Table 2 and
Table 3, the minimum
s of the proposed systems is always smaller than that of the referenced systems. The number of embedded messages in the proposed systems is also larger than that in the referenced systems. In
Table 1, when the Lena image is considered, the proposed systems with the RS(15, 11) and RS(31, 23) codes transmit about three times more efficiently than the referenced system [
4]. In
Table 3, the proposed systems with the RS(15, 7) and RS(31, 15) codes are more than twice as efficient as the referenced systems [
4,
5].
To verify the performance of image recovery, PSNR performances for the three images in
Figure 3 according to the embedding rate are shown in
Figure 8. The ‘Embedding rate’ in
Figure 8 represents the ratio of the number of embedded messages to the number of pixels in an embedded image. The ‘Inf.’ in
Figure 8 denotes infinite PSNR. Infinite PSNR cannot be illustrated in these figures; however, for convenience, it is located at 103.4 dB, the MSE of which corresponds to
(≈
). Since the pixels in the images are represented with eight bits, MAX in (
24) is 255. The ‘Ref. Zhang [
7], Dec.’ and ‘Ref. Zhang [
7], Rec.’ in
Figure 8 denote PSNRs in directly decrypted image and recovered image of Zhang’s scheme in [
7], respectively.
M and
S, which represent the number of LSBs of pixels and the number of hidden data per group in [
7], are considered as two and one, respectively. The PSNRs in directly decrypted images and in recovered images represent those before image recovery and after image recovery in [
7], respectively, and PSNR performances in directly decrypted images were shown in [
7].
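A PSNR computation of this kind can be sketched as below, assuming that the referenced PSNR equation is the usual definition 10 log10(MAX^2 / MSE) with MAX = 255 for 8-bit pixels. The function name and the tiny test image are illustrative only. A perfectly recovered image has MSE = 0, for which the sketch returns infinity; this is the case plotted as 'Inf.'.

```python
import math
import numpy as np

def psnr(original, recovered, max_val=255.0):
    """PSNR in dB between two images of equal shape; inf if they are identical."""
    diff = original.astype(np.float64) - recovered.astype(np.float64)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)

# Hypothetical 4 x 4 image with a single pixel off by one after recovery.
img = np.full((4, 4), 128, dtype=np.uint8)
damaged = img.copy()
damaged[0, 0] = 129

perfect = psnr(img, img)        # identical images
one_off = psnr(img, damaged)    # MSE = 1/16
```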
In
Figure 8, most embedding rates at which the proposed systems achieve infinite PSNR are higher than those of the referenced systems. In
Figure 8a, the proposed scheme with RS(31, 23) codes has an embedding range up to 0.02, which guarantees an infinite PSNR, while the referenced systems have an embedding range up to 0.007. Similarly, in
Figure 8b, the proposed scheme with RS(31, 15) codes has an embedding range up to 0.014 for infinite PSNR.