A Novel Image Tamper Detection and Self-Recovery Algorithm Based on Watermarking and Chaotic System

Abstract: With the development of image editing software, the content integrity and authenticity of original digital images have become increasingly important in digital content security. A novel image tamper detection and recovery algorithm based on digital watermarking and a chaotic system is proposed. It can effectively locate the tampered region and approximately recover the original image by using the hidden information. A pseudo-random cyclic chain generated by the chaotic system constructs the mapping relationship between the image subblocks. Because the pseudo-random chain has good ergodicity, it effectively guarantees the randomness of the positional relationship between the hidden information and the original image block. A recovery value optimization algorithm makes the recovery information represent the image block better. In addition to the traditional Level-1 recovery, a weight-adaptive algorithm is designed to distinguish the original block from the primary recovery block, allowing 3 × 3 neighbor block recovery to achieve better results. The experimental results show that the hierarchical tamper detection algorithm achieves higher detection precision. Under collage attacks and large general tampering, the scheme yields higher recovered image quality and better resistance.


Introduction
With the development of portable digital devices, such as cell phones and digital cameras, images can be acquired more conveniently. At the same time, image editing software has become more powerful. Authenticity and integrity are so important in digital content security that more and more researchers focus on this field. The watermarking technique, as one of the authentication methods, can verify authenticity and localize the tampered area effectively, and it can also recover the modified or tampered image.
Image self-recovery algorithms have three important parts: the authentication information, the recovery information, and the mapping function that embeds the authentication and recovery information into the image. The authentication information has to verify the authenticity of the received image effectively and localize the tampered area accurately. The mapping function embeds the information by modifying selected pixels, so it influences the quality of the image. Therefore, a better mapping function is a key part of the proposed algorithm. The goal of the modified algorithm is to improve the quality of the embedded image and the recovery accuracy of the tampered area.
Watermark-based image self-recovery algorithms can be classified into two types according to the embedding domain: spatial embedding and transform embedding. Spatial embedding methods modify the pixels directly in the spatial domain. They are simple and effective. For example, the authors of [1] proposed a dual watermark that authenticates the tampered image and achieves good recovery performance even when the tamper ratio is large. The image is divided into non-overlapping blocks of 2 × 2 pixels, and the average value of each block is used to construct the recovery information, which is embedded into the two LSB planes. The scheme in [2] uses a parity check and a comparison between average intensities, and a hierarchical structure is used to detect the tampered area. The authors obtained good recovery performance even at a high tamper ratio. To lower the risk of making an incorrect prediction, the method in [3] produces parity check bits from pixels whose bits have been rearranged. The Hamming code is used to construct the authentication information. To improve the security of those algorithms, the Arnold transform [4][5][6] is applied to map the relationship of the blocks.
The algorithms in the transform domain first map the image into another domain, such as the discrete wavelet transform (DWT) [5,7,8], the discrete cosine transform (DCT) [9][10][11], or the lifting wavelet transform (LWT) [6]. Owing to the characteristics of the transform domain, the authentication and recovery information are generated from the transform coefficients. In [12], the index value of Vector Quantization (VQ) is used as the recovery information. This method can construct recovery information better. However, the index is needed in the watermark extraction procedure, which increases the amount of extra information.
To improve the recovery performance and take advantage of spatial embedding, a novel image self-recovery algorithm based on watermarking is proposed. The contributions of the algorithm are as follows:
(1) The pseudo-random cyclic chain is used to construct the mapping relationship between the image subblocks. The pseudo-random cyclic chain is generated by the chaotic system, and the key space is large.
(2) The recovery information better reflects the content of the image blocks because it is constructed by the recovery value optimization algorithm, and the hierarchical tamper detection algorithm achieves higher detection precision.
(3) A modified smooth function is used to improve the recovery performance, and it resolves the overflow problem.
The paper is organized as follows. Section 2 presents the mapping mechanism between the subblocks, named the Double Pseudo-random Chain, which is generated by the chaotic system. Section 3 describes the proposed method. Section 4 reports the performance of the proposed framework on test images. Finally, Section 5 concludes the paper and discusses some possible future work.

Double Pseudo-Random Chain
Logistic mapping is a classic model for studying the behavior of complex systems such as dynamic systems, chaos, and fractals. Because of its simplicity and randomness, it is widely used in watermark embedding technology [13]. It is defined as:
$$x_{n+1} = \lambda x_n (1 - x_n) \qquad (1)$$
where $n$ is the iteration index of the chaotic sequence and $\lambda$ is the control parameter of the logistic mapping system. For any $x_0 \in (0, 1)$, the logistic mapping has chaotic characteristics when $3.99 < \lambda < 4.0$ [14,15], and it constitutes the pseudo-random chain we need. The double pseudo-random chain consists of two chaotic sequences with different $x_0$, and it has the following characteristics [16]:
(1) The mapping relationship of the double pseudo-random chain is one-to-one and will not be repeated.
(2) The mapping relationship of the double pseudo-random chain is uniquely determined by the initial parameters, and different parameters will produce different sequences, thus ensuring that the mapping sequences are uncorrelated.
(3) The neighbor elements of the double pseudo-random chain are far away from each other.
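As an illustration of how a logistic sequence can be turned into a one-to-one block index chain, the following minimal Python sketch iterates the logistic map and ranks the resulting values. The ranking step (`argsort`), the burn-in length, and the specific initial values 0.3 and 0.6 are assumptions made for this example; the paper does not state how the chain is derived from the chaotic sequence.

```python
import numpy as np

def logistic_sequence(x0, lam, n, burn_in=100):
    """Iterate x_{k+1} = lam * x_k * (1 - x_k) and return n values.

    burn_in discards the first iterations (an assumption of this sketch).
    """
    x = x0
    for _ in range(burn_in):
        x = lam * x * (1.0 - x)
    out = np.empty(n)
    for k in range(n):
        x = lam * x * (1.0 - x)
        out[k] = x
    return out

def pseudo_random_chain(x0, lam, n_blocks):
    """Return a permutation of 0..n_blocks-1 obtained by ranking the
    chaotic values (hypothetical construction, not taken from the paper)."""
    seq = logistic_sequence(x0, lam, n_blocks)
    return np.argsort(seq, kind="stable")

# A 256 x 256 image split into 2 x 2 blocks has 128 * 128 blocks.
n_blocks = 128 * 128
chain_a = pseudo_random_chain(0.3, 3.991, n_blocks)
chain_b = pseudo_random_chain(0.6, 3.991, n_blocks)
```

Because `argsort` always returns a permutation, each block index appears exactly once in each chain, which matches characteristic (1); the two initial values play the role of the secret key.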

Proposed Scheme
The self-recovery watermarking scheme proposed in this paper includes five parts. The flow chart of the watermark generation and embedding procedure is shown in Figure 2, and the tamper detection and recovery procedure is shown in Figure 3. In the block mapping module, a new mapping mechanism based on the double pseudo-random chain is proposed, which embeds the information of the current image block into two different corresponding image blocks. The details of the mapping scheme are given in Section 3.1. In the watermark generation module, a new adaptive optimization algorithm is proposed to obtain the best recovery information that can replace the current block information, and the authentication information is generated from the recovery information and the current block index. The specific algorithm is given in Section 3.2. In the watermark embedding module, the watermark data are embedded according to the mapping sequence, and the improved smooth function is used to enhance the quality of the embedded image. The algorithm is described in more detail in Section 3.3. In the tamper detection module, building on the work in [1,5], a four-level detection strategy is proposed, which can effectively resist various types of attacks. The algorithm is presented in Section 3.4. In the image recovery module, in addition to recovering the image with the extracted recovery information, an adaptive secondary recovery strategy is proposed. The algorithm is detailed in Section 3.5.

A mapping sequence can be represented as $A \rightarrow B \rightarrow C \rightarrow \cdots$. That is, the recovery information of image block A is hidden in image block B, the recovery information of image block B is hidden in image block C, and so on. Thus, the most important requirement for the mapping sequence is to ensure that the recovery information survives under a large tampering ratio.
In [1,9,12], a 1-D transform algorithm was used to build the block mapping function. Although the sequence is random, the remainders of the elements in the same column are equal, so the performance under column tampering is poor. The Arnold transform [4][5][6] and the variant torus automorphism [3] are used to guarantee the randomness of the sequence, but they are not safe because both of them are periodic. Based on the Arnold transform, Tai and Liao [5] remapped any watermark block that was mapped to the vicinity of its original block, which improves the security of the watermarked image. However, the remapping mechanism is not effective enough. When the corner tampering rate exceeds 25%, 25% of the watermark is completely lost, which seriously affects the quality of the image. A modulus operation is used as the mapping sequence by Hamid and Wang [10], but it has the same problem as the methods in [1,12].
To solve the above problems, a double pseudo-random chain is used to form the block mapping sequence. Suppose that $I_0$ is an 8-bit grayscale image of size $M \times N$, with $M, N \in 2^R$ ($R = 8$). We split the image into non-overlapping image blocks of size $m \times n$, as shown in Equation (2). Next, two different pseudo-random block index sequences, named $L'$ and $L''$, are generated by Equation (1) with different initial values. We embed the data of the first chain in the upper 5 bits and the data of the second chain in the middle 5 bits.
Figure 4 shows example values of the mapping sequences. For example, the recovery information of block $l'_0$ is embedded in the high 5 bits of block $l_0$, and the recovery information of block $l''_0$ is embedded in the middle 5 bits of block $l_0$. Here, $l_0$, $l'_0$, and $l''_0$ are elements of $L$, $L'$, and $L''$, and they are the labels of the divided 2 × 2 image blocks.
The Pearson linear correlation coefficient (LCC) is widely used in many fields; it is an effective measure of the linear correlation between two variables [17]. The larger the correlation coefficient is, the greater the correlation and similarity between the original image and the transformed image. We use this measure to show that the pseudo-random chain is more random and effective than other methods, such as the Arnold transform. For two sequences $X = \{x_i\}$ and $Y = \{y_i\}$ with means $\bar{x}$ and $\bar{y}$, it is defined as:
$$\mathrm{LCC}(X, Y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$
Figure 5 shows the LCC of the Arnold transform and of the double pseudo-random chain. In Figure 5a, the two parameters of the standard Arnold transform are a = 1 and b = 1 [18], and the abscissa is the number of times the transform is applied: 1 means the transform is applied once, 2 means it is applied again to the result of 1, and so on. In Figure 5b, the parameter of the pseudo-random chain is λ = 3.991, and the abscissa is the initial value, ranging from 0 to 1. Although the abscissas of the two subfigures are not the same, both represent a traversal of all the conditions under the given parameters and can be used for comparison.
As we can see, the mapping sequence generated by the Arnold transform has a strong correlation, and some of the correlation coefficients even reach 0.3, while the correlation coefficient of the pseudo-random chain is small, close to 0. This indicates that the pseudo-random chain has better randomness than the Arnold transform and is more suitable as a mapping sequence.
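The correlation measurement above can be sketched in a few lines of Python. This example only illustrates the Pearson formula on block index sequences; the random permutation below is a stand-in for a mapping sequence and is an assumption of the sketch, not the chain or the Arnold transform evaluated in Figure 5.

```python
import numpy as np

def pearson_lcc(a, b):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    da = a - a.mean()
    db = b - b.mean()
    return float((da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum()))

# Natural block order versus a shuffled order (stand-in for a mapping sequence).
idx = np.arange(1000)
rng = np.random.default_rng(0)
shuffled = rng.permutation(idx)

lcc_identity = pearson_lcc(idx, idx)       # perfectly correlated mapping
lcc_shuffled = pearson_lcc(idx, shuffled)  # expected to be close to 0
```

A mapping that leaves neighboring blocks near each other gives a coefficient far from 0, which is the behavior the comparison penalizes.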

Watermark Generation
The watermarks of 8 × 8 and 4 × 4 image blocks are based on DWT and DCT [5,10]. Those methods offer more embeddable information and higher embedded image quality. However, the block effect is obvious in this case, the detection of a small tamper ratio has obvious errors, and the false alarm rate is high. To solve this problem, the image is divided into 2 × 2 non-overlapping blocks. However, such an image block can be embedded with only a limited number of information bits, and the maximum capacity is only 12 bits. In existing work [1,4,19], the recovery information is based on the most significant 5 bits of the average of the 2 × 2 image block, which works well when smooth image blocks are tampered. However, this method performs very poorly for texture block recovery. A variable-length recovery information construction scheme was proposed by Chen [20], which uses 12 bits for texture blocks and 6 bits for smooth blocks. Although it can represent the texture information better, it affects the embedded information bits, which makes tamper detection difficult. To address this problem, a pixel value adaptive optimization algorithm is designed in this paper, which makes the value express the image block information better. Figure 6 shows the whole procedure. Assume that the recovery information of blocks $I'_{(i,j)}$ and $I''_{(i,j)}$ will be embedded in $I_{(i,j)}$ according to the mapping sequences $L'$ and $L''$. The four pixels in each image block are represented by $I'_{(i,j)} = \{p'_0, p'_1, p'_2, p'_3\}$ and $I''_{(i,j)} = \{p''_0, p''_1, p''_2, p''_3\}$. Since the recovery information is formed by the upper 5 bits of $p'_i$ or $p''_i$, its maximum value $p_{max}$ and minimum value $p_{min}$ can be obtained by Equations (5) and (6).

 
Let $R'$ be the recovery information. Then, $R' \in [p_{min}, p_{max}]$. To make the embedded value reflect the actual value of the image block better, we search this interval for the best embedded value $R'$ of the image block.
Then, the upper 5-bit recovery information $r'_i$ can be calculated by Equation (8). Similarly, the recovery information $r''_i$ of $I''_{(i,j)}$ can be obtained by Equation (9). Through the above method, we generate the 10-bit recovery information $r'_i$ and $r''_i$. At the same time, we use Equations (10) and (11) to generate the 2-bit authentication information $p$ and $v$. Here, ⊕ is the exclusive OR operator, ∼ is the bit inversion operator, and || is the bitwise concatenation operator for binary strings.
Finally, the watermark value $W_{ar}$ to be embedded in the image block $I_{(i,j)}$ is generated by concatenating the recovery and authentication bits. To simplify the following discussion, $W_{ar}$ is written as $W = \{w_0, w_1, \ldots, w_{11}\}$.
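The assembly of the 12-bit watermark can be sketched as follows. This is a minimal sketch under stated assumptions: the bit order (first-chain recovery bits, second-chain recovery bits, parity bit, inverted parity bit) is inferred from the Level-1 detection description in Section 3.4, the 5-bit recovery value is taken as the upper 5 bits of a given 8-bit value, and the recovery value optimization itself is not reproduced. The function names are hypothetical.

```python
def to_bits(value, width):
    """Return the `width` least significant bits of value, MSB first."""
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]

def build_watermark(recovery_value_1, recovery_value_2):
    """Build a 12-bit watermark from two 8-bit recovery values.

    Assumed layout: w0..w4 = upper 5 bits of the first value,
    w5..w9 = upper 5 bits of the second value,
    w10 = XOR of w0..w9, w11 = inverse of w10.
    """
    r1 = recovery_value_1 >> 3  # keep the upper 5 bits
    r2 = recovery_value_2 >> 3
    bits = to_bits(r1, 5) + to_bits(r2, 5)
    parity = 0
    for b in bits:
        parity ^= b
    return bits + [parity, 1 - parity]

W = build_watermark(200, 37)
```

With this layout, three watermark bits go into each of the four pixels of a 2 × 2 block, which uses the full 12-bit capacity mentioned above.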

Watermark Embedding
The details of the watermark embedding procedure are shown in Figure 7.
Step 1: Divide the whole $M \times N$ image into $(M \times N)/(m \times n)$ non-overlapping image blocks $I_{(i,j)}$ of size $m \times n$, and let $I_{(i,j)} = \{p_0, p_1, p_2, p_3\}$.
Step 2: Generate two unrelated pseudo-random chains $L'$ and $L''$ according to the method described in Section 3.1.
Step 3: Calculate the recovery information and the authentication information that need to be embedded, and obtain the whole watermark information $W$ for each image block according to Section 3.2.
Step 4: Embed $W$ into the corresponding image block. If the recovery information is embedded directly into the lower 3 bits of the original image, the quality is greatly affected. Lee [1] and Yang [12] put forward the smooth function, which substantially improves the quality of the embedded image. However, it also has some drawbacks, so we make some modifications to it; the modified function is Equation (14). In this equation, $\{w_{3i}, w_{3i+1}, w_{3i+2}\}$ are the values to be embedded, $p_i$ is the value of the original image, $v_i$ is the difference between $\{w_{3i}, w_{3i+1}, w_{3i+2}\}$ and the lower 3-bit value of the original image, and $wp_i$ is the pixel value of the embedded image generated by the smooth function. The principle of this function is to add or subtract 8 from the embedded pixel value, which leaves the lower 3 bits unchanged but reduces the difference between the embedded pixel value and the original pixel value. For instance, given $p_i = 232 = (11101000)_2$ and $\{w_{3i}, w_{3i+1}, w_{3i+2}\} = \{1, 1, 1\}$, we have $v_i = (111)_2 - (000)_2 = 7 - 0 = 7$. Direct embedding would give $(11101111)_2 = 239$. Using Equation (14), we obtain the embedded value $wp_i = 239 - 8 = 231 = (11100111)_2$. The gap between the original pixel value and the embedded pixel value is $|wp_i - p_i| = 1$. We get a better embedded pixel value without changing $\{w_{3i}, w_{3i+1}, w_{3i+2}\}$. Table 1 shows the effect of the smooth function on watermark embedding.
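The add-or-subtract-8 principle and the worked example can be sketched as follows. This is not the paper's Equation (14), whose exact form is not reproduced here; the threshold of 4 and the range checks that skip the adjustment when it would leave [0, 255] are assumptions chosen to match the stated behavior (smaller embedding error, no overflow). The function name is hypothetical.

```python
def smooth_embed(p, w):
    """Embed a 3-bit value w (0..7) into the 3 LSBs of pixel p (0..255).

    After direct replacement, shift the result by +/-8 when that brings it
    closer to p; the shift leaves the 3 LSBs untouched. Shifts that would
    leave the 8-bit range are skipped (assumed overflow handling).
    """
    direct = (p & 0xF8) | w   # replace the lower 3 bits with w
    v = w - (p & 0x07)        # difference of the lower 3-bit values
    if v > 4 and direct - 8 >= 0:
        return direct - 8
    if v < -4 and direct + 8 <= 255:
        return direct + 8
    return direct

# Worked example from the text: p = 232, bits {1, 1, 1}.
wp = smooth_embed(232, 0b111)
```

For the example, direct replacement gives 239 (error 7), while the adjusted value 231 has error 1 and still carries the bits 111.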

Hierarchical Tamper Detection
First, the tampered image is partitioned into non-overlapping 2 × 2 image blocks. The proposed tamper detection algorithm is described below.
Step 1: Use the same initial values $x_0$ as in Section 3.3 to generate the two pseudo-random chains $L'$ and $L''$.
Step 2: Use Equation (15) to extract the 12-bit watermark information $W^e = \{ew_0, ew_1, \ldots, ew_{11}\}$ of the current block $I_{(i,j)}$ from $wp_0$, $wp_1$, $wp_2$, and $wp_3$.
Step 3: Level-1 detection. According to Section 3.2, $W^e$ satisfies Equations (10) and (11) if the image block has not been tampered with. Thus, we compare $ew_{10}$ with $ew_0 \oplus ew_1 \oplus ew_2 \oplus ew_3 \oplus ew_4 \oplus ew_5 \oplus ew_6 \oplus ew_7 \oplus ew_8 \oplus ew_9$, and then compare $ew_{11}$ with $\sim ew_{10}$. If either comparison fails, the current block is marked as invalid.
Step 4: Level-2 detection. If the current block is detected as valid in the Level-1 detection, use Equation (16) to decode $W^e$ and obtain the recovery information.
According to Section 3.2, $R'_e$ and $R''_e$ satisfy Equations (8) and (9). Thus, if $R'_e = \lfloor R'/8 \rfloor \times 8$ and $R''_e = \lfloor R''/8 \rfloor \times 8$, the block is marked as valid; otherwise, it is marked as invalid.
Step 5: Level-3 detection. If the current block is marked as valid in the Level-2 detection, the current block is examined together with its 3 × 3 neighbor blocks. As shown in Figure 8, the neighbor blocks are divided into four groups of triples: (N, NE, E), (E, SE, S), (S, SW, W), and (W, NW, N). If any triple is invalid, the current block is marked as invalid [1].
Step 6: Level-4 detection. Consider the 3 × 3 neighbor blocks as shown in Figure 9. If there are 4 or more invalid blocks, the current block is marked as invalid.
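The Level-1 parity check and the Level-4 neighbor count above can be sketched as follows. This sketch assumes the extracted bits are ordered so that the two authentication bits come last, and it treats blocks outside the image border as valid; both are assumptions of the example. Level-2 and Level-3 are omitted, and the function names are hypothetical.

```python
import numpy as np

def level1_valid(ew):
    """Level-1 check on 12 extracted bits: ew[10] must equal the XOR of
    ew[0..9], and ew[11] must be the inverse of ew[10]."""
    parity = 0
    for b in ew[:10]:
        parity ^= b
    return ew[10] == parity and ew[11] == 1 - ew[10]

def level4(valid):
    """Level-4 pass on a 2-D boolean validity map (one entry per block).

    A block is marked invalid when 4 or more of its 8 neighbors are invalid.
    Out-of-image neighbors count as valid (assumed border handling).
    """
    valid = np.asarray(valid, dtype=bool)
    h, w = valid.shape
    padded = np.pad(~valid, 1, mode="constant", constant_values=False).astype(int)
    count = np.zeros((h, w), dtype=int)
    for di in range(3):
        for dj in range(3):
            if di == 1 and dj == 1:
                continue  # skip the block itself
            count += padded[di:di + h, dj:dj + w]
    return valid & (count < 4)

# Toy map: four invalid neighbors surround the (still valid) center block.
block_map = np.ones((5, 5), dtype=bool)
for pos in [(1, 1), (1, 2), (1, 3), (2, 1)]:
    block_map[pos] = False
refined = level4(block_map)
```

The neighbor-based levels catch blocks whose own parity bits happen to pass by chance, since two authentication bits alone leave a sizable chance of a false accept for a tampered block.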

Image Recovery
When the tamper detection is completed, the blocks marked as invalid should be recovered. A secondary recovery scheme is designed to obtain better recovered image quality. Suppose the block marked as invalid is $I_{(i,j)} = \{p_0, p_1, p_2, p_3\}$. Its recovery information is hidden in $I'_{(i,j)}$ and $I''_{(i,j)}$ according to $L'$ and $L''$.
The detailed procedure of Level-1 self-recovery is as follows:
Step 1: If $I'_{(i,j)}$ is a valid block, go to Step 2. If $I'_{(i,j)}$ is an invalid block, go to Step 3.
Step 2: Extract the watermark information from the lower three bit planes of block $I'_{(i,j)}$ and use Equation (??) to get the recovery value $I^r_{(i,j)}$. Then, go to Step 4.
Step 3: If $I''_{(i,j)}$ is an invalid block, $I_{(i,j)}$ is marked as an invalid block. If $I''_{(i,j)}$ is a valid block, extract the watermark information from the lower 3 bits of $I''_{(i,j)}$ and use Equation (18) to get the recovery value $I^r_{(i,j)}$, then go to Step 4.
Step 4: Use $I^r_{(i,j)}$ to recover the current invalid block $I_{(i,j)}$.
If $I'_{(i,j)}$ and $I''_{(i,j)}$ are both invalid blocks, the current tampered block cannot be recovered by extracting the recovery information. This situation is more likely to occur when the tampering ratio is high, so the design of the Level-2 recovery strategy is important for image recovery. In [1,5], the mean of the 3 × 3 neighboring blocks is used to recover the image. However, this method ignores the difference between an original block and a Level-1 recovery block: the pixel value of an original block is more accurate than that of a recovery block. In this paper, an adaptive weighted recovery algorithm is designed to resolve this problem.
The Level-2 recovery process is as follows:
Step 1: In the 3 × 3 neighborhood, let the number of original blocks be $n_o$, with pixel weight $\mu_o$ and pixel values $p_i$ ($1 \le i \le n_o$). Let the number of recovery blocks be $n_r$, with pixel weight $\mu_r$ and pixel values $p'_j$ ($1 \le j \le n_r$). The number of unrecovered blocks is $n_t$, and their pixel weight is 0. Let $k$ be the ratio of the original block weight to the recovery block weight; it is generally 1.5 or 2. Then $\mu_o$ and $\mu_r$ can be calculated by Equations (19) and (20):
$$n_o \mu_o + n_r \mu_r = 1, \qquad \mu_o = k \mu_r \qquad (19)$$
Solving Equation (19), $\mu_o$ and $\mu_r$ can be obtained:
$$\mu_o = \frac{k}{k n_o + n_r}, \qquad \mu_r = \frac{1}{k n_o + n_r} \qquad (20)$$
Then, the current invalid block pixel value can be calculated by:
$$I^r_{(i,j)} = \mu_o \sum_{i=1}^{n_o} p_i + \mu_r \sum_{j=1}^{n_r} p'_j \qquad (21)$$
We can prove $I^r_{(i,j)} \in [0, 255]$ as follows.
Assume $p_{max}$ is the maximum value of $p_i$ and $p'_j$, and $p_{min}$ is the minimum value of $p_i$ and $p'_j$; clearly $p_{max}, p_{min} \in [0, 255]$. Then, Equation (22) can be obtained from Equation (21):
$$I^r_{(i,j)} \le (n_o \mu_o + n_r \mu_r)\, p_{max} = p_{max} \le 255 \qquad (22)$$
Similarly, we can prove $I^r_{(i,j)} \ge p_{min} \ge 0$. Therefore, $I^r_{(i,j)} \in [0, 255]$, which guarantees that the modified pixel value does not overflow. The recovery procedure is shown in Figure 10. The brown blocks are image blocks recovered after tampering, and the purple blocks are original blocks. As can be seen in the figure, $n_r = 4$ and $n_o = 3$, and we set $k = 1.5$. Thus, the weights of the original block and the recovery block are $\mu_o = 1.5/8.5 \approx 0.18$ and $\mu_r = 1/8.5 \approx 0.12$. Finally, we can calculate $I^r_{(i,j)}$ according to Equation (21).
Step 2: Use $I^r_{(i,j)}$ to recover the invalid block $I_{(i,j)}$.
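The weight computation and the weighted recovery can be sketched as follows. This sketch follows the constraints stated in the text (the weights sum to 1 over the usable neighbors, and an original block weighs k times a recovery block); treating each neighbor as a single representative pixel value and rounding to the nearest integer are simplifying assumptions, and the function names are hypothetical.

```python
def recovery_weights(n_o, n_r, k=1.5):
    """Solve n_o * mu_o + n_r * mu_r = 1 with mu_o = k * mu_r."""
    denom = k * n_o + n_r
    if denom == 0:
        raise ValueError("no usable neighbor blocks")
    return k / denom, 1.0 / denom

def level2_recover(original_pixels, recovered_pixels, k=1.5):
    """Weighted mean of neighbor values; unrecovered neighbors are simply
    left out (weight 0)."""
    mu_o, mu_r = recovery_weights(len(original_pixels), len(recovered_pixels), k)
    value = mu_o * sum(original_pixels) + mu_r * sum(recovered_pixels)
    return int(round(value))

# Example from the text: 3 original blocks, 4 recovery blocks, k = 1.5.
mu_o, mu_r = recovery_weights(3, 4, 1.5)
```

Because the weights are non-negative and sum to 1, the result is a convex combination of the neighbor values and therefore cannot leave the range spanned by them.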

Experimental Results and Performance Analysis
We performed a series of analyses and simulations of the performance of the proposed scheme in tamper detection. The types of attacks include collage attacks and large general tampering. We also compared the performance with existing block-based approaches [1,2,5,9]. All simulations were run in MATLAB R2018b.

PSNR
Peak signal-to-noise ratio (PSNR) is widely used in the field of image processing. It measures the degree of deviation of the watermarked image or recovered image from the original image [21]:
$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}, \qquad \mathrm{MSE} = \frac{1}{M N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I^O_{(i,j)} - I^R_{(i,j)} \right)^2$$
where $I^O_{(i,j)}$ represents the pixel value of the original image and $I^R_{(i,j)}$ is the pixel value of the reconstructed image. Table 2 lists the PSNR of the watermarked images, which is about 2.82 dB higher than the theoretical value of 37.9 dB [21] owing to the addition of the smooth function.
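For reference, the PSNR of two 8-bit images can be computed with the short Python sketch below. It uses the standard definition with a peak value of 255; returning infinity for identical images is a convention chosen for this example.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    o = np.asarray(original, dtype=np.float64)
    r = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((o - r) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

# Toy check: every pixel differs by exactly 1, so MSE = 1.
a = np.zeros((4, 4), dtype=np.uint8)
b = np.ones((4, 4), dtype=np.uint8)
value = psnr(a, b)
```

Converting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce.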

Performance on Collage Attack
Two kinds of collage attack are considered. The first is to collage a block of the current image onto another block of the same image, and the second is to collage a block of one image onto the corresponding position of another image.

First Kind of Collage Attack
A 256 × 256 8-bit grayscale Lena image was used to simulate the first kind of collage attack. Figure 11 shows the details of the simulation. Figure 11a is the watermarked image of [1] with a PSNR of 40.72 dB. Figure 11b shows the result of the collage attack on the method in [1]; we paste the blocks at coordinates (47,41) to (210,124) onto the image blocks at coordinates (47,133) to (210,216). The theoretical tampering rate is 21.02%. Figure 11c shows the result of the tamper detection for the method in [1]. As shown in this figure, the method in [1] cannot detect the first kind of collage attack. Figure 11d is the result of image recovery from Figure 11b. Since the image tampering cannot be detected, the PSNR is only 17.26 dB. Figure 11a1 is the watermarked image of Tai [5] with a PSNR of 44.01 dB. Figure 11b1 is the collage image of Figure 11a1. In Figure 11c1, we can see that Tai's method can detect the first kind of collage attack. Figure 11d1 shows the recovered image, whose PSNR is 29.21 dB. The block effect is obvious in the red part. Figure 11a2 is the watermarked image of our scheme, with a PSNR of 40.71 dB. Figure 11b2 is the collage image; the ratio and location of tampering are the same as in Figure 11b. Figure 11c2 shows the results of tampering detection. We can detect the first kind of collage attack, and the detected tamper ratio is 21.02%, which is consistent with the theory. Figure 11d2 is the recovered image with a PSNR of 32.35 dB.
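The theoretical tampering rate quoted above can be checked with a small calculation. The sketch assumes that both corner coordinates of the pasted region are inclusive; under that assumption the region covers 164 × 84 pixels of the 256 × 256 image. The function name is hypothetical.

```python
def tamper_ratio(corner_a, corner_b, image_shape):
    """Fraction of the image covered by a rectangle with inclusive corners."""
    (x0, y0), (x1, y1) = corner_a, corner_b
    area = (abs(x1 - x0) + 1) * (abs(y1 - y0) + 1)
    return area / (image_shape[0] * image_shape[1])

# Region (47,133) to (210,216) in a 256 x 256 image.
ratio = tamper_ratio((47, 133), (210, 216), (256, 256))
percent = round(ratio * 100, 2)
```

The same region size is used for the second kind of collage attack, so the same ratio applies there.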

Second Kind of Collage Attack
We also collaged the 256 × 256 8-bit grayscale Lena image onto the 256 × 256 8-bit grayscale Baboon image to simulate the second kind of collage attack.
Figure 12a is the watermarked image of [1] with a PSNR of 40.74 dB. Figure 12b is the result of the second kind of collage attack on the method in [1]; we paste the Lena image blocks at coordinates (47,41) to (210,124) onto the Baboon image blocks at coordinates (47,133) to (210,216). The theoretical tampering ratio is 21.02%. Figure 12c is the result of the tamper detection for the method in [1], which cannot detect the collage attack. Figure 12d is the recovered image for Figure 12b, with a PSNR of 16.56 dB. Figure 12a1 is the watermarked image of Tai [5] with a PSNR of 44.02 dB. Figure 12b1 is the tampered image of Figure 12a1. As with the first kind of attack, the second kind of collage attack can be detected by Tai's scheme [5]. Figure 12d1 shows the recovered image, in which the block effect can be observed. Figure 12a2 is the watermarked image of the proposed scheme, with a PSNR of 40.71 dB. Figure 12b2 shows the collage image; the ratio and location of tampering are the same as in Figure 12b. Figure 12c2 shows the results of the tampering detection. We can detect the second kind of collage attack, and the detected tamper ratio is 21.02%, which is consistent with the theory. Figure 12d2 is the recovered image with a PSNR of 32.35 dB.
In summary, compared with the methods in [1,5], the proposed scheme is more resistant to the collage attack.

Performance on Large General Tampering
To evaluate the performance of the scheme under large general tampering, we tampered with the watermarked image at ratios from 0% to 95%. The test image is the 256 × 256 8-bit Lena image, and the results are shown in Table 3. In Table 3, we can see that, compared with the methods of Lin [2], Lee [1], and Tai [5], the proposed scheme performs better when the tamper ratio is 33-90%. The mapping of the double pseudo-random chain has better ergodicity and therefore performs better under general tampering. In contrast, the mapping sequences of Lin and Lee are not random. For example, the remainder of each column of Lee's mapping sequence is equal, which leads to better results at very low and extremely high tampering rates. Although the mapping sequence of Tai [5] is a better random sequence, it is also periodic. The 2-bit embedding method gives Tai's watermarked image a higher PSNR. However, the recovery information is embedded just once, so the secondary recovery scheme is not effective, and the performance is not as good as that of our method at a high tampering ratio. Overall, the proposed approach is more general and practical.
The previous paragraph mentioned the limitations of Lin's and Lee's methods. To illustrate this problem, we specifically simulated tampering of a large number of columns. The comparison results for column tampering of Peppers are shown in Figure 13. The column tampering test results for Lena, Baboon, Barbara, Peppers, and Cameraman are shown in Figure 14. In Figure 13, the proposed scheme detects the tampering more accurately, and the PSNR of the recovered image is higher. Furthermore, from the red pane in Figure 13d, we can see that Lee's scheme produces false rejections (a nonzero Probability of False Rejection, PFR) for Peppers because its smooth function has not been improved. We can see in Figure 14 that, for tampering of a large number of columns, the proposed scheme has better resistance than those of Lin and Lee. In addition, to test generality, five 8-bit grayscale images from the standard test set were used for a 0-95% central tampering simulation: Lena, Baboon, Barbara, Peppers, and Cameraman. The results are shown in Figure 15. In Figure 15, for different types of images with different characteristics, the PSNR of the restored images differs little, which demonstrates the versatility of our scheme. Moreover, as the tampering ratio increases from 0% to 95%, the PSNR of the recovered image decreases smoothly and stays between 18 dB and 40 dB, which demonstrates the effectiveness of the algorithm.

Conclusions
To authenticate the integrity of a digital image and locate the tampered area, a novel image self-recovery scheme based on the watermarking technique is proposed. The mapping relationship between the subblocks is constructed by the chaotic system, which improves the security of the algorithm. The authentication and recovery information are generated from the image block content. The optimization algorithm is used to find better recovery information, which improves the recovery performance. A weight-adaptive algorithm is proposed to assign different weights to the original block and the primary recovery block. Unlike the traditional Level-2 recovery scheme, it allows the 3 × 3 neighbor block recovery to achieve better results. Experiments and analyses on collage attacks, large general tampering, and column tampering show the better performance of this method.

Figure 1. The security of the pseudo-random cyclic chain: (a) The original Lena image; (b) The reordered Lena image.

Figure 3. The tamper detection and recovery procedure.

Figure 8. The four triples of the current block.

Figure 15. The PSNR of the recovered image relative to the tampered ratio.

Table 1. The comparison of "without smoothing function" and "with smoothing function".

Table 2. The PSNR (dB) of the watermarked images.
1indicates that no data are provided.

Table 3. The PSNR (dB) of the recovered image relative to the tampered size and location.