1. Introduction
With the growing number of people who own mobile equipment and the advanced transmission capability of high-speed networks, more and more images are transmitted over the Internet from mobile devices. However, the Internet is not a secure channel; sensitive images, such as those used by the military or in the medical field, need to be protected against illegal manipulation before they are transmitted over the Internet. Many techniques for protecting images have been proposed, such as cryptography, watermarking, and digital signatures.
Cryptography is a technique that constructs a protocol to prevent third parties from reading private messages. In this technique, the original medium becomes ciphertext after encryption, which, however, attracts the attention of third parties. Watermarking techniques ensure image integrity and authenticity, copyright, and ownership. These techniques prevent unauthorized users from accessing the image and preserve image integrity, meaning that the content is intact and authentic and the image is original. Watermarking is therefore better suited than cryptography to protecting image content. The concept of a digital watermark was first introduced in 1992 by Andrew and Charles [1]. Watermarking is a special form of steganography, in which secret information is concealed in a medium. Watermarks have been widely used in applications such as image authentication and ownership identification. When a watermarked image has been tampered with, the owner of the image can easily detect the modification.
According to where the watermark is concealed, watermarking schemes can be categorized into two groups: spatial domain and frequency domain. Spatial domain watermarking techniques operate on the image pixels directly. The simplest spatial domain method is Least Significant Bit (LSB) replacement, which conceals the secret message in a pixel by replacing its least significant bits. Well-known spatial watermarking schemes include difference expansion, quantization-based hiding, side-match hiding, LSB matching, and modulus function schemes. Difference expansion-based schemes expand the difference between two original pixels (or between a prediction value and the pixel) to embed the secret information. Quantization-based hiding schemes quantize the value range of a pixel into several subsets, each of which represents a different secret message; the scheme modifies the pixel so that it falls into the subset corresponding to the message to be hidden. Side-match techniques use neighboring pixels to determine the hiding length and the corresponding embedding strategy. Modulus function schemes use the index of each pixel as a weight to compute a modulus value that maps to the hidden message.
Mielikainen proposed the LSB matching method in 2006 [2]. The scheme applies two functions, an LSB function and a matching function, to embed two secret bits into two spatial pixels. Let ($A_L$, $A_R$) be two neighboring pixels and let $s_1$ and $s_2$ be two binary secret bits. First, the LSB function is applied to the pixel $A_L$ to obtain its lowest bit. Next, the scheme determines whether the lowest bit is equal to the secret bit $s_1$. If LSB($A_L$) = $s_1$, then $A_L$ does not need modification and is input directly into the matching function $F(A_L, A_R)$. Conversely, if LSB($A_L$) ≠ $s_1$, then $A_L$ is replaced by $A_L - 1$ before it is input into the matching function $F(A_L, A_R)$. The matching function is as follows:
$$F(A_L, A_R) = \mathrm{LSB}\left(\left\lfloor \frac{A_L}{2} \right\rfloor + A_R\right)$$
The scheme determines whether the value of $F(A_L, A_R)$ is equal to the secret bit $s_2$ and uses four modification rules to conceal the data:
Rule 1: If LSB($A_L$) = $s_1$ and $F(A_L, A_R)$ = $s_2$, the pixels $A_L$ and $A_R$ do not need modification.
Rule 2: If LSB($A_L$) = $s_1$ and $F(A_L, A_R)$ ≠ $s_2$, the pixel $A_L$ does not change, and $A'_R = A_R + 1$.
Rule 3: If LSB($A_L$) ≠ $s_1$ and $F(A_L - 1, A_R)$ = $s_2$, the pixel $A'_L = A_L - 1$, and $A_R$ does not change.
Rule 4: If LSB($A_L$) ≠ $s_1$ and $F(A_L - 1, A_R)$ ≠ $s_2$, the pixel $A'_L = A_L + 1$, and $A_R$ does not change.
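The four rules can be illustrated with a minimal Python sketch. It assumes the matching function of Mielikainen's method, F(A_L, A_R) = LSB(⌊A_L/2⌋ + A_R); the function names `lsb`, `match`, `embed_pair`, and `extract_pair` are hypothetical, and boundary pixels (0 and 255) are not handled.

```python
def lsb(x: int) -> int:
    """Return the least significant bit of x."""
    return x & 1


def match(a_l: int, a_r: int) -> int:
    """Assumed matching function F(A_L, A_R) = LSB(floor(A_L / 2) + A_R)."""
    return lsb(a_l // 2 + a_r)


def embed_pair(a_l: int, a_r: int, s1: int, s2: int) -> tuple:
    """Embed two secret bits (s1, s2) into the pixel pair (a_l, a_r)."""
    if lsb(a_l) == s1:
        if match(a_l, a_r) == s2:
            return a_l, a_r        # Rule 1: nothing changes
        return a_l, a_r + 1        # Rule 2: adjust the right pixel
    if match(a_l - 1, a_r) == s2:
        return a_l - 1, a_r        # Rule 3: decrement the left pixel
    return a_l + 1, a_r            # Rule 4: increment the left pixel


def extract_pair(a_l: int, a_r: int) -> tuple:
    """Recover the two secret bits from a stego pixel pair."""
    return lsb(a_l), match(a_l, a_r)


stego = embed_pair(10, 7, 1, 0)    # Rule 4 applies here
print(stego, extract_pair(*stego))  # (11, 7) (1, 0)
```

In every case at most one pixel of the pair changes, and only by one gray level, which is why the method distorts the image less than plain LSB replacement of both pixels.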
In 2015, Lyu et al. used a rehashing model to propose an image authentication scheme [3]. In this scheme, a cover image is divided into several 1 × 2-sized pixel pairs. Let ($A_L$, $A_R$) be a pixel pair and $i$ be the location of the pair, where $A_L$ is used to conceal the hash code of the pixel pair and $A_R$ is used to embed the recovery information. Suppose that the hash value of the location $i$ is $k$. The hash value is then concealed in the three least significant bits (LSBs) of $A_L$. The recovery information is the three most significant bits (MSBs) of the mean value of the pixel pair, which are embedded into the three LSBs of $A_R$. In the extraction and recovery process, the stego-image is also divided into several 1 × 2-sized pixel pairs. The authentication code and the recovery information are extracted from the first and the second pixels, respectively. The location $i$ of the pixel pair is used to look up the hash table and generate the hash value $k$. If the hash value is equal to the three LSBs of the first pixel $A'_L$, then the pixel has not been tampered with. Otherwise, the pixel has been tampered with; in this case, the recovery information is extracted from the three LSBs of $A'_R$ and is used to recover the original pixel.
A watermark concealed in the spatial domain can easily be removed by image processing operations. Hence, researchers conceal the watermark in the frequency domain to improve robustness. Frequency domain watermarking schemes first transform the pixel values into coefficients. Then, the secret message is concealed in the transformed coefficients. The popular transformation methods are the discrete cosine transform (DCT), the Fourier transform, and the discrete wavelet transform (DWT). Yu et al. proposed a watermarking scheme for image authenticity in 2015 [4]. The authors divided a host image into several 8 × 8-sized blocks and used DCT to decompose each block. They used the block content and variances to generate a watermark and concealed it in the first ten coefficients of a low-frequency band. Qi et al. applied singular value decomposition (SVD) to form a watermarking scheme for image content authentication and tamper localization [5]. They used singular values to generate a watermark and concealed it in the wavelet coefficients of the image; the scheme thus combines SVD and wavelet operations. In 2014, Al-Otum proposed a watermarking technique to verify authenticity and localize the tampered area [6]. He used DWT to decompose a cover image and concealed a secret message in the low-frequency second-level DWT coefficients.
In 2016, Huang et al. proposed a reversible hiding scheme in JPEG images [7]. In their scheme, a cover image is divided into several 8 × 8-sized blocks. Each block is transformed by DCT to generate coefficients. The coefficients are quantized with a quantization table to get the quantized coefficients. The secret message is then concealed in the alternating-current (AC) quantized coefficients with values 1 and −1. The zero AC coefficients remain unchanged. Other AC coefficients are shifted to prevent collision. Furthermore, in their scheme, a threshold $T_z$ is used to determine whether the block is embeddable or not. If the number of AC coefficients with a magnitude of 1 is less than the threshold, then the block is non-embeddable. For a non-embeddable block, the scheme does not change the coefficients, in order to maintain the image quality. For an embeddable block, the coefficients are modified by
$$C'_i = \begin{cases} C_i + \mathrm{sign}(C_i)\, b, & |C_i| = 1 \\ C_i + \mathrm{sign}(C_i), & |C_i| > 1 \\ C_i, & C_i = 0 \end{cases}$$
where $C_i$ is the $i$th coefficient, $C'_i$ is the stego-coefficient, and $b$ is the secret bit. The function $\mathrm{sign}(\cdot)$ returns the sign of the input value. If the coefficient is equal to 1 or −1, then the secret bit is concealed in the coefficient by $C_i + \mathrm{sign}(C_i)\, b$. On the contrary, if the coefficient is nonzero and not equal to 1 or −1, it is shifted by $C_i + \mathrm{sign}(C_i)$.
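The embedding rule for an embeddable block can be sketched in Python as follows. This is an illustrative reading of the description above, not the authors' implementation: coefficients of magnitude 1 carry one bit each, larger coefficients are shifted outward by one, and zeros are untouched. The names `embed_block` and `extract_block` are hypothetical, and the threshold test that selects embeddable blocks is omitted.

```python
def sign(x: int) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def embed_block(ac: list, bits: list) -> list:
    """Hide bits in the quantized AC coefficients of one block."""
    payload = list(bits)
    capacity = sum(1 for c in ac if abs(c) == 1)
    if len(payload) > capacity:
        raise ValueError("payload exceeds block capacity")
    payload += [0] * (capacity - len(payload))  # pad unused slots with 0
    stego, k = [], 0
    for c in ac:
        if abs(c) == 1:            # carrier coefficient: embed one bit
            stego.append(c + sign(c) * payload[k])
            k += 1
        elif abs(c) > 1:           # shift to keep magnitudes 1 and 2 unambiguous
            stego.append(c + sign(c))
        else:                      # zero coefficients stay unchanged
            stego.append(c)
    return stego


def extract_block(stego: list) -> tuple:
    """Return (bits, restored coefficients) from a stego block."""
    bits, ac = [], []
    for c in stego:
        if abs(c) in (1, 2):       # magnitude 1 -> bit 0, magnitude 2 -> bit 1
            bits.append(abs(c) - 1)
            ac.append(sign(c))
        elif abs(c) > 2:           # undo the shift
            ac.append(c - sign(c))
        else:
            ac.append(c)
    return bits, ac


print(embed_block([0, 1, -1, 3, -2, 0, 1], [1, 0, 1]))  # [0, 2, -1, 4, -3, 0, 2]
```

Because the shift moves every coefficient of magnitude 2 or more out of the way, extraction can restore the original coefficients exactly, which is what makes the scheme reversible.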
Most of the authentication schemes described above were designed for general-purpose computers, not mobile devices. For those who want to transmit images over the Internet using mobile devices, a traditional authentication scheme is not suitable, because mobile devices have limited computing power and battery life and cannot handle time-consuming hiding operations.
To solve this problem, this paper proposes an authentication scheme for mobile devices. The proposed scheme extracts image features from a cover image and conceals the features in the image for image authentication and tamper detection. The hiding operators are “add” and “subtract”. The feature extraction operation used in the proposed scheme is DCT, which is a common compression method used to generate JPEG images. The applied operations are lightweight and therefore suitable for mobile devices.
DCT represents a sequence of numbers as a sum of cosine functions oscillating at different frequencies. It uses cosine functions rather than sine functions because fewer cosine functions are needed to approximate the same signal. DCT has been widely used in many applications, such as lossy compression, image and audio processing, feature extraction, and pattern recognition. DCT is closely related to the discrete Fourier transform (DFT): the DCT coefficients correspond to the Fourier coefficients of a periodically and symmetrically extended array, so a DCT is equivalent to a DFT of roughly twice the length operating on real data with even symmetry. The DCT is a symmetric transformation, which allows the transformation array to be precomputed offline and applied in a mobile environment, thereby increasing computational efficiency [8].
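To illustrate the precomputation idea, the following Python sketch builds the transformation array once for a given block size and reuses it for every block. It assumes the orthonormal DCT-II normalization, which the text does not specify; the helper names `dct_matrix`, `dct2`, and `idct2` are hypothetical.

```python
import math


def dct_matrix(n: int) -> list:
    """Orthonormal DCT-II matrix C of size n x n (depends only on n)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a: list, b: list) -> list:
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: list) -> list:
    return [list(col) for col in zip(*a)]


def dct2(block: list, c: list) -> list:
    """2-D DCT of a square block: C * block * C^T."""
    return matmul(matmul(c, block), transpose(c))


def idct2(coef: list, c: list) -> list:
    """Inverse 2-D DCT: C^T * coef * C (C is orthogonal)."""
    return matmul(matmul(transpose(c), coef), c)


C4 = dct_matrix(4)                      # computed once, reused for all blocks
flat = [[100] * 4 for _ in range(4)]    # a constant 4 x 4 block
print(round(dct2(flat, C4)[0][0], 6))   # 400.0: all energy is in the DC term
```

With the matrix fixed in advance, transforming a block costs only two small matrix products, with no trigonometric evaluation at run time.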
3. Proposed Scheme
The diagram of the embedding process of the proposed scheme is shown in Figure 5. A cover image is divided into several blocks, and each block is transformed by DCT. A watermark is then constructed and embedded in each block pair to generate the stego-image.
After receiving the stego-image, the receiver uses the extraction and recovery process to determine whether the image has been tampered with. This process is shown in Figure 6. The stego-image is divided into several blocks, and the watermark is extracted from the stego-pixels to determine whether each block is valid. If a block is invalid, the extracted information is used to recover it.
The embedding and extraction process includes four major phases: authentication watermark generation, data embedding, tamper detection, and image recovery.
3.1. Authentication Watermark Generation Phase
The proposed scheme divides a cover image into several 4 × 4-sized blocks. Each block is randomly paired with another block to form a block pair. Let ($B_A$, $B_B$) be the block pair. Figure 7 shows the diagram of the image blocks of the proposed scheme. The proposed scheme generates an authentication watermark for the block pair ($B_A$, $B_B$) and embeds the watermark into the block pair.
The authentication watermark generation algorithm is shown below:
• Transform the blocks by DCT to obtain the coefficients DC and AC.
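The transformation equation for an $N \times N$ block $f(x, y)$ is:
$$F(u, v) = \frac{2}{N}\, C(u)\, C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right], \qquad C(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k > 0 \end{cases}$$
where $F(0, 0)$ is the DC coefficient and the remaining $F(u, v)$ are the AC coefficients.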
For example, Figure 8a,b are two example blocks. The corresponding coefficients of the blocks are shown in Figure 8c,d.
• Quantize the coefficients using a quantization table with a quality factor (QF) [6].
For example, a quantization table with QF = 50 is shown in Figure 9. The coefficients in Figure 8c,d after quantization using Figure 9 are shown in Figure 10a,b.
• Generate an authentication watermark using the DC values of $B_A$ and $B_B$.
The DC value is the most important of the coefficients. The proposed scheme uses the two DC values to generate the authentication watermark. The seven most significant bits of the DC values of $B_A$ and $B_B$ are extracted to form MSB_DCA and MSB_DCB, respectively. MSB_DCA and MSB_DCB are concatenated to generate the first fourteen authentication watermark bits. Then, the fourteen bits are XORed together to generate a debugging bit. Finally, the NOT operator is applied to the debugging bit to generate the testing bit.
For example, in Figure 10a, the DC values of $B_A$ and $B_B$ are 11 and 32, respectively. The seven MSBs of 11 are $(0000101)_2$, where $(11)_{10} = (00001011)_2$. Hence, MSB_DCA = 0000101, and MSB_DCB = 0010000. The debugging bit is 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 0 = 1, and the testing bit is its complement, ~1 = 0.
• Concatenate MSB_DCA, MSB_DCB, the debugging bit, and the testing bit to generate an authentication watermark (w), where w = MSB_DCA || MSB_DCB || debugging bit || testing bit.
Following the same example, the authentication watermark is w = MSB_DCA || MSB_DCB || debugging bit || testing bit = 0000101||0010000||1||0 = 0000101001000010.
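The watermark generation steps can be sketched in a few lines of Python. The sketch assumes that each quantized DC value is a non-negative integer that fits in eight bits, as in the worked example (11 = 00001011); the function names `msb7` and `make_watermark` are hypothetical.

```python
def msb7(dc: int) -> str:
    """Seven most significant bits of an 8-bit value, as a bit string."""
    return format(dc & 0xFF, "08b")[:7]


def make_watermark(dc_a: int, dc_b: int) -> str:
    """Build the 16-bit authentication watermark for a block pair."""
    bits = msb7(dc_a) + msb7(dc_b)     # first fourteen watermark bits
    debugging = 0
    for b in bits:                     # XOR of the fourteen bits (parity)
        debugging ^= int(b)
    testing = debugging ^ 1            # NOT of the debugging bit
    return bits + str(debugging) + str(testing)


print(make_watermark(11, 32))  # 0000101001000010
```

The output reproduces the watermark of the worked example. Since the testing bit is always the complement of the debugging bit, the last two bits of a genuine watermark can only be 01 or 10.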
3.2. Authentication Watermark Embedding Phase
The authentication watermark is then concealed in each block of the block pair ($B_A$, $B_B$). The watermark is concealed twice in each block using Least Significant Bit (LSB) replacement. The proposed scheme separates each 4 × 4-sized block into two sub-blocks. Each 2 × 4-sized sub-block conceals one copy of the authentication watermark, so every pixel in a sub-block carries two watermark bits.
For example, Figure 11a shows the binary string of each pixel of $B_A$. The block $B_A$ is divided into two sub-blocks: the first is colored gray and the second is colored white. The watermark w = 0000101001000010 is then concealed in the binary strings using LSB replacement. Each pixel conceals two watermark bits to generate the stego-binary string. The watermark is concealed in both sub-blocks, so each sub-block carries one copy of the watermark. The results are shown in Figure 11c. Finally, the proposed scheme transforms each binary string into a decimal number to obtain the stego-pixels, as in Figure 11e. The watermark is also concealed in $B_B$ to obtain the stego-block $B'_B$, as in Figure 11f.
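A minimal Python sketch of this embedding step is given below. It assumes that the two sub-blocks are the top and bottom halves of the block and that pixels are visited in raster order; neither detail is stated in the text, and the function name `embed_watermark` is hypothetical.

```python
def embed_watermark(block: list, w: str) -> list:
    """Write the 16-bit watermark w into both 2 x 4 sub-blocks of a 4 x 4 block.

    Each pixel receives two watermark bits in its two least significant bits.
    """
    if len(w) != 16:
        raise ValueError("watermark must contain sixteen bits")
    flat = [p for row in block for p in row]   # raster order, 16 pixels
    stego = []
    for i, p in enumerate(flat):
        k = i % 8                              # pixel index inside its sub-block
        two_bits = int(w[2 * k:2 * k + 2], 2)  # bits 2k and 2k + 1 of w
        stego.append((p & 0xFC) | two_bits)    # replace the two LSBs
    return [stego[r * 4:(r + 1) * 4] for r in range(4)]


cover = [[255] * 4 for _ in range(4)]
for row in embed_watermark(cover, "0000101001000010"):
    print(row)
```

Replacing two LSBs changes a pixel by at most three gray levels, which bounds the embedding distortion regardless of the image content.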
3.3. Tamper Detection Phase
After receiving the stego-image, the receiver divides the stego-image into several 4 × 4-sized blocks. Then, the receiver follows the next four steps to extract the watermark and detect tampering.
- Step 1:
Watermark extraction and block detection
Divide the block into two 2 × 4-sized sub-blocks.
Extract two embedded bits from each stego-pixel to generate a sixteen-bit watermark for each sub-block. The watermark is w′ = MSB_DCA || MSB_DCB || debugging bit || testing bit.
Compare the two extracted watermarks. If the watermark of the first sub-block is different from that of the second sub-block, then the block is suspicious.
- Step 2:
Sub-block detection for suspicious blocks
Apply the XOR operator to the first fourteen bits of each watermark to recompute the debugging bit.
Compare the recomputed debugging bit with the 15th bit of the watermark. If they are not equal, then the block is invalid.
Compare the complement of the recomputed debugging bit with the 16th bit of the watermark. If they are not equal, then the block is invalid.
- Step 3:
Corresponding block detection
Transform the blocks using DCT to obtain the DC values DCA and DCB from the block pair ($B'_A$, $B'_B$).
Extract the seven most significant bits from DCA and DCB, called MSB_DC*A and MSB_DC*B.
If MSB_DC*A = MSB_DCA or MSB_DC*B = MSB_DCB, then the block is valid. Otherwise, the block is invalid.
- Step 4:
Closed neighboring detection for the valid block.
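Steps 1 and 2 can be sketched in Python as follows. The sketch assumes raster-order extraction with the top and bottom halves of the block as the two sub-blocks, and the status labels "consistent", "suspicious", and "invalid" are hypothetical names chosen for illustration. Steps 3 and 4, which need the DCT of the block pair and the neighboring blocks, are not covered.

```python
def extract_watermarks(block: list) -> tuple:
    """Read two LSBs per pixel; return one 16-bit string per 2 x 4 sub-block."""
    flat = [p for row in block for p in row]
    bits = "".join(format(p & 0b11, "02b") for p in flat)
    return bits[:16], bits[16:]


def parity_ok(w: str) -> bool:
    """Check the debugging bit (15th) and the testing bit (16th) of w."""
    debugging = 0
    for b in w[:14]:
        debugging ^= int(b)
    return int(w[14]) == debugging and int(w[15]) == debugging ^ 1


def check_block(block: list) -> str:
    """Classify a 4 x 4 stego block using Steps 1 and 2 only."""
    w1, w2 = extract_watermarks(block)
    if w1 == w2:
        return "consistent"        # Step 1: both copies agree
    if parity_ok(w1) and parity_ok(w2):
        return "suspicious"        # copies differ, but both pass Step 2
    return "invalid"               # at least one copy fails Step 2


stego = [[252, 252, 254, 254], [253, 252, 252, 254],
         [252, 252, 254, 254], [253, 252, 252, 254]]
print(extract_watermarks(stego)[0], check_block(stego))
```

A block that passes these checks is still subject to the DC comparison of Step 3, since a tampered block could in principle carry two identical, self-consistent watermark copies.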
3.4. Image Recovery Phase
The invalid blocks are recovered using two sequential steps:
- Step 1:
Corresponding block recovery
If both $B'_A$ and $B'_B$ are invalid, then the block cannot be recovered. The block is ignored, and the scheme continues to recover other blocks.
Otherwise, retrieve the sixteen bits w′ from the valid block.
Extract two seven-bit strings from w′: the 1st to the 7th bits form MSB_DCA, and the 8th to the 14th bits form MSB_DCB.
Randomly pad one bit (0 or 1) to the end of MSB_DCA and MSB_DCB to form the new DC values.
Replace the DC value of the invalid block with the new DC value and transform the coefficients into pixels using the inverse DCT.
Mark the block as valid.
- Step 2:
Neighboring blocks recovery
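The DC reconstruction in Step 1 can be sketched in Python as follows. It assumes eight-bit DC values, so that seven recovered bits plus one random padding bit form a complete value; the function name `recover_dc` is hypothetical, and the dequantization and inverse DCT that follow are omitted.

```python
import random


def recover_dc(w: str, rng=random) -> tuple:
    """Rebuild approximate DC values for a block pair from a valid watermark.

    w is the sixteen-bit watermark read from the valid block; the lost
    least significant bit of each DC value is filled in at random.
    """
    msb_a, msb_b = w[0:7], w[7:14]
    dc_a = int(msb_a + str(rng.randint(0, 1)), 2)
    dc_b = int(msb_b + str(rng.randint(0, 1)), 2)
    return dc_a, dc_b


print(recover_dc("0000101001000010"))  # one of (10|11, 32|33)
```

Because only the lowest bit is guessed, each recovered DC value differs from the original by at most one quantization level.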
4. Experimental Results
In this section, we demonstrate the effectiveness of the proposed scheme with several experimental results. Six 512 × 512-sized grayscale images were used to test the performance of the proposed scheme. The images were “Boat”, “Barbara”, “Mandrill”, “Pepper”, “Man”, and “Lena”, shown in Figure 12a–f, respectively. The scheme was implemented in MATLAB R2014b on a 64-bit Windows 10 operating system with an Intel® Core(TM) i7-6700 CPU @ 3.40 GHz and 16 GB of RAM.
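In order to measure the visual quality of the stego-image, the similarity between the cover image and the stego-image was calculated using the peak signal-to-noise ratio (PSNR), given by:
$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right)\ \mathrm{dB}$$
where dB refers to decibels, and MSE represents the mean squared error between the cover image and the stego-image, given by:
$$\mathrm{MSE} = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} \left(I_{ij} - I'_{ij}\right)^2$$
where $H$ and $W$ are the height and width of the image, and $I_{ij}$ and $I'_{ij}$ are the pixel values of the cover image and the stego-image at position $(i, j)$, respectively.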
The watermarked images are shown in Figure 13a–f, respectively, for each test image. The PSNR values of all test images are higher than 46 dB, so the distortion introduced by embedding is visually negligible.
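For reference, the quality measure can be computed with a short Python sketch. It assumes 8-bit grayscale images stored as nested lists and the usual definitions PSNR = 10·log10(255²/MSE) and MSE = mean squared pixel difference; the function names `mse` and `psnr` are hypothetical.

```python
import math


def mse(cover: list, stego: list) -> float:
    """Mean squared error between two equally sized grayscale images."""
    h, w = len(cover), len(cover[0])
    total = sum((cover[i][j] - stego[i][j]) ** 2
                for i in range(h) for j in range(w))
    return total / (h * w)


def psnr(cover: list, stego: list) -> float:
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf            # identical images
    return 10 * math.log10(255 ** 2 / error)


cover = [[100, 100], [100, 100]]
stego = [[101, 101], [101, 101]]   # every pixel off by one gray level
print(round(psnr(cover, stego), 2))  # 48.13
```

An image in which every pixel is off by exactly one gray level scores about 48.13 dB, which gives a sense of scale for the values reported in this section.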
In the first experiment, we cropped the central area of each watermarked image to generate the tampered image. The tampered images are shown in Figure 14a–f. The most important features of each watermarked image were cropped. Figure 15 displays the recovered images of Figure 14. The image quality of the recovered images is good, since the PSNR value of each image is higher than 37.5 dB.
In the second experiment, we blacked out 50% of each test image at various locations. Six different types of tampering distributions are shown in Figure 16a–c,g–i. The recovered images are shown in Figure 16d–f,j–l, respectively. The PSNR of each recovered image is higher than 36.2 dB.
In the third experiment, two kinds of malicious manipulations were applied to the test images Boat and Mandrill. The manipulated images are shown in Figure 17a,c, and their recovered images are shown in Figure 17b,d, respectively. The PSNR values of the recovered images are higher than 35 dB.
Table 1 shows the execution time of each phase of the proposed scheme. The average embedding time is 4.77 s, which comprises 1.31 s for DCT, 2.47 s for the XOR operation, and 1 s for the embedding operation. The average times of the extraction and recovery processes are 1.25 s and 1.4 s, respectively. Because the execution time is short, the proposed scheme is well suited to mobile devices.
The next experiment compared the performance of the proposed scheme with that of Lu’s scheme. Table 2 shows the image qualities of the stego-image and the recovered image for the proposed scheme and Lu’s scheme. In Table 2, the stego-image quality of the proposed scheme is 46.3 dB, which is higher than the 38.43 dB of Lu’s scheme. The other images show similar results. For the stego-image, the image quality of the proposed scheme is, on average, 7.8 dB higher than that of Lu’s scheme. For the recovered image, the image quality of the proposed scheme is, on average, 6 dB higher than that of Lu’s scheme.
The proposed scheme was developed not only in a personal computer (PC) environment but was also implemented on a mobile device, a 256 GB iPad Pro running iOS 11. The system was developed by MathWorks. Three large test images were used to test the performance of the proposed scheme: Lena (1252 × 1252), Jerry (2016 × 1512), and Shirly (4032 × 3024). Figure 18 shows these large test images. The experimental results for these test images are shown in Table 3. For the test image Lena, sized 1252 × 1252, the execution time on the PC is about 10 s and that on the iPad is about 11 s. For the test image Jerry, sized 2016 × 1512, the execution time on the PC is 19 s and that on the iPad is 27 s. For the largest test image, Shirly, sized 4032 × 3024, the execution time on the PC is 78 s and that on the iPad is 91 s. The PSNR values are the same on both devices. These results show that the proposed scheme performs well on the mobile device.