Complex Wavelet-Based Image Watermarking with the Human Visual Saliency Model

Imperceptibility and robustness are the two complementary, but fundamental requirements of any digital image watermarking method. To improve the invisibility and robustness of multiplicative image watermarking, a complex wavelet based watermarking algorithm is proposed by using human visual texture masking and a visual saliency model. First, image blocks with high entropy are selected as the watermark embedding space to achieve imperceptibility. Then, an adaptive multiplicative watermark embedding strength factor is designed by utilizing texture masking and visual saliency to enhance robustness. Furthermore, the complex wavelet coefficients of the low frequency sub-band are modeled by a Gaussian distribution, and a watermark decoding method is proposed based on the maximum likelihood criterion. Finally, the effectiveness of the watermarking is validated by using the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) through experiments. Simulation results demonstrate the invisibility of the proposed method and its strong robustness against various attacks, including additive noise, image filtering, JPEG compression, amplitude scaling, rotation, and combinational attacks.


Introduction
With the growing popularity of big data and multimedia applications, a large number of digital multimedia data are generated, transmitted, and distributed over the Internet every day. The security of these digital data is a pressing problem. An efficient solution is watermarking technology, which is mainly used for copyright protection, authentication, fingerprinting, etc. [1][2][3]. In general, the main idea of digital watermarking is to embed useful information in a host signal without affecting the perceptual quality of the host signal. For a watermarking method, the three indispensable, but conflicting requirements are robustness, invisibility, and capacity [1]. These requirements constrain one another and must be balanced jointly. For instance, when the imperceptibility of a watermark is improved, its robustness is typically reduced. Therefore, an ideal digital watermarking method should achieve a good balance among these three requirements.
To achieve the above goal, extensive watermarking methods have been proposed in recent years. These methods can be classified in different ways, e.g., spatial domain methods [4] and frequency domain methods [5][6][7][8][9], based on watermark embedding space. Depending on the manner of embedding, the method can be further categorized into additive [10], multiplicative [11,12], and quantization based methods [13,14]. In addition, the watermarking methods can be categorized as blind [11] and non-blind [15] ones based on watermark decoding.
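As a rough, self-contained illustration (not taken from any of the cited methods; all values are arbitrary), the three embedding manners can be sketched for a single host coefficient x and bit w:

```python
def additive_embed(x, w, alpha=0.1):
    """Additive rule: shift the coefficient by a signed step."""
    return x + alpha * (1 if w else -1)

def multiplicative_embed(x, w, alpha=0.1):
    """Multiplicative rule: scale the coefficient, so the distortion
    grows with |x| and large (textured) coefficients carry more watermark."""
    return x * (1 + alpha) if w else x * (1 - alpha)

def quantization_embed(x, w, step=0.5):
    """Quantization (QIM-style) rule: snap x onto the lattice of bit w."""
    offset = 0.0 if w == 0 else step / 2
    return round((x - offset) / step) * step + offset

x = 3.2
print(additive_embed(x, 1))        # x shifted up by alpha
print(multiplicative_embed(x, 1))  # x scaled by (1 + alpha)
print(quantization_embed(x, 1))    # nearest point of the w = 1 lattice
```

Note how the multiplicative rule's distortion scales with |x|, which is why such schemes are inherently dependent on image content.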
In terms of embedding region, most current watermarking methods focus on the frequency domain [6,7,16], because frequency domain watermarking algorithms are relatively more robust, invisible, and stable, especially the wavelet based watermarking [16]. The reason is that wavelet based watermarking has two obvious advantages. One of the advantages is that the wavelet transform fits well with the human visual system, which can be exploited in the design of an invisible watermarking [17][18][19][20][21]. The other advantage is that the wavelet transform has good multi-scale analytic characteristics, which can be used to develop a robust watermarking method [21][22][23][24]. Subsequently, many watermarking methods that use wavelets have been proposed in the past two decades. In terms of embedding method, the multiplicative watermarking methods are reportedly more robust, and they provide higher imperceptibility than the additive ones [25,26]. The multiplicative watermarking approaches are dependent on image content [25], and more importantly, they have strong robustness. Therefore, multiplicative watermarking methods are preferred for copyright protection [26]. For this reason, the multiplicative embedding approach is adopted in our study.
The wavelet transform has the advantages of the localization of the time frequency and multi-scale analysis, and it is suitable for describing the characteristics of 1D signals. However, when the signal dimension increases, the wavelet transform cannot sufficiently describe the singularity of the signal [27]. Therefore, to capture the direction information of 2D signals, multi-scale geometric analytic techniques for obtaining the intrinsic geometric structure information of images, such as contours and smooth curves, have emerged in recent years. These technologies include the ridgelets [15,28], wave atoms [29], contourlets [27,30], framelets [31], and dual tree-complex wavelet transform (DT-CWT) [32,33].
Current watermarking algorithms, such as the methods in [16,28,30], have achieved satisfying results; however, some problems remain to be solved. Akhaee et al. [16] proposed a robust scaling based watermarking with a multi-objective optimization approach. Although the balance of invisibility and robustness has been elaborately addressed by that method [16], the cost of multi-objective optimization is high, which hinders its extension to real applications. Despite the success of multi-scale geometric analysis technology in various image watermarking methods [9,30], the time cost of these methods is also high. As a result, designing a simple and effective digital watermarking method to balance robustness and imperceptibility is necessary.
In addressing the above issues, the quantization watermarking approach with the L1 norm function was proposed in our previous work [33]. This work achieved good imperceptibility, but the robustness of the watermark against some attacks remains insufficient. In the present study, a watermarking that can be easily implemented is developed, and the robustness of watermarking is boosted based on the visual perception model. In this manner, the fidelity of the image can be improved. The developed method can achieve a good balance between invisibility and robustness of watermarking, which is beneficial to the practical use of watermarking technology.
DT-CWT is regarded as an overcomplete transform, which creates redundant complex wavelet coefficients that can be utilized to embed watermarks. In general, shift invariance is the main feature of DT-CWT. We can use this property to produce a watermark that can be decoded even after the host signal has undergone geometric attacks, such as amplitude scaling and rotation. DT-CWT also has good directional selectivity. Therefore, we propose an image watermarking by using DT-CWT in this paper. First, we segment the original image and choose the image blocks with high entropy in this study. Second, we embed watermark data into the low frequency of the complex wavelet coefficients by a visual perceptual model, and we extract the watermark data by using the maximum likelihood estimator (MLE). Finally, we validate the effectiveness of the watermarking algorithm through experimental simulation.
The contributions of the proposed method are twofold. On the one hand, an adaptive watermark embedding method in terms of texture masking and the visual saliency model is developed, which embeds each watermark bit into a set of dual tree complex wavelet coefficients. Using this strategy, the robustness and imperceptibility of the watermark can be well balanced. On the other hand, the low frequency of complex coefficients with high entropy is selected as the watermark embedding space, which can improve the robustness of watermarking against some geometric attacks, such as rotation, scaling, and combinational attack.
The rest of the paper is structured as follows. Section 2 provides the basic concept of DT-CWT. Section 3 introduces the proposed watermarking method, including watermark embedding and watermark decoding. We test and discuss the performance of the proposed watermarking through experiments, and the corresponding findings are discussed in Section 4. The conclusion is presented in Section 5.

Dual Tree-Complex Wavelet Transform
DT-CWT, which was initially proposed by Selesnick, Baraniuk, and Kingsbury [32], inherits the characteristics of wavelets; it approximates shift invariance and has good directional selectivity [32]. At each decomposition scale, DT-CWT produces six directional sub-bands oriented at ±15°, ±45°, and ±75°, whereas an ordinary wavelet transform has only three directional sub-bands per scale, oriented at 0°, 90°, and 45°. A comparison of the impulse responses of these two wavelet transforms is shown in Figure 1. As mentioned above, DT-CWT can effectively approximate shift invariance, and this invariance can be used to design watermarking that counters geometric attacks. For instance, if an image block is re-sampled after scaling, then DT-CWT generates a set of coefficients that are roughly the same as those of the original patch, which enables the watermarking to withstand scaling attacks. Transforms such as the discrete wavelet transform (DWT), the discrete cosine transform (DCT), and the Fourier transform do not have this property.
For a 1D signal, the wavelet coefficients obtained by using the two filter trees are twice as many as those of the original wavelet transform. Furthermore, a 1D signal f(t) can be decomposed by the 1D DT-CWT with shifted and dilated versions of the mother wavelet ψ and scaling function φ [34], i.e.,
f(t) = Σ_{l∈Z} s_{j0,l} φ_{j0,l}(t) + Σ_{j≥j0} Σ_{l∈Z} c_{j,l} ψ_{j,l}(t),
where Z denotes the set of integers; j and l refer to the indices of dilations and shifts, respectively; s_{j0,l} denotes the scaling coefficient; and c_{j,l} = c^r_{j,l} + i·c^i_{j,l} is the complex wavelet transform coefficient, where r and i mark the real and imaginary parts, respectively. Figure 2 shows the calculation process of the real part and the imaginary part of DT-CWT. For tree a, filters h_0 and h_1 are used to compute the real part. For tree b, filters g_0 and g_1 are utilized to calculate the imaginary part.
As shown in Figure 2, the outputs of the two trees can be interpreted as the real part and the imaginary part of the complex wavelet coefficients. For a 2D signal, an image f(x, y) can be decomposed by the 2D DT-CWT [34], i.e.,
f(x, y) = Σ_{l} s_{j0,l} φ_{j0,l}(x, y) + Σ_{θ∈Θ} Σ_{j≥j0} Σ_{l} c^θ_{j,l} ψ^θ_{j,l}(x, y),
where θ ∈ Θ = {±15°, ±45°, ±75°} denotes the directionality of DT-CWT. At each decomposition scale, the DT-CWT decomposition of f(x, y) results in six complex valued high pass sub-bands, each corresponding to one unique direction θ.
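The near shift invariance that motivates DT-CWT can be illustrated with a toy 1D analogue: the magnitude of a complex (Gabor-like) filter response changes little under a one-sample shift, whereas the real part oscillates strongly. This is only a sketch of the principle, not the DT-CWT itself; the filter length, frequency, and test signal are arbitrary choices.

```python
import cmath
import math

def gabor_taps(n=15, freq=0.3, sigma=2.5):
    """Complex Gabor filter: Gaussian envelope times a complex exponential."""
    c = n // 2
    return [math.exp(-((k - c) ** 2) / (2 * sigma ** 2)) *
            cmath.exp(1j * 2 * math.pi * freq * (k - c)) for k in range(n)]

def convolve(signal, taps):
    """Valid-mode convolution (complex aware)."""
    m = len(taps)
    return [sum(signal[i + j] * taps[j] for j in range(m))
            for i in range(len(signal) - m + 1)]

sig = [math.sin(2 * math.pi * 0.3 * t) for t in range(64)]
shifted = sig[1:] + [0.0]                     # signal shifted by one sample

taps = gabor_taps()
coef = convolve(sig, taps)
coef_shift = convolve(shifted, taps)

# Largest per-position change (away from the borders) under the shift.
d_mag = max(abs(abs(coef[i]) - abs(coef_shift[i])) for i in range(10, 40))
d_real = max(abs(coef[i].real - coef_shift[i].real) for i in range(10, 40))
print(d_mag, d_real)   # magnitude change is far smaller than real-part change
```

In the same spirit, the magnitudes of DT-CWT coefficients vary slowly under small spatial shifts, which is the property the watermarking exploits.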

Watermark Embedding and Detection
According to the human visual perception model, human eyes are generally less sensitive to distortions in high entropy image blocks than in smooth ones, because relatively strong edges usually appear in high entropy blocks [35]. Inspired by [35], we propose an image watermarking method that uses high entropy blocks in this work. The block diagram of the proposed method, which consists of watermark encoding and watermark decoding, is illustrated in Figure 3. The main advantages of the proposed method are its simple implementation and the fact that the tradeoff between invisibility and robustness is resolved by a visual perceptual model. DT-CWT is also adopted in this work to embed the watermark information, which improves the robustness of the watermarking against geometric attacks.

Watermark Embedding
As shown in Figure 3a, the watermark embedding procedure involves the following steps:
Step 1: The original image is segmented into blocks, and the first blocks in descending order of estimated entropy (i.e., the highest-entropy blocks) are selected for watermarking purposes.
Step 2: DT-CWT is applied to each selected image block, and a single bit of "0" or "1" is embedded in each block by manipulating the complex wavelet coefficients of the low frequency sub-band as follows:
ỹ = (1 + α)·x for bit "1",   ỹ = (1 − α)·x for bit "0",
where x denotes the host coefficients of the low frequency sub-band; ỹ denotes the modified coefficients; and α is called the watermark strength factor, its value being determined by texture masking and visual saliency in Section 3.2.
Step 3: Repeat Step 2 for each image block.
Step 4: The inverse DT-CWT is applied to the watermarked blocks, and the watermarked blocks are combined with the non-watermarked blocks to obtain the whole watermarked image.
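A minimal sketch of Steps 1 and 2 above. To keep the example self-contained, the DT-CWT is replaced by a stand-in (the raw block values are treated as the coefficients to modify); the block contents, alpha, and the number of selected blocks are illustrative choices, not the paper's settings.

```python
import math

def shannon_entropy(block):
    """Entropy (in bits) of the value histogram of one block."""
    counts = {}
    for v in block:
        counts[v] = counts.get(v, 0) + 1
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def select_blocks(blocks, num_blocks):
    """Step 1: keep the indices of the highest-entropy blocks."""
    order = sorted(range(len(blocks)),
                   key=lambda i: shannon_entropy(blocks[i]), reverse=True)
    return order[:num_blocks]

def embed_bit(coeffs, bit, alpha):
    """Step 2: y = (1 + alpha) x for bit '1', y = (1 - alpha) x for bit '0'."""
    s = (1 + alpha) if bit else (1 - alpha)
    return [s * v for v in coeffs]

blocks = [[0] * 16,              # flat block: entropy 0 bits
          [0, 1, 2, 3] * 4,      # four-valued block: entropy 2 bits
          [5] * 8 + [9] * 8]     # two-valued block: entropy 1 bit
chosen = select_blocks(blocks, 2)
print(chosen)                    # indices of the two highest-entropy blocks
watermarked = embed_bit(blocks[chosen[0]], 1, alpha=0.05)
```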

Visual Saliency Based Watermark Strength Factor
The watermark strength factor α affects imperceptibility. To achieve watermark transparency, two important concepts, texture masking and visual saliency, are used to design the strength factor. The just noticeable difference (JND) threshold is often high in the textured regions of an image [36]; therefore, a larger strength factor can be selected to embed more information in such regions. In addition, the work in [37] studied a spread transform dither modulation (STDM) watermarking algorithm based on the visual saliency model and achieved good results. Furthermore, Wang et al. studied a JND estimation algorithm based on visual saliency in the wavelet domain [38], and the work in [39] utilized a JND scheme in designing a watermarking method. Besides this, in [40], an adaptive quantization watermarking algorithm was proposed; the term "adaptive" in [40] mainly describes a process, behavior, and/or system able to interact with its environment, whereas in our work, "adaptive" describes the embedding strength of the watermark. Accordingly, the concept of visual saliency [38,41,42] is used to develop an adaptive watermark strength factor in this work. The human eye is inclined to focus on prominent areas, so distortions are more easily hidden in areas far from the salient part of the image, and the watermark embedding strength can be increased there accordingly. The watermark strength factor is calculated as follows.
First, on the basis of the characteristics of texture masking, the high frequency energy of the i-th image block is calculated as the average of the sum of the energies of the six high frequency sub-bands, and E_HF denotes the average of this energy over all image blocks. The watermark strength factor should increase with increasing E_HF. Hence, the texture-masking portion of the watermark strength factor, α1, is computed from E_HF by employing the relationship proposed in [36], whose parameters a, c, and ξ are set to 1.023, 0.02, and 3.5 × 10^−5, respectively. According to [42], visual saliency is then exploited as follows: a saliency distance D is calculated for each image block, and the maximum saliency distance over all blocks is denoted by D_max. The visual saliency based strength factor α2 is then given by α2 = 1 + δ·D, where δ = 0.02/D_max.
In summary, the final watermark strength factor is computed as
α = α1 · α2 − α0,
where α0 denotes a positive constant that is subtracted to control the degree of image distortion after watermark embedding; α0 is set to 1.0 in this work. On the basis of the above analysis, texture masking and visual saliency jointly determine the strength factor, which adapts to changes in image texture and in the degree of saliency. In this manner, the embedding strength can be controlled more appropriately, thus further improving watermarking performance.
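The combination described above can be sketched as follows. The texture term alpha1 is treated as a given input here (its closed form comes from [36] and is not reproduced), the saliency term follows α2 = 1 + δ·D with δ = 0.02/D_max, and the final combination α = α1·α2 − α0 is a reconstruction from the text; all numeric values are illustrative.

```python
def saliency_factor(D, D_max):
    """alpha2 = 1 + delta * D, with delta = 0.02 / D_max."""
    delta = 0.02 / D_max
    return 1.0 + delta * D

def strength_factor(alpha1, D, D_max, alpha0=1.0):
    """Combine the texture and saliency terms; subtracting alpha0
    recenters the factor near zero to limit embedding distortion."""
    return alpha1 * saliency_factor(D, D_max) - alpha0

# A block far from the salient region (large D) receives a stronger factor.
print(strength_factor(alpha1=1.005, D=2.0, D_max=4.0))  # larger alpha
print(strength_factor(alpha1=1.005, D=0.5, D_max=4.0))  # smaller alpha
```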

Watermark Detection
The effect of attacks at the receiver can simply be modeled as additive white Gaussian noise (AWGN) [16]. Furthermore, the complex wavelet coefficients of the low frequency sub-band can be modeled by a Gaussian distribution. The distributions of a received coefficient y given watermark bit "1" or "0" can therefore be represented as
f(y|1) = (1/√(2π σ²_y|1)) exp(−y²/(2σ²_y|1)),   f(y|0) = (1/√(2π σ²_y|0)) exp(−y²/(2σ²_y|0)),
where σ²_y|1 = (1 + α)² σ² + σ²_n, σ²_y|0 = (1 − α)² σ² + σ²_n, σ² is the variance of the host coefficients, and σ²_n is the variance of the noise in the related sub-band coefficients.
Given the N coefficients y_1, …, y_N of a block, the maximum likelihood decoder compares the likelihoods ∏_i f(y_i|1) and ∏_i f(y_i|0). Taking the logarithm of the likelihood ratio and rearranging reduces the test to an energy comparison: decide "1" if Σ_{i=1}^{N} y_i² > τ and "0" otherwise. Therefore, the watermark detection threshold can be expressed as
τ = (2N σ²_y|0 σ²_y|1 / (σ²_y|1 − σ²_y|0)) · ln(σ_y|1/σ_y|0).
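Under the Gaussian model above, the likelihood-ratio test reduces to comparing the block's coefficient energy with a threshold. The sketch below assumes the decoder knows α, σ², and σ²_n; all numeric values are illustrative, and the threshold used is the energy-test form implied by the two variances σ²_y|1 and σ²_y|0.

```python
import math
import random

def detection_threshold(n, alpha, sigma2, sigma2_n):
    """Energy threshold of the ML test for n coefficients."""
    s1 = (1 + alpha) ** 2 * sigma2 + sigma2_n   # variance given bit '1'
    s0 = (1 - alpha) ** 2 * sigma2 + sigma2_n   # variance given bit '0'
    return n * s0 * s1 * math.log(s1 / s0) / (s1 - s0)

def decode_bit(y, alpha, sigma2, sigma2_n):
    """Decide '1' when the coefficient energy exceeds the threshold."""
    tau = detection_threshold(len(y), alpha, sigma2, sigma2_n)
    return 1 if sum(v * v for v in y) > tau else 0

# Coefficients drawn with the larger ('1') variance should decode as '1'.
random.seed(0)
sigma2, sigma2_n, alpha = 1.0, 0.1, 0.5
y1 = [random.gauss(0, math.sqrt((1 + alpha) ** 2 * sigma2 + sigma2_n))
      for _ in range(200)]
print(decode_bit(y1, alpha, sigma2, sigma2_n))
```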

Imperceptibility of Watermarking
To assess the performance of the proposed watermarking method, experiments were conducted on real images. In this study, we used eight natural images (Barbara, Boat, Bridge, Elaine, Lena, Man, Mandrill, and Peppers), each of size 512 × 512. The host images and their watermarked versions, with 16 × 16 blocks and a 128 bit message, are shown in Figure 4. Throughout the experiments, a three level DT-CWT was used to decompose each selected block, with near-symmetric 13, 19 tap filters and Q-shift 14, 14 tap filters. The watermark strength factor α was set to 0.0153 according to Equation (8) in Section 3.2. For each image in Figure 4, the top image is the host image, the middle image is the watermarked image, and the bottom image is the difference between the host image and the watermarked version. As Figure 4 shows, the watermark imperceptibility is satisfactory. The proposed method produced an image dependent watermark whose strong components lie in the visually complex parts of the image, making it barely noticeable to the human eye. This allowed a high watermark strength factor to be set while keeping the visual quality of the watermarked image at an acceptable level. Moreover, the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) [43] were used to evaluate the proposed method in an objective manner. The results, shown in Table 1, are satisfactory. Therefore, the embedded watermarks were perceptually invisible.
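For reference, PSNR for 8-bit data can be computed as below (a minimal sketch; SSIM requires local luminance, contrast, and structure statistics and is omitted here; the pixel values are hypothetical):

```python
import math

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return float("inf")            # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

host = [100, 120, 140, 160]
marked = [101, 119, 141, 159]          # +/- 1 perturbation, so MSE = 1
print(psnr(host, marked))              # 20 * log10(255), about 48.13 dB
```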

Error of Probability Analysis
The error probability in the presence of AWGN was derived as follows. An error occurs whenever watermark bit "1" is embedded into the host image but bit "0" is extracted at the decoder end, and vice versa. The total error probability of the watermarking comprises these two error events.
According to Equation (17), an error of the first kind occurs when bit "1" is embedded but the block energy falls below the threshold τ = (2N σ²_y|0 σ²_y|1 / (σ²_y|1 − σ²_y|0)) · ln(σ_y|1/σ_y|0); the corresponding error probability f_e|1 can be written in closed form by following the results in [16]. Meanwhile, the error probability f_e|0 of embedding bit "0" but decoding "1" can likewise be written in closed form according to [16]. The data bits "0" and "1" were assumed to be inserted into the original image with equal probabilities. Thus, the total error probability is
F_e = (f_e|1 + f_e|0) / 2.
Figure 5 shows the error probability F_e versus different values of the noise variance for the Lena image under an AWGN attack. Although F_e increased with the noise variance, the change remained small as the noise attack strength increased.
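The qualitative trend in Figure 5 — the error probability growing only slowly with the noise variance — can be checked with a small Monte-Carlo sketch of an energy-threshold decoder. The threshold form and every parameter value here are illustrative assumptions, not the paper's settings.

```python
import math
import random

def simulate_fe(alpha, sigma2, sigma2_n, n=64, trials=2000, seed=1):
    """Empirical total error probability F_e of the energy-threshold test."""
    rng = random.Random(seed)
    s1 = (1 + alpha) ** 2 * sigma2 + sigma2_n   # variance given bit '1'
    s0 = (1 - alpha) ** 2 * sigma2 + sigma2_n   # variance given bit '0'
    tau = n * s0 * s1 * math.log(s1 / s0) / (s1 - s0)
    errors = 0
    for _ in range(trials):
        bit = rng.randint(0, 1)                  # equiprobable data bits
        var = s1 if bit else s0
        energy = sum(rng.gauss(0, math.sqrt(var)) ** 2 for _ in range(n))
        errors += (1 if energy > tau else 0) != bit
    return errors / trials

for noise_var in (0.01, 0.5, 2.0):
    print(noise_var, simulate_fe(alpha=0.5, sigma2=1.0, sigma2_n=noise_var))
```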

Performance under Attacks
For testing robustness, several common attacks, as described in [16,28], were applied to the images watermarked by the proposed method. These attacks included common image processing attacks and geometric distortion attacks. In this study, the bit-error rate (BER) was used to evaluate the robustness of the watermarking under several intentional or unintentional attacks. To save space, the robustness of the proposed method under AWGN, median filtering, Gaussian filtering, JPEG compression, scaling attack, rotation attack, and combinational attack was investigated on the eight well known images (i.e., Barbara, Boat, Bridge, Elaine, Lena, Man, Mandrill, and Peppers).
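Throughout this section, BER is simply the fraction of watermark bits that differ after an attack; the bit strings below are hypothetical:

```python
def ber(sent, received):
    """Bit-error rate between the embedded and the extracted watermark."""
    assert len(sent) == len(received)
    return sum(a != b for a, b in zip(sent, received)) / len(sent)

embedded  = [1, 0, 1, 1, 0, 0, 1, 0]
extracted = [1, 0, 0, 1, 0, 0, 1, 1]   # two bits flipped by an attack
print(ber(embedded, extracted))        # 0.25
```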
(1) AWGN attack: Figure 6 shows the BER of the various test images under AWGN attacks. When the noise variance was less than 10, the BER was near zero, and when the noise variance was less than 33, the BER stayed near 0.1. Therefore, the proposed watermarking had good robustness against AWGN attacks.
(2) JPEG compression attack: Figure 7 shows the BER under JPEG compression attacks; the proposed watermarking remained robust even at very low quality factors. At a quality factor of five, the BER was still below 0.3, and when the quality factor exceeded 25, the BER tended to zero. Therefore, the proposed algorithm was robust against JPEG compression attacks.
(3) Scaling attack: Table 2 shows the BER results under amplitude scaling attacks. The watermarking algorithm was robust to most scaling attacks. However, Table 2 also shows that when the scaling factor was equal to 0.7 or 1.1, the BER was comparatively large. The reason is not yet clear, and this issue will be explored in our future work.
(4) Rotation attack: As shown in Table 3, the maximum BER was 0.1016, whereas most of the other BER values were small, and the BER did not grow prominently as the rotation angle increased. Thus, the proposed embedding approach was robust against rotation attacks. Table 3. BER results of the extracted watermark under rotation attack.

(5) Filtering attacks: Table 4 shows the BER results under Gaussian filtering and median filtering attacks with window sizes of 3 × 3, 5 × 5, and 7 × 7. The proposed method was highly robust against Gaussian filtering attacks. When the window size of the median filtering was 3 × 3, the proposed scheme remained robust; however, as the window size of the median filtering increased, the robustness of the watermarking decreased.
(6) Combinational attack: Table 5 shows the BER results under combinational attacks in which various distortions were combined with JPEG compression at a quality factor of 20%. The robustness of the proposed method against this kind of combinational attack was satisfactory. The BER under a Gaussian noise attack combined with a JPEG compression attack for different images is shown in Figure 8, which further confirms that the proposed scheme is robust against this kind of combinational attack.

Comparison with Other Methods
In this part of the study, our method is compared with its most closely related competitors, the methods reported in [16,28,44,45]. The methods in [16,44] were chosen on the basis of their similarity to the proposed watermarking; for example, they also embed the watermark in the high entropy image blocks of the original image. To ensure a fair comparison, the message lengths and PSNR values used in our experiments were the same as those in the other works. Table 6 reports the results for watermarks of 256 bits embedded into the Barbara, Boat, and Peppers images with a watermarked-image PSNR of 45 dB. The results of our method were better than those of the other works for most attacks. For geometric attacks in particular, such as scaling and rotation, the proposed method outperformed the three competing watermarking methods, mainly because DT-CWT can effectively approximate shift invariance under geometric transformations. However, the proposed method was slightly less effective than the methods in [28,44] under AWGN attacks. This problem will be investigated thoroughly by studying the statistical properties of DT-CWT coefficients and noise in our future work. Tables 7 and 8 show the BER results against median filtering attacks and JPEG compression attacks, respectively, for the proposed method and the methods in [28,44], with a watermark length of 256 bits and a watermarked-image PSNR of 45 dB; in both tables, the proposed method achieved better results than the methods in [28,44]. Table 9 shows the BER results under scaling attacks with a watermark length of 100 bits and a watermarked-image PSNR of 45 dB. The proposed method outperformed the methods in [44,45] for most scaling attacks.
As described in the third part of Section 4.2 (i.e., "scaling attack"), when the scaling factor was equal to 0.7 or 1.1, the robustness of the watermarking decreased dramatically. We will investigate this issue in our future work. Table 9. BER (%) comparison of the recovered watermark under scaling attack.

Conclusions
An image watermarking method was proposed in this study by using DT-CWT and the multiplicative strategy. In this approach, after partitioning a host image into non-overlapping blocks, the high entropy image blocks were selected for watermark embedding. Watermark data extraction was performed by using the MLE decision criterion, and the embedding factor was computed to improve the robustness of the watermarking by using the texture masking and visual saliency scheme. The performance of the proposed scheme was evaluated in terms of image quality and robustness. The experimental results demonstrated the effectiveness of the proposed method.