A Hybrid Robust Image Watermarking Method Based on DWT-DCT and SIFT for Copyright Protection

In this paper, a robust hybrid watermarking method based on the discrete wavelet transform (DWT), the discrete cosine transform (DCT), and the scale-invariant feature transform (SIFT) is proposed. It is of prime interest to develop robust feature-based image watermarking schemes that withstand both image processing attacks and geometric distortions while preserving good imperceptibility. To this end, a robust watermark is embedded in the DWT-DCT domain to withstand image processing manipulations, while SIFT is used to protect the watermark from geometric attacks. First, the watermark is embedded in the middle band of the DCT coefficients of the HL1 sub-band of the DWT. Then, the SIFT feature points are registered so that they can be used in the extraction process to correct geometric transformations. Extensive experiments have been conducted to assess the effectiveness of the proposed scheme. The results demonstrate its high robustness against standard image processing attacks and geometric manipulations while preserving high imperceptibility. Furthermore, it compares favorably with alternative methods.


Introduction
The growth of digital information technologies makes the distribution and duplication of digital content much easier. Therefore, the need to design secure protection techniques has increased in the last few decades. Digital image watermarking has been found to be an effective solution for copyright protection of images [1]. Its basic procedure is to embed imperceptible information, termed a watermark, in the original image. Ownership of the image can then be established by extracting the embedded secret watermark.
Three main properties are required in image watermarking systems [2]: imperceptibility, capacity, and robustness. Imperceptibility refers to the fact that the watermarked image should look identical to the original one. Capacity represents the maximum number of bits embedded in the original image. In high-capacity methods, it is the primary constraint that should be ensured after imperceptibility [3]. Indeed, in this category of techniques, a considerable quantity of information should be embedded without losing image quality. In copyright protection methods, this constraint is less critical, but it can influence the results in terms of robustness or imperceptibility, especially when the watermark size is too big. Robustness refers to the ability to detect the watermark even if the watermarked image has undergone several manipulations, called attacks. A good watermarking scheme should ensure the best trade-off between these three properties. Indeed, in general, as capacity increases, both the robustness and the imperceptibility of the watermarked image decrease, and vice versa. The main objective of the proposed method is copyright

Previous Work
The state-of-the-art image watermarking methods discussed in this paper can be classified into three categories: single transform-based methods, hybrid transform-based methods, and scale-invariant methods.
In [21], the authors proposed a digital image watermarking method based on singular value decomposition (SVD). First, the SVD is applied to the original image to obtain the orthogonal matrices U and V and the diagonal matrix S. Then, the watermark is embedded additively into the diagonal matrix S. The watermarked image is reconstructed using the modified matrices Sw, Uw, and Vw. Experimental results show that the method gives good results in both security and robustness against several attacks, such as compression, filtering, noise, and cropping.
In [30], an image watermarking method using the contourlet transform along with singular value decomposition is proposed. In the embedding, the contourlet transform is applied to the original image, and the coefficients are modified by combining singular values of the selected direction with singular values of the watermark. The technique gives good imperceptibility results and is robust against several attacks. Amini et al. [31] presented a robust multiplicative watermark decoder using a vector-based hidden Markov model in the wavelet domain. The results show good resistance to attacks and ensure good imperceptibility. Several other single transforms can also serve as watermarking primitives. In [32], the authors proposed a color image watermarking scheme using the quaternion polar harmonic transform (QPHT) with a maximum likelihood decoder. The watermark is embedded into the QPHT magnitudes using a multiplicative approach. The method ensures both imperceptibility and robustness. In [33], a color image watermarking scheme in the sparse domain is presented. The method considers the inter-channel dependencies between RGB channels and the inter-scale dependencies of the sparse coefficients of color images by employing the hidden Markov model.
The main objective of the majority of existing watermarking schemes is to provide good robustness against several attacks while preserving high imperceptibility. Hybrid methods generally perform better than single transform methods. As a consequence, interest in methods that combine two transforms to achieve this aim has increased considerably. Several hybrid methods have been proposed in the literature [34-38].
In Lagzian's method [26], singular values of the redundant discrete wavelet transform (RDWT) are modified to insert the watermark. Makbol et al. [27] proposed a hybrid method based on integer wavelet transform (IWT) and singular value decomposition (SVD). The authors embed the watermark in the singular values of the first level of IWT.
Singh et al. [28] proposed a hybrid semi-blind method in the redundant wavelet domain. The authors take advantage of the shift-invariance of "RDWT" and nonsubsampled contourlet transform (NSCT) to avoid the shift sensitivity of the classical wavelet transforms. The watermark is inserted by modifying the SVD coefficients in the RDWT-NSCT domain.
Hybrid schemes are generally very robust against a wide range of attacks, especially image processing operations, since they exploit the benefits of two or more transformations to achieve watermarking robustness. Nevertheless, the majority of these methods are weak against geometric attacks. To overcome this issue, methods using invariant descriptors like SIFT [39] and SURF [40] have been widely used. SIFT has been extensively used for image watermarking against geometric attacks [13,41-43]. In [41], a scheme robust against resolution scaling has been proposed. First, a watermark zone selection algorithm is performed to get the candidate pixel locations that are to be modified. Afterward, SIFT features, which act as a watermark, are extracted and registered. Then, a patch is embedded in the image such that it gives robust SIFT features.
In [42], the watermark is embedded into the circular patches invariant to scaling and translation, generated by the SIFT descriptor. The authors take advantage of the polar-mapped circular patches to ensure rotation invariance.
A rotation, scale, and translation invariant watermarking scheme based on the discrete Tchebichef transform (DTT), singular value decomposition (SVD), and the scale-invariant feature transform (SIFT) is proposed in [44]. The DTT coefficients of the image are arranged similarly to the sub-band scheme, generating LL, HL, LH, and HH sub-bands. The principal components of the watermark are inserted into the diagonal components of each DTT sub-band. Next, the Arnold transform and a permutation applied to the watermark are used to enhance security. The scheme is robust to geometrical and combined attacks.
Chen et al. [45] proposed a robust watermarking scheme with a feature-based synchronization technique. The watermark is repeatedly embedded in each selected local square feature region (LSFR) by modulating the discrete Fourier transform (DFT) coefficients. The extraction is based on a local statistical feature, and the SURF orientation descriptor is used for watermark synchronization. The method is effective against screen-cam attacks as well as common desynchronization attacks.
In [29], a robust watermarking scheme using SIFT in the DWT domain is proposed. The SIFT feature areas are extracted from the original image, and a one-level DWT is applied to the selected SIFT feature areas. Unlike the proposed approach, which embeds the watermark in a single sub-band, they insert the mark in the two sub-bands HL1 and LH1. To do so, the watermark is divided into two parts that are inserted by modifying the fractional portion of the horizontal or vertical high-frequency DWT coefficients. The experimental results showed that the scheme can resist both signal processing and geometric attacks.
The authors of [43] proposed robust image watermarking based on scale-invariant feature transform (SIFT), singular value decomposition (SVD), and all phase biorthogonal transform (APBT). A series of SIFT keypoints are obtained after carrying out SIFT, which are selected to obtain the neighborhood that can be used in the watermark embedding process. A block-based APBT is performed on the neighborhoods of the selected feature points. To insert the watermark, a coefficients matrix of a set of APBT coefficients for SVD is generated.
In [13], a SIFT-based watermarking scheme in the DWT-SVD domain is proposed. First, a 3-level discrete wavelet transform (DWT) is applied to the original image. Next, the SVD is applied to the LL3 sub-band, and the watermark is embedded additively. The rotation, scale, and translation (RST) attacks are corrected by matching the key points of the original image and the watermarked one.
Recently, a SURF-DCT based image watermarking has been proposed [46]. First, the watermark is encrypted using chaotic encryption technology in order to enhance its security. Next, the DCT coefficients are modified using the positive and negative quantization rules. The method proves to be resistant against geometric and non-geometric attacks.
In our previous work [24], we proposed a blind robust image watermarking method based on the discrete Fourier transform (DFT) and DCT for copyright protection. The watermark is inserted in the DCT middle band of the DFT magnitude. The watermark is encrypted with the Arnold transform to increase the security of the proposed method. The method shows high imperceptibility for textured and non-textured images. Regarding the robustness, the technique can withstand signal processing attacks, JPEG, JPEG2000 compressions, etc., but shows weakness to geometric attacks. To overcome this problem, we propose a novel method based on SIFT to avoid vulnerability to these attacks.

Background
This section describes three techniques relevant to the proposed method, namely, the DCT, DWT, and SIFT. The DWT and DCT techniques are used to embed the watermark bits, while SIFT is used to make the proposed method invariant to geometric attacks.

Discrete Cosine Transform
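The discrete cosine transform (DCT) is a well-known transformation technique that transforms an image from the spatial domain to the frequency domain [47]. It has been widely applied in image processing, exploiting both its decorrelation and its energy compaction properties. The mathematical expressions of the 2D-DCT and inverse 2D-DCT are, respectively:
$$C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} f(x,y) \cos\left[\frac{(2x+1)u\pi}{2m}\right] \cos\left[\frac{(2y+1)v\pi}{2n}\right] \qquad (1)$$
$$f(x,y) = \sum_{u=0}^{m-1} \sum_{v=0}^{n-1} \alpha(u)\,\alpha(v)\, C(u,v) \cos\left[\frac{(2x+1)u\pi}{2m}\right] \cos\left[\frac{(2y+1)v\pi}{2n}\right] \qquad (2)$$
where f(x, y) and C(u, v) are the pixel values in the spatial domain and the DCT coefficients, respectively, and m, n represent the block size. α(u) and α(v) are two coefficients defined as follows:
$$\alpha(u) = \begin{cases} \sqrt{1/m}, & u = 0 \\ \sqrt{2/m}, & 1 \le u \le m-1 \end{cases} \qquad \alpha(v) = \begin{cases} \sqrt{1/n}, & v = 0 \\ \sqrt{2/n}, & 1 \le v \le n-1 \end{cases} \qquad (3)$$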
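As an illustration of the orthonormal 2D-DCT pair described in this section, the following minimal NumPy sketch builds the transform as a matrix product. The function names (dct_matrix, dct2, idct2) are hypothetical and are not taken from the paper; the sketch only assumes the standard orthonormal DCT-II definition.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis: row u holds alpha(u) * cos((2x+1) u pi / (2n))."""
    u = np.arange(n).reshape(-1, 1)   # frequency index
    x = np.arange(n).reshape(1, -1)   # spatial index
    basis = np.cos((2 * x + 1) * u * np.pi / (2 * n))
    alpha = np.full((n, 1), np.sqrt(2.0 / n))
    alpha[0, 0] = np.sqrt(1.0 / n)
    return alpha * basis

def dct2(block):
    """Forward 2D-DCT of an m x n block."""
    m, n = block.shape
    return dct_matrix(m) @ block @ dct_matrix(n).T

def idct2(coeffs):
    """Inverse 2D-DCT; valid because the basis matrices are orthonormal."""
    m, n = coeffs.shape
    return dct_matrix(m).T @ coeffs @ dct_matrix(n)
```

Because the basis is orthonormal, applying idct2 to the output of dct2 returns the original block up to floating-point error, which is the property the embedding and extraction steps rely on.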

Discrete Wavelet Transform
The discrete wavelet transform (DWT) has been widely used in image processing and its applications. It decomposes an image into four sub-bands, one corresponding to the low-pass band (LL) and three others corresponding to the horizontal (HL), vertical (LH), and diagonal (HH) high-pass bands. The image can be decomposed iteratively by further decomposing the low-pass band each time. The DWT has been used extensively in image watermarking due to its excellent spatio-frequency localization as well as its correlation with the human visual system (HVS) [48].
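The one-level decomposition into four sub-bands can be sketched with the Haar wavelet in plain NumPy. This is only an illustrative sketch: the function names are hypothetical, the image is assumed to have even height and width, and the assignment of the labels HL and LH to the two mixed sub-bands follows one common convention that may differ from the one used by a given wavelet library.

```python
import numpy as np

def haar_dwt2(img):
    """One-level orthonormal 2D Haar DWT; returns (LL, HL, LH, HH)."""
    a = np.asarray(img, dtype=float)
    s = np.sqrt(2.0)
    # Filter along each row (pairs of neighbouring columns).
    lo = (a[:, 0::2] + a[:, 1::2]) / s
    hi = (a[:, 0::2] - a[:, 1::2]) / s
    # Filter along each column (pairs of neighbouring rows).
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, hl, lh, hh

def haar_idwt2(ll, hl, lh, hh):
    """Inverse of haar_dwt2."""
    s = np.sqrt(2.0)
    lo = np.empty((2 * ll.shape[0], ll.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2, :], lo[1::2, :] = (ll + lh) / s, (ll - lh) / s
    hi[0::2, :], hi[1::2, :] = (hl + hh) / s, (hl - hh) / s
    out = np.empty((lo.shape[0], 2 * lo.shape[1]))
    out[:, 0::2], out[:, 1::2] = (lo + hi) / s, (lo - hi) / s
    return out
```

Each sub-band has half the height and width of the input, so a 512 × 512 image yields four 256 × 256 sub-bands.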

Scale Invariant Feature Transform (SIFT)
The scale-invariant feature transform (SIFT) proposed by Lowe [39] is an image descriptor that extracts characteristic features. These features are invariant to image translation, rotation, scaling, and brightness change. First, a search for peaks in the scale space of the difference-of-Gaussians (DoG) function is performed to select the candidate features. Second, the position of each feature is localized. Next, the orientations are assigned based on image gradient directions. The scale-space D(x, y, σ) is computed using a DoG function with the aim of extracting the locations of candidate features. The original image is smoothed successively using a variable-scale (σ1, σ2, and σ3) Gaussian function, and the scale-space images are calculated by subtracting two successive smoothed images (as shown in Figure 2). x and y represent the coordinates of the image, while σ is the scale of the Gaussian function.
Lowe's algorithm has been used in several applications such as multi-view matching [49] and object tracking [50]. Similarly, SIFT has been extensively used within the context of robust image watermarking [41,42].
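The difference-of-Gaussians construction described in this section can be sketched as follows. This is a simplified illustration and not a SIFT implementation: it omits octaves, keypoint localization, and orientation assignment, uses zero padding at the image borders, and all function names and the example σ values are assumptions.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 3 sigma and normalised to sum to 1."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing (zero padding at the borders)."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def dog_stack(img, sigmas):
    """Subtract successive smoothed images to obtain the DoG images."""
    blurred = [gaussian_blur(np.asarray(img, dtype=float), s) for s in sigmas]
    return [b_next - b_prev for b_prev, b_next in zip(blurred[:-1], blurred[1:])]
```

With three scales, dog_stack returns two DoG images; candidate features would then be sought as local extrema across space and scale. The image must be larger than the widest kernel for the "same" convolution to keep its shape.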

Proposed Scheme
In this paper, we propose a hybrid robust image watermarking scheme based on DWT-DCT and SIFT for copyright protection. The main contribution of the proposed method is that it ensures robustness to both signal processing and geometrical attacks, by embedding the watermark in the DWT-DCT domain and by using the SIFT descriptor, respectively, while preserving the high imperceptibility of the watermarked image. The DWT is used for its excellent spatial localization and multiresolution characteristics, which are similar to those of the human visual system (HVS) [22], while the DCT is chosen for its strong energy compaction property [23] and good robustness against common image processing attacks. By combining these two well-known transforms, the proposed method can withstand common signal processing manipulations, including filtering, noise, and JPEG compression, among others, while ensuring high imperceptibility. Moreover, RST geometric correction using SIFT ensures robustness against geometric attacks.

Embedding Process
First, the original image is decomposed into four sub-bands LL1, HL1, LH1, and HH1 by performing the 1-level Haar 2D-DWT. Next, the HL1 sub-band is divided into non-overlapping 8 × 8 blocks, and the 2D-DCT is applied to each block. The choice of HL1 has been driven by the fact that this sub-band ensures a good tradeoff between robustness and imperceptibility compared to LL1 and HH1. Afterward, two uncorrelated pseudo-random sequences are generated using a secret key. Each sequence is a vector composed of {−1, 1} values with a normal distribution having zero mean and unit variance. The first sequence is for bit 0 (PN_zero), while the second one is for bit 1 (PN_one). The motivation behind this choice (normally distributed watermark) is the robustness to attacks that try to produce an unwatermarked document by averaging multiple differently watermarked copies of it [51]. On the detection side, it is important that the PN sequences are statistically independent. This constraint is granted by the pseudo-random nature of the sequences. In addition, such sequences can be easily regenerated by providing the correct seed (key). The watermark used is a binary image. The inserted information is the PN sequences, selected according to the watermark bits: if the watermark bit is 0, PN_zero is inserted; otherwise, PN_one is inserted. A gray-scale image could also be used as the watermark. However, the nature of the watermark (binary or gray-scale image) is not the main concern, since the application of the proposed watermarking scheme is copyright protection. Then, for each block, the pseudo-random sequence is embedded in the DCT mid-band of the HL1 coefficients according to the watermark bit (shown in blue in Figure 3). Equation (4) is used to insert the sequence PN_zero if the watermark bit is 0, while Equation (5) is used in the case of bit 1.
$$Y = X + \lambda \times PN_{zero} \qquad (4)$$
$$Y = X + \lambda \times PN_{one} \qquad (5)$$
where X is the original DCT mid-band of the HL1 sub-band of the DWT, and Y is the modified DCT mid-band of the HL1 sub-band. λ is the watermark strength that adjusts the tradeoff between the imperceptibility and robustness requirements. This parameter is chosen empirically so that it ensures the best tradeoff between robustness and imperceptibility. Note that rather than tuning the parameter λ empirically, it could be tuned statistically based on some optimization criterion such as the error rate, the PSNR, the SSIM, or the normalized correlation. Next, the inverse DCT (2D-IDCT) is carried out for each modified block, and the inverse 2D-DWT (2D-IDWT) is performed to obtain the watermarked image. Finally, SIFT features are extracted and saved in order to correct the geometrical distortions in the extraction process. In most copyright protection applications, the owner of the image is the only one to possess the secret key and the SIFT keypoints needed to extract the watermark. If extraction has to be delegated to someone else, the SIFT features and the secret key of each image need to be shared with the extracting side. Therefore, the method is semi-blind, since the keypoints are needed in the extraction process. The proposed watermark embedding is illustrated in Figure 4. The steps of watermark insertion are described in detail in Algorithm 1.
2. The HL1 sub-band of level 1 is divided into non-overlapping 8 × 8 blocks.
4. Generate two uncorrelated pseudo-random sequences using a secret key: one sequence for bit 0 (PN_zero) and the second sequence for bit 1 (PN_one).
5. For each block, insert the pseudo-random sequence in the DCT mid-band of the HL1 coefficients according to the watermark bit. If the watermark bit is 0, Equation (4) is used. Equation (5) is used otherwise.
9. Extract SIFT features and save them.
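The block-wise additive embedding of this section can be sketched as below, operating directly on an HL1 sub-band that is assumed to have been computed already. Several details are assumptions made for illustration only: the exact mid-band positions (here, the coefficients with 4 ≤ u + v ≤ 6), the use of standard normal draws for the PN sequences (a ±1 sequence could be substituted), one watermark bit per block in raster order, and all function names.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    alpha = np.full((n, 1), np.sqrt(2.0 / n))
    alpha[0, 0] = np.sqrt(1.0 / n)
    return alpha * np.cos((2 * x + 1) * u * np.pi / (2 * n))

# Hypothetical mid-band of an 8 x 8 block (not the paper's exact mask).
MID_BAND = [(u, v) for u in range(8) for v in range(8) if 4 <= u + v <= 6]
MID_ROWS = np.array([u for u, _ in MID_BAND])
MID_COLS = np.array([v for _, v in MID_BAND])

def make_pn_sequences(key, length):
    """Regenerate (PN_zero, PN_one) from the secret key used as a seed."""
    rng = np.random.default_rng(key)
    return rng.standard_normal(length), rng.standard_normal(length)

def embed_in_subband(hl1, bits, key, lam=0.4):
    """Add lam * PN to the DCT mid-band of successive 8 x 8 blocks of HL1."""
    out = np.asarray(hl1, dtype=float).copy()
    blocks_per_row = out.shape[1] // 8
    if len(bits) > blocks_per_row * (out.shape[0] // 8):
        raise ValueError("more watermark bits than 8 x 8 blocks")
    basis = dct_matrix(8)
    pn_zero, pn_one = make_pn_sequences(key, len(MID_BAND))
    for idx, bit in enumerate(bits):
        r, c = divmod(idx, blocks_per_row)
        rows, cols = slice(8 * r, 8 * r + 8), slice(8 * c, 8 * c + 8)
        coeffs = basis @ out[rows, cols] @ basis.T            # 2D-DCT
        coeffs[MID_ROWS, MID_COLS] += lam * (pn_one if bit else pn_zero)
        out[rows, cols] = basis.T @ coeffs @ basis            # 2D-IDCT
    return out
```

The returned array would replace HL1 before the inverse DWT. The default lam=0.4 mirrors the strength reported in the experimental setup; the other choices are placeholders.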

Extraction Process
The extracting process is divided into two steps: geometrical distortions correction and watermark extraction. Before extracting the watermark, the first step is to correct the geometric manipulations that the attacked image has undergone. To this end, SIFT features are extracted first from the attacked image, and matching is performed between them and the recorded features saved in the insertion step. The idea behind using SIFT relies on the fact that it is RST invariant [52].
It is worth noting that the proposed method does not require the original image but only the saved SIFT features, which makes it semi-blind. However, the scheme can be blind if no geometric distortions are applied.
In order to correct an image rotation attack, the attacked image should be rotated by R_c. The correction angle is given by Equation (6), which is expressed in terms of two vectors, each composed of two keypoints, taken from the watermarked image and the rotated image, respectively; N denotes the number of valid matching points. According to Equation (6), the rotation angle is calculated from every two pairs of matching points. Afterward, the correction angle is obtained by averaging these angles. Similarly, to correct the scaling attack, the attacked image should be scaled by S_c.
$$S_c = \frac{1}{N} \sum_{i=1}^{N} \frac{Sw_i}{Ss_i} \qquad (7)$$
where Sw_i and Ss_i are the scale values of the i-th matching point in the watermarked image and the scaled image, respectively. Thus, the scaled image can be corrected by scaling it with S_c.
To correct the translation attack, CT_x and CT_y are used to shift the translated image back along the horizontal and vertical axes, respectively.
Thus, after the geometric correction step, the weakness against RST attacks can be avoided, and the watermark can be extracted when the watermarked image has undergone these kinds of attacks.
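One possible way to estimate the RST correction parameters from matched keypoints is sketched below. It is not the paper's exact formulation: the angle is taken as the signed difference between the directions of corresponding keypoint-pair vectors, the scale as the mean ratio of keypoint scales, and the translation as the mean coordinate offset. The function names and input layout (arrays of (x, y) coordinates and of SIFT scale values for already matched keypoints) are assumptions.

```python
import numpy as np

def estimate_rotation_deg(pts_ref, pts_att):
    """Mean signed angle (degrees) that maps attacked vectors onto reference ones."""
    pts_ref = np.asarray(pts_ref, dtype=float)
    pts_att = np.asarray(pts_att, dtype=float)
    angles = []
    for i in range(len(pts_ref)):
        for j in range(i + 1, len(pts_ref)):
            v_ref = pts_ref[j] - pts_ref[i]
            v_att = pts_att[j] - pts_att[i]
            diff = np.arctan2(v_ref[1], v_ref[0]) - np.arctan2(v_att[1], v_att[0])
            angles.append(np.arctan2(np.sin(diff), np.cos(diff)))  # wrap to (-pi, pi]
    return float(np.degrees(np.mean(angles)))

def estimate_scale(scales_ref, scales_att):
    """Mean ratio of reference to attacked keypoint scales."""
    ratio = np.asarray(scales_ref, dtype=float) / np.asarray(scales_att, dtype=float)
    return float(np.mean(ratio))

def estimate_translation(pts_ref, pts_att):
    """Mean (x, y) shift that maps attacked keypoints back onto reference ones."""
    diff = np.asarray(pts_ref, dtype=float) - np.asarray(pts_att, dtype=float)
    return diff.mean(axis=0)
```

In practice the matches would first be filtered for outliers, since a few wrong correspondences can bias a plain average; a median or a RANSAC-style fit is a common alternative.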
The second step is to extract the watermark. To do so, it is sufficient to carry out the 1-level 2D-DWT, take the HL1 sub-band, and calculate the 2D-DCT of HL1. Afterwards, the two pseudo-random sequences are regenerated using the same key as in the embedding. Next, each watermark bit is extracted by calculating the correlation between the PN sequences and the modified coefficients, as shown in Equation (10).
$$W = \begin{cases} 0, & \text{if } Corr(0) > Corr(1) \\ 1, & \text{otherwise} \end{cases} \qquad (10)$$
where Corr(0) is the correlation between the DCT middle-frequency coefficients of HL1 and PN_zero, and Corr(1) is the correlation between the DCT middle-frequency coefficients of HL1 and PN_one. Finally, the watermark image is obtained. Figure 5 sketches the watermark extraction process, which is described in detail in Algorithm 2.
5. Apply the 2D-DCT to HL1.
6. Generate the two pseudo-random sequences using the same key as in the embedding.
7. Extract the watermark using the correlation between the PN sequences and the altered coefficients, as shown in Equation (10).
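The correlation-based decision used in the extraction step can be sketched as follows. The sketch assumes that the mid-band coefficients of each block have already been gathered into one row per block, that the PN sequences are regenerated from the key with NumPy's seeded generator, and that Pearson correlation is the correlation measure; these choices and the function names are illustrative assumptions.

```python
import numpy as np

def make_pn_sequences(key, length):
    """Regenerate (PN_zero, PN_one) from the secret key used as a seed."""
    rng = np.random.default_rng(key)
    return rng.standard_normal(length), rng.standard_normal(length)

def correlation(a, b):
    """Pearson correlation coefficient between two 1D vectors."""
    return float(np.corrcoef(a, b)[0, 1])

def extract_bit(mid_band, pn_zero, pn_one):
    """Decide bit 0 when the block correlates better with PN_zero, else bit 1."""
    return 0 if correlation(mid_band, pn_zero) > correlation(mid_band, pn_one) else 1

def extract_bits(mid_bands, key):
    """Apply the decision rule to every block (one row of coefficients per block)."""
    mid_bands = np.asarray(mid_bands, dtype=float)
    pn_zero, pn_one = make_pn_sequences(key, mid_bands.shape[1])
    return [extract_bit(row, pn_zero, pn_one) for row in mid_bands]
```

The decision is only reliable when the embedded sequence is not swamped by the host coefficients, which is one reason the embedding strength has to be tuned.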

Experimental Setup
The performance of the proposed technique is evaluated on thirty 512 × 512 standard gray-scale natural images. The images have been carefully selected in order to cover a wide range of content (indoor, outdoor, portrait, texture, etc.). They include the most commonly used images in the watermarking literature ('Baboon', 'Pepper', 'Cameraman', 'Lena', 'Goldhill', 'Walkbridge', 'Womanblonde', 'Livingroom', 'Pirate', and 'Lake') and other gray-scale images taken from [53] (Figure A3). All of them are used in the experiments to assess the imperceptibility and the robustness of the proposed work. A 64 × 64 binary image was taken as the watermark.
The parameter lambda, which denotes the embedding strength of the watermark, affects both the visual quality and the robustness. Its value is chosen so as to ensure the best tradeoff between imperceptibility and robustness. To this end, extensive experiments have been conducted with several values ranging from 0.01 to 5. Based on these experiments, we kept lambda = 0.4. The same lambda value is used for all the test images. Figures 6 and 7 illustrate the effect of lambda on the performance of the proposed method in terms of imperceptibility and robustness.

Imperceptibility
Subjective evaluation experiments are the gold standard for measuring imperceptibility. However, such a process requires heavy technical and human resources [54,55]. This is why objective metrics are used to assess the visual quality of the watermarked images.
In order to evaluate the imperceptibility of watermarking methods, several metrics have been proposed. The peak signal-to-noise ratio (PSNR) is the most widely used metric in the watermarking literature to measure the distance between the original image and the watermarked one. It is defined as follows:
$$PSNR = 10 \log_{10}\left(\frac{MAX^2}{MSE}\right) \qquad (11)$$
where MAX is the maximum possible pixel value of the image, which is equal to 255 for an 8 bit per pixel representation, and MSE is given by:
$$MSE = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[I(i,j) - K(i,j)\right]^2 \qquad (12)$$
where I(i, j) and K(i, j) refer to the original image and the watermarked image, respectively, and m and n are the dimensions of the image. The structural similarity (SSIM) index measures similarity using a combination of three heuristic factors, that is, luminance comparison, contrast comparison, and structure comparison. It is the most influential perceptual quality metric [56]. It is defined by (13).
$$SSIM(I_0, I_w) = \frac{(2\mu_{I_0}\mu_{I_w} + c_1)(2\sigma_{I_0 I_w} + c_2)}{(\mu_{I_0}^2 + \mu_{I_w}^2 + c_1)(\sigma_{I_0}^2 + \sigma_{I_w}^2 + c_2)} \qquad (13)$$
where I_0 and I_w are, respectively, the original image and the watermarked image, µ_I0 and µ_Iw are, respectively, the local means of I_0 and I_w, σ²_I0 is the variance of I_0 whereas σ²_Iw is the variance of I_w, σ_I0Iw is the covariance between I_0 and I_w, and c_1 and c_2 are two variables used to stabilize the division when the denominator is weak.
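The MSE and PSNR definitions of this section translate directly into a few lines of NumPy. The function names are illustrative; SSIM is left out of this sketch because it requires local windowed statistics and is usually taken from an image-processing library.

```python
import numpy as np

def mse(original, watermarked):
    """Mean squared error between two images of identical shape."""
    diff = np.asarray(original, dtype=float) - np.asarray(watermarked, dtype=float)
    return float(np.mean(diff ** 2))

def psnr(original, watermarked, max_value=255.0):
    """PSNR in dB; infinite when the two images are identical."""
    error = mse(original, watermarked)
    if error == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / error))
```

Casting to float before subtracting avoids the wrap-around that would otherwise occur with unsigned 8-bit image arrays.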

Robustness
The robustness of our work is evaluated using normalized correlation (NC) and bit error rate (BER) between the original watermark and the extracted one.
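The two robustness measures named here, NC and BER, can be sketched as follows for binary watermarks stored as 0/1 arrays. The NC variant used in this sketch (the inner product of the two watermarks divided by the product of their Euclidean norms) is one common definition and is an assumption, since several normalizations are in use; the function names are also illustrative.

```python
import numpy as np

def normalized_correlation(w, w_extracted):
    """Inner product of the two watermarks divided by the product of their norms."""
    a = np.asarray(w, dtype=float).ravel()
    b = np.asarray(w_extracted, dtype=float).ravel()
    denom = np.sqrt(np.sum(a ** 2)) * np.sqrt(np.sum(b ** 2))
    if denom == 0:
        raise ValueError("NC is undefined for an all-zero watermark")
    return float(np.sum(a * b) / denom)

def bit_error_rate(w, w_extracted):
    """Fraction of positions where the two binary watermarks differ (XOR)."""
    a = np.asarray(w).astype(bool)
    b = np.asarray(w_extracted).astype(bool)
    return float(np.mean(np.logical_xor(a, b)))
```

An NC of 1 together with a BER of 0 indicates a perfectly recovered watermark.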
The normalized correlation (NC) is a widely used measure for quantifying the robustness of a watermarking technique against various attacks. It measures the similarity between the extracted watermark and the original watermark. It is computed between W and Ŵ, which are the original and the extracted watermark, respectively. To further evaluate the robustness of the proposed work, the bit error rate (BER) between the original watermark and the extracted one is used. It is defined as follows:
$$BER = \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} W_{i,j} \oplus \hat{W}_{i,j}$$
where W_{i,j} and Ŵ_{i,j} are the bits of the original and extracted watermarks of size (m × n), and ⊕ refers to the XOR operation.

Evaluation of Imperceptibility
Table 1 exhibits the imperceptibility of the proposed technique measured by the two well-known metrics PSNR and SSIM for all test images, together with their average. One can notice that the proposed method ensures good imperceptibility according to the values of PSNR and SSIM in Table 1 and Figures 6, 8, and 9. We believe that the main reason is that the watermark is embedded in the middle DCT coefficients of the HL1 DWT sub-band, which ensures high imperceptibility. It can be seen from Table 1 that the imperceptibility of the proposed scheme is insensitive to the image nature. Figures 8 and A3 show the original images and their corresponding watermarked ones. Moreover, as depicted in Figure 8, there is no visible distortion between the original images and the watermarked ones.
Figure 9. The violin plot representation of PSNR and SSIM of the proposed scheme.
The values of PSNR and SSIM shown in Table 1 are represented in black in Figure 9. According to Figure 9, the majority of SSIM values are concentrated between 0.995 and 0.9998. PSNR values are between 45.28 and 49.97 dB, illustrating the good imperceptibility of the proposed scheme regardless of the image nature.

Evaluation of Robustness
Since the application of the proposed scheme is copyright protection, robustness is the most important requirement. The watermarked images have been subjected to three categories of attacks: image processing, compression, and geometrical manipulations. Image processing attacks include Gaussian noise (GN), salt and pepper noise (SPN), low-pass Gaussian filtering (LPGF), histogram equalization (HE), Gaussian smoothing (GS), and median filtering (MF). JPEG and JPEG2000 represent the compression attacks. Rotation (ROT), scaling (SC), translation (TR), and cropping (CR) were selected as geometrical attacks. Figures 10 and 11 show the robustness of the proposed method in terms of NC against rotation and scaling using four test images with several textures. Figure 12 depicts the Lena image after several attacks. The false alarm probability is not discussed in the paper, and the robustness of our work is evaluated using the normalized correlation (NC) and the bit error rate (BER) between the original watermark and the extracted one.
The 30 test images used in the experiments were chosen according to their characteristics (texture, indoor, outdoor, etc.). In addition, some typical images have been used in the experiments. The images used in Figures 8 and 10 are selected in such a way that they represent differences in these characteristics.
Experiments were performed to evaluate the limitations of the proposed method. The parameter values of the attacks have been increased up to the point where the watermark can no longer be recovered. We consider that with an NC value lower than 0.7, distortions are sufficiently high that the watermark cannot be recovered. Figure 13 displays the extracted watermarks after different attacks, including histogram equalization, JPEG compression, salt and pepper noise, Gaussian noise, cropping, and rotation. It can be observed that although the watermarked images have been exposed to these attacks, the watermark is extracted almost perfectly. Table 2 shows the robustness in terms of NC for several images against zero-mean Gaussian noise with variances of 0.001 and 0.01, together with the NC average for the 30 test images.
Gaussian smoothing is a very common operation in image processing [57]. It removes detail and noise [58]. Gaussian smoothing has been applied to the test images with different standard deviations and window sizes. As depicted in Table 4, the proposed technique is able to withstand the Gaussian smoothing attack. Even with a standard deviation σ = 0.9 and a 7 × 7 window size, the obtained NC values are greater than 0.96. It can be noticed from Table A8 that the proposed technique can withstand Gaussian smoothing for all thirty test images.
The low-pass Gaussian filtering attack is also one of the common manipulations in image processing. It aims to remove high-frequency components from the image. The watermarked images are filtered with a low-pass Gaussian filter using several mask sizes (3 × 3, 5 × 5, and 7 × 7) and two standard deviation values (σ = 0.5, σ = 0.6). It can be concluded from Table 5 that high NC values are achieved under low-pass Gaussian filtering with the different mask sizes. In addition, one can see from Table A8 that the proposed method can resist low-pass Gaussian filtering for the whole dataset, and the minimum average NC value is 0.9812.
Robustness against lossy compression is crucial due to the wide diffusion of lossy compression tools and the widespread use of the JPEG format. To assess the performance from this point of view, JPEG compression is iteratively applied to the watermarked images, each time decreasing the quality factor, ranging from 90 to 5. Table 6 summarizes the results obtained in terms of NC after JPEG compression using several quality factors for the 30 test images. As can be seen, the proposed method exhibits good robustness against this attack. Furthermore, the robustness against JPEG2000 has been investigated using different compression ratios (CR) varying from 1 to 10. Table 7 depicts the results in terms of normalized correlation against the JPEG2000 attack using the 30 images. It can be seen from Table 6 that the proposed method can withstand the JPEG attack when the quality factor is above 40. For quality factors below 40, the watermark can still be recognized since the NC values are above 0.7. Regarding JPEG2000 compression, it can be seen from Table 7 that the proposed method can resist the JPEG2000 attack when the compression ratio (CR) is below 10. We consider that the obtained results are satisfactory since the minimum NC average over all test images is above 0.7.

One can see from Table 7 that the proposed method is quite robust against JPEG2000. The proposed method shows its limitation when the compression ratio (CR) is larger than 6, but the results are still satisfactory (NC = 0.7031, CR = 10).
Figure 10. Robustness in terms of NC against rotation attack.
Figures 10 and 11 show the robustness in terms of NC of the proposed technique against rotation and scaling, respectively, using four test images with several textures. The rotation attack is applied using several rotation angles ranging from 1° to 45°. The obtained results, presented in Table 8, show the good robustness of the proposed method against the rotation attack. Similarly, the test images have undergone the scaling attack with different scaling factors ranging from 0.1 to 2.5. It can be seen from Figure 11 that the proposed technique is able to withstand the scaling attack for all images. We note that the results for selected images under test are reported in Figures 10 and 11. The remaining results for rotation and scaling attacks are reported in Figures A1 and A2.
Figure 11. Robustness in terms of NC against scaling attack.
To further test the robustness of the proposed method, different combinations of attacks have been carried out. Table 9 lists a set of combinations of image processing attacks, while Table 10 lists a set of combinations of both geometric and image processing attacks. It can be concluded from these tables that the proposed method is robust to combinations of both types of attacks, since all the obtained NC values are greater than 0.96. In addition, one can see from Table A8 the resistance of the proposed method to combined attacks for all thirty test images. Moreover, as depicted in Figure 13, the extracted watermark is well recognizable even after applying several attacks to the watermarked image, which indicates the good robustness of the proposed method.
Table A7 shows the robustness evaluation using the bit error rate (BER) against a wide range of attacks. The presented results are the average BER values over the 30 test images shown in Figures 8 and A3. It can be seen from Table A7 that the proposed method can resist the majority of the attacks, such as image processing attacks (filtering, noise, etc.), compression attacks (JPEG and JPEG2000), geometric attacks (rotation, scaling, translation, and cropping), and combined attacks. The BER values calculated between the original watermark and the extracted one are near zero, which illustrates the robustness of the proposed technique.
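For readers who want to reproduce the two robustness measures used in this section, the sketch below computes the BER and one common form of the normalized correlation for binary watermarks. The function names are hypothetical, and the NC formula used here (inner product divided by the product of the Euclidean norms) is an assumption: the exact definition used for the reported results is the one given in the evaluation metrics of the paper and may differ.

```python
import math


def bit_error_rate(original, extracted):
    """Fraction of watermark bits that differ between the two sequences."""
    if len(original) != len(extracted) or not original:
        raise ValueError("watermarks must be non-empty and of equal length")
    errors = sum(1 for a, b in zip(original, extracted) if a != b)
    return errors / len(original)


def normalized_correlation(original, extracted):
    """Assumed NC form: sum(w * w') / (sqrt(sum(w^2)) * sqrt(sum(w'^2)))."""
    if len(original) != len(extracted) or not original:
        raise ValueError("watermarks must be non-empty and of equal length")
    num = sum(a * b for a, b in zip(original, extracted))
    den = (math.sqrt(sum(a * a for a in original))
           * math.sqrt(sum(b * b for b in extracted)))
    if den == 0:
        raise ValueError("NC is undefined for an all-zero watermark")
    return num / den


if __name__ == "__main__":
    # Hypothetical 8-bit watermark with a single flipped bit after an attack.
    w_original = [1, 0, 1, 1, 0, 1, 0, 0]
    w_extracted = [1, 0, 1, 1, 0, 1, 0, 1]
    print("BER:", bit_error_rate(w_original, w_extracted))
    print("NC: ", round(normalized_correlation(w_original, w_extracted), 4))
```

A BER of 0 and an NC of 1 correspond to a perfectly recovered watermark, so the two measures move in opposite directions as the attack strength increases.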
One can see from Table A7 that the robustness performance in terms of BER decreases when the JPEG quality factor decreases. However, even for low values of this parameter (5%), the watermark can still be recovered. Similarly, when noise is applied with a high density (0.01 or higher), the BER increases. For Gaussian smoothing with a 7 × 7 window and σ = 9, however, the obtained results remain comparable. It can be observed from Table A7 that the proposed method is robust against histogram equalization, cropping, and scaling. As shown in Table A7, the proposed technique can resist rotation for angles below 40°, Gaussian noise (σ = 0.005), salt and pepper noise (σ = 0.001), low-pass Gaussian filtering (σ = 0.5 with 3 × 3, 5 × 5, 7 × 7, and 9 × 9 masks, and σ = 0.6 with a 3 × 3 mask), Gaussian smoothing (σ = 0.5, 5 × 5), JPEG when the quality factor is above 50%, and JPEG2000 for compression ratios below 8. According to Table A7, the robustness of the proposed method reaches its limits for the following attack parameters: low-pass Gaussian filtering (σ = 0.6 with 5 × 5, 7 × 7, and 9 × 9 masks), Gaussian smoothing (σ = 0.5 with 5 × 5, 7 × 7, and 9 × 9 masks), JPEG when the quality factor is below 50%, and JPEG2000 for compression ratios above 8.
Table A8 reports the robustness results in terms of NC with the aim of evaluating the limitations of the proposed method. For Gaussian noise, the watermark can still be recovered until the density reaches 0.8. For salt and pepper noise with density 0.7, the watermark can still be extracted. For JPEG compression with a quality factor below 4, the watermark cannot be properly extracted. After applying Gaussian smoothing with a 9 × 9 window and σ = 10, the extracted watermark cannot be recognized. To summarize, one can see from Table A8 that the proposed method cannot resist these extreme attacks. This is due to the severe damage they cause to the image, which leads to a large decrease of robustness in terms of NC.
We note that all the test images are subjected to the same attacks in Tables 2, 3, 8, A7, and A8.

Comparison of Imperceptibility and Robustness
Table 11 presents the comparison in terms of imperceptibility between the proposed scheme and the schemes in [26-28]. It is clear from Table 11 that the proposed method shows better imperceptibility in terms of PSNR than the methods of Lagzian et al. [26], Makbol et al. [27], and Singh et al. [28]. It can be seen from Table 12 that using DWT-DCT alone fails to provide robustness to geometric attacks, while adding SIFT removes this weakness.
We have compared our work with state-of-the-art methods based on the results reported in the corresponding papers. We have not implemented the alternative methods. Thus, for the results presented in Tables 13-17, A1, and A2, we have compared the proposed work only for the attacks reported for the alternative techniques.
To further evaluate the robustness of the proposed method, it has been compared to [13,21,26-29] in terms of normalized correlation (NC). Additionally, the watermarked images have undergone several combined attacks. These combined attacks include image processing manipulations as well as geometric operations.
Tables 13 and 14 show the comparison results in terms of robustness with the methods in [26,27] under several attacks, including rotation, Gaussian noise, salt and pepper noise, median filtering, JPEG compression, and histogram equalization. It can be observed that our method outperforms the schemes in [26,27] for the majority of the attacks. Table 15 reports the robustness results in terms of NC against different attacks compared to the method of Singh et al. [28]. One can see from Table 15 that the proposed method shows high robustness compared to [28] against different attacks including Gaussian noise, salt and pepper noise, median filtering, histogram equalization, JPEG compression, and rotation.
To further evaluate the robustness performance of the proposed method, we compare it with Zhang et al.'s method [13]. To this end, the watermarked image has undergone several geometric distortions as well as image processing attacks. The obtained results, shown in Table 16, indicate the superiority of the proposed scheme. The rotation and scaling attacks have been investigated in Tables 14-17. It can be seen from these tables that the proposed method is quite robust to rotation and scaling attacks for several test images thanks to the SIFT operator. In addition, the proposed technique outperforms the schemes in [13,21,27,28].
Table 17 shows that one can distinguish three categories of attacks. In the first category, which includes Gaussian noise, salt and pepper noise, and cropping attacks, the proposed method clearly outperforms [13]. In the second category, which includes JPEG, rotation, and scaling attacks, the results are quite comparable, even if the proposed method performs slightly better. Finally, the third category contains a single attack (median filtering). In this case, [13] outperforms the proposed method. As shown in Table 17, the proposed method is quite robust to cropping, median filtering, rotation, and scaling, and outperforms the methods in [13,21]. Table 9 shows the obtained results in terms of NC after carrying out several combined attacks. It can be concluded from Table 9 that the proposed method is able to withstand combined attacks (all NC values are above 0.9937). Moreover, our scheme shows high robustness compared to Zhang's scheme [13].
Figure 14 displays the robustness comparison results in terms of normalized correlation between Zhang's scheme [13], Lyu's scheme [29], Liu's scheme [21], and the proposed scheme. The comparison with the method of Lyu et al. [29] (blue curve in Figure 14) shows that the proposed method is more effective whatever the attack under test. Note that the method in [29] uses a single transform with the SIFT descriptor. This highlights the importance of using both transforms. A closer look shows that the method in [29] performs notably worse for small rotations (5° and 10°) and scaling. The differences between the two methods are less pronounced for JPEG, cropping, and very small rotation (2°).
Regarding the comparison with the method of Liu et al. [21] (green curve in Figure 14), it appears that the proposed technique is clearly superior for the median filtering attack. Indeed, the NC values drop from 0.98 to 0.5. In addition, the proposed method shows superior robustness for small rotations (5° and 10°) and scaling. The results are comparable for JPEG, cropping, and minimal rotation (2°) attacks. This corroborates the reported properties of the SVD in cases where perturbations are small [12].
Figure 14 shows that the method presented in [13] (yellow curve) gives comparable results in terms of NC for JPEG, rotation, and scaling attacks compared to the proposed method. For cropping, the proposed method exhibits higher robustness than the scheme reported in [13], while the opposite holds for the median filtering attack. These results are not surprising. Indeed, both methods use two transforms associated with the SIFT descriptor.
One can see from Table 13 that the proposed method outperforms the technique proposed in [26] for all attacks except JPEG compression with quality factor 50 and median filtering. For these two attacks, even though the method in [26] outperforms the proposed method, the robustness results in terms of NC are comparable. It can be seen from Table 14 that the proposed method is robust against the tested attacks except for JPEG compression, for which the alternative method [27] shows its superiority in terms of robustness. Similarly, in Tables 15-17, it can be observed that the proposed method fails to show its superiority in terms of robustness only for median filtering (Tables 15-17) and rotation (Table 17).
It can be seen that the proposed technique outperforms the scheme in [29] for a wide range of attacks such as rotation, JPEG, salt and pepper noise, median filtering, and cropping (25% and 50%). One can see from Table A2 that the proposed method obtains comparable results in terms of robustness for centered cropping (75%).

The robustness of the proposed method is compared to our previous work [24] and the scheme in [34]. The attacks used for the comparison are applied to three images (Lena, Peppers, and Baboon), as shown in Tables A5 and A6. For these three images, the proposed method outperforms the scheme in [34] for JPEG, JPEG2000, histogram equalization, and cropping attacks. In addition, the proposed technique provides higher robustness than the scheme in [24] for JPEG, JPEG2000, histogram equalization, and cropping attacks. In sum, the proposed method shows comparable results in terms of robustness against geometric attacks. At the same time, it can outperform our previous method [24] since SIFT is used to correct geometric attacks.
Table A3 presents the comparison of the robustness of the proposed technique with the scheme in [44] after applying several attacks, such as additive noise, median filtering, histogram equalization, JPEG, rotation, and scaling. It can be observed from Table A3 that the proposed algorithm shows high robustness for Gaussian noise, histogram equalization, rotation, and JPEG (when QF is greater than 50) attacks. The proposed technique achieves comparable results for median filtering, JPEG when QF is below 40, and scaling (for zoom greater than 0.9) attacks.
Table A4 shows the results of robustness compared to the scheme of Chen et al. [45] in terms of BER. The comparison has been made for three different images: Lena, Mandrill, and Peppers. According to Table A4, the proposed method provides high robustness for JPEG, scaling, and median filtering. For rotation and cropping attacks, the proposed method achieves comparable results in terms of robustness.
To summarize, one can conclude from the experiments that combining a hybrid DWT-DCT scheme with the SIFT descriptor yields significant robustness gains for several attacks while preserving good imperceptibility.

Conclusions
In this paper, a robust image watermarking method based on SIFT in the DWT-DCT domain is presented. Its goal is to ensure robustness against both geometric and image processing attacks while preserving high imperceptibility. The proposed method takes advantage of combining the DWT and DCT transforms to ensure robustness in the face of common image processing attacks such as filtering, histogram equalization, JPEG compression, and noise attacks without degrading the image quality. At the same time, the characteristics of the SIFT descriptor are used to obtain robustness against geometric attacks, especially rotation, scaling, and translation. The experimental results and comparisons have demonstrated the high robustness of the proposed method against both common image processing attacks and geometric attacks while preserving good imperceptibility. Future work will focus on using a meta-heuristic algorithm to find the optimal watermark strength.

Acknowledgments:
We thank the anonymous reviewers for critically reading the manuscript and suggesting substantial improvements.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Table A1. Robustness comparison between Hu's method [34] and the proposed scheme in terms of NC.