Article

Blind Audio Watermarking Based on Parametric Slant-Hadamard Transform and Hessenberg Decomposition

by Pranab Kumar Dhar 1,*, Azizul Hakim Chowdhury 1 and Takeshi Koshiba 2
1 Department of Computer Science and Engineering, Chittagong University of Engineering and Technology (CUET), Chattogram 4349, Bangladesh
2 Faculty of Education and Integrated Arts and Sciences, Waseda University, 1-6-1 Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
* Author to whom correspondence should be addressed.
Symmetry 2020, 12(3), 333; https://doi.org/10.3390/sym12030333
Submission received: 26 January 2020 / Revised: 20 February 2020 / Accepted: 21 February 2020 / Published: 26 February 2020

Abstract

Digital watermarking has been widely utilized for ownership protection of multimedia content. This paper introduces a blind symmetric audio watermarking algorithm based on the parametric Slant-Hadamard transform (PSHT) and Hessenberg decomposition (HD). In the proposed algorithm, the watermark image is first preprocessed to enhance security. Then, the host signal is divided into non-overlapping frames and the samples of each frame are reshaped into a square matrix. Next, PSHT is performed on each square matrix individually, a part of this transformed matrix of size m×m is selected, and HD is applied to it. Euclidean normalization is calculated from the 1st column of the Hessenberg matrix and is then used for embedding and extracting the watermark. Simulation results confirm the imperceptibility of the proposed method for watermarked audios. Moreover, the proposed algorithm is shown to be highly robust against numerous attacks. Furthermore, comparative analysis substantiates its superiority over other state-of-the-art methods.

1. Introduction

Nowadays, the significant improvement of the internet makes it possible to easily access different multimedia data. Thus, new challenges related to copyright protection and content tampering are introduced every day. Digital watermarking has been effectively utilized to tackle these challenges. It is a process of embedding secret information into digital content for authenticity. The major applications of digital watermarking include data authentication, fingerprinting, copyright protection, ownership protection, and broadcast monitoring [1]. The primary requirements of watermarking methods are (i) imperceptibility, (ii) robustness, (iii) data payload, and (iv) security [2]. The imperceptibility property of a watermarking algorithm defines the indistinguishability between the host signal and the watermarked signal. The robustness property of a watermarking algorithm is the ability to sustain the watermark against numerous signal processing attacks. The data payload of a watermarking algorithm defines the number of watermark bits that are embedded into the host signal. The security of a watermarking algorithm ensures that a watermark can be detected only by an authorized person. The main challenge of a watermarking algorithm is to maintain a good trade-off among these requirements. In general, digital watermarking can be classified by different properties. On the basis of the robustness property, digital watermarking can be classified into robust and fragile (or semi-fragile) watermarking. Moreover, watermarking methods can be classified into blind, semi-blind, and non-blind. While a blind watermarking method can detect the watermark without the host signal, a non-blind method requires the host signal to extract the watermark and a semi-blind method needs some information about the host signal to extract the watermark.
In this paper, we introduce a blind symmetric audio watermarking algorithm using a parametric Slant-Hadamard transform (PSHT), Hessenberg decomposition (HD), and Euclidean normalization, which provides a good trade-off among imperceptibility, robustness, and data payload.
The remainder of this paper is organized as follows. Section 2 provides the related research that includes a brief summary of recent methods. Section 3 briefly describes the background information including PSHT and HD. Section 4 introduces the proposed watermarking method consisting of watermark preprocessing, watermark embedding, and extraction processes. Section 5 provides the experimental results and compares the performance of the proposed method with recent methods in terms of imperceptibility and robustness. Finally, in Section 6, the conclusion of this paper is presented.

2. Related Research

An extensive survey on audio watermarking techniques is described in [1,2]. According to the domain, watermarking is classified into time domain and transform domain techniques. Time domain techniques embed a watermark into the audio signal by modifying its coefficients directly [3]. This technique is easy to implement and requires few computational resources. On the other hand, the transform domain technique is applied to coefficients obtained as the result of transformation of either a whole audio or a frame of the audio. Some well-known and conventional transform domain techniques are the discrete wavelet transform (DWT) [4], discrete cosine transform (DCT) [5], and fast Fourier transform (FFT) [6]. Pandey et al. [3] presented a method that uses the pseudo-random gray sequence property. However, the imperceptibility result of this method is not very high and the robustness result is provided for very few attacks. Kaur et al. [4] suggested a method based on a mathematical model by using features such as energy, short time energy, and zero cross means, but robustness against some attacks is quite low. Tsai et al. [5] proposed a watermarking method based on energy averaging. However, the data payload of this method is not reported. In [6], a watermarking scheme was proposed based on Lucas regular sequence (LRS) and FFT. However, this scheme shows less robustness against some of the common attacks. Dhar et al. [7] proposed a DCT-based algorithm using singular value decomposition (SVD) and exponential-log operations (ELO) where the watermark is embedded to the highest power of DCT coefficients, but robustness results against some common attacks were not reported. Karnjana et al. [8] introduced a method based on singular spectrum analysis (SSA) and psychoacoustic model (PM), but it shows quite low robustness against some common attacks. The authors of [9] proposed a multifunctional algorithm based on chaotic scrambling. 
However, the peak signal-to-noise ratio (PSNR) of this method is quite low. In [10], the authors proposed a method based on DCT, singular value decomposition (SVD), entropy, and log-polar transformation (LPT). It shows good results for imperceptibility, but it does not show good robustness results against some common signal processing attacks. Hwang et al. [11] introduced a watermarking method based on quantization index modulation (QIM) and SVD, but the imperceptibility and robustness of this scheme are somewhat low. In [12], a watermarking method is proposed based on flexible segmentation (FS) and adaptive embedding (AE), but it provides low SNR and low robustness against some common attacks. Hu et al. [13] suggested a method in the dual domain using flexible segmentation and adaptive embedding where binary watermark bits are inserted into discrete wavelet packet transform coefficients. However, it shows slightly poor results for imperceptibility. In [14], the authors introduced a watermarking algorithm using DWT and direct-sequence spread spectrum (DSSS). However, the robustness result of this method against some attacks is quite low. Irawati et al. [15] presented a method based on DCT and QR decomposition. The SNR of this method ranges from 11 dB to 27 dB, which is much lower than the basic requirement, and the bit error rate (BER) against some attacks is also quite high. Gupta et al. [16] suggested a watermarking method using lifting wavelet transform (LWT) and adaptive quantization. Although this method is blind, the SNR and normalized correlation (NC) of this method are poor. In [17], a watermarking scheme is proposed using audio characteristics and scrambling encryption. This scheme shows high security; however, it has low robustness against some attacks. In [18], a watermarking scheme is suggested using empirical mode decomposition (EMD) where an intrinsic feature of the final residual is used to embed the watermark. 
It shows good robustness, but the objective listening test was not performed and the data payload was also not reported. Safitri et al. [19] presented a method using DWT, SVD, and BCH code where watermark bits are inserted using QIM. However, the PSNR of this method is somewhat low and robustness against some attacks was not evaluated. A histogram-based audio watermarking using stationary wavelet transform (SWT) and synchronization is suggested in [20]. However, the data payload of this method is quite low and the BER of this method against some attacks is quite high. An audio watermarking method using phase shifting is introduced in [21]. However, the PSNR result of this method is not reported and the robustness result against some attacks is quite low. From the above studies, we observed that some methods have low robustness, whereas others have low imperceptibility or a low data payload. To overcome the limitations stated above, in this paper, we suggest a blind symmetric audio watermarking algorithm based on PSHT, HD, and Euclidean normalization. To the best of our knowledge, this is the first audio watermarking algorithm that utilizes PSHT, HD, and Euclidean normalization jointly. The main features of the proposed algorithm are as follows: (i) it applies PSHT, HD, and Euclidean normalization jointly; (ii) the logistic map is used for scrambling the watermark to guard against unauthorized detection; (iii) it embeds the watermark into the largest value of the 1st column of the Hessenberg matrix using a new embedding equation; (iv) the watermark is extracted without the host signal; (v) it ensures the trade-off among imperceptibility, robustness, and data payload. Simulation results demonstrated that our proposed method is highly robust against numerous attacks. The BER of the proposed method varies from 0 to 6.54, whereas the BER of the recent methods [4,5,6,7,8,9,10,11,12] varies from 0 to 17.76. 
The PSNR of the proposed method varies from 43.81 to 47.75, whereas the PSNR of the recent methods varies from 19.39 to 44.81. In other words, the proposed method outperforms state-of-the-art methods in terms of robustness and imperceptibility.

3. Background Information

3.1. Parametric Slant-Hadamard Transform (PSHT)

The parametric Slant-Hadamard transform (PSHT) was introduced by Agaian and is mostly used for signal processing [22]. PSHT includes parameters whose values affect the fidelity, robustness, and imperceptibility properties. Let f denote the original signal and F denote the transformed signal. Then, the two-dimensional PSHT can be described as:
$$F = S_{2^n}\, f\, S_{2^n}^{T}, \tag{1}$$
where $S_{2^n}$ represents a $2^n \times 2^n$ parametric slant-Hadamard matrix with real elements. The inverse transform to recover f from the transformed matrix F is given by:
$$f = S_{2^n}^{T}\, F\, S_{2^n}. \tag{2}$$
The parametric slant-Hadamard matrix of order $2^n$ is obtained from the matrix of order $2^{n-1}$ with the help of the Kronecker product operator:
$$S_{2^n} = \frac{1}{\sqrt{2}}\, Q_{2^n} \left( I_2 \otimes S_{2^{n-1}} \right), \quad n > 1, \tag{3}$$
where $I_2$ represents the identity matrix of order 2 and $Q_{2^n}$ denotes the recursion kernel matrix. $Q_{2^n}$ can be described as follows:
$$Q_{2^n} = \begin{bmatrix}
\begin{matrix} 1 & 0 \\ a_{2^n} & b_{2^n} \end{matrix} & 0_{2^{n-1}-2} & \begin{matrix} 1 & 0 \\ -a_{2^n} & b_{2^n} \end{matrix} & 0_{2^{n-1}-2} \\
0_{2^{n-1}-2} & I_{2^{n-1}-2} & 0_{2^{n-1}-2} & I_{2^{n-1}-2} \\
\begin{matrix} 0 & 1 \\ -b_{2^n} & a_{2^n} \end{matrix} & 0_{2^{n-1}-2} & \begin{matrix} 0 & -1 \\ b_{2^n} & a_{2^n} \end{matrix} & 0_{2^{n-1}-2} \\
0_{2^{n-1}-2} & I_{2^{n-1}-2} & 0_{2^{n-1}-2} & -I_{2^{n-1}-2}
\end{bmatrix} \tag{4}$$
where $0_M$ denotes the all-zero matrix of size M×M and $\otimes$ denotes the Kronecker product [22].
The parameters $a_{2^n}$ and $b_{2^n}$ are defined as:
$$a_{2^n} = \sqrt{\frac{3\,(2^{2n-2})}{4\,(2^{2n-2}) - \beta_{2^n}}} \quad \text{and} \quad b_{2^n} = \sqrt{\frac{(2^{2n-2}) - \beta_{2^n}}{4\,(2^{2n-2}) - \beta_{2^n}}} \tag{5}$$
The PSHT can be categorized into four groups based on the value of the parameter β:
(i) for all $\beta_{2^n} = 1$, it represents the classical slant transform;
(ii) for $\beta_{2^n} = 2^{2n-2}$ and n > 1, it represents the Walsh-Hadamard transform;
(iii) for $\beta_4 = \beta_8 = \dots = \beta_{2^n} = \beta$ and |β| ≤ 4, it represents the constant beta slant transform;
(iv) for $\beta_4 \neq \beta_8 \neq \dots \neq \beta_{2^n}$, $-2^{2n-2} \leq \beta_{2^n} \leq 2^{2n-2}$, and n = 2, 3, 4, …, it represents the multiple beta slant transform.
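The recursion for the parametric slant-Hadamard matrix can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function name `psht_matrix`, the dictionary layout of `betas`, the sign pattern of the kernel (taken from the standard slant recursion), and the use of the normalized 2×2 Hadamard matrix as the starting matrix of order 2 are assumptions made for the example.

```python
import numpy as np

def psht_matrix(n, betas):
    """Build the order-2^n parametric slant-Hadamard matrix by recursion.

    betas maps each order 2^k (k = 2..n) to its parameter beta_{2^k}.
    Assumption: the recursion starts from the normalized 2x2 Hadamard matrix.
    """
    S = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    for k in range(2, n + 1):
        N, h = 2 ** k, 2 ** (k - 1)
        x = 2.0 ** (2 * k - 2)
        beta = betas[N]
        a = np.sqrt(3.0 * x / (4.0 * x - beta))
        b = np.sqrt((x - beta) / (4.0 * x - beta))
        # Recursion kernel Q: two 2x2 "slant" rows per half plus identity blocks.
        Q = np.zeros((N, N))
        Q[0, [0, h]] = [1.0, 1.0]
        Q[1, [0, 1, h, h + 1]] = [a, b, -a, b]
        Q[h, [1, h + 1]] = [1.0, -1.0]
        Q[h + 1, [0, 1, h, h + 1]] = [-b, a, b, a]
        I = np.eye(h - 2)  # empty (0x0) when N = 4
        Q[2:h, 2:h], Q[2:h, h + 2:] = I, I
        Q[h + 2:, 2:h], Q[h + 2:, h + 2:] = I, -I
        S = Q @ np.kron(np.eye(2), S) / np.sqrt(2.0)
    return S

# Constant beta slant transform of order 16 with beta = 2 at every stage.
S16 = psht_matrix(4, {4: 2.0, 8: 2.0, 16: 2.0})
rng = np.random.default_rng(0)
f = rng.standard_normal((16, 16))
F = S16 @ f @ S16.T        # forward 2-D transform
f_back = S16.T @ F @ S16   # inverse 2-D transform
```

Because the kernel satisfies Q Qᵀ = 2I, every matrix produced this way is orthogonal, so `f_back` should match `f` up to rounding. Setting β₄ = 4 in `psht_matrix(2, {4: 4.0})` gives a matrix whose entries all have magnitude 1/2, which is the Walsh-Hadamard case (ii).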

3.2. Hessenberg Decomposition (HD)

The Hessenberg decomposition (HD) decomposes a general square matrix A into the following form:
$$A = P H P^{T} \tag{6}$$
where P denotes an orthogonal matrix and H is an upper Hessenberg matrix, i.e., $h_{ij} = 0$ whenever $i > j + 1$ [23].
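One common way to compute such a decomposition is a sequence of Householder reflections. The sketch below is an illustrative NumPy version under that assumption; the paper does not state which routine the authors used, and the function name `hessenberg` is chosen here for the example. SciPy users can obtain an equivalent factorization from `scipy.linalg.hessenberg(A, calc_q=True)`.

```python
import numpy as np

def hessenberg(A):
    """Householder reduction: returns (P, H) with A = P @ H @ P.T."""
    H = np.array(A, dtype=float)   # working copy
    n = H.shape[0]
    P = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue               # column already in Hessenberg form
        # Householder vector that maps x onto a multiple of e_1.
        v = x.copy()
        v[0] += np.copysign(norm_x, x[0])
        v /= np.linalg.norm(v)
        # Apply the reflector U = I - 2 v v^T from the left and the right.
        H[k + 1:, :] -= 2.0 * np.outer(v, v @ H[k + 1:, :])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
        # Accumulate P = U_1 U_2 ... so that A = P H P^T.
        P[:, k + 1:] -= 2.0 * np.outer(P[:, k + 1:] @ v, v)
    return P, H

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
P, H = hessenberg(A)
```

With this construction the reflectors never touch the first coordinate, so the Euclidean norm of the first column of `H` equals that of the first column of `A`.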

4. Proposed Watermarking Algorithm

Let Y = {y(n), 1 ≤ n ≤ S} be the host signal containing S samples and W = {w(k, l), 1 ≤ k ≤ M, 1 ≤ l ≤ M} represent the binary watermark image. Let $w(k, l) \in \{0, 1\}$ be the pixel value at the point (k, l) that will be embedded into the host audio.

4.1. Watermark Preprocessing

To enhance confidentiality, the watermark is first preprocessed. The proposed method uses a logistic map, which exhibits chaotic behavior, to encrypt the binary watermark image, and this feature ensures the confidentiality of the proposed method. The mapping is defined as follows:
$$y(i+1) = \begin{cases} a \times y(i) \times (1 - y(i)), & \text{if } y(i) > 0 \\ a \times y(i) \times (1 - y(i)) + b, & \text{otherwise} \end{cases} \tag{7}$$
where y(1) ∈ (0, 1) and a, b are real parameters according to the map's initial condition. After this, a binary sequence is obtained with the help of the following equation:
$$z(i) = \begin{cases} 1, & \text{if } y(i) > T \\ 0, & \text{otherwise} \end{cases} \tag{8}$$
where T represents a predefined threshold value, which depends on the real parameters a and b. Moreover, T is proportional to a and b, i.e., as the values of a and b increase, the value of T also increases and vice versa.
The original binary watermark image W is converted into a one-dimensional sequence r, where r = {r(i), i = 1, 2, 3, …, M × M}. Then, in the final stage of preprocessing, r(i) is encrypted using z(i) with the help of the following equation:
$$u(i) = z(i) \oplus r(i), \quad 1 \leq i \leq M \times M \tag{9}$$
where $\oplus$ is the exclusive-or (XOR) operation. After this encryption process, u(i) cannot be found through random search. In this process, y(1), a, and b can be used as a secret key K. The pseudo code of the watermark preprocessing is presented in Algorithm 1.
Algorithm 1: Watermark Preprocessing
Variable Declaration:
W (i = 1, 2, …, M; j = 1, 2, …, M): the watermark image
y(i + 1) (i = 1, 2, …, M × M): logistic mapping parameter
a, b: real parameters
z(i) (i = 1, 2, …, M × M): binary sequence
T: predefined threshold value
r(i) (i = 1, 2, …, M × M): new one-dimensional sequence from W
u(i): encrypted watermark sequence
Watermark Preprocessing Procedure:
Let y(1) ∈ (0, 1)
convert W into the one-dimensional sequence r
for i = 1: M × M do
    calculate y(i + 1) using Equation (7)
    calculate z(i) using Equation (8)
    calculate u(i) using Equation (9)
end for
return encrypted watermark sequence
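The preprocessing procedure of Algorithm 1 can be sketched in plain Python as below. This is only an illustration under stated assumptions: the function names, the row-by-row flattening of W, and the key values `y1 = 0.37`, `a = 3.99`, `b = 0.0` are hypothetical choices for the demo and are not the settings reported in the experiments.

```python
def logistic_sequence(y1, a, b, length):
    """Generate y(1), ..., y(length) with the piecewise map of Equation (7)."""
    y = [y1]
    for _ in range(length - 1):
        prev = y[-1]
        nxt = a * prev * (1.0 - prev)
        if prev <= 0:          # the "otherwise" branch of Equation (7)
            nxt += b
        y.append(nxt)
    return y

def encrypt_watermark(r, y1, a, b, T):
    """Return (u, z): the encrypted bits of Equation (9) and the key stream of Equation (8)."""
    y = logistic_sequence(y1, a, b, len(r))
    z = [1 if v > T else 0 for v in y]
    u = [zi ^ ri for zi, ri in zip(z, r)]
    return u, z

# Demo with a 4x4 binary watermark (hypothetical data and key values).
W = [[1, 0, 0, 1],
     [0, 1, 1, 0],
     [0, 1, 1, 0],
     [1, 0, 0, 1]]
r = [bit for row in W for bit in row]   # assumed row-major flattening
u, z = encrypt_watermark(r, y1=0.37, a=3.99, b=0.0, T=0.5)
r_back = [zi ^ ui for zi, ui in zip(z, u)]   # XOR with the same key stream decrypts
```

Because XOR is its own inverse, regenerating `z` from the same key and applying it to `u` recovers `r`.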

4.2. Watermark Embedding Process

The proposed watermark embedding procedure is shown in Figure 1 and is described as follows:
1. The host signal Y is first divided into M × M non-overlapping frames $F = \{F_1, F_2, F_3, \dots, F_{M \times M}\}$ and each frame $F_i$ is converted into a two-dimensional matrix $C_i$ of size m×m, where i represents the frame number.
2. PSHT is applied on each matrix $C_i$ and the transformed matrix $T_i$ is obtained.
3. Then, each transformed matrix $T_i$ is sub-divided into N non-overlapping blocks $B = \{B_j, 1 \leq j \leq N\}$ of size n×n and the sum of the absolute mean of each block is calculated using the following equation:
$$Z_j = \frac{\sum_{k=1}^{n} \sum_{l=1}^{n} |B_j(k, l)|}{n \times n}, \quad 1 \leq j \leq N \tag{10}$$
where $|B_j(k, l)|$ denotes the absolute value of the coefficient at position (k, l) of the jth block $B_j$ and $Z_j$ denotes the absolute mean of the jth block.
4. Find $Z_{max} = \max\{Z_1, Z_2, Z_3, \dots, Z_N\}$ of the blocks $\{B_1, B_2, B_3, \dots, B_N\}$, where the max operation returns the largest value in $\{Z_1, Z_2, Z_3, \dots, Z_N\}$.
5. The block with the largest value $Z_{max}$ is selected for decomposition and, for simplicity, it is represented as $R_i$. HD is then performed on the selected n×n matrix $R_i$, which is represented by:
$$R_i = P_i \times H_i \times P_i^{T} \tag{11}$$
where $P_i$ denotes the orthogonal matrix and $H_i$ denotes the Hessenberg matrix.
6. Euclidean normalization of the 1st column of the Hessenberg matrix $H_i$ is calculated using the following equation:
$$n_i = \sqrt{\sum_{k=1}^{n} H_i^{2}(k, 1)} \tag{12}$$
where $H_i(k, 1)$ denotes the coefficient in the kth row and 1st column of the Hessenberg matrix and k ≤ n.
Let $d_i = \operatorname{mod}(n_i, 1)$, where $d_i$ is the fractional part of $n_i$, and $x_i = \operatorname{floor}(d_i \times s)$, where $x_i$ is the integer part of $d_i \times s$ and s is a scaling factor (s = 10 in the experiments of Section 5).
7. The watermark bit is embedded into the Euclidean normalization n i of the 1st column of Hessenberg matrix Hi. Watermark is embedded using the following rule:
(i) when mod (xi, 2) = 0, the following equation is used:
n i = { n i x 1 s   i f     u ( i ) = 1 n i + x 2         i f     u ( i ) = 0
(ii) when mod (xi, 2) = 1, the following equation is used:
n i = { n i + x 2         i f     u ( i ) = 1 n i x 1 s   i f     u ( i ) = 0
8. Finally, the largest coefficient of the 1st column of the Hessenberg matrix, denoted by $H_i(k, 1)_{largest}$, is modified using the following equation:
$$H_i'(k, 1)_{largest} = H_i(k, 1)_{largest} \times \frac{n_i'}{n_i} \tag{15}$$
9. The modified largest coefficient $H_i'(k, 1)_{largest}$ is re-inserted into $H_i$ to obtain the modified Hessenberg matrix $H_i'$ and inverse HD is applied to obtain the modified matrix $R_i'$, which can be defined as:
$$R_i' = P_i \times H_i' \times P_i^{T} \tag{16}$$
10. The N non-overlapping blocks, including the modified block, are recombined to obtain $T_i'$. Inverse PSHT is applied to $T_i'$ to obtain the modified matrix $C_i'$.
11. Each watermarked frame $F_i'$ is obtained by reshaping each modified matrix $C_i'$.
12. Finally, the watermarked signal $Y'$ is obtained by concatenating all the watermarked frames.
The pseudo code of the watermark embedding procedure is presented in Algorithm 2.
Algorithm 2: Watermark Embedding
Variable Declaration:
Y: host audio signal
F: segmented non-overlapping frame
$C_i$ (i = 1, 2, …, M × M): frame represented as a two-dimensional matrix of size m×m
$T_i$ (i = 1, 2, …, M × M): transformed matrix
$B_j$ (j = 1, 2, …, N): non-overlapping block
$Z_j$ (j = 1, 2, …, N): sum of absolute mean of the jth block
$R_i$ (i = 1, 2, …, M × M): block with maximum sum of absolute mean
$H_i$ (i = 1, 2, …, M × M): Hessenberg matrix
$n_i$ (i = 1, 2, …, M × M): the 2nd order Euclidean normalization
$x_i$ (i = 1, 2, …, M × M): quantization coefficient for embedding
Watermark Embedding Procedure:
for i = 1: M × M do
    convert the ith frame coefficients into the two-dimensional matrix $C_i$
    apply PSHT on $C_i$ to obtain $T_i$
    for j = 1: N do
        subdivide $T_i$ into the non-overlapping block $B_j$
        calculate the sum of absolute mean $Z_j$ of each block $B_j$ using Equation (10)
    end for
    select the block $R_i$ with maximum sum of absolute mean $Z_{max}$
    apply HD on matrix $R_i$ using Equation (11)
    calculate $n_i$ using Equation (12)
    calculate $d_i$ and $x_i$
    update $n_i$ into $n_i'$ using Equations (13) and (14)
    modify the largest Hessenberg coefficient $H_i(k, 1)_{largest}$ using Equation (15)
    apply inverse HD to obtain the modified matrix $R_i'$ using Equation (16)
    recombine the blocks into $T_i'$ and apply inverse PSHT on $T_i'$ to obtain $C_i'$
    reshape $C_i'$ into the watermarked frame $F_i'$
end for
concatenate all watermarked frames $F_i'$
return watermarked audio $Y'$
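The feature-extraction part of the embedding loop (reshaping a frame, transforming it, selecting the block with the largest absolute mean, and computing $n_i$, $d_i$, and $x_i$) can be sketched as follows. Several details are assumptions made only for this example: frames are reshaped row by row, the transform matrix `S` is passed in by the caller, the demo uses a normalized Sylvester-type Hadamard matrix as a simple orthogonal stand-in for the PSHT matrix, and the Hessenberg reduction is done with Householder reflections. The update of $n_i$ itself (Equations (13) and (14)) is deliberately not included.

```python
import numpy as np

def hessenberg(A):
    """Householder reduction: returns (P, H) with A = P @ H @ P.T."""
    H = np.array(A, dtype=float)
    n = H.shape[0]
    P = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue
        v = x.copy()
        v[0] += np.copysign(norm_x, x[0])
        v /= np.linalg.norm(v)
        H[k + 1:, :] -= 2.0 * np.outer(v, v @ H[k + 1:, :])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)
        P[:, k + 1:] -= 2.0 * np.outer(P[:, k + 1:] @ v, v)
    return P, H

def frame_features(frame, S, block=8, s=10):
    """Return (j, n_i, d_i, x_i) for one frame, following the first six embedding steps."""
    m = S.shape[0]
    C = np.asarray(frame, dtype=float).reshape(m, m)   # frame -> m x m matrix
    T = S @ C @ S.T                                    # 2-D transform
    corners = [(r, c) for r in range(0, m, block) for c in range(0, m, block)]
    Z = [np.abs(T[r:r + block, c:c + block]).mean() for r, c in corners]
    j = int(np.argmax(Z))                              # block with the largest Z_j
    r, c = corners[j]
    _, H = hessenberg(T[r:r + block, c:c + block])
    n_i = float(np.linalg.norm(H[:, 0]))               # norm of the 1st column
    d_i = n_i % 1.0                                    # fractional part
    x_i = int(np.floor(d_i * s))                       # quantization index
    return j, n_i, d_i, x_i

# Demo: 256-sample frame, 16x16 matrix, four 8x8 blocks.
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
S = H2
for _ in range(3):
    S = np.kron(S, H2)
S = S / 4.0                                            # orthonormal 16x16 stand-in
rng = np.random.default_rng(0)
frame = rng.uniform(-1.0, 1.0, 256)
j, n_i, d_i, x_i = frame_features(frame, S)
```

With a 256-sample frame and 8×8 blocks, `j` indexes one of four candidate blocks and `x_i` lies between 0 and 9 when s = 10.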

4.3. Watermark Extraction Process

The proposed watermark detection procedure is shown in Figure 2. The blind extraction of the watermark is described in the following steps:
1. The attacked watermarked audio $Y^*$ is first divided into M×M non-overlapping frames and each frame is converted into a two-dimensional matrix $C_i^*$.
2. $T_i^*$ is obtained by applying PSHT on each matrix $C_i^*$.
3. $T_i^*$ is sub-divided into N non-overlapping blocks $B^* = \{B_j^*, 1 \leq j \leq N\}$ and $Z_j^*$ is calculated. After that, $R_i^*$ is selected.
4. HD is then performed on $R_i^*$ to obtain the matrices $P_i^*$ and $H_i^*$. $n_i^*$ is calculated from $H_i^*$.
5. Then, $d_i^*$ and $x_i^*$ are calculated from $n_i^*$.
6. The encrypted watermark sequence is extracted using the following rule:
$$u^*(i) = \begin{cases} 1, & \text{if } \operatorname{mod}(x_i^*, 2) = 1 \\ 0, & \text{if } \operatorname{mod}(x_i^*, 2) = 0 \end{cases} \tag{17}$$
7. Chaotic decryption is performed using the secret key K in order to find the binary watermark sequence with the following equation:
$$r^*(i) = z(i) \oplus u^*(i) \tag{18}$$
8. Finally, the watermark is obtained by rearranging the binary sequence $r^*(i)$ into a square matrix $W^*$ of size M×M.
The pseudo code of the watermark extraction procedure is presented in Algorithm 3.
Algorithm 3: Watermark Extraction
Variable Declaration:
$Y^*$: attacked watermarked audio signal
$F^*$: attacked watermarked frame
$C_i^*$ (i = 1, 2, …, M × M): watermarked frame represented as a two-dimensional matrix of size m×m
$T_i^*$ (i = 1, 2, …, M × M): modified transformed matrix
$B_j^*$ (j = 1, 2, …, N): modified non-overlapping block
$Z_j^*$ (j = 1, 2, …, N): sum of absolute mean of the modified jth block
$R_i^*$ (i = 1, 2, …, M × M): modified block with maximum sum of absolute mean
$H_i^*$ (i = 1, 2, …, M × M): modified Hessenberg matrix
$n_i^*$ (i = 1, 2, …, M × M): modified 2nd order Euclidean normalization
$x_i^*$ (i = 1, 2, …, M × M): quantization coefficient for extraction
Watermark Extraction Procedure:
for i = 1: M × M do
    convert the coefficients of the ith frame into the two-dimensional matrix $C_i^*$
    apply PSHT on $C_i^*$ to obtain $T_i^*$
    for j = 1: N do
        subdivide $T_i^*$ into the non-overlapping block $B_j^*$
        calculate the sum of absolute mean $Z_j^*$ of each block $B_j^*$
    end for
    select the block $R_i^*$ with maximum sum of absolute mean $Z_{max}^*$
    apply HD on matrix $R_i^*$
    calculate $n_i^*$
    calculate $d_i^*$ and $x_i^*$
    calculate $u^*(i)$ using Equation (17)
    calculate $r^*(i)$ using Equation (18)
end for
reshape $r^*(i)$ into an M×M matrix
return watermark $W^*$
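The parity rule of Equation (17) and the XOR decryption of Equation (18) are easy to check on scalar values. In the sketch below, `extract_bit` and `decrypt` follow those two equations, while `embed_bit_standin` is NOT the authors' embedding rule of Equations (13) and (14): it is a generic bin-centre parity quantizer, assumed here only so that the extraction side can be exercised end to end. All function names are hypothetical.

```python
import math

def quantization_index(n, s=10):
    """x = floor(d * s), where d is the fractional part of n (n >= 0)."""
    d = n % 1.0
    return int(math.floor(d * s))

def extract_bit(n_marked, s=10):
    """Equation (17): the bit is the parity of the quantization index."""
    return quantization_index(n_marked, s) % 2

def embed_bit_standin(n, bit, s=10):
    """Stand-in embedder (an assumption, not Equations (13)-(14)).

    Moves the fractional part of n to the centre of a bin whose index
    has the requested parity, leaving the integer part unchanged.
    """
    x = quantization_index(n, s)
    if x % 2 != bit:
        x = x + 1 if x + 1 < s else x - 1
    return math.floor(n) + (x + 0.5) / s

def decrypt(u_bits, z_bits):
    """Equation (18): XOR the extracted bits with the chaotic key stream."""
    return [u ^ z for u, z in zip(u_bits, z_bits)]

# Demo on one value of the Euclidean normalization.
n_marked = embed_bit_standin(3.27, 1)   # index 2 (even) is moved to index 3 (odd)
bit = extract_bit(n_marked)
```

Placing the modified value at the centre of its bin keeps the extracted parity stable under perturbations of $n_i$ smaller than 1/(2s), which is one reason quantization-based rules of this kind tolerate mild attacks.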

5. Experimental Results and Discussion

In this section, the performance of our proposed algorithm is evaluated and compared with some state-of-the-art methods. In this study, we used 20 audio files belonging to four different audio groups as host audio signals, which are given below:
Group 1: 5 files containing pop music;
Group 2: 5 files containing classical music;
Group 3: 5 files containing jazz music;
Group 4: 5 files containing rock music.
All audio files are mono-channel, 16-bit, with a 44.1 kHz sampling rate, and each contains 262,144 samples (duration 5.94 s). The selected frame size for each audio is 256 samples. Therefore, we have 1024 frames for each audio. A binary watermark image and the corresponding encrypted watermark image of size 32×32 are shown in Figure 3. Thus, one watermark bit is embedded in each frame. In this study, the constant beta slant transform is used with parameters $\beta_4 = 2$, $\beta_8 = 2$, and $\beta_{16} = 2$. Moreover, the selected values of y(1), b, T, and s are 1, 1, 0.5, and 10, respectively. These parameters were chosen to obtain a good trade-off between imperceptibility and robustness. HD is applied on the matrix $R_i$ of size 8×8 to reduce the computational cost in space and time.

5.1. Imperceptibility Analysis

Imperceptibility property of the proposed algorithm is assessed by using both subjective and objective analysis.

5.1.1. Subjective Analysis

To verify imperceptibility, the perceptual quality of the watermarked audio should be evaluated. In this study, 10 participants were blindly given both the original and watermarked signals and were asked to differentiate these two signals based on a subjective difference grade (SDG) that ranged from 5.0 to 1.0 (imperceptible to very annoying) as given in Table 1. The average result of subjective grading is presented in Table 2. The result shows that the mean opinion score (MOS) of the proposed method lies between 4.9 and 5.0 for all watermarked audios, which confirms the imperceptibility of the watermarked audio.
Subjective evaluation was also conducted by another technique known as the ABX method. The test was evaluated with the help of 10 subjects. At first, each subject listened to both the host signal (A) and the watermarked signal (B). Then, they were given another unknown signal (X) and were asked to identify whether it was A or B. Five trials were conducted by each subject. Table 2 presents the results of the correct detection, which varied between 48% and 54%. These rates are close to the 50% expected from random guessing, indicating the high imperceptibility of the proposed method.

5.1.2. Objective Analysis

The objective assessment is generally measured by the SNR of the watermarked audio. According to the standard of the International Federation of the Phonographic Industry (IFPI), the SNR of watermarked audio should be more than 20 dB to satisfy the imperceptibility property [7]. The SNR of the proposed method for various audios is given in Table 2. We observed that the SNRs of the various audios are greater than 40 dB, which satisfies the international standard.
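The SNR formula is not spelled out in the text, so the sketch below assumes the usual definition: the ratio of host-signal energy to the energy of the embedding distortion, in decibels. The function name `snr_db` and the toy signals are illustrative only.

```python
import numpy as np

def snr_db(host, marked):
    """SNR in dB between a host signal and its watermarked version.

    Assumed definition: 10 * log10( sum(host^2) / sum((marked - host)^2) ).
    """
    host = np.asarray(host, dtype=float)
    marked = np.asarray(marked, dtype=float)
    distortion = marked - host
    return 10.0 * np.log10(np.sum(host ** 2) / np.sum(distortion ** 2))

# Toy check: a constant offset of 1% of the amplitude corresponds to 40 dB.
host = np.ones(100)
marked = host + 0.01
value = snr_db(host, marked)
```

Under this definition, a distortion whose amplitude is one hundredth of the host amplitude gives 40 dB, the level that the reported SNRs exceed.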
Moreover, objective assessment was also conducted using the objective difference grade (ODG), which is one of the outputs obtained from the perceptual evaluation of audio quality (PEAQ) measurement based on the ITU-R BS.1387 (International Telecommunication Union-Radiocommunication Sector) standard [7]. The ODG score lies between 0 and −4 (imperceptible to very annoying), as given in Table 1. The objective quality of different audios using the proposed method is evaluated in terms of ODG and the results are shown in Table 2. It is observed that all ODGs of our proposed algorithm range from −0.39 to −0.46, indicating that the original and watermarked audios are perceptually similar. Table 3 shows a comparative analysis between the proposed and several recent methods [4,12] in terms of SNR and MOS. From this comparison, it was observed that our proposed method shows better results in terms of SNR and MOS. In other words, the subjective and objective analyses indicate that the proposed method provides better performance than the other methods in terms of imperceptibility.

5.2. Robustness Analysis

The robustness of our proposed algorithm has been evaluated using (1) normalized correlation (NC) and (2) bit error rate (BER).
Normalized correlation (NC) compares the similarities between two images. It is calculated as follows:
$$NC(W, W^*) = \frac{\sum_{k=1}^{M} \sum_{l=1}^{M} w(k, l)\, w^*(k, l)}{\sqrt{\sum_{k=1}^{M} \sum_{l=1}^{M} w(k, l)\, w(k, l)}\ \sqrt{\sum_{k=1}^{M} \sum_{l=1}^{M} w^*(k, l)\, w^*(k, l)}} \tag{19}$$
where $W$ and $W^*$ denote the original watermark and the extracted watermark, respectively, and k, l denote the matrix indices. The value of NC ranges from 1 to 0. The correlation of the two images is very high when the NC is closer to one. On the other hand, the correlation of the images is very low when the NC is closer to zero.
BER is generally used to calculate the bit error rate between the original and extracted watermark, which is given by:
$$BER(W, W^*) = \frac{\sum_{k=1}^{M} \sum_{l=1}^{M} w(k, l) \oplus w^*(k, l)}{M \times M} \tag{20}$$
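Both metrics are straightforward to compute for binary watermark images. The following NumPy sketch uses hypothetical function names and returns the BER as a fraction; the BER figures quoted in the text appear to be percentages, so multiplying by 100 would be needed to compare against them. NC is undefined when either image is all zeros, a case the sketch does not handle.

```python
import numpy as np

def normalized_correlation(W, W_ext):
    """NC between the original and the extracted watermark."""
    W = np.asarray(W, dtype=float)
    W_ext = np.asarray(W_ext, dtype=float)
    num = np.sum(W * W_ext)
    den = np.sqrt(np.sum(W * W)) * np.sqrt(np.sum(W_ext * W_ext))
    return num / den

def bit_error_rate(W, W_ext):
    """Fraction of watermark bits that differ (XOR, then average)."""
    W = np.asarray(W, dtype=int)
    W_ext = np.asarray(W_ext, dtype=int)
    return np.sum(W ^ W_ext) / W.size

# Toy 2x2 example with one flipped bit.
W = np.array([[1, 0], [1, 1]])
W_ext = np.array([[1, 0], [0, 1]])
nc = normalized_correlation(W, W_ext)   # 2 / sqrt(3 * 2)
ber = bit_error_rate(W, W_ext)          # 1 of 4 bits differs
```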
For evaluating the robustness, various common signal processing attacks were applied on the watermarked audio signals which are given below:
  • Noise addition: Additive white Gaussian noise (AWGN) was added to the watermarked signal until the signal had an SNR of 20 dB.
  • Cropping: 1000 samples of the watermarked audio were removed from different positions, and these samples were then replaced with samples of the watermarked audio signal attacked by additive white Gaussian noise.
  • Re-sampling: The watermarked signal with a sample rate of 44.1 kHz was down-sampled to 22.05 kHz and then re-sampled back to 44.1 kHz.
  • Re-quantization: The watermarked audio was re-quantized from 16 bits to 8 bits.
  • Compression: The watermarked signal was compressed using MPEG-1 layer 3 compression (128 kbps).
  • Noise Reduction: Noise was removed from the watermarked audio with the help of the “Hiss removal” function.
  • Echo addition: An echo signal with a delay time of 150 ms and a decay rate of 35% was applied to the watermarked signal.
  • Distortion: The watermarked audio signal was distorted within a range of 0 dB to −10 dB.
  • Amplification: The watermarked audio was amplified to 1.25 times its original amplitude.
  • Delay: A delay time of 150 ms was used, and the volume of the delayed signal was 3% of that of the original signal.
  • Invert: The watermarked audio signal was fully inverted.
  • Low-Pass Filter: A low-pass filter with a cut-off frequency of 15,000 Hz was applied to the watermarked audio.
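A few of these attacks are simple enough to reproduce in NumPy, as sketched below. The text does not describe the tools or exact settings used, so everything here is an assumption made for illustration: signals are floating-point arrays in [−1, 1], the echo is modeled as a single delayed copy scaled by 0.35, re-quantization rounds to a symmetric 8-bit grid, and all function names are hypothetical.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise scaled so that the result has the requested SNR."""
    signal = np.asarray(signal, dtype=float)
    noise = rng.standard_normal(signal.shape)
    noise_power = np.mean(signal ** 2) / (10.0 ** (snr_db / 10.0))
    noise *= np.sqrt(noise_power / np.mean(noise ** 2))
    return signal + noise

def requantize(signal, bits=8):
    """Round a [-1, 1] signal to a symmetric grid with the given bit depth."""
    levels = 2 ** (bits - 1) - 1
    return np.round(np.clip(signal, -1.0, 1.0) * levels) / levels

def amplify(signal, gain=1.25):
    """Scale the amplitude by a constant gain."""
    return gain * np.asarray(signal, dtype=float)

def invert(signal):
    """Flip the polarity of the signal."""
    return -np.asarray(signal, dtype=float)

def add_echo(signal, fs=44100, delay_s=0.150, gain=0.35):
    """Add one delayed, attenuated copy of the signal (assumed echo model)."""
    signal = np.asarray(signal, dtype=float)
    d = int(round(delay_s * fs))
    out = signal.copy()
    out[d:] += gain * signal[:-d]
    return out

# Demo on one second of a 440 Hz tone.
rng = np.random.default_rng(1)
y = 0.5 * np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
noisy = add_awgn(y, 20.0, rng)
```

Attacks such as MP3 compression, hiss removal, and re-sampling depend on external codecs or filter designs and are not sketched here.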
Table 4 and Table 5 show the robustness result of our proposed algorithm in terms of NC and BER, which are obtained from various attacked watermarked audio signals. We observed that the proposed method recovers the watermark successfully from the attacked watermark audio signals for noise reduction, invert, and echo addition, as the NC values are 1 and BER values are 0.
Moreover, the proposed method shows good NC and BER values for amplification, distortion, delay, re-sampling, re-quantization, cropping, and low-pass filtering attack. The NC of the proposed method for various attacks varies from 0.9459 to 1. Moreover, the BER of the proposed method varies from 0 to 6.54 for various attacks. In other words, the NCs of our proposed method are greater than 0.9459 and BERs of the proposed method are less than 7%. Figure 4, Figure 5, Figure 6 and Figure 7 show the extracted watermark images for different audios against various attacks. From these figures, we observed that watermark is extracted without any errors in most of the cases, which proves the high robustness of the proposed method.
Table 6 illustrates a comparative analysis between the proposed and some recent methods [4,5,6,7,8,9,10,11,12] in terms of noise addition, resampling, re-quantization, and MP3 compression. From this table, we observed that our proposed method shows less BER than the other recent methods for noise addition. Moreover, it shows better result than that of the methods presented in [4,6,7,8,10,11] for the re-sampling attack. For the re-quantization attack, it shows better result than that of the methods proposed in [5,6,7,8] and for MP3 compression, it shows better result than that of the methods suggested in [5,6,7,8,9,11,12]. From these results, we can conclude that our proposed method provides lower BER values against some common attacks compared with some recent state-of-the-art methods. Overall, our proposed method shows better performance than the recent state-of-the-art methods in terms of imperceptibility and robustness. This is because the watermark bits were inserted into the largest value of the 1st column of the Hessenberg matrix of PSHT coefficients of each frame using a quantization function.

5.3. Data Payload

Data payload defines the number of bits that can be embedded into the original signal over a unit of time. It is measured by bits per second. The data payload P is defined as follows:
$$P = \frac{B}{T}\ \text{(bps)} \tag{21}$$
where T indicates the time duration of the original audio signal and B indicates the number of watermark bits to be embedded into the host signal. The standard value for data payload is more than 20 bps [7]. The data payload value of our proposed scheme is 172.39 bps, which is much higher than the standard value.
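As a check, the reported payload follows directly from the experimental settings of Section 5, namely a 32×32 watermark embedded in a clip whose duration is rounded to 5.94 s:

```latex
P = \frac{B}{T} = \frac{32 \times 32}{5.94\ \text{s}} = \frac{1024}{5.94\ \text{s}} \approx 172.39\ \text{bps}
```

The same figure can be read as one bit per 256-sample frame, since 1024 frames of 256 samples make up the 262,144 samples of each clip.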

5.4. Security Analysis

To enhance the security, the proposed scheme uses chaotic encryption. First, we encrypted the watermark using logistic mapping where a key K is used for both encryption and decryption. Second, there is another parameter β, which is used in the PSHT process. Different values of β show different experimental results. Last, a quantization coefficient x was used for both embedding and blind extraction. Therefore, it is not possible to detect the embedded watermark without these three parameters.

5.5. Computation Time Analysis

The computation time of the proposed method, covering both the embedding and extraction processes, is compared with that of the methods presented in [5,6,8] in Table 7. The embedding time of the proposed method is 2.03 s, which is lower than that of the methods given in [5,8] but slightly higher than that of the method reported in [6] (1.46 s). The extraction time of the proposed method is 0.75 s, which is lower than that of the methods given in [6,8]. From this point of view, it can be concluded that the proposed method has a low computational cost compared with the other methods.

6. Conclusions

In this paper, we proposed a blind symmetric audio watermarking algorithm based on two well-known transformation and decomposition techniques, namely PSHT and HD, which are used in audio watermarking for the first time. The watermark is embedded into the largest value of the 1st column of the Hessenberg matrix of the PSHT coefficients of each frame using a new quantization function. Simulation results demonstrate that the proposed algorithm is highly robust against numerous attacks such as noise addition, noise reduction, echo addition, cropping, re-quantization, MP3 compression, re-sampling, distortion, amplification, delay, invert, and low-pass filtering. In addition, the proposed algorithm is computationally fast and has a high data payload. Moreover, the audio quality tests confirm the high imperceptibility of the watermarked audio signals. Furthermore, comparative analysis substantiates its superiority over other state-of-the-art methods. These results verify the validity of the proposed algorithm for audio copyright protection. In the future, the proposed algorithm will be compared with several recent state-of-the-art methods using the same dataset in terms of imperceptibility, robustness, and computation time.

Author Contributions

All authors contributed equally to the conception of the idea, the design of experiments, the analysis and interpretation of results, and the writing and improvement of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Xiang, Y.; Hua, G.; Yan, B. Digital Audio Watermarking: Fundamentals, Techniques and Challenges; Springer: Singapore, 2017.
  2. Cvejic, N. Digital Audio Watermarking Techniques and Technologies: Applications and Benchmarks; IGI Global: Hershey, PA, USA, 2007.
  3. Pandey, M.K.; Parmar, G.; Gupta, R. Audio watermarking by spreading echo in time domain using pseudo noise gray sequence. In Proceedings of the IEEE International Conference on Industrial Instrumentation and Control (ICIC), Pune, India, 28–30 May 2015; pp. 740–743.
  4. Kaur, A.; Dutta, M.K.; Soni, K.M.; Taneja, N. Localized & self-adaptive audio watermarking algorithm in the wavelet domain. J. Inf. Secur. Appl. 2017, 33, 1–15.
  5. Tsai, S.E.; Yang, S.M. An effective watermarking method based on energy averaging in audio signals. Math. Probl. Eng. 2018, 2018, 6420314.
  6. Pourhashemi, S.M.; Mosleh, M.; Erfani, Y. Audio watermarking based on synergy between Lucas regular sequence and Fast Fourier Transform. Multimed. Tools Appl. 2019, 78, 22883–22908.
  7. Dhar, P.K.; Shimamura, T. Blind audio watermarking in transform domain based on singular value decomposition and exponential-log operations. Radioengineering 2017, 26, 552–561.
  8. Karnjana, J.; Unoki, M.; Aimmanee, P.; Wutiwiwatchai, C. Audio watermarking scheme based on singular spectrum analysis and psychoacoustic model with self-synchronization. J. Electr. Comput. Eng. 2016, 2016, 5067313.
  9. Liu, H.; Liu, X.; Shi, B.; Chen, T.; Wang, J. Multifunctional audio watermarking algorithm based on Chaotic Scrambling. J. Comput. Methods Sci. Eng. 2017, 17, 443–454.
  10. Dhar, P.K.; Shimamura, T. Blind SVD-based audio watermarking using entropy and log-polar transformation. J. Inf. Secur. Appl. 2015, 20, 74–83.
  11. Hwang, M.J.; Lee, J.; Lee, M.; Kang, H.G. SVD-based adaptive QIM watermarking on stereo audio signals. IEEE Trans. Multimed. 2018, 20, 45–54.
  12. Luo, Y.; Peng, D.; Sang, Y.; Xiang, Y. Dual-domain audio watermarking algorithm based on flexible segmentation and adaptive embedding. IEEE Access 2019, 7, 10533–10545.
  13. Hu, H.T.; Lee, T.T. High-performance self-synchronous blind audio watermarking in a unified FFT framework. IEEE Access 2019, 7, 19063–19076.
  14. Choudhary, S.; Nath, K.; Panda, J. Double layered audio zero-watermarking using DWT and DSSS. In Proceedings of the International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 6–8 April 2017; pp. 0419–0423.
  15. Irawati, I.D.; Budiman, G.; Ramdhani, F. QR-based watermarking in audio subband using DCT. In Proceedings of the International Conference on Control, Electronics, Renewable Energy and Communications (ICCEREC), Bandung, Indonesia, 5–7 December 2018; pp. 136–141.
  16. Gupta, A.K.; Agarwal, A.; Singh, A.; Vimal, D.; Kumar, D. Blind audio watermarking using adaptive quantization and Lifting wavelet transform. In Proceedings of the IEEE 5th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 22–23 February 2018; pp. 556–559.
  17. Weina, W. Digital audio blind watermarking algorithm based on audio characteristic and scrambling encryption. In Proceedings of the IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 25–26 March 2017; pp. 1195–1199.
  18. Tang, X.; Ma, Z.; Niu, X.; Yang, Y. Robust audio watermarking algorithm based on empirical mode decomposition. Chin. J. Electron. 2016, 25, 1005–1010.
  19. Safitri, I.; Ginanjar, R.R.; Rizal, A. Adaptive multilevel wavelet BCH code method in the audio watermarking system. In Proceedings of the IEEE International Conference on Control, Electronics, Renewable Energy and Communications (ICCREC), Yogyakarta, Indonesia, 26–28 September 2017; pp. 55–59.
  20. Sulistyawan, V.N.; Budiman, G.; Safitri, I. Histogram-based audio watermarking with synchronization in stationary audio subband. In Proceedings of the IEEE International Conference on Control, Electronics, Renewable Energy and Communications (ICCEREC), Bandung, Indonesia, 5–7 December 2018; pp. 195–201.
  21. Sakai, H.; Iwaki, M. Audio watermarking method based on phase-shifting having robustness against band-pass filtering attacks. In Proceedings of the IEEE 7th Global Conference on Consumer Electronics (GCCE), Nara, Japan, 9–12 October 2018; pp. 343–346.
  22. Agaian, S.; Tourshan, K.; Noonan, J.P. Parametric Slant-Hadamard transforms with applications. IEEE Signal Process. Lett. 2002, 9, 375–377.
  23. Seddik, H.; Sayadi, M.; Fnaiech, F.; Cheriet, M. Image watermarking based on the Hessenberg transform. Int. J. Image Graph. 2009, 9, 411–433.
Figure 1. Watermark embedding process.
Figure 2. Extraction process.
Figure 3. (a) Binary watermark image. (b) Encrypted watermark image.
Figure 4. Extracted watermark against different attacks for pop audio signal: (a) no attack, (b) noise addition, (c) noise reduction, (d) echo addition, (e) cropping, (f) re-quantization, (g) compression (MP3), (h) re-sampling, (i) distortion, (j) amplification, (k) delay, (l) invert, (m) low-pass filter.
Figure 5. Extracted watermark against different attacks for classical audio signal: (a) no attack, (b) noise addition, (c) noise reduction, (d) echo addition, (e) cropping, (f) re-quantization, (g) compression (MP3), (h) re-sampling, (i) distortion, (j) amplification, (k) delay, (l) invert, (m) low-pass filter.
Figure 6. Extracted watermark against different attacks for jazz audio signal: (a) no attack, (b) noise addition, (c) noise reduction, (d) echo addition, (e) cropping, (f) re-quantization, (g) compression (MP3), (h) re-sampling, (i) distortion, (j) amplification, (k) delay, (l) invert, (m) low-pass filter.
Figure 7. Extracted watermark against different attacks for rock audio signal: (a) no attack, (b) noise addition, (c) noise reduction, (d) echo addition, (e) cropping, (f) re-quantization, (g) compression (MP3), (h) re-sampling, (i) distortion, (j) amplification, (k) delay, (l) invert, (m) low-pass filter.
Table 1. Subjective and objective difference grades.
| SDG | ODG | Description | Quality |
| --- | --- | --- | --- |
| 5 | 0 | Imperceptible | Excellent |
| 4 | −1 | Perceptible, but not annoying | Good |
| 3 | −2 | Slightly annoying | Fair |
| 2 | −3 | Annoying | Poor |
| 1 | −4 | Very annoying | Bad |
Table 2. Subjective and objective evaluation for different watermarked sounds.
| Audio Signal | MOS | Correct Detection | SNR | ODG |
| --- | --- | --- | --- | --- |
| Pop | 4.90 | 54% | 43.81 | −0.46 |
| Classical | 5.00 | 48% | 47.75 | −0.35 |
| Jazz | 5.00 | 48% | 47.08 | −0.37 |
| Rock | 4.90 | 54% | 47.60 | −0.38 |
| Average | 4.95 | 51% | 46.56 | −0.39 |
Table 3. A comparative analysis between the proposed and various methods in terms of imperceptibility.
| Reference | Method | SNR | MOS |
| --- | --- | --- | --- |
| [4] | Energy averaging | 41.47 | - |
| [5] | Localized and self-adaptive algorithm | 31.40 | 3.7 |
| [6] | LRS-FFT | 44.81 | - |
| [7] | DCT-SVD-ELO | 33.47 | 4.88 |
| [8] | SSA-PM | 25.61 | - |
| [9] | Multifunctional algorithm | 23.33 | - |
| [10] | DCT-SVD-LPT | 37.20 | 4.85 |
| [11] | SVD-QIM | 19.39 | - |
| [12] | FS-AE | 33.6 | - |
| Proposed | PSHT-HD | 46.56 | 4.95 |
Table 4. NC of extracted watermark for watermarked signal against various attacks.
| Attack | Pop | Classical | Jazz | Rock |
| --- | --- | --- | --- | --- |
| No attack | 1 | 1 | 1 | 1 |
| Noise Addition | 0.9986 | 0.9995 | 0.9911 | 1 |
| Noise Reduction | 1 | 1 | 1 | 1 |
| Echo Addition | 1 | 1 | 1 | 1 |
| Cropping | 0.9978 | 0.9977 | 0.9988 | 0.9982 |
| Re-quantization | 0.9968 | 1 | 0.9992 | 1 |
| Compression (MP3) | 0.9566 | 0.9459 | 0.9619 | 0.9643 |
| Re-sampling | 0.9836 | 1 | 0.9943 | 0.9893 |
| Distortion | 0.9766 | 1 | 0.9895 | 0.9992 |
| Amplification | 0.9944 | 1 | 0.9871 | 1 |
| Delay | 0.9944 | 0.9976 | 0.9895 | 1 |
| Invert | 1 | 1 | 1 | 1 |
| Low-Pass Filtering | 0.9649 | 0.9871 | 0.9822 | 0.9919 |
Table 5. BER (%) of extracted watermark for watermarked signal against various attacks.
| Attack | Pop | Classical | Jazz | Rock |
| --- | --- | --- | --- | --- |
| No attack | 0 | 0 | 0 | 0 |
| Noise Addition | 0.37 | 0.88 | 1.07 | 0 |
| Noise Reduction | 0 | 0 | 0 | 0 |
| Echo Addition | 0 | 0 | 0 | 0 |
| Cropping | 0.24 | 0.026 | 0.14 | 0.20 |
| Re-quantization | 0.39 | 0 | 0.09 | 0 |
| Compression (MP3) | 5.18 | 6.54 | 4.59 | 4.30 |
| Re-sampling | 1.67 | 0 | 0.68 | 0.88 |
| Distortion | 2.83 | 0 | 1.27 | 0.09 |
| Amplification | 0.68 | 0 | 1.56 | 0 |
| Delay | 0.49 | 0.29 | 1.27 | 0 |
| Invert | 0 | 0 | 0 | 0 |
| Low-Pass Filtering | 4.54 | 1.56 | 2.15 | 0.98 |
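
For readers who want to reproduce figures of the kind reported in Tables 4 and 5, the sketch below computes both metrics for a pair of binary watermarks. The formal definitions appear earlier in the paper and are not repeated here; the code assumes the commonly used forms, namely BER as the percentage of mismatched bits and NC as the normalized inner product of the original and extracted watermarks. The function names and the sample bit sequences are hypothetical.

```python
import math


def ber_percent(original, extracted):
    """Bit error rate: percentage of positions where the bits differ."""
    if len(original) != len(extracted):
        raise ValueError("watermarks must have the same length")
    errors = sum(o != e for o, e in zip(original, extracted))
    return 100.0 * errors / len(original)


def normalized_correlation(original, extracted):
    """NC as a normalized inner product (assumed definition)."""
    num = sum(o * e for o, e in zip(original, extracted))
    den = math.sqrt(sum(o * o for o in original)) * math.sqrt(
        sum(e * e for e in extracted)
    )
    return num / den if den else 0.0


# Hypothetical 8-bit watermark with one bit flipped by an attack.
w_original = [1, 0, 1, 1, 0, 1, 0, 0]
w_extracted = [1, 0, 1, 0, 0, 1, 0, 0]
ber = ber_percent(w_original, w_extracted)
nc = normalized_correlation(w_original, w_extracted)
```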
Table 6. General comparison of several recent methods with proposed algorithm in terms of BER (%).
| Reference | Method | Noise Addition | Resampling | Re-Quantization | MP3 Compression |
| --- | --- | --- | --- | --- | --- |
| Proposed | PSHT-HD | 0.58 (20 dB) | 0.81 (22.05 kHz) | 0.12 (8 bits/sample) | 5.15 (128 kbps) |
| [4] | Energy averaging | - | 8.0 (22.05 kHz) | - | 5.0 (128 kbps) |
| [5] | Localized and self-adaptive algorithm | 6.03 (30 dB) | 0 (22.05 kHz) | 0.14 (8 bits/sample) | 6.20 (64 kbps) |
| [6] | LRS-FFT | 5.17 (-) | 6.56 (22.05 kHz) | 4.94 (8 bits/sample) | 6.88 (128 kbps) |
| [7] | DCT-SVD-ELO | 0.91 (-) | 0.88 (22.05 kHz) | 0.23 (8 bits/sample) | 6.13 (32 kbps) |
| [8] | SSA-PM | 2.50 (36 dB) | 6.06 (22.05 kHz) | 8.83 (16 bits/sample) | 9.44 (128 kbps) |
| [9] | Multifunctional algorithm | 4.22 (-) | 0 (22.05 kHz) | - | 7.48 (32 kbps) |
| [10] | DCT-SVD-LPT | 0.83 (-) | 1.56 (22.05 kHz) | 0 (8 bits/sample) | 3.91 (128 kbps) |
| [11] | SVD-QIM | 10.25 (30 dB) | 4.88 (16 kHz) | - | 17.76 (128 kbps) |
| [12] | FS-AE | 7.23 (20 dB) | - | - | 6.04 (48 kbps) |
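
The entries for the proposed method in Table 6 appear to be the per-attack BER values of Table 5 averaged over the four audio signals. This is an inference from the numbers rather than something stated in this section, and the short check below makes it explicit. The dictionary and function names are hypothetical.

```python
# BER (%) per audio signal (pop, classical, jazz, rock), copied from Table 5.
table5_ber = {
    "Noise Addition": [0.37, 0.88, 1.07, 0.0],
    "Re-sampling": [1.67, 0.0, 0.68, 0.88],
    "Re-quantization": [0.39, 0.0, 0.09, 0.0],
    "Compression (MP3)": [5.18, 6.54, 4.59, 4.30],
}

# BER (%) reported for the proposed method in Table 6.
table6_proposed = {
    "Noise Addition": 0.58,
    "Re-sampling": 0.81,
    "Re-quantization": 0.12,
    "Compression (MP3)": 5.15,
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


for attack, values in table5_ber.items():
    print(f"{attack}: mean {mean(values):.2f}%, Table 6 {table6_proposed[attack]:.2f}%")
```

For all four attacks the mean agrees with the Table 6 entry to two decimal places.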
Table 7. Comparison of several recent methods with proposed algorithm in terms of computation time.
| Reference | Method | Embedding Time (s) | Extraction Time (s) |
| --- | --- | --- | --- |
| [5] | Localized and self-adaptive algorithm | 2.77–3.42 | - |
| [6] | LRS-FFT | 1.46 | 0.89 |
| [8] | SSA-PM | 258 | 1200 |
| Proposed | PSHT-HD | 2.03 | 0.75 |

Share and Cite

MDPI and ACS Style

Dhar, P.K.; Chowdhury, A.H.; Koshiba, T. Blind Audio Watermarking Based on Parametric Slant-Hadamard Transform and Hessenberg Decomposition. Symmetry 2020, 12, 333. https://doi.org/10.3390/sym12030333

