DIPS: Data Integrity Protection of Signals

Marco Botta; Davide Cavagnino; Annunziata Marra

doi:10.3390/a19030211

Abstract

The integrity protection of digital signals is an important task in modern applications. We propose DIPS (Data Integrity Protection of Signals), a fragile watermarking algorithm aiming to protect the integrity of sampled signals like images composed of pixels or sampled audio signals that can be divided into block units. The present paper starts with two works that propose fragile watermarking algorithms yielding high-quality watermarked objects, identifies their security vulnerabilities, and finally defines a method that embeds a compressed Message Authentication Code of each block into the LSBs of the block samples. As it modifies 2 bits per block at most, the introduced distortion is extremely low, thus resulting in a very high objective quality ($\mathit{PSNR}$). Experimental results confirming this characteristic are reported on real sampled signals such as speech, images, and ECG signals.

Keywords: data integrity; fragile watermarking; least significant bit; message authentication code; sampled signal

1. Introduction

Digital watermarking is a branch of the data-hiding field that refers to a collection of methods, techniques, and algorithms whose purpose is to embed data, i.e., the watermark, into digital objects to satisfy specific application requirements like copyright protection, origin tracking, integrity protection, and tamper detection [1].

Watermarking takes place in two distinct phases. The embedding phase, performed once, stores the watermark in the cover (i.e., original) object, using a secret key for the security of the whole method. The extraction/verification phase, run every time the application needs to check a desired property of the object such as integrity or ownership, extracts the watermark and checks its characteristics (like its correlation with, or its areas of difference from, the expected watermark).

Watermark embedding obviously requires the alteration of the digital object. The watermark data may be stored directly in the representation domain of the digital object, like pixels or audio samples; in that case, it is said to be stored in the spatial or time domain. Alternatively, the object may be transformed into another domain before embedding, like the Fourier transform domain or the singular value decomposition domain, where the watermark is stored just before inverse transforming the data back to the original domain; in this case, the watermarking algorithm is said to operate in the transformed domain.

The properties of a watermark depend on its field of application. For example, copyright protection requires that the watermark be robust, that is, it must be able to resist intentional attacks aimed at its removal. On the contrary, tamper detection uses fragile watermarks that are (locally) modified at any minimal alteration of the digital object: the portions where the recovered and expected watermarks differ reveal changes (intentional or not) undergone by the digital object. Thus, in the context of fragile watermarking, the embedded fragile watermark is used as a means to verify the integrity of the digital object, identifying (i.e., localizing) altered parts of the object if any tampering took place. The tampering attack may take place during the storage or transmission phases of the watermarked object (see Figure 1).

Figure 1. A high-level schema of the embedding and verification procedures.

Given that the watermark embedding operation alters the cover object producing a watermarked object, watermarking algorithms may be reversible or non-reversible depending on the possibility of recovering the digital object from the watermarked one during the extraction/verification phase. In some application contexts, the cover object is available during the verification step, while in others, the watermarked object is the only data available. In the first case, the algorithm may be referred to as non-blind, whereas in the second case, the algorithm is said to be blind.

Digital steganography is another branch of data hiding, where the objective is not only to embed and hide data in a digital entity but also to conceal this embedding in such a way that it should not be possible to (mathematically) prove that some data has been stored. Steganography has partially different constraints and other fields of application, and the algorithm proposed in this paper does not deal with them. Nonetheless, embedding in the LSBs of signal samples is one of the techniques used in steganography, due to the low amount of noise introduced and the consequent reduction in detectability (see, for example, [2,3]).

Digital watermarking algorithms have been developed for various types of digital object, such as images [4,5], videos [6], audio [7,8], neural networks [9,10], and 3D models [11,12]. In this paper, we build upon the algorithm in [4], later improved in [13], developed for the integrity protection of digital images and provide a solution to a security issue we discovered. Specifically, we demonstrate a possible attack on the method and present a solution that introduces minimal error, as two bits per block at most are changed. Finally, we apply the improved algorithm to the fragile watermarking of audio, images, and ECG signals.

The paper is structured as follows: the next section discusses related work, and the proposed algorithm is presented in Section 3. Section 4 reports the experimental results of applying the method to a set of audio, image, and electrocardiogram files. Finally, Section 5 discusses the relevant aspects of the work, and conclusions are drawn in Section 6. Section 7 summarizes the symbols and variables used in this paper for readability, and Appendix A presents a formal proof that only two bit changes are needed to insert the watermark.

3. The DIPS Algorithm

In order to overcome security issues, DIPS makes use of Message Authentication Codes (MACs). A MAC is a computer security primitive used to verify the integrity and authenticity of digital data. Essentially, a MAC M is a short digest, depending on a secret symmetric (i.e., shared among authorized entities) key K, computed on a bit string representing the digital object O to protect: M is attached to O and can be used by those having access to K to check the integrity of O by computing its MAC $M'$ and checking if $M = M'$. Only the entities who possess the key K can generate the correct MAC and/or verify it. Cryptographic properties guarantee that attackers, not knowing the secret key K, cannot alter O and generate a new valid MAC. Being based on a symmetric secret key, a MAC cannot guarantee the non-repudiation property, which is the ability to prove to a third party the origin of an object from an entity. In that case, digital signatures are needed [19,20]. A MAC can be computed with many techniques, the most widely used being based on a cryptographic hash function, like HMAC [21] and KMAC [22], or on a block cipher, like CMAC [23]. For an introduction and insights on MACs and digital signatures, see [24].

In particular, the basic idea is to store in a block of $2^{n}$ samples a fragile watermark made of a Message Authentication Code (MAC) of the block itself. To minimize the noise, or error, introduced by this embedding, the MAC will be stored into the Least Significant Bits (LSBs) of the $2^{n}$ samples. The embedding function will be the one expressed by Equation (1): given that the MAC will be computed on the most significant part of the samples (i.e., the block samples without the LSBs), the only part that can be modified for watermark embedding is the LSBs. This restricts the possibility of incrementing or decrementing a sample value: the only permissible changes will be to 0 for an LSB valued at 1, which will decrement the modulo sum in Equation (1) by i (for the i-th sample), or to 1 for an LSB valued at 0, which will increment the modulo sum in Equation (1) by i.

DIPS performs the following steps:

1. Choose a secret symmetric key K to be used for the computation of the MAC.
2. For every block, compute the MAC of its $2^{n}$ samples (without the LSBs) using the key K and the block position (in order to avoid copy-and-paste attacks) as inputs. If the MAC used has a length greater than $n + 1$ bits, compress it to obtain $n + 1$ bits: KMAC may produce a MAC of any desired length, whereas HMAC based on SHA-256 requires a reduction from 256 to $n + 1$ bits. This can be done by splitting the 256 bits into blocks of $n + 1$ bits (the last block being $256 \bmod (n + 1)$ bits in length) and XORing them. Let m denote the resulting MAC: being a string of $n + 1$ bits, it can be interpreted as an integer value $0 \leq m < 2^{n + 1}$.
3. Compute the block value v using Equation (1) and evaluate the difference $D = (m - v) \bmod 2^{n + 1}$, then modify the LSBs of some samples to make $v \equiv m \pmod{2^{n + 1}}$, i.e., $D = 0$. To this end, if $LSB(p_{i}) = 0$, then changing it to 1 will increase v by i (modulo $2^{n + 1}$); on the contrary, changing $LSB(p_{i})$ from 1 to 0 will decrease v by i or, equivalently, increase it by $2^{n + 1} - i$ (again modulo $2^{n + 1}$).
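The MAC compression of step 2 can be illustrated with a minimal Python sketch. It assumes HMAC based on SHA-256 and a hypothetical serialization of the block (an 8-byte block position followed by each sample, without its LSB, as a 4-byte signed integer); the function name `compressed_mac` and the folding order (chunks taken from the least significant end of the digest) are illustrative choices, not details taken from the DIPS implementation.

```python
import hashlib
import hmac


def compressed_mac(key: bytes, position: int, samples: list[int], n: int) -> int:
    """Return an (n+1)-bit MAC of a block, computed on the samples without LSBs."""
    # Assumed serialization: 8-byte block position, then each sample with
    # its LSB dropped, as a 4-byte signed big-endian integer.
    message = position.to_bytes(8, "big") + b"".join(
        (s >> 1).to_bytes(4, "big", signed=True) for s in samples
    )
    digest = hmac.new(key, message, hashlib.sha256).digest()

    # XOR-fold the 256-bit digest in chunks of n+1 bits; the final chunk
    # is shorter when n+1 does not divide 256.
    width = n + 1
    mask = (1 << width) - 1
    value = int.from_bytes(digest, "big")
    folded = 0
    while value:
        folded ^= value & mask
        value >>= width
    return folded
```

Since the LSBs are dropped before hashing, flipping any LSB of the block leaves the returned value unchanged, which is what allows the LSBs to carry the watermark.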

The algorithm to find the correct and minimal (in terms of number of LSBs) modification was developed and implemented in MATLAB R2023b (more details on the pseudocode presented in Algorithm 1 may be provided upon request to the corresponding author). The developed procedure has a recursive component and is capable of changing an increasing number of LSBs, from 1 to a pre-defined maximum; we provide a high-level description of this in Algorithm 1. The implemented procedure was used to watermark a large number of blocks (see Section 4), and we empirically verified that the maximum number of changed LSBs is 2. We provide a formal proof of this property in Appendix A.

Algorithm 1 embeds a fragile watermark in a block of $2^{n}$ samples. Firstly, the modulo sum v in Equation (1) and an $(n + 1)$-bit MAC $\mathit{mc}$ of the samples (without the LSBs) are computed. If these two values differ, then some of the LSBs are modified to increase the sum (1) by the amount $\mathit{mc} - v$ so that the MAC is stored; this is performed with a recursive function, as can be seen in the second part of Algorithm 1. Note that the watermark embedding is based on the formula in Equation (1) with a slight variation, as in Equation (3): it is applied to the LSBs only, following this consideration. A flip (i.e., a NOT operation) of an LSB modifies the sum by an amount corresponding to the weight (from 1 to $2^{n}$) assigned to that position, while the sample value itself is not involved in the embedding of the difference between the present sum value and the desired one. Given that the MAC $\mathit{mc}$ depends on the sample values stripped of their LSBs, the embedding function can be applied to the LSBs only (see Equation (3)).

The array $\mathit{change}(1 \dots 2^{n + 1} - 1)$ contains in position i the index of the sample (from 1 to $2^{n}$) whose LSB, if modified, will alter the sum (3) by i. Note that $2^{n + 1} - 1 - 2^{n}$ entries will not refer to any sample; thus, not all possible differences can be compensated by altering only one LSB.

Algorithm 1 Fragile watermarking of a block B of size $2^{n}$ samples. Input: B and its position P, MAC key K, $\mathit{maxAllowedChanges}$. Output: watermarked block B

function $B =$ embedWatermark($B, P, K, \mathit{maxAllowedChanges}$)

    $v \leftarrow$ sum as in Equation (3)
    $\mathit{lsb} \leftarrow$ extract LSBs from $B$
    $B' \leftarrow B$ without LSBs
    $\mathit{mc} \leftarrow \mathit{MAC}(K, P, B')$ compressed to $n + 1$ bits
    if $\mathit{mc} \neq v$ then
        $\mathit{change}(1 \dots 2^{n + 1} - 1) \leftarrow 0$
        for $i \leftarrow 1$ to $2^{n}$ do
            if $\mathit{lsb}_{i} = 0$ then
                $\mathit{change}(i) \leftarrow i$
            else
                $\mathit{change}(2^{n + 1} - i) \leftarrow i$
            end if
        end for
        $\mathit{used}(1 \dots 2^{n + 1} - 1) \leftarrow \mathit{False}$
        for $\mathit{allowedChng} \leftarrow 1$ to $\mathit{maxAllowedChanges}$ do
            $\mathit{LSBlist} \leftarrow$ findChange($(\mathit{mc} - v) \bmod 2^{n + 1}$, $\{\}$, 1, $\mathit{allowedChng}$)
            if NOT(isempty($\mathit{LSBlist}$)) then
                exit from the for $\mathit{allowedChng}$ cycle
            end if
        end for
        if isempty($\mathit{LSBlist}$) then
            error('Unable to embed the watermark')
            exit
        end if
        for each element $r$ in $\mathit{LSBlist}$ do
            flip $\mathit{lsb}_{r}$
        end for
        $B \leftarrow$ add $\mathit{lsb}$ to $B'$
    end if
    return $B$

function $\mathit{outList} =$ findChange($\mathit{diff}, \mathit{list}, \mathit{actualLevel}, \mathit{maxLevel}$)

    if $\mathit{actualLevel} > \mathit{maxLevel}$ then
        return $\mathit{outList} \leftarrow \{\}$
    end if
    if $\mathit{change}(\mathit{diff}) > 0$ AND NOT($\mathit{used}(\mathit{diff})$) then
        return $\mathit{outList} \leftarrow \mathit{list} \cup \{\mathit{change}(\mathit{diff})\}$
    end if
    for $\mathit{ch} \leftarrow 1$ to $2^{n + 1} - 1$ do
        if $\mathit{change}(\mathit{ch}) > 0$ AND NOT($\mathit{used}(\mathit{ch})$) then
            $\mathit{used}(\mathit{ch}) \leftarrow \mathit{True}$
            $\mathit{outList} \leftarrow$ findChange($(\mathit{diff} - \mathit{ch}) \bmod 2^{n + 1}$, $\mathit{list} \cup \{\mathit{change}(\mathit{ch})\}$, $\mathit{actualLevel} + 1$, $\mathit{maxLevel}$)
            if NOT(isempty($\mathit{outList}$)) then
                return $\mathit{outList}$
            end if
            $\mathit{used}(\mathit{ch}) \leftarrow \mathit{False}$
        end if
    end for
    return $\mathit{outList} \leftarrow \{\}$

The recursive function builds a list of LSB modifications (i.e., a list of LSBs to which the NOT operator is applied) to add the required difference $\mathit{diff}$ to the sum in Equation (3). If the modification can be obtained by changing one LSB not already used, it is added to the list of required changes (the complexity of this operation is $O(1)$); otherwise, for all the possible changes $\mathit{ch}$ not already used, the change $\mathit{ch}$ is added to the list of changes, and the function is recursively called to find the necessary changes for the remaining part $\mathit{diff} - \mathit{ch}$. In this case, the complexity becomes $O(N)$, where N is the block length. Thus, if the function is recursively called C times, then the complexity is $O(N^{C - 1})$. If the recursive call returns a non-empty list, then a set of LSBs to obtain the value $\mathit{mc}$ has been found. Otherwise, the change $\mathit{ch}$ is discarded and the next change is tested.

According to the proof in Appendix A, and as confirmed by the set of experiments we performed on a large number of blocks of various sizes (see Section 4), $\mathit{maxLevel}$, the maximum allowed recursion depth, may be set to 2.
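As an illustration of the search performed by Algorithm 1 when the recursion depth is limited to 2, the following Python sketch looks for one LSB flip, and then for a pair of flips, that brings the weighted LSB sum of Equation (3) to the desired MAC value. It is a simplified, non-recursive variant written for this purpose, not the MATLAB implementation; the names `block_value`, `find_flips`, and `embed` are hypothetical, and the MAC value `mc` is assumed to be already computed and compressed to $n + 1$ bits.

```python
def block_value(lsbs, n):
    """Weighted sum of the LSBs (weights 1..2^n) modulo 2^(n+1), as in Equation (3)."""
    return sum(i * b for i, b in enumerate(lsbs, start=1)) % (1 << (n + 1))


def find_flips(lsbs, mc, n):
    """Return at most two 1-based sample indices whose LSB flips make the sum equal mc."""
    modulus = 1 << (n + 1)
    diff = (mc - block_value(lsbs, n)) % modulus
    if diff == 0:
        return []
    # change[delta] = index of the sample whose LSB flip alters the sum by delta.
    change = {}
    for i, b in enumerate(lsbs, start=1):
        change[(i if b == 0 else -i) % modulus] = i
    if diff in change:                      # one flip is enough
        return [change[diff]]
    for delta, i in change.items():         # otherwise try every pair of flips
        j = change.get((diff - delta) % modulus)
        if j is not None and j != i:
            return [i, j]
    raise ValueError("unable to embed the watermark with two flips")


def embed(samples, mc, n):
    """Return a copy of the block whose LSBs encode mc."""
    out = list(samples)
    for i in find_flips([s & 1 for s in samples], mc, n):
        out[i - 1] ^= 1                     # flip the LSB of the i-th sample
    return out
```

The dictionary plays the role of the $\mathit{change}$ array: a single lookup handles the one-flip case, and a linear scan over the possible first flips handles the two-flip case.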

A change to any bit of the block (LSBs included) will be detected as a modification to the watermark computed with Equation (3):

$$v = \left(\sum_{i = 1}^{2^{n}} i \cdot LSB_{i}\right) \bmod 2^{n + 1}. \qquad (3)$$

A consideration is required at this point: as with the original paper [4], this algorithm embeds a watermark of $n + 1$ bits. Therefore, a modification to the block could result in a false negative (i.e., an undetected tampered block) with probability $1 / 2^{n + 1}$.
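The verification side can be sketched in the same spirit: for each block, the value of Equation (3) is recomputed from the LSBs and compared with the MAC recomputed on the remaining bits. The following Python sketch assumes a caller-supplied function `mac(position, upper_bits)` that stands in for the keyed and compressed MAC; this parameter and the name `tampered_blocks` are hypothetical and only serve the illustration.

```python
from typing import Callable, Sequence


def tampered_blocks(
    samples: Sequence[int],
    n: int,
    mac: Callable[[int, list[int]], int],
) -> list[int]:
    """Return the positions of the blocks whose embedded watermark does not match."""
    size = 1 << n
    modulus = 1 << (n + 1)
    bad = []
    for pos in range(len(samples) // size):
        block = samples[pos * size:(pos + 1) * size]
        # Watermark carried by the LSBs (weights 1..2^n), as in Equation (3).
        v = sum(i * (s & 1) for i, s in enumerate(block, start=1)) % modulus
        # MAC recomputed on the block position and on the samples without LSBs.
        expected = mac(pos, [s >> 1 for s in block]) % modulus
        if v != expected:
            bad.append(pos)
    return bad
```

The returned list localizes the tampering at block granularity; trailing samples that do not fill a whole block are ignored in this sketch.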

4. Experimental Results

In this section, we report the test results of DIPS applied to a set of audio files in wave format, PNG color images, and raw ECG signals. Moreover, we provide a performance comparison with a recently published fragile watermarking algorithm for audio signals [8] and up-to-date quality results on images. The measures used in these evaluations and comparisons are the $\mathit{SNR}$ and $\mathit{PSNR}$.

The $\mathit{SNR}$ and $\mathit{PSNR}$ values resulting from the watermarked object are computed with the well-known formulas:

$$\mathit{SNR} = 10 \log_{10} \frac{P_{M}}{\mathit{MSE}} \qquad (4)$$

and

$$\mathit{PSNR} = 10 \log_{10} \frac{\mathit{MAX}^{2}}{\mathit{MSE}}, \qquad (5)$$

where $P_{M}$ is the average power of the signal (i.e., the average of the squared samples), $\mathit{MAX} = 2^{d} - 1$ is the maximum possible excursion of the samples, with d being the bit depth, and $\mathit{MSE}$ is the mean squared error between the original and the watermarked object.

Assuming that a maximum of 2 LSBs are changed in each block (this is based on the proof provided in Appendix A, confirmed by our set of experiments, that any difference between the MAC value and the sum in Equation (3) can be compensated with 0, 1, or 2 LSB flips), the $\mathit{MSE}$ is at most $(1^{2} + 1^{2}) / 2^{n} = 2^{- (n - 1)}$, so it is possible to compute the minimum theoretical $\mathit{PSNR}$ according to the following equation:

$$\mathit{PSNR}(n) = 10 \log_{10} \frac{\mathit{MAX}^{2}}{2 / 2^{n}} = 10 \log_{10} \left(2^{n - 1} \mathit{MAX}^{2}\right). \qquad (6)$$
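Equations (4)-(6) translate directly into code. The following Python sketch is only meant to make the formulas concrete; the function names are hypothetical, the signals are assumed to be plain sequences of integer samples, and at least one sample is assumed to differ (otherwise the $\mathit{MSE}$ is zero and the ratios are undefined).

```python
import math


def mse(original, watermarked):
    """Mean squared error between the original and the watermarked samples."""
    return sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)


def snr(original, watermarked):
    """Equation (4): average signal power over MSE, in dB."""
    power = sum(a * a for a in original) / len(original)
    return 10 * math.log10(power / mse(original, watermarked))


def psnr(original, watermarked, d):
    """Equation (5): squared maximum excursion (2^d - 1) over MSE, in dB."""
    return 10 * math.log10((2 ** d - 1) ** 2 / mse(original, watermarked))


def min_psnr(n, d):
    """Equation (6): PSNR when exactly two LSBs are flipped in every block of 2^n samples."""
    return 10 * math.log10(2 ** (n - 1) * (2 ** d - 1) ** 2)
```

For a single block of $2^{n}$ samples in which exactly two LSBs are flipped, `psnr` coincides with `min_psnr(n, d)`; blocks needing zero or one flip can only raise the value.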

Table 1 reports minimum $\mathit{PSNR}$ values for two different bit depths and block sizes $2^{n}$, with n in the range [6, 16].

Table 1. Minimum PSNR for different block sizes $2^{n}$ and sample bit depths d, computed considering a maximum of 2 LSB flips.

We evaluated the performance of DIPS on a set of 100 audio files available from [25]: the wave file sizes vary between 10 MB and 20 MB and contain recorded mono audio with a duration between 110 s and 220 s, sampled at a frequency of 44,100 Hz with 16 bits per sample. These audio files contain voice recordings of males and females reading a text. The test results are reported in Table 2: for every block size analyzed, the block time length is derived, and the average (with standard deviation) and minimal $\mathit{PSNR}$ and $\mathit{SNR}$ values obtained from the tested files are reported.

Table 2. PSNR and SNR performance comparison for different block sizes of audio files (time duration refers to a sampling frequency of 44,100 Hz; the sample bit depth is 16). The mean ± standard deviation and, in parentheses, the minimum value of the statistics over the 100 files are reported for each block size $2^{n}$.

As observed, the $\mathit{PSNR}$ values are higher than the theoretical minimum (see Equation (6) and Table 1) because not all blocks had a modification of 2 LSBs. In fact, on the tested files, almost all blocks needed LSB changes to store the MAC: approximately half of the blocks had 1 LSB modified and the other half had 2 LSBs flipped. No blocks required more than 2 changes (also see the proof in Appendix A).

For comparison, we present the results of the DIPS algorithm against those of a recently published work for the fragile watermarking of audio files [8]. In that paper, the authors proposed a procedure based on hashes: the audio file is divided into 256 blocks and each one (LSBs excluded) is protected with a hash of S bits in length stored in the LSBs of the block. Then, a hash of 256 bits computed on the whole file (excluding the LSBs) is distributed over the blocks, at one bit per block, and one more hash (LSB seal hash) is used to authenticate the LSBs of the samples in the file. The paper reports results for $S = 8$; thus, 9 watermark bits are stored in each block, with an exception made for the block also storing the seal hash. The paper shows the performance for some audio files of different time durations sampled at 44,100 Hz. From this, we infer the block size, given that [8] splits the file into 256 blocks. Even if a precise comparison is difficult due to the unavailability of the audio files they used and the different constraints on the block sizes in DIPS, we collected in Table 3 the $\mathit{PSNR}$ and $\mathit{SNR}$ values of some files from [8] whose block size is similar to those used by DIPS.

Table 3. Comparison of objective quality performance between DIPS and [8]. The value in parentheses refers to the SNR of a second file having the same time duration.

To analyze the comparison in Table 3, some considerations are in order. First of all, both algorithms change LSBs only. Secondly, for a fixed number of potentially changeable LSBs, the bigger the block, the smaller the mean squared error and thus the larger the $\mathit{PSNR}$. The algorithm in [8] reports results embedding 9 bits per block, thus changing, on average, $4.5$ bits (LSBs) per block. DIPS instead modifies, at most, 2 LSBs to embed 9 bits per block. Consequently, it has higher performance in terms of $\mathit{PSNR}$. On the contrary, the $\mathit{SNR}$ performances are comparable: the DIPS algorithm has better results in some cases and worse in others, but this fact depends upon the average power of the host signal (which is apparently smaller for the audio files we used) and the slightly different block sizes.

As a last consideration, we report the fact that [8] is able to detect a modification to the audio file (even without block localization), reserving one watermark bit per block to store a file hash. This can also be provided by DIPS with a small modification: just define blocks of $2^{n} + 1$ samples, compute the MAC of each block, and store it as previously done; then, after having zeroed the LSBs of the ($2^{n} + 1$)-th sample of each block, compute a MAC of the whole file and store it, one bit per block, in the LSBs of the ($2^{n} + 1$)-th samples. With this modification, the localization capability is the same and the tamper detection is 100%. By using the same kind of idea, one can embed a digital signature covering the whole watermarked file (excluding only the LSBs devoted to carrying the digital signature itself). This would also add a feature of non-repudiation. It is up to the application to choose which method to use depending on its needs. In any case, the objective of our paper is to present a security improvement with respect to the methods in [4,13], which work on the authentication of single blocks of samples.

Moreover, we report in Table 4 a comparison of the quality of the watermarked files produced by DIPS with the other audio watermarking algorithms previously presented [8,16,17,18]. As can be observed, the proposed improvement in security also has the side effect of reducing the noise introduced by the fragile watermarking process with respect to other methods developed for audio files. Nonetheless, we remark that the proposed solution can be applied to other file formats composed of samples and whose application field is resilient to a minimal modification of some samples' Least Significant Bits. In order to show this, we also report further experiments on images and ECG signals.

Table 4. Objective quality comparison of the watermarked files produced by DIPS and by [8,16,17,18].

A second set of experiments was performed on 500 color images of $768 \times 576$ pixels [26]. Table 5 reports the quality of the watermarked images in terms of $\mathit{PSNR}$ and $\mathit{SNR}$. Again, the quality is higher than the minimum $\mathit{PSNR}$ values reported in Table 1. For a comparison on this dataset, see [27].

Table 5. PSNR and SNR performance comparison for different block sizes of color images. The mean ± standard deviation and, in parentheses, the minimum value of the statistics over the 500 files are reported for each block size $2^{n}$.

Finally, we tested DIPS on a set of ECG signals taken from [28]. This data is related to a project for the implementation of technologies for 5P-Medicine related to cardiovascular diseases. It includes ECG data acquired from 219 individuals seated for 30 s and standing for 30 s. We only used the data related to the ECG sensors in millivolts (mV). As shown in Table 6, the quality results are better than the minimum $\mathit{PSNR}$ reported in Table 1 as well.

Table 6. PSNR and SNR performance comparison for different block sizes of electrocardiograms. The mean ± standard deviation and, in parentheses, the minimum value of the statistics over the 219 files are reported for each block size $2^{n}$.

5. Discussion

In this section, we discuss some relevant issues that DIPS faces and solves.

Quality. First of all, to the best of our knowledge, DIPS obtains the best quality in terms of $\mathit{PSNR}$ and $\mathit{SNR}$ with respect to any published method when applied to audio signals and images (concerning ECG signals, we did not find any paper that presents watermarking of such signals). Note that any other objective quality measure would result in a value that is definitely better than most other methods, as DIPS only modifies at most 2 LSBs per block. This advantage is even more evident for high bit-depth signals. For instance, the Structural Similarity Index ($\mathit{SSIM}$) for the color images in our experiments is always >0.998, with 1 being the best value.
Attacks. It should be pointed out that we presented a simplified version of DIPS in order to focus the attention on the method itself. This simplified version is prone to a very simple attack: as the value v in Equation (1) can be obtained in several different ways, an attacker might change a proper set of LSBs and obtain the same value v, and such tampering would go undetected. To solve this issue, DIPS actually scrambles the positions of the LSBs according to the secret key when computing the sum. Another solution would be to add a digital signature computed (with a private key) on the watermarked object and insert it into a set of LSBs reserved for it. In order to prevent cut-and-paste attacks, the MAC of a block also contains the position of the block in the watermarked signal.
Capacity and localization. There is a strong correlation between payload capacity and localization ability of DIPS. The larger the block size, the coarser the localization and the smaller the total payload capacity. However, there is also a limit on the minimum size of a block that depends on how much one wants to compress the MAC: the smaller the MAC, the higher the probability of not detecting a tampered block. In particular, in order to insert a watermark of $n + 1$ bits, the block must consist of at least $2^{n}$ samples. As in the original algorithm [4], with a block size of $2^{n}$ samples, the probability of missing the detection of a single tampered block is $1 / 2^{n + 1}$. This can be acceptable in many application contexts because the probability that a tampering involving k blocks goes completely undetected (i.e., that all k tampered blocks are missed) is $1 / 2^{k (n + 1)}$, which rapidly decreases for increasing k and n: for example, with a block size of $2^{15}$ samples ($n = 15$), the probability of missing all of $k = 3$ tampered blocks is $1 / 2^{48}$, i.e., less than $4 \cdot 10^{- 15}$.
Security. The security of the method lies in the secrecy of the symmetric key K used to compute the MACs of the various blocks and, obviously, on the mathematical security of the used MAC algorithms (which is guaranteed for the MAC methods previously cited): anyone who knows the key K can verify the integrity of a file as well as generate the authentication information for that file.
Data hiding. DIPS can also be used as a data-hiding method to securely transfer secret messages between parties: as the payload capacity can be varied according to the block size, DIPS can carry $(n + 1) \times N_{B}$ bits, where $N_{B}$ is the total number of blocks.
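The figures quoted in the capacity and data-hiding items can be reproduced with a few lines of Python. This is a small illustrative sketch: the function names are hypothetical, and the tampered blocks are assumed to be missed independently, each with probability $1 / 2^{n + 1}$.

```python
from fractions import Fraction


def miss_probability(n: int, k: int = 1) -> Fraction:
    """Probability that all k tampered blocks of 2^n samples go undetected."""
    return Fraction(1, 2 ** (k * (n + 1)))


def payload_bits(num_samples: int, n: int) -> int:
    """Total payload (n + 1) * N_B, with N_B the number of whole blocks of 2^n samples."""
    return (n + 1) * (num_samples // 2 ** n)


# Example from the text: n = 15 and k = 3 tampered blocks.
p = miss_probability(15, 3)
print(p == Fraction(1, 2 ** 48), float(p) < 4e-15)
```

Exact fractions are used so that very small probabilities are not affected by floating-point rounding until they are explicitly converted.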

Finally, we remark on the security improvement over the algorithms proposed in [4,13] and the applicability of DIPS to any object resilient to minimal LSB modifications and whose structure can be interpreted as a sequence of samples.

6. Conclusions

This paper presented DIPS (Data Integrity Protection of Signals), a security improvement to an integrity protection watermarking algorithm developed for blocks of samples [4,13]. The paper builds upon two previously published works, identifies a security vulnerability present in them, and finally proposes an embedding method that stores a Message Authentication Code of each block into the LSBs of the block itself by modifying at most two LSBs per block. The watermarked objects show a very high quality, improving the state-of-the-art quality as shown for real-world signals. Finally, it is worth mentioning that the proposed solution has a wide range of applications, namely to all files composed of samples, like audio, images, or ECG signals, and whose applications tolerate very small changes to some sample LSBs.

7. Notation and Nomenclature

This section is aimed at summarizing the notation of the variables used throughout the paper and the nomenclature of functions and data structures used in the algorithms.

7.1. Notation

In this paper, we will write the following:

$N^{+}$ for the set of positive integers;
Scalar values with italic uppercase and lowercase letters from the Latin alphabet, possibly with a subscript, e.g., $n, D, p_{i}, m, \mathit{mc}$.

7.2. Nomenclature

Table 7 lists the meaning of some terms, variables and functions used throughout the paper.

Table 7. Meaning of some terms used in the paper.

Author Contributions

All authors contributed equally to all aspects of this paper. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the Italian Ministero dell’Università e della Ricerca.

Data Availability Statement

The presented algorithm can be used with any uncompressed file composed of samples, and the results obtained in Section 4 apply to any audio file. In Section 4, we report one reference on the dataset used to compute some results.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CMAC	Cipher-based Message Authentication Code
DIPS	Data Integrity Protection of Signals
DWT	Discrete Wavelet Transform
ECG	Electrocardiogram
HMAC	Hash-based Message Authentication Code
KMAC	KECCAK Message Authentication Code
LSB	Least Significant Bit
MAC	Message Authentication Code
MSE	Mean Squared Error
PSNR	Peak Signal-to-Noise Ratio
SNR	Signal-to-Noise Ratio
SSIM	Structural Similarity Index
SVD	Singular Value Decomposition

Appendix A

In this section, we will prove the following theorem:

Theorem A1.

Let $I = \{x_{i} \mid x_{i} = i, 0 \leq i \leq 2^{n}\}$. Given the sum

$$s = \sum_{i = 0}^{2^{n}} w_{i} x_{i} \qquad (A1)$$

where $w_{i} \in \{0, 1\}$ are binary weights, and its corresponding modulo value $v = s \bmod 2^{n + 1}$, then any desired value $m \in \{0, 1, \dots, 2^{n + 1} - 1\}$ can be obtained from v, by means of a suitable displacement value d, by changing (i.e., flipping, or negating) at most two weights $w_{a}, w_{b}$, with $0 \leq a, b \leq 2^{n}$.
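For small values of n, the statement of Theorem A1 can also be checked exhaustively. The following Python sketch enumerates every assignment of the binary weights and verifies that all residues modulo $2^{n + 1}$ are reachable with at most two flips; the function names are hypothetical, and the check is only an illustration for small n, not a substitute for the proof.

```python
from itertools import product


def reachable_with_two_flips(weights, modulus):
    """Displacements (mod modulus) obtainable by flipping at most two weights."""
    # Flipping w_i = 0 adds i to the sum; flipping w_i = 1 subtracts i.
    deltas = [(i if w == 0 else -i) % modulus for i, w in enumerate(weights)]
    reach = {0}                                   # no flip at all
    reach.update(deltas)                          # exactly one flip
    for a in range(len(deltas)):
        for b in range(a + 1, len(deltas)):       # two distinct flips
            reach.add((deltas[a] + deltas[b]) % modulus)
    return reach


def theorem_holds(n):
    """Check Theorem A1 for every weight assignment w_0 .. w_{2^n}."""
    modulus = 2 ** (n + 1)
    return all(
        len(reachable_with_two_flips(weights, modulus)) == modulus
        for weights in product((0, 1), repeat=2 ** n + 1)
    )
```

The number of weight assignments grows as $2^{2^{n} + 1}$, so this brute-force check is practical only for very small n (e.g., `theorem_holds(3)` examines 512 assignments).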

Proof.

To change v into $m = (v + d) \bmod 2^{n + 1}$, it is necessary to identify a set of indexes for which negating the corresponding binary weights $w_{i}$ results in a modification of d, with $- 2^{n + 1} < d < 2^{n + 1}$.

The flipping, or negation, of a binary weight $w_{a}$ can be obtained with $w_{a}^{'} = 1 - w_{a}$ or, equivalently, $w_{a}^{'} = NOT(w_{a})$. Flipping a weight $w_{a} = 0$ adds a to s; conversely, negating a weight $w_{a} = 1$ subtracts a from s.

To prove the theorem, it must be shown that, for any value d, there exists at least one index a or pair of indices $a, b$ such that, by flipping the corresponding weight $w_{a}$ or weights $w_{a}, w_{b}$, the required displacement d is obtained through one of the following operations: $\pm x_{a}$ or $\pm x_{a} \pm x_{b}$.

The proof is given in two parts, A and B, depending on the sign of d.

Case A: d > 0

In this case, the condition $m = v + d$ can be achieved with one of the following subcases:

Addition: If there exist $a, b$ such that $x_{a} + x_{b} = d$ and $w_{a} = w_{b} = 0$ , set $w_{a}^{'} = 1$ and $w_{b}^{'} = 1$ .
- Example ( $n = 3, m = 15, v = 10, d = 5$ ): if $s = x_{2} + x_{3} + x_{5}$ , choosing $x_{1}, x_{4}$ results in the new sum $s^{'} = x_{1} + x_{2} + x_{3} + x_{4} + x_{5} = 15$ .
Substitution: If there exist $a, b$ such that $x_{a} - x_{b} = d$ with $w_{a} = 0$ and $w_{b} = 1$ , set $w_{a}^{'} = 1$ and $w_{b}^{'} = 0$ .
- Example ( $n = 3, m = 15, v = 9, d = 6$ ): if $s = x_{1} + x_{2} + x_{6}$ , and $d = x_{8} - x_{2}$ the new sum is $s^{'} = x_{1} + x_{6} + x_{8} = 15$ .
Subtraction: If there exist $a, b$ such that $x_{a} + x_{b} = 2^{n + 1} - d$ (that is, $- (x_{a} + x_{b}) \equiv d (\mod 2^{n + 1})$ ) and $w_{a} = w_{b} = 1$ , set $w_{a}^{'} = 0$ and $w_{b}^{'} = 0$ .
- Example ( $n = 3, m = 15, v = 4, d = 11$ ): if $s = x_{0} + x_{2} + x_{5} + x_{6} + x_{7}$ and $x_{0} + x_{5} = 16 - d$ removing $x_{0}$ and $x_{5}$ gives the new sum $s^{'} = x_{2} + x_{6} + x_{7} = 15$ .
Complementary substitution: If there exist $a, b$ such that $x_{a} - x_{b} = 2^{n + 1} - d$ with $w_{a} = 1$ and $w_{b} = 0$ , set $w_{a}^{'} = 0$ and $w_{b}^{'} = 1$ .
- Example ( $n = 3, m = 15, v = 3, d = 12$ ): if $s = x_{1} + x_{5} + x_{6} + x_{7}$ and $x_{6} - x_{2} = 16 - d$ the new sum is $s^{'} = x_{1} + x_{2} + x_{5} + x_{7} = 15$ .
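The four Case A examples can be checked numerically. The following is a minimal sketch, assuming (as the examples suggest) that $x_{i} = i$ and that the embedded value is the sum of the selected $x_{i}$ modulo $2^{n + 1}$; the helper names `value` and `flip` are hypothetical and are not taken from the paper.

```python
def value(selected, n):
    """Embedded value: sum of the selected x_i (assumed x_i = i) modulo 2^(n+1)."""
    return sum(selected) % (1 << (n + 1))


def flip(selected, *indices):
    """Flip the weight of each index: select it if unselected, deselect it otherwise."""
    return set(selected) ^ set(indices)


n = 3
# subcase name -> (indices i with w_i = 1, pair of indices whose weights are flipped)
cases = {
    "addition": ({2, 3, 5}, (1, 4)),
    "substitution": ({1, 2, 6}, (8, 2)),
    "subtraction": ({0, 2, 5, 6, 7}, (0, 5)),
    "complementary substitution": ({1, 5, 6, 7}, (6, 2)),
}

results = {}
for name, (selected, pair) in cases.items():
    v = value(selected, n)
    m = value(flip(selected, *pair), n)
    results[name] = (v, m)
    print(f"{name}: v = {v}, after flipping {pair}: m = {m}")
```

In every subcase the value after the two flips is 15, the target $m$ used in the examples.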

Case B: $d < 0$

In this case, the value $m \equiv (v - | d |) \pmod{2^{n + 1}}$ is obtained through an analogous logic:

5. Subtraction: If there exist $a, b$ such that $x_{a} + x_{b} = | d |$ and $w_{a} = w_{b} = 1$, set $w_{a}^{'} = 0$ and $w_{b}^{'} = 0$.
   - Example ($n = 3, m = 1, v = 8, d = - 7$): if $s = x_{0} + x_{2} + x_{4} + x_{5} + x_{6} + x_{7}$, choosing $x_{0}, x_{7}$ results in the new sum $s^{'} = x_{2} + x_{4} + x_{5} + x_{6}$, or choosing $x_{2}, x_{5}$ gives the sum $s^{'} = x_{0} + x_{4} + x_{6} + x_{7}$; in both cases $s^{'} = 17 \equiv 1 \pmod{16}$.
6. Substitution: If there exist $a, b$ such that $x_{a} - x_{b} = | d |$ with $w_{a} = 1$ and $w_{b} = 0$, set $w_{a}^{'} = 0$ and $w_{b}^{'} = 1$.
   - Example ($n = 3, m = 1, v = 4, d = - 3$): if $s = x_{2} + x_{3} + x_{7} + x_{8}$ and $| d | = x_{8} - x_{5}$, the new sum is $s^{'} = x_{2} + x_{3} + x_{5} + x_{7} = 17 \equiv 1 \pmod{16}$.
7. Addition: If there exist $a, b$ such that $x_{a} + x_{b} = 2^{n + 1} - | d |$ and $w_{a} = w_{b} = 0$, set $w_{a}^{'} = 1$ and $w_{b}^{'} = 1$.
   - Example ($n = 3, m = 1, v = 11, d = - 10$): if $s = x_{4} + x_{7}$ and $16 - | d | = x_{0} + x_{6}$, the new sum is $s^{'} = x_{0} + x_{4} + x_{6} + x_{7} = 17 \equiv 1 \pmod{16}$.

8. Complementary substitution: If there exist $a, b$ such that $x_{a} - x_{b} = 2^{n + 1} - | d |$ with $w_{a} = 0$ and $w_{b} = 1$, set $w_{a}^{'} = 1$ and $w_{b}^{'} = 0$.
   - Example ($n = 3, m = 1, v = 10, d = - 9$): if $s = x_{1} + x_{4} + x_{5}$ and $x_{8} - x_{1} = 16 - | d |$, the new sum is $s^{'} = x_{4} + x_{5} + x_{8} = 17 \equiv 1 \pmod{16}$.

Given that the set $I$ contains all integers from 0 to $2^{n}$, the interval $[(- 2^{n + 1} + 1) \dots (2^{n + 1} - 1)]$ is covered by all possible sums and differences of two distinct elements of $I$. Thus, for any displacement $d$, there must exist at least one pair $(a, b)$ satisfying one of the above subcases, showing that any value $m$ can be obtained by flipping at most two weights. □
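The statement can also be checked by brute force for small $n$. The sketch below is not the embedding procedure of the paper: it assumes $x_{i} = i$ for $i = 0, \dots, 2^{n}$ and $v = \sum_{i} x_{i} w_{i} \bmod 2^{n + 1}$, and the function names `embedded_value`, `find_flips`, and `holds_for` are hypothetical. It simply searches all single flips and all pairs of flips for one that produces the required displacement.

```python
from itertools import combinations, product


def embedded_value(w, n):
    """v = (sum of i * w[i]) mod 2^(n+1), under the assumption x_i = i."""
    return sum(i * bit for i, bit in enumerate(w)) % (1 << (n + 1))


def find_flips(w, m, n):
    """Return a tuple of at most two indices whose flipping turns v into m, or None."""
    mod = 1 << (n + 1)
    d = (m - embedded_value(w, n)) % mod
    if d == 0:
        return ()  # already the target value, nothing to flip
    # Signed effect of flipping w[i]: +i if the weight is 0, -i if it is 1.
    delta = [i if bit == 0 else -i for i, bit in enumerate(w)]
    for i in range(len(w)):
        if delta[i] % mod == d:
            return (i,)
    for i, j in combinations(range(len(w)), 2):
        if (delta[i] + delta[j]) % mod == d:
            return (i, j)
    return None


def holds_for(n):
    """Exhaustively test every weight vector and every target value for this n."""
    for w in product((0, 1), repeat=(1 << n) + 1):
        for m in range(1 << (n + 1)):
            if find_flips(w, m, n) is None:
                return False
    return True


for n in (1, 2, 3):
    print(n, holds_for(n))
```

For $n = 1, 2, 3$ the search finds a solution for every weight vector and every target. The exhaustive check grows as $2^{2^{n} + 1}$ and is only practical for small $n$; it illustrates the claim rather than replacing the proof.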

References

1. Cox, I.; Miller, M.; Bloom, J.; Fridrich, J.; Kalker, T. Digital Watermarking and Steganography, 2nd ed.; Morgan Kaufmann: San Francisco, CA, USA, 2007.
2. Mielikainen, J. LSB Matching Revisited. IEEE Signal Process. Lett. 2006, 13, 285–287.
3. Li, X.; Yang, B.; Cheng, D.; Zeng, T. A Generalization of LSB Matching. IEEE Signal Process. Lett. 2009, 16, 69–72.
4. Lin, P.Y.; Lee, J.S.; Chang, C.C. Protecting the Content Integrity of Digital Imagery with Fidelity Preservation. ACM Trans. Multimed. Comput. Commun. Appl. 2011, 7, 1–20.
5. Wan, W.; Wang, J.; Zhang, Y.; Li, J.; Yu, H.; Sun, J. A comprehensive survey on robust image watermarking. Neurocomputing 2022, 488, 226–247.
6. Asikuzzaman, M.; Pickering, M.R. An Overview of Digital Video Watermarking. IEEE Trans. Circuits Syst. Video Technol. 2017, 28, 2131–2153.
7. Arnold, M. Audio watermarking: Features, applications and algorithms. In Proceedings of the 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532), New York, NY, USA, 30 July–2 August 2000; Volume 2, pp. 1013–1016.
8. AlSabhany, A.A.; Ali, A.H.; Alsaadi, M. A lightweight fragile audio watermarking method using nested hashes for self-authentication and tamper-proof. Multimed. Tools Appl. 2024, 83, 89135–89149.
9. Botta, M.; Cavagnino, D.; Esposito, R. NeuNAC: A novel fragile watermarking algorithm for integrity protection of neural networks. Inf. Sci. 2021, 576, 228–241.
10. Trias, C.D.S.; Mitrea, M.; Tartaglione, E.; Fiandrotti, A.; Cagnazzo, M.; Chaudhuri, S. A hitchhiker’s guide to white-box neural network watermarking robustness. In Proceedings of the 2023 11th European Workshop on Visual Information Processing (EUVIP), Gjovik, Norway, 11–14 September 2023; pp. 1–6.
11. Vasic, B.; Raveendran, N.; Vasic, B. Neuro-OSVETA: A Robust Watermarking of 3D Meshes. arXiv 2023, arXiv:2304.10348.
12. Wang, Y.; Liu, J.; Yang, Y.; Ma, D.; Liu, R. 3D model watermarking algorithm robust to geometric attacks. IET Image Process. 2017, 11, 822–832.
13. Botta, M.; Cavagnino, D.; Pomponiu, V. Protecting the Content Integrity of Digital Imagery with Fidelity Preservation: An Improved Version. ACM Trans. Multimed. Comput. Commun. Appl. 2014, 10, 1–5.
14. Kim, C.; Yang, C.N. Self-Embedding Fragile Watermarking Scheme to Detect Image Tampering Using AMBTC and OPAP Approaches. Appl. Sci. 2021, 11, 1146.
15. Li, P.; Cao, W.; Yu, T. Improved LSB substitution based semi-blind fragile watermarking for high-accuracy tamper localization. J. Inf. Secur. Appl. 2026, 97, 104330.
16. Renza, D.; Ballesteros L., D.M.; Lemus, C. Authenticity verification of audio signals based on fragile watermarking for audio forensics. Expert Syst. Appl. 2018, 91, 211–222.
17. Hu, H.T.; Lee, T.T. Hybrid Blind Audio Watermarking for Proprietary Protection, Tamper Proofing, and Self-Recovery. IEEE Access 2019, 7, 180395–180408.
18. Zhou, S.; Song, M.; Qian, Q.; Liao, W.; Gong, X. GRACED: A Novel Fragile Watermarking for Speech Based on Endpoint Detection. Secur. Commun. Networks 2022, 2022, 9496748.
19. Yang, X.; Wang, W.; Tian, T.; Wang, C. Cryptanalysis and Improvement of a Blockchain-Based Certificateless Signature for IIoT Devices. IEEE Trans. Ind. Inform. 2024, 20, 1884–1894.
20. Hauer, E.; Dittmann, J.; Steinebach, M. Digital signatures based on invertible watermarks for video authentication. In Proceedings of the IFIP International Conference on Communications and Multimedia Security; Springer: Berlin/Heidelberg, Germany, 2005; pp. 277–279.
21. Krawczyk, H.; Bellare, M.; Canetti, R. HMAC: Keyed-Hashing for Message Authentication. In RFC 2104; RFC Editor: Marina del Rey, CA, USA, 1997.
22. Kelsey, J.; Chang, S.J.; Perlner, R. SHA-3 Derived Functions: cSHAKE, KMAC, TupleHash, and ParallelHash; Technical Report SP 800-185; National Institute of Standards and Technology (NIST): Gaithersburg, MD, USA, 2016.
23. Dworkin, M. Recommendation for Block Cipher Modes of Operation: The CMAC Mode for Authentication; Technical Report SP 800-38B; National Institute of Standards and Technology (NIST): Gaithersburg, MD, USA, 2005.
24. Schneier, B. Applied Cryptography, 2nd ed.; Wiley: London, UK, 1996.
25. Mysore, G.J. Can we Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech?—A Dataset, Insights, and Challenges. IEEE Signal Process. Lett. 2015, 22, 1006–1010.
26. Olmos, A.; Kingdom, F. Fred Kingdom’s Laboratory, McGill Vision Research. Available online: http://tabby.vision.mcgill.ca/ (accessed on 3 March 2026).
27. Botta, M.; Cavagnino, D.; Pomponiu, V. A modular framework for color image watermarking. Signal Process. 2016, 119, 102–114.
28. Pires, I.; Pinto, R.; Silva, P.; Garcia, N.M. ECG data related to 30-s seated and 30-s standing for 5P-medicine project. In Mendeley Data V2; Elsevier: Amsterdam, The Netherlands, 2022.

Figure 1. A high-level schema of the embedding and verification procedures.

Table 1. Minimum $PSNR$ for different block sizes $2^{n}$ and sample bit depth $d$, computed considering a maximum of 2 LSB flips.

n	Block Size [Samples]	PSNR for d = 8 [dB]	PSNR for d = 16 [dB]
6	64	63.18	111.38
7	128	66.19	114.39
8	256	69.20	117.40
9	512	72.21	120.41
10	1024	75.22	123.42
11	2048	78.23	126.43
12	4096	81.24	129.44
13	8192	84.25	132.45
14	16,384	87.26	135.46
15	32,768	90.27	138.47
16	65,536	93.28	141.48
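The values in Table 1 are consistent with a simple worst-case bound. The sketch below assumes the usual definition $PSNR = 10 \log_{10}(peak^{2} / MSE)$ with $peak = 2^{d} - 1$, and a worst case in which exactly two samples of a block of $2^{n}$ samples change by $\pm 1$, so that $MSE = 2 / 2^{n}$. Both the formula and the function name `min_psnr` are assumptions made for illustration, since the caption does not spell out the computation.

```python
import math


def min_psnr(n, bit_depth, max_flips=2):
    """Worst-case PSNR in dB when at most `max_flips` LSBs change in a block of 2^n samples."""
    block_size = 1 << n
    peak = (1 << bit_depth) - 1
    # Each flipped LSB changes one sample by 1, contributing a squared error of 1.
    mse = max_flips / block_size
    return 10 * math.log10(peak ** 2 / mse)


for n in (6, 10, 16):
    print(n, round(min_psnr(n, 8), 3), round(min_psnr(n, 16), 3))
```

Under these assumptions the computed values agree with the tabulated ones to within 0.01 dB; each doubling of the block size adds about 3.01 dB.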

Table 2. $PSNR$ and $SNR$ performance comparison for different block sizes of audio files (time durations refer to a sampling frequency of 44,100 Hz; the sample bit depth is 16). The mean ± standard deviation and, in parentheses, the minimum value of the statistics over the 100 files are reported for each block size $2^{n}$.

n	Block Size [Samples]	Block Size [msec]	PSNR [dB] Avg ± Std (>min)	SNR [dB] Avg ± Std (>min)
8	256	5.8	118.66 ± 0.01 (>118.64)	90.47 ± 1.61 (>86.68)
9	512	11.6	121.67 ± 0.01 (>121.64)	93.48 ± 1.61 (>89.68)
10	1024	23.2	124.68 ± 0.02 (>124.64)	96.49 ± 1.61 (>92.66)
11	2048	46.4	127.69 ± 0.02 (>127.62)	99.49 ± 1.62 (>95.70)
12	4096	92.9	130.70 ± 0.04 (>130.63)	102.51 ± 1.61 (>98.76)
13	8192	185.7	133.72 ± 0.05 (>133.58)	105.53 ± 1.61 (>101.84)
14	16,384	371.5	136.73 ± 0.07 (>136.50)	108.53 ± 1.62 (>104.67)
15	32,768	743.0	139.72 ± 0.09 (>139.49)	111.52 ± 1.64 (>107.59)
16	65,536	1486.1	142.74 ± 0.16 (>142.44)	114.55 ± 1.59 (>110.80)
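For reference, the quantities reported in Table 2 could be computed along the following lines. This is a sketch under stated assumptions: it uses the common definitions $PSNR = 10 \log_{10}(peak^{2} / MSE)$ and $SNR = 10 \log_{10}(\sum x^{2} / \sum (x - y)^{2})$, which may differ in detail from those used by the authors, and the function names are hypothetical.

```python
import math


def block_duration_ms(n, sampling_rate=44100):
    """Duration in milliseconds of a block of 2^n samples."""
    return 1000 * (1 << n) / sampling_rate


def psnr_db(original, marked, bit_depth):
    """PSNR in dB between two equally long sample sequences (peak = 2^bit_depth - 1)."""
    peak = (1 << bit_depth) - 1
    mse = sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def snr_db(original, marked):
    """SNR in dB: signal energy over error energy (assumes a non-silent signal)."""
    signal = sum(a * a for a in original)
    noise = sum((a - b) ** 2 for a, b in zip(original, marked))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)


original = [1000, -2000, 3000, 4000]
marked = [1001, -2000, 3000, 4000]  # one LSB flipped in the first sample
print(block_duration_ms(8), psnr_db(original, marked, 16), snr_db(original, marked))
```

The block durations obtained this way match the millisecond column of Table 2 after rounding to one decimal place. Unlike the $PSNR$, the $SNR$ depends on the signal energy, which is why its spread across files is larger.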

Table 3. Comparison of objective quality performance between DIPS and [8]. The value in parentheses refers to the $SNR$ of a second file having the same time duration.

Block Size [Samples] DIPS	PSNR [dB] DIPS	SNR [dB] DIPS	Block Size [Samples] [8]	PSNR [dB] [8]	SNR [dB] [8]
256	118.66	90.47	344	110	89
1024	124.68	96.49	1378	115	99
2048	127.69	99.49	2067	117	103 (97)

Table 4. Objective quality comparison of the watermarked files produced by DIPS and by [8,16,17,18].

Method	DIPS	[8]	[16]	[18]	[17]
PSNR [dB]	118.66–142.74	117	69.7	n/a	n/a
SNR [dB]	90.47–114.55	97–103	50.86	61.15	35.75

Table 5. $PSNR$ and $SNR$ performance comparison for different block sizes of color images. The mean ± standard deviation and, in parentheses, the minimum value of the statistics over the 500 files are reported for each block size $2^{n}$.

n	Block Size [Pixels]	PSNR [dB] Avg ± Std (>min)	SNR [dB] Avg ± Std (>min)
7	128	67.46 ± 0.03 (>67.04)	55.72 ± 1.61 (>48.64)
8	256	70.47 ± 0.03 (>70.42)	58.72 ± 1.61 (>51.65)
9	512	73.47 ± 0.03 (>73.09)	61.72 ± 1.61 (>54.65)
10	1024	76.48 ± 0.04 (>76.19)	64.73 ± 1.61 (>57.65)
11	2048	79.48 ± 0.05 (>79.23)	67.74 ± 1.61 (>60.69)
12	4096	82.49 ± 0.07 (>82.08)	70.75 ± 1.62 (>63.61)
13	8192	85.51 ± 0.10 (>85.27)	73.76 ± 1.62 (>66.61)
14	16,384	88.51 ± 0.14 (>88.00)	76.77 ± 1.62 (>69.67)
15	32,768	91.52 ± 0.19 (>90.79)	79.78 ± 1.63 (>72.54)
16	65,536	94.56 ± 0.28 (>93.98)	82.82 ± 1.64 (>75.58)

Table 6. $PSNR$ and $SNR$ performance comparison for different block sizes of electrocardiograms. The mean ± standard deviation and, in parentheses, the minimum value of the statistics over the 219 files are reported for each block size $2^{n}$.

n	Block Size [Samples]	PSNR [dB] Avg ± Std (>min)	SNR [dB] Avg ± Std (>min)
8	256	118.67 ± 0.12 (>118.28)	76.54 ± 0.17 (>76.11)
9	512	121.68 ± 0.16 (>121.18)	79.54 ± 0.20 (>78.99)
10	1024	124.69 ± 0.25 (>124.09)	82.56 ± 0.28 (>81.92)
11	2048	127.79 ± 0.33 (>127.04)	85.66 ± 0.35 (>84.89)
12	4096	131.05 ± 0.46 (>130.05)	88.92 ± 0.47 (>87.83)
13	8192	134.59 ± 0.71 (>133.32)	92.46 ± 0.73 (>91.07)
14	16,384	137.75 ± 1.05 (>136.33)	95.62 ± 1.04 (>94.05)
15	32,768	140.81 ± 1.50 (>139.34)	98.68 ± 1.52 (>97.06)

Table 7. Meaning of some terms used in the paper.

Term	Meaning
B	block of samples
isempty	function that returns $True$ if the argument does not contain any element, and $False$ otherwise
$LSBlist$	list containing the LSBs whose flipping embeds the required watermark
$maxAllowedChanges$	maximum allowed number of LSB modifications
$change$	array of values containing in position i the index of the sample (from 1 to $2^{n}$) whose LSB, if modified, will alter the embedded value by i
K	algorithm secret key
$lsb$	values of the Least Significant Bits of a block B; $lsb_{i}$ is the LSB value of the i-th sample
$MAC(K, P, B^{'})$	function that computes the MAC of $P, B^{'}$ using the key K
$mc$	value of the compressed MAC
$n + 1$	number of watermark bits inserted in a block
$N = 2^{n}$	number of samples in a block
$used$	array of values, each one stating whether an LSB contributes to modifying the embedded watermark value
v	value embedded in a block

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
