1. Introduction
The single-pixel camera (SPC) concept is an application of the Compressive Sensing (CS) paradigm in which signal compression is carried out at acquisition time by exposing a 2D scene to a sequence of masks and recording the light reflected by each of them using a single sensor. The captured scene can be reconstructed based on these SPC measurements with the use of various reconstruction algorithms. In order for the reconstruction to be possible, the original scene must be sparse in some domain, and the columns of the sensing matrix must be incoherent with the sparsity basis.
Such cameras are an alternative to widely used technologies such as charge-coupled devices (CCDs) or complementary metal–oxide–semiconductor (CMOS) devices, which are only available for a small section of the electromagnetic spectrum. Applications for SPC have been developed in multiple fields, such as fluorescence [1] and infrared [2] microscopy, methane gas leak detection [3], and three-dimensional imaging [4].
Today, most applications involve communication over public channels, where data are exposed to theft or attacks. The common practice used to protect data is standard encryption after the compression stage. Another solution is partial encryption, where, in the case of transform-based compression of images, only the sign or a part of the significant coefficients is encrypted. One difference from standard encryption lies in the resulting image, which is noise-like after standard encryption but retains discernible content after partial encryption. The main drawback of partial encryption is its vulnerability to error concealment attacks (ECAs): it was shown that the encrypted coefficients can be estimated using semantic or statistical rationales to obtain an image closer to the original [5].
Data hiding is the generic name for the set of techniques that aim to insert information into a multimedia carrier. Reversible data hiding (RDH) comprises a particular class of methods allowing perfect recovery of the carrier after information extraction. Data hiding is commonly used for data integrity, covert communication, non-repudiation, copyright protection, and authentication, among other applications [6].
We propose an RDH method specific to CS that involves protecting part of the CS measurements with a secret key and embedding them into the remaining measurements. The method can also be seen as an alternative to partial encryption due to the result consisting of reconstructed images with visible distortion.
The embedded measurements are protected by XOR operations with a single-use secret key generated via a cryptographically secure pseudorandom number generator (CSPRNG) that passes the NIST SP 800-22 tests [7]. The key is longer than the encrypted plaintext. This secret key must be transmitted using a secure communication protocol that follows the guidelines in NIST SP 800-57 [8], thus ensuring that only authorized parties can access it.
The embedding is performed on the fly with an RDH method, which is a prediction error expansion algorithm modified to fit CS measurement statistics.
Contrary to the common practice in RDH, where the concern is to embed data without visible distortion of the carrier, in our method, the embedding is performed on a large number of bits so as to cause strong visual distortion if an unauthorized user tries to reconstruct the image from the modified measurements.
The method is dedicated to the confidential information protection of CS measurements obtained sequentially from an SPC. Due to the on-the-fly insertion, it avoids the buffering of CS measurements for the subsequent standard encryption and generation of a thumbnail preview.
The novelties of our method are as follows:
Light encryption with a secret key for part of the CS measurements, using the remaining measurements as the carrier. No particular measurements have to be selected for embedding since, in CS, random projections contribute equally to image reconstruction. Image distortion is tuned by the percentage of embedded measurements.
On-the-fly embedding of the measurements. In an SPC scenario, the CS measurements are taken sequentially. To avoid measurement buffering before transmission, the embeddings are realized as they are obtained.
A modified version of the prediction error expansion algorithm. One particularity of CS measurements is their statistical independence. Since there is no correlation between neighboring measurements, the prediction error is uniformly distributed. For the prediction error to have a Gaussian distribution, we replaced the mean of the neighbors with the mean of all measurements.
The method is evaluated under the following aspects: the capacity of the data-hiding algorithm, the distortion introduced by embedding, and the impact on data volume. An equation is derived for the insertion capacity, and the experimental rate–distortion curve of the method is given.
The experiments were performed on synthetic and real data. A sky image was divided into 359 patches, and we calculated a series of random projections for each patch to simulate the acquisition. The idea behind this choice was the potential of SPC in sky exploration, where bands of the electromagnetic spectrum, like the far infrared, are of interest. For such bands, large sensor arrays are not available and, in addition, the signal is very weak [9]. The real data were measured under white light using our setup from [10].
The rest of the paper is organized as follows:
Section 2 presents the two related domains, i.e., data protection with a secret key and partial or selective encryption.
Section 3 describes the on-the-fly data hiding, explains the modified data-hiding algorithm, and discusses the impact on data representation.
Section 4 is dedicated to the insertion capacity: an equation for capacity is derived, and the mechanism of distortion control is introduced.
Section 5 analyzes the sources of distortion in order to better understand how to control them.
Section 6 discusses the behavior of the method against error concealment attacks.
Section 7 is dedicated to experiments done with a sky image and real data. Examples of image reconstruction are depicted, and an experimental rate–distortion curve is plotted. Conclusions are drawn in the final section.
3. The Embedding of CS Measurements
The algorithm we use is a reversible data-hiding method that was proposed in [34] and further discussed in [35]. It is a prediction error expansion schema, as presented in [36], where the authors innovate at the level of the predictor and insert data bit by bit into a 2D image. If the prediction error lies between the two thresholds −T and T, it is expanded, and information is inserted. Otherwise, the value is shifted so that no overlapping occurs, and extraction is lossless.
Considering that the image to be acquired is sparse and given the set of patterns used for projection, the CS measurements y are obtained by projecting the vectorized image onto the rows of the measurement matrix.
The prediction error expansion algorithm has been updated to better fit this scenario: inserting the data on the fly in the measurements acquired sequentially by an SPC.
Firstly, the predicted value is a constant equal to the mean value of the measurements. This is because, as opposed to a natural image, where neighboring pixels have correlated values and allow for a reliable prediction, consecutive SPC measurements are statistically independent of each other. This also allows us to try to insert data at every position in the stream, since none of the measurements needs to remain unchanged in order to compute prediction errors.
Secondly, the original algorithm uses two thresholds for data insertion and extraction, as information is embedded only when prediction errors fall between the thresholds. This directly impacts capacity and distortion. In the proposed schema, the two thresholds can be used for fine control of the image distortion or simply ignored.
Lastly, our method allows multiple bits to be inserted as opposed to only one, leading to a higher possible distortion. This aspect is discussed in more detail in the following subsections.
Another important aspect is that the embedded data is composed of measurements that are processed and inserted on the fly at acquisition time (the flowchart is illustrated in Figure 1). This directly impacts the embedding capacity of the schema compared with the original one. When selecting measurements to serve as embedded data, no additional check is performed; the criteria used to decide whether a value is eligible for carrying data do not apply.
The CS measurement matrix is a Hadamard matrix with randomly permuted columns, a binary matrix that is easy to implement and use in CS scenarios [37]. By construction, the first row of a Hadamard matrix produces a measurement with double the maximum value of the others; this is why the first row is not used in the proposed scenario and is excluded altogether. The columns are permuted using a random number generator with a known seed, so the results are reproducible by an end user. This seed is generated at acquisition time and is then transmitted with the measurements in order to make the reconstruction possible.
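As a rough illustration, the construction of this measurement matrix and a simulated acquisition can be sketched in Python as follows; the patch size, seed, and function names are illustrative and do not reproduce the exact implementation used here.

```python
# Minimal sketch (illustrative names and sizes, not the paper's implementation).
import numpy as np
from scipy.linalg import hadamard

def scrambled_hadamard(n_pixels, seed):
    """Hadamard matrix with randomly permuted columns and the first row removed."""
    H = hadamard(n_pixels)                      # Sylvester construction, entries +1 / -1
    rng = np.random.default_rng(seed)           # known seed -> reproducible scrambling
    H = H[:, rng.permutation(n_pixels)]         # randomly permute the columns
    return H[1:, :]                             # drop the all-ones first row

def simulate_spc(patch, sampling_rate, seed):
    """Simulate sequential SPC measurements of a vectorized patch."""
    scene = patch.ravel()
    Phi = scrambled_hadamard(scene.size, seed)
    m = int(sampling_rate * scene.size)         # keep, e.g., 40% of the rows
    return Phi[:m, :] @ scene                   # one measurement per retained row

# Example: a 32 x 32 patch sampled at 40%
patch = np.random.rand(32, 32)
y = simulate_spc(patch, sampling_rate=0.40, seed=12345)
```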
Two types of users can receive and use the modified measurements: an authorized user, who can compute the original measurements and reconstruct the original image, and an unauthorized user, who is only able to reconstruct a lower-quality version of the image using the modified measurements.
In the case of an authorized user, the information received consists of the modified measurements, the seed used to scramble the Hadamard matrix, the position map of embedded measurements, and the secret key used for data protection. An unauthorized user only receives the modified measurements and a reduced measurement matrix that can be used to reconstruct the distorted image. It is built by removing from the original matrix the rows corresponding to the indexes in the position map.
3.1. On-the-Fly Insertion
Suppose that the SPC collects sensor measurements one by one, and insertion is done on n bits. In order to embed data in the acquired values, some of them are processed and used to mark the next measurements as follows:
The measurement is first converted to binary (represented on 16 bits), and an XOR with the secret key is applied.
The resulting 16 bits are split into chunks of length n.
Each chunk is converted back to decimal, resulting in the values that will be inserted. If the division of the total number of bits by n provides a remainder, the last bits are stored and appended to the next binary representation.
Each of the decimal values is inserted in the following measurement values based on the algorithm described in Section 3.2.
The maximum value that can be inserted is therefore 2^n − 1.
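The processing of one measurement into protected n-bit chunks can be sketched as follows; this is an illustrative Python fragment, and placing the carried-over bits in front of the next bit string is one possible convention.

```python
# Illustrative sketch: a 16-bit measurement is XOR-ed with the next 16 bits of the
# secret key and split into n-bit chunks; leftover bits are carried over to the
# next measurement (here they are placed in front of the next bit string).
def measurement_to_chunks(value, key_bits, n, leftover=""):
    """Return the n-bit chunk values to embed and the bits carried to the next step."""
    bits = format(value & 0xFFFF, "016b")                               # 16-bit binary form
    bits = "".join(str(int(b) ^ k) for b, k in zip(bits, key_bits))     # XOR with the key
    bits = leftover + bits                                              # carried-over bits first
    chunks = [int(bits[i:i + n], 2) for i in range(0, len(bits) - n + 1, n)]
    remainder = bits[len(chunks) * n:]                                  # stored for the next call
    return chunks, remainder                                            # each chunk is at most 2**n - 1

# Example with 7-bit insertion: 16 bits yield two chunks and a 2-bit remainder
chunks, rem = measurement_to_chunks(value=1234, key_bits=[1, 0] * 8, n=7)
```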
The insertion process is depicted in Figure 1. The SPC generates measurements one by one, and each of them is processed at acquisition time. The very first collected measurement is processed, and the resulting values are inserted in the next available measurements. For the following acquired measurements, if data is available for insertion and the measurement is eligible for marking, the data is embedded, and the modified measurement value is transmitted. If insertion is not possible, the measurement is shifted and transmitted. If no data is available for insertion, the acquired measurement is processed to be embedded in the measurements that follow.
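For orientation only, a generic n-bit prediction error expansion step with a constant predictor can be written as below; the exact expansion and shift rule used in Section 3.2 may differ from this textbook form.

```python
# Generic n-bit prediction-error-expansion step with a constant predictor equal to
# the (integer) mean of the measurements; given only to fix ideas, not the exact rule.
def pee_step(y, y_hat, n, T, chunk):
    """Embed an n-bit chunk in y if its prediction error is expandable, else shift y."""
    e = y - y_hat                                # prediction error w.r.t. the mean
    if -T < e < T:                               # insertable: expand the error by n bits
        return y_hat + e * (1 << n) + chunk, True
    shift = T * ((1 << n) - 1)                   # non-insertable: shift so ranges do not overlap
    return (y + shift if e >= T else y - shift), False
```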
A numerical example is depicted in Figure 2. The insertion is done on seven bits. The measurement to be inserted is first converted to its 16-bit binary representation; two seven-bit chunks are extracted, while the two-bit remainder will serve to build up the next chunk. The carrying sequence consists of the following measurements: [−4, 2, 22, 23]. The two chunks are inserted in the first and second measurements, which correspond to prediction errors less than T and larger than −T. The rest of the measurements are only shifted by seven bits.
With these steps, the following data is transmitted: the marked measurements, the seed value used to construct the measurement matrix, and a location map representing the positions in y of the measurements that were extracted and then embedded into eligible measurements that follow.
An authorized user has access to extra information that allows the original measurements to be computed: the threshold value T (with the interval (−T, T) defining the expandable set of prediction errors), as well as the secret key used to perform the XOR operation.
Because a part of the measurements is used for embedding, the resulting measurement vector is shorter than the original length L. This means that, in order to reconstruct a lower-quality version of the original image from it, an unauthorized user must use a truncated version of the measurement matrix. It can be constructed based on the random seed and the location map as follows: generate a Hadamard matrix, permute the columns using the same random number generator algorithm and seed initially used, and then remove the rows corresponding to the positions in the location map.
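A minimal sketch of this truncated matrix, reusing the scrambled Hadamard construction sketched earlier, could look as follows (names are illustrative):

```python
# Truncated matrix an unauthorized user can build from the public seed and the
# location map; reuses the scrambled_hadamard sketch given earlier.
import numpy as np

def reduced_matrix(n_pixels, seed, n_measurements, location_map):
    """Measurement matrix with the rows of the embedded-as-data measurements removed."""
    Phi = scrambled_hadamard(n_pixels, seed)[:n_measurements, :]   # same seed as acquisition
    keep = np.setdiff1d(np.arange(n_measurements), np.asarray(location_map))
    return Phi[keep, :]                                            # drop rows in the location map
```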
Considering this, the extraction steps an authorized user would perform are as follows:
Go through the elements of the received marked measurements one by one to extract the embedded data and recover the set of original measurement values. This is possible since we use an RDH algorithm [35].
For each extracted value, convert it to binary, represent it on n bits, and perform an XOR operation to obtain the original data.
Concatenate the resulting values and slice chunks of 16 bits. Each 16-bit value is then converted back to decimal, resulting in the set of measurement values embedded in the data insertion stage.
Use the position map to insert these in the correct locations, finally obtaining the original values of y.
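Assuming the n-bit chunks have already been recovered by the reversible extraction and that the location map lists positions in ascending order, steps 2–4 above can be sketched as follows (illustrative names):

```python
# Illustrative reassembly of the protected measurements: undo the XOR, slice the bit
# stream into 16-bit words, and put the embedded measurements back at their positions.
def reassemble(chunks, key_stream, n, positions, carriers):
    """Rebuild the embedded measurements and restore the original measurement order."""
    bits = "".join(format(c, f"0{n}b") for c in chunks)                     # concatenate chunks
    bits = "".join(str(int(b) ^ k) for b, k in zip(bits, key_stream))       # undo the one-time XOR
    values = [int(bits[i:i + 16], 2) for i in range(0, len(bits) - 15, 16)]  # 16-bit words;
                                                                             # a trailing partial word is dropped
    y = list(carriers)                                                       # recovered carrier values
    for pos, val in zip(positions, values):
        y.insert(pos, val)                                                   # restore original positions in y
    return y
```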
As the deployed RDH algorithm is reversible, the recovered values are the same as the original ones, except for the final one, which could be truncated [35].
4. The Insertion Capacity
The scrambled Hadamard matrix usually behaves like a random Gaussian matrix [37]. The obtained measurements have a Gaussian distribution with zero mean (Figure 3). The standard deviation depends on scene sparsity: the sparser the scene, the lower the standard deviation. Knowing the distribution and the thresholds ±T, the number of insertable measurements M can be estimated by multiplying the total number of measurements L by the probability P of a measurement falling between the thresholds: M = L·P.
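As a numerical illustration of M = L·P under the zero-mean Gaussian model stated above; the values of L, T, and the standard deviation below are examples, not those of our data:

```python
# Numerical illustration of M = L * P for zero-mean Gaussian measurements.
from math import erf, sqrt

def insertable_measurements(L, T, sigma):
    """Expected number of measurements falling between the thresholds -T and T."""
    P = erf(T / (sigma * sqrt(2)))     # P(|y| < T) for a zero-mean Gaussian with std sigma
    return L * P

M = insertable_measurements(L=1638, T=500, sigma=200)   # P is about 0.99 here
```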
If the insertion is done on n levels, then the maximum number of bits that could be inserted is n·M. In fact, this is only an upper limit since there is always a nonzero probability of using such measurements as data.
Suppose that x is the true number of measurements that, at the end of the on-the-fly insertion process, are carrying data. Then x satisfies an equation in which one factor is the total number of measurements, insertable or not, sacrificed in view of insertion (the rounding was neglected) and q is the probability of having insertable measurements among them. The true number of measurements that carry data is then derived from Equation (10).
Hereinafter, we calculate q under the hypothesis that we need two carrying measurements for inserting one, meaning a number of insertion levels n between 8 and 16. Such a situation occurs when we have a string of three insertable measurements with, possibly, one or more non-insertable measurements between the first and second ones:
where m is the number of terms. The higher-order terms, being products of probabilities, quickly become negligible. For loose thresholds the approximation is very good; as the thresholds tighten around zero, the probability P decreases, the higher-order terms start to count, and the approximation becomes coarser. The error in approximating q was estimated for small values of m, and for higher m it is even lower. We do not encourage the use of small thresholds because they reduce the insertion capacity and weaken the method, as explained in Section 6.
With this approximation, the true number of measurements carrying data can be computed. The capacity in number of inserted bits is then n·x, and the relative capacity, expressed in bits per measurement, is n·x/L.
There are two mechanisms to control the capacity and, ultimately, the distortion: the threshold T, through the probability P, and the number of insertion levels n. The former provides fine tuning, the latter a coarse one.
If we give up the fine tuning by working with thresholds large enough that virtually all measurements are insertable, the relative capacity simplifies to 16·n/(16 + n) bits per measurement, which has the advantage of being independent of image sparsity.
The fine tuning by controlling T is limited by the constraint of keeping a 32-bit representation after data embedding. In this case, the range of the marked measurements must stay within the 32-bit representation, which imposes an upper bound on the threshold, given by Equation (16). For 13 levels, for instance, the maximum threshold is 3, meaning a very low insertion capacity and a low distortion of the recovered image. It is a mechanism that can be effective only at lower n.
Figure 4 depicts the relative capacity as a function of the standard deviation of the measurements, parametrized for 8 to 13 insertion levels and the corresponding maximum thresholds. The plots follow a similar trend. The capacity is constant up to a certain standard deviation and then decreases because of insufficient measurements available for carrying data (Equation (14)). The capacity controlled only by the number of insertion levels is represented by the dashed lines (Equation (15)).
The number of remaining measurements after insertion, available for a low-quality reconstruction, is the initial number of measurements minus those sacrificed as embedded data. In Figure 5, it is plotted vs. the number of insertion levels for three initial numbers of measurements given as percentages.
7. Experimental Results
We experimentally test our method under the following aspects: embedding capacity, distortion, and impact on data volume.
The experiments are carried out on simulated and real data. The simulated data are obtained from an image of the W51 nebula (Figure 8), converted to grayscale using the weighted method, normalized to have values between 0 and 1, and split into 359 patches. Figure 9 shows two patches with different sparsities. We simulated the acquisition by calculating random projections on a Hadamard matrix generated using Sylvester's construction. The matrix only contains 1 and −1 elements. The elements in the first row are all positive, while all the other rows have an equal number of positive and negative elements. The first row was excluded since it leads to a first measurement with a double maximum value compared to the others. The matrix columns are randomly scrambled prior to image projection [37]. We use 40% of the measurements, meaning that 60% of the matrix rows are discarded. To reconstruct the images from the embedded measurements, the matrix size is reduced accordingly.
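For reference, the patch preparation can be sketched as below; the grayscale weights (ITU-R BT.601) and the patch size are assumptions and not necessarily those used in our experiments.

```python
# Sketch of the patch preparation: weighted grayscale, normalization, patch split.
import numpy as np

def prepare_patches(rgb, patch_size=256):
    gray = rgb @ np.array([0.299, 0.587, 0.114])                 # weighted grayscale conversion
    gray = (gray - gray.min()) / (gray.max() - gray.min())       # normalize to [0, 1]
    h, w = gray.shape
    return [gray[i:i + patch_size, j:j + patch_size]             # non-overlapping patches
            for i in range(0, h - patch_size + 1, patch_size)
            for j in range(0, w - patch_size + 1, patch_size)]
```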
The real data consist of a set of measurements acquired with our setup in [10]. The imaged object is a 0.5 mm thick sheet of metal with machined patterns and a total dimension of 13 mm × 13 mm (Figure 10). The object was exposed to homogenized light from a halogen lamp. Since the construction of the SLM in the single-pixel camera only allows the use of sensing matrices composed of 0s and 1s, the real data were measured with an S-matrix having randomly scrambled columns. This matrix can be obtained from the Hadamard matrix by removing its first row and column and turning all negative values to 0 [39], followed by randomly permuting its columns.
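A minimal sketch of this S-matrix construction is given below (size and seed are illustrative):

```python
# S-matrix sketch: drop the first row and column of a Hadamard matrix, turn the
# -1 entries into 0, then randomly permute the columns.
import numpy as np
from scipy.linalg import hadamard

def scrambled_s_matrix(n, seed):
    H = hadamard(n)                               # entries +1 / -1
    S = (H[1:, 1:] + 1) // 2                      # drop first row/column, map -1 -> 0, +1 -> 1
    rng = np.random.default_rng(seed)
    return S[:, rng.permutation(S.shape[1])]      # randomly permute the columns
```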
For image reconstruction, either from simulated or real data, we use an implementation of the fast iterative shrinkage-thresholding algorithm (FISTA) presented in [40].
For embedding data, we considered a loose threshold of T = 500, which ensures that all measurements are eligible for insertion.
When analyzing the results, we must take into account that an unauthorized user should still be able to reconstruct a lower-quality version of the original image using the modified measurements.
Since, in our experiment, all measurements are eligible for insertion, the capacity can be calculated directly using Equation (15). In Figure 11, the relative capacity is plotted against the number of insertion levels n. The curve has an ascending trend, with the capacity, expressed in bits/measurement, growing steadily from 1 insertion level up to 14 levels.
To evaluate the image distortion introduced by embedding, we calculate the PSNR, taking as a reference the image reconstructed from the original measurements. Figure 12 plots the median PSNR of all 359 patches for a loose threshold and for a more restrictive one. A drop in the reconstruction quality can be noticed after six levels; up to 7 insertion levels, the distortion is not visible. For this reason, we restrict our analysis to 7 insertion levels and above. We repeated the experiments for a tight threshold; the plot in Figure 12 shows a significantly higher distortion for small n but results close to those of the loose threshold for higher n.
We deepened the analysis by splitting the image set into three subsets corresponding to the following three ranges of pixel standard deviation: [13, 19], [19, 24], and [24, 67]. The ranges were chosen to cover the standard deviations of all images and to have approximately the same number of images in each subset. In natural images, the standard deviation gives an indication of image sparsity, which is the main parameter in CS reconstruction from a fixed percentage of measurements. The plots of PSNR vs. n are quite similar for the three ranges (Figure 13).
Examples of image degradation are shown in Figure 14. For seven insertion levels, the quality of the two patches is rather good. At 10 insertion levels, the distortion is quite strong, especially for the dense patch, whose content becomes almost indiscernible at 14 levels. The median PSNR at 10 levels for the whole set is reported in Figure 12, and it varies only slightly over the three ranges considered for the standard deviation. Given the visual quality of the images and the small variation of the PSNR with the standard deviation range, we adopted 10 as the appropriate number of insertion levels.
The distortion and the capacity are closely related. Figure 15 shows the correlation between the relative capacity and the PSNR. As expected, both capacity and distortion increase with the number of insertion levels.
The data insertion by our algorithm modifies the data volume. There are two opposite effects that compete: sacrificing measurements for insertion reduces the data volume, and data embedding extends the data representation from 16 to 32 bits. These effects are summarized in the compression rate equation below:
For loose thresholds, the number of measurements x in Equation (13) becomes x = 16·L/(16 + n), which leads to the following expression for the compression rate: r = 32/(16 + n). The compression rate shows that the algorithm always expands the data volume for a number of insertion levels less than 16. It should also be noted that, given the loose thresholds, the data volume is not influenced by either the image size or the content; only the number of insertion levels matters.
Figure 16 shows the evolution of r for the analyzed range of insertion levels. The compression rate is always higher than 1, showing a data overflow varying from 39% for n = 7 to 7% for n = 14.
The data throughput of SPCs is limited by the Digital Micromirror Device (DMD). Texas Instruments (Dallas, TX, USA) DMD products have binary pattern rates ranging from 2500 Hz to 32,000 Hz [41]. Multiplying these rates by 32 bits, which is the measurement binary representation after data hiding, a maximum throughput of about 1 Mbit/s is to be expected. The volume expansion of only 7–39% takes into account the reduction in the number of measurements to be transmitted. This means a shorter transmission time, but the processing time necessary for data hiding must also be taken into consideration.
To jointly evaluate the data expansion and the image distortion, we plotted the rate–distortion curve in Figure 17. As the number of insertion levels increases, the distortion becomes more evident and the data expansion less important. For 10 levels, a number that we considered appropriate for our application, the measurement volume increases by about 23%. With respect to the image size, and taking into account that the sampling rate is 40%, the data volume after insertion remains close to the original image size, which means an overall image compression rate of about 1. CS itself expands the data representation by a factor of 2, from 8 bits/pixel to 16 bits/measurement.
We measured the execution time and the memory required for both data hiding and data extraction for the test images, with 40% measurements and loose thresholds (Table 1). The simulations were performed on macOS 13.4 with a 3.5 GHz CPU and 8 GB RAM, using Python 3.9.6. It is to be noted that both time and memory increase with n and that they are higher for data extraction.
Table 1 also includes the values obtained when inserting 10 bits under two more restrictive settings. As expected, they are lower, for both execution time and memory requirements, when compared with the values obtained with loose thresholds and 40% measurements.
The experiments on smaller patches of 128 × 128 pixels have shown, on average, a higher PSNR. If the user chooses to work with other patch sizes, the distortion can be tuned through the number of insertion levels.
We compare our method with [33], which is close to ours in terms of goal and approach. Similar to us, the authors of [33] perform partial encryption of CS measurements and hide information. Unlike our method, the encryption concerns the sign of part of the measurements, while the hidden data is side information. The tuning of the distortion is obtained via the percentage of encrypted measurements. For the image size and encryption percentage reported in [33], with 50% measurements, the distortion of the reconstructed image is similar to our result. The data hiding in [33] is done by histogram shifting, which supposes an estimation of the measurement distribution. Our method has the advantage of being blind and of allowing an on-the-fly insertion.
The simulation results included in the article are focused on the nebula patches, as these have similar characteristics to what would be expected in a single-pixel camera scenario. We also tested the algorithm against natural images commonly used in data-hiding articles. The results are rather similar to those obtained for nebula patches. For a loose threshold of T = 500, which ensures that all measurements are eligible for insertion, visible distortion appears starting with insertion on 7 bits.
Two other methods related to our work are [42,43]. Although in these works CS is not concerned with the scene acquisition, it is used to pre-process the cover or watermark images. In [42], for pre-processing, the authors use prediction coding based on CS progressive recovery and encryption by XOR with a random stream generated with a secret key. Side information is inserted into the encrypted data. The resulting insertion capacity is up to 4 bpp for a recovered image quality higher than 39 dB. In [43], CS is applied on the principal components (PCs) of the watermark image to obtain CS measurements that are further embedded into the HL subband of the wavelet-transformed cover image. The method has three security layers, one from the PCs and two from CS. It reaches insertion rates of around 2 bpp and a PSNR above 30 dB for both the watermark and the cover image.
The insertion in real data with our method led to lower PSNRs compared with simulated measurements. At 10 insertion levels, the PSNR of the real image falls below the approximately 14 dB obtained for the images reconstructed from simulated measurements. The real images are shown in Figure 10. In all three cases, i.e., 7, 10, and 14 insertion levels, the PSNRs are under those obtained for synthetic data.