1. Introduction
Images are important carriers of information, but they are vulnerable to attack or tampering when transmitted as plaintext; therefore, image encryption has been a research focus in the field of information security. Since the double random phase encoding (DRPE) method was proposed by Refregier and Javidi in 1995 [1], optical image encryption has played an increasingly important role in the field of image encryption owing to its rapid processing speed, high parallelism, and ample degrees of freedom for security keys [2,3]. In recent years, more and more frontier technologies, such as compressive sensing [4,5,6], metasurfaces [7,8,9,10], quantum walks [11,12], and ghost imaging [13,14,15], have been gradually introduced into optical image encryption, thereby promoting the vigorous development of this technology. However, most optical image encryption technologies are only applicable to the encryption of a two-dimensional (2D) image. Compared with a 2D image, a three-dimensional (3D) image possesses richer content and larger information capacity, and it can express the depth, position, and spatial relationship of the target object more accurately and quantitatively. Consequently, how to encrypt a 3D image has become a research hotspot in the field of optical image encryption in recent years. So far, several optical 3D image encryption methods have been developed, whose core technologies are respectively based on diffractive imaging [16], integral imaging [17,18,19], the optical heterodyne technique [20], interferenceless coded aperture correlation holography [21], digital holography [22,23,24], and computer-generated holography [25,26,27,28,29,30,31]. It can be seen that holographic technology has very extensive applications in 3D image encryption. Meanwhile, as a research hotspot in information optics, holographic technology has also been applied in more fields [32,33]. Among various holographic techniques, computer-generated holography has the unique merits of flexible encoding, easy storage, and photorealistic reconstruction for both virtual and real 3D scenes. For example, Kong et al. successively proposed an optical 3D image encryption method [25] based on a Fourier computer-generated hologram (CGH) and a vector stochastic decomposition algorithm and a 3D information hierarchical encryption method [26] based on a Fresnel CGH and a chaotic random phase mask (CRPM). Later, Han et al. [27] combined the iterative phase retrieval algorithm with the layer-oriented angular-spectrum algorithm to produce a novel 3D image encryption algorithm. In addition, Piao et al. successively proposed two multi-depth 3D image encryption methods based on a phase retrieval algorithm [28] and a CGH with a cascaded structure [29], respectively. Besides, Wang et al. [30] proposed a 3D image encryption method by using the layer-oriented angular-spectrum algorithm, double-phase method, and CRPM. Recently, Su et al. [31] also proposed a 3D image encryption method based on structured light illumination and an iterative layer-oriented angular-spectrum algorithm.
However, in all of the CGH-based 3D image encryption methods mentioned above [25,26,27,28,29,30,31], 3D plaintext images are encrypted into ciphertexts with a noise-like distribution, which may cause these ciphertexts to attract unnecessary attention during transmission and become vulnerable to illegal attacks, resulting in a serious threat to the security of 3D plaintext information. To solve this problem, Ji et al. [34] very recently proposed a 3D image hiding method based on CGH and Henon mapping. In that method, the 3D plaintext image to be encrypted is first encoded into a phase-only hologram (POH) by using a classical layer-oriented angular-spectrum algorithm; subsequently, the POH is encrypted by introducing a CRPM generated through the Henon mapping technique; finally, the encrypted result is embedded into a 2D host image by combining discrete wavelet transform (DWT) and singular value decomposition (SVD) algorithms, achieving 3D plaintext information hiding with high concealment. The biggest advantage of this scheme is that the final ciphertext is a visible image that is not easily perceived as suspicious by illegal attackers, so that the attention of potential third-party attackers toward the encrypted 3D plaintext information can be effectively reduced. However, despite its high imperceptibility, the security of this 3D image hiding method is inadequate for high-security applications because of its small key space and low key sensitivities. Regarding key sensitivity, the contour of the decrypted 3D image can still be easily recognized with the naked eye when the error rate reaches one half for either of the two singular matrix keys, when the wavelength key deviation is near 50 nm, or when the diffraction distance key deviation is near 50 mm. Such low key sensitivities are hard to accept for most optical image encryption methods. To enhance the security of the 3D image hiding method, in this paper, we propose a security-enhanced 3D image hiding method based on layer-based POH with structured light illumination.
In the proposed method, the original 3D plaintext image is initially encoded into a layer-based POH by utilizing an iterative layer-oriented angular-spectrum algorithm with structured light illumination. The structured light has many optical parameters that all serve as highly sensitive digital keys, so the types, quantities, and degrees of freedom of the security keys are all greatly increased, contributing to a significant expansion of key space. Afterwards, the layer-based POH is encrypted into an encrypted phase with the help of a CRPM generated through a piecewise linear chaotic map (PWLCM), whose key sensitivities are no worse than those of a Henon map. Finally, the encrypted phase is embedded into a 2D host image by using a DWT-SVD-based digital image watermarking technology. Because the final ciphertext has a visible image distribution, abnormalities in it are not easily detected, ensuring a high concealment of 3D plaintext information. Most importantly, the sensitivities of the singular matrix keys and the wavelength key are substantially enhanced thanks to the introduction of structured light, leading to an apparent enhancement of the level of security. To the best of our knowledge, this is the first time that structured light illumination has been applied in a CGH-based 3D image hiding method, which yields a substantial security enhancement, especially a greatly expanded key space and significantly improved key sensitivities. Therefore, the proposed 3D image hiding method shows the advantages of large key space, high key sensitivities, good concealment, and a high level of security. The main innovation of this paper is the improved encoding of the layer-based POH with the help of structured light illumination and the application of this novel POH in digital watermarking, bringing about a security-enhanced 3D image hiding method.
2. Proposed 3D Image Hiding Method
The proposed 3D image hiding method is composed of a hiding process and a decryption process, whose framework diagrams are depicted in Figure 1 and Figure 2, respectively.
2.1. Hiding Process
The purpose of the proposed 3D image hiding method is to hide a 3D plaintext image in a visual ciphertext image for realizing a 3D image encryption with high security and concealment, and the proposed method is actually composed of three parts: (1) layer-based POH encoding, (2) CRPM-based encryption, and (3) DWT-SVD-based digital image watermarking. Specifically, the proposed 3D image hiding process is divided into the following nine steps:
S1: Slicing of 3D plaintext image: The original 3D plaintext image is first sliced to suit the subsequent layer-based POH encoding; that is, the original 3D plaintext image is divided into parallel layers in accordance with depth information, and then the amplitude information of each layer is extracted, where the amplitude information of the i-th layer is recorded as the i-th layer of the plaintext image, and i stands for the layer index. Afterwards, the hiding step S2 is run.
S2: Generation and illumination of encryption structured light (ESL): A collimated plane wave with unit amplitude is phase-modulated by an encryption structured phase mask (ESPM), thus the ESL is generated. Therefore, the complex amplitude distribution of the generated ESL is equal to the phase distribution of the adopted ESPM. In the proposed method, the ESPM is composed of a Fresnel zone plate and a radial Hilbert mask, and thereby the phase distribution of this ESPM can be written by the following [
35]:
where
denotes the operation of phase extraction,
is the imaginary unit,
is the wavelength of the incident light for encryption,
is the radius of the Fresnel zone plate,
is the focal length of the Fresnel zone plate,
is the topological charge of the radial Hilbert mask, and
is the spatial azimuth angle of the Hilbert mask. Afterwards, the ESL begins to diffract forward to the front surface of the POH to be retrieved, and thereby the complex amplitude distribution
of the ESL propagating forward to the front plane of the POH can be calculated by the following:
where
stands for the operation of angular spectrum diffraction with the wavelength of
and the distance of
,
and
respectively denote the Fourier transform and inverse Fourier transform,
is the distance from the ESPM to the POH, and
and
represent spatial frequency coordinates. Afterwards, the hiding step S3 is run.
S3: Forward diffraction: The ESL propagating forward to the front plane of the POH is phase-modulated by the POH, and thus the complex amplitude distribution
of the ESL propagating forward to the back plane of the POH can be expressed as the following:
where
denotes the phase distribution of the POH, and it is initially set as a random phase distributed at [0, 2π]; however, it is noted that
will be continuously updated as the iteration process proceeds. Thereafter, the ESL continues to diffract forward and successively reaches the planes of
layers of plaintext images (
,
,
,
), where the complex amplitude distribution
of the ESL propagating forward to
can be calculated by the following:
where
is the distance from the POH to
. Subsequently, the amplitude distribution
and phase distribution
of the complex amplitude distribution
are extracted and saved, which are respectively expressed as follows:
where
denotes the operation of amplitude extraction. Afterwards, the hiding step S4 is run.
S4: Convergence judgment: After the amplitude distribution
is obtained, it is used as the
-th layer of the decrypted image, and then the correlation coefficient
between the
-th layer of decrypted image
and the
-th layer of plaintext image
can be calculated. After CCs corresponding to all layers are calculated, the minimum value
among these CCs is extracted for performing the convergence judgment. On the one hand, if
exceeds the predefined threshold
, the iteration has converged, and then the hiding step S6 will be run. On the other hand, if
does not exceed the predefined threshold
, the iteration has not yet converged, and then the amplitude constraint will be conducted, that is, the amplitude distribution
is replaced with the amplitude distribution
. Meanwhile, the phase distribution
is retained, and thus a new complex amplitude distribution
at
is generated, which can be expressed as the following:
Afterwards, the hiding step S5 is run.
S5: Backward diffraction: The new complex amplitude distribution
at
will propagate backward to the back plane of the POH, where
. Thereafter, the complex amplitude distribution
of the ESL propagating backward to the back plane of the POH can be calculated by the following:
where
stands for the operation of inverse angular spectrum diffraction with the wavelength of
and the distance of
. Afterwards, a phase distribution
can be derived by using the complex amplitude distribution
and the complex amplitude distribution
calculated above, which can be expressed as follows:
Subsequently, the phase distribution of the POH is updated with this newly derived phase distribution. Afterwards, the hiding step S3 is run again, and the next round of iteration starts.
S6: Output of POH and CRPM-based encryption: When the iteration has converged, the phase distribution
of the POH at this time can be output. So far, the original 3D plaintext image has been encoded into a layer-based POH by utilizing an iterative layer-oriented angular-spectrum algorithm with structured light illumination. Thereafter, a CRPM is generated through the PWLCM [
36]. Concretely, a pseudo-random sequence
is first produced by using the PWLCM, which is defined recursively by the following:
In this sequence,
, and
denotes the total number of elements in a CRPM,
is the controlling parameter within the range of [0, 0.5], and
represents the initial value within the range of [0, 1]. In fact, each element of this sequence
generated by PWLCM is distributed within the range of [0, 1]. Multiplying each element by 2π scales it to lie within the range of [0, 2π]. Subsequently, these scaled elements are sequentially arranged into a matrix
with the matrix size the same as the phase distribution
of layer-based POH, and thereby a CRPM is formed. Next, this CRPM is integrated into the layer-based POH for generating a new phase distribution
, which is denoted as an encrypted phase (POH
Δ), and it can be expressed as follows:
where
stands for the operation of modulo. In such a way, the layer-based POH is further encrypted into an encrypted phase POH
Δ with the help of a CRPM. Afterwards, the hiding step S7 is run.
S7: DWT and SVD of host image: Firstly, a 2D visual image is selected as the host image, and then the host image is decomposed via DWT into four sub-bands, which represent the low-frequency, vertical high-frequency, horizontal high-frequency, and diagonal high-frequency components, respectively. Thereafter, the low-frequency component is further decomposed through SVD into three components, namely a diagonal matrix and two non-empty matrices. Afterwards, the hiding step S8 is run.
S8: Embedding of POH: The encrypted phase POH
Δ is embedded into the diagonal matrix
with the help of an embedding factor
; thus an encryption integrated matrix
is formed, which can be mathematically expressed as follows:
Afterwards, the hiding step S9 is run.
S9: Generation of ciphertext: The encryption integrated matrix
is decomposed via SVD for producing three components (
,
,
), where
is a diagonal matrix, and
and
are two non-empty matrices. Thereafter, these three components (
,
,
) are employed to perform an operation of inverse SVD (ISVD) for forming a reference sub-band
, which can be expressed as the following:
Subsequently, the reference sub-band and the three high-frequency sub-bands obtained in step S7 are employed to perform an operation of inverse DWT (IDWT) for generating the final host image containing the embedded POH information, which is the final ciphertext.
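The watermarking steps S7 to S9 can be illustrated with a minimal numerical sketch, which is not the implementation used in this paper. Several points are assumptions made only for illustration: a single-level Haar wavelet stands in for the DWT, the host image is square with even side length, the encrypted phase has the same size as the low-frequency sub-band, and the host's diagonal matrix is stored alongside the two singular matrix keys so that extraction (mirroring decryption steps D1 and D2) can undo the embedding. All function names are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar DWT (assumed wavelet); returns four sub-bands."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    low = (a + b + c + d) / 2.0
    h1 = (a - b + c - d) / 2.0
    h2 = (a + b - c - d) / 2.0
    h3 = (a - b - c + d) / 2.0
    return low, h1, h2, h3

def haar_idwt2(low, h1, h2, h3):
    """Inverse of haar_dwt2."""
    out = np.empty((2 * low.shape[0], 2 * low.shape[1]))
    out[0::2, 0::2] = (low + h1 + h2 + h3) / 2.0
    out[0::2, 1::2] = (low - h1 + h2 - h3) / 2.0
    out[1::2, 0::2] = (low + h1 - h2 - h3) / 2.0
    out[1::2, 1::2] = (low - h1 - h2 + h3) / 2.0
    return out

def embed(host, enc_phase, alpha):
    """S7-S9: hide enc_phase in the low-frequency sub-band of host."""
    low, h1, h2, h3 = haar_dwt2(host.astype(float))
    u, s, vh = np.linalg.svd(low)                 # S7: SVD of low-frequency band
    integrated = np.diag(s) + alpha * enc_phase   # S8: embed with factor alpha
    u1, s1, v1h = np.linalg.svd(integrated)       # S9: SVD of integrated matrix
    low_ref = u @ np.diag(s1) @ vh                # reference sub-band via ISVD
    cipher = haar_idwt2(low_ref, h1, h2, h3)
    keys = (u1, v1h, s)   # two singular matrix keys plus host diagonal (assumed stored)
    return cipher, keys

def extract(cipher, keys, alpha):
    """D1-D2: recover the encrypted phase from the ciphertext."""
    u1, v1h, s = keys
    low, _, _, _ = haar_dwt2(cipher)
    s1 = np.linalg.svd(low, compute_uv=False)
    integrated = u1 @ np.diag(s1) @ v1h
    return (integrated - np.diag(s)) / alpha
```

With an unquantized floating-point ciphertext, `extract(*embed(host, phase, 0.04)[:2], 0.04)`-style round trips recover the phase up to numerical error; storing the ciphertext as 8-bit integers would add quantization error that this sketch does not model.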
So far, the 3D image hiding process has been completed; that is, an original 3D plaintext image is hidden in a 2D host image to generate a ciphertext with a visible image distribution, providing highly concealed protection for the 3D plaintext information. Compared with the current 3D image hiding method [34], the introduction of structured light during the layer-based POH encoding supplies many additional optical parameters that serve as digital keys beyond the traditional wavelength and distance keys, so that the types and quantities of security keys are significantly increased; all of the added digital keys have high sensitivities, contributing to a significant expansion of key space. More importantly, the sensitivities of the wavelength key and singular matrix keys are also greatly enhanced, overcoming the intolerably low sensitivities of these keys in the current 3D image hiding method [34] and bringing about an apparent security enhancement.
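The layer-based POH encoding with structured light (steps S2 to S5) can be sketched numerically as follows. This is a hedged illustration rather than the implementation used in this paper, and it rests on several assumptions: the ESPM is taken as the phase of exp[−jπr²/(λf)]·exp(jPθ), with r here meaning the radial coordinate; the backward-propagated fields of all layers are summed at the POH plane; the POH phase is updated as the phase of that sum relative to the illuminating ESL; and the pixel pitch and all function names are hypothetical.

```python
import numpy as np

def angular_spectrum(u, wavelength, distance, pitch):
    """Angular spectrum propagation; a negative distance propagates backward."""
    rows, cols = u.shape
    fx = np.fft.fftfreq(cols, d=pitch)
    fy = np.fft.fftfreq(rows, d=pitch)
    fxx, fyy = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)  # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(u) * transfer)

def espm_phase(n, pitch, wavelength, focal, charge):
    """Assumed ESPM: Fresnel zone plate phase plus radial Hilbert mask phase."""
    c = (np.arange(n) - n / 2) * pitch
    x, y = np.meshgrid(c, c)
    fzp = -np.pi * (x**2 + y**2) / (wavelength * focal)
    rhm = charge * np.arctan2(y, x)
    return np.mod(fzp + rhm, 2.0 * np.pi)

def corr(a, b):
    """Correlation coefficient (CC) between two real images."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

def encode_poh(layers, depths, wavelength, d0, pitch, focal, charge,
               threshold=0.95, max_iter=50, seed=0):
    """Iterate S3-S5 until the minimum layer CC exceeds the threshold."""
    n = layers[0].shape[0]
    rng = np.random.default_rng(seed)
    esl = np.exp(1j * espm_phase(n, pitch, wavelength, focal, charge))   # S2
    u_front = angular_spectrum(esl, wavelength, d0, pitch)
    phi = rng.uniform(0.0, 2.0 * np.pi, (n, n))                          # random initial POH
    min_cc = -1.0
    for _ in range(max_iter):
        u_back = u_front * np.exp(1j * phi)                              # S3
        fields = [angular_spectrum(u_back, wavelength, z, pitch) for z in depths]
        min_cc = min(corr(np.abs(f), o) for f, o in zip(fields, layers))  # S4
        if min_cc > threshold:
            break
        # Amplitude constraint: keep each phase, impose the plaintext amplitude.
        constrained = [o * np.exp(1j * np.angle(f)) for f, o in zip(fields, layers)]
        back = sum(angular_spectrum(c, wavelength, -z, pitch)            # S5
                   for c, z in zip(constrained, depths))
        phi = np.mod(np.angle(back * np.conj(u_front)), 2.0 * np.pi)
    return phi, min_cc
```

Because the ESL enters both the forward model and the phase update, every ESPM parameter (focal length, topological charge, wavelength, distance) must be reproduced at decryption, which is why these parameters can act as keys.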
2.2. Decryption Process
The decryption process corresponding to the proposed 3D image hiding method is concretely divided into the following five steps:
D1: DWT and SVD of ciphertext: The ciphertext is firstly decomposed into four sub-bands via DWT. Subsequently, the low-frequency sub-band is further decomposed through SVD to obtain three components, including a diagonal matrix. Afterwards, the decryption step D2 is run.
D2: Generation of decrypted phase: The diagonal matrix
is combined with two singular matrix keys (
,
) for decryption, and these three components (
,
,
) are employed to perform an operation of ISVD for generating the decryption integrated matrix
. Thereafter, the decrypted phase POH
∇ is obtained with the help of the diagonal matrix
and the embedding factor
, and its phase distribution
can be calculated by the following:
Afterwards, the decryption step D3 is run.
D3: CRPM-based decryption and output of decrypted POH: Two digital keys (
,
) for decryption are respectively used as the controlling parameter and the initial value for generating the decryption CRPM (CRPM
∇) through the same PWLCM as the 3D image hiding process mentioned above, and its phase distribution is recorded as
. Next, the decrypted POH (
) is obtained with the help of the decrypted phase (POH
∇) and the decryption CRPM (CRPM
∇), and the phase distribution
of
can be mathematically calculated by the following:
Afterwards, the decryption step D4 is run.
D4: Generation and illumination of decryption structured light (DSL): A collimated plane wave with unit amplitude is phase-modulated by a decryption structured phase mask (DSPM), thereby the DSL is generated, and then the DSL begins to diffract forward to the front surface of the
, thereby the complex amplitude distribution
of the DSL propagating forward to the front plane of the
can be expressed as follows:
where
represents the phase distribution of the DSPM,
is the wavelength of the decryption incident light, and
is the distance between the DSPM and the
. Afterwards, the decryption step D5 is run.
D5: Forward propagation decryption: The DSL propagating forward to the front plane of the
is phase-modulated by the
, and then the phase-modulated DSL continues to diffract forward and successively reaches the
,
,
,
, thereby the amplitude distribution
of the DSL propagating forward to
can be calculated by the following:
Actually, this amplitude distribution is the i-th layer of the decrypted image. After the decrypted images corresponding to all layers are obtained, they are finally stacked to form a decrypted 3D image.
So far, the 3D image decryption process has been completed. It is worth noting that the decryption steps D1–D3 are generally realized digitally, while the decryption steps D4–D5 can be performed either digitally or optically with the help of a diffractive imaging system. Moreover, the security keys required for 3D image decryption include the two singular matrix keys and several digital keys: the initial value key and the controlling parameter key of the PWLCM, the wavelength key, the distance key between the DSPM and the decrypted POH, and the optical parameter keys of the DSPM (the focal length key and the topological charge number key). Only when all decryption keys are correct can the original 3D plaintext image be correctly retrieved.
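The role of the initial value key and the controlling parameter key can be illustrated with a short sketch of the CRPM generation (step S6) and its removal (step D3). It is an illustration under stated assumptions, not the implementation used in this paper: the map is written in the commonly used three-branch PWLCM form, the CRPM is combined with the POH by addition modulo 2π and removed by subtraction modulo 2π, and the function names are hypothetical.

```python
import math

def pwlcm_step(x, p):
    """One PWLCM iteration (standard form, assumed): x in [0, 1), p in (0, 0.5)."""
    if x >= 0.5:
        x = 1.0 - x            # the map is symmetric about x = 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)

def crpm(x0, p, rows, cols):
    """Chaotic random phase mask: rows x cols phases scaled to [0, 2*pi]."""
    mask, x = [], x0
    for _ in range(rows):
        row = []
        for _ in range(cols):
            x = pwlcm_step(x, p)
            row.append(2.0 * math.pi * x)
        mask.append(row)
    return mask

def apply_mask(phase, mask, sign):
    """sign=+1 encrypts (add mask), sign=-1 decrypts (subtract mask), modulo 2*pi."""
    two_pi = 2.0 * math.pi
    return [[(a + sign * m) % two_pi for a, m in zip(pr, mr)]
            for pr, mr in zip(phase, mask)]

# Usage with the key values reported in Section 3 (x0 = 0.741, p = 0.277).
poh = [[0.5, 1.0], [2.0, 3.0]]
mask = crpm(0.741, 0.277, 2, 2)
encrypted = apply_mask(poh, mask, +1)
decrypted = apply_mask(encrypted, mask, -1)
```

Decryption regenerates the mask from the two keys alone, so the mask itself never has to be transmitted.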
3. Simulations and Results
In order to demonstrate the feasibility of the proposed 3D image hiding method, a set of computational simulations is performed. Firstly, a 3D train model with a maximum depth value of 10 mm is constructed in Autodesk 3ds Max 2023, Version 24.0 and employed as the original 3D plaintext image. Subsequently, the nearest and farthest distances from the 3D train image to the POH to be retrieved are set as 800 mm and 810 mm, respectively. After computer rendering, the amplitude map and depth map of the 3D train image are obtained, which are shown in
Figure 3a,b, respectively. Moreover, the wavelength
of the encryption incident light and the distance
from ESPM to POH are respectively set as 632.8 nm and 200 mm, and the predefined threshold
of CC is set as 0.95. Additionally, the ESPM is set to consist of a Fresnel zone plate with the focal length
of 30 mm and a radial Hilbert mask with the topological charge number
of 6, and thus the phase distribution of this ESPM can be calculated by using Equation (1), which is presented in
Figure 3c. Afterwards, the POH can be generated by using an iterative layer-oriented angular-spectrum algorithm with structured light illumination, which is displayed in
Figure 3d. In addition, the initial value
and the controlling parameter
during the generation of CRPM are respectively set as 0.741 and 0.277 such that a CRPM can be generated by using the PWLCM, which is shown in
Figure 3e. Thereafter, a grayscale image (“Barbara”) is employed as the 2D host image, which is depicted in
Figure 3f. After the operation of DWT, the host image is decomposed into four sub-bands, and the DWT result is displayed in
Figure 3g. Moreover, the embedding factor
during the embedding of POH is set to 0.04. After the embedding of POH, an encryption integrated matrix is generated and then decomposed for producing three components (
,
,
) via SVD, where two singular matrices,
and
, are shown in
Figure 3h,i, respectively. Finally, a ciphertext, that is, the host image
containing the embedded POH information, can be generated, which is displayed in
Figure 3j. So far, the hiding process of 3D plaintext image has been finished.
The Shannon entropy, which quantifies the uncertainty, information content, or complexity of pixel values in an image, is commonly used for calculating image information entropy. Its mathematical expression is as follows:
$$H = -\sum_{i} p(x_i)\,\log_2 p(x_i)$$
where $p(x_i)$ is the probability of gray level $x_i$, that is, the ratio of the number of pixels with gray level $x_i$ to the total number of pixels. In fact, the embedding of encrypted information into the host image typically introduces noise, which inevitably increases its information entropy. Therefore, the entropy variation $\Delta H$ serves as an effective metric for evaluating watermarking quality, which is defined as the difference between the ciphertext’s entropy and the original host image’s entropy, and this parameter can demonstrate our scheme’s performance. For the original Barbara image, the measured entropy was 7.6161 bits/pixel. After embedding the encrypted data, the entropy increased marginally to 7.6497 bits/pixel, yielding a minimal $\Delta H$ of +0.0336 bits/pixel, merely a 0.44% increase. This small entropy change indicates that the ciphertext remains statistically very similar to the host image, so potential attackers cannot reliably detect the hidden information through entropy analysis.
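As a small illustration of the entropy metric, the following sketch computes the Shannon entropy of a list of gray levels and reproduces the arithmetic behind the reported entropy variation. The function name is hypothetical, and the pixel lists in the usage lines are toy inputs, not data from this paper.

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy in bits/pixel of a flat sequence of gray levels."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy checks: two equally likely levels give 1 bit; 256 equally likely levels give 8 bits.
two_level = shannon_entropy([0, 255] * 8)
full_range = shannon_entropy(list(range(256)))

# Arithmetic for the values reported above.
delta_h = 7.6497 - 7.6161            # entropy variation in bits/pixel
relative = 100.0 * delta_h / 7.6161  # relative increase in percent
```

The relative increase evaluates to about 0.44%, matching the figure quoted in the text.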
Next, the decryption process is performed in order to check the effectiveness of the keys. Figure 4 shows the decryption results obtained by using all correct keys and by using only one wrong key. From the decrypted results shown in Figure 4, it can be seen that the original 3D plaintext image can be correctly reconstructed with high quality when all decryption keys are correct; however, a slight deviation in any one of the security keys prevents the effective recovery of the original 3D plaintext image, resulting in a failed decryption. In that case, the decryption results generally exhibit a noise-like distribution and their CCs are close to zero, making it impossible to identify any features or meaningful information related to the original 3D plaintext image. Consequently, it can be concluded that all keys used in the proposed method are effective and secure. Moreover, it is worth noting that the focal length key as well as the topological charge number key are newly added compared to the current method [34], contributing to an expanded key space.
Subsequently, several relationship curves between the mean of CCs for all layers (average CC) and the key deviations are plotted to describe the sensitivity of each key quantitatively, where the average CC as functions of error percentage of singular matrix
, error percentage of singular matrix
, initial value deviation, controlling parameter deviation
, wavelength deviation
, focal length deviation
, topological charge number deviation
, and distance deviation
, are shown in
Figure 5a–h, respectively. From the sensitivity curves shown in
Figure 5, it can be seen that the average CC decreases sharply toward zero when any one of the keys deviates slightly, and the original 3D plaintext image cannot be correctly retrieved in that case, proving the high sensitivities of all keys. In particular, the two singular matrix keys depicted in Figure 5a,b are sensitive to an error percentage of about 1%, which is much better than in the current method [34], where the decrypted 3D image can still be easily recognized with the naked eye when the error percentage of either singular matrix key reaches 50%. Moreover, the sensitivity of the wavelength key shown in Figure 5e is near 1 nm, which is also significantly better than the value of about 50 nm in the current method [34]. Consequently, the proposed method exhibits an obvious sensitivity enhancement compared with the current method [34], which is also beneficial for the expansion of key space and the improvement of security.
Afterwards, the key space data for each digital key can be obtained as shown in
Table 1, where the sensitivity described in
Table 1 is the digital key deviation when the average CC first drops below 0.1. Meanwhile, the decrypted 3D image exhibits a noise-like distribution without any visible 3D plaintext information, and the value range in
Table 1 is obtained from the positive range of int-type data. Thus, the total key space for digital keys in the proposed 3D image hiding method is the product of the key space data of each individual digital key, which is approximately
. Compared with the current 3D image hiding method proposed by Ref. [
34], the supplementary digital keys in our proposed method provide a higher degree of freedom in key space thanks to the introduction of structured light. Meanwhile, the sensitivities of the singular matrix keys and wavelength key are substantially enhanced. Therefore, the total key space in our proposed method is several tens of orders of magnitude larger than that in the comparison method proposed by Ref. [34], leading to an apparent security enhancement. It is worth noting that the current key space in our proposed method is still limited because the adopted ESPM is only composed of a Fresnel zone plate and a radial Hilbert mask; once an ESPM with more optical parameters is used in the future, the key space will expand accordingly. Consequently, the proposed 3D image hiding method offers a high level of security with large key space and high key sensitivities, leading to strong protection for the 3D plaintext image.
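The product rule for the total key space can be made concrete with a short sketch. It assumes that each digital key contributes its value range divided by its sensitivity, and it works in log10 to avoid very large numbers. The function name is hypothetical, and the entries in the usage example are placeholders chosen only to show the arithmetic; they are not the values of Table 1.

```python
import math

def key_space_log10(keys):
    """log10 of the total key space.

    keys maps a key name to (value_range, sensitivity); each key is assumed
    to contribute value_range / sensitivity distinguishable settings, and the
    total key space is the product over all keys.
    """
    return sum(math.log10(value_range / sensitivity)
               for value_range, sensitivity in keys.values())

# Placeholder entries (hypothetical, NOT taken from Table 1).
example_keys = {
    "key_a": (100.0, 1.0),    # contributes 10^2 settings
    "key_b": (1000.0, 0.1),   # contributes 10^4 settings
}
exponent = key_space_log10(example_keys)  # total key space is about 10**exponent
```

Under this rule, adding one more independent key multiplies the key space by its own range-to-sensitivity ratio, which is why extra ESPM parameters enlarge the key space.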
Additionally, the robustness of the proposed 3D image hiding method against noise and occlusion attacks was tested. First, to evaluate its resistance to noise interference, the ciphertext was contaminated with varying levels of Gaussian noise, uniform noise, and salt-and-pepper noise. The ciphertext corrupted by Gaussian noise can be expressed as follows:
where
and
respectively denote the original ciphertext and the contaminated ciphertext,
is the Gaussian noise with a mean of
and a standard deviation of 0.02, and the coefficient
is the noise strength.
Figure 6 displays various noise-affected decrypted results by changing the noise strength. From the decrypted results displayed in
Figure 6, it can be seen that although the quality of the decrypted results decreases as the noise strength increases, the contour of the original 3D plaintext image can still be easily recognized with the naked eyes even when the noise strength reaches its maximum, which indicates that the proposed method has good immunity to noise attack.
To further evaluate the scheme’s robustness against uniform noise, we conducted systematic tests by introducing varying intensities of uniform noise to the ciphertext. The noise-contaminated ciphertext can be mathematically modeled as follows:
where
and
respectively denote the original ciphertext and noise-contaminated ciphertext,
represents uniform noise with amplitude range of [−0.03, 0.03], and the coefficient
indicates the noise intensity. As illustrated in
Figure 7, the quality of the decrypted image degrades progressively as the noise intensity increases, but the fundamental contours of the original 3D plaintext image remain clearly discernible to the naked eye, even at maximum noise intensity. This indicates that the proposed method resists uniform noise attacks well.
For the salt-and-pepper noise resistance test, we introduced varying intensity levels of this noise type to the ciphertext, which can be mathematically represented as follows:
where
and
respectively represent the original ciphertext and noise-contaminated ciphertext, and
denotes salt-and-pepper noise with a density of 0.001 and a balanced 1:1 pepper-to-salt ratio, while the coefficient
quantifies the noise intensity.
Figure 8 presents the decryption results under varying noise intensities. Although increasing noise intensity progressively degrades the decrypted image quality, the fundamental contours of the original 3D plaintext image remain clearly distinguishable through visual inspection, even at maximum noise intensity. This indicates that the proposed method is robust against salt-and-pepper noise attacks.
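The three noise attacks can be simulated with a sketch such as the one below. The exact contamination formulas are not reproduced here, so the sketch assumes a ciphertext normalized to [0, 1], an additive model (ciphertext plus noise strength times noise) for Gaussian and uniform noise, a zero Gaussian mean, and a salt-and-pepper model in which the noise strength scales the pixel-replacement density. All of these are assumptions, and the function names are hypothetical.

```python
import numpy as np

def add_gaussian(cipher, k, std=0.02, mean=0.0, rng=None):
    """Additive Gaussian noise (assumed model); mean=0.0 is an assumption."""
    rng = np.random.default_rng() if rng is None else rng
    return cipher + k * rng.normal(mean, std, cipher.shape)

def add_uniform(cipher, k, amp=0.03, rng=None):
    """Additive uniform noise in [-amp, amp] scaled by k (assumed model)."""
    rng = np.random.default_rng() if rng is None else rng
    return cipher + k * rng.uniform(-amp, amp, cipher.shape)

def add_salt_pepper(cipher, k, density=0.001, rng=None):
    """Replace a fraction k*density of pixels, half pepper (0) and half salt (1)."""
    rng = np.random.default_rng() if rng is None else rng
    out = cipher.copy()
    r = rng.random(cipher.shape)
    frac = min(1.0, k * density)
    out[r < frac / 2] = 0.0                      # pepper
    out[(r >= frac / 2) & (r < frac)] = 1.0      # salt
    return out
```

Each contaminated ciphertext would then be passed through the normal decryption steps D1 to D5 with all correct keys to evaluate the CC of the recovered layers.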
Next, in order to check the ability to resist an occlusion attack, part of the ciphertext pixels are occluded.
Figure 9 shows the partly occluded ciphertexts with different occlusion ratios and corresponding decrypted results. From the decrypted results depicted in
Figure 9, it can be seen that although the quality of the decrypted results decreases with the increment of the occluded area, the contour of the original 3D plaintext image can still be easily distinguished with the naked eyes, which confirms that the proposed method has high robustness against occlusion attacks.
Thereafter, the robustness of the proposed 3D image hiding method against low-pass filtering attacks is also checked, taking mean filtering and frequency-domain smooth filtering as examples.
Figure 10 presents the ciphertexts after applying mean filtering attacks with different filter windows and corresponding decrypted results. From the decrypted results shown in
Figure 10, it can be observed that although the decryption quality decreases as the filter window size increases, a considerable amount of useful information about the 3D plaintext image is still retained, which indicates that the proposed method exhibits good robustness against a mean filtering attack.
Figure 11 presents the ciphertexts after applying smoothing filter attacks with different filter radii and corresponding decrypted results. From the decrypted results shown in
Figure 11, it can be observed that although the decryption quality decreases with the decrement of the filter radius, most useful information about 3D plaintext image is still easily recognized with the naked eyes, which proves that the proposed method exhibits strong robustness against a frequency-domain smooth filtering attack.
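The two low-pass filtering attacks can be sketched as follows. The mean filter uses an odd square window with edge padding, and the frequency-domain smoothing is modeled here as an ideal circular low-pass filter whose radius is measured in frequency samples. The padding mode, the ideal filter shape, and the function names are assumptions made for illustration, since the exact filters are not specified in the text.

```python
import numpy as np

def mean_filter(img, window):
    """Box (mean) filter with an odd window size and edge padding."""
    pad = window // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(window):
        for dx in range(window):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / window**2

def lowpass_filter(img, radius):
    """Ideal circular low-pass filter in the Fourier domain (assumed filter shape)."""
    rows, cols = img.shape
    fy = np.fft.fftfreq(rows) * rows   # integer frequency indices
    fx = np.fft.fftfreq(cols) * cols
    fxx, fyy = np.meshgrid(fx, fy)
    keep = fxx**2 + fyy**2 <= radius**2
    return np.real(np.fft.ifft2(np.fft.fft2(img) * keep))
```

A larger window in `mean_filter` and a smaller radius in `lowpass_filter` both remove more high-frequency content, which is consistent with the trends described for Figure 10 and Figure 11.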
Next, the resistance of the proposed 3D image encryption method against a chosen-plaintext attack is examined. For the chosen-plaintext attack, a method based on multiple impulse functions [
37] is referenced and applied to the proposed 3D image encryption method. First, by selecting an impulse function as the plaintext, the encrypted phase value corresponding to each pixel of the plaintext image is obtained through the encryption system with the assistance of ESPM and POHC. Subsequently, the encrypted phase is generated after traversing all pixels. Next, a normal POHC is generated by selecting a 3D image as the plaintext image and processing it through the encryption system with the help of ESPM. This 3D plaintext image and the corresponding POHC are then recorded as a plaintext-ciphertext pair. Following this, a phase key is deduced by combining the plaintext-ciphertext pair with the aforementioned encrypted phase, and its phase distribution is illustrated in
Figure 12a. Finally, decryption is performed using the deduced phase key as DSPM with the assistance of POHC, and the decrypted results obtained from the POHC are shown in
Figure 12b–d. As can be observed from
Figure 12, the deduced phase key in
Figure 12a, obtained through the chosen-plaintext attack, is significantly different from the ESPM shown in
Figure 3c. Moreover, the corresponding decrypted results in
Figure 12b–d contain no valid information related to the original 3D plaintext image, which demonstrates that the proposed method exhibits strong resistance against a chosen-plaintext attack.
Moreover, to further demonstrate the high security of the proposed 3D image hiding method, its resistance to statistical attacks is examined through histogram and correlation analyses.
Figure 13a and
Figure 13b show the histograms of the amplitude information of the 3D train image and corresponding ciphertext depicted in
Figure 3j, respectively. In addition, a 3D dinosaur skeleton model is constructed and used as another 3D plaintext image, and its amplitude map and depth map are displayed in
Figure 13c and
Figure 13d, respectively. Subsequently, the 3D dinosaur skeleton image is also hidden in the 2D host image (“Barbara”) using the proposed method, which generates a new ciphertext, depicted in
Figure 13e.
Figure 13f and
Figure 13g show the histograms of the amplitude information of the 3D dinosaur skeleton image and corresponding ciphertext depicted in
Figure 13e, respectively. By comparing the histograms shown in
Figure 13a,b with those shown in
Figure 13f,g, it can be seen that although the two 3D plaintext images used in the hiding processes above are completely different, there is no significant difference between the histograms of the two ciphertexts. Therefore, no useful information about the original 3D plaintext image can be obtained from the histogram of the ciphertext, which shows that the proposed method can resist a histogram attack.
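The histogram comparison above is visual. As one possible way to quantify it, the sketch below computes normalized 256-bin histograms and their intersection (1 for identical histograms, 0 for disjoint ones). The intersection measure, the function names, and the random stand-in arrays are assumptions introduced for illustration; they are not part of the analysis reported in Figure 13.

```python
import numpy as np

def normalized_histogram(image, bins=256):
    """Histogram of 8-bit gray levels, normalized so that the bins sum to 1."""
    counts, _ = np.histogram(image, bins=bins, range=(0, 256))
    return counts / counts.sum()

def histogram_intersection(h1, h2):
    """Overlap between two normalized histograms, between 0 and 1."""
    return float(np.minimum(h1, h2).sum())

# Stand-in arrays (hypothetical data, not the ciphertexts of this paper).
rng = np.random.default_rng(3)
cipher_a = rng.integers(0, 256, size=(64, 64))
cipher_b = rng.integers(0, 256, size=(64, 64))
hist_a = normalized_histogram(cipher_a)
hist_b = normalized_histogram(cipher_b)
overlap = histogram_intersection(hist_a, hist_b)
```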
In addition, correlation analysis is also carried out, in which 25,000 and 5000 pairs of adjacent pixels in horizontal, vertical, and diagonal directions are randomly selected from the amplitude information of the 3D train plaintext image and corresponding ciphertext, respectively.
Figure 14(a1–a3) shows the horizontal, vertical, and diagonal correlations of adjacent pixels in the amplitude information of the 3D train plaintext image, respectively.
Figure 14(b1–b3) shows the horizontal, vertical, and diagonal correlations of adjacent pixels in the ciphertext depicted in
Figure 3j, respectively. From the correlation distributions shown in
Figure 14, it can be seen that both the plaintext and the ciphertext exhibit strong correlation in all directions. The ciphertext therefore has the statistics of a visible, meaningful image rather than a noise-like distribution with extremely weak correlation in all directions, which demonstrates its high imperceptibility. Consequently, it can be concluded that the proposed method can effectively resist the correlation analysis attack.
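The adjacent-pixel correlation test can be sketched as follows: pixel pairs are sampled at random in a chosen direction and their Pearson correlation coefficient is computed. The function name, the sampling with replacement, and the two stand-in images are assumptions of this example.

```python
import numpy as np

def adjacent_pixel_correlation(image, direction, n_pairs, rng=None):
    """Pearson correlation of `n_pairs` randomly chosen adjacent pixel pairs."""
    rng = np.random.default_rng() if rng is None else rng
    offsets = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}
    dy, dx = offsets[direction]
    h, w = image.shape
    # Choose the first pixel so that its neighbour stays inside the image.
    ys = rng.integers(0, h - dy, size=n_pairs)
    xs = rng.integers(0, w - dx, size=n_pairs)
    first = image[ys, xs].astype(float)
    second = image[ys + dy, xs + dx].astype(float)
    return float(np.corrcoef(first, second)[0, 1])

# Stand-in images (hypothetical data): a smooth ramp and pure noise.
rng = np.random.default_rng(4)
smooth = np.add.outer(np.arange(64), np.arange(64)).astype(float)
noise = rng.random((64, 64))
r_smooth = adjacent_pixel_correlation(smooth, "horizontal", 5000, rng)
r_noise = adjacent_pixel_correlation(noise, "diagonal", 5000, rng)
```

A smooth, image-like array gives a coefficient close to 1, whereas a noise-like array gives a coefficient close to 0, which is the distinction drawn in the discussion of Figure 14.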
Finally, the computational time of the proposed 3D image hiding method is also analyzed in detail.
Table 2 lists the parameters used in the simulation and the configuration of the computational platform. Under these conditions, 100 iterations are performed in sequence.
Figure 15a shows the relationship between the number of iterations and the time consumption, and it can be easily seen from
Figure 15a that the time consumption increases almost linearly with the number of iterations. However, a large number of iterations, and hence a long computation time, is unnecessary in practice because the decrypted 3D image already exhibits high quality after only a few iterations.
Figure 15b shows the relationship between the number of iterations and the average CC, and it can be observed that the average CC converges as the number of iterations increases. In particular, when the number of iterations reaches seven, the average CC is 0.9501, which reaches the predefined threshold and indicates that the decryption quality is already good; further iterations leave the average CC almost unchanged. The total computational time at this point is only about 2.6 s. This combination of short computation time and high decryption quality shows that the proposed method has a low computational cost. Consequently, it can be concluded that the proposed method has good performance in terms of computational time.
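The stopping rule described above (iterate until the CC reaches a threshold, recording the elapsed time) can be sketched generically as below. The Pearson form of the CC, the threshold value of 0.95, the helper names, and the toy update `toy_step` are all assumptions for illustration; the actual iterative algorithm and CC definition are those given earlier in the paper.

```python
import time
import numpy as np

def correlation_coefficient(a, b):
    """Pearson correlation coefficient between two arrays of equal shape."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

def iterate_until_threshold(step, estimate, target, threshold=0.95, max_iters=100):
    """Apply `step` repeatedly and stop once the CC reaches `threshold`.
    Returns the final estimate and a list of (iteration, seconds, cc)."""
    history = []
    start = time.perf_counter()
    for k in range(1, max_iters + 1):
        estimate = step(estimate)
        cc = correlation_coefficient(estimate, target)
        history.append((k, time.perf_counter() - start, cc))
        if cc >= threshold:
            break
    return estimate, history

# Toy example (hypothetical): each step moves the estimate halfway to the target.
rng = np.random.default_rng(5)
target = rng.random((32, 32))
initial = rng.random((32, 32))

def toy_step(estimate):
    return estimate + 0.5 * (target - estimate)

final, history = iterate_until_threshold(toy_step, initial, target)
```

Stopping at the threshold rather than running all iterations is what keeps the total time small when the CC converges early.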