Enhancing Performance of Lossy Compression on Encrypted Gray Images through Heuristic Optimization of Bitplane Allocation

Nowadays, it remains a major challenge to efficiently compress encrypted images. In this paper, we propose a novel encryption-then-compression (ETC) scheme that enhances the performance of lossy compression on encrypted gray images through heuristic optimization of bitplane allocation. Specifically, in compressing an encrypted image, we take a bitplane as the basic compression unit and formulate the lossy compression task as an optimization problem that maximizes the peak signal-to-noise ratio (PSNR) subject to a given compression ratio. We then develop a heuristic bitplane allocation strategy to approximately solve this optimization problem, which leverages the asymmetric importance of different bitplanes. In particular, an encrypted image is divided into four sub-images. Among them, one sub-image is reserved, while the most significant bitplanes (MSBs) of the other sub-images are selected successively, followed by the second MSBs, the third MSBs, and so on, until the given compression ratio is met. As there exist clear statistical correlations within each bitplane belonging to the first three MSBs and between adjacent such bitplanes, we further use the low-density parity-check (LDPC) code to compress these bitplanes according to the ETC framework. In reconstructing the original image, we first deploy joint LDPC decoding, decryption, and Markov random field (MRF) exploitation to recover the chosen bitplanes belonging to the first three MSBs in a lossless way, and then apply content-adaptive interpolation to estimate the missing bitplanes and thus the discarded pixels, which is symmetric to the encrypted image compression process. Experimental simulation results show that the proposed scheme achieves desirable visual quality of reconstructed images and remarkably outperforms state-of-the-art ETC methods, indicating its feasibility and effectiveness.


Introduction
The encryption-then-compression (ETC) technique performs compression on encrypted digital media such as images and videos. It suits scenarios of cloud computing, distributed processing, etc., and contrasts with the conventional compression-then-encryption (CTE) system, which conducts compression prior to encryption. In an ETC system, the content owner only encrypts digital media for secrecy but does not compress them because of computing resource limitations. After the encrypted digital media have been uploaded to the cloud side, the cloud side compresses, without access to the decryption key, the encrypted digital media in order to save storage space and communication bandwidth. In our scheme, the encrypted image is divided into four sub-images, one of which is reserved, and the bitplanes of the other three are selected via a heuristic strategy, e.g., maintaining the first, most significant bitplanes (MSBs) of these three sub-images successively, then preserving the second MSBs, and so on. In other words, we exploit the asymmetric characteristics of different bitplanes in optimal bitplane selection.
As there exist clear statistical correlations both within any bitplane of the first three MSBs and between adjacent bitplanes, we further leverage a Markov random field (MRF) to characterize these statistical correlations [8,9]. Based on this, we then use the low-density parity-check (LDPC) code according to the framework of Johnson et al. [1] to compress the first three MSBs of the sub-images B_00, B_01, B_10, and B_11. This further enhances the compression efficiency of the encrypted images.
After obtaining the downsampled encrypted image, the receiver conducts decompression and decryption to recover the original gray image in a lossy way, which is actually symmetric to the process of encrypted image compression. In particular, the LDPC-coded bitplanes are first reconstructed perfectly via the joint LDPC decoding, decryption, and MRF exploitation [8,9], and CAI [24] is then deployed to generate the missing bitplanes and thus the pixels discarded in the compression process.
With the stream cipher for encryption, the pixel- and bitplane-level downsampling for compression, and the MRF and CAI for lossy reconstruction, we thus propose a novel ETC scheme for the lossy compression of encrypted gray images. Experimental results show that the proposed scheme yields desirable visual quality for reconstructed images and achieves a significant improvement over the state-of-the-art, indicating the feasibility and effectiveness of the proposed scheme.
In summary, the contributions of the paper are two-fold: (1) We formulate the task of maximizing the PSNR of an image reconstructed by a downsampling-based ETC system under a given compression ratio as an optimization problem, and develop a heuristic bitplane allocation strategy to solve it; (2) We propose a novel ETC scheme for lossy compression of encrypted gray images through heuristic optimization of bitplane allocation, achieving a remarkable improvement over the state-of-the-art.
The remainder of the paper is arranged as follows. Related works are introduced in Section 2. The proposed ETC scheme is described in Section 3. Experimental results and conclusions are given in Sections 4 and 5, respectively.

Lossy Compression on Encrypted Images
As this paper mainly focuses on the lossy compression of encrypted images, lossy compression methods for encrypted images and videos in the literature are briefly described below.
For the quantization-based methods, a scalar quantizer is applied to quantize an encrypted image, aiming to reduce the data amount, and successive de-quantization and content-adaptive interpolation (CAI) are generally used to reconstruct the original image in a lossy way. For example, Gschwandtner et al. [14] employed quantizers with different steps and the Huffman coding to implement the compression of encrypted gray images. Zhang et al. [15] conducted compression by discarding rough and fine details of transformed coefficients and quantizing the reserved coefficients. Zhang et al. [16] formed base and enhancement layers of an encrypted gray image, followed by quantizing the enhancement layer to implement compression. Later, Zhang et al. [17] decomposed a gray image into multiple layers and quantized the permuted prediction errors of the different layers. Wang et al. quantized lifting wavelet-transformed coefficients via scalar quantizers constructed in a heuristic [20], rate-distortion-optimized [21], or weighted rate-distortion-optimized [22] manner.
For the uniform downsampling-based category [24][25][26], an encrypted gray image is uniformly downsampled to achieve compression. At the receiver, the CAI [24,25] or MRF [26] is incorporated to reconstruct the original image in a lossy way.

MRF-Based ETC Method
As aforementioned in Section 1, we employ an MRF to characterize statistical correlations both within a bitplane [8] and between adjacent bitplanes [9], and apply it to improve the compression efficiency. Thus, in this subsection, we mainly introduce MRF and MRF-based ETC methods [8,9].

MRF Introduction
MRF is a kind of statistical model characterizing the spatial statistics of a natural image. It has been widely applied in many research fields, such as image denoising, segmentation, and computer stereo vision.
Suppose that I is a Q-bit image of size m × n and L = {(x, y) | x ∈ [1, n], y ∈ [1, m]} denotes the corresponding coordinate set, where Q is the bit depth. If each pixel I(x, y) is represented with a random variable F_s (s ∈ L) that takes on values in the state space Φ = {0, 1, …, 2^Q − 1}, then all F_s form a random field F = {F_s | F_s ∈ Φ, s ∈ L}. If the random field F further satisfies positivity and Markovianity, i.e., p(F = F) > 0 for all F ∈ F and p(F_s | F_{L−s}) = p(F_s | F_{δ(s)}), respectively, then F is an MRF, where L − s and δ(s) denote the coordinate set excluding s and the neighborhood of s, respectively.
By the Hammersley-Clifford theorem, an MRF is equivalent to a Gibbs random field, and thus p(F = F) can be equivalently calculated as [8]

p(F = F) = (1/Z) exp(−U(F)/T), (1)

where Z, U(F), and T are a normalizing constant, an energy function equal to U(F) = ∑_{c∈C} V_c(F), and a temperature constant, respectively. C denotes the set of cliques formed by the neighborhood system δ(s), and V_c(·) is a potential function defined on a given clique c (c ∈ C). To compute p(F = F) efficiently, the MRF is first represented with a factor graph, and the sum-product algorithm is then applied on this factor graph to obtain p(F = F) [21,28]. When the first-order neighborhood system is considered, the factor graph for the MRF can be constructed as follows [21]. Specifically, a variable node (VN) represents each pixel of a given image, and a factor node (FN) characterizes the statistical correlations between neighboring VNs. Figure 1 illustrates the factor graph for a 3 × 3 image, where a VN and an FN are denoted with a circle and a square, respectively. A VN takes on values in Φ = {0, 1, …, 2^Q − 1}, and an FN stands for the potential function U(F) in Equation (1). If a gray image is considered as a combination of eight bitplanes, each bitplane can be taken as a binary image and thus characterized with the factor graph for a binary image (i.e., Q = 1). As investigated in [9], there exist clear statistical correlations within each bitplane and between adjacent bitplanes, where the bitplanes in question are those belonging to the first three MSBs. Therefore, an MRF can further be applied to characterize the statistical correlations between adjacent bitplanes; the corresponding factor graph is illustrated in Figure 2, where B_k and B_{k−1} denote two adjacent bitplanes, F_k(x, y) and F_{k−1}(x, y) are the two VNs at the same coordinate of B_k and B_{k−1}, and D_k(x, y) is an FN representing the statistical correlation between F_k(x, y) and F_{k−1}(x, y).
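To make the Gibbs formulation concrete, the following minimal Python sketch evaluates p(F = F) for a tiny binary image by brute force. The smoothness potential V and the temperature are illustrative assumptions for demonstration only, not the discontinuity-adaptive model adopted later in the paper.

```python
import itertools
import math

def energy(F, V):
    """U(F): sum of pairwise potentials over first-order cliques
    (horizontal and vertical neighbor pairs)."""
    m, n = len(F), len(F[0])
    U = 0.0
    for y in range(m):
        for x in range(n):
            if x + 1 < n:
                U += V(F[y][x], F[y][x + 1])
            if y + 1 < m:
                U += V(F[y][x], F[y + 1][x])
    return U

def gibbs_probability(F, V, T):
    """p(F) = exp(-U(F)/T) / Z, with Z computed by brute force over all
    binary configurations (feasible only for tiny images)."""
    m, n = len(F), len(F[0])
    Z = sum(math.exp(-energy([list(bits[r * n:(r + 1) * n]) for r in range(m)], V) / T)
            for bits in itertools.product([0, 1], repeat=m * n))
    return math.exp(-energy(F, V) / T) / Z

# Hypothetical smoothness potential: equal neighbors lower the energy.
V = lambda a, b: -1.0 if a == b else 1.0
p_flat = gibbs_probability([[0, 0], [0, 0]], V, T=1.0)   # smooth configuration
p_noisy = gibbs_probability([[0, 1], [1, 0]], V, T=1.0)  # checkerboard
```

As expected of a smoothness prior, the flat configuration receives a much larger probability than the checkerboard; in practice the partition function Z is intractable, which is why the sum-product algorithm on a factor graph is used instead.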

Joint Factor Graph for Reconstruction of Bitplanes
In the ETC framework developed by Johnson et al. [1], a stream cipher is adopted for encryption, a channel code such as the LDPC code can be used for the compression of the encrypted data, and a joint decoding and decryption process is conducted to recover the original data. Under this framework, Wang et al. exploited an MRF to characterize the statistical correlations within a bitplane [8] and between adjacent bitplanes [9]. They then constructed factor graphs for stream cipher-based encryption, LDPC-based compression, and MRF exploitation, and combined these three factor graphs seamlessly to form a joint factor graph for the reconstruction of a binary image (BIRFG) [8]. Based on the BIRFG, they further incorporated the factor graph for the statistical correlations between adjacent bitplanes to form a joint factor graph for gray image reconstruction (JFGIR) [9]. Both are briefly described as follows. Figure 3 illustrates the BIRFG for a binary image I of size m × n. It consists of three factor graphs, i.e., the factor graphs for LDPC decoding, decryption, and MRF exploitation, which are plotted in boxes with solid, dotted, and dot-and-dash lines, respectively. The factor graph of LDPC decoding is used to decompress the received encrypted and compressed sequence, that of decryption is deployed to decrypt the decompressed but still encrypted signal, and that of MRF exploitation is incorporated to exploit the statistical correlations of natural images. In Figure 3, S_j (j = 1, …, q) are the LDPC syndrome bits, taken as the encrypted and compressed bit sequence; Ŷ_i (i = 1, …, mn) is the decompressed but still encrypted sequence; K_i is the encryption key sequence; F̂_i is the 1-D bit sequence converted from the given binary image, and F̂_{y,x} denotes the pixels of the 2-D binary image. M_{y,x}/N_{y,x}, P_{y,x}, t_i, and g_j represent the constraints imposed by the potential function, image source prior, decryption, and LDPC code, respectively.
By using BIRFG to characterize bitplanes B k and B k−1 in Figure 2 and deploying D k (x, y) to represent statistical correlations between bitplanes, JFGIR is thus formed accordingly. For compactness, it is omitted here.
With the constructed joint factor graph, the sum-product algorithm is then applied iteratively. After the iteration converges, the original image can thus be reconstructed. For details of the sum-product algorithm on BIRFG and JFGIR, readers are referred to [8] and [9], respectively.

Proposed ETC Scheme
In this section, we present the proposed ETC scheme through heuristic optimization of bitplane allocation, as illustrated in Figure 4. It includes gray image encryption via a stream cipher, encrypted image compression with heuristic bitplane allocation optimization, and lossy reconstruction using MRF and content-adaptive interpolation (CAI). Below are details for these parts.

Stream Cipher-Based Encryption
Suppose that I is a gray image of size m × n. Then, the kth bitplane of image I, namely B_k, is obtained as

B_k(x, y) = mod(floor(I(x, y)/2^(k−1)), 2), k = 1, …, 8. (2)

It is clear that B_8 and B_1 denote the most significant bitplane (MSB) and least significant bitplane (LSB), respectively.
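The bitplane decomposition can be sketched in plain Python as below; `bitplane` is a hypothetical helper name, and the bit-shift form is equivalent to the floor-division-and-modulo formulation in the text.

```python
def bitplane(I, k):
    """Extract the kth bitplane (k = 8 is the MSB, k = 1 the LSB)
    of an 8-bit gray image given as a 2-D list of pixel values.
    (p >> (k-1)) & 1 equals floor(p / 2**(k-1)) mod 2."""
    return [[(p >> (k - 1)) & 1 for p in row] for row in I]

I = [[200, 13], [129, 64]]
B8 = bitplane(I, 8)   # MSB: 1 exactly for pixels >= 128
B1 = bitplane(I, 1)   # LSB: pixel parity
```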
Next, generate a pseudorandom bit sequence of length m × n, say K_k = {K_k(i) | K_k(i) ∈ {0, 1}, i = 1, …, m × n}, via the kth secret key KEY_c + 2^k, where KEY_c is a one-time-pad initial secret key. Afterwards, encrypt each bitplane with K_k as

Y_k(i) = B_k(i) ⊕ K_k(i), i = 1, …, m × n, (3)

where B_k(i) denotes the ith bit of the kth bitplane of image I, scanned as a 1-D sequence, and ⊕ is the exclusive-or operation.
After the bitplane-by-bitplane encryption, all encrypted bitplanes are merged to yield the stream-ciphered gray image Y, i.e.,

Y(x, y) = ∑_{k=1}^{8} Y_k(x, y) × 2^(k−1), (4)

where Y_k represents the kth bitplane of image Y. The encrypted Y is sent through a public channel to the cloud side, and the secret key KEY_c is transmitted via a secure channel to the receiver.
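A minimal sketch of the bitplane-wise stream encryption follows. Python's `random.Random` merely stands in for a secure stream cipher (an assumption for illustration), while the per-bitplane key derivation KEY_c + 2^k follows the text; XOR-ing twice with the same keystreams recovers the image.

```python
import random

def keystream(key, length):
    """Pseudorandom bit sequence K_k from a per-bitplane key.
    NOTE: Mersenne Twister is NOT cryptographically secure; it only
    stands in for a real stream cipher in this sketch."""
    rng = random.Random(key)
    return [rng.getrandbits(1) for _ in range(length)]

def encrypt_image(I, key_c):
    """XOR each of the eight bitplanes with its own keystream and
    re-merge the encrypted planes into a gray image."""
    m, n = len(I), len(I[0])
    Y = [[0] * n for _ in range(m)]
    for k in range(1, 9):
        K = keystream(key_c + 2 ** k, m * n)
        for y in range(m):
            for x in range(n):
                b = (I[y][x] >> (k - 1)) & 1
                Y[y][x] |= (b ^ K[y * n + x]) << (k - 1)
    return Y

I = [[200, 13], [129, 64]]
Y = encrypt_image(I, key_c=12345)
# XOR with the same keystreams inverts the encryption:
assert encrypt_image(Y, key_c=12345) == I
```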

Compression via Heuristic Optimization of Bitplane Allocation
To save storage space and communication bandwidth, the cloud side chooses to compress the encrypted gray image Y on the condition of no decryption key. As the encryption has masked the image content, the cloud side without access to the decryption key cannot exploit the statistical characteristics of the original unencrypted image to conduct compression. As introduced in Section 2, to achieve feasible compression, the LDPC syndrome can be used for the lossless compression of a stream ciphered binary image, and the compressive sensing matrix, scalar quantizer, or uniform downsampling can be applied for the lossy compression of a stream ciphered binary or gray image.
As uniform downsampling generally achieves relatively large compression ratios at tolerable distortions and is rather suitable for a cloud side without the decryption key to compress images encrypted by random permutation, stream ciphering, and other cryptographic techniques, it is adopted in our work as the compression method. To facilitate CAI at the receiver, we employ uniform downsampling with a scaling factor of two. In particular, for a stream-ciphered image Y, we divide it into four sub-images, namely B_00, B_01, B_10, and B_11, according to Figure 5, i.e.,

B_uv(x, y) = Y(2x − 1 + u, 2y − 1 + v), u, v ∈ {0, 1}. (5)

Without loss of generality, we take B_00 as the uniformly downsampled result. This leads to an initial compression ratio of R = 0.25. If the target compression ratio R_0 is larger than 0.25, then additional bitplanes need to be chosen from B_01, B_10, and B_11 and sent to the receiver. If R_0 is less than 0.25, then some bitplanes have to be discarded from B_00. This gives rise to a key problem, i.e., how to select suitable bitplanes from the sub-images so that the reconstruction performance in terms of PSNR is maximized under a given compression ratio R_0.
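The division into four sub-images can be sketched as follows, assuming even image dimensions and the interleaved sampling pattern of Figure 5 (the exact row/column convention is an assumption):

```python
def split_subimages(Y):
    """Divide an image into the four sub-images B00, B01, B10, B11 by
    uniform 2x downsampling: B_uv keeps the pixels whose row index is
    congruent to u and column index to v (mod 2, zero-based)."""
    subs = {}
    for u in (0, 1):
        for v in (0, 1):
            subs[(u, v)] = [row[v::2] for row in Y[u::2]]
    return subs

Y = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
subs = split_subimages(Y)
# B00 holds every second pixel in both directions:
assert subs[(0, 0)] == [[1, 3], [9, 11]]
```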
This problem is essentially a rate-distortion optimization problem. As the statistical characteristics of the original image generally cannot be exploited at the cloud side, this rate-distortion optimization cannot be directly implemented there. Alternatively, the cloud side can design a preferable bitplane selection method with which the receiver can fully exploit the selected bitplanes to recover the discarded parts. As a result, the reconstruction quality may be maximized at a given compression ratio.
For a given R 0 larger than 0.25, it is clear that the more important the bitplane is for reconstruction, the higher the priority in bitplane selection should be. As MSB is the most important bitplane in image reconstruction and LSB is the least important one, we choose to optimize bitplane allocation in a heuristic way. Specifically, we conduct the bitplane selection as follows.
For R_0 larger than 0.25, bitplanes are appended in order of importance. The MSB of B_11 is selected first, and the compression ratio is updated as R = 0.25 + 1/32. If R > R_0 holds, then (R − R_0) × 8 × m × n bits are randomly discarded from the MSB of B_11, where the discarding locations can be generated through a secret key KEY_s, and the selection terminates. Otherwise, the MSBs of B_01 and B_10, then the second MSBs, the third MSBs, and so on are selected in the same manner until R_0 is met. For a given R_0 smaller than 0.25, bitplanes in B_00 have to be discarded instead. In this situation, the LSB of B_00 is first removed, and the compression ratio is updated as R = 0.25 − 1/32 = 0.21875. If R < R_0 holds, then (R_0 − R) × 8 × m × n bits of the removed LSB are randomly retained. Otherwise, continue to remove the second LSB, the third one, and so on until R_0 is achieved.
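The heuristic selection order can be sketched as below. The sketch ignores LDPC coding and the random discarding of partial bitplanes, and simply counts each kept bitplane of a sub-image as 1/32 of the original 8 × m × n bits; the sub-image order B_11, B_01, B_10 follows the compression steps described later.

```python
def allocate_bitplanes(R0):
    """Heuristic allocation sketch: the reserved sub-image B00 comes
    first, then the MSBs of the other sub-images, then their second
    MSBs, and so on. Returns the ordered list of (sub_image, k)
    selections, where k = 8 is the MSB."""
    order = [('B00', k) for k in range(8, 0, -1)]   # reserved sub-image
    for k in range(8, 0, -1):                       # then planes of the others
        for s in ('B11', 'B01', 'B10'):
            order.append((s, k))
    budget = int(round(R0 * 32))                    # whole bitplanes affordable
    return order[:budget]

# R0 = 0.25 keeps exactly the reserved sub-image; three more 1/32
# increments admit the MSBs of B11, B01, and B10.
sel = allocate_bitplanes(0.25 + 3 / 32)
```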
Although this heuristic optimization of bitplane allocation can approximately maximize the PSNR at a given compression ratio, we may further improve the compression efficiency by exploiting the statistical correlations that exist within a bitplane and between adjacent bitplanes, where the bitplanes in question belong to the first three MSBs. Specifically, by incorporating uniform downsampling and statistical correlation exploitation, we conduct the compression of an encrypted gray image as follows.
(1) Divide the stream ciphered gray image Y into four sub-images B 00 , B 01 , B 10 , and B 11 via the method in Figure 5.
(2) If the MSB of B_00 is selected, compress it via the LDPC coding as

S_00^8 = H B_00^{1D}, (6)

where H is a parity-check matrix, B_00^{1D} is the one-dimensional (1-D) vector converted from the 2-D MSB of B_00, and S_00^8 is the syndrome used as the compressed sequence for this bitplane. If the second and third MSBs of B_00 are selected, they are compressed similarly, yielding syndromes S_00^7 and S_00^6, respectively. As there exist only weak statistical correlations within and between the other five bitplanes, these bitplanes are not compressed via the LDPC coding and are directly sent to the receiver. For notational convenience, these five bitplanes are also denoted S_00^k (k = 5, …, 1).
(3) Compute the compression ratio, R_acc, for sub-image B_00 as

R_acc = ∑_{k=1}^{8} length(S_00^k) / (8 × m × n), (7)

where length(·) denotes a function counting the number of bits. If the given compression ratio R_0 is smaller than R_acc, then determine the minimum k via the following optimization, i.e.,

argmin_{k∈[1,8]} k, s.t. ∑_{i=k}^{8} length(S_00^i) / (8 × m × n) ≤ R_0. (8)

Next, transmit the syndrome sequences S_00^i (i = k, …, 8) to the receiver and terminate the compression process. Otherwise, send the syndrome sequences S_00^k (k = 1, …, 8) to the receiver and go to the next step.
(4) Compress the MSB of B_11 by the LDPC coding and generate syndrome S_11^8. If R_acc = R_acc + length(S_11^8)/(8 × m × n) > R_0 holds, then further send S_11^8 to the receiver and end the compression. Otherwise, go to the next step.
(5) Impose the LDPC coding on the MSB of B_01 and yield syndrome S_01^8. If R_acc = R_acc + length(S_01^8)/(8 × m × n) > R_0 holds, then transmit S_01^8 to the receiver and complete the compression. Otherwise, go to the next step.
(6) Apply the LDPC coding to the MSB of B_10 and yield syndrome S_10^8. If R_acc = R_acc + length(S_10^8)/(8 × m × n) > R_0 holds, then send S_10^8 to the receiver and finish the compression. Otherwise, go to the next step.
(7) Similar to Steps 4-6, condense the second and third MSBs of B_11, B_01, and B_10 successively until the given compression ratio R_0 is approximately satisfied.
(8) Similar to the uncoded bitplanes S_00^k (k = 1, …, 5) of B_00, the other five bitplanes of B_11, B_01, and B_10 are not LDPC-coded and are denoted S_11^k, S_01^k, and S_10^k, respectively, for notational convenience. If R_0 is sufficiently large, S_11^k, S_01^k, and S_10^k are chosen sequentially from k = 5 to k = 1 until R_0 is nearly achieved.
It is noted that this compression procedure leads to a compression ratio approximating, but not exactly equal to, R_0, so as to trade off the complexity of bitplane selection against an exact compression ratio. If an exact compression ratio is required, optimal bit selection is needed, which requires much more investigation and is thus left as an interesting topic for future work.
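The LDPC syndrome compression amounts to a binary matrix-vector product S = H·b (mod 2). The toy sketch below illustrates this with a small, hand-made parity-check matrix (purely illustrative, not a real length-4096 LDPC code):

```python
def syndrome(H, b):
    """Compressed representation S = H b (mod 2): each syndrome bit is
    the parity of the source bits selected by one row of the
    parity-check matrix H."""
    return [sum(h * x for h, x in zip(row, b)) % 2 for row in H]

# Toy example: 6 source bits compressed to 3 syndrome bits (ratio 0.5).
H = [[1, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 1],
     [1, 0, 0, 1, 1, 1]]
b = [1, 0, 1, 1, 0, 0]
S = syndrome(H, b)
```

The receiver cannot invert this map alone; it recovers b by combining the syndrome with the keystream and the MRF prior in the joint decoding.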
After the compression is completed, the selected syndrome sequence, say S = {S_00^8, S_00^7, …}, is transmitted through a public channel to the receiver, while the secret key KEY_s is sent via a secure channel. It is worth pointing out that KEY_s can be used to generate all coordinates of randomly discarded or supplemented bits by deriving sub-keys such as KEY_s + 2^s + 2^k, where s = 0, 1, 2, 3 denotes the label of a sub-image and k = 1, …, 8 stands for the bitplane index.
In addition, to facilitate LDPC decoding at the receiver, we need to tell the receiver the number of sent bitplanes in each sub-image and the number of syndrome bits for each of the three MSBs of each sub-image. As each sub-image contains eight bitplanes, three bits are enough to represent the bitplane number. As the size of each bitplane of a sub-image is m × n/4, a total of len = ⌈log2(m × n/4)⌉ bits is sufficient to record the number of syndrome bits of each bitplane of a sub-image. Therefore, by incorporating this auxiliary information, the final compression ratio R is calculated as

R = (length(S) + 4 × 3 + t × len) / (8 × m × n), (9)

where t (0 ≤ t ≤ 12) is the number of LDPC-coded MSBs of all four sub-images. Moreover, as bitplanes other than the three MSBs are not compressed, their sizes are fixed to m × n/4 and thus do not need to be recorded.
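Under the auxiliary-information accounting described above, the final compression ratio can be sketched as follows; the exact bookkeeping (3 bitplane-count bits per sub-image plus len bits per recorded syndrome length) is an assumption based on the text.

```python
import math

def final_ratio(syndrome_lengths, uncoded_planes, m, n, t):
    """Final compression ratio sketch: length(S) covers the LDPC
    syndromes plus the raw (uncoded) bitplanes of m*n/4 bits each;
    the auxiliary overhead is 4 sub-images x 3 bits plus t syndrome
    lengths of len = ceil(log2(m*n/4)) bits each."""
    length_S = sum(syndrome_lengths) + uncoded_planes * (m * n // 4)
    aux = 4 * 3 + t * math.ceil(math.log2(m * n / 4))
    return (length_S + aux) / (8 * m * n)

# Hypothetical 512x512 case: three coded MSBs of one sub-image plus
# its five raw bitplanes.
R = final_ratio([20000, 30000, 40000], uncoded_planes=5, m=512, n=512, t=3)
```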

Reconstruction via MRF Exploitation and Interpolation
After receiving the compressed encrypted sequence S, the auxiliary information indicating the transmitted bitplane number and the LDPC-coded sequence length, and secret keys KEY c and KEY s , the receiver performs decompression and decryption to reconstruct the original image in a lossy way.
First of all, from the received auxiliary information, the receiver determines the number of received bitplanes of each sub-image and the lengths of the LDPC-coded bitplanes. With these parameters, the syndrome sequence of each LDPC-coded bitplane is extracted from S, and the corresponding bitplane is then recovered in a lossless way via the joint LDPC decoding, decryption, and MRF exploitation (see also Section 2.2.2) [9]. Similarly, the non-compressed bitplanes (i.e., the first to fifth LSBs) are extracted from S and recovered accordingly. The remaining bitplanes, which are not sent from the cloud side, are all set to zero. After the bitplanes have been recovered, they are merged to yield the four reconstructed sub-images B_00, B_01, B_10, and B_11.
As there exist clear statistical correlations between pixels in a neighborhood, we further employ CAI [29] to reconstruct the missing parts. In more detail, we use B_00 to interpolate B_11, B_01, and B_10 successively. Figure 6 illustrates the interpolation in our work. Specifically, at the first stage, we deploy B_00 to interpolate B_11. Suppose that t = [t_1, t_2, t_3, t_4] are the four neighboring pixels in the diagonal directions of pixel B_11(x, y), where the t_i belong to B_00 whereas B_11(x, y) is located in B_11. Then, we use t to predict B_11(x, y) as

v_0 = mean(t) if max(t) − min(t) < 20, and v_0 = median(t) otherwise, (10)

where mean(·), max(·), min(·), and median(·) are functions generating the mean, maximum, minimum, and median values of a given vector. We proceed to improve the prediction accuracy for B_11(x, y). Assume that N (N > 0) MSBs of B_11 are recovered in a lossless way from the received sequence S. Then, they can be exploited to improve the prediction result v_0. In more detail, first obtain the first N MSB bits of the pixel magnitude B_11(x, y) and of v_0, and then convert them to decimal values d and d_0, respectively. If d and d_0 differ, then the prediction value v_0 deviates from the original value significantly.
To tackle this issue, another two prediction values are generated as

v_1 = (sum(t) − max(t))/3, v_2 = (sum(t) − min(t))/3, (11)

where sum(·) sums up all elements of a given vector. The N MSB bits of v_1 and v_2 are also extracted to yield the corresponding decimal values d_1 and d_2, respectively. The value d_j (j = 0, 1, 2) leading to the minimum difference from d is taken as the optimum one, d_opt, i.e.,

d_opt = argmin_{d_j, j=0,1,2} |d_j − d|. (12)

The prediction value corresponding to d_opt, namely v_opt, is thus considered the practically optimal interpolation value.
Via d_opt, the difference with respect to d_0 is calculated as γ_opt = d_opt − d_0. Thus, the final interpolation result for B_11(x, y) is obtained as

B̃_11(x, y) = mod(v_0 + γ_opt × ∆, 2^8), (13)

where ∆ = 2^(8−N) and mod(·, ·) is a modulo function. After generating B̃_11, we then use it as well as B_00 to interpolate B_01 and B_10 successively, which is rather similar to the interpolation of B_11, as shown in Figure 6.
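The CAI prediction and the MSB-based correction can be sketched as follows. The threshold of 20 comes from the text, while the helper names and the exact correction arithmetic (shifting the prediction so that its N leading bits match the losslessly recovered ones) are illustrative assumptions.

```python
import statistics

def cai_predict(t, threshold=20):
    """Basic CAI predictor from the four diagonal neighbors: mean in
    smooth regions, median near edges."""
    if max(t) - min(t) < threshold:
        return sum(t) / len(t)
    return statistics.median(t)

def refine(v0, true_msbs_decimal, N):
    """Sketch of the MSB-based correction: adjust the prediction so its
    first N MSBs agree with the recovered ones (Delta = 2^(8-N))."""
    delta = 2 ** (8 - N)
    d0 = int(v0) // delta                  # N-MSB decimal value of the prediction
    gamma = true_msbs_decimal - d0         # correction in units of Delta
    return (int(v0) + gamma * delta) % 256

v_smooth = cai_predict([100, 102, 101, 99])   # smooth region: mean
v_edge = cai_predict([100, 200, 198, 199])    # edge: median resists the outlier
```

Note how the median keeps the edge prediction near the dominant bright side (198.5) instead of the blurred mean (174.25), which is the point of content adaptivity.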
When sub-images B_11, B_01, and B_10 have been reconstructed, they are merged with B_00 to yield the reconstructed gray image Ĩ, which is an inverse process of the sub-image extraction illustrated in Figure 5.

Experiments and Analysis
In this section, we assess the proposed scheme for the lossy compression and reconstruction of gray images. We first give suitable parameters for the proposed scheme; then, we evaluate the visual quality of the reconstructed images and, finally, examine the rate-PSNR (peak signal-to-noise ratio) performance. Below are details of the experimental simulations and results.

Experimental Settings
The proposed scheme involves an MRF in bitplane reconstruction. As in [8], the discontinuity-adaptive model [30] is applied in our scheme. It includes three model parameters, i.e., the sharpness-controlling parameter δ, the temperature constant T, and the source prior P. As the degrees of statistical correlation within a bitplane and between adjacent bitplanes differ, different model parameters are set. Specifically, for the MRF within a bitplane, we deploy the practically optimal model parameters explored in [8], i.e., δ = 45, T = 0.00049, and P = 0.5. For the MRF between adjacent bitplanes, we use the practically optimal model parameters set in [9], i.e., δ = 45 and P = 0.5, with T = 0.005 for the MRF between the 8th and 7th MSBs and T = 0.05 for that between the 7th and 6th MSBs.
In compressing the first three MSBs, the LDPC code is adopted with a code length of 4096. The compression rate of the LDPC code is taken from the range [0.03, 0.95] with step 0.025. A binary search is applied over this rate set, and the minimum compression rate leading to successful compression and lossless reconstruction is taken as the optimal compression rate. It is noted that different MSBs generally have different optimal compression rates.
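The rate search can be sketched as a binary search over the discrete grid, assuming decodability is monotone in the rate (a higher rate makes lossless decoding easier); `decodes_ok` is a hypothetical stand-in for an actual LDPC encode/decode trial.

```python
def min_successful_rate(decodes_ok, lo=0.03, hi=0.95, step=0.025):
    """Binary search over the discrete rate grid for the smallest LDPC
    compression rate at which lossless decoding still succeeds."""
    rates = []
    r = lo
    while r <= hi + 1e-9:
        rates.append(round(r, 3))
        r += step
    left, right, best = 0, len(rates) - 1, None
    while left <= right:
        mid = (left + right) // 2
        if decodes_ok(rates[mid]):
            best = rates[mid]
            right = mid - 1        # try an even smaller (better) rate
        else:
            left = mid + 1
    return best

# Hypothetical bitplane that needs at least rate 0.4 to decode losslessly:
best = min_successful_rate(lambda r: r >= 0.4)
```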
In the simulation, we test eight 512 × 512 gray images with different textures and edges, namely Baboon, Barb, Boat, Hill, Lena, Man, Peppers, and Tank, as illustrated in Figure 7. Each test image is encrypted, compressed, and reconstructed via the proposed scheme in Section 3, in which each to-be-LDPC-coded bitplane of size 256 × 256 is divided into 16 non-overlapping blocks of size 64 × 64 and each block is compressed via Equation (6). The reconstruction performance is evaluated with the PSNR and the compression ratio (CR), where the CR is calculated via Equation (9).
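For reference, the PSNR used throughout the evaluation can be computed for 8-bit gray images as:

```python
import math

def psnr(I, J, peak=255):
    """PSNR between two gray images given as 2-D lists of equal size:
    10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for ra, rb in zip(I, J) for a, b in zip(ra, rb))
    mse /= len(I) * len(I[0])
    return float('inf') if mse == 0 else 10 * math.log10(peak ** 2 / mse)

I = [[100, 100], [100, 100]]
J = [[100, 100], [100, 105]]   # one pixel off by 5 -> MSE = 6.25
```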

Visual Quality Evaluation of Reconstructed Images
As described in Section 3, by sending different numbers of bitplanes, we achieve different compression ratios and obtain reconstructed images with various PSNRs and thus different visual quality. To evaluate the visual quality of the reconstructed images, we take images "F16" and "Lena" as examples. Figure 8 illustrates the images reconstructed at different compression ratios. It shows that the reconstructed images have acceptable visual quality even at a compression ratio as low as 0.2. Moreover, by comparing Figure 8a with 8b and Figure 8c with 8d, one can find that as the compression ratio increases, more details are preserved during compression and thus better visual quality is achieved for the reconstructed image. This is because the proposed scheme fully exploits the statistical correlations within a bitplane, between adjacent bitplanes, and among neighboring pixels, and thus recovers the parts missing after the compression of the encrypted gray images.

Rate-PSNR Performance Assessment
To further assess the proposed scheme, we examine the rate-PSNR performance and compare it with the state-of-the-art ETC approaches presented in [23][24][25]. For notational convenience, these approaches are denoted KANG, ZHOU, and QIN, respectively. For fair comparison, parameter settings in Section 4.1 are adopted for the proposed scheme, and the optimal parameters given in [23][24][25] are applied for KANG, ZHOU, and QIN, respectively.
In the simulation, the test images in Figure 7 are used for all four compared methods. After each test image is processed with the corresponding encryption, compression, and reconstruction algorithms, the rate-PSNR performance is evaluated. Figure 9 plots the rate-PSNR performance for KANG, ZHOU, QIN, and the proposed scheme. It is found that the proposed scheme generally outperforms KANG, ZHOU, and QIN remarkably. Considering that KANG, ZHOU, and QIN all deploy content-adaptive interpolation (CAI) at the receiver, and ZHOU and QIN further optimize the encrypted image compression through context-adaptive sampling and reconstruction via the inpainting technique, respectively, these simulation results well demonstrate the feasibility and effectiveness of the proposed scheme.
In addition, when compared to KANG, the proposed scheme achieves a significant improvement. Although both of them apply the same CAI method, the proposed scheme conducts the heuristic optimization of bitplane allocation, as well as the exploitation of the statistical correlations within a bitplane and between adjacent bitplanes, and consequently improves the reconstruction performance at the same compression ratio.

Conclusions
In this paper, we present a lossy compression scheme for encrypted gray images through heuristic optimization of bitplane allocation. For a stream-ciphered gray image, we develop a pixel- and bitplane-level downsampling method to perform the compression. At the pixel level, uniform downsampling is employed, which yields four sub-images of a quarter of the size and may discard part of the sub-images according to the given compression ratio. At the bitplane level, the task of bitplane-selection-based compression is first formulated as an optimization problem that chooses suitable bitplanes to maximize the PSNR of the reconstructed image subject to a given compression ratio, and a heuristic strategy is then developed to approximately solve this optimization problem. For the selected bitplanes among the first three MSBs, we further exploit an LDPC code to compress them. In the reconstruction stage, the bitplanes among the first three MSBs are recovered in a lossless way via the joint LDPC decoding, decryption, and MRF exploitation. They are then merged to form sub-images, and content-adaptive interpolation is deployed to reconstruct the missing bitplanes and thus the discarded pixels. Experimental simulation results show that the proposed scheme outperforms the state-of-the-art ETC methods, indicating its feasibility and effectiveness.
Although the proposed ETC scheme is promising, it could be further improved in future investigation. For example, each sub-image generated in this work may be further downsampled in a multi-scale way, aiming to better exploit the statistical correlations between neighboring pixels to improve the compression efficiency. In addition, as some of the MSBs of the sub-images are sent to the receiver, they could be fully used to improve the accuracy of content-adaptive interpolation. Furthermore, the proposed scheme may be extended to the efficient compression of encrypted color images.