An Enhanced Adaptive Block Truncation Coding with Edge Quantization Scheme †

Abstract: Recently, Mathews and Nair proposed an image compression method, adaptive block truncation coding based on edge quantization (ABTC-EQ). Their approach divides the blocks of an image into two types, edge blocks and non-edge blocks. Unlike previous block truncation coding (BTC)-like schemes, which apply bi-clustering to all blocks, ABTC-EQ adopts tri-clustering for edge blocks. The compression ratio of ABTC-EQ is reduced, but the visual quality of the reconstructed image is significantly improved. However, ABTC-EQ uses 2 bits to represent the index of each of the three clusters in a block, whereas a variable-length code needs only 5/3 bits on average per index. In addition, we make two observations on the quantization levels in a block. The first is that the difference between two quantization values is often smaller than the quantization values themselves. The second is that more clusters may enhance the visual quality of the reconstructed image. Based on variable-length coding and these observations, we design variants of ABTC-EQ that improve the visual quality of the reconstructed image and the compression ratio.


Introduction
Rapid advances in network and information technology have increased the use of digital multimedia services, especially digital images. Storage and transmission are two examples: file compression reduces the space needed to store data and shortens the time needed to send it over the Internet. There are two main types of digital image compression, lossless and lossy. In this paper, we deal with block truncation coding (BTC) and its variants [1][2][3][4], which are lossy compression algorithms. Because of their stable compression rates and low computational cost, BTC-like schemes are widely used in cryptography-related applications, e.g., data hiding [5][6][7][8][9], watermarking [10], secret image sharing, and visual cryptography [11][12][13][14].
The BTC was first proposed by Delp and Mitchell [1]. It is a block-based lossy image compression technique for grayscale or color images, where a quantizer is adopted to reduce the number of gray levels in each block. In absolute moment BTC (AMBTC) [2], the image is divided into non-overlapping (k × k)-sized blocks, where k may be set to 4, 6, 8, and so on, and each block is processed independently. For each block, the mean pixel value x̄ is calculated by
$$\bar{x} = \frac{1}{k^2}\sum_{i=1}^{k^2} x_i, \qquad (1)$$
where x_i denotes the i-th pixel in this block. Each pixel value x_i is compared with the mean value x̄ using Equation (2):
$$b_i = \begin{cases} 1, & x_i \ge \bar{x}, \\ 0, & x_i < \bar{x}. \end{cases} \qquad (2)$$
If x_i is greater than or equal to x̄, b_i becomes 1; otherwise, b_i becomes 0. That is, a bitmap M = [b_i] of the same size as the block, which partitions the pixels into two clusters, is generated.
AMBTC preserves two quantization values per block, the higher mean and the lower mean. Equation (3) describes how these two quantized values are generated in each block:
$$\mu_0 = \left\lfloor \frac{1}{k^2 - t}\sum_{x_i < \bar{x}} x_i \right\rfloor, \qquad \mu_1 = \left\lfloor \frac{1}{t}\sum_{x_i \ge \bar{x}} x_i \right\rfloor. \qquad (3)$$
Here, t denotes the number of "1" bits in the bitmap M, i.e., the number of pixels with x_i ≥ x̄, and ⌊·⌋ is the floor function, which maps a real number x to the greatest integer less than or equal to x. The means µ1 and µ0 are, respectively, the higher and lower means relative to x̄.
Finally, a block of the image is compressed into two quantization levels (µ0, µ1) and a bitmap M, i.e., the trio (µ0, µ1, M). The bitmap M contains the bit-plane that represents the pixels, and the values µ0 and µ1 are used to decode the AMBTC-compressed image. Consider the case k = 4, i.e., (4 × 4) block-wise operation. The sixteen pixels in a block are represented as a trio (µ0, µ1, M) of 8 + 8 + 16 = 32 bits, and thus the compression ratio (CR) is (16 × 8)/32 = 4. For example, a 512 × 512-pixel image of 2 Mbits is reduced to 0.5 Mbits. In the decoding phase, once the two quantization levels and the bitmap are obtained, the corresponding image block is reconstructed by replacing every "1" in the bitmap M with µ1 and every "0" with µ0.
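The AMBTC encoding and decoding steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `ambtc_encode` and `ambtc_decode` and the sample block are hypothetical, and the handling of a flat block (no pixel below the mean) is an assumption that the text does not specify.

```python
from math import floor

def ambtc_encode(block):
    """Encode a flat list of pixel values into the trio (mu0, mu1, bitmap)."""
    mean = sum(block) / len(block)
    # Bitmap: 1 if the pixel is at or above the block mean, else 0.
    bitmap = [1 if x >= mean else 0 for x in block]
    high = [x for x, b in zip(block, bitmap) if b == 1]
    low = [x for x, b in zip(block, bitmap) if b == 0]
    mu1 = floor(sum(high) / len(high))  # the maximum is always >= mean
    # Assumption: in a flat block no pixel is below the mean; reuse mu1.
    mu0 = floor(sum(low) / len(low)) if low else mu1
    return mu0, mu1, bitmap

def ambtc_decode(mu0, mu1, bitmap):
    """Rebuild the block: every 1 becomes mu1, every 0 becomes mu0."""
    return [mu1 if b else mu0 for b in bitmap]

# Hypothetical 4 x 4 block in row-major order (not taken from the paper).
block = [10, 12, 200, 202,
         11, 13, 201, 203,
         10, 12, 200, 202,
         11, 13, 201, 203]

mu0, mu1, bitmap = ambtc_encode(block)
recon = ambtc_decode(mu0, mu1, bitmap)
stored_bits = 8 + 8 + len(bitmap)          # two 8-bit levels plus the bitmap
cr = (len(block) * 8) / stored_bits        # compression ratio for this block
print(mu0, mu1, stored_bits, cr)
```

For this block the trio is (11, 201, M) with 32 stored bits, which gives the CR of 4 quoted in the text for k = 4.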
Because AMBTC provides good image quality and fast computation, most BTC-based data hiding and secret image sharing schemes adopt the AMBTC approach. In [3], the authors proposed MBTC, which uses a max-min quantizer to further enhance the quality of the reconstructed image. In AMBTC, the threshold used to distinguish the two clusters is simply the mean value x̄ of the block. The threshold x_th of MBTC in Equation (4) is the average of the maximum value (x_max), the minimum value (x_min), and the mean value (x̄) of the block:
$$x_{th} = \frac{x_{\max} + x_{\min} + \bar{x}}{3}. \qquad (4)$$
Afterwards, by the same procedure as in AMBTC but using x_th instead of x̄, we obtain a trio for each block in MBTC.
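To make the difference between the two thresholds concrete, the following sketch compares the AMBTC threshold (the block mean) with the MBTC threshold (the average of the maximum, the minimum, and the mean). The helper names and the tiny four-pixel block are hypothetical and chosen only so that the two bitmaps differ.

```python
def mean_threshold(block):
    """AMBTC threshold: the block mean."""
    return sum(block) / len(block)

def mbtc_threshold(block):
    """MBTC threshold: average of the maximum, the minimum, and the mean."""
    return (max(block) + min(block) + mean_threshold(block)) / 3

def bitmap_for(block, threshold):
    """1 for pixels at or above the threshold, 0 otherwise."""
    return [1 if x >= threshold else 0 for x in block]

# Hypothetical flattened block (not taken from the paper).
block = [10, 20, 45, 105]

x_bar = mean_threshold(block)    # 45.0
x_th = mbtc_threshold(block)     # (105 + 10 + 45) / 3
ambtc_bitmap = bitmap_for(block, x_bar)
mbtc_bitmap = bitmap_for(block, x_th)
print(x_bar, x_th, ambtc_bitmap, mbtc_bitmap)
```

Here the pixel of value 45 sits exactly at the mean, so AMBTC places it in the high cluster, while the MBTC threshold of about 53.3 moves it to the low cluster.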

Mathews and Nair's ABTC-EQ
In previous BTC variants, e.g., AMBTC and MBTC, the quantization approach is the same for every block: bi-clustering (two quantization levels). ABTC-EQ is an edge-based block truncation scheme whose quantization depends on edge information. In ABTC-EQ, an edge image is obtained from the input image by the Canny edge detector [15], which consists of four steps: (1) smooth the image with a Gaussian filter to remove noise; (2) find the intensity gradients of the image using finite-difference approximations of the partial derivatives; (3) apply non-maximum suppression to the gradient magnitude; and (4) use double thresholding to determine potential edges.
For a (k × k)-sized block E = [e_i] of the edge map (a bitmap) obtained by this process, we classify the corresponding image block as an edge block or a non-edge block using the following criterion: if all values in the block E are '0', the block is a non-edge block; otherwise, it is an edge block. Note: The image produced by the edge detector is a bitmap whose '1' pixels trace the edges of objects. Using this feature, every block can be classified as an edge block or a non-edge block.
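The classification rule above is simple enough to sketch directly. The code below assumes that a binary edge map is already available (for example, from a Canny detector) as a list of rows and that its dimensions are multiples of k; the function name `classify_blocks` and the sample edge map are hypothetical.

```python
def classify_blocks(edge_map, k):
    """Label each k x k block 'edge' if it contains any 1, else 'non-edge'.

    Assumes the edge map height and width are multiples of k.
    """
    rows, cols = len(edge_map), len(edge_map[0])
    labels = {}
    for r in range(0, rows, k):
        for c in range(0, cols, k):
            has_edge = any(
                edge_map[i][j] == 1
                for i in range(r, r + k)
                for j in range(c, c + k)
            )
            labels[(r // k, c // k)] = "edge" if has_edge else "non-edge"
    return labels

# Hypothetical 4 x 4 edge map split into 2 x 2 blocks.
edge_map = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
labels = classify_blocks(edge_map, k=2)
# Fraction of edge blocks, i.e., the quantity called p_e later in the paper.
p_e = sum(v == "edge" for v in labels.values()) / len(labels)
print(labels, p_e)
```

Two of the four blocks contain an edge pixel, so the fraction of edge blocks in this toy map is 0.5.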
The two types of blocks are quantized differently. For non-edge blocks, we use MBTC. Therefore, a non-edge image block can be represented as a trio (µ0, µ1, M_n), where the notation M_n denotes the bitmap of a non-edge block. For edge blocks, we use a tri-clustering approach. The pixels in a block are classified into three clusters (c0, c1, c2) by the k-means clustering algorithm [16], so that similar pixels are grouped into the same cluster. A bitmap M_e = [b_i] of the edge block is generated by Equation (5):
$$b_i = \begin{cases} 00, & x_i \in c_0, \\ 01, & x_i \in c_1, \\ 10, & x_i \in c_2. \end{cases} \qquad (5)$$
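As an illustration of the tri-clustering step, the sketch below runs a plain one-dimensional k-means on the pixel values of a block. It is not the authors' implementation: the evenly spaced initial centres, the iteration cap, the function name `kmeans_1d`, and the sample pixels are all assumptions, and using the cluster means as the quantization levels is likewise an assumption.

```python
def kmeans_1d(pixels, c=3, iterations=20):
    """Cluster scalar pixel values into c groups (c >= 2) with Lloyd's algorithm."""
    lo, hi = min(pixels), max(pixels)
    # Assumption: initial centres are evenly spaced between min and max.
    centres = [lo + (hi - lo) * j / (c - 1) for j in range(c)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centre.
        labels = [min(range(c), key=lambda j: abs(x - centres[j])) for x in pixels]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(c):
            members = [x for x, lab in zip(pixels, labels) if lab == j]
            new_centres.append(sum(members) / len(members) if members else centres[j])
        if new_centres == centres:  # converged
            break
        centres = new_centres
    return labels, centres

# Hypothetical pixel values (not the block used in the paper).
pixels = [60, 62, 90, 88, 124, 126, 61, 89]
labels, centres = kmeans_1d(pixels, c=3)
print(labels, centres)
```

The labels play the role of the cluster indices stored in the bitmap M_e, and the centres play the role of the quantization levels of the three clusters.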
Bi-clustering and tri-clustering are performed for non-edge blocks and edge blocks, respectively. To discriminate edge blocks from non-edge blocks, an identifier flag f is defined and set to 0 for an edge block and to 1 for a non-edge block. Finally, the k² pixels of a block are represented as (f = 1, µ0, µ1, M_n) or (f = 0, µ0, µ1, µ2, M_e). Therefore, the CR of ABTC-EQ is dynamic rather than static as in previous BTC schemes.

Design Concept
The advantage of ABTC-EQ [4] is that it improves the quality of the reconstructed image because it treats edge and non-edge blocks differently. BTC and its earlier variants cannot represent edge blocks well, because they use only two quantization levels, i.e., the trio (µ0, µ1, M), to represent a block. However, while ABTC-EQ enhances the visual quality of the reconstructed image, it also reduces the CR because of the extra flag bit f and the extra quantization value µ2. To enhance BTC-like approaches, we should improve the CR while retaining a high peak signal-to-noise ratio (PSNR). Our study of ABTC-EQ found that the PSNR improvement of the reconstructed image is due to the tri-clustering of edge blocks. The weakness of this approach is that the CR decreases because additional bits are needed to represent an edge block. Some in-depth observations on ABTC-EQ are listed below.
Observation 1. It is not necessary to use two bits to represent a pixel (i.e., '00', '01', and '10') in the bitmap M_e of an edge block.
As shown in Equation (5), ABTC-EQ uses (00), (01), and (10) for the three clusters c0, c1, and c2, respectively. However, we may use a Huffman code, a variable-length code, to represent the three clusters by (0), (10), and (11), with an average codeword length of 5/3 bits. Note: the Huffman code is prefix-free and can therefore be uniquely decoded. With this approach, the size of the bitmap M_e is reduced from 2k² bits to (5/3)k² bits on average. Finally, the CR can be enhanced.
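The codeword assignment in Observation 1 can be checked with a short sketch. It verifies that the code {0, 10, 11} is prefix-free and compares the variable-length bitmap size with the fixed 2-bit size. The cluster indices below are hypothetical, and the helper names are illustrative only.

```python
from itertools import permutations

# Codewords for clusters c0, c1, c2 as given in the text.
CODE = {0: "0", 1: "10", 2: "11"}

def is_prefix_free(code):
    """True if no codeword is a prefix of another (unique decodability)."""
    words = list(code.values())
    return not any(b.startswith(a) for a, b in permutations(words, 2))

def encode_indices(indices, code=CODE):
    """Concatenate the codewords of the cluster indices into a bit string."""
    return "".join(code[i] for i in indices)

# Hypothetical cluster indices of a 4 x 4 edge block.
indices = [0, 1, 2, 0, 0, 2, 1, 0, 0, 1, 2, 0, 1, 2, 0, 0]

bits = encode_indices(indices)
fixed_bits = 2 * len(indices)                     # ABTC-EQ: 2 bits per pixel
avg_codeword_len = sum(len(w) for w in CODE.values()) / len(CODE)
print(is_prefix_free(CODE), len(bits), fixed_bits, avg_codeword_len)
```

The actual saving depends on how often each cluster occurs: with the 1-bit codeword assigned to the most frequent cluster, as in this hypothetical block, the bitmap takes 24 bits instead of 32, whereas 5/3 bits per pixel is the figure for equally frequent clusters.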

Observation 2.
Consider two quantization values µ_i and µ_j, where µ_i < µ_j. The difference (µ_j − µ_i) between the two quantization values is often smaller than the quantization values µ_i and µ_j themselves.
We describe the difference between two quantization values by analogy with a well-known voice coder. Differential pulse code modulation (DPCM) samples an analog signal to obtain pulses and then converts the differences between pulses, rather than the pulses themselves, into binary sequences using a non-uniform coding scale. The same property holds for the quantization levels in ABTC-EQ, i.e., large differences between quantization values do not occur frequently. Thus, we can design quantization ranges tailored to the small differences between quantization values.
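A small sketch illustrates why storing differences can save bits. It converts ascending quantization levels into a first level followed by successive differences and counts the bits each value needs. The levels (61, 89, 125) are the ones quoted in the worked example of this paper; the lossless difference coding shown here is only an illustration of Observation 2, not the radix-based quantization of Schemes B and C.

```python
def to_differences(levels):
    """(mu0, mu1, mu2, ...) -> (mu0, mu1 - mu0, mu2 - mu1, ...)."""
    return [levels[0]] + [b - a for a, b in zip(levels, levels[1:])]

def from_differences(deltas):
    """Invert to_differences with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

levels = [61, 89, 125]                 # quantization levels of the worked example
deltas = to_differences(levels)
# Minimum number of bits needed for each stored value (at least 1).
bits_needed = [max(1, d.bit_length()) for d in deltas]
print(deltas, bits_needed, from_differences(deltas))
```

The differences 28 and 36 fit in 5 and 6 bits, fewer than the 8 bits used for a raw quantization level, which is the slack that the proposed quantization ranges exploit.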
Observation 3. More clusters may enhance the PSNR of the reconstructed images.
In previous BTC-like schemes, all blocks adopt bi-clustering, i.e., two quantization levels per block. Mathews and Nair's ABTC-EQ performs tri-clustering on edge blocks. Because three values µ0, µ1, and µ2 approximate the pixel grayscale values, the mean square error is reduced. We may use more clusters (say, four) to approximate the pixel values more precisely.

The Proposed Schemes
We aim to achieve a high CR and a reasonable PSNR for the reconstructed image. For this purpose, we propose three schemes: (1) Scheme A, motivated by Observation 1; (2) Scheme B, based on Observations 1 and 2; and (3) Scheme C, based on Observations 2 and 3. Compared with Mathews and Nair's ABTC-EQ [4], Scheme A has the same PSNR but a higher CR, whereas Scheme C has the same CR but a higher PSNR. Scheme B raises the CR further than Scheme A while still retaining a reasonable PSNR compared with AMBTC [2] and MBTC [3].
(1) Scheme A: The algorithm is the same as that of ABTC-EQ, except that, for an edge block, the bitmap M_e is composed of the codewords {'0', '10', '11'} as described in Observation 1 (see Equation (7)). We show the compression gain obtained when this variable-length coding is applied to the bitmap; the proof is given by Equations (8) and (9). Finally, we use (f = 0, µ0, µ1, µ2, M_e) for an edge block.
Theorem 1. Suppose that the proportions of edge blocks and non-edge blocks in an image are p_e and p_n, where p_e + p_n = 1, when the image is processed in (k × k)-pixel blocks. The CR of ABTC-EQ is
$$CR_{MN} = \frac{8k^2}{(17 + k^2) + (8 + k^2)\,p_e},$$
and Scheme A has
$$CR_A = \frac{8k^2}{(17 + k^2) + (8 + 2k^2/3)\,p_e},$$
where CR_A > CR_MN. Meanwhile, they have the same PSNR of the reconstructed image, i.e., PSNR_MN = PSNR_A.
Proof. From the compressed data formats of the two schemes and p_n = 1 − p_e, we have
$$CR_{MN} = \frac{8k^2}{(1 + 2\times 8 + k^2)\,p_n + (1 + 3\times 8 + 2k^2)\,p_e} = \frac{8k^2}{(17 + k^2) + (8 + k^2)\,p_e}, \qquad (8)$$
$$CR_A = \frac{8k^2}{(1 + 2\times 8 + k^2)\,p_n + (1 + 3\times 8 + \tfrac{5}{3}k^2)\,p_e} = \frac{8k^2}{(17 + k^2) + (8 + 2k^2/3)\,p_e}. \qquad (9)$$
By Equations (8) and (9), since (8 + 2k²/3) < (8 + k²), we have CR_A > CR_MN whenever p_e > 0. Except for using the variable-length-coded bitmap (denoted M′_e) instead of M_e, Scheme A uses the same approach as ABTC-EQ. Thus, both schemes have the same PSNR, i.e., PSNR_MN = PSNR_A. Notes: In Equations (8) and (9), (1 + 2 × 8 + k²) is the data format of a non-edge block, i.e., "1" is the flag bit, "2 × 8" is two quantization levels of 8 bits each, and "k²" is the k × k × 1-bit bitmap. Likewise, (1 + 3 × 8 + 2 × k²) is the format of an edge block in ABTC-EQ, i.e., "3 × 8" is three quantization levels of 8 bits each and "2 × k²" is the k × k × 2-bit bitmap. The factor 5/3 in Equation (9) reflects that there are three clusters whose codewords have a total length of 5 bits. With this approach, the size of the bitmap of an edge block is reduced from 2k² to (5/3)k² bits on average. This proves that Scheme A enhances the CR.
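The two compression ratios in Theorem 1 are easy to evaluate numerically. The sketch below computes them from the per-block bit counts described in the notes above; the function names `cr_abtc_eq` and `cr_scheme_a` are hypothetical, and the values of p_e are the ones the paper quotes for k = 4 and k = 6.

```python
def cr_abtc_eq(k, p_e):
    """CR of ABTC-EQ for k x k blocks and edge-block proportion p_e."""
    n = k * k
    non_edge_bits = 1 + 2 * 8 + n          # flag, two levels, 1-bit bitmap
    edge_bits = 1 + 3 * 8 + 2 * n          # flag, three levels, 2-bit bitmap
    return 8 * n / ((1 - p_e) * non_edge_bits + p_e * edge_bits)

def cr_scheme_a(k, p_e):
    """CR of Scheme A: edge bitmap costs 5/3 bits per pixel on average."""
    n = k * k
    non_edge_bits = 1 + 2 * 8 + n
    edge_bits = 1 + 3 * 8 + (5 / 3) * n
    return 8 * n / ((1 - p_e) * non_edge_bits + p_e * edge_bits)

for k, p_e in [(4, 0.35), (6, 0.45)]:
    print(k, p_e, round(cr_abtc_eq(k, p_e), 2), round(cr_scheme_a(k, p_e), 2))
```

With k = 4 and p_e = 0.35 this gives 3.09 for ABTC-EQ and 3.24 for Scheme A, and with k = 6 and p_e = 0.45 it gives 4.27 for Scheme A, in line with the values reported in the paper. With p_e = 0 both functions reduce to the same value, since no edge block is present.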
(2) Scheme B: This scheme uses the compressed bitmap M_e = [b_i] derived from Observation 1. Moreover, as explained in Observation 2, new compressed quantization levels (δ0, δ1, δ2) are exploited. The idea is that compression performance can be improved by encoding the differences between quantization levels. To this end, we classify the range of quantization levels into four categories (Equation (10)). That is, for each δ_i, 0 ≤ i ≤ 2, of the tri-cluster, we use n_i bits with a radix R_i: (r_{n_i}, r_{n_i − 1}, ..., r_1) to represent its quantization level. The new quantization levels δ_i, 0 ≤ i ≤ 2, are then determined iteratively based on the radix R_i with the criterion min{µ_i − ∑_{j=0}^{i} δ_j}. Finally, we obtain the new format (f = 0, δ0, δ1, δ2, M_e) for an edge block. This process can greatly reduce the number of bits spent on the quantization levels.

Theorem 2. Suppose that the proportions of edge blocks and non-edge blocks in an image are p_e and p_n, where p_e + p_n = 1, when the image is processed in (k × k)-pixel blocks. Scheme B has the compression ratio CR_B given in Equation (11).
Proof. By the compressed data formats of Scheme B, (f = 1, µ0, µ1, M_n) and (f = 0, δ0, δ1, δ2, M_e), we may derive Equation (11).
Via Equation (11) and the four quantization ranges in Equation (10), we obtain the compression ratios CR_B-I to CR_B-IV for these four quantization ranges. From these and Equation (9), all compression ratios of Scheme B are larger than CR_MN (ABTC-EQ). By Observation 2, we may obtain an approximation of (µ0, µ1, µ2) with tolerable distortion from (δ0, δ1, δ2). Moreover, Scheme B may have a higher PSNR than AMBTC and MBTC.
(3) Scheme C: As the number of clusters in a block increases, the PSNR of the image tends to increase, as noted in Observation 3. In this scheme, four clusters (c0, c1, c2, c3) are introduced for edge blocks, and each pixel of the bitmap M_e uses two bits to index its cluster. We use new quantization levels (δ0, δ1, δ2, δ3) to represent (µ0, µ1, µ2, µ3). Here, the quantization range is defined as in Equation (14), and each value δ_i, 0 ≤ i ≤ 3, is determined iteratively based on the radix R_i with the criterion min{µ_i − ∑_{j=0}^{i} δ_j}. Finally, we use the new format (f = 0, δ0, δ1, δ2, δ3, M_e) for an edge block. Scheme C improves image quality while maintaining the same level of compression as ABTC-EQ; Theorem 3 shows that the compression ratios of Scheme C and ABTC-EQ are the same.
Theorem 3. Suppose that the proportions of edge blocks and non-edge blocks in an image are p_e and p_n, where p_e + p_n = 1, when the image is processed in (k × k)-pixel blocks. Scheme C has
$$CR_C = \frac{8k^2}{(17 + k^2) + (8 + k^2)\,p_e},$$
where CR_C = CR_MN.
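The equality in Theorem 3 follows from a bit count of the edge-block format. The derivation below is a sketch under the figures stated elsewhere in the paper: Scheme C spends 24 bits on its four quantization values and 2 bits per pixel on the cluster index, while non-edge blocks are unchanged. The symbols B_edge^C and B_edge^MN are introduced here only for illustration.

```latex
\begin{align*}
B_{\mathrm{edge}}^{C} &= 1 + 24 + 2k^2 = 1 + 3 \times 8 + 2k^2 = B_{\mathrm{edge}}^{MN}, \\
CR_C &= \frac{8k^2}{(1 + 2 \times 8 + k^2)\,p_n + (1 + 24 + 2k^2)\,p_e}
      = \frac{8k^2}{(17 + k^2) + (8 + k^2)\,p_e} = CR_{MN}.
\end{align*}
```

The second line uses p_n = 1 − p_e, so that (17 + k²)(1 − p_e) + (25 + 2k²) p_e = (17 + k²) + (8 + k²) p_e.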

Examples
An example dealing with a (4 × 4)-pixel image block is given in this subsection to illustrate all proposed schemes: Scheme A, Scheme B-I to Scheme B-IV, and Scheme C. We also show the number of stored bits for this block and the average mean square error (AMSE) of the block for all proposed schemes as well as for AMBTC, MBTC, and ABTC-EQ.
Suppose that a (4 × 4)-pixel image block is given. By using AMBTC [2], this block can be represented as a compressed trio (77, 123, 1010111011000100), which requires 32 bits for this block, and the AMSE is 167.56. The trio is (74, 120, 1010111011001100) when using MBTC, which differs in the clustering of the thirteenth pixel of this block, and its AMSE is slightly reduced to 160.44. According to the definition of an edge block, this image block is an edge block. We then obtain three clusters from the 16 pixels of the block via the k-means clustering algorithm and assign (0), (10), and (11) to the clusters c0, c1, and c2, respectively. Via Equation (7), we determine µ0 = 61, µ1 = 89, and µ2 = 125. Therefore, the compressed data of Scheme A is (0, 61, 89, 125, 11101101111111011110101010010) of 54 bits. Moreover, the AMSE is 77.81. Mathews and Nair's ABTC-EQ uses (f = 0, µ0, µ1, µ2, M_e) with the same (µ0, µ1, µ2), and thus it has the same AMSE. However, it uses a bitmap M_e of 32 bits and requires a total of 57 bits to store this block. Therefore, Scheme A saves 3 bits for this block compared with ABTC-EQ.
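The bit counts in this example can be reproduced by decoding the 29-bit bitmap quoted above with the codewords (0), (10), and (11). The sketch below is illustrative only; the function name `decode_prefix_code` is hypothetical, and since the original pixel block is not reproduced here, only the cluster indices and the bit accounting are checked.

```python
def decode_prefix_code(bits):
    """Decode a bit string written with the codewords 0 -> c0, 10 -> c1, 11 -> c2."""
    symbols, i = [], 0
    while i < len(bits):
        if bits[i] == "0":
            symbols.append(0)
            i += 1
        else:
            symbols.append(1 if bits[i + 1] == "0" else 2)
            i += 2
    return symbols

bitmap = "11101101111111011110101010010"    # Scheme A bitmap from the example
levels = (61, 89, 125)                       # (mu0, mu1, mu2) from the example

indices = decode_prefix_code(bitmap)
reconstructed = [levels[i] for i in indices]     # decoded 4 x 4 block, row-major

scheme_a_bits = 1 + 3 * 8 + len(bitmap)      # flag + three levels + bitmap
abtc_eq_bits = 1 + 3 * 8 + 2 * 16            # flag + three levels + 2 bits/pixel
print(indices)
print(scheme_a_bits, abtc_eq_bits, abtc_eq_bits - scheme_a_bits)
```

The bitmap decodes to exactly 16 cluster indices, of which three belong to c0 and use the 1-bit codeword, which is why the bitmap takes 3 + 13 × 2 = 29 bits and the block takes 54 bits instead of 57.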
The AMBTC has the worst AMSE = 167.56, but it needs the fewest bits (32 bits) to represent a block. The above three examples show that the AMSE = 48.13 of Scheme C is much lower than those of the other schemes (note: this significant improvement comes from the quad-clustering approach), while its number of required bits is the same as that of ABTC-EQ. Compared with ABTC-EQ, Scheme A has the same AMSE but needs fewer bits per block. Scheme B can trade AMSE for the number of required bits. For example, Scheme B-IV needs only 42 bits for a block, while its AMSE = 82.19 is still far less than the AMSE = 167.56 of AMBTC.

Experimental Results
Five test images, Lena, Butterfly, Cameraman, Lake, and Peppers, are used to evaluate all BTC-like schemes: AMBTC, MBTC, ABTC-EQ, and the proposed schemes (Scheme A, Scheme B, and Scheme C). So that the images divide evenly into (k × k) blocks for k = 4, 6, and 8, all test images have a size of 504 × 504 pixels. The evaluation metrics PSNR, CR, structural similarity (SSIM) index, and feature similarity (FSIM) index are used to compare the performance of all these schemes. Table 1 compares all BTC-like schemes. For the test image Lena, consider processing (4 × 4) blocks with all schemes.
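For reference, PSNR can be computed from the mean square error between the original and reconstructed pixels. The sketch below uses the standard definition for 8-bit images with a peak value of 255; the paper does not restate its formula, so this definition is an assumption, and the function names and sample pixels are hypothetical.

```python
from math import log10

def mse(original, reconstructed):
    """Mean square error between two equally long pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB (infinite for identical inputs)."""
    err = mse(original, reconstructed)
    return float("inf") if err == 0 else 10 * log10(peak ** 2 / err)

# Hypothetical pixels, flattened (not taken from the test images).
original = [100, 100, 100, 100]
reconstructed = [98, 102, 100, 100]

block_mse = mse(original, reconstructed)     # (4 + 4 + 0 + 0) / 4
block_psnr = psnr(original, reconstructed)
print(block_mse, round(block_psnr, 2))
```

A lower mean square error gives a higher PSNR, which is why the schemes with more clusters per edge block score better on this metric.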
Scheme C adopts quad-clustering and uses 24 bits to represent four quantization values by encoding their differences. Therefore, Scheme C has the best visual quality (PSNR = 39.62 dB) while having the same CR = 3.09 as ABTC-EQ. Although AMBTC and MBTC have high CRs, they have poor PSNRs because they use only the bi-clustering approach. The PSNR = 33.87 dB of MBTC is slightly greater than the PSNR = 33.42 dB of AMBTC. This slight enhancement comes from the more precise threshold value used by MBTC.
Scheme A uses tri-clustering and the same quantization values as ABTC-EQ, and thus Scheme A and ABTC-EQ have the same PSNR = 37.49 dB. Because it uses a variable-length code to record the cluster index, Scheme A has a higher CR = 3.24 than the CR = 3.09 of ABTC-EQ. Scheme B, in turn, can trade PSNR for CR by using different quantization ranges. Scheme B-I has PSNR = 37.47 dB, almost the same as the PSNR = 37.49 dB of ABTC-EQ, and a higher CR than Scheme A. If we want a high CR while retaining a moderate PSNR, we may choose Scheme B-IV, which has CR = 3.62 and PSNR = 35.96 dB. Moreover, the SSIM and FSIM values are consistent with the PSNR results. For simplicity, we only show the reconstructed images for Lena. The original image is given in Figure 1a, and the reconstructed images from AMBTC, MBTC, ABTC-EQ, Scheme A, Scheme B-I, Scheme B-II, Scheme B-III, Scheme B-IV, and Scheme C using (4 × 4) blocks are illustrated in Figure 1b-j, respectively. Scheme C (Figure 1j) has the best PSNR, 39.62 dB. ABTC-EQ and the proposed schemes treat edge blocks separately and thus may perform better near edges. The edge images of the original image Cameraman and of the reconstructed images from AMBTC, MBTC, ABTC-EQ, and Scheme C are shown in Figure 2. The non-edge-based schemes (AMBTC and MBTC) do not retain the details of the selected region, as shown in the dashed circle, whereas both Scheme C and ABTC-EQ retain more detail. Moreover, Scheme C shows more edges in the circled area than ABTC-EQ. Scheme C thus improves the visual quality near edges, and its edge image is very similar to that of the original image.

Discussion
We further discuss four issues in depth: (i) the visual quality of the reconstructed image, i.e., PSNR; (ii) the compression ratio, i.e., CR; (iii) time complexity; and (iv) an appropriate way of using Scheme A, Scheme B, and Scheme C in applications.
Therefore, compared with the original values of the four clusters (c0, c1, c2, c3), our recovered values have very small distortion. In Example 3, the original values are (µ0, µ1, µ2, µ3) = (61, 83, 115, 139), and the recovered values are (60, 82, 115, 139), almost the same as the original ones. In this case, we still use 24 bits to represent four quantization values, the same number of bits that ABTC-EQ uses to represent three quantization values.

(2) Compression Ratios (CRs):
Here, we discuss the CR improvement of the three modified ABTC-EQ schemes, Scheme A, Scheme B, and Scheme C. Except for Scheme C (which uses two bits to index four clusters), the other two schemes enhance the CR. The CR is a key property of any compression technique: a higher CR means that the same image is stored in fewer bits. We therefore gave a theoretical analysis of the estimated CRs in Theorems 1, 2, and 3. In addition, to check the accuracy of the theorems, we compare the simulation results with the estimated CRs derived from the theorems. Table 2 lists the estimated compression ratios of the proposed schemes for k ∈ {4, 6, 8} and the given values of p_e.
Moreover, the average CRs over the five test images (Lena, Butterfly, Cameraman, Lake, Peppers) are listed. Consider the CRs of Scheme A. For k = 4, the average experimental CR is 3.23, close to the theoretical CR = 3.24 (p_e = 0.35). The average experimental CR is 4.27 (respectively, 4.78) for k = 6 (respectively, k = 8), which is close to the theoretical CR = 4.27 (p_e = 0.45) (respectively, CR = 4.78 (p_e = 0.50)). This result is consistent with the increase of p_e for larger k. Recall that if at least one edge value e_i, 1 ≤ i ≤ k², in E is "1" (not necessarily all of them), the image block is defined as an edge block. A larger block is more likely to contain at least one edge pixel, and thus the proportion of edge blocks p_e increases with k.
Table 2. Estimated CRs of all proposed schemes with k = 4, 6 and 8 for 0.05 ≤ p_e ≤ 0.6.


(3) Time Complexity:
Time complexity is an important criterion when evaluating compression algorithms, and the proposed schemes satisfy this criterion well. The only difference between Scheme A and the original ABTC-EQ is that Scheme A adopts a Huffman code to represent the three clusters, namely (0), (10), and (11), whereas the original ABTC-EQ uses (00), (01), and (10). This is a very simple, fixed Huffman code: it is only used to represent the indices of clusters c0, c1, and c2, so no separate Huffman encoding or decoding procedure is needed in Scheme A. The codewords (0), (10), and (11) merely guarantee a unique representation of the cluster index.
Scheme B uses the same approach as Scheme A, i.e., it represents clusters using the Huffman code, and combines this with various quantization ranges. The quantized values are used as the representative pixels (8 bits) of each block when the image is decompressed. Taking Scheme B-I as an example, the quantization representation is the same as a binary representation.
The original ABTC-EQ and Scheme C use c = 3 and c = 4 clusters, respectively. The proposed Scheme C uses four clusters and the quantization ranges in Equation (14). As described above, the quantization step has almost the same execution time as in the original ABTC-EQ, but the clustering time in every block is slightly greater for Scheme C than for ABTC-EQ because of the additional cluster (note: the time complexity of clustering is O(t × c × k²)).

(4) Appropriate Way Using Our Schemes:
Our goal is to help users choose among the three proposed ABTC-EQ variants according to their application scenarios. To describe these scenarios clearly, we use a radar chart (Figure 3) to illustrate several performance measures (the number of clusters, the PSNR of the recovered image, and the size of the compressed file) and their variation among all the schemes (Scheme A, Scheme B-I to B-IV, and Scheme C). The values of PSNR and CR are taken from Table 1 for the test image Lena and a block size of 4 × 4 pixels. From Figure 3, we draw the following conclusions about how to use the schemes according to the application. Because Scheme A has the same PSNR as ABTC-EQ with a higher CR, we should choose Scheme A if we want a good PSNR along with a good CR. Scheme C has a very high PSNR. Therefore, if we want a significant improvement in PSNR, we may choose Scheme C, which still has the same CR as ABTC-EQ. For applications that require a high CR, we may use Scheme B to trade PSNR for CR.

Conclusions
In this paper, we propose methods to improve the compression performance of ABTC-EQ, a scheme that overcomes the edge loss problem of existing BTC-like image compression schemes. The tri-clustering approach is preferable to the bi-clustering approach in terms of image quality, but it increases the file size. By introducing variable-length coding, we obtain schemes that balance compression and image quality. We also provide a theoretical analysis of the compression ratios of the proposed schemes. Compared with ABTC-EQ, Scheme A enhances the CR without changing the PSNR, while Scheme C enhances the PSNR without reducing the CR. Scheme B trades PSNR for CR. Based on these properties, we show how to choose among these schemes for intended applications. Moreover, experimental results illustrate the effectiveness and advantages of the proposed schemes.