Chimera: A New Efficient Transform for High Quality Lossy Image Compression

A novel image compression scheme is presented that uses a compatible form called Chimera. This form represents a new transformation of the image pixels. Compression methods generally divide an image into small parts called blocks. These blocks contain a limited set of predictable patterns, such as flat areas, simple slopes, and single edges. The content of these blocks represents a special form of data which can be reformed using simple masks to obtain a compressed representation. The compressed representation differs according to the type of transform function, which acts as the preprocessing operation prior to the coding step. The cost of any image transformation is represented by two main parameters: the size of the compressed block and the error in the reconstructed block. Our proposed Chimera Transform (CT) shows robustness against other transforms such as the Discrete Cosine Transform (DCT), the Wavelet Transform (WT), and the Karhunen-Loeve Transform (KLT). The suggested approach is designed to compress one specific data type, namely images, and this is the first powerful characteristic of the transform. Additionally, the image reconstructed using the Chimera transform has a small size with low error, which can be considered the second characteristic of the suggested approach. Our results show a Peak Signal to Noise Ratio (PSNR) enhancement of 2.0272 over DCT, 1.179 over WT, and 4.301 over KLT, together with a Structural Similarity Index Measure (SSIM) enhancement of 0.1108 over DCT, 0.051 over WT, and 0.175 over KLT.


Introduction
With the significant growth of multimedia technology in mobile devices and diverse applications, image compression is essential for reducing the amount of data. Nowadays, large numbers of images are transferred between mobile devices through wireless communication, which requires a fast and robust image compression scheme. In this regard, a compact representation of a digital image is required to transfer the important information. Reducing the amount of digital information speeds up the exchange of information and frees up storage space in mobile devices [1].
Moreover, lossy image compression is a powerful technique in the image processing field. It is a standard tool for saving images because it gives good quality with a small storage size [2]. Lossy image compression depends on two factors. The first factor is that the input data must be in a dependent or semi-dependent form, and the second factor is that a procedure must be applied to rearrange the processed data into a useful form [3].
Many applications incorporate image compression in their schemes. Object detection is one application which requires a reduced representation of the digital image. This reduction limits the size of the detector model, its parameters, and the data storage needed [4]. Biometric authentication in mobile devices is another application which demands image compression. This application uses image modalities such as faces, irises, and eyebrows to identify people via a matching process against a template stored in a remote database. Such applications need fast matching and information transfer, which is achieved through data reduction using image compression [5,6].
Multimedia mobile communication [7] uses wireless image transmission to exchange important information between portable devices. Such applications have had a substantial impact on users in recent times, and they require the data redundancy in a digital image to be reduced before transmission. There are different types of lossy image compression, such as JPEG and JPEG2000, which use the Discrete Cosine Transform (DCT) and the Wavelet Transform (WT), respectively. These are considered traditional transforms which apply a general transform to any data. In contrast, the approach proposed in this paper uses a technique designed specifically for image data. Several metrics are used to evaluate lossy image compression schemes, namely Mean Squared Error (MSE), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and Compression Ratio (CR). MSE is the mean of the squared errors between the compressed image and the original image. PSNR expresses the error between the image before and after compression on a logarithmic scale. SSIM is a perceptually motivated metric of image quality degradation which measures the similarity between the original and the reconstructed image. Finally, CR measures the size reduction achieved by the compression scheme.
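As a minimal sketch of how three of these metrics are commonly computed, the following assumes 8-bit images stored as flat sequences of pixel values and the usual definition PSNR = 10 log10(peak² / MSE). The function names are illustrative and are not taken from the paper; SSIM is omitted because it needs windowed local statistics.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    total = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    return total / len(original)

def psnr(original, reconstructed, peak=255):
    """PSNR in dB for a given peak value; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)

def compression_ratio(original_bits, compressed_bits):
    """Ratio of the original size to the compressed size."""
    return original_bits / compressed_bits
```

A higher PSNR and a higher CR are both desirable, but they usually trade off against each other, which is why the comparison later in the paper fixes the CR and compares quality.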
DCT and WT are used to compress data of different dimensions, such as voice signals (1D), gray images (2D), and movies (3D). Artusi et al. [8] implemented a new compression scheme for high dynamic range images. The suggested scheme involved a new coding method called JPEG XT, which is based on two layers, a base layer and an extension layer, containing the low and full dynamic range of an image, respectively. Kaur et al. [9] introduced a lossless image compression scheme using Huffman-based LZW. The proposed scheme used the Retinex algorithm and involved Huffman coding, word concatenation, and contrast enhancement. Mathur et al. [10] proposed another image compression scheme which removed redundant codes using Huffman coding. Jagadeesh et al. suggested a compression method using adaptive Huffman coding based on a binary tree, which shows improved results compared with the LZW method [11]. The Karhunen-Loeve Transform (KLT) has been used to transform a block of a signal and has been assessed in terms of energy compaction and decorrelation performance [12].
Image compression has also been implemented using curve fitting models, as in the work of Khalaf et al. [13], whose model was derived from a hyperbolic tangent function with only three coefficients. The function benefits from a symmetry property that minimizes the reconstruction error and enhances some textured details in the reconstructed image. Their results show an enhancement of up to 20 dB in PSNR. In addition, Khalaf et al. showed the effects of preprocessing and postprocessing on images compressed using DCT, addressing the issues of monochrome images [14]. Lu et al. extended optimal piecewise linear approximation of images from 1D curve fitting to 2D surface fitting using a dual-agent algorithm. Their algorithm achieved a compact code length with a guaranteed error bound by providing a more dedicated representation of image features [15]. Some compression algorithms have used Singular Value Decomposition (SVD) [16]; this scheme involved error prediction using a Recursive Neural Network (RNN) and Vector Quantization techniques. Other compression methods proposed a 2D image modeling algorithm based on a stochastic state-space system, fitting a quarter-plane causal dynamic Roesser model to an image. The resulting models were causal, recursive, and separable-in-denominator [17][18][19].
The Embedded Zerotree Wavelet transform has also been applied to lossy image compression; for example, the Set Partitioning in Hierarchical Trees (SPIHT) encoding technique was found to produce good results in comparison with other encoding schemes [20].
A. Losada et al. modified Shapiro's Embedded Zerotree Wavelet (EZW) algorithm for an image codec based on the Wavelet Transform and on the self-similarity inherent in images, applying a multi-iteration EZW to optimise the combination of zerotree (ZT) and Huffman coding [21]. Oufai et al. proposed another modification of the EZW algorithm which distributes the entropy differently from Shapiro's EZW by using six symbols instead of four, and which also optimizes the coding by a binary grouping of elements before coding [22].
Our study focuses on image compression. The two contributions of this study are as follows:
1. Suggest a novel scheme for image compression which is compatible with different image conditions.
2. Propose three hypotheses. The first and second hypotheses summarize the important requirements of lossy image compression, while the third hypothesis uses the first and second hypotheses to implement a powerful transform.
The remainder of this paper is organized as follows: Section 2 focuses on the principles of lossy image compression, Section 3 focuses on the suggested scheme, Section 4 presents the study results and discussion, and Section 5 gives the conclusions and future work.

Problem Statement of the Lossy Image Compression
Image compression is designed to work on a small part of the image, namely a block. Each block has K × K pixels, and each pixel has an independent value. Note that the image is a combination of pixels which reflects connected areas and deterministic objects.
Accordingly, the block involves some special cases of data combination. The block cannot be considered purely random data, yet it also cannot be represented in a dependent form such as a deterministic relation or function. In general, the pixels inside an image block can be grouped into one, two, or three connected groups, and these groups can be distributed randomly.
Three hypotheses are proposed in this paper. The first two affect lossy image compression and data redundancy. The first hypothesis concerns the limited number of useful patterns inside a block. However, the pattern of a block could be considered a useful pattern if the block has random data. The second hypothesis concerns the redundancy of data inside an image due to the similarity between its blocks. Because of this similarity, substituting a block with a similar one does not affect the entire image. The proposed approach makes use of this property to substitute the current block with a similar one.
Moreover, lossy image compression techniques can be implemented using a data transformation such as DCT, as in JPEG. This transformation transfers the data from the spatial domain to the frequency domain. The process can be considered a rearrangement tool for the information inside each block. In this regard, data rearrangement changes the data distribution so that the data can be stored in a form useful for the quantization and coding processes. Consequently, the amount of coded data can become smaller than before [23].
In addition, lossy image compression uses different transformation techniques according to the form of the data distribution inside the block. For instance, DCT is much more efficient than Walsh-Hadamard (WH) when the blocks contain wavy or smooth changes. In contrast, WH is more effective than DCT for edgy blocks [24]. Also, the Wavelet Transform works more efficiently when the blocks contain flat areas or edges [25].

The Concept of Chimera Transform
High cross correlation within an image block using few coefficients is the key requirement for an image transformation to be considered efficient. In this regard, for an image block which affords maximum cross correlation, the transformation gives one coefficient of high value while the remaining coefficients are very small (almost zero). In this case, the gain from quantization and coding is very high (a small number of bits), which gives good quality for the retrieved image.
Theoretically, a transform could be designed using the proposed approach, which is named Chimera. The Chimera (Figure 1) is a fictional animal consisting of body parts taken from different animals. This mythical animal is used to describe anything composed of disparate parts which is wildly imaginative, implausible, or dazzling. The proposed transformation follows the Chimera idea in that it has multiple aspects so as to be compatible with most block patterns. However, this transformation faces three challenges, namely orthogonality, complexity, and a large number of coefficients. The first challenge can be mitigated using a mask of ones for the selected coefficient with maximum correlation and masks of zeros for the other coefficients. This solution works if all possible cases of image blocks are estimated. On the other hand, the complexity can be reduced by separating the DC value (the minimum value of the block) from the other components and saving it as a separate coefficient, so that the remaining components (without the DC value) are normalized with respect to the maximum value in the block. The third challenge is the main challenge of the proposed approach due to the large number of possible cases. For instance, a 4 × 4 block of a gray image has $256^{16} = 2^{128}$ possible cases, which is greater than $10^{38}$. The first and second hypotheses are the key to solving this problem.
Moreover, the first and second hypotheses offer a way to find an acceptable set of coefficients to implement the Chimera transform. In this case, 256 free coefficients were implemented according to the suggested hypotheses. In other words, a 4 × 4 image block is converted into three coefficients: (A) for the DC component (minimum), (B) for normalization (maximum − minimum), and (C) for the mask label.

The Proposed Approach
The overall stages of the proposed work are shown in Figure 2. They consist of Chimera mask calculation for image compression and image restoration, as explained in the next sections.

Chimera Coefficients Calculation
The first stage in the proposed approach is coefficient calculation. This stage involves cascaded steps to compress the image: image division, block isolation, block normalization, correlation calculation, and best mask estimation. Figure 3 shows the steps of the coefficient calculation stage. First, the entire input image was divided into N blocks of size 4 × 4. Then, for each block, the minimum value was isolated and taken as coefficient (A). The resulting component of the block was then normalized using min-max normalization, Equation (1). The min-max value (maximum − minimum) was taken as coefficient (B).
$$z_i = \frac{x_i - \min(x)}{\max(x) - \min(x)} \qquad (1)$$
where $x_i$ is the input value, $z_i$ is the normalized output value, and $x$ is the 4 × 4 block. The correlation coefficient (R) was calculated between the normalized block (z) and each mask in the library of 256 free masks and was used to estimate the best mask for the specific block. The best mask was selected as the one with minimum error (i.e., maximum value of R). Finally, the mask label (C) of the best estimated mask and the min-max value (B) of the block were saved for the next stage, image restoration. Figure 4 shows the calculated components for some test images (the man and boat images). The small set of 256 masks was acquired from the proposed transform, and these masks add a minimal amount of error to the reconstructed block. Accordingly, the error minimization depends on the number of coefficients and the mask pattern.
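The coefficient calculation described above can be sketched for a single block as follows. This is an illustration under stated assumptions, not the authors' implementation: the block and the masks are flat lists of 16 values, R is taken to be the Pearson correlation coefficient, and the three-mask library is a toy stand-in for the 256-mask library, which is not reproduced in this text. The handling of a completely flat block (B = 0) is also an assumption.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def encode_block(block, masks):
    """Return (A, B, C): block minimum, max - min, and index of the best mask."""
    a = min(block)
    b = max(block) - a
    # Min-max normalisation; a flat block (b == 0) is mapped to all zeros (assumption).
    z = [0.0] * len(block) if b == 0 else [(x - a) / b for x in block]
    scores = [pearson(z, m) for m in masks]
    c = max(range(len(masks)), key=scores.__getitem__)  # first index with maximum R
    return a, b, c

# Toy library (hypothetical masks): flat, horizontal ramp, vertical edge.
toy_masks = [[0.5] * 16, [0, 1 / 3, 2 / 3, 1] * 4, [0, 0, 1, 1] * 4]
print(encode_block([10, 10, 200, 200] * 4, toy_masks))  # (10, 190, 2)
```

In this sketch a flat block yields B = 0, so the choice of mask has no influence on its reconstruction.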
The masks were proposed using the suggested hypotheses to obtain a set of arrays of size k × k. This satisfied our aim of estimating a set of masks covering all possible cases of the input block. The number of masks depends on two parameters, namely the block size and the data complexity. Analysing a larger number of masks leads to a lower output error, but it also increases the number of masks inside the mask library. As shown in the experimental section, the selection of 256 masks for the 4 × 4 block size follows from the suggested hypotheses and gives robust results.
Moreover, each of the 256 masks was designed as a matrix of size 4 × 4 whose elements are in the range [0, 1]. Mask implementation was not a simple process and consists of three main steps. The first step generates 16 vectors of size 4 × 1, as explained in Table 1. Subsequently, a vector transpose and multiplication is applied to the 16 vectors to obtain 256 possible cases (masks). In the third step, the resulting masks are enhanced to obtain the final matrices.
Moreover, Table 1 is based on our suggested hypotheses, which state that for each four neighboring points along one dimension of the image (a 4 × 1 vector) there are 16 useful cases, divided over 5 groups as follows: group 1 is the Base, which has one flat case; group 2 is the Slope, which has two slow-growing cases; group 3 is the Simple edge, which has six cases; group 4 is the One bit, which has four cases; and group 5 is the Step, which has three cases. These 16 useful cases (generations) were scaled to the value 24 to avoid fractional numbers in the mask calculation (the integer 24 was used as the least common multiple for the values 1, 2/3, 1/3, 1/2, ...). With this assumption, each generation should contain a maximum value of 24 and a minimum value of 0. However, the first group (Base) and the last group (Step) are exceptions to this assumption, with maximum and minimum values of 24 and 12, respectively. The second step is given by Equation (2), which was used to generate the 256 masks, each of size 4 × 4.
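Equation (2) and Table 1 are not reproduced in this text, so the following is only one plausible reading of the second step: each mask is taken to be the outer product of two generating vectors, divided by 24² so that its entries fall in [0, 1]. Both the outer-product form and the rescaling are assumptions, and the vectors used in the usage lines are placeholders, not the Table 1 entries.

```python
SCALE = 24  # integer scale of the generating vectors, as stated in the text

def outer_mask(u, v, scale=SCALE):
    """4 x 4 mask with entries u[i] * v[j], rescaled to the range [0, 1]."""
    return [[(ui * vj) / (scale * scale) for vj in v] for ui in u]

def build_library(vectors):
    """One mask per ordered pair of vectors: len(vectors) ** 2 masks in total."""
    return [outer_mask(u, v) for u in vectors for v in vectors]

# Placeholder vectors with integer entries in 0..24 (NOT the Table 1 values).
placeholder_vectors = [[(3 * i + 5 * j) % 25 for j in range(4)] for i in range(16)]
library = build_library(placeholder_vectors)
print(len(library))  # 256
```

Pairing 16 vectors in every ordered combination gives 16 × 16 = 256 candidates, which matches the library size quoted in the text; the enhancement and exclusion rules of the third step are not modelled here.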
The third step is an enhancement process applied to the masks, which is an unsystematic, statistical process. The suggested hypotheses were also used to detect useful patterns (generated cases) and retain the desired masks. Table 2 shows the excluded patterns and the reasons for their exclusion. The first five cases are equal and the other cases have no zero value; therefore, these cases were excluded from the mask library. Since all the generated masks were symmetric, an anti-symmetry operation was required to mitigate this issue with other special masks. Thus, the symmetry problem was overcome by proposing the matrices shown in Table 3. However, the Non-Changed (NC) cases are left for future work.
In addition, the proposed algorithms for compression (Algorithm 1) and de-compression (Algorithm 2) are explained below.

Chimera Image Restoration
Image restoration is the second stage of the proposed Chimera approach. This stage consists of four steps: de-normalization, DC component restoration, block aggregation, and image enhancement. Figure 5 shows the sequence of these four steps. The mask label (C) from the previous stage (Chimera mask calculation) was used to locate the desired mask in the library of 256 masks. Then, this mask was de-normalized using coefficient (B), which was saved in the previous stage. Subsequently, the required block was restored using the DC component (coefficient A). This operation was applied to all 4 × 4 blocks of the entire image, and the blocks were then aggregated to construct the required image. An additional image enhancement step removes the undesired boundaries between the aggregated blocks, which are a consequence of the previous step.
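The first three restoration steps can be sketched as below, assuming each pixel is rebuilt as A + B × mask value and that blocks are stored as flat row-major lists. The function names and the clipping to the 8-bit range are assumptions of this sketch, and the final enhancement step is omitted because the text does not specify it.

```python
def decode_block(a, b, c, masks):
    """De-normalise mask C with B, then restore the DC value A (clipped to 0..255)."""
    return [min(255, max(0, round(a + b * m))) for m in masks[c]]

def assemble(blocks, blocks_per_row, k=4):
    """Aggregate flat k*k blocks, given in row-major order, into a list of image rows."""
    image = []
    for start in range(0, len(blocks), blocks_per_row):
        row_blocks = blocks[start:start + blocks_per_row]
        for r in range(k):
            row = []
            for blk in row_blocks:
                row.extend(blk[r * k:(r + 1) * k])
            image.append(row)
    return image

# Toy one-mask library (hypothetical): a vertical edge.
toy_masks = [[0, 0, 1, 1] * 4]
block = decode_block(10, 190, 0, toy_masks)
print(block[:4])  # [10, 10, 200, 200]
```

Because every block is reconstructed independently, neighbouring blocks can disagree along their shared borders, which is the blocking artefact the enhancement step is meant to remove.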

Experiments
This section describes our results and comparative evaluation. The proposed approach was compared with the standard transforms used in lossy image compression. PSNR at the same compression ratio (image size) was used in this comparative study. The reported results show that the proposed approach outperforms the traditional methods.

Results
Eight standard images were used in the evaluation to show the robustness of our proposed approach and to obtain subjective and objective metrics for these images. These results were then compared with standard lossy transforms for image compression. As mentioned in the previous section, three image transformations were used: two are independent of the image content, namely the Discrete Cosine Transform (DCT), which is used in JPEG, and the Wavelet Transform (WT), which is used in JPEG2000, and one is content-dependent, namely the Karhunen-Loeve Transform (KLT) [12].
An image block of size 4 × 4 was used in this evaluation, and each block was tested and evaluated using the suggested approach and the standard methods. The result for each block was quantized into three coefficients of 8 bits each.
The Compression Ratio (CR) was set to 5.3:1, meaning that 1.5 bpp (bits per pixel) was used for all tests. The results were evaluated without coding. The coding process was excluded since it does not affect the quality of the resulting image, and CR can be considered a random parameter for coding. Figure 6 shows some visual results of image de-compression using the different image transforms. For instance, the Lena image de-compressed using the suggested Chimera Transform (CT) is shown in Figure 6B. Clearly, the proposed CT gives better visual quality than the others, namely the DCT (Figure 6C), the WT (Figure 6D), and the KLT (Figure 6E).
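The quoted figures follow from the block layout: three 8-bit coefficients describe 16 pixels. The short check below makes the arithmetic explicit, assuming an 8-bit grayscale source image; the function names are illustrative.

```python
def bits_per_pixel(coefficients=3, bits_per_coefficient=8, block_size=4):
    """Bits spent per pixel when a block_size x block_size block is stored as coefficients."""
    return coefficients * bits_per_coefficient / (block_size * block_size)

def ratio_from_bpp(bpp, source_bpp=8):
    """Compression ratio relative to a source with source_bpp bits per pixel."""
    return source_bpp / bpp

bpp = bits_per_pixel()
print(bpp)                            # 1.5
print(round(ratio_from_bpp(bpp), 1))  # 5.3
```

With the same three coefficients, an 8 × 8 block would give 24 / 64 = 0.375 bpp, which is why a larger block size raises the compression ratio.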

Comparative Evaluation
To show the robustness of the proposed approach, a comparative evaluation was carried out. In this evaluation, four different approaches were compared to obtain four evaluation scores. Figure 7 shows the entire evaluation process.
PSNR was used to compare the proposed Chimera approach with three other approaches: JPEG (DCT), JPEG 2000 (WT), and KLT. The evaluation scores were obtained as the PSNR between the original image and the compressed image for each compression approach. Table 4 shows the comparative results using the image compression metrics PSNR and SSIM, applied to the proposed Chimera Transform (CT), DCT, WT, and KLT. Figure 8 shows the comparative evaluation between the proposed CT and the DCT, WT, and KLT transforms. This figure shows that CT outperformed the other transforms in both PSNR and SSIM.

Conclusions and Future Work
We have introduced a new transform called Chimera which shows robustness against other standard transforms such as the Discrete Cosine Transform (DCT), the Wavelet Transform (WT), and the Karhunen-Loeve Transform (KLT). The suggested approach was designed to compress one specific data type, namely images, and this is the first powerful characteristic of the transform. Also, the image reconstructed using the Chimera transform has a small size with low error, which can be considered the second characteristic of the suggested approach.
The reported results show a PSNR of 34.4137 for CT, compared with 32.3856 for DCT, 33.9148 for WT, and 30.1127 for KLT. In addition, the suggested approach shows an SSIM of 0.9553 for CT, compared with 0.8225 for DCT, 0.9404 for WT, and 0.8912 for KLT. These results were evaluated on the Moon image of size 1920 × 1080. Table 4 shows the other evaluations, which were applied to eight different standard images, with the same size for each reconstructed image.
Other aspects of the suggested approach are not covered in this paper. These aspects relate to preprocessing, post-processing, and coding. For future work, a coding operation could be applied after CT to improve the quality of the image compression. An image block of size 8 × 8 could be considered and evaluated to increase the Compression Ratio (CR). Also, the number of masks and coefficients could be increased to lower the error in the reconstructed image. In addition, deep learning, such as a Convolutional Neural Network (CNN), could be introduced to generate the mask library instead of the hand-crafted mask generation shown in Tables 1 and 2.