1. Introduction
In the current technology era, multimedia applications are in high demand, generating huge amounts of video and image data at the click of a finger. Consequently, there is a strong demand for data compression techniques to handle these data. Compression is achieved by reducing or removing redundant information from the original data. Image compression techniques are of two types: lossless and lossy. In the medical diagnosis process, the reconstructed image must be an exact replica of the original image; that is, no information is lost, and this type of compression is known as lossless image compression. In lossy image compression, by contrast, some data loss is acceptable, for example in a photograph, where a minor loss of redundant data from the original image is tolerated in exchange for a better bit rate.
The details of different image compression techniques can be found in the literature [
1,
2] and the techniques can be classified as direct and transform domain methods. In direct techniques, such as block truncation coding [
2] and vector quantization [
3], compression is applied directly to the spatial domain data. However, in the transform domain techniques, the image is transformed to a different domain before the compression. Principal component analysis (PCA), discrete wavelet transform (DWT) and discrete cosine transform (DCT) [
4,
5,
6] are the most popular transform domain methods found in the literature. In these methods, the information content of the image is concentrated in a few coefficients, making them suitable for data reduction. The DWT produces a better reconstructed image than the other methods: according to standard quality measures for image and video compression, the PSNR of a DWT-based approach is much better than that of a DCT-based approach. However, due to their lower complexity, DCT-based approaches are preferred in multimedia devices.
The simple and efficient encoding and decoding structure of the JPEG standard makes it the most popular among all compression standards [
7,
8]. This standard is used worldwide in digital cameras and various photographic equipment for data storage and transmission. As the DCT is applied in a block-wise manner, we consider an 8 × 8 block size for the encoding process in the proposed method. For each block, a 2D DCT is computed, followed by quantization and zig-zag scanning.
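For concreteness, the block-wise encoding path described above (level shift, 2D DCT, quantization, zig-zag scan) can be sketched as follows. This is a minimal illustration rather than the paper's implementation; the quantization matrix `qm` is a placeholder argument.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix; C @ block @ C.T gives the 2D DCT."""
    C = np.zeros((n, n))
    for u in range(n):
        a = np.sqrt(1.0 / n) if u == 0 else np.sqrt(2.0 / n)
        for x in range(n):
            C[u, x] = a * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    return C

def zigzag_indices(n=8):
    """JPEG zig-zag traversal order of an n x n block."""
    return sorted(((x, y) for x in range(n) for y in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def encode_block(block, qm):
    """Level shift -> 2D DCT -> quantization -> zig-zag scan of one block."""
    C = dct_matrix(8)
    coeffs = C @ (np.asarray(block, float) - 128.0) @ C.T
    quantized = np.round(coeffs / qm).astype(int)
    return [quantized[x, y] for x, y in zigzag_indices(8)]
```

A uniform block of value 128 quantizes to all zeros after the level shift, which is the best case for the subsequent entropy coding.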
DCT [
9]-based JPEG is the most widely used transform domain algorithm today, with various modifications. In JPEG, the quantization matrix (QM) plays a major role, as it decides the compression ratio and, hence, the bpp of the compressed image. A JPEG variant based on adaptive quantization and an adaptive scanning method was developed by Messaoudi et al. [
10]. There are also saliency-based JPEG methods, which generate a low bit rate [
11]. Some image compression methods have used the feature extraction technique for bit rate reduction using block truncation coding (BTC) [
12,
13]. BTC methods are well suited to low-complexity applications, but their bit rate is still very high. This high bit rate can be reduced by using multi-grouping techniques [
14,
15]. The compression algorithms developed based on DWT [
16], H264 [
17], H265 [
18], and JPEG2000 [
19,
20] achieve effective compression but, due to their high complexity, impose high computational requirements. A low-complexity saliency-based DCT achieves better compression but is not applied to the entire image [
21]. The Walsh–Hadamard Transform (WHT) is used for image compression [
22] due to its high speed of operation and its ability to reduce the bit rate and, hence, achieve a higher compression ratio.
To reduce the correlation between the RGB bands, Touil et al. [
23] proposed a DCT-based compression method where the authors used three different uncorrelated spaces by using the Bat algorithm, which gives better performance than standard JPEG. This method is capable of reducing the number of small energy coefficients of the DCT by optimizing the cost function. In [
24], a homogeneous texture region (HTR)-based block allocation for DCT compression is proposed: blocks with high HTR content are assigned a larger block size for the DCT, and vice versa.
Motivated by the popularity and simplicity of the DCT-based compression approach, in this paper we propose a DCT-based framework followed by bi-level quantization for image compression. Here, the quantization matrix is not fixed across the image blocks; its value is decided based on the feature strength of each block, extracted using the Walsh–Hadamard kernel.
The unique contributions of this paper are as follows:
- Deployment of Walsh–Hadamard transform kernel (WHTK)-based feature extraction for image compression, which enables faster implementation.
- Generation of a block feature strength (BFS) as a weighted combination of color, edge, and texture strengths extracted using the Walsh–Hadamard transform kernel, which enhances compression performance.
- Automatic selection of the quantization level of each block using the BFS and the K-means [25] algorithm, which ensures easy implementation.
- The proposed DCT-based framework using bi-level quantization is efficient and faster compared to standard JPEG and other recent algorithms, such as H264, H265, and JPEG 2000.
The remainder of the paper is organized as follows.
Section 2 describes the materials and methods, including an overview of Walsh–Hadamard Transform and its properties, and the proposed methods in detail.
Section 3 describes the simulations, results, and analysis. Finally, some concluding remarks are given in
Section 4.
2. Materials and Methods
This section provides an overview of the Walsh–Hadamard Transform (WHT) and its properties, the Walsh basis vectors, and Walsh-basis-vector-based feature extraction, followed by details of the proposed algorithm.
2.1. Overview of Walsh–Hadamard Transforms
The DCT, DWT, and WHT are used in various signal and image processing fields. These methods are popular due to their compaction of energy into the lower-frequency components and the quality of their reconstruction. By analyzing and classifying the coefficients of a linear image transform, i.e., the frequency components of the image, we can obtain edge information. The kernels, or bases, of the image transform that give the edge information can also provide energy information, which characterizes local properties of an image such as texture and color. The WHT is attractive for its simplicity of operation compared with other linear transforms: it has reduced computational complexity because its orthogonal components take only the values +1 and −1, and it is a non-sinusoidal transform. The Walsh–Hadamard Transform matrix (WHTM), denoted $H_N$, is defined with $N$ rows, whose properties are as follows:
- (i) each entry of a row is $+1$ or $-1$, i.e., $h_u(x) \in \{+1, -1\}$;
- (ii) the rows are mutually orthogonal, $\sum_{x=0}^{N-1} h_u(x)\, h_v(x) = 0$ for all $u \neq v$;
- (iii) the rows of $H_N$ are ordered by their number of zero crossings (sequency). The size of the WHTM is a power of 2.
The WHT matrix is valid for N > 2 and N mod 4 = 0.
The WHTM for $N = 8$ (sequency-ordered) is given by:
$$H_8 = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\ 1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\ 1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\ 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 \\ 1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \end{bmatrix}$$
Each row of the WHTM is a 1D Walsh–Hadamard basis vector, and these vectors are orthonormal (after normalization by $1/\sqrt{N}$). The 2D WHT kernels are obtained as the tensor (outer) products of the 1D basis vectors, i.e., by multiplying a row of the WHTM with a column of its transpose [26]. In this way, an $N \times N$ WHTM yields $N^2$ basis kernels.
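The construction of the WHTM and its 2D kernels can be sketched as follows, assuming the sequency (zero-crossing) ordering of the rows described above:

```python
import numpy as np

def walsh_matrix(n):
    """Sequency-ordered Walsh-Hadamard matrix (entries +1/-1); n a power of 2."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])          # natural-ordered Hadamard
    sequency = [int(np.sum(row[:-1] != row[1:])) for row in H]
    return H[np.argsort(sequency)]               # reorder rows by zero crossings

def wht_kernel(W, u, v):
    """2D WHT basis kernel as the outer (tensor) product of rows u and v."""
    return np.outer(W[u], W[v])
```

Row $k$ of the result has exactly $k$ sign changes, matching property (iii).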
2.2. Properties of Walsh–Hadamard Transform
Consider an image block of size $N \times N$ with pixels $f(x, y)$. The 2D WHT is given by:
$$W(u, v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, g(x, y, u, v),$$
where the $W(u, v)$ are the Walsh transform coefficients and the $g(x, y, u, v)$ are the kernels. The first kernel, $g(x, y, 0, 0)$, is the zero-sequency kernel.
Below are some interesting properties of WHT:
(i) Property 1: The zero-sequency term $W(0, 0)$ is a measure of the average brightness of the image block; its statistics give the color strength of the image.
(ii) Property 2: Energy is conserved between the spatial domain and the transform domain of a block [26]:
$$\sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f^2(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} W^2(u, v).$$
This property can be exploited to extract edge information from the non-zero-sequency terms.
(iii) Property 3: The energy-compaction property indicates that the energy of the image is distributed over only a few transform coefficients, i.e., only a few coefficients have significant values. This compaction provides information about the texture strength of the image block.
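Properties 1 and 2 can be checked numerically with an orthonormal WHT (rows scaled by $1/\sqrt{N}$); the sketch below assumes the sequency-ordered matrix of the previous subsection:

```python
import numpy as np

def walsh_matrix(n):
    """Orthonormal sequency-ordered WHT matrix (rows scaled by 1/sqrt(n))."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    sequency = [int(np.sum(r[:-1] != r[1:])) for r in H]
    return H[np.argsort(sequency)] / np.sqrt(n)

rng = np.random.default_rng(0)
f = rng.uniform(0.0, 255.0, (8, 8))   # a random 8 x 8 image block
W8 = walsh_matrix(8)
Wc = W8 @ f @ W8.T                    # 2D WHT coefficients of the block

# Property 1: the zero-sequency term is proportional to the block mean.
assert abs(Wc[0, 0] - 8.0 * f.mean()) < 1e-9
# Property 2 (energy conservation): spatial and transform energies match.
assert abs((f ** 2).sum() - (Wc ** 2).sum()) < 1e-6
```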
Motivated by the properties of Walsh–Hadamard Transform [
26], we have considered the energy, texture, and color properties to obtain better visual features of an image in the proposed method.
In the preprocessing stage, we convert the image from the RGB to the YCbCr model and generate one combined color component $C$, from which the features of the image are extracted. The value of $C$ is generated as the average of the two chrominance planes:
$$C(i, j) = \frac{C_b(i, j) + C_r(i, j)}{2},$$
where $1 \le i \le M$ and $1 \le j \le N$, and $M \times N$ indicates the size of the image. $Y$, $C_b$, and $C_r$ indicate the YCbCr planes, where $Y$ is the luminance (brightness) of the color and $C_b$, $C_r$ are the blue and red components related to the chrominance. The image is divided into 8 × 8 blocks, and feature extraction is performed by applying the basis vectors to the $C$ values.
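A minimal sketch of this preprocessing step, assuming a BT.601 full-range RGB-to-YCbCr conversion and assuming the combined component is the average of the two chrominance planes (our reading of the text, not a definitive reconstruction):

```python
import numpy as np

def rgb_to_ycbcr(img):
    """BT.601 full-range RGB -> YCbCr; img is an H x W x 3 float array."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, cb, cr

def combined_component(img):
    """Single component C as the average of the two chrominance planes."""
    _, cb, cr = rgb_to_ycbcr(img)
    return (cb + cr) / 2.0
```

For a grayscale input (R = G = B), both chroma planes sit at the neutral value 128, so $C$ is constant.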
2.3. Walsh Basis Vectors
Walsh basis vectors are n-dimensional basis vectors that depend on the kernel of the WHT [
26]. The idea of using the basis vector for feature extraction has been discussed in [
26]. The low and high frequencies define important features of the image: low frequencies capture the brightness of the image, whereas high frequencies capture its edge features. The zero-sequency basis kernel captures the low-frequency components, and the remaining basis kernels capture the high-frequency components of the image, where $n$ is the total number of kernels in the WHTM. In this paper, for feature extraction, we consider the zero-sequency basis kernel for capturing the low-frequency information and selected higher-sequency basis kernels for capturing the high-frequency information; additional basis kernels are used to capture the texture information. While selecting the basis vectors, we must ensure that one vector is not the transpose of another. In [26], Lakshmipriya et al. selected basis kernels of dimension 4 × 4 for feature extraction from the corresponding WHTM. In contrast, in our work, we consider basis kernels of the 8 × 8 WHTM, of dimension 8 × 8, as illustrated in Figure 1, for feature extraction.
The selected Walsh basis kernels are vectorized to form the Walsh basis vectors; these are orthogonal vectors, as they represent Walsh–Hadamard kernels, and are calculated based on (6), (7), and (8). Color strength is extracted by applying the low-frequency (zero-sequency) basis vector to the image block; edge strength is extracted by applying the two high-frequency basis vectors; and texture strength is extracted by applying the texture basis vectors. The procedure for feature extraction using the basis vectors is described in the following subsection.
2.4. Feature Extraction and Generation of Block Feature Strength (BFS) Using Walsh Basis Vector
In this section, we describe the block-based color feature, edge feature, and texture feature extraction techniques using the three Walsh basis vectors defined in the previous subsection.
Let $B_b$ denote the $b$th block, for $b = 1, 2, \ldots, n_b$, where $n_b$ is the number of blocks, and let $\langle B_b, w \rangle$ represent the inner product of block $B_b$ with a Walsh basis vector $w$. Then:
$$\langle B_b, w \rangle = \sum_{c=1}^{p} B_b(c)\, w(c),$$
where $p$ represents the number of elements in a block ($p = 64$ for an 8 × 8 block), $B_b(c)$ represents the $c$th value of the vectorized block, and $w(c)$ represents the $c$th value of the Walsh basis vector.
The block-based color, edge, and texture feature strengths, denoted $BCS$, $BES$, and $BTS$, respectively, are obtained as follows:
Block color strength ($BCS$): defined as the amount of brightness in the image block, calculated from the inner product of the block with the low-frequency basis vector as given in (10):
Block edge strength ($BES$): computed using the magnitude of the gradient vectors $G_x$ and $G_y$ (the inner products of the block with the two high-frequency basis vectors), as given in (11):
$$BES_b = \sqrt{G_x^2 + G_y^2}.$$
For efficient implementation, (11) can be approximated as in (12):
$$BES_b \approx |G_x| + |G_y|.$$
Block texture strength ($BTS$): the spatial occurrence of the pixel intensities in a specified region, denoted $BTS_b$, is given by (13), where $BTS_b$ is the texture strength of the $b$th block and $XT$ is defined in (14).
By combining the values of $BCS$, $BES$, and $BTS$, a block feature strength (BFS) is generated as given in (15):
$$BFS_b = \alpha \cdot BCS_b + \beta \cdot BES_b + \gamma \cdot BTS_b,$$
where $\alpha$, $\beta$, and $\gamma$ are three constants known as the feature scaling factors, with values ranging between 0 and 1, such that:
$$\alpha + \beta + \gamma = 1.$$
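The feature-extraction and BFS steps can be sketched as follows. The specific kernel indices chosen here for the color, edge, and texture strengths are illustrative assumptions (the paper's exact selection, shown in Figure 1, is not reproduced in this text), and the weights default to the values used in Section 3:

```python
import numpy as np

def walsh_matrix(n):
    """Sequency-ordered WHT matrix (entries +1/-1)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    seq = [int(np.sum(r[:-1] != r[1:])) for r in H]
    return H[np.argsort(seq)]

def block_feature_strength(block, alpha=0.3, beta=0.3, gamma=0.4):
    """BFS of one 8 x 8 block as alpha*BCS + beta*BES + gamma*BTS.
    Kernel indices below are illustrative choices, not the paper's exact ones."""
    W = walsh_matrix(8)
    kern = lambda u, v: np.outer(W[u], W[v]).ravel()
    b = np.asarray(block, float).ravel()
    bcs = abs(b @ kern(0, 0))                     # color: zero-sequency kernel
    gx, gy = b @ kern(0, 1), b @ kern(1, 0)       # edge: gradient-like kernels
    bes = abs(gx) + abs(gy)                       # |G| ~ |Gx| + |Gy|, as in (12)
    bts = sum(abs(b @ kern(u, v)) for u, v in [(1, 1), (2, 2)])  # texture
    return alpha * bcs + beta * bes + gamma * bts
```

For a perfectly flat block, only the color term is non-zero, so the BFS reduces to the weighted brightness alone.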
2.5. Proposed Method
In this section, we describe the details of the proposed algorithm. The literature indicates that the YCbCr model provides better image quality and captures the intensities to which the human eye is most sensitive; it is therefore preferred in JPEG compression, whereas the RGB model is less suitable due to the redundancy among its channels [23]. As a result, we also process the image in the YCbCr color space. After conversion to the YCbCr model, the image is divided into N × N non-overlapping blocks. For each block, the block color strength, the block edge strength, and the block texture strength are extracted using the Walsh–Hadamard basis [
16] vectors as given in (10), (12), and (13), respectively. The complete feature content of the block, known as the block feature strength (BFS), is generated as a weighted combination of the individual feature strengths using (15). After generation of the BFS, a block selection is made by comparing the BFS with a threshold (Th) generated automatically using the K-means algorithm. If the BFS exceeds the threshold, the block is compressed with a high-valued Q-Matrix chosen from (17), where F ranges from 10 to 100; otherwise, the block is compressed with the lowest-valued Q-Matrix, $Q_{10}$, which offers the highest compression. In this technique, the low feature content blocks are compressed to a fixed level by $Q_{10}$, which ensures a minimum quality of the reconstructed image. To improve the reconstruction quality, the Q-Matrix assigned to the high-BFS blocks should be higher than $Q_{10}$. For the decoding process, the respective block Q-Matrix is taken into consideration, followed by the inverse-DCT process. Algorithm 1 represents the encoding process of the proposed method.
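The automatic threshold can be obtained with a simple two-cluster 1D k-means over the BFS values; the sketch below assumes the threshold is taken as the midpoint of the two final cluster centres:

```python
def kmeans_threshold(values, iters=50):
    """Two-cluster 1D k-means over the BFS values; the threshold Th is
    taken as the midpoint between the two final cluster centres."""
    c1, c2 = min(values), max(values)
    for _ in range(iters):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        if not g1 or not g2:          # degenerate case: all values identical
            break
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return (c1 + c2) / 2.0
```

A block with BFS above the returned threshold would then be encoded with the higher Q-Matrix, and the rest with $Q_{10}$.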
Algorithm 1 Proposed algorithm
1. Divide the image into N × N non-overlapping blocks.
2. For each block, extract the color, edge, and texture strengths using the Walsh basis vectors.
3. Compute the block feature strength (BFS) of each block using (15).
4. Compute the threshold (Th) from the BFS values using the K-means algorithm.
5. For each block:
   if BFS > Th, compress the block by DCT using the high Q-Matrix;
   else, compress the block by DCT using the low Q-Matrix ($Q_{10}$).
6. End of if block; end of for loop.
The details of the proposed method work flow are shown in
Figure 2. In the encoding and decoding processes of JPEG compression, the Q-Matrix plays a major role: the required bit rate is achieved through the quantization step in the JPEG framework. The QM (Q-Matrix) [27], also known as the quality factor matrix, can be varied to change the bit rate. Thus, the bpp of the image is configured through the series of quantization tables defined in JPEG to obtain different PSNR values.
The quantization table $Q_F$ for a quality factor $F$ is given by (17). A different Q-factor matrix can be generated by varying the parameter F in (17). To evaluate the performance of the proposed method at different bpp, we use different QMs, obtained by using F values from 1 to 100 in (17).
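Assuming (17) follows the standard JPEG quality-factor scaling rule applied to the baseline luminance table (an assumption on our part, since the equation itself is not reproduced here), the family of QMs can be generated as:

```python
import numpy as np

# Baseline JPEG luminance quantization table (Annex K of the JPEG standard).
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99]])

def quant_table(F):
    """Q-Matrix for quality factor F in [1, 100]; F = 50 returns the baseline."""
    s = 5000.0 / F if F < 50 else 200.0 - 2.0 * F
    q = np.floor((Q50 * s + 50.0) / 100.0)
    return np.clip(q, 1, 255).astype(int)
```

Lower F yields larger quantization steps (coarser quantization, lower bpp), while F = 100 yields an all-ones table (finest quantization).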
3. Simulations and Result Analysis
The performance of the proposed algorithm is measured based on objective and subjective evaluation criteria. The objective evaluation is made based on PSNR [
28] and SSIM [
29] at a given bpp. To evaluate the efficacy of the proposed method, we consider seven standard images, namely Lena, Airplane, Peppers, Boat, Zelda, Girl, and Couple [30], in the first comparative study, using different quantization matrices. In the second comparative study, the performance of the proposed method is evaluated on images from the KODAK dataset [
31], which contains 25 images. These are 24-bit color images of sizes ranging from 256 × 256 to 1024 × 1024. During simulations, we experimented with different sets of values of $\alpha$, $\beta$, and $\gamma$ for five images and observed that optimum performance is obtained with $\alpha$, $\beta$, and $\gamma$ set to 0.3, 0.3, and 0.4, respectively. Therefore, all simulations are made by considering this set of values only. In our experiments, we have considered only color images. For illustration purposes, the KODAK dataset images are given in
Figure 3.
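For reference, the two objective measures used throughout this section, PSNR (against a peak value of 255) and bpp, can be computed as:

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    diff = np.asarray(original, float) - np.asarray(reconstructed, float)
    mse = np.mean(diff ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def bits_per_pixel(n_bits, height, width):
    """bpp of a compressed image given its total bit count."""
    return n_bits / float(height * width)
```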
The performance measurements of the proposed method and the JPEG method based on the different QM are illustrated in
Figure 4. In
Figure 4, the pink graph shows the variation in PSNR obtained with a fixed quantization of $Q_{10}$ for blocks with lower BFS and a higher QM for blocks with higher BFS values. By varying the F parameter in (17) from 10 to 50 in increments of 10 units, different QMs are generated. In
Figure 4, Q(
x,
y) indicates the two QM values
x and
y used in the two-level quantization process by the proposed algorithm, as shown in the pink color line. The
x value is kept fixed to 10, corresponding to blocks with low BFS, and the
y can take value from {10, 20, 30, 40, 50}. Again, the variation in the PSNR with respect to the single QM as used in standard JPEG is shown in cyan color. From
Figure 4, it is evident that the proposed method improves the PSNR value compared to the baseline JPEG method. It is also observed from
Figure 4 that, with the increase in QM value, there is an increase in the PSNR value for both the methods. However, using two quantization matrices for encoding by the proposed method has ensured lower bpp than the corresponding JPEG counterpart.
Variation in the quantization matrix controls the degree of compression resulting in variation in compression ratio and bpp of the image and, hence, the PSNR of the image. The variation in the PSNR with respect to bpp for JPEG [
8], JPEG2000 [
20], H264 [
17], H265 [
18], and AMBTC [
2] are shown in
Figure 5. It is observed from
Figure 5 that, for a fixed bpp, the proposed WDCT method has a better PSNR than JPEG [
8], ADCT [
10], SJPEG [
11], and, OcsDCT [
23]. Again, PSNR comparisons for different DCT-based methods, such as ADCT [
10], SJPEG [
11] and OcsDCT [
23], at constant QM of Q30 are shown in
Table 1. In the ADCT [
10] method, block-wise adaptive scanning technique was used to generate the compressed image, and the adaptive scanning improved the bpp and PSNR values. In the saliency-based SJPEG [
11], a saliency-based compression framework is proposed where the salient blocks are encoded separately and the non-salient region blocks are encoded individually, which ensured improved quality of the image at lower bpp. In OcsDCT [
23], the Bat algorithm was used to optimize a cost function to compute the threshold, which reduced the number of less-significant DCT coefficients having lower energy. In this way, it reduced the bit rate while maintaining a higher PSNR value. From
Figure 5, we can observe that our proposed method, as well as JPEG, ADCT, OcsDCT, and SPJEG are not as good as H264, H265, and JPEG2000. However, our method has the lowest complexity. The comparison of execution times will be shown in a later section.
It is observed from
Table 1 that, for a fixed quantization matrix, the proposed WDCT method performs best, with the lowest average bpp of 0.53 and the highest average PSNR of 35.61 dB, compared to the methods of similar complexity (ADCT, JPEG, and OcsDCT). From
Table 1, it is clearly observed that the proposed method outperforms the others in most cases, except for the Lena and Girl images. The reason behind the lower PSNR for these two images is that they have a small area containing detailed information and large areas with little detail, which are treated as background. The Lena image has detailed content only in the face area, while the remaining portion has very few details; similarly, the Girl image has detailed information only near the face, the table, and the drawer regions. Less detailed areas undergo higher compression than highly detailed areas, resulting in lower reconstruction quality for such images. This illustrates that the proposed WDCT is well capable of handling images with detailed content, maintaining good reconstruction quality at a high compression level.
The proposed method is also compared with different BTC-based approaches, such as ImgAMBTC [
14] and ADMG [
15], and the results are included in
Table 2. Evaluation of SSIM at constant bpp of 0.6 and variation of bpp by using different levels of QM are shown in the
Figure 6 and
Figure 7, respectively.
It is observed from
Figure 6 that the proposed method has an SSIM value similar to that of ADCT and has an SSIM value 0.03 units less than that of the JPEG method at constant bpp of 0.6, whereas the proposed method leads in SSIM value by 0.14 units compared to the SJPEG method.
Figure 7 illustrates the comparison of the bpp values of the JPEG method and the proposed method using a fixed QM. From
Figure 7, it can be observed that, for a QM of Q10, the proposed WDCT and the JPEG method have equal bpp. However, as the quality factor of the QM increases, the bpp increases for both methods. For higher QM values, the proposed method has a lower bpp than the corresponding JPEG counterpart, which ensures a higher compression rate for the proposed method.
All simulations are carried out using a platform powered by an i5 CPU with 8 GB RAM in MatlabR2017a environment, and execution times are measured. The execution time of the proposed WDCT- and DCT-based JPEG method are compared and plotted in
Figure 8. It is observed from
Figure 8 that the WDCT method has lower computation time than JPEG for all QM.
Performance comparison of different methods in terms of average execution time in seconds is shown in
Figure 9. It is observed from
Figure 9 that the proposed method has the lowest execution time compared to H264, H265, JPEG 2000, and JPEG across all the images. Note that JPEG 2000 is a wavelet-based approach and is more complex than our proposed method. Although H264, H265, JPEG, and the proposed WDCT are all DCT-based approaches, the two-level quantization of the proposed WDCT algorithm, in contrast to the fixed quantization of JPEG, results in a lower execution time.
The subjective evaluation of the proposed WDCT and other DCT-based approaches are made based on the perceptual quality of the reconstructed image.
Figure 10 shows the reconstructed Lena image of size 256 × 256 using the proposed WDCT method and JPEG at 0.4 bpp. The WDCT-based reconstructed image has a PSNR of 34.8 dB, whereas the JPEG-based reconstructed image has a PSNR of 31.5 dB. For the proposed WDCT approach, the main feature-containing blocks are compressed using the higher QM, and the remaining blocks are compressed using the fixed QM of $Q_{10}$, while JPEG is compressed using a single QM. The QMs for the two reconstruction approaches are chosen so as to maintain a constant bpp. It is observed from
Figure 10 that the proposed method has fewer artifacts at 0.4 bpp than the direct JPEG method at the same bpp. Compressing the high feature-content areas with the high QM and the low feature-content blocks with the low QM ensures an improved bpp while maintaining appreciable PSNR and SSIM values. The proposed WDCT method achieves a lower bpp with better reconstructed image quality than the other DCT-based approaches.
The ADMG method used a combination of multi-grouping and AMBTC [
2] with a predefined threshold to reduce the bit rate. The bpp of this method is 1.99. In ImgAMBTC [
14], the ADMG [
15] grouping is applied to the selected blocks having higher edge features, and the rest of the blocks are compressed by using the AMBTC [
2] method. In order to compare the proposed WDCT method with the BTC-based approaches, we have considered a higher QM to obtain a comparable bpp, which is shown in
Table 2. On average, our method has the highest PSNR and lowest bpp. It is emphasized that high PSNR and low bpp mean better performance.