Reduction of Artefacts in JPEG-XR Compressed Images

The JPEG-XR encoding process utilizes two types of transform operations: the Photo Overlap Transform (POT) and the Photo Core Transform (PCT). Using the Device Porting Kit (DPK) provided by Microsoft, we performed encoding and decoding processes on JPEG-XR images. We found that when the quantization parameter is greater than 1 (lossy compression conditions), the resulting image displays chequerboard block artefacts, border artefacts and corner artefacts. These artefacts are due to the nonlinearity of the transforms used by JPEG-XR. Typically, they are not very visible; however, they can cause problems in copying and scanning applications, where the nonlinear transforms become apparent when the source and the target of the image have different configurations. Hence, it is important for document image processing pipelines to take such artefacts into account. Additionally, these artefacts are most problematic for high-quality settings and become more visible at high compression ratios. In this paper, we analyse the cause of the above artefacts. We found that the main problem lies in the POT and quantization steps. To solve this problem, the use of a "uniform matrix" is proposed. After POT (encoding) and before inverse POT (decoding), an extra step is added that multiplies the data by this uniform matrix. Results suggest that this is an easy and effective way to decrease chequerboard, border and corner artefacts, thereby improving the image quality of lossy JPEG-XR encoding relative to the original DPK program with no increase in calculation complexity or file size.


Introduction
In a smart city environment, a colossal quantity of image data is generated by traffic control systems, intruder detection and surveillance systems, and other intelligent sensing gadgets and devices. Sending this data across the network can consume a great deal of bandwidth; hence, these images are compressed using suitable compression approaches such as Joint Photographic Experts Group (JPEG), JPEG Extended Range (JPEG-XR) and so on [1][2][3]. JPEG-XR was initially presented under the name Windows Media Photo, and it was subsequently retitled as High Definition (HD) Photo.
In Section 2 of this paper, we will analyse chequerboard block artefacts, border artefacts and corner artefacts in turn. Through implementation and observation, we will hypothesize possible causes and verify them mathematically. In Section 3, we will design a separate "uniform matrix" for each of the three artefacts in order to improve encoding quality. In Section 4, we will implement the aforementioned uniform matrices and compare the results with Microsoft's DPK.
It will become apparent that our method effectively removes chequerboard, border and corner artefacts and provides better image quality and compression ratio.

Chequerboard, Border and Corner Block Artefacts
We used the DPK provided by Microsoft to encode and decode images. We discovered that chequerboard, border and corner artefacts occur when QP > 1. Figure 1a depicts the original image before transformation. It is a 64 × 64 monochrome bitmap (BMP). Figure 1b portrays a lossless transform using QP = 1. It is identical to the original image. Figure 1c shows a lossy transform using QP = 41. The difference between it and the original is not very apparent. Figure 1d illustrates a lossy transform using QP = 61. The chequerboard, border and corner artefacts are more obvious than in Figure 1c. Figure 2a,b shows the results of applying linear contrast enhancement [1] to Figure 1c,d, where $f_{\mathrm{ENC}}(p[i]) = \min(p) + (p[i] - \min(p)) \cdot R$, with p the pixel array of the image, i the array index in p and R = 6 the expansion ratio (fixed in this paper). After enhancement, the chequerboard, border and corner artefacts are very noticeable. Note that if we set the overlap parameter L = 0, so that POT is not applied, the artefacts disappear, as shown in Figure 2c,d. We can infer from the above discussion that the two causes of the artefacts are:
1. lossy quantization (QP > 1)
2. the use of POT (L > 0)
Hereinafter, we will analyse each of the two causes.
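The linear contrast enhancement used for Figure 2 can be sketched in a few lines of Python. This is only an illustration of the formula as stated in the text; the function name `enhance_contrast` is hypothetical, and clipping to the display range (which the text does not describe) is left out.

```python
def enhance_contrast(pixels, ratio=6):
    """Stretch pixel values away from the image minimum by a fixed ratio.

    Implements f_ENC(p[i]) = min(p) + (p[i] - min(p)) * R from the text,
    with R defaulting to 6. No clipping is applied (not described in the text).
    """
    lowest = min(pixels)
    return [lowest + (p - lowest) * ratio for p in pixels]


# A nearly flat region whose values differ by one or two grey levels.
nearly_flat = [200, 201, 200, 199, 200]
print(enhance_contrast(nearly_flat))  # [205, 211, 205, 199, 205]
```

Differences of a single grey level become differences of six levels, which is why artefacts that are hard to see in Figure 1c,d stand out in Figure 2a,b.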

Lossy Quantization and Irreversibility
Quantization of DPK is lossless when the quantization parameter QP = 1, as shown in Figure 3a. When QP > 1, the result is similar to uniform quantization, as shown in Figure 3b-d. The solid lines are the quantization results of QP = 21, QP = 41, and QP = 61, respectively. It can be observed that as QP increases, the interval (step) becomes wider.
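As a rough illustration of the behaviour in Figure 3, the following sketch implements a plain uniform quantizer and its inverse. The `step` parameter is a hypothetical stand-in for the step size that DPK derives from QP; the actual QP-to-step mapping and rounding rules of DPK are not reproduced here.

```python
def quantize(value, step):
    """Map an integer to a quantization index (round to nearest, symmetric)."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + step // 2) // step)


def dequantize(index, step):
    """Map a quantization index back to a representative value."""
    return index * step


def quantize_roundtrip(value, step):
    """Quantization followed by inverse quantization."""
    return dequantize(quantize(value, step), step)


# step = 1 is lossless; a larger step widens the intervals and loses detail.
for step in (1, 40):
    print(step, [quantize_roundtrip(v, step) for v in (-100, -19, 0, 19, 20, 100)])
```

With `step = 1` every value is returned unchanged, while with a larger step small values collapse to zero and the others snap to multiples of the step, which is the staircase shape described above.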
The process of JPEG-XR encoding is as shown in Figure 4. From [18], we know that the POT and PCT implemented by the lifting structure are "reversible". This means that after a value has been transformed by POT or PCT, we can apply the corresponding inverse transforms to return it to its original value, as shown in Figure 5.
When the quantization parameter QP = 1, the quantization is both lossless and reversible. Combined with the reversibility of POT and PCT, the encoding/decoding process of JPEG-XR becomes completely lossless. Consequently, no chequerboard, border or corner artefacts are produced, as shown in Figure 6.
On the other hand, when QP > 1, that is, when quantization is lossy, the values may be offset or distorted. The inverse PCT and inverse POT steps then no longer recover the original values, which may cause chequerboard, border and corner artefacts, as shown in Figure 7.

POT
The purpose of JPEG-XR's POT is to decrease block artefacts caused by PCT. As such, its application area must interleave with that of PCT [18], as shown in Figure 8. Because of this interleaving, the image must be divided into two regions when POT is applied, and a different transform is applied to each: Figure 9a depicts the border region, where the 4 × 1 POT is applied, while Figure 9b portrays the interior, where the 4 × 4 POT is applied. The chequerboard and border artefacts are located where the 4 × 4 POT and the 4 × 1 POT were applied, respectively. Additionally, the corner artefacts are located at the four 2 × 2 corners where POT was not applied, as shown in Figures 10 and 11.
Through this analysis, we found the causes of the chequerboard, border and corner artefacts:
1. The 4 × 4 POT causes chequerboard artefacts
2. The 4 × 1 POT causes border artefacts
3. The four corners without POT application cause corner artefacts
Next, we will describe each of the three causes in detail.

The 4 × 4 POT: Chequerboard Block Artefacts
Based on [24], we know that the operations of the 4 × 4 POT on the 16 points a, b, c, d, e, f, g, h, i, j, k, l, m, n, o and p are separated into four stages:
1. Hadamard transform stage: $T_{HH}(f, g, j, k)$
2. Scaling stage
3. Rotation stage
4. Hadamard transform stage
The above stages use four rotational operators: $T_R$, $T_{HH}$, $T_{RR}$ and $T_S$. Figure 12a-d shows the implementations of $T_R$, $T_{HH}$, $T_{RR}$ and $T_S$ in DPK [26], respectively.
From [18], we know that the four rotational operators listed in Figure 12 are implemented by the lifting structure [24,25,27]. Its basic structure is as shown in Figure 13. The advantage of using the lifting structure is that the transform operations can be separated into simple addition, subtraction and multiplication (e.g., Figure 13a can be expanded into Equation (1)). Moreover, inverse transform operations can be achieved by simply reversing the operation order, and switching the pluses and minuses (e.g., Equation (2) is the result of expanding Figure 13b).
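The reversibility of a lifting structure can be illustrated with a small integer example. The multipliers and shifts below are hypothetical and are not the DPK operators of Figure 12; the sketch only shows that reversing the step order and swapping the signs recovers the input exactly, even though each step rounds.

```python
def lift_forward(a, b):
    """Three lifting steps; each updates one value using only the other."""
    b = b - ((a * 3 + 4) >> 3)   # step 1: b changes, a is untouched
    a = a + ((b * 3 + 2) >> 2)   # step 2: a changes, b is untouched
    b = b - ((a * 3 + 4) >> 3)   # step 3: b changes, a is untouched
    return a, b


def lift_inverse(a, b):
    """Same steps in reverse order with the signs swapped."""
    b = b + ((a * 3 + 4) >> 3)   # undo step 3
    a = a - ((b * 3 + 2) >> 2)   # undo step 2
    b = b + ((a * 3 + 4) >> 3)   # undo step 1
    return a, b


original = (37, -12)
transformed = lift_forward(*original)
print(transformed, lift_inverse(*transformed))
```

Because each step adds or subtracts a quantity computed from the value that is left unchanged, the same quantity can be recomputed and removed when decoding. If the transformed values are altered in between (as lossy quantization does), this exact cancellation no longer holds.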
Expanding the stages of the 4 × 4 POT in this way, we obtained the results shown in Equations (7) and (8).
When we substitute the same variable x for all of a to p, the result is Equation (9). Using the same method as Equation (2) and Figure 13b, the inverse 4 × 4 POT can be expanded; the result is Equation (10).
A look at Equations (9) and (10) shows that both the 4 × 4 POT and the inverse 4 × 4 POT result in a chequerboard-like distribution, as shown in Figure 14. To further understand how the chequerboard phenomenon occurs, we followed the encoding process described in Figure 4. First, the 4 × 4 PCT (represented as $f_{\mathrm{PCT}4\times4}$) was applied to the 4 × 4 POT result from Equation (9). Note that since the application areas of POT and PCT interleave (Section 2.2), the input array was displaced accordingly. The operation result is shown in Equation (11).
We know that PCT and DCT are similar [24]: an image is transformed from the spatial domain to the frequency domain, and the distribution of the transformed data depends on its frequency. For example, in Equation (11), the top left corner is Low-Pass (LP), while the others are High-Pass (HP). In Equation (11), the input values were chequerboard-like due to the previous 4 × 4 POT, so after the 4 × 4 PCT some weak signals were generated in the HP areas. Still, most of the signal was concentrated in the LP area, the top left of Equation (11). From [24], we know that JPEG-XR has a two-stage PCT, as shown in Figure 15. Each 4 × 4 block undergoes the first-stage PCT, resulting in one LP coefficient (light grey) and 15 HP coefficients (white). Then, the second-stage transform is applied to the 16 DC coefficients collected into a single 4 × 4 block. This yields 16 new coefficients, each corresponding to the LP coefficient of one of the original blocks. We now apply the second-stage 4 × 4 PCT to a 4 × 4 matrix filled with the LP value of the first-stage transform from Equation (11). The result is shown in Equation (12).
In Section 2.1, we learned that when QP > 1, values become distorted after the process of quantization and inverse quantization. We replace the LP value in Equation (11) with $LP_{S2}$, the value that has been processed with the second-stage 4 × 4 PCT, $C_0 = 11 \cdot x - 2.5$. We then used QP = 61 to perform quantization and inverse quantization on Equations (11) and (12) (written as $f_{Q.IQ}()$). The results are Equations (13) and (14), respectively. From the source code of DPK [26], we know that under 8-bit greyscale conditions, after the linear adjustment (offset) performed by the DPK program, |x| ≤ 2048. From Figure 3, we can see that when QP = 61, $f_{Q.IQ}(c)$ is nonzero only when |c| > 144 (c ∈ R). This means that for $f_{Q.IQ}(k \cdot x)$ to be nonzero (k ∈ R), we need |k| ≥ 144/2048, or |k| ≥ 0.0703125. From this, we learn that in the HP area of Equations (11) and (12), the values were set to zero by the $f_{Q.IQ}()$ calculation, as they were too small (< 0.0703125 * x). In the end, only the LP portion, $f_{Q.IQ}(C_0)$, in the top left corner remained, as shown in Equation (14).
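The threshold arithmetic above can be checked directly. In this sketch the constants 144 and 2048 are the values quoted in the text for QP = 61 and 8-bit greyscale input; the names `SURVIVAL_LEVEL`, `MAX_ABS_INPUT` and `may_survive` are hypothetical and only serve the illustration.

```python
MAX_ABS_INPUT = 2048   # bound on |x| after the DPK offset, as quoted in the text
SURVIVAL_LEVEL = 144   # magnitude a coefficient must exceed at QP = 61 (Figure 3)

# Smallest |k| for which k * x can still reach the survival level.
MIN_GAIN = SURVIVAL_LEVEL / MAX_ABS_INPUT


def may_survive(k):
    """True if a coefficient k * x can be nonzero after quantization."""
    return abs(k) >= MIN_GAIN


print(MIN_GAIN)  # 0.0703125
for k in (0.05, 0.2733, -0.1164):
    print(k, may_survive(k))
```

A coefficient with a gain below this bound is always quantized to zero, whatever the input, whereas a gain above it only means the coefficient can survive for sufficiently large |x|.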
The final result, Equation (17), can be rearranged into Figure 16. Assuming $C_2$ > 0, in greyscale representation the top left and lower right corners have higher values (light grey), while the top right and lower left corners have lower values (dark grey). This distribution resembles a chequerboard. We have now explained the cause of chequerboard artefacts: lossy quantization caused the HP signals to be set to zero, leaving only the LP signals (Equation (13)). Then, after the inverse 4 × 4 PCT, the results were evenly distributed, that is, all 16 values were nearly equal (Equation (16)). Finally, the inverse 4 × 4 POT was applied, producing a chequerboard-like pattern similar to Equation (10).
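The first part of this mechanism (weak HP signals are discarded, so the inverse core transform returns a flat block) can be sketched with a stand-in transform. The code below uses a 4 × 4 Hadamard transform instead of the PCT and a simple threshold instead of the DPK quantizer; both are simplifying assumptions, and the block values and threshold are made up for illustration.

```python
# Symmetric 4x4 Hadamard matrix; H * H = 4 * I, so the scaled 2-D transform
# below is its own inverse. It stands in for the PCT, which is not reproduced.
H = [[1, 1, 1, 1],
     [1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(block):
    """2-D Hadamard transform scaled by 1/4 (forward and inverse are identical)."""
    return [[v / 4 for v in row] for row in matmul(matmul(H, block), H)]


def drop_small(coeffs, threshold):
    """Crude stand-in for lossy quantization: zero every small coefficient."""
    return [[c if abs(c) > threshold else 0.0 for c in row] for row in coeffs]


# A slightly uneven (chequerboard-like) block: 101 and 99 alternating.
block = [[100 + (1 if (r + c) % 2 == 0 else -1) for c in range(4)] for r in range(4)]
coeffs = transform(block)                 # large LP value, one weak HP value
restored = transform(drop_small(coeffs, threshold=10))
print(coeffs[0][0], coeffs[1][1])         # 400.0 4.0
print(restored[0])                        # [100.0, 100.0, 100.0, 100.0]
```

The restored block is perfectly flat because the weak HP coefficient that encoded the unevenness was removed. In the actual codec, the inverse POT is then applied to such a flat block and reintroduces a chequerboard pattern, as described above.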

The 4 × 1 POT: Border Artefacts
The implementation of the 4 × 1 POT using the DPK program [26] is shown in Figure 17. Referring to the method of Equation (1) and Figure 13a, we used a, b, c and d to represent the four points, then substituted them into Figure 17 and expanded. The result is Equation (18). We then set a = b = c = d = x; the result after substitution is Equation (19).
Similarly, we can expand the calculation for the inverse 4 × 1 POT where the result is Equation (20).
We combined the results of Equations (9), (10), (19) and (20) according to the upper boundary application area of Figure 9. The result is shown in Figure 18. The top two rows are the area where the 4 × 1 POT was applied. The bottom four rows are where the 4 × 4 POT was applied. One can see that in Figure 18a, the value of the 4 × 1 POT is greater than that of the 4 × 4 POT below it (light grey). In Figure 18b, the value of the inverse 4 × 1 POT is less than that of the inverse 4 × 4 POT below it (dark grey). Next, let us examine how border artefacts are formed. Here, we again consider the interleaving between the application areas of POT and PCT (see Figure 18). To represent the actual state of the application at the upper image boundary, we took two horizontal 4 × 1 POT areas for the top two rows. The next two rows are the top half of the 4 × 4 POT area. This is shown in Equation (21).
Using the same steps as in Section 2.3, we applied the PCT, quantization, inverse PCT and inverse POT operations to Equation (21). The result is Equation (22). Considering the characteristics of quantization, when x > 0, $C_{4\times1} \geq C_{4\times4}$. This means we can write $C_{4\times1} = C_{4\times4} + K$, where K ≥ 0. Therefore, the result of Equation (22) can be rearranged into Figure 19. Assuming $C_{4\times1}$ > 0, in greyscale representation the top two rows (where the 4 × 1 POT was applied) have lower values (dark grey). We only took half (4 × 2) of the 4 × 4 POT area for the bottom two rows, so only half of the chequerboard is shown here, but all the values in this area are higher than those in the top two rows (lighter grey). We have now explained the cause of border artefacts: the inverse 4 × 1 POT and the inverse 4 × 4 POT produce different transformation values.
One can see from Equation (24) that the magnitude of the value change in the top left corner is greater because POT was not applied to it. Therefore, after undergoing PCT, more signal is generated in the HP area. Next, we applied quantization and inverse quantization to Equation (24) (represented by $f_{Q.IQ}()$). The result is Equation (25).
Note that the result of Equation (25) differs from those in Sections 2.3 and 2.4: the HP signals were not all set to zero. Since $f_{Q.IQ}(0.2733 \cdot x)$ and $f_{Q.IQ}(-0.1164 \cdot x)$ satisfy the aforementioned condition |k| ≥ 0.0703125, they were kept. The decoding result therefore differs from those in Sections 2.3 and 2.4, which is the reason for corner artefacts. We have now explained the cause of corner artefacts.

Improvement
From the analysis in Section 2, we discovered three problems in POT: (1) the 4 × 4 POT is "uneven" (its output values differ, i.e., are non-uniform), which causes chequerboard artefacts; (2) the transform result of the 4 × 1 POT is greater than that of the 4 × 4 POT, which causes border artefacts; and (3) POT is not applied to the 2 × 2 areas in the four corners, which causes corner artefacts. To solve these problems, we propose the use of a "uniform matrix". After the POT (encoding) step, the data are multiplied by the uniform matrix to even out the values; before the inverse POT (decoding) step, they are multiplied by the corresponding inverse uniform matrix. After applying our solution, the process is as described in Figure 20.

Improvement of 4 × 4 POT
As shown in Equation (9), the result of the 4 × 4 POT is "uneven". We seek a uniform matrix such that its Hadamard (element-wise) product with the 4 × 4 POT result becomes even. We took the average value of the 16 points that have undergone the 4 × 4 POT, 0.6887 * x, as the target "even" value. Each entry of the uniform matrix of the 4 × 4 POT is then this target value divided by the corresponding entry of Equation (9). The corresponding inverse uniform matrix is its element-wise reciprocal.
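The construction of a uniform matrix can be sketched as follows. The gain values in `HYPOTHETICAL_GAINS` are invented for illustration and are not the entries of Equation (9); only the procedure (target equals the mean, each entry equals target divided by gain, inverse equals reciprocal) follows the text.

```python
def uniform_matrix(gains):
    """Return (matrix, target) so that matrix[r][c] * gains[r][c] == target."""
    values = [g for row in gains for g in row]
    target = sum(values) / len(values)          # the "even" value: the mean gain
    return [[target / g for g in row] for row in gains], target


def inverse_uniform_matrix(matrix):
    """Element-wise reciprocal, used on the decoding side."""
    return [[1.0 / u for u in row] for row in matrix]


def hadamard_product(a, b):
    """Element-wise product of two equally sized matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Made-up chequerboard of gains (NOT the values of Equation (9)).
HYPOTHETICAL_GAINS = [[0.75 if (r + c) % 2 == 0 else 0.625 for c in range(4)]
                      for r in range(4)]

U, target = uniform_matrix(HYPOTHETICAL_GAINS)
evened = hadamard_product(U, HYPOTHETICAL_GAINS)
print(target, evened[0])
```

Multiplying the evened values by the inverse uniform matrix restores the original uneven gains, which is what allows the decoder to hand the inverse POT the values it expects.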
For ease of implementation, we converted the entries of the uniform matrix into fractions. We set the numerator to 256 ($2^8$) and took an approximate value for the denominator. This way, the multiplication can be implemented with integer data types and bit shifting, which maintains accuracy and decreases complexity. Finally, we obtain Equations (29) and (30), which are respectively the uniform matrix and the inverse uniform matrix of the 4 × 4 POT.
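One way such a fraction with numerator 256 could be applied in integer arithmetic is sketched below. The helper names, the rounding choices and the example factor 0.9167 are assumptions made for illustration; the actual entries are those of Equations (29) and (30), which are not reproduced here.

```python
def to_denominator(factor):
    """Approximate a uniform-matrix entry as 256 / d and return the integer d."""
    return round(256 / factor)


def apply_uniform(value, d):
    """Encoding side: value * 256 / d, using a shift and an integer division."""
    return ((value << 8) + d // 2) // d


def apply_inverse_uniform(value, d):
    """Decoding side: value * d / 256, using a multiplication and a shift."""
    return (value * d + 128) >> 8


# Rounding is to nearest for non-negative inputs; negative inputs would need
# an explicit sign convention, which this sketch does not define.
d = to_denominator(0.9167)           # hypothetical entry, gives d = 279
scaled = apply_uniform(1000, d)
print(d, scaled, apply_inverse_uniform(scaled, d))
```

Fixing one side of the fraction at a power of two means the inverse factor d / 256 needs only an integer multiplication followed by a right shift of 8 bits.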

Improvement of 4 × 1 POT
Next, we will attempt to improve the result of the 4 × 1 POT. From [24], we know that the 4 × 1 POT is a 1D transform, whereas the 4 × 4 POT is a 2D transform. This means that the 4 × 4 POT has one more dimension than the 4 × 1 POT, which causes differences in the transformation result. We would like to modify the 4 × 1 POT so that its transformation result is the same as that of the 4 × 4 POT. The easiest way is to apply the 4 × 4 POT instead of the 4 × 1 POT on the area, but the problem is that a 4 × 1 POT area has only four points, while the 4 × 4 POT needs 16 points. By observing Equations (7) and (8), we found the relation in Equation (31), which shows that the 4 × 4 POT is "symmetrical", as shown in Equation (32): if the 4 × 4 POT is performed on an input of 8 × 2 = 16 points consisting of eight symmetrical points a-h, the result is also symmetrical (A-H).
Due to the symmetrical characteristic of the 4 × 4 POT, we can combine a 4 × 1 POT area and its neighbouring 4 × 1 POT area to form 4 × 2 = 8 points, then expand the area to the 4 × 4 POT size through symmetrization (mirroring). Now, the 4 × 4 POT may be applied. The process is as shown in Figure 21. Similarly, we can take advantage of this characteristic when decoding to convert to the inverse 4 × 4 POT. To maintain the evenness of the 4 × 4 POT results, the uniform matrix mentioned in Section 3.1 (Equations (29) and (30)) must also be applied in the process described in Figure 21.
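The mirroring step can be sketched as below for a strip on the upper image boundary. The exact arrangement used in Figure 21 and Equation (32) is not reproduced in this text, so the reflection about the image edge shown here is an assumption, and `pot_4x4` is a placeholder for whatever 4 × 4 transform is supplied.

```python
def mirror_to_4x4(strip):
    """Expand a 2-row x 4-column border strip to 4 x 4 by reflection.

    strip[0] is the row on the image edge and strip[1] the row below it.
    The two added rows lie outside the image and mirror the two real rows.
    """
    edge_row, inner_row = strip
    return [list(inner_row), list(edge_row), list(edge_row), list(inner_row)]


def transform_border(strip, pot_4x4):
    """Apply a 4 x 4 transform to the mirrored block and keep the real rows."""
    result = pot_4x4(mirror_to_4x4(strip))
    return result[2:]   # rows 2 and 3 are the ones inside the image


strip = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
print(mirror_to_4x4(strip))
print(transform_border(strip, pot_4x4=lambda block: block))  # identity placeholder
```

If the supplied transform preserves this mirror symmetry, as the text states for the 4 × 4 POT, the discarded half carries no extra information and the kept rows are sufficient for decoding.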

Improvement of Corner Artefacts (2 × 2 POT)
From the POT application areas marked in Figure 9, we found that there is a 2 × 2 area in each of the four corners where POT is not applied. The values in these areas differ from those in the areas where POT is applied, which leads to block artefacts in the four corners. To ameliorate this, we used the implementation of the 2 × 2 POT found in DPK, as shown in Figure 22. Here, we again applied the aforementioned expansion method: we represent the four points with a, b, c and d, substitute them into Figure 22 and expand. The result is Equation (34). We then set a = b = c = d = x and substitute. Since we want the same result as that of the improved 4 × 4 POT and 4 × 1 POT (Sections 3.1 and 3.2), the target value was set to 0.6887 * x, from which the uniform matrix of the 2 × 2 POT follows.

Experimental Results
First, let us look at how a monochrome image is improved. As shown in Figure 23, the chequerboard, border and corner artefacts have been eliminated. Next, we tested using the six images in Figures 24-26 (all are 8 bpp greyscale images). We performed the JPEG-XR encoding and decoding process using two methods: Microsoft DPK [26] and our improved program from Section 3. The overlap parameter was fixed at L = 1, while QP was varied from 21 to 101 in steps of 20. Additionally, QP = 11 was included to observe the performance under a low compression ratio. PSNR and SSIM [28,29], calculated separately, were used to assess the coding efficiency. The results appear in Figures 27-31.
The two images in Figure 24 have wide white borders to simulate photo reproduction and scans. For these images, the results show that our improvement method yielded better performance when the compression ratio was medium to low (QP ≤ 41). PSNR increased by as much as 1.97 dB, while SSIM increased by as much as 0.00727. Moreover, when the compression ratio was low (QP ≤ 11), the resulting file size was as much as 10% smaller than with DPK.
Figures 28c, 29a, 30c and 31a are the results for the two images in Figure 25, a common advertising logo and a poster. The results show that PSNR increased by as much as 1.14 dB, while SSIM increased by as much as 0.141. Moreover, when QP = 11, the file size decreased by 5%-10%.
Figures 29b and 31b are the results for Figure 26a, the "Lena" picture (512 × 512 pixels). With our improvement method, the result was almost the same as with DPK when QP ≥ 50. There was little room for improvement because chequerboard, border and corner artefacts are insignificant for high-frequency images such as this one.
Figures 29c and 31c are the results for Figure 26b, an image of common sketches. Here, the results show that our improvement method yielded better performance when the compression ratio was medium to low (QP ≤ 40). PSNR increased by as much as 0.34 dB, while SSIM also increased slightly (by at most 0.00982).
Figure 27 is a partial visual comparison of the logo image. Figure 27c,d shows the result after linear contrast enhancement. One can clearly see the artefacts in the DPK result (c), and these have been completely eliminated by our proposed method (d).
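For reference, the PSNR figures quoted above follow the usual definition for 8-bit images, which can be sketched as below. The function name and the flat-list image representation are choices made for this illustration; SSIM is more involved and is not sketched here.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(decoded):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf          # identical images
    return 10 * math.log10(peak ** 2 / mse)


original = [100, 100, 100, 100]
decoded = [101, 99, 101, 99]      # every pixel is off by one grey level
print(round(psnr(original, decoded), 2))  # 48.13
```

Because PSNR is logarithmic in the mean squared error, a gain of about 2 dB corresponds to a reduction of the mean squared error by roughly 37%.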
From these experiments, we learn that under the same compression ratio, our improvement method yielded better results on low-frequency images (mostly white or monochrome, with few variations) such as those in Figures 24, 25 and 26b. On high-frequency images (fewer white or monochrome parts, large variations), our performance was about the same as that of DPK, as seen for the image of Lena (Figure 26a). As the chequerboard, border and corner artefacts were eliminated, the values of neighbouring pixels became more "even". This effectively increased the compression ratio and produced smaller files.

Conclusions
JPEG-XR encoding under lossy conditions causes chequerboard, border and corner artefacts. These phenomena are particularly noticeable at higher compression ratios (QP ≥ 41). We found that uneven POT results are the cause of these artefacts. We therefore propose an improvement based on a "uniform matrix", which makes the results of the 4 × 4, 4 × 1 and 2 × 2 POT "even". Experiments show that this method effectively ameliorates chequerboard, border and corner artefacts while yielding the same or better image quality than the original DPK program, with no increase in calculation complexity or file size. Under the same QP, file sizes may even be smaller.