Compression for Bayer CFA Images: Review and Performance Comparison

Bayer color filter array (CFA) images are captured by a single-chip image sensor covered with a Bayer CFA pattern which has been widely used in modern digital cameras. In the past two decades, many compression methods have been proposed to compress Bayer CFA images. These compression methods can be roughly divided into the compression-first-based (CF-based) scheme and the demosaicing-first-based (DF-based) scheme. However, in the literature, no review article for the two compression schemes and their compression performance is reported. In this article, the related CF-based and DF-based compression works are reviewed first. Then, the testing Bayer CFA images created from the Kodak, IMAX, screen content images, videos, and classical image datasets are compressed on the Joint Photographic Experts Group-2000 (JPEG-2000) and the newly released Versatile Video Coding (VVC) platform VTM-16.2. In terms of the commonly used objective quality, perceptual quality metrics, the perceptual effect, and the quality–bitrate tradeoff metric, the compression performance comparison of the CF-based compression methods, in particular the reversible color transform-based compression methods and the DF-based compression methods, is reported and discussed.


Introduction
To save hardware costs, most color digital cameras employ single-sensor technologies with Bayer color filter array (CFA) patterns to capture real-world scenes. The four widely used 2 × 2 Bayer CFA patterns [1], namely Pat 1 = [G 1 , R 2 , B 3 , G 4 ], Pat 2 = [G 1 , B 2 , R 3 , G 4 ], Pat 3 = [R 1 , G 2 , G 3 , B 4 ], and Pat 4 = [B 1 , G 2 , G 3 , R 4 ], are shown in Figure 1a-d, respectively. For convenience, the captured Bayer CFA raw image is denoted by I Bayer , in which each pixel has only one R (red), G (green), or B (blue) color value, so that I Bayer consists of 25% R, 50% G, and 25% B color values. Because cameras have limited device resources, such as storage space and transmission capacity, compressing Bayer CFA images prior to storing or transmitting them is necessary. Without loss of generality, in our discussion of compressing I Bayer , we only consider the first Bayer CFA pattern in Figure 1a, but our discussion is also applicable to the other three CFA patterns. During the past two decades, many compression methods for I Bayer have been developed, and they can be roughly divided into two schemes, namely the compression-first-based (CF-based) scheme in Figure 2a and the demosaicing-first-based (DF-based) scheme in Figure 3a. In both compression schemes, at the client side, besides the reconstructed Bayer CFA image I rec,Bayer , the reconstructed RGB full-color image, which is obtained by demosaicing I rec,Bayer , is also used to evaluate the quality and quality-bitrate performance of the related compression methods. Note that the input Bayer CFA image I Bayer and the demosaiced RGB full-color image I demo,RGB at the server side of Figures 2a and 3a are used as the ground-truth Bayer CFA image and the ground-truth RGB full-color image, respectively.
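As a minimal sketch of how a Bayer CFA image with Pat 1 relates to an RGB full-color image, the following Python/NumPy snippet keeps one color value per pixel. The function name `mosaic_pat1` is hypothetical, and the layout is an assumption: the subscripts 1 to 4 are taken to follow raster order inside each 2 × 2 block (G 1 top-left, R 2 top-right, B 3 bottom-left, G 4 bottom-right).

```python
import numpy as np

def mosaic_pat1(rgb):
    """Sample an H x W x 3 RGB array into a Bayer CFA image.

    Assumed layout of Pat 1 = [G1, R2, B3, G4] in each 2 x 2 block:
        G1 R2
        B3 G4
    """
    h, w, _ = rgb.shape
    bayer = np.empty((h, w), dtype=rgb.dtype)
    bayer[0::2, 0::2] = rgb[0::2, 0::2, 1]  # G1: even rows, even columns
    bayer[0::2, 1::2] = rgb[0::2, 1::2, 0]  # R2: even rows, odd columns
    bayer[1::2, 0::2] = rgb[1::2, 0::2, 2]  # B3: odd rows, even columns
    bayer[1::2, 1::2] = rgb[1::2, 1::2, 1]  # G4: odd rows, odd columns
    return bayer
```

Under this layout, half of the retained samples are green and a quarter each are red and blue, which matches the 25% R, 50% G, and 25% B composition stated above.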

The Related CF-Based Compression Methods
At the server side of Figure 2a, the input Bayer CFA image I Bayer is first decorrelated to some subimages. In Figure 2b, the decorrelated subimages could be the reversible color transform-based (RCT-based) subimages [2][3][4][5][6][7][8][9], the wavelet transform-based (WT-based) subbands [10][11][12][13][14], or the prediction-based residuals [10,15].  In the RCT-based decorrelated subimages, the four RCT-based formats, namely the Y 1 Cr 2 Cb 3 Y 4 format proposed by Lee and Ortega [2] and implemented on JPEG [16], the YD g C o C g format proposed by Malvar and Sullivan [3] and implemented on JPEG-XR [17] and JPEG-2000 [18,19], the YLMN format proposed by Mohammed et al. [4] and implemented on Golomb-Rice codec [20] and JPEG-2000, and the Y∆C b C r format proposed by Richter and Fößel [8] and implemented on JPEG-XS [21], have received growing attention. For the YD g C o C g representation of I Bayer , Suzuki [12] proposed a lossless WT-based spectral-spatial transformation (WSST) approach to improve the bitrate performance. To improve WSST, Suzuki [14] proposed a weighted version by taking the edge directions into account. For improving the compression performance, Richter et al. [9] not only performed a nonlinear gamma correction on I Bayer but also deployed two white-balance constants into the two luma components in the Star-Tetrix transformation-based representation which was implemented on JPEG-XS [21].
Zhang and Wu [10] proposed a lossless merge- and residual-based method for compressing I Bayer . As a result, the rectangular compact green subimage, the red residual subimage, and the blue residual subimage are fed into the codec. Chung and Chan [15] proposed a lossless context matching-based prediction method in which the neighboring pixels of the current pixel are ranked when predicting it; this yields more accurate residual red and blue subimages and achieves a better compression performance relative to the method [10] on the Rice encoder. Zhang and Wu [10] performed a Mallat wavelet transform [22] on I Bayer , and then, the Golomb-Rice encoder [20] was utilized to encode the transformed wavelet coefficients. Lee et al. [11] proposed a camera-aware multi-resolution analysis (CAMRA) framework for compressing I Bayer . They leveraged the decorrelated wavelet coefficients and the image pipeline techniques at the server side. Later, Lee and Hirakawa [13] proposed a new shift-and-decorrelate lifting method to improve the compression performance of CAMRA.
We now explain why, among the above-mentioned CF-based compression methods, we only consider the RCT-based compression methods and select JPEG-2000 as the compression platform. In the literature, the RCT-based compression methods often served as the main comparative methods. Among the codecs used to evaluate the compression performance of the considered compression methods, JPEG-2000 is most favored. In addition, the prediction-based residual approaches [10,15] are lossless; the WT-based approaches [10][11][12][13][14] involve different codecs, such as the Golomb-Rice encoder and JPEG-2000, various WT-based computations, such as Haar, 5/3, and 9/7 wavelet transforms, varying matrix chain multiplications, and lifting operations. Accordingly, we take the RCT-based methods as the representatives of the CF-based methods. Section 2 will introduce the above-mentioned four RCT-based methods in detail. Furthermore, JPEG-2000 is adopted to evaluate the compression performance of the considered compression methods.

The Related DF-Based Compression Methods
In Figure 3a, at the server side of the DF-based compression scheme, the input Bayer CFA image I Bayer is first demosaiced to an RGB full-color image I demo,RGB which also serves as the ground-truth RGB full-color image. Section 3.1.1 will introduce how to demosaic I Bayer to I demo,RGB . Next, I demo,RGB is converted to a YCbCr image I YCbCr . Section 3.1.2 will introduce an RGB-to-YCbCr transformation. Then, a chroma 4:2:0 subsampling method is performed on the chroma image I CbCr to obtain a subsampled CbCr image I sub,CbCr whose size is a quarter of the original chroma image I CbCr . In Section 3.2, two kinds of chroma 4:2:0 subsampling approaches [23][24][25][26][27][28][29][30][31][32], namely the Bayer CFA pattern-independent approach and the Bayer CFA pattern-dependent approach in Figure 3b, will be introduced. Furthermore, based on the subsampled CbCr image I sub,CbCr and the luma image I Y , a luma modification method is performed on I Y to obtain a modified luma image I mod,Y . Section 3.3 will introduce the two related luma modification methods [33,34].
As a result, the subsampled CbCr image I sub,CbCr and the modified luma image I mod,Y are fed into the encoder. At the client side, a chroma upsampling method is first performed on the decompressed subsampled CbCr image to construct the upsampled CbCr image. Finally, a YCbCr-to-RGB transformation is performed on the upsampled YCbCr image to obtain the reconstructed Bayer CFA image I rec,Bayer , and then as mentioned before, I rec,Bayer is demosaiced to an RGB full-color image which serves as the reconstructed RGB full-color image. Among the codecs used to evaluate the compression performance of the considered DF-based compression methods, the Versatile Video Coding (VVC) platform [35] is most favored.

Motivation and Contribution
To date, in the literature, no review article for the above-mentioned CF-based compression scheme and the DF-based compression scheme for Bayer CFA images has been published. Besides that, no compression performance comparison of the two compression schemes has been reported. This motivated us to review the related CF-based compression works, in particular the related RCT-based compression works, and the DF-based works. It also motivated us to compare and discuss the compression performance of the considered compression methods on JPEG-2000 and the newly released VVC platform VTM-16.2.
In the RCT-based compression methods, the four main methods, namely the Y 1 Cr 2 Cb 3 Y 4 method [2], the YD g C o C g method [3], the YLMN method [4], and the Y∆C b C r method [8], are introduced in detail. To evaluate the compression performance of these compression methods on JPEG-2000 and VTM-16.2, considering simplicity and effectiveness, the three methods, namely the YD g C o C g method [3], the YLMN method [4], and the Y∆C b C r method [8], are selected as the representatives.
Based on the testing Bayer CFA images created from the Kodak, IMAX, screen content images (SCI), Videos, and classical images (CI) datasets, thorough experiments have been carried out for the above-mentioned representatives of the CF-based and DF-based compression schemes on JPEG-2000 and VTM-16.2. When setting the quantization parameter (QP) to 0 for VTM-16.2 and setting the compression ratio (CR) to 1 for JPEG-2000, in terms of the three popular quality metrics, namely the peak signal-to-noise ratio (PSNR), the structure similarity (SSIM) [36], and the feature similarity (FSIM) [37], the YD g C o C g method achieves the best PSNR performance, and the BIDM-OLM method is ranked second; the BIDM-OLM method achieves the best SSIM and FSIM performance, and the YD g C o C g method is ranked second. On JPEG-2000, in terms of the widely used quality-bitrate tradeoff metric, namely the Bjøntegaard delta (BD)-PSNR [38], the BIDM-OLM method always achieves the best BD-PSNR performance. On VTM-16.2, the YD g C o C g method achieves the best BD-PSNR performance under the high bitrate circumstance, while under the middle and low bitrate circumstances, the BIDM-OLM method achieves the best BD-PSNR performance. In addition, the perceptual quality comparison and the execution time requirement comparison are also made. Finally, some future works are addressed.
The remainder of this article is organized as follows. In Section 2, the related RCT-based compression works for I Bayer are introduced. In Section 3, the related DF-based compression works for I Bayer are introduced. In Section 4, the compression performance comparison between the two compression schemes is provided. In Section 5, some concluding remarks and future works are addressed.

The Reversible Color Transform-Based (RCT-Based) Compression Works for Bayer CFA Images
For I Bayer , we mainly introduce the four RCT-based compression methods: the Y 1 Cr 2 Cb 3 Y 4 method [2], the YD g C o C g method [3], the YLMN method [4], and the Y∆C b C r method [8]. Figure 4 depicts the relation between the 2 × 2 Bayer CFA block and the four RCT-based formats.

Lee and Ortega [2] proposed a Y 1 Cr 2 Cb 3 Y 4 method to decorrelate the input Bayer CFA image I Bayer into four subimages, namely the Y 1 , Cr 2 , Cb 3 , and Y 4 subimages, such that the converted image I Y 1 Cr 2 Cb 3 Y 4 can be better compressed by JPEG. First, each 2 × 2 Bayer CFA block B Bayer is converted to a 2 × 2 Y 1 Cr 2 Cb 3 Y 4 block in Figure 4b by Equation (1), where Y 1 and Y 4 denote the two converted luma components, and Cb 3 and Cr 2 denote the converted chroma components.
In other words, by Equation (1), one input Bayer CFA image I Bayer is converted into a Y 1 Cr 2 Cb 3 Y 4 image I Y 1 Cr 2 Cb 3 Y 4 which contains four subimages, namely the two luma subimages, I Y 1 and I Y 4 , constituting a quincunx-located luma image in Figure 5a, and the two chroma subimages, I Cb 3 and I Cr 2 . In order to produce a compact luma image containing I Y 1 and I Y 4 , every luma value in each even column of the quincunx-located image is shifted left to the odd column, and then, each even column is removed. Figure 5b shows the rectangular compact luma image. Next, the rectangular compact luma image is rotated 45 degrees clockwise to produce a rhombic compact luma image, which is placed at the center of the resultant compact luma image using a mirroring method. Finally, the resultant compact luma image is compressed by a shape-oriented JPEG [39], in which only the meaningful luma values of the resultant compact luma image are compressed. Similarly, the compact subimage I Cb 3 and the compact subimage I Cr 2 are compressed using the same codec. At the client side, the decompressed Y 1 Cr 2 Cb 3 Y 4 image is converted to the reconstructed Bayer CFA image I rec,Bayer by the inverse of Equation (1).

The YD g C o C g Method
Extending from the YC o C g R method [40], Malvar and Sullivan [3] proposed an effective YD g C o C g method. Considering each 2 × 2 Bayer CFA block B Bayer in Figure 1a, B Bayer is converted to a 2 × 2 YD g C o C g block by Equation (2), where the luma value Y is the average of the four Bayer CFA pixel-values in B Bayer . The chroma value D g equals the difference between the two green pixel-values in B Bayer . The chroma value C o equals the difference between the red pixel-value and the blue pixel-value, and the chroma value C g equals the difference between the average of the two green pixel-values and the average of the red pixel-value and the blue pixel-value. The inverse of the transformation in Equation (2) is given in Equation (3). Because the transformation in Equation (2) and the inverse transformation in Equation (3) only have the entries 0, 1, −1, 1/2, −1/2, and 1/4, according to Equation (3) in [3], only right-shift operations on integer values are needed in place of multiplications, leading to a low computational cost. On JPEG-2000, experimental results indicated the compression performance superiority of the YD g C o C g method over the "G channel merging plus color differences" method [10], which in turn outperformed the method that compresses the Bayer CFA image directly.
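The shift-only property can be illustrated with a small lifting sketch in Python. It follows the YC o C g R lifting pattern and is an assumption rather than the exact step order of [3]; the function names are hypothetical, and the sign conventions D g = G 1 − G 4 and C o = R 2 − B 3 are chosen for illustration.

```python
def ydgcocg_forward(g1, r2, b3, g4):
    """Integer lifting sketch of a Y-Dg-Co-Cg transform (assumed step order)."""
    co = r2 - b3              # red minus blue
    t = b3 + (co >> 1)        # approximately (R + B) / 2
    dg = g1 - g4              # difference of the two greens
    u = g4 + (dg >> 1)        # approximately (G1 + G4) / 2
    cg = u - t                # green average minus red/blue average
    y = t + (cg >> 1)         # approximately the mean of the four samples
    return y, dg, co, cg

def ydgcocg_inverse(y, dg, co, cg):
    """Undo the lifting steps in reverse order; exact for all integers."""
    t = y - (cg >> 1)
    u = cg + t
    g4 = u - (dg >> 1)
    g1 = g4 + dg
    b3 = t - (co >> 1)
    r2 = b3 + co
    return g1, r2, b3, g4
```

Each lifting step is undone by subtracting exactly what was added, so the round trip is lossless even though the right shifts discard fractional bits. For example, the block (G 1 , R 2 , B 3 , G 4 ) = (100, 50, 30, 104) maps to (Y, D g , C o , C g ) = (71, −4, 20, 62), and 71 is the mean of the four samples.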

The YLMN Method
Mohammed et al. [4] proposed a YLMN method for compressing I Bayer . Considering a 2 × 2 Bayer CFA block B Bayer , the method is based on the generalized S-transform [41], the lifting-based reversible color transformation [42], and the 2 × 2 unimodular matrix A in Equation (4), which is parameterized by α with 0 < α < 1. The best value of α is determined as 1/2, which yields Equation (5). Based on the differential pulse code modulation (DPCM) and the median edge prediction principle, the W r D r W b D b format in Equation (6) is obtained. To reduce the correlation between W r and W b , the W r D r W b D b format in Equation (6) is transformed to the YLMN format in Equation (7). Later, Mohammed and Wahid [5] slightly modified the YLMN format in Equation (7) by taking the floor function into account. According to the results proposed by Khan and Wahid [43], Rahman et al. [6] proposed a modified DPCM-based representation for compressing I Bayer in wireless capsule endoscopy applications.

The Y∆C b C r Method
Richter and Fößel [8] proposed a Y∆C b C r method by modifying the g d B 3 R 2 method [7]. After performing the 4 × 4 Haar wavelet transform on the four channels, g, d, B 3 , and R 2 , the Y∆C b C r format is expressed as

$$Y = \frac{G_1 + R_2 + B_3 + G_4}{4}, \qquad \Delta = G_1 - G_4, \qquad C_b = B_3 - \frac{G_1 + G_4}{2}, \qquad C_r = R_2 - \frac{G_1 + G_4}{2}. \tag{8}$$

In Equation (8), the Y component is the average of the four Bayer CFA pixels of B Bayer . ∆ denotes the difference between G 1 and G 4 . C b and C r equal the blue pixel B 3 and the red pixel R 2 minus the average of G 1 and G 4 , respectively.
Experimental results indicated that prior to encoding I Bayer , as a pre-processing step [8], performing a nonlinear gamma correction on I Bayer can achieve better compression performance when using the Y∆C b C r method.
To evaluate the compression performance of the RCT-based compression scheme for I Bayer , considering effectiveness, the three RCT-based methods, namely, the YD g C o C g method [3], the YLMN method [4], and the Y∆C b C r method [8], are selected as the representatives.

The Demosaicing-First-Based (DF-Based) Compression Works for Bayer CFA Images
For I Bayer , we introduce the related DF-based compression methods, and in particular, the related CSLM (chroma subsampling-then-luma modification) methods are introduced in more detail. We first introduce how to demosaic I Bayer to an RGB full-color image I demo,RGB , and then, we introduce how to convert I demo,RGB to a YCbCr image, I YCbCr . Finally, three representative methods are selected to evaluate the compression performance of the DF-based compression scheme.

Demosaicing I Bayer to I demo,RGB and Then Converting I demo,RGB to I YCbCr
In the DF-based compression scheme for encoding I Bayer , as depicted at the server side of Figure 3a, I Bayer is first demosaiced to an RGB full-color image I demo,RGB .

Demosaicing I Bayer to I demo,RGB
To demosaic I Bayer to I demo,RGB , a demosaicing method is performed on I Bayer to estimate the other two color channels of each Bayer CFA pixel [44][45][46][47].
Bilinear interpolation [48] is the simplest demosaicing method, in which the two unknown color channels of each Bayer CFA pixel are estimated by averaging the appropriate adjacent pixels. Kimmel [49] proposed a color difference-based demosaicing method using a template matching approach. Gunturk et al. [50] proposed a demosaicing method using an alternating projection approach. Pei and Tam [51] proposed a demosaicing method using a color correlation approach. Lu and Tan [52] proposed a demosaicing method using the spatial and spectral correlation among the neighboring pixels of each Bayer CFA pixel. Wu and Zhang [53] proposed a demosaicing method using the edge direction information and a soft-decision framework. Lukac and Plataniotis [54] proposed a demosaicing method using normalized color-ratio information.
Hirakawa and Parks [55] proposed an adaptive homogeneity-directed demosaicing method. Chung et al. [56] proposed a demosaicing method using several gradient edge-detection masks and an adaptive heterogeneity-projection technique. Using a generic variational approach, Condat [57] proposed a general demosaicing method for arbitrary CFA patterns. Yang et al. [58] proposed a color difference- and edge sensing-based demosaicing method for arbitrary CFA patterns. Zhang et al. [59] proposed a demosaicing method using local directional interpolation and nonlocal adaptive thresholding.
Kiku et al. [60] proposed a residual interpolation-based demosaicing method. In their method, the missing green values are first estimated by using a bilateral interpolation. Next, within each window, a linear relation with two parameters between the estimated green values and the collocated ground-truth red values is constructed, in which the number of equations is larger than 2. Then, a linear regression technique is applied to solve the two parameters involved in the linear relation. Using the two solved parameters, the missing red values are reconstructed. By the same argument, the missing blue values are reconstructed. To alleviate the spot artifact problem in [60], based on a multiple-window approach, Ye et al. [61] first constructed multiple linear systems, and then the average 2-parameter solution was used to estimate the missing red and blue values, leading to a better smoothing effect. In [62,63], convolutional neural network (CNN)-based demosaicing methods were proposed. Considering the fact that the green channel has twice the sampling rate and better quality than the red and blue channels in I Bayer , Guo et al. proposed a green channel prior-NET-based joint denoising and demosaicing method. Based on a progressive collaborative representation framework, Ni et al. [64] proposed multiple training-and-refining steps to improve the demosaicing performance.
Due to simplicity and effectiveness, as the first step of the DF-based compression method for I Bayer , Kiku et al.'s demosaicing method is adopted to demosaic the input Bayer CFA image I Bayer to an RGB full-color image I demo,RGB . In the next subsection, the conversion from the demosaiced RGB full-color image I demo,RGB to a YCbCr image I YCbCr is introduced.
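For orientation only, the simplest baseline mentioned above, bilinear interpolation [48], can be sketched in a few lines of Python/NumPy. This is not Kiku et al.'s method. The function names are hypothetical, the Pat 1 layout (G 1 top-left, R 2 top-right, B 3 bottom-left, G 4 bottom-right) is an assumption, and border pixels are handled here by renormalizing over the neighbors that exist, which is one of several possible choices. The image is assumed to have at least two rows and two columns.

```python
import numpy as np

# Separable bilinear weights; normalization happens per pixel below.
KERNEL = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float)

def _conv3(img):
    """3 x 3 weighted neighborhood sum with zero padding."""
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def bilinear_demosaic_pat1(bayer):
    """Estimate the two missing color channels at every Bayer CFA pixel."""
    bayer = np.asarray(bayer, dtype=float)
    h, w = bayer.shape
    masks = np.zeros((3, h, w))       # channel order: R, G, B
    masks[1, 0::2, 0::2] = 1          # G1
    masks[0, 0::2, 1::2] = 1          # R2
    masks[2, 1::2, 0::2] = 1          # B3
    masks[1, 1::2, 1::2] = 1          # G4
    rgb = np.empty((h, w, 3))
    for c in range(3):
        # Weighted average over the available samples of channel c.
        est = _conv3(bayer * masks[c]) / _conv3(masks[c])
        # Keep the captured sample where channel c was actually measured.
        rgb[..., c] = np.where(masks[c] == 1, bayer, est)
    return rgb
```

At an interior B 3 position, for instance, the green estimate is the mean of the four horizontally and vertically adjacent green samples, and the red estimate is the mean of the four diagonal red samples.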

Converting I demo,RGB to I YCbCr
After demosaicing I Bayer to I demo,RGB , I demo,RGB is further transformed to I YCbCr by using the BT.601-5 color conversion [65] in Equation (9). The converted YCbCr image I YCbCr is thereby separated into the luma image I Y and the chroma image I CbCr . Because the human visual system is less sensitive to chroma differences than to luminance differences, chroma subsampling on each 2 × 2 CbCr block B CbCr is naturally included prior to encoding the YCbCr image [66].
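A minimal Python/NumPy sketch of such a conversion pair is given below. The coefficients are the commonly quoted three-decimal studio-range BT.601 values and are an assumption here; the article's own equations may use a different rounding. The names `FWD`, `INV`, `OFFSET`, `rgb_to_ycbcr`, and `ycbcr_to_rgb` are hypothetical.

```python
import numpy as np

# Assumed studio-range BT.601 coefficients (three-decimal rounding).
FWD = np.array([[0.257, 0.504, 0.098],
                [-0.148, -0.291, 0.439],
                [0.439, -0.368, -0.071]])
INV = np.array([[1.164, 0.0, 1.596],
                [1.164, -0.391, -0.813],
                [1.164, 2.018, 0.0]])
OFFSET = np.array([16.0, 128.0, 128.0])

def rgb_to_ycbcr(rgb):
    """Convert an array of shape (..., 3) from RGB to YCbCr."""
    return np.asarray(rgb, dtype=float) @ FWD.T + OFFSET

def ycbcr_to_rgb(ycbcr):
    """Convert an array of shape (..., 3) from YCbCr back to RGB."""
    return (np.asarray(ycbcr, dtype=float) - OFFSET) @ INV.T
```

Because the tabulated coefficients are rounded, the round trip is only approximately the identity in the real domain, with errors well below one gray level for 8-bit inputs. Any gray input maps to Cb = Cr = 128, since each chroma row of `FWD` sums to zero.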

Chroma Subsampling
In this subsection, eight Bayer CFA pattern-independent chroma subsampling methods and five Bayer CFA pattern-dependent chroma subsampling methods are introduced.

In [25], an IDID chroma subsampling method was proposed, and at the client side, NEDI is adopted as the chroma upsampling process. Inspired by the palette mode used for screen content images (SCI) [68], in which each SCI has only a few dominant colors in the background, Wang et al. [26] proposed a JCDU chroma subsampling method, and the bicubic convolution interpolation (BCI) [69] is adopted as the upsampling process at the client side.
However, because the above eight Bayer CFA pattern-independent chroma subsampling methods do not take the Bayer CFA pattern into account, their compression performance is limited, and there is room to improve it.

The Bayer CFA Pattern-Dependent Chroma Subsampling Methods
In this subsection, we introduce the five state-of-the-art Bayer CFA pattern-dependent chroma subsampling methods: the direct mapping (DM) method [27], the COPY-based distortion minimization (CDM) method [28] and the two variants [29,30], and the bilinear interpolation-based distortion minimization (BIDM) method [31].

The Direct Mapping (DM) Method [27]
Before presenting the DM method [27], the YCbCr-to-RGB transformation, which is the inverse of the RGB-to-YCbCr transformation in Equation (9), is defined by Equation (10). Chen et al. [27] first observed that the R-color value is dominated by the luma value and the Cr value, and the B-color value is dominated by the luma value and the Cb value. In addition, from the 3 × 3 coefficient matrix in Equation (10), the Cb component has more influence on reconstructing the B pixel than on reconstructing the G pixel. By the same argument, the Cr component has more influence on reconstructing the R pixel than on reconstructing the G pixel. Consequently, the subsampled (Cb, Cr)-pair of B CbCr is set to (Cb 3 , Cr 2 ), where the relation between the subsampled chroma pair of B CbCr , i.e., (Cb 3 , Cr 2 ), and the 2 × 2 Bayer CFA pattern is depicted in Figure 7.

Lin et al. [28] first adopted the upsampling process "COPY", which is called the nearest neighbor (NN) upsampling process and is supported by some compression standards such as VVC [35], to duplicate the subsampled (Cb, Cr)-pair of B CbCr , denoted by (Cb s , Cr s ), as the four estimated chroma pairs of B CbCr at the server side. Next, they proposed a COPY-based 2 × 2 Bayer CFA block-distortion function to measure the distortion between B CbCr and the 2 × 2 estimated chroma block B est,CbCr ; the block-distortion is defined by Equation (11), with the coefficients a i and b i given in Equation (12). Applying the differentiation technique to Equation (11), the real-domain solution of (Cb s , Cr s ) is expressed as Equation (13). In Lin et al.'s copy-based block-distortion minimization (CDM) method [28], Equation (13) is used to determine the subsampled chroma pair of each 2 × 2 chroma block B CbCr ; experimental data indicated that the CDM method achieves better compression performance relative to most Bayer CFA pattern-independent chroma subsampling methods. Furthermore, according to the convex function definition in [70], Chung et al. [29] proved that the COPY-based 2 × 2 Bayer CFA block-distortion function in Equation (11) is a convex function because the determinant of the Hessian matrix of D Bayer (Cb s , Cr s ) in Equation (11) is equal to 66.1412 (>0). Then, using this convex function property, an iterative CDM (ICDM) method was proposed to obtain a better subsampled (Cb, Cr)-pair of B CbCr when compared with the CDM method [28].
Based on the same differentiation technique used in [28] but considering the 2 × 2 demosaiced RGB full-color block-distortion function, Lin et al. [30] derived that the subsampled chroma pair of B CbCr equals the average chroma pair of the four chroma entries of B CbCr . Furthermore, they proposed a "modified 4:2:0(A)" chroma subsampling method that selects the best case among the four average subsampled chroma pairs of B CbCr by considering the four combinations of the ceiling operation-based 4:2:0(A) and the floor operation-based 4:2:0(A). At the client side, the "modified 4:2:0(A)" method adopts the three neighboring (TN) reference pixels-based upsampling process [71]. However, in our experiment, the "modified 4:2:0(A)-COPY" method, where "COPY" denotes the copy interpolation, outperforms the "modified 4:2:0(A)-TN" method. Therefore, the "modified 4:2:0(A)-COPY" method is included among the comparative methods instead of the "modified 4:2:0(A)-TN" method.
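The COPY-based minimization can be sketched numerically in Python/NumPy. The sketch assumes the commonly quoted studio-range BT.601 inverse coefficients (−0.391 and −0.813 for G, 1.596 for R, 2.018 for B) and the Pat 1 block order [G 1 , R 2 , B 3 , G 4 ]; the function names are hypothetical. Since the luma is unchanged under COPY, each Bayer pixel error is linear in (Cb s , Cr s ), so the minimizer is an ordinary least-squares solution.

```python
import numpy as np

# Assumed inverse BT.601 chroma coefficients (studio range).
G_CB, G_CR = -0.391, -0.813   # contribution of Cb, Cr to G
R_CR, B_CB = 1.596, 2.018     # contribution of Cr to R and of Cb to B

# One row per Bayer pixel of Pat 1 = [G1, R2, B3, G4]:
# the sensitivity of that pixel to (Cb_s, Cr_s).
A = np.array([[G_CB, G_CR],
              [0.0, R_CR],
              [B_CB, 0.0],
              [G_CB, G_CR]])

def cdm_hessian():
    """Hessian of the block distortion ||t - A s||^2 with respect to s."""
    return 2.0 * A.T @ A

def cdm_subsample(cb, cr):
    """Real-domain (Cb_s, Cr_s) minimizing the COPY-based block distortion.

    cb, cr: the four chroma values of the block in raster order.
    """
    t = np.array([G_CB * cb[0] + G_CR * cr[0],
                  R_CR * cr[1],
                  B_CB * cb[2],
                  G_CB * cb[3] + G_CR * cr[3]])
    solution, *_ = np.linalg.lstsq(A, t, rcond=None)
    return solution
```

Under these assumed coefficients, the determinant of `cdm_hessian()` evaluates to about 66.1412, which agrees with the value quoted above and suggests that the assumed coefficients are the ones in play. For a block whose four chroma pairs are identical, `cdm_subsample` returns that common pair.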

The Bilinear Interpolation-Based Distortion Minimization (BIDM) Method
To improve the accuracy of COPY-based block-distortion function in Equation (11), Chung et al. [31] proposed a more effective bilinear interpolation-based (BI-based) 2 × 2 Bayer CFA block-distortion function. For simplicity, we just introduce it for each 2 × 2 Cb block B Cb . For convenience, let the subsampled Cb parameter of B Cb , denoted by Cb s , be located at (1, 0) in Figure 8. At the server side, we now describe how to express the estimated top-left entry of B Cb , denoted by Cb est 1 , as a function with the parameter Cb s . The functions for the other three estimated entries of B Cb , namely Cb est 2 , Cb est 3 , and Cb est 4 , can be derived similarly. After estimating the four entries of B CbCr , the BI-based 2 × 2 Bayer CFA block-distortion function can be derived.
To estimate Cb est 1 , which is located at (3/4, 1/4), the subsampled chroma parameter Cb s and the three neighboring subsampled Cb values of B Cb , namely Cb 1,1 located at (1, 1), Cb 0,1 located at (0, 1), and Cb 0,0 located at (0, 0), are referred to. Because the BI-based distortion minimization (BIDM) method is performed on each 2 × 2 chroma block in a raster scanning order, the three reference subsampled Cb values were obtained in advance. To estimate Cb est 2 , Cb est 3 , and Cb est 4 , some future neighboring reference subsampled Cb values are unknown, but they can be calculated by the 4:2:0(A) or the CDM chroma subsampling method, neither of which needs to reference any neighboring subsampled Cb values of the current 2 × 2 Cb block. In our experiment, we adopt 4:2:0(A) to calculate the future subsampled Cb values. Following the notations in Figure 8 and using the above BI-based approach [31], Cb est 1 is expressed as Equation (14). In general, the estimation of Cb est i , 1 ≤ i ≤ 4, is expressed as Equation (15).
After estimating the four chroma pairs of B CbCr , the estimated 2 × 2 CbCr block B est,CbCr , the collocated 2 × 2 luma block B Y , the Bayer CFA pattern in Figure 1a with Pat 1 = [G 1 , R 2 , B 3 , G 4 ], and Equation (10) are utilized together to reconstruct the estimated 2 × 2 Bayer CFA block B est,Bayer (= [G est 1 , R est 2 , B est 3 , G est 4 ]) at the server side. By Equations (9) and (15), the block-distortion D Bayer (Cb s , Cr s ) between each 2 × 2 Bayer CFA block and the corresponding 2 × 2 estimated Bayer CFA block is expressed as Equation (16), where a i and b i have been defined in Equation (12). Furthermore, the determinant of the Hessian matrix of D Bayer (Cb s , Cr s ) in Equation (16) equals 6.6216 (>0) [31], which implies the convexity of the positive definite block-distortion function D Bayer (Cb s , Cr s ) in Equation (16).
Using the differentiation technique, the real-domain solution of the subsampled (Cb, Cr)-pair of B CbCr is given in Equation (17). In the integer domain, the subsampled chroma solution in Equation (17) is taken as the initial solution of the iterative BIDM method for obtaining a better subsampled chroma solution of B CbCr . In the (k + 1)th iteration of BIDM, the previous subsampled chroma solution is replaced if a better solution exists among its eight neighboring subsampled chroma candidates. [...] [28], and ICDM-BI [29]. Considering the effectiveness of the above Bayer CFA pattern-dependent chroma subsampling methods, the CDM method, the "modified 4:2:0(A)" method, and the BIDM method are selected as the representatives to evaluate the compression performance of the DF-based compression scheme.
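The bilinear weights behind the estimate of Cb est 1 can be checked with a short Python sketch. It assumes standard bilinear interpolation on the unit square whose corners carry the four subsampled values, with Cb s at (1, 0) and the estimated entry at (3/4, 1/4) as stated above; the function names are hypothetical.

```python
from fractions import Fraction

def bilinear_weights(x, y):
    """Weights of the corner samples at (0,0), (1,0), (0,1), (1,1)
    for a point (x, y) inside the unit square."""
    return {
        (0, 0): (1 - x) * (1 - y),
        (1, 0): x * (1 - y),
        (0, 1): (1 - x) * y,
        (1, 1): x * y,
    }

def estimate_cb1(cb_s, cb_00, cb_01, cb_11):
    """Bilinear estimate at (3/4, 1/4), with cb_s located at (1, 0)."""
    w = bilinear_weights(Fraction(3, 4), Fraction(1, 4))
    return (w[(1, 0)] * cb_s + w[(0, 0)] * cb_00
            + w[(0, 1)] * cb_01 + w[(1, 1)] * cb_11)
```

Under this assumption, the weights are 9/16 for Cb s , 3/16 for each of Cb 0,0 and Cb 1,1 , and 1/16 for Cb 0,1 . If every estimated entry of the block depends on the block's own parameter with weight 9/16, the Hessian of the COPY-based distortion is scaled by (9/16)², so its determinant is scaled by (9/16)⁴: 66.1412 × (9/16)⁴ ≈ 6.6216, which agrees with the determinant reported for the BI-based distortion function.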

Luma Modification
After introducing the related chroma subsampling works in the CSLM methods, in this subsection, the optimal Bayer CFA pattern-dependent luma modification (OLM) method [33] is first introduced, and then, the difference between the OLM method and Chiu et al.'s non-optimal method [34] is highlighted. For easy exposition, the BIDM-OLM method is used to assist the introduction of the OLM method. After performing the BIDM method on each 2 × 2 chroma block B CbCr , the goal of the OLM method is to determine the best modified luma value Y i , 1 ≤ i ≤ 4, for the corresponding 2 × 2 luma block B Y such that the 1 × 1 Bayer CFA pixel-distortion can be minimized, achieving better quality of the reconstructed Bayer CFA image.
After performing the iterative BIDM method on the chroma block B CbCr , let the subsampled (Cb, Cr)-pair of B CbCr be denoted by (Cb Bayer , Cr Bayer ). Let the two chroma variables Cb i and Cr i , 1 ≤ i ≤ 4, in Equation (10) be replaced by Cb Bayer and Cr Bayer , respectively. It is intractable to search for a unique modified luma value Y i , 1 ≤ i ≤ 4, which satisfies the three equations in Equation (10) simultaneously. To derive the search interval for determining the best modified luma value Y i , 1 ≤ i ≤ 4, for each 2 × 2 luma block B Y , we consider Y R i , Y G i , and Y B i , which satisfy Equation (18). Solving each equation in Equation (18) yields Equation (19). By using the contradiction method, it has been proved that the modified luma value Y i can be found in the smaller interval [Low i , High i ] given in Equation (20), where ⌊·⌋ and ⌈·⌉ denote the floor function and the ceiling function, respectively. For 1 ≤ i ≤ 4, it can be verified that the condition "(High i − Low i ) = 2" holds, which indicates that the best modified luma value Y i can be determined in constant time such that the pixel-distortion value is minimal, where the pixel-distortion (PD) function is defined by Equation (21). For I Bayer , the previous method proposed by Chiu et al. [34] determined the modified luma value Y i by using the appropriate equality in Equation (19) directly, but it cannot guarantee that the determined modified luma value is the best. Experimental data revealed that the CDM-OLM method [33] can achieve at least 10 dB quality improvement when compared with the pure CDM chroma subsampling method [28].
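The constant-time search can be illustrated with a simplified Python sketch. It is not the exact interval [Low i , High i ] of the OLM method [33]: it assumes the studio-range BT.601 inverse coefficients, treats each Bayer pixel as carrying a single color, and simply compares the floor and the ceiling of the real-valued luma solution. All names are hypothetical.

```python
import math

# Assumed inverse BT.601 chroma coefficients (Cb weight, Cr weight) per color.
COEF = {"R": (0.0, 1.596), "G": (-0.391, -0.813), "B": (2.018, 0.0)}

def reconstruct(color, y, cb, cr):
    """Reconstructed color value of one Bayer pixel from (Y, Cb, Cr)."""
    k_cb, k_cr = COEF[color]
    return 1.164 * (y - 16) + k_cb * (cb - 128) + k_cr * (cr - 128)

def best_luma(color, target, cb, cr):
    """Integer luma in [0, 255] minimizing the squared pixel distortion,
    searched over the floor and ceiling of the real-valued solution."""
    k_cb, k_cr = COEF[color]
    y_real = 16 + (target - k_cb * (cb - 128) - k_cr * (cr - 128)) / 1.164
    candidates = sorted({min(255, max(0, f(y_real)))
                         for f in (math.floor, math.ceil)})
    return min(candidates,
               key=lambda y: (target - reconstruct(color, y, cb, cr)) ** 2)
```

Because the reconstructed value is linear in Y with slope 1.164, the residual error after this choice is at most 0.582 in the real domain whenever no clipping occurs. Picking only the rounded-down real solution could leave an error of up to 1.164.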
To evaluate the compression performance of the DF-based compression scheme for I Bayer , considering the effectiveness, the three selected representatives are CDM-OLM, "modified 4:2:0(A)"-OLM, and BIDM-OLM.

Experimental Results
Based on the ground-truth Bayer CFA images collected from the five datasets, namely Kodak, IMAX, SCI, Videos, and CI datasets, the quality and quality-bitrate tradeoff comparison of the reconstructed Bayer CFA and RGB full-color images by using the DF-based compression scheme and the RCT-based compression scheme for encoding Bayer CFA images are demonstrated. In addition, the execution time comparison is also provided.
All considered methods for compressing I Bayer are implemented on a computer with an Intel Core i7-6700 CPU 3.4 GHz and 24 GB RAM. The operating system is the Microsoft Windows 10 64-bit operating system. The program development environment is Visual C++ 2019. The compression standards used to evaluate the compression performance of the considered methods are JPEG-2000 [18,19] and the VVC platform VTM-16.2 [35].
Because it is difficult to access the testing RGB full-color image datasets with real image pipelining parameters, such as the gamma correction coefficients and white balance parameters, at the client side, the reconstructed RGB full-color image is obtained by demosaicing the decompressed reconstructed Bayer CFA image. As mentioned before, the demosaiced RGB full-color image I demo,RGB obtained at the server side is used as the ground-truth RGB full-color image for evaluating the quality of the reconstructed RGB full-color image obtained at the client side.

Quality Comparison and Discussion
When setting QP to zero for VTM-16.2 and setting CR to 1 for JPEG-2000, we first compare the quality performance of the reconstructed Bayer CFA and RGB full-color images between the DF-based compression scheme and the CF-based compression scheme for I Bayer . Secondly, the discussion of the compression comparison of the two compression schemes is provided.

Three popular objective quality metrics, namely PSNR, SSIM [36], and FSIM [37], are used to compare the quality performance of the reconstructed Bayer CFA images and the reconstructed RGB full-color images by using all the considered compression methods for Bayer CFA images. PSNR is used to evaluate the average quality of one reconstructed Bayer CFA image, and it is defined by
$$\mathrm{PSNR} = \frac{1}{N}\sum_{n=1}^{N} 10\log_{10}\frac{255^{2}}{\mathrm{MSE}}$$
where N denotes the number of the Bayer CFA images in one dataset; MSE (mean square error) equals
$$\mathrm{MSE} = \frac{1}{XY}\sum_{x=1}^{X}\sum_{y=1}^{Y}\left[I_{Bayer}(x,y) - I_{rec,Bayer}(x,y)\right]^{2}$$
where I Bayer denotes the ground-truth Bayer CFA image, I rec,Bayer denotes the reconstructed Bayer CFA image, and XY denotes the image size. First, the PSNR value of each dataset is calculated. Next, the average of the PSNR values of the five considered datasets is calculated as the average PSNR value of one reconstructed Bayer CFA image. By using Kiku et al.'s demosaicing method [60] to demosaic each reconstructed Bayer CFA image I rec,Bayer to a reconstructed RGB full-color image I rec,RGB , the CPSNR of I rec,RGB is defined by
$$\mathrm{CPSNR} = \frac{1}{N}\sum_{n=1}^{N} 10\log_{10}\frac{255^{2}}{\mathrm{CMSE}}$$
where CMSE equals
$$\mathrm{CMSE} = \frac{1}{3XY}\sum_{c\in\{R,G,B\}}\sum_{x=1}^{X}\sum_{y=1}^{Y}\left[I^{c}_{demo,RGB}(x,y) - I^{c}_{rec,RGB}(x,y)\right]^{2}$$
where I demo,RGB denotes the ground-truth demosaiced RGB full-color image obtained at the server side.
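The PSNR and CPSNR measurements described above can be sketched in a few lines of Python. This is a minimal sketch, assuming 8-bit samples (peak value 255), images stored as nested lists, and the usual mean-square-error form; the function names and the dictionary layout of the RGB image are hypothetical.

```python
import math

PEAK = 255.0  # assumption: 8-bit samples

def mse(ref, rec):
    """Mean square error between two equally sized 2-D images."""
    total, count = 0.0, 0
    for row_ref, row_rec in zip(ref, rec):
        for a, b in zip(row_ref, row_rec):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr_from_mse(m):
    """PSNR in dB; identical images give infinity."""
    return math.inf if m == 0 else 10.0 * math.log10(PEAK ** 2 / m)

def psnr(ref_bayer, rec_bayer):
    """PSNR of one reconstructed Bayer CFA image."""
    return psnr_from_mse(mse(ref_bayer, rec_bayer))

def cpsnr(ref_rgb, rec_rgb):
    """CPSNR of one RGB image given as {'R': img, 'G': img, 'B': img}."""
    cmse = sum(mse(ref_rgb[c], rec_rgb[c]) for c in "RGB") / 3.0
    return psnr_from_mse(cmse)

def dataset_average(values):
    """Average of per-image scores over one dataset."""
    return sum(values) / len(values)

if __name__ == "__main__":
    ref = [[10, 20], [30, 40]]
    rec = [[11, 21], [31, 41]]
    print(round(psnr(ref, rec), 2))
```

Averaging the per-channel MSE values equals the joint sum over all three channels divided by 3XY as long as the three color planes have the same size.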
SSIM is expressed as the product of the luminance mean similarity, the contrast similarity, and the structure similarity between I Bayer and I rec,Bayer . To measure the color SSIM (CSSIM) of I rec,RGB , the SSIM value of each c-color image of I rec,RGB , c ∈ {R, G, B}, is calculated, and then, the average of the three calculated SSIM values is used as the CSSIM value of I rec,RGB . FSIM utilizes the phase congruency and gradient magnitude to weight the local quality maps for obtaining a feature quality score of I rec,Bayer . The color FSIM (CFSIM) value of I rec,RGB is defined as the average of the three FSIM values calculated for the three color images of I rec,RGB . Interested readers are referred to the original papers [36,37] for the detailed definitions of SSIM and FSIM, respectively.
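The per-channel averaging behind CSSIM can be illustrated with the following Python sketch. It is a simplified illustration, not the metric of [36]: the SSIM here is computed once from global image statistics rather than over local windows, the constants follow the commonly used values K1 = 0.01 and K2 = 0.03 with a peak of 255, and the function names are hypothetical.

```python
def _flatten(img):
    """Turn a 2-D image (nested lists) into a flat list of floats."""
    return [float(v) for row in img for v in row]

def global_ssim(ref, rec, peak=255.0):
    """Single-window SSIM: luminance, contrast, and structure terms combined."""
    x, y = _flatten(ref), _flatten(rec)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1 = (0.01 * peak) ** 2  # stabilizes the luminance term
    c2 = (0.03 * peak) ** 2  # stabilizes the contrast/structure term
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))

def cssim(ref_rgb, rec_rgb):
    """Average of the three per-channel SSIM values ({'R','G','B'} -> image)."""
    return sum(global_ssim(ref_rgb[c], rec_rgb[c]) for c in "RGB") / 3.0

if __name__ == "__main__":
    plane = [[10, 50], [90, 130]]
    noisy = [[12, 47], [95, 120]]
    ref = {"R": plane, "G": plane, "B": plane}
    rec = {"R": noisy, "G": plane, "B": plane}
    print(cssim(ref, rec))
```

The same averaging pattern applies to CFSIM once a per-channel FSIM implementation is available.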
Based on the five testing datasets, when setting QP = 0 and CR = 1 for VTM-16.2 and JPEG-2000, respectively, Table 1 tabulates the average PSNR, SSIM, and FSIM performance of the reconstructed Bayer CFA images by using the three representative DF-based compression methods and the three representative RCT-based compression methods, where the individual results on VTM-16.2 and JPEG-2000 are tabulated in different rows. From Table 1, we observe that on VTM-16.2 and JPEG-2000, the YD g C o C g method always achieves the highest PSNR, and the BIDM-OLM method is always ranked second; the BIDM-OLM method achieves the highest SSIM and FSIM, and the YD g C o C g method is ranked second. The average CPSNR, CSSIM, and CFSIM values of the reconstructed RGB full-color images are tabulated in Table 2. From Table 2, we observe that on VTM-16.2 and JPEG-2000, the YD g C o C g method always achieves the highest CPSNR, and the BIDM-OLM method is ranked second; the BIDM-OLM method always achieves the highest CSSIM and CFSIM, and the YD g C o C g method is always ranked second.

Execution Time Requirement Comparison and Discussion
For each image, the average execution time (in seconds) to transform the input Bayer CFA image to the subsampled YCbCr image for the DF-based compression method or the RCT-based format for the CF-based compression method is tabulated in Table 3. Besides the average execution time of one image for each testing dataset, the average execution time of one image over all five datasets is listed in the last column of Table 3, namely "AVG". From Table 3, we observe that the dataset "Kodak" takes more time than the other four datasets because of its high resolution. Furthermore, the compression methods in the DF-based compression scheme take more time than the methods in the CF-based compression scheme. Because the average execution time of one image is always less than one second, it can be neglected when compared with the practical encoding time. It is worth noting that each DF-based method (or RCT-based method) can be accelerated by GPU-based parallel computation because each method can be decomposed into many independent subtasks.

Quality-Bitrate Tradeoff Comparison and Discussion
Under different QP and CR intervals, in terms of the BD-PSNR metric, the quality-bitrate tradeoff comparison for all considered compression methods for I Bayer is reported and discussed. To compare the visual effects for the considered compression methods for I Bayer , the decompressed Bayer CFA images are further demosaiced by using Kiku et al.'s method to produce the RGB full-color images.

The Quality-Bitrate Tradeoff Comparison
To show the average BD-PSNR performance comparison among the considered DF-based compression methods and RCT-based compression methods, we take the direct encoding of the testing Bayer CFA images as the baseline compression method. According to the reconstructed Bayer CFA images using one considered compression method, under the same bitrate requirement, the quality-bitrate tradeoff metric "BD-PSNR" [38] is used to report the average PSNR gain of the considered compression method over the baseline compression method.
On VTM-16.2, five QP intervals, namely [4,20], [12,28], [20,36], [28,44], and [36,51], are used to evaluate the BD-PSNR gains of the reconstructed Bayer CFA images by using the considered methods over the baseline method. On JPEG-2000, five CR intervals, namely [5,20], [15,30], [20,35], [25,40], and [30,45], are used for the same evaluation. In Table 4, we observe that on JPEG-2000, the BIDM-OLM method always achieves the best BD-PSNR performance for the five CR intervals. In the same table, on VTM-16.2, the YD g C o C g method achieves the best BD-PSNR gains for the two QP intervals [4,20] and [12,28]; the BIDM-OLM method achieves the best BD-PSNR gains for the two QP intervals [20,36] and [28,44]; and the modified 4:2:0(A)-OLM method achieves the best BD-PSNR gain for the QP interval [36,51]. It is worth noting that in the QP interval [36,51], the BD-PSNR gain of the BIDM-OLM method is quite competitive with that of the modified 4:2:0(A)-OLM method, and the difference is only 0.0049 (=3.6112 − 3.6063).
On VTM-16.2, the same five QP intervals are used to evaluate the BD-CPSNR gains of the reconstructed RGB full-color images by using the considered methods over the same baseline method. On JPEG-2000, the same five CR intervals are used for the same evaluation. In Table 5, we observe that the BIDM-OLM method always has the highest BD-CPSNR gains for the five CR intervals on JPEG-2000, and it has the highest BD-CPSNR gains for the three QP intervals [20,36], [28,44], and [36,51] on VTM-16.2. In the same table, we observe that the YD g C o C g method has the highest BD-CPSNR gains for the two QP intervals [4,20] and [12,28] on VTM-16.2.
In summary, the similar conclusions in Tables 4 and 5 reveal that on JPEG-2000, the BIDM-OLM method always has the highest BD-PSNR and BD-CPSNR gains for the five CR intervals; on VTM-16.2, the YD g C o C g method has the highest BD-PSNR and BD-CPSNR gains for low and middle QP intervals, and the BIDM-OLM method has the highest BD-PSNR and BD-CPSNR gains for middle and high QP intervals.
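The BD-PSNR comparison used in this subsection can be sketched as follows. This is a minimal Python sketch of the usual Bjøntegaard-delta procedure, assuming exactly four rate-PSNR points per curve: each curve is fitted with a cubic polynomial of PSNR against the base-10 logarithm of the bitrate, and the gain is the average vertical gap between the two fitted curves over their common log-bitrate range. The function names and the sample numbers are hypothetical and are not taken from Tables 4 and 5.

```python
import math

def _fit_cubic(xs, ys):
    """Coefficients c0..c3 of the cubic through four (x, y) points."""
    n = 4
    a = [[x ** k for k in range(n)] + [y] for x, y in zip(xs, ys)]
    for col in range(n):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [u - f * v for u, v in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]

def _integral(coeffs, lo, hi):
    """Definite integral of the polynomial with the given coefficients."""
    def primitive(x):
        return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))
    return primitive(hi) - primitive(lo)

def bd_psnr(rates_anchor, psnr_anchor, rates_test, psnr_test):
    """Average PSNR gain (dB) of the test curve over the anchor curve."""
    la = [math.log10(r) for r in rates_anchor]
    lt = [math.log10(r) for r in rates_test]
    lo, hi = max(min(la), min(lt)), min(max(la), max(lt))
    if hi <= lo:
        raise ValueError("the two curves share no common bitrate range")
    ca, ct = _fit_cubic(la, psnr_anchor), _fit_cubic(lt, psnr_test)
    return (_integral(ct, lo, hi) - _integral(ca, lo, hi)) / (hi - lo)

if __name__ == "__main__":
    rates = [100.0, 200.0, 400.0, 800.0]   # hypothetical bitrates
    baseline = [30.0, 33.0, 36.0, 38.0]    # hypothetical baseline PSNR values
    method = [31.0, 34.0, 37.0, 39.0]      # same curve shifted up by 1 dB
    print(bd_psnr(rates, baseline, rates, method))
```

A positive value means that, at equal bitrate, the considered method yields a higher PSNR on average than the baseline; replacing PSNR with CPSNR in the inputs gives the BD-CPSNR gain.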

The Visual Effect Comparison
This subsection shows the visual effect comparison among the considered compression methods for I Bayer . The testing image example in Figure 9a is taken from the 13th ground-truth IMAX image. After performing the CDM-OLM, modified 4:2:0(A)-OLM, BIDM-OLM, YD g C o C g , YLMN, and Y∆C b C r methods on the Bayer CFA image of Figure 9b, which is cropped from Figure 9a, the demosaiced RGB full-color images are illustrated in Figure 9c-h. Under VTM-16.2, for each method, from the two images on the left in Figure 9c-h, we clearly observe that the DF-based compression methods, particularly the BIDM-OLM method, have better color and texture preservation effects than the RCT-based compression methods. Under JPEG-2000, from the two images on the right in Figure 9c-h, we observe that the DF-based compression methods still outperform the RCT-based compression methods. The above observations indicate the visual effect merit of the DF-based methods, particularly the BIDM-OLM method, for one middle and high QP/CR case.
We now consider one low and middle QP/CR case: VTM-16.2 for QP = 24 and JPEG-2000 for CR = 20. After performing the six considered compression methods on the Bayer CFA image of Figure 9b, the demosaiced RGB full-color images are illustrated in Figure 10a-f, where for each method, the two demosaiced RGB full-color images on the left and on the right are under VTM-16.2 for QP = 24 and JPEG-2000 for CR = 20, respectively. For each method, from the two images on the right in Figure 10, under JPEG-2000, we observe that the DF-based compression methods, particularly the BIDM-OLM method, have better color and texture preservation effects in the tree, roof, and chimney parts when compared with the RCT-based compression methods. Under VTM-16.2 for each method, as shown in the two images on the left in Figure 10, we observe that the DF-based compression methods are quite competitive with the RCT-based compression methods. Note that in the roof part, the BIDM-OLM method has a better visual effect than the YD g C o C g method.
For low QP/CR cases, the visual effect comparison is omitted because the CPSNR values of the demosaiced RGB full-color images of the considered compression methods are too high for the images to be visually distinguished. For example, under VTM-16.2 for QP ≤ 20, the CPSNR values are often larger than or equal to 40 dB. For the results under the two codecs, VTM-16.2 and JPEG-2000, for different QP and CR values, readers are referred to the related experimental results on the above-mentioned two GitHub websites.

Conclusions and Future Works
We have introduced the compression-first-based compression methods, in particular the reversible color transform-based (RCT-based) compression methods, and the demosaicing-first-based (DF-based) compression methods for Bayer CFA images. Based on five datasets, thorough experiments have been carried out to compare the quality and quality-bitrate tradeoff performance of the RCT-based compression methods and the DF-based compression methods. To the best of our knowledge, this is the first time that such a compression performance comparison has been reported for the two compression approaches for I Bayer . Experimental results demonstrated that on JPEG-2000, the BIDM-OLM method always has the highest BD-PSNR and BD-CPSNR gains for different CR intervals. On VTM-16.2, the YD g C o C g method has the highest BD-PSNR and BD-CPSNR gains for low and middle QP intervals, and the BIDM-OLM method has the highest BD-PSNR and BD-CPSNR gains for middle and high QP intervals.
Some future works are addressed below. The first future work is to apply some image pipelining techniques, such as denoising, gamma correction, and white balancing, to the reconstructed Bayer CFA image at the client side to produce the reconstructed RGB full-color images for the above-mentioned two compression schemes for I Bayer . After that, the compression performance is examined. The second future work is to extend the results of this article to RGBW CFA images [72][73][74], which have been widely used in consumer markets and can receive more luminance than Bayer CFA images in low illumination conditions [75]. In this future work, the demosaicing method for RGBW images can be adopted from the methods reported in [57,76,77]. The third future work is to take the latest SSIM variants [78,79] into account for enhancing the quality comparison of the considered compression methods for Bayer CFA images.