Color Demosaicing of RGBW Color Filter Array Based on Laplacian Pyramid

In recent years, red, green, blue, and white (RGBW) color filter arrays (CFAs) have been developed to address low-light conditions. In this paper, we propose a new color demosaicing algorithm for RGBW CFAs using a Laplacian pyramid. Because the white channel is highly correlated with the red, green, and blue channels, the white channel is interpolated first using the color difference channels. After estimating the white channel, the red, green, and blue channels are interpolated using the Laplacian pyramid decomposition of the estimated white channel. Our proposed method using Laplacian pyramid restoration works with the Canon-RGBW CFA and any other periodic CFA. The experimental results demonstrate that the proposed method shows superior performance compared with conventional methods in terms of the color peak signal-to-noise ratio, structural similarity index measure, and average execution time.


Introduction
Most image acquisition devices use a color filter array (CFA) to acquire color images. A CFA is structured to obtain color information from a single sensor by sampling one color at each pixel; each pixel records the value of only one color from the absorbed light. Most commercial cameras use a CFA called the Bayer pattern [1], shown in Figure 1a. The image acquired through the CFA is called the mosaic image, and the missing pixels of each color channel must be restored to create a full-color image. This process is called color demosaicing or color interpolation, and it is an essential part of image signal processing.
Since the launch of commercial cameras, numerous color demosaicing algorithms have been developed. Interpolation-based methods such as bilinear and bicubic interpolation appeared in the very early stages of development, but they produce unwanted artifacts such as the zipper effect, false color, and blurring. To address these problems, advanced interpolation methods based on the high correlation between color channels have emerged; these methods use the color difference model [2,3]. In addition to the color difference model, several other approaches to color demosaicing have been developed, such as those based on regularization [4], frequency analysis [5,6], compressive sensing [7,8], and residual interpolation [9,10] using guided filtering [11]. With recent advances in computing hardware, methods have been proposed for interpolating the missing color channels on graphics processing units (GPUs) using convolutional neural networks (CNNs) [12,13]. Most complex demosaicing methods restore image details well, but at high computational cost, so they cannot be applied to industrial products. Therefore, it is important to achieve both real-time processing and detailed restoration.
Autonomous driving technology has developed along with the use of various image acquisition devices. The acquired images help make important judgments in autonomous driving, such as distance measurement, driving speed, and the direction of the vehicle.
The image acquisition device used for autonomous driving may be a LiDAR or a general camera. LiDAR accurately measures distance. However, due to the high price and low resolution of LiDAR, several commercial cameras are used in autonomous vehicles.
Despite advances in color demosaicing algorithms, the problem of low-light conditions remains. Because night driving accounts for a large part of autonomous driving, images in low-light conditions are important. In low-light conditions, noise is noticeable because the signal's power is reduced. Several methods have been developed to solve this problem. One is to increase the sensor's size; however, the sensor of a commercial camera cannot be enlarged sufficiently due to spatial constraints. Another is to increase the exposure time; however, a longer exposure time may cause motion blurring artifacts. Changing the pattern of the CFA to improve image quality in low-light conditions is yet another method. Numerous CFA patterns containing a white channel that receives the full range of visible light have been designed, such as the Kodak-RGBW CFA, shown in Figure 1b [14], the Sony-RGBW CFA, shown in Figure 1c [15], and the Canon-RGBW CFA, shown in Figure 1d [16]. These patterns mitigate the problem of image quality in low-light conditions because the spectral sensitivity of white is higher than that of red, green, and blue. The greater the number of white pixels in the CFA, the more resistant the CFA is to noise in low-light conditions. The contributions of this paper are summarized as follows. We propose an optimized color demosaicing algorithm for the Canon-RGBW CFA. Moreover, white channel interpolation based on the gradient of color difference models is proposed. We then propose color channel demosaicing using a Laplacian pyramid based on the structural similarity of the subsampled color channels. We compare existing state-of-the-art (SOTA) methods to our proposed method on real data and the Kodak and McMaster datasets [17] using noise-free and highly noisy images. Our method achieved higher image quality than the SOTA methods in terms of metrics such as PSNR and SSIM.
Our method is also applicable to other periodic CFAs.

Frequency Analysis of CFAs
In this study, it was necessary to select a CFA containing white that performs well in both normal-light and low-light conditions. If the ratio of white in a CFA is very high, the CFA is robust against extremely low-light conditions, but color bleeding becomes a problem; therefore, we needed a CFA with an appropriate ratio of white. Hao et al. [18] proposed analyzing various CFAs through the combination of luminance and chrominance in the frequency domain, and we use their method [18] to explain our selection of the Canon-RGBW CFA. That work [18] proposed several principles for a good CFA. The first is to make the distance between luminance and chrominance as large as possible in the frequency domain; the second is to design many repeated chrominances. The former principle enables accurate luminance estimation, and the latter enables accurate color estimation. The two conditions have a trade-off relationship with one another. Figure 2a shows the frequency analysis of the McMaster16 dataset for the Bayer pattern CFA, and Figure 2b shows the same analysis for the Canon-RGBW CFA; L stands for luminance, and C1, C2, and C3 stand for different chrominances in Figure 2. In Figure 2a, the distance between the luminance and chrominance is large, but there are few repeated chrominances. In contrast, Figure 2b shows a short distance between the luminance and chrominance but many repeated chrominances. Although the distance between luminance and chrominance is short in the frequency domain, the higher percentage of white in the spatial domain of the Canon-RGBW CFA allows a more accurate estimation of the white channel. Because the white channel has characteristics similar to luminance, this disadvantage can be offset. Furthermore, the Canon-RGBW CFA has a simple sampling period for the color channels, corresponding to an integer multiple of that of the Bayer pattern CFA.
It is an appropriate array to utilize with the Laplacian pyramid [19].
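This kind of frequency analysis can be illustrated with a short sketch. The function below (our own illustration, not the authors' code; the names are assumptions) mosaics an RGB image with a periodic CFA pattern and returns the magnitude of its 2-D DFT, in which luminance concentrates at the center and chrominance appears as replicas at the pattern's sampling frequencies:

```python
import numpy as np

def cfa_spectrum(rgb, pattern):
    """Mosaic an RGB image with a periodic CFA `pattern` (a ph x pw x 3
    binary array marking which channel each pixel samples) and return
    the magnitude of the mosaic's centered 2-D DFT."""
    H, W, _ = rgb.shape
    ph, pw, _ = pattern.shape
    # tile the pattern over the whole image and keep one sample per pixel
    mask = np.tile(pattern, (H // ph, W // pw, 1))
    mosaic = (rgb * mask).sum(axis=2)
    return np.abs(np.fft.fftshift(np.fft.fft2(mosaic)))
```

Plotting the returned spectrum for the Bayer and Canon-RGBW masks reproduces the luminance/chrominance layout discussed above.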

Traditional Methods of Color Demosaicing
Most conventional color demosaicing methods have been developed using color difference or residual domain. Both methods are performed in two steps for the Bayer pattern CFA. The first step is interpolating the green channel with the maximum color information. Then, other colors are interpolated using the interpolated green channel.
The color difference domain is based on the idea that the red and blue channels are highly correlated with the green channel, so demosaicing proceeds using the difference between green and another color. Interpolation is required for each color channel because not all pixel values are present; for example, in the Bayer pattern CFA, green pixels occupy half of the image, and red and blue pixels each occupy a quarter. To interpolate the missing pixel values in each color channel, either Hamilton and Adams' interpolation formula [20] or a bilinear filter is used. The color difference domain is calculated using the previously interpolated color channels, as follows:

∆_{G,R} = G − R,

where G and R are fully interpolated color channels and ∆_{G,R} is the color difference. The color difference ∆ is estimated using bilinear or edge-directional interpolation. Pei et al. proposed a method for generating color difference domains using bilinear interpolation [21], and Pekkucuksen et al. proposed a method for generating color difference domains by discriminating four directions [22]. The estimated color differences are added to the original pixel values to estimate the final green channel:

G(i, j) = R(i, j) + ∆_{G,R}(i, j) at red pixels.

Kiku et al. proposed the residual domain [9]. After estimating the green channel using the GBTF method [22], they used it as a guide image and obtained tentative red and blue channels through guided filtering [11], as follows:

R_tent = GF(G, R_CFA), B_tent = GF(G, B_CFA),

where tent means tentative, GF(input_A, input_B) means guided filtering, input_A is the guide image, and input_B is the filtering input image. They defined the difference between the original CFA and the tentative color image as the residual image:
R_resi = R_CFA − R_tent at red pixels, B_resi = B_CFA − B_tent at blue pixels.
Because R_resi and B_resi are masked images, we need to interpolate them to generate the full residual images R̃_resi and B̃_resi. The estimated color images R and B are calculated by the summation of the residual and tentative images, as follows:

R = R_tent + R̃_resi, B = B_tent + B̃_resi.

The demosaiced output generated using the residual domain is better than that generated using the color difference domain; however, the computational cost is high.
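The core of the color difference model can be illustrated in one dimension. The sketch below is our own simplified illustration (not the referenced implementations; the function name is an assumption): it recovers missing green samples by interpolating the slowly varying difference ∆ = G − R from the known green positions and adding it back to the red channel.

```python
import numpy as np

def color_difference_demosaic_1d(r_full, g_full, g_mask):
    """1-D sketch of the color difference model.

    r_full: fully interpolated red channel.
    g_full: green channel, valid only where g_mask is True.
    Returns the estimate G = R + Delta, where the color difference
    Delta = G - R (assumed to vary slowly) is linearly interpolated
    from the known green positions."""
    idx = np.arange(len(r_full))
    delta_known = g_full[g_mask] - r_full[g_mask]
    delta_hat = np.interp(idx, idx[g_mask], delta_known)
    return r_full + delta_hat
```

Even when the red channel itself fluctuates strongly, the difference channel stays smooth, which is why interpolating ∆ instead of G directly avoids most zipper artifacts.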

Laplacian Pyramid
A Laplacian pyramid consists of downsampled images and the high-frequency components of the image, which we call detailed images. Because only the downsampled image and the detailed image of each level are needed to restore the original, a Laplacian pyramid is commonly used for image compression and detail enhancement [23]. The detailed image at each level is the difference between the image at that level and its blurred version. For example, the blurred image W*_0 is the upsampled version of W_1, and the image W_1 is the downsampled version of W_0; the detailed image L_0 is the difference between W_0 and the blurred image W*_0. Figure 3 shows the flowchart of a Laplacian pyramid decomposition with depth level l = 2.
The downsampled image of the next level is computed by subsampling after Gaussian blurring:

W_{l+1}(x, y) = (g ∗ W_l)(2x + m, 2y + n),   (6)

where l is the index of the depth level, g(x, y) is a Gaussian kernel that creates blur, and (m, n) is the location value that indicates where the image is subsampled. When we create a detailed image L_l, we first upsample the downsampled image from the previous level:

W'_l(2x + m, 2y + n) = W_{l+1}(x, y), with all other positions set to 0.   (7)

We need two steps to upsample the downsampled image. The first step creates an image of double the size, in which the values are placed at the subsampled positions (m, n) and the remainder is filled with 0; this step is regarded as zero padding, and (7) expresses it. The second step creates the final upsampled image by low-pass filtering the zero-padded image:

W*_l = h ∗ W'_l,   (8)

where h(x, y) is an interpolation filter, such as the bilinear filter. The detailed image of the lth level is the difference between W_l and the blurred image W*_l:

L_l = W_l − W*_l.   (9)

As we can see in Figure 4, the subsampled color channels of the Canon-RGBW CFA have a structure similar to the Laplacian pyramid in Figure 3. If we subsample the Canon-RGBW CFA twice to create each color channel, we observe that it is structurally similar to a Laplacian pyramid decomposition with depth level l = 2. The only difference is that there is no Gaussian filtering in the downsampling, which means that g(x, y) is a Dirac delta function in (6). If we can estimate the detailed images of each color channel, such as the detailed images L^W_0 and L^W_1 in Figure 3, we can restore the full-resolution color image, analogously to reconstruction in the Laplacian pyramid.
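The decomposition and reconstruction steps described above can be sketched as follows. This is a minimal illustration under the paper's Canon-RGBW assumptions (Dirac-delta downsampling kernel, bilinear interpolation filter h); the function names are our own:

```python
import numpy as np

def downsample(img, m, n):
    """Subsample by 2 starting at offset (m, n); no Gaussian pre-blur,
    i.e. g(x, y) is a Dirac delta as in the Canon-RGBW case."""
    return img[m::2, n::2]

def upsample(img, shape, m, n):
    """Zero-pad to the target shape at offset (m, n), then apply a
    separable bilinear interpolation filter h = [0.5, 1, 0.5]."""
    up = np.zeros(shape)
    up[m::2, n::2] = img
    h = np.array([0.5, 1.0, 0.5])
    up = np.apply_along_axis(lambda r: np.convolve(r, h, mode='same'), 0, up)
    up = np.apply_along_axis(lambda r: np.convolve(r, h, mode='same'), 1, up)
    return up

def laplacian_pyramid(img, depth, m=0, n=0):
    """Return (detail images L_0..L_{depth-1}, coarsest image W_depth)."""
    details, cur = [], img
    for _ in range(depth):
        down = downsample(cur, m, n)
        blurred = upsample(down, cur.shape, m, n)  # W*_l
        details.append(cur - blurred)              # L_l = W_l - W*_l
        cur = down
    return details, cur

def reconstruct(details, coarse, m=0, n=0):
    """Invert the decomposition: add each detail back after upsampling."""
    cur = coarse
    for L in reversed(details):
        cur = upsample(cur, L.shape, m, n) + L
    return cur
```

Because each detail image stores exactly the residual removed at its level, reconstruction is lossless regardless of the filter h chosen.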

Proposed Algorithm
This section describes our proposed algorithm, which comprises two steps. First, we interpolate the white channel using the color difference model. Second, we interpolate the red, green, and blue channels using Laplacian pyramid reconstruction.

White Channel Interpolation
Most demosaicing methods designed for the Bayer pattern CFA begin by interpolating the green channel, whereas our algorithm starts by interpolating the white channel. First, the white channel has high sensitivity, is noise-resistant, and performs well in low-light conditions. Second, the white channel has a higher spatial resolution than the other channels; the Canon-RGBW CFA contains three times as many white pixels as pixels of the other colors, as depicted in Figure 1d, so the white channel is advantageous for edge direction determination. Third, the white channel is highly correlated with the red, green, and blue color channels, which is key to restoring the missing color pixel values.
We utilize the gradient-based-threshold-free (GBTF) algorithm [22] for the Canon-RGBW CFA. Because the GBTF algorithm is designed for the Bayer pattern CFA, we modify it to apply to the Canon-RGBW CFA. First, we interpolate the white and color pixel values in the horizontal and vertical directions. The directional estimates for the missing white and color pixel values are calculated as follows:

C^H(i, j) = (C(i, j − 1) + C(i, j + 1))/2 at white pixels,
W^H(i, j) = (W(i, j − 1) + W(i, j + 1))/2 at color pixels,   (10)

and analogously for the vertical direction, where W and C denote the white channel and a color channel (e.g., the red, green, or blue channel), and H and V denote the horizontal and vertical directions. Equation (10) can also be viewed as a bilinear filter for directional interpolation. The next step is to calculate the horizontal and vertical color differences:

∆^H_{W,C}(i, j) = W^H(i, j) − C^H(i, j), ∆^V_{W,C}(i, j) = W^V(i, j) − C^V(i, j).   (11)

The absolute gradients of the color difference are defined as follows:

D^H(i, j) = |∆^H_{W,C}(i, j − 1) − ∆^H_{W,C}(i, j + 1)|, D^V(i, j) = |∆^V_{W,C}(i − 1, j) − ∆^V_{W,C}(i + 1, j)|.   (12)

We use the absolute gradients of the color difference in (12) for the edge weight calculation. For white channel interpolation, we estimate the directional color differences in (11) and then combine them accordingly:

∆'_{W,C}(i, j) = (w_V ∆^V_{W,C}(i, j) + w_H ∆^H_{W,C}(i, j)) / (w_V + w_H).   (13)

For a local window size of 9 × 9, the weights for the vertical and horizontal directions are calculated as follows:

w_H = 1 / (Σ_{(a,b) ∈ N_{9×9}(i,j)} D^H(a, b))², w_V = 1 / (Σ_{(a,b) ∈ N_{9×9}(i,j)} D^V(a, b))²,   (14)

where N_{9×9}(i, j) denotes the 9 × 9 window centered at (i, j). Equation (13) describes directional adaptive low-pass filtering of the differences between color channels, because the color difference model assumes that the differences between color channels vary slowly. Finally, it is possible to refine the estimated value of a pixel by updating the initial color difference estimates using the weights of its four neighbors. The final color difference can be given as follows:

∆_{W,C}(i, j) = (w_N ∆'_{W,C}(i − 1, j) + w_S ∆'_{W,C}(i + 1, j) + w_W ∆'_{W,C}(i, j − 1) + w_E ∆'_{W,C}(i, j + 1)) / (w_N + w_S + w_W + w_E).   (15)

The weights (w_N, w_S, w_W, w_E) are calculated analogously to (14):

w_N = 1 / (Σ_{N_{9×5}(i−1,j)} D^V)², w_S = 1 / (Σ_{N_{9×5}(i+1,j)} D^V)², w_W = 1 / (Σ_{N_{5×9}(i,j−1)} D^H)², w_E = 1 / (Σ_{N_{5×9}(i,j+1)} D^H)².   (16)

The vertical direction weights (w_N, w_S) use a local window size of 9 × 5, and the horizontal direction weights (w_W, w_E) use a 5 × 9 window.
After calculating the final color difference in (15), we add it to the observed color pixel value. The final interpolated white channel is computed as W(i, j) = C(i, j) + ∆_{W,C}(i, j) at each color pixel.
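The gradient-based directional weighting described above can be sketched as follows. This is a simplified single-pixel illustration under our own reconstruction of the equations, not the authors' exact implementation; the function name and the epsilon guard are assumptions:

```python
import numpy as np

def combine_directions(delta_h, delta_v, i, j, half=4):
    """Combine horizontal and vertical color-difference estimates at
    pixel (i, j) with gradient-based edge weights.

    delta_h, delta_v: directional color-difference maps (W - C)."""
    # absolute gradients of the directional color differences
    Dh = np.pad(np.abs(delta_h[:, 2:] - delta_h[:, :-2]), ((0, 0), (1, 1)))
    Dv = np.pad(np.abs(delta_v[2:, :] - delta_v[:-2, :]), ((1, 1), (0, 0)))

    def window_sum(D):
        # sum over the (2*half+1) x (2*half+1) window centered at (i, j)
        return D[max(i - half, 0):i + half + 1,
                 max(j - half, 0):j + half + 1].sum()

    # inverse squared gradient sums as directional weights
    w_h = 1.0 / (window_sum(Dh) ** 2 + 1e-12)
    w_v = 1.0 / (window_sum(Dv) ** 2 + 1e-12)
    # weighted combination of the directional estimates
    return (w_v * delta_v[i, j] + w_h * delta_h[i, j]) / (w_v + w_h)
```

Along a strong horizontal edge the vertical gradients are large, so w_v shrinks and the horizontal estimate dominates, which is the intended edge-adaptive behavior.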

Red, Green, and Blue Channels Interpolation
The color difference model is based on the high correlation between each color channel. In general, a high correlation between two variables does not mean that they are the same. However, the smoothness in the color difference domain implies that the high-frequency components from different color channels are correlated and are similar to each other. The approximate equality of high-frequency components between different color channels is well-known in demosaicing [3,24].
The color channels are reconstructed based on Laplacian pyramid reconstruction, and we demonstrate that this is better than the conventional approach. Figure 5 shows examples of an RGB image (Figure 5a) and the corresponding color difference and Laplacian difference images (Figure 5b-e). The smaller the pixel values in a difference image, the higher the correlation between the channels. This means that the Laplacian difference model is more advantageous than the color difference model for color interpolation.
The white image W and the color image C consist of low-frequency components W_low and C_low and high-frequency components W_high and C_high, as follows:

W = W_low + W_high, C = C_low + C_high.   (18)

Because W*_0 is obtained by upsampling the downsampled version of the image W_0 in Figure 3, it can be considered a low-frequency component; thus, W_low in (18) can be regarded as W*_0. The detailed image in a Laplacian pyramid is the high-frequency component of the original image; therefore, W_high in (18) can be regarded as the detailed image L^W_0. We define the low- and high-frequency components for each depth level l as follows:

W_l = W*_l + L^W_l, C_l = C*_l + L^C_l.   (19)

We obtain the low-frequency components W_low and C_low by upsampling the downsampled images in the Laplacian pyramid; for lower computational cost, we use bilinear upsampling. Thus, we need to perform the Laplacian pyramid decomposition of the estimated white channel. If we interpolate the red channel, we choose the subsampling offset values m = 0 and n = 1 for the Canon-RGBW CFA:

(m, n) = (0, 1) for C = R,   (20)

with the offsets for the green and blue channels chosen analogously according to their positions in the pattern. After choosing the value of (m, n), we upsample the image W_2 to calculate the detailed image L^W_1, and we compute the detailed image L^W_0 in the same way. For the proposed color channel interpolation, we need the detailed images L^C_0 and L^C_1 for each color channel. Figure 6 shows the outline of the proposed interpolation for the color channels. As stated above, we assume that the high-frequency components shared between the color channels are approximately equal. Therefore, the high-frequency components L^C_0 and L^C_1 of a color channel can be approximated by those of the white channel, L^W_0 and L^W_1.
After acquiring a detailed image with W, the image goes through the process of restoration to a full-resolution image. We summarize the proposed color channel interpolation based on a Laplacian pyramid in Algorithm 1.

Algorithm 1: Color Interpolation using the Laplacian Pyramid
Input: the estimated white image W_0; the subsampled CFA C_l
Output: the reconstructed color image C_0, C ∈ {R, G, B}
1: Decide (m, n) according to (20)
2: Build the Laplacian pyramid of W with (m, n) and depth level l
3: while l > 0 do
4:   C_{l−1} ← upsample(C_l) + L^W_{l−1}
5:   l ← l − 1
6: end while
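Algorithm 1 can be sketched as below. This is our own illustration under the stated assumption that the high-frequency detail images are shared between the white and color channels; the function names and the bilinear upsampler are assumptions, not the authors' implementation:

```python
import numpy as np

def upsample2(img, shape, m, n):
    """Zero-pad at offset (m, n) to the target shape, then apply a
    separable bilinear filter h = [0.5, 1, 0.5]."""
    up = np.zeros(shape)
    up[m::2, n::2] = img
    h = np.array([0.5, 1.0, 0.5])
    up = np.apply_along_axis(lambda r: np.convolve(r, h, mode='same'), 0, up)
    return np.apply_along_axis(lambda r: np.convolve(r, h, mode='same'), 1, up)

def reconstruct_color(c_sub, white_details, offsets):
    """Algorithm 1 sketch: climb the pyramid from the subsampled color
    channel c_sub, adding the white channel's detail image at each
    level (high-frequency components assumed shared across channels).

    white_details: [L^W_0, ..., L^W_{depth-1}] (finest first).
    offsets: list of (m, n) subsampling offsets per level."""
    cur = c_sub
    for L, (m, n) in zip(reversed(white_details), reversed(offsets)):
        cur = upsample2(cur, L.shape, m, n) + L  # C_{l-1} = up(C_l) + L^W_{l-1}
    return cur
```

When the color channel's true detail images coincide with the white channel's (the idealized case), this reconstruction is exact; in practice the approximation error is small because the channels share high-frequency structure.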

Experiment Results
For the experimental evaluation of our proposed algorithm, we compared the quality of real captured images based on the Canon-RGBW CFA and the Bayer pattern CFA. The experimental results show that the RGBW CFA is more robust to noise than the Bayer pattern CFA in low-light conditions. Furthermore, spatial information could be recovered from the RGBW CFA results but not from the Bayer pattern CFA results. To obtain full-resolution images of the red, green, blue, and white channels, we used a filter wheel camera, which rotates clockwise or counterclockwise for filter selection and has four filters: red, green, blue, and white. We took full-resolution images of the red, green, blue, and white channels of the same scene using the four different filters. The exposure time was set to 1/60 s under 5 lux illumination. Using the full-resolution images of the four channels, we performed sampling to fit the Bayer and Canon-RGBW CFA patterns. The image acquired at 5 lux illumination was very dark, as shown in Figure 7a. We performed linear stretching, the simplest method of increasing pixel values (Figure 7b): we multiplied the raw pixel values by a large factor (in this case, 25) and then performed white balancing using the gray world method [25], the simplest white balancing method, as shown in Figure 7c. Figure 8a,b show that the Canon-RGBW CFA is more robust to noise than the Bayer pattern CFA, according to the results of applying the same RI method to the two CFAs. This means that a CFA containing a white channel is more robust to noise than the Bayer pattern CFA. Figure 8b,c show the results of applying the RI method and the proposed method with the same Canon-RGBW CFA. The result of the proposed method exhibits more high-frequency information than that of the RI method.
Furthermore, we compared the results of applying the same denoising method [26], BM3D (block-matching and 3D filtering), to Figure 8a-c. The BM3D result of the Canon-RGBW CFA contained more image details than that of the Bayer pattern CFA, as shown in Figure 8d,e, and the objects are better identified when the proposed method is applied, as shown in Figure 8e,f. The qualitative evaluation of the Kodak dataset without noise and with high noise is shown in Figure 9. In the noise-free dataset, CM1 shows good results in terms of color fidelity; however, there is a slight color error in the vertical direction of the fence and in the red letters on the signs. CM2 and CM3 show a color bleeding artifact: their difference maps are brighter than that of the original, a typical color loss phenomenon called color bleeding. CM2 and CM3 also show partially incorrect color estimation, as seen from the yellow pixels in each difference map. CM4 shows aliasing in the fence. CM5 shows good results; however, the red letters on the signs also show color bleeding. The proposed method based on the Laplacian pyramid shows results similar to CM5, but with reduced color bleeding. In other words, the proposed algorithm produces the best visual result. Figure 9h is the image generated by adding Gaussian noise with a standard deviation of σ_n = 0.03 to the original image. The noisy dataset shows a similar tendency to the noise-free condition. Most of the methods show good results, even in high-noise environments that can be considered low-light conditions. However, CM1 looks somewhat more resistant to noise than the proposed method. CM2 and CM3 still show color bleeding, and CM4 obtains the worst results. CM5 is similar to the proposed method but shows more color bleeding in the red letters on the signs. Figure 10 shows the differences between the original image and the results of Figure 9 along the red line in Figure 9a.
The red, green, and blue lines in Figure 10 are the difference pixel values of the red, green, and blue channels.

Figure 9. Visual comparison of the color demosaiced images from the noise-free and noisy datasets with σ_n = 0.03. The upper parts of each row show the enlarged images, and the lower parts show the difference maps. The first row shows the results of color demosaicing without noise, and the second row shows the results with noise of σ_n = 0.03: (a) ground truth, (b) CM1 [10], (c) CM2 [27], (d) CM3 [28], (e) CM4 [13,21], (f) CM5 [13], (g) PM; (h-n) use the same methods as (a-g).

In addition to the experiments on real captured images, we used the Kodak and McMaster datasets [17] for demosaicing comparison. The Kodak dataset consists of 24 color images of size 768 × 512, and the McMaster dataset consists of 18 color images of size 500 × 500. Because the Kodak and McMaster datasets have no original white pixel values, we assume that the value of a white pixel is the summation of the red, green, and blue pixel values at the same location; that is, W = (R + G + B). We conducted experiments with two types of degradation: the synthetic dataset was created without noise and with Gaussian noise having a standard deviation of σ_n = 0.03. An image in low-light conditions has low signal power and hence a small signal-to-noise ratio (SNR); it is equivalent to an image under normal-light conditions with added noise, so it can be assumed that as the standard deviation of the noise increases, the light in the environment decreases. We chose Gaussian noise with a standard deviation of σ_n = 0.03 because the average SNR values of the Kodak and McMaster datasets are 15.9 dB and 17.6 dB, respectively, which can be assumed to correspond to a low-light condition.
We used two metrics to measure the performance of the proposed method: the color peak signal-to-noise ratio (CPSNR) and the structural similarity index measure (SSIM) [29]. CPSNR is computed from the mean squared error between the original and estimated images, and SSIM evaluates the similarity of structures and features between the two images as perceived by the human visual system. The proposed method was compared with five existing methods, all implemented with the Canon-RGBW CFA. The first conventional method (CM1) was the RI method [10], which uses the residual model instead of the color difference model. The second (CM2) was Oh's method [27], a colorization-based method. The third (CM3) was Kim's method [28], which uses rank minimization with a colorization constraint. The fourth (CM4) and fifth (CM5) were based on demosaicnet [13], a deep learning method. Because there is no learning-based method for the Canon-RGBW CFA, we used a hybrid approach combining deep learning with existing color demosaicing methods: we applied deep learning to the Bayer-pattern-like arrangement of the red, green, and blue pixel locations in the Canon-RGBW CFA and thereafter applied handcrafted methods. In other words, demosaicnet was applied to a Bayer-pattern-like image subsampled once from the Canon-RGBW CFA. CM4 combined demosaicnet [13] with the color difference model [21], and CM5 combined demosaicnet [13] with the proposed method based on a Laplacian pyramid.
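For reference, the CPSNR metric (PSNR with the mean squared error taken jointly over all three channels) can be sketched as below; the function name and the peak default are our assumptions:

```python
import numpy as np

def cpsnr(ref, est, peak=1.0):
    """Color PSNR: 10*log10(peak^2 / MSE), with the MSE computed
    jointly over all pixels of all three color channels."""
    diff = ref.astype(np.float64) - est.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10((peak ** 2) / mse)
```

Averaging the squared error over the three channels before taking the logarithm is what distinguishes CPSNR from reporting a per-channel PSNR.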
The quantitative evaluation of the noise-free Kodak dataset in terms of CPSNR and SSIM is presented in Table 1. In Tables 1-3, the entries colored in vivid sky blue indicate the highest score, and the entries colored in light sky blue indicate the second-highest score. A high CPSNR value means that the original and estimated images are very similar, and a high SSIM value means that the estimated image is structurally similar to the original image. It can be observed that the proposed method shows improved performance: of all the methods, including the residual, colorization, rank minimization, and hybrid methods, the proposed method achieved most of the highest CPSNR and SSIM values on the Kodak dataset, as well as the best average CPSNR and SSIM. We summarize the average CPSNR and SSIM values for the Kodak and McMaster datasets without noise and with added noise of σ_n = 0.03 in Table 2. For the most part, the proposed method and the hybrid method that combines deep learning with the proposed Laplacian pyramid show good performance.
To evaluate the time complexity of the conventional methods and the proposed method, we estimated their average execution time on the Kodak dataset. The code for the conventional methods is available online. All the code was written in MATLAB and tested on MATLAB R2021a, except demosaicnet, which was built in PyTorch 1.8.1 and tested in an Ubuntu 16.04 environment (Python 3.8, CUDA 11.2). We used a desktop computer equipped with an Intel Core i7-11700K 3.6 GHz CPU, 32 GB of memory, and an Nvidia RTX-3090 GPU. The results are listed in Table 3. Because CM4 and CM5 are deep learning-based hybrid methods, their average execution time is the total time for which the CPU and GPU operate. Our proposed method consumes much less time than the conventional methods.

Conclusions
In this paper, we proposed a color demosaicing method for the Canon-RGBW CFA using a Laplacian pyramid. First, we interpolated the white channel using the color difference model and its gradients. The red, green, and blue channels were then restored using Laplacian pyramid decomposition and reconstruction. Our method can be utilized with any periodic CFA. The experimental results showed that the proposed method is superior to the conventional methods in terms of quantitative performance, visual quality, and computational complexity. As the experimental results also show, our method still exhibits color bleeding with the Canon-RGBW CFA; to alleviate this, we will experiment with adding appropriate regularization.