Crosstalk Correction for Color Filter Array Image Sensors Based on Lp-Regularized Multi-Channel Deconvolution

In this paper, we propose a crosstalk correction method for color filter array (CFA) image sensors based on Lp-regularized multi-channel deconvolution. Most imaging systems with CFA exhibit a crosstalk phenomenon caused by the physical limitations of the image sensor. In general, this phenomenon produces both color degradation and spatial degradation, which are respectively called desaturation and blurring. To improve the color fidelity and the spatial resolution in crosstalk correction, the feasible solution of the ill-posed problem is regularized by image priors. First, the crosstalk problem with complex spatial and spectral degradation is formulated as a multi-channel degradation model. An objective function with a hyper-Laplacian prior is then designed for crosstalk correction. This approach enables the simultaneous improvement of the color fidelity and the sharpness restoration of the details without noise amplification. Furthermore, an efficient solver minimizes the objective function for crosstalk correction consisting of Lp regularization terms. The proposed method was verified on synthetic datasets according to various crosstalk and noise levels. Experimental results demonstrated that the proposed method outperforms the conventional methods in terms of the color peak signal-to-noise ratio and structural similarity index measure.


Introduction
Advancements in image acquisition devices, including camera imaging systems, have continually opened new application areas. In recent years, the rapid development of imaging systems fostered by mobile-camera technology has promoted the application of imaging systems in many fields, such as security, military, aerial imaging, satellite imaging, and autonomous vehicles. Accordingly, the demand for cost-effective high-performance imaging using low computational power has increased. To fulfill this need for efficiency in camera imaging systems, a single sensor based on a color filter array (CFA) is essential, instead of using three sensors or optical beam splitters.
In response to consumer perceptions and expectations of digital color image contents, attempts have been made to increase the spatial resolution engendered by the number of pixels in a limited-sized imaging system. Many signal-processing-based methods for single sensor architectures have been required to achieve high-resolution, high-sensitivity imaging. Studies on CFAs [1][2][3] and demosaicing methods [4][5][6][7][8] have been conducted to maximize CFA performance. To increase the spatial resolution in a limited-size image sensor, the pixel size should be reduced. However, this approach can compromise the sensitivity [9]. As a result, smaller pixels tend to generate more noise. Thus, denoising algorithms [10,11] are needed to solve such problems, which often occur in low-light situations. Although the effect on noise is minimized through signal processing, the issue caused by the pixel size reduction remains.
Another problem caused by the pixel size reduction is the crosstalk phenomenon. Specifically, it is caused by the physical limitations of the image sensor, that is, the leakage of photons and electrons between adjacent pixels [12,13]. As the pixel size continues to shrink, crosstalk between adjacent pixels emerges. In CFA image sensors, photon and electron leakage due to optical refraction and minority carriers inevitably occurs as the geometric distance between adjacent pixels decreases. In the obtained image, crosstalk typically generates two artifacts: desaturation and blurring. As shown in Figure 1, color constancy and preservation of the spatial resolution of the image thus cannot be achieved. Several attempts have been made to overcome this crosstalk phenomenon. The first approach is based on the hardware design of the sensor or CFA. To achieve better color reproduction, the CFA is designed to optimize the spectral responses of the entire imaging system by compensating for the crosstalk effects [14]. Use of backside illumination technology with a parabolic color filter decreases the crosstalk while increasing the efficiency [15]. New CFA patterns have been proposed to minimize crosstalk in a sensor with a small pixel size using a combination of primary and complementary color filters, such as yellow, cyan, and magenta, as in [12]. Although the crosstalk effect has been reduced through physical improvements, the image sensors experience difficulties as the pixel size decreases.
The second approach is based on signal processing. For example, the crosstalk phenomenon was analyzed by Hirakawa et al. using a spatio-spectral sampling theory, which was formulated by using a convolution operation [13]. This method assumes that the most significant artifact is desaturation; thus, it focuses on color correction. However, the color correction method that employs matrix inversion may have limitations owing to certain issues, such as blurry artifacts and noise amplification under low-light conditions.
To solve the desaturation and blurring problems together, a joint decrosstalk and demosaicing method was proposed [16]. This method is based on a piecewise autoregressive image model and performs demosaicing iteratively under crosstalk constraints. However, it is impractical because it cannot readily incorporate recently studied demosaicing algorithms, such as ARI [8], and it requires considerable computation. In addition, because this method focuses more on estimating the edge-directional weights required for demosaicing, its constraints do not sufficiently preserve edges. Therefore, we herein propose a crosstalk correction method that is based on multi-channel deconvolution with a hyper-Laplacian prior. It restores the image spatial resolution and color fidelity without producing obvious artifacts.
The main contributions of this study are the following:
• The crosstalk problem is formulated as a multi-channel degradation model;
• A multi-channel deconvolution method based on the objective function with a hyper-Laplacian prior is designed. The proposed method utilizes $L_p$ regularization to achieve the estimated image with sharp edges and details, and it efficiently suppresses noise amplification for each color component. Concurrently, intercolor regularization is employed to smooth the color difference components and to encourage the homogeneity of the edges;
• An efficient algorithm based on alternating minimization is described.
Experimental results validate that the proposed method is more robust than conventional methods.
The remainder of this paper is organized as follows. Section 2 presents the crosstalk analysis and problem formulation on the multi-channel degradation model. In Section 3, the proposed method with a hyper-Laplacian prior is detailed. Section 4 presents the experimental results. Finally, Section 6 concludes this paper.

Problem Formulation
An image sensor is based on either a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS). By using CFA and a microlens array, the image sensor converts the light received from the main lens into digital signals. As each color filter has its own spectrum, only one piece of color information is obtained from one pixel.
To estimate an original color image from an image subsampled according to the CFA pattern, demosaicing is used. It is intended to overcome the physical limitations of a single-sensor imaging system. Consider an original discrete color image, $x(n) = [x_R(n), x_G(n), x_B(n)]$, where $n \in \mathbb{Z}^2$, and $x_R(n)$, $x_G(n)$, and $x_B(n)$ are the color components of the red (R), green (G), and blue (B) channels, respectively. The subsampled image $y_s(n)$ according to the CFA pattern can be expressed as
$$y_s(n) = \sum_{k \in \{R,G,B\}} d_k(n)\, x_k(n) + e(n), \tag{1}$$
where $d_k(n)$ is the subsampling function of the CFA and $k \in \{R, G, B\}$. This function is periodic, and $d_R(n) + d_G(n) + d_B(n) = 1$ for each pixel. The term $e(n)$ is the corresponding signal-independent additive noise. During the process of obtaining a subsampled image, desaturation and blurring occur as a result of the crosstalk phenomenon. Figure 2 illustrates the structure of the imaging system. The light entering through the camera's main lens sequentially passes through the microlens and color filter and reaches the imaging sensor. In this process, interference from the surrounding pixels occurs on account of optical diffraction and minority carriers. Therefore, the image acquired by the imaging sensor suffers simultaneously from spatial and spectral degradation. As shown in Figure 2c, the spectral sensitivity shifts owing to crosstalk inside the imaging sensor. The observation model with the crosstalk phenomenon can be represented as follows:
$$y_{ct}(n) = \sum_{m} b(n, m) \sum_{k \in \{R,G,B\}} d_k(m)\, x_k(m) + e(n), \tag{2}$$
where $y_{ct}(n)$ is the subsampled image under the crosstalk phenomenon and $b(n, m)$ is the crosstalk function. This function represents a combination of optical diffraction and minority carriers. We assume that the crosstalk function is space-invariant.
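As an illustrative sketch of the sampling and crosstalk steps described above, the following NumPy code mosaics a color image and then blurs it in the subsampled domain. The RGGB Bayer layout, the periodic boundary handling, and the function names (`bayer_masks`, `mosaic`, `circular_conv2`, `observe`) are assumptions made for this example and are not specified in the text.

```python
import numpy as np

def bayer_masks(rows, cols):
    """Return the sampling masks d_R, d_G, d_B for an assumed RGGB Bayer layout."""
    d_r = np.zeros((rows, cols))
    d_g = np.zeros((rows, cols))
    d_b = np.zeros((rows, cols))
    d_r[0::2, 0::2] = 1.0
    d_g[0::2, 1::2] = 1.0
    d_g[1::2, 0::2] = 1.0
    d_b[1::2, 1::2] = 1.0
    return d_r, d_g, d_b

def mosaic(x, masks):
    """Keep one color sample per pixel; x has shape (rows, cols, 3)."""
    return sum(mask * x[..., k] for k, mask in enumerate(masks))

def circular_conv2(img, kernel):
    """Space-invariant 2D convolution with periodic boundaries via the FFT."""
    kr, kc = kernel.shape
    padded = np.zeros(img.shape)
    padded[:kr, :kc] = kernel
    # Move the kernel center to the origin so the output is not shifted.
    padded = np.roll(padded, (-(kr // 2), -(kc // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(padded)))

def observe(x, kernel, sigma_n, rng):
    """Return the noise-free mosaic and the crosstalk-degraded, noisy mosaic."""
    masks = bayer_masks(*x.shape[:2])
    y_s = mosaic(x, masks)
    noise = sigma_n * rng.standard_normal(y_s.shape)
    y_ct = circular_conv2(y_s, kernel) + noise
    return y_s, y_ct
```

Because the blur acts on the mosaic rather than on each color plane, every output pixel mixes samples from neighbors that carry different colors, which is the source of the simultaneous spatial and spectral degradation.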
Using matrix notation, the relation between the subsampled image and the original image is rewritten as follows:
$$y = BDx + e, \tag{3}$$
where $x = [x_R^T, x_G^T, x_B^T]^T$ denotes the three color component vectors of the original color image with $M$ rows and $N$ columns; that is, each $x_k$ is an $MN \times 1$ vector. Accordingly, $y$ and $e$ are $MN \times 1$ vectors, $D$ is the $MN \times 3MN$ subsampling matrix, and $B$ is the $MN \times MN$ crosstalk matrix. In general, it is important to solve the inverse problem according to the degradation model. If the observation model of Equation (3) were a linear shift-invariant (LSI) system, $BD$ could be replaced with $DB$ because the commutative property would hold. The special type of matrix associated with LSI systems is known as the Toeplitz matrix. In the above system, $B$ has this structure; however, $D$ is not a Toeplitz matrix. In other words, the subsampling operator $D$ violates the commutative property of convolution. Thus, to solve the problem in Equation (3), the inverse problem with respect to matrix $D$ should be solved after solving the inverse problem with respect to matrix $B$ while considering the noise $e$.
In contrast, the method in [13] addresses the crosstalk degradation by focusing on color fidelity after first inverting the subsampling matrix $D$ through demosaicing. As this method performs demosaicing first, it is not suitable for handling the model presented in Equation (3). Alternatively, a method that performs deconvolution ($B^{-1}$) in the subsampled domain prior to demosaicing ($D^{-1}$) should be considered, because it is important to solve the problems sequentially according to the model. However, performing deconvolution in the subsampled domain causes another problem. Although it can resolve the blur in Equation (3), it does not sufficiently separate the mixing between color channels. To overcome blur artifacts and maintain color fidelity, the method in [16] therefore uses an iterative approach that repeatedly performs demosaicing and deconvolution in the subsampled domain.
The crosstalk function is not a simple low-pass filter. Rather, it is a complex process in which the spectral information of the three color components and the spatial information of the surrounding pixels are simultaneously mixed in the subsampled domain, as shown in Figure 3. In particular, as the standard deviation of the Gaussian kernel increases, the image degradation due to the crosstalk phenomenon increases. As the noise increases, it becomes difficult to improve the color fidelity. Therefore, a new formulation is needed to solve the problem based on a multi-channel degradation model. The following multi-channel degradation model is considered:
$$y = DHx + e, \tag{4}$$
where the condition $BD = DH$ should be satisfied, and the multi-channel degradation matrix $H$ is defined by
$$H = \begin{bmatrix} H_{RR} & H_{RG} & H_{RB} \\ H_{GR} & H_{GG} & H_{GB} \\ H_{BR} & H_{BG} & H_{BB} \end{bmatrix}. \tag{5}$$
Here, $H_{kl}$ has dimension $MN \times MN$ for $k, l \in \{R, G, B\}$. The diagonal blocks $H_{kk}$ represent the within-channel kernels, and the off-diagonal blocks $H_{kl}$ with $k \neq l$ represent the cross-channel kernels. Matrix $H$ can be composed by utilizing the characteristic that each location and channel has a different kernel according to the CFA periodic structure. Figure 4a shows an example of configuring matrix $H$ using matrix $B$ and the CFA structure. As each color filter has its own spectrum, the crosstalk kernel is separated into three different kernels depending on the location of each color filter, even though they have the same shape as the low-pass filter. Thereafter, a total of nine kernels can be configured by separating each color channel. Note that the multi-channel degradation matrix $H$ changes according to the type of CFA pattern. In the case of an RGBW [3] or multispectral filter array [2], the number of spectral bands and the spatial periodicity of the CFA sampling pattern are different, and the formulation thus must be slightly changed. Figure 4b shows that different formulations (the single-channel and multi-channel degradation models) can yield the same degraded results.
By modeling the same phenomenon differently, we propose a method that can simultaneously solve complex spatial and spectral degradation through a multi-channel degradation model.
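To make the block structure of the multi-channel degradation concrete, the sketch below applies nine kernels to a full-color image, one kernel per pair of output and input channels. It assumes each block acts as a circular convolution and takes the nine kernels as given; how they are derived from the crosstalk kernel and the CFA layout is not reproduced here, and the names `kernel_fft` and `apply_multichannel` are hypothetical.

```python
import numpy as np

def kernel_fft(kernel, shape):
    """Frequency response of a small kernel under periodic boundaries."""
    kr, kc = kernel.shape
    padded = np.zeros(shape)
    padded[:kr, :kc] = kernel
    padded = np.roll(padded, (-(kr // 2), -(kc // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def apply_multichannel(kernels, x):
    """Apply a 3x3 grid of kernels; kernels[k][l] maps input channel l to output k.

    x has shape (rows, cols, 3) and the result has the same shape.
    """
    shape = x.shape[:2]
    x_hat = np.fft.fft2(x, axes=(0, 1))
    out = np.zeros(x.shape, dtype=float)
    for k in range(3):
        acc = np.zeros(shape, dtype=complex)
        for l in range(3):
            # Within-channel blur when k == l, cross-channel leakage otherwise.
            acc += kernel_fft(kernels[k][l], shape) * x_hat[..., l]
        out[..., k] = np.real(np.fft.ifft2(acc))
    return out
```

With delta kernels on the diagonal and zeros elsewhere the operator is the identity; non-zero off-diagonal kernels introduce the color mixing that a single-channel blur cannot represent in the full-color domain.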

Proposed Method
In this section, we propose an efficient algorithm for multi-channel deconvolution based on the multi-channel degradation model described above. To overcome the ill-posed problem, effective regularization terms are employed to restrict the solution space. The proposed method utilizes $L_p$ regularization to achieve the estimated image with sharp edges and details. It efficiently suppresses noise amplification for each color component. Concurrently, intercolor regularization is employed to smooth the color difference components and to encourage the homogeneity of the edges. Thereafter, we describe an efficient solver to minimize the cost function regularized by specific prior information.

Multi-Channel Deconvolution
The multi-channel deconvolution problem is to obtain an estimate of $x$ given $y$, $D$, $H$, and prior knowledge of $e$. Equation (4) can be rewritten as
$$y = \sum_{k \in \{R,G,B\}} D_k H_k x + e, \tag{6}$$
where $D_k$ is the $MN \times MN$ block of $D = [D_R, D_G, D_B]$ that samples channel $k$, and matrix $H_k$ is given by
$$H_k = [H_{kR}, H_{kG}, H_{kB}], \tag{7}$$
that is, the $k$th block row of $H$, with dimension $MN \times 3MN$. The regularized solution $\hat{x}$ of the multi-channel deconvolution problem in (6) is defined in Equation (8) as the minimizer over $x$ of the sum of two terms. The first term, called the data fidelity term, measures the Euclidean distance between the degraded original image and the observed image. The second term, $R(x)$, is the regularization term based on the prior model. The regularization parameters $\lambda_k$ manage the trade-off between the two terms. As each color channel suffers from degradation under the same conditions, it is assumed that $\lambda_k$ has the same value for each channel.
The data fidelity term can generally be defined as the $L_2$-norm, which is known as the least-squares approach. Although the $L_1$-norm increases the robustness to outliers such as noise or errors, the $L_2$-norm is employed in this study because the error term is generally modeled as Gaussian noise and the crosstalk kernel is known in advance. As the crosstalk kernel is not signal-dependent, it can be parameterized through experiments. Moreover, it is generally assumed to be a Gaussian kernel [14]. In addition, $L_2$-norm data fidelity enables solutions to be found at low computational cost [17].
The regularization term is herein adopted to decrease the uncertainty in the inverse problem by constraining the solution space. The regularization term can be expressed as a combination of several constraints:
$$R(x) = \sum_{i} \lambda_i R_i(x), \tag{9}$$
where $\lambda_i$ is the regularization parameter for each constraint function $R_i(x)$. Tikhonov regularization is a representative smoothing constraint that assumes that the distribution of gradient magnitudes for images is smooth [18]. This strategy decreases the uncertainty using the $L_2$-norm and is defined as follows:
$$R(x) = \|Cx\|_2^2, \tag{10}$$
where $C$ denotes the second derivative linear operator, which is known as the Tikhonov matrix. Tikhonov regularization smooths the estimated image by limiting its high-frequency components. However, this regularization strategy inevitably oversmooths some important information, such as edge sharpness and detail.
To reconstruct the estimated image while preserving the sharp edges, total variation (TV) regularization is introduced in [19]. TV regularization is one of the most successful techniques for reconstructing images. In the field of denoising, various noise types are effectively removed using this regularization strategy [20]. Unlike Tikhonov regularization, TV regularization is employed in the image restoration process under the assumption that the image gradient follows a Laplace distribution. Therefore, high-frequency information, such as edge information or details, can be effectively restored. TV regularization is defined as follows:
$$R(x) = \|\nabla x\|, \tag{11}$$
where $\nabla$ denotes the first derivative linear operator. TV regularization is isotropic if the norm $\|\cdot\|$ denotes the $L_2$-norm, and it is anisotropic if it denotes the $L_1$-norm. Although TV regularization exhibits excellent performance in various image processing fields, recent works have focused on constraints that take the image characteristics into consideration. According to research on the properties of natural images [21,22], the gradient statistics of real-world scenes largely follow a heavy-tailed distribution, which is modeled as a hyper-Laplacian prior ($p(x) \propto \exp(-k\|\nabla x\|_p^p)$). This prior is incorporated as $L_p$ regularization, and it is defined as follows:
$$R(x) = \|\nabla x\|_p^p, \tag{12}$$
where $\|\cdot\|_p^p$ denotes the $L_p$-norm raised to the power $p$, and the value of $p$ is typically chosen in the range [0.5, 0.8].
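The different behavior of the three regularizers can be illustrated with a small numerical comparison. The sketch below is only an example under stated assumptions: forward differences with periodic boundaries stand in for the first derivative operator, a five-point discrete Laplacian stands in for the second derivative operator, and the function names are hypothetical.

```python
import numpy as np

def gradients(img):
    """Forward differences (horizontal, vertical) with periodic boundaries."""
    gx = np.roll(img, -1, axis=1) - img
    gy = np.roll(img, -1, axis=0) - img
    return gx, gy

def tikhonov(img):
    """Squared L2-norm of a five-point discrete Laplacian."""
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4.0 * img)
    return np.sum(lap ** 2)

def tv_anisotropic(img):
    """L1-norm of the gradient components."""
    gx, gy = gradients(img)
    return np.sum(np.abs(gx) + np.abs(gy))

def tv_isotropic(img):
    """Sum of per-pixel gradient magnitudes."""
    gx, gy = gradients(img)
    return np.sum(np.sqrt(gx ** 2 + gy ** 2))

def lp_penalty(img, p=2.0 / 3.0):
    """Sum of |gradient|**p, i.e. a hyper-Laplacian penalty for p < 1."""
    gx, gy = gradients(img)
    return np.sum(np.abs(gx) ** p + np.abs(gy) ** p)

# Two test images with the same total rise: a sharp bar and a gradual bump.
row_step = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
row_ramp = np.array([0, 0, 0, 0, .25, .5, .75, 1, 1, .75, .5, .25, 0, 0, 0, 0])
step = np.tile(row_step, (4, 1))
ramp = np.tile(row_ramp, (4, 1))
```

For these two images the TV penalties coincide, the Tikhonov penalty is larger for the sharp bar, and the $L_p$ penalty with $p = 2/3$ is smaller for the sharp bar, which is consistent with the stated preference of the hyper-Laplacian prior for sharp edges.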

Constraints
In the spirit of a hyper-Laplacian prior, the proposed constraint consists of spatial regularization to encourage high frequencies within each color channel. It additionally includes intercolor regularization to enforce the homogeneity of the edges or details across the color channels. The first constraint $R_1(x)$ promotes sharp edges in the estimated image. If the prior knowledge of the image gradient is considered, an optimal solution can be found. Figure 5a presents the probability density of the first-order derivative using 42 images from the Kodak and McMaster datasets [23]. The empirical distribution can be modeled as a hyper-Laplacian distribution with $p = 0.66$. This property allows a regularization term to employ a hyper-Laplacian prior in the deconvolution problem. Therefore, $R_1(x)$ is represented as follows:
$$R_1(x) = \|\nabla x\|_p^p, \tag{13}$$
where $\nabla$ denotes the first derivative operator and $p = 0.66$. The second constraint $R_2(x)$ is chosen to encourage smoothness in the differences between the color components. In most demosaicing methods, color interpolation is performed under the assumption of a high correlation among the three color bands [4][5][6][7][8]. This property is also valid for natural color images. The optimal $L_p$-norm can be selected through the probability density of the color difference gradient, as presented in Figure 5b. Similar to the case of $R_1(x)$, a hyper-Laplacian distribution with $p = 0.62$ allows the regularization term to deploy an $L_p$-norm prior in the image restoration process. The term $R_2(x)$ is expressed as follows:
$$R_2(x) = \sum_{(k,l)} \|\nabla (x_k - x_l)\|_p^p, \tag{14}$$
where the sum runs over pairs of color channels. This regularization term can be expressed in the form
$$R_2(x) = \|\nabla M x\|_p^p, \tag{15}$$
where $M$ is a $3MN \times 3MN$ matrix that maps $x$ to its color difference components. The regularization terms $R_1(x)$ and $R_2(x)$ are incorporated into the objective function of (8), and the overall objective function is represented as follows:
$$\hat{x} = \arg\min_x \frac{1}{2}\|y - DHx\|_2^2 + \lambda_1 \|\nabla x\|_p^p + \lambda_2 \|\nabla M x\|_p^p, \tag{16}$$
where $\lambda_1$ and $\lambda_2$ are regularization parameters for the constraints of color components and color difference components, respectively.
As R 1 (x) and R 2 (x) have similar p values near 2/3, we assume p = 2/3 in the following optimization problem for convenience.

Optimization
In this subsection, we describe how to optimize the objective function of Equation (16). Minimizing an objective function with $L_p$ regularization makes it difficult to obtain closed-form solutions because the $L_p$ regularization is non-differentiable and non-linear. To efficiently recover the estimated image $x$, a strategy based on alternating minimization [24], known as half-quadratic splitting, is employed. Two auxiliary variables $w$ and $v$ are introduced to approximate $\nabla x$ and $\nabla Mx$, respectively, inside the non-differentiable terms $\|\cdot\|_p^p$. The approximation model to Equation (16) can be expressed as follows:
$$\min_{x, w, v} \frac{1}{2}\|z - Hx\|_2^2 + \lambda_1 \|w\|_p^p + \frac{\theta_1}{2}\|w - \nabla x\|_2^2 + \lambda_2 \|v\|_p^p + \frac{\theta_2}{2}\|v - \nabla Mx\|_2^2, \tag{17}$$
where $w$ and $v$ are the auxiliary variables, $\theta_i$ denotes the auxiliary regularization parameter, and $z$ represents the estimated full-color image. Alternating minimization guarantees that the minimizer of Equation (17) converges to that of Equation (16) as the value of $\theta_i$ moves toward $\infty$. Moreover, the objective function in Equation (17) is convex and differentiable with respect to $x$. This formulation allows each subproblem to be solved in closed form when the other variables among $w$, $v$, and $x$ are fixed. Note that vector $z$ is first estimated using a demosaicing method that satisfies the condition $y = Dz$. The method does not depend on a particular demosaicing algorithm; however, a state-of-the-art demosaicing method (e.g., ARI [8]) is advantageous in reducing color artifacts. Although the effects of subsampling cannot be completely eliminated, this process removes the subsampling operator $D$ from the objective function.
By fixing $x$, the formulation of Equation (17) is separable in $w$ and $v$, so the minimizer with respect to each is readily obtained. Based on the alternating minimization method [24], the minimization problem can be transformed into two subproblems, as follows:
$$\hat{w} = \arg\min_w \lambda_1 \|w\|_p^p + \frac{\theta_1}{2}\|w - \nabla x\|_2^2, \tag{18}$$
$$\hat{v} = \arg\min_v \lambda_2 \|v\|_p^p + \frac{\theta_2}{2}\|v - \nabla Mx\|_2^2. \tag{19}$$
Each objective function in (18) and (19) is a single-variable problem. Although it is difficult to derive an exact solution for these objective functions in general, the minimizer can be obtained using a numerical root-finder, such as the Newton-Raphson method. In the special case of $p = 2/3$, the exact analytic solution for non-zero $w$ and $v$ is described in [21]. Specifically, after scaling by $2/\theta_i$, each subproblem takes the form $\alpha\|s\|_p^p + \|s - t\|_2^2$ with $\alpha = 2\lambda_i/\theta_i$, and this multidimensional minimization reduces component-wise to the scalar minimization of $\alpha|s|^p + |s - t|^2$. By setting the derivative of $\alpha|s|^p + |s - t|^2$ with respect to $s$ to zero, the non-zero minimizer satisfies:
$$\alpha p\, |s|^{p-1} \operatorname{sign}(s) + 2(s - t) = 0. \tag{20}$$
For $p = 2/3$, the equation can be expressed as follows:
$$s^4 - 3ts^3 + 3t^2 s^2 - t^3 s + \frac{\alpha^3}{27} = 0. \tag{21}$$
Therefore, the solution of each objective function in (18) and (19) can be obtained from the roots of this quartic polynomial.
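The scalar problem for $p = 2/3$ can be sketched numerically as follows. Setting the derivative of $\alpha|s|^{2/3} + (s - t)^2$ to zero and cubing gives the quartic $s^4 - 3ts^3 + 3t^2s^2 - t^3s + \alpha^3/27 = 0$; the sketch finds its real roots with `numpy.roots` and compares them with the candidate $s = 0$. The function name `solve_lp23` and the root-selection rule are assumptions of this example, not the lookup-table or closed-form procedure of [21].

```python
import numpy as np

def solve_lp23(t, alpha):
    """Return argmin_s alpha*|s|**(2/3) + (s - t)**2 for a scalar t.

    The problem is symmetric in the sign of t, so it is solved for |t|
    and the sign is restored at the end.
    """
    a = abs(t)
    # Quartic obtained from the stationarity condition (see lead-in).
    coeffs = [1.0, -3.0 * a, 3.0 * a ** 2, -a ** 3, alpha ** 3 / 27.0]
    candidates = [0.0]  # s = 0 is always a candidate minimizer
    for root in np.roots(coeffs):
        is_real = abs(root.imag) < 1e-8 * max(1.0, a)
        if is_real and 0.0 < root.real <= a:
            candidates.append(root.real)

    def cost(s):
        return alpha * abs(s) ** (2.0 / 3.0) + (s - a) ** 2

    best = min(candidates, key=cost)
    return float(np.sign(t) * best)
```

Small inputs are mapped exactly to zero while large inputs are shrunk only slightly, which is the sparsity-promoting behavior expected from a penalty with $p < 1$.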
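Note that if $p = 1$, the objective functions in (18) and (19) are equal to the TV-regularized problem. Although the proposed method is derived for the empirically optimal value of approximately $p = 2/3$, TV regularization remains a useful strategy in terms of speed and implementation. Writing each subproblem in the scalar form $\alpha|s| + |s - t|^2$ used above, the solutions of (18) and (19) with $p = 1$ are given by the one-dimensional shrinkage, as follows:
$$w = \operatorname{sign}(\nabla x)\, \max\!\left(|\nabla x| - \frac{\alpha_1}{2},\, 0\right), \tag{22}$$
$$v = \operatorname{sign}(\nabla Mx)\, \max\!\left(|\nabla Mx| - \frac{\alpha_2}{2},\, 0\right), \tag{23}$$
where $\alpha_1$ and $\alpha_2$ are the values of $\alpha$ obtained when (18) and (19), respectively, are written in the scalar form. The solution to this problem can be computed component-wise.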
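The one-dimensional shrinkage for $p = 1$ is short enough to state directly in code. The sketch below minimizes the scalar form $\alpha|s| + (s - t)^2$ element-wise, for which the threshold is $\alpha/2$; the function name `soft_threshold` is hypothetical.

```python
import numpy as np

def soft_threshold(t, alpha):
    """Element-wise argmin_s alpha*|s| + (s - t)**2.

    Values with |t| <= alpha/2 are set to zero; the rest are pulled
    toward zero by alpha/2.
    """
    t = np.asarray(t, dtype=float)
    return np.sign(t) * np.maximum(np.abs(t) - alpha / 2.0, 0.0)
```

For example, with $\alpha = 1$ the inputs 1.0, -0.3, and 0.2 map to 0.5, 0, and 0, respectively.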
For fixed $w$ and $v$, the objective function in (17) can be simplified to a least-squares formulation as
$$\hat{x} = \arg\min_x \frac{1}{2}\|z - Hx\|_2^2 + \frac{\theta_1}{2}\|w - \nabla x\|_2^2 + \frac{\theta_2}{2}\|v - \nabla Mx\|_2^2. \tag{24}$$
Moreover, the objective function is convex and differentiable with respect to $x$. As the objective function is composed of quadratic terms, the optimal solution of Equation (24) satisfies the following linear system:
$$\left(H^T H + \theta_1 \nabla^T \nabla + \theta_2 M^T \nabla^T \nabla M\right) x = H^T z + \theta_1 \nabla^T w + \theta_2 M^T \nabla^T v. \tag{25}$$
The solution can be obtained by solving these normal equations. We summarize the proposed crosstalk correction process based on multi-channel deconvolution in Algorithm 1 with the continuation technique on the regularization parameter $\theta_i$. The continuation scheme is widely used with the penalty method to speed up the overall convergence. As a small value of $\theta_i$ is initially set and gradually increased over the iterations, the convergence speed can be accelerated. As the value of $\theta_i$ moves toward $\infty$, the minimizer of Equation (17) transformed by the quadratic penalty method converges to that of Equation (16). In terms of convergence, the objective function in Equation (17) is convex in x, w and v.
Algorithm 1. Crosstalk correction based on $L_p$-regularized multi-channel deconvolution.

Input: The observed image $y$, the subsampling matrix $D$, the crosstalk matrix $H$, and the regularization parameters $\lambda_i$
Output: The reconstructed image $\hat{x}$
Initialization: $x_0 = 0$; solve $z$ based on the demosaicing method ($y = Dz$); $\theta_i \leftarrow \lambda_i$
1: repeat
2: Solve $w$ according to Equation (18)
3: Solve $v$ according to Equation (19)
4: Solve $x$ according to Equation (24)
5: Increase $\theta_i$ (continuation)
6: until the stopping criterion is satisfied

In implementation terms, the computation of each block in Equation (25) can be accelerated by a two-dimensional (2D) fast Fourier transform (FFT) at each iteration. Under the periodic boundary condition for $x$, $\nabla^T \nabla$ and all block matrices in $H^T H$ are block circulant. Specifically, the matrix on the left-hand side of Equation (25) can be precomputed once before the iterations. At each iteration, six FFTs are applied to $w$ and $v$. Then, three inverse FFTs are computed to obtain $x$. Therefore, a total of nine FFTs (including three inverse FFTs) are required to solve Equation (25).
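The frequency-domain solve described above can be sketched as follows. The sketch assumes that the $x$-subproblem has normal equations of the form $(H^TH + \theta_1\nabla^T\nabla + \theta_2 M^T\nabla^T\nabla M)x = H^Tz + \theta_1\nabla^Tw + \theta_2M^T\nabla^Tv$, that $\nabla$ consists of horizontal and vertical forward differences with periodic boundaries, and that $M$ forms the differences R−G, G−B, and B−R. These choices, the array shapes, and the names `grad_freq` and `solve_x` are assumptions of this example, and the sketch does not attempt to match the FFT count stated in the text.

```python
import numpy as np

# Assumed color-difference operator acting on the channel axis.
M = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0]])

def grad_freq(shape):
    """Frequency responses of horizontal and vertical forward differences."""
    rows, cols = shape
    dy = np.exp(2j * np.pi * np.fft.fftfreq(rows))[:, None] - 1.0
    dx = np.exp(2j * np.pi * np.fft.fftfreq(cols))[None, :] - 1.0
    ones = np.ones(shape)
    return np.stack([dx * ones, dy * ones])  # shape (2, rows, cols)

def solve_x(H_hat, z, w, v, theta1, theta2):
    """Solve the assumed normal equations independently at each frequency.

    H_hat: (rows, cols, 3, 3) frequency responses of the nine blocks.
    z:     (rows, cols, 3) demosaiced image.
    w, v:  (2, rows, cols, 3) auxiliary variables for the two directions.
    """
    D = grad_freq(z.shape[:2])
    G = np.sum(np.abs(D) ** 2, axis=0)  # spectrum of grad^T grad

    def fft(a):
        return np.fft.fft2(a, axes=(-3, -2))

    Hh = np.conj(np.swapaxes(H_hat, -1, -2))  # per-frequency H^H
    reg = theta1 * np.eye(3) + theta2 * (M.T @ M)
    lhs = Hh @ H_hat + G[..., None, None] * reg
    gw = np.sum(np.conj(D)[..., None] * fft(w), axis=0)  # grad^T w
    gv = np.sum(np.conj(D)[..., None] * fft(v), axis=0)  # grad^T v
    rhs = (Hh @ fft(z)[..., None])[..., 0] + theta1 * gw + theta2 * (gv @ M)
    x_hat = np.linalg.solve(lhs, rhs[..., None])[..., 0]
    return np.real(np.fft.ifft2(x_hat, axes=(0, 1)))
```

Each frequency requires only a 3 × 3 solve because all blocks are diagonalized by the 2D FFT. The system is well posed at the zero frequency only if the DC gains of the nine kernels form an invertible matrix, which holds for the mildly mixing kernels considered here.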

Experimental Results
The performance of the proposed method was evaluated using the color peak signal-to-noise ratio (CPSNR) and structural similarity index (SSIM) [25] as objective image quality metrics. The former was used to evaluate the intensity differences between the original image and the estimated image; the latter was employed to evaluate the structural similarity in terms of the human visual system. In particular, SSIM is suitable for comprehensively determining the similarity between luminance and chrominance components.
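As a brief illustration, the CPSNR can be computed as follows under its commonly used definition, in which the mean squared error is averaged over all pixels and all three color channels. The text does not state the exact definition used, so this form, the `peak` parameter, and the function name `cpsnr` are assumptions.

```python
import numpy as np

def cpsnr(reference, estimate, peak=1.0):
    """Color PSNR in dB for images of shape (rows, cols, 3)."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    # Mean squared error over every pixel of every color channel.
    mse = np.mean((ref - est) ** 2)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))
```

With images in the dynamic range [0, 1], a uniform error of 0.1 in every sample gives a CPSNR of 20 dB.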

Datasets
Several experiments were conducted to verify the performance of the proposed method using artificially degraded images. For the comparisons, we generated crosstalk degradation on four public benchmark datasets: Kodak, McMaster [23], Set5 [26], and Set14 [27]. The point spread function of crosstalk is inspired by the crosstalk kernel given by [14]. Moreover, two levels of degradation were tested with consideration of the crosstalk phenomenon that intensifies in accordance with the pixel size. For crosstalk degradation 1, the synthetic datasets were generated by convolving the ground truth (GT) images within the given dynamic range [0, 1] and the crosstalk kernels of the 2D Gaussian kernel with a standard deviation of σ g = 0.45. They were also generated by adding Gaussian noise with a standard deviation of σ n = 0.01. For crosstalk degradation 2, the Gaussian crosstalk kernel with a standard deviation of σ g = 0.60 and a Gaussian noise with a standard deviation of σ n = 0.04 were used. We assume that the crosstalk kernel is space-invariant.
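The two synthetic degradation levels can be parameterized as in the sketch below, which builds a normalized 2D Gaussian kernel and adds Gaussian noise. The kernel support is not given in the text, so the default `size=3` is a hypothetical choice, as are the function names `gaussian_kernel` and `add_noise`.

```python
import numpy as np

def gaussian_kernel(sigma_g, size=3):
    """Normalized 2D Gaussian kernel on an odd size x size grid (size assumed)."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2.0 * sigma_g ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def add_noise(img, sigma_n, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma_n."""
    return img + sigma_n * rng.standard_normal(img.shape)

# Parameters of the two degradation levels described above.
kernel_1, sigma_n_1 = gaussian_kernel(0.45), 0.01
kernel_2, sigma_n_2 = gaussian_kernel(0.60), 0.04
```

The center weight is smaller for $\sigma_g = 0.60$ than for $\sigma_g = 0.45$, so more signal leaks into neighboring pixels at the second degradation level.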

Compared Methods
For performance comparisons, we implemented seven conventional methods. The first conventional method (CM1) was the ARI method in [8]. It only performed demosaicing and ignored the crosstalk kernel. The second (CM2), described as Lucy-demosaicing in [16], performed single-channel deconvolution based on the Lucy-Richardson method [28]. The third (CM3) was inspired by CM2 and applied TV-based single-channel deconvolution [24]. For CM2 and CM3, deconvolution was performed in the subsampled domain, followed by demosaicing. The fourth method (CM4) was the color correction method in [13]. It focuses on color correction with consideration of the crosstalk kernel. For CM4, the CFA image was first demosaiced. The fifth (CM5), sixth (CM6), and seventh (CM7) performed single-channel deconvolution based on deep-learning approaches [29][30][31]. Each of these methods uses a convolutional neural network (CNN) to solve a constrained optimization problem. In a similar way to CM2 and CM3, deconvolution was performed in the subsampled domain for CM5, CM6, and CM7 using pre-trained models. In this experiment, ARI demosaicing [8] was applied equally to all conventional methods requiring demosaicing. The parameters were chosen to achieve the highest CPSNR values for all test images.

Comparisons
The quantitative evaluation of the Kodak and McMaster datasets [23] for crosstalk degradation 1 in terms of the CPSNR and SSIM values is demonstrated in Table 1. It is observed that the proposed method improved the performance. It achieved the highest CPSNR and SSIM values among all methods, including demosaicing (CM1), deconvolution (CM2 and CM3), color correction (CM4), and deep-learning-based deconvolution approach (CM5, CM6, and CM7). Note that the higher the CPSNR value is, the closer the estimated image is to the original image. Moreover, the higher the SSIM value, the better the perceived quality. The widening performance gap between the conventional methods and the proposed method can be confirmed for crosstalk degradation 2 in Table 2. In other words, the proposed method is robust against both crosstalk degradation and noise. Quantitative assessments for the Set5 [26] and Set14 [27], including the Kodak and McMaster datasets [23], are summarized in Table 3.
A qualitative comparison of the crosstalk correction results is shown in Figure 6. In CM1, where only demosaicing was applied, color degradation is clearly observed. For crosstalk degradation 1, where the degree of crosstalk degradation is mild, the single-channel deconvolution methods (CM2 and CM3) show relatively good results in terms of improving color fidelity and producing sharp details. However, they do not fully account for noise amplification. In CM4, which focused on color correction, blurring artifacts are not improved at all. The deep-learning-based deconvolution methods (CM5, CM6, and CM7) show usable results in terms of reducing noise and producing sharp details. However, the color fidelity is improved only to a limited extent. In contrast, the proposed method based on multi-channel deconvolution successfully reconstructed edge sharpness and overcame color degradation without noise amplification. Furthermore, as the degree of crosstalk degradation increased, the performance difference between the conventional methods and the proposed method became more apparent. Figures 7 and 8 present the results for crosstalk degradation 2. CM1 shows noisy results without overcoming color degradation and blurring artifacts. The single-channel deconvolution methods (CM2 and CM3) exhibit limitations in terms of color fidelity because deconvolution was performed in the subsampled domain. With respect to noise, CM3 shows slightly better results than CM2. In CM4, the color fidelity is improved close to the ground truth; however, the noise is amplified and appears like ink stains in the noisy pixels. In contrast, as is clearly observed, the proposed method based on multi-channel deconvolution improves the color fidelity and produces much sharper details than the other methods. Specifically, it produces the fewest artifacts, such as noise amplification.
In addition, the visual comparison of the difference image shows that the proposed method achieves a low error rate. Therefore, the proposed method yields more natural and credible results than the conventional methods.

Influence of Parameters
The proposed method involves regularization parameters λ i , which manage the trade-off between the fidelity to the data and smoothness of the solution. To analyze the effects of the values of λ i on the crosstalk correction method, we performed experiments with different parameter settings. The value of λ i should be determined empirically based on experiments so that the restored images have reasonable signal-to-noise ratios.

Convergence Analysis
Finally, we analyzed the convergence behavior of multi-channel deconvolution for crosstalk correction. The criterion $\|x^{n+1} - x^{n}\|_2 / \|x^{n}\|_2 < \eta$ was used to terminate the iteration, where $\eta$ was set to $10^{-4}$. The proposed method converges for various initial conditions, as plotted in Figure 12a. For the initial estimate $x^0$, a blank image ($x^0 = 0$), a demosaiced version of the observed image ($x^0 = z$), or a degraded version of the demosaiced image ($x^0 = H^T z$) was employed. In the experiment on various levels of crosstalk degradation, the objective function decreased within several iterations, as illustrated in Figure 12b. In addition, the visualization in Figure 12c shows that the residual image converges to an all-zero image as the iteration progresses. Therefore, the objective function with a hyper-Laplacian prior converged within several iterations for some variables through the efficient solver.
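The stopping rule above amounts to a relative-change test between successive iterates, which can be sketched as follows. Returning `False` when the previous iterate is all zero (as with the blank initial image) is an assumption of this example, as is the function name `has_converged`.

```python
import numpy as np

def has_converged(x_new, x_old, eta=1e-4):
    """Relative-change test ||x_new - x_old||_2 / ||x_old||_2 < eta."""
    denom = np.linalg.norm(x_old)
    if denom == 0.0:
        # The ratio is undefined for an all-zero previous iterate;
        # this sketch simply continues iterating in that case.
        return False
    return bool(np.linalg.norm(x_new - x_old) / denom < eta)
```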

Discussion
The crosstalk problem generated by the crosstalk kernel in the subsampled domain causes spatial degradation and spectral degradation at the same time. Single-channel deconvolution does not consider the characteristics of the CFA, and as a result, its ability to solve such complex problems is limited. In this study, it is shown that the crosstalk problem can be defined as a multi-channel degradation model. In addition, a method to solve the defined model is presented, and an effective objective function and optimization method are proposed. As can be seen from the experimental results, single-channel deconvolution can resolve the blurring but cannot resolve the desaturation, regardless of the specific method. The color correction method is able to improve color fidelity, but the problems regarding blur and noise remain. On the other hand, the proposed method is able to improve both spatial resolution and color fidelity without noise amplification, and it has been demonstrated that a synergistic effect with the state-of-the-art demosaicing algorithm can be achieved.
One of the important points is that the regularization parameter in the proposed method controls the smoothness. Although we have suggested the effective range of regularization parameters empirically, they will be important variables affecting the results. Overall, the main concept of the proposed multi-channel deconvolution for crosstalk correction is to overcome spatial and spectral degradation simultaneously, and its validity has been proven. The proposed multi-channel deconvolution approach can provide new insights into solving the crosstalk problem that occurs in the image acquisition system.

Conclusions
A crosstalk correction method based on $L_p$-regularized multi-channel deconvolution was herein presented. The crosstalk problem with complex spatial and spectral degradation is formulated as a multi-channel degradation model, and an objective function with a hyper-Laplacian prior is designed. The proposed method based on multi-channel deconvolution improves color fidelity and produces sharp edges without obvious artifacts, such as noise amplification. Furthermore, an efficient solver minimizes the objective function for crosstalk correction. The cost function gradually converges toward a minimum value for certain variables. The proposed method was compared with conventional crosstalk correction methods using synthetic datasets according to various crosstalk and noise levels. Experimental results demonstrated that the proposed method outperformed the conventional methods in terms of the CPSNR and SSIM values. However, in terms of speed, the weakness of the proposed method is that its computational complexity is too high for real-time processing in camera imaging systems. In further research, we will implement the proposed method with multiple GPUs to achieve real-time performance. We believe that, by overcoming the physical limitations of the image sensor, the proposed method can be applied to imaging systems in many fields.