Multispectral Image Denoising via Nonlocal Multitask Sparse Learning

The goal of multispectral imaging is to obtain the spectrum for each pixel in the image of a scene and deliver much reliable information. It has been widely applied to several fields including mineralogy, oceanography and astronomy. However, multispectral images (MSIs) are often corrupted by various noises. In this paper, we propose a MSI denoising model based on nonlocal multitask sparse learning. The nonlocal self-similarity across space and the high correlation of the MSI along the spectrum via multitask sparse learning are fully exploited in the proposed model. A nonnegative matrix factorization (NMF) based algorithm is developed to solve the proposed model. Experimental results on both simulated and real data demonstrate that the proposed method performs better than several existing state-of-the-art denoising methods.


Introduction
With plenty of available spectral information, multispectral image (MSI) has received considerable attention in wide applications such as image interpretation [1,2], superresolution [3][4][5], compression [6][7][8] and surveillance [9].In practice, MSIs are inevitably corrupted by various noises due to the equipment limitations, the loss of radiance energy and narrow band width.Noise is always the major challenge that seriously degrades the quality of MSIs.Thus, it is necessary to remove noise in many MSI applications.Recently, there are several methods to efficiently solve MSI denoising problems.In particular, sparse coding has attracted much attention due to its state-of-the-art performance in MSI processing [10,11].These methods assume that images can be linearly sparse represented by only a few basis atoms of a dictionary with the sparse coefficients.The noise can be greatly reduced, since noise is stochastic that cannot be sparsely approximated, but the noise-free MSI can be sparsely approximated by a few basis atoms.
The choice of dictionary is not tractable in sparse coding [12].In early sparse coding based MSI applications, a fixed dictionary such as discrete cosine transform (DCT) dictionary and discrete wavelet transform (DWT) dictionary [13] is adopted, which are the orthogonal dictionaries and have the geometric invariant property [14].In addition, the dictionary composed of two or more fixed basis transform such as DCT and DWT, is often employed for the MSI and hyperspectral image (HSI) processing [15], which obtains good performance due to combining the advantages of the different dictionaries.
Actually, the performance of the MSI denoising method using a data-driven dictionary is superior to the one with a fixed dictionary [16,17].The data-driven dictionary can be learned according to the current result in the iterations.Therefore, dictionary learning for MSI denoising problem is very popular.Based on principal component analysis (PCA) [5], K-means [18], K-singular value decomposition (K-SVD) [19] and nonnegative matrix factorization (NMF) [20] technologies, the dictionary learning approaches have achieved the outstanding performance in various MSI applications.In particular, the NMF based dictionary learning approaches have been widely used for solving the MSI unmixing, denoising and fusion problems since they can sufficiently utilize the sparsity property of MSIs [20,21].
In this paper, we propose a new MSI denoising model by fully employing the spatio-spectral sparsity of the MSI, and a NMF based dictionary learning algorithm to solve the proposed model, which alternately updates the basis atoms of the dictionary and the corresponding sparse representation.An emphasis behind the proposed model is the utilization of nonlocal sparsity by the patch grouping in each band.The other emphasis is that the high correlation of the MSI along spectrum is exploited via multitask sparse learning, as Figure 1 illustrated.Multitask sparse learning aims to enhance performance by accomplishing a task together with other related tasks, using a shared sparse representation across tasks, since what is learned for each task can help other tasks to be learned better [22].We consider each band denoising problem as one task, and the denoising tasks on all bands share a common sparse representation according to the correlation of adjacent bands.In [23], a HSI denoising model based on the multitask learning was proposed, but it only adopted the correlation of adjacent patches and failed to use the nonlocal self-similarity of each band image.The proposed model simultaneously takes spectral and spatial sparsity of the MSI into account and generates a competitive denoising result.

NMF
Different dictionaries share common sparse representation S. The outline of this paper is as follows: Section 2 introduces the MSI denoising problem based on sparse coding.In Section 3, we propose a new MSI denoising model, which simultaneously considers the spectral and spatial sparsity of the MSI.Section 4 shows the optimization algorithm for solving the proposed model.In Section 5, we provide numerical experiments on both simulated and real data to demonstrate the effectiveness of the proposed method.Finally, the conclusions and future works are given.

Problem Formulation
A MSI is made up of many band images and each band image is a two-dimensional image.For simplicity, the two-dimensional image is generally stacked as a column vector in lexicographic order x ∈ R n in image processing [24][25][26].We focus on solving the MSI denoising problem by sparse coding.Mathematically, the sparse coding model for image denoising problem assumes that x can be represented by x ≈ As, where A ∈ R n×r (n < r) is a dictionary matrix and s ∈ R r is the coefficient vector.Due to the redundant property of the dictionary A, most entries of the coefficient vector s are zero or close to zero, i.e., s is sparse.
If the dictionary A is known, the image denoising problem can be formulated as 1 -norm regularized sparse optimization model, i.e., min where || • || 2 is the 2 -norm, || • || 1 is the 1 -norm and constant λ denotes the regularization parameter controlling the degree of sparsity.Obviously, problem (2) is convex.There are many efficient methods for solving 1 -minimization problem (2), such as iterative thresholding algorithm [27] and Bregman split algorithm [28].In MSI denoising problem, for the k-th band x k of the MSI, we derive an unified model based on model (2) as follows: where λ k is the regularization parameter of the k-th band denoising problem.If the dictionary A k is learned in problem (3), which becomes a non-convex optimization problem.A popular approach to simultaneously estimate A k and s k is the sparse NMF method, which integrates dictionary learning and sparse coding into one model, i.e., (s k , A k ) = arg min The nonnegative constraint is consistent with the sparse constraint in the NMF method.Under fairly mild conditions, the sparsest decomposition of Formulation (1) by NMF is indeed unique [29].In the following section, we will present the patch based MSI denoising method, which fully exploits the nonlocal similarity of the MSI across space and the high correlation of the MSI along the spectrum via multitask sparse learning.

The Proposed Method
We present a novel model for MSI denoising problem, which simultaneously uses the spectral and spatial sparsity priors of the MSI.The proposed model consists of two components: patch grouping for employing nonlocal self-similarity of MSI across space and multitask sparse learning for characterizing the high correlation of the MSI along the spectrum.Concretely, we denote the i-th image patch of size √ n × √ n at position i in the k-th band as y k i ∈ R n , reordered into the column vector lexicographically.We divide the k-th band into a set of key patches i=1 (where M is the number of the key patches).The MSI contains much global and local redundancy and correlation in spatial and spectral dimensions.For each band, a small number of key patches could represent the overall band image, which means M is a small value.By performing block matching technoledge [30], a set of patches that are most similar to each key patch y k i can be found.
After image patch grouping, we align these similar patches to form a matrix , where T denotes the number of these nonlocal similar patches.The matrix X k i is low-rank since the similar patches have the similar structures.For the k-th band, we record these low-rank matrices as the column vector lexicographically to form a data matrix X k ∈ R N×M , where N = n • T, as Figure 1 illustrated.The data matrix X k may be low-rank, and its size is not very large since n, T and M are small values such as n = 49, T = 45 in numerical experiments.
By treating the MSI denoising problem as a multitask learning problem, in which multiple tasks are solved at the same time with shared representation, we consider each band denoising problem as one task, and the denoising tasks on all bands share a sparse representation.Since adjacent bands of the MSI are related, we can well employ the high correlation of the MSI along the spectrum by solving all bands denoising problems in parallel using a shared representation.Given K data matrices X 1 , . .., X K , we simultaneously learn K dictionaries A 1 , . .., A K with a common sparse representation S. As [31,32] presented, learning multiple related tasks simultaneously could obtain better results than learning each task independently, since what is to learn for each task can help other tasks be learned better.Therefore, we conjecture that solving multiple related denoising tasks simultaneously could enhance denoising performance.To capture useful information from the corrupted MSI with as few basis atoms as possible, the shared representation is very sparse.
In [33], multitask learning based on the capped-1 , 1 regularization performed better than that with 1 -norm regularization.Here, we use the capped-1 , 1 regularization in our model to enhance sparsity.We formulate the proposed model for MSI denoising as follows: where is the common sparse representation and R is the size of the dictionary, s i is the i-column of matrix S and θ > 0 is a thresholding parameter.The first term in model ( 5) is the fidelity term and the second term is the capped-1 , 1 regularization.The proposed model simultaneously utilizes the nonlocal self-similarity by patch grouping across space and the global correlation along the spectrum via multitask sparse learning.
Obviously, model ( 5) is equivalent to the following model: where x i k denotes the i-column of data matrix X k .By an indicator function, we rewrite the model ( 6) as follows: where Each band has a corresponding dictionary since different bands deliver different information although they are related.For the MSI denoising problem, this model (5) exploits the high correlation of the MSI along the spectrum by the multitask sparse learning and nonlocal sparsity via patch grouping across space, which makes full use of the spectral and spatial sparsity priors of the MSI.By solving the above model ( 5), we can obtain the good estimations of dictionaries Â1 , • • • , ÂK and the shared sparse representation Ŝ.The matrix Xk containing the noise-free patches is recovered by Finally, each band of the MSI is reconstructed simultaneously by averaging the corresponding pixels of the noise-free patches.

Optimization Algorithm
There are several algorithms to solve the proposed model such as alternating direction method of multipliers (ADMM) [34] and primal-dual method [35].Here, to save the computational complexity, we adopt a multiplicative update rule based NMF algorithm [36] to solve the proposed model.
We first denote two matrix operations: ⊗ defines the elementwise product and stands for the elementwise division.Let the objective function as follows: , Derive the Karush-Kuhn-Tucker (KKT) conditions of the model ( 5), i.e., where Due to S ⊗ ∇ S F = 0 and A k ⊗ ∇ A k F = 0 in KKT conditions, we have According to the multiplicative update rule based NMF algorithm [36], we obtain the following multiplicative update rules to solve the model ( 5): The solution of model ( 5) can be achieved by alternatively employing the above update rules (15) and (16).The overall procedure is summarized in Algorithm 1.In addition, the computational cost is KR(2N + 1)(M + R) + MR(2R + 3) for updating S in each iteration, and 2R(N M + MR + NR + N) for updating A k in each iteration.Thus, the total complexity of the Algorithm 1 is the sum of the computational costs from computing S and A k , which is much less than the tensor based approaches for the MSI denoising problem.
Referring to [33,36], Algorithm 1 satisfies the block coordinate descent procedure.Thus, the value of objective function ( 9) is non-increasing.

Numerical Results
To demonstrate the effectiveness of the proposed method for MSI denoising, we present the comparisons on both simulated and real data.Five state-of-the-art MSI denoising methods are utilized for comparisons.They are band-wise KSVD [37], band-wise BM3D [30], 3D-cube KSVD [38], PARAFAC [39], and MTSNMF [23].All parameters involved in the comparing algorithms are carefully tuned to obtain the best performance or automatically chosen as described in their papers.The experimental results are compared both quantitatively and visually.

Simulated Experiments for MSI Denoising
We select two MSIs from the Columbia MSI database for the simulated experiments.The MSI database contains 32 real-world scenes of a wide variety of real-world materials and objects, and each MSI has spatial resolution 512 × 512 and 31 bands, which includes full spectral resolution reflectance values collected from 400 nm to 700 nm with 10 nm steps.We scale the MSIs into the range of [0, 1] in simulated experiments.In addition, we add the Gaussian noise into two test MSIs to generate noisy observations.In practice, different bands of the MSI contain different noise levels.Following [23], the standard deviations of Gaussian distribution for different bands are set as random number from the range of [0.05, 0.2], and we store the random numbers in a descending order as σ 1 , σ 2 , • • • , σ K .Namely, the noise level decreases gradually from the first band to the last band.The dictionary size is suggested to be set to R = 4n empirically in [38].For the regularization parameters, we just simply set λ k = σ k 2 log(R), patch size 7 × 7 and threshold θ = 1 K ∑ K i=1 λ i .The maximum number of iterations is 300 for the proposed method.
To evaluate the denoising performance of the proposed method, four quantitative picture quality indices (PQI) are exploited, including peak signal-to-noise ratio (PSNR), structure similarity [40] (SSIM), feature similarity [41] (FSIM), erreur relative globale adimensionnelle de synthe èse [42] (ERGAS).In image processing and computer vision, PSNR and SSIM are two primary and conventional PQIs.They evaluate the similarity between the ground truth image and the restored image based on the mean squared error (MSE) and structural consistency, respectively.The FSIM focuses on the perceptual consistency with the restored image.The higher values of these three measures indicate that the restored image is more closer to the ground truth image.The ERGAS estimates fidelity of the restored image based on the weighted sum of MSE in each band.It is worth pointing out that the smaller ERGAS value means a better estimation of the ground truth image.
In Figures 2 and 3, we compare the denoising results obtained by five methods with the results achieved by the proposed approach on band 1 and band 30 of the MSI peppers.Obviously, the denoising performances of band-wise KSVD and band-wise BM3D methods are not good since their restored band images are unclear.The MTSNMF and PARAFAC approaches create some artifacts in the denoising results maybe due to employing less self-similarity of image in their methods.The restored image obtained by the 3D-cube KSVD method shows blurring artifacts on the edges.However, the proposed method removes most noise and obtains clear image details especially in the region corresponding to the white rectangle in clean peppers.Table 1 displays the comprehensive performance of six test methods on peppers with respect to four PQIs.From these quantitative comparisons, it is easy to observe that the proposed method outperforms other test methods according to all evaluation measures.
We also show the denoising results of six test methods on the MSI pompoms.The denoising results of band 6 and band 21 of pompoms are reported in Figures 4 and 5, respectively.We can see that the proposed method evidently performs better than others, both on the recovery of the edges and texture structures, especially in the region corresponding to the white rectangle in the clean band.These restored images obtained by five test approaches are a little blurry and still contain some noise.Furthermore, Table 2 indicates the comprehensive performance of six test methods on pompoms with respect to four PQIs, which presents the advantage of the proposed method in quantitative comparisons.

A Real Experiment
A real remote sensing data is utilized to evaluate the MSI denoising performance, i.e., urban area HYDICE MSI.The size of the original MSI is 304 × 304 × 210.We remove some bands that are seriously polluted by the atmosphere.The remaining test data with the size of 304 × 304 × 157.Like simulated experiments, we scale the test MSI into the interval [0, 1] and adopt the same parameter settings.In addition, since the noise level is unknown for real data, we show a qualitative evaluation based on the visual effect.
The denoising results of the band 50 obtained by five test methods are presented in Figure 6.The close-ups of the white rectangle in the noisy band 50 in Figure 6 are presented in Figure 7 corresponding to five test methods.We distinctly observe that the noisy band 50 is dark and contains some mixed noise.In Figure 6, the restored bands by test methods are similar visually.However, in Figure 7, the difference is evident between the proposed method and other test methods.There are some noise in the restored images obtained by the band-wise KSVD and 3D-cube KSVD approaches.The PARAFAC method returns a blurry image.The MTSNMF method removes some noise and achieves a clear restored band image, but its restored image loses some details.Nevertheless, the proposed method creates a clear image by finely recovering the details and edges, and removing a lot of noise.All of the above numerical experiments attest that the proposed approach outperforms other test methods both quantitatively and visually.

Discussion
The proposed method uses nonlocal self-similarity across space and the high correlation of the MSI along the spectrum; thus, it achieves the excellent performance, as reported in Tables 1 and 2. The band-wise KSVD method and the band-wise BM3D method are dictionary learning based denoising methods for two-dimensional images.They don't consider the correlation of the MSI along the spectrum.Thus, they obtain the lower PSNR and SSIM values than the proposed method.The PARAFAC method exploits decomposition uniqueness and single rank character to obtain the better performance than some two-dimensional filter based denoising methods.However, the PARAFAC method only considers the correlation along the spectrum.As shown in Figures 2 and 3, the results of the PARAFAC method are not good.The MTSNMF method takes the spatio-spectral sparsity priors into account, but it doesn't exploit the nonlocal self-similarity, which is very efficient for image denoising.Therefore, the results of the proposed method are much better than results of the MTSNMF method as shown in Tables 1 and 2.
Figure 8 shows the results of the proposed method with different parameters on pompoms.In Figure 8a, we can see a larger size of dictionary R may lead to higher PSNR value but more complexity.With the increasing R, the PSNR value of the proposed method becomes almost stable, which means overcompleteness of sparse representation in the dictionary learning offers some geometric invariant properties [43] and advantages the true signal recovery.In Figure 8b, we discover that an appropriate threshold θ is helpful for our algorithm to achieve a good performance.

Conclusions
In this paper, we proposed a MSI denoising model, which fully exploited the nonlocal self-similarity of the MSI on the spatial domain and the global correlation on the spectral domain via multitask sparse learning.An efficient NMF based algorithm was developed to solve the proposed model.Experimental results both on simulated and real data demonstrated that the proposed method performed better than several existing state-of-the-art MSI denoising methods.Usually, the MSIs are huge, and implementation of parallel-computing for the MSI denoising problem will be considered as our future work.

Figure 1 .
Figure 1.Framework of the proposed multispectral image denoising method.Each band denoising problem is considered as one task, and the denoising tasks on all bands share a common sparse representation S.

Figure 2 .
Figure 2. From left to right in the first row: band 1 of the clean peppers, the noisy peppers, the restored results obtained by band-wise KSVD, and band-wise BM3D; from left to right in the second row: the band 1 of the restored results obtained by 3D-cube KSVD, MTSNMF, PARAFAC and the proposed method.Focus on comparing the region of all test methods corresponding to the white rectangle in the clean band.Clean Noisy BWKSVD BWBM3D

Figure 3 .
Figure 3. From left to right in the first row: band 30 of the clean peppers, the noisy peppers, the restored results obtained by band-wise KSVD, and band-wise BM3D; from left to right in the second row: band 30 of the restored results obtained by 3D-cube KSVD, MTSNMF, PARAFAC and the proposed method.Focus on comparing the region of all test methods corresponding to the white rectangle in the clean band.

Figure 4 .Figure 5 .
Figure 4. From left to right in first row: The band 6 of the clean pompoms, the noisy pompoms, the restored results obtained by band-wise KSVD, band-wise BM3D; From left to right in second row: The band 6 of the restored results obtained by 3D-cube KSVD, MTSNMF, PARAFAC and the proposed method.Focus on comparing the region of all test methods corresponding to the white rectangle in clean band.Clean Noisy BWKSVD BWBM3D

Figure 6 .
Figure 6.From left to right in the first row: band 50 of urban area HYDICE, the restored results obtained by band-wise KSVD, 3D-cube KSVD; from left to right in the second row: band 50 of the restored results obtained by MTSNMF, PARAFAC and the proposed method.

Figure 7 .
Figure 7. Close-ups of the white rectangle in the noisy band 50 of urban area HYDICE.

Figure 8 .
Figure 8. Results of the proposed method with different parameters.(a) dictionary size R vs. PSNR (b) threshold θ vs. PSNR.

Table 1 .
Performance of six comparing methods with respect to four picture quality indices on peppers.

Table 2 .
Performance of six comparing methods with respect to four picture quality indices on pompoms.