Non-Local SVD Denoising of MRI Based on Sparse Representations

Magnetic Resonance (MR) Imaging is a diagnostic technique that produces noisy images, which must be filtered before processing to prevent diagnostic errors. However, filtering the noise while keeping fine details is a difficult task. This paper presents a method, based on sparse representations and singular value decomposition (SVD), for non-locally denoising MR images. The proposed method prevents blurring, artifacts, and residual noise. Our method is composed of three stages. The first stage divides the image into sub-volumes, to obtain its sparse representation, by using the KSVD algorithm. Then, the global influence of the dictionary atoms is computed to upgrade the dictionary and obtain a better reconstruction of the sub-volumes. In the second stage, based on the sparse representation, the noise-free sub-volume is estimated using a non-local approach and SVD. The noise-free voxel is reconstructed by aggregating the overlapped voxels according to the rarity of the sub-volumes it belongs, which is computed from the global influence of the atoms. The third stage repeats the process using a different sub-volume size for producing a new filtered image, which is averaged with the previously filtered images. The results provided show that our method outperforms several state-of-the-art methods in both simulated and real data.


Introduction
MR Imaging is a useful diagnostic technique that produces high-resolution images that are affected by different sources of quality deterioration due to the inherent limitations in the hardware, scanning time, movements of patients, and, one of the most important, noise. The presence of noise is not only a visual problem but also may interfere during the analysis and segmentation tasks [1].
A study of errors in radiology conducted by Donald L. Renfrew [2] found that errors usually involved limitations in imaging technique and acquisition of inaccurate or incomplete clinical history, among others. Previous research found that 69.23% were perceptual errors, and 6.4% of errors were due to poor image quality [3]. This research suggests the importance of guaranteeing the image quality in order to obtain better results in image segmentation and analysis for preventing errors. Despite the advance in MR imaging acquisition technology, which allowed the increment in the signal-to-noise ratio, the acquisition speed, and the resolution, MR images are still affected by noise [4], therefore the filtering of MR images continues being an active research area [5][6][7][8][9][10].
The raw captured data of an MR image are complex-valued, i.e., the image can be decomposed in amplitude and phase or real and imaginary images [11]. Both real and imaginary images are affected by additive Gaussian noise. However, magnitude MR images, which are obtained by computing the the self-similarity in conjunction with filtering in the transform domain, and has a variant for direct processing of volumetric data, denominated BM4D [32], which is still considered a state-of-the-art algorithm [33]. Like the BM3D, the BM4D in the first stage applies thresholding for gathering similar cubes in a 4D data structure and removes noise through a 4D transform. The second stage restores the details by using collaborative filtering. The stacked 3D cube voxels in the 4D structure lead to difficulties in removing noise in the beginning and at the end of the bands of hyperspectral image [33]. In addition, when the noise level is high, the block-matching distance does not allow the program to obtain enough similar blocks; therefore, the denoising reduces its performance, and some artifacts appear.
Another important approach that attracted attention in image processing is sparse representation. It has been used for image labeling [34], image segmentation and classification [35], image inpainting [36] and image denoising [37][38][39], among others. This approach was also used for denoising MR images. Recently, Yuan [40] proposed an efficient implementation, based on an improved Split-Bregman algorithm, for filtering MR images using a sparse tensor model. Zhang et al. [41] use clustering to gather similar patches and filter them using a sparse model, which uses sparse gradient priors to exploit the image low-rank properties. Kang et al. [7] use a sparse representation-based model for both filtering Rician noise and deblurring MR images. The model incorporates a regularization term to preserve small details and repeated patterns while using a non-convex total variation term for preserving edges. Li et al. [42] presented an improved dictionary learning model for medical image fusion denoising. By incorporating a low rank and sparse regularization term, into the sparse model, this proposal can remove noise while preserving textural details.
Further contributions, such as those proposed by Veraart et al. [9], use random matrix theory for filtering typical fluctuations originated from thermal noise in diffusion-weighted MR images. Phophalia and Mitra [43] use Kernel principal component analysis (KPCA) and Rough Set Theory (RST) for denoising volumetric MR data. RST is used to group similar voxels forming basis vectors, which are projected into a kernel space for performing PCA where the linear separation is possible, and the filtering is carried out.
Baselice et al. [44] introduced an approach from the complex domain under the argument that, in this domain, the filtering has the advantage of a simplified statistical model of the signal with Gaussian nature and zero mean. This method is an advantage over methods that operate in the amplitude domain where the noise model may not comply with being Gaussian.
Although recently some authors have focused on accurately estimating the noise [45] and correcting the bias produced by the aggregation of non-stationary distributed MR images samples, before trying to remove the noise [46,47], new research focused on denoising MR images by applying machine learning-based techniques. Chang et al. [6] presented an extension of the bilateral filter, which adapts to voxel intensity variations by including an intensity similarity function and automates the calculations of denoising parameters values, by extracting descriptors of texture issues, through an ensemble classifier of support vector machines and artificial neural networks. Other works use deep networks for exploring the similarity between neighboring slices in order to reduce the over-smoothing [5] and for denoising Dynamic contrast-enhanced MR images [8]. Zhang et al. [48] proposed a feed-forward denoising convolutional neural networks (DnCNNs) that took the batch normalization and residual learning architecture to denoising images, reducing the training process and improving the denoising performance. Based on the DnCNN, Jiang et al. [49] proposed a denoising method for brain MR images. They proposed a multichannel version of the DnCNN for dealing with 3D images, improving filtering performance in various noise levels. Kidoh et al. [50] also proposed a deep learning-based reconstruction method for denoising brain MR images. Their method reduces image noise while preserving image quality outperforming the DnCNN. This paper presents a filter for denoising MR magnitude images. Unlike some of the works presented based on NLM [17][18][19][20], or SVD [26,27], or sparse representations [7,[40][41][42], or some combination of these [21,[28][29][30]41], we present a novel method, which does not propose a new sparse model but fully exploits one of the most typical representative examples of sparse representation, dictionary learning. We use dictionary learning for both orchestrating the non-local and the SVD approach efficiently, and for outperforming the representation of fine details of the image, by (1) enhancing the learned dictionary and determining the weighting coefficients of the patches, according to the global influence of the dictionary atoms, for filtering noise by aggregation, while avoiding blurring and keeping fine details. (2) Establishing the self-similarity of the image efficiently, in a clean space by analyzing the dictionary atoms and the sparse coding matrix, which allows us to accurately recover similar patches throughout the volume, without resorting to the exhaustive comparison of each patch in the volume, because the self-similarity can be inferred from the sparse coding matrix. Thus, we minimize the appearance of artifacts. (3) Unlike [25][26][27], which establishes complex criteria that use optimization models to determine the best low-rank approximation for filtering noise, our method, based on the similar patches identified from the sparse representation, determines simply and efficiently, the best low-rank approximation according to the estimated noise level of the image, thus reducing the remaining noise in the denoised image.
The rest of this paper is organized as follows: Section 2 presents the fundamentals of sparse representations. Section 3 presents the proposed method. Section 4 presents the results and discussion. Finally, Section 5 presents the conclusions of this work.

Sparse Representation
Dictionary Learning is a typical example of sparse representation. It aims to find the sparse representation of a set of signals Y from a dictionary D, learned from the signals themselves [51]. i.e., Y ≈ DA where A is the sparse coding matrix or the sparse representation of Y.
Formally, let Y := [y 1 y 2 . . . y P ] ∈ R n×P be a matrix representing a set of P n-dimensional signals, the dictionary learning aims to find a dictionary D := [d 1 d 2 . . . d K ] ∈ R n×K and a sparse matrix A := [α 1 α 2 . . . α P ] ∈ R K×P such that Y can be approximated by a linear combination of the basis vectors (column vectors known as atoms) of D according to A. Most of the coefficients α i,j of A are zeros, or close to zero [52]. Dictionary learning can typically be formulated as an optimization problem, as Equation (1)  In this formulation, y c is the c-th column vector (signal) of Y, D is learned from the signals themselves, L is a regularization parameter, and α c is the c-th column vector of A. The term α c 0 measures the sparsity of the decomposition. It can be understood as the number of coefficients different from zero in α c , or sparse coefficients, to approximate the signals in Y as sparse as possible.
The KSVD algorithm proposed by Aharon et al. in [53,54] allows us to obtain the optimal values of D and A in Equation (1). By analyzing the sparse representation given by D and A, some characteristics of the signals in Y can be detected. Aspects such as the similarity between the signals and their rarity, are easily manageable from their sparse representation and can be useful for improving the filtering process. This will be addressed below.

Proposed Method for MRI Denoising
The typical procedure for denoising images, via sparse representations, consist of decompose the noisy image I into full overlapped n × n patches to convert them into signals and filter them using a model, such as that in Equation (1). This procedure can be extended in the case of MR images for processing the whole volume; however, as mentioned before, noise in MR images follows a Rician distribution, which is non-additive but dependent on the data. To handle this type of noise, and to get an accurate estimation of its stabilized standard deviation γ, we apply the variance stabilization transformation (VST) proposed by Foi [12] before processing the noisy volume V, which give us V VST . Then, we divide the volume V VST into full overlapped sub-volumes of n × n × n. Each sub-volume is converted into a signal y c ∈ R n 3 ×1 to conform the matrix Y := [y 1 y 2 . . . y P ] ∈ R n 3 ×P , where P is the number of signals. Now Y can be reconstructed solving Equation (1) by using the KSVD algorithm.
However, the reconstruction given by DA leads to the recovery of an MR image suffering from blurring effect, artifacts, and that keeps some residual noise, as shown in Figure 1. To overcome this, we focus on three important issues of the sparse representations: the weighting of its atoms, the self-similarity that emerges from these, and the scales that these representations can handle. The three steps that make up our method address each of the issues above. Figure 2 illustrates the steps of our method. Columns from left to right correspond to a phantom noise-free image added with 5% Rician noise, original phantom noise-free image, and image denoised by the KSVD. The third column from top to down corresponds to artifacts, residual noise, and blurring issues that the KSVD produces.

Weighting the Dictionary Atoms
In digital image processing, there exist several approaches for denoising images corrupted by noise; one of them is filtering in the spatial domain using masks. The coefficients of the spatial masks are set depending on both the noise level and the type of noise. For Gaussian noise, the coefficients are weighting factors set according to the distance to the center of the mask [55]. The central coefficient gives the weight of the pixel under analysis during the denoising process. The greater this coefficient is, the more similar the filtered pixel will be to the original. This similarity implies that this coefficient can control the blurring produced by the smoothing process. We apply this reasoning to the dictionary learning, but instead of weighting the pixels of the image, we weight the atoms.
For denoising the image while keeping fine details, which is essential in MR images, this new approach proposes weighting the atoms according to its global influence. We define the global influence of a dictionary atom as the probability that such an atom appears as a linear combination in the reconstruction of a signal. i.e., For example, after finding the optimal A in Equation (1), the global influence of an atoms is calculated as the number of non-zero coefficients corresponding to that atom in A. Formally, the global influence gi of an atom d c of D is equal to the probability that d c affects a signal, as Equation (2) indicates.
where α c,p is the element entry of the c-th row and the p-th column of the matrix A. The global influence is used to reinforce the effect of the more frequent atoms in the signals, to reduce the blurring caused by the average of sub-volumes during the aggregation. In the case of brain MR images, this global influence will be used for reinforcing atoms that represent common patterns found in these images, like the edges between brain structures, and therefore, for preventing the blurring in such patterns. Equation (5) updates the dictionary for reinforcing the atoms according to their global influence.
] is a vector containing the global influence of all atoms, δ is a constant for controlling the influence (δ = 8, which was experimentally obtained by numerical tests). 1 is a vector of ones, and Diag(·) is a K × K diagonal matrix with diagonal element entries equal to 1 + δGI. Now Y, a noise-free reconstruction of Y that keeps fine details, is computed using Equation (6).
Although Y, improves the representation of fine details, there is still some residual noise that can be eliminated, taking advantage of the self-similarity of the image, which can be easily examined from the sparse representation given by D and A. This will be addressed below.

Signal Grouping
Non-local filtering overcomes local filtering [56,57]; however, it is dependent on parameters such as search windows and thresholds [18]. Also, it is affected by searching for similar patches in a noisy space, which may affect the patch grouping and therefore lead to a bad estimation of the pixel. Instead, we search for similar signals in the clean space given by D A without searching windows, because the sparse representation allows examining the self-similarity of the image through the dictionary atoms.
For performing non-local filtering based on sparse representations, we rely on the fact that similar signals admit similar sparse representations [58,59]; hence, to find similar signals to a given signal y c , we starting determining which atoms make up y c by using Equation (7).
where K is the number of atoms in D (number of rows of A). I denotes the set of influencer atoms of y c i.e., the set of atoms belonging to D that are linearly combined according to α c for generating y c ( Figure 2 encloses an influencer atom of y c in a red rectangle in the SIGNAL GROUPING box). Then, we search for the most influencer atoms of Y as Equation (8) indicates.
where |α c | := [|α 1,c |, |α 2,c |,· · · , |α K,c |] is the vector α c with its components in absolut value, max(|α c |) denotes the largest component of |α c |, and P is the number of signals in Y. M denotes the set of ordered pairs (r, c) meaning that d r of D is the greatest influencer atom of the signal y c of Y. From the two previous sets, the set CSS( y c ) of candidates to similar signals of y c is defined according to Equation (9).
where CSS( y c ) contains all the signals y j of Y whose greatest influencer atom belongs to I( y c ). Figure 3a shows an example of a given reference signal y c in a MR image. Figure 3b shows the set of candidates to similar signals CSS( y c ) from the given reference signal of Figure 3a, obtained using Equation (9). For obtaining the set of similar signals of y c denoted by SS( y c ) (signals pointed out by green arrows in Figure 2), and additionally, to get a better filtering performance in visual terms, we rely on the fact that the human eye is relatively less sensitive to dark signals than to bright ones [60]. Therefore, we do not define a fixed threshold for selecting similar signals, but a variable one, dependent on both the magnitude of the reference signal and the noise level of the image. This variable threshold will allow reference signals with a large magnitude admit less difference to similar signals than the dark ones; in addition, will allow increasing the number of similar signals for MR images with high noise levels to get a better estimation of the noise-free reference signal. Hence, we define SS( y c ) according to Equation (10).
where b is the upper limit and b − a is the lower limit, respectively, of the interval containing the variable threshold that establishes the relative similarity between the signals (a = 0.06 and b = 0.18, which were experimentally obtained by numerical tests). 1 n 3 is an n 3 -dimensional vector of ones (with n being the edge of the sub-volume). γ is the estimated noise level. The left side of the inequality in Equation (10) is undefined when y c = 0. Hence, we convert it into y j when y c ≤ 10 −6 . Please note that a bright signal (large magnitude signal) admits less difference in their similar signals than a dark one because of the magnitude of the threshold b − a y c 1 n 3 decreases as long as y c increases.
Also note that, although the threshold is inversely proportional to the magnitude of the reference signal in the interval [b − a, b], this threshold will be greater for signals of high noise level images than for signals of low noise level images, because of the term (1 + γ). We can increase the threshold for collecting similar signals on high noise level images with out significantly affect the subsequent estimation of the noise-free reference signal, because we search for similar signals in the clean space given by the sparse representation. Figure 3c shows the set of similar signals SS( y c ) of the reference signal given in Figure 3a obtained after applying Equation (10) to CSS( y c ). The red arrows in Figure 3c point out similar signals away from the reference signal, demonstrating that the sparse representation allows us to locate similar signals throughout the MR image, without examining the entire volume. It also shows that our method can locate signals in a space not restricted to search windows. Therefore we can collect a larger number of similar signals (with respect to methods that use such windows), which improves the denoising. Now, the set SS( y c ) can be used for estimating y c in a non-local way; however, as was mentioned before, all the signals in Y keep some residual noise, and a better estimation could be carried out by a cleaner version of SS( y c ).

Efficient SVD Filtering
A known way for reducing the noise of an image is by analyzing its singular value decomposition (SVD). SVD is a matrix factorization that allows for the decomposition of a given matrix into three new matrices U, Σ and V so that X = UΣV , where U and V are orthogonal matrices, and Σ is a diagonal matrix whose diagonal entries are the singular values of X. Alternatively, in summation form, X = ∑ n i=1 s i u i v i , which can be understood as a weighted sum of component images, where the Eigen-images corresponding to the smallest singular values are commonly associated with noise [61]. Therefore, a vital issue for denoising based on SVD is determining what singular values correspond to the component noisy images. According to the Eckart-Young-Mirsky theorem [62], given a matrix X = UΣV then for any rank-k matrix X, it is fulfilled that: where · F is the Frobenius norm, Σ k is the Σ matrix of the SVD decomposition of X with the singular values σ k+1 = σ k+2 = . . . = σ n = 0. By squaring Equations (11) and (12) is obtained.
Taking the right side of the inequality and matching γ 2 , where γ is the stabilized standard deviation of noise (estimated according to [12]), we have: Equation (13) tells us that, when the sum of squares of the n − k less significant singular values of X is equal or greater than γ 2 , we have the best k for filtering the noise of X. These n − k less significant singular values correspond to the noisy component images (noisy component volumes in our case), and therefore, using only the first k more significant singular values we discard such component noisy images of X. Now, for filtering the residual noise of both y c and SS( y c ) let us define X as the matrix made up of y c and its similar signals SS( y c ), i.e., X = [ y c SS( y c )], and following Equation (13) the noise-free version X is obtained with the low rank approximation given by UΣ k V considering the estimated level of noise γ.

Signal Estimation
y c is updated from X, according to Equation (14).
where x i is the i − th signal of X and l is the number of signals in X.

Aggregation
The restoration of the MR sub-volumes from Y involves the aggregation of overlapped voxels, which can lead to the blurry recovery of the voxel. However, as mentioned before, weighting the voxels can control blurring. Now, the voxels to be aggregated belong to sub-volumes (signals), which may be frequent (has many similar signals) or rare (has a few similar signals). Since the frequent sub-volumes are better estimated than the rare sub-volumes because of the number of signals averaged in Equation (14), these voxels should have more weight than the voxels from the rare sub-volumes. Therefore, we propose weighting the voxel according to the rarity of the signal they belong to.
Similar to the procedure outlined in Leal et al. [63], the global influence of atoms can be used to establish the rarity of a signal. For example, a signal will be rare if it accumulates little global influence; it will be frequent if it accumulates a lot of global influence. Equation (15) allows computing the rarity of a signal based on the global influence of the atoms that make it up.
where R( y c ) denotes the rarity score of the signal y c , , denotes the dot product, and α c is the c-th column vector of A, which represents the sparse representation of y c on D .
High scores of Equation (15) are related to frequent signals, while low scores are related to rare signals. The denoised version of the voxel v is obtained according to Equation (16), after converting back the signals into sub-volumes.
where v i is a version of the voxel v contained in the sub-volume V i , R(V i ) is the rarity score of the signal y i corresponding to the sub-volume V i , and C(v i ) denotes the set of sub-volumes V i that contains a version of v.

Scaling
An important parameter for denoising based on sparse representations is the patch size. Large patches lead to denoised images suffering from blurring effect, while small patches lead to denoised images that can keep a high level of noise. However, small patches keep also fine details of the original image.
Averaging images denoised using different patch sizes can carry out a good trade-off between blurring and fine details. The noisy image can be denoised several times; but, each time with a different patch size, then the denoised versions of the image are averaged. The different patch sizes for processing the image are known as scales. Let us define V n as the denoised version of a noisy MR image V using the proposed method with a sub-volume of size n × n × n. The averaged denoised version V of V, is obtained using Equation (17).
where |VS| denotes the number of elements in VS, and VS is the set of different sub-volume sizes for processing V. In our tests, we achieved the best results by averaging the denoised versions V 3 and V 4 . To obtain the final output V * , we apply the inverse VST (VST −1 ) to the denoised volume V. The denoising of Rician data using the proposed method, Non-local SVD Denoising-NLSD, is synthesized by Equation (18) where γ is the stabilized standard deviation induced by the VST, V is the volume corrupted by Rician noise with standard deviation σ.

Results and Discussion
We conducted several experiments to obtain the values of the parameters for the optimal setting of our method and to compare some state-of-the-art methods. These experiments were conducted in MATLAB R2016a running under the Windows 10 Professional OS on an Intel Core i7 First Generation CPU with 8GB of memory.
A dataset of MR phantom images (1 T1w, 1 T2w, and 1 PDw) from the BrainWeb data base [64], with a size of 181 × 217 × 181 voxels (resolution = 1 mm 3 ), was used for numerical comparisons, as well as for setting the optimal parameter values of our method.

Parameter Estimation
The first stage of our method, which is preceded by the VST, aims to compute the global influence to reinforce the atoms for avoiding blurring in the denoised image. This reinforcing is controlled by the parameter δ, which does not intervene in the remaining stages. For determining a suitable value for this parameter, it was performed an experiment which consisted of corrupting the dataset of MR phantom images with different levels of Rician noise (1-10% of the maximum intensity), i.e., from each volume of the dataset were obtained 10 different noisy versions, one per each level of noise, for a total of 30 corrupted images. Then, we ran the first stage of our method over each corrupted image, followed by the inverse VST. Finally, we calculate the widely recognized quality measures Peak Signal to Noise Ratio (PSNR), and the Structural Similarity Index (SSIM) [65] of each denoised image to establish the best value for this parameter numerically. Figure 4 shows the results of applying the first stage of our method over the dataset of MR phantom images corrupted with the different Rician noise levels. Each point on the green curve is the average PSNR of the denoised images for a given value of δ, and each point on the blue curve is the average SSIM.
We observed that the PSNR is maximum (31.991) at δ = 8 and the SSIM is maximum (0.8934) at δ = −3. It is preferable to choose the value of δ that generates the maximum PSNR instead of the value that generates the maximum SSIM. Since this metric (PSNR) is more sensitive to noise than the SSIM (which is more correlated with a human visual system) [66], and by choosing the value of δ that give us the maximum PSNR we are choosing the cleaner search space for the next stage. Additionally, the best PSNR and SSIM pair is reached at δ = 8. The second stage of our method is to perform the Non-Local-SVD filtering. At this stage, we search for similar signals to a reference signal in the clean space given by the first stage. We defined a variable threshold in the interval [b − a, b] for establishing the level of similarity between the signals (right side of the inequality in Equation (10)). This threshold depends on the magnitude of y c and the noise level γ estimated in the first stage. The interval where the threshold moves is determined by the constants a and b. To establish a suitable value for these constants, it was performed an experiment which consisted of running the second stage of our method over the sparse representation Y = D A of each denoised image produced by the tests of the first stage (previous to the application of the inverse VST), trying different values for a and b. In this experiment, the application of the second stage of our method was followed by the inverse VST. Figure 5 shows the results. Each point on the PSNR surface (Figure 5a) represents the average of the PSNRs computed after applying the second stage or our method over the images denoised by the first stage, for a given pair of values a and b. Similarly, Each point on the SSIM surface (Figure 5b) is the average SSIM.
As shown in Figure 5a, the maximum PSNR is equal to 36.247, which is reached at a = 0.06 and b = 0.18. At this point, the SSIM is 0.9407. As shown in Figure 5b, the maximum SSIM is 0.9439, which is reached at a = 0.02 and b = 0.17. At this point, the PSNR is 36.104. We chose the pair a = 0.06 and b = 0.18 because the increment of the PSNR at this point with respect to the PSNR at the point where the maximum SSIM is reached, is more significant than the increment of the SSIM at a = 0.02 and b = 0.17, with respect to the SSIM at the point where the maximum PSNR is reached.

Methods Comparison
We compared the proposed method for MRI denoising, NLSD, to the following widely adopted filters: KSVD [53], PRINLM [21] (implementation downloaded from [67]), BM4D [32] (implementation downloaded from [68]), and DnCNN [48] (implementation dowloaded from [69]). The implementation of the methods PRINLM and BM4D used for comparison are developed in MATLAB with its core developed in C++. The DnCNN, the KSVD, and the NLSD, are implemented purely in MATLAB and do not use any graphic acceleration. The application of the KSVD was preceded by the VST, and followed by its inverse. We performed both quantitative and qualitative comparisons for demonstrating the effectivity or our method.

Synthetic Data
For numerical comparisons, we used the MR phantom images dataset, corrupted with Rician noise (1-10% of the maximum intensity). The PSNR, and the SSIM were used to evaluate the performance. Figures 6-17 and Tables 1-4 show the results of these comparisons. Figure 6 shows the results of the comparisons on the T1w synthetic noise-free MR image corrupted with 5% Rician noise. It shows that the methods PRINLM, BM4D, DnCNN, and NLSD perform very well removing noise; however, the PRINLM produces some artifacts as the zoom-in in Figure 7d,k show. The KSVD and the BM4D also produce artifacts but to a lesser extent, as shown in Figure 7c,e, respectively, and Figure 7j,l respectively. The DnCNN over-smooth the image and therefore fine details as shown in Figure 8m. Instead, the NLSD removes noise while keeping fine details, in addition, does not produce artifacts. The KSVD maintains a lot of residual noise, as well as the BM4D, although to a much lesser extent, as shown in Figure 8c,e, respectively. Figure 8d,g show that the PRINLM and the NLSD, respectively, do not keep residual noise. Figure 8h-n show the zoom-in on the details labeled as "c" in Figure 6. It shows that the methods KSVD, BM4D, and DnCNN are affected by the blurring effect (Figure 8j,l,m, respectively). The PRINLM (Figure 8k) and the NLSD (Figure 8n) keep fine details; however, the PRINLM tends to swell such details. Table 1 summarizes the results of the numerical tests in the T1w synthetic MR image. Bold highlights indicate the best results.    Figure 9 qualitatively compares the denoising methods from the residual images (images resulting from the difference between the denoised and noisy images). It shows that the KSVD retains some structural information, and the DnCNN to a lesser extent, while the PRINLM, the BM4D, and the NLSD perform very well. However, the NLSD presents both a higher PSNR and a higher SSIM, as Table 1 shows.  Figure 10 shows the results of the comparisons on the T1w synthetic noise-free MR image corrupted with 9% Rician noise. It shows that the PRINLM do not eliminate the noise in dark regions but tends to flatten (circled area) and destroy fine details (red arrows). BM4D also keep noise in dark areas (circled area) and also destroy fine details. The DnCNN remove noise but over-blur the image destroying fine details. The KSVD keep a lot of noise and destroy fine details. The NLSD instead keep fine details and perform very well removing noise. You can zooming in the images to appreciate the details. Figure 11 shows the qualitative comparison of the method evaluated from the residual images. No structural information is observed on the residuals of the NLSD. The residuals of the KSVD and the DnCNN present some structural information. The residuals of the PRINLM and the BM4D also present structural information to a lesser extent.  Figure 12 shows the results of the comparisons on the T2w synthetic noise-free MR image corrupted with 7% Rician noise. It shows that the image filtered by the KSVD (Figure 12c) keeps high levels of noise, which is corroborated by the corresponding zoom-in of region "c" in Figure 13c. The BM4D also keeps noise, although to a lesser extent, as shown in region "c" in Figure 13e. The DnCNN over-blur the image and destroy fine details, as can be seen in the corresponding zoom-in of region "b" in Figure 13f. The methods PRINLM and NLSD perform very well at removing noise; however, as shown in the zoom-in of the regions "a" and "b" in Figure 13d, the PRINLM produces artifacts. The NLSD does not produce artifacts. Figure 14 qualitatively compares the denoising methods from the residual images. It shows that the KSVD presents a correlated residual image and also the BM4D. The DnCNN present a residual image that keep some estructural information, but to a lesser extent, while the PRINLM, and the NLSD perform very well. However, the NLSD presents both a higher PSNR and a higher SSIM, as Table 2 shows. Table 2 summarizes the results of the numerical tests in the T2w synthetic MR image. Bold highlights indicate the best results.  . Zoom-in on the details labeled as "a", "b", and "c" in Figure 12. Figures (a-g) correspond to the zoom-in of the Original image, the Noisy image, the KSVD, the PRINLM, the BM4D, the DnCNN, and the NLSD, respectively.   Figure 15 shows the results of the comparisons on the PDw synthetic noise-free MR image corrupted with 5% Rician noise. Figure 16 shows the zoom-in of the regions labeled as "a" and "b" in Figure 15. It shows that the BM4D and mainly the KSVD retain noise, as shown in Figure 16c,e in the corresponding zoom-in. The DnCNN perform very well filtering noise but over-blur the image destroying fine details, as can be seen in Figure 16f. The PRINLM and the NLSD perform very well, although the NLSD (Figure 16g) tends to preserve some fine structures better than the PRINLM. Figure 16d shows how the PRINLM tends to thicken or destroy such structures. Figure 17 qualitatively compares the denoising methods from the residual images. It shows that the KSVD presents a correlated residual image. The BM4D and DnCNN also shows a bit of correlation, while the PRINLM and the NLSD perform very well. However, the NLSD presents both a higher PSNR and a higher SSIM, as Table 3 shows.     For qualitative comparisons on real clinical data, we used a data set of T1w brain MR images acquired in a 1.5T Siemens Magnetom equipped with a standard head coil. The parameters of the T1w images were: TR = 7.9 ms, TE = 3.8 ms, ACQ matrix 220 × 220 pixels, voxel size 0.5 mm × 0.5 mm × 0.5 mm. We compared the NLSD results to the PRINLM, the BM4D and the DnCNN results. The KSVD was not considered because its performance was much lower in the tests on synthetic data. Figure 18 shows the results. The four methods perform very well, as can be seen in their corresponding zoom-ins; however, the BM4D and DnCNN results are slightly blurry and retain some noise. The PRINLM also keeps some residual noise. It can be noted that, close to the dark areas the BM4D and the PRINLM maintain residual noise slightly flattened, as can be seen in the zoom-in. The NLSD instead reduces noise while keeping fine details and yielding better visual results. This can be seen around the skull, where almost there is no any noisy pixel. This is because of the variable threshold in Equation (10)

Conclusions
In this paper, we presented a new method for MR images denoising based on non-local means, singular value decomposition, and sparse representation. Both qualitative and quantitative results validate the effectiveness of the proposed method. We demonstrated how the global influence of the dictionary atoms could act as weighting factors to improve the noise filtering while keeping the fine details of the image and reducing the blurring effect caused by the average of overlapping patches.
We also demonstrated how the strategy of non-local means could be applied directly from the dictionary without resorting to the use of search windows, which limit the search for similar signals (patches or sub-volumes) to a given region. The dictionary, instead, allows finding similar signals in any location of the MR image without exploring the whole volume; and therefore, improving the filtering. Additionally, the strategy of searching for similar signals in the clean space given by the sparse representation allows the process of collecting such signals to improve; since, the level of noise in this space is lower than in the original space given by the noise MR image. Finally, we also showed how the estimated level of noise in the image could be used for computing efficiently the number of singular values, to obtain a low-rank approximation of the set of similar signals, and to reduce their noise before proceeding to the estimation of the noise-free signal.