Data-Driven Redundant Transform Based on Parseval Frames

Abstract: The sparsity of images in a certain transform domain or dictionary has been exploited in many image processing applications. Both classical transforms and learnt sparsifying transforms reconstruct images as a linear combination of a small basis of the transform, and both kinds of transform are non-redundant. However, natural images contain complicated textures and structures, which can hardly be sparsely represented by square transforms. To solve this issue, we propose a data-driven redundant transform based on Parseval frames (DRTPF) by applying a frame and its dual frame as the backward and forward transform operators, respectively. Benefitting from this pairwise use of frames, the proposed model combines a synthesis sparse system and an analysis sparse system. By enforcing the frame pair to be Parseval frames, the singular values and condition number of the learnt redundant frames, which are effective measures of the quality of learnt sparsifying transforms, are forced to achieve an optimal state. We formulate a transform pair (i.e., frame pair) learning model and a two-phase iterative algorithm, analyze the robustness of the proposed DRTPF and the convergence of the corresponding algorithm, and demonstrate the effectiveness of DRTPF by analyzing its robustness against noise and sparsification errors. Extensive experimental results on image denoising show that our proposed model achieves superior denoising performance, in terms of both subjective and objective quality, compared to traditional sparse models.


Introduction
Transforms are classical tools in signal processing tasks such as compression, classification, and recognition [1][2][3][4][5]. Traditional transforms, based on analytic orthogonal bases such as the DCT, DFT, and wavelets [1,6], suffer from two shortcomings: they do not depend on the data, and they reconstruct every image by approximation in the same subspace spanned by a non-redundant basis, which limits the compactness with which natural signals can be represented.
There are two typical models for sparse representation: synthesis [10,14,15] and analysis [16][17][18][19] models. So far, most sparse models rely on the concept of synthesis, which represents the underlying signal as a sparse combination of atoms from a given dictionary. Specifically, x = Dα, where x ∈ R^N is the original signal, D ∈ R^{N×M} is the given dictionary whose columns are the atoms, and α ∈ R^M is the sparse coefficient vector, whose sparsity is usually measured by the ℓ0-norm ‖·‖₀. A learned analysis sparse model was proposed by Elad [14,19], formulated as ‖Ωx‖₀ = r, with notation similar to that of the synthesis model. Instead of reconstructing the signal using a few atoms of a dictionary (as in the synthesis model), an analysis model decomposes a signal in a sparse fashion, based on the assumption that the signal lies in a sparse subset of the dictionary.
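As a small numerical illustration of the two viewpoints (the dictionary D and the operator Ω below are toy random constructions for demonstration only, not learnt ones): in the synthesis model the signal is built from a few atoms, while in the analysis model many analysis responses vanish.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 8, 16

# Synthesis model: x = D @ alpha, with alpha sparse (||alpha||_0 = 2 here).
D = rng.standard_normal((N, M))
alpha = np.zeros(M)
alpha[[2, 11]] = [1.5, -0.7]
x = D @ alpha

# Analysis model: Omega @ x should be sparse, i.e. many analysis responses
# vanish. Here we build Omega whose rows are orthogonal to x, so all N-1
# responses are (numerically) zero.
Q = np.linalg.qr(np.column_stack([x, rng.standard_normal((N, N - 1))]))[0]
Omega = Q[:, 1:].T                  # N-1 rows, each orthogonal to x
cosparsity = np.sum(np.abs(Omega @ x) < 1e-10)

print(np.count_nonzero(alpha), cosparsity)  # 2 7
```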
An analysis model can be straightforwardly regarded as a forward transform if its corresponding backward transform Ω* is available. Recent research on transforms [2,4,5,20,21] has demonstrated the advantages of applying sparse constraints in transform learning. Motivated by this idea, many studies have been devoted to image denoising [5,20], classification [3,4], and other signal processing tasks [21]. Learning-based transforms with sparse constraints measure the transform error, called the sparsification error, in the analysis or frequency domain, rather than in the temporal domain. Given training data X ∈ R^{N×L} with signal vectors x_i ∈ R^N, i = 1, . . . , L as its columns, the problem of training a square sparsifying transform W ∈ R^{N×N} [21] is formulated as

min_{W,Y} ‖WX − Y‖²_F + µ‖W‖²_F − λ log det(W),

where y_i, i = 1, 2, . . . , L are the columns of Y satisfying a sparse constraint, and µ‖W‖²_F − λ log det(W) is a regularizer, which keeps W non-singular.
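The objective above can be evaluated directly. The following sketch assumes numpy; the helper name `square_transform_objective` and the toy values of µ and λ are ours, not from [21].

```python
import numpy as np

def square_transform_objective(W, X, Y, mu, lam):
    """Sparsification error ||WX - Y||_F^2 plus the regularizer
    mu*||W||_F^2 - lam*log det(W), which keeps W non-singular."""
    err = np.linalg.norm(W @ X - Y, 'fro') ** 2
    reg = mu * np.linalg.norm(W, 'fro') ** 2 - lam * np.log(np.linalg.det(W))
    return err + reg

# Toy check: with W = I and Y the exact transform coefficients, only the
# regularizer remains (det(I) = 1, so the log term vanishes).
X = np.arange(6.0).reshape(3, 2)
W = np.eye(3)
Y = W @ X
print(square_transform_objective(W, X, Y, mu=0.1, lam=0.2))  # ~0.3
```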
As we can see, learning-based models effectively reveal the relationship between the transform and the data. However, a square transform, which consists of a non-redundant basis, cannot express complicated images. In 2014, an overcomplete transform learning model called OCTOBOS [20] was proposed, which consists of a series of square transforms representing different features of natural images. However, the number of transforms must be pre-defined, which limits its flexibility in applications.
In recent years, frames, as an overcomplete system, have been applied in image processing tasks such as denoising [22,23], image compression [24], and high-resolution image reconstruction [25]. A frame can be regarded as an extension of an orthogonal basis, as a frame Φ ∈ R^{N×M} (N < M) also spans an N-dimensional space. Compared to a general frame, a tight frame (e.g., wavelet tight frames [26], ridgelets [27], curvelets [28], shearlets [29], and others) can achieve wider use, as its lower and upper frame bounds are equal. A tight frame inherits the good characteristics of an orthogonal basis in signal processing, as its rows are orthogonal [30]. In a sparse representation, a redundant frame serves as an overcomplete dictionary to represent the signal [23]. With the development of data-driven approaches, learning-based tight frames have recently been researched [31][32][33]. In [31], redundant tight frames were used in compressed sensing. In [32], tight frames were applied to few-view image reconstruction. In [33], a data-driven method was presented, in which the dictionary atoms associated with a tight frame are generated by filters. In general, these studies model the frame learning problem in the dictionary learning form with tight frame constraints. These methods focus on tight frames, as the singular values of a tight frame are equal, which leads to simple optimization. A tight frame is a Parseval frame if the frame bounds are equal to 1. In fact, a Parseval frame is a redundant extension of the concept of a standard orthogonal basis. Due to its superior performance in linear signal representation, it is well-suited to sparse signal representation and optimization.
In this paper, we propose a data-driven redundant transform model based on Parseval frames (DRTPF for short), and present a model for learning DRTPF as well as a corresponding algorithm for solving the model. The algorithm consists of a sparse coding phase and a transform learning phase. The sparse coding phase updates the sparse coefficients and a threshold value using conventional Batch Orthogonal Matching Pursuit (BtOMP) and pointwise thresholding. The transform learning phase updates the frame using gradient descent and a relaxation/contraction mapping of the singular values, and updates the dual frame, in an atom-wise manner, using least squares.
The advantages of the proposed DRTPF model (as well as the algorithm) are demonstrated with natural image denoising. To summarize, this paper makes the following contributions: 1. We propose the DRTPF method by integrating redundant Parseval frames with sparse constraints.
The DRTPF method consists of a forward transform and a backward transform, which correspond to a frame and its dual frame, respectively. In other words, DRTPF bridges synthesis and analysis models by assuming that the two models share almost the same sparse coefficients. 2. DRTPF outperforms traditional transforms and frames by learning from data, which exploits the features of natural images, whereas traditional transforms and frames admit a uniform representation of various images, which tends to fail to characterize intrinsic, individual-specific features. 3. Traditional transforms are usually orthogonal transforms, under which signals remain isometric; yet they suffer from weak robustness due to their strict properties. In contrast, DRTPF preserves signals in a bounded fashion, which admits higher robustness and flexibility. 4. We propose a model for learning DRTPF and compare DRTPF with traditional transforms and sparse models in robustness analysis and image denoising experiments. Both qualitative and quantitative results demonstrate that DRTPF outperforms traditional transforms and sparse models.
The rest of this paper is organized as follows. Section 2 reviews the related work on frames. Section 3 proposes the framework of DRTPF, including the form of DRTPF (Section 3.1) and the learning model and corresponding algorithm for DRTPF (Section 3.2). In Section 4, we demonstrate the effectiveness of our DRTPF model by analyzing the convergence of the corresponding algorithm and give experimental results on robustness analysis and image denoising, as well as evaluating the effectiveness of DRTPF compared with traditional transforms and sparse models.

Related Work
Let H be an N-dimensional discrete Hilbert space. A sequence {φ_i}_{i=1}^M ⊂ H is a frame if and only if there exist two positive numbers A and B such that [30]

A‖x‖²₂ ≤ Σ_{i=1}^M |⟨x, φ_i⟩|² ≤ B‖x‖²₂  for all x ∈ H.  (2)

A and B are called the bounds of the frame, and we call formula (2) the frame condition, as it determines whether a sequence is a frame. Furthermore, {φ_i}_{i=1}^M is a tight frame if A = B [30]; in particular, {φ_i}_{i=1}^M is a Parseval frame if A = B = 1. Once a frame is defined, two associated operators can be defined between the Hilbert space H and the square-summable space ℓ₂^M: one is the analysis operator T, defined by

Tx = {⟨x, φ_i⟩}_{i=1}^M,

and the other is its adjoint operator T*, which is called the synthesis operator:

T*c = Σ_{i=1}^M c_i φ_i.

Then, the frame operator S = T*T can be defined by the following canonical expansion:

Sx = Σ_{i=1}^M ⟨x, φ_i⟩ φ_i.

Let x ∈ R^N be an arbitrary vector in H. A reconstruction function is an expression of the following form:

x = Σ_{i=1}^M ⟨x, ψ_i⟩ φ_i,  (6)

where the sequence {ψ_i}_{i=1}^M is a dual frame of {φ_i}_{i=1}^M (for an orthonormal basis, the dual coincides with the basis itself). In fact, for an arbitrary given frame {φ_i}_{i=1}^M, there is a whole family of dual frames corresponding to it. The non-uniqueness of the dual frame allows us to achieve a better expression of the signal by optimizing the dual frame.
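These definitions can be checked numerically. The sketch below uses the classical Mercedes-Benz frame (three vectors in R², scaled to be Parseval); for a Parseval frame the frame operator S is the identity, and the frame is its own canonical dual, so the reconstruction formula holds with ψ_i = φ_i.

```python
import numpy as np

# Mercedes-Benz frame: 3 unit vectors in R^2 at 120-degree spacing, scaled
# by sqrt(2/3) so the frame becomes Parseval (A = B = 1).
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
Phi = np.sqrt(2 / 3) * np.stack([np.cos(angles), np.sin(angles)])  # shape (2, 3)

# Frame operator S = T* T = Phi Phi^T; for a Parseval frame, S = I.
S = Phi @ Phi.T
print(np.allclose(S, np.eye(2)))   # True

# Analysis operator T: x -> {<x, phi_i>}; synthesis operator T*: c -> sum c_i phi_i.
x = np.array([0.3, -1.2])
coeffs = Phi.T @ x                 # analysis
x_rec = Phi @ coeffs               # synthesis with the canonical dual (= Phi)
print(np.allclose(x_rec, x))       # True: perfect reconstruction
```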
The frame Φ and its dual frame Ψ can be stacked as the matrices Φ = [φ₁, φ₂, . . . , φ_M] and Ψ = [ψ₁, ψ₂, . . . , ψ_M], respectively. These matrices can be regarded as sparse representation dictionaries, transform operators, and so on. For a frame Φ with bounds A and B, the minimum and maximum singular values of Φ are √A and √B, respectively. What's more, the singular values of a tight frame are all equal; in particular, the singular values of a Parseval frame are all equal to 1. Thus, when the frame Φ is applied as a sparse representation dictionary or transform operator, its condition number is determined by √(B/A). In this way, the model never produces a degenerate dictionary or transform. In fact, frames are matrices with a special structure.

Data-Driven Redundant Transform Model Based on Parseval Frames
In this section, we present our data-driven redundant transform based on Parseval frames (DRTPF, Section 3.1) model along with an efficient redundant transform learning algorithm (Section 3.2) which contains the sparse coding algorithm (Section 3.2.1) and the transform pair update algorithm (Section 3.2.2).

Data-Driven Redundant Transform
In this subsection, we first propose a threshold-based reconstruction function, with the assumption that the signal is sparse in the dual frame domain. Then, we present the data-driven redundant transform based on Parseval frames model.
Let {φ_i}_{i=1}^M be a frame and {ψ_i}_{i=1}^M be its dual frame. For convenience, we stack them as the matrices Φ = [φ₁, φ₂, . . . , φ_M] and Ψ = [ψ₁, ψ₂, . . . , ψ_M], respectively. Let x = x̄ + e be a signal vector, where x̄ is the original noiseless signal and e is zero-mean white Gaussian noise. The frame reconstruction function (6) can be formulated as x = ΦΨ^T x = ΦΨ^T (x̄ + e). By assuming a sparse prior of signals over the Ψ domain, we apply a columnwise hard thresholding operator S_λ(·) (which shall be defined in the next subsection) on Ψ^T (x̄ + e), such that

x̂ = Φ S_λ(Ψ^T x),  (7)

where λ is a vector with elements λ_i corresponding to ψ_i, i = 1, 2, . . . , M. Apparently, S_λ(Ψ^T x) gives the sparse coefficients of x under Ψ in the sense of an analysis model, while it also serves as the sparse coefficients under Φ in the sense of a synthesis model. In other words, Equation (7) admits that the synthesis and analysis models share almost the same sparse coefficients. As is well known, the standard orthogonal basis, which is a significant tool in signal representation and transformation, is a special kind of frame with frame bounds A = B = 1; in fact, the standard orthogonal basis is a special case of a Parseval frame. In order to extend the so-called perfect reconstruction property of the standard orthogonal basis beyond the non-redundant setting, we resort to the Parseval frame. Therefore, we propose the data-driven redundant transform based on Parseval frames (DRTPF), as follows:

y = S_λ(Ψ^T x),  (8)

x ← Φy,  (9)

ΦΨ^T = I,  (10)

where (8) is the forward transform and (9) is the backward transform. The relationship between Φ and Ψ is formulated as (10), which expresses the relationship between the frame and its dual frame. We further require Φ to be a Parseval frame:

Σ_{i=1}^M |⟨x, φ_i⟩|² = ‖x‖²₂  for all x.  (11)

The forward transform operator Ψ is also a Parseval frame, as it is a dual frame of Φ. Thus, the projection of the signal x over the Ψ domain satisfies

‖S_λ(Ψ^T x)‖²₂ ≤ ‖Ψ^T x‖²₂ = ‖x‖²₂.  (12)

Equation (12) indicates that the transform coefficients of the proposed DRTPF are bounded by the original signal x.
This constraint leads to a more robust result than traditional sparse models.
To convert DRTPF into an optimization problem, (11) can be written as the more compact expression ΦΦ T = I, which characterizes Φ in a way that is unrelated to the data. This property indicates that the rows of the frame Φ are orthogonal, thus satisfying the so-called perfect reconstruction property which ensures that a given signal can be perfectly represented by its canonical expansion (in a manner similar to orthogonal bases).
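A minimal sketch of the forward/backward transform pair (8)-(9), assuming numpy and a toy Parseval frame Φ = [I I]/√2 with Ψ = Φ (one valid dual choice, since ΦΨ^T = I), rather than a learnt pair:

```python
import numpy as np

def hard_threshold(c, lam):
    """Elementwise hard thresholding S_lambda: keep entries with |c_i| > lam_i."""
    out = c.copy()
    out[np.abs(c) <= lam] = 0.0
    return out

# Toy Parseval frame: Phi = [I I] / sqrt(2) satisfies Phi Phi^T = I,
# and Psi = Phi gives a valid dual pair (Phi Psi^T = I).
N = 4
Phi = np.hstack([np.eye(N), np.eye(N)]) / np.sqrt(2)
Psi = Phi.copy()

x = np.array([2.0, 0.0, -1.5, 0.1])
lam = 0.2 * np.ones(2 * N)

y = hard_threshold(Psi.T @ x, lam)   # forward transform (8): sparse coefficients
x_hat = Phi @ y                      # backward transform (9): reconstruction

# The Parseval property bounds the coefficients: ||Psi^T x|| = ||x||.
print(np.isclose(np.linalg.norm(Psi.T @ x), np.linalg.norm(x)))  # True
print(np.allclose(x_hat, [2.0, 0.0, -1.5, 0.0]))                 # True
```

Note how the small entry 0.1 is removed by the threshold while the large entries are reconstructed exactly.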
Assuming X ∈ R^{N×L} is the training data with signal vectors x_i ∈ R^N, i = 1, 2, . . . , L as its columns, an optimization model for training DRTPF can be written as

min_{Φ,Ψ,Y,λ} ‖X − ΦY‖²_F + η₁‖Y − S_λ(Ψ^T X)‖²_F  s.t.  ΦΦ^T = I, ΦΨ^T = I, ‖y_i‖₀ ≤ s, i = 1, . . . , L.  (13)

The dual frame condition ΦΨ^T = I and the Parseval frame condition ΦΦ^T = I imply that the difference of Φ and Ψ lies in the null space of Φ; that is, Ψ = Φ + A with ΦA^T = 0.
The rows a_i, i = 1, 2, · · · , N of A are orthogonal to the rows of Φ. Thus, it is clear that the dual frame Ψ combines two components: one spanned by Φ and one spanned by the a_i, i = 1, 2, · · · , N.

Transform Learning for the DRTPF Model
As there are no existing algorithms for solving problem (13), we apply the alternating direction method (ADM) and divide (13) into two sub-problems: a sparse coding phase, which updates the sparse coefficients Y and the threshold values λ (Section 3.2.1); and a transform operator pair update phase, which computes Φ and Ψ (Section 3.2.2).

Sparse Coding Phase
This subsection presents the sparse coding method for the proposed DRTPF model, in which the sparse coefficients Y are obtained by OMP, and the threshold values λ are obtained by a designed elementwise method.

The Y Subproblem
The pursuit of Y is equivalent to solving the following problem with fixed Φ, Ψ, and λ:

min_Y ‖X − ΦY‖²_F + η₁‖Y − S_λ(Ψ^T X)‖²_F  s.t. ‖y_i‖₀ ≤ s, i = 1, . . . , L,  (14)

which can be easily solved by OMP [14,34], as (14) can be easily converted to a classical synthesis sparse coding expression.

The λ Subproblem

With fixed Φ, Ψ, and Y, finding λ is equivalent to solving the following problem:

min_λ ‖Y − S_λ(Ψ^T X)‖²_F,  (15)

which can be decomposed into M individual optimization problems

arg min_{λ_i} Σ_{j∈J_i} (y_ij − ψ_i^T x_j)² + Σ_{j∉J_i} y_ij²,

where J_i = {j : |ψ_i^T x_j| > λ_i} is the set of indices surviving the threshold, y_ij denotes the (i, j)th entry of Y, and x_j denotes the jth column of X. As the cardinality of J_i depends on λ_i, we transform (15) into the minimization of

l(λ_i) = f(λ_i) + g(λ_i),  with  f(λ_i) = Σ_{j∉J_i} y_ij²  and  g(λ_i) = Σ_{j∈J_i} (y_ij − ψ_i^T x_j)².

We observe that the function f(λ_i) is monotonically increasing and that g(λ_i) is monotonically decreasing. We take |ψ_i^T x_j|, j = 1, 2, . . . , L as candidates and compute all the values of f(λ_i) + g(λ_i). Then, the optimal λ_i should lie in an interval determined by ψ_i^T x_k and ψ_i^T x_l, which correspond to the smallest and the second smallest values of f(λ_i) + g(λ_i), respectively. Then, any suitable value in this interval can be selected for λ_i. The algorithm for the threshold is summarized as Algorithm 1.

Transform Pair Update Phase
The Ψ Subproblem

With fixed Y and λ, the optimization problem to obtain Ψ is given by

min_Ψ ‖Y − S_λ(Ψ^T X)‖²_F  s.t. ΦΨ^T = I.

Algorithm 1: Sparse coding algorithm.

Input and Initialization:
Training data X ∈ R^{N×L}, iteration number r, initial value λ = 0.
Output: Sparse coefficients Y and threshold values λ.
Process:
1: Compute the sparse coefficients Y via (14), according to the OMP algorithm [14,34].
2: For i = 1 : M
3: Compute f(λ_i) + g(λ_i) at the candidates λ_i = |ψ_i^T x_j|, j = 1, 2, . . . , L; denote these values as a vector ν.
4: Sort the elements of |ψ_i^T X| and the columns of X according to ν. Denote the candidates attaining the smallest and second smallest values of ν as ψ_i^T x_k and ψ_i^T x_l, and choose λ_i between them.
End For
Such a problem is a highly nonlinear optimization problem, due to the definition of S_λ. We solve for Ψ columnwise, updating each column of Ψ while fixing the others. The product ΨΦ^T can be written as ψ_i φ_i^T + Σ_{p≠i} ψ_p φ_p^T. For each ψ_i, we solve the following subproblem:

min_{ψ_i} Σ_{j∈J_i} (y_ij − ψ_i^T x_j)² + Σ_{j∈Ĵ_i} (ψ_i^T x_j)²,  (20)

where z = I − Σ_{p≠i} ψ_p φ_p^T accounts for the dual frame constraint, J_i denotes the surviving indices (as before), and Ĵ_i is its complement. We then separate the problem into the two following sub-problems:

ψ_i^1 = arg min_{ψ_i} Σ_{j∈J_i} (y_ij − ψ_i^T x_j)²,  (21)

ψ_i^2 = arg min_{‖ψ_i‖₂=1} ‖ψ_i^T X_Ĵ_i‖²₂,  (22)

where y_ij denotes the (i, j)th entry of Y and x_j denotes the jth column of X. Equation (21) is a quadratic optimization, while Equation (22) has a closed-form solution given by the normalized singular vector corresponding to the smallest singular value of X_Ĵ_i. Based on the solutions of the two sub-problems, we give the solution of (20) as their combination; that is, ψ̂_i = (1/2)(ψ_i^1 + ‖ψ_i^1‖₂ ψ_i^2). Please note that the second solution is scaled by the norm of the first solution, as (21) serves as the dominant term for the Ψ subproblem, while the solution of (22) carries direction but no energy.
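The closed-form solution of (22) can be sketched as follows (numpy assumed; X_J here is a random stand-in for the actual sub-matrix of training patches whose responses should be small):

```python
import numpy as np

rng = np.random.default_rng(1)
X_J = rng.standard_normal((10, 30))   # hypothetical stand-in for X_\hat{J}_i

# min_{||psi|| = 1} ||psi^T X_J||_2 is attained by the left singular vector
# of X_J associated with its smallest singular value.
U, s, Vt = np.linalg.svd(X_J, full_matrices=False)
psi2 = U[:, -1]

# The residual of the minimizer equals the smallest singular value itself.
print(np.isclose(np.linalg.norm(psi2 @ X_J), s[-1]))   # True
```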

The Φ Subproblem
With fixed Y, λ, and Ψ, the model to obtain Φ is given by

min_Φ ‖X − ΦY‖²_F  s.t. ΦΦ^T = I, ΦΨ^T = I.  (24)

We denote the target function of (24) by h(Φ), apply the gradient descent method to the unconstrained version of (24),

Φ ← Φ − t ∇h(Φ),  (25)

and project the solution onto the feasible space. The gradient is given by

∇h(Φ) = 2(ΦY − X)Y^T.

We summarize our overall algorithm in Algorithm 2.

Input and Initialization:
Training data X, frame bounds (A, B), iteration number r.
Initialize frames Φ ∈ R^{N×M} and Ψ ∈ R^{N×M}, either with random entries or with randomly chosen data samples.
Output: The learnt frame pair (Φ, Ψ).
Process:
For k = 1 : r
1: Update the sparse coefficients Y and thresholds λ via Algorithm 1.
2: Update each column ψ_i of Ψ via (21) and (22).
3: Update Φ via gradient descent, as given in (25); the step length is usually set to 0.01.
End For
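The projection of the gradient-descent iterate back onto the Parseval constraint ΦΦ^T = I can be realized by mapping every singular value of the iterate to 1; this is our reading of the relaxation/contraction mapping of the singular values, sketched here with numpy, not code from the paper.

```python
import numpy as np

def project_parseval(Phi):
    """Project Phi onto {Phi : Phi Phi^T = I} by setting all of its
    singular values to 1 (relaxation/contraction of the spectrum)."""
    U, _, Vt = np.linalg.svd(Phi, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(2)
Phi = rng.standard_normal((6, 12))      # hypothetical gradient-descent iterate
Phi_p = project_parseval(Phi)
print(np.allclose(Phi_p @ Phi_p.T, np.eye(6)))   # True
```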

Image Denoising
We introduce a novel problem formulation for signal denoising by applying the data-driven redundant transform DRTPF. Image denoising aims to reconstruct a high-quality image I from its noise-corrupted version L, formulated as L = I + n, where n is additive noise. For a signal satisfying the DRTPF, the denoising model based on DRTPF is formulated as

min_{I,Y} ‖L − I‖²₂ + η Σ_i ‖y_i − S_λ(Ψ^T R_i I)‖²₂,  (26)

where R_i is an operator that extracts the ith patch of the image I, y_i is the ith column of Y, and λ denotes a vector [λ₁, λ₂, · · · , λ_M] with λ_j operating on the jth element of Ψ^T R_i I. On the right side of Equation (26), the first term is the global force, which demands proximity between the degraded image L and its high-quality version I. The other terms are the local constraints, which ensure that every patch at location i satisfies the DRTPF. This formulation assumes that the noisy image L can be approximated by a noiseless image Î whose patches, extracted by R_i, can be sparsely represented by the given transforms Φ and Ψ.
To solve Problem (26), we apply Algorithm 1 to obtain the sparse coefficients Y and the threshold values λ. We mainly state the iterative method to obtain I.
Denote d^k = Ψ^T R_i I^{k−1}. We set O^k as the index set that satisfies |d^k_l| ≤ λ_l, l ∈ O^k. Set u^k ∈ R^M as a vector with elements u^k_l = 0 for l ∈ O^k, and 1 otherwise. Then, the non-convex and non-smooth thresholds can be removed, with the substitution

y_i − S_λ(Ψ^T R_i I^k) ≈ y_i − (Ψ^T R_i I^k) ⊙ u^k.

Thus, in the kth step, the problem that needs to be solved can be expressed as

min_I ‖L − I‖²₂ + η Σ_i ‖y_i − (Ψ^T R_i I) ⊙ u^k‖²₂,  (27)

where ⊙ denotes pointwise multiplication. This convex problem can be easily solved by the gradient descent algorithm.
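One iteration of this masked gradient scheme can be sketched as follows; numpy is assumed, a toy Parseval frame and a single patch (so R_i = I) stand in for the trained pair and the patch extractor, and the weight η and step size are hypothetical choices.

```python
import numpy as np

# Toy Parseval frame pair; in the paper these come from DRTPF training.
N, M = 4, 8
Phi = np.hstack([np.eye(N), np.eye(N)]) / np.sqrt(2)
Psi = Phi.copy()

rng = np.random.default_rng(3)
L = rng.standard_normal(N)        # noisy signal (single "patch", R_i = I)
lam = 0.5 * np.ones(M)            # thresholds from the sparse coding phase
y = np.zeros(M)                   # sparse code, held fixed during this step
eta = 1.0                         # hypothetical weight on the local term

I = L.copy()
for _ in range(50):
    d = Psi.T @ I
    u = (np.abs(d) > lam).astype(float)   # mask u^k of surviving entries
    # gradient of ||L - I||^2 + eta * ||y - (Psi^T I) * u||^2 w.r.t. I
    grad = 2.0 * (I - L) + 2.0 * eta * Psi @ (u * (u * d - y))
    I -= 0.05 * grad
```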
We summarize the restoration algorithm in Algorithm 3.

Input
Trained dictionaries Φ, Ψ, iteration number r, a degraded image L; set I⁰ = L.
Output: The high-quality image Î.
Process:
1: Compute Y and λ via the method in Algorithm 1.
For k = 1 : r
2: Compute d^k = Ψ^T R_i I^{k−1}. Set O^k as the index set satisfying |d^k_l| ≤ λ_l, l ∈ O^k, and set u^k accordingly.
3: Update I^k by solving (27) via gradient descent.
End For

Experimental Results
We demonstrate the effectiveness of our proposed data-driven redundant transform based on Parseval frames (DRTPF) by first analyzing the robustness of the model against Gaussian white noise. We then discuss the convergence of the proposed transform learning algorithm and the ability of the proposed DRTPF to provide low sparsification errors. Finally, we evaluate the effectiveness of the proposed DRTPF by applying it to natural image denoising. We use a fixed step size in the transform update and denoising steps of our algorithms.

Robustness Analysis
In this subsection, we illustrate the robustness of DRTPF by training DRTPF on the image 'Barbara' and testing it by denoising the same image with Gaussian white noise added. The noise level (standard deviation) σ ranged from 20 to 60 with a step size of 2. In the experiment, the frames Φ and Ψ, of size 100 × 200, were initialized as the 1D overcomplete DCT (ODCT), and 10 × 10 overlapping mean-subtracted patches were used, with stride 1. We set the parameters η₁ = 1.1 and η₃ = 1 × 10⁷, and η₂ was replaced by the ℓ0 threshold 0.6σ (i.e., ‖Y‖₀ ≤ 0.6σ). For comparison, our proposed algorithm was compared with K-SVD [14]. The size of the dictionary learnt by K-SVD is 8 × 256 at its optimal state, according to previous work.
We show the denoising results in Figure 1, from which it is apparent that, as the noise grows, our DRTPF method outperforms K-SVD by an increasing margin. In other words, our proposed model has good robustness. In fact, in our model, the sparse coefficients are calculated accurately by the inner product of the signals and the frame Ψ, and are limited to a certain range; theoretically, it should therefore be more robust. The learnt transforms Φ and Ψ are illustrated in Figure 2. These figures show that our frame learning method can capture features in both analysis and synthesis fashions. Figure 3 shows two example visual results on the image 'Barbara' at noise levels σ = 30 and σ = 50. From Figure 3, we see that our proposed DRTPF obtains clearer features than K-SVD [14]. Figure 1. Robustness analysis. DRTPF is trained and tested using the image 'Barbara'. The X-axis is the noise level σ and the Y-axis is the PSNR. It can be seen that DRTPF performs more robustly than K-SVD.

Sparsification of Natural Images
A classic sparsifying transform learning model [21] is formulated as

min_{W,Y} ‖WX − Y‖²_F + µ‖W‖²_F − λ log det(W),  (28)

where X is the training data, Y are the sparse coefficients (subject to a sparsity constraint), and W is the learnt transform. The quality of the learnt transforms in the experiments of [21] was judged based on their condition number and sparsification error. Similar to the experimental setting in [21], we also evaluated the effectiveness of the transforms learnt by our DRTPF through their condition number and sparsification error. The ℓ2-norm condition number of the transform operator Φ is defined as the ratio of the maximum singular value to the minimum singular value of Φ; that is,

κ_Φ = σ_max(Φ) / σ_min(Φ).

In our case, the condition number κ_Φ = 1, as the maximum and minimum singular values (which are determined by the optimal frame bounds) are both equal to 1. Similarly, we obtain κ_Ψ = 1. A condition number equal to 1 is the best case for a transform operator. The sparsification error of the model (28) is defined as ‖WX − Y‖²_F. Similarly, we define the 'sparsification error' of the proposed DRTPF, which measures the energy loss due to sparse representation, as

‖Y − S_λ(Ψ^T X)‖²_F.

The 'sparsification error' indicates the compacting ability of the transform Ψ, with reasonable ignorance of the thresholding operator S_λ(·). To demonstrate that our model and algorithms are insensitive to the initialized transforms, we applied the proposed sparse coding and transform operator pair learning algorithms to train a pair of transforms. The training data are patches of size 10 × 10 extracted from the image 'Barbara', which is shown in Figure 4. The trained transform pair are of size 100 × 200. We extracted the patches without overlap and removed the DC value of every sample. We set the parameters η₁ = 1.1 and η₃ = 1 × 10⁷, and η₂ was replaced by the ℓ0 threshold 0.6σ, as before. The matrices used for initialization were the 1D DCT matrix, a matrix with random columns sampled from the training data, and the redundant identity matrix.
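The condition number computation is straightforward; a sketch with numpy and a toy Parseval frame (not a learnt one):

```python
import numpy as np

def condition_number(A):
    """l2 condition number: ratio of largest to smallest singular value."""
    s = np.linalg.svd(A, compute_uv=False)
    return s.max() / s.min()

# A toy Parseval frame: all singular values equal 1, hence kappa = 1,
# the best possible conditioning for a transform operator.
N = 5
Phi = np.hstack([np.eye(N), np.eye(N)]) / np.sqrt(2)
print(condition_number(Phi))   # ~1.0
```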
As the transform for DRTPF is redundant, the redundant identity matrix here is formed as [I I], where I is the identity matrix of size 100 × 100.
The convergence curves of the objective function and the 'sparsification error' are shown in Figure 5. From the left sub-figure of Figure 5, we see that our proposed algorithm for DRTPF converges, and all the initializations converge to the same result after about 20 iterations, which demonstrates that our proposed DRTPF and the corresponding algorithm are insensitive to different initializations. The right sub-figure of Figure 5 shows the 'sparsification error' of the three initialization methods, the 2D DCT transform, and the KLT transform. The 2D DCT is formed by the Kronecker product of two 1D DCT transforms, i.e., D = D₀ ⊗ D₀, where D₀ is the 1D DCT transform of size 8 × 8 and ⊗ denotes the Kronecker product. The KLT transform K of size 64 × 64 is obtained by the principal component analysis (PCA) method. The 'sparsification errors' of the 2D DCT and KLT are calculated via the model in [21] at iteration zero. This figure shows that the 'sparsification error' of the proposed DRTPF model also converges and is insensitive to the initialization matrices. In fact, the loss function of the proposed DRTPF mainly contains two parts: ‖X − ΦY‖²_F and ‖Y − S_λ(Ψ^T X)‖²_F. The first part is the recovery loss (i.e., the loss in the temporal domain) and the second part is the 'sparsification error' (i.e., the loss in the frequency domain). Our proposed model aims to achieve low error in both the temporal domain and the frequency domain.
To illustrate the behavior of the proposed DRTPF in image representation, we chose the six images shown in Figure 4 to train transforms and recover images. Figure 6 shows the average sparsity curve and the recovery PSNR values as the sparsity increases. From the left sub-figure, we know that the images are well sparsified along the iterative process. This figure was generated by setting ‖y_i‖₀ < 5, and the recovery PSNR is 32.27 dB. For each sample x_i, vectorized from a 10 × 10 patch, its corresponding sparse coefficient vector y_i is of length 200; the sparsity rate is thus lower than 2.5%. Furthermore, less than 5% of the data needs to be stored to recover an image with PSNR larger than 32.27 dB. The right sub-figure of Figure 6 shows the average recovery PSNR values as the sparsity increases, which is a main measurement of the quality of the learnt transform. From the figure, we know that, in most cases, our proposed DRTPF obtains better image quality, in terms of PSNR, at lower sparsity than the compared LST [21] method and the classic DCT transform. The transforms for the LST [21] method and the classic DCT are of size 64 × 64. The transform of LST [21] is trained on 4096 samples of size 8 × 8 extracted from each image shown in Figure 4, with the mean of the patches removed. The experimental settings are as illustrated in the paper [21]. When the total sparsity of a 512 × 512 image is more than 47,000, the recovery results of the proposed DRTPF and LST [21] are nearly the same; the recovery PSNR at sparsity 47,000 is 37.3 dB. Figure 5. The Y-axes are the objective function and the sparsification error, respectively. It can be seen that our DRTPF learning algorithm is convergent and insensitive to initialization.

Image Denoising
In this subsection, we evaluate the performance of our DRTPF model using six natural images of size 512 × 512, which are shown in Figure 4. We added Gaussian white noise to these images at different noise levels (σ = 20, 30, 40, 50, 60). We set the parameters η₁ = 1.1 and η₃ = 1 × 10⁷, and η₂ was replaced by the ℓ0 threshold 0.6σ, as before. We compared DRTPF with five related sparse representation methods: K-SVD [14], the overcomplete transform (T.KSVD) [3], the learning-based frame (DTF) [33], BM3D [35], and WNNM [36]. BM3D and WNNM are nonlocal-based methods, with parameters set as in the corresponding papers. We note that DTF works on filters, instead of image patches. In this experiment, our DRTPF method and K-SVD were configured the same as in Section 5.1. All methods were trained iteratively (25 times). The DTF method was initialized by 64 3-level Haar wavelet filters of size 16 × 16. The operator size of the T.KSVD method was 128 × 64, it worked on 8 × 8 overlapping mean-subtracted patches, and the hard threshold was s = 30. Table 1 shows the comparison results, in terms of average PSNR. As shown in Table 1, our DRTPF method and the DTF method outperformed K-SVD and T.KSVD on most images; e.g., our proposed DRTPF outperforms K-SVD by 0.47 dB and T.KSVD by 0.76 dB at noise level σ = 60. This result implies that methods using frames are more robust against noise. Furthermore, the higher the noise level, the larger the margin by which the DRTPF and DTF methods outperform K-SVD and T.KSVD. We can also see that our DRTPF method outperformed DTF on most of the images, especially when the noise level was very high. In fact, in our model, the sparse coefficients are calculated accurately by the inner product of the signals and the frame Ψ, and are limited to a certain range; theoretically, it should therefore perform better than the compared methods.
Figure 7 shows two example visual results, on the images 'Boat' and 'Man', at noise level σ = 40. The PSNR values of K-SVD, T.KSVD, DTF, and the proposed DRTPF are 27.17 dB, 26.14 dB, 26.99 dB, and 27.34 dB for 'Man', and 27.23 dB, 26.45 dB, 27.20 dB, and 27.39 dB for 'Boat'. Our proposed DRTPF and the DTF method preserve more features and achieve higher PSNR values on the two images than K-SVD and T.KSVD. Though DTF provides higher PSNR values than K-SVD and T.KSVD, and better visual performance, its results suffer from deformation and margin smoothing, as it is based on filters. The proposed DRTPF shows much clearer and better visual results than the other competing methods, without any deformation. All six methods can be classified into two categories: (1) those without any extra constraint, e.g., nonlocal similarity; and (2) those with an additional prior, such as nonlocal similarity. Our proposed DRTPF belongs to category (1). We would like to point out that our goal was to establish a redundant transform learning method, not to focus on image denoising. Our model is plain, without applying any extra prior besides the basic sparsity characteristics of the signals. The experimental results demonstrate that our proposed models can achieve better performance than traditional sparse models in image denoising. However, the methods BM3D and WNNM are based on image nonlocal self-similarity (NSS). The NSS prior refers to the fact that, for a given local patch in a natural image, one can find many similar patches across the image. Intuitively, by stacking nonlocal similar patch vectors into a matrix, this matrix should be low-rank and have sparse singular values. The exploitation of NSS has been used to significantly boost image denoising performance. We have not incorporated this prior into our model. Figure 7. Denoising results on the images 'Boat' and 'Man' by T.KSVD [3], K-SVD [14], DTF [33], and DRTPF.

Conclusions
In this paper, we propose a Parseval frame-based data-driven overcomplete transform (DRTPF) to capture features of images. We also propose the corresponding formulations, as well as algorithms for calculating the sparse coefficients and learning the DRTPF model. We have proposed a general frame learning method without imposing any structure on the frame. By applying frames to redundant transforms, we combine the ideas of analysis and synthesis sparse models and let them share almost identical sparse coefficients. We conducted robustness analysis, sparsification of natural images, and image denoising experiments, which demonstrated that DRTPF can outperform state-of-the-art models, as it exploits the underlying sparsity of natural signals by the integration of frames and sparse models.
In future work, we shall consider more efficient optimization algorithms for DRTPF, to further improve the representation ability and broaden the applications of the proposed method.