Sparse Coding Algorithm with Negentropy and Weighted $\ell_1$-Norm for Signal Reconstruction

Abstract: Compressive sensing theory has attracted widespread attention in recent years, and sparse signal reconstruction has been widely used in signal processing and communication. This paper addresses the problem of sparse signal recovery, especially with non-Gaussian noise. The main contribution of this paper is an algorithm in which negentropy and a reweighting scheme form the core of the approach. The signal reconstruction problem is formalized as a constrained minimization problem, where the objective function is the sum of a term measuring the statistical characteristics of the error, the negentropy, and a sparse regularization term, the $\ell_p$-norm, for 0 < p < 1. The $\ell_p$-norm, however, leads to a non-convex optimization problem which is difficult to solve efficiently. Herein we treat the $\ell_p$-norm as a series of weighted $\ell_1$-norms so that the sub-problems become convex. We propose an optimization algorithm based on forward-backward splitting. The algorithm is fast and succeeds in exactly recovering sparse signals with Gaussian and non-Gaussian noise. Several numerical experiments and comparisons demonstrate the superiority of the proposed algorithm.


Introduction
Sparse signal reconstruction, or compressed sensing, is an emerging field in signal processing and communication [1][2][3][4][5][6]. The problem of recovering a sparse signal from a very low number of linear measurements arises in many real application fields, ranging from error correction and lost data recovery to image acquisition and reconstruction. In general, an N-dimensional sparse signal can be described by M < N significant coefficients in an appropriate transform domain. In some cases, a signal which is non-sparse in the time domain can still be classified as sparse, because it shows sparsity in the spatial domain or in some appropriate transform domain, such as the frequency domain or the Gabor transform domain.
Entropy 2017, 19, 599

In this paper, the signals will be treated as real-valued functions having domains that are either continuous or discrete, and either infinite or finite. We will typically be concerned with normed vector spaces. In the case of a discrete, finite domain, the signals can be viewed as vectors in an n-dimensional Euclidean space, denoted by $\mathbb{R}^n$. According to compressed sensing theory, the sparse signal reconstruction problem can be formalized as a constrained minimization problem, where the objective function defines the sparsity. The basic mathematical model is:

$$x = As + e \tag{1}$$

where $A \in \mathbb{R}^{M \times N}$ is a known measurement matrix with M < N that needs to satisfy the restricted isometry property (RIP) condition; any M column vectors are linearly independent [7]. $x \in \mathbb{R}^{M \times 1}$ is the available measurement vector and $e \in \mathbb{R}^{M \times 1}$ is an unknown noise vector. The sparse reconstruction problem can be cast as: given the M × N measurement matrix A, find the vector $s \in \mathbb{R}^{N \times 1}$ that has only K nonzero components, with $K \ll N$ (such a vector will be called a K-sparse vector), from the measurements x, by solving the constrained problem (2), in which a sparsity-inducing function F(s) is minimized subject to the measurement model in Equation (1).

The most natural choice for F(s) is $F(s) = \|s\|_0$, where $\|s\|_0$ denotes the $\ell_0$-norm, which counts the number of non-zero components in s. With $F(s) = \|s\|_0$, however, the problem in (2) becomes a combinatorial optimization and is proven to be non-deterministic polynomial (NP)-hard [8]. Recent results have shown that the use of different sparsity-inducing functions allows us to exactly and efficiently recover a sparse signal from a lower number of measurements [9][10][11]. For instance, the $\ell_p$-norm is introduced as a relaxation of the $\ell_0$-norm, and the problem can be formulated by applying constraints on the signal model and introducing a cost function:

$$\min_{s} \; \|x - As\|_2^2 + \lambda \|s\|_p^p \tag{3}$$

where $\|s\|_p = \left(\sum_i |s_i|^p\right)^{1/p}$ is the $\ell_p$-norm of s, with 0 < p < 1, and λ > 0 is the regularizing parameter which controls the sparsity. Many algorithms have been developed in the literature to solve the problem in (3) [12][13][14][15][16][17][18][19][20]. These algorithms employ the mean square error (MSE) [21] criterion, which is based on second-order statistics and is optimal when e is Gaussian noise. In practical applications, however, the transmitted signals are distorted not only by Gaussian noise but also by other kinds of noise, such as burst noise and high noise. Burst noise is a type of discrete noise and consists of sudden interruptions. High noise has high energy, frequency or power. These noises have non-Gaussian characteristics. In such cases, the MSE criterion becomes less robust.
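The sparsity measures discussed above can be compared numerically. The following is a minimal sketch, not part of the original method: the helper names `l0_norm` and `lp_penalty` and the two test vectors are hypothetical and chosen only for illustration. It shows that, for two vectors with the same $\ell_2$-norm, the penalty $\sum_i |s_i|^p$ favors the sparse one and approaches the count of non-zero components as p decreases.

```python
import numpy as np

def l0_norm(s, tol=0.0):
    """Number of components whose magnitude exceeds tol."""
    return int(np.sum(np.abs(s) > tol))

def lp_penalty(s, p):
    """Sum of |s_i|^p, i.e. the p-th power of the l_p (quasi-)norm."""
    return float(np.sum(np.abs(s) ** p))

# Two illustrative vectors with the same l2-norm (equal to 2).
sparse = np.array([0.0, 2.0, 0.0, 0.0])
dense = np.array([1.0, 1.0, 1.0, 1.0])

for p in (1.0, 0.5, 0.1):
    # As p shrinks, the penalty of `sparse` tends to its l0-norm (1),
    # while the penalty of `dense` stays at 4.
    print(p, lp_penalty(sparse, p), lp_penalty(dense, p))
```

Smaller values of p therefore separate sparse and dense candidates more sharply, at the price of a non-convex penalty.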
In this work, we focus on the problem of sparse signal reconstruction, especially with non-Gaussian noise. This problem is formalized as a constrained minimization problem and verified by simulation. We propose a sparse coding algorithm in which the sparse signal is recovered by applying the negentropy [22] as the error measurement and the $\ell_p$-norm as the sparsity regularization. The $\ell_p$-norm is not commonly used as a sparsity constraint because of its non-convexity. To solve the corresponding non-convex minimization problem, we treat the $\ell_p$-norm as a series of weighted $\ell_1$-norms and convert the problem into a series of convex optimizations. The negentropy, rather than the MSE, is used for two reasons. First, in the square error (SE) case, the minimization acts on a cost function of the form MSE + $\ell_p$-norm, as in Equation (3). Since the SE is required to be as small as possible while the $\ell_1$-norm is lower bounded by a finite value, the optimization cannot reach a very small SE value, which leads to a biased solution. In our case, the cost is negentropy + (weighted) $\ell_1$-norm; the negentropy is to be maximized and has a finite value in the optimal state, so that an unbiased solution can be obtained. Second, negentropy can tolerate larger, non-Gaussian noise on the zero-valued components, so that the estimation of the non-zero-valued components becomes more accurate. For efficient optimization, we propose an algorithm with two main steps: first we apply a gradient-based maximization to the negentropy only; then we find a sparser solution within the neighborhood of the result of the gradient-based maximization. Such a strategy is known in the literature as the forward-backward splitting (FOBOS) algorithm [23].
The proposed algorithm is distinguished from other related works in that we present a novel objective function for sparse signal recovery. The sparse signals can be estimated by applying the negentropy as the error measurement and the weighted $\ell_1$-norm as the sparsity regularization. An effective algorithm is provided, which has improved accuracy and convergence rate.
This paper is organized as follows: in Section 2 we recall the least absolute shrinkage and selection operator (LASSO) algorithm, which was developed to solve the problem in Equation (3), and we propose a new one. The formulation of our algorithm is presented and the details of the targeted minimization problem are described. Then we propose to use the negentropy and the weighted $\ell_1$-norm as the core of our algorithm. Numerical results which show the effectiveness of the proposed algorithm are presented in Section 3. Section 4 concludes this paper.

Least Absolute Shrinkage and Selection Operator
Although the $\ell_p$-norm, with 0 < p < 1, seems to be the natural choice for sparsity regularization, the fact that it is not convex makes the optimization process hard. The $\ell_1$-norm is the norm "closest" to it that retains the computationally attractive property of convexity, and it has long been used for the problem in Equation (3). Here, the LASSO, a hot research topic in the statistics community, is discussed as a way to solve the problem. We intend to recover the original signal s from (1). $\hat{s}$ is the sparse reconstruction of the original signal, obtained by optimizing the following LASSO function [24,25]:

$$\hat{s} = \arg\min_{s} \left\{ \|x - As\|_2^2 + \lambda \|s\|_1 \right\} \tag{4}$$

One term is the loss function based on the MSE criterion, which drives the residual error to be as small as possible. The other term is a regularization function which uses the $\ell_1$-norm to obtain a sparse solution. Because of the $\ell_1$-norm constraint, the estimated coefficient vector tends to be sparse. Therefore, only the main part of the signal is approximated by a linear combination of the basis set, s can be reconstructed by sparse coding, and the noise can be removed. However, the MSE criterion has little robustness when the signal reconstruction process involves non-Gaussian noise. We want to improve Equation (4) in order to tolerate larger, non-Gaussian noise and obtain a sparser solution.
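To make the LASSO formulation concrete, the following sketch minimizes $\|x - As\|_2^2 + \lambda\|s\|_1$ with the iterative soft-thresholding algorithm (ISTA). ISTA is one standard solver for this objective and is used here only as an illustration; it is not necessarily the solver used in [24,25]. The function names, the Gaussian test matrix, the noise level and the value of `lam` are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(v, t):
    """Component-wise soft thresholding: shrink |v_i| by t, clipping at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, x, s, lam):
    """Squared residual plus lam times the l1-norm of s."""
    r = x - A @ s
    return float(r @ r + lam * np.sum(np.abs(s)))

def lasso_ista(A, x, lam, n_iter=500):
    """ISTA with constant step 1/L, where L is the Lipschitz constant of the gradient."""
    # The gradient of ||x - As||^2 is -2 A^T (x - As); its Lipschitz constant is
    # 2 * sigma_max(A)^2, with sigma_max the spectral norm of A.
    L = 2.0 * np.linalg.norm(A, 2) ** 2
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = -2.0 * A.T @ (x - A @ s)
        s = soft_threshold(s - grad / L, lam / L)
    return s

# Illustrative problem (all values are assumptions, not the paper's setup).
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
s_true = np.zeros(100)
s_true[rng.choice(100, size=5, replace=False)] = rng.standard_normal(5)
x = A @ s_true + 0.01 * rng.standard_normal(40)

lam = 0.01
s_hat = lasso_ista(A, x, lam)
print(lasso_objective(A, x, np.zeros(100), lam), lasso_objective(A, x, s_hat, lam))
```

With the step size 1/L the objective value does not increase from one iteration to the next, which is why the final objective is no larger than the one at the all-zero starting point.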

Proposed Minimization Formulation
In this section we describe a novel and effective proposal for the constrained minimization problem in Equation (3). With complex noise and more sparsity, a challenging optimization problem has to be solved. The proposed algorithm is based on the signal model in Equation (1), presented in Figure 1, where $s = [s(0), s(1), \cdots, s(N-1)]^T \in \mathbb{R}^{N \times 1}$ is sparse enough that the number of non-zero components, K, in s is less than M. For the model, the measurement matrix A and the measurement vector x are known, while the vector s is to be estimated and reconstructed, under the assumption that s is sparse.

To quantify the fidelity of the estimation, one can measure a kind of distance between x and $\hat{x} = A\hat{s}$. In most situations, the root mean square error is used for this purpose. If the error is Gaussian distributed, the MSE is optimal, since the first and second moments incorporate all statistical characteristics. However, if the error is not Gaussian distributed, as with audio signals, images and communication signals, the MSE will not be optimal. Considering applications to such cases, in this paper we use the non-Gaussianity of the error. As a sparsity measure we use the $\ell_p$-norm. Therefore, the sparse vector s can be estimated by minimizing the following objective function, and the optimization problem is formulated as follows:

$$\hat{s} = \arg\min_{s} J(s), \qquad J(s) = -J_n(e) + \lambda \|s\|_p^p \tag{5}$$

where $e = x - As$, $J_n(\cdot)$ denotes the negentropy, $\|\cdot\|_p$ is the $\ell_p$-norm and λ is the regularizing parameter. Note that algorithms based on the MSE criterion have also proven to be weaker in convergence and accuracy.
Here, we apply the negentropy and the $\ell_p$-norm to construct the objective function, where the negentropy measures e and the sparsity constraint promotes sparsity. A small error on the zero components can be tolerated in exchange for high accuracy on the non-zero components, in order to retain all significantly large components, so $\hat{s}$ can be more exact and the performance of sparse signal reconstruction can be noticeably improved.

Negentropy Maximization
In information theory, a Gaussian variable has the maximum entropy among all random variables with the same variance. Therefore, one can use entropy to measure non-Gaussian noise. A modified form of entropy is the negentropy, which is used as the error measurement in the proposed algorithm. Negentropy can tolerate non-Gaussian noise on the signal, so that the estimation of the sparse signal becomes more accurate. The negentropy is defined as $J_n(e) = H(e_{\mathrm{gauss}}) - H(e)$, where $H(\cdot)$ is the differential entropy, $H(e) = -\int p_e(\xi) \log p_e(\xi)\, d\xi$, and $e_{\mathrm{gauss}}$ is a Gaussian random variable with the same variance as e. When e is Gaussian, $J_n(e) = 0$. $J_n(e)$ grows as e becomes more non-Gaussian, so $J_n(e)$ reflects the non-Gaussian level of e. Evaluating this definition in sparse signal reconstruction requires the probability density of the variables to be known, which is impractical for the noise e. Here, the negentropy is approximately given by $[E\{G(e)\} - E\{G(e_{\mathrm{gauss}})\}]^2$.
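The approximation above can be checked numerically. The sketch below estimates $[E\{G(e)\} - E\{G(e_{\mathrm{gauss}})\}]^2$ by replacing the expectations with sample means. It assumes the contrast function $G(u) = \log\cosh(u)$ and uses a unit-variance Laplacian sample as a stand-in for super-Gaussian noise; both choices, as well as the names `log_cosh` and `negentropy_proxy`, are illustrative assumptions rather than the paper's experimental settings.

```python
import numpy as np

def log_cosh(u):
    """Numerically stable log(cosh(u)) = logaddexp(u, -u) - log(2)."""
    return np.logaddexp(u, -u) - np.log(2.0)

def negentropy_proxy(e, rng, n_ref=200_000):
    """[E{G(e)} - E{G(e_gauss)}]^2 with G = log cosh and sample-mean expectations."""
    e = np.asarray(e, dtype=float)
    # Reference Gaussian sample with the same standard deviation as e.
    e_gauss = rng.normal(0.0, e.std(), size=n_ref)
    return float((log_cosh(e).mean() - log_cosh(e_gauss).mean()) ** 2)

rng = np.random.default_rng(0)
n = 200_000
gaussian = rng.normal(0.0, 1.0, size=n)
# Laplace distribution with scale 1/sqrt(2) has unit variance and positive kurtosis.
laplacian = rng.laplace(0.0, 1.0 / np.sqrt(2.0), size=n)

j_gauss = negentropy_proxy(gaussian, rng)
j_laplace = negentropy_proxy(laplacian, rng)
print(j_gauss, j_laplace)
```

The proxy stays close to zero for the Gaussian sample and is clearly larger for the Laplacian one, in line with the statement that $J_n(e)$ reflects the non-Gaussian level of e.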
Assume e is super-Gaussian, i.e., non-Gaussian with a kurtosis value greater than zero. Since this is true in most realistic situations, we can simply use:

$$J_n(e) = E\{G(e)\} - E\{G(e_{\mathrm{gauss}})\} \tag{6}$$

where $E\{\cdot\}$ denotes the statistical expectation and $G(\cdot)$ is a nonlinear function, such as $G(e) = e^3$, $G(e) = e \cdot \exp(-e^2/2)$ and $G(e) = \tanh(c \cdot e)$ [22]. In this paper, as an instance, the function is as follows:

$$G(e) = \frac{1}{c} \log \cosh(ce) \tag{7}$$

where c is a constant. This nonlinear function is smooth and differentiable. When $c \gg 1$, G(e) is approximately equal to |e|. The gradient of $J_n(e)$ can then be calculated in order to maximize the negentropy, so that $-J_n(e)$ is minimized; this is an effective method to solve the optimization problem. The gradient of $J_n(e)$ with respect to s is given by Equation (8), where g(e) is the derivative of G(e), $g(e) = \tanh(ce)$. The sparse signal s is then iteratively updated by the gradient step of Equation (9), which produces the intermediate estimate $s^{k+\frac{1}{2}}$ from $s^k$, where k represents the number of iterations and $\mu_n \ge 0$ is the step size.
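The two properties of the nonlinear function stated above can be verified numerically. The sketch assumes $G(e) = \frac{1}{c}\log\cosh(ce)$, which is the antiderivative of the stated $g(e) = \tanh(ce)$ with $G(0) = 0$, and uses c = 7, the value listed among the experimental parameters. The grid of test points and the finite-difference step are arbitrary choices for the check.

```python
import numpy as np

def G(e, c):
    """G(e) = (1/c) * log(cosh(c*e)), evaluated in a numerically stable way."""
    u = c * np.asarray(e, dtype=float)
    return (np.logaddexp(u, -u) - np.log(2.0)) / c

def g(e, c):
    """Derivative of G with respect to e."""
    return np.tanh(c * np.asarray(e, dtype=float))

c = 7.0
e = np.linspace(-3.0, 3.0, 601)

# |e| - G(e) equals (log 2 - log(1 + exp(-2c|e|))) / c, so it lies in [0, log(2)/c).
gap = np.abs(e) - G(e, c)

# A central finite difference of G should match g.
h = 1e-6
finite_diff = (G(e + h, c) - G(e - h, c)) / (2.0 * h)

print(gap.max(), np.max(np.abs(finite_diff - g(e, c))))
```

The gap between |e| and G(e) is bounded by log(2)/c, which is why a larger c makes G(e) a closer smooth surrogate of |e|.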

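Weighted $\ell_1$-norm and FOBOS
Considering the $\ell_p$ regularization, we find a sparse solution in the neighborhood of $s^{k+\frac{1}{2}}$. That is, we solve the following problem:

$$s^{k+1} = \arg\min_{s} \left\{ \frac{1}{2} \left\| s - s^{k+\frac{1}{2}} \right\|_2^2 + \lambda \|s\|_p^p \right\} \tag{10}$$

where $s^{k+\frac{1}{2}}$ is the updated vector from (9). Since the sparsity constraint $\|s\|_p^p$ yields sparser results when p is closer to zero, we tend to choose a value of p between 0 and 1. However, this choice makes the optimization problem non-convex. To restore a convex sparsity constraint, we propose to approximate the $\ell_p$-norm sparsity constraint with a weighted $\ell_1$-norm. Thus, Equation (10) is rewritten as follows:

$$s^{k+1} = \arg\min_{s} \left\{ \frac{1}{2} \left\| s - s^{k+\frac{1}{2}} \right\|_2^2 + \lambda \sum_{i} w_i |s_i| \right\} \tag{11}$$

where $w \in \mathbb{R}^{N \times 1}$ is the weight vector. Since $|s_i|^p = |s_i|^{p-1} |s_i|$, each weight $w_i$ plays the role of $|s_i|^{p-1}$ evaluated at the current estimate. Then, we update each component of $s^{k+\frac{1}{2}}$ by a FOBOS-like algorithm as:

$$s_i^{k+1} = \mathrm{sign}\left(s_i^{k+\frac{1}{2}}\right) \max\left\{0, \left(\left|s_i^{k+\frac{1}{2}}\right| - \lambda w_i\right)\right\} \tag{12}$$

where $s_i^{k+\frac{1}{2}}$ is the i-th component of $s^{k+\frac{1}{2}}$.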
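The thresholding step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the threshold is taken as $\lambda w_i$, which is the closed-form minimizer of $\frac{1}{2}\|s - v\|_2^2 + \lambda \sum_i w_i |s_i|$, and the weights are taken as $w_i = (|s_i| + \delta)^{p-1}$, a common choice in reweighted $\ell_1$ schemes. The role given to δ here (keeping the weights finite at zero components) is an assumption; the values of λ, p and δ are those listed in the experimental section, and the input vector is made up for the example.

```python
import numpy as np

def lp_weights(s, p, delta):
    """Weights w_i = (|s_i| + delta)^(p - 1); delta > 0 keeps them finite at s_i = 0.

    The use of delta in this formula is an assumption made for illustration.
    """
    return (np.abs(s) + delta) ** (p - 1.0)

def weighted_soft_threshold(v, w, lam):
    """Closed-form minimizer of 0.5*||s - v||^2 + lam * sum_i w_i*|s_i| (w_i >= 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam * w, 0.0)

# Hypothetical intermediate estimate with two large and three small entries.
s_half = np.array([0.9, -0.02, 0.0, -1.5, 0.01])
w = lp_weights(s_half, p=0.9, delta=1e-7)
s_next = weighted_soft_threshold(s_half, w, lam=1.57e-2)
print(w)
print(s_next)
```

Small components receive large weights and are set exactly to zero, while large components are shrunk only slightly, which is how the weighted $\ell_1$-norm mimics the $\ell_p$ penalty.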

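1. Compute $s^{k+\frac{1}{2}}$ from $s^k$ by the gradient step in Equation (9).
2. Update each component of $s^{k+\frac{1}{2}}$ by the soft thresholding in Equation (12) to obtain $s^{k+1}$.
3. If the stop condition is satisfied, the algorithm ends; otherwise go to step 1.

Output: $\hat{s}$
The stop condition can be the desired accuracy, convergence or a maximal iteration number.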
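The stop conditions mentioned above can be combined in a small driver loop. The sketch below is generic: `iterate_until_stop`, the relative-change test, the tolerance and the toy update used in the demonstration are all hypothetical and only illustrate how a convergence test and a maximal iteration number can be checked together.

```python
import numpy as np

def iterate_until_stop(step, s0, tol=1e-7, max_iter=1000):
    """Apply `step` repeatedly until the relative change drops below tol
    or max_iter iterations have been performed.

    Returns (estimate, number of iterations, converged flag).
    """
    s = np.asarray(s0, dtype=float)
    for k in range(1, max_iter + 1):
        s_new = step(s)
        # Relative change between consecutive iterates (guarded against division by zero).
        change = np.linalg.norm(s_new - s) / max(np.linalg.norm(s), 1e-12)
        s = s_new
        if change < tol:
            return s, k, True
    return s, max_iter, False

# Toy update with fixed point 2.0 in every component, used only to exercise the loop.
estimate, n_iter, converged = iterate_until_stop(lambda s: 0.5 * s + 1.0, np.zeros(2))
print(estimate, n_iter, converged)
```

In practice `step` would perform one pass of steps 1 and 2 of the algorithm.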

Results and Discussions
In this section, we analyze the performance of the proposed algorithm for sparse signal reconstruction with negentropy and weighted $\ell_1$-norm described in Section 2. For the numerical analysis, we present experimental results that show whether or not the algorithm can recover the true sparse signal $s_{\mathrm{orig}}$. The original sparse signal $s_{\mathrm{orig}}$ was a 100 × 1 vector whose values were drawn randomly from a normal distribution N(0, 1). In each sparse vector there were several non-zero values, whose positions were also chosen randomly. The measurement matrix A was of size 40 × 100. Correspondingly, $x_{\mathrm{orig}}$ had 40 samples and was generated from A and $s_{\mathrm{orig}}$ using the equation $x_{\mathrm{orig}} = A s_{\mathrm{orig}} + e$. The noise vectors e were generated from Gaussian and non-Gaussian random entries with various signal-to-noise ratios (SNR). The other parameters were set as follows: c = 7, $\lambda = 1.57 \times 10^{-2}$, $\mu_n = 6.1 \times 10^{-3}$, p = 0.9, $\delta = 10^{-7}$. In order to evaluate the performance of the proposed algorithm, we use the normalized $\ell_2$-error as a criterion to measure the reconstruction accuracy for sparse signals, where the normalized $\ell_2$-error is defined as $\|\hat{s} - s_{\mathrm{orig}}\|_2^2 / \|s\|_2^2$.
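A setup of this kind can be sketched as follows. Several details are not specified in the text and are filled in as labeled assumptions: the entries of A are drawn from N(0, 1), the number of non-zero components defaults to K = 5, the non-Gaussian noise is Laplacian, the SNR is defined as the ratio of signal energy to noise energy in dB, and the error is normalized by $\|s_{\mathrm{orig}}\|_2^2$. The function names are hypothetical.

```python
import numpy as np

def make_problem(n=100, m=40, k=5, snr_db=20.0, noise="gaussian", seed=0):
    """Generate (A, s_orig, x, e) with x = A @ s_orig + e at the requested SNR."""
    rng = np.random.default_rng(seed)
    s_orig = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)   # random non-zero positions
    s_orig[support] = rng.standard_normal(k)         # non-zero values from N(0, 1)
    A = rng.standard_normal((m, n))                  # assumption: Gaussian measurement matrix
    clean = A @ s_orig
    if noise == "gaussian":
        e = rng.standard_normal(m)
    else:
        e = rng.laplace(size=m)                      # assumption: Laplacian as non-Gaussian noise
    # Rescale the noise so that 10*log10(||clean||^2 / ||e||^2) equals snr_db.
    e *= np.sqrt(np.sum(clean ** 2) / (np.sum(e ** 2) * 10.0 ** (snr_db / 10.0)))
    return A, s_orig, clean + e, e

def normalized_l2_error(s_hat, s_orig):
    """Squared l2 distance to s_orig, normalized by ||s_orig||^2 (assumed normalization)."""
    return float(np.sum((s_hat - s_orig) ** 2) / np.sum(s_orig ** 2))

A, s_orig, x, e = make_problem(noise="laplace")
snr_check = 10.0 * np.log10(np.sum((A @ s_orig) ** 2) / np.sum(e ** 2))
print(A.shape, x.shape, np.count_nonzero(s_orig), snr_check)
```

The all-zero estimate has a normalized error of exactly 1 under this definition, which gives a simple reference point when reading error curves.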

The Reconstructed Signal Comparison between MSE Criterion and Proposed Algorithms
As discussed in Section 2.1, in the algorithm based on the MSE criterion, the objective function is:

$$J(s) = \|x - As\|_F^2 + \lambda \|s\|_1 \tag{13}$$

The measurement of e in Equation (13) is a Frobenius norm, which is the common model of error [26]. As described in Section 2, we apply the negentropy and the weighted $\ell_1$-norm to form the objective function, where the negentropy measures e and the sparsity constraint can yield sparser results. Here, we compare the proposed algorithm with the MSE algorithm by plotting the recovered and original signals.
Figure 2 compares the reconstructed and original signals for the MSE and proposed algorithms. From Figure 2, it is clear that the MSE algorithm does not work well, since it does not exploit the sparsity effectively and is based on the assumption that the noise has Gaussian characteristics. The zero components of $s_{\mathrm{orig}}$ are estimated exactly with the proposed algorithm, but are partly non-zero with the MSE algorithm. Furthermore, the non-zero components of $s_{\mathrm{orig}}$ are recovered more precisely with the proposed algorithm, where $\hat{s}$ has better sparse characteristics.

The Accuracy Performance Comparison of the Algorithms
For a more complete description of how the performance is affected by noise, the normalized $\ell_2$-error of $\hat{s}$ is plotted under different SNRs in Figure 3. The results demonstrate a significant reduction of the normalized $\ell_2$-error as the SNR increases. More specifically, the proposed algorithm performs better, with an error on the order of $10^{-2}$, compared with the MSE algorithm, whose error is on the order of $10^{-1}$. Therefore, the proposed algorithm has higher accuracy, as expected, and is more suitable for sparse signal reconstruction with complex noise.


The Convergence Performance Comparison of the Algorithms
To illustrate the convergence of the algorithm, we present the normalized $\ell_2$-error versus the number of iterations in Figure 4, under the same non-Gaussian noise conditions. As the number of iterations increases, the relative error of the reconstructed sparse signal decreases and stabilizes at a certain error value once convergence is reached. Figure 4 shows that the proposed algorithm needs fewer iterations to converge than the MSE algorithm, i.e., it converges faster. The novel loss function and the FOBOS-like algorithm lead to the better convergence rate. In addition, the stable normalized $\ell_2$-error is lower than that of the MSE algorithm, as seen in Figure 4. With respect to the performance of reconstructing sparse signals, the recovery ratios are shown in Figure 5. The recovery ratio of the proposed algorithm reaches 100% in fewer iterations than that of the MSE algorithm. Therefore, the proposed algorithm has significant advantages in computational complexity and especially in recovery ratio.


Figure 6 presents the normalized $\ell_2$-error for different sparsity levels, i.e., different numbers of non-zero components. We consider a randomly generated K-sparse signal s of length N = 100 with K = 5, 10, 15, 20 non-zero components. It is apparent that, across the different sparse signals, the normalized $\ell_2$-error decreases as the number of non-zero components decreases. Therefore, when implementing the proposed strategy in practice, it is important to consider the sparsity of the signals.

Conclusions
We have presented an effective algorithm for reconstructing sparse signals, which is based on a convex optimization problem built on the proposed objective function. The objective function includes the negentropy as the fitting error measurement and the $\ell_p$-norm, which is treated as a weighted $\ell_1$-norm, as the sparsity-promoting term. Furthermore, the proposed algorithm includes two main steps in each iteration: (1) a gradient-based maximization of the negentropy; (2) a soft thresholding of the result of (1), for sparsity promotion. Experiments show that the proposed algorithm has improved accuracy and convergence rate, especially when the noise in the information transmission is non-Gaussian. In our future research, the algorithm will be optimized for complexity, and real-world applications in the signal processing and communication fields will also be considered.


Figure 2. Reconstructed signal $\hat{s}$ and original signal $s_{\mathrm{orig}}$ comparison between MSE and proposed algorithms. (a) MSE algorithm with Gaussian noise; (b) proposed algorithm with Gaussian noise; (c) MSE algorithm with non-Gaussian noise; (d) proposed algorithm with non-Gaussian noise.

Figure 3. Normalized $\ell_2$-error performance versus SNR. (a) Comparison between MSE and proposed algorithms with Gaussian noise; (b) comparison between MSE and proposed algorithms with non-Gaussian noise.

Figure 5. Recovery ratios of $\hat{s}$ versus iteration number.

Figure 6. Normalized $\ell_2$-error performance versus iteration number for different sparse signals with the proposed algorithm.
To summarize, the method proposed in this paper is presented in Algorithm 1.

Algorithm 1. Proposed algorithm for sparse signal reconstruction with negentropy and weighted $\ell_1$-norm.
Task: Estimate the sparse signal s by minimizing $J(s) = -J_n(e) + \lambda \|s\|_p^p$.
Initialization: input the signal matrix A, the system noise e and proper c, λ, $\mu_n$, p and δ. Initialize $s^0$ and $w^0$.