A Two-Stage Method to Test the Robustness of the Generalized Approximate Message Passing Algorithm

: We propose a two-stage method to test the robustness of the generalized approximate message passing algorithm (GAMP). A pursuit process based on the marginal posterior probability is inserted in the standard GAMP algorithm to ﬁnd the support of a sparse vector, and a revised GAMP process is used to estimate the amplitudes of the support. The numerical experiments with simulation and real world data conﬁrm the robustness and performance of our proposed algorithm


Introduction
Compressed sensing (CS) has been a research focus in recent years [1].Given the support of an unknown K-sparse vector x ∈ R N as S = {j:j ∈ {1, . . ., N}, x j = 0}, we consider an under-determined noisy linear system y = Ax + n (1) where y ∈ R M is a compressed sampling observation, n ∈ R M is an additive Gaussian noise, A ∈ R M×N is the sensing matrix.The object is to reconstruct x from y and A. There are many reconstruction algorithms to solve the CS problem.The message passing based methods, including approximate message passing (AMP) algorithm [2] and generalized approximate message passing (GAMP) algorithm [3], have remained the most efficient ever since they were proposed.They work very well in the zero-mean Gaussian sensing matrix case, but become unstable and divergent in the more general sensing matrix case.For example, although an improvement to the non-zero mean Gaussian sensing matrix case has in a sense been proposed [4], the highly columns-correlated non-Gaussian sensing matrix case is still a problem.We propose a new GAMP-like algorithm, termed Bernoulli-Gaussian Pursuit GAMP (BGP-GAMP), to raise the robustness of standard GAMP algorithm.
Our algorithm firstly utilizes the marginal posterior probability of x to sequentially find the support S, and then estimates the amplitudes on S by a tiny revised GAMP algorithm, termed Fixed Support GAMP (FS-GAMP) which has been proposed in our previous work [5].A recent work has been proposed by Rodger [6], which is similar to our work, but gives another viewpoint from a neural network statistical model.The numerical experiments verify our idea and show a good performance and robustness.The rest of this paper is organized as follows.Firstly, we review the standard GAMP, especially Bernoulli-Gaussian prior GAMP, algorithm in Section 2. Secondly, we propose the FsGAMP and BGP-GAMP algorithms in Sections 3 and 4, respectively.Thirdly, experiments are conducted in Section 5. Finally, we draw a conclusion in the last section.

GAMP Algorithm Review
The GAMP algorithm can be classified to minimum mean error estimation form (MMSE-GAMP) and maximum posterior estimation form (MAP-GAMP) [3].We restrict our attention to MMSE-GAMP, since the problems in the real world are noisy, such that the posterior function has a very complex shape with details which depend on the observation y.However, MMSE is robust to such fluctuations.
Firstly, GAMP algorithm approximates the true marginal posterior p(z m |y) by the output channel marginal posterior where z m is the estimation of (Ax) m condition on the observation y m , and p y m |z m (y m |z) is the likelihood of the output channel y m = z m + w m .It is worth noting that w m means a noise, a random variable, and it corresponds to the estimated random variable z m , instead of the original noise n m in Equation ( 1).Therefore, w m is the residual after estimation of (Ax) m .The iterative algorithm computes the conditional mean and variance of z m .We want to obtain the posterior mean and variance estimates Secondly, GAMP algorithm approximates the true marginal posterior p(x n |y) by the input channel marginal posterior where p x n (x) is the prior of random variable x n in the input channel r n = x n + v n .We want to obtain the posterior mean and variance estimates GAMP algorithm is an intrinsic parallel computing process, since it decouples y m = ∑ N n=1 a mn x n + w m to the input channel r n = x n + v n estimation and the output channel y m = z m + w m estimation.According to the factor graph theory, the messages are probability measures.They can be classified to two types: from the variable node to the factor node, and from the factor node to the variable node.For AMP and GAMP algorithms, all the messages are passed simultaneously between the factor nodes and variable nodes along the edges on the factor graph.These probability measures can be approximated as Gaussian density, based on the derivation of AMP and GAMP algorithms.Therefore, the mean and variance of every entry of x are i.i.d.Gaussian random variable.This intrinsic parallel computing implies a flaw: if A does not satisfy restricted isometry property (RIP) very well, For example A has highly correlated columns, GAMP will diverge.In real world problems, such as radar application, the sensing matrix A may be deterministic, not statistics.Therefore the RIP analysis can not be applied to this situation, researchers use simulation to find the behavior of this kind of reconstruction.We will show it in Section 5.
The steps of GAMP are listed in Algorithm 1, where |A| 2 is a componentwise magnitude squared, the elementwise product and division are denoted as and , respectively, notation [•] means the componentwise form of a vector, e.g., a = [a i ], and notation |•| means the cardinality of a set, or the number of non-zero elements of a sparse vector.

Algorithm 1: Generalized Approximate Message Passing (GAMP).
Input: y, A Description: where λ is the sparsity-rate K/N, and the mean θ and variance φ are hyper-parameters which are set to zero and one respectively at the initiation and those parameters do not change in the algorithm running time.It is called BG-GAMP algorithm.One can obtain the input channel marginal posterior with the normalization factor and (r n , ν r n , λ, θ, φ)-dependent variables where π n is the input channel marginal posterior support probability Pr{x n = 0|r n , ν r n }, according to Equation (10), and then Pr{x n = 0|r n , ν r n } = 1 − π n .Substituting Equation (10) to Equations ( 6) and ( 7), we have The hypothesis of Bernoulli-Gaussian data x is sparse.Sparsity-rate λ has been set to K/N.Although the sparsity K is unknown in practice, we can choose a small K value and an error bound, and then we do the recovery algorithm.If the recovery result makes the final reconstruction error lower than the bound, this result can be accepted, otherwise we decrease the K value and do the recovery again until the bound has been achieved.This procedure is like the Least Angle Regression (LARS) algorithm which tests all the regularization hyper-parameters.

Fixed Support GAMP Algorithm
According to the belief propagation theory, if the number of cycles on a factor graph is low, then the robustness of message passing will be enhanced [8].Therefore, we have an idea that the message passing can be restricted just on the support Λ if Λ has been determined in some way.That is the meaning of FS-GAMP.The steps of FS-GAMP are listed in Algorithm 2. It is very similar to the GAMP algorithm, but the major difference between FS-GAMP and GAMP is that the entries which do not belong to the support Λ are assigned to 0 directly.

Bernoulli-Gaussian Pursuit GAMP Algorithm
The hypothesis of FS-GAMP algorithm is a known support Λ, but we do not know it.Now back to BG-GAMP, Equations ( 15) and (16) use the marginal posterior probability π n to weight the non-zero estimation part (γ n , µ n ) and the zero prior part θ.At each iteration of BG-GAMP, when π n is greater than zero-one switch threshold ζ = 0.5, the non-zero estimation part takes over the major contribution.In this research we suppose that the non-zero prior probability of every entry in x is 0.5.That implies the maximal entropy assumption.Therefore, the zero-one switch threshold is also set to 0.5 in accordance with the maximal entropy assumption.For a matrix which does not meet RIP conditions very well, we observe that the mean and the variance of one π n > ζ entry shows no sign of convergence, while some entries' π j < ζ, j = n already increase fast.When they are greater than ζ, the means and the variances of these entries also show no sign of convergence.This phenomenon spreads to more and more entries, causing all entries to be totally diverged.
This observation inspires us to maintain the current computing status until one π n > ζ entry has converged, and then continue the next entry (or entries).In other words, we introduce a pursuit process of lines ( 4)- (6) in Algorithm 3 to sequentially find the support of x, and then estimate the amplitudes on this support by FS-GAMP.BGP-GAMP means BG-GAMP with an intrinsic pursuit process.The steps of BGP-GAMP are listed in Algorithm 3. Note that the initial input parameters of FS-GAMP in line (7) are the output parameters of GAMP in line (3).
The computational complexity of GAMP part in BGP-GAMP algorithm is the same as the standard GAMP, that is O(MN), at each iteration.The computational complexity of the FS-GAMP part in BGP-GAMP is O(M|Λ|) at each iteration.Because |Λ| ≤ K, the computational complexity of BGP-GAMP at each iteration is less than O(M(N + K)).

Experiments
We investigate the performance of BGP-GAMP by three experiments.Four benchmark algorithms are adopted to compare with BGP-GAMP.Orthogonal matching pursuit (OMP) [9] and basis pursuit (BP) by interior point method to solve linear programming [10] are the most robust 0 and 1 optimization algorithms, other algorithms (old or new) are weaker than them in robustness.AMP and GAMP are not only the root cause of the robust problem, but also the basis of our method.The implementation of OMP comes from (http://www.cmc.edu/pages/faculty/DNeedell),BP comes from l1magic (http://users.ece.gatech.edu/justin/l1magic),AMP comes from Kamilov's work (http://people.epfl.ch/ulugbek.kamilov)and GAMP comes from gamplab (http://eeweb.poly.edu/srangan/index.html).
We do Q = 100 trials in each experiment, and use the relative mean square error (RMSE) x l 2 as the performance metric.Given N/M = 3, M/K = 5, we set N = 256, and then M = 86, K = 18.Choose K locations at random as the support of x, and the amplitude of each non-zero entry x i is independently drawn from N (0, 1).We set the signal-to-noise ratio (SNR) to 25 dB.
The first experiment, termed γ test, investigates non-zero mean Gaussian sensing matrix case.We create two types of sensing matrix: type 1 means A(:, j) = γ/N + a j / √ N; type 2 means A(:, j) = γ/N + a j / a j 2 , where each entry of a j is i.i.d.drawn from N (0, 1).Set the range of γ to [0, 1, 1.8, 1.9, 1.95, 2, 3,4,5,6,7,8,9,10,11,12]. Figure 1a shows that GAMP violently diverges at γ = 2, AMP fast diverges at γ = 5.The severely sharp divergence of GAMP and AMP comes from the phase transition property of compressed sensing.Although BGP-GAMP also begins to diverge at γ = 5, the deterioration speed is slower than AMP.Furthermore, the RMSE of BGP-GAMP is always less than that of AMP.When γ < 4, the RMSE of BGP-GAMP is a little less than that of OMP, and when γ < 6, the RMSE of BGP-GAMP is obviously less than that of BP. Figure 1b shows that AMP violently diverges at γ = 1.95,GAMP violently diverges at γ = 3, but BGP-GAMP just begins to diverge at γ = 8.The RMSE of BGP-GAMP is always less than that of BP, and less than that of OMP until γ = 8.Moreover, Figure 1a,b shows that BGP-GAMP outperforms OMP slightly in performance over a parameter range γ < 2. In summary, our proposed algorithm obtains better robustness compared with the standard AMP/GAMP algorithms, at the same time our proposed algorithm achieves performance advantage over a suitable parameter range compared with the standard OMP and BP algorithms.
The second experiment, termed α test, investigates the highly correlated columns sensing matrix A case.The elements of A are neither normal nor i.i.d distributed.Create two types of sensing matrix: type 1 means A = (1/N)PQ, p mr , q rn ∼ N(0, 1), where P ∈ R M×R , Q ∈ R R×N , and R αN; type 2 means A = PQ like type 1, but normalizes each column of A. We know that A is low rank for α < M/N.Set the range of α to [0.2 0.3 . . .1.0].Since GAMP totally diverges at every α with type 1 and type 2 cases, it is not visible in Figure 2a,b.AMP also totally diverges at every α with type 2 case, therefore it is also not visible in Figure 2b.The two figures show that BGP-GAMP remains stable even in a low rank scenario α ≈ 0.4, and its performance is obviously better than AMP and BP, and a little better than OMP.It is worth noting that the threshold 0.1 in Figures 1 and 2 is the relative mean square error, not absolute mean square error, which means the average amplitude is just 1/100 of the original signal amplitude.These figures show, for instance, that the standard AMP algorithm and BP algorithm are approaching this threshold.Therefore, it is not a poor performance.We do not consider MAP-GAMP since MAP estimation is not general as MMSE estimation, please see [11] Section 3.6.3for more discussion.
The last experiment uses real world data, namely, a segment of music by Mozart.The data has 81,000 sample points, and it is segmented to 54 blocks, each block has N = 1500 sample points.The compressed sampling rate is set to 1/3, therefore, the measurement number is M = 500.We create the sensing matrix by DCT transform: A(i, :) = (1/N)(DCT(Φ(i, :))) T , where every element of Φ is drawn from N (0, 1), and then multiplied a scale factor 1/ √ M to it.Figure 3a shows the modulus of DCT coefficients of the original signal, and Figure 3b shows the modulus of DCT coefficients of the recovered signal.We can see that the major energy has been recovered from the compressed measurements.
Original Audio Signal Signal block

Conclusions
In this paper we propose a new compressed sensing algorithm, the two-stage method.This algorithm pursues the support of a sparse vector sequentially, an meanwhile estimates the amplitudes of the sparse vector by the FS-GAMP process.Numerical experiments with simulation and real world data confirm that our new algorithm performs very well in correctness and robustness.In future work we plan to analyze the variation rate of the marginal posterior probability π to obtain better results.

Figure 3 .
Figure 3. Recover the audio signal by Bernoulli Gaussian Pursuit GAMP.(a) Spectra of the original signal; (b) Spectra of the recovered signal, major signal energy has been restored but lost some detail.