A Compressed Sensing Recovery Algorithm Based on Support Set Selection

The theory of compressed sensing (CS) has shown tremendous potential in many fields, especially in the signal processing area, due to its utility in recovering unknown signals with far lower sampling rates than the Nyquist frequency. In this paper, we present a novel, optimized recovery algorithm named supp-BPDN. The proposed algorithm executes a step of selecting and recording the support set of original signals before using the traditional recovery algorithm mostly used in signal processing called basis pursuit denoising (BPDN). We proved mathematically that even in a noise-affected CS system, the probability of selecting the support set of signals still approaches 1, which means supp-BPDN can maintain good performance in systems in which noise exists. Recovery results are demonstrated to verify the effectiveness and superiority of supp-BPDN. Besides, we set up a photonic-enabled CS system realizing the reconstruction of a two-tone signal with a peak frequency of 350 MHz through a 200 MHz analog-to-digital converter (ADC) and a signal with a peak frequency of 1 GHz by a 500 MHz ADC. Similarly, supp-BPDN showed better reconstruction results than BPDN.


Introduction
Compressed sensing (CS) is a theory stating that an unknown high-dimensional signal with few non-zero components can be recovered from extremely incomplete information, which can reduce data redundancy to a considerable extent. Since CS theory was conceived [1,2], it has attracted great attention all over the world for its applicability to many fields, such as signal processing, radar systems [3], imaging [4,5], optical microscopy [6], direction of arrival (DOA) [7], and cognitive radio (CR) [8]. In the signal processing area in particular, the rapid development of communication technologies, the utilization of high-speed computers, and high-frequency or ultrawide-band signals have caused explosive growth of data, along with serious challenges for digital devices such as the analog-to-digital converter (ADC) governed by the Nyquist sampling theory [9,10], which requires the sampling rate to be at least twice the highest frequency of the signal. Nevertheless, CS systems can greatly alleviate the above difficulties.
CS systems are composed of two parts: measurement and reconstruction. In the former, signals are compressively and randomly measured by the measurement matrix (MM), which in communication systems usually consists of groups of pseudo-random bit sequences (PRBSs), by mixing the signal with the PRBS using mixers or modulators. In other words, high-dimensional signals are projected randomly and linearly onto a much lower-dimensional space, making it possible to reduce the required sampling rate. Much work has been done in this area during the past decades, with great progress. In the very beginning, a CS system called the random demodulator (RD) was implemented electronically [11,12]; in the RD, a microwave signal is directly mixed with a PRBS in the electrical domain.

In the following section, a CS system is proposed and simulation results are presented, verifying the effectiveness of supp-BPDN. Conclusions are drawn in the final section.

CS Theory
Consider an unknown vector x ∈ R^N of length N containing only k non-zero elements, with k ≪ N. The index set of these non-zero elements is defined as the support set supp(x):

supp(x) = {i ∈ [N] : x_i ≠ 0},    (1)

where [N] = {1, 2, ..., N} is the index set of all elements in x; in other words, [N] contains all the subscripts of elements in x. Equation (1) means the other (N − k) components are exactly zero. This is the strict definition of "sparsity", which is the precondition of CS theory. However, few signals totally meet the requirement that the other components are all zero. To expand the application range of CS, the condition can fortunately be relaxed; x is then called "compressible", and the remaining (N − k) components of a compressible signal are very small values, not strictly equal to zero. The sparsity or compressibility of a signal is the premise of CS. Under such conditions, CS states that

y = Φx,    (2)

in which y ∈ R^M is the measurement vector, composed of the measurements, and Φ ∈ R^{M×N} is the MM, with k < M ≪ N. The MM must obey the restricted isometry property (RIP) [36], which defines the restricted isometry constant (RIC) as the smallest value δ_k that makes

(1 − δ_k)‖x‖₂² ≤ ‖Φx‖₂² ≤ (1 + δ_k)‖x‖₂²

hold for all k-sparse signals. The RIP guarantees that the mapping from x to y is unique and determined; i.e., different k-sparse signals cannot be mapped to the same y through a Φ which obeys the RIP.
The process of signal reconstruction is actually solving Equation (2), and the original unknown signal can be recovered by reconstruction algorithms. Besides, it is worth noticing that as long as the signal is sparse in some domain, such as the frequency domain or the wavelet domain, the signal can be regarded as sparse and recovered as well. In a mathematical expression, it is

s = Σ_{i=1}^{N} x_i ψ_i,    (3)

or, in a matrix expression,

s = Ψx,    (4)

among which Ψ ∈ R^{N×N} is the orthogonal transformation basis, with columns ψ_i, and x is still the sparse vector. It is known that when M is not very small, Θ = ΦΨ also obeys the RIP with high probability according to the universality of CS [37,38], making the application range of CS much wider as well. Therefore, although s is not sparse, it can be recovered in a CS system. The case in which Ψ stands for the Fourier transform matrix is especially suitable for communication systems.
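The measurement model of Equations (2) and (4) can be sketched in a few lines of numpy. The dimensions N = 256 and M = 64 match the simulations reported later in the paper; the random 0/1 matrix plays the role of the PRBS-based MM, and the amplitudes are arbitrary illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, k = 256, 64, 3

# k-sparse coefficient vector x (e.g. in the frequency domain)
x = np.zeros(N)
support = rng.choice(N, size=k, replace=False)
x[support] = rng.uniform(1.0, 2.0, size=k)

# Measurement matrix Phi of 0/1 entries, mimicking M groups of PRBS
Phi = rng.integers(0, 2, size=(M, N)).astype(float)

# Compressive measurement, Equation (2): y = Phi x
y = Phi @ x
```

With only M = 64 linear measurements of an N = 256-dimensional vector, the system y = Φx is underdetermined; it is the sparsity of x that makes recovery possible.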

Problem Statement
We now outline signal reconstruction in communication systems under CS theory. The original unknown signal, denoted as s, is the sum of different cosine signals, and it is obviously sparse in the frequency domain. Taking into account the noise in actual systems, such as the shot noise and thermal noise in the photodetector (PD) and the quantization noise in the ADC, from Equations (2) and (4) the model of a CS-enabled communication system is

y_noise = Φs + n = ΦΨx + n,    (5)

where n ∈ R^M is additive noise. In this specific situation, the MM is usually the matrix composed of M groups of PRBSs consisting of 0s and 1s, denoted r_1 through r_M. Hence, the detailed expression of the MM is

Φ = (r_1, r_2, ..., r_M)^T = (ϕ_1, ϕ_2, ..., ϕ_N),    (6)

in which ϕ_i = (r_1i, r_2i, ..., r_Mi)^T is the i-th column vector of Φ.

Convex Optimization Recovery Algorithm
Under the condition that y_noise and Φ are known, one can intuitively turn Equation (2) into an optimization problem:

min ‖x‖₀  s.t.  y = Φx.    (7)

However, the model established by the ℓ₀-norm is not differentiable. In order to solve Equation (7), enumerating all the possibilities of x is necessary, up to C_N^k kinds. Therefore, it is unrealistic to directly solve such a non-deterministic polynomial-hard (NP-hard) problem [39]. Fortunately, because the signal is sparse, Equation (7) can be approximated by

min ‖x‖₁  s.t.  y = Φx,    (8)

which is a convex problem; therefore, Equation (8) can be solved directly by the basis pursuit (BP) algorithm.
More practically, the BP algorithm should be adapted to the noise-affected case, as mentioned in Equation (5). Equation (8) then evolves into the BPDN algorithm, which demands

min (1/2)‖y_noise − Φx‖₂² + λ‖x‖₁,    (9)

where λ is a parameter selected to make the results optimal or near-optimal, λ = σ√(2 log N), and σ > 0 is the noise level [25,40].
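As a rough illustration of the BPDN objective, the problem can be minimized with iterative soft thresholding (ISTA); this is a stand-in for the CVX/interior-point solvers actually used in the paper, with arbitrary problem sizes:

```python
import numpy as np

def bpdn_ista(Phi, y, lam, n_iter=500):
    """Minimize 0.5*||y - Phi x||_2^2 + lam*||x||_1 by iterative soft
    thresholding (ISTA) -- a simple stand-in for the CVX solver the
    paper uses for the BPDN problem."""
    L = np.linalg.norm(Phi, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)       # gradient of the quadratic term
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x
```

With step size 1/L, each ISTA iteration is guaranteed not to increase the objective, so the returned x is at least as good as the all-zero starting point.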

Analysis of Ideal System
BPDN is an overly pessimistic algorithm, and its recovery errors are often caused by clutter frequencies. In this paper we address this problem by finding the support set of the sparse signal x, supp(x), and recovering only the information that corresponds to supp(x).
To obtain the exact supp(x), we start by analyzing the ideal CS system. Normally each row vector of Φ is a different, random PRBS denoted as r, whose length is the same as that of the measured signal, and each column of Φ is denoted as ϕ, as Equation (6) shows. Therefore, from the expression

y = Φx = Σ_{i∈supp(x)} x_i ϕ_i,    (11)

one can see that y relates only to x_i and ϕ_i with i ∈ supp(x); i.e., only the column vectors of Φ corresponding to supp(x) make contributions to y. Therefore, by solving

argmax_i |⟨y, ϕ_i⟩|,    (12)

supp(x) can be inferred. Here ⟨·,·⟩ denotes the inner product, and the inner products are denoted as p in the following parts.
Suppose that x is a 1-sparse signal with supp(x) = {1}; i.e., x_1 is the element which carries effective information, and the remaining elements x_j = 0 (j ≠ 1) or can be regarded as zero. Under this circumstance, the measurement vector y is

y = x_1 ϕ_1.    (13)

By substituting Equation (13), with ϕ_1 and ϕ_j (j ≠ 1), respectively, into Equation (12), one gets the inner products

⟨y, ϕ_1⟩ = Σ_{i=1}^{M} r_i1² x_1    (14)

and

⟨y, ϕ_j⟩ = Σ_{i=1}^{M} r_i1 r_ij x_1.    (15)

As Φ is composed of 0s and 1s due to the PRBS, and x_1 is a positive number, we actually have

p_1 = |⟨y, ϕ_1⟩| = Σ_{i=1}^{M} r_i1² x_1    (16)

and

p_j = |⟨y, ϕ_j⟩| = Σ_{i=1}^{M} r_i1 r_ij x_1.    (17)

Besides, on account of the PRBS characteristics, r_i1 and r_ij each hold the probability 1/2 of being 1 or 0, which is

P(r_i1 = 1) = P(r_i1 = 0) = P(r_ij = 1) = P(r_ij = 0) = 1/2,    (18)

where P(·) means probability. Therefore, there is

P(r_i1² = 1) = 1/2,  P(r_i1 r_ij = 1) = 1/4.    (19)

Substituting Equation (19) into Equations (16) and (17), for one single term in the inner products, gives the following probabilities:

P(r_i1² x_1 = x_1) = 1/2,  P(r_i1² x_1 = 0) = 1/2    (20)

and

P(r_i1 r_ij x_1 = x_1) = 1/4,  P(r_i1 r_ij x_1 = 0) = 3/4.    (21)

It can be seen from the above formulae that for each term of Equations (16) and (17), the relationship

r_i1² x_1 ≥ r_i1 r_ij x_1    (22)

always holds. Therefore, the summations of Equation (22), which are

p_1 = Σ_{i=1}^{M} r_i1² x_1 ≥ Σ_{i=1}^{M} r_i1 r_ij x_1 = p_j,    (23)

hold as well. It means that in this example, p_1 is the maximum of all the inner products, and the subscript of p_1 is at least one of the elements in supp(x). Therefore, finding the support set can be equivalently transformed into finding the position of the maximum of all the inner products. Furthermore, p_1 = p_j requires all M terms to hold Equation (22) with equality, each of which happens with probability 3/4; so for the sum of all M terms there is

P(p_1 = p_j) = (3/4)^M,    (24)

and because x_1 ≠ 0, even a single strictly larger term makes p_1 > p_j, leading to

P(p_1 > p_j) = 1 − (3/4)^M.    (25)

Equation (25) is very useful for the analysis of the following noise-affected CS system.
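The argument above is easy to check numerically. In the numpy sketch below, index 0 plays the role of the paper's x_1 (the amplitude 1.7 is arbitrary); since each term r_i1² x_1 dominates r_i1 r_ij x_1, the largest inner product always lands on the support:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 64, 256
Phi = rng.integers(0, 2, size=(M, N)).astype(float)

# 1-sparse signal: the single active entry sits at index 0
x = np.zeros(N)
x[0] = 1.7
y = Phi @ x                       # Equation (13): y = x_1 * phi_1

# Equation (12): inner products of y with every column of Phi
p = np.abs(Phi.T @ y)
print(int(np.argmax(p)))          # prints 0: the maximum sits on supp(x)
```

Because p_1 ≥ p_j holds deterministically in the noiseless 0/1 case (Equation (23)), this check succeeds for any random seed.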

Analysis of a Noise-Affected System
Similarly, we analyze the noise-affected CS system. The noise model is n = (n_1, n_2, ..., n_M)^T, considered completely random. Likewise, with regard to the sparse vector x, we suppose that x_1 carries the information while the other elements are equal to zero. Therefore, under this circumstance, Equation (11) becomes

y_noise = x_1 ϕ_1 + n.

Additionally, the inner products are

p_1 = |⟨y_noise, ϕ_1⟩| = |Σ_{i=1}^{M} r_i1² x_1 + Σ_{i=1}^{M} r_i1 n_i|    (29)

and

p_j = |⟨y_noise, ϕ_j⟩| = |Σ_{i=1}^{M} r_i1 r_ij x_1 + Σ_{i=1}^{M} r_ij n_i|.    (30)

Due to the influence of noise, finding the support set of the signal is no longer a certain event; the subscript of the maximum inner product is no longer precisely equivalent to supp(x). Therefore, we analyze the probability that the equivalent transformation holds.
According to the random characteristics of the PRBS, in the case of an extremely large sample size, |ϕ_1| should be equal to |ϕ_j|, because ϕ_1 and ϕ_j have the same number of 1s and 0s:

‖ϕ_1‖₂ = ‖ϕ_j‖₂.    (31)

Substituting Equation (31) into Equations (29) and (30), one can see that p_1 and p_j have the same upper bound, which is

p_1, p_j ≤ ‖ϕ_1‖₂ ‖y_noise‖₂.    (32)

From the triangle inequality |a + b| ≤ |a| + |b|, there are

p_1 ≤ |Σ_{i=1}^{M} r_i1² x_1| + |Σ_{i=1}^{M} r_i1 n_i|    (33)

and

p_j ≤ |Σ_{i=1}^{M} r_i1 r_ij x_1| + |Σ_{i=1}^{M} r_ij n_i|.    (34)

In the preceding part of the text, the large probability of the following was proven:

|Σ_{i=1}^{M} r_i1² x_1| ≥ |Σ_{i=1}^{M} r_i1 r_ij x_1|.    (35)

Meanwhile, for an extremely large sample size,

|Σ_{i=1}^{M} r_i1 n_i| ≈ |Σ_{i=1}^{M} r_ij n_i|.    (36)

On the basis that the magnitude of the noise power is less than that of the original signal power, and the fact that r_ij = 1 or 0,

|Σ_{i=1}^{M} r_ij n_i| < |Σ_{i=1}^{M} r_i1² x_1|.    (37)

By combining Equations (32), (36), and (37), one can figure out a more accurate interval for p_1 and p_j. According to the central-limit theorem, p_1 and p_j both have approximately Gaussian distributions on their intervals. For convenience, we denote their upper bounds as a_max and b_max, respectively. On account of |Σ_{i=1}^{M} r_i1² x_1| ≥ |Σ_{i=1}^{M} r_i1 r_ij x_1|, one can regard the relationship of a_max and b_max as

a_max ≥ b_max    (39)

as well.
On the basis of the determined intervals and distributions, one can analyze the probability of p_1 ≥ p_j. P(p_1 ≥ p_j) is actually the probability of satisfying p_1 ≥ p_j over the universal set Ω, which is

P(p_1 ≥ p_j) = P({ω ∈ Ω : p_1(ω) ≥ p_j(ω)}).    (40)

In accordance with the conditional probability formula P(AB) = P(A|B)P(B), each term is

P(p_1 ≥ p_j, p_j = b) = P(p_1 ≥ b | p_j = b) P(p_j = b).    (41)

Therefore, the probability that p_1 ≥ p_j is

P(p_1 ≥ p_j) = ∫_0^{b_max} P(p_1 ≥ b) f(b) db,    (42)

where p_1 and p_j are both approximately subject to Gaussian distributions; i.e.,

p_1 ∼ N(μ_1, σ_1²),  p_j ∼ N(μ_j, σ_j²).    (43)

Accordingly, for p_1 and p_j, the probability density functions are denoted as g(a) and f(b), and the distribution functions are written as G(a) and F(b), respectively.
From the total probability formula on the continuous interval, Equation (42) can be derived as

P(p_1 ≥ p_j) = ∫_0^{b_max} (1 − G(b)) f(b) db.    (44)

Since a_max ≥ b_max and p_1 concentrates near its upper bound, (1 − G(b)) can be approximated to 1 over the interval, leading Equation (44) to

P(p_1 ≥ p_j) ≈ ∫_0^{b_max} f(b) db = 1.    (45)

Through the above analyses, for a 1-sparse signal, the support set corresponds to the maximum element of p. Similarly, for a k-sparse signal, the support set contains the top k maximum values.
In conclusion, even in a CS system affected by noise, acquiring supp(x) via the maximum values of the inner products is a high-probability event. Furthermore, the accuracy can be improved by increasing the number of iterations appropriately.
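A small Monte Carlo experiment (plain numpy; the noise scale 0.1 is an arbitrary value keeping the noise power below the signal power, as assumed above) illustrates this conclusion: even with additive noise, the maximum inner product finds the support in almost every trial:

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, trials = 64, 256, 200
hits = 0
for _ in range(trials):
    Phi = rng.integers(0, 2, size=(M, N)).astype(float)
    x = np.zeros(N)
    x[0] = 1.0                              # supp(x) = {0}
    noise = 0.1 * rng.standard_normal(M)    # noise power well below signal power
    y_noise = Phi @ x + noise               # Equation (5) for a 1-sparse x
    p = np.abs(Phi.T @ y_noise)
    hits += int(np.argmax(p) == 0)          # did the max inner product find the support?
rate = hits / trials
```

The empirical success rate is essentially 1, in agreement with Equation (45).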

Algorithm Procedure
The procedures of the proposed algorithm are detailed in Algorithm 1. Before the process starts, the input variables Φ, y_noise, and the number of iterations t are required, where Φ = [ϕ_1, ϕ_2, ..., ϕ_N]. Besides, an initial zero vector denoted as ξ is required. ξ is an "assistant vector" used in the supp-BPDN algorithm, which connects the estimated support set with the recovered signal information, and ξ = O ∈ R^{N×1}, where O is the zero vector. The specific procedures are as follows. To begin with, calculate the inner products of y_noise and all the initial ϕ, finding the column vector ϕ_z^{(t)} corresponding to the maximum inner product, where z is the subscript (position) of the column vector and (t) denotes values in the t-th iteration. By means of this step one gets that z^{(t)} belongs to the support set of x.
After that, set ϕ_z^{(t)} = O and put it back into Φ. Then repeat the above steps t times to get t values, forming the estimated support set, written as ŝupp(x). Therefore, with a probability close to 1, supp(x) ⊂ ŝupp(x) holds. Next, set the elements of ξ corresponding to ŝupp(x) to 1, leaving the rest 0. In the following step, solve the convex optimization problem of Equation (9) to obtain x*. Lastly, calculate x̂ = x* ∘ ξ to reconstruct the original sparse signal x̂ in the frequency domain, where ∘ is the Hadamard product. If the time domain of the original signal is demanded, calculate ŝ = F^{-1}{x̂}.
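The steps above can be sketched in numpy as follows. The support selection follows Algorithm 1, while a least-squares fit restricted to the estimated support stands in for the CVX solution of the BPDN problem (9), so this is an illustration of the structure rather than the exact implementation:

```python
import numpy as np

def supp_bpdn_sketch(Phi, y_noise, t):
    """Sketch of Algorithm 1.  Steps 1-3 select the estimated support via
    inner products; the paper then solves BPDN (Equation (9)) with CVX --
    here a least-squares solve on the selected columns is a stand-in."""
    N = Phi.shape[1]
    Phi_work = Phi.copy()
    est_supp = []
    for _ in range(t):
        z = int(np.argmax(np.abs(Phi_work.T @ y_noise)))  # max inner product
        est_supp.append(z)
        Phi_work[:, z] = 0.0              # set phi_z = O and put it back
    xi = np.zeros(N)                      # assistant vector xi
    xi[est_supp] = 1.0
    # Stand-in for the convex step: fit only the selected columns
    coef, *_ = np.linalg.lstsq(Phi[:, est_supp], y_noise, rcond=None)
    x_star = np.zeros(N)
    x_star[est_supp] = coef
    return x_star * xi                    # Hadamard product: x_hat = x* ∘ xi
```

Because a picked column is zeroed before the next iteration, the t selected indexes are distinct, and the final masking by ξ discards everything outside ŝupp(x).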

Output: Estimated values x̂
For further detail, the number of iterations t is a parameter which deserves attention. For every circulation of steps 1 to 3 in Algorithm 1, one element of ŝupp(x) will be generated. Thus the fundamental relationship k ≤ t should be satisfied; otherwise, there will be a loss of effective information of x. However, when we consider a CS system affected by noise, from Equations (25) and (45), the probability of selecting the support set exactly approaches 1 and is related to M. Therefore, t is usually an empirical parameter, and there is a trade-off between accuracy and reconstruction performance. When t is a large number, redundant information will remain, leading to an increase in reconstruction error. However, if t is too small, perhaps the support set cannot be found completely. Generally we choose t = 3k to make sure all the elements in the support set are selected.
In order not to be confused with greedy algorithms, which also aim to find the support set of sparse vectors, the supp-BPDN algorithm is analyzed further here. The main distinction between supp-BPDN and the greedy algorithms is the updating of the measurements y, which leads to different constructions of the support set. In the procedure of MP, residuals are calculated during evaluation by means of matrix multiplications involving Φ^T, and this update focuses on the chosen column of Φ, so MP may choose the same column repeatedly [37]. In the procedure of OMP, residuals are updated by solving a least-squares problem so as to be orthogonal to all previously selected elements [28]. In supp-BPDN, as the mathematical analysis shows, it is not necessary to update and recalculate values at every iteration. Besides, ℓ₁ recovery algorithms have the advantage of high accuracy, especially when handling large signals, and supp-BPDN inherits this merit.

Numerical Simulations
In this section, a series of numerical simulations performed via the CVX toolbox in Matlab are reported. In the simulations, N = 256, the number of measurements M = 64, and the original signal s was the sum of three cosines of different frequencies,

s(t) = a cos(2π f_1 t) + b cos(2π f_2 t) + c cos(2π f_3 t),

where f_1 = 50 MHz, f_2 = 150 MHz, f_3 = 350 MHz, and a, b, and c are constants, leading to sparsity k = 6 in the frequency domain. Meanwhile, as the original signals are measured by PRBSs in communication systems, we adopted a matrix in which 0s and 1s are randomly generated and multiplied it with the Fourier transform matrix Ψ as the MM. The first simulation compared the single recovery results of the BPDN and supp-BPDN algorithms, as shown in Figure 1. Noise was present in this situation, and the signal-to-noise ratio (SNR) of the CS system was set to 5 dB; the parameters used in the reconstruction algorithms were λ = 0.05√(2 log N) and t = 3k, and the other parameters were the same as the above settings. It can be seen in Figure 1a that although the frequencies of the original signal were recovered accurately, some wrong information was recovered as well, which resulted in recovery errors. In Figure 1b, however, it is obvious that the result recovered by supp-BPDN is much "cleaner" outside the range of supp(x).

The second simulation examined the signal recovery performance of the BP, BPDN, MP, OMP, and supp-BPDN algorithms as the system signal-to-noise ratio (SNR) changed. The quality of the reconstruction results was measured by the mean square error (MSE):

MSE = (1/N) Σ_{i=1}^{N} (ŝ_i − s_i)²,

where ŝ is the reconstructed signal and s is the original signal.
In the simulation, the SNR of the system increased from −10 to 40 dB with a step size of 1 dB. At each SNR, the algorithms were executed 100 times, and we obtained the recovery errors and averaged them. The results are shown in Figure 2; the performance of supp-BPDN was better than those of the BPDN and BP algorithms over the entire SNR range. The curves of supp-BPDN and BPDN exhibit the same general tendency; in the range of −5 to 25 dB, the recovery errors of BP are less than those of BPDN, caused by the overly pessimistic characteristic of BPDN upon encountering noise mentioned in Section 2.

The last simulation compared the signal recovery performance of the supp-BPDN, BPDN, and OMP algorithms. Distinctively, in this simulation we used a different noise model:

y_noise = Φs + σw,

where w is normalized Gaussian white noise. In this model, λ was no longer a constant but a variable corresponding to the noise level σ, with the mapping relation

λ = σ√(2 log N).

Similarly, we set M = 64 and N = 256, and the number of circulations to 3000. For every noise level, we repeated the algorithms 100 times. The recovery results are shown in Figure 4.
When the noise level ranges from 0.01 to 0.1, the curves of BPDN and OMP possess steeper slopes, while the recovery error of supp-BPDN is steadier.

CS System for Microwave Photonics
In this section, we present the design of a photonic-assisted CS system to verify the feasibility and superiority of the supp-BPDN algorithm in practical situations.

System Setup
The layout of the proposed scheme is shown in Figure 5. Firstly, a continuous-wave (CW) laser generates an optical carrier propagating in single-mode fiber (SMF). The light is then modulated by the sparse-frequency radio frequency (RF) signal s(t) in the first Mach–Zehnder modulator (MZM). After that, the optical signal carrying the information enters the second MZM to be modulated a second time. In this step, known PRBS patterns are encoded onto the optical wave, yielding an optical signal whose RF content is mixed with the PRBS patterns; it then propagates to the photodetector (PD), which transforms it into an electrical signal. Both MZMs operate at the quadrature bias point. After the PD and the amplifier, the signal is integrated over each period by the integrator. Finally, an ADC with a much lower sampling rate than the Nyquist frequency digitizes the measurements y, and recovery algorithms are used to reconstruct the signals. According to the mathematical model of CS theory, two steps are needed to establish a CS system: mixing and integrating. In the scheme shown in Figure 5, the original signal is mixed with the MM via the MZMs in the photonic link, and the integration process is realized in an electrical circuit. It is noteworthy that if the MM is an M × N matrix, M measurements are needed, using M groups of PRBSs. It should be noted that the length of the signal during each measurement period must be the same as that of the PRBS; otherwise, aliasing and measurement confusion will occur, which brings errors to the reconstruction results. There is no need to run the system once for every single measurement; all M measurements can be obtained during one execution.

Results and Discussion
We set up the CS system shown in Figure 5 in OptiSystem and recovered a two-tone signal of 350 and 150 MHz with a 200 MHz ADC, and a two-tone signal of 1 GHz and 600 MHz with a 500 MHz ADC.
In the system, the output optical power of the first quadrature-biased MZM is

P_1(t) = (P_0/2)[1 + sin(π s(t)/V_π)],

where s(t) is the normalized original signal, whose voltage ranges from −1 to 1 V. Additionally, after the different PRBSs modulate the optical wave, the output power of the second MZM is

P_2(t) = (P_1(t)/2)[1 + sin(π r(t)/V_π)],    (50)

in which r(t) is the PRBS. Meanwhile, the modulated waveforms of the MZMs are shown in Figure 6, demonstrating a single period of the signals in the time domain. Figure 6a shows the original signal, and Figure 6b exhibits the original signal encoded by the PRBS being measured. From Equation (50), after the PD and the integrator, the PRBS is integrated over a period as well. Therefore, the data sampled by the ADC are actually the integral values

ŷ_m = α ∫_T r_m(t) s(t) dt,    (51)

where ŷ_m is the m-th measurement in the practical system, r_m(t) is the m-th group of PRBS, and α is a constant coefficient of the CS system, which changes when the parameters change. Figure 7 shows the signals recovered by the BPDN and supp-BPDN algorithms. The time interval of a period was 20 ns, the length of a group of PRBS was 256, the number of measurements was M = 64, the system SNR was 15 dB, and the sampling rate of the ADC was 200 MHz. One can see that the signal recovered by supp-BPDN coincides best with the original signal.

In the next simulation, we changed the frequencies of the signal to 1 GHz and 600 MHz, with an ADC sampling rate of 500 MHz; the recovery results are shown in Figure 8.
Similarly, supp-BPDN demonstrated better performance at recovering signals.

Conclusions
A novel signal recovery algorithm named supp-BPDN is put forward in this paper based on a theoretical analysis of the signal support set. This algorithm executes a step of selecting the signal support set before the traditional recovery algorithm BPDN; in other words, it selects the position indexes corresponding to the top t values of the inner products and only retains the recovered information at those position indexes, which effectively avoids errors from wrong information. In the proposed algorithm, it is not necessary to remove the contributions of local optimal values from the measurements y during each iteration, which simplifies the procedure. By means of numerical simulations, we compared supp-BPDN with traditional algorithms, and supp-BPDN had the best performance. Moreover, drawing on the advantages of microwave photonic technologies, a photonic-enabled CS system was designed to verify the proposed algorithm: a signal with a maximum frequency of 350 MHz could be recovered when digitized at 200 MHz, and a signal with a peak frequency of 1 GHz could be reconstructed when digitized at 500 MHz, breaking through the Nyquist sampling rate limit. In both cases, the signals recovered by supp-BPDN fit the original signals more closely.