The Sufficient Conditions for Orthogonal Matching Pursuit to Exactly Reconstruct Sparse Polynomials

Abstract: Orthogonal matching pursuit (OMP for short) is a classical method for sparse signal recovery in compressed sensing. In this paper, we consider the application of OMP to reconstruct sparse polynomials generated by uniformly bounded orthonormal systems, which extends the existing work on OMP for reconstructing sparse trigonometric polynomials. Firstly, for sampled data both with and without noise, sufficient conditions for OMP to recover the coefficient vector of a sparse polynomial are given, which are weaker than the existing results. Then, based on a more accurate estimate of the mutual coherence of a structured random matrix, the recovery guarantees and success probabilities for OMP to reconstruct sparse polynomials are obtained with the help of those sufficient conditions. In addition, an error estimate for the recovered coefficient vector is given when the sampled data contain noise. Finally, the validity and correctness of the theoretical conclusions are verified by numerical experiments.


Introduction
It is well known that smooth functions have approximately sparse expansions under certain orthogonal systems. Therefore, one of the fundamental research problems in function approximation is the theoretical and algorithmic study of the exact reconstruction of sparse polynomials [1][2][3][4][5][6]. In this paper, we consider polynomials of the form

g(x) = ∑_{j∈Λ} c_j φ_j(x),

where {φ_j(x)}_{j∈Λ} is a set of uniformly bounded orthonormal basis functions defined on Ω ⊂ R^d, and Λ is the index set with |Λ| = n, where |Λ| denotes the number of elements in the set Λ and n can be finite or infinite. If the coefficient vector c = [c_1, ..., c_n]^T ∈ R^n has at most s nonzero elements, where 2 ≤ s ≪ n, we call the polynomial g(x) an s-sparse polynomial, and s is the sparsity of the polynomial g(x) and of the coefficient vector c. Obviously, if the sparse coefficient vector c can be recovered exactly, the sparse polynomial g(x) can be reconstructed. Therefore, we transform the problem of reconstructing the sparse polynomial g(x) into the problem of recovering the sparse coefficient vector c. Only the case where n is finite and d = 1 is studied in this paper, but the results can be generalized to higher dimensions.
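To make the setup concrete, the following is a minimal sketch (not from the paper) of an s-sparse polynomial under the orthonormal Chebyshev system T_0 = 1, T_j(x) = √2 cos(j arccos x), which is orthonormal with respect to the Chebyshev measure on [−1, 1]; the support {2, 5, 9} and the coefficient values are hypothetical choices for illustration.

```python
import math

def cheb(j, x):
    """Orthonormal Chebyshev polynomials w.r.t. the measure 1/(pi*sqrt(1-x^2)) on [-1, 1]."""
    if j == 0:
        return 1.0
    return math.sqrt(2.0) * math.cos(j * math.acos(x))

# hypothetical 3-sparse coefficient vector with n = 10 basis functions
n = 10
coeffs = {2: 1.5, 5: -0.7, 9: 0.3}   # support {2, 5, 9}, sparsity s = 3

def g(x):
    """Evaluate the sparse polynomial g(x) = sum_j c_j * phi_j(x)."""
    return sum(c * cheb(j, x) for j, c in coeffs.items())

print(g(0.25))
```

Recovering the three nonzero entries of `coeffs` from a few samples of `g` is exactly the recovery problem studied below.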
The commonly used recovery method is the interpolation method, which requires that the coefficient vector c of the undetermined interpolation polynomial g̃(x) satisfy the following system of linear equations:

Φc = b,    (1)

where Φ ∈ R^{n×n} is the interpolation matrix generated by the values of the basis functions at the sampling points {x_i}_{i=1}^n, b = [g(x_1), ..., g(x_n)]^T ∈ R^n is the vector of sampled data, and c ∈ R^n is the coefficient vector.
When the number of basis functions n is large, the system (1) is often ill-conditioned and cannot give a good recovery of the coefficient vector c. Moreover, in practical applications, sampling is often expensive. Therefore, determining how to recover the sparse coefficient vector c well from a small number of samples is a key issue in the reconstruction of the sparse polynomial g(x).

Compressed Sensing and the Reconstruction of Sparse Polynomial
In recent years, compressed sensing has developed rapidly [7]. Its main idea is to use nonlinear optimization to recover a sparse signal from as few observations as possible [8]. The original model for sparse signal recovery is

min_c ||c||_0  subject to  Φc = b,    (2)

where ||c||_0 denotes the number of nonzero elements of the vector c, Φ ∈ R^{m×n} is the measurement matrix, and b ∈ R^m is the observation vector. It is not difficult to see that the constraint in the model (2) has the same form as the interpolation conditions (1). Hence, one considers applying compressed sensing to recover the sparse coefficient vector from a small number of samples and then reconstruct the sparse polynomial. Unfortunately, the model (2) is an NP-hard problem. If we know in advance that the sparsity of the signal to be reconstructed is s, we can convert the model (2) into the following ℓ_2-norm model with an inequality constraint:

min_c ||Φc − b||_2  subject to  ||c||_0 ≤ s.    (3)

Greedy algorithms are among the methods most commonly used to solve the model (3). The orthogonal matching pursuit (OMP for short) algorithm is one of the most classical and popular greedy algorithms, known for its efficiency and accuracy [9][10][11][12].

The Recovery Guarantee for OMP to Recover Sparse Signals
The recovery guarantees for OMP to recover sparse signals are the sufficient conditions for OMP to recover sparse signals accurately, which are often given in terms of the restricted isometry constant or the mutual coherence of the measurement matrix Φ. Formal definitions of the terms used in this section are given in Section 2.
The restricted isometry constant [13] (RIC for short) δ_s of the measurement matrix is one of its important characteristic quantities: it is the smallest value in (0, 1) such that

(1 − δ_s) ||c||_2^2 ≤ ||Φc||_2^2 ≤ (1 + δ_s) ||c||_2^2

holds for every s-sparse vector c ∈ R^n. In 2012, Mo and Shen [14] gave a sufficient condition for OMP to accurately recover s-sparse signals within s iterations, namely δ_{s+1} < 1/(√s + 1). Wang and Shim [15] gave the same result in the same year. In 2015, Mo [16,17] and other scholars further sharpened this sufficient condition to δ_{s+1} < 1/√(s+1) and showed that OMP cannot recover every s-sparse signal when δ_{s+1} = 1/√(s+1). Another way to give a recovery guarantee for OMP to recover s-sparse signals is to analyze the selection mechanism of OMP directly and give the minimum number of sampling points. Tropp et al. [18] showed that if the measurement matrix is an admissible measurement matrix, such as a Gaussian or Bernoulli random matrix, then OMP can recover any s-sparse signal when the number of noiseless sampling points satisfies m ∼ s ln n. However, since an admissible measurement matrix requires strong independence among its entries, and the measurement matrices generated in practical applications usually cannot satisfy this independence requirement, the results and the analysis method in [18] are difficult to generalize further.
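For intuition, δ_s can be brute-forced for tiny matrices; for s = 2 the eigenvalues of each 2×2 Gram block are available in closed form. The matrix below is a hypothetical example, not from the paper. Note that for unit-norm columns, δ_2 coincides with the largest absolute inner product between two distinct columns.

```python
import itertools, math

# Hand-made 3x4 matrix with unit-norm columns (illustrative only).
Phi = [
    [1.0, 0.0, 0.6, 1 / math.sqrt(2)],
    [0.0, 1.0, 0.8, 1 / math.sqrt(2)],
    [0.0, 0.0, 0.0, 0.0],
]
m, n = 3, 4

def col(j):
    return [Phi[i][j] for i in range(m)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def delta2():
    """delta_2: max deviation from 1 of the eigenvalues of every 2x2 Gram block."""
    worst = 0.0
    for j, k in itertools.combinations(range(n), 2):
        a, c = dot(col(j), col(j)), dot(col(k), col(k))
        b = dot(col(j), col(k))
        half = math.sqrt((a - c) ** 2 + 4 * b * b) / 2.0
        lam_max, lam_min = (a + c) / 2.0 + half, (a + c) / 2.0 - half
        worst = max(worst, abs(lam_max - 1.0), abs(lam_min - 1.0))
    return worst

print(delta2())
```

Here the worst pair is columns 2 and 3, whose inner product is 1.4/√2 ≈ 0.99, so δ_2 is close to 1 and the RIP nearly fails, even though each individual column is well normalized.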
Other scholars have given sufficient conditions for OMP in terms of the mutual coherence of the measurement matrix. For sampled data without and with noise, the sufficient conditions for OMP to recover s-sparse vectors are shown in Proposition 1 [9] and Proposition 2 [19], respectively.

Proposition 1. If the mutual coherence of the measurement matrix Φ satisfies µ(Φ) < 1/(2s − 1), then OMP can accurately reconstruct arbitrary s-sparse vectors when sampling is free of noise.

Proposition 2.
Let the noise vector ϵ satisfy ||ϵ||_2 ≤ ε, where ε > 0 is the noise bound, and let the mutual coherence of the measurement matrix Φ satisfy µ(Φ) < 1/(2s − 1). Suppose the stopping rule of OMP is ||r^l||_2 ≤ ε and all the nonzero elements of the sparse vector c ∈ R^n satisfy

|c_j| ≥ 2ε / (1 − (2s − 1)µ(Φ)),  j ∈ supp(c),

where r^l denotes the residual vector generated in the lth iteration of OMP and supp(c) denotes the support set of the vector c. Then, OMP can find the exact positions of the nonzero elements of c.
Tropp [9] and Cai et al. [20] showed, by constructing counterexamples, that the sufficient conditions in Proposition 1 and Proposition 2 are sharp. However, in most practical applications, the measurement matrix often has some good properties, such as randomness and near-orthogonality of its columns. Therefore, we hope to relax the sufficient conditions in Proposition 1 and Proposition 2 with the help of those properties, so that OMP can still recover sparse vectors with high probability for sampled data both with and without noise.

The Recovery Guarantee for OMP to Reconstruct Sparse Polynomials
The recovery guarantee for OMP to reconstruct sparse polynomials is often given by the relationship among the number of basis functions n, sparsity s, and the number of sampling points m.
The applications of OMP to the reconstruction of sparse polynomials have mostly focused on reconstructing sparse trigonometric polynomials, owing to the convenient form of trigonometric polynomials. In 2008, Kunis and Rauhut [21] proved that an s-sparse trigonometric polynomial can be reconstructed exactly with high probability by using OMP under random sampling when the number of sampling points m satisfies m ∼ s² ln(n). In 2011, Xu [4] constructed a set of deterministic samples to reconstruct sparse trigonometric polynomials and provided the recovery guarantee for OMP under this deterministic sampling. However, it is difficult to generalize this type of analysis to the reconstruction of general sparse polynomials, and the above studies were all performed in the case of sampled data without noise.
Huang et al. [11] applied OMP to reconstruct general sparse polynomials generated by a uniformly bounded orthonormal system. With the help of the greedy selection ratio and tools from probability theory, they gave the recovery guarantee and success probability for OMP to reconstruct general sparse polynomials. Their results showed that the recovery guarantee in this general setting is also m ∼ s² ln(n). However, their analytical method relies on exact sampled data and thus cannot be extended to the case of sampled data with noise.
In addition, most scholars have addressed the reconstruction of general sparse polynomials by solving the ℓ_1-minimization problem [5,22,23], with the recovery guarantee given through the RIC of the measurement matrix. However, RIC-type recovery guarantees often contain constants that are difficult to estimate, and solving the ℓ_1-minimization problem incurs a higher time cost than OMP when the problem size is large [11].
Therefore, we consider applying OMP to reconstruct general sparse polynomials under both cases of sampled data with and without noise, and hope to give some more specific recovery guarantees and success probability for OMP. Moreover, for the case of sampled data with noise, estimating the reconstruction error is necessary.

Contributions
In this paper, we apply OMP to the reconstruction problem of the sparse polynomial g(x) generated by a uniformly bounded orthonormal system {φ_j(x)}_{j∈Λ}. Here, the measurement matrix Φ ∈ R^{m×n} in (3) is a structured random matrix generated by the values of the system {φ_j(x)}_{j∈Λ} at independent random sampling points {x_i}_{i=1}^m. Firstly, although Feng et al. [10] and Rauhut et al. [13] have estimated the upper bound of the mutual coherence of the structured random matrix Φ, we use tools from probability theory to further sharpen this upper bound, obtaining the estimate (4) below, which holds with probability at least 1 − n^{−p}, where p > 0 is a fixed number. Secondly, combining the selection mechanism of OMP with the condition that the measurement matrix is the structured random matrix Φ, we give more relaxed sufficient conditions for OMP to recover the sparse coefficient vector of the sparse polynomial g(x) for sampled data both with and without noise. In addition, we prove that the ℓ_2-norm of the recovery error can be controlled by the noise bound ε when the sampled data contain noise.
Finally, by (4) and those sufficient conditions, we show that the recovery guarantee for OMP to reconstruct sparse polynomials is m ∼ s² ln n regardless of whether the sampled data contain noise, which is consistent with the recovery guarantee for OMP to reconstruct sparse trigonometric polynomials given in [21] for the case of sampled data without noise.
The rest of this paper is organized as follows: Section 2 introduces some preliminary knowledge required for this paper. Section 3 gives the recovery guarantees for OMP to reconstruct sparse polynomials in both cases of sampled data with and without noise, and gives the error estimation of the recovered coefficient vector by OMP when the sampled data contain noise. Section 4 contains the numerical experiments. Section 5 contains the conclusion.

Preliminaries
In this section, we will introduce some knowledge required for this paper.
When we apply compressed sensing to recover the sparse coefficient vector c of the sparse polynomial g(x), the measurement matrix Φ ∈ R^{m×n} in model (2) is a structured random matrix generated by the values of {φ_j(x)}_{j∈Λ} at the sampling points {x_i}_{i=1}^m, i.e.,

Φ = (φ_j(x_i))_{1≤i≤m, 1≤j≤n},    (5)

where the {x_i}_{i=1}^m are sampled independently according to the probability measure ω(x), and Φ_j, j = 1, 2, ..., n, denotes the jth column of the matrix Φ. The observation vector is b = [g(x_1), ..., g(x_m)]^T ∈ R^m.
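As a concrete instance (a sketch with assumed sizes), the real trigonometric system on [0, 1) with the uniform measure yields the following structured random matrix; the Gram matrix (1/m)Φ^TΦ is close to the identity for large m, as the law of large numbers predicts.

```python
import math, random

random.seed(0)

def trig(j, x):
    """Real trigonometric system on [0, 1): 1, sqrt(2)cos(2*pi*k*x), sqrt(2)sin(2*pi*k*x)."""
    if j == 0:
        return 1.0
    k = (j + 1) // 2
    f = math.cos if j % 2 == 1 else math.sin
    return math.sqrt(2.0) * f(2.0 * math.pi * k * x)

m, n = 2000, 9
pts = [random.random() for _ in range(m)]            # i.i.d. samples from omega = uniform
Phi = [[trig(j, x) for j in range(n)] for x in pts]  # Phi_ij = phi_j(x_i), as in (5)

# two representative Gram entries: off-diagonal near 0, diagonal near 1
gram01 = sum(row[0] * row[1] for row in Phi) / m
gram11 = sum(row[1] * row[1] for row in Phi) / m
print(gram01, gram11)
```

The fluctuations of these Gram entries around 0 and 1 are of order 1/√m, which is exactly what the mutual-coherence estimate of Section 3 quantifies.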

Orthogonal Matching Pursuit Algorithm
Greedy algorithms form an important class of methods for solving the model (3) [10,24], of which the orthogonal matching pursuit (OMP for short) algorithm is one of the most commonly used [9]. As shown in Algorithm 1, in each iteration OMP computes the component of b orthogonal to the span of the currently selected columns of Φ (the 'update' step), and then computes the absolute values of the inner products between this residual and the columns of Φ (the 'match' step). The column that maximizes the absolute inner product is selected (the 'identify' step) [8].
Algorithm 1: Orthogonal matching pursuit. Input: measurement matrix Φ, observation vector b, sparsity s, tolerance ε. Output: recovered vector c*.

In the case of sampled data without noise, based on the selection mechanism of OMP, Tropp et al. [9] gave a sufficient condition for the exact recovery of s-sparse vectors when the columns of the measurement matrix have normalized ℓ_2-norm.

Proposition 3. Let Φ_opt ∈ R^{m×s} be the matrix consisting of the columns of the measurement matrix Φ whose indices lie in the support set of the s-sparse vector c. Then, OMP can recover the s-sparse vector c exactly when

max_j ||Φ_opt^† Ψ_j||_1 < 1,    (7)

where Ψ consists of the columns of the measurement matrix whose indices are not in the support set of c and Ψ_j denotes its jth column. Condition (7) is often referred to as the ERC (exact recovery condition) of OMP.
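The match/identify/update iteration of Algorithm 1 can be sketched as follows. The small identity-plus-coherent-column matrix at the bottom is a hypothetical example, and the least-squares step uses plain normal equations for brevity.

```python
import math

def lstsq(A, y):
    """Solve min ||A x - y||_2 via the normal equations
    (adequate for the small, well-conditioned subproblems OMP produces)."""
    m, k = len(A), len(A[0])
    G = [[sum(A[i][p] * A[i][q] for i in range(m)) for q in range(k)] for p in range(k)]
    rhs = [sum(A[i][p] * y[i] for i in range(m)) for p in range(k)]
    for p in range(k):                       # Gaussian elimination, partial pivoting
        piv = max(range(p, k), key=lambda r: abs(G[r][p]))
        G[p], G[piv], rhs[p], rhs[piv] = G[piv], G[p], rhs[piv], rhs[p]
        for r in range(p + 1, k):
            f = G[r][p] / G[p][p]
            for q in range(p, k):
                G[r][q] -= f * G[p][q]
            rhs[r] -= f * rhs[p]
    x = [0.0] * k
    for p in range(k - 1, -1, -1):
        x[p] = (rhs[p] - sum(G[p][q] * x[q] for q in range(p + 1, k))) / G[p][p]
    return x

def omp(Phi, b, s):
    """Recover an s-sparse coefficient vector from b ~= Phi c."""
    m, n = len(Phi), len(Phi[0])
    support, r = [], b[:]
    for _ in range(s):
        # match: correlate every column with the current residual
        scores = [abs(sum(Phi[i][j] * r[i] for i in range(m))) for j in range(n)]
        # identify: pick the most correlated column
        support.append(max(range(n), key=lambda j: scores[j]))
        # update: orthogonally project b onto the selected columns
        A = [[Phi[i][j] for j in support] for i in range(m)]
        coef = lstsq(A, b)
        r = [b[i] - sum(A[i][t] * coef[t] for t in range(len(support))) for i in range(m)]
    c = [0.0] * n
    for t, j in enumerate(support):
        c[j] = coef[t]
    return c

# hypothetical 5x6 example: five standard basis columns plus one coherent column
r5 = 1 / math.sqrt(5)
Phi = [[1, 0, 0, 0, 0, r5],
       [0, 1, 0, 0, 0, r5],
       [0, 0, 1, 0, 0, r5],
       [0, 0, 0, 1, 0, r5],
       [0, 0, 0, 0, 1, r5]]
b = [0.0, 2.0, 0.0, -1.0, 0.0]   # 2 times column 1 minus column 3 (0-based)
print(omp(Phi, b, 2))
```

On this example OMP selects columns 1 and 3 (0-based) and returns coefficients 2 and -1, since the coherent last column never achieves the largest correlation with the residual.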
Remark 1. It is not difficult to verify that when the columns of the measurement matrix do not have normalized ℓ_2-norm, the ERC still guarantees that OMP can accurately recover the s-sparse vector c when the sampled data do not contain noise.

Mutual Coherence and Cumulative Coherence Function
The mutual coherence and the cumulative coherence function of a matrix are important parameters for measuring the near-orthogonality of its columns. They are defined as follows [13].
Definition 1. The mutual coherence of a matrix Φ ∈ R^{m×n} is defined as

µ(Φ) = max_{1≤j<k≤n} |⟨Φ_j, Φ_k⟩| / (||Φ_j||_2 ||Φ_k||_2),

where Φ_j and Φ_k represent the jth and kth columns of the matrix, respectively.

Definition 2.
For a positive integer s ≤ n, the cumulative coherence function µ_1(s) of the matrix Φ ∈ R^{m×n} is defined as

µ_1(s) = max_{|Λ|=s} max_{j∉Λ} ∑_{λ∈Λ} |⟨Φ_λ, Ψ_j⟩| / (||Φ_λ||_2 ||Ψ_j||_2),

where Λ is an index set with cardinality s, and Φ_λ and Ψ_j denote the columns of Φ whose indices are in and not in Λ, respectively.

Remark 2.
According to the Schwarz inequality, the mutual coherence obviously satisfies 0 ≤ µ(Φ) ≤ 1. The mutual coherence µ(Φ) and the cumulative coherence function µ_1(s) of a matrix have the following relationship [13].

Proposition 4.
Suppose that the mutual coherence of a matrix Φ is µ(Φ). Then,

µ_1(s) ≤ s · µ(Φ)

holds for any natural number s.
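Both quantities, and the inequality of Proposition 4, can be checked by brute force on a small unit-norm example (a hypothetical matrix, for illustration only).

```python
import itertools, math

def mu(Phi):
    """Mutual coherence: largest normalized inner product between distinct columns."""
    m, n = len(Phi), len(Phi[0])
    col = lambda j: [Phi[i][j] for i in range(m)]
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return max(abs(dot(col(j), col(k))) / math.sqrt(dot(col(j), col(j)) * dot(col(k), col(k)))
               for j, k in itertools.combinations(range(n), 2))

def mu1(Phi, s):
    """Cumulative coherence mu_1(s), brute force over index sets (unit-norm columns)."""
    m, n = len(Phi), len(Phi[0])
    col = lambda j: [Phi[i][j] for i in range(m)]
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    best = 0.0
    for Lam in itertools.combinations(range(n), s):
        for j in set(range(n)) - set(Lam):
            best = max(best, sum(abs(dot(col(lam), col(j))) for lam in Lam))
    return best

# small unit-norm example (hypothetical matrix chosen only for illustration)
r2 = 1 / math.sqrt(2)
Phi = [[1, 0, r2, 0],
       [0, 1, r2, 0],
       [0, 0, 0, 1]]
print(mu(Phi), mu1(Phi, 2))   # Proposition 4: mu_1(s) <= s * mu(Phi)
```

Here µ(Φ) = 1/√2 and µ_1(2) = √2 = 2·µ(Φ), so this example attains equality in Proposition 4.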

Bernstein Inequality
The Bernstein inequality is a classical concentration inequality for sums of independent bounded or unbounded random variables, built from variance and higher-order moment estimates. Its form for bounded random variables is as follows.

Proposition 5. Suppose X_1, ..., X_m are independent random variables with zero mean and |X_i| ≤ M for all i. Then, for every t > 0,

P( |∑_{i=1}^m X_i| ≥ t ) ≤ 2 exp( −t² / (2σ² + (2/3)Mt) ),

where σ² = ∑_{i=1}^m E[X_i²]. For other forms of the Bernstein inequality, see [25].
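A quick Monte Carlo sanity check of this tail bound, using m i.i.d. Uniform(−1, 1) variables (so M = 1 and σ² = m/3); the values of m, t, and the trial count are arbitrary illustrative choices.

```python
import math, random

random.seed(1)

# Bernstein bound: P(|X_1+...+X_m| >= t) <= 2 exp(-t^2 / (2*sigma^2 + (2/3)*M*t))
m, M, t, trials = 200, 1.0, 20.0, 20000
sigma2 = m * (1.0 / 3.0)            # Var of Uniform(-1, 1) is 1/3
bound = 2.0 * math.exp(-t * t / (2.0 * sigma2 + (2.0 / 3.0) * M * t))

hits = 0
for _ in range(trials):
    total = sum(random.uniform(-1.0, 1.0) for _ in range(m))
    if abs(total) >= t:
        hits += 1
print(hits / trials, bound)
```

The empirical tail frequency (around 1 to 2 percent for these parameters) sits well below the Bernstein bound of about 0.13, as expected for a worst-case exponential inequality.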

The Recovery Guarantee and Reconstruction Error for OMP to Reconstruct Sparse Polynomials
In this section, we first give a more accurate estimation of the upper bound for the mutual coherence of the structured random matrix. Then, for both cases of sampled data with and without noise, we obtain the sufficient conditions for OMP to recover sparse vectors when the measurement matrix is a structured random matrix. Finally, combining the above results, we derive the recovery guarantees and the success probability for OMP to reconstruct sparse polynomials. Moreover, for the case of sampled data with noise, we also show that the reconstruction error between the recovered and original coefficient vector can be controlled by the noise bound.

The Estimation for the Upper Bound of the Mutual Coherence
In this subsection, we give an upper bound for the mutual coherence of the structured random measurement matrix (5) with the help of the definition of mutual coherence, the law of large numbers, and Proposition 5.

Lemma 1. Suppose that the measurement matrix Φ ∈ R^{m×n} is the structured random matrix (5), where the sampling points {x_i}_{i=1}^m are independently sampled according to the corresponding probability measure ω(x). When m is sufficiently large, the mutual coherence µ(Φ) satisfies the bound (4) with probability at least 1 − n^{−p}, where p > 0 is a fixed number and K > 0 is the uniform upper bound of the basis functions {φ_j(x)}_{j=1}^n.
Proof. By the orthonormality of the basis functions, for any j = 1, 2, ..., n, we have E[φ_j(x_i)²] = 1. Then, by the law of large numbers, when m → ∞, (1/m)||Φ_j||_2^2 → 1, where Φ_j denotes the jth column of Φ; hence ||Φ_j||_2 ≈ √m for large m. For j ≠ k, set X_i = φ_j(x_i)φ_k(x_i), i = 1, ..., m. By the orthonormality of the basis functions, the X_i have zero mean; since the sampling points are independent, the X_i are also independent of each other and satisfy the uniform bound |X_i| ≤ K². Therefore, according to Proposition 5 and Remark 2, for any δ ∈ [0, 1], we obtain a tail bound on P(|∑_{i=1}^m X_i| ≥ mδ). Applying Boole's inequality [13] over all pairs of distinct columns and choosing δ so that the total failure probability is at most n^{−p}, where p > 0 is a fixed number, yields the claimed estimate. In summary, the mutual coherence µ(Φ) of Φ satisfies (4) with probability at least 1 − n^{−p}.
Remark 3. From Definition 1, it is easy to find that the column normalization of the matrix does not affect its mutual coherence and the value of its cumulative coherence function.

The Recovery Guarantee for OMP under the Noiseless Condition
In this subsection, we give a sufficient condition for OMP to exactly recover s-sparse vectors in terms of the mutual coherence µ(Φ) of the measurement matrix (5). Firstly, some assumptions and notations are introduced. Without loss of generality, assume that the first s elements of the original coefficient vector c ∈ R^n are the nonzero elements, and use the sets Λ_s and Λ_{n−s} to denote the support and non-support sets of the vector c, respectively.
Clearly, |Λ_s| = s and |Λ_{n−s}| = n − s. Then, partition the measurement matrix Φ as

Φ = [Φ_opt, Ψ],    (9)

where Φ_opt ∈ R^{m×s} and Ψ ∈ R^{m×(n−s)} are the submatrices consisting of the first s columns and the last (n − s) columns of the measurement matrix Φ, respectively, and Ψ_j, j ∈ Λ_{n−s}, denotes the jth column of the submatrix Ψ.
Theorem 1. Suppose the measurement matrix Φ ∈ R^{m×n} is the structured random matrix given in Lemma 1 and the observation vector is b = [g(x_1), ..., g(x_m)]^T ∈ R^m. Then, when m is sufficiently large and the mutual coherence satisfies µ(Φ) < 1/s, solving model (3) by OMP can exactly recover an arbitrary s-sparse vector c with high probability.

Remark 4.
Since the law of large numbers is used in the proof of Theorem 1, the result in Theorem 1 holds with high probability.
Proof. Starting from the ERC condition in Proposition 3 and expanding the pseudo-inverse by its definition Φ_opt^† = (Φ_opt^T Φ_opt)^{−1} Φ_opt^T, we obtain the bound (10), whose right-hand side consists of two factors. Firstly, consider the first factor on the right-hand side of (10). Since {φ_j(x)}_{j∈Λ} is a uniformly bounded orthonormal system, it follows from the law of large numbers that, as m → ∞, (1/m) Φ_opt^T Φ_opt converges to the identity matrix, which yields the estimate (11). Secondly, we analyze the second factor on the right-hand side of (10). Similarly, by the law of large numbers, when m → ∞, the inner products between the columns of Φ_opt and Ψ concentrate around their expectations. Since |Λ_s| = s, it is then clear from the definition of the cumulative coherence function that the bound (12) holds. In summary, substituting (11) and (12) into (10) shows that the ERC is implied by µ_1(s) < 1. Finally, by the assumption µ(Φ) < 1/s of this theorem and Proposition 4, it follows that µ_1(s) ≤ s · µ(Φ) < 1 holds, and hence the ERC condition holds. Based on Proposition 3, the conclusion of the theorem is proved.
In the case of sampled data without noise, Theorem 1 gives a sufficient condition for OMP to exactly recover s-sparse vectors with high probability in terms of the mutual coherence of the structured random matrix. Then, with the help of Lemma 1 and Theorem 1, the recovery guarantee and success probability for OMP to reconstruct sparse polynomials are given in the following theorem.

Theorem 2. Suppose that g(x) = ∑_{j∈Λ, |Λ|=n} c_j φ_j(x) is an s-sparse polynomial, where {φ_j(x)}_{j∈Λ} is a uniformly bounded orthonormal system defined on Ω with probability measure ω(x) and a uniform upper bound K > 0. Suppose that the measurement matrix Φ ∈ R^{m×n} in the model (3) is the structured random matrix (5) generated by sampling points drawn independently according to ω(x). Then, when the number of sampling points m is sufficiently large, of the order s² ln n, solving model (3) by OMP can exactly recover the s-sparse coefficient vector c, and hence reconstruct the s-sparse polynomial g(x), with probability at least 1 − n^{−p}.

Proof. Based on the assumptions, it is easy to verify that the measurement matrix Φ satisfies the conditions in Lemma 1, so the mutual coherence µ(Φ) satisfies the bound (13) with probability at least 1 − n^{−p}. Then, when the number of sampling points satisfies the condition of the theorem, it can be shown that the right-hand side of (13) is less than 1/s. Therefore, combining Lemma 1 and Theorem 1, solving model (3) by OMP can exactly recover the s-sparse coefficient vector c, and further reconstruct the s-sparse polynomial g(x).

Remark 5.
From Theorem 2, it is easy to see that if the constant p is taken too large, the recovery guarantee m becomes too large and may even exceed the number of basis functions n, so the sparse polynomial g(x) can no longer be reconstructed from a small number of sampling points. However, if the constant p is too small, the lower bound on the theoretical success probability becomes too small to be useful. Therefore, to balance the recovery guarantee against the lower bound on the theoretical success probability, the constant p is usually taken as p ∈ [0.2, 0.4].

The Recovery Guarantee and the Reconstruction Error for OMP under the Noisy Condition
In this subsection, we discuss the recovery guarantee, the success probability, and the reconstruction error for OMP to reconstruct s-sparse polynomials in the case of sampled data with noise. Before that, we first perform column normalization of the measurement matrix Φ, i.e.,

Φ̃_j = Φ_j / ||Φ_j||_2,  j = 1, 2, ..., n.    (14)

Similar to (9), we partition the measurement matrix Φ̃ as Φ̃ = [Φ̃_opt, Ψ̃]. By the law of large numbers, when m → ∞, we have ||Φ_j||_2 ≈ √m, j = 1, 2, ..., n. Thus, as m increases, Φ̃ ≈ (1/√m) Φ. The corresponding observation vector in the noisy case is

b̃ = Φ̃c + ϵ,    (15)

where ϵ = [ϵ_1, ϵ_2, ..., ϵ_m]^T ∈ R^m is the noise vector satisfying ||ϵ||_2 ≤ ε, and ε > 0 is the noise bound. Based on the above analysis, the model (3) can be rewritten as

min_c ||Φ̃c − b̃||_2  subject to  ||c||_0 ≤ s,    (16)

where the measurement matrix Φ̃ ∈ R^{m×n} and the observation vector b̃ ∈ R^m are given in (14) and (15), respectively. Solving model (16) by OMP recovers the sparse coefficient vector c and thereby reconstructs the s-sparse polynomial g(x).

We next analyze OMP. For this purpose, we introduce some notation. Suppose that OMP has selected k indices located in the support set after k iterations, and let Λ_k be the set formed by these indices. Since Λ_k ⊂ Λ_s, we denote Λ_{s−k} = Λ_s \ Λ_k. Let Φ̃^{(k)} ∈ R^{m×k} be the submatrix consisting of the columns of the measurement matrix Φ̃ whose indices lie in Λ_k. Let P_k denote the operator of orthogonal projection onto the space spanned by the columns of Φ̃^{(k)}. Thus, the residual vector of OMP in the kth iteration is r^k = b̃ − P_k b̃ = (I − P_k) b̃. For convenience, let α_k = (I − P_k) Φ̃c and β_k = (I − P_k) ϵ, and introduce the quantities M_{k,1}, M_{k,2}, and N_k from reference [19]. Reference [19] showed that OMP selects an element of the index set Λ_{s−k} in the current iteration if the condition (17) holds. Furthermore, according to the ERC condition (7), when µ(Φ̃) < 1/s, combining Equation (12) and Proposition 4, we obtain another sufficient condition for OMP to select an element of the index set Λ_{s−k} in the current iteration.
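To illustrate the stopping rule ||r||_2 ≤ ε in the noisy case, here is a small self-contained sketch (hypothetical matrix and noise level, not the paper's setup): the candidate dictionary is the 5×5 identity columns plus one coherent extra column, so the orthogonal projection in the update step reduces to inner products with the selected orthonormal columns.

```python
import math, random

random.seed(3)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

m = 5
cols = [[1.0 if i == j else 0.0 for i in range(m)] for j in range(m)]
cols.append([1.0 / math.sqrt(m)] * m)      # one coherent extra column
c_true = {1: 2.0, 3: -1.0}                 # s = 2 sparse coefficients (hypothetical)

eps = 1e-2
noise = [random.uniform(-1, 1) for _ in range(m)]
scale = (eps / 2) / math.sqrt(dot(noise, noise))
noise = [scale * e for e in noise]         # ||noise||_2 = eps/2 <= eps
b = [sum(c * cols[j][i] for j, c in c_true.items()) + noise[i] for i in range(m)]

support, r = [], b[:]
while math.sqrt(dot(r, r)) > eps and len(support) < m:
    # match / identify: most correlated column
    j_best = max(range(len(cols)), key=lambda j: abs(dot(cols[j], r)))
    support.append(j_best)
    # update: projection onto the selected (orthonormal) identity columns
    coef = {j: dot(cols[j], b) for j in support}
    r = [b[i] - sum(coef[j] * cols[j][i] for j in support) for i in range(m)]
print(sorted(support))
```

OMP stops as soon as the residual falls below the noise bound, having selected exactly the true support positions 1 and 3, which is the behavior Theorem 3 below guarantees when the nonzero coefficients are large enough relative to ε.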
We then give the following theorem.
Theorem 3. Suppose that the measurement matrix Φ̃ and the observation vector b̃ are given in (14) and (15), respectively, and the noise vector ϵ satisfies ||ϵ||_2 ≤ ε. Then, when m is sufficiently large and µ(Φ̃) < 1/s, if the nonzero elements of the s-sparse vector c satisfy the magnitude condition (18), then solving model (16) by OMP with the stopping rule ||r||_2 ≤ ε can accurately find the positions of the nonzero elements of the vector c.
Before proving Theorem 3, we first state the following proposition [19], which is important in the proof.

Proposition 6. If µ(Φ̃) < 1/(s − 1), then all the eigenvalues of the matrix (Φ̃^{(s−k)})^T (I − P_k) Φ̃^{(s−k)} are located in the interval [1 − (s − 1)µ(Φ̃), 1 + (s − 1)µ(Φ̃)], where Φ̃^{(s−k)} denotes the submatrix consisting of the columns of the measurement matrix Φ̃ whose indices lie in Λ_{s−k}.
With the help of Proposition 6, we give the proof of Theorem 3.

Proof.
Let c^{(s−k)} be the vector consisting of the elements of c whose indices lie in Λ_{s−k}. By the definition of M_{k,1}, the relationship between the ℓ_∞-norm and the ℓ_2-norm, and the properties of eigenvalues, we can bound M_{k,1} from below in terms of λ_min ||c^{(s−k)}||_2, where λ_min is the smallest eigenvalue of the matrix (Φ̃^{(s−k)})^T (I − P_k) Φ̃^{(s−k)}. According to Proposition 6, λ_min is bounded away from zero. Combining this with (17), if the condition (19) holds, then OMP selects an element of the index set Λ_{s−k} in the current iteration. Furthermore, for any j ∈ Λ_s, according to ||ϵ||_2 ≤ ε and the Schwarz inequality, we have |⟨Φ̃_j, ϵ⟩| ≤ ||Φ̃_j||_2 ||ϵ||_2 ≤ ε; that is, N_k ≤ ε. Thus, (19) holds when |c_j|, j ∈ supp(c), satisfies (18); i.e., OMP selects an index located in Λ_{s−k} in the current iteration.

We next consider the stopping rule ||r^k||_2 ≤ ε. Here, we show that OMP does not stop when k < s under this rule. Recall that r^k is the residual vector at the kth step of OMP. From the triangle inequality and the definition of r^k, we have ||r^k||_2 ≥ ||α_k||_2 − ||β_k||_2. Furthermore, by Proposition 6 and the assumptions of this theorem, it holds that ||r^k||_2 > ε for k < s; that is, when k < s, the ℓ_2-norm of the residual vector r^k does not satisfy the stopping rule, and OMP does not stop at the current iteration.

Remark 6.
Compared with the upper bounds on the mutual coherence required in Proposition 1 and Proposition 2, the requirement µ(Φ) < 1/s in Theorem 1 and Theorem 3 is significantly relaxed, since 1/s > 1/(2s − 1) for s ≥ 2; thus, more matrices satisfy the requirement and can be used as measurement matrices.
Based on the above theorem and in a manner similar to Theorem 2, we can derive the following theorem.

Theorem 4. Suppose that g(x) = ∑_{j∈Λ, |Λ|=n} c_j φ_j(x) is an s-sparse polynomial, where {φ_j(x)}_{j∈Λ} is a uniformly bounded orthonormal system defined on Ω with probability measure ω(x) and a uniform upper bound K > 0. Let Φ̃ ∈ R^{m×n} be the measurement matrix with normalized columns, as shown in (14), generated by the values of {φ_j(x)}_{j∈Λ} at sampling points {x_i}_{i=1}^m drawn independently according to ω(x). Then, when the conditions (20) and (21) on the number of sampling points and the nonzero coefficients hold, solving model (16) by OMP with the stopping rule ||r||_2 ≤ ε can accurately find the positions of the nonzero terms of the sparse polynomial g(x) with probability at least 1 − n^{−p}, where p > 0 is a fixed number. Furthermore, when OMP succeeds, the reconstruction error between the original coefficient vector c and the recovered vector c̃ satisfies

||c − c̃||_2 ≤ C ε,    (22)

where C > 0 is a constant independent of ε.
Proof. The proof of Theorem 4 is divided into two parts: first, we give the recovery guarantee and the success probability for OMP to accurately find the positions of nonzero terms of g(x); second, we estimate the reconstruction error between the original and recovered coefficient vector. The first part of the proof is similar to that of Theorem 2. Since the column normalization of the matrix does not affect the mutual coherence of the matrix, the estimation of the recovery guarantee and the success probability can be derived directly with the help of Lemma 1 and Theorem 3.
The second part is proved below. Without loss of generality, assume that Λ = {1, . . . , n} and Λ s = supp(c) = {1, . . . , s}, i.e., the first s elements of the original coefficient vector c are nonzero elements. The analysis in the first part shows that OMP can exactly find all elements in the set Λ s with probability at least 1 − n −p if the assumptions of this theorem hold.
Decompose the original coefficient vector c and the recovered coefficient vector c̃ as c = [c_1^T, 0^T]^T and c̃ = [c̃_1^T, 0^T]^T, where c_1 and c̃_1 denote the first s elements of the vectors c and c̃, respectively. The nonzero part of the recovered vector is c̃_1 = Φ̃_opt^† b̃, where Φ̃_opt^† denotes the pseudo-inverse of the matrix Φ̃_opt. Since b̃ = Φ̃c + ϵ = Φ̃_opt c_1 + ϵ, we have c̃_1 = c_1 + Φ̃_opt^† ϵ. Finally, let C = ||Φ̃_opt^†||_2; then the reconstruction error of the coefficient vector satisfies

||c − c̃||_2 = ||c_1 − c̃_1||_2 = ||Φ̃_opt^† ϵ||_2 ≤ ||Φ̃_opt^†||_2 ||ϵ||_2 ≤ C ε.

Combining the above two parts, the conclusion of the theorem is proved.

Remark 7. Equation (22) shows that if (20) and (21) hold, then OMP can recover the coefficient vector c exactly with probability at least 1 − n^{−p} when ε = 0.
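The error bound in this part of the proof can be checked numerically. In the sketch below (hypothetical data), Φ̃_opt has orthonormal columns, so its pseudo-inverse is its transpose and C = ||Φ̃_opt^†||_2 = 1; the measured error stays below C·ε.

```python
import math, random

random.seed(7)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# When OMP finds the true support: c1_hat = pinv(Phi_opt) b = c1 + pinv(Phi_opt) eps,
# hence ||c - c_hat||_2 <= ||pinv(Phi_opt)||_2 * ||eps||_2 = C * eps.
m, eps = 6, 1e-3
u1 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]       # two orthonormal support columns
u2 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
c1 = [2.0, -0.5]                           # true nonzero coefficients (hypothetical)

noise = [random.uniform(-1, 1) for _ in range(m)]
scale = eps / math.sqrt(dot(noise, noise))
noise = [scale * v for v in noise]         # ||noise||_2 = eps exactly

b = [c1[0] * u1[i] + c1[1] * u2[i] + noise[i] for i in range(m)]
c1_hat = [dot(u1, b), dot(u2, b)]          # pinv = transpose for orthonormal columns
err = math.sqrt(sum((a - h) ** 2 for a, h in zip(c1, c1_hat)))
print(err, eps)
```

Only the noise components lying in the span of the support columns enter the error, which is why the bound C·ε can be quite pessimistic in practice, consistent with the experiments in Section 4.3.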

Remark 8.
Through a similar analysis to Remark 5, the constant p is still taken as p ∈ [0.2, 0.4] when the sampled data contain noise.

Remark 9.
The conclusions of Theorem 2 and Theorem 4 show that, regardless of whether the sampled data contain noise, the recovery guarantee for OMP to reconstruct sparse polynomials generated by uniformly bounded orthonormal systems is consistent with that for OMP to reconstruct sparse trigonometric polynomials.

Numerical Experiments
In this section, we first introduce three commonly used uniformly bounded orthonormal systems, and then apply OMP to reconstruct sparse polynomials generated by these three systems. The first set of experiments verifies the validity of the recovery guarantees and the lower bounds on the success probability given in Theorem 2 and Theorem 4. The second set of experiments verifies the accuracy of the estimate (22) of the reconstruction error between the recovered and original vectors. The last set of experiments shows that even if the coefficient vector is subject to a small perturbation, OMP can still recover it well.

Commonly Used Uniformly Bounded Orthonormal Systems
In this subsection, we introduce three commonly used uniformly bounded orthonormal systems: the preconditioned Legendre polynomial system, the Chebyshev polynomial system, and the trigonometric polynomial system.

Preconditioned Legendre polynomial system: The standard univariate Legendre polynomials {L_j(x)}_{j∈N_0} [26] are orthonormal with respect to the uniform measure ω(x) = 1/2 on [−1, 1], and their L_∞-norms are ||L_j||_∞ = √(2j + 1). It is obvious that the standard Legendre polynomials are not uniformly bounded on [−1, 1]. Therefore, we consider the following function system [13]:

Q_j(x) = √(π/2) (1 − x²)^{1/4} L_j(x),  j ∈ N_0.

The functions {Q_j(x)}_{j∈N_0} are orthonormal with respect to the Chebyshev measure ω(x) = π^{−1}(1 − x²)^{−1/2} on [−1, 1] and have the uniform upper bound K = √3. The function system {Q_j(x)}_{j∈N_0} is called the preconditioned Legendre polynomial system.
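A quick numerical check of this preconditioning (a sketch): after the substitution x = cos t, the Chebyshev-weighted inner product of Q_j and Q_k reduces to a plain integral over [0, π], which a midpoint rule evaluates accurately.

```python
import math

def P(j, x):
    """Standard Legendre polynomials via the three-term recurrence."""
    p0, p1 = 1.0, x
    if j == 0:
        return p0
    for k in range(1, j):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def Q(j, x):
    """Preconditioned Legendre function: sqrt(pi/2)*(1-x^2)^(1/4)*sqrt(2j+1)*P_j(x)."""
    return math.sqrt(math.pi / 2.0) * (1.0 - x * x) ** 0.25 * math.sqrt(2 * j + 1) * P(j, x)

def inner(j, k, N=20000):
    """(1/pi) * integral over [0, pi] of Q_j(cos t) * Q_k(cos t) dt (midpoint rule),
    which equals the inner product w.r.t. the Chebyshev measure on [-1, 1]."""
    h = math.pi / N
    return sum(Q(j, math.cos((i + 0.5) * h)) * Q(k, math.cos((i + 0.5) * h))
               for i in range(N)) * h / math.pi

print(inner(2, 2), inner(2, 3))
```

The diagonal inner product comes out near 1 and the off-diagonal one near 0, confirming orthonormality under the Chebyshev measure; sampling Q_3 on a grid also stays below the uniform bound √3.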

The Verification of Recovery Guarantee and Success Probability of OMP Algorithm
In this subsection, for the two cases of sampled data with and without noise, we take the uniformly bounded orthonormal systems of Section 4.1 as examples to verify the validity of the recovery guarantees and the lower bounds on the success probability given by Theorem 2 and Theorem 4, respectively. Here, the sparsity is taken as s = 5 and s = 10, the parameter as p = 0.2, 0.3, and 0.4, and the noise bound as ε = 10^{−5}. The main steps of the experiment are as follows:
Step 1: Randomly generate an n-dimensional s-sparse coefficient vector c ∈ R^n with support set Λ_s.
Step 2: Taking the three types of uniformly bounded orthonormal systems mentioned in Section 4.1 as examples, randomly select m sampling points {x_i}_{i=1}^m on the corresponding domain according to the corresponding probability measure.
Step 3: Generate the measurement matrix from the values of the basis functions at the sampling points and compute the observation vector; in the noisy case, add a noise vector ϵ with ||ϵ||_2 ≤ ε.
Step 4: Solve model (3) (or model (16) in the noisy case) by OMP to obtain the recovered coefficient vector and the reconstructed polynomial.
Step 5: Compare the obtained results with the original coefficient vector and polynomial.
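A scaled-down version of this experiment can be sketched as follows (trigonometric system, hypothetical sizes n = 9, s = 2, m = 300, and 20 trials instead of 1000); only the support is tracked, so the projection update uses Gram-Schmidt instead of an explicit least-squares solve.

```python
import math, random

random.seed(11)

def trig(j, x):
    if j == 0:
        return 1.0
    k = (j + 1) // 2
    f = math.cos if j % 2 == 1 else math.sin
    return math.sqrt(2.0) * f(2.0 * math.pi * k * x)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def omp_support(Phi, b, s):
    """Support selected by OMP; the residual is updated via Gram-Schmidt."""
    m, n = len(Phi), len(Phi[0])
    support, basis, r = [], [], b[:]
    for _ in range(s):
        j = max(range(n), key=lambda t: abs(dot([row[t] for row in Phi], r)))
        support.append(j)
        q = [row[j] for row in Phi]
        for u in basis:                       # orthogonalize against chosen columns
            cu = dot(q, u)
            q = [a - cu * v for a, v in zip(q, u)]
        nq = math.sqrt(dot(q, q))
        q = [a / nq for a in q]
        basis.append(q)
        cr = dot(r, q)                        # remove the new component from r
        r = [a - cr * v for a, v in zip(r, q)]
    return set(support)

n, s, m, trials, hits = 9, 2, 300, 20, 0
for _ in range(trials):
    supp = random.sample(range(n), s)         # random support, as in Step 1
    coeffs = {supp[0]: 1.0, supp[1]: -1.0}
    pts = [random.random() for _ in range(m)] # Step 2: random sampling points
    Phi = [[trig(j, x) for j in range(n)] for x in pts]
    b = [sum(c * trig(j, x) for j, c in coeffs.items()) for x in pts]
    hits += omp_support(Phi, b, s) == set(supp)
print(hits / trials)
```

With m this large relative to s² ln n, exact support recovery should succeed in essentially every trial, mirroring the high actual success probabilities reported in Figures 1-6.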
According to Theorem 2 and Theorem 4, we calculate the lower bounds of the number of sampling points m required for the three types of uniformly bounded orthonormal systems with different numbers of basis functions n and parameters p, which are the recovery guarantees for OMP to exactly reconstruct sparse polynomials. The recovery guarantees for the cases of s = 5 and s = 10 are shown in Tables 1 and 2, respectively.
The first column of Tables 1 and 2 gives the number of basis functions n, and the remaining columns give the lower bound on the number of sampling points m required for different constants p at sparsity s = 5 and s = 10 for the three types of uniformly bounded orthonormal systems. From Tables 1 and 2, it is easy to see that the lower bound on the number of sampling points m increases as the number of basis functions n increases, but the growth rate of the former is much smaller than that of the latter; moreover, for the same n and s, the larger the parameter p, the larger the lower bound on m. In addition, for the same number of basis functions n, sparsity s, and parameter p, the recovery guarantee of the trigonometric polynomial system is the smallest.
Next, for the sparse polynomials generated by the three types of uniformly bounded orthonormal systems mentioned before, we take the numbers of basis functions n and sampling points m given in Tables 1 and 2, solve model (3) by OMP in 1000 independent repeated experiments, and record the frequency of exact reconstruction as the actual success probability of OMP. For a given number of basis functions n and parameter p, the theoretical success probability is 1 − n^{−p}. For the two cases of sampled data with and without noise, Figures 1-6 compare the theoretical and actual success probabilities of OMP in exactly reconstructing the three types of sparse polynomials for different numbers of basis functions n, parameters p, and sparsities s.
From Figures 1-6, it is easy to see that the theoretical success probability (red line) of OMP reconstructing the three types of sparse polynomials with different parameters p gradually increases and approaches the actual success probability as the number of basis functions n increases, and the actual success probability is higher than the theoretical one in both the noisy and noiseless cases. For the same number of basis functions n, the theoretical success probability also increases with the parameter p, which illustrates the correctness and validity of the conclusions of Theorem 2 and Theorem 4. In addition, Figures 1-6 show that when the sampled data contain noise of magnitude ε = 10⁻⁵, the success probability of OMP is comparable to that in the noiseless case, which also illustrates the robustness of OMP.
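The theoretical curves in these comparisons are simply 1 − n^{−p}, so the monotone behavior described above can be tabulated directly (the values of n below are illustrative, since the exact n used in the figures are not reproduced here):

```python
# Theoretical success probability 1 - n**(-p) from Theorems 2 and 4.
for n in (128, 256, 512, 1024):       # illustrative values of n
    row = ", ".join(f"p={p:.1f}: {1 - n ** (-p):.4f}" for p in (0.2, 0.3, 0.4))
    print(f"n={n:5d}  {row}")
```

The bound increases both in n (for fixed p) and in p (for fixed n).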

Verification of the Accuracy of the Reconstruction Error Estimation When Sampled Data Contain Noise
In this subsection, for sampled data containing noise, we verify the accuracy of the reconstruction error estimation for the sparse coefficient vector. The numbers of basis functions n and sampling points m and the noise bound are still taken as in Table 1, Table 2, and ε = 10⁻⁵, respectively. The experimental procedure is similar to that in Section 4.2: we again perform 1000 independent repeated experiments, using OMP to reconstruct the three types of sparse polynomials. Different from the experiments in Section 4.2, we randomly generate a noise vector e with ‖e‖₂ ≤ ε and record the values of the constant C and the real noise bound ε̂ = ‖e‖₂ when OMP succeeds. For the two cases of sparsity s = 5 and s = 10, the average reconstruction errors and their upper bounds are shown in Tables 3 and 4 for the parameters p = 0.2, 0.3, and 0.4 when OMP succeeds.
The 'upper bound' columns in Tables 3 and 4 give the estimated upper bound of the reconstruction error, i.e., C · ε̂, and the 'error' columns give the average reconstruction error ‖c − ĉ‖₂ when OMP succeeds. From Tables 3 and 4, it is easy to see that for the three types of sparse polynomials, with different numbers of basis functions n and parameters p, all the upper bounds of the reconstruction errors are larger than the average reconstruction errors, which confirms the conclusion of Theorem 4.
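A minimal sketch of this noisy experiment follows, again with the real trigonometric system on [0, 1] standing in for the three systems and with illustrative sizes; the theorem's constant C is not reproduced, so the sketch only checks that the error is of the order of the noise bound.

```python
import numpy as np

def omp(Phi, b, s):
    """Greedy OMP: select s columns, re-fitting by least squares each round."""
    m, n = Phi.shape
    residual = b.copy()
    support = []
    for _ in range(s):
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], b, rcond=None)
        residual = b - Phi[:, support] @ coef
    c_hat = np.zeros(n)
    c_hat[support] = coef
    return c_hat

rng = np.random.default_rng(1)
n, m, s, eps = 64, 60, 5, 1e-5        # illustrative sizes; eps = noise bound

# s-sparse coefficient vector with entries bounded away from zero
c = np.zeros(n)
supp = rng.choice(n, size=s, replace=False)
c[supp] = rng.choice([-1.0, 1.0], size=s) * rng.uniform(1.0, 2.0, size=s)

# real trigonometric system on [0, 1] at uniform sampling points
x = rng.uniform(0.0, 1.0, size=m)
Phi = np.ones((m, n))
for k in range(1, n // 2):
    Phi[:, 2 * k - 1] = np.sqrt(2.0) * np.cos(2 * np.pi * k * x)
    Phi[:, 2 * k] = np.sqrt(2.0) * np.sin(2 * np.pi * k * x)
Phi[:, n - 1] = np.sqrt(2.0) * np.cos(2 * np.pi * (n // 2) * x)

# noisy sampled data: scale e so that the real noise bound eps_hat equals eps
e = rng.standard_normal(m)
e *= eps / np.linalg.norm(e)
eps_hat = np.linalg.norm(e)
b = Phi @ c + e

c_hat = omp(Phi, b, s)
err = np.linalg.norm(c - c_hat)
print("reconstruction error:", err, "  noise bound:", eps_hat)
```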

Verification of the Accuracy of OMP to Recover Coefficient Vectors When Sampled Data Contain Noise
The case of sampled data g(x_i), i = 1, 2, · · · , m with noise can be regarded as the case in which the coefficient vector contains noise. It can be written in matrix-vector form as b̂ = Φ̂ĉ, where Φ̂ and b̂ are given by (14) and (15), respectively, and the vector ĉ = [ĉ_1, · · · , ĉ_n] ∈ R^{n×1}. At this point, only the s main terms of the vector ĉ have large absolute values, and the absolute values of the remaining terms are small. Therefore, in this subsection, for the case of a coefficient vector with noise, we apply OMP to recover the noisy coefficient vector ĉ and then verify the recovery effect of OMP. In this experiment, the noise bound is still taken as ε = 10⁻⁵, and the numbers of basis functions n and sampling points m are still taken as in Tables 1 and 2 for s = 5 and s = 10, respectively. We solve model (16) by OMP in 1000 independent repeated experiments and record the actual error ‖ĉ − c̃‖₂ between the noisy coefficient vector ĉ and the vector c̃ recovered by OMP when OMP succeeds. The average actual errors for s = 5 and s = 10 are shown in Tables 5 and 6, respectively.
The first column of Tables 5 and 6 indicates the number of basis functions, the second column indicates the value of the parameter p, and the remaining columns indicate the average actual error when OMP succeeds. From Tables 5 and 6, it is easy to see that when OMP succeeds, the actual recovery error is smaller than the noise bound ε, which indicates the accuracy of OMP in recovering the noisy coefficient vector and also shows the noise resistance of OMP.
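This experiment can likewise be sketched with illustrative sizes and the real trigonometric system on [0, 1] (the noisy coefficient vector is denoted c_noisy here, and the sketch only checks that the s-term recovery error stays of the order of the noise bound):

```python
import numpy as np

def omp(Phi, b, s):
    """Greedy OMP: select s columns, re-fitting by least squares each round."""
    m, n = Phi.shape
    residual = b.copy()
    support = []
    for _ in range(s):
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], b, rcond=None)
        residual = b - Phi[:, support] @ coef
    c_hat = np.zeros(n)
    c_hat[support] = coef
    return c_hat

rng = np.random.default_rng(2)
n, m, s, eps = 64, 60, 5, 1e-5        # illustrative sizes; eps = noise bound

# exactly s-sparse vector, then perturb every entry: only s terms stay large
c = np.zeros(n)
supp = rng.choice(n, size=s, replace=False)
c[supp] = rng.choice([-1.0, 1.0], size=s) * rng.uniform(1.0, 2.0, size=s)
noise = rng.standard_normal(n)
noise *= eps / np.linalg.norm(noise)   # perturbation with 2-norm equal to eps
c_noisy = c + noise                    # the noisy coefficient vector

# real trigonometric system on [0, 1] at uniform sampling points
x = rng.uniform(0.0, 1.0, size=m)
Phi = np.ones((m, n))
for k in range(1, n // 2):
    Phi[:, 2 * k - 1] = np.sqrt(2.0) * np.cos(2 * np.pi * k * x)
    Phi[:, 2 * k] = np.sqrt(2.0) * np.sin(2 * np.pi * k * x)
Phi[:, n - 1] = np.sqrt(2.0) * np.cos(2 * np.pi * (n // 2) * x)

b = Phi @ c_noisy                      # sampled data from the noisy vector
c_rec = omp(Phi, b, s)                 # s-term recovery of the noisy vector
print("actual error:", np.linalg.norm(c_noisy - c_rec))
```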

Conclusions
The main work of this paper is to give sufficient conditions for OMP to reconstruct sparse polynomials generated by uniformly bounded orthonormal systems in both cases of sampled data with and without noise, and to give the recovery error estimate for the sparse coefficient vector when the sampled data contain noise. The work in this paper can be regarded as a generalization of the study of OMP for reconstructing sparse trigonometric polynomials in [21].
For the structured random matrix generated by a uniformly bounded orthonormal system and random sampling points, a more accurate estimate of the upper bound of the mutual coherence of this matrix is first given in this paper. Then, for both cases of sampled data with and without noise, when the measurement matrix is such a structured random matrix, more relaxed sufficient conditions for OMP to recover sparse coefficient vectors with high probability are given in terms of the mutual coherence of the measurement matrix, which further relax the condition µ(Φ) < 1/(2s − 1) given in [9,19]. Meanwhile, the requirement on the sparse coefficient vector for OMP to exactly identify the positions of the nonzero elements is given when the sampled data contain noise. Finally, combining these two parts, it is proved that, regardless of whether the sampled data contain noise, when the number of sampling points satisfies m ∼ s² ln n, OMP reconstructs general sparse polynomials with probability at least 1 − n^{−p}. Furthermore, a simple calculation shows that the reconstruction error between the original and recovered coefficient vectors can be controlled by the noise bound. In addition, the research methods and conclusions of this paper can be extended to multivariate sparse polynomial reconstruction problems.
However, the experiments in Section 4.2 show that the actual success probability of OMP is higher than the theoretical one, which indicates that the recovery guarantee given in this paper is not optimal. More advanced probabilistic tools might yield a sharper upper bound estimate for the mutual coherence of the measurement matrix, which in turn could further improve the recovery guarantee and the success probability for OMP.

Conflicts of Interest:
The authors declare no conflict of interest.