A New Method of Measurement Matrix Optimization for Compressed Sensing Based on Alternating Minimization

In this paper, a new method of measurement matrix optimization for compressed sensing based on alternating minimization is introduced. The optimal measurement matrix is formulated as the minimizer of the Frobenius norm of the difference between the Gram matrix of the sensing matrix and a target Gram matrix. The method minimizes the mutual coherence indexes simultaneously, including the maximum mutual coherence μmax, the t-averaged mutual coherence μave and the global mutual coherence μall, and thus avoids the problem that minimizing a single index usually results in the deterioration of the others. Firstly, the threshold of the shrinkage function is raised above the Welch bound, and the relaxed Equiangular Tight Frame obtained by applying the new function to the Gram matrix is taken as the initial target Gram matrix; this reduces μave and avoids the increase in μmax caused by the lower threshold of the known shrinkage function. Then a new target Gram matrix is obtained by sequentially applying rank reduction and eigenvalue averaging to the initial one, leading to a lower μall. The analytical solutions for the measurement matrix are derived by SVD, and an alternating scheme is adopted in the method. Simulation results show that the proposed method simultaneously reduces the above three indexes and outperforms the known algorithms in terms of reconstruction performance.


Introduction
Compressed sensing (CS) [1] can sample sparse or compressible signals at a sub-Nyquist rate, which brings great convenience for data storage, transmission, and processing. By adopting reconstruction algorithms, the signal can be exactly reconstructed from the sampled data. As a powerful signal processing paradigm, CS is applied in different fields such as image encryption [2], wideband spectrum sensing [3], wireless sensor network data processing [4], etc.
The original signal x ∈ R^{N×1} is assumed to have a sparse representation in a known domain Ψ ∈ R^{N×L} (N ≤ L) as x = Ψs, where Ψ is the dictionary matrix and s is a K-sparse signal. The incomplete measurement y ∈ R^{M×1} is obtained through the linear model

y = Φx = Ds, (1)

where Φ ∈ R^{M×N} (M < N) is called the measurement matrix and D = ΦΨ is the sensing matrix. Some specific properties of Φ have great impacts on the reconstruction performance. In [5,6], the spark and the restricted isometry property (RIP) are respectively proposed as sufficient conditions on Φ for recovery guarantees. However, computing the spark of a matrix has combinatorial complexity and certifying the RIP requires a combinatorial search; that is to say, these tasks are NP-hard and difficult to accomplish. To a large extent, the coherence between Φ and Ψ reflects how well the above conditions are met. Actually, this coherence is equivalent to the mutual coherence of D. Since the mutual coherence can be easily manipulated to provide recovery guarantees, it is commonly used to measure the performance of Φ. The frequently used mutual coherence indexes include the maximum mutual coherence µmax [7], the t-averaged mutual coherence µave [8] and the global mutual coherence µall [9], which respectively represent the maximum, the average and the sum of squares of the correlations between distinct pairs of columns in D.
The first attempt to consider the optimal design of Φ is given in [8]. The simulations carried out in [8] show that the optimized Φ leads to a smaller µave and a substantially better CS reconstruction performance. Since then, the optimization of Φ has become an important issue in CS. Recent works try to find the optimal Φ, which has excellent performance in reducing the mutual coherence, by minimizing the Frobenius norm of the difference between the Gram matrix G = D^T D and a target Gram matrix G_t. The main work focuses on designing G_t and finding the best Φ.
In [8], G_t is obtained by shrinking the off-diagonal entries in G. The shrinkage technique reduces µave but is time-consuming. Furthermore, µmax remains large, which ruins the worst-case guarantees of the reconstruction algorithms. In [10], a suitable point between the current solution and the one obtained using a new shrinkage function is chosen to design G_t. It is very competitive in µave and µmax. However, the optimal point is hard to determine, and an unsuitable point may seriously degrade the algorithm's performance. In [9], G_t is obtained by averaging the eigenvalues of G. Simulation results show that µall is reduced effectively, but the reduction of µave and µmax cannot be guaranteed, which means µave and µmax may maintain large values. Duarte-Carvajalino and Sapiro [11] set G_t as an identity matrix. Since D is overcomplete and G cannot be an identity matrix, simply minimizing the difference between G and G_t does not imply a low µmax [12]. In [13-17], G_t is chosen from a set of relaxed Equiangular Tight Frames (ETFs) [18]. The set can be formulated as

{G_t ∈ R^{L×L} : G_t = G_t^T, G_t(i,i) = 1, max_{i≠j} |G_t(i,j)| ≤ µwelch},

where µwelch denotes the Welch bound [19] and G_t(i,j) denotes the (i,j)th entry of G_t. However, the maximum absolute value of the off-diagonal entries in G is almost always greater than µwelch. In this case, the optimization usually yields a solution D with a low µave but a high µmax. In summary, the target Gram matrices mentioned above only focus on a single mutual coherence index and fail to take µave, µmax and µall into account simultaneously. When one index is targeted, the other indexes may not decrease significantly or may even increase. Therefore, Φ is not 'good' enough and the reconstruction performance is well below par.
After designing G_t, the next step is to find the 'best' Φ by making G approach G_t. In [8-11,17], a rank-M approximation Ĝ of G_t is obtained primarily by applying SVD, a square root D̂ of Ĝ is then built such that D̂^T D̂ = Ĝ, and at last Φ is obtained by Φ = D̂Ψ†, where † denotes the Moore-Penrose pseudoinverse. This kind of method is intuitive, but the generalized pseudoinverse poses problems of calculation accuracy and robustness [15]. In [14,15], a gradient algorithm and a quasi-Newton algorithm are respectively utilized to attain Φ. Firstly, the cost function F(Φ) with Φ as the variable is constructed. Then the search direction is determined by the derivative of F(Φ). Finally, Φ is obtained with a fixed step size. However, choosing a suitable step size, which has a great influence on the accuracy of the solution, requires a lot of comparison work. Moreover, the gradient algorithm and the quasi-Newton algorithm cannot converge until a certain number of iterations is accomplished, resulting in a high computational cost. In [11,16], the method for designing Φ shares the same concept as K-SVD [20], that is, to update the matrix row by row. Eigenvalue decomposition is required to find the square root of the maximum eigenvalue for each row, which results in a significant increase in the calculation. To solve this problem, Hong et al. [16] utilize the power method instead of eigenvalue decomposition. However, the eigenvalue obtained by the power method is the one with the largest absolute value. When that eigenvalue is negative, eigenvalue decomposition is still necessary.
The primary contributions of this paper are threefold:

• A new target Gram matrix G_t that targets µave, µmax, and µall of D simultaneously is designed. Firstly, a new shrinkage function whose threshold exceeds µwelch is utilized to determine the initial target Gram matrix. Then G_t is obtained by sequentially applying rank reduction and eigenvalue averaging to the initial matrix.
• Analytical solutions of the measurement matrix Φ that minimize the difference between G = Ψ^T Φ^T ΦΨ and G_t are derived by SVD.
• Based on alternating minimization, an iterative method is proposed to optimize the measurement matrix. The simulation results confirm the effectiveness of the proposed method in decreasing the mutual coherence indexes and improving the reconstruction performance.
The remainder of this paper is organized as follows. Some basic definitions related to mutual coherence indexes and frames are described in the next section. The main results are presented in Section 3, where the solutions to the G_t design are characterized and a class of solutions for the optimal Φ is derived in detail. The procedure of our method and the discussion can also be found in Section 3. In Section 4, simulations are carried out to confirm the effectiveness of the proposed method. In the end, the conclusion is drawn.

Mutual Coherence Indexes
Let D ∈ R^{M×L} be a matrix with ℓ2-normalized columns d_1, d_2 ⋯ d_L, let G = D^T D be its Gram matrix, and let g_ij denote the entry at the position of row i and column j in G, where i, j = 1, 2 ⋯ L. Here, we quote the definitions of the mutual coherence indexes as presented by Donoho [7], Elad [8], and Zhao [9].

Definition 1. For a matrix D, the maximum mutual coherence µmax is defined as the largest absolute and normalized inner product between all columns in D:

µmax = max_{1≤i,j≤L, i≠j} |d_i^T d_j| / (‖d_i‖₂ ‖d_j‖₂) = max_{i≠j} |g_ij|.

Definition 2. For a matrix D, the t-averaged mutual coherence µave is defined as the average of all absolute and normalized inner products between different columns in D that are above t:

µave = ( Σ_{i≠j, |g_ij|≥t} |g_ij| ) / ( Σ_{i≠j, |g_ij|≥t} 1 ).

Definition 3. For a matrix D, the global mutual coherence µall is defined as the sum of squares of the normalized inner products between all distinct columns in D:

µall = Σ_{i=1}^{L} Σ_{j=1, j≠i}^{L} g_ij².

As shown in [5], the original signal can be exactly reconstructed as long as K < (1 + 1/µmax)/2. This conclusion is true from a worst-case standpoint, which means that µmax does not do justice to the actual behavior of sparse representations. Therefore, Elad considers that an "average" measure of mutual coherence, namely µave, is more likely to describe the true behavior. Different from the previous two indexes, µall reflects the overall property of D.
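As a concrete illustration, the three indexes can be computed directly from the normalized Gram matrix. The following NumPy sketch (the function name and the random test matrix are ours, not the paper's) follows Definitions 1-3:

```python
import numpy as np

def coherence_indexes(D, t):
    """mu_max, t-averaged mu_ave and mu_all of D (Definitions 1-3)."""
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm columns
    G = Dn.T @ Dn                                      # normalized Gram matrix
    L = G.shape[0]
    off = np.abs(G[~np.eye(L, dtype=bool)])            # all off-diagonal |g_ij|
    mu_max = off.max()
    above = off[off >= t]                              # entries counted by mu_ave
    mu_ave = above.mean() if above.size else 0.0
    mu_all = np.sum(off ** 2)
    return mu_max, mu_ave, mu_all

rng = np.random.default_rng(0)
D = rng.standard_normal((28, 120))   # a random sensing matrix, as in Section 4
mu_max, mu_ave, mu_all = coherence_indexes(D, t=0.2)
```

Note that `off` counts each column pair twice; this does not affect µmax or µave, and matches the "all columns" double sum in Definition 3.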
In fact, the purpose of reducing the mutual coherence indexes of D is to attain G that meets the following requirements: (1) The maximum absolute value of off-diagonal entries in G is sufficiently small; (2) The number of off-diagonal entries with large absolute value is minimized; (3) The average of off-diagonal entries with large absolute value is as small as possible. However, when a certain mutual coherence index is targeted solely, we cannot guarantee that the obtained G will fully meet the requirements. Therefore, the decrease of a certain index does not always mean better Φ and improved reconstruction performance. When the three indexes are reduced simultaneously, the requirements are better satisfied and better performance is obtained.

ETFs
It is shown in [19] that any matrix D ∈ R^{M×L} with unit-norm columns satisfies µmax ≥ √((L − M)/(M(L − 1))) = µwelch. The bound is achievable for ETFs. Here, we recall the definition of ETF [18].

Definition 4.
Let F be an M × L matrix whose columns are f_1, f_2 ⋯ f_L. The matrix F is called an equiangular tight frame if it satisfies three conditions: (1) Each column has a unit norm: ‖f_i‖₂ = 1 for i = 1, 2 ⋯ L; (2) The columns are equiangular: for some nonnegative θ, |f_i^T f_j| = θ when i, j = 1, 2 ⋯ L and i ≠ j; (3) The columns form a tight frame: FF^T = (L/M)I_M. Sustik et al. [18] show that a real ETF with L ≠ 2M requires √(M(L − 1)/(L − M)) and √((L − M)(L − 1)/M) to be odd integers, and that when L = 2M, M must be an odd number and 2M − 1 must be the sum of two squares. Fickus et al. [21] survey some known constructions of ETFs and tabulate their existence for sufficiently small dimensions. The above studies show that M and L must meet some exacting requirements for an ETF to be available for D. However, it is really difficult to meet these requirements in practice, which means the maximum absolute value of the off-diagonal entries in G is usually significantly larger than µwelch.
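The Welch bound itself is simple to evaluate. The helper below (ours, not from the paper) computes it for the matrix sizes used later in the simulations, where D is 28 × 120:

```python
import numpy as np

def welch_bound(M, L):
    # Lower bound on mu_max for any M x L matrix with unit-norm columns.
    return np.sqrt((L - M) / (M * (L - 1)))

# For the simulation size of Section 4 (M = 28, L = 120) this gives
# 0.1662, the value quoted with Figure 3.
print(round(welch_bound(28, 120), 4))
```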

The Proposed Method
The off-diagonal entries in G are in fact the inner products between different columns in D. Reducing those entries is likely to lead to lower mutual coherence indexes and better performance. The most straightforward approach is to replace large off-diagonal values with small ones. However, it is impossible to solve Φ from such a modified G because the rank of Ψ^T Φ^T ΦΨ differs from the rank of G when this approach is adopted. Therefore, a feasible approach is to minimize the difference between G and G_t, which can be formulated as

min_Φ ‖G_t − G‖_F², (5)

where G = Ψ^T Φ^T ΦΨ. This problem can be solved by an alternating minimization strategy [14,16], which iteratively minimizes (5) to find the desired Φ. The idea is to update G_t and Φ alternately and to repeat this procedure until a stop criterion is reached. In this section, we first design G_t and then derive the analytical solutions of Φ. Finally, an iterative method based on alternating minimization is proposed to optimize the measurement matrix.
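The objective in (5) is straightforward to evaluate; this short sketch (function names ours) computes the quantity that the alternating scheme drives down:

```python
import numpy as np

def gram(Phi, Psi):
    """Normalized Gram matrix of the sensing matrix D = Phi @ Psi."""
    D = Phi @ Psi
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    return D.T @ D

def objective(Phi, Psi, Gt):
    """Frobenius objective ||Gt - Psi^T Phi^T Phi Psi||_F^2 from (5),
    with column-normalized D as used throughout the paper."""
    return np.linalg.norm(Gt - gram(Phi, Psi), 'fro') ** 2

rng = np.random.default_rng(0)
Phi = rng.standard_normal((28, 80))
Psi = rng.standard_normal((80, 120))
```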

The Design of G t
It can be seen from (5) that G_t plays an important role in measurement matrix optimization. In recent works, G_t is frequently set as the relaxed ETF matrix, which is obtained by applying the following shrinkage function to the off-diagonal entries of G:

G_t(i,j) = { g_ij, |g_ij| < ς; sign(g_ij)·ς, |g_ij| ≥ ς }, (6)

where ς = µwelch for i, j = 1, 2 ⋯ L and i ≠ j. Such a scheme guarantees that the off-diagonal entries of G with large values are intensively constrained, which means a lower µave and µmax. Recall from Section 2.2 that the Welch bound is not achievable for G in most cases. As shown in [14,16], different values of ς yield different results, and µwelch is not the optimal value. Li et al. [22] found that a smaller µmax is obtainable when ς is slightly larger than µwelch. Inspired by [22], we propose an improved shrinkage function (7) which divides the entries in G into three segments through two thresholds: one threshold is µwelch and the other is Thr = µwelch + c with 0 < c < µwelch. As can be seen from Equation (7), the maximum absolute value of the off-diagonal entries in G_t is raised from µwelch to Thr. According to the previous analysis, the new function is likely to lead to a further reduction in µmax while maintaining the advantage of Equation (6) with respect to µave. After shrinkage, G_t generally becomes full rank [8], that is, Rank(G_t) = L. However, the rank of G is identically equal to M. Thus, we consider mending this by forcing a rank of M. A new target Gram matrix, denoted as G_{t_M}, is obtained by solving

min_{G_{t_M}} ‖G_t − G_{t_M}‖_F² s.t. Rank(G_{t_M}) = M. (8)

The solutions to this problem are given by Theorem 1 below.
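The two-threshold shrinkage can be sketched as follows. Note that the exact treatment of the middle band [µwelch, Thr) is our assumption (we pull it down to µwelch), since the surrounding text fixes only the two thresholds and the raised maximum Thr:

```python
import numpy as np

def shrink_two_threshold(G, mu_welch, c):
    """Two-threshold shrinkage in the spirit of Equation (7).

    Assumed piecewise rule: entries below mu_welch are kept, entries in
    [mu_welch, Thr) are pulled down to mu_welch, and entries of magnitude
    at least Thr are clipped to Thr, so the largest off-diagonal magnitude
    of the result is Thr = mu_welch + c.
    """
    Thr = mu_welch + c
    Gt = G.copy()
    mag = np.abs(Gt)
    mid = (mag >= mu_welch) & (mag < Thr)
    top = mag >= Thr
    Gt[mid] = np.sign(Gt[mid]) * mu_welch
    Gt[top] = np.sign(Gt[top]) * Thr
    np.fill_diagonal(Gt, 1.0)   # unit diagonal, as for a normalized Gram matrix
    return Gt
```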
Theorem 1. Let G_t be the matrix obtained by applying the shrinkage operation shown as Equation (7) to G, and let G_t = PΛP^T be the eigendecomposition of G_t, where P is orthonormal with dimension L and Λ = diag(λ_1, λ_2 ⋯ λ_L) with |λ_1| ≥ |λ_2| ≥ ⋯ ≥ |λ_L|. The solutions of the minimization problem defined by (8) are characterized by

G_{t_M} = PΛ_M P^T, Λ_M = diag(λ_1, λ_2 ⋯ λ_M, 0 ⋯ 0).

Proof. Since Rank(G_{t_M}) = M, there exists X ∈ R^{M×L} such that G_{t_M} = X^T X. Let X = U_X [Λ_X 0] V_X^T be the SVD of X, where U_X ∈ R^{M×M} and V_X ∈ R^{L×L} are unitary, so that G_{t_M} = V_X diag(Λ_X², 0) V_X^T. Denote f = ‖G_t − G_{t_M}‖_F². By substituting G_{t_M} with X^T X, f can be rewritten as a function of the matrix X, and the optimal X should satisfy ∂f/∂X = 0, which yields X(G_t − X^T X) = 0, i.e., XG_t = XX^T X. Substituting the SVD of X into this identity shows that the right singular vectors of X associated with nonzero singular values are eigenvectors of G_t, and that the corresponding squared singular values equal the corresponding eigenvalues of G_t. Hence V_X can be chosen as P, and G_{t_M} = P diag(λ_{i_1} ⋯ λ_{i_M}, 0 ⋯ 0) P^T for some index set {i_1 ⋯ i_M}. In this case, f = Σ_{i∉{i_1⋯i_M}} λ_i², which is minimized by retaining the M eigenvalues of largest magnitude. With |λ_1| ≥ |λ_2| ≥ ⋯ ≥ |λ_L|, the minimizer is G_{t_M} = PΛ_M P^T with Λ_M = diag(λ_1 ⋯ λ_M, 0 ⋯ 0). As can be seen, the rank of G_{t_M} is equal to M. The proof is then completed.

After the operation of rank reduction, the rank of G_{t_M} is equal to that of G. Additionally, G_{t_M} is the closest matrix to G_t in terms of the Frobenius norm. Inspired by [9], we further reduce the sum of squares of all off-diagonal values of G_{t_M}, denoted μ̂_all, by eigenvalue averaging.
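The rank reduction of Theorem 1 is a truncated eigendecomposition; a NumPy sketch (names ours):

```python
import numpy as np

def rank_reduce(Gt, M):
    """G_{t_M} = P Lambda_M P^T: keep the M largest-magnitude eigenvalues
    of the symmetric matrix Gt and zero out the rest (Theorem 1)."""
    lam, P = np.linalg.eigh(Gt)              # Gt = P diag(lam) P^T
    keep = np.argsort(-np.abs(lam))[:M]      # indexes of the M largest |lambda|
    lam_M = np.zeros_like(lam)
    lam_M[keep] = lam[keep]
    return (P * lam_M) @ P.T                 # P @ diag(lam_M) @ P^T

rng = np.random.default_rng(0)
Gt = rng.standard_normal((40, 40))
Gt = (Gt + Gt.T) / 2                         # symmetric stand-in for a shrunk Gram
GtM = rank_reduce(Gt, 10)
```

The residual ‖G_t − G_{t_M}‖_F² equals the sum of squares of the dropped eigenvalues, which is what the proof minimizes.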
When minimizing the difference between G and G_{t_M}, a smaller μ̂_all is more likely to lead to a smaller µall. μ̂_all can be formulated as

μ̂_all = Σ_{i=1}^{L} Σ_{j=1, j≠i}^{L} ĝ_ij²,

where ĝ_ij denotes the (i,j)th entry of G_{t_M} and ĝ_ii = 1 holds for i = 1, 2 ⋯ L.
Noting that ‖G_{t_M}‖_F² = Σ_{i=1}^{L} Σ_{j=1}^{L} ĝ_ij² = Σ_{i=1}^{M} λ_i², μ̂_all can be rewritten as

μ̂_all = Σ_{i=1}^{M} λ_i² − Σ_{i=1}^{L} ĝ_ii². (20)

Assuming that the diagonal entries change little, μ̂_all is reduced by minimizing the first term Σ_{i=1}^{M} λ_i² under the trace constraint Σ_{i=1}^{M} λ_i = tr(G_{t_M}), whose minimum is attained when all non-zero eigenvalues are equal. The resulting target Gram matrix with M equal non-zero eigenvalues is given by

G_{t_opt} = PΛ_opt P^T, Λ_opt = diag(λ̄ ⋯ λ̄, 0 ⋯ 0), λ̄ = (1/M) Σ_{i=1}^{M} λ_i.

Recall that G_{t_M} is the closest matrix to G_t in terms of the Frobenius norm, which means that G_{t_M} is of good competitiveness in µave and µmax. Furthermore, as a variant of G_{t_M}, G_{t_opt} reduces the sum of squares of all off-diagonal values of G_{t_M}, leading to a better performance in minimizing µall. Therefore, G_{t_opt} is more likely to be an ideal target Gram matrix which leads to better µave, µmax, and µall simultaneously.
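Eigenvalue averaging then replaces the M retained eigenvalues by their mean while preserving the trace. A sketch under the same naming assumptions as before:

```python
import numpy as np

def eigen_average(GtM, M):
    """G_{t_opt}: replace the M largest-magnitude eigenvalues of GtM by
    their average lam_bar and zero the rest, preserving the trace."""
    lam, P = np.linalg.eigh(GtM)
    keep = np.argsort(-np.abs(lam))[:M]
    lam_bar = lam[keep].mean()
    PM = P[:, keep]
    return lam_bar * (PM @ PM.T)             # P @ diag(lam_bar,...,0) @ P^T

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 40))
GtM = A.T @ A                                # rank-10 PSD stand-in for G_{t_M}
Gopt = eigen_average(GtM, 10)
```

Averaging minimizes the sum of squared eigenvalues for a fixed eigenvalue sum, which is exactly the first term of (20).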

The Analytical Solutions of Φ
After obtaining the target Gram matrix G_{t_opt}, the next step of the optimization is to find the best Φ. To handle the problem, we try to find the optimal solution by minimizing the difference between Ψ^T Φ^T ΦΨ and G_{t_opt} as

min_Φ ‖G_{t_opt} − Ψ^T Φ^T ΦΨ‖_F². (21)

Theorem 2. Let G_{t_opt} = PΛ̄P^T be the target Gram matrix designed in Section 3.1, let Ψ = U_Ψ [Σ_Ψ 0] V_Ψ^T be the SVD of Ψ, and let Λ_M ∈ R^{M×M} be the Mth principal submatrix of Λ̄. Then the solutions of the minimization problem defined by (21) are characterized by

Φ_opt = U_Z Λ_M^{1/2} [I_M 0] P^T V_Ψ [Σ_Ψ^{-1} 0]^T U_Ψ^T,

where U_Z ∈ R^{M×M} is an arbitrary unitary matrix.
Proof. Assume that the diagonal entries of Σ_D and Σ_Ψ are non-zero. Since ΨΨ† = I_N, Φ can be written in terms of the SVDs of D = ΦΨ and Ψ as

Φ = U_D [Σ_D 0] V_D^T V_Ψ [Σ_Ψ^{-1} 0]^T U_Ψ^T. (24)

By substituting Φ in (21) with Equation (24), it can be shown with some manipulations that the solutions of the problem in (21) are determined by Σ_D and V_D. Let Z = V_D^T PΛ̄P^T V_D and let z_i be the ith diagonal entry of Z. Denote Λ_Z = diag(z_1, z_2 ⋯ z_M). With further manipulations, the minima are achievable only if Λ_Z = Σ_D² holds and ‖Λ_Z‖_F² takes its maximum value. Let U = V_D^T P and let u_ij be the (i,j)th entry of U for i, j = 1, 2 ⋯ L. Noting that the top M diagonal entries of Λ̄ are all equal to λ̄, we have z_i = λ̄(u_i1² + u_i2² + ⋯ + u_iM²), so ‖Λ_Z‖_F² reaches its maximum Mλ̄² when z_i = λ̄ holds for i = 1, 2 ⋯ M. Partition U as U = [U_1 U_2; U_3 U_4], where U_1 ∈ R^{M×M}. Since z_i = λ̄ holds for i = 1, 2 ⋯ M, we have u_i1² + u_i2² + ⋯ + u_iM² = 1 and accordingly U_2 = 0 and U_3 = 0. As U is a unitary matrix, it is easy to verify that U_1 and U_4 are both unitary. Then it follows that

V_D = P [U_1^T 0; 0 U_4^T]. (28)

Substituting Equation (28), together with Σ_D = Λ_M^{1/2}, into Equation (24) and absorbing the unitary product U_D U_1^T into the arbitrary unitary factor U_Z, the optimal solution in Theorem 2 is obtained. The proof is then completed.
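In the same spirit, a square root of G_{t_opt} built from its top-M eigenpairs and mapped back through Ψ† (taking U_Z = I) gives a concrete Φ. This is our sketch of the closed form, not necessarily the paper's exact expression; when Ψ is square and orthonormal (as in the DCT case of Section 4), the fit Ψ^T Φ^T ΦΨ = G_{t_opt} is exact:

```python
import numpy as np

def solve_phi(G_t_opt, Psi, M):
    """Phi in the spirit of Theorem 2 (sketch, U_Z = I): build an M x L
    square root of G_t_opt from its top-M eigenpairs and map it back
    through the pseudoinverse of Psi."""
    lam, P = np.linalg.eigh(G_t_opt)
    keep = np.argsort(-lam)[:M]                        # largest eigenvalues
    D = np.diag(np.sqrt(lam[keep])) @ P[:, keep].T     # D^T D = G_t_opt
    return D @ np.linalg.pinv(Psi)

# Sanity check with a square orthonormal Psi:
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 9))
lam, P = np.linalg.eigh(A.T @ A)
keep = np.argsort(-lam)[:4]
G_t_opt = lam[keep].mean() * (P[:, keep] @ P[:, keep].T)   # 4 equal eigenvalues
Psi = np.eye(9)
Phi = solve_phi(G_t_opt, Psi, 4)
```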

Comments
According to Sections 3.1 and 3.2, the procedure for measurement matrix optimization is summarized in Algorithm 1.

Algorithm 1. The proposed optimization method.
Input: Dictionary matrix Ψ ∈ R^{N×L} (N ≤ L) which has an SVD form of Ψ = U_Ψ [Σ_Ψ 0] V_Ψ^T, number of iterations Iter, constant c, Welch bound µwelch.
Output: Measurement matrix Φ_opt.
Initialization: Initialize Φ_0 ∈ R^{M×N} to a random matrix and U_Z ∈ R^{M×M} to a unitary matrix.
For l = 1 to Iter do
1. Compute the sensing matrix D = Φ_{l−1} Ψ and normalize the columns in D.
2. Compute the Gram matrix G = D^T D.
3. Shrink G by Equation (7) and obtain G_t.
4. Apply eigenvalue decomposition to G_t and obtain G_t = PΛP^T.
5. Compute the average of the top M diagonal entries in Λ, denoted as λ̄.
6. Construct the target Gram matrix G_{t_opt} = PΛ_opt P^T with Λ_opt = diag(λ̄ ⋯ λ̄, 0 ⋯ 0).
7. Update Φ_l according to Theorem 2.
end
return Φ_Iter
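Putting the steps together, Algorithm 1 can be sketched as follows. The shrink step is simplified here to a plain clip at Thr and the Φ update goes through Ψ†, so this is an illustration of the alternating scheme rather than the paper's exact implementation:

```python
import numpy as np

def optimize_phi(Psi, M, c, iters=50, seed=0):
    """Alternating-minimization sketch of Algorithm 1 (simplified)."""
    rng = np.random.default_rng(seed)
    N, L = Psi.shape
    Phi = rng.standard_normal((M, N))                  # random Phi_0
    mu_w = np.sqrt((L - M) / (M * (L - 1)))            # Welch bound
    Thr = mu_w + c
    Psi_pinv = np.linalg.pinv(Psi)
    for _ in range(iters):
        D = Phi @ Psi                                  # step 1: sensing matrix
        D = D / np.linalg.norm(D, axis=0, keepdims=True)
        G = D.T @ D                                    # step 2: Gram matrix
        Gt = np.clip(G, -Thr, Thr)                     # step 3: shrink (simplified)
        np.fill_diagonal(Gt, 1.0)
        lam, P = np.linalg.eigh(Gt)                    # step 4: eigendecomposition
        keep = np.argsort(-lam)[:M]
        lam_bar = lam[keep].mean()                     # step 5: average top M
        D_new = np.sqrt(lam_bar) * P[:, keep].T        # step 6: sqrt of G_t_opt
        Phi = D_new @ Psi_pinv                         # step 7: update Phi
    return Phi

rng = np.random.default_rng(1)
Psi = rng.standard_normal((20, 30))
Phi_opt = optimize_phi(Psi, M=8, c=0.01)
```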
Noting that G_t plays an important role in measurement matrix optimization, Algorithm 1 takes µave, µmax and µall into consideration simultaneously when designing G_t. By minimizing (5), the Gram matrix becomes as close as possible to G_t in terms of the Frobenius norm, which maintains the advantage of G_t in reducing the mutual coherence indexes. Therefore, Algorithm 1 is effective in reducing µave, µmax, and µall.
In the shrinkage function, different thresholds yield different results. Inspired by [22], we propose the shrinkage function shown as (7), which has a new threshold µwelch + c. We have not derived the optimal value of c in theory, but setting c to a proper value still leads to satisfactory results.
After averaging the eigenvalues of G_{t_M}, the first term on the right part of (20) is minimized. However, the diagonal entries of G_{t_M} change accordingly. Hence, we cannot ensure that μ̂_all reaches its minimum; that is to say, G_{t_opt} may not be the optimal solution in terms of µall. Fortunately, we find that the change of the term Σ ĝ_ii² in (20) is small, which means our approach is effective in reducing µall. The proposed algorithm is an iterative one. The main complexity of Algorithm 1 in each iteration is located at steps 1, 2, 4, and 7. For those steps, the flops required are O(MNL), O(ML²), O(L³), and O(L³), respectively. Hence, the complexity of Algorithm 1 is approximately O(Iter·L³). Since the complexity of the similar algorithms in [8,16,17], which apply eigenvalue decomposition or SVD, is no less than O(Iter·L³), the proposed algorithm does not increase the complexity significantly.

Simulation Results and Discussion
In this section, we first conduct simulations to predetermine a suitable c. Then, we examine the mutual coherence indexes and reconstruction performance of the proposed method and compare them with the well-established similar algorithms given in [8,16,17] by presenting empirical results. Last, we verify the effectiveness of our method with various measurement matrices and dictionary matrices. The iteration number Iter is set to 100 and t is set to µwelch. For a given dictionary matrix Ψ ∈ R^{80×120}, x ∈ R^{120×1} has a sparse representation x = Ψs, where s is K-sparse and each non-zero entry is randomly positioned and drawn i.i.d. from a zero-mean, unit-variance Gaussian distribution. The Orthogonal Matching Pursuit (OMP) [23] algorithm is employed in signal reconstruction. Denote by ε = ‖x_e − x‖₂/‖x‖₂ the reconstruction error, where x_e is the reconstructed signal. The reconstruction is identified as a success, called exact reconstruction, provided ε ≤ 10⁻⁶. Denote by P_suc the percentage of successful reconstructions. In Sections 4.1-4.3, Φ_0 and Ψ are both Gaussian random matrices.
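For completeness, one trial of the reconstruction experiment can be reproduced with a minimal OMP. The implementation below is a standard textbook version (ours, not the exact code of [23]), and ε is the relative error used for the success criterion:

```python
import numpy as np

def omp(D, y, K):
    """Minimal Orthogonal Matching Pursuit: greedily add the column of D
    most correlated with the residual, then least-squares on the support."""
    support, resid = [], y.copy()
    for _ in range(K):
        support.append(int(np.argmax(np.abs(D.T @ resid))))
        s_sup, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        resid = y - D[:, support] @ s_sup
    s = np.zeros(D.shape[1])
    s[support] = s_sup
    return s

rng = np.random.default_rng(0)
Phi = rng.standard_normal((28, 80))      # M = 28 measurement matrix
Psi = rng.standard_normal((80, 120))     # Gaussian dictionary, as in the text
D = Phi @ Psi
s = np.zeros(120)
s[rng.choice(120, 8, replace=False)] = rng.standard_normal(8)   # K = 8 sparse
x = Psi @ s
y = Phi @ x
s_hat = omp(D, y, 8)
eps = np.linalg.norm(Psi @ s_hat - x) / np.linalg.norm(x)   # reconstruction error
```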

The Choice of c
Since an analytical solution for c is extremely difficult to obtain, here we conduct a series of simulations to find a suitable c. Figure 1 illustrates the change tendency of the mutual coherence indexes and P_suc with the argument c. We fix the number of rows M to 28 and the sparsity to 8, and vary c from 0 to 0.16. The experiment is performed for 1000 random sparse ensembles and the results are recorded.
When c = 0, the shrinkage function shown as (7) is the same as (6). As can be seen from the graphs, when c increases, µave increases, µmax and µall decrease first and then increase, and P_suc increases first and then decreases. µmax and µall reach their minima when c = 0.02 and c = 0.03, respectively. It is worth noting that an appropriate increase of c leads to a decrease of µmax and µall but an increase of µave. When c = 0.01, better µmax and µall are obtained and the loss in µave is tolerable. Moreover, P_suc reaches its maximum. Therefore, 0.01 may be a moderate value for c, and c is set to 0.01 in the simulations in Sections 4.2-4.4.

Comparing the Mutual Coherence Indexes
This section presents a series of simulations to compare our method with the algorithms given in [8,16,17] on the three mutual coherence indexes of D obtained by D = Φ_opt Ψ, where Φ_opt is the optimized measurement matrix. For convenience, the methods are denoted as Proposed, Elad, Hong, and Entezari. The down-scaling factor for Elad is set to 0.95. The inner iteration number for Hong is set to 2, which means K-SVD is applied twice in every update of Φ. The point is set to 0.5 to update G_t in Entezari. Figure 2 illustrates the change tendency of the mutual coherence indexes with the iteration number for M = 28. As can be seen from the figure, the indexes corresponding to the different algorithms all change monotonically with the iteration number. When µmax and µave converge, the number of iterations required by our method is almost equal to that of Hong and significantly less than that of Elad. When µall converges, the number of iterations required by our method is equivalent to that of Entezari and significantly less than that of Hong and Elad. Figure 3 presents the histogram of the absolute off-diagonal values of (Φ_opt Ψ)^T (Φ_opt Ψ) for M = 28. It is seen from the figure that Elad and Entezari have long tails, showing that the number of off-diagonal values that exceed 0.34 is relatively large. The tail of Hong is shorter than those of Elad and Entezari and reaches a maximum of 0.34. Compared with Hong, our method has a shorter tail, which reaches a maximum of 0.32, and has more off-diagonal values below µwelch (0.1662).
In conclusion, while effectively reducing µmax and µall, our method maintains a small µave at the same time. Additionally, the number of iterations required for the convergence of each index of our method is significantly less than that of Elad. Therefore, from the point of view of the mutual coherence indexes, the measurement matrix obtained by our method has better properties than those of the other three methods. This coincides with the theoretical results obtained in Section 3.

Comparing the Reconstruction Performance
Case 1. Comparison of the P suc in the noiseless case.
In this case, we conduct two separate CS experiments, first by fixing K = 8 and varying M from 12 to 44, and second by fixing M = 28 and varying K from 4 to 20. Each experiment is performed for 1000 random sparse ensembles and the number of exact reconstructions is recorded. Figures 4 and 5 reveal that the P_suc of our method is the highest, which indicates its superiority over the other three methods.

Case 2. Comparison of the reconstruction error in the noisy case.

To show the robustness of the proposed method in noisy cases, we consider the noisy model y = Φx + v, where v is a vector of additive Gaussian noise with zero mean. We conduct the experiment by fixing M = 28 and K = 8, and varying the SNR from 10 to 50 dB. The experiment is performed for 1000 random sparse ensembles and the average reconstruction error is recorded. From Figure 6, we can see that the reconstruction errors decrease with increasing SNR, and the error of the proposed method is smaller than that of the others.

Table 3 shows that µall of Entezari is slightly larger than that of our method. It is interesting to note from Figure 3 that the number of off-diagonal entries with smaller absolute values in Entezari is significantly larger than that of our method. Moreover, it can be seen from Table 2 that µave of Hong is slightly lower than that of our method. However, the simulation results show that our method outperforms the others in terms of reconstruction performance. It is also worth noting that our method reduces µave, µmax, and µall simultaneously, leading to better reconstruction performance in CS. This implies that a single mutual coherence index cannot accurately reflect the actual performance of the methods, and verifies the necessity of using multiple indexes simultaneously in measurement matrix optimization.

Different Kinds of Φ and Ψ Optimized by the Proposed Methods
To analyze the performance of our method with various measurement matrices and dictionary matrices, a series of simulations are carried out in this section. We choose the measurement matrix as a Gaussian random matrix or a Bernoulli random matrix, and the dictionary matrix as a Gaussian random matrix or the DCT matrix, respectively. We compare the mutual coherence indexes and the reconstruction performance before and after optimization. When Ψ is a Gaussian random matrix, Φ belongs to R^{M×80} and Ψ belongs to R^{80×120}. When Ψ is the DCT matrix, Φ belongs to R^{M×120} and Ψ belongs to R^{120×120}. Each experiment is performed for 1000 random sparse ensembles.
The mutual coherence indexes of the different measurement matrices Φ with different dictionary matrices Ψ are shown in Figure 7. As seen from the simulations, all the optimized measurement matrices produce smaller µmax, µave, and µall than the random ones. Figures 8 and 9 present the reconstruction performance of OMP with the optimized measurement matrices and the random ones. It is seen from the graphs in these figures that all the optimized matrices outperform the random ones in terms of the percentage of exact reconstruction.

Conclusions
This paper focused on the optimization of the measurement matrix for compressed sensing. To decrease µmax, µave, and µall simultaneously, we designed a new target Gram matrix, which was obtained by applying a new shrinkage function to the Gram matrix and updated by performing rank reduction and eigenvalue averaging. Then, we characterized the analytical solutions of the measurement matrix by SVD. Based on alternating minimization, we proposed an iterative method to optimize the measurement matrix. The simulation results show that the proposed method reduces µmax, µave, and µall simultaneously and outperforms the existing algorithms in terms of reconstruction performance. In addition, the proposed method is computationally less expensive than some existing algorithms in the literature.
As detailed above, we determined a suitable value of c for a fixed matrix scale through simulation. When the scale changes, the value of c in Section 4.1 may no longer be applicable. Therefore, it is meaningful to find the theoretical 'optimal value' of c. Furthermore, noting that lower mutual coherence indexes mean potentially higher reconstruction performance, further efforts are needed to decrease the indexes simultaneously.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest:
The authors declare no conflict of interest.