Sparse Bayesian Learning for DOA Estimation with Mutual Coupling

Sparse Bayesian learning (SBL) has given renewed interest to the problem of direction-of-arrival (DOA) estimation. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption may be invalid in practice due to the imperfect manifold caused by unknown or misspecified mutual coupling. This paper describes a modified SBL method for joint estimation of DOAs and mutual coupling coefficients with uniform linear arrays (ULAs). Unlike the existing method that only uses stationary priors, our new approach utilizes a hierarchical form of the Student t prior to enforce the sparsity of the unknown signal more heavily. We also provide a distinct Bayesian inference for the expectation-maximization (EM) algorithm, which can update the mutual coupling coefficients more efficiently. Another difference is that our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.


Introduction
The problem of estimating the direction-of-arrival (DOA) of multiple narrow-band sources has received considerable attention in many fields, e.g., radar, sonar, radio astronomy and mobile communications [1]. Many high-resolution DOA estimation algorithms have been proposed in the last few decades. Their excellent performance relies crucially on perfect knowledge of the array manifold. In practice, however, the array manifold usually suffers from imperfections, such as unknown mutual coupling between antenna elements, imperfectly-known sensor positions and orientations and gain-phase imbalances. Without array manifold calibration, the performance of DOA estimation may degrade substantially. Hence, it is necessary to calibrate imperfections prior to carrying out DOA estimation.
A large number of calibration methods has been proposed to deal with the imperfect manifold caused by unknown or misspecified mutual coupling [2][3][4][5][6][7][8][9][10][11]. For example, some iterative mutual coupling auto-calibration methods for uniform linear arrays (ULAs) and uniform circular arrays (UCAs) were proposed in [2,3]. However, the high computational complexity required by iterations may be time consuming, and the convergence is not theoretical guaranteed, thus resulting in algorithmic instability. Recently, by taking advantage of the special structure of the mutual coupling matrix for ULAs, Ye and coauthors [4,5] demonstrated that DOAs can be accurately estimated without compensating for mutual coupling if a group of auxiliary sensors are added into ULAs. Dai and coauthors further extended the result to the spatial smoothing method [6], real-valued method [7] and l 1 -norm regularized method [12]. These methods are referred to as the auxiliary methods. Neither a calibration source nor iteration is required in the auxiliary methods. However, because of only applying a middle subarray, they reduce the working array aperture, which limits the applicability of these methods to cases with either few DOAs or many array elements. Some alternatives that can calibrate the imperfect manifold with the whole array in the presence of mutual coupling are proposed in [8][9][10][11].
Recently, the emerging technique of compressive sensing (CS) has given renewed interest to the problem of DOA estimation [12][13][14][15][16][17][18]. These methods exhibit many advantages, e.g., improved robustness to noise, limited number of snapshots and correlation of signals. Sparse Bayesian learning (SBL) is a popular and important technique for the sparse signal recovery in CS [19][20][21], which formulates the signal recovery problem from a Bayesian perspective, while the sparsity information is exploited by assuming a sparse prior for the signal of interest. It is worth noting that l 1 -norm regularized optimization is deemed to be a special example of SBL, if a maximum a posteriori (MAP) optimal estimate is adopted with a Laplace signal prior. Theoretical and empirical results show that SBL methods can achieve enhanced performance over l 1 -norm regularized optimization [20,21]. It is generally assumed that the measurement matrix in SBL is precisely known. Unfortunately, this assumption is invalid if perturbations on the measurement matrix (e.g., when the array manifold suffers from the aforementioned imperfections) are considered. Additive perturbations [22,23] and multiplicative perturbations [16] have been addressed in the recent literature.
In this paper, we will propose a modified Bayesian method for joint estimation of DOAs and the mutual coupling coefficients with ULAs, where the optimization problem is formulated in a SBL framework with a two-stage hierarchical prior, and then, we adopt an expectation-maximization (EM) algorithm that treats the signals of interest and the mutual coupling coefficients as hidden variables and parameters, respectively. To the best of our knowledge, there are very limited SBL methods that can resolve the DOAs with mutual coupling. This problem was first addressed in [24], where a unified SBL framework was proposed to address the DOA estimation problem with typical perturbations of mutual coupling, gain-phase uncertainty and sensor location error. The main differences between our method and the method in [24] are: • Our SBL model constitutes a two-stage hierarchical form, which results in the Student t prior in a hierarchical manner; while the model in [24] has only stationary priors. The advantage of the Student t prior is that it enforces the sparsity constraint more heavily [21,25]. • Our method provides a distinct Bayesian inference for the EM algorithm, which can update the mutual coupling coefficients more efficiently. • Our method uses an additional singular value decomposition (SVD) to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise.
These differences guarantee that our method can give a better joint estimation performance of DOAs and the mutual coupling coefficients.

DOA Estimation Model
Consider K narrow-band far-field sources impinging on an M -element ULA, where the distance between adjacent sensors is d. The K signals, s 1 (t), s 2 (t), . . . , s K (t), arrive at the array from distinct directions, θ 1 , θ 2 , . . . , θ K , with respect to the normal line of the array. The M × 1 array output vector y(t) is then given by: where y(t) = [y 1 (t), y 2 (t), . . . , y M (t)] T , s(t) = [s 1 (t), s 2 (t), . . . , s K (t)] T , A = [α(θ 1 ), α(θ 2 ), . . . , α(θ K )], α(θ k ) = [1, e jφ(θ k ) , . . . , e j(M −1)φ(θ k ) ] T , φ(θ k ) = (−2πd/λ) sin(θ k ) and n(t) = [n 1 (t), n 2 (t), . . . , n M (t)] T is an unknown noise vector. The matrix C ∈ C M ×M is the mutual coupling matrix (MCM) for ULAs. Many theoretical or experimental studies [26,27] have demonstrated that a banded symmetric Toeplitz MCM with m (≤ M −1 2 ) coefficients can provide a good approximation of real-world situations, i.e., C = toeplitz(c), where c = [c 0 , c 1 , c 2 , . . . , c m ] with 0 < |c 1 |, |c 2 |, . . . , |c m | < c 0 = 1 and toeplitz(c) denotes a symmetric Toeplitz matrix constructed by the vector c. Denoting Following the convention in [14], we use the singular value decomposition (SVD) to reduce the computational complex of the signal reconstruction process and the sensitivity to the measurement noise. Let the SVD of Y be written in the form of: where U s ∈ C M ×K and V s ∈ C T ×K are unitary matrices whose columns are the singular vectors corresponding to theK largest singular values, while the columns of U n ∈ C M ×(M −K) and V n ∈ C T ×(M −K) are the singular vectors corresponding to the rest M −K singular values. Note that the value ofK is generally determined by the singular values; especially if the number of sources K is exactly known,K is set to K.
Using Equation (2) and definingŶ = YV s , we obtain: whereŜ = SV s andN = NV s . In order to cast the problem of DOA estimation with unknown mutual coupling as a sparse representation problem, we let Ω denote the set of possible locations and letθ be a generic location parameter. Furthermore, let {θ i } Kθ i=1 denote a grid that covers Ω. If the grid is fine enough, such that the true DOAs lie on (or, practically, close to) the grid, we can use the following model forŶ:Ŷ andŜθ is a Kθ ×K complex matrix whose i-th row corresponds to the signal impinging on the array from a possible source atθ i . It is easy to verify that the i-th row is nonzero and equals the k-th row ofŜ if signal k comes fromθ i for some k and zero otherwise. As a result, the aim of this paper is to find a row-sparseŜθ (in other words, with a few nonzero rows) and a Toeplitz matrix C that minimize the following objective function: where · 2 stands for the Frobenius norm. Finding the sparse solution to the above problem through the l 1 -norm regularized optimization is intractable, as it is a non-convex optimization problem with unknown C, which cannot be solved in polynomial time. In the next section, we will propose an SBL method for the DOA estimation in the presence of unknown coupling. To this end, we have to preliminarily model the noisy and sparse signals as in [19].

Noise Model
Firstly, we address the noise model that is commonly used in SBL. Assume elements in the noise vector n defined in Equation (1) are independent and each has a complex Gaussian distribution with zero mean and a common variance σ 2 . Since the orthogonal invariance property of the Gaussian random matrix makes the distribution impervious to multiplication by orthogonal matrices, each element inN is approximately i.i.d. complex Gaussian with the same mean and variance [15,28], i.e., p(N i,j ) = CN (N i,j |0, β −1 ), where β = σ −2 denotes the noise precision andN i,j is the (i, j)-th element inN. Then, we have: whereŷ k andŝ k denote the k-th columns ofŶ andŜθ, respectively. Usually, the noise variance σ 2 is unknown, so is the noise precision β. Hence, we model β as a Gamma hyperprior: where we set a, b → 0 as in [19,21] so as to obtain a broad hyperprior. The reason why we choose the Gamma hyperprior is that it is a conjugate prior (in Bayesian probability theory, a prior p(θ) is said to be conjugate to p(x|θ) if the posterior distribution p(θ|x) has the same functional form as the prior) of the Gaussian distribution.

Sparse Signal Model
A widely-used sparseness prior forŜθ is the Laplace distribution; however, it is not readily accomplished with such a prior in SBL, because the Laplace prior is not conjugate to the Gaussian likelihood, and hence, it is unlikely to perform the associated Bayesian inference in closed form [21]. A typical SBL treatment ofŜθ, proposed in RVM [19], begins by assigning a non-stationary Gaussian prior distribution with a distinct inverse variance δ i for each row ofŜθ. Letting δ = [δ 1 , δ 2 , . . . , δ Kθ ] T and ∆ = diag(δ), we have: In order to make the Bayesian inference convenient and to obtain a two-stage hierarchical prior that favors most rows ofŜθ being zeros, the hyper-parameter δ i 's are further modeled as independent Gamma distributions [19,21], i.e., where c → 1 and d → 0. In this case, the integral ∞ 0 CN (ŝ k |0, ∆ −1 ) · p(δ; c, d)dδ corresponds to the Student t distribution [19], which can be evaluated analytically. Alternatively, a two-stage hierarchical method that results in Laplace priors was addressed in [21].

The Proposed Sparse Bayesian Learning Method
The associated learning problem becomes the search for the hyperparameters β and δ, as well as the parameter C. As p(Ŝθ, β, δ|Ŷ; C) cannot be explicitly calculated, a Type-II ML [19] (or evidence maximization) procedure is exploited to perform the Bayesian inference. In other words, the hyperparameters (β and δ) and parameter (C) are estimated by maximizing p(Ŷ|β, δ; C) or its logarithm L(β, δ; C) ln p(Ŷ|β, δ; C). However, explicitly finding the values of β, δ and C that maximize L(β, δ; C) is intractable, and here, we adopt an EM algorithm that treatsŜθ as a hidden variable. The principle behind the EM algorithm is to instead repeatedly construct a lower bound on L(β, δ; C) (E-step) and then to optimize that lower-bound (M-step). Specifically, • E-step: Compute: p(Ŝθ|Ŷ, β, δ; C) (11) and evaluate: ln p(Ŷ,Ŝθ, β, δ; C) where (·) (t) denotes the estimated value in the t-th iteration. • M-step: Find the hyperparameter updates and the parameter updates: In the rest of this section, we will discuss the two steps in detail.

E-Step
Using the Bayes rule, we can verify that the posterior distribution ofŜθ in Equation (11) is also a complex Gaussian [19]: where: On the other hand, by combining the stages of the hierarchical Bayesian model, p(Ŷ,Ŝθ, β, δ; C) can be rewritten as: Therefore, Equation (12) leads to:
1. For β, ignoring terms in the logarithm independent thereof, we just have to maximize: ln p(Ŷ|Ŝθ, β; C)p(β) which, through differentiation, gives the update for β (t+1) : 2. For δ, also ignoring terms in the logarithm independent thereof, we have: Setting the derivative of Equation (21), with respect to δ, to zero and solving for each δ i gives the updates: where For C, its estimate should maximize the following expected value: In order to calculate the derivative of Equation (23), we need the following lemma.
Lemma 1 (see [2]). Let C and c be defined as earlier, then for any vector x, we have: where the matrix T(x) ∈ C M ×(m+1) is the sum of the two M × (m + 1) matrices: With the assistance of Lemma 1, we are capable of rewriting the terms ŷ k − CAθµ (t) k 2 2 and tr CAθΣ (t) (CAθ) H in Equation (23) as: (27) and: tr denotes the i-th column of D (t) . Using Equations (27) and (28), we can calculate the derivative of ln p(Ŷ|Ŝθ, β; C) p (t) , with respect to c, as: Setting the derivative to zero gives: Remark 1. Note that [9][10][11] provided an alternative parameterization for the steering vector: [ν 1 , . . . , ν m , 1, η 1 , . . . , η m ] with ν k = 1 g(θ k ,c) ( m m =1−k c |m | e jm φ(θ k ) ) and η k = 1 g(θ k ,c) ( m−k m =−m c |m | e jm φ(θ k ) ). However, it is unlikely that Equation (31) can be applied to the M-step for C. The most difficult aspect for applying the alternative parameterization in Equation (31) is that it is dependent on the geometric progression in α(θ k ); while the terms Aθµ The EM algorithm proceeds by repeated application of Equations (20), (22) and (30), until a prescribed accuracy is achieved. Once the algorithm is convergent, we are able to obtain the calibrated C. As far as the estimated δ i 's are concerned, we observed that many of δ i tend to infinity in the EM process. Clearly, the rows corresponding to these infinity δ i 's are zero; while the rows corresponding to small δ i 's imply the true DOAs.

Simulation Results
In this section, we will present several simulation results to illustrate the performance of our proposed method. We will compare the proposed method to the original SBL method in [24], the iterative method in [3] and the auxiliary methods in [5,12].
Simulation 1 addresses the performance comparisons of the joint estimation of DOAs and mutual coupling coefficients. Consider a scenario where a ULA composed of M = 10 sensors is used to receive K = 2 uncorrelated signals coming from θ 1 = -19.7 • C and θ 2 = 10.1 • C. Note that the effect of mutual coupling is negligible between two sensors that are far enough away from each other, because the mutual coupling coefficient is inversely proportional to their distance. Hence, it is reasonable to approximate the mutual coupling effect with just a few nonzero coefficients. In the simulation, we assume that the number of mutual coupling coefficients is m = 2 with c 1 = 0.5 − 0.4i and c 2 = 0.3 + 0.1i. Figure 1 shows the root mean square error (RMSE) of DOA estimation versus input SNR computed via 200 Monte Carlo runs, where the number of snapshots is 100. For the ease of comparison, we also include the curve for CRB. As can be seen from the figure, our method outperforms the state-of-the-art methods. This is because: (1) compared to the method in [24], our method utilizes a hierarchical form of the Student t prior to enforce the sparsity of unknown signal more heavily; moreover, our method uses the SVD to reduce the sensitivity to the noise; (2) compared to the methods in [5,12], our method uses the whole array, rather than a subarray, to estimate DOAs and to compensate for the mutual coupling effect. Figure 2 shows the estimation bias and variance for each DOA. Compared to the l 1 -norm regularized method [12], our method can significantly reduce the DOA estimation bias, as well as the variance. To assess the performance of mutual coupling calibration, Figure 3 shows the relative RMSE of the estimation of the mutual coupling coefficients. This relative RMSE is defined as: where c ni is the estimate of c i at the n-th Monte Carlo run. As shown from the Figure 3, the mutual coupling coefficients can be estimated more accurately in our method, especially for low SNR signals. Hence, if all of the methods embed the mutual coupling coefficients within the same classical eigendecomposition algorithm, such as MUSIC and ESPRIT, to estimate the DOA, our method can achieve the best performance.
Simulation 2 is used to investigate the capability of resolving closely-spaced sources with limited snapshots (T = 50) for several methods. The simulation considers a scenario where a ULA composed of M = 13 sensors is used to receive K = 2 uncorrelated signals coming from θ 1 = -2.5 • C and θ 2 = 3.5 • C, and the number of mutual coupling coefficients is m = 2 with c 1 = 0.5 − 0.4i and c 2 = 0.3 + 0.1i. We say that the two signals are exactly resolved in a given run, if max k=1,2 {|θ k − θ k |} is smaller than |θ 1 − θ 2 |/2, whereθ k stands for the estimated DOA for the k-th signal. As can be seen from Figure 4, the resolution performance of our method outperforms others. The superior resolution performance of sparse representation is natural, which is consistent with the simulation results in [12].   Simulation 3 addresses the "blind angle" phenomenon, which may result in the substantially degraded performance of DOA estimation [4,5]. We consider a scenario where a ULA composed of M = 12 sensors is used to receive K = 4 uncorrelated signals coming from θ 1 = 10 • C, θ 2 = 20 • C, θ 3 = 30 • C and θ 4 = 40 • C, and the number of mutual coupling coefficients is m = 2 with c 1 = 0.6 + 0.5i and c 2 = 0.3844 − 0.3476i. It is easy to verify that the blind angle occurs at θ 4 = 40 • C. The number of snapshots is 30, and the SNR is 10 dB. Figure 5 illustrates that: (1) our method can estimate all of the true DOAs accurately, including the blind DOA θ 4 = 40 • C; (2) the l 1 -norm regularized method [12] fails in estimating the blind DOA from θ 4 = 40 • C, but succeeds in obtaining the DOAs from other directions; (3) the iterative method [3] misses two DOAs θ 3 = 30 • C and θ 4 = 40 • C. Obviously, our method has the best performance of "blind angle" suppression.

Conclusions
We have proposed a modified SBL method that approaches the problem of DOA estimation with unknown mutual coupling. Unlike the original SBL method in [24] that only uses stationary priors, our new method utilized a hierarchical form of the Student t prior to enforce the sparsity of unknown signal more heavily. To efficiently perform the Bayesian inference, we adopted a refined EM algorithm that treatsŜθ and C as a hidden variable and a parameter, respectively. It can update the mutual coupling coefficients more efficiently. Another difference is that our method used an additional SVD to reduce the computational complexity of the signal reconstruction process and the sensitivity to the measurement noise. Hence, the proposed method is expected to give a better estimation performance. Simulation results have verified its efficiency.