Abstract
This paper presents a novel off-grid direction of arrival (DOA) estimation method that achieves superior performance in the compressed sensing (CS) framework, in which the DOA estimation problem is cast as a sparse reconstruction. By minimizing the mixed k-l norm, the proposed method reconstructs the sparse source and estimates the grid error caused by mismatch. Because the joint problem is nonconvex, an iterative process that minimizes the mixed k-l norm alternately over the two sparse vectors is employed, so that the problem is solved by alternating convex optimization. To obtain better reconstruction properties, a block sparse source model is exploited for off-grid DOA estimation, and a block selection criterion is used to reduce the computational complexity. In addition, the proposed method is proved to be globally convergent. Simulation results show that the proposed method outperforms existing methods.
1. Introduction
Direction of arrival (DOA) estimation plays an important role in many fields, such as radar, medical imaging and array signal processing [1,2]. Over the past decades, many classical methods have been developed, among which multiple signal classification (MUSIC) [3] and estimation of signal parameters via rotational invariance techniques (ESPRIT) [4] are the most popular and offer high resolution for DOA estimation. However, these methods are very sensitive to the number of snapshots, the signal-to-noise ratio (SNR) and the correlation between sources: a small number of snapshots, low SNR and highly correlated or coherent sources can all degrade their performance severely. More recently, the emerging field of compressed sensing (CS) [5,6] has attracted enormous attention; it can reconstruct a sparse source from nonadaptive linear projection measurements obtained by a measurement matrix that satisfies the restricted isometry property (RIP) [7,8]. The support denotes the set of indices of the nonzero elements in the sparse source; once the support is determined, the sparse source can be reconstructed.
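As a small illustration of the support notion (the vector and values below are our own, not from the paper), the support is simply the index set of the nonzero entries of a sparse vector:

```python
# Minimal illustration: the "support" of a sparse vector is the set of
# indices of its nonzero entries.
import numpy as np

x = np.array([0.0, 1.3, 0.0, 0.0, -0.7, 0.0])  # a 2-sparse vector
support = np.flatnonzero(x)                     # -> array([1, 4])
print(support)
```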
Because sources are intrinsically sparse in the spatial domain, the DOA estimation problem can be regarded as a sparse reconstruction in the CS framework, and CS-based estimation methods achieve much better estimation performance than conventional methods. Malioutov et al. [9] first adopted the sparse signal reconstruction (SSR) perspective for DOA estimation and utilized the singular value decomposition (SVD) of the data matrix to propose the ℓ1-SVD method. In [10], CS-MUSIC was proposed by revisiting the link between CS and MUSIC; this method identifies part of the support using CS, after which the remaining part is estimated by a novel generalized MUSIC criterion. Xu et al. [11] utilized the Capon spectrum to design a weighted ℓ1-norm penalty in order to further enforce sparsity and better approximate the original ℓ0-norm for DOA estimation. Wei et al. [12] proposed a modified greedy block coordinate descent (R-GBCD) method and the corresponding version with weight refinement (R-GBCD+) to improve the estimation performance.
The key requirement for conventional CS-based estimation methods to perform well is that all true DOAs lie exactly on the grid. When the true DOAs are not on the grid set, the performance may degrade severely due to the grid error caused by mismatch, which is defined as the distance from the true direction to the nearest grid point. To address this issue, Zhu et al. [13] proposed the sparsity-cognizant total least-squares (STLS) approach for off-grid DOA estimation, in which the model perturbations are assumed to be Gaussian. In [14], Yang et al. introduced Bayesian theory into off-grid DOA estimation and proposed an off-grid sparse Bayesian inference method based on the singular value decomposition (OGSBI-SVD). Liang et al. [15] proposed an off-grid synchronous approach based on distributed compressed sensing to obtain a larger array aperture. Zhang et al. [16] formulated a novel model based on the sampling covariance matrix and solved the off-grid DOA estimation problem by a block sparse Bayesian method, even when the number of sources is unknown.
In this paper, a novel alternating block coordinate descent (ABCD) method is proposed for off-grid DOA estimation in CS. The proposed method solves a mixed k-l norm minimization problem to reconstruct the sparse source and estimate the grid error. Since joint estimation leads to a nonconvex optimization problem, the proposed method adopts an iterative process that minimizes the mixed k-l norm alternately over the two sparse vectors. Instead of a conventional sparse source, a block sparse source is exploited to achieve better reconstruction properties. The blocks are updated according to the proposed block selection criterion, which improves the efficiency of the method. In addition, we give a detailed proof of the global convergence of the proposed method. Simulation results illustrate its superior performance as compared with existing methods.
The rest of the paper is organized as follows. An off-grid DOA estimation model is formulated in Section 2 and Section 3 introduces the proposed method in detail. The global convergence of the proposed method is proved in Section 4 and Section 5 shows the performance of the proposed method. Conclusions are provided in Section 6.
2. Problem Formulation
Consider K far-field narrowband sources impinging on a uniform linear array (ULA) consisting of M omnidirectional sensors with inter-sensor spacing . Assume that each source is located at a different direction with power , . At time instant t, the signal received by the ULA can be expressed as
where and denote the steering vector and noise vector, respectively. Since the first sensor of the ULA is set as the origin of the array, the mth element of is written as with the wavelength of the source .
To incorporate CS theory into DOA estimation, the entire angular space is divided into a fine grid , where L denotes the number of grid points and denotes the transpose. Since the true directions are arbitrary in the angular space, for some are likely not to lie on the grid set. To reduce the grid error caused by mismatch, we formulate an off-grid model, which has a close relationship with the on-grid model. Let satisfy the uniform distribution so that the grid interval is . Without loss of generality, assuming and that is the nearest grid point to , the steering vector can be approximated as
where is the grid error and is the partial derivative of with respect to . Then, if are the nearest grid points to the true directions , respectively, we have
Then, by absorbing the approximation error into the noise, the received signal can be rewritten in the following sparse form
where is the array manifold matrix corresponding to all potential directions, which serves as an overcomplete dictionary in CS, and . In addition, the matrix is the noise matrix and is a diagonal matrix with as its diagonal elements. Since has only K nonzero elements among its L elements, it is a K-sparse vector, where K is referred to as the sparsity. Moreover, is also a K-sparse vector and has the same support as . It is evident that the on-grid sparse model is a special case obtained by simply setting in Equation (4). Since are jointly K-sparse, the matrix has K nonzero rows and is called row K-sparse.
To solve the off-grid DOA estimation problem, we need to jointly estimate the support of the sparse sources, , and the grid error from the matrix Y, which is given by
where is the noise matrix, is the measurement matrix with , and N is the number of nonadaptive linear projection measurements.
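To make the model above concrete, the following sketch (our own notation and values, assuming half-wavelength sensor spacing, the first sensor at the origin and a Gaussian measurement matrix) builds the overcomplete dictionary, its derivative counterpart and the compressed measurements of Equation (5):

```python
# Sketch of the off-grid model in Equations (4)-(5) under illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
M, L, N, K, T = 16, 61, 10, 2, 100          # sensors, grid points, measurements, sources, snapshots
grid = np.deg2rad(np.linspace(-90, 90, L))  # uniform angular grid (radians)

def steering(theta, M):
    """ULA steering vector for half-wavelength spacing (d/lambda = 1/2)."""
    m = np.arange(M)
    return np.exp(1j * np.pi * m * np.sin(theta))

def steering_deriv(theta, M):
    """Partial derivative of the steering vector with respect to theta."""
    m = np.arange(M)
    return 1j * np.pi * m * np.cos(theta) * steering(theta, M)

A = np.stack([steering(th, M) for th in grid], axis=1)        # M x L dictionary
B = np.stack([steering_deriv(th, M) for th in grid], axis=1)  # M x L derivative dictionary

# Row-K-sparse source matrix X and grid errors on the same support.
true_idx = rng.choice(L, K, replace=False)
X = np.zeros((L, T), dtype=complex)
X[true_idx] = rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))
beta = np.zeros(L)
beta[true_idx] = rng.uniform(-0.5, 0.5, K) * np.deg2rad(3.0)  # |error| < half a 3-degree interval

Phi = rng.standard_normal((N, M)) / np.sqrt(N)                # measurement matrix
noise = 0.01 * (rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T)))
Y = Phi @ ((A + B * beta) @ X) + noise                        # compressed measurements, Eq. (5)
```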
3. Off-Grid DOA Estimation
In this section, the alternating block coordinate descent (ABCD) method and a block selection criterion are elaborated in the CS scenario. The proposed method not only retains the advantages of the conventional BCD [17] strategy, but also uses an iterative process that minimizes the mixed k-l norm alternately over two sparse vectors. Because the minimization is solved alternately, each subproblem is a tractable convex problem and the global convergence of ABCD can be established, as proved in the next section.
By the central limit theorem, the components , , of are independent white Gaussian noise with zero mean and covariance , where denotes an identity matrix. Thus, the covariance matrix of , of size , is expressed as
where is the covariance matrix of the sparse source and denotes the conjugate transpose. Since the potential directions are in one-to-one correspondence with the powers and we are interested in estimating the DOAs, can be reduced to a diagonal matrix , where is a K-sparse vector. Then, by denoting and , Equation (6) can be further rewritten as
Due to the vector form of in Equation (7), the following measurement vector is given by
with and , where and denote the Kronecker product and the vectorization operation that stacks the columns of a matrix on top of one another, respectively. Moreover, is also a K-sparse vector with the same support as . For a conventional sparse source, the nonzero elements of the sparse vector or can appear anywhere in the vector; in this paper, however, we explicitly take a block sparse source into account, i.e., the nonzero elements of or tend to cluster in blocks. There are two main motivations for exploiting a block sparse source. First, as can be seen in [18], block sparse sources arise in many applications, such as unions of subspaces and multiband sources [19]. Second, a block sparse source has better reconstruction properties than a source that is sparse in the conventional sense, as proved in [20]. To exploit the block sparse source, denote and as the hth blocks of and with length d, respectively, so that we have
where . In a similar manner, the matrices and can be viewed as concatenations of block matrices and of size , i.e.,
where and . Obviously, the conventional sparse source is a special case of the block sparse source obtained by simply setting . If has at most K nonzero elements, the vector is referred to as block K-sparse, where denotes the Euclidean norm for vectors. In contrast to conventional sparsity, K is called the block-sparsity.
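As an illustration of this block structure (a sketch in our own notation; the block length d and tolerance are illustrative), the vectorized sample covariance of Equation (8) and the block-sparsity count can be computed as follows:

```python
# Vectorized sample covariance and block-sparsity count (illustrative sketch).
import numpy as np

def sample_covariance(Y):
    """Sample covariance of the columns (snapshots) of Y."""
    return Y @ Y.conj().T / Y.shape[1]

def blocks(x, d):
    """Split x into consecutive blocks of length d (len(x) must be divisible by d)."""
    return x.reshape(-1, d)

def block_sparsity(x, d, tol=1e-12):
    """Number of blocks with nonzero Euclidean norm (block K-sparsity)."""
    return int(np.sum(np.linalg.norm(blocks(x, d), axis=1) > tol))

# Example: vectorize a sample covariance (column stacking = vec(.)).
rng = np.random.default_rng(1)
Ymeas = rng.standard_normal((10, 100)) + 1j * rng.standard_normal((10, 100))
z = sample_covariance(Ymeas).flatten(order="F")               # measurement vector as in Eq. (8)
print(block_sparsity(np.array([0, 0, 1.0, 2.0, 0, 0]), d=2))  # -> 1
```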
Since has the same support as , and are jointly sparse. Thus, a mixed k-l norm minimization problem [21] is utilized to jointly reconstruct and . Given a block sparse vector , the mixed k-l norm of is defined as
Combining this with the definition in Equation (13), the mixed k-l norm minimization problem is formulated as
where . It is worth mentioning that an important class of methods for solving constrained optimization problems forms an auxiliary function. By introducing the Lagrange multiplier method, the Lagrange function associated with Equation (14) is given by
where and are regularization parameters. As can be seen from Equation (15), the minimization problem with respect to and is nonconvex, which makes DOA estimation intractable. Note, however, that if we fix one of the two sparse vectors, either or , the minimization problem in Equation (15) becomes convex with respect to the other sparse vector alone. Thus, and can be reconstructed by alternately solving the following two minimization problems, in each case with the other sparse vector fixed.
It is clear that Equations (16) and (17) have the same structure and can be solved in a similar manner. In the following, we only derive the optimal solution for Equation (16); the minimization problem in Equation (17) can be handled in the same way.
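The alternating strategy can be sketched schematically as follows; the two sub-solvers are left abstract, since each of Equations (16) and (17) is a convex problem that can be minimized by any suitable routine. The helper mixed_norm implements the block-norm sum of Equation (13); all names are ours, not the paper's.

```python
# Schematic skeleton of alternating convex minimization over the two sparse vectors.
import numpy as np

def mixed_norm(x, d):
    """Mixed norm of Eq. (13): the sum of the Euclidean norms of the length-d blocks."""
    return float(np.sum(np.linalg.norm(x.reshape(-1, d), axis=1)))

def alternate(solve_p, solve_delta, p0, delta0, tol=1e-6, max_iter=200):
    """Alternate between the two convex subproblems until the iterates stabilize."""
    p, delta = p0.copy(), delta0.copy()
    for _ in range(max_iter):
        p_new = solve_p(delta)          # minimize Eq. (16) with delta fixed
        delta_new = solve_delta(p_new)  # minimize Eq. (17) with p fixed
        if np.linalg.norm(p_new - p) + np.linalg.norm(delta_new - delta) < tol:
            return p_new, delta_new
        p, delta = p_new, delta_new
    return p, delta

print(mixed_norm(np.array([0.0, 0.0, 3.0, 4.0]), d=2))  # blocks [0,0] and [3,4] -> 5.0
```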
The objective function in Equation (16) can be expressed as
where and . Assume that is obtained at the kth iteration. Then, by the quadratic approximation of at the fixed point , can be approximated as
where is the partial derivative of with respect to . is set to , where denotes the ℓ1-norm for vectors, is a vector and , . Moreover, can be shown to be a block vector, i.e.,
where is the hth block of with the length d. For the convenience of analysis, denote , and we can also represent as a block vector, i.e.,
where is the hth block of with the length d. At the kth iteration, is updated by minimizing in Equation (19) so that the next iteration is given by
By further derivation, Equation (22) can be simplified to
where . As one may note, the objective function in Equation (23) is separable, and thus the H blocks of can be solved in parallel. Although solving the H blocks directly is difficult, the classical BCD method fortunately provides a critical insight: the solution to the hth block of is given by a soft-thresholding operator [22]
where denotes an indicator function. Instead of reconstructing directly, can be reconstructed block by block, and some of the H blocks may be zero vectors during the iterations. It can be seen from Equation (24) that we only need to compare and to judge whether is a zero vector: if is less than , must be a zero vector. Therefore, a considerable amount of computation can be avoided in the process of solving the H blocks. Similarly, by utilizing the soft-thresholding operator, the solution to the hth block of is expressed as
where and are the hth blocks of and with the length d, respectively, and . Following the similar derivation, we have
where is a vector and , . Based on Equations (24) and (25), and can be reconstructed alternately until the following criterion is satisfied
where and is a small tolerance.
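Equations (24) and (25) are blockwise soft-thresholding updates; the zero-block test described above can be sketched as follows (a minimal sketch in which the threshold is a generic parameter thresh standing in for the quantities given in Equations (24) and (25)):

```python
# Block (group) soft-thresholding: a block is shrunk toward zero and set
# exactly to the zero vector whenever its norm falls below the threshold.
import numpy as np

def block_soft_threshold(c, thresh):
    """Shrink the block c; return the zero vector if ||c||_2 <= thresh."""
    norm = np.linalg.norm(c)
    if norm <= thresh:
        return np.zeros_like(c)
    return (1.0 - thresh / norm) * c

print(block_soft_threshold(np.array([3.0, 4.0]), thresh=2.5))  # -> [1.5, 2.0]
```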
To reduce the computational complexity, a block selection criterion is given; this criterion plays an important role in the overall ABCD method. By utilizing the block selection criterion, we only update the block that is closest to , i.e.,
where . This means that and at the kth iteration are updated according to Equations (24) and (25), while the remaining blocks remain unchanged. The purpose of the block selection criterion is to avoid updating repetitive and unnecessary blocks and thus reduce the computational complexity. The major steps of reconstructing and by the ABCD method are given as follows (a schematic code sketch follows the step list):
Initialization: set and .
- (1) Calculate and in terms of Equations (21) and (26).
- (2) Based on the H blocks of and , calculate and , .
- (3) Calculate and , , in terms of Equations (24) and (25).
- (4) Choose the block index according to Equation (28); then and are updated as and , respectively.
- (5) If , stop the iteration. Otherwise, set and return to step (1).
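The following sketch condenses steps (1)-(5) for the p-subproblem under our own simplifying assumptions: a quadratic data-fit term ||z - D p||^2 standing in for the objective of Equation (16), a fixed step size, and a block selection rule that picks the block whose candidate update moves the most (one plausible reading of Equation (28)). It is illustrative, not the paper's exact update.

```python
# Illustrative block-selective update loop for the p-subproblem.
import numpy as np

def block_soft_threshold(c, thresh):
    """Shrink the block c; return the zero vector if ||c||_2 <= thresh."""
    norm = np.linalg.norm(c)
    return np.zeros_like(c) if norm <= thresh else (1.0 - thresh / norm) * c

def abcd_p_update(D, z, p, d, lam, step, tol=1e-6, max_iter=500):
    """Block-selective proximal update of the sparse vector p (length H*d)."""
    H = p.size // d
    for _ in range(max_iter):
        grad = -(D.conj().T @ (z - D @ p))           # gradient of the quadratic data-fit term
        c = (p - step * grad).reshape(H, d)          # candidate point, viewed blockwise
        cand = np.empty_like(c)
        for h in range(H):                           # block soft-threshold each candidate block
            cand[h] = block_soft_threshold(c[h], lam * step)
        # Block selection (step 4): update only the block whose candidate moves the most.
        moves = np.linalg.norm(cand - p.reshape(H, d), axis=1)
        h_star = int(np.argmax(moves))
        p_new = p.copy()
        p_new[h_star * d:(h_star + 1) * d] = cand[h_star]
        # Stopping rule in the spirit of step (5) / Eq. (27).
        if np.linalg.norm(p_new - p) <= tol * max(np.linalg.norm(p), 1.0):
            return p_new
        p = p_new
    return p
```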
4. Global Convergence of the ABCD Method
The global convergence of the ABCD method is proved in this section. By combining the existing convergence proof of the general BCD framework [23] with the ABCD method, a detailed proof of global convergence is given as follows.
First, we introduce the general BCD framework. Note that in Equation (18) is a continuous convex function and in Equation (18) is a non-smooth convex function. Given a fixed point , in Equation (18) can be approximated in the following form by exploiting the second-order Taylor expansion of in the general BCD framework.
where is the Hessian matrix of with respect to . It is clear that minimizing Equation (29) yields the next iterate . Assume that is an H-dimensional index set and that is a subset of consisting of at most K indices obtained by Equation (28). Hence, the next iterate is represented as
Then, to prove the global convergence, we give the modified Armijo rule and the modified Gauss-Southwell-r rule, which are prerequisites for guaranteeing global convergence. These two rules are described below.
- (1) Modified Armijo rule: If , and , the following inequality holds:
where and .
- (2) Modified Gauss-Southwell-r rule: In the iteration, the index set obtained by Equation (28) must satisfy
where .
It is well known that proving global convergence is in general complex and intractable. Fortunately, since Equations (16) and (17) are both convex and each has only one global minimizer, we only need to show that Equations (16) and (17) satisfy the modified Armijo rule and the modified Gauss-Southwell-r rule in order to prove the global convergence of the ABCD method. Furthermore, since Equation (16) has the same structure as Equation (17), it suffices to prove that Equation (16) satisfies the two rules; the derivation for Equation (17) proceeds in the same way.
To verify the first rule, note that the following inequality holds according to Equation (30).
By substituting Equation (33) into Equation (29), we have
Based on the fact that , Equation (34) can be further simplified to
Since , by dividing by and setting , we have
For , it can be deduced from (36) that
Subsequently, by exploiting the convexity of , we obtain
Since , it is natural to have
Following the fact in Equation (37), the following inequality holds
where . Combining Equations (39) and (40), we have
It is worth pointing out that Equation (41) is equal to Equation (31) with and . Thus, it has been proved that Equation (16) satisfies the modified Armijo rule.
Second, in order to prove that Equation (16) satisfies the modified Gauss-Southwell-r rule, consider the following form
where the cardinality of is the same as that of . Without loss of generality, consider the worst case, i.e., the cardinality of equals its maximum K, and assume , so that is expressed as . Thus, we have
Based on the following equation
it is easy to have
Since Equation (45) is equal to Equation (32) with , Equation (16) satisfies the modified Gauss-Southwell-r rule. Therefore, based on the above analysis, the global convergence of the ABCD method has been proved.
5. Simulation Results
This section presents several simulations to validate the superior performance of the proposed method as compared with R-GBCD+ and OGSBI-SVD. The angular space [−90°, 90°] is divided into a grid with interval τ = 3°, and the three methods are applied to off-grid targets. The block length, the number of ULA sensors and the spacing between adjacent sensors are set to , and , respectively. In the simulations, the root mean squared error (RMSE) and the success rate of DOA estimation are used as the two performance indexes. The RMSE is defined as
where is the number of Monte Carlo runs and is the estimate of in the ith Monte Carlo run; a success is declared if the estimation error is within a certain small Euclidean distance of the true directions.
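For reference, the two performance indexes can be computed as in the following sketch (our own implementation; the success threshold tol_deg is an assumed value):

```python
# RMSE over Monte Carlo runs and success-rate criterion (illustrative sketch).
import numpy as np

def rmse(estimates, true_doas):
    """Root mean squared error; estimates has shape (runs, K), true_doas has shape (K,)."""
    err = np.asarray(estimates) - np.asarray(true_doas)
    return float(np.sqrt(np.mean(err ** 2)))

def success_rate(estimates, true_doas, tol_deg=1.0):
    """Fraction of runs whose estimate lies within tol_deg (Euclidean) of the truth."""
    err = np.linalg.norm(np.asarray(estimates) - np.asarray(true_doas), axis=1)
    return float(np.mean(err <= tol_deg))
```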
In the first simulation, we compare the spatial spectra of R-GBCD+, OGSBI-SVD and ABCD. Consider four far-field narrowband sources impinging on the ULA from [−30.4°, −3.8°, 10.1°, 15.3°], where the latter two closely spaced sources are coherent and the remaining sources are independent of the others. Figure 1 presents the spatial spectra of R-GBCD+, OGSBI-SVD and ABCD with an SNR of 3 dB and 100 snapshots. For convenience of analysis, the spatial spectra are normalized. We can see from Figure 1 that all three methods are able to detect the four sources, but the spatial spectrum obtained by R-GBCD+ shows an obvious bias at the true directions and OGSBI-SVD yields a slight bias in the vicinity of the coherent sources. ABCD has a nearly ideal spatial spectrum, and thus it outperforms R-GBCD+ and OGSBI-SVD in terms of the spatial spectrum.
Figure 1.
Spatial spectra of R-GBCD+, OGSBI-SVD and ABCD, where the pink circles denote the true DOAs.
The success rates of the three methods vs. SNR and the number of snapshots are analyzed in the second simulation. The source model is the same as in the first simulation. Figure 2 shows the success rates vs. SNR with the number of snapshots fixed at 100, whereas Figure 3 depicts the success rates vs. the number of snapshots with the SNR fixed at 0 dB. Figure 2 and Figure 3 show that all three methods estimate correctly at high SNR or with a large number of snapshots, and that ABCD achieves a higher success rate than the other two methods at low SNR or with a small number of snapshots.
Figure 2.
Success rates vs. SNR with the fixed number of snapshots 100.
Figure 3.
Success rates vs. number of snapshots with the fixed SNR 0 dB.
Figure 4.
RMSE vs. SNR with the fixed number of snapshots 100.
Figure 5.
RMSE vs. number of snapshots with the fixed SNR 0 dB.
The third simulation considers the RMSE of the three methods vs. SNR and the number of snapshots. All conditions are the same as in the second simulation. Figure 4 and Figure 5 show the RMSE vs. SNR and vs. the number of snapshots, respectively. Figure 4 and Figure 5 indicate that ABCD has the best estimation accuracy among the three methods. Moreover, the accuracy of all three methods gradually improves as the SNR or the number of snapshots increases.
Finally, we test the resolving ability by examining the relation between the RMSE of the three methods and the angle separation of the sources, as illustrated in Figure 6. Consider two coherent sources impinging on the ULA from and , where the step of is 1°. The SNR is 0 dB and the number of snapshots is 100. As can be seen from Figure 6, the performance of R-GBCD+ and OGSBI-SVD degrades severely when the angle separation is 3°, while ABCD still provides a precise estimate as long as the angle separation is no less than 3°. The proposed ABCD is the most accurate method and has higher resolution than the other two methods.
Figure 6.
RMSE vs. angle separation with the fixed SNR 0 dB and number of snapshots 100 for coherent sources.
6. Conclusions
In this paper, a novel ABCD method has been proposed for off-grid DOA estimation in CS. The proposed method minimizes the mixed k-l norm to reconstruct the sparse source and estimate the grid error. In order to make the minimization problem tractable, an iterative process that minimizes the mixed k-l norm alternately over two sparse vectors is adopted. By reconstructing a block sparse source instead of a conventional sparse source, the proposed method achieves better reconstruction properties. A block selection criterion is given to update the blocks so as to reduce the computational complexity. The global convergence of the proposed method is also proved. Simulation results show that the proposed method has notable performance advantages over R-GBCD+ and OGSBI-SVD in terms of spatial spectrum, RMSE and success rate.
Acknowledgments
This work was supported by the Aviation Science Foundation of China (201401P6001) and the Fundamental Research Funds for the Central Universities (HEUCF150804).
Author Contributions
The main idea was proposed by Weijian Si and Xinggen Qu. Zhiyu Qu conceived the experiments and provided many valuable suggestions. Xinggen Qu wrote the manuscript and all the authors participated in amending the manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Reddy, V.V.; Mubeen, M.; Ng, B.P. Reduced-complexity super-resolution DOA estimation with unknown number of sources. IEEE Signal Process. Lett. 2015, 22, 772–776. [Google Scholar] [CrossRef]
- Xie, J.; Tao, H.H.; Rao, X.; Su, J. Passive localization of mixed far-field and near-field sources without estimating the number of sources. Sensors 2015, 15, 3834–3853. [Google Scholar] [CrossRef] [PubMed]
- Schmidt, R.O. Multiple Emitter Location and Signal Parameter Estimation. IEEE Trans. Antennas Propag. 1986, 34, 276–280. [Google Scholar] [CrossRef]
- Roy, R.; Kailath, T. ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 984–995. [Google Scholar] [CrossRef]
- Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306. [Google Scholar] [CrossRef]
- Candès, E.; Tao, T. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inf. Theory 2006, 52, 5406–5425. [Google Scholar] [CrossRef]
- Candès, E. The restricted isometry property and its implications for compressed sensing. Comptes Rend. Math. 2008, 346, 589–592. [Google Scholar] [CrossRef]
- Baraniuk, R.; Davenport, M.; DeVore, R.; Wakin, M. A simple proof of the restricted isometry property for random matrices. Constr. Approx. 2008, 28, 253–263. [Google Scholar] [CrossRef]
- Malioutov, D.; Cetin, M.; Willsky, A.S. A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 2005, 53, 3010–3022. [Google Scholar] [CrossRef]
- Kim, J.M.; Lee, O.K.; Ye, J.C. Compressive MUSIC: revisiting the link between compressive sensing and array signal processing. IEEE Trans. Inf. Theory 2012, 58, 278–301. [Google Scholar] [CrossRef]
- Xu, X.; Wei, X.H.; Ye, Z.F. DOA estimation based on sparse signal recovery utilizing weighted l1–norm penalty. IEEE Signal Process. Lett. 2012, 19, 155–158. [Google Scholar] [CrossRef]
- Wei, X.H.; Yuan, Y.B.; Ling, Q. DOA estimation using a greedy block coordinate descent algorithm. IEEE Trans. Signal Process. 2012, 60, 6382–6394. [Google Scholar]
- Zhu, H.; Leus, G.; Giannakis, G. Sparsity-cognizant total least-squares for perturbed compressive sampling. IEEE Trans. Signal Process. 2011, 59, 2002–2016. [Google Scholar] [CrossRef]
- Yang, Z.; Xie, L.; Zhang, C. Off-grid direction of arrival estimation using sparse Bayesian Inference. IEEE Trans. Signal Process. 2013, 61, 38–43. [Google Scholar] [CrossRef]
- Liang, Y.J.; Ying, R.D.; Lu, Z.Q.; Liu, P.L. Off-grid direction of arrival estimation based on joint spatial sparsity for distributed sparse linear arrays. Sensors 2014, 14, 21981–22000. [Google Scholar] [CrossRef] [PubMed]
- Zhang, Y.; Ye, Z.F.; Xu, X.; Hu, N. Off-grid DOA estimation using array covariance matrix and block-sparse Bayesian learning. Signal Process. 2014, 98, 197–201. [Google Scholar] [CrossRef]
- Qin, Z.W.; Scheinberg, K.; Goldfarb, D. Efficient block-coordinate descent algorithms for the Group Lasso. Math. Program. Comput. 2013, 5, 143–169. [Google Scholar] [CrossRef]
- Eldar, Y.C.; Mishali, M. Robust recovery of signals from a structured union of subspaces. IEEE Trans. Inf. Theory 2009, 55, 5302–5316. [Google Scholar] [CrossRef]
- Mishali, M.; Eldar, Y.C. Blind multi-band signal reconstruction: Compressed sensing for analog signals. IEEE Trans. Signal Process. 2009, 57, 993–1009. [Google Scholar] [CrossRef]
- Eldar, Y.C.; Kuppinger, P.; Bölcskei, H. Block-sparse signals: uncertainty relations and efficient recovery. IEEE Trans. Signal Process. 2010, 58, 3042–3054. [Google Scholar] [CrossRef]
- Zhao, G.H.; Shi, G.M.; Shen, F.F.; Luo, X.; Niu, Y. A sparse representation based DOA estimation algorithm with separable observation model. IEEE Antennas Wirel. Propag. Lett. 2015. [Google Scholar] [CrossRef]
- Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009, 2, 183–202. [Google Scholar] [CrossRef]
- Tseng, P.; Yun, S. A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 2009, 117, 387–423. [Google Scholar] [CrossRef]
© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).