A Fast Space-Time Adaptive Processing Algorithm Based on Sparse Bayesian Learning for Airborne Radar

Space-time adaptive processing (STAP) plays an essential role in clutter suppression and moving target detection in airborne radar systems. The main difficulty is that the independent and identically distributed (i.i.d.) training samples needed to guarantee performance may not be sufficient in heterogeneous clutter environments. Most current sparse recovery/representation (SR) techniques that reduce the training-sample requirement still suffer from high computational complexity. To remedy this problem, a fast group sparse Bayesian learning approach is proposed. Instead of employing all of the dictionary atoms, the proposed algorithm first identifies the support space of the data and then uses only that support space in the sparse Bayesian learning (SBL) iterations. Moreover, since the modified hierarchical model applies only to real-valued signals, the real and imaginary components of the complex-valued signals are treated as two independent real-valued variables. The efficiency of the proposed algorithm is demonstrated with both simulated and measured data.


Introduction
Space-time adaptive processing (STAP) has the capability to detect slow-moving targets that might otherwise be swallowed up by strong sidelobe clutter. The performance of a STAP filter depends on the accuracy of the clutter plus noise covariance matrix (CNCM) of the cell under test (CUT) [1]. According to the well-known Reed-Mallett-Brennan (RMB) rule [2], to achieve steady performance, the number of independent and identically distributed (i.i.d.) secondary range cells used to estimate the CNCM should be no less than twice the system degrees of freedom (DOF). However, it is hard to obtain enough i.i.d. samples in practice because of array geometry, nonhomogeneous clutter environments, and so on [1]. Improving the performance of STAP with limited samples therefore remains an active research topic.
Reduced-dimension (RD) [3] and reduced-rank (RR) [4] algorithms have been proposed to improve the performance of STAP with limited secondary samples. Even though they are easy to implement, their performance degrades when the number of secondary samples is smaller than twice the rank of the clutter [5].
Apart from RD and RR algorithms, some other algorithms [6][7][8][9][10] have been proposed and succeed in suppressing clutter in theory. However, they face some disadvantages in practice. Specifically, (1) direct data domain (DDD) algorithms [6], which obtain their samples from the CUT itself, suffer from aperture loss; and (2) knowledge-aided (KA) algorithms [7][8][9][10] require accurate prior knowledge in advance. Since accurate prior knowledge is expensive to obtain and changes with time, KA algorithms are not widely used in practical applications.
Over the past twenty years, sparse recovery/representation (SR) algorithms have received continuous attention in STAP [11][12][13][14][15] because they have enormous potential to estimate the clutter spectrum with limited samples. The sparse Bayesian learning (SBL)-type algorithms are robust and have drawn continuous attention over the last five years.
The SBL algorithm was proposed by Tipping in [16]. Wipf enhanced it for the single measurement vector (SMV) case in [17] and then the multiple measurement vectors (MMV) case in [18]. Due to the excellent performance of SBL algorithms, the fast marginal likelihood maximization (FMLM) algorithm [19], the Bayesian compressive sensing (BCS) algorithm [20], the multitask BCS (M-BCS) algorithm [21], and other Bayesian algorithms [22][23][24] were proposed in the following years. SBL was first introduced into STAP with MMV, defined as the M-SBL-STAP algorithm in [25], by Duan to estimate the CNCM, and Wang improved its convergence speed in [26]. SBL has also been used to solve some common problems in STAP in the last five years, such as off-grid effects in [27] and discrete interference in [28]. However, the Bayesian algorithms mentioned above have one or more of the following disadvantages: (a) high computational cost, (b) inaccurate estimation of the noise variance and (c) inapplicability to complex-valued signals.
In this paper, to improve the computational efficiency of the M-SBL-STAP algorithm, we extend the FMLM algorithm to the conventional M-SBL-STAP algorithm. The real and imaginary components of the complex-valued signals are treated as two independent real-valued variables in order to extend the modified hierarchical model. Simulation results with both simulated and Mountain-Top data demonstrate that the proposed algorithm can achieve high computational efficiency and good performance.
The main contributions of this paper can be listed as follows:
1. We extend the FMLM algorithm into M-SBL-STAP for the purpose of identifying the support space of the data, i.e., the atoms whose corresponding hyper-parameters are non-zero. After support-space identification, the dimensions of the effective problems are drastically reduced due to sparsity, which reduces computational complexity and alleviates memory requirements.
2. According to [18], an accurate value of the noise variance cannot be obtained. Instead of estimating the noise variance, we extend the modified hierarchical model, introduced in Section 4, to the SBL framework.
3. Although the hierarchical models apply to real-valued signals, they cannot be extended directly to complex-valued signals according to [29,30], and the data dealt with in STAP are all complex-valued. To solve this problem, we transform the sparse complex-valued signals into group-sparse real-valued signals.
Notation: In this article, scalar quantities are denoted with italic typeface. Boldface small letters are reserved for vectors, and boldface capital letters are reserved for matrices. The i-th entry of a vector x is denoted by x_i, while A_i and A_ij denote the i-th row and (i, j)-th element of a matrix A, respectively. The symbols (·)^T and (·)^H denote the matrix transpose and conjugate transpose, respectively. The symbols ‖·‖_1, ‖·‖_2 and ‖·‖_F are reserved for the ℓ1, ℓ2 and Frobenius norms, respectively. ‖·‖_0 is reserved for the ℓ0 pseudo-norm. ‖·‖_{2,0} stands for a mixed norm defined as the number of non-zero elements of the vector formed by the ℓ2 norm of each row vector. The symbol |·| is reserved for the determinant. The notations I, 0 and 1 represent the identity matrix, the all-zero matrix and the all-one matrix, respectively. The expectation of a random variable is denoted as E(·).

STAP Signal Model for Airborne Radar
Consider an airborne pulsed-Doppler radar system with a side-looking uniform linear array (ULA) consisting of N elements. A coherent burst of M pulses is transmitted at a constant pulse repetition frequency (PRF) in a coherent processing interval (CPI). The complex sample at the CUT from the n-th element and the m-th pulse is denoted as x_mn, and the data for the CUT can be written as an MN × 1 vector x, termed a space-time snapshot.
Radar systems need to ascertain whether a target is present in the CUT; thus, target detection is formulated as a binary hypothesis problem, where hypothesis H_0 represents target absence and hypothesis H_1 represents target presence:
H_0: x = x_c + n,  H_1: x = α_t S_t + x_c + n, where α_t is the target amplitude and S_t ∈ C^{MN×1} is the space-time steering vector of the target. The vector n ∈ C^{MN×1} is the thermal noise, which is uncorrelated both spatially and temporally. A general model for the space-time clutter is x_c = Σ_{k=1}^{N_c} α_k υ_st(f_d,k, f_s,k), with υ_st(f_d,k, f_s,k) = υ_t(f_d,k) ⊗ υ_s(f_s,k), where α_k denotes the random amplitude of the echo from the corresponding clutter patch; N_c denotes the number of clutter patches in a clutter ring; υ_st, υ_t and υ_s denote the space-time, temporal and spatial steering vectors, respectively; and f_d,k and f_s,k denote the corresponding normalized temporal and spatial frequencies, respectively. According to the linearly constrained minimum variance (LCMV) criterion, the optimal STAP weight vector is w = R_{c+n}^{-1} S_t / (S_t^H R_{c+n}^{-1} S_t), where the CNCM is expressed as R_{c+n} = E[(x_c + n)(x_c + n)^H].
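As a concrete illustration of the model above, the following NumPy sketch builds space-time steering vectors via the Kronecker structure υ_st = υ_t ⊗ υ_s, forms a toy CNCM from clutter patches on the ridge, and computes the LCMV-optimal weight. The array sizes, patch count and patch powers are illustrative assumptions, not the paper's simulation settings.

```python
import numpy as np

M, N = 4, 4                     # pulses, elements (small for illustration)
rng = np.random.default_rng(0)

def steering(fd, fs, M, N):
    """Space-time steering vector: Kronecker product of temporal and spatial."""
    vt = np.exp(2j * np.pi * fd * np.arange(M))   # temporal steering v_t
    vs = np.exp(2j * np.pi * fs * np.arange(N))   # spatial steering v_s
    return np.kron(vt, vs)                        # MN-dimensional v_st

# Toy CNCM: clutter patches on the ridge fd = fs (slope 1) plus unit noise
R = np.eye(M * N, dtype=complex)
for f in np.linspace(-0.5, 0.5, 90):
    v = steering(f, f, M, N)
    R += rng.uniform(0.5, 1.5) * np.outer(v, v.conj())

# LCMV-optimal weight: w = R^{-1} s_t / (s_t^H R^{-1} s_t)
s_t = steering(0.25, 0.1, M, N)
Ri_s = np.linalg.solve(R, s_t)
w = Ri_s / (s_t.conj() @ Ri_s)
print(abs(w.conj() @ s_t))                        # unit gain at the target: 1.0
```

Note that the LCMV constraint w^H S_t = 1 holds by construction, which is what the check at the end verifies.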

SR-STAP Model and Principle
Discretize the space-time plane uniformly into K = N_S N_d grid points, where N_S = ϕ_S N (ϕ_S > 1) denotes the number of normalized spatial frequency bins and N_d = ϕ_d M (ϕ_d > 1) denotes the number of normalized Doppler frequency bins. Each grid point corresponds to a space-time steering vector υ_k (k = 1, 2, · · · , K). The dictionary D ∈ C^{MN×K} used in STAP is the collection of the space-time steering vectors of all grid points.
The signal model in STAP can be cast in the form X = DA + n, where X = [x_1, x_2, · · · , x_L]; A ∈ C^{K×L} denotes the sparse coefficient matrix, whose non-zero elements indicate the presence of clutter on the space-time profile; L denotes the number of range gates in the data; and n ∈ C^{MN×L} denotes zero-mean Gaussian noise.
In sparse signal recovery algorithms with MMV, our goal is to represent the measurement X, which is contaminated by noise, as a linear combination of as few dictionary atoms as possible. Therefore, the objective function is expressed as min_A ‖A‖_{2,0} subject to ‖X − DA‖_F ≤ ε, where ε is the noise error allowance.
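The discretization and the row-sparse model can be sketched as follows; the grid sizes, support size and noise level are toy values chosen for illustration, not the paper's settings.

```python
import numpy as np

M, N = 4, 4                                   # toy pulses / elements
phi_s = phi_d = 2                             # resolution scales > 1
Ns, Nd = phi_s * N, phi_d * M                 # spatial / Doppler bins
K = Ns * Nd                                   # number of dictionary atoms

def steering(fd, fs, M, N):
    vt = np.exp(2j * np.pi * fd * np.arange(M))
    vs = np.exp(2j * np.pi * fs * np.arange(N))
    return np.kron(vt, vs)

fd_grid = np.linspace(-0.5, 0.5, Nd, endpoint=False)
fs_grid = np.linspace(-0.5, 0.5, Ns, endpoint=False)
# D collects the space-time steering vectors of all K grid points
D = np.column_stack([steering(fd, fs, M, N)
                     for fd in fd_grid for fs in fs_grid])

# Row-sparse A: the few non-zero rows are the active clutter atoms,
# with the same support shared by all L snapshots
L = 3
rng = np.random.default_rng(1)
A = np.zeros((K, L), dtype=complex)
support = rng.choice(K, size=5, replace=False)
A[support] = rng.standard_normal((5, L)) + 1j * rng.standard_normal((5, L))
X = D @ A + 0.01 * (rng.standard_normal((M * N, L))
                    + 1j * rng.standard_normal((M * N, L)))
print(D.shape, X.shape)                       # (16, 64) (16, 3)
```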

Sparse Bayesian Learning Formulation
In the SBL framework, {x_l}_{l=1}^L are assumed to be i.i.d. training snapshots, and the noise follows a white complex Gaussian distribution with unknown power σ². The likelihood function of the measurement vectors can be denoted as p(X | A, σ²) = Π_{l=1}^L CN(x_l | D a_l, σ² I). Since the training snapshots are i.i.d., each column of A is independent and shares the same covariance matrix. Assign each column of A a zero-mean Gaussian prior CN(a_l | 0, Γ), where 0 ∈ C^{K×1} is a zero vector and Γ = diag(γ); γ = {γ_1, γ_2, · · · , γ_K} are unknown variance parameters corresponding to the rows of the sparse coefficient matrix, so the prior of A is the product of the column priors. Combining the likelihood with the prior, the posterior density of A can be easily expressed, modulated by the hyper-parameters γ and σ². To find hyper-parameters γ and σ² accurate enough to estimate the CNCM, the most common method is the expectation-maximization (EM) algorithm, which is divided into two parts, an E-step and an M-step; t denotes the index of the current iteration. At the E-step, A is treated as hidden data, and its posterior density is described in terms of the hyper-parameters γ and σ²,
with covariance and mean given by Σ = (σ^{-2} D^H D + Γ^{-1})^{-1} and M = σ^{-2} Σ D^H X. At the M-step, we update the hyper-parameters γ and σ² by maximizing the expectation of the complete-data log-likelihood. The M-SBL-STAP algorithm is summarized in Algorithm 1.
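The E-step/M-step alternation described above can be sketched as follows. This is a generic M-SBL iteration written from the standard formulation (posterior moments, then row-variance re-estimation), with toy dimensions; it is not the paper's exact numbered update formulae, and the posterior is computed here via the MN × MN form for brevity.

```python
import numpy as np

rng = np.random.default_rng(2)
MN, K, L = 8, 32, 4                          # toy sizes, not the paper's settings
D = (rng.standard_normal((MN, K)) + 1j * rng.standard_normal((MN, K))) / np.sqrt(MN)
A_true = np.zeros((K, L), dtype=complex)
A_true[[3, 17]] = rng.standard_normal((2, L)) + 1j * rng.standard_normal((2, L))
sigma2 = 1e-3
X = D @ A_true + np.sqrt(sigma2 / 2) * (rng.standard_normal((MN, L))
                                        + 1j * rng.standard_normal((MN, L)))

gamma = np.ones(K)
for _ in range(200):
    G = np.diag(gamma)
    # E-step: posterior covariance Sigma and mean Mu of the columns of A
    Sx = sigma2 * np.eye(MN) + D @ G @ D.conj().T
    T = G @ D.conj().T @ np.linalg.inv(Sx)
    Mu = T @ X                               # K x L posterior mean
    Sigma = G - T @ D @ G                    # K x K posterior covariance
    # M-step: re-estimate the row variances gamma_k
    gamma = (np.abs(Mu) ** 2).mean(axis=1) + np.real(np.diag(Sigma))

print(np.argsort(gamma)[-2:])                # two largest gammas mark the support
```

In this small example the two rows of A_true that carry energy (indices 3 and 17) end up with variances far above the rest, which is the sparsifying mechanism the EM iteration relies on.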
Step 1: Input: the clutter data X and the dictionary D.
Step 2: Initialization: initialize the values of γ and σ².
Step 3: E-step: update the posterior covariance Σ and mean M via (17) and (18).
Step 4: M-step: update the hyper-parameters γ and σ², the latter via (23).
Step 5: Repeat step 3 and step 4 until a stopping criterion is satisfied.
Step 6: Estimate the CNCM by R̂ = Σ_{k=1}^{K} γ_k^* υ_k υ_k^H + βI, where β is a loading factor and the superscript * denotes the value at the stopping criterion.
Step 7: Compute the space-time adaptive weight w using (7).
Step 8: The output of the M-SBL-STAP algorithm is Z = w H X.
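A common SR-STAP reconstruction of the CNCM in Step 6 (the exact numbered formula was lost in extraction, so this is a hedged sketch of the standard form: recovered clutter powers times atom outer products, plus diagonal loading) is:

```python
import numpy as np

def estimate_cncm(gamma, V, beta):
    """R_hat = sum_k gamma_k v_k v_k^H + beta*I: clutter powers recovered by
    SBL on the atoms v_k, plus diagonal loading beta. This is an assumed
    reconstruction of the standard SR-STAP CNCM estimate."""
    MN = V.shape[0]
    R = beta * np.eye(MN, dtype=complex)
    for g, v in zip(gamma, V.T):
        if g > 0:                            # only atoms in the support contribute
            R += g * np.outer(v, v.conj())
    return R

rng = np.random.default_rng(7)
V = np.exp(2j * np.pi * rng.random((8, 5)))  # 5 toy unit-modulus atoms
gamma = np.array([0.0, 2.0, 0.0, 0.5, 0.0])  # recovered powers (two active)
R = estimate_cncm(gamma, V, beta=0.1)
print(np.allclose(R, R.conj().T))            # True: Hermitian by construction
```

The loading term βI guarantees that R̂ is invertible even when only a few atoms are active.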

Problem Statement of the M-SBL-STAP Algorithm
At the E-step of each iteration, updating the covariance Σ in (17) requires the inversion of a K × K matrix, which brings a high computational complexity on the order of O(K³). K is the number of atoms in the dictionary and is usually large. If the inversion of a K × K matrix can be avoided, the computational complexity can be reduced considerably.
At the M-step of each iteration, the noise variance σ² needs to be estimated. However, it has been demonstrated in [18] that the σ² estimated by (23) can be extremely inaccurate when the dictionary is structured and K ≥ MN. As seen in [21], σ² is a nuisance parameter in the iterative procedure, and an inappropriate value may degrade the convergence of the algorithm.
We extend a modified hierarchical model to the SBL framework, which aims at integrating σ² out instead of estimating it. However, the modified model applies to real-valued signals and cannot be directly extended to complex-valued signals.
In the next sections, we introduce the proposed algorithm, termed the M-FMLM-STAP algorithm, to solve the above problems.

Modified Hierarchical Model
In [16], the authors define the following hierarchical hyperpriors over Γ, as well as over the noise variance σ²,
where Gamma(α|a, b) = Γ(a)^{-1} b^a α^{a−1} e^{−bα} and the gamma function Γ(a) = ∫_0^∞ t^{a−1} e^{−t} dt. It has been demonstrated in [16] that an appropriate choice of the shape parameter a and the scale parameter b encourages the sparsity of the coefficient matrix.
In the STAP framework, the CNCM can be expressed as a weighted sum of the space-time steering vectors of the clutter patches plus the noise term. The STAP weight vector can then be translated into the form w = λ R^{-1} S_t, with R = µ R_{c+n}, where λ and µ are constants.
The above analysis shows that R is equivalent to R_{c+n} in terms of clutter-suppression performance. Thus, with σ² included in the prior of Γ, each component of A is defined as a zero-mean Gaussian distribution, and the modified hierarchical model follows (28) and (29).

Application of the Modified Hierarchical Model to Complex-Valued Signals
However, although both the original and the modified models apply to real-valued signals, they cannot be directly extended to complex-valued signals. To solve this problem, the real and imaginary components of the complex-valued signals are treated as two independent real-valued variables.
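The complex-to-real transform can be sketched as follows. One common stacking is shown here (block form); the paper's appendix indexes interleaved pairs (2g−1 : 2g), which is the same construction up to a row permutation. Note how each complex coefficient becomes a size-2 group, so sparsity in A becomes group sparsity in B.

```python
import numpy as np

rng = np.random.default_rng(3)
MN, K, L = 6, 10, 2
D = rng.standard_normal((MN, K)) + 1j * rng.standard_normal((MN, K))
A = rng.standard_normal((K, L)) + 1j * rng.standard_normal((K, L))
X = D @ A

# Real-valued equivalent: stack real and imaginary parts.
# Each complex coefficient a_k becomes the pair (Re a_k, Im a_k), so a
# sparse row of A becomes a sparse *group* of two rows of B.
Psi = np.block([[D.real, -D.imag],
                [D.imag,  D.real]])              # 2MN x 2K real dictionary
B = np.vstack([A.real, A.imag])                  # 2K x L real coefficients
Y = np.vstack([X.real, X.imag])                  # 2MN x L real measurements

print(np.allclose(Psi @ B, Y))                   # True: the two models agree
```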
In order to avoid calculating σ² in (33) and (34), we extend the modified hierarchical model to the SBL framework; the new real-valued likelihood function and the new prior of B follow directly from the transformation above. Combining the new likelihood and prior and integrating σ² out, the posterior density function of B is obtained. From (39), we can draw the conclusion that the modified formulation induces a heavy-tailed Student-t distribution on the residual noise, which improves the robustness of the algorithm.
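The σ²-marginalization above is an instance of the standard normal-gamma integral; written with generic symbols (an n-dimensional Gaussian residual and a Gamma prior on the precision β = σ^{−2}, not the paper's exact notation), it reads

```latex
\int_0^\infty \mathcal{N}\!\left(\mathbf{y}\,\middle|\,\boldsymbol{\mu},\,\beta^{-1}\mathbf{I}\right)
\mathrm{Gamma}(\beta\,|\,a,b)\,\mathrm{d}\beta
= \frac{b^{a}\,\Gamma\!\left(a+\tfrac{n}{2}\right)}{(2\pi)^{n/2}\,\Gamma(a)}
\left(b+\tfrac{1}{2}\,\lVert \mathbf{y}-\boldsymbol{\mu}\rVert_2^{2}\right)^{-\left(a+\frac{n}{2}\right)},
```

which is an unnormalized multivariate Student-t in y. Its polynomial, rather than exponential, tail decay is precisely what makes the marginalized model robust to residual noise, as stated above.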

Maximization of L(Γ) to Estimate α
A point estimate of the hyper-parameters α may be found via a type-II maximum likelihood procedure: mathematically, maximize the marginal likelihood function, or its logarithm (42), with respect to α. In what follows, the FMLM algorithm is applied to maximize (42) to estimate α. Unlike the EM algorithm, the FMLM algorithm reduces the computational complexity by identifying the support space of the data, i.e., the atoms in the dictionary whose corresponding values in α are non-zero.
For notational convenience, we omit the accent in what follows: γ and Γ represent the accented variables defined above.
Define Ω as the set of the non-zero values in Γ and ψ as the corresponding support space of the data, where J is the number of non-zeros in α.
At the beginning of the FMLM algorithm, we initialize α^(0) = 0, namely, Ω^(0) = ∅ and ψ^(0) = ∅. Then, the support space of the data is identified in each iteration until α converges to a steady point.
The matrix C can be decomposed into two parts: C = Σ_{k≠i} α_k Φ_k Φ_k^T + α_i Φ_i Φ_i^T ≜ C_{-i} + α_i Φ_i Φ_i^T. The first term C_{-i} contains all terms that are independent of α_i, and the second term contains all terms related to α_i.
Using the Woodbury matrix identity, the matrix-inversion and matrix-determinant lemmas express C^{-1} and |C| in terms of C_{-i}. Then, (42) can be expressed as L(α) = L_{-i} + L(α_i), where L_{-i} contains all terms that are independent of α_i and L(α_i) contains all terms related to α_i.
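The two lemmas invoked here can be checked numerically. The following sketch verifies both the Woodbury inversion identity and the matching determinant lemma on a small SPD matrix standing in for C_{-i} and a low-rank perturbation standing in for α_i Φ_i Φ_i^T (all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, r = 12, 3
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)                  # SPD "large" matrix (like C_-i)
U = rng.standard_normal((n, r))              # low-rank factor (like Phi_i)
Cs = np.eye(r)                               # small core (like alpha_i * I)

# Woodbury: (A + U Cs U^T)^-1 = A^-1 - A^-1 U (Cs^-1 + U^T A^-1 U)^-1 U^T A^-1
Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + U @ Cs @ U.T)
core = np.linalg.inv(np.linalg.inv(Cs) + U.T @ Ainv @ U)
rhs = Ainv - Ainv @ U @ core @ U.T @ Ainv
print(np.allclose(lhs, rhs))                 # True

# Determinant lemma: |A + U Cs U^T| = |A| |Cs| |Cs^-1 + U^T A^-1 U|
lhs_det = np.linalg.det(A + U @ Cs @ U.T)
rhs_det = (np.linalg.det(A) * np.linalg.det(Cs)
           * np.linalg.det(np.linalg.inv(Cs) + U.T @ Ainv @ U))
print(np.isclose(lhs_det, rhs_det))          # True
```

These identities are what let a rank-2 change to C (one group added, deleted or re-estimated) be absorbed without re-inverting the full matrix.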
Then, L(α_i) can be expressed in terms of the quantities Ŝ_i, q̂_{l,i} and ĝ_{l,i}. The eigenvalue decomposition (EVD) of Ŝ_i is Ŝ_i = Σ_j ŝ_{i,j} υ_{i,j} υ_{i,j}^T, where ŝ_{i,j} and υ_{i,j} denote the j-th eigenvalue and eigenvector of Ŝ_i, respectively.
Substituting the EVD into (55), the formula can then be expressed as (57). The next step involves maximizing (57) to estimate the hyper-parameters α_i, ∀i; among them, we choose the single candidate that maximizes (53).
Differentiating L(α_i) with respect to α_i and setting the result to zero is equivalent to solving the polynomial function (59), which has at most three roots, obtainable by standard root-finding algorithms. Considering that α_i ≥ 0, and that the corresponding Φ_i is not in the support space when α_i = 0, the candidate set consists of the positive roots of (59) together with zero, and the optimal α_i^* is the candidate that maximizes L(α_i). In the (t + 1)-th iteration, we choose the single α_j, 1 ≤ j ≤ K, that maximizes (53), while fixing the other {α_i | 1 ≤ i ≤ K, i ≠ j}. In order to avoid calculating L_{-i} for every i, we define the increment ΔL_i = L(α_i^*) − L(α_i^{(t)}); since L(Γ^{(t)}) is a fixed constant in the (t + 1)-th iteration, the index j can equivalently be expressed as j = arg max_i ΔL_i. The next step is to change the value of α_j while fixing the others: if α_j^{(t+1)} > 0 and α_j^{(t)} = 0, add Φ_j to ψ^{(t)} and α_j^{(t+1)} to Ω^{(t)}; if α_j^{(t+1)} > 0 and α_j^{(t)} > 0, re-estimate α_j in Ω^{(t)}; if α_j^{(t+1)} = 0 and α_j^{(t)} > 0, delete Φ_j from ψ^{(t)} and α_j^{(t)} from Ω^{(t)}; and if Γ^{(t+1)} = Γ^{(t)}, stop the iteration because the procedure has converged to a steady state. Finally, ψ^* is the exact support space of the data and Ω^* is the set of the non-zero values in the exact Γ^*, where the superscript * denotes the value at the stopping criterion.
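The add/delete/re-estimate loop above can be sketched with the classic fixed-noise FMLM of [19], written in the variance parameterization used here. In the paper's σ²-integrated variant the scalar maximization leads to the cubic polynomial (59); in the fixed-noise case below it has the closed-form maximizer (q² − s)/s², which is what this toy single-snapshot, real-valued sketch uses. The full inverse of C is recomputed each pass purely for clarity; the paper instead uses the fast updates of the next section.

```python
import numpy as np

rng = np.random.default_rng(4)
n, K = 20, 40
Phi = rng.standard_normal((n, K)) / np.sqrt(n)     # toy real-valued dictionary
w_true = np.zeros(K)
w_true[[5, 11]] = [3.0, -2.0]                      # two active atoms
sigma2 = 1e-4
y = Phi @ w_true + np.sqrt(sigma2) * rng.standard_normal(n)

gamma = np.zeros(K)                                # all atoms start outside psi
for _ in range(50):
    C = sigma2 * np.eye(n) + (Phi * gamma) @ Phi.T
    Cinv = np.linalg.inv(C)                        # for clarity only (slow)

    def ell(g, s, q):
        # log-marginal-likelihood contribution of one atom at variance g
        return 0.5 * (q * q * g / (1.0 + g * s) - np.log1p(g * s))

    best_j, best_gain, best_g = None, 1e-9, 0.0
    for i in range(K):
        phi = Phi[:, i]
        Si, Qi = phi @ Cinv @ phi, phi @ Cinv @ y
        si = Si / (1.0 - gamma[i] * Si)            # quantities w.r.t. C without atom i
        qi = Qi / (1.0 - gamma[i] * Si)
        cand = max((qi * qi - si) / (si * si), 0.0)  # closed-form maximizer
        gain = ell(cand, si, qi) - ell(gamma[i], si, qi)
        if gain > best_gain:
            best_j, best_gain, best_g = i, gain, cand
    if best_j is None:                             # no single update improves L
        break
    gamma[best_j] = best_g                         # add / delete / re-estimate

print(np.nonzero(gamma)[0])                        # recovered support
```

Each pass changes exactly one hyper-parameter, the one with the largest likelihood increment, so the marginal likelihood is nondecreasing, which is the convergence argument made later in the paper.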

Fast Computation of Ŝ_i, q̂_{l,i} and ĝ_{l,i}
In each iteration, updating Ŝ_i, q̂_{l,i} and ĝ_{l,i} requires the matrix inversion of every C_{-i}, ∀i. In order to reduce the computational complexity, a fast means of updating them is needed. Define S_i, q_{l,i} and g_l in terms of the full matrix C. If Ŝ_i, q̂_{l,i} and ĝ_{l,i} can be calculated from S_i, q_{l,i} and g_l, the computational complexity is greatly reduced, because only the matrix inversion of C is required.
Substituting (51) into (68), we arrive at (69). From (56) and (69), we can conclude that the eigen-matrix of S_i is the same as that of Ŝ_i. The EVD of S_i can thus be expressed as S_i = Σ_j s_{i,j} υ_{i,j} υ_{i,j}^T, where s_{i,j} denotes the j-th eigenvalue of S_i.
Thus, Ŝ_i can be obtained via the EVD of S_i, and q̂_{l,i} and ĝ_{l,i} can likewise be computed from q_{l,i} and g_l via the EVD of S_i, using (72), (74) and (76). Compared with (68), there is another approach that requires less computation to calculate S_i, q_{l,i} and g_l: the formula (43) is equivalent to (77), and with the matrix-inversion lemmas and (77), it is faster and more convenient to update S_i, q_{l,i} and g_l with formulae (78)-(80) than with (68),
where Σ herein represents the covariance of the non-zeros in Γ.
Additionally, the mean of the non-zeros in Γ herein is expressed by (80). The computational complexity is measured in terms of the number of multiplications. Assuming that the dimension of ψ is 2MN × r, r is not fixed in each iteration but satisfies r < MN < K because the measurements are sparse. When S_i is calculated with (68), the computational complexity is considerably higher than with (78), whose complexity is on the order of O(r³ + 16MN + 8MNr + 2r²). The latter operation is clearly faster and more convenient.
Since the measurements are sparse, the dimension of the matrix ψ^T ψ + {Ω}^{-1} in (79) is far smaller than that of Ψ^T Ψ + Γ^{-1} in (41). The proposed M-FMLM-STAP algorithm identifies the support space of the data, drastically reducing the effective problem dimensions due to sparsity. Therefore, the proposed algorithm has great potential for real-time operation.
The proposed M-FMLM-STAP algorithm is shown in Algorithm 2. The detailed update formulae are shown in the Appendix A.
The main symbols aforementioned are listed in Table 1.
Step 1: Input: the original data X, the original dictionary D and a = b = 0.
Step 2: Initialization: α^(0) = 0, Ω^(0) = ∅, ψ^(0) = ∅.
Step 3: while the stopping criterion is not satisfied do
Solve (59) for every i and select the index j via (65).
If α_j^{(t+1)} > 0 and α_j^{(t)} = 0, add Φ_j to ψ^(t) and α_j^{(t+1)} to Ω^(t); if α_j^{(t+1)} = 0 and α_j^{(t)} > 0, delete Φ_j from ψ^(t) and α_j^{(t)} from Ω^(t); otherwise, re-estimate α_j.
Update Σ, µ, S_i, q_{l,i} and g_l referring to Appendix A.
end while
Step 5: Estimate the CNCM R̂, where J is the number of non-zeros in α^* and β is a loading factor. The superscript * denotes the value at the stopping criterion.
Step 6: Compute the space-time adaptive weight w using (7).
Step 7: The output of M-FMLM-STAP is Z = w H X.

Complexity Analysis
Here, we compare the computational complexities of the proposed M-FMLM-STAP algorithm and the M-SBL-STAP algorithm for a single iteration. The computational complexities are measured in terms of the number of multiplications.
First, we analyze the computational complexity of the M-SBL-STAP algorithm. It is apparent that the main computational cost of the M-SBL-STAP algorithm is related to the formulae (17) and (18). Noting that Λ in (17) is a diagonal matrix, the computational complexity of (17) is on the order of O(K³ + K²MN). The computational complexity of (18) is on the order of O(K²MN + KMNL). Thus, the computational complexity of the M-SBL-STAP algorithm is on the order of O(K³ + 2K²MN + KMNL).
Second, we analyze the computational complexity of the M-FMLM-STAP algorithm. Noting that the dimension of S_i is 2 × 2, the computational cost of the EVD of S_i is small enough to be ignored; that is to say, the cost of calculating Ŝ_i, q̂_{l,i} and ĝ_{l,i} from S_i, q_{l,i} and g_l can be ignored. Meanwhile, the cost of solving the polynomial function (59) for all i, 1 ≤ i ≤ K, and of finding the index (65) is directly proportional to K, which can also be ignored. Thus, the main computational cost of the M-FMLM-STAP algorithm lies in updating S_i, q_{l,i}, g_l, Σ and µ. Assuming that the dimension of ψ is 2MN × r, r is not fixed over the iterations, but the condition r < MN < K is satisfied because the measurements are sparse. The computational complexity of the term Φ_i^T − Φ_i^T ψΣψ^T is on the order of O(8MNr + 2r²). Thus, for all i and l, the computational complexity of (78) is on the order of O((8MNr + 2r²)K + 8MNK + 4MNLK + (4MNr + r² + 4MN)L). The complexities of (79) and (80) are on the order of O(2MNr² + r + r³) and O(2MNr² + 2MNrL), respectively. Summing up and ignoring the low-order terms, the computational complexity of the M-FMLM-STAP algorithm for a single iteration is on the order of O(8MNrK + 4MNLK + 6MNrL). Figure 1 illustrates the complexity requirements of the two algorithms for a single iteration, plotting the computational complexity against the number of pulses M. Suppose that ϕ_S = ϕ_d = 4, N = 10 and L = 6. Although the value of r is unknown and unfixed in each iteration, we can suppose that it is twice the rank of the clutter in the current iteration. We can conclude that the computational complexity of the M-SBL-STAP algorithm is far greater than that of the M-FMLM-STAP algorithm for a single iteration.
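The comparison above can be reproduced with a few lines of arithmetic. The operation counts come directly from the orders stated in the text; taking r as twice the Brennan-rule clutter rank N + M − 1 is an illustrative assumption mirroring the figure's setup, not a measured quantity.

```python
# Multiplication-count estimates for one iteration, from the stated orders
def msbl_cost(M, N, K, L):
    return K**3 + 2 * K**2 * M * N + K * M * N * L

def mfmlm_cost(M, N, K, L, r):
    return 8 * M * N * r * K + 4 * M * N * L * K + 6 * M * N * r * L

N, L, phi = 10, 6, 4                        # settings used for Figure 1
for M in (4, 8, 12, 16):
    K = (phi * N) * (phi * M)               # K = Ns * Nd grid points
    r = 2 * (N + M - 1)                     # assumed: twice the Brennan rank
    ratio = msbl_cost(M, N, K, L) / mfmlm_cost(M, N, K, L, r)
    print(M, round(ratio, 1))               # speed-up factor per iteration
```

The ratio grows with M because the M-SBL cost is dominated by the K³ inversion while the M-FMLM cost is only linear in K.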

Convergence Analysis
The logarithm of the marginal likelihood function L(Γ) has an upper bound [31]. From (62), in the t-th iteration, L(α_i^*) is the maximal value of L(α_i); therefore, the condition L(Γ^(t+1)) ≥ L(Γ^(t)), ∀t, holds in each iteration by (65). Thus, the convergence of the proposed algorithm is guaranteed.

Performance Assessment
In this section, we first verify the performance of the loaded sample matrix inversion (LSMI) algorithm, the multiple orthogonal matching pursuit (M-OMP-STAP) algorithm [32], the M-SBL-STAP algorithm and the proposed M-FMLM-STAP algorithm with simulated data from a side-looking ULA, in both the ideal case and the array-error case. The first two algorithms are common approaches in STAP. We then assess the performance of the M-SBL-STAP algorithm and the proposed M-FMLM-STAP algorithm with the Mountain-Top data. We utilize the improvement factor (IF) and the signal-to-interference-plus-noise ratio (SINR) loss as two performance measures, where R is the exact CNCM of the CUT.
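The two performance measures can be computed as in the following sketch. The normalizations shown (IF as output SINR gain over the element/pulse-level input SNR with unit noise power, SINR loss relative to the clutter-free matched filter) are common textbook conventions, assumed here since the paper's numbered definitions were not reproduced; the scenario is a toy one.

```python
import numpy as np

def steering(fd, fs, M, N):
    vt = np.exp(2j * np.pi * fd * np.arange(M))
    vs = np.exp(2j * np.pi * fs * np.arange(N))
    return np.kron(vt, vs)

M, N = 4, 4
R = np.eye(M * N, dtype=complex)             # unit-power noise
for f in np.linspace(-0.5, 0.5, 90):         # strong clutter ridge, slope 1
    v = steering(f, f, M, N)
    R += 100.0 * np.outer(v, v.conj())

def if_and_sinr_loss(w, s, R):
    """IF: output SINR gain relative to the input SNR.
    SINR loss: achieved SINR relative to the clutter-free matched filter
    (noise power normalized to 1)."""
    sinr = np.abs(w.conj() @ s) ** 2 / np.real(w.conj() @ R @ w)
    IF = sinr * np.real(np.trace(R)) / np.real(s.conj() @ s)
    loss = sinr / np.real(s.conj() @ s)
    return IF, loss

s = steering(0.3, 0.1, M, N)                 # target off the clutter ridge
w = np.linalg.solve(R, s)                    # optimal weight (up to scaling)
IF, loss = if_and_sinr_loss(w, s, R)
print(IF > 1.0, 0.0 < loss <= 1.0)           # (True, True)
```

Because R is the noise covariance plus a positive semidefinite clutter term, the SINR loss of the optimal filter always lies in (0, 1], with values near 1 far from the clutter notch.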

Simulated Data
The parameters of a side-looking phased-array radar are listed in Table 2. The 600th range gate is chosen as the CUT. According to the parameters of the radar system, the slope of the clutter ridge is 1. We set the resolution scales ϕ_S = 5 and ϕ_d = 5. A total of six training samples are utilized for the two algorithms, and the parameters are a = b = 0. Figure 2 shows five clutter power spectra for the side-looking ULA in the ideal case: Figure 2a-e show the clutter power spectra estimated by the exact CNCM, the LSMI algorithm, the M-OMP-STAP algorithm, the M-SBL-STAP algorithm and the M-FMLM-STAP algorithm, respectively. We note that the clutter spectra obtained by the M-SBL-STAP and M-FMLM-STAP algorithms are closer to the exact spectrum, in both location and power, than those of the other algorithms in the ideal case. As shown in Figure 3, the IF curve achieved by the M-SBL-STAP algorithm is a little better than that achieved by the M-FMLM-STAP algorithm, because the latter algorithm is sensitive to noise fluctuation. However, the latter requires far less computation, and the loss in performance is offset by the reduction in computational complexity. On a computer equipped with two Intel(R) Xeon(R) Gold 6140 CPUs @ 2.30 GHz, the former takes more than 40 s and the latter less than 4 s on average. The larger the value of K, the larger the gap in computational cost between the M-SBL-STAP and M-FMLM-STAP algorithms. In Figure 4, we apply the four approaches in the non-ideal case, with Gaussian amplitude error (standard deviation 0.03) and random phase error (standard deviation 2°); the error is the same in all directions. Figure 4a-e show the clutter power spectra estimated by the exact CNCM, the LSMI algorithm, the M-OMP-STAP algorithm, the M-SBL-STAP algorithm and the M-FMLM-STAP algorithm, respectively.
We note that the clutter spectra obtained by the M-SBL-STAP and M-FMLM-STAP algorithms are much closer to the exact spectrum, in terms of both location and power, in the non-ideal case. As shown in Figure 5, the notch of the IF curve achieved by the M-SBL-STAP algorithm is close to that achieved by the M-FMLM-STAP algorithm, which means the performance of the two algorithms is about the same in the non-ideal case. However, the running time of the latter is much less than that of the former. In Figure 6, the average SINR loss, defined as the mean of the SINR loss values over the entire normalized Doppler frequency range, is presented versus the number of training samples for the M-SBL-STAP and M-FMLM-STAP algorithms. The average SINR loss curve achieved by the M-SBL-STAP algorithm is a little better than that achieved by the M-FMLM-STAP algorithm, within 0.5 dB, which can be ignored in practice. Meanwhile, the gap in computational cost between the two algorithms grows as the DOF of the radar system increases. All presented results are averaged over 100 independent Monte Carlo runs.

Measured Data
We apply the M-SBL-STAP and M-FMLM-STAP algorithms to the publicly available Mountain-Top data set, i.e., t38pre01v1 CPI6 [33]. In this data file, the numbers of array elements and coherent pulses are 14 and 16, respectively. There are 403 range cells, and the clutter is located around 245° relative to true North. The target is located in the 147th range cell at an azimuth angle of 275° relative to true North, with a normalized Doppler frequency of 0.25. The clutter Capon spectrum estimated using all 403 range cells is provided in Figure 7. Figure 8 depicts the STAP output power of the M-SBL-STAP and M-FMLM-STAP algorithms for range cells 130 to 165; 10 of the 20 range cells located next to the CUT are selected as training samples. The target in the 147th range cell is detected by both algorithms. Clearly, the detection performance of the proposed M-FMLM-STAP algorithm is close to that of the M-SBL-STAP algorithm, while its operation time is much less. Therefore, the proposed algorithm is applicable in practice.

Conclusions
In this paper, we have extended the real-valued multitask compressive sensing technique to suppress complex-valued heterogeneous clutter for airborne radar. Unlike the conventional M-SBL-STAP algorithm, the proposed algorithm avoids the inversion of a K × K matrix at each iteration, enabling real-time operation. We integrate the noise variance σ² out instead of estimating it, which overcomes the problem that an accurate value of σ² cannot be obtained. Because the modified hierarchical model is not suitable for complex-valued signals, the complex-valued multitask clutter data are translated into group-sparse real-valued signals. Finally, simulation results demonstrate the high computational efficiency and good performance of the proposed algorithm.

Appendix A
In the following, i represents the index of the i-th group in the whole dictionary matrix, and g represents the position index of that group in the used basis matrix ψ. α̃_i is the re-estimated hyper-parameter, Δα_i = α̃_i − α_i, Σ_g = Σ(:, 2g−1 : 2g), and Σ_gg = Σ(2g−1 : 2g, 2g−1 : 2g).
q̂_{l,i} = q_{l,i} + Φ_i^T ψ Σ_g G Σ_g^T ψ^T y_l (A11)
ĝ_l = g_l + y_l^T ψ Σ_g G Σ_g^T ψ^T y_l (A12)
where G = −((α̃_i α_i / Δα_i) I_2 + Σ_gg)^{-1} and µ_g = Σ_g^T ψ^T y denotes the g-th 2 × 1 sub-vector of µ.