A Two-Stage STAP Method Based on Fine Doppler Localization and Sparse Bayesian Learning in the Presence of Arbitrary Array Errors

In the presence of unknown array errors, sparse recovery based space-time adaptive processing (SR-STAP) methods usually directly use the ideal spatial steering vectors without array errors to construct the space-time dictionary; thus, the steering vector mismatch between the dictionary and clutter data will cause a severe performance degradation of SR-STAP methods. To solve this problem, in this paper, we propose a two-stage SR-STAP method for suppressing nonhomogeneous clutter in the presence of arbitrary array errors. In the first stage, utilizing the spatial-temporal coupling property of the ground clutter, a set of spatial steering vectors with array errors are well estimated by fine Doppler localization. In the second stage, firstly, in order to solve the model mismatch problem caused by array errors, we directly use these spatial steering vectors obtained in the first stage to construct the space-time dictionary, and then, the constructed dictionary and multiple measurement vectors sparse Bayesian learning (MSBL) algorithm are combined for space-time adaptive processing (STAP). The proposed SR-STAP method can exhibit superior clutter suppression performance and target detection performance in the presence of arbitrary array errors. Simulation results validate the effectiveness of the proposed method.


Introduction
Space-time adaptive processing (STAP) [1][2][3][4][5][6][7][8] is an effective approach for ground clutter suppression and low-velocity target detection in airborne radars. The performance of STAP mainly depends on the estimation accuracy of the clutter plus noise covariance matrix (CCM) of the cell under test (CUT). Generally, independent and identically distributed (IID) target-free training samples adjacent to the CUT are used to estimate the CCM. According to the Reed-Mallett-Brennan (RMB) rule [9], to achieve an output signal-to-clutter-plus-noise ratio (SCNR) loss within 3 dB, the number of IID training samples must be greater than twice the system degrees of freedom (DOFs). However, this requirement is hard to satisfy in practical heterogeneous and non-stationary clutter environments, which results in a severe performance degradation of STAP algorithms.
Several low-sample methods have been developed to relieve the performance degradation caused by limited training data, such as reduced-dimension (RD) algorithms [10][11][12][13][14][15][16], reduced-rank (RR) algorithms [17][18][19][20][21], parametric adaptive matched filter (PAMF) algorithms [22,23], direct data domain (D3) algorithms [24,25] and knowledge-aided (KA) algorithms [26][27][28][29][30]. Although these algorithms can reduce the number of required training samples, they suffer from some drawbacks. The sample requirement of RR and RD algorithms is still hard to satisfy, especially for large-scale systems; the model order of PAMF algorithms is hard to determine; the system DOFs are significantly reduced for D3 algorithms; and exact prior knowledge of the environment is hard to obtain for KA algorithms.
Recently, with the development of sparse recovery (SR) techniques, sparse recovery based space-time adaptive processing (SR-STAP) methods have been extensively researched [31][32][33][34][35][36][37][38][39]. By utilizing the intrinsic sparsity of the clutter in the angle-Doppler plane, SR-STAP recovers a signal with a sparse coefficient vector and a uniformly discretized space-time dictionary. Compared with traditional STAP methods, SR-STAP can exhibit better clutter suppression performance with very limited training sample support. Unfortunately, most SR algorithms, such as the iterative splitting and thresholding (IST) algorithm [40] and the homotopy algorithm [41], need fine tuning of one or more user parameters, which affect the recovery results significantly. Sparse Bayesian learning (SBL) was proposed by Tipping and has been introduced to sparse signal recovery by Wipf for the single measurement vector (SMV) case and the multiple measurement vector (MMV) case [42][43][44]. Unlike general SR algorithms, SBL is parameter-independent, which guarantees the robustness of the algorithm in changing environments. Moreover, SBL achieves favorable performance when the dictionary is highly coherent, and its global minimum is always the sparsest solution. Thus, owing to its robustness and excellent performance, sparse Bayesian learning based space-time adaptive processing (SBL-STAP) [45,46] has received much attention.
However, SR-STAP methods rely on the accuracy of the sparse model and suffer performance degradation due to the model mismatch caused by array errors. Thus, several SR-STAP methods which can handle unknown array errors have been developed. A sparsity-based STAP method considering array gain/phase error (AGPE-STAP) is proposed in [47], which combines a conventional sparsity-based STAP method and a conventional array gain/phase error calibration method. A sparsity-based STAP method with array gain/phase (GP) error self-calibration has been developed in [48], which iteratively solves an SR problem and a least squares (LS) calibration problem. In [49], utilizing the specific structure of the mutual coupling matrix, a mutual coupling calibration method is developed for SBL-STAP by rearranging the received snapshots with a designed spatial-temporal selection matrix. In [50], under the framework of the alternating direction method (ADM), a constraint is added to the array GP errors, and the conventional sparsity-based STAP problem is transformed into a joint optimization problem of the angle-Doppler profile and the array GP errors. However, these SR-STAP methods are based on explicit error models and are only suitable for gain/phase calibration or mutual coupling calibration. In practice, various array errors often occur together and some errors are difficult to model; in that case, these methods are no longer effective. Thus, an SR-STAP method which can handle arbitrary array errors is urgently needed.
In this paper, we propose a two-stage SR-STAP method for suppressing nonhomogeneous clutter in the presence of arbitrary array errors. In our two-stage SR-STAP method, the radar operates in two modes. In the first stage, the radar operates in measurement mode, which needs a long coherent processing interval (CPI) to ensure sufficient Doppler resolution. Then, utilizing the spatial-temporal coupling property of the ground clutter, a set of spatial steering vectors with array errors is estimated by fine Doppler localization. In the second stage, the radar operates in STAP mode. To solve the model mismatch problem caused by array errors, we directly use the spatial steering vectors obtained in the first stage to construct the space-time dictionary, and then the constructed dictionary and the MSBL algorithm are combined for STAP. The main contributions of this paper are summarized as follows.
(1) A new two-stage SR-STAP method is proposed. In the presence of arbitrary array errors, the proposed two-stage SR-STAP method can obtain superior clutter suppression performance and target detection performance with limited training samples.
(2) Steering vector estimation for arbitrary array errors is developed, which is based on the spatial-temporal coupling property of the ground clutter. Relative to many existing array calibration methods, which are only suitable for an individual perturbation, the developed method can handle arbitrary array errors. Since it is free of the array model and based on clutter data, the developed method also avoids the model mismatch problem and adapts to changing scenes.
(3) The developed method for estimating steering vectors is still effective when intrinsic clutter motion (ICM) is present: spatial steering vectors with array errors can also be well estimated when the pulse-to-pulse fluctuations are small.
The rest of the paper is organized as follows. In Section 2, the signal model with array errors is introduced. In Section 3, the proposed two-stage SR-STAP method is introduced. In Section 4, simulation results are provided to demonstrate the clutter suppression performance and target detection performance of the proposed method. Conclusions are drawn in Section 5.
Notation: Boldface lowercase letters denote vectors and boldface capital letters denote matrices. $(\cdot)^T$ and $(\cdot)^H$ represent the transpose and Hermitian transpose, respectively. $\mathbb{R}$, $\mathbb{R}^+$ and $\mathbb{C}$ represent the real field, nonnegative real field and complex field, respectively. The expectation operator is represented by $E(\cdot)$. The symbols $\otimes$ and $\odot$ denote the Kronecker product and Hadamard product, respectively. $\mathrm{diag}(\cdot)$ represents a diagonal matrix with the entries of the argument vector on the diagonal. The $NK \times NK$ identity matrix is denoted by $\mathbf{I}_{NK}$. $\|\cdot\|_F$ denotes the Frobenius norm. $\|\cdot\|_{2,0}$ denotes a mixed norm defined as the number of non-zero elements among the $\ell_2$-norms of the row vectors.

Signal Model
Consider an airborne pulsed Doppler radar system that employs a side-looking uniform linear array (ULA) consisting of $N$ elements with inter-element spacing $d$ and transmits $K$ coherent pulses in a CPI at a constant pulse repetition frequency (PRF) $f_{PRF}$. Ignoring the influence of range ambiguity, the clutter plus noise echoes collected over all pulses, all elements and all range bins can be represented by

$$\mathbf{Y} = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_L] \quad (1)$$

where $\mathbf{y}_l$ is the clutter plus noise data snapshot with array errors of the $l$th range bin, given by

$$\mathbf{y}_l = [y_{11l}, y_{21l}, \ldots, y_{N1l}, \ldots, y_{1Kl}, y_{2Kl}, \ldots, y_{NKl}]^T = \sum_{i=1}^{N_c} \varsigma_{c,i}\, \mathbf{b}(f_{di}) \otimes \mathbf{a}(f_{si}) + \mathbf{n}_l \quad (2)$$

where $N_c$ is the number of independent clutter sources, $\varsigma_{c,i}$ is the random complex amplitude, $\mathbf{b}(f_{di}) \otimes \mathbf{a}(f_{si})$ is the spatial-temporal steering vector with array errors of the $i$th clutter patch, $\mathbf{a}(f_{si}) = \mathbf{G}_{c,i}\bar{\mathbf{a}}(f_{si})$ is the spatial steering vector with array errors of the $i$th clutter patch, $\mathbf{G}_{c,i}$ is the array error matrix of the $i$th clutter patch, $\mathbf{n}_l$ is a Gaussian noise vector with zero mean and covariance matrix $\sigma^2\mathbf{I}$, $\sigma^2$ is the noise power, and $\mathbf{I}$ is the identity matrix. $\mathbf{b}(f_{di})$ and $\bar{\mathbf{a}}(f_{si})$ are the corresponding temporal steering vector and the ideal spatial steering vector without array errors,

$$\mathbf{b}(f_{di}) = [1, e^{j2\pi f_{di}}, \ldots, e^{j2\pi (K-1) f_{di}}]^T \quad (3)$$

$$\bar{\mathbf{a}}(f_{si}) = [1, e^{j2\pi f_{si}}, \ldots, e^{j2\pi (N-1) f_{si}}]^T \quad (4)$$

where $f_{si} = d\cos\phi_i/\lambda$ and $f_{di} = 2v_p\cos\phi_i/(\lambda f_{PRF})$ are the normalized spatial frequency and the normalized Doppler frequency of the $i$th clutter patch, $\phi_i$ is the corresponding spatial cone angle, $\lambda$ is the wavelength, and $v_p$ is the velocity of the platform.
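The snapshot model in (2) can be sketched numerically as follows. This is a minimal illustration, not the authors' simulation code: the function names are hypothetical, and a single error matrix `G` is applied to every clutter patch for simplicity, whereas the model allows a patch-dependent $\mathbf{G}_{c,i}$.

```python
import numpy as np

def temporal_steering(f_d, K):
    """b(f_d): K-pulse temporal steering vector for normalized Doppler f_d."""
    return np.exp(2j * np.pi * f_d * np.arange(K))

def spatial_steering(f_s, N, G=None):
    """Ideal N-element spatial steering vector; G (N x N) optionally adds array errors."""
    a_bar = np.exp(2j * np.pi * f_s * np.arange(N))
    return a_bar if G is None else G @ a_bar

def clutter_snapshot(f_d, f_s, amps, K, N, sigma2, rng, G=None):
    """y_l = sum_i amp_i * kron(b(f_di), a(f_si)) + n_l, element index running fastest."""
    y = np.zeros(N * K, dtype=complex)
    for fd, fs, amp in zip(f_d, f_s, amps):
        y += amp * np.kron(temporal_steering(fd, K), spatial_steering(fs, N, G))
    # Circular complex Gaussian noise with total variance sigma2 per entry.
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(N * K)
                                   + 1j * rng.standard_normal(N * K))
    return y + noise
```

The Kronecker ordering `kron(b, a)` matches the stacking in (2), where the $N$ element samples of the first pulse come first.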
In practice, the gain and delay of each sensor are usually not identical due to different aging rates or imperfect manufacturing, which causes gain and phase errors. These errors can be represented by an $N \times N$ complex diagonal matrix $\mathbf{G}_{gain}$ [4]

$$\mathbf{G}_{gain} = \mathrm{diag}([g_1, g_2, \cdots, g_N]) \quad (5)$$

where $g_n = (1 + \Delta\alpha_n)e^{j\Delta\varphi_n}$, and $\Delta\alpha_n$ and $\Delta\varphi_n$ are the gain error and phase error of the $n$th sensor, respectively. Because the sensors are closely spaced, the interactions among them generate mutual coupling. The mutual coupling can be represented by the following $N \times N$ symmetric Toeplitz matrix $\mathbf{G}_{mutual}$ [4]

$$\mathbf{G}_{mutual} = \mathrm{toeplitz}([c_1, c_2, \cdots, c_q, 0, \cdots, 0]) \quad (6)$$

where $c_i$ $(i = 1, 2, \ldots, q)$ denotes the complex mutual coupling coefficient and $q \ll N$, which means that the mutual coupling can be ignored when the separation between two elements is greater than $q$ inter-element spacings.
To obtain a given array geometry, each sensor must be placed at a precise location. However, in practice, this requirement is sometimes difficult to satisfy, which causes sensor location errors. The error vector of the $i$th clutter patch caused by sensor location errors can be written as [4]

$$\mathbf{e}_{pi} = [1, e^{j2\pi\Delta_1\cos\phi_i/\lambda}, \cdots, e^{j2\pi\Delta_{N-1}\cos\phi_i/\lambda}]^T \quad (7)$$

where $\Delta_0 = 0$ and $\Delta_j$ $(j = 1, 2, \ldots, N-1)$ are random numbers that represent the location error of each sensor. Let $\mathbf{G}_{others,i} \in \mathbb{C}^{N \times N}$ denote the other array perturbations encountered at the $i$th clutter patch; the array error matrix $\mathbf{G}_{c,i}$ is then formulated as in (8).
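The three error components in (5)-(7) can be generated as in the following sketch. The helper names are hypothetical, and the way the components are combined into $\mathbf{G}_{c,i}$ is deliberately left out, since it is defined by (8).

```python
import numpy as np

def gain_phase_matrix(d_alpha, d_phi):
    """G_gain = diag(g_n) with g_n = (1 + d_alpha_n) * exp(j * d_phi_n), phases in radians."""
    d_alpha = np.asarray(d_alpha, dtype=float)
    d_phi = np.asarray(d_phi, dtype=float)
    return np.diag((1.0 + d_alpha) * np.exp(1j * d_phi))

def mutual_coupling_matrix(c, N):
    """Symmetric (not Hermitian) Toeplitz matrix whose first column is [c_1..c_q, 0..0]."""
    col = np.zeros(N, dtype=complex)
    col[:len(c)] = c
    lag = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))  # |row - column|
    return col[lag]

def location_error_vector(delta, phi, lam):
    """e_p for cone angle phi (radians); delta = [0, Delta_1, ..., Delta_{N-1}] in meters."""
    return np.exp(2j * np.pi * np.asarray(delta, dtype=float) * np.cos(phi) / lam)
```

For example, `mutual_coupling_matrix([1, 0.1250 + 0.2165j, 0.0866 - 0.0500j], 8)` builds the coupling matrix for the coefficients used later in the mutual coupling experiment, with an array size of 8 chosen here only for illustration.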

Proposed Method
In this section, we propose a two-stage SR-STAP method for suppressing nonhomogeneous clutter in the presence of arbitrary array errors.

Steering Vector Estimation
In the first stage, the radar operates in measurement mode. Let the number of pulses in a CPI be $K_1$; to ensure sufficient Doppler resolution, $K_1$ should be large. From (2), the clutter plus noise data snapshot of the $l$th range bin with $K_1$ pulses is given by (9).
Without consideration of the ICM, the spatial frequency $f_{si}$ and the Doppler frequency $f_{di}$ are related by

$$f_{di} = \frac{2v_p}{d f_{PRF}} f_{si} \quad (10)$$

This means that clutter patches can be localized either by a spatial filter or by a Doppler filter. Generally, the number of pulses in a CPI is larger than the number of elements in the array, so it is easier to create a narrow Doppler filter. Moreover, ultra-low sidelobes are more readily achieved with a Doppler filter than with a spatial filter. Thus, clutter localization is preferably realized by fine Doppler localization. The $k$th Doppler filter output is given by

$$\mathbf{X}_{kl} = \mathbf{T}_k^H \mathbf{y}_l = \sum_{i=1}^{N_c} \varsigma_{c,i}\, \mathrm{pbr}(f_{dk} - f_{di})\, \mathbf{a}(f_{si}) + \tilde{\mathbf{n}}_l \quad (11)$$

where $\mathbf{T}_k = \mathbf{f}_k \otimes \mathbf{I}_N$ is the transformation matrix, $\mathbf{f}_k = \mathbf{t}_f \odot \mathbf{u}_k$ is the coefficient vector of the $k$th Doppler filter, $\mathbf{t}_f$ is an ultra-low sidelobe taper, $\mathbf{u}_k = [1, \exp(j2\pi k/K_1), \cdots, \exp(j2\pi k(K_1 - 1)/K_1)]^T$, $\tilde{\mathbf{n}}_l = \mathbf{T}_k^H \mathbf{n}_l$ is the additive Gaussian noise, and

$$\mathrm{pbr}(f_{dk} - f_{di}) = \mathbf{f}_k^H \mathbf{b}(f_{di}) \quad (12)$$

is the low-pass filter response with the passband

$$f_{dk} - D_w/2 \le f_{di} \le f_{dk} + D_w/2 \quad (13)$$

where $f_{dk}$ is the center frequency of the $k$th Doppler filter and $D_w$ is the Doppler frequency passband width (DFPW). Then, (11) can be recast as (14). According to the Doppler frequency passband of the $k$th Doppler filter, we can get the associated spatial frequency passband of the clutter component by substituting (10) into (13):

$$\frac{d f_{PRF}}{2v_p}\left(f_{dk} - \frac{D_w}{2}\right) \le f_{si} \le \frac{d f_{PRF}}{2v_p}\left(f_{dk} + \frac{D_w}{2}\right) \quad (15)$$

The width of the spatial frequency passband is

$$\Delta = \frac{d f_{PRF}}{2v_p} D_w \quad (16)$$

For a Doppler filter with ultra-low sidelobes, the gain of the stopband is negligible relative to that of the passband. Neglecting the components in the stopband of the Doppler filter, (14) can be written as

$$\mathbf{X}_{kl} \approx \sum_{i=N_{pk}}^{N_{qk}} \xi_{c,i}\, \mathbf{a}(f_{si}) + \tilde{\mathbf{n}}_l \quad (17)$$

where $\xi_{c,i} = \varsigma_{c,i}\,\mathrm{pbr}(f_{dk} - f_{di})$, and $N_{pk}$ and $N_{qk}$ are the bounding indexes of the spatial frequency passband corresponding to the $k$th Doppler filter. Similar to Doppler beam sharpening (DBS) radar, we define a sharpening ratio as

$$\kappa = \frac{\theta_{mainlobe}}{\Delta} \quad (18)$$

where $\theta_{mainlobe}$ is the mainlobe beamwidth.
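The tapered Doppler filtering step can be sketched as below. The function name is hypothetical, and `np.kaiser` is used as a stand-in low-sidelobe taper so that the example needs only NumPy; the 80 dB Chebyshev taper discussed in the text could be obtained with `scipy.signal.windows.chebwin(K1, at=80)` if SciPy is available.

```python
import numpy as np

def doppler_filter_output(y, k, N, K1, taper):
    """Output of the kth tapered Doppler filter, i.e. (f_k kron I_N)^H y.

    y has length N*K1 with the element index running fastest, so reshaping
    to (K1, N) puts one pulse per row and avoids forming the Kronecker matrix.
    """
    u_k = np.exp(2j * np.pi * k * np.arange(K1) / K1)
    f_k = taper * u_k                      # Hadamard product t_f * u_k
    return f_k.conj() @ y.reshape(K1, N)   # length-N spatial snapshot

# Small demonstration with one clutter patch centered on Doppler bin k.
N, K1, k = 8, 64, 5
taper = np.kaiser(K1, 12.0)                # stand-in for the Chebyshev taper
a = np.exp(2j * np.pi * 0.08 * np.arange(N))
b = np.exp(2j * np.pi * (k / K1) * np.arange(K1))
X = doppler_filter_output(np.kron(b, a), k, N, K1, taper)
```

In this noise-free demonstration the output `X` is a scaled copy of the spatial steering vector `a`, which is the property the first stage relies on.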
For an untapered Doppler filter, the distance between its two first nulls is $2/K_1$, which is larger than its DFPW. Therefore, when $K_1$ is large, a narrow Doppler filter with a small DFPW can be obtained. However, its sidelobe level is high (the first sidelobe is at −13.4 dB); thus, the sidelobe gain of an untapered Doppler filter cannot be ignored. A heavily tapered Doppler filter can achieve ultra-low sidelobes, but at the cost of a broadened mainlobe, which results in a larger DFPW. We define the DFPW as the width of the Doppler frequency range over which the power gain of a Doppler filter drops by less than 40 dB. For a Doppler filter with ultra-low sidelobes, the power gain is negligible outside this range. It is difficult to obtain the analytical DFPW of a tapered Doppler filter, but a reasonable value can be given based on experience. For example, when a Chebyshev taper with a sidelobe level of −80 dB is used, experience shows that $5/K_1$ is a reasonable DFPW value, i.e., $D_w = 5/K_1$. Substituting $D_w = 5/K_1$ into (16), we get the spatial frequency passband width corresponding to the DFPW of a Doppler filter with an 80 dB Chebyshev taper:

$$\Delta = \frac{5 d f_{PRF}}{2 v_p K_1} \quad (19)$$
Substituting (19) into (18) yields

$$\kappa = \frac{2 v_p K_1 \theta_{mainlobe}}{5 d f_{PRF}} \quad (20)$$

The correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta/2)$ is defined in (21). As the number of pulses $K_1$ increases, the sharpening ratio $\kappa$ becomes larger and $\Delta$ becomes smaller; as a result, the correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta/2)$ becomes larger. In Figure 1, the dotted line with symbol * shows the correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta/2)$ versus the sharpening ratio $\kappa$, and the dotted line with symbol • denotes a threshold value. From Figure 1, we can observe that when the sharpening ratio $\kappa$ is larger than 6.4 (i.e., the number of pulses in a CPI is larger than 256), the correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta/2)$ is greater than 0.99. In this case, $\mathbf{a}(f_{si} + \Delta/2)$ can be well approximated by $\mathbf{a}(f_{si})$.
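The dependence of the correlation coefficient on $\Delta$ can be checked with a short sketch. Two assumptions are made here that are not taken from the paper: the correlation coefficient is taken to be the normalized inner product of the two steering vectors, and ideal ULA steering vectors are used in place of the perturbed ones.

```python
import numpy as np

def steering_correlation(f_s, delta, N):
    """|a1^H a2| / (||a1|| ||a2||) for ideal ULA steering vectors at f_s and f_s + delta/2.

    Assumed definition of the correlation coefficient (normalized inner product).
    """
    n = np.arange(N)
    a1 = np.exp(2j * np.pi * f_s * n)
    a2 = np.exp(2j * np.pi * (f_s + delta / 2) * n)
    return np.abs(np.vdot(a1, a2)) / (np.linalg.norm(a1) * np.linalg.norm(a2))

# Hypothetical setting: N = 8 elements, mainlobe width taken as 1/N, kappa = 6.4.
N = 8
delta = (1.0 / N) / 6.4
rho = steering_correlation(0.1, delta, N)
```

Under these assumptions `rho` evaluates to roughly 0.990, which is in line with the 0.99 threshold quoted for $\kappa = 6.4$; the value does not depend on `f_s` for ideal steering vectors.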
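When the sharpening ratio $\kappa$ is large, $\mathbf{a}(f_{sk} \pm \Delta/2) \approx \mathbf{a}(f_{sk})$, and (17) can be simplified as

$$\mathbf{X}_{kl} \approx \mu_{kl}\, \mathbf{a}(f_{sk}) + \tilde{\mathbf{n}}_l \quad (22)$$

where $\mu_{kl} = \sum_{i=N_{pk}}^{N_{qk}} \xi_{c,i}$, $f_{sk}$ is the normalized spatial frequency corresponding to the center Doppler frequency of the $k$th Doppler filter, and $\mathbf{a}(f_{sk})$ is the corresponding spatial steering vector.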
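To alleviate the influence of $\tilde{\mathbf{n}}_l$, multiple range gates are utilized to estimate $\mathbf{a}(f_{sk})$. According to (22), the covariance matrix of $\mathbf{X}_{kl}$ can be written as

$$\mathbf{R}_{kl} = E[\mathbf{X}_{kl}\mathbf{X}_{kl}^H] = \gamma_{kl}^2\, \mathbf{a}(f_{sk})\mathbf{a}^H(f_{sk}) + \tilde{\sigma}^2 \mathbf{I}_N \quad (23)$$

where $\gamma_{kl}^2 = E[|\mu_{kl}|^2]$, $E[\tilde{\mathbf{n}}_l\tilde{\mathbf{n}}_l^H] = \tilde{\sigma}^2\mathbf{I}_N$, and $\tilde{\sigma}^2$ is the additive noise power. In practice, $\mathbf{R}_{kl}$ is unknown and is replaced by the sample covariance matrix $\hat{\mathbf{R}}_{kl}$ given in (24). In the high clutter-to-noise ratio (CNR) case, $\gamma_{kl}^2/\tilde{\sigma}^2 \gg 1$ and $\hat{\mathbf{R}}_{kl}$ has a single dominant eigenvalue. Thus, we can perform singular value decomposition (SVD) on $\hat{\mathbf{R}}_{kl}$, and $\mathbf{a}(f_{sk})$ is estimated by the eigenvector associated with the largest eigenvalue.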
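The principal-eigenvector estimate can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the function name is hypothetical, `np.linalg.eigh` replaces the SVD (equivalent for a Hermitian sample covariance matrix), and the estimate is scaled so that its first entry equals 1, which presumes the first element is the reference sensor.

```python
import numpy as np

def estimate_steering_vector(X):
    """Estimate a(f_sk) from the N x L matrix X of Doppler-filter outputs over L range gates."""
    L = X.shape[1]
    R_hat = X @ X.conj().T / L                  # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R_hat)    # eigenvalues in ascending order
    a_hat = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
    return a_hat / a_hat[0]                     # assumed normalization: first entry = 1

# Synthetic high-CNR check with made-up gain/phase errors.
rng = np.random.default_rng(0)
N, L = 8, 100
g = (1 + 0.1 * rng.uniform(-1, 1, N)) * np.exp(1j * np.deg2rad(10) * rng.uniform(-1, 1, N))
g[0] = 1.0
a_true = g * np.exp(2j * np.pi * 0.12 * np.arange(N))
mu = 10 * (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
noise = (rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))) / np.sqrt(2)
a_hat = estimate_steering_vector(np.outer(a_true, mu) + noise)
```

The eigenvector is only defined up to a complex scale factor, so some normalization of this kind is needed before the estimate is compared with an ideal steering vector.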
When ICM is present, the pulse-to-pulse fluctuations broaden the Doppler spectrum of a single clutter return and the relation in (10) no longer holds. In this case, the Doppler frequency range of a single clutter echo is given by (25), where $D_b = 2\sigma_v/(\lambda f_{PRF})$ is the width of the Doppler spectrum and $\sigma_v$ is the velocity standard deviation caused by ICM [1]. By substituting (10) into (25), we get the associated spatial frequency range in (26), whose width is

$$\Delta_b = \frac{d f_{PRF}}{2 v_p} D_b \quad (27)$$

When the Doppler spectrum broadening caused by ICM is much smaller than the DFPW of the heavily tapered Doppler filter, i.e., $D_b \ll D_w$, the inequality $\Delta_b \ll \Delta$ holds. As a result, the correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta_b/2)$ is approximately 1 when the sharpening ratio $\kappa$ is large, and in this case the Doppler frequency range given in (25) can be regarded as corresponding to a single spatial frequency $f_{si}$. Thus, when the velocity standard deviation $\sigma_v$ caused by ICM is small, the broadening of the Doppler spectrum has little effect on the estimation of the spatial steering vectors, and the proposed estimation method still works well.
In practice, clutter from the sidelobes and the nulls of the array pattern is much weaker than that from the mainlobe. Besides, the reflection coefficients are small in some unknown clutter areas. In addition, the adjacent range gates used to estimate $\mathbf{R}_{kl}$ may include strong moving targets and other unwanted components. In these cases, the estimation accuracy of $\mathbf{R}_{kl}$ or the condition $\gamma_{kl}^2/\tilde{\sigma}^2 \gg 1$ cannot be guaranteed, which causes an inaccurate estimate of $\mathbf{a}(f_{sk})$. Thus, beam scanning and secondary data selection are necessary. Figure 2 describes the process of beam scanning and fine Doppler localization. Firstly, to guarantee the gain of the array in all clutter regions, multiple beams, such as a group of $N$ orthogonal Fourier beams, are used to cover all the azimuth angles; thus, we can get the ground clutter data of all range gates under each spatial beam. Then, because the angle resolution in the spatial domain is low while the Doppler resolution in the temporal domain is high, a group of $K_1$ Doppler filters is used for better localization of the ground clutter. Thus, we can obtain the output data of all range gates under each heavily tapered Doppler filter by fine Doppler localization of the ground clutter data. Each spatial beam covers several Doppler filters and the $N$ spatial beams cover all $K_1$ Doppler filters, so the gain of the array in the clutter areas corresponding to the DFPW of each Doppler filter is guaranteed. Thus, by processing the output data of each heavily tapered Doppler filter in turn, a set of $K_1$ spatial steering vectors can be well estimated.
For secondary data selection, we define a power selection parameter $\varepsilon_l$ in (28) and an angle selection parameter $\rho_l$ in (29), where $\hat{\mathbf{a}}_0(f_{sk})$ is the initial estimate of the spatial steering vector obtained by utilizing all range gates. According to the definitions of $\varepsilon_l$ and $\rho_l$ given in (28) and (29), $\varepsilon_l$ depends on both the direction and the amplitude of $\mathbf{X}_{kl}$, whereas $\rho_l$ depends only on the direction of $\mathbf{X}_{kl}$. Thus, we first use the power selection parameter $\varepsilon_l$ to pick out the range gates which may contain strong clutter or strong outliers. Then, we use the angle selection parameter $\rho_l$ to remove the possible outliers, such as strong moving targets or strong interference, whose directions differ from that of the initial estimate $\hat{\mathbf{a}}_0(f_{sk})$. In this way, the range gates which may contain strong clutter are preserved and the possible outliers are removed.
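A possible form of this two-step selection is sketched below. The exact definitions in (28) and (29) are not reproduced here, so the sketch assumes $\varepsilon_l = |\hat{\mathbf{a}}_0^H \mathbf{X}_{kl}|^2$ with a unit-norm $\hat{\mathbf{a}}_0$ (depends on amplitude and direction) and $\rho_l = \varepsilon_l / \|\mathbf{X}_{kl}\|^2$ (depends on direction only). The function name and the parameters `n_power` and `rho_min` are hypothetical.

```python
import numpy as np

def select_range_gates(X, a0, n_power, rho_min):
    """Return sorted indices of range gates kept after power and angle selection.

    X: N x L Doppler-filter outputs, a0: initial steering-vector estimate.
    eps and rho below are ASSUMED forms of the selection parameters.
    """
    a0 = a0 / np.linalg.norm(a0)
    eps = np.abs(a0.conj() @ X) ** 2                 # projected power per gate
    energy = np.sum(np.abs(X) ** 2, axis=0)          # total power per gate
    rho = eps / np.maximum(energy, 1e-300)           # squared cosine of the angle to a0
    strongest = np.argsort(eps)[-n_power:]           # power selection
    kept = strongest[rho[strongest] >= rho_min]      # angle selection
    return np.sort(kept)

# Toy data: gates 0-4 strong clutter, 5-7 weak clutter, 8 a strong off-direction outlier.
a0 = np.ones(4, dtype=complex)
X = np.zeros((4, 9), dtype=complex)
X[:, 0:5] = 10.0
X[:, 5:8] = 0.1
X[:, 8] = 20.0 * np.array([1, 1, 1, -1])
kept = select_range_gates(X, a0, n_power=6, rho_min=0.9)
```

In the toy data the outlier passes the power test but is rejected by the angle test, while the weak gates are rejected by the power test.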

Figure 2. Beam scanning and fine Doppler localization (panel labels: the first beam, the second beam).

In the first stage of our two-stage SR-STAP method, the goal is to estimate a set of spatial steering vectors with array errors. Firstly, in the beam scanning step, we get the ground clutter data of all range gates given in (9) under the first spatial beam. Then, in the fine Doppler localization step, we obtain the output data of all range gates given in (11) under the first heavily tapered Doppler filter. In the secondary data selection step, we first obtain the initial estimate of the spatial steering vector by utilizing the output data of all range gates given in (11); then we use the power selection parameter $\varepsilon_l$ given in (28) to pick out the range gates which may contain strong clutter or strong outliers; finally, we use the angle selection parameter $\rho_l$ given in (29) to remove the range gates which may contain outliers. Next, in the steering vector estimation step, we calculate $\hat{\mathbf{R}}_{kl}$ by (24) utilizing the selected range gates and perform SVD on $\hat{\mathbf{R}}_{kl}$ to find the eigenvector $\hat{\mathbf{a}}(f_{sk})$ associated with the largest eigenvalue, which is taken as the estimate of $\mathbf{a}(f_{sk})$. This gives the spatial steering vector with array errors corresponding to the first heavily tapered Doppler filter. Then, we check whether all the Doppler channels contained in the current beam have been processed. If not, we set $k = k + 1$ and go back to the beam scanning step. If so, we check whether the beam scanning has finished; if it has not, we set $n = n + 1$ and go back to the fine Doppler localization step. When the beam scanning ends and all $K_1$ Doppler bins are processed, a set of $K_1$ spatial steering vectors with array errors has been estimated by fine Doppler localization.
The procedures of the first stage of the proposed method are summarized as follows:
Step 1: Obtain the initial estimate of the spatial steering vector $\hat{\mathbf{a}}_0(f_{sk})$ corresponding to the $k$th Doppler filter.
Step 2: Compute the values of the power selection parameter $\varepsilon_l$ and the angle selection parameter $\rho_l$ for all range gates according to (28) and (29).
Step 5: Calculate $\hat{\mathbf{R}}_{kl}$ given in (24) utilizing the $q$ range gates selected in step 4. Perform SVD on $\hat{\mathbf{R}}_{kl}$ to find the eigenvector $\hat{\mathbf{a}}(f_{sk})$ associated with the largest eigenvalue, which is taken as the estimate of $\mathbf{a}(f_{sk})$.
Step 6: Go back to step 1 until the beam scanning ends and all $K_1$ Doppler bins are processed.

SR-STAP Method
In the second stage of our two-stage SR-STAP method, we first use the spatial steering vectors obtained in the first stage to construct the space-time dictionary. Then, since the MSBL algorithm has been demonstrated to be robust, sufficiently sparse and parameter-independent in the presence of noise, the existing multiple measurement vector sparse Bayesian learning based space-time adaptive processing (MSBL-STAP) method [45] is adopted.
In the second stage, the radar operates in STAP mode. Since a set of spatial steering vectors with array errors has already been measured in the first stage, high Doppler resolution is not needed in this mode. Let the number of pulses in a CPI be $K_2$; in general, $K_2 < K_1$. To solve the model mismatch problem caused by array errors, we select $N_s$ spatial steering vectors from the $K_1$ spatial steering vectors obtained in the first stage and use them to construct the space-time dictionary $\mathbf{D}$. Then, the received data snapshots of $L$ range bins can be expressed by

$$\mathbf{Y} = \mathbf{D}\boldsymbol{\Psi} + \mathbf{N} \quad (30)$$

where $\boldsymbol{\Psi} = [\boldsymbol{\beta}^{(1)}, \boldsymbol{\beta}^{(2)}, \cdots, \boldsymbol{\beta}^{(L)}] \in \mathbb{C}^{N_s N_d \times L}$ is the solution matrix with each row representing a possible clutter source, and $\mathbf{N} = [\mathbf{n}^{(1)}, \mathbf{n}^{(2)}, \cdots, \mathbf{n}^{(L)}] \in \mathbb{C}^{NK \times L}$ is the noise matrix. The sparse recovery problem is formulated in (31), where $r_s \in \mathbb{R}^+$ is the degree of clutter sparsity (DOS). A convex relaxation of (31) is given in (32). From a Bayesian perspective, (32) is equivalent to maximum a posteriori (MAP) estimation with a corresponding prior probability density function (PDF). According to the measurement model in (30), we get the Gaussian likelihood function in (33). Assume that each column of $\boldsymbol{\Psi}$ obeys a complex Gaussian prior

$$\boldsymbol{\beta}^{(l)} \sim \mathcal{CN}(\mathbf{0}, \boldsymbol{\Gamma}) \quad (34)$$

where $\mathbf{0}$ is a zero vector, $\boldsymbol{\Gamma} = \mathrm{diag}(\boldsymbol{\zeta})$, and $\boldsymbol{\zeta} = [\zeta_1, \zeta_2, \ldots, \zeta_M]$ are the hyperparameters controlling the prior covariance of $\boldsymbol{\beta}^{(l)}$, whose values can be viewed as the powers of the clutter sources. Then the prior PDF of $\boldsymbol{\Psi}$ is the product of the column priors, as given in (35). Combining the prior and the likelihood, we get the posterior PDF of $\boldsymbol{\Psi}$

$$p(\boldsymbol{\Psi}|\mathbf{Y}; \boldsymbol{\Gamma}, \sigma^2) = \frac{p(\mathbf{Y}|\boldsymbol{\Psi}; \sigma^2)\, p(\boldsymbol{\Psi}; \boldsymbol{\Gamma})}{\int p(\mathbf{Y}|\boldsymbol{\Psi}; \sigma^2)\, p(\boldsymbol{\Psi}; \boldsymbol{\Gamma})\, d\boldsymbol{\Psi}} \quad (36)$$

The sparsity profile $\boldsymbol{\Psi}$ is estimated by the posterior mean $\boldsymbol{\mu}$, whose value is governed by the hyperparameter vector $\boldsymbol{\zeta}$ and $\sigma^2$. Thus, the task of estimating $\boldsymbol{\Psi}$ is shifted to estimating $\boldsymbol{\zeta}$ and $\sigma^2$, which can be effectively accomplished by an expectation maximization (EM) algorithm. The procedures of the EM algorithm are described as follows.
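E step: According to (33) and (35), the joint PDF of $(\mathbf{Y}, \boldsymbol{\Psi})$ at step $j + 1$ is given by (37), and the marginal PDF of $\mathbf{Y}$ at step $j + 1$ is given by (38). By combining (37) and (38), we get the posterior PDF of $\boldsymbol{\Psi}$ at step $j + 1$ in (39), where $\boldsymbol{\mu}_{j+1}$ is the mean matrix and $\boldsymbol{\Sigma}_{j+1}$ is the covariance matrix, given by (40) and (41).
M step: In the M step, we estimate $\boldsymbol{\zeta}_{j+1}$ and $\sigma^2_{j+1}$ by Type-II maximum likelihood [42], as stated in (42). Because of decoupling [43], (42) can be divided into the two optimization problems (43) and (44). Substituting (35) into (43) yields

$$\zeta_{m,j+1} = \frac{1}{L}\sum_{l=1}^{L} \left|\mu^{(l)}_{m,j+1}\right|^2 + \Sigma_{m,j+1} \quad (45)$$

where $\mu^{(l)}_{m,j+1}$ is the $m$th component of $\boldsymbol{\mu}^{(l)}_{j+1}$ and $\Sigma_{m,j+1}$ is the $m$th element of the main diagonal of $\boldsymbol{\Sigma}_{j+1}$.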
Substituting (33) into (44) yields the update of $\sigma^2_{j+1}$ in (46). The iteration for updating $\boldsymbol{\zeta}$ and $\sigma^2$ ends when a predetermined criterion is satisfied, such as $\|\boldsymbol{\zeta}_{j+1} - \boldsymbol{\zeta}_j\|_2 / \|\boldsymbol{\zeta}_j\|_2 \le \delta$, where $\delta$ is a sufficiently small positive threshold. Then, the CCM can be calculated by (47), where $\alpha$ is a real constant. Based on the minimum variance distortionless response (MVDR) principle, we get the optimal STAP weight vector in (48), where $\mathbf{s}_t = \mathbf{b}(f_{dt}) \otimes \bar{\mathbf{a}}(f_{st})$ is the target spatial-temporal steering vector with normalized Doppler frequency $f_{dt}$ and normalized spatial frequency $f_{st}$. The procedures of the second stage of the proposed method are summarized as follows:
Step 1: Construct the dictionary $\mathbf{D}$ using the spatial steering vectors obtained in the first stage, and set the initial values $\boldsymbol{\zeta}_0 = \mathbf{1}$, $\sigma^2_0 = 0.1$.
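The second stage can be sketched with the textbook MSBL EM iteration followed by an MVDR weight. This is a minimal sketch under assumptions, not the authors' code: the posterior mean and covariance use the standard forms $\boldsymbol{\mu} = \boldsymbol{\Gamma}\mathbf{D}^H\mathbf{S}^{-1}\mathbf{Y}$ and $\boldsymbol{\Sigma} = \boldsymbol{\Gamma} - \boldsymbol{\Gamma}\mathbf{D}^H\mathbf{S}^{-1}\mathbf{D}\boldsymbol{\Gamma}$ with $\mathbf{S} = \sigma^2\mathbf{I} + \mathbf{D}\boldsymbol{\Gamma}\mathbf{D}^H$, the noise update is the standard EM form, and the CCM is assumed to be $\sum_m \zeta_m \mathbf{d}_m\mathbf{d}_m^H + \alpha\sigma^2\mathbf{I}$. All function names are hypothetical.

```python
import numpy as np

def msbl(Y, D, delta=1e-3, max_iter=500):
    """Textbook MSBL via EM; returns (zeta, sigma2, mu)."""
    NK, L = Y.shape
    zeta = np.ones(D.shape[1])     # initial hyperparameters, as in Step 1
    sigma2 = 0.1                   # initial noise power, as in Step 1
    mu = None
    for _ in range(max_iter):
        DG = D * zeta                                    # D @ diag(zeta)
        S = sigma2 * np.eye(NK) + DG @ D.conj().T        # model covariance of Y
        mu = DG.conj().T @ np.linalg.solve(S, Y)         # E step: posterior mean
        q = np.real(np.sum(D.conj() * np.linalg.solve(S, D), axis=0))  # d_m^H S^-1 d_m
        sigma_diag = zeta - zeta ** 2 * q                # diagonal of posterior covariance
        zeta_new = np.maximum(np.mean(np.abs(mu) ** 2, axis=1) + sigma_diag, 0.0)
        resid = np.linalg.norm(Y - D @ mu) ** 2 / L
        sigma2 = (resid + sigma2 * np.sum(zeta * q)) / NK  # standard EM noise update
        converged = np.linalg.norm(zeta_new - zeta) <= delta * np.linalg.norm(zeta)
        zeta = zeta_new
        if converged:
            break
    return zeta, sigma2, mu

def ccm_from_msbl(D, zeta, sigma2, alpha=1.0):
    """Assumed CCM form: sum_m zeta_m d_m d_m^H + alpha * sigma2 * I."""
    return (D * zeta) @ D.conj().T + alpha * sigma2 * np.eye(D.shape[0])

def mvdr_weight(R, s):
    """MVDR weight w = R^-1 s / (s^H R^-1 s)."""
    Rinv_s = np.linalg.solve(R, s)
    return Rinv_s / np.vdot(s, Rinv_s)
```

A typical call sequence would be `zeta, sigma2, _ = msbl(Y, D)`, then `R = ccm_from_msbl(D, zeta, sigma2)` and `w = mvdr_weight(R, s_t)`, where the columns of `D` are Kronecker products of temporal steering vectors with the spatial steering vectors measured in the first stage.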

Numerical Experiments
In this section, numerical experiments are conducted to assess the performance of the proposed method. The radar system parameters are given in Table 1. In the first stage, a Chebyshev taper with a sidelobe level of −80 dB is used and the sharpening ratio $\kappa$ is equal to 6.4. In the second stage, the discretized grids are set to $N_s = 32$ and $N_d = 32$, i.e., $\rho_s = \rho_d = 4$, and the number of training samples and the iteration termination threshold of the MSBL-STAP algorithm are set to 10 and $\delta = 0.001$, respectively. We use the signal to interference plus noise ratio (SINR) loss as a measure of clutter suppression performance. It is calculated as the ratio of the output SINR to the signal to noise ratio (SNR) obtained by a matched filter in a noise-only environment, as given in (49), where $\mathbf{w}$ is the STAP weight vector and $\mathbf{R}$ is the known CCM. We also evaluate the target detection performance by the probability of detection (PD) versus SNR curves, which are obtained with the adaptive matched filter (AMF) detector [51]. The probability of false alarm (PFA) is set to $10^{-3}$, the target is assumed to be in the main beam direction with normalized Doppler frequency 0.1, and the threshold and probability of detection estimates are based on $10^4$ samples. Besides, all the SINR loss results are obtained from 100 Monte Carlo runs and all the PD versus SNR curves are averaged over 1000 Monte Carlo trials. To demonstrate the performance of the proposed two-stage SR-STAP method in the presence of array errors, each perturbation is first considered separately, and then their combined effects are demonstrated. Finally, we also measure the effect of ICM on our two-stage SR-STAP method.
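The SINR loss metric can be computed as in the following sketch. It uses the commonly used expression $\sigma^2 |\mathbf{w}^H\mathbf{s}|^2 / (NK\, \mathbf{w}^H\mathbf{R}\mathbf{w})$, which is assumed here to correspond to (49); it also assumes a unit-modulus steering vector so that $\|\mathbf{s}\|^2 = NK$. The function name is hypothetical.

```python
import numpy as np

def sinr_loss(w, R, s, sigma2):
    """Output SINR divided by the matched-filter SNR in noise only (linear scale).

    Assumes R is the clutter plus noise covariance matrix and ||s||^2 = s.size.
    """
    num = sigma2 * np.abs(np.vdot(w, s)) ** 2        # sigma2 * |w^H s|^2
    den = s.size * np.real(np.vdot(w, R @ w))        # NK * w^H R w
    return num / den

# Sanity example: one strong interferer v plus white noise, optimal weight w = R^-1 s.
NK, sigma2 = 16, 1.0
s = np.exp(2j * np.pi * 0.1 * np.arange(NK))
v = np.exp(2j * np.pi * 0.3 * np.arange(NK))
R = sigma2 * np.eye(NK) + 100.0 * np.outer(v, v.conj())
loss_db = 10 * np.log10(sinr_loss(np.linalg.solve(R, s), R, s, sigma2))
```

With this definition the loss is at most 0 dB, and it equals 0 dB when only noise is present and the weight is the matched filter.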
We consider four cases in the simulation: (1) the true spatial steering vectors with array errors are used to construct the space-time dictionary for MSBL-STAP, which is called TSV-MSBL; (2) the spatial steering vectors estimated from a single range gate are used to construct the space-time dictionary for MSBL-STAP, which is called SESV-MSBL; (3) the spatial steering vectors estimated from multiple range gates are used to construct the space-time dictionary for MSBL-STAP, which is called MESV-MSBL; (4) the ideal spatial steering vectors without array errors are used to construct the space-time dictionary for MSBL-STAP, which is called ISV-MSBL.

Gain and Phase Errors
In this experiment, we verify the performance of the proposed two-stage SR-STAP method in the presence of gain and phase errors. $\mathbf{G}_{gain} = \mathrm{diag}([g_1, g_2, \cdots, g_N])$ is the error matrix with $g_1 = 1$, and $g_i = (1 + \Delta\alpha_i)e^{j\Delta\varphi_i}$ $(i = 2, \cdots, N)$ is the gain and phase error of the $i$th element, where $\Delta\alpha_i$ and $\Delta\varphi_i$ follow uniform distributions within $[-0.1, 0.1]$ and $[-10^\circ, 10^\circ]$, respectively [47]. Then, we can get the following equation

$$\hat{\mathbf{A}} = \hat{\mathbf{G}}_{gain}\bar{\mathbf{A}} \quad (50)$$

where $\hat{\mathbf{A}} = [\hat{\mathbf{a}}(f_{s1}), \hat{\mathbf{a}}(f_{s2}), \ldots, \hat{\mathbf{a}}(f_{sK_1})]$ is the matrix whose columns are the $K_1$ spatial steering vectors with array errors estimated in the first stage, $\bar{\mathbf{A}} = [\bar{\mathbf{a}}(f_{s1}), \bar{\mathbf{a}}(f_{s2}), \ldots, \bar{\mathbf{a}}(f_{sK_1})]$ is the matrix whose columns are the $K_1$ ideal spatial steering vectors without array errors, and $\hat{\mathbf{G}}_{gain}$ is the estimate of $\mathbf{G}_{gain}$. The least squares (LS) solution for $\hat{\mathbf{G}}_{gain}$ is given by

$$\hat{\mathbf{G}}_{gain} = \hat{\mathbf{A}}\bar{\mathbf{A}}^H\left(\bar{\mathbf{A}}\bar{\mathbf{A}}^H\right)^{-1} \quad (51)$$

To show the performance loss of the proposed method under varying levels of amplitude and phase errors, twenty-one different levels of the amplitude and phase errors are defined as "level1", "level2", "level3", . . . , "level20" and "level21". Figure 4 plots the average SINR loss versus the amplitude and phase errors level. As shown in Figure 4, the higher the amplitude and phase errors level, the more severe the performance loss of the proposed method. In general, a performance loss of less than 3 dB is acceptable. In Figure 4, the black dotted line with a square mark denotes a threshold value, which corresponds to an average SINR loss that is 3 dB below the OPT. From Figure 4, we can observe that even slight amplitude and phase errors cause a severe performance loss of the proposed method; specifically, the performance loss of the proposed two-stage method is greater than 3 dB when the amplitude and phase errors level is greater than 2. Thus, when the amplitude and phase errors level is greater than 2, the performance of the proposed method deteriorates significantly.
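The LS estimate of the error matrix from the estimated and ideal steering vectors can be sketched as below. The function name is hypothetical, and the sketch assumes that each estimated steering vector has already been scaled consistently (for example, first entry equal to 1 when $g_1 = 1$), since eigenvectors are only defined up to a complex scale factor.

```python
import numpy as np

def ls_error_matrix(A_hat, A_bar):
    """Least squares solution of A_hat = G @ A_bar for G, i.e. G = A_hat @ pinv(A_bar)."""
    return A_hat @ np.linalg.pinv(A_bar)

# Noise-free check with made-up gain/phase errors on a 6-element array.
rng = np.random.default_rng(0)
N, K1 = 6, 32
f_s = np.linspace(-0.5, 0.5, K1, endpoint=False)
A_bar = np.exp(2j * np.pi * np.outer(np.arange(N), f_s))   # ideal steering vectors as columns
g = (1 + rng.uniform(-0.1, 0.1, N)) * np.exp(1j * np.deg2rad(rng.uniform(-10, 10, N)))
g[0] = 1.0
G_true = np.diag(g)
G_est = ls_error_matrix(G_true @ A_bar, A_bar)
```

When $\bar{\mathbf{A}}$ has full row rank, `A_hat @ pinv(A_bar)` coincides with $\hat{\mathbf{A}}\bar{\mathbf{A}}^H(\bar{\mathbf{A}}\bar{\mathbf{A}}^H)^{-1}$; the pseudo-inverse is used only for numerical convenience.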
The gain and phase errors estimated by the proposed method are presented in Table 2. From this table, it is observed that the estimated values are very close to the true ones. The SINR loss curves in the presence of gain and phase errors are given in Figure 5a. As shown in Figure 5a, due to the steering vector mismatch between the dictionary and the clutter data, the clutter suppression performance of the ISV-MSBL method is much poorer than that of the TSV-MSBL method. By comparing the SINR loss curves of ISV-MSBL, SESV-MSBL, MESV-MSBL and the OPT, it is observed that the MESV-MSBL method achieves performance comparable to the OPT, which is better than that of the SESV-MSBL method and much better than that of the ISV-MSBL method. The results demonstrate that the gain and phase errors can be well calibrated by the developed steering vector estimation method. Whether in the mainlobe region or in the sidelobe region, the clutter suppression performance of the proposed method is significantly improved. The PD versus SNR curves in the presence of gain and phase errors are given in Figure 5b. As depicted in Figure 5b, the target detection performance of the MESV-MSBL method is close to the optimal performance, which is better than that of the SESV-MSBL method and much better than that of the ISV-MSBL method. Compared with the ISV-MSBL method, the slow-moving target detection performance of the proposed SESV-MSBL and MESV-MSBL methods is significantly improved.

Mutual Coupling
In this experiment, we verify the performance of the proposed two-stage SR-STAP method in the presence of mutual coupling. We assume that the mutual coupling coefficient can be ignored when the element spacing is greater than 1.5 wavelengths, which means that $q = 3$. We set the non-zero mutual coupling coefficients to $1$, $0.1250 + 0.2165j$ and $0.0866 - 0.0500j$, respectively [49]. Following the same principle as in the estimation of $\mathbf{G}_{gain}$, we can also obtain the estimate of the mutual coupling matrix $\mathbf{G}_{mutual}$ by Equation (51).
The mutual coupling coefficients estimated by the proposed method are presented in Table 3; the estimated values are also very close to the true ones. The SINR loss curves in the presence of mutual coupling are depicted in Figure 6a, and the PD versus SNR curves in the presence of mutual coupling are depicted in Figure 6b.
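As a rough illustration of this experiment, the sketch below builds a mutual coupling matrix from the three coefficients quoted above and recovers them by least squares. The banded symmetric Toeplitz structure, the array size, the noise level and the helper names `mutual_coupling_matrix` and `band_coefficients` are assumptions made for the example; they are not taken from the paper.

```python
import numpy as np

def mutual_coupling_matrix(coeffs, n_elements):
    """Banded symmetric Toeplitz matrix whose m-th off-diagonals hold coeffs[m]."""
    first_row = np.zeros(n_elements, dtype=complex)
    first_row[:len(coeffs)] = coeffs
    idx = np.arange(n_elements)
    lag = np.abs(np.subtract.outer(idx, idx))  # |i - j| for every matrix entry
    return first_row[lag]

def band_coefficients(g_est, q):
    """Average each of the first q upper diagonals of an estimated coupling matrix."""
    return np.array([np.mean(np.diagonal(g_est, offset=m)) for m in range(q)])

rng = np.random.default_rng(1)
n_elements, k1 = 8, 256  # hypothetical sizes
coeffs = np.array([1.0, 0.1250 + 0.2165j, 0.0866 - 0.0500j])  # from the text, q = 3
g_true = mutual_coupling_matrix(coeffs, n_elements)

# Ideal steering matrix (N x K1) and a noisy stand-in for the first-stage estimates.
f_s = np.linspace(-0.5, 0.5, k1, endpoint=False)
a_bar = np.exp(2j * np.pi * np.outer(np.arange(n_elements), f_s))
noise = 1e-3 * (rng.standard_normal((n_elements, k1))
                + 1j * rng.standard_normal((n_elements, k1)))
a_hat = g_true @ a_bar + noise

g_est = a_hat @ np.linalg.pinv(a_bar)   # least squares estimate of the full matrix
c_est = band_coefficients(g_est, q=3)   # collapse it to the three coefficients
print(np.round(c_est, 4))
```

Averaging along each diagonal exploits the assumed Toeplitz structure; without that assumption the full matrix `g_est` would be used directly.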

Sensor Location Errors
In this experiment, we verify the performance of the proposed two-stage SR-STAP method in the presence of sensor location errors. $\Delta_{PE} = \mathrm{diag}([\Delta_0, \Delta_1, \ldots, \Delta_{N-1}])$ is the position error matrix, where $\Delta_0 = 0$ and $\Delta_{i-1}$ $(i = 2, \cdots, N)$ is the position error of the $i$th element, which follows a uniform distribution within $[-0.1d, 0.1d]$ [52]. We can also utilize the $K_1$ estimated spatial steering vectors $\hat{\mathbf{a}}(f_{s1}), \hat{\mathbf{a}}(f_{s2}), \ldots, \hat{\mathbf{a}}(f_{sK_1})$ and the $K_1$ ideal spatial steering vectors $\bar{\mathbf{a}}(f_{s1}), \bar{\mathbf{a}}(f_{s2}), \ldots, \bar{\mathbf{a}}(f_{sK_1})$ to estimate $\Delta_{i-1}$. In the resulting estimate $\hat{\Delta}_{i-1}$, $\hat{a}_i(f_{sk})$ and $\bar{a}_i(f_{sk})$ are the $i$th elements of $\hat{\mathbf{a}}(f_{sk})$ and $\bar{\mathbf{a}}(f_{sk})$, respectively, and $\varphi_k$ is the spatial cone angle corresponding to the center frequency of the $k$th Doppler filter. The sensor location errors estimated by the proposed method are presented in Table 4; the estimated values match the true ones closely.

Table 4. Sensor location errors estimation.

True (m) Estimated (m)
The SINR loss curves in the presence of sensor location errors are depicted in Figure 7a, and the PD versus SNR curves in the presence of sensor location errors are depicted in Figure 7b. It is observed that both the clutter suppression performance and the target detection performance of the proposed method are significantly improved when sensor location errors are present.
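One possible way to estimate the sensor location errors from the estimated and ideal steering vectors is sketched below. It is not necessarily the estimator used in the paper: it assumes that a displacement $\Delta_{i-1}$ along the array axis adds a phase $2\pi\Delta_{i-1}\cos\varphi_k/\lambda$ to the $i$th element, and fits that phase slope by least squares over all Doppler bins. The wavelength, element spacing, array size, noise level and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
wavelength = 0.15        # hypothetical wavelength in meters
d = wavelength / 2       # hypothetical half-wavelength element spacing
n_elements, k1 = 8, 256  # hypothetical sizes

# Cosines of the cone angles of the K1 Doppler bin centers (assumed to span [-1, 1)).
cos_phi = np.linspace(-1.0, 1.0, k1, endpoint=False)

def steering_matrix(positions):
    """N x K1 steering matrix for elements at the given positions along the array axis."""
    return np.exp(2j * np.pi * np.outer(positions, cos_phi) / wavelength)

def estimate_position_errors(a_hat, a_bar):
    """Least squares fit of the displacement that explains each element's residual phase."""
    phase = np.angle(a_hat * np.conj(a_bar))   # N x K1 residual phases in radians
    slope = 2 * np.pi * cos_phi / wavelength   # radians per meter of displacement
    return phase @ slope / np.sum(slope ** 2)

# Position errors drawn as in the text: uniform in [-0.1d, 0.1d], first element exact.
delta_true = rng.uniform(-0.1 * d, 0.1 * d, n_elements)
delta_true[0] = 0.0
nominal = d * np.arange(n_elements)

a_bar = steering_matrix(nominal)
noise = 1e-3 * (rng.standard_normal((n_elements, k1))
                + 1j * rng.standard_normal((n_elements, k1)))
a_hat = steering_matrix(nominal + delta_true) + noise  # stand-in for stage-one output

delta_est = estimate_position_errors(a_hat, a_bar)
print(np.max(np.abs(delta_est - delta_true)))  # largest error in meters
```

With errors bounded by $0.1d$ and half-wavelength spacing, the residual phase stays below $0.1\pi$, so no phase unwrapping is needed in this sketch.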

Arbitrary Array Errors
In this experiment, we model the arbitrary array errors as the combined effects of gain and phase errors, mutual coupling and sensor location errors, and then demonstrate the performance of the proposed method in the presence of these arbitrary array errors. The specific values of the array errors are the same as those in Sections 4.1-4.3.
The amplitudes and interferometric phases of all $K_1$ spatial steering vectors with array errors estimated in the first stage are given in Figure 8a,c, respectively. The amplitudes and interferometric phases of the $K_1$ true spatial steering vectors with array errors are given in Figure 8b,d, respectively. From Figure 8a,b, we can observe that the amplitudes of the estimated spatial steering vectors are close to those of the true spatial steering vectors. From Figure 8c,d, we can also observe that the interferometric phases of the estimated spatial steering vectors are very close to those of the true spatial steering vectors. Figure 9a,b show the amplitude differences and the phase differences between the estimated and true spatial steering vectors of all Doppler bins, respectively. As depicted in Figure 9, these differences are very small, i.e., the estimated spatial steering vectors are very close to the true ones. Thus, when arbitrary array errors are present, a set of $K_1$ spatial steering vectors can be well estimated in the first stage of our two-stage SR-STAP method. The results intuitively demonstrate the superior steering vector estimation performance of the proposed method.
For clarity, Figure 10 shows the amplitudes and phases of the ideal steering vector, the true steering vector and the estimated steering vector of the 75th Doppler bin. The results indicate that the estimated steering vector is much closer to the true steering vector than the ideal steering vector is, and that the true steering vector can be well approximated by the estimated steering vector in the presence of arbitrary array errors. The SINR loss curves and the PD versus SNR curves in the presence of arbitrary array errors are depicted in Figure 11a,b, respectively. From Figure 11, it is observed that the ISV-MSBL method suffers a severe performance degradation when arbitrary array errors are present. However, compared with the ISV-MSBL method, the clutter suppression performance and the target detection performance of the proposed SESV-MSBL and MESV-MSBL methods are significantly improved, and the MESV-MSBL method obtains performance comparable to that of the OPT. The reason is that the array errors are well calibrated by the developed steering vector estimation method, so the mismatch between the clutter data and the space-time dictionary is resolved. The results further validate the superior performance of the proposed method. To better illustrate the advantage of the proposed method, Figure 12a-d plot the clutter Capon spectra of different STAP methods. From Figure 12a-c, we can observe that the spectrum of the MESV-MSBL method is the closest to the optimal spectrum, with little clutter power leakage, and that the spectrum of the SESV-MSBL method is close to the optimal spectrum, with some clutter power leakage and a slight spectrum expansion.
However, from Figure 12d, we can observe that the spectrum of the ISV-MSBL method has severe clutter power leakage and spectrum expansion. The reason is that, if array calibration is not performed, the steering vector mismatch between the clutter data and the space-time dictionary prevents the clutter spectrum from being well estimated. As a result, the adaptive pattern forms widened or misplaced notches and can neither suppress the clutter nor protect the target well, so the clutter suppression performance and the slow-moving target detection performance of SR-STAP methods degrade significantly. That is why array calibration must be performed when sparse recovery techniques are applied to STAP.
Figure 13 plots the average SINR loss versus the number of training samples used in the first stage. From Figure 13, we can see that when the number of training samples used in the first stage is larger than 100, the spatial steering vectors with array errors can be well estimated and the MESV-MSBL method achieves performance comparable to that of the TSV-MSBL method and the OPT. In Figure 14, we compare the clutter suppression performance of the proposed SESV-MSBL and MESV-MSBL methods with that of the AGPE-SR-STAP method [47], the MSB-SR-STAP method [49] and the IAD-SR-STAP method [48]. From Figure 14, we can observe that the SESV-MSBL and MESV-MSBL methods have narrower notches than the other STAP methods. The reason is that the AGPE-SR-STAP and IAD-SR-STAP methods are only suitable for gain/phase calibration and the MSB-SR-STAP method is only suitable for mutual coupling calibration. Thus, in the presence of arbitrary array errors, these methods are no longer effective. Finally, we give two experiments to show how much variation in the values of the system parameters in Table 1 affects the performance of the proposed method.
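Returning to the clutter Capon spectra of Figure 12, a spectrum of this kind can be computed from a clutter plus noise covariance matrix as $P(f_d, f_s) = 1/(\mathbf{s}^H\mathbf{R}^{-1}\mathbf{s})$. The sketch below is a toy illustration only: the covariance is synthesized from ideal clutter patches on the ridge $f_d = f_s$ rather than recovered by MSBL, and the array size, clutter-to-noise ratio, grids and function names are assumptions.

```python
import numpy as np

def st_steering(n_elements, n_pulses, f_s, f_d):
    """Space-time steering vector (temporal Kronecker spatial), normalized frequencies."""
    spatial = np.exp(2j * np.pi * f_s * np.arange(n_elements))
    temporal = np.exp(2j * np.pi * f_d * np.arange(n_pulses))
    return np.kron(temporal, spatial)

def capon_spectrum(r, n_elements, n_pulses, fs_grid, fd_grid):
    """Capon spectrum 1 / (s^H R^-1 s) evaluated on a Doppler x spatial-frequency grid."""
    r_inv = np.linalg.inv(r)
    p = np.empty((len(fd_grid), len(fs_grid)))
    for i, f_d in enumerate(fd_grid):
        for j, f_s in enumerate(fs_grid):
            s = st_steering(n_elements, n_pulses, f_s, f_d)
            p[i, j] = 1.0 / np.real(s.conj() @ r_inv @ s)
    return p

n_elements = n_pulses = 8  # hypothetical system size
patch_fs = np.linspace(-0.5, 0.5, 64, endpoint=False)
patch_power = 1e4 / len(patch_fs)  # assumed 40 dB total clutter-to-noise ratio

# Toy covariance: unit-power noise plus clutter patches on the ridge f_d = f_s.
r = np.eye(n_elements * n_pulses, dtype=complex)
for f in patch_fs:
    v = st_steering(n_elements, n_pulses, f, f)
    r += patch_power * np.outer(v, v.conj())

grid = np.linspace(-0.5, 0.5, 16, endpoint=False)
p_db = 10 * np.log10(capon_spectrum(r, n_elements, n_pulses, grid, grid))
ridge_bins = np.argmax(p_db, axis=0)  # strongest Doppler bin per spatial frequency
print(ridge_bins)
```

In this idealized case the strongest Doppler bin follows the clutter ridge for every spatial frequency; leakage and spectrum expansion of the kind seen for ISV-MSBL would appear as power away from that ridge.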
Figure 15a plots the SINR loss curves of the MESV-MSBL method for different numbers of pulses in a CPI in the first stage of the proposed two-stage SR-STAP method. From Figure 15a, we can observe that the more pulses there are in a CPI, the better the clutter suppression performance of the proposed MESV-MSBL method. The reason is that when the number of pulses $K_1$ in a CPI increases, the sharpening ratio $\kappa$ given in (20) becomes larger and the width of the spatial frequency passband $\Delta$ given in (19) becomes smaller; as a result, the correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta/2)$ given in (21) becomes larger. In other words, as the number of pulses $K_1$ increases, the spatial steering vectors are estimated more accurately in the first stage of our two-stage SR-STAP method, and the clutter suppression performance of the proposed MESV-MSBL method improves accordingly. Thus, we can conclude that the system parameters determine the sharpening ratio $\kappa$, and that $\kappa$ in turn determines the clutter suppression performance of the proposed two-stage SR-STAP method: the greater the sharpening ratio $\kappa$, the better the clutter suppression performance. In general, as long as the system parameters guarantee that the correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta/2)$ given in (21) is greater than 0.95, the proposed two-stage SR-STAP method obtains superior clutter suppression performance. To further confirm this conclusion, Figure 15b plots the SINR loss curves of the MESV-MSBL method under different platform velocities and pulse repetition frequencies. From Figure 15b, we can observe that when $v_p = 150$ and $f_{PRF} = 2000$, the proposed MESV-MSBL method achieves superior clutter suppression performance because the sharpening ratio $\kappa$ is high in this case.
In addition, when $v_p = 120$ and $f_{PRF} = 1600$, the system parameters have changed but the sharpening ratio $\kappa$ has not; thus, the proposed MESV-MSBL method still achieves superior clutter suppression performance. However, when $v_p = 120$, $f_{PRF} = 2000$ and when $v_p = 20$, $f_{PRF} = 2000$, the clutter suppression performance of the proposed MESV-MSBL method becomes progressively worse because the sharpening ratio $\kappa$ becomes progressively smaller.
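The 0.95 correlation criterion discussed above can be explored numerically. The sketch below assumes that the correlation coefficient in (21) is the normalized inner product of two ideal uniform linear array steering vectors, and treats the passband width $\Delta$ as a normalized spatial frequency; the array size and the function names are hypothetical, and the definition of $\Delta$ in (19) is not reproduced here.

```python
import numpy as np

def steering(n_elements, f_s):
    """Ideal ULA steering vector for a normalized spatial frequency f_s."""
    return np.exp(2j * np.pi * f_s * np.arange(n_elements))

def correlation_coefficient(n_elements, f_s, offset):
    """Normalized inner product magnitude of a(f_s) and a(f_s + offset)."""
    a = steering(n_elements, f_s)
    b = steering(n_elements, f_s + offset)
    return np.abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

def max_passband_width(n_elements, threshold=0.95, step=1e-4):
    """Largest width Delta on a grid whose half-width offset keeps the correlation >= threshold."""
    width = 0.0
    while correlation_coefficient(n_elements, 0.0, (width + step) / 2) >= threshold:
        width += step
    return width

n_elements = 8  # hypothetical array size
delta_max = max_passband_width(n_elements)
print(f"passband width limit for N = {n_elements}: {delta_max:.4f}")
```

For an ideal array the coefficient depends only on the offset, so a narrower passband (larger sharpening ratio) always brings the two vectors closer together, which is the trend described in the text.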

Arbitrary Array Errors and Intrinsic Clutter Motion
In this experiment, we verify the performance of the proposed two-stage SR-STAP method in the presence of arbitrary array errors and ICM. As in Section 4.4, we model the arbitrary array errors as the combined effects of gain and phase errors, mutual coupling and sensor location errors. We only consider the ICM in the first stage of our two-stage method, i.e., we only measure the effect of ICM on the estimation of the spatial steering vectors in the first stage, without considering the clutter spectrum expansion caused by ICM in the second stage. In fact, the latter problem can be effectively handled by the covariance matrix taper (CMT) approach; the interested reader is referred to [53,54] for further details. The ICM model is given by [1]: the temporal autocorrelation of the fluctuations is Gaussian in shape, where $\sigma_v$ is the velocity standard deviation and $T_r = 1/f_{PRF}$ is the pulse repetition interval. The SINR loss curves of different methods in the presence of arbitrary array errors and ICM are depicted in Figure 16. As shown in Figure 16a, when $\sigma_v = 0.5$ m/s, the proposed two-stage method still achieves superior performance because the broadening of the Doppler spectrum caused by ICM is much smaller than the DFPW of the heavily tapered Doppler filter. Specifically, in this experiment, when $\sigma_v = 0.5$ m/s, the width of the Doppler spectrum is $D_b = \frac{2\sigma_v}{\lambda f_{PRF}} = \frac{1}{300}$, and a reasonable DFPW value is $5/K_1$, i.e., $D_w = 5/256$; thus, the inequality $D_b \ll D_w$ holds and the broadening of the Doppler spectrum has little effect on the estimation of the spatial steering vectors. However, when $\sigma_v = 3$ m/s, the width of the Doppler spectrum is $D_b = \frac{2\sigma_v}{\lambda f_{PRF}} = \frac{6}{300}$ and the inequality $D_b \ll D_w$ no longer holds; thus, the broadening of the Doppler spectrum has some adverse effect on the estimation of the spatial steering vectors.
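The comparison between the Doppler spectrum width $D_b$ and the Doppler filter passband width $D_w$ can be checked with a few lines of arithmetic. In the sketch below, the wavelength and PRF are assumed values chosen only so that $\lambda f_{PRF} = 300$, as implied by the numbers above; the Gaussian autocorrelation is written in the commonly used form $\exp(-8\pi^2\sigma_v^2 T_r^2 m^2/\lambda^2)$, which is an assumption about the exact expression used in the paper, and all function names are hypothetical.

```python
import math

def doppler_spread(sigma_v, wavelength, f_prf):
    """Normalized Doppler spectrum width D_b = 2 * sigma_v / (wavelength * f_PRF)."""
    return 2.0 * sigma_v / (wavelength * f_prf)

def doppler_filter_passband(k1, width_bins=5):
    """Normalized passband width D_w = width_bins / K1 of the tapered Doppler filter."""
    return width_bins / k1

def icm_autocorrelation(m, sigma_v, wavelength, f_prf):
    """Gaussian-shaped pulse-to-pulse correlation at lag m (assumed model form)."""
    t_r = 1.0 / f_prf  # pulse repetition interval
    return math.exp(-8.0 * math.pi ** 2 * sigma_v ** 2 * t_r ** 2 * m ** 2
                    / wavelength ** 2)

# Assumed values: only their product (wavelength * f_PRF = 300) follows from the text.
wavelength, f_prf, k1 = 0.15, 2000.0, 256

d_w = doppler_filter_passband(k1)
for sigma_v in (0.5, 3.0):
    d_b = doppler_spread(sigma_v, wavelength, f_prf)
    rho = icm_autocorrelation(k1 - 1, sigma_v, wavelength, f_prf)
    print(f"sigma_v = {sigma_v} m/s: D_b = {d_b:.4f}, D_w = {d_w:.4f}, "
          f"D_b / D_w = {d_b / d_w:.2f}, correlation across the CPI = {rho:.3g}")
```

The ratio $D_b/D_w$ gives a quick indication of whether the ICM broadening still fits inside one Doppler filter passband for a given velocity standard deviation.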
As depicted in Figure 16b, when $\sigma_v = 3$ m/s, the notches of the proposed SESV-MSBL and MESV-MSBL methods spread because of the severe temporal fluctuations. However, compared with other SR-STAP methods, the proposed two-stage method still achieves better performance. The existence of intrinsic clutter motion deteriorates the performance of the proposed two-stage SR-STAP method: the more serious the intrinsic clutter motion, the less accurate the estimation of the spatial steering vectors in the first stage of our two-stage method, and hence the worse the algorithm performance. In general, a performance loss of less than 3 dB is acceptable. Figure 17 plots the average SINR loss versus the velocity standard deviation. As shown in Figure 17, when the velocity standard deviation is small, the proposed two-stage method still obtains near-optimal performance; when the velocity standard deviation becomes larger, the performance of the proposed method degrades due to the more severe pulse-to-pulse fluctuations. In Figure 17, the black dotted line with square markers denotes the threshold at which the average SINR loss is 3 dB below that of the OPT. From Figure 17, we can observe that the performance loss of the proposed two-stage method is less than 3 dB when the velocity standard deviation is less than 3.1 m/s. For land clutter in some areas, such as rural and urban terrain, the velocity standard deviation is usually very small; even in wooded terrain, it is generally less than 1 m/s [55]. From Figure 17, we can also observe that the performance loss of the proposed two-stage method is less than 1 dB when the velocity standard deviation is less than 1 m/s. Therefore, the performance of the proposed two-stage method is satisfactory when the velocity standard deviation is less than 1 m/s.
Thus, the proposed method is still effective for ground clutter suppression and ground moving target detection in the presence of arbitrary array errors and small ICM.

Conclusions
The model mismatch caused by array errors drastically degrades the clutter suppression performance and the target detection performance of SR-STAP methods. To solve this problem, a new two-stage SR-STAP method is proposed in this paper. In the first stage, based on the spatial-temporal coupling property of ground clutter data, we obtain a set of spatial steering vectors with array errors by fine Doppler localization. In the second stage, in order to solve the model mismatch problem caused by array errors, we directly use these spatial steering vectors with array errors to construct the space-time dictionary, and then combine the constructed space-time dictionary with the MSBL algorithm for space-time adaptive processing. The simulation results demonstrate that variation in the system parameters affects the performance of the proposed two-stage SR-STAP method: the system parameters determine the sharpening ratio $\kappa$, and $\kappa$ in turn determines the performance of the method. The greater the sharpening ratio $\kappa$, the better the clutter suppression performance and the target detection performance of the proposed method. In general, as long as the system parameters guarantee that the correlation coefficient of $\mathbf{a}(f_{si})$ and $\mathbf{a}(f_{si} + \Delta/2)$ given in (21) is greater than 0.95, the proposed two-stage SR-STAP method obtains favorable performance. In addition, the simulation results, obtained with the reasonable system parameters listed in Table 1, demonstrate that the spatial steering vectors with array errors can be well estimated in the first stage of our two-stage SR-STAP method when arbitrary array errors and small ICM are present, and that the proposed method achieves superior clutter suppression performance and target detection performance in the presence of arbitrary array errors.

Conflicts of Interest:
The authors declare no conflict of interest.