Clutter Subspace Characteristics-Aided Space-Time Adaptive Outlier Sample Selection Method

For statistical space-time adaptive processing (STAP), a critical issue is estimating the clutter covariance matrix (CCM). However, because airborne radar operates in a realistic heterogeneous environment, it is difficult to obtain enough training samples that satisfy the independent and identically distributed (IID) condition, and contaminated training samples should be eliminated before the CCM is estimated. To address the high computational complexity of the traditional generalized inner product (GIP) method and its susceptibility to outliers, a clutter subspace-based training sample selection method is proposed that exploits the specific distribution of the clutter spectrum in the space-time plane. Theoretical analysis and simulation results verify the proposed method and indicate that it makes the CCM easy to construct, has lower computational complexity, and is less susceptible to outliers than the traditional GIP method.

The yaw angle [29][30][31][32] of the airborne system and the variable clutter environment [33][34][35][36] may account for this. Furthermore, outliers [14,21][37][38][39][40][41] cannot be ignored, since they are a critical factor affecting the accuracy of the ST-CCM estimate. If the training samples used to estimate the ST-CCM contain outliers, the computed weight vector is mismatched, the output signal-to-clutter-plus-noise ratio (SCNR) drops, and the detection performance for slow and weak moving targets degrades. Therefore, training samples contaminated by interference or outliers must be eliminated before the ST-CCM is estimated.
The primary technical basis for traditional STAP sample selection methods is the generalized inner product (GIP) [38][39][40][41]. However, the GIP reference matrix is unknown because the accurate ST-CCM of the range sample under test cannot be obtained in a practical environment. Therefore, the traditional GIP method typically uses the data on both sides of the range sample under test as training samples to estimate the GIP matrix. However, a heterogeneous clutter environment causes the estimated GIP matrix to be seriously mismatched. The reasons include the varying degree of heterogeneity of each training sample and the shielding effect of potential outliers [42]. Moreover, the GIP value of a heterogeneous sample does not deviate significantly from the mean GIP value over the range cells, which easily leads to inappropriate sample screening and degrades the selection performance. In addition, the GIP method must invert a high-dimensional matrix, which requires a large amount of computation. For this reason, many researchers have investigated and improved the GIP method. For example, the GIP method has been used together with prior knowledge obtained from actual data [43].
Similarly, this paper aims to overcome the shortcomings of the traditional GIP method and improve STAP performance.
First, this paper establishes the echo model and the moving target detection model for an airborne radar with a side-looking uniform linear array (ULA). Second, it analyzes the influence of outliers in the training samples on STAP performance. Third, it constructs the clutter subspace based on the characteristic [44] that clutter has a specific distribution in the two-dimensional (2-D) space-time plane, and uses this subspace to construct the GIP reference matrix off-line.
Compared with the traditional GIP method, the proposed method constructs the GIP reference matrix off-line from prior knowledge, so using the matrix directly in a practical application adds little computation.
Finally, theoretical analysis and simulation experiments show that the GIP reference matrix of the proposed method is easy to build and requires less computation. Furthermore, the method is sensitive to outliers, which makes it possible to select training samples efficiently.

Geometric Model of Space
Figure 1 shows the spatial geometric model of the airborne multi-channel radar antenna array. The platform flies with speed $V_a$ parallel to the X-axis. The wavelength of the radar signal is $\lambda$. The radar antenna is a side-looking planar array of $M \times N$ uniformly spaced physical array elements with spacing $d = \lambda/2$. The antenna array transmits pulses with a wide aperture. The received echoes are combined by microwave synthesis in rows, which yields an equivalent uniform linear array (ULA) of $N$ equivalent array elements. The main lobe of the antenna points towards $(\theta_0, \varphi_0)$.
In this case, the Wald clutter model [31] is adopted to divide the illuminated area into grids. The model divides the area covered by the radar beam into $N_c$ equal range cells (range rings) and then divides each ring into $N_l$ sufficiently small units called clutter scatterer patches (CSPs). The size of each CSP$(l, i)$ is usually smaller than or equal to the radar resolution ($l = 1, 2, \ldots, N_c$; $i = 1, 2, \ldots, N_l$). Within each CSP, the antenna gain, the Doppler frequency shift, the slant distance, the direction relative to the radar array, and the radar cross-section (RCS) are approximately homogeneous. The RCS of each CSP obeys the Rayleigh distribution. The symbols $\theta_{l,i}$ and $\varphi_l$ represent the azimuth angle and the elevation angle between CSP$(l, i)$ and the radar array, respectively. The unevenness of the ground is ignored, so $\varphi_l = \varphi_{l,i}$. The symbol $\psi$ represents the space cone angle, which satisfies $\cos\psi_{l,i} = \cos\theta_{l,i} \cdot \cos\varphi_l$.
In a coherent processing interval (CPI), $T_r = 1/f_r$ is the pulse repetition interval (PRI), where $f_r$ is the pulse repetition frequency (PRF) of the transmitted radar signal. Let $t$ be the total time variable; then $t_k = (k - 1)T_r$ is the slow-time variable and $\hat{t} = t - t_k$ is the fast-time variable ($k = 1, 2, \ldots, K$, where $K$ is the total number of pulses in a CPI).
The coordinates of the aircraft at time $t_0$ are $(0, 0, H)$. The coordinates of a point target P within the area covered by the radar beam are $(x_0, y_0, H)$. The symbols $\theta_p$ and $\varphi_p$ represent the azimuth angle and the elevation angle between target P and the radar array, respectively. The symbol $\psi_p$ represents the space cone angle, which satisfies

$$\cos\psi_p = \cos\theta_p \cdot \cos\varphi_p$$

The symbol $R_{p,0}$ represents the slant distance of target P relative to the radar at time $t_0$. Because the aircraft is much farther from the target than the array elements are from one another, this paper ignores the differences caused by the spacing of the array elements.
The velocity components of target P along the x-, y-, and z-axes are $v_x$, $v_y$, and $v_z$, respectively, and its radial velocity relative to the radar is $v_r$. At time $t_0$, the coordinates of the $n$-th receiving channel are $(d_n, 0, H)$, where $d_n = (n - 1) \cdot d$ ($n = 1, 2, \ldots, N$).
At time $t_k$, the slant distance $R_n(t_k)$ of target P relative to the $n$-th array element (channel) is:

Based on the geometric model established above, and using a Taylor expansion and approximate compensation, Equation (1) can be approximated as:

Echo Signal Model
Assume that $s_T(t, t_k)$ is the baseband pulse waveform of the transmitted radar signal and that $f_c = c/\lambda$ is the carrier frequency, where $c$ is the speed of light. Within a CPI, the $k$-th radio-frequency (RF) pulse emitted by the radar is:

The narrowband plane-wave condition is assumed, and the range walk of the signal envelope is ignored. The RF echo pulse $S_{R,n}(t, t_k)$ of target P received by the $n$-th equivalent array element is:

The symbol $\xi_p$ represents the equivalent echo amplitude coefficient of target P, which accounts for the RCS, the antenna pattern, and the system loss, and $\tau_n(t_k)$ is the two-way delay of the echo received by the $n$-th equivalent array element. At time $t_k$, the baseband echo signal $s_n(t, t_k)$ is:

Based on the Wald clutter model [31], the $k$-th baseband echo pulse of CSP$(l, i)$ received by the $n$-th equivalent array element in a CPI is:

The symbol $\xi_{l,i}$ represents the equivalent echo amplitude coefficient of CSP$(l, i)$, which accounts for the RCS, the antenna pattern, and the system loss, and $\tau_{n,l,i}(t_k)$ is the two-way delay of the echo received by the $n$-th equivalent array element.
In the area adjacent to the moving target P within the region covered by the radar beam, the baseband clutter echo pulse $C_n$ received by the $n$-th equivalent array element at time $t_k$ is:

Assume that the radar uses the first equivalent array element as both the transmit and the receive channel. The delays $\tau_n(t_k)$ and $\tau_{n,l,i}(t_k)$ in Equations (5) and (6) are, respectively:

where $R_T(t_k)$ is the slant distance of target P relative to the transmitting channel. From Equation (8), the separated spatial and temporal parts are $(d_n \cdot \cos\psi_a)/c$ and $2(V_a \cdot \cos\psi_a - v_r) \cdot t_k / c$, respectively. The spatial angular frequency $\omega_s$ and the temporal angular frequency $\omega_t$ of target P are, respectively:

The normalized spatial frequency and the normalized temporal (Doppler) frequency are $f_s = \omega_s \cdot \lambda / 2\pi$ and $f_d = \omega_t \cdot \lambda / 2\pi$, respectively. They satisfy $f_d = \beta \cdot f_s$, where $\beta = 2V_a/(d f_r)$ is the fold coefficient. Thus, the refined expression of Equation (5) is:

where the symbol $\xi_p$ additionally includes the constant phase term $\exp(-j4\pi R_{p,0}/\lambda)$. Similarly, the refined expression of Equation (6) is:

where the symbol $\xi_{l,i}$ additionally includes the constant phase term $\exp(-j4\pi R_{l,i,0}/\lambda)$.

Space-Time Steering Vector and Space-Time Clutter Spectrum Model
The temporal steering vector of target P is:

$$S_t = \left[1, e^{j2\pi f_d}, \ldots, e^{j2\pi (K-1) f_d}\right]^T \tag{14}$$

The spatial steering vector of target P is:

$$S_s = \left[1, e^{j2\pi f_s}, \ldots, e^{j2\pi (N-1) f_s}\right]^T \tag{15}$$

The space-time steering vector of target P is:

$$S_P = S_t \otimes S_s \tag{16}$$

The range sample under test in a CPI is an $NK \times 1$ vector:

$$x = \left[x_1(t, t_1), \ldots, x_N(t, t_1), \ldots, x_1(t, t_K), \ldots, x_N(t, t_K)\right]^T \tag{17}$$

where $x_n(t, t_k)$ represents the $k$-th baseband echo pulse data obtained by the $n$-th equivalent array element ($n = 1, \ldots, N$; $k = 1, 2, \ldots, K$). The received signal $x$ generally consists of the target echo signal $s$, the clutter signal $c$, and the noise signal $n$, that is, $x = s + c + n$. Following the definition in Equation (17), their expressions are, respectively:

where $n_n(t, t_k)$ represents the noise component in the $k$-th baseband echo pulse data obtained by the $n$-th equivalent array element. The noise is assumed to obey a zero-mean Gaussian distribution with variance $\sigma^2$. The space-time sample covariance matrix (ST-SCM) of the range sample under test is:

$$R_{cn} = R_c + \sigma^2 I \tag{21}$$

where $R_c$ is the space-time clutter covariance matrix (ST-CCM) and $I$ is the $NK \times NK$ identity matrix.

Sensors 2021, 21, 3108

Optimum Space-Time Processing
Figure 2 illustrates the structure of STAP. According to Figure 2, the optimum STAP output $Y_{opt}$ is the inner product of the optimum weight vector $w_{opt}$ and the range sample under test:

$$Y_{opt} = w_{opt}^H x \tag{22}$$

where $w_{opt}$ is:

$$w_{opt} = \mu R_{cn}^{-1} S_P \tag{23}$$

Corresponding to Figure 2, the other expression of $w_{opt}$ is:

In Equation (23), $R_{cn}$ is not directly available. The radar detection system can obtain $L$ data vectors that satisfy the IID condition as training samples, namely the sample data in the range cells adjacent to target P. Therefore, the maximum likelihood estimate (MLE) of $R_{cn}$ is:

$$\hat{R}_{cn} = \frac{1}{L} \sum_{l=1}^{L} x_l x_l^H \tag{25}$$

The filter weight vector estimated through the MLE of $R_{cn}$ is:

$$\hat{w} = \mu \hat{R}_{cn}^{-1} S_P \tag{26}$$

where the expression of $\mu$ is:

The adaptive pattern formed by the optimum STAP is $F = w_{opt}^H S$. Therefore, the adaptive pattern (2-D frequency response) can be obtained by calculating the optimum weight vector. Figure 3 shows that, under ideal conditions, the adaptive pattern forms a notch along the clutter and effectively filters the clutter out.
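As a rough numerical illustration of Equation (23), the following Python sketch builds a space-time steering vector, a simple clutter-plus-noise covariance matrix, and the resulting weight vector. The exponential form of the steering vectors, the normalisation $\mu = 1/(S^H R_{cn}^{-1} S)$, the equal-power patch grid, and all function names are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def steering(f, n):
    """Assumed steering vector [1, e^{j2*pi*f}, ..., e^{j2*pi*(n-1)*f}]."""
    return np.exp(2j * np.pi * f * np.arange(n))

def space_time_steering(fs, fd, N, K):
    """Kronecker product of the temporal and spatial steering vectors (NK entries)."""
    return np.kron(steering(fd, K), steering(fs, N))

def clutter_plus_noise_cov(N, K, beta=1, cnr_db=40.0, sigma2=1.0, n_patches=181):
    """Hypothetical clutter model: equal-power patches on the ridge f_d = beta * f_s."""
    fs_grid = np.linspace(-0.5, 0.5, n_patches)
    Sc = np.stack([space_time_steering(fs, beta * fs, N, K) for fs in fs_grid], axis=1)
    patch_power = sigma2 * 10 ** (cnr_db / 10) / n_patches
    return patch_power * (Sc @ Sc.conj().T) + sigma2 * np.eye(N * K)

def stap_weights(R, S):
    """w = mu * R^{-1} S with mu = 1 / (S^H R^{-1} S) (assumed normalisation)."""
    Rinv_S = np.linalg.solve(R, S)
    return Rinv_S / np.vdot(S, Rinv_S)

N, K = 8, 8
R_cn = clutter_plus_noise_cov(N, K)
S_target = space_time_steering(0.0, 0.25, N, K)   # f_s = 0, f_d = 0.25
w_opt = stap_weights(R_cn, S_target)

# Response of the filter towards the target and towards a point on the clutter ridge.
gain_target = abs(np.vdot(w_opt, S_target))
gain_clutter = abs(np.vdot(w_opt, space_time_steering(0.1, 0.1, N, K)))
print(f"target gain {gain_target:.3f}, clutter-ridge gain {gain_clutter:.2e}")
```

With this normalisation the response towards the target is unity, while the response along the assumed clutter ridge is strongly attenuated, which is the notch behaviour described for the adaptive pattern.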
Based on Equations (21), (23), and (25)-(27), the output SCNR obtained with the optimum weight vector and with the estimated weight vector are, respectively:

According to Equations (28) and (29), the output SCNR loss is:

Figure 3. Adaptive pattern (2-D frequency response) of the optimum STAP. Axes: normalized spatial frequency, normalized Doppler frequency, frequency response (dB).

Influence of Outliers in Training Samples on STAP Performance
When outliers are present, the ST-CCM expression of the range samples under test must be obtained. Assuming that the outliers are uncorrelated with the clutter-plus-noise data and using Equation (25), the ST-CCM is:

where $\varepsilon_j$ and $S_{Ij}$ are the complex amplitude and the space-time steering vector of the $j$-th outlier, respectively, and $N_j$ is the total number of outliers. For convenience of analysis, assume that there is only one outlier. The eigendecomposition of $R_{outlier}$ is:

where $\lambda_i$ are the eigenvalues of $R_{outlier}$ and $u_i$ is the eigenvector corresponding to the $i$-th eigenvalue. From the relationship between the signal subspace and the noise subspace [45,46], the following equation holds:

Using Equation (33), we further obtain:

Without loss of generality, assume a high clutter-to-noise ratio (CNR), that is, $\lambda_i \gg \sigma^2$, $i = 1, \ldots, r$. Then Equation (34) can be approximated as:

$$R_{outlier}^{-1} \approx \frac{1}{\sigma^2}\left(I - \sum_{i=1}^{r} u_i u_i^H\right) \tag{35}$$

Equation (35) whitens the input data; in other words, it projects the input data onto the noise subspace. Because the signal subspace and the noise subspace are orthogonal, the filter weight vector forms notches at the outliers to suppress them. However, if an outlier in the training samples lies in the signal subspace, the filter also suppresses the target under detection.
Therefore, suppose the outlier is strongly coherent with the target under detection, that is, the 2-D space-time positions (direction and Doppler) of the outlier in the training samples and of the target are very close. The filter weight vector then forms a deep notch that cancels the target and the outlier simultaneously, which reduces the target detection performance of STAP.
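The high-CNR approximation of the inverse covariance matrix can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the dimension, the number of dominant eigenvalues, their common value $10^4$, and the random orthonormal basis are all hypothetical, and the covariance is modelled as a dominant subspace plus white noise of power $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, r, sigma2 = 16, 5, 1.0   # hypothetical sizes: 16-dimensional data, 5 dominant components

# Random unitary basis; its first r columns play the role of the dominant eigenvectors.
A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
U, _ = np.linalg.qr(A)
U_dom = U[:, :r]

lam = np.full(r, 1e4)         # dominant powers, much larger than sigma2
R = (U_dom * lam) @ U_dom.conj().T + sigma2 * np.eye(dim)

R_inv_exact = np.linalg.inv(R)
# Projection onto the noise subspace, scaled by 1/sigma2.
R_inv_approx = (np.eye(dim) - U_dom @ U_dom.conj().T) / sigma2

err = np.linalg.norm(R_inv_exact - R_inv_approx, 2)
print(f"spectral-norm error of the approximation: {err:.2e}")

# A vector inside the dominant subspace is almost completely removed by the projector.
v = U_dom[:, 0]
residual = abs(np.vdot(v, R_inv_approx @ v))
print(f"residual response to a dominant eigenvector: {residual:.2e}")
```

In this model the approximation error equals the reciprocal of the smallest dominant eigenvalue of the covariance matrix, so it shrinks as the dominant powers grow relative to the noise.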
Considering the influence of the outlier in the training samples on the STAP output, and according to Equation (30), the output SCNR loss is:

Figure 4a,b present the simulated output SCNR loss as a function of outlier power for different normalized Doppler frequencies and normalized spatial frequencies of the outlier. Table 1 provides the radar system simulation parameters.

Table 1. Radar system simulation parameters. Normalized spatial frequency $f_s$: 0. Normalized Doppler frequency:

The simulation results in Figure 4 show that the farther apart the 2-D space-time positions of the outlier and the target under detection are, in other words, the weaker the correlation between them, the smaller the output SCNR loss. Conversely, the closer their 2-D space-time positions are, in other words, the stronger the correlation between them, the greater the output SCNR loss.
Moreover, as the outlier power (the interference power) increases, the output SCNR loss first grows and then remains stable. The reason is that the interference component in the training samples gradually changes from non-significant to significant; in other words, its eigenvalue grows from small to large.
During this transition the target is gradually canceled. Once the interference power is large enough, the target is canceled completely, and from this point on the output SCNR loss remains stable.
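The dependence of the SCNR loss on the outlier position can be reproduced qualitatively with a small Python sketch. Everything below is an assumption made for illustration: the steering-vector form, the equal-power clutter patches on the ridge $f_d = f_s$, the definition of the loss as the SCNR of the contaminated filter divided by the optimum SCNR, and the chosen outlier positions and 30 dB outlier power.

```python
import numpy as np

def st_steering(fs, fd, N, K):
    """Assumed space-time steering vector: temporal (x) spatial Kronecker product."""
    a_t = np.exp(2j * np.pi * fd * np.arange(K))
    a_s = np.exp(2j * np.pi * fs * np.arange(N))
    return np.kron(a_t, a_s)

def clutter_plus_noise_cov(N, K, cnr_db=40.0, sigma2=1.0, n_patches=181):
    """Hypothetical side-looking clutter: equal-power patches on the ridge f_d = f_s."""
    Sc = np.stack([st_steering(f, f, N, K)
                   for f in np.linspace(-0.5, 0.5, n_patches)], axis=1)
    p = sigma2 * 10 ** (cnr_db / 10) / n_patches
    return p * (Sc @ Sc.conj().T) + sigma2 * np.eye(N * K)

def scnr_loss(R_true, R_train, S):
    """SCNR of w = R_train^{-1} S relative to the optimum (1.0 means no loss)."""
    w = np.linalg.solve(R_train, S)
    scnr = abs(np.vdot(w, S)) ** 2 / np.vdot(w, R_true @ w).real
    scnr_opt = np.vdot(S, np.linalg.solve(R_true, S)).real
    return scnr / scnr_opt

N, K = 8, 8
R_cn = clutter_plus_noise_cov(N, K)
S_target = st_steering(0.0, 0.25, N, K)

def loss_with_outlier(fs_o, fd_o, inr_db):
    """Loss when the training covariance contains one outlier at (fs_o, fd_o)."""
    S_o = st_steering(fs_o, fd_o, N, K)
    R_out = R_cn + 10 ** (inr_db / 10) * np.outer(S_o, S_o.conj())
    return scnr_loss(R_cn, R_out, S_target)

loss_near = loss_with_outlier(0.02, 0.25, 30.0)    # outlier close to the target
loss_far = loss_with_outlier(0.30, -0.20, 30.0)    # outlier far from the target
print(f"near outlier: {10 * np.log10(loss_near):.1f} dB, "
      f"far outlier: {10 * np.log10(loss_far):.1f} dB")
```

Under these assumptions, the outlier placed close to the target in the space-time plane causes a much larger loss than the distant one, in line with the trend reported for Figure 4.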

A Sample Selection Method Based on Clutter Subspace
According to the analysis in Section 3, outliers lead to the cancellation of target signals, which causes a significant loss of output SCNR and ultimately degrades radar detection performance. To estimate the ST-CCM of the range sample under test more accurately, appropriate training samples must be selected.
Melvin et al. proposed a method for selecting training samples based on the generalized inner product (GIP), known as the traditional GIP method. This method defines a GIP test statistic and tests each training sample separately. The GIP values of training samples that contain outliers differ significantly from those of samples that do not, so the traditional GIP method can find the training samples with outliers and eliminate them before the ST-CCM is estimated.
However, the performance of the traditional GIP method degrades markedly in a heterogeneous clutter environment, and its computational load is heavy. For this reason, this section starts with the basic principle of the traditional GIP method and then proposes a sample selection method based on the clutter subspace.

Traditional GIP Method
The GIP method proposed by Melvin et al. is one of the most common sample selection methods. Its basic principle is to use the GIP value to identify and remove the training samples whose statistical characteristics differ from those of the range sample under test. First, the following definition is given:

Definition 1. Assume that $X_i$ and $X_j$ have the same dimension and that $R_i$ and $R_j$ are their autocorrelation matrices, respectively. If the following formula holds:

then the statistical distributions of the vectors $X_i$ and $X_j$ are identical or approximately identical; in other words, they are statistically homogeneous. Otherwise, $X_i$ and $X_j$ are heterogeneous.

For the $i$-th training sample $x_i$, the GIP value is:

$$\mathrm{GIP}_i = x_i^H R_{cn}^{-1} x_i \tag{38}$$

According to Equation (38), the physical meaning of the GIP value is the inner product of the training sample vector whitened by the matrix $R_{cn}^{-1/2}$ with itself. The matrix $R_{cn}$ used to compute the GIP value is called the GIP reference matrix. Assuming that the ST-CCM of the $i$-th training sample is $R_{cn,i}$, the expectation of Equation (38) can be further written as:

$$E[\mathrm{GIP}_i] = \mathrm{tr}\left(R_{cn}^{-1} R_{cn,i}\right) \tag{39}$$

According to Equation (39) and Definition 1, if the $i$-th training sample and the range sample under test are statistically homogeneous, then:

$$E[\mathrm{GIP}_i] = \mathrm{tr}(I) = NK \tag{40}$$

Otherwise, the value of (39) deviates from $NK$. The traditional GIP method judges the degree of homogeneity of each training sample by measuring how far its GIP value deviates from $NK$.
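The step from the whitened inner product to the value $NK$ is worth spelling out. The derivation below is a sketch that assumes the training sample $x_i$ is zero-mean with covariance $R_{cn,i}$ and that the GIP value is the quadratic form $x_i^H R_{cn}^{-1} x_i$; the notation $\mathrm{GIP}_i$ is used here only for illustration.

```latex
\begin{aligned}
E[\mathrm{GIP}_i]
  &= E\!\left[x_i^H R_{cn}^{-1} x_i\right]
   = E\!\left[\mathrm{tr}\!\left(R_{cn}^{-1} x_i x_i^H\right)\right] \\
  &= \mathrm{tr}\!\left(R_{cn}^{-1} E\!\left[x_i x_i^H\right]\right)
   = \mathrm{tr}\!\left(R_{cn}^{-1} R_{cn,i}\right), \\
R_{cn,i} = R_{cn}
  \;&\Rightarrow\;
  E[\mathrm{GIP}_i] = \mathrm{tr}\!\left(I_{NK}\right) = NK .
\end{aligned}
```

The first line uses the fact that a scalar equals its own trace together with the cyclic property of the trace; the second line uses the linearity of the expectation.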
In practice, the GIP reference matrix is unknown because the accurate ST-CCM cannot be obtained. Therefore, the traditional GIP method usually uses the training samples on both sides of the range sample under test to estimate the GIP reference matrix.
However, in a heterogeneous clutter environment, the GIP reference matrix obtained in this way is seriously mismatched. The reason is that the degree of heterogeneity varies from one training sample to another and that potential outliers have a shielding effect. Meanwhile, the GIP values of heterogeneous training samples do not deviate clearly from the mean GIP value over the range samples. As a result, unsuitable training samples are easily selected, which reduces the selection performance of the traditional GIP method for heterogeneous training samples.
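A minimal Python sketch of the traditional procedure is given below: the reference matrix is estimated from the training samples themselves, and the GIP value of every sample is then computed with it. The array sizes, the clutter and noise model, the outlier position and amplitude, and the helper names are all hypothetical choices for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, L = 4, 4, 64            # small hypothetical sizes: NK = 16, L training samples
NK = N * K

def st_steering(fs, fd):
    """Assumed space-time steering vector: temporal (x) spatial Kronecker product."""
    return np.kron(np.exp(2j * np.pi * fd * np.arange(K)),
                   np.exp(2j * np.pi * fs * np.arange(N)))

def cgauss(*shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Clutter patches on the ridge f_d = f_s with random amplitudes, plus unit-power noise.
Sc = np.stack([st_steering(f, f) for f in np.linspace(-0.5, 0.5, 61)], axis=1)
X = Sc @ (10.0 * cgauss(Sc.shape[1], L)) + cgauss(NK, L)   # columns are training samples

# Contaminate one training sample with a strong outlier away from the clutter ridge.
outlier_idx = 10
X[:, outlier_idx] += 30.0 * st_steering(0.05, 0.25)

# Traditional GIP: reference matrix estimated from the same training samples.
R_hat = X @ X.conj().T / L
gip = np.einsum("il,il->l", X.conj(), np.linalg.solve(R_hat, X)).real

print(f"mean GIP = {gip.mean():.2f} (NK = {NK}), "
      f"largest GIP = {gip.max():.2f} at sample {int(np.argmax(gip))}")
```

Because every sample also enters `R_hat`, the GIP values computed this way average exactly to $NK$ and none of them can exceed $L$, so a contaminated sample partly hides itself in the reference matrix. This is one way to see why the offsets can be small when the reference matrix is estimated from contaminated data.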

Clutter Subspace Feature-Assisted Sample Selection Method
The traditional GIP method has significant limitations because the heterogeneity of the training samples and the outlier parameters strongly affect its performance. In addition, it needs to invert a matrix, so the amount of computation is large. Considering that the clutter of an airborne radar with a side-looking ULA has a specific distribution in the 2-D space-time plane, this paper proposes a sample selection method based on the clutter subspace. Figure 5 illustrates the flowchart of the proposed method. Equations (39) and (40) and Figure 5 show that the key to the proposed method is to estimate the GIP reference matrix $R_{cn}$ or its inverse $R_{cn}^{-1}$. Instead of estimating $R_{cn}$ from the samples around the range sample under test, as the traditional GIP method does, the proposed method constructs $R_{cn}^{-1}$ off-line, as shown by the yellow box in Figure 5.
According to Equation (34), $R_{cn}^{-1}$ can be constructed from the clutter subspace or the noise subspace:

$$R_{cn}^{-1} \approx \frac{1}{\sigma^2}\left(I - U_c U_c^H\right) = \frac{1}{\sigma^2} U_n U_n^H \tag{41}$$

where $U_c$ and $U_n$ are the actual clutter subspace and noise subspace of the range samples under test, respectively. According to Equation (41), if the clutter or noise subspace of the GIP reference matrix is constructed accurately, $R_{cn}^{-1}$ is easy to obtain. For the clutter received by the radar, the clutter subspace has the following property:

Property 1. In the absence of non-ideal factors, the subspace of the clutter signal received by the radar is independent of the RCS of the CSPs. It depends only on the space-time steering vectors of the CSPs, that is:

$$\mathrm{span}(U_c) = \mathrm{span}(S_c) \tag{42}$$

where $S_c$ is a matrix composed of the space-time steering vectors corresponding to each CSP.
Proof of Property 1. The ST-CCM of the clutter is equivalent to:

$$R_c = S_c \Lambda S_c^H \tag{43}$$

where $\Lambda$ is a diagonal matrix composed of the squared magnitudes of the complex amplitudes of the CSPs. According to the subspace span theorem [39]:

$$\mathrm{span}(R_c) = \mathrm{span}(S_c) \tag{44}$$

The space-time steering vector of a single CSP at $(f_s, f_d)$ is [38]:

$$\left[S(f_s, f_d)\right]_{(k-1)N+n} = e^{j2\pi\left[(n-1) f_s + (k-1) f_d\right]} \tag{45}$$

For a side-looking airborne radar with a ULA, the space-time clutter spectrum is linear and satisfies $f_d = \beta \cdot f_s$. Assuming that $\beta$ is an integer, Equation (45) can be further written as:

$$S(f_s) = E \left[1, e^{j2\pi f_s}, \ldots, e^{j2\pi\left[N + \beta(K-1) - 1\right] f_s}\right]^T \tag{46}$$

In Equation (46), $E$ is an $NK \times [N + \beta(K-1)]$ matrix whose element in the $i$-th row and $j$-th column is:

$$E_{i,j} = \begin{cases} 1, & i = (k-1)N + n,\; j = n + \beta(k-1) \\ 0, & \text{otherwise} \end{cases} \tag{47}$$

where $n = 1, \ldots, N$ and $k = 1, \ldots, K$. That is, the element in row $(k-1)N + n$ and column $n + \beta(k-1)$ is 1 and all other elements are 0, so the columns of $E$ are mutually orthogonal. Combining this with Property 1, every space-time steering vector in $S_c$ can be written as a linear combination of the $N + \beta(K-1)$ orthogonal column vectors of $E$. Therefore, the column space of $E$ is the clutter subspace. To obtain an orthonormal basis, the column vectors of $E$ are normalized to give the matrix $E_c$, so that:

$$\mathrm{span}(U_c) = \mathrm{span}(E_c) \tag{48}$$

According to Equation (48), the column spaces of $U_c$ and $E_c$ are the same clutter subspace. By the properties of subspaces, the column vectors of $U_c$ can be written as linear combinations of the column vectors of $E_c$:

$$U_c = E_c Q \tag{49}$$

Because the column vectors of $U_c$ and of $E_c$ are orthonormal:

$$Q = E_c^H U_c \tag{50}$$

The matrix $Q$ is orthogonal (unitary), so:

$$U_c U_c^H = E_c Q Q^H E_c^H = E_c E_c^H \tag{51}$$

Substituting Equation (51) into (41), the GIP reference matrix is:

$$R_{cn}^{-1} \approx \frac{1}{\sigma^2}\left(I - E_c E_c^H\right) \tag{52}$$

Substituting Equation (52) into (38), the GIP value is:

$$\mathrm{GIP}_i = \frac{1}{\sigma^2}\, x_i^H \left(I - E_c E_c^H\right) x_i \tag{53}$$

According to Equations (40) and (53), when the $i$-th training sample is close to homogeneous with the range samples under test (e.g., it contains no outlier), its GIP value is close to $NK$. Otherwise, the GIP value deviates from $NK$, and the training sample should be eliminated.
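The off-line construction can be sketched in a few lines of Python. The sketch assumes an integer fold coefficient $\beta \le N$, a known noise power $\sigma^2$, a reference matrix of the form $(I - E_c E_c^H)/\sigma^2$, and the exponential steering-vector form used in the helper; the function names and the test vectors are hypothetical.

```python
import numpy as np

def selection_matrix(N, K, beta):
    """0/1 matrix E of size NK x [N + beta*(K-1)].

    With 1-based indices, row (k-1)N+n has its single 1 in column n+beta(k-1).
    """
    E = np.zeros((N * K, N + beta * (K - 1)))
    for k in range(K):
        for n in range(N):
            E[k * N + n, n + beta * k] = 1.0
    return E

def offline_reference_inverse(N, K, beta, sigma2):
    """Assumed off-line reference matrix (I - E_c E_c^H) / sigma2 (requires beta <= N)."""
    E = selection_matrix(N, K, beta)
    E_c = E / np.linalg.norm(E, axis=0)      # normalise every column to unit length
    return (np.eye(N * K) - E_c @ E_c.T) / sigma2, E_c

def gip_values(R_inv, X):
    """GIP value x^H R_inv x for every column x of X."""
    return np.einsum("il,il->l", X.conj(), R_inv @ X).real

def st_steering(fs, fd, N, K):
    """Assumed space-time steering vector: temporal (x) spatial Kronecker product."""
    return np.kron(np.exp(2j * np.pi * fd * np.arange(K)),
                   np.exp(2j * np.pi * fs * np.arange(N)))

N, K, beta, sigma2 = 8, 8, 1, 1.0
R_inv, E_c = offline_reference_inverse(N, K, beta, sigma2)

clutter_patch = 100.0 * st_steering(0.2, beta * 0.2, N, K)   # lies on the clutter ridge
outlier = 10.0 * st_steering(0.05, 0.25, N, K)               # lies off the ridge
X = np.stack([clutter_patch, clutter_patch + outlier], axis=1)
gip = gip_values(R_inv, X)
print(f"clutter only: {gip[0]:.2e}, clutter plus outlier: {gip[1]:.2e}")
```

Since `R_inv` depends only on $N$, $K$, $\beta$, and $\sigma^2$, it can be computed once and reused for every range cell, and no matrix inversion is needed at run time. In this noise-free toy example the clutter patch is removed entirely by the projection, while the off-ridge outlier produces a large GIP value.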
When the GIP reference matrix is constructed, the noise power can be obtained in advance. In addition, the elements of the clutter subspace basis $E_c$ are independent of the pitch angle, so the same $E_c$ applies to all range samples under test. Moreover, $E_c$ can be constructed off-line from the specific distribution of the clutter in the 2-D space-time plane and the radar system parameters of the side-looking ULA.
This approach reduces the amount of computation and avoids an inaccurate estimate of the GIP reference matrix caused by the presence of outliers. The proposed method is more sensitive to outliers, so its performance in eliminating heterogeneous training samples is enhanced.

Simulation Experiment and Analysis
In this section, a simulation experiment is carried out to evaluate the proposed method. The flight altitude of the aircraft is 9000 m, the flight speed is 50 m/s, the wavelength of the transmitted signal is 0.667 m, and the PRF is 300 Hz. The radar antenna is a side-looking ULA with an element spacing of half a wavelength and 8 array elements, and 8 pulses are received in a CPI. The number of training samples used in the simulation experiment is 141. The target is located in the 71st range cell; its normalized Doppler frequency is 0.25, its normalized spatial frequency is 0, and its SNR is 20 dB.
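As a quick check of these parameters, the fold coefficient $\beta = 2V/(d f_r)$ and the resulting clutter subspace dimension $N + \beta(K-1)$ can be computed directly. The short Python sketch below uses only the values stated above; rounding $\beta$ to the nearest integer is an assumption made here so that the integer-$\beta$ construction applies.

```python
# Simulation parameters quoted in the text.
wavelength = 0.667      # m
velocity = 50.0         # m/s
prf = 300.0             # Hz
n_elements = 8          # N
n_pulses = 8            # K

spacing = wavelength / 2                      # half-wavelength element spacing
beta = 2 * velocity / (spacing * prf)         # fold coefficient
beta_int = round(beta)                        # assumed rounding to an integer
clutter_rank = n_elements + beta_int * (n_pulses - 1)
full_dim = n_elements * n_pulses

print(f"beta = {beta:.4f}, clutter subspace dimension = {clutter_rank} of {full_dim}")
```

With these values the fold coefficient is very close to 1, so the clutter occupies a subspace of dimension 15 within the 64-dimensional space-time data space.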
Furthermore, the CNR is assumed to be 40 dB, and five outliers acting as interference are added randomly on both sides of the range cell under test. The normalized Doppler frequency of each interference equals that of the target. The direction of each interference is random but within the main lobe. Table 2 lists the range cell and the interference-to-noise ratio (INR) of each interference.
Figure 6a shows the GIP values obtained with the traditional method, and Figure 6b shows the GIP values obtained with the proposed method. Figure 6 shows that the traditional GIP method is not robust in detecting interference: the GIP values of most range cells that contain interference show no apparent offset from those of the cells that do not. Only the range cells that contain the few strong interferences show a larger offset than the others. Therefore, the detection threshold cannot be set low; otherwise, more homogeneous samples are eliminated and the detection performance is reduced.
By comparison, the method proposed in this paper detects the contaminated training samples well. The GIP values of all the range cells that contain interference have a large offset relative to the other range cells, and they reflect both the weak and the strong interference. Consequently, the detection threshold of the proposed method can be set high, which makes it unlikely that homogeneous samples are eliminated.
Figure 7a shows that, without sample selection, no target can be detected. Figure 7b shows that, with the traditional GIP method, the position of the maximum output power of the STAP filter coincides with that of the target. However, the output power of the STAP filter is also very high in most of the range cells, so the interference is not detected and eliminated effectively.
Moreover, the STAP output power at the target position shows an SNR loss (relative to the set SNR = 20 dB). Therefore, the target detection performance cannot be guaranteed.
By comparison, in Figure 7c, the maximum output power of the STAP filter obtained with the proposed method is located in the range cell that contains the target, and there is almost no SNR loss. Furthermore, the STAP filter clearly suppresses the output power in the other range cells. This demonstrates that the proposed method can effectively detect and eliminate interference and thus ensure STAP performance.
Figure 8a-e presents the 2-D space-time frequency responses obtained with five methods, which illustrate the advantages of the proposed method more intuitively. Figure 8f further presents a quantitative comparison of the output SCNR of the STAP filters obtained with the different methods.
Figure 8a shows the theoretically optimum STAP filter. The target response is 0 dB, and there is a deep notch along the clutter. Figure 8b shows the STAP filter obtained with the sample matrix inverse (SMI) method when there is no outlier. The target response is approximately 0 dB, and there is a deep notch along the clutter, although not as deep as in Figure 8a. Figure 8c shows the STAP filter obtained without any sample selection method when there are outliers. The target is prominently canceled, and the response is less than 0 dB. Figure 8d shows the STAP filter obtained with the traditional GIP method when there are outliers. The target is slightly canceled, and the response is less than 0 dB. Figure 8e shows the STAP filter obtained with the proposed method when there are outliers. The target response is 0 dB, and there is a deep notch along the clutter, almost the same as in Figure 8b.

The proposed method does not cancel the target, and it performs the same as in the case without outliers. It loses only 3 dB compared with the theoretically optimum STAP filter, which agrees with the theoretical analysis. These results indicate that the proposed method has good detection performance.

Conclusions
The heterogeneous clutter environment faced by airborne radar may introduce outliers into the training samples. The training samples and the range samples under test then no longer satisfy the IID condition, which makes the computed weight vector inaccurate. Moreover, this causes the cancellation of the targets under detection and reduces the detection performance. Therefore, the training samples that contain outliers must be eliminated before the space-time correlation matrix is estimated. This paper analyzed the effects of outliers on STAP performance and then introduced the traditional GIP method. Considering the specific distribution in the 2-D space-time plane of the clutter received by an airborne radar with a side-looking ULA, this paper proposed a sample selection method based on the clutter subspace. The proposed method addresses two problems of the traditional GIP method: the heavy computation and the influence of outlier parameters when the GIP reference matrix is constructed.
This method can construct the GIP reference matrix off-line and requires little computation. Moreover, it is sensitive to outliers and performs well in sample selection. Theoretical analysis and simulation experiments support the effectiveness of the proposed algorithm.

Acknowledgments: We would like to thank Zhanye Chen and Hui Li for participating in the research work. Zhanye Chen discussed many critical issues in the paper, participated in the simulation and demonstration, and provided part of the literature and reference materials. Hui Li wrote part of the simulation program for this paper, carried out data analysis, and proofread the paper. We thank them for their contributions to this research.