A Nonlinear Adaptive Beamforming Algorithm Based on Least Squares Support Vector Regression

To overcome the performance degradation caused by steering vector mismatches, strict restrictions on the number of available snapshots, and numerous interferences, a novel beamforming approach based on the nonlinear least-squares support vector regression machine (LS-SVR) is derived in this paper. In this approach, the conventional linearly constrained minimum variance cost function used by the minimum variance distortionless response (MVDR) beamformer is replaced by a squared-loss function to increase robustness in complex scenarios and provide additional control over the sidelobe level. Gaussian kernels are also used to obtain better generalization capacity. The approach has two highlights: a recursive regression procedure that estimates the weight vectors in real time, and a sparse model with a novelty criterion that reduces the final size of the beamformer. Analysis and simulation tests show that the proposed approach offers better noise suppression capability and achieves near-optimal signal-to-interference-plus-noise ratio (SINR) with a low computational burden, compared with other recently proposed robust beamforming techniques.


Introduction
As an important branch of modern array signal processing, beamforming has been widely studied and applied in radar, wireless communication, sonar, medical imaging, and astronomy. Standard beamforming approaches, such as the minimum variance distortionless response (MVDR) beamformer [1], are usually derived for an ideal antenna array with an exactly known array manifold. They are therefore very sensitive to practical conditions, and their performance is seriously degraded by factors such as steering vector mismatch, array calibration errors and a restricted number of snapshots.
Over the last decades, robust beamforming approaches have been studied extensively in order to resist model mismatches and possible environment changes [2][3][4][5]. Among them, the diagonal loading (DL) algorithm introduces a penalty term into the objective function, which effectively reduces the eigenvalue spread of the noise and prevents distortion of the beampattern [6]. Nevertheless, choosing the optimal loading factor for DL remains a serious issue when the desired steering vector and/or the number of available snapshots is uncertain [7]. Robust adaptive beamforming based on worst-case performance optimization delimits the uncertainty set of steering vectors by upper bounding the norm of the steering vector mismatch [8]. However, neither the mismatch vector nor its upper bound is known in practice. To overcome this model defect of the standard DL algorithm, an adaptive beamforming method was developed that iteratively estimates the difference between the actual and presumed steering vectors in order to maximize the output signal-to-interference-plus-noise ratio (SINR) [9][10][11]. This method, however, is not sufficiently reliable when the number of snapshots is small.
In order to cope with jamming signals, poor array calibration and signal wave-front distortions, the MVDR beamformer has been modified by incorporating multiple linear constraints [12][13][14]. However, adding constraints reduces the array's degrees of freedom in the linear beamforming framework. Nonlinear beamforming approaches offer a way around this issue, because they can adapt better to the statistical properties of the given data than linear ones [15]. Neural networks have been applied to beamforming, among other nonlinear array processing tasks, but they suffer from serious drawbacks such as over-fitting and local minima, which lead to suboptimal solutions [16]. The support vector machine (SVM), introduced by Vapnik [17], is an important methodology for pattern classification and nonlinear function approximation. It addresses the beamforming problem by incorporating additional inequality constraints that penalize sidelobe levels while allowing a certain error in the desired signal direction [18]. The MVDR beamforming problem is thus reformulated, and the cost function turns out to be equivalent to that of SVM regression. However, the time needed to train an SVM beamformer scales superlinearly with the number of observations, which leads to a prohibitive computational burden in online operation [19]. The least-squares support vector machine (LS-SVM) inherits the SVM's generalization capacity. By solving a set of linear equations instead of the quadratic programming (QP) problem of the standard SVM, it considerably simplifies training and reduces the computational complexity [20]. The main drawback of the LS-SVM is that it works in batch mode, which makes it difficult to use in large-scale applications. Recent research on the LS-SVM has focused on improving the training algorithms, model selection and sparseness [21,22].
This paper presents a new LS-SVR-based approach to robust beamforming. The approach alleviates the degradation of the array output SINR in the presence of steering vector mismatches, strict restrictions on the number of available snapshots, and numerous interferences by replacing the conventional linearly constrained minimum variance cost function with a squared-loss function, and it achieves better generalization capacity by applying Gaussian kernels to the array observations. We also present a fast recursive procedure that estimates the weight vectors in real time, and a novelty criterion that performs model reduction. The paper is organized as follows. The signal model and the minimum mean square error (MMSE) and MVDR beamformer solutions are presented in Section 2. The basic principle of the LS-SVR-based beamforming method is introduced in Section 3. A recursive procedure for calculating the regression parameters is provided in Section 4, and a sparse model is presented in Section 5. Simulation tests under different mismatch scenarios are reported in Section 6. Conclusions are given at the end of the paper.

Sensor Signal Model
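Consider a linear array of M sensors that receives signals from D narrowband sources. The vector of array observations at time t can be modeled as:
$$\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t) + \mathbf{n}(t),$$
where $\mathbf{A} = [\mathbf{a}(\theta_1), \ldots, \mathbf{a}(\theta_D)]$ is the M × D matrix of steering vectors, $\mathbf{s}(t)$ is the vector of source signals, and $\mathbf{n}(t)$ is the sensor noise, which is assumed to be zero-mean complex Gaussian:
$$\mathbf{n}(t) \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I}).$$
The output of the beamformer is defined as:
$$y(t) = \mathbf{w}^H \mathbf{x}(t),$$
where $\mathbf{w} = [w_1, \ldots, w_M]^T$. If certain observations are known while the parameters are trained, then, according to the MMSE criterion, the complex vector of beamformer weights $\mathbf{w}$ is given by:
$$\mathbf{w}_{\mathrm{MMSE}} = \mathbf{R}^{-1}\mathbf{p},$$
where $\mathbf{R}$ is the M × M covariance matrix of the received signal, and $\mathbf{p}$ is the cross-correlation between the desired output and the received signal. The classical MVDR beamformer minimizes the array output energy subject to a unity array response in the direction of the desired steering vector, that is:
$$\min_{\mathbf{w}} \ \mathbf{w}^H \mathbf{R}\,\mathbf{w} \quad \text{subject to} \quad \mathbf{w}^H \mathbf{a}(\theta_1) = 1. \qquad (5)$$
The constraint $\mathbf{w}^H \mathbf{a}(\theta_1) = 1$ prevents the gain in the look direction from being reduced, and the solution of Equation (5) is easily obtained with the Lagrange multiplier method:
$$\mathbf{w}_{\mathrm{MVDR}} = \frac{\mathbf{R}^{-1}\mathbf{a}(\theta_1)}{\mathbf{a}^H(\theta_1)\,\mathbf{R}^{-1}\mathbf{a}(\theta_1)}.$$
In practice, the exact covariance matrix $\mathbf{R}$ is not available, and it is estimated by the sample covariance matrix $\hat{\mathbf{R}} = \frac{1}{K}\sum_{k=1}^{K} \mathbf{x}(k)\,\mathbf{x}^H(k)$, where K is the number of observed snapshots.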
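The MVDR solution with a sample covariance matrix can be illustrated with a minimal NumPy sketch. The half-wavelength uniform linear array, the function names, the optional `loading` argument and the signal-free training data are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def steering_vector(theta_deg, num_sensors, spacing=0.5):
    """ULA steering vector; `spacing` is in wavelengths (0.5 is assumed here)."""
    n = np.arange(num_sensors)
    return np.exp(2j * np.pi * spacing * n * np.sin(np.deg2rad(theta_deg)))

def sample_covariance(X):
    """X has shape (M, K): one column per snapshot."""
    return X @ X.conj().T / X.shape[1]

def mvdr_weights(R, a, loading=0.0):
    """w = R^{-1} a / (a^H R^{-1} a); `loading` optionally adds diagonal loading."""
    R = R + loading * np.eye(R.shape[0])
    Ri_a = np.linalg.solve(R, a)          # R^{-1} a without forming the inverse
    return Ri_a / (a.conj() @ Ri_a)

rng = np.random.default_rng(0)
M, K = 10, 200
a_sig = steering_vector(3.0, M)           # presumed look direction
a_int = steering_vector(-32.0, M)         # one interferer

# Hypothetical training data: a 20 dB interferer plus unit-power white noise.
s_int = 10.0 * (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
noise = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
X = np.outer(a_int, s_int) + noise

w = mvdr_weights(sample_covariance(X), a_sig)
gain_look = abs(w.conj() @ a_sig)         # distortionless constraint: equals 1
gain_int = abs(w.conj() @ a_int)          # response toward the interferer
```

By construction the look-direction gain is one, while the response toward the interferer is strongly attenuated. When the presumed steering vector differs from the actual one, the same mechanism can null the desired signal, which is the sensitivity discussed in this section.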
The performance of the MVDR beamformer in Equation (5) is sensitive to mismatch between the presumed and actual steering vectors caused by uncertainty in the DOA of the desired signal. It also degrades under strict restrictions on the number of available snapshots and in the presence of numerous interferences.

Nonlinear SVM-Based Beamforming
Consider a set of snapshots $\mathbf{x}_i$, $i = 1, \ldots, N$, received from an array at time t, together with the corresponding set of desired symbols $y_i$, $i = 1, \ldots, N$, which are available for training. The basic idea of nonlinear beamforming is to transform the data set $\mathbf{x}_i$, $i = 1, \ldots, N$, into a higher (possibly infinite) dimensional feature space H by a nonlinear transformation $\varphi(\cdot)$, so that the beamformer's output can be formulated as a linear regression in H:
$$y_i = \mathbf{w}^T \varphi(\mathbf{x}_i) + b + e_i,$$
where $\mathbf{w} \in H$ is the linear parameter set, $b$ is a bias term and $e_i$ is the output error. The parameter set $\mathbf{w}$ is estimated by minimizing a certain cost function of the output error $e_i$. For SVM regression, $\mathbf{w}$ is estimated under the ε-insensitive loss function by the minimum risk criterion, i.e.,
$$\min_{\mathbf{w}, b, \xi, \zeta} \ \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} (\xi_i + \zeta_i) \qquad (8)$$
subject to $y_i - \mathbf{w}^T \varphi(\mathbf{x}_i) - b \le \varepsilon + \xi_i$, $\mathbf{w}^T \varphi(\mathbf{x}_i) + b - y_i \le \varepsilon + \zeta_i$ and $\xi_i, \zeta_i \ge 0$, where C ≥ 0 is the tradeoff term between the minimization of the weight norm and the output error. The ε-insensitive loss function is given by:
$$L_\varepsilon(e) = \begin{cases} 0, & |e| \le \varepsilon, \\ |e| - \varepsilon, & \text{otherwise}, \end{cases}$$
where ε is a positive parameter that is used as an error threshold. Solving Equation (8) regularizes the weight vector $\mathbf{w}$, which markedly improves the generalization capacity of the beamformer.

Nonlinear LS-SVR Beamforming
In LS-SVR, the inequality constraints of the standard SVM algorithm are replaced by equality constraints, and the ε-insensitive loss function, which is linear in the error, is replaced by a quadratic one. The LS-SVR beamformer can therefore be described as the following quadratic optimization problem [20]:
$$\min_{\mathbf{w}_t, b_t, e} \ \frac{1}{2}\|\mathbf{w}_t\|^2 + \frac{C}{2}\sum_{i=1}^{N} (e_i^t)^2 \quad \text{subject to} \quad y_i = \mathbf{w}_t^T \varphi(\mathbf{x}_i) + b_t + e_i^t, \quad i = 1, \ldots, N, \qquad (10)$$
where $e_i^t$ is the error at time t. The sum of squared errors in Equation (10) takes the place of the ε-insensitive loss, and the constraints are linear equalities. This greatly reduces the computational complexity, since only a set of linear equations has to be solved instead of the QP problem of the SVM.
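The difference between the two loss functions can be made concrete with a small sketch. The function names and the sample error values are hypothetical and serve only as an illustration.

```python
import numpy as np

def eps_insensitive_loss(e, eps):
    """SVM regression loss: zero inside the tube |e| <= eps, linear outside it."""
    return np.maximum(np.abs(e) - eps, 0.0)

def squared_loss(e):
    """LS-SVR loss: every nonzero error contributes quadratically."""
    return np.abs(e) ** 2

errors = np.array([0.05, -0.3, 1.0])           # hypothetical output errors
tube = eps_insensitive_loss(errors, eps=0.1)   # errors inside the tube cost nothing
quad = squared_loss(errors)                    # all errors are penalized
```

Because the squared loss penalizes every sample, all training samples generally contribute to the LS-SVR solution. The price of the simpler linear system is therefore a loss of sparseness, which has to be restored separately.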
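The array observations of the beamformer are complex, whereas the variables in the objective function of the SVM are real, so the complex variables have to be rewritten as real ones. For this reason, the array observations $\mathbf{x}_i$, the beamformer outputs $y_i$ and the weight vector $\mathbf{w}_t$ are rewritten in terms of their real and imaginary parts. The solution of the quadratic optimization problem of Equation (10) is the saddle point of the following Lagrange function:
$$L(\mathbf{w}_t, b_t, e, \boldsymbol{\alpha}_t) = \frac{1}{2}\|\mathbf{w}_t\|^2 + \frac{C}{2}\sum_{i=1}^{N} (e_i^t)^2 - \sum_{i=1}^{N} \alpha_i^t \left[ \mathbf{w}_t^T \varphi(\mathbf{x}_i) + b_t + e_i^t - y_i \right],$$
where $\boldsymbol{\alpha}_t = [\alpha_1^t, \ldots, \alpha_N^t]^T$ is the vector of Lagrange multipliers, referred to as the regression parameters in this paper. According to the Karush-Kuhn-Tucker (KKT) conditions, setting the derivatives of the above function with respect to $\mathbf{w}_t$, $b_t$, $e_i^t$ and the Lagrange multipliers $\alpha_i^t$ to zero gives:
$$\mathbf{w}_t = \sum_{i=1}^{N} \alpha_i^t \varphi(\mathbf{x}_i), \qquad \sum_{i=1}^{N} \alpha_i^t = 0, \qquad \alpha_i^t = C e_i^t, \qquad \mathbf{w}_t^T \varphi(\mathbf{x}_i) + b_t + e_i^t - y_i = 0.$$
The system obtained from the KKT conditions is linear. After $\mathbf{w}_t$ and $e_i^t$ are eliminated, its solution is obtained from the following matrix equation:
$$\begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & \mathbf{Q}_t + C^{-1}\mathbf{I} \end{bmatrix} \begin{bmatrix} b_t \\ \boldsymbol{\alpha}_t \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y}_t \end{bmatrix}, \qquad (14)$$
where $\mathbf{y}_t = [y_1, \ldots, y_N]^T$, $\mathbf{1}$ is the all-ones vector, and $\mathbf{Q}_t$ is the matrix with entries $[\mathbf{Q}_t]_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) = \varphi(\mathbf{x}_i)^T \varphi(\mathbf{x}_j)$. Here $k(\cdot, \cdot)$ denotes the kernel function responsible for the nonlinear mapping $\varphi(\cdot)$, which greatly simplifies the inner product calculation in the feature space.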
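Thus, linear methods can be applied to the transformed data without performing any computation in the high-dimensional feature space. The Gaussian kernel, the most widely used kernel function in practical applications, is adopted here:
$$k(\mathbf{x}_i, \mathbf{x}_j) = \exp\left( -\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{2\sigma^2} \right),$$
where σ > 0 is the kernel radius. The outputs of the nonlinear LS-SVR beamformer are:
$$y(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i^t \, k(\mathbf{x}_i, \mathbf{x}) + b_t. \qquad (16)$$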
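A batch LS-SVR fit can be sketched in a few lines. The sketch assumes the standard bordered LS-SVM linear system with the matrix Q + I/C, a Gaussian kernel of the form exp(−‖a − b‖²/(2σ²)), and real-valued inputs (complex snapshots are assumed to have been stacked into real vectors beforehand). The function names and the toy regression data are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Kernel matrix with entries exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = (np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, C, sigma):
    """Solve [[0, 1^T], [1, Q + I/C]] [b; alpha] = [0; y] for alpha and b."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = gaussian_kernel(X, X, sigma) + np.eye(n) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]

def lssvr_predict(X_train, alpha, b, X_new, sigma):
    """Evaluate sum_i alpha_i k(x_i, x) + b for each row of X_new."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha + b

# Toy one-dimensional regression problem (not array data).
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(60)

alpha, b = lssvr_fit(X, y, C=100.0, sigma=1.0)
X_test = np.linspace(-2.5, 2.5, 50)[:, None]
y_fit = lssvr_predict(X, alpha, b, X_test, sigma=1.0)
```

Only one linear solve is needed, but the system grows with the number of training samples, which motivates the recursive and sparse variants.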

Recursive Algorithms
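From Equation (16), once the regression parameters $\boldsymbol{\alpha}_t$ and $b_t$ are computed, the beamformer outputs can be obtained. Denoting $\mathbf{H}_t = \mathbf{Q}_t + C^{-1}\mathbf{I}$ and $\mathbf{U}_t = \mathbf{H}_t^{-1}$, where $\mathbf{Q}_t$ is the Gram matrix with entries $k(\mathbf{x}_i, \mathbf{x}_j)$, the linear system (Equation (14)) can be represented as:
$$\begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & \mathbf{H}_t \end{bmatrix} \begin{bmatrix} b_t \\ \boldsymbol{\alpha}_t \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y}_t \end{bmatrix}.$$
Then, we have:
$$b_t = \frac{\mathbf{1}^T \mathbf{U}_t \mathbf{y}_t}{\mathbf{1}^T \mathbf{U}_t \mathbf{1}}, \qquad \boldsymbol{\alpha}_t = \mathbf{U}_t (\mathbf{y}_t - \mathbf{1} b_t).$$
As the number of snapshots increases, the dimension of the Gram matrix $\mathbf{Q}_t$ grows in proportion to the number of snapshots. The computation of the regression parameters $\boldsymbol{\alpha}_t$ and $b_t$ therefore becomes very intensive, and a fast algorithm for computing $\mathbf{U}_t$ efficiently is a key issue for the LS-SVR beamformer.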
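At time step t, $\mathbf{Q}_t$ and $\mathbf{H}_t$ are matrices of dimension 2N × 2N. At time t + 1, a new input snapshot $\mathbf{x}_{t+1}$ and the corresponding desired array output $y_{t+1}$ are added to the current training set, and $\mathbf{Q}_{t+1}$ and $\mathbf{H}_{t+1}$ are formed in the same way from the enlarged training set. Comparing the elements of $\mathbf{H}_t$ and $\mathbf{H}_{t+1}$, the matrix $\mathbf{H}_{t+1}$ can be constructed from $\mathbf{H}_t$ plus an additional row and column, i.e.,
$$\mathbf{H}_{t+1} = \begin{bmatrix} \mathbf{H}_t & \mathbf{v}_{t+1} \\ \mathbf{v}_{t+1}^T & h_{t+1} \end{bmatrix},$$
where $\mathbf{v}_{t+1}$ collects the kernel values between $\mathbf{x}_{t+1}$ and the snapshots already in the training set, and $h_{t+1} = k(\mathbf{x}_{t+1}, \mathbf{x}_{t+1}) + C^{-1}$. According to the block matrix inversion theorem, the inverse of $\mathbf{H}_{t+1}$ can be expressed through the inverse of $\mathbf{H}_t$ and the new column $\mathbf{v}_{t+1}$ as:
$$\mathbf{H}_{t+1}^{-1} = \begin{bmatrix} \mathbf{U}_t + \rho_{t+1}^{-1}\, \mathbf{U}_t \mathbf{v}_{t+1} \mathbf{v}_{t+1}^T \mathbf{U}_t & -\rho_{t+1}^{-1}\, \mathbf{U}_t \mathbf{v}_{t+1} \\ -\rho_{t+1}^{-1}\, \mathbf{v}_{t+1}^T \mathbf{U}_t & \rho_{t+1}^{-1} \end{bmatrix},$$
where $\rho_{t+1} = h_{t+1} - \mathbf{v}_{t+1}^T \mathbf{U}_t \mathbf{v}_{t+1}$ and $\mathbf{U}_t = \mathbf{H}_t^{-1}$. Thus the inverse of $\mathbf{H}_{t+1}$, which is equal to $\mathbf{U}_{t+1}$, can be calculated from the inverse of $\mathbf{H}_t$, and $\mathbf{H}_{t+1}$ does not have to be inverted directly when its dimension is high. The computational complexity is greatly reduced, and the numerical stability problems that arise from direct matrix inversion are also avoided. When the set of snapshots is small, $\mathbf{U}_t$ can be computed directly by matrix inversion.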
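The recursive update can be sketched with the standard block (Schur complement) inversion identity for a symmetric matrix that grows by one row and column. The function names, the value C = 10 and the random test data are assumptions for illustration; the closed-form expressions for the bias and the multipliers follow from the bordered linear system.

```python
import numpy as np

def grow_inverse(U, v, h):
    """Given U = H^{-1} (H symmetric), return the inverse of [[H, v], [v^T, h]]."""
    Uv = U @ v
    rho = h - v @ Uv                       # scalar Schur complement
    n = U.shape[0]
    U_new = np.empty((n + 1, n + 1))
    U_new[:n, :n] = U + np.outer(Uv, Uv) / rho
    U_new[:n, n] = -Uv / rho
    U_new[n, :n] = -Uv / rho
    U_new[n, n] = 1.0 / rho
    return U_new

def regression_parameters(U, y):
    """b = 1^T U y / (1^T U 1) and alpha = U (y - b 1)."""
    ones = np.ones(len(y))
    b = (ones @ U @ y) / (ones @ U @ ones)
    alpha = U @ (y - b)
    return alpha, b

# Hypothetical check: H = Gaussian Gram matrix (sigma = 1) + I / C with C = 10.
rng = np.random.default_rng(2)
pts = rng.standard_normal((6, 4))
d2 = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=2)
H = np.exp(-d2 / 2.0) + np.eye(6) / 10.0
y = rng.standard_normal(6)

U = np.linalg.inv(H[:5, :5])               # inverse for the first five samples
U_next = grow_inverse(U, H[:5, 5], H[5, 5])
max_dev = np.max(np.abs(U_next - np.linalg.inv(H)))
alpha, b = regression_parameters(U_next, y)
```

Each update costs a matrix-vector product and an outer product, that is, O(n²) operations for n stored samples, instead of the O(n³) of a fresh inversion.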

Sparsification
The crucial drawback of the LS-SVR beamformer is that it works with a high-dimensional matrix whose dimension equals the number of snapshots, a consequence of the quadratic loss function. This poses a serious implementation problem for the proposed beamforming method, since the required memory and computational resources grow as time evolves. Several methods have been proposed to cope with this problem [23,24]. The sliding-window approach [25] fixes the size of the LS-SVR beamformer and allows it to operate online in time-varying environments by keeping only the last N input snapshots in the window and discarding those that fall outside it. In [26], an exponential forgetting mechanism is introduced to describe the influence of past data on the present situation. This paper employs the novelty criterion presented by Platt [27,28] to reduce the final size of the proposed beamformer, keep the algorithm complexity bounded and realize online sparsification. The basic idea of this approach is to construct a dictionary with center set C and to update it according to the novelty criterion. The stages of the proposed sparsification are as follows:
Step 1: Initialize an empty center set $C_0$.
Step 2: Calculate the distance between the new snapshot $\mathbf{x}_t$ and the present dictionary, $d_t = \min_{\mathbf{c}_j \in C_t} \|\mathbf{x}_t - \mathbf{c}_j\|$.
Step 3: If the distance obtained in Step 2 is smaller than the preset threshold δ 1 , $\mathbf{x}_t$ is not added to the dictionary; otherwise the prediction error $e_t = y_t - \hat{y}_t$ is calculated.
Step 4: If $|e_t|$ is larger than a second preset threshold δ 2 , $\mathbf{x}_t$ is accepted as a new center and $C_t$ is updated to $C_{t+1}$; otherwise return to Step 2 with the next snapshot.
Increasing δ 1 and δ 2 decreases the final size of the LS-SVR beamformer, but it also degrades performance. In practical applications, δ 1 is set to around one tenth of the kernel bandwidth, and δ 2 to around the square root of the steady-state mean square error (MSE). Cross-validation can also be used to select appropriate thresholds.
With the above sparsification procedure, the computational complexity of the proposed beamformer is reduced from $O(N^2)$ to $O(K^2)$, where K is the effective number of centers in the network at time t. Since K remains bounded, online real-time beamforming becomes practical.
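The four steps of the novelty criterion can be sketched as a single update function. The function name, the treatment of an empty dictionary (the distance test is skipped) and the sample inputs are assumptions; the predictor that supplies `y_hat` is left outside the sketch.

```python
import numpy as np

def novelty_update(centers, x, y, y_hat, delta1, delta2):
    """Append x to `centers` and return True only if both novelty tests pass."""
    if centers:
        dist = min(np.linalg.norm(x - c) for c in centers)
        if dist < delta1:                  # Step 3: too close to an existing center
            return False
    if abs(y - y_hat) <= delta2:           # Step 4: prediction is already good enough
        return False
    centers.append(x)
    return True

centers = []                               # Step 1: empty center set
delta1, delta2 = 0.1, 0.08                 # threshold values as in the simulations
added = [
    novelty_update(centers, np.array([0.0, 0.0]), 1.0, 0.0, delta1, delta2),
    novelty_update(centers, np.array([0.05, 0.0]), 1.0, 0.0, delta1, delta2),
    novelty_update(centers, np.array([1.0, 0.0]), 0.05, 0.0, delta1, delta2),
    novelty_update(centers, np.array([1.0, 0.0]), 1.0, 0.0, delta1, delta2),
]
```

In this example the second sample is rejected by the distance test and the third by the error test, so only two of the four samples become centers.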

Simulation Tests
To evaluate the performance of the proposed LS-SVR-based beamformer, simulation tests are carried out. A 10-element uniform linear array with half-wavelength spacing is considered. The desired signal arrives from the presumed direction θ 1 = 3°, and two uncorrelated interferences, each with an interference-to-noise ratio (INR) of 20 dB, impinge on the array from θ 2 = −32° and θ 3 = 17°, respectively. The additive noise is modeled as complex white Gaussian noise with a power of 0 dB. For comparison, the conventional MVDR, the diagonally loaded MVDR (MVDR-DL), the ES [29], the SQP [9] and the RR [30] methods are considered. The parameters of the proposed beamformer, σ, δ 1 and δ 2 , are chosen as 1.0, 0.1 and 0.08, respectively. The loading value of the MVDR-DL beamformer is set to (P e + 10 dB), where P e denotes the power of the desired signal. All results are obtained from 100 independent simulation runs.
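The scenario and the output SINR metric can be sketched as follows. The steering vector convention, the function names, the input SNR of 0 dB and the number of snapshots K = 100 are assumptions for illustration. The sketch uses the exact interference-plus-noise covariance and no steering mismatch, so it reproduces only the ideal reference beamformer, not the paper's experiments.

```python
import numpy as np

def steering_vector(theta_deg, num_sensors, spacing=0.5):
    """ULA steering vector; `spacing` is in wavelengths."""
    n = np.arange(num_sensors)
    return np.exp(2j * np.pi * spacing * n * np.sin(np.deg2rad(theta_deg)))

def complex_noise(rng, shape, power=1.0):
    """Circular complex Gaussian samples with the given power."""
    return np.sqrt(power / 2) * (rng.standard_normal(shape)
                                 + 1j * rng.standard_normal(shape))

def output_sinr_db(w, a_sig, sig_power, R_in):
    """SINR = sig_power |w^H a|^2 / (w^H R_in w); R_in is interference plus noise."""
    num = sig_power * abs(np.vdot(w, a_sig)) ** 2
    den = np.real(np.vdot(w, R_in @ w))
    return 10 * np.log10(num / den)

rng = np.random.default_rng(0)
M, K = 10, 100
a1 = steering_vector(3.0, M)
interferers = [steering_vector(th, M) for th in (-32.0, 17.0)]
inr = 10 ** (20 / 10)                      # 20 dB INR
snr = 10 ** (0 / 10)                       # hypothetical 0 dB input SNR

# Snapshots: desired signal + two interferers + unit-power white noise.
X = np.outer(a1, complex_noise(rng, K, snr)) + complex_noise(rng, (M, K))
for a in interferers:
    X += np.outer(a, complex_noise(rng, K, inr))

# Exact interference-plus-noise covariance and the ideal beamformer.
R_in = np.eye(M, dtype=complex)
for a in interferers:
    R_in += inr * np.outer(a, a.conj())
w_opt = np.linalg.solve(R_in, a1)
sinr_opt = output_sinr_db(w_opt, a1, snr, R_in)
```

For the ideal weights the output SINR reduces to the signal power times a^H R^{-1} a, which cannot exceed M times the input SNR. This is the upper curve against which robust beamformers are usually compared.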
The first simulation compares the performance of these beamformers when steering vector mismatch is present. From Figure 1(a), we observe that the output SINR of the proposed LS-SVR beamformer improves consistently as the SNR increases and stays much closer to that of the ideal beamformer as the input SNR varies from −20 dB to 30 dB. Because of the DOA mismatch, the MVDR beamformer treats the signal of interest as interference and places a null in the desired signal direction. As a result, its output SINR decreases, and it degrades seriously when the input SNR is larger than −5 dB. Compared with the MVDR beamformer, the MVDR-DL, ES, SQP and RR methods are more robust against DOA mismatch, but they still suffer from performance degradation as the input SNR becomes higher. Figure 1(b) shows the normalized beampatterns when the input SNR is 10 dB. All beampatterns of the robust beamformers have nulls at the DOAs of the interferences, but the proposed LS-SVR beamformer still outperforms the others, with a markedly lower sidelobe level while maintaining a distortionless response for the desired signal.
The covariance matrix is estimated inaccurately when snapshots are insufficient, when the DOA of the desired signal is mismatched, or when array calibration errors are present. This inaccuracy may degrade the array response. Hence, both insufficient snapshots and DOA mismatch are considered in the second simulation to verify the proposed beamformer. Figure 2 shows the resulting output SINRs versus the number of snapshots K. When more than 20 snapshots are available, the LS-SVR beamformer clearly outperforms the other beamformers tested. Owing to the steering vector mismatch, the MVDR beamformer treats the desired signal as interference and fails.
The third test demonstrates the performance of the proposed beamformer in a scenario with multiple interferences, again with steering vector mismatch. As can be seen from Figure 3(a), the proposed algorithm performs as well as ES and SQP when the number of interferences is less than 5. When the number of interferences is increased to 8, the output SINR of the proposed LS-SVR beamformer is only 1 dB lower than that of the ideal beamformer. In contrast, the output SINRs of the other beamformers tested decrease dramatically, because fewer degrees of freedom remain available for suppressing the interference. The corresponding beampatterns are shown in Figure 3(b) for four interferences with DOAs θ i = [17.4°, −11.5°, 53.1°, −23.5°]. It can be seen that the LS-SVR beamformer not only places deep nulls at the DOAs of the interferences, but also achieves better sidelobe suppression than the other beamformers tested. Thus, the proposed LS-SVR method achieves better SINR performance than the usual robust linear beamforming algorithms in the case of numerous interferences.
To show the computational complexity of the novel approach, the growth of the dictionary size with the number of input samples is given in Figure 4. As can be seen in Figure 4, only 396 centers are needed to calculate the beamformer output for 4,000 input samples, whereas the original LS-SVR algorithm needs 4,000 centers for the same case. The computational cost is thus largely reduced.

Conclusions
We present a novel nonlinear LS-SVR-based beamforming approach in this paper. The approach first replaces the conventional linearly constrained minimum variance cost function with a squared-loss function, which significantly increases robustness against mismatch problems and provides additional control over the sidelobe level. It also applies Gaussian kernels to the array observations to improve the generalization capacity. Finally, it uses a recursive regression procedure to estimate the weight vectors in real time and performs model reduction to reduce the final size of the beamformer.
Simulation tests with steering vector mismatch, numerous interferences and a limited number of available snapshots are carried out to compare the proposed beamforming algorithm with other recently proposed ones. The test results show that the proposed method significantly outperforms many other recently proposed linear robust beamforming techniques in terms of desired-signal distortion and noise reduction in scenarios with DOA mismatch, limited observation samples, and numerous interferences.