Adaptive Maximum Correntropy Gaussian Filter Based on Variational Bayes

This paper investigates state estimation for systems whose measurement noise is non-Gaussian and has unknown covariance. A novel improved Gaussian filter (GF) is proposed, in which the maximum correntropy criterion (MCC) suppresses the contamination from non-Gaussian measurement noise and the noise covariance is estimated online through the variational Bayes (VB) approximation. MCC and VB are integrated through a fixed-point iteration that modifies the estimated measurement noise covariance. As a general framework, the proposed algorithm applies to both linear and nonlinear systems, with different rules used to calculate the Gaussian integrals. A target tracking simulation and a field test of an INS/DVL integrated navigation system show that the proposed algorithm achieves better estimation accuracy than related robust and adaptive algorithms.


Introduction
As the benchmark method for state estimation, the linear recursive Kalman filter (KF) has been used in many applications, such as information fusion, system control, integrated navigation, target tracking, and GPS solutions [1][2][3][4]. It has since been extended to nonlinear systems by approximating either the nonlinear functions or the filtering distributions. Linearizing the nonlinear functions with a Taylor series gives the popular extended Kalman filter (EKF) [5]. To improve on the estimation accuracy of EKF, several sigma-point-based nonlinear filters have been proposed in recent decades, such as the unscented Kalman filter (UKF) using the unscented transform [6], the cubature Kalman filter (CKF) based on cubature rules [7], and the divided difference Kalman filter (DDKF) adopting polynomial approximations [8]. All these filters can be regarded as special cases of the Gaussian filter (GF) [9][10][11], in which the noise distribution is assumed to be Gaussian.
However, when the measurements are contaminated by non-Gaussian noise, such as impulsive interference or outliers, the estimates of GF deteriorate and the filter may even break down [12,13]. Besides computationally intensive methods, including the particle filter [14,15], the Gaussian sum filter [16], and multiple model filters [17], robust filters such as Huber's KF (HKF, also known as M-estimation) [18][19][20] and the H∞ filter [21] are also intended for contaminated measurements. Although the H∞ filter guarantees a bounded estimation error, it does not perform well under Gaussian noise [22]. Huber's M-estimation is a combined l1 and l2 norm filter that can effectively suppress non-Gaussian noise [18][19][20]. Recently, the information-theoretic measure correntropy has been used to handle non-Gaussian noise [23][24][25][26][27][28]. Based on the maximum correntropy criterion (MCC), a robust filter known as the maximum correntropy Kalman filter (MCKF) was proposed in [24], and it has been extended to nonlinear systems using EKF [27] and UKF [25,26,28]. Simulation results show that an MCC-based GF (MCGF) may obtain better estimation accuracy than M-estimation when a proper kernel bandwidth is chosen [23][24][25][26]. Even so, both MCGF and M-estimation still require the nominal measurement noise covariance, which may be unknown or time varying in some applications. In this situation, the performance of MCGF degrades, as shown in our simulation examples.
Traditionally, unknown noise statistics are estimated by adaptive filters, such as the Sage-Husa filter [29] and the fading memory filter [30]. One drawback of these recursive adaptive filters is that the statistics estimated at the previous time instant influence the current estimate, which makes them unsuitable when the statistics of the measurement noise change frequently [31]. The recently proposed variational Bayes (VB) based adaptive filter avoids this limitation through the VB approximation, and VB-based GFs (VBGFs) have been proposed for both linear [32] and nonlinear systems [9,33].
In this paper, we propose an adaptive MCGF based on the VB approximation, which is especially useful for estimating the system state from measurements corrupted by non-Gaussian noise of unknown covariance. Typical applications include low-cost INS/GPS integrated navigation systems [32] and maneuvering target tracking [33]. To overcome the limitation of MCGF under an unknown, time-varying measurement noise covariance, the VB method is used to improve the adaptivity of MCGF, and the two are combined within a fixed-point iteration framework. As demonstrated in our simulation results, the proposed method has better estimation accuracy than related algorithms. Furthermore, various filters can be obtained by using different ways to calculate the Gaussian integrals.
The rest of this paper is organized as follows. In Section 2, after briefly introducing the concept of correntropy, we give the general MCGF algorithm. In Section 3, we explain the main idea of the VB method and how it is embedded into MCGF to obtain the proposed adaptive MCGF. Section 4 gives the experimental results for a typical target tracking model and an INS/DVL navigation system, in comparison with several related algorithms. Conclusions are drawn in the final section.

Correntropy
As a similarity measure, the correntropy of random variables $X$ and $Y$ is defined as [24][25][26]
$$V(X, Y) = E[\kappa(X, Y)] = \int\!\!\int \kappa(x, y)\, dF_{X,Y}(x, y), \tag{1}$$
where $E[\cdot]$ denotes expectation, $F_{X,Y}(x, y)$ is the joint distribution function, and $\kappa(x, y)$ is a Mercer kernel. The most popular choice is the Gaussian kernel
$$\kappa(x, y) = G_\sigma(e) = \exp\left(-\frac{e^2}{2\sigma^2}\right), \tag{2}$$
where $e = x - y$ and $\sigma > 0$ is the kernel bandwidth. Taking the Taylor series expansion of the Gaussian correntropy gives
$$V(X, Y) = \sum_{n=0}^{\infty} \frac{(-1)^n}{2^n \sigma^{2n} n!}\, E\left[(X - Y)^{2n}\right]. \tag{3}$$
The correntropy therefore contains all the even moments of $X - Y$, weighted by the kernel bandwidth $\sigma$, which allows it to capture high-order information when applied in signal processing. In practice, the joint distribution is usually unavailable, so the correntropy is estimated from sampled data.
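The sample estimate mentioned above can be sketched as follows. This is a minimal illustration, assuming the unnormalized Gaussian kernel $G_\sigma(e) = \exp(-e^2 / (2\sigma^2))$ and a plain average over paired samples; the function names `gaussian_kernel` and `sample_correntropy` are illustrative and do not come from the paper.

```python
import math


def gaussian_kernel(e, sigma):
    """Unnormalized Gaussian kernel G_sigma(e) = exp(-e^2 / (2 sigma^2))."""
    return math.exp(-(e * e) / (2.0 * sigma * sigma))


def sample_correntropy(xs, ys, sigma):
    """Average kernel value over paired samples (x_k, y_k)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    return sum(gaussian_kernel(x - y, sigma) for x, y in zip(xs, ys)) / len(xs)


clean = [0.0, 1.0, 2.0, 3.0]
noisy = [0.1, 0.9, 2.1, 103.0]  # the last sample is a gross outlier
v = sample_correntropy(clean, noisy, sigma=1.0)
```

Because the kernel is bounded between 0 and 1, the outlier contributes almost nothing to `v`, whereas it would dominate a mean squared error computed on the same data.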

Maximum Correntropy Gaussian Filter
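In this paper, we consider the following nonlinear system with additive noise:
$$x_i = f_{i-1}(x_{i-1}) + w_{i-1}, \qquad z_i = h_i(x_i) + v_i, \tag{4}$$
where $x_i \in \mathbb{R}^n$ is the system state at time $i$ and $z_i \in \mathbb{R}^m$ denotes the measurement. $f_i(\cdot)$ and $h_i(\cdot)$ are known nonlinear functions. In the standard GF, the process noise $w_i$ and the measurement noise $v_i$ are assumed to be zero-mean Gaussian noise sequences with known covariances $Q_i$ and $R_i$. The initial state $x_0$ has known mean $\hat{x}_0$ and covariance $P_0$.
In both GF and MCGF, the one-step prediction $\hat{x}_i^-$ and its covariance $P_i^-$ are obtained through
$$\hat{x}_i^- = \int f_{i-1}(x_{i-1})\, \mathcal{N}(x_{i-1} \mid \hat{x}_{i-1}, P_{i-1})\, dx_{i-1}, \tag{5}$$
$$P_i^- = \int \left(f_{i-1}(x_{i-1}) - \hat{x}_i^-\right)\left(f_{i-1}(x_{i-1}) - \hat{x}_i^-\right)^T \mathcal{N}(x_{i-1} \mid \hat{x}_{i-1}, P_{i-1})\, dx_{i-1} + Q_{i-1}. \tag{6}$$
To improve the robustness of GF, MCC is applied in the derivation of the measurement update of MCGF. A regression model, Equation (7), is first built from Equations (4)-(6) [25]. Multiplying both sides of Equation (7) by $S_i^{-1}$ gives a normalized model whose residual vector is $e_i$ (Equations (8) and (9)). The optimal estimate $\hat{x}_i$ under the MCC is then obtained from the optimization problem
$$\hat{x}_i = \arg\max_{x_i} \sum_{k=1}^{n+m} G_\sigma(e_i(k)), \tag{10}$$
where $e_i(k)$ is the $k$-th element of $e_i$. The solution of Equation (10) involves the weighting matrix
$$C_{y,i} = \mathrm{diag}\left(G_\sigma(e_i(n+1)), \ldots, G_\sigma(e_i(n+m))\right),$$
where $\mathrm{diag}(\cdot)$ denotes a diagonal matrix.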
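The displayed forms of the regression step are not reproduced in this text, so the block below is only a plausible reconstruction in the style of the MCC filters cited as [24,25], not a verbatim copy of Equations (7)-(11). The symbols $\delta_i$, $D_i$, and $g_i(\cdot)$ are names assumed here for illustration; $S_i$ is taken to be a square-root factor of the block-diagonal covariance of $\delta_i$. The last line is simply the stationarity condition obtained by differentiating the objective in Equation (10).

```latex
\begin{aligned}
\begin{bmatrix} \hat{x}_i^- \\ z_i \end{bmatrix}
  &= \begin{bmatrix} x_i \\ h_i(x_i) \end{bmatrix} + \delta_i,
  \qquad
  E\left[\delta_i \delta_i^T\right]
  = \begin{bmatrix} P_i^- & 0 \\ 0 & R_i \end{bmatrix}
  = S_i S_i^T, \\
D_i &= g_i(x_i) + e_i,
  \qquad
  D_i = S_i^{-1} \begin{bmatrix} \hat{x}_i^- \\ z_i \end{bmatrix},
  \quad
  g_i(x_i) = S_i^{-1} \begin{bmatrix} x_i \\ h_i(x_i) \end{bmatrix},
  \quad
  e_i = S_i^{-1} \delta_i, \\
\hat{x}_i &= \arg\max_{x_i} \sum_{k=1}^{n+m} G_\sigma\left(e_i(k)\right), \\
0 &= \sum_{k=1}^{n+m} G_\sigma\left(e_i(k)\right) e_i(k)\,
  \frac{\partial g_{i,k}(x_i)}{\partial x_i}.
\end{aligned}
```

Under this reading, each residual component is weighted by its own kernel value, so components with large normalized residuals have little influence on the estimate.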
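Based on the above, and using $\hat{x}_i^-$ in place of the $x_i$ contained in Equation (9), MCGF can be written in the same form as GF except for a modified measurement noise covariance [25], denoted here by $\tilde{R}_i$, which is obtained from $R_i$ and the kernel weights in $C_{y,i}$. Then $\hat{x}_i$ and its covariance $P_i$ are obtained through
$$\hat{x}_i = \hat{x}_i^- + K_i (z_i - \hat{z}_i), \qquad P_i = P_i^- - K_i P_{zz} K_i^T,$$
where $K_i = P_{xz} P_{zz}^{-1}$ and
$$\hat{z}_i = \int h_i(x_i)\, \mathcal{N}(x_i \mid \hat{x}_i^-, P_i^-)\, dx_i, \tag{19}$$
$$P_{zz} = \int \left(h_i(x_i) - \hat{z}_i\right)\left(h_i(x_i) - \hat{z}_i\right)^T \mathcal{N}(x_i \mid \hat{x}_i^-, P_i^-)\, dx_i + \tilde{R}_i, \tag{20}$$
$$P_{xz} = \int \left(x_i - \hat{x}_i^-\right)\left(h_i(x_i) - \hat{z}_i\right)^T \mathcal{N}(x_i \mid \hat{x}_i^-, P_i^-)\, dx_i. \tag{21}$$
The main difference between MCGF and GF is therefore the modified measurement noise covariance, and MCGF shows excellent estimation performance when the measurement is contaminated by outliers or shot noise [24][25][26]. However, it still requires knowledge of the measurement noise covariance. When the covariance changes over time, so that the true covariance differs from the assumed one, the MCGF algorithm does not perform well. We therefore adopt an adaptive method to improve the performance of MCGF in this case.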
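The effect of the modified measurement noise covariance is easiest to see for a scalar measurement. The sketch below assumes the modification takes the form "nominal variance divided by the kernel weight of the whitened residual", which is the scalar case of the $S_r C_y^{-1} S_r^T$ structure used in MCC filters such as [25]; the exact expression in the paper is not reproduced in this text. The function name `mcc_modified_variance` and the weight floor `1e-12` are assumptions made for illustration.

```python
import math


def mcc_modified_variance(z, z_pred, R, sigma):
    """Inflate the nominal variance R according to the kernel weight of the residual.

    z      : scalar measurement
    z_pred : predicted measurement, evaluated at the predicted state
    R      : nominal measurement noise variance (must be > 0)
    sigma  : kernel bandwidth (must be > 0)
    """
    e = (z - z_pred) / math.sqrt(R)                 # whitened residual
    weight = math.exp(-(e * e) / (2.0 * sigma * sigma))
    weight = max(weight, 1e-12)                     # assumed floor, avoids division by zero
    return R / weight


r_inlier = mcc_modified_variance(1.0, 1.0, 4.0, sigma=2.0)   # zero residual
r_outlier = mcc_modified_variance(7.0, 1.0, 4.0, sigma=2.0)  # residual of 3 standard deviations
```

A zero residual leaves the variance unchanged, while a large residual inflates it and so reduces the Kalman gain applied to that measurement. As $\sigma$ grows, the weight tends to 1 and the nominal variance is recovered.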

Variational Bayesian Maximum Correntropy Gaussian Filter
The main idea of state estimation is to obtain the posterior probability density function $p(x_i \mid z_{1:i})$. For GF, it is obtained through the Gaussian approximation $p(x_i \mid z_{1:i}) \approx \mathcal{N}(x_i \mid \hat{x}_i, P_i)$. However, if the measurement noise covariance $R_i$ is unavailable, we need to estimate the joint posterior distribution $p(x_i, R_i \mid z_{1:i})$. This distribution can be approximated by the free-form VB approximation [9,32]
$$p(x_i, R_i \mid z_{1:i}) \approx Q(x_i)\, Q(R_i),$$
where $Q(x_i)$ and $Q(R_i)$ are unknown approximating densities, which are calculated by minimizing the Kullback-Leibler (KL) divergence between the true distribution and its approximation [9,32]:
$$\mathrm{KL}\left[Q(x_i) Q(R_i) \,\|\, p(x_i, R_i \mid z_{1:i})\right] = \int\!\!\int Q(x_i) Q(R_i) \log \frac{Q(x_i) Q(R_i)}{p(x_i, R_i \mid z_{1:i})}\, dx_i\, dR_i.$$
According to the VB method, $p(x_i, R_i \mid z_{1:i})$ can be approximated as the product of a Gaussian distribution and an inverse Wishart (IW) distribution [9,32]:
$$p(x_i, R_i \mid z_{1:i}) \approx \mathcal{N}(x_i \mid \hat{x}_i, P_i)\, \mathrm{IW}(R_i \mid v_i, V_i),$$
where
$$\mathrm{IW}(R_i \mid v_i, V_i) \propto |R_i|^{-(v_i + m + 1)/2} \exp\left(-\tfrac{1}{2}\, \mathrm{tr}\left(V_i R_i^{-1}\right)\right),$$
$\mathrm{tr}(\cdot)$ is the trace of a matrix, and $v_i$ and $V_i$ are the degrees-of-freedom parameter and the inverse scale matrix, respectively. The integrals in (22) and (23) can be computed in closed form [9,32], and the expectation they contain can be rewritten in terms of the IW parameters. Substituting (30) and (31) into (28) and (29), and matching the parameters in (26) and (27), yields a coupled set of update equations for $\hat{x}_i$, $P_i$, $v_i$, and $V_i$.
The VB-based GF works well when the measurement noise covariance is unknown. However, when the measurement contains outliers or shot noise, its estimation accuracy degrades, as will be shown in our simulation results. To overcome the shortcomings of both MCGF and VBGF, we combine the advantages of VB and MCC through a fixed-point iteration and design the so-called VBMCGF algorithm, which is summarized as follows:
Step 1: Predict: $\hat{x}_i^-$ and $P_i^-$ are obtained through (5) and (6), and
$$v_i^- = \rho\,(v_{i-1} - m - 1) + m + 1, \qquad V_i^- = B V_{i-1} B^T,$$
where $0 < \rho \le 1$, $0 < |B| \le 1$, and a reasonable choice is $B = \sqrt{\rho}\, I$.
Step 2: Update: Calculate $\hat{z}_i$ and $P_{xz}$ by (19) and (21). For $j = 1, \ldots, N$, iterate the fixed-point equations, which in turn form the noise covariance estimate from the current IW parameters, modify it through the MCC to obtain $\tilde{R}_i^{(j)}$, recompute the state estimate and its covariance with $\tilde{R}_i^{(j)}$, and update $T_i^{(j+1)}$ and $V_i^{(j+1)}$. End For. Finally, set the filter outputs at time $i$ to the values from the last iteration.
The main difference between the proposed VBMCGF and existing GFs lies in the modified measurement noise covariance $\tilde{R}_i^{(j)}$: VB iterations are used to estimate its value, and MCC is used to modify it in the presence of non-Gaussian noise. The kernel bandwidth $\sigma$ plays an important role in reducing the effect of non-Gaussian noise or outliers. A smaller $\sigma$ down-weights large residuals more aggressively and so makes the filter more robust to outliers, but it may degrade the convergence performance. Conversely, too large a $\sigma$ makes VBMCGF behave more like VBGF (it can be proved that, if $\sigma \to \infty$, the proposed VBMCGF reduces to VBGF). One possible way to select $\sigma$ is by trial and error [24][25][26]. Another important issue is the number of fixed-point iterations. In practice, only a few iterations (e.g., 2 or 3) are enough [31,32].
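The two steps can be sketched for a scalar linear system as follows. This is an illustration under stated assumptions rather than the authors' exact recursion: the inverse Wishart prediction and update follow the standard VB adaptive filter of [9] (with $m = 1$), the MCC modification divides the current variance estimate by the kernel weight of the whitened residual evaluated at the predicted state, and all names, default values, and the weight floor are hypothetical.

```python
import math


def vbmckf_step(x, P, v, V, z, F=1.0, H=1.0, Q=0.01,
                rho=0.95, sigma=5.0, n_iter=3):
    """One predict/update cycle of a scalar VB + MCC Kalman filter (sketch)."""
    m = 1  # measurement dimension

    # Step 1: predict the state and the inverse Wishart parameters.
    x_pred = F * x
    P_pred = F * P * F + Q
    v_pred = rho * (v - m - 1) + m + 1
    V_pred = rho * V                      # B = sqrt(rho), so B V B^T = rho V

    # Step 2: update with a fixed-point iteration.
    v_new = v_pred + 1
    z_pred = H * x_pred
    P_xz = P_pred * H
    x_j, P_j, V_j = x_pred, P_pred, V_pred
    for _ in range(n_iter):
        R_hat = V_j / (v_new - m - 1)                 # VB estimate of the noise variance
        e = (z - z_pred) / math.sqrt(R_hat)           # whitened residual at the prediction
        w = max(math.exp(-(e * e) / (2.0 * sigma * sigma)), 1e-12)
        R_mod = R_hat / w                             # MCC-modified variance
        P_zz = H * P_pred * H + R_mod
        K = P_xz / P_zz
        x_j = x_pred + K * (z - z_pred)
        P_j = P_pred - K * P_zz * K
        T = (z - H * x_j) ** 2 + H * P_j * H          # expected squared residual
        V_j = V_pred + T
    return x_j, P_j, v_new, V_j


x1, P1, v1, V1 = vbmckf_step(0.0, 1.0, 4.0, 2.0, z=1.0)
x_rob, _, _, _ = vbmckf_step(0.0, 1.0, 4.0, 2.0, z=50.0, sigma=2.0)
x_plain, _, _, _ = vbmckf_step(0.0, 1.0, 4.0, 2.0, z=50.0, sigma=1e9)
```

With a very large `sigma` the kernel weight is 1 and the step behaves like a plain VB adaptive filter, which matches the limiting behaviour noted above. With a small `sigma`, the outlying measurement `z=50.0` moves the estimate less.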
As a general framework, our filter can easily be implemented according to practical requirements. For linear systems described by $x_i = F_{i-1} x_{i-1} + w_{i-1}$ and $z_i = H_i x_i + v_i$, the prediction step of the VBMCKF is the same as that of KF:
$$\hat{x}_i^- = F_{i-1} \hat{x}_{i-1}, \qquad P_i^- = F_{i-1} P_{i-1} F_{i-1}^T + Q_{i-1}.$$
In addition, the Gaussian integrals for $\hat{z}_i$ and $P_{xz}$ in the VBMCKF reduce to
$$\hat{z}_i = H_i \hat{x}_i^-, \qquad P_{xz} = P_i^- H_i^T,$$
and $T_i^{(j+1)}$ and $V_i^{(j+1)}$ likewise reduce to closed-form expressions, while the other steps are the same as in the general framework.
For nonlinear systems, the Gaussian integrals in the prediction and update steps can be calculated using a Taylor series expansion, the unscented transform, or cubature rules, and the corresponding filters are called VBMCEKF, VBMCUKF, and VBMCCKF, respectively.
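For reference, the closed forms of $T_i^{(j+1)}$ and $V_i^{(j+1)}$ in the linear case can be worked out as below. The derivation assumes, as in the VB adaptive filter of [9], that $T_i^{(j+1)}$ is the expected outer product of the measurement residual under $\mathcal{N}(x_i \mid \hat{x}_i^{(j+1)}, P_i^{(j+1)})$ and that $V_i^{(j+1)}$ adds it to the predicted inverse scale matrix. These definitions are not reproduced in this text, so the block is a sketch rather than a quotation.

```latex
\begin{aligned}
T_i^{(j+1)}
  &= \int (z_i - H_i x_i)(z_i - H_i x_i)^T\,
     \mathcal{N}\left(x_i \mid \hat{x}_i^{(j+1)}, P_i^{(j+1)}\right) dx_i \\
  &= \left(z_i - H_i \hat{x}_i^{(j+1)}\right)\left(z_i - H_i \hat{x}_i^{(j+1)}\right)^T
     + H_i P_i^{(j+1)} H_i^T, \\
V_i^{(j+1)} &= V_i^- + T_i^{(j+1)}.
\end{aligned}
```

The second line follows from writing $z_i - H_i x_i = (z_i - H_i \hat{x}_i^{(j+1)}) - H_i (x_i - \hat{x}_i^{(j+1)})$ and noting that the cross terms have zero mean.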
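As an illustration of one of these rules, the sketch below approximates a Gaussian-weighted integral with the third-degree spherical-radial cubature rule used by CKF [7]: $2n$ equally weighted points placed at the mean plus or minus $\sqrt{n}$ times each column of a Cholesky factor of the covariance. The helper names are illustrative, and the pure-Python Cholesky factorization assumes a symmetric positive definite covariance.

```python
import math


def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = sum(L[r][k] * L[c][k] for k in range(c))
            if r == c:
                L[r][c] = math.sqrt(A[r][r] - s)
            else:
                L[r][c] = (A[r][c] - s) / L[c][c]
    return L


def cubature_points(mean, cov):
    """Return the 2n cubature points mean +/- sqrt(n) * (column j of L)."""
    n = len(mean)
    L = cholesky(cov)
    scale = math.sqrt(n)
    points = []
    for j in range(n):
        col = [scale * L[r][j] for r in range(n)]
        points.append([m + c for m, c in zip(mean, col)])
        points.append([m - c for m, c in zip(mean, col)])
    return points


def cubature_expectation(func, mean, cov):
    """Approximate E[func(x)] for x ~ N(mean, cov) with equal weights 1/(2n)."""
    pts = cubature_points(mean, cov)
    return sum(func(p) for p in pts) / len(pts)


second_moment = cubature_expectation(lambda p: p[0] ** 2 + p[1] ** 2,
                                     [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
cross_moment = cubature_expectation(lambda p: p[0] * p[1],
                                    [1.0, 2.0], [[4.0, 1.0], [1.0, 3.0]])
```

The rule is exact for polynomials up to degree three, so both moments above match their analytical values. For a general nonlinear $h_i(\cdot)$, the same points would be propagated through the function to approximate quantities such as $\hat{z}_i$ and $P_{xz}$.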

Simulation Results of the Target Tracking Model
To illustrate the performance of the proposed algorithm, we first give simulation results for a typical target tracking model, where cubature rules are used to calculate the integrals. We compare the estimation accuracy of seven filters under various kinds of measurement noise: CKF [7], MCCKF-1 [26], MCCKF-2 [25], VBCKF [9], HCKF [18], VBHCKF (which adopts Huber's function), and the proposed VBMCCKF. The target tracking example is modeled as in [2]. Performance is measured by the root mean square error (RMSE) and the averaged RMSE (ARMSE) of position over the Monte Carlo experiments, where $(\xi^c_{x,i}, \eta^c_{y,i})$ and $(\hat{\xi}^c_{x,i}, \hat{\eta}^c_{y,i})$ are the true and estimated positions in the $c$-th Monte Carlo experiment, respectively. The RMSE and ARMSE of velocity are defined similarly.
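The two position metrics can be computed as in the sketch below. The paper's own formulas are not reproduced in this text, so the code assumes the common Monte Carlo definitions: the RMSE at each time step averages the squared position error over runs, and the ARMSE averages it over both runs and time steps before taking the square root. The function names and the nested-list data layout are assumptions.

```python
import math


def position_rmse(true_xy, est_xy):
    """RMSE per time step; true_xy[c][i] is the (xi, eta) position of run c at time i."""
    runs, steps = len(true_xy), len(true_xy[0])
    rmse = []
    for i in range(steps):
        sq = 0.0
        for c in range(runs):
            (tx, ty), (ex, ey) = true_xy[c][i], est_xy[c][i]
            sq += (tx - ex) ** 2 + (ty - ey) ** 2
        rmse.append(math.sqrt(sq / runs))
    return rmse


def position_armse(true_xy, est_xy):
    """Single figure of merit: squared error averaged over runs and time steps."""
    runs, steps = len(true_xy), len(true_xy[0])
    total = 0.0
    for c in range(runs):
        for i in range(steps):
            (tx, ty), (ex, ey) = true_xy[c][i], est_xy[c][i]
            total += (tx - ex) ** 2 + (ty - ey) ** 2
    return math.sqrt(total / (runs * steps))


truth = [[(0.0, 0.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]
estimate = [[(3.0, 4.0), (0.0, 0.0)], [(3.0, 4.0), (0.0, 0.0)]]
rmse_per_step = position_rmse(truth, estimate)
armse = position_armse(truth, estimate)
```

The velocity metrics would use the same functions with velocity components in place of positions.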
We consider five kinds of measurement noise, labelled Cases A-E, whose parameters $\alpha_i$, $\beta_i$, and $\gamma_i$ are given in Figure 1.
Under Gaussian measurement noise with known covariance, as shown in Figure 2, both MCCKF and HCKF have nearly the same estimation accuracy as CKF, since they reduce to CKF for suitable free parameters (e.g., $\sigma \to \infty$ and $h \to \infty$). VBCKF and VBMCCKF perform slightly worse than CKF because they use only their online estimate of the measurement noise covariance instead of the true one. In particular, VBHCKF has the worst performance, since the commonly used parameter $h = 1.345$ for Huber's function does not fit the Gaussian noise situation when the inaccurate online estimate of the measurement noise covariance is used. The proposed VBMCCKF works well with the same kernel bandwidth under both Gaussian and non-Gaussian noise, as shown in the following cases.
In the remaining cases, CKF clearly has the worst estimation accuracy, since it requires the measurement noise to be Gaussian with known covariance, which is violated in these situations. MCCKF-1 and MCCKF-2 have similar estimation performance but are slightly worse than HCKF with this kernel bandwidth. As demonstrated in [23][24][25][26][27], MCCKF is able to obtain better estimation accuracy than HCKF with a suitable $\sigma$. The estimation results of MCCKF and HCKF do not change much when Gaussian mixture noise or shot noise is added, since they are robust filters. VBHCKF has better velocity estimates but worse position accuracy than HCKF. VBCKF performs much better in Cases B and C, as it is able to estimate the time-varying measurement noise covariance online. However, its performance degrades once shot noise is injected in Cases D and E. Among these algorithms, the proposed VBMCCKF has the best estimation accuracy under Cases B-E. It shows both adaptivity to an unknown, changing measurement noise covariance and robustness to Gaussian mixture noise and shot noise. Its estimation results are also much better than those of VBHCKF, since the MCC can capture higher-order information than Huber's function. The ARMSEs of these filters under the different noises are given in Table 1 to show the differences clearly.

Field Results of Integrated Navigation
To further illustrate the effectiveness of the proposed algorithm, we compare it with existing related methods using real data collected by a self-made fiber optic gyroscope inertial navigation system (INS) together with a Doppler velocity log (DVL). The integrated navigation results of a photonics inertial navigation system (PHINS) and GPS are used as the reference. We adopt the loosely coupled method to fuse the information of INS and DVL. The state vector is chosen as $x = [\delta L \;\, \delta\lambda \;\, \delta V_E \;\, \delta V_N \;\, \phi_x \;\, \phi_y \;\, \phi_z \;\, \nabla_x \;\, \nabla_y \;\, \nabla_z \;\, \varepsilon_x \;\, \varepsilon_y \;\, \varepsilon_z]^T$, where $\delta L$ and $\delta\lambda$ are the latitude and longitude errors, and $\{\delta V_j, \phi_j, \nabla_j, \varepsilon_j\}$ are the velocity error, attitude error, accelerometer bias, and gyroscope constant drift, respectively. The subscript $j$ takes values in $\{E, N, x, y, z\}$, where $E$ and $N$ represent the east and north directions in the local-level frame, and $x$, $y$, and $z$ are the directions of the three axes of the body frame. The continuous system model is then given as
$$\dot{x}(t) = A(t)\, x(t) + B(t)\, w(t),$$
where $t$ is the continuous system time and $w(t) = [0_{1\times 2} \;\, w_{ax} \;\, w_{ay} \;\, w_{gx} \;\, w_{gy} \;\, w_{gz} \;\, 0_{1\times 5}]^T$ is the process noise, which contains the Gaussian noise of both the accelerometers and the gyroscopes. The detailed elements of the matrices $A(t)$ and $B(t)$ can be found in [34]. The measurement equation is constructed from the INS and DVL outputs under this loosely coupled scheme.
In this experiment, the system is first tested at anchorage for about 50 min, and then the ship starts to move. The true velocities of the ship, provided by the commercial INS/GPS integrated navigation system, are shown in Figure 7. The collected data are processed using MATLAB (R2014a by MathWorks, Inc., Natick, MA, USA) on a computer with a 2.50 GHz Intel Core i5-7300HQ CPU and 8 GB of memory. The total computational times of KF, VBKF, HKF, MCKF, and VBMCKF are 0.1900 s, 0.4800 s, 0.2640 s, 0.2310 s, and 0.6340 s, respectively. The position and velocity errors of the different filters are given in Figures 8 and 9. The attitude and heading errors of the filters are very similar, so they are omitted.
Figures 8 and 9 show that, when the motion state changes sharply, the proposed VBMCKF algorithm has the smallest estimation errors, at the cost of a slightly higher computational time than the other estimation methods.

Conclusions
In this paper, a novel adaptive MCGF based on the VB approximation is proposed. The MCC is used to reduce the effect of non-Gaussian measurement noise and outliers, while VB is used to estimate the unknown measurement noise covariance. Experimental results based on simulation examples and real data show that the proposed algorithm has better estimation accuracy than related robust and adaptive filters.