Hybrid Adaptive Cubature Kalman Filter with Unknown Variance of Measurement Noise

This paper is concerned with the filtering problem caused by the inaccuracy variance of measurement noise in real nonlinear systems. A novel weighted fusion estimation method of multiple different variance estimators is presented to estimate the variance of the measurement noise. On this basis, a hybrid adaptive cubature Kalman filtering structure is proposed. Furthermore, the information filter of the hybrid adaptive cubature Kalman filter is also studied, and the stability and filtering accuracy of the filter are theoretically discussed. The final simulation examples verify the validity and effectiveness of the hybrid adaptive cubature Kalman filtering methods proposed in this paper.


Introduction
In recent decades, nonlinear filtering has been widely used in military and civil fields such as target tracking, navigation, positioning, and intelligent manufacturing [1,2]. The theory and method of nonlinear filtering has became one of the most important research issues in the signal processing field, and has attracted increasing attention from researchers.
There are two main kinds of nonlinear filtering methods. The representative of the first kind of nonlinear filter is the extended Kalman filter (EKF), which linearizes the system model by Taylor expansion, holds the first order term, and ignores the second-and higher-order terms. The second kind of nonlinear filtering approximates the statistics of the system state, with examples being the unscented Kalman filter (UKF) and the cubature Kalman filter (CKF). Due to the model error of the linearization of the nonlinear system, the accuracy of the EKF is slightly lower, even leading to filtering divergence. Based on the unscented transformation to approximate the statistics of the system state, UKF was presented in [3]. Further, the cubature Kalman filter algorithm was proposed by Ienkaran in [4]. The CKF algorithm uses a third-degree spherical-radial cubature rule based on a Gaussian filtering framework. The algorithm has higher numerical stability and a smaller amount of calculation. Its excellent performance has made it widely used in various nonlinear system scenarios. Many advantages of CKF have attracted scholars to conduct in-depth research on it, considering that the traditional nonlinear filters often need to overcome the filtering divergence caused by high-dimensional operational errors. Drawing on the idea of square root filtering in the Kalman filter, Ienkaram and Haykin proposed the square-root cubature Kalman filter (SCKF) [5], which further improved the accuracy and stability of filtering. However, the traditional nonlinear filtering method requires knowledge of the mathematical model and the prior statistical information of noise when in practical application. Additionally, the statistical characteristics of noise in actual systems are usually indeterminate, which leads to a decline in the filtering accuracy.
For the problem of unknown statistical characteristics of measurement noise in real applications [6,7], Sage and Husa [8,9] proposed an excellent Sage-Husa suboptimal unbiased maximum a posteriori (MAP) estimator. Many scholars also adopted adaptive filtering techniques to improve the performance of the estimation algorithm, such as sliding window method [10], fading factor adjustment (FFA) [11][12][13], maximum a posteriori (MAP) estimator [14,15], and the variational Bayesian (VB) method [16][17][18][19], etc. Different methods for estimating the statistical characteristics of system noise are usually designed under different estimation criterions. How to use these methods to estimate the statistical characteristics of system noise is still an open issue.
For a class of nonlinear stochastic systems with inaccurate or unknown measurement noise variance (i.e., the priori measurement noise variance is not a precise value), an adaptive filtering algorithm based on SCKF is designed in this paper. Firstly, a novel fusion approach is proposed to estimate the measurement noise variance on the basis of the MAP and VB methods. Then, we use FFA to adjust the part of the variance matrix of the SCKF algorithm to obtain a hybrid adaptive SCKF algorithm (HASCKF) and the corresponding information filter (recorded as HASCIF). This is beneficial to reduce the effects on the adaptive filtering algorithm performance that may be caused by the estimation deviation of noise fusion. At the same time, the performance of the adaptive filtering algorithm is analyzed from two aspects based on the established HASCKF algorithm: the stability of the adaptive filtering algorithm and the filtering accuracy.
This paper is organized as follows: Section 2 formulates the nonlinear stochastic system and describes the problem of the inaccuracy of the the measurement noise variance. In Section 3, a novel noise variance fusion estimation algorithm HASCKF and the corresponding information filter are proposed based on the idea of weighted fusion. In Section 4, two simulation examples are utilized to display and verify the performance of the proposed algorithms. Section 5 provides the conclusions of this work.

Problem Description
Considering a class of discrete nonlinear stochastic systems, the state space model is described as follows [1]: where x(k) ∈ R n is the state of the target, z(k) ∈ R m is the measurement, f : R n → R n is the evolution process of the nonlinear state, and h : R n → R m is the corresponding nonlinear measurement mapping. The process noise w(k) ∈ R n is a Gaussian white noise with zero means and variance Q(k). The measurement noise v(k) ∈ R m is a Gaussian white noise with zero means and variance R(k). Hypothesis 1. The process noise w(k) and measurement noise v(k) in the model are mutually statistically independent.

Hypothesis 2.
The initial state of the system is x(0), with mean x 0 and variance P 0 , and it is uncorrelated with v(k) and w(k).
Hypothesis 3. The process noise variance Q(k) is known, but the measurement noise variance is only with an inaccuracy prior state R 0 . For the nonlinear system described in (1) and (2), assuming that at the time k, we have the optimal estimationx(k − 1|k − 1) and the square root matrix of the error variance matrix S(k − 1|k − 1). Then, the state estimatex(k|k) and the square root of the variance matrix S(k|k) can be calculated according to the standard SCKF algorithm.
For nonlinear systems with determined noise variance, the SCKF algorithm has better estimation performance. However, when the priori value of the measurement noise variance is inaccurate, the final estimationx(k|k) and S(k|k) will have large errors.

Hybrid Adaptive SCKF Algorithm (HASCKF)
In order to improve the adaptive filtering accuracy for an inaccurate modeling system with unknown measurement noise variance, and to let the estimated noise variance be closer to the true noise variance, in this paper, a novel noise variance fusion estimation algorithm HASCKF is proposed based on the idea of weighted fusion.

Estimation Method of Measurement Noise
Theorem 1. Assume that the measurement noise variance estimated by MAP and VB methods at time k are denoted asR 1 (k) andR 2 (k), respectively. Then, the weighted fusion estimation of the measurement noise varianceR g (k) iŝ where In the above equation, · F represents the Frobenius norm of the matrix, and the initial value of the fusion estimationR g (0) = R 0 .
Proof. The weighted fusion estimation of the measurement noise varianceR g (k) can be expressed as the following linear combination:R Under the condition a 1 (k) + a 2 (k) = 1 , we minimize the performance criterion: T From a 1 (k) + a 2 (k) = 1, we can get a 1 (k) = (1 − a 2 (k)), then substituting it into formula (5), we obtainR Then, the overall estimation error is So, we can get Let ∂J(k) ∂a 2 (k) = 0 , and after simplification, we can get where Since the measurement noise variance R(k) is uncertain, Equation (9) cannot be directly calculated. For this reason, we replace R(k) with fusion estimated valueR g (k − 1) of the measurement noise variance at time (k − 1). Obviously, the initial value satisfiesR g (0) = R 0 , then substituting it to Equations (9), (8), and (6), respectively, we can get Equations (3) and (4).

Note 1
In the above theorem, since the measurement noise variance R(k) is uncertain, we replace R(k) with fusion estimateR g (k − 1) of the measurement noise variance. This approximate substitution has certain rationality, especially for the case of constant noise variance and slowly varying noise variance. From Theorem 1, the physical meanings of T 1 (k) and T 2 (k) are equivalent to the estimated error variance of the two noise variance estimation methods, and T 12 (k) and T 21 (k) are similar to their cross-variance. Inference 1. If we do not consider the correlation between the estimation error of noise variance, that is T 12 (k) = T 12 (k) = 0, then the weighted fusion estimation of measurement noise varianceR g (k) is It can be directly derived according to the principle of simple convex combination fusion [1].

Note 2
Obviously, the result of inference 1 is easy to generalize to the case where the number of noise variance estimators N e ≥ 3 . Assume that the error variance of the ith estimator T i (k) = Assuming that the estimation errors of various estimation methods are not related to each other, the fusion estimation of noise variance is where T g (k) can be regarded as the error variance of noise variance fusion estimation. Since , which indicates that the estimated noise variance after fusion is superior to that of any single noise variance estimator.

HASCKF Algorithm
Based on Theorem 1, combined with the fading factor adjustment technique [11], we propose the hybrid adaptive SCKF algorithm (HASCKF). The principle block diagram is shown in Figure 1. Firstly, the variance of measurement noise is estimated by the MAP estimator and VB method respectively. Then, the weighted fusion technique is introduced to fuse the two noise variance estimators. Finally, the variance matrix of the SCKF measurement update is adjusted by the fading factor adjustment technique to obtain the final state estimation and the root-mean square error (RMSE) variance matrix. The detailed implementation of HASCKF is described in the following Theorem 2.
Theorem 2. Consider a class of nonlinear system as described in (1) and (2). Under the condition of Hypotheses 1-3, if the optimal estimatex(k − 1|k − 1) and the square-root matrix of the estimation error variance S(k − 1|k − 1) have been obtained, the state estimatex(k|k) and the square-root matrix of the estimation error variance S(k|k) can be calculated according to the following steps: Step 1: Time Update In (12)- (14), i = 1, 2, · · · , N x , N x = 2n. The parameters ξ i are given below (ε i is an n-order unit vector): Calculate the square root of the variance matrix: where Tria(·) represents a triangular operation. S Q (k) represents the square-root of the new process noise variance Q(k), that is, Step 2: Measurement Update (1) According to Equations (16)- (18), we can calculate the predicted valueẑ(k|k − 1).

Lemma 2 ([15]
). When the measurement noise variance is time-varying, the suboptimal MAP estimateR(k) of noise variance R(k) can be obtained by the recursive calculation: ). b is the forgetting factor, and its value range is usually between 0.95 and 0.99.
Estimation of prediction parameters of measurement noise variance by VB method: where "·" represents the point operation in Matlab.
and η i (k) are two parameters of the inverse gamma distribution. ρ i ⊂ (0, 1) is the predictive weighting factor. It reflects the degree of correlation between the noise at the last moment and the noise at the current moment. When the difference between the measurement noise variance at the last moment and the measurement noise variance at the current moment is small, a larger ρ i value should be used. Conversely, ρ i should take a smaller value.
The square-root of the variance matrix: where S R (k) represents the square root of measurement noise variance R(k) (namely, where Then, the VB method getsR(k) through M iterations: Iterative initialization: let t = 1, for a given number of iterations M, we have Calculating the estimate of the measurement noise variance: where diag(A) represents a diagonal matrix composed of matrix A diagonal elements. Use Equations (16) and (32) to calculate the t-th iteration state estimatex t (k|k) and the root of its mean square error matrix S t (k|k) . If t < M, update the parameter η t (k). Let , return to the beginning of the iterative. When t = M, end the iteration, we can get (4) Calculate the fusion estimationR g (k) of the measurement noise variance according to Equation (3).
where τ(k) is the adaptive factor, 0 < τ(k) ≤ 1, and it is calculated by the following Lemma 3.

Lemma 3 ([11]
). For nonlinear systems with unknown measurement noise variance, the adaptive fading factor is determined by the following equation: In the above equation, tr denotes the trace of the matrix,z(k|k − 1) = z(k) −ẑ(k|k − 1) is the measurement residual error vector.
(6) The following Equations (30)-(32) are used to obtain the updated estimationx(k|k) and the root of its mean square error variance S(k|k).
where the symbol "/" indicates the matrix right divide operation (e.g., Proof: We can directly derive Theorem 1 and Lemma 3 by Lemma 1, omitted here. Note 3 Theorem 2 only shows the hybrid adaptive filtering algorithm when the number of noise variance estimators N e is 2. Obviously, when N e > 2, we first use various noise variance estimators to estimateR j (k) and then calculate the fused estimateR g (k) of the measurement noise variance according to Equation (11). In other words, only steps (2)-(4) in Theorem 2 need to be adjusted.

HASCIF Information Filter
Compared with traditional filtering, the information filter may not require prior information when it is initialized, and thus has better numerical performance. In addition, the use of an information filter to design a fusion algorithm is also simpler. In the information filter, the state estimate and its estimation error variance matrix are replaced by information vector and information matrix, respectively. Subsequently, we give its corresponding information filtering form (HASCIF) on the basis of HASCKF. According to the literature [20,21], we can obtain the implementation process of HASCIF as follows: Step 1: Time update [20,21] whereŷ(k|k − 1) and Y(k|k − 1) are the predicted information vector and the predicted information matrix, respectively.x(k|k − 1) and S(k|k − 1) can be calculated according to Equations (14) and (15).
Step 2: Measurement updatê In Equations (35) and (36), the information state vectorsŷ(k|k) and Y(k|k) are the information vector and information matrix of the state estimate, respectively. θ(k) and Θ(k) are the information contribution vector and the information contribution matrix.R g (k) is determined by formula (3), and the cross-variance matrix P xz (k|k) can be calculated by Formula (29).

Performance Analysis of HASCKF
References [22][23][24] proposed the bounded convergence theorem of the UKF algorithm, and Reference [14] extended its theorem to the adaptive cubature Kalman filter (ACKF). In this section, the CKF bounded convergence theorem proposed in Reference [14] and the Cramer-Rao lower bound (CRLB) [25] are used to analyze the convergence of the HASCKF algorithm.

Lemma 4 ([14]
). Consider the nonlinear systems (1), (2), and the standard CKF algorithm. If ∀k ≥ 0, both satisfy the following two assumptions: (1) There are non-zero real numbers a min , a max , b max , c max , g min , and g max existing to let the following formulas be established: (2) There are positive real numbers p min , p max , q max , r max , Ξ min , Ξ max , and Σ min existing to let the following forms be established: . Then, the standard CKF state estimation error will be mean-square bounded, that is, the algorithm is stable and convergent. Note 4 Lemma 4 shows that the statistical characteristics of noise are closely related to the stability of the CKF algorithm. In addition, if Lemma 4 is established, the SCKF with known noise statistics is also stable and convergent. This is because the theoretical framework of SCKF is consistent with that of CKF. The mean square root matrix S(k|k − 1) and S(k|k) in CKF are only used when transferring the error variance. The root mean square matrix in the SCKF algorithm is obtained through triangulation of the matrix, thus avoiding the filter divergence caused by the non-positive definite variance matrix in the numerical calculation, so it has better stability while ensuring the accuracy of estimation with CKF. Theorem 3. If the standard SCKF algorithm is stable and convergent when the statistical characteristics of noise are known accurately, the introduction of the noise variance fusion estimator and the adaptive fade factor can ensure the stable convergence of the HASCKF algorithm.
Proof. When the measured noise variance matrix R(k) is inaccurate, other sufficient conditions in the bounded convergence theorem can be satisfied, but the condition (46) will be affected.
(a) First only consider the influence of the weighted fusion noise estimator. Let ∆R(k) =R g (k) − R(k), then formula (40) can be rewritten as follows: WhenR g (k) ≥ R(k), then ∆R(k) ≥ 0, and Σ(k) will become larger. Obviously, condition (46) still holds and the HASCKF algorithm still converges steadily.
WhenR g (k) < R(k), then ∆R(k) < 0, and Σ(k) will become smaller. Now, condition (46) may not be satisfied. However, the noise variance weighted fusion estimatorR g (k) estimates and corrects the inaccurate noise variance matrix in real time so that it gradually tracks the real value R(k), thus making ∆R(k) → 0, Σ(k) → Σ(k). In this way, ∆R(k) ≥ 0 is gradually satisfied to Equation (42) to ensure stable convergence of the HASCKF algorithm.
(b) Further consider the effect of fading factors. According to Theorem 2, the variance matrix for measuring residuals is Then, Equation (41) is rewritten as: Therefore, Equation (44) can be written as It is known from the definition of fading factor 0 < τ(k) ≤ 1, so we can get Similar to the analysis in (a), when ∆R(k) ≥ 0 and ∆Σ(k) ≥ 0, it is obvious that condition (46) holds and the stability of the HASCKF algorithm remains. When ∆R(k) < 0, if ∆P zz (k|k − 1) is large enough, ∆Σ(k) ≥ 0 can still be satisfied, and condition (46) still holds, so the HASCKF algorithm converges steadily. If ∆P zz (k|k − 1) is not enough to guarantee ∆Σ(k) ≥ 0, the introduction of the weighted fusion noise variance estimator can also make ∆R(k) → 0, so that ∆Σ(k) ≥ 0 is established stepwise to ensure stable convergence of the HASCKF algorithm.
In summary, the introduction of the noise variance fusion estimator and adaptive fading factor in the HASCKF algorithm can ensure the stable convergence of the algorithm.

Note 5
This theorem combined with Note 2 shows that the noise variance estimation based on weighted fusion is superior to the estimate of any single noise variance estimator. Therefore, the hybrid adaptive HASCKF estimation algorithm has better stability than the adaptive SCKF using a single noise variance estimation algorithm.

Filtering Accuracy Analysis
There is a lower bound on the minimum variance unbiased estimator of the state of the nonlinear filtering algorithm. It is widely used to evaluate the performance of nonlinear estimation. In practice, the lower limit of Cramer-Rao Lower Bound (CRLB) is commonly used. Denote X 0:k = {x(0), x(1), · · · , x(k)} , Z 0:k = {z(0), z(1), · · · , z(k)}, and p(Z 0:k , X 0:k ) is the joint probability density of (Z 0:k , X 0:k ).x(k) is the unbiased estimation of x(k). Then, CRLB is defined as [25]: where J(k) is the Fisher information matrix: Theorem 4 Assume that the nonlinear filter is applied to system (1), (2). Then where Proof. We use J(k) to denote the information matrix of state x(k). Then, the information matrix J(k) can be recursively calculated according to the following formula [26]: where where ∆ y x = ∇ x ∇ T y means the second-order operator. ∇ x = ∂ ∂x is the first-order operator. Because process noise and measurement noise are Gaussian white noise, we have Substituting Equations (58) and (59) into Equation (57), then we can get (60) Equation (56) can be obtained by applying a matrix inversion lemma to Equation (61).

Simulation
In this section, two numerical simulation examples are provided to display and verify the performance of the SACKF algorithm proposed in this paper, mainly including the following contents: (1) For a class of inaccurate modeling with unknown measurement noise variance, we compare the performance difference between the weighted fusion estimator (referred to as WF-NE) and the single noise estimator (e.g., MAP estimation or VB estimation, respectively denoted as MAP-NE and VB-NE).
(2) For a class of inaccurate modeling with unknown measurement noise variance, we study the advantages and disadvantages of the HASCKF algorithm and the standard SCKF algorithm, and the equivalence between the HASCKF and HASCIF algorithms. Example 1. This example is used to evaluate the performance of three kinds of noise estimators, MAP-NE, VB-NE, and WF-NE. Considering the following first-order nonlinear discrete dynamic system: where w(k) and v(k) are mutually independent Gaussian white noise sequences. Assume that the initial value of the system state and the error variance matrix are and the system state initial value x 0 is independent of the two noises. The process noise statistic Q(k) = 0.001. In the following, simulation experiments are performed for two cases where the measurement noise variance R(k) is a constant and piecewise continuous function.
(1) Situation 1: If the measurement noise variance is constant and R(k) = 0.012. Assume that the imprecise measurement noise variance of the initial valueR 0 = 0.04. In the simulation, MAP-NE adopted the estimator described in Equation (19). The parameters of VB-NE were selected as follows: In order to compare the performance of various algorithms, we adopted the absolute error (AE) and the mean absolute error (MAE), which are calculated as follows: where y(k) andŷ(k) are the value to be estimated and the estimated value respectively. N s is the number of simulation steps. In this case, N s = 1000.
The estimated noise variance of the three kinds of noise estimators are shown in Figures 2 and 3, and the estimation error is given in Table 1.
(2) Situation 2: If the measurement noise value is time varying, the true statistical characteristics meet the following formula: Assume that the known inaccurate initial value of the measurement noise variance isR 0 = 0.08. In the simulation, MAP-NE adopted the estimator described in Equation (20), and the forgetting factor b was 0.98. The parameters of VB-NE were selected as ρ = 1 − e −5 , ζ(0) = 1, η(0) = 0.08, M = 1. The estimation results of the three kinds of noise estimators on the measured noise variance are shown in Figures 4 and 5, and the mean absolute errors of several algorithms are given in Table 2.   The simulation results of Example 1 show that the fusion estimator WF-NE proposed in Theorem 1 could obtain the best estimated result for either the constant unknown measurement noise variance or for the time-varying one, compared with the MAP-NE and VB-NE. It had the same result with the analysis conclusion in Note 2. In Situation 1, as shown in Figures 2 and 3, compared with the fusion results of MAP-NE and VB-NE, the WF-NE had better estimates in most simulation steps. Due to the randomness of noise, not all the noise variance estimates obtained by WF-NE were better than the ones solved by other methods, especially for the estimation of the time-varying variance in Situation 2. As shown in Figure 3, the MAP-NE method had relatively large estimation errors in the time interval , and the VB-NE method had relatively large estimation errors in the time interval [390-800]. Nethertheless, the WF-NE could avoid large variation of the noise variance estimation error. Similarly, the general trend of the WF-NE method was best in Figure 4, although the MAP-NE method or the VB-NE method were best in some small time intervals. Although the CPU time cost of WF-NE (0.3725 for 1000 simulation steps) was larger than the other two methods (0.2188 for 1000 simulation steps), it was still acceptable. Meanwhile, the mean absolute errors of WF-NE were smaller than the other two methods, for boththe constant noise variance (in Situation 1) and for the time-varying variance (in Situation 2).

Example 2.
This example is used to verify the pros and cons of the hybrid adaptive SCKF estimation algorithm HASCKF proposed by Theorem 2 and the standard SCKF algorithm described in Reference [4]. Assume that the target is moving in a uniform linear motion on a two-dimensional plane. The system state x(k) = [x(k),ẋ(k), y(k),ẏ(k)] T , where x(k) and y(k) are the position components in the east-north coordinate system.ẋ(k) andẏ(k) are the corresponding velocity components, respectively. Then, the state equation can be described as: where the sampling period T = 0.5 s, the process noise w(k) is zero mean Gaussian white noise. Its statistical characteristic is Q(k) = 0.1 × diag(Q1, Q2), and Q1 = Consider a radar to track the target. The nonlinear measurement equation can be expressed as The real statistical characteristic of the measuring noise R(k) = diag{(4m) 2 , (0.1 • ) 2 }. During the simulation, the simulation time was set as N s = 100 s. Suppose that the inaccuracy initial value of the measurement noise varianceR(0) = diag{(81m) 2 , (0.3 • ) 2 }. MAP-NE adopts the estimator described in Equation (19), The estimated results of SCKF and HASCKF algorithms are shown in Figures 6-10. The estimation error is given in Table 3.
From Figure 6 to Figure 10, it can be clearly seen that the standard SCKF estimation showed a large deviation after 40 s. However, HASCKF could still better estimate the state of the target. This is because the standard SCKF adopts an inaccurate prior noise varianceR(0), while the HASCKF proposed in this paper estimates and corrects the inaccurate measurement noise variance, thus ensuring the accuracy and stability of the algorithm. As analyzed in Example 1, due to the randomness of noise, not all the noise variance estimates obtained by WF-NE were better. Therefore, the estimates of some components of the state obtained by SCKF were better than the one solved by HASCKF at some time in the interval [10 s, 40 s]. However, the general tend of the proposed HASCKF was better than SCKF, as shown in Figures 6-10

Conclusions
In this paper, aiming at the filtering problem caused by the inaccurate measurement noise variance in real systems, a weighted fusion estimation method is designed and a class of hybrid adaptive filtering structures is proposed, based on different measurement noise variance estimators. The stability and the filtering accuracy of the hybrid adaptive filter are analyzed and discussed theoretically. The simulation results showed that the algorithm had excellent accuracy and stability.
The work that can continue to be studied in the future includes further study on the research and design of fusion methods based on hybrid adaptive filtering in nonlinear centralized fusion framework and distributed fusion framework, research on functional equivalence among various fusion algorithms, and so on.