Design Method for a Higher Order Extended Kalman Filter Based on Maximum Correlation Entropy and a Taylor Network System

This paper proposes a new design method for a higher order extended Kalman filter that combines the maximum correlation entropy criterion with a Taylor network system, targeting nonlinear random dynamic systems with modeling errors whose statistical properties are unknown. Firstly, the transfer function and measurement function are transformed into a nonlinear random dynamic model in polynomial form via system identification with a multidimensional Taylor network. Secondly, the higher order polynomial terms in the transformed state model and measurement model are defined as implicit (hidden) variables of the system, so that the state model and the measurement model become equivalent to pseudolinear models in the combination of the original variables and the hidden variables. Thirdly, the higher order hidden variables are treated as additive parameters of the system; we then establish an extended-dimension linear state model and a measurement model that combine state and parameters, based on the previously obtained random dynamic model. Finally, as only a limited number of samples of the random modeling error are available, we combine the maximum correlation entropy estimator with the Kalman filter to establish a new higher order extended Kalman filter. The effectiveness of the new filter is verified by digital simulation.


Introduction
Filters occupy an important position in many fields, both nationally and internationally. Progress in filtering plays an important role in national economic construction, and especially in national defense construction, in tasks such as real-time estimation and target tracking. In 1960, Kalman proposed a filtering method for linear systems under the minimum mean squared error criterion, and it soon came into wide use [1]. To handle nonlinear problems, extended Kalman filters (EKFs) [2], unscented Kalman filters (UKFs) [3], and cubature Kalman filters (CKFs) have since emerged. However, these filtering methods require the modeling error to be Gaussian white noise. As such, their performance is likely to worsen in non-Gaussian situations, especially when the system is disturbed by impulsive noise. Impulsive noise arises from heavy-tailed distributions [4] (such as some mixed Gaussian distributions) and is common in many real scenarios of automatic control and target tracking; for instance, the measurement noise in a radar system is often heavy-tailed non-Gaussian noise rather than Gaussian noise [5]. In 1993, Gordon and Salmond proposed particle filtering for the case in which the density function is known [6]; it approximates the distribution function by sampling a large number of particles from it. However, this method is computationally complicated: it requires a large number of particles, and it suffers from particle degradation after resampling. Moreover, the density function is generally difficult to obtain. For this reason, Chen designed, for linear systems, a Kalman filter under the maximum correlation entropy (correntropy) criterion based on a limited number of realizations of the random variables [7]; this is called the maximum correlation entropy Kalman filter (MCKF) [8].
On this basis, the maximum correntropy extended Kalman filter (MCEKF) and the maximum correntropy unscented Kalman filter (MCUKF), which can handle nonlinear non-Gaussian systems, have since emerged [9]. However, in the MCEKF, all higher order terms in the Taylor expansion are discarded. A large truncation error is therefore generated, and the filtering performance degrades, or the filter even diverges, as the nonlinearity of the system increases. In addition, the Taylor expansion coefficients must be recalculated at each step of the state estimation, which increases the computational complexity. The MCUKF uses the unscented transformation (UT) with sigma point sampling [10], a form of deterministic sampling. There are only 2n + 1 sampling points for an n-dimensional system, so the method has no strong claim to superiority for either low-dimensional or high-dimensional systems. A large number of experiments have shown that both EKFs and UKFs achieve at most second-order polynomial approximation accuracy [11], which produces a large approximation error. Hence, both eventually face degraded filtering performance and divergence as the nonlinearity of the system increases [12].
This paper proposes a higher order extended Kalman filter method based on maximum correlation entropy, under the assumption that both the state and measurement equations can be modeled as strongly nonlinear functions. The main contributions of this paper are as follows: (1) using multidimensional Taylor networks to convert the general expression of nonlinear functions into higher order polynomials; (2) defining the polynomial terms of each order in the system as hidden variables of the corresponding order, and treating them as time-varying parameters; (3) establishing the dynamic relationship between the time-varying parameters and combining them with the original variables to establish the expanded-dimension state model; (4) based on the expanded linear state variables, equivalently rewriting the measurement model in the corresponding linear form; and (5) according to the established linear state and measurement models of the new extended-dimension system, establishing a higher order extended Kalman filter method based on maximum correlation entropy.
The remainder of this paper is organized as follows: Section 1 presents preliminaries and introduces the definition of correntropy; Section 2 presents a method for identifying nonlinear functions based on multidimensional Taylor networks; Section 3 presents a higher order extended Kalman filter method; Section 4 presents the detailed design process of the maximum correlation entropy higher order extended Kalman filter; Section 5 presents the simulation verification; and Sections 6 and 7 present a summary and outlook.

Description of Correntropy
Correntropy is a generalized similarity measure between two random variables [13]. Given two one-dimensional random variables φ, ξ ∈ R^1 with joint distribution function F_φξ(ϕ, ζ), the correlation entropy is defined as follows:

$$V(\varphi, \xi) = \varepsilon\left[\alpha(\varphi, \xi)\right] = \int \alpha(\phi, \zeta)\, dF_{\varphi\xi}(\phi, \zeta) \tag{1}$$

where ε is the expectation operator and α(·, ·) is a translation-invariant Mercer kernel. Unless otherwise stated, the kernel function used in this article is the Gaussian kernel, which is defined as follows:

$$\alpha(\phi, \zeta) = G_{\tau}(e) = \exp\left(-\frac{e^{2}}{2\tau^{2}}\right) \tag{2}$$

where e = ϕ − ζ and τ > 0 represents the kernel's bandwidth. By expanding Equation (2) with a Taylor series, we can obtain the following:

$$G_{\tau}(e) = \sum_{n=0}^{\infty} \frac{(-1)^{n}}{2^{n}\tau^{2n}\, n!}\, e^{2n} \tag{3}$$

and then the correlation entropy of Equation (1) has the following expression:

$$V(\varphi, \xi) = \sum_{n=0}^{\infty} \frac{(-1)^{n}}{2^{n}\tau^{2n}\, n!}\, \varepsilon\left[(\varphi - \xi)^{2n}\right] \tag{4}$$

However, in most practical cases, the joint distribution F_φξ is unknown, and only a finite number of realizations (ϕ(j), ζ(j)), j = 1, 2, · · · , N, of the random variables (φ, ξ) are available. In these cases, the sample mean estimator can be used to estimate the correlation entropy from the finite data:

$$\hat{V}(\varphi, \xi) = \frac{1}{N}\sum_{j=1}^{N} G_{\tau}\big(e(j)\big), \qquad e(j) = \phi(j) - \zeta(j)$$

When φ, ξ ∈ R^n and the components of the vector e = φ − ξ are independent of one another, the multidimensional correlation entropy is likewise estimated from the N samples.
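As a minimal illustration of the sample-mean estimate described above, the following Python sketch computes the correntropy of two paired sample sets and compares it with the mean squared error. It assumes the unnormalized Gaussian kernel exp(−e²/(2τ²)); the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kernel(e, tau):
    """Unnormalized Gaussian kernel exp(-e^2 / (2 tau^2)), bandwidth tau > 0 (assumed form)."""
    return math.exp(-(e * e) / (2.0 * tau * tau))


def sample_correntropy(phi, zeta, tau):
    """Sample-mean correntropy estimate over N paired realizations."""
    if len(phi) != len(zeta) or len(phi) == 0:
        raise ValueError("phi and zeta must be non-empty and of equal length")
    return sum(gaussian_kernel(p - z, tau) for p, z in zip(phi, zeta)) / len(phi)


def mean_squared_error(phi, zeta):
    """Ordinary mean squared error, shown only for comparison."""
    return sum((p - z) ** 2 for p, z in zip(phi, zeta)) / len(phi)


# Hypothetical toy data: identical samples except for one impulsive error of size 100.
clean = [0.0, 1.0, 2.0, 3.0]
with_outlier = [0.0, 1.0, 2.0, 103.0]

print(sample_correntropy(clean, clean, tau=1.0))
print(sample_correntropy(clean, with_outlier, tau=1.0))
print(mean_squared_error(clean, with_outlier))
```

Because each kernel value lies in (0, 1], a single outlier can lower the estimate by at most 1/N, whereas its contribution to the mean squared error grows with the square of its size. This bounded influence is what makes the criterion attractive under impulsive noise.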

Lemma 1.
Any continuous function defined on a closed interval can be approximated to arbitrary accuracy by a polynomial function [15].

Lemma 2.
A continuous function σ(α(k)) defined on a closed interval can be approximated by the polynomial expansion of Equation (9) [16], where N(h, l) denotes the total number of terms in the expansion and λ_{i,t} denotes the power of the variable α_t in the ith product term.

Multidimensional Taylor Network Structure
The multidimensional Taylor network model can replace a traditional neural network for the dynamic modeling and control of a system under certain conditions; it is characterized by a nonlinear autoregressive moving-average model composed of polynomials. The multidimensional Taylor network (MTN) uses a forward structure with a single intermediate layer, comprising an input layer, an intermediate layer, and an output layer. Suppose that the input layer comprises n nodes, α(τ) = [α_1(τ), α_2(τ), · · · , α_n(τ)]^T, and that the output layer is α(τ + 1). The middle layer is the network processing layer, in which the weighted summation of the power product terms of the input variables is realized. It is composed of the various power product terms and the corresponding connection weight vector ψ_j(τ) = [ψ_{j,1}(τ), ψ_{j,2}(τ), · · · , ψ_{j,N(h,l)}(τ)]^T, which is the output weight vector connecting the intermediate layer and the output node of the network. According to the multivariate Taylor formula, if a function is differentiable to the (h + 1)th order at a certain point, then the function can be expanded into a power series in which the power of the variables is not greater than m. The model can be expressed as a dynamic equation, as follows: where σ(·) is the nonlinear function described by the multidimensional Taylor network model, ψ_i represents the weight of the ith product term, N(h, l) denotes the total number of terms in the expansion, λ_{i,t} denotes the power of the variable α_t in the ith product term, and ∆σ(τ) is the error (also known as the remainder) produced when the function is identified by a multidimensional Taylor network.
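To make the structure of the middle layer concrete, the following Python sketch enumerates the power product terms and evaluates their weighted sum for one output node. It is only an illustration under stated assumptions: the names `power_exponents` and `mtn_output` are hypothetical, the constant (zeroth order) term is included in the enumeration, and the ordering of the terms is an arbitrary choice; the paper's own indexing of N(h, l) may differ.

```python
from itertools import product
from math import comb


def power_exponents(n_inputs, max_order):
    """All exponent tuples (lambda_1, ..., lambda_n) with total degree 0..max_order."""
    exps = [lam for lam in product(range(max_order + 1), repeat=n_inputs)
            if sum(lam) <= max_order]
    # Sort by total degree, then so that powers of the first variable come first.
    exps.sort(key=lambda lam: (sum(lam), tuple(-x for x in lam)))
    return exps


def mtn_output(alpha, weights, exponents):
    """Weighted sum of power products: sum_i w_i * prod_t alpha_t ** lambda_{i,t}."""
    total = 0.0
    for w, lam in zip(weights, exponents):
        term = 1.0
        for a, p in zip(alpha, lam):
            term *= a ** p
        total += w * term
    return total


exponents = power_exponents(n_inputs=2, max_order=2)
print(exponents)                      # exponent tuples of the product terms
print(len(exponents), comb(2 + 2, 2))  # number of terms vs. binomial count
print(mtn_output([2.0, 3.0], [1, 2, 3, 4, 5, 6], exponents))
```

With the constant term included, the number of product terms for n inputs and maximum order m equals the binomial coefficient C(n + m, m), which shows how quickly the middle layer grows with the expansion order.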

Parameter Identification Method Based on Kalman Filtering

Model Establishment of a Kalman Filter
A Kalman filter can be regarded as an optimized autoregressive data processing method that describes the entire system through a state equation and an observation equation.
State equation: where i = 1, 2, · · · , N(h, l) and j = 1, 2, · · · , h. Observation equation: From Figure 1, it follows that: where ψ_{j,i}(τ + 1) represents the system state, that is, the value of the parameter at the corresponding time step, and β(τ + 1) represents the output value of the network. It is assumed in the analysis that both the process noise w(τ) and the measurement noise v(τ + 1) are Gaussian white noise, with Q_j = diag(Q_{j,1}, Q_{j,2}, · · · , Q_{j,N(h,l)}) and R_j = diag(R_{j,1}, R_{j,2}, · · · , R_{j,N(h,l)}) being the process noise variance and the measurement noise variance, respectively. Here, we use a Kalman filter to approximate the dynamic model. As the filtering principle of the Kalman filter is described later in this article, please refer to Equations (20)-(24) for the detailed process.
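The identification idea above can be sketched in Python for a single output node. This is a hedged illustration, not the paper's exact model: it assumes that the weights follow a random walk (identity state transition) and that the network output is linear in the weights through a regressor vector `h` of power products. The function names, the noise levels, and the toy polynomial are all hypothetical.

```python
import random


def kf_identify_step(psi, P, h, beta, q, r):
    """One Kalman step for weights psi with scalar output beta = h^T psi + v.

    Assumed model: psi(t+1) = psi(t) + w (random walk), var(w_i) = q, var(v) = r.
    """
    n = len(psi)
    # Prediction: the random-walk model leaves psi unchanged and inflates P by q*I.
    P_pred = [[P[i][j] + (q if i == j else 0.0) for j in range(n)] for i in range(n)]
    Ph = [sum(P_pred[i][j] * h[j] for j in range(n)) for i in range(n)]
    s = sum(h[i] * Ph[i] for i in range(n)) + r      # innovation variance
    gain = [Ph[i] / s for i in range(n)]
    innovation = beta - sum(h[i] * psi[i] for i in range(n))
    psi_new = [psi[i] + gain[i] * innovation for i in range(n)]
    # Covariance update P = P_pred - K h^T P_pred (P_pred is symmetric).
    P_new = [[P_pred[i][j] - gain[i] * Ph[j] for j in range(n)] for i in range(n)]
    return psi_new, P_new


def demo_identification(seed=0, steps=500):
    """Identify the weights of a toy polynomial 0.5 - a + 2 a^2 from noisy outputs."""
    rng = random.Random(seed)
    true_psi = [0.5, -1.0, 2.0]
    psi = [0.0, 0.0, 0.0]
    P = [[100.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(steps):
        a = rng.uniform(-1.0, 1.0)
        h = [1.0, a, a * a]                           # power product terms
        beta = sum(w * x for w, x in zip(true_psi, h)) + rng.gauss(0.0, 0.05)
        psi, P = kf_identify_step(psi, P, h, beta, q=1e-8, r=0.05 ** 2)
    return true_psi, psi


true_psi, estimated_psi = demo_identification()
print(true_psi, estimated_psi)
```

With a scalar output, the innovation variance is a scalar, so no matrix inversion is needed; a vector-valued output would require the usual matrix form of the gain.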

Approximation Analysis
Given a class of nonlinear functions σ(α(k)), assume that it is differentiable to the rth order, where r is relatively large, which makes it difficult to approximate the function with a Taylor network of that order. A better approach is to choose m with 1 ≤ m ≤ r, use the Taylor network to expand the nonlinear function to the mth order, obtain the result of Equation (16), and simultaneously ensure that the higher order error term satisfies ∆δ ≤ θ, where θ is the acceptable error threshold. This not only makes the Taylor network fitting process easier, but also ensures the accuracy of the fit.
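The trade-off between the expansion order m and the threshold θ can be illustrated with a one-dimensional stand-in. The sketch below is an assumption-laden toy: it uses the ordinary Taylor polynomial of exp(x) about 0 on the interval [-1, 1] rather than an identified Taylor network, and the names `taylor_exp` and `smallest_order` are hypothetical.

```python
import math


def taylor_exp(x, order):
    """Taylor polynomial of exp(x) about 0, truncated at the given order."""
    return sum(x ** k / math.factorial(k) for k in range(order + 1))


def smallest_order(theta, r, grid_points=201):
    """Smallest m in 1..r whose worst-case error on [-1, 1] is at most theta."""
    grid = [-1.0 + 2.0 * i / (grid_points - 1) for i in range(grid_points)]
    for m in range(1, r + 1):
        worst = max(abs(math.exp(x) - taylor_exp(x, m)) for x in grid)
        if worst <= theta:
            return m
    return None  # no order up to r meets the threshold


print(smallest_order(theta=1e-3, r=10))
print(smallest_order(theta=0.1, r=10))
```

A looser threshold is met by a lower order, so θ directly controls how many higher order terms have to be carried by the model.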

Pseudolinearized Representation of Nonlinear Functions
For ease of description and understanding, if l = d = 2, we can expand Equation (7) through a multidimensional Taylor network to the mth order, as follows: where ∑_{l_1+l_2=l} is the sum of all tensors of the lth order and ω_{i,l_1,l_2} represents the weight corresponding to each order of the tensor.
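Definition 1. α^(l)(τ) = {α_1^{l_1}(τ) α_2^{l_2}(τ) : l_1 + l_2 = l} is the set of implicit variables of the lth order.
Definition 2. The vector formed by the weights ω_{i,l_1,l_2} with l_1 + l_2 = l is the weight vector corresponding to the lth order implicit variables.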
The pseudolinearization process is described in detail in [17], so we do not repeat it in this article. In order to make the model more accurate, we treat the remainder ∆σ(τ) of the state equation as a latent variable. According to Definition 1 and Definition 2, the pseudolinear extended-dimension form using the remainder as a hidden variable is given by Equation (14). Similarly, Equation (8) can be rewritten in pseudolinear form.

Linearized Representation of Nonlinear Functions
In order to transform the pseudolinear model established in Section 3.1 into a true linear form, it is necessary to establish a dynamic relationship between the lth order hidden variables and the uth order hidden variables [18]: where W can be identified based on the multidimensional Taylor network in its original state; without any prior information, it can be set as follows: Combining Definition 1, Definition 2, and Equation (19), the state model Equation (7) has the following linear matrix form: If A(τ) = [(α^(1)(τ))^T, (α^(2)(τ))^T, · · · , (α^(l)(τ))^T, · · · , (α^(r)(τ))^T, ∆σ(τ)]^T, then Equation (7) has the following linearized form: where γ(τ) is the modeling error.
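The idea of stacking the original variables, the higher order hidden variables, and the remainder into one extended vector can be sketched for two variables and second order. Everything specific in the following Python example is hypothetical: the toy polynomial, the weight row, and the function names are chosen only to show that a polynomial model becomes a linear product with the extended vector.

```python
def hidden_variables_order2(alpha):
    """Second order hidden variables of a two-dimensional state (alpha_1, alpha_2)."""
    a1, a2 = alpha
    return [a1 * a1, a1 * a2, a2 * a2]


def extended_state(alpha, remainder=0.0):
    """Stack first order variables, second order hidden variables, and the remainder."""
    return list(alpha) + hidden_variables_order2(alpha) + [remainder]


def nonlinear_model(alpha):
    """Hypothetical polynomial state update for one component (illustration only)."""
    a1, a2 = alpha
    return 0.9 * a1 + 0.1 * a2 - 0.05 * a1 * a1 + 0.02 * a1 * a2


# One row of the pseudolinear coefficient matrix; the last entry multiplies the remainder.
weights_row = [0.9, 0.1, -0.05, 0.02, 0.0, 1.0]


def pseudolinear_model(alpha, remainder=0.0):
    """The same update written as a linear product with the extended state."""
    A = extended_state(alpha, remainder)
    return sum(w * a for w, a in zip(weights_row, A))


print(extended_state((1.0, 2.0)))
print(nonlinear_model((1.0, 2.0)), pseudolinear_model((1.0, 2.0)))
```

The model is only pseudolinear at this stage because the hidden variables still depend on the original state; the dynamic relationship between hidden variables of different orders is what turns it into a linear model that a Kalman filter can use.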

Design of Higher Order Extended Kalman Filter
For the linear model obtained above, a KF-based filter can be designed. Given the initial value A(0), assume that γ(τ) and θ(τ + 1) are zero-mean Gaussian white noise with variances ϑ and η, respectively.
A recursive filter can be designed as follows:
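For orientation, the textbook Kalman recursion for a linear model of this kind can be written as below. This is a sketch under assumed notation rather than a transcription of the paper's own equations: Φ (state transition matrix), H (measurement matrix), z (measurement), and K (gain) are hypothetical symbol names, while Â, λ, ϑ, and η follow the symbols used in the text for the state estimate, the estimation error covariance, and the two noise variances.

```latex
\begin{aligned}
\hat{A}(\tau+1 \mid \tau) &= \Phi\,\hat{A}(\tau \mid \tau) \\
\lambda(\tau+1 \mid \tau) &= \Phi\,\lambda(\tau \mid \tau)\,\Phi^{T} + \vartheta \\
K(\tau+1) &= \lambda(\tau+1 \mid \tau)\,H^{T}\left[H\,\lambda(\tau+1 \mid \tau)\,H^{T} + \eta\right]^{-1} \\
\hat{A}(\tau+1 \mid \tau+1) &= \hat{A}(\tau+1 \mid \tau) + K(\tau+1)\left[z(\tau+1) - H\,\hat{A}(\tau+1 \mid \tau)\right] \\
\lambda(\tau+1 \mid \tau+1) &= \left[I - K(\tau+1)\,H\right]\lambda(\tau+1 \mid \tau)
\end{aligned}
```

The first two lines are the time update (prediction) and the last three are the measurement update.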

Statistical Independence Processing of the Components of the Non-Gaussian Modeling Error Vector in the Comprehensive Measurement Model
The modeling error vector in the comprehensive measurement model Equation (22) is non-Gaussian, and its components are not statistically independent. In order to use the correlation entropy form of the multidimensional independent vector shown in Equation (19), this non-Gaussian error vector must first be transformed so that its components are statistically independent.
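One common way to carry out such a transformation is to whiten the error vector with the Cholesky factor of its covariance matrix. The Python sketch below assumes that the covariance R of the error vector is known and positive definite; it is not necessarily the transformation used in this paper, and the function names are hypothetical. Note that whitening removes correlation only, which for non-Gaussian vectors is an approximation to full statistical independence.

```python
import math


def cholesky(R):
    """Lower-triangular L with L L^T = R, for a symmetric positive definite R."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L y = b for lower-triangular L; y = L^{-1} b is the whitened vector."""
    y = []
    for i in range(len(b)):
        s = sum(L[i][k] * y[k] for k in range(i))
        y.append((b[i] - s) / L[i][i])
    return y


R = [[4.0, 2.0], [2.0, 3.0]]        # hypothetical covariance of a correlated error vector
L = cholesky(R)
error = [2.0, 1.0 + math.sqrt(2.0)]  # hypothetical error sample
print(L)
print(forward_solve(L, error))       # whitened error with identity covariance
```

Left-multiplying both sides of the measurement model by the inverse of L yields an equivalent model whose error vector has identity covariance, so that a component-wise correntropy cost can be applied.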

Implementation Process of a Higher Order Extended Kalman Filter Based on Maximum Correlation Entropy
The filtering process of the higher order extended Kalman filter based on the maximum correlation entropy (H-MCKF) is as follows (see [20] for the specific derivation process):

1. Filter initialization: obtain the initial filter value Â(0) and the covariance λ(0), and choose a suitable kernel bandwidth and a small positive number ε.
2. Use Taylor networks for system identification to obtain the parameters in the equations, taking the expanded terms and the remainder as the new hidden variables. Perform the pseudolinearization process to obtain the pseudolinear form of the system.
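The role of the kernel bandwidth and the small positive number ε chosen at initialization can be illustrated with the fixed-point measurement update commonly used in maximum correntropy Kalman filters. The following Python sketch is a scalar simplification written under that assumption; it is not taken from the paper or from [20], and the function name, arguments, and stopping rule are hypothetical.

```python
import math


def mc_update_scalar(x_prior, p_prior, y, h, r, bandwidth, eps=1e-6, max_iter=50):
    """Scalar fixed-point measurement update under a correntropy criterion (sketch).

    x_prior, p_prior: predicted state and variance; y = h * x + v with var(v) = r.
    """
    def kernel(e):
        # Gaussian kernel, floored to avoid division by zero for huge residuals.
        return max(math.exp(-(e * e) / (2.0 * bandwidth * bandwidth)), 1e-300)

    x = x_prior
    gain = 0.0
    for _ in range(max_iter):
        c_x = kernel((x_prior - x) / math.sqrt(p_prior))  # weight of the prior
        c_y = kernel((y - h * x) / math.sqrt(r))          # weight of the measurement
        p_mod = p_prior / c_x
        r_mod = r / c_y                                   # outliers inflate r_mod
        gain = p_mod * h / (h * h * p_mod + r_mod)
        x_new = x_prior + gain * (y - h * x_prior)
        converged = abs(x_new - x) <= eps * max(abs(x), 1e-12)
        x = x_new
        if converged:
            break
    p_post = (1.0 - gain * h) ** 2 * p_prior + gain * gain * r
    return x, p_post


# Hypothetical outlier measurement y = 10 against a prior of 0 with unit variances.
print(mc_update_scalar(0.0, 1.0, 10.0, 1.0, 1.0, bandwidth=1e6))
print(mc_update_scalar(0.0, 1.0, 10.0, 1.0, 1.0, bandwidth=2.0))
```

With a very large bandwidth both kernel weights are close to one and the update reduces to the ordinary Kalman update; with a small bandwidth the outlying measurement is down-weighted and the estimate stays near the prior. The threshold ε only decides when the fixed-point iteration stops.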

Simulated Cases
This section verifies the validity of the proposed method by providing two cases: one in which the state equation is a nonlinear equation and the measurement equation is a linear equation, and one in which the state and measurement equations are both nonlinear.

Case 1
Consider a nonlinear system in which the state equation is a nonlinear model and the measurement equation is a linear model: where the initial value x(0) is a random value in [0, 1], the initial estimation error covariance is P(0|0) = 0.1 × diag(1, 1), and the process noise and measurement noise are mixed-Gaussian, such as v_2(k) ∼ 0.9N(0, 0.02) + 0.1N(0, 2). Figure 2 shows a diagram of the MTN identification system, while Figure 3 shows the estimated values of the state variables x_1 and x_2 under the three filtering methods. From [21], we know that the influence of ε is not significant compared with that of the kernel bandwidth σ; the parameter is therefore set to ε = 10^−6. Tables 1 and 2 show the mean squared error and the mean relative error, respectively, of the estimated values under the three algorithms, computed as averages over 100 independent Monte Carlo runs, with each run containing 50 time steps. When σ = 5, all three algorithms obtain better filtering results. Figures 4 and 5 show the probability densities of the estimation errors for the states x_1 and x_2, respectively, with ε = 10^−6 and σ = 5. All of the results confirm that the proposed H-MCKF (the higher order extended Kalman filter based on maximum correlation entropy and a Taylor network system) significantly outperforms the MCEKF (maximum correntropy extended Kalman filter) when the system is disturbed by non-Gaussian process and measurement noise, and that the H-MCKF_R (the H-MCKF with the remainder of the state equation) further improves on the filtering performance of the H-MCKF.
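Noise of the mixed-Gaussian type used in this case can be generated as in the Python sketch below. It assumes that the second argument of N(·, ·) is a variance and that the mixture weights are 0.9 and 0.1 as in the expression for v_2(k); the function names and the sample size are hypothetical.

```python
import random


def sample_mixture(rng, weight_small=0.9, var_small=0.02, var_large=2.0):
    """Draw one sample from weight_small*N(0, var_small) + (1-weight_small)*N(0, var_large)."""
    variance = var_small if rng.random() < weight_small else var_large
    return rng.gauss(0.0, variance ** 0.5)


def sample_variance(samples):
    """Population variance of a list of samples."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


rng = random.Random(0)
noise = [sample_mixture(rng) for _ in range(100000)]
# Theoretical variance of the mixture: 0.9 * 0.02 + 0.1 * 2 = 0.218.
print(sample_variance(noise))
print(max(abs(v) for v in noise))
```

Most samples come from the narrow component, while roughly one in ten comes from the wide component and acts as an impulsive disturbance, which is the heavy-tailed behaviour that the correntropy-based filters are designed to tolerate.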

Case 2
Consider a nonlinear system in which the state equation and the measurement equation are both nonlinear models: where the initial value x(0) is a random value in [0, 1], the initial estimation error covariance is P(0|0) = 0.1 × diag(1, 1), and the process noise and measurement noise are non-Gaussian. Figure 6 shows a diagram of the MTN identification system, while Figure 7 shows the estimated values of the state variables x_1 and x_2 under the three filtering methods. As in case 1, the parameter is set to ε = 10^−6. Tables 3 and 4 show the mean squared error and the mean relative error, respectively, of the estimated values under the three algorithms, computed as averages over 100 independent Monte Carlo runs, with each run containing 50 time steps. When σ = 5, all three algorithms obtain better filtering results. Figures 8 and 9 show the probability densities of the estimation errors for the states x_1 and x_2, respectively, with ε = 10^−6 and σ = 5. All of the results confirm that the proposed H-MCKF significantly outperforms the MCEKF when the system is disturbed by non-Gaussian process and measurement noise, and that the H-MCKF_R further improves on the filtering performance of the H-MCKF when the state and measurement equations are both nonlinear.

Conclusions
This paper considered a wide range of filter design problems for the state estimation of multivariable dynamic systems that consist of a strongly nonlinear dynamic model and a strongly nonlinear observation model. Firstly, we transformed these strongly nonlinear models into higher order polynomial series using a multidimensional Taylor network. Secondly, all higher order terms in the polynomial series were defined as hidden variables, and the higher order series were rewritten in their pseudolinear equivalent forms. Thirdly, dynamic relationships between all hidden variables and known variables were constructed using the multidimensional Taylor network. By combining the pseudolinearized original model with the dynamic model of the higher order hidden variables, linear dynamic models suited to a standard Kalman filter were obtained. Finally, considering that only a finite number of samples of the modeling error can be obtained, we built the higher order extended Kalman filter based on maximum correlation entropy, and achieved better filtering performance than that offered by the existing MCEKF [22].
Outlook: Several challenges remain worthy of further research. Firstly, the proposed higher order extended Kalman filter based on maximum correlation entropy is an online iterative process that repeatedly refines the state estimate; as such, it loses one important property of the standard Kalman filter: the ability to operate in real time. Secondly, the parameters of the linearized model of the original nonlinear model and of the hidden variable dynamic model were identified from data in a local time period; they therefore need to be updated with data from new time periods in order to track the time-varying dynamics of the system. Thirdly, on the basis of defining all of the hidden variables, we established a linear form of the strongly nonlinear model in an expanded state consisting of the original variables and all hidden variables, and obtained better estimation performance than that of a standard EKF; if the measurements can be expanded in the same manner as the state, we believe that such a filter may offer better estimation performance than the one established in this paper.