1. Introduction
State estimation is a key problem in engineering and scientific fields, involving the estimation of a system’s internal state from noisy measurements [1,2]. The Kalman filter (KF) is the optimal linear estimator, providing an optimal recursive solution for linear systems under Gaussian noise [3]. However, in real-world applications, nonlinear dynamics and non-Gaussian noise are common. Under such conditions, the standard KF may lose its optimality in minimizing the mean squared error (MSE) of the estimated state. It is therefore crucial to extend the KF to efficiently handle state estimation for nonlinear systems in the presence of non-Gaussian noise.
For nonlinear filtering problems, an exact analytical solution is often impractical to obtain. Many approaches have been explored, including the extended Kalman filter (EKF) [4], unscented Kalman filter (UKF) [5], cubature Kalman filter (CKF) [6], and Gauss–Hermite Kalman filter (GHKF) [7]. The EKF uses a first-order Taylor series expansion to linearize the system equations. However, it suffers from poor approximation accuracy due to linearization errors, and for highly nonlinear systems it may even diverge [8]. To address this issue, the UKF, CKF, and GHKF approximate the state conditional probability distribution using deterministic sampling (DS). Although the UKF avoids the linearization step and offers more accurate estimates for nonlinear systems, it demands more computational resources and careful parameter tuning. Moreover, in high-dimensional nonlinear systems, the UKF may encounter negative weights, leading to filter instability and potential divergence. While the CKF resolves the UKF’s numerical instability, it introduces the problem of nonlocal sampling. The GHKF may suffer from the curse of dimensionality in high-dimensional systems, resulting in significant computational burdens. Based on the Monte Carlo method, the particle filter (PF) uses random sampling of particles to represent the probability density [9]. Since the PF makes no assumptions about the prior or posterior density, it requires considerable computation to attain accurate state estimation. Thus, depending on the system and the specific requirements, the suitable filter should be selected by balancing accuracy against computational demands.
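To make the linearization idea concrete, the sketch below implements one EKF predict/update cycle on a toy scalar system. This is a generic illustration, not taken from any of the cited works; the function names, Jacobian helpers, and the toy model are all our own assumptions.

```python
import numpy as np

def ekf_step(x, P, z, f, F_jac, h, H_jac, Q, R):
    """One EKF predict/update cycle using first-order Taylor linearization.
    All names here are illustrative, not from the cited references."""
    # Predict: propagate the state through the nonlinear dynamics f,
    # and the covariance through the Jacobian F evaluated at the estimate.
    x_pred = f(x)
    F = F_jac(x)
    P_pred = F @ P @ F.T + Q
    # Update: linearize the measurement function h around the prediction.
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(x_pred.size) - K @ H) @ P_pred
    return x_new, P_new

# Toy 1-D example: x_{k+1} = sin(x_k), z_k = x_k^2 + noise
f = lambda x: np.sin(x)
F_jac = lambda x: np.array([[np.cos(x[0])]])
h = lambda x: x ** 2
H_jac = lambda x: np.array([[2.0 * x[0]]])
x, P = np.array([0.5]), np.eye(1)
x, P = ekf_step(x, P, np.array([0.3]), f, F_jac, h, H_jac,
                Q=0.01 * np.eye(1), R=0.1 * np.eye(1))
```

The linearization errors mentioned above enter through `F_jac` and `H_jac`: both Jacobians are evaluated at a single point, so the update ignores all higher-order terms of the dynamics.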
Generally, the standard KF and its variants are based on the framework of linear minimum mean square error (LMMSE) estimation [10]. LMMSE estimation seeks the optimal linear estimator in the original measurement space. When the relation between the measurement and the state is only weakly nonlinear, LMMSE estimation is expected to yield more accurate results [10]. Furthermore, by considering a wider or optimized set of measurements that are uncorrelated with the original ones but still contain relevant information about the system state, LMMSE estimation can achieve better accuracy. The uncorrelated conversion-based filter (UCF) improves the LMMSE estimator in this manner, incorporating new measurements through nonlinear transformations [11]. The optimized conversion-sample filter (OCF) simplifies the optimization of the nonlinear transformation function and limits the errors introduced by the DS [12]. In contrast to the UCF and OCF, the generalized conversion filter (GCF) optimizes both the conversion’s dimension and its sample points, providing a generalized transformation of the measurements using DS [13].
While the GCF designed under the minimum mean square error (MMSE) criterion handles Gaussian noise effectively, it is highly sensitive to non-Gaussian noise [14]. Notably, in many practical scenarios the measurement noise follows a non-Gaussian distribution and may contain outliers. Recently, numerous advancements have been made in enhancing filter performance under non-Gaussian noise. One approach augments the system model with quadratic or polynomial functions to improve estimation accuracy [15,16]. The Student’s t filter is another technique, using the Student’s t distribution to model measurement noise [17]. Moreover, information-theoretic learning (ITL) has been utilized to combat non-Gaussian noise in signal processing [18]. For instance, the maximum correntropy Kalman filter [19], maximum correntropy GCF (MCGCF) [20], and minimum error entropy Kalman filter [21] have been proposed. Unlike the previously discussed techniques, the generalized loss (GL) [22] provides flexibility in adjusting the shape of the loss function. It unifies various loss functions, such as the squared loss, Charbonnier loss [23], Cauchy loss [24], Geman–McClure loss [25], and Welsch loss [26]. By acting as a robust loss function that does not rely on the symmetry of Gaussian distributions, the GL improves both filtering accuracy and robustness, effectively capturing higher-order statistical characteristics and mitigating the impact of symmetry disruption in non-Gaussian environments. Several GL-based algorithms have already been proposed for different estimation problems. A variational Bayesian-based generalized loss cubature Kalman filter handles unknown measurement noise and potential outliers simultaneously [27]. The iterative unscented Kalman filter with a general robust loss function [28] and the geometric unscented Kalman filter with a GL function [29] enhance state estimation in power systems, alleviating the impact of non-Gaussian noise.
In this paper, a new nonlinear filter named generalized loss-based generalized conversion filter (GLGCF) is proposed, which employs the GL to reformulate the measurement update process of GCF. By leveraging the GCF’s ability to utilize higher-order information from transformed measurements and the GL’s robustness in dealing with various types of noise, the GLGCF outperforms both the GCF and MCGCF in non-Gaussian noise environments. The main contributions of this paper are summarized as follows:
To combat non-Gaussian noises, the GLGCF employs a robust nonlinear regression based on GL, and the posterior estimate and its covariance matrix are updated using a fixed-point iteration.
To solve the problem of manually setting the shape parameter in the GL function, the residual error is integrated into negative log-likelihood (NLL) of GL, and the optimal shape parameter is determined through minimizing the NLL.
Simulations on the target-tracking models in non-Gaussian noise environments demonstrate the superiority of the GLGCF. Additionally, its recursive structure makes it suitable for online implementation.
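The fixed-point iteration mentioned in the first contribution can be illustrated on a generic robust linear regression: each pass re-weights the residuals according to the loss’s influence function and re-solves a weighted least-squares problem. This is a sketch under our own assumptions (a Geman–McClure-type member of the loss family, illustrative names), not the paper’s exact GLGCF recursion.

```python
import numpy as np

def robust_fixed_point(A, b, alpha=-2.0, c=1.0, iters=50, tol=1e-8):
    """Illustrative fixed-point (iteratively reweighted) solver for
    min_x sum rho(A x - b) with a Barron-style loss, alpha != 2.
    Not the paper's exact update; names are illustrative."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # least-squares initialization
    for _ in range(iters):
        r = A @ x - b
        # weight w = rho'(r)/r (up to a constant factor) for the chosen loss
        w = 1.0 / (r ** 2 / (c ** 2 * abs(alpha - 2.0)) + 1.0) ** (1.0 - alpha / 2.0)
        W = np.diag(w)
        x_new = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# Fit a constant to data with one gross outlier: the robust estimate
# stays near 1, whereas plain least squares would be pulled toward 2.5.
A = np.ones((6, 1))
b = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 10.0])
x_hat = robust_fixed_point(A, b)
```

The outlier receives a near-zero weight after the first pass, so the fixed point is essentially the mean of the inliers.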
The rest of this paper is organized as follows.
Section 2 introduces the GL and GCF.
Section 3 derives the proposed GLGCF algorithm.
Section 4 demonstrates the effectiveness of the GLGCF by target-tracking simulations.
Section 5 concludes this paper.
3. Generalized Loss-Based Generalized Conversion Filter
Since the MMSE criterion depends only on the second-order statistics of the errors, the performance of the GCF deteriorates in non-Gaussian noise [30]. To improve its robustness, we propose integrating the GL cost function into the GCF framework, resulting in a new variant, the GLGCF. This variant is expected to perform better in non-Gaussian noise environments, as the GL incorporates second- and higher-order moments of the errors.
To combine the GL with the GCF, we first define a linear model that combines the state estimate and the measurement as follows:
where
and the covariance matrix of
can be obtained by
with . Multiplying both sides of Equation (23) by yields
where , and .
The GL-based cost function is defined as
where is the ith element of , is the ith element of , and is the ith row of .
Next, the optimal estimate of
can be obtained by
The solution to Equation (27) can be obtained by solving the following equation:
Equation (28) can be further expressed in matrix form as
where
with
and
According to [19], the updated Equation (25) can be applied with one iteration to yield similar results within the GCF framework by employing
to modify the measurement data. As noted in [31], two methods can be used to achieve this: the first modifies the residual error covariance using , and the second reconstructs a ‘pseudo-observation’. Although both methods are equivalent in their final outcome, for simplicity this paper presents the algorithm based on the first approach.
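The first method can be sketched as follows: keep the linear measurement update but scale the measurement noise covariance by the inverse robust weights, so that suspect measurement components are down-weighted. The names and the diagonal-scaling choice below are our own illustrative assumptions, not the paper’s exact equations.

```python
import numpy as np

def robust_update(x_pred, P_pred, z, H, R, weights):
    """Linear measurement update with the noise covariance inflated by the
    inverse robust weights (illustrative names, not the paper's symbols).
    A weight near 1 leaves a component untouched; a small weight suppresses it."""
    w_inv_sqrt = np.diag(1.0 / np.sqrt(weights))
    R_mod = w_inv_sqrt @ R @ w_inv_sqrt        # modified measurement noise covariance
    S = H @ P_pred @ H.T + R_mod               # residual (innovation) covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # gain with down-weighted components
    x = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(x_pred.size) - K @ H) @ P_pred
    return x, P
```

With unit weights this reduces to the standard linear update; as a weight shrinks toward zero, the corresponding measurement component is effectively discarded and the estimate stays close to the prediction.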
Let
denote the modified covariance, which can be expressed as
In the following analysis, we express
in two parts such that
Given that the actual state
is unknown, we set
. Under this condition, it is straightforward to observe that
The modified measurement noise covariance matrix can be obtained as
The GL-based cost function characterizes the error properties by weighting the measurement uncertainty, which is reflected in the modified measurement noise covariance matrix. This procedure allows a more accurate representation of the error dynamics by incorporating the uncertainty in the measurements, thus refining the covariance matrix to capture the true noise characteristics.
Next, we replace
in Equation (13) with
to obtain
Based on
and , we can generate a weight vector
and a sample set
of
by a DS method. Thus, the samples of
and
can be obtained as in Equation (15).
By employing the approach from Table I in [13], where
is substituted by , we obtain the constraint matrix
for . Subsequently,
is calculated by
for , as shown in Equations (16) and (17).
Thus, the filter estimated state
and its covariance matrix
can be computed by
where
Algorithm 1: GLGCF
Step 1: Set the initial state estimate and the error covariance matrix , choose a proper truncation limit value t and a minimum shape parameter < 2.
Step 2: Compute the predicted sampling points ; use and to calculate and ; adopt the Cholesky decomposition to obtain .
Step 3: Utilize to compute the propagated sampling points ; calculate the residual error . Employ Equation (45) and the residual error samples to obtain ; use Equations (34) and (37) to derive .
Step 4: Obtain the samples of and as in Equation (15). Compute the constraint matrix using Table I in [13] with substituted by .
Step 5: Compute using Equations (16) and (17). Update the state estimate and the error covariance matrix by the following equations:
where
and
The shape of the GL function is determined by the parameter
, with its value influencing the level of outlier suppression. Given that
directly affects filtering performance, finding its optimal value is essential. To address this, we formulate the negative log-likelihood (NLL) of the GL’s probability density function as follows:
where
is an approximate integral, with t representing the truncation limit [32]. Subsequently, we find the optimal
by minimizing the NLL of the residual error
as follows:
Since an analytical solution for the partition function in Equation (45) is not available and
is a scalar, a 1-D grid search within
can be used to minimize Equation (45). The choice of
< 2 ensures the stability of the loss function and reduces the computational complexity. When the system is affected by measurement outliers, the symmetric MMSE loss becomes sensitive to symmetry disruption, resulting in biased state estimates. By optimizing the shape parameter, the GL adapts more effectively to the characteristics of non-Gaussian noise, thereby enhancing the robustness of the GCF in complex noise environments.
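The 1-D grid search described above might be sketched as follows, with the partition function approximated by a truncated numerical integral over [−t, t]. The parameterization of the GL density used here is an assumption in the style of Barron’s general loss, not necessarily identical to Equation (45).

```python
import numpy as np

def nll_grid_search(residuals, alphas, c=1.0, t=10.0, n=4000):
    """Pick the shape parameter alpha minimizing the NLL of the residuals.
    The density is assumed Barron-style: p(r) = exp(-rho(r/c)) / (c * Z),
    with Z approximated by a truncated sum over [-t, t] (illustrative)."""
    def rho(x, alpha):
        if alpha == 2.0:
            return 0.5 * x ** 2
        if alpha == 0.0:
            return np.log1p(0.5 * x ** 2)
        b = abs(alpha - 2.0)
        return (b / alpha) * ((x ** 2 / b + 1.0) ** (alpha / 2.0) - 1.0)

    xs = np.linspace(-t, t, n)
    dx = xs[1] - xs[0]
    best_alpha, best_nll = None, np.inf
    for a in alphas:
        log_z = np.log(np.sum(np.exp(-rho(xs, a))) * dx)  # truncated partition fn
        nll = np.sum(rho(residuals / c, a)) + residuals.size * (np.log(c) + log_z)
        if nll < best_nll:
            best_alpha, best_nll = a, nll
    return best_alpha

# Heavy-tailed example: Gaussian residuals contaminated by large outliers,
# for which a shape parameter below 2 should yield a lower NLL.
rng = np.random.default_rng(0)
res = np.concatenate([rng.normal(size=200), [15.0, -20.0, 25.0]])
chosen = nll_grid_search(res, alphas=[0.5, 1.0, 2.0])
```

On clean Gaussian residuals the quadratic member is competitive, while outlier-contaminated residuals drive the search toward smaller, heavier-tailed shape parameters.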
Finally, the proposed GLGCF algorithm is summarized in Algorithm 1.
Computational Complexity
The computational complexity of the proposed GLGCF is presented below. Note that n and m are the dimensions of and , respectively; denotes the number of integration subintervals; denotes the number of sampling points, which is determined by the selected DS method. In this paper, we use the GHQ rule as the DS method. denotes the computational complexity of evaluating with the sampling points.
As seen from
Table 1, different DS methods exhibit distinct computational complexities. In scenarios with constrained computational resources, opting for a DS method that requires fewer sampling points can effectively balance computational efficiency against accuracy. Furthermore, a small truncation limit
t and a small search interval
can achieve high accuracy with finite complexity.