1. Introduction
High-precision state estimation is a cornerstone of reliability in dynamic perception and autonomous systems. As intelligent devices are increasingly deployed in complex environments, sensor data are subject to multifaceted effects, including nonlinear propagation characteristics, multi-source heterogeneous noise, and dynamic target interactions [1,2]. Particularly in target tracking scenarios, radar may produce outliers or noise spikes due to environmental interference or hardware malfunctions [3], while adversarial electronic warfare tactics in military applications can introduce extreme noise through deliberate jamming signals [4]. Such scenarios often exhibit heavy-tailed noise distributions that deviate significantly from Gaussian assumptions [5]. Consequently, nonlinear filtering algorithms capable of mitigating these adverse factors constitute the core technology for target tracking, and their performance directly determines the accuracy and reliability of tracking outcomes.
In recent years, nonlinear filtering techniques have evolved significantly through derivatives of the Kalman filter framework, particularly for complex dynamic systems and high-dimensional nonlinear problems. The extended Kalman filter (EKF) [6,7] linearizes nonlinear models via first-order Taylor expansions within the Kalman framework. However, its reliance on local linearization introduces truncation errors that accumulate in strongly nonlinear scenarios, often leading to filter divergence. Beyond linearization errors, the vulnerability of the EKF to environmental disturbances further limits its robustness; recent work addresses this by integrating reliable measurement selection mechanisms into EKF frameworks [8]. To address these limitations, probabilistic approximation-based methods such as the unscented Kalman filter (UKF) [9,10,11] and the cubature Kalman filter (CKF) [12,13] have emerged. The UKF transmits statistical moment information via unscented-transform sampling of sigma points, thereby enhancing adaptability to strongly nonlinear systems. Nevertheless, high-dimensional sampling can produce non-positive-definite covariance matrices, resulting in numerical instability and degraded precision. In contrast, the CKF employs Gaussian-weighted integration rules and approximates nonlinear distributions using the spherical–radial cubature criterion. By ensuring computational efficiency and using symmetric sampling points, it enhances filtering robustness. For higher state dimensions or stronger nonlinearities, its estimation accuracy and robustness surpass those of the UKF, making it an essential theoretical tool for state estimation in complex dynamic systems. However, these nonlinear variants of the Kalman filter are predicated on the assumption of Gaussian noise. In heavy-tailed noise environments, where higher-order statistical moments dominate, such methods often fail to mitigate error accumulation, ultimately leading to estimation divergence.
State estimation under heavy-tailed noise remains a critical challenge in dynamic systems, with existing approaches primarily focusing on noise statistical modeling and distribution approximation. Heavy-tailed distribution modeling explicitly constructs non-Gaussian noise profiles to enhance robustness [2,5,14], yet its non-conjugacy precludes closed-form solutions, necessitating moment-matching approximations that introduce additional errors. Multi-model approaches, such as Gaussian mixture models [15] and conditioned Markov switching models [16], describe noise mutation characteristics through weighted multimodal distributions; however, their practical utility is constrained by model complexity and sensitivity to predefined weights. Monte Carlo methods, exemplified by particle filters [17,18], theoretically achieve arbitrary distribution approximation via stochastic sampling. Despite asymptotic optimality, their computational complexity in high-dimensional scenarios renders them impractical for real-time applications. Beyond conventional heavy-tailed modeling, recent advances have developed dual-path adaptive frameworks that simultaneously address non-Gaussian noise in the system dynamics and measurement models; a representative example is the Mahalanobis distance-based CKF with both adaptability and robustness for tightly coupled GNSS/INS integration [19]. Collectively, the trade-offs among dynamic noise adaptation, computational efficiency, and theoretical closed-form solutions remain significant challenges for these methods.

Recent advances in information-theoretic learning (ITL) have introduced novel paradigms for enhancing filter robustness by reconstructing optimization objectives through higher-order statistics and nonlinear similarity measures such as correntropy [20]. Unlike minimum mean square error (MMSE)-based methods that rely solely on second-order moments, the maximum correntropy criterion (MCC) embeds local similarity into the cost function through kernel mapping, exploiting second- and higher-order moments of the innovation to effectively suppress outliers while capturing higher-order data characteristics [21,22,23]. Beyond traditional filtering, the MCC framework has been extended to specialized domains requiring robust non-Gaussian noise suppression; a state-of-the-art application is spectral redshift navigation [24], which leverages MCC to dynamically estimate stellar spectral shifts under cosmic noise. Such advances underscore MCC's potential for enhancing robustness in complex nonlinear systems beyond filter design. Although MCC sacrifices closed-form solutions for recursive estimation [20], its dynamic adaptability to heavy-tailed non-Gaussian noise has demonstrated unique advantages in complex scenarios, driving its adoption in robust adaptive filtering designs [20,25,26,27,28,29]. Notably, a critical parameter in MCC-based filters is the kernel size used to adjust the gain matrix, which governs outlier suppression efficacy and estimation stability [20,25]. Current kernel size selection strategies predominantly rely on predefined static values (e.g., empirically fixed via offline experiments [20,26,27]) or online empirical adjustments (e.g., heuristic calculations based on the innovation covariance [25,28,29]). However, static methods exhibit strong scenario dependency, limiting generalizability across diverse noise characteristics, while online strategies fail to adapt to abrupt variations in interference intensity due to their fixed adaptation mechanisms. Although a kernel size optimization mechanism based on sliding-window heuristics [30] has been proposed, its noise adaptability remains constrained by predefined window parameters, exhibiting inherent latency in responding to abrupt noise disturbances. Fundamentally, existing approaches lack an adaptive association mechanism between the kernel size and dynamic noise statistics, hindering real-time adaptability in scenarios with time-varying noise distributions or sudden changes in interference intensity. Furthermore, current research predominantly targets single-target scenarios, lacking extended modeling for complex multi-target environments. These limitations subject existing methodologies to the dual challenges of degraded estimation precision and computational inefficiency in scenarios involving abrupt noise transitions or multi-target interactions.
To address these challenges, variational Bayesian (VB) methods have emerged as a promising framework for handling complex posterior distributions and joint estimation problems [31,32,33,34]. By probabilistically modeling latent variables (e.g., the kernel size) and leveraging conjugate prior properties, VB enables joint Bayesian inference of states and parameters. Building on this foundation, this paper proposes a dynamic joint estimation mechanism for the kernel size and state by integrating variational Bayesian inference with MCC theory, and incorporates cubature transformation techniques to minimize approximation errors in nonlinear systems. The resulting variational Bayesian-based maximum correntropy criterion cubature Kalman filter (VBMCC-CKF) offers enhanced optimization capabilities. The key innovations are reflected in the following three aspects.
- (1)
Integration architecture of VB and maximum correntropy criterion cubature Kalman filter (MCC-CKF):
For the first time, this method combines VB inference with MCC-CKF to construct a joint estimation framework of state and kernel size, breaking the dependence of traditional filtering algorithms on Gaussian noise assumptions and fixed kernel parameters. By synchronously optimizing the posterior distribution of the state and the kernel size parameters, it achieves fully adaptive robust estimation under nonlinear and non-Gaussian noise.
- (2)
Adaptive kernel size dynamic adjustment mechanism based on conjugate distribution:
A probabilistic modeling paradigm is introduced by treating the kernel size as an inverse gamma-distributed random variable [35]. Leveraging the conjugate relationship between the Gaussian and inverse gamma distributions, a closed-form analytical solution is derived through variational Bayesian online alternating optimization. This mechanism significantly enhances robustness against heavy-tailed noise and outliers while maintaining linear computational complexity for parameter updates, ensuring real-time applicability.
- (3)
Scalability in single/multi-target tracking scenarios:
VBMCC-CKF demonstrates exceptional generalizability across tracking scenarios. In single-target benchmarks, it delivers superior robustness against complex disturbances. When extended to multi-target tracking via a multi-Bernoulli filtering framework [36,37,38], the method dynamically balances multi-dimensional noise suppression and computational efficiency. This enables precise continuous tracking in cluttered environments with time-varying target interactions, providing a modular solution for advanced perception systems.
The structure of this paper is arranged as follows:
Section 2 systematically expounds the theoretical framework of CKF based on MCC, highlighting the limitations of conventional approaches and motivating this research.
Section 3 details the proposed adaptive kernel optimization methodology, focusing on the variational Bayesian iterative framework and its mathematical derivations.
Section 4 theoretically analyzes robustness mechanisms and evaluates computational complexity.
Section 5 validates the algorithm through comparative simulations: single-target benchmark evaluations followed by multi-target simulations under dynamic clutter.
Section 6 concludes with research summaries and future directions in nonlinear non-Gaussian systems.
2. Problem Formulation
2.1. Principle of CKF
Consider a nonlinear dynamic system [29]:

$$x_k = f(x_{k-1}) + w_{k-1}, \qquad z_k = h(x_k) + v_k, \tag{1}$$

where $x_k \in \mathbb{R}^n$ denotes the state vector at time $k$, $z_k \in \mathbb{R}^m$ denotes the measurement vector at time $k$, and $f(\cdot)$ and $h(\cdot)$ are the nonlinear state transition and measurement functions, respectively. The process noise $w_k$ and measurement noise $v_k$ are mutually uncorrelated white Gaussian noises with covariance matrices $Q_k$ and $R_k$, respectively. The CKF employs a set of deterministic cubature points to approximate the posterior mean and error covariance of the state under nonlinear transformations with additive Gaussian noise, thereby enabling robust state estimation through effective exploitation of noise-contaminated measurement information.
The CKF algorithm initializes with the prior state estimate $\hat{x}_{0|0}$ and covariance $P_{0|0}$, followed by recursive execution of two steps: prediction and update.

Prediction Step: At time $k-1$, given the posterior state estimate $\hat{x}_{k-1|k-1}$ and covariance $P_{k-1|k-1}$, generate $2n$ symmetric cubature points:

$$\chi_{i,k-1|k-1} = \hat{x}_{k-1|k-1} + S_{k-1|k-1}\,\xi_i, \qquad \xi_i = \sqrt{n}\,[\mathbf{1}]_i, \qquad i = 1, \ldots, 2n, \tag{2}$$

where $S_{k-1|k-1}$ is the Cholesky decomposition of $P_{k-1|k-1}$ and $[\mathbf{1}]_i$ denotes the $i$-th column of $[\,I_n,\ -I_n\,]$. Propagating these points through $f(\cdot)$ yields the transformed cubature points $\chi^{*}_{i,k|k-1} = f(\chi_{i,k-1|k-1})$, $i = 1, \ldots, 2n$. The predicted state mean $\hat{x}_{k|k-1}$ and covariance $P_{k|k-1}$ are computed as follows:

$$\hat{x}_{k|k-1} = \frac{1}{2n}\sum_{i=1}^{2n}\chi^{*}_{i,k|k-1}, \tag{3}$$

$$P_{k|k-1} = \frac{1}{2n}\sum_{i=1}^{2n}\chi^{*}_{i,k|k-1}\,\chi^{*T}_{i,k|k-1} - \hat{x}_{k|k-1}\hat{x}^{T}_{k|k-1} + Q_{k-1}. \tag{4}$$

Update Step: Propagate the predicted cubature points through $h(\cdot)$ to generate the measurement cubature points $Z_{i,k|k-1} = h(\chi_{i,k|k-1})$, $i = 1, \ldots, 2n$. The predicted measurement mean, measurement covariance, and cross-covariance are calculated as follows:

$$\hat{z}_{k|k-1} = \frac{1}{2n}\sum_{i=1}^{2n} Z_{i,k|k-1}, \tag{5}$$

$$P_{zz,k} = \frac{1}{2n}\sum_{i=1}^{2n} Z_{i,k|k-1}Z^{T}_{i,k|k-1} - \hat{z}_{k|k-1}\hat{z}^{T}_{k|k-1} + R_k, \tag{6}$$

$$P_{xz,k} = \frac{1}{2n}\sum_{i=1}^{2n} \chi_{i,k|k-1}Z^{T}_{i,k|k-1} - \hat{x}_{k|k-1}\hat{z}^{T}_{k|k-1}. \tag{7}$$

Calculate the Kalman gain as follows:

$$K_k = P_{xz,k}\,P_{zz,k}^{-1}. \tag{8}$$

Finally, the estimate and covariance of the posterior state are updated via the following:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\left(z_k - \hat{z}_{k|k-1}\right), \tag{9}$$

$$P_{k|k} = P_{k|k-1} - K_k P_{zz,k} K_k^{T}. \tag{10}$$
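To make the recursion concrete, the prediction and update steps above can be sketched in NumPy. This is an illustrative sketch rather than the paper's implementation; `f` and `h` are placeholder model functions assumed to act column-wise on the cubature points:

```python
import numpy as np

def cubature_points(x, P):
    """Generate 2n symmetric cubature points from mean x and covariance P, Eq. (2)."""
    n = x.size
    S = np.linalg.cholesky(P)                              # P = S S^T
    xi = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])   # sqrt(n) * [I, -I]
    return x[:, None] + S @ xi                             # shape (n, 2n)

def ckf_step(x, P, z, f, h, Q, R):
    """One CKF prediction + update cycle; f and h act column-wise on points."""
    n = x.size
    # --- prediction, Eqs. (2)-(4) ---
    X = f(cubature_points(x, P))                           # propagated points
    x_pred = X.mean(axis=1)
    P_pred = X @ X.T / (2 * n) - np.outer(x_pred, x_pred) + Q
    # --- update, Eqs. (5)-(10) ---
    Xp = cubature_points(x_pred, P_pred)
    Z = h(Xp)                                              # measurement points
    z_pred = Z.mean(axis=1)
    P_zz = Z @ Z.T / (2 * n) - np.outer(z_pred, z_pred) + R
    P_xz = Xp @ Z.T / (2 * n) - np.outer(x_pred, z_pred)
    K = P_xz @ np.linalg.inv(P_zz)                         # Kalman gain
    x_new = x_pred + K @ (z - z_pred)                      # state update
    P_new = P_pred - K @ P_zz @ K.T                        # covariance update
    return x_new, P_new
```

For a linear model (`f` the identity, `h` selecting the first state component) the recursion reduces to the ordinary Kalman filter, which is a convenient sanity check.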
2.2. Principle of MCC
For the random variables $X, Y \in \mathbb{R}$, the correntropy quantifies their generalized similarity and is defined as follows [29]:

$$V(X, Y) = E\left[\kappa(X, Y)\right] = \int \kappa(x, y)\,\mathrm{d}F_{XY}(x, y), \tag{11}$$

where $E[\cdot]$ denotes the expectation operator, and $F_{XY}(x, y)$ is the joint probability density function. Because the Gaussian kernel can approximate a nonlinear model to arbitrary accuracy, the kernel function $\kappa(x, y)$ satisfying Mercer's theorem is designed as a Gaussian kernel, with the kernel size $\sigma$ adjusted dynamically and adaptively:

$$\kappa(x, y) = G_{\sigma}(e) = \exp\!\left(-\frac{e^2}{2\sigma^2}\right), \qquad e = x - y. \tag{12}$$
By expanding the Gaussian kernel function in Equation (12) into a Taylor series and substituting it into (11), we have the following:

$$V(X, Y) = \sum_{n=0}^{\infty}\frac{(-1)^n}{2^n n!}\,\frac{E\left[(X - Y)^{2n}\right]}{\sigma^{2n}}. \tag{13}$$
It is evident that correntropy constitutes a weighted sum of the even-order moments of $X - Y$, thereby incorporating higher-order statistical information beyond the second moment. When $\sigma$ is sufficiently large, correntropy is dominated by the second moment. Furthermore, because $F_{XY}$ is generally unavailable and only the finite samples $\{(x_i, y_i)\}_{i=1}^{N}$ are accessible, correntropy is empirically estimated using the sample mean:

$$\hat{V}(X, Y) = \frac{1}{N}\sum_{i=1}^{N} G_{\sigma}\left(x_i - y_i\right). \tag{14}$$
Let $\hat{y}_i = g(\mathbf{w})$ denote the model output that depends on the parameter vector $\mathbf{w}$, and let $y_i$ be the desired response. The residual term $e_i = y_i - \hat{y}_i$ thus depends on $\mathbf{w}$. MCC is defined as the optimization framework that identifies the optimal parameter vector $\mathbf{w}^{*}$ from a feasible set $\Omega$ by maximizing the empirical correntropy in (14), formally expressed as follows:

$$\mathbf{w}^{*} = \arg\max_{\mathbf{w} \in \Omega}\; \frac{1}{N}\sum_{i=1}^{N} G_{\sigma}\left(e_i\right). \tag{15}$$
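The robustness that motivates (15) is easy to demonstrate: because the Gaussian kernel saturates for large errors, a single gross outlier barely moves the empirical correntropy, while it dominates the mean squared error. A minimal illustration (with arbitrary toy data, not the paper's experiments):

```python
import numpy as np

def gaussian_kernel(e, sigma):
    """G_sigma(e) = exp(-e^2 / (2 sigma^2)), Eq. (12)."""
    return np.exp(-np.square(e) / (2.0 * sigma**2))

def empirical_correntropy(x, y, sigma):
    """Sample-mean estimate of correntropy, Eq. (14)."""
    return gaussian_kernel(np.asarray(x) - np.asarray(y), sigma).mean()

clean = np.zeros(100)
noisy = np.zeros(100)
noisy[0] = 50.0                                   # one gross outlier
print(empirical_correntropy(clean, clean, 2.0))   # 1.0: perfect similarity
print(empirical_correntropy(clean, noisy, 2.0))   # ~0.99: outlier down-weighted
print(np.mean((clean - noisy) ** 2))              # MSE dominated by the outlier
```

The MSE jumps to 25 because of one corrupted sample, whereas the correntropy drops by only about one percent, which is exactly the outlier-insensitivity exploited by MCC-based filters.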
In practical engineering applications, systems are inevitably subjected to various interference signals characterized by significant uncertainties in temporal distribution and amplitude intensity. A representative example is the outlier measurements generated by low-reliability sensors. Under such complex operating conditions, the conventional assumption of Gaussian-distributed process and measurement noise often fails, as system noise typically exhibits non-Gaussian heavy-tailed statistical properties (e.g., Gaussian mixture or Student's t-distributions). These characteristics disrupt the optimality conditions of the CKF, leading to substantial degradation in state estimation accuracy and filter stability, thereby severely limiting the CKF's practical applicability in non-Gaussian noise environments.
To address the challenges posed by unknown disturbances and non-Gaussian heavy-tailed noise in dynamic systems, numerous robust filtering algorithms have been proposed in recent years. Among these, Kalman filter variants based on MCC have emerged as a research focus due to their exceptional disturbance rejection capabilities. In contrast to the traditional MMSE criterion, which relies solely on second-order moments of the innovation, the MCC framework enhances robustness against the abnormal disturbances by jointly exploiting both second- and higher-order moment information of the innovation. Consequently, integrating MCC into the CKF framework yields a robust hybrid algorithm—MCC-CKF. This algorithm establishes a cooperative optimization mechanism between the error covariance matrix and the kernel function, effectively improving state estimation accuracy and filtering stability.
2.3. Principle of MCC-CKF
In the framework of a nonlinear dynamic system model, the optimization objective function based on MCC is formulated as follows:

$$J(x_k) = G_{\sigma}\!\left(\left\|z_k - h(x_k)\right\|_{R_k}\right) + G_{\sigma}\!\left(\left\|x_k - \hat{x}_{k|k-1}\right\|_{P_{k|k-1}}\right), \tag{16}$$

where $\|a\|_{P}^{2} = a^{T}P^{-1}a$ denotes the squared Mahalanobis distance under the covariance matrix $P$, representing the standardized discrepancy under the covariance structure.
To address the limitation that the traditional Gaussian kernel is insensitive to feature correlations, the Gaussian kernel is reconstructed on the basis of the squared Mahalanobis distance:

$$G_{\sigma}\!\left(\|a\|_{P}\right) = \exp\!\left(-\frac{\|a\|_{P}^{2}}{2\sigma^{2}}\right). \tag{17}$$
The optimal estimate $\hat{x}_{k|k}$ is obtained by maximizing the objective $J(x_k)$ in (16). Consequently, the state estimate of the original CKF in (9) is rewritten with a modified gain $\tilde{K}_k$ in place of $K_k$ (Equations (19) and (20)), where the adjustment factor $\lambda_k$ is defined through the Mahalanobis-weighted Gaussian kernel of the residual (Equation (21)). Here, $\lambda_k$ serves as an additional adjustment factor to suppress the influence of anomalous disturbances and iteratively refine the estimation accuracy of $\hat{x}_{k|k}$.
The modified measurement covariance matrix is $\tilde{R}_k = R_k/\lambda_k$ (Equation (22)). Accordingly, the covariance update in (10) is modified with the adjusted gain and measurement covariance (Equation (23)).
The complete MCC-CKF (Algorithm 1) is summarized as follows.
Algorithm 1: MCC-CKF |
Input: $\hat{x}_{k-1|k-1}$, $P_{k-1|k-1}$, $z_k$, $Q_{k-1}$, $R_k$, kernel size $\sigma$ |
(1) Prediction: |
Generate the symmetric cubature points in (2), and calculate the predicted state in (3) and the corresponding covariance in (4). |
(2) Update: |
Predict the measurements in (5); |
Initialize $\hat{x}^{(0)}_{k|k} = \hat{x}_{k|k-1}$, set the iteration counter $t = 0$, and set $\varepsilon$ as the iteration threshold. |
Iterate: |
The adjustment factor $\lambda_k$ in (21), the measurement covariance $\tilde{R}_k$ in (22), the cross-covariance in (7), and the Kalman gain $\tilde{K}_k$ in (20) are calculated successively. |
Refine the state estimate in (19) and the covariance in (23). |
Until: $\|\hat{x}^{(t+1)}_{k|k} - \hat{x}^{(t)}_{k|k}\| \leq \varepsilon$; |
Return $\hat{x}^{(T)}_{k|k}$ and $P^{(T)}_{k|k}$, where $T$ is the number of iterations. |
Output: $\hat{x}_{k|k}$, $P_{k|k}$. |
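The inner fixed-point loop of Algorithm 1 can be sketched as follows. This is a simplified illustration under assumptions not taken from the paper: the measurement model is taken to be linear ($z = Hx + v$, so ordinary gain algebra replaces the cubature statistics), and the adjustment factor is taken to be the Gaussian kernel of the Mahalanobis-weighted residual, one common choice in MCC-based Kalman filters:

```python
import numpy as np

def mcc_update(x_pred, P_pred, z, H, R, sigma, tol=1e-6, max_iter=10):
    """Fixed-point MCC measurement update (illustrative, linear measurement).
    lambda re-weights R, the gain is recomputed, and the loop stops once the
    iterate stabilizes, mirroring Algorithm 1's inner loop."""
    x = x_pred.copy()
    for _ in range(max_iter):
        e = z - H @ x
        # adjustment factor: Gaussian kernel of the Mahalanobis residual
        lam = np.exp(-float(e @ np.linalg.solve(R, e)) / (2.0 * sigma**2))
        R_mod = R / max(lam, 1e-12)            # modified measurement covariance
        S = H @ P_pred @ H.T + R_mod
        K = P_pred @ H.T @ np.linalg.inv(S)
        x_new = x_pred + K @ (z - H @ x_pred)
        if np.linalg.norm(x_new - x) <= tol:   # convergence test
            x = x_new
            break
        x = x_new
    P = P_pred - K @ S @ K.T
    return x, P, lam
```

With an ordinary measurement the factor stays near 1 and the update is essentially the standard Kalman step; a gross outlier drives the factor toward 0, inflates the effective measurement covariance, and leaves the prediction almost untouched.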
2.4. Limitation and Improvement of MCC-CKF
As illustrated in Algorithm 1, MCC-CKF extends the CKF to non-Gaussian environments by adaptively weighting the innovation covariance through a kernel function and iteratively optimizing the Kalman gain and covariance matrices via the adjustment factor $\lambda_k$. This mechanism effectively suppresses the influence of large estimation errors, significantly enhancing robustness under non-Gaussian noise. However, this improvement incurs higher computational complexity due to kernel operations, iterative optimization, and parameter adjustment, in particular the selection of the kernel size $\sigma$. As a critical hyperparameter, $\sigma$ directly governs the computation of $\lambda_k$ and propagates its sensitivity to external disturbances to the gain matrix through $\tilde{R}_k$. In (21), a smaller $\sigma$ reduces $\lambda_k$, leading to diminished gain magnitudes in (20) and potential performance degradation or even divergence under Gaussian noise. Conversely, as $\sigma \to \infty$, $\lambda_k \to 1$, so that $\tilde{R}_k \to R_k$ and $\tilde{K}_k \to K_k$, and MCC-CKF degenerates into the standard CKF. Thus, the rational selection of the kernel size is pivotal to balancing estimation accuracy and robustness in MCC-CKF.
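This sensitivity is easy to see numerically. Using the scalar form of the adjustment factor, $\lambda = \exp(-e^2/(2\sigma^2))$, for one and the same fixed innovation the kernel size alone decides how strongly the measurement is trusted (the innovation value 3 below is an arbitrary illustration):

```python
import numpy as np

def adjustment_factor(e, sigma):
    """Scalar form of the MCC adjustment factor: lambda = exp(-e^2 / (2 sigma^2))."""
    return np.exp(-e**2 / (2.0 * sigma**2))

# For the same ordinary innovation e = 3, lambda ranges from near 0 to near 1
# purely as a function of the kernel size:
for sigma in (1.0, 3.0, 30.0):
    print(sigma, adjustment_factor(3.0, sigma))
# A too-small kernel (sigma = 1) suppresses even this normal measurement,
# while a very large kernel (sigma = 30) gives lambda close to 1,
# recovering the standard CKF behavior.
```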
Given the non-stationary nature of external disturbances in temporal distribution and amplitude, the optimal kernel size varies with the time-varying disturbance environment. It is therefore necessary to adopt a kernel size adaptation strategy driven by real-time data, so that the adjustment factor responds dynamically to disturbance variations. Such a mechanism endows the filter with real-time responsiveness to unknown time-varying disturbances, achieving a balanced optimization between disturbance rejection and estimation accuracy through parameter coupling. Existing approaches, such as the adaptive method in [25], dynamically adjust the kernel size based on the weighted norm of the measurement residual at each iteration. While this improves robustness against non-Gaussian noise compared to filters relying on default or heuristic kernel selections [20,26,27], coupling the adjustment factor $\lambda_k$ with a residual-dependent kernel size constrains $\lambda_k$ to a static value, preventing real-time responsiveness to varying disturbance intensities. Consequently, this adaptive strategy remains insufficient to counteract unknown disturbances of varying intensities. Kernel size adjustment must therefore transition from static or heuristic approaches to data-driven probabilistic inference, substantially enhancing filter robustness in non-stationary non-Gaussian environments.
To eliminate reliance on heuristic parameter tuning, this paper proposes the VBMCC-CKF framework, which recursively estimates the system state while dynamically adapting the kernel size via VB theory [33]. Leveraging VB theory, the joint posterior distribution of the state and kernel size is approximated using factorized distributions (Gaussian and inverse gamma), ensuring computational tractability and efficiency. Integrated with the state estimation framework of MCC-CKF, the proposed method achieves real-time adaptability to unknown time-varying disturbances and significantly improves robustness in non-stationary non-Gaussian environments.
4. Analysis of VBMCC-CKF
Section 3.3 rigorously derives the closed-form solution for the joint posterior distribution of the state and kernel size in VBMCC-CKF using variational Bayesian methods. The derived joint posterior retains an approximate factorization into Gaussian and inverse gamma distributions after iterative optimization. This section first supplements the mathematical validation with a focus on the closed-form kernel size update. Subsequently, anti-disturbance mechanism analyses are conducted to evaluate the robustness of the kernel size adaptation. Finally, the computational complexity of VBMCC-CKF is analyzed to assess its practical feasibility.
4.1. Mathematical Verification Using Kernel Size Closed-Form Update as an Example
Assuming the prior distribution of the kernel size is the inverse gamma distribution $p(\sigma^2_{k-1} \mid z_{1:k-1}) = \mathrm{IG}(\sigma^2_{k-1};\, \alpha_{k-1}, \beta_{k-1})$, the predictive distribution is obtained by propagating it through the dynamic evolution model in (32), yielding (49). The measurement likelihood function is defined in (50). According to Bayes' theorem, the posterior distribution of the kernel size is proportional to the product of the likelihood and the prior:

$$p(\sigma^2_k \mid z_{1:k}) \propto p(z_k \mid \sigma^2_k)\, p(\sigma^2_k \mid z_{1:k-1}). \tag{51}$$

Substituting the specific forms of Equations (49) and (50) into Equation (51) shows that the posterior distribution retains the inverse gamma form $\mathrm{IG}(\sigma^2_k;\, \alpha_k, \beta_k)$, thereby validating the correctness of the closed-form update mechanism.
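The conjugacy argument can be verified numerically on a grid. The sketch below uses the scalar textbook case (a residual $e \sim \mathcal{N}(0, \sigma^2)$ and a prior $\sigma^2 \sim \mathrm{IG}(\alpha, \beta)$, for which the posterior is $\mathrm{IG}(\alpha + \tfrac{1}{2}, \beta + \tfrac{e^2}{2})$); the parameter values are arbitrary illustrations, and the paper's update additionally involves the Mahalanobis-weighted residual:

```python
import numpy as np
from math import gamma

def ig_pdf(x, a, b):
    """Inverse-gamma density IG(x; a, b) = b^a / Gamma(a) * x^(-(a+1)) * exp(-b/x)."""
    return b**a / gamma(a) * x**(-(a + 1.0)) * np.exp(-b / x)

alpha, beta, e = 3.0, 2.0, 1.5
s2 = np.linspace(0.05, 20.0, 4000)                     # grid over sigma^2
dx = s2[1] - s2[0]

prior = ig_pdf(s2, alpha, beta)
likelihood = np.exp(-e**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
posterior_grid = prior * likelihood
posterior_grid /= posterior_grid.sum() * dx            # numerical normalization

# Closed-form posterior predicted by conjugacy:
posterior_closed = ig_pdf(s2, alpha + 0.5, beta + e**2 / 2.0)
print(np.max(np.abs(posterior_grid - posterior_closed)))   # close to 0
```

Up to grid discretization and tail truncation, the numerically normalized product of prior and likelihood coincides with the closed-form inverse gamma posterior.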
4.2. Anti-Disturbance Mechanism Analysis
A steady state implies that the system state and parameters no longer undergo significant changes over time, meaning the parameters converge either to a fixed value or to periodic variation. For the kernel size parameters, steady-state analysis requires determining the long-term behavior of the kernel scale parameters and whether the corresponding expected value converges. The kernel size parameters are dynamically updated through an adaptive adjustment mechanism governed by Equations (46) and (47) and the model in (32). Under the condition that the residual $e_k$ has bounded expectation (indicating system stability), if the covariance of $e_k$ tends to stabilize, then the expected squared residual satisfies $E[\|e_k\|^2] = \mathrm{tr}(\Sigma_e)$, where $\Sigma_e$ is the steady-state covariance matrix of $e_k$. Furthermore, based on the model in (32) and the update formulas (46) and (47) for $\alpha_k$ and $\beta_k$, these parameters exhibit a linear growth trend over time and asymptotically approach stable values, denoted as $\alpha_\infty$ and $\beta_\infty$. Consequently, the expectation of the kernel size converges to a steady-state value $E[\sigma^2_\infty] = \beta_\infty/(\alpha_\infty - 1)$, thereby ensuring the stability of the gain matrix. In scenarios where outliers cause abrupt increases in the residual, $\beta_k$ rapidly increases, triggering an adaptive adjustment of $\sigma^2_k$ that reduces the gain matrix and suppresses the outlier interference effectively. Subsequently, because the decay factor smooths historical information, over-suppression of normal data is avoided and $\sigma^2_k$ gradually returns to its steady state.
Steady-state analysis validates the long-term stability and reliability of VBMCC-CKF, ensuring robust estimation performance in diverse adversarial environments. Notably, compared to the MCC-CKF in Algorithm 1, the gain matrix is reformulated by integrating Equations (7), (20) and (22), as follows:

$$\tilde{K}_k = P_{xz,k}\left(P_{zz,k} - R_k + \lambda_k^{-1} R_k\right)^{-1}.$$

Similarly, by integrating Equations (42)–(44), the gain matrix of VBMCC-CKF is reformulated as follows:

$$K_k^{\mathrm{VB}} = P_{xz,k}\left(P_{zz,k} - R_k + \sigma_k^{2} R_k\right)^{-1}.$$

A comparative analysis of $\tilde{K}_k$ and $K_k^{\mathrm{VB}}$ reveals that VBMCC-CKF and MCC-CKF are structurally equivalent: both the reciprocal of the adjustment factor ($\lambda_k^{-1}$) and the kernel size ($\sigma_k^{2}$) serve to modulate $R_k$ in response to disturbances. Nevertheless, regardless of whether the kernel size is selected based on default values, empirical rules, or adaptive adjustment, MCC-CKF exhibits limited efficacy in mitigating unknown disturbances of varying intensities. Consequently, VBMCC-CKF's dynamic adaptation of $\sigma_k^{2}$ in response to disturbance intensity confers superior robustness against unknown time-varying disturbances.
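The structural point, that both filters act by scaling the measurement noise covariance with a single scalar, can be illustrated numerically. The covariance values below are arbitrary placeholders, and the decomposition into a noise-free part plus a scaled $R$ is an illustrative abstraction of the comparison above:

```python
import numpy as np

def modulated_gain(P_xz, P_zz0, R, c):
    """Kalman-type gain with the measurement noise scaled by a scalar c:
    c plays the role of 1/lambda_k in MCC-CKF and of sigma_k^2 in VBMCC-CKF.
    P_zz0 stands for the noise-free part of the measurement covariance
    (illustrative decomposition, values assumed)."""
    return P_xz @ np.linalg.inv(P_zz0 + c * R)

P_xz = np.array([[0.8], [0.3]])     # illustrative cross-covariance
P_zz0 = np.array([[1.2]])           # illustrative noise-free measurement covariance
R = np.array([[0.5]])

# Structural equivalence: whenever sigma^2 equals 1/lambda, both filters
# produce the same gain.
lam = 0.25
K_mcc = modulated_gain(P_xz, P_zz0, R, 1.0 / lam)
K_vb = modulated_gain(P_xz, P_zz0, R, 4.0)      # sigma^2 = 1/lambda = 4
print(np.allclose(K_mcc, K_vb))                 # True

# Stronger modulation (a stronger disturbance) always shrinks the gain:
print(np.linalg.norm(modulated_gain(P_xz, P_zz0, R, 8.0))
      < np.linalg.norm(modulated_gain(P_xz, P_zz0, R, 1.0)))  # True
```

The difference between the two filters therefore lies not in the gain structure but in how the scalar is produced: VBMCC-CKF infers it online from the data rather than from a fixed or heuristic rule.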
4.3. Computational Complexity
The proposed VBMCC-CKF integrates VB methods with CKF while adaptively adjusting the kernel size parameters. Compared to the conventional CKF, VBMCC-CKF incurs a modest increase in computational complexity. Therefore, a concise analysis of its computational complexity is conducted from both time and space complexity perspectives.
In terms of time complexity, the computational cost of the CKF is dominated by the calculation of the Kalman gain, with a time complexity of $O(n^3)$, where $n$ is the state dimension. VBMCC-CKF introduces variational Bayesian iteration on this basis, requiring multiple iterations in the update step to optimize the state estimate and the kernel size parameters. During each iteration, the Kalman gain, related covariance matrices, and kernel size parameters are recalculated. While the scalar operations for updating the kernel size parameters introduce minor overhead, their impact on the overall complexity is negligible. Consequently, the time complexity per iteration remains equivalent to that of the CKF update step. Given that the variational iterations typically converge within 2–3 cycles (denoted as $L$), the total time complexity of VBMCC-CKF is $O(Ln^3)$. Although the total computational load increases linearly with $L$, the complexity is still of the same order as $O(n^3)$, and the overall time complexity is controllable.
In terms of space complexity, the CKF mainly stores core data such as the cubature points, covariance matrices, and Kalman gain matrices, resulting in a space complexity of $O(n^2)$. For VBMCC-CKF, additional intermediate variables, such as the iteratively updated state estimates and kernel size parameters, must be stored during the variational iterations. However, since these variables share the same dimensionality as the core matrices of the CKF, no high-dimensional storage overhead is introduced. Thus, the space complexity remains $O(n^2)$.
This comprehensive analysis demonstrates that VBMCC-CKF achieves significant improvements in robustness and estimation accuracy with only a modest computational overhead, rendering it highly advantageous for addressing challenges in non-Gaussian noise environments. While its time complexity exceeds that of the conventional CKF, the algorithm attains convergence within merely 2–3 iterations through efficient computational steps, thereby satisfying real-time operational requirements and proving well-suited for practical applications demanding real-time performance.