Improved Cubature Kalman Filtering on Matrix Lie Groups Based on Intrinsic Numerical Integration Error Calibration with Application to Attitude Estimation

Abstract: This paper investigates the numerical integration error calibration problem in Lie group sigma point filters to obtain more accurate estimation results. On the basis of the theoretical framework of the Bayes-Sard quadrature transformation, we first establish a Bayesian estimator on matrix Lie groups for system measurements in Euclidean spaces or on Lie groups. The estimator is then employed to develop a generalized Bayes-Sard cubature Kalman filter on matrix Lie groups that accounts for the additional uncertainty introduced by integration errors and comes in two variants. Building on the maximum likelihood principle, we also derive an adaptive version of the proposed filter for better algorithm flexibility and more precise filtering results. The proposed filters are applied to the quaternion attitude estimation problem. Monte Carlo numerical simulations support that the proposed filters achieve better estimation quality than other Lie group filters in the cited studies.


Introduction
Estimating the state of dynamical systems is a research hotspot and a frequently faced essential issue in many real-life applications, e.g., navigation, target tracking, robotics, and automatic control [1][2][3]. Due to the imperfect system model and the existence of noise, filtering is the most used estimating approach. For linear systems with Gaussian uncertainties, the Kalman filter (KF) [4], as the most desirable estimator established on minimal mean-square-error (MMSE) criterion, can provide an optimal recursive solution. However, while the system is nonlinear, obtaining the optimal closed-form solution is intractable. Researchers proposed various methods via analytical or numerical approximation to address nonlinear state estimation issues. A successful analytical approximation method is the extended Kalman filter (EKF) [5], which employs first-order Taylor series approximation to linearize nonlinear dynamics, but may perform poorly with highly nonlinear systems or large initial errors. Moreover, the main drawback of the EKF is that it needs to calculate the Jacobians, which is relatively tricky for complicated models.
Instead of dealing with nonlinearities by evaluating Jacobians, another class of nonlinear filtering based on numerical approximation is preferred. These filters are collectively called sigma point filters, including the unscented Kalman filter (UKF) [6], the Gauss-Hermite Kalman filter (GHKF) [7], and the cubature Kalman filter (CKF) [8]. Different deterministic sigma points are used in these algorithms to approximate moments of the probability density functions (PDFs). Compared with the EKF, the sigma point Kalman filter provides better robustness and estimation accuracy, but increases the computational burden. Another problem underlying the sigma point Kalman filter is that errors caused by the traditional numerical quadrature rules may inject bias into the estimated moments without proper calibration. The main contributions of this paper are as follows: (1) Establishing a Bayesian estimation method on matrix Lie groups based on the Bayes-Sard quadrature moment transform (BSQMT) that covers system measurements in Euclidean spaces or on Lie groups. (2) Deriving improved cubature Kalman filtering on matrix Lie groups with the BSQMT to calibrate numerical integration errors, and introducing a method based on the maximum likelihood principle to calculate adaptive expected model variances.
(3) Applying the proposed Lie group filtering to quaternion attitude estimation problems, and providing numerical simulations to validate the effectiveness of the proposed filtering.
The rest of this paper is organized as follows. Section 2 reviews the preliminary knowledge of Lie groups. Section 3 outlines a Bayesian estimation method on Lie groups that leverages the BSQMT for two different types of measurements. Section 4 gives detailed derivations of the proposed filtering algorithm on Lie groups. In Section 5, numerical simulations on quaternion attitude estimation demonstrate the effectiveness of the proposed filters. Lastly, in Section 6, we draw conclusions and present further work.

Mathematical Preliminary
This section reviews the theory and some basic properties of Lie groups used in this work. The following contents are chiefly based on [22,37] and partly on [15].

Introduction to Matrix Lie Groups
Matrix Lie group: Let G ⊂ R^(N×N) be a matrix Lie group, i.e., a subset of the invertible square matrices that contains the identity and is closed under the multiplication and inversion maps: I_N×N ∈ G; ∀χ1, χ2 ∈ G, χ1χ2 ∈ G; ∀χ ∈ G, χ⁻¹ ∈ G, where I_N×N is the identity matrix of R^(N×N). The matrix Lie group is also characterized by a smooth manifold structure, so both the multiplication and inversion maps are smooth operations. Each point χ ∈ G can be attached to a tangent space at that point, denoted by T_χG. The tangent space T_χG, spanned by the derivatives γ̇(0) of curves γ(t) on G with γ(0) = χ (see the left plot of Figure 1), is a subvector space of R^(N×N) with the same dimension as G. Among all tangent spaces of G, the tangent space taken at the group identity I_N×N, denoted T_IG, is called the Lie algebra.
Lie algebra: Let g denote the Lie algebra associated with a d-dimensional Lie group; there is T_IG := g ⊂ R^(d×d). Because standard vector-space tools are not directly available on Lie groups and Lie algebras, one identifies the Lie algebra g with the Euclidean space R^d for convenient calculation. There is a linear bijection between g and R^d, i.e., [·]∨_G : g → R^d. This linear bijection is invertible, and its inverse map is defined by [·]∧_G : R^d → g. For example, for a vector η ∈ R^d and a Lie algebra element η∧ ∈ g, we have [η]∧_G = η∧ ∈ g and [η∧]∨_G = η ∈ R^d. The middle plot of Figure 1 gives a more intuitive illustration.
Matrix Lie exponential and logarithm maps: The matrix Lie exponential map exp_G : R^d → G and its inverse map log_G : G → R^d (the matrix Lie logarithm map) give two bijections between a neighborhood of 0 ∈ R^d and a neighborhood of the group identity I_N×N ∈ G. We define the matrix Lie exponential and logarithm maps as exp_G(·) := exp_m([·]∧_G) and log_G(·) := [log_m(·)]∨_G, respectively, where exp_m and log_m are the matrix exponential and logarithm operations. These matrix operations provide the link between a Lie group and its unique corresponding Lie algebra: exp_m : g → G and log_m : G → g. For a vector η ∈ R^d, we have exp_G(η) = exp_m(η∧). All maps among the matrix Lie group G, the Lie algebra g, and the vector space R^d are illustrated in detail in the right plot of Figure 1.
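For the common case G = SO(3) (d = 3), these maps can be sketched in Python; the function names (`hat`, `vee`, `exp_G`, `log_G`) are illustrative choices, and `scipy.linalg.expm`/`logm` play the roles of exp_m and log_m:

```python
import numpy as np
from scipy.linalg import expm, logm

def hat(eta):
    """[.]^ : R^3 -> so(3), the skew-symmetric matrix of eta."""
    return np.array([[0.0, -eta[2], eta[1]],
                     [eta[2], 0.0, -eta[0]],
                     [-eta[1], eta[0], 0.0]])

def vee(H):
    """[.]v : so(3) -> R^3, the inverse of hat."""
    return np.array([H[2, 1], H[0, 2], H[1, 0]])

def exp_G(eta):
    """exp_G : R^3 -> SO(3), via the matrix exponential of hat(eta)."""
    return expm(hat(eta))

def log_G(chi):
    """log_G : SO(3) -> R^3, via the matrix logarithm."""
    return vee(np.real(logm(chi)))

eta = np.array([0.1, -0.2, 0.3])
R = exp_G(eta)
# R is a rotation matrix (R^T R = I, det R = 1) and log_G inverts exp_G
assert np.allclose(R.T @ R, np.eye(3), atol=1e-10)
assert np.allclose(log_G(R), eta, atol=1e-8)
```

The round trip exp_G/log_G is only bijective near the identity, matching the neighborhood statement above.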

Uncertainty on Matrix Lie Groups
Most system noise is assumed to be additive white Gaussian noise in Euclidean spaces. However, the approach of additive Gaussian noise is not applicable to Lie groups, as Lie groups lack the addition operation. To describe Gaussian uncertainties for general Lie groups, we adopted the method in [15,16,34], which first assumes the distribution of uncertainty in the Lie algebra to be Gaussian and then maps Gaussian distribution to the corresponding Lie group by exponential mapping.
Consider a d-dimensional random variable χ evolving on G, and let the variable satisfy the prior probability distribution χ ∼ N_φ(χ̄, P). Referring to the definition of a probability distribution for a random variable on a manifold in [15], we define the probability distribution of χ ∈ G as

χ = φ(χ̄, η), η ∼ N(0, P), (1)

where φ : G × R^d → G is a smooth function, also named a "retraction", selected according to the composition of the variable χ and satisfying φ(χ̄, 0) = χ̄; χ̄ ∈ G is noise-free and represents the mean of the random variable χ; the operator N(·, ·) denotes a Gaussian distribution in a Euclidean space; P ∈ R^(d×d) is the error covariance matrix associated with the uncertainty perturbation η; and η ∈ R^d is a random Gaussian vector in a Euclidean space. Note that the distribution N_φ(·, ·) is not Gaussian. The specific operation of the function φ(·, ·) is generally determined by the system's geometric structure; regardless of the geometry, left and right Lie group multiplications and the matrix Lie group exponential are essential parts of φ(·, ·). With right and left multiplication, we give two simplified definitions of φ(χ̄, η), denoted by φ_R(χ̄, η) = exp_G(η)χ̄ and φ_L(χ̄, η) = χ̄ exp_G(η). Because the components of variables evolving on G differ, the expression of φ(·, ·) differs across system dynamics and applications.
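A minimal sketch of drawing samples from N_φ(χ̄, P), again assuming G = SO(3); `sample_left` and `sample_right` are hypothetical helper names implementing the φ_L and φ_R retractions:

```python
import numpy as np
from scipy.linalg import expm

def hat(eta):
    """Skew-symmetric matrix of eta in R^3."""
    return np.array([[0.0, -eta[2], eta[1]],
                     [eta[2], 0.0, -eta[0]],
                     [-eta[1], eta[0], 0.0]])

def sample_left(chi_bar, P, rng):
    """Draw chi = chi_bar * exp_G(eta), eta ~ N(0, P)  (phi_L retraction)."""
    eta = rng.multivariate_normal(np.zeros(P.shape[0]), P)
    return chi_bar @ expm(hat(eta))

def sample_right(chi_bar, P, rng):
    """Draw chi = exp_G(eta) * chi_bar, eta ~ N(0, P)  (phi_R retraction)."""
    eta = rng.multivariate_normal(np.zeros(P.shape[0]), P)
    return expm(hat(eta)) @ chi_bar

rng = np.random.default_rng(0)
chi = sample_left(np.eye(3), 1e-2 * np.eye(3), rng)
# Samples stay on the manifold: chi^T chi = I
assert np.allclose(chi.T @ chi, np.eye(3), atol=1e-10)
```

The Gaussian lives entirely in the Euclidean error space; only the retraction touches the group, which is exactly the concentrated-Gaussian construction of (1).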

Bayesian Estimation Based on Bayes-Sard Quadrature Moment Transform
This section aims to find a method to calibrate numerical quadrature errors in the classical sigma point filtering on Lie groups. By taking the cubature Kalman filter into consideration and adopting the BSQMT instead of the cubature transform, we propose a Bayesian estimation method with a generalized measurement model.

Bayes-Sard Quadrature Moment Transform with Cubature Points
The BSQMT [11] is a universal moment transform proposed to correct the improper estimation and prediction caused by quadrature errors in sigma point filters.
The main technique of the BSQMT is Bayesian quadrature, which treats numerical integration as probabilistic inference and models the integration error with stochastic process models. Consider the nonlinear vector function z = g(x), where the input x ∈ R^d is a Gaussian random variable with distribution N(x̄, P); z ∈ R^l is the output; and g(·) : R^d → R^l denotes a known nonlinear transformation. A brief outline of the BSQMT algorithm with cubature points is given as follows.

Step 1. Assume that the mean x̄ and covariance P of the random input variable x are known. Calculate the third-degree cubature points that match the moments of the random variable,

X_i = x̄ + S ξ_i, i = 1, . . . , 2d,

where S is the lower triangular factor in the Cholesky decomposition of P, and ξ_i denotes the unit cubature points

ξ_i = √d e_i, i = 1, . . . , d; ξ_i = −√d e_(i−d), i = d + 1, . . . , 2d, (4)

where e_i ∈ R^d is the i-th column of the identity matrix I_d×d.
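Step 1 can be sketched as follows (illustrative Python; `unit_cubature_points` and `sigma_points` are assumed names):

```python
import numpy as np

def unit_cubature_points(d):
    """Third-degree unit cubature points: xi_i = sqrt(d) e_i for i = 1..d,
    and xi_(d+i) = -sqrt(d) e_i for i = 1..d  (2d points in total)."""
    I = np.eye(d)
    return np.sqrt(d) * np.hstack([I, -I])  # shape (d, 2d)

def sigma_points(x_bar, P):
    """X_i = x_bar + S xi_i with S the lower Cholesky factor of P."""
    d = x_bar.size
    S = np.linalg.cholesky(P)
    return x_bar[:, None] + S @ unit_cubature_points(d)

X = sigma_points(np.zeros(3), np.diag([1.0, 2.0, 3.0]))
# With uniform weights 1/2d, the points reproduce the input mean and covariance
assert np.allclose(X.mean(axis=1), np.zeros(3))
assert np.allclose(np.cov(X, bias=True), np.diag([1.0, 2.0, 3.0]))
```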
Step 2. Compute the quadrature weights. The BSQMT uses a hierarchical Gaussian process (GP) prior model with nonzero mean, and the Bayes-Sard weights depend only on the choice of sigma points and the basis function space π := span{ψ_1, . . . , ψ_N}. Here, we consider a special basis function space in which N = 2d holds.

Lemma 1. Let p(x) = N(x | 0, I) be the standard normal distribution, choose 2d cubature points by (4), and construct a 2d-dimensional basis function space. Then, a set of quadrature weights w = {w_n, n = 1, . . . , 2d} decides the weighted mean of the state variable, and the quadrature weights coincide with the cubature transform weights, w_m = 1/2d.

Proof of Lemma 1. See Appendix A.
With the third-degree unit cubature points, the quadrature weights are calculated with Equations (6) and (7). From Lemma 1, we know that w_m = 1/2d; thus, the calculation of w_m in (6) can be skipped.
Step 3. Obtain the transformed moments with

z̄ = Σ_{i=1}^{2d} w_m g(X_i), P_z = Z w Z^T − z̄ z̄^T + σ̂² I,

where Z = [g(X_1), . . . , g(X_2d)] and w is the Bayes-Sard weight matrix. σ̂² is the expected model variance (EMV), whose value partly depends on the selection of the radial basis function (RBF) kernel (the covariance of the GP model based on the unit cubature points) and its parameters. In practice, the EMV sometimes needs to be reparameterized, but we omit its calculation here. See [11,38] for the detailed computation of the EMV and the selection rules of the basis functions.
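A sketch of the full transform under the Lemma 1 simplification, where the Bayes-Sard weight matrix is replaced by the uniform cubature weights 1/2d and the EMV σ̂² simply inflates the output covariance; `bsq_transform` is an illustrative name, not the reference implementation of [11]:

```python
import numpy as np

def bsq_transform(g, x_bar, P, emv):
    """Approximate the moments of z = g(x), x ~ N(x_bar, P), with 2d cubature
    points; the expected model variance `emv` (sigma^2) inflates the output
    covariance to account for quadrature error. Uniform weights 1/2d are used
    here in place of the full Bayes-Sard weight matrix (Lemma 1 case)."""
    d = x_bar.size
    S = np.linalg.cholesky(P)
    xi = np.sqrt(d) * np.hstack([np.eye(d), -np.eye(d)])
    X = x_bar[:, None] + S @ xi                 # sigma points, shape (d, 2d)
    Z = np.apply_along_axis(g, 0, X)            # propagated points, (l, 2d)
    z_bar = Z.mean(axis=1)                      # w_m = 1/2d
    Zc = Z - z_bar[:, None]
    P_z = Zc @ Zc.T / (2 * d) + emv * np.eye(Z.shape[0])
    return z_bar, P_z

# a nonlinear example; the EMV term keeps P_z from being over-confident
z_bar, P_z = bsq_transform(lambda x: x**3, np.zeros(1), np.eye(1), emv=0.01)
```

For a linear g the transform is exact up to the σ̂² inflation, which is the behavior the calibration is designed to provide.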

Bayesian Estimation on Lie Groups Using BSQMT
In this subsection, we first give a generalized measurement model covering system measurements in two kinds of spaces: Euclidean spaces and Lie groups. Consider a random variable χ ∈ G with prior probability density function p(χ), where χ follows (1) with known χ̄ and P. Assume that additional information about the random variable can be established via an observation y, where the observation noise v is white Gaussian with distribution v ∼ N(0, R) in the vector space, its dimension depending on the type of y, and h_a(·) and h_a1(·) represent the observation functions. The key to addressing the Bayesian estimation problem is to find the parameters χ̄⁺ and P⁺ in the posterior distribution approximated by p(χ | y) ≈ N_φ(χ̄⁺, P⁺), with η⁺ ∼ N(0, P⁺). Before estimating χ̄⁺, the posterior distribution p(η | y) should be found and approximated using the BSQMT. Whether the measurement is in a vector space or on a Lie group, an achievable way is to calculate the posterior estimate η⁺ in the vector space and then map it to the Lie group. Hence, when an available measurement y exists on a Lie group, we need to transform the matrix Lie group measurement into the vector space by utilizing the inverse mapping of φ(·, ·), defined as φ⁻¹(·, ·) : G → R^d. In terms of left and right multiplications in both φ(·, ·) and φ⁻¹(·, ·), if y exists on a Lie group, there are actually four possible expansions to map it to the vector space. For instance, similar to [16], for a suitable measurement function there is Ad_χ̄⁻¹(η) = χ̄⁻¹ηχ̄, where Ad(·) is the adjoint operator of the Lie group.
In addition, when an available measurement y is in a vector space, it fulfills a corresponding vector-space measurement equation. From (14)-(16), we can see that the remodeled measurement function is related to χ̄, η, and v. We therefore give a novel measurement equation for the derivation of the generalized nonlinear filtering, where y is the new observation and h(·) is the new generalized measurement function. Using the BSQMT, the propagated measurement cubature points and the measurement mean are computed, where η_i are the cubature points, Y = [ŷ_1, . . . , ŷ_2d], i = 1, . . . , 2d, and w_m is the mean weight defined in the above subsection. Afterwards, by calculating the measurement error covariance P_yy and the cross-covariance P_ηy, we obtain the gain matrix K and the parameters of the approximated posterior distribution p(η | y) ≈ N(η⁺, P⁺). To convert the posterior distribution p(η | y) into a distribution on the Lie group, we reviewed the analyses and results in [15,16]. Lastly, the entire Bayesian estimation procedure is summarized in Algorithm 1.

Algorithm 1 Bayesian estimation on Lie groups using BSQMT.
Input: prior state χ̄ and prior covariance P; unit cubature points ξ_i, i = 1, 2, . . . , 2d; sequence of observations y; modified measurement noise covariance R̃; Bayes-Sard weights w_m, w, w_c; expected model variance σ̂²_h (calculated by the method in [11]).
Output: posterior state χ̄⁺ and associated posterior covariance P⁺.
1: Calculate the predicted measurement and its associated cross-covariance and innovation covariance;
2: Calculate the gain matrix K = P_ηy P_yy⁻¹;
3: Compute the estimated state and covariance.
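For illustration only, Algorithm 1 can be sketched for G = SO(3) with the left retraction φ_L and a Euclidean measurement; the uniform Lemma 1 weights stand in for the full Bayes-Sard weight matrix, and all names (`bsq_update`, `hat`, the arguments) are assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.linalg import expm

def hat(eta):
    """Skew-symmetric matrix of eta in R^3."""
    return np.array([[0.0, -eta[2], eta[1]],
                     [eta[2], 0.0, -eta[0]],
                     [-eta[1], eta[0], 0.0]])

def bsq_update(chi_bar, P, y, h, R_mod, emv):
    """One Lie-group Bayes-Sard update (Algorithm 1 sketch): estimate eta+ in
    the Euclidean error space, then retract onto the group with phi_L."""
    d = P.shape[0]
    S = np.linalg.cholesky(P)
    etas = S @ (np.sqrt(d) * np.hstack([np.eye(d), -np.eye(d)]))  # (d, 2d)
    # propagate error-space cubature points through the remodeled measurement
    Y = np.column_stack([h(chi_bar @ expm(hat(etas[:, i])))
                         for i in range(2 * d)])
    y_bar = Y.mean(axis=1)                       # w_m = 1/2d (Lemma 1)
    Yc = Y - y_bar[:, None]
    P_yy = Yc @ Yc.T / (2 * d) + emv * np.eye(Y.shape[0]) + R_mod
    P_ey = etas @ Yc.T / (2 * d)                 # cross-covariance P_eta_y
    K = P_ey @ np.linalg.inv(P_yy)               # gain matrix
    eta_plus = K @ (y - y_bar)                   # posterior error estimate
    chi_plus = chi_bar @ expm(hat(eta_plus))     # retraction phi_L
    P_plus = P - K @ P_yy @ K.T
    return chi_plus, P_plus
```

A usage example: with h(χ) = χ g for a known reference vector g (an accelerometer-style measurement), the update contracts the covariance in the observable directions while χ⁺ stays on SO(3).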

Proposed Filtering Algorithm
This section presents a Bayes-Sard cubature Kalman filter on matrix Lie groups, named BSCKF-LG, built on numerical integration error calibration with the BSQMT to improve filtering accuracy. The proposed BSCKF-LG has two versions: one uses left multiplication to define the retraction (left BSCKF-LG); the other uses right multiplication (right BSCKF-LG). A systematic method that leverages the maximum likelihood criterion is also proposed to estimate the EMVs. By reconstructing the covariances of the prior state error and the innovation from the innovation sequence in a sliding window, the proposed method estimates the EMVs and feeds them back into the improved filtering to adjust the Kalman gain matrix on the Lie group.

Improved Cubature Kalman Filtering on Lie Groups with BSQMT
Consider discrete system dynamics, where k represents the time step; χ_k is the system state evolving on a matrix Lie group; n_k ∼ N(0, Q_k) and v_k ∼ N(0, R_k) are white Gaussian noises with known covariances in the vector space; and ω_k is a known input. According to the above section, we first need to remodel the measurement y_k into a measurement in the Euclidean vector space. The purpose of designing filtering in the Bayesian framework of this model is to approximate the posterior distribution p(χ_k | y_1:k). The recursive solution to the proposed filtering problem on Lie groups comprises two steps: propagation and update.
Propagation: start by giving the prior probability density in the Bayesian framework, where χ̄⁺_(k−1) and P⁺_(k−1) are the posterior estimation results of the previous step. Then, approximate the propagated state distribution; our task is to find χ̄_k and P_k. In the above Bayesian estimation, the posterior state mean χ̄⁺_k and its associated covariance P⁺_k are estimated using the uncertainty representation (1). Here, for Model (21), we still use this uncertainty representation and the associated inverse map, where O(‖η_k‖²) stands for the error term. The proposed filter first propagates the state mean χ̄_k by employing the system's deterministic part (the noiseless state model), that is,

χ̄_k = f(χ̄⁺_(k−1), ω_k, 0). (25)

Subsequently, to compute the associated covariance P_k, we select a set of unit cubature points for the BSQMT method and generate sigma points η⁺_(k−1,i) in the vector space R^d through P⁺_(k−1). Then, mapping these points to the matrix Lie group and passing the mapped cubature points through the noise-free Model (25), we obtain the propagated sigma points χ^i_k. On the basis of this work and with the defined inverse map φ⁻¹(χ̄_k, χ^i_k), i = 1, . . . , 2d, we obtain the prior covariance P_k (see Algorithm 2 for details).
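A sketch of this propagation step, assuming G = SO(3) with the left retraction and the illustrative noise-free model f(χ, ω) = χ exp_G(ω Δt); the model, the helper names, and the way the EMV term enters P_k are all assumptions for illustration:

```python
import numpy as np
from scipy.linalg import expm, logm

def hat(eta):
    return np.array([[0.0, -eta[2], eta[1]],
                     [eta[2], 0.0, -eta[0]],
                     [-eta[1], eta[0], 0.0]])

def vee(H):
    return np.array([H[2, 1], H[0, 2], H[1, 0]])

def propagate(chi_prev, P_prev, omega, dt, Q_mod, emv):
    """Propagation step sketch: mean through the noise-free model, covariance
    from cubature points mapped through phi_L and back via phi_L^{-1}."""
    d = P_prev.shape[0]
    f = lambda chi: chi @ expm(hat(omega * dt))    # assumed noise-free model
    chi_bar = f(chi_prev)                          # propagated mean (25)
    S = np.linalg.cholesky(P_prev)
    etas = S @ (np.sqrt(d) * np.hstack([np.eye(d), -np.eye(d)]))
    # eta_i = phi_L^{-1}(chi_bar, f(phi_L(chi_prev, eta_i)))
    E = np.column_stack([
        vee(np.real(logm(np.linalg.inv(chi_bar)
                         @ f(chi_prev @ expm(hat(etas[:, i]))))))
        for i in range(2 * d)])
    P_bar = E @ E.T / (2 * d) + emv * np.eye(d) + Q_mod
    return chi_bar, P_bar
```

For this particular model the map reduces to a rotation of the error covariance plus the Q̃ and EMV terms, which matches the intuition that propagation transports the uncertainty along the trajectory.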
Update: in this procedure, we need to approximate the posterior probability distribution when each available measurement y_k arrives. Because the measurement model is the same, the Bayesian estimation method in Section 3 is applied to compute the posterior state and its associated covariance. The proposed filtering algorithm executes the propagation step and performs the update step only when a measurement arrives. Algorithm 2 summarizes the proposed filter in detail.

Algorithm 2 Bayes-Sard cubature Kalman filter on Lie groups (BSCKF-LG).
Input: initial conditions χ_0 and P_0|0; sequence of inputs ω_k, k = 1, 2, . . .; modified process and measurement noise covariances Q̃_k and R̃_k; sequence of observations y_k, k = 1, 2, . . .; RBF kernel parameters θ_f and θ_h (for computing the EMVs); unit cubature points ξ_i, i = 1, 2, . . . , 2d.
Output: posterior state χ̄⁺_k, posterior covariance P⁺_k.
1: Initialization: χ̄⁺_0 = χ_0 and P⁺_0 = P_0|0;
2: Compute the Bayes-Sard quadrature weights w_m, w, and w_c from (6) and (7), and calculate the EMV σ̂²_f with the method in [11];
3: for k = 1, 2, 3, . . . do
Propagation:
4: Calculate the state mean χ̄_k through the noiseless system model;
5: Generate the propagated sigma points χ^i_k on the basis of the unit cubature points selected according to (4);
6: Calculate the associated covariance from the inverse map φ⁻¹(χ̄_k, χ^i_k), i = 1, . . . , 2d, with Q̃_k derived from the dynamical equation of η_k;
Update: (if the measurement y_k arrives, start the update step; the BSQ weights and the EMV value are known)
7: Compute χ̄⁺_k, P⁺_k from Algorithm 1 with χ̄_k, P_k;
8: end for

Remark 1. The propagation of the state mean does not need to utilize the BSQMT (or another moment transform) because (25) was validated in [16] up to the second order. Instead of generating the sigma points directly from the distribution on the Lie group, all work of generating the corresponding sigma points is first done in the Euclidean space and then mapped to the Lie group.

Remark 2. The filters in [15-17] generate sigma points for the process noise and calculate the covariance matrix of the deviation due to noise. Unlike these filters, the process and measurement noise covariances Q̃_k and R̃_k in Algorithm 2 are determined by the specified dynamic equation of η_k and the remodeled measurement function of y_k to save computing resources.

Remark 3.
All Bayes-Sard weights are defined by the BSQMT rules and depend only on the selection of the sigma points and the basis functions. The proposed filter requires two different kernel parameter values to obtain the EMV values (see [11] for details) because there are two integrated functions. Misspecification of the kernel parameters has little effect on the BSQMT, so manual parameter tuning of the EMV is feasible in practice. In the following section, we give a method to obtain time-varying values of the EMV.

EMV Estimation
In this subsection, we develop an adaptive BSCKF-LG with time-varying EMVs. To derive the EMV at each instant k, we first assume that the prior covariance and the innovation covariance are unknown constants. The innovation vector is denoted as ỹ_k = y_k − ȳ_k, and we consider the maximum likelihood function of the remodeled measurements y_(k−n+1), y_(k−n+2), . . . , y_k, where | · | denotes the determinant, n represents the sliding window size, and C is a constant term. On the basis of (27), the maximum likelihood estimates of the error covariances of the prior state and the innovation are derived by

(P̂_yy,k|ML, P̂_k|ML) = arg max L(P_yy,k (or P_k) | ỹ_(k−n+1:k)). (28)
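Under the standard assumption that the innovations ỹ_j in the window are independent zero-mean Gaussians with covariances P_yy,j, a windowed log-likelihood of the form used in (27) can be sketched as follows (a reconstruction, with C absorbing terms independent of the covariances):

```latex
L\big(P_{yy,k}\,\big|\,\tilde{y}_{k-n+1:k}\big)
  = -\frac{1}{2}\sum_{j=k-n+1}^{k}
    \Big[\ln\big|P_{yy,j}\big|
      + \tilde{y}_j^{\mathsf T}\,P_{yy,j}^{-1}\,\tilde{y}_j\Big] + C
```

Maximizing this expression over the window is what produces the trace-form stationarity conditions referenced below.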
To solve (28), we take the partial derivatives of L(P_yy,k (or P_k) | ỹ_(k−n+1:k)) with respect to the estimated prior covariance P̂_k and the estimated innovation covariance P̂_yy,k, and set them equal to zero. The maximum likelihood equations are then obtained via Equation (29), where tr(·) denotes the trace operator; P̂^(a1,b1)_yy,k corresponds to the a1-th row, b1-th column of P̂_yy,k with a1, b1 ∈ [1, . . . , l]; similarly, P̂^(a2,b2)_k corresponds to the a2-th row, b2-th column of P̂_k with a2, b2 ∈ [1, . . . , d].
Equations (29) and (30) show that the problem of estimating P_yy,k actually reduces to computing the derivative of the innovation covariance with respect to P_yy,j, j = k − n + 1, . . . , k. As stated above, state errors are mapped from the Lie group into the Euclidean space, and the prior state error covariance in the Euclidean space can be expressed as P_k = E[η_k η_k^T]. In this article, η_k = φ⁻¹(χ̄_k, χ_k) and η⁺_k = φ⁻¹(χ̄⁺_k, χ_k). Using (21) and the formula for P_k in Algorithm 2, we first obtain (33). Expressing f(·) and (33) by a Taylor series expansion about η⁺_(k−1), the estimation error η_k becomes (34), where the omitted terms represent the second- and higher-order moments about η⁺_(k−1); ñ_k is derived from the specified dynamic equation of η_k, and its covariance is Q̃_k. To obtain exact equality, we account for the entire higher-order remainder in η⁺_(k−1) by introducing a diagonal matrix α_k = diag(α_(1,k), . . . , α_(d,k)) that scales the approximation error [39,40]. Similarly, by substituting (18) into ỹ_k = y_k − ȳ_k and expanding h(·) in a Taylor series about η_k, we obtain (36), where H_k = dh(χ̄_k, η_k)/dη_k, O(‖η_k‖²) collects the second- and higher-order moments of η_k, and ṽ_k is defined by the remodeled measurement formula with covariance R̃_k. To account for the entire higher order, we also introduce a diagonal matrix β_k = diag(β_(1,k), . . . , β_(l,k)) [39]. Hence, the prior state error covariance and the innovation covariance can be expressed accordingly. Lastly, for the linearized models (35) and (37), if the Kalman filtering is steady, P_k converges to a constant matrix, and P⁺_k, K_k, and P_yy,k also become constant matrices [41]. On the basis of the above analysis, we assume that the filtering process inside the sliding window is in a steady state. Consequently, the approximation of P_yy,k in the sliding window is nearly constant, and there exists P_yy,j = P_yy,k, j = k − n + 1, . . . , k, such that ∂P_yy,j/∂P_yy,k = I. Since P_yy,j is approximately constant in the window, we pre- and postmultiply both sides of (40) with P_yy,j, obtaining (41). Substituting P_yy = Y w Y^T − ȳ ȳ^T + σ̂²_h I + R̃ into (41) and letting Δ_h = σ̂²_h I, the EMV of the measurement function is described by (42). Next, similar steps are employed to estimate the EMV of the system function. Equations (29) and (31) show that (43) holds, where H_k = P^T_ηy,k P⁻¹_k or H_k = dy_k/dη_k. Again, since the filtering process is assumed to be stable in the sliding window, the approximations of P_k in the window are almost constant, P_j = P_k, j = k − n + 1, . . . , k. In sigma point filters, K_k = P_ηy,k P⁻¹_yy,k = P_k H^T_k P⁻¹_yy,k; thus, (44) can be transformed into (45). From Algorithm 2, we know (47); substituting (47) into (46) readily yields (48), and with P_j = P_k, j < k, in the window, the maximum likelihood estimate of the prior error covariance P_k can be acquired from (50). Combining P_k = Φ w Φ^T + σ̂²_f I + Q̃_k with (50), and letting Δ_f = σ̂²_f I, the estimated EMV is described by (51).

Remark 4. Since the EMVs must be non-negative (σ̂²_f ≥ 0, σ̂²_h ≥ 0) and Δ_f, Δ_h are diagonal matrices, it suffices to estimate only the diagonal elements of the EMV matrices. The diagonal elements must be non-negative; if some are negative, they should be set to zero. Equations (42) and (51) show that the validity of the EMV estimation depends on the accuracy of the process and measurement noise covariances. Filtering accuracy also relies on the noise statistics. In addition, even though (42) and (51) theoretically give adaptive EMVs, they bring an extra computational burden.
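As a numerical illustration of the adaptive idea behind (42) — an assumption-laden sketch, not the paper's exact estimator — one can compare the windowed sample innovation covariance with the model part of P_yy and attribute the non-negative diagonal residual to the EMV:

```python
import numpy as np

def estimate_emv(innovations, P_yy_model, R_mod):
    """Adaptive EMV sketch: compare the sample innovation covariance over a
    sliding window with the model part of P_yy (without the sigma^2 I term),
    and keep only the non-negative diagonal of the residual (Remark 4).
    innovations: array of the last n innovation vectors y_tilde_j, shape (n, l)."""
    Yt = np.asarray(innovations)
    C_hat = Yt.T @ Yt / Yt.shape[0]           # windowed estimate of E[y y^T]
    Delta_h = C_hat - P_yy_model - R_mod      # residual attributed to the EMV
    sigma2 = np.maximum(np.diag(Delta_h), 0.0)  # clip negatives to zero
    return np.diag(sigma2)

rng = np.random.default_rng(1)
inn = rng.normal(scale=1.5, size=(500, 2))    # innovations with variance 2.25
D = estimate_emv(inn, 1.0 * np.eye(2), 0.5 * np.eye(2))
# residual variance is roughly 2.25 - 1.0 - 0.5 = 0.75 per axis
```

The clipping step implements the non-negativity requirement of Remark 4; when the model and noise covariances already explain the innovations, the estimated EMV collapses to zero.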

Application to Attitude Estimation
In this section, we apply the proposed method to a simulated attitude estimation system and compare it with various filters presented in the cited literature. Consider a rotating rigid body without translation that carries a triaxial gyroscope offering the angular velocity ω_m ∈ R³ at a 100 Hz sampling frequency, a triaxial accelerometer providing the acceleration f_b ∈ R³ at a 10 Hz sampling frequency, and a triaxial magnetometer measuring the magnetic field m_b ∈ R³ at a 10 Hz sampling frequency. For the spacecraft attitude estimation problem, one can replace the accelerometers and magnetometers with sun sensors and magnetometers [23]. The attitude kinematics in terms of the unit quaternion is described by

q̇ = (1/2) q ⊗ (ω_m − b_ω − n_ω), ḃ_ω = n_bω.

Let the accelerometer and magnetometer outputs be the observations; the observation equations are expressed by [23,42]

f_b = R(q)^T f_n + v_f, m_b = R(q)^T m_n + v_m,

in which q is the unit quaternion representing the attitude of the body frame with respect to the navigation frame, with corresponding rotation matrix R(q) ∈ SO(3); b_ω ∈ R³ denotes the gyroscope bias; n_ω ∈ R³ and n_bω ∈ R³ are uncorrelated white Gaussian noises; f_n ∈ R³ is the gravity vector in the navigation frame; m_n ∈ R³ represents Earth's magnetic field; and v_f ∈ R³ and v_m ∈ R³ are sensor noises assumed to be white Gaussian. Let χ = (q, b_ω), which belongs to the matrix Lie group G = SO(3) × R³. With the definition of φ(χ, η) in Section 2, we define the left and right estimation errors, and the retractions corresponding to these estimation errors follow from φ_L and φ_R accordingly.
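The quaternion kinematics above can be integrated in closed form over one gyro sample; this sketch assumes the scalar-first Hamilton convention and treats the bias as constant over the step (function names are illustrative):

```python
import numpy as np

def quat_mul(p, q):
    """Hamilton product p ⊗ q, scalar-first convention."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw])

def integrate_attitude(q, b_omega, omega_m, dt):
    """One step of q_dot = 0.5 * q ⊗ (omega_m - b_omega) via the closed-form
    quaternion exponential; the gyro bias is held constant over the step."""
    w = omega_m - b_omega
    angle = np.linalg.norm(w) * dt
    if angle < 1e-12:
        dq = np.array([1.0, 0.0, 0.0, 0.0])
    else:
        axis = w / np.linalg.norm(w)
        dq = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])
    q_next = quat_mul(q, dq)
    return q_next / np.linalg.norm(q_next)   # renormalize to the unit sphere

q = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(100):                          # 1 s at the 100 Hz gyro rate
    q = integrate_attitude(q, np.zeros(3), np.array([0.0, 0.0, np.pi / 2]), 0.01)
# after 1 s at pi/2 rad/s about z: a 90-degree yaw rotation
```

The renormalization at each step keeps the numerical trajectory on the unit quaternion manifold, mirroring the role of the retraction in the filter.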

Numerical Simulations
To evaluate estimation quality, numerical simulations were carried out. We selected the north-east-down frame as the navigation frame. The attitude trajectory, lasting 200 s, was generated by ω = [−0.1 cos(0.15t), 0.1 sin(0.1t), −0.1 cos(0.05t)]^T rad/s. The true trajectory without noise and bias is shown in Figure 2. Some sensor parameters were provided by [42]. For filtering, we set the initial covariance matrix to P_0 = blkdiag(10⁻³ I_3×3, 10⁻⁷ I_3×3) and the initial gyroscope bias to b̂_ω0 = [0, 0, 0]^T rad/s. The kernel scale and kernel length constituting the kernel parameters were selected as 0.001 and 0.3, respectively. Three cases were considered in our simulations; in the first two cases, we utilized the trajectory defined above but with different sets of initial attitudes. Case 1 randomly selected the initial attitude from a uniform distribution between −10° and 10°, and Case 2 randomly chose the initial attitude from a uniform distribution over [−90°, 90°]. In Case 3, we used the angular velocity set above, except that for 50 s < t ≤ 60 s the angular velocity was set to ω = [−5, 0.5, −0.5]^T rad/s; for 100 s < t ≤ 108 s, to ω = [−0.1, 0.1, −3]^T rad/s; and for 150 s < t ≤ 152 s, to ω = [0.1, −5, 0.1]^T rad/s. The initial attitudes of Case 3 were the same as those in Case 1.
For each case, we conducted 100 independent Monte Carlo simulations and utilized the following root-mean-square error (RMSE) and averaged RMSE (ARMSE) of the Euler angles (roll, pitch, yaw) to evaluate estimation quality, where M is the total number of simulations; K is the number of simulation time steps; and x^m_k and x̂^m_k represent the true (simulated) and estimated Euler angles, respectively. With the simulated system and the initial conditions above, we compared nine different filters in the Monte Carlo simulations. We set the sliding window size in both the propagation and update steps of Right-BSCKF-LG-adaptive to n = 35 in Cases 1 and 2, and n = 15 in Case 3. The simulation results of Case 1 are illustrated in Figure 3 and Table 1: Figure 3 compares the RMSEs of the Euler angles, while Table 1 summarizes the corresponding ARMSEs. With small initial attitude errors, the proposed BSCKF-type filters with cubature points performed better than the other filters and needed less time to reach a steady state. Right-BSCKF-LG and its adaptive version clearly outperformed the other filters. The simulation results also verified that the performance of Right-IEKF and Right-CKF-LG was similar and slightly better than that of Left-IEKF and Left-CKF-LG under small initial estimation errors. The selection of the sliding window size could affect the estimation accuracy of Right-BSCKF-LG-adaptive, and the filtering accuracy of Right-BSCKF-LG-adaptive was not significantly better than that of Right-BSCKF-LG. The simulation results of Case 2 are illustrated in Figure 4 and Table 2. These show that the estimation quality of SO(3) CKF, Left-CKF-LG, and Left-IEKF was poor, especially that of SO(3) CKF. In contrast, Right-CKF-LG and Right-IEKF achieved better accuracy and robustness than Left-CKF-LG and Left-IEKF. Again, the BSCKF-type filters performed better: the estimation accuracy of Right-BSCKF-LG and Right-BSCKF-LG-adaptive exceeded that of the other filters.
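The per-step RMSE and time-averaged ARMSE metrics used in these comparisons can be computed as in this sketch (the array shapes and function name are assumptions):

```python
import numpy as np

def rmse_curves(truth, est):
    """RMSE over M Monte Carlo runs at each time step.
    truth, est: arrays of shape (M, K, 3) holding Euler angles in degrees.
    Returns the per-step RMSE (K, 3) and the time-averaged ARMSE (3,)."""
    err = truth - est
    rmse = np.sqrt(np.mean(err**2, axis=0))   # average over the M runs
    armse = rmse.mean(axis=0)                 # average over the K time steps
    return rmse, armse

# sanity check: a constant 0.5-degree error on every axis
truth = np.zeros((100, 200, 3))
est = truth + 0.5
rmse, armse = rmse_curves(truth, est)
# constant error -> RMSE = 0.5 deg at every step and ARMSE = 0.5 deg
```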
Moreover, the accuracy of Right-BSCKF-LG-adaptive improved compared with that of Right-BSCKF-LG; Right-BSCKF-LG-adaptive performed best in the yaw angle and achieved the fastest convergence rate. From the above analyses, the right estimation errors on the Lie group G = SO(3) × R³ are more suitable for attitude estimation with vector measurements. In Case 3, we assumed that some fast motion occurred in the system trajectory; the true trajectory of Case 3 is shown in Figure 5. Estimation results under small initial attitude errors are shown in Figure 6 and Table 3. Figure 6 shows that the performance of all filters was worse than in Case 1, partly because of the definition domains of the Euler angles. However, the right and left BSCKF-LGs proposed in this paper still outperformed all the other filters, especially the adaptive right BSCKF-LG. Table 3 summarizes the ARMSEs of the Euler angles; Left-BSCKF-LG was slightly more accurate than Right-BSCKF-LG in this case, and the accuracy of Right-BSCKF-LG-adaptive was again greatly improved compared with that of Right-BSCKF-LG. Regarding computation, the CKF-LGs and BSCKF-LGs spent more computation time, while the IEKFs spent minimal time. When the Bayes-Sard weights are calculated in advance according to the known system-state and measurement dimensions, the computation costs of the BSCKF-LGs are almost equal to those of the CKF-LGs. Right-BSCKF-LG-adaptive did not significantly increase the computational burden compared with Right-BSCKF-LG. Overall, the BSCKF-LGs showed more accurate and robust estimation results in this attitude estimation example.

Conclusions
This article proposed a generalized Bayesian estimation algorithm on matrix Lie groups that calibrates the numerical integration errors of classical sigma point filters on Lie groups using BSQMT theory. This Bayesian estimation is applicable both to measurements in Euclidean spaces and to those evolving on Lie groups. With the proposed Lie group Bayesian estimator, we then presented a Bayes-Sard cubature Kalman filter on Lie groups that comes in two variants. To obtain more accurate estimates, we further developed an approach to calculate adaptive EMVs. Numerical simulation results on quaternion attitude estimation indicated the superiority of the proposed filters over CKFs on Lie groups and invariant EKFs. Future work includes exploring novel methods to compute adaptive EMVs and applying the proposed filtering algorithm to visual-inertial navigation and fast drone navigation.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available because they form part of ongoing studies.

Conflicts of Interest:
The authors declare no conflict of interest.