Vehicle State Estimation Combining Physics-Informed Neural Network and Unscented Kalman Filtering on Manifolds

This paper proposes a novel vehicle state estimation (VSE) method that combines a physics-informed neural network (PINN) and an unscented Kalman filter on manifolds (UKF-M). This VSE aimed to achieve inertial measurement unit (IMU) calibration and provide comprehensive information on the vehicle’s dynamic state. The proposed method leverages a PINN to eliminate IMU drift by constraining the loss function with ordinary differential equations (ODEs). Then, the UKF-M is used to estimate the 3D attitude, velocity, and position of the vehicle more accurately using a six-degrees-of-freedom vehicle model. Experimental results demonstrate that the proposed PINN method can learn from multiple sensors and reduce the impact of sensor biases by constraining the ODEs without affecting the sensor characteristics. Compared to the UKF-M algorithm alone, our VSE can better estimate vehicle states. The proposed method has the potential to automatically reduce the impact of sensor drift during vehicle operation, making it more suitable for real-world applications.


Introduction
The rapid development of sensor technology and intelligent transportation systems [1] in recent years has led to the introduction of new vehicle chassis subsystems by original equipment manufacturers [2]. These subsystems improve specific vehicle performance, making driving more efficient and effective. Vehicle chassis subsystems not only make driving more convenient [3], but they also make the vehicle a complex autonomous system. The optimal coordination of chassis system (OCCS) [4] is applied to coordinate different complementary chassis subsystems. However, accurate vehicle state information is required for chassis coordinated control in order to correctly coordinate subsystems and identify driving conditions [5]. Furthermore, sensor noise can affect the accuracy, reliability, and continuity of vehicle state information. Extensive research [6,7] has been conducted to investigate various estimation algorithms based on cost-effective sensors and available measurements.
With the development of sensor technology, the use of vehicle control sensors such as the global navigation satellite system (GNSS) and the inertial measurement unit (IMU) has increased. In the OCCS, the IMU plays a crucial role in measuring the angular rate and acceleration of the vehicle body, enabling accurate calculations of the vehicle's attitude, velocity, and position [8]. However, due to installation errors [9] or coupling effects between the IMU and vehicle motion [10], IMU misalignment is inevitable. Indirect state estimation methods [9,10] are proposed to mitigate the drift of IMUs. To achieve advanced control, such as the OCCS, the coordinator requires multiple advanced sensor inputs and more estimation outputs, primarily to avoid conflicts downstream in subsystems [2]. This has motivated us to develop a method that leverages the relationships between advanced sensors to eliminate IMU drift.

•
We use PINN as a data-driven vehicle dynamics model to establish a VS, using the IMU calibration values as the output. By leveraging the loss calculation method of PINN, the vehicle dynamics can be integrated with the data-driven model, enabling the incorporation of information from multiple sensor sources during vehicle operation. The experimental results in a real vehicle platform indicated that the PINN-based model effectively integrates multiple sensor inputs to achieve improved estimation of the vehicle's state, surpassing both the physical and neural network-based models.
• Based on the IMU calibration values, we utilize the UKF-M algorithm to estimate the altitude, velocity, and position of the vehicle. By fusing data from multiple sensors, PINN UKFM provided accurate and comprehensive vehicle states that included sixdimensional vehicle dynamics, 3D attitude, speed, and position, which can be used in various vehicle dynamic control systems. For example, the six-dimensional vehicle dynamics can define the chassis motion, which can be used in OCCS. The 3D position can be applied to vehicle navigation in a GNSS-denied environment. The experimental results indicated that the PINN-based model can effectively incorporate multiple sensor inputs to mitigate IMU biases and enhance the accuracy of the existing state-ofthe-art integrated navigation algorithm, UKF-M.
The rest of work is organized as follows: Section 2 introduces the vehicle model, the PINN UKFM sensor states, and defines the PINN UKFM problem. Next, Section 3 introduces the proposed PINN UKFM algorithm. In Section 4, the proposed method is tested using realistic vehicle data. Rainy weather is used as an experimental condition as it reduces the road adhesion coefficient and increases the nonlinearity of vehicle dynamics. Finally, Section 5 presents the conclusions of this paper.

The Vehicle Model
The PINN UKFM algorithm used a six-degrees-of-freedom (6-DoF) kinematic vehicle model [32] to obtain accurate state estimates. As shown in Figure 1, the model could define the chassis movement [33], which included the navigation and vehicle body coordinates [5]. The starting point of the navigation coordinates was defined as the track start. The navigation coordinates consisted of three variables: E (east), N (north), and U (upward). altitude, velocity, and position of the vehicle. By fusing data from mul PINN UKFM provided accurate and comprehensive vehicle states that dimensional vehicle dynamics, 3D attitude, speed, and position, which in various vehicle dynamic control systems. For example, the six-dimen dynamics can define the chassis motion, which can be used in OCCS. Th can be applied to vehicle navigation in a GNSS-denied environment mental results indicated that the PINN-based model can effectively inco tiple sensor inputs to mitigate IMU biases and enhance the accuracy o state-of-the-art integrated navigation algorithm, UKF-M.
The rest of work is organized as follows: Section 2 introduces the vehic PINN UKFM sensor states, and defines the PINN UKFM problem. Next, Se duces the proposed PINN UKFM algorithm. In Section 4, the proposed met using realistic vehicle data. Rainy weather is used as an experimental cond duces the road adhesion coefficient and increases the nonlinearity of vehic Finally, Section 5 presents the conclusions of this paper.

The Vehicle Model
The PINN UKFM algorithm used a six-degrees-of-freedom (6-DoF) kine model [32] to obtain accurate state estimates. As shown in Figure 1, the mode the chassis movement [33], which included the navigation and vehicle body [5]. The starting point of the navigation coordinates was defined as the tra navigation coordinates consisted of three variables: E (east), N (north), and U The vehicle body's coordinate origin point was located at its center of right-hand rule was used. The x-, y-, and z-directions pointed forward, left, respectively. Acceleration and velocity could be broken down into longitud tion /velocity , lateral acceleration /velocity , and vertical acceleratio . The vehicle's direction angle was defined as the rolling angle (around x, pitching angle (around y, pitch rate ), and heading angle (around z, y The vehicle body's coordinate origin point was located at its center of mass and the right-hand rule was used. The x-, y-, and z-directions pointed forward, left, and upward, respectively. Acceleration and velocity could be broken down into longitudinal acceleration a x /velocity v x , lateral acceleration a y /velocity v y , and vertical acceleration a z /velocity v z . The vehicle's direction angle was defined as the rolling angle (around x, roll rate ω x ), pitching angle (around y, pitch rate ω y ), and heading angle (around z, yaw rate ω z ). Finally, other vehicle states could be extrapolated from these. For example, the sideslip angle β could be calculated through the following trigonometric operation:

The Sensor States
In the experiment, the vehicle had multiple sensors, including the GNSS, IMU, measurement steering wheels (MSW), wheel force transducers (WFT), 2-axis optical sensors (S-Motion), and a monocular camera. The experimental platform structure is presented in Section 4.1. Due to the real-time computation requirement, the signals from the monocular camera were not used in the PINN UKFM algorithm. The sensor inputs of PINN UKFM are introduced in Table 1. The symbols E and N represented easting and northing in the navigation coordinates. Using the universal transverse mercator (UTM) and the navigation starting point, easting and northing were transformed from the GNSS coordinates into the navigation coordinates. The UTM velocity was calculated from navigation coordinates as: where dt is a single GNSS interval, and t and t − 1 are GNSS coordinate timestamps.

Problem Definition
The PINN-based VS was defined as a time-series forecasting model in which sensor signals were taken as discrete variables. Assuming that "n" represents the current timestamp, the PINN module input states were defined as: n , . . . , X where n − L is the starting time step, X represents the sensor signals shown in Table 1, and ϑ is the number of sensor signals.
PINN was used as a universal function approximator to achieve IMU calibration. Building upon previous artificial intelligence-based techniques and the integration of the Kalman filter for estimation [23], the calibrated values were called "pseudo-states." By applying the conservation principles derived from the vehicle dynamics, the pseudo-states satisfied the conservation principles originating from the vehicle dynamics. Therefore, the PINN module output states were defined as: whereû θ represents the pseudo-states, and ι is the number of pseudo-states. Based on the LTI state space assumption, the pseudo-states were inputted to ODEs to compute the integrated states. The sensor measurements of these integrated states z were represented as: where [n + 1, n + 2, . . . , n + F] represent the output timestamps, κ is the number of the integrated states, and n + F is the ending time step.
For better integration with the vehicle control, the authors hypothesized that the vehicle physical model satisfied an LTI state-space model [23,24], which could be defined as: where x t+1 is the state variable at next time step, x t is the state variable at current time step, y t represents the output variable at current time step, B t andD t are disturbances or noise, and .
x denotes the rate of change of the state variable. It should be noted that the system dynamics and output equations did not change over [n + 1, n + 2, . . . , n + F].
By utilizing the LTI state-space model, the output of the PINN module could be used to compute the other vehicle states. Based on the ODE constraints, PINN ensured the estimated IMU states satisfied the physical relationships among sensor measurements.
Next, the pseudo-states were inputted into the UKF-M-based IMU-GNSS sensor-fusion model [34]. Using these pseudo-states, the UKF-M module could estimate both the velocity and position of the vehicle.

Structure of the Proposed VSE
In recent years, there has been growing interest in the development of multi-sensor systems for vehicle state estimation due to their potential to improve accuracy and robustness in complex environments. The proposed PINN UKFM is one such system, integrating multiple sensors to estimate the vehicle dynamics in real time. The architecture of our proposed PINN UKFM is illustrated in Figure 2 and consisted of three modules: the sensors module, the PINN module, and the UKF-M module.
The sensors module in PINN UKFM consisted of various sensors, such as GNSS, IMU, MSW, WFT, and S-Motion. These sensors provide rich information about the vehicle's motion, including its position, velocity, acceleration, and orientation. For adaptive reduction of data noise, the PINN module was employed to learn the complex, nonlinear relationship between the sensor signals and the filtered vehicle states. Specifically, the PINN module used the time-series sensor signals as input and outputted the corresponding pseudo-states. These pseudo-states were then fed into the UKF-M module, which used a state-space model to estimate the vehicle's position, velocity, and other parameters. By combining the

The PINN Module
In practical applications, different sensors may measure data with noise and drift due to their different characteristics and working environments. To obtain accurate vehicle state estimation, we used PINN to integrate the vehicle dynamics into a neural network architecture. The PINN module punished the loss function with ODEs and algebraic equations to make the sensor data consistent with the vehicle dynamics. The PINN module fused the data from multiple sensors to reduce measurement errors and make the data more consistent with the actual vehicle dynamics.
Therefore, the PINN module found the angle and velocity by integrating the acceleration and angular velocity. Temporal interaction is widely used in establishing datadriven vehicle models [19,25], as it can capture the complex temporal and hidden correlations for better state prediction. Therefore, a temporal model consisting of an encoder layer, a temporal interaction layer, and a decoder layer was proposed, as shown in Figure  3. The encoder layer was used to embed and encode sensor signal . Data were mapped into the high dimension through the multilayer perceptron (MLP). The encoder layer was defined as:

The PINN Module
In practical applications, different sensors may measure data with noise and drift due to their different characteristics and working environments. To obtain accurate vehicle state estimation, we used PINN to integrate the vehicle dynamics into a neural network architecture. The PINN module punished the loss function with ODEs and algebraic equations to make the sensor data consistent with the vehicle dynamics. The PINN module fused the data from multiple sensors to reduce measurement errors and make the data more consistent with the actual vehicle dynamics.
Therefore, the PINN module found the angle and velocity by integrating the acceleration and angular velocity. Temporal interaction is widely used in establishing data-driven vehicle models [19,25], as it can capture the complex temporal and hidden correlations for better state prediction. Therefore, a temporal model consisting of an encoder layer, a temporal interaction layer, and a decoder layer was proposed, as shown in Figure 3.

The PINN Module
In practical applications, different sensors may measure data with noise and drift due to their different characteristics and working environments. To obtain accurate vehicle state estimation, we used PINN to integrate the vehicle dynamics into a neural network architecture. The PINN module punished the loss function with ODEs and algebraic equations to make the sensor data consistent with the vehicle dynamics. The PINN module fused the data from multiple sensors to reduce measurement errors and make the data more consistent with the actual vehicle dynamics.
Therefore, the PINN module found the angle and velocity by integrating the acceleration and angular velocity. Temporal interaction is widely used in establishing datadriven vehicle models [19,25], as it can capture the complex temporal and hidden correlations for better state prediction. Therefore, a temporal model consisting of an encoder layer, a temporal interaction layer, and a decoder layer was proposed, as shown in Figure  3. The encoder layer was used to embed and encode sensor signal . Data were mapped into the high dimension through the multilayer perceptron (MLP). The encoder layer was defined as: where denotes the embedded feature vector. Additionally, the MLP contained the The encoder layer was used to embed and encode sensor signal X. Data were mapped into the high dimension through the multilayer perceptron (MLP). The encoder layer was defined as: where e t denotes the embedded feature vector. Additionally, the MLP contained the weighting matrix W emb , bias term b emb , and the rectified linear unit function (ReLU) activation function σ [35]. Since vehicle states have temporal interactions, the temporal interaction layer supposed the temporal interactions of different hidden states. The time dimension of e t was connected: where the embedded feature vectors were concatenated to form e 0 . Then, the PINN module learned the temporal interaction: where e l−1 and e l are the input and output of layer l, respectively. The decoder layer predicted the pseudo-states as: where u past represents the past measurement of the IMU, andû θ refers to the pseudo-states that represent the IMU calibration values. We defined the residual between the pseudostates and past measurement of the IMU as the drift of the IMU. This structure was similar to the residual block in a residual network [36]. (W dec e s + b dec ) =û θ − u past represented the latent (hidden) solution of the drift of IMU. The PINN determined the parameter θ of the NN [27] by minimizing the loss function: where L F (θ) represents the mean square error of residuals of the physics-based equations; L data (θ) represents the mean square error of residuals of the measurement data; ω 1 , ω 2 , . . . , ω 12 denote the weights associated with physical constraints; N d and N F represent the batch size; u(X) represents the future measurement of the IMU; t represents the output timestamps; F represents the forecast horizon; g(û θ (X; θ), X n , t) represents the ordinary differential equations [37]; and f (û θ (X; θ), X n , t) represents the algebraic equations [37]. The ODEs and algebraic equations were utilized as an additional regularization term [38]. By minimizing the loss function of the physics-based equations, we could incorporate the laws of physics into the NN [38][39][40]. When used for computing partial differential equations (PDE), the L F (θ) in PINN is typically used to penalize the degree to which the model violates physical laws. However, the PDE-based L F (θ) was not applicable in the continuous time modeling and prediction problems addressed in this paper. The physical laws of vehicle dynamics models are often subject to ODEs and algebraic equations. To address this, we drew inspiration from the approach of a physics-constrained neural network (PCNN) [39,40] and neural ordinary differential equations (NODEs) [41,42]. PCNN is a specific variant of PINN that introduces a regularization parameter to control the trade-off between data and knowledge-based regularization [43]. In NODEs, the hidden layers of a neural network are treated as the states of an ODE, and an ODE solver is used to compute the evolution of these states.
Specifically, we incorporated ODEs to represent the dynamic evolution of the system and used algebraic equations to represent the vehicle dynamics model.
The PINN module provided the signals of the pseudo-states, which represented the rate of change of the state variable .
x in the LTI state space. The inputs and outputs of the PINN module were denoted as: where d E , d N , and d U are the navigation coordinates minus the current vehicle coordinates [44].
The ordinary differential equations were represented as: ψ t are the measurements of ODE outputs at timestamp t; and d t is the time interval between n and t. By minimizing g 1 ∼ g 6 , the pseudo-states could incorporate information from the related vehicle states. The state z (7) t represents the position change of the vehicle, which was represented as: where [ E, N, U] are the outputs of variable y t in the LTI state-space vehicle model. By minimizing g 7 , the physical knowledge of y t was incorporated into the physics-informed loss function. The position-updated formula of the vehicle dynamics [45] may affect the learning effectiveness of the neural network. Therefore, we used the Euler integral to calculate the displacement of the vehicle. By utilizing the loss calculation method of PINN, the ODE vehicle dynamics could be combined with a data-driven model to consider multiple sensor sources.
Algebraic equations capture simple dependence relationships among vehicle states [37]. The algebraic equations representing the linear dynamics in the LTI state space could be written as . x = AX n + B n . By minimizing the loss ofû θ (X; θ) − .
x, we could incorporate the physical dynamics model into the physics-informed loss function. The referenced vehicle dynamics model was based on the two-degree-of-freedom (2-DOF) vehicle dynamics model [23,45]. To simplify the vehicle dynamics [23], we decoupled the longitudinal and latitudinal dynamics by neglecting the influence of the latitudinal and longitudinal forces.
The algebraic equation of f 8 (û θ (X; θ), X n ) was represented as: where a (1) x n is calculated based on the vehicle longitudinal dynamics; F e n represents the transmitted force of the vehicle; F b n represents the brake force of the vehicle; m is the gross vehicle mass; and M y l,n and M y r,n represent the y-axis torque of the two front wheels. We assumed F e n and F b n were transmitted to the front wheels and converted into M y l,n and M y r,n , respectively (the experimental car was a front-wheel drive vehicle). Therefore, the transmitted and brake forces were calculated using the least squares method. Additionally, a small slope angle was assumed. Therefore, the dissipative force was F s n = 0 and the frictional force was F f n = mgµ. Here, g represented the gravitational constant and µ was the road-wheel static friction coefficient. The air drag force F D n was calculated based on the longitudinal velocity and the drag coefficient k D . The parameters k 1 , k 2 , µ, and k D were calculated using the least squares method to realize online learning of the parameters.
In this paper, the tire force was directly measured. This allowed for the direct calculation of the longitudinal acceleration a (2) x n from the measured tire force. The algebraic equation of a (2) x n was represented as: where F x f l,n and F x f r,n are the front left and right wheel longitudinal forces, respectively; a (2) x n represents the longitudinal acceleration, ‫ג‬ 1 is the parameter 1 , F x f r,1 , a x 1 ] represents the known measurement data.
The algebraic equation of f 10 (û θ (X; θ), X n ) was calculated from the vehicle latitudinal dynamics, which was represented as: where C represents the tire cornering stiffness; l f and l r represent the distances from the center of the vehicle's mass to the front and rear axles, respectively; ω z denotes the yaw rate; F y f l,n , F y f r,n , F y rl,n , and F y rr,n denote the lateral tire forces at the front left/right and rear left/right wheels, respectively; δ s n represents the steering wheel angle; δ f n represents the front wheel steering angle; and i denotes the function of the variable steering gear ratio [46]. This was a linear tire model and we assumed the vehicle had equal cornering stiffness for all four wheels.
Similar to a (2) x n , we directly calculated the latitudinal acceleration a (2) y n from the measured tire force. The algebraic equation of a (2) y n was represented as: where F y f l,n and F y f r,n represent the front left and right wheel latitudinal forces, respectively; a (2) y n represents the latitudinal acceleration; and ‫ג‬ 2 is the parameter ‫ג(‬ 2 = F y f l,1 +F y f r,1 ma y 1 ).
The algebraic equation of f 12 (û θ (X; θ), X n ) was calculated from the kinematic vehicle model, which was represented as: where β n represents the vehicle slip angle.
The physics-based equations presented in this paper, (g 1−7 (û θ (X; θ), X n , z t ) and f 8−12 (û θ (X; θ), X n )), were applicable within the LTI state space. This implied that the pseudostates were assumed to remain constant over the time interval [n + 1, n + 2, . . ., n + F]. This assumption ensured the validity and applicability of the equations mentioned earlier, allowing for the incorporation of physics-based constraints into the neural network model.
The PINN module utilized the PyTorch deep learning framework. To consider the dynamics of different states, the Adam optimizer with a 0.001 learning rate was used to train the network. The model was trained using a Nvidia GTX 3080Ti GPU.

The UKF-M Module
The UKF-M is a novel UKF on manifolds, with versatility that allows direct application to numerous practical manifolds. For stochastic processes on Riemannian manifolds, the theory of Lie groups [47] is used to define the vehicle's attitude estimation. The IMU-GNSS sensor-fusion model [34] was a UKF-M-based filter, which is a standard 3D kinematic model based on inertial inputs. The UKF-M algorithm utilized in PINN UKFM was based on the methodology proposed in [34]. The modification of PINN UKFM involved using the outputs of the PINN module to replace the original IMU measurements.
The states of a moving vehicle in a discrete dynamic system are represented as: where χ n denotes the state of a vehicle belonging to a parallelizable manifold M; n is the n is the gyro bias; b a n = b a x, n , b a y, n , b a z, n is the accelerometer bias; and C n is a special orthogonal group that represents 3D rotation [47].
where 1 is the identity matrix. Based on the time derivative of C n C n T = 1, a skew- C n was obtained: The C T n . C n as a skew-symmetric matrix is often noted as [ω] × : The Lie algebra is a vector space and can be decomposed into: where ω = ω x , ω y , ω z is in the vector of angular velocities. For the ω constant, we obtained the ODE solution: where exp() is the exponential map on the SO(3) [42] and C 0 = I. The exp() map was derived from the time derivatives of χ n ∈ M. The vector fields were defined as: whereˆis the hat map [47], and V 1 , V 2 , and V 3 are the vector fields. Similar to the Gaussian belief of the Kalman filter, the UKF-M algorithm builds a probability distribution as χ n ∼ N(χ n , P n ) for the random variable χ n ∈ M as: whereχ n is viewed as the mean estimate at timestep n; ϕ is the propagation function; ξ n ∈ R d is a random Gaussian vector; N is the Gaussian distribution; and P n ∈ R d×d is the covariance matrix. ϕ(χ n , ξ n ) ∈ M is obtained by starting fromχ n and integrating the vector field ∑ d i=1 ξ i n V i (d is the dimension of the associated vector fields). Consider that the probability distribution of χ n is p(χ n ). The additional information about χ n is obtained from the measurement y n as: where h is the observation function, and v n ∼ N(0, R n ) denotes the white Gaussian noise. The UKF-M module used the gyro measurements and acceleration as inputs to update the random variable χ. The measurements of this standard 3D kinematic model were represented as: where µ n = ω x n , ω y n , ω z n represents the gyroscope, and a b n = a x n , a y n , a z n denotes the accelerometer. The UKF-M algorithm [34] updated the state and covariance by combining measurements y n and system states χ n . In PINN UKFM, the output of the PINN module was utilized to filter the noise and minimize the norm errors. Pseudo-statesû θ served as the calibrated IMU measurements for the filtering process. The states of the pseudo-states were represented as: where y n =û θ represents the PINN module output as the pseudo-states.
Using the propagation function [34], PINN UKFM built the vehicle model. First, the gyro measurements were inputted to calculate the rotation matrix: where exp is the exponential map on the SO(3), and d t is the integration step. In this vehicle model, w n , w (3:6) , w (6:9) , and w (9:12) were the submatrices of w (0:12) n . Next, the vehicle acceleration was inputted to calculate the calibrated vehicle acceleration: a = C n a b n − b a n + w (3:6) + g where g = [0, 0, −9.82] represents the gravitational constant and a represents the calibrated vehicle acceleration. Based on the calibrated vehicle acceleration, the model updated the vehicle speed as: where v n = (v E n , v N n , v U n ) denotes the velocity vector. Based on the vehicle velocity vector, the model updated the vehicle position vector as: where P n = (E n , N n , U n ) is the vehicle coordinates in the navigation coordinates. Finally, the model uploaded the gyro and accelerometer biases as: b a,n+1 = b a,n + w (6:9) × d t b g,n+1 = b g,n + w (9:12) × d t where b g,n represents the gyro bias, and b a,n represents the accelerometer bias. The probability distribution of χ and the propagation function remained unchanged; therefore, the posterior distribution p(χ n |y n ) was calculated as p(χ n |y n ). The pseudostates y n provided information about the sigma point ξ n . First, the sigma points ξ n were computed as: where λ is the scale parameter [48]; α is a free parameter chosen by the practitioner [49] (α must be small); d is the dimension of the associated vector fields; P n is the covariance matrix at timestep n; and col represents that the j column of the matrix is the weight associated with the j point. Second, these sigma points passed through the vehicle model to yield the set of transformed sigma point y j n : where h(ϕ(χ n , 0)) is the unnoisy state model, h is the observation function (as in Equations (31)(32)(33)(34)(35)), andχ n is the prior mean estimate of the current state. Third, the mean and covariance of the transformed sigma points [30] were computed as: y n = m y 0 n + ∑ 2d j=1 j y j n P y n y n = ∑ 2d j=0 j y j n − y n y j n − y n T + R n P ξ n y n = ∑ 2d j=1 j ξ j n (y j n −y n ) m = λ λ+d , j = 1/2 λ+d (38) where y n represents the mean of the transformed sigma points; P y n y n is the covariance of the transformed sigma points; P ξ n y n is the cross-covariance of the transformed sigma points; m and j are weights; and R n is the covariance matrix of white Gaussian noise. Next, PINN UKFM employed the Kalman updated equation to update the state and covariance as: K n = P ξ n y n P −1 y n y n χ + n = ϕ(χ n , K n (y n − y n )) P n + = P n − K n P y n y n K n T (39) where y n is the PINN module output (pseudo-states), K n is the gain matrix,χ + n is the posterior mean estimate state, P n is the prior estimate covariance matrix of the current state, and P n + is the posterior estimate covariance matrix.
The unscented transformation [50] was employed to approximate the posterior p(ξ n |y n ) for ξ n as: p(ξ n |y n ) ∼ N ξ n , P n + ξ n = K n (y n − y n ) where ξ n represents the noise-free mean.
The unscented approximation to the posterior p(ξ n |y n ) was thus the distribution of a Gaussian ξ n + ξ n + with ξ n + ∼ N 0, P n + [34]. Then, PINN UKFM approximated the posterior distribution p(χ n |y n ) as: where ξ n + represents the posterior noise. The posterior distribution p(χ n |y n ) boiled down to a Bayesian estimation problem [47] that incorporated the information from the PINN module.
In this paper, we focused on describing how PINN UKFM updated the state estimation χ + n and covariance matrix P n + when pseudo-states y n arrived. Additionally, the UKF-M algorithm could propagate the state without sensor measurement [34], which was implemented in our program.

Vehicle Platform
A vehicle platform was built to validate the proposed algorithm. The hardware configuration is shown in Figure 4. The sensors included GNSS, IMU, MSW, WFT, S-Motion, and a camera. D[WE-43A-USB and Dewesoft SIRIUS acquisition systems were used. The DEWE-43A-USB system obtained the S-Motion and MSW signals through highspeed CAN channels, while the Dewesoft SIRIUS system collected the WFT signals through high-speed CAN channels. A computer was used for all data acquisition.
where represents the noise-free mean. The unscented approximation to the posterior ( | ′) was thus the dis a Gaussian ̅̅̅ + + with +~( 0, P + ) [34]. Then, PINN UKFM approxima terior distribution ( | ′) as: where + represents the posterior noise. The posterior distribution ( down to a Bayesian estimation problem [47] that incorporated the informati PINN module. In this paper, we focused on describing how PINN UKFM updated the s tion ̂+ and covariance matrix P + when pseudo-states ′ arrived. Addit UKF-M algorithm could propagate the state without sensor measurement [34] implemented in our program. First, the Mako G-192B monocular camera [51], produced by the Allied V pany, captured image data of the environment. Then, the starNeto XW-GI [5 the IMU and GNSS data, using differential methods to measure both position a $GPFPD was selected as the starNeto communication protocol to acquire [ , , , , Θ, , , , ]. In the third step, the S-Motion system, produced by vided signals for [ , , , , , , , ]. In the fourth step, the MSW by used to measure the steering wheel angle, speed, and torque. Finally, the WF was used to measure the wheel forces and moments under dynamic condition and right front wheels. The WFT provided signals for [ , , , , , ] First, the Mako G-192B monocular camera [51], produced by the Allied Vision Company, captured image data of the environment. Then, the starNeto XW-GI [52] provided the IMU and GNSS data, using differential methods to measure both position and attitude. $GPFPD was selected as the starNeto communication protocol to acquire signals for [E, N, U, ϕ, Θ, ψ, v e , v n , v z ]. In the third step, the S-Motion system, produced by Kistler, provided signals for v x , a x , v y , a y , a z , ω x , ω y , ω z . In the fourth step, the MSW by Kistler was used to measure the steering wheel angle, speed, and torque. Finally, the WFT by Kistler was used to measure the wheel forces and moments under dynamic conditions for the left and right front wheels. The WFT provided signals for F x , M x , F y , M y , F z , M z in the front wheels and three axes of WFT, similar to the vehicle body coordinates. The hardware implementation in the test vehicle [52][53][54][55][56] is shown in Figure 5. rs 2023, 23, x FOR PEER REVIEW wheels and three axes of WFT, similar to the vehicle body coordi plementation in the test vehicle [52][53][54][55][56] is shown in Figure 5. The test was conducted on cement pavement at Jiangsu Un trajectory is shown in Figure 6. The weather during the test was r tire-road friction coefficient and increased the nonlinearity of th tionships. A total of 85% of the data were used for training and to while the remaining 15% were reserved for testing. The dataset as an overpass (high speed), a school (low speed), a slope, traffic ing. The road scene could be changed by altering the dataset spli last 15% of the dataset was used to evaluate the performance of P Start End Figure 5. The test vehicle.

Vehicle Platform
The test was conducted on cement pavement at Jiangsu University, and the vehicle trajectory is shown in Figure 6. The weather during the test was rainy, which reduced the tire-road friction coefficient and increased the nonlinearity of the vehicle's dynamic relationships. A total of 85% of the data were used for training and to verify the PINN module, while the remaining 15% were reserved for testing. The dataset included scenarios such as an overpass (high speed), a school (low speed), a slope, traffic lights, and vehicle turning. The road scene could be changed by altering the dataset splits. In the experiment, the last 15% of the dataset was used to evaluate the performance of PINN UKFM. wheels and three axes of WFT, similar to the vehicle body coordi plementation in the test vehicle [52][53][54][55][56] is shown in Figure 5. The test was conducted on cement pavement at Jiangsu Un trajectory is shown in Figure 6. The weather during the test was r tire-road friction coefficient and increased the nonlinearity of the tionships. A total of 85% of the data were used for training and to v while the remaining 15% were reserved for testing. The dataset as an overpass (high speed), a school (low speed), a slope, traffic ing. The road scene could be changed by altering the dataset split last 15% of the dataset was used to evaluate the performance of P Start End Figure 6. The trajectory of the experiment.

Parameter Settings and Training of All Comparative Methods
PINN: The structure of the PINN was chosen as {22, 32, 32×5, 64, 128, 128, 64, 32, 6}. For easy calculation of v z , we subtracted the gravity vector from the acceleration a z during data processing [34]. The layer {22, 32} was the encoder layer, and the layers {32, 32×5, 64, 128, 128, 64, 32} were the temporal interaction layers. The layer {32, 6} was the decoder layer. The prediction states of the PINN module were the IMU calibration values.
subPINN: To address the influence of loss backward during training, a single-output scheme of the PINN was implemented. The vehicle dynamics/kinematic models were used to build the data-driven model a x , a y , ω z . This subPINN had the same structure as the PINN and a single output. Specifically, for training a z , ω x , ω y , the structure {1, 2, 32, 32, 64} was used to reduce the computational complexity and improve accuracy. The inputs of a z , ω x , ω y were [a z ,v z ], [ω x ,ϕ], and [ω y ,Θ], respectively. Different from the UTM coordinates, the attitudes did not differ greatly, so they were directly inputted here.
MLP [24,25,57]: The multilayer perceptron (MLP) is a commonly used approach for building data-driven vehicle dynamics models. The structure of the MLP used in this study was inspired by [24] and was chosen as {22, 32, 32×5, 64, 128, 128, 64, 32, 6}. The ReLU activation function was applied between each layer. Different from the approach used in [24], we used the layers {32, 32×5} as the temporal interaction layers, which is a common approach in data-driven dynamics modeling [25].
LSTM [24,57,58]: Long Short-Term Memory (LSTM) is another classic method used to build data-driven dynamics models, which is employed for predicting the acceleration and angular velocity of vehicles [57,58]. LSTM is known for its ability to capture temporal dependencies in the data. In this study, LSTM was implemented with a general transformation structure as {22, 32, 128, 32, 6}, where {22, 32} and {32, 6} represented the MLP layers. The exponential linear unit (ELU) [35] function was applied after the first layer, and {32, 32, 128} represented the two LSTM layers.
Physical models [23,45]: The yaw rate formulas based on longitudinal/lateral dynamics and kinematic models were used in this paper. The longitudinal dynamics parameters were calculated using the least squares method [23]. The lateral dynamics parameters were computed based on the actual vehicle parameters (l f , l r , m) and the relationships between states (C = F y a y ). We assumed the yaw rate was an LTI state. Therefore, a kinematic model [45] was used to calculate the yaw rate.
The wheel force [F x , F y ] measured by WFT could be utilized to calculate the acceleration through algebraic equations. Additionally, we used the linearly constrained least squares method to combine these dynamics as: − mv x n ω z n + 2Cδ n m + K 9 F y f l,n + K 10 F y f r,n + K 11 (42) where K 1 ∼ K 6 , K 8 ∼ K 10 represent the parameters of the least squares method, and K 7 and K 11 represent the coefficients of the least squares method. The resulting states of the physical models are illustrated in Figure 7. In the experiment, we not only compared the prediction states with the sensor signals but also compared the integration states v x , v y , v z , ϕ, Θ, ψ with the integrated sensor signals, which could be expressed as: where Z 0 is initial state, t is the timestamp, and u i = a x , a y , a z , ω x , ω y , ω z are the states at timestamp i. As shown in Figure 7, the "Algebraic equation represented the calculation results of Equations (16) and (18). The "Linearly constrained least square represented the calculation results of Equation (42). By combining the forces and dynamics, the output of this equation provided a better fit to the IMU measurements. This method could fit multiple sensors to improve the accuracy of the dynamics, when the IMU measurements were accurate. However, due to the IMU drift, the integration error of this model was relatively large. The "Dynamics represented the longitudinal/lateral dynamics model based on Equations (15) and (17), which is commonly used for vehicle control. In this paper, "Linearly constrained least square and "Dynamics were established based on the IMU measurement data. Therefore, the results of these models were close to the IMU measurements but also had large integration errors.
To maintain visual clarity, we only used the vehicle dynamics ("Dynamics ) and kinematic models for comparison. MLP and LSTM also utilized the PyTorch deep learning framework. We employed the Adam optimizer with a 0.001 learning rate to train the network. The models were trained using the Nvidia GTX 3080Ti GPU. All models, including PINN, subPINN, MLP, and LSTM, were trained using the same training dataset. However, there was a difference in the loss calculation method. Different from the loss calculation method established by PINN, MLP and LSTM used the IMU measurements to build datadriven vehicle models [25,57,58]. These methods commonly assumed that the ground truth states of the vehicle were accessible. In this paper, we used highly accurate S-motion data as a surrogate for the actual vehicle state for MLP and LSTM training.

Validation of the PINN Module
In this subsection, we compare the PINN module with the MLP, LSTM, and vehicle dynamics/kinematic models and vehicle measurements. These data-driven models were trained using the dataset described in Section 4.1. The root mean square error (RMSE) [24,59] was used to evaluate the performance of the models, which was represented as: As shown in Figure 7, the "Algebraic equation" represented the calculation results of Equations (16) and (18). The "Linearly constrained least square" represented the calculation results of Equation (42). By combining the forces and dynamics, the output of this equation provided a better fit to the IMU measurements. This method could fit multiple sensors to improve the accuracy of the dynamics, when the IMU measurements were accurate. However, due to the IMU drift, the integration error of this model was relatively large. The "Dynamics" represented the longitudinal/lateral dynamics model based on Equations (15) and (17), which is commonly used for vehicle control. In this paper, "Linearly constrained least square" and "Dynamics" were established based on the IMU measurement data. Therefore, the results of these models were close to the IMU measurements but also had large integration errors.
To maintain visual clarity, we only used the vehicle dynamics ("Dynamics") and kinematic models for comparison. MLP and LSTM also utilized the PyTorch deep learning framework. We employed the Adam optimizer with a 0.001 learning rate to train the network. The models were trained using the Nvidia GTX 3080Ti GPU. All models, including PINN, subPINN, MLP, and LSTM, were trained using the same training dataset. However, there was a difference in the loss calculation method. Different from the loss calculation method established by PINN, MLP and LSTM used the IMU measurements to build datadriven vehicle models [25,57,58]. These methods commonly assumed that the ground truth states of the vehicle were accessible. In this paper, we used highly accurate S-motion data as a surrogate for the actual vehicle state for MLP and LSTM training.

Validation of the PINN Module
In this subsection, we compare the PINN module with the MLP, LSTM, and vehicle dynamics/kinematic models and vehicle measurements. These data-driven models were trained using the dataset described in Section 4.1. The root mean square error (RMSE) [24,59] was used to evaluate the performance of the models, which was represented as:  As shown in Figure 7, the "Algebraic equation represented the calculation results of Equations (16) and (18). The "Linearly constrained least square represented the calculation results of Equation (42). By combining the forces and dynamics, the output of this equation provided a better fit to the IMU measurements. This method could fit multiple sensors to improve the accuracy of the dynamics, when the IMU measurements were accurate. However, due to the IMU drift, the integration error of this model was relatively large. The "Dynamics represented the longitudinal/lateral dynamics model based on Equations (15) and (17), which is commonly used for vehicle control. In this paper, "Linearly constrained least square and "Dynamics were established based on the IMU measurement data. Therefore, the results of these models were close to the IMU measurements but also had large integration errors.
To maintain visual clarity, we only used the vehicle dynamics ("Dynamics ) and kinematic models for comparison. MLP and LSTM also utilized the PyTorch deep learning framework. We employed the Adam optimizer with a 0.001 learning rate to train the network. The models were trained using the Nvidia GTX 3080Ti GPU. All models, including PINN, subPINN, MLP, and LSTM, were trained using the same training dataset. However, there was a difference in the loss calculation method. Different from the loss calculation method established by PINN, MLP and LSTM used the IMU measurements to build datadriven vehicle models [25,57,58]. These methods commonly assumed that the ground truth states of the vehicle were accessible. In this paper, we used highly accurate S-motion data as a surrogate for the actual vehicle state for MLP and LSTM training.

Validation of the PINN Module
In this subsection, we compare the PINN module with the MLP, LSTM, and vehicle dynamics/kinematic models and vehicle measurements. These data-driven models were trained using the dataset described in Section 4.1. The root mean square error (RMSE) [24,59] was used to evaluate the performance of the models, which was represented as: where 1800 is the predicted steps, = [ , ] represents the i-th value of the predicted results, and ̂ represents the i-th value of the reference results (sensor measurements). Table 2 presents the RMSEs of the predicted states using different methods. The results demonstrated that except for longitudinal/lateral acceleration, the PINN-based approaches (PINN and subPINN) estimated the vehicle state more accurately. The inferior where S = 1800 is the predicted steps, As shown in Figure 7, the "Algebraic equation represented the calculation results of Equations (16) and (18). The "Linearly constrained least square represented the calculation results of Equation (42). By combining the forces and dynamics, the output of this equation provided a better fit to the IMU measurements. This method could fit multiple sensors to improve the accuracy of the dynamics, when the IMU measurements were accurate. However, due to the IMU drift, the integration error of this model was relatively large. The "Dynamics represented the longitudinal/lateral dynamics model based on Equations (15) and (17), which is commonly used for vehicle control. In this paper, "Linearly constrained least square and "Dynamics were established based on the IMU measurement data. Therefore, the results of these models were close to the IMU measurements but also had large integration errors.
To maintain visual clarity, we only used the vehicle dynamics ("Dynamics ) and kinematic models for comparison. MLP and LSTM also utilized the PyTorch deep learning framework. We employed the Adam optimizer with a 0.001 learning rate to train the network. The models were trained using the Nvidia GTX 3080Ti GPU. All models, including PINN, subPINN, MLP, and LSTM, were trained using the same training dataset. However, there was a difference in the loss calculation method. Different from the loss calculation method established by PINN, MLP and LSTM used the IMU measurements to build datadriven vehicle models [25,57,58]. These methods commonly assumed that the ground truth states of the vehicle were accessible. In this paper, we used highly accurate S-motion data as a surrogate for the actual vehicle state for MLP and LSTM training.

Validation of the PINN Module
In this subsection, we compare the PINN module with the MLP, LSTM, and vehicle dynamics/kinematic models and vehicle measurements. These data-driven models were trained using the dataset described in Section 4.1. The root mean square error (RMSE) [24,59] was used to evaluate the performance of the models, which was represented as: where 1800 is the predicted steps, = [ , ] represents the i-th value of the predicted results, and ̂ represents the i-th value of the reference results (sensor measurements). Table 2 presents the RMSEs of the predicted states using different methods. The re- s shown in Figure 7, the "Algebraic equation represented the calculation results of tions (16) and (18). The "Linearly constrained least square represented the calculaesults of Equation (42). By combining the forces and dynamics, the output of this ion provided a better fit to the IMU measurements. This method could fit multiple rs to improve the accuracy of the dynamics, when the IMU measurements were ace. However, due to the IMU drift, the integration error of this model was relatively . The "Dynamics represented the longitudinal/lateral dynamics model based on tions (15) and (17), which is commonly used for vehicle control. In this paper, "Lineonstrained least square and "Dynamics were established based on the IMU measent data. Therefore, the results of these models were close to the IMU measurements lso had large integration errors. o maintain visual clarity, we only used the vehicle dynamics ("Dynamics ) and kinic models for comparison. MLP and LSTM also utilized the PyTorch deep learning work. We employed the Adam optimizer with a 0.001 learning rate to train the net-. The models were trained using the Nvidia GTX 3080Ti GPU. All models, including , subPINN, MLP, and LSTM, were trained using the same training dataset. However, was a difference in the loss calculation method. Different from the loss calculation od established by PINN, MLP and LSTM used the IMU measurements to build datan vehicle models [25,57,58]. These methods commonly assumed that the ground states of the vehicle were accessible. In this paper, we used highly accurate S-motion as a surrogate for the actual vehicle state for MLP and LSTM training.
alidation of the PINN Module n this subsection, we compare the PINN module with the MLP, LSTM, and vehicle mics/kinematic models and vehicle measurements. These data-driven models were d using the dataset described in Section 4.1. The root mean square error (RMSE) ] was used to evaluate the performance of the models, which was represented as: e 1800 is the predicted steps, = [ , ] represents the i-th value of the preresults, and ̂ represents the i-th value of the reference results (sensor measures).
i represents the i-th value of the reference results (sensor measurements). Table 2 presents the RMSEs of the predicted states using different methods. The results demonstrated that except for longitudinal/lateral acceleration, the PINN-based approaches (PINN and subPINN) estimated the vehicle state more accurately. The inferior prediction performance on longitudinal/lateral acceleration was attributed to sensor drift. By integrating the acceleration to obtain velocity, it was shown that the PINN-based methods could effectively take the influence of other sensors and realize IMU calibration. As shown in Figure 8, the estimation results of [a x , v x , a y , v y were obtained. "Sensor a x integration" was the ODE states inferred from the original signals using Equation (43) and is shown in green. The PINN model incorporated both longitudinal/lateral vehicle dynamics and LTI-based vehicle displacement, which yielded higher-precision results than the subPINN model. Compared with the MLP and LSTM models, the PINN model had better prediction accuracy of the integrated states. This was due to the PINN model being able to incorporate more dynamics knowledge into the data-driven model. prediction performance on longitudinal/lateral acceleration was attributed to sensor drift. By integrating the acceleration to obtain velocity, it was shown that the PINN-based methods could effectively take the influence of other sensors and realize IMU calibration. As shown in Figure 8, the estimation results of [ , , , ] were obtained. "Sensor integration" was the ODE states inferred from the original signals using Equation (43) and is shown in green. The PINN model incorporated both longitudinal/lateral vehicle dynamics and LTI-based vehicle displacement, which yielded higher-precision results than the subPINN model. Compared with the MLP and LSTM models, the PINN model had better prediction accuracy of the integrated states. This was due to the PINN model being able to incorporate more dynamics knowledge into the data-driven model. The estimation results of states [ , , , , , Θ, , ] were obtained as shown in Figure 9. The experiment of [ , , ] aimed to reduce the noise in acceleration and angular velocity by combining the PINN module with the linear ODEs. As the noise impact on the S-Motion sensor increased over time, the subPINN model demonstrated superior accuracy in estimating the state variables while maintaining the sensor data characteristics and reducing noise. Both the PINN and subPINN models used the vehicle kinematic model in modeling . The subPINN model converged more readily in yaw rate prediction than the other models. Compared to the MLP and LSTM models, the subPINN model had better prediction accuracy in each state.
The subPINN model estimated a single vehicle state, which simplified the training The estimation results of states [a z , v z , ω x , ϕ, ω y , Θ, ω z , ψ were obtained as shown in Figure 9. The experiment of [a z , ω x , ω y ] aimed to reduce the noise in acceleration and angular velocity by combining the PINN module with the linear ODEs. As the noise impact on the S-Motion sensor increased over time, the subPINN model demonstrated superior accuracy in estimating the state variables while maintaining the sensor data characteristics and reducing noise. Both the PINN and subPINN models used the vehicle kinematic model in modeling ω x . The subPINN model converged more readily in yaw rate prediction than the other models. Compared to the MLP and LSTM models, the subPINN model had better prediction accuracy in each state.
subPINN model offered advantages in terms of training efficiency and high accuracy results. This gave the subPINN model a certain attractiveness for engineering implementations. However, the subPINN model could not establish connections between multiple states like the PINN model in order to achieve more comprehensive modeling. Therefore, the PINN model performed better in terms of longitudinal/lateral dynamics accuracy.

Validation of PINN UKFM
The experimental results presented above demonstrated the advantages of the PINN module, but some issues in the comparison still remained. The vehicle states were not connected through a vehicle-based model during the comparison, which failed to reflect the impact of the estimation states on the vehicle dynamics. Compared to other vehicle states, the cm-level GNSS positioning could better reflect the effectiveness of the presented estimation states.
In this subsection, the UKF-M module based on the output of the PINN module is demonstrated. This module reduced state noise through the Gaussian noise hypothesis. Figure 10 shows the trajectories of PINN UKFM and sensor inputs. The vectors [ , , ℎ] were entered at 20 Hz. The covariance matrices of the point were iterated quickly, which could not encapsulate the performance improvement of PINN UKFM. The subPINN model estimated a single vehicle state, which simplified the training process and enabled easier convergence during training. In contrast, the training process of the PINN model was more complex and it was difficult to achieve optimal results. The subPINN model offered advantages in terms of training efficiency and high accuracy results. This gave the subPINN model a certain attractiveness for engineering implementations. However, the subPINN model could not establish connections between multiple states like the PINN model in order to achieve more comprehensive modeling. Therefore, the PINN model performed better in terms of longitudinal/lateral dynamics accuracy.

Validation of PINN UFM
The experimental results presented above demonstrated the advantages of the PINN module, but some issues in the comparison still remained. The vehicle states were not connected through a vehicle-based model during the comparison, which failed to reflect the impact of the estimation states on the vehicle dynamics. Compared to other vehicle states, the cm-level GNSS positioning could better reflect the effectiveness of the presented estimation states.
In this subsection, the UKF-M module based on the output of the PINN module is demonstrated. This module reduced state noise through the Gaussian noise hypothesis. Figure 10 shows the trajectories of PINN UKFM and sensor inputs. The vectors [E, N, h] were entered at 20 Hz. The covariance matrices of the point were iterated quickly, which could not encapsulate the performance improvement of PINN UKFM. Therefore, the frequency of the trajectory coordinates was reduced to 1 Hz through downsampling. As illustrated in Figure 11, the results obtained using PINN UKFM were superior to those obtained using the conventional UKF-M algorithm. Without coordinate input, the vehicle coordinates were updated in the vehicle model using the UKF-M module. The PINN module could reduce the influence of vehicle sensor drift and provided a reliable state estimation result that was closer to the true state of the vehicle. Therefore, the updated vehicle position of PINN UKFM was also closer to the true vehicle state. Additionally, the UKF-M module could estimate the vehicle speed, which included [ e , n , u ]. However, the [ e , n , u ] were not inputted into the PINN module. Next, the modeling results were compared to the measurement states to verify PINN UKFM correctness. As shown in Figure 12, compared to the UKF-M algorithm, PINN UKFM could Therefore, the frequency of the trajectory coordinates was reduced to 1 Hz through downsampling. As illustrated in Figure 11, the results obtained using PINN UKFM were superior to those obtained using the conventional UKF-M algorithm. Without coordinate input, the vehicle coordinates were updated in the vehicle model using the UKF-M module. The PINN module could reduce the influence of vehicle sensor drift and provided a reliable state estimation result that was closer to the true state of the vehicle. Therefore, the updated vehicle position of PINN UKFM was also closer to the true vehicle state. Therefore, the frequency of the trajectory coordinates was reduced to 1 Hz through downsampling. As illustrated in Figure 11, the results obtained using PINN UKFM were superior to those obtained using the conventional UKF-M algorithm. Without coordinate input, the vehicle coordinates were updated in the vehicle model using the UKF-M module. The PINN module could reduce the influence of vehicle sensor drift and provided a reliable state estimation result that was closer to the true state of the vehicle. Therefore, the updated vehicle position of PINN UKFM was also closer to the true vehicle state. Additionally, the UKF-M module could estimate the vehicle speed, which included [ e , n , u ]. However, the [ e , n , u ] were not inputted into the PINN module. Next, the modeling results were compared to the measurement states to verify PINN UKFM correctness. As shown in Figure 12, compared to the UKF-M algorithm, PINN UKFM could better estimate . Additionally, the UKF-M module could estimate the vehicle speed, which included [v e , v n , v u ]. However, the [v e , v n , v u ] were not inputted into the PINN module. Next, the modeling results were compared to the measurement states to verify PINN UKFM correctness. As shown in Figure 12, compared to the UKF-M algorithm, PINN UKFM could better estimate v u . Considering that the cm-level GNSS positioning has a great impact on the covar matrix, the velocity at the 1 Hz GNSS position was also compared, as shown in Figu The PINN UKFM results were significantly better than the UKF-M results. Evidently, PINN UKFM exhibited remarkable performance in vehicle state es tion, although there was still potential for further improvement. A possible enhance lies in leveraging the signals from the CAN bus to replace the WFT and MSW sensor Considering that the cm-level GNSS positioning has a great impact on the covariance matrix, the velocity at the 1 Hz GNSS position was also compared, as shown in Figure 13. The PINN UKFM results were significantly better than the UKF-M results. Considering that the cm-level GNSS positioning has a great impact on the covar matrix, the velocity at the 1 Hz GNSS position was also compared, as shown in Figu The PINN UKFM results were significantly better than the UKF-M results. Evidently, PINN UKFM exhibited remarkable performance in vehicle state es tion, although there was still potential for further improvement. A possible enhance lies in leveraging the signals from the CAN bus to replace the WFT and MSW sensor CAN bus grants access to a wealth of vehicle sensor data, including valuable inform Evidently, PINN UKFM exhibited remarkable performance in vehicle state estimation, although there was still potential for further improvement. A possible enhancement lies in leveraging the signals from the CAN bus to replace the WFT and MSW sensors. The CAN bus grants access to a wealth of vehicle sensor data, including valuable information such as the steering angle, throttle opening, and brake pressure. By incorporating these signals as inputs to the model, a dynamic vehicle model could be constructed, enabling more accurate state estimation. Additionally, the S-Motion sensor can also be replaced by a cheaper IMU. Using more universal CAN bus signals and IMUs, the model could be implemented on more vehicle platforms without being limited by specific sensors.

Conclusions
In this study, a novel method for vehicle state estimation combining PINN and UKF-M has been proposed. The sensor module collects various sensor signals and feeds them into the PINN module. The PINN module automatically calibrates the IMU by utilizing the ODE relationships that exist between multiple sensors. The PINN-based model mitigates sensor errors and facilitates sensor fusion, leveraging the LTI hypothesis for loss calculation in order to reduce noise and bias in the original sensor data. It exhibits clear advantages in multi-sensor fusion compared to existing data-driven vehicle dynamics models. The resulting pseudo-states are then used as inputs to the UKF-M module to model the vehicle trajectory. The experimental results confirmed its effectiveness, with the PINN module successfully eliminating IMU drift and the PINN UKFM trajectory exhibiting less deviation from the cm-level GNSS positioning than the UKF-M trajectory. In the future, we aim to explore more cost-effective solutions to implement the PINN UKFM method. Additionally, the authors intend to use PINN UKFM in OCCS control.