1. Introduction
In an era of increasing reliance on autonomous systems, precise navigation capabilities have become paramount across diverse domains—from unmanned aerial vehicles operating in contested environments to autonomous underwater vehicles exploring ocean depths, and from spacecraft conducting deep-space missions to ground vehicles navigating urban canyons where satellite signals are unreliable or completely unavailable. Strapdown Inertial Navigation Systems (SINS) represent the cornerstone technology for such GPS-denied scenarios, offering unparalleled autonomy, stealth characteristics, and immunity to external interference [
1,
2,
3]. However, this independence comes at a significant cost: the relentless accumulation of systematic errors that progressively degrade positioning accuracy, potentially rendering long-duration missions ineffective or even catastrophic.
The fundamental challenge lies in the nature of inertial navigation itself—numerical integration processes that transform raw sensor measurements into position estimates inevitably amplify measurement errors over time [
4]. Traditional solutions have predominantly relied on sensor fusion methodologies, integrating SINS with Global Navigation Satellite Systems (GNSS) [
5] or Doppler Velocity Logs (DVL) [
6], employing sophisticated filtering techniques such as Kalman filters and their variants [
7,
8] to constrain error growth. While these approaches demonstrate mathematical rigor and practical validation, they suffer from a critical vulnerability: complete dependence on external reference signals. In GPS-denied environments—whether due to jamming, spoofing, natural obstacles, or operational requirements for stealth—these systems rapidly lose effectiveness, with positioning errors growing unbounded.
Deep Neural Networks (DNN) offer a paradigm shift in addressing this fundamental limitation. Unlike conventional analytical models that rely on predetermined mathematical relationships, DNN excel at discovering complex, nonlinear patterns within data through multi-layered transformations and adaptive learning mechanisms. Their proven success across domains ranging from blockchain security [
9] and medical diagnostics [
10] to biological modeling [
11] demonstrates their versatility in handling intricate, high-dimensional problems. For inertial navigation, this capability translates into the potential to model and predict the subtle, nonlinear error dynamics that conventional approaches struggle to capture.
Recent research has begun exploring the intersection of deep learning and inertial navigation, though with varying degrees of success and scope. Some studies focus on sensor-level error compensation, utilizing neural networks to calibrate inherent biases and noise characteristics of gyroscopes and accelerometers [
12,
13]. Others attempt system-level error prediction, learning relationships between navigation states and accumulated errors [
14,
15]. However, critical gaps remain in the existing literature:
Ref. [
15] employs an oversimplified feature representation, using only GPS position increments as input while attempting to predict attitude, velocity, and position errors—an approach that fundamentally ignores the strong correlation between inertial sensor measurements and navigation error propagation. Ref. [
14] addresses feature selection more appropriately by incorporating angular and velocity increments from inertial sensors, but limits its scope to position prediction alone, neglecting equally critical attitude and velocity errors, particularly the heading angle accuracy that is essential for trajectory maintenance. Furthermore, validation periods remain insufficient—most studies demonstrate effectiveness only over short durations (typically ≤ 200 s), leaving long-term performance characteristics unexplored.
This research addresses these limitations through a comprehensive approach to long-duration autonomous navigation. We develop a sophisticated error prediction framework that leverages the full spectrum of available navigation information: angular and velocity increments directly measured by inertial sensors, combined with real-time attitude and velocity states derived from navigation computations. This 12-dimensional feature vector captures both the immediate sensor characteristics and the accumulated system state, while a 9-dimensional response vector encompasses all critical error components—attitude, velocity, and position deviations. Our feedforward neural network architecture is specifically designed to model the deterministic integration relationships inherent in inertial navigation error propagation, moving beyond generic time-series approaches to capture the underlying physical mechanisms. The practical significance extends far beyond academic interest: successful long-term error prediction and compensation could enable extended autonomous operations in challenging environments, from military applications requiring electronic silence to civilian scenarios involving natural disasters where satellite infrastructure is compromised. Our validation demonstrates sustained accuracy improvements over extended durations, providing a viable path toward truly autonomous navigation systems that maintain precision without external dependencies.
2. Strapdown Inertial Navigation System Fundamentals
2.1. Spatial Coordinate System Definition
The SINS involves multiple spatial coordinate systems, which serve as the foundation for subsequent inertial navigation calculations. Here, only the coordinate systems mentioned in the text are briefly introduced.
The earth-centered inertial coordinate system (i-frame) has its origin, Oi, at the Earth’s center. The Xi and Yi axes lie within the equatorial plane, with the Xi axis pointing towards the vernal equinox. The i-frame does not change with the earth’s rotation. The earth-fixed coordinate system (e-frame) has its origin, Oe, at the earth’s center, with the Ze axis coincident with the earth’s rotation axis and the Xe axis pointing towards the prime meridian. The body coordinate system (b-frame) has its origin, Ob, at the center of the vehicle, with the Xb, Yb, and Zb axes pointing in the directions of positive right, positive forward, and positive upward of the vehicle, respectively. The navigation coordinate system (n-frame) has its origin at the center of the vehicle, with the Xn, Yn, and Zn axes pointing towards the true east, true north, and zenith directions, respectively. The computed navigation coordinate system (n′-frame) is used in navigation calculations. There is a misalignment angle between the n′-frame and the real n-frame, often represented by an equivalent rotation vector.
The use of the North-East-Down (NED) coordinate system aligns naturally with the earth’s surface coordinate system, simplifying the conversion between geographic location and attitude angles. The NED coordinate system is based on north, east, and downward directions, ensuring consistency between the inertial measurement unit (IMU) and the ground reference system in navigation, thus guaranteeing the accuracy and stability of attitude estimation. The right-forward-up (RFU) coordinate system is consistent with traditional yaw, pitch, and roll angles, which helps simplify the description of the vehicle’s attitude in three-dimensional space and optimize the system’s navigation performance.
2.2. SINS Numerical Update Algorithm
Equation (1) represents the attitude, velocity, and position update equations for the SINS.
where
represents the attitude matrix.
denotes the skew-symmetric matrix of the vector
, which is the projection of the angular velocity of the body frame relative to the navigation frame in the body frame.
represents the specific force measured by the accelerometer.
represents the projection of the angular velocity of the earth-fixed frame relative to the inertial frame in the navigation frame,
and
represent the projections of the angular velocity and linear velocity of the navigation frame relative to the earth-fixed frame in the navigation frame, respectively.
represents the projection of the gravitational acceleration vector in the navigation frame. The position vector
p = [
lat,
lon,
h]
T consists of latitude, longitude, and height.
represents the projection of the vehicle’s velocity in the navigation frame, where
= [
vE,
vN,
vU]
T.
is the transformation matrix as given in Equation (2).
where
lat represents latitude,
h represents height,
Re is the semi-major axis radius of the earth’s ellipsoid,
e is the eccentricity of the earth’s ellipsoid, and
f is the flattening factor of the earth’s ellipsoid,
RM and
RN are the meridian radius of curvature and the prime vertical radius of curvature, respectively.
2.3. Error Model Formulation
By the chain rule of the attitude matrix and the relationship between the equivalent rotation vector and the attitude matrix, Equation (3) holds.
As mentioned in
Section 2.1, there exists a misalignment angle between the
n′ frame and the
n frame, which is often expressed as an equivalent rotation vector
[
16], and this misalignment angle is the attitude error, which is mathematically expressed by bringing the second equation of Equation (3) into the first equation.
In other words, the attitude matrix derived from Equation (1) for attitude solving should be
, which is expanded into Equation (5).
where
and
are the gyro measurement error and the navigation system computational error, respectively.
and
are referred to as the observations of
and
. Equation (4) is differentiated and brought into the first equation in Equation (5) to obtain Equation (6).
The first equation in Equations (3) and (5) is brought into Equation (6), and Equation (7) is obtained by left-multiplying both sides of the equation and omitting the second-order minor.
The specific force equation of the second equation in Equation (1) is expressed in practical calculations as:
where
represents the projection of
n-frame’s velocity relative to
e-frame in the
n-frame,
represents the projection of the earth’s rotational angular velocity in the
n-frame, similar variable definitions are the same, and will not be described in detail in the following text. Similar to Equation (5), all quantities with the superscript ‘~’ represent observations, while all quantities with “
” represent error values. Subtracting the second equation of Equation (1) from Equation (8) and expanding to omit minor quantities of second order or higher, the differential equation for the velocity error of Equation (9) is obtained.
In the third equation of Equation (1), the position error differential equation is obtained by taking the partial derivatives for latitude, longitude and height, respectively:
Equations (7), (9), and (10) are the differential equations of attitude, velocity, and position errors of SINS, whose attitude, velocity and position correlation quantities are coupled with each other, and GPS and other exogenous signals are often fused with the IMU information to reduce the errors in navigation practice. It should be noted that in the above derivation, any quantity with the superscript “~” is an actual observed value, while quantities without a superscript are unknown actual values. Any quantity with the superscript “” represents an error value (i.e., the difference between the observed value and the actual value). Clearly, error values are also position values that need to be estimated. Apart from the error quantities already provided by inertial devices, if error values are needed in the calculation process, they are generally set based on experience according to the level of the inertial devices.
4. Deep Neural Network-Based Error Prediction Implementation
4.1. Network Architecture Design
Although SINS errors are typical time series, they have special physical mechanisms and mathematical models. Their error propagation follows specific navigation error equations rather than simple temporal dependencies. Inertial navigation errors are influenced by various factors, including random errors from gyroscopes and accelerometers, deterministic errors, and initial alignment errors. The influence mechanisms of these error sources are complex and interrelated. While classical time series prediction models like LSTM (Long Short-Term Memory) can capture long-term dependencies, their general gating mechanisms struggle to accurately model the specific error propagation laws and physical constraints in inertial navigation systems. Additionally, the cumulative nature of inertial navigation errors fundamentally differs from the memory mechanism of LSTM. Inertial navigation error accumulation follows integration relationships, where attitude errors affect velocity errors through integration, and velocity errors further affect position errors, forming an error chain. This deterministic error transmission mechanism differs from the long-term memory and vague relationship time series data that LSTM excels at handling. Numerous experiments have shown that feedforward neural networks combined with appropriate feature engineering can achieve better results in capturing these deterministic integration relationships.
When constructing a prediction model for strapdown inertial navigation errors, feature selection directly affects the model’s predictive performance. This study selects angular velocity increments, velocity increments collected by inertial sensors, and the attitude and velocity obtained from the SINS as neural network feature inputs. This choice is based on the following in-depth considerations.
Firstly, the error propagation in SINS systems follows specific physical laws. Fundamentally, system errors originate from measurement errors of the IMU and initial alignment errors. Angular velocity increments and velocity increments are raw data directly measured by the IMU, containing various information errors from gyroscopes and accelerometers. These errors in the raw measurement data are transmitted and accumulated through inertial navigation algorithms, ultimately leading to deviations in attitude, velocity, and position. Secondly, selecting angular velocity increments and velocity increments as features allows the neural network to directly learn the mapping relationship between measurement errors and navigation errors. These raw measurement data contain information such as random noise, bias, and scale factor errors, which are the main sources of navigation errors. The attitude and velocity obtained from the SINS are key state quantities of the system, closely related to navigation errors. Attitude errors affect the projection of specific forces through the coordinate transformation matrix, thereby influencing velocity calculations; velocity errors directly affect position calculations. Therefore, the current attitude and velocity states contain rich error information. In particular, the attitude calculation results reflect the cumulative effect of gyroscope errors, while the velocity calculation results reflect the combined influence of accelerometer errors and attitude errors. The trend and fluctuation characteristics of these state quantities can provide key clues for error prediction. Finally, from a practical application perspective, angular velocity increments, velocity increments, and calculated attitude and velocity are readily available data in SINS, without requiring additional sensors or complex preprocessing. This ensures that the error prediction model can run in real time during actual navigation, meeting the real-time requirements of navigation systems. In contrast, if other variables (such as temperature and vibration characteristics of accelerometers and gyroscopes) are chosen as features, additional sensors may be needed, increasing the system complexity and cost, and these data may be difficult to obtain in certain application scenarios.
In summary, angular velocity increments, velocity increments, inertial navigation calculated attitude, and velocity are selected as a 12-dimensional feature vector input, with attitude, velocity, and position errors as a 9-dimensional response output, to construct the error prediction neural network as shown in
Figure 1. The figure reserves interfaces for other features, allowing additional variables to be included as features based on actual conditions in the future, but in this paper, this item is an empty vector.
The neural network in
Figure 1 is configured with only two hidden layers based on the following considerations: A two-layer hidden structure ensures sufficient modeling capability while avoiding an overly complex network architecture. This is particularly suitable for problems like inertial navigation error compensation, which are nonlinear but supported by clear physical models. Theoretically, a network with two hidden layers can approximate most complex nonlinear functions, especially when dealing with attitude, velocity, and position errors in inertial navigation systems, capturing the interactions between different error sources. A two-layer network strikes a balance between training speed and generalization ability, avoiding the risks of overfitting and wasted computational resources due to excessive layers. The setting of the number of nodes in the two hidden layers of the network is based on the following considerations: From the input layer (12 features) to the first hidden layer (10 nodes) to the second hidden layer (5 nodes) and then to the output layer (9 response variables or a single response variable), it reflects the process of information compression and extraction, aligning with the “funnel principle” in neural network design. The first layer with 10 nodes is mainly responsible for extracting basic feature patterns from the original 12-dimensional features, while the second layer with 5 nodes further integrates these features to extract higher-level abstract representations, helping to capture the intrinsic patterns of inertial navigation system errors. The relatively small number of nodes (a total of 15 hidden nodes) helps avoid overly complex networks, especially when using efficient training algorithms such as Bayesian regularization (trainbr) and Levenberg–Marquardt (trainlm), maintaining good generalization ability. In inertial navigation error modeling, the relationship between input data (attitude error, velocity error, etc.) and output has a certain physical significance and correlation, allowing for good fitting results without requiring particularly complex network structures.
Table 1 lists the relevant parameters of the constructed DNN. The numbers of neurons in the input layer, hidden layers, and output layer, as well as the number of hidden layers, are clearly shown in
Figure 1, so they are not repeated in the table.
4.2. Network Training and Optimization
This study constructed a complete training dataset based on actual navigation data. The dataset includes angular increments and velocity increments collected by the IMU, as well as attitude and velocity obtained from the SINS as features; with attitude angle errors, velocity errors, and position errors as response variables. The data collection frequency is 100 Hz, and a total of 5 × 103 s of navigation data were collected, forming 5 × 105 sample points.
To ensure the model’s generalization ability, a standard dataset division strategy was adopted: divided into a training set and test set in a ratio of 7:3. The training set is used for network parameter optimization, and the test set for evaluating the model’s generalization ability. This division method ensures the sufficiency of training data while avoiding the risk of overfitting.
To improve network training efficiency and prediction accuracy, input features and output responses were standardized. The standardization process is based on the statistical characteristics of the training set, calculating the mean and standard deviation of each feature, and then converting all data into a standard normal distribution with a mean of 0 and a standard deviation of 1. This processing helps eliminate the impact of differences in feature dimensions, accelerates network convergence, and improves training stability. Similarly, the response variables were standardized. During the prediction phase, the network output needs to be de-standardized to restore the original dimensions.
Considering the characteristic differences of different types of navigation errors, this study adopted a “one error per network” parallel architecture, training independent neural networks for nine error variables (three attitude angle errors, three velocity errors, three position errors). Each network uses a two-hidden-layer structure, with five neurons in both the first and second hidden layers, and the activation function is ReLU (Rectified Linear Unit). Network training uses the Levenberg–Marquardt (LM) algorithm, which combines the advantages of gradient descent and Gauss–Newton methods, suitable for training small to medium-sized neural networks. The maximum number of training iterations is set to 1000, the target error is set to
, and the minimum gradient is set to
. These parameter settings ensure sufficient training while avoiding overtraining. To prevent overfitting, this study adopted an early stopping strategy, which terminates the training process early if the performance on the validation set does not improve for 10 consecutive iterations. This method effectively prevents the network from overfitting the training data and improves the model’s generalization ability on unseen data. The model inference time, hardware configuration, and software environment parameters are shown in
Table 2.
During training, the system automatically records the performance on the validation set after each training iteration and saves the network parameters with optimal performance, ensuring the final model has the best generalization ability. After training is completed, the network performance is comprehensively evaluated using the test set, with mean squared error as the evaluation metric. The specific performance is shown in
Figure 2 and
Table 3. In addition, we also included the most commonly used time series prediction method—LSTM for comparison. Unlike the DNN, the input and output of the LSTM are both the error states, namely the attitude, velocity, and position errors. In other words, both the features and responses are 9-dimensional vectors.
In
Figure 2, the true values (i.e., true errors) are the differences between the attitude, velocity, and position obtained from pure inertial navigation calculations and those obtained from SINS/GPS integrated navigation. In other words, the results of SINS/GPS integrated navigation are regarded as the true state, while the results of pure inertial navigation calculations are regarded as the observed state. The difference between the two is the true error, which is used to learn the nonlinear patterns of errors with a DNN, allowing for prediction and compensation when GPS signals are unavailable.
The test set performance comparison between DNN and LSTM models reveals that the DNN achieves exceptionally high prediction accuracy, with errors for each response variable typically on the order of 10
−5 or lower. These results demonstrate both the strong generalization capability of the proposed network architecture and the feasibility of employing DNN for inertial navigation system error prediction. In contrast, the LSTM model maintains reasonable prediction accuracy during the initial phase but progressively deviates from the true error values after 500 s. As shown in
Table 3, the MSE values for multiple response variables predicted by the LSTM are orders of magnitude higher than those achieved by the DNN. The inferior performance of LSTM in SINS error prediction can be attributed not only to the factors discussed in
Section 4.1, but also potentially to the hyperparameter settings, particularly the time step configuration, which was set to 200 in this study.
5. Experimental Results and Performance Analysis
Using the same set of IMU from chapter 3, the FOG (Fiber Optic Gyroscope) inertial measurement unit (specific information can be found in
Figure 3 and
Table 4 and
Table 5) using IMU-ISA-100C has a sampling frequency of 100 Hz; GPS uses a differential mode with an output frequency of 10 Hz. It first remained stationary for 690 s for self-alignment, then moved with a vehicle outdoors in Xi’an city for approximately 3 h; the slow speed is about 3.5 m/s and the driving range is within 10 km; the equipment layout in the vehicle is shown in
Figure 4. The outputs of the gyroscope and accelerometer are shown in
Figure 5. Firstly, the SINS/GPS integrated navigation system was used for calculation, and the results were taken as the true values. The attitude, velocity, and position obtained from the SINS, as well as the attitude, velocity, and position predicted and corrected by the DNN trained in
Section 4.1, were compared with the true values. Meanwhile, to highlight the predictive advantages of DNN in multi-dimensional features and multi-dimensional responses, we also included the attitude, velocity, and position predicted and corrected by LSTM in the comparison. In other words, the attitude, velocity, and position of the integrated navigation system are the true values, while the attitude, velocity, and position calculated by the SINS, the attitude, velocity, and position predicted and corrected by DNN, and the attitude, velocity, and position predicted and corrected by LSTM are the comparison values. The curves were plotted using Matlab to simulate the actual motion state of the carrier. The comparisons of attitude, velocity, and position are shown in
Figure 6,
Figure 7,
Figure 8 and
Figure 9. To more clearly distinguish the accuracy of different methods, the error curves of attitude, velocity, and position are also plotted, as shown in
Figure 10.
Table 6 lists the MSE of navigation errors for different methods.
Figure 6 and
Figure 7 show the comparison of attitude and velocity for the three methods, with a zoomed-in section of approximately 20 s of data. In
Figure 6, due to the high accuracy of the IMU used in the experiment and the relatively short sampling time, the differences in attitude among the four methods are not very pronounced, although the attitudes predicted and corrected by DNN and LSTM are obviously closer to the true values than SINS. However, in
Figure 7, there are still significant gaps in velocity and position accuracy among the three methods. It can be intuitively observed that the attitude, velocity, and position calculated by SINS have considerable deviations from the true values, with the skyward velocity showing a deviation of hundreds of meters per second. The values predicted and compensated by DNN and LSTM show significantly reduced deviations from the true values, but intuitively, the deviation of DNN from the true values is obviously smaller than that of LSTM. In
Figure 8, the three-dimensional trajectory calculated by SINS only coincides with the real trajectory in the initial stage of carrier movement, after which the deviation gradually increases and is particularly pronounced in height performance. After error prediction and correction using DNN and LSTM, the height is essentially at the same level as the real height. For ground-moving carriers, the planar position is of greater concern to us. In
Figure 9, the two-dimensional trajectory clearly shows that the planar trajectory predicted and corrected by DNN basically fluctuates around the real trajectory. LSTM prediction also shows a certain effectiveness, but its fluctuation is more pronounced, and the movement direction in the later stage is opposite to the carrier’s true movement direction. In contrast, the planar trajectory calculated by SINS basically cannot describe the carrier’s true movement.
Figure 6,
Figure 7,
Figure 8 and
Figure 9 can only provide qualitative judgment of the differences between different methods, while
Figure 10 allows for quantitative analysis of the accuracy of each method. In terms of pitch error, the maximum value of SINS calculation is 2.5 × 10
−2 radians, SINS-LSTM is 3.3 × 10
−3 radians, and SINS-DNN is 1.07 × 10
−3 radians. For roll error, the maximum value of SINS calculation is 2.0 × 10
−2 radians, SINS-LSTM is 3.14 × 10
−3 radians, and SINS-DNN is 4.70 × 10
−4 radians. In terms of yaw error, the maximum value of SINS calculation is 2.42 × 10
−2 radians, SINS-LSTM is 3.86 × 10
−3 radians, and SINS-DNN is 1.10 × 10
−4 radians. For eastward velocity error, the maximum value of SINS calculation is 455 m/s, SINS-LSTM is 135 m/s, and SINS-DNN is 4.71 m/s. For northward velocity error, the maximum value of SINS calculation is 26.8 m/s, SINS-LSTM is 8.22 m/s, and SINS-DNN is 4.16 m/s. In terms of skyward velocity error, the maximum value of SINS calculation is 1.34 × 10
4 m/s, SINS-LSTM is 3.74 × 10
3 m/s, while SINS-DNN remains stable within 10 m/s. For latitude error, the maximum value of SINS calculation is 3.83 × 10
−3 radians, SINS-LSTM is 9.68 × 10
−4 radians, and SINS-DNN is 7.45 × 10
−4 radians. In terms of longitude error, the maximum value of SINS calculation is 3.82 × 10
−2 radians, SINS-LSTM is 5.97 × 10
−3 radians, and SINS-DNN is 1.5 × 10
−4 radians. For height error, the maximum value of SINS calculation is 7.7 × 10
6 m, SINS-LSTM is 2.2 × 10
6 m, while SINS-DNN remains within 10 m. From the above analysis, it can be seen that overall, regardless of which method is used for prediction and correction, the accuracy is higher than pure inertial navigation calculation, with attitude, velocity, and position errors all significantly reduced. For carriers with planar motion, heading and position accuracy are of primary concern. Therefore, examining the errors in yaw and latitude/longitude, between SINS-LSTM and SINS-DNN, except for the relatively close latitude error, all other errors differ by an order of magnitude. This leads to the fact that in
Figure 8, the motion trajectory of SINS-DNN is more closely aligned with the real trajectory than SINS-LSTM, in other words, SINS-DNN has a higher positioning accuracy.
In terms of height, it is evident from the experimental results that both DNN and LSTM significantly reduce the height error. The underlying reason is that pure inertial navigation tends to diverge rapidly in the height channel, while prediction and compensation using DNN or LSTM essentially introduce height damping, resulting in an immediate improvement. However, for ground or surface vehicles, the height value is not a primary navigation parameter of interest. Therefore, although the error is significantly reduced, we do not discuss it in detail.
6. Conclusions and Prospects
This study proposes a DNN-based method for predicting and correcting the errors of SINS, aiming to address the persistent challenge of long-term error accumulation in inertial navigation. By constructing a prediction model that takes angular and velocity increments, as well as real-time attitude and velocity states as input features, and by using attitude, velocity, and position errors as multidimensional response targets, the method effectively captures the nonlinear, integrated propagation mechanisms of inertial errors. Experimental results based on both simulation and real-world vehicle motion data demonstrate that the proposed DNN approach significantly outperforms traditional SINS- and LSTM-based predictions in terms of accuracy and stability, particularly in maintaining position and heading consistency over extended durations. The method shows robust adaptability in GPS-denied environments, offering a promising alternative or complement to integrated navigation systems dependent on external signals. Nevertheless, the current model’s effectiveness is closely tied to the characteristics of the inertial hardware used during training, which may limit its generalization across different IMU devices. Future research will focus on enhancing cross-device generalization through transfer learning and domain adaptation techniques, as well as exploring lightweight network architectures for real-time onboard deployment in constrained platforms such as UAVs and underwater vehicles.
Of course, this study also has some unresolved issues at present. Among them, a key issue is that the navigation errors predicted using DNN or LSTM in the study represent the total sum of errors including the effects of inertial device errors, temperature errors, as well as vehicle maneuvering and environmental factors. In other words, the study does not involve the estimation and compensation of various types of errors. Therefore, the methods involved in this study would theoretically only have significant effects when using the same IMU, in roughly the same external environment, and when vehicle maneuvering changes are not drastic. This also limits the universality of the method to some extent. However, this does not mean that the research content and results are worthless. In the future, it might be possible to consider incorporating factors such as temperature into the features, implementing adaptive error learning for spatial environment and temperature using multi-layer recurrent neural networks [
17,
18] composed of LSTM and RNN.