4.2. Design of Experiments
To evaluate the performance of the proposed LSTM-EKF algorithm, three experiments were conducted: (1) a long-term reciprocating rotation experiment, (2) a camera occlusion scanning experiment, and (3) an attitude accuracy evaluation experiment. The goal of the long-term reciprocating rotation experiment is to verify that the LSTM-EKF algorithm can effectively suppress the divergence of IMU cumulative errors during extended measurements. In addition, by comparing with conventional AKF and KF algorithms, this test demonstrates the proposed algorithm’s ability to accurately track target attitude even under visual occlusion.
The attitude accuracy evaluation experiment assesses the angular measurement accuracy at various rotation angles. It aims to validate that, compared with the AKF algorithm, the LSTM-EKF algorithm maintains high accuracy even when the rotation exceeds the vision sensor’s measurement range.
Prior to the experiments, the total station and vision sensor were rigidly mounted at fixed positions, and the cooperative target was secured to the turntable. The IMU was installed inside the target. Subsequently, calibration was performed for both the IMU error model and the intrinsic parameters of the vision sensor. The rotation matrices between all relevant coordinate systems were determined to establish their spatial relationships. The IMU sampling frequency was set to 100 Hz, and the vision sensor was configured to sample at 1 Hz. The specific parameters of the turntable and IMU used in the experiment are shown in Table 3.
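The 100 Hz IMU stream and the 1 Hz vision stream must be aligned before fusion: the state is propagated at every IMU tick and corrected only on ticks where a camera frame arrives. A minimal sketch of that timing logic follows; only the two sampling rates come from the text, everything else is illustrative:

```python
IMU_HZ, CAM_HZ = 100, 1  # sampling rates from the experimental setup

def fusion_schedule(duration_s):
    """Yield (time, has_camera) pairs: the filter propagates at every
    IMU tick and performs a visual correction only on ticks where a
    1 Hz camera frame lands."""
    steps = int(duration_s * IMU_HZ)
    ratio = IMU_HZ // CAM_HZ          # 100 IMU samples per camera frame
    for k in range(steps):
        t = k / IMU_HZ
        yield t, (k % ratio == 0)     # camera frame on every 100th tick

# In 2 s there are 200 IMU propagations but only 2 camera corrections.
ticks = list(fusion_schedule(2.0))
n_corrections = sum(has_cam for _, has_cam in ticks)
```

This rate mismatch is also why IMU drift between camera frames matters: the filter runs 100 propagation steps for every visual correction.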
The experimental procedures are as follows:
- 1. Long-Term Reciprocating Rotation Experiment:
Since complex motions can theoretically be decomposed into simple motions with different periods, this paper adopts a relatively simple uniform-speed periodic reciprocating motion to provide a preliminary verification of the method; more complex motion patterns will be considered in future work.
The turntable was configured to rotate back and forth over a ±50° range. After the measurement system was activated, the turntable executed multiple reciprocating rotations. During the experiment, the vision sensor was intermittently occluded for various durations to simulate missing visual data. Once the predefined number of rotations was completed, the turntable was stopped and allowed to stabilize before the measurement was terminated.
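The reciprocating reference motion and the intermittent occlusions can be sketched as below. The ±50° amplitude comes from the text; the period, the outage positions, and all function names are illustrative assumptions:

```python
def reciprocating_angle(t, amplitude_deg=50.0, period_s=20.0):
    """Triangular-wave reference: uniform-speed back-and-forth rotation
    over +/- amplitude_deg. The amplitude matches the text; the period
    is an illustrative choice."""
    phase = (t / period_s) % 1.0
    tri = 4.0 * abs(phase - 0.5) - 1.0   # triangle wave in [-1, 1]
    return amplitude_deg * tri

def occlusion_mask(n_frames, outages):
    """Mark camera frames lost during each (start, length) outage,
    mimicking the intermittent occlusions applied in the experiment.
    True = frame available, False = occluded."""
    mask = [True] * n_frames
    for start, length in outages:
        for i in range(start, min(start + length, n_frames)):
            mask[i] = False
    return mask
```

A triangle wave (rather than a sinusoid) reflects the uniform-speed character of the motion: the angular rate is constant in magnitude and flips sign at the turning points.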
- 2. Camera Occlusion Scanning Experiment:
Ten sets of repeated occlusion scanning experiments were conducted. All algorithms were first evaluated on unobstructed data to obtain a benchmark RMSE. Visual occlusion segments with lengths ranging from 1 to 30 frames were then introduced at identical time points across all runs. The RMSE of each algorithm with respect to the ground truth was calculated, and the results were recorded for further analysis.
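The benchmark-then-scan procedure might be sketched as follows. Here `run_estimator` is a hypothetical interface standing in for any of the compared filters (KF, AKF, LSTM-EKF), not the authors' code:

```python
import math

def rmse(estimates, truth):
    """Root-mean-square error between an estimate sequence and ground truth."""
    return math.sqrt(sum((e - g) ** 2 for e, g in zip(estimates, truth))
                     / len(truth))

def occlusion_scan(run_estimator, truth, onset, max_len=30):
    """Re-run an estimator with occlusion windows of length 1..max_len,
    all starting at the same onset frame, and record the RMSE per
    occlusion length. run_estimator(occluded_frames) is a placeholder
    that returns the estimator's output sequence."""
    results = {}
    for length in range(1, max_len + 1):
        occluded = set(range(onset, onset + length))
        results[length] = rmse(run_estimator(occluded), truth)
    return results
```

Fixing the onset frame across all runs, as the text specifies, isolates the effect of the occlusion length from the effect of where in the motion cycle the occlusion begins.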
- 3. Attitude Accuracy Evaluation Experiment:
The turntable was programmed to perform a single rotation from 0° to 80°, with measurements taken at 5° intervals. Repeated measurements were conducted for each angle. After system initialization, the turntable was rotated to the specified angle, and the results were recorded once stabilization was achieved.
4.3. Experimental Results and Analysis
The experimental results of the long-term reciprocating rotation experiment are illustrated in Figure 9.
As shown in Figure 9, the shaded regions indicate periods of visual signal loss. The visual output is divided into measured and predicted values, which are clearly distinguished in the figure. During the occlusion periods, the missing visual data are reconstructed by the LSTM. From left to right, the numbers of missing visual frames are 5, 10, and 20, and t1, t2, and t3 denote the respective start times of the visual occlusions.
It is evident that the standalone IMU exhibits significant cumulative errors over prolonged measurements. However, the proposed LSTM-EKF algorithm effectively corrects the IMU drift, significantly enhancing the measurement accuracy. Notably, even during visual occlusion, the LSTM-EKF algorithm maintains reliable attitude tracking and measurement performance.
To further evaluate the robustness of the algorithm, the LSTM-EKF is compared with the traditional AKF and KF algorithms under varying durations of visual occlusion. The comparative results are presented in Figure 10.
As illustrated in Figure 10, the errors of the AKF and KF algorithms progressively increase once the visual input becomes unavailable. In contrast, the LSTM-EKF maintains stable performance by using its LSTM to adaptively correct the IMU drift throughout the occlusion interval.
Because the initial moments of visual occlusion differ, the cumulative errors at time zero differ across Figure 10a–c. The occlusion in Figure 10c begins at a quasi-static moment, at which the cumulative errors of AKF and KF are minimal; this is one reason the error of our method is comparatively large within the first 11 s. Nevertheless, as the overall trend in Figure 10c shows, our method keeps the error within a bounded range for up to 20 missing frames, and its suppression of error becomes especially pronounced once the AKF and KF errors begin to diverge over time.
When the visual occlusion becomes prolonged, as shown in Figure 10c, the performance of the AKF and KF methods deteriorates significantly: once the visual loss exceeds 11 s, their errors diverge rapidly. This is because, when the camera is obstructed, AKF and KF degrade to updating solely from the IMU; if the motion then changes, there is no camera correction to restrain the error, and it diverges rapidly. In contrast, the proposed LSTM-EKF algorithm suppresses this divergence by combining visual prediction with a residual-based adaptive correction mechanism, thereby maintaining high measurement accuracy even in extended occlusion scenarios.
To comprehensively evaluate the performance of the three fusion algorithms, we compare both their overall measurement performance and their performance during visual occlusion. The comparison results are summarized in Table 4.
As shown in Table 4, the LSTM-EKF algorithm achieves the lowest RMSE and ME over the entire measurement duration compared with both AKF and KF, demonstrating superior accuracy and long-term stability. Although KF slightly outperforms AKF owing to the uniform-speed rotational motion of the turntable, LSTM-EKF still reduces the MAE by 19% and the RMSE and ME by approximately 29% relative to KF.
During the visual occlusion periods, the estimation errors of both AKF and KF increase significantly, while the LSTM-EKF maintains consistent accuracy. Compared with AKF, which exhibits the lower errors among the traditional methods, LSTM-EKF achieves an 85% reduction in ME and approximately 59% reductions in both RMSE and MAE. This performance gain is attributed to the LSTM's ability to accurately predict visual measurements during vision loss. Moreover, the MAE and RMSE during occlusion remain close to those over the full sequence, confirming the algorithm's robustness. These results indicate that the proposed LSTM-EKF algorithm provides accurate and reliable attitude estimation, especially under challenging conditions such as extended visual occlusion.
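The three metrics reported in Table 4 can be computed as below. Note one assumption: ME is taken here as the maximum absolute error, since the text does not expand the abbreviation:

```python
import math

def error_metrics(estimates, truth):
    """Return (RMSE, MAE, ME) for an estimate sequence against ground
    truth. ME is interpreted as the maximum absolute error -- an
    assumption, as the source does not define the abbreviation."""
    errs = [e - g for e, g in zip(estimates, truth)]
    rmse = math.sqrt(sum(x * x for x in errs) / len(errs))
    mae = sum(abs(x) for x in errs) / len(errs)
    me = max(abs(x) for x in errs)
    return rmse, mae, me
```

Computing the same metrics once over the full sequence and once over only the occluded frames, as Table 4 does, separates overall accuracy from robustness to vision loss.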
- 2. Camera Occlusion Scanning Experiment:
To evaluate the maximum effective prediction duration of the proposed LSTM network, a visual-missing-duration scanning experiment was conducted. Ten long-duration oscillation experiments were performed with the same visual occlusion onset time. Under different visual-missing durations, the RMSEs of the IMU-only, KF, AKF, and LSTM-EKF methods were systematically compared. The experimental results are presented in Figure 11.
As shown in Figure 11, the estimation accuracy of all compared methods generally degrades as the number of consecutively missing camera frames increases. Nevertheless, among all fusion strategies, the LSTM-EKF method consistently achieves the highest estimation accuracy across the different visual-missing durations.
When camera measurements are missing for 1–10 consecutive frames, both the LSTM-EKF and AKF approaches exhibit significantly lower RMSEs than the conventional KF. For AKF, this improvement can be attributed to its adaptive mechanism, which adjusts model and noise parameters within a short time interval to suppress error divergence. However, once the number of consecutively missing frames exceeds 10, the RMSE of AKF also begins to diverge, following a trend similar to that of the KF.
In contrast, the LSTM-EKF method exploits the LSTM network to predict the camera outputs and provides effective pseudo-measurements to the EKF during visual outages, thereby compensating for missing measurement updates and extending the duration over which RMSE divergence is suppressed. When the number of missing frames exceeds 20, the RMSE of the LSTM-EKF method also starts to diverge. These results indicate that, within 20 consecutive missing camera frames, the proposed LSTM-EKF approach can more effectively correct the accumulated IMU errors and maintain stable EKF updates compared with the KF and AKF methods.
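One simple way to quantify the divergence points read off Figure 11 (about 10 frames for AKF, about 20 for LSTM-EKF) is to find the first missing-frame count at which the RMSE exceeds a multiple of the unoccluded baseline. The threshold factor below is an illustrative choice, not taken from the text:

```python
def divergence_onset(rmse_by_frames, factor=2.0):
    """Given a mapping {missing_frame_count: RMSE} that includes the
    unoccluded baseline at key 0, return the smallest frame count whose
    RMSE exceeds factor * baseline, or None if none does within the
    scanned range. The factor of 2 is an arbitrary illustrative choice."""
    baseline = rmse_by_frames[0]
    for n in sorted(k for k in rmse_by_frames if k > 0):
        if rmse_by_frames[n] > factor * baseline:
            return n
    return None
```

Applying such a rule to each method's RMSE curve gives a single comparable number for how long each filter withstands the outage before its error diverges.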
- 3. Attitude Accuracy Evaluation Experiment:
Since the effective measurement range of the vision sensor is limited to 55°, visual measurements fail when the turntable rotation angle exceeds this range. The proposed LSTM-EKF algorithm leverages neural network predictions to estimate angles beyond this visual range, thereby effectively extending the measurement capability of the system. To evaluate this, an angular accuracy evaluation experiment was designed with turntable rotation angles ranging from 0° to 80°, in 5° increments, with multiple repeated measurements conducted for each group. As a representative example, the measurement results for an 80° rotation are illustrated in Figure 12.
In Figure 12, two types of colored markers indicate the visual measurement output and the predictions from the LSTM. The total measurement duration during the 80° rotation was approximately 21 s. Over this period, the IMU accumulated an error of approximately 1.3°, which continued to diverge over time. In contrast, the errors of the LSTM-EKF, AKF, and KF algorithms were significantly smaller, indicating effective suppression of IMU error drift.
When the rotation angle exceeds 55°, the cooperative target moves out of the vision sensor's observable range. During this visual loss phase, the LSTM-EKF algorithm employs its LSTM to continue correcting the system's outputs based on learned motion patterns, thereby maintaining high measurement accuracy. The quantitative evaluation results of angular measurement accuracy are summarized in Table 5.
As shown in Table 5, the absolute error of IMU angle measurements increases with larger rotation angles. In contrast, the measurement errors of both the LSTM-EKF and AKF fusion methods are significantly lower than that of the standalone IMU. A comparison between LSTM-EKF and AKF demonstrates that LSTM-EKF consistently achieves higher angular measurement accuracy.
Although the performance of both fusion methods degrades when the target motion exceeds the visual measurement range, the LSTM-EKF algorithm maintains superior accuracy owing to its visual prediction network, which allows it to adaptively correct the fusion results even in the absence of camera inputs.
In the angular range of [0°, 55°], where the camera measurements are valid, the maximum absolute error of the LSTM-EKF method is 0.18°, compared to 0.29° for the AKF method. In the range of [55°, 80°], where vision is occluded, the LSTM-EKF maintains a maximum absolute error of 0.26°, while the AKF error increases to 0.38°.
To evaluate the contribution of the LSTM network within the proposed LSTM-EKF algorithm, an ablation study was conducted to analyze its impact on attitude estimation accuracy in the three aforementioned experiments, using RMSE as the evaluation metric. The results are summarized in Table 6, where Experiment 1, Experiment 2, and Experiment 3 correspond to the Long-Term Reciprocating Rotation Experiment, the Camera Occlusion Scanning Experiment, and the Attitude Accuracy Evaluation Experiment, respectively.
As shown in Table 6, incorporating the LSTM significantly improves attitude estimation performance relative to the EKF without it: the RMSE is reduced by approximately 52.4% in the Long-Term Reciprocating Rotation Experiment, 21.9% in the Camera Occlusion Scanning Experiment, and 45.7% in the Attitude Accuracy Evaluation Experiment. Because the Camera Occlusion Scanning Experiment demonstrated the superior performance of the LSTM-EKF method for up to 20 consecutive missing camera frames, the ablation study reports the average RMSE over occlusion durations of 1 to 20 frames.
In summary, the proposed LSTM-EKF method effectively combines the high dynamic responsiveness of the IMU with the high-accuracy characteristics of vision sensors and the temporal prediction capabilities of LSTM neural networks. Compared to conventional vision systems, this approach significantly increases the measurement frequency while reducing the cumulative error typically associated with IMU-based systems during dynamic motion. As a result, the LSTM-EKF algorithm enables high-precision dynamic attitude estimation even under visual occlusion conditions.