Abstract
The Solar System Boundary Exploration (SSBE) mission is a focal point for future deep space exploration. The mission poses many scientific difficulties, including an extremely long exploration distance, an extremely long in-orbit flight time, and a significant communication delay between the ground and the probe, so the probe must be intelligent enough to navigate autonomously. Traditional navigation schemes cannot provide high-accuracy autonomous navigation for the probe independently of the ground. High-accuracy intelligent astronomical integrated navigation therefore offers new methods and technologies for navigating the SSBE probe. During its long cruise in orbit, the SSBE probe is disturbed by multiple sources, such as solar light pressure and a complex, unknown environment. To ensure high-accuracy estimation of the position and velocity state errors of the probe in the cruise phase, an autonomous intelligent integrated navigation scheme based on the X-ray pulsar/solar and target planetary Doppler velocity measurements is proposed. Reinforcement Q-learning is introduced, and a reward mechanism is designed for trial-and-error tuning of the state and observation noise error covariance parameters. A federated extended Kalman filter (FEKF) based on Q-learning (QLFEKF) is proposed to achieve high-accuracy state estimation for the autonomous intelligent navigation system in the SSBE probe cruise phase. The main advantage of the QLFEKF is that combining Q-learning with the conventional federated filtering method can optimize the state parameters in real time and achieve high position and velocity state estimation (PVSE) accuracy. Compared with the conventional FEKF integrated navigation algorithm, the Q-learning-based federated filter improves the PVSE accuracy by 55.84% in position and 37.04% in velocity, demonstrating the higher accuracy and greater capability of the proposed autonomous intelligent integrated navigation algorithm. The simulation results show that the intelligent integrated navigation algorithm based on QLFEKF has higher navigation accuracy and can satisfy the demand for autonomous, high-accuracy navigation in the SSBE cruise phase.
1. Introduction
The exploration of the extremely distant, dark, and cold regions at the solar system boundary is a focus of deep space exploration. As human exploration has progressively expanded from the Moon and Mars toward the solar system boundary, further space exploration could continue to open new windows for humanity's understanding of the universe [1]. Only a limited number of deep space missions have continued to explore the solar system boundary after completing their scheduled exploration missions, such as Pioneer 10–11 [2,3,4], Voyager 1–2 [5,6], and New Horizons [7,8]. With the continuous extension of the field of deep space exploration, China has put forward the SSBE program, which could build the capability for a probe to reach the whole area of the solar system and realize ultra-long-distance detection from the inner celestial bodies to interstellar space, providing the capability to independently explore interplanetary and more distant space [1,9]. Because of the unknown and varied exploration environment, the extremely long exploration distance and orbital flight time, and the significant communication delay between the SSBE probe and the ground, implementing the exploration mission is very difficult [10]. Because the probe is so far from the ground, it cannot rely on traditional navigation methods, such as ground-based radio stations, very long baseline interferometry (VLBI), and tracking telemetry and command (TT&C) communication networks, to provide real-time and high-accuracy navigation information, which presents new challenges for ultra-long-distance autonomous high-accuracy navigation [11,12].
For the solar system boundary probe, ground-based tracking navigation is no longer applicable and cannot provide autonomous, real-time, high-accuracy navigation at great distances from the ground [12]. To improve the autonomous in-orbit operation capability of the probe, autonomous navigation technology that is independent of ground navigation measurements has been studied in depth. X-ray pulsar navigation (XNAV) is a new kind of autonomous celestial navigation technology that can be applied to both near-Earth and deep space [13,14,15]. Using three X-ray pulsars, the position of a deep space probe can be determined autonomously. By collecting the X-ray radiation signals from an X-ray pulsar with an X-ray detector, the pulse time-of-arrival (TOA) can be obtained. The difference between the pulse arrival time at the detector and the predicted arrival time at the solar system barycenter (SSB) characterizes the position of the probe along the direction of the X-ray pulsar, from which the position of the probe can be determined [16,17]. Reichley et al. [18] and Downs [19] stated that the position of a spacecraft can be determined by measuring the phase of a periodic signal, and they speculated for the first time that X-ray pulsars could be used to determine the time and position of a spacecraft in orbit [15]. Runnels et al. [20] proposed a method to estimate the comprehensive six-degree-of-freedom (DOF) position, navigation, and time solutions for deep space exploration spacecraft, which relied on measuring the arrival times and angles of X-ray photons from X-ray pulsars and other bright stars. Gao et al. [21] proposed differential XNAV based on the time difference of pulse arrival and analyzed the impact of ephemeris errors on the photon TOA correction and the observation equations; three Mars orbits with different altitudes were simulated to validate the efficacy of the navigation scheme and its suitability for different heights above the Martian surface. When the detector moves relative to stellar objects in deep space, the measured spectrum of those objects undergoes a Doppler frequency offset, which reflects the velocity of the probe relative to the stellar objects [22]. The radial velocity of the probe relative to a navigation stellar object can be calculated from this Doppler frequency offset, and the velocity state of the probe can then be estimated. Yim et al. [23] adopted Doppler measurements generated by the relative motion of the spacecraft and the Sun as the key measurement, which realized autonomous navigation of the spacecraft for deep space exploration, although the positioning accuracy was not particularly high. Ning et al. [24] introduced a celestial navigation algorithm assisted by differential Doppler measurements, which used the difference between the Doppler radial velocities measured in two adjacent observation periods as the measured value to mitigate the effects of solar spectrum frequency instability and enhance autonomous celestial navigation accuracy.
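As a rough illustration of the two measurement principles described above, the following Python sketch recovers a position from three pulsar timing offsets and a radial velocity from a Doppler frequency offset. It assumes a first-order model only: the timing offset is taken as the projection of the position onto the pulsar direction divided by the speed of light, and the classical Doppler relation is used. The relativistic and higher-order geometric terms of a real timing model are ignored. The function names and all numerical values are illustrative assumptions and are not taken from the cited works.

```python
import numpy as np

C = 299_792.458  # speed of light in km/s


def position_from_toa(directions, delays_s):
    """Solve n_i . r = c * dt_i for r (km), given three pulsar directions.

    directions: 3x3 array, one pulsar line-of-sight vector per row.
    delays_s:   first-order timing offsets (s) between the SSB and the probe.
    """
    n = np.asarray(directions, dtype=float)
    n = n / np.linalg.norm(n, axis=1, keepdims=True)  # normalize each row
    return np.linalg.solve(n, C * np.asarray(delays_s, dtype=float))


def radial_velocity(f_observed, f_emitted):
    """Classical Doppler relation; positive means the probe is receding (km/s)."""
    return C * (f_emitted - f_observed) / f_emitted


if __name__ == "__main__":
    # Made-up, non-coplanar pulsar directions and a made-up true position (km).
    dirs = np.array([[1.0, 0.2, 0.1], [0.1, 1.0, 0.3], [0.2, 0.1, 1.0]])
    unit = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    r_true = np.array([1.0e8, -2.0e8, 5.0e7])
    delays = unit @ r_true / C  # synthetic, noise-free timing offsets
    print(position_from_toa(dirs, delays))
    print(radial_velocity(1.0 - 10.0 / C, 1.0))
```

With noisy delays the recovered position degrades according to the geometry of the three directions, which is one reason why nearly coplanar pulsars are a poor choice in this simplified picture.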
Since deep space probes operate in long-endurance, highly dynamic, and extreme environments, it is difficult for a single navigation mode to satisfy the demands of high accuracy and high reliability for deep space probe navigation. Integrated navigation formed by combining multiple navigation modes is an inevitable trend for future deep space exploration navigation systems. Liu et al. [25] proposed an autonomous integrated navigation scheme based on Doppler/XNAV, in which the Doppler autonomous navigation provides high-accuracy velocity error estimation by measuring the radial velocity relative to the Sun. The complementarity of the two navigation methods can compensate for the low navigation accuracy caused by reducing the pulse signal receiving area of the X-ray pulsar detection sensor. Pan et al. [26] used the stability of solar Doppler differential navigation to put forward a joint solar time difference of arrival (TDOA)/Doppler difference observation and integrated it with Mars angle navigation. This approach makes the joint Mars angle and solar light observation navigation fully observable, providing higher navigation accuracy and stability. Cui et al. [27] proposed an integrated navigation scheme based on X-ray pulsars and Doppler measurements to improve the error estimation precision of the probe's entry state during the final approach phase of Mars exploration; the scheme uses the X-ray pulsar observations and Doppler velocity measurements to complement each other and so overcome the reduced velocity state estimation performance of the XNAV in this phase.
During the design of the autonomous navigation system for the SSBE cruise phase, compared with other filtering estimators [28,29], the extended Kalman filter (EKF) is the preferred state estimator for the cruise phase navigation system because of the nonlinearity of the orbital dynamics model [30]. The EKF is widely used in various scenarios involving navigation systems and has a very stable navigation performance [31,32]. This state estimation algorithm has also been favored by researchers from the field of astronomical autonomous navigation. In addition, for the navigation system, the measurement noise is usually based on the measurement error of the sensor used in actual tasks. The environment disturbances mainly come from micrometeoroids, cosmic rays, and solar wind in the space environment, which all have an impact on the navigation accuracy of the deep space probe, and it is necessary to set the noise according to the scene in the different phases during the simulation analysis. Kang et al. [33] combined Doppler velocity measurements and the XNAV and proposed the integrated navigation method of the Doppler velocity measurements based on the double measurement model and X-ray pulsar. The EKF was used to achieve continuous high-accuracy state estimation, which effectively suppressed the accumulative state errors in the Doppler velocity measurements and improved the PVSE precision of the autonomous navigation system for the deep space exploration spacecraft. To enhance the autonomous navigation capability of the deep space probes, many FEKF integrated navigation methods based on multiple measurement models have been proposed [34]. Yang et al. [35] applied the XNAV and ultraviolet sensor to measure the position, velocity, and attitude of the deep space probe. 
The researchers also designed the FEKF integrated navigation system based on the X-ray pulsar and ultraviolet sensor, and the system could perform clock time prediction, attitude determination, and orbit determination concurrently during the deep space probe in-orbit flight. It could supply high-precision error state estimations. In order to eliminate the influence of the unstable solar spectra on Doppler velocity measurements and given that Doppler difference navigation might accumulate position errors over time, Liu et al. [36] combined the solar Doppler difference velocity measurement and the XNAV method to propose integrated X-ray pulsar/Doppler differential velocity measurement navigation based on FEKF for deep space exploration. This method has high robustness for Doppler velocity measurement deviation brought by the solar spectral instability and also ensures high-accuracy state estimation of the deep space probe in orbit.
Intelligent autonomous navigation algorithms that incorporate reinforcement Q-learning are currently favored by researchers [37]. In particular, they have been applied successfully to state estimation for in-orbit spacecraft, providing high precision for spacecraft navigation systems [38]. Xiong et al. [39] proposed a state error estimation algorithm that combines reinforcement Q-learning with the EKF and used reinforcement learning (RL) to restrain the influence of an uncertain noise error covariance matrix on the estimation performance of the EKF navigation algorithm; a covariance adaptive adjustment strategy based on Q-learning was designed to enhance the state estimation performance of autonomous nonlinear navigation systems for spacecraft. In addition, they adopted astronomical integrated navigation based on an ultraviolet earth sensor and an optical interferometer and introduced the QLEKF to integrate the information from the two measurements and estimate the position, velocity, and attitude state errors of the spacecraft [40]. They also assessed the optical path delay bias in the optical interferometer, and the proposed QLEKF demonstrated superior estimation performance compared with the conventional EKF. Nemati et al. [41] adopted RL to enhance the position estimation of the spacecraft and proposed a state estimation algorithm that combines RL with the Kalman filter to obtain the optimal state and observation noise covariances, which improved the PVSE accuracy for spacecraft navigation. Xiong et al. [42] recently developed, based on their own previous research, an improved Q-learning EKF for optical autonomous navigation that measures the line of sight (LOS) direction of non-cooperative spacecraft targets; it improved the PVSE precision of spacecraft navigation based on the space target LOS by fine-tuning the filtering parameters of the process noise error covariance matrix.
Building on the above research, this paper presents an integrated navigation scheme based on the X-ray pulsar and two-dimensional Doppler velocity measurements to improve the accuracy of autonomous integrated navigation for the SSBE cruise phase; the scheme compensates for the position estimation error accumulated in the Doppler velocity measurements. The FEKF algorithm is designed to correct, in real time, the position information obtained from the two-dimensional Doppler velocity measurements using the position estimates of the XNAV, thereby eliminating the accumulated position errors caused by the Doppler velocity measurements. A Q-learning-based FEKF integrated navigation algorithm is developed, which tunes the state and observation noise error parameters of the navigation system in real time according to the operating environment of the probe during the cruise phase. This research differs from the X-ray pulsar and Doppler velocity dual-measurement-model integrated navigation given in [33] in that a federated filter composed of a primary filter and sub-filters is adopted to fuse the information from multiple navigation modes and guarantee the estimation performance of the autonomous integrated navigation. Compared with the above research [39,40,41,42], the intelligent Q-learning algorithm used to tune the state and observation noise covariance parameters is also improved accordingly. The reward mechanism and Q-table of the reinforcement Q-learning are designed according to the sub-filter characteristics of the federated filter, and the iterative period of the Q-learning is taken as the index for evaluating the cumulative reward. The optimal state and observation noise covariance parameters are obtained by adjusting the grid size of the Q-table and the learning parameters. 
The proposed intelligent integrated navigation method based on QLFEKF has the potential to realize high precision and autonomy.
The main work of this paper is an in-depth study of the accuracy and feasibility of intelligent integrated navigation for the cruise phase of the SSBE, based on the QLFEKF algorithm with X-ray pulsar/solar and target planetary Doppler velocity measurements. This paper is divided into five sections. Following this introduction, Section 2 models the orbital dynamics and the integrated navigation measurements based on the X-ray pulsar and two-dimensional Doppler velocity measurements for the cruise phase, and gives the state and measurement equations of the intelligent navigation system. Section 3 develops the QLFEKF fusion state estimation algorithm, which combines reinforcement Q-learning with the conventional Kalman filter. Section 4 simulates and analyzes the performance of the QLFEKF-based intelligent integrated navigation using X-ray pulsar/solar and target planetary Doppler velocity measurements. The results are compared with the EKF and the FEKF integrated navigation algorithms, and the simulation and analysis confirm the high accuracy of the proposed intelligent integrated navigation algorithm. Finally, the conclusions are presented in Section 5.
4. Simulation and Results Analysis
The accuracy, effectiveness, and feasibility of the autonomous intelligent integrated navigation method are important indices of the autonomous operation capability of the SSBE probe in the cruise phase. To further analyze its performance, the proposed method is compared with the FEKF integrated navigation algorithm to verify the high state estimation accuracy of the X-ray pulsar/solar and target planetary Doppler velocity measurement intelligent integrated navigation algorithm based on the QLFEKF.
4.1. Simulation Initial Conditions
This section presents the initial position and velocity state parameters of the probe in the cruise phase, which are used to initialize the solution of the orbital dynamics model. The initial simulation conditions are shown in Table 1. The simulation time is from 1 September 2026 12:00:00:00.000 UTCG to 7 September 2026 12:00:00:00.000 UTCG, so the total simulation time is 6 days.
Table 1. Initial state orbit parameters for the probe.
Three X-ray pulsars are used in one of the measurement models for the intelligent integrated navigation, namely PSR B1937+21, PSR B0531+21, and PSR J1821-24. Their specific parameters are shown in Table 2. The X-ray pulsar detector has an effective area of 1 m². The position vector of the SSB relative to the solar center of mass is calculated from the DE441 ephemeris of the Jet Propulsion Laboratory (JPL). According to the approximate calculation formula for the pulsar observation noise [53,54], the standard deviations of the measurement noise for the three X-ray pulsars are p1 = 0.42060 km, p2 = 0.14070 km, and p3 = 0.44488 km, respectively.
Table 2. The corresponding parameters of the selected X-ray pulsars.
The initial conditions of the relevant parameters for the X-ray pulsars and solar/target planetary Doppler measurement model that would be adopted in the simulation are listed in Table 3 and mainly include the relevant initial measurement parameter characteristics of the X-ray pulsar detectors and Doppler spectrometers.
Table 3. The corresponding parameters of the measurement models.
In the QLFEKF algorithm, reinforcement Q-learning is adopted to tune the state and observation noise error covariance matrices and to obtain their optimum values. The Q-learning design parameters are the learning rate α, the discount factor γ, and the action selection probability ε. The Q-learning grid size is set to M × N = 20 × 20. For a given range of the tuned covariance parameters, a larger grid searches the range more fully and covers more of it. The root mean squared error (RMSE) [54] is adopted as the evaluation indicator for the precision analysis of the PVSE in the probe’s cruise phase. The relevant design parameters of the QLFEKF algorithm are summarized in Table 4.
Table 4. The design parameters of the QLFEKF algorithm.
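To make the roles of α, γ, and ε concrete, the following Python sketch shows generic tabular Q-learning on a 20 × 20 grid with ε-greedy action selection. It is only a sketch under stated assumptions and not the reward mechanism of this paper: the action set, the placeholder reward `toy_reward`, the "best" cell, and the values of α, γ, and ε are all hypothetical. In the navigation setting, each grid cell would correspond to one candidate pair of noise covariance settings, and the reward would come from the filter performance accumulated over an iteration cycle.

```python
import numpy as np

M, N = 20, 20  # Q-table grid size quoted in the text
# Hypothetical action set: stay, or move to a neighbouring grid cell.
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def choose_action(q, state, eps, rng):
    """Epsilon-greedy: explore with probability eps, otherwise exploit."""
    if rng.random() < eps:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q[state[0], state[1]]))


def step(state, action):
    """Apply the move and clip the result to the grid edges."""
    di, dj = ACTIONS[action]
    i = min(max(state[0] + di, 0), M - 1)
    j = min(max(state[1] + dj, 0), N - 1)
    return (i, j)


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard rule: Q <- Q + alpha * (r + gamma * max Q(next) - Q)."""
    target = reward + gamma * np.max(q[next_state[0], next_state[1]])
    q[state[0], state[1], action] += alpha * (target - q[state[0], state[1], action])


def toy_reward(state, best=(12, 7)):
    """Placeholder reward (assumption): larger when closer to a made-up cell."""
    return -float(abs(state[0] - best[0]) + abs(state[1] - best[1]))


def train(steps=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Run the learning loop; alpha, gamma, eps here are placeholder values."""
    rng = np.random.default_rng(seed)
    q = np.zeros((M, N, len(ACTIONS)))
    state = (0, 0)
    for _ in range(steps):
        action = choose_action(q, state, eps, rng)
        nxt = step(state, action)
        q_update(q, state, action, toy_reward(nxt), nxt, alpha, gamma)
        state = nxt
    return q, state
```

A finer grid gives the agent more candidate settings to visit, at the cost of a larger table and more iterations before every cell has been tried.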
4.2. Simulation and Analysis
The X-ray pulsar/solar and target planetary Doppler velocity measurement based on the Q-learning federation EKF (STD/XP-QLFEKF) is compared, in terms of navigation state estimation precision, with the X-ray pulsar/solar and target planetary Doppler velocity measurement-based EKF (STD/XP-EKF) and the X-ray pulsar/solar and target planetary Doppler velocity measurement-based federated EKF (STD/XP-FEKF). The RMSE of the PVSE for the probe's cruise phase is used as the evaluation indicator of the navigation accuracy.
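For readers who want to reproduce this kind of evaluation, the following Python sketch computes an overall RMSE and a per-axis RMSE from a series of three-dimensional estimates. It uses one common definition (the root of the mean squared error norm over all samples); the exact formula in [54] is not reproduced in this section and may differ, for example in how Monte Carlo runs are averaged. The function names are illustrative.

```python
import numpy as np


def rmse(estimates, truth):
    """RMSE of the 3-D error norm over all samples (rows)."""
    err = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(np.sum(err**2, axis=-1))))


def rmse_per_axis(estimates, truth):
    """RMSE computed separately for each axis (column)."""
    err = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    return np.sqrt(np.mean(err**2, axis=0))


if __name__ == "__main__":
    truth = np.zeros((4, 3))
    est = np.tile([3.0, 4.0, 0.0], (4, 1))  # constant error of norm 5
    print(rmse(est, truth), rmse_per_axis(est, truth))
```

The overall value corresponds to the kind of curve compared across algorithms, while the per-axis values correspond to the three-axis comparisons.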
Figure 6 and Figure 7 show the RMSE curves of the position estimation error and the three-axis position estimation error for the three integrated navigation algorithms. The figures show that the QLFEKF algorithm achieves high-accuracy position estimation, with an error on the order of 10² m. This indicates that the QLFEKF can adaptively choose appropriate process and observation noise error covariance matrices (the observation noise covering the X-ray pulsar detector and spectrometer measurement errors) by using the cumulative reward mechanism of the Q-learning through its interaction with the flight environment.
Figure 6. Comparison of the position estimate RMSEs between STD/XP-QLFEKF and other integrated navigation algorithms.
Figure 7. Comparison of the position estimate RMSEs for three axes between STD/XP-QLFEKF and other integrated navigation algorithms.
The position estimates of all three integrated navigation algorithms fluctuate to some extent as the accumulated position errors caused by the Doppler velocity measurements are eliminated. However, compared with the STD/XP-EKF and STD/XP-FEKF navigation algorithms, the position state estimation of the STD/XP-QLFEKF is more stable. This result shows that the Q-learning algorithm converges well during parameter tuning, which ensures that the probe can operate continuously and stably in the cruise phase with high precision.
Figure 8 and Figure 9 show that the velocity estimation precision of the STD/XP-QLFEKF algorithm is about 10⁻¹ m/s, which is much better than that of the STD/XP-EKF and STD/XP-FEKF algorithms. The results also indicate that the state error estimation precision of the FEKF is better than that of the EKF, which reflects the superiority of the federated filter in fusing information from multiple measurements. The FEKF combined with the Q-learning algorithm distributes the position, velocity, and state transition information from the different measurement models according to weighted proportions, so that the fusion of the navigation measurement models yields high-accuracy position and velocity state error estimates through the time update and measurement update. In addition, the precision of the intelligent navigation system based on X-ray pulsar/solar and target planetary Doppler velocity measurements converges very stably.
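The weighted fusion mentioned above can be illustrated with the textbook federated-filter equations. The following Python sketch is an assumption-laden illustration rather than the fusion design of this paper: it fuses sub-filter estimates by information weighting, P_g = (Σ P_i⁻¹)⁻¹ and x_g = P_g Σ P_i⁻¹ x_i, and then redistributes the global result with information-sharing coefficients β_i that sum to one. The function names and the example numbers are hypothetical.

```python
import numpy as np


def fuse(states, covariances):
    """Information-weighted fusion of sub-filter estimates."""
    infos = [np.linalg.inv(P) for P in covariances]  # information matrices
    P_g = np.linalg.inv(sum(infos))
    x_g = P_g @ sum(Y @ x for Y, x in zip(infos, states))
    return x_g, P_g


def share(x_g, P_g, betas):
    """Reset each sub-filter with the global state and covariance P_g / beta_i."""
    if abs(sum(betas) - 1.0) > 1e-12:
        raise ValueError("information-sharing coefficients must sum to 1")
    return [(x_g.copy(), P_g / b) for b in betas]


if __name__ == "__main__":
    # Two made-up sub-filter estimates of a 2-D state with equal confidence.
    x1, P1 = np.array([1.0, 0.0]), np.eye(2)
    x2, P2 = np.array([3.0, 0.0]), np.eye(2)
    x_g, P_g = fuse([x1, x2], [P1, P2])
    print(x_g, P_g)
    print(share(x_g, P_g, [0.5, 0.5]))
```

Because the shared covariances are inflated by 1/β_i, fusing them again returns the same global covariance, so the redistribution does not create information that the sub-filters did not have.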
Figure 8. Comparison of the velocity estimate RMSEs between STD/XP-QLFEKF and other integrated navigation algorithms.
Figure 9. Comparison of the velocity estimate RMSEs based on three axes between STD/XP-QLFEKF and other integrated navigation algorithms.
The simulation analysis above verifies that the proposed STD/XP-QLFEKF intelligent integrated navigation is well suited to the cruise phase of the SSBE. Table 5 compares the position and velocity state estimation RMSE errors (3σ) of the three integrated navigation algorithms.
Table 5. The PVSE RMSE errors (3σ) of the integrated navigation system.
Table 5 shows that, for the STD/XP-EKF, the position estimation RMSE error (3σ) is 443.560 m and the velocity estimation RMSE error (3σ) is 0.630 m/s. The PVSE RMSE errors (3σ) of the STD/XP-FEKF are 351.803 m and 0.386 m/s, respectively. With the STD/XP-QLFEKF algorithm proposed in this paper, the PVSE RMSE errors (3σ) for the probe's cruise phase are 152.101 m and 0.243 m/s, respectively. Compared with the STD/XP-FEKF algorithm, the accuracy of the STD/XP-QLFEKF algorithm is greatly improved: according to the statistical analysis, the position and velocity state estimation precision improves by 55.84% and 37.04%, respectively. This is because the Q-learning algorithm dynamically adjusts the process and observation noise error covariances by iterating periodically according to the real-time in-orbit operating state after receiving the different measurement state information. The cumulative reward is obtained through continuous iterations in each iteration cycle (ItCy). The Q-value at the corresponding state position in the Q-table is calculated from the cumulative reward value, and the most appropriate parameters are selected through continuous trial and error, which underpins the superior performance of the STD/XP-QLFEKF algorithm.
Next, the reinforcement Q-learning parameters (α, γ, and ε) are varied to evaluate their effect on the precision of the PVSE errors of the STD/XP-QLFEKF. Each of the three learning parameters is varied over its range in turn, and the resulting RMSE errors (3σ) of the position and velocity state estimation of the STD/XP-QLFEKF algorithm are shown in Figure 10, Figure 11 and Figure 12, respectively.
Figure 10. The PVSE RMSE errors (3σ) of the cruise phase as a function of the learning rate.
Figure 11. The PVSE RMSE errors (3σ) of the cruise phase as a function of the discount factor.
Figure 12. The PVSE RMSE errors (3σ) for the cruise phase as a function of the action selection probability.
Figure 10, Figure 11 and Figure 12 show that changing the Q-learning design parameters has relatively little influence on the PVSE error precision of the STD/XP-QLFEKF algorithm. The simulation results show that the navigation PVSE errors of the STD/XP-QLFEKF are insensitive to changes in α, γ, and ε within the above ranges, and the small changes in navigation accuracy caused by changes in the Q-learning parameters can be ignored in the design of the intelligent navigation algorithm.
In addition, in the design of the STD/XP-QLFEKF reward mechanism, the reward is accumulated over each iteration period, so the choice of the iteration period has a certain influence on the accumulated reward. The PVSE errors corresponding to different Q-learning iteration cycles of the QLFEKF algorithm are therefore simulated and analyzed. The influence of the different iteration cycles of the reinforcement Q-learning on the accuracy of the PVSE errors in the probe cruise phase is shown in Figure 13.
Figure 13. The influence of different iteration cycles of the reinforcement Q-learning on the precision of the PVSE errors in the probe’s cruise phase.
The simulation results show that when the iteration cycle of the Q-learning in STD/XP-QLFEKF algorithm is 7200 s, the position state and velocity state estimation precision for the intelligent integrated navigation is higher. Therefore, the cumulative reward calculation of the Q-learning algorithm is carried out according to the iteration cycle (ItCy = 2.0 h) in this research to ensure the high precision of the state error estimation of the intelligent integrated navigation system for the cruise phase of the SSBE.
5. Conclusions
Intelligent, high-accuracy integrated navigation for the cruise phase of the SSBE is studied in depth in this paper. A single navigation mode cannot meet the high-precision and high-reliability requirements of the navigation system for a probe operating in long-endurance, highly dynamic, and extreme environments. To improve the PVSE accuracy for the cruise phase of the probe, the QLFEKF intelligent integrated navigation algorithm based on X-ray pulsar/solar and target planetary Doppler velocity measurements is proposed. The integrated navigation algorithm can compensate for the accumulated position estimation error of the Doppler velocity measurements. The navigation system adopts a federated filter composed of sub-filters and a main filter to fuse the information of the various navigation modes and ensure the estimation performance of the intelligent integrated navigation system. At the same time, the cumulative reward mechanism of the reinforcement Q-learning algorithm is designed according to the sub-filter characteristics of the federated filter, and the iterative period of the Q-learning algorithm is used as an index to judge the cumulative reward. The process and measurement noise covariance matrices are tuned by traversing the Q-table. The simulation and analysis show that, compared with the STD/XP-EKF and the STD/XP-FEKF, the STD/XP-QLFEKF algorithm has higher navigation accuracy, and its PVSE RMSE errors (3σ) reach 152.101 m and 0.243 m/s, respectively. The PVSE errors are insensitive to the values of the three learning parameters (α, γ, and ε) of the QLFEKF algorithm, and the slight changes in navigation accuracy caused by changes in the Q-learning parameters can be ignored. However, the iterative period used for the cumulative reward of the Q-learning in the QLFEKF algorithm does need to be considered. Because the reward is accumulated over the iterative period, the period has a certain influence on the reward accumulation and thus affects the PVSE accuracy of the navigation system, so it is important to choose an appropriate cumulative reward iteration cycle. Overall, the QLFEKF-based intelligent X-ray pulsar/solar and target planetary Doppler velocity measurement integrated navigation system has high accuracy and can meet the high-accuracy demands of the cruise phase of the SSBE probe.
In future work, the combination of reinforcement learning and traditional navigation methods will be studied further, and deep neural networks will be added to reinforcement learning to provide interpretability for intelligent integrated navigation methods and improve the flexibility of navigation system parameter setting. The fusion of multisource intelligent information from the integrated navigation measurement models will be carried out at the same time to further improve the accuracy and efficiency of fusing the information obtained from the various navigation measurements. This work would provide theoretical and technical support for the interpretability of multistage, intelligent, and highly accurate integrated navigation for the SSBE.
Author Contributions
W.T. put forward the algorithm, carried out the simulation and experiment, and prepared this original manuscript; J.Z., J.S. and J.W. pointed out the research directions, gave theoretical guidance, and revised the manuscript; Q.L., H.W., Z.C. and J.Y. revised and checked the manuscript. The authors would like to thank the reviewers and the editor for their comments and constructive suggestions that helped to improve the paper significantly. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation Project, China (grant number 12150007) and the Guangdong Major Project of Basic and Applied Basic Research, China (grant number 2019B030302001).
Data Availability Statement
The data generated in this study are not publicly available due to their use in an ongoing study by the authors but can be made available from the corresponding author upon reasonable request.
Acknowledgments
The authors would like to thank Jiaqi Min, Sihuan Wu, Maosen Shao, and Sifan Wu of our research team for their valuable help.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
| Abbreviation | Definition |
| --- | --- |
| SSBE | Solar System Boundary Exploration |
| FEKF | Federated Extended Kalman Filter |
| QLFEKF | Federated Extended Kalman Filter Based on Q-learning |
| PVSE | Position and Velocity State Estimation |
| VLBI | Very Long Baseline Interferometry |
| TT&C | Tracking, Telemetry, and Command |
| XNAV | X-ray Pulsar Navigation |
| TOA | Time-of-Arrival |
| SSB | Solar System Barycenter |
| TDOA | Time Difference of Arrival |
| EKF | Extended Kalman Filter |
| RL | Reinforcement Learning |
| LOS | Line of Sight |
| PA | Probe Agent |
| RMSE | Root Mean Squared Error |
| STD/XP-QLFEKF | X-ray Pulsar/Solar and Target Planetary Doppler Velocity Measurement Based on the Q-learning Federation EKF |
| STD/XP-EKF | X-ray Pulsar/Solar and Target Planetary Doppler Velocity Measurement Based on EKF |
| STD/XP-FEKF | X-ray Pulsar/Solar and Target Planetary Doppler Velocity Measurement Based on Federated EKF |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).