Abstract
For the relativistic navigation system where the position and velocity of the spacecraft are determined through the observation of the relativistic perturbations including stellar aberration and starlight gravitational deflection, a novel parallel Q-learning extended Kalman filter (PQEKF) is presented to implement the measurement bias calibration. The relativistic perturbations are extracted from the inter-star angle measurements obtained with a group of high-accuracy star sensors on the spacecraft. Inter-star angle measurement bias caused by the misalignment of the star sensors is one of the main error sources in the relativistic navigation system. To suppress the unfavorable effect of measurement bias on navigation performance, the PQEKF is developed to estimate the position and velocity together with the calibration parameters, where the Q-learning approach is adopted to fine-tune the process noise covariance matrix of the filter automatically. The performance of the presented method is illustrated via numerical simulations in the scenario of medium Earth orbit (MEO) satellite navigation. The simulation results show that, for the considered MEO satellite and the presented PQEKF algorithm, when the inter-star angle measurement accuracy is about 1 mas, the positioning error of the relativistic navigation system after calibration is less than 300 m.
1. Introduction
Spacecraft navigation is an enabling technology for a wide variety of space missions, such as Earth satellites and deep space explorers. Currently, the commonly used approach is radio navigation based on signals sent from beacons, such as ground stations and the global navigation satellite system (GNSS) [1,2]. To reduce the mission cost and improve the autonomous survival capacity, an autonomous navigation system that determines the position and velocity of the spacecraft with onboard instruments in a radio signal-denied environment is required [3,4,5]. Achieving precise navigation information for the spacecraft without the support of man-made beacons is critical for the development of future intelligent unmanned systems [6,7].
In the past few decades, several autonomous navigation techniques with different observation information sources have been studied, such as optical navigation (OPNAV) using the optical imaging of nearby celestial bodies [8,9,10], X-ray pulsar-based navigation (XNAV) [11,12,13] and star navigation based on the Doppler effect of starlight (StarNAV-DE) [14,15,16]. In the on-orbit demonstrations, the positioning error of the OPNAV based on the observations of Earth is on the order of a few kilometers, while the positioning error of XNAV is below 10 km, which is not sufficient to satisfy the high-precision navigation requirement of certain space missions. The StarNAV-DE technique has been demonstrated on the Chinese Hα Solar Explorer (CHASE). It is reported that the accuracy of the solar velocimeter observing the starlight Doppler effect on CHASE is about 2 m/s.
The spacecraft autonomous navigation method based on the relativistic perturbations of starlight is introduced in [17] and developed in [18,19]. Recently, research on relativistic navigation has attracted increasing attention. A practical mathematical model to describe the relativistic perturbations to the space-based starlight observation is derived in [20]. The optical instruments for the observation of the relativistic perturbations are discussed in [21]. The application of relativistic navigation is suggested in [22] for interstellar spacecraft whose velocity is so high that the relativistic perturbations are not negligible. To enhance navigation accuracy and rapidity, information fusion schemes combining relativistic navigation and the OPNAV are designed in [23,24]. The extended Kalman filter (EKF) and the unscented Kalman filter (UKF) are designed and evaluated for the implementation of relativistic navigation in [25,26].
Among the previously mentioned autonomous navigation techniques, the relativistic navigation based on the relativistic perturbations to the inter-star angle measurement has the potential to achieve higher performance with current technology. Generally, the relativistic navigation performance depends on the measurement accuracy of the inter-star angle and the precision of the star catalog. As the inter-star angle can be measured with an accuracy of a few mas with state-of-the-art instruments, and the error of the modern star catalog is less than 0.1 mas, relativistic navigation is considered a promising method to achieve high performance. In comparison with the OPNAV, an advantage of relativistic navigation is that the high-accuracy observation of starlight is generally easier than that of a nearby celestial body. Compared with the XNAV, relativistic navigation is competitive, as the number of visible stars is far greater than the number of X-ray pulsars suitable for navigation. In addition, as the stability of the inter-star angle calculated with the star catalog is rather high, the main difficulty of the StarNAV-DE technique, namely the poor stability of the stellar spectra, is avoided.
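As a rough order-of-magnitude check on why milliarcsecond-level angle measurements are relevant, the sketch below converts an angular error into an equivalent velocity error using the first-order stellar aberration relation, in which the apparent shift is about v/c radians for a velocity perpendicular to the line of sight. The first-order approximation, the function names and the illustrative speed of 4 km/s (roughly the orbital speed scale of an MEO satellite) are assumptions made here; they are not taken from the measurement model of this paper.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s
MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)  # one milliarcsecond in radians


def aberration_angle_mas(speed_mps: float) -> float:
    """First-order aberration shift (v/c) in mas, velocity perpendicular to the star direction."""
    return (speed_mps / C_LIGHT) / MAS_TO_RAD


def velocity_equivalent_mps(angle_error_mas: float) -> float:
    """Velocity change that would produce the given first-order aberration shift."""
    return angle_error_mas * MAS_TO_RAD * C_LIGHT


if __name__ == "__main__":
    # Illustrative inputs only (assumptions, not values from the paper's model).
    print(f"1 mas corresponds to about {velocity_equivalent_mps(1.0):.2f} m/s")
    print(f"4 km/s produces about {aberration_angle_mas(4000.0):.0f} mas of aberration")
```

Under this simplified relation, an angular error of 1 mas maps to a velocity error on the order of 1 m/s, which gives a feel for the sensitivity of the aberration observation; the achievable navigation accuracy also depends on the orbit dynamics and the filter design.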
In the relativistic navigation system, at least two star sensors separated from each other by a large angle are required to measure the inter-star angles, which reveal the variations in the relativistic perturbations, including stellar aberration and starlight gravitational deflection. The inter-star angle measurement bias caused by the misalignment of the star sensors strongly affects the performance of the relativistic navigation system. The main motivation of this study is to calibrate the measurement bias accurately via a carefully designed navigation filter. A common approach is to model the measurement bias as calibration parameters, which can be estimated together with the position and velocity of the spacecraft through the EKF.
It is well known that the state estimation accuracy of the EKF depends on the tuning of the process and measurement noise covariance matrices [27,28]. As the measurement noise covariance matrix can be determined from the specification of the star sensors, the remaining problem is to determine the process noise covariance matrix, especially the elements related to the calibration parameters. Generally, it is difficult to obtain the optimal noise covariance matrices in the absence of exact statistical knowledge about the process noise. Several attempts have been made in the literature to develop adaptive filters [29,30,31]. For the study of autonomous navigation, the most widely used method is the adaptive extended Kalman filter (AEKF), where the noise covariance matrices are estimated together with the state vector. However, it is often difficult to guarantee the estimation accuracy of the noise covariance matrix in the presence of the state estimation error. To cope with this problem, a potential method is to combine the Q-learning approach with the EKF for tuning the process noise covariance matrix automatically [32,33,34,35,36]. The key idea of the parallel Q-learning extended Kalman filter (PQEKF) is the integration of the EKF and the Q-learning approach: the process noise covariance matrix of the EKF is selected by the Q-learning approach, whose reward is constructed from the innovation of the EKF, such that an appropriate covariance matrix is determined to improve the filtering accuracy.
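The idea of selecting a process noise covariance setting by Q-learning with an innovation-based reward can be sketched as follows. This is a minimal tabular illustration, not the PQEKF algorithm of Section 3: the candidate scale factors, the reward (negative squared innovation), the learning parameters, the stand-in for the EKF step and all function names are assumptions made here.

```python
import math
import random


def select_action(q_table, state, epsilon, rng):
    """Epsilon-greedy choice of the next candidate index."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_table[state]))
    row = q_table[state]
    return max(range(len(row)), key=row.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])


def innovation_reward(innovation):
    """Assumed reward: smaller innovations give larger (less negative) rewards."""
    return -sum(component * component for component in innovation)


if __name__ == "__main__":
    rng = random.Random(0)
    scale_factors = [0.1, 1.0, 10.0]  # hypothetical candidates for scaling the covariance
    q_table = [[0.0] * len(scale_factors) for _ in scale_factors]
    state = 1
    for _ in range(200):
        action = select_action(q_table, state, epsilon=0.2, rng=rng)
        # Stand-in for one EKF step with the chosen scale factor: the innovation
        # is artificially made smallest for the middle candidate.
        innovation = [abs(math.log10(scale_factors[action])) + 0.1]
        next_state = action
        q_update(q_table, state, action, innovation_reward(innovation), next_state)
        state = next_state
    greedy = [max(range(len(row)), key=row.__getitem__) for row in q_table]
    print("Greedy candidate index per state:", greedy)
```

In a real filter, the innovation would come from the EKF measurement update rather than from the artificial stand-in used in the loop above.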
This paper studies a measurement bias calibration method for a relativistic navigation system. The main contributions of this study are as follows: (1) The PQEKF is presented to adjust the process noise covariance matrix related to the calibration parameters and that related to the position and velocity vectors, respectively. The PQEKF differs from its original version presented in [24] in that two learning agents are designed to work in parallel, which improves the flexibility of the algorithm. (2) It is illustrated that the PQEKF is effective for calibrating the inter-star angle measurement bias of the star sensors. The simulation shows that, after calibration, the relativistic navigation error for the MEO satellite is on the order of 300 m when the standard deviation of the measurement noise is about 1 mas. (3) The principle of the presented method can be further applied to other state estimation problems that require autonomous parameter tuning.
The remaining part of the paper is organized as follows: Section 2 formulates the mathematical model of the relativistic navigation system. Section 3 presents the PQEKF algorithm for the relativistic navigation system to estimate the calibration parameters. Section 4 evaluates the performance of the navigation filter via simulations. Finally, Section 5 concludes the paper.
4. Simulations
4.1. Simulation Conditions
In this section, comparisons are performed to demonstrate the efficiency of the calibration method for the relativistic navigation system using the PQEKF. The reference trajectory of the spacecraft is generated through a high precision orbit propagator, where non-spherical Earth gravity perturbation, lunar–solar gravitational perturbation and solar radiation pressure perturbation are considered. Assume that the spacecraft is an MEO satellite in a near-circular orbit with a semi-major axis of 21,528 km and inclination of 55°. The measurement data are generated according to the reference trajectory and the measurement model shown in Section 2. The navigation filters designed based on the EKF and the PQEKF presented in Section 3 are implemented individually to process the measurement data. The position and velocity estimation errors are obtained via comparison between the state estimation and the reference trajectory.
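For orientation, the orbital period and circular speed implied by the stated semi-major axis can be estimated with the two-body Kepler relation. The sketch below ignores the perturbations included in the high-precision propagator described above, and the value of Earth's gravitational parameter and the function names are assumptions introduced here.

```python
import math

MU_EARTH_KM3_S2 = 398_600.4418  # Earth's gravitational parameter, km^3/s^2 (assumed value)


def orbital_period_s(semi_major_axis_km: float) -> float:
    """Two-body orbital period from Kepler's third law."""
    return 2.0 * math.pi * math.sqrt(semi_major_axis_km ** 3 / MU_EARTH_KM3_S2)


def circular_speed_km_s(radius_km: float) -> float:
    """Speed on a circular two-body orbit of the given radius."""
    return math.sqrt(MU_EARTH_KM3_S2 / radius_km)


if __name__ == "__main__":
    a_km = 21_528.0  # semi-major axis stated in the simulation conditions
    print(f"Period: {orbital_period_s(a_km) / 3600.0:.2f} h")
    print(f"Circular speed: {circular_speed_km_s(a_km):.2f} km/s")
```

Under these assumptions the period is somewhat under nine hours and the speed is a little above 4 km/s.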
For a fair comparison, the EKF and the PQEKF share the same measurement noise covariance matrix and the same initial estimation error covariance matrix. The parameter settings for the simulation are listed in Table 1.
Table 1.
Simulation parameter settings.
For the PQEKF, when discretizing the state space, since the range of the state space is unknown, the upper and lower limits of the process noise covariance matrix are obtained through experiments. The performance of the presented methods is evaluated via the position and velocity estimation errors, which are critical for the orbital control of the spacecraft.
4.2. Simulation Results
First, the navigation performance of the presented method is compared with that of the traditional EKF without measurement bias calibration [24]. The three-axis position and velocity estimation errors of the spacecraft obtained from the EKF without measurement bias calibration are shown as solid lines in Figure 4 and Figure 5, where the dashed lines represent the theoretical error bounds calculated from the estimation error covariance matrix of the navigation filter.
Figure 4.
Position estimation error of traditional EKF without measurement bias calibration.
Figure 5.
Velocity estimation error of traditional EKF without measurement bias calibration.
It is seen from Figure 4 and Figure 5 that the error curves frequently exceed the theoretical error bounds due to the unfavorable effect of the measurement bias. In contrast, the position and velocity estimation error curves of the calibration method based on the PQEKF are shown in Figure 6 and Figure 7.
Figure 6.
Position estimation error of calibration method based on PQEKF.
Figure 7.
Velocity estimation error of calibration method based on PQEKF.
From Figure 6 and Figure 7, it can be seen that all of the error curves are contained in the corresponding error bounds, which indicates the effectiveness of the navigation filter design.
Second, to facilitate the performance comparison of the algorithms in different simulation conditions, the average root-mean-square (RMS) position and velocity errors of the EKF without bias calibration, the EKF with bias calibration and the presented method are plotted versus different settings of the measurement bias in Figure 8 and Figure 9.
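The visual check described above, whether an error curve stays inside bounds derived from the filter covariance, can also be expressed numerically. The sketch below is one way to do so; the 3-sigma multiplier is an assumption (the multiple used for the dashed bounds in the figures is not stated in this section), and the function name and sample values are hypothetical.

```python
import math


def fraction_within_bounds(errors, variances, n_sigma=3.0):
    """Fraction of samples whose |error| is at most n_sigma * sqrt(variance).

    errors:    estimation errors for one axis over time
    variances: matching diagonal entries of the estimation error covariance
    """
    if len(errors) != len(variances) or not errors:
        raise ValueError("errors and variances must be non-empty and of equal length")
    inside = sum(
        1 for err, var in zip(errors, variances) if abs(err) <= n_sigma * math.sqrt(var)
    )
    return inside / len(errors)


if __name__ == "__main__":
    # Made-up samples for illustration only, not simulation data from the paper.
    sample_errors = [0.0, 2.0, -4.0, 1.0]
    sample_variances = [1.0, 1.0, 1.0, 1.0]
    print(fraction_within_bounds(sample_errors, sample_variances))
```

A fraction close to one suggests that the filter covariance is consistent with the actual errors, while frequent excursions indicate an unmodeled effect such as an uncalibrated bias.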
Figure 8.
Position RMS errors of different methods vs. measurement bias.
Figure 9.
Velocity RMS errors of different methods vs. measurement bias.
It is seen from Figure 8 and Figure 9 that the estimation error of the EKF without bias calibration grows as the measurement bias increases, while the effect of the measurement bias on the navigation performance is suppressed efficiently when the EKF with bias calibration is adopted. In addition, the calibration method based on the PQEKF achieves superior performance due to its ability to select a suitable process noise covariance matrix. This indicates that the presented method is not sensitive to the inter-star angle measurement bias.
In addition, the effect of the measurement noise on the PQEKF algorithm is examined through simulations. When the standard deviation of the measurement noise is varied over the range [0.6, 1.6] mas, the RMS position errors of the EKF and the PQEKF are illustrated in Figure 10. The simulation result shows that, in comparison with the EKF, the PQEKF is more effective for suppressing the unfavorable effect of the measurement noise.
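For reference, one common way to compute an RMS position error from an estimated and a reference trajectory is sketched below. The exact averaging used for the "average RMS error" in the figures (over time, over Monte Carlo runs, or both) is not specified in this section, so the definition and the function name here are assumptions.

```python
import math


def position_rms_error(estimated, reference):
    """RMS of the 3D position error norm over a trajectory.

    estimated, reference: sequences of (x, y, z) tuples of equal length.
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_norms = [
        sum((e - r) ** 2 for e, r in zip(est, ref))
        for est, ref in zip(estimated, reference)
    ]
    return math.sqrt(sum(squared_norms) / len(squared_norms))


if __name__ == "__main__":
    # Made-up points for illustration only.
    est = [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)]
    ref = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(position_rms_error(est, ref))
```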
Figure 10.
RMS errors as functions of measurement noise standard deviation.
It is seen from Algorithm 2 that the PQEKF contains multiple EKFs. In the simulation, the execution time of the PQEKF is several times that of the EKF. Nevertheless, one iteration of the PQEKF algorithm can easily be completed within an observation period of the star sensors. For the considered system, to reduce the computational cost of the PQEKF, the state space or the action space of the Q-learning approach could be further optimized. For a complicated practical system with a large state space or action space, artificial neural network approximation or dedicated hardware can be introduced for the implementation of the algorithm. In addition, it is expected that a dynamic state space whose bounds are stretched automatically can be designed in future work.
Next, as the PQEKF is an improved version of the QLEKF, it is compared with the QLEKF for the relativistic navigation system through Monte Carlo trials. The position RMS error curves of the calibration methods based on the EKF, the AEKF, the QLEKF and the PQEKF are plotted in Figure 11. The statistical values of the navigation accuracy for the different methods are summarized in Table 2.
Figure 11.
Position RMS error curves of different navigation filters.
Table 2.
Comparison of calibration methods based on EKF, AEKF, QLEKF and PQEKF.
We can see from Figure 11 and Table 2 that the navigation performance of the PQEKF is slightly better than that of the QLEKF, as two learning agents are implemented in parallel to select the appropriate sub-matrices of the process noise covariance matrix related to the kinematic state and to the calibration parameters, respectively, while the QLEKF is designed to search for the whole process noise covariance matrix. The design of the PQEKF is considered more flexible than that of the QLEKF, as different scale factors can be adopted to tune the different sub-matrices in the process noise covariance matrix.
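The flexibility argument, using separate scale factors for different sub-matrices, can be illustrated with a small sketch. It assumes that the process noise covariance matrix is block diagonal with one block for the kinematic state and one for the calibration parameters; this structure, the nominal blocks and the function name are assumptions made here and may differ from the formulation in Section 3.

```python
def build_process_noise(q_kinematic, q_calibration, scale_kinematic, scale_calibration):
    """Assemble a block-diagonal covariance from two independently scaled blocks.

    q_kinematic, q_calibration: nominal square blocks given as nested lists.
    scale_kinematic, scale_calibration: factors chosen separately, e.g. by two agents.
    """
    n_kin, n_cal = len(q_kinematic), len(q_calibration)
    size = n_kin + n_cal
    q_full = [[0.0] * size for _ in range(size)]
    for i in range(n_kin):
        for j in range(n_kin):
            q_full[i][j] = scale_kinematic * q_kinematic[i][j]
    for i in range(n_cal):
        for j in range(n_cal):
            q_full[n_kin + i][n_kin + j] = scale_calibration * q_calibration[i][j]
    return q_full


if __name__ == "__main__":
    # Hypothetical 2x2 kinematic block and 1x1 calibration block.
    nominal_kin = [[1.0, 0.0], [0.0, 1.0]]
    nominal_cal = [[2.0]]
    for row in build_process_noise(nominal_kin, nominal_cal, 10.0, 0.5):
        print(row)
```

A single-agent search over the whole matrix would correspond to forcing both scale factors to be equal, which is the restriction that the parallel design removes.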
Finally, the influence of the state space discretization on the positioning accuracy of the PQEKF algorithm for the relativistic navigation system is analyzed. When the number of states is set as five, seven and nine, respectively, the position RMS error curves of the PQEKF algorithms are shown in Figure 12.
Figure 12.
Position RMS error curves of PQEKF algorithms for different state numbers.
From Figure 12, the variation in the position estimation error under the different settings of the state number in the state space discretization is rather small for most of the simulation. This indicates that varying the state number within a certain range has no evident influence on the estimation accuracy of the PQEKF. In the considered scenario, two pre-determined sets, each with a small number of design values for the corresponding sub-matrix of the process noise covariance matrix, are beneficial for improving the performance of the calibration method.
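A pre-determined set of design values between a lower and an upper limit can be generated, for example, by logarithmic spacing. The sketch below shows this for five, seven and nine values; the logarithmic spacing, the limits and the function name are assumptions made here, since the limits in the paper were obtained through experiments and are not listed in this section.

```python
import math


def log_spaced_candidates(lower, upper, count):
    """Return `count` values spaced evenly in log10 between lower and upper (inclusive)."""
    if lower <= 0.0 or upper <= lower or count < 2:
        raise ValueError("need 0 < lower < upper and count >= 2")
    log_lower = math.log10(lower)
    step = (math.log10(upper) - log_lower) / (count - 1)
    return [10.0 ** (log_lower + i * step) for i in range(count)]


if __name__ == "__main__":
    # Hypothetical limits for a scale factor applied to a nominal covariance block.
    for count in (5, 7, 9):
        print(count, log_spaced_candidates(1e-2, 1e2, count))
```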
According to the above simulation analysis, it is confirmed that the presented method is well-suited for the relativistic navigation system with the requirement to calibrate the inter-star angle measurement bias. For the simulation conditions described in Section 4.1, the achievable spacecraft navigation accuracy is on the order of a few hundred meters, which is sufficient for most orbital control missions.
5. Conclusions
This paper presents an inter-star angle measurement bias calibration method for the spacecraft relativistic navigation system. The proper design of the process noise covariance matrix is critical for accurate calibration. To improve the calibration accuracy and the navigation performance, the Q-learning approach is combined with the EKF for online adaptive tuning of the process noise covariance matrix based on the measurement data obtained from onboard star sensors. The PQEKF algorithm is developed as the navigation filter, where two learning agents are implemented in parallel to select the appropriate sub-matrices related to the kinematic state and the calibration parameters, respectively. The simulation results show that the navigation performance of the presented method is superior to that of the EKF, the AEKF and the QLEKF in the presence of measurement bias, demonstrating the efficiency of the calibration method and the PQEKF algorithm. This study introduces a hybrid framework that integrates the reinforcement learning approach into the navigation filter, which can serve as a foundation for improving the state estimation accuracy in potential applications of relativistic navigation for Earth satellites or deep space explorers.
Author Contributions
Conceptualization, K.X. and Q.Z.; methodology, K.X.; software, Q.Z.; validation, K.X., Q.Z. and L.Y.; formal analysis, K.X.; writing—original draft preparation, K.X.; writing—review and editing, Q.Z.; supervision, L.Y.; project administration, K.X.; funding acquisition, L.Y. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China, grant number 62394354.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are contained within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A
Proof of Theorem 1.
First, mathematical induction is adopted to prove the following inequality
For , from (10), we have
Considering the condition shown in (12), we obtain
It is seen from (A3) that the inequality (A1) holds for . Assume that the inequality (A1) holds for , i.e.,
It can be derived from (10) and (11) that
From (A5), the inequality (A1) holds for . Thus, for , the inequality (A1) holds. The mathematical induction is complete.
Second, considering that is a bounded non-increasing sequence, let
Taking the limit of both sides of equation (10) yields
The formulation of (A8) is essentially the same as that of (15). This completes the proof of Theorem 1. □
Appendix B
Proof of Lemma 1.
Mathematical induction is adopted to prove Lemma 1. For , let
and
Considering the condition shown in (19), we have
It is derived from (10) and (A11) that
It is seen from (A12) that the inequality (20) holds for . Assume that the inequality (20) holds for , i.e.,
Let
and
From (A13), we obtain
It is derived from (10) and (A16) that
From (A17), the inequality (20) holds for . Thus, for , the inequality (20) holds. This completes the proof of Lemma 1. □
Appendix C
Proof of Lemma 2.
Mathematical induction is adopted to prove Lemma 2. For , it follows from (21) and (22) that
From (17), we have
Inserting (A19) into (A18), we obtain
It is seen from (A20) that the equality (23) holds for .
Assume that the inequality (23) holds for , i.e.,
It follows from (22) and (A21) that
Considering that
The Equation (A22) becomes
From (A24), the inequality (23) holds for . Thus, for , the inequality (23) holds. This completes the proof of Lemma 2. □
Appendix D
Proof of Theorem 2.
It is easy to see from (25) that
or
According to Lemma 1, we get the following inequality
According to Lemma 2, the right side of (A27) is written as
Substituting (A28) into (A27) yields
Furthermore, it is derived according to Lemmas 1 and 2 that
and
Substituting (A31) into (A30), we have
With a similar process, it is easy to verify that
In the case , the inequality (A33) is expressed as
According to Theorem 1, we obtain
Similarly, applying Lemmas 1 and 2, we get the following inequality from (26)
Combining the inequalities (A35) and (A36), we conclude that the inequality (24) holds. This completes the proof of Theorem 2. □
Appendix E
Proof of Theorem 3.
According to Theorem 2, we have
From (27), we obtain
and
Substituting (A38) and (A39) into (A37) yields
From (31), the inequality (A40) is rewritten as
or
It is derived with a similar process that
Combining the inequalities (A42) and (A43), we conclude that the inequality (28) holds. This completes the proof of Theorem 3. □
References
- Huang, J.; Yang, R.; Zhan, X. Constraint Navigation Filter for Space Vehicle Autonomous Positioning with Deficient GNSS Measurements. Aerosp. Sci. Technol. 2022, 120, 107291.
- Ely, T.A.; Seubert, J.; Bradley, N.; Drain, T.; Bhaskaran, S. Radiometric Autonomous Navigation Fused with Optical for Deep Space Exploration. J. Astronaut. Sci. 2021, 68, 300–325.
- Gallo, E.; Barrientos, A. Reduction of GNSS-Denied Inertial Navigation Errors for Fixed Wing Autonomous Unmanned Air Vehicles. Aerosp. Sci. Technol. 2022, 120, 107237.
- Hu, J.; Liu, J.; Wang, Y.; Ning, X. INS/CNS/DNS/XNAV Deep Integrated Navigation in a Highly Dynamic Environment. Aircr. Eng. Aerosp. Technol. 2023, 95, 180–189.
- Yang, Y.; Han, X.; Song, N.; Wang, Z. A New Method to Improve the Measurement Accuracy of Autonomous Astronomical Navigation. J. Math. 2022, 2022, 3649662.
- Wang, Y.; Yan, T.; Wang, L. Development Situation and Trend of Space Intelligent Navigation Technology. Aerosp. Control Appl. 2022, 48, 9–17.
- Zhou, B.; Li, Y.; Zhang, A.; Cui, S. Observability Analysis of Satellite Autonomous Orbit Determination with Modeling and Measurement Errors. Chin. Space Sci. Technol. 2023, 43, 25–34.
- Christian, J.A. Optical Navigation Using Planet’s Centroid and Apparent Diameter in Image. J. Guid. Control Dyn. 2015, 38, 192–204.
- Hou, B.; Wang, J.; Zhou, H.; He, Z.; Li, D.; Liu, X. Guidepost-based Autonomous Orbit Determination Method for GEO Satellite. Adv. Space Res. 2021, 67, 1090–1113.
- Turan, E.; Speretta, S.; Gill, E. Autonomous navigation for deep space small satellites: Scientific and technological advances. Acta Astronaut. 2022, 193, 56–74.
- Sheikh, S.I.; Pines, D.J. Spacecraft Navigation Using X-Ray Pulsars. J. Guid. Control Dyn. 2006, 29, 49–63.
- Wang, Y.; Zheng, W.; Ge, M.; Zheng, S.; Zhang, S. Use of Statistical Linearization for Nonlinear Least-Squares Problems in Pulsar Navigation. J. Guid. Control Dyn. 2023, 46, 1850–1855.
- Zoccarato, P.; Larese, S.; Naletto, G.; Zampieri, L.; Brotto, F. Deep Space Navigation by Optical Pulsars. J. Guid. Control Dyn. 2023, 46, 1501–1511.
- Zhang, W. A Study of the Navigation Technology and Application Based on Astronomical Spectral Velocity Measurement. Navig. Control 2020, 19, 64–73.
- Liu, J.; Wang, T.; Ning, X.; Kang, Z. Modelling and analysis of celestial Doppler difference velocimetry navigation considering solar characteristics. IET Radar Sonar Navig. 2020, 14, 1897–1904.
- Gui, M.; Yang, H.; Ning, X.; Ye, W.; Wei, C. A Novel Sun Direction/Solar Disk Velocity Difference Integrated Navigation Method Against Installation Error of Spectrometer Array. IEEE Sens. J. 2023, 23, 17480–17490.
- Christian, J.A. StarNAV: Autonomous Optical Navigation of a Spacecraft by the Relativistic Perturbation of Starlight. Sensors 2019, 19, 4064.
- Bailer-Jones, C.A.L. Lost in Space? Relativistic Interstellar Navigation using an Astrometric Star Catalog. Publ. Astron. Soc. Pac. 2021, 133, 074502.
- McKee, P.; Kowalski, J.; Christian, J. Navigation and star identification for an interstellar mission. Acta Astronaut. 2022, 192, 390–401.
- Klioner, S. A Practical Relativistic Model for Microarcsecond Astrometry in Space. Astron. J. 2003, 125, 1580–1597.
- McKee, P.; Nguyen, H.; Kudenov, M.W.; Christian, J.A. StarNAV with a wide field-of-view optical sensor. Acta Astronaut. 2022, 197, 220–234.
- Yucalan, D.; Peck, M. Autonomous Navigation of Relativistic Spacecraft in Interstellar Space. J. Guid. Control Dyn. 2021, 44, 1106–1115.
- Xiong, K.; Wei, C. Integrated Celestial Navigation for Spacecraft Using Interferometer and Earth Sensor. Proc. Inst. Mech. Eng. Part G: J. Aerosp. Eng. 2020, 234, 2248–2262.
- Xiong, K.; Wei, C.; Zhou, P. Integrated Autonomous Optical Navigation Using Q-Learning Extended Kalman Filter. Aircr. Eng. Aerosp. Technol. 2022, 94, 848–861.
- Gui, M.; Wei, Y.; Ning, X. Celestial angle measurement navigation for Mars probe considering relativistic effect. J. Deep Space Explor. 2023, 10, 126–132.
- Liu, F.; Li, M.; Peng, Y.; Sun, J.; Liu, J. An autonomous navigation method for spacecraft in cislunar space using stellar aberration observation. J. Deep Space Explor. 2023, 10, 159–168.
- Ullah, I.; Fayaz, M.; Naveed, N.; Kim, D. ANN Based Learning to Kalman Filter Algorithm for Indoor Environment Prediction in Smart Greenhouse. IEEE Access 2020, 8, 159371–159388.
- Or, B.; Klein, I. A Hybrid Model and Learning-Based Adaptive Navigation Filter. IEEE Trans. Instrum. Meas. 2022, 71, 1–11.
- Ning, X.; Li, Z.; Wu, W.; Yang, Y.; Fang, J.; Liu, G. Recursive Adaptive Filter Using Current Innovation for Celestial Navigation During the Mars Approach Phase. Sci. China-Inf. Sci. 2017, 60, 032205.
- Li, W.; Sun, S.; Jia, Y.; Du, J. Robust unscented Kalman filter with adaptation of process and measurement noise covariances. Digit. Signal Process. 2016, 48, 93–103.
- Jia, W.; Tian, Y.; Duan, H.; Luo, R.; Lian, J.; Ruan, C.; Zhao, D.; Li, C. Autonomous Navigation Control Based on Improved Adaptive Filtering for Agricultural Robot. Int. J. Adv. Robot. Syst. 2020, 17, 1729881420925357.
- Xiong, K.; Zhou, P.; Wei, C. Autonomous Navigation of Unmanned Aircraft Using Space Target LOS Measurements and QLEKF. Sensors 2022, 22, 6992.
- Tao, W.; Zhang, J.; Hu, H.; Zhang, J.; Sun, H.; Zeng, Z.; Song, J.; Wang, J. Intelligent Navigation for the Cruise Phase of Solar System Boundary Exploration Based on Q-learning EKF. Complex Intell. Syst. 2024, 2, 2653–2672.
- Xiong, K.; Wei, C.; Zhang, H. Q-learning for noise covariance adaptation in extended Kalman filter. Asian J. Control 2021, 23, 1803–1816.
- Chen, C.; Wu, X.; Bo, Y.; Chen, Y.; Liu, Y.; Alsaadi, F.E. SARSA in extended Kalman Filter for complex urban environments positioning. Int. J. Syst. Sci. 2021, 52, 3044–3059.
- Yin, Y.; Li, S.E.; Tang, K.; Cao, W.; Wu, W.; Li, H. Approximate optimal filter design for vehicle system through Actor-Critic reinforcement learning. Automot. Innov. 2022, 5, 415–426.
- Crassidis, J.L.; Markley, F.L.; Cheng, Y. Survey of Nonlinear Attitude Estimation Methods. J. Guid. Control Dyn. 2007, 30, 12–28.
- Hu, Z.; Gong, W. Constrained Evolutionary Optimization Based on Reinforcement Learning Using the Objective Function and Constraints. Knowl.-Based Syst. 2022, 237, 107731.
- Jang, B.; Kim, M.; Harerimana, G.; Kim, J.W. Q-learning Algorithms: A Comprehensive Classification and Applications. IEEE Access 2019, 7, 133653–133667.
- Li, Y.; Yang, C.; Hou, Z.; Feng, Y.; Yin, C. Data-driven approximate Q-learning stabilization with optimality error bound analysis. Automatica 2019, 103, 435–442.
- Shi, H.; Li, X.; Hwang, K.; Pan, W.; Xu, G. Decoupled Visual Servoing with Fuzzy Q-learning. IEEE Trans. Ind. Inform. 2018, 14, 241–252.
- Wu, G. UAV-Based Interference Source Localization: A Multi-model Q-learning Approach. IEEE Access 2019, 7, 137982–137991.
- Maia, R.; Mendes, J.; Araujo, R.; Silva, M.; Nunes, U. Regenerative Braking System Modeling by Fuzzy Q-Learning. Eng. Appl. Artif. Intell. 2020, 93, 103712.
- Wei, Q.; Lewis, F.L.; Sun, Q.; Yan, P.; Song, R. Discrete-time Deterministic Q-learning: A Novel Convergence Analysis. IEEE Trans. Cybern. 2017, 47, 1224–1237.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).