1. Introduction
The recent proliferation of private and commercial ventures providing affordable access to low-Earth orbit and new crewed space vehicles [1], along with a renewed and increasing interest in deep space missions, not only from space agencies but also from the space industry and other actors, as highlighted in the Global Exploration Roadmap elaborated by the International Space Exploration Coordination Group [2,3], shows the necessity of advancing new technologies to sustain this momentum and solve the challenges of the upcoming space missions currently being proposed. Visual inspection and extravehicular repair activities, spacecraft refueling and lifetime extension, active space debris removal, and resupply and other general on-orbit servicing missions are but a few of the most immediate applications requiring an increasing degree of autonomy and automation, with even more relevance in deep space missions. Along this line, the upcoming establishment of new crewed outposts in cislunar space within the next years, such as the Lunar Gateway, will require a continued service of resupply and regular crew transportation missions. These would benefit from autonomous capabilities for automated rendezvous, docking and other proximity operations in cislunar space, which have never been attempted to date and thus remain a major challenge for space exploration programs and missions beyond Earth orbit. Therefore, autonomous guidance, navigation and control capabilities have been defined as key enabling technologies to be developed in the coming years to support the expansion of human space activity beyond Earth orbit.
Libration Point Orbits (LPO) have been identified as ideal locations for the Lunar Gateway and other lunar and deep space exploration related activities. LPOs include periodic and quasi-periodic orbits, which possess highly interesting features due to their ability to maintain continuous communication with the Earth and their lesser station-keeping requirements. However, their inherent orbital instability yields relatively short time scales for divergence if the spacecraft abandons the nominal LPO or is subjected to perturbing accelerations, such as third-body perturbations or solar radiation pressure. Therefore, a lot of effort has been devoted over the last three decades to the problem of trajectory control and station-keeping of LPOs, initiated by the pioneering work of Farquhar [4,5]. Dynamical systems theory has played a fundamental role in the development of such control strategies. In this sense, early attempts exploited the periodic nature of these orbits, allowing for an approach based on Floquet’s theory, which led to the ‘Floquet Mode’ station-keeping strategy for LPOs, first proposed by Wiesel and Shelton [6] and Simo et al. [7], and later applied to the station-keeping of translunar LPOs [8]. Howell and Pernicka [9,10] proposed a new LPO station-keeping scheme by adapting Dwivedi’s approach [11] to the LPO problem, leading to a strategy known as ‘Target Point’. Bai and Junkins [12] proposed a gradient-free computational approach based on a Modified Chebyshev-Picard Iteration method, which provided a simple and lightweight control structure. Hou et al. [13] proposed impulsive control strategies similar to the Floquet approach, which extended its applicability to the real Earth–Moon system by relying on quasi-periodic orbits referred to as dynamical substitutes. Folta et al. [14] extended the concept of the Target Point strategy to include optimality considerations by combining it with a global search method and an orbit continuation method, resulting in a discrete control strategy which was successfully used for operational station-keeping in the ARTEMIS mission [15]. Recently, Jin and Xu [16] proposed a modified strategy for selecting target points for LPO station-keeping to reduce manoeuvre costs in the real Earth-Moon system. Extensions to more complicated multi-body contexts have also been accomplished; Carletta et al. exploited the Hamiltonian formalism to develop a linear feedback compact station-keeping law in the Sun-Mars elliptic restricted four-body problem LPOs [17].
The aforementioned works proposed effective control strategies that relied heavily on dynamical systems theory and exploited the properties of intrinsic structures of the Circular Restricted Three-Body Problem (CR3BP), resulting in impulsive strategies; in contrast, they overlooked the plethora of techniques readily available in both classical and modern control theory, which can be effectively borrowed and adapted to the problem at hand. Along this line, Breakwell et al. [18] were the first to approach the LPO trajectory control problem from a classical control viewpoint by proposing a linear quadratic regulator for station-keeping of a translunar halo orbit. Control schemes of increasing complexity followed. Jones and Bishop [19] developed an output feedback guidance law using
control theory. Scheeres and Vinh [20] developed a feedback control law based on the local eigenstructure of the LPO, which allowed for oscillatory motions in the center manifold. Luquette and Sanner [21] proposed an adaptive, nonlinear control for orbit maintenance in the vicinity of LPOs. Gurfil and Kasdin proposed a time-varying, continuous, linear quadratic control law, along with an internal disturbance model that yielded a robust disturbance rejection performance [22], and also investigated the early use of neural networks for tracking control and disturbance rejection in this context [23]. Marchand and Howell also developed continuous control strategies of increasing complexity based on linear and non-linear quadratic regulators and input/output feedback linearization [24], as well as through numerical solutions to the optimal control problem [25]. Infeld et al. [26] used Legendre pseudo-spectral methods to numerically solve the fuel-optimal, constrained, non-linear control problem. Kulkarni et al. [27] successfully adapted an
approach to station-keeping control of LPOs. Nazari et al. [28] proposed three control strategies combining continuous LQR control and Floquet theory using periodic control gains; these relied, respectively, on a time-periodic infinite horizon LQR, a backstepping technique with time-invariant LQR, and a dead-band periodic-gain controller. Lian et al. [29] investigated the use of discrete-time sliding mode control and a discrete-time linear quadratic regulator for station-keeping of real Earth–Moon LPOs (i.e., with a complete Solar System model under a real ephemerides model), resulting in a discrete control suitable for impulsive manoeuvres. Ulybyshev [30] approached the station-keeping problem as an optimization problem, considering pseudoimpulses for discretized orbital segments, thus transforming the station-keeping problem into a large-scale linear programming form, resulting in a long-term station-keeping strategy for quasi-periodic LPOs in the full ephemerides model. Using a simple linear extended state observer, Narula and Biggs [31] extended an LQR control scheme to enable continued tracking in the event of thruster failure and the presence of disturbances; they also demonstrated that, in combination with a sliding mode or an adaptive control, asymptotic tracking could be achieved. Peng et al. [32] demonstrated the robust maintenance of multi-revolution halo orbits in an elliptic restricted three-body problem using a receding horizon control strategy solved by an indirect Radau pseudo-spectral method. Qi and de Ruiter [33] extended the use of backstepping controllers for station-keeping of LPOs under practical navigational and executional constraints, a real ephemerides model and solar radiation pressure.
Héritier and Howell [34] looked into harnessing the natural, multi-body dynamics to minimize the drift of the unstable relative dynamics. Along this line, Xu, Liang and Fu [35] proposed a Hamiltonian structure-preserving control for LPOs, which they successfully extended to the bi-circular four-body problem (i.e., time-periodic dynamics), and later to time-dependent dynamics, such as the relative orbital motion along low-energy transfer trajectories in the CR3BP [36], a path also investigated by Cheng et al. [37]. Jung and Kim [38] also proposed a switching Hamiltonian structure-preserving control. Other currently ongoing research includes Elliott and Bosanac [39,40], who are looking into LPO station-keeping controllers based on an alternative set of geometric coordinates; Bonasera, Bosanac et al. [41,42], who are looking into machine learning approaches to the LPO station-keeping problem, in particular reinforcement learning; and Gao et al. [43], who are investigating high-order dynamical systems approaches for low-thrust station-keeping of LPOs. A thorough and extensive survey on LPO station-keeping strategies was presented by Shirobokov et al. [44], although it is already outdated given the continued developments in these research lines in recent years. Also, although all these publications have focused on trajectory control around LPOs, it is worth noting that a few publications have also looked into the general problem of relative orbital motion in the CR3BP, not necessarily bound to the vicinity of equilibrium points [45,46].
Traditional techniques for stability analysis and classical control build on linearization about a reference trajectory, which by extension leads to the problem of close proximity relative orbital motion and formation flight around an LPO. This immediately drew attention to multi-spacecraft formations and applications to interferometry missions and other distributed spacecraft architectures; in fact, many of the aforementioned bibliographic references revolve around such applications. In contrast, the problem of orbital rendezvous between two co-orbital spacecraft within the CR3BP framework seems to have drawn very little attention until recent years. Gerding [47] was the first to consider rendezvous in a CR3BP environment, by proposing a simple, two-impulse rendezvous strategy based on the linearized motion around an LPO. Jones and Bishop [48,49] derived a targeting law for the terminal phase rendezvous, loosely equivalent to the rendezvous application of Hill’s equations in the two-body problem, and constructed a Kalman-based rendezvous navigation filter to supply the targeting law with the chaser vehicle state information. Canalias and Masdemont [50] developed a methodology for the rendezvous of satellites on a Lissajous orbit using the effective phases plane together with a linear approach. However, it was not until well within the last decade that rendezvous in non-Keplerian environments became a topic of renewed interest, following studies related to the Lunar Gateway placement on an LPO. Mand [51] investigated simple, impulsive close-rendezvous targeting strategies adapting the line-of-sight corridor and the line-of-sight glide notions, of common use in Keplerian rendezvous. Ueda and Murakami [52] investigated optimum guidance strategies based on free-drift dynamics, by following safe approach trajectories along invariant manifolds and into safe injection points for rendezvous in an Earth-Moon halo orbit. Along this line, Sato et al. [53] proposed two different strategies for rendezvous on an Earth-Moon
halo orbit by phasing along the orbit: one where the chaser approaches the target from behind along the orbit, similarly to a rendezvous in a low Earth orbit, and a second one utilizing a homoclinic intersection between an unstable manifold that departs from the halo orbit towards the Moon and a stable manifold returning to the halo orbit, thus allowing an increased launch window and flexibility for mission design, and the capability to adjust the time of arrival to the halo orbit with a lower propellant usage. A similar approach, based on finding connecting manifolds on Poincaré maps, was proposed by Lizy-Destrez [54]. Murakami et al. [55] further investigated the problems and requirements regarding guidance, navigation, and control for a rendezvous scenario in an Earth–Moon
halo.
From 2015 onward, the spotlight shifted to other types of LPO, such as Distant Retrograde Orbits (DRO) and Near-Rectilinear Halo Orbits (NRHO), once these were pointed out to be potentially better locations for the Lunar Gateway. Thus, Murakami and Yamanaka [56] proposed a three-impulse transfer from LEO to various phase points in a certain DRO, including a retrograde, powered lunar gravity assist. Ueda et al. [57] used standard, impulsive strategies to target an arbitrary relative position in halo, DRO and NRHO. Lizy-Destrez [58] provided a safety analysis for close approach rendezvous into an NRHO, and Davis et al. [59] looked into navigation accuracies and noise effects applied to various station-keeping strategies for NRHO, as well as examining the ability to absorb missed burns, construct phasing manoeuvres and conduct rendezvous and proximity operations. In a follow-up work, Lizy-Destrez et al. [60] presented methods and results related to strategies for far and close rendezvous, and compared different linear and nonlinear models for cislunar relative motion; in particular, three- and four-impulse strategies were reviewed for the far-range rendezvous, while key concepts and several impulsive targeting strategies available in the literature were reviewed for the close-range rendezvous. Blazquez et al. [61] looked at the far rendezvous approaches for NRHO and passively safe drift trajectories under a real ephemerides model, employing multiple-shooting and adaptive receding-horizon targeting algorithms, and more recently, Khoury and Howell [62] also looked into solutions to rendezvous and space loitering problems on NRHO and DRO type orbits. Bucchioni, in a series of works, focused on GNC for phasing and rendezvous under 6 DoF models [63,64,65].
It is striking, though, that all the aforementioned rendezvous strategies and targeting methods are either impulsive strategies based on exploiting invariant manifolds or impulsive close-range strategies based on different flavours of linearized motion. Despite the vast literature dedicated to continuous-thrust station-keeping and formation flight control around LPOs, our literature review only revealed a handful of publications that proposed a continuous control for rendezvous under non-Keplerian dynamics; in particular, Ulybyshev et al. [66] presented an optimal method for the low-thrust rendezvous trajectory design in the vicinity of a lunar
orbit under a full ephemerides model by adapting their LPO station-keeping strategy based on pseudoimpulses distributed along discretized orbital segments, yielding a fully numerical, linear programming problem [30]. Sanchez et al. proposed both impulsive and continuous Model Predictive Control (MPC) schemes for constrained, robust close-range docking in NRHO scenarios [67], albeit using a costly dynamics formulation based on a Local Vertical Local Horizontal reference frame. Another interesting contribution is the work by Colagrossi et al. [68], who studied the rendezvous and coupling in non-Keplerian orbits accounting for the orbit-attitude coupling and flexible modes of the structure of a very large space station, and more recently have also employed vision-based state navigation techniques for complete attitude-orbital state estimation and control [69]. Therefore, we found it surprising that, despite the wealth of literature devoted to continuous-thrust LPO station-keeping and formation flight strategies built upon classical and modern control theory, and with the only exceptions of Refs. [66,67,68,69], to the best of our knowledge, none of the reviewed continuous controllers has been investigated or adapted for the analysis of rendezvous trajectories at LPOs. It was precisely this realization that served as motivation for the present work. The main contributions of this manuscript are summarized in the following paragraph.
This work firstly revisits both analytical and numerical approaches to the problem of relative motion in the CR3BP; in particular: (1) the analytical linearization of motion around LPOs is approached differently from other literature sources, exploiting the natural dynamics for near-libration-point scenarios and resulting in a linear time-invariant (LTI) relative motion model to be exploited for GNC; and (2) a highly accurate numerical calculation scheme is presented for the relative motion in the CR3BP based on Encke’s method. Based on the developed analytical framework, a family of optimal, linear and nonlinear, impulsive and continuous control and guidance techniques is reviewed and developed to exploit the multi-body context and its intrinsic structures for far-range rendezvous and proximity operations in the CR3BP. Firstly, the optimal constrained rendezvous problem in the form of Bolza is introduced, and several guidance techniques are developed for successful optimal relative trajectory design. Both classical impulsive and continuous strategies, such as the LQR/SDRE or the two-impulse rendezvous scheme, are reviewed and compared in LPO missions; in addition, our work introduces the possibility of exploiting the presented low-cost LTI relative motion models for direct linear control synthesis, independent of the target motion. Secondly, two novel rendezvous techniques are presented: a numerical impulsive planning algorithm based on classical launcher staging theory, which yields a recursive solution for the optimal thrusting directions, and the Augmented Lagrangian Iterative LQR (AL-iLQR), formulated for our CR3BP rendezvous problem for both continuous and impulsive missions. Thirdly, all presented guidance techniques are hybridized with inner compensator loops, implemented through Model Predictive Control and first-order Sliding Mode Control, to provide robust performance against uncertainties and unmodelled dynamics. Finally, the proposed dynamical models and the developed rendezvous techniques are validated in two realistic simulation scenarios for the James Webb Space Telescope and the future Lunar Gateway.
The remainder of this manuscript is organized as follows. In Section 2, both the absolute and relative dynamics in the CR3BP are briefly revisited, and novel approaches to both the precise numerical integration of the relative dynamics and the linearization of relative motion are introduced. Some preliminaries on Control Theory are also highlighted. Section 3 and Section 4 pose the general optimal rendezvous problem and comprise the design of several guidance techniques for general rendezvous trajectory planning between spacecraft in co-orbital motion. Both continuous and impulsive guidance algorithms are either revisited or presented, and a comparison between these techniques is also provided using specific performance indices. Section 5 effectively combines the proposed guidance cores with several modern, robust, continuous and impulsive controllers for general reference tracking of optimal rendezvous trajectories; again, a performance analysis is carried out for the overall guidance and control loop. In Section 6, all the trajectory design and control techniques presented in this paper are tested and validated on two real-case mission scenarios, where these techniques are effectively employed and combined. Finally, Section 7 summarizes the main conclusions of this work.
3. Optimal Impulsive Guidance
The following section introduces several impulsive optimal guidance schemes for general proximity operations trajectory design. Moreover, the different techniques are compared against each other under a homing and rendezvous mission scenario.
The use of the term guidance in this context applies to the computation of both a reference state trajectory and a control input. The guidance core will then be cascaded with an inner control or compensator loop, whose techniques are described in Section 5.
All guidance laws presented in this work can be either used online, in a feedback manner, or computed offline and explicitly stored and regressed over an independent variable of interest, such as time, for online GNC purposes.
3.1. The Rendezvous Problem
In practice, most proximity operations and regular orbital control activities can be formulated as an optimal rendezvous problem, involving either two real spacecraft or the relative dynamics of a given vehicle with respect to some virtual object. This treatment motivates the formulation of the Rendezvous Problem.
The Rendezvous Problem is formally stated as a path-constrained two-point boundary value problem, where a finite control law is sought such that the chaser spacecraft follows a relative motion that takes it to the origin of the relative phase space after some finite time of flight, thus mathematically fulfilling the Rendezvous Condition (a null relative state at the final time). Hence, the Rendezvous Problem can be compactly stated as the following optimal control problem in the form of Bolza:
where the control function is saturated in a p-norm sense and the dynamics constraint is given by an appropriate model of the relative dynamics.
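For orientation, a generic Bolza-form statement consistent with the description above can be sketched as follows. All symbols are assumptions introduced here for illustration and may differ from the notation used elsewhere in the paper: $\mathbf{s}$ is the relative state, $\mathbf{u}$ the control, $\varphi$ and $\mathcal{L}$ the terminal and running costs, $\mathbf{g}$ the path constraints, $u_{\max}$ the saturation bound and $\mathbf{f}$ the relative dynamics model.

```latex
\begin{aligned}
\min_{\mathbf{u}(t)} \quad & J = \varphi\big(\mathbf{s}(t_f)\big)
   + \int_{t_0}^{t_f} \mathcal{L}\big(\mathbf{s}(t), \mathbf{u}(t), t\big)\, dt \\
\text{s.t.} \quad & \dot{\mathbf{s}} = \mathbf{f}(\mathbf{s}, \mathbf{u}, t),
   \qquad \mathbf{s}(t_0) = \mathbf{s}_0,
   \qquad \mathbf{s}(t_f) = \mathbf{0}, \\
 & \mathbf{g}\big(\mathbf{s}(t), t\big) \le \mathbf{0},
   \qquad \|\mathbf{u}(t)\|_p \le u_{\max},
   \qquad t \in [t_0, t_f].
\end{aligned}
```

The terminal equality is the Rendezvous Condition; the remaining constraints encode the path restrictions and the p-norm saturation of the control.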
3.2. Surrogate Optimization and Backward-Forward Sweep
The majority of the guidance techniques presented in this section benefit from surrogate relative motion models when addressing the Rendezvous Problem, mainly based on an appropriate linearization of the true nonlinear dynamics and the corresponding state transition matrix (STM). Moreover, when not built upon an analytical model, the STM is at most numerically integrated through the linear variational equations along the flow.
Therefore, any control policy computed under such dynamics is not guaranteed to rendezvous the relative state vector under the true nonlinear field. In general, some form of feedback is needed to successively refine the policy so that it complies with the nonlinear relative motion dynamics. This is achieved through iterating a backward-forward pass or sweep structure until convergence:
Along the backward pass, the guidance control policy is solved for under an appropriate approximation of the true nonlinear dynamics, such as the discrete map given by the current estimate of the STM.
In the forward pass or rollout, the converged input is used to re-integrate the state flow to higher accuracy, which in turn dictates the approximate dynamics used in the backward process (for example, the estimate of the STM).
For offline guidance, either direct iteration or Newton-Raphson differential correctors [76] are used to re-evaluate the STM along the flow and update the nominal control sequence. Online guidance is based on the Model Predictive Control (MPC) approach described in Section 5, in which the true nonlinearities and the uncertainties unmodelled in the surrogate guidance model are accommodated through a time-receding horizon scheme under the true plant dynamics.
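The backward-forward structure can be illustrated with a minimal Newton-Raphson differential corrector for a two-impulse transfer. This is only a sketch under stated assumptions, not the implementation used in this work: the mass parameter `MU`, the initial states, the fixed-step RK4 integrator and the finite-difference STM block (a stand-in for integrating the variational equations) are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np

MU = 0.01215  # approximate Earth-Moon mass parameter (illustrative assumption)

def cr3bp_rhs(s, mu=MU):
    """Synodic-frame CR3BP equations of motion in nondimensional units."""
    x, y, z, vx, vy, vz = s
    r1 = np.sqrt((x + mu) ** 2 + y ** 2 + z ** 2)      # distance to larger primary
    r2 = np.sqrt((x - 1 + mu) ** 2 + y ** 2 + z ** 2)  # distance to smaller primary
    ax = 2 * vy + x - (1 - mu) * (x + mu) / r1 ** 3 - mu * (x - 1 + mu) / r2 ** 3
    ay = -2 * vx + y - (1 - mu) * y / r1 ** 3 - mu * y / r2 ** 3
    az = -(1 - mu) * z / r1 ** 3 - mu * z / r2 ** 3
    return np.array([vx, vy, vz, ax, ay, az])

def flow(s0, tf, steps=100):
    """Forward pass: fixed-step RK4 propagation of the nonlinear dynamics."""
    s, h = np.array(s0, dtype=float), tf / steps
    for _ in range(steps):
        k1 = cr3bp_rhs(s)
        k2 = cr3bp_rhs(s + 0.5 * h * k1)
        k3 = cr3bp_rhs(s + 0.5 * h * k2)
        k4 = cr3bp_rhs(s + h * k3)
        s = s + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return s

def stm_rv(s0, tf, eps=1e-6):
    """Position-velocity STM block by central differences along the current flow."""
    J = np.zeros((3, 3))
    for j in range(3):
        d = np.zeros(6)
        d[3 + j] = eps
        J[:, j] = (flow(s0 + d, tf)[:3] - flow(s0 - d, tf)[:3]) / (2 * eps)
    return J

def two_impulse_sweep(target0, chaser0, tf, tol=1e-10, max_iter=10):
    """Iterate forward rollouts and linearized corrections until the position miss vanishes."""
    target_f = flow(target0, tf)
    dv1 = np.zeros(3)
    for it in range(max_iter):
        start = chaser0 + np.concatenate([np.zeros(3), dv1])
        chaser_f = flow(start, tf)                            # forward pass (rollout)
        miss = chaser_f[:3] - target_f[:3]
        if np.linalg.norm(miss) < tol:
            break
        dv1 = dv1 - np.linalg.solve(stm_rv(start, tf), miss)  # backward pass (linear solve)
    dv2 = target_f[3:] - chaser_f[3:]                         # match the target velocity at arrival
    return dv1, dv2, np.linalg.norm(miss), it

# Illustrative states near the Earth-Moon L2 region (not taken from the paper).
target0 = np.array([1.16, 0.0, 0.02, 0.0, -0.05, 0.0])
chaser0 = target0 + np.array([1e-3, -1e-3, 5e-4, 0.0, 0.0, 0.0])
dv1, dv2, miss, iters = two_impulse_sweep(target0, chaser0, tf=0.6)
```

Each iteration re-linearizes about the latest rollout, so the linear solve always uses an STM consistent with the current nonlinear trajectory.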
3.3. Two-Impulse Guidance
Given the linearized dynamics in Equation (8), proximity operations trajectory design can be carried out by planning the impulsive control sequence with respect to some optimality policy, usually penalizing integral loss functions of both the control effort and the error with respect to the desired state.
Although a single impulse suffices to nullify the relative range to target after some time of flight, two impulses are mandatory to fully regulate the 6-dimensional relative state (that is, to rendezvous the two spacecraft). Thus, the linear two-impulse (TI) rendezvous scheme, widely used within the context of Keplerian motion since the 1960s [77] and suggested in previous investigations [51], is here explicitly adapted to the context of the CR3BP for comparison purposes against more complex strategies. Moreover, compared to previous literature, our scheme can profit from direct LTI dynamics (lower computational cost in the integration process) and a more numerically accurate integration process, as introduced in Section 2.
The backward pass of the TI scheme is given by evaluating Equation (8) with a first impulse at the initial time and a second impulse at the final time of flight, which provides the final state vector. Enforcing the Rendezvous Condition and solving for the impulses yields the following solution:
where the subscripts indicate the corresponding partitions of the STM following the usual notation.
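As a hedged illustration of the TI backward pass, the sketch below solves the two impulses from the position/velocity partitions of an STM. It assumes an LTI surrogate given by the classical linearization about a collinear libration point with an illustrative coefficient `C2` (of the order of the Earth-Moon L2 value); this is not necessarily the LTI model developed in Section 2, and all function names are hypothetical.

```python
import numpy as np

C2 = 3.19  # illustrative gravity-gradient coefficient (assumption, roughly Earth-Moon L2)

def libration_lti(c2=C2):
    """State matrix of the classical linearization about a collinear libration point."""
    grad = np.diag([1 + 2 * c2, 1 - c2, -c2])  # position-dependent acceleration terms
    coriolis = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [grad, coriolis]])

def lti_stm(A, t):
    """STM exp(A t) via eigendecomposition (A is assumed diagonalizable)."""
    w, V = np.linalg.eig(A)
    return ((V * np.exp(w * t)) @ np.linalg.inv(V)).real

def two_impulse(Phi, s0):
    """Impulses at t0 and tf that drive the linear relative state to the origin."""
    r0, v0 = s0[:3], s0[3:]
    Prr, Prv, Pvr, Pvv = Phi[:3, :3], Phi[:3, 3:], Phi[3:, :3], Phi[3:, 3:]
    v_after = -np.linalg.solve(Prv, Prr @ r0)  # velocity that reaches the origin at tf
    dv1 = v_after - v0
    dv2 = -(Pvr @ r0 + Pvv @ v_after)          # cancels the arrival velocity
    return dv1, dv2

tf = 0.6
s0 = np.array([1e-3, -2e-3, 5e-4, 1e-4, 0.0, -1e-4])  # illustrative relative state
Phi = lti_stm(libration_lti(), tf)
dv1, dv2 = two_impulse(Phi, s0)

# Check under the linear model: state after both impulses should vanish.
s_final = Phi @ (s0 + np.r_[np.zeros(3), dv1]) + np.r_[np.zeros(3), dv2]
```

Under the true nonlinear field these impulses are only a first guess, which is why the scheme is wrapped in the forward pass described next.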
The forward pass may be implemented using either MPC or classical differential correctors. In the latter case, as already stated, these expressions are solved iteratively for the current estimate of the impulses and their effect under the nonlinear vector field to re-estimate the STM, which depends on the initial conditions and is therefore an implicit function of the impulses. Under the MPC paradigm [78], for an N-horizon time span, a sequence of TI problems is solved, and the problem’s true dynamics are accommodated through the complete horizon.
3.4. Multi-Impulse Guidance
A multi-impulse (MI) rendezvous allows for greater flexibility in the design of the relative trajectory, as accuracy-in-the-execution constraints may be relaxed and navigation errors can be accommodated in any of the manoeuvres to be performed.
For the MI backward pass, Equation (
8) evaluated at
can be compactly expressed in matrix form as
with matrices
and
defined as
with
and
denoting horizontal and vertical concatenation, respectively. For non-LPO missions, a similar approach may be found in [46] exploiting the duality of linear input-output systems.
Enforcing the Rendezvous Condition
and solving for
yields
where the
operator denotes the Moore-Penrose pseudoinverse [
79]. This solution also introduces an
-norm penalty on the impulses vector
as a general fuel consumption metric [
80]. Moreover, an appropriate decomposition of
provides the solution for less general rendezvous problems, such as position waypoint targeting.
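A minimal sketch of the pseudoinverse solution follows, assuming the same illustrative LTI surrogate about a collinear libration point as a stand-in for Equation (8). The stacked matrix `M`, the vector `b` and the burn grid are hypothetical names and values introduced here, not the matrices defined in the text.

```python
import numpy as np

def libration_lti(c2=3.19):
    """Illustrative LTI surrogate: linearization about a collinear libration point."""
    grad = np.diag([1 + 2 * c2, 1 - c2, -c2])
    coriolis = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [grad, coriolis]])

def lti_stm(A, t):
    """STM exp(A t) via eigendecomposition (A is assumed diagonalizable)."""
    w, V = np.linalg.eig(A)
    return ((V * np.exp(w * t)) @ np.linalg.inv(V)).real

def multi_impulse(A, s0, burn_times, tf):
    """Minimum 2-norm impulse sequence that nulls the linear relative state at tf."""
    B = np.vstack([np.zeros((3, 3)), np.eye(3)])  # impulses act on velocity only
    # Each burn at time ti is mapped to the final state by Phi(tf - ti) B.
    M = np.hstack([lti_stm(A, tf - ti) @ B for ti in burn_times])
    b = -lti_stm(A, tf) @ s0                      # natural drift to be cancelled
    dV = np.linalg.pinv(M) @ b
    return dV.reshape(len(burn_times), 3), M, b

A = libration_lti()
tf = 0.6
burn_times = np.linspace(0.0, tf, 6)              # six burns on a uniform grid (assumption)
s0 = np.array([1e-3, -2e-3, 5e-4, 1e-4, 0.0, -1e-4])
dV, M, b = multi_impulse(A, s0, burn_times, tf)
residual = M @ dV.ravel() - b                     # should vanish for a consistent system
```

Among all impulse sequences satisfying the terminal condition, the pseudoinverse returns the one with the smallest Euclidean norm of the stacked impulse vector, which is how the norm penalty mentioned above arises.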
The overall performance and convergence of the associated differential corrector depend strongly on the initial conditions, the final time of flight and the selected execution times for the burns, all of which may trigger the effect of the nonlinearities neglected in the problem’s dynamics.
In any case, and as for the TI technique, Equation (12) needs to be solved iteratively in the backward-forward structure. In this case, the most appropriate feedback formulation is the Newton-Raphson method due to the sparsity of the expected impulses.
3.5. Optimal Multi-Impulsive Guidance
Further developing the multi-impulsive scheme naturally leads into its optimal formulation (Opt-MI), in which the
N execution times
and their associated burns are optimally computed. The selected loss function to be minimized here is the total propellant consumption or, equivalently, the aggregated
of the
N impulses, for which the
-norm is most fitting [
80]. This yields a Linear Programming minimization problem in which, under a linearized model of the dynamics, the component-wise constrained magnitude of the
N burns can be determined for a rendezvous with the target object at a prescribed time of flight
, namely:
The use of the synodic reference frame together with Encke’s formulation of the dynamic vector field ensures a cheaper and numerically better-behaved STM propagation when compared to previous work [
67]; in addition, the use of the
-norm as a proxy for fuel consumption and its associated LP optimization problem allows the online use of the algorithm without major computational expenses.
As already mentioned, the solver shall be embedded in some form of feedback loop to converge the control impulse sequence under the true nonlinear dynamics, through an iterative refinement of both the input and the STM. In this case, the MPC paradigm in Section 5 may be used as the forward pass, due to the lack of an exact form of an appropriate differential corrector to solve the problem under iteration.
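The linear program behind such a scheme can be sketched as follows, assuming the fuel proxy is the 1-norm of the stacked impulse components, which is the choice that leads to a linear program. The split into non-negative parts, the time grid, the bound `U_MAX` and the LTI surrogate are illustrative assumptions; the sketch relies on SciPy's `linprog` and is not the exact formulation of this work.

```python
import numpy as np
from scipy.optimize import linprog

def libration_lti(c2=3.19):
    """Illustrative LTI surrogate: linearization about a collinear libration point."""
    grad = np.diag([1 + 2 * c2, 1 - c2, -c2])
    coriolis = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [grad, coriolis]])

def lti_stm(A, t):
    """STM exp(A t) via eigendecomposition (A is assumed diagonalizable)."""
    w, V = np.linalg.eig(A)
    return ((V * np.exp(w * t)) @ np.linalg.inv(V)).real

def l1_impulses(M, b, u_max):
    """Minimize sum |dV_k| subject to M dV = b and |dV_k| <= u_max, with dV = p - q."""
    n = M.shape[1]
    cost = np.ones(2 * n)                      # sum(p) + sum(q) equals the 1-norm at the optimum
    A_eq = np.hstack([M, -M])                  # M (p - q) = b
    res = linprog(cost, A_eq=A_eq, b_eq=b,
                  bounds=[(0.0, u_max)] * (2 * n), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

A = libration_lti()
tf = 0.6
grid = np.linspace(0.0, tf, 6)                 # candidate burn times (assumption)
B = np.vstack([np.zeros((3, 3)), np.eye(3)])   # impulses act on velocity only
M = np.hstack([lti_stm(A, tf - ti) @ B for ti in grid])
s0 = np.array([1e-3, -2e-3, 5e-4, 1e-4, 0.0, -1e-4])
b = -lti_stm(A, tf) @ s0
U_MAX = 0.1                                    # component-wise bound, nondimensional (assumption)
dV_l1 = l1_impulses(M, b, U_MAX)
```

Because the 1-norm favours sparse solutions, many components of `dV_l1` are typically zero, so the program effectively selects which grid nodes carry a burn.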
3.6. Multi-Impulsive Staging Guidance
Despite the benefits of multi-impulsive schemes, as already seen, additional criteria or assumptions are needed to determine the impulse execution times, on which the performance of the controller is totally dependent. While the previous algorithm develops an online cost-effective optimal rendezvous problem solver, its formulation depends on a norm-based proxy for fuel consumption, which, in some cases, may not be representative of the true mass dynamics of the mission.
Inspired by classical launcher staging mass optimization, the novel Multi-Impulsive Staging Guidance (MISG) core aims to solve the following constrained optimization problem during the backward pass:
The cost function is selected as a proxy of control effort and fuel consumption and to ease further algebra. For the rest of this derivation, the relative weights are assumed to be equal and unitary. The target of the manoeuvre is the desired relative state at the final time of flight. The complete impulse sequence can be determined considering discrete dynamics, therefore eliminating the impulse times as explicit optimization variables. For a given time of flight and time grid, the problem is fully determined, so that for every time node in the grid, a (possibly null) optimal impulse is computed.
The necessary conditions for optimality are given by constructing an augmented Lagrangian function and computing its stationary point with respect to
,
resulting in the following
equations for the
variables
The Lagrange multiplier
can be eliminated through the use of equations
i and
, reading the following recursive system
The final algebraic system to be explicitly solved is therefore
Equation (15) is suited for the MPC forward pass iteration, with applications in online feedback guidance. However, for offline cases, and by noting that the complete sequence is a function of the first impulse only, a Newton-Raphson differential corrector scheme can be formulated to iteratively correct the first impulse under each STM refinement iteration until convergence. For the latter case, Equation (15) can be further simplified as
where the sensitivity matrix G is defined as
G gives the Newton-Raphson update step to the first impulse through iterating the following result
If the sequence needs to be control-input bounded or constrained in any other manner, the associated inequality can be introduced in the cost function through additional slack variables, yielding
3.7. Impulsive Iterative Linear Quadratic Regulator
The last impulsive guidance scheme presented in this investigation is a novel application of the Augmented Lagrangian Iterative Linear Quadratic Regulator (AL-iLQR) for spacecraft rendezvous under LPO dynamics. The motivation and detailed derivation of the AL-iLQR may be found in Section 4, after presenting the classical Linear Quadratic Regulator (LQR) policy for continuous dynamics.
Our AL-iLQR paradigm aims to solve the discrete constrained optimization problem
where
is a control authority bound and the inequality/equality set given by
introduces general state constraints, such as Line-of-Sight restrictions [
81]. The discrete dynamics given by the map
represent the relative motion dynamical system in any of its forms.
The constrained problem is relaxed and solved as a series of unconstrained optimizations by appropriately augmenting
J [
82]
The method achieves a low-cost optimization of a quadratic proxy of the control effort and the final and integral rendezvous error in finite time, under discrete impulsive manoeuvres. It generalizes the LQR solution to nonlinear systems by iterating on the optimal control policy during the backward pass and updating the flow over the forward pass, as described later on. The significance of the penalizing weight matrices, including R, may also be found in Section 4. Finally, it is trivial to augment the relative state to incorporate an integral penalty term, as also discussed in Section 4.
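For reference, the relaxation of a constrained discrete problem into a sequence of unconstrained ones is commonly written in the augmented Lagrangian iLQR literature in the form sketched below. This is a generic textbook form under assumed notation, not necessarily the exact expression used in this work: $\mathbf{c}_k$ stacks the constraints at node $k$, $\boldsymbol{\lambda}_k$ are the multipliers, $\mu$ is a penalty weight and $\mathbf{I}_{\mu,k}$ is a diagonal penalty matrix.

```latex
\begin{aligned}
\mathcal{L}_A &= J + \sum_{k} \left[ \boldsymbol{\lambda}_k^{\top} \mathbf{c}_k(\mathbf{s}_k, \mathbf{u}_k)
   + \tfrac{1}{2}\, \mathbf{c}_k(\mathbf{s}_k, \mathbf{u}_k)^{\top} \mathbf{I}_{\mu,k}\, \mathbf{c}_k(\mathbf{s}_k, \mathbf{u}_k) \right], \\
\big(\mathbf{I}_{\mu,k}\big)_{ii} &=
\begin{cases}
0, & \text{if constraint } i \text{ is an inactive inequality with } \lambda_{k,i} = 0, \\
\mu, & \text{otherwise},
\end{cases} \\
\lambda_{k,i} &\leftarrow
\begin{cases}
\lambda_{k,i} + \mu\, c_{k,i}, & \text{equality constraints}, \\
\max\big(0,\ \lambda_{k,i} + \mu\, c_{k,i}\big), & \text{inequality constraints } c_{k,i} \le 0 .
\end{cases}
\end{aligned}
```

An inner iLQR solve minimizes $\mathcal{L}_A$ for fixed multipliers; the outer loop then updates the multipliers and increases $\mu$ until the constraint violation falls below a tolerance.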
3.8. Performance of Impulsive Control Rendezvous Manoeuvres
In the following, the different impulsive guidance algorithms presented in this section are tested and compared using the complete nonlinear equations of relative motion in the CR3BP. To that end, a long-range example scenario is proposed where the target and chaser spacecraft are in two distinct northern halo orbits around the Earth-Moon
, as depicted in
Figure 3. The target spacecraft is located in a halo orbit of an out-of-plane amplitude of 20,140
; the dimensionless time of flight to rendezvous was selected to be 0.6, corresponding to 2.67 days (this value is found to be the critical time
for the given orbit ensuring the differential correction convergence). For the AL-iLQR, however, this value was modified to
units or 14 days: although this seems to relax the problem, it increases the control cost, as the number of impulses grows drastically. Additionally, despite showing finite-time convergence, the AL-iLQR benefits from longer times of flight, over which the true nonlinear dynamics can be exploited; for short time scales, the algorithm tends to overshoot the optimal solution. The dimensionless initial conditions for the target and the chaser spacecraft are, respectively:
corresponding to an initial relative range of 13,195
.
Several considerations must be addressed regarding the configuration of the different guidance schemes:
The STM is integrated along the reference state trajectory
through the linear variational equations of Encke’s formulation of the relative dynamics, Equation (
3). Although these equations are the most accurate, they are also the most computationally expensive, providing a worst-case scenario for performance analysis. For the sake of demonstration and comparison, the TI algorithm is also run with the analytical RLLM STM.
For the sake of comparison, all state constraints are relaxed for all schemes, and the upper control bound is set to for the Opt-MI case (component-wise) and for the AL-iLQR. However, to nullify the final relative velocity, such bound is relaxed at the time of flight.
For the multi-impulsive case, six burns were selected to be performed along the trajectory; we chose a randomly distributed set of values (given in
Table 1 for the sake of reproducibility) to illustrate the intrinsic robustness of this algorithm, for which the impulse sequence is critical.
For the Opt-MI and MISG schemes, impulses are planned every 1 time units. For the AL-iLQR, 5 nondimensional time units are used instead.
All controllers are formulated as open-loop schemes, for which differential correctors are better suited as forward passes. However, to address the clear performance difference against MPC, both the Opt-MI and MISG techniques are also formulated using the latter.
Matrices
Q and
R in the AL-iLQR scheme are selected to be constant and
Moreover, the initial estimation of the optimal flow is given by the coasting solution .
No uncertainty is considered in this simulation; it is introduced later, when assessing the performance of the robust controllers in
Section 5.
To quantitatively analyze the performance differences between controllers, we use as a proxy a series of performance indices related to the rendezvous error and the propellant consumption of the manoeuvre; these indices are defined in
Appendix B. The performance indices for these techniques in the considered example are summarized in
Table 2. Moreover, the minimum and maximum impulse norms (among the nonzero impulses) are also displayed to quantitatively bound the feasibility of each algorithm.
All guidance schemes present very similar error performance, with the MISG scheme showing slightly better results. Most of the presented techniques also show similar computational cost, except precisely for the MISG algorithm, which, in both forms of the forward sweep, solves the optimal problem one order of magnitude slower. Interestingly, using the analytical RLLM STM raises the computational time of the TI algorithm when compared to direct numerical integration of the variational equations. This is explained by an increase in the number of iterations until convergence, from 4 to 15 in the RLLM case. However, given that both versions show the same results, the analytical version of the algorithm is particularly suited for online calculations when compared to its integrated counterpart. Finally, regarding control cost, the Opt-MI, MISG and AL-iLQR schemes (despite an increase in the number of impulses by a factor of 10) show similar performance, as all of them are conceived and constructed on optimality grounds. Interestingly, the TI and MI controllers show a much smaller control effort, deviating 18% and 50% from the nominal values. This is attributed to the sparsity of impulses in both algorithms, indicating that the selected number of impulses is closer to the optimal value [
83,
84,
85]. For the other three controllers, the number of impulses is determined by the minimum time step considered in the discretization of the dynamics. However, for the TI, MI and Opt-MI algorithms, the final impulse nullifying the relative velocity is nearly 5 times larger than for the rest of the controllers. Again, this is explained by the lack of optimality considerations underlying these algorithms.
Figure 3 shows the proposed rendezvous trajectories in the absolute configuration space, while
Figure 4 displays the relative state evolution in time under the action of each controller. All of them successfully drive the chaser to rendezvous with the target spacecraft, located in a different periodic orbit, with a convergence error below 1
, and differential corrections requiring less than 50 iterations to converge in the worst scenario. Finally,
Figure 5 shows the evolution of the control impulse sequence for the optimal multi-impulsive controllers. The MISG algorithm can be seen to keep the control sequence bounded over the whole time span. Independently of the numerical solver in use, the algorithm shows similar control trends: first, the relative range is targeted at the initial stages, driving the sequence towards a coasting period; then, towards the end of the transfer, the action plan progressively nullifies the relative velocity. On the other hand, the AL-iLQR shows most control activity in the early stages of the transfer, followed by a nearly coasting phase that exploits the natural transport dynamics.
The terminal phase of the rendezvous trajectory (
) is redesigned using the fully constrained AL-iLQR, in which the chaser must approach the docking port through a given safety corridor. This constraint is introduced by
which models a cone-like constraint defined by the half-cone angle
, together with an axial direction given by the unit vector
. In this simulation,
and the cone axis is selected to be
, which is assumed to be aligned with the docking axis at the rendezvous time
. Despite being nonlinear in the state
, the constraint needs no redefinition and can be directly introduced in the AL-iLQR algorithm.
Figure 6 depicts the evolution of the relative angle between
and
in time. Again, convergence towards
is noticeable. Finally,
Table 3 compiles the major performance metrics of the algorithm, where no difference can be appreciated with respect to the unconstrained case, demonstrating the robustness and adaptability of the AL-iLQR technique when compared to the rest of the presented guidance schemes.
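A cone-like approach corridor of this kind can be sketched as a scalar inequality on the relative position. The form below is one common way of writing such a constraint and is an assumption, not necessarily the expression used in this work; the function names and the numerical values in the example are hypothetical.

```python
import math


def cone_constraint(rho, axis, half_angle):
    """Return c <= 0 when the relative position rho lies inside the cone.

    Assumed form: c = |rho| cos(half_angle) - rho . n, with n the unit axis.
    """
    norm_rho = math.sqrt(sum(r * r for r in rho))
    norm_axis = math.sqrt(sum(a * a for a in axis))
    along_axis = sum(r * a for r, a in zip(rho, axis)) / norm_axis
    return norm_rho * math.cos(half_angle) - along_axis


def off_axis_angle(rho, axis):
    """Angle (rad) between the relative position and the cone axis."""
    norm_rho = math.sqrt(sum(r * r for r in rho))
    norm_axis = math.sqrt(sum(a * a for a in axis))
    if norm_rho == 0.0:
        return 0.0  # the cone apex is taken as on-axis by convention
    cos_angle = sum(r * a for r, a in zip(rho, axis)) / (norm_rho * norm_axis)
    return math.acos(max(-1.0, min(1.0, cos_angle)))


# Hypothetical example: 20 deg half-cone angle about the x axis
inside = cone_constraint((1.0, 0.1, 0.0), (1.0, 0.0, 0.0), math.radians(20.0))
outside = cone_constraint((1.0, 1.0, 0.0), (1.0, 0.0, 0.0), math.radians(20.0))
```

The constraint value is negative inside the corridor and positive outside, so it can be passed to a penalty or augmented Lagrangian term without further manipulation.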
6. Applications and Testbench Mission Scenarios
This section presents two simulation scenarios to illustrate the use of the manoeuvre design schemes introduced in the preceding sections, and for the purpose of validation and benchmarking.
In the following test cases, all numerical integrations were performed by means of Encke’s formulation for the relative dynamics using a variable-step, variable-order Adams-Bashforth-Moulton predictor-corrector solver of orders 1 to 13 with absolute and relative integration tolerances set to
and
, respectively, as implemented in Matlab 2021b’s function
ode113 [
98].
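As a rough illustration of how such propagations can be sanity-checked, the sketch below integrates the absolute CR3BP equations of motion in the synodic frame and monitors the Jacobi constant. It deliberately swaps the variable-order Adams-Bashforth-Moulton solver for a fixed-step fourth-order Runge-Kutta scheme and does not use Encke's relative formulation; the mass parameter value and the initial state are assumptions chosen only for the example.

```python
import math

MU = 0.01215  # approximate Earth-Moon mass parameter (assumed value)


def primary_distances(x, y, z, mu):
    """Distances to the larger and smaller primaries in the synodic frame."""
    r1 = math.sqrt((x + mu) ** 2 + y * y + z * z)
    r2 = math.sqrt((x - 1.0 + mu) ** 2 + y * y + z * z)
    return r1, r2


def cr3bp_rhs(s, mu=MU):
    """Nondimensional CR3BP equations of motion."""
    x, y, z, vx, vy, vz = s
    r1, r2 = primary_distances(x, y, z, mu)
    ax = 2.0 * vy + x - (1.0 - mu) * (x + mu) / r1**3 - mu * (x - 1.0 + mu) / r2**3
    ay = -2.0 * vx + y - (1.0 - mu) * y / r1**3 - mu * y / r2**3
    az = -(1.0 - mu) * z / r1**3 - mu * z / r2**3
    return [vx, vy, vz, ax, ay, az]


def jacobi_constant(s, mu=MU):
    """Integral of motion of the CR3BP, used here as an accuracy monitor."""
    x, y, z, vx, vy, vz = s
    r1, r2 = primary_distances(x, y, z, mu)
    speed_sq = vx * vx + vy * vy + vz * vz
    return x * x + y * y + 2.0 * (1.0 - mu) / r1 + 2.0 * mu / r2 - speed_sq


def rk4_step(f, s, h):
    """Single classical Runge-Kutta step of size h."""
    k1 = f(s)
    k2 = f([si + 0.5 * h * ki for si, ki in zip(s, k1)])
    k3 = f([si + 0.5 * h * ki for si, ki in zip(s, k2)])
    k4 = f([si + h * ki for si, ki in zip(s, k3)])
    return [si + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]


def propagate(s0, t_final, steps):
    """Propagate the state s0 over t_final nondimensional time units."""
    h = t_final / steps
    s = list(s0)
    for _ in range(steps):
        s = rk4_step(cr3bp_rhs, s, h)
    return s


state0 = [0.8, 0.0, 0.05, 0.0, 0.1, 0.0]  # hypothetical initial state
state1 = propagate(state0, 1.0, 1000)
jacobi_drift = abs(jacobi_constant(state1) - jacobi_constant(state0))
```

A small drift of the Jacobi constant is a necessary, though not sufficient, indication that the step size or tolerances are adequate for the trajectory at hand.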
The time discretization for the simulation output was set to nondimensional time units, which corresponds to 1.68 h in the Earth-Moon system and 8.76 days in the Sun-Earth system; consecutive maneuvers were also constrained to be separated by at least this time interval.
Navigation uncertainty in the state estimation was modeled as a zero-mean normal multivariate distribution , with diagonal covariance matrix for the relative position space and for velocity. Moreover, impulsive control actions are -bounded by , which corresponds to in the Earth-Moon system and for the Sun-Earth case. Continuous acceleration is restricted to be smaller than unless otherwise specified. Finally, process noise is added as a white noise disturbance, where .
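A minimal sketch of how such a navigation error and control bound could enter a simulation loop is given below. The function names, the use of the Euclidean norm for the impulse bound and all numerical values are assumptions made for illustration; they are not taken from the simulations reported here.

```python
import math
import random


def noisy_relative_state(state, sigma_pos, sigma_vel, rng):
    """Add zero-mean Gaussian noise with diagonal covariance to [rho, rho_dot]."""
    sigmas = [sigma_pos] * 3 + [sigma_vel] * 3
    return [s + rng.gauss(0.0, sig) for s, sig in zip(state, sigmas)]


def saturate_impulse(dv, dv_max):
    """Rescale the impulse so that its Euclidean norm does not exceed dv_max."""
    norm = math.sqrt(sum(c * c for c in dv))
    if norm <= dv_max:
        return list(dv)
    return [c * dv_max / norm for c in dv]


rng = random.Random(0)  # fixed seed so that the run is repeatable
true_state = [1e-3, 0.0, 0.0, 0.0, 1e-4, 0.0]   # hypothetical relative state
estimate = noisy_relative_state(true_state, 1e-6, 1e-8, rng)
clipped = saturate_impulse([3.0, 4.0, 0.0], 1.0)  # norm 5 scaled down to norm 1
```

Rescaling preserves the direction of the commanded impulse; a component-wise bound would instead clip each axis independently and may rotate the command.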
6.1. Servicing a Solar Observatory
Solar and telescope missions have played a fundamental role in both our understanding of the Solar System and our surrounding space environment, as well as in the development of modern Orbital Mechanics. As a matter of fact, both the Solar and Heliospheric Observatory and Genesis spacecraft were the very first to demonstrate and use the Three-Body Problem dynamical solutions as a basis for their missions. Libration points also provide ideal locations for celestial observation missions, due to their illumination conditions and uninterrupted communication with the Earth. Hence, for the James Webb Space Telescope (JWST), launched in December 2021, a Sun-Earth halo orbit with an orbital period of 6 months was selected as its nominal orbit. The JWST and its potential visual-inspection and repair needs throughout its mission lifetime provide the first validation scenario for the proposed GNC techniques. A nominal northern halo orbit has been selected to be representative of that of the real mission.
The chaser spacecraft is intended to perform a formation flight around the target spacecraft (JWST) and, if needed, rendezvous with it for repair or maintenance activities. The mission will then finish with an exploration of the Sun-Earth interior realm.
The proposed mission timeline for this fictional JWST servicing spacecraft is summarized in
Table 6, and the different mission phases and their associated control cost estimates are listed in
Table 7. The spacecraft is assumed to be equipped with chemical propulsion and an electric thruster for relative formation flying. Additionally, simulations only reproduce proximity operations activities, while the transfer phase from a low-Earth orbit to the
target halo orbit range is not considered here; it may be achieved by means of the stable manifold of the orbit, just as the nominal transfer trajectory for the JWST was planned, with potentially null associated control cost [
99].
The proximity operations mission starts after nominal loitering halo orbit insertion of the chaser vehicle, into a distinct
halo orbit to that of the target. This initial long range approach considers acquiring the
relative range for the start of the formation flying. The initial conditions of the target and the chaser relative state at the halo insertion time are
which corresponds to an initial relative range of 623,520 km.
The guidance and control scheme used in this phase corresponds to the combination of the MPC controller with an optimal iLQR technique. The time horizon considered is
, with impulses every
nondimensional time units (corrections every 2.90 days). The time of flight of the phase is set therefore to
units. Moreover, all state constraints are relaxed in this initial phase. Finally, after a heuristic trial-and-error process, the discrete AL-iLQR algorithm is defined by the following penalty matrices
The algorithm reduces the initial relative range by 98%, as shown in
Figure 12, where the relative state evolution of the chaser spacecraft can be analysed.
Figure 13 depicts the control input evolution in time during this initial phase.
Once the relative formation flying range is acquired, offline LQR guidance (using the integral-augmented RLLM) is combined with the SMC controller to position the chaser at
to the JWST and maintain such relative distance for two months, by converting the relative formation flying problem into a virtual time-invariant rendezvous problem under an appropriate change of variables. The total time of flight is
, formation flying starting at
from phase start. The initial conditions of this phase are given by
The LQR is defined by the weight matrices
while the SMC, after tuning, is defined by the parameter set
Offline guidance refers to the regression of the LQR reference state trajectory
as a function of time, in this case through an appropriate low-order projection in Chebyshev polynomials
.
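The regression step can be sketched with a standard Chebyshev projection on Chebyshev-Gauss nodes. The code below is only illustrative: the scalar decaying reference, the time interval, the polynomial degree and the number of nodes are assumptions, and in practice each component of the LQR reference state would be fitted in the same way.

```python
import math


def chebyshev_fit(f, t0, t1, degree, nodes=32):
    """Chebyshev coefficients of f on [t0, t1] sampled at Chebyshev-Gauss nodes."""
    coeffs = []
    for k in range(degree + 1):
        total = 0.0
        for j in range(nodes):
            theta = math.pi * (j + 0.5) / nodes
            # Map the node cos(theta) in [-1, 1] to the time interval [t0, t1]
            t = 0.5 * (t1 + t0) + 0.5 * (t1 - t0) * math.cos(theta)
            total += f(t) * math.cos(k * theta)
        coeffs.append(2.0 * total / nodes)
    return coeffs


def chebyshev_eval(coeffs, t0, t1, t):
    """Evaluate c_0 / 2 + sum_k c_k T_k(x), with x the time mapped to [-1, 1]."""
    x = (2.0 * t - t0 - t1) / (t1 - t0)
    theta = math.acos(max(-1.0, min(1.0, x)))
    value = 0.5 * coeffs[0]
    for k in range(1, len(coeffs)):
        value += coeffs[k] * math.cos(k * theta)  # T_k(x) = cos(k arccos x)
    return value


def reference(t):
    """Hypothetical smooth reference component decaying towards rendezvous."""
    return math.exp(-t)


coeffs = chebyshev_fit(reference, 0.0, 2.0, degree=8)
max_error = max(abs(chebyshev_eval(coeffs, 0.0, 2.0, 0.02 * i) - reference(0.02 * i))
                for i in range(101))
```

Storing a handful of coefficients per state component, instead of the full sampled trajectory, is what makes the guidance law cheap to evaluate at arbitrary times during tracking.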
The relative state evolution of the chaser spacecraft can be analysed in
Figure 14. The needed low-thrust acceleration is displayed in
Figure 15.
After 243.33 days of visual inspection and formation flying, the chaser spacecraft is planned to rendezvous with the target vehicle using the combined MPC-MISG core described in
Section 3. The phase considers a time horizon of
impulses in
nondimensional time units. The initial state of the phase, for both the target and the relative motion, is given by
The final (purely numerical) relative error between vehicles is
, which, in position space, corresponds to
, thus successfully achieving the rendezvous of the two spacecraft.
Figure 16 shows the relative state evolution during this final close-range rendezvous phase, while
Figure 17 depicts the needed impulses to accomplish the rendezvous.
The trajectory of the proposed JWST servicing mission for the complete mission timeline is illustrated in
Figure 18.
Thus, the proposed solution leads to affordable rendezvous maneuvers and control strategies, where the majority of the control budget is allocated to perform the relative formation flying phase, which is by far the longest in time and the most demanding in high-frequency actuation. Moreover, despite the associated budget, the phase is completed using bounded continuous propulsion, for which direct requirements are less stringent thanks to their higher specific impulse. The rest of the manoeuvres are negligible in cost, while successfully completing all phases. The proposed guidance and control architecture is therefore demonstrated for the mission.
6.2. Rendezvous in a Near Rectilinear Halo Orbit
For the past decade, deep space missions have gained increasing attention from the space industry. In order to ease such missions, a lunar orbit station is planned to be established within this decade, known as the Lunar Gateway [
2,
3], which will serve as a communication hub and short-term habitation module to support human return to the Moon and as a staging point for these deep space missions. Thanks to their long-term station-keeping stability, nearly uninterrupted communication with Earth and eclipse-avoidance properties, Near Rectilinear Halo Orbits (NRHOs) have been selected as the leading candidate for the Gateway nominal trajectory; these orbits are highly eccentric trajectories, nearly normal to the orbital plane of the Earth-Moon system. The NRHO family can be found in both
and
southern halo sets, at the closest side to the Moon; although they are an intrinsic solution to the CR3BP, they also persist under ephemerides models. NRHOs provide easy access to the lunar surface, as well as to the Earth, by means of their stable and unstable manifolds.
The Lunar Gateway and its resupply and refueling needs provide the second validation mission scenario: for crew replacement and life-support resupply purposes, a chaser spacecraft is tasked to rendezvous with the Gateway station after being inserted into its nominal NRHO, located at the point in the Earth-Moon system.
The proposed mission schedule for this Lunar Gateway resupply mission is summarized in
Table 8, and the different mission phases and their associated control cost estimates are listed in
Table 9.
Again, the transfer trajectory from Earth may be designed based on the globalized NRHO stable manifold with potentially null associated control cost [
99]. Such design is not considered in this simulation. Instead, to propose a demanding mission scenario, the chaser spacecraft is first inserted into a northern
halo orbit of out-of-plane amplitude of
= 20,000
to demonstrate the capabilities of the proposed GNC architecture even in extreme trajectory design cases. Moreover, this initial
halo orbit may serve as a loitering circuit for a complete multi-spacecraft re-supply chain. The initial conditions of the Gateway station (the target) and the relative state after insertion into the loitering
orbit are:
which corresponds to an initial relative range of 112,967 km.
The transfer shall be accomplished in less than 14 days or
, given that it will be performed using low-thrust propulsion. The continuous SDRE algorithm is used to construct an optimal low-thrust trajectory between the two LPOs in closed loop. The algorithm is defined by the following
Q and
R matrices
Figure 19 and
Figure 20 show the low-thrust transfer and insertion into the nominal target halo orbit in the absolute and relative phase space, respectively. Moreover,
Figure 21 shows the needed control acceleration to accomplish the transfer. A lower control authority bound exists to escape the gravitational well of the Moon, which is found to be
. The optimal transfer trajectory reduces the initial relative range by 99.99%.
The second phase comprises close-range rendezvous and re-supply of the Gateway after a total time of flight of
nondimensional time units or 5.57 days. The selected GNC architecture to fulfill such task is the combination of the MPC scheme with the Opt-MI guidance algorithm, with impulses every 1.07 h. The initial conditions of the phase are
Figure 22 depicts the relative state evolution during this second phase. High-frequency impulsive actuation is clearly noticeable, as also shown in
Figure 23. The minimum required control actuation is
. The final rendezvous error is
, completing the rendezvous and docking.
Before finishing the mission with a retrieval of the chaser spacecraft to the
vicinity, a formation-flying demonstration is performed around the Lunar Gateway at a relative range of ∼
, where the chaser vehicle follows a Lissajous-like relative orbit of amplitudes
,
and frequencies
,
. The SMC controller tracks the desired state evolution for 21 days or
. The initial conditions of the phase are
Figure 24 shows the relative state evolution during this last phase, where the periodic variation of both the position and velocity is noticeable, as well as the transitory secular departure of the relative state from the rendezvous condition to the nominal formation flying relative range. The associated control acceleration is given in
Figure 25.
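The reference that a tracking controller follows in such a formation can be sketched as a two-frequency relative trajectory. The parametrisation below (a circular in-plane component and a sinusoidal out-of-plane component, without phase offsets) is a hypothetical simplification, not the expression used in the simulation, and the numerical values are placeholders.

```python
import math


def lissajous_reference(t, a_xy, a_z, w_xy, w_z):
    """Hypothetical Lissajous-like relative reference: position and velocity."""
    position = [a_xy * math.cos(w_xy * t),
                a_xy * math.sin(w_xy * t),
                a_z * math.sin(w_z * t)]
    # Analytical time derivative of the position, used as the velocity reference
    velocity = [-a_xy * w_xy * math.sin(w_xy * t),
                a_xy * w_xy * math.cos(w_xy * t),
                a_z * w_z * math.cos(w_z * t)]
    return position, velocity


# Placeholder amplitudes and frequencies in nondimensional units
pos0, vel0 = lissajous_reference(0.0, a_xy=2.0, a_z=1.0, w_xy=3.0, w_z=5.0)
```

Supplying the analytical velocity alongside the position avoids differentiating the reference numerically inside the controller, which matters when the tracking law depends on the velocity error.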
Again, the majority of the required budget is allocated to the low-thrust long-range rendezvous, the contributions of the other two phases being negligible in comparison. Moreover, with an appropriate design of the loitering orbit, this control effort may even vanish if heteroclinic connections between the two halo orbits are exploited. Interestingly, a relatively complex formation flying configuration is accomplished nearly for free.