Relative Dynamics and Modern Control Strategies for Rendezvous in Libration Point Orbits

Cuevas del Valle, Sergio; Urrutxua, Hodei; Solano-López, Pablo; Gutierrez-Ramon, Roger; Sugihara, Ahmed Kiyoshi

doi:10.3390/aerospace9120798


by Sergio Cuevas del Valle ¹,*, Hodei Urrutxua ¹, Pablo Solano-López ¹, Roger Gutierrez-Ramon ² and Ahmed Kiyoshi Sugihara ³

¹ Aerospace Systems and Transport Research Group (GISAT-ASTRG), Universidad Rey Juan Carlos, Camino del Molino 5, Fuenlabrada, 28942 Madrid, Spain

² SOKENDAI, ISAS/JAXA, The Graduate University for Advanced Studies, Chuo-ku, Sagamihara-shi 252-5210, Kanagawa-ken, Japan

³ Japan Aerospace Exploration Agency (JAXA), Institute of Space and Astronautical Science (ISAS), Chuo-ku, Sagamihara-shi 252-5210, Kanagawa-ken, Japan

* Author to whom correspondence should be addressed.

Aerospace 2022, 9(12), 798; https://doi.org/10.3390/aerospace9120798

Submission received: 25 October 2022 / Revised: 29 November 2022 / Accepted: 2 December 2022 / Published: 5 December 2022

(This article belongs to the Section Astronautics & Space Science)


Abstract

Deep space missions have recently attracted increasing interest from space agencies and industry, the foremost example being the establishment of a permanent station in cislunar orbit within this decade. To that end, autonomous rendezvous and docking in multi-body dynamical environments have been identified as crucial technologies to expand and maintain human space activities beyond near-Earth orbit. Based on analytical and numerical formulations of the relative dynamics in the Circular Restricted Three-Body Problem (CR3BP), a family of optimal, linear and nonlinear, continuous and impulsive guidance and control techniques is developed for the design of end-to-end rendezvous trajectories between co-orbiting spacecraft in this multi-body dynamical environment. Several modern control techniques are designed and adapted to this problem, with particular emphasis on the design of low-cost rendezvous manoeuvres. Finally, the designed hybrid rendezvous strategies, which combine discrete and continuous control techniques, are tested and validated in several start-to-end deep space testbench mission scenarios, where their performance is compared and quantitatively assessed with a set of performance indices.

Keywords: optimal control; CR3BP; rendezvous; libration point orbits; relative dynamics; nonlinear control; robust control

1. Introduction

The recent proliferation of private and commercial ventures providing affordable access to low-Earth orbit and new crewed space vehicles [1], along with a renewed and increasing interest in deep space missions, not only from space agencies but also from the space industry and other actors, as highlighted in the Global Exploration Roadmap prepared by the International Space Exploration Coordination Group [2,3], shows the need to advance new technologies in order to sustain this momentum and solve the challenges of the space missions currently being proposed. Visual inspection and extravehicular repair activities, spacecraft refueling and lifetime extension, active space debris removal, and resupply and other general on-orbit servicing missions are but a few of the most immediate applications requiring an increasing degree of autonomy and automation, with even more relevance in deep space missions. Along this line, the upcoming establishment of new crewed outposts in cislunar space within the coming years, such as the Lunar Gateway, will require a continued service of resupply and regular crew transportation missions, which would benefit from autonomous capabilities for automated rendezvous, docking and other proximity operations in cislunar space. Such operations have never been attempted to date, and thus remain a major challenge for space exploration programs and missions beyond Earth orbit. Therefore, autonomous guidance, navigation and control capabilities have been identified as key enabling technologies to be developed in the coming years to support the expansion of human space activity beyond Earth orbit.

Libration Point Orbits (LPO) have been identified as ideal locations for the Lunar Gateway and other lunar and deep space exploration related activities. LPOs include periodic and quasi-periodic orbits, which possess highly interesting features due to their ability to maintain continuous communication with the Earth and their lesser station-keeping requirements. However, their inherent orbital instability yields relatively short time scales for divergence if the spacecraft abandons the nominal LPO or is subjected to perturbing accelerations, such as third-body perturbations or solar radiation pressure. Therefore, a lot of effort has been devoted over the last three decades to the problem of trajectory control and station-keeping of LPOs, initiated by the pioneering work of Farquhar [4,5]. Dynamical systems theory has played a fundamental role in the development of such control strategies. In this sense, early attempts exploited the periodic nature of these orbits, allowing for an approach based on Floquet’s theory, which led to the ‘Floquet Mode’ station-keeping strategy for LPOs, first proposed by Wiesel and Shelton [6] and Simo et al. [7], and later applied to the station-keeping of translunar LPOs [8]. Howell and Pernicka [9,10] proposed a new LPO station-keeping strategy by adapting Dwivedi’s approach [11] to the LPO problem, leading to a strategy known as ‘Target Point’. Bai and Junkins [12] proposed a gradient-free computational approach based on a Modified Chebyshev-Picard Iteration method, which provided a simple and lightweight control structure. Hou et al. [13] proposed impulsive control strategies similar to the Floquet approach, whose applicability was extended to the real Earth–Moon system by relying on quasi-periodic orbits referred to as dynamical substitutes. Folta et al. [14] extended the concept of the Target Point strategy to include optimality considerations by combination with a global search method and an orbit continuation method, resulting in a discrete control strategy which was successfully used for operational station-keeping in the ARTEMIS mission [15]. Recently, Jin and Xu [16] proposed a modified strategy for selecting target points for LPO station-keeping to reduce manoeuvre costs in the real Earth–Moon system. Extensions to more complicated multi-body contexts have also been accomplished; Carletta et al. exploited the Hamiltonian formalism to develop a compact linear feedback station-keeping law for LPOs in the Sun–Mars elliptic restricted four-body problem [17].

The aforementioned works proposed effective control strategies that relied heavily on dynamical systems theory and exploited the properties of intrinsic structures of the CR3BP, resulting in impulsive strategies; however, they overlooked the plethora of techniques readily available in both classical and modern control theory, which can be effectively borrowed and adapted to the problem at hand. Along this line, Breakwell et al. [18] were the first to approach the LPO trajectory control problem from a classical control viewpoint by proposing a linear quadratic regulator for station-keeping of a translunar halo orbit. Control schemes of increasing complexity followed. Jones and Bishop [19] developed an output feedback guidance law using $H_{2}$ control theory. Scheeres and Vinh [20] developed a feedback control law based on the local eigenstructure of the LPO, which allowed for oscillatory motions in the center manifold. Luquette and Sanner [21] proposed an adaptive, nonlinear control for orbit maintenance in the vicinity of LPOs. Gurfil and Kasdin proposed a time-varying, continuous, linear quadratic control law, along with an internal disturbance model that rendered a robust disturbance rejection performance [22], and also investigated the early use of neural networks for tracking control and disturbance rejection in this context [23]. Marchand and Howell also developed continuous control strategies of increasing complexity based on linear and non-linear quadratic regulators and input/output feedback linearization [24], as well as through numerical solutions to the optimal control problem [25]. Infeld et al. [26] used Legendre pseudo-spectral methods to numerically solve the fuel-optimal, constrained, non-linear control problem. Kulkarni et al. [27] successfully adapted an $H_{\infty}$ approach to station-keeping control of LPOs. Nazari et al. [28] proposed three control strategies combining continuous LQR control and Floquet theory using periodic control gains; these relied, respectively, on a time-periodic infinite-horizon LQR, a backstepping technique with time-invariant LQR, and a dead-band periodic-gain controller. Lian et al. [29] investigated the use of discrete-time sliding mode control and a discrete-time linear quadratic regulator for station-keeping of real Earth–Moon LPOs (i.e., with a complete Solar System model under a real ephemerides model), resulting in a discrete control suitable for impulsive manoeuvres. Ulybyshev [30] approached the station-keeping problem as an optimization problem, considering pseudoimpulses for discretized orbital segments, thus transforming the station-keeping problem into a large-scale linear programming form, resulting in a long-term station-keeping strategy for quasi-periodic LPOs in the full ephemerides model. Using a simple linear extended state observer, Narula and Biggs [31] extended an LQR control scheme to enable continued tracking in the event of thruster failure and the presence of disturbances; they also demonstrated that, in combination with a sliding mode or an adaptive control, asymptotic tracking could be achieved. Peng et al. [32] demonstrated the robust maintenance of multi-revolution halo orbits in an elliptic restricted three-body problem using a receding horizon control strategy solved by an indirect Radau pseudo-spectral method. Qi and de Ruiter [33] extended the use of backstepping controllers for station-keeping of LPOs under practical navigational and executional constraints, a real ephemerides model and solar radiation pressure.

Héritier and Howell [34] looked into harnessing the natural, multi-body dynamics to minimize the drift of the unstable relative dynamics. Along this line, Xu, Liang and Fu [35] proposed a Hamiltonian structure-preserving control for LPOs, which they successfully extended to the bi-circular four-body problem (i.e., time-periodic dynamics), and later to time-dependent dynamics, such as the relative orbital motion along low-energy transfer trajectories in the CR3BP [36], a path also investigated by Cheng et al. [37]. Jung and Kim [38] also proposed a switching Hamiltonian structure-preserving control. Other ongoing research includes Elliott and Bosanac [39,40], who are looking into LPO station-keeping controllers based on an alternative set of geometric coordinates; Bonasera, Bosanac et al. [41,42], who are looking into machine learning approaches to the LPO station-keeping problem, in particular reinforcement learning; and Gao et al. [43], who are investigating high-order dynamical systems approaches for low-thrust station-keeping of LPOs. A thorough and extensive survey on LPO station-keeping strategies was presented by Shirobokov et al. [44], although it is unfortunately already outdated given the continued developments in these research lines in recent years. Also, although all these publications focus on trajectory control around LPOs, it is worth noting that a few publications have also looked into the general problem of relative orbital motion in the CR3BP, not necessarily bound to the vicinity of equilibrium points [45,46].

Traditional techniques for stability analysis and classical control build on linearization about a reference trajectory, which by extension leads to the problem of close proximity relative orbital motion and formation flight around an LPO. This immediately drew attention to multi-spacecraft formations and applications to interferometry missions and other distributed spacecraft architectures; in fact, many of the aforementioned bibliographic references revolve around such applications. In contrast, the problem of orbital rendezvous between two co-orbital spacecraft within the CR3BP framework seems to have drawn very little attention until recent years. Gerding [47] was the first to consider rendezvous in a CR3BP environment, by proposing a simple, two-impulse rendezvous strategy based on the linearized motion around an LPO. Jones and Bishop [48,49] derived a targeting law for the terminal phase rendezvous, loosely equivalent to the rendezvous application of Hill’s equations in the two-body problem, and constructed a Kalman-based rendezvous navigation filter to supply the targeting law with the chaser vehicle state information. Canalias and Masdemont [50] developed a methodology for the rendezvous of satellites on a Lissajous orbit using the effective phases plane together with a linear approach. However, it was not until well within the last decade that rendezvous in non-Keplerian environments became a topic of renewed interest, following studies related to the placement of the Lunar Gateway on an LPO. Mand [51] investigated simple, impulsive close-rendezvous targeting strategies adapting the line-of-sight corridor and the line-of-sight glide notions, commonly used in Keplerian rendezvous. Ueda and Murakami [52] investigated optimum guidance strategies based on free-drift dynamics, by following safe approach trajectories along invariant manifolds and into safe injection points for rendezvous in an Earth–Moon halo orbit. Along this line, Sato et al. [53] proposed two different strategies for rendezvous on an Earth–Moon $L_{2}$ halo orbit by phasing along the orbit: one where the chaser approaches the target from behind along the orbit, similarly to a rendezvous in a low Earth orbit, and a second one utilizing a homoclinic intersection between an unstable manifold that departs from the halo orbit towards the Moon, where it connects to a stable manifold returning to the halo orbit, thus allowing an increased launch window and flexibility for mission design, and the capability to adjust the time of arrival to the halo orbit with a lower propellant usage. A similar approach, based on finding connecting manifolds on Poincaré maps, was proposed by Lizy-Destrez [54]. Murakami et al. [55] further investigated the problems and requirements regarding guidance, navigation, and control for a rendezvous scenario in an Earth–Moon $L_{2}$ halo.

From 2015 onward, the spotlight shifted to other types of LPO, such as Distant Retrograde Orbits (DRO) and Near-Rectilinear Halo Orbits (NRHO), once these were pointed out to be potentially better locations for the Lunar Gateway. Thus, Murakami and Yamanaka [56] proposed a three-impulse transfer from LEO to various phase points in a certain DRO, including a retrograde, powered lunar gravity assist. Ueda et al. [57] used standard, impulsive strategies to target an arbitrary relative position in halo, DRO and NRHO. Lizy-Destrez [58] provided a safety analysis for close approach rendezvous into an NRHO, and Davis et al. [59] looked into navigation accuracies and noise effects applied to various station-keeping strategies for NRHO, as well as examining the ability to absorb missed burns, construct phasing manoeuvres and conduct rendezvous and proximity operations. In a follow-up work, Lizy-Destrez et al. [60] presented methods and results related to strategies for far and close rendezvous, and compared different linear and nonlinear models for cislunar relative motion; in particular, three- and four-impulse strategies are reviewed for the far-range rendezvous, while for the close-range rendezvous, key concepts and several impulsive targeting strategies available in the literature are reviewed. Blazquez et al. [61] looked at the far rendezvous approaches for NRHO and passively safe drift trajectories under a real ephemerides model, employing multiple-shooting and adaptive receding-horizon targeting algorithms, and more recently, Khoury and Howell [62] also looked into solutions to rendezvous and space loitering problems on NRHO and DRO type orbits. Bucchioni, in a series of works, focused on GNC for phasing and rendezvous under 6 DoF models [63,64,65].

It is striking, though, that all the aforementioned rendezvous strategies and targeting methods are either impulsive strategies based on exploiting invariant manifolds or impulsive strategies for the close range based on different flavours of linearized motion. Despite the vast literature dedicated to continuous-thrust station-keeping and formation flight control around LPOs, our literature review only revealed a handful of publications that proposed a continuous control for rendezvous under non-Keplerian dynamics; in particular, Ulybyshev et al. [66] presented an optimal method for low-thrust rendezvous trajectory design in the vicinity of a lunar $L_{2}$ orbit under a full ephemerides model by adapting their LPO station-keeping strategy based on pseudoimpulses distributed along discretized orbital segments, yielding a fully numerical, linear programming problem [30]. Sanchez et al. proposed both impulsive and continuous Model Predictive Control (MPC) schemes for constrained, robust close-range docking in NRHO scenarios [67], albeit using a costly dynamics formulation based on a Local Vertical Local Horizontal reference frame. Another interesting contribution is the work by Colagrossi et al. [68], who studied the rendezvous and coupling in non-Keplerian orbits accounting for the orbit-attitude coupling and flexible modes of the structure of a very large space station, and more recently have also employed vision-based state navigation techniques for complete attitude-orbital state estimation and control [69]. We therefore found it surprising that, despite the wealth of literature devoted to the investigation of continuous-thrust LPO station-keeping and formation flight strategies built upon classical and modern control theory, and with the only exceptions of Refs. [66,67,68,69], to the best of our knowledge none of the reviewed continuous controllers has been investigated or adapted for the analysis of rendezvous trajectories at LPOs. It was precisely this realization that served as motivation for the present work. The main contributions of this manuscript are summarized in the following paragraph.

This work first revisits both analytical and numerical approaches to the problem of relative motion in the CR3BP; in particular: (1) the analytical linearization of motion around LPOs is approached differently from other literature sources, exploiting the natural dynamics for near-libration-point scenarios and resulting in a linear time-invariant (LTI) relative motion model to be exploited for GNC; and (2) a highly accurate numerical calculation scheme is presented for the relative motion in the CR3BP based on Encke’s method. Based on the developed analytical framework, a family of optimal, linear and nonlinear, impulsive and continuous control and guidance techniques is reviewed and developed to exploit the multi-body context and its intrinsic structures for far-range rendezvous and proximity operations in the CR3BP. Firstly, the optimal constrained rendezvous problem in the form of Bolza is introduced, and several guidance techniques are developed for successful optimal relative trajectory design. Both classical impulsive and continuous strategies, such as the LQR/SDRE or the two-impulse rendezvous scheme, are reviewed and compared in LPO missions. In addition, our work introduces the possibility of exploiting the presented low-cost LTI relative motion models for direct linear control synthesis, independent of the target motion. Secondly, two novel rendezvous techniques are presented: a numerical impulsive planning algorithm is designed based on classical launcher staging theory, yielding a recursive solution for the optimal thrusting directions. Moreover, the Augmented Lagrangian Iterative LQR (AL-iLQR) is formulated for our CR3BP rendezvous problem, for both continuous and impulsive missions. Thirdly, all presented guidance techniques are hybridized with inner compensator loops, implemented through Model Predictive Control and first-order Sliding Mode Control, to provide robust performance against uncertainties and unmodelled dynamics. Finally, the proposed dynamical models and the developed rendezvous techniques are validated in two realistic simulation scenarios for the James Webb Space Telescope and the future Lunar Gateway.

The remainder of this manuscript is organized as follows. In Section 2, both the absolute and relative dynamics in the CR3BP are briefly revisited, and novel approaches to both the precise numerical integration of the relative dynamics and the linearization of relative motion are introduced. Some preliminaries on Control Theory are also highlighted. Sections 3 and 4 pose the general optimal rendezvous problem and comprise the design of several guidance techniques for general rendezvous trajectory planning between spacecraft in co-orbital motion. Both continuous and impulsive guidance algorithms are either revisited or presented, and a comparison between these techniques is also provided using specific performance indices. Section 5 combines the proposed guidance cores with several modern, robust, continuous and impulsive controllers for general reference tracking of optimal rendezvous trajectories, and a performance analysis is again carried out for the overall guidance and control loop. In Section 6, all the trajectory design and control techniques presented in this paper are tested and validated on two real-case mission scenarios, where these techniques are employed and combined. Finally, Section 7 summarizes the main conclusions of this work.

2. Relative Dynamics in the Circular Restricted Three-Body Problem

2.1. Definitions and Governing Equations

The governing equations of motion considered in this study are framed within the Circular Restricted Three-Body Problem (CR3BP), where two celestial bodies of masses $M_{i}$, hereafter referred to as primaries, revolve around their common barycenter under their mutual gravitation, thus describing a circular motion; a third body, i.e., a spacecraft, is considered with a comparatively negligible mass $m ≪ M_{i}$, such that its motion is driven by the gravitational interaction with the primaries, whereas the primaries are not affected by the presence of the spacecraft. The motion of the spacecraft is naturally studied in a rotating, synodic frame $S$, such that the primaries occupy fixed positions on the $x_{S}$ axis, the $z_{S}$ axis is set perpendicular to the orbital plane of the primaries (along the angular momentum of the system), and the $y_{S}$ axis completes a dextral frame. The unit vectors $\{i, j, k\}$ are defined, respectively, along each of the coordinate axes of the synodic frame. This is illustrated in Figure 1, which depicts the definition of the synodic frame with respect to the inertial frame $I$. The latter is arbitrarily defined to be aligned with $S$ at the reference epoch $t_{0} = 0$, although other choices are possible.

It is customary to use dimensionless coordinates, such that the distance between the primaries is used as the characteristic length, the characteristic time is chosen so that the orbital period of the primaries around their common barycenter equals $2π$, and the masses of the primaries are referred to the total mass of the system. In these dimensionless coordinates, the primaries revolve at one radian per unit of dimensionless time, the reduced mass of $M_{2}$ can be defined as $μ = \frac{M_{2}}{M_{1} + M_{2}}$, and the primaries are set a unit length apart from each other, so their position vectors (which are constant in synodic frame coordinates) can be conveniently written as

R_{1} = - μ i, R_{2} = (1 - μ) i .

(1)

Under the aforementioned assumptions, the motion of the spacecraft is governed by the following second-order ordinary differential equation

\ddot{r} + 2 ω \times \dot{r} + ω \times (ω \times r) = - (1 - μ) \frac{r - R_{1}}{∥ r - R_{1} ∥^{3}} - μ \frac{r - R_{2}}{∥ r - R_{2} ∥^{3}},

where $ω$ is the dimensionless angular velocity vector of the synodic frame $S$ with respect to the inertial frame $I$, $r$ is the dimensionless spacecraft position vector as measured by an observer in the synodic frame, and the overhead dot ($\dot{□}$) indicates derivatives with respect to the non-dimensional time.
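As an illustration, the equation of motion above can be evaluated componentwise. The following Python sketch assumes that the angular velocity is $ω = k$ in dimensionless units (one radian per unit time about the $z_{S}$ axis) and uses the primary positions of Equation (1); the function name `cr3bp_acceleration` and the value of μ in the usage lines are illustrative choices, not part of the original formulation.

```python
import math

def cr3bp_acceleration(r, v, mu):
    """Synodic-frame acceleration of the CR3BP in dimensionless units.

    r, v: position and velocity (x, y, z) in the synodic frame.
    mu:   reduced mass M2 / (M1 + M2).
    """
    x, y, z = r
    vx, vy, vz = v
    # Distances to the primaries at R1 = (-mu, 0, 0) and R2 = (1 - mu, 0, 0).
    d1 = math.sqrt((x + mu) ** 2 + y ** 2 + z ** 2)
    d2 = math.sqrt((x - 1.0 + mu) ** 2 + y ** 2 + z ** 2)
    g1 = (1.0 - mu) / d1 ** 3
    g2 = mu / d2 ** 3
    # Assuming omega = k: the Coriolis term 2 omega x v is (-2 vy, 2 vx, 0)
    # and the centrifugal term omega x (omega x r) is (-x, -y, 0).
    # Both are moved to the right-hand side here.
    ax = 2.0 * vy + x - g1 * (x + mu) - g2 * (x - 1.0 + mu)
    ay = -2.0 * vx + y - g1 * y - g2 * y
    az = -g1 * z - g2 * z
    return (ax, ay, az)

# Usage: at a triangular equilibrium point the acceleration of a particle
# at rest should vanish up to rounding (mu below is an illustrative value).
mu_example = 0.01215
residual = cr3bp_acceleration(
    (0.5 - mu_example, math.sqrt(3.0) / 2.0, 0.0), (0.0, 0.0, 0.0), mu_example
)
```

Evaluating the function at a known equilibrium, as in the last lines, is a convenient sanity check on the signs of the Coriolis and centrifugal terms.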

In the following, the relative motion between two spacecraft, termed as target and chaser (following the usual nomenclature), will be addressed in the CR3BP framework. To this end, let us consider not one, but two spacecraft modeled as point masses, whose mutual gravity interaction is neglected. The target spacecraft is assumed to be cooperative and passive (i.e., with no orbital manoeuvring capabilities), and its orbit to be accurately known.

Let vectors $r_{t}$ and $r_{c}$ denote the position of the target and chaser spacecraft, respectively, so their relative position vector is defined as $ρ = r_{c} - r_{t}$. Differentiating twice with respect to time and including the appropriate inertial terms, the following governing equations are obtained:

\begin{aligned}
\ddot{x} - 2 \dot{y} - x &= (1 - μ) \left( \frac{ξ + μ}{∥ r_{t} - R_{1} ∥^{3}} - \frac{x + ξ + μ}{∥ ρ + r_{t} - R_{1} ∥^{3}} \right) + μ \left( \frac{ξ - 1 + μ}{∥ r_{t} - R_{2} ∥^{3}} - \frac{x + ξ - 1 + μ}{∥ ρ + r_{t} - R_{2} ∥^{3}} \right) + u_{x}, \\
\ddot{y} + 2 \dot{x} - y &= (1 - μ) \left( \frac{η}{∥ r_{t} - R_{1} ∥^{3}} - \frac{y + η}{∥ ρ + r_{t} - R_{1} ∥^{3}} \right) + μ \left( \frac{η}{∥ r_{t} - R_{2} ∥^{3}} - \frac{y + η}{∥ ρ + r_{t} - R_{2} ∥^{3}} \right) + u_{y}, \\
\ddot{z} &= (1 - μ) \left( \frac{ζ}{∥ r_{t} - R_{1} ∥^{3}} - \frac{z + ζ}{∥ ρ + r_{t} - R_{1} ∥^{3}} \right) + μ \left( \frac{ζ}{∥ r_{t} - R_{2} ∥^{3}} - \frac{z + ζ}{∥ ρ + r_{t} - R_{2} ∥^{3}} \right) + u_{z},
\end{aligned}

(2)

where $(x, y, z)$ are the synodic frame coordinates of the relative position vector $ρ$, $(ξ, η, ζ)$ are the synodic frame coordinates of the target spacecraft position vector $r_{t}$, and the control acceleration $u = u_{x} i + u_{y} j + u_{z} k$ has been introduced for later use. Equation (2) can be compactly written as the following first-order, control-affine nonlinear system

\dot{s} = f (μ, s, r_{t}) + u .

The relative motion model is closed assuming the target spacecraft ephemerides are known.

The State Transition Matrix (STM), $Φ (t, t_{0})$, can also be computed from the variational equations stemming from Equation (2). Both the stability of the flow and its stroboscopic map are available from the propagation of the STM of the system along a reference trajectory, which satisfies

\begin{matrix} \frac{d}{d t} Φ (t, t_{0}) = J Φ (t, t_{0}) \\ Φ (t_{0}, t_{0}) = I \end{matrix}

with $J$ the Jacobian of the general dynamics vector field $f (μ, s, r_{t})$. For LTI dynamical systems, $f (μ, s, r_{t}) = A s$ and thus $J = A$, so the STM is explicitly given by

Φ (t, t_{0}) = e^{A (t - t_{0})}

where the exponential denotes the matrix exponential.
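A minimal sketch of how the variational equations can be propagated numerically is given below. It integrates $\frac{d}{dt} Φ = J Φ$ with a classical fourth-order Runge-Kutta scheme and compares the result with the closed-form matrix exponential of a hypothetical constant Jacobian. The helper names (`mat_mul`, `propagate_stm`), the step count and the test matrix are assumptions made for illustration only; in an actual CR3BP application $J$ would be evaluated along the reference trajectory.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def propagate_stm(jacobian, t0, t1, steps=2000):
    """Integrate dPhi/dt = J(t) Phi with Phi(t0, t0) = I using classical RK4.

    jacobian: callable returning the Jacobian (nested lists) at time t.
    """
    n = len(jacobian(t0))
    phi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = (t1 - t0) / steps

    def shifted(p, k, s):
        # Returns p + s * k, element by element.
        return [[p[i][j] + s * k[i][j] for j in range(n)] for i in range(n)]

    t = t0
    for _ in range(steps):
        k1 = mat_mul(jacobian(t), phi)
        k2 = mat_mul(jacobian(t + 0.5 * h), shifted(phi, k1, 0.5 * h))
        k3 = mat_mul(jacobian(t + 0.5 * h), shifted(phi, k2, 0.5 * h))
        k4 = mat_mul(jacobian(t + h), shifted(phi, k3, h))
        for i in range(n):
            for j in range(n):
                phi[i][j] += h / 6.0 * (
                    k1[i][j] + 2.0 * k2[i][j] + 2.0 * k3[i][j] + k4[i][j]
                )
        t += h
    return phi

# Hypothetical LTI test case: J = A = [[0, 1], [-w^2, 0]], whose matrix
# exponential is known in closed form.
w = 1.5
A = [[0.0, 1.0], [-w * w, 0.0]]
phi_num = propagate_stm(lambda t: A, 0.0, 2.0)
phi_exact = [[math.cos(2.0 * w), math.sin(2.0 * w) / w],
             [-w * math.sin(2.0 * w), math.cos(2.0 * w)]]
```

For a time-varying Jacobian only the `jacobian` callable changes; the integration loop is unaffected.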

2.2. Encke’s Formulation of the CR3BP Relative Dynamics

The lack of an analytical solution to the CR3BP makes numerical propagation necessary. However, Equation (2) requires some caution when evaluating its right-hand side: the difference in order of magnitude between $ρ$ and $r_{t} - R_{i}$ yields a numerical error (from the subtraction of nearly equal terms) which quickly degrades the integration process, especially in close proximity operations scenarios. Following [70,71], an Encke formulation [72] of the dynamics vector field can be derived and introduced for multi-body dynamics, which is better suited from the numerical viewpoint:

\ddot{ρ} + 2 ω \times \dot{ρ} + ω \times (ω \times ρ) = \sum_{i = 1}^{2} \frac{- μ_{i}}{∥ r_{t} - R_{i} ∥^{3}} \left( f (q_{i}) (r_{t} - R_{i}) + (1 + f (q_{i})) ρ \right)

(3)

with

q_{i} = - \frac{2 (r_{t} - R_{i}) \cdot ρ + ρ \cdot ρ}{∥ r_{t} - R_{i} + ρ ∥^{2}}, \quad f (q_{i}) = q_{i} \frac{3 (1 + q_{i}) + q_{i}^{2}}{1 + (1 + q_{i})^{3 / 2}} ,

(4)

where $μ_{1} = 1 - μ$ and $μ_{2} = μ$ are the dimensionless gravitational parameters of the primaries, as follows from Equation (2). Integrating the latter set of equations provides the synodic frame coordinates of the relative position vector $ρ$. This approach is used as the backbone integration algorithm in this work, yielding the same dynamics as Equation (2) while providing enhanced numerical accuracy with respect to the true chaser motion solution. This form of the dynamics is also leveraged in the integration of the STM, which benefits from the enhanced numerical properties of the scheme.
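The following Python sketch evaluates the gravitational right-hand side of Equation (3) and compares it with the direct difference appearing in Equation (2). It assumes $μ_{1} = 1 - μ$ and $μ_{2} = μ$, as implied by Equation (2); the function names, the target position and the separation used in the example are hypothetical.

```python
import math

def encke_f(q):
    """f(q) of Equation (4); algebraically equal to (1 + q)**1.5 - 1."""
    return q * (3.0 * (1.0 + q) + q * q) / (1.0 + (1.0 + q) ** 1.5)

def _primaries(mu):
    # (mu_i, R_i) pairs; mu_1 = 1 - mu and mu_2 = mu are assumed from Equation (2).
    return ((1.0 - mu, (-mu, 0.0, 0.0)), (mu, (1.0 - mu, 0.0, 0.0)))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encke_gravity(rho, r_t, mu):
    """Right-hand side of Equation (3): differential gravity in Encke form."""
    acc = [0.0, 0.0, 0.0]
    for mu_i, R_i in _primaries(mu):
        d = [r_t[k] - R_i[k] for k in range(3)]   # target relative to primary i
        c = [d[k] + rho[k] for k in range(3)]     # chaser relative to primary i
        q = -(2.0 * _dot(d, rho) + _dot(rho, rho)) / _dot(c, c)
        f = encke_f(q)
        kappa = mu_i / _dot(d, d) ** 1.5
        for k in range(3):
            acc[k] -= kappa * (f * d[k] + (1.0 + f) * rho[k])
    return acc

def direct_gravity(rho, r_t, mu):
    """Same quantity from the direct difference used in Equation (2)."""
    acc = [0.0, 0.0, 0.0]
    for mu_i, R_i in _primaries(mu):
        d = [r_t[k] - R_i[k] for k in range(3)]
        c = [d[k] + rho[k] for k in range(3)]
        nd3 = _dot(d, d) ** 1.5
        nc3 = _dot(c, c) ** 1.5
        for k in range(3):
            acc[k] += mu_i * (d[k] / nd3 - c[k] / nc3)
    return acc

# Hypothetical example: both forms should agree for a moderate separation.
mu = 0.01215
r_t = (1.1, 0.02, 0.05)
rho = (1.0e-3, -2.0e-3, 5.0e-4)
a_encke = encke_gravity(rho, r_t, mu)
a_direct = direct_gravity(rho, r_t, mu)
```

The two functions are algebraically identical; the Encke form avoids subtracting two nearly equal vectors, which is where the direct form loses digits as $∥ρ∥$ shrinks relative to $∥r_{t} - R_{i}∥$.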

For demonstration purposes, a numerical test between the propagators (using either Newton’s or Encke’s formulation) is presented in Figure 2 for a standard halo orbit with an out-of-plane amplitude of 20,000 km. The comparison addresses the $\ell^{2}$-norm absolute error of the double-precision relative state vector, integrated using Equations (2) and (3), with respect to a quadruple-precision baseline obtained through integration of Equation (2) with the most stringent relative and absolute tolerances. The enhanced numerical performance and stability of the Encke propagator are clear when compared to the classical Newtonian results, with an improvement of an order of magnitude in numerical precision over the whole integration span considered.

2.3. Linearized Models

In relative orbital motion it is customary to employ linearization techniques for approximating the dynamics in the neighborhood of the target spacecraft. Retaining up to first-order terms in Equation (2) yields the Rendezvous Linear Model (RLM) [45,52,62], which can be compactly expressed in matrix form as:

\dot{s} = A s + B u = [\begin{matrix} \dot{ρ} \\ \ddot{ρ} \end{matrix}] = [\begin{matrix} 0_{3 \times 3} & I_{3 \times 3} \\ Σ & Ω \end{matrix}] [\begin{matrix} ρ \\ \dot{ρ} \end{matrix}] + [\begin{matrix} 0_{3 \times 3} \\ I_{3 \times 3} \end{matrix}] u,

(5)

where $0_{3 \times 3}$ and $I_{3 \times 3}$ denote the 3-dimensional null and identity matrices, respectively; the Coriolis acceleration term $Ω$ reads

Ω = [\begin{matrix} 0 & 2 & 0 \\ - 2 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}]

and the Hessian matrix $Σ$ can be computed as

Σ = \mathrm{diag} (1, 1, 0) - (κ_{1} + κ_{2}) I_{3 \times 3} + 3 κ_{1} (e_{1} \otimes e_{1}) + 3 κ_{2} (e_{2} \otimes e_{2}),

where the term $\mathrm{diag} (1, 1, 0)$ is the centrifugal contribution $- ω \times (ω \times ρ)$ carried over from the left-hand side of Equation (2), the operator ⊗ denotes the dyadic product, $e_{i}$ are unit vectors pointing from the i-th primary to the target spacecraft

e_{i} = \frac{r_{t} - R_{i}}{∥ r_{t} - R_{i} ∥}, i = 1, 2

and $κ_{i}$ are coefficients defined as

κ_{i} = \frac{μ_{i}}{∥ r_{t} - R_{i} ∥^{3}}, i = 1, 2 ,

with $μ_{1} = 1 - μ$ and $μ_{2} = μ$.

Note that the Rendezvous Linear Model is time-dependent, since the matrix $Σ$ depends on the target spacecraft location $r_{t}$. Consequently, the stability of the model cannot be easily assessed, as the eigenvalues of the system matrix depend explicitly on the target spacecraft ephemerides.
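To make the dependence on the target position concrete, the sketch below assembles $Σ$ and the system matrix $A$ of Equation (5) for a given $r_{t}$. Two assumptions of this sketch should be noted: $μ_{1} = 1 - μ$ and $μ_{2} = μ$, as implied by Equation (2), and the centrifugal contribution $\mathrm{diag}(1, 1, 0)$ from the left-hand side of Equation (2) is included in the position block, so that the linearization is consistent with Equation (2). The function names and numerical values are hypothetical.

```python
import math

def rlm_sigma(r_t, mu):
    """Position block Sigma of the Rendezvous Linear Model at target position r_t."""
    primaries = ((1.0 - mu, (-mu, 0.0, 0.0)), (mu, (1.0 - mu, 0.0, 0.0)))
    # Centrifugal contribution -omega x (omega x rho) with omega = k
    # (assumption of this sketch, see the accompanying text).
    sigma = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    for mu_i, R_i in primaries:
        d = [r_t[k] - R_i[k] for k in range(3)]
        dist = math.sqrt(sum(c * c for c in d))
        e = [c / dist for c in d]          # unit vector from primary i to target
        kappa = mu_i / dist ** 3
        for i in range(3):
            for j in range(3):
                # Gravity gradient: kappa * (3 e e^T - I).
                sigma[i][j] += 3.0 * kappa * e[i] * e[j] - (kappa if i == j else 0.0)
    return sigma

def rlm_system_matrix(r_t, mu):
    """Assemble A = [[0, I], [Sigma, Omega]] of Equation (5) as a 6x6 nested list."""
    sigma = rlm_sigma(r_t, mu)
    omega = [[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    top = [[0.0] * 3 + [1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    bottom = [sigma[i] + omega[i] for i in range(3)]
    return top + bottom

# Hypothetical usage: a target located on the x_S axis.
A_example = rlm_system_matrix((1.15, 0.0, 0.0), 0.01215)
```

Since `rlm_system_matrix` must be re-evaluated at every target position along the orbit, the resulting model is linear but time-varying.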

While the original RLM model was derived as a linearization of the true relative dynamics around the target position vector, in this work we introduce the Relative Libration Linear Model (RLLM), which considers relative motion near the collinear libration points, assuming both the target and chaser spacecraft to be on an LPO. In such cases,

Σ

is expected to become constant or (quasi-)periodic, respectively. In fact, after linearization,

Σ

can be shown to read [73,74]

\begin{matrix} Σ = (\begin{matrix} 1 + 2 c_{2} & 0 & 0 \\ 0 & 1 - c_{2} & 0 \\ 0 & 0 & - c_{2} \end{matrix}) . \end{matrix}

(6)

The fundamental frequency

c_{2}

is recognized to be that introduced by Richardson, which depends on the libration point that both spacecraft orbit, as defined in [75]. As can be seen, the relative dynamics are therefore decoupled from the exact chaser and target trajectories while inheriting the phase space solution structure of the absolute dynamics [74]. This provides an LTI relative dynamics model (RLLM-LTI), which may be useful in the design of classical linear control schemes.
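Because the RLLM-LTI system matrix is constant, its stability can be read directly from its eigenvalues. The sketch below builds A from Equation (6) and extracts the unstable rate and the out-of-plane frequency. The value of `c2` is purely illustrative (of the order expected near the Earth-Moon L2 point) and the function name `rllm_matrix` is a hypothetical helper.

```python
import numpy as np

def rllm_matrix(c2):
    """System matrix of the RLLM-LTI model for a given Richardson constant c2."""
    Sigma = np.diag([1.0 + 2.0 * c2, 1.0 - c2, -c2])
    Omega = np.array([[0.0, 2.0, 0.0],
                      [-2.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)],
                     [Sigma, Omega]])

c2 = 3.19  # illustrative value only, not taken from the source
A = rllm_matrix(c2)
eigvals = np.linalg.eigvals(A)

# For c2 > 1 the in-plane motion has one real unstable/stable pair
unstable_rate = eigvals.real.max()

# The out-of-plane motion is a decoupled harmonic oscillator
out_of_plane_freq = np.sqrt(c2)
```

For c2 > 1 the in-plane characteristic polynomial λ⁴ + (2 - c2)λ² + (1 + 2c2)(1 - c2) has one positive root in λ², so the model reproduces the saddle × center × center structure of the collinear points.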

2.4. Discrete and Controlled Dynamics

In the design of the guidance and control schemes, the following form of the state flow, as given by Lagrange’s formula, will usually be leveraged

s (t) = Φ (t, t_{0}) s_{0} + \int_{t_{0}}^{t} Φ (t, τ) B u (τ) d τ .

(7)

The first term of the right hand side is the homogeneous solution of the dynamics, given by the discrete mapping of the initial conditions to the epoch of interest t. Such discrete mapping is defined by the STM of the system, which can be numerically computed by integrating the variational equations along with a reference phase space trajectory

ϕ (t; s_{0})

. The convolutional, integral term constitutes the effect of the control action

u (t)

through the control input matrix B on the dynamics.

In astrodynamics applications, a discrete or impulsive manoeuvre sequence is usually conceived as a feasible, natural and straightforward control strategy. Particularising Lagrange’s formula for such action plan

u (t) = \sum_{i} U_{i} δ (t - t_{i})

yields the following result

s (t) = Φ (t, t_{0}) s_{0} + \sum_{i = 1}^{N} Φ (t, t_{i}) B U_{i},

(8)

where the

δ (t - t_{i})

is Dirac’s delta generalized function. Introducing the time shifting property of the STM, the above result can be further expanded as

s (t) = Φ (t, t_{0}) s_{0} + \sum_{i = 1}^{N} Φ (t, t_{0}) Φ {(t_{i}, t_{0})}^{- 1} B U_{i} .

Some of the guidance techniques developed in Section 4 are founded on the premise of discrete dynamics. The classical results just presented can be used to construct a discrete map

s (t_{i + 1}) = F (s (t_{i}))

for a given discrete time sequence

0, t_{i}, t_{i + 1} \dots

, under the action of both continuous and discrete control functions. For the latter case,

s_{i + 1} = Φ (t_{i + 1}, t_{0}) Φ {(t_{i}, t_{0})}^{- 1} s_{i} + Φ (t_{i + 1}, t_{0}) Φ {(t_{i}, t_{0})}^{- 1} B U_{i} .

For continuous control inputs, some assumption shall be made with respect to the control action between time steps

Δ t = t_{i + 1} - t_{i}

. This work employs a zero-order hold between instants and a trapezoidal quadrature of the convolution integral, giving

s_{i + 1} = Φ (t_{i + 1}, t_{0}) Φ {(t_{i}, t_{0})}^{- 1} s_{i} + \frac{Δ t}{2} [I + Φ (t_{i + 1}, t_{0}) Φ {(t_{i}, t_{0})}^{- 1}] B u_{i} .
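The two discrete maps above can be sketched as follows for an LTI model, for which Φ(t_{i+1}, t_0) Φ(t_i, t_0)^{-1} reduces to exp(A Δt). The matrix exponential helper `expm_taylor`, the constant `c2` and the control vector `u` are assumptions made only for this illustration; the initial relative state is the one used in the example scenario of this work.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = M / 2.0**squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def rllm_matrix(c2):
    Sigma = np.diag([1.0 + 2.0 * c2, 1.0 - c2, -c2])
    Omega = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [Sigma, Omega]])

def impulsive_step(Phi_step, B, s_i, U_i):
    """Impulse U_i applied at t_i, then coasting to t_{i+1}."""
    return Phi_step @ (s_i + B @ U_i)

def zoh_step(Phi_step, B, s_i, u_i, dt):
    """Zero-order hold with a trapezoidal quadrature of the convolution."""
    n = Phi_step.shape[0]
    return Phi_step @ s_i + 0.5 * dt * (np.eye(n) + Phi_step) @ B @ u_i

c2 = 3.19                      # illustrative constant
A = rllm_matrix(c2)
B = np.vstack([np.zeros((3, 3)), np.eye(3)])
dt = 1e-2
Phi_step = expm_taylor(A * dt)

s0 = np.array([0.01262, -0.02160, 0.02351, -0.00346, -0.02961, -0.02985])
u = np.array([1e-2, -2e-2, 5e-3])   # hypothetical control vector

s_imp = impulsive_step(Phi_step, B, s0, u)
s_zoh = zoh_step(Phi_step, B, s0, u, dt)

# Reference: exact zero-order-hold response from an augmented exponential
M_aug = np.zeros((7, 7))
M_aug[:6, :6] = A
M_aug[:6, 6] = B @ u
s_exact = expm_taylor(M_aug * dt)[:6, :] @ np.append(s0, 1.0)
```

For small Δt the trapezoidal step stays close to the exact zero-order-hold response, while avoiding the integral of the STM.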

3. Optimal Impulsive Guidance

This section introduces several impulsive optimal guidance schemes for general proximity operations trajectory design. Moreover, the different techniques are compared against each other in a homing and rendezvous mission scenario.

The use of the term guidance in this context applies to the computation of both a reference state trajectory

s_{ref}

and control input

u_{ref}

. The guidance core will then be cascaded with an inner control or compensator loop, whose techniques are described in Section 5.

All guidance laws presented in this work can be either used online, in feedback manner, or computed offline and explicitly stored and regressed over an independent variable of interest, such as time, for online GNC purposes.

3.1. The Rendezvous Problem

In practice, most proximity operations and regular orbital control activities can be formulated as an optimal rendezvous problem, involving either two real spacecraft or the relative dynamics of a given vehicle with respect to some virtual object. This treatment motivates the formulation of the Rendezvous Problem.

The Rendezvous Problem is formally stated as a path-constrained two-point boundary value problem, where a finite control law

u (t)

is sought such that the chaser spacecraft follows a relative motion that takes it to the origin of the relative phase space after some finite time of flight

t_{f}

, thus mathematically fulfilling the Rendezvous Condition, namely

ρ (t_{f}) = \dot{ρ} (t_{f}) = 0

. Hence, the Rendezvous Problem can be compactly stated as the following optimal control problem in the form of Bolza:

\begin{matrix} \underset{u \in R^{3} \times R^{+}}{\arg \min} & J = G (s (t_{f}), s (0), t_{f}, t_{0}) + \int_{t_{0}}^{t_{f}} l (s, u, t) d t \\ subject to & \dot{s} = f (μ, s, S_{t}) + u, \\ s (t_{0}) = s_{0}, \\ s (t_{f}) = 0, \\ g (μ, s, S_{t}) = 0, \\ h (μ, s, S_{t}) < 0, \\ u_{\min} \leq {∥ u ∥}^{p} \leq u_{\max} \end{matrix}

(9)

where

u_{\min}

,

u_{\max}

saturate the control function p-norm

{∥ u ∥}^{p}

and

f (μ, s, S_{t})

is an appropriate model of the relative dynamics.

3.2. Surrogate Optimization and Backward-Forward Sweep

The majority of the guidance techniques presented in this section benefit from surrogate relative motion models when addressing the Rendezvous Problem, mainly based on an appropriate linearization of true nonlinear dynamics and the corresponding STM

Φ (t, t_{0})

. Moreover, when not built upon an analytical model,

Φ (t, t_{0})

is at best obtained by numerically integrating the linear variational equations along the flow

ϕ (t, s_{0}, u)

.

Therefore, any control policy

Π (t)

computed under such dynamics is not guaranteed to rendezvous the relative state vector under the true nonlinear field. In general, some form of feedback is needed to successively refine

Π (t)

and

Φ (t, t_{0})

to comply with the nonlinear relative motion dynamics. This is achieved through iterating a backward-forward pass or sweep structure until convergence:

Along the backward pass, the guidance control policy $Π (t)$ is solved for under an appropriate approximation of the true nonlinear dynamics, such as the discrete map given by the current estimate of the STM $Φ (t, t_{0})$ .
In the forward pass or rollout, the converged input $Π (t)$ is used to re-integrate to higher accuracy and refine the state flow $ϕ (t, s_{0}, Π (t))$ , dictating the approximate dynamics in the backward process (for example, the estimate of the STM).
For offline guidance, either direct iteration or Newton-Raphson differential correctors [76] are used to re-evaluate the STM $Φ (t, t_{0})$ along the flow $ϕ (t, s_{0}, Π)$ and update the nominal control sequence $Π (t)$ . Online guidance is based on the Model Predictive Control (MPC) approach described in Section 5, in which the true nonlinearities and uncertainties unmodelled in the surrogate guidance model are accommodated through a time-receding horizon scheme under the true plant dynamics.

3.3. Two-Impulse Guidance

Given the linearized dynamics in Equation (8), proximity operations trajectory design can be determined through planning the impulsive control sequence

(U_{i}, t_{i})

with respect to some optimality policy, usually penalizing both integral loss functions of the control effort and error to the desired state

s_{d}

.

Although a single impulse suffices to nullify the relative range to target after some time of flight

t_{f}

, two impulses (

N = 2

) are mandatory to fully regulate the 6-dimensional relative state

s

(rendezvous the two spacecraft). Thus, the linear two-impulse (TI) rendezvous scheme, widely used within the context of Keplerian motion since the 1960s [77] and suggested in previous investigations [51], is here explicitly adapted to the context of the CR3BP for comparison purposes against more complex strategies. Moreover, compared to previous literature, our scheme can profit from direct LTI dynamics (lower computational cost in the integration process) and a more numerically accurate integration process, as introduced in Section 2.

The backward pass of the TI scheme is given by evaluating Equation (8) with a first impulse at

t_{0}

,

Δ V_{1}

, and a second impulse at

t_{f}

,

Δ V_{2}

, which provides the final state vector. Enforcing the Rendezvous Condition

s (t_{f}) = 0 \in R^{6}

and solving for the impulses yields the following solution:

Δ V_{1} = - Φ_{ρ \dot{ρ}}^{- 1} [\begin{matrix} Φ_{ρ ρ} & Φ_{ρ \dot{ρ}} \end{matrix}] s_{0} = - Φ_{ρ \dot{ρ}}^{- 1} ρ (t_{f}),

(10)

Δ V_{2} = - Φ_{\dot{ρ} \dot{ρ}}^{- 1} [\begin{matrix} Φ_{\dot{ρ} ρ} & Φ_{\dot{ρ} \dot{ρ}} \end{matrix}] s_{0} = - Φ_{\dot{ρ} \dot{ρ}}^{- 1} \dot{ρ} (t_{f});

(11)

where the subscripts of

Φ

indicate the corresponding partitions of the STM following the usual notation.

The forward pass may be implemented using either MPC or classical differential correctors. In the latter case, as already stated, these expressions are solved iteratively for the current estimate of

Δ V_{1}

and its effect under the nonlinear vector field to re-estimate the STM, which depends on the initial conditions, so it is an implicit function of

Δ V_{1}

. Under the MPC paradigm [78], for an N-horizon time span,

N - 1

TI problems are solved, and the problem’s true dynamics accommodated through the complete horizon.
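A minimal sketch of one backward pass of the TI scheme under the RLLM-LTI surrogate is given below. The first impulse follows Equation (10); the second impulse is computed here simply as the negative of the relative velocity on arrival, which is an assumption of this sketch about how the final velocity is cancelled. The helper names and the value of `c2` are illustrative, and no nonlinear forward pass is included.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = M / 2.0**squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def rllm_matrix(c2):
    Sigma = np.diag([1.0 + 2.0 * c2, 1.0 - c2, -c2])
    Omega = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [Sigma, Omega]])

def two_impulse(Phi, s0):
    """Linear two-impulse rendezvous for the STM Phi = Phi(t_f, t_0)."""
    Phi_rr, Phi_rv = Phi[:3, :3], Phi[:3, 3:]
    Phi_vr, Phi_vv = Phi[3:, :3], Phi[3:, 3:]
    rho0, v0 = s0[:3], s0[3:]
    # First impulse: cancel the coasting position error at t_f, Equation (10)
    dV1 = -np.linalg.solve(Phi_rv, Phi_rr @ rho0 + Phi_rv @ v0)
    # Second impulse (assumption of this sketch): null the arrival velocity
    v_arrival = Phi_vr @ rho0 + Phi_vv @ (v0 + dV1)
    dV2 = -v_arrival
    return dV1, dV2

c2, tf = 3.19, 0.6   # illustrative constant; time of flight from the scenario
B = np.vstack([np.zeros((3, 3)), np.eye(3)])
Phi = expm_taylor(rllm_matrix(c2) * tf)
s0 = np.array([0.01262, -0.02160, 0.02351, -0.00346, -0.02961, -0.02985])

dV1, dV2 = two_impulse(Phi, s0)
s_final = Phi @ (s0 + B @ dV1) + B @ dV2
```

In a differential corrector, this solve would be repeated with the STM re-evaluated along the nonlinear trajectory produced by the current ΔV_1.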

3.4. Multi-Impulse Guidance

A multi-impulse (MI) rendezvous allows for greater flexibility in the design of the relative trajectory, as accuracy-in-the-execution constraints may be relaxed and navigation errors can be accommodated in any of the manoeuvres to be performed.

For the MI backward pass, Equation (8) evaluated at

t_{f}

can be compactly expressed in matrix form as

s (t_{f}) = Φ (t_{f}, t_{0}) s_{0} + Σ œ,

with matrices

Σ \in R^{6 \times 6 N}

and

œ \in R^{6 N \times 1}

defined as

\begin{matrix} Σ = hor {Φ (t_{f}, t_{0}) Φ {(t_{i}, t_{0})}^{- 1}}, œ = ver \{[\begin{matrix} 0_{3 \times 1} \\ Δ V_{i} \end{matrix}]\}, \forall i = 1, 2, \dots, N, \end{matrix}

with

hor {\cdot}

and

ver {\cdot}

denoting horizontal and vertical concatenation, respectively. For non-LPO missions, a similar approach may be found in [46] exploiting the duality of linear input-output systems.

Enforcing the Rendezvous Condition

s (t_{f}) = 0_{6 \times 1}

and solving for

œ

yields

œ = - Σ^{†} Φ (t_{f}, t_{0}) s_{0} = - Σ^{†} s (t_{f}),

(12)

where the

□^{†}

operator denotes the Moore-Penrose pseudoinverse [79]. This solution also introduces an

l^{2}

-norm penalty on the impulses vector

œ

as a general fuel consumption metric [80]. Moreover, an appropriate decomposition of

Σ

provides the solution for less general rendezvous problems, such as position waypoint targeting.
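The pseudoinverse solution of Equation (12) can be sketched as follows under the RLLM-LTI surrogate. As a simplifying assumption, each 6 × 6 block is post-multiplied by B so that only the three velocity components of every impulse are unknowns; the evenly spaced burn times and the value of `c2` are hypothetical and do not reproduce the burn times used in this work.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = M / 2.0**squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def rllm_matrix(c2):
    Sigma = np.diag([1.0 + 2.0 * c2, 1.0 - c2, -c2])
    Omega = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [Sigma, Omega]])

c2, tf = 3.19, 0.6                       # illustrative constant and time of flight
A = rllm_matrix(c2)
B = np.vstack([np.zeros((3, 3)), np.eye(3)])
s0 = np.array([0.01262, -0.02160, 0.02351, -0.00346, -0.02961, -0.02985])

t_burn = np.linspace(0.0, tf, 6)         # hypothetical burn epochs
# For an LTI model, Phi(t_f, t_i) = expm(A (t_f - t_i))
M = np.hstack([expm_taylor(A * (tf - t_i)) @ B for t_i in t_burn])

Phi_f = expm_taylor(A * tf)
# Minimum l2-norm impulse sequence enforcing s(t_f) = 0
dV = -np.linalg.pinv(M) @ (Phi_f @ s0)
impulses = dV.reshape(len(t_burn), 3)

s_final = Phi_f @ s0 + M @ dV
```

Since the stacked matrix has full row rank for two or more distinct burn epochs, the pseudoinverse returns the impulse sequence of minimum l²-norm among all sequences that satisfy the Rendezvous Condition.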

The overall performance and convergence of the associated differential corrector are compromised by the initial conditions, the final time of flight,

t_{f}

, and the selected execution times for the burns,

t_{i}

, triggering the effect of neglected nonlinearities in the problem’s dynamics.

In any case, and as for the TI technique, Equation (12) needs to be solved iteratively in the backward-forward structure. In this case, the most appropriate feedback formulation is the Newton-Raphson method due to the sparsity of the expected impulses.

3.5. Optimal Multi-Impulsive Guidance

Further developing the multi-impulsive scheme naturally leads into its optimal formulation (Opt-MI), in which the N execution times

t_{i}

and their associated burns are optimally computed. The selected loss function to be minimized here is the total propellant consumption or, equivalently, the aggregated

Δ V

of the N impulses, for which the

l^{1}

-norm is most fitting [80]. This yields a Linear Programming minimization problem in which, under a linearized model of the dynamics, the component-wise constrained magnitude of the N burns can be determined for a rendezvous with the target object at a prescribed time of flight

t_{f}

, namely:

\begin{matrix} \underset{Δ V}{\arg \min} & {∥ Δ V ∥}_{1} \\ subject to & s (t_{0}) = s_{0} \\ s (t_{f}) = 0 \\ s (t) = Φ (t, t_{0}) s (t_{0}) + \sum_{i = 1}^{N} Φ (t, t_{i}) [\begin{matrix} 0 \\ Δ V_{i} \end{matrix}] \\ Δ V_{\min} \leq Δ V_{i} \leq Δ V_{\max} \end{matrix}

(13)

The use of the synodic reference frame together with Encke’s formulation of the dynamic vector field ensures a cheaper and numerically better-behaved STM propagation when compared to previous work [67]; in addition, the use of the

l^{1}

-norm as a proxy for fuel consumption and its associated LP optimization problem allows the online use of the algorithm without major computational expenses.

As already mentioned, the solver shall be embedded in some form of feedback loop to converge the control impulses sequence under the true nonlinear dynamics, through an iterative refinement of both the input

Δ V_{i}

and the STM

Φ

. In this case, the MPC paradigm in Section 5 may be used as forward pass, due to the lack of an exact form of an appropriate differential corrector to solve the problem under iteration.
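One possible way of casting the l¹-norm problem of Equation (13) as a standard-form Linear Program is sketched below, under the RLLM-LTI surrogate. Each impulse component is split into non-negative parts, ΔV = p - q, so that the cost becomes linear. The sketch only assembles the LP data, which could then be passed to any LP solver (for instance `scipy.optimize.linprog`, which is not used in the source); the symmetric bound `dV_max`, the constant `c2` and all helper names are assumptions.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    norm = np.linalg.norm(M, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = M / 2.0**squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def rllm_matrix(c2):
    Sigma = np.diag([1.0 + 2.0 * c2, 1.0 - c2, -c2])
    Omega = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [Sigma, Omega]])

c2, tf = 3.19, 0.6                        # illustrative constant and time of flight
A = rllm_matrix(c2)
B = np.vstack([np.zeros((3, 3)), np.eye(3)])
s0 = np.array([0.01262, -0.02160, 0.02351, -0.00346, -0.02961, -0.02985])

t_grid = np.linspace(0.0, tf, 61)         # one candidate impulse every 1e-2 time units
M = np.hstack([expm_taylor(A * (tf - t_i)) @ B for t_i in t_grid])
n = M.shape[1]                            # 3 N unknown impulse components

# LP in the variables x = [p; q] >= 0 with dV = p - q
c = np.ones(2 * n)                        # sum(p + q) equals ||dV||_1 at the optimum
A_eq = np.hstack([M, -M])                 # rendezvous condition s(t_f) = 0
b_eq = -expm_taylor(A * tf) @ s0
dV_max = 1e-3                             # placeholder component-wise bound
bounds = [(0.0, dV_max)] * (2 * n)

# Sanity check of the reformulation with a (non-optimal) feasible sequence
dV_ref = np.linalg.pinv(M) @ b_eq
x_ref = np.concatenate([np.maximum(dV_ref, 0.0), np.maximum(-dV_ref, 0.0)])
```

The reference sequence only checks the equality constraints and the cost; it ignores the bounds and is not the LP optimum.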

3.6. Multi-Impulsive Staging Guidance

Despite the benefits of multi-impulsive schemes, as already seen, additional criteria or assumptions are needed to determine the impulse execution times

t_{i}

, on which the performance of the controller is totally dependent. While the previous algorithm develops an online cost-effective optimal rendezvous problem solver, its formulation depends on the

l^{1}

-norm proxy for fuel consumption, which, in some cases, may not be representative of the true mass dynamics of the mission.

Inspired by classical launcher staging mass optimization, the novel Multi-Impulsive Staging Guidance (MISG) core aims to solve the following constrained optimization problem during the backward pass:

\begin{matrix} \underset{Δ V}{\arg \min} & \prod_{i}^{N} \exp (α_{i} ∥ Δ V_{i} ∥) \\ subject to & s (t_{0}) = s_{0} \\ s_{d} (t_{f}) = Φ (t_{f}, t_{0}) s (t_{0}) + \sum_{i = 1}^{N} Φ (t_{f}, t_{i}) B Δ V_{i} . \end{matrix}

(14)

The cost function

\prod_{i}^{N} \exp (α_{i} ∥ Δ V_{i} ∥) = \prod_{i}^{N} x_{i}

is selected as a proxy of control effort and fuel consumption and to ease further algebra. For the rest of this derivation, the relative weights are assumed to be equal and unitary,

α_{i} = 1

.

s_{d} (t_{f})

is the desired relative state at the final time of flight

t_{f}

. The complete impulse sequence can be determined considering discrete dynamics, therefore eliminating the impulse times as explicit optimization variables. For a given time of flight

t_{f}

and time grid

Δ t_{i} = t_{i + 1} - t_{i}

, the problem is fully determined, so that for every time node in the grid, a (possibly null) optimal impulse is computed.

The necessary conditions for optimality are given by constructing an augmented Lagrangian function and computing its stationary point with respect to

(Δ V_{i}

,

λ)

\begin{matrix} J = \prod_{i}^{N} x_{i} + λ^{⊺} (s_{d} (t_{f}) - Φ (t_{f}, t_{0}) s (t_{0}) - \sum_{i = 1}^{N} Φ (t_{f}, t_{i}) B Δ V_{i}) = \\ = \prod_{i}^{N} x_{i} + λ^{⊺} (e - \sum_{i = 1}^{N} Φ (t_{f}, t_{i}) B Δ V_{i}) \end{matrix}

resulting in the following

3 N + 6

equations for the

3 N + 6

variables

\begin{matrix} \prod_{i}^{N} x_{i} \frac{Δ V_{i}}{∥ Δ V_{i} ∥} - B^{⊺} Φ {(t_{f}, t_{i})}^{⊺} λ = 0, \\ e - \sum_{i = 1}^{N} Φ (t_{f}, t_{i}) B Δ V_{i} = 0 . \end{matrix}

The Lagrange multiplier

λ

can be eliminated through the use of equations i and

i + 1

, yielding the following recursive system

\frac{Δ V_{i + 1}}{∥ Δ V_{i + 1} ∥} = (B^{⊺} Φ {(t_{f}, t_{i + 1})}^{⊺}) {(B^{⊺} Φ {(t_{f}, t_{i})}^{⊺})}^{†} \frac{Δ V_{i}}{∥ Δ V_{i} ∥} .

The final algebraic system to be explicitly solved is therefore

\begin{matrix} \frac{Δ V_{i + 1}}{∥ Δ V_{i + 1} ∥} = (B^{⊺} Φ {(t_{f}, t_{i + 1})}^{⊺}) {(B^{⊺} Φ {(t_{f}, t_{i})}^{⊺})}^{†} \frac{Δ V_{i}}{∥ Δ V_{i} ∥} \\ e - \sum_{i = 1}^{N} Φ (t_{f}, t_{i}) B Δ V_{i} = 0 . \end{matrix}

(15)

Equation (15) is suited for the MPC forward pass iteration, with applications in online feedback guidance. However, for offline cases, and by noting that the complete sequence is a function of the first impulse only

Δ V_{0}

, a Newton-Raphson differential corrector scheme can be formulated to iteratively correct

Δ V_{0}

under each STM refinement iteration until convergence. For the latter case, Equation (15) can be further simplified as

\begin{matrix} \frac{Δ V_{i + 1}}{∥ Δ V_{i + 1} ∥} = (B^{⊺} Φ {(t_{f}, t_{i + 1})}^{⊺}) {(B^{⊺} Φ {(t_{f}, t_{i})}^{⊺})}^{†} \frac{Δ V_{i}}{∥ Δ V_{i} ∥} \\ e - [\sum_{i = 1}^{N} ∥ Δ V_{i} ∥ Φ (t_{f}, t_{i}) B G] \frac{Δ V_{0}}{∥ Δ V_{0} ∥} = 0, \end{matrix}

(16)

where the sensitivity matrix G is defined as

G = \prod_{j = 1}^{i - 1} (B^{⊺} Φ {(t_{f}, t_{j + 1})}^{⊺}) {(B^{⊺} Φ {(t_{f}, t_{j})}^{⊺})}^{†} .

G gives the Newton-Raphson update step to

Δ V_{0}

through iterating the following result

Δ V_{0, k + 1} = Δ V_{0, k} + G^{†} [e - [\sum_{i = 1}^{N} ∥ Δ V_{i} ∥ Φ (t_{f}, t_{i}) B G] \frac{Δ V_{0, k}}{∥ Δ V_{0, k} ∥}] .

If the sequence needs to be control-input bounded or constrained in any other manner, the associated inequality can be introduced in the cost function through additional slack variables

l_{i}

, yielding

J = \prod_{i}^{N} x_{i} + λ^{⊺} (e - \sum_{i = 1}^{N} Φ (t_{f}, t_{i}) B Δ V_{i}) + μ^{⊺} (Δ V_{\max}^{2} - Δ V_{i}^{⊺} Δ V_{i} - l_{i}^{2}) + γ^{⊺} c (s) .

3.7. Impulsive Iterative Linear Quadratic Regulator

The last impulsive guidance scheme presented in this investigation is a novel application of the Augmented Lagrangian Iterative Linear Quadratic Regulator (AL-iLQR) for spacecraft rendezvous under LPO dynamics. The motivation and detailed derivation of the AL-iLQR may be found in Section 4, after presenting the classical Linear Quadratic Regulator (LQR) policy for continuous dynamics.

Our AL-iLQR paradigm aims to solve the discrete constrained optimization problem

\begin{matrix} \underset{Δ V_{i}}{\arg \min} & J = \frac{1}{2} s_{N}^{⊺} Q_{N} s_{N} + \frac{1}{2} \sum_{i = 0}^{N - 1} (s_{i}^{⊺} Q_{i} s_{i} + Δ V_{i}^{⊺} R Δ V_{i}) \\ subject to & s_{i + 1} = f (μ, s_{i}, Δ V_{i}, Δ t), \\ s (t_{0}) = s_{0}, \\ Δ V_{i}^{⊺} Δ V_{i} \leq Δ V_{\max}^{2}, \\ c (s_{i}) \leq 0 . \end{matrix}

(17)

where

Δ V_{\max}

is a control authority bound and the inequality/equality set given by

c (s_{i})

introduces general state constraints, such as Line-of-Sight restrictions [81]. The discrete dynamics given by the map

f

represent the relative motion dynamical system in any of its forms.

The constrained problem is relaxed and solved as a series of unconstrained optimizations by appropriately augmenting J [82]

\begin{matrix} \underset{Δ V_{i}, λ}{\arg \min} & J^{*} = \frac{1}{2} s_{N}^{⊺} Q_{N} s_{N} + \frac{1}{2} \sum_{i = 0}^{N - 1} (s_{i}^{⊺} Q_{i} s_{i} + Δ V_{i}^{⊺} R Δ V_{i}) + λ^{⊺} g + \frac{1}{2} g^{⊺} I^{γ} g \\ subject to & s_{i + 1} = f (μ, s_{i}, Δ V_{i}, Δ t), \\ s (t_{0}) = s_{0}, \\ g = [\begin{matrix} Δ V_{i}^{⊺} Δ V_{i} - Δ V_{\max}^{2} \\ c (s_{i}) \end{matrix}] \in R^{(m + 1) \times N}, \\ I_{j, j}^{γ} = \{\begin{matrix} γ_{j} & g_{j} > 0 \\ 0 & g_{j} < 0 \end{matrix} . \end{matrix}

The method achieves a low-cost optimization of a quadratic proxy of the control effort and the final and integral rendezvous error in finite time, under discrete impulsive manoeuvres. It generalizes the LQR solution to nonlinear systems by iterating on the optimal control policy

Π (t)

during the backward pass and updating the flow

Φ (s, Π)

over the forward pass, as described later on. The significance of the penalizing matrix sequence

Q_{i}

and R may also be found in Section 4. Finally, it is trivial to augment the relative state

s

to incorporate an integral penalty term, as discussed also in Section 4.

3.8. Performance of Impulsive Control Rendezvous Manoeuvres

In the following, the different impulsive guidance algorithms presented in this section are tested and compared using the complete nonlinear equations of relative motion in the CR3BP. To that end, a long-range example scenario is proposed where the target and chaser spacecraft are in two distinct northern halo orbits around the Earth-Moon

L_{2}

, as depicted in Figure 3. The target spacecraft is located in a halo orbit of an out-of-plane amplitude of 20,140

km

; the dimensionless time of flight to rendezvous was selected to be 0.6, corresponding to 2.67 days (this value is found to be the critical time

t^{*}

for the given orbit ensuring the differential correction convergence). For the AL-iLQR, however, this value was modified to

π

units or 14 days: despite seeming to relax the problem, this does promote control cost, as the number of impulses increases drastically. Additionally, and despite showing finite time convergence, the AL-iLQR benefits from longer times of flight, in which the true nonlinear dynamics can be optimized; for short time scales, the algorithm tends to overshoot the optimal solution. The dimensionless initial conditions for the target and the chaser spacecraft are, respectively:

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 1.10495 & 0.02160 & - 0.04313 & 0.00346 & 0.21380 & 0.02985 \end{matrix}]}^{⊺} \\ s (t_{0}) & = {[\begin{matrix} 0.01262 & - 0.02160 & 0.02351 & - 0.00346 & - 0.02961 & - 0.02985 \end{matrix}]}^{⊺} \end{matrix}

corresponding to an initial relative range of 13,195

km

.

Several considerations must be addressed regarding the configuration of the different guidance schemes:

The STM is integrated along the reference state trajectory $s (t)$ through the linear variational equations of Encke’s formulation of the relative dynamics, Equation (3). Despite being the most accurate, they are also the most expensive in terms of computational cost, providing a worst-case scenario for performance analysis. For the sake of demonstration and comparison, the TI algorithm is also run with the RLLM analytical STM.
For the sake of comparison, all state constraints are relaxed for all schemes, and the upper control bound $Δ V_{\max}$ is set to $1.025 m / s$ for the Opt-MI case (component-wise) and $51.25 m / s$ for the AL-iLQR. However, to nullify the final relative velocity, such bound is relaxed at the time of flight.
For the multi-impulsive case, six burns were selected to be performed along the trajectory; we chose a randomly distributed set of values (given in Table 1 for the sake of reproducibility) to illustrate the intrinsic robustness of this algorithm, for which the impulse sequence is critical.
For the Opt-MI and MISG schemes, impulses are planned every 1 $\cdot 10^{- 2}$ time units. For the AL-iLQR, 5 $\cdot 10^{- 2}$ nondimensional time units are used instead.
All controllers are formulated as open-loop schemes, in which differential correctors suit better as forward passes. However, to address the clear performance difference against MPC, both the Opt-MI and MISG techniques are formulated using also the latter.
Matrices Q and R in the AL-iLQR scheme are selected to be constant and

$Q = (\begin{matrix} I_{3 \times 3} & 0_{3 \times 3} \\ 0_{3 \times 3} & 1 \cdot 10^{- 6} I_{3 \times 3} \end{matrix}), R = I_{3 \times 3} .$

Moreover, the initial estimation of the optimal flow $ϕ (t, s_{0}, u)$ is given by the coasting solution $u_{ref} = 0$ .
No uncertainty is considered in the simulation; it is introduced later, when assessing the performance of the robust controllers in Section 5.

To quantitatively analyze the performance differences between controllers, we use as a proxy a series of performance indices related to the rendezvous error and the propellant consumption of the manoeuvre; these indices are defined in Appendix B. The performance indices for these techniques in the considered example are summarized in Table 2. Moreover, the minimum and maximum norms of the nonzero impulses are also displayed to quantitatively bound the feasibility of each algorithm.

All guidance schemes present very similar error performance, with the MISG scheme showing slightly better results. The majority of the presented techniques also show similar computational cost, except precisely for the MISG algorithm, which, in both forms of the forward sweep, solves the optimal problem one order of magnitude slower. Interestingly, the use of the analytical RLLM STM raises the computational time of the TI algorithm when compared to direct numerical integration of the variational equations. This is explained by an increase in the number of iterations until convergence, from 4 to 15 in the RLLM case. However, provided that they show the same results, the analytical version of the algorithm is particularly suited for online calculations when compared to its integrated counterpart. Finally, regarding control cost, the Opt-MI, MISG and AL-iLQR (despite showing an increase in the number of impulses by a factor of 10) show similar performance, as they are all optimally conceived and constructed. Interestingly, the TI and MI controllers show a much smaller control effort, deviating 18% and 50% from the nominal values. This is attributed to the sparsity of impulses in both algorithms, indicating that the selected number of impulses is closer to the optimal value [83,84,85]. For the other 3 controllers, the number of impulses is determined by the minimum time step considered in the discretization of the dynamics. However, for the TI, MI and Opt-MI algorithms, the final impulse nullifying the relative velocity increases by a factor of nearly 5 when compared to the rest of the controllers. Again, this is explained by the lack of optimality considerations underlying these algorithms.

Figure 3 shows the proposed rendezvous trajectories in the absolute configuration space, while Figure 4 displays the relative state evolution in time under the action of each controller. All of them are able to successfully drive the chaser to rendezvous with the target spacecraft located in a different periodic orbit, with a convergence error below 1

\cdot 10^{- 10}

, and differential corrections requiring fewer than 50 iterations to converge in the worst scenario. Finally, Figure 5 shows the evolution of the control impulse sequence for the optimal multi-impulsive controllers. The MISG algorithm can be shown to keep the control sequence bounded over the whole time span. Independently of the numerical solver in use, the algorithm shows similar control trends: first, the relative range is targeted at the initial stages, driving the sequence towards a coasting period, and then, towards the end of the transfer, the action plan progressively nullifies the relative velocity. On the other hand, the AL-iLQR shows most control activity in the early stages of the transfer and then a nearly coasting phase, exploiting natural transport dynamics.

The terminal phase of the rendezvous trajectory (

∥ ρ ∥ \leq 100 km

) is re-designed making use of the fully constrained AL-iLQR, in which the chaser shall approach the docking port through a given safety corridor. This constraint is introduced by

l (s)

l (s) = - (\begin{matrix} p_{1} & p_{2} & p_{3} & 0 & 0 & 0 \end{matrix}) s + ∥ ρ ∥ cos β \leq 0,

which models a cone-like constraint defined by the half-cone angle

β

, together with an axial direction given by the unit vector

p

. In this simulation,

β = 15^{\circ}

and the cone axis is selected to be

p = {[1, 1, 1]}^{⊺}

, which is assumed to be aligned with the docking axis at the rendezvous time

t_{f} = π

. Despite being nonlinear in the state

s

, the constraint needs no redefinition and can be directly introduced in the AL-iLQR algorithm.
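The corridor constraint l(s) can be evaluated with a few lines of code, as in the minimal sketch below. The axis p is normalised inside the function, on the assumption that the quoted value [1, 1, 1] denotes a direction rather than an already normalised vector; the function name and the two sample states are hypothetical.

```python
import numpy as np

def corridor_constraint(s, p, beta):
    """Evaluate l(s) = -p_hat . rho + ||rho|| cos(beta).

    The relative state s is feasible (inside the approach cone) when l(s) <= 0.
    """
    rho = np.asarray(s, dtype=float)[:3]
    p_hat = np.asarray(p, dtype=float)
    p_hat = p_hat / np.linalg.norm(p_hat)   # assumption: p is normalised here
    return -p_hat @ rho + np.linalg.norm(rho) * np.cos(beta)

beta = np.deg2rad(15.0)        # half-cone angle used in the simulation
p = [1.0, 1.0, 1.0]            # cone axis direction

# Hypothetical relative states: one on the cone axis, one orthogonal to it
l_inside = corridor_constraint([1e-4, 1e-4, 1e-4, 0.0, 0.0, 0.0], p, beta)
l_outside = corridor_constraint([1e-4, -1e-4, 0.0, 0.0, 0.0, 0.0], p, beta)
```

The constraint is inactive at states well inside the cone and becomes positive outside it, which is the sign convention expected by the inequality set c(s_i) ≤ 0 of Equation (17).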

Figure 6 depicts the evolution of the relative angle between

p

and

ρ

in time. Again, convergence towards

ϕ \leq β

is noticeable. Finally, Table 3 compiles the major performance metrics of the algorithm, where no difference can be appreciated with respect to the unconstrained case, demonstrating the robustness and adaptability of the AL-iLQR technique when compared to the rest of the presented guidance schemes.

4. Optimal Continuous Guidance

Continuous acceleration control schemes may also provide efficient rendezvous strategies and are intrinsically interesting for low-thrust propelled missions, where control cost and rendezvous efficiency constraints may again be imposed and emphasized. In this section, three such continuous acceleration controllers are presented. Previous results on the Linear Quadratic Regulator and State-Dependent Riccati Equation (SDRE) guidance are discussed and combined with the new linear dynamical models presented, before introducing a novel AL-iLQR methodology for spacecraft rendezvous. All three techniques are suited for both online and offline guidance.

4.1. Linear Quadratic Regulator

Since the Rendezvous Condition is satisfied at the origin of the relative phase space (namely, the equilibrium point in Equation (2)), a Proportional-Integral-Derivative (PID) controller can be constructed to asymptotically drive the system to such state. Thus, the control acceleration can be defined as

u = - (K_{P} ρ + K_{D} \dot{ρ} + K_{I} \int_{0}^{t} ρ d t),

where

K_{P}

,

K_{D}

and

K_{I}

are, respectively, the proportional, derivative and integral gains. To introduce an integral penalty term, the relative state vector and dynamics Equation (5) are augmented as follows

\begin{matrix} \hat{s} = {[ρ, \dot{ρ}, \int_{0}^{t} ρ d t]}^{⊺} \end{matrix}

(18)

\begin{matrix} \frac{d \hat{s} (t)}{d t} = [\begin{matrix} A & 0_{6 \times 3} \\ I_{3 \times 3} & 0_{3 \times 6} \end{matrix}] \hat{s} (t) + [\begin{matrix} B \\ 0_{3 \times 3} \end{matrix}] u (t) . \end{matrix}

(19)

Using the augmented state vector, an infinite horizon Linear Quadratic Regulator (LQR) can now be proposed [21,73], which provides the optimal control law,

u^{*} (t)

, by minimizing the following quadratic cost function

J = \frac{1}{2} lim_{t \to \infty} \int_{0}^{t} ({\hat{s}}^{⊺} Q \hat{s} + u^{⊺} R u) d t,

(20)

where matrices Q and R define the weights of the states and the control inputs, respectively. The optimal solution is then obtained as

u^{*} (t) = - R^{- 1} B^{⊺} P \hat{s} (t),

where P is the solution of the following Algebraic Riccati Equation for the infinite horizon case:

P A + A^{⊺} P - P B R^{- 1} B^{⊺} P + Q = 0,

provided that

Q = M^{⊺} M

and R are positive semidefinite and positive definite matrices, respectively, the pair

{A, B}

is controllable and

{A, M}

is observable. Once the Q and R matrices are appropriately selected, the optimal control

u^{*} (t)

law may be found to rendezvous the two spacecraft. Such selection may be addressed using pole placement arguments or using more advanced techniques, such as Genetic Algorithms, to trade-off asymptotic convergence and control expense. In general, the ratio Q to R defines the performance of the controller: the cost function J is a relative trade-off between nominal state error and control effort, and Q and R balance such compromise solution.
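A minimal sketch of the infinite horizon LQR design on the augmented RLLM-LTI model is given below. The Algebraic Riccati Equation is solved here through the stable invariant subspace of the associated Hamiltonian matrix, which is one standard approach and not necessarily the solver used in this work. The unit weights Q and R, the value of `c2` and the helper names are illustrative assumptions.

```python
import numpy as np

def rllm_matrix(c2):
    Sigma = np.diag([1.0 + 2.0 * c2, 1.0 - c2, -c2])
    Omega = np.array([[0.0, 2.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    return np.block([[np.zeros((3, 3)), np.eye(3)], [Sigma, Omega]])

def solve_care(A, B, Q, R):
    """Stabilizing solution of P A + A^T P - P B R^-1 B^T P + Q = 0."""
    n = A.shape[0]
    S = B @ np.linalg.solve(R, B.T)
    H = np.block([[A, -S], [-Q, -A.T]])       # Hamiltonian matrix
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]                 # basis of the stable subspace
    if stable.shape[1] != n:
        raise ValueError("no stabilizing solution found")
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (P + P.T)                    # enforce symmetry numerically

c2 = 3.19                                     # illustrative constant
A6 = rllm_matrix(c2)
B6 = np.vstack([np.zeros((3, 3)), np.eye(3)])

# Augmented model of Equation (19): state [rho, rho_dot, integral of rho]
A_aug = np.block([[A6, np.zeros((6, 3))],
                  [np.eye(3), np.zeros((3, 6))]])
B_aug = np.vstack([B6, np.zeros((3, 3))])

Q = np.eye(9)                                 # illustrative weights
R = np.eye(3)

P = solve_care(A_aug, B_aug, Q, R)
K = np.linalg.solve(R, B_aug.T @ P)           # u* = -K s_hat
residual = P @ A_aug + A_aug.T @ P - P @ B_aug @ K + Q
closed_loop_eigs = np.linalg.eigvals(A_aug - B_aug @ K)
```

Scaling Q relative to R in this sketch shifts the closed-loop eigenvalues, which is the trade-off between convergence rate and control effort described above.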

One important remark is that, since the LQR formulation relies on an LTI model, an operating point needs to be defined for the target spacecraft, so that the linear system of Equation (19) is indeed time-invariant. In this sense, numerical experiments show that optimal performance is obtained when the model is evaluated at the target’s orbital state at the rendezvous time of flight. However, and compared to previous literature, this work exploits the novel RLLM as a natural LTI formulation for the LQR method, rendering the controller independent of the target’s motion by construction.

4.2. State Dependent Riccati Equation Controller

Although the LQR controller formulation yields a successful scheme to rendezvous the two vehicles, the requirement to fix a prescribed operating point for the target spacecraft constrains its applicability and handicaps its performance. State Dependent Riccati Equation (SDRE) controllers overcome this shortcoming by providing a time-varying generalization to the LQR formulation for general nonlinear vector fields of the type

\dot{s} = f (t, s) + g (t, s) u,

(21)

in which a quadratic cost function analogous to Equation (20) can be defined, where matrices

Q (s)

and

R (s)

no longer need to be constant, but can depend on the state vector, i.e., they are allowed to vary in the relative state space.

Upon linearization, under certain conditions [86] the dynamical system can be reduced to a state dependent linear model of the type

\dot{s} (t) = A (s) s (t) + B (s) u (t),

where the pair

{A (s), B (s)}

is assumed to be point-wise stabilizable. For the relative motion in the CR3BP, such a state dependent linear system arises from the point-wise linearization around the target spacecraft’s orbit, i.e., for points along the trajectory of the target spacecraft, as given by the RLM model in Equation (5). In that case, an SDRE controller can be straightforwardly designed by defining an augmented linear model analogous to Equation (19), except that

A (\hat{s})

and

B (\hat{s})

now depend on time implicitly through the target’s time law. Therefore, for prescribed matrices

Q (\hat{s})

and

R (\hat{s})

, an optimal control law

u^{*} (t)

can be computed for the linearized problem upon solving the corresponding algebraic Riccati equation, resulting in the control law:

u^{*} (t) = - R^{- 1} B^{⊺} P \hat{s} (t) .

The suboptimal policy

u^{*} (t)

can be shown to be stable and to asymptotically regulate the relative augmented state

\hat{s}

towards the Rendezvous Condition.

Detailed derivations and applications of the SDRE controller for optimal rendezvous can be found in [65,73], including available extended formulations for state-constrained rendezvous and proximity operations.
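The point-wise nature of the SDRE law can be sketched on a scalar toy problem, where the algebraic Riccati equation has a closed-form positive root. The plant \dot{s} = s^3 + u, its factorization A(s) = s^2, and the unit weights are assumptions made only for illustration; they are unrelated to the RLM model of Equation (5). Left uncontrolled from s_0 = 1.5, this plant escapes in finite time, whereas the SDRE feedback regulates it to the origin.

```python
import math

def sdre_gain(s, q=1.0, r=1.0, b=1.0):
    """Point-wise gain R^-1 B^T P(s) for the scalar factorization f(s) = A(s) s."""
    a = s * s                                   # A(s) for the assumed plant f(s) = s^3
    # Positive root of the scalar Riccati equation 2 a p - p^2 b^2 / r + q = 0
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r

def closed_loop(s):
    u = -sdre_gain(s) * s                       # u* = -R^-1 B^T P(s) s
    return s ** 3 + u

def simulate(s0, t_final=10.0, dt=0.01):
    """Fixed-step RK4 integration of the closed-loop scalar dynamics."""
    s = s0
    for _ in range(int(round(t_final / dt))):
        k1 = closed_loop(s)
        k2 = closed_loop(s + 0.5 * dt * k1)
        k3 = closed_loop(s + 0.5 * dt * k2)
        k4 = closed_loop(s + dt * k3)
        s += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s

s_final = simulate(1.5)
```

In the matrix case the same loop applies, with the closed-form root replaced by a numerical Riccati solve at every evaluation point, which is consistent with the higher computational cost reported for the SDRE.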

4.3. Iterative Linear Quadratic Regulator

The iterative Linear Quadratic Regulator (iLQR) is an optimal trajectory planning algorithm widely applied in robotics applications, and a variant of the classical Differential Dynamic Programming technique [87,88]. iLQR generalizes the classical optimal control LQR solution to nonlinear systems through iterative first-order linearization of the dynamics [89]. The main two formulations of the algorithm are found in [82,89]. The latter may be conceived as more general (as it is based on Bellman’s Principle of Optimality) and provides the framework to incorporate state constraints in the underlying optimization problem, with the so-called Augmented Lagrangian paradigm (AL-iLQR).

The following discussion is focused on the application of AL-iLQR for continuously controlled rendezvous trajectory planning, while it can be easily adapted to impulsive missions, as already discussed in Section 3.

Our AL-iLQR paradigm aims to solve the discrete constrained optimization problem

\begin{matrix} \underset{u}{\arg \min} & J = \frac{1}{2} s_{N}^{⊺} Q_{N} s_{N} + \frac{1}{2} \sum_{i = 0}^{N - 1} (s_{i}^{⊺} Q_{i} s_{i} + Δ t u_{i}^{⊺} R u_{i}) \\ subject to & s_{i + 1} = f (μ, s_{i}, u_{i}, Δ t), \\ & s (t_{0}) = s_{0}, \\ & u_{i}^{⊺} u_{i} \leq u_{\max}^{2}, \\ & l (s_{i}) \leq 0, \end{matrix}

(22)

where

u_{\max}

is a control authority bound and the inequality

l (s_{i})

accommodates general state constraints. Again, the relative motion dynamical system is represented by the discrete vector field

f

. A discussion on the selection of the dynamical model within the AL-iLQR method can be found at the end of this section. Finally, a zeroth-order hold is assumed for the control action

u_{i}

over each time step

[t_{i}, t_{i + 1}]

, resulting in a piecewise control solution.

The constrained optimization problem is relaxed into an unconstrained version, to be solved iteratively until constraints are satisfied and the extremum control sequence

u

is found. In the following, the set of p constraints are collapsed into the vector inequality

c (s) \leq 0, c (s) \in R^{p}

.

Initially, an augmented Lagrangian cost function is constructed through appropriate penalty and Lagrange multiplier terms

J^{*} = \frac{1}{2} s_{N}^{⊺} Q_{N} s_{N} + \sum_{i = 0}^{N - 1} [\frac{1}{2} (s_{i}^{⊺} Q_{i} s_{i} + Δ t u_{i}^{⊺} R u_{i}) + {(λ_{i} + \frac{1}{2} I^{γ} c (s_{i}))}^{⊺} c (s_{i})],

where

λ \in R^{p}

are the Lagrange multipliers of the constraints and the quadratic penalty term

\frac{1}{2} c^{⊺} (s_{i}) I^{γ} c (s_{i})

is constructed upon the diagonal weight matrix

I^{γ}

, defined as

I_{j, j}^{γ} = \{\begin{matrix} γ_{j} & c_{j} > 0 \\ 0 & c_{j} < 0 \end{matrix};

γ \in R^{p}

are user-defined hyperparameters to be optimized during the process. Therefore, the constrained problem has been transformed into the following unconstrained one

\begin{matrix} \underset{u, λ}{\arg \min} & J^{*} = \frac{1}{2} s_{N}^{⊺} Q_{N} s_{N} + \sum_{i = 0}^{N - 1} [\frac{1}{2} (s_{i}^{⊺} Q_{i} s_{i} + Δ t u_{i}^{⊺} R u_{i}) + {(λ_{i} + \frac{1}{2} I^{γ} c (s_{i}))}^{⊺} c (s_{i})] \\ subject to & s_{i + 1} = f (μ, s_{i}, u_{i}, Δ t), \\ & s (t_{0}) = s_{0} . \end{matrix}

(23)

The solution to the original constrained problem is given by the iteration of the following sequence of steps:

I. Solve the unconstrained problem (23) using the iLQR technique, with $λ, γ$ fixed.
II. Update the estimate of the Lagrange multipliers $λ$

$λ_{j} \leftarrow \max (0, λ_{j} + γ_{j} c_{j} (s)) .$
III. Update the penalty weights $γ$ through the schedule

$γ_{j} \leftarrow κ γ_{j}, \forall j, κ > 0 .$

If feasible, the problem is considered solved when constraints are satisfied to a given tolerance and the unconstrained problem (23) is also optimally solved. Numerical details and enhancements to the method can be found in [82,87]. The convergence of the algorithm is also driven by an appropriate selection of the Q and R matrices: aside from determining the final optimal trade-off between control cost and rendezvous error, as in the classical LQR or SDRE, they also affect the positive definiteness of the action value Hessian, on which the algorithm’s convergence totally depends.
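The outer augmented Lagrangian loop (inner solve, multiplier update, penalty schedule) can be sketched on a one-dimensional toy problem: minimize (x - 2)^2 subject to c(x) = x - 1 \leq 0. The toy cost, the analytic inner minimizer `solve_inner` (which stands in for the iLQR solve of the unconstrained problem), and the values of the initial penalty and of κ are all assumptions for illustration only.

```python
def solve_inner(lam, gamma):
    """Minimize (x - 2)^2 + (lam + 0.5 * I * c) * c with c = x - 1, for fixed lam, gamma."""
    x = (4.0 - lam + gamma) / (2.0 + gamma)     # stationary point with penalty active (c > 0)
    if x - 1.0 > 0.0:
        return x
    x = 2.0 - 0.5 * lam                         # stationary point with penalty inactive (c < 0)
    return min(x, 1.0)                          # otherwise the minimum sits on the boundary

lam, gamma, kappa = 0.0, 1.0, 5.0               # assumed initial multiplier, penalty, growth factor
tolerance = 1e-8
for outer_iteration in range(20):
    x = solve_inner(lam, gamma)                 # inner unconstrained solve, lam and gamma fixed
    c = x - 1.0
    if c <= tolerance:                          # constraint satisfied to tolerance
        break
    lam = max(0.0, lam + gamma * c)             # Lagrange multiplier update
    gamma *= kappa                              # penalty schedule
```

In this sketch the iterates approach the constrained optimum x = 1 from the infeasible side, while the multiplier approaches the value λ = 2 required by stationarity of the Lagrangian.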

At each iteration, the control law is updated through the sequence of control and state deviations, with the control update

δ u_{i}^{*}

, given by

\begin{matrix} δ u_{i}^{*} = K_{i} δ s_{i} + d_{i} \\ δ s_{i + 1} = Φ (t_{i + 1}, t_{0}) Φ {(t_{i}, t_{0})}^{- 1} δ s_{i} + \frac{Δ t}{2} [I + Φ (t_{i + 1}, t_{0}) Φ {(t_{i}, t_{0})}^{- 1}] B δ u_{i}, δ s_{0} = 0 . \end{matrix}

where the feedforward term

d_{i}

is given by the Hessian and gradient of the cost function. State estimation on

δ s_{i}

is achieved through the STM flow mapping, which can benefit from analytical models, such as the RLLM. The exact derivation and solution of Step I may be found in Appendix C [82], where the Hessian of the action value function can be seen to be given by an appropriate Gauss-Newton approximation. This lower-accuracy expansion, compared to its true 2-rank nature, is the main difference between iLQR and classical Differential Dynamic Programming, and, while it makes the former require more iterations to converge, it also diminishes the computational burden of the algorithm, making it suitable for online processing.

An iteration over Step I is finished with the forward pass, given by the flow solution

ϕ (s_{0}, u, t)

over the updated control action

u \leftarrow u + δ u

\begin{matrix} \dot{s} = f (μ, s, u) \\ \frac{d}{d t} Φ (t, t_{0}) = J Φ (t, t_{0}) . \end{matrix}

Here, J denotes the Jacobian (not the cost function) of

f

. Step I is terminated whenever the difference between iterations in the cost function

J^{*}

falls below some tolerance or a maximum number of iterations is exceeded.
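The forward pass above integrates the state together with the variational equations for the STM. A minimal sketch of this joint propagation is given below for a pendulum, used here as a hypothetical stand-in for the relative dynamics f; the fixed-step RK4 integrator and all names are assumptions, not the MCPI or Encke implementation of this work. The sketch also forms the step-to-step transition Φ(t_{i+1}, t_0) Φ(t_i, t_0)^{-1} that appears in the state deviation recursion.

```python
import numpy as np

def f(s):
    """Stand-in nonlinear vector field (pendulum): s = (angle, angular rate)."""
    return np.array([s[1], -np.sin(s[0])])

def jac(s):
    """Jacobian of f with respect to the state."""
    return np.array([[0.0, 1.0], [-np.cos(s[0]), 0.0]])

def rhs(z):
    """Augmented right-hand side: state dynamics and d(Phi)/dt = Jacobian * Phi."""
    s, Phi = z[:2], z[2:].reshape(2, 2)
    return np.concatenate([f(s), (jac(s) @ Phi).ravel()])

def propagate(s0, t_final, dt=2e-3):
    """RK4 propagation of the state and of the STM Phi(t_final, 0)."""
    z = np.concatenate([np.asarray(s0, dtype=float), np.eye(2).ravel()])
    for _ in range(int(round(t_final / dt))):
        k1 = rhs(z)
        k2 = rhs(z + 0.5 * dt * k1)
        k3 = rhs(z + 0.5 * dt * k2)
        k4 = rhs(z + dt * k3)
        z = z + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return z[:2], z[2:].reshape(2, 2)

s0 = np.array([0.5, 0.0])
s1, Phi_10 = propagate(s0, 1.0)                 # Phi(t_1, t_0)
s2, Phi_20 = propagate(s0, 2.0)                 # Phi(t_2, t_0)
Phi_21 = Phi_20 @ np.linalg.inv(Phi_10)         # step transition Phi(t_2, t_1)
_, Phi_21_direct = propagate(s1, 1.0)           # same transition, propagated directly
```

When an analytical STM is available, as for an LTI model, the numerical integration of the variational equations in `propagate` can be replaced by a closed-form evaluation, which is where the computational savings mentioned above would come from.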

As may be noted, linearized dynamics are used in the recursion for the optimal flow variation

δ s_{i}

under the effect of

δ u_{i}

. Depending on the selection of the dynamics vector field

f

, the exact form of

Φ (t, t_{0})

will be distinct, either numerical or analytical, and the associated computational burden of the method will also increase or decrease during the forward pass. Moreover, the reached solution

{s (t), u (t)}

will be optimal for the model in use: if

f

is the true CR3BP relative motion dynamics, the rendezvous trajectory design will already account for the problem’s nonlinearities, while other surrogate models (see Section 2) could also be used for further refinement within the inner robust control loop.

Finally, the method requires an initial estimate of the optimal trajectory

s_{ref}

and control

u_{ref}

. Given the trajectory initial conditions

s_{0}

, either the natural coasting solution (

u_{ref} = 0

) or any other guidance solution (such as that provided by the LQR) may be used as such an initial guess.

4.4. Performance of Continuous Control Rendezvous Manoeuvres

In this section, the previously described continuous guidance schemes are compared against each other in a test scenario based on an Earth-Moon

L_{1}

Lyapunov orbit of an in-plane amplitude of

A_{x}

= 10,000

km

, as depicted in Figure 7, for which the dimensionless initial conditions for the target spacecraft and the relative state are, respectively:

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 0.82446 & 0.01183 & 0 & 0.01629 & 0.01017 & 0.11743 & 0 \end{matrix}]}^{⊺} \\ s (t_{0}) & = {[\begin{matrix} - 0.00051 & - 0.01183 & 0 & - 0.01017 & 0.00304 & 0 \end{matrix}]}^{⊺} \end{matrix}

yielding an initial range of

4550 km

. The selected time of flight for the mission is

t_{f} = 2 π

in nondimensional units, or 28 days. However, to exploit the finite time convergence of the AL-iLQR scheme and show its superior performance, its final time of flight is restricted to 8.91 days in this latter case,

t_{f} = 2

in non-dimensional units.

Regarding the setup of the rendezvous example, the following considerations shall be addressed:

All guidance techniques make use of an augmented state vector $\hat{s}$ , to include an integral penalty term within the appropriate cost function.
Again, no uncertainty or process noise is considered in the simulations.
The LQR trajectory was designed using the LTI Relative Libration Linear Model in Equation (6) instead of that in Equation (5) to exploit its intrinsic time-invariant nature. A comparison against the RLM model is also addressed. SDRE guidance is based on the RLM model, Equation (5), while the AL-iLQR outputs an optimal state flow $ϕ (t, s_{0}, u)$ for the true nonlinear dynamics.
For the AL-iLQR technique, a zeroth-order hold of $1 \cdot 10^{- 2}$ nondimensional time units was used, and numerical integration was performed through MCPI, as described in Appendix A. Discrete linearization of the true nonlinear relative dynamics is achieved by numerically propagating the STM through Encke’s linear variational equations at the forward pass, providing the worst-case scenario in terms of computational performance. Finally, the initial estimation of the optimal flow $ϕ (t, s_{0}, u)$ is given by the open loop LQR guidance solution.
LQR and SDRE schemes are defined by the following constant penalty weight matrices Q and R

$Q = 2 I_{9 \times 9}, R = I_{3 \times 3},$

while for the AL-iLQR, after some trial and error, these are selected to be

$Q = 1 \cdot 10^{3} I_{9 \times 9}, R = I_{3 \times 3} .$

Figure 8 shows the evolution of the chaser’s relative state vector as a function of time for the three showcased controllers, and Table 4 summarizes for each controller the performance indices as defined in Appendix B, including minimum and maximum acceleration,

{∥ u ∥}_{\min}

and

{∥ u ∥}_{\max}

, respectively. Additionally, Figure 9 shows the evolution in time of the norm of the control acceleration as a proxy to the real performance of the aforementioned schemes.

On the basis of such results, several conclusions may be drawn. First, little gain is achieved by means of exploiting time-varying linearizations of the relative dynamics when compared to trivial LTI models such as the RLLM (which, to the best of our knowledge, had not been explicitly used for rendezvous before). Differences between the LQR and SDRE controllers are minimal in terms of both the state error and the associated control cost. The only exception is the inclusion of state domain constraints within the optimal quadratic problem, for which the SDRE formulation is needed [65]. Secondly, the AL-iLQR paradigm shows similar error performance to that of infinite horizon controllers but in finite time (the time of flight is reduced by 70%) despite the increase in the total control cost (as expected for a discrete horizon formulation). For the AL-iLQR case, this increase is counterbalanced by the possibility of explicitly imposing control authority bounds (and general constraints), a major concern within continuous control applications when compared to total

Δ V

budgets. In this example, the resulting control acceleration is in the order of

mm s^{- 2}

. However, the algorithm lacks numerical robustness, especially when selecting both the Q and R matrix sequences, due to its sensitivity to the positive-definiteness of the action value gradient and Hessian. In terms of the computational cost, both the LQR and AL-iLQR provide similar performance (in the order of seconds), while the SDRE, given its need to solve the Riccati equation over the state trajectory

s_{ref}

, is much more expensive (order of minutes).

A similar study was conducted on the rendezvous mission scenario presented in Section 3 to confirm such conclusions, under the very same setup. Results are shown in Table 5. First, in terms of computational cost, both the LQR and AL-iLQR algorithms show the same order of magnitude as in the previous case, contrary to the SDRE controller, whose performance improves. This may be explained by better numerical conditioning of the optimal control when compared to the previous mission scenario, in which the out-of-plane dynamics are numerically ill-behaved. Error metrics are similar for all three controllers despite the different handled time scales. Cost performance shows the greatest differences, notably for the LQR: in this scenario, in which dynamics play a greater role, the selection of the LQR model determines its performance, which still profits from the use of LTI models (on which its optimal solution builds) when compared to the use of linearisation points. Again, the AL-iLQR is the most expensive on average due to its discrete time convergence properties and the initial control overshoot given by the selected Q-R matrices ratio. Figure 10 and Figure 11 show the relative state evolution of this second rendezvous test and the associated control acceleration for the three guidance techniques, highlighting the discrete/asymptotic capabilities of the three techniques.

5. Modern Optimal Control

After discussing optimal trajectory planning and guidance, two controllers, for both impulsive and continuous missions, are discussed for robust reference tracking, control and disturbance rejection. Their cascaded combination with the presented optimal guidance techniques is also achieved here for the first time.

Observe that, in practice, all guidance techniques may indeed be employed as feedback controllers (i.e., for reference tracking), if the design of the reference trajectory

s_{ref}

is achieved by any other means. However, classical linear controllers, such as the LQR or SDRE, usually lack the robustness and generality offered by the proposed modern control approaches.

5.1. Model Predictive Control

Model Predictive Control (MPC) is an optimal control technique based on iterative optimization, introduced in the late 1980s [78,90]. Since then, abundant literature may be found for rendezvous and proximity operations [81,91,92,93,94], while it has only recently been applied in multi-body scenarios [67,73].

In the MPC paradigm, an optimal guidance problem is solved for a given time horizon

t_{i} = 0, 1, \dots, T

under discrete dynamics, and as a result, both a state trajectory

{s_{i} (t_{i})} = s_{1}, s_{2}, \dots, s_{T}

and control action policy

{U_{i} (t_{i})} = U_{1}, U_{2}, \dots, U_{T}

are returned. The controller only executes the action at the initial time

U_{1}

, and, after the plant naturally rolls out one time step ahead, the optimization problem is run again with the time horizon receded

t_{i} = 0, 1, \dots, T - 1

. Such time prediction horizon is assumed in this work to be externally constrained by, for example, mission analysis considerations.
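The receding-horizon loop just described can be sketched as follows. At every step a finite-horizon linear-quadratic problem is solved on the surrogate model, only the first action is applied, and the horizon is shortened by one step. The discrete double integrator, the weights, the 10% actuator loss used to emulate unmodelled effects, and the helper `first_action` are assumptions for illustration; the guidance core in this work would be one of the presented techniques rather than this plain Riccati recursion.

```python
import numpy as np

def first_action(A, B, Q, R, x, horizon):
    """Solve a finite-horizon LQ problem by backward Riccati recursion; return u_1 only."""
    P = Q.copy()                                # terminal weight (assumed equal to Q)
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return -K @ x                               # K now holds the gain of the first stage

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])           # surrogate model used by the optimizer
B = np.array([[0.5 * dt ** 2], [dt]])
Q, R = np.eye(2), 0.1 * np.eye(1)

T = 150                                         # externally fixed prediction horizon
x = np.array([1.0, 0.0])                        # initial relative position and velocity
history = [x.copy()]
for k in range(T):
    u = first_action(A, B, Q, R, x, horizon=T - k)   # re-optimize over the receded horizon
    x = A @ x + 0.9 * (B @ u)                   # "true" plant: 10% actuator loss (assumption)
    history.append(x.copy())
final_error = float(np.linalg.norm(x))
```

Because the problem is re-solved from the measured state at every step, the mismatch between the surrogate model and the plant is compensated without being modelled explicitly.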

As already discussed, all presented impulsive guidance techniques can be adapted as the optimization core of an MPC-controlled process, where linear dynamics are used as an optimization surrogate model. The true nonlinearities of the CR3BP relative motion, uncertainty and process noise are compensated along the time receding horizon through iterative optimization. Of particular relevance is the synergy between the AL-iLQR technique and an MPC iterative process [87], which, as the rest of combinations and to the best of our knowledge, has not been addressed before for rendezvous purposes. In such a scheme, at each time step, the former is used for motion planning in a fast and robust way, while the upper MPC iteration accommodates non-modelled effects in the optimization model. Not only is the MPC/AL-iLQR combination promoted for impulsive guidance and control, but it can also be employed for low-thrust missions given the piecewise nature of the continuous iLQR solution.

5.2. Sliding Mode Control

Sliding Mode Control (SMC) is one of the most widely used variable structure controllers, developed by Emelyanov et al. in the late 1950s [95]. An SMC relies on discontinuous piecewise functions to provide global Lyapunov stability properties to the controlled plant by forcing the dynamical system to slide on a given state space manifold in which first-order dynamics are imposed [96].

Consider the dynamical system of Equation (21), where

f

and

g

may not be exactly known, but upper-bounded by continuous operators all over the state space. If the relative error

e (t) = s (t) - s_{d} (t)

is measured with respect to a given state space reference trajectory,

s_{d}

, then the sliding hypersurface is defined by

œ (e) = \sum_{k = 2}^{K} {(\frac{d}{d t} + λ)}^{k - 1} e = 0,

where

λ

represents a Hurwitz scalar. In this work

K = 2

is chosen, which yields a first-order SMC control law.

Any tracking problem is therefore equivalent to stabilizing the system dynamics on

œ (e) = 0

, which is also equivalent to imposing the Lyapunov’s Direct Method condition

\dot{œ} \cdot œ \leq 0 .

Once the sliding manifold is defined, a global exponentially stabilizing behaviour may be obtained from the following equivalent control law:

u_{eq} (t) = g {(t, s)}^{- 1} [{\ddot{s}}_{d} (t) - f (t, s) - λ \dot{e} (t)],

provided

g^{- 1}

exists. Also, the Lyapunov constraint is imposed outside the sliding surface using the following discontinuous reaching control function:

u_{r} (t) = - ϵ {∥ œ ∥}^{α} sat (œ, Δ) - ϵ œ,

where

∥ \cdot ∥

denotes the Euclidean norm,

ϵ

is strictly positive,

α \in [0, 1)

, and

Δ

is the boundary layer for the saturation operator

sat (\cdot)

, which prevents the system from chattering across the sliding surface:

sat (y) = \{\begin{matrix} 1 & if & y > Δ \\ \frac{1}{Δ} y & if & - Δ \leq y \leq Δ \\ - 1 & if & y < - Δ \end{matrix}

Hence, the final control law is given by

u (t) = u_{eq} (t) + u_{r} (t) .

For the rendezvous problem at hand and for a prescribed desired guidance trajectory for the chaser to follow,

ρ_{d} (t)

, the SMC control law simplifies to:

\begin{matrix} e (t) & = ρ (t) - ρ_{d} (t) \\ œ (e) & = \dot{e} + λ e \\ u (t) & = {\ddot{ρ}}_{d} (t) - f (t, s) - λ \dot{e} (t) - ϵ [{∥ œ ∥}^{α} sat (œ, Δ) + œ], \end{matrix}

(24)

where

f

is given by any appropriate model of the relative dynamics.

The parameters

{λ, ϵ, α, Δ}

completely define the controller’s performance, where:

$λ$ is the inverse of the characteristic time (i.e., the decay rate) of the reduced first order exponential dynamics on the sliding surface.
$ϵ$ and $α$ determine the controller's behaviour across the sliding surface, with direct implications for chattering and for the attractiveness of the surface.
The boundary layer $Δ$ defines the control bandwidth and the precision to which saturation of the control law is necessary to reach the sliding surface.
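A minimal single-axis sketch of the control law in Equation (24) is given below, to show how the four parameters enter the computation. The plant (a unit-stiffness oscillator with an unmodelled sinusoidal disturbance), the regulation target ρ_d = 0, the explicit Euler integration and the parameter values are all assumptions for illustration; they are not the tuned values nor the relative dynamics of this work.

```python
import math

def sat(y, delta):
    """Saturation with boundary layer delta: linear inside, +/-1 outside."""
    return max(-1.0, min(1.0, y / delta))

def smc_control(e, e_dot, ref_acc, f_model, lam, eps, alpha, delta):
    """First-order SMC law: equivalent control plus reaching term."""
    sigma = e_dot + lam * e                                   # sliding variable
    reaching = eps * (abs(sigma) ** alpha * sat(sigma, delta) + sigma)
    return ref_acc - f_model - lam * e_dot - reaching

lam, eps, alpha, delta = 1.0, 1.0, 0.5, 0.01                  # illustrative values only
rho, rho_dot = 1.0, 0.0                                       # initial relative state
dt, t = 1e-3, 0.0
for _ in range(10000):                                        # 10 time units
    f_model = -rho                                            # assumed model of the dynamics
    u = smc_control(rho, rho_dot, 0.0, f_model, lam, eps, alpha, delta)
    disturbance = 0.05 * math.sin(t)                          # not known to the controller
    acc = f_model + u + disturbance
    rho += dt * rho_dot                                       # explicit Euler step
    rho_dot += dt * acc
    t += dt
sigma_final = rho_dot + lam * rho
```

In this sketch the state first reaches a neighbourhood of the sliding surface and then decays at the rate set by λ, with a residual error whose size is governed by the boundary layer Δ and the disturbance bound.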

This set of parameters can be optimized for the performance of the controller. However, the non-smooth nature of the SMC prevents the use of gradient-descent based techniques. In this work, Genetic Algorithms [97] have been used to heuristically explore this parameter search space and select the optimal set minimizing the following convex multi-objective cost function over a set of typical guidance trajectories

s_{i}^{*} (t)

of interest

\begin{matrix} \underset{λ, ϵ, α, Δ}{\arg \min} & J = \sum_{i = 1}^{N} [\int_{t_{0}}^{t_{f}} ∥ u (t, s_{i}^{*}) ∥ d t + Λ s_{i}^{*} (t_{f})], \end{matrix}

where

Λ

is a constant gain to account for the rendezvous error in the minimization problem.

6. Applications and Testbench Mission Scenarios

This section presents two simulation scenarios to illustrate the use of the manoeuvre design schemes introduced in the preceding sections, and for the purpose of validation and benchmarking.

In the following test cases, all numerical integrations were performed by means of Encke’s formulation for the relative dynamics using a variable-step, variable-order Adams-Bashforth-Moulton predictor-corrector solver of orders 1 to 13 with absolute and relative integration tolerances set to

10^{- 22}

and

2.25 \cdot 10^{- 14}

, respectively, as implemented in Matlab 2021b’s function ode113 [98].

The time discretization for the simulations output was set to

10^{- 3}

nondimensional time units, which corresponds to 1.68 h in the Earth-Moon system and 8.76 days in the Sun-Earth system; consecutive maneuvers were also constrained not to be executed in less than such time interval.

Navigation uncertainty in the state estimation was modeled as a zero-mean normal multivariate distribution

N \sim (0, Σ)

, with diagonal covariance matrix

Σ = 5 I_{3 \times 3} m

for the relative position space and

Σ = 0.5 I_{3 \times 3} m s^{- 1}

for velocity. Moreover, impulsive control actions

Δ V_{i}

are

l^{2}

-bounded by

∥ Δ V_{i} ∥ < 100 m s^{- 1}

, which corresponds to

0.05

in the Earth-Moon system and

0.003

for the Sun-Earth case. Continuous acceleration is restricted to be smaller than

∥ u ∥ < 0.001 m s^{- 2}

unless otherwise specified. Finally, process noise

p

is added as a white noise disturbance, where

∥ p ∥ \leq 5 \cdot 10^{- 2} ∥ u ∥

.

6.1. Servicing a Solar Observatory

Solar and telescope missions have played a fundamental role in both our understanding of the Solar System and our surrounding space environment, as well as the development of modern Orbital Mechanics. As a matter of fact, both the Solar and Heliospheric Observatory and Genesis spacecraft were the very first to demonstrate and use the Three-Body Problem dynamical solutions as a basis for their missions. Libration points provide ideal locations for celestial observation missions too, due to their illumination conditions and uninterrupted communication with the Earth. Hence, for the James Webb Space Telescope (JWST), launched in December 2021, a Sun-Earth

L_{2}

halo orbit with an orbital period of 6 months was selected as its nominal orbit. The JWST and its potential visual-inspection and repair necessities throughout its mission lifetime provide the first validation scenario for the proposed GNC techniques. A nominal

L_{2}

northern halo orbit has been selected to be representative of that of the real mission.

The chaser spacecraft is intended to perform a formation flight around the target spacecraft (JWST) and, if needed, rendezvous with it for repair or maintenance activities. The mission will then finish with an exploration of the Sun-Earth interior realm.

The proposed mission timeline for this fictional JWST servicing spacecraft is summarized in Table 6, and the different mission phases and their associated control cost estimates are listed in Table 7. The spacecraft is assumed to be equipped with chemical propulsion and an electric thruster for relative formation flying. Additionally, simulations only reproduce proximity operations activities, while the transfer phase from a low-Earth orbit to the

L_{2}

target halo orbit range is not considered here; it may be achieved by means of the stable manifold of the orbit, just as the nominal transfer trajectory for the JWST was planned, with potentially null associated control cost [99].

The proximity operations mission starts after nominal loitering halo orbit insertion of the chaser vehicle, into an

L_{2}

halo orbit distinct from that of the target. This initial long range approach considers acquiring the

\sim 100 km

relative range for the start of the formation flying. The initial conditions of the target and the chaser relative state at the halo insertion time are

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 1.00860 & 0.00416 & - 0.00047 & 0.00202 & 0.00474 & 0.00287 \end{matrix}]}^{⊺} \\ s (t_{0}) & = {[\begin{matrix} - 0.00025 & - 0.00416 & - 0.00007 & - 0.00202 & 0.00513 & - 0.00287 \end{matrix}]}^{⊺}, \end{matrix}

which corresponds to an initial relative range of 623,520 km.

The guidance and control scheme used in this phase corresponds to the combination of the MPC controller with an optimal iLQR technique. The time horizon considered is

N = 60

, with impulses every

Δ t = 0.05

nondimensional time units (corrections every 2.90 days). The time of flight of the phase is set therefore to

t_{f} = 3.05

units. Moreover, any state constraint is relaxed in this initial phase. Finally, after a heuristic trial and error process, the discrete AL-iLQR algorithm is defined by the following penalty matrices

Q = (\begin{matrix} 10 \cdot I_{3 \times 3} & 0_{3 \times 3} \\ 0_{3 \times 3} & 10^{- 5} \cdot I_{3 \times 3} \end{matrix}), R = I_{3 \times 3} .

The algorithm achieves a 98% reduction of the initial relative range, as shown in Figure 12, where the relative state evolution of the chaser spacecraft can be analysed. Figure 13 depicts the control input evolution in time during this initial phase.

Once the relative formation flying range is acquired, offline LQR guidance (using the integral-augmented RLLM) is combined with the SMC controller to position the chaser

500 m

from the JWST and maintain this relative distance for two months, by converting the relative formation flying problem into a virtual time-invariant rendezvous problem under an appropriate change of variables. The total time of flight is

t_{f} = 2 π / 3

, formation flying starting at

δ t = π / 2

from phase start. The initial conditions of this phase are given by

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 1.00850 & 0.00390 & - 0.00061 & 0.00174 & 0.00563 & 0.00275 \end{matrix}]}^{⊺} \\ s (t_{0}) & = 1 \cdot 10^{- 4} \cdot {[\begin{matrix} 0.4 & 0.3 & 0.1 & 1.6 & 0.4 & 0.3 \end{matrix}]}^{⊺} . \end{matrix}

The LQR is defined by the weight matrices

Q = 100 \cdot I_{9 \times 9}, R = I_{3 \times 3},

while the SMC, after tuning, is defined by the parameter set

λ = 1, ϵ = 0.98560, α = 0.00601, Δ = 0.01323 .

Offline guidance refers to the regression of the LQR reference state trajectory

s_{ref}

as a function of time, in this case through an appropriate low-order projection in Chebyshev polynomials

T_{i} (τ)

:

s_{ref} \approx \sum_{i = 1}^{N} β_{i} T_{i} (τ) .
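The regression step can be sketched with NumPy's Chebyshev utilities. The sampled trajectory below is a synthetic stand-in for one component of the LQR reference state, and the polynomial degree is an assumption; note also that NumPy indexes the coefficients from degree zero, whereas the sum above starts at i = 1.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

# Synthetic stand-in (assumption) for one component of the sampled LQR reference
t = np.linspace(0.0, 2.0 * np.pi / 3.0, 200)
s_ref = np.exp(-2.0 * t) * np.cos(3.0 * t)

# Map time to the Chebyshev domain tau in [-1, 1]
tau = 2.0 * (t - t[0]) / (t[-1] - t[0]) - 1.0

degree = 12                                     # assumed low-order truncation
beta = cheb.chebfit(tau, s_ref, degree)         # least-squares coefficients beta_i
s_fit = cheb.chebval(tau, beta)                 # reconstruction from the coefficients
max_err = float(np.max(np.abs(s_fit - s_ref)))

def reference_at(time):
    """Evaluate the stored reference at an arbitrary time inside the phase."""
    tau_query = 2.0 * (time - t[0]) / (t[-1] - t[0]) - 1.0
    return float(cheb.chebval(tau_query, beta))
```

Storing only the coefficients `beta` lets a tracking controller query the reference at any time within the phase without re-running the guidance solver.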

The relative state evolution of the chaser spacecraft can be analysed in Figure 14. The needed low-thrust acceleration is displayed in Figure 15.

After 243.33 days of visual inspection and formation flying, the chaser spacecraft is planned to rendezvous with the target vehicle using the combined MPC-MISG core described in Section 3. The phase considers a time horizon of

N = 100

impulses in

t_{f} = 0.6

nondimensional time units. The initial states of the target and of the relative motion for this phase are given by

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 1.00818 & 0.00010 & - 0.00133 & 0.00000 & 0.01042 & 0.00008 \end{matrix}]}^{⊺} \\ s (t_{0}) & = 1 \cdot 10^{- 8} \cdot {[\begin{matrix} 0.14004 & - 0.05683 & - 0.01369 & - 0.01694 & 0.10161 & - 0.05052 \end{matrix}]}^{⊺} . \end{matrix}

The final (purely numerical) relative error between vehicles is

1.98 \cdot 10^{- 16}

, which, in position space, corresponds to

0.02 mm

, thus successfully achieving the rendezvous of the two spacecraft. Figure 16 shows the relative state evolution during this final close-range rendezvous phase, while Figure 17 depicts the needed impulses to accomplish the rendezvous.

The trajectory of the proposed JWST servicing mission for the complete mission timeline is illustrated in Figure 18.

Thus, the proposed solution leads to affordable rendezvous maneuvers and control strategies, where the majority of the control budget is allocated to perform the relative formation flying phase, which is by far the longest in time and the most demanding in high-frequency actuation. Moreover, despite the associated

Δ V

budget, the phase is completed using bounded continuous propulsion, for which direct

Δ V

requirements are less stringent thanks to their higher specific impulse. The rest of the manoeuvres are negligible in cost, while successfully completing all phases. The proposed guidance and control architecture is therefore demonstrated for the mission.

6.2. Rendezvous in a Near Rectilinear Halo Orbit

For the past decade, deep space missions have gained increasing attention from the space industry. In order to ease such missions, a lunar orbit station is planned to be established within this decade, known as the Lunar Gateway [2,3], which will serve as a communication hub and short-term habitation module to support human return to the Moon and as a staging point for these deep space missions. Thanks to their long-term station-keeping stability, nearly uninterrupted communication with Earth and eclipse-avoidance properties, Near Rectilinear Halo Orbits (NRHO) have been selected as the major candidate for the Gateway nominal trajectory; these orbits are highly eccentric trajectories, nearly normal to the orbital plane of the Earth-Moon system. The NRHO family can be found in both

L_{1}

and

L_{2}

southern halo sets, at the closest side to the Moon; although they are an intrinsic solution to the CR3BP, they also persist under ephemeris models. NRHOs provide easy access to the lunar surface, as well as to the Earth, by means of their stable and unstable manifolds.

The Lunar Gateway and its supply and re-fueling necessities will provide the second validation mission scenario that we shall consider: for crew replacement and life support resupply purposes, a chaser spacecraft is tasked to rendezvous with the Gateway station after being inserted into its nominal NRHO, located at the

L_{2}

point in the Earth-Moon system.

The proposed mission schedule for this Lunar Gateway resupply mission is summarized in Table 8, and the different mission phases and their associated control cost estimates are listed in Table 9.

Again, the transfer trajectory from Earth may be designed based on the globalized NRHO stable manifold with potentially null associated control cost [99]. Such design is not considered in this simulation. Instead, to propose a demanding mission scenario, the chaser spacecraft is first inserted into a northern

L_{1}

halo orbit of out-of-plane amplitude of

A_{z}

= 20,000

km

to demonstrate the capabilities of the proposed GNC architecture even in extreme trajectory design cases. Moreover, this initial

L_{1}

halo orbit may serve as a loitering circuit for a complete multi-spacecraft re-supply chain. The initial conditions of the Gateway station (the target) and the relative state after insertion into the loitering

L_{1}

orbit are:

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 1.09452 & 0.10590 & 0.01161 & 0.05686 & 0.10648 & - 0.16324 \end{matrix}]}^{⊺} \\ s (t_{0}) & = {[\begin{matrix} - 0.27039 & - 0.10590 & 0.04520 & - 0.05686 & 0.06077 & 0.16324 \end{matrix}]}^{⊺} . \end{matrix}

which corresponds to an initial relative range of 112,967 km.

The transfer shall be accomplished in less than 14 days or

t_{f} = π

, given it will be performed using low-thrust propulsion. The continuous SDRE algorithm is used to construct an optimal low-thrust trajectory between the two LPOs in closed loop. The algorithm is defined by the following Q and R matrices

Q = (\begin{matrix} 500 \cdot I_{3 \times 3} & 0_{3 \times 6} \\ 0_{6 \times 3} & 10^{- 4} \cdot I_{6 \times 6} \end{matrix}), R = I_{3 \times 3} .

Figures 19 and 20 show the low-thrust transfer and insertion into the nominal target halo orbit in the absolute and relative phase space, respectively. Moreover, Figure 21 shows the control acceleration needed to accomplish the transfer. A lower bound on the control authority exists to escape the gravitational well of the Moon, which is found to be

∥ u ∥ \geq 3 mm s^{- 2}

. The optimal transfer trajectory reduces the initial relative range by 99.99%.

The second phase comprises close-range rendezvous and re-supply of the Gateway after a total time of flight of

t_{f} = 1.25

nondimensional time units or 5.57 days. The GNC architecture selected for this task combines the MPC scheme with the Opt-MI guidance algorithm, with impulses every 1.07 h. The initial conditions of the phase are

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 1.08688 & 0.08105 & 0.03793 & 0.03262 & 0.18559 & - 0.14131 \end{matrix}]}^{⊺} \\ s (t_{0}) & = 1 \cdot 10^{- 4} \cdot {[\begin{matrix} - 0.14131 & - 0.04576 & - 0.01970 & 0.34390 & 0.17148 & - 0.27110 \end{matrix}]}^{⊺} . \end{matrix}

Figure 22 depicts the relative state evolution during this second phase. The high-frequency impulsive actuation, shown in Figure 23, is clearly noticeable. The minimum required control actuation is

0.1 mm / s

. The final rendezvous error is

3.67 m

, completing the rendezvous and docking.

Before concluding the mission with the return of the chaser spacecraft to the

L_{1}

vicinity, a formation flying demonstration is performed around the Lunar Gateway at a relative range of ∼

1 km

, where the chaser vehicle follows a Lissajous-like relative orbit of amplitudes

A_{x} = 1 km

,

A_{z} = 1 km

and frequencies

ω = 4 π

,

ν = 2 π / 3

. The SMC controller tracks the desired state evolution for 21 days or

t_{f} = 3 π / 2

. The initial conditions of the phase are

\begin{matrix} S_{t} (t_{0}) & = {[\begin{matrix} 1.16523 & 0.01927 & - 0.10691 & 0.01694 & - 0.19749 & - 0.02022 \end{matrix}]}^{⊺} \\ s (t_{0}) & = 1 \cdot 10^{- 8} \cdot {[\begin{matrix} - 0.07279 & 0.91323 & 0.26657 & 0 & 0 & 0 \end{matrix}]}^{⊺} . \end{matrix}

Figure 24 shows the relative state evolution during this last phase, where the periodic variation of both the position and velocity is noticeable, as well as the transitory secular departure of the relative state from the rendezvous condition to the nominal formation flying relative range. The associated control acceleration is given in Figure 25.

Again, the majority of the required

Δ V

budget is allocated to the low-thrust long-range rendezvous, the contributions of the other two phases being negligible in comparison. Moreover, with an appropriate design of the

L_{1}

loitering orbit, this control effort may even vanish if heteroclinic connections between the two halo orbits are exploited. Interestingly, a relatively complex formation flying configuration is accomplished nearly for free.

7. Conclusions

The problem of relative orbital motion within the CR3BP framework is tackled in this work. First, a framework for the relative dynamics in the CR3BP is presented: a new set of relations is derived for the development of control schemes based on linearized dynamics, including a novel LTI relative motion model, and a scheme based on Encke’s formulation is presented for the accurate numerical calculation of the relative motion. Afterwards, these tools are utilized for the design of rendezvous manoeuvres between spacecraft in relative motion within the CR3BP dynamical model. To this end, several control and guidance approaches of increasing complexity and performance are investigated and either successfully adapted or specifically designed to relative motion applications and proximity operations in the framework of CR3BP. These approaches enable the design of efficient and affordable (in terms of

Δ V

budget) rendezvous trajectories between spacecraft in neighboring trajectories. These techniques cover a variety of approaches, ranging from impulsive manoeuvres (two-impulse manoeuvres, multi-impulse manoeuvres, optimal

l^{1, 2}

-norm multi-impulsive sequence planning algorithms), to continuous thrust manoeuvres (based on Linear Quadratic Regulators, State Dependent Riccati Equation, Augmented Lagrangian Iterative Linear Quadratic Regulator). Moreover, the presented guidance techniques are effectively combined with modern control strategies for robust and optimal guidance reference tracking (Model Predictive Control for impulsive control inputs and Sliding Mode Control for acceleration-based actuation).

These trajectory design techniques have been successfully validated through application to several mission scenarios, which showcase a good performance (based on a series of proposed performance indices) and flexibility, both for mission design as well as for autonomous proximity operations. However, the design of a complete GNC architecture is still to be accomplished, where navigational aspects, beyond the scope of this work, play a major role. Improvements to the presented algorithms, both for further exploiting the problem’s intrinsic dynamical structures and extending them to rendezvous-related activities, remain as an open line of research for future work.

Author Contributions

Conceptualization, S.C.d.V., H.U., P.S.-L., R.G.-R. and A.K.S.; methodology, S.C.d.V., H.U. and P.S.-L.; writing—original draft preparation, S.C.d.V.; writing—review and editing, H.U., P.S.-L., R.G.-R. and A.K.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Spanish State Research Agency and the European Regional Development Fund through the research grant PID2020-112576GB-C22 (AEI/ERDF, UE).

Data Availability Statement

Not applicable.

Acknowledgments

S.C.d.V., H.U. and P.S.-L. wish to acknowledge the Spanish State Research Agency and the European Regional Development Fund for their support through the research grant PID2020-112576GB-C22 (AEI/ERDF, UE). The Ministry of Education, Culture, Sports, Science and Technology (MEXT) of the Japanese government supported R.G.-R. under its program of scholarships for graduate school students.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CR3BP	Circular Restricted Three-Body Problem
GNC	Guidance, Navigation and Control
iLQR	Iterative Linear Quadratic Regulator
JWST	James Webb Space Telescope
LOS	Line of Sight
LP	Linear Programming
LPO	Libration Point Orbit
LQR	Linear Quadratic Regulator
LTI	Linear Time Invariant
LTP	Linear Time Periodic
MI	Multi-Impulse
MISG	Multi-Impulsive Staging Guidance
MCPI	Modified Chebyshev Picard Iterations
MPC	Model Predictive Control
Opt-MI	Optimal Multi-Impulse
RLM	Rendezvous Linear Model
RLLM	Relative Libration Linear Model
SDRE	State Dependent Riccati Equation
SMC	Sliding Mode Control
STM	State Transition Matrix
TI	Two-Impulse

Appendix A. Numerical Integration

Throughout this investigation, two numerical integrators are predominantly used and are here described.

General integration of the relative and absolute motion dynamical systems is achieved through a variable-step, variable-order Adams-Bashforth-Moulton predictor-corrector solver of orders 1 to 13. The order-13 formula is used to form the error estimate, and the solver applies local extrapolation to advance the integration at order 13. This integrator has been employed as commercially implemented in Matlab 2021b [98].

Additionally, the iLQR guidance core benefits from a Modified Chebyshev-Picard Iterations (MCPI) integrator, as formulated in the seminal work of Bai and Junkins [12]. The high-accuracy MCPI scheme allows the nonlinear, controlled relative vector field to be integrated without explicit regression of the control law with respect to an independent variable, thus enhancing numerical precision.

Consider the first-order differential dynamical system

\dot{s} = f (μ, s, u, t)

where

u

is the control input. The solution of the above system is given by the quadrature

s (t) = s (t_{0}) + \int_{t_{0}}^{t} f (μ, s, u, ξ) d ξ .

Given an initial estimation of

s (t)

, the MCPI approximates the solution

s

as a Chebyshev polynomial series of order N, evaluated over the

N + 1

Chebyshev nodes

τ_{j}

s (τ_{j}) \approx T W α = C_{s} α

where T is the Chebyshev polynomial matrix evaluated over the Chebyshev nodes

τ_{j}

and W is an appropriate weight matrix (see [12] for details). The vector

α

collects the coordinates of

s

in the Chebyshev functional space.

To comply with the differential vector field

f

, this approximation is updated through the following iterative procedure:

1. Compute the nonlinear vector field $g$ evaluated at the Chebyshev nodes $τ_{j}$

$g (τ_{j}) = \frac{t_{1} - t_{0}}{2} f (μ, s_{j}, u_{j}, \frac{t_{1} - t_{0}}{2} τ_{j} + \frac{t_{1} + t_{0}}{2}) .$

2. Compute the updated polynomial coefficients $β$

$β = C_{α} g + χ_{0},$

where $χ_{0}$ is a function of the initial conditions $s_{0}$ and $C_{α}$ is appropriately defined in [12].

3. Update the state approximation $s$

$s (τ_{j}) \leftarrow C_{s} β .$

Iterations continue until a convergence criterion is met.
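The iteration described above can be illustrated with a minimal Python sketch. It is not the matrix formulation of [12]: instead of the precomputed operators C_α and C_s, it fits and integrates the Chebyshev series with NumPy's `numpy.polynomial.chebyshev` routines, which is an assumed simplification. The function name `picard_chebyshev`, the choice of Chebyshev-Gauss-Lobatto nodes, the constant initial estimate and the tolerance are likewise illustrative, and the control input is assumed to be folded into the callable `f(t, s)`.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def picard_chebyshev(f, s0, t0, t1, order=32, tol=1e-12, max_iter=100):
    """Integrate ds/dt = f(t, s) on [t0, t1] by Picard iteration on Chebyshev nodes.

    Returns the physical times at the nodes and the state at each node,
    ordered from t1 (first row) down to t0 (last row).
    """
    s0 = np.atleast_1d(np.asarray(s0, dtype=float))
    # Chebyshev-Gauss-Lobatto nodes tau_j in [-1, 1] and the matching times.
    tau = np.cos(np.pi * np.arange(order + 1) / order)
    half = 0.5 * (t1 - t0)
    t = half * tau + 0.5 * (t1 + t0)

    s = np.tile(s0, (order + 1, 1))  # initial estimate: constant state
    for _ in range(max_iter):
        # Step 1: scaled vector field g evaluated at the nodes.
        g = half * np.array([f(tj, sj) for tj, sj in zip(t, s)])
        # Step 2: Chebyshev coefficients of g, integrated from tau = -1.
        coeffs = cheb.chebfit(tau, g, order)
        beta = cheb.chebint(coeffs, lbnd=-1.0)
        # Step 3: updated state approximation at the nodes.
        s_new = s0 + cheb.chebval(tau, beta).T
        converged = np.max(np.abs(s_new - s)) < tol
        s = s_new
        if converged:
            break
    return t, s


# Example: harmonic oscillator x'' = -x with x(0) = 1, x'(0) = 0 on [0, 2].
t, s = picard_chebyshev(lambda t, s: np.array([s[1], -s[0]]), [1.0, 0.0], 0.0, 2.0)
print(s[0], [np.cos(2.0), -np.sin(2.0)])  # state at t1 next to the exact solution
```

Because every node is updated from the previous iterate only, the evaluation of the vector field in Step 1 is independent across nodes, which is what makes this family of integrators attractive for vectorized or parallel evaluation.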

Appendix B. Performance Indices

Different performance indices are defined and used to compare the proposed guidance and control schemes to achieve rendezvous and proximity operations. These are based on proxies that are regularly used in Control Theory to assess general control performance, defined both in terms of the rendezvous error

e (t)

over time and the control effort needed to reach that state.

In particular, the Integral of the Absolute Error (IAE) and the Integral of the Square of the Error (ISE) are well-known indices to assess the performance of a given control action as a function of the system relative error; for this study, we define these indices as the following integral loss functions:

\begin{matrix} IAE & = \int_{0}^{t_{f}} | e | d t \\ ISE & = \int_{0}^{t_{f}} e \cdot e d t . \end{matrix}

Propellant consumption can be directly quantified by integrating the

l^{1}

and

l^{2}

-distances of the control law

u

over the time of flight

t_{f}

, thus defining the following performance indices:

\begin{matrix} E_{1} & = \int_{0}^{t_{f}} {| u |}_{1} d t \\ E_{2} & = \int_{0}^{t_{f}} u \cdot u d t . \end{matrix}

Finally, the equivalent finite burn

Δ V_{T}

is given by the following integral:

Δ V_{T} = \int_{0}^{t_{f}} ∥u∥ d t .
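For sampled simulation output, the integral indices above can be approximated by numerical quadrature. The sketch below is illustrative only: it assumes trapezoidal integration, interprets |e| as the Euclidean norm of the error vector, and uses hypothetical names (`performance_indices`, `_trapz`) that do not come from the original implementation.

```python
import numpy as np


def _trapz(y, t):
    """Trapezoidal quadrature of samples y over the time grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))


def performance_indices(t, e, u):
    """Approximate IAE, ISE, E1, E2 and Delta-V_T from sampled histories.

    t: (n,) times; e: (n, m) rendezvous error; u: (n, 3) control acceleration.
    """
    t, e, u = (np.asarray(a, dtype=float) for a in (t, e, u))
    return {
        "IAE": _trapz(np.linalg.norm(e, axis=1), t),  # |e| taken as the 2-norm
        "ISE": _trapz(np.sum(e * e, axis=1), t),      # integral of e . e
        "E1": _trapz(np.sum(np.abs(u), axis=1), t),   # integral of the l1-norm of u
        "E2": _trapz(np.sum(u * u, axis=1), t),       # integral of u . u
        "dV": _trapz(np.linalg.norm(u, axis=1), t),   # equivalent finite burn
    }


# Example with constant error and control histories over t in [0, 2].
t = np.linspace(0.0, 2.0, 11)
e = np.tile([3.0, 4.0, 0.0], (t.size, 1))
u = np.tile([1.0, -2.0, 2.0], (t.size, 1))
print(performance_indices(t, e, u))
```

With constant histories the quadrature is exact, so the example provides a simple sanity check before applying the function to actual trajectories.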

In addition, the computational cost of each algorithm has been quantified as the mean computational time taken to solve the corresponding problem, averaged over 25 repetitions. Simulations have been performed in Matlab on an 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80 GHz with 15.7 GB of RAM.

Appendix C. AL-iLQR Formulation, Hessian and Gradient Approximations

This appendix details the mathematical formulation of the AL-iLQR scheme, and the exact form of the gradient and Hessian operators within the AL-iLQR optimal control problem. Recovering its general formulation

\begin{matrix} \underset{u, λ}{\arg \min} & J^{*} = \sum_{i = 0}^{N} \frac{1}{2} (s_{i}^{⊺} Q_{i} s_{i} + Δ t u_{i}^{⊺} R u_{i}) + {(λ_{i} + \frac{1}{2} c (s_{i}) I^{γ})}^{⊺} c (s_{i}) \\ \text{subject to} & s_{i + 1} = f (μ, s_{i}, u_{i}, Δ t), \\ & s (t_{0}) = s_{0}, \end{matrix}

(A1)

attention is now drawn to Step I of the iLQR scheme, which solves the unconstrained problem. The backward pass starts by decomposing the cost function into a terminal and a running cost

J = l (s_{N}, λ_{N}, γ_{N}) + \sum_{i = 0}^{N - 1} l (s_{i}, u_{i}, λ_{i}, γ_{i}) .

Using Bellman’s Principle of Optimality and exploiting the recursive structure of J, the cost-to-go and action-value functions,

V_{i} (s_{i}) |_{λ, γ}, M (s_{i}, u_{i}) |_{λ, γ}

respectively, are defined as

\begin{matrix} V_{N} (s_{N}) |_{λ, γ} = l (s_{N}, λ_{N}, γ_{N}), \\ V_{i} (s_{i}) |_{λ, γ} = \underset{u_{i}}{\min} \{l (s_{i}, u_{i}, λ_{i}, γ_{i}) + V_{i + 1} (f (μ, s_{i}, u_{i}, Δ t)) |_{λ, γ}\}, \\ V_{i} (s_{i}) |_{λ, γ} = \underset{u_{i}}{\min} M (s_{i}, u_{i}) |_{λ, γ} . \end{matrix}

To optimize the action sequence

u_{i}

, deviations from the true optimal solution

u_{i}^{*}

are considered by a second-order expansion of

V_{i}

\begin{matrix} δ V_{i} \approx \frac{1}{2} δ s_{i}^{⊺} S_{i} δ s_{i} + v_{i}^{⊺} δ s_{i} \end{matrix} .

At the final step N,

δ V_{N}

is completely defined

\begin{matrix} v_{N} = {(l_{N})}_{s} + {(c_{N})}_{s}^{⊺} (λ + I_{N}^{γ} c_{N}), \\ S_{N} = {(l_{N})}_{s s} + {(c_{N})}_{s}^{⊺} I_{N}^{γ} {(c_{N})}_{s} . \end{matrix}

Here,

{(\cdot)}_{x}

denotes the partial derivative with respect to the vector

x

. To derive a recursive relationship between

δ V_{i}

and

δ V_{i + 1}

, the second-order expansion of M around the optimal solution is leveraged

δ M_{i} = \frac{1}{2} {[\begin{matrix} δ s_{i} \\ δ u_{i} \end{matrix}]}^{⊺} [\begin{matrix} M_{s s} & M_{s u} \\ M_{u s} & M_{u u} \end{matrix}] [\begin{matrix} δ s_{i} \\ δ u_{i} \end{matrix}] + {[\begin{matrix} M_{s} \\ M_{u} \end{matrix}]}^{⊺} [\begin{matrix} δ s_{i} \\ δ u_{i} \end{matrix}] .

The optimal control sequence deviation

δ u_{i}^{*}

is therefore given by

\frac{\partial δ M_{i}}{\partial δ u_{i}} = 0 \to δ u_{i}^{*} = - M_{u u}^{- 1} (M_{u s} δ s_{i} + M_{u}) = K_{i} δ s_{i} + d_{i} .

Finally, the Gauss-Newton approximation of the Hessian of M and its gradient yield the following exact expressions

\begin{matrix} M_{s} = Q_{i} s_{i} + A_{i}^{⊺} v_{i + 1} + {(c_{i})}_{s}^{⊺} (I^{γ} c + λ) \\ M_{u} = R u_{i} + B_{i}^{⊺} v_{i + 1} + {(c_{i})}_{u}^{⊺} (I^{γ} c + λ) \\ M_{s s} = Q_{i} + A_{i}^{⊺} S_{i + 1} A_{i} + {(c_{i})}_{s}^{⊺} I^{γ} {(c_{i})}_{s} \\ M_{u u} = R + B_{i}^{⊺} S_{i + 1} B_{i} + {(c_{i})}_{u}^{⊺} I^{γ} {(c_{i})}_{u} \\ M_{u s} = B_{i}^{⊺} S_{i + 1} A_{i} + {(c_{i})}_{u}^{⊺} I^{γ} {(c_{i})}_{s} \end{matrix}

where

v_{i}

and

S_{i}

are the gradient and Hessian of the cost-to-go function

V_{i}

. Finally, the backward pass is closed with the recursions for

S_{i}

and

v_{i}

\begin{matrix} S_{i} = M_{s s} + K_{i}^{⊺} (M_{u u} K_{i} + M_{u s}) + M_{s u} K_{i}, \\ v_{i} = M_{s} + K_{i}^{⊺} (M_{u u} d_{i} + M_{u}) + M_{s u} d_{i} . \end{matrix}

Matrices A and B are the linearization of

f

with respect to

s

and

u

\begin{matrix} A = \frac{\partial}{\partial s} f, B = \frac{\partial}{\partial u} f . \end{matrix}
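The backward recursion of this appendix can be sketched in a few lines of Python. The sketch is a simplified illustration rather than the implementation used in this work: the constraint terms (those involving c, λ and I^γ) are dropped, so it corresponds to the unconstrained inner problem only, the weights Q and R are taken as constant, and the function name `backward_pass` and its argument layout are assumptions made for this example.

```python
import numpy as np


def backward_pass(A, B, Q, R, s_nom, u_nom, S_N, v_N):
    """Unconstrained iLQR backward pass (constraint terms dropped).

    A, B: lists of N Jacobians along the nominal trajectory.
    s_nom, u_nom: nominal states and controls (one row per step).
    S_N, v_N: Hessian and gradient of the terminal cost.
    Returns the feedback gains K_i and feedforward terms d_i.
    """
    N = len(A)
    S, v = S_N, v_N
    K, d = [None] * N, [None] * N
    for i in reversed(range(N)):
        # Gradient and Gauss-Newton Hessian blocks of the action-value function.
        M_s = Q @ s_nom[i] + A[i].T @ v
        M_u = R @ u_nom[i] + B[i].T @ v
        M_ss = Q + A[i].T @ S @ A[i]
        M_uu = R + B[i].T @ S @ B[i]
        M_us = B[i].T @ S @ A[i]
        # Optimal control deviation: delta_u = K_i delta_s + d_i.
        K[i] = -np.linalg.solve(M_uu, M_us)
        d[i] = -np.linalg.solve(M_uu, M_u)
        # Recursions for the cost-to-go Hessian S_i and gradient v_i.
        S = M_ss + K[i].T @ (M_uu @ K[i] + M_us) + M_us.T @ K[i]
        v = M_s + K[i].T @ (M_uu @ d[i] + M_u) + M_us.T @ d[i]
    return K, d


# Example: discrete double integrator linearized about the zero trajectory.
dt, N = 0.1, 50
A = [np.array([[1.0, dt], [0.0, 1.0]])] * N
B = [np.array([[0.5 * dt**2], [dt]])] * N
Q, R = np.eye(2), np.eye(1)
K, d = backward_pass(A, B, Q, R, np.zeros((N, 2)), np.zeros((N, 1)), Q, np.zeros(2))
print(K[0], d[0])
```

For linear dynamics and a zero nominal trajectory, the feedforward terms vanish and the gains reduce to those of the finite-horizon discrete LQR, which offers a convenient check of the recursion.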

References

Weinzierl, M. Space, the Final Economic Frontier. J. Econ. Perspect. 2018, 32, 173–192. [Google Scholar] [CrossRef]
International Space Exploration Coordination Group. Global Exploration Roadmap; Technical Report; NASA: Washington, DC, USA, 2018. [Google Scholar]
International Space Exploration Coordination Group. Global Exploration Roadmap Supplement; Technical Report; NASA: Washington, DC, USA, 2020. [Google Scholar]
Farquhar, R. The Control and Use of Libration-Point Satellites; Technical Report; NASA: Greenbelt, MD, USA, 1970. [Google Scholar]
Farquhar, R.; Muhonen, D.; Newman, C.; Heubergerg, H. Trajectories and Orbital Maneuvers for the First Libration-Point Satellite. J. Guid. Control 1980, 3, 549–554. [Google Scholar] [CrossRef]
Wiesel, W.; Shelton, W. Modal control of an unstable periodic orbit. J. Astronaut. Sci. 1983, 31, 63–76. [Google Scholar]
Simó, C.; Gómez, G.; Llibre, J.; Martínez, R.; Rodríguez, J. On the optimal station keeping control of halo orbits. Acta Astronaut. 1987, 15, 391–397. [Google Scholar] [CrossRef]
Gómez, G.; Howell, K.; Masdemont, J.; Simó, C. Station-Keeping Strategies For Translunar Libration Point Orbits. In Proceedings of the Advances in the Astronautical Sciences, Greenbelt, MD, USA, 11–15 May 1998; Volume 99. [Google Scholar]
Howell, K.; Pernicka, H. Station-keeping method for libration point trajectories. J. Guid. Control. Dyn. 1993, 16, 151–159. [Google Scholar] [CrossRef]
Howell, K.C.; Gordon, S.C. Orbit Determination Error Analysis and a Station-keeping Strategy for Sun-Earth L1 Libration Point Orbits. J. Astronaut. Sci. 1994, 42, 207–228. [Google Scholar]
Dwivedi, N.P. Deterministic Optimal Maneuver Strategy for Multi-Target Missions. J. Optim. Theory Appl. 1975, 17, 133–153. [Google Scholar] [CrossRef]
Bai, X.; Junkins, J.L. Modified Chebyshev-Picard Iteration Methods for Station-Keeping of Translunar Halo Orbits. Math. Probl. Eng. 2012, 2012, 1–18. [Google Scholar] [CrossRef]
Hou, X.; Liu, L.; Tang, J. Station-keeping of small amplitude motions around the collinear libration point in the real Earth–Moon system. Adv. Space Res. 2011, 47, 1127–1134. [Google Scholar] [CrossRef]
Folta, D.; Pavlak, T.; Howell, K.; Woodard, M.; Woodfork, M.A. Stationkeeping of Lissajous Trajectories in the Earth-Moon System with Applications to ARTEMIS. Adv. Astronaut. Sci. 2010, 136, AAS 10-113. [Google Scholar]
Folta, D.C.; Pavlak, T.A.; Haapala, A.F.; Howell, K.C.; Woodard, M.A. Earth–Moon libration point orbit stationkeeping: Theory, modeling, and operations. Acta Astronaut. 2014, 94, 421–433. [Google Scholar] [CrossRef]
Jin, Y.; Xu, B. A Modified Targeting Strategy for Station-Keeping of Libration Point Orbits in the Real Earth-Moon System. Int. J. Aerosp. Eng. 2019, 2019, 3257514. [Google Scholar] [CrossRef]
Carletta, S.; Pontani, M.; Teofilatto, P. Station-keeping about sun-mars three-dimensional quasi-periodic collinear libration point trajectories. Adv. Astronaut. Sci. 2020, 173, 299–311. [Google Scholar]
Breakwell, J.; Kamel, A.A.; Ratner, M.J. Station-keeping for a translunar communication station. Celest. Mech. 1974, 10, 357–373. [Google Scholar] [CrossRef]
Jones, B.L.; Bishop, R.H. H₂ optimal halo orbit guidance. J. Guid. Control. Dyn. 1993, 16, 1118–1124. [Google Scholar] [CrossRef]
Scheeres, D.J.; Vinh, N.X. Dynamics and control of relative motion in an unstable orbit. In Proceedings of the Astrodynamics Specialists Conference, Denver, CO, USA, 14–17 August 2000; AIAA Paper 2000-4135. pp. 192–202. [Google Scholar] [CrossRef]
Luquette, R.J.; Sanner, R.M. A Non-Linear Approach to Spacecraft Formation Control in the Vicinity of a Collinear Libration Point. In Proceedings of the Astrodynamics Specialists Conference, Monterey, CA, USA, 5–8 August 2002; Volume 109. [Google Scholar]
Gurfil, P.; Kasdin, N.J. Stability and control of spacecraft formation flying in trajectories of the restricted three-body problem. Acta Astronaut. 2004, 54, 433–453. [Google Scholar] [CrossRef]
Gurfil, P.; Idan, M.; Kasdin, N.J. Adaptive Neural Control of Deep-Space Formation Flying. J. Guid. Control. Dyn. 2003, 26, 491–501. [Google Scholar] [CrossRef]
Marchand, B.G.; Howell, K.C. Control Strategies for Formation Flight In the Vicinity of the Libration Points. J. Guid. Control. Dyn. 2005, 28, 1210–1219. [Google Scholar] [CrossRef]
Marchand, B.; Howell, K.C.; Betts, J. Discrete Nonlinear Optimal Control of S/C Formations Near the L1 and L2 Points of the Sun-Earth/Moon System. Adv. Astronaut. Sci. 2006, 123, AAS 05-341. [Google Scholar]
Infeld, S.I.; Jossely, S.B.; Murray, W.; Ross, I.M. Design and Control of Libration Point Spacecraft Formations. J. Guid. Control Dyn. 2007, 30, 899–909. [Google Scholar] [CrossRef][Green Version]
Kulkarni, J.; Campbell, M.; Dullerud, G. Stabilization of Spacecraft Flight in Halo Orbits: An H_∞ Approach. IEEE Trans. Control Syst. Technol. 2006, 14, 572–578. [Google Scholar] [CrossRef]
Nazari, M.; Anthony, W.M.; Butcher, E. Continuous Thrust Stationkeeping in Earth-Moon L1 Halo Orbits Based on LQR control and Floquet Theory. In Proceedings of the AIAA/AAS Astrodynamics Specialist Conference, Keystone, CO, USA, 21–24 August 2014. [Google Scholar] [CrossRef]
Lian, Y.; Gómez, G.; Masdemont, J.J.; Tang, G. Station-keeping of real Earth–Moon libration point orbits using discrete-time sliding mode control. Commun. Nonlinear Sci. Numer. Simul. 2014, 19, 3792–3807. [Google Scholar] [CrossRef]
Ulybyshev, Y. Long-Term Station Keeping of Space Station in Lunar Halo Orbits. J. Guid. Control. Dyn. 2015, 38, 1063–1070. [Google Scholar] [CrossRef]
Narula, A.; Biggs, J.D. Fault-Tolerant Station-Keeping on Libration Point Orbits. J. Guid. Control. Dyn. 2018, 41, 879–887. [Google Scholar] [CrossRef]
Peng, H.; Liao, Y.; Bai, X.; Xu, S. Maintenance of Libration Point Orbit in Elliptic Sun–Mercury Model. IEEE Trans. Aerosp. Electron. Syst. 2018, 54, 144–158. [Google Scholar] [CrossRef]
Qi, Y.; de Ruiter, A. Station-keeping strategy for real translunar libration point orbits using continuous thrust. Aerosp. Sci. Technol. 2019, 94, 105376. [Google Scholar] [CrossRef]
Héritier, A.; Howell, K.C. Dynamical evolution of natural formations in libration point orbits in a multi-body regime. Acta Astronaut. 2014, 102, 332–340. [Google Scholar] [CrossRef]
Xu, M.; Liang, Y.; Fu, X. Formation flying on quasi-halo orbits in restricted Sun–Earth/Moon system. Aerosp. Sci. Technol. 2017, 67, 118–125. [Google Scholar] [CrossRef]
Fu, X.; Xu, M. Formation Flying Along Low-Energy Lunar Transfer Trajectory Using Hamiltonian-Structure-Preserving Control. J. Guid. Control Dyn. 2019, 42, 650–661. [Google Scholar] [CrossRef]
Cheng, Y.; Circi, C.; Lian, Y. Hamiltonian Structure-Based Formation Flight Control Along Low-Energy Transfer Trajectory. J. Guid. Control Dyn. 2021, 44, 522–536. [Google Scholar] [CrossRef]
Jung, S.; Kim, Y. Formation flying along unstable Libration Point Orbits using switching Hamiltonian structure-preserving control. Acta Astronaut. 2019, 158, 1–11. [Google Scholar] [CrossRef]
Elliott, I.; Bosanac, N. Spacecraft Formation Control Near a Periodic Orbit Using Geometric Relative Coordinates. In Proceedings of the AAS/AIAA Space Flight Mechanics Meeting, Orlando, FL, USA, 9–11 August 2021. [Google Scholar]
Elliott, I.; Bosanac, N. Impulsive control of formations near invariant tori via local toroidal coordinates. In Proceedings of the AAS/AIAA Astrodynamics Specialist Virtual Conference, Online. 9–12 August 2021. [Google Scholar]
Bonasera, S.; Elliott, I.; Sullivan, C.; Bosanac, N.; Ahmed, N.; McMahon, J. Designing Impulsive Station-Keeping Maneuvers Near a Sun-Earth L2 Halo Orbit via Reinforcement Learning. In Proceedings of the AAS/AIAA Space Flight Mechanics Meeting, Orlando, FL, USA, 9–11 August 2021. [Google Scholar]
Bosanac, N.; Bonasera, S.; Sullivan, C.; McMahon, J.; Ahmed, N. Reinforcement Learning for Reconfiguration Maneuver Design in Multi-Body Systems. In Proceedings of the AAS/AIAA Astrodynamics Specialist Virtual Conference, Online, 9–12 August 2021. [Google Scholar]
Gao, C.; Masdemont, J.J.; Gómez, G.; Chen, J.; Yuan, J. High order dynamical systems approaches for low-thrust station-keeping of libration point orbits. Acta Astronaut. 2022, 190, 349–364. [Google Scholar] [CrossRef]
Shirobokov, M.; Trofimov, S.; Ovchinnikov, M. Survey of Station-Keeping Techniques for Libration Point Orbits. J. Guid. Control. Dyn. 2017, 40, 1085–1105. [Google Scholar] [CrossRef]
Luquette, R.J. Nonlinear Control Design Techniques for Precision Formation Flying at Lagrange Points. Ph.D. Thesis, University of Maryland, College Park, MD, USA, 2006. [Google Scholar]
Franzini, G. Relative Motion Dynamics and Control in the Two-Body and in the Restricted Three-Body Problems. Ph.D. Thesis, Università di Pisa, Pisa, Italy, 2018. [Google Scholar]
Gerding, R. Rendezvous equations in the vicinity of the second libration point. J. Spacecr. Rocket. 1971, 8, 292–294. [Google Scholar] [CrossRef]
Jones, B.L. A Guidance and Navigation System for Two Spacecraft Rendezvous in Translunar Halo Orbit. Ph.D. Thesis, University of Texas at Austin, Austin, TX, USA, 1993. [Google Scholar]
Jones, B.L.; Bishop, R.H. Rendezvous targeting and navigation for a translunar halo orbit. J. Guid. Control Dyn. 1994, 17, 1109–1114. [Google Scholar] [CrossRef]
Canalias, E.; Masdemont, J.J. Rendez-vous in lissajous orbits using the effective phase plane. In Proceedings of the 57th International Astronautical Congress, Valencia, Spain, 2–6 October 2006. [Google Scholar] [CrossRef]
Mand, K. Rendezvous and Proximity Operations at the Earth-Moon L2 Lagrange Point: Navigation Analysis for Preliminary Trajectory Design. Master’s Thesis, Rice University, Houston, TX, USA, 2014. [Google Scholar]
Ueda, S.; Murakami, N. Optimum guidance strategy for rendezvous mission in Earth-Moon L2 Halo orbit. In Proceedings of the 25th International Symposium on Space Flight Dynamics ISSFD 2015, Munich, Germany, 19–23 October 2015. [Google Scholar]
Sato, Y.; Kitamura, K.; Shima, T. Spacecraft Rendezvous Utilizing Invariant Manifolds for a Halo Orbit. Trans. Jpn. Soc. Aeronaut. Space Sci. 2015, 58, 261–269. [Google Scholar] [CrossRef]
Lizy-Destrez, S. Operational scenarios optimization for r supply of crew and cargo of anInternational gateway Station located near the Earth-Moon-Lagrangian point-2. Ph.D. Thesis, L’Université de Toulouse, Toulouse, France, 2015. [Google Scholar]
Murakami, N.; Ueda, S.; Ikenaga, T.; Maeda, M.; Yamamoto, T.; Ikeda, H. Practical Rendezvous Scenario for Transportation Missions to Cis-Lunar Station in the Earth-Moon L2 Halo Orbit. In Proceedings of the 25th International Symposium on Space Flight Dynamics ISSFD 2015, Munich, Germany, 19–23 October 2015. [Google Scholar]
Murakami, N.; Yamanaka, K. Trajectory design for rendezvous in lunar Distant Retrograde Orbit. In Proceedings of the 2015 IEEE Aerospace Conference, Big Sky, MT, USA, 7–14 March 2015; pp. 1–13. [Google Scholar] [CrossRef]
Ueda, S.; Murakami, N.; Ikenaga, T. A Study on Rendezvous Trajectory Design Utilizing Invariant Manifolds of Cislunar Periodic Orbits. In Proceedings of the AIAA Guidance, Navigation, and Control Conference, Grapevine, TX, USA, 9–13 January 2017. [Google Scholar] [CrossRef]
Lizy-Destrez, S.; Le Bihan, B.; Campolo, A.; Manglativi, S. Safety Analysis for Near Rectilinear Orbit Close Approach Rendezvous in the Circular Restricted Three-Body Problem. In Proceedings of the 68th Annual International Astronautical Congress (IAC 2017), Adelaide, Australia, 25–29 September 2017. [Google Scholar]
Davis, D.; Bhatt, S.; Howell, K.; Jang, J.; Whitley, R.; Clark, F.; Guzzetti, D.; Zimovan, E.; Barton, G. Orbit maintenance and navigation of human spacecraft at cislunar near rectiliear Halo orbits. In Proceedings of the Advances in the Astronautical Sciences, Stevenson, WA, USA, 20–24 August 2017; Volume 160, pp. 5–9. [Google Scholar]
Lizy-Destrez, S.; Beauregard, L.; Blazquez, E.; Campolo, A.; Manglativi, S.; Quet, V. Rendezvous Strategies in the Vicinity of Earth-Moon Lagrangian Points. Front. Astron. Space Sci. 2019, 5, 45. [Google Scholar] [CrossRef]
Blazquez, E.; Beauregard, L.; Lizy-Destrez, S.; Ankersen, F.; Capolupo, F. Rendezvous design in a cislunar near rectilinear Halo orbit. Aeronaut. J. 2020, 124, 821–837. [Google Scholar] [CrossRef]
Khoury, F. Orbital Rendezvous and Spacecraft Loitering in the Earth-Moon System. Master’s Thesis, Purdue University, Lafayette, IN, USA, 2020. [Google Scholar]
Bucchioni, G. Guidance and Control for Phasing, Rendezvous and Docking in the Three Body Lunar Space. Ph.D. Thesis, Università di Pisa, Pisa, Italy, 2021. [Google Scholar]
Bucchioni, G.; Innocenti, M. Phasing Maneuver Analysis from a Low Lunar Orbit to a Near Rectilinear Halo Orbit. Aerospace 2021, 8, 70. [Google Scholar] [CrossRef]
Galullo, M.; Bucchioni, G.; Franzini, G.; Innocenti, M. Closed Loop Guidance During Close Range Rendezvous in a Three Body Problem. J. Astronaut. Sci. 2022, 69, 28–50. [Google Scholar] [CrossRef]
Ulybyshev, Y. Optimization of Low Thrust Rendezvous Trajectories in Vicinity of Lunar L2 Halo Orbit. In Proceedings of the AIAA/AAS Astrodynamics Specialist Conference, Long Beach, CA, USA, 13–16 September 2016. [Google Scholar] [CrossRef]
Sánchez, J.; Gavilán, F.; Vázquez, R. Chance-constrained Model Predictive Control for Near Rectilinear Halo Orbit Spacecraft Rendezvous. Aerosp. Sci. Technol. 2020, 100, 105827. [Google Scholar] [CrossRef]
Colagrossi, A.; Lavagna, M. Dynamical analysis of rendezvous and docking with very large space infrastructures in non-Keplerian orbits. CEAS Space J. 2018, 10, 87–99. [Google Scholar] [CrossRef]
Colagrossi, A.; Pesce, V.; Bucci, L.; Colombi, F.; Lavagna, M. Guidance, navigation and control for 6DOF rendezvous in Cislunar multi-body environment. Aerosp. Sci. Technol. 2021, 114. [Google Scholar] [CrossRef]
Battin, R.H. An Introduction to the Mathematics and Methods of Astrodynamics; AAIA Education Series; American Institute of Aeronautics and Astronautics, Inc.: Reston, VA, USA, 1999. [Google Scholar] [CrossRef]
Casotto, S. The equations of relative motion in the orbital reference frame. Celest. Mech. Dyn. Astron. 2016, 124, 215–234. [Google Scholar] [CrossRef]
Encke, J.F. Uber die allgemeinen störungen der planeten. In Berliner Astronomisches Jahrbuch für 1856; Dümmler: Berlin, Germany, 1857. [Google Scholar]
Cuevas, S.; Urrutxua, H.; Solano-Lòpez, P. Dynamics, Guidance and Control for Autonomous Rendezvous and Docking in the Restricted Three Body Problem. In Proceedings of the 31th Workshop on JAXA Astrodynamics and Flight Mechanics, JAXA/ISAS, Online. 26–27 July 2021. [Google Scholar]
Cuevas, S.; Urrutxua, H.; Solano-Lòpez, P. Relative Dynamics and Shape-based Methods for Guidance in the Restricted Three-Body Problem. In Proceedings of the 73rd International Astronautical Congress, Paris, France, 18–22 September 2022. [Google Scholar]
Richardson, D.L. A Note on Lagrangian Formulations for Motion about the Collinear Points. Celest. Mech. 1980, 22, 231–236. [Google Scholar] [CrossRef]
Howell, K.; Pernicka, H. Numerical Determination of Lissajous Trajectories in the Restricted Three-Body Problem. Celest. Mech. 1988, 41, 107–124. [Google Scholar] [CrossRef]
Clohessy, W.; Whiltshire, R. Terminal Guidance System for Satellite Rendezvous. J. Astronaut. Sci. 1960, 27, 653–678. [Google Scholar] [CrossRef]
Camacho, E.; Bordons, C. Model Predictive Control; Springer: London, UK, 1998. [Google Scholar]
Penrose, R. A generalized inverse for matrices. Math. Proc. Camb. Philos. Soc. 1955, 51, 406–413. [Google Scholar] [CrossRef]
Ross, I.M. Space Trajectory Optimization and L¹-Optimal Control Problems. In Modern Astrodynamics; Elsevier: Amsterdam, The Netherlands, 2006; Chapter 6; pp. 155–186. [Google Scholar] [CrossRef]
Breger, L.; How, J. J2-modified GVE-based MPC for formation flying spacecraft. In Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, San Francisco, CA, USA, 15–18 August 2012. [Google Scholar] [CrossRef][Green Version]
Jackson, B.E. AL-iLQR Tutorial. 2019. Available online: https://bjack205.github.io/papers/AL_iLQR_Tutorial.pdf (accessed on 3 December 2022).
Lawden, D.F. Optimal Trajectories for Space Navigation; Cambridge University Press: Cambridge, UK, 1963. [Google Scholar]
Jezewsky, D.; Rozendaal, H. An efficient method for calculating optimal free-space N-impulse trajectories. AIAA J. 1968, 6, 2160–2165. [Google Scholar] [CrossRef]
Prussing, J. Optimal impulsive linear systems: Sufficient conditions and maximum number of impulses. J. Astronaut. Sci. 1995, 43, 195–206.
Çimen, T. Systematic and effective design of nonlinear feedback controllers via the state-dependent Riccati equation (SDRE) method. Annu. Rev. Control 2010, 34, 32–51.
Tassa, Y.; Erez, T.; Todorov, E. Synthesis and stabilization of complex behaviors through online trajectory optimization. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Faro, Portugal, 7–12 October 2012; pp. 4906–4913.
Aziz, J. Low-Thrust Many-Revolution Trajectory Optimization. Ph.D. Thesis, University of Colorado Boulder, Boulder, CO, USA, 2018.
Li, W.; Todorov, E. Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems. In Proceedings of the 1st International Conference on Informatics in Control, Automation and Robotics, Setubal, Portugal, 25–28 August 2004; Volume 1, pp. 222–229.
Schwenzer, M.; Ay, M.; Bergs, T.; Abel, D. Review on model predictive control: An engineering perspective. Int. J. Adv. Manuf. Technol. 2021, 117, 1327–1349.
Richards, A.; How, J. Performance Evaluation Of Rendezvous Using Model Predictive Control. In Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, Austin, TX, USA, 11–14 August 2003.
Gavilan, F.; Vazquez, R.; Camacho, E.F. Chance-constrained model predictive control for spacecraft rendezvous with disturbance estimation. Control Eng. Pract. 2012, 20, 111–122.
Hartley, E.N. A tutorial on model predictive control for spacecraft rendezvous. In Proceedings of the 2015 European Control Conference (ECC), Linz, Austria, 15–17 July 2015; pp. 1355–1361.
Richards, A.; How, J. Analytical Performance Prediction for Robust Constrained Model Predictive Control. Int. J. Control 2006, 79.
Emelyanov, S. Variable Structure Control Systems; Nauka: Moscow, Russia, 1967.
Slotine, J.; Li, W. Applied Nonlinear Control; Prentice-Hall: Hoboken, NJ, USA, 1991.
Holland, J.H. Adaptation in Natural and Artificial Systems; MIT Press: Cambridge, MA, USA, 1992.
Mathworks. Matlab 2021b. Available online: https://es.mathworks.com/products/new_products/release2021b.html (accessed on 6 October 2022).
Gómez, G.; Llibre, J.; Martínez, R.; Simó, C. Dynamics and Mission Design Near Libration Points; World Scientific Publishing: Singapore, 2001.

Figure 1. Relative orbital motion in the CR3BP.

Figure 2. Integration error for the different relative dynamics formulations.

Figure 3. Rendezvous trajectory in the absolute configuration space for the MI, MISG and AL-iLQR schemes.

Figure 4. Relative state evolution for the TI (solid), AL-iLQR (dashed), Opt-MI (*), MI (barred), MISG (triangled) strategies.

Figure 5. Impulse sequence in $\mathrm{m/s}$ for the Opt-MI/MPC (upper left), AL-iLQR (upper right), MISG/DC (lower left) and MISG/MPC (lower right) strategies.

Figure 6. Relative docking angle evolution for LOS-constrained and unconstrained rendezvous using AL-iLQR.

Figure 7. Rendezvous trajectory in the absolute configuration space for the AL-iLQR and LQR (RLLM) controllers.

Figure 8. Relative state evolution for the LQR (solid), SDRE (*) and AL-iLQR (dashed) controllers.

Figure 9. Control acceleration in $\mathrm{mm/s^{2}}$ for the LQR, SDRE and AL-iLQR controllers.

Figure 10. Relative state evolution for the SDRE (*) and AL-iLQR (dashed) controllers for the halo transfer mission scenario.

Figure 11. Control acceleration in $\mathrm{mm/s^{2}}$ for the LQR, SDRE and AL-iLQR controllers for the halo transfer mission scenario.

Figure 12. Evolution of the relative state vector components for the JWST servicing spacecraft during the long-range homing.

Figure 13. Control input in $\mathrm{m/s}$ for the JWST servicing spacecraft during the long-range homing.

Figure 14. Evolution of the relative state vector components for the JWST servicing spacecraft during the low-thrust approach.

Figure 15. Control input in $\mathrm{mm/s^{2}}$ for the JWST servicing spacecraft during the low-thrust approach.

Figure 16. Evolution of the relative state vector components for the JWST servicing spacecraft during the final close-range rendezvous.

Figure 17. Control input in $\mathrm{m/s}$ for the JWST servicing spacecraft during the final close-range rendezvous.

Figure 18. Trajectory of the JWST servicing spacecraft for the complete mission timeline.

Figure 19. Absolute trajectory for the Lunar Gateway resupply mission during the long-range rendezvous.

Figure 20. Relative state evolution for the Lunar Gateway resupply mission during the long-range rendezvous.

Figure 21. Control input in $\mathrm{mm/s^{2}}$ for the Lunar Gateway resupply mission during the long-range rendezvous.

Figure 22. Relative state evolution for the Lunar Gateway resupply mission during the close-range rendezvous.

Figure 23. Control input in $\mathrm{m/s}$ for the Lunar Gateway resupply mission during the close-range rendezvous.

Figure 24. Relative state evolution for the Lunar Gateway resupply mission during formation flying.

Figure 25. Control input in $\mathrm{mm/s^{2}}$ for the Lunar Gateway resupply mission during formation flying.

Table 1. Set of prescribed, randomly distributed $t_{i}$ values used for the MI algorithm.

$t_{1}$	$t_{2}$	$t_{3}$	$t_{4}$	$t_{5}$	$t_{6}$
0.0851	0.2531	0.2912	0.4802	0.5494	0.5743

Table 2. Performance comparison for the impulsive guidance schemes.

Controller	ISE	IAE	$\|E_{1}\|$	$\|E_{2}\|$	$\Delta V_{T}$ [m/s]	$\Delta V_{\min}$ [m/s]	$\Delta V_{\max}$ [m/s]	Computational Time [s]
TI	0.00253	0.03846	0.18441	0.00751	121.35	44.42	76.93	0.62
TI (RLLM)	0.00253	0.03846	0.18441	0.00751	121.35	44.42	76.93	1.77
MI	0.00282	0.04079	0.11385	0.00603	79.65	0.00	79.65	5.15
Opt-MI (MPC)	0.00297	0.04119	0.21257	0.00746	148.85	0.010	87.29	2.12
MISG (DC)	0.00238	0.03644	0.21802	0.00077	147.67	0.00	16.75	39.67
MISG (MPC)	0.00266	0.03769	0.23306	0.00060	157.63	0.08	8.89	64.61
AL-iLQR	0.00276	0.04662	0.24120	0.00149	163.02	0.001	17.85	3.49

Table 3. Performance metrics for the LOS-constrained AL-iLQR rendezvous.

Controller	ISE	IAE	$\|E_{1}\|$	$\|E_{2}\|$	$\Delta V_{T}$ [m/s]	$\Delta V_{\min}$ [m/s]	$\Delta V_{\max}$ [m/s]	Computational Time [s]
AL-iLQR	0.00276	0.04663	0.00079 0.24117	0.00149	163.00	0.002	17.85	6.11

Table 4. Performance comparison for the continuous guidance schemes.

Controller	ISE	IAE	$\|E_{1}\|$	$\|E_{2}\|$	$\Delta V_{T}$ [m/s]	$\Delta V_{\max}$ [m/s]	Computational Time [s]
LQR (RLM)	0.00025	0.01840	0.04215	0.00064	33.55	0.07	2.98
LQR (RLLM-LTI)	0.00021	0.01707	0.04880	0.00082	37.56	0.08	2.85
SDRE	0.00023	0.01748	0.04334	0.00069	33.47	0.10	82.68
AL-iLQR	0.00035	0.01495	0.07454	0.00693	62.72	0.90	8.17

Table 5. Performance comparison for the continuous guidance schemes.

Controller	ISE	IAE	$\|E_{1}\|$	$\|E_{2}\|$	$\Delta V_{T}$ [m/s]	$\|u\|_{\max}$ [$\mathrm{mm/s^{2}}$]	Computational Time [s]
LQR (RLM)	0.00303	0.07229	0.31636	0.03401	218.26	1.3	3.88
LQR (RLLM-LTI)	0.00315	0.08307	0.24028	0.01325	167.60	0.47	3.39
SDRE	0.00225	0.05801	0.15826	0.00731	113.07	0.28	3.70
AL-iLQR	0.00295	0.04378	0.30071	0.06669	199.15	2.71	7.80

Table 6. Phases timeline for the JWST servicing mission.

Mission Time	Phase
T + 0	Long-range rendezvous & homing
T + 177.18 days	Low-thrust approach and formation flying
T + 420.51 days	Rendezvous

Table 7. Phase metrics for the JWST servicing mission.

Phase	Time of Flight	Controller	Control Cost
Long-range rendezvous	177.18 days	MPC-AL-iLQR	$723.91\ \mathrm{m\,s^{-1}}$
Formation flight	243.33 days	SMC-LQR	$1.34 \cdot 10^{4}\ \mathrm{m\,s^{-1}}$
Rendezvous	34.85 days	MPC-MISG	$3.61 \cdot 10^{-4}\ \mathrm{m\,s^{-1}}$

Table 8. Phases timeline for the Gateway resupply mission.

Mission Time	Phase
T + 0	Long-range rendezvous & homing
T + 14 days	Close-range rendezvous & mating
T + 19.57 days	Formation flying

Table 9. Phase metrics for the Gateway resupply mission.

Phase	Time of Flight	Controller	Control Cost
Long-range rendezvous	14 days	SDRE	$1.42\ \mathrm{km\,s^{-1}}$
Rendezvous	5.57 days	MPC-Opt-MI	$0.36\ \mathrm{m\,s^{-1}}$
Formation flying	21 days	SMC	$2.32\ \mathrm{m\,s^{-1}}$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

