Rendezvous in Cis-Lunar Space near Rectilinear Halo Orbit: Dynamics and Control Issues

: The paper presents the development of a fully-safe, automatic rendezvous strategy between a passive vehicle and an active one orbiting around the Earth–Moon L2 Lagrangian point. This is one of the critical phases of future missions to permanently return to the Moon, which are of interest to the majority of space organizations. The ﬁrst step in the study is the derivation of a suitable full 6-DOF relative motion model in the Local Vertical Local Horizontal reference frame, most suitable for the design of the guidance. The main dynamic model is approximated using both the elliptic and circular three-body motion, due to the contribution of Earth and Moon gravity. A rather detailed set of sensors and actuator dynamics was also implemented in order to ensure the reliability of the guidance algorithms. The selection of guidance and control is presented, and evaluated using a sample scenario as described by ESA’s HERACLES program. The safety, in particular the passive safety, concept is introduced and different techniques to guarantee it are discussed that exploit the ideas of stable and unstable manifolds to intrinsically guarantee some properties at each hold-point, in which the rendezvous trajectory is divided. Finally, the rendezvous dynamics are validated using available Ephemeris models in order to verify the validity of the results and their limitations for future more detailed design.


Introduction
The paper presents some of the dynamics models and potential guidance algorithms usable for the preliminary assessment of a rendezvous between an unmanned vehicle ascending from the Moon, and a permanent manned/unmanned service station located in a collinear Lagrangian point orbit of the Earth-Moon system. The motivation for this work is given by the renewed interest of many government agencies and private organizations in returning to our satellite.
The positive aspects of one such parking orbit lie in the possibility of continuous communications with Earth, the visibility of the lunar hidden surface, and the relative low cost of station keeping maneuvers. Lagrangian points are known to be equilibria within a circular restricted three-body model. Although the model is an approximation of the real multibody dynamics, it serves for a useful series of analyses, in particular when it comes to guidance and control algorithms, which, by nature, need simplified modeling for their primary design.
The attractiveness of the above orbits (Lissajous, halo, etc.) has been exploited over the years by several missions, especially in the Sun-Earth system. The Solar and Heliospheric Observatory and the Deep Space Climate Observatory are located at the L1 collinear point, while the Wilkinson Microwave Anisotropy Probe and the Gaia spacecraft used the L2 collinear point. The orbit of the James Webb telescope will be also located at the latter.
While the previous examples focused on science experiments and scientific knowledge, the use of Lagrangian points in the Earth-Moon system is also considered for future human exploration. Of particular interest, we can mention the Chinese Chang'e series missions, especially the number four, with the presence of the relay satellite Queqiao. The spacecraft took 24 days to reach L2 from Earth, using a lunar swing-by to save fuel. On 14 June 2018, Queqiao finished its final adjustment burn and entered the L2 halo mission orbit, which is about 65,000 kilometers from the Moon. This is the first lunar relay satellite at that location. The ARTEMIS mission instead is a programmed multi-agency mission with the objective of a permanent return to the Moon inclusive of human presence. Within ARTEMIS, a permanent station in an L2 halo orbit will serve as a bridge between human/robotic activities on the Moon, and the transfer of astronauts and material to and from the Earth and the Moon.
The present paper is cast within NASA's ARTEMIS program and considers one specific aspect of ESA's Heracles mission concept [1], which is the relative dynamics and guidance issues of the automated rendezvous between a lunar ascender vehicle and an L2 orbiting station.
Since this type of mission has not been performed yet, the paper presents several aspects of originality. The first aspect is relative to the dynamic modeling. It is important to establish the level of modeling necessary to describe the relative motion between the chasing vehicle and the target station. Keeping in mind that we are interested in a preliminary model for guidance and control, the paper describes a comparison between elliptic and circular three-body motions and validates their propagation against an Ephemeris model using GMAT and JPL ephemerides data. The inclusion of the Sun as a fourth body is considered negligible, and only potential effects of the solar radiation pressure are investigated. The incorporation of a set of sensors and actuators is also considered, in order to verify propagation errors due to the presence of their models in the dynamics. Another contribution is the selection of a set of guidance algorithms for open loop and closed loop control of the chaser, depending on the distance from the target. Guidance methods are not proposed as the optimal ones; instead, they are considered as "potential" feasibility solutions. Finally, numerical simulations are performed using the scenario defined by [1] as an example.
The background mission is described in Section 2: the mathematics relative to relative motion and their validation are described in Sections 3 and 4; Section 5 describes the models used for the primary sensors and actuators; Section 6 presents the structure of the guidance algorithms, whose details are reported in the appendix. Finally, numerical results and conclusions are shown in Sections 7 and 8.

Background
This section describes the general sequence of phases within ESA's Heracles (currently renamed ESA Large Logistic Lander or EL3) mission. Figures 1 and 2 show an artistic representation of the lunar ascent vehicle, which would rendezvous with a station in orbit.  The Heracles project was initially planned with the objective of delivering Moon samples to Earth on NASA's Orion spacecraft as early as its fourth or fifth mission. The initial project was set up to be integrated with NASA's plan to return to the Moon, and to build a permanent station (Gateway) in cislunar orbit. This has now become NASA's Artemis mission.
A small lander with a rover inside weighing around 1800 kg in total would land and be monitored by astronauts from the Gateway. The current status of the mission includes two distinct tasks: the first is to deploy a rover whose objective is the collection of soil samples, the second is to launch a module that will carry samples to the orbiting station. When the ascent module carrying the sample container arrives, the Gateway's robotic arm will capture it and berth it with the outpost's airlock for unpacking and transfer of the container to Orion and subsequent unmanned flights to Earth and later on with returning astronauts.
The aim of the mission is to prove the advancements in autonomous operations in space, in particular the focus of this paper is on the sequence of phasing and rendezvous of the LAE with the orbiting LOP-G. The first part of the mission starts with the phasing maneuver that sees the transferring of the LAE-also called chaser in the following-with the payload of lunar soil, collected from the south pole of the Moon, from an assumed low lunar orbit with an altitude of 100 km towards the L2 Near Rectilinear Halo Orbit (NRHO) in which the LOP-G-also called target-is orbiting. The rendezvous maneuver will take care of the approach when the relative distance between the two space vehicles is less than 100 km.
The literature and the technical details will be discussed in the dedicated sections, and the goal of this section is instead to provide a general background for of the challenges that the design of a transfer maneuver in such peculiar environment implies.
One of the critical aspects of this problem is the formulation of the proper equations of motion that should be written in non-inertial reference frame and must describe the complete 6-DOF relative dynamics for the rendezvous and berthing; however, under the restricted three-body problem hypothesis, there is no closed form solution and no standard procedure to compute the best rendezvous strategy and to select the best guidance/control technique.
Phasing maneuvers are continuously performed in low Earth orbit to connect to the ISS, but no experience is available in the case of automated motion in the restricted threebody problem. The LAE must fly towards the Gateway departing from a Low Lunar Orbit to a L2-NRHO, minimizing the fuel consumption and the required time of flight. The degrees of freedom of the maneuver are multiple. A possibility is to take advantage of the manifold theory, with the dynamics being modeled under the circular restricted three-body problem, in order to have a preliminary evaluation of the influence of design parameters such as TOF, number of impulses, and their amplitude.
Once the phasing is accomplished, the rendezvous phase may start immediately or after one orbit or two orbits depending on the acquired relative initial conditions. The rendezvous is a critical phase of the approaching strategy, since the two vehicles are very close to one another and the risk of collision or mission abortion is critically high. The standard closing maneuver usually follows a series of way points located in which the chaser must stop and check its state and the relative state according to accuracy, changes to sensor suite, and safety. During the rendezvous, multiple changes of the types of sensors, guidance, and control methods may be needed; therefore, all of this instrumentation must be properly selected, modeled, and validated via simulation. One of the main requirements, for instance, is the use of cameras for information on translational as well as attitude state variables.
As an example, Figure 3 shows a conceptual scheme of the sequence of the complete maneuver for the mission under study in the paper.

Relative Dynamics Equations of Motion
This section reviews the relative motion of the two spacecraft within the Earth-Moon system, or, more generally, the motion of three bodies under mutual gravitational attraction. The design of the rendezvous and docking maneuvers requires an accurate formulation of the relative motion; in this work, some working hypotheses are made, such as the elliptic and circular restricted three-body problem hypotheses. In addition, the equations of motions are formulated in non-inertial frames to make them more useful in the analysis and synthesis of the GNC loop. The complexity of this formulation stays in the fact that no closed solution exists for the problem. The interested reader can refer to [2] for more details. The appropriate reference frames are reviewed herein; then, translational and rotational dynamics are described.

Reference Frames
The generic inertial frame centered in O and with unit vectorsÎ,Ĵ, andK, will be denoted as follows: Consider two primary bodies, with masses M 1 and M 2 , orbiting around their composite center of mass C in a collinear formation. A convenient frame for describing the motion of spacecraft in such a system is the synodic reference frame. It can be centered in one of the primaries center of mass, as in [3], and the unit vectors are defined as follows: •î s = −r 12 / r 12 where r 12 is the position of M 2 with respect to M 1 ; •k s is perpendicular to the plane where the primaries revolve, and is positive in the direction of the system angular velocity vector; • s =k s ×î s completes the right-handed coordinate systems. This choice may be convenient when spacecraft measurements are taken with respect to the nearest primary. For instance, in [3], the Sun-Earth/Moon system is considered, and the synodic frame is centered on the Earth, since the spacecraft receives measurements with respect to it. In the following, we will refer to such a coordinate system as primary-centered rotating frame, and it will be denoted as where i is the index of the primary chosen for placing the coordinate frame. Figure 4 shows the primary-centered rotating frame, centered on M 2 , and the Local-Vertical Local Horizon Frame, described next.  Rendezvous trajectories are generally described in a frame local to the target. This eases the analysis and the trajectory monitoring of incoming vehicles, as well as the definition of keep-out zones and admissible approaching corridors. The LVLH frame is usually employed for rendezvous scenario analysis, and defined as: The LVLH (Sometimes called orbital frame) frame is defined with respect to the primary body around which the target is orbiting. Denoting with r it the target position with respect to the primary i, with ṙ it M i , the target velocity as seen from the primary, and with h it = r it × ṙ it M i , the target specific angular momentum with respect to the primary, the LVLH frame unit vectors are defined and named as follows: •k = −r it / r it points to the primary and is called R-bar; • = −h it / h it is perpendicular to the target instantaneous orbital plane (negative specific angular momentum direction) and is called H-bar; •î = ×k completes the right-handed reference frame, and is called V-bar.
The above definition of the LVLH frame is consistent with the one given by Fehse in its classical reference book for spacecraft rendezvous and docking [4] (pp. 31-32, Section 3.1.3). The LVLH frame for a target orbiting around M 2 is shown in Figure 4.
Note that the formal definition of the LVLH frame is relative to Kaplerian motion so using R-bar, H-bar, and V-bar is somewhat an abuse of notation, justified by their use in the community. In addition to the above, other body fixed reference frames can be used to highlight the dynamics relative to specific points of the vehicles (docking port, geometric center of mass, location of actuators, etc.).

Translational Equations of Relative Motion
Consider a target and a chaser spacecraft, orbiting around the Moon, and subjected to both Earth and Moon gravitational influence. Their equations of motion with respect to the Moon are: where: • µ = 1.215 × 10 −2 is the Moon-Earth mass ratio parameter defined as: µ = µ m µ e +µ m = (1 + M e M m ) −1 with M e + M m = 1 • r em is the position of the Moon with respect to the Earth, and r em = r em its norm; • r and r c are the target and chaser positions with respect to the Moon, with norms r = r , and r c = r c , respectively.
In order to simplify computation, distances are normalized to the Moon semi-major axis and time to the Moon mean angular motion.
With reference to Figure 5, the chaser position with respect to the Moon is given by: where ρ is the relative position of the chaser with respect to the target. Taking the second time derivative of Equation (6) and using Equations (4) and (5), we obtain the nonlinear equations of relative motion in the LVLH reference frame: Recall that: Using the subscript notation for the reference frames above, the angular velocity of the LVLH frame with respect to the inertial frame can be computed by simple composition: where ω l/m and ω m/i are the angular velocities of L with respect to M, and of M with respect to I, respectively. The LVLH angular acceleration with respect to I is given by: Equation (7), along with Equations (9) and (10), constitutes a set of time-varying nonlinear equations: • r, ω l/m , and ω l/m L that depend on the target motion around the Moon; • r em , ω m/i , and ω m/i M , characteristics of the Moon orbital motion.
Equation (7) can be further simplified if we consider the restricted three-body problem (elliptic or circular), which will be actually used later on in the paper for the guidance design.

Attitude Equations of Motion
Traditionally, orbital mechanics refers to trajectory dynamics; however, in our problem, attitude motion becomes especially critical in order to maintain correct orientation in the final phase (berthing or docking).
Standard derivation using rigid body approximation uses Euler's rotation equations with the general form [5]: where N is the vector of external torques, I is the inertia tensor matrix, and ω is the angular velocity about (typically) the principal axes. As angular velocity vectors are cumulative, we can write the inertial angular velocity as: where ω is the angular rate of body chaser frame w.r.t. inertial frame, ω co (or ω c ) is the angular rate of body chaser frame w.r.t. orbital frame and ω oi is the angular velocity of orbital frame w.r.t. inertial frame. Note that all vectors are expressed in chaser body frame.
For the kinematics, we derive the differential equations of motion of the body frame w.r.t. the LVLH reference frame, relating the Euler angles (in their Bryant 3, 2, 1 sequence) with the angular velocity vector ω c [6,7]. Using the standard Euler matrix rotation, we can write ω c as a function of Euler angles. The inverse relationship becomes Quaternions can also be used if singularities are critical. The interested reader can refer to [5,6], for instance.
The target's attitude model is taken from the literature to be a saw tooth type of motion, with an amplitude of one degree and a frequency of 1 Hz [4,7].

Attitude Relative Motion
The relative attitude between the two spacecraft is based on the angular rate vectors, since Euler angles are not cumulative. Thus: where ω ra is the relative angular rate in chaser body frame, ω c is the chaser angular rate expressed in chaser frame, ω t is the target angular rate expressed in orbital frame and R co is the rotation matrix that transforms a vector from orbital frame components to body chaser frame components Using Equation (13), we can express the derivative of the relative Euler anglesθ ra as follows:θ where J(θ ra ) is the appropriate Jacobian matrix.
For close proximity maneuvers, we need the relative centers of mass position of interest, and the position and velocity between two docking ports, which are located elsewhere on the spacecraft. The port to port distance can be expressed as where ρ is the relative distance between chaser and target CoM, r dc and r dt are the docking port position of chaser and target, respectively. The equation above can be seen as an output of the system. The target (or chaser) docking port in the orbital reference frame LVLH can be presented as with the rotation matrix defined as follows: The subscript i = c, t indicate chaser or target, respectively.

Validation of the Equations of Motion
This section describes in detail the validation process performed by comparing the model of the previous section with available Ephemeris models. The dynamics of the target orbit, relative ER3BP, and CR3BP motions are evaluated for a free drift motion and the most significant rendezvous scenario.
The Ephemeris model was not directly implemented and the validated commercial software GMAT was selected [8], since it could be easily integrated with the propagation simulator. The set DE405 generated by JPL was used in the comparison.

Propagation of the Equations of Motion
This section presents the propagation of equations of motion over one target orbit with no active corrections. The comparison was performed in the following way: Elliptic restricted, circular restricted, and GMAT were initialized with the same initial conditions in terms of position and velocity in the synodic reference frame S, and the dynamics were propagated for one orbit. The same procedure was performed for 100 times at different Mean Anomalies (M), from 0 to 180 degrees. The one orbit propagation considered to account for a second rendezvous was necessary if the first was not successful. GMAT was set to simulate the restricted three-body problem dynamics as well.
The orbit propagation in the three different models (GMAT, ER3BP, CR3BP) is shown in Figure 6; in each sub figure, it is possible to see all the trajectories propagated for the duration of seven days. The propagation duration is taken as representative of the orbital period of the target vehicle; even though the Ephemeris propagation is more realistic, the obtained trajectories require some active corrections to be considered orbits. The starting point at the periselene propagation is in blue, whereas, at the aposelene, it is in red.

Comparison Results
The next figures and present the comparison between ER3BP, CR3BP, and GMAT. Figure 7 shows the position (top column) and velocity (bottom column) differences between ER3BP and GMAT for one orbit period and the details at the aposelene. The numerical maximum and minimum values are reported in Tables 1 and 2. Note that the aposelene region is the one that is assumed to provide better rendezvous properties. The position error increases with the simulation time, and it increases more rapidly as the starting point is closer to the periselene. This latter result is expected since the unmodelled dynamics play a larger role.
Analogous tests were performed using the CR3BP and shown in Figure 8 by using the same simulation procedure. It is possible to observe that, as expected, the CR3BP model has larger errors with respect to the Ephemeris than the ER3BP model. Moreover, the error amplitude can be tolerated for mean anomalies between 100 and 180 degrees. The numerical errors in Tables 1 and 2 indicate the values propagated trajectory with respect to GMAT. The tables highlight how the propagation errors increase with the three-body restricted models with respect to the Ephemerides.

Solar Radiation Pressure (SRP)
Since the presence of the solar radiation pressure SRP was initially neglected in the derivation of the equations of motion, a preliminary evaluation of its influence was also assessed on the dynamics of the single spacecraft and on the relative dynamics. The solar radiation pressure is generated by electromagnetic waves that imprint a pressure on the surfaces hit by the waves themselves. There are many models depending on the spacecraft geometry and orbital properties; see [9], for instance. Here, the pressure is modeled as in [10] and given by: where Φ E is the energy flux associated with the electromagnetic wave spectrum, and c = 299792458 m/s is the speed of light. The pressure acts on every exposed surface, and it can influence the motion of the chaser and target vehicles. For additional details, see [10]. In GMAT, the solar radiation pressure model is considered as a spherical radiation from the Sun, with a flux of 1376 W/m 2 . The tests were performed in GMAT with the same Ephemeris model used above, and adding the solar radiation pressure assuming an effective area of 10 m 2 for the target and 3 m 2 for the chaser. In the first test, the target free dynamics are propagated and compared with the same dynamics propagated under the ER3BP; the second test consists of the relative dynamics between chaser and target comparison with the the corresponding dynamics under the ER3BP. The results are summarized in the figures below. Figure 9 shows the propagation of the target orbit with the influence of the solar pressure compared with the dynamics propagated under the ER3BP. The difference between the two models differs from the ones reported in Figure 7 for the shape but not for the order of magnitude. Therefore, the contribution due to SRP can be considered negligible. Figure 10 describes the worst case scenario with the orbits propagated in GMAT with and without solar pressure and the same orbit propagated in ER3BP. Figure 11 shows the orbits propagated from the aposelene.
The free motion of the relative dynamics is also compared with the ER3BP with and without the presence of the SRP; see Figures 12 and 13. In the latter, the propagation is performed for one entire orbit, and, as expected, the maximum amount of error is accumulated at the passage at the periselene; after that, the errors decrease again.   The presence of the SRP changes the shape of the error of the relative dynamics a bit, but the amplitude of the errors stays in the same order of magnitude. The maximum amount of position error is obtained when the dynamics are propagated starting from the periselene; the maximum amount of velocity error is obtained when the two vehicles pass through the periselene, as expected.
The maximum and minimum relative motion errors with and without SRP are shown in Tables 3 and 4, respectively. The relative errors remain very limited and the maximum position value must be taken into account when designing a trajectory that is safe with respect to imposed keep-out-zones.

Sensors and Actuators' Models
One of the important elements introduced in the equations of motion is a characterization of sensors and actuators. The selection is based on the assumed choice for the rendezvous mission and improves the accuracy of overall dynamic behavior. The following sections provide a general description of component suites, their models, and validity. We remind the reader that the motivation for component selection is based on the relative effectiveness for the mission at hand, rather than a theoretical optimization evaluation. In particular, cameras are selected to play a primary role in the relative position measurement.

Sensors
Due to the characteristics of the mission, the set of sensors used by the guidance and control depends on the relative distance between chaser and target, which selects which suite is active at any particular moment. A qualitative representation is shown in Figure 14. Range estimation using the NAC is computed according to: The error is a function of the relative distance R as shown in Figure 16. If the distance is above 10 km, the sensor used to estimate the range is the ISL; in fact, it commits an error in the range estimation equal to 3%R, and, looking at Figure 16, it is possible to notice that the range estimation error produced by the NAC is above 3% for ranges larger than 6 km. The NAC is also used for the measurement of the lateral displacement (α and β) and an error of 100% of the target angular visible size is taken into account (about 1-2 px) at large distances.
At medium distances (10 km-1 km), the ISL is deactivated and the range is estimated using the NAC using Equation (21): cos α c tan α 1 + sin α c ± 0.5 ang px 2α 1 (21) where: • l is the target length that is equal to 5 m; • α c is the measured azimuth angle of the target center of mass, affected by an error of 100% of the target illuminated size. • α 1 is the difference between the azimuth of the centre of the illuminated zone and the azimuth of a side of the illuminated zone. • ang px the pixel angular size of the visible Target area.
The lateral displacement is still measured with the NAC and the relative is between 5 and 100 px (100% of the target illuminated side). The lateral displacement error is computed using Equation (22), and the trend is reported in Figure 17. The angular error α err defines the limits between medium distances and short distances; in fact, medium distances are considered those for which RangeError % is less than 3%R, but the lateral displacement error, computed with Equation (22), is less than 100 px; as a result, the medium distances are those included in the range 1-10 km: The short distances are considered those between 1 km and 5 m; here, the sensor used is the WAC and the error on lateral displacement is equal to 0.2%R and the error on R is 2%R. At short distances, it is also possible to estimate the relative attitude, with an error of 5 degrees maximum.
The estimation of the translational velocity is based on the differentiation principle and a subsequent Kalman filtering procedure; however, in this work, the entire navigation chain is modeled as in Equation (23): whereρ pp is the relative port-to-port velocity affected by the error,ρ pp is the velocity without errors,ρ pp is the relative position affected by error, ρ pp is the relative position, and nρ pp is a white noise. A similar approach was used to model the angular velocity error, as described in Equation (24) whereω is the relative angular velocity affected by the error, ω is the velocity without errors,θ is the relative attitude affected by error, θ is the relative attitude, and n ω is a white noise. Both nρ and n ω characteristics can be selected by the user. The performance of the sensors with the estimated range measurement is shown by the port-to-port trajectory in Figure 18 drawn in red, representative of a V-bar approach to the target. It is important to remark that the range is measured with the camera only when the the target is within the camera field of view; the sensors are initialized at every hold-point and the sensor suite changes only in the hold-points depending on the target-chaser distance at hold-point. The camera is also used to measure the relative attitude for short distances as shown in Figure 19. The range measurements are always available since we assume the implemented controllers capable of keeping the target within the camera's FOV.

Actuators
In general, the set of actuators implements the firing command sequences, and allows the vehicle to separate the motion along different axes and the translational motion from the rotational one. With reference to the Heracles mission, the propulsion is implemented by a main engine and 16 RCS-thrusters of 10N each [11] located on the edges of the chaser vehicle. In this paper, we limit ourselves to describe the model the latter, synthetically shown in Figures 20 and 21. The main sources of errors are thrust magnitude errors, thrust direction errors, rise/fall time, and delays. The sixteen engines magnitude can be modeled as: where F is the nominal thrust level at steady state condition, η 1BIT is the theoretical bit efficiency, and δη 1BIT is the impulse bit efficiency random variation. As said before, in our scenario, F is equal to 10N. The theoretical impulse bit efficiency is computed from the empirical formula given by Equation (26), which is based on the duration of the thruster firing command t on and the thruster off-time: t on and t o f f are expressed in seconds, and c 1 and c 2 are the efficiency factor constants.
According to [12], t on and t o f f can be selected using a PWM approach. The random variation of the impulse bit efficiency δη 1BIT is assumed to be Gaussian with zero mean and the standard deviation σ η 1BIT , given by Equation (27).
The misalignment is mainly composed of two contributions: internal and external. The internal error is due to a misalignment between thrust vector and flange. They are constant over a single thruster firing, but they change randomly from one firing to the next, so the internal uncertainty is assumed to behave as a Gaussian noise. The external errors are due to some misalignment between flange and master reference cube.
The uncertainties remain constant for the entire simulation, but they change from a simulation to the other. As a consequence, the external uncertainty is assumed to behave as a uniformly distributed variable.
In addition to magnitude and direction uncertainty, it is necessary to also take into account the Rise/Fall time behavior [7]. The thrust is discretized following a PWM logic. The motors have a dynamic and minimum t on and t o f f times. The Rise/fall times are taken into account through a filtering operation: the PWM signal will be filter by a 1st or 2nd order filter according to Equations (28) and (29). In the filter equation, we can also take into account a time delay τ: The location of the thrusters depends on the vehicle, its configuration, geometry, and mission. Thus, the contribution that each actuator gives to torques and forces can be computed with respect to the chaser center of mass in the body frame axis, which is the Geometrical frame, defined earlier. Due to the absence of data, the chaser vehicle is considered as a perfect cylinder with uniform density and constant mass; therefore, the body reference system-that coincides with G in this case-was located along principal axes of inertia.
The relationship between the thrust of each motor and the torques and forces produced is then computed by projection. Let us define: F s , the vector that contains all the thrust provided by the small thrusters: The 6x1 vector of forces and torques, in chaser body frame, is called τ, and it is defined as: The relation between τ, F s and F b is a linear relationship given by Equation (32): where B s is called the 6 × 16 control allocation matrix [5]. Control allocation is an area with a large amount of literature and many algorithmic techniques, spanning from structural geometry approaches, to several types of optimization methods. Ref. [13] is especially interesting for our application; in fact, it avoids the use of a PWM module after the control allocation because it computes the optimal combination of duty-cycle to minimize a linear cost function given by: where [δ 1 , . . . , δ n ] is the vector that contains the duty-cycles; each element of this vector is included between 0 and 1, and F s = F m ax[δ 1 , . . . , δ n ], where F m ax is the maximum thrust supplied by a thruster. It defines the control law in (32). Another advantage of this technique is that, by minimizing the duty-cycle vector, the fuel-consumption is minimized as well because it is proportional to the summation of all thruster on time. For the reasons briefly explained above, also supported by empirical results, one of the selected control allocation techniques for comparison is the one proposed by [13]. Clearly, the selection of a specific allocation algorithm also depends on safety factors and software computational power specific for space applications. Therefore, a look-up table approach was also investigated in this study. The control allocation methods described above are described in detail in Refs. [13] and [14], and the interested reader is referred to them. In the following, we only present rendezvous simulation results as a function of PWM working frequency (1 Hz and 2 Hz, respectively). The optimal control allocation in Equation (33) selects an optimal combination of Pulse Width Modulation duty cycle at each step-in the sense that it minimizes the summation over all duty-cycles and, consequently, the fuel consumption. The optimization method is the interior-point-legacy (default) and the optimization procedure was executed at each step using available routines. The rendezvous maneuver simulation is shown in Figure 22, indicating better performance with a higher PWM frequency. In the look-up table algorithm, the thruster modulator is the on-board function that computes parameters for the actuation of the thrusters during a control cycle. The main objective is to compute the percentage of the total duration of the control cycle for which each thruster must be open such that that the average effect of the thrusters over the control cycle results in a force and torque as requested by the controller. The resulting trajectories are shown in Figure 23, for the same frequency range. Using the Heracles mission data, a comparison in terms of thrust consumption is shown in Table 5. The table displays the integral of the acceleration at the thruster during the entire maneuver duration. The "ideal" implies a simulation performed using perfect sensors and actuators, while "non-ideal" indicates results obtained with more accurate models for sensors and actuators as described earlier.

Rendezvous Guidance and Control
A rendezvous guidance profile is generally composed of several elements aimed at steering the chaser towards the target, moving between a set of control points. In the Keplerian case, these trajectories are the result of the application of impulsive or continuous thrust firings along prescribed directions. Except for very small relative distances, the maneuvers are computed in an open-loop fashion, and can be based on the solution of the Hill's equations [4], for instance. The accuracy of the Hill's equations for the restrictedthree-body problem is poor and may compromise the good execution of the maneuver, especially due to their open-loop implementation; therefore, in this section, some of the maneuvers used for rendezvous and docking are reproduced exploiting the linearized elliptic ELERM and circular CLERM equations, whose derivation can be found in [2], together with their range of validity along the target orbit.
The linearization procedure yields a linear time varying expression in state variable form given by:ẋ with A ∈ R 6×6 defined as: with state variable defined in Equation (7). The difference between ELERM and CLERM in Equation (34) is due to the angular velocities assumption between elliptic and circular motions. The closed-form solution of the these equation sets is not straightforward, due to the presence of time-varying parameters, and the absence of general analytical solutions for ER3BP and CR3BP orbits. To deal with this problem, different guidance approaches are introduced. The selection of the guidance algorithm is influenced by the relative distance, the required level of accuracy, the capability of closing the loop, and the available/current sensor's and actuator's suite. We must remark that the aim of the paper is not to propose the best guidance algorithm, rather to provide a suitable choice for the mission. Figure 24 shows a sequence of algorithms that are used in the paper. The mathematical details are reported in the Appendix. The methods used for the guidance are: The first two techniques are used to control the relative translational motion and the absolute attitude motion outside of a safety area defined to avoid collision, and they operate in an open loop fashion for large to medium distances. The third controller takes care of the regulation of the attitude and the position in the last part of the approach.

Adjoint Guidance
Adjoint models (or more properly dual models) have been extensively exploited for the solution of classical missile guidance problems, see [15,16], and more recently in [17,18]. They appear to be well suited due to the time varying nature of the relative dynamics. The application of adjoint theory to our problem is described in Appendix A.1.
The main idea is to consider the relative dynamics as described by ELERM (or in a simpler model CLERM) as given by Equation (A1), with time response at some finite time t f > t 0 given by Equation (A3). We can associate with Equation (A1) a dual system at time t f [19] given by Equation (A4), whose initial time response coincides with the final time response of the original system.
The adjoint guidance is applied in this work to a two-impulse maneuver at each hold point. In other words, the chaser leaves a hold point with an impulsive firing and reaches the next with a braking impulse. The total input has the form given by Equation (A8) with two impulses applied at times t 1 and t 2 ; then, we obtain relative position and rate from Equation (A15), where (ρ f ,ρ f ) is the desired relative state, computed starting from ELERM.
The performance of the guidance was evaluated computing relative distance and velocity errors repeated here for clarity's sake: with total control effort:

PID Guidance Controller
During the initial phase of the rendezvous, the primary concern is to follow a prescribed attitude profile, or to maintain the target in the field of view of the chaser's cameras. To this end, a standard PID controller on all axes was considered to be satisfactory. The controllers were simply tuned to obtain a good rise time.
An example of the performance without and with the controller is shown in Figure A1 and Figure A2, respectively. The simulation considers a nonzero initial condition on the angular velocity with no damping at the beginning of the maneuver. The continuous rotation of the vehicle can be seen in Figure A1. After the insertion of the controller, the vehicle is stabilized to the desired attitude (zero in this case) as shown in Figure A2. The presence of the PID also allows the pointing of the cameras always towards the target avoiding possible singularity issues (of course, the equations can be always transformed in their quaternion, not changing the general controller philosophy).
Two different attitude-modes are available for large and medium distances, where the PID is in charge of controlling the attitude of the chaser. In the former case, the chaser is forced to follow a user-defined attitude profile w.r.t the LVLH; this profile is defined in Euler angles. In the latter case, the chaser must point always towards the target to keep it inside the camera FOV, with the dynamics expressed either in Euler angles or quaternions [20].
In the latter, the quaternion (q re f ) that represents the angular displacement between the camera axis and the target position w.r.t. the chaser being computed first. Then, it is used as a reference for the controller; this means that the controller is asked to zero the chaser attitude w.r.t q re f . To compute q re f , let us define n re f and θ re f as in Equations (39) and (40): Combining the equations yields: The quantities written in Equations (39)-(41) are graphically represented in Figure 25. At close distances, the user can only define the desired relative attitude profile; in other words, the user can define the attitude of the chaser w.r.t. the target at each hold-point that is located at a close distance from the target.

SDRE Guidance
Since the middle of the 1990s, State-Dependent Riccati Equation (SDRE) strategies have emerged as a general design structure that provide a systematic and effective means of designing nonlinear controllers, observers, and filters. These methods overcome many of the difficulties and shortcomings of existing methodologies, and deliver computationally simple algorithms that have been highly effective in a variety of practical and meaningful applications. In a special session at the 17th IFAC Symposium on Automatic Control in Aerospace 2007, theoreticians and practitioners in this area of research were brought together to discuss and present SDRE-based design methodologies as well as review the supporting theory. It became evident that the number of successful simulations, experimental, and practical real-world applications of SDRE control have outpaced the available theoretical results.
The methodology originated as an extension of Linear Quadratic Control techniques to a special class of nonlinear systems. An interesting survey on the method can be found in [21]. Stability and controllability aspects are discussed in [22,23]. SDRE controllers were used in space related applications, especially for the control of proximity operation and formation flight as reported in [24][25][26].
The method entails a factorization (that is, parameterization) of the nonlinear dynamics into a state vector and the product of a matrix-valued function that depends on the state itself. In doing so, the SDRE algorithm fully captures the nonlinearities of the system, bringing the nonlinear system to a (non unique) linear structure having state-dependent coefficient matrices, and minimizing a nonlinear performance index having a quadratic-like structure. A state dependent algebraic Riccati equation P using the SDC matrices is then solved online to give the suboptimum control law.
A more detailed derivation can be found in Appendix A.3. The structure of the control law, in its simpler form that assumes the measurement of the entire state, is a full state feedback controller given by Equation (A25): The procedure can be applied in the presence of inequality constraints as well, derived from a set of admissible states on relative position and velocity, especially at close distances, where relative motion precision is necessary as shown in Equation (A26). The resulting control law then becomes: Here, the gain matrix includes the constraint term K Ω . A constrained proximity representation is shown in Figure A3 of Appendix A.3, with the constraints derivation. A numerical simulation is shown below using the initial conditions in Table 6. We assume that the attitude motion of the target has a maximum amplitude of 5°. For the chaser, the inertia matrix is I = 10 −3 · diag(1.1, 0.6, 0.6) kg · m 2 . The direction vector of approaching cone is [p] T = [−1, 0, 0] , and the maximum cone angle is set as β = 25°. The example was selected to show the performance of the SDRE guidance, assuming perfect sensor and actuator models. The resulting trajectory is shown in Figure 26.

Full Rendezvous Sequence
This section describes an example of full rendezvous/berthing maneuver, using the methodologies described in the paper.
As shown in Section 4, the simulation uses the Ephemeris model, in particular the DE430 provided by JPL. The hold-point settings for the maneuver are given by: The location of the hold-points is passively safe [5], and it was selected minimizing the energy consumption. The duration of the different transfer times is designed to enter the safety area at the aposelene; moreover, the location of HP0 within the target orbit corresponds to a mean anomaly of about 113°that guarantees one of the lowest differences between the Ephemeris equations and the ER3BP equations (inclusive of sensor and actuator dynamics). Figure 27 shows the 3D full rendezvous trajectory, Figure 28 show its the R-bar-V-bar projection, with the safety zone indicated by the circle satisfied until actual berthing occurs.   Figure 29 shows the relative velocity components between chaser and target. The relative attitude is shown in Figure 30. In both cases, the contribution of sensors and actuators models is shown in red as compared to the ideal case (black line).

Conclusions
The paper presents a comprehensive preliminary analysis of the dynamics and control issues arising in a rendezvous mission between an active vehicle departing from lunar orbit to a target station located in an NRHO L2 orbit. The mission requires a three-body problem description due to the gravity of both Earth and Moon. The implementation of dynamic components was based on ESA's Heracles mission data and preliminary configuration. The accuracy of the model was improved with the inclusion of sensors and actuators models, and guidance algorithms were selected in order to verify the reliability of the GNC loop. The validation was presented using an Ephemeris model. The paper shows that elliptic restricted three-body dynamics offers a good approximation; in fact, since the rendezvous will occur at the aposelene, the circular restricted model could be used as a starting point for the guidance design. If the first rendezvous were aborted, the mall errors in the relative motion would allow a try after one orbit. The selection of guidance algorithms was driven by feasibility, and a full mission profile was tested successfully in simulation.

Appendix A.3. SDRE Guidance
Since the middle of the 1990s, State-Dependent Riccati Equation (SDRE) strategies have emerged as general design methods that provide a systematic and effective means of designing nonlinear controllers, observers, and filters. These methods overcome many of the difficulties and shortcomings of existing methodologies, and deliver computationally simple algorithms that have been highly effective in a variety of practical and meaningful applications. In a special session at the 17th IFAC Symposium on Automatic Control in Aerospace 2007, theoreticians and practitioners in this area of research were brought together to discuss and present SDRE-based design methodologies as well as review the supporting theory. It became evident that the number of successful simulations, experimental, and practical real-world applications of SDRE control have outpaced the available theoretical results.
The methodology originated as an extension of Linear Quadratic Control (LQR) techniques to a special class of nonlinear systems. An interesting survey on the method can be found in [21]. Stability and controllability aspects can be found in [22,23]. SDRE controllers were used in space related applications, especially for the control of proximity operation and formation flight [24][25][26].
The method entails a factorization (that is, parameterization) of the nonlinear dynamics into a state vector and the product of a matrix-valued function that depends on the state itself. In doing so, the SDRE algorithm fully captures the nonlinearities of the system, bringing the nonlinear system to a (non-unique) linear structure having statedependent coefficient (SDC) matrices, and minimizing a nonlinear performance index having a quadratic-like structure. An algebraic Riccati equation (ARE) using the SDC matrices is then solved online to give the suboptimum control law.
Let us consider a nonlinear regulator problem for minimizing the cost function in Equation (A22): subject to a nonlinear differential constraint and affine in control (A23): where x ∈ R n is state vector, u ∈ R m is control vector, f : R n → R n , g(x) = 0 ∀x ∈ R n . Q(x) ≥ 0 and R(x) > 0 are the weight matrices of the state vector and the input vector, respectively. If the dynamics of the system can be written in a pseudo-linear form (A24) by means of State Dependent Coefficient (SDC) matrices parametrization: Note that the SDC parametrization is not unique and provides the designer with additional degrees of freedom to enhance controller performance. The SDRE control method can be summarized in the following two steps, both occurring at each integration time. First, the state dependent ARE is solved, then the nonlinear full state (in the most simplest case) feedback control law is found as shown in Equation (A25): Rendezvous maneuvers can also be executed imposing constraints on relative position and velocity, especially at close distances, where relative motion precision is necessary. The SDRE structure can be used to incorporate such constraints.
Consider a system given by Equation (A23), with x(0) = x 0 ∈ Ω and the set of admissible states defined by Equation (A26): Ω = x : l(x) ≤ 0, l(x) ∈ R p , l(·) ∈ C 1 (A26) It is possible to synthesize a SDRE controller that stabilizes the closed loop system, and, like x, does not go beyond ∂Ω, the border of Ω, defined as: ∂Ω = x : l(x) = 0, l(x) ∈ R p , l(·) ∈ C 1 (A27) A sufficient condition for x to remain within Ω isl(x) = 0, see [30,31]: ∇l(x)ẋ = ∇l(x) f(x) + g(x)u = 0 (A28) We can parametrize the above condition in a SDC form by adding a fictitious output z to the problem: Such controller forces the closed loop trajectories to follow the level curves of set Ω. Taking into account the constraint, the cost function changes to Equation (A30) : J(x, u) = J 0 (x, u) + J Ω (x, u) = where W z is a p × p weight matrix, such that its i-th element has a large value when x is near the border of the i-th constraint, and small otherwise. This implies that the component J Ω (x, u) in the cost function dominates with respect to J 0 (x, u) when the state vector does not satisfy the constraint, and becomes negligible when the constraint is satisfied. The matrices in the Riccati equation are now modified as follows: A T (x)P(x) +P(x)Ā(x) −P(x)B(x)R −1 (x)B T (x)P(x) +Q(x) = 0, The resulting control law becomes: where K 0 (x) =R −1 (x)B(x)P(x) K Ω (x) =R −1 (x)D T (x)W z (x)C(x) (A33) K 0 (x) is the controller component responsible for stability and performance, while K Ω (x) satisfies the constraints [31].
There are situations where ∇l(x) is orthogonal to B(x), so that D(x) = 0 eR(x) = R(x). In this case, a possible choice for W z (x) is that the i-th element be selected to be large, when we are in a region of the state space to be penalized, and zero otherwise [30].
As an example, consider the graphical proximity representation in Figure A3, in which p is the unit direction vector of path direction and β is the maximum cone angle of the desired final corridor. In order to satisfy this constraint, we rewrite it using (A26) as follows: where x = ρ ,ρ , q c/l , ω c/l . It is important to note that the constraint given by Equation (A34) is expressed in T frame, so, as we can see from Equation (A34), the direction of cone axis depends on target's attitude. From Equation (A29), we obtain the fictitious output z. Note that, in our case, ∇l(x)⊥B(x), so the weight function W z was selected to penalize the state when it is far from the imposed constraint. To this end, the weight function depends on the 3D distance of chaser's Center of mass (CoM) from the line described by the unit vector p. In this way, W z has a large value when the chaser's CoM is far from the cone axis and a small value when it is close to the axis.