Continuous Low-Thrust Maneuver Planning for Space Gravitational Wave Formation Reconfiguration Based on Improved Particle Swarm Optimization Algorithm

This study proposes a three-spacecraft formation reconfiguration strategy of minimum fuel for space gravitational wave detection missions in the high Earth orbit (105 km). For solving the limitations of measurement and communication in long baseline formations, a control strategy of a virtual formation is applied. The virtual reference spacecraft provides a desired relative state between the satellites, which is then used to control the motion of the physical spacecraft to maintain the desired formation. A linear dynamics model based on relative orbit elements’ parameterization is used to describe the relative motion in the virtual formation, which facilitates the inclusion of J2, SRP, and lunisolar third-body gravity effects and provides a direct insight into the relative motion geometry. Considering the actual flight scenarios of gravitational wave formations, a formation reconfiguration strategy based on continuous low thrust is investigated to achieve the desired state at a given time while minimizing interference to the satellite platform. The reconfiguration problem is considered a constrained nonlinear programming problem, and an improved particle swarm algorithm is developed to solve this problem. Finally, the simulation results demonstrate the performance of the proposed method in improving the maneuver sequence distribution and optimizing maneuver consumption.


Introduction
Gravitational wave detection from space has been an active research area for decades. The Laser Interferometer Space Antenna (LISA) [1] was the first space-based gravitational wave detection project to be studied and evaluated. Since then, several innovative proposals have been put forward, including the ASTROD [2], DECIGO [3], OMEGA [4], TIANQIN [5], and TAIJI [6] projects. In these projects, three or more spacecraft are arranged in an equilateral triangle, and lasers are employed to measure the minute distance variations driven by gravitational waves. Compared with the ground detector, the space-based gravitational wave detector has the characteristics of a long baseline distance, fewer noise and interference sources, and the ability to detect gravitational waves from all directions, which make it especially suitable for capturing weaker gravitational wave signals in medium-and low-frequency bands [7]. However, space gravitational wave detectors present various challenges in their design and operation, among which the configuration maintenance and control of the precision formation flying is the most fundamental and urgent difficulty [1,8,9].
The primary obstacle is the extremely lengthy baseline distance, making communication and measurement difficult. For sufficient detection sensitivity, gravitational wave detectors in space need baseline distances of a few dozen kilometers or possibly millions of kilometers [10,11]. Nevertheless, such lengths would make it challenging to implement formation control methods that depend on relative state measurements. Some alternatives need to be considered, one option is to employ a mix of onboard sensors and external measurements to estimate the relative states of the spacecraft, and the other is to use modelbased control methods [12]. Both strategies are effective at certain times, but the former will make the system more complex and expensive, and the latter will use up a lot of the onboard computational resources. Some academics have recently suggested a workable control method based on virtual formation. Lippe [13] offers a novel approach to minimizing delta-v for spacecraft swarm reconfiguration through refinement of the real or virtual chief spacecraft orbit, which is a closed orbit of arbitrary eccentricity. Sherrill [14] presents a simple method for roughly applying the HCW equations to the chiefs in an elliptic orbit; a virtual chief is described as a circularized version of the actual chief in the approach. Caruso [15] refines and extends a multiple impulse trajectory transfer method, which is based on the linearized CW equation and is solved in an optimal framework by minimizing the total velocity variation; in the study, a massless point covering a circular orbit around a certain celestial body was created as the formation's (virtual) chief. Huang [16] proposed a spacecraft orbit correction method based on a virtual formation configuration design, and the correction was realized by four-pulse control. In view of the previous work, the virtual reference spacecraft provides a desired relative state between the satellites, which can significantly reduce the communication and measurement requirements between the spacecraft and make it easier to decouple the spacecraft's motion in the formation, and in this study, we adopt this strategy.
An additional difficulty in a close-range decentralized control architecture based on virtual formations is formation reconfiguration under complicated constraints. Reconfiguration or modification is defined as transforming a spacecraft's formation from an initial configuration to another desired configuration while also taking into account its capabilities, constraints, and objectives during the process [17]. It is commonly carried out utilizing optimization algorithms that evaluate different thrust scenarios and determine the most efficient and effective solution for a given mission. There is already a wealth of research on close formation reconfiguration control, ranging from continuous thrust to impulsive strategies, from analytical to numerical forms, and from using the relative Cartesian state to orbital element descriptions [18]. A typical class of research focuses on using impulsive thrust strategies. Vaddi [19] proposed an analytic, two-pulse control scheme for formation reconstruction and establishment. Michelle [20] presented a closed-form method for the fuel-optimal guidance and control of relative motion in the formation flying and rendezvous of spacecraft using impulsive maneuvers. Gaias [18] described an impulsive maneuvers planner for onboard autonomous optimal formation flying reconfigurations in a near-circular orbit. The impulse control strategy is an applicable and flexible control method; however, it is challenging to realize the relative velocity's instantaneous change in practical engineering for a gravitational wave detection satellite equipped with µN-class thrusters. Continuous control is becoming popular in new space missions thanks to the advancement of microelectronic propulsion. Ben [21] derives a control concept for formation flight applications using analytical finite-duration approaches. Di Mauro [22][23][24] presents a solution to the minimum-fuel spacecraft formation reconfiguration maneuver in J 2 perturbed near-circular orbits and subsequently investigates several flight scenarios by analytical and numerical methods. Zhang [25] proposes a control parameter direct optimization method for the optimal short-range elliptic orbit rendezvous problem using on-off constant thrust; during the study, the optimal control problem is transformed into a nonlinear programming problem with bound constraints on the optimization variables and terminal equality constraints. The aforementioned continuous control strategies can provide a reference for the reconfiguration control of a gravitational wave formation, but the current research mostly concentrates on near-ground scenarios; therefore, related research still needs to be refined and improved in order to be adaptable to the high Earth orbit environment while meeting the limitations of poor maneuverability, a long mission period, and high accuracy requirements in gravitational wave missions.
In this paper, a geocentric space gravitational wave detection mission is selected as a candidate for analysis. Compared with the heliocentric scheme, the geocentric scheme faces a more complex space environment and dynamics [9], thus posing a more pressing demand for the maintenance and control of the formation configuration. Figure 1 depicts the mission's operational mode: Scientific gravitational wave detection will operate when the direction of the Sun is sufficiently angled with the orbital plane; when the direction of the Sun is almost parallel to the orbital plane, scientific observations will be put off for maintenance. and terminal equality constraints. The aforementioned continuous control s provide a reference for the reconfiguration control of a gravitational wave fo the current research mostly concentrates on near-ground scenarios; therefor search still needs to be refined and improved in order to be adaptable to th orbit environment while meeting the limitations of poor maneuverability, a l period, and high accuracy requirements in gravitational wave missions.
In this paper, a geocentric space gravitational wave detection mission is candidate for analysis. Compared with the heliocentric scheme, the geocen faces a more complex space environment and dynamics [9], thus posing a m demand for the maintenance and control of the formation configuration. Figu the mission's operational mode: Scientific gravitational wave detection will o the direction of the Sun is sufficiently angled with the orbital plane; when the the Sun is almost parallel to the orbital plane, scientific observations will b maintenance. The main objective of this study is to develop a minimum-fuel spacecra reconfiguration strategy for future space missions, such as space gravitation tectors, a distributed remote sensing cluster or solar satellite systems, etc Firstly, a configuration reconfiguration scheme suitable for space gravitation tection formation is proposed, which is based on the concept of virtual forma formation involves using a reference trajectory, or "virtual" formation, to gu tion of the spacecraft in the formation; this approach can enable precise contr mation configuration, even in the presence of uncertainty and perturbation the relative motion is described using a linear dynamics model based on the zation of relative orbit elements; compared with the commonly used Carte states, this description method can not only effectively reduce the error caus ization and facilitate the inclusion of perturbation effects, but it can also offer alization of the effects of maneuvers on the relative orbit [22]. In addition, built from a set of mean elements that slowly vary in time, which is benefi mation control since no additional fuel needs to be wasted to counteract sho period effects caused by osculating elements [29]. Thirdly, a piecewise consta mation reconfiguration mechanism is described. Simultaneous in-plane an controls are adopted to improve the reconfiguration efficiency; in order to av The main objective of this study is to develop a minimum-fuel spacecraft formation reconfiguration strategy for future space missions, such as space gravitational wave detectors, a distributed remote sensing cluster or solar satellite systems, etc. [5,8,[26][27][28]. Firstly, a configuration reconfiguration scheme suitable for space gravitational wave detection formation is proposed, which is based on the concept of virtual formation. Virtual formation involves using a reference trajectory, or "virtual" formation, to guide the motion of the spacecraft in the formation; this approach can enable precise control of the formation configuration, even in the presence of uncertainty and perturbations. Secondly, the relative motion is described using a linear dynamics model based on the parameterization of relative orbit elements; compared with the commonly used Cartesian relative states, this description method can not only effectively reduce the error caused by linearization and facilitate the inclusion of perturbation effects, but it can also offer direct visualization of the effects of maneuvers on the relative orbit [22]. In addition, the model is built from a set of mean elements that slowly vary in time, which is beneficial for formation control since no additional fuel needs to be wasted to counteract short-and long-period effects caused by osculating elements [29]. Thirdly, a piecewise constant thrust formation reconfiguration mechanism is described. Simultaneous in-plane and out-plane controls are adopted to improve the reconfiguration efficiency; in order to avoid the long duration or frequent orbital maneuvers adding difficulty to the control of the test masses, the whole task interval is discretized. Finally, an improved particle swarm optimization algorithm is proposed to solve the multi-constrained maneuvers planning. The improved algorithm introduces measures such as adaptive factors, dynamic penalty functions, and updated rules for better global search performance. The innovation of this paper can be summarized as: (1) A minimum-fuel reconfiguration strategy applicable to gravitational wave formations is proposed. The strategy can flexibly respond to different complex scenarios during gravitational wave missions with less interference to the satellite platform. (2) Several measures are employed to improve the performance of the particle swarm algorithm. The improved particle swarm algorithm has better stability and superior global search capability when dealing with problems with complex constraints.
The outline of the paper is as follows. In Section 2, the concept of virtual formation, the relative motion model, and the reconfiguration control strategy are given; Section 3 introduces the maneuver planning process based on the improved particle swarm algorithm; the numerical simulation of the proposed control scheme is presented in Section 4; and finally, Section 5 draws the conclusion.

Formation Reconfiguration Control Strategy Based on Continuous Low Thrust
This part introduces the concept of virtual formation and the dynamic model describing the relative motion of the physical spacecraft relative to the virtual reference point.

Definition of the Coordinate System and Introduction of the Virtual Formation
In order to facilitate the description of the spacecraft's absolute motion and relative motion, the following coordinate system is defined in this paper, as shown in Figure 2.
updated rules for better global search performance. The innovation of this paper can be summarized as: (1) A minimum-fuel reconfiguration strategy applicable to gravitational wave formations is proposed. The strategy can flexibly respond to different complex scenarios during gravitational wave missions with less interference to the satellite platform. (2) Several measures are employed to improve the performance of the particle swarm algorithm. The improved particle swarm algorithm has better stability and superior global search capability when dealing with problems with complex constraints.
The outline of the paper is as follows. In Section 2, the concept of virtual formation, the relative motion model, and the reconfiguration control strategy are given; Section 3 introduces the maneuver planning process based on the improved particle swarm algorithm; the numerical simulation of the proposed control scheme is presented in Section 4; and finally, Section 5 draws the conclusion.

Formation Reconfiguration Control Strategy Based on Continuous Low Thrust
This part introduces the concept of virtual formation and the dynamic model describing the relative motion of the physical spacecraft relative to the virtual reference point.

Definition of the Coordinate System and Introduction of the Virtual Formation
In order to facilitate the description of the spacecraft's absolute motion and relative motion, the following coordinate system is defined in this paper, as shown in Figure 2.
(1) The Earth-centered inertial (ECI) reference frame: the origin O is located at the center of the earth, the OX axis points towards the vernal equinox, the OZ axis points towards the north celestial pole, and the OY axis is perpendicular to the XOZ plane and follows the right-hand rule.  (1) The Earth-centered inertial (ECI) reference frame: the origin O is located at the center of the earth, the OX axis points towards the vernal equinox, the OZ axis points towards the north celestial pole, and the OY axis is perpendicular to the XOZ plane and follows the right-hand rule.  The illustration of the virtual formation is depicted in Figure 2a. The reference point operating on the nominal orbit is defined as the virtual chief spacecraft. The spacecraft traveling on the actual trajectory with errors is defined as the deputy spacecraft. As a result, the original long baseline formation is split into three close-proximity virtual formations, each comprising a physical spacecraft and its corresponding virtual reference point. As shown in the figure, S c1 , S c2 , and S c3 represent virtual reference points, S r1 , S r2 , and S r3 are physical spacecraft; L ij (i, j = 1, 2, 3) is the arm length between spacecraft S ri and S rj ; A i (i = 1, 2, 3) represents the breathing angle corresponding to spacecraft S ri ; and ∆r is the envelope radius of the virtual formation configuration.
As discussed in reference [16], the maximum allowable envelope radius of the virtual formation can be calculated using direct planar geometry relationships when the difference in orbital elements between the physical spacecraft and the virtual reference point is small enough. The spatial geometric relationship between physical spacecraft is depicted in Figure 2b. In the figure, L 0 , L min , and L max , respectively, represent the nominal arm length, the minimum arm length, and the maximum arm length; similarly, A 0 , A min , and A max , respectively, represent the nominal breathing angle, the minimum breathing angle, and the maximum breathing angle.
The envelope radius of the virtual formation's constraint conditions can be computed using the equilateral triangle configuration index and the geometric relationship depicted in Figure 2b: In Equation (1), ∆L and ∆A are the arm length variation index and the breathing angle variation index, respectively, and the maximum breathing angle A max and the minimum breathing angle A min can be expressed as With ∆L L 0 , the terms containing ∆L L 0 can be ignored to simplify the formula further: By solving the system of inequality equations in Equation (2), the envelope boundary satisfying the gravitational wave detection requirement can be obtained.

Relative Motion Equation Modeling Based on Quasi-Nonsingular Relative Orbit Elements
In each virtual formation, the relative motion of the physical spacecraft relative to the virtual reference point can be described by a set of dimensionless relative orbital elements defined by D'Amico [30].
In the ECI reference frame, let α c = [a c , e c , i c , ω c , Ω c , M c ] T , α r = [a r , e r , i r , ω r , Ω r , M r ] T denote the classical Keplerian orbital elements (OE) of the virtual reference point and the physical spacecraft, respectively. The quasi-nonsingular relative orbital elements (ROEs) are defined as where δa is the semi-major axis difference, which has been normalized by the chief semimajor axis, δλ is the relative mean argument of longitude, δe = [δe x , δe y ] T is the relative eccentricity vector, and δi = [δi x , δi y ] T is the relative inclination vector. Additionally, u = ω + M is the mean argument of latitude, the phase angles of ϕ and θ are termed the relative perigee and relative ascending node, respectively.
Assuming that the reference orbits in a near-circular reference orbit and the distance between the physical satellite and the virtual point is much smaller than the orbital radius of the virtual point, the ROEs can be written as functions of the integration constants of the HCW equations, The linear map [31,32] can be described as where Accordingly, the inverse transformation can be easily realized by a c δα = T −1 (u c )δx (5) where u c is the latitude argument of the chief orbit at the instant t c , which is interchangeable with time t c through the relation u c = u 0 + W c (t − t 0 ), W c is the mean motion of the virtual reference point. δx = [δr R , δr T , δr N , δr R , δv T , δv N ] T represents the relative position and velocity in the RTN reference frame. According to Equation (4), The ROEs will be decomposed into periodic motion in the RT plane and harmonic oscillations in the RN plane.
In particular, the amplitudes of relative motion in R, T, and N directions are aδe, 2aδe, and aδi, respectively, offsets in T and R direction are given by aδλ and aδa, respectively. Moreover, the instantaneous phases of in-plane and out-of-plane motion are, respectively, represented by angle u c − ϕ and u c − θ, while relative phase angle θ − ϕ governs the orientation and shape of relative motion in the RN plane. The geometric insight provided by Equation (4) is illustrated in Figure 3.

Linearized Equations of Relative Motion with Perturbation
Consider a general absolute state α c and relative state δα, a linear relative dynamic equation including perturbations and continuous thrust acceleration can be derived by using the relevant theory of Taylor expansion and Gaussian variational equation [20,33], which is given as The term A(α c (t)) = A kep + A J 2 (t) + A 3rd (t) + A srp (t) is complete plant matrix, which is the superposition of Kepler motion and perturbation contributions such as the Earth's oblateness effect J 2 , SRP, and lunisolar third body (T 3rdb ). The term B(t) represents the control input matrix and u(t) = [u r , u t , u n ] T is the vector of constant control accelerations in the RTN orbital frame. ∆B srp = B r,srp − B c,srp is the difference between the physical satellite and the chief SRP ballistic coefficient.
represents the relative position and velocity in the RTN reference frame. According to Equation (4), The ROEs will be decomposed into periodic motion in the RT plane and harmonic oscillations in the RN plane. In particular, the amplitudes of relative motion in R, T, and N directions are a e δ , 2a e δ , and a i δ , respectively, offsets in T and R direction are given by aδλ and a a δ , respectively. Moreover, the instantaneous phases of in-plane and out-of-plane motion are, respectively, represented by angle c u ϕ − and c u θ − , while relative phase angle θ ϕ − governs the orientation and shape of relative motion in the RN plane. The geometric insight provided by Equation (4) is illustrated in Figure 3.

Linearized Equations of Relative Motion with Perturbation
Consider a general absolute state c α and relative state δα , a linear relative dynamic equation including perturbations and continuous thrust acceleration can be derived by using the relevant theory of Taylor expansion and Gaussian variational equation [20,33], which is given as The term is the difference between the physical satellite and the chief SRP ballistic coefficient. When the reference orbit is nearly circular, a simpler form can be derived by ignoring all terms that depend on eccentricity. These simplified plant matrices are given by When the reference orbit is nearly circular, a simpler form can be derived by ignoring all terms that depend on eccentricity. These simplified plant matrices are given by The items of J 2 plant matrix are as follows: The items of SRP plant matrix are as follows: is the spacecraft ballistic coefficient, with C r being the reflectivity coefficient, A srp the cross-sectional area perpendicular to the Sun's direction, and m the satellite mass; ε = 23.44 • is the ecliptic obliquity, λ s = λ 0 + n s t is the value of the ecliptic longitude of the Sun at the instant t, n s = 2π/year.
The items of T 3rdb plant matrix are as follows: In the Sun third body perturbation, we have In the Moon third body perturbation, parameters can be given as follows with Ω 0 being the longitude of the mean ascending node of the lunar at the initial moment, JD is the Julian date at the instant t c .
where the parameters a c , n c , and u c are defined as above.
Notably, each plant matrix in A(α c (t)) is independent and can be uniformly formalized into a [7 × 7] matrix, with the upper left [6 × 6] block containing the terms that relate the ROE with their variations (both conservative and nonconservative perturbing effects), and the upper right [6 × 1] column contains the linear relations that exist in-between the ROE variations and the ballistic coefficient difference (only nonconservative perturbing effects). Through examination of the rows of null matrices, it is possible to identify the ROE (regions of influence) that are not impacted by a particular perturbation. For detailed information on the plant matrix for the above items, we recommend that the reader read the literature's appendixes [34]. Similarly, in the control matrix B nc , in-plane and out-of-plane control are decoupled, which can be described as the radial (R) and tangential (T) direction maneuvers, represented by the first two columns, affecting only the in-plane ROE ([δα, δλ, δe x , δe y ] T ), and the normal (N) direction maneuvers, represented by the third column, which affect only the out-of-plane ROE ([δi x , δi y ] T ). Meanwhile, the model has neglected factors such as eclipses effects, planetary gravity, and higher-order geopotential terms.

Converting from Osculating to Mean Elements
As mentioned above, the linear dynamic equations are based on the average state space, while the actual measurement information of the satellite is osculating states; therefore, the algorithm converting the osculating elements to the mean elements is necessary. Due to closed-form conversions between osculating to mean states under the variety of perturbations considered not being available in the literature, a numerical method of moving average filtering was applied.
The basic idea of the moving average filter algorithm is to set a fixed-width sliding window, which slides along the time series while taking the arithmetic mean of the data in the window as the output value. As shown in Figure 4, assuming that each orbital period has 2k + 1 data, the average element corresponding to the n th osculating element is calculated by the following formula: ) and the upper right [6 × 1] column contains the linear relations that exist in-betwee ROE variations and the ballistic coefficient difference (only nonconservative pertu effects). Through examination of the rows of null matrices, it is possible to identif ROE (regions of influence) that are not impacted by a particular perturbation. For de information on the plant matrix for the above items, we recommend that the reader the literature's appendixes [34]. Similarly, in the control matrix nc B , in-plane and o plane control are decoupled, which can be described as the radial (R) and tangenti direction maneuvers, represented by the first two columns, affecting only the in-ROE (

Converting from Osculating to Mean Elements
As mentioned above, the linear dynamic equations are based on the average space, while the actual measurement information of the satellite is osculating states; t fore, the algorithm converting the osculating elements to the mean elements is neces Due to closed-form conversions between osculating to mean states under the varie perturbations considered not being available in the literature, a numerical method of ing average filtering was applied.
The basic idea of the moving average filter algorithm is to set a fixed-width sl window, which slides along the time series while taking the arithmetic mean of the in the window as the output value. As shown in Figure 4, assuming that each orbit riod has 2 1 k + data, the average element corresponding to the th n osculating elem calculated by the following formula: This is a convenient and accurate method for the scenarios studied in this p where the reference orbit is known.   This is a convenient and accurate method for the scenarios studied in this paper where the reference orbit is known.

Continuous Low-Thrust Maneuver Planning via Improved Particle Swarm Optimization
In this section, a general solution for formation reconfiguration using piecewise constant thrust is introduced. The problem is described as a nonlinear programming problem (NLP) with complex constraints, and then the maneuvering parameters are optimized and solved by an improved particle swarm optimization algorithm.

Reconfiguration Control Based on Piecewise Constant Thrust
In references [22,25], a tangential maneuvering strategy is used to obtain the minimum fuel consumption, which is an optimal strategy for the case where the eccentricity dominates the conformational divergence. In contrast to the studies described, δλ is the main divergence element in the study, which is caused by the geometrical differences of the orbits; for this reason, in-plane control simultaneously in the radial and tangential directions is a more efficient option. In addition, since the spacecraft contains two test masses, executing long-duration or short-term high-frequency maneuvers will make it more difficult to control the test masses and may even cause a collision with the satellite platform. Hence, a reasonable distribution of the thrust sequence is necessary.
As illustrated in Figure 5, it is assumed that after k i in-plane maneuvers and k o out-ofplane maneuvers, the formation reaches the desired state at the given time. The maximum amplitudes of the in-plane and out-of-plane thrusters are u i and u o , respectively; each maneuver's magnitude is expressed as u j,k (j = r, t, n and k = 1, 2, . . . , k i or k o ) and the maneuver duration is ∆t j = t j,2k − t j,2k−1 (k = 1, 2, . . . , k i or k o ). The whole task interval is discretized into k i and k o parts in terms of in-plane and out-of-plane according to the number of maneuvers and constraint limits, denoted as [t k−1 , t k ] (k = 1, 2, . . . , k i or k o ). Then the process of converting the relative state δα(t 0 ) at the time t 0 to the state δα(t f ) at the time t f using the linear dynamic equation described in Section 2.3 can be specifically described as: Similarly, by following the above rules in turn, the relative state the mission f t will be obtained, and the relative state satisfies The term des α δ is the desired mean ROE at the end of the maneuverin It has been demonstrated that fuel consumption saving is particularly practical space operations. In general, fuel consumption can be minimized b maneuvering parameters, which are handled in the following form: Find: Boundary constraints:  In the period [t 0 , t o,1 ], there is no maneuver acceleration, then the relative state δα(t o,1 ) at t o,1 can be obtained by In the period [t o,1 , t i,1 ], the constant thrust acceleration is [0, 0, u n,1 ] T , then the relative state δα(t i,1 ) at t i,1 can be obtained by Next, in-plane thrusters start working, the constant maneuver acceleration becomes [u r,1 , u t,1 , u n,1 ] T in the period [t i,1 , t o,2 ], then the relative state δα(t o,2 ) at t o,2 can be obtained by Following this, in the interval [t o,2 , t i,2 ], the constant thrust acceleration is [u r,1 , u t,1 , 0] T , then the relative state δα(t i,2 ) at t i,2 can be obtained by Then again, all thrusters stop working in the interval [t i,2 , t i,3 ], and the relative state Similarly, by following the above rules in turn, the relative state δα(t f ) at the end of the mission t f will be obtained, and the relative state satisfies The term ffiα des is the desired mean ROE at the end of the maneuvering interval. It has been demonstrated that fuel consumption saving is particularly important in practical space operations. In general, fuel consumption can be minimized by optimizing maneuvering parameters, which are handled in the following form: Other constraints: all kinds of sudden or planned maintenance work. The problem above is known as a constrained nonlinear programming problem, where the optimization variables are the maneuver parameters u(t c ) and t c , J is the objective function, and the remaining terms are the constraints.

Improved Particle Swarm Optimization Algorithm
Particle swarm optimization (PSO) is a population intelligence-based optimization algorithm. It simulates the search process of multiple candidate solutions in the form of particles and obtains the global optimal solution by continuously updating the velocity and position of the particles. However, when dealing with complex constraints or a large number of optimization variables, the algorithm often shows weaknesses of poor convergence and tends to fall into partial optimality. Considering the complexity of the problem in this paper, an improved particle swarm optimization algorithm is proposed.
The first part is to process the constraints. When applying particle swarm algorithms to solve optimization problems with constraints, the most generally used method is to add penalty terms to the objective function using the Lagrange multiplier method. The selection of penalty factors has been a difficult problem. If the penalty factor is too large, the algorithm tends to converge to a partial optimal; when the penalty factor is smaller, the algorithm may converge to an infeasible solution. In response to this question, a segmented penalty function that can be dynamically updated is introduced. The generalized objective function constructed by adding the penalty term is In the formula, represents the sum of the in-plane and out-of-plane maneuver costs, h(i c ) = i c √ i c is the penalty factor that updates dynamically with the number of iterations, i c is the current generation, and P(x) is the penalty term of the constraint, with where S n (x) is the relative constraint penalty function, V(S n (x)) is a segment function, [c 1 , c 2 , . . . c i ] and [q 1 , q 2 , . . . q i−1 ] are penalty values and tolerances, respectively, whose amounts are set based on the specific problem. Furthermore, in order to balance the conflict between minimizing the value of the generalized objective function and satisfying the constraints, by referring to Deb's work [35], rules for evaluating the particle superiority are developed as follows: (1) If both particles are feasible solutions, the particle with the smaller objective function is superior; (2) If both particles are infeasible solutions, the particle with the more minor violation of the constraint is superior; (3) When one of the two particles is a feasible solution, the particle that is feasible is chosen as the superior.
The other part is an improved update strategy for particles. The standard particle swarm optimization algorithm adopts constant inertia coefficients and learning factors; such a setup will lead to slow convergence of the algorithm at the end of evolution, and it will easily fall into partial optimality. Considering the above limitations, an adaptive variation mechanism is employed in this paper to improve the performance of the algorithm. The particle update formulas of the adaptive particle swarm optimization algorithm can be expressed as where i c denotes the current generation; x and v are the position and velocity of the particle, respectively; W is the inertia coefficient; C 1 and C 2 are the learning factors; r 1 and r 2 are the random numbers distributed between [0, 1]; p best is the optimal solution of the particle itself; and g best is the global optimal solution. W, C 1 , and C 2 will dynamically adjust according to the following mechanism: where ω max and ω min denote the maximum and minimum inertia coefficients, respectively; c 1 , max , c 2 , max and c 1 , min , c 2 , min are the maximum and minimum learning factors, respectively; I max is the maximum iteration number; and ς(ς ∈ [25,55]) is an empirical parameter. Using the update strategy described in Equations (21)-(23), the inertia coefficient W and the self-learning factor C 1 will decrease with increasing iterations, and the group learning factor C 2 will increase with increasing iterations. This adjustment mechanism will give the algorithm a strong global search capability in the early stage and a robust local search capability in the later phase.
Additionally, we introduce a random mutation factor after the update of particles in each generation to avoid the algorithm from falling into premature convergence, whose basic idea is to generate a random number r d (r d ∈ [0, 1]) for each generation and compare it with a set mutation probability φ c (φ c ∈ [0, 1]), which is set to 0.85 in this study. If the random number is larger than the probability, then: x(i, j) = x min (i) + (x max (i) − x min (i))r d (25) where, dim is the dimension of the optimization variable, i is the ith variable, j is the jth particle, and x min , x max are the boundary values of the ith variable. In this way, each iteration has a certain possibility to randomly change the value of a certain individual, which will improve the whole particle swarm's performance.
The complete flow of the improved particle swarm algorithm is shown in Figure 6, and the details are as follows: Step 1: Initialize the position and velocity of each particle in the search space randomly; Step 2: Calculate the objective function for each particle's current position and compare it to its personal best solution; Step 3: Update the personal and global best according to defined internal penalty function rules; Step 4: Update the velocity and position of each particle; Step 5: Perform random mutation operations for particles; Step 6: Repeat steps 2-5 until a maximum number of iterations N d (in this paper N d = 800) is met; Step 7: Record the global best solution as the optimization result.

PSO Formula for Minimum Fuel Reconfiguration Control
In order to apply the improved particle swarm optimization algorithm to solve the nonlinear programming problem described in Section 3.2, the setting of the algorithm formulation is necessary.
The PSO particle can be defined as where k i and k o are the numbers of in-plane and out-of-plane maneuvers, respectively; t i,k (k = 1, 2, . . . , k i ) are the moments of in-plane maneuvering, [F r,k , F t,k ](k = 1, 2, . . . , k i ) are the thrust amplitudes of the in-plane maneuvers; t o,k (k = 1, 2, . . . , k o ) are the moments of out-plane maneuvering, and F n,k (k = 1, 2, . . . , k o ) are the thrust amplitudes of the inplane maneuvers.
Sensors 2023, 22, x FOR PEER REVIEW Figure 6. Flow chart of improved particle swarm optimization.

PSO Formula for Minimum Fuel Reconfiguration Control
In order to apply the improved particle swarm optimization algorithm nonlinear programming problem described in Section 3.2, the setting of the alg mulation is necessary.
The PSO particle can be defined as ,1   Since each variable in the orbital dynamics equations presents a large var magnitude of units and physical quantities, the variables were normalized avoid negative effects on the convergence of the problem, expressed as  Since each variable in the orbital dynamics equations presents a large variation in the magnitude of units and physical quantities, the variables were normalized in order to avoid negative effects on the convergence of the problem, expressed as The relationship between the normalized variables and the actual physical variables can be expressed as In the formula, τ, ξ ∈ [0, 1], and m is the mass of the spacecraft. It needs to be noted that the linear inequality constraints on the maneuver times are transformed into only the bound constraints on the normalized optimization variables. In addition, for consistency with Section 3.1, the thrust amplitude is expressed as acceleration form in the new variables.
The dynamics constraint is added to the objective function as a penalty term, denoted as: where δα t f (x) is the relative state at the end of the task, δα des is the desired state, and ∆d is the constraint tolerance. The other basic settings are not described in detail here.

Numerical Simulations
This section presents the maneuver planning obtained using the improved particle swarm optimization algorithm, demonstrating the maneuvering performance in terms of delta-v, accuracy, etc.

Initial Values and Settings
In a selected space gravitational wave mission, the initial values of reference orbits and the errors for the three spacecraft are shown in Tables 1 and 2, and the parameters of the satellites are shown in Table 3. The detector operates in a "3 + 3" mode, meaning that the detector conducts scientific observations for the first three months and conducts the formation reconfiguration and equipment maintenance in the following three months. During the scientific exploration, the configuration performance index of the formation should meet: the variation in arm length is less than 1732.1 km, the variation in breathing angle is less than 0.2 • , and the variation in relative velocity is less than 10 m/s.  Table 3. Parameters of the satellite.

Mass (kg) Cross-Section Area (m 2 ) Reflectance Coefficient
The control method proposed in this paper is based on the mean states, whereas the measurements available to us are all osculating. Therefore, it is necessary to process the results of the numerical orbit propagation. The process is shown in Figure 7 and is briefly described as follows [37]. step, a high-precision numerical propagation of the reference and true orbits is carried out to obtain orbital data for three months. In the third step, the orbital data obtained in the previous step is converted into osculating orbital elements. In the fourth step, the osculating relative orbital elements are calculated by Equation (3). Finally, the mean relative orbital elements and the mean orbital elements of the reference orbit through the moving average filter defined in Section 2.4 are denoted as  The results of the mean states obtained by the moving average filter are shown in Figure 8. As demonstrated, the blue curves denote the osculating states and the red curves represent the mean states, which shows visually that the use of the filtering algorithm works well in removing short-term effects for both the absolute and relative orbital parameters. Furthermore, it can be deduced that if osculating ROE is used for control, excess control effort will be wasted to counter temporary effects, wasting fuel and reducing mission lifetime. In the first step, the initial osculating position velocity of the real satellite is obtained from the initial reference orbit and the error is denoted as [r r,i (t s,0 ), v r,i (t s,0 )]. In the second step, a high-precision numerical propagation of the reference and true orbits is carried out to obtain orbital data for three months. In the third step, the orbital data obtained in the previous step is converted into osculating orbital elements. In the fourth step, the osculating relative orbital elements are calculated by Equation (3). Finally, the mean relative orbital elements and the mean orbital elements of the reference orbit through the moving average filter defined in Section 2.4 are denoted as δα c,i,m (t s,0 : t s, f ) and α c,i,m (t s,0 : t s, f ).
The results of the mean states obtained by the moving average filter are shown in Figure 8. As demonstrated, the blue curves denote the osculating states and the red curves represent the mean states, which shows visually that the use of the filtering algorithm works well in removing short-term effects for both the absolute and relative orbital parameters. Furthermore, it can be deduced that if osculating ROE is used for control, excess control effort will be wasted to counter temporary effects, wasting fuel and reducing mission lifetime.
Furthermore, by projecting the motion of the physical spacecraft onto the RTN coordinate system of the reference point, the divergence of each virtual formation configuration can be obtained, as shown in Figure 9. In subfigure (a), it can be seen that the dispersion of the virtual formation is mainly in the tangential direction and no satellite exceeds the maximum permissible boundary of 116.16 km (calculated by Equation (2)) during the whole mission period. The divergence of each physical spacecraft with respect to the virtual reference point can be seen in subfigure (b), with maximum offsets of 108.02 km, 90.65 km, and 57.98 km for the three satellites, respectively. Judging from the dispersion, satellites 1 and 2 are close to the maximum allowable boundary of the virtual formation, so the formation reconfiguration control needs to be implemented as soon as possible after the end of the scientific observation mission.
Finally, the average relative orbital elements and the average absolute orbital elements of the reference orbit at the end of the scientific exploration phase are recorded as shown in Tables 4 and 5, which are used as initial values for the formation reconfiguration control.

Validation of Linear Relative Dynamics Models
The relative dynamics model used in this paper has not yet been applied to a 100,000 km geocentric orbit, so its applicability and performance need to be discussed. It is accomplished by comparing the output of the linear model with the numerical results.
Assuming no drag-free control during the maintenance phase, in this case, the physical satellite is mainly affected by J 2 , SRP, and T 3rdb perturbations, while the virtual reference point is only affected by conservative forces such as J 2 and T 3rdb since it has no physical entity. The Sc 1 data in Tables 4 and 5 were selected as initial values for the simulation, and simulation results are shown in Figure 10.  Furthermore, by projecting the motion of the physical spacecraft onto the RTN coordinate system of the reference point, the divergence of each virtual formation configuration can be obtained, as shown in Figure 9. In subfigure (a), it can be seen that the dispersion of the virtual formation is mainly in the tangential direction and no satellite exceeds the maximum permissible boundary of 116.16 km (calculated by Equation (2)) during the whole mission period. The divergence of each physical spacecraft with respect to the virtual reference point can be seen in subfigure (b), with maximum offsets of 108.02 km, 90.65 km, and 57.98 km for the three satellites, respectively. Judging from the dispersion, satellites 1 and 2 are close to the maximum allowable boundary of the virtual formation, so the formation reconfiguration control needs to be implemented as soon as possible after the end of the scientific observation mission. Finally, the average relative orbital elements and the average absolute orbital elements of the reference orbit at the end of the scientific exploration phase are recorded as shown in Tables 4 and 5, which are used as initial values for the formation reconfiguration control.

Validation of Linear Relative Dynamics Models
The relative dynamics model used in this paper has not yet been applied to a 100,000 km geocentric orbit, so its applicability and performance need to be discussed. It is accomplished by comparing the output of the linear model with the numerical results.
Assuming no drag-free control during the maintenance phase, in this case, the physical satellite is mainly affected by J2, SRP, and T3rdb perturbations, while the virtual  In the figure, the black line indicates the results obtained by numerical methods, the red line is the result of the linear model used in this paper, the cyan line is the result of the nonlinear model [38], and the blue line is the result of the Keplerian model; the three subplots from top to bottom indicate the curves of the relative orbital elements, the absolute error curve of each model with respect to the numerical results, and the relative error curve of each model with respect to the numerical results, respectively. The comparison with the Keplerian model shows that the two-body assumption does not accurately describe the orbital dynamics at 10 5 km (mainly reflected in e x and e y ) and the perturbation effects are not negligible. The comparison with the nonlinear model shows that the two have almost comparable accuracy (the maximum error is less than 3 m in 92 days), which indicates that the simplification operations carried out by the linear model are valid. Furthermore, a comparison with the numerical results provides an assessment of the accuracy of the model. The statistics in Table 6 show that the maximum absolute error of the linear model is only in the order of a hundred meters over a 30-day period, and the relative errors are less than 7%; even at 60-and 90-day periods, the model still has a high level of accuracy. The largest error term occurs at δe x , which is primarily due to the unmodeled terms (e.g., eclipses effects, planetary gravity, or higher-order geopotential terms), and the error can be reduced by updating the orbital elements of the reference spacecraft and the physical spacecraft within a certain time interval.    In summary, the linear model is an advantageous choice for the study of 100,000 km formation dynamics.

Performance Evaluation of Improved Particle Swarm Optimization Algorithms
To evaluate the performance of the improved particle swarm optimization algorithm in the paper, 20 repeated experiments were carried out with the same initial values and settings.
Assuming that the amplitudes of the three axes of the thruster were [400, 400, 200] µN, six in-plane maneuvers and four out-of-plane maneuvers were performed during the whole control process, and the given mission time was 14 days, the desired relative orbital elements are [0,0,0,0,0,0] T m. The parameter settings of the algorithm include: the population scale was 1000, maximum iterations were 800; the inertia factor W decreased from 0.9 to 0.4, the self-learning factor C 1 decreased from 1.5 to 0.5, the group learning factor C 2 increased from 0.5 to 1.5, random mutation probability factor φ c was set to 0.85, and ς was set to 35; the tolerances for dynamics constraints were [1,1,1,1,1,1] T m, and the penalty factor was set as in Equation (31). The traditional particle swarm algorithm uses the same basic setup.
In the experimental test of the algorithm, the terminal error, maneuver cost, and convergence rate of each experiment are recorded and shown in the form of a box plot, as shown in Figure 11. A box plot is used to observe the overall distribution of the data, which is a type of chart that displays the summary of a set of statistical values with a five number summary "minimum", first quartile (Q1), median (Q2), third quartile (Q3), and "maximum". The "box" part of the box plot represents the interquartile range (IQR), which is the range of the middle 50% of the data, and the "whiskers" represent the minimum and maximum values, excluding outliers. The median is shown as a line inside the box and some points outside the box are anomalies. Subplot (a) shows the total errors between the terminal and desired values of the six parameters, subplot (b) shows the total cost of maneuvers during the control process, and subplot (c) shows the time consumption of the whole convergence process; the above three indicators are commonly used to describe algorithm performance.
As illustrated in subplots (a) and (b), the box representing the traditional PSO is rather long and has a high number of outliers, which indicates that traditional PSO algorithms have poor convergence and high volatility. In contrast, all boxes corresponding to the improved particle swarm algorithm are narrow and have smaller values, proving that the improved algorithm performs better and the solutions obtained are more credible. It should be noted that the advantage of subgraph (c) is not as obvious as that of subgraphs (a) and (b), mainly because the convergence rate is mainly dominated by the properties of the objective function when the solved problem is too complex.
Overall, our proposed improved PSO algorithm outperforms the traditional PSO in terms of both convergence precision and convergence rate.

Fuel-Minimum Reconfiguration Maneuver Planning via IPSO
Assuming that the amplitude of the three axes of the thruster is [400, 400, 200] µN, six in-plane maneuvers, and four out-of-plane maneuvers are performed during the whole reconfiguration process, the given mission time is 14 days, the desired relative orbital elements are [0,0,0,0,0,0] T m, and the tolerances of the six relative orbital elements are [1,1,1,1,1,1] T m. In addition, set [3.1, 3.5] d, [11.5, 11.9] d as the maintenance phase of the instruments, during which the satellites are not available for formation control.
Firstly, as a comparison term to the strategy proposed in this paper, the near-Earth formation reconstruction strategy of the literature [23] is directly transposed to the space gravitational wave mission, without discretization of the mission space and without normalization of the optimization variables. Simulation results are shown in Figures 12-14, where subplot (a) is the optimized maneuver sequence, subplot (b) is the curves of the relative orbital elements throughout the control process, subplot (c) is the relative trajectory in the RTN coordinate system of the virtual reference spacecraft, and subplot (d) is the relative position and relative velocity curve in the virtual reference spacecraft RTN coordinate system. Subplots (c) and (d) are transformed by substituting the relative orbit element data into Equation (5), whose end states reflect the accuracy of the spacecraft tracking the reference trajectory.

Fuel-Minimum Reconfiguration Maneuver Planning via IPSO
Assuming that the amplitude of the three axes of the thruster is [400, 400, 200] µN, six in-plane maneuvers, and four out-of-plane maneuvers are performed during the whole reconfiguration process, the given mission time is 14 days, the desired relative orbital elements are [0,0,0,0,0,0] T m, and the tolerances of the six relative orbital elements are [1,1,1,1,1,1] T m. In addition, set [3.1, 3.5] d, [11.5, 11.9] d as the maintenance phase of the instruments, during which the satellites are not available for formation control.
Firstly, as a comparison term to the strategy proposed in this paper, the near-Earth formation reconstruction strategy of the literature [23] is directly transposed to the space gravitational wave mission, without discretization of the mission space and without normalization of the optimization variables. Simulation results are shown in Figures 12-14, where subplot (a) is the optimized maneuver sequence, subplot (b) is the curves of the relative orbital elements throughout the control process, subplot (c) is the relative trajectory in the RTN coordinate system of the virtual reference spacecraft, and subplot (d) is the relative position and relative velocity curve in the virtual reference spacecraft RTN coordinate system. Subplots (c) and (d) are transformed by substituting the relative orbit element data into Equation (5), whose end states reflect the accuracy of the spacecraft tracking the reference trajectory.
The results for virtual formation 1 are shown in Figure 12. For in-plane motion, the maneuvers' times are [2.951, 6.914, 9.532, 12.824, 13.224, 13.478, 13.628, 13.912, 13.951, 13.978, 13.989, 13.997  The results for virtual formation 2 are shown in Figure 13. For in-plane motion, the maneuvers' times are [   The results for virtual formation 3 are shown in Figure 14. For in-plane motion, the maneuvers' times are [   Secondly, applying the method proposed in Section 3.1 to the discrete entire task interval, each subinterval of the in-plane is [0, 3.10] d, [  The results for virtual formation 3 are shown in Figure 14. For in-plane motion, the maneuvers' times are [1.544, 3.  The results for virtual formation 2 are shown in Figure 17.   The results for virtual formation 3 are shown in Figure 17.   Comparing the results of the two approaches, it can be seen that both methods can obtain maneuver solutions that satisfy the dynamical constraints; however, there are some differences in the maneuver cost and maneuver distribution. Firstly, in terms of the maneuver cost, theoretically analyzed, the conventional method does not constrain the complete solution space and is supposed to obtain a relatively better solution, but from the Comparing the results of the two approaches, it can be seen that both methods can obtain maneuver solutions that satisfy the dynamical constraints; however, there are some differences in the maneuver cost and maneuver distribution. Firstly, in terms of the maneuver cost, theoretically analyzed, the conventional method does not constrain the complete solution space and is supposed to obtain a relatively better solution, but from the actual results, the delta-v consumptions of the three satellites obtained by the two methods are [0.3895, 0.2130, 0.0979] m/s and [0.2030, 0.1806, 0.1100] m/s, respectively. The proposed method performs better instead. This phenomenon may be caused by two reasons: one because the algorithm easily falls into local optimum due to a large number of variables and complex constraints of the problem under study; another is that the conventional method does not normalize the variables, which is not beneficial to the convergence of the solution. Second, in terms of the distribution of maneuver profiles, the maneuver profiles obtained using the conventional method tend to impose two maneuvers of longer duration followed by a series of short-duration maneuvers for quick adjustment in the final stage. In contrast, the maneuver solutions obtained by the method proposed in this paper are more uniform regarding the location and duration of each maneuver due to the discretization of the mission time. Finally, in terms of accuracy, both methods have comparable accuracy, and the maximum relative position error is no more than 4 m and the maximum relative velocity error is no more than 4 mm/s. It should be noted that some factors in the paper (e.g., random variation probability, inertia factor, learning factor, penalty factor, etc.) are set empirically, which will affect the performance of the algorithm to some extent; in addition, the discretization method of the whole task interval will also affect the acquisition of the optimal solution.
To verify the universality of the proposed method, Tables 7 and 8 are obtained by applying the method to different thruster schemes and different given mission times. In different thruster schemes, the mission time is limited to 14 d; in different given mission time schemes, the thruster amplitude is given as [400, 400, 200] µN. The simulation results show that the proposed method is well-suited and flexible enough to meet the possible formation reconfiguration requirements in space gravitational wave missions. In addition, the data in Table 7 show that the total delta-v does not present a clear regularity as the thruster amplitude changes. The data in Table 8 show that the total delta-v generally tends to decrease and then increases with the gradual increase in the given mission time, which is because part of the offset of δλ can be corrected by the natural dynamics established by δa, the divergence dominating again with the prolongation of the task time. It should be noted that the above analysis is limited to the phenomenon presented in this study and cannot be taken as a general rule.

Conclusions
This paper addressed the design of the fuel-minimum maneuvering strategy for the formation reconfiguration in the high earth orbit (10 5 km). In this study, a control strategy utilizing virtual formation is employed to address the issue of measurement-limited formation. The relative motion of real satellites relative to a virtual reference point is described by a linear dynamic model, which encompasses J 2 , SRP, and the third-body gravitational effects of the Sun and Moon. A piecewise continuous control strategy is chosen and the entire mission interval is discretized in the reconfiguration process to avoid persistent or high-frequency orbital maneuver disturbances to the satellite platform. The minimum fuel reconfiguration problem is transformed into a constrained nonlinear optimization problem, and an improved particle swarm algorithm is proposed for maneuver planning solutions.
Simulation results indicate that the improved particle swarm optimization algorithm has better stability and superior global search capability, and the proposed reconfiguration control strategy can flexibly respond to different complex scenarios during gravitational wave missions with less interference to the satellite platform. In addition, the maximum relative position error is only 3.454 m and the maximum relative velocity error is only 3.390 mm/s, which can satisfy the precision requirements of space gravitational wave formation maintenance and reconfiguration.
Future work will include developing more efficient optimization algorithms, and improving the dynamical models used to derive multi-objective optimization strategies, as well as extending the continuous scheme to orbits of arbitrary eccentricity and to various space missions.

Conflicts of Interest:
The authors declare no conflict of interest.

Nomenclature
S ci , S ri Virtual reference spacecraft and physical spacecraft L ij Arm length between spacecraft S ri and S rj A i Breathing angle corresponding to spacecraft S ri ∆r Envelope radius of the virtual formation ∆L Arm length variation index