Article

A Hybrid Game Strategy for the Pursuit of Out-of-Control Spacecraft under Incomplete-Information

1
Research Center of Satellite Technology, Harbin Institute of Technology, Harbin 150001, China
2
Electrical and Computer Engineering, University of Canterbury, Christchurch 8020, New Zealand
3
Satellite Technology and Research Centre, National University of Singapore, Singapore 117292, Singapore
*
Author to whom correspondence should be addressed.
Aerospace 2022, 9(8), 455; https://doi.org/10.3390/aerospace9080455
Submission received: 16 July 2022 / Revised: 4 August 2022 / Accepted: 12 August 2022 / Published: 18 August 2022
(This article belongs to the Section Astronautics & Space Science)

Abstract

This paper investigates the pursuit problem of out-of-control spacecraft under incomplete information, and provides new ideas for the disposal of dangerous spacecraft with obstacle avoidance capability. Throughout the pursuit process, the maneuver strategy of the out-of-control spacecraft is unknown, and its possibly unconventional and irregular maneuvers may endanger the safe operation of other spacecraft on orbit. Based on differential game theory, complete-information game strategy pairs are derived. Then, considering that the control information of the target is unavailable to the pursuer, the target’s maneuver is regarded as a disturbance term, and the incomplete-information game strategy is derived from the unilateral optimal cost function. Furthermore, a disturbance estimator is designed to identify the missing information of the target, and the optimal hybrid game strategy is proposed to compensate for the target maneuver strategy. Simulations have been conducted, and the results show that the missing information can be effectively estimated by the estimator. The designed hybrid game strategy can achieve a rapid approach while saving fuel for on-orbit service.

1. Introduction

With the increasing frequency of commercial activities since the 1960s, the number of spacecraft continues to grow rapidly. Unavoidably, the number of spacecraft suffering launch or on-orbit operation failures has also increased [1]. Once spacecraft with autonomous maneuvering and decision-making capabilities are out of control, their potentially unconventional and irregular maneuvers endanger the safe operation of other spacecraft on orbit, and the losses would be difficult to estimate. Therefore, in recent research on on-orbit service, the safe approach and processing of out-of-control spacecraft have received increasing attention. In general, a satellite that carries de-orbiting devices can be disposed of by analyzing its orbital data towards the end of its mission. However, once a satellite is out of control, it has to be disposed of on orbit by an active debris removal spacecraft.
In this paper, an approach method for a class of uncontrolled satellites with space situational awareness is studied. Such satellites can avoid obstacles and make autonomous decisions to keep on-orbit service spacecraft away. Since the target can automatically perceive space information and adopt unconventional maneuver strategies, a bilateral optimal strategy based on game theory, which accounts for irregular target maneuvers, is more suitable than the traditional unilateral optimal control strategy [2]. In addition, the out-of-control satellite has unknown maneuvers. Consequently, the pursuer cannot obtain complete information about the target through traditional means, such as two-line elements (TLE), which leads to an incomplete-information game. Therefore, it is of great significance to study the pursuit game problem of the out-of-control spacecraft with obstacle avoidance capability under incomplete information. We explore the space rendezvous and approach of out-of-control spacecraft as a game problem, without considering non-peaceful purposes.
There have been many studies on the two-sided game in which the target has maneuverability, and most of these are based on differential game theory and applied in the military field [3]. Ye [4] derived vector guidance for satellite proximate interception. A formula for calculating the intercept time was given, by which interception with the desired miss distance could be achieved. Stupik [5] derived the optimal thrust angle for the satellite game, based on the Hill-Clohessy-Wiltshire (HCW) equations. Particle swarm optimization was used to obtain an open-loop solution, while a closed-loop control strategy was obtained by interpolating and extrapolating a series of trajectories. For a game with complex dynamics of the two players, the problem evolves into a two-point boundary value problem (TPBVP) with coupled nonlinear equations, which is difficult to solve. To address this problem, the authors of Refs. [6,7,8] used a genetic algorithm to approximate the initial value of the adjoint state, after which the exact value could be calculated by nonlinear programming. In Ref. [9], the thrust configurations of the two players were assumed to be different. The pursuit–evasion game was then transformed into a TPBVP with four unknowns and four nonlinear equations. Based on the indirect method, a numerical algorithm for solving the TPBVP was proposed. Gutman [10] derived missile vector guidance laws in polar and spherical coordinates, respectively. Furthermore, a quadratic equation in the time-to-go was proposed to determine the terminal intercept time, and the bifurcation of this equation was discussed in Refs. [11,12]. With this method, the shortest intercept time could be calculated and rapid interception could be achieved.
Game theory also provides a suitable framework for studying sophisticated optimal decision and control problems, where the performance index of each player depends on its own control strategy and on those of all other players. In Ref. [13], the two-sided game was considered first, and the target strategy could be optimized based on the attacking missile strategy. In addition, the author analyzed the game with three players. The target evasion strategy and the defender pursuit strategy were derived cooperatively. Simulations indicated that the defender could intercept the attacking missile using the cooperative strategy. Perelman [14] considered the three-player game and derived closed-loop game strategies based on continuous dynamics and discrete dynamics, respectively. Then, the guidance gains and the conditions for the existence of a saddle point were analyzed for the different game cases. Simulations showed that the pursuer could be attacked by the defender, with the target used as bait to lure the pursuer. Ref. [15] proposed a differential game method to stabilize a combined spacecraft composed of multiple microsatellites and a failed spacecraft, in which each microsatellite could independently calculate its own control strategy. Numerical simulations verified the effectiveness of the differential game method in the attitude takeover control of the failed spacecraft.
The studies mentioned above apply complete-information game strategies under idealized conditions. This paper focuses on the safe approach of uncontrolled spacecraft with continuous and dynamic obstacle avoidance ability under incomplete information. In a practical pursuit process, some players have private information, and the others should take this into account when forming expectations of these players’ behaviors. The pursuer may not obtain the game information completely due to lost contact with the target or damage to its components. To address the incomplete-information problem, Ref. [16] considered the evasion defense problem for a given interceptor strategy. For the target and the defender, the optimal strategy with one-way communication and the game strategy with two-way communication were derived, respectively. Refs. [17,18] investigated the incomplete-information and imperfect-information situations based on double-integrator dynamics. The missing information was treated as an extended state, and an observer was then applied to estimate it. Satak [19,20] used a series expansion to approximate the unknown game value function, which played a key role in strategy derivation, and the series coefficients were updated using the observable target information. In Refs. [21,22], the authors used a multiple-model adaptive estimator to identify the unknown guidance law of the pursuer. Several filters matched to possible guidance laws were applied to obtain the estimated posterior probability. By fusing the guidance laws according to these probabilities, the target strategy and the defender strategy could be derived. Wang [23] proposed a method to reduce the game to a strong tracking problem. The extended Kalman filter was used to obtain the relative state of the target, and the observability was analyzed under different measurement methods, based on linear quadratic differential game theory.
Similar to the optimization objective considered in that work, one of the goals of our paper is to approach the target with minimal fuel consumption.
We should consider not only the completeness and symmetry of the information during the dynamic game scenario, but also the potentially unconventional and irregular changes of the target maneuver strategy. The pursuer can adopt hybrid game strategies to cope effectively with capricious changes of the target maneuver strategy. Hafer [24] considered the scenario where the spacecraft needed a hybrid strategy to avoid obstacles and evade another spacecraft. By comparing the game value and the obstacle value, the spacecraft would switch its strategy to balance the two missions. Turesky [25] derived an evasion strategy for hybrid pursuit dynamics. If the switch information was available to the target, the evasion strategy could be given in a bang-bang form, which relied on the switch function and the zero-effort miss (ZEM). On the other hand, a matrix game was formulated for the incomplete-information scenario. Through the Nash equilibrium, a mixed saddle-point solution could be given, which guaranteed a lower bound on the ZEM. Shinar [26,27] considered the pursuit-evasion game with hybrid pursuer dynamics and evader dynamics. On this basis, the corresponding opponent’s strategies were derived and the capture zone was constructed.
In summary, differential game theory accounts for the majority of current research methods for game problems in aerospace [28,29,30]. Although there is much research on pursuit–evasion, most of the discussions concern the situation in which the pursuer can obtain complete information about the target. However, due to lost contact with the target or the failure of its components, the target information is not completely available to the pursuer in practice, which can make the pursuit unachievable. Moreover, although the authors of Refs. [17,22] consider the incomplete-information situation, the player dynamics are simplified, which is not consistent with the actual scenario of spacecraft pursuit. The formation of a game relationship requires a conflict of interests between spacecraft. A large body of literature has studied the application of differential games in the field of target interception, whereas this paper focuses on their application in the field of on-orbit services. Thus, this paper applies differential game theory to the space rendezvous and approach of runaway spacecraft, regardless of its non-peaceful uses. We assume that the pursuer can change its maneuver strategy during the game, and that the target knows the maneuver strategies that the pursuer may take; that is, the target has complete information during the game, but does not know the actual game strategy taken by the pursuer. In this scenario, we propose the optimal hybrid game strategy for the pursuit of the out-of-control spacecraft using incomplete information about it. The designed hybrid game strategy can achieve a rapid approach while saving fuel. This approach strategy, based on game theory, can also deal with failed targets with dynamic avoidance capability in future constellations.
This paper is organized as follows: the relative dynamics model is first established in Section 2, the complete information game strategy pairs and the game strategies under incomplete information are subsequently derived in Section 3. The target missing information is estimated by the disturbance estimator and the corresponding optimal hybrid strategy is then presented in Section 4. Simulation study has been conducted to verify the proposed hybrid game strategy and evaluate the satellite safe approach performance in Section 5. Lastly, conclusions are presented in Section 6.

2. Relative Dynamics Model

For the spacecraft terminal approach, we establish a moving orbital coordinate frame to derive the relative dynamics of the pursuer [4]. As shown in Figure 1, a satellite near the pursuer $P$ is chosen as the reference satellite $O_1$. A non-inertial orbital coordinate frame $O_1xyz$, known as the local vertical local horizon (LVLH) frame [31], is established with the center of the reference satellite as the origin. The $O_1x$ axis is directed from the Earth’s center to the reference satellite. The $O_1z$ axis points along the orbital angular momentum of the reference satellite, and the $O_1y$ axis completes the right-handed frame.
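Let $\mathbf{r}_1$, $\mathbf{r}_2$ be the positions of the reference satellite and the pursuer in the Earth inertial coordinate frame $OXYZ$, with magnitudes $r_1$, $r_2$. Then, the dynamics of the two satellites can be given as
$$\ddot{\mathbf{r}}_1 = -\frac{\mu}{r_1^3}\,\mathbf{r}_1, \qquad \ddot{\mathbf{r}}_2 = -\frac{\mu}{r_2^3}\,\mathbf{r}_2 + \mathbf{f} \tag{1}$$
where $\mu$ is the Earth's gravitational constant and $\mathbf{f}$ is the thrust acceleration of the pursuer.
Defining the relative position between the pursuer and the reference satellite as $\delta\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ and differentiating it, we obtain the dynamics of the pursuer as
$$\delta\ddot{\mathbf{r}} = \ddot{\mathbf{r}}_2 - \ddot{\mathbf{r}}_1 = -\frac{\mu}{r_2^3}\,\mathbf{r}_2 + \frac{\mu}{r_1^3}\,\mathbf{r}_1 + \mathbf{f} \tag{2}$$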
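Since the orbital coordinate frame rotates with the motion of the reference satellite, the relative position derivatives can be derived from the vector differentiation relations as follows:
$$\delta\dot{\mathbf{r}} = \delta\mathbf{r}' + \boldsymbol{\omega}\times\delta\mathbf{r}, \qquad \delta\ddot{\mathbf{r}} = \delta\mathbf{r}'' + 2\boldsymbol{\omega}\times\delta\mathbf{r}' + \dot{\boldsymbol{\omega}}\times\delta\mathbf{r} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\delta\mathbf{r}) \tag{3}$$
where $\delta\mathbf{r}'$, $\delta\mathbf{r}''$ are the first and second derivatives of $\delta\mathbf{r}$ in the orbital coordinate frame, and $\boldsymbol{\omega}$ represents the angular velocity of the orbital coordinate frame.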
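Therefore, the dynamics of the pursuer in the orbital coordinate frame can be rewritten as follows, where the subscripts $O$ and $P$ denote the reference satellite and the pursuer (that is, $\mathbf{r}_O = \mathbf{r}_1$ and $\mathbf{r}_P = \mathbf{r}_2$):
$$\delta\mathbf{r}'' = -2\boldsymbol{\omega}\times\delta\mathbf{r}' - \dot{\boldsymbol{\omega}}\times\delta\mathbf{r} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\delta\mathbf{r}) - \frac{\mu}{r_P^3}\,\mathbf{r}_P + \frac{\mu}{r_O^3}\,\mathbf{r}_O + \mathbf{f} \tag{4}$$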
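Assume that the reference satellite moves along a circular orbit and that the relative distance between the pursuer and the reference satellite is far less than the geocentric distance of the reference satellite, namely, $\delta r \ll r$. Then the angular velocity of the orbital frame is constant:
$$\boldsymbol{\omega} = [\,0 \;\; 0 \;\; 1\,]^T\,\omega, \qquad \dot{\boldsymbol{\omega}} = [\,0 \;\; 0 \;\; 0\,]^T \tag{5}$$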
The dynamics of the pursuer in the LVLH frame can then be simplified to the Clohessy-Wiltshire (CW) equations:
$$\begin{aligned} \ddot{x} - 2\omega\dot{y} - 3\omega^2 x &= u_x \\ \ddot{y} + 2\omega\dot{x} &= u_y \\ \ddot{z} + \omega^2 z &= u_z \end{aligned} \tag{6}$$
where $x$, $y$, $z$ are the position components of the pursuer in the LVLH frame, and $u_x$, $u_y$, $u_z$ represent the three-axis thrust accelerations of the pursuer.
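Defining the state variable $X = [\,x \;\; y \;\; z \;\; \dot{x} \;\; \dot{y} \;\; \dot{z}\,]^T$ and the control acceleration variable $U = [\,u_x \;\; u_y \;\; u_z\,]^T$, we can rewrite Equation (6) in state-space form as:
$$\dot{X} = AX + BU \tag{7}$$
with the matrices:
$$A = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 3\omega^2 & 0 & 0 & 0 & 2\omega & 0 \\ 0 & 0 & 0 & -2\omega & 0 & 0 \\ 0 & 0 & -\omega^2 & 0 & 0 & 0 \end{bmatrix}, \quad \text{and} \quad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$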
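As an illustrative sketch (not the authors' implementation), the state-space model of Equation (7) can be evaluated numerically as follows. The function names `cw_matrices` and `state_derivative` are hypothetical, and plain Python lists are used so that the snippet has no dependencies.

```python
def cw_matrices(omega):
    """Return (A, B) of X' = A X + B U for X = [x, y, z, vx, vy, vz]."""
    w = omega
    A = [
        [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
        [3.0 * w * w, 0.0, 0.0, 0.0, 2.0 * w, 0.0],
        [0.0, 0.0, 0.0, -2.0 * w, 0.0, 0.0],
        [0.0, 0.0, -w * w, 0.0, 0.0, 0.0],
    ]
    # B stacks a 3x3 zero block on top of a 3x3 identity block.
    B = [[0.0] * 3 for _ in range(3)]
    B += [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    return A, B


def state_derivative(A, B, X, U):
    """Evaluate A X + B U for a 6-element state X and a 3-element control U."""
    return [
        sum(A[i][j] * X[j] for j in range(6)) + sum(B[i][k] * U[k] for k in range(3))
        for i in range(6)
    ]
```

A pure along-track offset with zero velocity and zero control gives a zero derivative, which is a quick sanity check on the signs of the coupling terms.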
We select another satellite as the target $E$. For the terminal approach, the pursuer $P$ is close to the target $E$ in position. As in the derivation above, a satellite near them is chosen as the reference satellite to establish the LVLH frame. Thus, the relative dynamics for the two players in the LVLH frame are as follows:
$$\dot{X}_P = AX_P + BU_P, \qquad \dot{X}_E = AX_E + BU_E \tag{8}$$
where $X_i$ ($i = P, E$) denotes the state of the pursuer $P$ or the target $E$ in the orbital coordinate frame, and $U_i$ ($i = P, E$) denotes the corresponding thrust acceleration, which satisfies the magnitude constraint $\|U_i\| < \rho_i$ ($i = P, E$).
We define the relative state between the pursuer and the target as $X_{PE} = X_P - X_E$.
By differentiating the relative state and combining with Equation (8), the relative dynamics can be written as:
$$\dot{X}_{PE} = AX_{PE} + BU_P + CU_E \tag{9}$$
where $C = -B$.

3. Game Strategies under Incomplete Information

From Ref. [4], the necessary condition for successful interception is that the thrust magnitude of the pursuer is higher than that of the target. When the target's thrust is higher than the pursuer's, the target will successfully escape. In practical applications, due to private information or sensor constraints, the pursuit–evasion game information may not be completely known to the pursuer. Therefore, based on the complete-information game strategy pairs, this section studies the pursuit game strategies under incomplete information. The proposed norm-based game strategy can achieve a fast approach, while the linear quadratic game strategy can save fuel.

3.1. Norm-Based Game Strategy

The goal of the pursuer is to compete with the target in terms of terminal approach distance. In the norm-based game strategy, it is sufficient to consider only the relative displacement between the pursuer and the target. To facilitate the later analysis, we define the terminal zero-effort miss (ZEM) between the pursuer and the target as $Z_{PE}(t)$.
The norm-based strategy, which ignores fuel optimization, is derived from the ZEM:
$$Z_{PE}(t) = D\,\Phi(t_f, t)\,X_{PE} \tag{10}$$
where $D = [\,I_3 \;\; 0_3\,]$ with $I_3 \in \mathbb{R}^{3\times 3}$, and $t_f$ is the terminal time of the interception. $\Phi(t_f, t)$ is the state-transition matrix of the system (9), whose explicit form is given in [4], and satisfies:
$$\dot{\Phi}(t_f, t) = -\Phi(t_f, t)\,A \tag{11}$$
Differentiating Equation (10) and using Equations (9) and (11) yields:
$$\dot{Z}_{PE}(t) = D\left(\dot{\Phi}(t_f, t)X_{PE} + \Phi(t_f, t)\dot{X}_{PE}\right) = D\left(\Phi(t_f, t)BU_P - \Phi(t_f, t)BU_E\right) = B_P U_P - B_E U_E \tag{12}$$
where $B_P = B_E = D\,\Phi(t_f, t)\,B = \Phi_{12}(t_f, t)$.
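The ZEM of Equation (10) can be sketched numerically as below. The paper takes the explicit form of $\Phi(t_f, t)$ from Ref. [4]; the blocks used here are the standard closed-form Clohessy-Wiltshire solution for the axis convention of Equation (6), written in terms of the time-to-go $\tau = t_f - t$, which is an assumption of this sketch. The function names are hypothetical.

```python
import math


def cw_phi_upper_blocks(omega, tau):
    """Upper blocks (Phi_11, Phi_12) of the CW state-transition matrix over time-to-go tau."""
    s, c = math.sin(omega * tau), math.cos(omega * tau)
    phi11 = [
        [4.0 - 3.0 * c, 0.0, 0.0],
        [6.0 * (s - omega * tau), 1.0, 0.0],
        [0.0, 0.0, c],
    ]
    phi12 = [
        [s / omega, 2.0 * (1.0 - c) / omega, 0.0],
        [-2.0 * (1.0 - c) / omega, (4.0 * s - 3.0 * omega * tau) / omega, 0.0],
        [0.0, 0.0, s / omega],
    ]
    return phi11, phi12


def zero_effort_miss(omega, tau, x_pe):
    """Z = D Phi X_PE: relative position at t_f if neither player thrusts from now on."""
    phi11, phi12 = cw_phi_upper_blocks(omega, tau)
    r, v = x_pe[:3], x_pe[3:]
    return [
        sum(phi11[i][j] * r[j] + phi12[i][j] * v[j] for j in range(3))
        for i in range(3)
    ]
```

Because $B = [\,0_3 \;\; I_3\,]^T$, the block `phi12` is exactly the matrix $B_P = B_E = \Phi_{12}(t_f, t)$ that multiplies the controls in Equation (12).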
Throughout the pursuit-evasion game, the pursuer expects the target to enter one of the pursuit zones in a limited time, while the target tries to avoid it by increasing the distance with maximum thrust. Therefore, we define the cost function $J$ as:
$$\min_{U_P}\max_{U_E} J = \left\|Z_{PE}(t_f)\right\| \tag{13}$$
Define the cost function along a solution as:
$$\tilde{J} = \left\|Z_{PE}(t)\right\| \tag{14}$$
Differentiating it and combining with Equation (12), we have:
$$\frac{d}{dt}\left\|Z_{PE}(t)\right\| = \frac{Z_{PE}^T(t)}{\left\|Z_{PE}(t)\right\|}\,\dot{Z}_{PE}(t) = \frac{Z_{PE}^T(t)}{\left\|Z_{PE}(t)\right\|}\left(B_P U_P - B_E U_E\right) \tag{15}$$
Since the pursuer $P$ intends to minimize $J$, its strategy $U_P$ should satisfy $\frac{d\tilde{J}}{dt} < 0$. Therefore, the control strategy of the pursuer $P$ with the maximal magnitude constraint can be derived as:
$$U_P = -\rho_P \frac{\left(\dfrac{Z_{PE}^T(t)}{\left\|Z_{PE}(t)\right\|} B_P\right)^T}{\left\|\dfrac{Z_{PE}^T(t)}{\left\|Z_{PE}(t)\right\|} B_P\right\|} = -\rho_P \frac{B_P^T Z_{PE}(t)}{\left\|B_P^T Z_{PE}(t)\right\|} \tag{16}$$
On the contrary, the target $E$ tries to maximize $J$, and its strategy $U_E$ should satisfy $\frac{d\tilde{J}}{dt} > 0$. Therefore, the control strategy of the target $E$ with the maximal magnitude constraint can be derived as:
$$U_E = -\rho_E \frac{\left(\dfrac{Z_{PE}^T(t)}{\left\|Z_{PE}(t)\right\|} B_E\right)^T}{\left\|\dfrac{Z_{PE}^T(t)}{\left\|Z_{PE}(t)\right\|} B_E\right\|} = -\rho_E \frac{B_E^T Z_{PE}(t)}{\left\|B_E^T Z_{PE}(t)\right\|} \tag{17}$$
For the norm-based strategy, the cost function only takes into account the relative position between the pursuer and the target; that is, the pursuer's strategy in Equation (16) depends on $Z_{PE}(t)$ rather than on the target strategy. Thus, the absence of the target strategy information has no impact on the norm-based strategy.
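The saturated direction law shared by Equations (16) and (17) can be sketched as follows. The function name `norm_based_control` is hypothetical, and returning zero thrust when $B^T Z_{PE}$ vanishes is an assumption of this sketch, since the paper does not discuss that singular case.

```python
import math


def norm_based_control(rho, b_matrix, zem):
    """Return U = -rho * B^T Z / ||B^T Z|| for a 3x3 matrix B and a 3-vector Z."""
    # g = B^T Z, written out with explicit index sums.
    g = [sum(b_matrix[j][i] * zem[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(v * v for v in g))
    if norm == 0.0:
        # Assumption: the direction is undefined here, so apply no thrust.
        return [0.0, 0.0, 0.0]
    return [-rho * v / norm for v in g]
```

By construction the returned control has magnitude $\rho$, and its contribution $Z_{PE}^T B\,U = -\rho\,\|B^T Z_{PE}\|$ is negative, which is what drives $\|Z_{PE}\|$ down for the pursuer in Equation (15).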

3.2. Linear Quadratic Game Strategy

First, we assume that both players can obtain the complete information of the game in the zero-sum equilibrium game. The pursuer aims to pursue the target with minimal cost, while the target aims to increase its distance from the pursuer with minimal cost. Thus, we define the linear quadratic cost function as follows:
$$\min_{U_P}\max_{U_E} J = \frac{1}{2}X_{PE}^T(t_f)\,S\,X_{PE}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left(X_{PE}^T Q X_{PE} + U_P^T R_P U_P - U_E^T R_E U_E\right)dt \tag{18}$$
where $S > 0$, $R_P > 0$ and $R_E > 0$ are symmetric positive definite cost matrices, $Q \geq 0$ is a symmetric positive semidefinite cost matrix, and $t_f$ is the terminal time of the game.
Different from the norm-based game strategy, the cost function of the linear quadratic game strategy takes into account fuel consumption. If both sides in the game have complete information about the opponent, the game is zero-sum and associates with a saddle-point strategy pair.
To derive the saddle-point strategy pair, a Hamiltonian function is introduced as:
$$H = \lambda^T\left(AX_{PE} + BU_P - BU_E\right) + \frac{1}{2}\left(X_{PE}^T Q X_{PE} + U_P^T R_P U_P - U_E^T R_E U_E\right) \tag{19}$$
where $\lambda$ is the adjoint variable.
Thus, the game saddle-point strategy pair is obtained as:
$$U_P = -R_P^{-1}B^T P X_{PE}, \qquad U_E = R_E^{-1}C^T P X_{PE} \tag{20}$$
where $P$ can be determined by:
$$\dot{P} + A^T P + PA - P\left(BR_P^{-1}B^T - BR_E^{-1}B^T\right)P + Q = 0 \tag{21}$$
and $P$ satisfies the terminal condition:
$$P(t_f) = S \tag{22}$$
Next, we derive the strategies under incomplete information. In practice, the game information may not be completely known to the pursuer, in which case the strategies derived in Equation (20) are not applicable. Thus, the approach strategy under incomplete information is studied, and we make the following assumption:
Assumption 1.
The pursuer can obtain the target state information, instead of the target maneuver information. On the contrary, the target can perceive the complete information of the whole game.
Under this assumption, the pursuer considers the target strategy to be $U_E = 0$, and the relative dynamics can be written as:
$$\dot{X}_{PE} = AX_{PE} + BU_P \tag{23}$$
Thus, the pursuit strategy can be derived by minimizing the one-sided cost function
$$\min_{U_P} J = \frac{1}{2}X_{PE}^T(t_f)\,S\,X_{PE}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left(X_{PE}^T Q X_{PE} + U_P^T R_P U_P\right)dt \tag{24}$$
where $S$, $Q$ and $R_P$ are identical to those in Equation (18).
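The Hamiltonian function and the terminal cost are defined as:
$$H = \lambda^T\left(AX_{PE} + BU_P\right) + \frac{1}{2}\left(X_{PE}^T Q X_{PE} + U_P^T R_P U_P\right) \tag{25}$$
$$\Phi = \frac{1}{2}X_{PE}^T(t_f)\,S\,X_{PE}(t_f) \tag{26}$$
From the optimal control condition $\frac{\partial H}{\partial U_P} = 0$, we have
$$U_P = -R_P^{-1}B^T\lambda \tag{27}$$
The necessary condition for the optimal saddle-point solution includes the adjoint equation $\dot{\lambda} = -\frac{\partial H}{\partial X_{PE}}$, namely:
$$\dot{\lambda} = -\left(A^T\lambda + QX_{PE}\right) \tag{28}$$
in conjunction with the boundary condition:
$$\lambda(t_f) = \frac{\partial \Phi}{\partial X_{PE}(t_f)} = S\,X_{PE}(t_f) \tag{29}$$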
Assume that the adjoint variable and the relative state satisfy the linear relationship:
$$\lambda = P_p X_{PE} \tag{30}$$
where $P_p$ is also a symmetric positive definite matrix, namely, $P_p > 0$, $P_p^T = P_p$. Then, the assumption given in Equation (30) is valid.
Differentiating Equation (30) and combining with Equations (23), (27) and (28), we obtain the Riccati equation:
$$\dot{P}_p + A^T P_p + P_p A - P_p B R_P^{-1} B^T P_p + Q = 0 \tag{31}$$
From the boundary condition in Equation (29), $P_p$ also satisfies the terminal condition:
$$P_p(t_f) = S \tag{32}$$
Thus, the one-sided pursuit strategy is determined by Equations (27) and (30)–(32), namely:
$$U_P = -R_P^{-1}B^T P_p X_{PE} \tag{33}$$
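To make the backward sweep of Equations (31)–(33) concrete, the following minimal sketch treats a scalar analogue, $\dot{x} = ax + bu$, for which the Riccati equation reduces to $\dot{p} + 2ap - b^2p^2/r + q = 0$ with $p(t_f) = s$. The scalar simplification, the fixed-step RK4 integrator and the function names are all assumptions made for illustration; the paper works with the 6x6 matrix equation.

```python
def riccati_backward_scalar(a, b, q, r, s, t0, tf, steps=1000):
    """Integrate p' = -(2 a p - b^2 p^2 / r + q) backward from p(tf) = s; return p(t0)."""

    def p_dot(p):
        return -(2.0 * a * p - (b * b / r) * p * p + q)

    h = (tf - t0) / steps
    p = s
    for _ in range(steps):
        # Classical RK4 step taken with a negative step size (-h).
        k1 = p_dot(p)
        k2 = p_dot(p - 0.5 * h * k1)
        k3 = p_dot(p - 0.5 * h * k2)
        k4 = p_dot(p - h * k3)
        p -= (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p


def lq_feedback_scalar(p, b, r, x):
    """Scalar counterpart of Equation (33): u = -(b / r) * p * x."""
    return -(b / r) * p * x
```

For $a = 0$ and $q = 0$ the scalar equation has the closed-form solution $1/p(t) = 1/s + b^2(t_f - t)/r$, which gives a simple check on the integration direction and signs.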
For the target, the target knows the maneuver strategy that the pursuer may take, that is, the target has the complete information about the game process, but does not know the actual strategy taken by the pursuer after estimating the incomplete information. Therefore, the target’s game strategy can be optimized by considering the pursuer’s game strategy to obtain better avoidance performance.
Thus, with the pursuer using strategy (33), the relative dynamics can be rewritten as:
$$\dot{X}_{PE} = AX_{PE} - BR_P^{-1}B^T P_p X_{PE} + CU_E = A_c X_{PE} + CU_E \tag{34}$$
where $A_c$ is a time-varying matrix defined as $A_c = A - BR_P^{-1}B^T P_p$.
The target's one-sided optimization cost function is defined as:
$$\max_{U_E} J = \frac{1}{2}X_{PE}^T(t_f)\,S\,X_{PE}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left(X_{PE}^T Q X_{PE} - U_E^T R_E U_E\right)dt \tag{35}$$
where $S$, $Q$ and $R_E$ are identical to those in Equation (18).
Similar to the derivation of the pursuer strategy, the optimal strategy of the target is given as:
$$U_E = -R_E^{-1}B^T P_E X_{PE} \tag{36}$$
where $P_E$ is symmetric positive definite and satisfies:
$$\dot{P}_E + A_c^T P_E + P_E A_c + P_E C R_E^{-1} C^T P_E + Q = 0 \tag{37}$$
with the terminal condition:
$$P_E(t_f) = S \tag{38}$$

4. Optimal Hybrid Game Strategies

For the norm-based strategy, the interception can be guaranteed as long as the pursuer's thrust magnitude is greater than the target's. Nevertheless, the fuel consumption term is not optimized, which leads to an increase in fuel consumption. On the other hand, the linear quadratic strategy can reduce fuel consumption, although the pursuit may not be realized because of the target’s strong maneuverability or changeable maneuvering law. Thus, the hybrid game strategy, combining the linear quadratic compensation strategy with the norm-based strategy, is proposed to guarantee the approach and reduce fuel consumption. The proposed hybrid game strategy can combine multiple game algorithms under a single coherent framework, as shown in Figure 2. The designed hybrid game strategy can achieve a rapid approach and save fuel, even when the target can perceive complete information about the pursuer and thus obtain better evasion performance in the game.
To model the system dynamics in the hybrid framework, the hybrid system state is defined as:
$$x = \begin{bmatrix} X \\ q \\ \tau \end{bmatrix} \tag{39}$$
where $q$ is the game control logic variable and $\tau$ is the timer variable. The value of $q$ specifies which game strategy is currently being used, with $q \in \{1, 2\}$, where $q = 1$ represents the linear quadratic compensation strategy and $q = 2$ represents the norm-based strategy. The variable $q$ introduces discrete dynamics into the system. For this problem, the hybrid system $H$ is defined as:
$$H = (C, F, D, G) \tag{40}$$
where $F$ is the flow map, $G$ is the jump map, $C$ is the flow set, and $D$ is the jump set. That is, $C$ is the set of states $x$ in which the system follows the continuous-time dynamics defined by $F$. Similarly, $D$ and $G$ are defined for the discrete dynamics.
The continuous dynamics are defined by the flow map $F$:
$$F = \begin{bmatrix} AX_{PE} + BU_P - BU_E \\ 0 \\ 1 \end{bmatrix} \tag{41}$$
The discrete dynamics are defined by the jump map $G$:
$$G = \begin{bmatrix} AX_{PE} \\ 3 - q \\ 0 \end{bmatrix} \tag{42}$$
The flow set $C$ and the jump set $D$ are defined later. For the norm-based strategy, the strategy is given in Equation (16).
For the linear quadratic strategy, the disturbance observer is designed to estimate the unavailable target maneuver information, and the compensation strategy is derived. To simplify the analysis later, an assumption is given as follows:
Assumption 2.
The target perceives the pursuer's maneuver strategy under incomplete information, namely, Equation (33), but does not know that the pursuer uses the disturbance estimator.
To apply the disturbance estimator, we define the negative of the target strategy as the disturbance $d$, namely, $d = -U_E$. Combining this with $C = -B$, the relative dynamics can be written as:
$$\dot{X}_{PE} = AX_{PE} + BU_P + Bd \tag{43}$$
Decomposing the relative state as $X_{PE} = [\,x_1 \;\; x_2\,]^T$, we obtain the relative dynamics of $x_2$:
$$\dot{x}_2 = A_{21}x_1 + A_{22}x_2 + U_P + d \tag{44}$$
with
$$A_{21} = \begin{bmatrix} 3\omega^2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -\omega^2 \end{bmatrix}, \qquad A_{22} = \begin{bmatrix} 0 & 2\omega & 0 \\ -2\omega & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Thus, we design the disturbance estimator as follows:
$$\hat{d} = z + p(X_{PE}), \qquad \dot{z} = -L(X_{PE})\left[\hat{d} + A_{21}x_1 + A_{22}x_2 + U_P\right] \tag{45}$$
where the disturbance observer gain $L(X_{PE})$ is positive definite and $p(X_{PE})$ satisfies $L(X_{PE}) = \frac{\partial p(X_{PE})}{\partial X_{PE}}$. Here, $\hat{d}$ is the estimate of $d$ used to compensate the target strategy, and $z$ is the internal auxiliary variable of the estimator.
We define the estimation error of the system (45) as $\tilde{d} = -(\hat{d} + U_E)$.
If the disturbance $d$ defined in this paper is continuous and satisfies $\|\dot{d}\| \leq \delta$, then, according to the input-to-state stability (ISS) criterion, the estimation error $\tilde{d}$ is uniformly ultimately bounded, as known from Ref. [32].
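The behavior of the estimator in Equation (45) can be illustrated on a single axis. The sketch below restricts the dynamics to the cross-track ($z$) axis, where $A_{21}x_1 + A_{22}x_2$ reduces to $-\omega^2 x_1$, uses a constant scalar gain with $p(x_2) = L\,x_2$, holds the pursuer control at zero, and integrates with forward Euler. All of these choices, and the function name, are assumptions made for illustration only.

```python
def simulate_observer(d_true, gain=1.0, omega=7.2722e-5, dt=0.01, steps=1000):
    """Run a one-axis disturbance observer and return the final estimate of d."""
    x1, x2 = 0.0, 0.0        # relative position and velocity on the z axis
    z = -gain * x2           # internal variable, chosen so the initial estimate is zero
    u = 0.0                  # pursuer control held at zero in this illustration
    d_hat = z + gain * x2    # d_hat = z + p(x2) with p(x2) = gain * x2
    for _ in range(steps):
        drift = -omega * omega * x1              # A21 x1 + A22 x2 on the z axis
        z_dot = -gain * (d_hat + drift + u)      # observer dynamics, Equation (45)
        x1_dot = x2
        x2_dot = drift + u + d_true              # plant dynamics, Equation (44)
        z += dt * z_dot
        x1 += dt * x1_dot
        x2 += dt * x2_dot
        d_hat = z + gain * x2
    return d_hat
```

With these choices the estimate obeys $\dot{\hat{d}} = L(d - \hat{d})$, so for a constant disturbance the error decays exponentially at a rate set by the gain; a rapidly varying disturbance leaves a residual error, which is the situation the switching logic later in this section is meant to handle.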
When the estimate is good, the pursuer applies a strategy combining an interception term with a compensation term. Thus, the pursuer’s strategy can be given as:
$$U_P = -R_P^{-1}B^T P_p X_{PE} - \hat{d} \tag{46}$$
where $P_p$ satisfies the Riccati differential equation:
$$\dot{P}_p + A^T P_p + P_p A - P_p B R_P^{-1} B^T P_p + Q = 0 \tag{47}$$
with $P_p(t_f) = S$.
Remark 1.
To derive the approach strategy above, we assume that the target does not know the pursuer's estimated value. If a rational target knows the estimated value, it may use the strategy:
$$U_E = -R_E^{-1}B^T P_E X_{PE} - \hat{d} \tag{48}$$
In this case, substituting the pursuer strategy and the target strategy into Equation (9) and using $C = -B$, the estimation terms cancel and the relative dynamics can be rewritten as:
$$\dot{X}_{PE} = AX_{PE} + B\left(-R_P^{-1}B^T P_p X_{PE} - \hat{d}\right) + C\left(-R_E^{-1}B^T P_E X_{PE} - \hat{d}\right) = AX_{PE} - BR_P^{-1}B^T P_p X_{PE} - CR_E^{-1}B^T P_E X_{PE} \tag{49}$$
For the incomplete-information game, substituting Equations (33) and (36) into Equation (9) yields an equation which is identical to Equation (49). Therefore, the case where the target knows the estimated value of the pursuer is not discussed.
If the target strategy changes slowly, the disturbance observer achieves good estimation performance. Nevertheless, in practice the target may adjust its strategy quickly to avoid being intercepted. In this case, the estimation error is large and effective estimation cannot be achieved; as a result, the game information remains unavailable to the pursuer, and the pursuer applies the norm-based strategy. From Ref. [4], we know that if the pursuer applies the norm-based strategy, the approach can be guaranteed as long as the thrust magnitude of the pursuer is larger than that of the target. Thus, the pursuer uses this strategy if the estimation error is large. To obtain the switch logic and evaluate the strategy estimation performance, an auxiliary system is defined as:
$$\dot{\hat{X}}_{PE} = A\hat{X}_{PE} + BU_P + B\hat{d} \tag{50}$$
Defining the state error $\tilde{X} = X_{PE} - \hat{X}_{PE}$ and subtracting the auxiliary system from the actual system yields:
$$\dot{\tilde{X}} = \dot{X}_{PE} - \dot{\hat{X}}_{PE} = \left(AX_{PE} + BU_P - BU_E\right) - \left(A\hat{X}_{PE} + BU_P + B\hat{d}\right) = A\tilde{X} + B\tilde{d} \tag{51}$$
Integrating Equation (51), we have:
$$\tilde{X} = \Phi(t - t_0)\tilde{X}_0 + \int_{t_0}^{t}\Phi(t - \tau)\,B\,\tilde{d}\,d\tau \tag{52}$$
Choosing the initial state of the auxiliary system to be the same as that of the actual system, we have $\tilde{X}_0 = 0$. Thus,
$$\tilde{X} = \int_{t_0}^{t}\Phi(t - \tau)\,B\,\tilde{d}\,d\tau \tag{53}$$
Therefore, the state error $\tilde{X}$ can be used in place of the estimation error $\tilde{d}$ to evaluate the estimation performance. We predesign an estimation error bound as:
$$\bar{\tilde{X}} = \varepsilon \tag{54}$$
If $\|\tilde{X}\| < \bar{\tilde{X}}$, a good estimate is obtained and the linear quadratic compensation strategy is implemented. On the other hand, if $\|\tilde{X}\| \geq \bar{\tilde{X}}$, the norm-based strategy is used. Thus, the flow set $C$ and the jump set $D$ are defined as:
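$$C = C_1 \cup C_2, \qquad D = D_1 \cup D_2 \tag{55}$$
with
$$C_1 = \left\{\tilde{X} : \|\tilde{X}\| < \bar{\tilde{X}},\ q = 1\right\}, \qquad C_2 = \left\{\tilde{X} : \|\tilde{X}\| \geq \bar{\tilde{X}},\ q = 2\right\} \tag{56}$$
and
$$D_1 = \left\{\tilde{X} : \|\tilde{X}\| \geq \bar{\tilde{X}},\ q = 1\right\}, \qquad D_2 = \left\{\tilde{X} : \|\tilde{X}\| < \bar{\tilde{X}},\ q = 2\right\} \tag{57}$$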
Remark 2.
If the error bound $\bar{\tilde{X}} = \varepsilon$ is large, the pursuer may always use the linear quadratic strategy. On the other hand, with a small error bound, the pursuer always uses the norm-based strategy.
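The switching rule defined by the flow and jump sets can be sketched as a small piece of logic. The function names below are hypothetical; `error_norm` stands for $\|\tilde{X}\|$ and `eps` for the bound $\varepsilon$. Only the logic variable $q$ is handled here, and the treatment of the remaining state components at a jump is left out of this sketch.

```python
def update_logic(q, error_norm, eps):
    """Toggle q (1 <-> 2) when the state lies in the jump set D1 or D2; otherwise keep q."""
    in_d1 = q == 1 and error_norm >= eps   # estimate degraded while using the LQ strategy
    in_d2 = q == 2 and error_norm < eps    # estimate recovered while using the norm strategy
    return 3 - q if (in_d1 or in_d2) else q


def select_strategy(q):
    """Map the logic variable to the strategy it represents."""
    return "linear quadratic compensation" if q == 1 else "norm-based"
```

For example, `update_logic(1, 0.2, 0.1)` returns 2, so the pursuer would hand over to the norm-based strategy, whereas `update_logic(1, 0.05, 0.1)` leaves $q = 1$ unchanged.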

5. Simulation Analysis

To demonstrate the performance of the hybrid strategy under different information scenarios, several simulations were carried out. We assumed that the pursuer and the target moved around the geocentric orbit (GEO). Thus, a satellite near them was chosen as the reference satellite that moved along the GEO. We knew that the orbital angular velocity was ω = 7.2722 × 10 5 rad / s . The initial positions of the pursuer and the target were [1.5; 0.5; 0] km, [0; 0; 0] km. The initial velocities they had were [0; 0; 0] km/s, [−0.05; 0; 0.01] km/s, respectively. The parameters in cost function were chosen as:
$$Q = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}, \quad S = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}, \quad R_P = 1 \times 10^{6} I_3, \quad R_E = 1.5 \times 10^{6} I_3$$
The maximal thrust acceleration of the pursuer was $\rho_P = 10$ m/s², and that of the target was $\rho_E = 8$ m/s². The gain of the disturbance estimator was chosen as $L = I_3$.
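As a quick sanity check of the simulation setup, the quoted parameters can be turned into a few derived quantities. The sketch below is illustrative only: the variable names are hypothetical, and the relative state is assumed to be defined as "pursuer minus target".

```python
import math

OMEGA = 7.2722e-5          # rad/s, orbital angular velocity quoted in the text
RHO_P, RHO_E = 10.0, 8.0   # m/s^2, maximal thrust accelerations quoted in the text

# Initial states from the text (km and km/s), pursuer first, then target.
r_p, v_p = [1.5, 0.5, 0.0], [0.0, 0.0, 0.0]
r_e, v_e = [0.0, 0.0, 0.0], [-0.05, 0.0, 0.01]

# Orbital period implied by OMEGA; it is close to 86400 s, i.e. one day.
period_s = 2.0 * math.pi / OMEGA

# Relative state, assuming the convention "pursuer minus target".
rel_pos = [p - e for p, e in zip(r_p, r_e)]
rel_vel = [p - e for p, e in zip(v_p, v_e)]
initial_range_km = math.sqrt(sum(c * c for c in rel_pos))

# Condition cited from Ref. [4] for the norm-based strategy.
pursuer_has_thrust_advantage = RHO_P > RHO_E

print(period_s, rel_pos, rel_vel, initial_range_km, pursuer_has_thrust_advantage)
```

The implied period of about one day is consistent with a reference satellite on the GEO, and the thrust condition required by the norm-based strategy holds for the chosen values.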
In the first scenario, the target applied the linear quadratic strategy. Figure 3 gives the trajectories of the pursuer and the target, and Figure 4 shows the relative distance between them, which indicates that the pursuer reached the target in 134 s. Figure 5 and Figure 6 show the three-axis thrust accelerations of the two spacecraft. The acceleration of the pursuer was larger than that of the target, and both accelerations varied slowly.
Figure 7, Figure 8 and Figure 9 depict the estimation errors and the control logic. The pursuer first used the linear quadratic compensation strategy and then switched to the norm-based strategy after 128 s. With the hybrid strategy, the approach was achieved.
In the second scenario, the target used the norm-based strategy. Simulation results are shown in Figure 10, Figure 11, Figure 12, Figure 13, Figure 14, Figure 15 and Figure 16. Unlike in the first case, the target maneuvered without regard to fuel consumption, so the switching threshold of the hybrid strategy was reached sooner. The control logic changed from 1 to 2 at 68 s: the pursuer applied the linear quadratic strategy for roughly the first 70 s and then applied the norm-based strategy. The results show that the approach was achieved in 81 s.
Finally, to demonstrate the effectiveness of the disturbance estimator, we supposed that the target employed a random strategy, which is non-optimal and irregular. This situation represents a real-world scenario of handling a dangerous out-of-control spacecraft. The strategy is given as:
$$U_E = \rho_E \frac{\mathrm{randn}(3)}{\|\mathrm{randn}(3)\|}$$
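The random target strategy can be sketched as follows, reading the expression as a Gaussian draw normalized to unit length and scaled by $\rho_E$. This is an assumption-laden sketch: the function name `random_target_control` is hypothetical, the same draw is assumed to appear in the numerator and the denominator (so the magnitude is always $\rho_E$ and only the direction is random), and how often the draw is refreshed during the simulation is not specified here.

```python
import math
import random


def random_target_control(rho_e, rng):
    """Draw a control vector with magnitude rho_e and a random direction."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]  # plays the role of randn(3)
        norm = math.sqrt(sum(c * c for c in g))
        if norm > 0.0:  # guard against the (practically impossible) zero draw
            return [rho_e * c / norm for c in g]


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
u_e = random_target_control(8.0, rng)  # rho_E = 8 m/s^2 from the text
print(u_e, math.sqrt(sum(c * c for c in u_e)))
```

Under this reading the target always thrusts at its maximal acceleration, so the maneuver is irregular in direction rather than in magnitude.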
The spacecraft trajectories under the hybrid game strategy are shown in Figure 17, which shows that the target maneuvered irregularly throughout the process. Figure 18 shows the relative distance between the pursuer and the target; the target spacecraft was approached after 120 s. As can be seen from Figure 19 and Figure 20, without the hybrid strategy the approach could not be realized under incomplete information, so estimating the target strategy was essential. As can be seen from Figure 21, because the maneuvering strategy of the out-of-control target was random and disordered, the estimation error oscillated constantly as the estimator tried to track the true target control. In this scenario, the pursuer used the disturbance observer to estimate the target control information, then established the hybrid game strategy and completed the change of control logic after 117 s, as shown in Figure 22 and Figure 23.
When the pursuer estimated the target's behavior, the bilateral game problem turned into a unilateral optimization problem, so the pursuer could approach more effectively by adopting the corresponding strategy. The proposed hybrid game strategy could intercept a dangerous out-of-control spacecraft in finite time, which shows the effectiveness of information estimation in the incomplete-information game. Note, however, that the disturbance estimator performed well when the target's maneuver varied slowly. If the target maneuvered randomly, that is, at a fast-changing rate, the estimation performance degraded, which also affected the pursuit performance.
The fuel consumption comparison in Table 1 shows that the designed hybrid game strategy could achieve a rapid approach while saving fuel in different scenarios. Compared with the traditional unilateral optimal control strategy, the hybrid game strategy has advantages in the complex pursuit–evasion situation in which the spacecraft are able to perceive information.
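As a worked reading of Table 1 (Case 1 only, and assuming that both entries use the same fuel measure, whose units are not stated here), the relative fuel saving of the hybrid strategy over the norm-based strategy is:

$$\frac{0.0037 - 0.0015}{0.0037} \approx 0.59$$

that is, roughly 59% less fuel in the first scenario.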

6. Conclusions

This paper investigated hybrid game strategies for the pursuit of an out-of-control spacecraft under incomplete information. First, the impact of the hybrid game strategy on both sides was analyzed, and the complete-information intercept game switching strategy was derived from differential game theory. Furthermore, the asymmetric-information situation, in which the target information was not available to the pursuer, was discussed. The incomplete-information game strategy and the information-estimation strategy were derived, respectively. A disturbance estimator was proposed to estimate the unknown target maneuver information and was embedded in the hybrid game strategy, which considers both interception effectiveness and fuel consumption. Finally, simulations were performed to assess the effectiveness and pursuit performance.
Compared with the traditional unilateral optimal control strategy, this work considered the complex situation in which the out-of-control spacecraft is able to sense space information. The simulations verified the effectiveness of the hybrid game strategy under different information scenarios. The results showed that when the out-of-control target's maneuver was unknown to the pursuer, the approach became more difficult. With the proposed disturbance estimator, the target information could be estimated effectively and the control logic could be switched quickly, reducing fuel consumption while approaching the target rapidly, which indicates the effectiveness of the proposed scheme. Therefore, the proposed hybrid game strategy can be applied in practical pursuit scenarios for dangerous spacecraft disposal missions under incomplete information. In the future, the authors will focus on stochastic-game problems for complex dynamics of the out-of-control spacecraft.

Author Contributions

Conceptualization, X.T. and D.Y.; methodology, X.T.; software, X.T.; validation, D.Y., S.L. and K.-S.L.; formal analysis, S.L.; investigation, X.T.; resources, D.Y.; data curation, X.T.; writing—original draft preparation, X.T.; writing—review and editing, S.L.; visualization, X.T.; supervision, Z.S.; project administration, D.Y.; funding acquisition, D.Y. and Z.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (NNSF) under grants No. 62073102 and No. 51875119, and by the National Key R&D Program of China (2021YFC2202900).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tafazoli, M. A study of on-orbit spacecraft failures. Acta Astronaut. 2009, 64, 195–205.
  2. Shen, H.X.; Casalino, L. Revisit of the three-dimensional orbital pursuit-evasion game. J. Guid. Control Dyn. 2018, 41, 1823–1831.
  3. Isaacs, R. Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization; Dover Publications Inc.: New York, NY, USA, 1999; pp. 200–231.
  4. Ye, D.; Shi, M.; Sun, Z. Satellite proximate interception vector guidance based on differential games. Chin. J. Aeronaut. 2018, 31, 1352–1361.
  5. Stupik, J.; Pontani, M.; Conway, B. Optimal pursuit/evasion spacecraft trajectories in the hill reference frame. In Proceedings of the AIAA/AAS Astrodynamics Specialist Conference, Minneapolis, MN, USA, 13–16 August 2012; p. 4882.
  6. Liu, Y.; Li, R.; Wang, S. Orbital three-player differential game using semi-direct collocation with nonlinear programming. In Proceedings of the 2016 2nd International Conference on Control Science and Systems Engineering (ICCSSE), Singapore, 27–29 July 2016; pp. 217–222.
  7. Pontani, M.; Conway, B.A. Numerical solution of the three-dimensional orbital pursuit-evasion game. J. Guid. Control Dyn. 2009, 32, 474–487.
  8. Pontani, M.; Conway, B.A. Optimal Interception of Evasive Missile Warheads: Numerical Solution of the Differential Game. J. Guid. Control Dyn. 2008, 31, 1111–1122.
  9. Ye, D.; Shi, M.; Sun, Z. Satellite proximate pursuit-evasion game with different thrust configurations. Aerosp. Sci. Technol. 2020, 99, 105715.
  10. Gutman, S.; Rubinsky, S. 3D-nonlinear vector guidance and exo-atmospheric interception. IEEE Trans. Aerosp. Electron. Syst. 2015, 51, 3014–3022.
  11. Gutman, S.; Rubinsky, S. Exo-atmospheric mid-course guidance. In Proceedings of the AIAA Guidance, Navigation, and Control Conference, Kissimmee, FL, USA, 5–9 January 2015; p. 0088.
  12. Gutman, S.; Rubinsky, S. Exoatmospheric thrust vector interception via time-to-go analysis. J. Guid. Control Dyn. 2016, 39, 86–97.
  13. Shima, T. Optimal Cooperative Pursuit and Evasion Strategies Against a Homing Missile. J. Guid. Control Dyn. 2011, 34, 414–425.
  14. Perelman, A.; Shima, T.; Rusnak, I. Cooperative Differential Games Strategies for Active Aircraft Protection from a Homing Missile. J. Guid. Control Dyn. 2011, 34, 761–773.
  15. Chai, Y.; Luo, J.; Han, N.; Xie, J. Linear quadratic differential game approach for attitude takeover control of failed spacecraft. Acta Astronaut. 2020, 175, 142–154.
  16. Prokopov, O.; Shima, T. Linear quadratic optimal cooperative strategies for active aircraft protection. In Proceedings of the AIAA Guidance, Navigation, and Control Conference, Minneapolis, MN, USA, 13–16 August 2012; pp. 753–764.
  17. Cavalieri, K.A. Incomplete Information Pursuit-Evasion Games with Application to Spacecraft Rendezvous and Missile Defense. Ph.D. Thesis, Texas A&M University, College Station, TX, USA, 2014.
  18. Cavalieri, K.A.; Satak, N.; Hurtado, J.E. Incomplete information pursuit-evasion games with uncertain relative dynamics. In Proceedings of the AIAA Guidance, Navigation, and Control Conference, National Harbor, MD, USA, 13–17 January 2014; p. 0971.
  19. Satak, N. Behavior Learning in Differential Games and Reorientation Maneuvers; Texas A&M University: College Station, TX, USA, 2013.
  20. Satak, N.; Hurtado, J.E. A framework for behavior learning in differential games. In Proceedings of the 52nd Aerospace Science Meeting, National Harbor, MD, USA, 13–17 January 2014; p. 1325.
  21. Shaferman, V.; Shima, T. Cooperative multiple-model adaptive guidance for an aircraft defending missile. J. Guid. Control Dyn. 2010, 33, 1801–1813.
  22. Shima, T.; Oshman, Y.; Shinar, J. Efficient multiple model adaptive estimation in ballistic missile interception scenarios. J. Guid. Control Dyn. 2002, 25, 667–675.
  23. Wang, Z.; Gong, B.; Yuan, Y.; Ding, X. Incomplete Information Pursuit-Evasion Game Control for a Space Non-Cooperative Target. Aerospace 2021, 8, 211.
  24. Hafer, W.T.; Reed, H.L. Orbital pursuit-evasion hybrid spacecraft controllers. In Proceedings of the AIAA Guidance, Navigation, and Control Conference, Kissimmee, FL, USA, 5–9 January 2015.
  25. Turetsky, V.; Shima, T. Target evasion from a missile performing multiple switches in guidance law. J. Guid. Control Dyn. 2016, 39, 2364–2373.
  26. Shinar, J.; Glizer, V.Y.; Turetsky, V. A pursuit-evasion game with hybrid pursuer dynamics. Eur. J. Control 2009, 6, 665–684.
  27. Shinar, J.; Glizer, V.Y.; Turetsky, V. Robust pursuit of a hybrid evader. Appl. Math. Comput. 2010, 217, 1231–1245.
  28. Wibben, D.R.; Furfaro, R. Terminal guidance for lunar landing and retargeting using a hybrid control strategy. J. Guid. Control Dyn. 2016, 39, 1168–1172.
  29. Xiao, N.; Xiao, Y.; Ye, D. Adaptive differential game for modular reconfigurable satellites based on neural network observer. Aerosp. Sci. Technol. 2022, 128, 107759.
  30. Lin, W. Differential Games for Multi-Agent Systems under Distributed Information. Ph.D. Thesis, University of Central Florida, Orlando, FL, USA, 2013.
  31. Tartaglia, V.; Innocenti, M. Game theoretic strategies for spacecraft rendezvous and motion synchronization. In Proceedings of the 2016 AIAA Guidance, Navigation, and Control Conference, San Diego, CA, USA, 4–8 January 2016; pp. 1–13.
  32. Yang, J.; Li, S.; Chen, W.H. Nonlinear disturbance observer-based control for multi-input multi-output nonlinear systems subject to mismatching condition. Int. J. Control 2012, 85, 1071–1082.
Figure 1. Pursuer and reference satellite.
Figure 2. Closed-loop system hybrid strategy scheme.
Figure 3. Spacecraft trajectories of the hybrid game strategy.
Figure 4. Relative distance between the pursuer and the target.
Figure 5. Control acceleration of the pursuer.
Figure 6. Control acceleration of the target.
Figure 7. Error estimation of $U_E$.
Figure 8. Error estimation of $\tilde{X}$.
Figure 9. Control logic.
Figure 10. Spacecraft trajectories of the hybrid game strategy.
Figure 11. Relative distance between the pursuer and the target.
Figure 12. Control acceleration of the pursuer.
Figure 13. Control acceleration of the target.
Figure 14. Error estimation of $U_E$.
Figure 15. Error estimation of $\tilde{X}$.
Figure 16. Control logic.
Figure 17. Spacecraft trajectories of the hybrid game strategy.
Figure 18. Relative distance between the pursuer and the target.
Figure 19. Control acceleration of the pursuer.
Figure 20. Control acceleration of the target.
Figure 21. Error estimation of $U_E$.
Figure 22. Error estimation of $\tilde{X}$.
Figure 23. Control logic.
Table 1. Comparison of Fuel Consumption.
Case | Norm-Based Strategy | Hybrid Strategy
Case 1 | 0.0037 | 0.0015
Case 2 | | 0.0035
Case 3 | | 0.0007
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Tang, X.; Ye, D.; Luo, S.; Low, K.-S.; Sun, Z. A Hybrid Game Strategy for the Pursuit of Out-of-Control Spacecraft under Incomplete-Information. Aerospace 2022, 9, 455. https://doi.org/10.3390/aerospace9080455
