Pursuer’s Control Strategy for Orbital Pursuit-Evasion-Defense Game with Continuous Low Thrust Propulsion

Zhou, Junfeng; Zhao, Lin; Cheng, Jianhua; Wang, Shuo; Wang, Yipeng

doi:10.3390/app9153190

by Junfeng Zhou, Lin Zhao, Jianhua Cheng *, Shuo Wang and Yipeng Wang

College of Automation, Harbin Engineering University, Harbin 150001, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2019, 9(15), 3190; https://doi.org/10.3390/app9153190

Submission received: 25 June 2019 / Revised: 30 July 2019 / Accepted: 1 August 2019 / Published: 5 August 2019

(This article belongs to the Special Issue Control and Soft Computing)

Abstract

This paper studies the orbital pursuit-evasion-defense problem with continuous low-thrust propulsion. A control strategy for the pursuer is proposed based on fuzzy comprehensive evaluation and differential game theory. First, the system is described by Lawden’s equations and simplified by introducing relative state variables and zero effort miss (ZEM) variables. Then, the objective function of the pursuer is designed based on fuzzy comprehensive evaluation, and the analytical necessary conditions for the optimal control strategy are presented. Finally, a hybrid method combining a multi-objective genetic algorithm and the multiple shooting method is proposed to solve the orbital pursuit-evasion-defense problem. The simulation results show that the proposed control strategy can handle the orbital pursuit-evasion-defense problem effectively.

Keywords: differential game; fuzzy comprehensive evaluation; continuous low thrust; zero effort miss variables

1. Introduction

Recently, the orbital pursuit-evasion problem has attracted increasing attention in space research [1,2,3,4]. This problem can be formulated as a differential game [5], which aims to obtain the optimal control strategy of the pursuer and/or the evader in the worst-case scenario, so as to realize the interception of the evader or the evasion from the pursuer.

Wong [6] is regarded as the first to study the orbital pursuit-evasion problem; he solved the problem of intercepting a maneuverable satellite under the assumptions of planar motion and a constant gravitational field. Since then, many works have focused on the orbital pursuit-evasion problem. In reference [7], a method based on periodically updating the solution of the two-point boundary value problem (TPBVP) was proposed to generate near-optimal feedback controls for the orbital pursuit-evasion problem. However, this method is time-consuming and difficult to apply in real time. To overcome these drawbacks, Anderson [8] used a modified first-order differential dynamic programming algorithm to generate near-optimal feedback controls. References [9,10,11] found saddle-point equilibrium solutions of the three-dimensional orbital pursuit-evasion game using three different hybrid numerical methods. Hafer et al. [12] applied the sensitivity method to the orbital pursuit-evasion problem, which greatly reduces the computational burden of solving this problem numerically. Widhalm studied the problem of avoiding an interception and proposed two optimal evasive-maneuver strategies, with impulsive thrust [13] and with continuous low thrust [1], respectively. Prussing et al. [14] derived minimum-fuel impulsive strategies for return-on-state maneuvers by applying primer vector theory. Merz [15] developed guidance laws for the noisy satellite pursuit-evasion game. Woodbury et al. [16] studied an incomplete, imperfect information game and presented adaptive strategies for the pursuer and the evader. Ghosh et al. [17] developed a near-optimal feedback controller for two-player pursuit-evasion games by using a new extremal-field approach. All of the above works were carried out in the two-player pursuit-evasion game framework. In this framework, however, the evader can only avoid threats by maneuvering itself. This is called self-defense; it disturbs the original mission of the evader and requires a large additional amount of fuel.

To overcome this disadvantage, a defender is introduced in [18]. The role of the defender is to intercept the pursuer, so that the evader can carry out its original mission undisturbed. A hybrid method combining particle swarm optimization with a Newton-Interpolation algorithm was proposed to solve the orbital defense problem. However, once the defender is introduced, the pursuer must avoid interception by the defender while capturing the evader [19], which makes the design of the pursuer’s control strategy more complicated. To develop control strategies for pursuers, Liu et al. [19] proposed a distributed online mission planning algorithm for pursuers to access targets. However, these works on the orbital pursuit-evasion-defense game adopted impulsive thrust, which has the drawback that the interception fails when the target can perform evasive maneuvers [4].

Compared with impulsive thrust, continuous low thrust allows players to perform multiple, continuous maneuvers, which meets the requirement for frequent orbital transfers in the game. When continuous low thrust is applied, the assumption about the players’ maneuverability is removed, which is closer to the actual situation of the orbital pursuit-evasion-defense game. Therefore, in this paper, the orbital pursuit-evasion-defense game model is constructed based on continuous low thrust. Unlike the model based on impulsive thrust, the model based on continuous low thrust cannot adopt Keplerian dynamics [20]; its dynamic equations are based on non-Keplerian motion. Two issues need to be solved in this model: (i) the system has a high dimension, which means that solving the problem suffers from the curse of dimensionality [21]; (ii) two objectives, intercepting the evader and evading the defender, must be considered by the pursuer, and the corresponding weights must be determined according to the current state. For the first issue, as the zero effort miss (ZEM) can be used to simplify a linear system [4], the dimension of the system is reduced by introducing the relative state variables and the ZEM variables [22]. For the second issue, the pursuer’s objective function is designed based on the fuzzy comprehensive evaluation, and a pursuer’s control strategy suited to the orbital pursuit-evasion-defense game is proposed. Based on the above model, the orbital pursuit-evasion-defense game is transformed into a TPBVP by applying differential game theory. A hybrid method combining the multi-objective genetic algorithm and the multiple shooting method is presented to solve the TPBVP.

2. Mathematical Model of Orbital Pursuit-Evasion-Defense Game

2.1. Relative Orbital Dynamics

The orbital pursuit-evasion-defense game occurs in the final phase of the confrontation, when the spacecraft are close enough to identify each other with onboard electronic devices [4]. In this situation, the motion between the spacecraft can be expressed as relative motion [23]. Lawden’s equations [24] and the Clohessy–Wiltshire (C–W) equations [25] are two linearized models used to describe the relative motion between spacecraft. Unlike the C–W equations, which apply only to circular orbits, Lawden’s equations can describe the relative motion of spacecraft in elliptical orbits. As in [26], the dynamics of the participating spacecraft are described in the local-vertical local-horizontal (LVLH) frame centered at a virtual spacecraft, and Lawden’s equations are adopted as the relative dynamic equations of the spacecraft.

As shown in Figure 1, $P$, $D$, and $E$ represent the pursuer, the defender, and the evader, respectively. We establish a fictitious spacecraft $O$ on an elliptical orbit close to the players. The LVLH coordinate system is centered at the point $O$. $O X$ points outward along the radius of the Earth, $O Y$ is perpendicular to $O X$ in the reference orbital plane and points in the flight direction, and $O Z$ is perpendicular to the orbital plane and forms a right-handed frame with $O X$ and $O Y$.

Lawden’s equations can be expressed as:

\begin{cases} {\ddot{x}}_{i} = ω^{2} x_{i} + 2 ω {\dot{y}}_{i} + \dot{ω} y_{i} + 2 \frac{μ x_{i}}{r_{t}^{3}} + T_{i} u_{x i} \\ {\ddot{y}}_{i} = - 2 ω {\dot{x}}_{i} - \dot{ω} x_{i} + ω^{2} y_{i} - \frac{μ y_{i}}{r_{t}^{3}} + T_{i} u_{y i} \\ {\ddot{z}}_{i} = - \frac{μ z_{i}}{r_{t}^{3}} + T_{i} u_{z i} \end{cases}, \quad i = P, D, E

(1)

where $μ$ is the Earth’s gravitational constant, $r_{t}$ is the distance between the origin $O$ and the center of the Earth, and $ω$ and $\dot{ω}$ represent the orbital angular velocity and angular acceleration of the origin $O$, respectively. $x_{i}$, $y_{i}$, and $z_{i}$ represent the position components of the players in the relative coordinate system. $T_{i}$ represents the maximum thrust. $u_{x i}$, $u_{y i}$, and $u_{z i}$ represent the control variables along the $x$, $y$, and $z$ axes, respectively, each with a magnitude ranging from 0 to 1.

The state variables (i.e., position and velocity) of the players are represented by $X_{i}$ as follows:

X_{i} = {[x_{i}, y_{i}, z_{i}, {\dot{x}}_{i}, {\dot{y}}_{i}, {\dot{z}}_{i}]}^{T}, i = P, D, E

(2)

Thus, the dynamics equations can be written as:

{\dot{X}}_{i} = A_{i} X_{i} + T_{i} U_{i}

(3)

where

A_{i} (t) = [\begin{matrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ ω^{2} + \frac{2 μ}{r_{t}^{3}} & \dot{ω} & 0 & 0 & 2 ω & 0 \\ - \dot{ω} & ω^{2} - \frac{μ}{r_{t}^{3}} & 0 & - 2 ω & 0 & 0 \\ 0 & 0 & - \frac{μ}{r_{t}^{3}} & 0 & 0 & 0 \end{matrix}], i = P, D, E

(4)

$U_{i}$ $(i = P, D, E)$ is the control variable, which can be represented by

U_{i} = {[0, 0, 0, u_{x i}, u_{y i}, u_{z i}]}^{T}, ‖ U_{i} ‖ \leq 1

(5)
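
As an illustration, the system matrix of Equation (4) and the right-hand side of Equation (3) can be assembled as in the following minimal Python sketch. The function names `lawden_matrix` and `state_derivative`, the nested-list representation, and the numerical value of $μ$ in the example are assumptions made for this sketch, not part of the model description above.

```python
import math


def lawden_matrix(omega, omega_dot, mu, r_t):
    """Return the 6x6 matrix A(t) of Equation (4) as a nested list.

    omega, omega_dot: angular velocity and acceleration of the reference point O
    mu: Earth's gravitational constant; r_t: distance from O to the Earth's center
    """
    k = mu / r_t**3
    return [
        [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
        [omega**2 + 2.0 * k, omega_dot, 0.0, 0.0, 2.0 * omega, 0.0],
        [-omega_dot, omega**2 - k, 0.0, -2.0 * omega, 0.0, 0.0],
        [0.0, 0.0, -k, 0.0, 0.0, 0.0],
    ]


def state_derivative(A, X, T, U):
    """Evaluate Equation (3), X_dot = A X + T U, for one player."""
    return [sum(a * x for a, x in zip(row, X)) + T * u for row, u in zip(A, U)]


# Illustrative values (assumed): circular reference orbit at 500 km altitude
MU = 398600.4418                 # km^3/s^2, a commonly quoted value for the Earth
R_T = 6371.393 + 500.0           # km
OMEGA = math.sqrt(MU / R_T**3)   # constant angular velocity, so omega_dot = 0
A_circ = lawden_matrix(OMEGA, 0.0, MU, R_T)
```

For a circular reference orbit ($\dot{ω} = 0$, $ω^{2} = μ / r_{t}^{3}$), the entry in row 4, column 1 reduces to $3 ω^{2}$ and the entry in row 5, column 2 vanishes, so the matrix reduces to that of the C–W equations.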

2.2. Dimension-Reduction

According to Equation (3), each player has 6 state variables, so the total number of state variables in the game is 18. In the numerical solution process, the co-state variables associated with the state variables are introduced, and the problem becomes a 36-dimensional TPBVP. Solving such a high-dimensional TPBVP is difficult and computationally demanding [27]. To improve computational efficiency, the dimension of the system needs to be reduced. This is done in two steps. First, the relative state variables between the spacecraft replace the system states. Then, the ZEM variables are applied to further reduce the number of variables and equations in the system.

In the first step, the game is divided into two parts: one is the game between the pursuer $P$ and the evader $E$; the other is the game between the defender $D$ and the pursuer $P$. The relative state variables of the two parts, $X_{PE}$ and $X_{DP}$, are given by

\begin{cases} X_{PE} = X_{P} - X_{E} \\ X_{DP} = X_{D} - X_{P} \end{cases}

(6)

Substituting Equation (6) into Equation (3), the state equations are converted to:

\begin{cases} {\dot{X}}_{PE} = A X_{PE} + T_{P} U_{P} - T_{E} U_{E} \\ {\dot{X}}_{DP} = A X_{DP} + T_{D} U_{D} - T_{P} U_{P} \end{cases}

(7)

where $A = A_{P} = A_{E} = A_{D}$.

In the second step, according to linear system theory, the zero-input state transition matrix $Φ (t_{f}, t)$ of the state equation is defined by:

\begin{cases} \dot{Φ} (t_{f}, t) = - Φ (t_{f}, t) A \\ Φ (t_{f}, t_{f}) = I_{6} \end{cases}

(8)

where $t_{f}$ is the terminal time and $I_{6}$ is the $6 \times 6$ identity matrix.
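
One possible way to obtain $Φ (t_{f}, t)$ numerically is to integrate Equation (8) backward in time from the terminal condition. The sketch below does this with a classical fourth-order Runge–Kutta scheme in plain Python. The function names, the fixed step count, and the double-integrator matrix used in the example are assumptions chosen for illustration; they are not Lawden's dynamics and not the authors' implementation.

```python
def mat_mul(P, Q):
    """Multiply two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(A_of_t, t, t_f, steps=200):
    """Integrate Phi_dot = -Phi A backward from Phi(t_f, t_f) = I to time t (RK4)."""
    n = len(A_of_t(t_f))
    phi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = (t - t_f) / steps  # negative step: march from t_f back to t

    def rhs(tau, P):
        return [[-v for v in row] for row in mat_mul(P, A_of_t(tau))]

    def shifted(P, K, c):
        return [[p + c * k for p, k in zip(pr, kr)] for pr, kr in zip(P, K)]

    tau = t_f
    for _ in range(steps):
        k1 = rhs(tau, phi)
        k2 = rhs(tau + h / 2, shifted(phi, k1, h / 2))
        k3 = rhs(tau + h / 2, shifted(phi, k2, h / 2))
        k4 = rhs(tau + h, shifted(phi, k3, h))
        phi = [[p + h / 6 * (a + 2 * b + 2 * c + d)
                for p, a, b, c, d in zip(pr, r1, r2, r3, r4)]
               for pr, r1, r2, r3, r4 in zip(phi, k1, k2, k3, k4)]
        tau += h
    return phi


# Example with an assumed double-integrator A (position' = velocity, no gravity terms)
A_DI = [[1.0 if j == i + 3 else 0.0 for j in range(6)] for i in range(6)]
PHI_DI = transition_matrix(lambda tau: A_DI, 0.0, 10.0)
```

For this toy matrix the exact result is $Φ (t_{f}, t) = I_{6} + A (t_{f} - t)$, which gives a simple check of the integrator before it is used with the time-varying matrix of Equation (4).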

Although two factors, the relative position and the relative velocity, are involved in the game, only the relative position needs to be considered at the end of the game. The ZEM is the miss distance that results if neither player applies any control from the current moment to the end of the game. Thus, the ZEM variables are introduced to reduce the dimension of the system and are defined as:

\begin{cases} Z_{PE} (t) = D Φ (t_{f}, t) X_{PE} \\ Z_{DP} (t) = D Φ (t_{f}, t) X_{DP} \end{cases}

(9)

where $D = [I_{3 \times 3}, 0_{3 \times 3}]$.

Substituting Equation (9) into Equation (7), the state equations are reduced to:

\begin{cases} {\dot{Z}}_{PE} = D Φ T_{P} U_{P} - D Φ T_{E} U_{E} \\ {\dot{Z}}_{DP} = D Φ T_{D} U_{D} - D Φ T_{P} U_{P} \end{cases}

(10)
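
The step from Equation (9) to Equation (10) can be checked by differentiating $Z_{PE}$ and substituting Equations (7) and (8); the terms containing $A X_{PE}$ cancel. The same argument applies to $Z_{DP}$. A worked version of this short derivation, written here only as a reading aid:

```latex
\begin{aligned}
{\dot{Z}}_{PE} &= D \dot{Φ} X_{PE} + D Φ {\dot{X}}_{PE} \\
&= - D Φ A X_{PE} + D Φ (A X_{PE} + T_{P} U_{P} - T_{E} U_{E}) \\
&= D Φ T_{P} U_{P} - D Φ T_{E} U_{E}
\end{aligned}
```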

2.3. Design of Objective Function Based on Fuzzy Comprehensive Evaluation

In the orbital pursuit-evasion-defense game, the pursuer must survive the defender’s interception before it can successfully reach the evader. Therefore, both the pursuer-evader game and the defender-pursuer game must be considered and weighed in the objective function of the pursuer. As noted by Liu [19], the fuzzy comprehensive evaluation is an effective way to quantify factors that are difficult to evaluate. Thus, it is used to obtain the weights corresponding to the two games. The detailed design is as follows.

By taking the terminal miss distance as the cost, the objective functions of the three players can be defined as:

\begin{cases} J_{E} = - ‖ Z_{PE} (t_{f}) ‖ \\ J_{D} = ‖ Z_{DP} (t_{f}) ‖ \\ J_{P} = k_{1} ‖ Z_{PE} (t_{f}) ‖ - k_{2} ‖ Z_{DP} (t_{f}) ‖ \end{cases}

(11)

where the parameters $k_{i}$, $i = 1, 2$, are the weight factors, which satisfy $k_{i} \geq 0$. $k_{1} > k_{2}$ indicates that the pursuer prefers to reduce the terminal miss distance of the pursuer-evader game, while $k_{1} < k_{2}$ indicates that the pursuer prefers to increase the terminal miss distance of the defender-pursuer game. The value of $k_{i}$ is divided into 11 scales, which are shown in Table 1.

According to the analysis above, two factors need to be evaluated: one is the urgency of intercepting the evader at the current moment, denoted by $u_{1}$; the other is the urgency of evading the defender at the current moment, denoted by $u_{2}$. The effect of the factor $u_{1}$ increases as the ZEM distance of the pursuer-evader game decreases, and the effect of the factor $u_{2}$ increases as the ZEM distance of the defender-pursuer game decreases. According to this rule, $Z_{PE} (t)$ and $Z_{DP} (t)$ are used to construct the weights of the two factors, which are given by:

\begin{cases} a_{1} = 1 - {(\frac{‖ Z_{PE} (t) ‖}{‖ Z_{PE} (t) ‖ + ‖ Z_{DP} (t) ‖})}^{3} \\ a_{2} = 1 - a_{1} \end{cases}

(12)

where $a_{1}$ and $a_{2}$ represent the weights of the factor $u_{1}$ and the factor $u_{2}$, respectively. The weight vector is then expressed as $A = [a_{1}, a_{2}]$.

To establish the relationship between the weighting factors and the evaluation scales, the membership degree of each factor is calculated by the following non-linear membership functions:

\begin{cases} u_{1} (x) = {(k (x - 1))}^{3} \\ u_{2} (x) = 1 - {(k (x - 1))}^{3} \end{cases}

(13)

where $k = 0.1$ and $x = 1, \dots, 11$ are the corresponding evaluation scales.

Let $r_{i j} = u_{i} (j)$, where $i = 1, 2$ and $j = 1, \dots, 11$. The fuzzy evaluation matrix can then be obtained as:

R = {[r_{i j}]}_{2 \times 11}

(14)

The fuzzy comprehensive evaluation result vector is generated by the fuzzy synthetic operation of the weight vector and the fuzzy evaluation matrix. The fuzzy synthetic formula is defined as follows:

B = A \circ R = (b_{1}, b_{2}, \dots, b_{11})

(15)

where “$\circ$” is the fuzzy synthetic operator. In this paper, the weighted average fuzzy operator is adopted, which makes full use of the information in $R$. It is expressed as:

b_{j} = \min \{ 1, \sum_{i = 1}^{2} a_{i} \cdot r_{i j} \}, \quad j = 1, 2, \dots, 11

(16)

The comprehensive evaluation value is obtained by analyzing the fuzzy comprehensive evaluation result vector. The analysis is done in the following steps. First, the result vector is normalized:

b_{j}^{'} = \frac{b_{j}}{\sum_{j = 1}^{11} b_{j}}

(17)

Then, the normalized vector $B^{'} = (b_{1}^{'}, b_{2}^{'}, \dots, b_{11}^{'})$ is processed using the weighted average principle. The evaluation value is obtained as follows:

b = \frac{\sum_{j = 1}^{11} {(b_{j}^{'})}^{k} \cdot j}{\sum_{j = 1}^{11} {(b_{j}^{'})}^{k}}

(18)

where $k = 10$ is a fixed coefficient (distinct from the $k$ in Equation (13)) that controls the influence of the larger $b_{j}^{'}$ $(j = 1, 2, \dots, 11)$. As its value increases, the largest term among the $b_{j}^{'}$ becomes more prominent.

Finally, the values of $k_{1}$ and $k_{2}$ are obtained by finding the evaluation scale corresponding to the evaluation value $b$.
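
The chain of Equations (12)–(18) can be traced in a few lines of code. The following Python sketch computes the evaluation value $b$ from the two ZEM distances. The function name and argument names are hypothetical, and the two coefficients that the text both calls $k$ are renamed `slope` (Equation (13)) and `power` (Equation (18)) to keep them apart. The final lookup of $k_{1}$ and $k_{2}$ from Table 1 is not reproduced here.

```python
def evaluation_value(zem_pe, zem_dp, slope=0.1, power=10, scales=11):
    """Return the comprehensive evaluation value b of Equation (18).

    zem_pe, zem_dp: the norms ||Z_PE(t)|| and ||Z_DP(t)|| (not both zero)
    slope: the k = 0.1 of Equation (13); power: the k = 10 of Equation (18)
    """
    total = zem_pe + zem_dp
    if total <= 0.0:
        raise ValueError("at least one ZEM distance must be positive")

    # Equation (12): weights of the two urgency factors
    a1 = 1.0 - (zem_pe / total) ** 3
    a2 = 1.0 - a1

    # Equations (13)-(14): membership degrees r_ij = u_i(j), j = 1..scales
    r1 = [(slope * (j - 1)) ** 3 for j in range(1, scales + 1)]
    r2 = [1.0 - v for v in r1]

    # Equation (16): weighted-average operator, capped at 1
    b = [min(1.0, a1 * u + a2 * v) for u, v in zip(r1, r2)]

    # Equation (17): normalization of the result vector
    s = sum(b)
    b_norm = [v / s for v in b]

    # Equation (18): weighted average of the scale indices
    w = [v ** power for v in b_norm]
    return sum(wj * j for j, wj in enumerate(w, start=1)) / sum(w)


# A small pursuer-evader ZEM pushes b toward the top of the 11-point scale,
# a small defender-pursuer ZEM pushes it toward the bottom.
b_near_evader = evaluation_value(0.1, 5.0)
b_near_defender = evaluation_value(5.0, 0.1)
```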

3. Solution Method for Orbital Pursuit-Evasion-Defense Game

3.1. Necessary Conditions for Optimal Strategies

The orbital pursuit-evasion-defense model given in the second section can be formulated as a non-cooperative N-person differential game. Necessary conditions for optimal strategies in this game are provided by Sarma [28] and applied to the system composed of (7) and (8) to obtain the form of optimal strategies.

The Hamiltonian function is introduced as follows:

\begin{cases} H_{E} = λ_{E}^{T} {\dot{Z}}_{PE} = λ_{E}^{T} (D Φ T_{P} U_{P} - D Φ T_{E} U_{E}) \\ H_{D} = λ_{D}^{T} {\dot{Z}}_{DP} = λ_{D}^{T} (D Φ T_{D} U_{D} - D Φ T_{P} U_{P}) \\ H_{P} = λ_{PE}^{T} {\dot{Z}}_{PE} + λ_{DP}^{T} {\dot{Z}}_{DP} = λ_{PE}^{T} (D Φ T_{P} U_{P} - D Φ T_{E} U_{E}) + λ_{DP}^{T} (D Φ T_{D} U_{D} - D Φ T_{P} U_{P}) \end{cases}

(19)

where $λ_{i}$ $(i = E, D, PE, DP)$ are the co-state variables of the system.

According to the necessary conditions, the co-state equations are expressed as follows:

\begin{cases} {\dot{λ}}_{PE} = - {(\frac{\partial H_{P}}{\partial Z_{PE}})}^{T} = 0 \\ {\dot{λ}}_{DP} = - {(\frac{\partial H_{P}}{\partial Z_{DP}})}^{T} = 0 \\ {\dot{λ}}_{E} = - {(\frac{\partial H_{E}}{\partial Z_{PE}})}^{T} = 0 \\ {\dot{λ}}_{D} = - {(\frac{\partial H_{D}}{\partial Z_{DP}})}^{T} = 0 \end{cases}

(20)

and the transversality conditions are written as follows:

\begin{cases} λ_{PE} (t_{f}) = \frac{\partial J_{P}}{\partial Z_{PE} (t_{f})} = k_{1} \frac{Z_{PE} (t_{f})}{‖ Z_{PE} (t_{f}) ‖} \\ λ_{DP} (t_{f}) = \frac{\partial J_{P}}{\partial Z_{DP} (t_{f})} = - k_{2} \frac{Z_{DP} (t_{f})}{‖ Z_{DP} (t_{f}) ‖} \\ λ_{E} (t_{f}) = \frac{\partial J_{E}}{\partial Z_{PE} (t_{f})} = - \frac{Z_{PE} (t_{f})}{‖ Z_{PE} (t_{f}) ‖} \\ λ_{D} (t_{f}) = \frac{\partial J_{D}}{\partial Z_{DP} (t_{f})} = \frac{Z_{DP} (t_{f})}{‖ Z_{DP} (t_{f}) ‖} \end{cases}

(21)

From Equations (20) and (21), we can find the following relationship:

\begin{cases} λ_{PE} (t) = - k_{1} λ_{E} (t) \\ λ_{DP} (t) = - k_{2} λ_{D} (t) \end{cases}

(22)

In addition, the optimal control strategies need to satisfy:

\begin{cases} u_{D}^{*} = \underset{‖ u_{D} ‖ \leq 1}{\arg \min} H_{D} \\ u_{E}^{*} = \underset{‖ u_{E} ‖ \leq 1}{\arg \min} H_{E} \\ u_{P}^{*} = \underset{‖ u_{P} ‖ \leq 1}{\arg \min} H_{P} \end{cases}

(23)

For the sake of brevity, we define new variables $M_{i}$ $(i = D, E, P)$ as follows:

\begin{cases} M_{D} = λ_{D}^{T} D Φ T_{D} \\ M_{E} = - λ_{E}^{T} D Φ T_{E} \\ M_{P} = - k_{1} λ_{E}^{T} D Φ T_{P} + k_{2} λ_{D}^{T} D Φ T_{P} \end{cases}

(24)

Combining Equations (19), (22), (23), and (24) yields:

\begin{cases} u_{D}^{*} = {[u_{xD}^{*}, u_{yD}^{*}, u_{zD}^{*}]}^{T} = - \frac{{[M_{D} (4), M_{D} (5), M_{D} (6)]}^{T}}{‖ {[M_{D} (4), M_{D} (5), M_{D} (6)]}^{T} ‖} \\ u_{E}^{*} = {[u_{xE}^{*}, u_{yE}^{*}, u_{zE}^{*}]}^{T} = - \frac{{[M_{E} (4), M_{E} (5), M_{E} (6)]}^{T}}{‖ {[M_{E} (4), M_{E} (5), M_{E} (6)]}^{T} ‖} \\ u_{P}^{*} = {[u_{xP}^{*}, u_{yP}^{*}, u_{zP}^{*}]}^{T} = - \frac{{[M_{P} (4), M_{P} (5), M_{P} (6)]}^{T}}{‖ {[M_{P} (4), M_{P} (5), M_{P} (6)]}^{T} ‖} \end{cases}

(25)

Combining Equation (25) with the form of the control vector, the optimal control variables are expressed as Equation (26), which satisfies Equation (27).

\begin{cases} U_{D}^{*} = {[0, 0, 0, u_{xD}^{*}, u_{yD}^{*}, u_{zD}^{*}]}^{T} \\ U_{E}^{*} = {[0, 0, 0, u_{xE}^{*}, u_{yE}^{*}, u_{zE}^{*}]}^{T} \\ U_{P}^{*} = {[0, 0, 0, u_{xP}^{*}, u_{yP}^{*}, u_{zP}^{*}]}^{T} \end{cases}

(26)

\begin{matrix} J_{P} (U_{P}^{*}, U_{E}^{*}, U_{D}^{*}) \leq J_{P} (U_{P}, U_{E}^{*}, U_{D}^{*}) \\ J_{E} (U_{P}^{*}, U_{E}^{*}) \leq J_{E} (U_{P}^{*}, U_{E}) \\ J_{D} (U_{P}^{*}, U_{D}^{*}) \leq J_{D} (U_{P}^{*}, U_{D}) \end{matrix}

(27)

Equations (10), (20), (21), and (24)–(26) constitute a TPBVP.
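
Once the co-states and $D Φ (t_{f}, t)$ are known, the controls of Equations (24) and (25) can be evaluated in closed form. The Python sketch below is illustrative only: the function names, the `sign` argument, and the double-integrator form of $D Φ$ used in the example are assumptions and do not come from the paper.

```python
import math


def control_direction(lam, d_phi, thrust, sign=1.0):
    """Return u* = -M(4:6) / ||M(4:6)||, with M = sign * thrust * lam^T (D Phi).

    lam: 3-element co-state; d_phi: 3x6 matrix D Phi(t_f, t)
    sign: +1 reproduces M_D, -1 reproduces M_E in Equation (24)
    """
    # Row vector M of length 6
    M = [sign * thrust * sum(lam[i] * d_phi[i][j] for i in range(3))
         for j in range(6)]
    m = M[3:6]  # entries M(4), M(5), M(6) in the paper's 1-based indexing
    norm = math.sqrt(sum(v * v for v in m))
    if norm == 0.0:
        raise ValueError("control direction is undefined when M(4:6) = 0")
    return [-v / norm for v in m]


def pursuer_direction(lam_e, lam_d, d_phi, thrust, k1, k2):
    """Pursuer control from M_P = (-k1 lam_E + k2 lam_D)^T D Phi T_P."""
    lam = [-k1 * e + k2 * d for e, d in zip(lam_e, lam_d)]
    return control_direction(lam, d_phi, thrust)


# Assumed toy case: D Phi = [I, tau * I] with tau = 10 s (double integrator)
TAU = 10.0
D_PHI = [[1.0 if j == i else (TAU if j == i + 3 else 0.0) for j in range(6)]
         for i in range(3)]
LAM_E = [-1.0, 0.0, 0.0]   # corresponds to Z_PE(t_f) pointing along +x
LAM_D = [0.0, 1.0, 0.0]    # corresponds to Z_DP(t_f) pointing along +y

u_p = pursuer_direction(LAM_E, LAM_D, D_PHI, 0.09, 1.0, 0.0)
u_e = control_direction(LAM_E, D_PHI, 0.01, sign=-1.0)
```

With $k_{2} = 0$ the pursuer and the evader thrust in the same direction in this toy case, which matches the overlap of their control curves reported for the examples in which the pursuer ignores the defender.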

3.2. Hybrid Numerical Method

So far, the orbital pursuit-evasion-defense problem has been transformed into a 12-dimensional TPBVP. Generally, this kind of problem cannot be solved analytically, and numerical algorithms must be employed [9]. Numerical algorithms for this kind of problem include the collocation method [29] and the multiple shooting method [30]. The collocation method suffers from poor accuracy and a high computational burden, while the multiple shooting method has high accuracy but is very sensitive to the initial guess. As noted by Pontani [9], evolutionary methods constitute an effective statistical search technique for selecting the best parameters. Thus, we apply evolutionary methods to generate the initial guess for the multiple shooting method. A hybrid method combining the multi-objective genetic algorithm and the multiple shooting method is proposed to obtain the solution of the orbital pursuit-evasion-defense game. First, the initial guesses of the unknown parameters are obtained by using the multi-objective genetic algorithm. Then, the TPBVP is solved accurately by using the multiple shooting method.

For the sake of clarity, the state equations, the co-state equations, the initial conditions, and the terminal conditions are collected below.

Combining Equations (10), (20), and (26), the state equations and the co-state equations can be summarized as follows:

\begin{cases} {\dot{Z}}_{PE} = D Φ T_{P} U_{P}^{*} - D Φ T_{E} U_{E}^{*} \\ {\dot{Z}}_{DP} = D Φ T_{D} U_{D}^{*} - D Φ T_{P} U_{P}^{*} \\ {\dot{λ}}_{E} = 0 \\ {\dot{λ}}_{D} = 0 \end{cases}

(28)

The initial conditions of Equation (28) are expressed as follows:

\begin{cases} Z_{PE} (0) = D Φ (t_{f}, 0) X_{PE} (0) \\ Z_{DP} (0) = D Φ (t_{f}, 0) X_{DP} (0) \end{cases}

(29)

where $X_{PE} (0) = X_{P} (0) - X_{E} (0)$ and $X_{DP} (0) = X_{D} (0) - X_{P} (0)$.

According to Equation (21), the terminal conditions are written as follows:

\begin{cases} λ_{E} (t_{f}) = \frac{\partial J_{E}}{\partial Z_{PE} (t_{f})} = - \frac{Z_{PE} (t_{f})}{‖ Z_{PE} (t_{f}) ‖} \\ λ_{D} (t_{f}) = \frac{\partial J_{D}}{\partial Z_{DP} (t_{f})} = \frac{Z_{DP} (t_{f})}{‖ Z_{DP} (t_{f}) ‖} \end{cases}

(30)

3.2.1. Multi-Objective Genetic Algorithm

In the multi-objective genetic algorithm preprocessing, the terminal time $t_{f}$ and the unknown initial co-state variables $λ_{E} (0)$ and $λ_{D} (0)$ are taken as the parameters (individuals). According to the terminal conditions in Equation (30), the objective functions of the multi-objective genetic algorithm are set as the residuals of those conditions:

\begin{matrix} J_{1} = ‖ λ_{E} (t_{f}) + \frac{Z_{PE} (t_{f})}{‖ Z_{PE} (t_{f}) ‖} ‖ \\ J_{2} = ‖ λ_{D} (t_{f}) - \frac{Z_{DP} (t_{f})}{‖ Z_{DP} (t_{f}) ‖} ‖ \end{matrix}

(31)

A safe distance constraint is applied to ensure that the distance between any two players remains greater than the safe distance before the terminal time. The best parameters are obtained by choosing a reasonable population size, an appropriate maximum number of generations, and suitable operators (i.e., crossover and mutation). The multi-objective genetic algorithm improved by Deb [31] is applied to this problem; it reduces the computational complexity and maintains the diversity of solutions. In this paper, we use the default operators of the multi-objective genetic algorithm toolkit provided by Aravind Seshadri [32]. The population size and the number of generations are set to 100 and 200, respectively. Because the multi-objective genetic algorithm is used, the preprocessing time is relatively long, so this algorithm is suitable for off-line calculation.

3.2.2. Multiple Shooting Method

In order to better illustrate the application of the multiple shooting method in this problem, a new state vector is defined:

Ω (t) = [Z_{PE} (t), Z_{DP} (t), λ_{E} (t), λ_{D} (t)]

(32)

Substituting Equation (32) into Equation (28), the system equations can be expressed as follows:

\dot{Ω} (t) = f (t, Ω (t))

(33)

The multiple shooting method transforms the TPBVP into a series of initial value problems. The specific steps are given as follows:

Step 1: Divide the time interval $[0, t_{f}]$ into $m$ subintervals, where $t_{k} (k = 0, \dots, m)$ are the boundary points of the subintervals and satisfy $0 = t_{0} < t_{1} < \dots < t_{m} = t_{f}$.
Step 2: For each subinterval $[t_{i}, t_{i + 1}] (i = 0, \dots, m - 1)$, consider the initial value problem $\dot{Ω} (t) = f (t, Ω (t))$, $Ω (t_{i}) = s_{i}$, where $s_{i}$ is the initial value of the problem.
Step 3: Calculate the initial guess by the multi-objective genetic algorithm.
Step 4: Solve the initial value problem on each subinterval to obtain the solution $Ω (t, t_{i}, s_{i})$.
Step 5: Determine whether the matching condition $Ω (t_{i + 1}, t_{i}, s_{i}) = s_{i + 1}$ and the boundary conditions (29) and (30) are satisfied. If not, use the Newton method to modify the initial values and return to Step 4. If the conditions are satisfied, the solution of the TPBVP has been obtained.

Note that the accuracy of the initial guess affects the solution obtained by the multiple shooting method. If the initial guess is not accurate enough, the method may converge to a point that is not the desired solution, or it may need more iterations and a longer calculation time.
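
The matching and boundary conditions checked in Step 5 can be stacked into a single residual vector, which a Newton iteration then drives to zero. The Python sketch below builds that residual for a toy boundary value problem, $\ddot{y} = - y$ with $y (0) = 0$ and $y (π / 2) = 1$, rather than for the game itself. The function names, the toy problem, and the fixed-step RK4 integrator are assumptions made for illustration, and the Newton update is omitted.

```python
import math


def rk4(f, t0, t1, s, steps=50):
    """Integrate s' = f(t, s) from t0 to t1 with the classical RK4 scheme."""
    h = (t1 - t0) / steps
    t, y = t0, list(s)
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + w)
             for a, p, q, r, w in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def shooting_residual(f, nodes, guesses, boundary):
    """Stack the continuity defects (Step 5) and the boundary residuals."""
    res = []
    for i in range(len(nodes) - 1):
        end = rk4(f, nodes[i], nodes[i + 1], guesses[i])   # Step 4
        res += [e - g for e, g in zip(end, guesses[i + 1])]  # defect at t_{i+1}
    res += boundary(guesses[0], guesses[-1])
    return res


def oscillator(t, s):
    """Toy dynamics: s = [y, y'], so s' = [y', -y]."""
    return [s[1], -s[0]]


def toy_boundary(s_first, s_last):
    """Residuals of y(0) = 0 and y(pi/2) = 1."""
    return [s_first[0] - 0.0, s_last[0] - 1.0]


# Three subintervals; node values taken from the exact solution y = sin(t)
NODES = [k * (math.pi / 2) / 3 for k in range(4)]
EXACT = [[math.sin(t), math.cos(t)] for t in NODES]
residual = shooting_residual(oscillator, NODES, EXACT, toy_boundary)
```

With four nodes and two states per node there are eight unknowns and eight residual entries, so the Newton system is square. At the exact node values the residual is zero up to integration error; a poor initial guess shows up as large defects at the interior nodes.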

4. Results and Discussion

In this section, four examples are given to verify the effectiveness of the proposed strategy. Example 1 and Example 2 form one group, with the same initial conditions and maneuver parameters. They differ in that the pursuer in Example 1 adopts the control strategy based on the fuzzy comprehensive evaluation, while the pursuer in Example 2 does not consider the impact of the defender, that is, $k_{1} = 1, k_{2} = 0$ in the objective function $J_{P}$. Example 3 and Example 4 form the other group and differ from each other in the same way. In all examples, the initial orbital altitude of the reference orbit is $h = 500$ km, the acceleration of gravity is $g = 9.8 \times 10^{-3}$ km/s$^{2}$, and the radius of the Earth is $R = 6371.393$ km. During the game, the safety distance between players is set to 0.5 km.

Example 1. The maximum unit mass thrusts of the pursuer, the evader, and the defender are $T_{P} = 0.09 \times g$, $T_{E} = 0.01 \times g$, and $T_{D} = 0.02 \times g$, respectively, and the game time is 267.4124 s. The initial positions and velocities of the pursuer, the evader, and the defender are shown in Table 2. The pursuer adopts the control strategy based on the fuzzy comprehensive evaluation.

Figure 2 shows the positions of the three players over time in the X, Y, and Z directions. It can be seen that the pursuer bypasses the interception of the defender and eventually catches up with the evader. Table 3 shows that, at the terminal moment, the distance between the pursuer and the evader is 0.3598 km, which is shorter than the safety distance of 0.5 km, indicating that the pursuer catches up with the evader. Figure 3 shows the distance between the defender and the pursuer during the game. It reaches its minimum of 0.5099 km at 203.5 s and increases afterwards. Since this minimum is longer than the safety distance of 0.5 km, the pursuer successfully bypasses the defender during the game.

Figure 4 shows the control variables of each player over time in the X, Y, and Z directions. Figure 5 shows the ZEM distance over time. From the figures, it can be seen that when the ZEM distance $Z_{DP} (t)$ is close to 0 (i.e., from 40 s to 130 s), the pursuer gives more consideration to evading the defender, so in this phase the pursuer’s control curve is closer to that of the defender. When the ZEM distance $Z_{DP} (t)$ is not close to 0, the pursuer almost ignores the impact of the defender, so its control curve at this stage almost coincides with that of the evader. Through this strategy, the pursuer successfully bypasses the defender during the game and finally captures the evader.

Example 2. The maximum unit mass thrusts of the pursuer, the evader, and the defender are $T_{P} = 0.09 \times g$, $T_{E} = 0.01 \times g$, and $T_{D} = 0.02 \times g$, respectively, and the game time is 159.81193 s. The initial positions and velocities of the pursuer, the evader, and the defender are shown in Table 2. The pursuer does not consider the impact of the defender when performing orbital maneuvers.

As shown in Figure 6, the defender successfully intercepts the pursuer at the terminal moment. Figure 7 shows the distance between the defender and the pursuer during the game. According to Figure 7, the distance becomes shorter and shorter in the entire game, which is caused by the pursuer’s not considering the impact of the defender. At the terminal moment, the distance between the defender and the pursuer is 0.4004 km, which is shorter than the safety distance 0.5 km.

Figure 8 shows the control variable of each player changing with time in the game. As shown in the figure, the control curve of the pursuer overlaps with that of the evader in the whole procedure, the reason being that the pursuer only considers the evader when performing orbital maneuvers.

Comparing Example 1 with Example 2, it can be seen that with the control strategy based on the fuzzy comprehensive evaluation, the pursuer successfully bypasses the defender and finally captures the evader, whereas the pursuer that does not consider the impact of the defender is eventually intercepted by the defender.

Example 3. The maximum unit mass thrusts of the pursuer, the evader, and the defender are $T_{P} = 0.14 \times g$, $T_{E} = 0.01 \times g$, and $T_{D} = 0.14 \times g$, respectively. The maneuverability of the defender and that of the pursuer are the same, and the game time is 177.87788 s. The initial positions and velocities of the pursuer, the evader, and the defender are shown in Table 4. The pursuer adopts the control strategy based on the fuzzy comprehensive evaluation.

Figure 9 shows the curves of the positions of the three players changing with time in directions of X, Y, Z. As shown in Figure 9, the defender successfully intercepts the pursuer at the end of the game. Figure 10 shows the curves of the control variable of each player changing with time in the directions of X, Y, Z. At 160 s or so, the pursuer starts to change the control strategy to evade the defender. However, because of the same maneuverability of the defender and the pursuer, the pursuer does not successfully bypass the interception of the defender. Table 5 shows the position of each player at the terminal moment. From Table 5, it can be seen that at the terminal moment, the distance between the defender and the pursuer is 0.4223 km, which is shorter than the safety distance 0.5 km, and at the terminal moment, the distance between the pursuer and the evader is 1.7823 km, which is longer than the safety distance 0.5 km. All the above show that at the terminal moment, the defender successfully intercepts the pursuer and that the evader successfully evades the capture of the pursuer.

Example 4. The maximum unit mass thrusts of the pursuer, the evader, and the defender are $T_{P} = 0.14 \times g$, $T_{E} = 0.01 \times g$, and $T_{D} = 0.14 \times g$, respectively. The maneuverability of the defender and that of the pursuer are the same, and the game time is 177.19431 s. The positions and velocities of the pursuer, the evader, and the defender at the initial time are shown in Table 4. The pursuer does not consider the impact of the defender when performing orbital maneuvers.

As shown in Figure 11, the defender intercepts the pursuer at the terminal moment. Table 6 shows that, at the terminal moment, the distance between the pursuer and the defender is 0.3959 km, which is shorter than the safety distance of 0.5 km, and the distance between the pursuer and the evader is 1.6016 km, which is longer than the safety distance. Hence, the defender successfully intercepts the pursuer, and the evader successfully evades capture by the pursuer. Figure 12 shows the control variable of each player changing with time in the X, Y, and Z directions. The control curves of the pursuer overlap with those of the evader throughout the game.

Comparing Example 3 with Example 4 shows that, because of the different control strategies adopted by the pursuer, the defender takes longer to intercept the pursuer in Example 3 (177.87788 s) than in Example 4 (177.19431 s). Moreover, at the terminal moment, the distance between the pursuer and the defender is shorter in Example 4 (0.3959 km) than in Example 3 (0.4223 km).
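To make the size of these differences concrete, the sketch below recomputes the gap in game time and in terminal pursuer-defender distance between the two examples. It is illustrative only: the dictionary layout and the helper `pursuer_defender_distance` are hypothetical, and the inputs are the game times quoted in the text and the rounded positions from Tables 5 and 6.

```python
import math

# Game times (s) quoted in the text; terminal positions (km) from Tables 5 and 6.
EXAMPLES = {
    "Example 3": {  # pursuer uses the fuzzy comprehensive evaluation strategy
        "game_time_s": 177.87788,
        "pursuer": (8.293, 8.804, 11.89),
        "defender": (8.638, 9.026, 11.79),
    },
    "Example 4": {  # pursuer does not consider the defender
        "game_time_s": 177.19431,
        "pursuer": (8.384, 8.956, 11.81),
        "defender": (8.714, 9.132, 11.68),
    },
}


def pursuer_defender_distance(example):
    """Terminal Euclidean distance (km) between pursuer and defender."""
    return math.dist(example["pursuer"], example["defender"])


ex3, ex4 = EXAMPLES["Example 3"], EXAMPLES["Example 4"]

# Positive values mean Example 3 lasted longer / ended farther from the defender.
extra_time_s = ex3["game_time_s"] - ex4["game_time_s"]
extra_distance_km = pursuer_defender_distance(ex3) - pursuer_defender_distance(ex4)

print(f"additional time before interception: {extra_time_s:.5f} s")
print(f"additional terminal separation:      {extra_distance_km:.4f} km")
```

With these inputs, the interception is delayed by roughly 0.68 s and the terminal separation is larger by roughly 0.026 km, so the improvement in Example 3 is present but small.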

The comparison between Example 1 and Example 2 shows that when the control variable of the pursuer is in a dominant position, the optimal control strategy proposed in this paper enables the pursuer to bypass the defender and capture the evader. The comparison between Example 3 and Example 4 shows that when the control variable of the pursuer is not in a dominant position, the proposed strategy prolongs the time that the defender takes to intercept the pursuer.

5. Conclusions

The fuzzy comprehensive evaluation and differential game theory are applied to design the control strategy of the pursuer in the orbital pursuit-evasion-defense problem. A hybrid method combining the multi-objective genetic algorithm and the multiple shooting method is proposed to solve the problem. The simulation results show that when the pursuer's control is in a dominant position, the proposed control strategy enables the pursuer to bypass the defender and capture the evader, and that when the pursuer's control is not in a dominant position, the strategy prolongs the time that the defender takes to intercept the pursuer. The proposed control strategy is applicable to orbital pursuit-evasion-defense scenarios in which the players adopt continuous low thrust propulsion. When the ZEM distance between the pursuer and the defender is close to zero, the control strategy automatically switches to one parallel to the defender's control strategy, so that the pursuer can effectively avoid interception by the defender.

A limitation of this paper is that the terminal time of the game is given by the genetic algorithm and is therefore not accurate. Further research will be carried out on the accurate calculation of the terminal time.

Author Contributions

J.Z. and L.Z. conceived the framework and structured the paper; J.Z. and J.C. performed the experiments and analyzed the data; J.Z., S.W., and Y.W. wrote and revised the paper.

Funding

This research was jointly funded by the National Natural Science Foundation of China (Nos. 61633008, 61773132, 61803115), the 7th Generation Ultra Deep Water Drilling Unit Innovation Project sponsored by Chinese Ministry of Industry and Information Technology, the Heilongjiang Province Science Fund for Distinguished Young Scholars (No. JC2018019), and the Fundamental Research Funds for Central Universities (No. HEUCFP201768).

Acknowledgments

We gratefully acknowledge Aravind Seshadri for providing the toolkit on multi-objective genetic algorithm at the following website: https://www.mathworks.com/matlabcentral/fileexchange/10429-nsga-ii-a-multi-objective-optimization-algorithm.

Conflicts of Interest

The authors declare no conflict of interest.


Figure 1. The local-vertical local-horizontal (LVLH) coordinate system.

Figure 2. The position of each player changing with time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 3. The distance between the pursuer and the defender changing with time.

Figure 4. The curves of the control variable of each player with time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 5. The curves of zero-control miss distance with time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 6. The position of each player changing over time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 7. The distance between the pursuer and the defender changing over time.

Figure 8. The control variable of each player changing over time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 9. The position of each player changing over time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 10. The control variable of each player changing over time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 11. The position of each player changing with time in (a) x-axis, (b) y-axis, and (c) z-axis.

Figure 12. The control variable of each player changing over time in (a) x-axis, (b) y-axis, and (c) z-axis.

Table 1. The evaluation scales.

$v_{i}$ ¹	1	2	3	4	5	6	7	8	9	10	11
$k_{1}$	0	0.1	0.2	0.3	0.4	0.5	0.6	0.7	0.8	0.9	1
$k_{2}$	1	0.9	0.8	0.7	0.6	0.5	0.4	0.3	0.2	0.1	0

¹ $v_{i}$, $i = 1, \dots, 11$, represents the corresponding scales, respectively.
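The rows of Table 1 follow a simple linear pattern, which the sketch below regenerates. The closed forms used here, $k_{1} = (v_{i} - 1)/10$ and $k_{2} = 1 - k_{1}$, are an assumption inferred from the tabulated values rather than formulas stated by the authors, and the variable names are hypothetical.

```python
# Scale indices v_i = 1, ..., 11 as listed in Table 1.
scales = range(1, 12)

# Assumed closed form inferred from the table rows (not stated by the authors):
# k1 rises linearly from 0 to 1, and k2 is its complement.
# Rounding to one decimal removes floating-point noise such as 0.30000000000000004.
k1 = [round((v - 1) / 10, 1) for v in scales]
k2 = [round(1 - k, 1) for k in k1]

for v, a, b in zip(scales, k1, k2):
    print(f"v_{v}: k1 = {a:.1f}, k2 = {b:.1f}")
```

Under this reading, the two weights always sum to one, so each scale trades emphasis between the two quantities being weighted.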

Table 2. Positions and velocities at the initial time.

Parameter	Pursuer	Evader	Defender
$X / km$	0	12	6
$Y / km$	0	16	8
$Z / km$	20	0	10
$V_{X} / (km \cdot s^{- 1})$	0	0	0
$V_{Y} / (km \cdot s^{- 1})$	0	0	0
$V_{Z} / (km \cdot s^{- 1})$	0	0	0

Table 3. Position of each player at the end of the game.

Parameter	Pursuer	Evader	Defender
$X / km$	15.24	15.28	10.24
$Y / km$	17.44	17.68	11.72
$Z / km$	−2.098	−2.363	4.87

Table 4. Positions and velocities at the initial time.

Parameter	Pursuer	Evader	Defender
$X / km$	0	8	18
$Y / km$	0	9	24
$Z / km$	30	12	0
$V_{X} / (km \cdot s^{- 1})$	0	0	0
$V_{Y} / (km \cdot s^{- 1})$	0	0	0
$V_{Z} / (km \cdot s^{- 1})$	0	0	0

Table 5. Position of each player at the end of the game.

Parameter	Pursuer	Evader	Defender
$X / km$	8.293	9.083	8.638
$Y / km$	8.804	9.609	9.026
$Z / km$	11.89	10.51	11.79

Table 6. Position of each player at the end of the game.

Parameter	Pursuer	Evader	Defender
$X / km$	8.384	9.069	8.714
$Y / km$	8.956	9.593	9.132
$Z / km$	11.81	10.51	11.68

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
