1. Introduction
Recently, the orbital pursuit-evasion problem has attracted increasing attention in space research [1,2,3,4]. This problem can be formulated as a differential game [5], which seeks the optimal control strategy of the pursuer and/or the evader in the worst-case scenario, so as to intercept the evader or to evade the pursuer.
Wong [6] is regarded as the first to study the orbital pursuit-evasion problem; he solved the problem of intercepting a maneuverable satellite under the assumptions of planar motion and a constant gravitational field. Since then, many works have focused on the orbital pursuit-evasion problem. In reference [7], a method based on periodically updating the solution of the two-point boundary value problem (TPBVP) was proposed to generate near-optimal feedback controls for the orbital pursuit-evasion problem. However, this method is time-consuming and difficult to apply in real time. To overcome these drawbacks, Anderson [8] used a modified first-order differential dynamic programming algorithm to generate near-optimal feedback controls. References [9,10,11] found the saddle-point equilibrium solutions of the three-dimensional orbital pursuit-evasion game using three different hybrid numerical methods. Hafer et al. [12] applied the sensitivity method to the orbital pursuit-evasion problem, which greatly reduces the computational burden of solving this problem numerically. Widhalm studied the problem of avoiding an interception and proposed two optimal evasive-maneuver strategies, one with impulsive thrust [13] and one with continuous low thrust [1]. Prussing et al. [14] derived minimum-fuel impulsive strategies for return-on-state maneuvers by applying primer vector theory. Merz [15] developed guidance laws for the noisy satellite pursuit-evasion game. Woodbury et al. [16] studied an incomplete, imperfect information game and presented adaptive strategies for the pursuer and the evader. Ghosh et al. [17] developed a near-optimal feedback controller for two-player pursuit-evasion games by using a new extremal-field approach. All of the above works adopt the two-player pursuit-evasion game framework. In this framework, however, the evader can only avoid threats by maneuvering itself. This self-defense disturbs the original mission of the evader and requires a large additional amount of fuel.
To overcome this disadvantage, a defender is introduced in [18]. The role of the defender is to intercept the pursuer, so that the evader can carry out its original mission without being disturbed. A hybrid method combining particle swarm optimization with a Newton-Interpolation algorithm was proposed to solve the orbital defense problem. However, because of the introduction of the defender, the pursuer must avoid interception by the defender while capturing the evader [19], which makes the design of the pursuer’s control strategy more complicated. To develop control strategies for pursuers, Liu et al. [19] proposed a distributed online mission planning algorithm for pursuers to access targets. However, these works on the orbital pursuit-evasion-defense game adopted impulsive thrust, which has the drawback that the interception fails when the target can perform evasive maneuvers [4].
Compared with impulsive thrust, continuous low thrust allows players to perform multiple, continuous maneuvers, which meets the requirement of frequent orbital transfers in the game. With continuous low thrust, the hypothesis about the players’ maneuverability is removed, which is closer to the actual situation of the orbital pursuit-evasion-defense game. Therefore, in this paper, the orbital pursuit-evasion-defense game model is constructed based on continuous low thrust. Unlike the model based on impulsive thrust, the model based on continuous low thrust cannot adopt Keplerian dynamics [20]; its dynamic equations are based on non-Keplerian motion. Two issues need to be solved in this model: (i) the system has a high dimension, so solving the problem suffers from the curse of dimensionality [21]; (ii) the pursuer must consider two objectives, intercepting the evader and evading the defender, and the corresponding weights should be determined according to the current state. For the first issue, since the zero effort miss (ZEM) can be used to simplify the linear system [4], the dimension of the system is reduced by introducing the relative state variables and the ZEM variables [22]. For the second issue, the pursuer’s objective function is designed based on the fuzzy comprehensive evaluation, and a control strategy for the pursuer suited to the orbital pursuit-evasion-defense game is proposed. Based on the above model, the orbital pursuit-evasion-defense game is transformed into a TPBVP by applying differential game theory. A hybrid method combining the multi-objective genetic algorithm and the multiple shooting method is presented to solve the TPBVP.
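To illustrate how ZEM variables reduce the dimension of a linear relative-motion model, the following is a minimal sketch, not the model of this paper. It assumes that the relative dynamics are the Clohessy-Wiltshire (CW) equations about a circular reference orbit, with x radial, y along-track, and z cross-track. The function names, the mean motion `n`, and the sample state are hypothetical. The ZEM is taken as the relative position at the terminal time when neither player thrusts, so the six relative states map to three ZEM variables.

```python
import numpy as np

def cw_state_transition(n, t):
    """6x6 Clohessy-Wiltshire state transition matrix.

    n: mean motion of the circular reference orbit [rad/s]
    t: propagation time [s]
    Assumed axes: x radial, y along-track, z cross-track.
    """
    s, c = np.sin(n * t), np.cos(n * t)
    phi_rr = np.array([[4 - 3 * c, 0, 0],
                       [6 * (s - n * t), 1, 0],
                       [0, 0, c]])
    phi_rv = np.array([[s / n, 2 * (1 - c) / n, 0],
                       [-2 * (1 - c) / n, (4 * s - 3 * n * t) / n, 0],
                       [0, 0, s / n]])
    phi_vr = np.array([[3 * n * s, 0, 0],
                       [-6 * n * (1 - c), 0, 0],
                       [0, 0, -n * s]])
    phi_vv = np.array([[c, 2 * s, 0],
                       [-2 * s, 4 * c - 3, 0],
                       [0, 0, c]])
    return np.block([[phi_rr, phi_rv], [phi_vr, phi_vv]])

def zero_effort_miss(n, rel_state, t_go):
    """Relative position after t_go seconds of unforced (zero-thrust) motion."""
    rel_state = np.asarray(rel_state, dtype=float)
    return (cw_state_transition(n, t_go) @ rel_state)[:3]

# Hypothetical values: mean motion of a low Earth orbit and a relative
# state [x, y, z, vx, vy, vz] in km and km/s.
n = 1.1e-3
rel_state = [10.0, -5.0, 2.0, 0.01, -0.02, 0.0]
zem = zero_effort_miss(n, rel_state, t_go=200.0)
print("ZEM vector [km]:", zem, "ZEM distance [km]:", np.linalg.norm(zem))
```

With zero time-to-go the ZEM equals the current relative position, which gives a quick sanity check of the sketch.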
4. Results and Discussion
In this section, four examples are given to verify the effectiveness of the proposed strategy. Example 1 and Example 2 form one group with the same initial conditions and maneuver parameters. The difference between them is that, when performing orbital maneuvers, the pursuer in Example 1 adopts the control strategy based on the fuzzy comprehensive evaluation, while the pursuer in Example 2 does not consider the impact of the defender, that is, the parameters in the objective function . Example 3 and Example 4 form the other group and differ from each other in the same way as Example 1 and Example 2. The initial orbital altitude of their reference orbit , the acceleration of gravity , and the radius of the Earth . During the game, the safety distance between players is set as 0.5 km.
Example 1. The maximum unit mass thrusts of the pursuer, the evader, and the defender are
,
, and
, respectively, and the game time is 267.4124 s. The initial positions and velocities of the pursuer, the evader, and the defender are shown in
Table 2. The pursuer adopts the control strategy based on the fuzzy comprehensive evaluation.
Figure 2 shows the positions of the three players over time in the X, Y, and Z directions. From Figure 2, it can be seen that the pursuer bypasses the interception of the defender and eventually catches up with the evader. From Table 3, it can be seen that at the terminal moment the distance between the pursuer and the evader is 0.3598 km, which is shorter than the safety distance of 0.5 km, indicating that the pursuer catches up with the evader. Figure 3 shows the distance between the defender and the pursuer during the game. It reaches its minimum of 0.5099 km at 203.5 s and increases afterwards. This minimum is longer than the safety distance of 0.5 km, indicating that the pursuer successfully bypasses the defender during the game.
Figure 4 shows the control variable of each player over time in the X, Y, and Z directions. Figure 5 shows the ZEM distance over time. From the figures, it can be seen that when the ZEM distance is close to 0 (i.e., from 40 s to 130 s), the pursuer gives more weight to evading the defender, so in this phase the pursuer’s control curve is closer to that of the defender. When the ZEM variables are not close to 0, the pursuer almost ignores the impact of the defender, so its control curve almost coincides with that of the evader. Through this strategy, the pursuer successfully bypasses the defender during the game and finally captures the evader.
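The state-dependent weighting described above can be pictured with a small sketch. It is not the fuzzy comprehensive evaluation of this paper, whose membership functions are not reproduced in this section. It only assumes a single linear ramp: the weight on evading the defender grows as the ZEM distance approaches 0 and vanishes when the ZEM distance is large. The function name, the thresholds `near` and `far`, and the cap `w_max` are hypothetical.

```python
def objective_weights(zem_distance, near=0.5, far=2.0, w_max=0.8):
    """Return (w_capture, w_evade), which always sum to 1.

    zem_distance: ZEM distance [km]
    near, far:    hypothetical thresholds [km]; the defender counts as a full
                  threat at or below `near` and as no threat at or above `far`
    w_max:        hypothetical upper bound on the evasion weight
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    # Linear ramp from 1 (ZEM distance <= near) down to 0 (ZEM distance >= far).
    threat = min(1.0, max(0.0, (far - zem_distance) / (far - near)))
    w_evade = w_max * threat
    return 1.0 - w_evade, w_evade

# Hypothetical ZEM distances: small values shift weight toward evasion.
for d in (0.1, 1.25, 3.0):
    print(d, objective_weights(d))
```

With such a rule, a ZEM distance far from 0 gives the weights (1, 0), which corresponds to the phase in which the pursuer’s control almost coincides with that of the evader.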
Example 2. The maximum unit mass thrusts of the pursuer, the evader, and the defender are
,
, and
, respectively, and the game time is 159.81193 s. The initial positions and velocities of the pursuer, the evader, and the defender are shown in
Table 2. The pursuer does not consider the impact of the defender when performing orbital maneuvers.
As shown in Figure 6, the defender successfully intercepts the pursuer at the terminal moment. Figure 7 shows the distance between the defender and the pursuer during the game. According to Figure 7, the distance decreases throughout the game because the pursuer does not consider the impact of the defender. At the terminal moment, the distance between the defender and the pursuer is 0.4004 km, which is shorter than the safety distance of 0.5 km. Figure 8 shows the control variable of each player over time. As shown in the figure, the control curve of the pursuer overlaps with that of the evader throughout the game, because the pursuer only considers the evader when performing orbital maneuvers.
Comparing Example 1 with Example 2, it can be seen that with the control strategy based on the fuzzy comprehensive evaluation, the pursuer successfully bypasses the defender and finally captures the evader, whereas the pursuer that does not consider the impact of the defender is eventually intercepted by the defender.
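The comparison rests on the minimum pursuer-defender distance over the game (0.5099 km at 203.5 s in Example 1, against a terminal distance of 0.4004 km in Example 2). As a minimal sketch of how such a closest approach can be read from sampled trajectories, the function below takes time samples and two position histories; the function name and the synthetic trajectories are hypothetical and do not reproduce the data of the examples.

```python
import numpy as np

def closest_approach(times, pos_a, pos_b):
    """Return (time, distance) of the smallest sampled separation.

    times: sequence of N sample times [s]
    pos_a, pos_b: arrays of shape (N, 3) with positions [km]
    """
    times = np.asarray(times, dtype=float)
    separation = np.asarray(pos_a, dtype=float) - np.asarray(pos_b, dtype=float)
    distance = np.linalg.norm(separation, axis=1)
    i = int(np.argmin(distance))
    return float(times[i]), float(distance[i])

# Synthetic data: player A moves along X, player B stays at (5, 1, 0) km.
times = np.linspace(0.0, 10.0, 11)
pos_a = np.column_stack([times, np.zeros_like(times), np.zeros_like(times)])
pos_b = np.tile([5.0, 1.0, 0.0], (times.size, 1))
t_min, d_min = closest_approach(times, pos_a, pos_b)
print(f"closest approach {d_min:.4f} km at t = {t_min:.1f} s")
```

Because only the sampled instants are examined, the result is as fine as the time grid; a denser grid or interpolation would be needed to resolve a minimum such as 203.5 s.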
Example 3. The maximum unit mass thrusts of the pursuer, the evader, and the defender are
,
, and
, respectively. The maneuverability of the defender and that of the pursuer are the same, and the game time is 177.87788 s. The initial positions and velocities of the pursuer, the evader, and the defender are shown in
Table 4. The pursuer adopts the control strategy based on the fuzzy comprehensive evaluation.
Figure 9 shows the positions of the three players over time in the X, Y, and Z directions. As shown in Figure 9, the defender successfully intercepts the pursuer at the end of the game. Figure 10 shows the control variable of each player over time in the X, Y, and Z directions. At about 160 s, the pursuer starts to change its control strategy to evade the defender. However, because the defender and the pursuer have the same maneuverability, the pursuer fails to bypass the interception of the defender. Table 5 shows the position of each player at the terminal moment. From Table 5, it can be seen that at the terminal moment the distance between the defender and the pursuer is 0.4223 km, which is shorter than the safety distance of 0.5 km, and the distance between the pursuer and the evader is 1.7823 km, which is longer than the safety distance. Together, these show that at the terminal moment the defender successfully intercepts the pursuer and the evader successfully evades capture by the pursuer.
Example 4. The maximum unit mass thrusts of the pursuer, the evader, and the defender are
,
,
, respectively. The maneuverability of the defender and that of the pursuer are the same, and the game time is 177.19431 s. The initial positions and velocities of the pursuer, the evader, and the defender are shown in
Table 4. The pursuer does not consider the impact of the defender when performing orbital maneuvers.
As shown in Figure 11, the defender intercepts the pursuer at the terminal moment. From Table 6, it can be seen that at the terminal moment the distance between the pursuer and the defender is 0.3959 km, which is shorter than the safety distance of 0.5 km, and that the distance between the pursuer and the evader is 1.6016 km, which is longer than the safety distance. This shows that the defender successfully intercepts the pursuer at the terminal moment and that the evader successfully evades capture by the pursuer. Figure 12 shows the control variable of each player over time in the X, Y, and Z directions. From the figure, it can be seen that the control curves of the pursuer remain overlapped with those of the evader.
Comparing Example 3 with Example 4, it can be seen that, because of the different control strategies adopted by the pursuer, the defender takes longer to intercept the pursuer in Example 3 (177.87788 s) than in Example 4 (177.19431 s). Moreover, at the terminal moment, the distance between the pursuer and the defender in Example 4 (0.3959 km) is shorter than that in Example 3 (0.4223 km).
The comparison between Example 1 and Example 2 shows that when the control variable of the pursuer is in a dominant position, the optimal control strategy proposed in this paper enables the pursuer to bypass the defender and capture the evader. The comparison between Example 3 and Example 4 shows that when the control variable of the pursuer is not in a dominant position, the proposed strategy prolongs the time that the defender takes to intercept the pursuer.
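The four outcomes can be cross-checked against the 0.5 km safety distance with a short sketch. The distances and game times are those reported in the text of this section; the dictionary layout, the function name, and the rule that defender interception takes precedence over capture are assumptions made for illustration. The pursuer-evader distance of Example 2 is not stated in the text and is left as `None`.

```python
SAFETY_DISTANCE_KM = 0.5

# d_pd: pursuer-defender distance [km] (minimum over the game for Example 1,
#       terminal value for Examples 2 to 4); d_pe: terminal pursuer-evader
#       distance [km]; t: game time [s]. All values are quoted from the text.
REPORTED = {
    "Example 1": {"d_pd": 0.5099, "d_pe": 0.3598, "t": 267.4124},
    "Example 2": {"d_pd": 0.4004, "d_pe": None, "t": 159.81193},
    "Example 3": {"d_pd": 0.4223, "d_pe": 1.7823, "t": 177.87788},
    "Example 4": {"d_pd": 0.3959, "d_pe": 1.6016, "t": 177.19431},
}

def outcome(d_pd, d_pe, safety=SAFETY_DISTANCE_KM):
    """Classify a game from its distances (assumed rule: interception first)."""
    if d_pd < safety:
        return "pursuer intercepted by defender"
    if d_pe is not None and d_pe < safety:
        return "evader captured by pursuer"
    return "undecided"

for name, r in REPORTED.items():
    print(name, "->", outcome(r["d_pd"], r["d_pe"]))

# Extra time the defender needs in Example 3 compared with Example 4.
delay_s = REPORTED["Example 3"]["t"] - REPORTED["Example 4"]["t"]
print(f"interception delayed by {delay_s:.5f} s")
```

Under these reported values, only Example 1 ends with the evader captured, and the strategy based on the fuzzy comprehensive evaluation delays interception in Example 3 by about 0.68 s relative to Example 4.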