Incomplete Information Pursuit-Evasion Game Control for a Space Non-Cooperative Target

Aiming to solve the optimal control problem for the pursuit-evasion game with a space non-cooperative target under the condition of incomplete information, a new method that degenerates the game into a strong tracking problem is proposed, in which the unknown target maneuver is treated as colored noise. First, the relative motion is modeled in the rotating local vertical local horizontal (LVLH) frame with its origin at a virtual Chief, based on the Hill-Clohessy-Wiltshire relative dynamics; the measurement models for three different sensor schemes (i.e., single LOS (line-of-sight) sensor, LOS range sensor and double LOS sensor) are established, and an extended Kalman filter (EKF) is used to obtain the relative state of the target. Next, under the assumption that the unknown maneuver of the target is colored noise, the game control law of the chaser is derived based on linear quadratic differential game theory. Furthermore, the optimal control law considering the thrust limitation is obtained. After that, the observability of the relative orbit state is analyzed: the relative orbit is weakly observable over a short period of time when only LOS angle measurements are available, and fully observable with the LOS range and double LOS measurement schemes. Finally, numerical simulations are conducted to verify the proposed method. The results show that with the single LOS scheme, the chaser first approaches the target but then loses the game because of the target's unknown maneuver. Conversely, the chaser can win the game with the LOS range and double LOS sensor schemes.


Introduction
With the progress of human space exploration, the amount of space debris and the number of inactive satellites have increased sharply and now pose a significant threat to active spacecraft and satellites; removing this debris has therefore become an important issue. Furthermore, along with the development of space rendezvous and docking technology [1,2], non-cooperative target observation [3] and approach techniques [4], the safety of space assets is threatened more than ever by military vehicles. Therefore, protecting space assets in the face of these threats is critical. Many studies have addressed this problem, e.g., in space situational awareness and on-orbit servicing [5][6][7][8][9][10][11][12]. Developing on-orbit servicing vehicles with corresponding GNC (guidance, navigation and control) systems to handle the space debris, inactive satellites and military vehicles that threaten space assets is the most effective way to ensure safety. In this manuscript, pursuit-evasion game control, as an extension of the space rendezvous problem to a non-cooperative target, is studied.
Isaacs [13] carried out the earliest study of differential games and gave the necessary conditions for optimality in pursuit-evasion games. In 1971, Friedman [14] established the theory of the existence of the value and saddle point of a differential game using discrete approximation sequences, which laid a solid mathematical foundation for differential games. Starr and Ho [15] studied nonzero-sum N-person differential games of three different types. Roxin and Tsokos [16] gave the mathematical definition of the stochastic differential game, Nichols [17] pointed out the relationship between the stochastic differential game and cybernetics, and Ciletti [18][19][20][21] studied differential games with information delay and established their open-loop and closed-loop control. Stackelberg's [22] leader-follower (master-slave) differential game became a research hotspot in the 1980s and 1990s. Since 2000, differential game research has mainly concentrated on zero-sum differential games with state constraints, multi-player differential games and incomplete information differential games. Aumann and Maschler [23] and Harsanyi [24] studied static games with incomplete information; Harsanyi converted the game with incomplete information into a game with complete but imperfect information, which can be handled with the methods for complete information. Kreps and Wilson [25] studied dynamic games with incomplete information, introducing the perfect Bayesian equilibrium, the sequential equilibrium, etc. for discrete dynamic games. In an incomplete information game with a non-cooperative target, two problems arise: first, the basis and conditions on which the other party makes its decisions are not known; second, the relevant relative state information of the other party may not be obtainable.
For the first problem mentioned above, Woodbury and Hurtado proposed adaptive control policies [26] that estimate the weights of the other party's cost function, which is not applicable when the form of the cost function is unknown; for the second problem, they proposed adding an additional spacecraft for observation to obtain the position information of the target [27]. Cavalieri studied the incomplete information game [28] and extended it to the case of uncertain relative dynamics [29] by adding an estimator on the basis of a behavioral learning algorithm. Woodbury used similar methods [30]. Since learning algorithms require strong on-board computing capabilities, Liu et al. built a fuzzy inference model to characterize the continuous space, and proposed a branching deep reinforcement learning architecture with multiple parallel neural networks and a shared decision module [31]. Linville used a linear regression model [32] for the incomplete information game to improve the practicality of the learning algorithm. Dong et al. proposed a multi-mode adaptive solution to the incomplete information game [33]. Similarly, Li studied the incomplete information game by constantly estimating and updating the guess of the target's control strategy [34].
As an important case of incomplete information, bearings-only measurements have been widely studied. Oshman and Davidson proposed a method based on maximizing the determinant of the Fisher information matrix (FIM) to design the optimal observation trajectory for the observer [35]. Battistini and Shima proposed a new guidance strategy that exploits the information from the error covariance matrix of the homing loop integrated Kalman filter in the framework of a pursuit-evasion game for missiles [36]. Fonod and Shima studied cooperative estimation/guidance for a team of missiles using bearings-only measurements [37]. Battistini presented a method for characterizing the capture region of a pursuit-evasion game in terms of the confidence in the estimate of the zero-effort miss (ZEM) [38].
In summary, the pursuit-evasion game problem in the complicated space environment is quite challenging, especially in the case of incomplete feedback information. The main contribution of this research is a space pursuit-evasion game control algorithm for the case of incomplete feedback information, i.e., angles-only measurements and unknown target maneuvers. Unlike previous research that estimates the unknown maneuvers with computationally burdensome artificial intelligence approaches, the proposed algorithm treats the maneuvers as colored noise, while a double line-of-sight scheme is developed to overcome the problems of colored noise and weak observability resulting from angles-only measurements. It can potentially provide a feasible solution to the space game problem. The rest of the paper is organized as follows. The relative motion dynamics for the game participants are presented in Section 2. The measurement models for three observation schemes are established in Section 3, followed by the observability analysis for the states in Section 4. The basic theory of the differential game control of pursuit-evasion is reviewed in Section 5. The space pursuit-evasion game control algorithm based on incomplete information is designed in Section 6. Numerical simulations with the performance index and simulation parameters are presented in Section 7. Conclusions are presented in Section 8.

Relative Dynamics Model
The two participants in a two-player spacecraft pursuit-evasion (PE) game are called the Pursuer and the Evader, respectively. Typically, the objective of the Pursuer is to intercept/rendezvous with the Evader and the objective of the Evader is to avoid or delay the interception/rendezvous. To describe the pursuit-evasion game between the Pursuer and Evader, a rotating local vertical local horizontal (LVLH) reference frame is adopted. The origin of the LVLH frame is collocated with a virtual Chief, as shown in Figure 1, where the axes are aligned with the inertial position vector (x axis or radial), the normal to the orbit plane (z axis or cross track) and the along-track direction (y axis completes the orthogonal set).
Let the relative orbit state be $x = [r^T, v^T]^T$, where the superscript T stands for transposition. Vectors without a superscript are assumed to be expressed in the LVLH frame.

Then, under the assumptions of a near-circular orbit, two-body motion, and a range between the virtual Chief and the participants in the game that is small compared to the radial distance to the center of the Earth, the relative motion of each participant with respect to the virtual Chief is governed by the well-known Hill-Clohessy-Wiltshire (HCW) equation [39], Equation (1), where the subscripts p and e stand for Pursuer and Evader, respectively, n is the orbital rate of the virtual Chief and u is the control acceleration applied to the participants along the three axes of the LVLH frame.
Then, let the state of the Evader relative to the Pursuer be $x = x_e - x_p$ in the defined LVLH frame; its dynamics can be obtained from Equation (1).
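The relative dynamics described above can be illustrated with a minimal sketch. It assumes the standard component form of the HCW equations for the axis convention stated in this section (x radial, y along-track, z cross-track), and it uses the linearity of the HCW model so that the Evader-minus-Pursuer state is driven by the difference of the two control accelerations. The function names, the fixed-step RK4 integrator and the numerical values are illustrative assumptions, not the paper's implementation.

```python
def hcw_derivative(state, accel, n):
    """Time derivative of an HCW state [x, y, z, vx, vy, vz].

    Assumed standard form (x radial, y along-track, z cross-track):
        x'' = 3 n^2 x + 2 n y' + a_x
        y'' = -2 n x'          + a_y
        z'' = -n^2 z           + a_z
    """
    x, y, z, vx, vy, vz = state
    ax, ay, az = accel
    return [
        vx,
        vy,
        vz,
        3.0 * n * n * x + 2.0 * n * vy + ax,
        -2.0 * n * vx + ay,
        -n * n * z + az,
    ]


def relative_derivative(state, u_p, u_e, n):
    """Derivative of the relative state x = x_e - x_p.

    Because the HCW model is linear, the relative state obeys the same
    equations, driven by the acceleration difference u_e - u_p.
    """
    accel = [ue - up for ue, up in zip(u_e, u_p)]
    return hcw_derivative(state, accel, n)


def rk4_step(state, u_p, u_e, n, dt):
    """One classical Runge-Kutta step with controls held constant over dt."""
    def f(s):
        return relative_derivative(s, u_p, u_e, n)

    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]
```

A pure along-track offset with zero relative velocity is an equilibrium of these equations, so it makes a convenient sanity check: stepping such a state with equal Pursuer and Evader accelerations should leave it unchanged.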

Measurement Models
The relative motion geometry between the Evader and Pursuer in the LVLH frame is shown in Figure 2, where the measurements observed by the Pursuer are generally assumed to be the line-of-sight (LOS) angles and relative range. Three observation schemes, i.e., single LOS sensor, LOS range sensor and double LOS sensors, are discussed in the following sections.

Single LOS Sensor Measurement
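When the LOS angles are measured by the only passive camera available to the Pursuer, the observation can be modeled as follows: $\alpha = \arctan(y/x)$, $\beta = \arctan\left(z/\sqrt{x^2+y^2}\right)$ (Equation (5)), where α and β are the azimuth and pitch angles, respectively, and $[x\ y\ z]^T$ is the position of the Evader relative to the Pursuer.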
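As a small illustration of the single LOS observation, the sketch below computes the azimuth and pitch angles from a relative position. Using `atan2` instead of a plain arctangent is an assumption made here for quadrant safety; the function name and the sample positions are hypothetical.

```python
import math


def los_angles(r):
    """Return (azimuth alpha, pitch beta) in radians for a relative position r = (x, y, z)."""
    x, y, z = r
    alpha = math.atan2(y, x)                  # azimuth in the x-y plane
    beta = math.atan2(z, math.hypot(x, y))    # pitch above the x-y plane
    return alpha, beta


# Two hypothetical positions on the same ray from the Pursuer.
near = (30.0, -40.0, 12.0)
far = tuple(10.0 * c for c in near)
angles_near = los_angles(near)
angles_far = los_angles(far)
```

Both positions produce the same pair of angles, which is the reason a single angles-only measurement carries no direct range information.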

LOS Range Sensor Measurement
With an active sensor such as a radar/lidar on board, both the LOS angles and the range can be measured. Then, the observation model consists of the LOS angles α and β of Equation (5) together with the range $\rho = \sqrt{x^2+y^2+z^2}$ (Equation (6)), where ρ refers to the distance between the Pursuer and Evader.

Double LOS Sensor Measurement
When two or more passive optical sensors (e.g., two cameras) can be used to measure the LOS, as shown in Figure 3, the observation model is given by Equation (7), where the subscripts 1 and 2 label the sensors, $[x_1\ y_1\ z_1]^T$ and $[x_2\ y_2\ z_2]^T$ stand for the relative position from the Evader to the cameras, respectively, and the vector R represents the baseline between the two cameras, which is supposed to be known and can be calculated from the locations of the cameras. Accordingly, when the observation model shown in Equation (7) is used, the system state X for the estimation is switched to a 12-dimensional vector (Equation (8)) composed of $x_{e1}$ and $x_{e2}$, which refer to the state of the Evader relative to the two cameras of the Pursuer.

Observability Analysis
Conceptually, the system is observable if the relative state can be uniquely determined from the time history of the measurements. By contrast, the system is unobservable if more than one set of states shares the same measurement time history. The goal of this section is to mathematically analyze the observability of the system for the three measurement modes based on the method presented in Ref. [2].

Observability Analysis in the Case of Single LOS Sensor Measurement
The observation equations can be transformed by means of "Analogous Linearization" [40]. Taking the tangent of the LOS angles α and β in Equation (5) and simplifying yields Equation (9): $x\sin\alpha - y\cos\alpha = 0$ and $y\sin\beta - z\sin\alpha\cos\beta = 0$. By reorganizing Equation (9), a homogeneous linear equation is obtained: $h_A(Z)\,x = 0$, where $h_A(Z) = \begin{bmatrix} \sin\alpha & -\cos\alpha & 0 & 0_{1\times3} \\ 0 & \sin\beta & -\sin\alpha\cos\beta & 0_{1\times3} \end{bmatrix}$. Obviously, the rank of $h_A(Z)$ is 2, so the 6-dimensional state x cannot be uniquely solved from one set of measurements. Theoretically, at least three sets of measurements are required to solve x uniquely. With three sets of measurements, the homogeneous equations of the three epochs can be stacked into the linear system of Equation (12), which is expressed in terms of the initial relative state $x_0$. The state $x_k$ at epoch $t_k$ can be obtained from the state transition equation, which is derived from the solution of the HCW equation, as follows: $x_k = \varphi\,x_{k-1} + G\,u_{k-1} + \omega_{k-1}$ (Equation (13)), where φ is the state transition matrix, G is the control driven matrix, and $\omega_{k-1}$ is the noise that originates from the maneuver of the Evader and the error of the HCW equation. Substituting Equation (13) into Equation (12) and rearranging gives Equation (14). Equation (14) is a non-homogeneous linear equation if the maneuver of the Pursuer is non-zero, where the control vector of the Pursuer u is a non-linear function of the state vector x. Then, the initial relative state vector can be solved when $\omega_{k-1}$ is known, so the system is observable if the maneuver of the Evader is known. However, the colored noise $\omega_{k-1}$ (the Evader's maneuver) is unknown, so even if the system is observable, the solution is contaminated, which decreases its accuracy.
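The rank deficiency discussed above can be checked numerically with a short sketch. It builds a 2×6 matrix from one pair of LOS angles, following the homogeneous form obtained from the tangent relations, and shows that two different states satisfy the same constraint. The matrix layout, function names and sample values are assumptions for illustration only.

```python
import math


def los_angles(r):
    """Azimuth and pitch (radians) of a relative position r = (x, y, z)."""
    x, y, z = r
    return math.atan2(y, x), math.atan2(z, math.hypot(x, y))


def h_a(alpha, beta):
    """2x6 matrix of the homogeneous LOS constraint h_A(Z) x = 0.

    Row 1: x sin(alpha) - y cos(alpha) = 0
    Row 2: y sin(beta) - z sin(alpha) cos(beta) = 0
    The velocity columns are zero, so velocity is unconstrained.
    """
    return [
        [math.sin(alpha), -math.cos(alpha), 0.0, 0.0, 0.0, 0.0],
        [0.0, math.sin(beta), -math.sin(alpha) * math.cos(beta), 0.0, 0.0, 0.0],
    ]


def residual(h, state):
    """Matrix-vector product h @ state using plain lists."""
    return [sum(hij * sj for hij, sj in zip(row, state)) for row in h]


position = (300.0, -400.0, 120.0)
h = h_a(*los_angles(position))

# The true state and a different state (scaled range, arbitrary velocity)
# both satisfy the same two equations.
true_state = list(position) + [0.1, -0.2, 0.05]
other_state = [2.5 * c for c in position] + [9.0, 9.0, 9.0]
res_true = residual(h, true_state)
res_other = residual(h, other_state)
```

Both residuals vanish to numerical precision, so a single measurement set cannot distinguish the two states; more epochs and a non-zero Pursuer maneuver are needed, as argued in the text.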

Observability Analysis with the LOS Range Measurement
When the distance measurement is added to the observation, the relative position of the Evader can be calculated as $r = \rho\,[\cos\beta\cos\alpha,\ \cos\beta\sin\alpha,\ \sin\beta]^T$ (Equation (15)). The converted observation is then given by Equation (16), which defines the observation matrix $h_r(Z)$. Similarly, the rank of $h_r(Z)$ is 3 and the dimension of the state x is 6, so at least two sets of measurements are required to solve for x. Then, based on Equations (13) and (16), Equation (19) can be obtained. Equation (19) is a non-homogeneous linear equation, and the system is observable if the maneuver of the Evader is known. Different from the single LOS sensor measurement case, Equation (19) contains the component $[h_r^T(Z_0)\ \ h_r^T(Z_1)]^T$, which means that the filter is more stable in the case of joint measurement with LOS and range sensors.
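The conversion from LOS angles and range to an equivalent position measurement can be sketched as below. The spherical-to-Cartesian mapping follows from the azimuth/pitch definitions used for the LOS angles; the function names and the sample position are hypothetical.

```python
import math


def measure_los_range(r):
    """Return (alpha, beta, rho) for a relative position r = (x, y, z)."""
    x, y, z = r
    alpha = math.atan2(y, x)
    beta = math.atan2(z, math.hypot(x, y))
    rho = math.sqrt(x * x + y * y + z * z)
    return alpha, beta, rho


def position_from_los_range(alpha, beta, rho):
    """Equivalent position measurement rebuilt from the angles and the range."""
    return (
        rho * math.cos(beta) * math.cos(alpha),
        rho * math.cos(beta) * math.sin(alpha),
        rho * math.sin(beta),
    )


truth = (300.0, -400.0, 120.0)
rebuilt = position_from_los_range(*measure_los_range(truth))
```

With noise-free measurements the rebuilt position matches the true one, which is why the LOS range scheme observes all three position components at every epoch.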

Observability Analysis with Double LOS Measurements
When the binocular camera is used, the observation model of Equation (7) applies. Similar to the observability analysis of a single camera measurement, Equation (20) can be obtained and written in the matrix form of Equation (21), where the rank of $H_{DA}(Z)$ is 6; thus, the 12-dimensional state X cannot be determined uniquely from Equation (21) alone. The state transition equation of X, Equation (22), follows from Equation (13). Based on Equations (21) and (22), the combined equation for two epochs can be obtained when two sets of measurements are available. Under the measurement of double LOS sensors, the filter converges strongly owing to the measurement component made up of $R$, $R_1$, and $R_2$, and the system remains observable with limited colored noise. Table 1 compares the three measurement methods.
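The benefit of a known baseline can be illustrated with a simple two-ray triangulation. This is a geometric sketch rather than the filter-based estimation used in the paper: it assumes camera 1 sits at the origin, camera 2 sits at the baseline vector, each LOS points from a camera toward the Evader, and the two rays are intersected in the least-squares sense. All names, conventions and values are hypothetical.

```python
import math


def los_angles(r):
    """Azimuth and pitch (radians) of a relative position r = (x, y, z)."""
    x, y, z = r
    return math.atan2(y, x), math.atan2(z, math.hypot(x, y))


def unit_los(alpha, beta):
    """Unit LOS vector for the given azimuth and pitch."""
    return (
        math.cos(beta) * math.cos(alpha),
        math.cos(beta) * math.sin(alpha),
        math.sin(beta),
    )


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def triangulate(angles1, angles2, baseline):
    """Least-squares intersection of two LOS rays.

    Solves s1*d1 - s2*d2 = baseline for the ranges s1, s2 via the 2x2
    normal equations and returns the midpoint of the two closest points,
    expressed relative to camera 1.
    """
    d1 = unit_los(*angles1)
    d2 = unit_los(*angles2)
    a11, a12, a22 = dot(d1, d1), -dot(d1, d2), dot(d2, d2)
    b1, b2 = dot(d1, baseline), -dot(d2, baseline)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("LOS rays are parallel; range cannot be recovered")
    s1 = (b1 * a22 - a12 * b2) / det
    s2 = (a11 * b2 - a12 * b1) / det
    p1 = [s1 * c for c in d1]
    p2 = [bc + s2 * c for bc, c in zip(baseline, d2)]
    return [0.5 * (u + v) for u, v in zip(p1, p2)]


target = (300.0, -400.0, 120.0)     # Evader relative to camera 1
baseline = (0.0, 2.0, 0.0)          # camera 2 relative to camera 1
seen_from_2 = tuple(t - b for t, b in zip(target, baseline))
estimate = triangulate(los_angles(target), los_angles(seen_from_2), baseline)
```

The determinant shrinks as the two rays become nearly parallel, so a short baseline relative to the range makes the recovered position sensitive to angle noise; this is consistent with the residual estimation error reported for the double LOS scheme.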

Review of Differential Game Control Theory
In pursuit-evasion games, the cost function $J_i$ serves as the basis for the decision-making of each participant, where the subscript i stands for the Pursuer p or the Evader e. Both parties involved in the game make self-interested control decisions, so the well-known saddle-point inequality [41] holds for the optimal controls $u_p^*$ and $u_e^*$ of the Pursuer and Evader, respectively. When $J_p + J_e = 0$, the problem is a so-called zero-sum differential game. The linear quadratic differential game is widely studied; its cost function is composed of the terminal error $\frac{1}{2}x^T(t_f)Sx(t_f)$, the integrated process error $\frac{1}{2}\int_{t_0}^{t_f}x^T(t)Qx(t)\,dt$ and the fuel consumption $\frac{1}{2}\int_{t_0}^{t_f}u^T(t)Ru(t)\,dt$, and it is studied in this paper in this form, where S and Q are symmetric positive semi-definite matrices and $R_p$ and $R_e$ are symmetric positive definite matrices. Based on Equations (27)-(30), the optimal control strategies of both parties satisfy the inequality of Equation (31). A Hamiltonian function is then defined, and the optimal control laws of both parties follow from its stationarity with respect to $u_p$ and $u_e$. In these laws, the costate is $\lambda = Px$, and P is obtained from a Riccati differential equation. When $t_f = \infty$, the cost function in Equation (26) has only integral terms, and the matrix P in the optimal control laws $u_p^*$ and $u_e^*$ is solved from an algebraic Riccati equation.
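For orientation, the infinite-horizon case described above can be sketched in the standard textbook form of a zero-sum linear quadratic differential game. The sign conventions below (the Pursuer minimizes, the Evader maximizes, and the relative dynamics are written with separate input matrices $B_p$ and $B_e$) are assumptions, and the paper's own numbered equations may be arranged differently.

```latex
\begin{align}
\dot{x} &= A x + B_p u_p + B_e u_e, \\
J &= \frac{1}{2}\int_{t_0}^{\infty}
     \left( x^T Q x + u_p^T R_p u_p - u_e^T R_e u_e \right) dt, \\
H &= \frac{1}{2}\left( x^T Q x + u_p^T R_p u_p - u_e^T R_e u_e \right)
     + \lambda^T \left( A x + B_p u_p + B_e u_e \right), \\
\frac{\partial H}{\partial u_p} &= 0 \;\Rightarrow\; u_p^* = -R_p^{-1} B_p^T P x,
\qquad
\frac{\partial H}{\partial u_e} = 0 \;\Rightarrow\; u_e^* = R_e^{-1} B_e^T P x, \\
0 &= P A + A^T P + Q
     - P \left( B_p R_p^{-1} B_p^T - B_e R_e^{-1} B_e^T \right) P .
\end{align}
```

Substituting $\lambda = Px$ and the two control laws into the costate equation $\dot{\lambda} = -\partial H/\partial x$ with constant P gives the last line, which is how the algebraic Riccati equation arises in this convention.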

Control of Incomplete Information Pursuit-Evasion Games
In the previous section, the optimal control law based on linear quadratic differential game with complete information is established, which is, in essence, used to solve the saddle-point control problem based on the Nash equilibrium hypothesis [42]. The optimal control law discussed above has good applicability to non-cooperative targets without maneuverability and cooperative targets. However, the pursuit-evasion game strategy of space non-cooperative targets is unknown and uncertain in reality. Thus, the optimal control law discussed above will be invalid if the game is in the incomplete information condition. Therefore, based on the complete information game control law, solving the problem of redesigning the game control law in the incomplete information condition is discussed in the following section.

Degradation of Pursuit-Evasion Games
The control strategy of space non-cooperative targets is unknown because of: (1) The cost function of the non-cooperative target is not known, and its cost function is not necessarily the same as the form discussed above. (2) The weight matrix of the cost function is not known, that is, even if the non-cooperative target adopts the cost function as the form discussed above, its weight matrix is not necessarily known.
Therefore, the maneuver strategy of the target is not modeled in this paper; instead, the maneuver is treated as colored noise in deriving the game control law. With this treatment, the incomplete information pursuit-evasion game degenerates into an optimal control problem. Thus, in the dynamic model of Equation (4), the Evader's control term is replaced by $\omega_e$, the colored noise resulting from the maneuver of the Evader. The cost function in Equation (29), the Hamiltonian function and the Riccati equation for P reduce accordingly to forms that involve only the Pursuer's control, which gives the optimal control of the Pursuer, Equation (46). Obtaining the optimal control law of the Pursuer requires knowledge of the relative state vector of the Evader with respect to the Pursuer, as Equation (46) shows. When the Evader's maneuver is treated as colored noise, an accurate relative state vector cannot be obtained from the relative dynamic model, so the state of the Evader needs to be extracted from the observation information. Thus, an extended Kalman filter is used to obtain the estimated value of the relative state x.
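The degenerated problem can be sketched in the form of a standard infinite-horizon linear quadratic regulator driven by a disturbance. The notation below is an assumption: $\hat{x}$ denotes the EKF estimate of the relative state, and the exact arrangement of the paper's own equations may differ.

```latex
\begin{align}
\dot{x} &= A x + B_p u_p + \omega_e, \\
J_p &= \frac{1}{2}\int_{t_0}^{\infty}
       \left( x^T Q x + u_p^T R_p u_p \right) dt, \\
u_p^* &= -R_p^{-1} B_p^T P \hat{x}, \\
0 &= P A + A^T P + Q - P B_p R_p^{-1} B_p^T P .
\end{align}
```

In this sketch the Evader's weight $R_e$ no longer appears, so the Pursuer's gain can be computed without any knowledge of the Evader's cost function; the price is that the Evader's maneuver acts as an unmodeled disturbance on both the dynamics and the filter.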

Control Restrictions
In practice, the maneuverability of the satellite is limited, which means that the thruster output is limited. Therefore, the control law derived above cannot be used directly in engineering, and Pontryagin's principle [43] is used to solve the problem. Normally, the weight matrix $R_p$ in the cost function is taken as $K_R I$, where $K_R$ is a scalar and I is an identity matrix. The Hamiltonian function for the limited-control case can then be obtained from Equation (45) by writing $u_p = U_p e_p$, where $U_p$ and $e_p$ are the amplitude and unit direction vector of $u_p$, respectively. From Equation (49), the optimal direction is $e_{p0} = -B_p^T\lambda / \lVert \lambda^T B_p \rVert$. From Equation (34), the unconstrained optimal control is $u_p = -B_p^T\lambda / K_R$ (Equation (54)). When $\lVert \lambda^T B_p \rVert / K_R > U_{pmax}$, Equation (54) cannot be used directly, and Pontryagin's principle gives the saturated control $u_p = U_{pmax}\,e_{p0}$ instead.
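The thrust-limited law can be sketched as a small function. It assumes the weight $R_p = K_R I$ stated in the text, takes the 3-vector $B_p^T\lambda$ as its input, and switches between the unconstrained command and a saturated command along the same direction. The function name and the zero-vector guard are illustrative assumptions.

```python
import math


def limited_control(bt_lambda, k_r, u_max):
    """Thrust-limited Pursuer control.

    bt_lambda : 3-vector B_p^T * lambda
    k_r       : scalar weight, with R_p = k_r * I
    u_max     : maximum control acceleration amplitude
    """
    norm = math.sqrt(sum(c * c for c in bt_lambda))
    if norm == 0.0:
        # No preferred direction; command zero acceleration.
        return [0.0, 0.0, 0.0]
    if norm / k_r <= u_max:
        # Unconstrained optimum: u = -B_p^T lambda / K_R
        return [-c / k_r for c in bt_lambda]
    # Saturated: full amplitude along e_p0 = -B_p^T lambda / ||B_p^T lambda||
    return [-u_max * c / norm for c in bt_lambda]


inside = limited_control([3.0, 4.0, 0.0], 10.0, 1.0)     # amplitude 0.5, below the limit
outside = limited_control([30.0, 40.0, 0.0], 10.0, 1.0)  # amplitude 5.0, clipped to 1.0
```

The direction of the command is the same in both branches; only the amplitude is clipped, which matches the bang-type behaviour expected from Pontryagin's principle when the unconstrained amplitude exceeds the limit.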

Numerical Simulations
The simulation framework of the space pursuit-evasion game for a near-circular orbit target was established in a MATLAB (version 2020b) environment. The entire architecture of the method proposed in this paper is shown in Figure 4. The key parameters for the simulation are shown in Tables 2 and 3. Because the HCW equation is adopted in this paper, the orbit of the virtual Chief must be nearly circular. In other words, the eccentricity of the virtual Chief's orbit should be very small.

Table 3. Other parameter settings.
Parameter: Value
Pursuer's initial relative state: $x_{p0}$
Model error covariance matrix without maneuver limit: $I_{6\times6} \times 10^{-4}$
Model error covariance matrix with maneuver limit: $I_{6\times6}$
In the simulations, the Evader adopts the game control law using complete information, which theoretically represents its optimal control in the game. In other words, if the Pursuer can still track and approach the Evader using incomplete information under the extreme condition in which the Evader fully knows the Pursuer's maneuver strategy and performs the optimal escape control, then the pursuit mission can also be completed in other, easier conditions.

Single LOS Measurement Case
First, the optimal control law obtained by the Evader using complete information is analyzed, where the Pursuer's maneuvering weight matrix is $R_p = I_{3\times3} \times 10^{9}$, and the Evader's maneuvering weight matrix is $R_e = 1.6R_p$, $2R_p$, $5R_p$, $10R_p$ or $\infty R_p$, respectively, to verify the effectiveness of the algorithm for different levels of Evader maneuverability, as shown in Figure 5. When $R_e = \infty R_p$ (the Evader does not maneuver), the Pursuer can approach the Evader, as the observability analysis indicated. However, when the Pursuer is sufficiently close to the Evader, the Pursuer's maneuver becomes small, which leads to non-observability of the system, so the Pursuer moves away from the Evader, as shown in Figure 5. When the Evader maneuvers, which acts as colored noise in the filter, the Pursuer can get close to the Evader, but the estimation error cannot be eliminated, as shown in Figure 5; thus, the Pursuer again moves away from the Evader. Therefore, treating the Evader's maneuver as colored noise is not suitable for an incomplete information game with a single LOS measurement.

LOS Range Measurement Case
When angle and distance measurements are used for observation, the equivalent position measurement can be obtained through numerical calculation. The mean value and standard deviation of the measurement error are shown in Figure 6. Figure 6 also shows the Pursuer's measurement precision for a 1 km range using the accuracy from Table 3. The equivalent position measurement error is 1.2 m (maximum).
In this paper, only the case with limited maneuvering capabilities of the Pursuer and Evader, due to the maximum thrust limit, is discussed. First, the Pursuer and Evader use the same maneuver limit, $U_{pmax} = U_{emax} = 1\ \mathrm{m/s^2}$. The Pursuer's maneuvering weight matrix is $R_p = I_{3\times3} \times 10^{5}$, while the Evader uses different maneuvering weight matrices, i.e., $R_e = 2R_p$, $2.5R_p$, $3.74287R_p$, $3.74288R_p$, $5R_p$ or $\infty R_p$, as shown in Figure 7.
$R_e = 3.74288R_p$ is the boundary beyond which the Pursuer can approach the Evader when the LOS range measurement is used. Figure 8 shows the case where $U_{emax} = 0.8\ \mathrm{m/s^2}$, and the Evader takes the maneuvering weight matrices $R_e = 1.6R_p$, $2R_p$ or $2.5R_p$, respectively.

When the maneuver limits of the Pursuer and Evader are different, the Pursuer can gradually get close to the Evader if the limit of the Evader is smaller than that of the Pursuer, e.g., when $R_e = 1.6R_p$. However, the control strategy of the Evader is not considered when designing the control of the Pursuer, so the Pursuer cannot catch the Evader when $R_e = 1.6R_p$: the Pursuer can no longer close the distance once its maneuvering amplitude falls below the limit, so a relatively stable distance between the Pursuer and the Evader persists during the game.

Double LOS Measurement Case
As in the previous section, the Pursuer and Evader first use the same maneuver limit, i.e., $U_{pmax} = U_{emax} = 1\ \mathrm{m/s^2}$. The Pursuer's maneuvering weight matrix is $R_p = I_{3\times3} \times 10^{5}$, while the Evader uses different maneuvering weight matrices, i.e., $R_e = 2R_p$, $2.5R_p$, $3.825R_p$, $3.826R_p$, $5R_p$ or $\infty R_p$, as shown in Figure 9.
Figure 9. Distance between the Evader and Pursuer when the Evader and Pursuer use the same maneuver limit while the Evader uses different maneuver weights.
$R_e = 3.826R_p$ is the boundary beyond which the Pursuer can approach the Evader when the double LOS measurement is used. Figure 10 shows the case where the Pursuer and Evader use different maneuver limits, i.e., $U_{pmax} = 1\ \mathrm{m/s^2}$, $U_{emax} = 0.8\ \mathrm{m/s^2}$.
Similar to the previous section, the Pursuer cannot catch the Evader when $R_e$ is small, such as $R_e = 1.6R_p$ and $2R_p$, because the control strategy of the Evader is not considered when designing the control of the Pursuer. However, when $R_e$ is large enough, e.g., $R_e = 2.5R_p$, the Pursuer can catch the Evader.

Conclusions
To solve the incomplete information game problem with a space non-cooperative target, this paper studied an optimal control algorithm based on differential game theory in which the unknown maneuver of the Evader is treated as colored noise. An EKF was used to obtain the Evader's relative state; the observability analysis for the different measurement methods and its influence on the proposed algorithm are given in the fourth section. Numerical simulations were conducted to verify the proposed algorithm using different measurement models. The following conclusions are obtained: (1) The measurement method has a great influence on the algorithm proposed in this paper. When the single angle measurement is used, the Pursuer can approach the Evader using observation information at the beginning, but the chasing process cannot be maintained because of weak observability. However, the Pursuer can approach the Evader when LOS range or double LOS sensor measurements are used. (2) Some position estimation error remains, although observability is improved by adding the distance measurement or by using the double LOS sensor measurement, as shown in Figure 11. Thus, the Pursuer cannot catch the Evader when $R_p < R_e < 3.74288R_p$ in the LOS range measurement case, or when $R_p < R_e < 3.826R_p$ in the double LOS measurement case. The critical value of $R_e$ with which the Pursuer can catch the Evader is smaller if $U_{emax} < U_{pmax}$. (3) The essence of the method proposed in this paper is that the Pursuer seeks the optimal control for approaching the Evader under the assumption that the Evader's maneuverability is lower than that of the Pursuer.