1. Introduction
As space technology advances rapidly and space launches become more frequent, space debris, inactive satellites, and other non-cooperative objects pose significant risks to space operations. Traditional collision risk assessment methods primarily rely on relative distance to assess potential risks [1]. However, using relative distance as a single evaluation metric overlooks the relative motion between objects, making it difficult to predict the orbital changes of non-cooperative targets and accurately anticipate collision risks. Consequently, there is a pressing demand for more robust and multifaceted threat evaluation frameworks, particularly those incorporating intention recognition technology, to achieve a comprehensive understanding of motion states and behavioral intentions. This will improve the accuracy of early warning information, reduce false positives, and provide timely support for spacecraft decision-making.
Intention recognition involves predicting a target's future behavior by analyzing its motion patterns and surrounding environmental factors [2]. Notable progress in this technology has been observed in areas such as autonomous driving, human–computer interaction, and collaborative operations. Common techniques include both traditional methods and modern approaches based on artificial intelligence (AI).
For instance, Tsogas et al. [3] integrated Dempster–Shafer evidence theory with multi-source information to accurately identify vehicle maneuver types and predict driver intentions. Guanglei et al. [4] employed support vector machines to forecast attack intentions in multi-aircraft collaborative air combat. Li [5] introduced an innovative algorithm that combined hidden Markov models with Bayesian filtering techniques to infer drivers' lane change intentions. Tahboub [6] proposed a model for human–machine interaction that utilized dynamic Bayesian networks, enabling intelligent machines to infer human intentions and interactive driving behaviors in response to the surrounding environment. However, traditional methods typically depend on extensive domain expertise, such as quantifying feature weights and determining prior probabilities, which restricts their applicability in complex environments.
To overcome the constraints of traditional techniques, research on intention prediction has progressively transitioned to AI-based modern techniques. Deep learning approaches minimize the dependence on manually extracted features, facilitating the automatic identification and extraction of critical features and thereby enhancing adaptability in complex environments. Le-Hong and Le [7] emphasized the advantages of convolutional neural networks (CNNs) in semantic recognition tasks. Kim et al. [8] enhanced the reliability of human–machine interaction through the use of deep CNNs (DCNNs). Cao et al. [8] proposed a method for recognizing the intentions of air targets that combined knowledge graphs and deep learning, effectively addressing challenges such as inadequate feature extraction and misclassification.
In time series prediction, recurrent neural networks (RNNs) and their variants, including long short-term memory (LSTM) networks and gated recurrent units (GRUs), have shown remarkable success, particularly in fields characterized by significant temporal dependencies, such as trajectory tracking and maneuver prediction [9,10,11]. For example, Choi et al. [12] proposed a future trajectory prediction framework that integrated RNNs with inverse reinforcement learning. Tang et al. [13] employed multiple LSTMs to predict temporal behaviors in vehicle lane changing processes. Yoon et al. [14] enhanced trajectory prediction accuracy by combining variational autoencoders (VAEs) with GRUs. Additionally, Vaswani et al. [15] introduced an attention mechanism that improved the processing of long-sequence information by reducing dependence on external data, establishing it as a mainstream method for sequential feature learning. Dong et al. [16] presented a spatiotemporal transformer-based model for airspace trajectory prediction. Chen et al. [17] developed an efficient multimodal vehicle trajectory prediction approach by integrating graph attention with temporal attention mechanisms. Teng et al. [18] introduced a bidirectional gated recurrent unit with attention (BiGRU-Attention) model for recognizing air target tactical intentions, significantly improving the accuracy of dynamic air target tactical intention recognition.
In the field of intention recognition for space non-cooperative targets, existing research mainly focuses on identifying the orbital behaviors and intentions of relative motion between spacecraft. Zhang et al. [19] introduced a non-cooperative target intention inference model based on a bidirectional gated recurrent unit with self-attention (BiGRU-SA), using real-time measurement values as features to predict the intentions of space non-cooperative targets. Sun and Dang [20] investigated potential geometric configurations of relative motion for these targets based on the Clohessy–Wiltshire (C-W) equations, defined the associated intentions, and employed backpropagation to train a deep neural network (DNN) on datasets for rapid and precise recognition of non-cooperative target intentions. However, the above studies tend to over-categorize orbital behaviors, focusing excessively on minor differences in actions while overlooking the possibility that such actions may share the same underlying intent, which reduces the accuracy of intention recognition. Moreover, these studies rely on idealized datasets and intricate intention definitions, which complicates their direct application to practical space exploration tasks.
Therefore, this study seeks to address the limitations of the aforementioned methods in practical applications and introduces a model for recognizing the intent of space non-cooperative targets utilizing long temporal sequence data. First, in response to the drawbacks of traditional intent recognition approaches, this paper redefines the non-cooperative target intent recognition problem as a classification task focused on time series feature learning. It defines a set of intentions for space non-cooperative targets, incorporating constraints such as illumination conditions and detection payloads, thereby streamlining the classification of intent types. Second, considering the local feature extraction capability of convolutional neural networks (CNNs), the temporal sequence modeling ability of long short-term memory (LSTM) networks, and the long temporal sequence information capturing ability of self-attention mechanisms, this paper proposes a CNN-LSTM-SA hybrid neural network model. This model can efficiently extract features from long temporal sequence data and perform classification learning, enabling real-time non-cooperative target intent recognition. Finally, through comparative experiments, the paper demonstrates the significant benefits of the proposed method, particularly in terms of accuracy, real-time performance, and robustness.
The organization of this paper is outlined as follows.
Section 2 presents the mathematical formulation of the intention recognition problem and defines the non-cooperative target intentions.
Section 3 details the implementation of the CNN-LSTM-SA model.
Section 4 evaluates the model’s performance and confirms the efficacy of the proposed approach through comparative experiments.
Section 5 provides the conclusion.
2. Formulation and Definition of Non-Cooperative Target Intentions
This section first transforms intention recognition into a long temporal sequence feature classification problem. It then analyzes the motion patterns of space non-cooperative targets under the constraints of relative orbital dynamics and, on this basis, inductively classifies different intentions and their corresponding orbital behaviors. Finally, it defines the constraints for the temporal information dataset in alignment with practical conditions.
2.1. Mathematical Formulation of Intention Recognition
Orbital dynamics govern the relative motion between spacecraft and non-cooperative space targets, which imposes constraints on their movement patterns. Consequently, their intentions are often reflected in their orbital behaviors. Identifying the intentions of space non-cooperative targets is thus a typical multi-class classification problem, and the process can be framed as the conversion of time series feature data into the corresponding intention category.
Let $I$ denote the intention space of non-cooperative targets in space, and let $s_t$ represent the characteristic information at time $t$. In real-time, complex, and uncertain threat prediction or evasion tasks, determining a target's intention based solely on perception information at a single moment has inherent limitations and may result in false predictions. Consequently, it is essential to analyze and infer the temporal variation of features over continuous time to enhance task accuracy.
Let $S = \{s_1, s_2, \ldots, s_T\}$ be the time series feature set of non-cooperative targets in space, comprising discrete-time observations of continuous variables over $T$ consecutive time steps, from $t_1$ to $t_T$. The target intention recognition task can then be represented as a mapping function $f$ that maps the time series feature set $S$ to the intention space $I$, mathematically expressed as follows:
$$f: \{s_1, s_2, \ldots, s_T\} \mapsto I$$
Considering the complexity and unpredictability associated with space avoidance missions, comprehensively describing them through simple mathematical formulas is challenging. To overcome this limitation, this paper proposes the use of a CNN-LSTM-SA network architecture, trained on a spatial target intention dataset, to model the implicit relationship between orbital sequential features and target intentions. This approach facilitates robust and accurate intention recognition.
2.2. Relative Orbital Motion Dynamics Model
A local vertical local horizontal (LVLH) coordinate system $O\text{-}XYZ$ is established to describe the relative motion between spacecraft, with the center of mass of the mission spacecraft serving as the origin, as depicted in Figure 1. The $XOY$ plane corresponds to the orbital plane of the mission spacecraft, with the $X$ axis pointing from the Earth's center to the spacecraft's center of mass ($O$), the $Z$ axis aligning with the spacecraft's angular momentum direction, and the $Y$ axis determined by applying the right-hand rule.
Assuming that both the target spacecraft and the mission spacecraft operate in circular or near-circular orbits, the gravitational difference between the two spacecraft can be linearized. This assumption allows for the derivation of the equations governing the relative motion of the target spacecraft within the LVLH coordinate system of the mission spacecraft:
$$\ddot{x} = 2n\dot{y} + 3n^2 x, \qquad \ddot{y} = -2n\dot{x}, \qquad \ddot{z} = -n^2 z$$
In these equations, $n$ denotes the orbital angular velocity of the mission spacecraft, while $x$, $y$, and $z$ represent the position of the non-cooperative target's center of mass within the LVLH coordinate system. Additionally, $\dot{x}$ and $\dot{y}$ represent the velocities of the non-cooperative target along the $X$ and $Y$ axes, respectively.
It should be noted that the in-plane variables $x$ and $y$ and the out-of-plane variable $z$ exhibit decoupled motion. This study primarily focuses on the relative motion within the orbital plane $OXY$ of the reference spacecraft.
As a system of linear equations, the solution within the $OXY$ plane can be represented as follows:
$$\mathbf{X}(t) = \Phi(t)\,\mathbf{X}_0$$
where
$$\Phi(t) = \begin{bmatrix} 4 - 3\cos nt & 0 & \dfrac{\sin nt}{n} & \dfrac{2(1 - \cos nt)}{n} \\ 6(\sin nt - nt) & 1 & \dfrac{2(\cos nt - 1)}{n} & \dfrac{4\sin nt - 3nt}{n} \\ 3n\sin nt & 0 & \cos nt & 2\sin nt \\ 6n(\cos nt - 1) & 0 & -2\sin nt & 4\cos nt - 3 \end{bmatrix}$$
Additionally, $\mathbf{X}_0 = [x_0, y_0, \dot{x}_0, \dot{y}_0]^{\mathrm{T}}$ represents the state of the non-cooperative target at the initial moment $t_0$ in the LVLH coordinate system.
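To make the closed-form solution concrete, the following sketch propagates an in-plane relative state with the state-transition matrix above. It is a minimal illustration only; the function name, the assumed orbital rate, and the sample initial state are not taken from the paper.

```python
import numpy as np

def cw_inplane_stm(n: float, t: float) -> np.ndarray:
    """In-plane C-W state-transition matrix for the state [x, y, vx, vy]."""
    s, c = np.sin(n * t), np.cos(n * t)
    return np.array([
        [4 - 3 * c,       0.0, s / n,           2 * (1 - c) / n],
        [6 * (s - n * t), 1.0, 2 * (c - 1) / n, (4 * s - 3 * n * t) / n],
        [3 * n * s,       0.0, c,               2 * s],
        [6 * n * (c - 1), 0.0, -2 * s,          4 * c - 3],
    ])

# Example: propagate an assumed initial state over one sampling interval.
n = 7.2921e-5                                     # rad/s, a GEO-like orbital rate
x0 = np.array([10.0, -50.0, 0.0, -2 * n * 10.0])  # km, km/s; no-drift condition vy0 = -2*n*x0
x1 = cw_inplane_stm(n, 1.0) @ x0                  # relative state one second later
```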
2.3. Definition of Non-Cooperative Target Intentions
The non-cooperative target's intention is reflected in its relative motion patterns, which are governed by orbital dynamics. Extensive research on the relative motion patterns of spacecraft has been documented in the literature [21,22,23]. Sabol et al. [24] focused on the design of satellite formations and their temporal variation relationships, while Shasti et al. [25] investigated the stability and control issues associated with relative configurations in satellite formations. Sun and Dang [20] provided a detailed study of the formation conditions for droplet-like hovering configurations in the plane and classified non-cooperative target spacecraft intentions into 11 types. However, even though this classification refines orbital behavior, certain behaviors may still correspond to the same intention in real-world scenarios. For example, both water droplet flybys and overhead flybys exhibit similar non-cooperative target behavior, in which the target spacecraft passes over the mission spacecraft in the $+y$ to $-y$ direction. Overly detailed intention types can lead to redundancy and inefficiency. Therefore, this paper optimizes and streamlines orbital behavior intentions for practical tasks based on the research of Sun and Dang [20].
According to Equation (2), the relative motion within the orbital plane can be decomposed into a simple harmonic motion along the X-axis (motion 1) and a linear motion along the Y-axis (motion 2), as illustrated in Figure 2. Equations (5) and (6) describe motions 1 and 2, respectively.
Motion 1 is characterized by periodic motion along an ellipse, with the X and Y components maintaining a constant 90° phase difference and a counterclockwise trajectory. In reference [20], the intersection of motion 1 with the positive X-axis is identified as a feature point, which is used to represent different motion trajectories. In this study, the feature point is treated as the initial point of the relative motion; by varying this initial point, different relative motion trajectories of a non-cooperative space target can be obtained. The initial point exhibits the following properties:
Consequently, Equation (3) can be simplified to:
This paper classifies the intentions and their corresponding orbital behaviors as follows. The following intention primarily manifests as relative motion forms such as natural orbiting and controlled orbiting. The approaching and distancing intentions encompass flyby and fly-around as two forms of relative motion, with the key difference being whether the motion creates an enclosing path around the mission spacecraft.
The target relative motion intent defined in this paper consists of six types: natural orbiting, controlled orbiting, forward fly-around, reverse fly-around, forward flyby, and reverse flyby. The specific definitions and formation conditions of these orbital behavior intents are presented in
Table 1.
2.4. Construction of the Intention Set Under Multiple Constraints
In practical operations, the relative orbital information of the non-cooperative target is acquired via the visual detection payloads of the mission spacecraft. Therefore, when constructing the intention set, it is essential to consider various environmental constraints affecting the relative orbital information. This paper specifically examines the impacts of illumination constraints and environmental errors.
(1) Illumination Constraints
To describe the observation conditions of the mission spacecraft for the non-cooperative target, the illumination azimuth angle $\beta$ is shown in Figure 3. This angle represents the orientation between the vector extending from the mission spacecraft $M$ to the non-cooperative target $N$ and the vector pointing from $M$ to the sun within the LVLH coordinate system centered on $M$. The expression of $\beta$ is given by:
$$\beta = \arccos\frac{\mathbf{r}_{MN} \cdot \mathbf{r}_{MS}}{\lvert\mathbf{r}_{MN}\rvert\,\lvert\mathbf{r}_{MS}\rvert}$$
where $\mathbf{r}_{MN}$ and $\mathbf{r}_{MS}$ denote the vectors from $M$ to the target $N$ and from $M$ to the sun, respectively. When $\beta$ is smaller than the constraint angle $\beta_c$, the non-cooperative target $N$ lies within the forward light angle of the mission spacecraft $M$, allowing it to conduct observations and obtain the corresponding orbital data.
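As a minimal sketch of this visibility test, the function below evaluates the angle from the vector definitions above and compares it with the constraint angle; the variable names and the clipping guard are implementation choices, not from the paper.

```python
import numpy as np

def illumination_angle(r_mn: np.ndarray, r_ms: np.ndarray) -> float:
    """Angle between the M-to-N line of sight and the M-to-sun direction (rad)."""
    cos_beta = np.dot(r_mn, r_ms) / (np.linalg.norm(r_mn) * np.linalg.norm(r_ms))
    return np.arccos(np.clip(cos_beta, -1.0, 1.0))  # clip guards rounding errors

def is_observable(r_mn: np.ndarray, r_ms: np.ndarray, beta_c: float) -> bool:
    """True when the target lies within the forward light angle of the spacecraft."""
    return illumination_angle(r_mn, r_ms) < beta_c
```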
(2) Payload Constraints
In practical scenarios, the payload's operational range is limited, necessitating the determination of both the maximum detection distance $d_{\max}$ and the alarm distance $d_{\text{alarm}}$. In addition, the operating time of the payload and the spacecraft's computer processing efficiency are constrained by the total satellite power, which can be represented by the capacity of the timing data $N$ and the interval time $\Delta t$. Under actual conditions, space disturbances such as atmospheric disturbances and space radiation may cause deviations between actual measurement values and ideal values. Hence, uncertainties and errors are considered, as shown in Equation (10):
$$\tilde{\mathbf{X}}_t = \mathbf{X}_t + \Delta\mathbf{X}_t$$
where $\Delta\mathbf{X}_t$ represents the small deviations in the relative position and velocity of the orbital data.
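A simple way to realize Equation (10) in simulation is to perturb each ideal state with zero-mean Gaussian deviations, as sketched below; the noise magnitudes are placeholders, not values reported in the paper.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def add_measurement_noise(x: np.ndarray, sigma_pos: float = 0.01,
                          sigma_vel: float = 1e-4) -> np.ndarray:
    """Perturb an in-plane state [x, y, vx, vy] with small Gaussian deviations."""
    deviations = rng.normal(0.0, [sigma_pos, sigma_pos, sigma_vel, sigma_vel])
    return x + deviations
```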
3. Development of the Intention Recognition Model
The network structure of the intention recognition method presented in this paper is illustrated in
Figure 4. It is composed of three layers: the input layer, the hidden layer, and the output layer. The hidden layer incorporates CNN, LSTM, and SA network structures. The detailed structure and functions of each layer are outlined below.
3.1. Input Layer
The input layer is responsible for preprocessing the collected feature data to generate feature vectors, which are then directly processed by the subsequent layer. In this paper, the feature data are obtained through orbital simulations under multiple constraints, with the specific details as follows.
The intention and orbital behavior of the target spacecraft are randomly selected, while the initial conditions of the mission spacecraft are specified. By applying numerical integration to Equation (8), the relative trajectory data of the target spacecraft over a given period are obtained. After generating the preliminary trajectory data, their compliance with the illumination conditions is assessed. A segment of trajectory data is then randomly selected within a specified time window, ultimately forming a three-dimensional matrix of relative orbital behavior data.
Figure 5 illustrates the process of constructing a single sample instance.
To mitigate the impact of varying information scales across different features and enhance the model's convergence efficiency, the original time series must be converted into dimensionless values with consistent intervals. In this study, a max–min linear transformation is applied to scale the data to the [0, 1] range. For the $x$-dimensional feature data $\{x_1, x_2, \ldots, x_n\}$, where $n$ represents the total number of data points, the normalization formula is given as follows:
$$x_i^{*} = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min}$ denotes the minimum value of the $x$-dimensional feature, $x_{\max}$ denotes the maximum value, $x_i$ corresponds to the data before normalization, and $x_i^{*}$ refers to the normalized training data.
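A per-feature implementation of this max–min transformation might look as follows; this is a sketch, and the epsilon guard for constant features is an added safeguard rather than part of the paper's formula.

```python
import numpy as np

def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each feature column of a (time_steps, features) array to [0, 1]."""
    x_min = x.min(axis=0, keepdims=True)
    x_max = x.max(axis=0, keepdims=True)
    return (x - x_min) / (x_max - x_min + 1e-12)  # epsilon avoids division by zero
```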
3.2. Hidden Layer
In the design of the hidden layer, this paper presents a deep learning model that combines a convolutional neural network (CNN), long short-term memory (LSTM) network, and a self-attention (SA) mechanism. By leveraging the strengths of each network structure, this hybrid model effectively extracts local features from time series data, captures long-term dependencies, and highlights crucial information, thereby enhancing the accuracy and robustness of non-cooperative target intention identification. The CNN layer is tasked with extracting spatial features, the LSTM layer is responsible for capturing long-term dependencies within time series data, and the self-attention mechanism emphasizes the most crucial segments of the sequence, mitigating information loss that commonly occurs in traditional methods when handling long time series data.
This integrated model exhibits superior performance in various complex environments, particularly in processing long time series data, where it achieves higher accuracy and better adaptability. The mechanisms of each layer are explained in detail below.
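The sketch below assembles such a hybrid in PyTorch, the framework used in Section 4. It follows the CNN, LSTM, self-attention, and pooled softmax flow described here, but the layer sizes and the four-feature input are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CNNLSTMSA(nn.Module):
    """Sketch of the CNN-LSTM-SA hybrid (layer sizes assumed for illustration)."""

    def __init__(self, n_features=4, n_classes=6, hidden=8, kernel=5, pool=2):
        super().__init__()
        # CNN: extracts local features along the time axis
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=kernel, padding=kernel // 2),
            nn.ReLU(),
            nn.MaxPool1d(pool),
        )
        # LSTM: captures long-term dependencies in the compressed sequence
        self.lstm = nn.LSTM(hidden, hidden, num_layers=1, batch_first=True)
        # Self-attention: re-weights the most informative time steps
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        # Dense classifier applied after global average pooling
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, time, features)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.lstm(h)
        h, _ = self.attn(h, h, h)         # self-attention: Q = K = V
        h = h.mean(dim=1)                 # global average pooling over time
        return self.fc(h)                 # class logits; softmax applied downstream

logits = CNNLSTMSA()(torch.randn(2, 60, 4))  # two samples, 60 steps, 4 features
```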
3.2.1. CNN Layer
A convolutional neural network (CNN) is a type of neural network engineered to automatically extract features from data, characterized by local connections, weight sharing, and spatial pooling. In this study, a CNN is used to extract local features from the input time series data. The network architecture comprises the convolutional, pooling, fully connected, and softmax layers, as shown in
Figure 6.
The convolution operation applies a sliding filter over the input data to extract key features. The computation of the convolutional layer can be expressed by the following equation:
$$x_i^{l} = f\left(w_i^{l} * x^{l-1} + b_i^{l}\right)$$
where $x_i^{l}$ represents the output feature of layer $l$ corresponding to index $i$; $f$ is the nonlinear activation function of the network; $w_i^{l}$ denotes the weight of the $i$-th convolution kernel in layer $l$; $*$ is the convolution operation; $x^{l-1}$ is the input to layer $l$; and $b_i^{l}$ represents the bias of the $i$-th convolution kernel in layer $l$.
The pooling layer, which follows the convolutional layer, reduces the dimensionality of the features while retaining crucial information. This compression helps to decrease computational costs and improves the overall efficiency of the model.
3.2.2. LSTM Layer
A long short-term memory (LSTM) network is a specialized form of recurrent neural network (RNN), with its architecture shown in
Figure 7. By introducing forget gates, input gates, and output gates to regulate the flow of information, LSTM addresses the vanishing gradient issue that can arise in traditional RNNs when learning long sequences.
The computation formulas for the forget, input, and output gates and the associated state updates are given as follows:
$$\begin{aligned} f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \\ i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\ \tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \\ C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\ o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \\ h_t &= o_t \odot \tanh(C_t) \end{aligned}$$
In these formulas, $\sigma$ represents the activation function, while $f_t$ denotes the output of the forget gate. $C_{t-1}$ and $C_t$ correspond to the previous and current unit states, respectively. $W_f$, $W_i$, $W_C$, and $W_o$ are the weight matrices of the neural network. $h_t$ represents the "memory" of the neuron at time $t$, while $x_t$ denotes the input of the current cell.
LSTM effectively captures temporal features while preventing information loss in long time series data, thus providing a significant advantage in the long temporal sequence feature learning in this study.
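For reference, one step of the gate equations above can be written out directly; this didactic sketch (the tensor shapes and dictionary layout are assumptions) mirrors what a library LSTM cell computes internally.

```python
import torch

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following the gate equations; W[k] is (hidden, hidden + input)."""
    z = torch.cat([h_prev, x_t])                # concatenated [h_{t-1}, x_t]
    f_t = torch.sigmoid(W["f"] @ z + b["f"])    # forget gate
    i_t = torch.sigmoid(W["i"] @ z + b["i"])    # input gate
    c_tilde = torch.tanh(W["C"] @ z + b["C"])   # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde          # updated cell state
    o_t = torch.sigmoid(W["o"] @ z + b["o"])    # output gate
    h_t = o_t * torch.tanh(c_t)                 # new hidden state ("memory")
    return h_t, c_t
```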
3.2.3. Self-Attention Layer
A self-attention (SA) mechanism models dependencies between any two positions in a sequence, addressing the information loss problem that may arise in long sequences when using traditional RNNs and LSTMs. In this study, the SA mechanism is primarily adopted to capture key features in long time series data and minimize reliance on external feature engineering. Its architecture is shown in
Figure 8.
The self-attention mechanism determines the relationships between time steps by calculating Query ($Q$), Key ($K$), and Value ($V$) matrices from the input data, as shown in Equation (18):
$$Q = XW^{Q}, \qquad K = XW^{K}, \qquad V = XW^{V}$$
where $W^{Q}$, $W^{K}$, and $W^{V}$ represent the three distinct weight matrices.
The self-attention calculation is expressed in Equation (19), in which softmax() is the activation function:
$$\text{Attention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^{\mathrm{T}}}{\sqrt{d_k}}\right)V$$
where $QK^{\mathrm{T}}$ is a dot-product attention calculation, which efficiently computes the correlation of each Query and Key vector through matrix multiplication and sums the results. The softmax function maps these correlations into the range [0, 1], playing a normalization role. $\sqrt{d_k}$, the square root of the key vector dimension, acts as a regulator that prevents the softmax function from entering regions where it has extremely small gradients [15].
This mechanism calculates the attention matrix to dynamically assign weights to each time step within the sequence, emphasizing the most important information. In this research, the self-attention mechanism is employed to process the long temporal trajectory data of non-cooperative targets, enabling more accurate capture of motion patterns, preventing information loss, and boosting the model’s capability to retain long-term dependencies, thereby significantly improving recognition accuracy.
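Equations (18) and (19) translate almost line for line into code; the following sketch (with weight matrices passed in explicitly, a simplification of learned projections) computes single-head scaled dot-product self-attention.

```python
import torch

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a (time, features) input."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project input into Q, K, V
    d_k = k.shape[-1]
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5    # scaled pairwise correlations
    return torch.softmax(scores, dim=-1) @ v         # attention-weighted values
```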
3.3. Output Layer
In the model's output layer, local features are first aggregated using an average pooling layer to form a global feature vector. The resulting data are then passed through a multi-class softmax function, which estimates the probability distribution over the various intentions. Ultimately, the intention with the highest probability is chosen as the forecasted intention for the target. The computation formulas for the softmax function and the output layer are presented below:
$$\text{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$
$$y = \text{softmax}(W h + b)$$
In Equation (20), $z$ serves as the data fed into the softmax function. In Equation (21), $W$ denotes the weight matrix of the dense layer to be trained, $b$ is the corresponding bias term, $h$ is the global feature vector from the pooling layer, and $y$ refers to the confidence value associated with each intention in the output layer.
4. Experiments and Analysis
To evaluate the effectiveness of the proposed method, this section describes a series of simulations and analyses of the results. Initially, a dataset was constructed based on real-world conditions to verify the authenticity and reliability of the experiments. Subsequently, the model underwent training and validation processes. Finally, a comparison is made between the developed method and other models to confirm its advantages.
4.1. Dataset and Evaluation Metrics
Considering the large number of valuable spacecraft in high-altitude Earth orbits and the limited detection capabilities of ground-based sensors for these regions, there is an increasing need to identify the orbital behaviors of non-cooperative targets. Therefore, this study focuses on active proximity operations involving non-cooperative targets in high-altitude orbits. Multiple simulations were conducted with various target intentions, neglecting orbital perturbations, to obtain a range of relative motion patterns corresponding to each intention. The orbit of the mission spacecraft is defined as shown in Table 2, with the following assumptions: the maximum detection range $d_{\max}$ of the spacecraft is 200 km, the minimum alarm distance $d_{\text{alarm}}$ is 1 km, the continuous operating time of the detector ($N$) is 60 s, and the spacecraft's information processing interval $\Delta t$ is 1 s. A total of 60,000 sample data points were constructed, each consisting of 60 time steps and covering the six intention types mentioned earlier, with 10,000 samples per type. The dataset was divided into training, testing, and validation sets in a ratio of 8:1:1.
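The 8:1:1 partition can be reproduced with a standard random split; the sketch below uses random placeholder tensors in place of the simulated trajectories, so only the shapes and split sizes reflect the setup described here.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder tensors: 60,000 samples, 60 time steps, 4 features (feature count assumed)
features = torch.randn(60_000, 60, 4)
labels = torch.randint(0, 6, (60_000,))     # six intention classes

# 8:1:1 split into training, testing, and validation sets
train_set, test_set, val_set = random_split(
    TensorDataset(features, labels),
    [48_000, 6_000, 6_000],
    generator=torch.Generator().manual_seed(0),
)
```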
For model evaluation, several machine learning metrics were used, including accuracy, precision, recall, and the F1-score, with the corresponding calculation formulas outlined below:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
In which TP denotes the instances where the model predicts positive, and the actual value is also positive; TN refers to the cases where the model predicts negative, and the actual value is negative; FP indicates the instances where the model predicts positive, but the actual value is negative; and FN refers to the instances where the model predicts negative, but the actual value is positive.
Accuracy indicates the ratio of correctly predicted samples by the model to the total number of samples, offering a comprehensive measure of the model’s predictive performance.
Precision represents the proportion of actual positive samples among the samples predicted as positive, assessing the accuracy of positive predictions. Recall represents the proportion of true positive samples identified by the model out of all actual positive samples, highlighting the model’s capacity to detect positive instances. The F1-score is the harmonic mean of precision and recall, with a higher score indicating better model performance.
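These four metrics can be computed from predicted and true labels as follows; the macro-averaging over the six classes is an assumption, since the paper does not state its averaging convention.

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes=6):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    accuracy = np.mean(y_true == y_pred)
    precisions, recalls = [], []
    for c in range(n_classes):              # one-vs-rest counts per intention class
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precisions.append(tp / (tp + fp + 1e-12))
        recalls.append(tp / (tp + fn + 1e-12))
    precision, recall = np.mean(precisions), np.mean(recalls)
    f1 = 2 * precision * recall / (precision + recall + 1e-12)
    return accuracy, precision, recall, f1
```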
4.2. Model Configuration
The dataset was generated through orbital simulations conducted in Matlab R2020a, while the neural network was implemented using Python 3.9 with the PyTorch 1.13.0 deep learning framework. Model training was conducted on a computer running Windows 11, equipped with an NVIDIA RTX 3060 GPU supported by CUDA 11.6 acceleration, and 48 GB of RAM.
The experiment used cross-entropy as the loss function and Adam as the optimizer, with different model structural parameters evaluated on the test set. Selecting appropriate hyperparameters is essential for optimizing the model and enhancing the efficiency of intention recognition. Based on the characteristics of the CNN-LSTM-SA model, the key hyperparameters selected included the convolutional kernel size, pooling layer window, number of neurons in the hidden layer, and number of hidden layers. A preliminary experiment was conducted with 50 iterations. The detailed parameter choices are presented in Table 3, while the detailed experimental results are illustrated in Figure 9.
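A minimal training loop consistent with this configuration (cross-entropy loss, Adam optimizer, 50 iterations) is sketched below; the batch size and learning rate are assumptions, and the model and training set come from the earlier sketches.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

loader = DataLoader(train_set, batch_size=256, shuffle=True)   # batch size assumed
model = CNNLSTMSA()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)      # learning rate assumed

for epoch in range(50):                  # 50 iterations, per the preliminary experiment
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)    # cross-entropy on the intention logits
        loss.backward()
        optimizer.step()
```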
First, the impact of the convolution kernel size on the model's performance was examined. Three common kernel sizes (3, 5, and 7) were compared, with the pooling layer window set to 2, the number of neuron nodes fixed at 4, and the number of hidden layers set to 1 in experiments 1, 2, and 3. The results of this comparison are presented in Figure 9a. When the convolution kernel sizes were 3, 5, and 7, the accuracies on the test set were 89.92%, 91.60%, and 90.20%, respectively. The model also converged faster with a kernel size of 5, making it the optimal choice.
Subsequently, the effect of the pooling layer window on model performance was examined. Window sizes of 2 and 3 were tested, with the convolution kernel size set to 5, the number of neuron nodes fixed at 4, and the number of hidden layers set to 1 in experiments 2 and 4. The results of this comparison are presented in
Figure 9b. Increasing the pooling layer window to 3 degraded the model’s performance, reducing the test set accuracy to 88.40% and slowing down convergence. Therefore, a pooling layer window size of 2 was chosen.
Following this, the role of the number of neuron nodes in determining model efficiency was investigated. The number of neuron nodes selected was set to 4, 8, and 16, with the convolution kernel size set to 5, the pooling window set to 2, and the number of hidden layers set to 1 in experiments 2, 5, and 6. The results of the comparison are displayed in
Figure 9c. The accuracies of the test set were 91.60%, 98.62%, and 95.53%, respectively, when the number of neuron nodes was 4, 8, and 16. Among them, the model achieved the highest accuracy and the lowest loss when there were 8 neuron nodes, making it the optimal choice.
Lastly, the influence of hidden layer depth on the model’s accuracy and convergence speed was explored. The number of hidden layers was set to 1, 2, and 3, with the convolution kernel size set to 5, the pooling window set to 2, and the number of neuron nodes set to 8 in experiments 5, 7, and 8. The outcomes of the comparison are presented in
Figure 9d. When the number of hidden layers was increased to 2 or 3, the accuracy of the test set decreased to 94.15% and 92.23%, respectively, while the convergence speed also slowed down. Therefore, a single hidden layer was selected.
Based on the comparative experimental analysis, the model converged within 20 to 30 iterations, leading to the selection of 50 iterations as the optimal setting. The final model structure is presented in
Table 4.
4.3. Analysis of Model Results
4.3.1. Results and Analysis of the CNN-LSTM-SA Model
As illustrated in Figure 10, with an increasing number of iterations, the validation accuracy of the CNN-LSTM-SA model progressively improved, while the loss value decreased, eventually reaching convergence. The model attained an accuracy of 98.62%, with a final loss value of 0.049. A confusion matrix was created to visualize the recognition accuracy for each orbital behavior intention, aiding in the evaluation of the model's reasoning, as shown in Figure 11. In the figure, the predicted labels are displayed on the horizontal axis, while the true labels are represented on the vertical axis, and each diagonal element of the matrix indicates the number of correctly classified instances. The figure demonstrates that the proposed intention recognition model achieved high accuracy across all intentions, confirming its reliability.
4.3.2. Performance Comparison Across Different Models
To further confirm the performance advantages of the proposed CNN-LSTM-SA model, a comparative evaluation was conducted on the same dataset with CNN, LSTM, CNN-LSTM, LSTM-SA, and the traditional support vector machine (SVM) model. The final results are presented in
Table 5.
To provide a more comprehensive evaluation of the model, computational complexity was introduced to assess both performance and efficiency. The final assessment considered each model’s computation time and test set accuracy after 50 iterations.
As presented in the table, the CNN-LSTM-SA model demonstrated higher accuracy than other models and a relatively simple structure. Compared to the traditional SVM model, the accuracy increased by 11.24%, and the recognition time was reduced by 162 s. In comparison to the individual CNN and LSTM models, the accuracy increased by 17.19% and 6.64%, respectively, with a reduction in completion time of 195 s and 144 s. In comparison to the CNN-LSTM model, the accuracy improved by 6.60%, accompanied by a slight increase in iteration time (by 17 s) due to the addition of the self-attention layer, which further enhanced feature learning. In comparison to LSTM-SA, the accuracy increased by 1.25%, and the recognition time was significantly reduced by 201 s, further emphasizing the effectiveness of the CNN layer in extracting key features.
These comparison experiments demonstrate that the model possesses strong input feature extraction capability, with the LSTM and self-attention mechanisms significantly enhancing intention recognition performance.