6.3. Case 2: Comparison Experiment of Prediction Methods
This experiment aims to verify that the proposed method outperforms two comparison methods, RT-GRU and AIMM-IAKF, both in predicting the locations of non-cooperative targets and in supporting reinforcement learning for collision avoidance. The RT-GRU method [27] is based on the GRU algorithm, while the AIMM-IAKF method [23] relies on Kalman filtering.
We trained the three methods in a flight task environment with 80 non-cooperative targets per square kilometer and conducted 1000 flight mission tests. The success rates are shown in Table 5.
The experimental results demonstrate that the proposed method achieves a higher task success rate than both comparison methods. Compared with RT-GRU, which also uses deep learning, the proposed method better extracts global and local flight intentions, predicts non-cooperative target positions more accurately, and therefore provides stronger data support for reinforcement learning collision avoidance decisions. AIMM-IAKF, which is based on traditional methods, performs poorly in the high-density environment with highly uncertain flight intentions considered in this paper, highlighting the challenges traditional methods face in prediction tasks for nonlinear systems.
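When success rates are estimated from a finite number of trials, it can help to attach an interval to each rate before comparing methods. The following is a minimal sketch, assuming each flight mission is an independent success/failure trial; the Wilson score interval is a standard choice that we introduce here for illustration, not a procedure taken from the experiment, and the counts in the example are hypothetical rather than the values in Table 5.

```python
from math import sqrt

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial success rate (z=1.96 is about 95%)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half

# Hypothetical counts for illustration only (NOT the results reported in Table 5).
low, high = wilson_interval(successes=900, trials=1000)
print(f"success rate 0.900, approx. 95% interval [{low:.3f}, {high:.3f}]")
```

With 1000 trials the interval is a few percentage points wide, so differences between methods that are smaller than that should be interpreted with caution.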
6.7. Local Explanation
This experiment aims to analyze the impact of individual features on model decision-making in specific scenarios, thereby enhancing model interpretability. A widely adopted machine learning explanation framework is based on Shapley values, which quantify the contribution of each feature to the model’s predictions. The SHAP framework [31] provides a unified measure to explain how input features influence the model’s output. By analyzing feature contributions, the decision-making process becomes more transparent, and the model’s preferences can be effectively revealed. Lundberg et al. [32] proposed the Deep SHAP method, which combines SHAP theory with deep learning by averaging feature contributions over multiple reference points to approximate Shapley values. Deep SHAP is tailored to neural networks and is applicable to a wide range of deep learning architectures. Experimental results demonstrate its ability to reasonably interpret decision processes in deep learning models.
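To make the notion of a Shapley value concrete, the sketch below computes exact Shapley values for a small toy function by enumerating all feature subsets relative to a single reference point. It is an illustration of the underlying definition only: Deep SHAP approximates these quantities for neural networks rather than enumerating subsets, and the function `toy_score`, the input `x`, and the all-zero `baseline` are hypothetical and unrelated to the model in this paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to one baseline (reference) point."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their actual value; the rest take the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical toy score with one interaction term (illustration only).
def toy_score(z):
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]

x = [1.0, 1.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
print(phi)
# The contributions sum to the difference between the output at x and at the baseline.
print(sum(phi), toy_score(x) - toy_score(baseline))
```

The final line illustrates the additivity property that underlies the explanations in this section: positive and negative feature contributions add up to the gap between the model output for the sample and the output at the reference point.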
The specific SHAP values are presented in Table 9. In the first sample (shown in Figure 10), a non-cooperative target lies in Sector 1 and the goal point is at a relative angle of 320° counterclockwise from the UAV. Since Sector 1 (located at the UAV’s left front) contains a non-cooperative target, this feature contributes to the decision to decelerate, while the other sectors, which contain no non-cooperative targets, favor acceleration. In addition, according to the predicted flight intent, the non-cooperative target in Sector 1 will move away from the UAV, so the incentive to slow down will diminish correspondingly. Ultimately, the UAV selects deceleration to avoid the non-cooperative target. Moreover, because a non-cooperative target is on the left side and the goal point is on the right, the target location contributes substantially to a right-turn decision. The UAV ultimately executes a right turn to resolve the conflict.
In the second sample, non-cooperative targets lie in Sectors 4 and 5, at the UAV’s rear, and the goal point is at a relative angle of 200° counterclockwise from the UAV. Notably, the non-cooperative target in Sector 5 is extremely close, which contributes significantly to the decision to accelerate. However, the goal point is positioned behind the UAV, which favors deceleration to reach it and thereby counteracts the acceleration tendency. Furthermore, according to the flight intent prediction, the non-cooperative target in Sector 4 will move away from the UAV, whereas the one in Sector 5 will approach it. Consequently, the future contribution of Sector 4 to acceleration diminishes, while that of Sector 5 increases. Ultimately, to prioritize safety and increase its distance from the non-cooperative targets, the UAV accelerates. Additionally, the non-cooperative target in the left-rear Sector 4 drives the UAV to turn right for collision avoidance.
In the third sample, non-cooperative targets lie in Sectors 4–6, at the UAV’s rear, and the goal point is at a relative angle of 150° counterclockwise from the UAV. The targets in Sectors 4 and 5 are extremely close, which contributes significantly to the decision to accelerate. However, the goal point lies behind the UAV, which favors deceleration to approach it and thereby counteracts the acceleration tendency. Furthermore, according to the flight intent prediction, the non-cooperative target in Sector 4 will move away from the UAV and enter Sector 5, the one in Sector 5 will approach the UAV and enter Sector 4, and the one in Sector 6 will move away from the UAV. Consequently, the future contribution of Sector 4 to acceleration increases, whereas those of Sectors 5 and 6 decrease. Ultimately, to prioritize safety and increase its separation from the non-cooperative targets, the UAV accelerates. Additionally, the non-cooperative targets in the left-rear Sector 4 and the right-rear Sector 6 neutralize each other’s turning tendencies, leading to a decision to continue straight.
In the fourth sample, a non-cooperative target lies in Sector 7 and the goal point is at a relative angle of 320° counterclockwise from the UAV. Since Sector 7 (located at the UAV’s right front) contains a non-cooperative target, this feature strongly contributes to the decision to decelerate, while the other sectors, which contain no non-cooperative targets, favor acceleration. Furthermore, according to the flight intent prediction, the non-cooperative target in Sector 7 will remain at an almost constant distance from the UAV, so the incentive to decelerate will remain almost unchanged. Ultimately, to avoid the non-cooperative target, the UAV decelerates. Similarly, the non-cooperative target on the right side encourages a left-turn decision, while the goal point’s position on the UAV’s right side counteracts it. To prioritize collision avoidance, the UAV ultimately executes a left turn.
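The sample descriptions above rely on mapping a relative bearing to a sector number. The following is a minimal sketch of one such mapping, assuming nine equal 40° sectors numbered counterclockwise from the UAV heading, so that Sector 1 is the left front, Sector 5 is the rear, and Sector 9 is the right front. This equal-width layout and the function name `sector_index` are assumptions made for illustration; the sector geometry actually used by the model is not specified in this section and may differ.

```python
def sector_index(bearing_ccw_deg, n_sectors=9):
    """Map a counterclockwise bearing (degrees from the UAV heading) to a 1-based sector.

    Assumes equal-width sectors numbered counterclockwise starting at the heading.
    """
    width = 360.0 / n_sectors
    return int((bearing_ccw_deg % 360.0) // width) + 1

# Under the assumed layout: left front, rear, and right front.
print(sector_index(20.0))   # 1
print(sector_index(180.0))  # 5
print(sector_index(340.0))  # 9
```

Under this assumed layout, a goal point at 320° counterclockwise falls in Sector 9, that is, ahead and to the right, which agrees with the description of the first sample.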
6.8. Global Explanation
This experiment aims to evaluate how the model makes decisions under diverse scenarios and analyze its behavioral preferences in different situations.
- (1) UAV yaw decisions are influenced by the positions of the goal point and non-cooperative targets.
Figure 11a illustrates the influence of the spatial distribution of non-cooperative targets on the UAV’s right-turn maneuver. Each dot in the figure represents the SHAP value contributed by a non-cooperative target located at that point toward the decision to turn right. The location of non-cooperative targets clearly has a significant impact on the agent’s decision-making. When the agent selects a right-turn action, non-cooperative targets in Sectors 1–5 exert a notable influence, with the impact intensity increasing as the distance decreases. This indicates that targets in these directions encourage the agent to perform a right-turn maneuver. As the distance between the agent and the non-cooperative target increases, the influence diminishes, and when the non-cooperative target approaches the agent’s perception boundary, the encouraging effect nearly disappears. Figure 11b shows the impact of the spatial distribution of non-cooperative targets on the UAV’s left-turn maneuver; non-cooperative targets in Sectors 4–9 similarly promote left-turn actions. These results demonstrate that the UAV adopts appropriate turning maneuvers to avoid non-cooperative targets.
Figure 11c depicts how the spatial distribution of goal points influences the UAV’s straight-ahead action. A goal point located in Sector 9 (directly ahead on the right) has the strongest influence on this decision, with the impact intensity increasing as the distance decreases. This indicates that goal points in this direction encourage straight movement. As the clockwise angle between the goal point and the agent’s heading increases, the influence weakens, and when the goal point is positioned behind the agent, the encouraging effect vanishes. As the angle increases further, goal points on the agent’s left side begin to suppress straight movement. The strongest suppression occurs when the goal point is in Sector 1 (directly ahead on the left), with the suppression effect intensifying as the distance decreases. Figure 11d shows how the spatial distribution of goal points influences the UAV’s left-turn maneuver. A goal point located in Sector 1 (directly ahead on the left) has the greatest impact on this decision, with the influence intensity increasing as the distance decreases. This indicates that goal points in this direction encourage a left-turn maneuver. As the counterclockwise angle between the goal point and the agent’s heading increases, the influence weakens, and when the goal point is positioned behind the agent, the encouraging effect disappears. As the angle increases further, goal points on the agent’s right side begin to suppress left-turn actions. The strongest suppression occurs when the goal point is in Sector 9 (directly ahead on the right), with the suppression effect intensifying as the distance decreases.
- (2) UAV acceleration decisions are influenced by the positions of both the goal point and non-cooperative targets.
Figure 12a illustrates the influence of the spatial distribution of non-cooperative targets on the UAV’s deceleration maneuver. Each point in the figure represents the SHAP value contributed by a non-cooperative target located at that position toward the decision to slow down. Non-cooperative targets in the front sectors of the UAV (Sectors 1, 2, 8, and 9) significantly affect the decision to decelerate, with the influence intensifying as the distance decreases. This indicates that non-cooperative targets in the forward direction strongly encourage the UAV to decelerate. When non-cooperative targets are near the UAV’s perception boundary, the encouraging effect diminishes to nearly zero. Figure 12b illustrates how the positional distribution of non-cooperative targets influences the UAV’s acceleration maneuver. Non-cooperative targets in the adjacent sectors of the UAV (Sectors 2–8) significantly promote the decision to accelerate, with the influence increasing as the distance decreases. This suggests that non-cooperative targets in non-frontal directions encourage acceleration. This encouraging effect likewise vanishes when the targets approach the perception boundary.
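Scatter plots of this kind can also be summarized numerically by averaging the SHAP values of all targets that fall in the same sector and distance band. The sketch below shows one way to do this, assuming nine equal 40° sectors numbered counterclockwise from the heading and two hypothetical distance thresholds in arbitrary units; the function name `mean_shap_by_cell`, the thresholds, and the sample values are all illustrative assumptions and are not data from the experiments.

```python
from bisect import bisect_right
from collections import defaultdict

def mean_shap_by_cell(samples, n_sectors=9, ring_edges=(50.0, 100.0)):
    """Average SHAP value per (sector, distance band) cell.

    samples: iterable of (bearing_ccw_deg, distance, shap_value) tuples.
    ring_edges: ascending distance thresholds (hypothetical, arbitrary units);
                band 0 is the closest band.
    """
    width = 360.0 / n_sectors
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of SHAP values, count]
    for bearing, distance, shap_value in samples:
        sector = int((bearing % 360.0) // width) + 1  # 1-based sector index
        band = bisect_right(ring_edges, distance)     # 0 = closest band
        cell = totals[(sector, band)]
        cell[0] += shap_value
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Synthetic values for illustration only (not experimental data).
samples = [
    (10.0, 20.0, 0.30),    # Sector 1, closest band
    (30.0, 40.0, 0.10),    # Sector 1, closest band
    (350.0, 120.0, 0.02),  # Sector 9, farthest band
]
for cell, mean_value in sorted(mean_shap_by_cell(samples).items()):
    print(cell, round(mean_value, 3))
```

A table of such cell averages makes trends like "influence increases as distance decreases" directly comparable across sectors, at the cost of hiding the spread within each cell.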
Figure 12c,d show how the positional distribution of goal points influences the UAV’s speed-holding maneuver and acceleration maneuver, respectively. Beyond a certain distance, the goal point’s position generally suppresses acceleration, and the suppression effect strengthens as the distance decreases. Within a certain proximity threshold, however, the goal point begins to encourage acceleration, with the promoting effect intensifying as the distance shrinks. Notably, the influence varies by action type: the range within which the goal point encourages the speed-holding maneuver is comparatively small, whereas the range that encourages the acceleration maneuver is significantly larger. This is because higher UAV speeds reduce the time window for collision avoidance maneuvers, and collision risks persist near the goal point. Consequently, the UAV prioritizes acceleration only when the goal point is sufficiently close, and to mitigate collision risks it regulates its speed cautiously most of the time. For the speed-holding maneuver, when the goal point is far from the UAV, the UAV should change its speed rather than maintain it; overall, therefore, the inhibitory region far exceeds the encouraging region, and the goal point tends more strongly to inhibit the UAV’s speed-holding action.