1. Introduction
Air Traffic Management (ATM) systems worldwide face increasing strain from rising flight densities [
1,
2]. This pressure is significantly intensified by severe weather, particularly thunderstorms, which are a primary cause of flight delays and disruptions, costing the industry billions of dollars annually [
3]. The turbulence, icing, and wind shear associated with thunderstorms pose direct safety threats, mandating that aircraft maintain safe separation from them [
4,
5]. However, avoidance maneuvers lead to detours, increasing fuel burn and flight time and disrupting Estimated Time of Arrival (ETA). These local disruptions can trigger network-wide cascading delays [
6,
7]. The central challenge, therefore, is to plan safe avoidance trajectories that also maintain temporal consistency.
In response to the threat that convective weather phenomena such as thunderstorms pose to flight safety, systematic research has been conducted on path planning for thunderstorm avoidance. At the level of meteorological modeling, researchers abstract convective cells as dynamically evolving no-fly zones characterized by inherent uncertainty, and employ Ensemble Prediction Systems (EPS) to generate probabilistic representations of thunderstorm development, thereby enabling the quantitative assessment of the safety margin associated with various avoidance strategies [
8]. At the level of planning algorithms, stochastic optimal control approaches directly incorporate meteorological uncertainties into trajectory generation by maximizing the probability of safely reaching waypoints [
9], while methods such as Scenario-Based Rapidly-exploring Random Trees (SB-RRT*) search for the shortest avoidance paths under user-defined safety thresholds and achieve near real-time planning performance with the aid of Graphics Processing Unit (GPU) parallelization [
10]. Further investigations indicate that atmospheric uncertainty and convective characteristics exert a quantifiable influence on fuel consumption and flight time deviations within trajectory optimization. Consequently, the operational acceptability of a detour strategy depends not only on its spatial avoidance capabilities but also on its ability to keep the ETA within permissible limits, which imposes rigorous temporal constraints on the avoidance planning problem [
11,
12].
The need for temporal precision is amplified by modern operational concepts like Performance-Based Navigation (PBN) and 4D Trajectory Operations, which demand arrival accuracy on the order of seconds [
13]. Consequently, avoidance algorithms must now generate precise 4D trajectories that meet strict time constraints at critical waypoints, in addition to ensuring spatial safety [
14].
In high-density terminal airspace or en-route convergence zones, these challenges are further amplified by concurrent multi-aircraft avoidance scenarios. When multiple aircraft must simultaneously circumnavigate the same weather system, uncoordinated, independent avoidance decisions can easily lead to aircraft converging into the limited remaining airspace, thereby inducing new airborne conflicts [
15,
16]. Such scenarios essentially constitute a high-dimensional, multi-objective, and strongly coupled multi-agent decision problem: each aircraft must ensure its own safe passage around the thunderstorm while simultaneously maintaining statutory separation from all other aircraft, seeking a global equilibrium between cost minimization and temporal constraint satisfaction [
17]. The traditional coordination model, which relies on the experience of air traffic controllers, struggles in these high-workload, dynamic situations, where decision quality is difficult to guarantee and the controller’s cognitive workload approaches its capacity limits [
18]. Therefore, it is of significant theoretical and practical value to investigate trajectory planning theories and methods that can achieve coordinated and spatiotemporally consistent flight routing for multiple aircraft in dynamic meteorological environments.
In the field of automated trajectory planning, the academic community has conducted systematic explorations across various technical routes. Discrete algorithms based on graph search—including the A* algorithm and its various heuristic variants—have been widely adopted in research owing to their theoretical guarantees of completeness [
19]. In recent years, further studies have sought to enhance A* and integrate it with other methodologies to achieve collaborative multi-aircraft detour path planning within discrete spaces [
20]. However, graph-search-based methods inherently suffer from two major limitations: First, grid resolution constrains the fidelity of the solution space, potentially causing the omission of the true global optimum in the continuous domain; second, computational complexity escalates exponentially with increasing spatial resolution, rendering these methods incapable of satisfying the stringent real-time requirements of operational planning [
21].
The Artificial Potential Field (APF) method, another classic planning paradigm, has been widely deployed in fields like Unmanned Aerial Vehicle (UAV) path planning due to its computational simplicity and rapid response [
22,
23]. Its core concept is to model obstacles as sources of a repulsive potential and the destination as an attractive potential well, guiding the aircraft towards the goal while avoiding obstacles under the influence of the resultant force. However, APF exhibits fundamental drawbacks in environments with complex, multiple constraints. The non-convexity of the potential function makes the algorithm highly susceptible to local minima, leading to planning failures. Moreover, the method does not readily accommodate strict kinematic constraints or time-window requirements, and its coordination capabilities in coupled multi-agent scenarios are limited [
24].
In recent years, sampling-based random planning methods, such as the Rapidly exploring Random Tree (RRT) and its optimal variant RRT*, have garnered significant attention for their asymptotic optimality in high-dimensional, complex constraint spaces [
25]. These methods explore the solution space by randomly sampling the state space and progressively extending a search tree, which enables them to handle non-convex obstacle constraints; RRT* additionally converges to the optimal solution with probability one as the number of samples grows. However, their convergence rates are often slow, the dynamic feasibility of the generated trajectories requires additional processing, and the computational overhead scales poorly with the number of agents in multi-agent cooperative scenarios [
26].
In contrast to the aforementioned methods, trajectory optimization techniques based on optimal control provide a unified framework that naturally integrates dynamic constraints with performance optimization [
27,
28]. These approaches formulate trajectory planning as the minimization of a composite cost function—encompassing metrics such as fuel consumption, flight time deviation, and maneuver smoothness—subject to the aircraft’s kinematic differential equations, physical boundary conditions, and path constraints (e.g., weather avoidance, aircraft separation maintenance). Direct Collocation is a mainstream numerical technique for solving such continuous-time optimal control problems. It discretizes the time domain and converts dynamic constraints into algebraic ones, thereby transcribing the original problem into a large-scale, sparse Nonlinear Programming (NLP) problem solvable by mature interior-point solvers like the Interior Point Optimizer (IPOPT) [
29]. Numerical optimization frameworks such as CasADi leverage automatic differentiation to provide exact gradients and Hessian matrices to the solver, significantly enhancing convergence speed and reliability [
30]. Model Predictive Control (MPC) further extends trajectory optimization to real-time, reactive scenarios by repeatedly solving the optimization problem over a rolling horizon, enabling online adaptation to dynamic environmental changes [
31]. However, when this centralized framework is directly applied to multi-agent scenarios, the dimension of the joint state space grows combinatorially with the number of agents, leading to the severe “curse of dimensionality” and rendering the problem computationally intractable [
32].
To fundamentally address the scalability challenge in multi-agent cooperative planning, game theory provides a mathematical foundation for modeling strategic interactions among rational agents [
33]. The Nash Equilibrium, a core solution concept in non-cooperative game theory, describes a stable state of strategies wherein no participant can unilaterally deviate to achieve a lower individual cost. Conflict resolution models based on the Nash Equilibrium have been validated in the ATM domain, proving capable of producing globally self-consistent and conflict-free cooperative solutions [
34]. In practice, computing an exact Nash Equilibrium is often computationally prohibitive. Consequently, the Iterative Best Response (IBR) algorithm has emerged as a mainstream strategy for approximating this equilibrium. In IBR, each participant sequentially computes its optimal single-agent response given the fixed strategies of others, thereby iteratively approaching an equilibrium state [
35]. This decentralized paradigm decomposes the high-dimensional coupled problem into a series of computationally tractable subproblems, effectively balancing solution quality with computational efficiency, and has been experimentally validated in domains such as multi-robot planning and cooperative UAV task allocation.
Existing literature exhibits a gap between discrete, combinatorially complex routing and continuous, dynamically feasible trajectory generation, particularly under strict 4D arrival constraints. This study addresses this gap by presenting an integrated framework that couples continuous optimal control with game-theoretic coordination. The core contributions of this work are:
- (1) Integrated 4D Formulation: We formulate multi-aircraft thunderstorm avoidance as a continuous 4D trajectory optimization problem that simultaneously satisfies dynamic weather avoidance, inter-aircraft separation, and hard waypoint ETA constraints.
- (2) Joint Speed-Heading Control: Unlike approaches that treat ETA as a soft penalty or assume constant speed, our formulation enforces ETA as a terminal equality constraint, allowing the optimizer to jointly modulate velocity and heading.
- (3) Continuous Game-Theoretic Decomposition: We adapt the IBR mechanism to the continuous trajectory space, defining a Local Nash Equilibrium over continuous variables with a convergence criterion based on the trajectory-variation norm.
- (4) Emergent Control Hierarchy: We demonstrate that under this formulation, a “speed-first, heading-second” decision hierarchy emerges naturally from the optimization without requiring heuristic rules, aligning with practical air traffic control heuristics.
The remainder of this paper is organized as follows: Section 2 establishes the mathematical models and problem formulation, including the aircraft kinematics, dynamic thunderstorm environment, and safety constraints. Section 3 presents the proposed planning framework, detailing the single-aircraft optimal control formulation and the multi-aircraft game-theoretic coordination mechanism based on IBR. Section 4 provides numerical simulations and result analysis. Finally, Section 5 concludes the paper.
3. Proposed Planning Framework
Based on these assumptions, the overall framework of this work is as follows. First, a dynamics and constraint model is formulated for the single-aircraft thunderstorm avoidance problem within a continuous-time optimal control framework, and is then transcribed into a large-scale sparse NLP problem via direct collocation. High-precision numerical solutions are obtained using CasADi (Version 3.7.2) and IPOPT (Version 3.14.11). The multi-aircraft cooperative avoidance problem is then abstracted as a non-cooperative game, and an IBR mechanism is employed to decompose the high-dimensional, strongly coupled joint optimization problem into a sequence of single-aircraft optimal control subproblems. In this way, distributed and coordinated multi-aircraft trajectory generation is achieved.
3.1. Single-Aircraft Trajectory Planning
For a single aircraft, given that the trajectories of all other aircraft are fixed, the problem reduces to finding the trajectory and control inputs that minimize a cost function. This cost function is a weighted sum of four performance indices. The relative magnitudes of the weight coefficients determine how strongly the optimizer responds to each operational cost, so the weighting encodes a priority ordering among them. This prioritization is intended to reflect practical operating priorities and to keep the resulting decisions cost-effective.
The components of the cost function are:
Path Length: Minimizes the total flight distance, serving as a proxy for fuel consumption and flight time.
Control Effort: Penalizes large control inputs to encourage energy-efficient and smooth maneuvers.
Control Smoothness: Penalizes the rate of change of the control inputs (jerk), ensuring smoother transitions.
Curvature Penalty: Penalizes large turn rates, which is equivalent to penalizing high path curvature.
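The weighted-sum structure of the four components listed above can be written compactly. The following is only a sketch of the general form: the weight symbols $w_L$, $w_u$, $w_{\Delta}$, $w_{\kappa}$ and the component names are placeholders chosen here for illustration, and the exact integrand of each term is not reproduced.

```latex
J_i \;=\; w_{L}\, J_{\text{length}}
      \;+\; w_{u}\, J_{\text{effort}}
      \;+\; w_{\Delta}\, J_{\text{smooth}}
      \;+\; w_{\kappa}\, J_{\text{curv}},
\qquad w_{L},\, w_{u},\, w_{\Delta},\, w_{\kappa} \ge 0 .
```

Under this form, raising one weight relative to the others makes the optimizer more reluctant to incur that particular cost, which is how the priority ordering among the components arises.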
The design of this cost function, particularly the configuration of its four weights (for path length, control effort, control smoothness, and curvature), is crucial in guiding the optimization towards desired performance characteristics. Through extensive simulation-based tuning, this study prioritizes path length to reflect fuel efficiency, while the other weights are calibrated based on typical aircraft dynamic response characteristics to ensure flyability.
The formulated optimal control problem is an infinite-dimensional functional optimization problem, which is difficult to solve analytically. Therefore, this study employs a direct collocation method to transform it into a finite-dimensional problem that is amenable to numerical solution. This method discretizes the entire time horizon: for each aircraft, the total flight time is divided into intervals of equal duration, and the trajectory is represented by its values at the resulting discrete time nodes. The model’s decision variables comprise the states and control inputs of all aircraft at every discrete time node.
The total number of optimization variables therefore scales with the product of the number of aircraft and the number of discretization steps, so the problem size expands rapidly as either increases. Polynomials are used to approximate the state and control trajectories, ultimately transcribing the dynamic equations, path constraints, and performance indices into a single, large-scale, highly sparse NLP problem. The entire modeling and transcription process is implemented using the CasADi framework. Its automatic differentiation capabilities provide exact Jacobian and Hessian information to high-performance solvers like IPOPT, which supports both the efficiency and the convergence of the solution process.
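To make the transcription step concrete, the sketch below computes the equality-constraint residuals ("defects") that a direct collocation scheme hands to the NLP solver. It rests on two assumptions not stated in this section: the trapezoidal rule is used as the simplest collocation scheme, and the aircraft follows planar point-mass kinematics with state (x, y, v, psi) and controls (acceleration, turn rate). The function names are hypothetical, and no solver is involved.

```python
import math


def dynamics(state, control):
    """Assumed point-mass kinematics: state = (x, y, v, psi), control = (a, omega)."""
    x, y, v, psi = state
    a, omega = control
    return (v * math.cos(psi), v * math.sin(psi), a, omega)


def trapezoidal_defects(states, controls, dt):
    """Return the collocation defects between consecutive time nodes.

    Each defect is s[k+1] - s[k] - dt/2 * (f[k] + f[k+1]). An NLP solver
    would impose every component as an equality constraint equal to zero.
    """
    defects = []
    for k in range(len(states) - 1):
        f0 = dynamics(states[k], controls[k])
        f1 = dynamics(states[k + 1], controls[k + 1])
        defects.append(tuple(
            s1 - s0 - 0.5 * dt * (d0 + d1)
            for s0, s1, d0, d1 in zip(states[k], states[k + 1], f0, f1)
        ))
    return defects


# Illustrative data: straight, constant-speed flight at 220 m/s, sampled every 10 s.
dt = 10.0
states = [(220.0 * dt * k, 0.0, 220.0, 0.0) for k in range(5)]
controls = [(0.0, 0.0)] * 5
max_defect = max(abs(c) for d in trapezoidal_defects(states, controls, dt) for c in d)
```

A dynamically consistent trajectory yields zero defects, while any node that is moved off the trajectory produces a nonzero residual that the solver must remove. Because each defect couples only two adjacent nodes, the resulting constraint Jacobian is sparse.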
3.2. Multi-Aircraft Conflict Resolution and Coordination Framework Based on Game Theory
In high-density operational airspace, ensuring that multi-aircraft trajectories not only avoid external threats like thunderstorms but also do not create new conflicts among themselves is a far more complex optimization problem. To this end, this section extends the trajectory planning problem from a single agent to a multi-agent system, aiming to achieve coordinated conflict resolution and overall cost optimality.
This multi-aircraft coordination problem is formulated as a non-cooperative game, where each aircraft is treated as a rational game participant. Each participant chooses its own flight trajectory with the objective of minimizing its own cost function, which depends not only on its own trajectory but also on the trajectories of all other aircraft.
In a multi-aircraft non-cooperative game, the Nash Equilibrium is a key solution concept. It describes a stable state of strategy combinations where no participant (aircraft) has an incentive to unilaterally change its strategy (trajectory), as any unilateral change would not result in a lower cost.
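For a set of $N$ aircraft, a set of trajectory strategies $\{\tau_1^*, \dots, \tau_N^*\}$ constitutes a Nash Equilibrium if and only if, for any aircraft $i$, its equilibrium trajectory $\tau_i^*$ is the best choice given that all other aircraft $j \neq i$ adhere to their equilibrium trajectories $\tau_{-i}^*$. This condition can be formally expressed as:
$$J_i(\tau_i^*, \tau_{-i}^*) \le J_i(\tau_i, \tau_{-i}^*), \quad \forall\, \tau_i \in \mathcal{T}_i, \quad i = 1, \dots, N,$$
where $J_i$ is the cost function of aircraft $i$ and $\mathcal{T}_i$ is the set of all feasible trajectories for aircraft $i$.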
Directly solving for the Nash Equilibrium of this multi-aircraft game is a high-dimensional optimization problem that is difficult to handle, especially when trajectories are described by continuous state and control variables. Therefore, this study employs an iterative computational method, IBR [34,35], to obtain an approximate equilibrium solution for this game.
The IBR algorithm decomposes the complex multi-agent coupled problem into a series of single-agent optimal control subproblems that are solved sequentially. In each iteration, following a predetermined planning order, each aircraft in turn treats the trajectories of the other aircraft as fixed constraints and solves for its own single-aircraft optimal trajectory, which serves as its “best response” to the strategies of the others.
When planning for a given aircraft, the algorithm uses the already-updated trajectories of the preceding aircraft from the current iteration and the not-yet-updated trajectories of the succeeding aircraft from the previous iteration. This Gauss–Seidel-style update uses the most recent available information, which tends to accelerate convergence.
The pseudocode for the multi-aircraft thunderstorm avoidance method based on game-theoretic iteration is as detailed in Algorithm 1.
| Algorithm 1. Iterative Best Response for Multi-Aircraft Coordinated Avoidance |
| Input: |
| Initial states, destination points, and performance envelopes for all aircraft; |
| External thunderstorm models; |
| Initial trajectory set. |
| Output: |
| Coordinated, conflict-free trajectory set. |
| Initialize: Set the iteration counter, the maximum number of iterations, and the convergence threshold. |
| Determine Planning Order: Establish a planning sequence that assigns each aircraft a planning rank within each iteration (i.e., the position at which it solves its subproblem). The total number of planning steps per iteration equals the number of aircraft. |
| WHILE the iteration counter is below the maximum AND not converged DO |
| FOR each planning step, in planning order, DO |
| Let the currently planned aircraft be the one holding this planning rank |
| Construct the constraint set from the other aircraft’s trajectories (current-iteration trajectories for aircraft already planned, previous-iteration trajectories for the rest) |
| Solve for the best response trajectory |
| END FOR |
| IF the trajectory variation norm is below the convergence threshold THEN BREAK |
| END WHILE |
| RETURN the final trajectory set |
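The loop structure of Algorithm 1 can be illustrated on a deliberately simple stand-in problem. In the sketch below, each "aircraft" chooses a single scalar instead of a trajectory, and its cost is the quadratic J_i = (x_i - a_i)^2 + c * x_i * sum of the others, for which the best response has a closed form. This toy game, the function names, and the parameter values are all assumptions made for illustration; in the actual framework the best response would be an NLP solve rather than a formula.

```python
def cost(i, x, targets, coupling):
    """Toy cost of agent i: own tracking error plus a coupling term with the others."""
    others = sum(x[j] for j in range(len(x)) if j != i)
    return (x[i] - targets[i]) ** 2 + coupling * x[i] * others


def best_response(i, x, targets, coupling):
    """Closed-form minimiser of cost(i, ...) over x[i], with the other entries held fixed."""
    others = sum(x[j] for j in range(len(x)) if j != i)
    return targets[i] - 0.5 * coupling * others


def iterative_best_response(targets, coupling, tol=1e-6, max_iters=100):
    """Sequential best-response sweeps with a variation-based stopping rule."""
    x = list(targets)  # initial strategy profile
    for iteration in range(1, max_iters + 1):
        previous = list(x)
        for i in range(len(x)):  # fixed planning order
            # Uses entries already updated in this sweep and old values for the rest.
            x[i] = best_response(i, x, targets, coupling)
        variation = max(abs(new - old) for new, old in zip(x, previous))
        if variation < tol:
            return x, iteration, True
    return x, max_iters, False  # iteration cap reached without convergence


profile, iterations, converged = iterative_best_response([1.0, 1.0], coupling=1.0)
```

For this symmetric two-agent case the sweeps settle near x = 2/3 for both agents, which is the point where neither can lower its own cost by changing only its own choice. The toy game is contractive, so convergence is guaranteed here; that property does not carry over automatically to the non-convex trajectory problem.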
3.3. Analysis of Convergence for the Coordination Mechanism
We analyze the convergence of the IBR mechanism from two perspectives: the optimality guarantee at each individual step, and the game-theoretic interpretation of the solution returned upon termination.
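In iteration $k$, aircraft $i$ solves its single-agent NLP subproblem with the trajectories of all other aircraft, $\tau_{-i}$, held fixed as hard constraints. The subproblem is a finite-dimensional, continuously differentiable NLP over a compact feasible set $\mathcal{F}_i^{(k)}$, solved by IPOPT to a point satisfying the Karush–Kuhn–Tucker (KKT) first-order necessary conditions. Since the previous iterate $\tau_i^{(k-1)}$ is a feasible point of the current subproblem and $\tau_i^{(k)}$ is its locally optimal solution, the following holds by construction, where $J_i$ denotes the cost function of aircraft $i$:
$$J_i\big(\tau_i^{(k)}, \tau_{-i}\big) \le J_i\big(\tau_i^{(k-1)}, \tau_{-i}\big) \tag{20}$$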
Because the constraint environment varies dynamically as other agents update sequentially, Equation (20) guarantees local cost improvement relative to the current iteration’s information state, rather than a monotonic descent of the global aggregate cost across iterations. However, this local improvement is sufficient to preclude degenerate strategy cycling during the iterative process.
Convergence of the overall process is monitored through the trajectory variation norm:
The algorithm terminates when this norm falls below the convergence threshold (a positional tolerance specified in meters), indicating that the joint strategy profile has stabilized to within the prescribed positional tolerance. A hard iteration cap ensures termination in finite time under all conditions, including cases where oscillatory behavior, a known risk of sequential best-response schemes in non-convex games, prevents the norm from falling below the threshold.
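One way to compute such a stopping quantity is sketched below. It assumes the norm is the largest positional displacement of any aircraft at any time node between two successive iterations; the exact norm is not restated here, so this maximum-displacement choice, the function name, and the sample data are illustrative assumptions.

```python
import math


def trajectory_variation(previous, current):
    """Largest positional change of any aircraft at any time node, in meters.

    `previous` and `current` map an aircraft id to a list of (x, y) positions
    in meters, sampled at the same time nodes in both iterations.
    """
    return max(
        math.hypot(x1 - x0, y1 - y0)
        for aircraft in current
        for (x0, y0), (x1, y1) in zip(previous[aircraft], current[aircraft])
    )


# Hypothetical two-aircraft, two-node trajectories from consecutive iterations.
previous = {"AC1": [(0.0, 0.0), (1000.0, 0.0)],
            "AC2": [(0.0, 5000.0), (1000.0, 5000.0)]}
current = {"AC1": [(0.0, 0.0), (1000.0, 30.0)],
           "AC2": [(0.0, 5000.0), (1003.0, 5004.0)]}
delta = trajectory_variation(previous, current)
```

In this example AC1 moves 30 m at its second node and AC2 moves 5 m, so the variation is 30 m. A maximum-type norm bounds the movement of every node of every aircraft, which matches the interpretation of the threshold as a positional tolerance.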
When the algorithm terminates via the variation norm criterion, a fixed-point condition holds simultaneously for all aircraft: no trajectory changed by more than the convergence threshold in the final iteration.
At this fixed point, each trajectory is the output of an NLP solve in which all other aircraft trajectories were fixed at values within the convergence threshold of their current values, and by (19) it is locally optimal within its feasible set under those constraints. This terminal state corresponds to a Local Nash Equilibrium: a strategy profile is a Local Nash Equilibrium if, for each aircraft, there exists no sufficiently small feasible perturbation of its own trajectory that lowers its cost while the trajectories of all other aircraft are held fixed.
The local nature of this equilibrium stems from the non-convexity of the joint strategy space (due to thunderstorm avoidance and separation constraints) and the local nature of the KKT-stationary points returned by the IPOPT solver. Given these characteristics and the non-contractive nature of the best-response map, the algorithm is designed to converge to a local Nash equilibrium rather than a global optimum. This formulation aligns with the practical requirements of real-time conflict resolution, where finding a stable, collision-free local equilibrium is computationally prioritized over global optimality. In the ATM context, the Local Nash Equilibrium represents a spatiotemporally self-consistent, conflict-free cooperative solution from which no aircraft can reduce its individual cost through any small unilateral trajectory perturbation, which is precisely the stability condition required for an operationally deployable rerouting plan. In practice,
the trajectory variation norm decreases monotonically to below the convergence threshold in the overwhelming majority of tested scenarios, confirming that this solution concept is routinely achieved in operationally relevant configurations; the specific safety performance of the trajectories is documented in detail in Section 4.6.
4. Experimental Validation
To validate the effectiveness of the proposed method under thunderstorm conditions, this section details a series of simulation experiments. To ensure the practical relevance of the results, all test scenarios were designed so that thunderstorm activity is detected along a portion of each aircraft’s planned route, requiring the aircraft to avoid weather hazards and flight conflicts with minimal deviation from the overall flight plan.
4.1. Simulation Environment and Parameter Settings
The simulation experiments were conducted in a Python 3.9 environment. All optimization problems were solved using the IPOPT 3.13.4 solver via the CasADi 3.5.5 interface.
The operational envelope constraints for the simulated Airbus A320 aircraft are derived from the EUROCONTROL Base of Aircraft Data (BADA) User Manual, Revision 3.14 [
37]. The cruising true airspeed range is set to [720, 870] km/h, corresponding to Mach 0.68 (long-range cruise lower bound) through the maximum operating Mach number of 0.82 at FL350 under International Standard Atmosphere (ISA) conditions. The longitudinal acceleration is limited to [−0.5, 0.5] m/s². The turn rate is limited to [−1.5, 1.5] °/s. The minimum inter-aircraft separation of 10 km is set as a conservative rounding of the standard en-route radar separation minimum of 5 NM (9.26 km) specified in ICAO Doc 4444 PANS-ATM §8.7.3 [
38]. The minimum clearance from the 30 dBZ thunderstorm boundary of 20 km is adopted as a conservative margin above the approximately 10 statute miles (~16 km) avoidance distance associated with MODERATE-intensity convective echoes under FAA AIM §7-1-27 [
39] and FAA AC 00-45H [
36]. All parameters are summarized in
Table 1.
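The limits quoted above can be collected in a small configuration object, which also makes the two safety margins explicit. The numerical values come from the text; the class, field, and method names are hypothetical, and the control-saturation helper is only an illustration of how such limits might be applied.

```python
from dataclasses import dataclass

NM_TO_KM = 1.852               # international nautical mile in km
STATUTE_MILE_TO_KM = 1.609344  # statute mile in km


@dataclass(frozen=True)
class OperationalEnvelope:
    """Limits quoted in the text; names are illustrative, not from the paper."""
    tas_min_kmh: float = 720.0
    tas_max_kmh: float = 870.0
    accel_max_ms2: float = 0.5
    turn_rate_max_deg_s: float = 1.5
    min_separation_km: float = 10.0
    min_storm_clearance_km: float = 20.0

    def clip_controls(self, accel_ms2, turn_rate_deg_s):
        """Saturate a commanded acceleration and turn rate at the envelope limits."""
        accel = max(-self.accel_max_ms2, min(self.accel_max_ms2, accel_ms2))
        turn = max(-self.turn_rate_max_deg_s,
                   min(self.turn_rate_max_deg_s, turn_rate_deg_s))
        return accel, turn


envelope = OperationalEnvelope()
# Margin of the 10 km separation above the 5 NM radar minimum.
separation_margin_km = envelope.min_separation_km - 5 * NM_TO_KM
# Margin of the 20 km clearance above the 10 statute mile avoidance distance.
clearance_margin_km = envelope.min_storm_clearance_km - 10 * STATUTE_MILE_TO_KM
```

With these conversions, the separation setting sits about 0.74 km above the 5 NM minimum and the storm clearance about 3.9 km above the 10 statute mile guideline.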
The weight configuration in Table 1 reflects a deliberate prioritization of flight economy over control aggressiveness. The dominant weight assigned to path length (5.0) encodes the operational objective of minimizing detour distance and, by extension, fuel burn, which is the primary cost driver in practical thunderstorm rerouting decisions. The control energy weight (0.8) and the velocity smoothness weight (0.1) serve complementary roles: the former penalizes large acceleration and turn-rate inputs that would stress the airframe and reduce passenger comfort, while the latter suppresses rapid speed oscillations that could arise from conflicting ETA and avoidance requirements. The curvature penalty weight (1.0) prevents geometrically degenerate trajectories with sharp heading reversals that, while mathematically feasible, are operationally unacceptable.
The relative magnitudes of the weights directly govern the trade-off between trajectory economy and control smoothness. The dominant path length weight produces a cost-driven speed-heading priority, wherein velocity modulation is preferred over heading deviation as the lower-cost response to temporal and spatial constraints. The remaining weights are calibrated to suppress geometrically degenerate solutions and excessive control activity while preserving the optimizer’s freedom to generate dynamically feasible avoidance maneuvers. This cost-driven hierarchy is consistent with standard ATM practice, as demonstrated in the representative scenario analyzed in Section 4.3.
4.2. Single-Aircraft Avoidance Scenario Analysis
The results for a single-aircraft, three-thunderstorm avoidance test are shown in Figure 1. In this scenario, the originally planned flight path passes through three closely spaced thunderstorm areas. After replanning with the algorithm, the aircraft navigates around all three thunderstorm areas before rejoining its originally planned route.
Analysis of trajectory flyability confirms that the segment lengths meet continuous flight requirements, with no unflyable segments.
4.3. Multi-Aircraft Avoidance Scenario Analysis
This study simulates a scenario in which three aircraft approach a single thunderstorm area from three different directions (south, west, and northwest) at the same time and speed. The thunderstorm is located at the intersection of the routes of two of the aircraft. If these two aircraft proceeded as planned, they would fly through the hazard area and come into conflict with each other. Both must therefore detour around the thunderstorm simultaneously, which itself carries a high risk of conflict. Meanwhile, the third aircraft is also in the vicinity of the storm. The proposed algorithm plans for all three aircraft using the iterative game-theoretic approach, with safety separation and conflict avoidance imposed as constraints. The multi-aircraft avoidance results after 4 iterations are shown in Figure 2.
The simulation results show that the proposed method generates three trajectories that completely avoid the thunderstorm area from different directions while staying close to the original routes, and that all inter-aircraft conflicts are avoided. The algorithm converges in 4 iterations, with computation times of 1.14 s, 0.90 s, 0.86 s, and 0.77 s, respectively (3.67 s in total), so planning completes within seconds. In all tested scenarios, the algorithm reached a stable state within a small number of iterations. This convergence behavior, combined with the numerical solving capabilities of the CasADi framework, supports the method’s timeliness and reliability in handling dynamic, high-density avoidance tasks.
To quantitatively assess the framework’s ability to meet preset ETA constraints, Table 2 presents the planned versus actual arrival times for each aircraft. The results show that although all three aircraft execute complex cooperative avoidance maneuvers, which inherently increase the flight path length, their final arrival time errors are all within a few seconds. This outcome follows from the velocity optimization: the optimizer compensates for the longer avoidance path by accelerating, so the added distance does not translate into a delay. Consequently, even with extended trajectories, the aircraft meet their scheduled arrival times. In the context of actual ATM, these negligible errors are well within Required Navigation Performance (RNP) standards. This indicates that the proposed velocity integral constraint functions as a hard equality constraint, ensuring that aircraft can precisely meet their 4D waypoint time requirements in dynamic, high-conflict-density environments and supporting the method’s reliability in time-critical missions.
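The speed-compensation argument can be checked with simple arithmetic: holding a fixed arrival time over a longer path requires a proportionally higher average speed, and this is feasible only while that speed stays inside the cruise envelope. The sketch below uses the [720, 870] km/h range from Section 4.1; the path lengths and the 0.5 h ETA are hypothetical numbers chosen for illustration, and acceleration limits are ignored.

```python
def required_average_speed_kmh(path_length_km, eta_h):
    """Average speed needed to cover the path exactly in the allotted time."""
    return path_length_km / eta_h


def can_hold_eta(path_length_km, eta_h, v_min_kmh=720.0, v_max_kmh=870.0):
    """True if the required average speed lies within the cruise speed range."""
    v = required_average_speed_kmh(path_length_km, eta_h)
    return v_min_kmh <= v <= v_max_kmh


# Hypothetical case: 400 km planned at 800 km/h gives an ETA of 0.5 h.
eta_h = 400.0 / 800.0
speed_for_small_detour = required_average_speed_kmh(420.0, eta_h)  # 20 km detour
small_detour_ok = can_hold_eta(420.0, eta_h)
large_detour_ok = can_hold_eta(440.0, eta_h)                        # 40 km detour
```

In this example a 20 km detour needs an average of 840 km/h, which is inside the envelope, so the ETA can be held by speed alone. A 40 km detour would need 880 km/h, which exceeds the upper bound, so an equality constraint on arrival time would become infeasible for that path.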
Figure 3 illustrates the temporal evolution of separation distances between all aircraft pairs, while Figure 4 depicts the distance between each aircraft and the meteorological obstacles. These are the core metrics for evaluating the safety of the proposed conflict resolution strategy. All distance curves remain above the minimum safety separation threshold (indicated by the red dashed line) throughout the entire flight horizon. The results show that even when thunderstorms occupy core airspace, the multi-aircraft trajectories remain spatiotemporally compact: all aircraft navigate around the dynamic thunderstorms while maintaining inter-aircraft separation above the critical threshold, supporting the coordination efficiency of the proposed algorithm in highly constrained airspace.
Analyzing the simulation results from
Figure 5, it is evident that under the dual constraints of conflict avoidance and ETA, aircraft 2 and 3 adopt a combined strategy of heading maneuvers and speed adjustments. In contrast, aircraft 1 exhibits significant control heterogeneity. Because its original path was not directly obstructed by the thunderstorm, the algorithm, during the multi-aircraft game iterations, determined that maintaining a direct flight was the globally cost-optimal response, thus avoiding unnecessary rerouting costs. Although its heading remains constant, aircraft 1’s velocity profile shows precise dynamic modulation, aimed at eliminating potential spatiotemporal conflict risks with the detouring aircraft (2 and 3) and compensating for time losses due to their path deviations. This behavior reveals an emergent control hierarchy within the optimization framework. When faced with time deviations or minor conflict risks, the optimizer preferentially uses longitudinal speed compensation to absorb deviations within allowable limits (i.e., “time absorption”). This is primarily because speed adjustments mainly involve changes in acceleration, which have a smaller marginal impact on the total cost. Only when speed adjustments cannot resolve safety-critical encounters or when obstacles like thunderstorms block the path does the optimizer initiate heading maneuver strategies, even though this leads to a significant increase in total cost by triggering curvature penalties, increasing angular velocity energy consumption, and extending path length. This “speed-first, heading-second” control logic is not driven by pre-set heuristic rules but is an emergent optimal solution from the multi-objective cost function within the dynamic game. The results confirm that the framework reproduces a decision hierarchy familiar to practicing controllers: speed adjustment first, heading change only when necessary.
4.4. Baseline Method Comparison
To evaluate the performance of the proposed game-theoretic optimal control framework (hereafter, the proposed method), we conducted a comparative analysis against two representative baseline methods: A* + IBR and the Alternating Direction Method of Multipliers (ADMM). Each baseline isolates one design choice: the comparison with A* + IBR isolates the benefit of our continuous NLP-based planner over a discrete grid-search planner while using the same IBR coordination mechanism, whereas the comparison with ADMM isolates the benefit of our IBR coordination mechanism over another popular decentralized optimization technique while using the same NLP-based planner. The quantitative results of this comparison are summarized in Table 3, with the corresponding trajectories visualized in Figure 6. An upward arrow (↑) indicates that a higher value is better, while a downward arrow (↓) indicates that a lower value is better.
4.4.1. Comparison with A* + IBR
The A* + IBR method combines a discrete grid-based planner with the same iterative best-response coordination logic as our proposed method. As shown in
Figure 6b, the trajectories generated by A* are visibly jagged and consist of sharp, piecewise-linear segments. This is a direct consequence of the underlying grid representation, which inherently limits the solution space to discrete state transitions.
The quantitative results in
Table 3 confirm this visual assessment. The Trajectory Smoothness, a metric where lower is better, is exceptionally poor for A* + IBR (64.466 vs. 0.153 for our method), indicating frequent and sharp heading changes that are not dynamically feasible for commercial aircraft without significant post-processing. Although A* + IBR shows minimal control effort and speed variation, this is an artifact of its simplified model, which does not perform genuine dynamic optimization of velocity profiles. Most critically, the discrete nature of the grid and the post hoc separation enforcement resulted in safety violations, with inter-aircraft separation (9.510 km) falling below the required 10 km minimum, and thunderstorm clearance (19.728 km) falling below the required 20 km minimum. In contrast, our proposed method, by operating in a continuous domain, generates inherently smooth and dynamically feasible trajectories that strictly adhere to all safety constraints. Furthermore, the total path length is significantly shorter (446.57 km vs. 505.89 km), highlighting the sub-optimality introduced by grid discretization. This comparison validates the superiority of using a continuous optimal control planner (NLP/IPOPT) for generating high-quality, safe, and efficient aircraft trajectories.
4.4.2. Comparison with ADMM
ADMM is another powerful technique for decentralized optimization, which, like our method, uses NLP/IPOPT as the core planner for each aircraft. The key difference lies in the coordination mechanism: while our approach shares full trajectory information and enforces hard separation constraints, ADMM coordinates through shared dual variables (or “prices”) and treats inter-aircraft separation as a soft constraint within an augmented Lagrangian.
As seen in
Table 3, ADMM successfully generates conflict-free trajectories with zero violations and maintains a slightly larger minimum separation (11.122 km), which is a positive outcome. However, this safety comes at a significant cost in other aspects. The Total Path Length (495.15 km) and Trajectory Smoothness (36.249), both metrics where lower is better, are substantially worse than those of our proposed method. The Control Effort (118.467 vs. 5.528) is dramatically higher, indicating aggressive and inefficient maneuvers, which is also reflected in the trajectory shapes in
Figure 6c. Most notably, the Computation Time for ADMM is much longer than for our method (25.33 s vs. 3.76 s).
This performance difference stems from the nature of the coordination. The soft-constraint formulation of ADMM often requires many more iterations and a carefully tuned penalty parameter to converge to a feasible solution, leading to longer computation times and less efficient trajectories. In contrast, the IBR mechanism in our framework, by treating other aircraft trajectories as hard constraints, converges rapidly to a high-quality, locally optimal Nash Equilibrium. This comparison validates that for this specific ATM problem, the IBR coordination mechanism provides a more effective balance of solution quality, safety, and computational efficiency than ADMM.
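The coordination pattern described above can be illustrated with a deliberately small sketch. It is not the paper's NLP/IPOPT planner: each "aircraft" here picks a single scalar position (a hypothetical stand-in for a lateral offset), its cost is the squared deviation from a preferred position, and the current positions of the other agents are treated as hard separation constraints. The names `best_response` and `iterative_best_response`, the cost function, and the numbers are all assumptions made for illustration. Sequential best-response loops of this kind are not guaranteed to converge in general, so the iteration cap matters.

```python
def best_response(pref, others, d_min):
    """Closest point to `pref` that keeps at least `d_min` from every other agent."""
    def feasible(x):
        return all(abs(x - o) >= d_min - 1e-9 for o in others)

    if feasible(pref):
        return pref
    # In 1D the constrained optimum lies on the boundary of some exclusion interval.
    candidates = [o + s * d_min for o in others for s in (-1.0, 1.0)]
    candidates = [c for c in candidates if feasible(c)]
    return min(candidates, key=lambda c: abs(c - pref))


def iterative_best_response(prefs, d_min, max_iters=50, tol=1e-6):
    """Sequentially update each agent against the others' latest decisions."""
    x = list(prefs)
    for it in range(1, max_iters + 1):
        max_change = 0.0
        for i in range(len(x)):
            others = x[:i] + x[i + 1:]
            new = best_response(prefs[i], others, d_min)
            max_change = max(max_change, abs(new - x[i]))
            x[i] = new
        if max_change < tol:  # no agent wants to deviate: a fixed point of the loop
            return x, it
    return x, max_iters


if __name__ == "__main__":
    positions, iterations = iterative_best_response([0.0, 4.0], d_min=10.0)
    print(positions, iterations)
```

When the loop stops because no agent changes its decision, each decision is a best response to the others, which is the fixed-point property that an IBR scheme uses to approximate a Nash Equilibrium.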
4.4.3. Overall Assessment
The comparative analysis demonstrates the comprehensive advantages of the proposed framework. It successfully combines the strengths of a continuous optimal control planner (providing smooth, efficient, and dynamically feasible trajectories) with a fast and effective game-theoretic coordination mechanism (IBR). ADMM also achieves zero violations, but at the cost of longer paths, rougher trajectories, and a much longer computation time, while A* + IBR violates both separation minima. Among the three methods, only the proposed one combines safety (zero violations), efficiency (shortest path length), and trajectory quality (best smoothness) with a short computation time.
4.5. Monte Carlo Study and Sensitivity Analysis of Formation Size
4.5.1. Experimental Design
The simulations presented thus far are all based on fixed scenarios, which makes it difficult to assess algorithm performance across the broader range of conditions encountered in practice. To address this, a Monte Carlo study was conducted using a large set of randomly generated scenarios.
The experiment covered scenarios with varying numbers of aircraft (1–10) and thunderstorms (1–4), ensuring diversity in conflict geometry and storm encounter patterns. For each aircraft count group, 500 independent scenarios were generated and simulated. Thunderstorm parameters—center location, radius, velocity, and direction of movement—were all drawn independently from physically reasonable ranges.
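A scenario generator of the kind described above might be sketched as follows. The numeric ranges, the uniform sampling, and the names `Storm`, `sample_scenario`, and `build_study` are assumptions made for illustration; the actual parameter bounds and distributions used in the study are not listed here.

```python
import random
from dataclasses import dataclass


@dataclass
class Storm:
    x_km: float
    y_km: float
    radius_km: float
    speed_kmh: float
    heading_deg: float


# Illustrative ranges only (assumed, not taken from the paper).
AREA_KM = (0.0, 500.0)
RADIUS_KM = (10.0, 40.0)
SPEED_KMH = (0.0, 60.0)
HEADING_DEG = (0.0, 360.0)


def sample_storm(rng: random.Random) -> Storm:
    """Draw each storm parameter independently from its range."""
    return Storm(
        x_km=rng.uniform(*AREA_KM),
        y_km=rng.uniform(*AREA_KM),
        radius_km=rng.uniform(*RADIUS_KM),
        speed_kmh=rng.uniform(*SPEED_KMH),
        heading_deg=rng.uniform(*HEADING_DEG),
    )


def sample_scenario(n_aircraft: int, rng: random.Random) -> dict:
    """One scenario: a fixed aircraft count and one to four random storms."""
    n_storms = rng.randint(1, 4)  # inclusive on both ends
    return {
        "n_aircraft": n_aircraft,
        "storms": [sample_storm(rng) for _ in range(n_storms)],
    }


def build_study(runs_per_group: int = 500, seed: int = 0) -> dict:
    """Generate `runs_per_group` scenarios for each aircraft count from 1 to 10."""
    rng = random.Random(seed)  # seeded so the study is reproducible
    return {
        n: [sample_scenario(n, rng) for _ in range(runs_per_group)]
        for n in range(1, 11)
    }
```

Seeding a dedicated `random.Random` instance keeps the scenario set reproducible across runs, which is useful when comparing solver settings on identical inputs.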
4.5.2. Sensitivity to Aircraft Count
Grouping the scenarios by aircraft count yields the breakdown shown in
Table 4, which covers formation sizes from one to ten aircraft. For formations of up to seven aircraft, the algorithm converged in every tested scenario with all safety constraints fully satisfied. For formations of eight or more aircraft, both convergence and safety rates declined moderately.
Solve time scales predictably with formation size, rising from under 4 s for one to three aircraft to 4–17 s for four to seven aircraft, and up to 32 s for the largest groups. This growth reflects the quadratic increase in pairwise separation constraints: a seven-aircraft formation generates C(7,2) = 21 pairs, while ten aircraft produce C(10,2) = 45, so a 43% increase in fleet size more than doubles the number of pairs. For pre-tactical rerouting applications that typically operate on a several-minute look-ahead horizon, even the upper end of this range remains operationally acceptable. The detour ratio increases modestly from 6.36% for small formations to 6.79% for the largest, a difference of less than half a percentage point, suggesting that the game-equilibrium formulation effectively contains the cooperative cost overhead even as formation size grows.
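The pair counts quoted above follow directly from the binomial coefficient C(n, 2) = n(n − 1)/2. A short check, with the hypothetical helper names `pair_count` and `growth`, makes the scaling explicit:

```python
from math import comb


def pair_count(n_aircraft: int) -> int:
    """Number of unordered aircraft pairs, C(n, 2)."""
    return comb(n_aircraft, 2)


def growth(n_from: int, n_to: int) -> tuple[float, float]:
    """Return (fleet-size ratio, pair-count ratio) between two formation sizes."""
    return n_to / n_from, pair_count(n_to) / pair_count(n_from)


if __name__ == "__main__":
    for n in range(2, 11):
        print(n, pair_count(n))
    print(growth(7, 10))
```

Going from seven to ten aircraft scales the fleet by 10/7 ≈ 1.43, while the number of pairs scales by 45/21 ≈ 2.14.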
4.6. Dynamic Rerouting Case Study Based on Real-World Flight Data
To verify the practical applicability of the proposed method under realistic operating conditions, a real-world case study was constructed based on publicly available data. Flight information was obtained from the public commercial flight-tracking platform VariFlight, from which four real commercial flights operating on 7 April 2026 were selected: GJ8860 from Guangzhou Baiyun to Hangzhou Xiaoshan, CZ6586 from Hefei Xinqiao to Guangzhou Baiyun, JD5067 from Xi’an Xianyang to Xiamen Gaoqi, and CZ3572 from Shanghai Hongqiao to Guangzhou Baiyun. The thunderstorm systems involved in the scenario were reconstructed from the convective weather observations released by the China Meteorological Administration for the same region on that day. The proposed method was then evaluated under this combination of real flight and meteorological background. Key waypoints from their planned routes are shown in
Table 5.
During their flights, the aircraft involved encountered a thunderstorm system between waypoints LEKUV and P215. Using the game-theoretic cooperative planning framework proposed in this paper, the system generated real-time avoidance advisories for all four flights that satisfied their waypoint ETA constraints. Alternative rerouting trajectories were then planned and evaluated on the basis of these advisories. The new routes preserved the main structure of the original flight plans while incorporating the necessary avoidance waypoints. A comparison of the routes before and after the rerouting, along with the updated waypoints, is presented in
Table 6 and
Figure 7.
The simulation results confirm that the generated avoidance trajectories successfully circumvented the high-risk thunderstorm area and ensured mutual separation among the four aircraft, with a minimum inter-aircraft separation distance of 10.45 km. Additionally, the trajectories maintained a safe distance from the thunderstorm, with the closest encounter point at 20.24 km. All trajectory segments complied with the aircraft’s dynamic constraints. Compared to the original flight plans, the rerouting resulted in an approximately 6.42% increase in actual flight distance. While this increase represents an additional cost, it is well within a reasonable range given the imperative of ensuring flight safety.
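The two summary quantities reported above, minimum inter-aircraft separation and relative increase in flight distance, can be computed from sampled trajectories roughly as follows. This is a minimal sketch under stated assumptions: trajectories are lists of planar (x, y) points in kilometres sampled at shared time steps, and the function names are hypothetical. Real route data would need geodesic distances rather than the flat-plane distances used here.

```python
from itertools import combinations
from math import hypot


def min_pairwise_separation(trajs):
    """Smallest distance between any two aircraft at the same time step.

    `trajs` is a list of trajectories; each is a list of (x_km, y_km)
    points sampled at time steps shared by all aircraft.
    """
    return min(
        hypot(ax - bx, ay - by)
        for a, b in combinations(trajs, 2)
        for (ax, ay), (bx, by) in zip(a, b)
    )


def path_length(traj):
    """Sum of straight-line segment lengths along one trajectory."""
    return sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(traj, traj[1:])
    )


def detour_ratio(original, rerouted):
    """Relative increase in flown distance caused by the reroute."""
    base = path_length(original)
    return (path_length(rerouted) - base) / base


if __name__ == "__main__":
    planned = [(0.0, 0.0), (100.0, 0.0)]
    rerouted = [(0.0, 0.0), (50.0, 10.0), (100.0, 0.0)]
    print(f"detour ratio: {detour_ratio(planned, rerouted):.2%}")
```

A result from `min_pairwise_separation` would be compared against the 10 km separation minimum, and the same distance test applied between each aircraft and the storm boundary would give the thunderstorm clearance.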
5. Conclusions
This paper addresses the problem of multi-aircraft 4D trajectory cooperative planning in dynamic weather environments by proposing a cooperative planning framework based on optimal control theory and game theory. A comprehensive environmental model including dynamic thunderstorm life cycles and aircraft kinematics was constructed. Using the CasADi toolkit, the single-aircraft 4D trajectory generation problem was transformed into an NLP problem for solution. Building on this, a cooperative conflict resolution mechanism based on IBR was introduced to address multi-aircraft conflicts, achieving an efficient approximation of the Nash Equilibrium.
Simulation results show that the framework effectively avoids dynamic weather threats and inter-aircraft conflicts while precisely meeting 4D waypoint time constraints. A comparison with two baseline methods (A* + IBR and ADMM) indicates that the proposed approach offers a better overall balance among safety, trajectory quality, and computational efficiency. A case study based on real Chinese civil aviation routes confirms its applicability under realistic conditions, and Monte Carlo experiments further demonstrate that the framework remains stable across the formation sizes commonly encountered in cooperative rerouting tasks, with solve times suitable for pre-tactical planning.
Future research could extend this work in three directions. First, more complex operational constraints, such as uncertain wind fields, communication limitations, and weather forecast uncertainty, could be incorporated into the model. Second, altitude could be integrated so that the scenario extends to the vertical dimension. Third, the computational performance and scalability of the framework could be evaluated in large-scale fleet scenarios. These extensions would broaden the scope of the study and enhance its application potential in future ATM systems.