A Cooperative Trajectory Planning Method for Multi-Aircraft Thunderstorm Avoidance Based on Optimal Control and Game Equilibrium

Su, Rui; Wen, Xiangxi; Li, Shuangfeng; Chen, Youfu; Yang, Wenda

doi:10.3390/aerospace13060537


by Rui Su ¹, Xiangxi Wen ¹,*, Shuangfeng Li ¹, Youfu Chen ² and Wenda Yang ¹

¹ Air Traffic Control and Navigation College, Air Force Engineering University, Xi’an 710051, China

² Unit 93220 of the PLA, Harbin 150049, China

* Author to whom correspondence should be addressed.

Aerospace 2026, 13(6), 537; https://doi.org/10.3390/aerospace13060537 (registering DOI)

Submission received: 16 April 2026 / Revised: 20 May 2026 / Accepted: 1 June 2026 / Published: 9 June 2026

(This article belongs to the Section Air Traffic and Transportation)


Abstract

This paper presents a cooperative trajectory planning method for multiple aircraft avoiding thunderstorms, formulated within a game-theoretic optimal control framework. We model the multi-aircraft system as a non-cooperative game and employ an Iterative Best Response (IBR) algorithm to decompose the coupled planning problem into a series of single-agent, nonlinear optimal control subproblems. Each subproblem is solved using the CasADi framework, enabling the continuous and simultaneous optimization of both aircraft velocity and heading. This approach directly generates smooth, dynamically feasible 4D trajectories that satisfy strict on-time arrival constraints at each waypoint, addressing a key limitation of many existing methods. Our simulations show that the framework not only ensures safe separation from thunderstorms and other aircraft but also effectively manages arrival times, with errors on the order of seconds. These results demonstrate the method’s capability to produce safe, efficient, and punctual trajectories for complex multi-aircraft encounters in dynamic weather.

Keywords:

trajectory optimization; air traffic management; thunderstorm avoidance; multi-aircraft cooperation; optimal control; game theory; nonlinear programming

1. Introduction

Air Traffic Management (ATM) systems worldwide face increasing strain from rising flight densities [1,2]. This pressure is significantly intensified by severe weather, particularly thunderstorms, which are a primary cause of flight delays and disruptions, costing the industry billions of dollars annually [3]. The turbulence, icing, and wind shear associated with thunderstorms pose direct safety threats, mandating that aircraft maintain safe separation from them [4,5]. However, avoidance maneuvers lead to detours, increasing fuel burn and flight time and disrupting Estimated Time of Arrival (ETA). These local disruptions can trigger network-wide cascading delays [6,7]. The central challenge, therefore, is to plan safe avoidance trajectories that also maintain temporal consistency.

In response to the threat that convective weather phenomena such as thunderstorms pose to flight safety, systematic research has been conducted on path planning for thunderstorm avoidance. At the level of meteorological modeling, researchers abstract convective cells as dynamically evolving no-fly zones characterized by inherent uncertainty, and employ Ensemble Prediction Systems (EPS) to generate probabilistic representations of thunderstorm development, thereby enabling the quantitative assessment of the safety margin associated with various avoidance strategies [8]. At the level of planning algorithms, stochastic optimal control approaches directly incorporate meteorological uncertainties into trajectory generation by maximizing the probability of safely reaching waypoints [9], while methods such as Scenario-Based Rapidly-exploring Random Trees (SB-RRT*) search for the shortest avoidance paths under user-defined safety thresholds and achieve near real-time planning performance with the aid of Graphics Processing Unit (GPU) parallelization [10]. Further investigations indicate that atmospheric uncertainty and convective characteristics exert a quantifiable influence on fuel consumption and flight time deviations within trajectory optimization. Consequently, the operational acceptability of a detour strategy depends not only on its spatial avoidance capabilities but also on its ability to maintain the ETA within permissible limits, thereby imposing rigorous temporal constraints on the avoidance planning problem [11,12].

The need for temporal precision is amplified by modern operational concepts like Performance-Based Navigation (PBN) and 4D Trajectory Operations, which demand arrival accuracy on the order of seconds [13]. Consequently, avoidance algorithms must now generate precise 4D trajectories that meet strict time constraints at critical waypoints, in addition to ensuring spatial safety [14].

In high-density terminal airspace or en-route convergence zones, these challenges are further amplified by concurrent multi-aircraft avoidance scenarios. When multiple aircraft must simultaneously circumnavigate the same weather system, uncoordinated, independent avoidance decisions can easily lead to aircraft converging into the limited remaining airspace, thereby inducing new airborne conflicts [15,16]. Such scenarios essentially constitute a high-dimensional, multi-objective, and strongly coupled multi-agent decision problem: each aircraft must ensure its own safe passage around the thunderstorm while simultaneously maintaining statutory separation from all other aircraft, seeking a global equilibrium between cost minimization and temporal constraint satisfaction [17]. The traditional coordination model, which relies on the experience of air traffic controllers, struggles in these high-workload, dynamic situations, where decision quality is difficult to guarantee and the controller’s cognitive workload approaches its capacity limits [18]. Therefore, it is of significant theoretical and practical value to investigate trajectory planning theories and methods that can achieve coordinated and spatiotemporally consistent flight routing for multiple aircraft in dynamic meteorological environments.

In the field of automated trajectory planning, the academic community has conducted systematic explorations across various technical routes. Discrete algorithms based on graph search—including the A* algorithm and its various heuristic variants—have been widely adopted in research owing to their theoretical guarantees of completeness [19]. In recent years, further studies have sought to enhance A* and integrate it with other methodologies to achieve collaborative multi-aircraft detour path planning within discrete spaces [20]. However, graph-search-based methods inherently suffer from two major limitations: First, grid resolution constrains the fidelity of the solution space, potentially causing the omission of the true global optimum in the continuous domain; second, computational complexity escalates exponentially with increasing spatial resolution, rendering these methods incapable of satisfying the stringent real-time requirements of operational planning [21].

The Artificial Potential Field (APF) method, another classic planning paradigm, has been widely deployed in fields like Unmanned Aerial Vehicle (UAV) path planning due to its computational simplicity and rapid response [22,23]. Its core concept is to model obstacles as sources of a repulsive potential and the destination as an attractive potential well, guiding the aircraft towards the goal while avoiding obstacles under the influence of the resultant force. However, APF exhibits fundamental drawbacks in environments with complex, multiple constraints. The non-convexity of the potential function makes the algorithm highly susceptible to local minima, leading to planning failures. Moreover, the method does not readily accommodate strict kinematic constraints or time-window requirements, and its coordination capabilities in coupled multi-agent scenarios are limited [24].

In recent years, sampling-based random planning methods, such as the Rapidly exploring Random Tree (RRT) and its optimal variant RRT*, have garnered significant attention for their asymptotic optimality in high-dimensional, complex constraint spaces [25]. These methods explore the solution space by randomly sampling the state space and progressively extending a search tree, enabling them to handle non-convex obstacle constraints and theoretically guaranteeing convergence to the optimal solution with probability one. However, their convergence rates are often slow, the dynamic feasibility of the generated trajectories requires additional processing, and the computational overhead scales poorly with the number of agents in multi-agent cooperative scenarios [26].

In contrast to the aforementioned methods, trajectory optimization techniques based on optimal control provide a unified framework that naturally integrates dynamic constraints with performance optimization [27,28]. These approaches formulate trajectory planning as the minimization of a composite cost function—encompassing metrics such as fuel consumption, flight time deviation, and maneuver smoothness—subject to the aircraft’s kinematic differential equations, physical boundary conditions, and path constraints (e.g., weather avoidance, aircraft separation maintenance). Direct Collocation is a mainstream numerical technique for solving such continuous-time optimal control problems. It discretizes the time domain and converts dynamic constraints into algebraic ones, thereby transcribing the original problem into a large-scale, sparse Nonlinear Programming (NLP) problem solvable by mature interior-point solvers like the Interior Point Optimizer (IPOPT) [29]. Numerical optimization frameworks such as CasADi leverage automatic differentiation to provide exact gradients and Hessian matrices to the solver, significantly enhancing convergence speed and reliability [30]. Model Predictive Control (MPC) further extends trajectory optimization to real-time, reactive scenarios by repeatedly solving the optimization problem over a rolling horizon, enabling online adaptation to dynamic environmental changes [31]. However, when this centralized framework is directly applied to multi-agent scenarios, the dimension of the joint state space grows combinatorially with the number of agents, leading to the severe “curse of dimensionality” and rendering the problem computationally intractable [32].

To fundamentally address the scalability challenge in multi-agent cooperative planning, game theory provides a mathematical foundation for modeling strategic interactions among rational agents [33]. The Nash Equilibrium, a core solution concept in non-cooperative game theory, describes a stable state of strategies wherein no participant can unilaterally deviate to achieve a lower individual cost. Conflict resolution models based on the Nash Equilibrium have been validated in the ATM domain, proving capable of producing globally self-consistent and conflict-free cooperative solutions [34]. In practice, computing an exact Nash Equilibrium is often computationally prohibitive. Consequently, the Iterative Best Response (IBR) algorithm has emerged as a mainstream strategy for approximating this equilibrium. In IBR, each participant sequentially computes its optimal single-agent response given the fixed strategies of others, thereby iteratively approaching an equilibrium state [35]. This decentralized paradigm decomposes the high-dimensional coupled problem into a series of computationally tractable subproblems, effectively balancing solution quality with computational efficiency, and has been experimentally validated in domains such as multi-robot planning and cooperative UAV task allocation.

Existing literature exhibits a gap between discrete, combinatorially complex routing and continuous, dynamically feasible trajectory generation, particularly under strict 4D arrival constraints. This study addresses this gap by presenting an integrated framework that couples continuous optimal control with game-theoretic coordination. The core contributions of this work are:

(1) Integrated 4D Formulation: We formulate multi-aircraft thunderstorm avoidance as a continuous 4D trajectory optimization problem that simultaneously satisfies dynamic weather avoidance, inter-aircraft separation, and hard waypoint ETA constraints.
(2) Joint Speed-Heading Control: Unlike approaches that treat ETA as a soft penalty or assume constant speed, our formulation enforces ETA as a terminal equality constraint, allowing the optimizer to jointly modulate velocity and heading.
(3) Continuous Game-Theoretic Decomposition: We adapt the IBR mechanism to the continuous trajectory space, defining a Local Nash Equilibrium over continuous variables with a convergence criterion based on the trajectory-variation norm.
(4) Emergent Control Hierarchy: We demonstrate that under this formulation, a “speed-first, heading-second” decision hierarchy emerges naturally from the optimization without requiring heuristic rules, aligning with practical air traffic control heuristics.

The remainder of this paper is organized as follows: Section 2 establishes the mathematical models and problem formulation, including the aircraft kinematics, dynamic thunderstorm environment, and safety constraints. Section 3 presents the proposed planning framework, detailing the single-aircraft optimal control formulation and the multi-aircraft game-theoretic coordination mechanism based on IBR. Section 4 provides numerical simulations and result analysis. Finally, Section 5 concludes the paper.

2. Mathematical Models and Problem Formulation

This study primarily focuses on the cruise phase, where altitude variations are minimal. To manage the complexity of the multi-agent cooperative optimization problem at this stage of the research, the following assumptions are made: (1) the position and velocity information of each aircraft is known; and (2) trajectory planning is conducted in the horizontal plane, i.e., conflict avoidance is achieved at a fixed flight level. This approach allows us to concentrate on addressing the core challenges of horizontal conflict detection and avoidance within the same flight level, and to effectively apply optimal-control and game-theoretic frameworks.

2.1. Aircraft Kinematic Model

Each aircraft is modeled as a point mass moving in a 2D horizontal plane at a constant altitude. At time $t$, the state of aircraft $i$ is defined by the vector $\mathbf{x}_{i}(t) = [x_{i}, y_{i}, v_{i}, ψ_{i}]^{T}$, where $(x_{i}, y_{i})$ are its Cartesian coordinates, $v_{i}$ is its forward speed, and $ψ_{i}$ is its heading angle.

The control input vector is $\mathbf{u}_{i}(t) = [a_{i}, ω_{i}]^{T}$, where $a_{i}$ is the longitudinal acceleration and $ω_{i}$ is the angular velocity (turn rate). The aircraft’s motion is described by the following system of kinematic equations:

$$\begin{cases} \dot{x}_{i} = v_{i} \cos(ψ_{i}) \\ \dot{y}_{i} = v_{i} \sin(ψ_{i}) \\ \dot{v}_{i} = a_{i} \\ \dot{ψ}_{i} = ω_{i} \end{cases}$$

(1)

These dynamics are subject to operational limits on speed and control inputs, which are modeled as box constraints:

$$\begin{cases} v_{\min} \leq v_{i}(t) \leq v_{\max} \\ a_{\min} \leq a_{i}(t) \leq a_{\max} \\ ω_{\min} \leq ω_{i}(t) \leq ω_{\max} \end{cases}$$

(2)
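As a rough illustration of Equations (1) and (2), the sketch below advances the point-mass state by one forward-Euler step and clips speed and control inputs to their box limits. The function names, the Euler scheme, and the numerical limits in `LIMITS` are assumptions made for this example only; they are not values or methods reported in the paper.

```python
import math

# Hypothetical box limits for illustration only (not taken from the paper).
LIMITS = {"v": (200.0, 260.0), "a": (-0.5, 0.5), "omega": (-0.03, 0.03)}


def clamp(value, lower, upper):
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


def step(state, control, dt, limits=LIMITS):
    """Advance the state [x, y, v, psi] by one forward-Euler step of Eq. (1).

    The control [a, omega] and the resulting speed are clipped to the box
    constraints of Eq. (2) before being used.
    """
    x, y, v, psi = state
    a = clamp(control[0], *limits["a"])
    omega = clamp(control[1], *limits["omega"])
    x_next = x + v * math.cos(psi) * dt
    y_next = y + v * math.sin(psi) * dt
    v_next = clamp(v + a * dt, *limits["v"])
    psi_next = psi + omega * dt
    return (x_next, y_next, v_next, psi_next)


# Fly straight and level for ten one-second steps at 230 m/s.
state = (0.0, 0.0, 230.0, 0.0)
for _ in range(10):
    state = step(state, (0.0, 0.0), 1.0)
```

In an optimization setting the limits would be imposed as constraints on the decision variables rather than by clipping; the clipping here only keeps a hand-written simulation inside the same envelope.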

2.2. Environmental and Safety Constraints

2.2.1. Dynamic Thunderstorm Environment Model

To accurately characterize the dynamic threats posed by thunderstorms, this paper constructs a dynamic thunderstorm environment model driven by meteorological radar data. This approach moves beyond traditional binary (no-fly/fly) zone modeling by utilizing radar reflectivity fields to define hazardous regions and then employing a continuous, differentiable representation for path planning.

(1) Hazardous Zone Definition

At any given time $t_{k}$, the radar reflectivity field $Z(p, t_{k})$ is binarized to define the thunderstorm hazardous region $W(t_{k})$ as:

$$W(t_{k}) = \{ p \mid Z(p, t_{k}) \geq 30\ \text{dBZ} \}$$

(3)

The 30 dBZ reflectivity isoline serves as the physical boundary $\partial W(t_{k})$ of the thunderstorm hazardous zone. According to the Federal Aviation Administration (FAA) Advisory Circular [36], a reflectivity factor of 30 dBZ marks the lower bound of the MODERATE precipitation intensity category, signifying the onset of convective precipitation of sufficient intensity to constitute an operational avoidance requirement. This threshold is adopted in preference to the more commonly cited 40 dBZ contour because it provides an earlier and more conservative delineation of the hazard boundary, affording greater reaction time for trajectory replanning.

(2) Signed Distance Field (SDF) Construction

An SDF $ϕ(p, t_{k})$ is constructed for every point $p$ in the spatial domain:

$$ϕ(p, t_{k}) = \operatorname{sgn}(p \notin W(t_{k})) \cdot \min_{q \in \partial W(t_{k})} \| p - q \|$$

(4)

Here, the sign factor $\operatorname{sgn}(\cdot)$ equals $+1$ when $p$ lies outside the hazardous zone and $-1$ when it lies inside, so that $ϕ(p, t_{k})$ takes a positive value if $p$ is outside the hazardous zone, zero on the boundary, and a negative value if inside. A positive value of $ϕ(p, t_{k})$ directly represents the Euclidean distance from point $p$ to the nearest boundary of the hazardous zone. In implementation, $ϕ(p, t_{k})$ is efficiently computed on a discrete grid with a resolution of 1 km using a Fast Distance Transform algorithm.
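The construction in Equations (3) and (4) can be sketched on a small grid as follows. This is a brute-force illustration, not the Fast Distance Transform used in the paper: it measures the distance from each cell centre to the nearest cell centre of the opposite class, which approximates the distance to the 30 dBZ boundary to within about one cell. The function name and the toy reflectivity values are hypothetical.

```python
import math


def signed_distance_field(reflectivity, threshold_dbz=30.0, cell_km=1.0):
    """Approximate the SDF of Eq. (4) on a regular grid by exhaustive search.

    reflectivity: 2D list of dBZ values. Cells at or above threshold_dbz form
    the hazardous region W of Eq. (3). Returned values are in kilometres:
    positive outside W, negative inside W.
    """
    rows, cols = len(reflectivity), len(reflectivity[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    inside = {rc: reflectivity[rc[0]][rc[1]] >= threshold_dbz for rc in cells}
    hazard = [rc for rc in cells if inside[rc]]
    clear = [rc for rc in cells if not inside[rc]]

    sdf = [[0.0] * cols for _ in range(rows)]
    for r, c in cells:
        # Distance to the nearest cell of the opposite class.
        opposite = clear if inside[(r, c)] else hazard
        if opposite:
            dist = min(math.hypot(r - r2, c - c2) for r2, c2 in opposite)
            dist *= cell_km
        else:
            dist = math.inf  # grid is entirely hazardous or entirely clear
        sdf[r][c] = -dist if inside[(r, c)] else dist
    return sdf


# Toy 5 x 5 field with a single 45 dBZ cell in the centre.
field = [[10.0] * 5 for _ in range(5)]
field[2][2] = 45.0
sdf = signed_distance_field(field)
```

The exhaustive search costs on the order of the square of the number of cells, so it is only suitable for checking a faster implementation on small test fields.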

(3) Continuous Differentiable Approximation

To integrate seamlessly with the optimal control framework, the discrete SDF is fitted into a continuous and differentiable function of spatial position using a B-spline interpolation operator provided by the CasADi framework:

$$\hat{ϕ}(p, t_{k}) = B[ϕ(p, t_{k})]$$

(5)

where $B[\cdot]$ denotes the B-spline interpolation operator. The function $\hat{ϕ}$ is continuously differentiable across the entire domain, which supplies the IPOPT solver with accurate constraint gradients and supports reliable convergence of the optimization process.

(4) Safety Constraint

Aircraft $i$ at time $t_{k}$ with position $p_{i}(t_{k})$ must satisfy:

$$\hat{ϕ}(p_{i}(t_{k}), t_{k}) \geq d_{s}, \quad \forall i, \forall k$$

(6)

where $d_{s}$ denotes the minimum required clearance from the 30 dBZ reflectivity boundary, applied uniformly at all time steps across all aircraft.

(5) Dynamic Evolution Characteristics

The SDF is independently calculated at each planning moment $t_{k}$ using current radar observation data. This allows for a comprehensive description of the following dynamic evolutionary characteristics throughout a thunderstorm’s life cycle:

Deformation: Changes in the shape, area, and orientation of reflectivity contours are directly reflected in the updates of $\partial W(t_{k})$ at each time step.

Fission: The splitting of a connected echo region into two independent areas corresponds to an increase in the number of connected components within $\partial W(t_{k})$.

Fusion: The merging of two independent echo regions into a single entity corresponds to a decrease in the number of connected components.

All these topological changes are driven by radar data and automatically captured through the time-step-by-time-step update of the SDF. This modeling approach enables the planning algorithm to perceive subtle dynamic changes, ensuring safe avoidance of high-risk areas in complex dynamic environments.

2.2.2. Aircraft Separation Standard

For any pair of aircraft $i$ and $j$, a minimum safety separation $d_{ac}$ must be maintained at all times. This translates to the following set of constraints for all $t$:

$$(x_{i}(t) - x_{j}(t))^{2} + (y_{i}(t) - y_{j}(t))^{2} \geq d_{ac}^{2}$$

(7)

These nonlinear, non-convex coupled constraints are the primary source of complexity in the multi-agent optimization problem.
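A minimal sketch of how Equation (7) might be checked on discretized trajectories is given below. It assumes that all aircraft are sampled at the same time nodes; the function name and the toy coordinates are hypothetical.

```python
def separation_violations(trajectories, d_ac):
    """List every (i, j, k) where aircraft i < j violate Eq. (7) at time node k.

    trajectories: one list of (x, y) positions per aircraft, all sampled at
    the same time nodes. d_ac: minimum separation in the same length unit.
    """
    violations = []
    n = len(trajectories)
    for i in range(n):
        for j in range(i + 1, n):
            pairs = zip(trajectories[i], trajectories[j])
            for k, ((xi, yi), (xj, yj)) in enumerate(pairs):
                # Compare squared distances, as in Eq. (7), to avoid a sqrt.
                if (xi - xj) ** 2 + (yi - yj) ** 2 < d_ac ** 2:
                    violations.append((i, j, k))
    return violations


# Two aircraft on opposing tracks that pass 4 units apart at the middle node.
track_a = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
track_b = [(20.0, 0.0), (10.0, 4.0), (0.0, 0.0)]
conflicts = separation_violations([track_a, track_b], d_ac=5.0)
```

The number of pairwise checks grows quadratically with the number of aircraft, which is one concrete way to see why the coupled constraints dominate the cost of the joint problem.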

2.2.3. Boundary and ETA Constraints

Each aircraft’s mission is defined by a starting position $(x_{i}(0), y_{i}(0))$, an ending position $(x_{i}^{dest}, y_{i}^{dest})$, and a required time of arrival $T$.

The ETA constraint is enforced through two complementary components. The first is a necessary feasibility condition. Since any physically realizable trajectory must cover a distance no less than the straight-line separation between the origin and destination, the following inequality is imposed:

$$\int_{0}^{T} v_{i}(t)\, dt \geq \sqrt{(x_{i}^{dest} - x_{i}(0))^{2} + (y_{i}^{dest} - y_{i}(0))^{2}}$$

(8)

This condition ensures that the aircraft’s speed envelope is sufficient to reach the destination within the allotted time, preventing the solver from converging to degenerate low-speed solutions that are physically infeasible. If the combined demands of thunderstorm avoidance, inter-aircraft separation, and speed limits render this condition unsatisfiable, the problem is correctly identified as infeasible.
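One way to read Equation (8) is as a quick screening test before any solver is called: because $v_{i}(t) \leq v_{\max}$, the left-hand side can never exceed $v_{\max} T$, so the destination is unreachable whenever $v_{\max} T$ is smaller than the straight-line distance. The sketch below applies that test and also evaluates the integral for a sampled speed profile with the trapezoidal rule. The function names and the numbers are assumptions for illustration.

```python
import math


def straight_line_distance(start, dest):
    """Right-hand side of Eq. (8)."""
    return math.hypot(dest[0] - start[0], dest[1] - start[1])


def eta_reachable(start, dest, horizon_s, v_max):
    """Necessary condition implied by Eq. (8) and the speed limit v <= v_max."""
    return v_max * horizon_s >= straight_line_distance(start, dest)


def distance_flown(speeds, dt):
    """Left-hand side of Eq. (8) for speeds sampled every dt seconds."""
    return sum(0.5 * (speeds[k] + speeds[k + 1]) * dt
               for k in range(len(speeds) - 1))


# A 500 km leg with a hypothetical speed ceiling of 260 m/s.
start, dest = (0.0, 0.0), (300_000.0, 400_000.0)
ok_2000 = eta_reachable(start, dest, horizon_s=2000.0, v_max=260.0)
ok_1800 = eta_reachable(start, dest, horizon_s=1800.0, v_max=260.0)
```

Passing this test does not establish feasibility, since detours around the hazard and around other aircraft lengthen the required path; it only rules out horizons that are too short even for a direct flight.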

Equation (8) is, however, a necessary but not sufficient condition for exact ETA satisfaction. The sufficient condition is established through the following terminal equality constraints, which are imposed as hard boundary conditions of the optimal control problem:

$$x_{i}(T) = x_{i}^{dest}, \quad y_{i}(T) = y_{i}^{dest}$$

(9)

where the planning horizon $T = T_{ETA}$ is treated as a fixed parameter, not a decision variable. Under this fixed-time-horizon formulation, the optimizer is required to find a velocity and heading profile that simultaneously satisfies all path constraints and delivers the aircraft to the exact destination coordinates at the prescribed time. The combination of the fixed horizon, the terminal equality constraints in Equation (9), and the feasibility guard in Equation (8) constitutes the complete mechanism for 4D waypoint ETA management.

3. Proposed Planning Framework

Based on these assumptions, the overall framework of this work is as follows: First, a dynamics and constraint model is formulated for the single-aircraft thunderstorm avoidance problem within a continuous-time optimal control framework, and is then transcribed into a large-scale sparse NLP problem via direct collocation. High-precision numerical solutions are obtained using CasADi (Version 3.7.2) and IPOPT (Version 3.14.11). The multi-aircraft cooperative avoidance problem is then abstracted as a non-cooperative game, and an IBR mechanism is employed to decompose the high-dimensional, strongly coupled joint optimization problem into a sequence of single-aircraft optimal control subproblems, each solved with the other aircraft’s trajectories held fixed. In this way, distributed and coordinated multi-aircraft trajectory generation is achieved.

3.1. Single-Aircraft Trajectory Planning

For a single aircraft, given that the trajectories of all other aircraft are fixed, the problem reduces to finding an optimal trajectory and control inputs that minimize a cost function. This cost function is a weighted sum of several performance indices. The relative magnitudes of the weight coefficients give the optimizer a hierarchical sensitivity to the different operational costs. This prioritization is chosen to align with practical aviation standards and to keep the decision-making process cost-effective.

$$J_{i} = w_{1} J_{path} + w_{2} J_{ctrl} + w_{3} J_{smooth} + w_{4} J_{curve}$$

(10)

where the components are:

Path Length ($J_{path}$): Minimizes the total flight distance, serving as a proxy for fuel consumption and flight time.

$$J_{path} = \int_{0}^{T} v_{i}(t)\, dt$$

(11)

Control Effort ($J_{ctrl}$): Penalizes large control inputs to encourage energy-efficient and smooth maneuvers.

$$J_{ctrl} = \int_{0}^{T} \left( a_{i}(t)^{2} + ω_{i}(t)^{2} \right) dt$$

(12)

Control Smoothness ($J_{smooth}$): Penalizes the rate of change in control inputs (jerk), ensuring smoother transitions.

$$J_{smooth} = \int_{0}^{T} \left( \dot{a}_{i}(t)^{2} + \dot{ω}_{i}(t)^{2} \right) dt$$

(13)

Curvature Penalty ($J_{curve}$): Penalizes large turn rates ($ω_{i}$), which at a given speed corresponds to penalizing high path curvature, since the curvature is $ω_{i}/v_{i}$.

$$J_{curve} = \int_{0}^{T} ω_{i}(t)^{2}\, dt$$

(14)
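The four terms of Equations (10) to (14) can be evaluated on a discretized trajectory roughly as follows. The sketch assumes speeds sampled at $M+1$ nodes, controls held constant over each of the $M$ intervals, and finite differences for the control rates; these discretization choices, the function name, and the unit weights are illustrative assumptions rather than the paper's settings.

```python
def trajectory_cost(speeds, accels, turn_rates, dt, weights):
    """Discrete approximation of the weighted cost J_i in Eq. (10).

    speeds: M + 1 nodal speeds. accels, turn_rates: M interval controls.
    weights: (w1, w2, w3, w4). Returns (total, components).
    """
    m = len(accels)
    # Eq. (11): trapezoidal rule on the nodal speeds.
    j_path = sum(0.5 * (speeds[k] + speeds[k + 1]) * dt for k in range(m))
    # Eq. (12): piecewise-constant controls.
    j_ctrl = sum((accels[k] ** 2 + turn_rates[k] ** 2) * dt for k in range(m))
    # Eq. (13): finite-difference control rates between adjacent intervals.
    j_smooth = sum(
        (((accels[k + 1] - accels[k]) / dt) ** 2
         + ((turn_rates[k + 1] - turn_rates[k]) / dt) ** 2) * dt
        for k in range(m - 1)
    )
    # Eq. (14): turn-rate (curvature) penalty.
    j_curve = sum(turn_rates[k] ** 2 * dt for k in range(m))

    components = (j_path, j_ctrl, j_smooth, j_curve)
    total = sum(w * j for w, j in zip(weights, components))
    return total, components


# Constant-speed flight with a gentle turn over the two middle intervals.
total, parts = trajectory_cost(
    speeds=[200.0] * 5,
    accels=[0.0, 0.0, 0.0, 0.0],
    turn_rates=[0.0, 0.01, 0.01, 0.0],
    dt=10.0,
    weights=(1.0, 1.0, 1.0, 1.0),
)
```

With unit weights the path term dominates by several orders of magnitude in SI units, which shows why the weights must also compensate for the different scales of the four terms.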

The design of this cost function, particularly the proper configuration of its weights $w_{1}$, $w_{2}$, $w_{3}$, $w_{4}$, is crucial in guiding the optimization towards desired performance characteristics. Through extensive simulation-based tuning, this study prioritizes path length to reflect fuel efficiency, while other weights are calibrated based on typical aircraft dynamic response characteristics to ensure flyability.

The formulated optimal control problem is an infinite-dimensional functional optimization problem, which is difficult to solve analytically. Therefore, this study employs a direct collocation method to transform it into a finite-dimensional problem that is amenable to numerical solution. This method discretizes the entire time horizon. For each of the $N$ aircraft, the total flight time $T_{i}$ is divided into $M$ equidistant intervals of duration $Δt_{i} = T_{i}/M$, and $k$ indexes the resulting time nodes. The model’s decision variables comprise the states and control inputs of all aircraft at every discrete time node:

State sequence:

$$X_{i} = \{\mathbf{x}_{i,k}\}_{k=0}^{M} = \{[x_{i,k}, y_{i,k}, v_{i,k}, ψ_{i,k}]^{T}\}_{k=0}^{M}$$

(15)

Control sequence:

$$U_{i} = \{\mathbf{u}_{i,k}\}_{k=0}^{M-1} = \{[a_{i,k}, ω_{i,k}]^{T}\}_{k=0}^{M-1}$$

(16)

The total number of optimization variables is on the order of $N \times (6M + 4)$, that is, $4(M+1)$ state variables and $2M$ control variables per aircraft, so the problem scale grows in proportion to both the number of aircraft and the number of discretization steps. Polynomials are used to approximate the state and control trajectories, ultimately transcribing the dynamic equations, path constraints, and performance indices into a single, large-scale, highly sparse NLP problem. The entire modeling and transcription process is implemented using the CasADi framework. Its automatic differentiation capabilities provide precise Jacobian and Hessian information to high-performance solvers like IPOPT, which improves both the efficiency and the convergence behavior of the solution process.
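To make the transcription step concrete, the sketch below counts the decision variables and evaluates collocation defects for the kinematic model of Equation (1). The paper does not state which collocation scheme is used, so the trapezoidal rule with piecewise-constant controls is an assumption made here for simplicity, and all function names are hypothetical. In the NLP, each defect component would appear as an equality constraint set to zero.

```python
import math


def dynamics(state, control):
    """Right-hand side of the kinematic model in Eq. (1)."""
    _, _, v, psi = state
    a, omega = control
    return (v * math.cos(psi), v * math.sin(psi), a, omega)


def trapezoidal_defects(states, controls, dt):
    """Defects x[k+1] - x[k] - dt/2 * (f[k] + f[k+1]) for each interval k.

    states: M + 1 nodal states; controls: M interval controls. A trajectory
    satisfies the discretized dynamics when every defect component is zero.
    """
    defects = []
    for k, control in enumerate(controls):
        f0 = dynamics(states[k], control)
        f1 = dynamics(states[k + 1], control)
        defects.append(tuple(
            s1 - s0 - 0.5 * dt * (d0 + d1)
            for s0, s1, d0, d1 in zip(states[k], states[k + 1], f0, f1)
        ))
    return defects


def num_decision_variables(n_aircraft, m_intervals):
    """4(M + 1) states plus 2M controls per aircraft, i.e. N(6M + 4)."""
    return n_aircraft * (4 * (m_intervals + 1) + 2 * m_intervals)


# Straight, constant-speed flight satisfies the dynamics exactly.
nodes = [(2000.0 * k, 0.0, 200.0, 0.0) for k in range(4)]
defects = trapezoidal_defects(nodes, [(0.0, 0.0)] * 3, dt=10.0)
```

Because each defect couples only two neighbouring nodes, the constraint Jacobian is banded, which is the source of the sparsity mentioned above.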

3.2. Multi-Aircraft Conflict Resolution and Coordination Framework Based on Game Theory

In high-density operational airspace, ensuring that multi-aircraft trajectories not only avoid external threats like thunderstorms but also do not create new conflicts among themselves is a far more complex optimization problem. To this end, this section extends the trajectory planning problem from a single agent to a multi-agent system, aiming to achieve coordinated conflict resolution and overall cost optimality.

This multi-aircraft coordination problem is formulated as a non-cooperative game, where each aircraft $i \in \{1, 2, \dots, N\}$ is treated as a rational game participant. Each participant chooses its own flight trajectory $τ_{i}$, with the objective of minimizing its own cost function $C_{i}(τ_{i}, τ_{-i})$, which depends not only on its own trajectory but also on the trajectories of all other aircraft $τ_{-i} = \{τ_{j} \mid j \neq i\}$.

In a multi-aircraft non-cooperative game, the Nash Equilibrium is a key solution concept. It describes a stable state of strategy combinations where no participant (aircraft) has an incentive to unilaterally change its strategy (trajectory), as any unilateral change would not result in a lower cost.

For a set of $N$ aircraft, a set of trajectory strategies $\{τ_{1}^{*}, τ_{2}^{*}, \dots, τ_{N}^{*}\}$ constitutes a Nash Equilibrium if and only if, for any aircraft $i$, its equilibrium trajectory $τ_{i}^{*}$ is the best choice given that all other aircraft $j \neq i$ adhere to their equilibrium trajectories $τ_{j}^{*}$. This condition can be formally expressed as:

$$C_{i}(τ_{i}^{*}, τ_{-i}^{*}) \leq C_{i}(τ_{i}, τ_{-i}^{*}), \quad \forall τ_{i} \in \mathcal{T}_{i}$$

(17)

where $\mathcal{T}_{i}$ is the set of all feasible trajectories for aircraft $i$.

Directly solving for the Nash Equilibrium of this multi-aircraft game is a high-dimensional, complex optimization problem, especially when trajectories are described by continuous state and control variables, making it difficult to handle directly. Therefore, this study employs an iterative computational method—IBR [34,35]—to obtain an approximate equilibrium solution for this game.

The IBR algorithm decomposes the complex multi-agent coupled problem into a series of single-agent optimal control subproblems that are solved sequentially. In iteration $k$ (here $k$ counts IBR iterations rather than time nodes), following a predetermined planning order, aircraft $i$ treats the trajectories of other aircraft as fixed constraints and solves for its own single-aircraft optimal trajectory, which serves as its “best response” to the strategies of others.

When planning for aircraft $s_{i}$, the algorithm uses the updated trajectories of preceding aircraft from the current iteration, $\{τ_{s_{j}}^{(k)} \mid j < i\}$, and the not-yet-updated trajectories of succeeding aircraft from the previous iteration, $\{τ_{s_{j}}^{(k-1)} \mid j > i\}$. This approach leverages the most recent available information, which typically accelerates convergence.

The pseudocode for the multi-aircraft thunderstorm avoidance method based on game-theoretic iteration is given in Algorithm 1.

Algorithm 1. Iterative Best Response for Multi-Aircraft Coordinated Avoidance

Input:

Initial states, destination points, and performance envelopes for $N$ aircraft;

External thunderstorm models;

Initial trajectory set $\{τ_{1}^{(0)}, τ_{2}^{(0)}, \dots, τ_{N}^{(0)}\}$.

Output:

Coordinated, conflict-free trajectory set $\{τ_{1}^{(k)}, τ_{2}^{(k)}, \dots, τ_{N}^{(k)}\}$ from the final iteration $k$.

Initialize: Set iteration counter $k = 0$, max iterations $K_{\max}$, convergence threshold $ϵ$.
Determine Planning Order: Establish a planning sequence $s = (s_{1}, s_{2}, \dots, s_{N})$ for the $N$ aircraft, where $s_{i} \in \{1, 2, \dots, N\}$ denotes the aircraft planned at rank $i$ within each iteration (i.e., aircraft $s_{i}$ is the $i$-th to solve its subproblem). Each iteration therefore comprises $N$ planning steps.
WHILE $k < K_{\max}$ AND not converged DO
    $k \leftarrow k + 1$
    FOR $i = 1$ to $N$ DO
        Let the currently planned aircraft be $p = s_{i}$
        Construct the constraint set from other aircraft’s trajectories:

        $τ_{-p}^{(k)} = \{τ_{s_{j}}^{(k)} \mid j < i\} \cup \{τ_{s_{j}}^{(k-1)} \mid j > i\}$

        (18)
        Solve for the best response trajectory:

        $τ_{p}^{(k)} = \arg\min_{τ_{p} \in \mathcal{T}_{p}} C_{p}(τ_{p}, τ_{-p}^{(k)})$

        (19)
    END FOR
    IF $\sum_{i=1}^{N} \| τ_{i}^{(k)} - τ_{i}^{(k-1)} \| < ϵ$ THEN BREAK
END WHILE
RETURN $\{τ_{1}^{(k)}, τ_{2}^{(k)}, \dots, τ_{N}^{(k)}\}$
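The control flow of Algorithm 1 can be illustrated on a deliberately small stand-in problem. In the sketch below each "trajectory" is reduced to a single lateral offset, each aircraft prefers its own reference offset, and every pair must stay at least `d_sep` apart. The best response then has a closed form, so no NLP solver is needed. This toy game, its function names, and its numbers are assumptions for illustration; it is not the paper's trajectory subproblem, and it stops on the largest single change rather than on the summed change.

```python
def best_response(ref, others, d_sep):
    """Closest offset to ref that keeps at least d_sep from every other offset.

    The feasible set is the real line minus open intervals around the other
    offsets, so the minimiser is either ref itself or an interval endpoint.
    """
    candidates = [ref] + [o + sign * d_sep for o in others for sign in (-1.0, 1.0)]
    feasible = [c for c in candidates
                if all(abs(c - o) >= d_sep - 1e-9 for o in others)]
    return min(feasible, key=lambda c: (c - ref) ** 2)


def iterative_best_response(refs, d_sep, order=None, k_max=50, eps=1e-6):
    """Sequential best-response iteration in the spirit of Algorithm 1.

    Returns (profile, iterations, converged).
    """
    n = len(refs)
    order = list(range(n)) if order is None else list(order)
    current = list(refs)  # initial strategies: every aircraft on its reference
    for k in range(1, k_max + 1):
        previous = list(current)
        for p in order:
            # Aircraft earlier in `order` already hold iteration-k values;
            # the rest still hold iteration-(k - 1) values, as in Eq. (18).
            others = [current[j] for j in range(n) if j != p]
            current[p] = best_response(refs[p], others, d_sep)
        delta = max(abs(c - q) for c, q in zip(current, previous))
        if delta < eps:
            return current, k, True
    return current, k_max, False


profile, iterations, converged = iterative_best_response([0.0, 0.2, -0.1], d_sep=1.0)
```

Changing `order` can lead to a different stable profile in this toy game, which mirrors the dependence of the local equilibrium on the planning sequence.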

3.3. Analysis of Convergence for the Coordination Mechanism

We analyze the convergence of the IBR mechanism from two perspectives: the optimality guarantee at each individual step, and the game-theoretic interpretation of the solution returned upon termination.

In iteration $k$, aircraft $i$ solves its single-agent NLP subproblem with the trajectories of all other aircraft, $τ_{-i}^{(k)}$, held fixed as hard constraints. The subproblem is a finite-dimensional, continuously differentiable NLP over a compact feasible set $F_{i}$, solved by IPOPT to a point satisfying the Karush–Kuhn–Tucker (KKT) first-order necessary conditions. Since $τ_{i}^{(k-1)}$ is a feasible point of the current subproblem and $τ_{i}^{(k)}$ is its locally optimal solution, the following holds by construction:

$$J_{i}(τ_{i}^{(k)}, τ_{-i}^{(k)}) \leq J_{i}(τ_{i}^{(k-1)}, τ_{-i}^{(k)})$$

(20)

Because the constraint environment varies dynamically as other agents update sequentially, Equation (20) guarantees local cost improvement relative to the current iteration’s information state, rather than a monotonic descent of the global aggregate cost across iterations. Nevertheless, this local improvement helps to suppress degenerate strategy cycling during the iterative process, although it does not by itself exclude oscillation.

Convergence of the overall process is monitored through the trajectory variation norm:

$Δ^{(k)} = \max_{i} ‖τ_{i}^{(k)} - τ_{i}^{(k-1)}‖_{\infty}$ (21)

The algorithm terminates when $Δ^{(k)} < ϵ$, with $ϵ = 10$ m, indicating that the joint strategy profile has stabilized to within the prescribed positional tolerance. A hard iteration cap $K_{\max}$ ensures termination in finite time under all conditions, including cases where oscillatory behavior (a known risk of sequential best-response schemes in non-convex games) prevents $Δ^{(k)}$ from falling below $ϵ$.
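As a concrete reading of the variation norm, the following Python sketch evaluates it for two aircraft. It assumes, for illustration only, that each trajectory is stored as a list of (x, y) positions in metres on a common time grid; the function name `trajectory_variation` and the sample coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]          # (x, y) position in metres (assumed layout)
Trajectory = List[Point]


def trajectory_variation(
    current: Dict[int, Trajectory], previous: Dict[int, Trajectory]
) -> float:
    """Largest coordinate change of any aircraft at any sample (max over i of the inf-norm)."""
    return max(
        max(abs(xc - xp), abs(yc - yp))
        for i in current
        for (xc, yc), (xp, yp) in zip(current[i], previous[i])
    )


EPSILON_M = 10.0  # positional tolerance quoted in the text

prev = {1: [(0.0, 0.0), (1000.0, 0.0)], 2: [(0.0, 5000.0), (1000.0, 5000.0)]}
curr = {1: [(0.0, 0.0), (1000.0, 4.0)], 2: [(0.0, 5000.0), (1003.0, 4992.0)]}

delta = trajectory_variation(curr, prev)   # 8.0 m, from the y-coordinate of aircraft 2
converged = delta < EPSILON_M              # True, since 8 m is below the 10 m tolerance
```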

When the algorithm terminates at iteration $k^{*}$ via the variation norm criterion, the following fixed-point condition holds simultaneously for all aircraft:

$‖τ_{i}^{(k^{*})} - τ_{i}^{(k^{*}-1)}‖_{\infty} < ϵ, \quad \forall i$ (22)

At this fixed point, each $τ_{i}^{(k^{*})}$ is the output of an NLP solve in which all other aircraft trajectories were fixed at values within $ϵ$ of their current values, and by (19) it is locally optimal within $F_{i}$ under those constraints. This terminal state corresponds to a Local Nash Equilibrium: a strategy profile $\{τ_{i}^{*}\}$ is a Local Nash Equilibrium if, for each aircraft $i$, there exists no perturbation $δτ_{i}$ with $‖δτ_{i}‖_{\infty} \leq ϵ$ such that $J_{i}(τ_{i}^{*} + δτ_{i}, τ_{-i}^{*}) < J_{i}(τ_{i}^{*}, τ_{-i}^{*})$, where $τ_{i}^{*} + δτ_{i} \in F_{i}(τ_{-i}^{*})$.

The local nature of this equilibrium stems from the non-convexity of the joint strategy space (due to thunderstorm avoidance and separation constraints) and the local nature of the KKT-stationary points returned by the IPOPT solver. Given these characteristics and the non-contractive nature of the best-response map, the algorithm is designed to converge to a local Nash equilibrium rather than a global optimum. This formulation aligns with the practical requirements of real-time conflict resolution, where finding a stable, collision-free local equilibrium is computationally prioritized over global optimality. In the ATM context, the Local Nash Equilibrium represents a spatiotemporally self-consistent, conflict-free cooperative solution from which no aircraft can reduce its individual cost through any small unilateral trajectory perturbation, which is precisely the stability condition required for an operationally deployable rerouting plan. In practice, $Δ^{(k)}$ decreases monotonically to below $ϵ$ in the overwhelming majority of tested scenarios, confirming that this solution concept is routinely achieved in operationally relevant configurations; the specific safety performance of the trajectories is documented in detail in Section 4.6.

4. Experimental Validation

To validate the effectiveness of the proposed method under thunderstorm conditions, this section details a series of simulation experiments. To ensure the practical relevance of the results, all test scenarios were designed such that thunderstorm activity is detected along a portion of each aircraft’s planned route, requiring effective avoidance of weather hazards and flight conflicts with minimal deviation from the overall flight plan.

4.1. Simulation Environment and Parameter Settings

The simulation experiments were conducted in a Python 3.9 environment. All optimization problems were solved using the IPOPT 3.13.4 solver via the CasADi 3.5.5 interface.

The operational envelope constraints for the simulated Airbus A320 aircraft are derived from the EUROCONTROL Base of Aircraft Data (BADA) User Manual, Revision 3.14 [37]. The cruising true airspeed range is set to [720, 870] km/h, corresponding to Mach 0.68 (long-range cruise lower bound) through the maximum operating Mach number $M_{mo} = 0.82$ at FL350 under International Standard Atmosphere (ISA) conditions. The longitudinal acceleration is limited to [−0.5, 0.5] m/s² and the turn rate to [−1.5, 1.5] °/s. The minimum inter-aircraft separation of 10 km is set as a conservative rounding of the standard en-route radar separation minimum of 5 NM (9.26 km) specified in ICAO Doc 4444 PANS-ATM §8.7.3 [38]. The minimum clearance of 20 km from the 30 dBZ thunderstorm boundary is adopted as a conservative margin above the approximately 10 statute miles (~16 km) avoidance distance associated with MODERATE-intensity convective echoes under FAA AIM §7-1-27 [39] and FAA AC 00-45H [36]. All parameters are summarized in Table 1.
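The unit conversions behind these limits can be checked with a short Python sketch. The container `OperationalLimits` and its field names are illustrative, and the Mach-to-airspeed conversion assumes the standard ISA troposphere formula with a lapse rate of 6.5 K/km; neither is taken from the simulation code.

```python
import math
from dataclasses import dataclass

NM_TO_KM = 1.852                 # international nautical mile
STATUTE_MILE_TO_KM = 1.609344


@dataclass(frozen=True)
class OperationalLimits:
    """Values quoted in the text; field names are illustrative."""
    tas_min_kmh: float = 720.0
    tas_max_kmh: float = 870.0
    accel_max_ms2: float = 0.5
    turn_rate_max_deg_s: float = 1.5
    min_separation_km: float = 10.0
    min_storm_clearance_km: float = 20.0


def isa_speed_of_sound_ms(altitude_m: float) -> float:
    """Speed of sound in the ISA troposphere (valid below 11 km)."""
    temperature_k = 288.15 - 0.0065 * altitude_m
    return math.sqrt(1.4 * 287.05287 * temperature_k)


def mach_to_kmh(mach: float, altitude_m: float) -> float:
    """True airspeed in km/h for a given Mach number and ISA altitude."""
    return mach * isa_speed_of_sound_ms(altitude_m) * 3.6


limits = OperationalLimits()
fl350_m = 35000 * 0.3048                      # 10668 m

radar_minimum_km = 5 * NM_TO_KM               # 9.26 km, below the 10 km used here
storm_guidance_km = 10 * STATUTE_MILE_TO_KM   # about 16.1 km, below the 20 km used here
mach_low_kmh = mach_to_kmh(0.68, fl350_m)     # roughly 726 km/h
mach_high_kmh = mach_to_kmh(0.82, fl350_m)    # roughly 875 km/h
```

Under these assumptions the computed airspeeds fall within about 1% of the rounded [720, 870] km/h range.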

The weight configuration in Table 1 reflects a deliberate prioritization of flight economy over control aggressiveness. The dominant weight assigned to path length ($w_{1} = 5.0$) encodes the operational objective of minimizing detour distance and, by extension, fuel burn, the primary cost driver in practical thunderstorm rerouting decisions. The control energy weight ($w_{2} = 0.8$) and velocity smoothness weight ($w_{3} = 0.1$) serve complementary roles: $w_{2}$ penalizes large acceleration and turn-rate inputs that would stress the airframe and reduce passenger comfort, while $w_{3}$ suppresses rapid speed oscillations that could arise from conflicting ETA and avoidance requirements. The curvature penalty ($w_{4} = 1.0$) prevents geometrically degenerate trajectories with sharp heading reversals that, while mathematically feasible, are operationally unacceptable.

The relative magnitudes of the weights directly govern the trade-off between trajectory economy and control smoothness. The dominant path length weight $w_{1}$ produces a cost-driven speed-heading priority, wherein velocity modulation is preferred over heading deviation as the lower-cost response to temporal and spatial constraints. The remaining weights are calibrated to suppress geometrically degenerate solutions and excessive control activity while preserving the optimizer’s freedom to generate dynamically feasible avoidance maneuvers. This cost-driven hierarchy is consistent with standard ATM practice, as demonstrated in the representative scenario analyzed in Section 4.3.
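To make the role of the four weights concrete, the sketch below evaluates a weighted cost on a discretised trajectory. Only the weight values come from the text. The discrete forms of the four terms (segment-length sum, squared inputs, squared speed differences, squared heading differences), the unit scaling, and all names are assumptions chosen for illustration and may differ from the cost function actually optimized.

```python
import math
from typing import List, Tuple

# w1..w4 as quoted in the text; the dictionary keys are illustrative.
WEIGHTS = {"path": 5.0, "control": 0.8, "speed_smooth": 0.1, "curvature": 1.0}


def weighted_cost(
    xy: List[Tuple[float, float]],   # positions at each sample
    speed: List[float],              # speed at each sample
    heading: List[float],            # heading at each sample, rad
    accel: List[float],              # acceleration input per interval
    turn_rate: List[float],          # turn-rate input per interval
    dt: float,
) -> float:
    """Weighted sum of four terms; the term definitions are illustrative assumptions."""
    path_length = sum(math.dist(a, b) for a, b in zip(xy, xy[1:]))
    control_energy = sum((a * a + w * w) * dt for a, w in zip(accel, turn_rate))
    speed_variation = sum((v1 - v0) ** 2 for v0, v1 in zip(speed, speed[1:]))
    heading_variation = sum((h1 - h0) ** 2 for h0, h1 in zip(heading, heading[1:]))
    return (
        WEIGHTS["path"] * path_length
        + WEIGHTS["control"] * control_energy
        + WEIGHTS["speed_smooth"] * speed_variation
        + WEIGHTS["curvature"] * heading_variation
    )


# Two paths of equal length in arbitrary scaled units: straight versus a 90-degree dogleg.
straight = weighted_cost(
    xy=[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    speed=[1.0, 1.0, 1.0], heading=[0.0, 0.0, 0.0],
    accel=[0.0, 0.0], turn_rate=[0.0, 0.0], dt=1.0,
)   # 10.0: only the path-length term contributes
dogleg = weighted_cost(
    xy=[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
    speed=[1.0, 1.0, 1.0], heading=[0.0, 0.0, math.pi / 2],
    accel=[0.0, 0.0], turn_rate=[0.0, math.pi / 2], dt=1.0,
)   # larger: the turn adds control-energy and curvature penalties
```

Even with identical path length, the heading change is penalized twice (through the control and curvature terms), which is consistent with the preference for speed modulation described above.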

4.2. Single-Aircraft Avoidance Scenario Analysis

The results for a single-aircraft, three-thunderstorm avoidance test are shown in Figure 1. In this scenario, the originally planned flight path passes through three closely spaced thunderstorm areas. After replanning with the algorithm, the aircraft navigates around all three thunderstorm areas before rejoining its originally planned route.

Analysis of trajectory flyability confirms that the segment lengths meet continuous flight requirements, with no unflyable segments.

4.3. Multi-Aircraft Avoidance Scenario Analysis

This study simulates a scenario in which three aircraft approach a single thunderstorm area from three different directions (south, west, and northwest) at the same time and speed. The thunderstorm is located at the intersection of the routes of two of the aircraft, so proceeding as planned would expose both to the weather hazard and to a guaranteed mutual conflict. Both aircraft must therefore detour around the thunderstorm simultaneously, which itself carries a high risk of conflict. Meanwhile, a third aircraft is also in the vicinity of the storm. The proposed algorithm plans for all three aircraft through an iterative game-theoretic procedure in which safe separation and conflict avoidance are imposed as constraints. The multi-aircraft avoidance results obtained after 4 iterations are shown in Figure 2.

The simulation results demonstrate that the proposed method can generate three trajectories that completely avoid the thunderstorm area from different directions while maximally preserving the original routes. Furthermore, conflicts are successfully avoided, validating the feasibility and robustness of the algorithm. The algorithm converges to a final result in 4 iterations, with computation times of 1.14 s, 0.90 s, 0.86 s, and 0.77 s, respectively, achieving trajectory planning within seconds and ensuring timeliness. The results further validate the robust convergence of this mechanism: in all tested scenarios, the algorithm consistently reached a stable state within a very small number of iterations. This efficient convergence, combined with the powerful numerical solving capabilities of the CasADi framework, ensures the method’s timeliness and reliability in handling dynamic, high-density avoidance tasks.

To quantitatively assess the framework’s ability to meet preset ETA constraints, Table 2 presents the planned versus actual arrival times for each aircraft. The results show that despite all three aircraft executing complex cooperative avoidance maneuvers—which inherently increase the flight path length—their final arrival time errors are all controlled to within a few seconds. This outcome follows from the velocity optimization: the optimizer compensates for the longer avoidance path by accelerating, so the added distance does not translate into a delay. Consequently, even with extended trajectories, the aircraft meet their scheduled arrival times. In the context of actual ATM, these negligible errors are well within Required Navigation Performance (RNP) standards. This strongly demonstrates that the proposed velocity integral constraint functions as a hard equality constraint, ensuring that aircraft can precisely meet their 4D waypoint time requirements in dynamic, high-conflict-density environments, thereby validating the method’s reliability in time-critical missions.
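The speed compensation described here can be illustrated with simple arithmetic: for a fixed arrival time, the time integral of speed must equal the (longer) path length, so the required mean speed rises in proportion to the detour. The Python sketch below uses a hypothetical 400 km leg flown nominally at 800 km/h and an assumed 5% detour; only the 870 km/h upper speed bound is taken from Section 4.1, and acceleration limits are ignored.

```python
def required_mean_speed_kmh(path_length_km: float, time_to_waypoint_h: float) -> float:
    """Mean speed whose time integral over the interval equals the path length."""
    return path_length_km / time_to_waypoint_h


def max_absorbable_detour(nominal_speed_kmh: float, max_speed_kmh: float) -> float:
    """Largest fractional path extension that speed alone can absorb at a fixed ETA."""
    return max_speed_kmh / nominal_speed_kmh - 1.0


# Hypothetical leg: 400 km planned at 800 km/h, i.e. 0.5 h to the waypoint.
planned_km, nominal_kmh, tas_max_kmh = 400.0, 800.0, 870.0
eta_h = planned_km / nominal_kmh

detour_km = planned_km * 1.05                                  # assumed 5% longer path
needed_kmh = required_mean_speed_kmh(detour_km, eta_h)         # 840 km/h
feasible = needed_kmh <= tas_max_kmh                           # True
headroom = max_absorbable_detour(nominal_kmh, tas_max_kmh)     # 0.0875, i.e. 8.75%
```

In this hypothetical case, detours of up to about 8.75% could be absorbed by speed alone; beyond that, the ETA could not be met within the speed envelope.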

Figure 3 illustrates the temporal evolution of separation distances between all aircraft pairs, while Figure 4 depicts the distance between each aircraft and the meteorological obstacles. These represent the core metrics for evaluating the safety of the proposed conflict resolution strategy. As clearly observed, all distance curves strictly remain above the minimum safety separation threshold (indicated by the red dashed line) throughout the entire flight horizon. The simulation results demonstrate that even under conditions where thunderstorms occupy core airspace, the multi-aircraft trajectories exhibit exceptional spatiotemporal compactness. All aircraft precisely navigate around dynamic thunderstorms while maintaining inter-aircraft separation above the critical threshold, validating the coordination efficiency of the proposed algorithm in highly constrained airspace.
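The two safety metrics plotted in these figures can be computed from sampled trajectories as sketched below. The sketch assumes positions in km on a common time grid and, purely for illustration, a circular storm of fixed radius with a moving centre; the storm geometry used in the simulations may differ, and the coordinates are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Point = Tuple[float, float]   # (x, y) in km, sampled on a common time grid (assumed)


def min_pairwise_separation_km(tracks: Dict[int, List[Point]]) -> float:
    """Smallest distance between any two aircraft at any common time sample."""
    return min(
        math.dist(pa, pb)
        for a, b in combinations(sorted(tracks), 2)
        for pa, pb in zip(tracks[a], tracks[b])
    )


def min_storm_clearance_km(
    track: List[Point], centres: List[Point], radius_km: float
) -> float:
    """Smallest distance from the aircraft to the boundary of a moving circular storm."""
    return min(math.dist(p, c) - radius_km for p, c in zip(track, centres))


tracks = {
    1: [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)],
    2: [(0.0, 30.0), (10.0, 20.0), (20.0, 12.0)],
}
storm_centres = [(10.0, 80.0), (10.0, 78.0), (10.0, 76.0)]   # storm drifting south

min_separation = min_pairwise_separation_km(tracks)          # 12 km at the last sample
separation_ok = min_separation >= 10.0                       # 10 km minimum separation
clearance_ok = min_storm_clearance_km(tracks[2], storm_centres, 15.0) >= 20.0
```

Checking only at sample times can miss a violation between samples, so a sufficiently fine time grid is assumed.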

Analyzing the simulation results from Figure 5, it is evident that under the dual constraints of conflict avoidance and ETA, aircraft 2 and 3 adopt a combined strategy of heading maneuvers and speed adjustments. In contrast, aircraft 1 exhibits significant control heterogeneity. Because its original path was not directly obstructed by the thunderstorm, the algorithm, during the multi-aircraft game iterations, determined that maintaining a direct flight was the globally cost-optimal response, thus avoiding unnecessary rerouting costs. Although its heading remains constant, aircraft 1’s velocity profile shows precise dynamic modulation, aimed at eliminating potential spatiotemporal conflict risks with the detouring aircraft (2 and 3) and compensating for time losses due to their path deviations. This behavior reveals an emergent control hierarchy within the optimization framework. When faced with time deviations or minor conflict risks, the optimizer preferentially uses longitudinal speed compensation to absorb deviations within allowable limits (i.e., “time absorption”). This is primarily because speed adjustments mainly involve changes in acceleration, which have a smaller marginal impact on the total cost. Only when speed adjustments cannot resolve safety-critical encounters or when obstacles like thunderstorms block the path does the optimizer initiate heading maneuver strategies, even though this leads to a significant increase in total cost by triggering curvature penalties, increasing angular velocity energy consumption, and extending path length. This “speed-first, heading-second” control logic is not driven by pre-set heuristic rules but is an emergent optimal solution from the multi-objective cost function within the dynamic game. The results confirm that the framework reproduces a decision hierarchy familiar to practicing controllers: speed adjustment first, heading change only when necessary.

4.4. Baseline Method Comparison

To evaluate the performance of the proposed game-theoretic optimal control framework (The Proposed Method), we conducted a comparative analysis against two representative baseline methods: A* + IBR and the Alternating Direction Method of Multipliers (ADMM). The selection of these baselines is strategic: the comparison with A* + IBR isolates the benefit of our continuous NLP-based planner over a discrete grid-search planner while using the same IBR coordination mechanism. The comparison with ADMM, conversely, isolates the benefit of our IBR coordination mechanism over another popular decentralized optimization technique while using the same NLP-based planner. The quantitative results of this comparison are summarized in Table 3, with the corresponding trajectories visualized in Figure 6. An upward arrow (↑) indicates that a higher value is better, while a downward arrow (↓) indicates that a lower value is better.

4.4.1. Comparison with A* + IBR

The A* + IBR method combines a discrete grid-based planner with the same iterative best-response coordination logic as our proposed method. As shown in Figure 6b, the trajectories generated by A* are visibly jagged and consist of sharp, piecewise-linear segments. This is a direct consequence of the underlying grid representation, which inherently limits the solution space to discrete state transitions.

The quantitative results in Table 3 confirm this visual assessment. The Trajectory Smoothness, a metric where lower is better, is exceptionally poor for A* + IBR (64.466 vs. 0.153 for our method), indicating frequent and sharp heading changes that are not dynamically feasible for commercial aircraft without significant post-processing. Although A* + IBR shows minimal control effort and speed variation, this is an artifact of its simplified model, which does not perform genuine dynamic optimization of velocity profiles. Most critically, the discrete nature of the grid and the post hoc separation enforcement resulted in safety violations, with inter-aircraft separation (9.510 km) falling below the required 10 km minimum, and thunderstorm clearance (19.728 km) falling below the required 20 km minimum. In contrast, our proposed method, by operating in a continuous domain, generates inherently smooth and dynamically feasible trajectories that strictly adhere to all safety constraints. Furthermore, the total path length is significantly shorter (446.57 km vs. 505.89 km), highlighting the sub-optimality introduced by grid discretization. This comparison validates the superiority of using a continuous optimal control planner (NLP/IPOPT) for generating high-quality, safe, and efficient aircraft trajectories.

4.4.2. Comparison with ADMM

ADMM is another powerful technique for decentralized optimization, which, like our method, uses NLP/IPOPT as the core planner for each aircraft. The key difference lies in the coordination mechanism: while our approach shares full trajectory information and enforces hard separation constraints, ADMM coordinates through shared dual variables (or “prices”) and treats inter-aircraft separation as a soft constraint within an augmented Lagrangian.

As seen in Table 3, ADMM successfully generates conflict-free trajectories with zero violations and maintains a slightly larger minimum separation (11.122 km), which is a positive outcome. However, this safety comes at a significant cost in other aspects. The Total Path Length (495.15 km) and Trajectory Smoothness (36.249), both metrics where lower is better, are substantially worse than those of our proposed method. The Control Effort (118.467 vs. 5.528) is dramatically higher, indicating aggressive and inefficient maneuvers, which is also reflected in the trajectory shapes in Figure 6c. Most notably, the Computation Time for ADMM is much longer than for our method (25.33 s vs. 3.76 s).

This performance difference stems from the nature of the coordination. The soft-constraint formulation of ADMM often requires many more iterations and a carefully tuned penalty parameter to converge to a feasible solution, leading to longer computation times and less efficient trajectories. In contrast, the IBR mechanism in our framework, by treating other aircraft trajectories as hard constraints, converges rapidly to a high-quality, locally optimal Nash Equilibrium. This comparison validates that for this specific ATM problem, the IBR coordination mechanism provides a more effective balance of solution quality, safety, and computational efficiency than ADMM.

4.4.3. Overall Assessment

The comparative analysis demonstrates the comprehensive advantages of the proposed framework. It successfully combines the strengths of a continuous optimal control planner (providing smooth, efficient, and dynamically feasible trajectories) with a fast and effective game-theoretic coordination mechanism (IBR). The results show that our method is the only one among the three that achieves a superior balance of safety (zero violations), efficiency (shortest path length), trajectory quality (best smoothness), and computational speed.

4.5. Monte Carlo Study and Sensitivity Analysis of Formation Size

4.5.1. Experimental Design

The simulations presented thus far are all based on fixed scenarios, which makes it difficult to assess algorithm performance across the broader range of conditions encountered in practice. To address this, a Monte Carlo study was conducted using a large set of randomly generated scenarios.

The experiment covered scenarios with varying numbers of aircraft (1–10) and thunderstorms (1–4), ensuring diversity in conflict geometry and storm encounter patterns. For each aircraft count group, 500 independent scenarios were generated and simulated. Thunderstorm parameters—center location, radius, velocity, and direction of movement—were all drawn independently from physically reasonable ranges.
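A scenario generator of this kind might look like the following Python sketch. The sampling ranges are placeholders, since the actual ranges are not listed here, and the reading of "aircraft count group" as each individual count from 1 to 10 is likewise an assumption; `Storm`, `Scenario`, and `sample_scenario` are hypothetical names.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Storm:
    x_km: float
    y_km: float
    radius_km: float
    speed_kmh: float
    direction_deg: float


@dataclass
class Scenario:
    n_aircraft: int
    storms: List[Storm]


def sample_scenario(n_aircraft: int, rng: random.Random) -> Scenario:
    """Draw one scenario; every numeric range below is a placeholder assumption."""
    storms = [
        Storm(
            x_km=rng.uniform(100.0, 400.0),
            y_km=rng.uniform(100.0, 400.0),
            radius_km=rng.uniform(10.0, 40.0),
            speed_kmh=rng.uniform(0.0, 60.0),
            direction_deg=rng.uniform(0.0, 360.0),
        )
        for _ in range(rng.randint(1, 4))      # 1 to 4 thunderstorms, as in the text
    ]
    return Scenario(n_aircraft=n_aircraft, storms=storms)


rng = random.Random(2026)                      # fixed seed for reproducibility
batches = {
    n: [sample_scenario(n, rng) for _ in range(500)]   # 500 scenarios per count
    for n in range(1, 11)                              # 1 to 10 aircraft
}
```

Each storm parameter is drawn independently, matching the description above; aircraft routes would be sampled in the same manner.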

4.5.2. Sensitivity to Aircraft Count

Grouping the scenarios by aircraft count yields the breakdown shown in Table 4.

Table 4 summarizes performance across formation sizes from one to ten aircraft. For formations of up to seven aircraft, the algorithm converged in every tested scenario with all safety constraints satisfied in full. Both convergence and safety rates declined moderately beyond eight aircraft, as shown in Table 4.

Solve time scales predictably with formation size, rising from under 4 s for one-to-three aircraft to 4–17 s for four-to-seven aircraft, and up to 32 s for the largest groups. This growth reflects the quadratic increase in pairwise separation constraints: a seven-aircraft formation generates C(7,2) = 21 pairs, while ten aircraft produce C(10,2) = 45, so a 43% increase in fleet size more than doubles the constraint count. For pre-tactical rerouting applications that typically operate on a several-minute look-ahead horizon, even the upper end of this range remains operationally acceptable. The detour ratio increases modestly from 6.36% for small formations to 6.79% for the largest, a difference of less than half a percentage point, suggesting that the game-equilibrium formulation effectively contains the cooperative cost overhead even as formation size grows.
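The pair counts quoted above follow directly from the binomial coefficient, as the short Python check below shows; the function name `separation_pairs` is illustrative.

```python
from math import comb


def separation_pairs(n_aircraft: int) -> int:
    """Number of distinct aircraft pairs, C(n, 2) = n (n - 1) / 2."""
    return comb(n_aircraft, 2)


pairs_7 = separation_pairs(7)                 # 21
pairs_10 = separation_pairs(10)               # 45
fleet_growth = 10 / 7 - 1                     # about 0.43
constraint_growth = pairs_10 / pairs_7        # about 2.14
```

Going from seven to ten aircraft therefore multiplies the number of pairwise separation constraints by a factor of about 2.14.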

4.6. Dynamic Rerouting Case Study Based on Real-World Flight Data

To verify the practical applicability of the proposed method under realistic operating conditions, a real-world case study was constructed based on publicly available data. Flight information was obtained from the public commercial flight-tracking platform VariFlight, from which four real commercial flights operating on 7 April 2026 were selected: GJ8860 from Guangzhou Baiyun to Hangzhou Xiaoshan, CZ6586 from Hefei Xinqiao to Guangzhou Baiyun, JD5067 from Xi’an Xianyang to Xiamen Gaoqi, and CZ3572 from Shanghai Hongqiao to Guangzhou Baiyun. The thunderstorm systems involved in the scenario were reconstructed from the convective weather observations released by the China Meteorological Administration for the same region on that day. The proposed method was then evaluated under this combination of real flight and meteorological background. Key waypoints from their planned routes are shown in Table 5.

During their flights, the aircraft encountered a thunderstorm system between waypoints LEKUV and P215. Using the game-theoretic cooperative planning framework proposed in this paper, the system generated real-time avoidance advisories for all four flights that satisfied their waypoint ETA constraints, and alternative rerouting trajectories were planned and evaluated from these advisories. The new routes preserved the main structure of the original flight plans while incorporating necessary avoidance waypoints. A comparison of the routes before and after the rerouting, along with the updated waypoints, is presented in Table 6 and Figure 7.

The simulation results confirm that the generated avoidance trajectories successfully circumvented the high-risk thunderstorm area and ensured mutual separation among the aircraft, with a minimum inter-aircraft separation distance of 10.45 km. Additionally, the trajectories maintained a safe distance from the thunderstorm, with the closest point of approach at 20.24 km. All trajectory segments complied with the aircraft’s dynamic constraints. Compared to the original flight plans, the rerouting resulted in an approximately 6.42% increase in actual flight distance. While this increase represents an additional cost, it is well within a reasonable range given the imperative of ensuring flight safety.

5. Conclusions

This paper addresses the problem of multi-aircraft 4D trajectory cooperative planning in dynamic weather environments by proposing a cooperative planning framework based on optimal control theory and game theory. A comprehensive environmental model including dynamic thunderstorm life cycles and aircraft kinematics was constructed. Using the CasADi toolkit, the single-aircraft 4D trajectory generation problem was transformed into an NLP problem for solution. Building on this, a cooperative conflict resolution mechanism based on IBR was introduced to address multi-aircraft conflicts, achieving an efficient approximation of the Nash Equilibrium.

Simulation results show that the framework effectively avoids dynamic weather threats and inter-aircraft conflicts while precisely meeting 4D waypoint time constraints. A comparison with two baseline methods (A* + IBR and ADMM) indicates that the proposed approach offers a better overall balance among safety, trajectory quality, and computational efficiency. A case study based on real Chinese civil aviation routes confirms its applicability under realistic conditions, and Monte Carlo experiments further demonstrate that the framework remains stable across the formation sizes commonly encountered in cooperative rerouting tasks, with solve times suitable for pre-tactical planning.

Future research could extend this work by incorporating more complex operational constraints (such as uncertain wind fields, communication limitations, and the impact of weather forecast uncertainties) into the model, with additional consideration of altitude integration, extending the scenario to the vertical dimension, and evaluating the computational performance and scalability of the framework in large-scale fleet scenarios. This would broaden the scope of the study and enhance its application potential in future ATM systems.

Author Contributions

Conceptualization, R.S. and X.W.; methodology, R.S. and X.W.; software, R.S. and W.Y.; validation, R.S., S.L. and X.W.; formal analysis, R.S.; investigation, R.S.; resources, X.W. and S.L.; data curation, R.S.; writing—original draft preparation, R.S.; writing—review and editing, X.W., S.L., Y.C. and W.Y.; visualization, R.S. and Y.C.; supervision, X.W. and S.L.; project administration, X.W.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data supporting the findings of this study are derived from publicly accessible sources. The flight schedule information used in the case study (flights GJ8860, CZ6586, JD5067, and CZ3572 on 7 April 2026) was obtained from the public commercial flight-tracking platform VariFlight (https://www.variflight.com, accessed on 17 May 2026). The convective weather information used to reconstruct the thunderstorm scenarios was obtained from the publicly released observation products of the China Meteorological Administration (https://www.cma.gov.cn, accessed on 17 May 2026). No proprietary or restricted datasets were used. Further processed data and simulation results generated in this study are available from the corresponding author upon reasonable request.

Acknowledgments

During the preparation of this manuscript, the authors utilized Tencent Yuanbao (powered by the Tencent Hunyuan Large Language Model) solely for partial translation and language polishing to enhance readability. The original conceptualization, methodology, data analysis, and draft content were entirely generated by the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

IBR	Iterative Best Response
ATM	Air Traffic Management
ICAO	International Civil Aviation Organization
EPS	Ensemble Prediction Systems
SB-RRT*	Scenario-Based Rapidly exploring Random Trees
GPU	Graphics Processing Unit
ETA	Estimated Time of Arrival
PBN	Performance-Based Navigation
APF	Artificial Potential Field
UAV	Unmanned Aerial Vehicle
RRT	Rapidly exploring Random Tree
NLP	Nonlinear Programming
IPOPT	Interior Point Optimizer
MPC	Model Predictive Control
FAA	Federal Aviation Administration
SDF	Signed Distance Field
KKT	Karush–Kuhn–Tucker
RNP	Required Navigation Performance
ADMM	Alternating Direction Method of Multipliers

References

Eurocontrol. European Aviation in 2040: Challenges of Growth; Eurocontrol: Brussels, Belgium, 2022. [Google Scholar]
Cook, A.; Belkoura, S.; Zanin, M. ATM performance measurement in Europe, the US and China. Chin. J. Aeronaut. 2017, 30, 479–490. [Google Scholar] [CrossRef]
IATA. The True Cost of Aviation’s Weather Disruptions; IATA Economics: Geneva, Switzerland, 2022. [Google Scholar]
International Civil Aviation Organization (ICAO). Annex 3: Meteorological Service for International Air Navigation, 20th ed.; ICAO: Montréal, QC, Canada, 2018. [Google Scholar]
Federal Aviation Administration (FAA). Aviation Weather Handbook, FAA-H-8083-28; FAA: Washington, DC, USA, 2022.
Brueckner, J.K.; Czerny, A.I.; Gaggero, A.A. Airline delay propagation: A simple method for measuring its extent and determinants. Transp. Res. Part B Methodol. 2022, 162, 55–71. [Google Scholar] [CrossRef]
Rosenow, J.; Michling, P.; Schultz, M.; Schönberger, J. Evaluation of strategies to reduce the cost impacts of flight delays on total network costs. Aerospace 2020, 7, 165. [Google Scholar] [CrossRef]
García-Heras, J.; Soler, M.; González-Arribas, D.; Eschbacher, K.; Rokitansky, C.-H.; Sacher, D.; Gelhardt, U.; Lang, J.; Hauf, T.; Simarro, J.; et al. Robust flight planning impact assessment considering convective phenomena. Transp. Res. Part C Emerg. Technol. 2021, 123, 102968. [Google Scholar] [CrossRef]
Hentzen, D.; Kamgarpour, M.; Soler, M.; González-Arribas, D. On maximizing safety in stochastic aircraft trajectory planning with uncertain thunderstorm development. Aerosp. Sci. Technol. 2018, 79, 543–553. [Google Scholar] [CrossRef]
Andrés, E.; González-Arribas, D.; Soler, M.; Kamgarpour, M.; Sanjurjo-Rivo, M.; Simarro, J. Informed scenario-based RRT* for aircraft trajectory planning under ensemble forecasting of thunderstorms. Transp. Res. Part C Emerg. Technol. 2021, 129, 103232. [Google Scholar] [CrossRef]
Andrés, E.; González-Arribas, D.; Soler, M.; Kamgarpour, M.; Sanjurjo-Rivo, M.; Simarro, J. Iterative graph deformation for aircraft trajectory planning considering ensemble forecasting of thunderstorms. Transp. Res. Part C Emerg. Technol. 2022, 145, 103919. [Google Scholar] [CrossRef]
Soler, M.; González-Arribas, D.; Sanjurjo-Rivo, M.; García-Heras, J.; Sacher, D.; Gelhardt, U.; Lang, J.; Hauf, T.; Simarro, J. Influence of atmospheric uncertainty, convective indicators, and cost-index on the leveled aircraft trajectory optimization problem. Transp. Res. Part C Emerg. Technol. 2020, 120, 102784. [Google Scholar] [CrossRef]
ICAO. Doc 9965: Manual on Flight and Flow Information for a Collaborative Environment (FF-ICE), 1st ed.; ICAO: Montréal, QC, Canada, 2012. [Google Scholar]
Gonzalez-Arribas, D.; Baneshi, F.; Andres, E.; Soler, M.; Jardines, A.; García-Heras, J. Fast 4D flight planning under uncertainty through parallel stochastic path simulation. Transp. Res. Part C Emerg. Technol. 2023, 148, 104018. [Google Scholar] [CrossRef]
Kuchar, J.K.; Yang, L.C. A review of conflict detection and resolution modeling methods. IEEE Trans. Intell. Transp. Syst. 2000, 1, 179–189. [Google Scholar] [CrossRef]
Ribeiro, M.; Ellerbroek, J.; Hoekstra, J. Review of conflict resolution methods for manned and unmanned aviation. Aerospace 2020, 7, 79. [Google Scholar] [CrossRef]
Rangrazjeddi, A.; González, A.D.; Barker, K. Applied Game Theory to Enhance Air Traffic Control in 3D Airspace. J. Optim. Theory Appl. 2023, 196, 1125–1154. [Google Scholar] [CrossRef]
Alaydi, B.; Ng, S.I. Mitigating the Negative Effect of Air Traffic Controller Mental Workload on Job Performance: The Role of Mindfulness and Social Work Support. Safety 2024, 10, 20. [Google Scholar] [CrossRef]
Akdoğan, F.; Şahin, A.D. A survey of route optimisation and planning based on meteorological conditions. Aeronaut. J. 2026, 130, 613–638.
Zhou, H.; Xiong, H.L.; Liu, Y.; Tan, N.D.; Chen, L. Trajectory Planning Algorithm of UAV Based on System Positioning Accuracy Constraints. Electronics 2020, 9, 250.
Hao, S.; Cheng, S.; Zhang, Y. A multi-aircraft conflict detection and resolution method for 4-dimensional trajectory-based operation. Chin. J. Aeronaut. 2018, 31, 177–191.
Shao, M.; Liu, X.; Xiao, C.; Zhang, T.; Yuan, H. Research on UAV trajectory planning algorithm based on adaptive potential field. Drones 2025, 9, 79.
Chen, Y.B.; Yu, J.Q.; Su, X.L.; Luo, G.P.; Mei, Y.S. Path planning for multi-UAV formation. J. Intell. Robot. Syst. 2015, 77, 229–246.
Ma, B.; Ji, Y.; Fang, L. A multi-UAV formation obstacle avoidance method combined with improved simulated annealing and an adaptive artificial potential field. Drones 2025, 9, 390.
Gammell, J.D.; Barfoot, T.D.; Srinivasa, S.S. Informed sampling for asymptotically optimal path planning. IEEE Trans. Robot. 2018, 34, 966–984.
Cáp, M.; Novák, P.; Vokřínek, J.; Pěchouček, M. Multi-agent RRT: Sampling-based cooperative pathfinding. In Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Saint Paul, MN, USA, 6–10 May 2013.
Betts, J.T. Practical Methods for Optimal Control and Estimation Using Nonlinear Programming; SIAM: Philadelphia, PA, USA, 2010.
Bonami, P.; Olivares, A.; Soler, M.; Staffetti, E. Multiphase mixed-integer optimal control approach to aircraft trajectory optimization. J. Guid. Control Dyn. 2013, 36, 1267–1277.
Wächter, A.; Biegler, L.T. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 2006, 106, 25–57.
Andersson, J.A.E.; Gillis, J.; Horn, G.; Rawlings, J.B.; Diehl, M. CasADi—A software framework for nonlinear optimization and optimal control. Math. Program. Comput. 2019, 11, 1–36.
Lindqvist, B.; Mansouri, S.S.; Agha-Mohammadi, A.A.; Nikolakopoulos, G. Nonlinear MPC for Collision Avoidance and Control of UAVs With Dynamic Obstacles. IEEE Robot. Autom. Lett. 2020, 5, 6001–6008.
Hoogendoorn, S.; Knoop, V.; Mahmassani, H.; Hoogendoorn-Lanser, S. Game-theoretical approach to decentralized multi-drone conflict resolution and emergent traffic flow operations. arXiv 2023, arXiv:2308.01069.
Başar, T.; Olsder, G.J. Dynamic Noncooperative Game Theory, 2nd ed.; SIAM: Philadelphia, PA, USA, 1999.
Spica, R.; Falanga, D.; Cristofalo, E.; Montijano, E.; Scaramuzza, D.; Schwager, M. A real-time game theoretic planner for autonomous two-player drone racing. IEEE Trans. Robot. 2020, 36, 1389–1403.
Rasoulie, F. Cooperative graph-based predictive collision avoidance (CGPCA): A decentralized framework for safe drone traffic management. IEEE Access 2025, 13, 132450–132458.
Federal Aviation Administration. Aviation Weather Services. Advisory Circular AC 00-45H; U.S. Department of Transportation: Washington, DC, USA, 2016. Available online: https://www.faa.gov/regulations_policies/advisory_circulars/index.cfm/go/document.information/documentid/1030235 (accessed on 17 May 2026).
EUROCONTROL. User Manual for the Base of Aircraft Data (BADA), Revision 3.14; EEC Technical/Scientific Report No. 16/12/07-01; EUROCONTROL Experimental Centre: Brétigny-sur-Orge, France, 2016.
ICAO. Procedures for Air Navigation Services—Air Traffic Management (PANS-ATM). Doc 4444, 16th ed.; International Civil Aviation Organization: Montréal, QC, Canada, 2016; ISBN 978-92-9258-081-0.
Federal Aviation Administration. Aeronautical Information Manual (AIM); U.S. Department of Transportation: Washington, DC, USA, 2026. Available online: https://www.faa.gov/air_traffic/publications (accessed on 17 May 2026).

Figure 1. Single-Aircraft, Three-Thunderstorm Avoidance Test Result.

Figure 2. Multi-Aircraft Avoidance Test Result.

Figure 3. Inter-Aircraft Safety Separation.

Figure 4. Aircraft-Thunderstorm Safety Separation.

Figure 5. Flight Speed Adjustment Strategy.

Figure 6. Baseline Method Comparison.

Figure 7. Comparison of Trajectories Before and After Rerouting.

Table 1. Simulation Parameter Settings.

Parameter Category	Parameter Name	Value	Unit
Safety Constraints	Minimum separation between aircraft	10	km
Safety Constraints	Minimum separation from thunderstorm zones	20	km
Optimization Model	Number of discrete time segments	100	-
Optimization Model	Weight for path length ($w_{1}$)	5.0	-
Optimization Model	Weight for control energy consumption ($w_{2}$)	0.8	-
Optimization Model	Weight for velocity smoothness ($w_{3}$)	0.1	-
Optimization Model	Weight for curvature penalty ($w_{4}$)	1.0	-

Table 2. Verification of ETA Constraints in the Multi-aircraft Scenario.

Aircraft	Preset Arrival Time/min	Actual Arrival Time/min	Arrival Time Error/min
Aircraft 1	10.0	10.02	+0.02
Aircraft 2	11.2	11.21	+0.01
Aircraft 3	11.9	11.89	−0.01
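
The arrival-time errors in Table 2 are reported in minutes, which makes their size hard to judge at a glance. The following minimal sketch converts them to seconds using only the preset and actual arrival times from Table 2; the dictionary and function names are illustrative and do not come from the paper.

```python
# Preset and actual arrival times in minutes, copied from Table 2.
eta_table = {
    "Aircraft 1": (10.0, 10.02),
    "Aircraft 2": (11.2, 11.21),
    "Aircraft 3": (11.9, 11.89),
}

def eta_error_seconds(preset_min, actual_min):
    """Signed arrival-time error in seconds (positive means late)."""
    return round((actual_min - preset_min) * 60.0, 1)

# Hypothetical helper structure: aircraft name -> error in seconds.
errors_s = {name: eta_error_seconds(p, a) for name, (p, a) in eta_table.items()}

for name, err in errors_s.items():
    print(f"{name}: {err:+.1f} s")
```

Under this conversion the three errors are 1.2 s, 0.6 s, and −0.6 s, which agrees with the abstract's statement that arrival-time errors are on the order of seconds.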

Table 3. Quantitative Comparison of Trajectory Planning Methods.

Metric	The Proposed Method	A* + IBR	ADMM
Total Path Length/km ↓	446.57	505.89	495.15
Avg Path Efficiency (actual/direct) ↓	1.012	1.149	1.131
Trajectory Smoothness Σ(Δθ)² ↓	0.153	64.466	36.249
Control Effort Σa² ↓	5.528	0.000	118.467
Speed Variation σ/(km/min) ↓	0.347	0.000	0.239
Min Inter-Aircraft Separation/km ↑	10.347	9.510	11.122
Inter-Aircraft Separation Violations ↓	0	1	0
Min Thunderstorm Clearance/km ↑	20.100	19.728	20.100
Thunderstorm Clearance Violations ↓	0	1	0
Wall-clock Time/s ↓	3.67	9.20	25.33
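
To put the absolute values in Table 3 into relative terms, the sketch below computes the path-length reduction and the wall-clock speed-up of the proposed method against each baseline. It uses only two rows of Table 3; the variable names and the choice of these two derived ratios are illustrative assumptions, not metrics reported by the authors.

```python
# Selected rows of Table 3, ordered as (proposed, A* + IBR, ADMM).
table3 = {
    "total_path_length_km": (446.57, 505.89, 495.15),
    "wall_clock_time_s": (3.67, 9.20, 25.33),
}

def percent_reduction(proposed, baseline):
    """Reduction of the proposed value relative to a baseline, in percent."""
    return 100.0 * (baseline - proposed) / baseline

len_prop, len_astar, len_admm = table3["total_path_length_km"]
path_saving_vs_astar = percent_reduction(len_prop, len_astar)
path_saving_vs_admm = percent_reduction(len_prop, len_admm)

t_prop, t_astar, t_admm = table3["wall_clock_time_s"]
speedup_vs_astar = t_astar / t_prop  # how many times faster than A* + IBR
speedup_vs_admm = t_admm / t_prop    # how many times faster than ADMM

print(f"Path length saved vs A* + IBR: {path_saving_vs_astar:.1f}%")
print(f"Path length saved vs ADMM:     {path_saving_vs_admm:.1f}%")
print(f"Speed-up vs A* + IBR: {speedup_vs_astar:.1f}x")
print(f"Speed-up vs ADMM:     {speedup_vs_admm:.1f}x")
```

On these figures the proposed method shortens the total path by roughly 12% and 10% relative to A* + IBR and ADMM, and runs about 2.5 and 6.9 times faster, respectively.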

Table 4. Performance by Aircraft Count Group.

Aircraft Count	Obstacle Safety	Inter-AC Safety	Solve Time/s	Mean Detour Ratio
1–3	100%	100%	0.45–4.10	6.36%
4–7	100%	100%	3.22–16.51	6.46%
8–10	97.2%	95.8%	16.07–31.81	6.79%

Table 5. Planned Waypoints for the Subject Flights.

Waypoint	Name	Longitude	Latitude
ZGGG	Guangzhou/Baiyun	113°18′29″ E	23°23′35″ N
ZSOF	Hefei/Xinqiao	116°58′30″ E	31°59′12″ N
ZLXY	Xi’an/Xianyang	108°45′00″ E	34°27′18″ N
ZSSS	Shanghai/Hongqiao	121°21′00″ E	31°11′48″ N
PLT	-	114°52′30″ E	25°48′29″ N
LEKUV	-	116°12′37″ E	26°55′13″ N
NF	-	116°12′33″ E	27°33′40″ N
XUVGI	-	116°55′15″ E	27°31′09″ N
P215	-	117°32′38″ E	28°03′31″ N
P574	-	116°11′24″ E	28°11′42″ N
P553	-	117°42′22″ E	26°45′10″ N
ZSAM	Xiamen/Gaoqi	118°07′36″ E	24°32′42″ N
ZSHC	Hangzhou/Xiaoshan	120°26′59″ E	30°13′41″ N

Table 6. Rerouted Waypoints.

Waypoint	Name	Longitude	Latitude
ZGGG	Guangzhou/Baiyun	113°18′29″ E	23°23′35″ N
ZSOF	Hefei/Xinqiao	116°58′30″ E	31°59′12″ N
ZLXY	Xi’an/Xianyang	108°45′00″ E	34°27′18″ N
ZSSS	Shanghai/Hongqiao	121°21′00″ E	31°11′48″ N
PLT	-	114°52′30″ E	25°48′29″ N
LEKUV	-	116°12′37″ E	26°55′13″ N
X1	-	117°06′15″ E	27°17′07″ N
X2	-	116°43′20″ E	27°38′35″ N
X3	-	117°07′00″ E	27°36′37″ N
X4	-	116°43′37″ E	27°36′37″ N
NF	-	116°12′33″ E	27°33′40″ N
XUVGI	-	116°55′15″ E	27°31′09″ N
P215	-	117°32′38″ E	28°03′31″ N
P574	-	116°11′24″ E	28°11′42″ N
P553	-	117°42′22″ E	26°45′10″ N
ZSAM	Xiamen/Gaoqi	118°07′36″ E	24°32′42″ N
ZSHC	Hangzhou/Xiaoshan	120°26′59″ E	30°13′41″ N
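
The waypoint coordinates in Tables 5 and 6 are given in degrees, minutes, and seconds, whereas distance calculations usually need decimal degrees. The sketch below shows one way to parse such strings and estimate the great-circle distance between two tabulated waypoints. The function names, the regular expression, and the spherical Earth radius of 6371 km are assumptions made for illustration; the paper does not state which geodetic model it uses.

```python
import math
import re

# Matches strings such as "114°52′30″ E" (degree, prime, double-prime symbols).
DMS_PATTERN = re.compile(r"(\d+)°(\d+)′(\d+)″\s*([NSEW])")

def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + int(seconds) / 3600
    # South and west are negative by the usual sign convention.
    return -value if hemisphere in "SW" else value

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km on a sphere (the radius is an assumption)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Two rows of the waypoint tables: PLT and LEKUV.
plt_lat, plt_lon = dms_to_decimal("25°48′29″ N"), dms_to_decimal("114°52′30″ E")
lekuv_lat, lekuv_lon = dms_to_decimal("26°55′13″ N"), dms_to_decimal("116°12′37″ E")

leg_km = haversine_km(plt_lat, plt_lon, lekuv_lat, lekuv_lon)
print(f"PLT -> LEKUV: {leg_km:.1f} km")
```

The haversine formula is chosen here because it is simple and numerically stable for short legs; an ellipsoidal model would give slightly different distances.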

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Su, R.; Wen, X.; Li, S.; Chen, Y.; Yang, W. A Cooperative Trajectory Planning Method for Multi-Aircraft Thunderstorm Avoidance Based on Optimal Control and Game Equilibrium. Aerospace 2026, 13, 537. https://doi.org/10.3390/aerospace13060537
