A Data-Driven Heuristic Method for Irregular Flight Recovery

: In this study, we develop a data-driven heuristic method to solve the irregular flight recovery problem. Based on operational data from China South Airlines, Beijing, China, we evaluate the importance of a flight in the flight network and the influence of a delay on a flight and its subsequent flights. Then, we classify historical states into three scenarios according to their delay reasons and investigate the recovery patterns for each scenario. Inspired by the results of the data analysis, we develop a heuristic algorithm that imitates dispatcher actions. The algorithm is based on two basic operations: swapping the tail numbers of two flights and resetting their flight departure times. The algorithm can provide multiple recovery plans in real time for different scenarios, and we continue to refine and validate the algorithm for more robust and general solutions through a cost analysis. Finally, we test the efficiency and effectiveness of the recovery method based on the flight schedule, with real and simulated delays, and compare it with two other methods and the recovery actions of dispatchers.


Background
Incidents such as poor weather, aircraft failures, and air traffic control sometimes result in being unable to implement normal flight schedules.Over the last decade, a lot of research has been devoted to developing decision support systems for irregular flight recovery problems (see [1,2] for two review papers).The majority of mathematical models for solving airline recovery problems are similar to the optimization methods applied for planning and scheduling purposes.In general, the irregular flight recovery problem can be formulated as a large-scale integer programming problem and it has complex variables and constraints [3,4].Two classical methods for this problem are mixed-integer programming and the set covering model (see Section 1.2 for details).However, these methods have two main challenges in their implementation.First, their objective functions and constraint functions often include a large set of parameters, such as the expected delay time, delay costs, operation rules, airport capacity, crews, passengers, and operational costs.The optimal solution is often sensitive to changes in the objective function and these parameters.However, in reality, the delay time cannot be estimated precisely and the accurate cost information is unknown to dispatches.Thus, these methods may fail to obtain the optimal solution if some parameters are inaccurate or unknown.Second, even if the optimal solution is obtained, it may be difficult (or unable) to implement due to some external reasons that are not considered in the model.In this case, an alternative solution is required.Thus, an algorithm that can generate multiple feasible (not necessarily optimal) solutions would be beneficial.
To overcome these challenges, heuristic methods have been proposed to solve the irregular flight recovery problem (see Section 1.2 for details).Recently, [5] developed a two-step heuristic method by imitating dispatcher actions.The first step was to establish a scoring system to quantify the importance of a flight through the analytic hierarchy process (AHP).The second step was to generate adjustment plans using two types of recovery operations, by replacing one aircraft with another or by swapping two aircrafts.Specifically, to validate the scoring system and the plan generation method, they sent questionnaires with pairwise comparisons of different simulation cases to dispatchers from the China South Airlines Beijing Office, Beijing, China.The questionnaires included 26 questions that could be decomposed as 36 pairwise comparisons of recovery cases, with different types of flights, aircrafts, passengers, and delay times (see details in [5]).Their results showed that the scoring system could explain most dispatcher recovery operations, and adjustment schemes could be generated in real time.However, their method still had some limitations.First, due to a lack of operational data, the scoring system only incorporated the aircraft type, flight information, and delay time.In reality, the significance of a flight is also influenced by the flight network.Second, when generating adjustment plans, their method only considered the plan of the day, and subsequent irregular flights (e.g., the return flight in a round trip) were not taken in to account.Finally, when evaluating these adjustment plans, their method focused on scores and delay times; other factors, such as the delay costs and operational costs, were not considered.
As an extension of [5], this study develops a data-driven decision support system based on operational data from the China South Airlines Beijing Office, Beijing, China.We analyze data from three angles.First, we evaluate the importance of an aircraft and a flight in the flight network.Second, we classify historical states into different scenarios according to their delay reasons and investigate the recovery patterns (e.g., exchange, replacement, or cancellation) for each scenario.Third, we evaluate the influence of a delay on a flight and its subsequent flights.Inspired by the data analysis results, we develop a heuristic recovery algorithm by imitating the actions of dispatchers.The algorithm can provide multiple recovery plans in real time for different scenarios, and we continue to refine and validate the algorithm for more robust and general solutions by evaluating the delay costs and adjustment costs of the generated plans.Furthermore, we compare our method with the heuristic method in [5] and the classical optimization method (a two-stage mixed integer programming method, MIP for short), as well as the dispatcher choices in the scenario of single irregular flight.Overall, our method is much faster and more efficient.It performs better than Pei et al.'s method, both in the quantity and quality of its solutions.Furthermore, it can generate plans that are the same as or similar to the MIP method and the dispatcher choices, giving it some advantages in implementation.

Related Research
A popular method is to describe the recovery problem as a large-scale integer programming problem.Next, the problem can be studied through integer programming [6][7][8], mixed-integer programming [9][10][11][12], and its variants, such as mixed integer nonlinear programming [3,8], conic quadratic mixed integer programming [3], conic mixed integer programming [13,14], stochastic mixed-integer programming [15], and stochastic binary integer programming [16].Specifically, [6,7] built an integrated integer programming model based on a time-band network for single-fleet aircraft recovery, multi-fleet aircraft routing, and passenger transit optimization.[11] proposed a separate mixed-integer mathematical model for scheduling, aircraft, crew, and passenger recovery problems.They used a Benders decomposition scheme, together with the column generation approach, to achieve coordination among these four mathematical models.[13] first used the implementation of a conic quadratic optimization approach to solve a critical aircraft recovery problem in an optimal manner.Then, [3] formulated the recovery problem as a mixedinteger nonlinear programming (MINLP) model and reformulated it as a conic quadratic mixed-integer programming (CQMIP) problem, which could be solved with commercial optimization software such as version 12.1 of IBM ILOG CPLEX, IBM, New York, USA.[12] modeled the joint aircraft and passenger recovery problem as a mixed-integer program and presented a column generation post-optimization heuristic method to solve it.Recently, [15] proposed a chance-constrained two-stage stochastic mixed-integer programming model to solve the extended aircraft arrival management problem with uncertainty.
Secondly, many studies have considered the recovery problem as a set covering [17], set packing [18], or set partitioning model [8,19].Methods such as Benders decomposition [20], column generation [21], branch-and-bound [19,20,22], Dantzig-Wolfe decomposition [18,22], Dang's iterative method [23], and Lagrangian decomposition [21] are typically used to solve these types of models and problems.For example, [19] formulated the crew scheduling problem as a set-partition-type problem and used a column generation method embedded in a branch-and-bound search tree to solve it.[8] then proposed an integer programming model that was based on the traditional set partition model for the integrated airline recovery problem, which included flight recovery, aircraft recovery, and crew recovery simultaneously.[18] developed a mixed-integer multicommodity flow model with side constraints and further reformulated it into a set packing model using Dantzig-Wolfe decomposition.Recently, [17] used a set covering problem with side constraints to investigate the intra-and inter-fleet models for solving the crew scheduling problem during irregular flight operations.
In addition, some researchers have applied heuristic methods to solve the recovery problem.For example, based on the greedy random adaptive search procedure (GRASP), [24] developed a new hybrid heuristic procedure for irregular flight recovery.[25] presented a heuristic approach based on the shortest path problems for integrated flight, aircraft, and passenger rescheduling under disruptions.[26] designed a large neighborhood search heuristic method to solve an integrated aircraft and passenger recovery problem.Based on a large neighborhood search metaheuristic model, [27] defined an optimization approach for tackling the Stochastic Aircraft Recovery Problem.[28] used a heuristic method to solve integrated recovery for both aircraft routing and passengers.Recently, [5] developed a simple two-step heuristic model by imitating dispatcher actions.

Paper Organization
The remainder of this paper is organized as follows (see Figure 1).Section 2 analyzes the historical operational data provided by China South Airlines, Beijing, China.Section 2.1 evaluates the importance of an aircraft and a flight in the flight network, Section 2.2 summarizes delay reasons and their related recovery patterns, and Section 2.3 evaluates the influence of a delay on a flight and its subsequent flights.Section 2.4 provides an example of the score calculation.Section 3 develops and validates a three-step, real-time recovery algorithm.The detailed process of this algorithm is presented in Sections 3.1 and 3.2.Section 3.3 analyzes the delay costs and adjustment costs for the generated adjustment plans.Section 3.4 tests the effectiveness of the scoring system and recovery algorithm based on the historical operational data.Section 4 compares the proposed method with two other methods, the heuristic method in [5] and a two-stage mixed integer programming model.We conclude with the primary results and future research directions in Section 5.

Data Analysis
The data considered in this study were provided by the China South Airlines Beijing Office, Beijing, China.The data were collected from March 2018 to December 2018 (10 months) and describe 35 aircraft, 85 airports, 537 flights, and 29,202 records.Each record includes the flight number, aircraft tail number, departure airport, arrival airport, delay time, schedule information, and recovery actions (no action, change in tail number, or cancellation of a flight) of dispatchers.
In this section, we first evaluate the importance of an aircraft or a flight by analyzing the effect of removing it from the flight network.Intuitively, removing a more important aircraft (or flight) should have a greater influence on the flight network, such as increasing the average distance or even affecting the strong connectivity.Then, we analyze delay reasons and their recovery action patterns.Finally, we use a scoring system to evaluate the influence of a delay on a flight and its subsequent flights.

Flight Network Analysis
To form a flight network, we consider airports as nodes and flight records as edges.Thus, a flight network is a direct, multi-edge, time-varying network.Some flights are daily, weekly, or twice per week, but most do not have a clear period.Additionally, some flights only operate in certain months.Thus, it is not easy to evaluate the importance of an aircraft or a flight based on the full (10-month) network.Therefore, we divide the flight data into different time periods (i.e., different numbers of days) and analyze the properties of the flights and aircraft networks for each short period.Because dispatchers from China South Airlines, Beijing, China.can only see the schedule for the next two days when managing irregular flights, we primarily focus on the flights and aircraft in 1-day and 2-day networks (see Table 1).In particular, the importance of a flight or an aircraft is evaluated based on a 1-day network and the proposed recovery algorithm is based on a 2-day flight network.According to the management rule of China South Airlines, Beijing, China, flights can be identified as "international", "single", "low density", or "high density", where the former types are more important than the later.Naturally, "International" indicates international flights.The other three types describe the daily flight frequency.A flight from A to B, which corresponds to a direct edge from A to B, is called a "single" if it is the only flight from A to B in all the 1-day flight networks of the month.Thus, removing it can affect the strong connectivity or average distance of the network.In contrast, a flight is "high density" if removing it does not affect the connectivity or average distance of all the 1-day flight networks of the month; thus, a passenger can easily rebook another flight if a high-density flight is cancelled.Other flights are "low density" (i.e., it is the only flight from A to B in a 1-day flight network).As an example, in March 2018, 38.6% (54 out of 140) of flights were identified as "single", 13.6% (19) of flights were "low density", and 47.9% (67) of flights were "high density".
Then, we analyze the 2-day flight networks of different aircraft.Figure 2a shows a representative 2-day flight network, which includes 30 aircraft, 21 airports, and 159 flights.In this network, flights with the same aircraft tail number form a weakly connected subnetwork, which we refer to as the aircraft subnetwork.Figure 2b-d shows three typical aircraft subnetworks in other 2-day flight networks.Conversely, if an aircraft subnetwork is strongly connected, then it consists of round-trip flights or directed cycles with lengths of > 2 (see Figure 2b).Additionally, most round trips are between PEK and another airport, and most aircraft stay overnight at PEK, as shown in the data provided by China South Airlines, Beijing, China.Conversely, an aircraft subnetwork does not need to be strongly connected, even if the representative flight network is strongly connected (see Figure 2c,d).Overall, approximately 60% of the aircraft subnetworks are strongly connected, which implies that, in many cases, two aircraft can be swapped back at PEK overnight.The swap back problem will be considered in detail in Section 3.2.

Delay Reason Analysis
In this section, we analyze delay reasons and their different recovery action patterns based on the database built with the flight history and change logs provided by the China South Airline Beijing Office.Specifically, we match the records in the flight history with the adjustment records (cancellation, change in tail number, or change in aircraft type) and delay reasons in the change logs.In addition, flight records without adjustments are labeled as no recovery action.Overall, 9931 (out of 29,201) flight records were delayed; thus, the average delay rate was 34.01%.However, the delay time was 0 in 4003 records, and in 3294 of them, the dispatchers took no recovery action.Then, a question is why the delay time was 0. There are two possible explanations for this.First, the delay of an irregular flight could have been repaired or cancelled.Second, the expected delay did not occur or was short (e.g., a few minutes) and did not affect the schedule.We suspect that most delays were due to the second reason, because dispatchers did not make any adjustments in most records.For the remaining 5928 records with a delay time of > 0, the delay time was less than 1 h in 3282 records (i.e., the delay was short) and between 1 and 4 h in 2301 records (i.e., the delay was long).Delays with times of > 0 may be due to two reasons: (1) an irregular flight was late; and (2) a regular flight was used to fix an irregular flight, and thus, the regular flight was delayed.Because there was no recovery action in 5274 of the 5928 records, most delays were likely due to the first reason.
We then analyze the reasons for these delays.There are 84 different delay reasons in the database and we classify them into three categories: (i) aircraft reasons, (ii) route/airport reasons, and (iii) other.Aircraft reasons mean that the aircraft that is designated for a given task is unavailable or cannot take off on time; thus, the dispatcher may have to identify a fungible aircraft for the assignment.Overall, 39 of the 84 delay reasons are sorted into aircraft reasons, which corresponds to 2340 change records.Typical delay reason codes include AK1, KD, HS, and HQ (see Table 2 for detailed explanations about these delay reason codes).In contrast, route/airport reasons represent the reasons for a scheduled route or airport not being available.Overall, 23 of the 84 delay reasons are sorted into this category, which corresponds to 7310 change records (see Table 2).Typical delay reason codes include AK2 and KF.The remaining 22 delay reasons are sorted into the other category.Only 281 change records are caused by other reasons.Typical delay reason codes include IC, SQ, and DH.Table 2 also shows the repair rates for the different categories of delays.Repair means that the dispatchers took recovery actions of changing the tail number (i.e., they tried to fix the irregular flight).The repair rate for aircraft reasons is triple that of the rate for route/airport reasons; thus, it makes sense that an aircraft-caused delay is easier to repair, because the dispatchers can replace the unavailable aircraft with another.In contrast, dispatchers cannot do anything except reset the departure times for the affected flights if an airport is closed and not allowing aircraft to takeoff.Because the recovery logics for aircraft-caused delays and route/airport-caused delays are different, we will design two different recovery processes for them (see Section 3).

Irregular Flight Evaluation
When an incident occurs, we must consider the reason for the delay and determine the flights that are affected by this incident.For example, if the weather conditions are unfavorable at an airport, then all the flights related to this airport will be affected.Then, we estimate the expected delay time for each flight directly affected by the incident.This information can be obtained from air traffic, airline, and airport companies.In addition to the flights directly affected by the incident, their subsequent flights may also be affected.A specific delay reason is that the previous flight was late.We can consider that, if a flight takes off or arrives late, then the expected delay for its subsequent flight can be calculated by the following equation: Expected delay = planned arrival time + delay +1 h turnover − planned departure time.
A long time delay may cause delays for several subsequent flights and their expected delays can be estimated using the same logic.
We then use a scoring system to evaluate the influence of an incident.[5] established an AHP scoring system based on questionnaires completed by dispatchers from China South Airlines, Beijing, China.where the score of an irregular flight was determined by four types of factors: delay time, aircraft type, flight properties, and VIP passengers (see Table 3).In particular, a flights' score was 0 when there was no delay, thus requiring no attention from the dispatchers.Applying this scoring system, each irregular flight will be given a real-time score, where a higher score indicates that the irregular flight has a greater influence on the airline system.In this study, we extend the scoring system developed in [5], such that the influence of an irregular flight on its subsequent flights is considered.Specifically, the cumulative score of an irregular flight is defined as the sum of its score and the scores of its subsequent flights in the 2-day subnetwork.In the next section, an example is provided to show how to calculate the expected delay time, flight real-time scores, and cumulative scores (see Tables 4 and 5).

An Example of the AHP Model and Score Calculation
Based on dispatcher interviews, [5] established the AHP scoring system (see Figure 3) and calculated the weighted values for the four clusters: flight, aircraft, passenger, and delay time, and the weighted values for the elements within these four clusters (see Table 3).In this section, we evaluate and estimate the impact and importance of an irregular flight by applying the same values.We then consider an example from the historical data.On 19 April 2018, flight CZ6991 took off 191 min late due to poor weather.Therefore, its expected arrival time was 20:15 + 191 min = 23:26.Thus, the expected delay of its subsequent flight CZ6992 was 23:26 + 1 h, with a turnover of 21:55 = 151 min, and the expected departure time for CZ6992 was 0:26 (see Table 4).
Following the example shown in Table 4, we calculate the scores for the two irregular flights.Both them were low-density flights and operated by a narrow aircraft.Additionally, their delay times were long and there were no VIP passengers.Thus, their scores are 0.015 + 0.017 + 0.21 = 0.242.Because CZ6992 was the subsequent flight of CZ6991, the cumulative score of CZ6991 was 0.242 + 0.242 = 0.484; thus, if the delay of CZ6991 was mitigated, the total reduction in the score was 0.484 (see Table 5).

Basic Structure of the Algorithm
By imitating dispatchers' decision processes, we design a heuristic algorithm to repair irregular flights.This heuristic algorithm has three steps (see Figure 4).The first step is to consider the delay reasons and evaluate the influence of an incident.As mentioned in Section 2.3, each flight affected by the incident will be distributed a real-time score representing the significance of the flight and the system (operational) impact due to the current delay.In particular, in the score calculation, the delays of subsequent flights (on that day and the next day) are also considered.Thus, a long time delay may cause delays for these subsequent flights, which increases the cumulative score of irregular flights.Then, we need to confirm the threshold value for the scoring system to identify the flights that are important enough to repair.The threshold value will trigger the recovery system.When there is no irregular flight (all flights' scores are 0), or when the influence of an irregular flight is small, all flights' scores should be lower than the threshold, and no operation is required.Only when some flights' scores are above the threshold will the algorithm be triggered, moving to the adjustment phase and searching for recovery plans.
The second step is to generate adjustment plans.We design two different recovery processes for aircraft-caused delays and route/airport-caused delays based on two basic operations: swapping the tail numbers of two flights and resetting the departure time of a flight.In a swap, the tail numbers of subsequent flights will be changed and the two aircrafts will be swapped back when they meet again at the same airport.When resetting a departure time, the departure times of subsequent flights should also be considered.Specifically, cancelling a flight means that its departure time is postponed indefinitely.Thus, replacing one (irregular) aircraft with another is shown as a combination swap of the two aircraft and a reset of the departure times of their related flights.Additionally, when there are multiple irregular flights, solutions will be first provided to flights with higher scores.
The third step is to evaluate the advantages and disadvantages of the generated plans.A plan is regarded as feasible when it satisfies the next three criteria: (i) decreases or maintains the irregular flight's score; (ii) decreases the total scores for all the involved flights; and (iii) reduces the total delay time.In addition to the three criteria, we analyze these feasible plans from the perspectives of delay costs and adjustment costs (e.g., total number of recovery actions or flights involved in a plan).
We will now explain the recovery methods in detail for different delay scenarios.

Aircraft Reason with One Irregular Flight
In this scenario, one flight is identified as an irregular flight (i.e., its score exceeds the threshold value) due to a delay caused by an aircraft.Before running the algorithm, we must set the following three parameters: (i) a time window for searching the flights for adjustment; (ii) operational restrictions (e.g., earliest and latest departure times for the flights, overnight airport for an aircraft); and (iii) the maximum number of adjustment steps.
The key step of the irregular flight recovery algorithm is swapping two aircraft.During the adjustment, the recovery algorithm searches for alternative aircrafts that are unoccupied or have the same departure airport within a given time window.Because there is no passenger information, alternatives are also required for aircraft types with seat configurations larger than or equal to that of the previous aircraft.As an extension of the method reported by [5], the local flight network of an irregular flight is considered during its recovery.In addition to swapping the tail numbers of the irregular flight and the flight that is used to repair this irregular flight, the tail numbers of their subsequent flights will also be swapped.Additionally, if two aircraft reappear at the same airport at a similar time, they will be swapped back.This operation makes the proposed adjustment plan more reasonable and feasible.
After swapping the two aircraft, we may need to reset the departure times for their related flights.The basic principle of this is the flights should take off as early as possible, where the earliest time is the planned departure time.The last step is to evaluate the adjustment plan.We calculate the scores and delay times of the related flights and check whether the adjustment plan is feasible.In the plan output, we will show the score and delay time of the irregular flight; the total score and delay time of its related flights; and the reductions in the score and delay time.After a suggested operation, the scores of other flights may be altered and reach values above the threshold, requiring further operations.In this case, the system will repeat the adjustment procedure until all of the flights' scores are below the threshold.
To improve the response speed and operant level for operators, the algorithm will also stop searching when it reaches a certain number of operation steps.For example, the algorithm stops searching for a solution once the dispatcher performs more than four operations.When the algorithm produces a series of solutions, the system will rank and display all the feasible plans according to the feasible plan criteria.

Aircraft Reason for Multiple Irregular Flights
In a scenario with multiple irregular flights, the recovery course begins with the highest scored flight (i.e., the most important or impacted), where the recovery algorithm is the same as the one irregular flight scenario.After the most impacted irregular flight is repaired, we update the flight schedule and check the scores of the related flights.The algorithm will repeat this process until all the irregular flights are repaired or a recovery plan cannot be found in a predefined number of action steps.Finally, we evaluate the feasibility of potential plans and output feasible plans.

Airport and Route Reasons
In this scenario, an airport (or a certain route) is closed due to poor weather, traffic control, or military reasons.As a result, no flight can take off during the closure of this airport.Our recovery algorithm will identify the irregular flights that are affected by this incident and reset the departure times for these flights.Before running the algorithm, we need to set two parameters: (i) the opening time of the airport (i.e., the earliest departure time); and (ii) the minimum time interval for takeoff.The algorithm will first find the possible departure times under the restrictions of the minimum time interval for takeoff.Then, the algorithm reschedules the departure times for the irregular flights, which can minimize the total score and avoid very long delays (if possible).Finally, we evaluate potential plans and output feasible plans.

Feasible Plan Evaluation
A plan is first evaluated based on the scoring system and delay time.In particular, a feasible plan should not increase the (cumulative) scores and delay times of irregular flights and should decrease the total scores and delay times of related flights.To better assess an adjustment plan, we analyze its costs from the perspective of delay costs and adjustment costs.
Many existing optimization methods try to minimize delay costs.To calculate these delay costs, we must estimate the economic benefits of a flight, which is a difficult problem.Although China South Airlines, Beijing, China.has specialized departments for analyzing flight costs, there are a lack of public data.Therefore, we choose a study on European aviation as a reference to qualitatively evaluate these delay costs [40].A detailed calculation of the delay costs is discussed in Section 4.3.
In addition to delay costs, adjustment costs should also be considered.Specifically, changes in tail numbers and departure times should be as small as possible.Thus, in the plan output, we will show the number of flights involved in the adjustment process and whether their related flights can be swapped back in two days.
In Section 4.2, we develop a two-step mixed integer programming (MIP) method that to minimize the delay costs and adjustment costs.Thus, the solution of the MIP algorithm can be used to evaluate the cost of the plans generated by the proposed heuristic method.Detailed results are shown in Section 4.3.

Scoring System Validation
Because the proposed method relies on a scoring system, we must consider whether this scoring system can effectively explain the behaviors of dispatchers.We test the scoring system with historical scenarios, particularly in one-step cases, such as replacing one aircraft with another and swapping two aircrafts.The reason is twofold: (1) these two steps of operation can easily be identified from the data, and it is easy to identify an irregular flight (or aircraft); and (2) the logic of these two steps of action is clear (i.e., problematic flights have a higher priority than regular flights that are used for the adjustment).By calculating the scores of irregular flights before and after the operation, we can validate the scoring system.
Overall, we obtain 60 cases from the data, where the expected delay can be precisely estimated and the recovery action is a single-step swap or replacement.Of the 60 cases tested, 56 cases satisfy the logic of the scoring system.Compared to no operation, the scores are decreased or kept the same, and the delay time is decreased after the dispatchers take action.Additionally, in 23 cases, the proposed optimal solutions are identical to those chosen by the dispatchers, which implies that the logic of the scoring system conforms with the operation logic of the dispatchers, and the plan generated by the algorithm is similar to the dispatchers' choice.

Effectiveness and Efficiency of the Recovery Algorithm
We program the recovery algorithm using Python; the algorithm routine and its description can be found in the attached file "Flow chart.docx".After importing the flight schedule and setting the expected delays, the algorithm produces adjustment plans if they exist and presents a decrease in the score and delay time in real time.Unfortunately, it is hard to execute the recovery algorithm based on historical data, because the expected delay times for problematic flights (before adjustment) are unidentified in most cases.To overcome this challenge, we validate the algorithm response time with a simulated delay within a one-day flight schedule (1 June 2018).
The one-day flight schedule is provided by China South Airlines, Beijing, China.and includes 92 flights, with 29 aircrafts interacting with 38 airports.We validate the algorithm with two simulation conditions that have different levels of delay severity.In each simulation iteration, one of the 92 flights is considered to be the problematic flight, and the delay time is 90 min in the long delay situation and 300 min in the very long delay situation.With a long delay, the algorithm can provide 483 feasible plans for 92 flights, where 41 plans can swap back.In contrast, with a very long delay, the algorithm can provide 551 feasible plans for 92 flights, where 42 plans can swap back.The repair rates would be higher, in fact, because the next day's flights can also be used to repair these problematic flights.Importantly, 184 iterations of the plan output (92 flights × two types of delay conditions) can be accomplished in real time (<1 s), which is much faster than that of most existing algorithms.

Comparison Analysis
To further validate our heuristic method, we compare it with two other methods: the heuristic method in [5] and the classical optimization method (a two-stage mixed integer programming method, which will be introduced later).In addition to the two simulation conditions described in Section 3.4.2,two real cases of aircraft reasons with one irregular flight are considered, where detailed information about the incidents (e.g., detailed reasons and expected delays) and the actions of dispatchers are provided by China South Airlines, Beijing, China (see Section 4.3).Therefore, we also compare the plans generated by the proposed heuristic method with the choices of dispatchers.

Heuristic Method
The proposed method (i.e., "the new method" in this section) is an improvement on the heuristic method from [5] (i.e., "the previous method" in this section).The two methods are based on the same scoring system (see Figure 3).We compare the results of the two methods based on the one-day flight schedule with a simulated delay.With a long delay of 90 min, the previous method can provide solutions for 45 flights, while the new method can provide solutions for 54 flights.With a very long delay of 300 min, the previous method can provide solutions for 32 flights, while the new method can provide solutions for 59 flights.In both cases, the new method performs better than the previous method, where the new algorithm can generate adjustment plans that the previous algorithm cannot.Additionally, the new algorithm evaluates the influences of subsequent flights and considers the swap back problem, which implies that the adjustment plans generated by the new method are easier to implement.

Mixed-Integer Programming Method
The classic optimization approach formulates the recovery problem as a mixed-integer programming (MIP) problem, where the optimization objective is to set departure times and tail numbers for each flight with the aim of minimizing the cost function.Here, we describe a two-step MIP algorithm, where the first step is to set the arrival/departure times for all the flights, minimizing the total costs of delays and cancellations, and the second step is to assign aircraft (tail numbers) to these flights, minimizing the adjustment costs.
The parameters for the first step of the algorithm are shown in Table 6.These parameters can be classified into three types: input parameters, algorithm parameters, and output parameters.The input parameters include information on the flight schedule, expected delay, and flight properties.These parameters can be obtained from the flight data.The algorithm parameters include the minimum time intervals for arrival and departure, cost coefficients based on flight types, costs of delays per min, and costs of cancellations.We must set these parameters before running the algorithm.As with the proposed heuristic method, we use European aviation as a reference to qualitatively evaluate the delay costs [40], and the cost coefficient for a flight is based on the scoring system (see Figure 3).The output parameters are the arrival or departure time for each flight.In particular, a flight X is cancelled if ∑ , = 0.  , is the cost of cancellation.The first step of the MIP algorithm has five constraints: Constraint (i) means that one flight lands/takes off once.Constraint (ii) means that the departure time of a flight should not be earlier than the planned time and the delay should be less than 4 h.Constraint (iii) is the minimum time interval for arrival and departure.Constraints (iv) and (v) are related to aircraft types, where the number of available narrow (or wide)-body aircraft at time tn is the sum of the number of wide-body aircraft parked at the airport at the initial time and the number of wide-body aircraft that land before this time, minus the number of wide-body aircraft that take off before this time.Additionally, the turnover time for an aircraft is considered to be 60 min.
We now introduce the second step of the MIP algorithm.The parameters are shown in Table 7.The parameters can be classified into two types: input parameters and output parameters.The input parameters include information on the rearranged flight schedule and aircraft properties.These parameters can be obtained from the first step of the algorithm and the flight data.The output parameters are the tail numbers for each flight.The objective function for the second step of the MIP algorithm is: which states that changes should be as small as possible.The second step of the MIP algorithm has two constraints: Constraint (vi) ensures that the aircraft is at the airport and can be used to recover the irregular flight, where the turnover time for an aircraft is taken as 60 min.Constraint (vii) means the aircraft type is larger than or equal to the aircraft type required by the flight.
Different from the heuristic method, the objective of the MIP algorithm is to minimize the delay costs (i.e., first step) and adjustment costs (i.e., second step).Thus, the solution of the MIP algorithm can be used to evaluate the quality of the plans generated by the heuristic method.

Case Study
In this section, we compare the proposed heuristic method with the MIP method for three different scenarios: an aircraft reason with one irregular flight, an aircraft reason with multiple irregular flights, and airport and route reasons.In particular, the first scenario includes two real cases; thus, we also compare the proposed method with the choices of dispatchers.
In the historical data, we find that dispatchers will take recovery actions when a long delay occurs (i.e., score > 0.21), but do not take any action when a flight (without VIP passengers) is delayed for a short time (i.e., score < 0.19).Therefore, in the heuristic method, the threshold value is set as equal to 0.2, i.e., a flight with a long delay or a flight with VIP passengers and a short delay will trigger the recovery process (parameter sensitivity is discussed in Section 4.3.1).Additionally, the time window for searching the flights for adjustment is considered to be 3 h and the maximum number of adjustment steps is four.Finally, in the cost analysis, the standard delay cost is considered to be EUR 334 per min.Thus, the delay cost of a flight with a delay time t and cost coefficient (=flight score + aircraft score) is 334 .
To facilitate this comparison with the proposed heuristic method, the parameters in the MIP method are considered to be = 0, = EUR 334 per min, = flight score + aircraft score, and = EUR 80,160.Thus, we only focus on the delay costs of the departure, and it is more profitable to cancel a flight if the delay is longer than 4 h.Additionally, the minimum time interval for departure is 5 min.The MIP algorithm is programed by LINGO and solved using the branch-and-bound method.The solution can be obtained in 1-2 min in the case of one irregular flight and 4-8 min in the cases of multiple irregular flights and airport closure.

Aircraft Reason for One Irregular Flight
This scenario includes two real cases provided by China South Airlines, Beijing, China.In each case, one ongoing flight arrived late and caused the delay of its subsequent flights.Dispatchers then took actions in the operational system to recover the irregular flight.In each case, there are multiple plans that could have been executed to reduce the delay effect, i.e., a swap or a replacement, which allows us to compare plans from the proposed method with the dispatchers' choices.
Case 1.At 12:57, the dispatcher was informed that CZ6400 (tail number B6398) would arrive 215 min late.As a result, the expected delay of CZ6902 was 175 min and its subsequent flight CZ6909 was expected to be delayed by 160 min (see Table 8).The dispatcher's choice (DC) was to replace B6398 (CZ6902 and CZ6909) with B6578, while the MIP suggests swapping B6398 (CZ6902 and CZ6909) and B6319 (CZ8669 and CZ8670) (see Table 9).The proposed heuristic method (HM) generates three feasible solutions, where solution 1 is consistent with the MIP and solution 3 is consistent with the DC.Additionally, the MIP and HM's solution 1 saves more time, and the DC and HM's solution 3 is easier to operate, in the sense that the adjustment plan involves only two flights.Case 2. At 15:07, the dispatcher was informed that CZ6162 (tail number B6317) would arrive 200 min late.As a result, the expected delay of CZ315 was 85 min and its subsequent flight CZ316 was expected to be delayed by 85 min (see Table 10).The dispatcher's choice (DC) was to swap B6317 (CZ315 and CZ316) and B6319 (CZ6716), while the MIP method suggests replacing B6317 (CZ315 and CZ316) with B6319 and replacing B6319 (CZ6716) with B6137.The proposed heuristic method generates two feasible solutions, where solution 1 is consistent with the DC (see Table 11).Additionally, the MIP solution saves more time, DC and HM's solution 1 involves only two aircrafts, and HM's solution 2 can swap back.
Table 10.Flight schedule for aircraft reason with one irregular flight case 2. The first column is the tail numbers of the aircrafts and the remaining columns show the flight schedule of the irregular flight and flights that can be used to repair the irregular flight.The irregular flight is marked in red and its subsequent flight is marked in blue.Finally, we analyze the sensitivity of the threshold value in the recovery process.In general, the larger the threshold value, the more adjustment plans will be generated.When the threshold value is taken as 0, an adjustment plan will be outputted only if all flights can take off on time.In this case, Case 1 has only one solution, i.e., solution 1 in Table 10, and Case 2 has no solution.When the threshold value is 0.2, Case 1 has three solutions (see Table 10) and Case 2 has two solutions (see Table 11).If we further increase this threshold value, the delayed flight will not be identified as irregular and therefore the recovery algorithm will not be triggered.In sum, one could choose an appropriate threshold value in practice to obtain flexible solutions.

Aircraft Reason for Multiple Irregular Flights
We validate the heuristic method for the case of multiple irregular flights with long delays based on a one-day flight schedule (1 June 2018) with simulated delays.We add a 120 min delay to one-four flights and the comparative results are shown in Table 12.The MIP method performs better than the proposed heuristic method when there is more than one irregular flight.Specifically, the effectiveness of the proposed method decreases with the number of irregular flights, because the MIP method tries to minimize the delay costs, while the proposed method stops adjusting when no flight is identified as irregular (i.e., a short delay will not be repaired).Similar to Section 4.3.2,we also use a one-day flight schedule (1 June 2018) with simulated incidents to validate the feasibility of the proposed method for airport and route reasons.We assume that flights are not allowed to take off before 8:00 and the average time interval for departure is 5 min.The MIP method and the heuristic method generate similar results (see Table 13).Specifically, the delay costs of the MIP result are cheaper than those of the heuristic method, but include one cancellation, because the purpose of the MIP method is to minimize these delay costs, and canceling an irregular flight with an expected delay longer than 4 h is cost effective.

Discussion
In this study, we developed a data-driven heuristic method for the irregular flight recovery problem.Based on operational data from China South Airlines, Beijing, China, we first evaluated the importance of a flight in a flight network, where a flight was considered important if removing it could affect the strong connectivity or average distance of the network.Then, we classified the historical states into three scenarios according to their delay reasons and investigated the recovery patterns for each scenario.The repair rate for aircraft-caused delays was much higher than that for route/airport-caused delays and the recovery logics for these two types of delays were different.Therefore, we designed two different recovery processes for them.Finally, we used the scoring system established in [5] to evaluate the influence of an incident on irregular flights and their subsequent flights.
Inspired by the data analysis results, we proposed a new heuristic method for the irregular flight recovery problem by imitating dispatchers' decision processes.The proposed method consisted of three steps: (1) considering the delay reasons and evaluating the influence of an incident; (2) generating adjustment plans based on two basic operations (swapping the tail numbers of two flights and setting the departure time of a flight); and (3) evaluating the feasibility and costs of the generated plans.As an extension of [5], the local flight network of an irregular flight was considered in the recovery process and we swapped the two aircraft back when they met again.
We then validated the effectiveness and efficiency of the recovery algorithm based on the flight schedule with real and simulated delays.With a real delay, the proposed method conformed to the operation logic of the dispatchers in 93% of cases (i.e., after the operators took action, the scores were reduced or not changed and the delay time was reduced), and in 38% of cases, the optimal plan proposed by the algorithm was identical to that chosen by the dispatchers.With a simulated delay, the heuristic algorithm could generate multiple recovery plans in real time for both long and very long delays, which was much faster and more efficient than the method proposed by [5].
We also compared the proposed method with a mixed-integer programming (MIP) method to evaluate the delay costs and adjustment costs of the generated plans.When there was one irregular flight, the proposed method could generate plans that were same as or similar to those of the MIP method.Furthermore, the proposed method was fast and efficient, and its recovery plans were easy to implement.However, the MIP method performed better when there were multiple irregular flights, because the MIP method regenerated the flight schedule to minimize the cost functions.In contrast, the proposed heuristic method did not seek solutions that minimized the delay time or cost function, and the adjustment process would be stopped if no flight was identified as irregular.Although our heuristic method did not outperform the MIP method in terms of cost function, it had two main advantages in its implementation.First, our method did not rely on the specific form of the objective function or cost parameters.Thus, it had the flexibility to include or exclude different relevant factors.In particular, our method could work even if the cost function or other parameters (e.g., accurate delay time) are unknown, while the MIP method and most of the existing optimization methods cannot work without a specific cost function.Second, our method could generate multiple feasible adjustments, from which dispatchers could choose and assess these plans using external information (such as passenger information or orders from above).Thus, our method would provide dispatchers with more flexibility.
However, the proposed heuristic method still has limitations, which suggests the directions for future research.First, due to a lack of passenger information, we did not consider the passenger delay costs in the cost analysis.In the future, we plan to obtain more data and consider more factors in the cost analysis.Second, when outputting and evaluating the adjustment plans, we primarily focused on the flight scores and delay times.Other factors, such as action rules, airport capacity, crews, passengers, and operating costs, should also be considered.Third, because the proposed method first repairs the flight with the highest score and then repeats the adjustment process until the scores of all the flights are below the threshold, this recovery process may be less efficient for largescale flight delays.
Author Contributions: N.W., H.W., S.P. and B.Z. all contributed to the development of the original idea and model concepts; N.W., H.W. and S.P. preformed data analysis; N.W. and H.W. wrote the algorithm and did simulations; S.P. and B.Z. wrote the first draft of the manuscript and all authors contributed substantially to revisions.All authors have read and agreed to the published version of the manuscript.

Funding:
We also acknowledge financial supports from the Beijing Natural Science Foundation No. Z220001 and the National Science Foundation of China grant No. 71922004, 72131003.The funders had no role in study design, data collection and analysis, publication decisions, or manuscript preparation.

Data Availability Statement:
Restrictions apply to the availability of these data.Data was obtained from China South Airlines, Beijing, China and are available from the authors with the permission of China South Airlines, Beijing, China.

Figure 2 .
Figure 2. Representative 2-day flight network and its subnetworks.(a) A visualization of flight networks for 4-5 March 2018.In this network, the most important node is PEK (i.e., Beijing airport).(b-d) Subnetworks for three aircraft, where the numbers of edges show the path of the aircraft.(b) Aircraft with tail number B6319, (c) aircraft with tail number B6056, and (d) aircraft with tail number B1802.

Figure 3 .
Figure 3. Structure of the AHP scoring system.

Figure 4 .
Figure 4. Structure of the recovery algorithm.

A
flight that is going to land, ∈ A flight that is arriving and is ready to take off, ∈ 0-1 variable, aircraft type for flight X, 0 = narrow, 1 = wide, = , Planned arrival/departure time, = , Number of narrow aircrafts at airport at time , the initial number is Number of wide aircrafts at airport at time , the initial number is Cost of arrival/departure delay per min, = , Cost coefficient based on flight types, = , Cost of cancellation, = , ℎ Time interval for arrival/departure Optimized arrival/departure time for the n-th flight , 0-1 variable, 1 = flight X arrives/takes off at time , = ,The objective function for the first step of the MIP algorithm is:

Table 7 .
Parameters for the second step of the MIP algorithm.Input parameters and output parameters are marked by black and red colors, respectively.Time for arrival, related to Time for departure, related to , 0-1 variable, 1 = the flight is operated by aircraft with tail number , 0-1 variable, 1 = the aircraft is at the airport at time tb 0-1 variable, aircraft type for flight X, 0 = narrow, 1 = wide, = , 0-1 variable, type for aircraft with tail number , 0 = narrow, 1 = wide , 0-1 variable, 1= the flight is operated by aircraft with tail number , 0-1 variable, 1= tail number of the rescheduled flight equal to the previous one

Table 8 .
Flight schedule for aircraft reason with one irregular flight Case 1.The first column shows the tail numbers of the aircrafts and the remaining columns show the flight schedule of the irregular flight and related flights that can be used to repair the irregular flight.The irregular flight is marked in red and its subsequent flight is marked in blue.

Table 1 .
Properties for k-day flight networks.To calculate the "Average number of flights", flights with the same flight number are counted as one.In contrast, the calculation of the "Average degree" is based on the flight network, where an edge corresponds to one flight, because a flight may make stopovers in midway airports, where Average degree >

Table 3 .
Weighted values for the elements of the AHP model.The first row gives the weighted values for the four types of factors, and the rows below show the weighted values for the elements within each factor.Specifically, a very long delay is greater than 4 h, a long delay is between 1 and 4 h, and a short delay is less than 1 h.

Table 4 .
Example delay estimation.The incident causes delays in flight CZ6991 and its subsequent flight CZ6992.

Table 5 .
Example score calculation.CZ6992 is the subsequent flight of CZ6991.

Table 6 .
Parameters for the first step of the MIP algorithm.Input parameters, algorithm parameters, and output parameters are marked by black, blue, and red colors, respectively.

Table 9 .
Adjustment plans for aircraft reason with one irregular flight case 1. DC: dispatcher's choice, MIP: result of the MIP method, and HM: results of the heuristic method.The first column shows the adjustment plans of DC, MIP, and HM and the remaining columns evaluate the scores, delay times, and delay costs of these plans.

Table 11 .
Adjustment plans for aircraft reason with one irregular flight case 2. DC: dispatcher's choice, MIP: result of the MIP method, and HM: results of the heuristic method.The first column shows the adjustment plans of DC, MIP, and HM and the remaining columns evaluate these plans.

Table 12 .
Adjustment plans for aircraft reason with multiple irregular flights.MIP: result of the MIP method, and HM: result of the heuristic method with the lowest score.

Table 13 .
Adjustment plans for airport and route reasons.MIP: result of the MIP method, and HM: results of the heuristic method with the lowest score.* The flight number of the subsequent flight of CZ6931 (IR) is also CZ6931.