1. Introduction
In 2024, China’s civil aviation passenger traffic reached 730 million, surpassing pre-pandemic levels [1]. With the continuous growth of air transport demand, the Airport Collaborative Decision Making (A-CDM) system, a core tool for enhancing airport operational efficiency, has received increasing attention. Precise prediction of taxi-out time is one of the main functions of the A-CDM system and is crucial to the effectiveness of the A-CDM mechanism. Since the pushback time of each flight on the apron is calculated by subtracting the estimated taxi-out time from the Calculated Take-Off Time (CTOT), the accuracy of taxi-out time prediction directly affects the pushback sequence of flights on the apron and also correlates with the final execution rate of the CTOT on the runway. It is therefore of great significance for optimizing ground handling and improving the efficiency of the runway and taxiway system.
Compared with arrival flights, departure flights have a longer taxi time and more varied and complex influencing factors, partly because a departing aircraft cannot taxi straight out of its gate and must first be pushed back by a tractor. Current research on taxi-out time prediction focuses on two topics: feature construction and models. The European Organization for the Safety of Air Navigation (EUROCONTROL) has summarized the factors affecting taxi-out time prediction into 11 categories [2]: airport layout and facilities, runway usage, runway crossings, gates, weather, aircraft types and pilots, aircraft weight, pushback instruction delivery time, de-icing positions, surface traffic flow, and local airport operational procedures. Subsequently, researchers have also examined the influence index between gates [3] and the apron’s static spatial arrangement [4]. However, most research focuses on the macro-factors that affect taxi time and on the overall taxi path, with very few studies examining the specific micro-segments that constitute the taxi path of flights.
The aircraft taxi-time prediction problem is similar to the Estimated Time of Arrival (ETA) problem. Road-level prediction is already a mature approach in ETA research for highway transportation. These studies divide the driving route of a vehicle from the starting point to the destination into a number of road segments. The total travel time is defined as the sum of the travel time of each segment plus the delay at road segment intersections [5]. This segmented prediction approach can capture the effects of micro-segment-related elements on both segment and total trip times [6,7]. Therefore, in the absence of aircraft surface taxiing trajectory data, we construct a taxi-out time prediction model for departures based on an airport road transport network in order to investigate the specific impact of micro-features on taxi-out time. On this basis, we construct the micro-feature of estimated taxi time by predicting individual link travel times.
In addition, the vast majority of Chinese airports handling more than 10 million passengers per year have completed the transfer of apron control operations. Waiting for a control handover frequently occurs when an aircraft taxis from the Apron Control (APC) area to the Air Traffic Control (ATC) area, from the ATC area to the APC area, or between internal sectors of the ATC or APC, but almost no research has paid attention to this problem. We hypothesize that the taxi time at a particular airport may be significantly affected by the continuity of the taxiing process; how to quantify this continuity and define corresponding feature values remains unexplored. We propose that the number of handovers an aircraft goes through during taxiing can act as a proxy for such information. By capturing taxiing continuity at a specific airport, this feature is likely to further increase prediction accuracy. Accordingly, another focus of this paper is to introduce the number of handovers as a dynamic micro-feature.
Based on the above, this paper studies departure taxi-out time prediction from a new perspective and expands the factors that may affect taxi-out time from the macro-level to the micro-level, including the control handover and information on the runway and taxiway road sections. As a result, a new departure taxi-out time prediction approach is proposed: two new micro-features are established in a departure taxi-out time prediction model based on the airport road transport network. The empirical study takes Shanghai Pudong International Airport (PVG), the busiest airport in China in 2024, as the research object.
The paper is structured as follows:
Section 2 reviews relevant research on the prediction of taxi-out time for departure flights and discusses the theoretical foundations of road-level transport network models.
Section 3 provides an introduction to the data sources and proposes a methodology for constructing an airport road traffic network model and micro-features. In addition, we construct other features to support our predictions.
Section 4 describes the model and performance metrics used in this paper.
Section 5 presents the results and corresponding explanations, based on which we provide an analysis of feature importance.
Section 6 draws conclusions.
2. Literature Review
The aircraft surface taxiing procedure is divided into arrival taxiing and departure taxiing based on the distinct characteristics of arrival and departure flights. Since arrival-flight scheduling, sequencing, and waiting all take place in the air during real operation and are unrelated to taxi-in time, taxi-in time is relatively fixed. The departure flight operation process, on the other hand, is more intricate. Furthermore, the taxi-out time prediction’s accuracy will affect the flight pushback sequencing and runway slot utilization, so the research on taxi-out time has received more attention [
8].
In research on taxi-time prediction, feature construction has gained a lot of attention. In addition to basic features such as aircraft type [4], airline type [9], and distance [10], the selection of model features has evolved from the most critical surface traffic flow to more detailed runway configurations [3], runway usage patterns [4], the total angle of turns [11], operating hours [12], weather [13], etc. Inspired by the queuing model, surface traffic flow features were introduced into taxi-out time prediction research early on [14]. Numerous studies have demonstrated the significant impact of surface traffic and the ability of surface traffic characteristics to enhance prediction accuracy. Idris et al. [15] proposed that downstream restrictions are also an important factor affecting the taxi-out time of departing aircraft, since aircraft subject to such restrictions experience considerably longer and less predictable taxi times. Wang et al. [13] demonstrated that a feature set comprising a limited number of features for particular target airports and generalized features affecting all airports (such as distance, total turns, average speed, and number of nearest aircraft) can be used to predict taxi time with a high accuracy. Thus, building on the conventional feature set, our study presents additional features that are relevant to the airport targeted in our empirical study.
The following research gaps on taxi-time prediction features can be identified in the above literature. Firstly, relatively few studies take into account the individual micro-segments that make up the aircraft taxi path; almost all research considers only the overall taxi path at the macro-level. Secondly, apron towers, which are in charge of operational command within the apron area, are found at the majority of large airports in China. A handover of control between sectors is required when an aircraft taxis from the apron area to the ground control area or from the ground control area to the apron area. This step may have an additional impact on taxi time, but existing studies have not yet paid sufficient attention to it.
For the research on micro-segments, the ETA prediction problem can be drawn upon. Current research on ETA based on road traffic maps is focused on the field of highway transportation. The road segment travel time, also known as the Link Travel Time (LTT) [5], is the estimated travel time of the vehicle on a specific road segment. The vehicle’s total travel time is the sum of the travel times for each segment and the delays at each segment intersection. ETA research breaks down the entire travel time prediction problem into a number of smaller problems, such as estimating travel times for individual segments and estimating intersection delay times. Zhan et al. [16] estimated the LTT by estimating the average travel time per hour on urban road segments and by minimizing the error between the expected path travel time and the actual path travel time. Hunter et al. [17] predicted the LTT of a vehicle using a full path inference filter. Using a tensor decomposition approach, Wang et al. [18] optimized the path and predicted the LTT based on three-dimensional features (road segment, time, and driver). Paliwal et al. [19] trained deep learning-based generative models to dynamically update the ETA based on trip information.
In civil aviation research, there are also studies involving the concept of the ETA prediction problem. Liu [20] used the Kalman filter algorithm to process the historical trajectory data of aircraft and predicted the LTT based on the k-nearest neighbors (KNN) algorithm. For the Brasília International Airport, Nogueira et al. [21] built an airport traffic network model and created an aircraft taxi scheduling algorithm based on ant colony optimization.
However, ETA prediction studies based on road traffic maps still have a significant research gap in the field of aircraft taxi-time prediction. In fact, the airport surface network, as a special transportation network, has the following typical features that distinguish it from the urban road network: only one aircraft is permitted to pass on any taxiway or runway segment at any given time due to its single-lane operation mechanism; since all crossroads have no traffic signal equipment, it is theoretically possible to disregard the node delay time; the topological complexity of the taxi path in airports is substantially lower than that of the urban road network; and there is no need to implement dynamic path optimization methods because aircraft taxiing is strictly regulated by airport regulations, and their path selection is highly fixed. These characteristics show that the ETA conceptual structure is not only theoretically possible but also offers special practical benefits in predicting aircraft taxi times by adjusting to the aircraft’s operational characteristics.
In terms of prediction models, machine learning has been gradually introduced to the study of taxi-out time prediction in order to address nonlinear relationships in complex operational scenarios. Balakrishna et al. [10] introduced machine learning for the first time to predict taxi-out time, with a prediction accuracy of 81% within ±5 min. In an extensive evaluation of neural networks, regression trees, and multilayer perceptrons, Herrema et al. [22] found that the decision tree model performed better overall. Wang et al. [13] found that ensemble methods like Random Forest (RF) and Gradient Boosted Regression Trees (GBRT) outperformed classical linear regression with regard to prediction performance. After comparing the efficacy of several machine learning methods, Tang et al. [4] showed that GBRT was the most effective in predicting taxi-in time.
Although the above studies have made significant progress in the field of taxi-time prediction, there are still obvious gaps in how existing methods model microscopic features. In general, these studies tend to focus on the macro-influencing factors of taxi time and consider the overall path of aircraft taxiing. They have neither analyzed the micro-sections constituting the taxiing path nor paid attention to the delay implicitly generated by the coordination of control handovers, so their prediction accuracy in complex scenarios struggles to meet the demands of the A-CDM system. In order to investigate the precise impact of micro-features on taxi-out time, this paper constructs an airport road transport network model from a microscopic point of view by combining the aircraft taxi path and traffic flow features. Two new micro-features are introduced on this basis: the estimated taxi time and the number of handovers. To increase the prediction model’s accuracy and generalizability, GBRT, a popular ensemble algorithm, is used.
3. Data and Micro-Features
This study uses data from PVG, one of the largest international airports in China in terms of throughput. We obtained all PVG data for December 2023 from the A-CDM system. After data cleaning, a total of 20,223 flight records remained. The data mainly include flight number, departure runway, gate, aircraft type, Actual Off-Block Time (AOBT), Actual Take-Off Time (ATOT), etc.
Taxi-out time, the response variable in the model construction, is defined as the time elapsed from off-block to takeoff on the runway, as follows:
$$T_{\text{taxi-out}} = t_{\text{ATOT}} - t_{\text{AOBT}} \quad (1)$$
where $t_{\text{AOBT}}$ is the actual off-block time of the aircraft, and $t_{\text{ATOT}}$ is the actual time of the aircraft’s takeoff.
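As a minimal sketch of this definition, the taxi-out time can be computed directly from the two A-CDM milestone timestamps. The function name and the sample timestamps below are illustrative assumptions and are not taken from the dataset.

```python
from datetime import datetime


def taxi_out_minutes(aobt: datetime, atot: datetime) -> float:
    """Return the taxi-out time in minutes: ATOT minus AOBT."""
    if atot < aobt:
        raise ValueError("ATOT must not be earlier than AOBT")
    return (atot - aobt).total_seconds() / 60.0


# Hypothetical flight: off-block at 08:00:00, airborne at 08:17:30.
aobt = datetime(2023, 12, 1, 8, 0, 0)
atot = datetime(2023, 12, 1, 8, 17, 30)
print(taxi_out_minutes(aobt, atot))
```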
In this paper, a total of 23 features are constructed for taxi-out time prediction, as described below. Our focus is on modeling road-level airport networks from a microscopic view and on introducing two micro-features in addition to the classical ones.
3.1. Airport Road Transport Network Model
The taxi time of an aircraft on a specific path segment is called the Link Travel Time (LTT). Consequently, the LTT prediction problem can be defined as follows: predict the time required for the aircraft to travel from the start node to the target node on a known taxi path, taking into account relevant contributing factors. For a departing flight, its overall taxi time is the cumulative value of all LTTs on the departure path. One of the key microscopic features introduced in this paper, the estimated taxi time, is precisely obtained from the prediction of each LTT based on the airport road traffic network.
3.1.1. Airport Road Network
Airport road transport network construction is the basis of the LTT prediction model. The road network can be represented by an undirected graph consisting of a node set and a link set. In this paper, the node set is the collection of critical locations in the airport surface transport network: taxiway–taxiway intersections, runway–taxiway intersections, and gates, which are the locations an aircraft can pass through during surface taxiing. The link set is the collection of road links connecting the nodes, reflecting the accessibility of the aircraft when taxiing on the surface. This paper strictly follows PVG’s Aeronautical Information Publication (AIP) [23] to determine a total of 2320 taxi paths for arrival and departure aircraft, one for each direction–runway–gate match, and all the links are extracted from these taxi paths to generate a set of non-duplicated links.
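The extraction of a non-duplicated link set from the published taxi paths can be sketched as follows. This is an illustrative sketch only: the node identifiers are hypothetical, and each path is assumed to be available as an ordered list of node IDs.

```python
def build_network(taxi_paths):
    """Build an undirected graph (node set, link set) from taxi paths.

    taxi_paths: list of paths, each an ordered list of node IDs.
    A link is stored as a frozenset of its two end nodes, so that
    A-B and B-A count as the same undirected link.
    """
    nodes = set()
    links = set()
    for path in taxi_paths:
        nodes.update(path)
        for start, end in zip(path, path[1:]):
            links.add(frozenset((start, end)))
    return nodes, links


# Two hypothetical departure paths sharing most of their route.
paths = [
    ["G169", "T1", "T2", "RWY35R"],
    ["G170", "T1", "T2", "RWY35R"],
]
nodes, links = build_network(paths)
print(len(nodes), len(links))
```

Storing links as unordered pairs is one way to keep the link set free of duplicates when the same segment appears in many paths or is traversed in both directions.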
Based on the above and considering the surface configuration of PVG shown in Figure 1, we constructed the airport road network of PVG, which contains a total of 200 nodes and 399 links. Different colors in the map represent different control areas, which are analyzed in Section 3.3; the network is shown in detail in Figure 2.
In order to better depict the operation of the airport surface, we dynamically extend the airport road network both in temporal and spatial terms by introducing dynamic traffic conditions on all links.
- (1) Simulation of taxi trajectory data
Since the A-CDM system only records the milestone times of the aircraft rather than the trajectory data of surface taxiing, we design a method to simulate the aircraft taxiing trajectory data. Every link on an aircraft’s taxi path is assigned a weight equal to its share of the total taxi distance. An LTT is then calculated by multiplying this weight by the flight’s taxi time. This simulated LTT is also the prediction target for the LTT prediction model that we build in the following sections.
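When simulating this taxi trajectory data, it is assumed that a taxi path $p$ of total length $L_p$ followed by a departing aircraft can be divided into $N_p$ links. Each link $i$ of path $p$ ($i = 1, 2, \ldots, N_p$) has the length $l_{p,i}$. Thus, the proportional taxi time weight of link $i$ of path $p$, $w_{p,i}$, can be modeled by Equation (2):
$$w_{p,i} = \frac{l_{p,i}}{L_p} = \frac{l_{p,i}}{\sum_{j=1}^{N_p} l_{p,j}} \quad (2)$$
With the milestone times obtained from the A-CDM system ($t_{\text{AOBT}}$ for the Actual Off-Block Time, and $t_{\text{ATOT}}$ for the Actual Take-off Time), the simulated travel time of link $i$ of path $p$, $t_{p,i}$, can be modeled by Equation (3):
$$t_{p,i} = w_{p,i}\,\left(t_{\text{ATOT}} - t_{\text{AOBT}}\right) \quad (3)$$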
All the LTT data generated above is the simulated aircraft taxi trajectory data, and each link data contains the following information: the link data ID, the corresponding section, the starting time, and the finishing time. The complete trajectory data in the time–space dimension is obtained by merging 343,149 pieces of link data.
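The simulation described above can be sketched as follows, under the stated proportional assumption that taxi time is distributed over links in proportion to link length (which implicitly treats taxi speed as constant along the path). The field names and the example values are hypothetical.

```python
def simulate_link_times(link_lengths_m, aobt_s, atot_s):
    """Split a flight's taxi-out time over its links in proportion to length.

    link_lengths_m: ordered link lengths along the taxi path (metres).
    aobt_s, atot_s: off-block and take-off times in seconds.
    Returns one record per link with simulated start/end times and LTT.
    """
    total_length = sum(link_lengths_m)
    taxi_time = atot_s - aobt_s
    records = []
    start = aobt_s
    for index, length in enumerate(link_lengths_m):
        ltt = taxi_time * length / total_length  # weight times taxi time
        records.append(
            {"link_index": index, "start": start, "end": start + ltt, "ltt": ltt}
        )
        start += ltt
    return records


# Hypothetical path with three links and a 1000 s taxi-out time.
records = simulate_link_times([100, 300, 600], aobt_s=0, atot_s=1000)
for rec in records:
    print(rec)
```

By construction, the simulated link times sum to the observed taxi-out time, so the last record ends at the take-off time.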
- (2) Calculation of traffic flow on links
The link traffic flow includes the current traffic flow, the preceding traffic flow, and the succeeding traffic flow. In order to calculate the traffic flow accurately, we divide each day covered by the flight data into equal time slices of 5 min, giving a total of 8928 time slices (31 days × 288 slices per day). Subsequently, an 8928 × 399 matrix F, called the traffic flow matrix, is constructed, with each row representing a time slice and each column representing a link.
In matrix F, $U_{m,n}$ denotes the entry for road section $m$ and time slice $n$. While traversing the link data, if the road section of a record is $m$ and its start and end times fall in time slices $a$ and $b$, respectively, then $U_{m,n}$ is incremented by 1 for every integer $n$ in $[a, b]$. This step ensures that matrix F accurately reflects the flow on road section $m$ during time slices $a$ to $b$. The process is shown in Figure 3.
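The construction of the traffic flow matrix can be sketched as below. This is a simplified illustration: it uses plain Python lists, assumes times are given in seconds from the start of the data period, and uses hypothetical record fields (`link`, `start`, `end`). Rows are time slices and columns are links.

```python
SLICE_SECONDS = 5 * 60  # 5-minute time slices


def build_flow_matrix(link_records, n_slices, n_links):
    """Count, per time slice and link, the aircraft occupying that link."""
    flow = [[0] * n_links for _ in range(n_slices)]
    for rec in link_records:
        first = int(rec["start"] // SLICE_SECONDS)  # time slice a
        last = int(rec["end"] // SLICE_SECONDS)     # time slice b
        for n in range(first, min(last, n_slices - 1) + 1):
            flow[n][rec["link"]] += 1
    return flow


# Three hypothetical link records on a two-link network.
link_records = [
    {"link": 0, "start": 0, "end": 400},    # spans slices 0 and 1
    {"link": 1, "start": 400, "end": 700},  # spans slices 1 and 2
    {"link": 0, "start": 250, "end": 290},  # stays within slice 0
]
flow = build_flow_matrix(link_records, n_slices=3, n_links=2)
print(flow)
```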
Assuming that the road segment of a link data record is $m$ and its start time falls in time slice $n$, the three traffic flow features of the record are computed as follows:
- Current traffic flow: $U_{m,n} - 1$.
- Preceding traffic flow: $\sum_{p \in P} U_{p,n}$, where $P$ is the set of road sections other than $m$ that are linked to the start of $m$.
- Succeeding traffic flow: $\sum_{s \in S} U_{s,n}$, where $S$ is the set of road sections other than $m$ that are linked to the end of $m$.
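A possible way to compute these three features from a flow matrix is sketched below. It assumes the matrix is indexed as `flow[time_slice][link]` and reads the current traffic flow as the count on the link minus the aircraft itself; both the adjacency dictionaries and the numbers are hypothetical.

```python
def flow_features(flow, m, n, preceding, succeeding):
    """Return (current, preceding, succeeding) traffic flow for link m at slice n.

    flow: matrix indexed as flow[time_slice][link].
    preceding[m]: links (other than m) connected to the start of m.
    succeeding[m]: links (other than m) connected to the end of m.
    """
    current = flow[n][m] - 1  # exclude the aircraft itself (assumed reading)
    before = sum(flow[n][p] for p in preceding[m])
    after = sum(flow[n][s] for s in succeeding[m])
    return current, before, after


# One time slice, three links: link 1 feeds link 0, which feeds link 2.
flow_matrix = [[2, 1, 3]]
preceding = {0: [1]}
succeeding = {0: [2]}
print(flow_features(flow_matrix, m=0, n=0, preceding=preceding, succeeding=succeeding))
```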
3.1.2. Features for LTT Prediction
As a result, we constructed the complete feature set for the prediction of LTTs as shown in
Table 1.
3.2. Estimated Taxi Time
Based on the feature set above, we use GBRT (further information is provided in Section 4) to predict the LTT link by link and sum the predictions to obtain the estimated taxi time of the aircraft. The results of the prediction performance evaluation are shown in Table 2. Within an error range of ±5 min, the accuracy of the model is 86.85%, i.e., for 86.85% of the departure flights the absolute error between the predicted and actual taxi-out time is no more than 5 min; the accuracies for ±3 min and ±1 min are 79.09% and 39.50%, respectively.
The LTT of the aircraft and the cumulative LTT are visualized in this paper to better illustrate the prediction performance. Taking a flight with a 16-link taxi path as an example, Figure 4 shows the predicted and actual LTT values for each link. The cumulative LTT at a particular link is the total time the aircraft takes from the start of the taxi path to the end of that link.
As shown in Figure 4a, the prediction error of the LTT for each link is comparatively small, and the predictions for the majority of links are close to the actual values. This indicates that the model is able to depict the taxi time characteristics of individual links accurately. In Figure 4b, the result for link_16 is the estimated taxi time based on the LTT prediction. As the number of links increases, the predicted and actual values of the cumulative LTT follow a consistent trend, and the error between the two does not grow significantly. In summary, the model achieves a high accuracy in both the LTT prediction of individual links and the cumulative LTT prediction, which makes the estimated taxi time a suitable feature for the subsequent taxi-out time prediction of departure flights.
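The relationship between per-link predictions, the cumulative LTT, and the estimated taxi time can be illustrated with a short sketch. The predicted values below are made up for illustration and do not come from the model in this paper.

```python
from itertools import accumulate


def cumulative_ltt(predicted_ltts):
    """Running sum of predicted link travel times along the taxi path."""
    return list(accumulate(predicted_ltts))


# Hypothetical predicted LTTs (seconds) for a three-link taxi path.
predicted = [30.0, 45.0, 60.0]
cumulative = cumulative_ltt(predicted)
estimated_taxi_time = cumulative[-1]  # value at the last link of the path
print(cumulative, estimated_taxi_time)
```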
3.3. Handover of Controls
The taxi handover of aircraft on the surface includes two scenarios, i.e., the handover from the ATC ground to the APC, or from the APC to the ATC ground. At present, PVG’s apron control area is divided into four aprons: west apron one, east apron two, north apron three, and south apron four. The division of the control boundary determines the handover points. The junction of two taxi lanes is the physical point of control handover between different units. For the case of northward running at PVG, there are 46 handover points in total, of which 30 are from the ATC to the APC, indicated by blue bars, and 16 are from the APC to the ATC, indicated by orange bars, as shown in
Figure 5. The high number of handover points at PVG is primarily attributable to its operational characteristics as a large-scale hub airport. The airport’s extensive physical area, coupled with the division of ATC responsibilities and the presence of multiple control boundaries between the tower, ground, and release positions, naturally leads to an increased need for coordination. Additionally, during peak hours, the implementation of dynamic traffic management strategies necessitates the establishment of additional coordination points to ensure smooth operations. These factors collectively result in a dense distribution of handover points across the airport’s operational environment. All control handover point locations in this study strictly follow PVG’s Aeronautical Information Publication (AIP) [
23].
Depending on the aircraft’s operating status, PVG’s control handover is now divided into two types: “Dynamic handover” refers to the control handover that occurs while the aircraft is moving, which promotes the efficiency of ground handling and maintains continuous taxiing. “Static handover” refers to the control handover in case of a complete stop of the aircraft, which can ensure the safety of aircraft ground handling, with a lower probability of deviation from the control commands during the handover process and sufficient time for the controller to formulate the control plan.
Static handover has a much greater impact on the taxiing process than dynamic handover. During a static handover, the pilot must wait in front of the handover point during the frequency change and is not allowed to enter the other party’s control area until the most recent control instructions have been received from the receiving controller. This interrupts the continuity of the aircraft taxiing process and therefore affects the taxi time. Specifically, a high frequency of handovers within a short distance could cause the aircraft to apply and release its brakes repeatedly, which might have a significant effect on the efficiency of airport handling.
The number of handovers is calculated as follows: each link of the taxi path that contains a handover point counts as one handover, and summing over all links of the path gives the total number of handovers for the corresponding aircraft. Table 3 reports the average taxi distance and average taxi time at PVG for different numbers of handovers; a higher number of handovers usually corresponds to a longer taxi distance and a longer taxi time.
In this paper, we introduce a new feature, the number of handovers, to characterize the number of control area changes that an aircraft experiences during taxi for departure, and we explore the effect of this feature on taxi-out time prediction. The number of handovers includes handovers between controller positions within the apron tower or ATC tower in addition to handovers between different control authorities. For instance, if an aircraft taxis from gate 169 to runway 35R, shown in Figure 4, there will be three handovers in total, in the order EAST APN02-SOUTH APN04, SOUTH APN04-ATC GND WEST, and ATC GND WEST-ATC TWR WEST 1. In this example, with static handovers the aircraft would need to apply and release its brakes at least three times, whereas with dynamic handovers there would be no need to stop and wait.
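The handover count for the example above can be reproduced with a small sketch. As a simplifying assumption, each link of the taxi path is labelled with the control area responsible for it, and a handover is counted whenever the label changes between consecutive links; the per-link labelling itself is hypothetical.

```python
def count_handovers(areas_along_path):
    """Count changes of responsible control area along an ordered taxi path."""
    return sum(
        1
        for previous, current in zip(areas_along_path, areas_along_path[1:])
        if previous != current
    )


# Hypothetical per-link control areas for the gate 169 to runway 35R example.
areas = [
    "EAST APN02",
    "EAST APN02",
    "SOUTH APN04",
    "ATC GND WEST",
    "ATC GND WEST",
    "ATC TWR WEST 1",
]
print(count_handovers(areas))
```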
3.4. Features
In order to ensure the accuracy of predictions, the relevant features that may affect taxi time should be considered comprehensively [13]. In this study, the final set of features we construct for taxi-out time prediction is shown in Table 4. The features are divided into three categories: surface operation features, flight and aircraft features, and weather features.
The weather information is collected from the METAR Weather Service, and all the movement data of PVG from 1 December 2023 to 31 December 2023 are collected in this paper.
Surface traffic flow is generally recognized as one of the most important features affecting the prediction of taxi time, so we follow the approach proposed by Tang et al. [4] for the calculation of surface traffic flow features, which comprehensively considers arrival and departure flights and avoids the influence of a fixed threshold, as shown in Table 5.
4. Methodology
In this section, we introduce the GBRT model for taxi-out time prediction and use three critical metrics to evaluate the model.
4.1. GBRT
Gradient Boosting Regression Trees (GBRT) is widely employed due to its efficiency, accuracy, and explainability [24]. In this paper, the taxi-out time prediction model adopts the GBRT algorithm, an ensemble learning technique that iteratively combines weak learners to enhance prediction accuracy and robustness. The selection of GBRT is motivated by three key advantages. First, its inherent capability to capture nonlinear feature interactions makes it particularly suitable for modeling the complex dynamics of airport surface operations. Second, comparative studies have reported that GBRT maintains a more consistent performance than random forests and neural networks on small-to-medium-sized datasets, which matches the scale of the flight operation data analyzed in this study. Third, the algorithm’s iterative optimization of base learners naturally supports feature importance assessment, which directly facilitates our subsequent feature analysis.
The core mechanism of GBRT operates through an additive optimization process: each subsequently generated decision tree systematically corrects the prediction residuals from preceding iterations, progressively refining the model’s accuracy. This staged approach not only improves predictive performance but also enhances model interpretability through explicit feature contribution tracking.
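The residual-correcting mechanism can be illustrated with a deliberately small sketch. It is not the model used in this paper: it uses one-split regression stumps on a single feature, squared-error loss, and hypothetical data, purely to show how each new tree is fitted to the residuals left by the previous ones.

```python
def fit_stump(x, residuals):
    """Fit the best single-split regression stump (squared error) on 1-D inputs."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # keep both sides non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def fit_gbrt(x, y, n_trees=50, learning_rate=0.1):
    """Additive model: mean baseline plus shrunken stumps fitted to residuals."""
    base = sum(y) / len(y)
    trees = []
    predictions = [base] * len(y)
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        predictions = [
            pi + learning_rate * tree(xi) for xi, pi in zip(x, predictions)
        ]
    return lambda xi: base + learning_rate * sum(tree(xi) for tree in trees)


# Hypothetical data: taxi distance (km) versus taxi-out time (min).
x = [1, 2, 3, 4, 5, 6]
y = [10, 11, 12, 20, 21, 22]
model = fit_gbrt(x, y)
base_mean = sum(y) / len(y)


def mse(predict):
    return sum((predict(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)


print(mse(lambda _: base_mean), mse(model))
```

The learning rate shrinks each tree’s contribution, so the training error falls gradually over the boosting iterations rather than in one step.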
The original dataset was randomly partitioned into training and test sets in an 8:2 ratio after comprehensive data cleaning, ensuring unbiased performance evaluation. To optimize model generalization, we implemented a systematic hyperparameter tuning procedure focusing on four critical parameters: learning rate, maximum tree depth, number of leaves, and minimum samples per leaf. Through a 5-fold cross-validated grid search, we identified the parameter combination that achieved optimal predictive performance on the validation sets, as shown in
Table 6. This optimized parameter set was then evaluated on the held-out test set for final model assessment.
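The data partitioning and search procedure can be outlined as follows. This sketch only generates the index splits and the parameter combinations; model fitting is omitted, and the grid values shown are hypothetical placeholders rather than the values searched in this paper.

```python
import random
from itertools import product


def train_test_split_indices(n_samples, test_ratio=0.2, seed=0):
    """Randomly split sample indices into training and test sets (8:2 by default)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_ratio))
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def parameter_grid(param_grid):
    """Yield every combination of the candidate parameter values."""
    keys = list(param_grid)
    for values in product(*(param_grid[key] for key in keys)):
        yield dict(zip(keys, values))


# Hypothetical grid; the actual candidate values are not reproduced here.
grid = {
    "learning_rate": [0.05, 0.1],
    "max_depth": [3, 5],
    "min_samples_leaf": [10, 20],
}
train_idx, test_idx = train_test_split_indices(100)
splits = list(k_fold(train_idx, k=5))
combinations = list(parameter_grid(grid))
print(len(train_idx), len(test_idx), len(splits), len(combinations))
```

Each parameter combination would be scored by its average validation error over the five folds, and only the selected combination is then evaluated once on the held-out test indices.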
4.2. Performance Metrics
This paper selects prediction accuracy, RMSE, and MAE as performance metrics. The Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) measure the gap between predicted and true values. The prediction accuracy is defined as the ratio of the number of samples for which the difference between the model output and the actual taxi-out time falls within a set range to the total number of prediction samples. The prediction accuracy, which is commonly employed for evaluating taxi-time prediction [14,22], is computed here for ranges of ±1, ±3, and ±5 min.
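The three metrics can be written out as a short sketch. The function names and the sample values are illustrative; the tolerance is treated as inclusive, i.e., an error of exactly 5 min counts as within ±5 min.

```python
import math


def accuracy_within(y_true, y_pred, tolerance_min):
    """Share of samples whose absolute error is at most tolerance_min."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(p - t) <= tolerance_min)
    return hits / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical actual and predicted taxi-out times in minutes.
actual = [10, 15, 20, 25]
predicted = [11, 19, 20, 31]
for tolerance in (1, 3, 5):
    print(tolerance, accuracy_within(actual, predicted, tolerance))
print(mae(actual, predicted), rmse(actual, predicted))
```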
5. Results
In this section, we work on prediction, employing the GBRT model and the established feature set, then compare the performance of the model with the introduction of new features. Finally, we attempt to interpret the results of the model.
5.1. Model Convergence Analysis and Parameter Robustness Discussion
In order to verify the effectiveness of the model, we plotted the training and testing loss against the number of boosting iterations, as shown in Figure 6. In the initial iterations, the model loss decreases rapidly, indicating that the base learners effectively capture the main features of the data. As the number of iterations increases, the test loss and the training loss decrease together and eventually converge, which suggests that the model maintains good performance on unseen data and shows no clear sign of overfitting. In general, the model training process converges well, the test performance is stable, and the model is able to learn the data features effectively and generalize to new data.
The experiment verifies the robustness of the GBRT model through systematic parameter sensitivity tests. As shown in
Table 7, when evaluating the three critical parameters of learning rate, maximum depth of tree, and minimum number of samples of leaf nodes, the model MSE fluctuates in a small range, which indicates that the model is highly robust to the adjustment of parameters and effectively resists the parameter perturbation within the range of the recommended parameters with good adaptability.
5.2. Results of GBRT Model
In order to evaluate quantitatively the effect of micro-features on taxi-out time prediction, we divide the features into the following four groups:
Group A: Includes all the features in
Table 4 except for the estimated taxi time and number of handovers.
Group B: Add the feature of number of handovers to group A.
Group C: Add the feature of estimated taxi time to group A.
Group D: Includes all the features in
Table 4.
Table 8 presents a comparison of the performance of each group. In general, the introduction of micro-features significantly improves the prediction accuracy: the prediction accuracy increases from 88.23% to 94.36% for ±3 min and reaches 99.42% for ±5 min. This is a significant improvement over the taxi-out time prediction accuracy of 93.01% for ±5 min in Tang et al.’s [3] study. With the optimized feature set, the overall prediction error of the model decreases as well. This result shows that the airport road transport network model not only improves the accuracy of the short-term prediction but also yields a greater increase in the overall taxi-time prediction accuracy of the model, which confirms the explanatory ability of micro-features on a multilevel time scale.
It is worth noting that although the accuracy for ±1 min remains relatively limited, it still improves. This suggests that although the airport road transport network model cannot completely eliminate the inherent uncertainty of the taxi-out process, it effectively captures the microscopic variations in the taxiing process and enables the model to predict more accurately within narrow error margins.
Comparing groups B and C with group A, respectively, it can be seen that the estimated taxi time has a greater improvement for the prediction than the number of handovers. A possible explanation for this is simply that this feature reflects the micro-information of the link more comprehensively, including capacity, location characteristics, etc., and thus, has a greater impact on the prediction results than the number of handovers itself.
Meanwhile, group D has the best performance on all indicators, which suggests that the co-incorporation of two micro-features is better for prediction improvement than the single feature introduction, confirming the validity of the features we established.
5.3. Feature Importance
The decision tree algorithm’s feature importance scoring mechanism makes it possible to assess each feature’s degree of contribution to prediction. The ranking of the importance degree of each feature of the taxi-out time prediction model is shown in
Figure 7.
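A ranking of this kind can be derived by normalizing the raw importance scores so that they sum to 1, sorting them, and comparing each share with the average share 1/n. The sketch below illustrates only that post-processing step. The feature names and scores are arbitrary placeholders and do not correspond to the values in Figure 7.

```python
def rank_importance(raw_scores):
    """Normalize scores to shares, sort descending, flag above-average features.

    Returns a list of (rank, name, share, above_average) tuples.
    """
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    shares = {name: score / total for name, score in raw_scores.items()}
    mean_share = 1.0 / len(shares)  # average importance across all features
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, name, share, share > mean_share)
        for rank, (name, share) in enumerate(ordered, start=1)
    ]


# Arbitrary placeholder scores (NOT the values behind Figure 7).
raw = {"feat_c": 18, "feat_a": 40, "feat_e": 5, "feat_b": 25, "feat_d": 12}
ranking = rank_importance(raw)
for rank, name, share, above in ranking:
    print(rank, name, f"{share:.2f}", "above average" if above else "")
```

Comparing each share against 1/n gives a simple, model-agnostic threshold for deciding which features contribute more than an equal split would imply.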
The results show that the estimated taxi time produced by the LTT prediction model ranks very high, which indicates that the airport road transport network model contributes greatly to taxi-out time prediction performance and suggests that introducing micro-level link information is an effective way to improve the model. This feature compensates for a weakness of traditional taxi-out time prediction models, which consider only the taxiing process as a whole. Based on the airport road transport network model, the taxiing process of an aircraft is treated as a sequence of micro taxiing segments, which allows taxi distance, handover, and traffic flow information to be considered in finer detail.
The number of handovers, another micro-feature we introduced, also has relatively high importance in the prediction model, exceeding the importance of the majority of the traditional features such as aircraft type and airline type, which confirms the validity of this feature. The impact of handovers on the taxi time is mainly reflected in the waiting time caused by procedures such as the handover of control instructions and communication coordination, which might affect the consistency of the departure taxiing process.
Of the features we construct for surface traffic flow, D3 and D1 exhibit a significant degree of contribution in the prediction. When a conflict occurs on the airport surface, these two types of flights usually have a higher departure priority than the target departure flight because they occupy ground resources before the target flights, which means they will have a greater impact on the target departure aircraft’s taxiing.
However, the features related to the number of arrival flights do not show high importance. We believe that this may be due to the unique configuration of PVG. There are two sets of narrow parallel runways at PVG. In each set, departure aircraft use the inner runway close to the terminal building for takeoff, while arrival aircraft use the outer runway for landing, and departure flights have priority over arrival flights that have landed. This mode of operation largely minimizes conflicts between arrival and departure flights on the surface.
As traditional features, distance and the runway used also show a high degree of contribution. Clearly, distance has a direct correlation with taxi time, since longer distances usually require more time. This feature is the basic physical element of taxi time. The impact of runway selection for departures on aircraft taxi time is primarily reflected in the different path lengths and complexities required to taxi to different runways as well as the different number of ATC responsibility areas they must pass through. All of these will have significant effects on the taxi-out time.
Interestingly, the departure restriction feature shows strong importance. This result aligns with the conclusions drawn by Idris et al. [15] regarding downstream impacts. Restricted flights are flights that, because of objective factors, cannot be released at the scheduled time and must instead follow the CTOT issued by the ATC. On the one hand, when departure flights are restricted due to airspace flow control, runway capacity limitations, destination airport limitations (e.g., bad weather or congestion), or military activities, they frequently need to be prioritized in order to prevent CTOT slot waste. In this case, such flights are pushed back ahead of schedule, which results in additional queuing and waiting during the taxiing process and directly increases the departure taxi time. On the other hand, under A-CDM, restricted flights may need to be adjusted in collaboration with the ATC, resulting in a large deviation from the scheduled time, which also affects the release of other flights.
Meanwhile, the importance of the weather feature is lower, mainly because of the data we collected. The sample covers flights at PVG between 00:00 on 1 December 2023 and 23:55 on 31 December 2023, a period during which the weather at PVG was relatively stable, with no extreme weather such as typhoons, thunderstorms, or snowfall. This greatly diminishes the observable impact of weather on taxi times. In addition, PVG’s complete infrastructure and standardized operating procedures further reduce the potential impact of mild weather disturbances on taxi time.
6. Conclusions
Aircraft departure is a complex process involving multi-resource coordination, strict time constraints, and dynamic operational conditions. To address this challenge, this study developed a high-resolution, road-level transport network model for PVG, which is the world’s busiest cross-runway operation hub. We proposed an enhanced taxi-out time prediction method that integrates two novel micro-scale features: the estimated taxi time, which is derived from link travel time prediction, and the number of handovers.
The experimental results show that incorporating these micro-features significantly improves prediction accuracy, and that using both features together yields the best results: the prediction accuracy improves by 5.12 percentage points under a 1 min error threshold and by 6.13 percentage points under a 3 min error threshold, while the accuracy reaches 99.42% under a 5 min error threshold. These results demonstrate that the road-level transport network model captures the microscopic variations in the taxiing process and enables the model to predict more accurately within short error windows. This provides insights for enhancing the flight arrival and departure time prediction model of the A-CDM system.
Through feature importance analysis, we identified the crucial factors affecting taxi-out time prediction and the mechanisms through which they contribute. Notably, both micro-features exceed the average importance, with the estimated taxi time ranking second, which again confirms the validity of the model we established. The features related to the number of arrival flights did not exhibit high importance, which reflects the unique configuration of PVG and indicates that feature construction needs to consider suitability for the target airport.
This work bridges the gap between micro-scale traffic network characteristics and macroscopic taxi-time prediction, offering a transferable framework for cross-modal traffic modeling in aviation. Future research can further expand the application scenarios of the model by considering its suitability for other airports and exploring strategies such as flow control or dynamic scheduling to improve the accuracy and applicability of the model. In addition, future research could incorporate long-term weather data to better characterize the effects of weather factors such as pavement conditions, seasonal variations, and the impact of extreme weather conditions on taxiing performance.