Machine Learning Methods Benchmarking for Predicting Flight Delays: An Efficiency Meta-Analysis
Abstract
1. Introduction
2. Methodology
2.1. Systematic Literature Review Analysis
2.2. Research Strategy
2.3. Applying Meta-Analysis to Evaluate Machine Learning Methods
2.4. Ranking of Machine Learning Methods with Data Envelopment Analysis (DEA)
2.5. Censored Tobit Regression Analysis for Identifying Efficiency Determinants
3. Results and Analysis
3.1. Systematic Analysis
Research Strategy: Data Selection and Analysis
| DMU (Study) | Forecast Logic | Methods Code | Analysis Area | Evaluation Metric | Sample Size | Impact Factor | Citations |
|---|---|---|---|---|---|---|---|
| Bisandu & Moulitsas (2024) [54] | R 1 | C4, A5, C2, G1, D1, D2, D3, D4 | N 4 | 4.23% | 3442 | 7.5 | 1 |
| Cai et al. (2023) [53] | R 1 | B1, E1, E2, E3, E4, E5, E6, E7, E8 | N 4 | 30.96% | 1,800,000 | 5.3 | 14 |
| Chen, Wang & Zhou (2021) [51] | R 1 | A1, F1, C1 | Ap 5 | 74.00% | 338,251 | 6.3 | 13 |
| Falque, Mazure & Tabia (2024) [59] | R 1 | A3 | Ap 5 | 71.47% | 10,633,920 | 2.7 | 1 |
| Khan et al. (2021) [11] | R 1 | E9, B2, C6, C6 | Al 3 | 67.18% | 19,109 | 7.6 | 44 |
| Khan et al. (2024) [12] | R 1 | G10 | Al 3 | 58.83% | 21,987 | 3.9 | 5 |
| Khanmohammadi, Tutun & Kucuk (2016) [44] | R 1 | D6 | Ap 5 | 13.66% | 1099 | 0 | 69 |
| Li et al. (2024) [56] | R 1 | B1 | N 4 | 97.00% | 5,426,150 | 2.8 | 0 |
| Schultz, Reitmann & Alam (2021) [58] | R 1 | D9, D11, D14, D15 | Ap 5 | 63.35% | 45,900 | 7.6 | 32 |
| Shao et al. (2022) [60] | R 1 | D6, C7, A3, D16 | Ap 5 | 40.79% | 27,500 | 5.5 | 30 |
| Shen, Chen & Yan (2024) [55] | R 1 | D9, D17, D18, D19, D20 | N 4 | 81.80% | 12,339,775 | 7.2 | 0 |
| Sun et al. (2024) [61] | R 1 | D21 | N 4 | 74% | 1,048,575 | 7.5 | 1 |
| Wang, Liang & Delahaye (2018) [46] | R 1 | G12, G13 | Ap 5 | 91.50% | 8574 | 7.6 | 81 |
| Yang et al. (2023) [49] | R 1 | B1, F1, A4, E1, E2, E3, E1 + E2 | Ap 5 | 50% | 124,453 | 7.6 | 3 |
| Yu et al. (2019) [36] | R 1 | C4, G2, C1 | Ap 5 | 89% | 528,471 | 8.3 | 189 |
| Henriques & Feiteira (2018) [45] | B 2 | F1, B1, D6 | Ap 5 | 84.32% | 248,956 | 0 | 15 |
| Khan et al. (2021) [11] | B 2 | D7, D8, C1, E9, A1, B1, A2, A4 | Al 3 | 63.73% | 19,105 | 7.6 | 44 |
| Khan et al. (2024) [12] | B 2 | G8, G9, B1, A4, D5, G3, C1, G10 | Al 3 | 64.94% | 21,987 | 3.9 | 5 |
| Lambelho et al. (2020) [62] | B 2 | A3, D6, B1 | Ap 5 | 77.63% | 2,300,000 | 3.9 | 57 |
| Li & Jing (2022) [10] | B 2 | B3 | N 4 | 90.50% | 762,415 | 7.5 | 29 |
| Li et al. (2022) [10] | B 2 | D14 | N 4 | 92.39% | 5,426,150 | 7.5 | 26 |
| Mamdouh, Ezzat & Hefny (2024) [63] | B 2 | D10, D11, D9, D12, D13 | Ap 5 | 76.10% | 1,000,000 | 7.5 | 7 |
| Mokhtarimousavi & Mehrabi (2023) [52] | B 2 | C1, C5 | Ap 5 | 90.40% | 21,298 | 4.3 | 11 |
| Pineda-Jaramillo et al. (2024) [50] | B 2 | G2, F1, B1, A5, C3, G4, G5, G6, G7 | Ap 5 | 63.38% | 67,730 | 4.1 | 0 |
| Schultz, Reitmann & Alam (2021) [58] | B 2 | D9, D11, D14, D15 | N 4 | 89.56% | 45,900 | 7.6 | 32 |
| Tenorio, Marques & Cadarso (2021) [48] | B 2 | F1, B1, A5, D6, E1, D9 | Ap 5 | 93.11% | 7,000,000 | 0 | 1 |
| Truong (2021) [47] | B 2 | G11 | N 4 | 91.97% | 1058 | 3.9 | 41 |
| Yu, Mo & Li (2019) [66] | B 2 | F1, G4, C1 | N 4 | 76.67% | 5,635,978 | 0 | 9 |
| Birolini & Jacquillat (2023) [65] | B 2 | A1 | N 4 | 77.30% | 15,244 | 6 | 6 |
3.2. Meta-Analysis
3.3. Data Envelopment Analysis (DEA)
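As a rough illustration of how a DEA efficiency score ranks decision-making units (DMUs), the sketch below covers only the simplest case: one input, one output, and constant returns to scale, where the CCR efficiency of a DMU reduces to its output/input ratio divided by the best ratio in the sample. The function name `ccr_efficiency_single`, the choice of sample size as input and accuracy as output, and the three DMUs are hypothetical and are not taken from the study. A multi-input, multi-output model would require solving a linear program per DMU instead.

```python
def ccr_efficiency_single(dmus):
    """Single-input, single-output CCR efficiency under constant returns to scale.

    dmus maps a DMU name to (input, output), both strictly positive.
    Returns a dict of efficiency scores in (0, 1]; the best ratio scores 1.
    """
    # Productivity ratio of each DMU: output produced per unit of input.
    ratios = {name: out / inp for name, (inp, out) in dmus.items()}
    # The efficient frontier is defined by the highest observed ratio.
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical DMUs: (sample size used as input, accuracy used as output).
dmus = {
    "model_a": (1000.0, 0.80),
    "model_b": (5000.0, 0.90),
    "model_c": (2000.0, 0.80),
}
scores = ccr_efficiency_single(dmus)

# Rank DMUs from most to least efficient.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

In this toy setting the model with the highest accuracy is not the most efficient one, because it consumes far more input to reach that accuracy.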
3.4. Censored Tobit Regression Model
4. Discussion: Implications for Efficiency and Variable Selection
4.1. The Efficacy and Efficiency of Binary Classification
4.2. The Value of Parsimony over Complexity
4.3. Balancing Predictive Power with Operational Cost
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Wang, C.; Wang, X. Airport congestion delays and airline networks. Transp. Res. Part E 2019, 122, 328–349.
2. Liu, Y.; Yin, M.; Hansen, M. Economic costs of air cargo flight delays related to late package deliveries. Transp. Res. Part E 2019, 125, 388–401.
3. de Almeida, E.E.; Oliveira, A.V.M. An econometric analysis for the determinants of flight speed in the air transport of passengers. Sci. Rep. 2023, 13, 4573.
4. Wang, Z.; Liao, C.; Hang, X.; Li, L.; Delahaye, D.; Hansen, M. Distribution prediction of strategic flight delays via machine learning methods. Sustainability 2022, 14, 15180.
5. Chu, A.-M. A Bayesian Network Approach to Identify Influential Factors for Flight Delays; Massachusetts Institute of Technology: Cambridge, MA, USA, 2012.
6. Jebbor, I.; Hachimi, H.; Benmamoun, Z. Artificial Intelligence in Predicting Automotive Supply Chain Disruptions: A Literature Review. In Advances in Intelligent Systems and Digital Applications; Springer: Cham, Switzerland, 2025; pp. 1–13.
7. Jebbor, I.; Benmamoun, Z.; Hachimi, H. Forecasting supply chain disruptions in the textile industry using machine learning: A case study. Ain Shams Eng. J. 2024, 15, 103116.
8. Carvalho, L.; Sternberg, A.; Gonçalves, L.M.; Cruz, B.; Soares, J.A.; Brandão, D.; Carvalho, D.; Ogasawara, E. On the relevance of data science for flight delay research: A systematic review. Transp. Rev. 2020, 41, 852–872.
9. Rebollo, J.J.; Balakrishnan, H. Characterization and prediction of air traffic delays. Transp. Res. Part C 2014, 44, 231–241.
10. Li, Q.; Jing, R. Flight delay prediction from spatial and temporal perspective. Expert Syst. Appl. 2022, 201, 117662.
11. Khan, W.A.; Ma, H.-L.; Chung, S.-H.; Wen, X. Hierarchical integrated machine learning model for predicting flight departure delays and duration in series. Transp. Res. Part C 2021, 129, 103225.
12. Khan, W.A.; Chung, S.-H.; Eltoukhy, A.E.E.; Khurshid, F. A novel parallel series data-driven model for IATA-coded flight delays prediction and features analysis. J. Air Transp. Manag. 2024, 114, 102488.
13. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
14. Okraszewska, R.; Romanowska, A.; Laetsch, D.C.; Gobis, A.; Reisch, L.A.; Kamphuis, C.B.; Lakerveld, J.; Krajewski, P.; Banik, A.; Braver, N.R.D.; et al. Interventions reducing car usage: Systematic review and meta-analysis. Transp. Res. D Transp. Environ. 2024, 131, 104217.
15. O'Donnell, C.J.; Rao, D.S.P.; Battese, G.E. Metafrontier frameworks for the study of firm-level efficiencies and technology ratios. Empir. Econ. 2008, 34, 231–255.
16. Rudolph, C.W.; Katz, I.M.; Lavigne, K.N.; Zacher, H. Job crafting: A meta-analysis of relationships with individual differences, job characteristics, and work outcomes. J. Vocat. Behav. 2017, 102, 112–138.
17. Reljic, T.; Sehovic, M.; Lancet, J.; Kim, J.; Ali, N.A.; Djulbegovic, B.; Extermann, M. Benchmarking treatment effects for patients over 70 with acute myeloid leukemia: A systematic review and meta-analysis. J. Geriatr. Oncol. 2020, 11, 1293–1308.
18. Reiss, R.; Gaylor, D. Use of benchmark dose and meta-analysis to determine the most sensitive endpoint for risk assessment for dimethoate. Regul. Toxicol. Pharmacol. 2005, 42, 55–65.
19. Khoja, L.; Atenafu, E.G.; Suciu, S.; Leyvraz, S.; Sato, T.; Marshall, E.; Keilholz, U.; Zimmer, L.; Patel, S.P.; Piperno-Neumann, S.; et al. Meta-analysis in metastatic uveal melanoma to determine progression free and overall survival benchmarks: An international rare cancers initiative (IRCI) ocular melanoma study. Ann. Oncol. 2019, 30, 1370–1380.
20. Papalia, G.F.; Brigato, P.; Sisca, L.; Maltese, G.; Faiella, E.; Santucci, D.; Pantano, F.; Vincenzi, B.; Tonini, G.; Papalia, R.; et al. Artificial Intelligence in Detection, Management, and Prognosis of Bone Metastasis: A Systematic Review. Cancers 2024, 16, 2700.
21. Jaiteh, M.; Phalane, E.; Shiferaw, Y.A.; Voet, K.A.; Phaswana-Mafuya, R.N. Utilization of Machine Learning Algorithms for the Strengthening of HIV Testing: A Systematic Review. Algorithms 2024, 17, 362.
22. Akinsoji, A.H.; Adelodun, B.; Adeyi, Q.; Salau, R.A.; Odey, G.; Choi, K.S. Integrating Machine Learning Models with Comprehensive Data Strategies and Optimization Techniques to Enhance Flood Prediction Accuracy: A Review. Water Resour. Manag. 2024, 38, 4735–4761.
23. Arévalo, P.; Ochoa-Correa, D.; Villa-Ávila, E. A Systematic Review on the Integration of Artificial Intelligence into Energy Management Systems for Electric Vehicles: Recent Advances and Future Perspectives. World Electr. Veh. J. 2024, 15, 364.
24. Utami, W.; Sugiyanto, C.; Rahardjo, N. Artificial intelligence in land use prediction modeling: A review. IAES Int. J. Artif. Intell. 2024, 13, 2514–2523.
25. Yaghoubi, E.; Yaghoubi, E.; Khamees, A.; Razmi, D.; Lu, T. A systematic review and meta-analysis of machine learning, deep learning, and ensemble learning approaches in predicting EV charging behavior. Eng. Appl. Artif. Intell. 2024, 135, 108789.
26. Borenstein, M.; Hedges, L.V.; Higgins, J.P.T.; Rothstein, H.R. A basic introduction to fixed-effect and random-effects models for meta-analysis. Res. Synth. Methods 2010, 1, 97–111.
27. Hedges, L.V.; Olkin, I. Statistical Methods for Meta-Analysis; Academic Press, Inc.: Orlando, FL, USA, 1985.
28. Arjomandi, A.; Dakpo, K.; Seufert, J. Have Asian airlines caught up with European airlines? A by-production efficiency analysis. Transp. Res. Part A 2018, 111, 389–403.
29. Shah, W.U.H.; Hao, G.; Yan, H.; Shen, J.; Yasmeen, R. Forestry Resource Efficiency, Total Factor Productivity Change, and Regional Technological Heterogeneity in China. Forests 2024, 15, 152.
30. Zubir, M.Z.; Noor, A.A.; Rizal, A.M.M.; Harith, A.A.; Abas, M.I.; Zakaria, Z.; Bakar, A.F.A. Approach in inputs & outputs selection of Data Envelopment Analysis (DEA) efficiency measurement in hospitals: A systematic review. PLoS ONE 2024, 19, e0293694.
31. Yen, B.; Li, J.-S. Route-based performance evaluation for airlines: A metafrontier data envelopment analysis approach. Transp. Res. Part E 2022, 161, 102706.
32. Barbosa, F.C.; Fuchigami, H.Y. Análise Envoltória de Dados: Teoria e Aplicações Práticas; ULBRA: Itumbiara, Brazil, 2018.
33. Meza, L.A.; Gomes, E.G.; Neto, L.B. Curso de Análise de Envoltória de Dados. XXXVII Simpósio Brasileiro de Pesquisa Operacional. 2005; pp. 2520–2547. Available online: http://www.din.uem.br/~ademir/sbpo/sbpo2005/pdf/arq0289.pdf (accessed on 14 July 2021).
34. Greene, W.H. Econometric Analysis, 5th ed.; Prentice Hall: Hoboken, NJ, USA, 2003; pp. 145–198.
35. Wooldridge, J.M. Introdução à Econometria; Editora Thomson; Cengage: San Francisco, CA, USA, 2006; pp. 295–328.
36. Yu, B.; Guo, Z.; Asian, S.; Wang, H.; Chen, G. Flight delay prediction for commercial air transport: A deep learning approach. Transp. Res. Part E 2019, 125, 203–221.
37. Miyawaki, K.; MacEachern, S.N. Economic variable selection. Can. J. Stat. 2023, 51, 19–37.
38. Halpern, N.; Mwesiumo, D.; Budd, T.; Suau-Sanchez, P.; Bråthen, S. Segmentation of passenger preferences for using digital technologies at airports in Norway. J. Air Transp. Manag. 2021, 91, 102005.
39. Hatıpoğlu, I.; Tosun, Ö. Predictive Modeling of Flight Delays at an Airport Using Machine Learning Methods. Appl. Sci. 2024, 14, 5472.
40. Callaham, M.; Wears, R.L.; Weber, E. Factors associated with postpublication citation. JAMA 2002, 287, 2847–2850.
41. Sternberg, A.; Soares, J.; Carvalho, D.; Ogasawara, E. A Review on Flight Delay Prediction. arXiv 2022, arXiv:1703.06118.
42. Bishop, C.M. Pattern Recognition and Machine Learning; Springer Science + Business Media, LLC: Berlin/Heidelberg, Germany, 2006; pp. 1–729.
43. Zhou, Z.-H. Machine Learning; Springer: Singapore, 2021; pp. 1–536.
44. Khanmohammadi, S.; Tutun, S.; Kucuk, Y. A New Multilevel Input Layer Artificial Neural Network for Predicting Flight Delays at JFK Airport. Procedia Comput. Sci. 2016, 95, 237–244.
45. Henriques, R.; Feiteira, I. Predictive Modelling: Flight Delays and Associated Factors, Hartsfield–Jackson Atlanta International Airport. Procedia Comput. Sci. 2018, 138, 638–645.
46. Wang, Z.; Liang, M.; Delahaye, D. A hybrid machine learning model for short-term estimated time of arrival prediction in terminal manoeuvring area. Transp. Res. Part C 2018, 95, 280–294.
47. Truong, D. Using causal machine learning for predicting the risk of flight delays in air transportation. J. Air Transp. Manag. 2021, 91, 101993.
48. Tenorio, V.M.; Marques, A.G.; Cadarso, L. Signal processing and machine learning for air traffic delay prediction. Transp. Res. Procedia 2021, 58, 463–470.
49. Yang, Z.; Chen, Y.; Hu, J.; Song, Y.; Mao, Y. Departure delay prediction and analysis based on node sequence data of ground support services for transit flights. Transp. Res. Part C 2023, 153, 104217.
50. Pineda-Jaramillo, J.; Munoz, C.; Mesa-Arango, R.; Gonzalez-Calderon, C.; Lange, A. Integrating multiple data sources for improved flight delay prediction using explainable machine learning. Res. Transp. Bus. Manag. 2024, 56, 101161.
51. Chen, Z.; Wang, Y.; Zhou, L. Predicting weather-induced delays of high-speed rail and aviation in China. Transp. Policy 2021, 101, 14–22.
52. Mokhtarimousavi, S.; Mehrabi, A. Flight delay causality: Machine learning technique in conjunction with random parameter statistical analysis. Int. J. Transp. Sci. Technol. 2023, 12, 230–244.
53. Cai, K.; Li, Y.; Zhu, Y.; Fang, Q.; Yang, Y.; Du, W. A geographical and operational deep graph convolutional approach for flight delay prediction. Chin. J. Aeronaut. 2023, 36, 17–31.
54. Bisandu, D.B.; Moulitsas, I. Prediction of flight delay using deep operator network with gradient-mayfly optimisation algorithm. Expert Syst. Appl. 2024, 247, 123306.
55. Shen, X.; Chen, J.; Yan, R. A spatial–temporal model for network-wide flight delay prediction based on federated learning. Appl. Soft Comput. 2024, 154, 111380.
56. Li, C.; Mao, J.; Li, L.; Wu, J.; Zhang, L.; Zhu, J.; Pan, Z. Flight delay propagation modeling: Data, Methods, and Future opportunities. Transp. Res. E Logist. Transp. Rev. 2024, 185, 103525.
57. Qu, J.; Wu, S.; Zhang, J. Flight Delay Propagation Prediction Based on Deep Learning. Mathematics 2023, 11, 494.
58. Schultz, M.; Reitmann, S.; Alam, S. Predictive classification and understanding of weather impact on airport performance through machine learning. Transp. Res. C Emerg. Technol. 2021, 131, 103119.
59. Falque, T.; Mazure, B.; Tabia, K. Machine learning for predicting off-block delays: A case study at Paris Charles de Gaulle International Airport. Data Knowl. Eng. 2024, 152, 102303.
60. Shao, W.; Prabowo, A.; Zhao, S.; Koniusz, P.; Salim, F.D. Predicting flight delay with spatio-temporal trajectory convolutional network and airport situational awareness map. Neurocomputing 2022, 472, 280–293.
61. Sun, M.; Tian, Y.; Wang, X.; Huang, X.; Li, Q.; Li, Z.; Li, J. Transport causality knowledge-guided GCN for propagated delay prediction in airport delay propagation networks. Expert Syst. Appl. 2024, 240, 122426.
62. Lambelho, M.; Mitici, M.; Pickup, S.; Marsden, A. Assessing strategic flight schedules at an airport using machine learning-based flight delay and cancellation predictions. J. Air Transp. Manag. 2020, 82, 101737.
63. Mamdouh, M.; Ezzat, M.; Hefny, H. Improving flight delays prediction by developing attention-based bidirectional LSTM network. Expert Syst. Appl. 2024, 238, 121747.
64. Dong, X.; Zhu, X.; Hu, M.; Bao, J. A Methodology for Predicting Ground Delay Program Incidence through Machine Learning. Sustainability 2023, 15, 6883.
65. Birolini, S.; Jacquillat, A. Day-ahead aircraft routing with data-driven primary delay predictions. Eur. J. Oper. Res. 2023, 310, 379–396.
66. Yu, Y.; Mo, H.; Li, H. A Classification Prediction Analysis of Flight Cancellation Based on Spark. Procedia Comput. Sci. 2019, 162, 480–486.
67. Chang, B.R.; Tsai, H.-F.; Mo, H.-Y. Ensemble Meta-Learning-Based Robust Chipping Prediction for Wafer Dicing. Electronics 2024, 13, 1802.
68. Zheng, Z.; Zou, B.; Wei, W.; Tian, W. A Data-Light and Trajectory-Based Machine Learning Approach for the Online Prediction of Flight Time of Arrival. Aerospace 2023, 10, 675.










| Variable | Variable Group | Description | Possible Values / Details | Variable Type |
|---|---|---|---|---|
| Author(s) | Publication information | Identifying studies | - | Nominal |
| Journal | Publication information | Journal of publication origin | - | Nominal |
| Publication Year | Publication information | Year of study publication in journal | 2015–2025 | Ordinal |
| Journal Impact Factor | Publication information | Impact factor of the journal of origin of the publication (value obtained when the study was collected) | - | Continuous numerical |
| Number of Citations in the Study | Publication information | Number of citations of the study up to October 2024 | - | Discrete numerical |
| Method Employed | Information on the prediction model(s) used | Machine learning method employed in the study | - | Nominal |
| Total Sample | Information on the prediction model(s) used | Size of the full sample used in the flight delay prediction model(s) | Values ranging from tens to millions of sample units | Discrete numerical |
| Test Sample | Information on the prediction model(s) used | Portion of the total sample reserved for testing the prediction model(s) | Values ranging from tens to millions of sample units | Discrete numerical |
| Accuracy of the Model(s) | Information on the prediction model(s) used | Accuracy of the machine learning model(s) analyzed in the study | - | Continuous numerical |
| Correct Forecast Units | Information on the prediction model(s) used | Units of the test sample correctly predicted by the model(s) applied | - | Discrete numerical |
| Predictive Logic | Information on the prediction model(s) used | Forecasting logic adopted by the model(s) analyzed | Regression or Binary | Binary |
| Study Area | Information on the prediction model(s) used | Size of the study area | Airline or route; Airport | Nominal |
| Independent Variables: Aircraft Rotation Data | Information on the groups of independent variables used in the prediction models | Binary use of variables with aircraft rotation information as independent variables in the model(s) analyzed | Planned and actual rotations, used to track the propagation of delays and reconstruct primary delays. | Binary |
| Independent Variables: Flight Data | Information on the groups of independent variables used in the prediction models | Binary use of variables with flight data information as independent variables in the model(s) analyzed | Characteristics of each flight, such as flight duration, temporal patterns of airline operations (year, season, month, day of the week and time of day), and spatial characteristics of departure and arrival airports. | Binary |
| Independent Variables: Aircraft Data | Information on the groups of independent variables used in the prediction models | Binary use of variables with aircraft data information as independent variables in the model(s) analyzed | Information about the airline's fleet, such as aircraft model, seat configuration and capacity, airport base and aircraft age. | Binary |
| Independent Variables: Passenger Data | Information on the groups of independent variables used in the prediction models | Binary use of variables with passenger data information as independent variables in the model(s) analyzed | Information such as the number of passengers boarding, passenger boarding/disembarking times, or information on passenger security and dispatch. | Binary |
| Independent Variables: Meteorological Data | Information on the groups of independent variables used in the prediction models | Binary use of variables with meteorological data information as independent variables in the model(s) analyzed | Weather conditions that impact operational procedures, such as temperature, visibility, wind direction and speed, gusts, altitude and humidity. | Binary |
| Independent Variables: Traffic Data | Information on the groups of independent variables used in the prediction models | Binary use of variables with air traffic data information as independent variables in the model(s) analyzed | Information on flight traffic and airport capacity, modeled using a dynamic and stochastic queuing model to estimate traffic-related delays. | Binary |
| Independent Variables: Airport Capacity Data | Information on the groups of independent variables used in the prediction models | Binary use of variables with airport capacity data information as independent variables in the model(s) analyzed | Information such as the peak-hour capacity of the passenger terminal (TPS), or the runway capacity (PPD), at the airports where the flights originate and end. | Binary |
| Independent Variables: Data on Previous Delays | Information on the groups of independent variables used in the prediction models | Binary use of variables with previous-delay information as independent variables in the model(s) analyzed | Information on flight delays or stages prior to the flight analyzed. | Binary |

| Model Components | Adopted Variable |
|---|---|
| Number of participants | Total sample |
| Number of events observed | Units of correct predictions |
| Number of events expected | Test sample |
| Study identifier | Authors + Year + Method |
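
The component mapping above treats each study as a proportion: correctly predicted units (events observed) out of the test sample (events expected). As a minimal sketch of how such proportions can be pooled, the example below uses fixed-effect inverse-variance weighting on the logit scale. This is one common choice and is an assumption here, not necessarily the estimator used in the study. The function name `pooled_accuracy` and the two example studies are hypothetical.

```python
import math


def pooled_accuracy(studies):
    """Fixed-effect inverse-variance pooling of logit-transformed proportions.

    studies is a list of (correct_predictions, test_sample) pairs with
    0 < correct_predictions < test_sample.
    Returns (pooled proportion, lower 95% bound, upper 95% bound).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for correct, tested in studies:
        p = correct / tested
        logit = math.log(p / (1.0 - p))
        # Approximate variance of the logit of a proportion.
        variance = 1.0 / correct + 1.0 / (tested - correct)
        weight = 1.0 / variance
        weighted_sum += weight * logit
        weight_total += weight

    pooled_logit = weighted_sum / weight_total
    std_error = math.sqrt(1.0 / weight_total)

    def expit(x):
        # Inverse of the logit transform, mapping back to a proportion.
        return 1.0 / (1.0 + math.exp(-x))

    return (
        expit(pooled_logit),
        expit(pooled_logit - 1.96 * std_error),
        expit(pooled_logit + 1.96 * std_error),
    )


# Hypothetical studies: (units of correct predictions, test sample size).
example = [(80, 100), (900, 1000)]
pooled, lower, upper = pooled_accuracy(example)
print(f"pooled accuracy = {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Larger test samples receive more weight, so the pooled value sits closer to the accuracy of the larger study. A random-effects model would additionally account for between-study heterogeneity.
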
Tobit Models: Factors Influencing Efficiency

| Variable | Model 1 (Coefficient) | Model 1 (p-Value) | Model 2 (Coefficient) | Model 2 (p-Value) |
|---|---|---|---|---|
| Constant | 0.4151 | 3.32 × 10⁻¹⁶ *** | 0.1883 | 0.0162 ** |
| Dummy (Regression) | −0.0404 | 0.0322 ** | - | - |
| Prediction Accuracy | 0.4068 | 1.89 × 10⁻¹⁷ *** | 0.4547 | 8.72 × 10⁻³⁰ *** |
| Total Sample | −7.477 × 10⁻⁸ | 2.02 × 10⁻²⁸ *** | −7.144 × 10⁻⁸ | 1.18 × 10⁻³² *** |
| Correctly Predicted Instances | 2.603 × 10⁻⁷ | 6.27 × 10⁻¹⁴ *** | 2.18701 × 10⁻⁷ | 3.18 × 10⁻¹² *** |
| Dummy (Airline) | −0.0468 | 0.0154 ** | - | - |
| Dummy Var. (Aircraft) | 0.0711 | 0.0004 *** | - | - |
| Dummy Var. (PAX, Passengers) | −0.1812 | 1.89 × 10⁻⁸ *** | - | - |
| Dummy Var. (Meteorology/Weather) | 0.1035 | 2.44 × 10⁻⁶ *** | 0.1450 | 2.49 × 10⁻¹⁵ *** |
| Dummy Var. (Airport Capacity) | 0.1722 | 4.49 × 10⁻¹⁶ *** | 0.1631 | 1.41 × 10⁻²¹ *** |
| Dummy Var. (Flight Data) | - | - | 0.2362 | 0.0009 *** |
| Dummy Var. (Air Routes) | - | - | −0.1618 | 1.39 × 10⁻¹⁹ *** |
| Dummy Var. (Air Traffic) | - | - | 0.0774 | 8.16 × 10⁻⁵ *** |
| Dummy Var. (Previous Delays) | - | - | 0.0405 | 0.0224 ** |
| Impact Factor | −0.0097 | 0.0148 ** | −0.0354 | 3.86 × 10⁻²¹ *** |
| Citations | - | - | 0.0007 | 0.0008 *** |
| n (number of observations) | 1–109 | | 1–109 | |
| Chi-square (10) | 565.488 | | 780.543 | |
| Log Likelihood | 127.572 | | 142.655 | |

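
The coefficients above come from a censored Tobit regression of the efficiency score on study characteristics. As a minimal sketch of what such a model maximizes, the function below evaluates a Tobit log-likelihood in which the dependent variable is right-censored at 1, as is typical for DEA efficiency scores. The censoring limit, the function name `tobit_loglik`, and the toy data are assumptions for illustration and are not taken from the study.

```python
import math


def normal_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(beta, sigma, X, y, upper=1.0):
    """Log-likelihood of a Tobit model right-censored at `upper`.

    beta:  list of coefficients (include the constant as a column of ones in X)
    sigma: standard deviation of the latent error term, sigma > 0
    X:     list of covariate rows
    y:     list of observed scores, with y_i == upper for censored rows
    """
    total = 0.0
    for row, observed in zip(X, y):
        mean = sum(b * x for b, x in zip(beta, row))
        if observed >= upper:
            # Censored row: probability that the latent score is at least `upper`.
            total += math.log(normal_cdf((mean - upper) / sigma))
        else:
            # Uncensored row: normal density of the observed score.
            total += normal_logpdf((observed - mean) / sigma) - math.log(sigma)
    return total


# Toy data (hypothetical): a constant plus one accuracy-like covariate.
X = [[1.0, 0.60], [1.0, 0.75], [1.0, 0.90], [1.0, 0.95]]
y = [0.55, 0.70, 0.85, 1.00]  # the last unit sits on the efficient frontier
print(tobit_loglik([0.0, 0.95], 0.05, X, y))
```

In practice the coefficients and `sigma` are obtained by maximizing this function numerically. Note that the log-likelihood can be positive when `sigma` is small, because the normal density then exceeds 1.
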
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Queiróz Júnior, H.d.S.; Falcão, V.; da Silva, F.G.F.; Bezerra, I.M.T.; Machado, J.K.L. Machine Learning Methods Benchmarking for Predicting Flight Delays: An Efficiency Meta-Analysis. Sustainability 2025, 17, 9887. https://doi.org/10.3390/su17219887

