Dual-Pipeline Machine Learning Framework for Automated Interpretation of Pilot Communications at Non-Towered Airports
Abstract
1. Introduction
2. Related Work and Challenges
2.1. Aircraft Operation Estimation Approaches
2.2. Machine Learning for Aviation Audio Analysis
3. Materials and Methods
3.1. Data Collection and Processing
3.1.1. Textual Feature Extraction with Spectral Subtraction
3.1.2. Spectral Feature Extraction with Mel-Spectrograms
3.1.3. Audio Data Augmentation
3.1.4. Classification Models and Training Procedure
Algorithm 1. Audio Classification via Textual and Spectral Feature Pipelines
Require: Audio dataset labeled {Landing, Takeoff}
Ensure: [content missing in source]
1: for each audio sample do
2:     Preprocess audio: [details missing in source]
3:     Extract features: [details missing in source]
4: end for
5: [step missing in source]
6: Evaluate using accuracy, precision, recall, and F1-score
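
The final step of Algorithm 1 names accuracy, precision, recall, and F1-score as evaluation metrics. The sketch below shows how these four quantities could be computed for a two-class Landing/Takeoff problem. It is a minimal illustration, not the authors' evaluation code: the function name `binary_metrics`, the choice of "Landing" as the positive class, and the example labels are all assumptions made for this example.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="Landing"):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted positive: true positive or false positive.
            counts["tp" if truth == positive else "fp"] += 1
        else:
            # Predicted negative: false negative or true negative.
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only; not data from the study.
y_true = ["Landing", "Landing", "Takeoff", "Takeoff", "Landing", "Takeoff"]
y_pred = ["Landing", "Takeoff", "Takeoff", "Takeoff", "Landing", "Landing"]
scores = binary_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With these made-up labels there are two true positives, two true negatives, one false positive, and one false negative, so all four metrics equal 2/3.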
4. Results
4.1. Model Performance Across Pipelines
4.2. Feature Representation Comparison
4.3. Robustness Through Data Augmentation
4.4. Evaluation of Metric Correlation Analysis
5. Discussion
5.1. Limitations
- The empirical evaluation is based on a single non-towered airport (KMLE), which constrains the direct evidence for cross-site generalizability. Official operational statistics for many non-towered fields are themselves estimates and are not always suitable as ground truth, complicating external validation.
- The textual pipeline depends on ASR transcription quality; ASR errors can degrade downstream classification performance, particularly for overlapping transmissions or non-standard phraseology.
- While augmentation proved beneficial in this dataset, it cannot fully substitute for diverse real-world recordings from multiple airports and varying environmental conditions. We intentionally limited the scope of new data collection for this study; however, these limitations motivate the avenues discussed below.
- Although the present study evaluates the proposed framework using data collected from a single non-towered airport (KMLE), several features of the task and model design support its potential for cross-scenario generalization. First, pilot communication procedures at non-towered airports are governed by nationally standardized FAA phraseology and reporting conventions, meaning that many linguistic cues relevant to intent classification (e.g., position reports, runway identifiers, maneuver intentions) remain consistent across airports. Second, VHF radio characteristics—such as channel noise, compression artifacts, and intermittent interference—are broadly similar nationwide, enabling spectral models to learn noise-invariant patterns that are not site-specific. Third, the dual-pipeline architecture explicitly captures complementary semantic and acoustic information, allowing the system to remain robust even when ASR performance varies across accents or background noise conditions.
5.2. Generalization, Scalability, and Future Data Augmentation Directions
5.3. Practical Deployment Considerations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| ATC | Air Traffic Control |
| FAA | Federal Aviation Administration |
| CTAF | Common Traffic Advisory Frequency |
| UNICOM | Universal Communications Frequency |
| ASR | Automatic Speech Recognition |
| TF-IDF | Term Frequency–Inverse Document Frequency |
| FFT | Fast Fourier Transform |
| CNN | Convolutional Neural Network |
| LSTM | Long Short-Term Memory |
| AUROC | Area Under the Receiver Operating Characteristic Curve |
| AUPR | Area Under the Precision–Recall Curve |
| MCC | Matthews Correlation Coefficient |
| ADS-B | Automatic Dependent Surveillance–Broadcast |
| ML | Machine Learning |
| PSD | Power Spectral Density |




| Text | Label |
|---|---|
| Millard Traffic, [Callsign], Final Runway 12, Millard Traffic | Landing |
| Millard Traffic, [Callsign], departing Runway 12 and staying in the pattern, Millard Traffic | Takeoff |
| Millard Traffic, [Callsign], touch-and-go, Runway 12, Millard Traffic | Landing and Takeoff |
| Millard Traffic, [Callsign], Final Runway 30, Millard Traffic | Landing |
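
To illustrate how TF-IDF weighting treats transmissions like those in the table above, the following sketch builds TF-IDF vectors for the four example texts in plain Python. It is an illustration under stated assumptions rather than the pipeline used in the study: the tokenizer, the smoothed IDF form `ln((1 + N) / (1 + df)) + 1`, and the helper names `tokenize` and `tfidf` are choices made for this example.

```python
import math
import re
from collections import Counter

# Example transmissions copied from the Text/Label table.
transmissions = [
    "Millard Traffic, [Callsign], Final Runway 12, Millard Traffic",
    "Millard Traffic, [Callsign], departing Runway 12 and staying in the "
    "pattern, Millard Traffic",
    "Millard Traffic, [Callsign], touch-and-go, Runway 12, Millard Traffic",
    "Millard Traffic, [Callsign], Final Runway 30, Millard Traffic",
]


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens; hyphenated terms stay whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document (assumed smoothed-IDF variant)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


vectors = tfidf(transmissions)
# Compare an intent cue word with a word shared by every transmission.
print(f"final:        {vectors[0]['final']:.3f}")
print(f"runway:       {vectors[0]['runway']:.3f}")
print(f"touch-and-go: {vectors[2]['touch-and-go']:.3f}")
```

In this small example, words that appear in every transmission ("runway", "callsign") receive the minimum IDF, while cue words that appear in only some transmissions ("final", "departing", "touch-and-go") are weighted more heavily for the same term count.
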
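| Model | Textual (TF-IDF) Acc. | Textual (TF-IDF) Prec. | Textual (TF-IDF) Rec. | Textual (TF-IDF) F1 | Textual (TF-IDF) AUROC | Textual (TF-IDF) AUPR | Spectral (Mel-Spectrogram) Acc. | Spectral (Mel-Spectrogram) Prec. | Spectral (Mel-Spectrogram) Rec. | Spectral (Mel-Spectrogram) F1 | Spectral (Mel-Spectrogram) AUROC | Spectral (Mel-Spectrogram) AUPR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|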
| Logistic Regression | 0.82 | 0.81 | 0.80 | 0.80 | 0.85 | 0.84 | 0.85 | 0.84 | 0.83 | 0.83 | 0.88 | 0.87 |
| Support Vector Machine | 0.83 | 0.82 | 0.82 | 0.82 | 0.86 | 0.85 | 0.87 | 0.86 | 0.85 | 0.86 | 0.90 | 0.89 |
| K-Nearest Neighbors | 0.78 | 0.77 | 0.76 | 0.76 | 0.82 | 0.81 | 0.80 | 0.79 | 0.78 | 0.78 | 0.83 | 0.82 |
| Random Forest | 0.84 | 0.83 | 0.83 | 0.83 | 0.87 | 0.86 | 0.89 | 0.88 | 0.87 | 0.88 | 0.91 | 0.90 |
| Gradient Boosting | 0.85 | 0.84 | 0.84 | 0.84 | 0.88 | 0.87 | 0.90 | 0.89 | 0.89 | 0.89 | 0.93 | 0.92 |
| Ensemble Voting | 0.86 | 0.85 | 0.85 | 0.85 | 0.89 | 0.88 | 0.88 | 0.89 | 0.88 | 0.88 | 0.90 | 0.91 |
| LSTM (Deep Learning) | 0.84 | 0.83 | 0.85 | 0.84 | 0.88 | 0.86 | - | - | - | - | - | - |
| CNN (Deep Learning) | - | - | - | - | - | - | 0.93 | 0.91 | 0.92 | 0.91 | 0.95 | 0.94 |
| Model | Textual | Spectral | Fusion |
|---|---|---|---|
| Logistic Regression | 0.82 | 0.85 | 0.83 |
| SVM | 0.83 | 0.87 | 0.88 |
| KNN | 0.78 | 0.80 | 0.81 |
| Random Forest | 0.84 | 0.89 | 0.88 |
| Gradient Boosting | 0.85 | 0.90 | 0.87 |
| Ensemble Voting | 0.86 | 0.88 | 0.88 |
| LSTM (Textual) | 0.84 | — | — |
| CNN (Spectral) | — | 0.93 | — |
| Model | TF-IDF (Textual) | BERT (Textual) | Mel (Spectral) | Log Mel (Spectral) |
|---|---|---|---|---|
| LR | 0.82 | 0.78 | 0.85 | 0.84 |
| SVM | 0.83 | 0.80 | 0.87 | 0.84 |
| KNN | 0.78 | 0.77 | 0.80 | 0.81 |
| RF | 0.84 | 0.86 | 0.89 | 0.88 |
| GB | 0.85 | 0.82 | 0.90 | 0.92 |
| EV | 0.86 | 0.79 | 0.88 | 0.90 |
| LSTM (Textual) | 0.84 | 0.85 | - | - |
| CNN (Spectral) | - | - | 0.93 | 0.89 |
| Model | Textual Before Aug. | Textual After Aug. | Spectral Before Aug. | Spectral After Aug. |
|---|---|---|---|---|
| LR | 0.82 | 0.84 | 0.85 | 0.87 |
| SVM | 0.83 | 0.86 | 0.87 | 0.90 |
| KNN | 0.78 | 0.81 | 0.80 | 0.83 |
| RF | 0.84 | 0.82 | 0.89 | 0.85 |
| GB | 0.85 | 0.87 | 0.90 | 0.92 |
| EV | 0.86 | 0.89 | 0.93 | 0.93 |
| LSTM (Textual) | 0.84 | 0.86 | - | - |
| CNN (Spectral) | - | - | 0.88 | 0.91 |
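
The before/after augmentation scores in the table above can be summarized as per-model changes. The short script below does this arithmetic directly on the tabulated values; it is a reading aid rather than part of the study's analysis, and the dictionary layout and the function name `augmentation_deltas` are assumptions made for this example.

```python
# Scores copied from the before/after augmentation table.
# Each tuple is (before, after); None marks cells shown as "-".
scores = {
    "LR":   {"textual": (0.82, 0.84), "spectral": (0.85, 0.87)},
    "SVM":  {"textual": (0.83, 0.86), "spectral": (0.87, 0.90)},
    "KNN":  {"textual": (0.78, 0.81), "spectral": (0.80, 0.83)},
    "RF":   {"textual": (0.84, 0.82), "spectral": (0.89, 0.85)},
    "GB":   {"textual": (0.85, 0.87), "spectral": (0.90, 0.92)},
    "EV":   {"textual": (0.86, 0.89), "spectral": (0.93, 0.93)},
    "LSTM": {"textual": (0.84, 0.86), "spectral": None},
    "CNN":  {"textual": None, "spectral": (0.88, 0.91)},
}


def augmentation_deltas(table):
    """Return {(model, pipeline): after - before}, rounded to two decimals."""
    deltas = {}
    for model, pipelines in table.items():
        for pipeline, pair in pipelines.items():
            if pair is None:
                continue  # Model was not evaluated on this pipeline.
            before, after = pair
            deltas[(model, pipeline)] = round(after - before, 2)
    return deltas


deltas = augmentation_deltas(scores)
declined = sorted(key for key, change in deltas.items() if change < 0)

for (model, pipeline), change in deltas.items():
    print(f"{model:5s} {pipeline:8s} {change:+.2f}")
print("Declined after augmentation:", declined)
```

On the tabulated values, Random Forest is the only model whose score drops after augmentation (in both pipelines), the Ensemble Voting spectral score is unchanged, and every other entry improves by 0.02 to 0.03.
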
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Tanvir, A.A.; Huang, C.; Alahmad, M.; Yang, C.; Zhong, X. Dual-Pipeline Machine Learning Framework for Automated Interpretation of Pilot Communications at Non-Towered Airports. Aerospace 2026, 13, 32. https://doi.org/10.3390/aerospace13010032

