Air Traffic Controller Fatigue Detection by Applying a Dual-Stream Convolutional Neural Network to the Fusion of Radiotelephony and Facial Data
Abstract
1. Introduction
2. AF Dual-Stream CNN: A Dual-Stream CNN for Audio and Facial Images
2.1. Audio Feature Extraction
- (a) Zero-crossing rate
- (b) Chromagram
- (c) Mel-frequency cepstral coefficients
- (d) Root mean square
- (e) Mel spectrogram
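
To make the two simplest of these features concrete, the following is a minimal sketch of the zero-crossing rate and root mean square of one audio frame, written in plain Python. The function names are illustrative, not the authors' code. The dictionary of per-feature sizes is an assumption: it uses common library defaults (12 chroma bins, 20 MFCCs, 128 mel bands, each averaged over time), which would add up to the 162-element audio feature vector listed in the audio stream table, but the paper's actual per-feature sizes are not stated in this section.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def root_mean_square(frame):
    """Square root of the mean squared amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# ASSUMPTION: hypothetical per-feature sizes based on common defaults;
# they are not given in the text and are listed only to show one way
# a 162-element vector could arise.
ASSUMED_FEATURE_SIZES = {
    "zero_crossing_rate": 1,
    "chromagram": 12,
    "mfcc": 20,
    "root_mean_square": 1,
    "mel_spectrogram": 128,
}

if __name__ == "__main__":
    frame = [1.0, -1.0, 1.0, -1.0]
    print(zero_crossing_rate(frame))
    print(root_mean_square(frame))
    print(sum(ASSUMED_FEATURE_SIZES.values()))
```

In practice these features are usually computed per short frame and then averaged over the utterance, so that each recording yields one fixed-length vector.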
2.2. AF Dual-Stream CNN
- (a) Audio data stream: convolution module based on audio data
- (b) Facial data stream: convolution module based on facial data
- (c) Feature fusion and fatigue state discrimination
| Algorithm 1: AF dual-stream CNN | 
| Input: , Output: Initialize: initialize Step 1: initialize and For initialize , , , , and , by using Equations (1) and (4)–(7), initialize | 
| Then,  according to Equation (9). For , initialize according to Equation (11), Step 2: initialize | 
| Step 1: Fully connected layer update For E = 1 to the number of iterations, train the fully connected layer using of each and in the training set; according to the difference between the input and output labels, update the parameters in the fully connected layer using a backpropagation algorithm; E = E + 1. End Step 2: Fatigue state discrimination Initialize of and according to Equation (13). Output the fatigue label according to Equation (14): | 
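
The fusion and discrimination steps can be illustrated with a small sketch. It assumes that fusion is a plain concatenation of the two flattened feature vectors and that the output label is the class with the highest softmax probability; both are the usual choices for this kind of network, but Equations (13) and (14) are not reproduced in this section, so this is not a confirmed transcription of them. The label order and all function names are hypothetical.

```python
import math

# ASSUMPTION: the class order is illustrative only.
LABELS = ("No fatigue", "Mild fatigue", "Severe fatigue")


def fuse(audio_features, facial_features):
    """Concatenate the flattened outputs of the two streams."""
    return list(audio_features) + list(facial_features)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


if __name__ == "__main__":
    fused = fuse([0.2, 0.7], [0.1, 0.4, 0.9])
    print(len(fused))
    print(predict_label([0.1, 2.0, -1.0]))
```

The fused vector would be passed through the fully connected layers to produce the logits; that step is omitted here to keep the sketch short.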
3. Fatigue Detection Experiments
3.1. Experimental Setup
3.2. Experimental Results
- For the SVC model, the penalty coefficient C was 10, the kernel was the radial basis function (RBF), and the random state was set to 69.
- For the KNN model, the number of neighbors was set to five, predictions were weighted by the inverse of the distance, and the brute-force search algorithm was used. The leaf size passed to the nearest-neighbor search algorithm was 30.
- For the random forest model, the number of trees was set to 500, the random state was 69, and the maximum number of features considered at each split was the square root of the number of sample features. The node-split criterion was entropy (information gain).
- For the unscaled multilayer perceptron (MLP) classifier, the random state was set to 69, and data scaling was not performed during testing.
- For the standard-scaled MLP classifier, the random state was set to 69, and data scaling was performed during testing.
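
The baseline settings listed above can be written down as a configuration sketch. The parameter names follow scikit-learn's estimators; this is an assumption, because the section does not name the library that was used. The helper `build_baselines` is hypothetical, and any hyperparameter not mentioned above is left at the library default.

```python
# Hyperparameters taken from the list above, keyed by scikit-learn
# parameter names (ASSUMPTION: the library is not named in the text).
BASELINE_PARAMS = {
    "svc": {"C": 10, "kernel": "rbf", "random_state": 69},
    "knn": {
        "n_neighbors": 5,
        "weights": "distance",
        "algorithm": "brute",
        "leaf_size": 30,
    },
    "random_forest": {
        "n_estimators": 500,
        "random_state": 69,
        "max_features": "sqrt",
        "criterion": "entropy",
    },
    "mlp": {"random_state": 69},
}


def build_baselines():
    """Instantiate the five baseline models (requires scikit-learn)."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    return {
        "svc": SVC(**BASELINE_PARAMS["svc"]),
        "knn": KNeighborsClassifier(**BASELINE_PARAMS["knn"]),
        "random_forest": RandomForestClassifier(
            **BASELINE_PARAMS["random_forest"]
        ),
        "mlp_unscaled": MLPClassifier(**BASELINE_PARAMS["mlp"]),
        # The scaler is fitted on the training data and reused at test time.
        "mlp_scaled": make_pipeline(
            StandardScaler(), MLPClassifier(**BASELINE_PARAMS["mlp"])
        ),
    }
```

One detail worth noting: in scikit-learn the leaf size only affects the tree-based neighbor searches, so with brute-force search the value of 30 has no effect on the result.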
4. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest




| Category | Type | Method | Principle | Accuracy | Usability |
|---|---|---|---|---|---|
| Subjective Detection Methods | | Subjective Feeling Rating Method | Determine fatigue level based on subjective fatigue feeling. | Medium | Low |
| Subjective Detection Methods | | Fatigue Rating Scale Method | Design scales to rate fatigue level based on fatigue characterization indicators. | Medium | Medium |
| Objective Detection Methods | Contact Type | Electroencephalogram Measurement Method | Different brain wave frequencies when the cerebral cortex is in different states. | High | Low |
| Objective Detection Methods | Contact Type | Electrocardiogram Measurement Method | Heart rate time–frequency domain indicators are significantly related to the degree of fatigue. | High | Low |
| Objective Detection Methods | Contact Type | Electromyogram Measurement Method | Monitor the bioelectric changes when muscle cells are active. | High | Low |
| Objective Detection Methods | Contact Type | Dynamic Heart Rate Method | There is a close relationship between heart rate and muscle fatigue when engaging in physical operations. | High | Medium |
| Objective Detection Methods | Non-contact Type | Facial State Recognition Method | Detect fatigue by analyzing and recognizing facial features. | Medium | High |
| Objective Detection Methods | Non-contact Type | Voice Frequency Analysis Method | Voice features change under fatigue state. | Medium | Medium |
| Subject Number | Gender | Age, Years | No Fatigue | Mild Fatigue | Severe Fatigue | Total | 
|---|---|---|---|---|---|---|
| 1 | Male | 30 | 35 | 39 | 40 | 109 | 
| 2 | Male | 28 | 30 | 40 | 41 | 116 | 
| 3 | Male | 27 | 34 | 38 | 39 | 111 | 
| 4 | Male | 28 | 35 | 39 | 40 | 109 | 
| 5 | Male | 30 | 42 | 46 | 47 | 127 | 
| 6 | Male | 29 | 30 | 34 | 35 | 104 | 
| 7 | Male | 32 | 32 | 36 | 37 | 105 | 
| 8 | Female | 35 | 33 | 37 | 38 | 108 | 
| 9 | Female | 28 | 37 | 41 | 42 | 120 | 
| 10 | Female | 29 | 38 | 42 | 43 | 123 | 
| 11 | Female | 31 | 39 | 43 | 44 | 121 | 
| 12 | Female | 30 | 40 | 44 | 45 | 118 | 
| 13 | Female | 29 | 36 | 40 | 41 | 117 | 
| 14 | Female | 29 | 35 | 39 | 40 | 114 | 
| Network Layer | Number of Kernels | Kernel Size | Stride | Dropout | Activation Function | Output Size |
|---|---|---|---|---|---|---|
| Audio feature | | | | | | 162 × 1 |
| Conv1D | 256 | 5 | 1 | 0 | ReLU | 162 × 256 |
| MaxPooling1D | 0 | 2 | 2 | 0 | | 81 × 256 |
| Conv1D | 256 | 5 | 1 | 0 | ReLU | 81 × 256 |
| MaxPooling1D | 0 | 2 | 2 | 0 | | 41 × 256 |
| Conv1D | 128 | 5 | 1 | 0 | ReLU | 41 × 128 |
| MaxPooling1D | 0 | 2 | 2 | 0.2 | | 21 × 128 |
| Conv1D | 64 | 5 | 1 | 0 | ReLU | 21 × 64 |
| MaxPooling1D | 0 | 2 | 2 | 0 | | 11 × 64 |
| Network Layer | Number of Kernels | Kernel Size | Stride | Activation Function | Output Size |
|---|---|---|---|---|---|
| Facial data | | | | | 48 × 48 × 1 |
| Conv2D | 32 | 1 × 1 | 1 | ReLU | 48 × 48 × 32 |
| Conv2D | 64 | 3 × 3 | 1 | PReLU | 48 × 48 × 64 |
| Conv2D | 64 | 5 × 5 | 1 | PReLU | 48 × 48 × 64 |
| MaxPooling2D | 0 | 2 × 2 | 2 | | 24 × 24 × 64 |
| Conv2D | 64 | 3 × 3 | 1 | PReLU | 24 × 24 × 64 |
| Conv2D | 64 | 5 × 5 | 1 | PReLU | 24 × 24 × 64 |
| MaxPooling2D | 0 | 2 × 2 | 2 | | 12 × 12 × 64 |
| Network Layer | Input | Output | Activation Function | Dropout | Classifier | 
|---|---|---|---|---|---|
| Fully connected 2048 | 9920 | 2048 | ReLU | 0.5 | None |
| Fully connected 1024 | 2048 | 1024 | ReLU | 0.5 | Softmax |
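
The output sizes in the three layer tables can be cross-checked with a short shape calculation. The sketch below assumes "same" padding for both the convolution and pooling layers; this is an inference from the listed sizes (for example, 81 pooled to 41), not something stated in the tables. Under that assumption the flattened audio stream (11 × 64) and facial stream (12 × 12 × 64) together give the 9920 inputs of the first fully connected layer. The function names are illustrative.

```python
import math


def pool_same(length, stride=2):
    """Output length of a stride-2 pooling layer with 'same' padding."""
    return math.ceil(length / stride)


def audio_lengths(length=162, blocks=4):
    """Sequence length after each Conv1D + MaxPooling1D block.

    ASSUMPTION: the stride-1 convolutions use 'same' padding and
    therefore keep the length unchanged.
    """
    sizes = []
    for _ in range(blocks):
        length = pool_same(length)
        sizes.append(length)
    return sizes


def facial_side(side=48, pools=2):
    """Spatial side length after the two 2 x 2 pooling layers."""
    for _ in range(pools):
        side = pool_same(side)
    return side


audio_flat = audio_lengths()[-1] * 64      # last Conv1D has 64 kernels
facial_flat = facial_side() ** 2 * 64      # last Conv2D has 64 kernels
fused = audio_flat + facial_flat

if __name__ == "__main__":
    print(audio_lengths(), audio_flat, facial_flat, fused)
```

The agreement between the computed fused size and the listed input of 9920 suggests that the two streams are flattened and concatenated before the fully connected layers.
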
| Model Name | Accuracy | 
|---|---|
| SVC model (C = 10) | 51.40% | 
| KNN model (K = 5) | 46.10% | 
| Random forest model | 77.57% | 
| Unscaled MLP model | 67.13% | 
| Standard scaled MLP model | 76.01% | 
| Audio data stream model | 62.88% | 
| AF dual-stream CNN | 98.03% | 
| Model Name | Accuracy | 
|---|---|
| VGG19 | 35.94% | 
| VGG16 | 96.81% | 
| ResNet50 | 97.82% | 
| LeNet | 95.01% | 
| Facial data stream model | 96.0% | 
| AF dual-stream CNN | 98.03% | 
| Model Name | Number of Nontrainable Parameters | Number of Trainable Parameters | Number of Iterations | 
|---|---|---|---|
| VGG19 | 0 | 21,601,219 | 1000 | 
| VGG16 | 0 | 16,291,523 | 300 | 
| ResNet50 | 53,120 | 23,534,467 | 1200 | 
| LeNet | 0 | 3,627,573 | 200 | 
| AF dual-stream CNN | 0 | 23,582,979 | 50 | 
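
The trainable parameter count reported for the AF dual-stream CNN can be reproduced from the layer tables under three assumptions that are not stated in this section: every convolution and dense layer has a bias term, each PReLU layer learns one slope per element of its input (the default behavior of the Keras PReLU layer), and the softmax classifier is a final dense layer from 1024 units to the three fatigue classes. The helper names below are illustrative.

```python
def conv_params(kernel_elems, channels_in, channels_out):
    """Weights plus biases of a convolution layer."""
    return kernel_elems * channels_in * channels_out + channels_out


def dense_params(units_in, units_out):
    """Weights plus biases of a fully connected layer."""
    return units_in * units_out + units_out


# Audio stream: four Conv1D layers with kernel size 5.
audio = (
    conv_params(5, 1, 256)
    + conv_params(5, 256, 256)
    + conv_params(5, 256, 128)
    + conv_params(5, 128, 64)
)

# Facial stream: 1x1, 3x3, 5x5, 3x3, 5x5 Conv2D layers.
facial = (
    conv_params(1 * 1, 1, 32)
    + conv_params(3 * 3, 32, 64)
    + conv_params(5 * 5, 64, 64)
    + conv_params(3 * 3, 64, 64)
    + conv_params(5 * 5, 64, 64)
)

# ASSUMPTION: one PReLU slope per activation element; two layers act
# on 48 x 48 x 64 maps and two on 24 x 24 x 64 maps.
prelu = 2 * (48 * 48 * 64) + 2 * (24 * 24 * 64)

# ASSUMPTION: a 3-way output layer follows the 1024-unit layer.
head = dense_params(9920, 2048) + dense_params(2048, 1024) + dense_params(1024, 3)

total = audio + facial + prelu + head

if __name__ == "__main__":
    print(audio, facial, prelu, head, total)
```

Under these assumptions the total is 23,582,979, which agrees with the table. The match supports the assumptions but does not prove that they describe the authors' implementation. Most of the parameters sit in the first fully connected layer, not in the convolutional streams.
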
| Label | Precision | F1 Score | Recall | Support | 
|---|---|---|---|---|
| Severe fatigue | 0.98 | 0.99 | 1.00 | 97 | 
| Mild fatigue | 1.00 | 0.99 | 0.98 | 129 | 
| No fatigue | 0.99 | 0.99 | 0.99 | 95 | 
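
As a consistency check on the table above, the F1 score is the harmonic mean of precision and recall, so each F1 entry can be recomputed from the other two columns. The sketch below uses the rounded values from the table; the dictionary and function names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) per label, copied from the table.
REPORT = {
    "Severe fatigue": (0.98, 1.00, 97),
    "Mild fatigue": (1.00, 0.98, 129),
    "No fatigue": (0.99, 0.99, 95),
}

f1_by_label = {
    label: round(f1_score(p, r), 2) for label, (p, r, _) in REPORT.items()
}
total_support = sum(support for _, _, support in REPORT.values())

if __name__ == "__main__":
    print(f1_by_label)
    print(total_support)
```

Each recomputed F1 value rounds to the 0.99 listed in the table, and the supports indicate a test set of 321 samples.
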
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xu, L.; Ma, S.; Shen, Z.; Nan, Y. Air Traffic Controller Fatigue Detection by Applying a Dual-Stream Convolutional Neural Network to the Fusion of Radiotelephony and Facial Data. Aerospace 2024, 11, 164. https://doi.org/10.3390/aerospace11020164