1. Introduction
In recent years, with the rapid development of aircraft design, manufacturing, and maintenance, the reliability and safety of aircraft have improved significantly. However, aviation accidents and incidents caused by pilot fatigue still occur. Pilots are often fatigued due to a lack of sleep, long flights, night flying, and transmeridian flight, as well as cabin noise, vibrations, and air pressure changes [1]. When pilots are in a state of fatigue, problems such as slower reactions, misjudgment, decreased operational ability, and flight illusions may occur, which seriously affect flight safety. Therefore, how to quickly and accurately identify the fatigue state of pilots has become a core scientific problem in the field of aviation safety that urgently needs to be solved. The identification of pilot fatigue status has important theoretical and practical significance for ensuring flight safety, realizing pilot fatigue risk control, and formulating flight-related regulations.
Scholars at home and abroad have adopted various methods to detect the fatigue status of pilots, including subjective evaluation [2,3,4], physiological parameters [5,6,7], behavior characteristics [8,9,10,11,12], and flight parameters [13,14]. Among them, physiological parameters have been more successful at monitoring pilot fatigue than the other methods. When pilots are fatigued, their physiological functions, such as cardiac, neurological, and respiratory function, change accordingly. Therefore, the fatigue status can be reflected by physiological indicators such as electroencephalogram (EEG) signals [15], electrocardiogram (ECG) signals [16], electromyogram (EMG) signals [17,18], and blood oxygen levels [19,20].
Functional near-infrared spectroscopy (fNIRS) is an emerging functional brain imaging technology. It offers low consumption, easy operation, portability, little interference with control operations, real-time detection, excellent temporal resolution, and good spatial localization ability, and it is now widely used to monitor the degree of brain activation. Some scholars have found that fNIRS can reflect the fatigue state of the brain to a certain extent [21]. Zhao Yue used fNIRS to detect the fatigue state of drivers and found that as driving time increased, the absolute concentration of oxyhemoglobin in the left region of the prefrontal lobe tended to increase. The change of deoxyhemoglobin concentration in this region could be used to measure the fatigue level of drivers [22]. Xu et al. used fNIRS to investigate the synergistic mechanisms of different brain regions of participants during driving tasks and mental calculation tasks. The results showed that mental fatigue has an adverse effect on cognitive function [23]. Li et al. randomly divided the participants into two groups (driving and non-driving) and found that, compared with the non-driving group, the concentration of oxygenated hemoglobin in the frontal cortex of the driving group was relatively increased [24].
In recent years, domestic and foreign scholars have used various algorithms to classify fatigue status based on fNIRS data, such as linear discriminant analysis (LDA), Fisher linear discriminant (FLD) analysis, and support vector machines (SVM). Khan et al. collected fNIRS signals from the drivers’ prefrontal area in a simulation experiment and used an LDA model to classify the drivers’ sleepiness and alertness [25]. Nguyen et al. collected fNIRS data of drivers in the awake and drowsy states and analyzed the mean values of HbO and Hb concentration changes to classify fatigue states with the FLD method [26]. Zhang et al. induced fatigue through character 2-back working memory tasks, collected fNIRS, ECG, and respiratory data of participants, and used SVMs to detect mental fatigue from the various data sources. The accuracy of mental fatigue detection based on fNIRS data was higher than that based on ECG or respiratory data [27].
fNIRS provides a technical basis for researchers to continuously detect pilots’ brain signals during flight. The continuous improvement of denoising and modeling methods provides a reliable theoretical basis for the interpretation of the pilot’s operational behavior at the “brain-machine” level.
At present, there is little research on pilot fatigue based on fNIRS, and the existing work relies on incomplete feature indicators and single identification methods. This paper studies pilot fatigue state identification based on fNIRS to address these problems. The fatigue scale and fNIRS data were acquired in simulated flight experiments. The data were then preprocessed to obtain the scale statistics and the relative HbO2 concentrations for each channel. The characteristic indexes of the relative HbO2 concentrations of each channel were extracted to establish the data set, and the corresponding fatigue status labels were added. The data set was min-max normalized and then divided into training and test sets. Subsequently, the stacked denoising autoencoder (SDAE) model was initialized. With the training set as the input, the network structure and parameters were adjusted through unsupervised pre-training and fine-tuning by backpropagation. The test set was then used to test the SDAE network model, obtain the classification results, and evaluate the model.
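The min-max normalization step can be illustrated with a minimal Python sketch that rescales each feature column to the interval [0, 1]. The function name `min_max_scale` and the toy feature rows are hypothetical and are not taken from the experiments; mapping constant columns to 0 is an assumption made here to avoid division by zero.

```python
def min_max_scale(rows):
    """Rescale every feature column of `rows` to the interval [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # Assumption: a constant column carries no information, so map it to 0.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical feature rows (3 samples x 2 features), for illustration only.
features = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
scaled = min_max_scale(features)
print(scaled)
```

In practice, the minimum and maximum are often estimated on the training set only and then reused for the test set; the text does not specify which variant was applied.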
Figure 1 shows the research framework.
The remainder of this paper is organized as follows: Details of the experimental content and research methodology are presented in Section 2. The results of the study are analyzed in Section 3. The discussion and future work are described in Section 4. We conclude the paper in Section 5.
3. Results
3.1. Fatigue Scale Analysis
Pilots filled out the KSS before performing the flight mission in each of the two time periods. The Shapiro–Wilk test was used to analyze the normality of the KSS scores. At a significance level of α = 0.05, the scores from 9:00 to 11:00 (W = 0.941, p > 0.05) and from 14:00 to 16:00 (W = 0.965, p > 0.05) both follow a normal distribution. A parametric test, the paired-sample t-test, was therefore used to compare the participants’ KSS scores. The KSS scores between 14:00 and 16:00 (M = 5.93, SD = 1.112) were significantly higher than those between 9:00 and 11:00 (M = 2.57, SD = 1.006); t(29) = −13.394, p < 0.001. In other words, pilots’ fatigue levels differed significantly between the two time periods. Between 9:00 and 11:00, the average KSS scores were less than 4. Between 14:00 and 16:00, the average KSS scores were greater than 4 but no greater than 7. Therefore, the fatigue levels corresponding to the two time periods can be classified as non-fatigued and fatigued.
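The paired-sample t statistic used above can be sketched in a few lines of standard-library Python. The function name `paired_t` and the six pairs of KSS scores are hypothetical and serve only to show the calculation; they are not the study data. The last lines use only the reported means and t value to back out the standard deviation of the paired differences that those values imply for n = 30.

```python
import math
import statistics


def paired_t(first, second):
    """Paired-sample t statistic and degrees of freedom for first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical KSS scores for six pilots (not the study data).
morning = [2, 3, 2, 4, 3, 2]
afternoon = [6, 5, 7, 6, 6, 5]
t_stat, dof = paired_t(morning, afternoon)

# Consistency check on the reported values: since
# t = mean_diff / (sd_diff / sqrt(n)), the reported means and t statistic
# determine the standard deviation of the paired differences for n = 30.
reported_mean_diff = 2.57 - 5.93
reported_t = -13.394
implied_sd_diff = reported_mean_diff / reported_t * math.sqrt(30)
print(t_stat, dof, implied_sd_diff)
```

The negative sign of the statistic corresponds to subtracting the afternoon scores from the morning scores. The implied standard deviation of the differences is roughly 1.37 KSS points. For real data, library routines such as `scipy.stats.shapiro` and `scipy.stats.ttest_rel` cover the normality test and the paired test, respectively.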
3.2. fNIRS Data Collection
In our study, the fNIRS data of 30 pilots were obtained through flight simulation experiments. The data of 22 participants were randomly selected as training samples, and the data of the remaining 8 participants were used as test samples. The fNIRS data were segmented using a time window of 15 s [33], which yielded 1080 samples in total. The specific figures are given in Table 5. Eight characteristic indexes were extracted for each channel in each sample. Data from selected samples are shown in Table 6.
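A minimal sketch of the per-channel feature extraction is given below. It computes the eight indexes listed in the Discussion (mean, variance, standard deviation, kurtosis, skewness, coefficient of variation, peak value, and range) for one 15 s window of a single channel and concatenates them over all channels. The function names, the use of population moments, the non-excess definition of kurtosis, and the handling of zero mean or zero variance are assumptions; the exact estimators used in the study are not stated here.

```python
import math
import statistics


def channel_features(window):
    """Eight summary statistics of one channel's relative HbO2 window."""
    n = len(window)
    mean = statistics.fmean(window)
    var = statistics.pvariance(window, mu=mean)  # population variance (assumption)
    std = math.sqrt(var)
    # Third and fourth central moments for skewness and (non-excess) kurtosis.
    m3 = sum((x - mean) ** 3 for x in window) / n
    m4 = sum((x - mean) ** 4 for x in window) / n
    skew = m3 / std ** 3 if std > 0 else 0.0
    kurt = m4 / std ** 4 if std > 0 else 0.0
    cv = std / mean if mean != 0 else 0.0  # coefficient of variation
    peak = max(window)
    value_range = max(window) - min(window)
    return [mean, var, std, kurt, skew, cv, peak, value_range]


def sample_vector(channels):
    """Concatenate the eight features of every channel into one input vector."""
    return [feature for window in channels for feature in channel_features(window)]


# Hypothetical data: 108 channels with 5 readings each (not measured values).
toy_channels = [[float(i + j) for j in range(1, 6)] for i in range(108)]
vector = sample_vector(toy_channels)
print(len(vector))
```

With 108 channels and eight features per channel, the vector has 864 entries, which matches the input dimension of the SDAE model. Because relative concentration changes can have a mean close to zero, the coefficient of variation may be numerically unstable and deserves care in a real pipeline.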
3.3. Establishment of SDAE Model
In the process of establishing the SDAE model, each candidate parameter combination was trained and tested 10 times, and the average recognition accuracy was recorded.
3.3.1. Number of Hidden Layers
In this section, the characteristic indexes of the fNIRS data are used as the input to the SDAE model, so the input dimension is 864 (108 × 8). The fatigue states are the output of the model, so the output dimension is two. To find the best recognition performance, the number of hidden layers was set to 1, 2, 3, 4, and 5 in turn. The number of neurons in each hidden layer was set to 864. The learning rate, noise rate, number of iterations, and batch size were set to 0.1, 0.1, 15, and 4, respectively. The classification results are shown in Figure 9. As the number of hidden layers increased, the identification accuracy on the training set kept rising. However, when the number of hidden layers exceeded two, the identification accuracy on the test set decreased. With two hidden layers, the network reached an accuracy of 88.52% on the training set and 86.46% on the test set, which was the highest test accuracy. Therefore, two was chosen as the number of hidden layers.
3.3.2. Number of Nodes in the Hidden Layer
The number of hidden layer nodes was determined after the number of hidden layers had been fixed. The number of neurons in the first hidden layer of the SDAE is the dimension of the features automatically extracted during the pre-training of the first DAE. Usually, the number of nodes in a hidden layer of an SDAE is less than the number of nodes in the input layer. In this study, the number of neurons in the first hidden layer was set to 864, 432, 216, 108, and 54 in turn, and the number of neurons in the second hidden layer was half that of the first hidden layer. The networks were trained and tested, and the results are illustrated in Figure 10. When the number of neurons in the first hidden layer was 432, the network had the highest accuracy on both the training set and the test set, with a test accuracy of 88.19%. Therefore, the number of neurons in the first hidden layer was set to 432.
With the number of neurons in the first hidden layer fixed, the effect of the number of neurons in the second hidden layer on the fatigue identification accuracy of the SDAE was analyzed. The number of neurons in the second hidden layer was set to 27, 54, 108, 216, and 432 in turn. The accuracy results on the training set and test set are shown in Figure 11. When the number of neurons in the second hidden layer was 108, the accuracy rates on the training set and test set were 93.06% and 90.63%, respectively. Therefore, 108 was chosen as the number of neurons in the second hidden layer after comprehensive consideration.
3.3.3. Noise Rate
The noise rate is the proportion of noise added to the input samples during SDAE pre-training; the larger the noise rate, the larger the proportion of noise added. The optimal noise rate was selected through multiple trials, as shown in Figure 12. When the noise rate was 0.1, the identification accuracy of the network was the highest, at 93.06% on the training set and 90.63% on the test set. Therefore, the noise rate was set to 0.1.
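To make the role of the noise rate concrete, the sketch below corrupts an input vector with masking noise, one common choice for denoising autoencoders, in which each element is set to zero with probability equal to the noise rate. The type of noise used in this study is not stated in this section, so masking noise is an assumption, and the function name `corrupt` and the all-ones input are hypothetical.

```python
import random


def corrupt(vector, noise_rate, rng):
    """Zero out each element independently with probability `noise_rate`."""
    return [0.0 if rng.random() < noise_rate else x for x in vector]


rng = random.Random(0)   # fixed seed so the example is repeatable
clean = [1.0] * 864      # hypothetical normalized input of dimension 864
noisy = corrupt(clean, noise_rate=0.1, rng=rng)
zeroed_fraction = noisy.count(0.0) / len(noisy)
print(zeroed_fraction)
```

During pre-training, each denoising autoencoder receives the corrupted vector as input and is trained to reconstruct the clean vector, so a noise rate of 0.1 means that about one tenth of the input elements are hidden from the encoder in each pass.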
3.3.4. Learning Rate
The accuracy rates of the training set and test set under different learning rates are displayed in Figure 13. When the learning rate was 0.2, the network had the highest identification accuracy rate. The identification accuracy rates of the training set and test set were 93.89% and 91.32%, respectively. Combining the accuracy results of the training set and test set, 0.2 was chosen as the learning rate of the network.
3.3.5. Batch Size
Batch size is the number of samples fed to the neural network at a time. The number of training samples must be evenly divisible by the batch size.
Figure 14 shows the relationship between batch size and identification accuracy rate. When the batch size was 4, the identification accuracy rate of the test set was 91.32%. Therefore, the batch size was set to four.
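The divisibility constraint on the batch size can be checked with a short sketch. It assumes that every participant contributes the same number of samples, so that the 22 training participants account for 1080 × 22 / 30 = 792 samples; the exact split is given in Table 5, and the upper limit of 64 for candidate batch sizes is arbitrary.

```python
TOTAL_SAMPLES = 1080
PARTICIPANTS = 30
TRAIN_PARTICIPANTS = 22

# Assumption: equal number of samples per participant (see Table 5 for the real split).
train_size = TOTAL_SAMPLES * TRAIN_PARTICIPANTS // PARTICIPANTS

# Candidate batch sizes that divide the training set without a remainder.
candidates = [b for b in range(1, 65) if train_size % b == 0]
batches_per_epoch = train_size // 4
print(train_size, candidates, batches_per_epoch)
```

Under this assumption, a batch size of 4 splits the training set into 198 full batches per pass, whereas a value such as 5 would leave an incomplete final batch.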
The optimal structure of the trained SDAE neural network is 864-432-108-2, which means that the SDAE neural network has a four-layer structure, as shown in Figure 15. The input layer is used to input data. The first and second hidden layers are used for dimensionality reduction and feature extraction. The output layer outputs the results. The noise rate, learning rate, and batch size are 0.1, 0.2, and 4, respectively. The accuracy rates of the SDAE model on the training set and test set are 93.89% and 91.32%, respectively.
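The size of the 864-432-108-2 network can be estimated with a short calculation. The sketch assumes fully connected layers with one bias per unit and counts only the parameters kept after fine-tuning, ignoring the decoder weights used during pre-training; these are assumptions, since the layer implementation is not detailed in this section.

```python
layer_sizes = [864, 432, 108, 2]

# One weight matrix between each pair of consecutive layers.
weights = [n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
# One bias per unit in every layer except the input layer.
biases = layer_sizes[1:]

total_parameters = sum(weights) + sum(biases)
print(weights, biases, total_parameters)
```

Almost 90% of the parameters sit between the input layer and the first hidden layer, which is one reason why halving the first hidden layer from 864 to 432 neurons reduces the model size so strongly.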
3.4. Model Performance Evaluation
In this section, the identification results of the SDAE model are compared with those of the traditional classification models SVM and LDA to verify the accuracy and effectiveness of the model. The training set was used to train the SVM and LDA models, and the test set was used to test them. The confusion matrices of the SDAE, SVM, and LDA models are shown in Figure 16, and the identification accuracy rates are given in Table 7.
Figure 16 demonstrates that the accuracy of the SDAE model under two fatigue states is 93.06% and 89.58%, respectively. The accuracy of the LDA model under two fatigue states is 72.22% and 63.89%, respectively. The accuracy of the SVM model under two fatigue states is 77.08% and 73.61%, respectively. The accuracy of the SDAE model is higher than that of the LDA model and the SVM model under two fatigue states.
Table 7 shows that the identification accuracy rate of the SDAE model is 23.26 percentage points higher than that of the LDA model and 15.97 percentage points higher than that of the SVM model. Therefore, the pilot fatigue identification model based on the SDAE model established in this paper has high identification accuracy.
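The relationship between the per-state accuracies and the overall accuracies can be checked with simple arithmetic. The sketch below assumes that the 8 test participants contribute 288 samples, split evenly into 144 per fatigue state; this split is an assumption that is consistent with the reported percentages but is not stated explicitly in this section.

```python
# Per-state test accuracies (%) as reported for the three models.
per_class_acc = {
    "SDAE": (93.06, 89.58),
    "LDA": (72.22, 63.89),
    "SVM": (77.08, 73.61),
}
# Assumption: 288 test samples, 144 for each fatigue state.
N_PER_CLASS = 144


def overall_accuracy(acc_pair, n_per_class=N_PER_CLASS):
    """Overall accuracy (%) recovered from the two per-state accuracies."""
    correct = sum(round(acc / 100 * n_per_class) for acc in acc_pair)
    return 100 * correct / (2 * n_per_class)


overall = {name: round(overall_accuracy(pair), 2) for name, pair in per_class_acc.items()}
gain_over_lda = round(overall["SDAE"] - overall["LDA"], 2)
gain_over_svm = round(overall["SDAE"] - overall["SVM"], 2)
print(overall, gain_over_lda, gain_over_svm)
```

Under this assumption the SDAE model classifies 263 of 288 test samples correctly (91.32%), and the differences from the LDA and SVM models come out at 23.26 and 15.97, which agrees with the reported values and shows that they are absolute differences in accuracy.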
To further evaluate the classification performance of the models, four indicators are used in our study: precision, recall, F1 score, and the receiver operating characteristic (ROC) curve. The precision, recall, and F1 score of the SDAE, LDA, and SVM models are given in Table 8, and the ROC curves are shown in Figure 17.
Table 8 demonstrates that, compared with the LDA model and the SVM model, the SDAE model has higher precision, recall, and F1 score, indicating that the classification precision and the sensitivity of the SDAE model are higher.
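For reference, precision, recall, and F1 score follow directly from the entries of a binary confusion matrix. The sketch below uses hypothetical counts that are not taken from Figure 16 or Table 8, and the function name `precision_recall_f1` is illustrative.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 score from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts with the fatigued state treated as the positive class.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=20)
print(precision, recall, f1)
```

Which fatigue state is treated as the positive class affects precision and recall, so that choice should be stated when values for different models are compared.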
As can be seen from Figure 17, all three curves lie to the upper left of the 45° diagonal. However, the SDAE model curve is closer to the (0,1) point than the LDA and SVM model curves, indicating that the SDAE model performs better.
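How an ROC curve and its area under the curve (AUC) are obtained from classifier scores can be sketched as follows. The labels and scores are hypothetical, the function names are illustrative, and the sketch assumes that no two samples share the same score.

```python
def roc_points(labels, scores):
    """(FPR, TPR) points obtained by lowering the decision threshold step by step."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical fatigue labels (1 = fatigued) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
auc = area_under_curve(points)
print(points, auc)
```

A curve that passes closer to the (0,1) point encloses a larger area, so the AUC gives a single number for comparing models; a classifier that guesses at random follows the 45° diagonal and has an AUC of 0.5.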
Based on the above accuracy rate, precision rate, recall value, F1 score and ROC curve results, the SDAE model is superior to the LDA model and the SVM model. The SDAE model can effectively identify the two fatigue states of the pilot.
4. Discussion
In order to establish the identification model of the pilots’ fatigue state, the pilots were selected as participants. The airfield traffic pattern was selected as the flight task, and the wireless wearable near-infrared device was selected as the data acquisition device. The flight simulation experiment had less interference from the pilot operation and a low cost of data acquisition. Therefore, the obtained fNIRS data are of great significance for studying the identification of pilots’ fatigue status and the effect of fatigue on pilots’ cognitive activities.
The fNIRS signals from the prefrontal lobe of the pilots’ brains were acquired. The results show that the degree of brain fatigue can be reflected by fNIRS data from the prefrontal lobe, which is consistent with the results of relevant research [17,42]. Meanwhile, the relative concentration of HbO2 in the fNIRS signal was extracted. The findings suggest that the relative concentration of HbO2 can reflect the degree of brain fatigue, which is consistent with the results of related studies [19,22].
In our study, the mean value, variance, standard deviation, kurtosis, skewness, coefficient of variation, peak value, and range of HbO2 for each channel were extracted as the characteristic indexes, and the fatigue status of pilots was identified from these indexes. In related studies, fatigue states have been analyzed based on indicators such as mean value, variance, skewness, and kurtosis [43]. Our results enrich the research on characteristic indicators of fNIRS.
This means that fNIRS could be a powerful tool for studying fatigue. The relevant characteristics of fNIRS may be indicators of fatigue. Therefore, fNIRS has good prospects in fatigue-related research.
Due to limited time and effort, fatigue was divided into only two states in this paper: non-fatigue and fatigue. In future studies, pilots’ fatigue states could be further refined and identified. The participants in our study were all male pilots, so female pilots could be recruited in future research to expand the scope of the study. Thirty pilots were recruited for the flight simulation experiment, and the number of participants could be increased in future studies to improve reliability.
5. Conclusions
In this study, wearable near-infrared equipment was used to obtain pilots’ fNIRS data in the flight simulation experiment, and 1080 valid samples were selected. Eight feature indexes (mean, variance, standard deviation, kurtosis, skewness, coefficient of variation, peak value, and range) of HbO2 for each channel of each sample were extracted to establish the feature parameter set. These were used as the input of the SDAE model to train and test the pilots’ fatigue state identification model. The identification accuracy of the SDAE model is 91.32%, which is 23.26 and 15.97 percentage points higher than that of the LDA and SVM models, respectively. Therefore, the pilots’ fatigue state identification model based on the SDAE established in this paper has high identification accuracy. The SDAE model proposed in this paper can provide a new theoretical basis for the identification of pilots’ fatigue. This research can provide directions for monitoring and early warning of pilot fatigue during flight missions by using fNIRS in the future. This is conducive to improving the pilot fatigue risk management system (FRMS) and reducing flight accidents caused by pilot fatigue.