Research on the Identification of Pilots’ Fatigue Status Based on Functional Near-Infrared Spectroscopy

Fatigue can cause sluggish responses, misjudgments, flight illusions, and other problems for pilots, which can easily lead to serious flight accidents. In this paper, a wearable functional near-infrared spectroscopy (fNIRS) device was used to record changes in the hemoglobin concentration of pilots during flight missions. The data were pre-processed, and 1080 valid samples were obtained. Then the mean value, variance, standard deviation, kurtosis, skewness, coefficient of variation, peak value, and range of oxyhemoglobin (HbO2) in each channel were extracted. These indexes served as the input of a stacked denoising autoencoder (SDAE), which was trained as the identification model of pilots’ fatigue state. The identification accuracy of the SDAE model was 91.32%, which was 23.26 and 15.97 percentage points higher than that of the linear discriminant analysis (LDA) and support vector machine (SVM) models, respectively. The results show that the SDAE model established in this study can accurately identify different fatigue states of pilots. Identification of pilots’ fatigue status based on fNIRS has important practical significance for reducing flight accidents caused by pilot fatigue.


Introduction
In recent years, with the rapid developments of aircraft design, manufacturing and maintenance, the reliability and safety of aircraft have improved significantly. However, aviation accidents and incidents caused by pilot fatigue still occur. Pilots are often fatigued due to a lack of sleep, long flights, night flying, transmeridian flight, as well as cabin noise, vibrations, and air pressure changes [1]. When pilots are in a state of fatigue, problems such as slower reaction speed, misjudgment, decreased operational ability, and flight illusion may occur, which will seriously affect flight safety. Therefore, how to quickly and accurately identify the fatigue state of pilots has become a core scientific problem in the field of aviation safety that needs to be solved urgently. The identification of pilot fatigue status has important theoretical and practical significance for ensuring flight safety, realizing pilot fatigue risk control, and formulating flight-related regulations.
Functional near-infrared spectroscopy (fNIRS) is an emerging functional brain imaging technology. It offers low consumption, easy operation, portability, little interference with control operations, real-time detection, excellent temporal resolution, and good spatial localization ability, and it is now widely used to monitor the degree of brain activation. Some scholars have found that fNIRS can reflect the fatigue state of the brain to a certain extent [21]. Zhao Yue used fNIRS to detect the fatigue state of drivers and found that, as driving time increased, the absolute concentration of oxyhemoglobin in the left region of the prefrontal lobe tended to increase. The change of deoxyhemoglobin concentration in this region could be used to measure the fatigue level of drivers [22]. Xu et al. used fNIRS to investigate the synergistic mechanisms of different brain regions of participants during driving tasks and mental calculation tasks. The results showed that mental fatigue has an adverse effect on cognitive function [23]. Li et al. randomly divided the participants into two groups (driving and non-driving) and found that, compared with the non-driving group, the concentration of oxygenated hemoglobin in the frontal cortex of the driving group was relatively increased [24].
In recent years, domestic and foreign scholars have used various algorithms to classify fatigue status based on fNIRS data, such as linear discriminant analysis (LDA), Fisher linear discriminant (FLD) analysis, and support vector machines (SVM). Khan et al. collected fNIRS signals from the drivers' prefrontal lobe area in a simulation experiment and used an LDA model to classify the drivers' sleepiness and alertness [25]. Nguyen et al. collected fNIRS data of drivers in the awake and drowsy states and analyzed the mean values of HbO and Hb concentration changes to classify fatigue states with the FLD method [26]. Zhang et al. induced fatigue through character 2-back working memory tasks, collected fNIRS, ECG, and respiratory data of participants, and used SVMs to detect mental fatigue from the various data sources. The accuracy of mental fatigue detection based on fNIRS data was higher than that based on ECG or respiratory data [27].
fNIRS provides a technical basis for researchers to continuously detect pilots' brain signals during flight. The continuous improvement of denoising and modeling methods provides a reliable theoretical basis for interpreting the pilot's operational behavior at the "brain-machine" level.
At present, there is little research on pilot fatigue based on fNIRS, and the existing work relies on incomplete feature indicators and a single identification method. This paper studies the pilot fatigue state based on fNIRS to address these problems. Fatigue scale and fNIRS data were acquired in simulated flight experiments. The data were then preprocessed to obtain the scale statistics and the relative HbO2 concentration of each channel. The characteristic indexes of the relative HbO2 concentration of each channel were extracted to establish the data set, and the corresponding fatigue status labels were added. The data set was min-max normalized and then divided into training and test sets. Subsequently, the SDAE model was initialized. With the training set as the input, the network structure and parameters were adjusted through unsupervised pre-training and reverse fine-tuning. The test set was then used to test the SDAE network model, obtain the classification results, and evaluate the model. Figure 1 shows the research framework.
The remainder of this paper is organized as follows: Details of the experimental content and research methodology are presented in Section 2. The results of the study are analyzed in Section 3. The discussion and future works are described in Section 4. We conclude the paper in Section 5.

Participants
Thirty pilots were recruited for the experiment, all of whom had obtained a Private Pilot License, Commercial Pilot License, and Instrument Rating License. The basic information is listed in Table 1. The selected pilots were all right-handed, with normal or corrected vision, good sleep habits, and no history of serious medical conditions such as tumors, nephritis, or endocrine disorders. In the 24 h before the experiment, the participants were required to get sufficient sleep, avoid strenuous exercise, avoid smoking, drinking, and other behaviors that excite or inhibit the central nervous system, and maintain a good mental state.
The experimental procedures were carried out in strict accordance with the Declaration of Helsinki. All experimenters received full instructions on the experimental procedures and equipment. All participants were familiar with the entire experimental procedure. They participated in the experiment voluntarily and completed a written consent form.

Experimental Equipment
The Gowerlabs LUMO system was used to acquire fNIRS data from the prefrontal lobes of the pilots' brains. The device uses continuous wave (CW) measurement technology with a dual-wavelength light source (735 nm and 850 nm) and a sampling frequency of 10 Hz. Nine light sources (labeled A to I) and 12 detectors (numbered 1 to 12) were used in the experiment; their positions are depicted in Figure 2. Each light source can form a channel with each detector, giving a total of 108 channels (9 × 12).
The Cessna 172 flight simulator was used for the experiments; the experimental scene is shown in Figure 3. In this simulator, the equipment, airport, weather environment, training scene, and other conditions can be configured. Moreover, the simulator closely reproduces the Cessna 172, which creates a good simulation effect and a sense of immersion.

Subjective Scale
To evaluate the fatigue state of the participants during the flight mission, the participants filled out the Karolinska sleepiness scale (KSS) before performing the airfield traffic pattern mission. The KSS is described in Table 2. The scale is divided into 9 levels, with level 1 representing extremely alert and level 9 representing very sleepy. As the score increases, the degree of fatigue gradually deepens [28,29]. It is generally believed that when the KSS value of a participant is less than or equal to 4, the participant is in an awake state and can provide timely feedback to the outside world. When the KSS value is greater than 4 and less than or equal to 7, the participant is in a state of mild fatigue and begins to appear unresponsive, tired, etc. When the KSS value is greater than 7, the participant is in a state of fatigue, lack of energy, and trance [30].

Table 2. Description of Karolinska sleepiness scale. Data from [28].

Item   Description                                        Scale
1      Extremely alert                                    1
2      Very alert                                         2
3      Alert                                              3
4      Rather alert                                       4
5      Neither alert nor sleepy                           5
6      Some signs of sleepiness                           6
7      Sleepy, but no effort to keep awake                7
8      Sleepy, but some effort to keep awake              8
9      Very sleepy, great effort to keep fighting sleep   9
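The KSS thresholds described above can be written as a small lookup. This is only an illustrative sketch: the function name `fatigue_level` and the state labels are hypothetical and do not come from the study.

```python
def fatigue_level(kss: int) -> str:
    """Map a KSS score (1-9) to the fatigue state described in the text.

    Thresholds follow the text: <= 4 awake, 5-7 mild fatigue, > 7 fatigue.
    The returned labels are illustrative names, not the paper's terminology.
    """
    if not 1 <= kss <= 9:
        raise ValueError("KSS score must be between 1 and 9")
    if kss <= 4:
        return "awake"
    if kss <= 7:
        return "mild fatigue"
    return "fatigue"


print([fatigue_level(score) for score in (3, 4, 5, 7, 8)])
```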

Experimental Task
The airfield traffic pattern is a common training task used in pilot training courses. It includes the basic flight operations of a pilot, such as taxiing, takeoff, climbing, cruising, descent, and landing. Therefore, the airfield traffic pattern task was used as the experimental task. The process of the task is given in Figure 4, and the simulated airfield traffic pattern path completed by the pilot is shown in Figure 5.
Each participant took part in the experiment in two time periods, from 9:00 to 11:00 and from 14:00 to 16:00. One effective airfield traffic pattern was carried out in each time period, and each took about 10 min to complete. Participants were asked to sleep for 8 h before the experiment and were deprived of midday sleep on the day of the experiment. At the beginning of each time period, pilots filled out the KSS, then sat in meditation for 15 min, and then performed the airfield traffic pattern mission.

Pre-Processing
The modified Beer-Lambert law (MBLL) was used to convert the fNIRS data acquired at the two wavelengths of each channel into optical density variations, from which the relative concentrations of oxyhemoglobin (HbO2), deoxyhemoglobin (Hb), and total hemoglobin (tHb) were obtained. Although relative concentration data for HbO2, Hb, and tHb were all obtained in this study, oxygenated hemoglobin indicators are considered to have a higher signal-to-noise ratio [31,32] and to be more sensitive to task stimuli [33]. Therefore, the oxygenated hemoglobin concentration data were selected for analysis in this study.
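For reference, the MBLL conversion can be written in its commonly used two-wavelength form. The symbols below ($\varepsilon$ for extinction coefficients, $d$ for source-detector distance, $DPF$ for the differential pathlength factor) are standard notation assumed here, not taken from the paper, and the paper's exact formulation and logarithm convention may differ.

```latex
% Optical density change at wavelength \lambda (here 735 nm and 850 nm)
\Delta OD(\lambda) = -\log\frac{I(\lambda)}{I_0(\lambda)}
  = \left[\varepsilon_{\mathrm{HbO_2}}(\lambda)\,\Delta c_{\mathrm{HbO_2}}
        + \varepsilon_{\mathrm{Hb}}(\lambda)\,\Delta c_{\mathrm{Hb}}\right] d \cdot DPF(\lambda)

% Two wavelengths give a 2x2 linear system for the two unknowns
\begin{pmatrix} \Delta OD(\lambda_1) \\ \Delta OD(\lambda_2) \end{pmatrix}
= \begin{pmatrix}
    \varepsilon_{\mathrm{HbO_2}}(\lambda_1)\, d\, DPF(\lambda_1) & \varepsilon_{\mathrm{Hb}}(\lambda_1)\, d\, DPF(\lambda_1) \\
    \varepsilon_{\mathrm{HbO_2}}(\lambda_2)\, d\, DPF(\lambda_2) & \varepsilon_{\mathrm{Hb}}(\lambda_2)\, d\, DPF(\lambda_2)
  \end{pmatrix}
  \begin{pmatrix} \Delta c_{\mathrm{HbO_2}} \\ \Delta c_{\mathrm{Hb}} \end{pmatrix},
\qquad
\Delta c_{\mathrm{tHb}} = \Delta c_{\mathrm{HbO_2}} + \Delta c_{\mathrm{Hb}}
```

Solving this system per channel and per time point yields the relative concentration series used in the rest of the analysis.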
In addition to the required effective data, the fNIRS data acquired by the equipment contained noise, such as physiological noise, instrument noise, and motion artifacts, as shown in Table 3. To eliminate the interference of physiological noise and instrument noise, the fNIRS signal was band-pass filtered with a high-pass cutoff of 0.01 Hz and a low-pass cutoff of 0.2 Hz [34]. After that, spline interpolation was used to remove motion artifacts from the fNIRS signal [35]. The relative concentrations of hemoglobin before and after denoising are depicted in Figure 6.

Table 3. Main noise in fNIRS signal.

Item               Description
Motion artifact    Head movement, coughing, and other behaviors may cause a shift in the relative position between the hat and the head, which can produce high-frequency interference. Motion artifacts usually manifest as baseline drift in the signal [38].
Instrument noise   Noise caused by unstable light sources, experimental equipment, the surrounding environment, etc. [39].

Data Pre-Processing
Based on the relative concentration data of HbO2, characteristic parameters reflecting the relative concentration of HbO2 were extracted to obtain the near-infrared signal indexes that characterize the fatigue state of the pilot.
The time-domain method is usually adopted to analyze the characteristics of the fNIRS signal and to study how these characteristics vary over time. The brain state and function of participants are judged and recognized from time-domain features. Time-domain analysis has been widely used in research on brain-machine interfaces based on fNIRS [40], and the time-domain feature indexes used in our study are listed in Table 4.
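As an illustration of how the eight time-domain indexes named in this paper could be computed for one channel, here is a minimal sketch. The exact formulas are assumptions: population (divide-by-n) moments, non-excess kurtosis, and the maximum as the peak value are choices made for this example, and the names `time_domain_features` and `sample_vector` are hypothetical.

```python
import math
from statistics import fmean


def time_domain_features(signal):
    """Return the eight time-domain indexes for one channel's HbO2 window.

    Assumed definitions (the paper's Table 4 formulas may differ):
    population variance, non-excess kurtosis, peak = maximum value.
    """
    n = len(signal)
    mean = fmean(signal)
    centered = [v - mean for v in signal]
    variance = sum(c * c for c in centered) / n
    std = math.sqrt(variance)
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        # Guard against a constant window, where the moments are undefined.
        "kurtosis": m4 / variance ** 2 if variance > 0 else 0.0,
        "skewness": m3 / variance ** 1.5 if variance > 0 else 0.0,
        "cv": std / mean if mean != 0 else math.nan,
        "peak": max(signal),
        "range": max(signal) - min(signal),
    }


def sample_vector(channels):
    """Concatenate the eight indexes of every channel into one feature vector."""
    vector = []
    for channel in channels:
        vector.extend(time_domain_features(channel).values())
    return vector


features = time_domain_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features["mean"], features["variance"], features["range"])
```

With 108 channels, `sample_vector` yields 108 × 8 = 864 values per sample, which matches the input dimension reported for the SDAE model.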

Table 4. Characteristic indexes of fNIRS signal (columns: item, index, symbol, unit, formula). The indexes are the mean value, variance, standard deviation, kurtosis, skewness, coefficient of variation, peak value, and range.

Denoising Autoencoder
As shown in Figure 7, the denoising autoencoder (DAE) [41] is a classic three-layer neural network with full connectivity between the layers. Compared with traditional autoencoders, noise is added to the original input data in the DAE. After the encoding-decoding process, the output layer can reconstruct the original input data from the corrupted data. The whole network therefore gains the ability to resist noise; in other words, its robustness is increased, which improves the generalization performance of the model.
The main principle of the denoising autoencoder is as follows:
(1) Noise is randomly added to the input data $x_i^j$ of the input layer to generate the corrupted input data $\tilde{x}_i^j$.
(2) Based on $\tilde{x}_i^j$, the output data of the hidden layer $y_i^j$ is obtained by the encoding function $f(\cdot)$. Formula (1) gives the encoding process:
$$y = f(\tilde{x}) = s_f(w\tilde{x} + b) \quad (1)$$
where $w$ is the weight matrix, $b$ is the bias vector, and $s_f$ is the activation function, $s_f(x) = \frac{1}{1+e^{-x}}$.
(3) Based on $y_i^j$, the non-linear decoding function $g(\cdot)$ is applied to the hidden layer output. Formula (2) gives the decoding process:
$$z = g(y) = s_g(w'y + b') \quad (2)$$
where $w'$ is the weight matrix, $b'$ is the bias vector, and $s_g$ is the activation function, $s_g(y) = \frac{1}{1+e^{-y}}$.
During the training process, the loss function of the DAE, which measures the error between the reconstruction $z$ and the original input $x$, is calculated by Equation (3).
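The corruption, encoding, and decoding steps of Formulas (1) and (2) can be sketched in plain Python as follows. Two details are assumptions made for this example and are not stated in the text: the noise is masking noise (each input is zeroed with probability `noise_rate`), and the loss is the mean squared reconstruction error. All function and variable names are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def corrupt(x, noise_rate, rng):
    """Masking noise (assumed): zero each input with probability noise_rate."""
    return [0.0 if rng.random() < noise_rate else v for v in x]


def affine_sigmoid(weights, bias, vec):
    """Compute s(W vec + b) row by row, as in Formulas (1) and (2)."""
    return [sigmoid(sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def dae_forward(x, w, b, w_dec, b_dec, noise_rate, rng):
    x_tilde = corrupt(x, noise_rate, rng)      # step (1): corrupted input
    y = affine_sigmoid(w, b, x_tilde)          # step (2): encoding
    z = affine_sigmoid(w_dec, b_dec, y)        # step (3): decoding
    # Assumed loss: mean squared error against the clean input x.
    loss = sum((xi - zi) ** 2 for xi, zi in zip(x, z)) / len(x)
    return x_tilde, y, z, loss


rng = random.Random(0)
x = [0.2, 0.9, 0.5, 0.1]
w = [[rng.uniform(-0.1, 0.1) for _ in range(4)] for _ in range(2)]      # 4 -> 2
w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(4)]  # 2 -> 4
_, y, z, loss = dae_forward(x, w, [0.0] * 2, w_dec, [0.0] * 4, 0.1, rng)
print(len(y), len(z), loss >= 0.0)
```

Note that the loss compares the reconstruction with the clean input rather than the corrupted one, which is what forces the network to learn to remove the noise.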

Stacked Denoising Autoencoder
The stacked denoising autoencoder (SDAE) is formed by stacking multiple denoising autoencoders, as shown in Figure 8. The output of the previous DAE is used as the input of the next DAE. In the SDAE training process, the network is first pre-trained with an unsupervised layer-by-layer training algorithm. Then, using the classification algorithm, the parameters of the whole neural network are fine-tuned in the reverse direction. In this way, the network converges more easily during training. The training process of the SDAE is as follows:
Step 1: The SDAE network parameters are initialized.
Step 2: The first DAE is trained by minimizing the error, and the parameters of the first hidden layer are obtained. The output of the first hidden layer is used as the input of the second DAE, which is trained in the same way. Any further DAEs are trained in the same manner.
Step 3: The trained DAEs are stacked to form the SDAE. An output layer is added on top of the SDAE network.
Step 4: The labeled sample data and the classification results of the output layer are used to perform supervised reverse fine-tuning of the entire network.
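The greedy layer-wise pre-training in Steps 1 and 2, and the stacking of the trained encoders in Step 3, can be sketched as below. This is a toy illustration under several assumptions, not the authors' implementation: masking noise, a squared-error loss, plain per-sample gradient descent, and tiny layer sizes instead of the paper's 864-dimensional input. The output layer and the supervised fine-tuning of Step 4 are omitted, and all names are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def layer_output(weights, bias, vec):
    return [sigmoid(sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    # Step 1: small random weights, zero biases (assumed initialization).
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def train_dae(data, n_hidden, rng, noise_rate=0.1, lr=0.1, epochs=15):
    """Step 2: train one DAE by gradient descent on squared reconstruction error."""
    n_in = len(data[0])
    w, b = init_layer(n_in, n_hidden, rng)
    w_dec, b_dec = init_layer(n_hidden, n_in, rng)
    for _ in range(epochs):
        for x in data:
            x_tilde = [0.0 if rng.random() < noise_rate else v for v in x]
            y = layer_output(w, b, x_tilde)
            z = layer_output(w_dec, b_dec, y)
            # Error signals at the decoder and encoder pre-activations.
            dz = [(zi - xi) * zi * (1 - zi) for zi, xi in zip(z, x)]
            dy = [sum(dz[k] * w_dec[k][j] for k in range(n_in)) * y[j] * (1 - y[j])
                  for j in range(n_hidden)]
            for k in range(n_in):
                for j in range(n_hidden):
                    w_dec[k][j] -= lr * dz[k] * y[j]
                b_dec[k] -= lr * dz[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    w[j][i] -= lr * dy[j] * x_tilde[i]
                b[j] -= lr * dy[j]
    return w, b  # only the encoder is kept for stacking


def pretrain_sdae(data, hidden_sizes, rng):
    """Train the DAEs one after another; each hidden output feeds the next DAE."""
    encoders, layer_input = [], data
    for n_hidden in hidden_sizes:
        w, b = train_dae(layer_input, n_hidden, rng)
        encoders.append((w, b))
        layer_input = [layer_output(w, b, x) for x in layer_input]
    return encoders, layer_input


rng = random.Random(0)
toy_data = [[rng.random() for _ in range(8)] for _ in range(6)]  # 6 samples, 8 features
encoders, codes = pretrain_sdae(toy_data, [4, 2], rng)
print(len(encoders), len(codes[0]))
```

In a full pipeline, a classification layer would be placed on top of the last encoder and the whole stack fine-tuned with the fatigue labels, as described in Steps 3 and 4.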

Fatigue Scale Analysis
Pilots filled out the KSS before performing the flight mission in each of the two time periods. The Shapiro-Wilk test was used to analyze the normality of the KSS scores. At the test level of α = 0.05, the data from 9:00 to 11:00 follow the normal distribution (W = 0.941, p > 0.05), and the data from 14:00 to 16:00 also follow the normal distribution (W = 0.965, p > 0.05). A parametric test, namely a paired-sample t-test, was therefore used to analyze the participants' KSS scores. The KSS scores between 14:00 and 16:00 (M = 5.93, SD = 1.112) were significantly higher than those between 9:00 and 11:00 (M = 2.57, SD = 1.006); t(29) = −13.394, p < 0.001. In other words, pilots' fatigue levels differed significantly between the two time periods. Between 9:00 and 11:00, the average KSS score was less than 4. Between 14:00 and 16:00, the average KSS score was greater than 4 but no greater than 7. Therefore, the fatigue levels corresponding to the two time periods can be divided into non-fatigued and fatigued.
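For readers who want to reproduce this kind of comparison, the paired-sample t statistic is the mean of the within-participant differences divided by its standard error. The sketch below uses made-up scores for five participants purely for illustration; they are not the study's data, and the function name `paired_t` is hypothetical.

```python
import math
from statistics import fmean, stdev


def paired_t(first, second):
    """Return (t, degrees of freedom) for paired samples, using first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return fmean(diffs) / standard_error, n - 1


# Hypothetical KSS scores, NOT the study's data.
morning = [2, 3, 2, 4, 3]
afternoon = [5, 6, 6, 7, 5]
t_value, dof = paired_t(morning, afternoon)
print(round(t_value, 3), dof)
```

The sign convention (morning minus afternoon) matches the negative t value reported above, where the afternoon scores are higher.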

fNIRS Data Collection
In our study, the fNIRS data of 30 pilots were obtained through flight simulation experiments. The data of 22 participants were randomly selected as training samples, and the data of the remaining 8 participants were used as test samples. The fNIRS data were segmented with a time window of 15 s [33], and 1080 samples were ultimately obtained. The specific figures are given in Table 5. Eight characteristic indexes were extracted for each channel in every sample. Data from some of the samples are shown in Table 6.
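The segmentation and the participant-wise split can be sketched as follows. The 10 Hz sampling rate and the 15 s window come from the text; the use of non-overlapping windows, the fixed random seed, and the helper names are assumptions made for this example.

```python
import random

SAMPLING_HZ = 10   # sampling frequency of the fNIRS device
WINDOW_S = 15      # window length in seconds


def split_windows(signal, fs=SAMPLING_HZ, window_s=WINDOW_S):
    """Cut a signal into non-overlapping windows (assumed); drop the remainder."""
    size = fs * window_s  # 150 data points per window
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def split_participants(ids, n_train, seed=0):
    """Split by participant so no pilot appears in both training and test sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


windows = split_windows(list(range(400)))
train_ids, test_ids = split_participants(range(1, 31), n_train=22)
print(len(windows), len(train_ids), len(test_ids))
```

Splitting by participant rather than by window keeps all windows of a given pilot on one side of the split, which avoids leaking person-specific signal characteristics into the test set.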

Establishment of SDAE Model
In the process of establishing the SDAE model, each selected parameter combination was trained and tested 10 times, and the average recognition accuracy over these runs was recorded.

Number of Hidden Layers
In this section, the characteristic indexes of the fNIRS data are used as the input to the SDAE model, so the input dimension is 864 (108 × 8). The fatigue states are the output of the model, so the output dimension is two. To achieve the best recognition performance, the number of hidden layers was set to 1, 2, 3, 4, and 5 in turn. The number of neurons in each hidden layer was set to 864. The learning rate, the noise rate, the number of iterations, and the batch size were set to 0.1, 0.1, 15, and 4, respectively. The input data were classified, and the classification results are shown in Figure 9. As the number of hidden layers increased, the identification accuracy on the training set kept rising. However, when the number of hidden layers was greater than two, the identification accuracy on the test set decreased instead of increasing. With two hidden layers, the recognition rate of the network reached 88.52% on the training set and 86.46% on the test set, which was the highest test accuracy. Therefore, two was chosen as the number of hidden layers.
Figure 9. The relationship between the number of hidden layers and recognition rate.

With the number of neurons in the first hidden layer fixed, the effect of the number of neurons in the second hidden layer on the SDAE fatigue identification accuracy was analyzed. The number of neurons in the second hidden layer was set to 27, 54, 108, 216, and 432 in turn. The accuracy results for the training set and test set are shown in Figure 11. When the number of neurons in the second hidden layer was 108, the accuracy rates of the training set and test set were 93.06% and 90.63%, respectively. Therefore, after comprehensive consideration, 108 was chosen as the number of neurons in the second hidden layer.
Figure 11. Accuracy results of different number of neurons in the second hidden layer.
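The parameter selection in these subsections proceeds one hyperparameter at a time, keeping the best value found so far before moving to the next. A minimal sketch of that procedure is given here. The candidate lists come from the text, while `tune_one_at_a_time` and `toy_evaluate` are hypothetical names; `toy_evaluate` is only a placeholder score that stands in for training the SDAE and measuring test-set accuracy, and the sketch simplifies the first-layer sweep by not tying the second layer to half the first.

```python
def tune_one_at_a_time(defaults, grid, evaluate):
    """Tune each hyperparameter in turn while the others stay fixed."""
    best = dict(defaults)
    for name, candidates in grid.items():
        # Score every candidate value with all other settings held at their current best.
        scores = {value: evaluate({**best, name: value}) for value in candidates}
        best[name] = max(scores, key=scores.get)
    return best

# Starting values; noise rate, learning rate and batch size follow the initial settings in the text.
defaults = {"hidden_layers": 1, "first_hidden": 864, "second_hidden": 432,
            "noise_rate": 0.1, "learning_rate": 0.1, "batch_size": 4}

# Candidate values listed in the text, searched in this order.
grid = {
    "hidden_layers": [1, 2, 3, 4, 5],
    "first_hidden": [864, 432, 216, 108, 54],
    "second_hidden": [27, 54, 108, 216, 432],
}

def toy_evaluate(config):
    # Placeholder score (NOT a real accuracy): a real version would train the
    # SDAE with `config` and return the test-set accuracy.
    return -(abs(config["hidden_layers"] - 2)
             + abs(config["first_hidden"] - 432) / 432
             + abs(config["second_hidden"] - 108) / 108)

best = tune_one_at_a_time(defaults, grid, toy_evaluate)
print(best)
```

A sequential search like this needs far fewer training runs than a full grid, at the cost of possibly missing interactions between hyperparameters.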

Noise Rate
The noise rate indicates the proportion of noise added to the SDAE samples during the pre-training process: the larger the noise rate, the larger the proportion of noise added. The optimal noise rate was selected through multiple trials, as shown in Figure 12. When the noise rate was 0.1, the identification accuracy of the network was the highest, at 93.06% on the training set and 90.63% on the test set. Therefore, the noise rate was set to 0.1.
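As an illustration of what a noise rate of 0.1 can mean in practice, the sketch below corrupts an input batch with masking noise, where each entry is set to zero with probability equal to the noise rate. The text does not state which noise type was used, so masking noise is an assumption here, and `corrupt` is a hypothetical helper name.

```python
import numpy as np

def corrupt(batch, noise_rate, rng):
    """Set roughly `noise_rate` of the entries to zero (masking noise, an assumption)."""
    keep = rng.random(batch.shape) >= noise_rate
    return batch * keep

rng = np.random.default_rng(0)
clean = np.ones((4, 864))            # one batch of 4 samples with 864 features
noisy = corrupt(clean, noise_rate=0.1, rng=rng)

# Because the clean batch is all ones, the mean of `noisy` is the kept fraction.
zero_fraction = 1.0 - noisy.mean()
print(f"fraction of entries zeroed: {zero_fraction:.3f}")
```

During pre-training, each denoising autoencoder would receive the corrupted batch as input and be trained to reconstruct the clean batch.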

Learning Rate
The accuracy rates of the training set and test set under different learning rates are displayed in Figure 13. When the learning rate was 0.2, the network had the highest identification accuracy, at 93.89% on the training set and 91.32% on the test set. Considering the accuracy results of both the training set and the test set, 0.2 was chosen as the learning rate of the network.
Figure 13. The relationship between learning rate and accuracy rate.

Batch Size
Batch size is the number of samples loaded into the neural network at a time. The number of training samples must be evenly divisible by the batch size. Figure 14 shows the relationship between batch size and identification accuracy rate. When the batch size was 4, the identification accuracy rate of the test set was 91.32%. Therefore, the batch size was set to four.
The optimal structure of the trained SDAE neural network is 864-432-108-2, which means that the network has a four-layer structure, as seen in Figure 15. The input layer receives the data, the first and second hidden layers perform dimensionality reduction and feature extraction, and the output layer outputs the results. The noise rate, learning rate, and batch size are 0.1, 0.2, and 4, respectively. The accuracy rates of the SDAE model on the training set and test set are 93.89% and 91.32%, respectively.
Figure 15. The relationship between batch size and accuracy rate.
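The 864-432-108-2 structure can be sketched as a plain feed-forward pass. The layer sizes and the batch size of 4 come from the text; the sigmoid hidden activations, the softmax output, the random initialization, and the helper names are assumptions made only for illustration, and the weights here are untrained.

```python
import numpy as np

LAYER_SIZES = [864, 432, 108, 2]     # input, two hidden layers, output

def init_params(sizes, rng):
    """Create one (weights, bias) pair per pair of consecutive layers."""
    return [(rng.normal(0.0, 0.01, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, x):
    """Map a batch of 864-dimensional inputs to two class probabilities."""
    h = x
    for w, b in params[:-1]:
        h = sigmoid(h @ w + b)                        # hidden layers: 864 -> 432 -> 108
    w, b = params[-1]
    logits = h @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)       # softmax over the two states

rng = np.random.default_rng(0)
params = init_params(LAYER_SIZES, rng)
probs = forward(params, rng.normal(size=(4, 864)))    # one batch of 4 samples
n_params = sum(w.size + b.size for w, b in params)
print(probs.shape, n_params)  # (4, 2) 420662
```

Almost all of the roughly 420,000 weights and biases sit in the first hidden layer, which is why halving its width from 864 to 432 reduces the model size so strongly.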

Model Performance Evaluation
In this section, the identification results of the SDAE model are compared with those of the traditional classification models SVM and LDA to verify the accuracy and effectiveness of the model. The SVM and LDA models were each trained on the training set and evaluated on the test set. The confusion matrices of the SDAE, SVM, and LDA models are shown in Figure 16, and the identification accuracy rates are given in Table 7.
Figure 16 demonstrates that the accuracy of the SDAE model under the two fatigue states is 93.06% and 89.58%, respectively. The accuracy of the LDA model under the two fatigue states is 72.22% and 63.89%, respectively, and that of the SVM model is 77.08% and 73.61%, respectively. The accuracy of the SDAE model is therefore higher than that of the LDA model and the SVM model under both fatigue states.
Table 7 shows that the identification accuracy rate of the SDAE model is 23.26 percentage points higher than that of the LDA model and 15.97 percentage points higher than that of the SVM model. Therefore, the pilot fatigue identification model based on the SDAE established in this paper has high identification accuracy.
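The reported gaps can be cross-checked from the per-state accuracies quoted for Figure 16. The short sketch below does this under the assumption that the two fatigue states have the same number of test samples, so that the overall accuracy is the plain average of the two per-state values; the variable names are illustrative.

```python
# Per-state accuracies (%) as quoted in the text for the two fatigue states.
per_state = {
    "SDAE": (93.06, 89.58),
    "LDA": (72.22, 63.89),
    "SVM": (77.08, 73.61),
}

# Overall accuracy as the unweighted mean of the two states
# (assumes equally many test samples per state).
overall = {model: sum(acc) / len(acc) for model, acc in per_state.items()}

gap_lda = overall["SDAE"] - overall["LDA"]
gap_svm = overall["SDAE"] - overall["SVM"]

for model, acc in overall.items():
    print(f"{model}: {acc:.2f}%")
print(f"SDAE minus LDA: {gap_lda:.2f} percentage points")
print(f"SDAE minus SVM: {gap_svm:.2f} percentage points")
```

Under this assumption the SDAE average reproduces the reported 91.32%, and the two gaps agree with the reported 23.26 and 15.97 to within the rounding of the per-state inputs. The differences are therefore absolute differences in accuracy, not relative improvements.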
To further evaluate the classification performance of the models, four indicators are used in our study: precision, recall, F1 score, and the receiver operating characteristic (ROC) curve. The precision, recall, and F1 score of the SDAE, LDA, and SVM models are listed in Table 8, and the ROC curves are shown in Figure 17.
Figure 17. ROC curves of three models.
Table 8 demonstrates that, compared with the LDA model and the SVM model, the SDAE model has higher precision, recall, and F1 score, indicating that the classification precision and the sensitivity of the SDAE model are higher.
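For reference, precision, recall, and F1 score follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical, chosen only to make the arithmetic easy to follow, and are not taken from Figure 16 or Table 8.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard metrics for one class treated as the positive class."""
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical counts for illustration only (not the paper's data).
metrics = binary_metrics(tp=90, fp=10, fn=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the precision is 0.90 and the recall is 0.75, which shows how a classifier can be precise while still missing a quarter of the positive cases; the F1 score (about 0.82) summarizes both.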
As can be seen from Figure 17, all three curves lie above and to the left of the 45° diagonal. However, the SDAE model curve is closer to the (0,1) point than the LDA and SVM model curves, indicating that the SDAE model performs better.
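The reading of the ROC curve used above can be made concrete with a small sketch that sweeps the decision threshold over classifier scores, then reports the area under the curve and the smallest distance from the curve to the (0,1) point. The labels and scores are hypothetical, the helper names are illustrative, and the sweep assumes that all scores are distinct.

```python
import numpy as np

def roc_points(labels, scores):
    """Return (fpr, tpr) arrays from a threshold sweep over distinct scores."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")     # highest score first
    sorted_labels = labels[order]
    tp = np.cumsum(sorted_labels == 1)             # true positives as the threshold drops
    fp = np.cumsum(sorted_labels == 0)             # false positives as the threshold drops
    tpr = np.concatenate([[0.0], tp / tp[-1]])
    fpr = np.concatenate([[0.0], fp / fp[-1]])
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def distance_to_ideal(fpr, tpr):
    """Smallest Euclidean distance from any ROC point to the ideal point (0, 1)."""
    return float(np.min(np.sqrt(fpr ** 2 + (1.0 - tpr) ** 2)))

# Hypothetical labels (1 = fatigue) and scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_points(labels, scores)
auc = area_under_curve(fpr, tpr)
dist = distance_to_ideal(fpr, tpr)
print(f"AUC = {auc:.4f}, distance to (0,1) = {dist:.4f}")
```

A curve that hugs the upper-left corner has a larger area and a smaller distance to (0,1), while a classifier no better than chance follows the 45° diagonal with an area of 0.5.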
Based on the above accuracy rate, precision rate, recall value, F1 score and ROC curve results, the SDAE model is superior to the LDA model and the SVM model. The SDAE model can effectively identify the two fatigue states of the pilot.

Discussion
To establish the identification model of the pilots' fatigue state, pilots were recruited as participants, the airfield traffic pattern was selected as the flight task, and a wireless wearable near-infrared device was used for data acquisition. The flight simulation experiment had less interference from the pilot operation and a low cost of data acquisition. Therefore, the obtained fNIRS data are of great significance for studying the identification of pilots' fatigue status and the effect of fatigue on pilots' cognitive activities.
The fNIRS signals were acquired from the prefrontal lobe of the pilots' brains. The results of the study show that the degree of brain fatigue can be reflected by fNIRS data from the prefrontal lobe, which is consistent with the results of relevant research [17,42]. Meanwhile, the relative concentration of HbO2 in the fNIRS signal was extracted. The findings suggest that the relative concentration of HbO2 can reflect the degree of brain fatigue, which is consistent with the results of related studies [19,22].
In our study, mean value, variance, standard deviation, kurtosis, skewness, coefficient of variation, peak value, and range of HbO2 for each channel were extracted as the characteristic indexes. Furthermore, the fatigue status of pilots was identified by these characteristic indicators. In related studies, fatigue states have been analyzed by experts and scholars based on indicators such as mean value, variance, skewness, and kurtosis [43]. Our results enrich the research of characteristic indicators of fNIRS.
This means that fNIRS could be a powerful tool for studying fatigue. The relevant characteristics of fNIRS may be indicators of fatigue. Therefore, fNIRS has good prospects in fatigue-related research.
Due to the limitation of time and effort, fatigue was divided into only two states in this paper: non-fatigue and fatigue. In future studies, pilots' fatigue states could be further refined and identified. The participants in our study were all male pilots, so female pilots could be recruited in future research to expand the scope of the study. Thirty pilots were recruited for the flight simulation experiment in our study, and the number of participants could be increased in future studies to improve reliability through a larger sample size.

Conclusions
In this study, wearable near-infrared equipment was used to obtain pilots' fNIRS data in the flight simulation experiment, and 1080 valid samples were selected. Eight feature indexes (mean, variance, standard deviation, kurtosis, skewness, coefficient of variation, peak value, and range) of HbO2 were extracted from each sample to establish the feature parameter set. These were used as the input of the SDAE model to train and test the pilots' fatigue state identification model. The identification accuracy of the SDAE model is 91.32%, which is 23.26 and 15.97 percentage points higher than that of the LDA and SVM models, respectively. Therefore, the pilots' fatigue state identification model based on the SDAE established in this paper has high identification accuracy. The SDAE model proposed in this paper can provide a new theoretical basis for the identification of pilots' fatigue. This research can provide directions for monitoring and early warning of pilot fatigue during flight missions by using fNIRS in the future. This is conducive to improving the pilot fatigue risk management system (FRMS) and reducing flight accidents caused by pilot fatigue.