Adaptive Neuro-Fuzzy Fusion of Multi-Sensor Data for Monitoring a Pilot’s Workload Condition

To realize an early warning of unbalanced workload in the aircraft cockpit, it is required to monitor the pilot’s real-time workload condition. For the purpose of building the mapping relationship from physiological and flight data to workload, a multi-source data fusion model is proposed based on a fuzzy neural network, mainly structured using a principal components extraction layer, fuzzification layer, fuzzy rules matching layer, and normalization layer. Aiming at the high coupling characteristic variables contributing to workload, principal component analysis reconstructs the feature data by reducing its dimension. Considering the uncertainty for a single variable to reflect overall workload, a fuzzy membership function and fuzzy control rules are defined to abstract the inference process. An error feedforward algorithm based on gradient descent is utilized for parameter learning. Convergence speed and accuracy can be adjusted by controlling the gradient descent rate and error tolerance threshold. Combined with takeoff and initial climbing tasks of a Boeing 737–800 aircraft, crucial performance indicators—including pitch angle, heading, and airspeed—as well as physiological indicators—including electrocardiogram (ECG), respiration, and eye movements—were featured. The mapping relationship between multi-source data and the comprehensive workload level synthesized using the NASA task load index was established. Experimental results revealed that the predicted workload corresponding to different flight phases and difficulty levels showed clear distinctions, thereby proving the validity of data fusion.


Introduction
The rapid development of the air transportation industry puts forward higher requirements on aircraft performance and safety than ever before. By means of structural optimization, system integration, and redundant design, airborne systems are being highly complicated and coupled. However, the cockpit is the only interface between pilot and aircraft, integrating all the human-machine interaction equipment required to perform flight tasks. Novel features constantly arise during pilot-aircraft interaction, including large-scale information, simplex interactive mode, multiple interactive nodes, and high real-time requirements. Therefore, it is highly possible for pilots to be caught in an unbalanced workload condition. Future advanced flight support systems are expected to monitor and evaluate the pilot's real-time workload with multi-source data, which could obtain a timely and effective warning of overload status. The same holds true for ground controllers of unmanned aerial vehicles (UAV), in which case, workload still cannot be neglected, although the pilot is detached from the aircraft. Furthermore, it is of great benefit for improving the design of human-machine interaction and flight operation procedures to ensure flight safety.
Related research in pilot workload domains mainly covers two directions. Among them, aviation physiologists and psychologists concentrate on the mechanism of workload initiated by cognition, while human factors engineers hold interests in qualitative and quantitative evaluation of human-machine interaction design. The productions of these studies include, but are not limited to, the changing rule of physiological characteristic parameters or statistical subjective feedbacks of operators.
In the 1960s and 1980s, aerospace research institutes gradually developed evaluation techniques like Cooper-Harper scales, NASA task load index (NASA-TLX) and subjective workload assessment technique (SWAT) [1][2][3][4], and successfully applied them in various types of aircrafts. However, such methods mostly rely on pilot's subjective feedback and thus there exist limitations in reflecting workload comprehensively. Later work tended to combine physiological characteristics with evaluation scales to increase the credibility of conclusions. Nocera et al. analyzed the sensitivity of the spatial dispersion index to mental workload changes based on a pilot's eye movements in different flight phases [5]. Borghini et al. quantified the relationship between neurophysiological activities and mental workload [6]. Hoepf et al. addressed the feasibility of using physiological data to evaluate cognitive workload [7]. Then, they compared the advantages and disadvantages of different neural network models in assessment, and found that the evaluation results of different models were not always consistent with subjective perceptions, which should be strengthened in terms of adaptability [8]. Based on NASA-TLX and Likert scales, Orlandi et al. studied the effects of shiphandling maneuvers on psychological and physiological reactions of marine pilots in combination with heart rate, heart rate variability, and pupil dilation [9]. Wulvik et al. found the relationship between mental state and physiology in a simulated control room, but it lacked the quantitative description of mathematical models [10]. Ziegler et al. compared workload-based performance in two simulated flight tasks to determine different sensors' predictive power, but the work was limited to only the direct comparison of physiological indexes [11]. A general concern about the aforementioned studies is the generalization ability of the workload model. The physiological parameters that could best reflect the workload remains controversial. As a result, the major limitation appears when we have to give up the existing one to rebuild another from scratch if other physiological indices are added into the model.
Other research has focused on the practicability of evaluation techniques, and thus were centered on improving subjective assessment methods. Liu et al. proposed to investigate the effects of time pressure and target uncertainty on a UAV operator's workload based on subjective scores and secondary task performance [12]. Rusnock et al. reported on a new method called workload profiles to continuously evaluate workload without interrupting operations, avoiding the widely known inability of subjective evaluation [13]. Chi et al. used subjective mental workload ratings, adapted from NASA-TLX scales, to help determine the training time for new workers to get familiar with new tasks [14]. Nevertheless, the main weakness still lies in how to handle the widely varied personal opinions to reach a convincing conclusion.
Even though there is an persistent call for the intelligent identification of an operator's workload, it has only been partly implemented in some fields, including automatic driving vehicle, artificial intelligence, and medical science.
Kong et al. presented a driving fatigue recognition method based on the phase lock value (PLV) of electroencephalogram (EEG) signal [15]. In addition, they also developed a driving fatigue detection system based on machine vision and an AdaBoost algorithm by capturing the driver's real-time facial and eyeball features [16]. Sun et al. proposed an adaptive dynamic recognition model for vehicle driver's fatigue based on a multi-level feature fusion [17]. Through introducing dynamic basic probability assignment (BPA), the weight of each feature source can be adapted to the change of fatigue. Ke et al. constructed a mental workload recognition model using feature selection and regression analysis of an EEG [18]. What makes their work stand out is the supportability for cross-task situations. Zammouri et al. designed an EEG workload classifier combining power spectral density (PSD) analysis and statistical criteria. Based on this, a brain-computer interface was developed for real-time workload evaluation in cognitive learning [19]. McDonald et al. proposed machine learning approaches for detecting driver distraction, with a training set of 21 algorithms [20].
What can be summarized from above-mentioned studies is that physiological parameter recording and analysis is still one of the most popular measures in workload detection. It has advantages in real-time performance while other means fail. In the domain of physiology and medicine, research used physiological indices to monitor health condition as well. Yamada et al. studied the method of identifying mental fatigue in different age groups based on eye movement indicators [21]. Lamti et al. established a mental workload assessment model by fusing the features in an EEG [22]. Zhang et al. fused EEG spectral and temporal features using two-stream neural networks for mental workload assessment [23].
An operator's workload condition is dually affected by his physiological state and task performance in a complex task environment. As a result, research emphasis should be placed on the strong coupling between the two factors. For monitoring the pilot's workload condition in the cockpit, both real-time performance and generalization ability of the proposed method should receive significant attention. In contrast to the aforementioned methods, the model proposed in this paper aimed to assess a pilot's real-time workload condition quantitatively by fusing multi-source physiological and flight data based on a fuzzy neural network. More specifically, to cover the shortage of qualitative and delayed evaluation, this study raises an objective solution by mapping real-time captured multi-source data to an overall workload level. Additionally, to make up the deficiency of incompatibility and one-sidedness from model to model, this study leaves the expansibility for any data flows concerned with workload to future research.
The remainder of this paper is divided into five sections. Section 2 gives a detailed explanation for each part of the model. Section 3 addresses the parameter learning method to help realize the ability of self-adaption. Section 4 outlines a case to demonstrate the feasibility and reasonability of applying the data fusion model to monitor the pilot's real-time workload. Section 5 discusses the experimental results from different perspectives. Finally, conclusions are drawn in Section 6.

Model Structure
In a real flight environment, pilots need to make cognitive decisions in constantly changing situations according to their own abilities acquired from training and work experience. Due to the uncertainty in probability, it is suitable to describe their behavior patterns with fuzzy system. However, owing to the lack of ability toward autonomous learning and adaption, modeling with a fuzzy system is clearly not commensurate with the intelligent attributes of pilots who act as agents in abstract models. On the other hand, a neural network has excellent learning performance by mining feature information from samples and adapting parameter values to the needs of different occasions, though it cannot form abstract rules to express knowledge. As a result, fuzzy neural network has been widely used in modeling nonlinear dynamic systems with the advantages of both sides. Based on this, a high-performance data fusion model can be constructed with feature parameters. For basic concepts about fuzzy neural networks, see References [24,25]. Meanwhile, refer to References [26][27][28][29] for performances of fuzzy neural networks in related studies such as flight control and fault diagnosis. Furthermore, they have also gained popularity in other areas of science such as pattern recognition, data classification, image processing, and robot control [30][31][32][33][34].
A pilot's workload condition can be reflected by physiological responses during flight and the ability to maneuver the aircraft according to a flight plan. Therefore, the monitoring data related to the workload condition includes flight data and a pilot's physiological data. Flight data can be obtained from sensors installed in the airframe structure or airborne system and saved in flight recorders. Key flight characteristic parameters include pitch angle, yaw angle, roll angle, airspeed, climbing rate, etc. Furthermore, the pilot's physiological data can be collected from EEG, ECG, eye tracker, and the like. Key physiological characteristic parameters include heart rate, heart rate variability, respiratory rate, respiratory depth, pupil diameter, etc. Taking key flight or physiological characteristic parameters x 1 , x 2 , . . . , x k as inputs, and the pilot's integrated workload level y as the output, the mapping between inputs and output can be expressed as: Based on this mapping relationship, a fuzzy neural network model with multiple layers can be constructed, as shown in Figure 1. In Layer 1, feature quantities x i (i = 1, 2, . . . , k) are extracted from original monitoring data as the inputs of the fuzzy neural network. The balance between accuracy and resource demand can be achieved via fusion with feature data rather than raw data.
Layer 2 is the principal component extraction layer. The principal components of the input feature data are extracted through a linear transformation into a new variable X p (p = 1, 2, . . . , n). It can help guarantee the accuracy of results and further improve the fusion efficiency.
Layer 5 is the normalization layer, which aims to normalize the applicability of different fuzzy rules. Layer 6 is the output layer. In this layer, the data fusion result will be defuzzified using connection weight w j (j = 1, 2, . . . , m), and the pilot's integrated workload level is the output.
Overall, the model conforms to the basic structure of Mamdani fuzzy neural networks [35]. What makes it stand out from standard models is the dimension reduction process before fuzzification. Thus, it is expected to improve fusion efficiency when dealing with complex flight tasks compared with other similar applications.

Principal Component Analysis
High coupling exists in the feature data that is inputted into the model. Flight characteristic parameters, such as pitch angle, yaw angle, roll angle, airspeed, and climbing rate, are interrelated with each other under the constraints of flight dynamics. Pilot's physiological characteristic parameters, such as heart rate, heart rate variability, respiratory rate, respiratory depth, pupil diameter, etc., are linked as the task scenario changes. In order to reduce the complexity, the main information of the feature data is extracted using principal component analysis. Thereby, the original k variables can be linearly combined to form n new variables, and then a fuzzy membership relationship will be constructed for the reasoning analysis.
Assume that the observation matrix of k variables in a monitoring time series [1,t] is: After normalization, the correlation coefficient matrix of X is where r ij represents the covariance of x i and x j (i, j = 1, 2, . . . , k). Based on the eigenvalues λ i (i = 1, 2, . . . , k) of matrix R, the respective contribution rates of different principal components can be calculated using: According to the cumulative contribution rate of principal components, the number of new combination variables X p (p = 1, 2, . . . , n) can be determined. It is generally considered that if the cumulative contribution rate reaches 70-90%, combination variables can represent most information of the original variables [36].
Taking the eigenvector a i = [a i1 a i2 . . . a ik ] (i = 1, 2, . . . , k) of matrix R to be linear transform coefficients, the original variables x i (i = 1,2, . . . , k) can be combined into n new ones using: By using new combination variables for fuzzy reasoning, the scale of fuzzy layer nodes will be compressed to 1/ k p=n+1 m p of the original one. Therefore, the efficiency of the data fusion can be improved.

Fuzzy Membership
Layer 3 of the fuzzy neural network is used to determine the membership of different combination variables to each fuzzy attribute; that is to say, the input characteristic quantities are blurred in this layer.
In order to maintain the balance between control precision and operation speed, applicable fuzzy partitions should be determined after several repeated attempts. If eight fuzzy partitions are chosen, for example, then m 1 = m 2 = . . . = m n = 8 in this layer. The corresponding fuzzy language values are NB (negative big), NM (negative medium), NS (negative small), NZ (negative zero), PZ (positive zero), PS (positive small), PM (positive medium), and PB (positive big).
In order to obtain better control sensitivity, the membership function can be a composite form of Z-shaped, Gaussian, and S-shaped membership functions, namely: • Z-shaped membership function: where a Z pq denotes the shoulder value and b Z pq denotes the foot value of the Z-shaped membership function.
• Gaussian membership function: where c pq denotes the mean value and σ pq denotes the standard deviation of the Gaussian membership function. • S-shaped membership function: where a S pq denotes the foot value and b S pq denotes the shoulder value of the S-shaped membership function.

Fuzzy Control Rules
Each node in Layer 4 represents a fuzzy rule. Under the premise that the number of new combination variables X p in Layer 2 is n, the number of nodes in Layer 4, denoting m, can be determined using: According to the logical relationship between the inputs and outputs of the fuzzy neural network, fuzzy control rules can be constructed based on "if-then" language, which is like: If (X 1 is NB) and (X 2 is NB) and . . . and (X n is NB), then (y is NB).
If (X 1 is NM) and (X 2 is NB) and . . . and (X n is NB), then (y is NB). . . . If (X 1 is PB) and (X 2 is PB) and . . . and (X n is PB), then (y is PB).
Obviously, in real conditions, the fuzzy membership of each fuzzy control rule is not consistent. Define the applicability of fuzzy control rule, R j (j = 1, 2, . . . , m), to represent the minimum value of membership function in antecedent rules according to: After determining the applicability of each fuzzy control rule, in order to ensure the comparability of a pilot's integrated workload level and speed up the optimization algorithm for parameter learning based on a gradient descent, Layer 5 normalizes the applicability by calculating:

Defuzzification
What the fuzzy neural network finally outputs in Layer 6 is the deblurred pilot's integrated workload level by giving a connection weight w j (j = 1, 2, . . . , m) to each normalized applicability value in Layer 5. The calculation method can be expressed as: If the fuzzy neural network needs to achieve multi-dimensional outputs, that is, to construct a multi-input and multi-output (MIMO) mapping relationship, then the defuzzification calculation method can be expressed as: where r indicates the number of outputs.

Parameter Learning
The weights connecting Layers 5 and 6, namely w j , need to be determined through adaptive learning. An error feedforward algorithm based on a gradient descent can be applied for parameter learning, and the algorithm flow is shown in Figure 3. For a fuzzy neural network with r-dimensional outputs, the error cost function is: where y ei represents the ith expected output and y ai represents the ith actual output. The first-order gradient of the error cost function in the direction of the weight variable is: where w ij denotes the weight connecting the ith node in Layer 6 and jth node in Layer 5; R j denotes the value of the jth node in Layer 5. The iterative equation for error feedforward parameter learning based on a gradient descent is: where β represents the parameter learning rate. The convergence rate of parameter learning can be controlled by adjusting the gradient falling rate.
Substituting Equation (15) into Equation (16), the cumulative calculation equation can be derived to be: By setting an error tolerance threshold τ, the iterative learning process will be terminated when suitable parameter values are obtained. At this point, the fuzzy neural network will reach the optimal state via training.
The formalized description of the algorithm is shown in Table 1. Table 1. Formalized description of the error feedforward algorithm.

Case Study
From the perspective of cognitive psychology, individual performance can be measured from multiple aspects including: (1) time to perform the task, (2) accuracy in performing the task, and (3) neurological data [37]. In the cockpit, the "human-machine-environment" closed-loop circuit during a flight mission is shown in Figure 4. A pilot's integrated workload level can be reflected by the physiological state, time requirement, and task performance. These elements run through the overall process composed of perception, decision-making, behaving, etc. In this section, a typical flight task from a civil aircraft operation manual will be taken as background. By constructing the "human-in-the-loop" simulation flight scenario, physiological parameter-measuring instruments, flight-parameter recorder, and other sensing devices were used to capture the state characteristics of the flight simulator and personnel. A multi-sensor data fusion model was constructed based on a fuzzy neural network to realize the mapping from a multi-source feature data to workload level. Through these studies, we wish to offer an effective solution for monitoring the pilot's workload condition in real time.

Synthetic Flight Simulation Experimental System
The synthetic flight simulation experimental system is mainly composed of a Boeing 737-800 cockpit simulator, annular vision module, physiological monitoring module, flight data recording module, and general control station. The cockpit simulator and annular vision provide an immersive operation experience for flight personnel. A physiological monitoring module obtains real-time physiological characteristics through an SMI eye tracker (SensoMotoric Instruments company, Teltow, Germany) and a BIOPAC multi-channel physiological recorder (BIOPAC Systems Inc., Goleta, USA). The flight data recording module transmits the time-varying data of dynamic parameters during flight manipulation to the general control station. The experiment scenes are shown in Figure 5.

Simulated Flight Scene
We referred to the Shanghai Airlines Standard Operating Procedures (SOP) for a Boeing 737-800 aircraft, and supposed the left-seat pilot maneuvers the aircraft. It was also assumed that the takeoff course was in the same direction as the runway. Analog experimental operation procedures were developed on the basis of a normal takeoff and initial climbing procedures. The steps of responding to the air traffic controller (ATC) commands, setting the heading, setting the navigation mode, etc., which were not closely related to the direct flight maneuver, were omitted to form a simplified version, as shown in Table 2.
Since the left-seat pilot was responsible for maneuvering the aircraft, the evaluator performed the role of a right-seat pilot in the simulated flight in order to make the variables of the left-seat pilot relatively independent. Regardless of the interactions between pilots due to task assignments, it was assumed that the tasks of the right-seat pilot can be performed accurately and in a timely manner.
According to the setting of the simulated flight scene, key indicators for measuring the pilot's workload condition extracted from flight data during takeoff and the initial climbing phase included pitch angle, heading, and airspeed [38,39]. Key indicators that characterize the state of the pilot extracted from physiological data included heart rate, heart rate variability, respiratory rate, respiratory depth, pupil diameter, gaze time, glance time, and blink time. Table 2. Analog takeoff and initial climbing operation procedures.
Step No.

Pilot Flying (PF) Pilot Monitoring (PM)
1 Align the runway, and be ready to take off when airspeed is greater than 45 knots.
Slowly push the thrust handle and slide out the aircraft when N1 is within 40%.
2 Call "take off" when airspeed is greater than 45 knots.
Push thrust to 40% N1, and make the left and right thrusts symmetrical. Call "40, stable" when parameters are stable.

3
Push the steering column a little forward and keep the direction with the rudder.
Call "130 knots" when the airspeed is 3-5 knots in advance of 130 knots.

4
Answer "130 knots" and hold the steering column to get ready to rotate. Call "rotate" when airspeed is 140 knots.

5
Softly rotate (2 • -3 • per second) at V R speed. Then, bring the aircraft to an initial attitude angle of 15 • .
Call "positive climbing rate" when the altimeter indicates positive climbing rate. 6 Call "raise the landing gear" after confirming positive climbing.
Raise the landing gear when hearing the calling.

7
Hold the flight attitude and rise at a speed of 165-175 knots.
Call "1000 feet" when reaching the height of 1000 feet.

8
Call "flaps up" when observing the speed is above 165 knots and there is a clear growth trend.
Place the flap handle at 0, and answer "flaps back in place" when the flaps indicate 0.

12
Put the pitch angle to about 3 • and keep the flight altitude constant.
Call "experiment ends" when flight height remains the same.

Feature Extraction
Left-seat flight personnel were composed of fourteen people who already had training experience in similar flight simulators such as Boeing 737 series, Airbus 320 series, and Cessna 172. Before the formal experiment, they had mastered the flight control and operation procedures of a Boeing 737-800 simulator through necessary instruction and training.
In the experiment, each person encountered three different abnormal conditions, including primary flight display (PFD) images disappearing, left aileron jamming, and left engine in-flight shutdown. Any failure was randomly triggered during the takeoff and the initial climbing phase so that the flight personnel could not make predictions in advance. Therefore, no psychological expectation would be considered either.
Flight and physiological data under different conditions were recorded in real time. Once finishing each flight mission, the flight personnel filled in NASA-TLX scale questionnaires according to their own experience. Subjective evaluations of flight utility were carried out based on six dimensions: mental demand (MD), physical demand (PD), temporal demand (TD), performance (OP), effort (EF), and frustration (FR) [2,3].
In different simulated flight scenes, three kinds of commands, including start and stop commands, flight control commands, and fault observation commands, were extracted as segment signs for different flight phases, namely: • Start and stop commands: "experiment starts," "experiment ends." • Flight control commands: PF calling "take off," PM calling "130 knots," PM calling "rotate," PF calling "raise the landing gear," PF calling "flaps up," PM calling "flare out." • Fault observation commands: "PFD images disappear," "left engine stops." Among them, PFD images disappearing and left engine in-flight shutdown belong to explicit failures, that is, PF/PM can obtain fault information through display interfaces. The difference is that the former can be obtained directly by PF through the visual channel, but the latter was acquired by the PF capturing PM's calling through the auditory channel. In addition, left aileron jamming was set to be an implicit failure, that is, PF/PM could not obtain relevant information through any display interface. The purpose was to distinguish PF's physiological characteristics under unknown failure conditions. These commands were helpful to segment each flight process into different phases including taxiing, normal climbing, maneuvering under failure, and flaring out.
According to the requirements of different single-point failures or combined failures on PF's flight maneuvering skill, the mission difficulty can be divided into three levels:  Tables 3-5. Subjective workload evaluation results using NASA-TLX are shown in Table 6.
The flight and physiological characteristic data initially input into the model have defects such as non-uniform dimensions and inconsistent time-varying regularity. It is easy to cause low computational efficiency and data logic confusion via direct fusion. Therefore, it is necessary to standardize and squeeze the original feature data first.

Data Standardization
What can be obtained from flight characteristic data is the pilot's ability to stably maintain the current flight control state after the failure. In this paper, the degree of stability was measured using population standard deviations of pitch angle (x 1 ), heading (x 2 ), and airspeed (x 3 ) during the takeoff and initial climbing phase.
What can be obtained from the physiological characteristic data is the pilot's physiological and psychological stress when they try to maintain the established flight plan after the failure. In this paper, the degree of concentration and tension was measured using mean values of heart rate (x 4 ), heart rate variability (x 5 ), respiratory rate (x 6 ), respiratory depth (x 7 ), and pupil diameter (x 8 ), as well as statistical proportions of gaze time (x 9 ), glance time (x 10 ), and blink time (x 11 ).
Based on the research results in the fields of physiology, medicine, ergonomics, etc. [40][41][42][43], the above-mentioned physiological characteristic parameters can be divided into the following two categories according to the change relations with workload. The first category contains the parameters positively related to workload level, namely heart rate (x 4 ), heart rate variability (x 5 ), respiratory rate (x 6 ), pupil diameter (x 8 ), and gaze time (x 9 ). The second category contains the parameters negatively related to workload level, namely respiratory depth (x 7 ), glance time (x 10 ), and blink time (x 11 ).
Considering the individual differences in flight and physiological characteristic data, the maximum and minimum values of characteristic parameters remain unknown. In a few cases, outlier data beyond the normal range of values will also appear. Therefore, the z-score standardization method was adopted for data preprocessing, transforming original data to standard normal distributed sequences using: For the subjective evaluation results after the flight, since the NASA-TLX scale already limits the score range to [0,100], the output data did not need to be standardized. However, in order to achieve a concentrated opinion score, the six-dimensional NASA-TLX scores should be averaged on the assumption that each factor holds a comparable weight in this given task.

Dimension Reduction
In order to improve the efficiency of data fusion, principal component analysis was then performed on the standardized feature data to reduce the data size in the fuzzification layer. During the flight process, characteristic parameters have a strong coupling, with the change of one parameter causing changes through the linkage with others. After the dimension reduction process, the contribution rate of each principal component and the cumulative contribution rate of the first n (n = 1, 2, . . . , 11) principal components are shown in Table 7. It can be seen from Table 6 that the cumulative contribution rate of the first four and five principal components reached 77.16% and 85.44%, respectively, which can meet the standard set in Jolliffe [36] to be able to cover most of the original information. Using the eigenvectors of the correlation coefficient matrix as coefficients to linearly combine the normalized physiological characteristic data in Tables 2-4, expressions of the first five principal components are listed below: P 1 = 0.1158x 1 +0.7878x 2 + 0.1080x 3 + 0.0173x 4 + 0.0777x 5 − 0.0481x 6 − 0.0258x 7 −0.3583x 8 − 0.3346x 9 + 0.3219x 10 + 0.0320x 11 (19) P 2 = 0.0715x 1 −0.3914x 2 + 0.1602x 3 − 0.0265x 4 + 0.0031x 5 + 0.2395x 6 − 0.3314x 7 −0.1859x 8 − 0.4805x 9 + 0.1213x 10 + 0.6066x 11 (20)

Data Fusion
The number of variables in Layer 2, n, was set be four or five for data fusion. By weighing the fusion accuracy and computing cost, the number could be finally determined.
By selecting the suitable fuzzy membership and fuzzy control rules through several attempts, the fuzzy neural network can be used to train the connection weight w j between Layer 5 and Layer 6. In this case, fuzzy membership functions were chosen to be shaped like those in Section 2.2, but the number of fuzzy partitions was set to be four, with initial parameters a Z pq = −1.5, b Z pq = −0.5, c pq = 0.2690, σ pq {±0.5}, a S pq = 0.5, and b S pq = 1.5. The number of w j appeared as an exponent relation to the number of variables in Layer 2. The changes in scale can also reflect the consumption of computing resources. Different parameter learning rates cause different convergence speeds. When the number of principal components is large, choosing a high learning rate can speed up the process of data fusion.

n = 4
In this case, the number of w j was 256. Due to the fact that the minimum division value of the NASA-TLX scale is 5, the error threshold was also set to be 5. When choosing the parameter learning rate β to be 0.25, 0.50, and 0.75, the convergence relation between the output error cost and learning times is shown in Figure 6. Under the learning rate of 0.75 and error tolerance threshold of 5, the final error cost was 4.85 after 21 learning iterations. The comparison between the desired and actual outputs is shown in Figure 7.

n = 5
In this case, the number of w j is 625. With parameter learning rate 0.25, 0.50, and 0.75 respectively, the convergence relation between output error cost and learning times is shown in Figure 8. Under the learning rate 0.75 and error tolerance threshold 5, the final error cost is 4.96 after 25 times' learning. The comparison between desired and actual outputs is shown in Figure 9.  Comparison from the perspectives of parameter learning rate, number of iterations, and root-mean-square error under different numbers of variables in Layer 2 is shown in Table 8. It can be seen that as the number of variables increased, the number of iterations within the expected error threshold range increased significantly, and the computing cost increased gradually. In addition, the root-mean-square error increased overall, such that the ratio of benefit to cost continued to deteriorate. It can be concluded that, although increasing the number of principal components could improve the convergence precision, it was at the cost of computing resources. When the error cost first fell into the predetermined threshold, both the iteration speed and error values in the first case (n = 4) are superior to those in the second case (n = 5). As a result, the first four principal components of flight and physiological characteristic data were extracted for modeling.

Discussion
By comparing the same flight personnel's workload during different flight phases in the same simulated flight, or the workload when encountering different failures in different simulated flights, the model performance of feature data fusion and workload prediction could be verified. The prediction results are shown in Table 9. Consistent with the NASA-TLX scale, the predicted workload values in different flight phases are shown in Table 10. There were four abnormal values that exceeded the normal value range [0,100] in prediction results, accounting for 2.38% of the total number. The main reasons for this phenomenon may include: • Abnormal feature attributes in sampling data before fusion: This may be caused by environmental changes or interference from other channels during sensor measurement. The model could not map these data points into a normal value range according to normal decision logic.

•
Side effects of using the z-score normalization method in data preprocessing: On the one hand, the z-score standardization method is applicable to the case where the maximum or minimum value is uncertain, which is consistent with the characteristics of individual differences in physiological characteristic parameters. On the other hand, the z-score normalization method cannot eliminate abnormal data outside the normal range, while it also avoids the impact of abnormal points on the overall data set.
• Limitations of the training sample sets: The fuzzy neural network was trained based on the data samples during failure. The data samples collected in this stage contained obvious feature attributes, which had advantages in identifying or predicting the severe workload. However, it showed insufficient sensitivity for flight phases where feature changes were not obvious and overall workload stayed low. In this study, training samples obtained during failure could also form the baseline where workload stayed at a high level. The establishment of a baseline is one of the most common handling methods in workload research [44,45]. Using this method, data samples collected in other phases can function as validation sets to evaluate the performance. Through contrasting statistical workload values in different flight phases or at difficulty levels, determining whether the proposed model serves the purpose of distinguishing the changing workload reasonably can be done. The predicted workload values in taxiing, normal climbing, maneuvering under failure, and flaring out are shown in Figure 10. Obviously, the mean workload under failure conditions was the highest, and workload in the taxiing phase was higher than that in the normal climbing phase. The reason may be complex. When the aircraft is taxiing, flight personnel need to observe different information such as alignment with the runway, takeoff speed, and rotation speed. In this case, the workload is mainly composed of mental load, while the pilot only needs to control the attitude angle in a normal climbing stage, with physical load playing a leading role. Furthermore, workload in a flaring out stage fell between the ones in the taxiing phase and normal climbing phase. It can be seen that the impact of failures was greater than that of operation complexity on maneuvering performance. In different flight phases, the average workload under different difficulty levels predicted by the model is shown in Figure 11. It can be seen that workload values could not distinguish between each other before the single-point failures or combined failures were triggered (Phases 1 & 2). After the failure was triggered, the workload increased significantly with the increase of operation difficulty, and Phases 3 and 4 exhibited the same phenomenon. Obviously, the effect of "left engine in-flight shutdown" was greater than that of "left aileron jamming" on flight maneuvering. Because of the existence of standby instruments, "PFD images disappearing" had the least impact on flight maneuvering, which is consistent with the pre-experimental assumption and basic perception of flight. To further our research, more physiological measuring means should be applied to monitor the pilot's workload condition. For instance, another feature source we have decided to take into account is EEG data, which is well acknowledged for its close connection with mental workload. However, due to the proficiency requirements of data collection and analysis, it remains controversial regarding how to quantify the relations between brain waves and mental workload. Despite that, we hope to enhance the fusion ability of our model on accuracy, sensitivity, and robustness with the introduction of such physiological indexes. The validity of this method can be further proved using questionnaire surveys among test pilots or airline pilots. To apply workload-monitoring techniques in real flights, other issues to consider mainly include the impacts on flight safety and compatibility with airborne systems. In other words, whether real-time workload monitoring will meet airworthiness requirements still requires much exploration.

Conclusions
In this paper, a multi-source data fusion model for monitoring the pilot's workload condition based on a fuzzy neural network is proposed. The case analysis of simulated takeoff and initial climbing process shows that the model can realize the fusion of key flight and physiological characteristic parameters such as pitch angle, heading, airspeed, heart rate, heart rate variability, respiratory rate, respiratory depth, pupil diameter, fixation time, glance time, and blink time. It also possesses the ability to map to an integrated workload level characterized by NASA-TLX's six-dimensional metrics. Furthermore, the fusion and prediction results could distinguish the different flight phases and maneuvering difficulties based on multi-source feature data.
The proposed model reveals the mechanism of mapping physiological and flight feature data to overall workload. It gives insight into the cross-coupled factors contributing to the pilot's mental state and makes the most of multi-source information to avoid one-sidedness. Furthermore, it realizes the transformation from experience-dependent estimation to a data-driven evaluation in the field of man-machine system science. Moreover, in real-world applications, what are studied in this paper can provide basic support for real-time monitoring and early warning of the pilot's workload status. Specifically, in the stage of airworthiness verification, the work can help to recognize the factors that may result in overload and improving the design of interfaces or operating procedures. In the stage of pilot training, the findings can also serve as the evaluation basis of training effects. In the stage of airline operation, the monitoring technique can warn crew members of an overload state to prevent further deterioration, thereby reducing the risk of air crew disability.