Article

Exploring the Impact of Emotional States on Fatigue Evolution in Metro Drivers: A Physiological Signal-Based Approach

School of Urban Railway Transportation, Shanghai University of Engineering Science, Songjiang Campus, No. 333 Longteng Road, Songjiang District, Shanghai 201620, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(6), 2653; https://doi.org/10.3390/app16062653
Submission received: 20 January 2026 / Revised: 9 March 2026 / Accepted: 9 March 2026 / Published: 10 March 2026
(This article belongs to the Section Transportation and Future Mobility)

Abstract

To investigate the regulatory effects of emotional states on the evolution of fatigue in metro drivers, this study conducts an experimental investigation based on an urban rail transit driving simulation platform. A total of 21 participants complete a 90 min simulated driving task, during which electroencephalogram (EEG) and electrocardiogram (ECG) signals are synchronously collected from drivers for fatigue assessment and emotion recognition, respectively. An emotion recognition model based on a multi-scale convolutional neural network (MSCNN) combined with an attention mechanism is constructed. The proposed model uses ECG signals to classify three emotional states—neutral, positive, and negative—where the neutral state is defined as an emotionally undefined baseline that is neither positive nor negative. The model achieves a classification accuracy of 86.96% on the DREAMER dataset. By temporally aligning the emotion recognition results with EEG frequency-domain fatigue indicators, the results show that fatigue exhibits the highest growth and largest fluctuation in amplitude under negative emotions, demonstrating a pronounced fatigue-accelerating effect. Under positive emotions, fatigue decreases considerably and fluctuates less, indicating a certain buffering and restorative effect. In contrast, the neutral emotional state exhibits intermediate and transitional fatigue characteristics. This study innovatively integrates ECG-based emotion recognition with EEG-based fatigue assessment to reveal the mechanisms by which emotions influence fatigue in metro driving tasks from a physiological perspective. This work provides a basis for emotion-aware fatigue monitoring and safety intervention strategies.

1. Introduction

The operational safety of urban rail transit largely depends on the driver. Studies have shown that human actions and decision-making are dominant contributors to train accidents [1]. The tasks of metro drivers require high concentration and repetitive operations. Prolonged engagement in monotonous yet highly responsible duties may lead to the accumulation of negative emotions, which in turn may exacerbate the accumulation of physiological fatigue [2]. As an important psychological variable influencing drivers’ cognitive functions and behavioral decision-making, emotions play a critical moderating role in the generation and evolution of fatigue. Negative emotions tend to accelerate fatigue accumulation and reduce work vigilance, whereas positive emotions may, to some extent, buffer the adverse effects of fatigue and promote psychological recovery and task persistence [3]. Therefore, accurately identifying metro drivers’ emotional states during driving and examining how different emotions are associated with temporal changes in fatigue are essential for understanding fatigue development and improving operational safety in metro systems.
In addition to fatigue, operational monotony and routine task exposure have been widely recognized as critical human risk factors in railway and transportation safety. Long-duration repetitive operations under highly standardized procedures may lead to vigilance decrement, reduced situational awareness, and gradual cognitive disengagement [4]. Previous studies have shown that routine-based task environments are strongly associated with human error in major transportation accidents, particularly under conditions of low stimulation and sustained attention demand. The interaction between monotony, mental workload, and physiological fatigue has been identified as a key contributor to performance degradation in rail operations [5].
Although routine-induced vigilance decline has been extensively studied, less attention has been given to the dynamic role of emotional states in modulating fatigue evolution under such monotonous operational contexts. Emotional fluctuations may either exacerbate cognitive depletion under repetitive tasks or buffer fatigue accumulation through affective regulation mechanisms. Therefore, studying the mechanism of how emotions affect fatigue under daily subway driving conditions is crucial for a more comprehensive understanding of safety risks related to human factors.
In addition to operational monotony, environmental factors such as lighting conditions have been shown to significantly influence attention, wakefulness, and emotional states [6]. Exposure to different light spectra, particularly blue-enriched light, has been associated with melatonin suppression, enhanced alertness, and a modulation of affective responses [7]. Recent interdisciplinary research combining transportation science and neuroscience has emphasized the role of ambient light in shaping driver emotions and cognitive performance. Variations in light quality and color perception within transportation environments may therefore contribute to emotional fluctuations and vigilance regulation [8]. However, the present study focuses specifically on the interaction between physiological emotion recognition and fatigue evolution under controlled simulated driving conditions, where lighting parameters were kept stable to minimize external variability. The influence of dynamic lighting environments represents an important direction for future research.
In this study, emotional categories are defined based on the valence dimension of the widely accepted valence–arousal emotional model. From a scientific perspective, positive emotions refer to affective states characterized by high valence, typically associated with pleasant, rewarding, or desirable experiences. Negative emotions correspond to low-valence affective states, which are commonly linked to unpleasant, stressful, or adverse experiences. The neutral emotional state represents an intermediate valence level that is neither distinctly positive nor negative. This classification framework is consistent with established affective neuroscience and emotion recognition research [9].
Numerous studies have shown that emotional fluctuations can affect drivers’ behavior during driving tasks. Under different emotional states, drivers exhibit an activation of the sympathetic nervous system, which in turn influences reaction time, decision-making, and overall driving performance [10]. Moreover, emotional states can also induce driving fatigue, thus leading to increased drowsiness, reduced attention, and diminished reaction capability.
The existing studies on emotion recognition can be mainly divided into three categories [11]. First, methods based on driving behavior characteristics, such as maximum lane deviation, steering angle, and acceleration–deceleration patterns, capture observable variations in driving performance. However, these behavioral indicators are influenced by multiple factors, including task demands, operational conditions, and individual driving styles. As a result, they cannot directly or specifically represent the underlying emotional valence of the driver and may lack sensitivity in detecting instantaneous emotional fluctuations [12]. Second, methods based on facial and ocular features, such as eye movement trajectories, pupil diameter, and blink frequency, perform well under stable lighting and fixed viewing conditions but are highly susceptible to variations in illumination, camera angles, and occlusion, resulting in limited robustness [13]. Third, methods based on drivers’ physiological indicators, such as electroencephalogram (EEG), electrocardiogram (ECG), and pulse signals, analyze changes in nervous system activity and can therefore objectively and accurately reflect a driver’s emotional state in real time. This third class of methods is regarded as the most promising in current emotion recognition research, as physiological signals are not subject to deliberate human control and authentically reflect genuine emotional states [14]. Physiological signals characterize the physiological regulation processes underlying drivers’ internal states in a more direct manner, without relying on external environments or overt behavioral manifestations, thereby providing more stable and fine-grained emotional representations in dynamic driving contexts. Therefore, adopting physiological signals as the primary information source for emotion recognition effectively meets this study’s requirements for the real-time and reliable detection of emotional variations [15].
This work presents an emotion recognition model based on a multi-scale convolutional neural network (MSCNN) combined with an attention mechanism to classify drivers’ emotional states using ECG signals. In this proposed method, preprocessed electrocardiogram (ECG) features are used as inputs. The proposed model first extracts the local dynamic features at different temporal scales by using multi-scale one-dimensional convolutional modules. By employing convolution kernels of different sizes, the proposed model captures the emotion-related short-term fluctuations as well as long-term trends, thereby enhancing the temporal representational capacity of the extracted features. Subsequently, an attention mechanism is introduced to adaptively allocate feature weights to emphasize the key features that contribute most to emotion discrimination while suppressing noise and redundant information. This improves the model’s discriminative performance and feature interpretability. The feature vectors obtained after multi-scale convolution and attention weighting are fed into fully connected layers, and a Softmax classifier is used to classify the input signal into three emotional states: neutral, positive, and negative. During the training process, the cross-entropy loss function is adopted as the optimization objective, the Adam optimizer with a learning rate of 1 × 10−3 is used for parameter updating, and K-fold cross-validation is incorporated to enhance model stability and generalization capability. The training and evaluation phases integrate multiple visualization analyses, including accuracy metrics, confusion matrices, and temporal visualizations of emotion labels for comprehensively assessing the model’s classification performance and dynamic response capability in emotion recognition for metro driving tasks.
After the training process is completed, the trained model is applied to ECG data collected during metro driving tasks to evaluate its performance in classifying drivers’ emotional states. The data for performing experiments are obtained from a metro driving simulation platform. EEG and ECG signals are synchronously collected from all participants during the experiment. The EEG data are subject to feature extraction for assessing fatigue levels, while the ECG data are used for emotion recognition. The emotion labels predicted by the trained model are temporally aligned with fatigue indicators derived from EEG signals over the same time segments, and statistical analyses are conducted to quantify the effects of different emotional states on fatigue levels. By comparatively analyzing the variation trends of fatigue indicators under different emotional categories, dynamic regulatory characteristics of emotions on the formation and evolution of driver fatigue are extracted, thereby obtaining analytical data on the roles of positive, neutral, and negative emotions during fatigue accumulation and regulation phases. These analytical results not only help in quantifying the magnitude and direction of the effects of emotions on driving fatigue, but also provide data support and theoretical foundations for real-time emotion monitoring and fatigue intervention strategies for urban rail transit drivers. The overall workflow of the proposed method is illustrated in Figure 1.

2. Materials and Methods

This section describes the experimental design, participant information, simulation procedures, physiological signal acquisition and preprocessing, feature extraction methods, and the architecture and training strategy of the proposed emotion recognition model.

2.1. Participants in the Driving Simulation Experiment

To minimize inter-subject variability and ensure experimental control, 21 participants were recruited for this study. The participants consisted of undergraduate and graduate students aged between 22 and 26 years who had completed at least one semester of structured metro driving simulation training. Although they were not professional metro drivers, the experimental design aimed to investigate the physiological mechanisms underlying the influence of emotional states on fatigue development under standardized and controlled driving conditions.
The relatively homogeneous age range was selected to reduce variability in baseline physiological responses and cognitive performance, thereby strengthening internal validity. While age-related differences in fatigue recovery may exist, the present study focuses on relative fatigue dynamics across emotional states rather than absolute fatigue thresholds.
Considering that the majority of metro drivers are male, all participants in this study were male to maintain demographic consistency and reduce gender-related physiological variability. All participants reported good health with no history of major neurological or cardiovascular disorders and completed the Vienna Test System (VTS) assessment, with results meeting or exceeding standard cognitive performance criteria. Participants were instructed to avoid alcohol, caffeine, and medication for 24 h prior to the experiment and to obtain at least 8 h of sleep the night before testing to minimize confounding factors related to circadian rhythm and physiological fatigue.

2.2. Simulation Driving Experimental Tasks and Procedures

An overview of the experiments and procedures is illustrated in Figure 2. This study employs an urban rail transit driving simulation system for conducting experiments by simulating metro driving to reproduce real-world operating conditions. Shanghai Metro Line 3 is used to achieve a high-fidelity representation of the actual driving environment. EEG signals are measured using the NeuSen W-series wireless EEG acquisition system. This system ensures high-quality multimodal physiological signal collection during simulated driving tasks. In addition, an ECG acquisition device from the CAPTIV series is used to collect the data of the participants. The data recording and analysis are performed using the CAPTIV-L7000 multimodal physiological and behavioral data acquisition and analysis system.
To control for circadian rhythm variability and minimize inter-day physiological fluctuations, all experiments were conducted within a fixed time window (14:00–16:00). This period corresponds to the well-documented post-lunch circadian dip, during which healthy adults typically exhibit reduced alertness and increased susceptibility to fatigue [16]. Conducting the experiment during this naturally fatigue-prone interval facilitated the observation of fatigue evolution under emotionally modulated conditions while maintaining standardized experimental control. The driving simulations required participants to operate trains on both elevated and underground sections under simulated clear weather conditions. During operations, the drivers are required to control train acceleration and deceleration to ensure smooth operations without exceeding the speed limits of each track section. Upon arrival at a station, the drivers must confirm the signals and manually stop the train before passenger boarding. Prior to experiments, the participants are required to complete a basic information questionnaire reporting their demographic information (e.g., gender, age, and major) and whether they had consumed any stimulant or sedative foods or medications. At the same time, the participants also complete an informed consent form and are briefed regarding the experimental procedures. All the participants had undergone at least one semester of simulated driving training prior to the experiment to reduce the differences in familiarity with the driving simulator.
At the beginning of the experiment, the participants are first required to complete the Karolinska sleepiness scale (KSS) and the positive and negative affect schedule (PANAS) to report their subjective alertness and emotional experiences. Afterwards, all the participants practiced the task until all performance metrics reached the predefined criteria and then rested for five minutes. Subsequently, the participants performed a 90 min driving task, during which the KSS questionnaire was administered every 15 min. At the end of the task, the participants completed the KSS and PANAS again to report their subjective experiences under current conditions. The post-experiment KSS and PANAS data are used in subsequent analyses to validate and complement the accuracy of the physiological data assessment results. ECG and EEG signals are continuously measured throughout the entire experiment. In addition, during driving simulations, the participants are restricted from consuming food or beverages, using mobile devices, or communicating with the staff performing the experiments. Participants received standard financial compensation for their time and participation, which was fixed and independent of task performance. The compensation was provided solely as an ethical reimbursement for participation and was not contingent upon task performance, emotional state, or experimental outcomes, thereby minimizing potential motivational or affective bias. Meanwhile, baseline emotional assessments (PANAS) were collected prior to task initiation to ensure that initial affective states were not systematically influenced by participation compensation.

2.3. Data Acquisition and Preprocessing

In this study, the physiological signals of metro drivers are synchronously collected in a multimodal manner, including electrocardiogram (ECG) and electroencephalogram (EEG) signals.
EEG signals are recorded using the Neuracle NeuSen W-series wireless EEG system, as shown in Figure 2, which are then stored in BDF file format. EEG data were read and processed using the MNE-Python library (version 1.9.0), which supports the offline loading of raw EEG recordings for subsequent preprocessing and analysis. During the preprocessing stage, a 1–40 Hz band-pass filter is applied to the EEG signals to remove low-frequency drift and high-frequency noise, followed by z-score normalization to standardize signal amplitudes and reduce inter-subject variability [17]. In the data segmentation step, EEG signals are divided into fixed-length windows of 400 data points to ensure temporal consistency with the ECG signals. The power features are extracted from three frequency bands, i.e., θ (4–8 Hz), α (8–13 Hz), and β (13–30 Hz). An additional θ/β ratio is computed as a fatigue-sensitive indicator to reflect the changes in cerebral functional activity during different stages of driving.
ECG signals are recorded using the CAPTIV L7000 system, with the electrodes placed horizontally on the chest and connected by leads suspended in front of the chest. This portable configuration effectively captures the electrocardiogram signals, as illustrated in Figure 2. The ECG data are stored in UTF-16-encoded CSV files with a sampling rate of 128 Hz. The program identifies and extracts the columns containing time and ECG information based on header detection and preferentially uses the pre-filtered signals. During preprocessing, a fourth-order Butterworth band-pass filter (0.5–40 Hz) combined with bidirectional zero-phase filtering is applied to remove low-frequency drift and high-frequency noise. Subsequently, z-score normalization is applied to the signals to eliminate any inter-individual amplitude differences. The preprocessed ECG signals are segmented into fixed-length windows of 400 data points each, and the mean and standard deviation curves are computed at window level to characterize the temporal variations in ECG features.
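As an illustration, the ECG preprocessing chain described above (fourth-order Butterworth 0.5–40 Hz band-pass with bidirectional zero-phase filtering, z-score normalization, and segmentation into 400-sample windows) can be sketched as follows; the function name and the synthetic input signal are illustrative assumptions, not the authors’ implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 128       # ECG sampling rate (Hz), as stated in the text
WINDOW = 400   # fixed window length in samples

def preprocess_ecg(raw, fs=FS, window=WINDOW):
    """Band-pass filter, normalize, and window a raw ECG trace."""
    # Fourth-order Butterworth band-pass (0.5-40 Hz); filtfilt performs
    # bidirectional zero-phase filtering, matching the description above.
    b, a = butter(4, [0.5, 40.0], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, raw)
    # z-score normalization to remove inter-individual amplitude differences
    normalized = (filtered - filtered.mean()) / filtered.std()
    # segment into non-overlapping fixed-length windows of 400 points
    n_windows = len(normalized) // window
    segments = normalized[: n_windows * window].reshape(n_windows, window)
    # window-level mean and standard deviation curves
    return segments, segments.mean(axis=1), segments.std(axis=1)

# Example on synthetic data: 60 s of a noisy 1.2 Hz oscillation (not real ECG)
t = np.arange(0, 60, 1 / FS)
raw = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(len(t))
segments, means, stds = preprocess_ecg(raw)
print(segments.shape)  # (19, 400): 7680 samples -> 19 windows of 400 points
```

The window-level `means` and `stds` correspond to the mean and standard deviation curves mentioned above; real recordings would of course be loaded from the CSV files rather than synthesized.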

2.4. Feature Extraction

This section describes the extraction of EEG- and ECG-based features used in the present study. Specifically, EEG frequency-domain indicators were derived to quantify fatigue levels during the simulated driving task, whereas ECG-related features were extracted to provide physiological inputs for emotion recognition and subsequent analysis of the influence of emotional states on fatigue evolution.

2.4.1. EEG-Related Indicators

After preprocessing the EEG signals, frequency-domain analysis is performed using the fast Fourier transform (FFT). This is accomplished by transforming the time-domain EEG signals into frequency domain signals to extract the power features from three main frequency bands, i.e., θ (4–8 Hz), α (8–13 Hz), and β (13–30 Hz) [18]. In this study, the specific computational procedure of the fast Fourier transform (FFT) is described as follows:
1.
Discrete Fourier Transform (DFT)
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j\frac{2\pi}{N}kn}, \quad k = 0, 1, \ldots, N-1. \tag{1}$$
In (1), $x(n)$ denotes the amplitude of the EEG time-domain signal of length $N$ at the $n$-th sampling point, $N$ represents the length of the analysis window, $X(k)$ represents the complex amplitude of the $k$-th frequency component of the signal in the frequency domain, $k$ represents the frequency index, $e^{-j\frac{2\pi}{N}kn}$ denotes the complex exponential basis function that maps the time-domain signal to the frequency domain, and $j$ represents the imaginary unit.
2.
Power spectral density (PSD) based on FFT
$$\mathrm{PSD}(f_k) = \frac{1}{N f_s} \lvert X(k) \rvert^2. \tag{2}$$
In (2), $\mathrm{PSD}(f_k)$ denotes the power spectral density of the EEG signal at frequency $f_k$, $\lvert X(k) \rvert^2$ represents the squared amplitude of the corresponding frequency component, $N$ represents the window length, $f_s$ denotes the sampling rate, and $f_k$ represents the $k$-th frequency point.
3.
Band power calculation
$$P_\theta = \sum_{f=4}^{8} \mathrm{PSD}(f), \quad P_\alpha = \sum_{f=8}^{13} \mathrm{PSD}(f), \quad P_\beta = \sum_{f=13}^{30} \mathrm{PSD}(f). \tag{3}$$
In (3), $\mathrm{PSD}(f)$ denotes the power spectral density at frequency $f$, and $P_\theta$, $P_\alpha$, and $P_\beta$ represent the total power of the $\theta$, $\alpha$, and $\beta$ frequency bands, obtained by summing the PSD over 4–8 Hz, 8–13 Hz, and 13–30 Hz, respectively. $P_\theta$ reflects increasing drowsiness and cognitive load, $P_\alpha$ is associated with relaxation and internally directed attention, and $P_\beta$ reflects alertness and executive function.
4.
Extraction of the EEG-based fatigue indicator ($\theta/\beta$ ratio)
$$R_{\theta/\beta} = \frac{P_\theta}{P_\beta}. \tag{4}$$
In (4), the $\theta/\beta$ ratio is a commonly used indicator in EEG signal analysis for assessing fatigue and attention levels. When drivers are engaged in prolonged monotonous driving or high-workload tasks, $\theta$-band power increases while $\beta$-band power decreases, leading to an elevated $\theta/\beta$ ratio that reflects fatigue accumulation and attention decline [19]. The $\theta/\beta$ ratio therefore comprehensively reflects the cognitive activation levels of the drivers under different conditions and serves as an important EEG feature for characterizing fatigue development during driving tasks. This key EEG feature is used as a quantitative representation of train operators’ fatigue level in the data analysis of the metro driving simulation experiment. To reduce inter-subject variability and ensure comparability across participants, the $\theta/\beta$ ratio time-series was further normalized using subject-wise z-score standardization:
$$\hat{R}_{\theta/\beta}(t) = \frac{R_{\theta/\beta}(t) - \mu_R}{\sigma_R}. \tag{5}$$
In (5), $\mu_R$ and $\sigma_R$ represent the mean and standard deviation of $R_{\theta/\beta}(t)$ for the corresponding participant over the entire driving task. The normalized $\hat{R}_{\theta/\beta}(t)$ was used as the final EEG-based fatigue indicator to quantitatively represent train operators’ fatigue level in the metro driving simulation experiment.
5.
Construction of EEG-based fatigue time-series
To characterize fatigue dynamics during the metro driving simulation experiment, the EEG-based fatigue indicator $R_{\theta/\beta}(t)$ was computed sequentially to form a fatigue time-series for each participant. Time-series variations were quantified through point-to-point comparisons between consecutive values of $R_{\theta/\beta}(t)$. Specifically, for each time index $t$ ($t \ge 2$), fatigue was considered to increase when $R_{\theta/\beta}(t) > R_{\theta/\beta}(t-1)$ and to decrease when $R_{\theta/\beta}(t) < R_{\theta/\beta}(t-1)$. Based on these comparisons, the proportions of fatigue increase and fatigue decrease were calculated by counting the numbers of increasing and decreasing transitions, respectively, and normalizing them by the total number of transitions within the analyzed segment. This procedure provides a consistent quantitative definition of fatigue variation and supports the subsequent statistical analysis.
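The pipeline of Equations (1)–(5) and the transition-counting procedure above can be sketched in Python as follows; the sampling rate, window contents, and half-open band edges (used here to avoid double-counting boundary bins shared by adjacent bands) are illustrative assumptions.

```python
import numpy as np

def band_power(window, fs, lo, hi):
    """Total PSD-based power of one EEG window in [lo, hi) Hz, per Eqs. (1)-(3)."""
    n = len(window)
    X = np.fft.rfft(window)                      # DFT via FFT, Eq. (1)
    psd = (np.abs(X) ** 2) / (n * fs)            # PSD estimate, Eq. (2)
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    band = (freqs >= lo) & (freqs < hi)
    return psd[band].sum()                       # band power, Eq. (3)

def fatigue_series(windows, fs):
    """Subject-wise normalized theta/beta ratio per window, Eqs. (4)-(5)."""
    r = np.array([band_power(w, fs, 4, 8) / band_power(w, fs, 13, 30)
                  for w in windows])             # Eq. (4)
    return (r - r.mean()) / r.std()              # z-score, Eq. (5)

def transition_proportions(series):
    """Proportions of point-to-point fatigue increases and decreases."""
    diffs = np.diff(series)
    total = len(diffs)
    return (diffs > 0).sum() / total, (diffs < 0).sum() / total

# Synthetic example: a 6 Hz (theta-dominated) vs a 20 Hz (beta-dominated) window
fs, n = 200, 400
t = np.arange(n) / fs
theta_win = np.sin(2 * np.pi * 6 * t)
beta_win = np.sin(2 * np.pi * 20 * t)
r_theta = band_power(theta_win, fs, 4, 8) / band_power(theta_win, fs, 13, 30)
r_beta = band_power(beta_win, fs, 4, 8) / band_power(beta_win, fs, 13, 30)
print(r_theta > r_beta)  # True: the theta-dominated window has the larger ratio
```

In practice the windows would be the 400-point EEG segments described in Section 2.3, computed per participant so that the z-score in `fatigue_series` matches the subject-wise normalization of Equation (5).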

2.4.2. ECG-Related Indicators

After preprocessing the ECG signals, statistical features are extracted from each window to capture the changes in the activity of the autonomic nervous system across different stages of driving. In this study, the following features from ECG signals are extracted:
1.
The mean and standard deviation of the ECG signal, denoted $\mu$ and $\sigma$, are extracted from each window and reflect the central tendency and dispersion of ECG fluctuations, respectively [20]:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i, \quad \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}. \tag{6}$$
In (6), $\mu$ denotes the mean amplitude of the ECG signal within the window and reflects the overall level of cardiac activity, $x_i$ represents the amplitude at the $i$-th sample, and $N$ indicates the number of samples in a window, i.e., 400.
2.
The R–R interval is calculated by detecting the positions of R peaks in the ECG signal, and then computing the time intervals between adjacent R waves:
$$\mathrm{RR}_i = t_{R,i+1} - t_{R,i}. \tag{7}$$
In (7), $\mathrm{RR}_i$ denotes the duration of the $i$-th cardiac cycle, which is obtained by calculating the difference between the occurrence times of two adjacent R waves, $t_{R,i}$ and $t_{R,i+1}$. Based on the R–R intervals, the instantaneous heart rate $HR_i$ is calculated as follows:
$$HR_i = \frac{60}{\mathrm{RR}_i}. \tag{8}$$
In (8), $HR_i$ denotes the $i$-th instantaneous heart rate.
3.
The mean heart rate $\overline{HR}$ is also computed as a time-domain statistical indicator, mathematically expressed as follows:
$$\overline{HR} = \frac{1}{M} \sum_{i=1}^{M} HR_i. \tag{9}$$
In (9), $\overline{HR}$ denotes the mean heart rate within the window, $HR_i$ represents the $i$-th instantaneous heart rate, and $M$ denotes the number of R–R intervals within the window, i.e., the number of detected heartbeats minus one [21].
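A minimal sketch of Equations (7)–(9) follows; the simple threshold-based peak detector is an illustrative stand-in for a proper QRS detector (e.g., Pan–Tompkins), and the synthetic spike train is not real ECG data.

```python
import numpy as np
from scipy.signal import find_peaks

FS = 128  # ECG sampling rate (Hz), as stated in the text

def heart_rate_features(ecg, fs=FS):
    """R-peak-based RR intervals and heart rate, per Eqs. (7)-(9)."""
    # R peaks: prominent maxima at least 0.4 s apart (assumes HR <= 150 bpm);
    # a production system would use a dedicated QRS detector instead.
    peaks, _ = find_peaks(ecg, height=0.5 * ecg.max(), distance=int(0.4 * fs))
    t_r = peaks / fs                  # R-peak occurrence times (s)
    rr = np.diff(t_r)                 # Eq. (7): RR_i = t_{R,i+1} - t_{R,i}
    hr = 60.0 / rr                    # Eq. (8): instantaneous heart rate (bpm)
    return rr, hr, hr.mean()          # Eq. (9): mean heart rate over the window

# Synthetic ECG-like train: one sharp Gaussian "R wave" every 0.8 s (75 bpm)
t = np.arange(0, 10, 1 / FS)
ecg = np.exp(-((t % 0.8 - 0.4) ** 2) / (2 * 0.01 ** 2))
rr, hr, mean_hr = heart_rate_features(ecg)
print(f"mean HR = {mean_hr:.1f} bpm")
```

With a 0.8 s beat spacing the recovered mean heart rate is close to 75 bpm, up to the sampling-grid quantization of the detected peak positions.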
After completing the process of feature extraction from EEG and ECG signals, this study uses EEG features to assess fatigue levels. In order to explore the emotional states of drivers during driving tasks and quantify their relationship with fatigue accumulation, this study builds and trains an emotion classification model based on a multi-scale convolutional neural network (MSCNN) combined with an attention mechanism. The ECG features are used as the input for the emotion recognition model.

2.5. Model Architecture and Training

In this study, a multi-scale convolutional neural network (MSCNN) combined with an attention mechanism is built that uses the features extracted from ECG signals to classify the emotional states of drivers. During the training process, in addition to the binary emotion model (positive and negative emotional states), a neutral state is introduced to represent an emotionally undefined baseline condition that is neither positive nor negative. The inclusion of the neutral category facilitates the identification of transitions from a stable state toward positive or negative emotional directions and avoids semantic ambiguity. The proposed model adopts a one-dimensional convolutional architecture that uses the preprocessed ECG unified feature sequence vectors as inputs [22]. The overall preprocessing workflow is illustrated in Figure 3.
The core architecture of the proposed model adopts three parallel one-dimensional convolutional branches with kernel sizes of 3, 5, and 9, respectively, to extract rapid rhythmic variations from the ECG physiological data, as well as the emotional dynamics over longer temporal ranges. The multi-scale convolution captures hierarchical information ranging from short-term local fluctuations to cross-temporal variations. Batch normalization and max-pooling layers are employed to further ensure the stability of feature extraction and effective temporal compression.
To further enhance the model’s sensitivity towards emotion-related key features, a channel attention mechanism is applied on multi-scale convolutional outputs to adaptively weight the importance of features from different branches. The attention module employs global average pooling to extract channel-wise statistical features and generate channel weight vectors based on a two-layer fully connected structure with nonlinear activations. These outputs are then multiplied element-wise with the multi-scale features to obtain the attention-enhanced deep feature representations. This mechanism significantly improves the model’s focus on ECG physiological features, while suppressing noise, thereby achieving a higher discriminative capability in recognizing neutral, positive, and negative emotional states.
The attention-enhanced feature sequences are flattened and fed to a two-layer fully connected network for final classification. The first fully connected layer introduces nonlinearity via a Rectified Linear Unit (ReLU) activation function to further integrate the multimodal features. The second dense layer employs a Softmax activation function to generate the probability distribution over three emotion categories. The proposed model is trained using the DREAMER (Database for Emotion Recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices) dataset, a publicly available physiological emotion dataset [23], which features high-quality annotations, validated reliability, well-structured data, and strong sample consistency, thus making it suitable for the convolutional neural network approach adopted in this study. The dataset is split into training and test sets in an 8:2 ratio to ensure consistent distributions across the three emotion categories. Subsequently, TensorDataset and DataLoader are constructed to enable mini-batch data iteration. Cross-entropy loss is used as the optimization objective, and model parameters are updated using the Adam optimizer, with a learning rate set to 1 × 10−3. A batch size of 32 is selected, and the number of training epochs is set to 100. The validation performance is continuously monitored during the training process. The model weights that result in the highest classification accuracy are selected to ensure model stability and generalization capability. The overall model architecture is illustrated in Figure 4.
The network architecture was designed according to the temporal characteristics of ECG signals and established practices in physiological signal classification. A one-dimensional convolutional structure was adopted because ECG data are sequential time-series signals. The use of multi-scale convolutional branches with kernel sizes of 3, 5, and 9 was intended to capture short-term rhythmic variations, intermediate temporal patterns, and longer-range fluctuations, respectively. This design enables hierarchical feature extraction across different temporal resolutions [24]. The attention mechanism was introduced to enhance channel-wise feature weighting, allowing the model to emphasize emotion-relevant representations while suppressing noise [25]. Such attention-based enhancement has been widely adopted in physiological signal recognition tasks to improve discriminative capability. Hyperparameters were selected based on preliminary experiments and common practices in deep learning for physiological data. The Adam optimizer with a learning rate of 1 × 10−3 was employed due to its adaptive gradient properties and stable convergence behavior. A batch size of 32 was used to balance gradient stability and generalization performance under moderate sample size conditions. The number of training epochs was set to 100, while model performance was monitored on the validation set to select the best-performing weights and mitigate overfitting.
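To make the attention step concrete, below is a minimal NumPy sketch of the squeeze-and-excitation-style channel weighting described above (global average pooling over time, a two-layer fully connected bottleneck with nonlinearities, and element-wise rescaling of the multi-scale features); the weight matrices are random placeholders standing in for learned parameters, and the channel count, sequence length, and reduction factor are illustrative assumptions rather than the authors’ configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, reduction=4, rng=rng):
    """Channel attention on a (channels, time) feature map.

    Weights are random placeholders for learned parameters.
    """
    c, _ = features.shape
    # "Squeeze": global average pooling over the temporal axis -> (c,)
    squeezed = features.mean(axis=1)
    # "Excitation": two-layer fully connected bottleneck with nonlinearities
    w1 = rng.standard_normal((c // reduction, c)) / np.sqrt(c)
    w2 = rng.standard_normal((c, c // reduction)) / np.sqrt(c // reduction)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # ReLU
    weights = sigmoid(w2 @ hidden)                # channel weights in (0, 1)
    # Rescale: multiply each channel of the feature map by its weight
    return features * weights[:, None], weights

# E.g., multi-scale branch outputs concatenated into 16 channels x 100 steps
features = rng.standard_normal((16, 100))
attended, weights = channel_attention(features)
print(attended.shape)  # (16, 100): same shape, channels reweighted
```

In the actual model this operation sits between the multi-scale convolutional branches and the fully connected classifier, with the bottleneck weights learned jointly with the rest of the network.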
In summary, the integration of a multi-scale CNN and an attention mechanism enables the proposed model to capture the emotion-related information embedded in physiological signals across different temporal scales and to reinforce key emotional features through adaptive feature weighting, thereby significantly enhancing its ability to discriminate among emotion categories. The training results demonstrate that the proposed model achieves good generalization performance on the DREAMER dataset and can be effectively applied to emotion classification in the simulated driving experiments. The proposed model therefore provides a reliable emotion recognition tool for the subsequent analyses of the mechanisms by which emotional states influence fatigue evolution.

3. Results and Analysis

To evaluate the performance of the trained model in recognizing emotional states (positive, neutral, and negative), multiple evaluation metrics are adopted in addition to accuracy, including precision, recall, the F1-score, and the confusion matrix, because the label distribution of the dataset used in this study is imbalanced. In addition, a baseline CNN model comprising a two-layer one-dimensional convolution–pooling architecture is trained and used for comparison with the proposed model.
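For reference, all of the metrics listed above can be derived from a single confusion matrix. The following sketch shows the standard computation; the matrix is purely illustrative and does not contain the study's results.

```python
import numpy as np

def metrics_from_confusion(cm):
    """Accuracy, per-class precision/recall, and macro-F1 from a confusion
    matrix whose rows are true classes and columns are predicted classes."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                      # correctly classified samples per class
    precision = tp / cm.sum(axis=0)       # column sums = predicted counts
    recall = tp / cm.sum(axis=1)          # row sums = true counts
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1.mean()

# Hypothetical 3-class confusion matrix (counts are illustrative only)
cm = [[80, 10, 10],
      [ 8, 85,  7],
      [ 6,  9, 85]]
acc, prec, rec, macro_f1 = metrics_from_confusion(cm)
```

Macro-averaging (the unweighted mean of per-class F1-scores) is the appropriate choice here precisely because the label distribution is imbalanced: it prevents the majority class from dominating the summary score.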

3.1. Overall Model Performance

Table 1 presents a comparison of classification performance based on the DREAMER dataset between the baseline CNN model composed of a two-layer one-dimensional convolution–pooling architecture and the proposed multi-scale CNN with an attention mechanism.
The evaluation results show that the multi-scale CNN combined with an attention mechanism outperforms the conventional CNN in terms of accuracy, precision, recall, and F1-score. This is because the multi-scale convolutional architecture simultaneously captures the short-term fluctuations and long-term emotional variation trends embedded in the ECG signals, enhancing the richness and stability of the feature representations. Moreover, the attention mechanism adaptively adjusts the feature weights in response to emotional changes, allowing emotion-relevant signals to exert a greater influence while suppressing noise and redundant information, which improves the model's discriminative capability.
Although the classification accuracy does not approach 100%, emotion recognition based on physiological signals inherently involves inter-individual variability and signal noise. Achieving an overall accuracy of 86.96% on a three-class emotion task demonstrates a strong discriminative capability, particularly when compared to the baseline CNN model (70.84%).

3.2. Analysis of Emotion Recognition Classification Results

As shown in Figure 5, the confusion matrix indicates that the proposed model performs stably across the three emotion categories, with the highest recognition rates observed for the "neutral" and "negative" categories: 139 of 161 neutral samples (86.34%) and 144 of 163 negative samples (88.34%) were correctly classified. For the "positive" category, the model correctly identified most samples, although some were misclassified as "neutral" or "negative". This indicates that the proposed model is sensitive to subtle differences between neutral and adjacent emotional states. Overall, the concentration of values along the diagonal of the confusion matrix suggests that the model effectively distinguishes among the emotion categories, demonstrating good classification reliability.
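As a quick arithmetic check, the per-class recognition rates quoted above follow directly from the reported sample counts:

```python
# Per-class recognition rates for the confusion matrix in Figure 5:
# correctly classified samples divided by the total samples of that class.
neutral_rate = 139 / 161 * 100    # neutral: 139 of 161 correct
negative_rate = 144 / 163 * 100   # negative: 144 of 163 correct

print(round(neutral_rate, 2))     # 86.34
print(round(negative_rate, 2))    # 88.34
```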
Figure 6 presents a line chart of the predicted emotions against sample indices. The chart shows that the proposed model effectively captures emotional fluctuations during the driving process. Recognition is consistent within stable intervals, and emotion transition points display clear stage-wise characteristics, indicating good temporal sensitivity in processing continuous physiological signals. Although a small number of boundary samples are misclassified, this does not affect the identification of the main trends in emotional variation.

3.3. Analysis of the Effects of Emotions on Driving Fatigue in Metro Driving Tasks

To elucidate the regulatory effects of emotions on the evolution of driving fatigue, this study utilizes a trained emotion recognition model to obtain the emotion label sequences of each participant throughout the simulated driving task. Subsequently, the emotion labels are temporally aligned with the fatigue indices derived from the EEG features. The fatigue variation segments corresponding to positive, neutral, and negative emotional states are extracted to construct emotion-specific fatigue time-series curves. Figure 7 illustrates the temporal variations in fatigue under three emotional states for a representative participant during the experiment, where (a) shows the fatigue time-series curve under positive emotion, (b) shows the fatigue time-series curve under neutral emotion, and (c) shows the fatigue time-series curve under negative emotion.
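The alignment step can be illustrated with a short sketch: given one emotion label and one fatigue value per synchronized analysis window, the fatigue time-series is partitioned into emotion-specific segments. All values below are hypothetical placeholders, not experimental data.

```python
import numpy as np

# Hypothetical synchronized sequences: one emotion label and one
# fatigue index per analysis window, in temporal order.
emotions = ["neutral", "neutral", "positive", "positive",
            "negative", "negative", "negative"]
fatigue = np.array([0.42, 0.45, 0.40, 0.36, 0.51, 0.58, 0.63])

# Temporal alignment: group each fatigue value under its emotion label
# to obtain emotion-specific fatigue segments.
segments = {}
for label, value in zip(emotions, fatigue):
    segments.setdefault(label, []).append(value)

# Summary per emotional state, e.g. the mean fatigue level of each segment
means = {label: float(np.mean(values)) for label, values in segments.items()}
```

In the actual analysis the segments need not be contiguous; any window carrying a given emotion label contributes to that emotion's fatigue curve.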
Table 2 presents the proportions of increase and decrease in fatigue level under three emotional states. It also shows the maximum value, minimum value, and maximum difference in EEG frequency-domain features representing fatigue states.
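The statistics reported in Table 2 can be computed per emotion-specific segment as sketched below; the segment values are illustrative only, and the treatment of flat steps is an assumption of this sketch rather than a detail stated in the paper.

```python
import numpy as np

def fatigue_statistics(series):
    """Statistics of one emotion-specific fatigue segment, mirroring Table 2:
    proportions of increasing/decreasing steps and feature extrema."""
    diffs = np.diff(series)
    steps = diffs[diffs != 0]                  # ignore flat steps (assumption)
    increase_pct = float((steps > 0).mean()) * 100   # % of steps where fatigue rises
    decrease_pct = float((steps < 0).mean()) * 100   # % of steps where fatigue falls
    return {
        "increase_pct": increase_pct,
        "decrease_pct": decrease_pct,
        "max": float(series.max()),            # maximum feature value
        "min": float(series.min()),            # minimum feature value
        "max_difference": float(series.max() - series.min()),
    }

# Illustrative fatigue segment (hypothetical values, not experimental data)
segment = np.array([0.40, 0.45, 0.43, 0.50, 0.56, 0.52, 0.60])
stats = fatigue_statistics(segment)
```

Averaging these per-segment statistics across participants yields the mean and standard deviation columns reported in Table 2.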

4. Discussion

This section interprets the main findings of the study and discusses the influence of different emotional states on fatigue evolution from a physiological perspective. The implications of the results, as well as the limitations of the present study, are also addressed.

4.1. Effects of Emotional States on Fatigue

This work is based on physiological signal analysis and verifies the significant moderating role of emotional states in the evolution of fatigue during metro driving. The proposed multi-scale convolutional neural network (MSCNN) combined with an attention mechanism uses ECG signals to stably distinguish between positive, neutral, and negative emotional states. Moreover, the EEG frequency-domain fatigue indicators enable a quantitative analysis of the emotion–fatigue relationship. The experimental results show that different emotional states exert directional differences in the formation and evolution of fatigue.
The statistical results show that fatigue increases significantly under negative emotional states, and both the maximum value and the fluctuation amplitude of the fatigue-related features are significantly greater than under the other emotional states. This suggests that negative emotions accelerate fatigue accumulation, a finding consistent with previous studies showing that negative emotions increase mental workload and accelerate the consumption of cognitive resources. In contrast, under positive emotional states, fatigue decreases for a considerably larger proportion of the time and fluctuations in fatigue amplitude are smaller. However, the gap between the increasing and decreasing proportions of fatigue is smaller than that observed under negative emotions, indicating that positive emotions exert a certain buffering and restorative effect that helps delay fatigue development, although their influence is weaker than that of negative emotions.
The neutral emotional state exhibits intermediate fatigue evolution characteristics, with comparable proportions of fatigue increase and decrease, indicating a transitional condition between positive and negative emotional influences. This suggests that treating neutral as an independent emotion category helps in accurately characterizing the continuity and transitional features of emotional regulation on fatigue, while avoiding interference from neutral states in the analysis of positive and negative emotional effects.
Although the directional trend that negative emotions are associated with increased fatigue may appear intuitive, the novelty of the present study lies in its quantitative physiological validation within a controlled metro driving context. Unlike prior research that primarily relies on behavioral or self-report measures, this study integrates ECG-based emotion recognition with EEG-derived fatigue indicators to characterize the dynamic evolution of fatigue under different emotional states. The results reveal not only directional differences but also distinct fluctuation patterns and accumulation rates, providing a more fine-grained physiological understanding of how emotional states influence fatigue in operational settings.
Overall, the results of this study demonstrate that emotional states can significantly influence the fatigue level of train operators during metro driving tasks. The experimental evidence indicates that negative emotions are associated with a faster increase and larger fluctuations in fatigue, whereas positive emotions show a buffering effect by slowing fatigue accumulation and promoting a relatively lower fatigue level. These findings provide physiological support for considering the emotional state as an influential factor in fatigue assessment and highlight the potential value of emotion-aware fatigue monitoring for improving operational safety in urban rail transit.

4.2. Limitations and Future Directions

Despite the promising findings, several limitations should be acknowledged. First, the sample size was relatively small and consisted of young participants aged 22–26 years. Although this homogeneous age range was selected to reduce inter-individual physiological variability and enhance internal validity, age-related differences in fatigue recovery, cognitive resilience, and reaction time may influence absolute fatigue trajectories. Therefore, caution should be exercised when generalizing the results to older or more experienced metro drivers.
Second, participants were trained simulation operators rather than professional subway drivers. While this controlled experimental design allowed for standardized task conditions and reduced confounding operational variability, it may limit ecological validity. Future studies should incorporate professional drivers to further validate the applicability of the findings in real-world contexts.
Third, the experiments were conducted within a fixed afternoon time window (14:00–16:00), corresponding to the circadian post-lunch dip. Although this timing facilitated the observation of fatigue evolution under controlled conditions, it does not fully capture the operational complexity of peak-hour metro traffic. Future research may consider incorporating peak-hour simulations or field-based experiments to further enhance ecological validity.
These limitations do not undermine the methodological contribution of the present study, which focuses on elucidating the physiological mechanisms underlying the influence of emotional states on fatigue evolution. However, they highlight important directions for future research.

5. Conclusions

This study investigated the impact of emotional states on fatigue levels in metro driving tasks based on a physiological signal-driven analysis framework. A multi-scale convolutional neural network combined with an attention mechanism was developed to recognize drivers’ emotional states from ECG signals, achieving an emotion classification accuracy of 86.96%. Meanwhile, an EEG-based fatigue indicator was extracted to quantify fatigue evolution over time. By temporally aligning the ECG-based emotion recognition outputs with the EEG-derived fatigue time-series, the proposed model enables a quantitative investigation of how different emotional states influence fatigue development during long-duration metro driving operations.
From a methodological perspective, the proposed multi-scale convolutional neural network combined with an attention mechanism effectively captures autonomic nervous system activity in complex, long-duration driving tasks. Temporally aligning the emotion recognition results with the EEG-based fatigue indicators provides a feasible data-driven framework for revealing the mechanisms by which emotions influence fatigue during driving tasks.
From an application perspective, the findings indicate that negative emotions are a significant amplifier of fatigue, whereas positive emotions possess potential buffering value. This provides a theoretical basis for emotion regulation-based fatigue intervention strategies, such as maintaining drivers’ positive or neutral emotional states through optimized human–machine interface design, work–rest scheduling, or psychological interventions.
This study still has certain limitations. First, the sample size is relatively small, and the participants mainly consisted of young male individuals who had only undergone simulated driving training. Future studies should include metro drivers with actual driving experience to enhance generalizability. Second, this study was conducted in a simulated driving environment; given the complexity of the task and the possibility of unexpected real-world events, formalizing the emotion–fatigue relationship is non-trivial, making validation on actual lines or in quasi-realistic environments necessary. In addition, future research should incorporate more physiological indicators (e.g., electrodermal activity and eye-tracking measures) and combine multimodal fusion models to further deepen the understanding of the mechanisms by which drivers' emotions affect fatigue.

Author Contributions

Conceptualization, L.C. and Y.H.; Methodology, L.C., Y.H. and F.W.; Software, L.C.; Validation, L.C.; Formal analysis, L.C.; Investigation, L.C.; Data curation, L.C., Y.H., F.W., L.Z. and Z.L.; Writing—original draft, L.C.; Writing—review & editing, Y.H. and F.W.; Visualization, L.C.; Supervision, Y.H., F.W., L.Z. and Z.L.; Project administration, Y.H. and Z.L.; Funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project "A Data–Event Fusion-Driven Method for Metro Operation Risk Prevention and Control under Extreme Weather Scenarios", grant number 62576205.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Baysari, M.T.; McIntosh, A.S.; Wilson, J.R. Understanding the human factors contribution to railway accidents and incidents in Australia. Accid. Anal. Prev. 2008, 40, 1750–1757. [Google Scholar] [CrossRef] [PubMed]
  2. Lees, T.; Chalmers, T.; Burton, D.; Zilberg, E.; Penzel, T.; Lal, S. Psychophysiology of Monotonous Driving, Fatigue and Sleepiness in Train and Non-Professional Drivers: Driver Safety Implications. Behav. Sci. 2023, 13, 788. [Google Scholar] [CrossRef] [PubMed]
  3. Habibifar, N.; Salmanzadeh, H. Improving driving safety by detecting negative emotions with biological signals: Which is the best? Transp. Res. Rec. J. Transp. Res. Board 2022, 2676, 334–349. [Google Scholar] [CrossRef]
  4. Gartenberg, D.; Gunzelmann, G.; Hassanzadeh-Behbaha, S.; Trafton, J.G. Examining the Role of Task Requirements in the Magnitude of the Vigilance Decrement. Front. Psychol. 2018, 9, 1504. [Google Scholar] [CrossRef]
  5. Xi, J.; Wang, S.; Ding, T.; Tian, J.; Shao, H.; Miao, X. Detection Model on Fatigue Driving Behaviors Based on the Operating Parameters of Freight Vehicles. Appl. Sci. 2021, 11, 7132. [Google Scholar] [CrossRef]
  6. Hassib, M.; Braun, M.; Pfleging, B.; Alt, F. Detecting and influencing driver emotions using psycho-physiological sensors and ambient light. In Human-Computer Interaction—INTERACT 2019; Lamas, D., Loizides, F., Nacke, L., Petrie, H., Winckler, M., Zaphiris, P., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2019; Volume 11746. [Google Scholar] [CrossRef]
  7. Mekki, O.; Dupuis, P.; Hbaieb, C.G.; Canale, L.; Zissis, G.; Cheikhrouhou, M. Light quality, color perception and emotions in the interior space. In Proceedings of the 2021 Joint Conference—11th International Conference on Energy Efficiency in Domestic Appliances and Lighting & 17th International Symposium on the Science and Technology of Lighting (EEDAL/LS:17), Toulouse, France, 1-3 June 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–4. [Google Scholar]
  8. Souche-Le Corvec, S.; Zhao, J. Transport and emotion: How neurosciences could open a new research field. Travel Behav. Soc. 2020, 20, 12–21. [Google Scholar] [CrossRef]
  9. Baudouin, J.Y.; Gallian, F.; Pinoit, J.M.; Damon, F. Arousal, valence, and discrete categories in facial emotion. Sci. Rep. 2025, 15, 40268. [Google Scholar] [CrossRef]
  10. Liu, Y.-Q.; Wang, X.-Y. The analysis of driver behavioral tendency under different emotional states based on a Bayesian network. IEEE Trans. Affect. Comput. 2023, 14, 165–177. [Google Scholar] [CrossRef]
  11. Zepf, S.; Hernandez, J.; Schmitt, A.; Minker, W.; Picard, R.W. Driver emotion recognition for intelligent vehicles: A survey. ACM Comput. Surv. 2020, 53, 64. [Google Scholar] [CrossRef]
  12. Zhao, D.; Zhong, Y.; Fu, Z.; Hou, J.; Zhao, M. A review for the driving behavior recognition methods based on vehicle multisensor information. J. Adv. Transp. 2022, 2022, 7287511. [Google Scholar] [CrossRef]
  13. Xiao, H.; Li, W.; Zeng, G.; Wu, Y.; Xue, J.; Zhang, J.; Li, C.; Guo, G. On-Road Driver Emotion Recognition Using Facial Expression. Appl. Sci. 2022, 12, 807. [Google Scholar] [CrossRef]
  14. Xiang, G.; Yao, S.; Deng, H.; Wu, X.; Wang, X.; Xu, Q.; Yu, T.; Wang, K.; Peng, Y. A multi-modal driver emotion dataset and study: Including facial expressions and synchronized physiological signals. Eng. Appl. Artif. Intell. 2024, 130, 107772. [Google Scholar] [CrossRef]
  15. Wang, L.; Hao, J.; Zhou, T.H. ECG Multi-Emotion Recognition Based on Heart Rate Variability Signal Features Mining. Sensors 2023, 23, 8636. [Google Scholar] [CrossRef] [PubMed]
  16. Zhang, H.; Yan, X.; Wu, C.; Qiu, T.Z. Effect of circadian rhythms and driving duration on fatigue level and driving performance of professional drivers. Transp. Res. Rec. J. Transp. Res. Board 2014, 2402, 19–27. [Google Scholar] [CrossRef]
  17. Sun, C.; Mou, C. Survey on the research direction of EEG-based signal processing. Front. Neurosci. 2023, 17, 1203059. [Google Scholar] [CrossRef]
  18. Singh, A.K.; Krishnan, S. Trends in EEG signal feature extraction applications. Front. Artif. Intell. 2022, 5, 1072801. [Google Scholar] [CrossRef]
  19. Jap, B.T.; Lal, S.; Fischer, P. Comparing combinations of EEG activity in train drivers during monotonous driving. Expert Syst. Appl. 2011, 38, 996–1003. [Google Scholar] [CrossRef]
  20. Qian, C.; Su, H.; Yu, H. Local means denoising of ECG signal. Biomed. Signal Process. Control 2019, 53, 101571. [Google Scholar] [CrossRef]
  21. Wang, L.; Li, J.; Wang, Y. Modeling and recognition of driving fatigue state based on R–R intervals of ECG data. IEEE Access 2019, 7, 175584–175593. [Google Scholar] [CrossRef]
  22. Gao, J.M.; Yang, C.Y.; Liu, F.; Qi, J. Emotion prediction of EEG signals based on 1D convolutional neural network. J. Phys. Conf. Ser. 2021, 2024, 012044. [Google Scholar] [CrossRef]
  23. Katsigiannis, S.; Ramzan, N. DREAMER: A database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices. IEEE J. Biomed. Health Inform. 2018, 22, 98–107. [Google Scholar] [CrossRef]
  24. Albalooshi, F.A. Novel Approach in Vegetation Detection Using Multi-Scale Convolutional Neural Network. Appl. Sci. 2024, 14, 10287. [Google Scholar] [CrossRef]
  25. Ghaleb, E.; Niehues, J.; Asteriadis, S. Joint modelling of audio-visual cues using attention mechanisms for emotion recognition. Multimed. Tools Appl. 2023, 82, 11239–11264. [Google Scholar] [CrossRef]
Figure 1. Overall research framework.
Figure 2. Overview of experimental tasks and procedures.
Figure 3. Physiological data preprocessing flowchart.
Figure 4. Overall model framework design.
Figure 5. Confusion matrix.
Figure 6. Line chart of predicted emotions varying with sample indices.
Figure 7. Time-series of the EEG-based fatigue indicator R_{θ/β}(t) for a single participant under three emotional states during the metro driving simulation experiment. The three subfigures were generated from the same participant's experimental data and arranged in chronological order to illustrate fatigue evolution under different emotional conditions.
Table 1. Comparison results of the performance of two models.
Modeling Methods    | Accuracy | Precision | Recall | F1-Score
CNN                 | 70.84%   | 0.7821    | 0.7034 | 0.7406
MSCNN + Attention   | 86.96%   | 0.8697    | 0.8696 | 0.8696
Table 2. Statistical analysis results of the experimental data.
Emotional State | Proportion of Fatigue Increase, Mean (SD) | Proportion of Fatigue Decrease, Mean (SD) | Maximum Feature Value, Mean (SD) | Minimum Feature Value, Mean (SD) | Maximum Feature Difference, Mean (SD)
Positive | 45.38% (1.29) | 54.62% (1.29) | 0.869 (0.21) | 0.316 (0.19) | 0.553 (0.28)
Neutral  | 51.02% (1.07) | 48.98% (1.07) | 1.159 (0.16) | 0.428 (0.24) | 0.731 (0.29)
Negative | 62.20% (1.51) | 37.80% (1.51) | 1.782 (0.23) | 0.532 (0.31) | 1.250 (0.39)

Share and Cite

MDPI and ACS Style

Chen, L.; Huang, Y.; Wang, F.; Zhu, L.; Liu, Z. Exploring the Impact of Emotional States on Fatigue Evolution in Metro Drivers: A Physiological Signal-Based Approach. Appl. Sci. 2026, 16, 2653. https://doi.org/10.3390/app16062653

