1. Introduction
Psychological stress appears when a person’s abilities cannot meet the demands of the external environment, for example when a work task is too difficult or a financial burden is too heavy [1]. In fact, we all live under stress, and moderate stress can keep us competitive. However, living under chronically high stress increases the risk of physical and psychological disease [2], including severe cardiac arrhythmias, high blood pressure, stroke, gastric ulcers, cancer and depression [3,4]. If people could learn their stress level in a low-cost and convenient way and manage it appropriately, this would not only reduce their risk of disease but also improve their efficiency, creativity and safety at work, especially for practitioners in special industries, such as military personnel, pilots, firefighters and high-speed rail drivers. Therefore, it is of great practical and social value to develop a non-invasive stress estimation system that monitors people’s stress changes in their daily work.
At present, psychological stress assessment is mainly based on social media information and physiological signals. For the former, psychological stress can be roughly estimated by multimodal fusion and analysis of the texts, images and videos that people post on social media, and many methods have been proposed in this research direction [5,6]. Furthermore, social media data are easier to obtain than physiological signals. However, the accuracy of this kind of stress assessment depends on how active users are on social media, and it seems difficult to make accurate assessments for users who are less active. In addition, because of psychological defense mechanisms, people are likely to deliberately disguise their real stress situations in their behavioral performance. Compared with social media data, physiological signals can provide more objective and reliable information for psychological stress assessment [7]. Physiological signals used for stress assessment mainly include the electroencephalogram (EEG), electrodermal activity (EDA), the photoplethysmogram (PPG) and the electrocardiogram (ECG). Although EEG can provide useful information for psychological stress analysis with high temporal resolution [8], putting on the collection equipment is cumbersome and requires the help of professionals. Moreover, the EEG signal is easily disturbed by movements during collection. Therefore, EEG is not suitable for daily monitoring of human psychological stress. Compared with EEG, the acquisition equipment for PPG and EDA is portable, and the acquisition process is simple. However, body movements can heavily distort these signals, which increases the difficulty of subsequent feature extraction and analysis. ECG offers advantages over PPG in terms of stability and reliability and is by far the most widely used cardiac monitoring method in healthcare. In recent years, with the development of wearable devices, many wearable ECG devices that combine comfort with resistance to interference have been developed, including vests, bracelets and chest belts [9,10,11]. The development of these non-invasive ECG devices is the basis for research on the daily monitoring of people’s psychological stress. Wearable physiological parameter monitoring equipment has also been widely used in the field of human action recognition, which has some implications for our research [12,13,14,15].
Compared with psychological stress detection methods based on scales or social media data, using wearable devices to collect ECG signals and detect psychological stress has clear advantages in real-time capability and in the flexibility of usage scenarios. In practical applications, this solution can be used to monitor the psychological stress state of police officers, firefighters, pilots and other special industry workers in real time while they carry out their tasks, and even to deliver psychological intervention at the right moment to relieve their anxiety. This can not only improve their work efficiency but also probably play an important role in keeping them safe. In addition, this solution can also be used in the recruitment and selection of workers in special industries.
When changes in the external environment make people feel tense or anxious, they also cause a physiological response in the body. At this time, the parasympathetic branch of the human autonomic nervous system (ANS) is temporarily suppressed, and the sympathetic branch is activated, which causes a rapid increase in heart rate, cardiac contractility, blood pressure and respiration, and promotes hormone release [16]. This puts the body in a state of hyperactivity to cope with the upcoming challenge. The changes in the ANS associated with psychological stress can be captured by recording the ECG signal of the subject. Specifically, these ANS changes can be quantified by heart rate variability (HRV) analysis [17].
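To make the idea of HRV analysis concrete, the following is a minimal sketch of a few widely used time-domain HRV measures computed from RR intervals (the times between successive heartbeats). The function name `hrv_time_domain` and the particular feature set (mean RR, mean heart rate, SDNN, RMSSD, pNN50) are illustrative assumptions; the text above does not state which HRV features are used in this work.

```python
import math

def hrv_time_domain(rr_ms):
    """Compute common time-domain HRV measures from RR intervals in milliseconds.

    Illustrative only: this feature set is an assumption, not the paper's list.
    """
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # Successive differences between neighbouring RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # RMSSD: root mean square of successive differences.
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # pNN50: fraction of successive differences larger than 50 ms.
    pnn50 = sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return {
        "mean_rr": mean_rr,
        "mean_hr": 60000.0 / mean_rr,  # beats per minute
        "sdnn": sdnn,
        "rmssd": rmssd,
        "pnn50": pnn50,
    }

features = hrv_time_domain([800, 810, 790, 850, 800])
```

In this sketch, a higher heart rate shows up directly as a shorter mean RR interval, while the variability measures summarize how much the beat-to-beat timing fluctuates.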
In this field, previous studies have mostly used classical machine learning methods to detect psychological stress from HRV features, namely, as a binary classification of stressed and unstressed states [18,19,20,21,22,23,24]. First, such a binary classification is not completely consistent with people’s experience of stress in real life, so it is increasingly necessary to study methods for evaluating human stress in different states. Second, deep learning methods have achieved good results in many fields, such as image recognition, natural language processing and signal processing, so whether their powerful representation ability can also achieve good results in the multi-class classification of psychological stress is a problem worth studying. Furthermore, psychological stress does not arise instantaneously, and whether its temporal features can be used to improve the accuracy of psychological stress classification is also an interesting problem.
To this end, in this paper, we introduce an ECG dataset collected under four stress states and propose introducing the concept of time series into psychological stress assessment in order to improve the classification accuracy. Specifically, by constructing a continuous HRV time series, we use a multi-layer GRU network to extract multi-level features related to psychological stress and finally obtain the results through a classification network composed of multi-layer perceptrons. The contributions of this paper are mainly twofold. First, we bring the concept of time series into the classification of psychological stress states and introduce a recurrent neural network to obtain the representation of psychological stress in continuous HRV sequences, thereby improving the classification accuracy. Second, we conducted a psychological stress data collection experiment with 80 participants, designed and developed a stress-inducing VR high-altitude scene and collected ECG signals from the subjects during four stress states: resting, VR scene adaptation, VR high-altitude task and recovery. The purpose is to construct a dataset that can be used to study the mapping relationship between ECG signals and psychological stress in various states. After data cleaning and elimination, this dataset finally contains the ECG data and corresponding state labels of 63 subjects.
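One way to picture a "continuous HRV time series" is as a sequence of feature vectors computed over overlapping windows of the recording, ordered in time, which a recurrent network such as a GRU could then read step by step. The sketch below is only an illustration under stated assumptions: the window length (30 s), the step (10 s), the two features per window and the function names are hypothetical and are not taken from this paper.

```python
import math

def window_features(rr_ms):
    """Mean RR and RMSSD for one window (an illustrative feature pair)."""
    mean_rr = sum(rr_ms) / len(rr_ms)
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs)) if diffs else 0.0
    return [mean_rr, rmssd]

def hrv_sequence(beat_times_s, window_s=30.0, step_s=10.0):
    """Build a time-ordered list of HRV feature vectors from R-peak times (seconds).

    window_s and step_s are hypothetical values chosen for illustration.
    """
    # Each RR interval is stamped with the time of the beat that ends it.
    rr = [(t1, (t1 - t0) * 1000.0)
          for t0, t1 in zip(beat_times_s, beat_times_s[1:])]
    if not rr:
        return []
    sequence = []
    start, end = beat_times_s[0], beat_times_s[-1]
    while start + window_s <= end + 1e-9:
        in_window = [v for t, v in rr if start < t <= start + window_s]
        if len(in_window) >= 2:
            sequence.append(window_features(in_window))
        start += step_s
    return sequence

# Synthetic example: a perfectly regular heartbeat every 0.8 s for about 70 s.
beats = [i * 0.8 for i in range(88)]
seq = hrv_sequence(beats)  # list of [mean_rr, rmssd], one entry per window
```

The result has the shape (time steps, features), which is the usual input layout for a recurrent layer; the overlap between windows is what lets the sequence reflect gradual changes in stress rather than a single summary value.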
2. Related Works
Compared with the subjective scales used in the past, psychological stress assessment based on physiological signals has advantages in objectivity and reliability. In the field of psychological stress or emotion estimation based on physiological signals, many methods have been proposed, and some scholars have put forward their insights and analysis on the relationship between physiological signals and psychological stress.
Classical machine learning methods are widely used to classify psychological stress or emotions. Ref. [18] uses Principal Component Analysis (PCA) to verify the HRV time domain, frequency domain and statistical features and then classifies two emotions and five emotions with Support Vector Machines (SVMs). Ref. [21] selects robust HRV features through the mRMR method, reduces the differences in physiological parameters between individuals through baseline data to improve the classification accuracy, and finally classifies psychological stress in relaxation and task states through a variety of machine learning methods. In the study of driver stress detection, ref. [25] proposes the use of an enhanced random forest classifier to monitor driver stress by combining ECG waveform features and HRV features. Ref. [23] applies various machine learning algorithms, including KNN and the multi-layer perceptron (MLP), to classify the psychological stress level using the HRV obtained from the ECG signal and achieves good classification results with the MLP method. Ref. [26] uses a multi-scale analysis method to evaluate the stress of pilots flying at night by fusing the area of the heart rate curve and constructing the functional relationship between the stress intensity and the training frequency, which effectively improves the effect of high-altitude training. Some researchers use a genetic algorithm, an artificial bee colony algorithm and an improved particle optimization algorithm to optimize a multi-kernel support vector machine, which improves the accuracy of stress detection [22].
At the same time, there are also studies that use biochemical indicators as a reference in the experiment and apply a variety of physiological signals to the detection of psychological stress. By analyzing the linear correlation between the HRV features of the ECG signal and the EEG signal features, ref. [27] shows that some HRV indicators (e.g., HF, LF) are strongly correlated with some features of the EEG signal (e.g., LAPFpl) used for stress estimation. Based on this finding, the authors propose that combining EEG with HRV can improve the accuracy of psychological stress detection. Ref. [19] develops a wearable multi-physiological-parameter system to measure human stress and collects salivary cortisol as a reference. Specifically, the MAST (Maastricht Acute Stress Test) experiment is used to induce psychological stress, PCA and statistical methods are used to select and reduce the dimensionality of the features extracted from the recorded ECG, EDA and EEG signals, and finally, an SVM is used to classify psychological stress during the experimental period and the relaxation period. In addition, the experimental results in the paper show that salivary cortisol levels are highly correlated with HRV features. Some researchers also propose detecting the rest and task states of the human body by combining HRV features and PPG waveform features: a wrapper method based on ensemble learning is designed for feature selection, and a decision tree-based bagging model is developed for final state classification [20]. In [28], salivary amylase and salivary cortisol concentrations are used to label the stress of subjects in TSST experiments at three levels, and a fuzzy ARTMAP method and a voting integration method optimized by genetic algorithms are used to establish a predictive model from subject HRV to psychological stress level, with good accuracy rates.
In recent years, deep learning methods for classifying psychological stress have gradually emerged. Ref. [24] uses a one-dimensional convolutional neural network to extract complex features of the RR intervals, thereby building an end-to-end neural network model to detect stress states from ECG signals. The RR interval is the time interval between two adjacent R waves in the ECG signal; that is, the time interval between two heartbeats. Ref. [29] proposes the use of a Gabor wavelet transform and a discrete Fourier transform to convert the ECG signal into images in the time-frequency domain and frequency domain, respectively, and fuses the original signal, time-frequency domain and frequency domain information through a convolutional neural network to classify five levels of stress. Ref. [30] designs a deep convolutional neural network with a transformer mechanism to detect psychological stress using the location information of R-waves in ECG signals and achieves good performance through the fine-tuned network. Ref. [31] proposes the concept of real-time monitoring of psychological stress and uses a convolutional neural network for the real-time recognition of acute cognitive stress from ECG signals with a 10-s window, which reduces the detection error rate compared to traditional methods. In previous studies, we used a multi-layer GRU network for the heartbeat classification of ballistocardiogram (BCG) signals and a bidirectional LSTM method for end-to-end heart rate estimation from BCG signals by regression, which achieved the best results compared to previous algorithms [32,33]. The successful application of recurrent neural networks in heartbeat detection also inspired and informed this work.
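The RR interval defined in the paragraph above can be computed directly once the R-wave positions are known. The short sketch below assumes that R-peak detection has already been done and that the peaks are given as sample indices; the function name `rr_intervals_ms` is hypothetical, and the default of 250 Hz matches the sampling rate reported for the recording device in this paper.

```python
def rr_intervals_ms(r_peak_samples, fs_hz=250):
    """Convert R-peak sample indices to RR intervals in milliseconds.

    Assumes the peaks were found by some upstream detector (not shown here).
    """
    pairs = list(zip(r_peak_samples, r_peak_samples[1:]))
    if any(b <= a for a, b in pairs):
        raise ValueError("R-peak indices must be strictly increasing")
    # Sample distance between adjacent peaks, converted to milliseconds.
    return [(b - a) * 1000.0 / fs_hz for a, b in pairs]

rr = rr_intervals_ms([100, 300, 510, 700])  # -> [800.0, 840.0, 760.0]
```

At 250 Hz, one sample corresponds to 4 ms, so this is also the timing resolution of each RR interval obtained this way.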
3. Materials
In this section, the wearable ECG signal collection device, VR scene, the process of the experiment and the dataset will be introduced in detail.
3.1. Smart ECG T-Shirt
Figure 1 shows the smart ECG T-shirt designed and developed in our laboratory, which can simultaneously record various human physiological signals such as ECG, respiration and electrodermal activity [34]. The left and right of Figure 1a show the front lining and the front of the smart ECG T-shirt, respectively. In the experiment, it is used to record the ECG signals of subjects under different stress states. The sensor system of the smart ECG T-shirt, which consists of five flexible electrodes, is shown in the left half of Figure 1a. The right half of Figure 1a shows the signal processing module of the smart ECG T-shirt, which can collect and store three-lead ECG signals at a sampling rate of 250 Hz and powers the entire system through the built-in lithium battery. Figure 1b shows a subject wearing the smart ECG T-shirt. Figure 2 shows the three-lead ECG signal collected by this device. Each prominently raised peak in Figure 2 represents a heartbeat, and the heartbeat location is consistent across the leads. Recording three-lead ECG signals guarantees the signal quality to a large extent and improves the tolerance of our ECG acquisition equipment to motion or noise interference.
3.2. VR Scenarios and Tasks
Figure 3 visualizes the VR experiment. The left of Figure 3a shows the experimental scene, and the right shows the VR scene seen by the subjects (in which the curves of various physiological parameters will not be seen). The positions and sizes of key objects in the experimental scene are consistent with the VR scene. During the experiment, the subjects need to wear a VR helmet to enter the virtual high-altitude scene and complete the following three tasks on the board in this scene, as shown in Figure 3b. These three tasks are described in detail as follows:
Task 1: Go to the end of the board to pick up the tennis ball from basket B and put it in basket A. Basket A and basket B are shown in the left of Figure 3a.
Task 2: Go back to the end of the board to pick up the prop snake from basket C and put it in basket A. Basket A and basket C are shown in the left of Figure 3a.
Task 3: Go to the end of the board and jump to the square board shown in the right of Figure 3a.
3.3. Experimental Procedure and Dataset
The experiment consisted of four phases: resting, VR scene adaptation, VR task and recovery. The ECG signals were recorded synchronously in each phase of the experiment. Each of these phases is described in detail as follows:
Phase 1 (Resting, 5 min): Sit calmly in a chair.
Phase 2 (VR scene adaptation, 2 min): Wear the VR equipment, enter the VR high-altitude scene and adapt to it.
Phase 3 (VR task): Transport the tennis ball and the prop snake and jump to the board in the VR high-altitude scene. The duration of this phase depends on how fast the subject performs the task.
Phase 4 (Recovery, 5 min): After the VR task, sit back in the chair and stay calm.
The above four experimental phases correspond to the four stress states of the subjects. After data cleaning, the dataset contains the ECG signals of 63 healthy male subjects with an average age of 17.89 ± 0.45 years. Considering the adaptability of the subjects and the possible duration of each stress state, we select the first 70 s of each subject’s ECG data in the VR scene adaptation and recovery states and the last 70 s in the resting and VR task states as the stress state classification dataset. The stress state in which a segment was recorded serves as its classification label, so we obtain a stress state classification dataset consisting of the ECG data of 63 subjects and four labels. By summarizing the self-reported feelings of each subject in the experiment, we found that the stress level during resting is the lowest, the stress during the VR task is the highest, and the stress during recovery is greater than that during VR scene adaptation.
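The segment selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the state names, the `KEEP_END` table and the function `select_segment` are hypothetical, while the 70 s segment length, the 250 Hz sampling rate and the first/last rule per state follow the text.

```python
FS_HZ = 250      # sampling rate reported for the smart ECG T-shirt
SEGMENT_S = 70   # segment length kept for every stress state

# Which end of each phase recording is kept, following the text.
# The state names themselves are illustrative labels.
KEEP_END = {
    "resting": "last",
    "vr_adaptation": "first",
    "vr_task": "last",
    "recovery": "first",
}

def select_segment(ecg, state, fs_hz=FS_HZ, segment_s=SEGMENT_S):
    """Return the 70 s slice of one phase recording used for classification."""
    n = fs_hz * segment_s
    if len(ecg) < n:
        raise ValueError("recording is shorter than the requested segment")
    return ecg[:n] if KEEP_END[state] == "first" else ecg[-n:]

# Synthetic 2 min recording: the sample values are just their own indices.
recording = list(range(FS_HZ * 120))
resting_segment = select_segment(recording, "resting")     # last 70 s
recovery_segment = select_segment(recording, "recovery")   # first 70 s
```

Each selected segment contains 250 × 70 = 17,500 samples per lead and inherits the label of the phase it was cut from.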
6. Conclusions
This paper proposes a deep psychological stress classification method based on ECG signals. First, HRV feature samples containing the timing information of ECG signals are constructed. Deep GRU networks are then used to extract deep features from HRV feature samples that have more essential and general connections to psychological stress states. Finally, a multi-layer, fully connected network is used to fuse the deep and shallow features of the GRU network to predict the psychological stress state. The experimental results show that the proposed method is a robust psychological stress estimation scheme, and its estimation accuracy on this dataset is 0.78, which is better than that of other mainstream methods.
However, we noticed that the classification accuracy is not very high. In future work, we will try to further improve the accuracy of psychological stress classification in the following ways. First, the amount of information input to the classification model can be increased by introducing other physiological signals besides ECG, such as EEG and EDA, or by extracting more valuable features from ECG signals, thereby improving the performance of stress classification. Second, we can reduce the differences in physiological signals between individuals to improve the classification accuracy. Specifically, domain adaptation methods in transfer learning have achieved good results on many image datasets with large distribution differences, and in recent years, they have achieved high accuracy in EEG-based cross-subject emotion recognition [42,43]. Therefore, we will consider introducing a transfer learning method to further improve the classification accuracy of psychological stress states. Furthermore, high-level feature design and feature space reduction applicable to multidimensional wearable sensors, such as the approaches used for wearable-based HAR, are also worth further experimentation [14,44].