Research of HRV as a Measure of Mental Workload in Human and Dual-Arm Robot Interaction

Abstract: Robots increasingly work in place of humans in unstructured environments, expanding the scope of human work. In such settings, humans interact with robots indirectly through operating terminals, and the operator's mental workload increases because the real scene cannot be perceived directly. Mental workload assessment is therefore important, since it can effectively help avoid serious accidents caused by mental overload. In this paper, the operating object is a dual-arm robot, and the classification of the operator's mental workload is studied using the heart rate variability (HRV) signal. First, electrocardiogram (ECG) signals are collected from six subjects in two states: performing tasks and maintaining a relaxed state. Then, HRV data are obtained from the ECG signals and 20 kinds of HRV features are extracted. Last, six different classifiers are used for mental workload classification. When each subject's own HRV signal is used to train the model and classify that subject's mental workload, an average classification accuracy of 98.77% is obtained with the K-Nearest Neighbor (KNN) method. When the HRV signals of five subjects are used for training and those of the remaining subject for testing, the Gentle Boost (GB) method gives the highest average classification accuracy (80.56%). This study has implications for the analysis of the HRV signal characteristics of mental workload in different subjects, which could improve operators' well-being and safety in the human-robot interaction process.


Introduction
In unstructured environments, robots replace humans in performing some complex tasks, which expands the scope of human work [1,2]. The dual-arm robot, a typical kind of robot, has been widely studied [3,4]. Dual-arm robots can simulate the movement of the two human arms, an important step towards humanoid operation, and studies based on dual-arm robots have consistently moved towards human-like operation. In this paper, a dual-arm robot is studied as the operating object, controlled by a wearable exoskeleton controller in master-slave mode. The dual-arm robot's performance is limited not only by the performance of the system, but is also closely related to the current state of the operator. A large mental workload can lead to improper or wrong operation even when the system is stable and the operator has a good sense of presence. Therefore, it is crucial to monitor the mental workload of the operator. On this basis, the human-robot task assignment could be dynamically adjusted according to the mental workload. This kind of research improves human-robot system performance and safety and refines the subjective experience of operators. Therefore, it is of [...] method is presented. Section 3 shows the experimental results, which reflect the statistical analysis of features and mental workload measures. The discussion of the results is presented in Section 4. In Section 5, the conclusion of this paper is presented.

Data and Methods
First, the process of mental workload recognition used in this paper is presented, as shown in Figure 1. Then, the subjects who participated in the data acquisition are introduced. Subsequently, the data acquisition process is described and the features are extracted. Finally, the mental workload identification results based on the extracted features are presented.
Electronics 2020, 8

Participants
The study employed a total of six male participants with an average age of 25.16 years, as shown in Table 1. They were recruited from the Shenyang Institute of Automation, Chinese Academy of Sciences. All participants had normal or corrected-to-normal vision, were right-handed and in good health, and had no heart, cerebrovascular, or nervous system problems. All participants were informed about the experiment and were asked to wear loose and comfortable clothing.


Data Acquisition and Processing
The dual-arm robot used in this paper is shown in Figure 2a. The robot has six independent driving wheels and can therefore adapt to various complex terrains. Moreover, the robot is equipped with two arms, each with seven degrees of freedom, imitating the number and structure of human arms. The end of each arm is an open-close clamp, which can be used for precision operation. The robot is also equipped with a binocular camera, which can be used to enhance the operator's sense of presence. To facilitate operation, the dual-arm robot is controlled through a wearable controller, shown in Figure 2b, which has the same structure as the arms of the dual-arm robot. The master-slave control mode is used between the wearable controller and the dual-arm robot, as shown in Figure 2c.
The ECG signal acquisition sensor and software used in this paper are shown in Figure 3a,b. The sensor is a portable chest strap that can be attached to the operator's chest. It is based on the BMD101 chip, a widely used ECG signal acquisition chip, and it does not interfere with the operator's normal operation. The ECG data are transmitted via Bluetooth to a computer, which collects and displays the ECG signals.

The flow chart of data acquisition is shown in Figure 4. First, subjects read and sign the informed consent form. Then, they receive professional training in operating the robot; only after passing the set assessment indicators can they participate in the experiment. Before the experiment begins, the ECG acquisition device is placed on the subject's chest, and the Karolinska Sleepiness Scale (KSS) is filled in to determine the operator's sleepiness state. The KSS is also filled in once the operator has completed the task. After the operator is given one minute to concentrate, the experiment starts. ECG signals of each operator are collected in two mental workload states. The tasks performed under each level of mental workload are defined as follows:
(1) Mental workload level 1: The operator does not perform any task and maintains a relaxed state.
(2) Mental workload level 2: The operator operates the arms of the robot to follow a specified trajectory.

ECG signals of the operator are collected for 10 min in each task. At the end of the task, the data records are checked and the ECG acquisition equipment is removed from the subject, which ends the experiment. A 3-min sliding window with a step of 10 s is used to segment the 10-min data of each state of each subject. The resulting 3-min segments are used for the identification and classification of the two mental workload states.

The HRV signal, shown in Figure 5, is obtained from the ECG signal collected by the sensor. The HRV signal is defined as the fluctuation in successive RR intervals. Hence, to obtain the HRV sequence from the ECG signal, a QRS complex detection method is used to detect the Q wave, R wave, and S wave [24]. However, abnormal points may be present in the HRV signal output by the QRS complex detection method. To remove these abnormal values, a median filtering method is used [25].
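The median filtering and sliding-window segmentation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the kernel size of 5 beats, and the convention of assigning each beat to a window by its cumulative time are all assumptions, since the paper only specifies a median filter, a 3-min window, and a 10-s step.

```python
from statistics import median


def median_filter(rr_ms, kernel=5):
    """Replace each RR interval by the median of a centred window.

    The kernel size is an assumed value; windows are shortened at the edges.
    """
    half = kernel // 2
    return [median(rr_ms[max(0, i - half):i + half + 1]) for i in range(len(rr_ms))]


def sliding_windows(rr_ms, window_s=180.0, step_s=10.0):
    """Split an RR series (in ms) into overlapping segments of window_s seconds."""
    # Cumulative beat times in seconds: a beat occurs at the end of its RR interval.
    beat_times, t = [], 0.0
    for rr in rr_ms:
        t += rr / 1000.0
        beat_times.append(t)
    if not beat_times:
        return []
    segments, start = [], 0.0
    # Keep only windows that fit entirely inside the recording.
    while start + window_s <= beat_times[-1] + 1e-9:
        segments.append([rr for rr, bt in zip(rr_ms, beat_times)
                         if start < bt <= start + window_s])
        start += step_s
    return segments
```

For an idealized recording of exactly 600 s, this yields (600 - 180) / 10 + 1 = 43 segments; the number of segments in the actual experiment depends on the true recording length.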

Feature Extraction
This sub-section describes how features are extracted from the HRV data. During the dual-arm robot operation tasks, changes in the operator's mental workload are closely related to fluctuations in sympathetic and parasympathetic nerve activity. The time domain features of the HRV signal reflect the overall variability of the autonomic nervous system response. The high-frequency features in the frequency domain are related to the intensity of parasympathetic modulation, whereas the low-frequency band is influenced more by sympathetic regulation. In addition, nonlinear features express the chaotic and dynamic characteristics of the HRV signal.

Time domain features
The main time domain features used are shown in Table 2. They are SDNN, RMSSD, PNN50, and HRVTi. In addition, the mean and median of the HRV signal are also extracted as features.


Table 2. Time domain features of the HRV signal.
Feature (unit): Description
SDNN (ms): The standard deviation of all R-R intervals.
RMSSD (ms): The root mean square of the successive R-R interval differences.
PNN50 (%): The proportion of beats whose successive R-R interval difference exceeds 50 ms.
HRVTi (-): The total number of R-R intervals divided by the maximum of the density distribution.
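As an illustration of the definitions in Table 2, the time domain features could be computed along the following lines. This is a sketch under stated assumptions rather than the authors' code: the function name is hypothetical, the sample (N - 1) standard deviation is assumed for SDNN, and the histogram bin width of 7.8125 ms (1/128 s) used for HRVTi is a commonly used value that the paper does not state.

```python
from math import sqrt
from statistics import mean, median, stdev


def time_domain_features(rr_ms, bin_ms=7.8125):
    """Compute time domain HRV features from a list of RR intervals in ms."""
    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # Histogram of RR intervals for the triangular index (assumed bin width).
    counts = {}
    for rr in rr_ms:
        b = int(rr // bin_ms)
        counts[b] = counts.get(b, 0) + 1
    return {
        "mean": mean(rr_ms),
        "median": median(rr_ms),
        "SDNN": stdev(rr_ms),
        "RMSSD": sqrt(mean([d * d for d in diffs])),
        "PNN50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
        "HRVTi": len(rr_ms) / max(counts.values()),
    }
```

Calling `time_domain_features(segment)` on each 3-min segment gives one row of time domain features per sample.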

Frequency domain features
All frequency domain features used in this paper are based on the power spectral density. A Lomb-Scargle periodogram is used to calculate the power spectral density, since it has a higher estimation accuracy than the FFT-based method [26]. The detailed descriptions and definitions are shown in Table 3.

Nonlinear features
1. Sample Entropy (SaEn): SaEn is a measure of the complexity of a physiological signal. It is based on the probability that two segments of the HRV signal that match at length m also match at length m + 1, where a tolerance parameter r determines whether two segments match. In this paper, m is set to 2 and r is defined as 0.2 × std, where std is the standard deviation of the input HRV data [27].
2. Detrended Fluctuation Analysis (DFA): DFA measures the statistical self-affinity of a physiological signal after removing its trend. In particular, it reflects the long-term correlation in the HRV signal, and it has been widely used in HRV signal analysis [28]. The fluctuation of the HRV signal can be expressed as a function of the time interval: $F(n) = p\,n^{\alpha}$, where p is a constant, $\alpha$ is the scaling exponent, F represents the fluctuation of the HRV signal, and n is the time interval. The fluctuation changes as n is varied. Two parameters, Alpha1 and Alpha2, are defined as the slopes of log F(n) as a function of log n over different ranges of n.
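The two nonlinear features can be sketched in plain Python as below. This is an illustrative reading of the definitions above, not the authors' implementation. The function names, the use of the sample standard deviation for r, the "distance less than or equal to r" matching rule, and the window sizes in `scales` are assumptions; in particular, the ranges of n that define Alpha1 and Alpha2 are not given in the text, so the caller would choose them.

```python
from math import log, sqrt
from statistics import mean, stdev


def sample_entropy(x, m=2, r_factor=0.2):
    """SaEn = -ln(A / B), with B and A the matching template pairs at length m and m + 1."""
    n, r = len(x), r_factor * stdev(x)

    def matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        return sum(
            1
            for i in range(len(templates))
            for j in range(i + 1, len(templates))
            if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r
        )

    b, a = matches(m), matches(m + 1)
    return float("inf") if a == 0 or b == 0 else -log(a / b)


def dfa_alpha(x, scales=(4, 8, 16, 32, 64)):
    """Slope of log F(n) versus log n; x must be longer than the largest scale."""
    mu = mean(x)
    profile, total = [], 0.0
    for v in x:  # integrated, mean-centred series
        total += v - mu
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        t_mean = (n - 1) / 2
        t_var = sum((t - t_mean) ** 2 for t in range(n))
        sq_sum, count = 0.0, 0
        for start in range(0, len(profile) - n + 1, n):
            seg = profile[start:start + n]
            y_mean = mean(seg)
            # Least-squares line in this window, then squared residuals around it.
            slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg)) / t_var
            sq_sum += sum((y - y_mean - slope * (t - t_mean)) ** 2 for t, y in enumerate(seg))
            count += n
        log_n.append(log(n))
        log_f.append(log(sqrt(sq_sum / count)))
    ln_mean, lf_mean = mean(log_n), mean(log_f)
    return (sum((a - ln_mean) * (b - lf_mean) for a, b in zip(log_n, log_f))
            / sum((a - ln_mean) ** 2 for a in log_n))
```

Under these assumptions, Alpha1 and Alpha2 would be obtained by calling `dfa_alpha` twice with a short-range and a long-range set of scales.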

Results
Using the time domain, frequency domain, and nonlinear analysis methods above, the HRV signals are analyzed for the task-performing and relaxed states of the subjects. First, a t-test is used to analyze the statistical significance of the extracted time domain, frequency domain, and nonlinear features. Then, the features with statistical differences are selected for the classification of mental workload. To exclude the effects of differences in classifier performance, six classification algorithms are selected to identify and classify the mental workload: Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), K-Nearest Neighbor (KNN), Decision Tree (DT), Gentle Boost (GB), and Naive Bayes (NB). Default parameters are used for all six classification algorithms. The HRV samples under the different mental workload states are divided into a training set and a testing set based on 10-fold cross-validation. The classification performance is evaluated by three indicators, accuracy (Acc), sensitivity (Sen), and specificity (Spe), which are defined as follows:

Accuracy: $\mathrm{Acc} = \frac{TP + TN}{TP + FP + TN + FN} \times 100\%$
Sensitivity: $\mathrm{Sen} = \frac{TP}{TP + FN} \times 100\%$
Specificity: $\mathrm{Spe} = \frac{TN}{TN + FP} \times 100\%$

where TP is the number of samples whose predicted and actual values are both positive, FP is the number of samples that are classified as positive but are actually negative, FN is the number of samples that are predicted to be negative but are actually positive, and TN is the number of samples whose predicted and actual values are both negative. In this paper, the task-performing state samples are defined as positive samples and the relaxed state samples are defined as negative samples.
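The three indicators can be computed directly from the predicted and actual labels, as in the short sketch below. The function name and the label encoding (1 for the task-performing state, 0 for the relaxed state) are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (Acc, Sen, Spe) in percent; `positive` marks the task-performing state."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    acc = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    sen = 100.0 * tp / (tp + fn)
    spe = 100.0 * tn / (tn + fp)
    return acc, sen, spe
```

For example, with four task samples of which three are recognized and six relaxed samples of which four are recognized, Acc is 70%, Sen is 75%, and Spe is about 66.67%.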

Statistical Difference Analysis of Features from the Same Subject
Using the t-test, the statistical differences of the time domain, frequency domain, and nonlinear features are analyzed within the same subject between the two states (task-performing state and relaxed state). The sample set of subject1's task-performing state is denoted S1-M and the sample set of subject1's relaxed state is denoted S1-R; the sample sets of subject2 to subject6 are defined by the same rule. Table 4 shows the statistical differences of the time domain, frequency domain, and nonlinear features between the two mental workload states for each of the six subjects. In total, 87 features show the most significant differences (p < 0.001) between the two mental workload states.
Among them, subject1 has 20 features with the most significant differences (p < 0.001), which consist of six time domain features, 10 frequency domain features, and four nonlinear features.
Subject2 has 13 features with the most significant differences (p < 0.001), which consist of six time domain features, five frequency domain features, and two nonlinear features.
Subject3 has 15 features with the most significant differences (p < 0.001), which consist of six time domain features, seven frequency domain features, and two nonlinear features.
Subject4 has 16 features with the most significant differences (p < 0.001), which consist of five time domain features, seven frequency domain features, and four nonlinear features.
Subject5 has nine features with the most significant differences (p < 0.001), which consist of five time domain features and four frequency domain features.
Subject6 has 14 features with the most significant differences (p < 0.001), which consist of five time domain features, five frequency domain features, and four nonlinear features.
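The per-feature significance screening could look roughly like the sketch below. Several choices here are assumptions, not details from the paper: the text only says "a t-test", so the unequal-variance (Welch) statistic is used; the p-value is approximated with the standard normal distribution, which is only reasonable for fairly large sample sets; and the function names and the dictionary layout (feature name mapped to a list of per-segment values) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def approx_p_value(t):
    """Two-sided p-value using a normal approximation to the t distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def select_features(task, relax, alpha=0.05):
    """Keep the features whose task and relaxed samples differ at level alpha."""
    return [name for name in task
            if approx_p_value(welch_t(task[name], relax[name])) < alpha]
```

With `alpha=0.001`, the same function would list the features counted as having the most significant differences.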

Statistical Difference Analysis of Features across Different Subjects
Using the t-test, the statistical differences of the time domain, frequency domain, and nonlinear features are analyzed across different subjects between the two states (task-performing state and relaxed state). The task-performing sample sets of subject1 to subject6 are pooled into the CM group, and the relaxed state sample sets of subject1 to subject6 are pooled into the CR group. Table 5 shows the statistical differences between the two mental workload state sample sets for the time domain, frequency domain, and nonlinear features. It can be seen from Table 5 that 18 features show the most significant differences (p < 0.001) between the CM and CR groups.

Mental Workload Classification Based on the Same Subject
The classification and identification of mental workload are carried out separately for each of the six subjects. The features with statistical differences are selected for the classification of mental workload, and the sample dataset of each experiment is divided into a training set and a testing set. To verify the classification performance of the features, six classification algorithms are used, so six models are trained for each subject; with six experimental subjects, 6 × 6 = 36 models are trained in total. The average value of the 10-fold cross-validation is used as the final experimental result. To ensure the reliability of the experimental results, the 10-fold cross-validation is repeated 100 times.

Figure 6 and Table 6 show the classification results for each subject using the different classifiers. As can be seen from Figure 6a and Table 6, SVM, KNN, and GB show better classification results for subject1, and the KNN classification algorithm shows the highest Spe, Sen, and Acc: 99.26%, 98.86%, and 98.91%, respectively. Figure 6b and Table 6 show that SVM, KNN, GB, NB, and DT give better classification results for subject2, and the KNN classification algorithm shows the highest Spe, Sen, and Acc: 99.99%, 98.94%, and 99.95%, respectively. For subject3 (Figure 6c and Table 6), LDA shows the worst classification performance and the KNN classification algorithm shows the highest Spe, Sen, and Acc: 99.15%, 99.07%, and 98.84%, respectively. As Figure 6d and Table 6 demonstrate, SVM, KNN, GB, and DT show better classification results for subject4; the SVM classification algorithm shows the highest Spe (98.43%) and the KNN classification algorithm shows the highest Sen and Acc: 97.61% and 96.45%, respectively. For subject5 (Figure 6e and Table 6), all five classification algorithms other than LDA show good classification performance, and the SVM classification algorithm shows the best Spe, Sen, and Acc: 99.97%, 99.99%, and 99.97%, respectively. For subject6 (Figure 6f and Table 6), the KNN classification algorithm shows the highest Spe, Sen, and Acc: 98.61%, 99.34%, and 98.64%, respectively.

Finally, the Spe, Sen, and Acc of the six subjects under the different classifiers are presented in box plots (Figure 7). Box plots show not only the average values but also the distribution of the computed values, and abnormal values are marked by red points. As can be seen from the figure, with the KNN classifier all six subjects exhibit the highest Spe, Sen, and Acc with the least overall dispersion, although outliers appear in Spe and Acc. With the SVM classifier, the six subjects also achieve high Spe, Sen, and Acc, and the data are less dispersed. Compared with the KNN and SVM classifiers, the GB classifier shows a larger degree of dispersion, but its classification results are stable. The classification performance of the DT classifier is slightly worse than that of GB. The classification results of the LDA and NB classifiers are the least satisfactory: the Spe, Sen, and Acc of LDA are lower, while those of the NB classifier are the most dispersed.
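The within-subject evaluation protocol (10-fold cross-validation, repeated and averaged) can be sketched with a small pure-Python KNN classifier. This is only an illustration of the protocol: the function names are hypothetical, the choice k = 1 and the Euclidean distance are assumptions because the paper only states that default parameters were used, and accuracy is pooled over all folds of one repetition before averaging over repetitions.

```python
import random
from collections import Counter
from math import dist
from statistics import mean


def knn_predict(train_x, train_y, x, k=1):
    """Label of the majority among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def kfold_accuracy(xs, ys, folds=10, k=1, seed=0):
    """Accuracy (%) of KNN pooled over one shuffled k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]  # every folds-th shuffled index forms one fold
        held = set(test)
        train = [i for i in idx if i not in held]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        correct += sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in test)
    return 100.0 * correct / len(xs)


def repeated_cv_accuracy(xs, ys, repeats=100, folds=10, k=1):
    """Average accuracy over several differently shuffled cross-validations."""
    return mean(kfold_accuracy(xs, ys, folds, k, seed) for seed in range(repeats))
```

Here `xs` would hold the selected feature vectors of one subject and `ys` the corresponding state labels.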

Mental Workload Classification across Subjects
In this sub-section, the performance differences of cross-subject mental workload classification are analyzed. The features with statistical differences are selected for the classification of mental workload. Samples of five subjects are used as the training set, and samples of the left-out subject, who is not involved in the training, are used as the testing set. Since there are six subjects, the validation process is performed six times.

Figure 8 and Table 7 show the cross-subject classification results using the different classifiers. For subject1, the KNN classification algorithm shows the highest Sen (100%), and the GB method shows the highest Spe (100%) and Acc (91.18%). For subject2, the SVM and GB methods show the highest Spe (100%), and the GB method also shows the highest Acc (100%). For subject3, the LDA classification algorithm shows the best classification performance, with Spe, Sen, and Acc of 78.43%, 100.00%, and 89.22%, respectively. For subject4, SVM shows the highest Sen (98.43%), the KNN method shows the highest Acc (95.10%), and the NB method shows the highest Spe (100%). For subject5, SVM shows the highest Acc (81.76%), and the GB, DT, and NB methods show the highest Spe (100%). For subject6, both the SVM and KNN methods show the highest Spe (84.31%), SVM shows the best Acc (91.18%), and the NB method shows the highest Sen (100%).

Finally, the results of cross-subject classification under the different classifiers are presented in box plots (Figure 9). As can be seen from the figure, high maximum values of Spe, Sen, and Acc are reached regardless of the classifier used. However, the figure also shows a more dispersed distribution, with red points representing abnormal values, and the difference between the maximum and minimum values is large. In addition, for each subject, there is a classifier that achieves better classification results.
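The cross-subject validation scheme, in which each subject is held out once while the other five are used for training, can be sketched as below. The function names and the `fit_predict` callback are hypothetical; any of the six classifiers could be plugged in through that callback.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx) once for every distinct subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def cross_subject_accuracy(xs, ys, subject_ids, fit_predict):
    """Acc (%) per held-out subject.

    fit_predict(train_x, train_y, test_x) must return one predicted label
    per test sample; it stands in for any classifier.
    """
    result = {}
    for held_out, train, test in leave_one_subject_out(subject_ids):
        pred = fit_predict([xs[i] for i in train],
                           [ys[i] for i in train],
                           [xs[i] for i in test])
        hits = sum(p == ys[i] for p, i in zip(pred, test))
        result[held_out] = 100.0 * hits / len(test)
    return result
```

Averaging the returned values over the six held-out subjects gives the per-classifier average accuracy of the kind reported for the cross-subject case.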

Discussion
To the best of our knowledge, this is the first work to measure the operator's mental workload in the interaction between a human and a dual-arm robot based on a wearable exoskeleton controller. At present, many studies on mental workload are aimed at the n-back paradigm, simulated driving scenarios, and so on. In the interaction process studied in this paper, the operator uses the wearable controller, and the two arms of the dual-arm robot imitate human arms. This master-slave control mode aims to reduce the operator's burden as much as possible during the interaction. It also excludes differences in operators' limb coordination ability, which helps the operator focus on the task. No study of mental workload in the process of human and dual-arm robot interaction has been found, and there are no corresponding public datasets. Thus, in this paper, ECG signal data are collected, and the HRV is extracted from the ECG signals for analysis.
Studies have shown that a stress response occurs when the mental workload of a human increases [29]. First, the sympathetic nervous system is activated. Then, the entire nervous system responds to the increase in mental workload and raises alertness. Blood is transferred from the internal organs and skin to the skeletal muscles, and the heart rate and heart contraction increase rapidly. These changes allow the body to accumulate large amounts of energy in a short period of time to prepare for external threats. Furthermore, the HRV signal contains information about the regulation of the cardiovascular system by humoral factors, which can reflect fluctuations of the autonomic nervous system. Therefore, it is feasible to use the HRV signal for mental workload analysis.
More specifically, existing studies show that the aTotal feature reflects the overall activity of the autonomic nervous system. LF-related features are thought to be associated with sympathetic activity, and HF-related features are thought to be correlated with parasympathetic activity. The VLF-related features have been associated with long-period rhythms. The relationship between the LF and HF components (LF/HF) is an important indicator of the sympathetic and parasympathetic balance in the body [30,31]. The SDNN index and the HRVTi feature are believed to primarily measure autonomic influence on HRV [32]. Both RMSSD and PNN50 reflect parasympathetic (vagal) activity. Nonlinear features represent the fluctuation characteristics of the autonomic nervous system [33].
In this paper, most of the time domain, frequency domain, and nonlinear features show statistically significant differences between the two mental workload states, both within the same subject and across different subjects. Only a few individual features do not show statistical differences, which may be due to individual differences between subjects. This does not affect the classification of the two mental workload states.
Firstly, this paper analyzes the different mental workload states of the same subject. The results show that, for subject1-subject6, the highest Acc are 98.91% (KNN), 99.95% (KNN), 98.84% (KNN), 96.45% (KNN), 99.97% (SVM), and 98.64% (KNN), respectively. When the same classifier is used to identify the six subjects separately, the KNN classifier has the highest average recognition accuracy (98.77%). The SVM and GB classifiers also classify well, with Acc of 97.54% and 95.90%, respectively. None of the remaining three classifiers (LDA, NB, DT) reaches a classification accuracy above 90%. Therefore, when the model is trained on sample data from a subject and used to classify the mental workload of that same subject, the KNN algorithm is the most suitable for human and dual-arm robot interaction.
Then, the different mental workload states are classified across subjects. The results show that, for subject1-subject6, the highest Acc are 91.18% (GB), 100% (GB), 89.22% (LDA), 95.10% (KNN), 81.76% (SVM), and 91.18% (SVM), respectively. Thus, when the best classifier is chosen for each subject, the average classification accuracy over the six subjects is 91.41%. When the same classifier is used for all six subjects, the highest average accuracy of cross-subject identification is 80.56% (GB). SVM and KNN also show good classification results, with classification accuracies of 78.51% and 73.53%, respectively. In cross-subject identification, the best-performing classifier differs from subject to subject. Therefore, in future work, multiple classifiers should be combined, with a voting method used to select the best classification result.
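The cross-subject evaluation (train on five subjects, test on the remaining one) and the proposed voting step can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: the data layout, the function names `loso_accuracy`, `knn_predict`, and `majority_vote`, the value k = 3, and the toy feature vectors are all hypothetical, and plain majority voting is only one possible reading of the voting method mentioned above.

```python
import math
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training samples.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], x))[:k]
    return majority_vote(label for _, label in nearest)


def loso_accuracy(data, k=3):
    """Leave-one-subject-out accuracy.

    data maps a subject id to a list of (feature_vector, label) pairs.
    Each subject is held out once; the model is trained on all the others.
    """
    scores = {}
    for held_out, test in data.items():
        train = [
            sample
            for subject, samples in data.items()
            if subject != held_out
            for sample in samples
        ]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores[held_out] = correct / len(test)
    return scores


# Made-up two-feature samples for three hypothetical subjects.
toy_data = {
    "s1": [((0.0, 0.0), "rest"), ((1.0, 1.0), "task")],
    "s2": [((0.1, 0.0), "rest"), ((0.9, 1.1), "task")],
    "s3": [((0.0, 0.2), "rest"), ((1.1, 0.9), "task")],
}
toy_scores = loso_accuracy(toy_data)
```

The same `majority_vote` helper could combine the per-sample predictions of several classifiers (for example GB, SVM, and KNN) into a single label, which is one way to reduce the dependence on a single best classifier per subject.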
The analysis of mental workload is related to specific tasks, and the study of mental workload in the process of master-slave interaction between a wearable controller and a dual-arm robot has not been reported. Therefore, this paper compares its results with studies of mental workload or stress in other scenarios. In [34], a pilot study is conducted on whether machine learning can predict a decrease in stress after relaxation on the basis of a wearable sensor. The states before and after relaxation are classified using ECG and GSR signals, with a classification accuracy of 79.2%. In [35], the detection of drivers' anxiety based on physiological signals is studied. The results show that classification on the basis of EEG alone gives the best accuracy, 77.01%. In [36], cross-subject mental workload classification is studied on the basis of kernel spectral regression and transfer learning techniques. An average Acc of 72.66% is obtained for six subjects, whose individual Acc are 73.15%, 77.32%, 78.63%, 65.40%, 71.08%, and 70.36%, respectively. In [37], the mental workload of human and robot collaboration is analyzed using wearable sensors. However, that work is limited to a statistical analysis of HRV signals in different mental workload states and does not study classification or identification. In this paper, the data of two different mental workload states are collected and 20 kinds of HRV features are extracted. Then, the statistical significance of the HRV signal features in the different states is analyzed. The features with statistical differences (p < 0.05) are selected for the identification and analysis of mental workload. Both the models trained with data from the same subject and the models trained across different subjects obtain higher Acc than those reported in [34][35][36][37].
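The feature-selection step (keeping only features with p < 0.05) can be sketched as follows. The statistical test is not restated in this section, so the sketch substitutes a two-sided permutation test on the difference of means, which needs only the standard library. The function names, the number of permutations, the random seed, and the feature values are assumptions for illustration and are not taken from the experiment.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of means of a and b.

    Assumed stand-in for the significance test used in the paper.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def select_features(task, rest, alpha=0.05):
    """Keep the features whose task and rest values differ with p < alpha.

    task and rest map a feature name to a list of values for that state.
    """
    return [
        name
        for name in task
        if permutation_p_value(task[name], rest[name]) < alpha
    ]


# Made-up values: "RMSSD" differs clearly between states, "dummy" does not.
task_values = {
    "RMSSD": [20, 22, 21, 23, 19, 24, 22, 21],
    "dummy": [1, 2, 3, 4, 5, 6, 7, 8],
}
rest_values = {
    "RMSSD": [40, 42, 41, 39, 43, 44, 40, 42],
    "dummy": [1, 2, 3, 4, 5, 6, 7, 8],
}
selected = select_features(task_values, rest_values)
```

Only the features returned by `select_features` would then be passed to the classifiers, which mirrors the order of steps described above: extract features, test for statistical differences, then classify.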
In addition, the heartbeat data collection device used in this paper is custom-built. Its functionality can be modified on demand, and it is inexpensive. However, with the rapid development of consumer electronics, most existing smart watches now have heartbeat monitoring capabilities, which makes them more suitable for long-term monitoring. Thus, in future research, smart watches will be considered as the heartbeat data collection device.

Conclusions
Robots remotely controlled by humans perform complex or dangerous tasks in unstructured environments, which expands the scope of human work. In the process of completing these tasks, the mental workload of the operator changes with the task of the robot. However, excessive mental workload not only affects the robot's working efficiency and safety, but also impacts the operator's physical and mental health. In this paper, HRV is studied as a measure for assessing the mental workload during human interaction with a dual-arm robot. Firstly, the ECG signals of two kinds of mental workload states (task-performing state and relaxed state) are collected from six subjects using a custom device. The HRV signal is then obtained from the ECG signal. Next, 20 kinds of HRV features (time domain, frequency domain, and nonlinear features) are extracted. Finally, six different classifiers are used for mental workload classification. The results are as follows. When each subject's own HRV signal is used to train the model and classify that subject's mental workload, an average classification accuracy of 98.77% is obtained using the KNN method. When the HRV signals of five subjects are used for training and those of the remaining subject for testing, the GB method obtains the highest average classification accuracy over the six subjects, 80.56%. This study has demonstrated that HRV can be used to measure the mental workload during human interaction with a dual-arm robot.

Conflicts of Interest:
The authors declare no conflict of interest.