Sample-Entropy-Based Method for Real Driving Fatigue Detection with Multichannel Electroencephalogram

Safe driving plays a crucial role in public health, and driver fatigue causes a large proportion of crashes in road driving. Hence, this paper presents the development of an efficient system to determine whether a driver is fatigued during real driving based on 14-channel EEG signals. The complexity of the EEG signal is then quantified with the sample entropy method. Finally, we explore the performance of multiple kernel-based algorithms based on sample entropy features for classifying fatigue and normal subjects by only analyzing noninvasive scalp EEG signals. Experimental results show that the highest classification accuracy of 97.2%, a sensitivity of 95.6%, a specificity of 98.9%, a precision of 98.9%, and the highest AUC value of 1 are achieved using SampEn feature and cubic SVM classifier (SCS Model). It is hence concluded that SampEn is an effectively distinguishing feature for classifying normal and fatigue EEG signals. The proposed system may provide us with a new and promising approach to monitoring and detecting driver fatigue at a relatively low computational cost.


Introduction
Traffic safety is one of the most serious problems [1,2]. Therefore, it is necessary to comprehensively analyze the causes of traffic accidents, considering drivers, vehicles, and environmental factors, and to propose corresponding solutions to reduce the frequency of accidents and the number of casualties [3][4][5][6]. The car driver is responsible for perceiving surrounding information and controlling the car and is the most critical part of the entire system. Drivers need to focus on the driving process and comprehensively analyze the car's movement status, traffic conditions, and other information to determine the correct driving strategy and take the appropriate driving action. Therefore, car driving requires the driver not only to exert a certain amount of physical strength but also to bear a certain amount of mental load. When the driver's load is too high or too low, it may cause decision-making or operational errors, leading to traffic accidents [5].
With the increase in traffic density, complexity of roads, and use of information equipment in the car, driving has become a more complex interactive activity [7]. Drivers are often under double-tasking or multitasking conditions, and their attention may be distracted from tasks that are directly related to the driving task, such as the perception of the environment and car manipulations. However, the ability of the brain to process information has certain limitations. When it is in a complex multitasking environment, other tasks that are not directly related to the driving task are bound to compete with the driving task in terms of brain resources, thus increasing the mental stress of the driver. For a driver, a high level of mental stress can easily lead to a sense of tension, distractions, and errors in perception and decision-making, which can lead to traffic accidents [8][9][10]. In particular, long-term and long-distance driving is more likely to induce driver fatigue.
Fatigue is one of the crucial factors resulting in a large proportion of crashes in road driving in the transportation industry [11,12]. Currently, existing fatigue-detection technologies mainly rely on driving behavior, vehicle driving information, and the driver's physiological characteristics [13][14][15]. Among these three technologies, driving fatigue is highly linked to physiological signals [13].
When developing a driver fatigue detection system, EEG is generally the most common noninvasive way to evaluate driver fatigue and is also a low-cost means of measurement compared to recent electrical neuroimaging modalities [14,15]. The method proposed in [16] classified the driver distraction level from wireless EEG signals, based on the fusion of discrete wavelet packet transform and FFT and three classifiers (subtractive fuzzy clustering, probabilistic neural network, and K nearest neighbor), obtaining a highest classification accuracy of 85%. In another study [17], the authors collected 16 channels of EEG signals, calculated 12 types of energy parameters as features, and employed a regression equation to classify fatigue and nonfatigue signals from the Fp1 and O1 channels, reporting an accuracy of 91.5%. Time, spectral, and wavelet analysis with a neural network classifier were proposed to detect the drowsiness stage in EEG records in [18]; this method achieved accuracies of 87.4% and 83.6% for alertness and drowsiness, respectively.
The aim of this study is to develop an efficient system to determine whether a driver is fatigued during real driving based on 14-channel EEG signals. A novel system combining sample entropy and multiple machine learning algorithms is proposed. The novelty of this study can be summarized as follows: (1) development of an efficient system to determine whether a driver is fatigued during real driving (recognizing the state of the driver in a real driving environment is a challenging and rewarding task); (2) development of multiple kernel-based algorithms based on sample entropy features for classifying fatigue and normal subjects by only analyzing noninvasive scalp EEG signals; (3) achievement of high identification performance at relatively low computational cost (the highest classification accuracy of 97.2%, a sensitivity of 95.6%, a specificity of 98.9%, a precision of 98.9%, the highest AUC value of 1, and a training time of 1.225 s with the cubic SVM algorithm).
The rest of the paper is organized as follows: Section 2 provides a detailed description of methods, including experiment design and data collection. The results and discussion of the experiment are shown in Section 3. Finally, the conclusion is presented in Section 4.

Participants
Ten participants aged between 32 and 40 years (mean age: 36.6) were recruited for this study. None of the participants reported any neurological or psychiatric disorders. All reported normal or corrected to normal vision and normal hearing. All participants had owned their driving licenses for at least two years. They were asked to have a good sleep and refrain from drinking coffee, alcohol, or tea within 48 h before the experiment [19]. Each participant was informed of the experimental procedure. They read the consent and signed it. The participants were paid for their participation in the experiment.

Task and EEG Data
In the experiment phase, the real driving route (about 254 km) was from Shenyang (41.78° N, 123.43° E) to Dandong (40.12° N, 124.38° E), Liaoning, China. Each drive lasted about 3 h, and almost the entire route was on a monotonous highway. To reduce artifacts in the physiological data recording, a car with automatic transmission was selected, and the use of the radio was banned during the experiment [8]. The driving experiments started at about 13:00. To avoid the influence of circadian fluctuations, the experiments were scheduled at the same time each day.
Each participant was asked to keep the motor vehicle on the track as accurately as possible and to avoid other vehicles. EEG signals were continuously collected during driving with an EMOTIV EPOC, a wireless device that is convenient to deploy under real driving conditions and inexpensive compared with other EEG acquisition equipment. EEG data were recorded from 14 channels (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4; see Figure 1) arranged according to the international 10-20 standard, with two reference electrodes located at the bilateral posterior mastoid [20]. The participants' EEG signals were captured by the neuroheadsets and then wirelessly transmitted to an external laptop via the USB receivers for further processing. The sampling frequency was 128 Hz. Subsequently, the first three minutes and the last three minutes of each participant's 3 h recording were selected for further preprocessing. Undesired artifacts in the EEG signals, such as eye blinks, were reduced by using independent component analysis (ICA) in the EEGLAB software. The subjective fatigue level of participants was assessed before and after the real driving experiments with the Stanford Sleepiness Scale (SSS) [21] (1-2 = conscious, 3-4 = slight fatigue, 5 = medium fatigue, 6-7 = severe fatigue). Facial video was recorded using a video camera installed in front of the participants.

Feature Extraction
Compared to other available feature extraction techniques, an entropy-based method is more suitable for describing nonlinear and unstable dynamic EEG signals. Sample entropy (SampEn) is a measure of time series complexity proposed by Richman and Moorman [22]. Entropy is originally a thermodynamic concept that measures the degree of chaos (disorder) in a thermodynamic system. After the establishment of information theory, the concept and theory of entropy were developed further. As a nonlinear dynamic parameter for measuring the incidence of new information in time series, entropy has been applied in many scientific fields [23]. Sample entropy is an improvement of the approximate entropy algorithm that is intended to reduce its error; the two methods are similar, but sample entropy has better precision [24]. Compared with approximate entropy, sample entropy has two major advantages: first, sample entropy does not include the comparison of a data segment with itself, and its calculation is largely independent of the data length; second, sample entropy has better consistency. Therefore, in the present work, we quantify the complexity of the EEG signal with the sample entropy method. The sample entropy is calculated as follows:

1. For a time series consisting of N data points, $x(1), x(2), \ldots, x(N)$, construct the $m$-dimensional vectors $X_m(i) = [x(i), x(i+1), \ldots, x(i+m-1)]$, $1 \le i \le N-m+1$.

2. Define the distance $d[X_m(i), X_m(j)]$ between vectors $X_m(i)$ and $X_m(j)$ as the maximum absolute difference between their corresponding elements:

$$d[X_m(i), X_m(j)] = \max_{0 \le k \le m-1} \left| x(i+k) - x(j+k) \right|$$

3. For a given $X_m(i)$, count the number of $j$ ($1 \le j \le N-m$, $j \ne i$) for which the distance between $X_m(i)$ and $X_m(j)$ is less than or equal to $r$, and record it as $B_i$. Then:

$$B_i^m(r) = \frac{B_i}{N-m-1}$$

4. Define $B^m(r)$ as:

$$B^m(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} B_i^m(r)$$

5. Increase the number of dimensions to $m+1$, count the number of $j$ ($1 \le j \le N-m$, $j \ne i$) for which the distance between $X_{m+1}(i)$ and $X_{m+1}(j)$ is less than or equal to $r$, and record it as $A_i$. Then:

$$A_i^m(r) = \frac{A_i}{N-m-1}$$

6. Define $A^m(r)$ as:

$$A^m(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} A_i^m(r)$$

In this way, $B^m(r)$ is the probability of two sequences matching for $m$ points under the similarity tolerance $r$, and $A^m(r)$ is the probability of two sequences matching for $m+1$ points. The sample entropy is defined as:

$$\mathrm{SampEn}(m, r) = \lim_{N \to \infty} \left\{ -\ln \frac{A^m(r)}{B^m(r)} \right\}$$

For a finite number of data points $N$, it is estimated as:

$$\mathrm{SampEn}(m, r, N) = -\ln \frac{A^m(r)}{B^m(r)}$$
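The steps above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration (quadratic in N) and not the implementation used in the study; the function name `sample_entropy` and the choices m = 2 and r = 0.2 times the standard deviation in the usage lines are assumptions taken from common practice, since the values of m and r are not stated here. Because A^m(r) and B^m(r) share the same normalization factors, the ratio reduces to a ratio of raw match counts.

```python
import math
import random
import statistics
from typing import Sequence


def sample_entropy(x: Sequence[float], m: int, r: float) -> float:
    """Estimate SampEn(m, r, N) = -ln(A / B) with the maximum-difference distance."""
    n = len(x)
    if n <= m + 1:
        raise ValueError("time series is too short for embedding dimension m")

    def count_matches(dim: int) -> int:
        # Compare the first N - m templates of length `dim`; each unordered
        # pair (i, j) with i < j is counted once, so self-matches are excluded.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(dim)) <= r:
                    count += 1
        return count

    b = count_matches(m)      # template matches of length m
    a = count_matches(m + 1)  # template matches of length m + 1
    if a == 0 or b == 0:
        # The logarithm is undefined without matches; returning infinity is a
        # convention chosen for this sketch only.
        return float("inf")
    return -math.log(a / b)


# Usage on synthetic data (not EEG): a regular signal versus uniform noise.
periodic = [0.0, 1.0] * 25
rng = random.Random(0)
noise = [rng.random() for _ in range(200)]

se_periodic = sample_entropy(periodic, m=2, r=0.1)
se_noise = sample_entropy(noise, m=2, r=0.2 * statistics.pstdev(noise))
print(se_periodic, se_noise)
```

A perfectly regular series yields a SampEn of zero, whereas an irregular series yields a larger value, which is the sense in which SampEn quantifies complexity.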

Classification
Since there is no uniform and efficient classification method for all applications, in general, it may be useful to test and compare multiple classification methods [14]. In the present work, we explored multiple classification algorithms (K nearest neighbor (KNN), support vector machine (SVM), logistic regression (LR)) to classify EEG signals, and these classification algorithms were carried out using MATLAB R2018b. The sample entropy of different channels was fed as input to the multiple classification algorithms, and the target variables were fatigue state and normal state. For fine KNN and medium KNN, the distance was computed using Euclidean distance and distance weight was equal. For cubic KNN, the distance was computed using Minkowski distance and distance weight was equal. We obtained the highest accuracy when the number of neighbors was 1 for fine KNN and the number of neighbors was 10 for medium KNN and cubic KNN. Additionally, SVMs with three kinds of kernel functions (linear, quadratic, and cubic) were explored and compared, and the kernel scale was automatic.
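To make the KNN settings above concrete, the following sketch implements a plain K-nearest-neighbor vote with a Minkowski distance and equal distance weights (one vote per neighbor). It is an illustration only and not the MATLAB implementation used in the study. The function names, the two-dimensional toy feature vectors, and the Minkowski exponent p = 3 for the "cubic" variant are assumptions; p = 2 gives the Euclidean distance.

```python
from collections import Counter
from typing import List, Sequence


def minkowski(a: Sequence[float], b: Sequence[float], p: float) -> float:
    """Minkowski distance of order p (p = 2 is the Euclidean distance)."""
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)


def knn_predict(train_x: List[Sequence[float]], train_y: List[str],
                query: Sequence[float], k: int, p: float = 2.0) -> str:
    """Majority vote among the k nearest training points, equal weights."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: minkowski(train_x[i], query, p))
    votes = Counter(train_y[i] for i in ranked[:k])
    # Ties are resolved in favor of the label encountered first.
    return votes.most_common(1)[0][0]


# Hypothetical toy features (two values per sample), not measured data.
train_x = [[1.9, 2.0], [2.1, 1.8], [2.0, 2.2],
           [1.0, 1.1], [0.9, 1.2], [1.1, 0.8]]
train_y = ["normal"] * 3 + ["fatigue"] * 3

fine = knn_predict(train_x, train_y, [1.05, 1.0], k=1, p=2.0)   # one neighbor, Euclidean
cubic = knn_predict(train_x, train_y, [2.0, 1.9], k=3, p=3.0)   # Minkowski with p = 3
print(fine, cubic)
```

With k = 1 the prediction is simply the label of the closest training sample, which is why this setting is sensitive to individual noisy samples but can follow fine-grained class boundaries.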
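The performance of the classifiers considered in the present work was evaluated in terms of sensitivity (SEN), specificity (SPE), precision (PRE), and accuracy (ACC), using the following equations:

$$\mathrm{SEN} = \frac{TP}{TP + FN} \times 100$$

$$\mathrm{SPE} = \frac{TN}{TN + FP} \times 100$$

$$\mathrm{PRE} = \frac{TP}{TP + FP} \times 100$$

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \times 100$$

where TP = true positive, representing the correctly detected fatigue signals; FN = false negative, representing the undetected fatigue signals; FP = false positive, representing normal signals wrongly detected as fatigue; and TN = true negative, representing the correctly identified normal signals.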
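As a worked check of these four measures, the short sketch below computes them from confusion-matrix counts. The function name is an assumption; the counts in the usage example are the ones reported for the quadratic SVM classifier in the Results section (171 true positives, 9 false negatives, 4 false positives, 176 true negatives).

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, precision, and accuracy, all in percent."""
    return {
        "SEN": 100.0 * tp / (tp + fn),
        "SPE": 100.0 * tn / (tn + fp),
        "PRE": 100.0 * tp / (tp + fp),
        "ACC": 100.0 * (tp + tn) / (tp + tn + fp + fn),
    }


quadratic_svm = classification_metrics(tp=171, fn=9, fp=4, tn=176)
print({name: round(value, 1) for name, value in quadratic_svm.items()})
# -> {'SEN': 95.0, 'SPE': 97.8, 'PRE': 97.7, 'ACC': 96.4}
```

The rounded values agree with the quadratic SVM figures quoted in the Results.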
Furthermore, a receiver operating characteristic (ROC) curve was applied to evaluate the performance of these classification algorithms. The area under the ROC curve (AUC) was also computed to evaluate the classification performance quantitatively.

Results
Self-reporting plays an important role in the physiological indicators of driving fatigue. Figure 2 presents the variation in participants' SSS scores. The results illustrate that participants were not fatigued before the real driving experiments and were quite fatigued after the real driving experiments. The SSS scores increased significantly from 1.8 to 5 (t = −12.829, p < 0.01, df = 9) after the real driving experiments. Subjective indicators reveal that a three-hour continuous real driving experiment does lead to an increase in fatigue.

Compared to the normal state, a lower SampEn value can be observed for almost all channels in the fatigue state (Figure 3).
The SampEn was given as input to the different classifiers considered in the present work. Ten-fold cross-validation was adopted to obtain a reliable estimate of the classification performance of these classifiers. Specifically, the 1800 normal and 1800 fatigue EEG samples were divided into ten sub-datasets. Each time, 1 of the 10 sub-datasets was used as the test dataset, and the other 9 sub-datasets were combined to form the training dataset; this procedure was repeated 10 times. Finally, the average accuracy over the 10 repetitions was calculated. Table 1 lists the performance of the machine learning algorithms considered in this work on the testing dataset. Figure 4 shows 3D confusion matrices for the SampEn feature approach using multiple machine learning algorithms. Classification performance is affected by kernel functions [15]; therefore, we explored multiple classification algorithms (K nearest neighbor (KNN), support vector machine (SVM), logistic regression (LR)) with different kernel functions. From Table 1 and Figure 4, we can see that the detection accuracies of the EEG records based on the SampEn feature vary from 90.0% to 97.2%. In particular, the proposed methodology achieves the highest accuracy of 97.2% with the cubic SVM classifier. The best performance result is indicated in bold. With this classifier, 95.6% of the fatigue EEG features are correctly classified as fatigue states, and 4.4% of the fatigue EEG features are wrongly classified as normal states. Additionally, 98.9% of the normal EEG features are correctly classified as normal states, and 1.1% of the normal EEG features are wrongly classified as fatigue states.
From Figure 5, it can be seen that the proposed methodology achieves the highest sensitivity (SEN) of 95.6%, highest precision (PRE) of 98.9%, and highest specificity (SPE) of 98.9% for the cubic SVM classifier. In addition, the quadratic SVM classifier also shows good performances, with ACC, PRE, SEN, and SPE being 96.4%, 97.7%, 95.0%, and 97.8%, respectively, with the SampEn feature. It detected 171 true positives, 9 false negatives, 4 false positives, and 176 true negatives.
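The ten-fold procedure used for these results can be sketched as an index split. This is a minimal illustration under stated assumptions, not the partitioning used in the study: the function name, the random seed, and the stratification by class (which keeps 180 normal and 180 fatigue samples in every test fold) are choices made here. A test fold of 360 samples is consistent with the quadratic SVM counts above, which sum to 171 + 9 + 4 + 176 = 360.

```python
import random
from typing import List, Sequence, Tuple


def stratified_k_fold(labels: Sequence[int], k: int = 10,
                      seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs with balanced classes."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    splits = []
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((train, test))
    return splits


labels = [0] * 1800 + [1] * 1800  # 0 = normal, 1 = fatigue
splits = stratified_k_fold(labels, k=10)
train, test = splits[0]
print(len(splits), len(train), len(test))
# -> 10 3240 360
```

The reported accuracy is then the mean of the ten per-fold accuracies obtained by training on each `train` set and evaluating on the matching `test` set.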

Discussion
To illustrate the relationship between true positive rate and false positive rate in an intuitive way, we used a receiver operating characteristic (ROC) graph to evaluate the performance of the proposed method [25,26]. The area under the curve of the ROC (AUC) was calculated to assess the classification performance. Figure 6 presents the ROC curves for cubic SVM (a) and quadratic SVM (b). The y-axis represents the true positive rate and the x-axis represents the false positive rate. More specifically, a perfect result with no misclassified points is a right angle to the top left of the plot. A poor result that is no better than random is a line at 45 degrees. The AUC is a measure of the overall quality of the classifier. Larger AUC values demonstrate better classifier performance [25]. In Figure 6, the two ROC curves for cubic SVM classifier (a) and quadratic SVM classifier (b) show the curves plotted in the upper left, which indicates that these two classifiers achieve good performance with the SampEn feature. The result also shows that the ROC curve for the cubic SVM classifier achieves the best upper left curve compared to the quadratic SVM classifier. Additionally, Figure 6 shows that these two classifiers with the SampEn feature achieve the best performance result with the highest AUC of 1.
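For readers who want to reproduce an AUC value, the sketch below computes it through its rank interpretation: the probability that a randomly chosen positive (fatigue) sample receives a higher classifier score than a randomly chosen negative (normal) sample, with ties counted as one half. The function name and the toy scores are assumptions made for illustration; they are not outputs of the classifiers in this study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute the AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: every fatigue sample outranks every normal sample.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.9]
auc = roc_auc(scores, labels)

# Accuracy at a fixed decision threshold of 0.5 on the same scores.
predictions = [1 if s >= 0.5 else 0 for s in scores]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(auc, round(accuracy, 3))
# -> 1.0 0.833
```

Because the AUC depends only on the ranking of the scores, it can reach 1 even when the accuracy at one particular decision threshold is below 100%.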
To simulate cross-subject driver fatigue recognition, we carried out leave-one-subject-out cross-validation on our dataset. In the leave-one-subject-out cross-validation setting, the data from nine subjects were pooled together and used as the training data to train the classifiers, and the data from the remaining subject were held out from the dataset and reserved as test data. The recognition accuracy was then evaluated on the remaining subject. This procedure was repeated for all 10 subjects. Leave-one-subject-out cross-validation allows us to assess the adaptability of previously trained classifiers to new individuals. Table 2 shows the accuracy of leave-one-subject-out fatigue recognition across multiple machine learning algorithms. Figure 7 shows confusion matrices for leave-one-subject-out fatigue recognition with fine KNN. Performance measures (sensitivity, specificity, precision) of leave-one-subject-out fatigue recognition are presented in Figure 8. It is worth noting that, across all 10 subjects, the highest average accuracy of 97.8% was obtained with fine KNN with one neighbor, Euclidean distance, and equal distance weight.
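The leave-one-subject-out protocol differs from ten-fold cross-validation only in how the samples are grouped, as the following sketch shows. The function name and the layout of the `subject_ids` list (360 samples for each of 10 subjects, which is 3600 samples divided evenly) are assumptions made for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_subject_out(
    subject_ids: Sequence[int],
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical layout: subjects 1..10 with 360 samples each.
subject_ids = [subject for subject in range(1, 11) for _ in range(360)]
loso_splits = list(leave_one_subject_out(subject_ids))
held_out, train, test = loso_splits[0]
print(len(loso_splits), held_out, len(train), len(test))
# -> 10 1 3240 360
```

No sample from the held-out subject appears in the training indices, so each accuracy reflects performance on a person the classifier has never seen.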
For a valid comparison, we used the same experimental data to compare the method considered in the present work with other published methods in terms of classification performance. We adopted the method proposed in [16], using probabilistic neural network and K nearest neighbor for different wavelets (fusion of discrete wavelet packet transform and FFT), and achieved classification accuracy rates of 92% and 89%, respectively. We also applied the method proposed in [17], calculating 12 types of energy parameters as features and using the regression equation to classify fatigue and nonfatigue signals from the O1 channel; the accuracy was 93%. Finally, we used time, spectral, and wavelet analysis with a neural network classifier, as introduced in [18], and obtained an accuracy of 89.5%.
In the present work, we developed a fatigue detection system using the SampEn feature and a cubic SVM classifier (SCS Model) during real driving based on EEG signals, which gives the highest accuracy of 97.2%, a sensitivity of 95.6%, a specificity of 98.9%, a precision of 98.9%, and the highest area under the receiver operating characteristic curve (AUC = 1). In addition, compared with current EEG-based fatigue-detection technologies, our system uses only the SampEn feature. The merit of our proposed system is that it is very effective for the analysis of short time series and achieves high detection performance with low computational cost. For the deployment of the fatigue detection system proposed in this research, the driver's EEG signals are continuously collected by the EMOTIV EPOC and then wirelessly transmitted to an external laptop via the USB receivers, where they are passed to the SCS model. Finally, the model judges whether the driver is in a state of fatigue.

Conclusions
Driver fatigue causes a large proportion of crashes in road driving. Therefore, in this study, we developed a fatigue detection system using the SampEn feature from 14 channels and seven classifiers, namely LR, linear SVM, quadratic SVM, cubic SVM, fine KNN, medium KNN, and cubic KNN, during real driving based on EEG signals. Experimental results show that the highest classification accuracy of 97.2%, a sensitivity of 95.6%, a specificity of 98.9%, a precision of 98.9%, and the highest AUC value of 1 are achieved using the SampEn feature and the cubic SVM classifier. It is hence concluded that SampEn is an effectively distinguishing feature for classifying normal and fatigue EEG signals. The proposed system may provide us with a new and promising approach to monitoring and detecting driver fatigue.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by "NEU" Ethics Committee on 8 June 2014 (01_08/06/14).
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.