A Feature Extraction Method Based on Differential Entropy and Linear Discriminant Analysis for Emotion Recognition

Feature extraction of electroencephalography (EEG) signals plays a significant role in the wearable computing field. In practical applications of EEG emotion computing, researchers often use edge computing to reduce data transmission times. However, because EEG involves a large amount of data, how to extract features effectively while reducing the amount of computation remains a focus of research. Many EEG feature extraction methods have been proposed, but they suffer from problems such as high time complexity and insufficient precision. The main purpose of this paper is to introduce an innovative method for obtaining reliable distinguishing features from EEG signals. The method combines differential entropy with Linear Discriminant Analysis (LDA) and can be applied to feature extraction of emotional EEG signals. We use a three-category emotion EEG dataset to conduct experiments. The experimental results show that the proposed feature extraction method can significantly improve the performance of EEG classification: compared with the result on the original dataset, the average accuracy increases by 68%, which is 7% higher than the result obtained when only differential entropy is used for feature extraction. The total execution time shows that the proposed method has a lower time complexity.


Introduction
Electroencephalography (EEG) is a means of data acquisition via sensors [1,2]. In practical applications, because of the large amount of data involved in EEG, the collected data is often sent to a mobile edge computing server for edge computing. The Brain-Computer Interface (BCI), also known as the direct neural interface, is an interdisciplinary cutting-edge technology. It is a direct connection pathway established between a human or animal brain (or a culture of brain cells) and an external device [3][4][5][6][7]. The role of BCI is to establish communication between the human brain and external computers or other intelligent electronic devices [8,9]. Emotion recognition is a very important part of BCI. It generally refers to the acquisition of an individual's physiological or non-physiological signals to automatically identify the individual's emotional state.
The remaining sections are organized as follows: Section 2 introduces the classification methods used in this paper, including logistic regression (LR), support vector machine (SVM), k-nearest neighbors (k-NN), random forest (RF), and multilayer perceptron (MLP). Section 3 introduces the public dataset used in this paper and the fusion of differential entropy and the LDA method. Section 4 gives a performance comparison of experiments with different methods. In Section 5, the current state of the brain-computer interface and the emotional classification field are discussed. In Section 6, a conclusion is given and future work is described.

Related Works
This section introduces five classic machine-learning methods. In previous research, these methods have been applied to emotion recognition and other classification recognition based on EEG signals.
Logistic regression (LR) is a classic classification method that is widely used in various fields, including brainwave classification [23][24][25]. To solve a classification problem, the LR method first establishes a cost function, then iteratively solves for the optimal model parameters with an optimization method, and finally verifies the quality of the resulting model. In this paper, the L2 norm is used as a regularization term to prevent overfitting.
The support vector machine (SVM) is a classic machine-learning method that is widely used in classification based on EEG signals [26][27][28]. If the data is not linearly separable, a kernel function is used. Common kernel functions include the linear, polynomial, radial basis function (RBF), and sigmoid kernels. In this paper, we use the linear kernel for the SVM method.
The k-nearest neighbors (k-NN) method is a classical data-mining method that has been used in EEG emotion recognition for many years [29,30]. It is based on measuring the distance between the feature values of different samples. Its main idea is that a new sample in the feature space is assigned to the most frequent category among the k samples most similar to it. In general, k is an integer no greater than 20.
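The voting rule described above can be illustrated with a minimal sketch. It assumes Euclidean distance and uses made-up two-dimensional points; the function name `knn_predict` and the toy data are hypothetical and are not taken from the experiments in this paper.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training samples closest to query."""
    # Euclidean distance from the query to every training sample (an assumption;
    # the text does not fix a particular distance measure).
    distances = [(math.dist(x, query), label) for x, label in zip(train_x, train_y)]
    # Keep the k nearest neighbours (smallest distances first).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of the neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data for illustration only.
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["negative", "negative", "positive", "positive", "positive"]
prediction = knn_predict(train_x, train_y, (0.8, 0.8), k=3)
```

Here the three samples nearest to the query all carry the label "positive", so the vote is unanimous. With an even k or more than two classes, ties are possible and a tie-breaking rule would be needed.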
Random forest (RF) is an improvement on the decision-tree method and is also widely used in the classification of EEG signals [31][32][33]. RF is an algorithm that integrates multiple trees through the idea of ensemble learning. Its basic unit is the decision tree, and it belongs to the ensemble learning branch of machine learning. Intuitively, each decision tree is a classifier, so for an input sample, N trees produce N classification results. The random forest integrates all the classification results by a voting strategy, so the most frequent category is the final output.
Finally, the multilayer perceptron (MLP), a kind of artificial neural network, is often used in various fields, including EEG signal classification [34,35]. The MLP model consists of an input layer, an output layer, and one or more hidden layers. The layers are generally fully connected and use the sigmoid or tanh function as the activation function. MLP can implement nonlinear discriminants. Studies have shown that any continuous function can be approximated by an MLP: an MLP with a single hidden layer (with no limit on the number of hidden nodes) can approximate any continuous nonlinear function from input to output.

Dataset
The experiment was conducted using a public emotional EEG dataset called SEED, which uses film clips as emotion-inducing materials and includes three categories: positive, neutral, and negative emotions. In each experiment, the participants watched movie clips eliciting different emotional states. Each clip was played for about four minutes. Three types of movie clips were played, and each type contained five clips, giving a total of 15 clips, all taken from Chinese movies. There was a five-second prompt before each clip, 45 seconds of feedback time after playback, and 15 seconds of rest after watching. A total of 15 subjects participated in the experiment (seven males and eight females, mean age 23.27 years with a standard deviation of 2.37), all of whom had normal visual, auditory, and emotional states. The EEG signal was recorded through an electrode cap while the subject was watching the movie clips, with a sampling frequency of 1000 Hz. The experiment used the international 10-20 system and a 62-channel electrode cap. Each volunteer participated in three experiments, and consecutive experiments were separated by about one week. Therefore, a total of 675 (15 × 15 × 3) data samples were formed. The signals were then down-sampled to 200 Hz and filtered to 0.5-70 Hz to obtain a preprocessed EEG dataset. For more information on the dataset, please refer to the website http://bcmi.sjtu.edu.cn/~seed/index.html.

Methods
Feature extraction with the differential entropy algorithm has been widely used in image and signal processing [36,37]. This method can effectively extract information that may be valid in the sample. The feature extraction method proposed in this paper uses the differential entropy algorithm to decompose the signal, remove the noise in the EEG signal, and extract important features. LDA is a classic algorithm for pattern recognition. It was introduced in the field of pattern recognition and artificial intelligence by Belhumeur in 1997 [38]. The basic idea of LDA is to project high-dimensional samples into a low-dimensional space in order to extract classification information and compress the dimension of the feature space. After projection, the samples have the largest inter-class distance and the smallest intra-class distance, which makes LDA an effective feature extraction method. In this paper, the LDA method is used to reduce the dimension of the data after signal decomposition, which achieves a secondary feature extraction and reduces the time complexity of the classification method. This section describes the feature extraction method based on differential entropy and LDA in two parts: signal decomposition and data dimensionality reduction.

Signal Decomposition
First, we perform signal decomposition on the original signal to remove noise and extract useful information from the signal. Entropy is a thermodynamic quantity describing the disorder of a system, and the concept has been successfully applied to the analysis of EEG signals. Although the original EEG signal does not follow a fixed distribution, research has shown that after band-pass filtering from 2 to 44 Hz in steps of 2 Hz, the EEG signal obeys a Gaussian distribution. This paper uses differential entropy to perform signal decomposition of EEG [18].
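Differential entropy measures the complexity of a continuous random variable; it is the extension of entropy to continuous random variables. Differential entropy is also related to the minimum description length. It can be calculated as Equation (1):

$$h(X) = -\int_{X} f(x) \log f(x) \, dx \tag{1}$$

where $X$ is a random variable and $f(x)$ is the probability density function of $X$. For a time series $X$ that obeys the Gaussian distribution $N(\mu, \sigma^{2})$, the differential entropy can be written as Equation (2):

$$h(X) = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}} e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}} \log\left(\frac{1}{\sqrt{2\pi\sigma^{2}}} e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\right) dx = \frac{1}{2} \log\left(2\pi e \sigma^{2}\right) \tag{2}$$

In a fixed frequency band $i$, the differential entropy is defined as Equation (3):

$$h_{i}(X) = \frac{1}{2} \log\left(2\pi e \sigma_{i}^{2}\right) \tag{3}$$

where $\sigma_{i}^{2}$ is the variance of the signal in band $i$.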
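As an illustration of the Gaussian closed form referred to as Equation (2), the following minimal sketch estimates the differential entropy of a signal from its sample variance. It assumes the signal has already been band-pass filtered, uses the natural logarithm (so the result is in nats), and replaces real EEG data with synthetic Gaussian samples; the function name `differential_entropy` is hypothetical.

```python
import math
import random
import statistics

def differential_entropy(samples):
    """Differential entropy in nats, assuming the samples are Gaussian.

    Uses the closed form 0.5 * ln(2 * pi * e * variance).
    """
    variance = statistics.pvariance(samples)
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)

# Synthetic stand-in for one band-filtered EEG channel (not real data):
# zero mean, standard deviation 2.0.
rng = random.Random(0)
band_signal = [rng.gauss(0.0, 2.0) for _ in range(10000)]

de_estimate = differential_entropy(band_signal)
# Closed-form value for the true variance 2.0 ** 2, for comparison.
de_exact = 0.5 * math.log(2.0 * math.pi * math.e * 2.0 ** 2)
```

Because the entropy depends on the signal only through its variance, doubling the amplitude of the signal raises the value by ln 2, which is one reason the feature tracks band power on a logarithmic scale.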

Data Dimensionality Reduction
Next, we use the LDA method for secondary feature extraction and data dimensionality reduction. There is a lot of noise in EEG signals. However, the EEG signal after band-pass filtering has been shown to obey a Gaussian distribution [18]. More importantly, with a large amount of data, the EEG signal has a relatively obvious principal component. This means that EEG signals meet the requirements of the LDA method: the signal conforms to a Gaussian distribution and has a distinct principal component. This in turn suggests that features of EEG signals can be extracted and classified using the LDA method. The goal of LDA is to create a new variable that is a combination of the original predictors, chosen to maximize the differences between the predefined groups with respect to the new variable. The optimization functions for two-class and multi-class LDA are different. Since this paper uses a three-category emotion EEG dataset, we give only a brief description of the multi-class method.
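Assume the dataset is $D = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \ldots, (x_{m}, y_{m})\}$, where each sample $x_{i}$ is an n-dimensional vector and $y_{i} \in \{C_{1}, C_{2}, \ldots, C_{K}\}$. For $j = 1, 2, \ldots, K$, we define $N_{j}$ as the number of samples of class $j$, $X_{j}$ as the set of class $j$ samples, $\mu_{j}$ as the mean vector of the class $j$ samples, and $\Sigma_{j}$ as the covariance matrix of the class $j$ samples. $\mu_{j}$ is given by Equation (4) and $\Sigma_{j}$ by Equation (5):

$$\mu_{j} = \frac{1}{N_{j}} \sum_{x \in X_{j}} x \tag{4}$$

$$\Sigma_{j} = \sum_{x \in X_{j}} (x - \mu_{j})(x - \mu_{j})^{T} \tag{5}$$

Suppose the low-dimensional space to project into has dimension $d$, with basis vectors $(w_{1}, w_{2}, \ldots, w_{d})$ that form the $n \times d$ matrix $W$. The optimization objective is then Equation (6), where the between-class scatter matrix $S_{b}$ is given by Equation (7) and the within-class scatter matrix $S_{w}$ by Equation (8):

$$J(W) = \frac{\prod_{\mathrm{diag}} W^{T} S_{b} W}{\prod_{\mathrm{diag}} W^{T} S_{w} W} \tag{6}$$

$$S_{b} = \sum_{j=1}^{K} N_{j} (\mu_{j} - \mu)(\mu_{j} - \mu)^{T} \tag{7}$$

$$S_{w} = \sum_{j=1}^{K} \Sigma_{j} = \sum_{j=1}^{K} \sum_{x \in X_{j}} (x - \mu_{j})(x - \mu_{j})^{T} \tag{8}$$

Here $\mu$ is the mean vector of all samples and $\prod_{\mathrm{diag}} A$ is the product of the main diagonal elements of $A$. The optimization of $J(W)$ can be converted to Equation (9):

$$J(W) = \frac{\prod_{i=1}^{d} w_{i}^{T} S_{b} w_{i}}{\prod_{i=1}^{d} w_{i}^{T} S_{w} w_{i}} = \prod_{i=1}^{d} \frac{w_{i}^{T} S_{b} w_{i}}{w_{i}^{T} S_{w} w_{i}} \tag{9}$$

The process above is the feature extraction process.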
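To make the objective concrete, the sketch below builds the between-class and within-class scatter matrices for a toy three-class dataset and evaluates the ratio w^T S_b w / w^T S_w w, the factor that appears for each basis vector in Equation (9), for two candidate directions. It assumes the standard scatter-matrix definitions, uses invented two-dimensional points, and does not solve the full optimization; all function names are hypothetical.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def outer(u, v):
    """Outer product u v^T as a nested list."""
    return [[a * b for b in v] for a in u]

def add_matrices(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scatter_matrices(classes):
    """Return (S_b, S_w) for a dict mapping class label -> list of samples."""
    all_points = [p for pts in classes.values() for p in pts]
    dim = len(all_points[0])
    mu = mean_vector(all_points)  # mean of all samples
    s_b = [[0.0] * dim for _ in range(dim)]
    s_w = [[0.0] * dim for _ in range(dim)]
    for pts in classes.values():
        mu_j = mean_vector(pts)
        diff = [a - b for a, b in zip(mu_j, mu)]
        # Between-class term: N_j (mu_j - mu)(mu_j - mu)^T
        weighted = [[len(pts) * v for v in row] for row in outer(diff, diff)]
        s_b = add_matrices(s_b, weighted)
        # Within-class term: sum over the class of (x - mu_j)(x - mu_j)^T
        for p in pts:
            d = [a - b for a, b in zip(p, mu_j)]
            s_w = add_matrices(s_w, outer(d, d))
    return s_b, s_w

def quadratic_form(w, m):
    """Compute w^T M w."""
    return sum(w[i] * m[i][j] * w[j] for i in range(len(w)) for j in range(len(w)))

def fisher_ratio(w, s_b, s_w):
    """Between-class over within-class scatter along direction w."""
    return quadratic_form(w, s_b) / quadratic_form(w, s_w)

# Toy data: classes are separated along the first axis and noisy along the second.
classes = {
    "negative": [(0.0, 0.0), (0.2, 1.0), (-0.2, -1.0)],
    "neutral": [(3.0, 0.1), (3.2, 1.1), (2.8, -0.9)],
    "positive": [(6.0, -0.1), (6.2, 0.9), (5.8, -1.1)],
}
s_b, s_w = scatter_matrices(classes)
ratio_x = fisher_ratio([1.0, 0.0], s_b, s_w)
ratio_y = fisher_ratio([0.0, 1.0], s_b, s_w)
```

For this toy data the ratio along the first axis is about 225 and along the second about 0.01, so a projection that maximizes the objective would favour the first axis, where the classes are far apart relative to their spread.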

Results
First, we adopted a band-pass filter to divide the EEG signals into five bands: Delta (1-3 Hz), Theta (4-7 Hz), Alpha (8-13 Hz), Beta (14-30 Hz), and Gamma (31-50 Hz). We thereby obtained five datasets, one for each band. Then, we combined the five-band data to form the combined band data. After that, we applied differential entropy (DE), LDA, and the DE combined with LDA method proposed in this paper to extract features. Finally, we adopted five classification methods: k-NN, LR, MLP, RF, and SVM. Each test randomly assigned the 675 samples to mutually exclusive training sets (70%) and validation sets (30%). The experiment adopted five-fold cross-validation to ensure the reliability of the results.
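The five-fold procedure can be sketched as index bookkeeping over the 675 samples. The sketch below is one possible reading only: it shuffles with a fixed seed and deals indices into folds, and it does not show how the 70%/30% split and the cross-validation were combined, since the text does not specify this. The function name `k_fold_indices` is hypothetical.

```python
import random

def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, validation_indices) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into folds as evenly as possible.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, validation in enumerate(folds):
        # Training set: every fold except the one held out for validation.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

splits = list(k_fold_indices(675, n_folds=5))
```

With 675 samples and five folds, each validation fold holds 135 samples and each training set 540, and every sample is used for validation exactly once.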
Parameter tuning was conducted for the five classifiers, and the better-performing parameters were chosen. For the logistic regression method, the L2 penalty was used, and the inverse of the regularization strength was 1.0. For the SVM method, the 'rbf' kernel function was applied, and the penalty parameter C was 1.0. For the k-NN method, the number of neighbors was set to 20. For the random forest model, the number of trees in the forest was 120, the maximum depth of a tree was 10, and the minimum number of samples required to split an internal node was 8. For the MLP classifier, the optimizer was the Adam method, the activation function was ReLU, and the batch size was 32; the sizes of the two hidden layers were 64 and 32, respectively, and the learning rate was 0.0001.
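For reproducibility, the settings above can be collected in a single configuration. The fragment below is a sketch only: the text does not name the software library used, so the keys follow scikit-learn constructor argument names purely as an assumption, and the dictionary name `CLASSIFIER_PARAMS` is hypothetical.

```python
# Hyperparameters as stated in the text. The key names mirror scikit-learn
# constructor arguments; that mapping is an assumption, not from the source.
CLASSIFIER_PARAMS = {
    "lr": {"penalty": "l2", "C": 1.0},
    "svm": {"kernel": "rbf", "C": 1.0},
    "knn": {"n_neighbors": 20},
    "rf": {"n_estimators": 120, "max_depth": 10, "min_samples_split": 8},
    "mlp": {
        "solver": "adam",
        "activation": "relu",
        "batch_size": 32,
        "hidden_layer_sizes": (64, 32),
        "learning_rate_init": 0.0001,
    },
}
```

If scikit-learn were used, each entry could be unpacked into the matching estimator, for example `RandomForestClassifier(**CLASSIFIER_PARAMS["rf"])`.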
We adopted several evaluation measures to ensure the reliability of the results. First, we calculated the accuracy, defined as the ratio of the number of samples correctly classified by the classifier to the total number of samples in the test dataset. However, accuracy is not always a valid measure of performance, especially when the classes in a dataset are not balanced. We therefore also calculated the F1 score, a statistical indicator used to measure the accuracy of a binary classification model that takes both the precision and the recall of the model into account. Moreover, we used the Kappa coefficient, which is generally thought to be a more robust measure than a simple percentage agreement calculation. Finally, in order to judge the performance of the methods intuitively, we drew confusion matrix graphs and box plots.
Table 1 gives the experimental outputs of prediction performance in the original dataset. Table 2 gives the experimental outputs of prediction performance in the original data based on LDA feature extraction. Table 3 gives the experimental outputs of prediction performance in the differential entropy data. Table 4 gives the experimental outputs of prediction performance in differential entropy data based on LDA feature extraction. These tables only show the best two classification methods in the experiments. Details are given in Supplementary Tables S1-S4. Table 1 shows that the random forest method has the highest average accuracy in the Gamma band. Table 2 shows that the random forest method has the highest average accuracy in the case of combined frequency band data. Table 3 shows that the SVM method has the highest average accuracy in the [...]. Table 4 shows that the SVM method has the highest average accuracy in the case of combined frequency bands.
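The three measures can all be derived from a confusion matrix, as the following minimal sketch shows. It assumes that rows are true classes and columns are predicted classes, that the F1 score is macro-averaged over the classes (the text does not state the averaging scheme), and that the Kappa coefficient is Cohen's kappa. The matrix values are invented for illustration and are not results from this paper.

```python
def classification_metrics(confusion):
    """Accuracy, macro-averaged F1, and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(confusion[i]) for i in range(k)]                       # true counts
    col_sums = [sum(confusion[r][i] for r in range(k)) for i in range(k)]  # predicted counts

    accuracy = sum(confusion[i][i] for i in range(k)) / total

    f1_scores = []
    for i in range(k):
        precision = confusion[i][i] / col_sums[i] if col_sums[i] else 0.0
        recall = confusion[i][i] / row_sums[i] if row_sums[i] else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / k

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, macro_f1, kappa

# Invented three-class example (negative, neutral, positive), 60 samples per class.
example = [
    [50, 5, 5],
    [10, 40, 10],
    [5, 5, 50],
]
accuracy, macro_f1, kappa = classification_metrics(example)
```

For this invented matrix the accuracy is 140/180 and the chance agreement is 1/3, so kappa is 2/3. Kappa is lower than accuracy because it discounts the share of agreement that would occur by chance.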
It can be seen from Tables 1-4 that, under different conditions and except for Table 1, the accuracy of experiments on the combined frequency band data is generally better than the accuracy of experiments on the sub-band data, and the performance of the SVM method is superior to that of the other methods. Therefore, the analysis of the experimental part uniformly uses the combined frequency band data and the SVM method.
Tables 1-4 show that, in the case of using the combined frequency band data, the classification effect of the differential entropy combined with LDA method is significantly better than that of the differential entropy method alone or the LDA method alone. The average classification accuracy of feature extraction using the differential entropy combined with LDA method is 82.5%, which is 7.1% higher than that of the differential entropy method alone and 73.7% higher than that of the LDA method alone. The precision and the recall rate were improved by 4.3% and 4.2%, respectively, compared with the differential entropy method alone, and by 62.8% and 67.8%, respectively, compared with the LDA method alone. Compared with the differential entropy method alone and the LDA method alone, the F1 score increased by 4.2% and 69.3%, respectively. It is worth noting that the Kappa value of the differential entropy combined with LDA method is 0.698, which indicates that the predicted category of emotional EEG is highly consistent with the actual category; this value is 59.5% higher than that of the LDA method alone and 6.7% higher than that of the differential entropy method alone. The experimental results show that the differential entropy combined with LDA method has a better recognition effect than the existing methods based on the differential entropy method and the LDA method.

Classification Result of the Experiment with Different Feature Selection Methods
The experimental results show that the differential entropy combined with LDA method can be effectively used to process EEG signals. The differential entropy method can extract the time-phase information of emotional EEG and reflect the change of emotion. The use of the LDA method preserves valid feature information while reducing the data dimension.
The experimental results in the Delta, Theta, Alpha, Beta, and Gamma bands and in the combined frequency band data show that the differential entropy combined with LDA method has the best classification effect not only in the combined frequency band but also in the individual frequency bands. For example, in the Beta band, when using the SVM classification method, the average accuracy of the differential entropy combined with LDA method is 71.4%, which is 2.9% and 58.3% higher than the differential entropy method alone and the LDA method alone, respectively. The precision rate of the differential entropy combined with LDA method increased by 2.5% and 55.7% compared with the differential entropy method alone and the LDA method alone, respectively. The recall rate increased by 2.4% and 63.8%, the F1 score by 2.2% and 64.3%, and the Kappa coefficient by 4.6% and 56.4%, respectively, compared with the differential entropy method alone and the LDA method alone. In the Gamma band, the average accuracy of the differential entropy combined with LDA method is 74.1%, which is 8.8% and 56.0% higher than the differential entropy method alone and the LDA method alone, respectively. The results show that the differential entropy combined with LDA method is superior to the existing differential entropy method and LDA method in the classification of emotional EEG in all frequency bands.
In order to analyze the prediction results more intuitively, this paper presents box plots of the accuracy rate of experiments in the combined frequency band, and a confusion matrix diagram for the prediction results of the best performing method in each experiment. Figure 2 shows a box plot of the classification accuracy for different classification methods on the original data. Figure 3 shows a box plot of the classification accuracy for different classification methods on raw data based on the LDA method. Figure 4 shows a box plot of the classification accuracy for different classification methods on raw data based on the differential entropy method. Figure 5 is a box plot of the classification accuracy for different classification methods on raw data based on the differential entropy combined with LDA method. Figure 6 shows a confusion matrix diagram of the best prediction results on raw data using the random forest method. Figure 7 shows a confusion matrix diagram of the best prediction results based on the LDA method using the random forest method. Figure 8 shows a confusion matrix diagram of the best prediction results based on the differential entropy method for feature extraction using the logistic regression method. Figure 9 shows a confusion matrix diagram of the best prediction result of feature extraction based on the differential entropy combined with LDA method using the SVM method.
It can be seen from Figures 2-5 that the classification effect of the differential entropy combined with LDA method is superior to other methods. In the experiments based on the LDA method of raw data, the random forest method achieved the best accuracy. In the experiment based on the differential entropy method, the logistic regression method achieved the best classification accuracy, and the accuracy rate reached 77.4%. In the experiment based on differential entropy combined with LDA method, the SVM method achieved the best accuracy; the accuracy rate reached 82.5%, and the accuracy of the logistic regression method reached 81.7%. From the experimental results, the differential entropy combined with LDA method can effectively improve the results of emotional classification based on EEG.
It can also be seen from Figures 6-9 that, in the confusion matrix diagram corresponding to the differential entropy combined with LDA method, the dark regions are concentrated on the diagonal, while the dark regions of other methods are relatively scattered. This indicates that the classification performance of differential entropy combined with LDA method for feature extraction of EEG signals is superior to other methods.

Time Complexity Result of Experiment with Different Feature Selection Methods
Table 5 shows the time complexity of four experiments with different classification methods in the original dataset, the original dataset based on LDA feature extraction, the differential entropy dataset, and the differential entropy dataset based on LDA feature extraction. As can be seen from Table 5, the differential entropy combined with LDA feature extraction method has the best time complexity, and the time spent is significantly less than for the other methods. Since the SVM method generally has the best classification effect in the combined frequency band, the following comparison is based on the experimental results of using the SVM method on the combined frequency band data. The time spent by the differential entropy combined with LDA method is 28.7% of that of the LDA method and only 17.6% of that of the differential entropy method. This indicates that the differential entropy method can extract the information about emotional changes in EEG signals, and that LDA can further extract effective features, accelerate the convergence of the classification models, and reduce the time complexity. The experimental results show that the time performance of the differential entropy combined with LDA method in emotional EEG recognition is better than that of the existing methods.

Discussion
The last 10 years have seen the rapid development of computers and the continuous updating of signal acquisition equipment. New machine-learning and deep learning methods are constantly being proposed. Research on the brain-computer interface has also made great progress, from the earliest two-category epilepsy recognition [39] and left-right hand movement imagination [40] to more complex two-class motion recognition [41], and finally to multi-class emotion recognition [42]. The number of classes is increasing, and so is the difficulty of feature extraction, which places higher requirements on feature extraction methods and classification methods. As research deepens, more complex feature extraction and classification methods are constantly being proposed. Existing research has shown that, after band-pass filtering and signal processing, the EEG signal is roughly consistent with a Gaussian distribution, which lays a solid foundation for the feature extraction of EEG signals. On this basis, signal feature extraction methods such as those based on the Fourier transform and those based on differential entropy have been proposed. These methods have greatly improved the classification results of EEG signals.
However, we believe that, whether in motion recognition or emotion recognition, practical applications require higher real-time performance. In order to reduce the transmission delay, researchers often transmit the data to a nearby mobile edge computing server for edge computing instead of sending it to the cloud. However, the computing power of the server is limited, while the real-time transmission load of EEG data is relatively high. In such a case, it becomes very important to perform effective feature extraction to reduce the amount of calculation. Under the premise of ensuring classification accuracy, the classification method needs to quickly judge the current state of the user. Therefore, emphasizing only the improvement of classification accuracy is not comprehensive enough. How to further reduce the time complexity of the method while improving the accuracy is still the focus of our attention. For this reason, researchers have proposed dimensionality reduction methods such as PCA and LDA to reduce the time complexity and the running time of the classification method.
Based on previous research results and the characteristics of EEG signals, we believe that the combination of differential entropy and the LDA method can effectively reduce the time complexity of the method while extracting EEG data features. The final experiment also verified this. Although our approach has many advantages, it is worth noting that our method can only reduce the complexity of the data in the feature extraction part; it cannot directly reduce the time complexity of the classification method. This has little effect on traditional machine-learning methods. However, more and more researchers now tend to use deep learning methods such as convolutional neural networks (CNN), recurrent neural networks (RNN), and deep belief networks (DBN) [43][44][45]. Deep learning brings new breakthroughs in the use of brain-computer interfaces: its classification accuracy is higher, and more complicated situations can be classified. However, because of the high dimension of EEG data and the relative scarcity of samples, determining how to build a suitable deep learning network is a big problem, and the parameters of deep learning models are difficult to tune. Determining how to build a common model for different data is the direction of future research. More importantly, these methods are inherently complex and can result in high time complexity for classification. Therefore, our next goal is to optimize the deep learning method to further reduce its run-time requirements, improve the classification accuracy, and try to build a more general deep learning model.

Conclusions
This paper proposes a feature extraction method based on the fusion of differential entropy and the LDA method. Experiments were performed using five classical classification methods on a three-class emotion dataset. The results show that the proposed feature extraction method can effectively improve the final classification accuracy of all five classical classification methods. More importantly, the proposed method can reduce the time complexity and running time of the model. This means that the feature extraction method based on the fusion of differential entropy and the LDA method can be effectively applied to multi-class emotion recognition. In a clinical environment, this means that, if the method can be used, the patient's mood and pathology can be judged more quickly, and doctors can better diagnose the patient's condition and determine the patient's state in real time. In the future, we hope to test the method on more datasets and with deep learning methods.