1. Introduction
The Brain–Computer Interface (BCI) is a popular research topic in the field of health informatics. Its applications involve the analysis of electroencephalography (EEG) signals from the brain. Popular BCI applications include monitoring brain health and detecting abnormal brain activity, such as seizures, as well as detecting emotions. Emotion detection techniques help to assess the state of people with mental disabilities who cannot explain their emotions.
Brain–Computer Interface (BCI) technology enables interaction between the brain and the computer and is one of the significant branches of Human–Computer Interaction (HCI). It is also considered one of the most essential modern research fields related to machine and deep learning and robotics. 
BCI technology works through sequential steps that aim to recognize the signals of the human brain and convert them into actions. After the signals are collected, they are pre-processed, frequency and time features are extracted, and, finally, the signals are classified. The results are converted into commands for different devices, according to the application. 
BCI actively contributes to helping patients find solutions to their health problems and to improving the quality of life of patients with motor disabilities or various mental diseases [1]; it also helps patients communicate with others or control their prosthetic limbs by determining the brain’s activities [2]. A study by Teles et al. confirmed that BCI-based devices can transmit and receive signals from the brain to control external devices, such as wheelchairs, and collect information about the user’s intentions [2].
In another study, the authors demonstrated that the electroencephalogram (EEG) signal has a vital role in diagnosing epileptic seizures [3]. The study also used the BCI technique to develop an emotion recognition model. In addition, BCI technology has been employed for non-medical uses, such as education and recreational games [4,5].
BCI techniques begin by collecting brain signals according to the purpose for which the signals are collected. This is done through various techniques, including EEG, functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG). The medical devices used to measure physiological signals differ in their accuracy, signal quality, recording length, and the frequencies they can measure. Physicians decide on the appropriate device for each patient’s case and on how to collect the signals correctly, in cooperation with specialized medical technicians [6].
Deep learning (DL) has demonstrated tremendous capabilities in medical decision-making systems, including wearable technology, image-processing applications, and natural language processing, all of which aim to improve the quality of health care. DL algorithms are effective in supporting decision making when the inputs are quantifiable.
In the study by Kamnitsas et al. [7], the authors used MRI brain imaging to detect brain tumors and stroke etiology through a 3D CNN. Abd-Ellah et al. [8] used three different CNN architectures, AlexNet, VGG-16 and VGG-19, for the detection of cancerous tumors, while Deniz et al. [9] assessed bone fracture risk from MR images using U-Net.
Wearable technology constitutes one of the important applications of decision support in the field of health services, where vital data are collected by sensor units. In [10], a smartphone was used as a sensor, and the data collected from its accelerometers and gyroscopes were used to study human activity. The application used SIFT (scale-invariant feature transform) to extract features from the smartphone sensor data and then passed these features through a convolutional neural network to classify the signal and reach a decision about a person’s health status.
In another study [11], the authors developed a decision support system that monitors mental health symptoms with wearable technology. The model collects facial expressions from the phone’s camera, speech from the phone’s microphone, and movement data through GPS, an accelerometer and gyroscopes. A smart watch monitors the electrical activity of the skin, and the model also measures social interaction through social networking.
The medical field needs a system to diagnose psychological and behavioral diseases. Psychological diseases are currently diagnosed using traditional methods, by asking the patients some questions or monitoring their behavior, which is, however, time consuming. The diagnosis may also not be accurate, because it depends on the patient’s responses to the questions. Many studies have also used external emotional expressions, such as facial expressions or speech, to recognize emotions. However, sometimes the emotional states remain internal and cannot be detected by external expression. In the research reported in this paper, EEG signals were used to identify emotions. EEG is one of the most important techniques used in collecting human brain signals, due to the availability of devices and their high time accuracy. The EEG technology collects the signal directly from the brain using metal electrodes placed directly on the head. Human emotions can be studied through external expressions, such as facial expressions, speech, and body language [
12]. Emotions can also be studied by monitoring internal physiological signals that change with the emotional state, using techniques such as EEG and MEG [13]. These internal physiological signals are not subject to voluntary control: a person cannot control their amount or intensity while experiencing an emotion, so they give a more accurate estimate of the emotional state. Several studies have shown the superiority of deep learning methods in multichannel EEG-based emotion classification. This study improves the performance of the emotion recognition model using deep learning algorithms.
The brain is the central organ of the human body and controls all its other organs. It is part of the nervous system, which sends electrical signals to the rest of the body. The primary data processing units in the brain are known as neurons [14]. 
The electrical signals of the brain are transmitted among the neurons. The EEG signal is acquired by electrodes and represents brain waves. Multiple channels, each with its own electrode, are used to obtain EEG data [14]. During the emotion recognition process, brain signals are recorded by EEG devices. Deep learning approaches have been used to analyze the EEG signal, and the key focus of this work is to use them to enhance the performance and accuracy of emotion recognition from EEG signals [12].
Accuracy is important in the emotion recognition process, especially in analyzing behavioral and psychological disorders, because it helps in making medical decisions. However, it has not been found easy to analyze and classify human emotions, and researchers have observed differences in the accuracy ratios in many studies conducted to identify emotions from EEG signals. Their results differed due to the diversity in many aspects of the research methods, such as variations in experience, the environment, data pre-processing techniques, and classifiers. Hence, it is agreed that there is a need for developing better methods to achieve high performance [
15].
To improve the efficiency of the methods used for emotion recognition, researchers must develop novel methods to offer superior performance and reduce complexity. In this paper, an approach for emotion recognition using EEG signals is proposed, which will help doctors in diagnosing psychological or behavioral disorders, with accurate results in a short time. In this study, we aim to improve the model’s performance using a Binary Grey Wolf Optimization (BGWO) algorithm in the feature selection stage to solve the data complexity problem in EEG signals. In addition, this study aims to use a stacked Bi-LSTM classification model to obtain high accuracy in emotion prediction by analyzing EEG signals. The proposed approach includes several phases, which are: data selection, feature extraction, feature selection and classification. 
This paper provides several contributions in the field of emotion recognition. The research contributions can be summarized in the following points:
	  
- A deep learning model was developed by building a network using Bi-LSTM to classify multichannel EEG features. 
- The model can classify the patterns of multichannel EEG signals that have time and waveform frequency variation; the extracted time-domain-based features and the correlation information clearly improve the model’s performance. 
- The Hurst exponent has been adopted as an important feature of EEG classification. 
- The methods used in the feature extraction phase reduced model learning and generalization time and reduced the likelihood of overfitting. 
- The feature selection phase enhanced the accuracy of the proposed model: the BGWO algorithm reduced the high dimensionality and complexity of the dataset, which reduced classification time and increased the effectiveness of the model. 
- This model can perform the classification process of brain signals with high performance and accuracy for biomedical studies. Therefore, its results can be leveraged as a deep learning-based decision support system for medical purposes. 
  Related Works
In the study by George et al. [16], the SVM method was used, with an overall accuracy of 92%. The DCT method and a box-and-whisker chart were used to determine the features. On the DEAP dataset, containing 32 participants, the researchers concluded that Fast Fourier Transform (FFT) statistical features for detecting emotions yielded a higher accuracy of 92%. This method is therefore superior, in terms of accuracy, to the technique used by Seeja et al. [22]. The difference between the results is due to the different techniques used in extracting the features and pre-processing the data.
Alhagry et al. [17] discussed the importance of emotion recognition systems that rely on Human–Computer Interaction (HCI) systems. They considered three target dimensions, arousal, valence, and liking, unlike most studies in this field, which consider only two (arousal and valence). Using the DEAP dataset, they applied an LSTM-RNN for classification, achieving accuracies of 85.65%, 85.45%, and 87.99% for the valence, arousal, and liking categories, respectively. It should be noted that they used an end-to-end method without a separate feature extraction step, because deep learning algorithms can extract features and classify them in the same step.
In another study [18], graph convolutional neural networks (GCNN) were used to implement an emotion recognition model using EEG. The experiment was applied to the DEAP database. After segmenting the data and extracting the differential entropy features, a method known as ECLGCNN, based on merging GCNN and LSTM, was used. The researchers confirmed the effectiveness of the methods used, as they achieved an accuracy of 90.45% for the valence label and 90.60% for arousal in subject-dependent trials, and 85.04% in subject-independent trials. The computational complexity of this method needs to be reduced by developing methods for extracting more features.
The authors in [
19] used the end-to-end method to classify emotions using the CNN model, which has demonstrated the ability of efficient feature extraction. This study added additional layers to the CNN model to increase the depth and improve classification capacity. Three datasets, DEAP, LUMED, and SEED, were used in this study. The model achieved 86.56% and 78.3% accuracy in the SEED dataset, 72.81% in the DEAP dataset, and 81.8% in the LUMED dataset.
An emotion recognition model was developed in [20] to identify three emotions (positive, neutral, and negative). Simple recurrent unit (SRU) models were built using four features across five frequency bands of the SEED dataset. The SRU was proposed because it can process sequence data and addresses the problem of long-term dependencies in RNNs. The time, frequency, and nonlinear features were extracted using the dual-tree complex wavelet transform (DT-CWT), achieving an accuracy of 80.02%. This model relies on a trial-and-error methodology.
With rapid advances in the emotion recognition field, Chao et al. [
21] discussed the problem of multiple channels of electroencephalogram (EEG) signals. They presented an advanced approach to address this problem and proposed a deep belief-conditional random field (DBN-CRF) to develop deep belief networks with glia chains (DBN-GC). The model was applied using three different datasets (AMIGOS, SEED, and DEAP). These methods performed well, with an average accuracy of 76.13%.
Seeja et al. [
22] studied the emotional responses to stimuli from EEG signals, using a DEAP dataset and choosing two methods of feature extraction: the Variational Mode Decomposition (VMD) and the Empirical Mode Decomposition (EMD). The researchers also used the DNN method for classifying emotions. This was found to be an effective method, with a valence accuracy of 62% and arousal accuracy of 63%. The study found that the emotional recognition model achieved a better performance with the deep neural network classifier compared to that with the SVM classifiers. The researchers argued that the VMD-based features method offered better performance compared to the EMD-based method and reduces signal complexity. However, the accuracy still needs improvement by improving the frequency resolution of EMD, using various masking operations for the amplitude rate between the mono-components.
Natraj et al. [23] used two datasets (DEAP and SEED-IV) and proposed the DWT method to extract the statistical features, frequency-domain features, the Hurst exponent, and the reciprocal entropy of the signals. The SVM method was used for signal classification. The researchers achieved a valence accuracy of 79% for the DEAP dataset and 76% for the SEED-IV dataset, concluding that the SVM classifier’s channel-merging method yields better results for the DEAP dataset than for the SEED-IV dataset. 
Amiri et al. [
24] conducted a study to classify emotions in real time, according to the arousal/valence dimensions model, applying the DEAP dataset. The researchers suggested extracting the features of the EEG signals using the DWT method. In this study, there were two different types of classifiers to yield high accuracy: SVM and KNN. This study found that the high-frequency (gamma) band produces higher accuracy than the low frequencies of the EEG signal. The results obtained were comparable, with valence accuracy of 84% and arousal accuracy of 86%.
Numao et al. [
25] used the PSD method for feature extraction with the MLP classifier. The researchers were also interested in developing emotion detection using EEG data and used the DEAP database, but focused on the participants’ interaction with music to study emotional responses. The researchers concluded that music affected brain waves at different levels. When the music is unfamiliar to a person, it enhances EEG-based emotion recognition methods. The results achieved 64% valence accuracy and 73% arousal accuracy. This study has a good implementation time, due to the use of MLP (which is a class of ANN), which is suitable for classification prediction problems.
In [26], the researchers discussed the problem of insufficient applications of neural patterns in subjective emotion recognition systems. They collected signals from 30 participants while they watched 18 videos. The researchers concluded that the high-frequency features of EEG signals showed better results using electrodes distributed on the temporal, frontal, and occipital lobes. They classified six main emotions (fear, joy, sadness, disgust, neutrality and anger). The STFT algorithm was used to extract the features, and the SVM method for classification. The study achieved a valence accuracy of 87.36% in discriminating emotions and 54.52% for arousal. Further, the study in [27], which used the same STFT algorithm for feature extraction with the DEAP dataset but with a CNN classifier, obtained a comparable accuracy of 83.88%. Comparing these two studies, it was concluded that the SVM classifier results were more accurate than those of the CNN classifier. These studies still need a pre-processing phase to improve performance.
Girardi et al. [28] studied emotion recognition through biometrics, for use in the health field. The researchers used EEG, EMG, and GSR sensors to collect different types of signals and used them to develop a low-cost emotion recognition model. The study aimed to find the level of valence and arousal in emotions. Using the DEAP database, the study adopted the PSD and CSP methods to extract the features and an SVM classifier. This study achieved a valence accuracy of 56% and an arousal accuracy of 60%, providing a good solution to the problem of expensive sensors through low-cost tools. However, the method needs to be developed further, using signal pre-processing, to give more accurate results, especially in the medical field.
In another study [
29], the accuracy of the Convolutional Neural Network (CNN) results was also verified, as researchers used this to detect the emotional state of humans by analyzing 32 EEG signals. The researchers obtained results with an accuracy of 95.96% for valence and 96.09% for arousal.
In this paper, the performance and efficiency of the emotion recognition model were improved. The proposed approach includes four phases: data selection, feature extraction, feature selection, and classification. The remainder of this paper is divided into three further sections. Section 2 describes the methods used in this research. Section 3 presents the experimental results and a discussion of the findings, followed by a summary of the conclusions and future work in the final part.
  4. Discussion
Most BCI systems lack emotional intelligence and the ability to interpret information. Accuracy is essential in this area, as it contributes to making correct decisions and taking appropriate actions. The goal of affective computing is to bridge this gap by precisely classifying emotional responses using emotional cues. This study answered the research question, and the proposed model achieved high performance in emotion recognition.
In past studies, facial expressions or voice were used to recognize emotions. However, these traditional methods do not accurately reflect a person’s real condition, because people can control their facial expressions and tone of voice. In the current study, physiological EEG signals were used; since people cannot control them, they reflect the person’s actual psychological state. This model was developed using the DEAP dataset for emotion recognition. The model achieved classification accuracies of 99.45%, 96.87% and 99.68% for Valence, Arousal and Liking, respectively.
In this model, a deep learning method is adopted to process the input. Although deep learning models can work directly on raw input, feature selection and dimensionality reduction steps were used to increase the performance and efficiency of the proposed model. BCI technology depends on several main steps, namely signal collection, pre-processing, feature extraction, and classification. 
Table 7 presents the results of the statistical tests that prove the significance of the feature selection stage and the effectiveness of the proposed classification model.
When comparing this work with earlier works, this study provides a good analysis of the multi-frequency EEG signal. Attention was paid to the feature extraction and selection stages because they reduce the dimensionality of the input data and increase the accuracy of models by removing redundant data, thus increasing training speed. Unlike previous studies that extracted only one type of feature, three different types of features were extracted in this study (Hurst exponent, wavelet features, and statistical features). It was concluded that the higher frequency bands, beta (12–30 Hz) and gamma (above 30 Hz), yield more favorable results for the emotion recognition model than lower frequency bands, such as delta (0–4 Hz), and, thus, high performance was obtained in terms of accuracy, precision, recall, and F-score. 
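The band comparison above can be illustrated with a minimal sketch that sums spectral power per frequency band. This is not the paper's feature extraction code: the delta and beta edges follow the text, while the theta, alpha and gamma edges, the 128 Hz sampling rate, and the names `BANDS`, `band_power` and `band_powers` are assumptions made for illustration. A naive DFT is used so that the example needs only the standard library.

```python
import cmath
import math

# Band edges in Hz. Delta (0-4) and beta (12-30) follow the text; the theta,
# alpha and gamma edges are conventional values assumed for illustration.
BANDS = {
    "delta": (0.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 45.0),
}


def band_power(signal, fs, low, high):
    """Sum the DFT power of `signal` over frequencies in [low, high) Hz."""
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        freq = k * fs / n
        if low <= freq < high:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            power += abs(coeff) ** 2 / n
    return power


def band_powers(signal, fs):
    """Return a dict mapping each band name to its summed power."""
    return {name: band_power(signal, fs, lo, hi)
            for name, (lo, hi) in BANDS.items()}


# Usage: a one-second 20 Hz test tone (assumed 128 Hz sampling rate)
# should place almost all of its power in the beta band.
fs = 128
tone = [math.sin(2 * math.pi * 20 * t / fs) for t in range(fs)]
powers = band_powers(tone, fs)
dominant = max(powers, key=powers.get)
```

In practice a fast Fourier transform or a wavelet decomposition would replace the naive DFT; the quadratic-time loop is kept here only for readability.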
Many studies in the field of emotion recognition do not include a feature selection stage. However, we believe this stage to be important for removing redundant data from the extracted features and for reducing data dimensionality and complexity. In this study, the BGWO algorithm was used to select the features. This step contributed to a significant increase in the efficiency of the model.
In the classification stage, a special type of RNN, the Bi-LSTM, was used. The Bi-LSTM network is well suited to modeling how the different frequencies in sequential data change over time. In the proposed model, the running time for each label is approximately 37 min. Our model training resulted in high performance and processing efficiency for emotion classification, as shown in Table 8 (summary of the performance criteria results for the proposed model).
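For reference, the bidirectional recurrence that gives the Bi-LSTM its name can be written as follows. This is the standard textbook formulation rather than the exact configuration of the proposed model; the symbols $x_t$ (input feature vector at step $t$), $T$ (sequence length) and the subscripts fwd and bwd are notation assumed here.

```latex
\[
\begin{aligned}
\overrightarrow{h}_t &= \mathrm{LSTM}_{\mathrm{fwd}}\!\left(x_t,\ \overrightarrow{h}_{t-1}\right), \\
\overleftarrow{h}_t &= \mathrm{LSTM}_{\mathrm{bwd}}\!\left(x_t,\ \overleftarrow{h}_{t+1}\right), \\
h_t &= \left[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\right], \qquad t = 1, \dots, T.
\end{aligned}
\]
```

The forward pass reads the sequence from $t = 1$ to $T$ and the backward pass from $T$ to $1$, so each concatenated state $h_t$ carries context from both earlier and later samples. In a stacked arrangement, the sequence of $h_t$ from one layer serves as the input sequence of the next.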
Recognizing emotions is an essential step in the Human–Computer Interaction process. The results of this study can serve as a reference for researchers working on related applications. Deep learning has proved effective in classifying emotions; it differs from traditional machine learning in that it uses more layers and can process large amounts of data with high efficiency. When the model learns from sequential data (such as EEG signals), the purpose is to capture the temporal dynamics; sharing parameters across time steps, rather than re-learning them at each step, allows the model to generalize across time sequences. 
Figure 24 presents a comparison between the different models’ results for classifying emotions. 
Further, the feature extraction process is vital in BCI applications. Therefore, in this experiment, various feature extraction techniques were selected, such as statistical features, wavelet features, and the Hurst exponent, giving a total number of 68 features.
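As an illustration of one of these features, the Hurst exponent can be estimated with classical rescaled-range (R/S) analysis. The sketch below is an assumption about how such a feature might be computed, not the paper's implementation: the estimator variant, the window sizes, and the names `rescaled_range` and `hurst_exponent` are chosen for illustration only.

```python
import math
import random
import statistics


def rescaled_range(chunk):
    """Return the R/S statistic of one window, or None if it is constant."""
    mean = statistics.fmean(chunk)
    std = statistics.pstdev(chunk)
    if std == 0:
        return None
    running, low, high = 0.0, 0.0, 0.0
    for x in chunk:
        running += x - mean  # cumulative deviation from the window mean
        low, high = min(low, running), max(high, running)
    return (high - low) / std


def hurst_exponent(series, min_window=8):
    """Estimate H as the slope of log(R/S) against log(window size)."""
    n = len(series)
    log_sizes, log_rs = [], []
    window = min_window
    while window <= n // 2:
        values = []
        for start in range(0, n - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            log_sizes.append(math.log(window))
            log_rs.append(math.log(statistics.fmean(values)))
        window *= 2
    if len(log_sizes) < 2:
        raise ValueError("series too short for R/S analysis")
    # Ordinary least-squares slope.
    mx, my = statistics.fmean(log_sizes), statistics.fmean(log_rs)
    num = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mx) ** 2 for x in log_sizes)
    return num / den


# Usage: white noise versus its running sum (a random walk).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk, total = [], 0.0
for step in noise:
    total += step
    walk.append(total)
h_noise = hurst_exponent(noise)
h_walk = hurst_exponent(walk)
```

Uncorrelated noise gives an estimate near 0.5, while a strongly persistent series such as a random walk gives a clearly larger value, which is what makes the exponent informative as a per-channel feature.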
In the feature selection phase, the binary GWO algorithm was used, which significantly improved the performance of the model. The BGWO has proven effective in providing competitive results, contributing to classification accuracy and closely approximating the optimal solution. The BGWO combines exploration and exploitation processes, which help it search the feature space efficiently for the classifier. The algorithm is characterized by its simplicity and speed, as it converges very quickly towards the optimal solution.
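A minimal sketch of binary grey wolf optimization for feature selection is given below. It follows one common binary variant, in which the continuous GWO position update is passed through a sigmoid transfer function; whether the proposed model uses this exact variant is not stated here, so this should be read as an assumption. The fitness function `toy_fitness`, the set `INFORMATIVE`, and all parameter values are hypothetical stand-ins; in a real pipeline the fitness would be derived from classifier performance on the selected features.

```python
import math
import random


def sigmoid_transfer(x):
    """Steep sigmoid centred at 0.5, one common choice in binary GWO variants."""
    return 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))


def binary_gwo(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Maximise `fitness` over 0/1 feature masks; returns (mask, score)."""
    if n_wolves < 3:
        raise ValueError("need at least three wolves (alpha, beta, delta)")
    rng = random.Random(seed)
    wolves = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_score = None, float("-inf")
    for it in range(n_iters):
        ranked = sorted(wolves, key=fitness, reverse=True)
        leaders = ranked[:3]  # the three best solutions guide the pack
        top = fitness(leaders[0])
        if top > best_score:
            best_mask, best_score = list(leaders[0]), top
        # a decays linearly from 2 to 0; |A| can exceed 1 only while a > 1,
        # so exploration is confined to the first half of the iterations.
        a = 2.0 * (1.0 - it / n_iters)
        new_wolves = []
        for wolf in wolves:
            new = []
            for d in range(n_features):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])
                    estimate += leader[d] - A * distance
                estimate /= 3.0
                new.append(1 if sigmoid_transfer(estimate) >= rng.random() else 0)
            new_wolves.append(new)
        wolves = new_wolves
    for wolf in wolves:  # score the final generation as well
        score = fitness(wolf)
        if score > best_score:
            best_mask, best_score = list(wolf), score
    return best_mask, best_score


# Hypothetical fitness: reward three "informative" features and
# penalise the total number of selected features.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


mask, score = binary_gwo(toy_fitness, n_features=10)
```

The returned mask marks the selected feature columns with 1; with 68 extracted features, `n_features` would be 68 and the mask would be applied to the feature matrix before classification.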
One of the main reasons for the high classification result in this model was the use of the BGWO algorithm, which has adaptive parameters to effectively balance exploration and exploitation. Half of the iterations are devoted to exploration and the rest to exploitation. The binary GWO algorithm preserves the three best solutions obtained at any stage of optimization and is, hence, able to yield more accurate results due to its high exploration behavior. The highly exploitative behavior of the algorithm is an important reason why a BGWO-based trainer is able to converge rapidly towards an optimum for the dataset. Further, BGWO is recommended when the dataset and the number of features are large, because of the large number of local optima. A Bi-LSTM algorithm, one of the best deep learning algorithms for processing time series, was used in the classification stage. The Bi-LSTM model outperformed the traditional LSTM used in other studies [57,58]. To obtain more accurate results and better effectiveness, the ADAM optimizer was used to increase the efficiency of training the deep learning algorithm. ADAM was used to tune the Bi-LSTM by adapting the weights and learning rates so as to minimize the loss. Consequently, results were obtained more quickly, with less loss and increased accuracy, as is evident in the training progress. ADAM maintains an exponentially decaying average of past gradients, and it also mitigates vanishing learning rates and high variance. This model achieved accuracies of 99.45%, 96.87%, and 99.68% for Valence, Arousal, and Liking, respectively. The deep learning algorithm (Bi-LSTM) achieved better results than the classifiers used in the previous works. 
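For reference, the standard ADAM update rule is reproduced below. The symbols are the usual ones and are assumed notation here: $g_t$ is the gradient of the loss at step $t$, $\theta_t$ the weights, $\alpha$ the learning rate, $\beta_1$ and $\beta_2$ the decay rates, and $\epsilon$ a small constant for numerical stability.

```latex
\[
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2}, \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^{t}}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^{t}}, \\
\theta_t &= \theta_{t-1} - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}.
\end{aligned}
\]
```

Here $m_t$ is the decaying average of past gradients and $v_t$ the decaying average of their squares; dividing by $\sqrt{\hat{v}_t}$ gives each weight its own effective step size.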
In Table 8, the accuracy of the results of the proposed model is compared with that of other deep learning and machine learning methods that use the DEAP dataset. The results of the current model showed a significant improvement over the earlier ones, due to the use of an improved approach, compared with the traditional LSTM used in the studies in [17,57,58].
By comparing the accuracy of the results of previous studies, we conclude that although the dataset is the same, there are different levels of classification accuracy, due to the different techniques of extracting features from EEG signals, the different methods of classifying EEG data, and their different parameters. It is worth noting that the use of the optimizer is of great importance in improving the performance of the model. In most studies, emotions were classified on the basis of Valence and Arousal; in this study, however, Liking was also classified. The research hypothesis can be tested by reviewing the results for the model’s performance criteria (accuracy, precision, recall, and F-score), presented in Table 9, which shows the high performance of the proposed model.
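The four performance criteria named above can be computed from predicted and true labels as in the following sketch. It assumes a binary labelling per emotion dimension (for example, high versus low Valence), which is an assumption made for illustration; the function name `classification_metrics` and the example labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-score for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


# Usage with six hypothetical trials (1 = high, 0 = low).
metrics = classification_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```

For these six trials there are two true positives, one false positive, one false negative and two true negatives, so all four criteria equal 2/3.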
We faced a few challenges while developing the model. The model took a long time to train. The Bi-LSTM algorithm showed sensitivity to, and complexity in adjusting, the O(w) random weight initialization process. These challenges relate to execution time, to reducing the high dimensionality and complexity of the dataset, and to its labeling. To resolve these challenges, we arrived at several effective methods for extracting statistical, wavelet and time-frequency features, for selecting features, and for building a suitable classifier. We also faced a challenge in determining the exact parameter values that would provide a high level of accuracy for the proposed classification model. After several experiments, we adopted a random search method that measures the effectiveness of candidate parameter values for the model.
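Random search over model parameters can be sketched as follows. The search space, the parameter names (`hidden_units`, `learning_rate`, `dropout`) and the objective `toy_objective` are all hypothetical; in practice the objective would train the classifier with the sampled parameters and return a validation score.

```python
import math
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample `n_trials` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw each parameter independently from its list of candidates.
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space for a recurrent classifier.
SPACE = {
    "hidden_units": [16, 32, 64, 128],
    "learning_rate": [0.1, 0.01, 0.001, 0.0001],
    "dropout": [0.0, 0.2, 0.5],
}


def toy_objective(params):
    """Stand-in for validation accuracy; best at 64 units, learning rate 0.001."""
    unit_penalty = abs(params["hidden_units"] - 64) / 64
    rate_penalty = abs(math.log10(params["learning_rate"]) + 3)
    return -(unit_penalty + rate_penalty)


best_params, best_score = random_search(toy_objective, SPACE)
```

Unlike an exhaustive grid, the number of trials is fixed in advance, which keeps the cost manageable when each trial requires a long training run.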
  5. Conclusions and Future Work
The task of emotion recognition faces many challenges due to the instability and complexity of EEG signals. This research provided an effective solution for emotion recognition models: a deep learning-based approach was proposed to improve the accuracy of emotion recognition based on EEG signals. This study contributed to enhancing accuracy and performance in the field of emotion recognition through algorithms that had not been used before in this field, such as the BGWO algorithm used in the feature selection phase and the newly developed Bi-LSTM technique. The proposed approach was tested on the DEAP dataset, and classification was implemented with the stacked Bi-LSTM deep learning algorithm. The feature extraction and selection stages improved model performance by reducing data dimension and complexity. Moreover, the method in this study provides a computational model that can quantify the correlations between EEG signals, frequency bands and emotions. The performance of the proposed model was compared with other models that used machine learning. The proposed model achieved high accuracy in classifying internal emotional states from EEG-based physiological signals, using the deep learning model together with random search, which helped determine the most accurate parameters, and the feature extraction and selection stages applied to the input signals. Three emotional measures, Valence, Arousal, and Liking, were targeted for recognition by the proposed model, which achieved high accuracies of 99.45%, 96.87%, and 99.68%, respectively. Comparison of the experimental results of the proposed model with those of the previous studies revealed the former’s superiority in accuracy and performance; further, this model produced competitive results in the field of EEG-based emotion recognition. 
Real-world data remain difficult to collect and work on immediately due to the difficulties in creating a dataset, such as the high cost and the limitations of EEG recorders and human resources. In addition, it must be determined whether short videos can provide adequate emotional stimuli, and whether the subjects’ emotional fluctuations overlapped during the interval between any two videos. In future research, other classification algorithms can be applied to different datasets to prove their effectiveness in emotion recognition, using advanced deep learning models based on RNN algorithms, such as the GRU (Gated Recurrent Unit) and other methods. We suggest using several techniques to measure brain signals, such as functional Magnetic Resonance Imaging (fMRI) and magnetoencephalography (MEG). We also recommend studying the effect of other types of feature selection algorithms (such as hybrid cellular automata and Grey Wolf Optimizer) on the model performance.