1. Introduction
Emotions play an important role in human relations, and in recent years they have also gained interest in human-computer interaction. For instance, the possibility of including users' emotional states as input to computing systems is being explored in order to dynamically adapt the interface to their needs at each moment [1,2,3,4,5]. Psychologists distinguish between physiological arousal, behavioural expression and the conscious experience of emotions [6]. Traditional approaches analyse changes in facial expression and/or voice to infer emotional states [7,8,9].
However, emotions are not always readily exhibited through these cues. For this reason, specialised devices that capture brain activity are increasingly used to detect emotions. A brain-computer interface (BCI), which uses electroencephalography (EEG) to record the electrical activity of the brain, is one such device. Among its manifold applications, EEG signals allow emotional models to be built that report a person's emotional state when an external stimulus is presented [6].
Traditionally, EEG techniques have been applied in medical settings [10,11,12]. Nonetheless, they are now being used in fields as diverse as marketing, video games and e-learning [13,14]. These new fields of application have driven the evolution of EEG devices, following new users' needs in terms of usability, affordability and portability. The Emotiv EPOC+ headset (https://www.emotiv.com/EPOC/) is one of these new devices: a low-priced, lightweight and portable BCI-EEG device that offers great flexibility compared to the traditional devices used in medicine. Even so, it captures, processes and analyses a relatively large amount of data in real time, which has enabled its use in new tele-health systems [15].
The purpose of this paper is to assess the classification accuracy of the emotional states provided by the application programming interface (API) of the Emotiv EPOC+ headset. To this end, an experiment is introduced in which several sets of images, extracted from the International Affective Picture System (IAPS) database [16], are presented to sixteen participants. Their subjective answers and the values provided by the API are compared with the valence, arousal and dominance values of the images as validated and labelled in IAPS. The recordings of these values are analysed through artificial neural networks (ANNs) to validate the emotional model. An ANN is a type of non-linear classifier used in a wide variety of disciplines [17], and ANNs have gained significant interest in emotion classification from EEG signals in the last few years [18,19,20,21,22].
To the best of our knowledge, there is no previous work assessing the accuracy of the headset's real-time measurements by comparing them with a validated image database by means of ANNs. The rest of the paper is organised as follows. Section 2 describes the materials and methods used in the study. Section 3 presents the results obtained. Lastly, Section 4 discusses the most relevant conclusions derived from the present study.
3. Results
This section compares the results obtained with different ANN configurations. The target outputs of the tested networks are the values of valence, arousal and dominance, which are known in advance for each IAPS image selected for experimentation. The configurations differ in the number of layers, the number of neurons per layer and the learning method. Moreover, the inverted pyramid method is used to determine the number of neurons in the first layer: as a first approximation, the product of the number of input and output variables is used, and this number is then increased to find the maximum performance of the network. The process stops when performance starts decreasing.
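The stopping rule just described can be sketched as follows. Here `evaluate(n)` stands in for training and scoring a network whose first hidden layer has `n` neurons; the `mock_accuracy` curve, the step size and the starting sizes are illustrative assumptions, not the study's actual evaluation.

```python
def inverted_pyramid_search(evaluate, n_inputs, n_outputs, step=5, max_neurons=100):
    """Grow the first hidden layer until performance stops improving.

    The starting size is the product of the numbers of input and output
    variables; the search stops as soon as the score decreases.
    """
    n = n_inputs * n_outputs                 # first approximation
    best_n, best_score = n, evaluate(n)
    while n + step <= max_neurons:
        n += step
        score = evaluate(n)
        if score <= best_score:              # performance starts decreasing
            break
        best_n, best_score = n, score
    return best_n, best_score

# Hypothetical stand-in for "train and score a network with n neurons":
# a concave curve peaking at n = 30 (not data from the paper).
mock_accuracy = lambda n: 0.85 - 0.0002 * (n - 30) ** 2

best_n, best_acc = inverted_pyramid_search(mock_accuracy, n_inputs=5, n_outputs=3)
```

With this mock curve the search starts at 15 neurons, improves through 20 and 25, peaks at 30 and stops at 35, returning the peak.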
Table 2 shows the ANN configurations that offer the best classification accuracies for the purposes established in this study. These are single-layer networks (L = 1) with N = 15 or N = 30 neurons, and multiple-layer networks with L = 3 layers and N = 15-8-3 or N = 30-8-3 neurons. Both the L-M and BR learning methods have been used for these configurations. In all cases, the activation function is the sigmoid.
3.1. Assessment of SAM Responses vs. IAPS Values
Firstly, the answers to the SAM scale obtained from the participants of the present study are compared with the values provided by the original IAPS database validation, as an initial step towards determining the classification accuracy of the emotional model implemented by the API of the Emotiv EPOC+. Should the correlation between the two sets of results be high enough, this would mean that the subset of selected IAPS images is a good representation of the emotions that were intended to be evoked in the participants, providing an excellent starting point for checking the effectiveness of the model developed for the study.
For this reason, the percentage of hits has been calculated for all the answers given by the sixteen participants. The hits are obtained assuming a normal distribution of the values provided in IAPS: a response in the SAM questionnaire is considered a hit with regard to the IAPS values when it lies within one standard deviation of the mean, a band that covers 68.27% of a normal distribution. Table 3 shows the resulting percentages of hits.
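The hit criterion above amounts to a simple interval test per image. The sketch below computes the hit rate for one image; the response values, mean and standard deviation are hypothetical examples, not values from the study or from IAPS.

```python
def hit_rate(responses, iaps_mean, iaps_sd):
    """Fraction of SAM responses lying within one standard deviation
    of the IAPS mean for a given image and dimension. Under normality
    that band covers ~68.27% of the mass, the baseline used in Table 3."""
    hits = [abs(r - iaps_mean) <= iaps_sd for r in responses]
    return sum(hits) / len(hits)

# Hypothetical SAM valence responses for one image, against an
# illustrative IAPS mean of 5.0 and standard deviation of 1.5.
rate = hit_rate([4.2, 5.1, 6.9, 5.5, 3.0, 5.0, 6.4, 4.8], 5.0, 1.5)
```

In this toy example six of the eight responses fall inside the one-sigma band, giving a hit rate of 0.75.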
The results of mean hits are not as good as expected (see Table 3). Therefore, in order to better adjust the correlation between SAM responses and IAPS values for the specific sixteen participants of this study, artificial neural networks are tested. This type of approach may be oversized for the task; nevertheless, with this specific purpose in mind, ANNs are designed using the mean SAM values for valence, arousal and dominance declared by the participants as input parameters, and the mean values provided for the IAPS pictures as output parameters. As shown in Table 4, the classification accuracy for each of the implemented configurations is above 90% for all performed analyses. According to these results, L-M offers the highest accuracy (96% for L = 3/N = 15-8-3, shown in bold). Hence, a higher correlation is reached through training ANNs.
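To make the network concrete, the sketch below trains a perceptron of the paper's smallest shape (one hidden layer of 15 sigmoid units, 3 inputs, 3 outputs) on synthetic stand-in data. The data, seed, learning rate and iteration count are assumptions, and plain batch gradient descent stands in for the L-M and BR training methods used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Synthetic stand-in data: 60 "images", inputs playing the role of
# per-image mean SAM values (valence, arousal, dominance) and targets
# the IAPS means, both rescaled to [0, 1]. Not the experiment's data.
X = rng.uniform(0.0, 1.0, size=(60, 3))
Y = np.clip(X + rng.normal(0.0, 0.05, size=X.shape), 0.0, 1.0)

# L = 1 / N = 15 configuration with sigmoid activations.
W1 = rng.normal(0.0, 0.5, size=(3, 15)); b1 = np.zeros(15)
W2 = rng.normal(0.0, 0.5, size=(15, 3)); b2 = np.zeros(3)

lr = 1.0
for _ in range(4000):
    H = sigmoid(X @ W1 + b1)              # hidden activations
    P = sigmoid(H @ W2 + b2)              # network output
    # Backpropagate the mean-squared error through both layers.
    dP = (P - Y) * P * (1.0 - P)
    dH = (dP @ W2.T) * H * (1.0 - H)
    W2 -= lr * H.T @ dP / len(X); b2 -= lr * dP.mean(axis=0)
    W1 -= lr * X.T @ dH / len(X); b1 -= lr * dH.mean(axis=0)

mse = float(((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - Y) ** 2).mean())
```

After training, the mean-squared error falls well below the trivial baseline of predicting the mid-scale value for every image, which is the sense in which the small configurations of Table 4 already fit the data closely.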
3.2. Different ANN Configurations to Compare Emotiv EPOC+ API Outcomes with IAPS Values
Secondly, many ANN configurations are tested by varying each parameter at every layer, with the aim of comparing the Emotiv EPOC+ API outcomes with the IAPS values. Although hardware limits the size of the ANNs that can be tested, a very large configuration is not necessary to obtain good results. In our case, 76% classification accuracy is obtained with an L = 1/N = 15 configuration using the L-M adjustment method for all emotional states present during the experiment (see Table 5). If the number of hidden layers is increased to L = 3 (keeping the same parameters), the performance of the network decreases by 3% for the L-M method. On the other hand, network performance increases up to 85% (shown in bold) when using the BR method. This improvement is due to the fact that this type of approach usually performs better with small datasets and larger numbers of layers.
4. Conclusions and Discussion
This paper has investigated the field of emotion elicitation and, more specifically, the use of artificial neural networks to classify emotions. In this case, we have focused on the outcomes of the API of the Emotiv EPOC+ headset after processing electroencephalogram signals. The emotional states calculated by the API have been compared with validated valence, arousal and dominance values from the IAPS database.
The first step, prior to examining the ANN classification accuracy of the API, was to validate the fit of the responses given by sixteen participants, via a SAM questionnaire, after viewing IAPS images. The percentages of hits, 85.94%, 79.69% and 78.13% for valence, arousal and dominance respectively, were not as good as expected. This may be partially due to the selection of the images for the two low-valence conditions, where the level of valence is likely too low.
In general, the selection of images is challenging in this type of experiment, as emotional elicitation depends on personal stereotypes [32]. In this case, the images selected could affect the results achieved in our experiment. Nonetheless, after using an ANN-based approach, up to 96% classification accuracy has been reached for the specific sixteen participants who took part in this experiment. Thus, although the chosen images could have had some effect on our experiment, the ANNs have mitigated it.
For the second and most important step, several multilayer perceptron ANN configurations were analysed to evaluate the emotional outcomes of the API of the Emotiv EPOC+ headset. This study demonstrated that a multilayer perceptron is sufficient to validate the Emotiv EPOC+ API outcomes; it is not necessary to resort to more complex solutions (convolutional neural networks, deep learning and so on) for the problem at hand. Hence, the main conclusion is that the emotional model implemented in the API offers 85% classification accuracy with respect to the validated IAPS values. This result is in line with other research in the field of emotion recognition [32,33].
The proposed solution presents a series of advantages and disadvantages. An essential advantage is that multilayer perceptrons with backpropagation provide a simple way to validate the emotional model implemented for the headset. As shown in this paper, the results obtained are excellent considering that the headset was designed for gaming and is low-priced compared to other devices. Another advantage is that the emotional states given by the headset's API can be used without further processing of the electroencephalogram signals acquired from the scalp.
On the other hand, a clear disadvantage of using the headset's API is that the software is not transparent, so it is not possible to further enhance the algorithms related to EEG signal processing. Another limitation of the study is the relatively low number of participants: a larger number of people involved in the experimentation would guarantee statistically normalised data, closer to that of the IAPS database volunteers. Nonetheless, the good results already obtained with sixteen participants should be highlighted.