1. Introduction
Electroencephalographic (EEG) signals have emerged as a promising area of study for detecting human emotions, with applications spanning neuroscience, psychology, brain–computer interfaces (BCIs), and affective computing. Emotions, as psychophysiological mechanisms, influence cognitive, communicative, and perceptual processes, playing a critical role in decision-making and environmental interactions [1]. Recent technological advances now allow the real-time capture of physiological responses, including brain activity, providing unprecedented opportunities to explore the intricate dynamics of emotional states.
Among the methods for studying emotional responses, auditory stimuli offer unique advantages over visual or audiovisual approaches. Sounds are particularly effective in engaging the brain regions involved in emotional processing, such as the amygdala, which plays a central role in emotion evaluation and regulation. Furthermore, auditory stimuli activate central and temporal brain regions involved in auditory perception and processing, while audiovisual stimuli also recruit occipital areas associated with visual processing [2,3]. The evolutionary importance of auditory stimuli further amplifies their impact; certain sounds, such as cries or screams, elicit immediate emotional responses and increased attention. This makes auditory-based paradigms particularly compelling for investigating emotional responses [4].
Despite the recognized importance of auditory stimuli, research on emotion recognition has predominantly relied on visual stimuli, such as images or videos. Notable tools like the International Affective Picture System (IAPS) and audiovisual databases such as DEAP and SEED are widely used to induce and analyze emotions through physiological signals [5,6,7,8]. However, databases focused exclusively on auditory stimuli remain relatively scarce. Although the International Affective Digitalized Sounds (IADS) database is widely used as a standardized resource for auditory emotional induction, it still requires further methodological refinement and adaptation for specific research applications [4,9].
The creation of an EEG dataset based on auditory stimuli addresses this gap by offering a resource tailored to understanding how sounds elicit emotional responses. Auditory stimuli can evoke significant emotional reactions by activating key brain regions, including the amygdala, and by inducing states of attention and emotion related to evolutionary processes [10]. These qualities make auditory stimuli particularly effective tools for emotional induction and underscore the necessity of a standardized EEG dataset focused on this modality.
This work introduces the creation and initial evaluation of a novel EEG dataset designed to analyze emotional responses induced by auditory stimuli from the IADS database. The dataset was generated through a controlled experimental protocol, wherein auditory stimuli were carefully normalized for homogeneity. EEG recordings were obtained using a 16-channel system synchronized with platforms such as OpenVibe and Lab Streaming Layer (LSL). Dual labeling was implemented, incorporating both participants’ self-reported emotional responses and the intrinsic characteristics of the auditory stimuli.
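As an illustration of how such synchronization can be arranged, the following minimal sketch assumes the pylsl Python bindings for Lab Streaming Layer; the stream names, marker format, and the EEG source (e.g., exposed by OpenViBE) are hypothetical choices for illustration, not the acquisition code used in this study.

```python
# Minimal LSL synchronization sketch (illustrative; stream names are assumptions).
from pylsl import StreamInfo, StreamInlet, StreamOutlet, resolve_byprop

# Publish a marker stream so each auditory stimulus onset is time-stamped on
# the same LSL clock as the EEG samples.
marker_info = StreamInfo(name="AuditoryMarkers", type="Markers",
                         channel_count=1, nominal_srate=0,
                         channel_format="string", source_id="stim_pc")
marker_outlet = StreamOutlet(marker_info)

# Resolve the EEG stream exposed by the acquisition software (e.g., OpenViBE).
eeg_streams = resolve_byprop("type", "EEG", timeout=10.0)
if eeg_streams:
    eeg_inlet = StreamInlet(eeg_streams[0])

    # Mark the onset of a stimulus; audio playback itself is handled elsewhere.
    marker_outlet.push_sample(["IADS_stimulus_001"])

    # Pull EEG samples; the shared LSL timestamps allow offline alignment
    # of the recordings with the stimulus markers.
    sample, timestamp = eeg_inlet.pull_sample(timeout=1.0)
```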
The primary contribution of this study is the provision of a valuable resource for the scientific community, aimed at advancing the characterization and classification of emotional states through machine learning techniques [11]. Beyond fostering future research in emotion recognition, this dataset lays the groundwork for developing BCI systems that integrate affective analysis, with potential applications in mental health, education, and human–computer interaction [6,12,13].
4. Discussion
The database of EEG signals with sound stimulation developed in this work is a significant resource for the scientific community interested in the analysis of emotional states (positive, negative, and neutral) through brain activity. The design and compilation of the dataset were intrinsically linked to its potential application in the training and validation of machine learning algorithms aimed at the classification and understanding of human emotional responses. The process was carried out in three stages. The first stage involved the design of the experiment, including the calculation of the number of participants, the selection of stimuli, and the design of the signal acquisition and stimulation protocol. The second stage involved the execution of the experiment, and the third stage consisted of the analysis of the acquired signals to design the recognition methodology, together with the evaluation of the volunteers' subjective responses. These responses were obtained from the instruments applied during the experimental phase (the PANAS form and the Affective Slider).
The IADS database was selected for the characterization process because it is a widely validated standard that is appropriately referenced in the existing literature [4,20]. This database was determined to be the most suitable for the characterization of EEG signals for the recognition of three emotional states, given that it is one of the few available resources containing validated auditory stimuli. A theoretical framework was required for the characterization process, and the IADS database provided the necessary reference labels for the theoretical evaluation of the EEG response. The results of both the PANAS form and the Affective Slider indicated that the majority of volunteers exhibited minimal emotional fluctuations at the end of the experiment. A contributing factor to the reduced emotional distress experienced by participants when presented with stimuli classified as negative in the IADS database was determined to be the normalization of violence. This phenomenon has been widely reported in the literature [21,22,23], where it is posited that the normalization of violence is significantly influenced by the prevailing sociocultural context, which, in turn, modulates societal perceptions of and reactions to violent content. Consequently, the outcomes of the negative class exhibited a high degree of overlap with those of the neutral class. Furthermore, it was observed that the age of the database rendered many of the neutral and positive stimuli obsolete in the contemporary context. While the evaluation of stimuli is known to be significantly influenced by cultural factors, these stimuli nevertheless served as a crucial input for the subsequent creation of an EEG signal database for the recognition of three emotional states using auditory stimulation.
In the pattern extraction process, eight training scenarios were proposed. The first two focused on the calculation of pleasure and activation indices from EEG signal processing in the frontal region, using the beta (β) and alpha (α) rhythms. According to the literature [17,24], these indices generally provide relevant features for training emotional classification models, facilitating the separation between classes of emotional states. However, in this study, models trained with these patterns did not achieve classification rates higher than 64%. In contrast, studies such as Koelstra et al. [6] and Zheng and Lu [7] propose analyzing the emotional response using asymmetry indices, based on the comparison of complementary electrode activity to identify discriminative patterns. Following this line, training scenarios 3 to 8 explored combinations of EEG signal patterns across the five frequency bands present in the electroencephalographic recording (delta (δ), theta (θ), alpha (α), beta (β), and gamma (γ)). The distribution of the asymmetry response per participant demonstrates the subjectivity of the emotion induction process: the sociocultural context affects both the indices derived from the Affective Slider analysis and the distribution of the patterns in training scenarios 1 to 8. An overlap is observed between the analyzed classes, which prevents the trained classifiers from reaching classification rates above 90%. In this context, the ANN that demonstrated the best performance in recognizing the three emotional states was the one trained with scenario 4 (see Table 4). This scenario analyzes the hemispheric brain response by calculating the asymmetries of complementary electrodes, resulting in an accuracy of 88.57% ± 2.00%. An alternative is the ANN trained with scenario 5, which reached an accuracy of 70.22% ± 2.45%. A subsequent analysis of the validation results for the SVM, LDA, and CB classifiers revealed that these methods were unable to achieve satisfactory performance in recognizing the three emotion classes examined. This limited performance is likely attributable to the inherent complexity and high dimensionality of EEG signals, which often contain non-linear relationships and subtle patterns that traditional linear or shallow classifiers struggle to capture effectively. For such intricate data, more sophisticated approaches may be necessary.
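For clarity, the sketch below illustrates the two kinds of features discussed above: a beta/alpha activation (arousal) index with a frontal valence index, and log-power asymmetries between complementary left/right electrodes across the five frequency bands. The electrode pair (F3/F4), band limits, sampling rate, and index formulas are illustrative assumptions drawn from the broader literature (e.g., Koelstra et al. [6]), not the exact feature pipeline used in this study.

```python
# Illustrative feature-extraction sketch; all numeric choices are assumptions.
import numpy as np
from scipy.signal import welch

FS = 128  # sampling rate in Hz (assumed)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_power(x: np.ndarray, band: tuple, fs: int = FS) -> float:
    """Mean power of a single-channel signal inside a frequency band (Welch PSD)."""
    freqs, psd = welch(x, fs=fs, nperseg=fs * 2)
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[mask].mean()

def frontal_indices(f3: np.ndarray, f4: np.ndarray) -> tuple:
    """Activation (arousal) and pleasure (valence) indices from frontal alpha/beta power."""
    a3, a4 = band_power(f3, BANDS["alpha"]), band_power(f4, BANDS["alpha"])
    b3, b4 = band_power(f3, BANDS["beta"]), band_power(f4, BANDS["beta"])
    arousal = (b3 + b4) / (a3 + a4)   # beta/alpha ratio over the frontal pair
    valence = (a4 / b4) - (a3 / b3)   # right-left frontal balance
    return arousal, valence

def asymmetry_features(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Log-power asymmetry of one complementary electrode pair, one value per band."""
    return np.array([np.log(band_power(right, b)) - np.log(band_power(left, b))
                     for b in BANDS.values()])

if __name__ == "__main__":
    # Synthetic 4-second signals standing in for the F3/F4 recordings.
    t = np.arange(0, 4, 1 / FS)
    f3 = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
    f4 = np.sin(2 * np.pi * 20 * t) + 0.5 * np.random.randn(t.size)
    print(frontal_indices(f3, f4))
    print(asymmetry_features(f3, f4))
```

In a scenario of the kind described above, the per-band asymmetry vectors from several complementary pairs would be concatenated to form the input features of the classifiers.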
Specifically, the application of deep learning techniques, such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs), could offer significant improvements. These models are inherently designed to learn hierarchical features directly from raw or minimally preprocessed data, potentially uncovering nuanced emotional states embedded within the EEG signals. Alternatively, implementing a robust dimensionality reduction strategy, such as principal component analysis (PCA) or independent component analysis (ICA), as an initial step could also enhance recognition performance. By transforming high-dimensional EEG data into a set of uncorrelated components that capture the most significant variance, these methods can not only mitigate the curse of dimensionality but also highlight features that are more discriminative for emotion classification, thereby improving the performance of even the traditional classifiers on this dataset.
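A minimal sketch of this dimensionality reduction strategy, assuming a scikit-learn workflow, is given below; the 95% variance threshold, the SVM classifier, and the placeholder data are illustrative choices rather than results obtained on this dataset.

```python
# Illustrative PCA + SVM pipeline; the random data are placeholders standing in
# for real EEG feature vectors and the three emotion labels.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 80))   # 120 trials x 80 EEG features (placeholder)
y = rng.integers(0, 3, size=120)     # three emotion classes (placeholder)

pipeline = make_pipeline(
    StandardScaler(),                # put features on a common scale before PCA
    PCA(n_components=0.95),          # keep components explaining 95% of the variance
    SVC(kernel="rbf", C=1.0),        # traditional classifier on the reduced features
)
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

In practice, the number of retained components and the downstream classifier would be tuned on the recorded EEG features themselves rather than on placeholder data.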
Limitations and Experimental Challenges: During the experimental process, difficulties related to the normalization of violence among some volunteers were identified. As these were young individuals who had grown up in violent environments, we observed that many tended to interpret the negative stimuli presented as normal or expected situations. This altered perception resulted in the absence of the expected emotional responses in the EEG signals. The effect of this normalization was evident in the overlap between classes during the valence and arousal analysis performed on the EEG signals, which hindered a clear differentiation of the expected negative emotional states.
5. Conclusions
A novel database for emotion recognition through auditory stimulation was successfully developed and validated using electroencephalography (EEG) signals. The primary contribution of this study is the establishment of a robust methodology for characterizing and classifying emotional states, addressing a critical gap in the existing literature. A notable finding was that participants’ perception of negatively valenced stimuli was altered, with these stimuli often being perceived as neutral. This phenomenon was also observed during the EEG signal characterization process, suggesting that the sociocultural environment influences the perception of these stimuli.
It was determined that artificial neural networks (ANNs) exhibited superior performance for emotion recognition, achieving an accuracy of 88.57% ± 2.00% and significantly outperforming the other machine learning classifiers evaluated (SVM, LDA, and CB). This result was obtained using a training scenario that incorporated asymmetry indices derived from the spectral response of complementary electrodes in the frontal, temporal, and central regions. These findings highlight the importance of analyzing inter-hemispheric brain activity for the accurate classification of emotional states.
The development of this database and the validation of this methodology provide a valuable resource for advancing research in affective computing and BCI, particularly within the less-explored domain of auditory-induced emotional responses. The dataset’s meticulously curated content and standardized platform enable direct comparisons between various machine learning algorithms and signal processing techniques, thereby accelerating the development of more robust and accurate emotional state classification models.