1. Introduction
Electroencephalographic (EEG) signals have emerged as a promising area of study for detecting human emotions, with applications spanning neuroscience, psychology, brain–computer interfaces (BCIs), and affective computing. Emotions, as psychophysiological mechanisms, influence cognitive, communicative, and perceptual processes, playing a critical role in decision-making and environmental interactions [1]. Recent technological advances now allow the real-time capture of physiological responses, including brain activity, providing unprecedented opportunities to explore the intricate dynamics of emotional states.
Among the methods for studying emotional responses, auditory stimuli offer unique advantages over visual or audiovisual approaches. Sounds are particularly effective in engaging the brain regions involved in emotional processing, such as the amygdala, which plays a central role in emotion evaluation and regulation. Furthermore, auditory stimuli activate central and temporal brain regions involved in auditory perception and processing, while audiovisual stimuli also recruit occipital areas associated with visual processing [2,3]. The evolutionary importance of auditory stimuli further amplifies their impact; certain sounds, such as cries or screams, elicit immediate emotional responses and increased attention. This makes auditory-based paradigms particularly compelling for investigating emotional responses [4].
Despite the recognized importance of auditory stimuli, research on emotion recognition has predominantly relied on visual stimuli, such as images or videos. Notable tools like the International Affective Picture System (IAPS) and audiovisual databases such as DEAP and SEED are widely used to induce and analyze emotions through physiological signals [5,6,7,8]. However, databases focused exclusively on auditory stimuli remain relatively scarce. Although the International Affective Digitalized Sounds (IADS) database is widely used as a standardized resource for auditory emotional induction, it still requires further methodological refinement and adaptation for specific research applications [4,9].
The creation of an EEG dataset based on auditory stimuli addresses this gap by offering a resource tailored to understanding how sounds elicit emotional responses. Auditory stimuli can evoke significant emotional reactions by activating key brain regions, including the amygdala, and by inducing states of attention and emotion related to evolutionary processes [10]. These qualities make auditory stimuli particularly effective tools for emotional induction and underscore the necessity of a standardized EEG dataset focused on this modality.
This work introduces the creation and initial evaluation of a novel EEG dataset designed to analyze emotional responses induced by auditory stimuli from the IADS database. The dataset was generated through a controlled experimental protocol, wherein auditory stimuli were carefully normalized for homogeneity. EEG recordings were obtained using a 16-channel system synchronized with platforms such as OpenVibe and Lab Streaming Layer (LSL). Dual labeling was implemented, incorporating both participants’ self-reported emotional responses and the intrinsic characteristics of the auditory stimuli.
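As an illustration of how such synchronization can be arranged, the following minimal sketch assumes the pylsl Python bindings for Lab Streaming Layer; the stream names, marker format, and the EEG source (e.g., exposed by OpenViBE) are hypothetical choices for illustration, not the acquisition code used in this study.

```python
# Minimal LSL synchronization sketch (illustrative; stream names are assumptions).
from pylsl import StreamInfo, StreamInlet, StreamOutlet, resolve_byprop

# Publish a marker stream so each auditory stimulus onset is time-stamped on
# the same LSL clock as the EEG samples.
marker_info = StreamInfo(name="AuditoryMarkers", type="Markers",
                         channel_count=1, nominal_srate=0,
                         channel_format="string", source_id="stim_pc")
marker_outlet = StreamOutlet(marker_info)

# Resolve the EEG stream exposed by the acquisition software (e.g., OpenViBE).
eeg_streams = resolve_byprop("type", "EEG", timeout=10.0)
if eeg_streams:
    eeg_inlet = StreamInlet(eeg_streams[0])

    # Mark the onset of a stimulus; audio playback itself is handled elsewhere.
    marker_outlet.push_sample(["IADS_stimulus_001"])

    # Pull EEG samples; the shared LSL timestamps allow offline alignment
    # of the recordings with the stimulus markers.
    sample, timestamp = eeg_inlet.pull_sample(timeout=1.0)
```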
The primary contribution of this study is the provision of a valuable resource for the scientific community, aimed at advancing the characterization and classification of emotional states through machine learning techniques [11]. Beyond fostering future research in emotion recognition, this dataset lays the groundwork for developing BCI systems that integrate affective analysis, with potential applications in mental health, education, and human–computer interaction [6,12,13].
4. Discussion
The database of EEG signals with sound stimulation developed in this work is a significant resource for the scientific community interested in the analysis of emotional states (positive, negative, and neutral) through brain activity. The design and compilation of the dataset were intrinsically linked to its potential application in the training and validation of machine learning algorithms aimed at the classification and understanding of human emotional responses. The process was carried out in three stages. The first stage involved the design of the experiment, including the calculation of the number of participants, the selection of stimuli, and the design of the signal acquisition and stimulation protocol. The second stage involved the execution of the experiment, and the third stage consisted of the analysis of the acquired signals to design the recognition methodology, together with the evaluation of the volunteers' subjective responses. These responses were obtained from the instruments applied during the experimental phase (the PANAS form and the Affective Slider).
The IADS database was selected for the characterization process because it is a widely validated standard that is appropriately referenced in the existing literature [4,20]. This database was determined to be the most suitable for the characterization of EEG signals for the recognition of three emotional states, given that it is one of the few available resources containing validated auditory stimuli. A theoretical framework was required for the characterization process, and the IADS database provided the necessary reference labels for the theoretical evaluation of the EEG response. The results of both the PANAS form and the Affective Slider indicated that the majority of volunteers exhibited minimal emotional fluctuations at the end of the experiment. A contributing factor to the reduced emotional distress experienced by participants when presented with stimuli classified as negative in the IADS database was determined to be the normalization of violence. This phenomenon has been widely reported in the literature [21,22,23], where it is posited that the normalization of violence is significantly influenced by the prevailing sociocultural context, which, in turn, modulates societal perceptions of and reactions to violent content. Consequently, the outcomes of the negative class exhibited a high degree of overlap with those of the neutral class. Furthermore, it was observed that the age of the database rendered many of the neutral and positive stimuli obsolete in the contemporary context. While the evaluation of stimuli is known to be significantly influenced by cultural factors, these stimuli nevertheless served as a crucial input for the subsequent creation of an EEG signal database for the recognition of three emotional states using auditory stimulation.
In the pattern extraction process, eight training scenarios were proposed. The first two focused on the calculation of pleasure and activation indices from EEG signal processing in the frontal region, using the beta (β) and alpha (α) rhythms. According to the literature [17,24], these indices generally provide relevant features for training emotional classification models, facilitating the separation between classes of emotional states. However, in this study, models trained with these patterns did not achieve classification rates higher than 64%. In contrast, studies such as Koelstra et al. [6] and Zheng and Lu [7] propose analyzing the emotional response using asymmetry indices, based on the comparison of complementary electrode activity to identify discriminative patterns. Following this line, training scenarios 3 to 8 explored combinations of EEG signal patterns across the five frequency bands present in the electroencephalographic recording (delta (δ), theta (θ), alpha (α), beta (β), and gamma (γ)). The distribution of the asymmetry response per participant demonstrates the subjectivity of the emotion induction process: the sociocultural context affects both the indices derived from the Affective Slider analysis and the distribution of the patterns in training scenarios 1 to 8. An overlap is observed between the analyzed classes, which prevents the trained classifiers from reaching classification rates above 90%. In this context, the ANN that demonstrated the best performance in recognizing the three emotional states was the one trained with scenario 4 (see Table 4). This scenario analyzes the hemispheric brain response by calculating the asymmetries of complementary electrodes, resulting in an accuracy of 88.57% ± 2.00%. An alternative is the ANN trained with scenario 5, which reached an accuracy of 70.22% ± 2.45%. A subsequent analysis of the validation results for the SVM, LDA, and CB classifiers revealed that these methods were unable to achieve satisfactory performance in recognizing the three emotion classes examined. This limited performance is likely attributable to the inherent complexity and high dimensionality of EEG signals, which often contain non-linear relationships and subtle patterns that traditional linear or shallow classifiers struggle to capture effectively. For such intricate data, more sophisticated approaches may be necessary.
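For clarity, the sketch below illustrates the two kinds of features discussed above: a beta/alpha activation (arousal) index with a frontal valence index, and log-power asymmetries between complementary left/right electrodes across the five frequency bands. The electrode pair (F3/F4), band limits, sampling rate, and index formulas are illustrative assumptions drawn from the broader literature (e.g., Koelstra et al. [6]), not the exact feature pipeline used in this study.

```python
# Illustrative feature-extraction sketch; all numeric choices are assumptions.
import numpy as np
from scipy.signal import welch

FS = 128  # sampling rate in Hz (assumed)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_power(x: np.ndarray, band: tuple, fs: int = FS) -> float:
    """Mean power of a single-channel signal inside a frequency band (Welch PSD)."""
    freqs, psd = welch(x, fs=fs, nperseg=fs * 2)
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[mask].mean()

def frontal_indices(f3: np.ndarray, f4: np.ndarray) -> tuple:
    """Activation (arousal) and pleasure (valence) indices from frontal alpha/beta power."""
    a3, a4 = band_power(f3, BANDS["alpha"]), band_power(f4, BANDS["alpha"])
    b3, b4 = band_power(f3, BANDS["beta"]), band_power(f4, BANDS["beta"])
    arousal = (b3 + b4) / (a3 + a4)   # beta/alpha ratio over the frontal pair
    valence = (a4 / b4) - (a3 / b3)   # right-left frontal balance
    return arousal, valence

def asymmetry_features(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Log-power asymmetry of one complementary electrode pair, one value per band."""
    return np.array([np.log(band_power(right, b)) - np.log(band_power(left, b))
                     for b in BANDS.values()])

if __name__ == "__main__":
    # Synthetic 4-second signals standing in for the F3/F4 recordings.
    t = np.arange(0, 4, 1 / FS)
    f3 = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
    f4 = np.sin(2 * np.pi * 20 * t) + 0.5 * np.random.randn(t.size)
    print(frontal_indices(f3, f4))
    print(asymmetry_features(f3, f4))
```

In a scenario of the kind described above, the per-band asymmetry vectors from several complementary pairs would be concatenated to form the input features of the classifiers.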
Specifically, the application of deep learning techniques, such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs), could offer significant improvements. These models are inherently designed to learn hierarchical features directly from raw or minimally preprocessed data, potentially uncovering nuanced emotional states embedded within the EEG signals. Alternatively, implementing a robust dimensionality reduction strategy, such as principal component analysis (PCA) or independent component analysis (ICA), as an initial step could also enhance recognition performance. By transforming high-dimensional EEG data into a set of uncorrelated components that capture the most significant variance, these methods can not only mitigate the curse of dimensionality but also highlight features that are more discriminative for emotion classification, thereby improving the performance of even the traditional classifiers on this dataset.
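A minimal sketch of this dimensionality reduction strategy, assuming a scikit-learn workflow, is given below; the 95% variance threshold, the SVM classifier, and the placeholder data are illustrative choices rather than results obtained on this dataset.

```python
# Illustrative PCA + SVM pipeline; the random data are placeholders standing in
# for real EEG feature vectors and the three emotion labels.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 80))   # 120 trials x 80 EEG features (placeholder)
y = rng.integers(0, 3, size=120)     # three emotion classes (placeholder)

pipeline = make_pipeline(
    StandardScaler(),                # put features on a common scale before PCA
    PCA(n_components=0.95),          # keep components explaining 95% of the variance
    SVC(kernel="rbf", C=1.0),        # traditional classifier on the reduced features
)
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

In practice, the number of retained components and the downstream classifier would be tuned on the recorded EEG features themselves rather than on placeholder data.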
Limitations and Experimental Challenges: During the experimental process, difficulties related to the normalization of violence among some volunteers were identified. As these were young individuals who had grown up in violent environments, we observed that many tended to interpret the negative stimuli presented as normal or expected situations. This altered perception resulted in the absence of the expected emotional responses in the EEG signals. The effect of this normalization was evident in the overlap between classes during the valence and arousal analysis performed on the EEG signals, which hindered a clear differentiation of the expected negative emotional states.
5. Conclusions
A novel database for emotion recognition through auditory stimulation was successfully developed and validated using electroencephalography (EEG) signals. The primary contribution of this study is the establishment of a robust methodology for characterizing and classifying emotional states, addressing a critical gap in the existing literature. A notable finding was that participants’ perception of negatively valenced stimuli was altered, with these stimuli often being perceived as neutral. This phenomenon was also observed during the EEG signal characterization process, suggesting that the sociocultural environment influences the perception of these stimuli.
It was determined that artificial neural networks (ANNs) exhibited superior performance for emotion recognition, achieving an accuracy of 88.57% ± 2.00% and significantly outperforming the other machine learning classifiers evaluated (SVM, LDA, and CB). This result was obtained using a training scenario that incorporated asymmetry indices derived from the spectral response of complementary electrodes in the frontal, temporal, and central regions. These findings highlight the importance of analyzing inter-hemispheric brain activity for the accurate classification of emotional states.
The development of this database and the validation of this methodology provide a valuable resource for advancing research in affective computing and BCI, particularly within the less-explored domain of auditory-induced emotional responses. The dataset’s meticulously curated content and standardized platform enable direct comparisons between various machine learning algorithms and signal processing techniques, thereby accelerating the development of more robust and accurate emotional state classification models.