MB-MSTFNet: A Multi-Band Spatio-Temporal Attention Network for EEG Sensor-Based Emotion Recognition
Abstract
1. Introduction
2. Proposed Method
- (1) Decompose raw EEG signals into distinct frequency bands corresponding to different brain states and partition them into temporal segments.
- (2) Compute DE and PSD for each frequency-band slice, map these features onto the brain’s spatial topology, and concatenate them to form 3D fused features (a minimal sketch of steps (1)–(2) follows this list).
- (3) Extract local EEG features using a CNN and perform feature fusion via multi-scale convolution and pooling in the Inception module.
- (4) After dimensionality reduction via a max-pooling layer, flatten the features and use BiGRU layers to capture bidirectional temporal dependencies in the feature sequence, revealing the signals’ dynamic patterns.
- (5) Integrate a multi-head attention mechanism to compute weights across different subspaces, enabling adaptive focus on critical features and suppression of interfering information.
- (6) Map the processed features to a four-class classification space through a fully connected layer, using the softmax function as the classifier to generate the recognition outputs.
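As a concrete illustration of steps (1) and (2), the sketch below band-passes a segment into the four canonical bands and computes DE and log-PSD per channel. The band edges, 1 s segment length, and filter order are illustrative assumptions (DEAP’s preprocessed EEG is sampled at 128 Hz); the paper’s exact settings may differ.

```python
# Hedged sketch of band decomposition + DE/PSD extraction (steps 1-2).
import numpy as np
from scipy.signal import butter, filtfilt, welch

FS = 128  # DEAP preprocessed EEG sampling rate (Hz)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def bandpass(x, lo, hi, fs=FS, order=4):
    """Zero-phase Butterworth band-pass filter along the last axis."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x, axis=-1)

def differential_entropy(x):
    """DE of an approximately Gaussian band-limited signal:
    0.5 * log(2 * pi * e * variance)."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x, axis=-1))

def band_power(x, lo, hi, fs=FS):
    """Mean PSD within [lo, hi] Hz, estimated with Welch's method."""
    f, pxx = welch(x, fs=fs, nperseg=fs)
    mask = (f >= lo) & (f <= hi)
    return pxx[..., mask].mean(axis=-1)

def extract_features(segment):
    """segment: (n_channels, n_samples) -> (2 * n_bands, n_channels).
    DE and log-PSD per band, ready to scatter onto a 2D electrode grid."""
    feats = []
    for lo, hi in BANDS.values():
        feats.append(differential_entropy(bandpass(segment, lo, hi)))
        feats.append(np.log(band_power(segment, lo, hi)))  # log-PSD, a common choice
    return np.stack(feats)

# Example: one 1 s segment of 32-channel EEG
seg = np.random.randn(32, FS)
print(extract_features(seg).shape)  # (8, 32)
```

Stacking DE and log-PSD over four bands yields eight feature channels per electrode, consistent with the 8-channel input maps in the architecture table of Section 3.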
2.1. Three-Dimensional Feature Representation via DE and PSD Fusion
2.2. Multi-Scale Spatial Feature Learning
2.3. Bidirectional Temporal Attention Fusion
3. Experiments
3.1. Experimental Setup
3.2. Experimental Results
3.2.1. Emotion Recognition Results
3.2.2. Ablation Studies
- (1) DE-PSD feature fusion integrates complementary types of EEG features, providing richer emotional cues and yielding a significant improvement in recognition performance.
- (2) The BiGRU structure captures more comprehensive temporal dynamics through bidirectional processing, outperforming both the unidirectional GRU and the GRU-free variant.
- (3) The Inception module enriches the feature representation through multi-scale feature extraction, capturing hierarchical spatial correlations across different electrode neighborhoods and thereby strongly supporting accurate emotion recognition.
- (4) The MHA mechanism models multi-dimensional feature dependencies in parallel, sharpening the model’s focus on discriminative emotional features, suppressing noise interference, and improving the stability of recognition results (the standard formulation is given below this list).
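For reference, the MHA module ablated in (4) instantiates standard scaled dot-product multi-head attention. With the BiGRU output sequence serving as queries, keys, and values (Q = K = V, self-attention), each head computes attention weights in its own learned subspace:

```latex
\mathrm{head}_i = \mathrm{softmax}\!\left( \frac{(Q W_i^{Q})(K W_i^{K})^{\top}}{\sqrt{d_k}} \right) V W_i^{V},
\qquad
\mathrm{MHA}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O},
```

where \(W_i^{Q}, W_i^{K}, W_i^{V}\) project into the \(i\)-th subspace and \(d_k\) is the per-head key dimension. The per-head softmax weights are what allow each subspace to emphasize discriminative time steps and attenuate noisy ones.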
4. Interpretability
4.1. DE and PSD Feature Fusion and Spatial Differences in EEG Bands
4.2. Cross-Band EEG Feature Analysis via Inception Multi-Scale Convolution
4.3. Deciphering Arousal-Specific Dynamics via Temporal Correlations of GRUs
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Xu, W.; Jiang, H.; Liang, X. Leveraging Knowledge of Modality Experts for Incomplete Multimodal Learning. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 28 October–1 November 2024; pp. 438–446.
- Zheng, W.L.; Liu, W.; Lu, Y.; Lu, B.L.; Cichocki, A. EmotionMeter: A Multimodal Framework for Recognizing Human Emotions. IEEE Trans. Cybern. 2019, 49, 1110–1122.
- Zhang, W.; Qiu, F.; Wang, S.; Zeng, H.; Zhang, Z.; An, R.; Ma, B.; Ding, Y. Transformer-based multimodal information fusion for facial expression analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 2428–2437.
- Zhuang, X.; Liu, F.; Hou, J.; Hao, J.; Cai, X. Transformer-based interactive multi-modal attention network for video sentiment detection. Neural Process. Lett. 2022, 54, 1943–1960.
- Jafari, M.; Shoeibi, A.; Khodatars, M.; Bagherzadeh, S.; Shalbaf, A.; García, D.L.; Gorriz, J.M.; Acharya, U.R. Emotion recognition in EEG signals using deep learning methods: A review. Comput. Biol. Med. 2023, 165, 107450.
- Qin, Y.; Zhang, Y.; Zhang, Y.; Liu, S.; Guo, X. Application and development of EEG acquisition and feedback technology: A review. Biosensors 2023, 13, 930.
- Xu, J.; Mitra, S.; Van Hoof, C.; Yazicioglu, R.F.; Makinwa, K.A. Active electrodes for wearable EEG acquisition: Review and electronics design methodology. IEEE Rev. Biomed. Eng. 2017, 10, 187–198.
- Tao, W.; Li, C.; Song, R.; Cheng, J.; Liu, Y.; Wan, F.; Chen, X. EEG-Based Emotion Recognition via Channel-Wise Attention and Self Attention. IEEE Trans. Affect. Comput. 2023, 14, 382–393.
- Uyanik, H.; Sengur, A.; Salvi, M.; Tan, R.S.; Tan, J.H.; Acharya, U.R. Automated Detection of Neurological and Mental Health Disorders Using EEG Signals and Artificial Intelligence: A Systematic Review. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2025, 15, e70002.
- Wang, X.W.; Nie, D.; Lu, B.L. EEG-based emotion recognition using frequency domain features and support vector machines. In Proceedings of the International Conference on Neural Information Processing, Shanghai, China, 13–17 November 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 734–743.
- Raja, M.; Sigg, S. Applicability of RF-based methods for emotion recognition: A survey. In Proceedings of the 2016 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), Sydney, Australia, 14–18 March 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–6.
- Duan, R.N.; Zhu, J.Y.; Lu, B.L. Differential entropy feature for EEG-based emotion classification. In Proceedings of the 2013 6th International IEEE/EMBS Conference on Neural Engineering (NER), San Diego, CA, USA, 6–8 November 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 81–84.
- Alsolamy, M.; Fattouh, A. Emotion estimation from EEG signals during listening to Quran using PSD features. In Proceedings of the 2016 7th International Conference on Computer Science and Information Technology (CSIT), Amman, Jordan, 13–14 July 2016; pp. 1–5.
- Chao, H.; Dong, L. Emotion Recognition Using Three-Dimensional Feature and Convolutional Neural Network from Multichannel EEG Signals. IEEE Sens. J. 2021, 21, 2024–2034.
- Chakravarthi, B.; Ng, S.C.; Ezilarasan, M.; Leung, M.F. EEG-based emotion recognition using hybrid CNN and LSTM classification. Front. Comput. Neurosci. 2022, 16, 1019776.
- Song, T.; Zheng, W.; Song, P.; Cui, Z. EEG emotion recognition using dynamical graph convolutional neural networks. IEEE Trans. Affect. Comput. 2018, 11, 532–541.
- Feng, L.; Cheng, C.; Zhao, M.; Deng, H.; Zhang, Y. EEG-Based Emotion Recognition Using Spatial-Temporal Graph Convolutional LSTM With Attention Mechanism. IEEE J. Biomed. Health Inform. 2022, 26, 5406–5417.
- Yang, L.; Liu, J. EEG-Based Emotion Recognition Using Temporal Convolutional Network. In Proceedings of the 2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS), Dali, China, 24–27 May 2019; pp. 437–442.
- Tang, H.; Xie, S.; Xie, X.; Cui, Y.; Li, B.; Zheng, D.; Hao, Y.; Wang, X.; Jiang, Y.; Tian, Z. Multi-domain based dynamic graph representation learning for EEG emotion recognition. IEEE J. Biomed. Health Inform. 2024, 28, 5227–5238.
- Wei, Y.; Liu, Y.; Li, C.; Cheng, J.; Song, R.; Chen, X. TC-Net: A Transformer Capsule Network for EEG-based emotion recognition. Comput. Biol. Med. 2023, 152, 106463.
- Miao, M.; Zheng, L.; Xu, B.; Yang, Z.; Hu, W. A multiple frequency bands parallel spatial–temporal 3D deep residual learning framework for EEG-based emotion recognition. Biomed. Signal Process. Control 2023, 79, 104141.
- Hu, Z.; Chen, L.; Luo, Y.; Zhou, J. EEG-based emotion recognition using convolutional recurrent neural network with multi-head self-attention. Appl. Sci. 2022, 12, 11255.
- Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.S.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. DEAP: A database for emotion analysis using physiological signals. IEEE Trans. Affect. Comput. 2011, 3, 18–31.
- Khateeb, M.; Anwar, S.M.; Alnowami, M. Multi-domain feature fusion for emotion classification using DEAP dataset. IEEE Access 2021, 9, 12134–12142.
- Alhagry, S.; Fahmy, A.A.; El-Khoribi, R.A. Emotion recognition based on EEG using LSTM recurrent neural network. Int. J. Adv. Comput. Sci. Appl. 2017, 8.
- Russell, J.A. A circumplex model of affect. J. Personal. Soc. Psychol. 1980, 39, 1161–1178.
- Cover, T.; Thomas, J. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
- Chen, D.W.; Miao, R.; Yang, W.Q.; Liang, Y.; Chen, H.H.; Huang, L.; Deng, C.J.; Han, N. A feature extraction method based on differential entropy and linear discriminant analysis for emotion recognition. Sensors 2019, 19, 1631.
- Feutrill, A.; Roughan, M. A review of Shannon and differential entropy rate estimation. Entropy 2021, 23, 1046.
- Welch, P. The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms. IEEE Trans. Audio Electroacoust. 1967, 15, 70–73.
- Youngworth, R.N.; Gallagher, B.B.; Stamper, B.L. An overview of power spectral density (PSD) calculations. Opt. Manuf. Test. VI 2005, 5869, 206–216.
- Homan, R.W.; Herman, J.; Purdy, P. Cerebral location of international 10–20 system electrode placement. Electroencephalogr. Clin. Neurophysiol. 1987, 66, 376–382.
- Moon, S.E.; Jang, S.; Lee, J.S. Convolutional neural network approach for EEG-based emotion recognition using brain connectivity and its spatial information. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2556–2560.
- Mekruksavanich, S.; Hnoohom, N.; Jitpattanakul, A. A hybrid deep residual network for efficient transitional activity recognition based on wearable sensors. Appl. Sci. 2022, 12, 4988.
- Klimesch, W. EEG alpha and theta oscillations reflect cognitive and memory performance: A review and analysis. Brain Res. Rev. 1999, 29, 169–195.
- Codispoti, M.; De Cesarei, A.; Ferrari, V. Alpha-band oscillations and emotion: A review of studies on picture perception. Psychophysiology 2023, 60, e14438.
- Ray, W.J.; Cole, H.W. EEG alpha activity reflects attentional demands, and beta activity reflects emotional and cognitive processes. Science 1985, 228, 750–752.
- Gevins, A.; Smith, M.E.; McEvoy, L.; Yu, D. High-resolution EEG mapping of cortical activation related to working memory: Effects of task difficulty, type of processing, and practice. Cereb. Cortex 1997, 7, 374–385.
- Li, M.; Lu, B.L. Emotion classification based on gamma-band EEG. In Proceedings of the 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Minneapolis, MN, USA, 3–6 September 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 1223–1226.
- Chen, H.; Li, J.; He, H.; Zhu, J.; Sun, S.; Li, X.; Hu, B. Toward the construction of affective brain-computer interface: A systematic review. ACM Comput. Surv. 2025, 57, 1–56.
- Chen, Y.; Peng, Y.; Tang, J.; Camilleri, T.A.; Camilleri, K.P.; Kong, W.; Cichocki, A. EEG-based affective brain-computer interfaces: Recent advancements and future challenges. J. Neural Eng. 2025, 22, 031004.
- Yu, P.; He, X.; Li, H.; Dou, H.; Tan, Y.; Wu, H.; Chen, B. FMLAN: A novel framework for cross-subject and cross-session EEG emotion recognition. Biomed. Signal Process. Control 2025, 100, 106912.
| Layer | Input Size | Output Size | Number of Parameters |
|---|---|---|---|
| Input Layer | (100, 8, 8, 8, 9) | (100, 8, 8, 8, 9) | 0 |
| Conv2d (conv1) | (800, 8, 8, 9) | (800, 64, 8, 9) | 4672 |
| Conv2d (conv2) | (800, 64, 8, 9) | (800, 128, 8, 9) | 73,856 |
| Conv2d (conv3) | (800, 128, 8, 9) | (800, 256, 8, 9) | 295,168 |
| Inception Block | (800, 256, 8, 9) | (800, 256, 8, 9) | 148,996 |
| MaxPool2d | (800, 256, 8, 9) | (800, 256, 4, 4) | 0 |
| Flatten | (800, 256, 4, 4) | (800, 4096) | 0 |
| BiGRU | (100, 8, 4096) | (100, 8, 256) | 1,572,864 |
| Multi-Head Attention | (100, 8, 256) | (100, 8, 256) | 262,144 |
| Linear (fc1) | (100, 256) | (100, 1296) | 332,928 |
| Linear (fc2) | (100, 1296) | (100, 512) | 665,856 |
| Linear (OutLayer) | (100, 512) | (100, 4) | 2052 |
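To make the layer table concrete, below is a minimal PyTorch sketch that reproduces the tensor shapes above. The Inception branch widths, activation placement, and attention head count are assumptions (so parameter counts will not match the table exactly); this is a shape-faithful sketch, not the authors’ implementation.

```python
# Hedged, shape-faithful sketch of MB-MSTFNet (internals assumed).
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Multi-scale block: parallel 1x1 / 3x3 / 5x5 / pooled branches,
    concatenated back to 256 channels (branch widths are assumptions)."""
    def __init__(self, in_ch=256, branch_ch=64):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, 1)
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, branch_ch, 1),
                                nn.Conv2d(branch_ch, branch_ch, 3, padding=1))
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, branch_ch, 1),
                                nn.Conv2d(branch_ch, branch_ch, 5, padding=2))
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, branch_ch, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

class MBMSTFNet(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(8, 64, 3, padding=1), nn.ReLU(),     # conv1: (., 64, 8, 9)
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),   # conv2: (., 128, 8, 9)
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU())  # conv3: (., 256, 8, 9)
        self.inception = InceptionBlock()
        self.pool = nn.MaxPool2d(2)                         # (., 256, 4, 4)
        self.bigru = nn.GRU(256 * 4 * 4, 128, batch_first=True, bidirectional=True)
        self.mha = nn.MultiheadAttention(256, num_heads=8, batch_first=True)
        self.head = nn.Sequential(nn.Linear(256, 1296), nn.ReLU(),
                                  nn.Linear(1296, 512), nn.ReLU(),
                                  nn.Linear(512, n_classes))

    def forward(self, x):                # x: (B, T=8, 8 feature maps, 8, 9)
        B, T = x.shape[:2]
        x = x.flatten(0, 1)              # (B*T, 8, 8, 9)
        x = self.pool(self.inception(self.convs(x)))
        x = x.flatten(1).view(B, T, -1)  # (B, T, 4096)
        x, _ = self.bigru(x)             # (B, T, 256)
        x, _ = self.mha(x, x, x)         # self-attention over time steps
        return self.head(x[:, -1])       # logits; softmax is applied in the loss

logits = MBMSTFNet()(torch.randn(2, 8, 8, 8, 9))
print(logits.shape)  # torch.Size([2, 4])
```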
| Method | DEAP-Valence Accuracy (%) | DEAP-Arousal Accuracy (%) |
|---|---|---|
| DCRNN | 76.60 | 81.63 |
| AT-DGNN | 86.01 | 83.74 |
| ST-GCLSTM | 90.52 | 90.04 |
| DE-CNN-BiLSTM | 94.86 | 94.02 |
| ERHGCN | 90.56 | 88.79 |
| A-CNN-LSTM | 90.73 | 91.17 |
| 3D-CRU | 93.12 | 94.31 |
| MD2GRL | 96.51 | 95.77 |
| DEMA | 97.55 | 97.61 |
| MB-MSTFNet (ours) | 96.80 | 98.02 |
| Feature(s) | DEAP-Valence Accuracy (%) | DEAP-Arousal Accuracy (%) | Four-Class Accuracy (%) |
|---|---|---|---|
| DE | 74.75 ± 2.31 | 76.43 ± 1.87 | 65.55 ± 3.24 |
| PSD | 76.30 ± 1.98 | 78.51 ± 2.12 | 69.74 ± 2.86 |
| DE + PSD | 96.80 ± 0.92 | 98.02 ± 0.76 | 92.85 ± 1.45 |
| Method | DEAP-Valence Accuracy (%) | DEAP-Arousal Accuracy (%) | Four-Class Accuracy (%) |
|---|---|---|---|
| Without GRU | 70.20 ± 3.50 | 73.41 ± 2.95 | 68.54 ± 4.12 |
| GRU | 85.31 ± 2.10 | 88.75 ± 1.88 | 82.29 ± 3.05 |
| BiGRU | 96.80 ± 0.92 | 98.02 ± 0.76 | 92.85 ± 1.45 |
| Method | DEAP-Valence Accuracy (%) | DEAP-Arousal Accuracy (%) | Four-Class Accuracy (%) |
|---|---|---|---|
| Without Inception | 90.68 ± 2.15 | 92.25 ± 1.83 | 89.84 ± 2.56 |
| Inception | 96.80 ± 0.92 | 98.02 ± 0.76 | 92.85 ± 1.45 |
| Method | DEAP-Valence Accuracy (%) | DEAP-Arousal Accuracy (%) | Four-Class Accuracy (%) |
|---|---|---|---|
| Without MHA | 91.41 ± 2.05 | 92.28 ± 1.92 | 89.92 ± 2.47 |
| MHA | 96.80 ± 0.92 | 98.02 ± 0.76 | 92.85 ± 1.45 |