SleepMFormer: An Efficient Attention Framework with Contrastive Learning for Single-Channel EEG Sleep Staging
Abstract
1. Introduction
- We introduce SleepMFormer, an innovative single-channel EEG framework for sleep staging that achieves state-of-the-art performance across three public datasets: Sleep-EDF, PhysioNet, and SHHS.
- We introduce an efficient attention module tailored for sleep staging, significantly reducing computational overhead while maintaining strong performance.
- We employ a supervised contrastive learning approach to enhance intra-epoch and inter-epoch feature representations, thereby improving classification accuracy (a minimal sketch of the contrastive objective follows this list).
- We conduct comprehensive ablation analyses and visualization-based interpretation to verify the contribution of each component and assess various modeling strategies.
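For concreteness, the supervised contrastive objective (SupCon [25]) used during pre-training (Section 3.1) can be sketched as follows. This is a minimal PyTorch illustration, not the paper's training code: the temperature value and the convention that augmented epochs sharing a sleep-stage label act as positives are standard SupCon choices rather than details confirmed by this work.

```python
import torch
import torch.nn.functional as F

def supcon_loss(features: torch.Tensor, labels: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Supervised contrastive (SupCon) loss over a batch of epoch embeddings.

    features: (N, d) embeddings from the projection head (one per augmented epoch).
    labels:   (N,)   sleep-stage labels; samples sharing a label are treated as positives.
    """
    features = F.normalize(features, dim=1)
    sim = features @ features.T / temperature                     # (N, N) cosine similarities
    n = sim.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(self_mask, float("-inf"))               # exclude self-pairs
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)    # log-softmax over non-self pairs
    pos_count = pos_mask.sum(dim=1).clamp(min=1)                  # anchors without positives contribute 0
    loss = -(log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_count)
    return loss.mean()
```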
2. Model Architecture
2.1. Problem Formulation
2.2. Overview
2.3. Feature Extractor
2.4. Transformer-Based Sequence Encoder
2.5. Attention-Based Sleep Stage Classifier
3. Training Procedure
3.1. Supervised Contrastive Learning (SCL)
3.1.1. Data Augmentation
3.1.2. Training Modules
3.1.3. Loss Function
3.2. Fine-Tuning
4. Experiments
4.1. Datasets and Preprocessing
4.2. Settings
4.3. Evaluation Metrics
4.4. Compared Approaches
5. Results
5.1. Comparison with State-of-the-Art (SOTA) Methods
5.2. Training and Inference Time
5.3. Ablation Study
5.4. Comparison with Standard Transformer Encoder
5.5. Effect of Max-Pooling Stride on Model Performance
5.6. Effect of Transformer Encoder Depth
6. Discussion
6.1. Theoretical Analysis of Computational Efficiency
6.2. Visualization of Attention Weights
6.3. Clinical Implications and Limitations
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| EEG | Electroencephalography |
| PSG | Polysomnography |
| AASM | American Academy of Sleep Medicine |
| R&K | Rechtschaffen and Kales |
| REM | Rapid Eye Movement |
| NREM | Non-Rapid Eye Movement |
| CNN | Convolutional Neural Network |
| FE | Feature Extractor |
| TSE | Transformer-based Sequence Encoder |
| AS2C | Attention-based Sleep Stage Classifier |
| SCL | Supervised Contrastive Learning |
| SupCon | Supervised Contrastive Loss |
| FFNN | Feed-Forward Neural Network |
| PE | Positional Encoding |
| FLOPs | Floating-Point Operations |
| MF1 | Macro F1 Score |
| ACC | Accuracy |
| CV | Cross-Validation |
| MLP | Multi-Layer Perceptron |
| SE | Squeeze-and-Excitation |
| SHHS | Sleep Heart Health Study |
References
- Wulff, K.; Gatti, S.; Wettstein, J.G.; Foster, R.G. Sleep and circadian rhythm disruption in psychiatric and neurodegenerative disease. Nat. Rev. Neurosci. 2010, 11, 589–599.
- Berthomier, C.; Drouot, X.; Herman-Stoïca, M.; Berthomier, P.; Prado, J.; Bokar-Thire, D.; Benoit, O.; Mattout, J.; d’Ortho, M.P. Automatic analysis of single-channel sleep EEG: Validation in healthy individuals. Sleep 2007, 30, 1587–1595.
- Wolpert, E.A. A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects. Arch. Gen. Psychiatry 1969, 20, 246–247.
- Berry, R.B.; Brooks, R.; Gamaldo, C.E.; Harding, S.M.; Marcus, C.; Vaughn, B.V.; Tangredi, M.M. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications; American Academy of Sleep Medicine: Darien, IL, USA, 2012; Volume 176, p. 7.
- Malhotra, A.; Younes, M.; Kuna, S.T.; Benca, R.; Kushida, C.A.; Walsh, J.; Hanlon, A.; Staley, B.; Pack, A.I.; Pien, G.W. Performance of an automated polysomnography scoring system versus computer-assisted manual scoring. Sleep 2013, 36, 573–582.
- Phan, H.; Andreotti, F.; Cooray, N.; Chén, O.Y.; De Vos, M. Joint classification and prediction CNN framework for automatic sleep stage classification. IEEE Trans. Biomed. Eng. 2018, 66, 1285–1296.
- Supratak, A.; Dong, H.; Wu, C.; Guo, Y. DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG. IEEE Trans. Neural Syst. Rehabil. Eng. 2017, 25, 1998–2008.
- Stephansen, J.B.; Olesen, A.N.; Olsen, M.; Ambati, A.; Leary, E.B.; Moore, H.E.; Carrillo, O.; Lin, L.; Han, F.; Yan, H.; et al. Neural network analysis of sleep stages enables efficient diagnosis of narcolepsy. Nat. Commun. 2018, 9, 5229.
- Tsinalis, O.; Matthews, P.M.; Guo, Y. Automatic sleep stage scoring using time-frequency analysis and stacked sparse autoencoders. Ann. Biomed. Eng. 2016, 44, 1587–1597.
- Li, X.; Cui, L.; Tao, S.; Chen, J.; Zhang, X.; Zhang, G.Q. Hyclasss: A hybrid classifier for automatic sleep stage scoring. IEEE J. Biomed. Health Inform. 2017, 22, 375–385.
- Phan, H.; Mikkelsen, K.; Chén, O.Y.; Koch, P.; Mertins, A.; De Vos, M. SleepTransformer: Automatic sleep staging with interpretability and uncertainty quantification. IEEE Trans. Biomed. Eng. 2022, 69, 2456–2467.
- Lee, S.; Yu, Y.; Back, S.; Seo, H.; Lee, K. SleePyCo: Automatic sleep scoring with feature pyramid and contrastive learning. Expert Syst. Appl. 2024, 240, 122551.
- Sors, A.; Bonnet, S.; Mirek, S.; Vercueil, L.; Payen, J.F. A convolutional neural network for sleep stage scoring from raw single-channel EEG. Biomed. Signal Process. Control 2018, 42, 107–114.
- Phan, H.; Andreotti, F.; Cooray, N.; Chén, O.Y.; De Vos, M. DNN filter bank improves 1-max pooling CNN for single-channel EEG automatic sleep stage classification. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 18–21 July 2018; pp. 453–456.
- Sun, C.; Chen, C.; Li, W.; Fan, J.; Chen, W. A hierarchical neural network for sleep stage classification based on comprehensive feature learning and multi-flow sequence learning. IEEE J. Biomed. Health Inform. 2019, 24, 1351–1366.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015, Proceedings, Part III; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Perslev, M.; Jensen, M.; Darkner, S.; Jennum, P.J.; Igel, C. U-Time: A fully convolutional network for time series segmentation applied to sleep staging. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32.
- Supratak, A.; Guo, Y. TinySleepNet: An efficient deep learning model for sleep stage scoring based on raw single-channel EEG. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; pp. 641–644.
- Mousavi, S.; Afghah, F.; Acharya, U.R. SleepEEGNet: Automated sleep stage scoring with sequence to sequence deep learning approach. PLoS ONE 2019, 14, e0216456.
- Seo, H.; Back, S.; Lee, S.; Park, D.; Kim, T.; Lee, K. Intra- and inter-epoch temporal context network (IITNet) using sub-epoch features for automatic sleep scoring on raw single-channel EEG. Biomed. Signal Process. Control 2020, 61, 102037.
- Phan, H.; Chén, O.Y.; Tran, M.C.; Koch, P.; Mertins, A.; De Vos, M. XSleepNet: Multi-view sequential model for automatic sleep staging. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 5903–5915.
- Mohsenvand, M.N.; Izadi, M.R.; Maes, P. Contrastive representation learning for electroencephalogram classification. In Proceedings of the Machine Learning for Health NeurIPS Workshop, Virtual, 11 December 2020; pp. 238–253.
- Jiang, X.; Zhao, J.; Du, B.; Yuan, Z. Self-supervised contrastive learning for EEG-based sleep staging. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–8.
- Ye, J.; Xiao, Q.; Wang, J.; Zhang, H.; Deng, J.; Lin, Y. CoSleep: A multi-view representation learning framework for self-supervised learning of sleep stage classification. IEEE Signal Process. Lett. 2021, 29, 189–193.
- Khosla, P.; Teterwak, P.; Wang, C.; Sarna, A.; Tian, Y.; Isola, P.; Maschinot, A.; Liu, C.; Krishnan, D. Supervised contrastive learning. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2020; Volume 33, pp. 18661–18673.
- Zaheer, M.; Guruganesh, G.; Dubey, K.A.; Ainslie, J.; Alberti, C.; Ontanon, S.; Pham, P.; Ravula, A.; Wang, Q.; Yang, L.; et al. Big Bird: Transformers for longer sequences. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2020; Volume 33, pp. 17283–17297.
- Han, D.; Ye, T.; Han, Y.; Xia, Z.; Pan, S.; Wan, P.; Song, S.; Huang, G. Agent attention: On the integration of softmax and linear attention. In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024; pp. 124–140.
- Wang, S.; Li, B.Z.; Khabsa, M.; Fang, H.; Ma, H. Linformer: Self-attention with linear complexity. arXiv 2020, arXiv:2006.04768.
- Choromanski, K.; Likhosherstov, V.; Dohan, D.; Song, X.; Gane, A.; Sarlos, T.; Hawkins, P.; Davis, J.; Mohiuddin, A.; Kaiser, L.; et al. Rethinking attention with performers. arXiv 2020, arXiv:2009.14794.
- Beltagy, I.; Peters, M.E.; Cohan, A. Longformer: The long-document transformer. arXiv 2020, arXiv:2004.05150.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Zaremba, W.; Sutskever, I.; Vinyals, O. Recurrent neural network regularization. arXiv 2014, arXiv:1409.2329.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30.
- Phan, H.; Andreotti, F.; Cooray, N.; Chén, O.Y.; De Vos, M. SeqSleepNet: End-to-end hierarchical recurrent neural network for sequence-to-sequence automatic sleep staging. IEEE Trans. Neural Syst. Rehabil. Eng. 2019, 27, 400–410.
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473.
- Luong, M.T.; Pham, H.; Manning, C.D. Effective approaches to attention-based neural machine translation. arXiv 2015, arXiv:1508.04025.
- Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: New York, NY, USA, 2009.
- Kemp, B.; Zwinderman, A.H.; Tuk, B.; Kamphuisen, H.A.; Oberye, J.J. Analysis of a sleep-dependent neuronal feedback loop: The slow-wave microcontinuity of the EEG. IEEE Trans. Biomed. Eng. 2000, 47, 1185–1194.
- Ghassemi, M.M.; Moody, B.E.; Lehman, L.W.H.; Song, C.; Li, Q.; Sun, H.; Mark, R.G.; Westover, M.B.; Clifford, G.D. You snooze, you win: The PhysioNet/Computing in Cardiology Challenge 2018. In Proceedings of the 2018 Computing in Cardiology Conference (CinC), Maastricht, The Netherlands, 23–26 September 2018; Volume 45, pp. 1–4.
- Quan, S.F.; Howard, B.V.; Iber, C.; Kiley, J.P.; Nieto, F.J.; O’Connor, G.T.; Rapoport, D.M.; Redline, S.; Robbins, J.; Samet, J.M.; et al. The Sleep Heart Health Study: Design, rationale, and methods. Sleep 1997, 20, 1077–1085.
- Zhang, G.Q.; Cui, L.; Mueller, R.; Tao, S.; Kim, M.; Rueschman, M.; Mariani, S.; Mobley, D.; Redline, S. The National Sleep Research Resource: Towards a sleep data commons. J. Am. Med. Inform. Assoc. 2018, 25, 1351–1358.
- Loshchilov, I.; Hutter, F. Fixing weight decay regularization in Adam. arXiv 2017, arXiv:1711.05101.
- Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.
- Korkalainen, H.; Aakko, J.; Nikkonen, S.; Kainulainen, S.; Leino, A.; Duce, B.; Afara, I.O.; Myllymaa, S.; Töyräs, J.; Leppänen, T. Accurate deep learning-based sleep staging in a clinical population with suspected obstructive sleep apnea. IEEE J. Biomed. Health Inform. 2019, 24, 2073–2081.
| Model | Parameters (M) | FLOPs (M) |
|---|---|---|
| TinySleepNet [18] | 0.41 | 6.84 |
| DeepSleepNet [7] | 0.78 | 30.60 |
| SleePyCo [12] | 1.63 | 140.06 |
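As a point of reference for the complexity figures above, parameter counts in millions can be reproduced directly in PyTorch; FLOPs require a profiler, and since the paper does not state which tool produced its numbers, the fvcore call and the 3000-sample dummy input (30 s at an assumed 100 Hz) below are illustrative only.

```python
import torch
from torch import nn

def count_params_m(model: nn.Module) -> float:
    """Trainable parameters in millions (the 'Parameters (M)' column)."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# FLOPs for a single 30-s epoch can be estimated with a profiler, e.g. fvcore
# (an assumption -- the paper does not say which tool produced its numbers):
#   from fvcore.nn import FlopCountAnalysis
#   flops_m = FlopCountAnalysis(model, torch.randn(1, 1, 3000)).total() / 1e6

if __name__ == "__main__":
    # Toy 1-D CNN standing in for a sleep-staging backbone, purely for demonstration.
    toy = nn.Sequential(nn.Conv1d(1, 64, kernel_size=50, stride=6),
                        nn.ReLU(),
                        nn.AdaptiveAvgPool1d(1),
                        nn.Flatten(),
                        nn.Linear(64, 5))
    print(f"toy model: {count_params_m(toy):.3f} M parameters")
```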
| Transformation | Min | Max | Probability |
|---|---|---|---|
| amplitude shift (μV) | 10 | | 0.5 each |
| amplitude scaling | 0.5 | 2 | |
| time shift (samples) | 300 | | |
| zero-masking (samples) | 0 | 300 | |
| band-stop filter (2 Hz width), lower-bound frequency (Hz) | 0.5 | 30.0 | |
| additive Gaussian noise | 0 | 0.2 | |
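A plain NumPy/SciPy sketch of these transformations is given below. Where the table leaves gaps, the code fills them with assumptions that are not confirmed by the paper: the sampling rate (100 Hz), symmetric shift ranges, the noise unit, and a per-transformation probability of 0.5 (which the "0.5 each" entry seems to denote).

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(0)

def augment_epoch(x: np.ndarray, fs: int = 100, p: float = 0.5) -> np.ndarray:
    """Randomly perturb one 30-s single-channel EEG epoch (1-D array, μV)."""
    x = x.copy()
    if rng.random() < p:                                  # amplitude shift (μV)
        x += rng.uniform(-10, 10)
    if rng.random() < p:                                  # amplitude scaling
        x *= rng.uniform(0.5, 2.0)
    if rng.random() < p:                                  # time shift (samples)
        x = np.roll(x, rng.integers(-300, 301))
    if rng.random() < p:                                  # zero-masking (samples)
        start = rng.integers(0, x.size)
        length = rng.integers(0, 301)
        x[start:start + length] = 0.0
    if rng.random() < p:                                  # 2 Hz-wide band-stop filter
        low = rng.uniform(0.5, 30.0)
        b, a = butter(3, [low, low + 2.0], btype="bandstop", fs=fs)
        x = filtfilt(b, a, x)
    if rng.random() < p:                                  # additive Gaussian noise
        x += rng.normal(0.0, rng.uniform(0.0, 0.2), size=x.shape)
    return x
```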
| Dataset | Subjects | Channel | Evaluation Scheme | Held-Out Validation Set | W | N1 | N2 | N3 | REM | Total Epochs |
|---|---|---|---|---|---|---|---|---|---|---|
| Sleep-EDF | 78 | Fpz-Cz | 10-fold CV | 7 subjects | 69,824 (35.0%) | 21,522 (10.8%) | 69,132 (34.7%) | 13,039 (6.5%) | 25,835 (13.0%) | 199,352 |
| Physio2018 | 994 | C3-A2 | 5-fold CV | 50 subjects | 157,993 (17.7%) | 136,984 (15.4%) | 377,821 (42.3%) | 102,592 (11.5%) | 116,864 (13.1%) | 892,254 |
| SHHS | 5960 | C4-A1 | Train/Test: 0.7:0.3 | 100 subjects | 1,308,982 (24.0%) | 246,195 (4.0%) | 2,383,133 (43.7%) | 735,082 (13.5%) | 812,880 (14.9%) | 5,456,272 |
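The evaluation schemes in the table are subject-wise, so recordings from one subject never appear on both sides of a split. A minimal sketch of such a split with scikit-learn is shown below; the held-out validation subjects listed in the table would presumably be carved out of each training fold in addition, and the toy shapes are illustrative assumptions rather than the paper's data loader.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def subject_wise_folds(labels, subject_ids, n_splits=10):
    """Yield (train_idx, test_idx) pairs with no subject shared across the split,
    mirroring the 10-fold CV on Sleep-EDF (5-fold on Physio2018)."""
    gkf = GroupKFold(n_splits=n_splits)
    placeholder = np.zeros((len(labels), 1))      # features are not needed to build the split
    yield from gkf.split(placeholder, labels, groups=subject_ids)

# Example with toy identifiers: 20 subjects x 50 epochs each.
labels = np.random.randint(0, 5, size=1000)       # W, N1, N2, N3, REM encoded as 0..4
subject_ids = np.repeat(np.arange(20), 50)
for fold, (tr, te) in enumerate(subject_wise_folds(labels, subject_ids, n_splits=10)):
    assert set(subject_ids[tr]).isdisjoint(subject_ids[te])
```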
| Dataset | System | Subjects | Acc | MF1 | κ | F1 (W) | F1 (N1) | F1 (N2) | F1 (N3) | F1 (REM) |
|---|---|---|---|---|---|---|---|---|---|---|
| Sleep-EDF | SleepEEGNet [19] | 78 | 80.0 | 73.6 | 0.73 | 91.7 | 44.1 | 82.5 | 73.5 | 76.1 |
| Sleep-EDF | U-Time [17] | 78 | 81.3 | 76.3 | 0.745 | 92.0 | 51.0 | 83.5 | 74.6 | 80.2 |
| Sleep-EDF | SleepTransformer [11] | 78 | 81.4 | 74.3 | 0.743 | 91.7 | 40.4 | 84.3 | 77.9 | 77.2 |
| Sleep-EDF | SeqSleepNet [37] | 78 | 82.6 | 76.4 | 0.76 | - | - | - | - | - |
| Sleep-EDF | TinySleepNet [18] | 78 | 83.1 | 78.1 | 0.77 | 92.8 | 51.0 | 85.3 | 81.1 | 80.3 |
| Sleep-EDF | CNN + LSTM [47] | 78 | 83.7 | - | 0.77 | - | - | - | - | - |
| Sleep-EDF | XSleepNet [21] | 78 | 84.0 | 77.9 | 0.778 | 93.3 | 49.9 | 86.0 | 78.7 | 81.8 |
| Sleep-EDF | SleePyCo [12] | 78 | 84.6 | 79.0 | 0.787 | 93.5 | 50.4 | 86.5 | 80.5 | 84.2 |
| Sleep-EDF | SleepMFormer-T (Ours) | 78 | 83.7 | 78.1 | 0.774 | 93.0 | 49.2 | 86.0 | 80.7 | 81.4 |
| Sleep-EDF | SleepMFormer-D (Ours) | 78 | 84.0 | 78.5 | 0.778 | 93.0 | 49.5 | 86.0 | 80.6 | 83.3 |
| Sleep-EDF | SleepMFormer-S (Ours) | 78 | 84.9 | 79.3 | 0.79 | 93.8 | 51.5 | 86.4 | 79.5 | 85.2 |
| Physio2018 | U-Time [17] | 994 | 78.8 | 77.4 | 0.714 | 82.5 | 59.0 | 83.1 | 79.0 | 83.5 |
| Physio2018 | SeqSleepNet [37] | 994 | 79.4 | 77.6 | 0.719 | - | - | - | - | - |
| Physio2018 | XSleepNet [21] | 994 | 80.3 | 78.6 | 0.732 | - | - | - | - | - |
| Physio2018 | SleePyCo [12] | 994 | 80.9 | 78.9 | 0.737 | 84.2 | 59.3 | 85.3 | 79.4 | 86.3 |
| Physio2018 | SleepMFormer-T (Ours) | 994 | 80.0 | 77.8 | 0.725 | 83.2 | 57.5 | 84.8 | 79.8 | 83.6 |
| Physio2018 | SleepMFormer-D (Ours) | 994 | 80.5 | 78.5 | 0.732 | 83.7 | 58.6 | 85.0 | 80.1 | 85.1 |
| Physio2018 | SleepMFormer-S (Ours) | 994 | 81.0 | 79.1 | 0.739 | 84.5 | 59.8 | 85.2 | 79.6 | 86.2 |
| SHHS | SeqSleepNet [37] | 5791 | 86.5 | 78.5 | 0.81 | - | - | - | - | - |
| SHHS | IITNet [20] | 5791 | 86.7 | 79.8 | 0.812 | 90.1 | 48.1 | 88.4 | 85.2 | 87.2 |
| SHHS | CNN [13] | 5728 | 86.8 | 78.5 | 0.815 | 91.4 | 42.7 | 88.0 | 84.9 | 85.4 |
| SHHS | XSleepNet [21] | 5791 | 87.6 | 80.7 | 0.826 | 92.0 | 49.9 | 88.3 | 85.0 | 88.2 |
| SHHS | SleepTransformer [11] | 5791 | 87.7 | 80.1 | 0.828 | 92.2 | 46.1 | 88.3 | 85.2 | 88.6 |
| SHHS | SleePyCo [12] | 5760 | 87.6 | 80.5 | 0.823 | 92.6 | 49.2 | 88.5 | 84.5 | 88.6 |
| SHHS | SleepMFormer-T (Ours) | 5760 | 87.2 | 79.7 | 0.818 | 92.5 | 48.9 | 88.7 | 83.2 | 88.9 |
| SHHS | SleepMFormer-D (Ours) | 5760 | 87.7 | 80.7 | 0.825 | 92.2 | 49.3 | 88.9 | 84.5 | 88.5 |
| SHHS | SleepMFormer-S (Ours) | 5760 | 87.8 | 80.4 | 0.826 | 92.5 | 47.9 | 89.0 | 83.8 | 88.9 |
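The overall metrics reported in this and the following tables, accuracy, macro F1, and κ (taken here to be Cohen's kappa), together with per-class F1, can be computed with scikit-learn as sketched below; the helper name and the percentage scaling are illustrative choices, not the paper's evaluation code.

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

STAGES = ["W", "N1", "N2", "N3", "REM"]

def staging_metrics(y_true, y_pred):
    """Overall accuracy, macro F1, Cohen's kappa, and per-class F1 (percent)."""
    per_class = 100 * f1_score(y_true, y_pred, average=None, labels=list(range(len(STAGES))))
    return {
        "Acc": 100 * accuracy_score(y_true, y_pred),
        "MF1": 100 * f1_score(y_true, y_pred, average="macro"),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "per_class_F1": dict(zip(STAGES, per_class)),
    }
```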
| Dataset | AS2C | SCL | SleePyCo Acc | SleePyCo MF1 | SleePyCo κ | DeepSleepNet Acc | DeepSleepNet MF1 | DeepSleepNet κ | TinySleepNet Acc | TinySleepNet MF1 | TinySleepNet κ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Sleep-EDF | – | – | 84.1 | 78.1 | 0.779 | 82.0 | 76.4 | 0.752 | 82.1 | 76.1 | 0.753 |
| Sleep-EDF | ✓ | – | 84.1 | 78.1 | 0.780 | 82.1 | 77.2 | 0.754 | 82.1 | 76.1 | 0.754 |
| Sleep-EDF | – | ✓ | 84.7 | 79.1 | 0.786 | 83.9 | 78.3 | 0.777 | 83.6 | 77.8 | 0.772 |
| Sleep-EDF | ✓ | ✓ | 84.9 | 79.3 | 0.790 | 84.0 | 78.5 | 0.778 | 83.7 | 78.1 | 0.774 |
| PhysioNet | – | – | 80.0 | 78.0 | 0.725 | 79.7 | 77.9 | 0.723 | 78.8 | 76.8 | 0.700 |
| PhysioNet | ✓ | – | 80.2 | 78.3 | 0.728 | 79.7 | 77.8 | 0.723 | 79.0 | 76.8 | 0.713 |
| PhysioNet | – | ✓ | 80.7 | 78.8 | 0.735 | 80.4 | 78.4 | 0.731 | 79.7 | 77.7 | 0.722 |
| PhysioNet | ✓ | ✓ | 81.0 | 79.1 | 0.739 | 80.5 | 78.5 | 0.732 | 80.0 | 77.8 | 0.725 |
| SHHS | – | – | 87.3 | 80.0 | 0.819 | 87.2 | 80.0 | 0.819 | 86.4 | 79.0 | 0.807 |
| SHHS | ✓ | – | 87.5 | 80.0 | 0.822 | 87.2 | 80.3 | 0.818 | 86.6 | 79.1 | 0.809 |
| SHHS | – | ✓ | 87.6 | 80.4 | 0.820 | 87.6 | 80.6 | 0.824 | 87.0 | 79.9 | 0.816 |
| SHHS | ✓ | ✓ | 87.8 | 80.7 | 0.830 | 87.7 | 80.7 | 0.825 | 87.2 | 79.7 | 0.818 |
| Dataset | Frozen | SleePyCo Acc | SleePyCo MF1 | SleePyCo κ | DeepSleepNet Acc | DeepSleepNet MF1 | DeepSleepNet κ | TinySleepNet Acc | TinySleepNet MF1 | TinySleepNet κ |
|---|---|---|---|---|---|---|---|---|---|---|
| Sleep-EDF | – | 84.6 | 78.5 | 0.783 | 83.4 | 77.9 | 0.762 | 83.4 | 77.8 | 0.769 |
| Sleep-EDF | ✓ | 84.9 | 79.3 | 0.790 | 84.0 | 78.5 | 0.778 | 83.7 | 78.1 | 0.774 |
| Dataset | FE | Transformer Acc | Transformer MF1 | Transformer κ | AvgFormer Acc | AvgFormer MF1 | AvgFormer κ | MaxFormer Acc | MaxFormer MF1 | MaxFormer κ |
|---|---|---|---|---|---|---|---|---|---|---|
| Sleep-EDF | S | 84.0 | 78.0 | 0.778 | 83.8 ↓ | 77.9 ↓ | 0.776 ↓ | 84.1 ↑ | 78.1 ↑ | 0.780 ↑ |
| Sleep-EDF | D | 81.6 | 76.3 | 0.749 | 81.8 ↑ | 76.7 ↑ | 0.750 ↑ | 82.1 ↑ | 77.2 ↑ | 0.754 ↑ |
| Sleep-EDF | T | 81.6 | 75.9 | 0.747 | 81.4 ↓ | 75.2 ↓ | 0.743 ↓ | 82.1 ↑ | 76.1 ↑ | 0.754 ↑ |
| PhysioNet | S | 80.2 | 78.3 | 0.728 | 80.1 ↓ | 78.1 ↓ | 0.725 ↓ | 80.2 – | 78.3 – | 0.728 – |
| PhysioNet | D | 79.4 | 77.6 | 0.718 | 79.5 ↑ | 77.3 ↓ | 0.719 ↑ | 79.7 ↑ | 77.8 ↑ | 0.723 ↑ |
| PhysioNet | T | 79.2 | 77.2 | 0.716 | 78.8 ↓ | 76.9 ↓ | 0.712 ↓ | 79.0 ↓ | 76.8 ↓ | 0.713 ↓ |
| SHHS | S | 87.4 | 80.3 | 0.821 | 87.5 ↑ | 80.1 ↓ | 0.820 ↓ | 87.5 ↑ | 80.0 ↓ | 0.822 ↑ |
| SHHS | D | 87.1 | 80.2 | 0.816 | 87.2 ↑ | 79.9 ↓ | 0.818 ↑ | 87.2 ↑ | 80.3 ↑ | 0.818 ↑ |
| SHHS | T | 86.5 | 79.6 | 0.809 | 86.6 ↑ | 79.1 ↓ | 0.810 ↑ | 86.6 ↑ | 79.1 ↓ | 0.809 – |
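For readers wondering what distinguishes the three encoders compared above, the sketch below gives one plausible reading of the AvgFormer/MaxFormer idea: keys and values are pooled along the sequence with stride n before ordinary multi-head attention, shrinking the attention map from L×L to L×(L/n), with max pooling and average pooling corresponding to the two right-hand column groups. The class name, dimensions, and the exact placement of pooling are assumptions, not the paper's definitive module.

```python
import torch
from torch import nn

class PooledSelfAttention(nn.Module):
    """Self-attention with stride-n pooled keys/values (hypothetical sketch)."""

    def __init__(self, dim: int = 128, heads: int = 8, stride: int = 6, mode: str = "max"):
        super().__init__()
        pool = nn.MaxPool1d if mode == "max" else nn.AvgPool1d
        self.pool = pool(kernel_size=stride, stride=stride)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:     # x: (batch, L, dim)
        # Pool along the sequence axis to get (batch, L // stride, dim) keys/values.
        kv = self.pool(x.transpose(1, 2)).transpose(1, 2)
        out, _ = self.attn(query=x, key=kv, value=kv)        # (batch, L, dim)
        return out

# Usage: attn = PooledSelfAttention(stride=6, mode="max"); y = attn(torch.randn(2, 60, 128))
```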
| Dataset | n | Acc | MF1 | κ | F1 (W) | F1 (N1) | F1 (N2) | F1 (N3) | F1 (R) |
|---|---|---|---|---|---|---|---|---|---|
| Sleep-EDF | S | 84.8 | 79.3 | 0.790 | 93.7 | 51.1 | 86.5 | 79.9 | 85.4 |
| Sleep-EDF | 3 | 84.8 | 79.3 | 0.789 | 93.7 | 51.4 | 86.5 | 79.6 | 85.1 |
| Sleep-EDF | 6 | 84.9 | 79.3 | 0.790 | 93.8 | 51.5 | 86.4 | 79.5 | 85.2 |
| Sleep-EDF | 12 | 84.8 | 79.4 | 0.790 | 93.8 | 51.8 | 86.5 | 79.7 | 85.3 |
| Sleep-EDF | 24 | 84.8 | 79.3 | 0.789 | 93.8 | 51.6 | 86.5 | 79.6 | 84.8 |
| Sleep-EDF | 48 | 84.8 | 79.1 | 0.789 | 93.7 | 51.2 | 86.5 | 79.1 | 85.1 |
| Physio2018 | S | 80.9 | 78.9 | 0.737 | 84.2 | 59.3 | 85.2 | 79.5 | 86.0 |
| Physio2018 | 3 | 80.9 | 78.9 | 0.738 | 84.3 | 59.4 | 85.2 | 79.7 | 86.0 |
| Physio2018 | 6 | 81.0 | 79.1 | 0.739 | 84.5 | 59.8 | 85.2 | 79.6 | 86.2 |
| Physio2018 | 12 | 81.0 | 79.1 | 0.739 | 84.3 | 59.8 | 85.2 | 79.6 | 86.3 |
| Physio2018 | 24 | 80.9 | 79.0 | 0.738 | 84.5 | 59.6 | 85.2 | 79.6 | 86.3 |
| Physio2018 | 48 | 80.9 | 78.9 | 0.738 | 84.2 | 59.4 | 85.2 | 79.7 | 86.1 |
| SHHS | S | 87.6 | 80.5 | 0.823 | 92.5 | 48.9 | 88.7 | 83.2 | 88.9 |
| SHHS | 3 | 87.7 | 80.7 | 0.825 | 92.5 | 49.6 | 88.9 | 83.3 | 89.0 |
| SHHS | 6 | 87.8 | 80.4 | 0.826 | 92.5 | 47.9 | 89.0 | 83.8 | 88.9 |
| SHHS | 12 | 87.7 | 80.6 | 0.825 | 92.6 | 49.3 | 88.9 | 83.3 | 89.0 |
| SHHS | 24 | 87.7 | 80.7 | 0.826 | 92.5 | 49.6 | 89.0 | 83.7 | 88.9 |
| SHHS | 48 | 87.6 | 80.5 | 0.823 | 92.5 | 49.5 | 88.8 | 82.9 | 88.9 |