Article

PMMCT: A Parallel Multimodal CNN-Transformer Model to Detect Slow Eye Movement for Recognizing Driver Sleepiness

School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, Changzhou University, Changzhou 213159, China
*
Author to whom correspondence should be addressed.
Sensors 2025, 25(18), 5671; https://doi.org/10.3390/s25185671
Submission received: 23 July 2025 / Revised: 2 September 2025 / Accepted: 9 September 2025 / Published: 11 September 2025
(This article belongs to the Section Biomedical Sensors)

Abstract

Sleepiness at the wheel is an important contributor to road traffic accidents. Slow eye movement (SEM) serves as a reliable physiological indicator of the sleep onset period (SOP). To detect SEM and thereby recognize drivers’ SOP, a Parallel Multimodal CNN-Transformer (PMMCT) model is proposed. The model employs two parallel feature extraction modules, each comprising convolutional layers and Transformer encoder layers, to process the bimodal signals. The extracted features are fused and then classified using fully connected layers. The model is evaluated on two bimodal signal combinations, HEOG + O2 and HEOG + HSUM, where HSUM is the sum of two single-channel horizontal electrooculogram (HEOG) signals and captures electroencephalograph (EEG) features similar to those in the conventional O2 channel. Experimental results indicate that, with the PMMCT model, the HEOG + HSUM combination performs comparably to the HEOG + O2 combination and outperforms unimodal HEOG by 2.73% in F1-score, with average classification accuracy and F1-score of 99.89% and 99.35%, respectively, outperforming CNN, CNN-LSTM, and CNN-LSTM-Attention models. The model produces few false positives and false negatives, with average values of 5.2 and 0.8, respectively. By combining CNNs’ local feature extraction with Transformers’ global temporal modeling, and by using only two HEOG electrodes, the system offers superior performance while enhancing wearable device comfort for real-world applications.
Keywords: driver sleepiness; slow eye movement; CNN-transformer; electrooculogram (EOG); electroencephalograph (EEG); multimodal fusion; sleep onset period (SOP)
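
The arithmetic behind two quantities named in the abstract can be illustrated with a minimal sketch: the HSUM signal and the F1-score. The abstract states only that HSUM is the sum of two single-channel HEOG signals. The function and variable names below are hypothetical. Treating the bimodal HEOG input as the left-minus-right difference is an assumption based on common bipolar EOG practice, not something the abstract confirms. The sample values and confusion counts are illustrative, not data from the study.

```python
def bimodal_signals(heog_left, heog_right):
    """Build (heog, hsum) from two single-channel horizontal EOG recordings.

    Both inputs are equal-length sequences of samples. The names are
    hypothetical; the abstract does not name the individual channels.
    """
    if len(heog_left) != len(heog_right):
        raise ValueError("channels must have the same number of samples")
    # Assumption: bipolar HEOG taken as the left-minus-right difference.
    heog = [left - right for left, right in zip(heog_left, heog_right)]
    # From the abstract: HSUM is the sum of the two single-channel signals.
    hsum = [left + right for left, right in zip(heog_left, heog_right)]
    return heog, hsum


def accuracy(tp, tn, fp, fn):
    """Fraction of all windows classified correctly (0.0 if there are none)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the ratio is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Illustrative sample values only.
    heog, hsum = bimodal_signals([3, -2, 5], [-1, 4, 5])
    print(heog, hsum)
    # Illustrative confusion counts only.
    print(accuracy(tp=8, tn=90, fp=1, fn=1), f1_score(tp=8, fp=1, fn=1))
```

Because the F1-score ignores true negatives, it reacts more strongly than accuracy to a handful of false positives or false negatives when SEM windows are rare. This is one reason to read the two figures side by side.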

Share and Cite

MDPI and ACS Style

Jiao, Y.; Zhang, J.; Jiao, Z. PMMCT: A Parallel Multimodal CNN-Transformer Model to Detect Slow Eye Movement for Recognizing Driver Sleepiness. Sensors 2025, 25, 5671. https://doi.org/10.3390/s25185671

AMA Style

Jiao Y, Zhang J, Jiao Z. PMMCT: A Parallel Multimodal CNN-Transformer Model to Detect Slow Eye Movement for Recognizing Driver Sleepiness. Sensors. 2025; 25(18):5671. https://doi.org/10.3390/s25185671

Chicago/Turabian Style

Jiao, Yingying, Jiajia Zhang, and Zhuqing Jiao. 2025. "PMMCT: A Parallel Multimodal CNN-Transformer Model to Detect Slow Eye Movement for Recognizing Driver Sleepiness" Sensors 25, no. 18: 5671. https://doi.org/10.3390/s25185671

APA Style

Jiao, Y., Zhang, J., & Jiao, Z. (2025). PMMCT: A Parallel Multimodal CNN-Transformer Model to Detect Slow Eye Movement for Recognizing Driver Sleepiness. Sensors, 25(18), 5671. https://doi.org/10.3390/s25185671

