This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Real-Time Driver Attention Detection in Complex Driving Environments via Binocular Depth Compensation and Multi-Source Temporal Bidirectional Long Short-Term Memory Network

1 CGNPC Uranium Resources Co., Ltd., Beijing 100084, China
2 Suzhou Automotive Research Institute (Wujiang), Tsinghua University, Suzhou 215200, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(17), 5548; https://doi.org/10.3390/s25175548
Submission received: 29 July 2025 / Revised: 22 August 2025 / Accepted: 4 September 2025 / Published: 5 September 2025
(This article belongs to the Section Vehicular Sensing)

Abstract

Driver distraction is a key factor contributing to traffic accidents. Among existing computer vision-based methods for driver attention state recognition, however, monocular camera-based approaches often suffer from low accuracy, while multi-sensor data fusion techniques are compromised by poor real-time performance. To address these limitations, this paper proposes a Real-time Driver Attention State Recognition method (RT-DASR). RT-DASR comprises two core components: Binocular Vision Depth-Compensated Head Pose Estimation (BV-DHPE) and Multi-source Temporal Bidirectional Long Short-Term Memory (MSTBi-LSTM). BV-DHPE uses binocular cameras and YOLO11n (You Only Look Once) Pose to locate facial landmarks, then computes spatial distances from binocular disparity, compensating for the depth information that a monocular camera lacks and improving the accuracy of pose estimation. MSTBi-LSTM uses a lightweight Bidirectional Long Short-Term Memory (Bi-LSTM) network to fuse head pose angles, real-time vehicle speed, and gaze region semantics, extracting temporal features in both directions for continuous attention state discrimination. Evaluated under challenging conditions (e.g., illumination changes, occlusion), BV-DHPE achieved a 44.7% reduction in head pose Mean Absolute Error (MAE) compared to monocular vision methods. RT-DASR achieved 90.4% attention recognition accuracy with an average latency of 21.5 ms when deployed on an NVIDIA Jetson Orin. Real-world driving scenario tests confirm that the proposed method provides a high-precision, low-latency attention state recognition solution for enhancing the safety of mining vehicle drivers. RT-DASR can be integrated into advanced driver assistance systems to enable proactive accident prevention.
Keywords: head pose estimation; convolutional neural network; long short-term memory; binocular vision
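
The abstract states that BV-DHPE computes spatial distances from binocular disparity. As a rough illustration of that general idea only, the sketch below applies the standard pinhole stereo relation Z = f·B/d to a pair of matched landmarks and measures the 3D distance between them. It assumes a rectified stereo pair; the function names, the camera parameters (`focal_px`, `baseline_m`, `cx`, `cy`), and the pixel coordinates are all hypothetical values chosen for illustration, and nothing here is taken from the authors' implementation.

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]


def triangulate_landmark(x_left: float, x_right: float, y: float,
                         focal_px: float, baseline_m: float,
                         cx: float, cy: float) -> Point3D:
    """Back-project one landmark seen in both rectified images to 3D.

    Assumes a rectified stereo pair (the landmark lies on the same row y
    in both images) and a pinhole camera model. All parameters are
    hypothetical; real values come from stereo calibration.
    """
    disparity = x_left - x_right  # horizontal shift in pixels
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth: Z = f * B / d
    x = (x_left - cx) * z / focal_px        # lateral offset in metres
    y3 = (y - cy) * z / focal_px            # vertical offset in metres
    return (x, y3, z)


def landmark_distance(p: Point3D, q: Point3D) -> float:
    """Euclidean distance between two 3D landmarks, in metres."""
    return math.dist(p, q)


# Illustrative (made-up) calibration: 800 px focal length, 10 cm baseline,
# principal point at the centre of a 1280x720 image.
CAMERA = dict(focal_px=800.0, baseline_m=0.10, cx=640.0, cy=360.0)

# Two hypothetical facial landmarks, e.g. the outer eye corners.
left_eye = triangulate_landmark(700.0, 600.0, 360.0, **CAMERA)
right_eye = triangulate_landmark(600.0, 500.0, 360.0, **CAMERA)
eye_span_m = landmark_distance(left_eye, right_eye)
```

With metric landmark positions like these, head pose can be estimated from 3D geometry instead of from 2D pixel offsets alone, which is the depth information a single camera cannot supply.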

Share and Cite

MDPI and ACS Style

Zhou, S.; Zhang, W.; Liu, Y.; Chen, X.; Liu, H. Real-Time Driver Attention Detection in Complex Driving Environments via Binocular Depth Compensation and Multi-Source Temporal Bidirectional Long Short-Term Memory Network. Sensors 2025, 25, 5548. https://doi.org/10.3390/s25175548

AMA Style

Zhou S, Zhang W, Liu Y, Chen X, Liu H. Real-Time Driver Attention Detection in Complex Driving Environments via Binocular Depth Compensation and Multi-Source Temporal Bidirectional Long Short-Term Memory Network. Sensors. 2025; 25(17):5548. https://doi.org/10.3390/s25175548

Chicago/Turabian Style

Zhou, Shuhui, Wei Zhang, Yulong Liu, Xiaonian Chen, and Huajie Liu. 2025. "Real-Time Driver Attention Detection in Complex Driving Environments via Binocular Depth Compensation and Multi-Source Temporal Bidirectional Long Short-Term Memory Network" Sensors 25, no. 17: 5548. https://doi.org/10.3390/s25175548

APA Style

Zhou, S., Zhang, W., Liu, Y., Chen, X., & Liu, H. (2025). Real-Time Driver Attention Detection in Complex Driving Environments via Binocular Depth Compensation and Multi-Source Temporal Bidirectional Long Short-Term Memory Network. Sensors, 25(17), 5548. https://doi.org/10.3390/s25175548
