Multimodal Machine Learning Framework for Driver Mental Workload Classification: A Comparative and Interpretable Approach
Abstract
1. Introduction
2. Materials and Methods
2.1. Participants
2.2. Apparatus
2.3. Experiment Design
2.4. Procedure
2.5. Classification Models
2.5.1. Logistic Regression
2.5.2. Naive Bayes
2.5.3. K-Nearest Neighbors (KNN)
2.5.4. Random Forest
2.5.5. Support Vector Machine (SVM)
2.5.6. XGBoost
2.6. SHAP Explanation Model
3. Model Development and Results
3.1. Data Pre-Processing and Feature Generation
3.2. Model Development and Evaluation
3.2.1. Differences in Model Performance Across Feature Sets
3.2.2. Differences in Model Performance Across Algorithms
3.3. Model Explanation
4. Model Application at Tunnel Entrance
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- de Waard, D. The Measurement of Drivers’ Mental Workload; Groningen University: Groningen, The Netherlands, 1996.
- Bitkina, O.V.; Park, J.; Kim, H.K. The Ability of Eye-Tracking Metrics to Classify and Predict the Perceived Driving Workload. Int. J. Ind. Ergon. 2021, 86, 103193.
- Jeong, H.; Liu, Y. Effects of Non-Driving-Related-Task Modality and Road Geometry on Eye Movements, Lane-Keeping Performance, and Workload While Driving. Transp. Res. Part F Traffic Psychol. Behav. 2019, 60, 157–171.
- Brookhuis, K.A.; de Waard, D. Monitoring Drivers’ Mental Workload in Driving Simulators Using Physiological Measures. Accid. Anal. Prev. 2010, 42, 898–903.
- Teh, E.; Jamson, S.; Carsten, O.; Jamson, H. Temporal Fluctuations in Driving Demand: The Effect of Traffic Complexity on Subjective Measures of Workload and Driving Performance. Transp. Res. Part F Traffic Psychol. Behav. 2014, 22, 207–217.
- Jeon, M.; Walker, B.N.; Yim, J.B. Effects of Specific Emotions on Subjective Judgment, Driving Performance, and Perceived Workload. Transp. Res. Part F Traffic Psychol. Behav. 2014, 24, 197–209.
- Young, M.S.; Brookhuis, K.A.; Wickens, C.D.; Hancock, P.A. State of Science: Mental Workload in Ergonomics. Ergonomics 2015, 58, 1–17.
- Wen, H.; Sze, N.N.; Zeng, Q.; Hu, S. Effect of Music Listening on Physiological Condition, Mental Workload, and Driving Performance with Consideration of Driver Temperament. Int. J. Environ. Res. Public Health 2019, 16, 2766.
- Öztürk, İ.; Merat, N.; Rowe, R.; Fotios, S. The Effect of Cognitive Load on Detection-Response Task (DRT) Performance During Day- and Night-Time Driving: A Driving Simulator Study with Young and Older Drivers. Transp. Res. Part F Traffic Psychol. Behav. 2023, 97, 155–169.
- Freitas, A.; Almeida, R.; Gonçalves, H.; Conceição, G.; Freitas, A. Monitoring Fatigue and Drowsiness in Motor Vehicle Occupants Using Electrocardiogram and Heart Rate: A Systematic Review. Transp. Res. Part F Traffic Psychol. Behav. 2024, 103, 586–607.
- Kim, H.; Hwang, Y.; Yoon, D.; Choi, W.; Park, C.H. Driver Workload Characteristics Analysis Using EEG Data from an Urban Road. IEEE Trans. Intell. Transp. Syst. 2014, 15, 1844–1849.
- Marquart, G.; Cabrall, C.; de Winter, J. Review of Eye-Related Measures of Drivers’ Mental Workload. Procedia Manuf. 2015, 3, 2854–2861.
- Tamantini, C.; Cristofanelli, M.L.; Fracasso, F.; Umbrico, A.; Cortellessa, G.; Orlandini, A.; Cordella, F. Physiological Sensor Technologies in Workload Estimation: A Review. IEEE Sens. J. 2025.
- Solovey, E.T.; Zec, M.; Perez, E.A.G.; Reimer, B.; Mehler, B. Classifying Driver Workload Using Physiological and Driving Performance Data: Two Field Studies. In Proceedings of the Conference on Human Factors in Computing Systems-Proceedings, Toronto, ON, Canada, 26 April–1 May 2014; Association for Computing Machinery: New York, NY, USA, 2014; pp. 4057–4066.
- Tran, C.; Yan, S.; Habiyaremye, J.L.; Wei, Y. Predicting Driver’s Work Performance in Driving Simulator Based on Physiological Indices. In Proceedings of the Intelligent Human Computer Interaction: 9th International Conference, Evry, France, 11–13 December 2017.
- Tjolleng, A.; Jung, K.; Hong, W.; Lee, W.; Lee, B.; You, H.; Son, J.; Park, S. Classification of a Driver’s Cognitive Workload Levels Using Artificial Neural Network on ECG Signals. Appl. Ergon. 2017, 59, 326–332.
- Abd Rahman, N.I.; Md Dawal, S.Z.; Yusoff, N. Driving Mental Workload and Performance of Ageing Drivers. Transp. Res. Part F Traffic Psychol. Behav. 2020, 69, 265–285.
- Meteier, Q.; Capallera, M.; Ruffieux, S.; Angelini, L.; Abou Khaled, O.; Mugellini, E.; Widmer, M.; Sonderegger, A. Classification of Drivers’ Workload Using Physiological Signals in Conditional Automation. Front. Psychol. 2021, 12, 596038.
- He, D.; Wang, Z.; Khalil, E.B.; Donmez, B.; Qiao, G.; Kumar, S. Classification of Driver Cognitive Load: Exploring the Benefits of Fusing Eye-Tracking and Physiological Measures. Transp. Res. Rec. 2022, 2676, 670–681.
- Wei, W.; Fu, X.; Zhong, S.; Ge, H. Driver’s Mental Workload Classification Using Physiological, Traffic Flow and Environmental Factors. Transp. Res. Part F Traffic Psychol. Behav. 2023, 94, 151–169.
- Huang, J.; Peng, Y.; Hu, L. A Multilayer Stacking Method Base on RFE-SHAP Feature Selection Strategy for Recognition of Driver’s Mental Load and Emotional State. Expert Syst. Appl. 2024, 238, 121729.
- Ma, J.; Wu, Y.; Rong, J.; Zhao, X. A Systematic Review on the Influence Factors, Measurement, and Effect of Driver Workload. Accid. Anal. Prev. 2023, 192, 107289.
- Shao, X.; Chen, F.; Ma, X.; Pan, X. The Impact of Lighting and Longitudinal Slope on Driver Behaviour in Underwater Tunnels: A Simulator Study. Tunn. Undergr. Space Technol. 2022, 122, 104367.
- Mehler, B.; Reimer, B.; Dusek, J.A. MIT AgeLab Delayed Digit Recall Task (n-Back); Massachusetts Institute of Technology: Cambridge, MA, USA, 2011.
- Kumagai, T.; Akamatsu, M. Prediction of Human Driving Behavior Using Dynamic Bayesian Networks. IEICE Trans. Inf. 2006, E89-D, 857–860.
- Yang, L.; Ma, R.; Zhang, H.M.; Guan, W.; Jiang, S. Driving Behavior Recognition Using EEG Data from a Simulated Car-Following Experiment. Accid. Anal. Prev. 2018, 116, 30–40.
- Xie, J.; Zhu, M. Maneuver-Based Driving Behavior Classification Based on Random Forest. IEEE Sensors Lett. 2019, 3, 1–4.
- Wang, W.; Xi, J.; Chong, A.; Li, L. Driving Style Classification Using a Semisupervised Support Vector Machine. IEEE Trans. Hum.-Mach. Syst. 2017, 47, 650–660.
- Shi, X.; Wong, Y.D.; Li, M.Z.F.; Palanisamy, C.; Chai, C. A Feature Learning Approach Based on XGBoost for Driving Assessment and Risk Prediction. Accid. Anal. Prev. 2019, 129, 170–179.
- Lundberg, S.M.; Erion, G.G.; Lee, S.-I. Consistent Individualized Feature Attribution for Tree Ensembles. arXiv 2018, arXiv:1802.03888.
- Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems. arXiv 2017, arXiv:1705.07874.
- Engström, J.; Johansson, E.; Östlund, J. Effects of Visual and Cognitive Load in Real and Simulated Motorway Driving. Transp. Res. Part F Traffic Psychol. Behav. 2005, 8, 97–120.
- Fuller, R. Towards a General Theory of Driver Behaviour. Accid. Anal. Prev. 2005, 37, 461–472.
- Recarte, M.Á.; Pérez, E.; Conchillo, Á.; Nunes, L.M. Mental Workload and Visual Impairment: Differences Between Pupil, Blink, and Subjective Rating. Span. J. Psychol. 2008, 11, 374–385.
- Wang, Y.; Reimer, B.; Dobres, J.; Mehler, B. The Sensitivity of Different Methodologies for Characterizing Drivers’ Gaze Concentration Under Increased Cognitive Demand. Transp. Res. Part F Traffic Psychol. Behav. 2014, 26, 227–237.
- Meteier, Q.; De Salis, E.; Capallera, M.; Widmer, M.; Angelini, L.; Abou Khaled, O.; Sonderegger, A.; Mugellini, E. Relevant Physiological Indicators for Assessing Workload in Conditionally Automated Driving, Through Three-Class Classification and Regression. Front. Comput. Sci. 2022, 3, 775282.
- Cardone, D.; Perpetuini, D.; Filippini, C.; Mancini, L.; Nocco, S.; Tritto, M.; Rinella, S.; Giacobbe, A.; Fallica, G.; Ricci, F.; et al. Classification of Drivers’ Mental Workload Levels: Comparison of Machine Learning Methods Based on ECG and Infrared Thermal Signals. Sensors 2022, 22, 7300.
- Islam, M.R.; Barua, S.; Ahmed, M.U.; Begum, S.; Aricò, P.; Borghini, G.; Flumeri, G.D. A Novel Mutual Information Based Feature Set for Drivers’ Mental Workload Evaluation Using Machine Learning. Brain Sci. 2020, 10, 551.
- Feng, Z.; Yang, M.; Du, Y.; Xu, J.; Huang, C.; Jiang, X. Effects of the Spatial Structure Conditions of Urban Underpass Tunnels’ Longitudinal Section on Drivers’ Physiological and Behavioral Comfort. Int. J. Environ. Res. Public Health 2021, 18, 10992.

| Authors | Subjective Features | Task Performance Features | Physiological Features | Algorithms | Classification Labels | Classes | Best Performance (Accuracy) |
|---|---|---|---|---|---|---|---|
| Solovey et al., 2014 [14] | NA | Vehicle velocity, steering wheel reversals | Heart rate, skin conductance level | Decision Tree; Logistic Regression; Multilayer Perceptron; Naïve Bayes; Nearest Neighbors | N-back | 2 | 75.7% |
| Tran et al., 2017 [15] | NASA-TLX | Number of errors | Heart rate, heart rate variability, blink rate, pupil dilation, blink duration, fixation duration | Group method of data handling | Situation complexity | 3 | 0.781 (R²) |
| Tjolleng et al., 2017 [16] | NA | NA | Heart rate variability | Artificial neural network | N-back | 3 | 82% |
| Abd Rahman et al., 2020 [17] | NASA-TLX | Number of traffic violations, speed variability, reaction time | EEG | Multiple linear regression | Situation complexity | 3 | NA |
| Meteier et al., 2021 [18] | NA | NA | Heart rate variability, electrodermal activity, respiration | Random Forest; C-support Vector; Multi-Layer Perceptron | Verbal cognitive task | 2 | 95% |
| He et al., 2022 [19] | NA | NA | Heart rate and heart rate variability, eye-tracking features, and galvanic skin response | Artificial neural network, K-Nearest Neighbors, Support Vector Machine, Feedforward Neural Network, Recurrent Neural Network, and Random Forest | N-back | 3 | 97.8% |
| Wei et al., 2023 [20] | NA | Traffic volume, space headway | Heart rate growth rate, heart rate variability, electrodermal activity | Neural networks, support vector machines, and random forest | NASA-TLX scores | 3 | 97.8% |
| Huang et al., 2024 [21] | NA | Steering wheel turning angle, steering wheel speed, following time distance, lateral position | EEG, electrodermal activity | XGBoost, LightGBM, CatBoost, K-Nearest Neighbors, multilayer stacking ensemble learning | NASA-TLX scores | 3 | 97.48% |

| Metric | Algorithm | Perf | Eye | Phys | Perf_Eye | Perf_Phys | Eye_Phys | All |
|---|---|---|---|---|---|---|---|---|
| ACC (%) | Logistic Regression | 57.94 | 76.82 | 72.39 | 79.85 | 74.39 | 82.82 | 85.67 |
| ACC (%) | Naïve Bayes | 54.67 | 76.34 | 71.43 | 75.24 | 68.36 | 82.50 | 81.20 |
| ACC (%) | KNN | 54.20 | 75.66 | 61.29 | 79.72 | 57.96 | 81.08 | 80.98 |
| ACC (%) | Random Forest | 60.47 | 75.84 | 71.69 | 80.62 | 74.33 | 87.18 | 86.71 |
| ACC (%) | SVM | 62.39 | 77.81 | 72.71 | 79.55 | 72.09 | 87.48 | 86.22 |
| ACC (%) | XGBoost | 62.60 | 78.53 | 71.18 | 81.94 | 73.63 | 87.52 | 87.79 |
| AUC | Logistic Regression | 0.6227 | 0.8657 | 0.8089 | 0.8919 | 0.8219 | 0.9188 | 0.9392 |
| AUC | Naïve Bayes | 0.63 | 0.86 | 0.81 | 0.84 | 0.78 | 0.91 | 0.89 |
| AUC | KNN | 0.54 | 0.86 | 0.66 | 0.87 | 0.61 | 0.89 | 0.91 |
| AUC | Random Forest | 0.67 | 0.85 | 0.81 | 0.89 | 0.84 | 0.93 | 0.94 |
| AUC | SVM | 0.68 | 0.83 | 0.81 | 0.88 | 0.81 | 0.94 | 0.94 |
| AUC | XGBoost | 0.64 | 0.86 | 0.80 | 0.89 | 0.84 | 0.93 | 0.95 |
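
The accuracy rows above can be read column by column to see which algorithm leads for each feature set. The following is a minimal Python sketch of that lookup, using the ACC values copied from the table. The dictionary layout and the `best_per_feature_set` helper are illustrative assumptions, not part of the authors' pipeline.

```python
# ACC (%) values copied from the table above; the data layout is illustrative.
FEATURE_SETS = ["Perf", "Eye", "Phys", "Perf_Eye", "Perf_Phys", "Eye_Phys", "All"]

ACC = {
    "Logistic Regression": [57.94, 76.82, 72.39, 79.85, 74.39, 82.82, 85.67],
    "Naive Bayes": [54.67, 76.34, 71.43, 75.24, 68.36, 82.50, 81.20],
    "KNN": [54.20, 75.66, 61.29, 79.72, 57.96, 81.08, 80.98],
    "Random Forest": [60.47, 75.84, 71.69, 80.62, 74.33, 87.18, 86.71],
    "SVM": [62.39, 77.81, 72.71, 79.55, 72.09, 87.48, 86.22],
    "XGBoost": [62.60, 78.53, 71.18, 81.94, 73.63, 87.52, 87.79],
}


def best_per_feature_set(acc, feature_sets):
    """Return {feature_set: (algorithm, accuracy)} for the top accuracy in each column."""
    best = {}
    for i, feature_set in enumerate(feature_sets):
        # Pick the algorithm whose accuracy in column i is highest.
        algo = max(acc, key=lambda name: acc[name][i])
        best[feature_set] = (algo, acc[algo][i])
    return best


if __name__ == "__main__":
    for feature_set, (algo, value) in best_per_feature_set(ACC, FEATURE_SETS).items():
        print(f"{feature_set:<10} {algo:<20} {value:.2f}")
```

On these values, XGBoost has the highest accuracy in five of the seven feature sets, while SVM leads on Phys and Logistic Regression on Perf_Phys.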

| Algorithm | Optimal Feature Groups | Training ACC | Training AUC | Testing ACC | Testing AUC | Training Time (s) |
|---|---|---|---|---|---|---|
| Logistic Regression | Mean and std. of lateral position, blink rate, and heart rate growth rate | 86.5% | 0.936 | 82.9% | 0.93 | 1.596 |
| Random Forest | Mean lateral position, mean speed, blink rate, heart rate, and heart rate growth rate | 87.6% | 0.950 | 85.4% | 0.95 | 12.080 |
| SVM | Mean lateral position, mean speed, mean accelerator position, blink rate, std. of horizontal gaze position, heart rate, and heart rate growth rate | 88.2% | 0.953 | 87.8% | 0.95 | 2.177 |
| XGBoost | Mean lateral position, mean speed, blink rate, heart rate, and heart rate growth rate | 87.5% | 0.958 | 85.4% | 0.96 | 4.732 |
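
One way to read the training and testing columns above is as a gap between the two accuracies, which gives a rough indication of how well each model generalizes. The sketch below computes that gap from the tabulated values. The `RESULTS` layout and the `accuracy_gaps` helper are hypothetical names introduced for illustration, and the gap is a reading aid assumed here, not a metric reported by the authors.

```python
# Values copied from the table above: (training ACC %, testing ACC %, training time in s).
RESULTS = {
    "Logistic Regression": (86.5, 82.9, 1.596),
    "Random Forest": (87.6, 85.4, 12.080),
    "SVM": (88.2, 87.8, 2.177),
    "XGBoost": (87.5, 85.4, 4.732),
}


def accuracy_gaps(results):
    """Return {algorithm: training ACC minus testing ACC}, in percentage points."""
    return {
        name: round(train_acc - test_acc, 1)
        for name, (train_acc, test_acc, _time_s) in results.items()
    }


if __name__ == "__main__":
    gaps = accuracy_gaps(RESULTS)
    # List algorithms from the smallest to the largest train/test gap.
    for name in sorted(gaps, key=gaps.get):
        print(f"{name:<20} gap = {gaps[name]:.1f} points, time = {RESULTS[name][2]:.3f} s")
```

On these numbers the SVM shows the smallest gap (0.4 points) and Logistic Regression the largest (3.6 points).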

| Longitudinal Slope | Bright | Reddish | Dark | Total |
|---|---|---|---|---|
| 2.5% | 9 (18) | 12 (15) | 8 (19) | 29 (52) |
| 3.0% | 18 (9) | 18 (9) | 11 (16) | 47 (34) |
| 3.5% | 25 (2) | 21 (6) | 20 (7) | 66 (15) |
| 4.0% | 25 (2) | 26 (1) | 21 (6) | 72 (9) |
| Total | 77 (31) | 77 (31) | 60 (48) | 214 (110) |
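
Each cell above pairs a count with a second count in parentheses. The sketch below converts every pair into the share of the first count, `a / (a + b)`, and recomputes the row totals as a consistency check. What the two counts denote is not stated in this excerpt, so the code treats them only as pairs of counts. The `COUNTS` layout and both helper names are hypothetical.

```python
# (first count, parenthesized count) copied from the table above, by slope and light condition.
COUNTS = {
    "2.5%": {"Bright": (9, 18), "Reddish": (12, 15), "Dark": (8, 19)},
    "3.0%": {"Bright": (18, 9), "Reddish": (18, 9), "Dark": (11, 16)},
    "3.5%": {"Bright": (25, 2), "Reddish": (21, 6), "Dark": (20, 7)},
    "4.0%": {"Bright": (25, 2), "Reddish": (26, 1), "Dark": (21, 6)},
}


def first_count_share(counts):
    """Return the share a / (a + b) for every (a, b) cell."""
    return {
        slope: {light: a / (a + b) for light, (a, b) in row.items()}
        for slope, row in counts.items()
    }


def row_totals(counts):
    """Sum the first and the parenthesized counts across light conditions per slope."""
    return {
        slope: (sum(a for a, _ in row.values()), sum(b for _, b in row.values()))
        for slope, row in counts.items()
    }


if __name__ == "__main__":
    shares = first_count_share(COUNTS)
    totals = row_totals(COUNTS)
    for slope, row in shares.items():
        cells = ", ".join(f"{light}: {share:.1%}" for light, share in row.items())
        print(f"{slope}  {cells}  total = {totals[slope]}")
```

The recomputed row totals match the Total column of the table, and the share of the first count rises with slope in every light condition.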
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Shao, X.; Ma, X.; Chen, F.; Pan, X. Multimodal Machine Learning Framework for Driver Mental Workload Classification: A Comparative and Interpretable Approach. Appl. Sci. 2026, 16, 3581. https://doi.org/10.3390/app16073581
