Sensors | Article | Open Access | 23 October 2024

Intelligent Human Operator Mental Fatigue Assessment Method Based on Gaze Movement Monitoring

1
St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), St. Petersburg 199178, Russia
2
Laboratory for Cognitive Psychology of Digital Interface Users, HSE University, Moscow 101000, Russia
3
Faculty of Physics and Mathematics and Natural Sciences, Peoples’ Friendship University of Russia, Moscow 117198, Russia
4
Digital Education Department, Moscow State University of Psychology and Pedagogy, Moscow 127051, Russia
This article belongs to the Section Physical Sensors

Abstract

Modern mental fatigue detection methods rely on many evaluation parameters; for example, many researchers use subjective human evaluation or driving parameters to assess this condition. Developing a method for detecting the functional state of mental fatigue is an extremely important task. Although human operator support systems are becoming increasingly widespread, there is currently no open-source solution that can monitor this human state in real time and with high accuracy based on eye movement monitoring. Such a method would allow the prevention of a large number of potentially hazardous situations and accidents in critical industries (nuclear power stations, transport systems, and air traffic control). This paper describes a method for mental fatigue detection based on human eye movements. We based our research on a previously developed dataset that includes eye-tracking data captured from human operators performing different tasks during the day. Within the scope of the method, we propose a technique for determining the gaze characteristics most relevant to mental fatigue detection. The method includes the following machine learning techniques for human state classification: random forest, decision tree, and multilayer perceptron. The experimental results showed that the most relevant characteristics are as follows: average velocity within the fixation area; average curvature of the gaze trajectory; minimum curvature of the gaze trajectory; minimum saccade length; percentage of fixations shorter than 150 ms; and proportion of time spent in fixations shorter than 150 ms. The proposed method processes eye movement data in real time, with the maximum accuracy (0.85) and F1-score (0.80) reached using the random forest method.

1. Introduction

The severity of the impact of human operator mental fatigue during working tasks is currently underestimated. According to the National Highway Traffic Safety Administration (NHTSA), drowsy driving (a symptom of mental fatigue) was reportedly involved in 1.8% of fatal crashes from 2017 to 2021 [1]. According to the New Zealand Ministry of Transport, “In 2022 there were 34 fatal crashes, 80 serious injury crashes, and 460 minor injury crashes where driver fatigue was a contributing factor” [2].
Although mental fatigue has traditionally been regarded as a parameter affecting cognitive performance, in recent years scientists have found increasing evidence that prolonged performance of a monotonous task also affects the human physiological state [3]. Ways of determining mental fatigue include the analysis of different types of signals. Thus, the most common physiological characteristics that reflect this functional state are heart rate variability, electrical brain activity, skin galvanic response, respiratory rate, and face and eye movements.
The authors of [4] point out that existing methods of driver fatigue detection insufficiently take into account the influence of mental fatigue on the human condition. In recent years, owing to the development of machine learning techniques, the accuracy and speed of detection algorithms have reached a new level. For operator state recognition, the authors of [5] developed an LSTM network for analyzing eye movements; the conducted experiments confirmed its effectiveness. In a study of fatigue detection in excavator drivers, the authors pointed out that the onset of subjective fatigue is accompanied by an increase in reaction time and in the number of errors. Regarding changes in oculomotor characteristics, the same authors emphasize the change in the distribution of fixation points, which means that it becomes difficult for operators to see their surroundings clearly [6].
This paper presents a method for human operator gaze monitoring that identifies characteristics indicating an increase in mental fatigue. We use our previously developed dataset [7], which includes longitudinal gaze recordings of human operators, to identify the gaze characteristics most related to mental fatigue, and we propose a classifier that detects a fatigue state with high accuracy. We correlate all gaze characteristics with different tests: Landolt rings, inner session dynamics, choice reaction times, and the visual analogue scale. Experts then analyze these correlations and identify the seven gaze characteristics most related to the fatigue state, a choice we subsequently validate experimentally. Therefore, the scientific novelty of the paper includes the following:
  • A novel technique for identification of the gaze characteristics that are most related to a mental fatigue state;
  • A list of seven gaze characteristics that show the best results for mental fatigue classification in the considered dataset;
  • A classifier that allows the assessment of a mental fatigue state based on gaze movement data.
We have developed an open-source Python library that is available on GitHub (https://github.com/AI-group-72/FAEyeTON, accessed on 21 October 2024). The library is published under the GNU Lesser General Public License, version 2.1, and is also available in the PyPI repository under the name eyetrackfatigue. It can be easily integrated into target business software; examples of the key functions are shown in the test files in the repository. The library uses pandas to work with data files. The ML models used in the library are based on the scikit-learn library and can be easily extended. Additionally, the library can run as a standalone application with a PyQt-based user interface for demonstration purposes.
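As an illustration of how such a pipeline can be embedded in target software, the sketch below reads gaze samples with pandas, derives simple per-window features, and trains a scikit-learn model. The function names and the feature set here are illustrative assumptions and do not reproduce the actual eyetrackfatigue API:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def extract_features(samples: pd.DataFrame, window_s: float = 1.0) -> pd.DataFrame:
    """Derive per-window gaze features; expects columns t (seconds), x, y."""
    samples = samples.sort_values("t").copy()
    samples["window"] = (samples["t"] // window_s).astype(int)
    # A tiny stand-in for the library's much richer characteristic set.
    feats = samples.groupby("window").agg(
        x_std=("x", "std"),
        y_std=("y", "std"),
        x_range=("x", lambda s: s.max() - s.min()),
    )
    return feats.fillna(0.0)

def train_fatigue_model(feats: pd.DataFrame, labels) -> RandomForestClassifier:
    """Fit a random forest, the best-performing architecture in this paper."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(feats, labels)
    return model
```

The same reading → parsing → calculation → training split mirrors the module structure the library description mentions.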
The rest of the paper is organized as follows. Section 2 provides an overview of research dedicated to fatigue detection based on different types of signals, including eye-tracking. Section 3 describes the method, the training pipeline, and the classification pipeline. Section 4 presents the results obtained after training several algorithms. Section 5 discusses our results and future research questions.

3. Fatigue Assessment Method

This section describes a method for mental fatigue assessment based on the analysis of eye movement characteristics. First, we empirically identify the eye movement characteristics that are most relevant to a mental fatigue state (characteristics that change significantly when the person is fatigued). Then, we train an ML classifier that distinguishes between people in fatigued and non-fatigued states based on the identified eye movement characteristics.

3.1. Method Description

The mental fatigue assessment method (see Figure 1) consists of a training scenario and a fatigue assessment scenario. In the training scenario, we create a machine learning model for fatigue state classification. The result of the fatigue assessment scenario is an estimation of the PC operator’s fatigue state.
Figure 1. Fatigue classification pipeline.
The training scenario starts with the calculation of eye movement characteristics from eye movement coordinates. We classified all eye movement characteristics into the following groups: velocity-based, temporal-based, percentage-based, quantitative-based, saccade length-based, and trajectory-based. Correlations between the eye movement characteristics and the fatigue state were computed based on the ground truth values available in the dataset. The ground truth values included the choice reaction time (CRT), the Landolt rings test, the visual analogue scale for evaluating mental fatigue severity (VAS-F), and the inner session dynamics estimation. We describe these ground truth values in detail in Section 3.3. A group of experts compared the correlation results with theoretical knowledge of eye movement strategies to select the characteristics, together with the eye coordinates, used for classifier training. We tried the following architectures: random forest, decision tree, and multilayer perceptron. We describe this process in detail in Section 3.4.
In the fatigue assessment scenario, we used the selected characteristics, the eye coordinates, and the trained classification model.

3.2. Dataset Description

We used our previously developed dataset, for which we recorded eye-tracking data from 15 participants. Each participant was recorded for 7 days, with three recordings of approximately one hour each per day (morning, afternoon, and evening sessions). Each participant also provided a subjective assessment of the fatigue state as the ground truth (we used the Landolt rings test, CRT, and VAS-F). We described the dataset in detail in our previous paper [7].

3.3. Detection of Eye Movement Characteristic-Related Fatigue State

This subsection aims to identify a feasible number of eye movement characteristics that are most related to the mental fatigue state by calculating the correlations of the whole set of eye movement characteristics with subjective assessments of the mental fatigue state.

3.3.1. Landolt Rings

The Landolt rings correction test is a classic method for assessing attentional properties. The Landolt rings are emotionally neutral stimuli that are not difficult to distinguish. We proposed using the Au index (mental performance) as a fatigue indicator. The values of this parameter within our dataset ranged from −0.5 to 4. The Au index was categorized as either low or high based on a threshold of 1.5, chosen because it splits the data into equal proportions. If a subject had a mental performance value of 1.5 or higher, their performance was considered high; otherwise, it was considered low. The eye movement characteristics of these two groups were then compared using the Wilcoxon test, a non-parametric test suitable for dependent samples that may not be normally distributed, as well as for small sample sizes. The Wilcoxon signed-rank test on the high and low groups provided a p-value of less than 0.05.
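The thresholding and test above can be sketched as follows. We assume paired per-participant values of one characteristic (e.g., its mean over high-Au vs. low-Au sessions); the signed-rank test below uses the normal approximation, a simplification of what a statistics package would compute:

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    x and y are paired per-participant values of one eye movement
    characteristic, e.g. its mean over high-Au vs. low-Au sessions.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    # Rank |d| ascending, averaging 1-based ranks over ties.
    ordered = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
```

In practice one would call a library routine (e.g., `scipy.stats.wilcoxon`); the explicit version is shown only to make the ranking logic visible.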

3.3.2. Inner Session Dynamics

In accordance with our dataset, each session of the experiment lasted one hour and consisted of several tasks. Each session began and ended with the CRT task. Therefore, if a characteristic is affected by fatigue, its values should differ between these tasks, and the differences should show the same trend across sessions. For example, the proportion of fixations longer than 150 ms tended to increase (this was typical for 61% of sessions in our dataset), while the value of the average curvature fell in 19% of sessions for all participants. Thus, each eye movement characteristic was assigned the higher of two values characterizing its behavior under fatigue: (1) the proportion of sessions where it tended to increase; (2) the proportion of sessions in which it decreased.
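This session-trend scoring can be expressed compactly. The data layout (one delta per session, end value minus start value) is an assumption for illustration:

```python
def trend_score(session_deltas):
    """Given per-session changes (end minus start) of one characteristic,
    return the larger of the share of sessions where it increased and the
    share of sessions where it decreased."""
    n = len(session_deltas)
    up = sum(1 for d in session_deltas if d > 0) / n
    down = sum(1 for d in session_deltas if d < 0) / n
    return max(up, down)
```

A characteristic with a score near 1.0 behaves consistently across sessions and is therefore a stronger fatigue indicator than one whose direction of change varies.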

3.3.3. Choice Reaction Time (CRT)

The results of the CRT task are the values of the average reaction time, its standard deviation, and the number of errors. Participants completed this task at the beginning and at the end of each session. The CRT results were matched to the values of the eye movement characteristics recorded during the same session. We sorted the data based on the average reaction time and standard deviation. Then, the difference (delta) between the average reaction time at the beginning and at the end of each session was calculated. Correlations were calculated between the values of each characteristic and the corresponding delta values for fixation areas with diameters from 0.1 to 2.5 degrees of visual angle. The maximum correlation value of each characteristic was selected.
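The per-diameter correlation scan can be sketched as follows; the use of the Pearson coefficient and the data layout (one characteristic value per session, per candidate diameter) are assumptions for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    vy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (vx * vy)

def max_correlation_over_diameters(char_by_diameter, crt_deltas):
    """char_by_diameter maps a fixation-area diameter (degrees of visual
    angle) to per-session characteristic values; return the diameter with
    the strongest |r| against the per-session CRT deltas, and that r."""
    best = max(char_by_diameter.items(),
               key=lambda kv: abs(pearson(kv[1], crt_deltas)))
    return best[0], pearson(best[1], crt_deltas)
```

The same scan applies unchanged to the VAS-F scores described in the next subsection.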

3.3.4. Visual Analogue Scale (VAS-F)

The results of the VAS-F for evaluating mental fatigue severity are numerical scores of fatigue and energy. This test was performed once before each session, so one fatigue value of the VAS-F test corresponded to the eye movement characteristic values recorded during that session. Correlations were calculated between the values of each characteristic and the corresponding fatigue values of the VAS-F test for fixation areas with diameters from 0.1 to 2.5 degrees of visual angle. The maximum correlation value of each characteristic was selected; different characteristics reached their maximum correlation at different diameters of the fixation area.

3.3.5. Correlation Analysis

To identify the characteristics most related to mental fatigue state detection, we compared the correlation results calculated on the data from our dataset. In Table 1, we show the p-value for each characteristic in the column Landolt Rings (see Section 3.3.1). We then performed correlation analysis for each characteristic against the inner session dynamics (see Section 3.3.2), CRT (see Section 3.3.3), and VAS-F (see Section 3.3.4).
Table 1. Methods of fatigue detection.
We analyzed the values in the following way. First, we kept only the characteristics with p-values less than 0.05. Then, we sorted the values by p-value and involved two experts to analyze the correlations with the inner session dynamics, CRT, and VAS-F. The inner session dynamics show how the person’s state changed from the beginning of a session to its end, whereas the CRT and VAS-F show how the fatigue state changed from session to session. Finally, we chose seven characteristics that, in the judgment of these two experts, are the most related to the fatigue state. Six characteristics were chosen as the most important from Table 1: (1) average velocity within the fixation area; (2) average curvature of the gaze trajectory; (3) minimum curvature of the gaze trajectory; (4) minimum saccade length; (5) percentage of fixations shorter than 150 ms; and (6) proportion of time spent in fixations shorter than 150 ms. Additionally, the experts added one characteristic, “average speed in the fixation area”, which does not show good correlation in Table 1 but should theoretically correlate with human fatigue.

3.4. Fatigue Classification

This subsection presents the mental fatigue classification pipeline we proposed to distinguish fatigue and non-fatigue human states based on eye movement characteristics and eye coordinates.

3.4.1. Data Preprocessing

We present the data preprocessing pipeline in Figure 2. We proposed applying the following functions to the eye-tracking data: cleaning, parsing, calculation, and normalization. The cleaning function prefilters the data, removing errors and omissions before training the model. The parsing function implements data labelling and preparation for processing. The calculation function computes the gaze characteristics as well as the features for training the model. The normalization function implements a standard normalization procedure: mean subtraction followed by division by the standard deviation. This brings the various features into a near-zero neighborhood of values, which facilitates the selection of optimal weights for features in machine learning models.
Figure 2. Data preprocessing.
The data preprocessing included three separate steps. In the initial stage, relevant features were chosen, which were represented as separate time series for each activity (set of eye movement coordinates). We computed seven key values from each feature, namely the mean, standard deviation, minimum, maximum, 25th percentile, 50th percentile, and 75th percentile. Then we prepared all characteristics presented in Table 1 (that had a p-value less than or equal to 0.05 using Wilcoxon). Finally, we took the seven characteristics, selected by the experts, that are the most relevant to the mental fatigue state.
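The seven key values per feature can be computed as follows (a minimal sketch using NumPy):

```python
import numpy as np

def summarize_feature(series):
    """Return the seven summary values computed from one feature's time
    series: mean, std, min, max, and the 25th/50th/75th percentiles."""
    arr = np.asarray(series, dtype=float)
    q25, q50, q75 = np.percentile(arr, [25, 50, 75])
    return {
        "mean": arr.mean(), "std": arr.std(),
        "min": arr.min(), "max": arr.max(),
        "p25": q25, "p50": q50, "p75": q75,
    }
```

Applying this per activity and per feature turns each variable-length time series into a fixed-length vector suitable for the classifiers below.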

3.4.2. Mental Fatigue Classification Pipeline

We analyzed the ground truth data and determined that the most relevant signal for classification was the Landolt rings test, from which we chose the mental performance (Au) parameter. If a subject had an Au value of 1.5 or higher, their performance was considered high; otherwise, it was considered low. Next, we evaluated multiple types of classifiers to determine the optimal one for our analysis. The classifiers assessed included random forest, decision tree, k-nearest neighbors, multilayer perceptron, logistic regression, and Support Vector Machine (SVM). Our dataset consisted of 1112 samples, divided into 912 samples for training and 200 samples for testing. The testing set comprised 100 samples with a low value of mental performance and 100 samples with a high value. These sample sizes were chosen to maximize the amount of data available for training without compromising the reliability of the testing results.
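A sketch of this comparison, run on synthetic data rather than the paper's dataset, might look as follows; the classifier hyperparameters are scikit-learn defaults, not the configurations from Table 2:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 1112-sample dataset; 912/200 split with a
# class-balanced test set, as in the paper.
X, y = make_classification(n_samples=1112, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=200, stratify=y, random_state=0)

classifiers = {
    "random forest": RandomForestClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
}
scores = {}
for name, clf in classifiers.items():
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    scores[name] = (accuracy_score(y_te, pred), f1_score(y_te, pred))
```

The remaining classifier families (k-NN, logistic regression, SVM) drop into the same loop unchanged.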
For each classifier, we evaluated several parameter configurations to identify the optimal classifier from the available options. The goal was to select the configuration that yielded the best performance based on our evaluation criteria. The specific details of the parameter configurations and the evaluation results can be found in Table 2.
Table 2. Parameter configurations.
In terms of the evaluation criteria and loss functions utilized, various criteria were used during the training of the random forest and decision tree classifiers (see Table 2). The loss function for the multilayer perceptron (MLP) and logistic regression classifiers was log-loss (cross-entropy loss), while for the SVM classifier we used hinge loss. In contrast, the k-NN algorithm has no loss function to minimize during training and is not trained in the conventional sense: it stores a copy of the data and uses it during prediction, so no function is fit to the data and no optimization is performed.
In our study, we aimed to select the best classifier based on the highest accuracy achieved. To improve the performance of the classifiers, we applied normalization to each feature. This involved subtracting the mean from each feature and dividing the result by the standard deviation.
Additionally, we explored various feature selection methods to further enhance the classifiers’ performance. These methods included the following:
  • Removing quasi-constant features (we eliminated features that had almost constant values, as they do not contribute significantly to the classification task);
  • Removing correlated features (we utilized the Kendall correlation coefficient to identify and remove features that exhibited high correlation with each other);
  • Moreover, we employed principal component analysis (PCA) to reduce the dimensionality of the feature space (PCA is known for its effectiveness in avoiding overfitting).
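These three steps can be sketched as follows; the thresholds and component count are illustrative assumptions, not the values used in the study:

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold
from sklearn.decomposition import PCA

def select_features(df: pd.DataFrame, var_eps=1e-4, corr_max=0.9, n_components=5):
    """Drop quasi-constant features, drop one of each highly
    Kendall-correlated pair, then project the rest with PCA."""
    # 1. Quasi-constant features.
    vt = VarianceThreshold(threshold=var_eps)
    df = df[df.columns[vt.fit(df).get_support()]]
    # 2. Highly correlated features (Kendall coefficient).
    corr = df.corr(method="kendall").abs()
    drop = set()
    cols = list(df.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a not in drop and b not in drop and corr.loc[a, b] > corr_max:
                drop.add(b)
    df = df.drop(columns=list(drop))
    # 3. Dimensionality reduction with PCA.
    pca = PCA(n_components=min(n_components, df.shape[1]))
    return pca.fit_transform(df)
```

Each step reduces the feature space before classifier training, which is the overfitting-avoidance role the list above assigns to PCA in particular.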
In our study, we conducted experiments using different sets or combinations of feature groups to investigate the impact of specific groups on the classification performance. The objective was to determine whether using a particular group alone or combining multiple groups would yield better results. The main indicator of the quality of the estimation was the F1-score, the harmonic mean of the precision and recall of the classifier’s estimates.
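For reference, the F1-score can be computed directly from confusion-matrix counts:

```python
def f1_from_counts(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall, computed here
    from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```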

4. Results

We have developed and published an open-source Python library, available in both the GitHub and PyPI repositories under the names FAEyeTON and eyetrackfatigue, respectively. The library is ready-to-use software; it can be integrated into a complex system or further modified, depending on the user’s requirements. The library is divided into several software modules that can be used separately for their main functions: reading raw data from an eye-tracker device; parsing the data and calculating features; and training and using fatigue evaluation models. Examples of such usage, in the form of test files, are available in the GitHub repository, along with supporting documentation. We developed an interface that processes data using the mentioned modules (see Figure 3).
Figure 3. FAEyeTON system interface.
We conducted experiments using the developed library, which implements the method proposed in Section 3. In Table 3, we present the most relevant results for the three best algorithms (random forest, decision tree, and MLP), as well as the results of experiments with the characteristic sets (coordinates, characteristics selected by experts, and all characteristics presented in Table 1) that we discussed in Section 3.4.2. The goal was to determine whether using a single set or a combination of several sets would produce better results. We experimented with a random train/test split, and also implemented 10-fold cross validation, which increased the trustworthiness of our results. As Table 3 shows, the most accurate classifier is based on the random forest architecture and uses eye movement coordinates as input data together with the characteristics selected by the experts. The best accuracy after cross validation is 0.85, and the F1-score is 0.80. Therefore, we can conclude that the presented method and the selected characteristics show the best results for the mental fatigue classification task.
Table 3. Results of algorithm approbation.

5. Conclusions

This paper proposed a method for mental fatigue assessment based on eye movement analysis. Within this method, we proposed a technique for selecting the eye movement characteristics most relevant to the mental fatigue state. We identified these characteristics for our dataset and developed a classifier showing that, with the selected characteristics, we achieved the best accuracy and F1-score for the mental fatigue classification task (we tested and presented results for the best-performing architectures: random forest, decision tree, and multilayer perceptron). According to the results of our experiments, our method achieved an accuracy of up to 0.85 and an F1-score of 0.80 after 10-fold cross validation. We developed an open-source library that implements the proposed method and allows users to apply the proposed classifier as well as train their own classifier on their own data. The library can be easily integrated into target business software that needs to assess human mental fatigue in real time based on eye movements. The library takes 0.15 s to calculate the eye movement characteristics and less than 0.01 s to run the classifier for one second of data.

Author Contributions

A.K. was responsible for the methodology, conceptualization, funding acquisition, paper draft writing, and paper review. S.K. was responsible for paper draft writing. A.M. was responsible for software development (library development) and paper draft writing. B.H. was responsible for software (classifier training) and paper draft writing. A.B. was responsible for the presented method development and paper draft writing. V.K. was responsible for conducting the experiment. I.S. was responsible for the methodology and paper review. I.B. was responsible for the methodology and conceptualization development, as well as funding acquisition and paper review. G.K. was responsible for the funding acquisition and paper review. All authors have read and agreed to the published version of the manuscript.

Funding

The research has been supported by the Bortnik innovation fund #23ГУKoдИИC12-D7/79179.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The developed open-source library is available at https://github.com/AI-group-72/FAEyeTON (access date: 21 October 2024).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Drowsy Driving | NHTSA. Available online: https://www.nhtsa.gov/book/countermeasures-that-work/drowsy-driving (accessed on 2 September 2024).
  2. Safety—Annual Statistics | Ministry of Transport. Available online: https://www.transport.govt.nz/statistics-and-insights/safety-annual-statistics/sheet/fatigue (accessed on 2 September 2024).
  3. Martin, K.; Meeusen, R.; Thompson, K.G.; Keegan, R.; Rattray, B. Mental Fatigue Impairs Endurance Performance: A Physiological Explanation. Sports Med. 2018, 48, 2041–2051. [Google Scholar] [CrossRef] [PubMed]
  4. Hu, X.; Lodewijks, G. Exploration of the effects of task-related fatigue on eye-motion features and its value in improving driver fatigue-related technology. Transp. Res. Part F Traffic. Psychol. Behav. 2021, 80, 150–171. [Google Scholar] [CrossRef]
  5. Liu, B.; Lye, S.W.; Zakaria, Z.B. An integrated framework for eye tracking-assisted task capability recognition of air traffic controllers with machine learning. Adv. Eng. Inform. 2024, 62, 102784. [Google Scholar] [CrossRef]
  6. Li, J.; Li, H.; Wang, H.; Umer, W.; Fu, H.; Xing, X. Evaluating the impact of mental fatigue on construction equipment operators’ ability to detect hazards using wearable eye-tracking technology. Autom. Constr. 2019, 105, 102835. [Google Scholar] [CrossRef]
  7. Kovalenko, S.; Mamonov, A.; Kuznetsov, V.; Bulygin, A.; Shoshina, I.; Brak, I.; Kashevnik, A. OperatorEYEVP: Operator Dataset for Fatigue Detection Based on Eye Movements, Heart Rate Data, and Video Information. Sensors 2023, 23, 6197. [Google Scholar] [CrossRef] [PubMed]
  8. Kovalenko, S.; Mamonov, A.; Kuznetsov, V.; Bulygin, A.; Shoshina, I.; Brak, I.; Kashevnik, A. Machine learning and deep learning techniques for driver fatigue and drowsiness detection: A review. Multimed Tools Appl. 2024, 83, 9441–9477. [Google Scholar] [CrossRef]
  9. Sikander, G.; Anwar, S. Driver Fatigue Detection Systems: A Review. IEEE Trans. Intell. Transp. Syst. 2019, 20, 2339–2352. [Google Scholar] [CrossRef]
  10. Xu, J.; Min, J.; Hu, J. Real-time eye tracking for the assessment of driver fatigue. Healthc. Technol. Lett. 2018, 5, 54–58. [Google Scholar] [CrossRef] [PubMed]
  11. Dreißig, M.; Baccour, M.H.; Schäck, T.; Kasneci, E. Driver Drowsiness Classification Based on Eye Blink and Head Movement Features Using the k-NN Algorithm. In Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020, Canberra, Australia, 1–4 December 2020; pp. 889–896. [Google Scholar] [CrossRef]
  12. Wang, Y.; Huang, R.; Guo, L. Eye gaze pattern analysis for fatigue detection based on GP-BCNN with ESM. Pattern Recognit Lett. 2019, 123, 61–74. [Google Scholar] [CrossRef]
  13. Li, J.; Li, H.; Umer, W.; Wang, H.; Xing, X.; Zhao, S.; Hou, J. Identification and classification of construction equipment operators’ mental fatigue using wearable eye-tracking technology. Autom. Constr. 2020, 109, 103000. [Google Scholar] [CrossRef]
  14. Qin, H.; Zhou, X.; Ou, X.; Liu, Y.; Xue, C. Detection of mental fatigue state using heart rate variability and eye metrics during simulated flight. Hum. Factors Ergon. Manuf. Serv. Ind. 2021, 31, 637–651. [Google Scholar] [CrossRef]
  15. Qin, H.; Zhou, X.; Ou, X.; Liu, Y.; Xue, C. Driver Fatigue Detection Method Based on Human Pose Information Entropy. J. Adv. Transp. 2022, 2022, 7213841. [Google Scholar] [CrossRef]
  16. Biotech Laboratory Neiry. Available online: https://neiry.ru/ (accessed on 4 September 2024).
  17. Videomix, Biometric Recognition Systems. Available online: https://v-mix.ru/?utm_referrer=https%3A%2F%2Fwww.google.com%2F (accessed on 12 September 2024).
  18. Mittal, A.; Kumar, K.; Dhamija, S.; Kaur, M. Head movement-based driver drowsiness detection: A review of state-of-art techniques. In Proceedings of the IEEE International Conference on Engineering and Technology, Coimbatore, India, 17–18 March 2016; pp. 903–908. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
