Article

A Multi-Feature Fusion and Situation Awareness-Based Method for Fatigue Driving Level Determination

School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan 114051, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(13), 2884; https://doi.org/10.3390/electronics12132884
Submission received: 7 June 2023 / Revised: 22 June 2023 / Accepted: 27 June 2023 / Published: 29 June 2023

Abstract:
The detection and evaluation of fatigue levels in drivers play a crucial role in reducing traffic accidents and improving the overall quality of life. However, existing studies in this domain often focus on fatigue detection alone, with limited research on fatigue level evaluation. These limitations include the use of single evaluation methods and relatively low accuracy rates. To address these issues, this paper introduces an innovative approach for determining fatigue driving levels. We employ the Dlib library and fatigue state detection algorithms to develop a novel method specifically designed to assess fatigue levels. Unlike conventional approaches, our method adopts a multi-feature fusion strategy, integrating fatigue features from the eyes, mouth, and head pose. By combining these features, we achieve a more precise evaluation of the driver’s fatigue state level. Additionally, we propose a comprehensive evaluation method based on the analytic hierarchy process (AHP) and fuzzy comprehensive evaluation, combined with situational prediction. This approach effectively evaluates the fatigue state level of drivers at specific moments or stages and provides accurate predictions. Furthermore, we optimize the gated recurrent unit (GRU) network using an enhanced marine predator algorithm (MPA), which results in significant improvements in predicting fatigue levels during situational prediction. Experimental results demonstrate a classification accuracy of 92% across various scenarios while maintaining real-time performance. In summary, this paper introduces a novel approach for determining fatigue driving levels through multi-feature fusion. We also incorporate AHP-fuzzy comprehensive evaluation and situational prediction techniques, enhancing the accuracy and reliability of fatigue level evaluation. This research holds both theoretical and practical significance in the field of fatigue driving.

1. Introduction

Since the beginning of the 21st century, with the continuous improvement in people’s living standards, urbanization has been accelerating. To meet the increasing demand for transportation, the demand for cars has also surged. However, along with this, numerous problems have emerged. The most prominent issue is the high frequency of traffic accidents, resulting in significant losses for the country and its people. According to survey statistics, the number of fatalities in traffic accidents in China exceeded 90,000, accounting for 2.3% of the total number of deaths [1]. Furthermore, the average fatality rate in traffic accidents in China is 31.3%, which is considerably higher than in other countries, highlighting the severity of the problem. Analysis of the causes of traffic accidents reveals that over 30% of accidents are attributed to driver-related factors. Real-time monitoring of drivers allows for the detection of early signs of fatigue and enables timely fatigue warnings to be issued, helping drivers adjust their state and effectively reducing the risk of fatigue-related accidents, thereby ensuring the safety of individuals and their property.
Currently, studies on driver fatigue detection can be categorized into three main approaches [2,3,4], as shown in Table 1: physical-data-based detection [5,6], vehicle-operation-based detection [7], and facial-feature-analysis-based detection [4,8]. Comparing the three approaches shows that facial feature analysis has the lowest detection cost, achieves the best detection performance, and requires a relatively simple data collection setup. However, although pure object-detection methods considerably improve detection accuracy, they can easily overlook small-scale features when the driver is fatigued, because facial and head poses in fatigued and non-fatigued states are highly similar. Studying fine-grained features at arbitrarily small scales is therefore crucial for determining the level of driver fatigue. Furthermore, during fatigue driving, the level of fatigue varies over time. While computer-vision-based methods are effective in detecting the driver’s facial features, further research is needed to deeply understand the driver’s fatigue level.
Therefore, we propose a comprehensive approach that combines driver facial recognition using the Dlib library with the MAR, EAR, and HPE fatigue detection algorithms. By integrating multiple features from these different sources, our method enables a more comprehensive and accurate evaluation of the driver’s fatigue level. In terms of facial recognition, our method utilizes advanced algorithms and the Dlib library to detect and analyze facial expressions, with a particular focus on fatigue-related cues. This allows us to capture subtle changes in facial features and quantify them, providing indications of fatigue and enhancing the overall effectiveness of our subsequent fatigue level determination. For eye fatigue detection, we employ the Dlib library to track and analyze eye movements and features associated with fatigue, such as drooping eyelids and prolonged blink durations. This enables accurate identification and quantification of the level of eye fatigue experienced by the driver. Similarly, our method includes mouth and head fatigue testing, using the Dlib library to monitor mouth movements and head posture to assess signs of fatigue. By analyzing these specific features, we gain a deeper understanding of the driver’s fatigue state. Overall, the main contribution of this paper is the utilization of driver facial recognition using the Dlib library, combined with the MAR, EAR, and HPE fatigue detection algorithms to extract fatigue features and classify fatigue driving levels. Ultimately, we provide a comprehensive and reliable method for evaluating the driver’s fatigue driving level. By considering multiple fatigue indicators, our method offers a more comprehensive understanding of the driver’s condition, thereby enhancing safety and reducing accident risks.
This paper is structured as follows: Section 1 provides an introduction to the background and significance of fatigue driving. We discuss the motivation behind this paper and highlight our research methods and work in this field. Section 2 presents a detailed literature review, examining the existing studies and theories related to fatigue detection. We discuss the key findings and propose a multi-feature fusion analysis method adopted in this paper. In Section 3, we describe the feasibility of the methods employed in our paper and provide a description and improvement of the relevant methodology. Section 4 explains the data collection process, the experimental setup, and the statistical analysis techniques used to analyze the gathered data and presents the results and findings of our study. We discuss the empirical evidence and present the quantitative and qualitative analysis of the data. We also include visual representations such as figures and tables to support our findings. Finally, in Section 5, we draw conclusions based on our paper’s findings. We summarize the main innovations and contributions of this research and discuss the implications for the fatigue driving field. We also consider the use of the Dlib library for fatigue detection and its applicability in safety-related fields. In summary, this paper aims to provide a comprehensive understanding of driver fatigue and fatigue driving by examining the literature, conducting empirical research, and presenting our findings. The following sections will delve into each aspect in detail.

2. Literature Review

2.1. Face Recognition and 68 Feature Point Positioning

The Dlib library is an open-source utility toolkit that employs powerful algorithms and techniques to achieve accurate and efficient tasks, including face detection, feature localization, pose estimation, and expression analysis. These functionalities make Dlib a popular choice for face-related tasks in both research and practical applications. Developed using C++, the library encompasses mature machine learning algorithms and models, enabling widespread applications and excellent performance in image processing and face feature extraction. The library offers various functionalities for detecting and analyzing facial features [9].
In summary, the Dlib library plays a crucial role in face-related tasks and has made significant contributions to the advancement of facial recognition. Its comprehensive features, reliability, and broad applicability have established its importance in the scientific community.
This paper utilizes the Shape_Predictor_68_face_landmarks.dat [10] model library from the Dlib library to perform accurate detection of the 68 key points on the human face. These key points, as depicted in Figure 1, can be precisely marked on the face and organized in a consistent order. By extracting the coordinates of these facial feature points, along with important information such as face boundaries and facial angles, valuable insights into an individual’s facial expression, state, and overall well-being can be obtained. This information serves as a foundation for assessing the person’s physiological and mental state.
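For illustration, a minimal Python sketch of this landmark extraction step is shown below. This is not the authors' code: the frame path and variable names are hypothetical, and the predictor file is assumed to be available locally.

import cv2
import dlib

# Load Dlib's frontal face detector and the 68-point landmark predictor.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

frame = cv2.imread("driver_frame.jpg")          # hypothetical input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

for face in detector(gray, 0):
    shape = predictor(gray, face)
    # Collect the 68 (x, y) coordinates for the later EAR, MAR, and HPE computations.
    landmarks = [(shape.part(i).x, shape.part(i).y) for i in range(68)]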

2.2. Eye Fatigue Detection Based on EAR Algorithm

Eye features are an important parameter for judging the fatigue state and best reflect the driving state of the driver. Based on the eye aspect ratio (EAR) concept for eye feature points proposed by Soukupova T [11] in 2016, after the 68 facial key points are extracted through the relevant face models or libraries, each eye has six key points, positioned as shown in Figure 2, where (x36, y36), (x37, y37), …, (x41, y41) correspond to the key points of the left eye [12]. The EAR formula is shown in Formula (1).
EAR = \frac{\left| y_{37} - y_{41} \right| + \left| y_{38} - y_{40} \right|}{2\left| x_{39} - x_{36} \right|}
The driver’s eye state is judged from the EAR: when the eye is open, the EAR fluctuates around a certain value, and when the eye closes, the EAR drops rapidly and is theoretically close to zero. Because the face detection model is not perfectly precise, we consider the eyes closed whenever the EAR falls below a certain threshold, which is initially set to 0.2. When the EAR fluctuates above this threshold, the driver’s eyes are open and the driver is in a normal state; when the EAR drops below the threshold toward zero, the driver’s eyes are closed and the driver is in an abnormal state.
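A minimal sketch of this check follows, assuming `landmarks` is the list of 68 (x, y) points from the previous step (indices 36–41 correspond to the left eye); it is an illustration rather than the authors' implementation.

EAR_THRESHOLD = 0.2  # initial threshold used in this paper

def eye_aspect_ratio(landmarks):
    x = [p[0] for p in landmarks]
    y = [p[1] for p in landmarks]
    # Vertical eyelid distances divided by the horizontal eye width, Formula (1).
    return (abs(y[37] - y[41]) + abs(y[38] - y[40])) / (2 * abs(x[39] - x[36]))

ear = eye_aspect_ratio(landmarks)
eyes_closed = ear < EAR_THRESHOLD  # True is treated as a closed-eye (abnormal) frame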

2.3. Mouth Fatigue Test Based on the MAR Algorithm

There are two states of the mouth: closed and open. When the mouth is in the open state, according to its size, it can be divided into talking, eating, and yawning. Yawning is a reaction when the human body breathes deeply. When the brain feels bored, tired, or distracted, people will subconsciously yawn. Therefore, mouth features are also an important parameter for fatigue determination. The principle is similar to that of the eye parameters. The mouth aspect ratio (MAR) is used to judge the opening and closing of the driver’s mouth [13]. The main reference points of the mouth are shown in Figure 3. The MAR value is determined from the coordinates of the key points of the upper and lower lips, and its calculation is shown in Formula (2).
MAR = \frac{\left| y_{61} - y_{67} \right| + \left| y_{62} - y_{66} \right| + \left| y_{63} - y_{65} \right|}{3\left| x_{64} - x_{60} \right|}
When the mouth is open, the MAR takes a certain value, and when the mouth is closed, the numerator of the MAR is 0. When a person yawns, the mouth opening is significantly larger than when talking or eating, so the MAR value becomes significantly larger. Because a yawn differs markedly from the mouth’s relaxed state for virtually everyone, this feature is set as a secondary fatigue feature detection point in this paper, and individual differences can be almost completely ignored. Based on a comprehensive consideration of multiple test data, this paper initially sets 0.6 as the threshold of the MAR value. Since yawning is a continuous opening and closing of the mouth, and Gallup Andrew C et al. [14] found that the whole process of opening and closing the mouth takes about 6.5 s during a single yawn, the detection condition is refined in line with the requirements of this paper: if the MAR exceeds the 0.6 threshold more than 10 times within 60 s, the driver is judged to be fatigued; the false detection rate is low under this condition.
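The yawning rule described above can be sketched as follows; this is a simplified illustration that assumes one MAR value and timestamp per processed frame, and the helper names are hypothetical.

from collections import deque

MAR_THRESHOLD = 0.6   # mouth-opening threshold
WINDOW_SECONDS = 60   # observation window
COUNT_LIMIT = 10      # detections above the threshold that indicate fatigue

yawn_events = deque()  # timestamps of frames where the MAR exceeded the threshold

def update_mouth_fatigue(mar_value, timestamp):
    if mar_value > MAR_THRESHOLD:
        yawn_events.append(timestamp)
    # Discard detections that fall outside the 60 s window.
    while yawn_events and timestamp - yawn_events[0] > WINDOW_SECONDS:
        yawn_events.popleft()
    return len(yawn_events) > COUNT_LIMIT  # True -> mouth fatigue feature present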

2.4. Head Fatigue Test Based on HPE Algorithm

The steps of the head pose estimation (HPE) algorithm are as follows: firstly, detect the 2D facial key points, and then utilize the 3D morphable model (3DMM) to fit the corresponding 3D facial models for different individuals. By aligning the 3D facial model, the conversion relationship between the 3D points and their corresponding 2D points is determined. Subsequently, the solvePnP() function in OpenCV is employed to solve the PnP (Perspective-n-Point) problem, allowing the computation of the Euler angles based on the rotation matrix. The Euler angles related to the head pose include the pitch angle (rotation around the x-axis), yaw angle (rotation around the y-axis), and roll angle (rotation around the z-axis), as illustrated in Figure 4 [15]. These angles are utilized to assess the state of the head.
These angles are used to map the two-dimensional coordinates in the image to three-dimensional space through four coordinate systems [16,17]: World Coordinate System (U, V, W), Camera Coordinate System (X, Y, Z), Image Center Coordinate System (u, v), and Pixel Coordinate System (x, y). Figure 5 below shows the relationship between these coordinate systems.
Previous researchers have employed rotation matrices to solve the Euler angles used in head pose estimation. The rotations around the x-axis, y-axis, and z-axis, denoted by angles α, β, and γ, respectively, correspond to rotation matrices as shown in Equations (3)–(5).
R_X(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}
R_Y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}
R_Z(\gamma) = \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}
The sine and cosine values of the Euler angles α, β, and γ around the x, y, and z axes, respectively, denoted as Sx, Cx, Sy, Cy, Sz, and Cz, are obtained as shown in Equation (6). Euler angles can be represented using a rotation matrix R.
R = R_Z(\gamma) R_Y(\beta) R_X(\alpha) = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} = \begin{bmatrix} C_y C_z & S_x S_y C_z - C_x S_z & C_x S_y C_z + S_x S_z \\ C_y S_z & S_x S_y S_z + C_x C_z & C_x S_y S_z - S_x C_z \\ -S_y & S_x C_y & C_x C_y \end{bmatrix}
The solution to the equation yields the result as shown in Equation (7).
\alpha = \arctan(r_{32} / r_{33}), \quad \beta = \arctan\!\left(-r_{31} \Big/ \sqrt{r_{32}^2 + r_{33}^2}\right), \quad \gamma = \arctan(r_{21} / r_{11})
Normally, the average range of head movement for adults is [−79.8°, 75.3°] in yaw, [−60.4°, 69.6°] in pitch, and [−40.9°, 36.3°] in roll. Based on the observation that the head posture of a drowsy individual exhibits minimal movement in yaw, this paper considers changes in the pitch and roll angles as the basis for fatigue state evaluation. Following the PERCLOS criteria, this paper determines the head posture to be abnormal when the pitch angle (Pitch) falls outside [−20°, 20°] or the roll angle (Roll) falls outside [−15.4°, 15.4°].
Combined with the P80 standard, which has the largest correlation coefficient with the objective fatigue degree in the PERCLOS standard [18], this paper sets the Euler angle |Pitch| ≥ 20° for the head down or the Euler angle |Roll| ≥ 15.4° for the head tilt as the fatigue features of the head.
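A rough sketch of this head pose step is given below; the 3D reference points, camera matrix, and chosen landmark subset are illustrative assumptions rather than the authors' exact configuration.

import cv2
import numpy as np

# Generic 3D reference points of a face model (assumed values, arbitrary units).
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),          # nose tip
    (0.0, -330.0, -65.0),     # chin
    (-225.0, 170.0, -135.0),  # left eye outer corner
    (225.0, 170.0, -135.0),   # right eye outer corner
    (-150.0, -150.0, -125.0), # left mouth corner
    (150.0, -150.0, -125.0),  # right mouth corner
], dtype=np.float64)

def head_pose_angles(image_points, frame_size):
    # image_points: 6 x 2 float array of the corresponding 2D Dlib landmarks.
    h, w = frame_size
    camera = np.array([[w, 0, w / 2], [0, w, h / 2], [0, 0, 1]], dtype=np.float64)
    ok, rvec, _ = cv2.solvePnP(MODEL_POINTS, image_points, camera, np.zeros(4))
    R, _ = cv2.Rodrigues(rvec)
    # Euler angles recovered from the rotation matrix as in Equation (7).
    pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    yaw = np.degrees(np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2])))
    roll = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    return pitch, yaw, roll

def head_fatigue(pitch, roll):
    # Fatigue rule from this section: |Pitch| >= 20 degrees or |Roll| >= 15.4 degrees.
    return abs(pitch) >= 20.0 or abs(roll) >= 15.4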

2.5. Multi-Feature Fusion Analysis

Although it is relatively easy to implement the above methods for identifying eye, mouth, and head states, there are some drawbacks. In actual driving environments, fatigue driving is a complex state that involves psychological and physiological factors, and the detection results can be easily influenced by various environmental interferences. If only eye or mouth features are extracted for fatigue detection, the accuracy of the detection will be significantly reduced. Therefore, the fatigue detection method based on individual features needs to be applied under ideal conditions. Some scholars [19] have proposed solutions to alleviate the problem of inaccurate detection of facial key points to some extent. In addition, the size of eyes and mouth can vary among individuals [20,21], and using a fixed threshold may lead to different results for different eye and mouth sizes. Reference [22] suggests training a classification library for each individual driver in advance to address individual differences. However, this method has limited generalization capability and places high demands on computer storage capacity.
Furthermore, since threshold values are set based on eye, mouth, and head poses to determine the driver’s fatigue state, there can be various interfering factors in the data collection process. For example, when the driver smiles, the eyes tend to close; when the driver sings, the mouth tends to yawn; and when the driver looks outside to observe traffic, the head pose can deviate significantly. To reduce the interference of these situations on fatigue state determination, this paper utilizes a fusion of multiple features for comprehensive fatigue evaluation. Based on the aforementioned MAR, EAR, and HPE algorithms, fatigue detection in drivers is performed, and the driver’s fatigue state is determined by setting appropriate thresholds to identify the presence of fatigue features. Specifically, a driver is classified as non-fatigued when the MAR, EAR, |Pitch|, and |Roll| fall within the respective ranges of [0, 0.6), (0.2, 1], [0, 20), and [0, 15.4), as shown in Table 2.
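To make the combined rule concrete, a minimal sketch of the non-fatigue check implied by Table 2 follows; the threshold values are those quoted above, and the function name is hypothetical.

def is_non_fatigued(mar, ear, pitch, roll):
    # A frame is treated as non-fatigued only when all four indicators are
    # inside their normal ranges from Table 2.
    return (0.0 <= mar < 0.6 and
            0.2 < ear <= 1.0 and
            abs(pitch) < 20.0 and
            abs(roll) < 15.4)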

3. Research Methods

3.1. Situation Awareness Structure

Endsley [23] classifies situation perception into three levels: element extraction, evaluation, and prediction. The prediction stage aims to anticipate the “potential” future state. Initially, relevant environmental factors are perceived to understand the driver’s current state. This paper specifically focuses on analyzing head features as part of facial features [24,25] that influence the driving state. The evaluation step involves evaluating the current state, including the driver’s fatigue level and their present situation. Throughout the driver’s journey, various elements are considered to determine the degree of goal attainment and assess the current outcome using different data types and information. Future state projection predicts potential future states based on the current state [26], estimating the driver’s future driving risk. In summary, this paper studies the current and present situation levels to identify potential future trends and issue warnings for high-risk fatigue states. Situational perception technology is commonly applied in network environment warnings, determining the network threat level based on network environment information [27]. The process of fatigue state situational awareness is depicted in Figure 6.
The fatigue state of a driver is determined by calculating parameters related to eye fatigue, mouth fatigue, and head posture. Firstly, the fatigue level of the driver’s fatigue state and evaluation indicators for the fatigue parameters are defined, and mathematical modeling and quantification of each indicator are performed. An AHP-fuzzy comprehensive evaluation algorithm is proposed to assess the driver’s fatigue state. Subsequently, future situation levels are predicted based on past and present states, with warnings issued for higher-level states. To accomplish this, a situation prediction method based on the improved marine predator algorithm (WMPA)-optimized GRU (gate recurrent unit) is proposed.

3.2. Situational Element Extraction

3.2.1. Division of Fatigue Status Level

The Dlib library can be used to obtain the results of facial feature extraction from human faces. The first step in trend perception involves considering these factors as elements of the situation. Subsequently, situation evaluation and situation prediction are conducted, and evaluation rules need to be formulated beforehand. During the situation evaluation process, a comprehensive and reasonable evaluation of various factors is performed. The corresponding evaluation system is set based on the specific circumstances from different perspectives. It is important to note that the results obtained can vary significantly depending on the chosen approach. The focus of this research paper is the fatigue driving status of the driver. Because most fatigue state studies approach fatigue detection as a target detection problem, they have focused on detecting fatigue by analyzing the mouth and eyes of the face. With the introduction of head pose, the majority of studies have started incorporating the nodding frequency within a unit of time into the facial-feature analysis of fatigue. In this paper, we adopt a method that primarily focuses on facial features with supplementary consideration of head pose. We utilize the Dlib library along with the MAR, EAR, and HPE fatigue detection algorithms to establish the fatigue state thresholds for drivers. By combining these thresholds, we can determine the driver’s fatigue level. Specifically, this paper investigates the signs of fatigue in the driver’s face and head and classifies the level of fatigue according to Table 3 below. A higher level indicates a more severe degree of fatigue, requiring increased monitoring and attention.

3.2.2. Construction of the Evaluation System

1. Extract evaluation index
The driver’s fatigue state encompasses multiple contributing factors, and the complex interrelationships among these factors make it challenging to directly obtain the desired results. Therefore, it is crucial to extract evaluation indicators relevant to the evaluation objectives. The selection of indicators follows two principles: firstly, clearly defining the subject of the situation evaluation and, secondly, considering the attributes of the research subjects. Considering that the purpose of the situation evaluation is to determine the risk level of fatigue driving, which reflects the degree of driver fatigue, this relationship primarily manifests through the driver’s facial features, including mouth state, eye state, head pitch angle, head roll angle, and the presence of improper driving behaviors. Additionally, the driver’s physiological features and the environmental conditions inside the vehicle (such as temperature and driving duration) serve as indirect factors. Furthermore, weather conditions also influence the driver’s driving state. Based on the research conditions and objectives of this paper, the fatigue state level of fatigue driving will be determined by analyzing the facial features related to the direct factors. Therefore, the identified indicators for fatigue state include mouth state, eye state, head pitch angle, head roll angle, and the presence of improper driving behavior. Indicators 1 to 4 can be used to evaluate levels I to V, while indicator 5 is used to determine level VI. In subsequent algorithms, the scope of the situation evaluation will focus on levels I to V. Figure 7 below shows the fatigue state situation evaluation index system.
2. Evaluation index quantification
Among the evaluation indicators, the relationship between fatigue and mouth, eyes, and head posture is highly significant. The opening and movement of the eyes, as well as the orientation of the head, directly impact the evaluation results. Head posture can be characterized by three Euler angles: pitch angle (Pitch), yaw angle (Yaw), and roll angle (Roll). In this paper, the changes in mouth state, eye state, head pitch angle, and roll angle are selected as the basis for judging fatigue state. The four evaluation indicators are quantified based on their respective numerical values. Based on the introduction of eye, mouth, and head fatigue detection algorithms in Section 2, the evaluation metrics for the eye and mouth can be quantified using Equation (1) and Equation (2), respectively. Similarly, the evaluation metrics for head pitch angle and roll angle can be quantified using Equation (7).

3.3. Situation Evaluation

The method of combining hierarchical analysis and fuzzy evaluation is referred to as the fuzzy hierarchical analysis method. The model establishment is depicted in Figure 8. The hierarchical analysis method is utilized to calculate the weight of the evaluation indicators that influence the level of driver’s fatigue status. Subsequently, the fuzzy comprehensive evaluation method is employed to construct a fuzzy evaluation matrix and conduct a comprehensive evaluation. Finally, the final result of the driver’s fatigue status level is determined based on the principles of fuzzy theory.

3.3.1. Calculation Weight of Hierarchical Analysis Method

1. Build a judgment matrix
The judgment matrix serves as a criterion in determining the element values of the next layer. The values assigned to the matrix elements reflect the subjective understanding and evaluation of the relative importance of different factors based on objective reality. Typically, the values used are 1, 3, 5, 7, and 9, with their reciprocals as the benchmark values and 2, 4, 6, and 8 as intermediate values. The specific labeling method for the intermediate values of adjacent judgments follows the 1–9 scale method proposed by Saaty [28], as shown in Table 4.
2. Calculate the maximum eigenvalue
The maximum eigenvalue and eigenvector of each judgment matrix are calculated using the multiplication and summation method. Subsequently, the eigenvectors are normalized to obtain the weight ranking. Based on the weight rankings of each level, the overall weight ranking is determined. The specific calculation steps are as follows:
Step 1: Normalize each column of the judgment matrix P to obtain the normalized matrix $\bar{P}$.
\bar{P}_{ij} = \frac{P_{ij}}{\sum_{k=1}^{n} P_{kj}} \quad (i, j = 1, 2, \ldots, n)
In the formula, $\bar{P}_{ij}$ is the normalized element in row i and column j; $P_{ij}$ is the element in row i and column j; and $\sum_{k=1}^{n} P_{kj}$ is the sum of the elements in column j.
Step 2: Sum each row of the normalized judgment matrix $\bar{P}$:
\bar{W}_i = \sum_{j=1}^{n} \bar{P}_{ij} \quad (i = 1, 2, \ldots, n)
In the formula, $\bar{W}_i$ is the sum of the ith row of the normalized judgment matrix.
Step 3: Normalize the vector $\bar{W} = [\bar{W}_1, \bar{W}_2, \ldots, \bar{W}_n]^T$:
W_i = \frac{\bar{W}_i}{\sum_{i=1}^{n} \bar{W}_i} \quad (i = 1, 2, \ldots, n)
The obtained $W = [W_1, W_2, \ldots, W_n]^T$ is the eigenvector (weight vector).
Step 4: Calculate the maximum eigenvalue $\lambda_{\max}$ of the matrix:
\lambda_{\max} = \sum_{i=1}^{n} \frac{(PW)_i}{n W_i}
In the formula, $(PW)_i$ is the ith element of the product of matrix P and vector W.
3. Consistency check
Consistency check of weight vectors: Because objective phenomena are complex and our understanding of them can be biased, the judgment matrix must be checked for consistency and randomness using the eigenvalue associated with the derived eigenvector. The steps are as follows:
Step 1: Calculate the consistency index (CI): CI = (λmax − n)/(n − 1).
Step 2: Calculate the average random consistency index (RI): The RI is obtained by averaging the feature values from multiple randomly generated judgment matrices. Table 5 presents the values of RI for dimensions ranging from 1 to 10.
Step 3: Calculate CR: CR = CI/RI; when CR < 0.1, it is considered to have better consistency to judge the matrix.
4. Calculation results
This paper formed an expert group consisting of 20 industry experts with senior titles or above in the field of object detection, as well as associate professors from universities. The selection process involved careful consideration of their expertise and industry experience. To ensure a comprehensive evaluation, multiple rounds of questionnaires were distributed to collect the experts’ opinions on the evaluation indices of driver fatigue status. The questionnaire was designed to assess the rationality of the preliminary selection of indicators based on the experts’ professional knowledge and industry experience. Moreover, by comparing the indicators pairwise, the relative importance between two indicators was determined using the 1–9 scale method. This approach helped capture the difference in significance between various indicators. Subsequently, the collected questionnaires underwent rigorous screening and elimination to exclude any that did not meet the requirements. The remaining data were analyzed and processed to obtain the judgment matrix, represented by matrix (12). This matrix reflects the comprehensive opinions and evaluations of the expert group.
P = \begin{bmatrix} 1/1 & 1/2 & 3/1 & 4/1 \\ 2/1 & 1/1 & 7/1 & 5/1 \\ 1/3 & 1/7 & 1/1 & 2/1 \\ 1/4 & 1/5 & 1/2 & 1/1 \end{bmatrix}
The calculation steps are as follows:
Step 1: Normalize each column vector of the pairwise comparison judgment matrix P to obtain P ¯ , as represented by matrix (13).
\bar{P} = \begin{bmatrix} 0.279 & 0.2713 & 0.2608 & 0.3333 \\ 0.56 & 0.54 & 0.61 & 0.42 \\ 0.09 & 0.08 & 0.09 & 0.17 \\ 0.07 & 0.11 & 0.04 & 0.08 \end{bmatrix}
Step 2: Sum each row of the normalized matrix to obtain $\bar{W}_i$: $\bar{W}$ = [1.1445 2.1261 0.4241 0.3051]T. $\bar{W}$ is then normalized to compute the eigenvector.
Step 3: Calculate the eigenvector (weights), as shown in Equation (14).
\begin{aligned} W_{\mathrm{MAR}} &= 1.1445 / (1.1445 + 2.1261 + 0.4241 + 0.3051) = 0.2861 \\ W_{\mathrm{EAR}} &= 2.1261 / (1.1445 + 2.1261 + 0.4241 + 0.3051) = 0.5315 \\ W_{\mathrm{Pitch}} &= 0.4241 / (1.1445 + 2.1261 + 0.4241 + 0.3051) = 0.1060 \\ W_{\mathrm{Roll}} &= 0.3051 / (1.1445 + 2.1261 + 0.4241 + 0.3051) = 0.0763 \end{aligned}
Step 4: Calculate the maximum eigenvalue of the judgment matrix, as shown in Equation (15).
\lambda_{\max} = \sum_{i=1}^{n} \frac{(PW)_i}{n W_i} = 4.094
Step 5: To perform a consistency check on the judgment matrix, calculate the consistency index (CI): $CI = \frac{\lambda_{\max} - n}{n - 1} = \frac{4.094 - 4}{4 - 1} = 0.031$.
For a 4th-order matrix, the average random consistency index (RI) is 0.89. The random consistency ratio (CR) is calculated as follows: $CR = \frac{CI}{RI} = \frac{0.031}{0.89} = 0.035 < 0.10$.
Given that the consistency is reasonable, the weights can be summarized as shown in Table 6 below.
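The worked example above can be reproduced with a short NumPy sketch (illustrative only); with the judgment matrix from matrix (12) it yields weights close to [0.2861, 0.5315, 0.1060, 0.0763] and CR ≈ 0.035.

import numpy as np

P = np.array([
    [1,   1/2, 3,   4],
    [2,   1,   7,   5],
    [1/3, 1/7, 1,   2],
    [1/4, 1/5, 1/2, 1],
], dtype=float)

P_norm = P / P.sum(axis=0)        # Step 1: normalize each column
W_bar = P_norm.sum(axis=1)        # Step 2: sum each row
W = W_bar / W_bar.sum()           # Step 3: normalized eigenvector (weights)
lam_max = np.mean((P @ W) / W)    # Step 4: maximum eigenvalue

n = P.shape[0]
CI = (lam_max - n) / (n - 1)      # consistency index
RI = 0.89                         # random consistency index for n = 4
CR = CI / RI                      # CR < 0.1 -> acceptable consistency
print(W, lam_max, CR)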

3.3.2. Fuzzy Comprehensive Evaluation

1. Establish a set of evaluation factors
Establish a set of evaluation factors, U = {u1, u2, u3, u4}.
2. Define the evaluation grades
The evaluation grade is represented by V = {V1, V2, V3, …, Vm}. This paper uses five grades to evaluate the driver’s fatigue status level. The five grades correspond to low risk (level Ⅰ), lower risk (level Ⅱ), general risk (level Ⅲ), higher risk (level Ⅳ), and high risk (level Ⅴ).
3. Constructing a fuzzy comprehensive evaluation matrix
By determining the membership of the lower-level evaluation indicators with respect to the indicators at each level of the hierarchy, the fuzzy matrix R is represented by matrix (16):
R = \begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_m \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{bmatrix}
4. Calculate the evaluation factor weight vector
Let W = {a1, a2, a3, …, am}, where ai (the indicator weight) represents the membership of ui (an element of the evaluation factor set) to the fuzzy subset of the evaluated object. The evaluation factor weight vector consists of the weight values of the hierarchy level calculated in the previous section. In accordance with the different membership values, this paper obtains the summary table of the membership matrix.
5. Calculate the comprehensive evaluation results
The evaluation factor weight vector W of the corresponding level is combined with the fuzzy comprehensive evaluation matrix R to calculate the fuzzy comprehensive evaluation result vector B, as shown in Equation (17):
B = W \times R = [a_1, a_2, a_3, \ldots, a_m] \times \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{bmatrix} = [b_1, b_2, b_3, \ldots, b_n]
Based on the maximum membership principle, the overall evaluation result can then be determined and analyzed.

3.4. Situation Prediction

After obtaining the situation evaluation result, which represents the “state” in the situation, it is crucial to focus on predicting the future trend of navigation, referred to as the “trend” in the situation. In the process of situation awareness, time plays a significant role as the situation evolves over time. Predicting future situations relies on capturing the long-term dependence of the situation on time. To address this issue, this section proposes an optimized GRU-based situation prediction method using the improved marine predator algorithm [29,30] to predict drivers’ fatigue levels. GRU (gated recurrent unit) [31,32] is a variant of the LSTM (long short-term memory) network [33,34,35], which has a simpler structure and better performance.
It is currently a popular choice in neural networks. In related fields, neural networks are widely used for situation prediction due to their ability to incorporate past and present data and handle temporal continuity. Recurrent neural networks, such as GRU and LSTM, are well-suited for such problems as they can store historical information. However, a standard RNN suffers from the vanishing gradient problem: gradients diminish as they are propagated back through time, making it difficult for the network to capture long-term dependencies. In response to this challenge, a neural network model capable of retaining information over long periods has emerged, known as long short-term memory (LSTM).
The LSTM (long short-term memory) unit for information retention is more complex, consisting of a cell state, a forget gate, an input gate, and an output gate. This architecture enables LSTM to effectively handle long-term dependencies. Its unit structure is shown in Figure 9. The specific calculation steps of an LSTM unit are as follows:
Step 1: The forget gate decides which information to discard; its output ft is produced by the sigmoid function.
f_t = \sigma(w_f h_{t-1} + w_f x_t + b_f)
where $w_f$ is the forget gate weight, $h_{t-1}$ is the hidden state from the previous time step, and $b_f$ is the bias.
Step 2: First, the previous hidden state $h_{t-1}$ and the current input $x_t$ are passed through the sigmoid function to obtain the input gate output. Then, the tanh activation function is used to obtain the candidate update information.
i_t = \sigma(w_i h_{t-1} + w_i x_t + b_i), \quad \tilde{c}_t = \tanh(w_c h_{t-1} + w_c x_t + b_c)
where $w_i$ is the input gate weight, $b_i$ is the input gate bias, $w_c$ is the weight for the candidate cell state, and $b_c$ is its bias.
Step 3: Update the cell state: the previous cell state $c_{t-1}$ is multiplied element-wise (Hadamard product) by the forget gate output $f_t$, discarding part of the old information, and the gated candidate information is added to obtain the new cell state $c_t$.
c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t
Step 4: Combine the previous hidden state $h_{t-1}$ and the current input $x_t$; the output gate value $o_t$ is obtained through the sigmoid function, and the new hidden state $h_t$ follows.
o_t = \sigma(w_o h_{t-1} + w_o x_t + b_o), \quad h_t = o_t \circ \tanh(c_t)
Due to the complexity of the LSTM module, its computational process is cumbersome, resulting in longer calculation times. To address this issue, Cho et al. proposed GRU (gated recurrent unit) with gated control cycle units.
Figure 10 illustrates the GRU (gated recurrent unit) architecture, which follows the same computational principles as LSTM but reduces the number of gates, effectively reducing the calculation time. GRU achieves comparable performance to LSTM using fewer parameters. It combines the forget and input gates of LSTM into a single update gate. The update gate output is obtained by applying the sigmoid function to the current input and the previous hidden state, as shown in the following equation.
z_t = \sigma(w_z x_t + u_z h_{t-1})
where $w_z$ is the input weight of the update gate and $u_z$ is its recurrent weight.
The reset gate decides how much past information is retained; its output is likewise obtained by a linear transformation followed by the sigmoid function, and the candidate hidden state is then computed with the tanh function.
r_t = \sigma(w_r x_t + u_r h_{t-1}), \quad \tilde{h}_t = \tanh(w_h x_t + r_t \circ u_h h_{t-1})
where $w_r$ and $u_r$ are the input and recurrent weights of the reset gate, and $w_h$ and $u_h$ are those of the candidate hidden state. $\tilde{h}_t$ is the local information extracted through the reset gate. After the update gate, the updated state $h_t$ is obtained.
h_t = z_t \circ h_{t-1} + (1 - z_t) \circ \tilde{h}_t
The utilization of GRU networks is more suitable for addressing problems that require long-term data prediction. However, adjusting the parameter values of the GRU network can be tedious and complex, and manual tuning may not lead to the optimal state. Therefore, an algorithm that optimizes the weight values to improve learning efficiency is necessary. Figure 11 illustrates the process of optimizing the GRU neural network prediction algorithm using the improved marine predator algorithm.
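To make the GRU update rule concrete, the following NumPy sketch implements a single step of Equations (22)–(24); the weight shapes, the 4 input features, and the hidden size of 8 are illustrative assumptions, not the trained network.

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, params):
    w_z, u_z, w_r, u_r, w_h, u_h = params
    z_t = sigmoid(w_z @ x_t + u_z @ h_prev)              # update gate, Eq. (22)
    r_t = sigmoid(w_r @ x_t + u_r @ h_prev)              # reset gate, Eq. (23)
    h_tilde = np.tanh(w_h @ x_t + u_h @ (r_t * h_prev))  # candidate hidden state
    return z_t * h_prev + (1.0 - z_t) * h_tilde          # new hidden state, Eq. (24)

# Example with 4 input features (MAR, EAR, Pitch, Roll) and 8 hidden units.
rng = np.random.default_rng(0)
dims = [(8, 4), (8, 8)] * 3
params = [rng.standard_normal(s) * 0.1 for s in dims]
h = gru_step(rng.standard_normal(4), np.zeros(8), params)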

Improved Marine Predator Algorithm (WMPA)

In the ocean, marine animals exhibit similarities with many animals in nature. They adopt a random walking strategy, wherein the next position or status of predators and prey is dependent on their current location or status. It has been observed that the speed of predators and prey in marine creatures determines their respective predation strategies. The transition between the Lévy flight strategy and Brownian motion strategy enables the identification of optimal strategies for optimization purposes.
In the process of updating the GRU network, the dual adaptive switching probability and the adaptive inertia weight are introduced into the MPA algorithm to speed up convergence and improve the solution accuracy of the algorithm, prevent the marine predator algorithm from falling into local optima, and improve its global optimization performance. The specific improvement methods are as follows:
1. Improvement methods
  • Improvement Method 1: Adaptive switching probability
In this paper, adaptive learning factors are used to solve this problem, and an adaptive switching probability replaces the fixed probability P, as shown in Formula (25). The resulting diversity of predation and walking strategies improves the local and global search capabilities of the algorithm.
P = 0.5 - 0.1 \times \frac{Max\_iter - Iter}{Max\_iter}
“Max_iter” represents the maximum number of iterations, and “Iter” represents the current iteration. The value of the adaptive probability P changes over the iterations, as illustrated in Figure 12.
From Figure 12, it can be observed that there is a correlation between the number of iterations and the dynamics of the adaptive probability. In the early iterations, the algorithm focuses more on global search, which enhances its performance. However, as the number of iterations increases, the algorithm shifts toward conducting more local exploration when it approaches the global optimal solution. This adjustment aims to improve the accuracy of the solution and the overall search efficiency of the algorithm.
  • Improvement Method 2: Dual self-adaptive inertia weight
Inspired by various adaptive weight particle swarm algorithms [36,37], this paper enhances the local and global search capabilities of marine predators through the utilization of dual adaptive inertial weights. Specifically, weight W1 is designed to enhance the overall search capabilities, while weight W2 is intended to improve local search capabilities. These weights can be mathematically represented by Formula (26):
W_1 = \left(1 - \frac{Iter}{Max\_iter}\right)^{1 - \tan\left(\pi (rand - 0.5) \times \frac{rand}{Max\_iter}\right)}; \quad W_2 = \left(2 - \frac{2 \times Iter}{Max\_iter}\right)^{1 - \tan\left(\pi (rand - 0.5) \times \frac{rand}{Max\_iter}\right)}
The variable “rand” represents a random number within the range [0, 1]. The dual inertial weights W1 and W2 have specific ranges: W1 ranges from 0 to 0.5, and W2 ranges from 0.5 to 1.
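A small Python sketch of these two improvements is given below; the exact expressions follow the reconstructions of Formulas (25) and (26) above and should be read as assumptions about the original forms rather than a definitive implementation.

import numpy as np

def adaptive_P(it, max_iter):
    # Adaptive switching probability that replaces the fixed P = 0.5 (Formula (25)).
    return 0.5 - 0.1 * (max_iter - it) / max_iter

def dual_weights(it, max_iter, rng=np.random.default_rng()):
    # Dual adaptive inertia weights (Formula (26)); W1 favors global search,
    # W2 favors local search as the iterations progress.
    r = rng.random()
    exponent = 1.0 - np.tan(np.pi * (r - 0.5) * r / max_iter)
    w1 = (1.0 - it / max_iter) ** exponent
    w2 = (2.0 - 2.0 * it / max_iter) ** exponent
    return w1, w2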
2. Structure of the WMPA algorithm
First, initialize the population:
X_0 = X_{\min} + rand\,(X_{\max} - X_{\min})
In the formula, $X_{\max}$ and $X_{\min}$ are the maximum and minimum values of the variables, and $rand$ is a uniform random vector defined on [0, 1]. Create the “Elite” matrix (28) and the “Prey” matrix (29):
Elite = \begin{bmatrix} X_{1,1}^I & X_{1,2}^I & \cdots & X_{1,d}^I \\ X_{2,1}^I & X_{2,2}^I & \cdots & X_{2,d}^I \\ \vdots & \vdots & \ddots & \vdots \\ X_{n,1}^I & X_{n,2}^I & \cdots & X_{n,d}^I \end{bmatrix}_{n \times d}
Prey = \begin{bmatrix} X_{1,1} & X_{1,2} & \cdots & X_{1,d} \\ X_{2,1} & X_{2,2} & \cdots & X_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n,1} & X_{n,2} & \cdots & X_{n,d} \end{bmatrix}_{n \times d}
In the equation above, X represents the vector of the top predator, which is replicated n times to construct the Elite matrix. “n” denotes the number of populations, “d” represents the dimension of the search space, and “Xi,j” indicates the jth dimension of the ith prey.
The first stage of the improved marine predator algorithm (WMPA) is characterized by the predator being faster than the prey, and it employs an exploration strategy. This stage involves a global search within the solution space. It is particularly suitable for the initial 1/3 iterations of the algorithm, represented by Equation (30).
stepsize_i = R_B \otimes (Elite_i - R_B \otimes Prey_i), \quad i = 1, \ldots, n; \qquad Prey_i = w_1 Prey_i + P \cdot R \otimes stepsize_i
Here, $R_B$ is a vector of random numbers drawn from a normal distribution representing Brownian motion, P = 0.5 is a constant, and R is a vector of uniform random numbers in the range [0, 1].
The second stage of improved marine predator algorithm (WMPA) occurs between 1/3 and 2/3 of the total number of iterations. This stage is divided into two parts, where half of the population is responsible for exploitation, and the other half is responsible for exploration. When Iter/Max_iter < 0.5, they are represented by Formulas (31) and (32), respectively. When Iter/Max_iter > 0.5, they are represented by Formulas (33) and (34), respectively.
When Iter/Max_iter < 0.5,
stepsize_i = R_L \otimes (Elite_i - R_L \otimes Prey_i), \quad i = 1, \ldots, n/2; \qquad Prey_i = w_1 Prey_i + P \cdot R \otimes stepsize_i
stepsize_i = R_B \otimes (R_B \otimes Elite_i - Prey_i), \quad i = n/2, \ldots, n; \qquad Prey_i = w_1 Elite_i + P \cdot CF \cdot stepsize_i
When Iter/Max_iter > 0.5,
stepsize_i = R_L \otimes (Elite_i - R_L \otimes Prey_i), \quad i = 1, \ldots, n/2; \qquad Prey_i = Prey_i + w_2 P \cdot R \otimes stepsize_i
stepsize_i = R_B \otimes (R_B \otimes Elite_i - Prey_i), \quad i = n/2, \ldots, n; \qquad Prey_i = Elite_i + w_2 P \cdot CF \cdot stepsize_i
In this stage, $R_L$ represents a vector of random numbers following a Lévy distribution, and $CF = \left(1 - \frac{Iter}{Max\_iter}\right)^{2\frac{Iter}{Max\_iter}}$ is the adaptive parameter that controls the movement step of the predators.
The third stage of the improved marine predator algorithm (WMPA) focuses on exploitation. It aims to find the optimal solution within the solution space by utilizing the Lévy flight strategy of the predators. The specific process is described by Formula (35):
stepsize_i = R_L \otimes (R_L \otimes Elite_i - Prey_i), \quad i = 1, \ldots, n; \qquad Prey_i = Elite_i + w_2 P \cdot CF \cdot stepsize_i
The vortex formation and fish aggregating device (FADS) effect play a crucial role in helping the algorithm escape from local optima. It is represented by Formula (36):
Prey_i = \begin{cases} Prey_i + CF\,[X_{\min} + R \otimes (X_{\max} - X_{\min})] \otimes U & \text{if } r \le FADs \\ Prey_i + [FADs(1 - r) + r](Prey_{r1} - Prey_{r2}) & \text{if } r > FADs \end{cases}
In the formula, U is a binary vector with elements of 0 and 1, r is a uniformly distributed random number in the range [0, 1], and the subscripts r1 and r2 denote random indices in the prey matrix.
3. The implementation steps of the WMPA algorithm
The steps of the WMPA algorithm are presented in Algorithm 1 as a pseudo code.
Algorithm 1: The pseudo code of the WMPA algorithm
1.Initialize the search agent (prey) population i = 1, …, n
2.While the termination condition is not met, calculate the fitness,
    construct the elite matrix, and realize memory saving
3. If Iter < Max_iter/3
4.  Update the position of the current search agent through Equation (30)
5. else if Iter > Max_iter/3 && Iter < 2*Max_iter/3
6.    For the first half of the population (i = 1, …, n/2)
7.   if (Iter/Max_iter > 0.5)
8.      Update the position of the current search agent through Equation (33)
9.   else Update the position of the current search agent through Equation (31)
10.    end
11.   For the other half of the population (i = n/2, …, n)
12.   if (Iter/Max_iter > 0.5)
13.      Update the position of the current search agent through Equation (34)
14.   else Update the position of the current search agent through Equation (32)
15.   end
16.    else
17.    Update the position of the current search agent through Equation (35)
18. end(if)
19. Use the FADs effect to complete memory saving and elite updates, and update
   according to Formula (36)
20. end(while)

4. Result Analysis

4.1. Datasets

The experimental dataset in this section comprises two parts, totaling 500 samples. The first part of the dataset is derived from the Dlib library, which is used to extract facial features. Fatigue driving state videos from the YawDD dataset are fed into Dlib to extract the driver’s eye, mouth, head pitch angle, and roll angle. These features are then converted into vectors, and Python code is employed to calculate and compile the data statistics. Consequently, a new dataset consisting of 400 samples is obtained. The second part of the dataset is a custom-made collection obtained by recording videos of simulated drivers exhibiting both fatigue and non-fatigue states using a camera. These video streams are subsequently processed using the Dlib library, resulting in 100 samples.

4.2. Situation Evaluation Experiment

A fuzzy evaluation model is constructed, and the corresponding quantification is conducted using a 1 to 5 rating scale. The specific quantification criteria are presented in Table 7 below.
Fifteen sets of data obtained by Dlib face recognition over 5 s are randomly selected as the initial data for situation evaluation; two decimal places are retained, as shown in Table 8.
In this paper, Table 2 presents the principles of the comprehensive judgment of fatigue status, which defines the thresholds for determining fatigue for each indicator. Table 3, on the other hand, represents the fatigue status level judgment. It adopts a method that primarily focuses on facial features with supplementary consideration of head pose, utilizing the Dlib library along with the MAR, EAR, and HPE fatigue detection algorithms to establish the fatigue state thresholds for drivers. By combining these thresholds, the driver’s fatigue level is determined. In light of this, by integrating the explanations provided in Table 2 and Table 3, the paper derives the membership standard table, which serves as the fuzzy rules for the AHP-fuzzy comprehensive evaluation. The resulting membership standard table is presented in Table 9, as follows.
According to the different value standards of the membership degree, this paper utilizes a frequency-based statistical approach to determine the membership degree. The summarized membership degree matrix, Table 10, is obtained. Specifically, based on the fuzzy rules established in Table 9, the original evaluation data from Table 8 are analyzed by counting their frequency of occurrence within different membership degree value ranges. The resulting summarized membership degree matrix, Table 10, reflects the higher membership degree for the original evaluation data that appears more frequently within the defined value ranges.
By performing non-dimensional standardization on the data in the table, we can obtain the non-dimensional standardized matrix of membership degrees:
\begin{bmatrix} 0.8056 & 0.8684 & 0.6452 & 0.9677 \\ 0.6111 & 0.6316 & 0.6452 & 0.9677 \\ 1 & 1 & 1 & 1 \\ 0.5278 & 0.4737 & 0.6452 & 0.4839 \\ 0 & 0 & 0 & 0 \end{bmatrix}
After performing non-dimensional standardization on the data, the next step is to normalize the processed data. The resulting normalized matrix, after transposition, can be denoted as R.
R = \begin{bmatrix} 0.2736 & 0.2075 & 0.3396 & 0.1792 & 0 \\ 0.2920 & 0.2124 & 0.3363 & 0.1593 & 0 \\ 0.2198 & 0.2198 & 0.3407 & 0.2198 & 0 \\ 0.2830 & 0.2830 & 0.2925 & 0.1415 & 0 \end{bmatrix}
Based on the principles of fuzzy evaluation, we can calculate the comprehensive evaluation value.
According to the principles of the analytic hierarchy process (AHP), the weights calculated are denoted as W = [0.2861 0.5315 0.1060 0.0763]T. Based on the method’s principles, we can further calculate the following:
B = W \times R = [0.2861\ \ 0.5315\ \ 0.1060\ \ 0.0763] \times \begin{bmatrix} 0.2736 & 0.2075 & 0.3396 & 0.1792 & 0 \\ 0.2920 & 0.2124 & 0.3363 & 0.1593 & 0 \\ 0.2198 & 0.2198 & 0.3407 & 0.2198 & 0 \\ 0.2830 & 0.2830 & 0.2925 & 0.1415 & 0 \end{bmatrix} = [0.2784\ \ 0.2172\ \ 0.3343\ \ 0.1700\ \ 0]
After the normalization process, the membership degree evaluation values for the “fatigue state level” of the driver are as follows: B = [0.2784 0.2172 0.3343 0.1700 0]. Based on this, we can obtain the final membership degree evaluation table, denoted as Table 11.
According to the maximum membership principle, the result of this fuzzy comprehensive evaluation is Level III, indicating a moderate level of risk.
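The final synthesis step can be reproduced with the following sketch, using the weights and the normalized membership matrix reported above; the maximum membership principle then returns level III.

import numpy as np

W = np.array([0.2861, 0.5315, 0.1060, 0.0763])  # AHP weights (MAR, EAR, Pitch, Roll)
R = np.array([                                  # normalized membership matrix
    [0.2736, 0.2075, 0.3396, 0.1792, 0.0],
    [0.2920, 0.2124, 0.3363, 0.1593, 0.0],
    [0.2198, 0.2198, 0.3407, 0.2198, 0.0],
    [0.2830, 0.2830, 0.2925, 0.1415, 0.0],
])

B = W @ R                      # comprehensive evaluation vector, ~[0.2784, 0.2172, 0.3343, 0.1700, 0]
level = int(np.argmax(B)) + 1  # maximum membership -> fatigue level III here
print(B, level)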

4.3. Situation Prediction Experiment

Based on the evaluation of the driver’s condition, the goal is to predict the driving situation. The fundamental principle of situation prediction is to utilize a neural network approach, which combines the driver’s historical fatigue levels during fatigue driving and the current driver’s state, to forecast future situation levels. This process primarily involves two parts: generating sample data and configuring the parameters of the GRU network.

4.3.1. Sample Settings

Based on the acquisition of facial feature data using Dlib, feature values are recorded every 5 s, with approximately 15–20 sets of data per cycle. A total of 15 groups of samples are obtained. The situation evaluation method is used to update the situation level of each sample's target sequence, denoted as $s^i = [s_1^i, s_2^i, \ldots, s_t^i]$, where $s_1^i$ represents the situation level of the target in the first 5 s interval; the corresponding levels are shown in the table. Since fatigue increases with time during fatigue driving, an evaluation is obtained every 10 s to record the driver's fatigue state over time when the driver is fatigued. This allows the fatigue level to be predicted as it evolves and effectively verifies the long-term dependence of the future "trend" on the past and present "state".

4.3.2. GRU Parameter Settings

A total of 400 targets are selected as the training set, while the remaining 100 targets serve as the test set. The input data for the training set are denoted as $s_{input}^i = [s_1^i, s_2^i, \ldots, s_t^i]$, and the output data are denoted as $s_{output}^i = [s_5^i]$. The number of neurons in the input layer is set to 4, and the number of neurons in the output layer is set to 1. The maximum number of training epochs is set to 500.
After optimizing the parameters using the improved marine predator algorithm, the initial population consists of 30 groups. The maximum number of iterations is set to 1000, and the dimensionality (Dim) is set to 3. The lower boundary (LB) parameters are set as [1 × 10−3, 10, 1 × 10−4], while the upper boundary (UB) parameters are set as [1 × 10−2, 30, 1 × 10−1]. Finally, the target prediction levels are obtained.
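As an illustration of how these search bounds might map onto GRU hyperparameters, the sketch below assumes the three dimensions are the learning rate, the number of hidden units, and a regularization coefficient; this interpretation is an assumption, since only the LB/UB vectors are specified, and the helper names are hypothetical.

import numpy as np

LB = np.array([1e-3, 10, 1e-4])   # lower bounds, Dim = 3
UB = np.array([1e-2, 30, 1e-1])   # upper bounds

def decode(position):
    # Map a WMPA search-agent position onto concrete GRU hyperparameters.
    lr = float(np.clip(position[0], LB[0], UB[0]))
    hidden_units = int(round(np.clip(position[1], LB[1], UB[1])))
    reg = float(np.clip(position[2], LB[2], UB[2]))
    return lr, hidden_units, reg

def fitness(position, train_and_evaluate):
    # train_and_evaluate is a user-supplied routine that trains the GRU with the
    # decoded hyperparameters and returns a validation error to be minimized.
    return train_and_evaluate(*decode(position))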

4.3.3. Simulation Results

1. The improved GRU network training result graph is shown in Figure 13 below.
2. Comparison chart of the prediction results of each algorithm
Figure 14 and Figure 15 depict the confusion matrices and prediction result comparisons for the improved GRU algorithm (a), GRU (b), LSTM (c), and the BP [38,39] algorithm (d).
From the comparison of confusion matrices, it can be observed that our proposed algorithm (improved GRU algorithm) demonstrates superior performance in the classification task compared to other algorithms, particularly for the collected samples of fatigue levels in the test set over a period of time. The algorithm exhibits higher classification accuracy across all categories. This is evident from the confusion matrices of the four algorithms, where the diagonal elements of our proposed algorithm are relatively larger, indicating a higher correct classification rate.
Similarly, in terms of the comparison of classification prediction results, over a certain period of time, for the collected test set samples of fatigue levels, it can be clearly observed that the algorithm used in this paper (improved GRU algorithm) exhibits superior performance compared to other algorithms in the classification task. The classification prediction result comparison plot demonstrates the improved algorithm’s clearer decision boundaries and more accurate classification boundaries.
Through the analysis of Figure 14 and Figure 15, specifically, the comparison plots reveal the following advantages of the proposed algorithm in terms of classification prediction: Firstly, the algorithm can better differentiate between different categories, leading to tighter clustering of samples within the same category while effectively separating samples from different categories. Secondly, compared to the GRU and LSTM, the proposed algorithm reduces the occurrence of misclassifications during the classification prediction process and more accurately assigns samples to the correct categories. Additionally, in comparison to the traditional BP algorithm, the proposed algorithm demonstrates higher accuracy and robustness in classification prediction.
In addition, the experimental data utilized in this paper exhibit an uneven distribution. The sample data comprise five levels, with levels 1 and 5 having fewer instances and levels 2, 3, and 4 having more instances. The BP, LSTM, and GRU algorithms were compared without any optimization. Based on cost and accuracy evaluation, the algorithm proposed in this paper is suboptimal in terms of calculation time due to the lengthy computation required by the improved marine predator algorithm. Among the four neural networks, the proposed algorithm and the GRU algorithm employ a two-gate structure for target prediction, the LSTM algorithm uses a three-gate structure, and the BP algorithm adopts a four-layer network structure to achieve the objective. The four algorithms were applied to 100 randomly selected test samples. Figure 14 and Figure 15 demonstrate the enhanced algorithm and the processing effects of the GRU, LSTM, and BP algorithms from left to right. Through a comparison of the confusion matrices and prediction results, it is evident that the GRU algorithm outperforms the BP and LSTM algorithms in terms of classification accuracy. Furthermore, it is observed that the proposed improved GRU algorithm achieves significantly enhanced accuracy (92%) compared to the non-optimized GRU algorithm. The statistical data confirm its ability to attain more precise prediction accuracy, thereby fulfilling the requirements of fatigue driving detection.

5. Conclusions

In conclusion, although this paper proposes a research method combining multi-feature fusion and situational awareness for determining fatigue driving levels, there is still a certain imbalance in the data indicators affecting fatigue state obtained through the Dlib library, which can have an impact on the experimental results. Therefore, this paper performs a fusion analysis of fatigue state indicators and further categorizes the driver’s fatigue level. By employing fusion analysis techniques, the impact of imbalanced data on model accuracy is reduced.
The proposed fatigue driving detection method, which incorporates multi-feature fusion and trend perception, enables the evaluation of the driver’s fatigue state and provides early warning for predicting future situation levels based on past and present states. The AHP-fuzzy evaluation model constructed in this paper effectively integrates the evaluation indicators that influence driver fatigue state and enables comprehensive evaluation over a specific time period or stage. The fatigue level division standard is utilized in the situation evaluation experiment to conduct frequency statistics and generate a summary table of the affiliation matrix. The AHP-fuzzy comprehensive evaluation model is then implemented to obtain evaluation results. Experimental results demonstrate the feasibility of the model and evaluation method.
Furthermore, a GRU-based situation prediction method, optimized by an improved marine predator algorithm, is employed to predict the driver's fatigue state level. Simulation experiments show that the improved marine predator algorithm is effective in optimizing the parameters of the GRU network, raising the prediction accuracy in the situation prediction stage to 92%; nevertheless, there is still room for improvement, primarily because of the computational time required to search for the global optimum in the improved marine predator algorithm.
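The sketch below illustrates, in heavily simplified form, how such a population-based search can tune GRU hyperparameters against validation loss. It keeps only a single Brownian-style update toward the current best candidate, whereas the improved marine predator algorithm used in this paper includes further phases and the adaptive switching probability of Figure 12; the data arrays, population size, and search bounds are placeholders.

```python
# Simplified marine-predator-style search over two GRU hyperparameters
# (hidden units, learning rate), scored by validation loss. Data are placeholders.
import numpy as np
import tensorflow as tf

X = np.random.rand(500, 10, 4).astype("float32")  # placeholder: 10 time steps x 4 features
y = np.random.randint(0, 5, size=500)             # placeholder fatigue levels 0-4

def fitness(hidden_units, lr):
    """Train a small GRU classifier briefly and return its validation loss."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10, 4)),
        tf.keras.layers.GRU(int(hidden_units)),
        tf.keras.layers.Dense(5, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=float(lr)),
                  loss="sparse_categorical_crossentropy")
    hist = model.fit(X, y, epochs=3, batch_size=32, validation_split=0.2, verbose=0)
    return hist.history["val_loss"][-1]

low, high = np.array([16.0, 1e-4]), np.array([128.0, 1e-2])   # search bounds
prey = low + np.random.rand(6, 2) * (high - low)               # population of 6 candidates
scores = np.array([fitness(*p) for p in prey])
best, best_score = prey[scores.argmin()].copy(), scores.min()

for _ in range(5):                                             # a few search iterations
    step = np.random.randn(*prey.shape) * (best - prey)        # Brownian-style move toward best
    prey = np.clip(prey + 0.5 * np.random.rand(*prey.shape) * step, low, high)
    scores = np.array([fitness(*p) for p in prey])
    if scores.min() < best_score:
        best, best_score = prey[scores.argmin()].copy(), scores.min()

print("Selected hyperparameters (hidden units, learning rate):", best)
```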
In this paper, the Dlib library was utilized for driver fatigue state analysis and fatigue driving level determination. However, the accuracy and performance of the algorithms provided by the Dlib library are crucial in safety-critical applications. Therefore, comprehensive evaluation and validation are necessary to ensure the accuracy and reliability of using the Dlib library for feature extraction in analyzing the driver’s fatigue state and fatigue driving level, meeting the required safety standards and regulations. Certification and compliance requirements also need to be considered, including ISO standards, ASPICE certification in the automotive industry, and road traffic safety regulations. It is essential to ensure that the Dlib library complies with these requirements and can pass the corresponding validation and testing processes. Furthermore, robustness and reliability in different scenarios, such as varying lighting conditions, driver poses, and facial expressions, should be considered for fatigue detection using the Dlib library. Thorough testing and validation are necessary to ensure the stability and robustness of the algorithms.
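For reference, a minimal sketch of the Dlib-based feature extraction discussed above is given below. It assumes the standard 68-point model file shape_predictor_68_face_landmarks.dat and a sample frame frame.jpg; it locates the facial landmarks (Figure 1) and computes the eye aspect ratio (EAR), with the mouth aspect ratio (MAR) obtained analogously from the mouth landmarks. The EAR threshold in the last line is taken from Table 2.

```python
# Minimal sketch of landmark-based EAR extraction with Dlib.
# Assumes "shape_predictor_68_face_landmarks.dat" and a sample image "frame.jpg".
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = dlib.load_rgb_image("frame.jpg")
faces = detector(img, 1)                     # upsample once to help with small faces

def eye_aspect_ratio(pts):
    """EAR = (||p2-p6|| + ||p3-p5||) / (2 * ||p1-p4||) for one eye."""
    p1, p2, p3, p4, p5, p6 = pts
    return (np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)) / (2.0 * np.linalg.norm(p1 - p4))

for face in faces:
    shape = predictor(img, face)
    pts = np.array([[shape.part(i).x, shape.part(i).y] for i in range(68)], dtype=float)
    ear = (eye_aspect_ratio(pts[36:42]) + eye_aspect_ratio(pts[42:48])) / 2.0  # both eyes
    print("EAR =", round(ear, 3), "-> tired" if ear <= 0.2 else "-> not tired")  # Table 2 threshold
```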
To summarize, this paper employed the Dlib library for feature extraction and proposed a method for fatigue detection and the determination of fatigue driving level using multi-feature fusion and situational awareness. However, it is important to acknowledge the challenges in using the Dlib library in safety-critical applications. Thorough evaluation, validation, and testing are essential to ensure compliance with safety standards and requirements. Collaboration with domain experts, conducting field tests and validations, and potential further research and development are crucial for ensuring the safety and reliability of the chosen solution.
In conclusion, this paper focuses on driver fatigue level classification and proposes an innovative method based on the Dlib library and fatigue detection algorithms. The method combines multi-feature fusion with analytic hierarchy process (AHP)-fuzzy comprehensive evaluation for fatigue level assessment. By optimizing the GRU network with an improved marine predator algorithm, the method achieves significant improvements in fatigue level prediction. Experimental results demonstrate high accuracy and real-time performance across various scenarios.
Accurately assessing driver fatigue levels is vital for enhancing road safety and reducing accidents. This paper provides an innovative method that comprehensively evaluates driver fatigue levels, offering a scientific basis for implementing appropriate measures. The research has implications for driver health and safety, as well as relevance to traffic management and policy making.
It is important to address limitations in the research, such as evaluating the credibility and security of driver facial recognition using the Dlib library and ensuring network security during data transmission. Further comparative analyses with other studies and leveraging deep learning techniques for facial and head feature analysis can enhance the assessment of driver fatigue levels. Additionally, incorporating sensor data, such as heart rate and electroencephalogram, can provide more precise determinations of driver fatigue levels.

Author Contributions

Conceptualization, F.-F.W.; data curation, F.-F.W.; formal analysis, F.-F.W.; funding acquisition, X.C.; investigation, F.-F.W.; methodology, T.C.; project administration, X.C.; resources, X.C. and T.C.; software, F.-F.W.; supervision, X.C.; validation, T.C.; visualization, F.-F.W.; writing—original draft, F.-F.W. and T.C.; writing—review and editing, F.-F.W. and X.C. All authors have read and agreed to the published version of the manuscript.

Funding

The research reported herein was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 71571091 and 71771112.

Data Availability Statement

Data are unavailable due to privacy restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Facial Landmarks with 68 Key Points.
Figure 2. Schematic Diagram of Eye Aspect Ratio Calculation.
Figure 3. Schematic Diagram of Mouth Aspect Ratio Calculation.
Figure 4. Diagram of the three-dimensional change direction of the head posture.
Figure 5. The relationship between the three coordinate systems.
Figure 6. Endsley Situational Awareness Concept Diagram.
Figure 7. Evaluation Index System of Fatigue State Situation.
Figure 8. AHP-fuzzy comprehensive evaluation map.
Figure 9. LSTM unit structure.
Figure 10. GRU unit structure.
Figure 11. GRU situation prediction process.
Figure 12. Adaptive switching probability.
Figure 13. The results of the algorithm training in this paper.
Figure 14. Confusion matrix comparison diagram of the four algorithms: (a) Improved GRU, (b) GRU, (c) LSTM, and (d) BP algorithm.
Figure 15. Comparison of the prediction results of the four algorithms: (a) Improved GRU, (b) GRU, (c) LSTM, and (d) BP algorithm.
Table 1. Commonly used fatigue detection methods and evaluation.
Detection Method                                 Description                                                                     Accuracy Rate
Detection based on driver physiological data     Detection of changes in the driver's physiological signals                      Good
Detection based on vehicle operating status      Detection of the driver's operating behavior and changes in vehicle behavior    Average
Facial detection based on computer vision        Detection of changes in the driver's facial state                               Good
Table 2. Principles of comprehensive judgment of fatigue status.
Fatigue Judgment    MAR         EAR         |Pitch|     |Roll|
Not tired           [0, 0.6)    (0.2, 1]    [0, 20)     [0, 15.4)
Tired               [0.6, 1]    [0, 0.2]    [20, 70]    [15.4, 36]
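A minimal sketch of the per-indicator judgment in Table 2 is shown below: each indicator is flagged as a fatigue feature when its value falls inside the corresponding fatigue interval. The sample call uses group 8 of Table 8; the function name and structure are illustrative, not the authors' code.

```python
# Minimal sketch of the per-indicator fatigue judgment defined in Table 2.
def indicator_flags(mar, ear, pitch, roll):
    """Return which indicators fall inside the 'tired' ranges of Table 2."""
    return {
        "MAR":   0.6 <= mar <= 1.0,
        "EAR":   0.0 <= ear <= 0.2,
        "Pitch": 20.0 <= abs(pitch) <= 70.0,
        "Roll":  15.4 <= abs(roll) <= 36.0,
    }

# Example: group 8 of Table 8 -> MAR and |Pitch| are flagged as fatigue features.
print(indicator_flags(mar=0.79, ear=0.62, pitch=35.21, roll=10.10))
```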
Table 3. Fatigue status level judgment.
Fatigue State Level    Driver's Face and Head Performance
Level I                The face shows no particular sign of fatigue, or only one fatigue feature is present on the head
Level II               The face presents only one fatigue feature, or the head presents two fatigue features
Level III              The face and head each present one fatigue feature, or the head presents two fatigue features
Level IV               The face presents two fatigue features and the head only one, or the head presents two fatigue features and the face only one
Level V                The face presents two fatigue features, and two fatigue features appear on the head
Level VI               Arbitrary behavior without a seat belt or airbags
Table 4. Label meaning.
Scaling       Meaning
1             Represents equal importance between the two elements being compared
3             Represents one element being slightly more important than the other element being compared
5             Represents one element being noticeably more important than the other element being compared
7             Represents one element being strongly more important than the other element being compared
9             Represents one element being extremely more important than the other element being compared
2, 4, 6, 8    Represent relative importance between the two elements being compared at levels between the above descriptions
Table 5. The value of RI.
n     1    2    3       4       5       6       7       8       9       10
RI    0    0    0.52    0.89    1.12    1.26    1.36    1.41    1.46    1.49
Table 6. Weight summary table.
Index     MAR       EAR       Pitch     Roll
Weight    0.2861    0.5315    0.1060    0.0763
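The sketch below illustrates the AHP consistency check that the RI values of Table 5 feed into, together with the derivation of priority weights such as those summarized in Table 6: CI = (λ_max − n)/(n − 1) and CR = CI/RI, with CR < 0.1 indicating acceptable consistency. The pairwise comparison matrix shown is illustrative, not necessarily the one used in the paper.

```python
# Minimal sketch of the AHP weight derivation and consistency check.
import numpy as np

A = np.array([                  # illustrative pairwise comparisons over MAR, EAR, |Pitch|, |Roll|
    [1,   1/2, 3,   4  ],
    [2,   1,   5,   6  ],
    [1/3, 1/5, 1,   2  ],
    [1/4, 1/6, 1/2, 1  ],
], dtype=float)

eigvals, eigvecs = np.linalg.eig(A)
lam_max = eigvals.real.max()                       # principal eigenvalue
w = np.abs(eigvecs[:, eigvals.real.argmax()].real)
w /= w.sum()                                       # priority weights (cf. Table 6)

n, RI = A.shape[0], 0.89                           # RI for n = 4 from Table 5
CI = (lam_max - n) / (n - 1)
CR = CI / RI                                       # CR < 0.1 -> acceptable consistency
print("weights:", np.round(w, 4), " CR =", round(CR, 4))
```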
Table 7. Risk Decision Standard.
Grade        Attributes             Judgement Standard
Level I      Low risk               Very high degree of safety
Level II     Lower risk             High degree of safety
Level III    General risk           Average degree of safety
Level IV     Higher risk            Low degree of safety
Level V      Extremely high risk    Unsafe
Table 8. Original data table.
Groups    MAR     EAR     |Pitch|    |Roll|
1         0.53    0.31    3.26       5.22
2         0.42    0.10    10.01      3.32
3         0.35    0.32    15.23      5.54
4         0.21    0.14    21.36      6.65
5         0.33    0.34    25.14      8.41
6         0.42    0.40    30.41      6.21
7         0.54    0.33    33.50      9.32
8         0.79    0.62    35.21      10.10
9         0.86    0.54    10.00      6.24
10        0.74    0.52    5.35       5.36
11        0.62    0.41    2.32       3.85
12        0.41    0.70    1.14       5.37
13        0.33    0.66    3.36       7.21
14        0.29    0.35    5.45       3.46
15        0.41    0.14    4.33       5.17
Table 9. Standard table of affiliation.
Grade        MAR         EAR         |Pitch|     |Roll|
Level I      [0, 0.6)    (0.2, 1]    [0, 20)     [0, 15.4)
             [0, 0.6)    (0.2, 1]    [20, 70]    [0, 15.4)
             [0, 0.6)    (0.2, 1]    [0, 20)     [15.4, 36]
Level II     [0.6, 1]    (0.2, 1]    [0, 20)     [0, 15.4)
             [0, 0.6)    [0, 0.2]    [0, 20)     [0, 15.4)
             [0, 0.6)    (0.2, 1]    [20, 70]    [15.4, 36]
Level III    [0.6, 1]    (0.2, 1]    [20, 70]    [0, 15.4)
             [0.6, 1]    (0.2, 1]    [0, 20)     [15.4, 36]
             [0, 0.6)    [0, 0.2]    [20, 70]    [0, 15.4)
             [0, 0.6)    [0, 0.2]    [0, 20)     [15.4, 36]
             [0.6, 1]    [0, 0.2]    [0, 20)     [0, 15.4)
Level IV     [0.6, 1]    [0, 0.2]    [0, 20)     [15.4, 36]
             [0.6, 1]    [0, 0.2]    [20, 70]    [0, 15.4)
             [0.6, 1]    (0.2, 1]    [20, 70]    [15.4, 36]
             [0, 0.6)    [0, 0.2]    [20, 70]    [15.4, 36]
Level V      [0.6, 1]    [0, 0.2]    [20, 70]    [15.4, 36]
Table 10. Statistics of frequency of affiliation.
Grade        MAR    EAR    |Pitch|    |Roll|
Level I      33     36     25         30
Level II     26     27     25         30
Level III    40     41     36         31
Level IV     23     21     25         15
Level V      4      3      5          0
Table 11. Comprehensive evaluation form of affiliation.
Driver Fatigue State Level      Level I    Level II    Level III    Level IV    Level V
Overall evaluation situation    0.2784     0.2172      0.3343       0.1700      0