Detection Model on Fatigue Driving Behaviors Based on the Operating Parameters of Freight Vehicles

Abstract: Whether in developing or developed countries, traffic accidents caused by freight vehicles are responsible for more than 10% of all traffic accident deaths. Fatigue driving is one of the main causes of freight vehicle accidents. Existing fatigue driving studies mostly use vehicle operating data from experiments or simulations, which limits the validity and reliability of the resulting models. This study collected a large quantity of real driving data and extracted sample data under different fatigue degrees. The vehicle operating parameters that differ significantly across driver fatigue degrees were selected. The k-nearest neighbor algorithm was used to establish the detection model of fatigue driving behaviors, taking into account the influence of the number of training samples and other parameters on the accuracy of fatigue driving behavior detection. Using one month of operating data collected from 50 freight vehicles, the fatigue driving behavior detection model based on the k-nearest neighbor algorithm proposed in this paper and a model based on the commonly used BP neural network were tested. The results showed that the accuracy of both models reaches 75.9%, but the fatigue driving detection model based on the k-nearest neighbor algorithm is more reliable.


Introduction
According to a report by the National Highway Traffic Safety Administration [1], 11.2% of road traffic accidents in the United States in 2015 involved at least one large freight vehicle. In developing countries such as India, the situation is even worse. The National Crime Records Bureau of India (NCRB) [2] records that in 2015, commercial freight vehicles accounted for 19.4% of the total fatalities in traffic accidents. At the same time, with the acceleration of globalization, the demand for road freight transportation will further increase, and drivers often sacrifice sleep and rest time to drive for long hours under the pressure of making a living. Fatigue driving is one of the main causes of traffic accidents involving freight vehicles [3].
The fatigue state of freight vehicle drivers has unique characteristics because of the nature of their occupation. In 2013, the US Centers for Disease Control and Prevention [4] stated that commercial freight drivers were more likely to be drowsy or fatigued than other drivers. Through questionnaires, Feyer, A.M. [5] found that the driving experience of long-distance freight drivers and long-distance bus drivers only partially overlapped. Most freight drivers said that they often felt fatigue, while relatively few bus drivers said that fatigue was the main problem of their driving. Fitzharris, M. et al. [6] found that, compared with providing no feedback when fatigue was detected, providing real-time feedback to the company and the cockpit could effectively reduce the occurrence of fatigue driving events. In summary, it is of great significance to quickly and accurately detect the fatigue driving behavior of freight vehicle drivers.
At present, research on freight vehicle fatigue driving mainly focuses on the correlation between driver fatigue and macro factors such as daily work and rest, and on the use of facial, physiological, and vehicle operation data to detect fatigue driving [25][26][27][28]. However, these approaches have three drawbacks. First, it is difficult to calibrate some parameters accurately. For example, in the sample training process, some basic parameters and training function parameters are randomly generated, the training efficiency is low, and the influence of the number of training samples and other parameters on the model detection accuracy and model stability is seldom considered. Second, the intermediate learning process cannot be observed and the output results are difficult to explain, which affects the credibility and acceptability of the results. Finally, the requirements for data quality and content are relatively high, which makes practical application more difficult.
With the extensive application of various vehicle operation management systems, vehicle operation data and drivers' facial monitoring video on the actual road can be acquired in real time and stored for a long time, which greatly improves the richness and objectivity of data sources [29]. At the same time, the k-nearest neighbor algorithm is intuitive in principle and simple to apply. Its result depends on the surrounding training samples, and the established model has strong stability [30]. Therefore, this paper collects a large amount of continuous operating data of freight vehicles and driver video data under real road traffic conditions, extracts fatigue driving detection parameters from the vehicle operating data, and uses the k-nearest neighbor algorithm to build a more accurate and stable fatigue driving detection model.
The paper is organized as follows. Section 2 extracts driver fatigue degree data, preprocesses the vehicle operating data, extracts the vehicle operating parameters that are significant to the fatigue degrees, and constructs the fatigue driving detection model. Section 3 compares and evaluates the performance of the model established in this paper by substituting data into the two models. Section 4 presents the conclusion.

Extraction of Driver Fatigue Degrees Data
The expert scoring method based on facial video is the most practical method for evaluating the fatigue degrees of drivers. In this method, a group of trained experts scores the fatigue degree of drivers according to their facial expressions and head posture [31]. The evaluation standard for driver fatigue degree used in this paper is shown in Table 1. The fatigue degree is determined as follows. Each 20 s clip of the collected driver's facial video is called a sample. Each sample is scored by three experts according to the evaluation criteria of the driver fatigue degree. If the three experts give the same score, this consensus result is taken as the fatigue degree of the sample. If the scores are inconsistent, the three experts hold a collegial discussion. If they agree after the discussion, the agreed result is used as the fatigue degree of the sample. If they still disagree, the sample is discarded.

Data Preprocessing
The vehicle operating parameters are sampled at intervals of 1 s, and the driving duration is calculated from the vehicle operating time and speed values. Data collection errors and short temporary stops do not relieve driver fatigue. Therefore, when the speed is continuously 0 km/h for no more than 200 s, the vehicle is regarded as being in a temporary stop state and is still considered to be in operation. If the calculated driving duration of a segment is less than 300 s, the vehicle is regarded as not running. In addition, if the driving duration of a segment is longer than 300 s but the speed at every moment of the segment is lower than 10 km/h, the vehicle is also regarded as not running. Because of the large amount of data, a Python program is used to calculate the driving duration and extract the start and end times of each driving segment.
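The segmentation rules above can be sketched in Python as follows. This is only an illustrative sketch, not the program used in the study: the function name `driving_segments`, its parameter names, and the choice to keep a segment of exactly 300 s are assumptions, and the speed series is assumed to be a plain list with one value per second.

```python
def driving_segments(speeds, max_stop=200, min_drive=300, min_speed=10.0):
    """Return (start, end) index pairs, end exclusive, of valid driving segments.

    speeds: speed in km/h, one value per second (assumed input format).
    Stops of at most max_stop seconds stay inside a segment.
    """
    segments = []
    start = None   # index where the current candidate segment began
    zero_run = 0   # length of the current run of 0 km/h samples
    for i, v in enumerate(speeds):
        if v > 0:
            if start is None:
                start = i
            zero_run = 0
        elif start is not None:
            zero_run += 1
            if zero_run > max_stop:
                # The stop is too long: close the segment at the first zero.
                segments.append((start, i - zero_run + 1))
                start = None
                zero_run = 0
    if start is not None:
        # Close the last segment, dropping any trailing zeros.
        segments.append((start, len(speeds) - zero_run))
    # Discard segments that are too short or never reach min_speed.
    return [(s, e) for s, e in segments
            if e - s >= min_drive and max(speeds[s:e]) >= min_speed]
```

With a 1 s sampling interval, the index pairs map directly to the start and end times of each driving segment.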
The operating parameters of the vehicle obtained directly from the system mainly include heading angle, roll angle, speed, lateral acceleration, and longitudinal acceleration. Differentiating the heading angle, roll angle, and speed with respect to time gives the heading angular velocity, roll angular velocity, and acceleration, so that eight operating parameters of the vehicle are obtained. As the degree of fatigue deepens, the driver's ability to perceive, react, and judge becomes weaker, resulting in abnormal fluctuations in vehicle control variables and state variables. These eight parameters are therefore used to monitor whether there is a deviation in the trajectory of the vehicle, whether the vehicle is driving smoothly, and whether there are behaviors such as sudden braking, rapid acceleration, sharp turning, and rapid lane change. Taking the derivation of the heading angular velocity β from the heading angle α as an example, since the vehicle operating parameter is a smooth discrete function and the information density is large enough (the time interval is 1 s), the increment of the independent variable is ∆t = 1 s, and the specific calculation is Formula (1) [32].
Take the vehicle heading angle α as an example:

$$\beta = \frac{\alpha_t - \alpha_{t-1}}{\Delta t} = \alpha_t - \alpha_{t-1} \tag{1}$$

where β is the heading angular velocity of the vehicle at the current moment t; $\alpha_t$ is the heading angle value at moment t; and $\alpha_{t-1}$ is the vehicle heading angle value at moment t − 1.
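As a small illustration of this backward difference, the following Python sketch derives a rate series from a sampled series. The function name `backward_difference` is hypothetical, and the sketch does not handle the wrap-around of the heading angle (for example from 359° to 1°), which the text does not discuss.

```python
def backward_difference(values, dt=1.0):
    """Rate of change between consecutive samples: (v[t] - v[t-1]) / dt.

    The result has one element fewer than the input, because the first
    sample has no predecessor.
    """
    return [(values[i] - values[i - 1]) / dt for i in range(1, len(values))]


# Hypothetical heading angles in degrees, sampled once per second.
heading = [10.0, 12.0, 11.5, 11.5]
heading_rate = backward_difference(heading)
```

The same function can be applied to the roll angle and the speed to obtain the roll angular velocity and the acceleration.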
On the basis of obtaining the eight operating parameters of the vehicle, the absolute mean value and standard deviation of each parameter are calculated under different time windows (5 s, 10 s, 15 s, 20 s, and 30 s), so that a total of eighty operating parameters of the vehicle are obtained. Taking the heading angle α as an example, the calculation method is shown in Formulas (2) and (3).
$$\alpha_{mean} = \frac{1}{N}\sum_{i=1}^{N} \left|\alpha_i\right| \tag{2}$$

where $\alpha_{mean}$ is the absolute mean value of the heading angle; N is the length of the time window, with N values of 5, 10, 15, 20, and 30; and $\alpha_i$ is the heading angle value of the i-th sample.
In Formula (3), $\alpha_{STD}$ is the standard deviation of the heading angle, and $\alpha_m$ is the mean value of the heading angles in the time window.
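The two window statistics can be sketched in Python as follows. The function names are hypothetical, and the standard deviation is computed in its population form (dividing by N), which is an assumption, since the choice between N and N − 1 cannot be read from the text.

```python
import math


def window_features(values, n):
    """Absolute mean and standard deviation over the last n samples."""
    window = values[-n:]
    if len(window) < n:
        raise ValueError("not enough samples for this time window")
    abs_mean = sum(abs(v) for v in window) / n
    mean = sum(window) / n
    # Population form (divide by n); an assumption of this sketch.
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return abs_mean, std


def all_window_features(values, windows=(5, 10, 15, 20, 30)):
    """Both statistics for every time window used in the paper."""
    return {n: window_features(values, n) for n in windows}
```

Applying `all_window_features` to each of the eight operating parameters yields 8 × 5 × 2 = 80 candidate parameters, which matches the count given above.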

Parameter Extraction
First, the normal distribution test and the variance homogeneity test are performed on the vehicle operating parameters under different fatigue degrees. When these two conditions are met, one-way analysis of variance (ANOVA) is used to study the difference of vehicle operating parameters under different fatigue degrees. When they are not met, the Friedman test is used instead [33,34]. Then, the Bonferroni-adjusted multiple comparison test is used to compare the differences of each parameter across the three fatigue degrees. The significance level of the one-way ANOVA and the Friedman test in this study is 0.05, and the significance level of the Bonferroni-adjusted multiple comparison test is 0.01. The test results show that fatigue status has a significant impact on the following nine vehicle operating parameters, p < 0.001 (see Table 2).

Model Construction
The k-nearest neighbor algorithm is used to construct the fatigue driving detection model. The core idea of the algorithm is that if the fatigue degree of a sample is unknown and most of its k nearest samples in the feature space belong to a certain fatigue degree, then the sample also belongs to this degree. The k value is usually an integer not greater than 20 [35]. To enhance the accuracy and reliability of the k-nearest neighbor algorithm, this work considers the influence of the k value and the number of training samples on the accuracy of fatigue detection.
The extracted sample set X of different fatigue degrees constitutes an n × (m + 1) matrix (see Formula (5)). Since each vehicle operating parameter represents a different attribute, and the value of each parameter has a different order of magnitude and value range, it is necessary to normalize each column of parameter values before using the algorithm. In this paper, the maximum-minimum method is used for normalization (see Formula (6)), and the normalized sample set X* of three fatigue degrees is obtained [36].

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} & x_{1,m+1} \\ x_{21} & x_{22} & \cdots & x_{2m} & x_{2,m+1} \\ \vdots & \vdots & & \vdots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} & x_{n,m+1} \end{bmatrix} \tag{5}$$

where X is the sample set; $x_{kt}$ is the t-th parameter value in the k-th sample (k = 1, 2, ..., n; t = 1, 2, ..., m); and $x_{k,m+1}$ is the fatigue degree of the k-th sample.

$$x^*_{kt} = \frac{x_{kt} - x_{t(\min)}}{x_{t(\max)} - x_{t(\min)}} \tag{6}$$

where $x^*_{kt}$ is the t-th parameter value in the k-th sample in the sample set X*; $x_{t(\min)}$ represents the minimum value of the t-th parameter in the sample set X; and $x_{t(\max)}$ represents the maximum value of the t-th parameter in the sample set X.
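A minimal Python sketch of this column-wise maximum-minimum normalization is given below. The function name `min_max_normalize` is hypothetical, each row is assumed to hold the m parameter values followed by the fatigue degree, and mapping a constant column to 0.0 is a choice of this sketch that the text does not specify.

```python
def min_max_normalize(samples):
    """Scale each parameter column to [0, 1]; the last column (label) is kept.

    samples: list of rows [x_1, ..., x_m, fatigue_degree].
    """
    m = len(samples[0]) - 1
    mins = [min(row[t] for row in samples) for t in range(m)]
    maxs = [max(row[t] for row in samples) for t in range(m)]
    normalized = []
    for row in samples:
        feats = [
            (row[t] - mins[t]) / (maxs[t] - mins[t]) if maxs[t] > mins[t] else 0.0
            for t in range(m)
        ]
        normalized.append(feats + [row[m]])
    return normalized
```

Normalizing before the distance calculation prevents a parameter with a large numeric range from dominating the Euclidean distance.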
The improved k-nearest neighbor algorithm is solved as follows:

1. Extract the training sample set and the test sample set. First, the first l samples are extracted from the sample set X* to constitute a test sample set P whose fatigue degree is treated as unknown. Each test sample is denoted as $x^*_r$ (r = 1, 2, ..., l). Then, q samples are extracted from the remaining samples as the training sample set Q with known fatigue degree, where q ≤ n − l. Each training sample is denoted as $x^*_{l+s}$ (s = 1, 2, ..., q). Finally, the k value in the k-nearest neighbor algorithm is determined.

2. For each test sample $x^*_r$ whose fatigue degree is unknown, perform the following operations in sequence:
   a. Calculate the distance between each sample $x^*_{l+s}$ in the training set and the test sample $x^*_r$, using the Euclidean distance (see Formula (7)).

      $$d_{r,l+s} = \sqrt{\sum_{t=1}^{m} \left(x^*_{rt} - x^*_{l+s,t}\right)^2} \tag{7}$$

      where $d_{r,l+s}$ is the Euclidean distance between the test sample $x^*_r$ and the training sample $x^*_{l+s}$; m is the number of parameters in each sample; $x^*_{rt}$ is the t-th parameter value of the test sample $x^*_r$; and $x^*_{l+s,t}$ is the t-th parameter value of the training sample $x^*_{l+s}$.
   b. Sort the distances between the training samples and the test sample in ascending order.
   c. Select the first k training samples with the smallest distance from the test sample.
   d. Determine the frequency of occurrence of each fatigue degree in the first k training samples.
   e. Take the fatigue degree with the highest frequency in the first k samples as the fatigue degree of the test sample.

3. Compare whether the actual fatigue degree of each test sample is consistent with the predicted fatigue degree. Then count the correct predictions for each fatigue degree in the test sample set.

4. Adjust the number q of training samples, and then repeat steps (1), (2), and (3). The cycle ends when all the training sample numbers (q values) have been executed. The q value ranges from 0 to n − l.

5. Change the k value and then repeat from step (1).
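The steps above can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' program: all function names are hypothetical, samples are assumed to be `(features, label)` pairs that are already normalized, ties in the vote are resolved in favor of the label that appears first among the nearest neighbors, and q must be at least 1 in this sketch.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two parameter vectors (Formula (7))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, features, k):
    """Majority vote among the k training samples nearest to `features`."""
    nearest = sorted(train, key=lambda s: euclidean(s[0], features))[:k]
    counts = Counter(label for _, label in nearest)
    return counts.most_common(1)[0][0]


def accuracy(train, test, k):
    """Share of test samples whose predicted label equals the actual one."""
    correct = sum(knn_predict(train, f, k) == y for f, y in test)
    return correct / len(test)


def sweep(samples, l, k_values, q_values):
    """Steps 4 and 5: accuracy for every (k, q) combination.

    The first l samples form the test set; the first q of the remaining
    samples form the training set.
    """
    test, rest = samples[:l], samples[l:]
    return {(k, q): accuracy(rest[:q], test, k)
            for k in k_values for q in q_values}
```

For the data in this paper, a call such as `sweep(samples, 79, range(1, 20, 2), range(1, 318))` would cover the odd k values up to 19 and every training set size up to 317.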

Model Performance Evaluation
To evaluate the performance of the model, the prediction results for each fatigue degree are first tallied and listed in a confusion matrix (see Table 3). Then, the Precision, True Positive Rate (TPR), and True Negative Rate (TNR) of each fatigue degree are calculated (see Formulas (12)-(14)).

$$Precision = \frac{TP}{TP + FP} \tag{12}$$

$$TPR = \frac{TP}{TP + FN} \tag{13}$$

$$TNR = \frac{TN}{TN + FP} \tag{14}$$

where TP is the number of correctly predicted samples that actually belong to the fatigue degree; FP is the number of samples that do not actually belong to the fatigue degree but are incorrectly predicted as that degree; FN is the number of samples that belong to the fatigue degree but are mispredicted as other degrees; and TN is the number of correctly predicted samples that do not belong to the fatigue degree.
The BP neural network model can also effectively detect fatigue driving behavior. Therefore, to test the effectiveness of the proposed method, the fatigue driving behavior detection model based on the k-nearest neighbor algorithm is compared with the existing fatigue driving behavior detection model based on the BP neural network.
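The three indexes can be computed per fatigue degree from a confusion matrix, as in the following Python sketch. The function name `per_class_metrics` is hypothetical, the matrix is assumed to have actual degrees in rows and predicted degrees in columns, and the numbers in the usage example are made up for illustration and are not the values of Table 5 or Table 8.

```python
def per_class_metrics(confusion, labels):
    """Precision, TPR, and TNR for each class of a square confusion matrix.

    confusion[i][j]: number of samples of actual class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    result = {}
    for c, name in enumerate(labels):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                  # actual c, predicted other
        fp = sum(row[c] for row in confusion) - tp   # predicted c, actual other
        tn = total - tp - fn - fp
        result[name] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "tpr": tp / (tp + fn) if tp + fn else 0.0,
            "tnr": tn / (tn + fp) if tn + fp else 0.0,
        }
    return result


# Illustrative numbers only.
example = per_class_metrics(
    [[8, 2, 0], [1, 6, 3], [0, 2, 8]],
    ["alert", "fatigue", "severe fatigue"],
)
```

Treating each fatigue degree in turn as the positive class is what allows the binary definitions of TP, FP, FN, and TN to be applied to a three-class problem.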

Data Source
The research data were derived from nearly one month of vehicle operating data and drivers' facial videos of 50 freight vehicles, provided by the vehicle cloud control platform of a logistics company. The driving paths are expressways in Shandong Province. Taking the Yanhai Expressway as an example, the driving path is shown in Figure 1. Through the evaluation of the drivers' fatigue degree and the extraction of significant parameters of vehicle operating data, a total of 396 samples were obtained, with 132 samples for each fatigue degree. Each sample contains the nine vehicle operating parameters that are significantly affected by the fatigue degree and one driver fatigue degree.
For the convenience of the research, the order of the 396 samples in the sample set X* was randomly shuffled. The first 79 samples (20% of the total sample number) were taken as the model test sample set, that is, l = 79 in Formula (7). There were 24 alert samples, 24 fatigue samples, and 31 severe fatigue samples. The remaining samples served as the training sample set.
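The shuffle and split described above can be sketched in Python as follows. The function name `shuffle_and_split` and the fixed seed are assumptions made so that the sketch is reproducible; the study does not state how the shuffle was seeded.

```python
import random


def shuffle_and_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and take the first part as the test set."""
    rng = random.Random(seed)   # fixed seed: an assumption for reproducibility
    shuffled = list(samples)
    rng.shuffle(shuffled)
    l = int(len(shuffled) * test_fraction)
    return shuffled[:l], shuffled[l:]


# With 396 samples, 20% rounds down to l = 79 test samples and 317 remain.
test_set, remaining = shuffle_and_split(list(range(396)))
```

Because the shuffle is random, the test set is not stratified, which is consistent with the unequal class counts (24, 24, and 31) reported for the test samples.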

Discussion of Experimental Results
In order to test the effectiveness of the fatigue driving behavior detection model based on the k-nearest neighbor algorithm, the model (referred to as Model 1) was compared with the existing fatigue driving behavior detection model based on the BP neural network (referred to as Model 2). Both models considered the influence of the number of different training samples and other parameters on the detection accuracy.

1. Analysis of the experimental results of Model 1
A Python program was developed to implement the fatigue driving detection model based on the k-nearest neighbor algorithm. From the data obtained above, the parameters of the model are as follows: the number of test samples is l = 79, the number of training samples is q ≤ 317, and the number of vehicle operating parameters significantly affected by fatigue degree is m = 9. Since the value of k is usually not greater than 20, the values 1, 3, 5, 7, 9, 11, 13, 15, 17, and 19 are tested. The accuracy of fatigue driving behavior detection as a function of the number of training samples is shown in Figure 2. The range of training sample numbers corresponding to the maximum detection accuracy is shown in Table 4.
Figure 2 and Table 4 show that when the k value is small (1 or 3), the detection accuracy fluctuates greatly with the number of training samples, and the accuracy is relatively low. When the k value is larger (5, 7, 9, 11, 13, 15, 17, or 19) and the number of training samples is greater than 70, the detection accuracy improves as the number of training samples increases. Once the number of training samples is greater than 200, the accuracy exceeds 70.0% and remains stable. When the k value is 7 and the number of training samples is in the range 284-306, the accuracy of fatigue driving detection reaches the maximum value of 75.9%. Therefore, the k value of the model should be set to 7, and when the ratio of the test set to the training set is between 2:7 and 2:8, the accuracy is high with strong stability, so that the requirements for training samples are low.
When the detection accuracy reaches the maximum, the detection results of each fatigue degree are shown in Table 5. The algorithm is evaluated by calculating the Precision, true positive rate (TPR), and true negative rate (TNR) of each type of fatigue degree prediction (see Table 6). As can be seen from Table 6, when the driver is alert, the three index values are all greater than or equal to 87.5%, indicating that the algorithm can distinguish the driver's alert state from the non-alert state well. When the driver is in the state of fatigue or severe fatigue, the TNR is larger, while the Precision and TPR are smaller, indicating that the algorithm has a higher probability of correctly identifying the state of non-fatigue or non-severe fatigue, but a lower probability of correctly identifying the state of fatigue or severe fatigue. In addition, combined with Table 5, it can be seen that the fatigue state and the severe fatigue state are easily confused in detection. There are two reasons for this result. First, the video-based scoring method used to evaluate the driver's fatigue state involves subjective factors. Second, the vehicle operating data of the fatigue state and the severe fatigue state have no significant difference. On the whole, the model test results are reasonable.

2. Analysis of the experimental results of Model 2
A Matlab program was developed to implement the fatigue driving detection model based on the BP neural network (see Figure 3). From the data obtained above, the parameters of the model are as follows: the number of test samples is l = 79, and the number of training samples is q ≤ 317. The number of input layer nodes (vehicle operating parameters), the number of output nodes (fatigue degree), the number of hidden layer nodes, and the learning rate are set as 9, 3, midnum, and 0.1, respectively. The neural network structure of 9 − midnum − 3 is adopted, and the training times are set as 10, 50, 100, 200, and 500, respectively [37].
Random numbers in (−1, 1) are selected as the initial values of the connection weights among the neurons of the input layer, hidden layer, and output layer, as well as the initial values of the thresholds of the hidden layer and output layer [38,39]. The Sigmoid activation function is used as the excitation function of the hidden-layer neurons. According to the constraints on the number of hidden-layer nodes in the BP neural network algorithm, the maximum and minimum numbers of nodes are 7 and 3, respectively. Therefore, the number of hidden-layer nodes ranges from 3 to 7 [40]. With the training times set as 10, 50, 100, 200, and 500, respectively, different numbers of hidden-layer nodes are used to obtain the accuracy of fatigue driving detection under different ratios of training samples to test samples. Whatever the value of the training times, the number of hidden-layer nodes is 7 when the detection accuracy reaches the maximum. Therefore, the number of hidden-layer nodes is set to 7. In addition, the model places higher demands on the training data. When the ratio of test samples to training samples is 4:6, 3:7, or 2:8, the detection accuracy is above 70%, yet it fluctuates greatly among these ratios.
When the training times are small (10 or 50; see Figures 4 and 5), the detection accuracy reaches 75.9%. However, the fluctuation is large, the ratio of training samples to detection samples is high, and the reliability of the model is poor. When the training times are 100 or 200 (see Figures 6 and 7), the detection accuracy fluctuates greatly at first and then fluctuates between 60.0% and 75.9% as the number of training samples increases. However, there are still some large sudden changes in the intermediate detection accuracy, resulting in poor reliability. When the training times are 500 (see Figure 8), the accuracy is generally stable and fluctuates between 60.0% and 75.9%. Reliability is generally acceptable, but the number of training samples corresponding to the maximum detection accuracy is still small. In addition, the model needs to consider many parameters. Some parameters are randomly generated, so that one run may give a result inconsistent with the previous run. The model structure is complex, which leads to a long running time.
To reflect the overall effect of the model, with the training times set as 10, 50, 100, 200, and 500, respectively, and the number of hidden nodes set to 7, the mean value of the detection accuracy over the different numbers of training samples is calculated (see Table 7). It can be seen from Table 7 that when the training times are 50, the mean value of the detection accuracy is very small, less than 60%. For the other training times, the average detection accuracy is relatively large, about 64%, but the overall stability of the model is poor. When the detection accuracy reaches the maximum, the detection results of each fatigue degree are shown in Table 8. The algorithm is evaluated by calculating the Precision, TPR, and TNR of each type of fatigue degree prediction (see Table 9). Comparing Table 9 with Table 6, it can be seen that there is little difference in the detection results of the two models. When the driver is in the alert state, the evaluation index values of the algorithm are relatively large, so the algorithm can correctly distinguish the alert state from the non-alert state. In the fatigue state and the severe fatigue state, the TNR is larger, while the Precision and the TPR are smaller, indicating that the algorithm has a higher probability of correctly identifying the state of non-fatigue or non-severe fatigue, but a lower probability of correctly identifying the state of fatigue or severe fatigue.

3. Comparative analysis of the two models
The above results show that the accuracy of both models can reach 75.9% with suitable parameter values and numbers of training samples. The fatigue driving detection model based on the k-nearest neighbor algorithm has reasonable detection results. As the number of training samples increases, the accuracy of fatigue driving detection increases regularly, and the number of training samples corresponding to the maximum detection accuracy is in a certain interval. Therefore, the detection model based on the k-nearest neighbor algorithm is simple in principle and reliable. The fatigue driving behavior detection model based on the BP neural network has a complicated structure, which takes a long time to detect. Its parameters are randomly selected, and the model places higher demands on the training data, leading to great fluctuation of the detection results and insufficient reliability. The maximum accuracy values of the two models are the same, and the detection accuracies for the individual driver fatigue degrees are also very similar.

Conclusions
This study uses vehicle operating data and drivers' facial video data to screen nine operating parameters of vehicles that are significantly affected by fatigue degree. A fatigue driving detection model is established using the k-nearest neighbor algorithm, and different model parameters are set to optimize the model. The results show that compared with the fatigue driving detection model established by the BP neural network, the fatigue driving detection model established by the k-nearest neighbor algorithm has obvious advantages in detection accuracy and stability. The principle is simple, and there are fewer parameters, which are easy to calibrate accurately. When the number of training samples reaches a certain value, the detection accuracy is basically maintained at 75.9%.
However, the careful analysis of detection accuracy of each type of fatigue degree in this study shows that due to driver's human factors, the significance of the vehicle operating parameters in the fatigue state and the severe fatigue state is weak, which leads to the detection results of the two fatigue states being easily confused. This may be different from a real situation, requiring further experimental observation and data analysis.
The limitations of the study are also reflected in the following two aspects. First, the present method relies on fatigue ratings from an independent panel of experts. The next step will therefore be to build a model that needs no such rating, by training a model for a single driver and then using the same model to predict that driver's fatigue state in the future. Second, the current method cannot be carried out on a real-time basis, so the next step is to extend the model to real-time detection of the fatigue state by sending a fatigue state detection signal to the supervisor, so that the driver can be ordered to stop driving and rest.

Data Availability Statement:
The data used to support the study have not been made available because they involve the commercial confidentiality of Tianjin SOTEREA Automotive Technology Co., Ltd. The data include vehicle operating data and drivers' facial videos.