ECG Signal Features Classification for the Mental Fatigue Recognition

Abstract: Mental fatigue is a major public health issue worldwide that is common among both healthy and sick people. In the literature, various modern technologies, together with artificial intelligence techniques, have been proposed for its detection. Most techniques consider complex biosignals, such as electroencephalogram or electro-oculogram, or the classification of basic heart rate variability parameters. Additionally, most studies focus on a particular area, such as driving or surgery. In this paper, a novel approach is presented that combines electrocardiogram (ECG) signal feature extraction, principal component analysis (PCA), and classification using machine learning algorithms. With the aim of daily mental fatigue recognition, an experiment was designed wherein ECG signals were recorded twice a day: in the morning, i.e., a state without fatigue, and in the evening, i.e., a fatigued state. PCA results show that ECG signal parameters, such as Q and R wave amplitude values, as well as QT and T intervals, showed the largest differences between states compared to other ECG signal parameters. Furthermore, the random forest classifier achieved more than 94.5% accuracy. This work demonstrates the feasibility of ECG signal feature extraction for automatic mental fatigue detection.


Introduction
Fatigue is a phenomenon that has not been conventionally defined and relates, in particular, to reactions to various loads and conditions, including experiences and states of mind. Fatigue is also defined as a subjective lack of physical and/or mental energy perceived by an individual to interfere with their usual or desired activities [1].
Usually, fatigue is a state associated with a weakening or depletion of an individual's physical and/or mental resources, ranging from a general state of lethargy to a specific burning sensation in a particular muscle. Physical fatigue leads to an inability to continue functioning at a normal level of activity. Mental fatigue is a state of tiredness that sets in when brain energy levels are depleted. In the literature, fatigue is differentiated into six types: social, emotional, physical, pain, mental, and chronic illnesses; furthermore, these types are often distinguished in terms of physical and mental fatigue [2].
Many people experience mental fatigue (MF) in daily life or work activities that require sustained mental efficiency [3]. MF can be defined as a psychobiological state caused by prolonged episodes of cognitive exertion [4]. Overwork-related disorders, such as cerebrovascular/cardiovascular diseases, diabetes, and cancer, are major health issues worldwide [5,6]. However, fatigue is a common symptom in both sick and healthy people [7]. Fatigue is one of the most crucial factors contributing to decreased performance among aircraft pilots; car drivers [8]; individual athletes [9]; and team sport athletes.
The structure of this article is as follows: Section 2 highlights the recent literature on mental fatigue detection using various biosignal classification methods and other techniques. The experimental design, data description, and applied methods are presented in Section 3. Section 4 consists of data analysis using the PCA method and an analysis of the performance of ML algorithms. Finally, a discussion and conclusions are presented in Section 5.

Related Work
Modern wearable devices, such as eye-tracking technologies, are becoming increasingly popular. Li et al. [31] demonstrated the feasibility of applying wearable eye-tracking technology to identify and classify mental fatigue in construction equipment operators. The Toeplitz inverse covariance-based clustering (TICC) method was used to determine multiple levels of mental fatigue, and the classification task was performed using support vector machine (SVM) methods. However, this study consisted of a narrow target group and might not be applicable in other fields.
In [32], EEG and HRV signals were observed and analyzed to detect the impacts of prolonged cognitive activity on the central nervous system and the autonomic nervous system. EEG signal wavelet packet parameters and HRV spectral indices were combined to measure changes in mental fatigue. Although 91% classification accuracy was achieved, using two separate devices for EEG and HRV recordings is inefficient and barely feasible in daily life activities. Furthermore, EEG signals are likely to be contaminated by muscle artifacts, which may lead to incorrect interpretation. For this reason, various filtering and feature extraction methods have been proposed [33,34]. Preprocessed EEG signals can be used in multilevel fatigue recognition tasks. In [35], EEG signals were classified using a K-nearest neighbor (KNN) classifier, achieving 100% accuracy. These results demonstrate the feasibility of using EEG signals and extracted features to successfully detect mental fatigue. However, in this case, the data were collected using a driving simulator and a brain cap with 32 electrodes placed on the skin surface, a setup that may not be applicable in real-world environments.
Portable single-channel electrocardiogram equipment ("LaPatch") was used in [5] to record and analyze ECG signals. Eight heart rate variability (HRV) indicators were considered and classified using SVM, KNN, naïve Bayes (NB), and logistic regression (LR) models. Although the technique is promising, due to the small sample size, only 75.5% accuracy was achieved. In another study, researchers developed an automatic mental stress detection system based on ECG signals recorded from T-shirts and analyzed using machine learning (ML) classifiers: decision tree (DT), random forest (RF), NB, and LR [6]. The best-performing model achieved an accuracy of 94.1%. However, in this research only mental stress detection was considered, and the same technique may not be applicable to mental fatigue recognition.
Wearable devices for HRV recordings are usually user-friendly and convenient. Furthermore, they do not require electrodes to be attached directly to the skin surface. Many studies have focused on heart rate (HR) and time- or spectral-domain HRV analysis. For example, in [36], mental and physical fatigue detection methods were applied based on HR, HRV, skin temperature, and pulse. Causal convolutional neural networks (cCNN) and RF models were used to detect and distinguish between mental and physical fatigue. However, only 66.2% accuracy was achieved in the mental fatigue recognition task. Other similar research used a Polar H10 chest strap and photoplethysmography (PPG) technology for HRV detection [37]. Results were compared with those obtained with a Bittium Faros™ 360 device, which records a single ECG lead. Furthermore, the study included several watches, such as the Actigraph wGT3X-BT, Garmin, and Polar Vantage V. Various time- and spectral-domain HRV parameters were estimated and compared. However, no decision-making or fatigue recognition techniques were applied.
Modern wearable electronics have been developed in recent years, such as epidermal electronics systems (EES) and electronic tattoos (E-tattoos), with which ECG signals, respiration rate, and galvanic skin responses (GSR) can be recorded [38]. Using three ML models (SVM, KNN, and DT), the obtained signals were classified with 89% accuracy. Although these technologies are promising, the equipment has not been fully tested and prepared for production. A transparent eye detection system can also be considered a modern wearable device [39]. Such a system can capture pupil movement and detect blinking based on the light that is reflected from the eye. A summary of these and similar wearable devices and corresponding research in recent literature is presented in Table 1.

Table 1. Wearable devices and corresponding research in recent literature.

| Device | Reference | Participants | Signals | Methods |
|---|---|---|---|---|
| | [18] | Young healthy adults | IMU, HRV | RF, SVM, LR |
| Neuroscan 32-channel system | [32] | Healthy adults | EEG, ECG, HRV | Spectral analysis, SVM |
| Everion device | [36] | Healthy adults | HRV, skin temperature | Statistical analysis, CNN, RF |
| Polar H10 chest strap | [37] | Military members | HRV, ECG | Statistical analysis |
| Neuroscan Scan 4.3 | [15] | Drivers (men) | EEG | SVM, DT, RF, KNN, and others |
| | [20] | Train drivers | EOG, EEG, ECG | Correlation analysis, ANOVA, SVM, PCA |
| Driving simulator and brain cap | [35] | Drivers | EEG | KNN |
| E-tattoos | [38] | Healthy adults | ECG, respiration, GSR | SVM, KNN, DT |
| Transparent eye detection system | [39] | National Aeronautics | Eye blinks | Statistical analysis |

Because HRV analysis cannot achieve high accuracy in mental fatigue recognition and EEG signals are not reliable in daily life activities, in this paper, a novel framework is proposed for mental fatigue detection that involves analysis and classification of ECG signal features. Furthermore, principal component analysis (PCA) is applied to distinguish between ECG signal parameters in two different states (in the morning, i.e., a non-fatigued state, and in the evening, i.e., a fatigued state). The classification task is performed using several models: KNN, LDA, DT, and RF.

Materials and Methods
In this section, we describe the proposed data analysis and classification processes that are essential for mental fatigue recognition. The whole process flow consists of five main parts: ECG signal recording, ECG signal preprocessing, feature extraction, PCA analysis, and ML performance (see Figure 1). The experiment was designed with ECG signal registration twice a day (in the morning and in the evening). Because HRV analysis and whole-ECG-signal classification techniques failed on the mental fatigue recognition task, we propose extracting ECG signal features and applying classification algorithms, such as KNN, DT, and RF, to those features only. Before implementation of a machine learning technique, PCA analysis was applied to make sure that there were significant differences between ECG signal features in separate states. This research was conducted with the approval of the Kaunas Regional Research Ethics Committee under the project name "Various directionalities on physical exercise effects that are based on differential learning methodology, and impact on heart and cardiovascular system" (biomedical ethics permission number BE-2-38, Lithuania).

ECG Signal Characteristics and Data Analysis
In this study, various ECG signal features were analyzed and classified. The protocol consisted of two 60 s recordings of each participant. These recordings enabled the detection of differences in ECG signal parameters, which were estimated at the beginning of the day and in the evening. The V5 lead was selected in this research (see an example in Figure 2), with each parameter representing a separate component of heart activity (see Table 2). All ECG signal features are visible in properly filtered data. Numerous methods can be applied to ECG signal preprocessing, such as moving average (MA), exponential smoothing, or linear Fourier transformation. Usually, biological signals are contaminated with various environmental disturbances. The main purpose of signal filtering algorithms is to separate the signal into informative components and undesirable noise. Furthermore, biological signals that are recorded during movement are highly contaminated by various disturbances, and sometimes, noise overlaps the signal itself. The main problem associated with movement-contaminated signals is non-stationary, low-frequency noise (a trend resulting from movement artifacts). In such cases, ordinary filtering methods for signal processing are insufficient or unreliable. In this research, ECG signals were recorded while each participant was standing so that only small movement artifacts might affect the signal. Therefore, a Butterworth filter was used for noise reduction [40].

Table 2. ECG signal features and causes of electrical impulses in the heart (based on [41]).

| Wave Type and Parameter | Heart Activity |
|---|---|
| Q wave | The anteroseptal part of the myocardial ventricle is activated |
| R wave | Depolarization of myocardial ventricles |
| S wave | The posterior diaphragmatic part of the ventricles is activated |
| T wave | Rapid ventricular repolarization |
| QT interval | Time required for the electrical system to fire an impulse through the ventricles and then recharge |
| ST interval | The initial, slow phase of ventricular repolarization |
| RR interval | Time elapsed between two successive R waves of the QRS signal on an electrocardiogram (and its reciprocal, the HR); a function of intrinsic properties of the sinus node, as well as autonomic influence |
| QRS complex | A combination of the Q wave, R wave, and S wave; the "QRS complex" represents ventricular depolarization |

ECG signal preprocessing continues with feature extraction (ECG parameter estimation). ECG feature extraction starts with R peak detection and QRS complex identification. All other parameters, such as Q and S peaks, RR interval, and T wave, are based on R peaks or QRS complex positions. In this research, 9 ECG parameters were estimated: Q, R, S, and T amplitudes; QT, ST, RR, and QRS intervals; and T-wave intervals. All ECG features were estimated using the NeuroKit2 toolbox in the Python programming language [42].
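The relation between R-peak positions, RR intervals, and HR can be illustrated with a short sketch. It assumes that R peaks have already been located as sample indices in a recording sampled at 500 Hz (the rate reported for the recordings in this study). The function name and the peak positions are hypothetical; they are not part of NeuroKit2 and are not study data.

```python
def rr_intervals_and_hr(r_peak_samples, sampling_rate_hz=500):
    """Return RR intervals (s) and instantaneous heart rate (beats/min).

    r_peak_samples: increasing sample indices of detected R peaks
    (hypothetical input; in practice these come from an R-peak detector).
    """
    if len(r_peak_samples) < 2:
        raise ValueError("at least two R peaks are needed")
    rr_s = [
        (later - earlier) / sampling_rate_hz
        for earlier, later in zip(r_peak_samples, r_peak_samples[1:])
    ]
    # HR is the reciprocal of the RR interval, scaled to beats per minute.
    hr_bpm = [60.0 / rr for rr in rr_s]
    return rr_s, hr_bpm


# Illustrative R-peak positions (in samples), not measured data.
peaks = [100, 500, 900, 1400]
rr, hr = rr_intervals_and_hr(peaks)
print(rr)  # RR intervals in seconds
print(hr)  # heart rate in beats per minute
```

Interval features such as QT or ST could be converted from samples to seconds in the same way, by dividing a sample difference by the sampling rate.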

Research Design and Data Acquisition
In this research, a CardioScout Multi device was used to record ECG signals and transmit them to mobile devices or tablets. The signal sampling frequency was 500 Hz, and each segment was 60 s long. In this article, the analyzed experiments comprise data recorded twice a day (in the morning and in the evening) for signal parameter classification and fatigue recognition. In Table 3, two different states are defined: A1 in the morning and A2 in the evening. In total, 60 healthy adults (aged between 24 and 34 years) without a diagnosis of health pathologies or overwork-related problems were recruited. In this research, 8271 measurements were estimated from the ECG signal recordings of the 60 participants: 4195 corresponding to state A1 and 4076 corresponding to state A2.

Data Description and Visualization
As mentioned in the previous section, two states were analyzed in this study (A1 in the morning and A2 in the evening). All parameter data were normalized by subtracting the mean and dividing by the standard deviation. This type of data normalization is needed to eliminate differences in the individual heart rate characteristics of each person. For example, some participants may have higher (or lower) ECG signal amplitude values compared to others in both states (in the morning and in the evening), which may affect classification results by indicating a fatigued state in both datasets. Furthermore, normalization increases data integrity without distorting differences in the ranges of values. The distribution and scatter plots are shown in Figure 3. Pearson correlation coefficients are presented in Figure 4 (Y represents the state: a value of 1 corresponds to the fatigued condition, or state A2, and a value of 0 corresponds to the fatigue-free condition, or state A1).
Comparing data from different states, clear differences can be noticed. For example, histograms of the Sa parameter look similar, but A2 data are shifted, with higher values compared to state A1 values (see Figure 3). Furthermore, some ECG signal parameter values overlap. For example, there is no significant difference between states A1 and A2 in terms of RR interval values. Therefore, typical HRV analysis fails in mental fatigue detection, with low classification accuracies.
Some ECG parameters are strongly correlated, as shown in Figure 4 (for example, the Pearson correlation coefficient for ST and QT reaches 0.98). Additionally, a strong dependence (Pearson correlation coefficient > 0.8) is evident between Tint and QT or ST. Although some parameters may be eliminated in the classification step, it is not clear which parameters have a greater impact on classification accuracy. In the initial stage of the classification process, all ECG signal features were included (all 9 ECG parameters).
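The normalization and correlation steps described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the use of the population standard deviation is an assumption, the function names are hypothetical, and the sample values are made up.

```python
import statistics


def z_score(values):
    """Standardize values: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population SD; an assumption of this sketch
    if std == 0:
        raise ValueError("standard deviation is zero")
    return [(v - mean) / std for v in values]


def pearson_r(x, y):
    """Pearson correlation as the mean product of z-scored values."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    zx, zy = z_score(x), z_score(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Made-up interval values (ms) for two parameters; not study data.
qt = [380.0, 395.0, 402.0, 410.0, 388.0]
st = [250.0, 262.0, 270.0, 281.0, 255.0]
print(round(pearson_r(qt, st), 3))
```

After standardization each parameter has zero mean and unit standard deviation, so parameters measured in different units (amplitudes and intervals) become comparable before PCA or classification.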

Principal Component Analysis
Principal component analysis (PCA) was applied to distinguish between ECG features in the morning and in the evening. The general idea behind this technique is to obtain new latent variables based on the original data. The newly defined principal components reflect directions of maximal variance of the projected data and form a new orthonormal basis of the original vector space.
PCA is commonly used to reduce the dimensions of the collected data matrix by choosing k principal components (PC1, PC2, ..., PCk) and evaluating the amount of information explained by the chosen components as follows:

$$\frac{\sum_{i=1}^{k} \lambda_i}{\operatorname{Tr}(C)},$$

where $\lambda_i$ is the i-th eigenvalue of the covariance matrix (C), and Tr(C) is its trace, i.e., the sum of all entries on the main diagonal. Furthermore, the original data are projected to the low-dimension hyperplane spanned by components PC1, PC2, ..., PCk, thus extracting the essential information from the initial data cloud. In this study, three principal components were considered, and the obtained results were visualized via 3D plots to emphasize the desired differences in mental states.
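As a small worked illustration of the explained-information ratio, the sketch below computes the share of variance captured by the k largest eigenvalues. It relies on the fact that the trace of a covariance matrix equals the sum of its eigenvalues. The function name and the eigenvalues are hypothetical and do not reproduce the values obtained in this study.

```python
def explained_variance(eigenvalues, k):
    """Fraction of total variance captured by the k largest eigenvalues.

    The trace of a covariance matrix equals the sum of its eigenvalues,
    so the denominator below plays the role of Tr(C).
    """
    if not 1 <= k <= len(eigenvalues):
        raise ValueError("k must be between 1 and the number of eigenvalues")
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical eigenvalues for 9 standardized features (they sum to 9).
eigs = [3.6, 2.1, 1.2, 0.8, 0.5, 0.4, 0.2, 0.1, 0.1]
for k in (1, 2, 3):
    print(k, round(explained_variance(eigs, k), 3))
```

With standardized features, each feature contributes a variance of 1, so the trace equals the number of features; this is why the illustrative eigenvalues sum to 9.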

Machine Learning Technique
The use of social media, smartphones, smartwatches, computers, and even portable devices provides big data about various mental and physical health disorders. Effective algorithms for big data processing are usually based on machine learning (ML) techniques.
Various ML algorithms have been created for data classification and prognosis. There are three main categories: supervised learning, examples of which include support vector machine (SVM), k-nearest neighbors (KNN), decision tree (DT), and random forest (RF); unsupervised learning, which includes neural networks (NN) and clustering; and semi-supervised learning, which includes methods such as semi-supervised SVM, mixed models, etc. [43].
Supervised learning is used to analyze the labeled data and make predictions or classify data into different categories, whereas unsupervised learning methods can learn from unlabeled data and extract similar patterns. The third group is semi-supervised learning, which involves the analysis of data with and without labels; such methods are used when there is not enough labeled data for classification or prognosis.
One way to evaluate the potential accuracy of predictions is to use a confusion matrix [44]. The entries in this matrix indicate the correctness of the prediction or classification of distinct categories compared to actual observed values. To evaluate the quality of the selected predictions or data classifications, additional measurements can be considered. Two widely used standard statistics are accuracy (acc) and F1 score. These are estimated using Equation (2):

$$\mathrm{acc} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad F1 = \frac{2\,TP}{2\,TP + FP + FN}, \tag{2}$$

where TN corresponds to true-negative elements after prediction (correctly predicted as negative), TP represents true-positive elements (correctly predicted as positive), FP represents false-positive elements, and FN represents false-negative elements. Although popular in ML analysis, both acc and F1 ignore the size of each category in the confusion matrix. Therefore, an additional statistic called the Matthews correlation coefficient (MCC) was measured [45]. This coefficient is calculated as follows:

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}.$$

The value of this coefficient is in the range [−1; 1], where −1 is interpreted as the worst-case scenario, whereas 1 is the best possible value.
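The three statistics can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions of accuracy, F1 score, and MCC; the function name and the counts are hypothetical and are not results from this study.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, F1 score, and MCC from binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a marginal total is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"acc": acc, "f1": f1, "mcc": mcc}


# Hypothetical confusion-matrix counts, not results from this study.
metrics = binary_metrics(tp=90, tn=85, fp=15, fn=10)
print(metrics)
```

Unlike acc and F1, the MCC uses all four counts symmetrically, which is why it is less sensitive to unequal class sizes.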
Additionally, in 1960, J. Cohen introduced a statistic that accounts for the level of agreement expected by chance, i.e., the level at which a prediction is only as accurate as a simple guess. This statistic is called Cohen's Kappa (κ) and can be expressed as follows:

$$\kappa = \frac{2\,(TP \cdot TN - FN \cdot FP)}{(TP + FP)(FP + TN) + (TP + FN)(FN + TN)}.$$

Three main intervals are considered: if κ > 0.75, then the value is viewed as perfect; if κ is in the range of [0.4, 0.75], the value is sufficient; and if κ < 0.4, it is considered weak [46].
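Cohen's Kappa can also be written as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement (the accuracy) and p_e is the agreement expected by chance from the row and column totals of the confusion matrix. The sketch below uses this form together with the three interpretation intervals quoted above. The function names and the counts are hypothetical and do not come from this study.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a binary confusion matrix."""
    total = tp + tn + fp + fn
    p_observed = (tp + tn) / total
    # Chance agreement from the row and column totals of the confusion matrix.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total**2
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


def interpret_kappa(kappa):
    """Map kappa to the three intervals quoted in the text [46]."""
    if kappa > 0.75:
        return "perfect"
    if kappa >= 0.4:
        return "sufficient"
    return "weak"


kappa = cohen_kappa(tp=95, tn=94, fp=6, fn=5)  # hypothetical counts
print(round(kappa, 3), interpret_kappa(kappa))
```

For two classes of equal size, chance agreement is 0.5, so κ is roughly twice the margin by which accuracy exceeds 50%.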
In this research, we compared multiple ML methods, revealing that the RF algorithm classifies signal parameters with the highest accuracy. For RF algorithms in the feature extraction process, the Gini coefficient needs to be estimated. If n is defined as the number of samples in node t and each node has c classes, then the number of samples belonging to class i is $n_i$. The ratio $p(i|t)$ is expressed as:

$$p(i|t) = \frac{n_i}{n}.$$

In this case, the Gini coefficient G for each node is defined as [47]:

$$G = 1 - \sum_{i=1}^{c} p(i|t)^2.$$

Generally, the RF classifier is based on DT and consists of three main steps: (1) input all data into root nodes for every DT; (2) minimize the Gini coefficient by dividing data into separate nodes; and (3) recursively repeat all steps at each node that needs to be split until the root mean square error (RMSE) value for the node falls below a threshold value or the tree reaches a defined depth.
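The Gini coefficient of a node, G = 1 − Σ p(i|t)², and its use for comparing candidate splits can be sketched as follows. The function names and the toy labels are hypothetical; the sample-weighted average of the child nodes is the usual criterion in decision-tree implementations and is assumed here rather than taken from the text.

```python
from collections import Counter


def gini(labels):
    """Gini coefficient G = 1 - sum_i p(i|t)^2 for the labels in one node."""
    n = len(labels)
    if n == 0:
        raise ValueError("node is empty")
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini_after_split(left_labels, right_labels):
    """Sample-weighted Gini of two child nodes; lower means a purer split."""
    n = len(left_labels) + len(right_labels)
    return (
        len(left_labels) / n * gini(left_labels)
        + len(right_labels) / n * gini(right_labels)
    )


# Toy node with two classes (0 = state A1, 1 = state A2); not study data.
node = [0, 0, 0, 1, 1, 1]
print(gini(node))                                        # evenly mixed node
print(weighted_gini_after_split([0, 0, 0], [1, 1, 1]))   # perfectly separating split
print(weighted_gini_after_split([0, 0, 1], [0, 1, 1]))   # weaker split
```

A tree-growing procedure would evaluate such candidate splits for each feature and threshold and keep the one with the lowest weighted Gini value.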
RF may consist of many separate decision trees that are trained concurrently using random data samples. This type of RF is also called a bagged tree algorithm. DT models that are trained consecutively are called boosted trees. In this case, every DT model learns from the errors of the previous model. Usually, this type of RF has more nodes [48]. Like any other classifier, the random forest algorithm requires two datasets: one for training and one for testing. In ML techniques, providing more data generally leads to higher classification accuracy. Additionally, in every ML technique, overfitting of training data should be considered, as it may negatively affect algorithm performance, thereby reducing prediction accuracy. Cross validation can be applied to avoid overfitting. This method involves splitting data into different groups and estimating the classification accuracy for each group. In this case, the training dataset is divided into two groups: a training set and a validation set. If cross validation is performed several times, different data samples are assigned to the validation subset in each iteration [49].
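The splitting scheme behind k-fold cross validation can be sketched with the standard library alone. In this minimal illustration, the function name, the seed, and the sample count are hypothetical; it only shows how indices are partitioned, not how a classifier is trained.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx


splits = list(k_fold_indices(n_samples=25, k=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

A model would be fitted on each `train_idx` subset and scored on the matching `val_idx` subset, and the k scores would then be averaged.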

PCA Implementation
In this research, PCA was applied in two ways. First, all collected data were considered at once. Then, the data were split into morning and evening subsets, and PCA was performed on each subset separately.
Considering all collected data, the first principal component mainly reflects the interval parameters, i.e., Tint, QT, and ST, whereas the second principal component mainly covers the amplitude parameters, i.e., Qa and Ra. These results are consistent with expectations, as during casual daytime activity, the amplitude of the heart signal changes less than the intervals between two waves. A summary of the PCA results for the whole dataset is presented in Table 4. A 3D plot of the obtained results was drawn using RStudio tools (see Figure 5). A significant difference was observed between the two groups: yellow dots represent the morning state, and blue pyramids represent the evening state. The change in parameters is more noticeable in the evening subset. Moreover, the evening data can be further grouped into several clusters, whereas the morning data are mainly concentrated in the center.
To explore the visible difference between the two investigated states, the data were split into two separate subsets. The obtained PCA results for each of the individual states show that the interval parameters are strongly represented in the first component. This influence remains consistent, regardless of the considered state. On the other hand, the influence of the amplitude parameters changed significantly. This difference is even more obvious in the second principal component, where these parameters outweigh most other parameters in the case of the morning data subset. The explicit expressions of the first three principal components are presented for each of the states, along with the percentage of data explained by these components in parentheses (see Table 5).
Based on the PCA results for the two considered states, we assume that the fatigue factor is represented by the significant changes in the influence of the amplitude parameters on the first and second principal components. Moreover, substantial changes occurred in the distribution of points in the 3D plots obtained for the first three principal components for each of the states (see Figure 6).

Machine Learning Performance
At the beginning of this research, multiple ML algorithms were compared (see Table 6 and Figure 7) using all 9 ECG parameters. The input dataset was split into training with validation (70%) and testing (30%) subsets. Analysis of the data shows that allocating less data to the testing subset slightly reduces the overall accuracy of the KNN, DT, and RF algorithms. For example, reducing the testing dataset to 20% of the total data reduced the accuracy of the RF model by 2%. This may also result in lower quality and feasibility of the selected classifiers. To ensure that the models did not overfit the training data, 10-fold cross validation was applied for all ML methods. Figure 7 shows the validation accuracy results following 100 calculations. Table 6 shows averaged accuracy, F1, and MCC values for a better comparison of the analyzed ML algorithms. According to the results presented in Table 6 and Figure 7, the best algorithm for physiological fatigue recognition is the random forest model, which classified states A1 and A2 with a validation accuracy of more than 95%.
Multiple hyperparameter values were analyzed for all compared ML techniques. The best results were obtained when the DT had a maximum of 100 splits and a maximum of 9 surrogates in each node. The selected RF algorithm consisted of 30 DTs, with a maximum of 20 splits for every tree. Based on these results (see Table 6), a random forest algorithm was selected for further analysis.
RF algorithms include multiple DTs, in which every node is a condition on a single feature and is designed to split the dataset into two subsets so that similar response values end up in the same subset. As previously mentioned, different ECG signal features may have varying impacts on the final classification result. Usually, the importance of a feature is estimated based on the degree to which it decreases the entropy in each tree.
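The 70%/30% division into a training-with-validation subset and a testing subset can be sketched as below. The text does not state whether the split was stratified by state, so keeping the class proportions equal in both subsets is an assumption of this sketch; the function name, the seed, and the toy labels are likewise hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels: 0 = state A1, 1 = state A2 (class sizes are illustrative).
labels = [0] * 50 + [1] * 50
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

The testing indices are set aside and used only once, after model selection and cross validation have been completed on the training indices.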
Only four ECG signal characteristics (Sa, Ra, Ta, and QT) are important for A1 and A2 state classification (mental fatigue recognition) if the selected threshold is equal to 0.8 (see Figure 8). Based on these results, the final RF model was designed using only those four ECG parameters. A random forest is a complex algorithm with multiple hyperparameters, all of which should be optimized. In this research, a random search algorithm was selected. It is based on the grid search technique, which attempts every possible combination; in random search, however, the number of iterations is limited, and the candidate hyperparameter values are declared in advance and sampled at random. Only the best hyperparameter values are saved to maximize the RF validation accuracy. The optimal RF model identified in this research consists of 40 DTs with a maximum tree height of 7 and a minimum of 40 samples in a node for a split.
K-fold cross validation was used in the RF model training process. The training dataset was split into two subsets: the training set and the validation set. The same steps were repeated 10 times, and averaged results were estimated. Receiver operating characteristic (ROC) curves are commonly used to visually represent the results of k-fold cross validation. These curves plot the true-positive rate (TP, correctly classified values of state A2, presented on the y-axis) against the false-positive rate (FP, misclassified A1 state values, presented on the x-axis). The 10-fold cross validation and averaged curve for states A1 and A2 are illustrated in Figure 9. In this figure, the area under the ROC curve (AUC) provides an accumulated measure of performance across all possible classification thresholds. In this research, the AUC indicates the probability that the model ranks a random value from state A2 more highly than a random value from state A1.
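The ranking interpretation of the AUC can be checked with a direct pairwise count. The sketch below compares every score assigned to a state A2 sample with every score assigned to a state A1 sample; counting ties as one half is a common convention assumed here. The function name and the scores are hypothetical and are not output of the model described in this paper.

```python
def auc_by_ranking(scores_a2, scores_a1):
    """Fraction of (A2, A1) pairs in which the A2 sample scores higher.

    Ties count as one half, a common convention assumed in this sketch.
    """
    if not scores_a2 or not scores_a1:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for pos in scores_a2:
        for neg in scores_a1:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_a2) * len(scores_a1))


# Hypothetical classifier scores (estimated probability of state A2).
a2_scores = [0.9, 0.8, 0.7, 0.4]
a1_scores = [0.6, 0.3, 0.2, 0.1]
print(auc_by_ranking(a2_scores, a1_scores))
```

A value of 1.0 means every A2 sample is ranked above every A1 sample, and 0.5 corresponds to ranking no better than chance.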
Analysis of multiple scenarios with various k values shows that in most cases, a lower number of folds may negatively affect recall or precision values. However, the AUC value remains similar, and further investigation is needed to determine the optimal number of folds in the cross-validation process. Because the changes in the results were small, we decided not to present all possible combinations and instead use only 10-fold cross validation as an example. Based on the cross-validation results (see Figure 9), an accuracy of 98% can be expected for A1 and A2 state classification.
The next step is to test the final RF model using a separate dataset (30% of all input data). The testing accuracy of the constructed RF model is equal to 94.5%, which is lower than expected (validation accuracy, 98%). However, this model can sufficiently classify and correctly assign values to states A1 and A2. Values from state A2 (the positive class) and state A1 (the negative class) are classified with similar accuracy (95% and 94%, respectively) (Figure 10). Finally, Cohen's Kappa coefficient was estimated as an essential measurement to evaluate the model's suitability and its reliance on chance agreement. This value was calculated as κ = 0.886, indicating "perfect" model reliability. Therefore, the constructed RF classifier sufficiently classifies data into states A1 and A2, meaning that the model can identify fatigued conditions using ECG signal parameter values.

Discussion and Future Work
In this paper, we proposed a framework for mental fatigue detection based on ECG signals recorded twice a day, corresponding to two different mental states: fatigued and without mental fatigue. Extracted ECG signal features, such as R, S, and T wave amplitude values, as well as QT intervals, increased the classification accuracy compared to similar methods reported in the literature. For example, in [5], heart rate variability (HRV) indicators achieved an accuracy of 75.5%. Additionally, in [6], the proposed methods achieved an accuracy of 93.3%. However, that research considered mentally stressed participants (after 12 h of intense work), which may have resulted in an increased impact on HRV parameter values. The present research enables the detection of smaller changes in mental health conditions compared to the previously mentioned literature reports. Furthermore, statistical analysis of several ECG signal features showed that RR interval values (used in HRV analysis) overlap between states, which is why HRV analysis and parameter estimation are not efficient and may reduce classification accuracies. PCA analysis showed that other ECG features present larger differences between states. Due to the use of several ECG signal features, an accuracy of 94.5% was achieved.
Although the proposed technique shows promising results, it is also subject to some weaknesses. ECG signals should be recorded using professional devices, such as CardioScout Multi, which is expensive and inconvenient. User-friendly devices, such as a Polar H10 belt or Garmin watch, do not record full ECG signals, and data from such devices are not sufficiently reliable. Furthermore, in this research, the gathered data were not suitable for diagnostic purposes because we did not include patients with diagnosed mental illness. Future research should focus on gathering more data and improving classification accuracies. We suggest that the 60 s ECG signal recordings could be extended and compared with HRV analysis results.

Conclusions
Considering gaps identified in recent literature, we presented a novel framework that combines ECG signal feature extraction, PCA analysis, and ML classification algorithms. The obtained results show that the proposed framework is feasible for automatic mental fatigue detection.
To ensure daily fatigue recognition, we designed an experiment involving separate recordings registered twice a day. Each recording represents a mental state, i.e., a state without fatigue recorded in the morning and a fatigued state recorded in the evening. A total of 60 healthy adults (ages 24 to 34) without a diagnosis of health pathologies or overwork-related problems were recruited for this experiment. All ECG signals were filtered using a Butterworth filter, and features were extracted using the Python toolbox NeuroKit2. Using these methods, the following high-quality ECG parameters were obtained: Q, R, S, and T wave amplitude values; QRS complexes; and RR, ST, QT, and T intervals.
Data visualization and statistical analysis show that RR interval values overlap between states, which is why RR interval analysis alone, such as HRV parameter estimation, is not an efficient way to detect mentally fatigued states. To overcome this issue, other ECG signal parameters were considered in this paper.
PCA analysis showed a significant difference between states (with and without fatigue). Q and R wave amplitude values and QT and T intervals were observed to be the most representative ECG signal features. Changes in the first three principal components were evident, indicating the importance of ECG signal feature extraction for mental fatigue recognition.
Finally, machine learning algorithms were applied for automatic classification of ECG signal features into separate states. Four ECG signal parameters (Sa, Ra, Ta, and QT) were identified as the most important for the mental fatigue classification process. The final RF model was able to detect daily mental fatigue with an accuracy of more than 94.5%.
Although the proposed technique shows promising results, it is also subject to some weaknesses. Future work should focus on user-friendly devices for the ECG signal gathering process to ensure that a wide range of participants can be included in experiments.

Institutional Review Board Statement: The study was conducted in accordance with the Declaration of Helsinki and approved by the Kaunas Regional Research Ethics Committee board (protocol code BE-2-38, 2016-09-07).
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Not applicable.

Conflicts of Interest: The authors declare no conflict of interest.