Assessment of Vigilance Level during Work: Fitting a Hidden Markov Model to Heart Rate Variability

This study aimed to enhance the real-time performance and accuracy of vigilance assessment by developing a hidden Markov model (HMM). Electrocardiogram (ECG) signals were collected from a group of 20 volunteers and processed to remove noise and baseline drift. Heart rate variability (HRV) was then extracted to train the parameters of the modified HMM for vigilance assessment. The model was trained using the Baum–Welch algorithm to obtain the state transition probability matrix Â and the observation probability matrix B̂. Finally, the data of three volunteers with different transition patterns of mental state were selected randomly, and the Viterbi algorithm was used to find the optimal state sequence, which was compared with the actual state sequence. The constructed vigilance assessment model had a high accuracy rate: the prediction accuracy for these three volunteers exceeded 80%. Our approach can be used in wearable products to improve their vigilance level assessment functionality, or in other fields that have key positions with high concentration requirements and monotonous, repetitive work.


Introduction
Internet of Things (IoT) and sensor technologies can now monitor and evaluate an individual's personal status while they are working [1]. However, an accessible product is still needed that lets people easily view their measured status data during work and thereby better understand their physiological condition in the workplace.
Vigilance can be defined as the ability to achieve and maintain a state of high sensitivity to incoming stimuli. It is a measure of perceiving and responding to subtle changes that occur at random time intervals in a particular environment [2]. Vigilance is a special form of attention [3]. Continuous vigilance is required in types of work such as aerospace, navigation, and driving [4][5][6]. Caldwell reported that, according to official statistics, fatigue linked to low vigilance is involved in at least 4-8% of aviation mishaps [7]. However, vigilance cannot be measured directly; it is assessed through three main modalities: subjective scales, experimental paradigms, and physiological signals [8].
Because vigilance levels are often linked to fatigue rate, subjective scales for fatigue rate assessment are often used to assess vigilance. The most commonly used scales are the Karolinska sleepiness scale (KSS) and the Stanford sleepiness scale (SSS). KSS uses a nine-point scale.
The hidden Markov model (HMM) is a typical dynamic Bayesian network model, which is used to estimate the probability distribution of state transitions in the dynamic sequences of the measurement process and the probability of measurement output [25]. HMM is widely used in speech recognition, fault diagnosis, and other fields with high accuracy. HMM rests on the assumption that an underlying process with discrete hidden states produces the time-varying observations. The measured process variable is regarded as the realization of the underlying stochastic process [26]. The difficulty lies in determining the hidden parameters of the process from the observable parameters and using these parameters for further analysis. HMM can reflect both the randomness and the potential structure of variables, showing a strong modelling ability for internal relations and random signals. Our study attempts to construct a modelling method to measure and evaluate the cognitive state by establishing a connection between the easily available HRV signal and a person's implicit cognitive state, namely the vigilance level. HMM is therefore a very effective biological modelling and computing tool for this purpose.
Our study's primary contribution is building an HMM for vigilance assessment based on HRV information extracted from human ECG signals. This innovation improves the real-time performance of current vigilance measurement techniques, enabling the measurement of potential cognitive ability levels. Our method expands the cross-field knowledge domain of cognition assessment and computer science, providing new approaches for research and practical applications.

Related Works
Vigilance is the ability to sustain attention and remain alert to a particular stimulus over a prolonged period of time [27]. Numerous jobs in the fields of industry, the military, medicine, and education demand constant attention with varying levels of cognitive workload. Examples that require high levels of attention include security work [28] (such as watching security cameras or screening baggage), operating vehicles, working in real classroom settings [29], and industrial and air traffic control [30]. Vigilance is necessary for these tasks to be completed with sufficient cognitive efficiency; however, assessing vigilance is a considerable problem.
In the past decade, machine learning methods have played important roles in vigilance assessment [31][32][33]. Generally, these methods assess vigilance in five phases: (i) sample acquisition focused on each task; (ii) signal pre-processing, such as band-pass filtering; (iii) feature extraction; (iv) classification or regression; and (v) feedback [34]. For example, researchers have proposed neural network methods, such as LRNN and BP, for monitoring mental fatigue to assess vigilance [35]. However, these studies dealt inadequately with the variety of vigilance assessment scenarios, and such neural network-based methods tend to over-fit because only a few examples are available for each task. To address this problem, the HMM method, which utilizes HRV parameters, can meet the need for continuous vigilance assessment.
According to the literature [36], performance in monotonous tasks is associated with an increase in the LF component of HRV. The low-frequency power spectrum of HRV reflects both sympathetic and parasympathetic activities, which jointly control the heart. This is the theoretical basis for our research. Therefore, our proposed method aims to provide vigilance level prediction, and may play a role in improving work performance and preventing staff from suffering accidents by combining wearable devices with prompt prewarning functions.
For general evaluation products already in use, the relevant methods for monitoring mental fatigue focus on stress; however, work performance is associated with several factors in addition to stress and is therefore more directly reflected by vigilance level measurements [37]. With advances in ECG sampling technology, our method of obtaining HRV indicators by measuring ECG is also more convenient and easier to deploy in work scenarios than the traditional method. A previous study also used a combination of ECG and EMG signals to develop a system that could simultaneously detect low-level vigilance manifestations such as drowsiness and inattention [38]. The KNN method and linear and quadratic discriminant analysis were used to classify these features with a maximum accuracy of 96.75%. However, this method needs more sensors, and larger numbers and types of sensors pose a considerable challenge to wearable device design.
Existing studies on vigilance mainly distinguish between sleep and wakefulness and focus on the variation of signal characteristics in typical vigilance tasks, such as MCT. Few studies have introduced new models to improve vigilance measurement accuracy. Therefore, the purpose of our study was to systematically examine the feasibility of using the ECG signal and an HMM for vigilance detection from the perspective of time- and frequency-domain characteristics, in addition to improving accuracy and broadening the application prospects of wearable products for vigilance monitoring.

Methods
This section describes how volunteers' HRV feature data were collected with wearable ECG signal sensors while the volunteers performed the psychomotor vigilance task (PVT) and visual search task (VST) paradigms. The resulting dataset was used for training the HMM. Figure 1 presents an overview of the research in diagram form.

Assessment Protocol
In our study, we adopted the HMM because it describes the change in vigilance as a time-sequential process. In addition, the HMM has the advantages of simple calculation and easy deployment. The assessment protocol is detailed in the following sections.
To realize vigilance level classification, one needs to extract the signal features related to the state of vigilance. Our paper adopted the ECG as the object signal for extraction because it is closely associated with vigilance. The ECG measurement can effectively circumvent the EEG signal extraction process's complexity and interference with participants.

Psychomotor Vigilance Task
PVT is a visual response time measure that objectively quantifies human vigilance and fatigue by measuring visual response time to a simple and salient signal. We performed a 10 min standard PVT experiment to acquire data more conveniently. Figure 2 shows the PVT experiment. At the beginning of each trial, the screen remained blank for 1000 to 6000 ms until a counter appeared. When the counter appeared on the screen, the volunteer was asked to press the space bar as soon as possible. The number displayed corresponded to the elapsed time in milliseconds since the counter's onset. The experiment automatically moved on to the next trial once the volunteer pressed the space bar or gave no response within 5000 ms [13]. A PC-based PVT version was used in our experiment. Using such a version was advantageous because of low hardware cost, high user familiarity, and the relative ease of software development [14].

Visual Search Task
We mainly used the visual search task (VST) to stimulate the volunteers' cognitive load and thereby artificially change their vigilance levels. As shown in Figure 3, each trial began with an 800 ms "+" fixation point, followed by a visual search stimulus that remained on screen until the volunteer responded with a key press; the stimulus then disappeared and the screen stayed blank for 500 ms. We instructed the volunteers to search for stimuli presented at the visual search interface using a guiding letter. When judging the color of the lowercase letter "p", we asked the volunteers to press the "F" key if it was red; otherwise, we asked them to press the "J" key. The volunteers performed a 12-trial exercise session before the formal experiments, and the practice session similarly contained the stimuli for all conditions. The formal experiments started after the volunteers understood the experimental procedures and their correct rate exceeded eighty percent. In our study, we mainly used VST to increase the variety of data changes [13]. The VST experimental results (such as reaction time and accuracy) did not affect our modelling. We adopted VST because it is a structured experiment and easy to obtain; therefore, we could easily provide standard experimental materials to each participant.
Figure 3. Visual search task experiment screen. When judging the color of the lowercase letter "p", the volunteers were asked to press the "F" key if it was red; otherwise, they pressed the "J" key.

Participants
A total of 20 undergraduates participated in the experiment, including nine males and eleven females (aged 18-32, mean = 24, standard deviation = 2.60). None had a smoking history, and all reported normal hearing and normal or corrected-to-normal vision. The participants were required to maintain a regular sleep-wake schedule for at least one week before participating in the experiment and were not allowed to consume alcohol or functional beverages or to take part in any high-intensity physical sports on the day of the experiment. The participants read and signed an informed consent form. They were also informed that they had the right to withdraw from the experiment at any time. We analyzed all reported data anonymously. Our study was approved by Northwestern Polytechnical University and complied with the Declaration of Helsinki.

Instruments
We conducted our experiment in a quiet environment completely shielded from natural light using shade drapes, with natural light fully replaced by fixed artificial light sources. We used an EQ02 LifeMonitor (Equivital™, Cambridge, UK) for ECG signal acquisition. We also used a well-established commercial finger-clip heart rate oximeter to simultaneously measure the volunteers' heart rates for comparison with the heart rates calculated from the experimental data.

Assessment Steps
We artificially reduced volunteers' vigilance levels using the visual search task to obtain vigilance data for different fatigue conditions. We asked the volunteers to attend the laboratory for data acquisition during three time periods: 8:00-11:10, 14:30-17:40, and 19:30-22:40. First, we tested the volunteers for subjective vigilance and drowsiness using the SSS. Next, they underwent a standard PVT trial for 10 min using a laptop computer. Immediately after that, they performed the visual search task; following the visual search task, vigilance was rated once again using PVT, in a cycle of 10 trials that ended at the conclusion of the simultaneous experiment. We recorded volunteer heart rates and HRV before combining them into physiological performance data [39]. From this, we obtained the experimentally recorded state sequence. Figure 4 shows the experimental flowchart with three segments of state sequence data. We completed the SSS, PVT, and VST experiments on the computer. Section 3.4 introduces specific data.
During the experiment, the volunteers kept their limbs still and their breaths even, and we kept them away from sources of electromagnetic interference, such as mobile phones, desk lamps, and electrical products. We transferred the collected ECG data to a computer through a cross-talk and saved them in tabular form to facilitate subsequent computerized processing. Figure 5 shows the acquisition scenes.
The ECG signal is characterized by low frequency and small amplitude, making it susceptible to external interference during data acquisition. Therefore, it is essential to preprocess the raw signal to eliminate noise pollution and baseline drift. Fortunately, ECG signal noise reduction methods have reached a high level of maturity. The empirical mode decomposition (EMD) method can eliminate high-frequency noise in the signal, while the wavelet denoising method can reduce power frequency interference. Applying wavelet multiresolution analysis (MRA) can reduce baseline drift, and the built-in accelerometer of the device can eliminate motion artifacts (MA), following its operational principle [40,41].
ECG data cannot be directly used in constructing an HMM; therefore, it is necessary to extract HRV features by means of dimensionality reduction. The key to extracting HRV features from the raw signal is to determine the main wave's peak position in the R-R cycle. We utilized the differential threshold approach for this purpose. Next, we selected N cardiac cycles (i.e., the periodic time sequence of N heartbeats), each denoted as t(n), n ∈ {1, …, N}, on the volunteers' ECG waveforms. Then, we recorded the instantaneous heart rate HR(n) and the mean heart rate HR_MEAN, as well as the difference between adjacent cardiac cycles, δ(n) = t(n) − t(n−1). The standard deviation of the R-R (peak) interval (SDNN) and the root mean square of the difference between adjacent R-R intervals (RMSSD) can be expressed in terms of the cardiac cycles t(n) and the differences δ(n). We used the low-frequency power (LFP) indicator to calculate the integrated power in the low-frequency band of 0.04 Hz-0.15 Hz after performing a spectral transformation on the R-R interval sequence of the N cardiac cycles; furthermore, we used the high-frequency power (HFP) indicator to calculate the integrated power in the high-frequency band of 0.15 Hz-0.40 Hz. Table 1 shows the HRV features we used in our work.
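The differential threshold idea for locating R peaks can be illustrated with a minimal sketch. It is not the exact detector used in the study: the function names, the slope ratio of 0.6, the 250 ms refractory window, and the synthetic test signal are all assumptions chosen for illustration.

```python
import math

def detect_r_peaks(ecg, fs, slope_ratio=0.6, refractory_s=0.25):
    """Return sample indices of R peaks using a simple differential threshold.

    slope_ratio and refractory_s are illustrative assumptions, not values
    taken from the study.
    """
    # The first difference approximates the slope of the ECG waveform.
    diff = [ecg[i + 1] - ecg[i] for i in range(len(ecg) - 1)]
    threshold = slope_ratio * max(diff)
    refractory = max(1, int(refractory_s * fs))
    peaks = []
    i = 0
    while i < len(diff):
        if diff[i] > threshold:
            # A steep upstroke marks a QRS complex; the R peak is the
            # largest sample in a short window after the upstroke.
            end = min(len(ecg), i + refractory)
            peak = max(range(i, end), key=lambda k: ecg[k])
            peaks.append(peak)
            # Skip the refractory window so one beat is counted once.
            i = peak + refractory
        else:
            i += 1
    return peaks

def rr_intervals(peaks, fs):
    """Convert R-peak sample indices to R-R intervals t(n) in seconds."""
    return [(b - a) / fs for a, b in zip(peaks, peaks[1:])]

def synthetic_ecg(fs=250, duration_s=10.0, beat_s=0.8):
    """Hypothetical test signal: one narrow Gaussian spike per heartbeat."""
    n = int(fs * duration_s)
    step = int(fs * beat_s)
    centers = range(step // 2, n, step)
    return [sum(math.exp(-((k - c) ** 2) / 18.0) for c in centers)
            for k in range(n)]

if __name__ == "__main__":
    signal = synthetic_ecg()
    peaks = detect_r_peaks(signal, fs=250)
    print(rr_intervals(peaks, fs=250))
```

On real recordings the threshold would normally be adapted over time and applied after the denoising steps described above; the fixed global threshold here is only meant to show the principle.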

Table 1. HRV features used in our work (all are heart rate variability indicators).
SDNN: standard deviation of the R-R (peak) interval.
RMSSD: root mean square of the difference between adjacent R-R intervals.
LFP: integrated power in the low-frequency band (0.04 Hz-0.15 Hz).
HFP: integrated power in the high-frequency band (0.15 Hz-0.40 Hz).
LFP/HFP: ratio of low-frequency power to high-frequency power.

As vigilance declined, SDNN, LFP, and LFP/HFP increased considerably, whereas HFP decreased noticeably; all showed significant linear trends. Hence, we chose SDNN from among the above indicators and used it as the observation parameter in the HMM.
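The indicators in Table 1 can be computed from a sequence of R-R intervals roughly as follows. This is a sketch under stated assumptions rather than the study's implementation: intervals are in seconds, the tachogram is resampled at an assumed 4 Hz by linear interpolation, and band power is estimated with a plain periodogram; the function names are hypothetical.

```python
import math

def sdnn(rr):
    """Sample standard deviation of the R-R intervals."""
    mean = sum(rr) / len(rr)
    return math.sqrt(sum((x - mean) ** 2 for x in rr) / (len(rr) - 1))

def rmssd(rr):
    """Root mean square of the differences between adjacent R-R intervals."""
    d = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(x * x for x in d) / len(d))

def resample_rr(rr, fs=4.0):
    """Linearly interpolate the R-R series onto an even time grid (assumed 4 Hz)."""
    times, acc = [], 0.0
    for x in rr:
        acc += x
        times.append(acc)  # each interval is placed at the beat that ends it
    n = int((times[-1] - times[0]) * fs)
    out, j = [], 0
    for k in range(n):
        t = times[0] + k / fs
        while times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        out.append(rr[j] + w * (rr[j + 1] - rr[j]))
    return out

def band_power(x, fs, f_lo, f_hi):
    """Integrate a one-sided periodogram of x over the band [f_lo, f_hi)."""
    n = len(x)
    mean = sum(x) / n
    x = [v - mean for v in x]  # remove the DC component
    power = 0.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if f_lo <= f < f_hi:
            re = sum(v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(x))
            im = sum(v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(x))
            power += 2.0 * (re * re + im * im) / (n * n)
    return power

def hrv_features(rr, fs=4.0):
    """Return SDNN, RMSSD, LFP, HFP, and LFP/HFP for R-R intervals in seconds."""
    x = resample_rr(rr, fs)
    lfp = band_power(x, fs, 0.04, 0.15)
    hfp = band_power(x, fs, 0.15, 0.40)
    return {"SDNN": sdnn(rr), "RMSSD": rmssd(rr),
            "LFP": lfp, "HFP": hfp, "LFP/HFP": lfp / hfp}
```

A call such as `hrv_features(rr)` on the intervals of one analysis window yields the five values; only SDNN is carried forward as the HMM observation in this work.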

Markov Chain Determination and Characteristic Parameter Processing
The procedure for constructing the HMM for vigilance assessment was as follows: (1) we determined the initial model parameters λ = (π, A, B); (2) we used the Baum-Welch algorithm to train the model and obtain the estimated model parameters λ̂ = (π̂, Â, B̂); (3) we used the Viterbi algorithm to input the observed value sequence into the established HMM for vigilance assessment to obtain the optimal state sequence, which was compared with the actual state sequence to estimate the model's accuracy. Figure 6 shows the whole modelling process.
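Step (3) can be sketched as a standard log-domain Viterbi decoder for a discrete HMM, followed by a point-by-point comparison with the recorded state sequence. States and observation symbols are 0-based here (0, 1, 2 standing for S1, S2, S3), and the matrices in the example are illustrative placeholders, not the trained values from this study.

```python
import math

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for a discrete HMM (0-based indices)."""
    n_states = len(pi)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # delta[s]: best log-probability of any path that ends in state s.
    delta = [log(pi[s]) + log(B[s][obs[0]]) for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(A[r][s]))
            ptr.append(best)
            delta.append(prev[best] + log(A[best][s]) + log(B[s][o]))
        backptr.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def accuracy(predicted, actual):
    """Fraction of time steps where the decoded state matches the recorded one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

if __name__ == "__main__":
    # Placeholder parameters for illustration only.
    pi = [1 / 3, 1 / 3, 1 / 3]
    A = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
    B = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
    observed = [0, 0, 1, 0, 0, 1, 1, 1, 2, 2]
    recorded = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2]
    decoded = viterbi(observed, pi, A, B)
    print(decoded, accuracy(decoded, recorded))
```

Working in the log domain avoids numerical underflow on long observation sequences, which matters when the decoder runs over a full recording session.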
Because our study aimed to determine vigilance levels during work, we associated the low, medium, and high vigilance levels with three hidden states (Figure 7). Because an individual may, at any time, either remain in the current state or move to any other state, the HMM allows each state to transition to itself or to any other state. State S1 denotes the state of low vigilance, S2 denotes the state of medium vigilance, and S3 denotes the state of high vigilance.
In our HMM approach to assessing vigilance, we defined three hidden states corresponding to three observed variable states (1, 2, and 3). To determine the thresholds for SDNN state segmentation, we used boxplots, which accurately and consistently depict the spread of the data distribution. In analyzing the experimental data, we classified vigilance into three levels based on subjective scales and the fastest 10% of responses in the PVT; the latter reflect the participant's level of sustained attention and are best represented by the activation of ongoing attention networks and the motor cortex, as found in a previous study [27]. We then established the relationship between vigilance and SDNN; Section 3.4 provides more details on the classification results.
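The boxplot-based segmentation of SDNN into three observation symbols might look like the sketch below. The exact cut points used in the study are not given in this section, so the sketch assumes the first and third quartiles serve as thresholds, and it assumes symbol 1 is assigned to the highest SDNN values (SDNN rose as vigilance declined, and S1 denotes low vigilance). Both choices and the function names are hypothetical.

```python
import statistics

def sdnn_thresholds(values):
    """Return (Q1, Q3), the box edges of a boxplot, as assumed cut points."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q1, q3

def to_symbol(value, q1, q3):
    """Map one SDNN value to an observation symbol in {1, 2, 3}.

    Assumed convention: 1 = high SDNN, 2 = middle range, 3 = low SDNN.
    """
    if value > q3:
        return 1
    if value < q1:
        return 3
    return 2

def discretize(values):
    """Convert an SDNN series into the observation sequence for the HMM."""
    q1, q3 = sdnn_thresholds(values)
    return [to_symbol(v, q1, q3) for v in values]
```

Pooling the SDNN values per participant before computing the quartiles, rather than across all participants, would be another reasonable design choice if baseline HRV differs strongly between individuals.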

Determination of the Initial Parameters of the HMM
The impact of the initial probability distribution of the HMM on the final results of the model is contingent upon the nature of the data and the specific task. In certain scenarios, utilizing the average distribution as the initial probability distribution may not significantly affect the final model results, as the model is capable of learning more precise probability distributions through the training process [42]. Chen and Goodman demonstrated that utilizing the average distribution as the initial probability distribution did not have a detrimental impact on the modeling performance [43]. The reason for this is that the HMM can improve the accuracy of its probability distributions as the training progresses. Furthermore, Baum and Petrie noted that the initial probability distribution in an HMM serves merely as a starting point, as the model can dynamically adjust its parameters based on the data, rendering the impact of the initial probability distribution typically short-lived [44].
Since the initial values of entries in the initial state probability vector π and the state transition probability matrix A had little effect on the model training results [45], only the following conditions need to be met.
π and A can be selected randomly or taken uniformly. Because a left-right model is usually adopted in pattern recognition, we set the initial state probability vector $\pi_i$, without making an estimate, to:
The values of the entries in A were initialized by the principle of uniform distribution [46] using the following formula:
$$a_{ij} = \frac{1}{\text{number of transfer paths on the Markov chain that move out of state } i}$$
From the above analysis, the Markov chain has three states, namely the low, medium, and high vigilance levels. Three transfer paths leave each state, so $a_{ij} = 1/3$, i.e., the initial state transition probability matrix A is:
$$A = \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \end{pmatrix}$$
Let S1, S2, and S3 represent the low, medium, and high vigilance levels, respectively. Each entry in A denotes the probability of transferring from one vigilance level to another. For example, $a_{13}$ represents the probability of transferring from S1 to S3, $a_{31}$ the probability of transferring from S3 to S1, and $a_{22}$ the probability of remaining in S2.
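The uniform initialization of A described above can be sketched in a few lines of Python. This is only an illustration, not the UMDHMM code used in the study; the function name `uniform_transition_matrix` and the adjacency-list input format are assumptions introduced here.

```python
def uniform_transition_matrix(outgoing):
    """Build A with a_ij = 1 / (number of transfer paths leaving state i).

    `outgoing[i]` lists the states reachable from state i
    (a hypothetical input format chosen for this sketch).
    """
    n = len(outgoing)
    A = [[0.0] * n for _ in range(n)]
    for i, targets in enumerate(outgoing):
        for j in targets:
            A[i][j] = 1.0 / len(targets)
    return A


# Index 0 = S1 (low), 1 = S2 (medium), 2 = S3 (high).
# Each state may stay where it is or move to either of the other two states.
paths = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
A = uniform_transition_matrix(paths)
for row in A:
    print([round(a, 3) for a in row])
```

With three outgoing paths per state, every entry equals 1/3 and each row sums to 1, which matches the initial matrix given in the text.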
We used experimental data to determine the initial entry values of the observation probability matrix B. We processed the acquired ECG signals, extracted the SDNN data, constructed a sample database for the HMM's vigilance assessment, and calculated by statistical counting the values of the individual entries $b_{ij}$ in B, which denote the probability that the observed value is j when the state is i.
For example, $b_{11}$ represents the probability that the observed variable is in state 1 when the human body is at the low vigilance level, i.e., the probability that the value in O is 1 when the value in Q is 1. In the above sequence of states, state 1 occurs 8 times, and in 3 of those occurrences the observed value is also 1; hence $b_{11} = 3/8 \approx 0.38$. We used the same method to determine the values of the other entries in the observation probability matrix B.
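The counting procedure for B can be illustrated with a short Python sketch. The sequences `Q` and `O` below are hypothetical toy data, constructed only so that state 1 occurs 8 times and coincides with observation 1 three times; they are not the study's recordings, and the function name is an assumption.

```python
def estimate_emission_matrix(states, observations, n_states=3, n_symbols=3):
    """Estimate b_ij = count(state i and observation j) / count(state i).

    States and observations are coded 1..3, as in the text.
    """
    if len(states) != len(observations):
        raise ValueError("state and observation sequences must have equal length")
    counts = [[0] * n_symbols for _ in range(n_states)]
    for q, o in zip(states, observations):
        counts[q - 1][o - 1] += 1
    B = []
    for row in counts:
        total = sum(row)
        # A state that never occurs gets an all-zero row in this sketch.
        B.append([c / total if total else 0.0 for c in row])
    return B


# Hypothetical toy sequences (not experimental data).
Q = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
O = [1, 1, 1, 2, 2, 2, 3, 3, 2, 2, 1, 3, 3, 3, 2, 3]
B = estimate_emission_matrix(Q, O)
print(B[0][0])  # b_11 for the toy data
```

For these toy sequences the first entry is 3/8 = 0.375, which rounds to the 0.38 quoted in the text.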
Each volunteer produced a sequence of observed values of length 30 and a corresponding state sequence upon completion of the trial. To prevent a single atypical data set from dominating the training results, we randomly selected 10 sets of data and applied the method described above for value initialization, obtaining an observation sequence of length 300, an experimentally recorded state sequence for modelling, and the initial observation probability matrix B. The Datasets section describes the specific result of B.
To implement the algorithm, we adopted UMDHMM (hidden Markov model toolkit), a lightweight C-language HMM package developed by Dr. Tapas Kanungo, the chief application scientist of Microsoft. We optimized the values of the entries in B during the modelling process and describe the optimized results in the Datasets section.

Data Training
First, we determined the data length. The international standard duration of short-term HRV recordings is generally 5 min; such recordings are easy to acquire and control and are less susceptible to external interference, and they are widely used in studies and clinical trials to analyze HRV data. In our study, however, vigilance changed quickly and was susceptible to stimuli; therefore, we divided the data into multiple 1-min-long bins, each truncated to 6000 sampling points, for processing and analysis.
Threshold segmentation was also needed to relate SDNN to vigilance levels. In general, the higher the HRV, the more active the vagus nerve and, hence, the higher the vigilance. We divided the fastest 10% of PVT responses into three levels, statistically analyzed the scale results, and set thresholds on the PVT results such that higher vigilance corresponded to a shorter fastest-10% response time. Based on the questionnaire results, the fastest 10% of PVT responses was divided into three levels, namely <400 ms, 400-500 ms, and >500 ms, corresponding to high, moderate, and low vigilance levels, respectively. Figure 8 shows a boxplot of the SDNN values versus vigilance levels.
It can be seen from Figure 8 that the upper quartile of the low vigilance level was less than 153, whereas the lower quartile of the medium vigilance level was higher than 156, so the SDNN segmentation threshold between the low and medium levels was set to 155. The upper quartile of the medium vigilance level was 186 and the lower quartile of the high vigilance level was 179, so these two levels overlap. Since this manuscript was committed to verifying whether the HMM could be used to predict changes in vigilance level, the SDNN segmentation threshold between the medium and high levels was set to 182. In future studies, more precise vigilance threshold segmentation methods will be investigated. The low, medium, and high vigilance levels were denoted S1, S2, and S3, respectively, for subsequent programming and postprocessing.
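The two SDNN thresholds can be turned into discrete observation symbols with a simple mapping. The sketch below assumes that symbol 1 corresponds to the lowest SDNN band and symbol 3 to the highest, and that a value exactly on a threshold falls into the upper band; neither convention is stated in the text, and the sample SDNN values are hypothetical.

```python
LOW_MEDIUM_THRESHOLD = 155   # SDNN threshold between low and medium (from the text)
MEDIUM_HIGH_THRESHOLD = 182  # SDNN threshold between medium and high (from the text)


def sdnn_to_symbol(sdnn):
    """Map one SDNN value to an observation symbol 1, 2, or 3.

    Boundary handling (value equal to a threshold goes to the upper band)
    is an assumption made for this sketch.
    """
    if sdnn < LOW_MEDIUM_THRESHOLD:
        return 1
    if sdnn < MEDIUM_HIGH_THRESHOLD:
        return 2
    return 3


# Hypothetical SDNN values for five consecutive 1-min bins.
sdnn_per_bin = [148.2, 160.5, 181.9, 182.0, 201.3]
observations = [sdnn_to_symbol(v) for v in sdnn_per_bin]
print(observations)
```

The resulting list of symbols is the kind of observation sequence that is passed to the HMM for training and decoding.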
The matrix B in Formula (9) also needs to be calculated. The exact calculated result of B is:
Once the state transition probability matrix A and the observation probability matrix B were optimized, the model could use them to identify the most probable sequence of states (i.e., levels of vigilance) for a given sequence of observations (i.e., HRV data).
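Decoding the most probable state sequence from an observation sequence is the task solved by the Viterbi algorithm. The following is a minimal pure-Python sketch, not the UMDHMM implementation used in the study. The uniform `pi` and the matrix `B` in the example are placeholder assumptions, because the trained values are not reproduced in this section; only the uniform `A` follows the text.

```python
import math


def _log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, pi, A, B):
    """Return the most probable state sequence (states coded 1..N)."""
    if not observations:
        return []
    n = len(pi)
    # delta[i]: best log-probability of any path that ends in state i.
    delta = [_log(pi[i]) + _log(B[i][observations[0] - 1]) for i in range(n)]
    backpointers = []
    for o in observations[1:]:
        prev = delta
        delta, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: prev[i] + _log(A[i][j]))
            pointers.append(best_i)
            delta.append(prev[best_i] + _log(A[best_i][j]) + _log(B[j][o - 1]))
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [s + 1 for s in path]


# Placeholder parameters for illustration only.
pi = [1 / 3, 1 / 3, 1 / 3]                                   # assumed uniform
A = [[1 / 3] * 3 for _ in range(3)]                          # uniform, as in the text
B = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]      # hypothetical values
print(viterbi([1, 1, 2, 3, 3], pi, A, B))
```

Working in log space avoids numerical underflow on longer sequences, which is why the sketch sums logarithms instead of multiplying probabilities.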

Results and Discussion
This section first reports the model's accuracy, Hamming distance, and mean absolute error (MAE), verified on the data of three volunteers and compared with those of an SVM. The evaluation metrics used in this paper are then introduced. Finally, the limitations of our work are discussed.
Considering that the task's setup conditions (e.g., stimulus duration and inter-stimulus interval) have the potential to influence the experimental results, we contrasted the results obtained by PVT and visual search task experiments with those obtained by the HMM as reference.
During the experiment, we found that the way mental states transform differed from individual to individual. A participant's state of vigilance level within a single experiment will form 10 consecutive states based on the PVT experimental results. The participant will produce a state sequence with a length of 30 after completing the experiment on that day, such as the experimentally recorded state sequence Q = [3, 1, 2, 2, 2, 1, 2, 2, 2, 3, 1, 3, 2, 2, 2, 2, 3, 1, 2, 1, 1, 1, 3, 3, 2, 1, 1, 2], which shows the changes in vigilance level in the three experimental trials.
The model we trained was a non-individual model, meaning that we did not process the data separately for each volunteer. Instead, we segmented all the data and used them collectively to build the HMM, achieving an accuracy of 92.67% during the training process. We expect the model's predictive accuracy to improve as the number of subjects in the dataset increases. In the future, this model could be used directly for individual vigilance prediction without personal data.
To evaluate the accuracy of our HMM approach in predicting vigilance, three volunteers who participated in the model evaluation through the PVT experiment were invited. The observed values of the volunteers in the PVT experiment were input into the HMM to obtain the vigilance predictions. The results are shown in Table 2 and Figure 9. By comparing them to the state values in the PVT experiment, we obtained an average accuracy rate of 87.78%, demonstrating the efficacy of our method. Therefore, the HMM-based vigilance assessment model that we constructed can accurately detect changes in the human vigilance state without producing large deviations due to the different modes of human mental state transition.

PVT and VST were adopted as our evaluation metrics. Our paper primarily focused on vigilance variation, so we chose the fastest 10% of PVT reaction times as our model's measure. Although various statistical techniques exist for assessing the level of vigilance from the PVT, we verified that the fastest 10% of reaction times was essentially consistent with the participants' subjective feelings of fatigue, which increased with task duration; however, this correlation was not affected by the task level [13].
The VST was selected to modify the volunteers' vigilance level because the levels otherwise showed little variation. According to our results, the vigilance level seemed to change irregularly; nevertheless, we believe that the changes in vigilance level induced by the VST showed a form of periodicity. Because the PVT has no practice effect, the VST and the volunteers' personal conditions may have affected the vigilance level. We believe that personal conditions, such as whether the volunteer rested well before the experiment and other workloads, influence vigilance levels in conjunction with the VST. In future studies, we will analyze the data for periodicity.
In most current literature, vigilance levels are divided only into wakefulness and fatigue (sleepiness). However, during the experiment, we learned that different vigilance levels could still affect work performance, even in the waking state. Therefore, further studies are needed to determine more precise demarcation criteria.
The PVT has considerable effects on volunteer behavior and physiology, and it may cause subjective drowsiness and mental fatigue, as well as changes in autonomic function and the central nervous system. Our manuscript highlighted the use of physiological methods for vigilance measurement. Changes in vigilance levels were monitored based on indicators of autonomic nerve function, such as HRV. Changes in vigilance levels are closely related to physiological parameters, which serve as indicators of vigilance. The HMM is an effective vigilance level estimator.
The average identification accuracy rate and MAE of our HMM method reached 87.78% and 0.12, respectively, in our experiment. MAE measures the average absolute error between the values predicted by a model and the actual values [47]; here, it indicated that the average difference between the predicted and actual values was relatively small. A smaller MAE indicates better predictive ability: a value of 0 means that the model's predictions are completely accurate, whereas a larger value means that the predictions deviate more from the actual values. However, the maximum value of MAE is not fixed and varies with the range and variability of the data. Therefore, it is important to compare the MAE of different models using standardized metrics, such as relative error or mean percentage error, and to take into account the context and characteristics of the data being analyzed. The Hamming distance results suggest that the predicted vigilance state sequence is relatively close to the actual vigilance sequence.
Our approach was compared with the SVM [48]. The accuracy gap between the two algorithms was not large, but the HMM achieved higher accuracy than the SVM. The HMM can analyze the dynamic signals of time series, complete pattern recognition according to the relationships between adjacent states, and reflect the similarities between categories to a greater extent while ignoring differences between categories. By mapping samples that are linearly inseparable in the low-dimensional space to a high-dimensional space, the SVM separates similar samples with the largest possible Euclidean distance, which reflects the differences between categories to a greater extent. Both models have their advantages; furthermore, some studies have combined the two algorithms to extract speech features or recognize driving intentions. Thus, in future research, cascading the HMM and SVM to further enhance the accuracy of vigilance assessment may be explored.
Our study was aimed at workers in key positions of specific industries, such as the manned deep submersible or nuclear industries, which require high concentration levels when dealing with monotonous and repetitive work.
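The three evaluation quantities discussed above (accuracy, Hamming distance, and MAE) can be computed from a predicted and an actual state sequence as sketched below. Both sequences in the example are hypothetical toy data, not the volunteers' results, and the function names are chosen for illustration.

```python
def hamming_distance(predicted, actual):
    """Number of positions at which the two state sequences differ."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have equal length")
    return sum(p != a for p, a in zip(predicted, actual))


def accuracy(predicted, actual):
    """Fraction of positions at which the predicted state is correct."""
    return 1.0 - hamming_distance(predicted, actual) / len(actual)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual state codes."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy sequences with states coded 1 (low) to 3 (high).
actual = [3, 1, 2, 2, 2, 1, 2, 2, 2, 3]
predicted = [3, 1, 2, 2, 1, 1, 2, 2, 3, 3]

print(hamming_distance(predicted, actual))     # 2 mismatched positions
print(accuracy(predicted, actual))             # 0.8
print(mean_absolute_error(predicted, actual))  # 0.2
```

Because the states are ordinal codes, MAE penalizes a low-versus-high confusion twice as much as a confusion between adjacent levels, whereas accuracy and Hamming distance treat all errors equally.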
For the purpose of this paper, we were primarily interested in how wearable devices can be used for daily cognitive assessments based on collected physiological signals. Currently, the most commonly used civilian wearable devices are wristbands and smartwatches; for example, the Empatica E4 Wristband contains an electrodermal activity (EDA) sensor that measures signals related to stress, engagement, and excitement [49]. However, it is worth noting that the hardware system of such sensors is complex, and as a research instrument, it does not provide support for further data availability. Alberdi et al. [50] published a review of stress recognition approaches in which they gathered research on which signals were used and with which methods to recognize stress. The authors did not limit themselves to wearable devices for measuring inputs or recognizing stress, so the results of that review do not directly match the topic of this paper. Nonetheless, the review still offers valuable insights into stress recognition approaches that could inform the development of wrist-wearable devices for stress recognition.
Based on our investigation, current wearable devices, particularly those designed for research purposes, tend to offer more diverse and accurate raw physiological data rather than sophisticated data analysis, but also provide a hardware foundation for our method. In contrast, wearable devices intended for daily use often focus on specific aspects of human activity, such as exercise or sleep tracking. Although there was considerable research on wearable devices, vigilance assessment was largely overlooked. Our research aimed to bridge this gap by developing algorithms embedded in hardware and exploring the potential for using wearable devices to assess cognitive abilities in real-world settings.
Our study's limitations included the small sample size of 20 volunteers and the narrow range of vigilance levels. Finding statistically significant results was challenging due to the limited sample size, which restricted the experiment's analytical capacity. Furthermore, while our paper aimed to comprehensively examine the availability of HMM for assessing vigilance, no further analysis of data changes during collection was performed. Additionally, a narrow focus was presented on predicting the mental state of only three individuals. As such, further research utilizing larger sample sizes and a more diverse range of participants is necessary to corroborate the findings and explore the potential applications of this approach.
In this paper, we adopted the method of average distribution for the initial probability distribution of HMM, which may have limited the accuracy of the model to a certain extent. In the subsequent study, we will discuss and improve the model's accuracy by comparing the present model with models using other initial probability distribution methods.
The ECG collection device in our study was also only applicable in a lab environment, meaning that there is still a practical-perspective gap. Other devices and signals that are more convenient for real-world working environments will be tested in future work.

Conclusions
Long-term cognitive work induces a decrease in vigilance levels. In our paper, we used the HMM to improve the real-time performance and accuracy of current vigilance measurement techniques. We extracted HRV information by experimentally collecting ECG signals while simultaneously recording the volunteers' mental states, and we built an HMM for vigilance assessment. We showed that HRV information may indicate vigilance levels and that the PVT is usable for detecting vigilance. Our constructed vigilance assessment model had a high accuracy rate that exceeded eighty percent, and the method we adopted also had a higher accuracy level than the SVM. Our method provided new techniques for measuring potential cognitive ability levels and expanded the cross-field knowledge domain of cognition assessment and computer science. Our approach can be used in wearable products to improve their vigilance level assessment functionality. In particular, our research has broad application prospects in the navigation, aerospace, and nuclear industries, as well as other fields that have key positions with high concentration requirements and monotonous repetitive work.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Northwest Polytechnic University (protocol code 202202053, approved on 1 November 2022).

Informed Consent Statement:
Informed consent was obtained from all participants involved in the study.
Data Availability Statement: Not applicable.