One of the main causes of traffic and industrial accidents is reduced attention due to fatigue, which greatly lowers work efficiency and increases the risk of accidents; a system that automatically detects a low level of alertness is therefore highly desirable. Research on estimating drowsiness or lack of vigilance mainly investigates physiological signals and video streaming information (eye blinking and yawning) [1]. Video images obtained from a camera can detect drowsiness with high accuracy but vigilance only with low accuracy. Moreover, at night or when the user is wearing sunglasses, it is difficult to capture the user's face with a camera. In these cases, physiological signals can help to determine the state of drowsiness or low vigilance. The objective of this study was to establish an automatic vigilance assessment and monitoring system based on physiological signals using a deep reinforcement learning algorithm. Previous studies of alertness estimation mainly utilized electroencephalogram (EEG), electrocardiogram (ECG), electrooculogram (EOG), and electromyogram (EMG) signals [3]. A vigilance monitoring system based on physiological signals could be implemented using EEG, ECG, and EOG signals.
EEG signals are widely utilized in practical applications, including brain-computer interfaces (BCI), epileptic seizure prediction, drowsiness detection, and vigilance state analysis [3]. EEG signal analysis is more challenging than that of other physiological signals because the EEG contains several artifacts and high levels of noise from sources such as EOG, EMG, and ECG signals [12]. Vigilance state studies also utilize EEG signals and mainly use four EEG frequency components [13]. These four components are extracted to form the features of an EEG signal in the delta (δ: 0–4 Hz), theta (θ: 4–8 Hz), alpha (α: 8–13 Hz), and beta (β: 13–20 Hz) bands [14]. A high amplitude in the delta band corresponds to deep sleep, and a high amplitude in the theta band corresponds to drowsiness [13]. Åkerstedt et al. used features from frequency components in the theta and alpha bands to predict sleep and drowsiness states [7]. Decreasing power in the alpha band together with increasing power in the theta band may be a meaningful indicator of drowsiness [6]. Additionally, combinations of theta, alpha, and beta frequency components have been used to extract features for detecting drowsy and fatigued states. Studies on detecting a low level of alertness from the EEG signal have mainly used frequency-domain features and rarely considered time-domain features [10]. The main time-domain features of an EEG signal are the area, normalized decay, line length, mean energy, average peak amplitude, average valley amplitude, and normalized peak number. ECG signals are widely utilized to predict arrhythmia, coronary artery disease, and paroxysmal atrial fibrillation [15]. Chui et al. [18] and Sahayadhas et al. [19] used ECG signals to estimate drowsiness by extracting heart rate variability (HRV) information, which includes time- and frequency-domain analysis.
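The conventional features described above can be illustrated with a minimal sketch. It assumes NumPy, a plain periodogram rather than any particular spectral estimator, half-open band intervals, and simple textbook definitions of area, line length, and mean energy; the HRV indices SDNN and RMSSD are common time-domain choices and are not named in the text. All function names, the sampling rate, and the synthetic signals are hypothetical and are not taken from the study.

```python
import numpy as np

# Band edges in Hz as listed in the text (delta, theta, alpha, beta).
BANDS = {"delta": (0.0, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 20.0)}


def band_powers(x, fs):
    """Power in each EEG band, estimated from a plain periodogram."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # remove the DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * x.size)
    # Assumption: half-open intervals [lo, hi) so no bin is counted twice.
    return {name: float(psd[(freqs >= lo) & (freqs < hi)].sum())
            for name, (lo, hi) in BANDS.items()}


def time_domain_features(x):
    """Three of the time-domain EEG features named in the text."""
    x = np.asarray(x, dtype=float)
    return {
        "area": float(np.abs(x).sum()),                   # sum of absolute amplitudes
        "line_length": float(np.abs(np.diff(x)).sum()),   # sum of absolute sample-to-sample changes
        "mean_energy": float(np.mean(x ** 2)),            # mean squared amplitude
    }


def hrv_time_features(rr_ms):
    """Two common time-domain HRV indices from RR intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    return {
        "sdnn": float(np.std(rr, ddof=1)),                    # spread of the RR intervals
        "rmssd": float(np.sqrt(np.mean(np.diff(rr) ** 2))),   # beat-to-beat variability
    }


# Synthetic 4 s epoch: a strong 6 Hz (theta) and a weak 10 Hz (alpha) component.
fs = 128
t = np.arange(4 * fs) / fs
eeg = 2.0 * np.sin(2 * np.pi * 6 * t) + 0.5 * np.sin(2 * np.pi * 10 * t)

powers = band_powers(eeg, fs)
features = time_domain_features(eeg)
dominant_band = max(powers, key=powers.get)
theta_alpha_ratio = powers["theta"] / powers["alpha"]

# Hypothetical RR intervals (ms) for the HRV example.
hrv = hrv_time_features([800, 810, 790, 805, 795])
```

In this synthetic epoch the theta component dominates, which is the pattern the text associates with drowsiness. On real recordings an averaged estimator (for example, Welch's method) and artifact removal would normally precede this step.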
Recently, deep learning methods have been applied to healthcare data as replacements for conventional feature extraction methods and have yielded better performance [20]. Deep learning automatically detects specific patterns in physiological signals measured using EEG, ECG, and EMG [22]. In healthcare studies, convolutional neural network (CNN) and autoencoder approaches have been widely used to extract features from physiological signals [23]. Hwang et al. proposed ultrashort-term segmentation of the ECG signal to estimate stressful conditions [25]. Although deep learning approaches have outperformed conventional methods in healthcare data analytics, they have a critical limitation: they extract features automatically in a data-driven manner but remain black boxes that do not provide physiological meaning. Data analysis should yield interpretable outcomes and explain the physiological phenomena corresponding to high-level states, such as cognitive or physical conditions [26]. In the present study, both relatively more interpretable conventional algorithms and deep learning methods were used to analyze the states, and their results were compared to help explain the outcomes of the deep learning approaches.
Reinforcement learning is an area of machine learning in which an agent defined in a specific environment observes its current state and chooses, among the available actions, the one that maximizes future reward. It is suitable for solving sequential decision-making problems. It has recently outperformed humans in several domains (e.g., Go and Atari games) and has also been used for deep learning optimization [27]. In the present study, features were extracted with conventional and deep learning methods and fed to a reinforcement learning agent to find the optimal feature set. The aim of this study was to find the best EEG and ECG features for the assessment of vigilance through reinforcement learning, and thus to obtain higher accuracy than conventional supervised learning algorithms.
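To make the idea of feature selection as a sequential decision problem concrete, the following is a minimal sketch that uses tabular Q-learning, a deliberately simpler stand-in for a deep reinforcement learning agent; it is not the method of this study. The agent adds one feature at a time or stops, and receives a reward only when it stops. The scoring function `subset_score`, the number of candidate features, and the choice of which features are informative are all hypothetical; in practice the score would be the validation accuracy of a classifier trained on the selected subset.

```python
import random

N_FEATURES = 5          # hypothetical number of candidate features
INFORMATIVE = {1, 3}    # toy assumption: only these features carry information
STOP = -1               # extra action that ends the episode


def subset_score(subset):
    """Toy stand-in for validation accuracy minus a per-feature cost."""
    return 0.5 + 0.2 * len(subset & INFORMATIVE) - 0.05 * len(subset)


def actions(state):
    """Features not yet selected, plus STOP."""
    return [f for f in range(N_FEATURES) if f not in state] + [STOP]


def train(episodes=5000, alpha=0.5, gamma=1.0, seed=0):
    """Off-policy Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to the estimated return
    for _ in range(episodes):
        state = frozenset()
        while True:
            action = rng.choice(actions(state))
            if action == STOP:
                target = subset_score(state)   # reward is paid only on stopping
            else:
                nxt = state | {action}
                target = gamma * max(q.get((nxt, a), 0.0) for a in actions(nxt))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)
            if action == STOP:
                break
            state = nxt
    return q


def greedy_subset(q):
    """Follow the learned Q-values greedily until STOP is chosen."""
    state = frozenset()
    while True:
        action = max(actions(state), key=lambda a: q.get((state, a), 0.0))
        if action == STOP:
            return state
        state = state | {action}


q_table = train()
selected = greedy_subset(q_table)
```

With this toy score, the greedy policy settles on the two informative features, because each extra feature costs more than it contributes. A deep Q-network replaces the lookup table with a neural network, which matters once the state can no longer be enumerated.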
4. Discussion and Conclusions
A portable and wireless system was developed to monitor a low level of vigilance in a user. Because such a system has limited memory, processing, and battery resources, selecting only the significant inputs or features is crucial to reducing memory use and processor computation, and thus to achieving low energy consumption. This study used the deep Q-network (DQN) reinforcement learning algorithm to optimize the set of ECG and EEG features that respond to a state of low vigilance. Figure 8 shows that a feature set of only two or three features could yield performance similar to that obtained with all of the features. This approach can be applied to most physiological and neuroscience research, where investigating the biomarkers for a physiological or cognitive condition is key to understanding the phenomenon. The reinforcement learning algorithm detected a low level of vigilance using the optimized features more efficiently than conventional supervised learning classifiers, as shown in Figure 6 and Figure 8. In addition to the DQN used in this paper, asynchronous advantage actor-critic (A3C), trust region policy optimization (TRPO), and proximal policy optimization (PPO) are among the state-of-the-art reinforcement learning algorithms. For the optimization of deep learning structures and parameters, neural architecture search (NAS) is widely used [27]. If the structure and parameters of the DQN were optimized through a NAS algorithm, the accuracy of the vigilance assessment could be increased further. We leave the parameter optimization and the design of a new deep learning structure to future work. In this study, the DQN algorithm with a partially observable Markov decision process (POMDP) environment was applied to classify a low level of vigilance and to optimize the feature set. The optimal feature selection process suggested that both ECG and EEG can be used to estimate vigilance, although the two ECG features were selected more often, as shown in Figure 8. This implies that both the autonomic nervous system (measured by ECG) and the central nervous system (measured by EEG) are involved in the drowsy condition, with the autonomic nervous system having a slightly stronger influence. Thus, both ECG and EEG sensors should be used in the development of a vigilance state detection system. These results also suggest that the deep learning approach of Deep-ECGNet can be a state-of-the-art algorithm for extracting features from a short-term ECG signal. This is supported by the results of Hwang et al. [25], who demonstrated its performance for monitoring stress in an experiment. Although the DQN agent uses only a few optimal features, it achieved higher classification accuracy than the conventional classifiers considered in this study (i.e., LDA, MLP, SVM, and random forest). The DQN can also suggest significant EEG and ECG features corresponding to a low level of vigilance, which can be used to investigate biomarkers for the physiological state.