An Innovative Deep Learning Algorithm for Drowsiness Detection from EEG Signal

The development of detection methodologies for reliable drowsiness tracking is a challenging task requiring both appropriate signal inputs and accurate, robust analysis algorithms. The aim of this research is to develop an advanced method to detect the drowsiness stage in the electroencephalogram (EEG), the most reliable physiological measurement, using promising Machine Learning methodologies. The methods used in this paper are based on Machine Learning techniques, namely stacked autoencoders with softmax layers. Results obtained from 62 volunteers indicate 100% accuracy in drowsy/wakeful discrimination, showing that this approach can be very promising for use in the next generation of medical devices. This methodology can be extended to other uses in everyday life in which maintaining the level of vigilance is critical. Future work aims to perform an extended validation of the proposed pipeline with a wide-ranging training set in which the photoplethysmogram (PPG) signal and visual information are integrated with the EEG analysis in order to improve the robustness of the overall approach.


Introduction
Drowsiness represents the transition state between wakefulness and sleep, during which a decrement in vigilance is observed. As an antecedent of the sleep state, this phase is characterized by an almost total lack of reflexes and is therefore incompatible with tasks that, on the contrary, require robust levels of vigilance. The tracking of drowsiness thus serves the goal of increasing safety in favor of human health. In fact, this topic is becoming very important in healthcare applications for the further improvement of medical assessment [1,2].
In this context, the most reliable methods for detecting drowsiness are those based on physiological measurements, including the electroencephalogram (EEG), electrocardiogram (ECG), and photoplethysmogram (PPG). Among these methods, EEG is the most widely used technique to measure the electrical activity of the brain and, since it is the standard technique in sleep studies, it has been proposed by several authors for drowsiness tracking analysis [3][4][5][6]. EEG has also been extensively used for fatigue classification, which is strictly correlated with drowsiness monitoring [7][8][9][10][11][12]. In this context, appropriate EEG artifact removal procedures must be implemented to obtain reliable post-processed signals [13][14][15].
ECG is also used to detect drowsiness by means of heart rate determination, which varies significantly through the different stages of drowsiness, from an awake state to a drowsy state [16].
PPG traces, which noninvasively measure the pulse rate variability in response to autonomous nervous system activity, have also been applied as appropriate methods for relaxation, fatigue, and drowsiness detection [17].
However, the development of detection systems for reliable drowsiness tracking is still a challenging task, as robust and complex analysis algorithms are required. In this context, Machine Learning methodologies (i.e., Deep Learning) are powerful tools [18][19][20] offering promising approaches for reliable signal analysis [21]. In this field, Convolutional Neural Networks (CNNs), originally introduced by LeCun and Bengio in computer vision [22], are among the most appealing methods for this kind of analysis.
Recent studies have investigated the use of PPG signals combined with ECG samples for the estimation of drowsiness by means of a heart rate variability (HRV) indicator [23,24].We also recently described an advanced and innovative pipeline for drowsiness tracking based on the usage of PPG signals, ECG reconstructed from PPG signals, and EEG-to-PPG correlation [25].
However, the principal method for drowsiness detection is based on EEG analysis [26]. The EEG is produced by inhibitory and excitatory postsynaptic potentials of cortical pyramidal neurons. These signals are integrated at the cortical level and propagate up to the scalp [5]. The rhythmical activity in the EEG is the expression of the more or less synchronized activity of large populations of adjacent cortical neurons, and it is believed to be generated by the interaction between cortical nerve cells and subcortical pacemakers. The electrical activity of the brain is usually classified according to rhythms, defined in terms of frequency bands (delta, theta, alpha, and beta) typically related to vigilance levels [5]. In particular, delta activity is characterized by slow waves between 0.5 and 4 Hz; these waves are present during the transition to drowsiness and during sleep. The theta rhythm displays a frequency range between 4 and 7 Hz; this rhythm is considered to be related to decreased information processing and is associated with low levels of alertness during drowsiness and sleep. The alpha rhythm is characterized by a frequency range between 8 and 13 Hz and is considered to be related to an alert and relaxed state; it occurs during wakefulness, is heightened at eye closure, weakens at eye opening, and is highly attenuated during attention. Beta waves are fast waves with frequencies ranging between 13 and 30 Hz and are associated with increased alertness, arousal, and excitement. It has long been known that changes in brain arousal involve specific variations in EEG activity; in particular, increases in the alpha and theta rhythms and a reduction of beta waves are interpreted as indicating states of weariness and sleepiness [3][4][5][6].
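The band definitions above can be made concrete with a simple relative band-power computation. The following pure-Python sketch is illustrative only (it is not the paper's pipeline): it estimates per-band spectral power from a plain DFT of a short epoch. The 128 Hz sampling rate matches the recordings described later; the synthetic alpha-dominated epoch is a made-up example.

```python
import math

FS = 128  # sampling rate (Hz), matching the EEG recordings described below

BANDS = {"delta": (0.5, 4), "theta": (4, 7), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(signal, fs=FS):
    """Relative spectral power per EEG band via a plain DFT (O(N^2), fine for short epochs)."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    total = 0.0
    for u in range(1, n // 2):          # skip DC, positive frequencies only
        freq = u * fs / n
        re = sum(signal[k] * math.cos(2 * math.pi * u * k / n) for k in range(n))
        im = -sum(signal[k] * math.sin(2 * math.pi * u * k / n) for k in range(n))
        p = re * re + im * im
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += p
        total += p
    return {name: p / total for name, p in powers.items()} if total else powers

# Synthetic 2-second "relaxed/drowsy-like" epoch dominated by a 10 Hz alpha wave.
alpha_epoch = [math.sin(2 * math.pi * 10 * k / FS) for k in range(2 * FS)]
rel = band_powers(alpha_epoch)
```

Since the 10 Hz oscillation is exactly periodic in the epoch, essentially all power lands in the alpha band; a beta-dominated epoch would shift the ratio the other way, which is the contrast the rhythms above describe.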
Regarding EEG processing and Machine Learning for drowsiness detection, several methods have been proposed in the literature. In Reference [27], the authors present two models using artificial neural networks to detect the degree of drowsiness and to predict the time required to reach a particular drowsiness level (moderately drowsy). The proposed approach seems very promising, since it is able to detect the drowsiness level with a mean square error of 0.22 and can predict the reaching of a drowsiness level with a mean square error of 4.18 min. In [28], the authors proposed an algorithm which evaluates a driver's sleepiness level directly from cerebral activity. The results seem good, even though the authors confirmed that the method needs further investigation. In [29], the authors analyzed interesting Deep Learning methods for drowsiness tracking from EEG. The proposed deep learning solutions are based on a novel channel-wise convolutional neural network (CCNN). To test the performance, the authors collected a large EEG dataset from three studies of driver fatigue that included 70 sessions from 37 subjects. All proposed methods were tested on both raw EEG and independent component analysis (ICA)-transformed data for cross-session predictions. The results seem very good and are used as a benchmark for the algorithm proposed herein. In Reference [30], the authors proposed a study on the possibility of developing a drowsiness detection system for car drivers based on the integration of three methods: EEG, EOG signal processing, and driver image analysis. The approach seems very promising, but it requires two-dimensional images of the driver while driving, so the complexity of the pipeline is greater than that of pipelines based on EEG processing only. Finally, in [31], the authors proposed a deep convolutional network and autoencoder-based model (AE-CDNN), which was constructed in order to perform unsupervised feature learning from EEG in epilepsy. The authors extracted features with the AE-CDNN model and classified them on two public EEG data sets. Experimental results showed that the classification results obtained with AE-CDNN features are better than those obtained with principal component analysis and sparse random projection.
In the present study, an innovative Deep Learning bio-inspired pipeline able to detect the level of drowsiness from EEG signals is described. The algorithm uses Discrete Cosine Transform (DCT) analysis of the EEG signal followed by a Deep Learning stage (stacked autoencoders with softmax layers) for the classification of the DCT post-processed data.
The results presented herein show that this approach can be very promising for use in the next generation of medical devices.

Volunteers Recruitment and Acquisition Protocol
Experiments were carried out on 62 healthy subjects of both sexes (31 men and 31 women), aged between 20 and 74 years; none of the volunteers were using drugs capable of changing cortical excitability. Volunteers gave informed consent to the procedures approved by the Ethical Committee Catania 1 (authorization n. 113/2018/PO), which were conducted in accordance with the Declaration of Helsinki. Participation criteria included the possession of a valid driving license for motor vehicles.

EEG Recordings
Standard EEG traces were recorded from the 62 healthy subjects. Two EEG electrodes were placed on the scalp at points O1 and O2 of the International 10-20 System [32] and held adherent to the skin by means of an adhesive/conductive paste; a ground electrode was placed at the right ankle. Recordings were made using a standard EEG device (Galileo NT, EB Neuro, Italy), with the low-frequency filter set between 0.53 Hz and 1.6 Hz (corresponding to time constants of 0.3 and 0.1 s) and the high-frequency filter at 70 Hz. From each channel, data were acquired at a sampling rate of 128 samples per second. Subjects had to stay for 5 min in conditions of maximum relaxation, and then, for another 5 min, they had to perform mental calculations to increase their level of vigilance.


Algorithm Description
The Deep Learning algorithm described herein was developed in a MATLAB framework. Figure 1 reports the Deep Learning algorithm pipeline. The methods used in the single blocks are described below.

(a) DCT - Discrete Cosine Transform Block

The EEG signals were fed into the DCT block, which performed the frequency domain transformation of the source EEG samples (EEG(k), composed of N samples) according to the following modified classical DCT equation [18]:

$$\varphi(u) = \omega(u) \sum_{k=0}^{N-1} eeg(k)\,\cos\!\left(\frac{\pi(2k+1)u}{2N\sigma}\right)$$

where N is the number of samples of the source EEG signal, u represents the frequency domain variable, and ω(u) is the usual DCT normalization weight. The term σ is a dynamic variable self-learned during the training process of the proposed system. The frequency domain representation of the time domain EEG samples was thereby obtained. The DCT is well suited to detecting periodic intrinsic frequency components, as it exploits the discrete representation of the function correlated to the EEG signal. By means of the self-adaptive parameter σ, we are able to automatically detect the optimal "frequency window" for mathematically discriminating between the DCT transform of the source EEG signal of a drowsy person (EEG with an alpha wave component) and that of a wakeful person (EEG with a beta wave component).

The DCT output was then resized into a vector of 256 frequency samples by means of bicubic interpolation.

In this context, the modified DCT function used for the pre-processing of the EEG signals generates optimized frequency window signals for the autoencoder layer. The use of the self-adaptive σ parameter allowed us to find the frequency window that best improves the features generated by the latent representation of the autoencoder layer.
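As an illustration of the modified DCT described above, the following Python sketch implements a DCT-II with a frequency-scaling term sigma. The placement of sigma inside the cosine argument and the `modified_dct` name are assumptions for illustration; in the actual pipeline, sigma is learned during training rather than supplied by hand.

```python
import math

def modified_dct(eeg, sigma=1.0):
    """DCT of an EEG epoch with a frequency-scaling term sigma.

    sigma = 1.0 reduces this to the classical orthonormal DCT-II; here sigma
    is a plain argument, whereas the paper's pipeline self-learns it.
    """
    n = len(eeg)
    out = []
    for u in range(n):
        # omega(u): orthonormal DCT-II normalization weights
        w = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        s = sum(eeg[k] * math.cos(math.pi * (2 * k + 1) * u / (2.0 * n * sigma))
                for k in range(n))
        out.append(w * s)
    return out

# A 64-sample epoch matching the u = 5 DCT basis function concentrates its
# energy in a single spectral bin.
n = 64
epoch = [math.cos(math.pi * (2 * k + 1) * 5 / (2 * n)) for k in range(n)]
spectrum = modified_dct(epoch, sigma=1.0)
peak_bin = max(range(n), key=lambda u: abs(spectrum[u]))
```

With sigma = 1 the transform is the standard orthonormal DCT-II, so the single-component epoch produces one dominant coefficient; varying sigma rescales which frequencies fall inside the analysis window.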
(b) Autoencoder - The Autoencoder System Block

An autoencoder neural network is a bio-inspired system trained with a supervised learning algorithm that applies the typical error back-propagation scheme, firstly to encode the input vectors (DCT samples) and secondly to decode the internal representation in order to reconstruct the original input data with minimum error, usually measured as the mean squared error [20].

Figure 2 shows the typical neural structure of an autoencoder (as used in our proposed pipeline) for learning the DCT signal dynamics. The autoencoder proposed herein is structured with an input layer of many neurons, hidden layers with many neurons, and one output layer composed of multiple neurons. More specifically, an autoencoder takes an input vector x and maps it into a hidden representation y through a deterministic mapping, where the matrix W represents the synaptic weights of the neural system. The hidden representation y, sometimes called the latent representation, is then mapped back to a reconstructed vector z ∈ [0,1]^d, where z = γ(y). The basic idea is that the autoencoder is constructed in such a way that the mapping x(i) → y(i) reveals essential structures in the input vector x(i) that are not otherwise obvious.
In the autoencoder learning, the parameters φ and φ' of the model are optimized to minimize the average reconstruction error:

$$\varphi^{*}, \varphi'^{*} = \arg\min_{\varphi, \varphi'} \frac{1}{n} \sum_{i=1}^{n} L\left(x^{(i)}, z^{(i)}\right)$$

where the loss function L is the traditional squared error.
In order to avoid overfitting issues, the empirical risk minimization approach was combined with a regularized empirical risk, where the regularization imposes a degree of sparseness on the derived encodings.
Finally, the learning aims to find the optimal parameters φ* satisfying

$$\varphi^{*} = \arg\min_{\varphi} R(\beta, D_n)$$

where R(β, D_n) is the regularized empirical risk function. The effect of the regularization described above is to force the latent representations to be sparse (sparse autoencoder). The sparsity constraint is based on the Kullback-Leibler divergence, defined as

$$KL(\rho \,\|\, \hat{\rho}_j) = \rho \log\frac{\rho}{\hat{\rho}_j} + (1 - \rho) \log\frac{1 - \rho}{1 - \hat{\rho}_j}$$

where ρ is a sparsity parameter whose value is close to zero, while ρ̂_j is the average activation of hidden unit j. Combining the previous equations, we obtain the regularized cost

$$R(\beta, D_n) = \frac{1}{n} \sum_{i=1}^{n} L\left(x^{(i)}, z^{(i)}\right) + \tau \sum_{j=1}^{H_d} KL(\rho \,\|\, \hat{\rho}_j)$$

where H_d is the number of hidden units and τ is a sparsity weighting term. This approach is used to avoid overfitting and to improve the sparsity of the representation.
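The Kullback-Leibler sparsity penalty can be sketched directly. The function names below are illustrative, not from the paper; the formula is the standard sparse-autoencoder KL term between the target sparsity ρ and the measured average activation ρ̂_j of each hidden unit.

```python
import math

def kl_sparsity(rho, rho_hat):
    """Kullback-Leibler divergence between the target sparsity rho and the
    measured average activation rho_hat of a single hidden unit."""
    return (rho * math.log(rho / rho_hat)
            + (1 - rho) * math.log((1 - rho) / (1 - rho_hat)))

def sparsity_penalty(rho, rho_hats, tau):
    """Total penalty tau * sum_j KL(rho || rho_hat_j) over all hidden units."""
    return tau * sum(kl_sparsity(rho, rh) for rh in rho_hats)

# A unit whose average activation matches the target contributes no penalty...
zero = kl_sparsity(0.05, 0.05)
# ...while a unit that is far more active than the target is penalized.
busy = kl_sparsity(0.05, 0.5)
```

The penalty grows as activations drift away from ρ, which is what pushes the latent representation toward sparsity during training.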
In order to increase the learning capability of the proposed system, we used stacked autoencoders (SAEs). In SAEs, after the first layer is trained, the autoencoder output layer is discarded, and the corresponding features (latent representation) are used as the input to the next autoencoder; hence, the training is greedy and layer-wise. The final step is to fine-tune the network using the back-propagation algorithm. Each autoencoder used herein is composed of a 256-neuron input layer and one 10-neuron hidden layer.
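The greedy, layer-wise stacking structure can be sketched as follows. This is a structural illustration only: the encoders here use fixed random weights and a logistic activation, whereas in the real pipeline each encoder's weights come from autoencoder training; `make_encoder` and the 256 -> 10 -> 10 shapes follow the layer sizes stated above.

```python
import math
import random

random.seed(0)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def make_encoder(n_in, n_hidden):
    """A stand-in encoder: a fixed random linear map followed by a logistic
    activation. Only the greedy stacking structure is being illustrated;
    real weights would come from training each autoencoder in turn."""
    w = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    def encode(x):
        return [sigmoid(sum(wij * xj for wij, xj in zip(row, x))) for row in w]
    return encode

# Greedy, layer-wise stacking: each encoder's latent output feeds the next.
enc1 = make_encoder(256, 10)   # first autoencoder: 256 inputs -> 10 features
enc2 = make_encoder(10, 10)    # second autoencoder: 10 inputs -> 10 features

dct_vector = [0.0] * 256       # placeholder for a resized DCT spectrum
features = enc2(enc1(dct_vector))
```

After both encoders are trained this way, fine-tuning with back-propagation adjusts the whole stack end to end.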

(c) Softmax Neural Layer Block
The softmax function is also known as the normalized exponential [21]. In the field of deep and machine learning, a softmax function or layer is able to map a K-dimensional vector of real values into the range [0,1]. In a multi-class classification problem with K classes, if z_j denotes the j-th output of the latest layer of the feed-forward neural network, the softmax transformation function is defined as

$$y_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \qquad j = 1, \dots, K$$

This can be interpreted as the conditional probability that, given an input u, the output y is a member of the j-th class. In the proposed pipeline, we used a softmax layer to classify the input into two classes, i.e., drowsy person (0-0.5, class '0') and wakeful person (0.51-1, class '1'), considering all the outputs normalized into the [0,1] range.
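The softmax transformation and the two-class decision rule described above can be sketched in a few lines. The `classify` helper and the choice of which index is "wakeful" are illustrative assumptions, not the paper's code.

```python
import math

def softmax(z):
    """Normalized exponential: maps K real scores to probabilities summing to 1."""
    m = max(z)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def classify(z, wakeful_index=1):
    """Two-class decision as in the pipeline: probability of the 'wakeful'
    class above 0.5 -> class 1 (wakeful), otherwise class 0 (drowsy)."""
    p_wakeful = softmax(z)[wakeful_index]
    return 1 if p_wakeful > 0.5 else 0

probs = softmax([2.0, -1.0])
label = classify([2.0, -1.0])        # score favors index 0 -> drowsy (class 0)
```

Because the two probabilities sum to 1, thresholding the wakeful-class output at 0.5 is equivalent to picking the larger of the two class scores.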

Algorithm Testing and Validation Framework
The proposed pipeline was tested on a PC with an Intel i5 quad-core CPU and 64 GB of RAM. The EEG records of all 62 subjects were split into a training set (70%) and a testing and validation set (the remaining 30%). The training phase required about 20 min, while testing and validation took about 5 min.
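A subject-level 70/30 partition like the one described can be sketched as follows. The `split_subjects` helper and the fixed seed are illustrative, since the paper does not specify how the partition was drawn; splitting by subject rather than by epoch keeps each person's recordings on one side of the split.

```python
import random

def split_subjects(subject_ids, train_fraction=0.7, seed=42):
    """Shuffle subjects and split them into training and test/validation sets."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)   # deterministic shuffle for repeatability
    cut = round(train_fraction * len(ids))
    return ids[:cut], ids[cut:]

# 62 subjects, as in the recordings described above.
train_ids, test_ids = split_subjects(range(62), train_fraction=0.7)
```

With 62 subjects this yields 43 training subjects and 19 held out for testing and validation, with no subject appearing in both sets.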

Results and Discussion
Drowsiness is physiologically defined as "a state of near sleep due to fatigue". The effects of sleepiness and fatigue are similar and involve reduced mental alertness, reduced awareness, and impaired judgment. This leads to a decrease in a person's ability to operate and an increase in the risk of human error that could lead to fatalities and injuries. For this reason, the development of reliable and easy-to-use methods to detect drowsiness can make substantial improvements in the prevention of fatal accidents and can be implemented in the next generation of medical devices.
The goal of researchers working in this field is to achieve an advanced method that is easily applied and appropriate in all those conditions that require a high level of vigilance. As discussed in the background section, several studies have proposed techniques to detect drowsiness using different types of data as input. To achieve this goal, in this study we combined EEG traces, the most suitable physiological measurement of vigilance levels, with powerful Machine Learning methodologies for data analysis.
In the following subsections, the results obtained in the three blocks of the algorithm schemed in Figure 1 and described in the previous section are reported.

Results of the DCT Block
Figure 3 shows a representative example of the spectral dynamics of the DCT transformation of the EEG signals before (a,b) and after (c) the resizing step. In particular, Figure 3a reports the DCT results for a drowsy person, while Figure 3b reports those for a wakeful person. Figure 3c reports the resized DCT signal, in which the spectral bandwidth and possible spurious components are reduced. This signal is then fed into the autoencoder block.

Results of the Autoencoder Block
According to the architecture described in the Methods section, Figure 4 reports the latent representations (features) of the input DCT-based preprocessed EEG samples at the output of the autoencoder block. In particular, the results obtained by the first encoder block are illustrated in Figure 4a, while Figure 4b shows the latent representation after further processing by the second encoder block.

The features provided by the last autoencoder (Figure 4b) were then fed into the softmax layer, which is described in the next paragraph. The dynamics of the mean squared error during the learning phase of the pipeline are reported in Figure 5.

Results of the Softmax Block
This block executed the classification of the signal features coming from the previous block and gave an output number in the range of 0-0.5 for a drowsy person and in the range of 0.51-1 for a wakeful person. We tested the proposed pipeline on the dataset of 62 subjects, for which the sampled EEG signals were separated into two scenarios: drowsy and wakeful. The acquired EEG signals were used partly for training (55%) and partly for testing and validation (45%) of the pipeline. The results are reported in the confusion matrix illustrated in Figure 6.

The results reported in the confusion matrix of Figure 6 indicate that the system achieved 100% accuracy in drowsy/wakeful discrimination. As highlighted in Figure 6, both the specificity and the sensitivity (and hence the ROC (Receiver Operating Characteristic) analysis) of the proposed algorithm were 100%, as the pipeline was able to discriminate the two classes (drowsy vs. wakeful) without false positive or false negative results.
In fact, this matrix expresses the performance of the algorithm as a statistical classification, representing the instances of the predicted classes (rows, as the target classes) versus the instances of the real classes (columns, as the output classes), with the convention of 0 for drowsy and 1 for wakeful. It can be seen that the training was performed successfully, since all cases were correctly recognized by the pipeline (green boxes) with no errors (red boxes).
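The reported accuracy, sensitivity, and specificity follow directly from the confusion-matrix counts. The sketch below shows the standard formulas; the counts used in the example are hypothetical (the paper reports only the resulting 100% figures), with wakeful taken as the positive class.

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity and specificity from binary confusion-matrix
    counts (wakeful taken as the positive class)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)      # true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    return accuracy, sensitivity, specificity

# A perfect confusion matrix with no off-diagonal errors, as in Figure 6;
# the counts themselves are illustrative, not taken from the paper.
acc, sens, spec = metrics(tp=28, fp=0, fn=0, tn=28)
```

With zero false positives and zero false negatives, all three metrics equal 1.0, which is why the ROC analysis is likewise perfect.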
These results are also supported by the dynamics of the mean squared error of the autoencoder block (Figure 5), which asymptotically tend to very low values.
Tables 1 and 2 report feature comparisons between our proposed pipeline and other deep learning-based methods for EEG processing reported in the literature. It is noteworthy that our methodology outperformed the other methods with respect to both accuracy and training performance.
These findings show that the present research is very promising in identifying a methodology characterized by minimal invasiveness, high reliability, and considerable speed of response, which allows the reduction in a person's level of vigilance to be identified within a few seconds. This observation, which can be extended to other uses in everyday life in which maintaining the level of vigilance is critical, confirms the significant correlations between the activity of the cerebral cortex and that of the cardiovascular apparatus, with the latter coming to be considered a useful monitoring system of electrocortical activity.

Computation 2019, 7, x FOR PEER REVIEW

Figure 1. The proposed Deep Neural Networks-based pipeline.


Figure 4. Latent representation of the input DCT-based EEG computed by: the first autoencoder block for drowsy (a) and wakeful (c) subjects; the second autoencoder block for drowsy (b) and wakeful (d) subjects.


Table 1. Accuracy comparison with respect to the several methods reported in the review paper [28].

Table 2. Performance comparison with respect to the similar methods described in Reference [30].