Quantitative Electroencephalography Analysis for Improved Assessment of Consciousness Levels in Deep Coma Patients Using a Proposed Stimulus Stage

“Coma” is defined as an inability to obey commands, to speak, or to open the eyes. So, a coma is a state of unarousable unconsciousness. In a clinical setting, the ability to respond to a command is often used to infer consciousness. Evaluation of the patient’s level of consciousness (LeOC) is important for neurological evaluation. The Glasgow Coma Scale (GCS) is the most widely used and popular scoring system for neurological evaluation and is used to assess a patient’s level of consciousness. The aim of this study is the evaluation of GCSs with an objective approach based on numerical results. So, EEG signals were recorded from 39 patients in a coma state with a new procedure proposed by us in a deep coma state (GCS: between 3 and 8). The EEG signals were divided into four sub-bands as alpha, beta, delta, and theta, and their power spectral density was calculated. As a result of power spectral analysis, 10 different features were extracted from EEG signals in the time and frequency domains. The features were statistically analyzed to differentiate the different LeOC and to relate with the GCS. Additionally, some machine learning algorithms have been used to measure the performance of the features for distinguishing patients with different GCSs in a deep coma. This study demonstrated that GCS 3 and GCS 8 patients were classified from other levels of consciousness in terms of decreased theta activity. To the best of our knowledge, this is the first study to classify patients in a deep coma (GCS between 3 and 8) with 96.44% classification performance.


Introduction
Intensive care is defined as all the methods applied to treat disease and to ensure patient survival until partial or complete loss of organ or system functions cause negative effects. Intensive care units (ICUs) are special units that provide the necessary treatment and care to patients who require intensive care. Many patients in the ICU are unconscious due to an organic disorder or sedation. Consciousness is defined as being aware of oneself and one's environment and adapting to new stimuli. Assessment of the level of consciousness (LeOC) is a significant task that affects stages such as nursing and treatment of the patient. As the patient's assistance is required during the evaluation of the LeOC, motor and sensory evaluations cannot be made for patients who cannot respond to commands [1,2]. Evaluating the LeOC and coma outcome prediction is very important for medical practice. The LeOC is not a phenomenon directly measured. In the evaluation of consciousness, firstly, verbal stimuli are given to the patient. The motor response for LeOC assessment is evaluated with intense (not harmful) painful stimuli [2,3]. Consciousness levels are shown in Figure 1. Two main factors can be mentioned for consciousness: the level of arousal or wakefulness and the awareness (i.e., the content of consciousness). Impairment of arousal is a process that progresses from lethargy to stupor and then to coma. Coma is the inability of the patient to be aroused. In order to interpret that the patient is in a coma, their eyes
The severity of the disease is analyzed by grading the symptoms and by evaluating many physiological findings. Knowing the severity of the disease is important in terms of devising treatment, estimating the outcomes from the patient, and providing better intensive care. Scoring systems are not the main task of the treatment, but they are helpful in clinical decisions and in reducing hospital costs.
Incorrect determination of the severity of the disease causes loss in cost and time, misdiagnosis, and wrong treatment [3,4]. The severity of the disease can be determined by various methods. While the width of the injury is anatomically determined by the injury severity scoring (ISS) method in trauma patients, the Glasgow Coma Scale (GCS) is the most commonly used in determining neurological damage and is an accepted 15-point score used to measure coma [5]. The GCS is a method for providing a simple and reproducible method of assessment of LeOC [6]. The GCS is scored according to the best response of the patients in each of its three categories: eye responses (E), verbal responses (V), and motor responses (M) [7]. The summation of the individual score (E + V + M) classifies the patients into mild closed head injury (score = 13-15); moderate head injury (score = 9-12); severe brain injury, also coma state (score = 3-8); and vegetative state (score < 3) [8]. Many physicians regard a maximum GCS score of 8 as the limit for coma [6]. It is stated that traumatic brain injuries are mainly determined by GCS; however, it is known that GCS is not sufficient and reliable for use by inexperienced clinicians. Namiki et al. show how the GCS can be misinterpreted by an inexperienced physician. A new graduate assistant was asked to examine the eye, verbal, and motor responses of a patient and to perform a GCS assessment. The GCS assessment was tested with eight different levels of consciousness. As a result of this study, on average, 26 ± 18% of examinees did not provide an accurate evaluation for the eight levels of consciousness selected. The GCS factors, mostly misinterpreted, were "evaluation of mixed speech" and "withdrawal motor response" [9].
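The scoring rule described above can be illustrated with a minimal sketch. The function names `gcs_total` and `severity` are hypothetical, and the component ranges (E: 1-4, V: 1-5, M: 1-6) are the standard GCS ranges rather than values stated in this text; the sketch covers only the totals of 3-15 that these component scores can produce.

```python
def gcs_total(eye, verbal, motor):
    """Sum the three GCS components after range-checking them.

    The ranges below are the standard GCS component ranges (an assumption
    here, since the text only names the three categories).
    """
    if not (1 <= eye <= 4 and 1 <= verbal <= 5 and 1 <= motor <= 6):
        raise ValueError("GCS component out of range")
    return eye + verbal + motor


def severity(total):
    """Map a GCS total to the severity groups quoted in the text."""
    if 13 <= total <= 15:
        return "mild"
    if 9 <= total <= 12:
        return "moderate"
    if 3 <= total <= 8:
        return "severe (coma)"
    raise ValueError("GCS total must be between 3 and 15")


if __name__ == "__main__":
    total = gcs_total(eye=1, verbal=2, motor=4)
    print(total, severity(total))
```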
It is considered that GCS is measured reliably by trained healthcare personnel [9,10]. However, GCS reliability has been reported to be insufficient in clinical practice [9][10][11][12][13][14] because GCS scoring errors have rarely been investigated [11] and methods that can help improve the accuracy of GCS are very rare [9].
The GCS is the chiefly used score in the evaluation of the neurological condition in the ICU. The GCS has a few limitations; most importantly, it requires an interactive patient, but patients in a coma state (GCS ≤ 8) are unlikely to be active. In the literature, the error rate resulting from evaluation with GCS is stated as 40% [15]. It is not correct to use the scoring systems in patients with impaired motor responses or who have an attention deficit, a lack of arousal, or a lack of perception [16,17]. The GCS is designed for serial assessments, so it should be emphasized that any confidence in predicting outcomes cannot be sufficiently established until a trend is achieved in the GCS [17].
In the statistical analysis of the data obtained from intensive care units, some studies specifically aiming to produce predictive models have been conducted on heart rate, blood pressure, etc. There are studies to automatically detect abnormality in such situations. Automatic estimates are made to eliminate the need for human surveillance [18][19][20]. Monitoring Electroencephalogram (EEG) in ICUs ensures that changes in the neural activities of the brain are noticed immediately and have long-term follow-up. Electroencephalography is a useful tool for noninvasively quantifying neurological function [21].
Gills et al. describe the time- and frequency-domain methods used in the study of the EEG data of patients in the ICU. They extracted autoregression coefficients from EEG data as features and examined the power in frequency bands [22]. Shah et al. obtained the amplitude, autoregression coefficients, and frequency-weighted energy values from the EEG signals belonging to patients in the ICU and classified them using clustering methods [23].
EEG is used to identify sleep disorders, coma, encephalopathy, and brain death [24]. Diagnostic applications are mostly aimed at examining the spectral content of neural oscillations in EEG waves. EEG is used frequently because it is non-invasive and has a high time resolution. EEG is widely used compared with other brain imaging techniques in applications where continuous monitoring of the brain is important [25,26]. Additionally, the creation of a real-time health monitoring system for stroke prediction [27], post-stroke recovery prediction [28], and stroke prognostics [29] using quantitative EEG analysis and machine learning approaches are indicators of the broad scope of use of EEG.
Since the evaluation of consciousness disorders is performed based on behavioral scales, it makes the diagnosis of these patients difficult. A patient in a vegetative state/ unresponsive wakefulness syndrome (VS/UWS) is awake but unaware of himself or his environment. A patient in a minimally conscious state (MCS) has awareness of themselves or their environment to some extent. A comatose patient has neither spontaneous alertness nor an eye-opening caused by impulse [30]. Therefore, studies focus on EEG in order to differentiate disorders of consciousness (DOC) patients. Mikola et al. analyzed the EEG recordings to show the correlation of delirium with EEG. The relative powers of the EEG signal were computed for delirium and control groups. They showed that the relative power of EEG signals with a frequency above 8 Hz was lower in the state of delirium. This was pursued at the parietal and central lobes of the brain [31]. Lechinger et al. [32] analyzed various EEG parameters to find an association between EEG signals and Coma Recovery Scale-Revised (CRS-R) score. They showed a positive correlation between the CRS-R score and ratios calculated between frequencies above 8 Hz and frequencies below 8 Hz using Pearson correlation and repeated measures ANOVAs. Piarulli et al. [33] found higher theta/alpha and lower delta power in MCS patients compared with VS/UWS patients. Additionally, Naro et al. [34] revealed lower alpha power in UWS and MCS patients. Kotchoubey et al. [35] showed that conscious patients have indicated much more electrophysiological brain activity than patients in the persistent vegetative state (PVS) and MCS patients. Khanmohammadi et al. [36] obtained a positive relationship between the Intrinsic Network Reactivity Index and GCS. They revealed significant differences between various LeOC. They concluded that power analysis of EEG signals did not give any specific patterns correlated to GCS. 
There are studies that have shown that a 40 Hz auditory steady-state response (ASSR) as an auditory stimulus may be associated with the state of consciousness of patients in a coma [37,38]. At present, 40 Hz ASSR has been applied for coma outcome prediction [39,40]. Wieser et al. [16] conducted linear backward regression analysis in their work and concluded that 13 variables obtained from the data of 8 patients were adequate to define 74.7% of the variability. The electrocardiogram, P300, and blood pressure signal generated most of the variability in their regression model.
In our study, EEG data were recorded from 39 comatose patients with a new procedure proposed by us. To the best of our knowledge, no previous study has classified GCS scores of 3-8 from EEG signals using family and nurse interaction experimental scenarios (tactile and auditory stimuli). Therefore, the present study aims to evaluate the GCS with objective data. The features were extracted from EEG signals in the time and frequency domains. The obtained features were then statistically analyzed to differentiate the LeOC and to relate them to the GCS. The classification success of the proposed method for different LeOC was compared across various classifiers and evaluation metrics.
With our study, objective data were proposed for the unconscious patient group, which was difficult to evaluate using non-processed EEG signals. Differences between patients with consciousness levels 3 and 8 according to the GCS were revealed and classified using the proposed method. To the best of our knowledge, our study is the first in the literature in terms of its data recording stages, its purpose, and its results, which distinguishes it from other studies related to coma and EEG.
The main contributions of this study are as follows:
• The frequency analysis and machine learning methods used in this study may contribute to the detection of consciousness levels of patients in a deep coma and the development of BCI systems for objective determination of GCS.
• A new recording procedure including tactile and auditory stimuli has been proposed for this system, which was developed to examine changes in the EEG activity of different levels of consciousness and to measure the responses of patients to stimuli.
• Features extracted by power spectral density analysis from EEG signals could characterize the changes in brain function for a deep coma state. The results obtained may be valuable for future studies in predicting the prognosis of unconscious patients and are very important for further studies to show the difference in the levels of consciousness of patients in a deep coma.
This article is organized as follows. Section 2 describes the materials, including a detailed description of the proposed methods for the data collection, the pre-processing, and the features and analysis methods. Section 3 presents the results of features, statistical analysis, and machine learning algorithms. Section 4 presents a brief review of related works, discussions of the proposed method, and limitations of the study in classifying the LeOC. Section 5 is dedicated to the conclusion.

Subjects
The Erciyes University Hospital's ICU patients were the subjects of our study. As the patients were unconscious and were unable to read and sign any document, all subjects' families gave their informed consent before participating in this study.
The EEG signals analyzed in this study were recorded from 39 patients (19 males, 20 females, varying in age from 19 to 91 years, with a mean age of 67). Patients were in coma status (3 ≤ GCS ≤ 8). GCS of the patients was determined independently by two experienced clinicians. Inclusion criteria were (1) unconscious patients followed up in the ICU for any reason, (2) age ≥ 18 years, and (3) patients whose consciousness level was between 3 and 8 according to GCS. Exclusion criteria were (1) patients under sedation, (2) patients with suspected or realized brain death, (3) patients using any medication that would affect the brain waves, and (4) families that did not want to participate in the study. The Ethics Committee of Erciyes University Medical Faculty accepted the study protocol, and the research was carried out in compliance with the Helsinki Declaration. Table 1 summarizes the study population.

EEG Recordings and Pre-Processing
In this study, EEG signals were recorded continuously at a sampling frequency of 500 Hz using the standard 10-20 system of electrode placement. Data acquisition was performed using the Biopac MP-150 System, and EEG recordings were obtained using the EEG Cap and wet electrodes. During recordings, we used a bipolar montage with 4 bipolar channels (F3-F4/Channel 1, C3-C4/Channel 2, T3-T4/Channel 3, and P3-P4/Channel 4), with the right earlobe as ground. The layout of the eight electrodes is shown in Figure 2. During the recording, patients were lying down and at rest. A 50 Hz infinite impulse response notch filter and a finite impulse response lowpass filter with a cutoff frequency of 30 Hz were used to remove the artifacts before further analysis. Then, the denoised EEG signals were decomposed into four sub-bands: beta (13-30 Hz), alpha (8-13 Hz), theta (4-8 Hz), and delta (0.5-4 Hz).
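The pre-processing chain described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the filter orders, so the notch quality factor `q = 30`, the FIR length `numtaps = 101`, and the Hamming window are assumed values, and all function names are hypothetical. The filters are applied causally, so the output is delayed relative to the input.

```python
import math

FS = 500.0  # sampling frequency in Hz (from the text)
BANDS = {   # sub-band edges in Hz (from the text)
    "delta": (0.5, 4.0), "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0), "beta": (13.0, 30.0),
}


def notch_coefficients(f0, fs, q=30.0):
    """Second-order IIR notch (biquad); q is an assumed quality factor."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def iir_filter(b, a, x):
    """Direct-form I difference equation with zero initial conditions."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y


def lowpass_fir(cutoff, fs, numtaps=101):
    """Hamming-windowed sinc lowpass; numtaps is an assumed filter length."""
    fc = cutoff / fs  # normalized cutoff in cycles per sample
    mid = (numtaps - 1) / 2.0
    taps = []
    for n in range(numtaps):
        t = n - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (numtaps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]  # normalize to unity gain at 0 Hz


def fir_filter(taps, x):
    """Causal FIR convolution with zero initial conditions."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]


def denoise(x, fs=FS):
    """Apply the 50 Hz notch, then the 30 Hz lowpass."""
    b, a = notch_coefficients(50.0, fs)
    return fir_filter(lowpass_fir(30.0, fs), iir_filter(b, a, x))


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))
```

In this sketch a 10 Hz (alpha-band) sinusoid passes almost unchanged, while a 50 Hz mains component is strongly attenuated once the filter transient has decayed.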

EEG signals were recorded from ICU patients in five stages, as seen in Figure 3, including three resting stages and nurse and family interaction (auditory, tactile stimuli) stages. During the first stage of the recording process, EEG signals were obtained in a silent environment without any stimulus for 5 min. Then, signal recordings were performed while the nurse who was routinely responsible for the patient spoke closely with the patient for 5 min. After the nurse interaction, resting stage EEG signals were recorded for 10 min.
In the fourth stage of recordings, EEG signals were recorded while family members (patient relatives) interacted with the patient for 5 min. Finally, after interaction with the family as the last resting stage, EEG signals were recorded for 10 min. Thus, a total of 35 min of continuous EEG recording was obtained. During the patient's interactions with the family and nurse, stimuli such as auditory stimuli (speaking with the patient) and tactile stimuli (touching the patient) were performed.
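As a small worked example of the protocol above, the stage boundaries can be converted into sample indices at the 500 Hz sampling rate. The stage labels and the function name `stage_sample_ranges` are hypothetical; the durations and their order are taken from the text.

```python
FS = 500  # samples per second (from the text)

# (stage label, duration in minutes), in recording order
STAGES = [
    ("rest 1", 5),
    ("nurse interaction", 5),
    ("rest 2", 10),
    ("family interaction", 5),
    ("rest 3", 10),
]


def stage_sample_ranges(stages=STAGES, fs=FS):
    """Return {stage: (start, stop)} as half-open sample index ranges."""
    ranges, start = {}, 0
    for name, minutes in stages:
        stop = start + minutes * 60 * fs
        ranges[name] = (start, stop)
        start = stop
    return ranges


if __name__ == "__main__":
    for name, (start, stop) in stage_sample_ranges().items():
        print(f"{name}: samples {start} to {stop}")
```

The five durations add up to 35 min, which corresponds to 1,050,000 samples per channel at 500 Hz.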
Diagnostics 2023, 13, x FOR PEER REVIEW 6 of 26

Power Spectral Density of EEG Signals and Feature Extraction
Spectrum estimation of signals is usually based on procedures operating a fast Fourier transform (FFT). The FFT approach has some limitations. Frequency resolution is one of these limitations. Another limitation is due to the windowing of the signal. Windowing exhibits itself as a "leakage" in the spectral domain. Moreover, FFT has high noise sensitivity and needs long-term data records for good frequency resolution [41]. Given the limitations of the FFT, Welch (non-parametric), Yule-Walker, and Burg AR (parametric) methods have been employed.
Power spectral density (PSD) is a crucial method in signal processing for frequency analysis. Parametric and non-parametric methods are used for PSD. Autoregressive, moving average, or autoregressive moving average use appropriate models with a known spectrum as parametric methods. Non-parametric methods do not have any assumption about the form of the power spectrum and estimate the PSD directly from the signal itself. Non-parametric methods are generally preferred because the estimated PSD may not be reliable if the model of the signal is not sufficiently and accurately defined using parametric methods [42][43][44]. In addition, there are studies indicating that parametric methods do not provide good performance for EEG signals [45].
The simplest non-parametric method is the periodogram. The periodogram method is based on the Fourier transform and divides an EEG signal into frames whose lengths are powers of 2, such as 64, 128, and 256 samples [46]. The Welch method is an improved version of the periodogram and was used with an overlap of 50% in this study. This method divides the signal into segments with a window function, computes a modified periodogram of each segment as in Equation (1), and then averages the PSD estimates as in Equation (2) (N: length of the window, L: length of signal) [47].
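The Welch procedure described above (windowed segments with 50% overlap, a modified periodogram per segment, then averaging) can be sketched in plain Python. This is a minimal illustration rather than the study's implementation: the Hamming window is an assumption because the text does not name the window function, and `welch_psd`, `hamming`, and `nperseg` are hypothetical names. In practice a library routine such as `scipy.signal.welch` would normally be used instead of the slow direct DFT shown here.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (assumed window type)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def welch_psd(x, fs, nperseg=256, overlap=0.5):
    """One-sided Welch PSD estimate; returns (freqs, psd)."""
    if len(x) < nperseg:
        raise ValueError("signal is shorter than one segment")
    step = int(nperseg * (1 - overlap))
    w = hamming(nperseg)
    scale = fs * sum(v * v for v in w)  # normalizes for window power
    starts = range(0, len(x) - nperseg + 1, step)
    nfreq = nperseg // 2 + 1
    psd = [0.0] * nfreq
    for s in starts:
        seg = [x[s + i] * w[i] for i in range(nperseg)]
        for k in range(nfreq):
            # direct DFT of the windowed segment at bin k
            xk = sum(seg[i] * cmath.exp(-2j * math.pi * k * i / nperseg)
                     for i in range(nperseg))
            p = abs(xk) ** 2 / scale  # modified periodogram value
            if 0 < k < nperseg / 2:
                p *= 2  # fold negative frequencies into the one-sided PSD
            psd[k] += p
    psd = [p / len(starts) for p in psd]  # average over segments
    freqs = [k * fs / nperseg for k in range(nfreq)]
    return freqs, psd
```

With this density scaling, summing the PSD values and multiplying by the frequency resolution `fs / nperseg` gives approximately the mean power of the signal, which is a convenient sanity check.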
In this study, area and power values from the power spectrum graph, as seen in Figure 4, were calculated as features, and 10 features were extracted from each EEG sub-band and channel. Since the features of each EEG sub-band and channel are considered separate attributes, 160 features (10 features × 4 sub-bands × 4 channels) were created for an EEG signal. The energy of the EEG signal was calculated using the Parseval theorem (3) [48]. Features were calculated for all recording stages. Each 10 min resting EEG recording was examined in 3 segments: the first 5 min, the last 5 min, and the full 10 min. Since the recording stages were considered separate signals, the EEG signals belonging to a patient were divided into 9 segments: 3 segments from each of the two 10 min rest signals (second rest and final rest), giving 6 segments, as well as the nurse interaction signal, the family interaction signal, and the first rest signal. In total, 345 instances were obtained rather than 351 (39 patients × 9 segments), because some rest recordings were not exactly 10 min long and could not be divided into all segments. As a result, there were 160 features and 345 instances in this study. In Equations (4) and (5), fs is the sampling frequency and Pf is the power value of the signal. The area under the PSD curve of physiological signals has recently gained importance in the literature for distinguishing patient and control groups. In the studies [46][47][48], the peak point of the curve was chosen as the reference point, and area calculations were made accordingly. Therefore, in this study, various features were extracted according to the area under the PSD curve [49][50][51]. The features are defined in Table 2.
Table 2. List of 10 features extracted from each sub-band.

Energy: The signal's energy value; see Equation (3).
Maxf: Frequency value corresponding to the peak power in the PSD curve; see Figure 4.
Maxp: PSD's maximum power value.
AUC1: A1 area under the PSD curve, up to the peak power of PSD (as seen in Figure 4).
AUC2: A2 area of the curve after the PSD's peak power (as seen in Figure 4).
The entire power value, normalized Pf; see Equation (4).
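The features that can be read from Table 2 can be sketched as follows. The trapezoidal rule for the areas and the function names are assumptions, since the exact area computation is not given in the text; only the peak-based features (Maxf, Maxp, AUC1, AUC2) and the Parseval energy are covered.

```python
import cmath
import math


def trapezoid_area(freqs, psd):
    """Area under a sampled curve using the trapezoidal rule (assumed)."""
    return sum((freqs[i + 1] - freqs[i]) * (psd[i] + psd[i + 1]) / 2
               for i in range(len(freqs) - 1))


def psd_features(freqs, psd):
    """Peak-based features of one sub-band PSD curve."""
    peak = max(range(len(psd)), key=psd.__getitem__)
    return {
        "Maxp": psd[peak],                                          # peak power
        "Maxf": freqs[peak],                                        # frequency of the peak
        "AUC1": trapezoid_area(freqs[:peak + 1], psd[:peak + 1]),   # area up to the peak
        "AUC2": trapezoid_area(freqs[peak:], psd[peak:]),           # area after the peak
    }


def energy_time(x):
    """Signal energy computed in the time domain."""
    return sum(v * v for v in x)


def energy_freq(x):
    """Signal energy computed from the DFT; equal to energy_time by Parseval."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    return sum(abs(c) ** 2 for c in spectrum) / n


if __name__ == "__main__":
    # Toy theta-band PSD curve with its peak at 6 Hz (illustrative values only)
    print(psd_features([4, 5, 6, 7, 8], [0.0, 1.0, 2.0, 1.0, 0.0]))
```

Applying such a function to every sub-band of every channel and concatenating the results gives one feature vector per segment, matching the 10 × 4 × 4 layout described in the text.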

Statistical Analysis
After the features were calculated from the EEG signals obtained at the different stages, we performed the statistical analysis. Most studies in the literature use the ANOVA test to compare multiple groups [49][50][51]. Because ANOVA assumes normally distributed data, we first applied the Kolmogorov-Smirnov normality test. A significance level of p = 0.05 was used for all statistical tests, which were run in the SPSS statistical software package. The normality assumption was not satisfied for all groups; therefore, the non-parametric Kruskal-Wallis test was applied to analyze the relationship between the GCS scores (GCS 3-8) and the features obtained from the EEG signals. Additionally, we used Dunn's test for pairwise comparisons of the EEG features between different GCS scores.
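To make the Kruskal-Wallis step concrete, the H statistic can be computed from pooled ranks as in the sketch below. The study itself used SPSS; this is only an illustration of the statistic with the usual tie correction, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    return h / (1 - ties / (n ** 3 - n))  # undefined if all values are equal


if __name__ == "__main__":
    # Toy feature values for three groups (illustrative, not study data)
    print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

With six GCS groups, H is compared against a chi-square distribution with 5 degrees of freedom, whose critical value at the 0.05 level is about 11.07.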

Data Balancing
Biomedical data classification is very challenging because biomedical data are usually large and imbalanced. If the data distribution is highly imbalanced, classification algorithms may not perform correctly because they are designed to maximize overall accuracy, which biases them toward the majority class regardless of the relevance of the various classes [52]. The imbalance between the minority and majority classes of the data set misleads the classification results. For example, if a data set contains 5% minority class instances and 95% majority class instances, a classifier can assign every instance to the majority class and still achieve a very high accuracy rate of 95%. However, the high accuracy rate does not indicate that the classification is correct. In such a case, all of the minority data are misclassified [53].
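The 5%/95% example above can be checked numerically with a short sketch. The class labels and helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of the instances of `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in relevant) / len(relevant)


# 95 majority and 5 minority instances; the "classifier" always says majority
y_true = ["majority"] * 95 + ["minority"] * 5
y_pred = ["majority"] * 100

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_pred))
    print("minority recall:", recall(y_true, y_pred, "minority"))
```

The overall accuracy is 0.95 even though the recall for the minority class is 0, which is why per-class metrics are needed alongside accuracy on imbalanced data.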
Algorithms for solving the imbalance problem in the data fall into three groups: undersampling, oversampling, and hybrid methods [54]. In the literature, oversampling methods are generally used. There are two types of oversampling methods: random and synthetic. The random oversampling method replicates instances from the minority class randomly. Because the replicated instances are copies of existing entries, random oversampling does not create new information and increases the possibility of model overfitting. Therefore, synthetic oversampling methods have been developed to prevent overfitting. Synthetic oversampling methods produce artificial instances for the minority class [55]. In biomedical data sets, SMOTE is the most used oversampling method [56][57][58][59][60]. The SMOTE method was used in this study for data balancing.
The SMOTE method produces synthetic data based on the similarity of features by looking at the K nearest neighbors of the minority class samples and selecting among them randomly. In our study, we set K = 5. The SMOTE technique uses the following formula to increase the number of instances of the minority class [61]:
$$x_{new} = x_i + w_i \, (x_{knn} - x_i) \tag{6}$$
In this equation, $x_i$ is a minority class sample, $w_i$ is the weight coefficient (a random number between 0 and 1), and $x_{knn}$ is one of the K nearest neighbors of $x_i$ in the minority class. Figure 5 shows the generation of synthetic data by looking at the K = 4 nearest neighbors. White circles belong to the majority class and black circles belong to the minority class. As shown in Figure 5, the synthetic data $x_{new}$ are generated by looking at 4 randomly selected samples around the $x_1$ and $x_2$ samples.
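The interpolation idea behind SMOTE can be sketched as follows: each synthetic point is placed at a random position on the line segment between a minority sample and one of its K nearest minority neighbors. This is a simplified illustration, not the implementation used in the study; the function name `smote`, the `seed` argument, and the toy data are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by neighbor interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x_i = rng.choice(minority)
        # K nearest minority neighbors of x_i by Euclidean distance
        neighbours = sorted((p for p in minority if p is not x_i),
                            key=lambda p: math.dist(x_i, p))[:k]
        x_knn = rng.choice(neighbours)
        w = rng.random()  # weight in [0, 1)
        synthetic.append([a + w * (b - a) for a, b in zip(x_i, x_knn)])
    return synthetic


if __name__ == "__main__":
    # Toy 2-D minority class (illustrative values only)
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
    for point in smote(minority, n_new=3, k=5):
        print(point)
```

Because every synthetic point lies between two existing minority samples, the new instances stay inside the region already occupied by the minority class.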

Classification
Data mining is the extraction of important patterns and the discovery of nontrivial knowledge from large data sets. Biomedical data mining is important for finding the causes of a disease by examining the vital signs of a patient, supporting the diagnosis of a disease with past data, and directing its treatment [62]. Classification is an important process in data mining and machine learning. In this study, multiclass classification was carried out using the Weka and MATLAB software. The usefulness of features obtained from quantitative EEG analysis for the diagnosis of neurological disorders using various classifiers has been discussed in many studies [63][64][65]. Random Forest (RF), K-NN, Ensemble Bagged Trees, and SVM-Cubic were used in this study. A grid search was used to find the optimal hyperparameters for each machine learning algorithm. For SVM, the kernel function was cubic and the kernel scale was set to automatic. For the Ensemble Bagged Trees algorithm, the learner type was the decision tree and the number of learners was 30. For RF, the maximum number of iterations was set to 100, the value at which the grid search gave the highest accuracy. For K-NN, K was set to 1 and the distance metric was Euclidean. In all experiments, k-fold cross-validation was run with k between 2 and 50 (k = 2, 5, 10, 15, 20, 25, 30, 35, 40, 50). A leave-one-out cross-validation procedure was also used to obtain the predicted labels for each patient. Six class labels (GCS 3, GCS 4, GCS 5, GCS 6, GCS 7, and GCS 8) were used for the multiclass classification. Only the 20-fold cross-validation results are reported in this study because the best accuracy was obtained with 20 folds.
The four algorithms whose classification performance is reported in this study can be summarized as follows. SVM is a reliable approach for classifying nonlinear data. It constructs a decision hyperplane that maximizes the separation distance between classes [66]. The accuracy achieved depends on the kernel and on the parameters used.
RF is an ensemble algorithm in which features are selected randomly. A random sample of features is chosen during the construction of each decision tree (DT), and each tree independently predicts a class and "votes" for it [67]. A single DT is noisier and more sensitive to outliers, and its predictions are poorer than the combined output of several DTs. Thus, RF is well suited to medical data sets [68].
The K-NN algorithm classifies a new sample according to its proximity to previously labeled samples. The distance from the target point to each sample in the existing data set is calculated, and the point is assigned to the most common class among its K nearest neighbors by majority vote [69].
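The K-NN rule with a Euclidean distance metric, evaluated by leave-one-out cross-validation, can be sketched in plain Python as follows. This is an illustrative sketch only: the function names and the two-feature toy data are hypothetical and do not reproduce the Weka or MATLAB configuration used in the study.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=1):
    """Classify x by majority vote among its k nearest training samples
    (Euclidean distance)."""
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(X, y, k=1):
    """Hold out each sample in turn and predict it from all the others."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        correct += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return correct / len(X)

# Hypothetical two-feature samples for two well-separated classes
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
     [5.0, 5.1], [5.1, 5.0], [5.2, 5.1]]
y = ["GCS 3"] * 3 + ["GCS 8"] * 3
loo_acc = leave_one_out_accuracy(X, y, k=1)
```

With K = 1, each held-out sample simply takes the label of its single closest neighbor, which is why 1-NN is sensitive to noisy or duplicated instances.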
The bagging method and the decision tree classifier are combined to form the Ensemble Bagged Trees classifier. The ensemble approach combines numerous machine learning classifiers, and the bagging method can minimize the decision tree algorithm's high variance [70].
When working with unbalanced data sets, accuracy is highly sensitive to the data distribution and is dominated by the majority class. The minority class is misclassified at a substantially higher rate than the majority class. Therefore, in addition to accuracy and error rate, other evaluation metrics should be used for imbalanced data sets [71]. The classifiers were evaluated for the six-class problem using sensitivity, specificity, precision, F-score, G-mean, and overall accuracy. The metrics were calculated according to Table 3. Accuracy is the ratio of correctly classified instances to the total number of instances (Equation (7)).

accuracy = (TP + TN)/(TP + TN + FP + FN) (7)
Sensitivity is also called recall in some studies. Sensitivity measures a test's capacity to correctly detect true positive cases (those with the disease). Specificity measures the test's capacity to correctly detect true negative cases (those without the disease). Precision quantifies how many of the samples classified as positive are actually positive [71].

[Table 3: confusion matrix for the six GCS classes (GCS3 to GCS8) defining TP, TN, FP, and FN. Rows GCS5 to GCS8: FP, TN, TN, TN, TN, TN.]

Sometimes, these metrics are not enough to indicate the success of a classification study. Therefore, many studies also report the F-score and G-mean [53]. The F-score is a weighted harmonic mean of precision and sensitivity. The parameter β in the F-score equation sets the relative importance of precision and sensitivity; β = 1 gives them equal weight. In this study, β = 1 was used [71].
The G-mean (geometric mean) combines sensitivity and specificity. It therefore reflects performance on both the positive class and the negative class. A low G-mean indicates that the classifier is biased toward one class [71].
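The metrics described above can be computed per class from a multiclass confusion matrix by treating one class as positive and all others as negative. The sketch below uses the standard definitions (sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), precision = TP/(TP + FP), F-score as the weighted harmonic mean of precision and sensitivity, G-mean as the square root of sensitivity times specificity). The function names and the 3-class toy matrix are hypothetical and are not taken from the study's results.

```python
import math

def per_class_metrics(cm, c, beta=1.0):
    """One-vs-rest metrics for class index c of a square confusion matrix,
    where cm[i][j] counts samples of true class i predicted as class j.
    Assumes class c has at least one true and one predicted sample."""
    total = sum(sum(row) for row in cm)
    tp = cm[c][c]
    fn = sum(cm[c]) - tp                 # true class c, predicted otherwise
    fp = sum(row[c] for row in cm) - tp  # other classes predicted as c
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_score = ((1 + beta ** 2) * precision * sensitivity
               / (beta ** 2 * precision + sensitivity))
    g_mean = math.sqrt(sensitivity * specificity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f_score": f_score, "g_mean": g_mean}

def overall_accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    return sum(cm[i][i] for i in range(len(cm))) / sum(sum(r) for r in cm)

# Hypothetical 3-class confusion matrix (rows: true, columns: predicted)
cm = [[8, 1, 1],
      [2, 6, 2],
      [0, 1, 9]]
m0 = per_class_metrics(cm, 0)
acc = overall_accuracy(cm)
```

For the six-class problem in this study, the same function would be called once per GCS class on a 6 × 6 matrix.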

Results
In this study, 4-channel EEG recordings were acquired from 39 comatose patients with GCSs between 3 and 8. The signals were recorded in five stages: first rest, interaction with a nurse, second rest, interaction with family, and last rest. During the interaction with the family and the nurse, verbal and tactile stimuli were applied. The signals were filtered with a low-pass filter and a notch filter to remove noise. Then, the EEG signals were decomposed into four sub-bands using band-pass filters. The power spectral density was obtained, and 10 features were extracted from the frequency spectrum for each sub-band of each channel. Thus, 160 features in total were obtained (4 channels × 4 sub-bands × 10 features). The features were statistically analyzed. The imbalance in the data was removed with the SMOTE method, and classification results are given for 20-fold cross-validation. A schematic overview of the study is given in Figure 6.
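The idea of measuring power per sub-band can be illustrated with a short Python sketch. Note the assumptions: instead of band-pass filtering followed by a PSD estimate, as described above, this sketch sums a simple FFT periodogram over each band; the band edges are conventional values rather than values taken from this section; and the sampling rate and the synthetic test signal are hypothetical.

```python
import numpy as np

# Conventional band edges in Hz (assumed, not stated in this section)
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def band_powers(signal, fs):
    """Sum the one-sided periodogram over each assumed sub-band.
    The values are suitable for relative comparison between bands."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()              # remove the DC offset
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / (fs * signal.size)
    return {name: float(psd[(freqs >= lo) & (freqs < hi)].sum())
            for name, (lo, hi) in BANDS.items()}

# Synthetic 4 s test signal: strong 6 Hz (theta) plus weak 20 Hz (beta)
fs = 250                                         # hypothetical sampling rate
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 6 * t) + 0.2 * np.sin(2 * np.pi * 20 * t)
powers = band_powers(x, fs)
dominant = max(powers, key=powers.get)
```

Repeating this for each of the 4 channels and 4 sub-bands gives one value per channel and band; the 10 features used in the study would each be computed in the same per-channel, per-band fashion.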

Analysis of Energy Values
Energy values, the first feature in the study, were calculated for each EEG sub-band. The mean ± standard error of the mean of the energy values is shown in Table 4. In addition, topographic maps of the average energy values are given in Figure 7. Table 4 shows that the energy values of GCS 3 patients are lower than those of the other GCSs in all sub-bands of every EEG channel. The standard error of the mean for GCS 3 patients is also lower than that of the other GCSs. Therefore, the EEG signals of GCS 3 patients have less energy than those of other patients for all EEG channels and all sub-bands.
As the patients' GCS values increased, that is, as their LeOC increased, their energy values also increased, as seen in Table 4. In the delta band of the T3-T4 channel, both the mean and the standard deviation of the energy values increase as the LeOC increases. The energy values of GCS 3 and GCS 8 patients were generally close to each other, as seen in Figure 7. According to the numerical results and the topographic maps, the EEG signals of GCS 3 patients have higher energy values in the parietal lobe in all sub-bands except the delta band. The parietal and temporal lobes of GCS 4 patients are more active. The central and temporal lobes of GCS 5 patients are active. The energy values of the EEG sub-bands of the temporal lobe are higher in patients with GCS 6 and GCS 7. In contrast, GCS 8 patients appear to have increased beta activity in the frontal lobe.

Statistical Analysis
It is important to analyze the features with statistical methods to understand the data. The conditions for parametric tests were examined. Since the data were not normally distributed, the Kruskal-Wallis test was used to compare multiple groups. In this study, the extracted features were compared across different levels of consciousness; in other words, we explored whether these features can distinguish patients with different GCS scores. Most of the features differed significantly between the groups (p < 0.05) according to the Kruskal-Wallis test. The results of the analysis are summarized in Table 5. In Table 5, the features that do not show a statistically significant difference between groups (p > 0.05) are marked in red. The EEG channel with the highest discrimination is T3-T4. The Ratio 2 and Ratio 3 features (the seventh and eighth attributes) were worse than the other features at distinguishing different levels of consciousness. Post-hoc tests were needed to find out which of the groups differ from each other. For this, the groups were compared in pairs using Dunn's test with Bonferroni adjustment. The results showing which groups differ from each other are provided in the Supplementary Materials (Table S1).
According to the pairwise comparisons, the alpha sub-band of the P3-P4 channel best distinguishes between the groups. For this reason, the average frequency values of the alpha band of the P3-P4 channel are plotted against the GCS values. The mean frequency value of each recording stage for each GCS is shown in Figure 8. The graph shows that GCS 8 patients have the highest frequency in each recording stage and that GCS 6 shows a higher frequency content than the remaining levels of consciousness. While the frequency features of GCS 3, 4, and 5 tend to decrease overall, those of GCS 6, 7, and 8 tend to increase.
The power values of the sub-bands averaged over all EEG channels are shown in Figure 9. Figure 9 shows the mean maximum power values of the EEG signals obtained during the interaction with the family, because the power values of the family stage were found to be more distinctive across levels of consciousness. Figure 9 shows that as the GCS value increases, the average power of the EEG signals increases. The EEG signals from GCS 3 patients have the lowest power in each sub-band. Additionally, the EEG signals from GCS 8 patients have a much higher power in the beta sub-band than other levels of consciousness.
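For readers unfamiliar with the Kruskal-Wallis test, its H statistic can be computed from pooled ranks as sketched below. This is a minimal plain-Python illustration with hypothetical toy groups, not the statistical software used in the study; it returns only H and leaves the p-value to a chi-square table or a statistics package. The study compares six GCS groups (5 degrees of freedom), whereas the toy example uses three groups (2 degrees of freedom), for which the 5% critical value of the chi-square distribution is about 5.99.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic using average ranks for tied values and
    the standard tie correction. Requires at least two distinct values."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        t = j - i                              # size of this tie group
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))

# Hypothetical feature values for three groups
h_stat = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
significant_at_5pct = h_stat > 5.99   # chi-square critical value, df = 2
```

A significant H only indicates that at least one group differs; pairwise post-hoc comparisons are still needed to locate the differences.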

Data Balance
Since there are fewer GCS 8 patients than patients at other levels, fewer instances were obtained from the EEG signals of these patients. As Table 6 shows, GCS 6 has the highest number of instances (81) and GCS 8 has the lowest (18). When the number of instances is imbalanced between classes, a classifier tends to be more successful on the majority classes. For this reason, the SMOTE method was used to overcome the multiclass imbalance problem. The instance numbers for the different GCSs before and after applying SMOTE are given in Table 6. For each class, synthetic data were generated at the percentage increase needed to make the numbers of instances approximately equal. Thus, the imbalance between the classes was eliminated.
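As a worked example of the percentage increase, the snippet below computes how much a class would have to grow to match the largest class, using only the two instance counts quoted above (18 for GCS 8 and 81 for GCS 6). The function name is hypothetical, and the percentages actually applied in the study are those listed in Table 6, which may differ because the classes were only approximately equalized.

```python
def oversampling_percentage(n_class, n_target):
    """Percentage of synthetic instances to add, relative to the current
    class size, so that the class reaches n_target instances."""
    return 100.0 * (n_target - n_class) / n_class

# GCS 8 has 18 instances; the largest class (GCS 6) has 81
pct_gcs8 = oversampling_percentage(18, 81)
n_synthetic = round(18 * pct_gcs8 / 100)   # synthetic instances to create
```

Under this assumption, GCS 8 would need 63 synthetic instances, a 350% increase, to reach the size of GCS 6.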

Classification Results
Classification results obtained with the classifiers described in Section 2.6 (Classification) are shown in Table 7. The results are presented separately for the unbalanced data and for the data balanced by SMOTE. These results use the features from all EEG sub-bands. The Random Forest algorithm achieves the best classification performance. Furthermore, balancing the data across the classes with SMOTE improves classification performance for all algorithms. An overall accuracy of 96.44% was obtained with the Random Forest algorithm. The second most successful method was K-NN, with an overall accuracy of 96.23%, very close to that of Random Forest. The classification accuracy is over 95% for the algorithms used in this study. When the sub-bands are evaluated separately, Table 8 shows that the features obtained from the theta sub-band are more successful than those from the other EEG sub-bands in classifying different levels of consciousness.
Figures 10 and 11 show the ROC curves of the test data for the classification of the six GCS classes from the unbalanced and balanced data using the Random Forest algorithm. A separate curve is given for each GCS class. AUC stands for "area under the ROC curve". In the curves, the horizontal axis gives the false positive rate and the vertical axis gives the true positive rate.
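A per-class AUC of the kind reported for each GCS class can be computed in a one-vs-rest fashion. The sketch below uses the rank-based definition of AUC (the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one), which equals the area under the ROC curve. The function name, scores, and labels are hypothetical and are not taken from the study.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC for one class against all others: the fraction of
    (positive, negative) pairs in which the positive sample has the
    higher score, with ties counted as one half."""
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for the class "GCS 8"
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = ["GCS 8", "GCS 8", "GCS 3", "GCS 8", "GCS 3"]
auc_gcs8 = auc_one_vs_rest(scores, labels, "GCS 8")
```

In this toy case, five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6.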

Discussion
When the energy values of EEG signals are analyzed, GCS 3 patients have lower energy in all sub-bands of all EEG channels. In the beta sub-band of the F3-F4 channel, energy increases as GCS increases. The energies of the EEG signals of GCS 4, GCS 5, and GCS 6 patients are close to each other. For GCS 8 patients, a higher energy value than for the other groups was obtained only in the beta sub-band of the F3-F4 and C3-C4 channels. In the other channels and sub-bands, the EEG signals of GCS 3 and GCS 8 patients have similar energy values. The very low number of instances for GCS 8 patients causes low discrimination in the sub-bands other than beta. The topographic plots of the average energy values show that the frontal lobe of GCS 8 patients is more active and that the parietal lobe of GCS 3 and GCS 4 patients is active. GCS 5, 6, and 7 patients showed more activity in the temporal lobe. In the 10-20 system, the F7 point is near the centers for rational activities, the F8 point is near sources of emotional impulses, the C3 and C4 points deal with sensory and motor functions, the P3 and P4 points are associated with perception and differentiation, and the T3 and T4 locations are concerned with emotional processing [72,73]. Since the frontal lobe is the brain region responsible for conscious thinking, we can conclude that GCS 8 patients have a higher LeOC. The parietal lobe integrates sensory input from numerous parts of the body and processes information related to touch [74]. The higher energy in the parietal EEG channel of GCS 3 and GCS 4 patients indicates that even patients with very low awareness perceive the sense of touch. The temporal lobe plays a part in primary auditory perception, such as hearing [75]. Therefore, the greater temporal lobe activity in GCS 5, GCS 6, and GCS 7 patients indicates that talking to these patients makes a difference in their EEG waves.
The statistical test results (Table 5) show that most of the features reveal a statistically significant difference between the groups. As can be seen in Figure 8, talking to or touching the patient during the interaction with the nurse and the family increases the frequency content of the EEG signals of GCS 8 patients. The maximum power values show the change in the EEGs of GCS 8 patients even more clearly. The average power values of the sub-bands in Figure 9 increase as the LeOC increases. In this graph, which shows the power values of the family interaction stage, GCS 8 patients have a significantly higher power value in the beta sub-band than other levels of consciousness. The power values in the PSD plots of GCS 3 patients are much lower than those of other levels of consciousness. The analysis of power values thus shows that GCS is correlated with the energy values.
Balancing the data with the SMOTE method increased the success of classifying consciousness levels. Figures 10 and 11 show the influence of data balancing on classification success, particularly for the GCS 8 class: the area under the ROC curve for the GCS 8 class is higher for the balanced data than for the unbalanced data. Classification with features obtained from the theta sub-band reaches a success rate of 93.10%. The beta, alpha, and delta sub-bands have classification success rates close to each other. When a person is awake with their eyes closed, they produce alpha waves; when they are engaged in mental activity, they produce beta waves; theta waves occur between sleep and wakefulness; and delta waves occur during deep sleep [76]. Theta waves are usually related to drowsiness or heightened emotional states [77]. Theta frequency variations become more substantial during the shift from awake to sleepy states [78]. Therefore, the features extracted from theta waves were more successful in classifying consciousness levels.
The statistical analysis of EEG shows different patterns in the frequency bands for different LeOC. During family and nurse interaction, a more complex organization of the EEG at the higher GCSs and changes in the distribution of the alpha and beta sub-band range were found. The analysis of the EEG signals therefore indicates that patients between GCS 3 and 8 can be aware of what is going on around them, even when they are in a coma. Furthermore, our findings demonstrate that the energy, power, and frequency variations of EEG signals obtained from patients with different LeOCs can be used to objectively assess GCSs. Understanding that comatose patients are aware of their surroundings in ICUs will be beneficial for the patient's care. This awareness can support effective care and treatment services. Discontinuing useless care will improve the provision of psychological requirements in patient care.

Related Works
Currently, quantitative analysis studies show a distinctive change in EEG signals in the state of coma and brain death. Kustermann et al. [79] found that statistically significant differences in the spectral power of EEG obtained from comatose patients (within 24 h of cardiac arrest) occur at 5.2-13.2 Hz and above 21 Hz. According to [80], the power value indicated a high intensity of neurophysiological EEG activity in 19 coma patients and its absence in 17 quasi-brain-death patients. Additionally, Zhu et al. [81] found that, compared with the coma group, the relative power spectral density values of the brain death group were decreased in the delta band and increased in the alpha and beta bands. Claassen et al. [82] found a large decrease in power across all frequency bands. Lehembre et al. [83] compared differences in power spectra between VS patients and MCS patients. They noted that VS patients showed higher delta power but lower alpha power compared with MCS patients. Yao Miao et al. [84] indicated that the energy of the EEG signals of comatose patients was higher than that of patients with brain death. Bai et al. [85] and Stefan et al. [86] found reduced power in the alpha range and increased power in the delta and theta range in VS patients compared with MCS patients. Differential diagnosis between MCS and VS is currently a challenge for researchers. Therefore, many studies focus on differentiating between VS and MCS (DOC patients) [87][88][89]. No study in the literature has dealt with determining GCS levels in comatose patients (GCS 3-8). However, there is a considerable amount of research on distinguishing DOC patients using EEG spectral power measures and other methodologies [32,33,[87][88][89]. For example, Kempny et al. [90] compared the mean amplitudes of event-related potential (ERP) data in EEG recordings between VS/UWS and MCS patients. Statistical analysis was performed on the signals recorded while the patients' own name and someone else's name were spoken as auditory stimuli. Naro et al. [91] explored functional connectivity during resting-state EEG in 17 patients with VS/UWS and 15 patients with MCS using multiplex and multilayer network analyses. These studies differ from ours in that they do not aim to discriminate GCS scores in deep coma patients. The majority of the research is based on statistical analysis results. Our study contributes to the literature by demonstrating the usefulness of EEG spectral analysis in discriminating the LeOC in individuals in a deep coma.
A review of the literature shows that studies mostly aim to differentiate between brain death and coma. There are few studies that differentiate consciousness levels. The studies are mostly based on statistical analysis, and we did not encounter classification studies. This study is, to the best of our knowledge, the first in the literature in terms of the recording scenario, the extracted features, and the purpose. Some studies in the literature that analyze EEG signals with GCS are summarized in Table 9. These studies are different from our study and do not aim to separate the consciousness levels of patients in a deep coma. Therefore, our study contributes to the literature by demonstrating the success of EEG signals in discriminating the LeOC.
In our prior research [97,98], we used deep neural networks to classify consciousness levels without extracting features from EEG recordings. In that study, consciousness levels were classified into two classes (low level of consciousness and high level of consciousness) with 83.3% accuracy. In another study of ours [98,99], the level of consciousness was classified with an accuracy of 90.3% using features obtained from the nonlinear analysis of the EEG signals. Unlike the previous two studies, this study showed that the spectral content of the EEG waves discriminates between different levels of consciousness with an accuracy of 96.44%.

Limitations and Future Work
Although the proposed study achieved a high classification performance across different LeOC, there are still some limitations. First, the number of coma patients needs to be increased, especially the number of patients with GCS 8. Second, the statistical separation between the groups needs to be improved. Possible ways to improve the statistical analysis are to ensure that the number of patients is equal for each GCS and to obtain new features. In future studies, we will address these limitations to better reveal the difference between levels of consciousness.

Conclusions
In this study, we proposed an EEG analysis system to differentiate consciousness levels. For this purpose, EEG signals were recorded while auditory and tactile stimuli were applied to the patients by the nurse and their families; EEG signals were also obtained before and after the stimuli. The features extracted from the sub-bands of the EEG signals were statistically analyzed for different GCSs. While the features were successful in differentiating most GCSs, this study demonstrates that GCS 3 and GCS 8 coma patients differ from other consciousness levels in terms of decreased theta sub-band energy values. This is, to the best of our knowledge, the first study to classify patients in a deep coma (GCS between 3 and 8) with 96.44% classification performance using the proposed recording and analysis method. This study contributes to the literature by showing that EEG signals are successful in differentiating the LeOC in a deep coma.
Supplementary Materials: The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/diagnostics13081383/s1: Table S1: Kruskal-Wallis test post-hoc results (Dunn's test, Bonferroni correction).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.