An Effective Mental Stress State Detection and Evaluation System Using Minimum Number of Frontal Brain Electrodes

Mental stress is currently a common social problem. Stress reduces human functionality during routine work and may lead to severe health problems. Detecting stress is important in education and industry to determine the efficiency of teaching, to improve education, and to reduce the risk of human errors caused by workers' stressful situations. Therefore, the early detection of mental stress using machine learning (ML) techniques is essential to prevent illness and health problems, improve the quality of education, and improve industrial safety. The human brain is the main target of mental stress. For this reason, an ML system is proposed which investigates electroencephalogram (EEG) signals from thirty-six participants. Extracting useful features is essential for an efficient mental stress detection (MSD) system. Thus, this framework introduces a hybrid feature-set that feeds five ML classifiers to detect stress and non-stress states and to classify stress levels. To produce a reliable, practical, and efficient MSD system with a reduced number of electrodes, the proposed MSD scheme investigates electrode placement at different sites on the scalp and selects the site that has the greatest impact on the accuracy of the system. Principal component analysis (PCA) is also employed to reduce the features extracted from these electrodes and lower model complexity, where the optimal number of principal components is determined using a sequential forward procedure. Furthermore, the scheme examines the minimum number of electrodes placed on the site that has the greatest impact on stress detection and evaluation. To test the effectiveness of the proposed system, the results are compared with other feature extraction methods from the literature. They are also compared with state-of-the-art techniques for stress detection.
The highest accuracies achieved in this study are 99.9% (sd = 0.015) for identifying stress and non-stress states using only two frontal brain electrodes, and 99.26% (sd = 0.08) for distinguishing between stress levels using three frontal electrodes. The results show that the proposed system is reliable: for detecting stress and non-stress and for evaluating stress levels, respectively, the sensitivity is 99.9 (0.064) and 98.35 (0.27), the specificity is 99.94 (0.02) and 99.6 (0.05), the precision is 99.94 (0.06) and 98.9 (0.23), and the diagnostic odds ratio (DOR) is ≥ 100 in both cases. This shows that the proposed framework has compelling performance and can be employed for stress detection and evaluation in medical, educational, and industrial fields. Finally, the results verified the efficiency and reliability of the proposed system in predicting stress and non-stress in new patients, where it achieved an accuracy of 98.48% (sd = 1.12), a sensitivity of 97.78% (sd = 1.84), a specificity of 97.75% (sd = 2.05), a precision of 99.26% (sd = 0.67), and a DOR ≥ 100 using only two frontal electrodes.


Introduction
Mental stress is the human body's response to the psychosocial or physical situations to which it is exposed. It affects people all over the world regardless of their age, gender, or occupation. This is because of the increasing work difficulties, rising pressure, and demanding daily activities that people face every day [1]. Currently, mental stress is considered the main cause of several health problems. These problems include heart attacks, strokes, nervousness, depression, post-traumatic stress disorder (PTSD), and immunological disorders. Stress can also influence brain activity and structure [2]. Therefore, early detection of stress is essential to prevent illness and reduce the chances of clinical brain damage and other health problems.
Detection and evaluation of mental stress are also essential in fields such as education and industry. In educational e-learning settings, stress may be a major factor that affects students' performance in exams. Stress levels may increase with unhealthy examination schemes in higher education. In such systems, students may be rated on their performance over a limited number of hours only. Accordingly, their grades may not represent their real knowledge and intelligence but rather their ability to cope with exam-induced stress [3]. Additionally, in an offline educational setting, frequent evaluation of students' mental state can be used to set the pace of teaching and improve educational outcomes [4]. For industrial safety, recognizing hazards that occur due to human mistakes is essential. This is because unsafe and careless behavior of workers and the absence of safety measures are the major reasons for human-caused problems. Contributing factors include lack of sufficient sleep, poor diet, physical deficiencies, and fatigue, which can lead a person into a stressful situation.
Common methods to measure stress include questionnaires, where the mental effort that applicants put into a task is evaluated [5]. However, such methods can be subjective, i.e., they depend on the personal opinion of the applicants, based on psychophysiological [6] or personal measures [7]. Therefore, these methods are not accurate enough due to individuals' inconsistencies. Moreover, this process becomes challenging when the number of individuals to be evaluated in real time increases. Thus, automated stress detection algorithms that can correctly recognize and assess stress even with a large number of subjects are important in identifying stress factors and facilitating stress management. These approaches embrace the use of portable devices such as mobiles, remotes, or wearable sensing devices to gather physiological signals, such as electrocardiograms (ECGs), near-infrared spectroscopy (NIRS), functional magnetic resonance imaging (fMRI), electrocorticography (ECoG), or electroencephalograms (EEGs) [8,9].
Recently, brain activity measurement has been verified as an efficient method for imaging emotional stress changes [10]. NIRS and fMRI measure brain activity using blood flow in the brain. The strength of fMRI is its ability to capture signals in the brain with an outstanding resolution; nevertheless, the measurements are deferred until the state of the brain changes. In contrast, NIRS has the ability to describe only the state of the brain, and the signal is eventually captured through blood flow. EEG and ECoG measure brain signals as well. Although ECoG can measure wide-bandwidth signals, a surgical procedure is needed to insert electrodes on the skull to detect these signals. EEG is measured non-invasively; it employs a process that requires wearing a helmet. It measures signals from the scalp rather than from the brain itself [3]. Therefore, EEG is preferred over other methods. EEG is commonly used to measure stress [11-14] due to the increasing availability of EEG systems as low-cost wearable devices [15]. Additionally, it has a comparatively greater temporal resolution and it can capture fast and dynamically varying brainwave patterns in complex stress scenarios [9]. Therefore, in this study, EEG is used. The main purpose of this paper is to identify the mental state of a person by analyzing EEG signals. EEG is a noisy signal. Medical signal processing (MSP) techniques play an important role in removing this noise and retaining only the frequency bands containing informative data that describe mental stress. MSP can also extract useful features from EEG signals. Many feature extraction techniques are used for the analysis of EEG signals. Some of these methods include Hjorth parameters [16,17], power spectral density (PSD) [18,19], common spatial patterns (CSPs) [9,14], statistical-based methods [20,21], and wavelet-based methods [22,23].
Features extracted using feature extraction techniques are fed into a classification model to either classify stress, non-stress, or stress levels. Classification based on machine learning techniques has the ability to accurately classify stress states. This can assist doctors to understand the signals well, make an accurate diagnosis and provide appropriate treatment [1,8].
Recently, many studies have validated the associations between EEG pattern and emotional stress state [24]. These studies examine stress, non-stress, and stress levels from EEG signals. The authors in [25] proposed a framework that applies the concepts of network physiology and information theory to extract valuable features. A random forest (RF) classifier was constructed to distinguish between stress and non-stress situations. Mahajan, in [18], proposed two feature-sets to classify stress using an artificial neural network (ANN). In [6], a threshold-based method was introduced to detect stress. The number of peaks of the theta band of an EEG signal was counted. If this count exceeded the threshold, then it was considered a stressful state. The authors of [14] used CSP as a feature extractor. A linear discriminant classifier (LDA) was used to distinguish between stress and non-stress. A support vector machine classifier was employed in [26] to differentiate between stress levels using power spectral density (PSD) features. PSD features were also used in [19] to discriminate between stress and non-stress conditions using an ANN classifier. In the research articles [20,27], we assessed stress levels using a support vector machine (SVM) classifier. In [28], an SVM classifier was also used to classify the mental stress state of individuals based on the changes in EEG power spectral density, especially in the theta and alpha bands, where the average classification accuracy reached 79% and 78%, respectively. The authors in [4] employed a single channel EEG device to examine the use of frontal EEG to determine stress levels. An SVM classifier was employed to evaluate stress levels and the accuracy achieved was in the range of 65%-75%. The authors in [10] employed classical machine learning classifiers with a feature selection approach to detect stress levels. In [11], a hybrid feature pool was constructed to recognize stress and non-stress using KNN. Donghoo et al. 
in [29] proposed a genetic algorithm (GA)-based feature selection algorithm and used a k-nearest neighbor (KNN) classifier to identify stress and non-stress. Fares et al. in [30,31] proposed an MSD system to detect stress using an SVM classifier. Most of the previous techniques used a large number of electrodes to construct their MSD systems. The performance of such systems was not sufficient for real applications. They also lacked efficient feature extraction, selection, or reduction techniques to enhance the performance of the subsequent classifier.
Other MSD systems were constructed based on deep learning techniques. Li et al. [24] used a fused deep learning architecture to extract discriminative spatial-temporal EEG features for detecting emotional stress. Hefron et al. [32] presented a novel convolutional recurrent neural classifier with multipath subnetworks for detecting stress. Kuanar et al. [33] built a recurrent neural network algorithm to perform the cognitive analysis of EEG signals. Such methods based on deep learning have several drawbacks. First of all, deep networks usually require large amounts of training data to perform efficiently. Furthermore, deep features extracted from deep networks are commonly not separable, are highly correlated, and have a huge feature dimensional space. Additionally, the current methods give no weight to the selection of appropriate features from specific domains for deeper fundamental analysis. A summary of the recent related techniques from the literature that are close to the proposed MSD system is presented in Table 1. The key aim of this paper is to construct a portable real-time EEG-based mental training neuro-feedback system to identify and evaluate stress levels efficiently in real time. Thus, a new MSD system is proposed to classify stress, non-stress, and stress levels. Relevant features are the key factor in a powerful MSD system. Therefore, the proposed technique presents a hybrid feature subset for MSD. Five classifiers are used to detect stress situations; then the stress level is assessed and sorted into low and high. In order to generate an efficient MSD system with a lower number of channels, the proposed MSD scheme first explores the placement of electrodes at different sites on the scalp and selects the location which has the highest influence on the accuracy of the system. Furthermore, principal component analysis (PCA) feature reduction is applied in a sequential forward search to select the optimal number of principal components and reduce the dimension of the feature space, which usually enhances the performance of the subsequent classifier. In addition, in order to make the proposed MSD system more portable and easier to set up, the number of electrodes is further reduced by selecting the minimum number of channels from the site which has the greatest impact on the stress detection rates. To prove the efficiency of the proposed method, the results are compared with three feature-sets proposed in the literature. They are also compared with state-of-the-art techniques.

Participants
Initially, sixty-six subjects (age: µ = 18.6, σ = 0.87) participated in the data collection procedure; 47 of them were women and 19 were men. Following visual examination of the EEG by an expert neurophysiologist, thirty of them were excluded because of poor EEG quality (an excessive amount of myographic and oculographic artifacts), so the final sample size is thirty-six participants [40]. Details of the participants can be found in Table 2. Subjects were eligible for the study if they had no clinical signs of cognitive or mental impairment or of verbal or non-verbal learning disabilities, and had normal or corrected-to-normal vision and normal color vision. Exclusion criteria were the use of medication or alcohol addiction, psychoactive drug, and neurological or illnesses.

EEG Collection Procedure
Mental arithmetic is commonly used as a standard stressor [41]. Therefore, in this paper, mental arithmetic tasks are used. The tasks consist of counting during the relaxed state (non-stress) and serial subtraction of two numbers during the stress state. Every serial subtraction trial begins with the verbal presentation of a 4-digit number and a 2-digit number to be subtracted from it. Serial subtraction over 15 min is known to induce psychosocial stress [42], and it was used in [40]. In this manner, the design of the data collection procedure in [40] required exhaustive cognitive activity from the engaged individuals. This strenuous mental load leads to a variation in the emotional background as the individual expends more effort to solve the tasks. Depending on the number of arithmetic operations performed per minute, participants were grouped into two classes. The first class is called "high stress" and contains participants who performed the arithmetic tasks with difficulty and made more effort to complete them. The other class is called "low stress" and consists of participants who managed the arithmetic tasks without difficulty or excess effort. Thus, in this study, two mental stress levels are considered.
Subjects were asked to sit on a comfortably reclined armchair in a dark soundproof hall. Before starting the experiment, they were asked to relax throughout the resting state (non-stress state) and informed about the arithmetic task. Subjects were instructed to count accurately and quickly for 3 min without talking or moving in the pace they chose. Afterwards, they performed serial subtraction and the EEG was recorded for 1 min during this state (stress condition).

Proposed Mental Stress Detection System
In this paper, a new system is proposed to detect stress states and classify stress levels. The proposed system consists of five steps as shown in Figure 2. Initially, the data are preprocessed to remove noise and artifacts. Subsequently, the data are segmented into frames by a sliding window. Afterwards, valuable features are extracted using time and frequency feature extraction techniques to yield two feature-sets, which are later combined to form a hybrid feature-set. Then, a feature reduction step is used to reduce the dimension of the combined feature-set. Finally, five machine learning classifiers are used to detect stress and evaluate stress levels. The proposed MSD system is composed of three experiments. In the first experiment, each of the two feature-sets is used individually to construct the MSD system, and then the two are combined to form a hybrid feature-set to examine the influence of combining time and frequency features. In experiment two, to reduce the number of channels employed to construct the MSD system and enhance the effectiveness of the proposed approach, the electrodes are placed on different sites on the scalp and the location which has the greatest influence on the accuracy of the system is chosen. Finally, in experiment three, the PCA feature reduction method is adopted to choose the optimum number of principal components in a sequential forward search and reduce the dimension of the feature space. Reducing the feature space commonly improves the performance of an MSD system.

Data Preprocessing
An EEG signal is usually noisy due to power line interference, electromyographic (EMG) and electrocardiographic (ECG) activity, and subject movement. In order to reduce noise and artifacts, a high-pass filter with a 0.5 Hz cut-off frequency, a low-pass filter with a 500 Hz cut-off frequency, and a 50 Hz notch filter were used. The filters were Butterworth IIR filters of order 4. The phase distortion of the filters was handled using forward-reverse filtering. In addition, a de-noising method based on multilevel wavelet decomposition was employed [43]. The number of wavelet levels was 5, and the mother wavelet was a Symlet with 4 vanishing moments. The signals were further smoothed using a Savitzky-Golay filter [44].
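The filtering stages described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `preprocess_eeg`, the notch quality factor, and the Savitzky-Golay window length and polynomial order are assumptions, and the wavelet de-noising stage is omitted. The low-pass stage is left as an optional parameter because a digital cut-off has to lie below half of whatever sampling rate is used.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, savgol_filter, sosfiltfilt


def preprocess_eeg(x, fs, highpass_hz=0.5, notch_hz=50.0, lowpass_hz=None,
                   order=4, notch_q=30.0, smooth_window=11, smooth_poly=3):
    """Zero-phase filtering of one EEG channel (1-D array sampled at fs Hz).

    notch_q, smooth_window and smooth_poly are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)

    # 4th-order Butterworth high-pass, run forward and backward so that the
    # phase distortion of the IIR filter cancels out.
    sos_hp = butter(order, highpass_hz, btype="highpass", fs=fs, output="sos")
    y = sosfiltfilt(sos_hp, x)

    # Optional Butterworth low-pass; the cut-off must be below fs / 2.
    if lowpass_hz is not None:
        if not 0 < lowpass_hz < fs / 2.0:
            raise ValueError("lowpass_hz must lie between 0 and fs / 2")
        sos_lp = butter(order, lowpass_hz, btype="lowpass", fs=fs, output="sos")
        y = sosfiltfilt(sos_lp, y)

    # Narrow notch at the power-line frequency, also applied forward-backward.
    b, a = iirnotch(notch_hz, notch_q, fs=fs)
    y = filtfilt(b, a, y)

    # Savitzky-Golay smoothing of the filtered signal.
    return savgol_filter(y, window_length=smooth_window, polyorder=smooth_poly)
```

Calling `preprocess_eeg(channel, fs=500)` on each channel in turn would apply the same chain to a whole recording.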

Segmentation
The pre-processed EEG signals are segmented into 4-s segments using a sliding window with a 1-s increment step. The same segmentation procedure as in [12] is followed, which stated that a window size of 4 s is suitable and common for classification using a mental arithmetic task. Each of these segments is considered a single trial. Thus, each trial has a size of N × T, where N = 19 is the number of channels and T corresponds to the 4 s of one segmented frame sampled at 500 Hz (2000 samples). All trials are used later in the feature extraction and classification steps.
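The sliding-window segmentation can be illustrated with a short sketch. The function name `segment_trials` and the (channels, samples) array layout are assumptions made for this example.

```python
import numpy as np


def segment_trials(eeg, fs, window_s=4.0, step_s=1.0):
    """Split a (channels, samples) recording into overlapping trials.

    Returns an array of shape (n_trials, channels, window_samples).
    """
    eeg = np.asarray(eeg)
    win = int(round(window_s * fs))    # samples per trial
    step = int(round(step_s * fs))     # samples between trial starts
    n_samples = eeg.shape[1]
    if n_samples < win:
        return np.empty((0, eeg.shape[0], win), dtype=eeg.dtype)
    starts = range(0, n_samples - win + 1, step)
    return np.stack([eeg[:, s:s + win] for s in starts])
```

For example, a 60-s recording with 19 channels sampled at 500 Hz yields 57 trials, each of size 19 × 2000.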

Feature Extraction
To construct a machine learning classifier, useful features must be extracted from the segmented EEG data. For this purpose, two subsets of features are extracted from each trial of an EEG signal.
Previous studies have demonstrated that, to detect emotional activities like stress with higher accuracy, it is preferable to extract features from the frequency domain [45]. Other studies mentioned that fusing time and frequency features has a better impact on the detection rates of stress [46,47]. Therefore, in this paper, we propose using time and frequency features and study the impact of fusing them. Feature-Set 1 is a frequency feature-set which includes features that were used in [48]. These features are the median frequency (MDF), the modified median frequency (MFMD), and the spectral moments (SM). MDF computes the median normalized angular frequency. SM calculates three power spectral moments from each EEG segment, corresponding to the root squared zero-, second-, and fourth-order moments. MFMD is the frequency at which the spectrum is split into two sections with equivalent amplitude. In other words, it determines the median of the amplitude spectrum in each segment, calculated using the Fourier transform. Feature-Set 2 consists of the following features: the root mean square (RMS) amplitude of the signal, which was used in [49,50], and the coefficients of a sixth-order autoregressive (AR) model, which were used in [51,52]. The AR model describes each sample of an EEG segment as a linear combination of the preceding samples plus a white noise error term. The number of coefficients calculated depends on the chosen model order. These coefficients are used as features [53]. Feature-Sets 1 and 2 are combined to form a hybrid feature-set called Feature-Set 3. The equations representing the features are shown below.
In the frequency-domain features, M is the size of the power spectral density, PSD_j is the jth line of the power spectral density, and A_k is the EEG amplitude spectrum at frequency index k.
For the spectral moments, m_0, m_2, and m_4 are the root squared zero-, second-, and fourth-order moments of the sampled EEG segment. The discrete Fourier transform of an EEG segment can be expressed as a function of frequency, X_k. P_k is the phase-excluded power spectrum, which is equal to the product of X_k and its conjugate X_k* divided by N, and k is the frequency index.
For the AR model, a_d are the AR coefficients, e is the white noise or error sequence, D is the order of the AR model, and d = 1, 2, ..., D indexes the coefficients.
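As a rough illustration of how the two feature-sets might be computed for one trial, the sketch below implements each feature under stated assumptions. It is not the authors' code. The median frequencies are returned in Hz rather than as normalized angular frequencies, the spectral moments are taken as sqrt(sum_k k^p P_k) for p = 0, 2, 4 with a signed integer frequency index k, and the AR coefficients follow the convention x[n] = sum_d a_d x[n-d] + e[n], estimated with the Yule-Walker equations. All function names are hypothetical.

```python
import numpy as np


def rms(x):
    """Root mean square amplitude of a segment."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))


def ar_coefficients(x, order=6):
    """Yule-Walker estimate of a_1..a_D in x[n] = sum_d a_d x[n-d] + e[n]."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0..order].
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    toeplitz = np.array([[r[abs(i - j)] for j in range(order)]
                         for i in range(order)])
    return np.linalg.solve(toeplitz, r[1:])


def median_frequency(x, fs, use_power=True):
    """Frequency (Hz) splitting the spectrum into two equal halves.

    use_power=True uses the power spectrum (MDF); use_power=False uses the
    amplitude spectrum (MFMD).
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x))
    if use_power:
        spectrum = spectrum ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    cumulative = np.cumsum(spectrum)
    idx = int(np.searchsorted(cumulative, cumulative[-1] / 2.0))
    return float(freqs[idx])


def spectral_moments(x):
    """Root squared zero-, second- and fourth-order spectral moments."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.fft.fft(x)
    P = (X * np.conj(X)).real / n      # phase-excluded power spectrum
    k = np.fft.fftfreq(n) * n          # signed integer frequency index
    return tuple(float(np.sqrt(np.sum((k ** p) * P))) for p in (0, 2, 4))


def hybrid_features(trial, fs):
    """Concatenate frequency and time features over all channels of a trial."""
    feats = []
    for ch in np.asarray(trial, dtype=float):
        feats.append(median_frequency(ch, fs, use_power=True))    # MDF
        feats.append(median_frequency(ch, fs, use_power=False))   # MFMD
        feats.extend(spectral_moments(ch))                         # SM
        feats.append(rms(ch))                                      # RMS
        feats.extend(ar_coefficients(ch, order=6))                 # AR(6)
    return np.array(feats)
```

In this sketch each channel contributes 12 values, so a 19-channel trial produces a vector of 228 features; the dimensionality of the feature-set in the study itself may differ.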

Feature Reduction
The hybrid Feature-Set 3 extracted in the previous step has a high-dimensional feature space. For this reason, a feature reduction process is required to lower the dimension of the feature space, reduce the computational cost of training, decrease the complexity of the MSD system, improve the efficiency of the stress detection process, and produce a reliable, portable, real-time MSD system. Principal component analysis (PCA) is a common feature reduction technique; here it is used to reduce the size of Feature-Set 3 by applying a covariance analysis. PCA shrinks the number of features in Feature-Set 3 to a lower number of principal components [54]. These principal components convey the variance of the features in Feature-Set 3. The steps of the PCA technique used to reduce Feature-Set 3 are as follows. First, we calculate the covariance matrix of Feature-Set 3. Afterwards, we determine the eigenvectors and the eigenvalues of the covariance matrix. Next, we select the number of principal components using a sequential forward procedure. Finally, a reduced feature-set is produced.
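The four PCA steps listed above can be sketched as follows. This is a minimal illustration under assumptions: the sequential forward procedure is interpreted as adding principal components one at a time in order of decreasing eigenvalue and keeping the count that gives the best score, and the score used here (leave-one-out accuracy of a 1-nearest-neighbor rule) is a stand-in for whatever criterion the study used. The function names are hypothetical.

```python
import numpy as np


def pca_fit(X):
    """Return the feature mean and eigenvectors sorted by decreasing eigenvalue."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)               # step 1: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # step 2: eigen-decomposition
    order = np.argsort(eigvals)[::-1]
    return mean, eigvecs[:, order], eigvals[order]


def loo_1nn_accuracy(Z, y):
    """Leave-one-out accuracy of a 1-nearest-neighbor rule (Euclidean)."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                 # a sample cannot match itself
    return float(np.mean(y[d.argmin(axis=1)] == y))


def select_n_components(X, y, score_fn=loo_1nn_accuracy, max_components=None):
    """Step 3: add components one by one and keep the best-scoring count."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean, components, _ = pca_fit(X)
    Z = (X - mean) @ components
    limit = max_components or Z.shape[1]
    best_k, best_score = 1, -np.inf
    for k in range(1, limit + 1):
        score = score_fn(Z[:, :k], y)
        if score > best_score:
            best_k, best_score = k, score
    # Step 4: the reduced feature-set is the first best_k component scores.
    return best_k, best_score, Z[:, :best_k]
```

In practice the PCA projection and the component count would normally be fitted on training data only and then applied to held-out data, so that the reported accuracy is not influenced by the test samples.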

Classification
The three feature-sets generated in the feature extraction phase are used to construct five well-known classification models: linear discriminant analysis (LDA), k-nearest neighbor (KNN), linear and cubic support vector machine (SVM) classifiers, and a random forest classifier. The distance metric used for KNN is Euclidean, and the number of neighbors (K) is equal to 1. These classification models are first used to detect stress and non-stress states. Next, they are used to distinguish between two stress levels (low and high). All models are tested using 5-fold cross-validation.
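To make the evaluation protocol concrete, the sketch below runs 5-fold cross-validation for the KNN configuration described above (K = 1, Euclidean distance) and reports the mean and standard deviation of the fold accuracies. It is a simplified illustration with hypothetical function names; the random assignment of trials to folds and the seed are assumptions.

```python
import numpy as np


def knn1_predict(train_X, train_y, test_X):
    """Label each test sample with the class of its nearest training sample."""
    d = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    return train_y[d.argmin(axis=1)]


def cross_validate(X, y, predict_fn=knn1_predict, n_folds=5, seed=0):
    """Return the mean and standard deviation of accuracy over n_folds folds."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        predicted = predict_fn(X[train_idx], y[train_idx], X[test_idx])
        accuracies.append(np.mean(predicted == y[test_idx]))
    return float(np.mean(accuracies)), float(np.std(accuracies))
```

Any other classifier can be evaluated the same way by passing a different `predict_fn` with the same three arguments. Assigning folds by participant rather than by trial would instead estimate performance on subjects not seen during training.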

Performance Evaluation
Several metrics are used to evaluate the performance of the proposed system. These metrics are the classification accuracy (CA), sensitivity, specificity, Goodness Index (G), precision, Matthews correlation coefficient (MCC), diagnostic odds ratio (DOR), and receiver operating characteristic (ROC) analysis.
In these metrics, TP is the number of true positives, i.e., positive class instances that are correctly classified. TN is the number of true negatives, i.e., negative class instances that are correctly classified. FP is the number of false positives, i.e., negative class instances that are incorrectly classified as the positive class, and FN is the number of false negatives, i.e., positive class instances that are incorrectly classified as the negative class.
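For reference, the metrics can be computed from the four confusion counts as sketched below. The formulas for accuracy, sensitivity, specificity, precision, MCC, and DOR are the standard definitions. The Goodness Index is assumed here to be the Euclidean distance from the ideal point, sqrt((1 - sensitivity)^2 + (1 - specificity)^2), which is one common definition and may differ from the one used in the study. The function name `stress_metrics` is hypothetical.

```python
import math


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def stress_metrics(tp, tn, fp, fn):
    """Compute evaluation metrics from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, tp + tn + fp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_den)
    # DOR is unbounded when there are no false positives or false negatives.
    dor = (tp * tn) / (fp * fn) if fp * fn else math.inf
    # Assumed definition: distance from perfect sensitivity and specificity.
    goodness = math.sqrt((1 - sensitivity) ** 2 + (1 - specificity) ** 2)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
        "dor": dor,
        "goodness": goodness,
    }
```

For example, `stress_metrics(tp=90, tn=80, fp=20, fn=10)` gives an accuracy of 0.85, a sensitivity of 0.9, a specificity of 0.8, and a DOR of 36.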

Results
The main objective of this study is to build a portable real-time EEG-based system to detect stress states and distinguish between stress levels. The mental arithmetic test is a popular stress inducer, and so it is used in this study. This study presents a new MSD system which consists of four phases, or experiments. In phase one, two feature extraction methods are used to extract valuable features from the segmented EEG data. Afterwards, a hybrid feature-set (Feature-Set 3) is formed by fusing Feature-Sets 1 and 2. In phase two, the electrode site that has the greatest influence on stress detection rates is selected. This part examines electrode placement at different sites on the scalp and selects the location which has the greatest influence on the accuracy of the system. In phase three, we investigate the optimal number of principal components using a sequential forward search strategy to reduce the dimension of the feature space. Compressing the feature space commonly improves the performance of an MSD system. Minimizing the number of EEG channels used in mental stress detection and evaluation would make the real-time EEG-based system more mobile and easier to set up and maintain. Therefore, in phase four, we examine the impact of each frontal electrode and select a minimum number of frontal electrodes in order to construct a portable MSD system. Five classifiers were built to identify stress and distinguish between two stress levels (low and high). Figure 3 illustrates the four experiments.

Experiment One Results
Two feature-sets (Feature-Sets 1 and 2) were introduced in our study. Subsequently, a hybrid feature-set (Feature-Set 3) was made by combining these two feature-sets. These feature-sets were first used to detect stress and non-stress states. The classification accuracies of the three feature-sets for this task are shown in Figure 4. It is clear from Figure 4 that Feature-Set 3 has higher accuracy than the other two feature-sets, except for LDA, where it has the same performance as Feature-Set 1. Table 3 shows the evaluation metrics using Feature-Set 3, which achieved a higher accuracy for detecting stress. The sensitivity and specificity rates are all above 99%, except for LDA, where they are above 98%, and the sensitivity of the linear SVM, which is 98.84%. The Goodness Index values are all below 0.02. Figure 5 shows the ROC curves for detecting stress for both the cubic SVM and KNN classifiers. The area under the ROC curve (AUC) is one.
feature-set (Feature-Set 3) was made by combining these two feature-sets. These feature-sets were first used to detect stress and non-stress states. The classification accuracy of those three feature-sets, when used for detecting stress and non-stress states, are shown in Figure 4. It is clear from Figure 4 that Feature-Set 3 has higher accuracy than the other two feature-sets except for LDA which has the same performance as feature-set 1. Table 3 shows the evaluation metrics using Feature-Set 3 which achieved a higher accuracy for detecting stress. The sensitivity and specificity rates are all above 99% except for LDA, which is above 98%, and the sensitivity of linear SVM which is 98.84%. On the Goodness Index, they are all below 0.02. Figure 5 shows the ROC curve for detecting stress for both a cubic SVM and a KNN classifier. The area under ROC curve (AUC) is one.   Afterwards, the three subsets of features are used to classify stress levels. The classification accuracy of the three feature-sets used to classify stress levels is shown in Figure 6. Figure 6 shows that Feature-Set 3 yielded the highest accuracies using LDA, cubic and linear SVM, KNN and random forest classifiers respectively compared to the other two feature-sets. Table 4 shows the evaluation metric using Feature-Set 3 which achieved the highest accuracy for classification of stress levels. Table 3 indicates that the sensitivity ranges between (98%-100%) and the specificity ranges between (85%-99.4%). Figure 7 shows the ROC curve for detecting stress for both Cubic SVM and KNN classifier. The Goodness Index values are in the range of (0.00634-0.0151). The area under ROC curve (AUC) is one.  Afterwards, the three subsets of features are used to classify stress levels. The classification accuracy of the three feature-sets used to classify stress levels is shown in Figure 6. 
Figure 6 shows that Feature-Set 3 yielded the highest accuracies using LDA, cubic and linear SVM, KNN and random forest classifiers respectively compared to the other two feature-sets. Table 4 shows the evaluation metric using Feature-Set 3 which achieved the highest accuracy for classification of stress levels. Table 3 indicates that the sensitivity ranges between (98%-100%) and the specificity ranges between (85%-99.4%). Figure 7 shows the ROC curve for detecting stress for both Cubic SVM and KNN classifier. The Goodness Index values are in the range of (0.00634-0.0151). The area under ROC curve (AUC) is one. Afterwards, the three subsets of features are used to classify stress levels. The classification accuracy of the three feature-sets used to classify stress levels is shown in Figure 6. Figure 6 shows that Feature-Set 3 yielded the highest accuracies using LDA, cubic and linear SVM, KNN and random forest classifiers respectively compared to the other two feature-sets. Table 4 shows the evaluation metric using Feature-Set 3 which achieved the highest accuracy for classification of stress levels. Table 3 indicates that the sensitivity ranges between (98%-100%) and the specificity ranges between (85%-99.4%). Figure 7 shows the ROC curve for detecting stress for both Cubic SVM and KNN classifier. The Goodness Index values are in the range of (0.00634-0.0151). The area under ROC curve (AUC) is one.   In order to show the ability of the proposed feature to distinguish between stress and non-stress scenarios and to separate between stress levels, two figures are plotted to present two-dimensional scatter plots of two different features of the proposed feature-set. Figure 8 shows a two-dimensional scatter plot of time domain second power spectral moment vs. time domain forth power spectral moment feature for stress and non-stress cases. Figure 9 shows a two-dimensional scatter plot of time domain second power spectral moment vs. 
time domain forth power spectral moment feature for high and low stress levels cases. These figures verify the effectiveness of the proposed features to differentiate between both stress and non-stress, and stress levels.  In order to show the ability of the proposed feature to distinguish between stress and non-stress scenarios and to separate between stress levels, two figures are plotted to present two-dimensional scatter plots of two different features of the proposed feature-set. Figure 8 shows a two-dimensional scatter plot of time domain second power spectral moment vs. time domain forth power spectral moment feature for stress and non-stress cases. Figure 9 shows a two-dimensional scatter plot of time domain second power spectral moment vs. time domain forth power spectral moment feature for high and low stress levels cases. These figures verify the effectiveness of the proposed features to differentiate between both stress and non-stress, and stress levels. In order to show the ability of the proposed feature to distinguish between stress and non-stress scenarios and to separate between stress levels, two figures are plotted to present two-dimensional scatter plots of two different features of the proposed feature-set. Figure 8 shows a two-dimensional scatter plot of time domain second power spectral moment vs. time domain forth power spectral moment feature for stress and non-stress cases. Figure 9 shows a two-dimensional scatter plot of time domain second power spectral moment vs. time domain forth power spectral moment feature for high and low stress levels cases. These figures verify the effectiveness of the proposed features to differentiate between both stress and non-stress, and stress levels.
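The two features plotted in Figures 8 and 9 can be illustrated with a minimal sketch. It assumes the usual time-domain approximation of spectral moments: by Parseval's theorem the signal energy equals the total spectral power, and differencing a signal in time approximately weights its spectrum by frequency, so the energies of the first and second differences estimate the second and fourth spectral moments. The function name and the absence of any normalization or log scaling are assumptions made for illustration; the exact definition used in this study may differ.

```python
def spectral_moments_time_domain(x):
    """Estimate the zeroth, second and fourth spectral moments of one EEG
    segment directly in the time domain (illustrative, un-normalized)."""
    d1 = [b - a for a, b in zip(x, x[1:])]      # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]    # second difference
    m0 = sum(v * v for v in x)                  # total power
    m2 = sum(v * v for v in d1)                 # second-moment estimate
    m4 = sum(v * v for v in d2)                 # fourth-moment estimate
    return m0, m2, m4


# Toy segment (hypothetical values, not EEG data).
toy_segment = [0.0, 1.0, 0.0, -1.0, 0.0]
m0, m2, m4 = spectral_moments_time_domain(toy_segment)
```

Plotting the second-moment estimate against the fourth-moment estimate for each segment, coloured by class, would give a scatter plot of the same kind as Figures 8 and 9.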

Experiment Two Results
As mentioned before, the main aim of this experiment is to explore electrode locations at different sites on the scalp and to choose the site that has the greatest impact on the accuracy of the system. This process also reduces the number of channels employed to construct the MSD system, which accordingly makes it more reliable and efficient. Figure 10 shows a bar chart that compares the performance of several classifiers constructed from Feature-Set 3 extracted from several sites on the scalp to detect stress. It is clear from this figure that, using only the frontal activation site, the accuracy of the MSD system reached 99.98% with the KNN classifier, which is higher than for all other sites. Table 5 shows the accuracy values for detecting stress and non-stress from different electrode sites. The frontal activation site also has the greatest impact on evaluating stress levels, as shown in Figure 11, which compares the performance of several classifiers constructed using the proposed Feature-Set 3 extracted from different sites on the scalp to evaluate stress levels. The highest accuracy of 99.78% was achieved using the KNN classifier. Table 6 shows the accuracy values for evaluating stress levels from different electrode sites.

Figure 11. A comparison between classification accuracies of several classifiers constructed using the proposed Feature-Set 3 extracted from different sites on the scalp to assess stress levels.

Experiment Three Results
The aim of this experiment is to investigate the influence of reducing the feature space using PCA on the performance of the MSD system. The number of principal components is chosen using a sequential forward strategy. Figure 12 shows the selection process for the optimal number of principal components for detecting stress. Figure 12 shows that, using only 58 and 15 principal components, the accuracy of detecting stress reached 100% and 99.8% with the cubic SVM and KNN classifiers, respectively.
Diagnostics. 2020, 10, x FOR PEER REVIEW 17 of 25
Figure 12. Selecting the optimal number of principal components for detecting stress.
Figure 13 represents the selection procedure for the optimal number of principal components for evaluating stress levels. It is clear from Figure 13 that, using only 30 and 54 principal components, the accuracy of classifying stress levels reached 99.4% and 99.7% with the cubic SVM and KNN classifiers, respectively.
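The selection of the number of principal components can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: components are added one at a time in order of explained variance, a leave-one-out 1-nearest-neighbour classifier stands in for the cubic SVM and KNN classifiers, the data are synthetic, and all function names are hypothetical. For brevity PCA is fitted on all samples; a stricter evaluation would fit it inside each validation fold.

```python
import numpy as np


def pca_scores(X):
    """Project mean-centred data onto its principal components,
    ordered by decreasing explained variance."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions (singular values descend).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt.T


def loo_1nn_accuracy(Z, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbour
    return float(np.mean(y[d.argmin(axis=1)] == y))


def forward_select_components(X, y, max_k):
    """Add components one by one and record the accuracy for each count;
    return the smallest count that reaches the best accuracy."""
    scores = pca_scores(X)
    accs = [loo_1nn_accuracy(scores[:, :k], y) for k in range(1, max_k + 1)]
    best_k = int(np.argmax(accs)) + 1  # argmax returns the first maximum
    return best_k, accs


# Synthetic stand-in for a feature matrix (not data from this study).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 10))
X[y == 1, :2] += 3.0  # make two dimensions informative

best_k, accs = forward_select_components(X, y, max_k=8)
```

Plotting `accs` against the number of components gives a curve of the same kind as Figures 12 and 13.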


Experiment Four Results
As stated before, one of the aims of this study is to build a portable real-time EEG-based mental training neuro-feedback system that detects and evaluates stress in real time with high accuracy. Minimizing the number of EEG electrodes used for mental stress detection and evaluation would make the real-time system more portable and easier to operate and maintain. In this experiment, the impact of each frontal electrode on stress detection and evaluation rates is first investigated individually. Afterwards, the influence of fusing the two or three electrodes that have the highest individual detection and evaluation rates is examined for both stress detection and stress level evaluation.
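The two steps described above (scoring each electrode on its own, then fusing the best ones) can be sketched as follows. This is an illustration under stated assumptions: the feature matrices are synthetic, a leave-one-out 1-nearest-neighbour classifier replaces the five-fold cross-validated classifiers used in this study, and the function names are hypothetical. The electrode labels are only dictionary keys and say nothing about the real ranking.

```python
import numpy as np


def loo_1nn_accuracy(Z, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # exclude each sample from its own search
    return float(np.mean(y[d.argmin(axis=1)] == y))


def rank_electrodes(features, y):
    """Score every electrode individually; best electrode first."""
    scores = {name: loo_1nn_accuracy(F, y) for name, F in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def fuse(features, names):
    """Concatenate the feature columns of the selected electrodes."""
    return np.hstack([features[n] for n in names])


# Synthetic per-electrode features: 40 segments x 4 features each.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
features = {name: rng.normal(size=(40, 4)) for name in ("Fp1", "Fp2", "F8")}
features["Fp1"][y == 1] += 3.0  # made informative for the example only
features["Fp2"][y == 1] += 3.0

ranking = rank_electrodes(features, y)
top_two = [name for name, _ in ranking[:2]]
fused = fuse(features, top_two)
fused_accuracy = loo_1nn_accuracy(fused, y)
```

Comparing `fused_accuracy` with the individual scores in `ranking` mirrors the comparison between single-electrode and fused-electrode results.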

It is clear from Table 7 that the Fp1 and Fp2 electrodes have the highest impact on detecting stress and non-stress. Therefore, either one can be used alone: the results show that a single frontal electrode, such as Fp1 or Fp2, is capable of detecting stress and non-stress. Nevertheless, the impact of fusing Fp1 and Fp2 is also examined; Table 8 shows the results of using Fp1 and Fp2 together to detect stress and non-stress with five-fold cross validation.
Table 7. Accuracy (%) with (95% CI) for detecting stress and non-stress, and evaluating stress levels, using each frontal electrode individually with five-fold cross validation.

Table 8 shows that fusing Fp1 and Fp2 improves the performance metrics for detecting stress; therefore, the two electrodes Fp1 and Fp2 are sufficient to construct an efficient and portable MSD system. In order to further validate the performance of the proposed system and its ability to predict whether a new person is in a stress or non-stress state, leave-subject-out validation is performed as well, and the results are shown in Table 9. The results in Table 9 show the efficiency of the proposed system in predicting stress and non-stress for new patients.
In the case of evaluating stress levels, Table 7 shows that the highest accuracy achieved using one electrode is 92.5% with the F8 electrode, followed by 92% and 91.36% with Fp1 and F7, respectively. To improve the performance of the proposed system for evaluating stress levels using the minimum number of electrodes, the impact of fusing two or three electrodes is investigated and shown in Table 10. The results in Table 10 show that the ability of the proposed system to evaluate stress levels increases when two electrodes are used. However, the performance is further improved using three electrodes (Fp1 + F7 + F8). In order to further validate the performance of the proposed system and its ability to predict stress levels for a new person, leave-subject-out validation is performed as well, and the results are shown in Table 11. However, the results in Table 11 show that the proposed system has a lower ability to predict stress levels for a new person.

Discussion
This study proposes an effective MSD system to detect stress and classify stress levels. For this purpose, the study design consists of three parts. The first part compares two feature-sets: the first consists of frequency-based features, whereas the second consists of AR and RMS features. Here, the influence of combining time and frequency features is assessed. The analysis showed that combining time and frequency features increases stress detection and stress level classification rates. In order to verify the effectiveness of our proposed system, the results are compared with the classification accuracy of three feature-sets from the literature. These include the feature extraction method proposed by Khushaba et al. [22] and the two approaches presented by Mahajan [18]. The two feature-sets reported by Mahajan comprise peak-related features and PSD features. Mahajan's first feature-set comprises four peak-related features: the number of negative peaks, the number of positive peaks, the mean of negative peaks, and the mean of positive peaks. Mahajan's second feature-set consists of the mean spectral power estimates in the beta (13-30 Hz), alpha (8-13 Hz), theta (4-8 Hz), and delta (0.5-4 Hz) EEG sub-bands. The feature-set of Khushaba et al. represents features extracted using wavelet packet decomposition, where the frequency components are selected using a mutual information estimation procedure that depends on a generalization of fuzzy entropy theory. Figure 14 shows the accuracies for detecting stress and non-stress states using our proposed Feature-Set 3 compared to the Khushaba et al. feature-set and the two feature-sets from Mahajan, using only the Fp1 + Fp2 frontal electrodes.
Figure 14. The classification accuracy of detecting stress and non-stress states using our proposed Feature-Set 3 compared to Khushaba et al. [22] and the two feature-sets from Mahajan [18] using only Fp1 + Fp2 frontal electrodes.
The proposed Feature-Set 3 was also compared with the Khushaba et al. feature-set and the two feature-sets from Mahajan for evaluating stress levels, in this case using only the Fp1 + F7 + F8 electrodes. Figure 15 shows the accuracies of classifying stress levels using our proposed Feature-Set 3 compared to these three feature-sets. It is clear from Figure 15 that the proposed Feature-Set 3 performs better in evaluating stress levels than the feature-sets from Khushaba et al. and Mahajan. The accuracies of our proposed Feature-Set 3 (82%, 99.26%, 88.1%, and 98.36%) are higher than those of the Mahajan PSD feature-set (75.4%, 96.7%, 75.7%, and 94.3%) and the Mahajan peak feature-set (71.7%, 72.3%, 63.8%, and 74.6%) using the LDA, KNN, linear SVM, and cubic SVM classifiers, respectively. The accuracies of the proposed Feature-Set 3 are also higher than those of the Khushaba et al. feature-set (81.7%, 93.9%, 82.1%, and 95.2%) using the same four classifiers. This suggests that our proposed system, using the hybrid feature-set, is more efficient and better able to identify stress levels.
Figure 15. The classification accuracies for classifying stress levels using our proposed Feature-Set 3 compared to Khushaba et al. [22] and the two feature-sets from Mahajan [18].
In the second part of the proposed technique, the selection of the scalp site and its influence on the performance of the MSD system are examined. Previous studies [4,30,31,55] showed that EEG signals acquired from the frontal site are capable of detecting and evaluating mental stress. Building on these studies, this experiment showed that frontal brain activation has the greatest impact on detecting stress and classifying stress levels. The accuracy of the proposed MSD system with frontal brain activation reached 100% for detecting stress and 99.78% for classifying stress levels. The experiment showed that only seven electrodes are enough to achieve a reliable and efficient MSD system. Note that the feature space dimension is 228 after feature extraction.
In the third part of the proposed study, PCA is used to reduce the feature space dimension and thereby lower the complexity of the MSD system. Here, we showed that PCA is capable of improving the performance of the MSD system constructed using a cubic SVM classifier, reaching an accuracy of 100% for detecting stress and 99.4% for classifying stress levels with only 58 and 54 principal components.
As is clear from the results, the KNN and SVM classifiers yielded the highest accuracies for both detecting stress and classifying stress levels. This is because the KNN classifier is simple and straightforward, and remains effective even with noisy datasets [56]. Despite its simplicity, it is able to produce high accuracy rates in medical applications [57,58]. SVM is a strong classifier that can map input vectors which are not linearly separable into a higher-dimensional feature space in which a hyperplane is able to discriminate between the classes. This mapping is obtained using a kernel function, which measures the similarity between input vectors in the higher-dimensional feature space. A linear kernel is frequently used when the classes can be separated by a straight line, which is the case for the feature space used for detecting stress, as shown in Figure 8. A quadratic kernel, in contrast, is a nonlinear kernel used when the dataset is complex and not linearly separable. Cubic kernels may further increase the accuracy. Other benefits of cubic kernels include mathematical tractability and direct geometric interpretation [51]. The feature space in the case of classifying stress levels was not linearly separable, as shown in Figure 9; therefore, the cubic kernel produced better results than the linear kernel in classifying stress levels.
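The difference between the linear and cubic kernels discussed above can be made concrete with a short sketch. It uses the common polynomial kernel form (gamma * <x, z> + coef0) ** degree; the parameter names and default values are assumptions for illustration, since the kernel scaling used in this study is not stated here.

```python
def dot(x, z):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """Similarity in the original feature space."""
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """Similarity in an implicit higher-dimensional space that contains
    all monomials of the inputs up to the given degree (3 = cubic)."""
    return (gamma * dot(x, z) + coef0) ** degree


def gram_matrix(samples, kernel):
    """Pairwise kernel values, which is all an SVM needs from the data."""
    return [[kernel(a, b) for b in samples] for a in samples]


x, z = [1.0, 2.0], [3.0, 4.0]
k_linear = linear_kernel(x, z)
k_cubic = polynomial_kernel(x, z)
K = gram_matrix([x, z], polynomial_kernel)
```

The classifier never forms the higher-dimensional vectors explicitly; it only evaluates kernel values such as those in `K`, which is why a cubic decision boundary costs little more than a linear one.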
EEG signals are commonly known to be non-stationary. This is due to changes in the states of neuronal assemblies during brain functioning. It is essential to recognize non-stationarities in the EEG signal, because they are indicative of the underlying activity. This is done by segmenting the EEG signal into smaller stationary segments [59]. Selecting an appropriate window size, one that segments the EEG signal into stationary segments, is very important so that the model fits the actual and consistent activity of the brain [60]. Several ways to check stationarity have been described in the literature. Among them, Azami et al. [61] suggested that the standard deviation determined for each segment can indicate changes in amplitude and/or frequency: because it remains unchanged within stationary intervals, differences in the standard deviation between successive windows can be used to assess stationarity. Furthermore, McEwen verified that short segments (up to 10 s) usually follow the normal distribution, while longer segments (up to 60 s) are not Gaussian. McEwen recommended that EEG be viewed as a process consisting of short Gaussian segments. The proportion of segments that can be considered Gaussian falls from 90% to 20% when the segment length rises from 4 to 60 s. Similarly, up to 90% of 4-s-long segments can be considered stationary, while this number decreases to 70%-80% for 16-s-long EEG segments [2]. Therefore, we used a 4-s segment length.
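The segmentation and the standard-deviation check described above can be sketched as follows. This is a minimal illustration assuming non-overlapping windows and a caller-supplied sampling rate `fs`; the function names, the toy signal and the sampling rate are hypothetical and are not taken from this study.

```python
from statistics import pstdev


def segment(signal, fs, window_s=4.0):
    """Split a signal into non-overlapping windows of window_s seconds.
    Trailing samples that do not fill a whole window are dropped."""
    n = int(round(fs * window_s))
    if n <= 0:
        raise ValueError("window must contain at least one sample")
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def std_changes(segments):
    """Standard deviation of each window and its absolute change between
    successive windows; large changes hint at non-stationarity."""
    stds = [pstdev(s) for s in segments]
    changes = [abs(b - a) for a, b in zip(stds, stds[1:])]
    return stds, changes


# Toy signal sampled at 2 Hz: a flat stretch, an oscillation, a short tail.
toy_signal = [1.0] * 8 + [0.0, 2.0] * 4 + [5.0] * 3
windows = segment(toy_signal, fs=2, window_s=4.0)
stds, changes = std_changes(windows)
```

In this toy case the standard deviation jumps from the first window to the second, which is the kind of change the check is meant to flag.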
It was reported in [62,63] that, for a medical system to be reliable, it should achieve a sensitivity greater than or equal to 80%, a specificity greater than or equal to 95%, a precision greater than or equal to 95%, and a DOR greater than or equal to 100. The results in Table 8 show that the proposed system is reliable and can be used to detect stress and non-stress, as the sensitivity is 99.9, the specificity is 99.94, the precision is 99.81, and the DOR is 1057474 using a KNN classifier constructed with only the Fp1 and Fp2 frontal electrodes. The results in Table 10 verify that the proposed system is also reliable and capable of evaluating stress levels, as the sensitivity is 99.6, the specificity is 98.9, the precision is 98.15, and the DOR is greater than 100 using a KNN classifier constructed with only the Fp1 + F7 + F8 frontal electrodes. This means that the proposed system is capable of detecting stress and non-stress using only one or two electrodes. It is also able to evaluate stress levels with high performance using only three frontal electrodes. The small number of electrodes selected to construct the proposed system makes it more portable and mobile, and easier to set up.
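The reliability criteria quoted above can be checked directly from the four counts of a binary confusion matrix. The sketch below uses the standard definitions of sensitivity, specificity, precision and diagnostic odds ratio; the function names and the example counts are hypothetical and are not results from this study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    # DOR = (TP * TN) / (FP * FN); unbounded when there are no errors
    # of one kind, so report infinity in that case.
    dor = float("inf") if fp * fn == 0 else (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "dor": dor,
    }


def meets_reliability_criteria(m):
    """Thresholds as quoted in the text: sensitivity >= 80%,
    specificity >= 95%, precision >= 95%, DOR >= 100."""
    return (
        m["sensitivity"] >= 0.80
        and m["specificity"] >= 0.95
        and m["precision"] >= 0.95
        and m["dor"] >= 100
    )


# Hypothetical counts for illustration only.
good = diagnostic_metrics(tp=98, fp=1, tn=99, fn=2)
weak = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
```

With these example counts, `good` satisfies all four thresholds, whereas `weak` fails on precision (90/95 is below 95%) even though its DOR of 171 exceeds 100.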
Furthermore, to validate the ability of the proposed MSD system to predict whether a new person is in a stress or non-stress state, leave-subject-out validation is performed as well. The results in Table 9 confirm the capability of the proposed system to predict stress and non-stress using only the Fp1 and Fp2 frontal electrodes. They also reveal that the proposed system is reliable and can be used to predict stress and non-stress, as the sensitivity is 97.78, the specificity is 97.75, the precision is 99.26, and the DOR is greater than 100 using a KNN classifier. Finally, the ability of the proposed MSD system to predict the stress level of a new person is tested using leave-subject-out validation. The results in Table 11 show that the proposed system has a lower capability in predicting stress levels. This is due to the imbalance between the two classes of stress levels. Therefore, future work will focus on investigating solutions to deal with this imbalance and improve the performance of predicting stress levels.
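Leave-subject-out validation differs from ordinary k-fold cross validation in how the data are split: all segments from one person are held out together, so the classifier is always tested on a person it has never seen. A minimal sketch of such a splitter is given below; the function name and the subject labels are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), holding out
    every segment of one subject at a time."""
    for held_out in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


# One subject label per EEG segment (toy example).
subjects = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subjects))

for held_out, train_idx, test_idx in folds:
    # No subject contributes segments to both sides of the split.
    assert held_out not in {subjects[i] for i in train_idx}
```

With an ordinary random split, segments from the same person can fall on both sides, which tends to give more optimistic accuracy than can be expected for a new person.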
Comparing the results of the proposed algorithm with recent related work in Table 1, it is quite clear that the proposed MSD system outperforms other recent related techniques from literature. More specifically, the accuracies achieved for the proposed technique are 99.9% (sd = 0.015) and 99.26% (sd = 0.08) for identifying stress and non-stress states, and distinguishing between stress levels, respectively, using only two frontal brain electrodes for detecting stress and non-stress, and three frontal electrodes for evaluating stress levels as shown in Tables 8 and 10. The results prove that the proposed method has a competitive performance compared to the state-of-the-art techniques for both detecting stress and non-stress and classifying stress levels.

Conclusions
The main aim of the proposed system is to build a portable real-time EEG-based mental training neuro-feedback system that detects and evaluates stress levels in real time with high accuracy using mental arithmetic tasks. The proposed method introduced a hybrid feature-set (Feature-Set 3) and used five classification models for this purpose. Minimizing the number of EEG channels used for mental stress detection and evaluation would make the real-time system more mobile and easier to set up and maintain. This study revealed that frontal brain activation has a great impact on detecting and evaluating stress levels and is capable of achieving high detection and classification rates. Additionally, the study indicated that PCA is able to reduce the feature space and enhance stress detection rates. Furthermore, the study indicated that only one or two frontal electrodes are capable of detecting stress and non-stress, and that three frontal electrodes are able to evaluate stress levels. The results showed that the proposed method based on the hybrid feature-set was capable of both identifying stress and classifying stress levels. Moreover, our method outperformed other feature extraction methods from the literature, and it has competitive performance compared to the state-of-the-art techniques. Thus, it can be used for stress management, industrial safety, and education. It will enable clinicians to make an accurate diagnosis and provide appropriate treatment, which will consequently reduce the chances of clinical brain damage and other health problems. It will also improve safety standards in industry and enhance the quality of education.
Funding: This research received no external funding.