Comparison of Frontal-Temporal Channels in Epilepsy Seizure Prediction Based on EEMD-ReliefF and DNN

Epilepsy patients who do not have their seizures controlled with medication or surgery live in constant fear. The psychological burden of uncertainty surrounding the occurrence of random seizures is one of the most stressful and debilitating aspects of the disease. Despite the research progress in this field, there is a need for a non-invasive prediction system that helps disrupt the seizure epileptiform. Electroencephalogram (EEG) signals are non-stationary, nonlinear and vary with each patient and every recording. Full use of the non-invasive electrode channels is impractical for real-time use. We propose two frontal-temporal electrode channels based on ensemble empirical mode decomposition (EEMD) and Relief methods to address these challenges. The EEMD decomposes the segmented data frame in the ictal state into its intrinsic mode functions, and then we apply Relief to select the most relevant oscillatory components. A deep neural network (DNN) model learns these features to perform seizure prediction and early detection of patient-specific EEG recordings. The model yields an average sensitivity and specificity of 86.7% and 89.5%, respectively. The two-channel model shows the ability to capture patterns from brain locations for non-frontal-temporal seizures.


Introduction
Epilepsy is a severe neurologic condition with a high incidence and prevalence worldwide, affecting more than 50 million people. The disability and the adjusted life imposed on patients rank the disease as the second-most burdensome neurologic disorder [1][2][3][4]. An epileptic seizure is a sudden rush of abnormal neural activity in the cortex of the brain. The economic cost and the unpredictable, recurrent nature of seizures make the condition difficult for patients to bear and hinder the design of treatments by physicians. Seizure symptoms force patients to contend with limits on independent living, stigma and social misunderstandings [5,6].
The electroencephalogram (EEG) is widely recognized for assessing brain activities and remains effective for epilepsy research due to its excellent temporal resolution [7,8]. For epilepsy seizure detection, scalp EEGs (sEEGs) and intracranial EEGs (iEEGs) are used to measure the ictal changes that lead to a biomarker for the presence of a seizure. The iEEG reading has a higher specificity, but a limited sampling coverage of the cerebral cortex compared with sEEG readings [9]. More than 40 years have been devoted to predicting epileptic seizures, with the challenge of forecasting seizures before they happen [10]. Most methods look for patterns in the pre-ictal transition and contrast them with the interictal signal for high sensitivity [11,12]. For a real-time application, the detection and prediction model must run in an embedded device with limited computation resources and still be able to run state-of-the-art algorithms [13].
Noninvasive methods have emerged as a sought-after alternative in medicine to monitor health and treatment [14][15][16][17][18]. The majority of the advancements in noninvasive seizure prediction and detection use the full set of scalp EEG electrode channels [3,12,15,16,17,19]. It is impractical for epilepsy patients to wear a complete EEG electrode cap in a real-time application. In 2017, the International League Against Epilepsy (ILAE) classified seizures based on the seizure onset area, the level of consciousness during a convulsion and other seizure characteristics. Temporal lobe seizures and frontal lobe seizures are the most common types of focal impaired awareness seizure, occurring in up to 80% of patients [20].
The main objective of this paper is to implement a seizure prediction method that uses two frontal-temporal channels and is based on ensemble empirical mode decomposition (EEMD)-Relief and deep neural networks (DNNs). Figure 1 shows the proposed model. The EEMD-Relief combination extracts features from a segmented data frame in the ictal and preictal periods and retains the most informative components. The model is evaluated on patient-specific data from 23 epilepsy subjects. A deep neural network is trained to predict the seizure 23 min before it happens, with early detection within the first 3.9 s of the ictal period.

Literature Review
Researchers have developed all kinds of methods that attempt to detect a biological signal pattern using an EEG. From signal processing to classification, state-of-the-art algorithms have been implemented with different levels of success [21]. Several theoretical investigations examine the pre-ictal state to find a signature that helps anticipate and predict an epileptiform seizure [22][23][24][25][26]. The authors in [27] presented a pseudo-prospective seizure prediction. A deep learning classifier was trained to distinguish between pre-ictal and interictal signals. The model was deployed in a neuromorphic chip. However, the study only implemented a real-time testing model on a device for one subject. The mean sensitivity of the system was 69%. They also benchmarked the system against three recent studies. Senger et al. [26] addressed the seizure prediction problem by using an algorithm based on cellular neural networks (CNNs). They used principal components analysis (PCA) to feed a nonlinear CNN for the first method in the preprocessing stage, followed by level-crossing behavioral analysis. For future work, they propose to limit the number of channels for a seizure-warning device.
Daoud et al. [28] proposed four deep learning prediction models with limited preprocessing. The raw data were direct inputs to the multilayer perceptron (MLP) to classify between pre-ictal and interictal states. The model integrated four layers and backpropagation for optimization. Data from eight subjects were used to evaluate the patient-specific model. In the second model, the authors implemented a deep convolutional neural network (DCNN) architecture. This model was designed for front-end feature extraction, reporting a drastic reduction in the number of trainable parameters. The authors implemented a channel selection algorithm to decrease the number of channels, the dimensionality of variables and the demand for memory. Thus, the model is a candidate for real-time application. In addition, the algorithm selects the channels with the highest variance entropy.
The authors in [29] developed a hybrid neural network prediction model by combining ensemble empirical mode decomposition (EEMD) and a stochastic recurrent wavelet neural network, applied to nonlinear time series of energy price indexes. The EEMD extracted feature frequencies from four energy indexes. The proposed wavelet neural network model added stochastic functions to give different time-dependent weights to the historical data, and recurrent layers were then used to improve the learning process. The proposed method exhibits good performance in forecasting the time-series energy price. Zhou et al. [30] proposed a methodological approach to study hippocampal rhythmicity using different spectral decomposition methods. The study compared the spectral features of the wavelet, Fourier transform and EEMD to characterize hippocampal oscillations. The results showed that the wavelet and the EEMD failed to represent the local field potential oscillations accurately in the hippocampus under similar parameters. Additional research with more decomposition methods across a broader range of brain lobes is needed for the authors to accomplish their primary objectives.
Kanagaraj et al. [31] extracted features from images of patients with lung cancer using the gray-level co-occurrence matrix (GLCM). Four image texture features were calculated from the GLCM: energy, contrast, homogeneity and correlation. The contrast feature was reported to be the best suited for differentiating between textures in the lung tissue's spatial distribution. In the preprocessing, they first denoised the original computed tomography images and then segmented the lung's tumor area. An artificial neural network was used for classifying normal and tumor tissues with an accuracy of 85.5%. The authors in [32] implemented an artificial neural network, a multilayer perceptron (MLP), to identify the economic and financial variables that explain construction industry productivity. The study used nine economic variables. Value-added variables and gross revenues were related to productivity. These features were the inputs to the MLP model for training and testing. The classification score was 94.8%, and the receiver operating characteristic (ROC) score was 98.1%. The study showed that management variables are closely related to productivity, but the MLP only addresses the management variables and leaves out other criteria, such as human capital and innovation. A study in [33] presented a comparative method of feature selection and extraction using the original empirical mode decomposition (EMD) and the ensemble empirical mode decomposition (EEMD). The authors selected intrinsic mode functions (IMFs) using correlation, energy, power spectral density and statistical significance measurements. The extracted features were classified using four algorithms: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression and Naive Bayes. They demonstrated that the selection of IMFs affects classification, with EEMD features increasing the performance of all four classifiers.

Preprocessing
The original signal contained physiological artifacts (eye blinking, muscle movement, involuntary movement) and noise from the power line and EEG equipment. A sixth-order Butterworth band-pass filter (2 Hz to 40 Hz) was applied to the time-series EEG signal to eliminate noise, power harmonics and muscle artifacts.

Ensemble Empirical Mode Decomposition
The EEMD overcomes the noise mixing effect of the empirical mode decomposition (EMD) by allowing better scale separation. The algorithm adds a different series of white noise into the signal in several iterations [35]. There was no correlation in IMFs among trials, since the added noise was different in each trial. Averaging the IMFs of different trials eliminated the added noise and preserved the signal. The EEMD has been successfully applied to biomedical signals and other noisy nonlinear and nonstationary processes [36][37][38]. The formulation of the EEMD algorithms can be explained with the following two equations [39]: where M-1 is the total number of IMFs, IMF
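For orientation, the EEMD procedure described above is commonly written in the following form. This is a sketch of the standard formulation and may differ in notation from the equations in [39]; the symbols N (number of noise-added trials), w_i(t) (white noise added in trial i) and r_{i,M}(t) (residue of trial i) are assumptions introduced here.

```latex
% Trial i: the noise-perturbed signal is decomposed into M-1 IMFs plus a residue
x(t) + w_i(t) = \sum_{j=1}^{M-1} \mathrm{IMF}_{i,j}(t) + r_{i,M}(t),
\qquad i = 1, \dots, N

% Averaging over the N trials cancels the added noise and gives the j-th mode
\mathrm{IMF}_j(t) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{IMF}_{i,j}(t),
\qquad j = 1, \dots, M-1
```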

ReliefF
Relief is a filter-based feature selection method. In contrast to wrappers, filters score features using a measurement calculated from the general characteristics of the training data and derive feature subsets prior to modeling. Filters are much faster than wrappers and function independently of the model algorithm [40]. The basic idea of Relief in Algorithm 1 is to weigh attributes using each instance's two nearest neighbors in a binary classification, one from the same class and one from the other class [5,6]. Multiple versions of Relief, from A to F, have been proposed to overcome the original algorithm's problems with incomplete data sets and multiclass labels. ReliefF is the best-known and most widely adopted Relief variant, and many further improvements have been proposed in succeeding versions [40]. ReliefF handles noisy and incomplete data sets and deals efficiently with multiclass problems. For each instance, the algorithm finds a near miss M(C) from each different class and averages their contributions when updating the attribute weights W[A]. ReliefF thus estimates the ability of attributes to separate each pair of classes, regardless of which two classes are closest to each other. Here is a ReliefF version algorithm proposed by Kononenko [41]:
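A minimal Python sketch of the ReliefF weight update is given below. It is an illustration of the idea, not the listing from [41]: it assumes complete numeric data, visits every instance once instead of sampling at random, and uses a range-normalized Manhattan distance. The function name `relieff` and its parameters are hypothetical.

```python
def relieff(X, y, n_neighbors=1):
    """Return one ReliefF weight per feature (larger = more relevant).

    X is a list of numeric feature rows, y a list of class labels.
    """
    n, d = len(X), len(X[0])
    # Feature ranges, used to scale every difference into [0, 1].
    spans = []
    for a in range(d):
        column = [row[a] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, i, j):
        return abs(X[i][a] - X[j][a]) / spans[a]

    def distance(i, j):
        return sum(diff(a, i, j) for a in range(d))

    classes = sorted(set(y))
    prior = {c: y.count(c) / n for c in classes}
    weights = [0.0] * d

    for i in range(n):
        for c in classes:
            # k nearest neighbours of instance i inside class c.
            members = [j for j in range(n) if j != i and y[j] == c]
            nearest = sorted(members, key=lambda j: distance(i, j))[:n_neighbors]
            if not nearest:
                continue
            if c == y[i]:
                scale = -1.0  # near hits: differences lower the weight
            else:
                # near misses M(C): weighted by the prior of class C
                scale = prior[c] / (1.0 - prior[y[i]])
            for a in range(d):
                mean_diff = sum(diff(a, i, j) for j in nearest) / len(nearest)
                weights[a] += scale * mean_diff / n
    return weights


# Usage: feature 0 separates the classes, feature 1 does not.
X_demo = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y_demo = [0, 0, 0, 1, 1, 1]
scores = relieff(X_demo, y_demo)
```

In the setting of this paper, each column of `X` would hold one IMF-derived feature, and the features with the highest weights would be passed on to the classifier.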

Prediction and Early Detection
We implemented a seizure-specific methodology in which a model was built for each patient. The duration of a seizure varies from 10 s to 63 s. For early detection, shown in Figure 3a, the goal is to find a signature of information in the first 3.9 s of the seizure's ictal stage. The vectors of the EEMD-Relief features extracted in this period were used for training the DNN. The seizure prediction scheme is shown in Figure 3b. The seizure prediction horizon (SPH) is the time between the alarm and the seizure onset, as defined in [11]. We used two SPHs of 23 min and 5 min, respectively. In each prediction horizon, we used two windows of 3.9 s with an overlap of 2.9 s. IMFs were extracted in each window for the corresponding ictal and interictal states. The best IMF features selected by ReliefF were given as input to the DNN for training.
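The windowing scheme can be sketched as follows. The window length of 1000 samples for 3.9 s comes from the Discussion; the 256-sample step is an assumption that follows from a 2.9 s overlap (a 1.0 s hop) at an assumed 256 Hz sampling rate. The helper name `sliding_windows` is hypothetical.

```python
def sliding_windows(samples, window=1000, step=256):
    """Yield overlapping windows from a 1-D sequence of EEG samples.

    window=1000 samples corresponds to about 3.9 s at 256 Hz;
    step=256 samples is a 1.0 s hop, i.e. a 2.9 s overlap.
    """
    for start in range(0, len(samples) - window + 1, step):
        yield samples[start:start + window]


# Usage: a segment just long enough for two overlapping windows.
segment = list(range(1256))
windows = list(sliding_windows(segment))
```

Each window would then be decomposed with EEMD before feature selection.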

Statistical Analysis and Validation
Statistical procedures were applied to analyze the sample size of the patient-specific data and the performance score of the prediction model. The sample per subject had a high variability because it depended on the patient's number of seizures per trial. A minimum of three seizures per subject was required in this study so that the time segments could be split into training and testing sets. The sensitivity for different brain lobes was evaluated using the coefficient of variation and the standard deviation as measures of spread. The DNN model performance was estimated using a one-way analysis of variance (ANOVA) and ten-fold cross-validation. Because the training features were time-series signals with sequential meaning, the procedure differed from cross-validation on non-sequential data points. Folds for training in our time-series cross-validation were created by chaining the state segments of patient seizures. Therefore, we selected a new train and test set of features for fitting and evaluating the model in each fold.
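One way to read "chaining" is forward chaining, in which each fold trains on all earlier seizure segments and tests on the next one, so that no future data leaks into training. The sketch below illustrates that reading only; the exact fold construction used in the study is not specified, and the function name `chained_folds` and the `min_train` parameter are hypothetical.

```python
def chained_folds(n_segments, min_train=1):
    """Yield (train_indices, test_indices) pairs in temporal order.

    Fold k trains on segments 0..k-1 and tests on segment k.
    """
    for k in range(min_train, n_segments):
        yield list(range(k)), [k]


# Usage: a subject with four seizure segments gives three folds.
folds = list(chained_folds(4))
```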

Deep Neural Network
A DNN with eight hidden layers was designed for classification. The ReLU activation function was used for the input layer and the sigmoid activation function for the output layer. The network ran with a 10% batch size and 100 epochs. Keras, a Python deep learning Application Programming Interface (API), was used for running the model.
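A network of this shape could be declared in Keras roughly as follows. This is a sketch under several assumptions, not the authors' model: the layer width of 64 units, the Adam optimizer, the binary cross-entropy loss and the use of ReLU in every hidden layer are choices made here for illustration, and "10% batch size" is read as a batch containing one tenth of the training samples.

```python
from tensorflow import keras


def build_dnn(n_features, hidden_units=64, n_hidden=8):
    """Binary classifier with `n_hidden` ReLU layers and a sigmoid output."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for _ in range(n_hidden):
        model.add(keras.layers.Dense(hidden_units, activation="relu"))
    # A single sigmoid unit outputs the probability of the seizure class.
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Usage (X_train and y_train are placeholders for the selected IMF features):
# model = build_dnn(n_features=X_train.shape[1])
# model.fit(X_train, y_train, epochs=100,
#           batch_size=max(1, len(X_train) // 10))
```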

Early Detection
Within the patient-specific EEG signal, we identified an early indicator in the first 3.9 s of seizure initiation. The intrinsic mode functions extracted in this seizure state showed a significant contrast to the IMFs of the interictal state in the same time window. A deep neural network prediction algorithm using these features as input was evaluated on each patient's EEG recording, in which 143 seizures in 23 patients were analyzed. The proposed method showed an average sensitivity of 86.77% and specificity of 89.52%. Tables 3 and 4 show the statistics for early detection and prediction.
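For reference, sensitivity and specificity follow the usual definitions, TP / (TP + FN) and TN / (TN + FP), with the seizure class taken as positive. The short sketch below computes both from binary labels; the function name and the toy labels are illustrative and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = seizure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Usage with made-up window labels (not results from the paper).
sens, spec = sensitivity_specificity([1, 1, 1, 1, 0, 0, 0, 0],
                                     [1, 1, 1, 0, 0, 0, 0, 1])
```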

Prediction
From the patient EEG signal, we used two prediction horizons of 23 min and 5 min, respectively. Each prediction horizon used windows of 3.9 s with an overlap of 2.9 s. IMFs were extracted in each window for the corresponding ictal and interictal states. The best IMF features selected by ReliefF were given as input to the DNN for training. The DNN model showed a sensitivity of 80.00% and a specificity of 72.85%. Figures 4 and 5 show the intrinsic mode functions for the ictal and interictal transitions.

ReliefF
The contribution of twelve IMFs from the two scalp EEG channels is shown in Figure 6. The score for the left and right hemisphere is presented.
Figure 6. ReliefF score contributions of twelve IMFs from the two channels of patient S01.

Early Detection in Different Brain Locations
The sensitivity and specificity of the frontal-temporal model in different brain locations are shown in Table 5. Frontal and temporal lobe seizures are present in 14 of the 23 subjects.
Table 5. Performance of the frontal-temporal method in different brain locations (columns: Brain Location, Sensitivity, Specificity).

Statistical Significance
The high variance of sample size per subject is shown in Figure 7; it depends directly on seizure duration and frequency. Table 6 shows the statistics for the subject sensitivity scores by brain lobe for the 5 min and 23 min horizons. Table 7 shows the one-way ANOVA analysis of the prediction performance in the 23 min horizon for all subjects. Table 8 shows a performance comparison of the most recent work in seizure prediction. The ten-fold cross-validation average for each of the 23 subjects for early detection is shown in Table 9.

Discussion and Conclusions
This study focused on the analysis of two frontal-temporal channel prediction systems. We used scalp EEG recordings of 23 subjects from the CHB-MIT dataset to validate this work. A patient-specific DNN classification model was trained for each subject for early detection and prediction. We extracted all the intrinsic mode functions in the segmented preictal and interictal states from the two frontal-temporal channels of each patient. Sixteen intrinsic mode functions were extracted per channel. Then, a ReliefF filter selected the most informative features. This new subset of oscillatory components helped the DNN learn a segment of 1000 data points, representing 3.9 s EEG windows in the ictal and interictal periods. The proposed model achieved excellent results for the two electrode channels, one in each brain hemisphere, F7-T7 and F8-T8. The performance scores in early detection and prediction are shown in Tables 3 and 4. Early detection showed an average sensitivity and specificity of 86.77% and 89.52%, respectively, while in the prediction method, the average sensitivity and specificity were 81% and 97% in the 5 min and 23 min horizon times. The area under the receiver operating characteristic (ROC) curve, ROC-AUC, shows how well the patient-specific DNN model distinguishes between classes, with an overall score of 85.39%.
The ensemble IMFs, and their instantaneous frequencies in the ictal and interictal states, were extracted for early detection. The use of EEMD allowed for discriminating between the segmented ictal and interictal periods. For the prediction analysis, the preictal and interictal signals are shown in Figures 4 and 5, where the preictal state shows IMFs with higher instantaneous frequencies than the corresponding interictal state. These signature patterns in the ictal and preictal states enabled the DNN to perform as a successful classifier model. The ReliefF analysis shows nearly identical feature selection scores for the F7-T7 and F8-T8 electrode channels in both early detection and prediction, as shown in Figure 6. This feature pattern is present in each of the 23 subjects, suggesting an equal contribution by each hemisphere in seizure prediction.
The intracranial EEG has a better spatial resolution and signal-to-noise ratio than the scalp EEG. However, not all patients are candidates for surgery. In addition, there are sometimes complications during craniotomy and risks of potential post-operative deficits [48]. Scalp EEGs are more readily accepted in real-life scenarios for a patient, but have the limitation of requiring the full electrode array to be worn in an outpatient setting [9]. A performance comparison of recent research in this field is presented in Table 8. The authors in [46] used a reduced montage of eight electrodes in seizure detection. Their algorithm detected seizures from EEG, surface electromyogram (EMG) and electrocardiogram (ECG) signals. Their results encouraged the use of reduced electrode sets based on frontal-temporal electrodes. In our study, the two electrode channels of F7-T7 and F8-T8 covered a partial frontal-temporal area on the subject's scalp. Table 7 shows a one-way analysis of variance (ANOVA) of prediction score performance in the 23 min horizon with a p-value < 0.05. The parietal, temporal-occipital and temporal-parietal seizure locations exhibited average sensitivities of 94%, 93.3% and 91.9%, respectively. The overall performance in Table 5 demonstrates the ability of the proposed model to capture patterns from non-frontal-temporal seizure brain locations. Table 7 shows the validation score for each subject, yielding an average of 80.5%.
Further work can assess the feasibility of running the model on resource-constrained devices with frontal and temporal epilepsy subjects. Also, a more extended prediction horizon time can be explored with adjustable windows in the ictal period. The subrogation of the two frontal-temporal channels can also be studied to produce a single, condensed channel (e.g., common spatial pattern).
For visual inspection and seizure validation for a physician, we recommend the full use of the 10-20 electrode montage.
We have demonstrated that the proposed method represents a compelling alternative for reducing the number of channels on a patient's scalp. We proposed an EEMD-ReliefF approach to extract and select the most relevant oscillatory components in two frontal-temporal channels. The frontal-temporal two-channel approach is a promising candidate for developing a miniaturized warning and intervention device for therapy in epilepsy patients. Although the detection and prediction results are promising, further studies are needed to validate their utility for broad clinical applications in epilepsy patients.