Article

Performance Evaluation of Epileptic Seizure Prediction Using Time, Frequency, and Time–Frequency Domain Measures

Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Processes 2021, 9(4), 682; https://doi.org/10.3390/pr9040682
Submission received: 17 March 2021 / Revised: 3 April 2021 / Accepted: 10 April 2021 / Published: 13 April 2021
(This article belongs to the Special Issue Machine Learning Methods for Modelling Neurological Diseases)

Abstract

The prediction of epileptic seizures is crucial for giving patients early warning and enabling effective intervention. Several features have been explored to predict seizure onset from electroencephalography (EEG) signals, which are typically non-stationary, dynamic, and variable from person to person. In the previous literature, the features applied in classification have been treated as contributing similarly for all patients. Therefore, in this paper, we analyze the impact of specific feature-channel combinations from the time, frequency, and time–frequency domains on the prediction performance for different patients. Based on the minimal-redundancy-maximal-relevance criterion, the proposed framework uses a sequential forward selection approach to find the optimal features and channels for each patient individually. The trained models discriminate pre-ictal from inter-ictal electroencephalography with a sensitivity of 90.2% and a false prediction rate of 0.096/h. We also compare the classification accuracy obtained with the optimal features, with several features summarized from the optimal features, and with the complete set of features from the three domains. The results indicate that the selection of feature-channels is specific to each patient. Furthermore, the detailed list of optimal features and summarized features is provided as a reference for those who study the same database.

1. Introduction

Epilepsy is the fourth most common neurological disorder with approximately 50 million patients worldwide, and it affects people of all ages [1]. Many of the pre-onset symptoms are not visible to observers; thus, family and friends may inadvertently witness them all the time [2]. It is an irrefutable fact that a method to predict the occurrence of epileptic seizure would significantly improve the possibility of treatment.
Electroencephalography (EEG) can be used to study changes in brain activity, which is commonly applied in epilepsy research [3]. Epilepsy studies distinguish two types of EEG signals, based on the way they are recorded, namely intracranial EEG (iEEG) and scalp EEG (sEEG). The iEEG is obtained by invasive electrodes, while sEEG is recorded through electrodes attached to specific locations on the scalp. Since sEEG is obtained in a non-invasive way, it has the advantages of easy access and greater safety.
There are four states for EEG recording of a patient with epilepsy, namely inter-ictal, pre-ictal, ictal, and post-ictal (as shown in Figure 1). The seizure prediction problem involves designing models to distinguish between pre-ictal state and inter-ictal state, while seizure detection focuses on discriminating the ictal state. Although the model in seizure detection can detect seizures, it cannot be used for monitoring and treatment.
Machine learning is now widely used for epileptic seizure prediction, and feature extraction is one of the key procedures for improving classification performance. Previous work mainly used one or a few features, or focused on improving a certain feature. Even when multiple features are used, the same features are extracted for different patients. However, due to the nonlinear and non-stationary nature of EEG signals and the different seizure types between individuals, a small, fixed set of features may not be suitable for all patients. Hence, the aim of this study is to examine whether a patient-specific feature design principle improves prediction performance. To this end, we extract features of EEG signals from 18 channels in the time, frequency, and time–frequency domains. We then use a sequential forward selection method to optimize specific feature-channels for each patient based on the output of the minimal-redundancy-maximal-relevance (mRMR) criterion. In addition, several features summarized statistically from the optimal features, as well as all features extracted from the three domains, are explored to discriminate between pre-ictal and inter-ictal states. Finally, the models trained with the optimal features perform well on additional pre-ictal data outside the defined pre-ictal window for the majority of patients.
The contributions of this study lie in the following:
1. We verify the importance of patient-specific feature selection in seizure prediction.
2. We comprehensively summarize the features of the time, frequency, and time–frequency domains and their interpretations in predicting seizures using EEG signals.
3. The optimal features can provide guidance for studying each patient separately, and the features summarized from them have implications for the general design principle of an epileptic seizure prediction system.
The remainder of the paper is organized as follows. Section 2 provides the details of our proposed method. Section 3 presents the results of this method, discussions of the results, and comparisons with related work. Finally, the paper is concluded in Section 4.

2. Materials and Methodology

The framework suggested in the present study consists of four stages: preprocessing, feature extraction, feature ranking, and classification. The overall implementation process of this methodology is depicted in Figure 2.

2.1. EEG Data

The CHB-MIT Scalp EEG database [4] used in this work includes data from 22 pediatric patients with intractable seizures at the Children’s Hospital Boston. EEG signals sampled at 256 samples per second usually contain 23 channels but in some cases contain 24 or 26 channels. The start and end time of seizures judged by clinical experts are stated in annotation files.
Each case contains between 9 and 42 recordings (edf files) from a single patient. All recordings were grouped into 23 cases. In particular, case chb21 was obtained from the same patient 1.5 years after obtaining chb01. For convenience, “patient” and “case” in this work have the same meaning. The detailed information on these 23 cases is listed in Table 1.

2.2. Preprocessing

We picked 18 channels common to all cases: FP1-F7, F7-T7, T7-P7, P7-O1, FP1-F3, F3-C3, C3-P3, P3-O1, FP2-F4, F4-C4, C4-P4, P4-O2, FP2-F8, F8-T8, T8-P8, P8-O2, FZ-CZ, and CZ-PZ. Because different electrodes were used across the recordings of a patient, a few recordings in case chb12 do not contain the common channels; we therefore drop these recordings.
As shown in Figure 1, seizure prediction aims to distinguish pre-ictal from inter-ictal states. The pre-ictal interval must be defined in any seizure prediction study; nevertheless, its length is still controversial. Mormann [2] reports that, when univariate measures are used to classify the EEG signal, changes appear 30 min before onset. Taking into account that EEG activity varies from patient to patient, we use 15 min as the pre-ictal interval. The other principles are as follows:
1. When a second seizure occurs within 15 min after the end of the first seizure, the data of the latter seizure are discarded.
2. When the recording holds less than 15 min of data before the seizure, the previous consecutive recording is used to supplement it.
3. Any previous recording separated from the current recording by a gap larger than 5 s is marked as non-consecutive.
4. If the available duration is still less than 15 min, it is nevertheless used as the pre-ictal state of this seizure.
Inter-ictal state accounts for 99% of most epilepsy patients’ lives; hence, we balance the data by selecting the inter-ictal signals as far away as possible from the ictal state. Each trimmed recording including pre-ictal and inter-ictal state is divided into 5-s segments with 2.5 s overlapping as samples for classification. The details on the data we used are listed in Table 1.
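The segmentation step can be sketched as follows. This is a minimal illustration, assuming the recording is held as a NumPy array of shape (channels, samples) sampled at 256 Hz; the function name `segment` and the synthetic data are hypothetical and not taken from the original implementation.

```python
import numpy as np

def segment(recording, fs=256, win_s=5.0, step_s=2.5):
    """Cut a (channels, samples) array into overlapping windows.

    Returns an array of shape (n_windows, channels, window_length).
    """
    win = int(win_s * fs)    # 1280 samples per 5-s segment at 256 Hz
    step = int(step_s * fs)  # 640-sample hop, so neighbours share 2.5 s
    n_samples = recording.shape[1]
    # Assumption: a trailing remainder shorter than one window is dropped.
    starts = range(0, n_samples - win + 1, step)
    return np.stack([recording[:, s:s + win] for s in starts])

# One minute of synthetic 18-channel data stands in for a trimmed recording.
rng = np.random.default_rng(0)
recording = rng.standard_normal((18, 60 * 256))
segments = segment(recording)
print(segments.shape)  # (23, 18, 1280)
```

With a 2.5 s hop, the second half of each segment is identical to the first half of the next one, which is why neighbouring samples are strongly correlated and should stay in the same cross-validation fold.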
To eliminate low-frequency activity and high-frequency noise and to obtain quasi-stationary signals, a 5th-order Butterworth band-pass filter with a passband of 0.5 to 30 Hz is employed.
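A band-pass filter of this kind might be set up with SciPy as shown below. This is a sketch under stated assumptions: the text does not say whether filtering is causal or zero-phase, nor whether "5th-order" refers to the prototype order passed to the design routine, so both choices here are illustrative.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(x, fs=256.0, low=0.5, high=30.0, order=5):
    """Apply a Butterworth band-pass along the last axis of x."""
    # Second-order sections are numerically safer than (b, a) coefficients
    # when a band edge is as low as 0.5 Hz.
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    # Assumption: forward-backward (zero-phase) filtering. A causal
    # alternative would be scipy.signal.sosfilt.
    return sosfiltfilt(sos, x, axis=-1)

# Test signal: a 10 Hz component inside the passband plus 60 Hz interference.
fs = 256.0
t = np.arange(0, 5.0, 1.0 / fs)
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)
y = bandpass(x, fs=fs)

spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), d=1.0 / fs)
amp_10 = spectrum[np.argmin(np.abs(freqs - 10.0))]
amp_60 = spectrum[np.argmin(np.abs(freqs - 60.0))]
print(amp_60 / amp_10)  # small ratio: the 60 Hz component is strongly attenuated
```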

2.3. Feature Extraction

Feeding the raw EEG signals directly into the classifier may adversely affect the output quality. Thus, we extract features from preprocessed EEG signals. Several features have been determined to discriminate the changes in EEG signals. These features can be categorized based on time domain, frequency domain, and time–frequency domain. A variety of the most common features from these three domains is extracted in this study. The extracted features are summarized in Table 2. Features are described and explained with mathematical expressions in Appendix A.
Time domain analysis uses the EEG signal directly, that is, the analysis of electrode voltage amplitude with reference to time series. Time domain analysis can show how the signal changes according to time which is commonly used in other fields [5]. It can reflect the physical meaning of the signal straightforwardly. In terms of time domain features, the basic statistical characteristics (e.g., mean value, skewness and kurtosis) are computed, as well as the energy related, entropy related, randomly related, Hjorth parameters, and the number of zero-crossings and local extrema. In addition, line length can reflect changes in signal amplitude and frequency. The fluctuations in time series can be studied by a detrended fluctuation analysis.
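A few of these time domain measures are simple enough to sketch directly in NumPy, following the formulas given in Appendix A. The function names are illustrative. Two details are assumptions rather than statements from the text: the population standard deviation is used for skewness and kurtosis, and the nonlinear energy is summed over all interior samples.

```python
import numpy as np

def skewness(x):
    z = (x - x.mean()) / x.std()   # assumption: population standard deviation
    return np.mean(z ** 3)

def kurtosis(x):
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4)

def line_length(x):
    # Normalized line length: mean absolute difference of successive samples.
    return np.mean(np.abs(np.diff(x)))

def nonlinear_energy(x):
    # Sum of x_i^2 - x_{i-1} * x_{i+1} over the interior samples.
    return np.sum(x[1:-1] ** 2 - x[:-2] * x[2:])

def zero_crossings(x):
    # Counts sign changes; exact zeros are treated as non-negative here.
    return int(np.sum(np.signbit(x[:-1]) != np.signbit(x[1:])))

x = np.array([1.0, -1.0, 1.0, -1.0, 2.0])
print(line_length(x))       # 2.25
print(nonlinear_energy(x))  # -1.0
print(zero_crossings(x))    # 4
```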
Frequency domain analysis is the analysis of the signal with reference to frequency, while the spectral information of EEG is used. It can reflect the distribution of frequency components or signal power in the whole frequency range. Research in other fields [6] often benefits from frequency domain analysis. Power Spectral Density (PSD) is the measure of a signal’s power content versus frequency. The fast Fourier transform is applied using Welch’s method [7] to obtain the PSD of signals. The energy and entropy of PSD, intensity weighted mean frequency and bandwidth, edge frequency, and peak frequency are extracted in the frequency domain.
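As an illustration, the PSD and two of the frequency domain measures could be computed as below. This is a hedged sketch: the Welch segment length (`nperseg=256`, giving 1 Hz resolution) is an assumed value, and the intensity weighted mean frequency is written with its usual definition as the PSD-weighted average frequency, which may differ in detail from the definition used in Appendix A.

```python
import numpy as np
from scipy.signal import welch

def spectral_features(x, fs=256.0, nperseg=256):
    """Return (peak frequency, intensity weighted mean frequency) in Hz."""
    # Welch's method averages periodograms of overlapping, windowed segments.
    freqs, psd = welch(x, fs=fs, nperseg=nperseg)
    peak_frequency = freqs[np.argmax(psd)]
    mean_frequency = np.sum(freqs * psd) / np.sum(psd)
    return peak_frequency, mean_frequency

# A 5-s, 10 Hz sinusoid stands in for one EEG segment.
fs = 256.0
t = np.arange(0, 5.0, 1.0 / fs)
segment_1d = np.sin(2 * np.pi * 10 * t)
peak, mean_f = spectral_features(segment_1d, fs=fs)
print(peak)  # 10.0
```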
Time–frequency domain analysis, which is also used in other fields [8,9], can capture the relationship between frequency and time. Time–frequency analysis is necessary because the frequency characteristics of EEG signals change constantly. The wavelet transform is known to capture both frequency and location in signals. Delta (0.5–4 Hz), Theta (4–7 Hz), Alpha (8–13 Hz), and Beta (14–30 Hz) are the four basic patterns of EEG signals. Following the discrete wavelet transform method and these EEG patterns, we use a 3-level filter bank (as shown in Figure 3) with the Daubechies 1 (db1) mother wavelet function to decompose each raw EEG signal. This yields four frequency sub-bands, consisting of the detail coefficients and the approximation coefficients. The basic statistics, energy based, randomly based, and line length features introduced in the time domain are also used in the time–frequency domain. The difference is that, in the time–frequency domain, the features are extracted from the four sub-bands.
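The 3-level decomposition can be sketched without a wavelet library, because db1 is the Haar wavelet. The code below is a minimal NumPy illustration with hypothetical function names; it does not handle boundary extension and simply drops an odd trailing sample, which may differ from the library routine used in the original experiments.

```python
import numpy as np

def haar_step(x):
    """One level of the Haar (db1) transform: (approximation, detail)."""
    x = x[: len(x) // 2 * 2]                      # drop an odd trailing sample
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass branch
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-pass branch
    return approx, detail

def haar_dwt(x, levels=3):
    """Return (A_levels, [D1, ..., D_levels]) for a 1-D signal."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

# A synthetic 5-s segment (1280 samples at 256 Hz).
rng = np.random.default_rng(0)
sig = rng.standard_normal(1280)
a3, (d1, d2, d3) = haar_dwt(sig, levels=3)
print(len(d1), len(d2), len(d3), len(a3))  # 640 320 160 160
```

The three detail arrays and the final approximation array are the four sub-bands from which the time–frequency features would then be computed. Because the Haar transform is orthonormal, the total energy of the coefficients equals the energy of the segment.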
The selection of EEG signal channels is still an open research topic. In order to capture the characteristics of signals on all channels, we extract these 67 features from every channel. Having calculated the measurements using the above parameters, the 18 × 67 feature matrix is obtained. After aggregating this feature matrix, we acquire a 1206-dimensional vector (called channel-based feature vector or feature-channel vector) for every sample.

2.4. Feature Ranking

The minimal-redundancy-maximal-relevance algorithm [10] selects features based not only on the dependency between features and sample categories but also on the relationship between features. In this research, we reorder the variables in the extracted channel-based feature vector based on the mRMR criterion. Suppose there is a dataset $D$ with $N$ samples and $M$ (1206) features. The purpose of feature ranking is to generate a new set $S$ from an $M$-dimensional set $F = \{f_i,\ i = 1, \ldots, M\}$.
The Max-Dependency criterion uses mutual information as the dependency of features on the target class $c$ to select features. Since Max-Dependency is hard to implement, the mean value of all mutual information values between the individual features $f_i$ and class $c$, called Max-Relevance, is used as an approximation of Max-Dependency:
$$\max D(S, c), \quad D = \frac{1}{|S|} \sum_{f_i \in S} I(f_i; c),$$
where $I(\cdot)$ is the mutual information function, $f_i$ is the $i$th feature in $F$, and $c$ is the class label vector. The mutual information of two random variables $x$ and $y$ is defined by their probability density functions $p(x)$, $p(y)$ and the joint probability density function $p(x, y)$:
$$I(x; y) = \iint p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)} \, dx \, dy.$$
Considering dependency alone can produce a highly redundant feature set, so the relationship between features should also be considered. The Min-Redundancy condition, which is the mean value of all mutual information values between every two features in $S$, can be added to select mutually exclusive features. It has the following form:
$$\min R(S), \quad R = \frac{1}{|S|^2} \sum_{f_i, f_j \in S} I(f_i, f_j).$$
There are two selection schemes for combining $D$ and $R$ for discrete data: the mutual information quotient (MIQ) and the mutual information difference (MID). We denote the combination scheme by the operator $\Phi(D, R)$ and use MIQ in this study. The final optimization is:
$$\max \Phi(D, R), \quad \Phi = D / R.$$
The incremental search method is used to obtain the near-optimal solution. When we want to select the $m$th promising feature from the set $\{F - S_{m-1}\}$, the optimization in (4), $\max \Phi(D, R)$, can be written as:
$$\max_{f_j \in F - S_{m-1}} \left[ I(f_j; c) \Big/ \frac{1}{m-1} \sum_{f_i \in S_{m-1}} I(f_j; f_i) \right].$$
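The incremental MIQ search can be illustrated with a small NumPy sketch. It assumes that the features have already been discretized into non-negative integer codes (the binning scheme is not described here), uses a plug-in estimate of mutual information, and adds a small constant `eps` to avoid division by zero; the function names and the toy data are hypothetical.

```python
import numpy as np

def mutual_info(a, b):
    """Plug-in mutual information (bits) of two discrete integer arrays."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1.0)          # contingency table
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)  # marginal of a
    pb = joint.sum(axis=0, keepdims=True)  # marginal of b
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])))

def mrmr_miq(X, c, n_select, eps=1e-12):
    """Rank features by relevance divided by mean redundancy (MIQ)."""
    n_features = X.shape[1]
    relevance = np.array([mutual_info(X[:, j], c) for j in range(n_features)])
    selected = [int(np.argmax(relevance))]   # first pick: maximal relevance
    redundancy_sum = np.zeros(n_features)
    while len(selected) < n_select:
        last = selected[-1]
        redundancy_sum += np.array(
            [mutual_info(X[:, j], X[:, last]) for j in range(n_features)])
        score = relevance / (redundancy_sum / len(selected) + eps)
        score[selected] = -np.inf            # never reselect a feature
        selected.append(int(np.argmax(score)))
    return selected

# Toy data: f1 duplicates f0, while f2 is less relevant but complementary.
c = np.array([0] * 8 + [1] * 8)
f0 = np.array([0, 0, 0, 0, 0, 0, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2])
f1 = f0.copy()
f2 = np.array([2, 2, 2, 2, 0, 0, 0, 0, 2, 2, 2, 2, 1, 1, 1, 1])
X = np.column_stack([f0, f1, f2])
order = mrmr_miq(X, c, n_select=3)
print(order)  # [0, 2, 1]
```

In this toy example the duplicate feature is ranked last even though its relevance equals that of the first pick, because its redundancy with the already selected feature is maximal.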

2.5. Classification

Support vector machines are classical classifiers most commonly applied in the seizure prediction problem. In this work, we combine a sequential forward selection approach and Support Vector Machine (SVM) to optimize feature subsets and train models.
SVM finds an optimal hyperplane by maximizing the distance between two categories [11]. Consider a training set $T = \{(x_i, y_i) \mid i = 1, \ldots, N\}$ with $N$ samples of $n$-dimensional vectors, where $x_i \in \mathbb{R}^n$ and $y_i \in \{+1, -1\}$. The problem of maximizing the margin can be written as a dual problem:
$$\min_{\alpha} \ \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{N} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, N,$$
where $C$ is the penalty term that acts as an inverse regularization parameter, and $K(\cdot)$ is the kernel function, here the Gaussian radial basis function (RBF) kernel:
$$K(x, z) = \exp(-\gamma \|x - z\|^2),$$
where $\gamma$ is the kernel coefficient. After solving this problem, the optimal $b^*$ can be obtained using $\alpha^*$. Then, the decision function for classifying a sample $x$ becomes:
$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i^* y_i K(x, x_i) + b^* \right).$$
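To make the decision function concrete, the sketch below evaluates it for a toy model in NumPy. The support vectors, multipliers, and bias are hypothetical values chosen by hand for illustration; they are not the solution of any dual problem and are unrelated to the trained models in this study.

```python
import numpy as np

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2), evaluated over the last axis."""
    return np.exp(-gamma * np.sum((x - z) ** 2, axis=-1))

def decide(x, support_vectors, alpha, y, b, gamma):
    """sign(sum_i alpha_i * y_i * K(x, x_i) + b) for one sample x."""
    k = rbf_kernel(support_vectors, x, gamma)   # one kernel value per x_i
    return int(np.sign(np.sum(alpha * y * k) + b))

# Hypothetical model: one support vector per class.
support_vectors = np.array([[0.0, 0.0], [2.0, 2.0]])
y = np.array([+1, -1])
alpha = np.array([1.0, 1.0])
b, gamma = 0.0, 0.5

print(decide(np.array([0.1, 0.1]), support_vectors, alpha, y, b, gamma))  # 1
print(decide(np.array([1.9, 1.9]), support_vectors, alpha, y, b, gamma))  # -1
```

A larger $\gamma$ makes each kernel bump narrower, so the decision depends more strongly on the nearest support vectors.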
Since hyperparameters directly control the behavior of the SVM algorithm, choosing them plays a crucial role in the success of the classifier. Two hyperparameters ($C$ and $\gamma$) must be tuned in our classifier. We use a grid-search method to find the best hyperparameter combination. Grid search exhaustively searches over the specified parameter values, which is crude but effective.
Various combinations of feature-channels (n) are tried. The one with the highest model accuracy will be regarded as the optimal subset of a patient. The whole algorithm for finding the optimal combination is as shown in Algorithm 1.

2.6. Performance Evaluation

Cross-validation (CV) is a model validation technique for assessing how the results of a model will generalize to an independent dataset. In order to obtain a reliable outcome, a leave-one-out CV method is used. Compared with k-fold cross-validation, leave-one-out is more practical and deterministic.
Algorithm 1: Finding optimal feature-channels using a sequential forward selection approach
  • Require:
  •  The reordered feature-channel set $S = \{s_i,\ i = 1, \ldots, N\}$;
  • Ensure:
  •  The optimal feature-channel subset $S^{*}$;
  •  Initialize the model accuracy $Acc \leftarrow 0$;
  • for each $n \in [1, N]$ do
  •   Feed the feature subset $\{s_1, \ldots, s_n\}$ to the SVM classifier;
  •   Tune the parameters of the SVM using grid-search;
  •   Use leave-one-out to get the model accuracy $acc$;
  •   if $acc > Acc$ then
  •     $Acc \leftarrow acc$
  •     $S^{*} \leftarrow \{s_1, \ldots, s_n\}$
  •   end if
  • end for
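The loop in Algorithm 1 can be sketched in plain Python. The classifier training and cross-validation are abstracted into a caller-supplied `evaluate` function, which is a hypothetical placeholder for "tune the SVM by grid search and return the leave-one-out accuracy"; the toy accuracy values are invented for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

def select_best_prefix(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Try the first n ranked feature-channels for n = 1..N; keep the best."""
    best_acc = 0.0
    best_subset: List[str] = []
    for n in range(1, len(ranked) + 1):
        subset = list(ranked[:n])
        acc = evaluate(subset)
        if acc > best_acc:          # strict '>' keeps the smallest best subset
            best_acc, best_subset = acc, subset
    return best_subset, best_acc

# Toy stand-in for the evaluation step: accuracy as a function of subset size.
toy_accuracy = [0.70, 0.80, 0.85, 0.84, 0.85]
ranked = ["s1", "s2", "s3", "s4", "s5"]
subset, acc = select_best_prefix(ranked, lambda s: toy_accuracy[len(s) - 1])
print(subset, acc)  # ['s1', 's2', 's3'] 0.85
```

Because the comparison is strict, a later subset that merely ties the best accuracy does not replace an earlier, smaller one.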
In leave-one-out CV, one observation contains some pre-ictal and inter-ictal samples, which correspond to a seizure. Suppose there are $P$ observations for a certain case, where the value of $P$ depends on the number of seizures (used) in Table 1. Leave-one-out uses one observation as the validation set and the remaining $P - 1$ observations as the training set. This is repeated on all observations to obtain $P$ validation results. The average of these results is defined as the performance of this model.
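The splitting scheme described above might look as follows. This sketch assumes each sample carries an integer label identifying the seizure (observation) it belongs to; the function name and the toy labels are illustrative.

```python
import numpy as np

def leave_one_observation_out(groups):
    """Yield (observation id, train indices, validation indices) per fold."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        val = np.flatnonzero(groups == g)     # all samples of one seizure
        train = np.flatnonzero(groups != g)   # samples of the other P - 1
        yield g, train, val

# Six samples belonging to P = 3 observations.
groups = [0, 0, 1, 1, 1, 2]
folds = list(leave_one_observation_out(groups))
for g, train, val in folds:
    print(g, train, val)

# The model performance is the mean of the P per-fold scores, for example:
fold_scores = [0.9, 0.8, 1.0]   # hypothetical accuracies
performance = float(np.mean(fold_scores))
```

Holding out whole observations, rather than individual 5-s samples, keeps overlapping segments of the same seizure from appearing in both the training and validation sets.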
In order to produce the best and most robust models, classifiers with different hyperparameters are tried. In practice, it is effective to let the candidate parameter sequences of the SVM grow exponentially. Therefore, we do a grid search on $\gamma \in [2^{-15}, 2^{-13}, \ldots, 2^{3}]$ and $C \in [10^{-4}, 10^{-3}, \ldots, 10^{4}]$.
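The candidate grid can be enumerated in a few lines. The sketch assumes that the exponents of $\gamma$ run from -15 to 3 in steps of 2 and those of $C$ from -4 to 4 in steps of 1, which is one reading of the ranges above; if the intended exponents differ, only the two `range` calls need to change.

```python
import itertools

# Assumed exponent ranges: gamma = 2^-15, 2^-13, ..., 2^3 and C = 10^-4, ..., 10^4.
gammas = [2.0 ** e for e in range(-15, 4, 2)]
Cs = [10.0 ** e for e in range(-4, 5)]

# Every (C, gamma) pair is one candidate classifier configuration.
grid = list(itertools.product(Cs, gammas))
print(len(gammas), len(Cs), len(grid))  # 10 9 90
```

Under this reading, each candidate feature subset requires 90 cross-validated model fits, which helps explain why the per-patient search is time-consuming.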
Seven metrics are applied to measure the performance. Accuracy is defined as the percentage of correctly classified samples among all samples, which reflects the overall performance of the model when the data are balanced. The false prediction rate (FPR) is defined as the proportion of the inter-ictal duration that is wrongly predicted as pre-ictal, expressed per hour. Sensitivity (SEN) is the proportion of pre-ictal samples that are correctly predicted. Moreover, the area under the curve (AUC), F1, and kappa are used to measure the classification ability. Cost is the average prediction time spent on a sample, including preprocessing, feature extraction, and classification. All experiments are implemented in Python 3.6 on a server with six 3.7 GHz Intel Core (TM) CPUs running Ubuntu 16.04.
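Four of these metrics follow directly from the confusion matrix, as the sketch below shows using their standard definitions, with pre-ictal taken as the positive class. The counts are invented for illustration. FPR, AUC, and Cost are omitted because they depend on details (alarm logic, classifier scores, timing) that are not specified at this point.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, F1, and Cohen's kappa from confusion counts."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)            # recall of the pre-ictal class
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Chance agreement: product of predicted and actual marginals per class.
    p_e = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - p_e) / (1 - p_e)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "f1": f1, "kappa": kappa}

# Hypothetical counts for a balanced set of 100 samples.
metrics = binary_metrics(tp=45, fn=5, tn=40, fp=10)
print(metrics)
```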

3. Results and Discussion

This section presents the results of the proposed feature design method applied to the scalp EEG data, and compares it with other design principles. Then, visualization of required features and the generalization capacity of models are studied.

3.1. Performance of Different Numbers of Feature-Channels

As shown in Algorithm 1, combinations with different feature-channels are fed into the classifier to predict seizures in a patient-specific approach. In order to study the prediction performance of different n on different patients, five cases are randomly selected for detailed analysis.
Figure 4a visualizes the mRMR scores of these five cases, which are chb01, chb02, chb11, chb19, and chb21, from top to bottom. The vertical axis represents 18 channels, and the horizontal axis represents 67 features. A brighter color indicates that the feature-channel is more valuable to this patient. The accuracies of five models with different n from 1 to 200 are presented in Figure 4b. The selection order of first n-ranked feature-channels is based on mRMR score.
As can be seen from Figure 4, finding a specific feature-channel subset for each patient is crucial to seizure prediction. The effects of the same feature and channel on different patients are diverse. In addition, the number of required feature-channels differs from patient to patient. According to the accuracy curve of each patient, the model gradually reaches its optimum as features are added. However, with a further increase in features, model accuracy declines and tends to flatten. Hence, to highlight the difference, only the accuracies for n from 1 to 200 are displayed. The selection of the feature subset is based on the evaluation of the mRMR algorithm. Therefore, the feature-channel that is added to the feature subset first has the maximal relevance to the category. For case chb11, the accuracy of the model is 74.4% when only nonlinear energy-FP2-F4 is used. When the fourth feature-channel is added, the accuracy of the model decreases from 87.8% to 87.7%. This kind of fluctuation is very common in the training and evaluation of some machine learning algorithms (e.g., SVM), for which the model is not guaranteed to achieve the best performance. However, this fluctuation does not affect the overall trend. Once the first six feature-channels are included, the feature subset is optimal, and any further feature-channels are redundant. In particular, case chb02 only needs local extrema-FP1-F3 to achieve optimal performance. New redundant feature-channels reduce the accuracy of the model. Finally, the optimal models are obtained for cases chb01, chb02, chb11, chb19, and chb21 when using the first 10, 1, 6, 36, and 17 feature-channels, respectively.
The model trained with these optimal feature-channels is considered the best model for the patient. After the 23 patients with 146 seizures in the CHB-MIT sEEG database were evaluated, the number of items in the optimal feature-channel combination and the classifier parameters of the best model are presented in Table 3. The complete list of optimal feature-channel combinations is given in Appendix B. The features and channels that contribute most to seizure prediction in each patient are provided for future research. We can see that both the number of optimal feature-channels and their combination vary greatly from patient to patient.

3.2. Comparison of Different Feature Design Principles

To examine whether the proposed prediction method is significantly different from the general feature design principle in previous works, comparison and discussion between optimal feature-channel subset, summarized feature subset, and complete feature-channel set are presented.
Summarized feature subset is the combination of the top 11 features (except channel) with the most occurrences in the optimal feature-channel combinations in Table A1. The statistics of a total of 62 features are presented in Figure 5, in which the features with a sky-blue bar are regarded as summarized features. These 11 features are extracted from 18 channels for all patients. Then, 198-D feature vectors are fed into the classifier in a patient-specific method.
The complete channel-based feature set is the combination of all feature-channels extracted from time, frequency, and time–frequency domains. Feature vectors with 1206-D are used to train the SVM classifier.
Figure 6 presents the comparison of classification accuracy obtained by these three feature sets. The optimal feature-channel subset gives an overall accuracy of 90.4%, while summarized feature subset and complete feature set achieve an accuracy of 84.8% and 85.0%, respectively.
Other metrics are also applied to measure the performance of these three feature sets (as shown in Table 4). Models trained using optimal feature subsets reach a sensitivity of 90.2% with FPR of 0.096/h.
In summary, the overall performance of each patient model obtained when using the optimal feature-channel subset is higher than the other two. Comparing summarized feature subset and complete feature set, even when a generous amount of features in the complete feature set is used, the upgrade in performance is negligible. Features in the complete set may have shared similar contributions, making them redundant in the training process. Especially in chb19, there is a significant difference between the performance of these two feature sets. It also indicates that only specific features can effectively discriminate the signal states in some patients.
The method of iteratively selecting the optimal features for each patient still suffers from the problem of great time complexity. Therefore, in the case of ensuring acceptable performance, the 11 summarized features (line length-beta [12], local extrema [13], Higuchi FD-beta, zero-crossings, line length-alpha, line length-theta, Higuchi FD [14], SVDEn [15], peak frequency [16], Hurst exponent [15,17], and Higuchi FD-alpha) that are commonly applied in seizure prediction can still be used as a guide for general features of all patients.

3.3. Visualization of Optimal Feature-Channels

To investigate how the optimal feature-channels contributed to the classification, we project the optimal feature-channel vector of cases chb01, chb20, chb23 and chb14 onto a 2D plane using t-SNE [18] algorithm. The selection principle is based on various n, which are 10, 26, 37, and 55, respectively.
Figure 7 shows the dimension reduction results of optimal features of these four cases. The blue dots represent pre-ictal samples, and the orange ones represent inter-ictal samples.
It can be seen that the samples represented by the optimal feature-channels can be well divided into two categories. This indicates that these features are also friendly to other classifiers, not only to SVM.

3.4. Evaluation of Generalization Ability

In order to verify the generalization ability of models, the optimal models are evaluated on the extended pre-ictal data. Many studies have found that characteristic changes in EEG signals occur within 30 min before onset [2,19]. Since the data 15 min before onset are used to train and evaluate the models, we use the same criteria in Section 2.2 to collect the EEG signals from 15 to 30 min before onset as the extended pre-ictal data. These data are divided into three datasets at 5 min intervals, which are −30 to −25 min dataset, −25 to −20 min dataset, and −20 to −15 min dataset.
The evaluation of models trained with optimal feature-channels on these three datasets is presented in Table 5.
Since the pre-ictal state exists within 30 min before onset, most models can also accurately predict data that were not involved in training. The prediction time of our models is much shorter than 5 s, so the approach can readily be applied to real-time prediction.
However, some patient models (e.g., chb21) have negative results on these data. The possible reason is that the signal 15 min before onset of this patient is very different from the signal 15 to 30 min before onset. This suggests that the pre-ictal window varies greatly from patient to patient.
Generally, we believe that the closer a signal is to onset, the more obvious its characteristics. However, many models (e.g., chb03 and chb07) have better prediction results on the −30 to −25 min dataset than on the −25 to −20 min dataset. Therefore, a detailed analysis of the timing of pre-ictal changes may be worth examining in future studies.

3.5. Comparison to Prior Works

A comparison of the proposed framework and other methodology from the works in pre-ictal/inter-ictal classification is presented in Table 6. The focus of the comparison is studies such as ours that have been evaluated within the CHB-MIT database. From the methods perspective, a similar workflow is used, with each work using different sets of features and classifiers. Models trained with optimal features show a promising performance with high sensitivity and low FPR.

4. Conclusions

In this study, mRMR is employed to determine the quality of each channel-based feature from the time, frequency, and time–frequency domains. Based on this, we use the prediction accuracy of SVM combined with a sequential forward selection method to determine the optimal subset of features for each case. The models trained with optimal features achieve an overall sensitivity of 90.2% and an FPR of 0.096/h in the classification of pre-ictal and inter-ictal states on the CHB-MIT database. In addition, a comparison of the optimal features with the summarized feature subset and the complete set of features from the three domains shows that finding an optimal feature-channel subset for each patient represents an important step in seizure prediction. This suggests a valuable future research direction: applying a method that combines feature quality measurement with classification to save training time.

Author Contributions

D.M. developed the theory and performed the computational experiments. D.M. and J.Z. wrote the manuscript with support from L.P. All authors discussed the results and contributed to the final manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the National Natural Science Foundation of China under Grant No. 61972176, No. 61472164, No. 61672262, No. 61572230, and No. 61573166, the Shandong Provincial Key R&D Program under Grants No. 2018CXGC0706, and No. 2017CXZC1206.

Conflicts of Interest

The authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Appendix A. Detailed Description of Features

In this section, the features expressed in italics are briefly described and explained with mathematical expressions.

Appendix A.1. Time Domain Features

Time domain features are directly estimated on raw EEG signals that are a function of time and amplitude. Amplitude refers to the instantaneous energy, which emphasizes the changes. Therefore, the time domain features are the numerical embodiment of the changes of signals. $X = \{x_1, x_2, \ldots, x_n\}$ refers to a signal of length $n$.
  • Basic statistics:
    Basic statistical features have frequently been used to distinguish pre-ictal patterns from inter-ictal patterns. Mean, skewness, kurtosis, peak-to-peak amplitude, and coefficient of variation are used in this work. Skewness, which measures the asymmetry of the signal distribution, is calculated as:
    $$\mathrm{Skew} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \mu}{\sigma} \right)^3,$$
    where $\mu$ is the mean value, and $\sigma$ is the standard deviation of signal $X$. Kurtosis can be used to measure the steepness of a signal as follows:
    $$\mathrm{Kurtosis} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \mu}{\sigma} \right)^4.$$
    Peak-to-peak amplitude quantifies the range of the signal. Coefficient of variation is a normalized measure of signal dispersion, which is defined as the ratio of the standard deviation to the average value.
  • Energy related:
    The energy is the sum of squares of the signal. The average power is the mean square of the signal, that is, the average energy. The root mean square is the square root of the average power. The last energy related feature is nonlinear energy, which was originally presented in [25]. This feature can track the instantaneous frequency of the signal effectively, and its output is given by:
    $$NE = \sum_{i=1}^{n-2} \left( x_i^2 - x_{i-1} \cdot x_{i+1} \right).$$
  • Line length:
    Line length, derived from Katz’s fractal dimension, was first proposed by [26]. It increases as the amplitude or frequency of the signal increases. The normalized line length can be represented as:
    $$LL = \frac{1}{n-1} \sum_{i=2}^{n} \left| x_i - x_{i-1} \right|.$$
  • Entropy based:
    Entropy, as a measure of uncertainty and disorder in the data, has been validated in much signal processing research. Approximate entropy (ApEn), sample entropy (SampEn), and singular value decomposition entropy (SVDEn) are the features used in this work. ApEn [27] can be used to quantify the regularity of the signal. The signal $X$ is cut into subsequences $u_m(i)$ for $\{i \mid 1 \le i \le n - m + 1\}$ using a sliding window of length $m$. Subsequence $i$ is given by $u_i = \{x_i, x_{i+1}, \ldots, x_{i+m-1}\}$. Then, the ApEn is defined as:
    $$ApEn(m, r, n) = \frac{1}{n-m+1} \sum_{i=1}^{n-m+1} \ln C_i^m(r) - \frac{1}{n-m} \sum_{i=1}^{n-m} \ln C_i^{m+1}(r),$$
    where $C_i^m(r)$ is the proportion of subsequences whose distance to subsequence $u_i$ is less than the tolerance value $r = 0.2 \cdot \mathrm{std}(X)$. It can be calculated by:
    $$C_i^m(r) = \frac{1}{n-m+1} \sum_{j=1}^{n-m+1} H(r - d(u_i, u_j)),$$
    where $H(\cdot)$ is the Heaviside step function, and $d(\cdot, \cdot)$ is the distance function. SampEn [28] is based upon concepts similar to ApEn. It only counts other subsequences when calculating distances, so the $B_i^m(r)$ corresponding to $C_i^m(r)$ is:
    $$B_i^m(r) = \frac{1}{n-m} \sum_{j=1}^{n-m+1} H(r - d(u_i, u_j)), \quad j \ne i.$$
    In addition, the logarithm is applied in the final entropy calculation. Therefore, the SampEn is defined by:
    $$SampEn(m, r, n) = \ln \left( \frac{1}{n-m} \sum_{i=1}^{n-m} B_i^m(r) \right) - \ln \left( \frac{1}{n-m} \sum_{i=1}^{n-m} B_i^{m+1}(r) \right).$$
    Smaller values indicate more self-similar and regular signals, while larger values characterize higher complexity. SVDEn [29] measures the dimensionality of the signals. It uses the embedding matrix $Y$, which can be written as:
    $$y_i = [x_i, x_{i+j}, \ldots, x_{i+(m-1)j}], \quad Y = [y_1, y_2, \ldots, y_{n-(m-1)j}]^T,$$
    where $m$ is the embedding dimension, and $j$ is the time delay. Then, this entropy is described as follows:
    $$SVDEn = -\sum_{i=1}^{M} \bar{\sigma}_i \log_2 \bar{\sigma}_i,$$
    where $M$ is the number of singular values of the embedding matrix $Y$ and $\bar{\sigma}_i$ is the normalized singular value. Since SVDEn is calculated on all channels, the value can indicate a pattern of pre-ictal signal in both space and time.
  • Hurst exponent:
    The Hurst exponent can evaluate the predictability of a time series, and it has been reported that the epileptic brain is long-term anticorrelated [30]. Rescaled range analysis is a commonly used method for computing it. The cumulative deviate series is defined as follows:
    $$Z_i = \sum_{u=1}^{i} ( x_u - \mu ), \quad i = 1, 2, \ldots, n.$$
    The range is then computed as:
    $$R_n = \max_{1 \le t \le n} \{ Z_t \} - \min_{1 \le t \le n} \{ Z_t \}.$$
    In addition, the standard deviation is:
    $$S_n = \sigma.$$
    The Hurst exponent HE satisfies the following relation, where $c$ is a constant:
    $$\frac{R_n}{S_n} = c \cdot n^{\mathrm{HE}}.$$
  • Higuchi Fractal Dimension:
    Fractal dimension can be used to measure signal complexity. Higuchi fractal dimension [31] calculates the fractal dimension of a time series in the time domain. A discrete time interval $k$ is used to construct a set of new time series $U = \{ u_i \mid 1 \le i \le k \}$, where $u_i = \{ x_i, x_{i+k}, \ldots, x_{i + \lfloor (n-i)/k \rfloor k} \}$, $i = 1, 2, \ldots, k$. The average length of the new time series $u_i$ is computed as:
    $$L_i = \sum_{j=1}^{\lfloor (n-i)/k \rfloor} \left| x_{i+jk} - x_{i+(j-1)k} \right| \cdot \frac{n-1}{\lfloor (n-i)/k \rfloor \, k}.$$
    Thus, the total average length of all new series can be written as:
    $$L = \sum_{i=1}^{k} L_i.$$
    Finally, the Higuchi fractal dimension is the slope of the linear regression between $\ln L$ and $\ln \frac{1}{k}$.
  • Hjorth parameters:
    The three Hjorth parameters proposed by [32] together characterize the EEG signal in terms of amplitude, time scale, and complexity. Hjorth parameters were calculated on the raw signal series $X$, the first derivative of the series $X'$, and the second derivative $X''$. The derivatives were obtained as differences, namely:
    $$x'_j = x_{j+1} - x_j, \quad j = 1, 2, \ldots, n-1, \qquad x''_j = x'_{j+1} - x'_j, \quad j = 1, 2, \ldots, n-2.$$
    The first parameter, Hjorth activity, is the variance $\sigma_X^{2}$ of the signal $X$. The second parameter, Hjorth mobility, can be expressed as:
    $$\mathrm{mobility} = \frac{\sigma_{X'}}{\sigma_X}.$$
    The third parameter, Hjorth complexity, is defined as:
    $$\mathrm{complexity} = \frac{\sigma_{X''} / \sigma_{X'}}{\sigma_{X'} / \sigma_X}.$$
  • Detrended fluctuation analysis (DFA):
    DFA [33] is another method for analyzing long-range correlations, similar to rescaled range analysis. A new series is constructed by cumulatively summing the mean-removed signal, $u_i = \sum_{j=1}^{i} ( x_j - \mu )$, and is then segmented into $k$ subintervals using the length list $v = \{ v_1, v_2, \ldots, v_k \}$. In each subinterval, the data are fitted by polynomial regression to obtain the fitted values $\hat{u}_l$, and the mean-squared residual is found:
    $$F_j = \frac{1}{v_j} \sum_{l=1}^{v_j} ( u_l - \hat{u}_l )^{2}, \quad j = 1, 2, \ldots, k.$$
    Finally, the DFA feature is the slope of the first-order least-squares regression between $\ln v$ and $\ln F$.
  • Number of zero-crossings:
    A zero-crossing, a widely used feature, is a point where the sign of the signal amplitude changes. The number of zero-crossings is therefore the count of such points in a signal series. It indirectly reflects the change of signal frequency: a large number indicates that the signal contains relatively high-frequency components.
  • Number of local extrema:
    Local extrema consist of local maxima and minima. The local maxima, the so-called peaks, are obtained by a simple comparison of neighboring values. Local minima are found in the same way. Like the number of zero-crossings, the number of local extrema is an indirect measurement of signal frequency.
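
Several of the time domain features above can be sketched in a few lines of plain Python. The block below is an illustrative sketch, not the authors' implementation: all function names are hypothetical, the standard deviation is taken as the population (1/n) form, and the sample entropy assumes the Chebyshev distance with matches counted when the distance is at most r, which the text does not specify.

```python
import math


def mean(x):
    return sum(x) / len(x)


def std(x):
    # Population standard deviation (1/n), matching the 1/n sums in the formulas.
    mu = mean(x)
    return math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))


def standardized_moment(x, order):
    # order=3 gives skewness, order=4 gives kurtosis (not excess kurtosis).
    mu, sigma = mean(x), std(x)
    return sum(((v - mu) / sigma) ** order for v in x) / len(x)


def nonlinear_energy(x):
    # Sum of x_i^2 - x_{i-1} * x_{i+1} over the interior samples.
    return sum(x[i] ** 2 - x[i - 1] * x[i + 1] for i in range(1, len(x) - 1))


def line_length(x):
    # Mean absolute difference between consecutive samples.
    return sum(abs(x[i] - x[i - 1]) for i in range(1, len(x))) / (len(x) - 1)


def hjorth(x):
    # Derivatives are approximated by first and second differences.
    d1 = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    d2 = [d1[i + 1] - d1[i] for i in range(len(d1) - 1)]
    activity = std(x) ** 2
    mobility = std(d1) / std(x)
    complexity = (std(d2) / std(d1)) / mobility
    return activity, mobility, complexity


def zero_crossings(x):
    # Assumption: a sample equal to zero is treated as non-negative.
    signs = [v >= 0 for v in x]
    return sum(signs[i] != signs[i - 1] for i in range(1, len(signs)))


def sample_entropy(x, m=2, r=None):
    # Assumptions: Chebyshev distance, match when distance <= r, r = 0.2 * std.
    n = len(x)
    r = 0.2 * std(x) if r is None else r

    def count_matches(length):
        # Use the same n - m template start positions for both lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        matches = 0
        for i, a in enumerate(templates):
            for j, b in enumerate(templates):
                if i != j and max(abs(p - q) for p, q in zip(a, b)) <= r:
                    matches += 1
        return matches

    # The normalizing constants cancel in the difference of logarithms.
    return math.log(count_matches(m)) - math.log(count_matches(m + 1))
```

Each function takes one channel's samples as a list of floats. Note that `sample_entropy` raises a `ValueError` when no template matches are found, so very short or highly irregular windows need a larger tolerance or explicit handling.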

Appendix A.2. Frequency Domain Features

The power spectral density (PSD) describes how the power of a signal is distributed over frequency. We use $P = \{ p_1, p_2, \ldots, p_n \}$ to denote the signal’s PSD over the frequency range from 1 to $n$, and $\hat{P}$ to denote the PSD normalized by the total power.
  • Energy-PSD:
    Analogous to the energy in the time domain, the energy of the PSD, named energy-PSD, is the sum of the squares of the PSD.
  • Intensity weighted mean frequency (IWMF):
    IWMF, also known as the mean frequency, is the weighted mean of the frequencies present in the normalized PSD estimate of the signal series. It is defined as:
    $$\mathrm{IWMF} = \sum_{i=1}^{n} i \, \hat{p}_i.$$
  • Intensity weighted bandwidth (IWBW):
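    IWBW, also called standard deviation frequency, measures the width of the normalized PSD as a standard deviation, that is, the $\hat{p}$-weighted spread of frequency about the IWMF. It is defined to be:
    $$\mathrm{IWBW} = \sqrt{ \sum_{i=1}^{n} \hat{p}_i \, ( i - \mathrm{IWMF} )^{2} }.$$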
  • Spectral edge frequency (SEF):
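    We use $\mathrm{SEF}_{50}$ [34], the median frequency, defined as the lowest frequency at which the cumulative power exceeds 50% of the total spectral power up to a reference frequency $f_{\mathrm{ref}}$:
    $$\mathrm{SEF}_{50} = \min \left\{ f^{*} \;\middle|\; \sum_{i=1}^{f^{*}} p_i > 0.5 \cdot \sum_{j=1}^{f_{\mathrm{ref}}} p_j \right\}.$$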
  • Spectral entropy:
    The entropy of the power spectrum [35] can quantify the irregularity of the EEG. Spectral entropy is calculated on the normalized PSD using the classical entropy [36]. Thus, it is defined to be:
    $$\mathrm{SE} = - \sum_{i=1}^{n} \hat{p}_i \log_2 \hat{p}_i.$$
  • Peak frequency:
    Peak frequency [37], also called dominant frequency, is the frequency at the peak which has the largest average power in its full-width-half-maximum (FWHM) band. The FWHM band is defined by two frequencies, which are within the rising slope and falling slope, respectively, and their amplitudes are equal to half of the peak’s amplitude. This feature can find the most prominent rhythmic component of the signal.
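
The spectral measures above reduce to a few sums over the PSD bins. The following is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the PSD is a plain list whose bin $i$ (counted from 1) stands for frequency $i$, and IWBW is taken as the $\hat{p}$-weighted standard deviation of the bin index about the IWMF.

```python
import math


def normalize(psd):
    # Divide every bin by the total power so the bins sum to 1.
    total = sum(psd)
    return [p / total for p in psd]


def energy_psd(psd):
    # Sum of the squares of the PSD values.
    return sum(p * p for p in psd)


def iwmf(psd):
    # Intensity weighted mean frequency: sum of i * p_hat_i, bins counted from 1.
    return sum(i * p for i, p in enumerate(normalize(psd), start=1))


def iwbw(psd):
    # Assumed form: weighted standard deviation of the bin index about the IWMF.
    center = iwmf(psd)
    spread = sum(p * (i - center) ** 2
                 for i, p in enumerate(normalize(psd), start=1))
    return math.sqrt(spread)


def spectral_edge_frequency(psd, fraction=0.5, f_ref=None):
    # Lowest bin whose cumulative power exceeds `fraction` of the power up to f_ref.
    f_ref = len(psd) if f_ref is None else f_ref
    target = fraction * sum(psd[:f_ref])
    cumulative = 0.0
    for i, p in enumerate(psd[:f_ref], start=1):
        cumulative += p
        if cumulative > target:
            return i
    return f_ref


def spectral_entropy(psd):
    # Shannon entropy (base 2) of the normalized PSD; empty bins contribute 0.
    return -sum(p * math.log2(p) for p in normalize(psd) if p > 0)
```

In practice the PSD would come from a spectral estimator such as Welch's method [7], and the bin index would be mapped to a frequency in Hz before reporting IWMF, IWBW, or SEF.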

Appendix A.3. Time–Frequency Domain Features

Features in this domain are applied to four frequency sub-bands ($X_\delta$, $X_\theta$, $X_\alpha$, $X_\beta$).
Table A1. Optimal feature-channel combinations for every patient. In the time and frequency domains, feature_name-channel_name denotes the channel-based feature. In the time–frequency domain, feature_name-pattern_name-channel_name is used to represent it.
Case | Optimal n | Optimal Feature-Channel Subset
chb01 | 10 | local extrema-P4-O2, hurst exponent-theta-FP2-F4, kurtosis-delta-FP1-F3, hurst exponent-theta-FP1-F7, energy-alpha-P4-O2, Higuchi FD-beta-T8-P8, line length-beta-FP1-F7, local extrema-C4-P4, hurst exponent-FP2-F4, SVDEn-FZ-CZ
chb02 | 1 | local extrema-FP1-F3
chb03 | 19 | hurst exponent-P3-O1, zero-crossings-F8-T8, Higuchi FD-beta-C3-P3, line length-alpha-P3-O1, peak frequency-P8-O2, local extrema-CZ-PZ, energy-delta-F7-T7, skewness-beta-T8-P8, ApEn-CZ-PZ, mean-P7-O1, mean of absolute-delta-P7-O1, Higuchi FD-beta-F3-C3, root mean square-theta-FZ-CZ, Hjorth complexity-P7-O1, root mean square-delta-P7-O1, hurst exponent-P4-O2, hurst exponent-delta-C4-P4, peak frequency-T8-P8, ptp amplitude-alpha-F8-T8
chb04 | 58 | mean of absolute-delta-FP1-F3, mean-FP2-F4, line length-beta-FP1-F3, Higuchi FD-P4-O2, root mean square-delta-FP2-F4, Higuchi FD-alpha-FP2-F8, spectral entropy-FZ-CZ, peak frequency-C3-P3, mean-FP1-F7, peak frequency-P3-O1, ptp amplitude-delta-FP1-F3, Higuchi FD-beta-C4-P4, local extrema-FP2-F8, root mean square-delta-FP1-F3, local extrema-P4-O2, mean of absolute-theta-P3-O1, ptp amplitude-delta-FP2-F4, line length-alpha-FP1-F3, Higuchi FD-C4-P4, mean of absolute-delta-FP2-F4, line length-P3-O1, ptp amplitude-delta-FP1-F7, mean-FP1-F3, mean of absolute-delta-FP1-F7, peak frequency-C4-P4, mean of absolute-beta-P3-O1, zero-crossings-P3-O1, SVDEn-FP2-F4, line length-delta-FP2-F4, line length-beta-FP1-F7, Higuchi FD-beta-P4-O2, local extrema-FP1-F7, SVDEn-FZ-CZ, root mean square-delta-FP1-F7, ptp amplitude-FP2-F4, hurst exponent-P4-O2, line length-FP1-F3, local extrema-FP2-F4, mean of absolute-alpha-P3-O1, root mean square-FP2-F4, peak frequency-P4-O2, zero-crossings-FZ-CZ, average power-FP1-F7, line length-alpha-FP1-F7, root mean square-F4-C4, Higuchi FD-beta-F8-T8, mean of absolute-beta-FP1-F3, peak frequency-FP1-F3, Higuchi FD-FP2-F8, local extrema-FZ-CZ, SampEn-C3-P3, SVDEn-C4-P4, ptp amplitude-delta-F4-C4, line length-theta-P3-O1, average power-delta-FP1-F3, energy-delta-FP1-F3, peak frequency-P7-O1, mean of absolute-theta-FP1-F3
chb05 | 4 | energy-PSD-P7-O1, spectral entropy-FP1-F3, spectral entropy-P7-O1, local extrema-T7-P7
chb06 | 28 | IWBW-FP2-F8, hurst exponent-beta-P3-O1, local extrema-F8-T8, local extrema-T7-P7, hurst exponent-T7-P7, Higuchi FD-delta-P4-O2, hurst exponent-T8-P8, ptp amplitude-beta-T8-P8, line length-beta-F7-T7, Higuchi FD-FP2-F8, DFA-P3-O1, hurst exponent-F8-T8, Higuchi FD-theta-T8-P8, Higuchi FD-beta-F7-T7, ptp amplitude-beta-F8-T8, local extrema-P7-O1, Higuchi FD-alpha-F8-T8, line length-beta-FP2-F8, Hjorth complexity-FP1-F7, line length-alpha-T7-P7, local extrema-T8-P8, Higuchi FD-beta-CZ-PZ, Higuchi FD-beta-T8-P8, Higuchi FD-alpha-T7-P7, skewness-alpha-F8-T8, spectral entropy-P4-O2, hurst exponent-FP2-F8, local extrema-F7-T7
chb07 | 57 | ptp amplitude-delta-FP1-F3, hurst exponent-FP1-F7, Higuchi FD-beta-P3-O1, Hjorth complexity-C3-P3, ptp amplitude-beta-C4-P4, zero-crossings-T8-P8, line length-beta-F3-C3, ptp amplitude-alpha-F8-T8, peak frequency-T7-P7, hurst exponent-theta-P8-O2, ptp amplitude-theta-C4-P4, DFA-P3-O1, line length-beta-P3-O1, hurst exponent-F8-T8, local extrema-P4-O2, IWBW-P4-O2, spectral entropy-FP1-F3, ptp amplitude-FP2-F4, line length-beta-FP1-F3, hurst exponent-FP2-F8, peak frequency-P3-O1, nonlinear energy-C3-P3, ptp amplitude-delta-FP1-F7, energy-alpha-FP1-F7, line length-alpha-F3-C3, root mean square-beta-FP1-F3, ptp amplitude-beta-F8-T8, ptp amplitude-alpha-C4-P4, Hjorth complexity-C4-P4, energy-theta-C3-P3, nonlinear energy-C4-P4, peak frequency-C3-P3, SampEn-P7-O1, hurst exponent-F7-T7, average power-alpha-C3-P3, SEF-P7-O1, Higuchi FD-theta-P4-O2, spectral entropy-P8-O2, ptp amplitude-delta-FP2-F4, line length-delta-CZ-PZ, Higuchi FD-T8-P8, IWBW-FP1-F7, peak frequency-FP2-F8, line length-theta-C4-P4, line length-alpha-P3-O1, hurst exponent-alpha-FP1-F7, Higuchi FD-beta-C3-P3, ptp amplitude-delta-FP2-F8, energy-beta-FP1-F7, energy-beta-C4-P4, peak frequency-P7-O1, root mean square-alpha-FP1-F3, mean of absolute-beta-P4-O2, IWMF-P3-O1, SVDEn-FP2-F4, Higuchi FD-beta-C4-P4, average power-delta-C3-P3
chb08 | 30 | mean of absolute-delta-F3-C3, Higuchi FD-alpha-CZ-PZ, line length-alpha-C4-P4, SampEn-P3-O1, IWMF-F3-C3, Higuchi FD-delta-F8-T8, line length-delta-T8-P8, line length-delta-C3-P3, ptp amplitude-beta-T8-P8, zero-crossings-T8-P8, line length-theta-C4-P4, hurst exponent-FP2-F4, skewness-beta-P8-O2, hurst exponent-alpha-T8-P8, Higuchi FD-alpha-C4-P4, Hjorth mobility-CZ-PZ, zero-crossings-C4-P4, Higuchi FD-theta-C4-P4, Hjorth complexity-F4-C4, Higuchi FD-theta-P8-O2, hurst exponent-beta-F4-C4, ptp amplitude-delta-FZ-CZ, root mean square-P3-O1, ApEn-T8-P8, SampEn-C4-P4, mean of absolute-delta-P3-O1, mean of absolute-theta-T8-P8, line length-theta-P8-O2, Higuchi FD-alpha-F4-C4, mean of absolute-theta-C3-P3
chb09 | 1 | peak frequency-P8-O2
chb10 | 7 | mean of absolute-theta-T8-P8, line length-C4-P4, hurst exponent-T7-P7, ptp amplitude-T7-P7, line length-theta-FP1-F7, ptp amplitude-delta-FP1-F7, Higuchi FD-delta-F8-T8
chb11 | 6 | nonlinear energy-FP2-F4, local extrema-FP1-F7, peak frequency-P3-O1, zero-crossings-T7-P7, IWBW-T8-P8, Hjorth complexity-P4-O2
chb12 | 31 | root mean square-beta-F8-T8, ptp amplitude-FZ-CZ, local extrema-F4-C4, hurst exponent-beta-P3-O1, mean of absolute-theta-P3-O1, local extrema-T8-P8, Higuchi FD-theta-FP1-F7, line length-beta-FP2-F4, ptp amplitude-FP2-F8, ptp amplitude-beta-T8-P8, Higuchi FD-P3-O1, hurst exponent-F8-T8, local extrema-F8-T8, mean of absolute-theta-P4-O2, hurst exponent-beta-T8-P8, kurtosis-delta-FP1-F7, ptp amplitude-theta-FP2-F8, Higuchi FD-F4-C4, IWBW-P8-O2, IWBW-F8-T8, ptp amplitude-theta-FZ-CZ, ptp amplitude-beta-T7-P7, zero-crossings-CZ-PZ, line length-beta-T8-P8, hurst exponent-alpha-P8-O2, IWBW-P4-O2, Higuchi FD-beta-P7-O1, root mean square-delta-F4-C4, Higuchi FD-beta-T8-P8, mean of absolute-alpha-P3-O1, ptp amplitude-delta-FP2-F8
chb13 | 37 | SEF-C4-P4, skewness-delta-FP1-F7, mean of absolute-beta-C3-P3, Hjorth complexity-C3-P3, local extrema-P4-O2, average power-delta-C4-P4, ApEn-C3-P3, Higuchi FD-beta-C3-P3, kurtosis-delta-F7-T7, mean of absolute-delta-FP1-F7, Hjorth complexity-P3-O1, energy-delta-F8-T8, SVDEn-C4-P4, hurst exponent-C3-P3, local extrema-P8-O2, hurst exponent-beta-C3-P3, kurtosis-delta-FP1-F7, mean-C3-P3, zero-crossings-C4-P4, local extrema-P3-O1, hurst exponent-beta-F8-T8, SVDEn-P4-O2, SampEn-C3-P3, energy-delta-C4-P4, average power-delta-P4-O2, average power-delta-F8-T8, zero-crossings-P4-O2, kurtosis-delta-FP1-F3, hurst exponent-alpha-C4-P4, SVDEn-C3-P3, Higuchi FD-C3-P3, line length-theta-C3-P3, Hjorth mobility-C4-P4, local extrema-P7-O1, peak frequency-C3-P3, SEF-F8-T8, mean of absolute-delta-C4-P4
chb14 | 55 | Higuchi FD-alpha-P8-O2, line length-alpha-FP1-F7, Hjorth activity-FP1-F7, Higuchi FD-delta-P7-O1, energy-PSD-FP1-F3, Higuchi FD-alpha-CZ-PZ, root mean square-beta-FP1-F7, hurst exponent-beta-FP1-F7, line length-beta-T8-P8, line length-beta-FP2-F8, ptp amplitude-delta-FP1-F7, local extrema-T8-P8, SampEn-T8-P8, line length-beta-FP1-F3, line length-alpha-F7-T7, mean-FP1-F7, SVDEn-F7-T7, peak frequency-P3-O1, mean of absolute-theta-FP1-F7, Higuchi FD-delta-P4-O2, line length-alpha-F8-T8, Hjorth activity-FP2-F8, IWMF-P7-O1, line length-beta-FP1-F7, line length-alpha-FP2-F4, SampEn-P8-O2, IWBW-FP1-F7, Hjorth complexity-FP1-F7, Higuchi FD-delta-F4-C4, line length-alpha-FP2-F8, root mean square-FP1-F7, line length-FP1-F7, IWMF-P3-O1, line length-alpha-T8-P8, Higuchi FD-CZ-PZ, ptp amplitude-theta-FP1-F7, zero-crossings-T8-P8, IWBW-FP2-F8, mean of absolute-alpha-FP1-F7, line length-beta-F8-T8, average power-FP1-F7, Higuchi FD-delta-C4-P4, zero-crossings-P7-O1, ptp amplitude-beta-F7-T7, DFA-FP1-F3, ptp amplitude-FP1-F7, ptp amplitude-F7-T7, mean of absolute-beta-FP1-F7, hurst exponent-beta-F3-C3, SVDEn-T8-P8, energy-theta-FP2-F8, zero-crossings-P8-O2, ptp amplitude-delta-FP1-F3, line length-theta-FP2-F8, ApEn-FP1-F7
chb15 | 58 | mean of absolute-alpha-P4-O2, kurtosis-FZ-CZ, average power-delta-FP2-F4, line length-beta-P4-O2, Hjorth complexity-P3-O1, Higuchi FD-beta-P4-O2, line length-theta-FZ-CZ, IWBW-P8-O2, line length-alpha-P4-O2, mean-FP1-F3, SVDEn-T7-P7, kurtosis-delta-F8-T8, mean of absolute-theta-C4-P4, ptp amplitude-delta-FP2-F4, ApEn-C4-P4, skewness-beta-FP1-F3, mean of absolute-beta-P4-O2, average power-delta-FP1-F3, Higuchi FD-theta-FP1-F7, zero-crossings-P3-O1, local extrema-P4-O2, mean of absolute-alpha-C4-P4, root mean square-delta-P8-O2, local extrema-F4-C4, Higuchi FD-beta-C4-P4, Hjorth complexity-F3-C3, DFA-T7-P7, line length-alpha-C4-P4, line length-P4-O2, mean of absolute-beta-P8-O2, ptp amplitude-theta-FP1-F3, hurst exponent-theta-F8-T8, line length-theta-C4-P4, energy-delta-FP2-F4, mean-FP1-F7, kurtosis-FP2-F4, energy-PSD-FP2-F8, ptp amplitude-FP1-F7, DFA-P3-O1, hurst exponent-theta-F7-T7, line length-beta-P8-O2, ptp amplitude-P4-O2, Higuchi FD-C4-P4, average power-FP2-F4, root mean square-theta-P8-O2, spectral entropy-T7-P7, line length-C4-P4, Higuchi FD-alpha-FP1-F7, energy-delta-FP1-F3, local extrema-F3-C3, line length-P8-O2, line length-theta-F4-C4, mean of absolute-beta-C4-P4, Higuchi FD-theta-FP2-F4, root mean square-P8-O2, ApEn-P4-O2, SEF-P3-O1, ptp amplitude-delta-FP1-F3
chb16 | 56 | line length-alpha-CZ-PZ, hurst exponent-P4-O2, Hjorth mobility-F7-T7, Higuchi FD-FZ-CZ, mean of absolute-delta-C3-P3, line length-alpha-P4-O2, Hjorth mobility-C3-P3, hurst exponent-alpha-P7-O1, SVDEn-P7-O1, peak frequency-P8-O2, Higuchi FD-CZ-PZ, line length-delta-C3-P3, Higuchi FD-beta-P7-O1, SEF-P8-O2, SVDEn-F4-C4, Higuchi FD-beta-C4-P4, hurst exponent-P3-O1, Higuchi FD-beta-FZ-CZ, ptp amplitude-beta-CZ-PZ, DFA-FP2-F8, line length-beta-P4-O2, DFA-P7-O1, SVDEn-F3-C3, hurst exponent-theta-P3-O1, SVDEn-F7-T7, Higuchi FD-beta-P4-O2, root mean square-beta-C3-P3, SVDEn-P8-O2, Higuchi FD-F8-T8, Higuchi FD-delta-T8-P8, SVDEn-P3-O1, root mean square-beta-FP1-F7, Higuchi FD-beta-CZ-PZ, Hjorth mobility-P7-O1, average power-theta-P8-O2, IWMF-C3-P3, hurst exponent-beta-P7-O1, DFA-F3-C3, Higuchi FD-alpha-FZ-CZ, Higuchi FD-beta-C3-P3, line length-beta-C4-P4, root mean square-theta-C3-P3, IWBW-T8-P8, Hjorth mobility-C4-P4, zero-crossings-F7-T7, SEF-P7-O1, DFA-P8-O2, ptp amplitude-alpha-P8-O2, DFA-F4-C4, Higuchi FD-P4-O2, Higuchi FD-alpha-CZ-PZ, root mean square-delta-C3-P3, Hjorth mobility-F3-C3, hurst exponent-beta-T7-P7, Higuchi FD-P3-O1, SVDEn-C3-P3
chb17 | 6 | line length-alpha-FZ-CZ, energy-alpha-P8-O2, zero-crossings-FP1-F7, Higuchi FD-CZ-PZ, mean of absolute-theta-F7-T7, local extrema-FP1-F7
chb18 | 60 | kurtosis-theta-FP1-F3, line length-alpha-FP2-F4, line length-delta-CZ-PZ, zero-crossings-FZ-CZ, Higuchi FD-theta-P3-O1, skewness-beta-FP2-F4, line length-beta-FP2-F4, mean of absolute-delta-P7-O1, skewness-beta-T7-P7, SVDEn-FP2-F4, line length-delta-P7-O1, Higuchi FD-alpha-P3-O1, line length-alpha-F4-C4, kurtosis-beta-FP2-F4, Higuchi FD-theta-P7-O1, local extrema-P3-O1, line length-beta-FP1-F3, line length-alpha-P7-O1, root mean square-P7-O1, zero-crossings-FP2-F4, line length-theta-FP2-F4, ptp amplitude-alpha-P7-O1, skewness-alpha-FP1-F7, root mean square-beta-F4-C4, skewness-alpha-F7-T7, mean of absolute-theta-CZ-PZ, Higuchi FD-theta-FP2-F4, skewness-alpha-FP2-F4, line length-FP2-F4, mean of absolute-theta-P7-O1, line length-beta-F4-C4, line length-alpha-P3-O1, root mean square-delta-P7-O1, Higuchi FD-alpha-FP2-F4, kurtosis-theta-FP2-F4, line length-alpha-FP1-F3, root mean square-alpha-P7-O1, energy-theta-P7-O1, peak frequency-P3-O1, ApEn-FP2-F4, line length-F4-C4, mean of absolute-alpha-P3-O1, Higuchi FD-alpha-P7-O1, mean of absolute-beta-FP2-F4, zero-crossings-FP2-F8, mean of absolute-alpha-CZ-PZ, IWBW-P7-O1, kurtosis-beta-FP2-F8, mean of absolute-alpha-P7-O1, DFA-FP2-F4, Higuchi FD-alpha-F4-C4, average power-alpha-P7-O1, skewness-theta-FP1-F3, mean of absolute-alpha-FP2-F4, root mean square-alpha-F4-C4, SVDEn-P3-O1, line length-beta-FP2-F8, average power-theta-P7-O1, mean of absolute-theta-P3-O1, mean of absolute-beta-F4-C4
chb19 | 36 | line length-theta-FZ-CZ, skewness-beta-P8-O2, line length-beta-F7-T7, mean of absolute-delta-F8-T8, peak frequency-F3-C3, IWMF-F8-T8, ptp amplitude-theta-FZ-CZ, line length-beta-P8-O2, hurst exponent-beta-F8-T8, hurst exponent-delta-F8-T8, SEF-F4-C4, zero-crossings-F8-T8, mean of absolute-delta-P3-O1, line length-theta-FP2-F4, local extrema-FP2-F8, spectral entropy-F8-T8, Higuchi FD-delta-FP2-F8, skewness-beta-P7-O1, line length-alpha-FZ-CZ, skewness-alpha-P4-O2, mean of absolute-delta-FZ-CZ, SEF-F8-T8, line length-beta-C3-P3, Hjorth mobility-F8-T8, line length-delta-FP2-F4, Hjorth complexity-F8-T8, line length-theta-P4-O2, skewness-alpha-P8-O2, hurst exponent-beta-F4-C4, root mean square-delta-F8-T8, line length-theta-FP1-F3, peak frequency-P4-O2, DFA-F8-T8, line length-theta-FP1-F7, line length-beta-FZ-CZ, mean of absolute-theta-FZ-CZ
chb20 | 26 | line length-beta-P4-O2, line length-theta-P4-O2, hurst exponent-beta-F4-C4, DFA-P3-O1, Hjorth complexity-T8-P8, line length-theta-F7-T7, line length-delta-FP2-F8, line length-beta-P3-O1, Higuchi FD-alpha-F4-C4, Higuchi FD-alpha-P4-O2, mean of absolute-alpha-FP1-F7, zero-crossings-CZ-PZ, Higuchi FD-delta-P8-O2, SEF-F3-C3, hurst exponent-alpha-F3-C3, mean of absolute-alpha-F8-T8, zero-crossings-P4-O2, Hjorth complexity-F4-C4, hurst exponent-beta-C3-P3, line length-beta-C3-P3, Higuchi FD-alpha-FZ-CZ, line length-delta-FP1-F3, spectral entropy-C4-P4, line length-alpha-P3-O1, line length-theta-T8-P8, SEF-C3-P3
chb21 | 17 | line length-theta-FP1-F3, Higuchi FD-beta-C3-P3, SEF-P3-O1, IWMF-F4-C4, energy-alpha-FP2-F4, line length-beta-P3-O1, Higuchi FD-beta-T7-P7, ptp amplitude-alpha-FZ-CZ, zero-crossings-P3-O1, ptp amplitude-alpha-FP1-F3, SVDEn-P8-O2, Higuchi FD-beta-CZ-PZ, line length-beta-F4-C4, line length-beta-CZ-PZ, SampEn-P7-O1, SVDEn-P3-O1, nonlinear energy-FP2-F4
chb22 | 22 | line length-theta-P8-O2, line length-alpha-P4-O2, SVDEn-FZ-CZ, mean of absolute-theta-T7-P7, zero-crossings-CZ-PZ, hurst exponent-beta-T8-P8, Hjorth complexity-P8-O2, IWBW-P4-O2, SampEn-F7-T7, SampEn-P4-O2, skewness-beta-P7-O1, Higuchi FD-delta-F3-C3, Higuchi FD-delta-T7-P7, hurst exponent-delta-F7-T7, line length-beta-P4-O2, hurst exponent-alpha-T7-P7, zero-crossings-FZ-CZ, line length-theta-P7-O1, line length-delta-T8-P8, hurst exponent-beta-T7-P7, spectral entropy-FZ-CZ, hurst exponent-alpha-P8-O2
chb23 | 37 | line length-alpha-P4-O2, mean of absolute-theta-F3-C3, skewness-beta-FP1-F3, Higuchi FD-beta-F4-C4, Higuchi FD-beta-P7-O1, mean of absolute-theta-FP1-F3, Higuchi FD-CZ-PZ, ptp amplitude-P4-O2, peak frequency-C3-P3, Higuchi FD-beta-F3-C3, hurst exponent-theta-F7-T7, Higuchi FD-beta-P3-O1, Hjorth mobility-P4-O2, Higuchi FD-beta-F8-T8, average power-delta-T8-P8, zero-crossings-P3-O1, Higuchi FD-F3-C3, Higuchi FD-P4-O2, SampEn-P3-O1, Higuchi FD-P3-O1, Higuchi FD-P7-O1, line length-theta-FP1-F7, Higuchi FD-beta-CZ-PZ, line length-delta-P4-O2, local extrema-F3-C3, Higuchi FD-T8-P8, line length-beta-P4-O2, Hjorth mobility-P3-O1, energy-delta-F8-T8, mean-T8-P8, zero-crossings-P4-O2, IWBW-CZ-PZ, local extrema-F4-C4, Higuchi FD-F4-C4, Higuchi FD-beta-P4-O2, local extrema-P3-O1, IWBW-FP1-F3
  • Basic statistics:
    Four statistical features are employed: the mean of absolute values, skewness, kurtosis, and peak-to-peak amplitude of the coefficients in every sub-band.
  • Energy related:
    As in the time domain, energy, average power, and root mean square are used to characterize the coefficient amplitudes of every sub-band.
  • Line length:
    Line length can efficiently measure the fractal dimension for each EEG pattern.
  • Randomness related:
    Hurst exponent and Higuchi fractal dimension represent the randomness of each decomposed sub-band.
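
As a rough illustration of how the amplitude-based sub-band features could be assembled, the sketch below takes one list of wavelet coefficients per sub-band and returns values keyed by feature_name-pattern_name. It is only a sketch under stated assumptions: the function name and the input dictionary are hypothetical, the wavelet decomposition itself is assumed to have been done elsewhere, and only a subset of the listed features is shown.

```python
import math

# Sub-band names as they appear in the feature labels of Table A1.
BANDS = ("delta", "theta", "alpha", "beta")


def subband_features(coeffs_by_band):
    """Compute amplitude-based features from each sub-band's coefficients.

    `coeffs_by_band` is a hypothetical input: a dict mapping a band name to
    the list of wavelet coefficients of that sub-band (at least two values).
    """
    features = {}
    for band in BANDS:
        c = coeffs_by_band[band]
        n = len(c)
        energy = sum(v * v for v in c)  # sum of squares
        features[f"mean of absolute-{band}"] = sum(abs(v) for v in c) / n
        features[f"ptp amplitude-{band}"] = max(c) - min(c)
        features[f"energy-{band}"] = energy
        features[f"average power-{band}"] = energy / n
        features[f"root mean square-{band}"] = math.sqrt(energy / n)
        # Normalized line length: mean absolute consecutive difference.
        features[f"line length-{band}"] = (
            sum(abs(c[i] - c[i - 1]) for i in range(1, n)) / (n - 1)
        )
    return features
```

Appending the channel name to each key would give labels of the form used in Table A1, for example `energy-alpha-P4-O2`.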

Appendix B. Optimal Feature-Channel Combinations

Below, we present the feature-channel subsets (as shown in Table A1) used to obtain the optimal models. In the time and frequency domains, feature_name-channel_name denotes extracted items in channel-based feature vectors, e.g., mean-FP1-F7. In the time–frequency domain, we use feature_name-pattern_name-channel_name to represent them, e.g., skewness-delta-FP1-F7.
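
For readers who want to process the labels in Table A1 programmatically, the naming scheme can be parsed from the right, because both channel names and some feature names contain hyphens (e.g., FP1-F7, energy-PSD, zero-crossings). The parser below is a hypothetical helper, not part of the original framework, and it assumes that every channel is a bipolar pair written with exactly one hyphen and that the pattern name is one of delta, theta, alpha, or beta.

```python
# Sub-band (pattern) names used in time-frequency labels.
BANDS = {"delta", "theta", "alpha", "beta"}


def parse_label(label):
    """Split a Table A1 label into (feature_name, pattern_name, channel_name).

    pattern_name is None for time and frequency domain features.
    """
    parts = label.split("-")
    if len(parts) < 3:
        raise ValueError(f"not a feature-channel label: {label!r}")
    # Assumption: the channel is always the last two hyphen-separated tokens.
    channel = "-".join(parts[-2:])
    rest = parts[:-2]
    band = None
    # A trailing band token marks a time-frequency feature.
    if len(rest) > 1 and rest[-1] in BANDS:
        band = rest.pop()
    return "-".join(rest), band, channel
```

For example, `parse_label("skewness-delta-FP1-F7")` separates the feature `skewness`, the sub-band `delta`, and the channel `FP1-F7`, while `parse_label("energy-PSD-P7-O1")` keeps `energy-PSD` intact as the feature name.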

References

  1. Iasemidis, L.; Shiau, D.-S.; Chaovalitwongse, W.; Sackellares, J.; Pardalos, P.; Principe, J.; Carney, P.; Prasad, A.; Veeramani, B.; Tsakalis, K. Adaptive epileptic seizure prediction system. IEEE Trans. Biomed. Eng. 2003, 50, 616–627.
  2. Mormann, F.; Kreuz, T.; Rieke, C.; Andrzejak, R.G.; Kraskov, A.; David, P.; Elger, C.E.; Lehnertz, K. On the predictability of epileptic seizures. Clin. Neurophysiol. 2005, 116, 569–587.
  3. Freestone, D.R.; Karoly, P.J.; Cook, M.J. A forward-looking review of seizure prediction. Curr. Opin. Neurol. 2017, 30, 167–173.
  4. Shoeb, A.; Guttag, J. Application of machine learning to epileptic seizure detection. In Proceedings of the ICML 2010—Proceedings, 27th International Conference on Machine Learning, Haifa, Israel, 21–24 June 2010; pp. 975–982.
  5. Aggarwal, Y.; Das, J.; Mazumder, P.M.; Kumar, R.; Sinha, R.K. Heart rate variability time domain features in automated prediction of diabetes in rat. Phys. Eng. Sci. Med. 2021, 44, 45–52.
  6. Benhassine, N.E.; Boukaache, A.; Boudjehem, D. Classification of mammogram images using the energy probability in frequency domain and most discriminative power coefficients. Int. J. Imaging Syst. Technol. 2020, 30, 45–56.
  7. Welch, P. The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms. IEEE Trans. Audio Electroacoust. 1967, 15, 70–73.
  8. Cai, J.; Zhou, H.; Huang, W.; Wen, B. Ship Detection and Direction Finding Based on Time–Frequency Analysis for Compact HF Radar. IEEE Geosci. Remote Sens. Lett. 2021, 18, 72–76.
  9. Hassani Saadi, H.; Sameni, R.; Zollanvari, A. Interpretive time–frequency analysis of genomic sequences. BMC Bioinform. 2017, 18, 154.
  10. Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.
  11. Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 1–27.
  12. Al-Bakri, A.F.; Villamar, M.F.; Haddix, C.; Bensalem-Owen, M.; Sunderam, S. Noninvasive seizure prediction using autonomic measurements in patients with refractory epilepsy. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 18–22 July 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2422–2425.
  13. Niknazar, H.; Maghooli, K.; Motie Nasrabadi, A. Epileptic Seizure Prediction using Statistical Behavior of Local Extrema and Fuzzy Logic System. Int. J. Comput. Appl. 2015, 113, 24–30.
  14. Khoa, T.Q.D.; Ha, V.Q.; Toi, V.V. Higuchi Fractal Properties of Onset Epilepsy Electroencephalogram. Comput. Math. Methods Med. 2012, 2012, 1–6.
  15. Zhang, Y.; Yang, S.; Liu, Y.; Zhang, Y.; Han, B.; Zhou, F. Integration of 24 Feature Types to Accurately Detect and Predict Seizures Using Scalp EEG Signals. Sensors 2018, 18, 1372.
  16. Minasyan, G.R.; Chatten, J.B.; Chatten, M.J.; Harner, R.N. Patient-Specific Early Seizure Detection From Scalp Electroencephalogram. J. Clin. Neurophysiol. 2010, 27, 163–178.
  17. Namazi, H.; Kulish, V.V.; Hussaini, J.; Hussaini, J.; Delaviz, A.; Delaviz, F.; Habibi, S.; Ramezanpoor, S. A signal processing based analysis and prediction of seizure onset in patients with epilepsy. Oncotarget 2016, 7, 342–350.
  18. Van der Maaten, L.; Hinton, G. Visualizing Data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
  19. Maiwald, T.; Winterhalder, M.; Aschenbrenner-Scheibe, R.; Voss, H.U.; Schulze-Bonhage, A.; Timmer, J. Comparison of three nonlinear seizure prediction methods by means of the seizure prediction characteristic. Phys. D Nonlinear Phenom. 2004, 194, 357–368.
  20. Shahidi Zandi, A.; Tafreshi, R.; Javidan, M.; Dumont, G.A. Predicting Epileptic Seizures in Scalp EEG Based on a Variational Bayesian Gaussian Mixture Model of Zero-Crossing Intervals. IEEE Trans. Biomed. Eng. 2013, 60, 1401–1413.
  21. Chu, H.; Chung, C.K.; Jeong, W.; Cho, K.H. Predicting epileptic seizures from scalp EEG based on attractor state analysis. Comput. Methods Programs Biomed. 2017, 143, 75–87.
  22. Alotaiby, T.N.; Alshebeili, S.A.; Alotaibi, F.M.; Alrshoud, S.R. Epileptic Seizure Prediction Using CSP and LDA for Scalp EEG Signals. Comput. Intell. Neurosci. 2017, 2017, 1–11.
  23. Truong, N.D.; Nguyen, A.D.; Kuhlmann, L.; Bonyadi, M.R.; Yang, J.; Ippolito, S.; Kavehei, O. Convolutional neural networks for seizure prediction using intracranial and scalp electroencephalogram. Neural Netw. 2018, 105, 104–111.
  24. Agboola, H.A.; Solebo, C.; Aribike, D.S.; Lesi, A.E.; Susu, A.A. Seizure Prediction with Adaptive Feature Representation Learning. J. Neurol. Neurosci. 2019, 10, 1–12.
  25. Kaiser, J. On a simple algorithm to calculate the ‘energy’ of a signal. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, NM, USA, 3–6 April 1990; IEEE: Piscataway, NJ, USA, 1990; pp. 381–384.
  26. Esteller, R.; Echauz, J.; Tcheng, T.; Litt, B.; Pless, B. Line length: An efficient feature for seizure onset detection. In Proceedings of the 2001 Conference Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Istanbul, Turkey, 25–28 October 2001; IEEE: Piscataway, NJ, USA, 2001; Volume 2, pp. 1707–1710.
  27. Pincus, S.M. Approximate entropy as a measure of system complexity. Proc. Natl. Acad. Sci. USA 1991, 88, 2297–2301.
  28. Richman, J.S.; Moorman, J.R. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. 2000, 278, H2039–H2049.
  29. Roberts, S.J.; Penny, W.; Rezek, I. Temporal and spatial complexity measures for electroencephalogram based brain-computer interfacing. Med. Biol. Eng. Comput. 1999, 37, 93–98.
  30. Devarajan, K.; Jyostna, E.; Jayasri, K.; Balasampath, V. EEG-Based Epilepsy Detection and Prediction. Int. J. Eng. Technol. 2014, 6, 212–216.
  31. Esteller, R.; Vachtsevanos, G.; Echauz, J.; Litt, B. A comparison of waveform fractal dimension algorithms. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 2001, 48, 177–183.
  32. Hjorth, B. EEG analysis based on time domain properties. Electroencephalogr. Clin. Neurophysiol. 1970, 29, 306–310.
  33. Bryce, R.M.; Sprague, K.B. Revisiting detrended fluctuation analysis. Sci. Rep. 2012, 2, 315.
  34. Mormann, F.; Andrzejak, R.G.; Elger, C.E.; Lehnertz, K. Seizure prediction: The long and winding road. Brain 2007, 130, 314–333.
  35. Inouye, T.; Shinosaki, K.; Sakamoto, H.; Toi, S.; Ukai, S.; Iyama, A.; Katsuda, Y.; Hirano, M. Quantification of EEG irregularity by use of the entropy of the power spectrum. Electroencephalogr. Clin. Neurophysiol. 1991, 79, 204–210.
  36. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  37. Hao, Q.; Gotman, J. A patient-specific algorithm for the detection of seizure onset in long-term EEG monitoring: Possible use as a warning device. IEEE Trans. Biomed. Eng. 1997, 44, 115–122.
Figure 1. The states of an epileptic seizure: inter-ictal, pre-ictal, ictal, and post-ictal. This signal segment comes from channel FP1-F7 of the chb01_03.edf file of case chb01 in the CHB-MIT Scalp EEG database. It contains a 15-min inter-ictal period, a 15-min pre-ictal period, a 40-s ictal period, and a 5-min post-ictal period.
Figure 2. Schematic representation of the proposed framework for seizure prediction.
Figure 3. Wavelet decomposition. X: preprocessed EEG signal. h: high-pass filter. g: low-pass filter. Down arrow: subsampled by 2. CD_i: detail coefficients. CA_i: approximation coefficients.
Figure 4. The mRMR scores and accuracy changes of cases chb01, chb02, chb11, chb19, and chb21. (a) Visualization of mRMR scores. The higher the score, the more valuable the feature-channel is to the patient; (b) accuracy with various top-n feature-channel subsets, where the feature-channels are selected according to the scores in (a).
Figure 5. Number of times each feature appears in the optimal channel-based feature combinations. The top 11 of 62 features, shown as sky-blue bars, are regarded as the summarized features.
Figure 6. Comparison between the classification accuracy obtained using optimal, summarized, and complete feature sets.
Figure 7. Visualization of the optimal feature-channel vectors of various patients. (a–d) illustrate the 10-feature, 26-feature, 37-feature, and 55-feature subsets from chb01, chb20, chb23, and chb14 embedded onto a 2D space, respectively. The blue points correspond to pre-ictal samples, and the others represent inter-ictal samples.
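The filter-bank scheme in the Figure 3 caption (low-pass g and high-pass h, each followed by subsampling by 2, with the approximation split again at the next level) can be sketched in a few lines. This is only an illustration: the Haar filter pair, the function names `haar_step` and `wavedec`, the number of levels, and the toy signal are all assumptions made here, not details taken from the paper.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One filter-bank level: filter, then keep every second output sample."""
    if len(x) % 2:
        x = x[:-1]  # drop a trailing sample so pairs line up (illustrative choice)
    # Low-pass branch g -> approximation coefficients CA
    ca = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    # High-pass branch h -> detail coefficients CD
    cd = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return ca, cd

def wavedec(x, levels):
    """Return ([CD_1, ..., CD_levels], CA_levels) by re-splitting the approximation."""
    details = []
    approx = list(x)
    for _ in range(levels):
        approx, cd = haar_step(approx)
        details.append(cd)
    return details, approx

# Toy signal (hypothetical values, not EEG data)
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
details, approx = wavedec(signal, levels=3)

# An orthonormal filter pair preserves signal energy across the decomposition
energy_in = sum(v * v for v in signal)
energy_out = sum(v * v for cd in details for v in cd) + sum(v * v for v in approx)
```

Each level halves the number of coefficients, so `details` holds 4, 2, and 1 values for this 8-sample input, and `energy_in` matches `energy_out` up to rounding.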
Table 1. Detailed information on the CHB-MIT Scalp EEG database and the data used in this study. Sex: female (F), male (M). Seizure type: simple partial seizure (SP), complex partial seizure (CP), and generalized tonic-clonic seizure (GTC). No. of channels: number of channels included in this database. No. of recordings: number of edf files. No. of seizures (all/used): number of seizures included in this database (all) and number of seizures used in this study (used). No. of samples: number of 5-s signal segments used in this study.
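| Case | Sex | Age | Seizure Type | No. of Seizures (All/Used) | No. of Channels | No. of Recordings | No. of Samples |
|---|---|---|---|---|---|---|---|
| chb01 | F | 11 | SP, CP | 7 | 23 | 42 | 4566 |
| chb02 | M | 11 | SP, CP, GTC | 3 | 23 | 36 | 1538 |
| chb03 | F | 14 | SP, CP | 7 | 23 | 38 | 4218 |
| chb04 | M | 22 | SP, CP, GTC | 4 | 23/24 | 42 | 2872 |
| chb05 | F | 7 | CP, GTC | 5 | 23 | 39 | 3202 |
| chb06 | F | 1.5 | CP, GTC | 10 | 23 | 18 | 6404 |
| chb07 | F | 14.5 | SP, CP, GTC | 3 | 23 | 19 | 2154 |
| chb08 | M | 3.5 | SP, CP, GTC | 5 | 23 | 20 | 3590 |
| chb09 | F | 10 | CP, GTC | 4 | 23 | 19 | 2872 |
| chb10 | M | 3 | SP, CP, GTC | 7 | 23 | 25 | 5026 |
| chb11 | F | 12 | SP, CP, GTC | 3 | 23 | 35 | 1672 |
| chb12 | F | 2 | SP, CP, GTC | 40/11 | 23/25/24 | 24 | 7020 |
| chb13 | F | 3 | SP, CP, GTC | 12/10 | 23/20/18 | 33 | 5968 |
| chb14 | F | 9 | CP, GTC | 8 | 23 | 26 | 5744 |
| chb15 | M | 16 | SP, CP, GTC | 20/17 | 26/32 | 40 | 10,524 |
| chb16 | F | 7 | SP, CP, GTC | 10/9 | 23/18 | 19 | 5702 |
| chb17 | F | 12 | SP, CP, GTC | 3 | 23/18 | 21 | 2154 |
| chb18 | F | 18 | SP, CP | 6/5 | 18/23 | 36 | 3240 |
| chb19 | F | 19 | SP, CP, GTC | 3 | 18/23 | 30 | 1672 |
| chb20 | F | 6 | SP, CP, GTC | 8 | 23 | 29 | 4690 |
| chb21 | F | 13 | SP, CP | 4 | 23 | 33 | 2872 |
| chb22 | F | 9 | - | 3 | 23 | 31 | 2154 |
| chb23 | F | 6 | - | 7 | 23 | 9 | 4566 |
| Total | - | - | - | 185/149 | - | 664 | 94,420 |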
Table 2. The list of the features extracted from time, frequency, and time–frequency domains.
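| Domain | Feature Group | Features |
|---|---|---|
| Time domain | Basic statistics | Mean, skewness, kurtosis, peak-to-peak amplitude, coefficient of variation |
| Time domain | Energy related | Energy, average power, root mean square, nonlinear energy |
| Time domain | - | Line length |
| Time domain | Entropy based | Approximate entropy, sample entropy, singular value decomposition entropy |
| Time domain | Randomly related | Hurst exponent, Higuchi fractal dimension |
| Time domain | Hjorth parameters | Hjorth activity, Hjorth mobility, Hjorth complexity |
| Time domain | - | Detrended fluctuation analysis |
| Time domain | - | Number of zero-crossings |
| Time domain | - | Number of local extrema |
| Frequency domain | - | Energy |
| Frequency domain | Intensity weighted | Mean frequency, bandwidth |
| Frequency domain | - | Spectral edge frequency |
| Frequency domain | - | Spectral entropy |
| Frequency domain | - | Peak frequency |
| Time–frequency domain * | Basic statistics | Mean of absolute value, skewness, kurtosis, peak-to-peak amplitude |
| Time–frequency domain * | Energy related | Energy, average power, root mean square |
| Time–frequency domain * | - | Line length |
| Time–frequency domain * | Randomly related | Hurst exponent, Higuchi fractal dimension |

* All features in this domain are extracted on four frequency sub-bands (δ, θ, α, β).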
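To make a few of the time-domain entries in Table 2 concrete, the sketch below computes line length, the number of zero-crossings, and the three Hjorth parameters for one 5-s segment. It uses common textbook definitions, which may differ in normalization from the ones used in the paper; the helper names, the 256 Hz sampling rate, and the synthetic 10 Hz sine standing in for an EEG channel are assumptions for illustration only.

```python
import math
from statistics import pvariance

def diff(x):
    """First difference of a sequence."""
    return [b - a for a, b in zip(x, x[1:])]

def line_length(x):
    """Sum of absolute sample-to-sample differences (unnormalized variant)."""
    return sum(abs(d) for d in diff(x))

def zero_crossings(x):
    """Count strict sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)

def hjorth(x):
    """Return (activity, mobility, complexity) from signal and derivative variances."""
    dx = diff(x)
    ddx = diff(dx)
    activity = pvariance(x)
    mobility = math.sqrt(pvariance(dx) / activity)
    complexity = math.sqrt(pvariance(ddx) / pvariance(dx)) / mobility
    return activity, mobility, complexity

fs = 256  # assumed sampling rate in Hz
# Synthetic 5-s segment: a 10 Hz sine in place of real EEG
segment = [math.sin(2 * math.pi * 10 * n / fs) for n in range(5 * fs)]

activity, mobility, complexity = hjorth(segment)
features = {
    "line_length": line_length(segment),
    "zero_crossings": zero_crossings(segment),
    "hjorth_activity": activity,
    "hjorth_mobility": mobility,
    "hjorth_complexity": complexity,
}
```

For a pure sinusoid the Hjorth complexity is close to 1, which gives a quick sanity check before applying the functions to real segments.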
Table 3. Optimal number of features and classifier parameters for all cases. n: the number of items in optimal channel-based feature combination. C, gamma: SVM classifier hyperparameters.
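| Case | n | C | Gamma |
|---|---|---|---|
| chb01 | 10 | 10^2 | 2^5 |
| chb02 | 1 | 10^2 | 2^9 |
| chb03 | 19 | 10^4 | 2^11 |
| chb04 | 58 | 10^1 | 2^5 |
| chb05 | 4 | 10^4 | 2^11 |
| chb06 | 28 | 10^2 | 2^11 |
| chb07 | 57 | 10^1 | 2^3 |
| chb08 | 30 | 10^3 | 2^9 |
| chb09 | 1 | 10^1 | 2^3 |
| chb10 | 7 | 10^4 | 2^3 |
| chb11 | 6 | 10^4 | 2^15 |
| chb12 | 31 | 10^1 | 2^5 |
| chb13 | 37 | 10^1 | 2^7 |
| chb14 | 55 | 10^1 | 2^5 |
| chb15 | 58 | 10^1 | 2^5 |
| chb16 | 56 | 10^4 | 2^11 |
| chb17 | 6 | 10^1 | 2^9 |
| chb18 | 60 | 10^2 | 2^5 |
| chb19 | 36 | 10^4 | 2^7 |
| chb20 | 26 | 10^1 | 2^5 |
| chb21 | 17 | 10^4 | 2^9 |
| chb22 | 22 | 10^4 | 2^9 |
| chb23 | 37 | 10^1 | 2^5 |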
Table 4. Performance of models when using the optimal, summarized, and complete feature sets.
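| Case | Optimal: FPR (/h) | Optimal: SEN | Optimal: AUC | Optimal: F1 | Optimal: kappa | Summarized: FPR (/h) | Summarized: SEN | Summarized: AUC | Summarized: F1 | Summarized: kappa | Complete: FPR (/h) | Complete: SEN | Complete: AUC | Complete: F1 | Complete: kappa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| chb01 | 0.024 | 0.986 | 0.999 | 0.982 | 0.963 | 0.006 | 0.875 | 0.996 | 0.926 | 0.870 | 0.103 | 0.961 | 0.998 | 0.939 | 0.858 |
| chb02 | 0.055 | 0.962 | 0.987 | 0.954 | 0.907 | 0.362 | 0.980 | 0.803 | 0.866 | 0.618 | 0.447 | 0.954 | 0.846 | 0.814 | 0.507 |
| chb03 | 0.084 | 0.698 | 0.876 | 0.750 | 0.619 | 0.155 | 0.755 | 0.899 | 0.772 | 0.601 | 0.116 | 0.735 | 0.897 | 0.771 | 0.614 |
| chb04 | 0.024 | 0.981 | 0.997 | 0.978 | 0.956 | 0.052 | 0.936 | 0.974 | 0.942 | 0.884 | 0.054 | 0.936 | 0.954 | 0.941 | 0.882 |
| chb05 | 0.152 | 0.888 | 0.947 | 0.869 | 0.736 | 0.210 | 0.788 | 0.917 | 0.772 | 0.578 | 0.169 | 0.763 | 0.949 | 0.769 | 0.594 |
| chb06 | 0.156 | 0.807 | 0.827 | 0.775 | 0.651 | 0.210 | 0.840 | 0.903 | 0.787 | 0.631 | 0.187 | 0.812 | 0.869 | 0.775 | 0.625 |
| chb07 | 0.060 | 0.973 | 0.988 | 0.957 | 0.913 | 0.082 | 0.920 | 0.957 | 0.919 | 0.838 | 0.085 | 0.916 | 0.947 | 0.915 | 0.830 |
| chb08 | 0.116 | 0.899 | 0.937 | 0.894 | 0.783 | 0.172 | 0.865 | 0.898 | 0.853 | 0.692 | 0.173 | 0.869 | 0.902 | 0.858 | 0.696 |
| chb09 | 0.112 | 0.899 | 0.903 | 0.889 | 0.787 | 0.124 | 0.728 | 0.842 | 0.741 | 0.604 | 0.085 | 0.648 | 0.818 | 0.662 | 0.563 |
| chb10 | 0.064 | 0.920 | 0.989 | 0.926 | 0.856 | 0.093 | 0.861 | 0.971 | 0.861 | 0.768 | 0.092 | 0.854 | 0.982 | 0.857 | 0.762 |
| chb11 | 0.075 | 0.921 | 0.983 | 0.922 | 0.845 | 0.127 | 0.900 | 0.966 | 0.888 | 0.774 | 0.083 | 0.872 | 0.963 | 0.891 | 0.789 |
| chb12 | 0.077 | 0.946 | 0.975 | 0.937 | 0.869 | 0.071 | 0.935 | 0.984 | 0.932 | 0.864 | 0.061 | 0.906 | 0.983 | 0.917 | 0.845 |
| chb13 | 0.208 | 0.855 | 0.893 | 0.840 | 0.647 | 0.162 | 0.783 | 0.828 | 0.779 | 0.621 | 0.172 | 0.752 | 0.815 | 0.758 | 0.580 |
| chb14 | 0.126 | 0.871 | 0.942 | 0.872 | 0.745 | 0.156 | 0.844 | 0.896 | 0.844 | 0.688 | 0.156 | 0.843 | 0.898 | 0.844 | 0.687 |
| chb15 | 0.102 | 0.875 | 0.954 | 0.885 | 0.773 | 0.163 | 0.833 | 0.906 | 0.835 | 0.670 | 0.167 | 0.829 | 0.896 | 0.830 | 0.662 |
| chb16 | 0.160 | 0.813 | 0.905 | 0.810 | 0.653 | 0.176 | 0.785 | 0.866 | 0.790 | 0.610 | 0.161 | 0.812 | 0.890 | 0.811 | 0.651 |
| chb17 | 0.108 | 0.839 | 0.965 | 0.862 | 0.732 | 0.347 | 0.943 | 0.808 | 0.851 | 0.596 | 0.333 | 0.896 | 0.826 | 0.832 | 0.563 |
| chb18 | 0.045 | 0.971 | 0.993 | 0.963 | 0.926 | 0.074 | 0.933 | 0.976 | 0.930 | 0.859 | 0.080 | 0.931 | 0.966 | 0.926 | 0.851 |
| chb19 | 0.187 | 0.933 | 0.920 | 0.886 | 0.746 | 0.346 | 0.535 | 0.563 | 0.533 | 0.189 | 0.168 | 0.717 | 0.865 | 0.765 | 0.549 |
| chb20 | 0.002 | 0.998 | 1.000 | 0.998 | 0.996 | 0.001 | 0.997 | 1.000 | 0.998 | 0.996 | 0.004 | 0.994 | 0.999 | 0.995 | 0.990 |
| chb21 | 0.151 | 0.786 | 0.886 | 0.810 | 0.635 | 0.289 | 0.866 | 0.826 | 0.816 | 0.577 | 0.265 | 0.847 | 0.850 | 0.809 | 0.582 |
| chb22 | 0.069 | 0.955 | 0.964 | 0.945 | 0.887 | 0.064 | 0.661 | 0.937 | 0.710 | 0.597 | 0.063 | 0.639 | 0.938 | 0.686 | 0.576 |
| chb23 | 0.041 | 0.979 | 0.994 | 0.971 | 0.939 | 0.114 | 0.987 | 0.990 | 0.950 | 0.872 | 0.126 | 0.979 | 0.987 | 0.942 | 0.852 |
| Total | 0.096 | 0.902 | 0.949 | 0.899 | 0.807 | 0.155 | 0.850 | 0.900 | 0.839 | 0.696 | 0.146 | 0.846 | 0.915 | 0.839 | 0.700 |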
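Three of the columns in Table 4 (SEN, F1, and kappa) follow directly from a binary confusion matrix, with pre-ictal treated as the positive class. The sketch below uses the standard definitions of sensitivity, F1 score, and Cohen's kappa; the function name and the example counts are hypothetical and are not results from the study. FPR (/h) and AUC are left out because they need, respectively, the rule for counting false predictions per hour and the classifier's continuous scores, neither of which is specified in this section.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Sensitivity, F1, and Cohen's kappa from binary confusion counts.

    tp/fn: pre-ictal samples predicted as pre-ictal / inter-ictal
    fp/tn: inter-ictal samples predicted as pre-ictal / inter-ictal
    """
    n = tp + fn + fp + tn
    sen = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sen / (precision + sen)
    # Observed agreement versus agreement expected by chance
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return {"SEN": sen, "F1": f1, "kappa": kappa}

# Hypothetical counts for illustration only
example = confusion_metrics(tp=90, fn=10, fp=5, tn=95)
```

With these counts the sensitivity is 0.9, the F1 score is 180/195 (about 0.923), and kappa is 0.85.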
Table 5. Evaluation results of extended pre-ictal data using the optimal models. The extended data from 15 to 30 min before seizure onset are divided into three datasets with the same number of seizures: the −30 to −25 min dataset, the −25 to −20 min dataset, and the −20 to −15 min dataset. # of samples: number of 5-s segments used for prediction. SEN: the percentage of the samples that are correctly predicted as pre-ictal state. Cost: average time to predict a sample. AVG SEN: average sensitivity of these three datasets for each case.
| Case | # of Seizures | −30 to −25 min: # of Samples | −30 to −25 min: SEN | −30 to −25 min: Cost (sec) | −25 to −20 min: # of Samples | −25 to −20 min: SEN | −25 to −20 min: Cost (sec) | −20 to −15 min: # of Samples | −20 to −15 min: SEN | −20 to −15 min: Cost (sec) | AVG SEN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| chb01 | 4 | 416 | 0.827 | 8.48 × 10^-2 | 476 | 0.887 | 9.06 × 10^-2 | 476 | 0.964 | 8.02 × 10^-2 | 0.893 |
| chb02 | 2 | 238 | 0.996 | 7.64 × 10^-4 | 238 | 0.954 | 7.50 × 10^-4 | 238 | 0.975 | 7.87 × 10^-4 | 0.975 |
| chb03 | 5 | 565 | 0.979 | 7.50 × 10^-2 | 595 | 0.887 | 8.94 × 10^-2 | 595 | 0.911 | 6.91 × 10^-2 | 0.926 |
| chb04 | 4 | 427 | 0.970 | 2.69 × 10^-1 | 476 | 0.950 | 2.71 × 10^-1 | 476 | 0.968 | 2.70 × 10^-1 | 0.963 |
| chb05 | 3 | 357 | 0.706 | 1.84 × 10^-2 | 357 | 0.812 | 2.19 × 10^-2 | 357 | 0.868 | 1.86 × 10^-2 | 0.796 |
| chb06 | 8 | 921 | 0.722 | 6.09 × 10^-2 | 952 | 0.808 | 6.40 × 10^-2 | 952 | 0.834 | 5.81 × 10^-2 | 0.788 |
| chb07 | 3 | 357 | 0.922 | 2.74 × 10^-1 | 357 | 0.835 | 2.72 × 10^-1 | 357 | 0.846 | 2.76 × 10^-1 | 0.867 |
| chb08 | 5 | 595 | 0.839 | 2.71 × 10^-1 | 595 | 0.830 | 2.72 × 10^-1 | 595 | 0.820 | 2.70 × 10^-1 | 0.830 |
| chb09 | 4 | 476 | 0.855 | 6.53 × 10^-4 | 476 | 0.815 | 6.53 × 10^-4 | 476 | 0.813 | 6.66 × 10^-4 | 0.828 |
| chb10 | 6 | 714 | 0.542 | 2.92 × 10^-2 | 714 | 0.573 | 2.99 × 10^-2 | 714 | 0.731 | 2.95 × 10^-2 | 0.615 |
| chb11 | 1 | 119 | 0.916 | 3.64 × 10^-2 | 119 | 0.891 | 3.71 × 10^-2 | 119 | 0.773 | 3.53 × 10^-2 | 0.860 |
| chb12 | 5 | 395 | 0.820 | 6.29 × 10^-2 | 595 | 0.805 | 6.28 × 10^-2 | 595 | 0.825 | 6.19 × 10^-2 | 0.817 |
| chb13 | 4 | 476 | 0.908 | 2.73 × 10^-1 | 476 | 0.975 | 2.73 × 10^-1 | 476 | 0.977 | 2.73 × 10^-1 | 0.953 |
| chb14 | 5 | 595 | 0.677 | 2.70 × 10^-1 | 595 | 0.655 | 2.70 × 10^-1 | 595 | 0.608 | 2.69 × 10^-1 | 0.647 |
| chb15 | 4 | 372 | 0.704 | 2.71 × 10^-1 | 476 | 0.681 | 2.69 × 10^-1 | 476 | 0.708 | 2.69 × 10^-1 | 0.698 |
| chb16 | 2 | 238 | 0.777 | 2.73 × 10^-1 | 238 | 0.811 | 2.72 × 10^-1 | 238 | 0.777 | 2.73 × 10^-1 | 0.789 |
| chb17 | 3 | 357 | 0.675 | 1.28 × 10^-3 | 357 | 0.734 | 1.22 × 10^-3 | 357 | 0.790 | 1.29 × 10^-3 | 0.733 |
| chb18 | 4 | 476 | 0.790 | 2.71 × 10^-1 | 476 | 0.651 | 2.67 × 10^-1 | 476 | 0.750 | 2.70 × 10^-1 | 0.730 |
| chb19 | 2 | 238 | 0.634 | 3.20 × 10^-2 | 238 | 0.655 | 3.10 × 10^-2 | 238 | 0.840 | 3.16 × 10^-2 | 0.710 |
| chb20 | 2 | 238 | 1.000 | 2.69 × 10^-1 | 238 | 1.000 | 2.69 × 10^-1 | 238 | 1.000 | 2.70 × 10^-1 | 1.000 |
| chb21 | 3 | 357 | 0.549 | 2.67 × 10^-1 | 357 | 0.437 | 2.69 × 10^-1 | 357 | 0.423 | 2.70 × 10^-1 | 0.470 |
| chb22 | 2 | 238 | 0.832 | 2.71 × 10^-1 | 238 | 0.714 | 2.71 × 10^-1 | 238 | 0.748 | 2.74 × 10^-1 | 0.765 |
| chb23 | 5 | 498 | 0.994 | 6.53 × 10^-2 | 595 | 0.971 | 6.11 × 10^-2 | 595 | 0.983 | 5.82 × 10^-2 | 0.983 |
| Total | 86 | 9663 | 0.810 | 1.50 × 10^-1 | 10234 | 0.797 | 1.51 × 10^-1 | 10234 | 0.823 | 1.49 × 10^-1 | 0.810 |
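The AVG SEN column in Table 5 appears to be the unweighted mean of the three per-dataset sensitivities. That reading is an assumption rather than something stated in the caption, and it can be checked against a few rows copied from the table, allowing a small tolerance because the tabulated values are rounded to three decimals. The names `rows`, `mean_sen`, and `deviations` are introduced here for illustration.

```python
# (SEN at -30..-25 min, -25..-20 min, -20..-15 min), reported AVG SEN
rows = {
    "chb01": ((0.827, 0.887, 0.964), 0.893),
    "chb02": ((0.996, 0.954, 0.975), 0.975),
    "chb21": ((0.549, 0.437, 0.423), 0.470),
    "Total": ((0.810, 0.797, 0.823), 0.810),
}

def mean_sen(values):
    """Unweighted mean of the per-dataset sensitivities."""
    return sum(values) / len(values)

# Absolute gap between the recomputed mean and the reported AVG SEN
deviations = {
    case: abs(mean_sen(sens) - reported)
    for case, (sens, reported) in rows.items()
}
```

For these rows every deviation stays below 0.001, which is consistent with rounding.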
Table 6. Comparison with prior works.
| Study | # of Used Cases | Pre-Ictal Window (Minutes) | Features | Classifier | SEN | FPR (/h) |
|---|---|---|---|---|---|---|
| Zandi et al., 2013 [20] | 3 | 40 | Positive zero-crossing intervals | Bayesian Gaussian mixture model | 88.34 | 0.155 |
| Chu et al., 2017 [21] | 13 | 86 | Spectral measure | Warning threshold | 86.67 | 0.367 |
| Alotaiby et al., 2017 [22] | 24 | 120 | CSP | LDA | 89 | 0.39 |
| Truong et al., 2018 [23] | 13 | 5 | STFT | CNN | 81.2 | 0.16 |
| A. Agboola et al., 2019 [24] | 17 | 60 | Normalized Logarithmic Wavelet Packet Coefficient Energy Ratios | SVM | 87.26 | 0.08 |
| The proposed framework | 23 | 15 | Time, frequency, time–frequency domain features | SVM | 90.2 | 0.096 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Ma, D.; Zheng, J.; Peng, L. Performance Evaluation of Epileptic Seizure Prediction Using Time, Frequency, and Time–Frequency Domain Measures. Processes 2021, 9, 682. https://doi.org/10.3390/pr9040682