A New Signal Processing Approach for Discrimination of EEG Recordings

Abstract: Classifying brain activities based on electroencephalogram (EEG) signals is one of the important applications of time series discriminant analysis for diagnosing brain disorders. In this paper, we introduce a new method based on the Singular Spectrum Analysis (SSA) technique for classifying brain activity based on EEG signals and apply it to a benchmark dataset for epileptic study with five categories, each consisting of 100 EEG recordings. The results from the SSA-based approach are compared with those from the discrete wavelet transform before proposing a hybrid SSA and principal component analysis based approach for improving accuracy levels further.


Introduction
Discriminating electroencephalogram (EEG) signals is of great interest in both biology and statistics [1][2][3][4]. Identifying a discrimination (classification) rule to classify different sets of EEG recordings can provide a tool to classify brain activities as well as diagnose brain diseases. In statistics, the problem is related to time series discriminant analysis and has been tackled with different approaches [5]. The basic idea in time series discrimination is to use a similarity measure and build a classification rule. Optimality approaches use difference or divergence measures to maximize the difference between the probability distributions or spectral densities corresponding to groups of time series. These approaches rely on certain assumptions (model formulation and/or distribution) to construct a discrimination function. For instance, [6][7][8] used differences between spectral densities to discriminate stationary time series. While applying these approaches is straightforward, they do not provide good accuracy for time series with time-varying dynamics [5]. Shumway [9] and Sakiyama and Taniguchi [10] used time-varying spectra and the Kullback-Leibler divergence measure to discriminate locally stationary time series (time series with a slowly time-varying structure are called locally stationary because small parts of the series show a stationary structure; see [11] and [12] for more details). Huang et al. [13] used the smooth localized complex exponential model to discriminate non-stationary time series. These techniques have strong theoretical foundations and can be applied to non-stationary time series. However, they are generally too complicated and have their own restrictions [14]. For instance, methods based on time-varying spectra require an appropriate choice of bandwidth and window length, or some parametric model for the spectral densities [5].
Another class of techniques extracts some features from each time series so that discrimination can be carried out by classifying the features instead of the time series. Feature extraction techniques are mostly heuristic; that is, in each problem the researcher finds the quantities which best reveal the difference between time series groups. Maharaj and Alonso [15], Fryzlewicz and Ombao [16], Maharaj and Alonso [17] and Yeganegi et al. [14] used different features based on the wavelet transform to discriminate time series models. The wavelet transform provides a simple and effective solution to many time series discrimination problems. However, the choice of base function (mother wavelet) could affect the resulting features and consequently the discrimination error rate.
Classifying brain activities based on EEG signals is one of the important applications of time series discriminant analysis for diagnosing brain disorders such as epilepsy. Andrzejak et al. [1] introduced a benchmark dataset for epileptic study. The dataset contains five categories of EEG recordings. Each category consists of 100 EEG recordings from participants. Each EEG signal was recorded for 23.6 s (4096 observations). Category A contains EEG recordings from healthy subjects while their eyes are open. The second category, category B, contains EEG recordings from healthy subjects with their eyes closed. Categories C and D represent the EEG recordings from epileptic patients. Category D contains EEG signals recorded from the seizure generating area and category C contains signals recorded from the opposite hemisphere of the brain. The last category, category E, consists of EEG signal recordings during seizure activity [1].
Figure 1 shows a typical EEG signal from each category. There are different approaches in the literature for discriminating between these five classes. Nigam and Graupe [18] used a large memory storage and retrieval (LAMSTAR) neural network for classification between categories A and E. Kannathal et al. [19] tackled the same problem using four entropy measures and an adaptive neuro-fuzzy system. Guler et al. [20] used Lyapunov exponents and a recurrent neural network to classify category A, D and E patterns. Maharaj and Alonso [15] used the maximum overlap discrete wavelet transform to discriminate between the five categories. Their use of wavelets for feature extraction has become the primary method underlying the majority of subsequent research. Ubeyli [21] used features extracted from a discrete wavelet transform and a mixture of experts classifier to discriminate category A, D and E patterns. Ghosh-Dastidar et al. [22] extracted EEG features using the mixed-band wavelet-chaos methodology, and applied principal component analysis (PCA) and a cosine radial basis function neural network to classify categories A, C and D. Orhan et al. [23] used features extracted from discrete wavelet transform sub-bands (six sub-bands) and applied k-means clustering and a multilayer perceptron neural network to classify different patterns from category E. Wang et al. [24] used the same approach, although they applied a preprocessing step for selecting the best base in the wavelet transform. Guo et al. [25] used a genetic programming scheme to improve the feature database and applied a k-nearest neighbor analysis to discriminate between categories A, D and E. This research also extracted features from a discrete wavelet transform. Bajaj et al. [26] used time-frequency images of EEG signals and a variety of classifiers to discriminate between category E and the other categories' patterns. All previous research, except that of [15], focuses on discrimination between two or three categories at a time (mostly discrimination between epileptic and healthy categories). This research follows the work of [15] and focuses on the five-category problem. The discrimination results (misclassification and accuracy) of [15] are given in Table 1. The majority of previous research studies extracted the signal's features from the wavelet transform or dynamic Fourier transform of the signal. The theory behind extracting features from wavelet and Fourier transforms comes from the fact that different brain activities produce signals with different frequencies over time.
In this paper, we introduce a new method based on the singular spectrum analysis (SSA) technique for classifying brain activity based on EEG signals. The proposed method is applied to the Andrzejak et al. dataset [1] to discriminate between the five categories. The SSA technique is a nonparametric filtering technique [27][28][29] with varying applications in forecasting [30,31], signal extraction [32,33] and denoising [34,35]. Given its nonparametric nature, the SSA technique is not restricted by the parametric assumptions of normality, linearity or stationarity [36], which enables the modeling of data without the need for transformations. At the outset, we find it pertinent to point out some additional features of SSA in relation to classical time series analysis techniques and the second machine age. Firstly, the SSA technique has some similarities with the popular autoregressive integrated moving average (ARIMA) technique, as explained in detail in [37] and [38]. Secondly, the emergence of big data has transformed machine learning and deep learning into increasingly popular data analysis techniques. However, as explained in [39] and [40], the declining signal-to-noise ratio in big data can make it difficult for machine and deep learning techniques to capture the underlying signal. The filtering capabilities within the SSA technique can be extremely useful under such scenarios to enable machine learning and deep learning techniques to fit better on big data.
The remainder of the paper is organized as follows: Section 2 introduces the SSA technique; Section 3 presents the features extracted based on SSA; the SSA-based discrimination method is introduced in Section 4; Section 5 reports the results; and the paper ends with some conclusions in Section 6.

Singular Spectrum Analysis
The SSA technique has its roots in the work by [27,28]. In what follows, we present a concise summary of the SSA process, and in doing so we mainly follow [36] and [29].
The entire SSA process is dependent upon the selection of the window length $L$ and the number of eigenvalues $r$. A concise discussion around the selection of $L$ is presented in the following section. Figure 2 presents a graphical representation of the SSA process, which includes two stages known as decomposition and reconstruction, each with two steps of their own. Consider a one-dimensional time series $Y_N = (y_1, \ldots, y_N)$ of length $N$. In the decomposition stage, $L$, an integer such that $2 \le L \le N - 1$, is utilized within the embedding step to transfer the one-dimensional time series $Y_N$ into a multidimensional series $X_1, \ldots, X_K$ with vectors
$$X_i = (y_i, y_{i+1}, y_{i+2}, \ldots, y_{i+L-1})^T,$$
where $i = 1, 2, \ldots, K$, $T$ denotes transposition and $K = N - L + 1$. The output from this embedding step is the trajectory matrix $X = [X_1, \ldots, X_K]$, which is a Hankel matrix.
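The embedding step can be illustrated with a minimal NumPy sketch. The function name `trajectory_matrix` and the toy series are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def trajectory_matrix(y, L):
    """Embed a 1-D series of length N into an L x K Hankel matrix, K = N - L + 1."""
    y = np.asarray(y, dtype=float)
    N = y.size
    if not 2 <= L <= N - 1:
        raise ValueError("window length must satisfy 2 <= L <= N - 1")
    K = N - L + 1
    # Column i (0-based) holds the lagged vector (y_i, ..., y_{i+L-1}).
    return np.column_stack([y[i:i + L] for i in range(K)])

y = np.arange(1.0, 7.0)          # toy series 1, 2, ..., 6 (N = 6)
X = trajectory_matrix(y, L=3)    # 3 x 4 trajectory matrix
print(X)
```

Every anti-diagonal of the result is constant, which is the Hankel property referred to above.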
Stats 2018, 2, x FOR PEER REVIEW
The singular value decomposition (SVD) step is concerned with the extraction of singular values from the trajectory matrix $X$. Interestingly, these singular values (the square roots of the eigenvalues of $XX^T$) capture all information contained within $Y_N$. Taking $XX^T$ provides us with eigenvalues $\lambda_1 \ge \ldots \ge \lambda_L \ge 0$ in decreasing order of magnitude. Let $d$ denote the number of positive eigenvalues, $U_1, \ldots, U_d$ the corresponding orthonormal eigenvectors of $XX^T$, and $V_i = X^T U_i / \sqrt{\lambda_i}$. Then, the SVD of $X$ can be written as
$$X = X_1 + \cdots + X_d,$$
where $X_i = \sqrt{\lambda_i}\, U_i V_i^T$ are the elementary matrices. Thereafter, the grouping step consists of splitting the elementary matrices into several groups and summing the matrices within each group. Let $I = \{i_1, \ldots, i_p\}$ denote a group of indices $i_1, \ldots, i_p$. Then the matrix $X_I$ corresponding to the group $I$ can be defined as
$$X_I = X_{i_1} + \cdots + X_{i_p}.$$
Finally, in the diagonal averaging step, we transform each matrix $X_I$ into a time series, which is an additive component of the initial series $Y_N$. For example, suppose $z_{ij}$ stands for an element of a matrix $Z$; then the $k$-th term of the resulting series (with the terms indexed $k = 0, \ldots, N - 1$) is obtained by averaging $z_{ij}$ over all $i, j$ such that $i + j = k + 2$. This procedure is also known as Hankelization of the matrix $Z$. The output of the Hankelization of matrix $Z$ is the Hankel matrix $HZ$, which is the trajectory matrix corresponding to the series obtained as a result of diagonal averaging.
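A minimal NumPy sketch of the SVD, grouping and diagonal averaging steps is given below. It is an illustration under stated assumptions rather than the authors' implementation: the function name `ssa_decompose`, the synthetic sine-plus-noise series and the choice of group are all hypothetical.

```python
import numpy as np

def ssa_decompose(y, L):
    """Return the eigenvalues of X X^T and the elementary reconstructed series."""
    y = np.asarray(y, dtype=float)
    K = y.size - L + 1
    X = np.column_stack([y[i:i + L] for i in range(K)])    # trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)        # s = sqrt(eigenvalues)
    components = []
    for i in range(s.size):
        Xi = s[i] * np.outer(U[:, i], Vt[i])                # elementary matrix X_i
        # Diagonal averaging: mean of each anti-diagonal of X_i.
        comp = [Xi[::-1].diagonal(k).mean() for k in range(-(L - 1), K)]
        components.append(comp)
    return s ** 2, np.array(components)

rng = np.random.default_rng(0)
t = np.arange(200)
y = np.sin(2 * np.pi * t / 20) + 0.1 * rng.normal(size=t.size)  # synthetic series

eig, comps = ssa_decompose(y, L=20)
signal = comps[:2].sum(axis=0)           # grouping: first two eigentriples
share = eig[:2].sum() / eig.sum()        # eigenvalue share of that group
print(round(float(share), 3))
```

Because diagonal averaging is linear and the trajectory matrix is already Hankel, summing all elementary series returns the original series; grouping simply selects which of them to sum.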
As the gist of this research is focused on the decomposition and signal extraction capabilities of SSA, as opposed to its forecasting capabilities, we do not go into a discussion of the forecasting phases. Those interested are referred to [36].

Features Extracted Based on SSA
As is usual in feature extraction, one may use a filter to decompose the original signal into its components. Then, characteristics of the components which show the most difference between categories are extracted as statistical features. For instance, let us consider the SSA-based decomposition of the typical EEG recordings shown in Figure 1. Here, we use SSA with L = 20 to decompose each recording. Figures 3 and 4 show the first five components in each case. According to Figures 3 and 4, most differences between the five recordings are related to the amplitude and dispersion of their components. This behavior motivates the use of eigenvalues, variances, and minimum and maximum values of components as features to discriminate between categories.
Accordingly, we consider box plots for the logarithm of the above-mentioned values in Figure 5. It can be seen that, from left to right (from category A to category E), the logarithms of the eigenvalues and variances grow relatively larger. This shows a clear difference between the logarithms of the eigenvalues and variances corresponding to the categories.
It should be noted that these results are based on SSA with L = 20. Although our results can show the difference between these signals, the window length should be chosen carefully. On one hand, a larger window length gives smoother components [36], which may remove the difference between variances. On the other hand, a smaller window length would not lead to a sound separation of noise from signal. There are several approaches for choosing the best L. For example, in [41] and [42], the authors suggested that L be chosen equal to the correlation length at which the sample autocorrelation function crosses the 95% confidence interval. The authors of [43] consider the selection of L based on the separability between signal and noise, whilst [44] introduced the minimization of a loss function as a means of determining the best L for forecasting purposes. More recently, [45] further investigated the signal extraction problem and constructed a minimum description length criterion that can be employed to identify the dimension (rank) of the signal component. In contrast to the popular binary approach for eigenvalue selection in SSA, [46] introduced a new approach based on Colonial Theory, which appreciates that for more complex time series, a binary approach to reconstruction would not suffice. Ghodsi et al. [47] conducted an extensive study of the characteristics underlying 100 real data sets and their influence on the selection of SSA choices when decomposition and reconstruction are based on the [44] algorithm. They found the distribution of data, stationarity, frequencies and series length to be factors enabling differentiation between the best SSA forecasting approaches.
Given the importance of the window length to the overall SSA process, we conduct a robustness analysis aimed at determining the effect of window length on eigenvalues. For this purpose, the average eigenvalues are calculated for different window lengths. Figure 6 shows the logarithm of the average eigenvalues corresponding to EEG patterns A and E for different window lengths. According to Figure 6, for all window lengths, the first two components show most differences between the two patterns. Furthermore, as seen here, a larger window length does not lead to a major difference between the first two components. As such, even when L = 3, SSA removes the noise from the first two components and reveals the differences between categories.
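The kind of window-length check described here can be sketched in NumPy as follows. This is only an illustration: the two synthetic series are hypothetical stand-ins for a low-amplitude and a high-amplitude category, and a single series per category replaces the averaging over recordings used in the paper.

```python
import numpy as np

def log_eigenvalues(y, L):
    """Logarithm of the eigenvalues of X X^T for window length L."""
    y = np.asarray(y, dtype=float)
    K = y.size - L + 1
    X = np.column_stack([y[i:i + L] for i in range(K)])
    s = np.linalg.svd(X, compute_uv=False)   # singular values, decreasing
    return np.log(s ** 2)

rng = np.random.default_rng(1)
t = np.arange(512)
# Hypothetical stand-ins for two categories: same shape, different amplitude.
low = 1.0 * np.sin(2 * np.pi * t / 32) + rng.normal(scale=0.5, size=t.size)
high = 10.0 * np.sin(2 * np.pi * t / 32) + rng.normal(scale=0.5, size=t.size)

for L in (3, 5, 10, 20):
    gap = log_eigenvalues(high, L)[:2] - log_eigenvalues(low, L)[:2]
    print(L, np.round(gap, 2))   # gap in the two leading log-eigenvalues
```

In this toy setting the gap in the two leading log-eigenvalues stays positive for every window length tried, which mirrors the qualitative argument for a small L.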
Given the results discussed above, the following features are extracted based on SSA with L = 3 to discriminate between the five categories: the logarithm of the first two eigenvalues, the logarithm of the first two variances, the logarithm of the maximum value of the first two components and the logarithm of the absolute minimum value of the first two components.
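A minimal NumPy sketch of this feature extraction follows. It is not the authors' R code. The function name `ssa_features` and the synthetic test series are hypothetical, and one reading of the feature list is assumed: the maximum and absolute minimum are taken jointly over the two components, which yields six values per recording.

```python
import numpy as np

def ssa_features(y, L=3, r=2):
    """Six log-features from the first r SSA components (assumed reading)."""
    y = np.asarray(y, dtype=float)
    K = y.size - L + 1
    X = np.column_stack([y[i:i + L] for i in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    comps = []
    for i in range(r):
        Xi = s[i] * np.outer(U[:, i], Vt[i])
        # Diagonal averaging of the i-th elementary matrix.
        comps.append([Xi[::-1].diagonal(k).mean() for k in range(-(L - 1), K)])
    comps = np.array(comps)
    eig = s[:r] ** 2                      # first r eigenvalues of X X^T
    var = comps.var(axis=1, ddof=1)       # variance of each component
    # Assumption: max and |min| are taken over both components together.
    raw = np.concatenate([eig, var, [comps.max()], [abs(comps.min())]])
    return np.log(raw)

rng = np.random.default_rng(2)
t = np.arange(4096)
y = np.sin(2 * np.pi * t / 50) + 0.3 * rng.normal(size=t.size)  # synthetic series

f1 = ssa_features(y)
f10 = ssa_features(10 * y)
print(np.round(f10 - f1, 3))
```

Scaling the series by 10 shifts the eigenvalue and variance features by log(100) and the extreme-value features by log(10), which shows how these features encode amplitude and dispersion.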
It is noteworthy that all programming and computations were carried out in R [48]. The average time for each EEG feature extraction was 0.035 s, whilst the average time for training the kernel classifier to estimate the category of one EEG recording was 21.74 s.

Discrimination between EEG Recordings Based on SSA
The EEG recording discrimination follows these steps:
(1) Use SSA with window length L = 3 and r = 2 (without integration) to decompose each EEG recording into two components.
(2) Extract the six features as follows: the logarithm of the variance of each of the two components, the logarithm of the eigenvalues corresponding to the two components, the logarithm of the maximum value of the two components and the logarithm of the absolute minimum value of the two components (the SSA analysis and original feature extraction can be implemented using the package "Rssa" in R [48]). The features extracted from the five categories of EEG recordings are shown in Figure 7.
(3) Use PCA and calculate the six principal components of the features extracted in step 2 (these are the features we use for discrimination).
(4) Estimate the six-variate probability density functions for each category (say $f_1(x), \ldots, f_5(x)$, where $x = [x_1, \ldots, x_6]'$) using the kernel method [49] (since some of the features are not well separated, a nonparametric (kernel) classifier is used to classify the features):
$$\hat f_i(x) = c \sum_{j=1}^{n_i} K(u_{ij}),$$
where $K$ is the kernel function, $H$ is the bandwidth matrix, $c$ is a normalizing constant, $u_{ij} = H^{-1}(x - x_{ij})$, $x_{ij}$ is the feature vector corresponding to the $j$th observation in the $i$th category and $n_i$ is the number of observations in that category.
In this research, the estimation is based on the Epanechnikov kernel function, since it gave the best results on this dataset, and the bandwidth is selected using the plug-in bandwidth selection method [50] (the "ks" package in R is used for bandwidth selection and kernel density estimation).
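The kernel density step and the allocation of a new feature vector to the category with the largest estimated density can be sketched as follows. This is a simplified illustration, not the authors' procedure: it uses a diagonal bandwidth matrix from a rule of thumb (with a hypothetical `scale` factor) instead of the plug-in selector of the "ks" package, and two-dimensional synthetic clusters instead of the six EEG features.

```python
import numpy as np

def epanechnikov_density(x, train, h):
    """Kernel density estimate at x, up to a constant shared by all classes."""
    u = (x - train) / h                        # u_j = H^{-1}(x - x_j), H = diag(h)
    k = np.clip(1.0 - np.sum(u ** 2, axis=1), 0.0, None)   # Epanechnikov kernel
    return k.sum() / (train.shape[0] * np.prod(h))

def classify(x, classes, scale=2.0):
    """Allocate x to the class with the largest estimated density."""
    scores = []
    for train in classes:
        n, d = train.shape
        # Assumed rule-of-thumb bandwidth; the paper uses a plug-in selector.
        h = scale * train.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))
        scores.append(epanechnikov_density(x, train, h))
    return int(np.argmax(scores))

rng = np.random.default_rng(3)
class_a = rng.normal(0.0, 1.0, size=(100, 2))   # synthetic category 0
class_b = rng.normal(5.0, 1.0, size=(100, 2))   # synthetic category 1

print(classify(np.array([0.0, 0.0]), [class_a, class_b]))
print(classify(np.array([5.0, 5.0]), [class_a, class_b]))
```

Because the Epanechnikov kernel has compact support, a point far from every training observation gets zero density in all classes; a practical implementation would need a tie-breaking rule for that case.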

Results
The results of discrimination (number of misclassifications) between patterns A, B, C, D and E (before applying PCA) are given in Table 2. According to these results, using features extracted from SSA and the kernel classifier, there are only 3 misclassifications, which makes the total error rate 0.6% (99.4% accuracy).
To further improve the performance of the proposed method, one may use principal components analysis (PCA) to reduce the dependence between features. Figures 8 and 9 show the first and last three principal components of the features. It is evident that most categories are well separated. Using all six components of PCA gives 100% accuracy. Applying PCA will increase the computational cost of recognition by $O(n^2)$, where $n$ is the number of features (when estimating the category of a given observation, the original vector of features is multiplied by an $n \times n$ coefficient matrix). Since the proposed procedure involves only six features, the additional computational cost of recognition due to applying PCA is relatively low. On the other hand, before applying PCA, the procedure classifies some of the recordings from the epileptogenic area of epileptic subjects as healthy EEG recordings.
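The PCA step and its per-observation cost can be illustrated with a short NumPy sketch. The helper names `fit_pca` and `transform` and the six correlated synthetic features are assumptions made for this example.

```python
import numpy as np

def fit_pca(F):
    """Return the feature mean and the n x n matrix of principal directions."""
    mean = F.mean(axis=0)
    cov = np.cov(F - mean, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]           # reorder to descending variance
    return mean, eigvec[:, order]

def transform(f, mean, W):
    """Project feature vector(s) onto all principal components."""
    # One n-vector times an n x n matrix: O(n^2) operations per observation.
    return (f - mean) @ W

rng = np.random.default_rng(4)
z = rng.normal(size=(200, 1))
# Six strongly correlated synthetic features sharing the common factor z.
F = np.hstack([z + 0.1 * rng.normal(size=(200, 1)) for _ in range(6)])

mean, W = fit_pca(F)
scores = transform(F, mean, W)
print(np.round(np.cov(scores, rowvar=False), 3))
```

The covariance matrix of the scores is diagonal, so the principal components are uncorrelated even though the original features are not; with n = 6 the extra matrix multiplication is negligible.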
The accuracy of the presented procedure in discriminating observations outside the training dataset is investigated using the leave-one-out method. Table 3 shows the leave-one-out accuracy and misclassification results for discriminating between the five categories using six components (from PCA) and the kernel classifier. According to the results, the leave-one-out misclassification rate after applying PCA is 2.8%, which gives a total accuracy of 97.2%.
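The leave-one-out evaluation can be sketched as below. This is an illustration under assumptions, not the authors' code: the classifier is a simplified Epanechnikov kernel density classifier with a rule-of-thumb diagonal bandwidth (hypothetical `scale` factor), and the data are two synthetic clusters rather than the EEG features.

```python
import numpy as np

def kde_score(x, train, h):
    """Epanechnikov kernel density at x, up to a class-independent constant."""
    u = (x - train) / h
    k = np.clip(1.0 - (u ** 2).sum(axis=1), 0.0, None)
    return k.sum() / (len(train) * np.prod(h))

def loo_accuracy(F, labels, scale=2.0):
    """Leave-one-out accuracy of the kernel density classifier."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    hits = 0
    for i in range(len(F)):
        keep = np.arange(len(F)) != i          # hold out observation i
        scores = []
        for c in classes:
            train = F[keep & (labels == c)]
            n, d = train.shape
            # Assumed rule-of-thumb bandwidth, recomputed without observation i.
            h = scale * train.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))
            scores.append(kde_score(F[i], train, h))
        hits += classes[int(np.argmax(scores))] == labels[i]
    return hits / len(F)

rng = np.random.default_rng(5)
F = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(6.0, 1.0, size=(50, 2))])   # two synthetic categories
labels = np.repeat([0, 1], 50)

acc = loo_accuracy(F, labels)
print(acc)
```

Each observation is classified by densities estimated without it, so the resulting rate estimates out-of-sample performance rather than fit to the training data.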

Conclusions
This paper began with the aim of introducing a new and more effective approach for discriminating EEG signals. Accordingly, we propose the use of the nonparametric singular spectrum analysis technique as a viable tool and evaluate its performance at discriminating EEG signals using the benchmark data set for epileptic study [1]. We find promising results which show that highly accurate feature extractions are possible via SSA and a kernel classifier. Motivated by this, we innovate further by incorporating PCA within the proposed classification model to make additional gains in classification accuracy.
As a result, we have succeeded in presenting several contributions. Firstly, previous researchers working on this same data set (see Introduction) have relied upon the wavelet transform and the Fourier transform for feature extraction. In contrast, we have succeeded in proposing and introducing a successful SSA and kernel-based approach for discriminating between EEG signals.
Secondly, it is important to note that the primary challenge in this dataset is our attempt to classify all five categories. For instance, categories A and B are very similar, as both are EEG recordings of healthy subjects, with their eyes open and closed, respectively. Moreover, categories C and D are also similar, since both are EEG recordings from epileptic patients, one taken from the epileptic parts of the brain and the other from the normal parts of the brain, while the patients are not under attack. In contrast, the last category is completely different, as it is recorded during an epileptic attack. Interestingly, most of the previous research focused on discrimination between pairs of categories: A and E, C and E, D and E, A and D, or A and C. Whilst [15] considered discriminating all categories at the same time, their findings were not satisfactory and thus forced them to switch to categories A and E, which represents the simplest case for discriminating between EEG recordings. The fact that our findings, using the spectral method of SSA, produced better results by discriminating between all five categories at the same time is extremely important and useful in practice.
Thirdly, the initial results were extremely positive, as SSA combined with a kernel classifier reported a 99.4% accuracy rate. We also show that incorporating PCA within the proposed methodology can further improve the classification accuracy, up to 100% (97.2% for out-of-sample classification). These results are far better than the accuracy rate reported in [15], where DWT was used to discriminate between EEG signals. Whilst historically the DWT technique was popular for discriminating between EEG signals, the findings here not only question its applicability in future, but also clearly demonstrate that the SSA and kernel-based approach for classification has comparatively better features and accuracy rates.

Figure 1. Typical electroencephalogram (EEG) signals: (a) EEG recording from healthy subject with eyes open, (b) EEG recording from healthy subject with eyes closed, (c) EEG recording from non-epileptogenic area of epileptic subject, (d) EEG recording from epileptogenic area of epileptic subject, (e) EEG recording from seizure activity.
The split of the set of indices $J = \{1, \ldots, d\}$ into the disjoint subsets $I_1, \ldots, I_m$ corresponds to the representation $X = X_{I_1} + \cdots + X_{I_m}$. The procedure of choosing the sets $I_1, \ldots, I_m$ is called the eigentriple grouping. For a given group $I$, the contribution of the component $X_I$ is measured by the share of the corresponding eigenvalues: $\sum_{i \in I} \lambda_i / \sum_{i=1}^{d} \lambda_i$.

Figure 3. First five components (first component at the top and fifth component at the bottom) of typical EEG recordings from Figure 1: Recordings from categories A (left) and B (right).

Figure 4. First five components (first component at the top and fifth component at the bottom) of typical EEG recordings from Figure 1: Recordings from categories C (left), D (middle) and E (right).

Figure 5. Box plots, by category, of the logarithm of the eigenvalues and variances of the first two components of EEG recordings.

Figure 6. Logarithm of average eigenvalues for EEG patterns A (solid line) and E (dotted line) with different window lengths, L: The horizontal axis shows the component's number and the vertical axis shows the logarithm of the average eigenvalues.
(5) As the discrimination rule, a new set of features (say $x = [x_1, \ldots, x_6]'$) is allocated to the $i$th category if $i = \arg\max_{k = 1, \ldots, 5} \hat f_k(x)$, where $\hat f_k$ is the estimated density of category $k$.

Table 1. Classification results for five patterns based on discrete wavelet transform (DWT) (multi-block approach with four blocks) [15].

Table 2. Classification results for five patterns based on SSA, before applying principal components analysis (PCA). The diagonal numbers show the number of correct classifications in each category.

Table 3. Leave-one-out classification results for five patterns based on SSA and PCA. The diagonal numbers show the number of correct classifications in each category.
Maharaj and Alonso [15] used the discrete wavelet transform (DWT) to discriminate between these patterns. The best accuracy achieved using DWT is reported in Table 1. According to Table 1, discrimination using DWT and a kernel classifier has a 90.2% total accuracy. The comparison of SSA and DWT for extracting EEG features clearly indicates the superiority of the SSA-based extractions, which are far more accurate than the DWT-based extractions.