Article

Brain–Computer Interface: The HOL–SSA Decomposition and Two-Phase Classification on the HGD EEG Data

by Mary Judith Antony 1, Baghavathi Priya Sankaralingam 2, Shakir Khan 3,4, Abrar Almjally 3, Nouf Abdullah Almujally 5,* and Rakesh Kumar Mahendran 6
1 Department of Computer Science & Engineering, Panimalar College of Engineering, Chennai 600123, India
2 Department of Computer Science & Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Chennai 601103, India
3 College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
4 University Centre for Research and Development, Department of Computer Science and Engineering, Chandigarh University, Mohali 140413, India
5 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
6 Department of Computer Science and Engineering, Rajalakshmi Engineering College, Chennai 602105, India
* Author to whom correspondence should be addressed.
Diagnostics 2023, 13(17), 2852; https://doi.org/10.3390/diagnostics13172852
Submission received: 28 July 2023 / Revised: 25 August 2023 / Accepted: 28 August 2023 / Published: 3 September 2023
(This article belongs to the Special Issue Biomedical Signal Processing and Analysis)

Abstract

An efficient processing approach is essential for increasing identification accuracy since the electroencephalogram (EEG) signals produced by the Brain–Computer Interface (BCI) apparatus are nonlinear, nonstationary, and time-varying. The interpretation of scalp EEG recordings can be hampered by nonbrain contributions to electroencephalographic (EEG) signals, referred to as artifacts. Common disturbances in the capture of EEG signals include electrooculogram (EOG), electrocardiogram (ECG), electromyogram (EMG) and other artifacts, which have a significant impact on the extraction of meaningful information. This study suggests integrating the Singular Spectrum Analysis (SSA) and Independent Component Analysis (ICA) methods to preprocess the EEG data. The key objective of our research was to employ Higher-Order Linear-Moment-based SSA (HOL–SSA) to decompose EEG signals into multivariate components, followed by extracting source signals using Online Recursive ICA (ORICA). This approach effectively improves artifact rejection. Experimental results using the motor imagery High-Gamma Dataset validate our method’s ability to identify and remove artifacts such as EOG, ECG, and EMG from EEG data, while preserving essential brain activity.

1. Introduction

EEG is a technique for detecting electrical activity in the brain. Since the electrodes are usually positioned along the scalp, it is noninvasive. The most frequent use of the technology is the diagnosis of epilepsy, which produces aberrant EEG readings [1]. Additionally, it may be used to detect brain death, coma, encephalopathies, sleep disorders, and the depth of anesthesia. EEG was previously considered the gold standard for identifying tumors, strokes, and other focal brain disorders. Its use for these purposes has decreased, however, owing to the development of high-resolution structural imaging methods such as computed tomography (CT) and magnetic resonance imaging (MRI). EEG continues to be a vital research and diagnostic tool despite its poor spatial resolution [2]. CT, PET, and MRI cannot match its millisecond-range temporal resolution. Unlike the majority of other neuroimaging methods, EEG is relatively tolerant of subject movement. Motion artifacts in EEG data can also be reduced to allow a better analysis of the responses to auditory stimuli.
Before EEG data are analyzed, unwanted components must be eliminated to achieve improved accuracy. Therefore, source separation approaches in EEG signal processing have gained a great deal of attention. Mixed data contain statistically independent sources that are hidden in the mixture. For source separation, the Blind Signal Extraction (BSE) and Blind Source Separation (BSS) approaches were suggested [3]. Independent Component Analysis (ICA) and Principal Component Analysis (PCA) have been the methods most often employed to separate the signals that are concealed in the mixed EEG data [4]. ICA, which belongs to the BSS class, is the most popular method for separating the sources in complex EEG signals. The PCA approach has certain limitations, such as a poor ability to convert directions. When it comes to separating sources and artifacts from EEG data, the ICA approach is more precise and versatile [5]. Because they consume less power, wearable and portable EEG devices have become more popular. With these devices, it is simple to monitor and record EEG signals at home. Additionally, there is spectral overlap between the EEG and the source components.
Independent Component Analysis has primarily been used to suppress artifacts in multichannel EEG recordings. Real-time use of ICA is not possible with systems that have only one or a few EEG channels, and using ICA to eliminate artifacts on a single channel is substantially more difficult. The suggested work emphasizes separation using ICA. The challenge lies in building a technique that can separate artifacts from a single-channel EEG signal. ICA cannot be applied directly to single-channel EEG signals, whereas it is commonly used for multichannel EEG signal processing. To overcome this limitation, the single-channel signal is mapped into multivariate data using an appropriate decomposition technique. The authors in [6] mapped the single-channel signal into multivariate data using the Wavelet Transform and EEMD, respectively. However, according to their report, this technique failed to separate the sources efficiently.
A decomposition method that is frequently applied in the study of meteorological time series data is Singular Spectrum Analysis (SSA) [7]. A tensor-based SSA method was utilized to extract the narrow band variable using the Empirical Mode Decomposition (EMD). However, this requires many calculations during the SSA reconstruction step. In SSA, a Finite Impulse Response filter was later described using a truncated SVD, with the eigenvectors serving as the filter coefficients. Recently, SSA has been utilized to create filter banks. In such implementations, the output remained in phase with the original signal.
The primary advantage of the suggested strategy over the existing SSA–ICA and SSA–ANC [8] procedures is the capacity to separate the sources present in single-channel EEG signals in a multiview data analysis. In other words, the suggested decomposition maps the signal into multivariate data with many dimensions, as opposed to existing multivariate data that have only spatial and temporal dimensions; it is therefore referred to as Higher-Order SVD-based Singular Spectrum Analysis. The recommended HOL–SSA technique is used to decompose the single-channel signals into multivariate data, which are then used to recover the source signals with the Online Recursive ICA (ORICA) approach.
This study introduces an innovative methodology for identifying signal sources within single-channel EEG data. The approach involves combining the Singular Spectrum Analysis and Independent Component Analysis techniques. Specifically, we propose the utilization of a novel method called Higher-Order L-moment Singular Value Decomposition-based SSA (HOL–SSA). This technique is a linear combination of Higher-Order Singular Value Decomposition (HOSVD), as demonstrated by [9]. HOL–SSA has exhibited superior robustness compared to both higher- and lower-order statistical methods within SSA.
The manuscript is organized as follows. Section 2 surveys the related literature. Section 3 explains the proposed HOL–SSA approach and its algorithm. Section 4 describes the dataset used for experimentation, the results obtained, and their analysis. Finally, Section 5 concludes the work and outlines directions for future enhancement.

2. Literature Survey

The authors in [10] introduced a new method based on the Singular Spectrum Analysis (SSA) technique for classifying brain activity from EEG signals and applied it to a benchmark epilepsy dataset. The results from the SSA-based approach were compared with those from the discrete wavelet transform, and SSA was found to capture both stationary and nonstationary EEG features more effectively than wavelet transforms. The automated removal of EOG artifacts from EEG signals was presented by the authors in [11]. They employed Circulant Singular Spectrum Analysis (CiSSA) to decompose the EOG-contaminated EEG signals into intrinsic mode functions (IMFs). Subsequently, the artifact components were identified using kurtosis and energy values and removed by means of a four-level discrete wavelet transform (DWT). The approach was evaluated on synthetic and real EEG data, and the results indicate that it eliminates EOG artifacts effectively while retaining low-frequency EEG information.
In their study [12], the authors introduced a novel and effective technique for the removal of muscle artifacts from EEG signals. The method, named SSA–CCA (Singular Spectrum Analysis–Canonical Correlation Analysis), combines Singular Spectrum Analysis (SSA) and Canonical Correlation Analysis (CCA). Unlike conventional single-channel decomposition methods, such as ensemble empirical mode decomposition (EEMD), the SSA algorithm employed in this approach draws on principles of multivariate statistics. This enables the proposed method to harness the benefits of both SSA and cross-channel information. The efficacy of SSA–CCA is assessed using both semi-simulated and real EEG data. The results of the evaluation reveal that the introduced method surpasses existing techniques, namely, EEMD–CCA, and even the classic approach of CCA, particularly when dealing with multichannel scenarios. This innovative SSA–CCA approach thus presents a promising advancement in the domain of EEG artifact removal.
As the successful elimination of EOG artifacts remains a significant obstacle in EEG research, the authors proposed a novel approach, termed EEMD-based ICA (EICA) [13]. This method combines ensemble empirical mode decomposition (EEMD) with ICA algorithms to enhance the removal of EOG artifacts from multichannel EEG signals. However, when conducting a comparative analysis, the authors found that the Singular Spectrum Analysis (SSA) method exhibits superior performance. SSA showcases the highest improvement in signal-to-noise ratio, coupled with a reduction in root mean square error and correlation coefficient after the removal of EOG artifacts. This robust performance of SSA underscores its ability to more effectively eliminate blink artifacts from multichannel EEG signals, while minimizing the impact of error. As a result, SSA emerges as a promising solution for addressing the challenge of EOG artifact removal in the realm of multichannel EEG signal analysis.
One emerging approach that has gained attention in recent years is the two-phase classification approach, which involves a sequential classification process aimed at enhancing accuracy, efficiency, and noise reduction. This review highlights the merits of the two-phase classification approach in comparison to other classification methods commonly used in EEG signal processing.
The seminal work by the authors in [14] discusses the conceptual framework and practical implementation of a two-stage classification approach as compared to single-stage classifiers. By leveraging multiple stages, the proposed methodology enables the model to first capture high-level patterns and subsequently refine predictions in the second stage. Empirical evidence presented in this article underscores the improved accuracy, generalization, and adaptability of the two-stage classifier across diverse datasets.
In the comparative study, the authors systematically assess the performance of single-stage classifiers against a two-stage classifier using multiple datasets [15]. The article meticulously outlines the benefits of the two-stage approach, which includes superior feature extraction and hierarchical decision-making. The experimental results clearly illustrate that the two-stage classifier consistently outperforms single-stage alternatives, emphasizing the efficacy of its intricate decision pipeline.
Focusing on the complexities posed by intricate datasets, the article [16] by the authors elucidates the merits of employing a two-stage classification strategy. Through an in-depth examination of real-world scenarios, the authors demonstrate the limitations of single-stage classifiers and how the two-stage approach is better suited to handle such challenges. By effectively segmenting the decision-making process, the proposed methodology showcases remarkable performance improvements, establishing its relevance in intricate data analysis.
The authors in [17] have presented a case study that highlights the tangible benefits of adopting a two-stage classification model in practical applications. Drawing from a specific domain, they outline the shortcomings of using single-stage classifiers and present evidence of the two-stage model’s remarkable success. Similarly, the authors in [18] introduced a dual-stage classification approach. In the initial stage, they employed LDA classifiers to distinguish between various pair-wise MI tasks. Following this, a naive Bayes classifier was employed to forecast the ultimate task executed by the user. This prediction is based on the weighted results of the LDA classifiers. The conducted experiments indicated that the proposed method surpassed the top-performing entry in BCI competition IV by a margin of 3.5%.
Through careful analysis and extensive experimentation, this work underscores the superiority of the two-stage classification approach, reinforcing its viability in real-world scenarios.
The proposed study contributes an adaptive two-phase classification technique for MI events, showcasing improved accuracy and consistency in BCI performance. The study by the authors in [19] presents a method for epileptic seizure detection in EEG signals, leveraging nonlinear features and a deep learning model. Both studies highlight the significance of innovative classification methodologies in distinct domains, with the first emphasizing enhanced performance in BCI and the second demonstrating exceptional accuracy in epileptic seizure detection using advanced feature extraction and DL techniques.
The proposed study in this research and the study in [20] addresses classification challenges in distinct domains utilizing advanced methodologies. In Study 1, the emphasis is on MI event classification using a two-phase approach, with ANN and adaptive SVM classifiers. The adaptive technique aims to improve BCI performance by maintaining consistency, reducing training time, and handling non-stationarities. Study 2, on the other hand, focuses on epileptic seizure detection in EEG signals, employing a comprehensive CADS. It incorporates TQWT decomposition, extraction of various features, and a CNN-RNN DL model for classification. Both studies demonstrate significant improvements over existing approaches. Moreover, the proposed model can be efficiently used for other applications of medical images segmentation for brain data studies.

3. Methods and Materials

3.1. Singular Spectrum Analysis (SSA)

Singular Spectrum Analysis (SSA) is a powerful technique for analyzing time series data [21], including nonlinear and nonstationary series. SSA has shown great promise in the analysis of electroencephalography (EEG) signals [22]. It is a data-driven technique that identifies the alpha, beta, gamma, and other rhythms associated with different brain activities. The processing steps of SSA are: (1) Embedding, (2) Singular Value Decomposition, (3) Grouping, and (4) Reconstruction.
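The four SSA steps listed above can be illustrated with a minimal NumPy sketch of the conventional (matrix SVD) variant. This is not the HOL–SSA procedure proposed in this article; the function name `ssa_decompose`, the overlapping lag embedding, and the `window` and `groups` parameters are assumptions chosen for illustration.

```python
import numpy as np

def ssa_decompose(series, window, groups):
    """Basic SSA sketch: embed, SVD, group, reconstruct by diagonal averaging."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1  # number of lagged vectors

    # (1) Embedding: window x k trajectory (Hankel) matrix of lagged vectors.
    traj = np.column_stack([x[j:j + window] for j in range(k)])

    # (2) Singular Value Decomposition of the trajectory matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)

    components = []
    for idx in groups:
        idx = list(idx)
        # (3) Grouping: sum the rank-one terms that belong to this group.
        part = (u[:, idx] * s[idx]) @ vt[idx, :]
        # (4) Reconstruction: average every anti-diagonal back into a series.
        rec = np.array([part[::-1, :].diagonal(d).mean()
                        for d in range(-window + 1, k)])
        components.append(rec)
    return np.array(components)
```

If the groups together cover every singular-value index, the reconstructed components add up to the original series, which is a convenient sanity check when choosing the grouping.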
The proposed contribution is to use HOL–SVD-based decomposition in SSA rather than the conventional SVD. HOL–SSA is a linear combination of Higher-Order Singular Value Decomposition (HOSVD). It proved to be more robust than the existing higher-order and lower-order statistics of SSA. SVD factorizes a matrix, whereas HOSVD extends the factorization to higher-order (multidimensional) tensors. HOSVD can handle nonlinear data. It can capture the spatial and temporal features of EEG data simultaneously, which makes it useful for analyzing data with complex spatiotemporal patterns, whereas SVD does not capture the full spatiotemporal patterns in EEG data. HOSVD can handle missing data in the tensor by using tensor completion, whereas SVD requires a complete matrix for analysis. However, the choice of method depends on the specific application and the characteristics of the data being analyzed.

3.2. HOL–SSA

Multiple approaches to SSA were proposed for decomposition. Here, it is proposed to use a novel Higher-Order L-moment Singular Value Decomposition-based SSA (HOL–SSA), a linear combination of Higher-Order Singular Value Decomposition (HOSVD), which has been proved to be more robust than the existing higher-order and lower-order statistics of SSA. The recommended HOL–SSA technique is utilized to deconstruct the single-channel signals into multivariate data, which are then subsequently used to recover the source signals using the Online Recursive ICA (ORICA) approach.

3.2.1. HOSVD

Most frequently, the multidimensional SVD is associated with the extraction of relevant information from the multiway cluster. A Multilinear Singular Value Decomposition is another term that is used. The relevant data are sampled in several dimensions using the multidimensional digital signal processing technique. The process of performing single-dimensional samplings involves selecting points along a continuous line and recording their values in a data stream. Contrarily, in multidimensional sampling, the data are chosen using a matrix based on the dataset’s sample vectors. The Tucker compression, which is a method for reducing the amount of multidimensional data, is mostly implemented using the HOSVD.
For a tensor $R$ of order $O$ and size $s_1 \times s_2 \times \cdots \times s_O$, the HOSVD is defined as follows.
$$R = C_R \times_1 P^{(1)} \times_2 P^{(2)} \times_3 \cdots \times_O P^{(O)}$$
where $C_R$ is the core tensor.
$P^{(m)}$ are the matrices of m-mode singular vectors of $R$ with $m = 1, 2, \ldots, O$.
For every $m = 1, 2, \ldots, O$, the following are determined.
  • SVD of m-mode unfolding $R_{(m)}$ of $R$ as $R_{(m)} = P^{(m)} \Sigma^{(m)} Q^{(m)T}$
  • After the computation of matrices of m-mode singular vectors $P^{(m)}$, the core tensor $C_R$ can be computed as follows.
    $$C_R = R \times_1 P^{(1)T} \times_2 P^{(2)T} \times_3 \cdots \times_O P^{(O)T}$$
  • Number of nonzero diagonal elements in $\Sigma^{(m)}$ as the m-rank.
The HOSVD of an order-O tensor R is computationally demanding: its cost is comparable to that of O matrix SVDs.
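The definition above can be checked numerically with a short NumPy sketch. The helper names `unfold`, `mode_product`, `hosvd`, and `reconstruct` are illustrative assumptions and do not come from the original implementation; the sketch computes one SVD per mode, forms the core tensor with the transposed factors, and then rebuilds the tensor.

```python
import numpy as np

def unfold(tensor, mode):
    """m-mode unfolding: the chosen mode indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """m-mode product of a tensor with a matrix (tensor x_m matrix)."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(tensor):
    """Return the core tensor C_R and the factor matrices P^(m)."""
    factors = []
    for m in range(tensor.ndim):
        # Left singular vectors of the m-mode unfolding give P^(m).
        p, _, _ = np.linalg.svd(unfold(tensor, m), full_matrices=False)
        factors.append(p)
    core = tensor
    for m, p in enumerate(factors):
        # C_R = R x_1 P^(1)T x_2 P^(2)T ... x_O P^(O)T
        core = mode_product(core, p.T, m)
    return core, factors

def reconstruct(core, factors):
    """R = C_R x_1 P^(1) x_2 P^(2) ... x_O P^(O)."""
    out = core
    for m, p in enumerate(factors):
        out = mode_product(out, p, m)
    return out
```

Because one SVD is computed per mode, the loop in `hosvd` makes the cost of O matrix SVDs visible directly.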

3.2.2. Truncated HOSVD

An efficient approximate solution is to compute only the largest m-mode singular values. After determining the matrices of dominant m-mode singular vectors, derived from $R_{(m)} = P^{(m)} \Sigma^{(m)} Q^{(m)T}$, the truncated core tensor $C_R^{t}$ is obtained from the elements of the core tensor $C_R$.
$$R = C_R \times_1 P^{(1)} \times_2 P^{(2)} \times_3 \cdots \times_O P^{(O)}$$
$$C_R = R \times_1 P^{(1)T} \times_2 P^{(2)T} \times_3 \cdots \times_O P^{(O)T}$$
The term “truncated HOSVD” refers to the low m rank approximation R’ of the tensor R, which has the dominating m ranks [23]. In several applications across a wide range of signal processing domains, the HOSVD has been employed. It is extremely promising to use the reduced HOSVD as a preprocessing step for several multilinear signal processing methods’ dimensionality reduction. Thus, the computational complexity may be greatly decreased.
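A minimal sketch of the truncation, assuming the dominant m-ranks are supplied by the caller: only the leading columns of each factor matrix are kept, so the core tensor shrinks to the requested size. The names `truncated_hosvd` and `ranks` are hypothetical, and the article does not state how the dominant ranks are selected.

```python
import numpy as np

def _mode_product(tensor, matrix, mode):
    """m-mode product of a tensor with a matrix."""
    moved = np.moveaxis(tensor, mode, 0)
    return np.moveaxis(np.tensordot(matrix, moved, axes=(1, 0)), 0, mode)

def truncated_hosvd(tensor, ranks):
    """Keep the ranks[m] dominant m-mode singular vectors for every mode."""
    factors = []
    for m, r in enumerate(ranks):
        unfolding = np.moveaxis(tensor, m, 0).reshape(tensor.shape[m], -1)
        p, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(p[:, :r])  # dominant singular vectors only

    # Truncated core tensor: project every mode onto its dominant subspace.
    core = tensor
    for m, p in enumerate(factors):
        core = _mode_product(core, p.T, m)

    # Low m-rank approximation R' of the original tensor.
    approx = core
    for m, p in enumerate(factors):
        approx = _mode_product(approx, p, m)
    return core, factors, approx
```

When the tensor already has the requested multilinear rank, `approx` reproduces it up to rounding; otherwise it is an approximation whose core has far fewer entries than the input, which is where the saving in later processing steps comes from.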

3.2.3. L-Moment

The L-moment analysis is a statistical method used to analyze the probability distributions. As the HOSVD method decomposes the EEG signal into its spatial, spectral, and temporal components, the L-moment provides the distribution information of each component. The approach provides support in identifying patterns in the signal that would not be apparent using traditional signal processing techniques thus providing more accurate and reliable results [24].
In statistical theory, using cumulants and joint cumulants for univariate and multivariate distributions is one well-established method for Higher-Order Statistics. These are extended in time series analysis to higher-order spectra, such as the bispectrum and trispectrum.
L-moments, which are linear statistics (linear combinations of order statistics) and thus more reliable than HOS, can be used as an alternative to HOS and higher moments. L-moments are a series of statistics that are used to condense a probability distribution’s form. The L-scale, L-skewness, and L-kurtosis are linear combinations of order statistics (L-statistics) that are comparable to traditional moments and may be used to derive numbers similar to standard deviation, skewness, and kurtosis, respectively, where the L-mean is identical to the conventional mean. Standardized moments are equivalent to standardized L-moments, also known as L-moment ratios. A theoretical distribution has a collection of population L-moments, similar to conventional moments. For a sample taken from the population, sample L-moments are established and utilized as estimators of population L-moments.
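As an illustration of how sample L-moments are computed from order statistics, the sketch below uses the standard unbiased estimators based on probability-weighted moments. The function name `sample_l_moments` is an assumption, the sample must contain at least four values, and the code is not taken from the authors' implementation.

```python
import numpy as np

def sample_l_moments(data):
    """First four sample L-moments from the ordered sample (needs n >= 4)."""
    x = np.sort(np.asarray(data, dtype=float))  # order statistics
    n = x.size
    i = np.arange(1, n + 1)

    # Unbiased probability-weighted moments b_0 .. b_3.
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((i - 1) * (i - 2) * (i - 3)
                / ((n - 1) * (n - 2) * (n - 3)) * x) / n

    # L-moments are fixed linear combinations of the b_r.
    l1 = b0                                # L-mean
    l2 = 2 * b1 - b0                       # L-scale
    l3 = 6 * b2 - 6 * b1 + b0              # numerator of L-skewness
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0  # numerator of L-kurtosis
    return l1, l2, l3, l4
```

The L-moment ratios follow as `l3 / l2` (L-skewness) and `l4 / l2` (L-kurtosis). Because every estimator is linear in the ordered values, a single outlier affects them less than it affects conventional third and fourth moments.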
The nth population L-moment for a random variable $Z$ is
$$\lambda_n = n^{-1} \sum_{i=0}^{n-1} (-1)^{i} \binom{n-1}{i} E\left[Z_{n-i:n}\right]$$
where $E$ stands for the expected value and $Z_{i:N}$ represents the ith order statistic (the ith smallest value) in an independent sample of size $N$ from the distribution of $Z$. For example, $\lambda_2 = (E[Z_{2:2}] - E[Z_{1:2}])/2$ and $\lambda_3 = (E[Z_{3:3}] - 2E[Z_{2:3}] + E[Z_{1:3}])/3$.
The recommended HOL–SSA technique is used to decompose the single-channel signals into multivariate data, which are then used to recover the source signals with the Online Recursive ICA (ORICA) [25] approach (Algorithm 1).
Algorithm 1: Proposed HOL–SSA and ORICA Methodology
Step 1:  Input the raw EEG signal.
Step 2:  Map the signal vector to a matrix.
In the embedding stage, the time series $s$ of length $l$ is mapped into a tensor $R$: $s$ is segmented using a nonoverlapping window of size $i$, and an $[l/i] \times i$ matrix $M$ is obtained from $s$.
$$M = \begin{bmatrix} s_1 & s_2 & \cdots & s_i \\ s_{i+1} & s_{i+2} & \cdots & s_{2i} \\ \vdots & \vdots & \ddots & \vdots \\ s_{(L-1)i+1} & s_{(L-1)i+2} & \cdots & s_{Li} \end{bmatrix}$$
$$L = [l/i]$$
  $L$ refers to the last slab of the tensor. The matrix $M$ is converted to tensor $R$ by considering each slab of the tensor as a windowed version of $M$.
Because applying SSA to real data does not exploit the inherent nonstationarity and may therefore fail to decompose actual data, tensor-based SSA is a robust solution to this problem.
Step 3:  Decompose the signal using HOSVD.
  The truncated HOSVD of the converted tensor $R$ of order $O$ and the dominant m-ranks for $m = 1, 2, \ldots, O$ are computed.
for $m = 1, 2, \ldots, O$
{
compute $R_{(m)} = P^{(m)} \Sigma^{(m)} Q^{(m)T}$
compute matrices of dominant m-mode singular vectors $[P_1^{(m)}, P_2^{(m)}, \ldots, P_{dom}^{(m)}]$
}
compute $C_R = R \times_1 P^{(1)T} \times_2 P^{(2)T} \times_3 \cdots \times_O P^{(O)T}$
compute $C_R^{t}$ from $C_R$
Step 4:  Determine the Linear moments of HOSVD.
  The nth population L-moment of a tensor with O order statistics in a decomposed sample from the distribution of the core tensor $C_R^{t}$ is as follows.
$$\lambda_n = n^{-1} \sum_{O=0}^{n-1} (-1)^{O} \binom{n-1}{O} E\left[C_{R\,(n-O:n)}\right]$$
  $E$ is the expected value.
Step 5:  Reconstruct the original signal to a multivariate data matrix.
The matrices from step 4 are grouped into submatrices, as given below.
$$\sum_{z=1}^{Y} M_z$$
Here, $Y$ represents the total number of groups, $z$ refers to the subgroups of eigenvalues, and $M_z$ denotes the sum of matrices within group $z$.
Next, each matrix of the grouped decomposition is Hankelized, after which the Hankel matrix is transformed into a new series of length $l$. The diagonal averaging applied to the resultant matrix produces a reconstructed series. Thus, the initial series $s_1, \ldots, s_l$ is decomposed into a sum of $r$ reconstructed subseries, as shown below.
$$s = \sum_{k=1}^{r} \tilde{s}^{(k)}$$
where $\tilde{s}^{(k)}$ denotes the kth reconstructed subseries.
This decomposition is the main result of the HOL–SSA algorithm. If each reconstructed subseries is categorized as a single periodic component or noise, the decomposition makes sense. As a result, the online recursive ICA technique is used in this situation for component separation, as indicated in the step that follows.
Step 6:  Apply ORICA on the multivariate data matrix, and for each iteration, the whitening matrix and the demixing matrix are computed.
In order to reverse the mixing action, the inverse matrix of the reconstructed subseries is built. The independent components are produced by applying the ORICA rule after applying the Sherman–Morrison matrix inversion method.
$$S_{i+1}^{-1} = S_i^{-1} + \frac{l}{r}\left[I - f\,a_i a_i^{T}\right] S_i^{-1}$$
$S^{-1}$ refers to the demixing matrix of the $r$ reconstructed subseries.
Step 7:  Output the mapped sources of interest into original signal form.

Time Complexity: $O(N^3) + O(M)$
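Two building blocks of Algorithm 1 can be sketched in a few lines of NumPy: the nonoverlapping segmentation of Step 2 and the Sherman–Morrison identity referred to in Step 6. Both functions are generic illustrations under stated assumptions. Samples that do not fill a complete window are simply dropped (an assumption, since the article does not say how the remainder is handled), and `sherman_morrison_update` shows only the rank-one inverse update, not the ORICA learning rule itself. The function names are hypothetical.

```python
import numpy as np

def embed_nonoverlapping(series, window):
    """Step 2 sketch: cut the series into rows of `window` samples each."""
    x = np.asarray(series, dtype=float)
    rows = x.size // window  # number of complete windows
    # Assumption: trailing samples that do not fill a window are dropped.
    return x[:rows * window].reshape(rows, window)

def sherman_morrison_update(a_inv, u, v):
    """Return inv(A + u v^T) from inv(A) without a fresh inversion."""
    a_inv_u = a_inv @ u
    v_a_inv = v @ a_inv
    denom = 1.0 + v @ a_inv_u
    if abs(denom) < 1e-12:
        raise ValueError("the rank-one update makes the matrix singular")
    return a_inv - np.outer(a_inv_u, v_a_inv) / denom
```

The appeal of the rank-one update in an online setting is that it needs only matrix-vector products for each new sample, instead of inverting the matrix again.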
The features of the denoised EEG data are then extracted using the Common Spatial Pattern (CSP) technique. A two-phase classification strategy is also proposed and tested on the motor imagery EEG data. Cross-comparison tests demonstrated that the suggested two-phase classification approach, which combines an Artificial Neural Network and an Adaptive Support Vector Machine, has greater classification accuracy than the existing single-stage and two-stage classification approaches [26].
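For readers unfamiliar with CSP, the following is a minimal two-class sketch in NumPy. It assumes trials arranged as arrays of shape (trials, channels, samples) and uses the common whitening formulation. How CSP is extended to the four motor imagery classes is not described in this article, so the binary setting, the function names, and the `n_pairs` parameter are assumptions.

```python
import numpy as np

def _mean_cov(trials):
    """Trace-normalized spatial covariance averaged over trials."""
    covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
    return np.mean(covs, axis=0)

def csp_filters(trials_a, trials_b, n_pairs=1):
    """Spatial filters maximizing variance for one class versus the other."""
    cov_a, cov_b = _mean_cov(trials_a), _mean_cov(trials_b)

    # Whiten the composite covariance so that cov_a + cov_b becomes identity.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whiten = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Eigenvectors of the whitened class-A covariance, sorted descending.
    d, b = np.linalg.eigh(whiten @ cov_a @ whiten.T)
    order = np.argsort(d)[::-1]
    filters = b[:, order].T @ whiten

    # Keep the filters from both ends of the eigenvalue spectrum.
    picks = list(range(n_pairs)) + list(range(-n_pairs, 0))
    return filters[picks]

def csp_features(trial, filters):
    """Normalized log-variance of the spatially filtered trial."""
    var = (filters @ trial).var(axis=1)
    return np.log(var / var.sum())
```

The resulting log-variance vector is the kind of feature that would then be passed to a classifier.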

4. Result and Analysis

4.1. Dataset Description

The suggested model is assessed using the HGD, a different dataset, to confirm its resilience to data fluctuations. The HGD contains four classes (left hand, right hand, both feet, and rest) and more trials than the BCI-IV 2a. The HGD was recorded from fourteen individuals in a controlled environment. The data were acquired with 128 channels at a sampling frequency of 500 Hz, of which just 21 channels were associated with MI.
The HGD data were preprocessed by downsampling them from 500 Hz to 250 Hz. In addition, the channels were reduced from 128 to 21 in order to discard redundant information: electrodes that do not relate to the motor imagery region are left out. As the database description states, only 21 sensors with the letter C in their name were chosen since they cover the motor cortex.
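The channel selection and downsampling described above might look as follows in NumPy. This is only a sketch under stated assumptions: the channel names in the usage note are hypothetical, the selection rule is simply "the name contains the letter C", and averaging adjacent samples stands in for a proper anti-aliasing filter, which a real pipeline would normally use before reducing 500 Hz to 250 Hz.

```python
import numpy as np

def select_and_downsample(data, channel_names, factor=2):
    """Keep channels whose name contains 'C' and reduce the sampling rate.

    data: array of shape (channels, samples).
    """
    keep = [k for k, name in enumerate(channel_names) if "C" in name]
    picked = data[keep]

    # Drop trailing samples so the length is a multiple of `factor`.
    usable = picked.shape[1] - picked.shape[1] % factor

    # Assumption: block averaging as a crude low-pass before decimation.
    blocks = picked[:, :usable].reshape(len(keep), -1, factor)
    return blocks.mean(axis=2), [channel_names[k] for k in keep]
```

For example, calling the function on a four-channel recording labeled `["Fz", "C3", "Cz", "Pz"]` keeps only `C3` and `Cz` and halves the number of samples.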

4.2. Performance Analysis

The analysis of artifact removal on the HGD motor imagery signals using the proposed approach is discussed below. Figure 1 represents the channels used to acquire the motor imagery signals and their locations. The signals acquired by each of these channels are represented in Figure 2 as channel data. These signals are further decomposed using the proposed decomposition approach.
Note that bad channels must be rejected before ICA is executed. The entire dataset should be scrolled through to detect faulty channels visually, because some of them may be bad only intermittently. In such cases, removing the erroneous data segment rather than the channel itself may be better. Plotting the channels’ spectra is another way to spot problematic channels. Once bad channels are known, they can be rejected using the pop_select.m function. Moreover, because filtering can smear artifacts over clean data, necessitating additional data to be discarded after filtering, it may be desirable to remove data segments containing substantial artifacts, such as high spikes, by visual examination before filtering.
After band-pass filtering of the signals, Figure 3 shows the channel data. Before filtering, it is also preferable to eliminate data segments having significant artifacts by visual inspection, such as large spikes in the data. Problematic data segment deletion is seen in Figure 4 and Figure 5.
Although epoched data can also be filtered, filtering the continuous EEG data before epoching or artifact removal is advised, since it reduces the introduction of filtering artifacts at epoch borders. It may be beneficial to high-pass filter the data to eliminate linear trends. High-pass filtering the data at 1 Hz is recommended to obtain signal decompositions of high quality.
Moreover, when large artifacts are removed, as seen in Figure 6, a “border” event replaces the deleted data. Any portion of the continuous data can be rejected or removed in the eegplot.m window. After portions of the data have been flagged for rejection, a new dataset is created.
The components are listed in decreasing order of the EEG variance that each component accounts for. EEG datasets always contain eye artifacts, which frequently occupy the top positions in the component array.
All of the component topoplots are shown in Figure 7. The scalp map for component 21 in Figure 8 illustrates the presence and extent of artifacts in the EEG data. This component appears to contain a significant level of muscular artifacts, and Figure 9 displays the corresponding activity spectrum. Ocular artifacts, which typically occupy the highest locations in their scalp topographies, can be seen together with EEG data. However, component 21 cannot be identified as an eye artifact, since neither the ERP in Figure 10 nor the scalp map shows the significant far-frontal projection that characterizes eye artifacts.
Relatively, Figure 11 depicts the scalp map of component 1, which has fewer artifacts and more EEG signals. Figure 12 and Figure 13, respectively, display the activity power spectrum and ERP map of the same. Table 1 lists the artifacts that are present in each component.
Following artifact removal, the pruned data are eventually shown in Figure 14 and Figure 15, where the artifact-free signals are depicted in red. So, it is found to be quite advantageous to remove artifact regions that include unique artifacts while generating pure independent components. The signals that have had the artifacts removed are then transmitted for feature extraction and classification. The following chart compares the classification performance of the artifact-free HGD motor imagery signals using the proposed ANN + A-SVM model to other approaches tested on the identical HGD motor imagery EEG signals. The classification performance is evaluated under different metrics such as accuracy, precision, recall, K-score, F1-score, and misclassification rate. The accuracy reported is 95.24%, with an average K value of 0.94. Also, the precision, recall, F1-score evaluated for the four classes (Left, Right, Feet and Rest) of all the 14 subjects are reported in the analysis, as shown in Table 2.
The average misclassification rate of 0.047 compares favorably with the existing approaches. This performance analysis is graphically represented in Figure 16. The classification performance is also represented through the confusion matrices in Figure 17. The confusion matrices are shown for four subjects, S4, S5, S13, and S14, where the prediction values are found to be better. Table 3 shows the performance comparison between the proposed models and other models. In particular, the classification accuracy of every subject and the average classification accuracies obtained by DeepConvNet, EEGNet, CP-MixedNet, TS-SEFFNet, MBEEGNet, and MBShallowConvNet on the HGD dataset are summarized in Table 3. Our method achieves an average accuracy of 95.24%, which is the highest among the compared methods except for MBEEGNet, which reaches 95.30%. The comparison is graphically presented in Figure 18.
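The reported metrics are all functions of the confusion matrix. The sketch below shows one way to compute them, assuming rows hold the true classes and columns the predicted classes; the function name `classification_metrics` is hypothetical, and the matrix used to try it out would be illustrative rather than taken from the paper's results.

```python
import numpy as np

def classification_metrics(confusion):
    """Accuracy, misclassification rate, kappa, and per-class scores.

    confusion[i, j] counts trials of true class i predicted as class j.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    accuracy = np.trace(cm) / total

    # Chance agreement from the row and column marginals (Cohen's kappa).
    expected = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / total ** 2
    kappa = (accuracy - expected) / (1.0 - expected)

    precision = np.diag(cm) / cm.sum(axis=0)
    recall = np.diag(cm) / cm.sum(axis=1)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "misclassification": 1.0 - accuracy,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

As a consistency check on the reported figures, and assuming the four classes are balanced so that the chance agreement is 0.25, an accuracy of 0.9524 gives a kappa of (0.9524 − 0.25)/0.75 ≈ 0.94 and a misclassification rate of 1 − 0.9524 ≈ 0.048, in line with the values reported here.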
Table 4 compares the performance of the proposed model with that of other models. The average classification accuracies from the BCI-IV 2a and HGD motor imagery datasets are summarized in the table. Evaluated on these two public datasets, the proposed model compares favorably with the other models.

5. Conclusions

In this article, a new method for removing artifacts from EEG signals has been put forward. The proposed HOL–SSA uses a Higher-Order Linear-Moment-based approach to decompose the signal into multivariate data, followed by the ORICA method to separate the sources. The suggested HOL–SSA and ORICA approach performs better than a number of other current decomposition and source separation approaches and has shown improved results in artifact rejection. The experimental findings demonstrate that the suggested technique can identify and eliminate EOG, ECG, EMG, and other artifacts from EEG data while preserving the underlying brain activity. The computational complexity of the suggested artifact removal approach is also shown in the algorithm. The two-stage ANN + A-SVM classifier improves the classification performance on the HGD motor imagery dataset, as shown in Table 3 and Table 4.
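For readers unfamiliar with the decomposition step, the sketch below shows classical Singular Spectrum Analysis on a single channel: embedding into a trajectory matrix, singular value decomposition, and diagonal averaging. It is not the authors' HOL–SSA; the L-moment-based component handling and the ORICA source separation are not reproduced here, and the function name `ssa_decompose`, the window length, and the synthetic test signal are assumptions made for illustration.

```python
import numpy as np


def ssa_decompose(x, window):
    """Classical SSA: return one reconstructed series per singular triple.
    The rows of the result sum back to the input signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix of shape (window, k): column i is x[i:i+window].
    traj = np.column_stack([x[i:i + window] for i in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)

    comps = np.empty((s.size, n))
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])   # rank-one elementary matrix
        # Diagonal averaging: average each anti-diagonal to get a series back.
        comps[i] = [np.mean(elem[::-1].diagonal(j))
                    for j in range(-window + 1, k)]
    return comps


# Synthetic single-channel example: a slow oscillation plus noise.
rng = np.random.default_rng(0)
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 25) + 0.1 * rng.standard_normal(200)
ssa_components = ssa_decompose(signal, window=20)
print(ssa_components.shape)
```

In a pipeline of the kind described above, the stacked component series would serve as the multivariate input to the source separation stage. How components are grouped or selected is where HOL–SSA departs from this plain version.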

Author Contributions

Conceptualization, M.J.A.; Methodology, M.J.A.; Software, B.P.S.; Validation, B.P.S.; Formal analysis, S.K.; Investigation, S.K.; Resources, A.A.; Data curation, A.A.; Writing—original draft, N.A.A.; Writing—review & editing, R.K.M.; Supervision, R.K.M.; Funding acquisition, N.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-RG23056), Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Machado, S.; Araújo, F.; Paes, F.; Velasques, B.; Cunha, M.; Budde, H.; Basile, L.F.; Anghinah, R.; Arias-Carrión, O.; Cagy, M.; et al. EEG-based Brain-Computer Interfaces: An Overview of Basic Concepts and Clinical Applications in Neurorehabilitation. Rev. Neurosci. 2010, 21, 451–468.
  2. Martinek, R.; Ladrova, M.; Sidikova, M.; Jaros, R.; Behbehani, K.; Kahankova, R.; Kawala-Sterniuk, A. Advanced Bioelectrical Signal Processing Methods: Past, Present and Future Approach—Part II: Brain Signals. Sensors 2021, 21, 5186.
  3. Taha, L.Y.; Abdel-Raheem, E. Blind Source Separation: A Performance Review Approach. In Proceedings of the 5th International Conference on Signal Processing and Information Security (ICSPIS), Dubai, United Arab Emirates, 7–8 December 2022.
  4. Chaddad, A.; Wu, Y.; Kateb, R.; Bouridane, A. Electroencephalography Signal Processing: A Comprehensive Review and Analysis of Methods and Techniques. Sensors 2023, 23, 6434.
  5. Wang, X.; Wang, X.; Zhou, B. EEG Artifact Removal Based on Independent Component Analysis and Outlier Detection. In Proceedings of the 2022 14th International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China, 14–17 October 2022; pp. 209–214.
  6. Taelman, J.; Mijovic, B.; Van Huffel, S.; Devuyst, S.; Dutoit, T. ECG artifact removal from surface EMG signals by combining empirical mode decomposition and independent component analysis. In Proceedings of the 4th International Conference on Bio-inspired Systems and Signal Processing, Rome, Italy, 26–29 January 2011; pp. 421–424.
  7. Blokland, Y.; Bruhn, J.; Farquhar, J.; Spyrou, L. Singular spectrum analysis as a preprocessing filtering step for fNIRS brain computer interfaces. In Proceedings of the 22nd European Signal Processing Conference, Lisbon, Portugal, 1–5 September 2014; pp. 46–50.
  8. Maddirala, A.K.; Shaik, R.A. Separation of Sources from Single-Channel EEG Signals Using Independent Component Analysis. IEEE Trans. Instrum. Meas. 2017, 67, 382–393.
  9. Gopalan, G.; Wikle, C.K. A Higher-Order Singular Value Decomposition Tensor Emulator for Spatiotemporal Simulators. J. Agric. Biol. Environ. Stat. 2022, 27, 22–45.
  10. Hassani, H.; Yeganegi, M.R.; Silva, E.S. A New Signal Processing Approach for Discrimination of EEG Recordings. Stats 2018, 1, 155–168.
  11. Yedukondalu, J.; Sharma, L.D. Circulant Singular Spectrum Analysis and Discrete Wavelet Transform for Automated Removal of EOG Artifacts from EEG Signals. Sensors 2023, 23, 1235.
  12. Liu, Q.; Liu, A.; Zhang, X.; Chen, X.; Qian, R.; Chen, X. Removal of EMG Artifacts from Multichannel EEG Signals Using Combined Singular Spectrum Analysis and Canonical Correlation Analysis. J. Healthc. Eng. 2019, 2019, 4159676.
  13. Chao-Lin, T.; Yi-Yang, Z.; Wei, W.; Yuan-Yuan, L.; Gang, W.; Jin, X. A Novel Method Based on Combination of Independent Component Analysis and Ensemble Empirical Mode Decomposition for Removing Electrooculogram Artifacts from Multichannel Electroencephalogram Signals. Front. Neurosci. 2021, 15, 729403.
  14. Sandoval, C.; Pirogova, E.; Lech, M. Two-Stage Deep Learning Approach to the Classification of Fine-Art Paintings. IEEE Access 2019, 7, 41770–41781.
  15. Naveed, M.; Arif, F.; Usman, S.M.; Anwar, A.; Hadjouni, M.; Elmannai, H.; Hussain, S.; Ullah, S.S.; Umar, F. A Deep Learning-Based Framework for Feature Extraction and Classification of Intrusion Detection in Networks. Wirel. Commun. Mob. Comput. 2022, 2022, 2215852.
  16. Demidova, L.A. Two-Stage Hybrid Data Classifiers Based on SVM and kNN Algorithms. Symmetry 2021, 13, 615.
  17. Yang, Y.; Hu, Y.; Zhang, X.; Wang, S. Two-Stage Selective Ensemble of CNN via Deep Tree Training for Medical Image Classification. IEEE Trans. Cybern. 2022, 52, 9194–9207.
  18. Dos Santos, E.M.; Cassani, R.; Falk, T.H.; Fraga, F.J. Improved motor imagery brain-computer interface performance via adaptive modulation filtering and two-stage classification. Biomed. Signal Process. Control. 2020, 57, 101812.
  19. Malekzadeh, A.; Zare, A.; Yaghoobi, M.; Kobravi, H.-R.; Alizadehsani, R. Epileptic Seizures Detection in EEG Signals Using Fusion Handcrafted and Deep Learning Features. Sensors 2021, 21, 7710.
  20. Malekzadeh, A.; Zare, A.; Yaghoobi, M.; Alizadehsani, R. Automatic Diagnosis of Epileptic Seizures in EEG Signals Using Fractal Dimension Features and Convolutional Autoencoder Method. Big Data Cogn. Comput. 2021, 5, 78.
  21. Shen, Y.; Peng, F.; Li, B. Improved singular spectrum analysis for time series with missing data. Nonlinear Process. Geophys. 2014, 22, 371–376.
  22. Xu, S.; Hu, H.; Ji, L.; Wang, P. Embedding Dimension Selection for Adaptive Singular Spectrum Analysis of EEG Signal. Sensors 2018, 18, 697.
  23. Das, S.; Kyal, C. Efficient multichannel EEG compression by optimal tensor truncation. Biomed. Signal Process. Control. 2021, 68, 102749.
  24. Zhang, L.-W.; Lu, Z.-H.; Zhao, Y.-G. Dynamic reliability assessment of nonlinear structures using extreme value distribution based on L-moments. Mech. Syst. Signal Process. 2021, 159, 107832.
  25. Antony, M.J.; Sankaralingam, B.P.; Mahendran, R.K.; Gardezi, A.A.; Shafiq, M.; Choi, J.-G.; Hamam, H. Classification of EEG Using Adaptive SVM Classifier with CSP and Online Recursive Independent Component Analysis. Sensors 2022, 22, 7596.
  26. Judith, A.M.; Priya, S.B.; Mahendran, R.K.; Gadekallu, T.R.; Ambati, L.S. Two-phase classification: ANN and A-SVM classifiers on motor imagery BCI. Asian J. Control. 2022, 1–12, 1561–8625.
  27. Schirrmeister, R.T.; Springenberg, J.T.; Fiederer, L.D.J.; Glasstetter, M.; Eggensperger, K.; Tangermann, M.; Hutter, F.; Burgard, W.; Ball, T. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 2017, 38, 5391–5420.
  28. Lawhern, V.J.; Solon, A.J.; Waytowich, N.R.; Gordon, S.M.; Hung, C.P.; Lance, B.J. EEGNet: A compact convolutional neural network for EEG-based brain–computer interfaces. J. Neural Eng. 2018, 15, 056013.
  29. Li, Y.; Zhang, X.-R.; Zhang, B.; Lei, M.-Y.; Cui, W.-G.; Guo, Y.-Z. A Channel-Projection Mixed-Scale Convolutional Neural Network for Motor Imagery EEG Decoding. IEEE Trans. Neural Syst. Rehabilitation Eng. 2019, 27, 1170–1180.
  30. Altuwaijri, G.A.; Muhammad, G. Electroencephalogram-Based Motor Imagery Signals Classification Using a Multi-Branch Convolutional Neural Network Model with Attention Blocks. Bioengineering 2022, 9, 323.
  31. Haq, A.U.; Li, J.P.; Khan, S.; Alshara, M.A.; Alotaibi, R.M.; Mawuli, C.B. DACBT: Deep learning approach for classification of brain tumors using MRI data in IoT healthcare environment. Sci. Rep. 2022, 12, 15331.
  32. Haq, A.U.; Li, J.P.; Agbley, B.L.Y.; Khan, A.; Khan, I.; Uddin, M.I.; Khan, S. IIMFCBM: Intelligent Integrated Model for Feature Extraction and Classification of Brain Tumors Using MRI Clinical Imaging Data in IoT-Healthcare. IEEE J. Biomed. Health Inform. 2022, 26, 5004–5012.
  33. Salehi, A.W.; Khan, S.; Gupta, G.; Alabduallah, B.I.; Almjally, A.; Alsolai, H.; Siddiqui, T.; Mellit, A. A Study of CNN and Transfer Learning in Medical Imaging: Advantages, Challenges, Future Scope. Sustainability 2023, 15, 5930.
Figure 1. Channel Locations.
Figure 2. Original channel data of MI signals.
Figure 3. Channel data after filtering.
Figure 4. Rejection of bad data.
Figure 5. Rejection of bad data (stacked form).
Figure 6. Boundary creation after data rejection.
Figure 7. Topoplots of the independent components.
Figure 8. Scalp map of component 21.
Figure 9. Activity power spectrum of component 21.
Figure 10. ERP of component 21.
Figure 11. Scalp map of component 1.
Figure 12. ERP of component 1.
Figure 13. Activity power spectrum of component 1.
Figure 14. Pruned data after artifact removal.
Figure 15. Pruned data after artifact removal (stacked form).
Figure 16. Classification performance under Accuracy, K-Score, Precision, Recall, and F1-Score metrics on HGD dataset.
Figure 17. Confusion matrices of subjects (A) S4, (B) S5, (C) S13, and (D) S14.
Figure 18. Comparison of classification performance chart.
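Table 1. EEG and artifacts present in the observed signals.

| Components/Signals | EEG (%) | Muscle (%) | Eye (%) | Heart (%) | Line Noise (%) | Channel Noise (%) | Other (%) |
|---|---|---|---|---|---|---|---|
| IC 1 | 97.1 | 4.6 | 0.0 | 0.1 | 0.3 | 0.4 | 3.0 |
| IC 2 | 10.0 | 85.2 | 0.6 | 0.0 | 1.4 | 0.1 | 2.6 |
| IC 3 | 3.3 | 93.3 | 1.0 | 0.0 | 0.8 | 0.0 | 1.6 |
| IC 4 | 1.6 | 95.9 | 0.6 | 0.0 | 0.6 | 0.0 | 1.2 |
| IC 5 | 2.1 | 95.2 | 0.7 | 0.0 | 0.6 | 0.0 | 1.3 |
| IC 6 | 0.9 | 84.7 | 9.0 | 0.0 | 0.3 | 0.0 | 5.0 |
| IC 7 | 0.6 | 66.7 | 22.4 | 0.0 | 1.2 | 0.1 | 9.0 |
| IC 8 | 0.2 | 73.0 | 5.3 | 0.0 | 0.4 | 0.3 | 20.9 |
| IC 9 | 0.9 | 93.5 | 0.4 | 0.0 | 0.8 | 0.1 | 4.3 |
| IC 10 | 6.8 | 50.6 | 1.3 | 0.4 | 8.2 | 0.3 | 32.2 |
| IC 11 | 7.6 | 83.7 | 1.2 | 0.3 | 1.2 | 0.1 | 5.9 |
| IC 12 | 12.7 | 77.4 | 0.2 | 3.3 | 1.2 | 0.1 | 5.2 |
| IC 13 | 14.0 | 40.1 | 0.9 | 0.2 | 5.3 | 3.2 | 36.4 |
| IC 14 | 12.3 | 63.0 | 0.7 | 1.1 | 3.5 | 0.1 | 19.2 |
| IC 15 | 1.0 | 37.9 | 13.2 | 0.0 | 0.3 | 0.4 | 47.2 |
| IC 16 | 1.3 | 90.1 | 1.9 | 0.0 | 0.8 | 0.3 | 5.6 |
| IC 17 | 2.4 | 60.2 | 3.1 | 0.1 | 0.2 | 0.6 | 33.5 |
| IC 18 | 5.7 | 47.7 | 2.8 | 0.5 | 1.5 | 0.6 | 44.2 |
| IC 19 | 1.9 | 67.7 | 1.1 | 0.0 | 0.4 | 0.8 | 28.1 |
| IC 20 | 0.7 | 82.2 | 0.9 | 0.2 | 0.3 | 0.8 | 15.1 |
| IC 21 | 0.3 | 91.1 | 1.9 | 0.0 | 0.0 | 0.5 | 6.1 |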
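Table 2. Classification performance on the HGD dataset using the proposed model.

| Metric | Class | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 | S13 | S14 | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | | 94.2 | 96.4 | 94.4 | 95.6 | 93.9 | 96.2 | 96.7 | 94.2 | 94.9 | 95.2 | 96.9 | 96.5 | 94.6 | 93.7 | 95.2 |
| K value | | 0.92 | 0.95 | 0.93 | 0.94 | 0.92 | 0.95 | 0.95 | 0.92 | 0.93 | 0.94 | 0.96 | 0.95 | 0.93 | 0.92 | 0.94 |
| Precision | LH | 0.93 | 0.97 | 0.93 | 0.96 | 0.93 | 0.97 | 0.96 | 0.93 | 0.95 | 0.96 | 0.97 | 0.96 | 0.95 | 0.93 | 0.95 |
| Precision | RH | 0.94 | 0.97 | 0.94 | 0.97 | 0.93 | 0.96 | 0.97 | 0.94 | 0.94 | 0.96 | 0.97 | 0.97 | 0.95 | 0.93 | 0.95 |
| Precision | FT | 0.95 | 0.96 | 0.95 | 0.96 | 0.95 | 0.96 | 0.98 | 0.96 | 0.95 | 0.96 | 0.98 | 0.97 | 0.96 | 0.94 | 0.96 |
| Precision | RT | 0.95 | 0.96 | 0.96 | 0.94 | 0.95 | 0.96 | 0.96 | 0.94 | 0.96 | 0.93 | 0.96 | 0.96 | 0.93 | 0.95 | 0.95 |
| Precision | Avg | 0.94 | 0.97 | 0.95 | 0.96 | 0.94 | 0.96 | 0.97 | 0.94 | 0.95 | 0.95 | 0.97 | 0.97 | 0.95 | 0.94 | 0.95 |
| Recall | LH | 0.94 | 0.95 | 0.94 | 0.95 | 0.93 | 0.96 | 0.96 | 0.95 | 0.94 | 0.93 | 0.97 | 0.96 | 0.93 | 0.93 | 0.95 |
| Recall | RH | 0.94 | 0.96 | 0.94 | 0.96 | 0.94 | 0.95 | 0.97 | 0.94 | 0.95 | 0.96 | 0.97 | 0.97 | 0.94 | 0.93 | 0.95 |
| Recall | FT | 0.95 | 0.97 | 0.95 | 0.97 | 0.94 | 0.97 | 0.98 | 0.94 | 0.95 | 0.96 | 0.98 | 0.97 | 0.97 | 0.94 | 0.96 |
| Recall | RT | 0.94 | 0.97 | 0.95 | 0.95 | 0.95 | 0.97 | 0.96 | 0.94 | 0.96 | 0.96 | 0.96 | 0.96 | 0.95 | 0.95 | 0.96 |
| Recall | Avg | 0.94 | 0.96 | 0.95 | 0.96 | 0.94 | 0.96 | 0.97 | 0.94 | 0.95 | 0.95 | 0.97 | 0.97 | 0.95 | 0.94 | 0.95 |
| F1-score | LH | 0.94 | 0.96 | 0.93 | 0.95 | 0.93 | 0.96 | 0.96 | 0.94 | 0.94 | 0.94 | 0.97 | 0.96 | 0.94 | 0.93 | 0.95 |
| F1-score | RH | 0.94 | 0.96 | 0.94 | 0.96 | 0.93 | 0.95 | 0.97 | 0.94 | 0.94 | 0.96 | 0.97 | 0.97 | 0.94 | 0.93 | 0.95 |
| F1-score | FT | 0.95 | 0.97 | 0.95 | 0.96 | 0.94 | 0.96 | 0.98 | 0.95 | 0.95 | 0.96 | 0.98 | 0.97 | 0.96 | 0.94 | 0.96 |
| F1-score | RT | 0.94 | 0.96 | 0.95 | 0.94 | 0.95 | 0.96 | 0.96 | 0.94 | 0.96 | 0.94 | 0.96 | 0.96 | 0.94 | 0.95 | 0.95 |
| F1-score | Avg | 0.94 | 0.96 | 0.94 | 0.95 | 0.94 | 0.96 | 0.97 | 0.94 | 0.95 | 0.95 | 0.97 | 0.97 | 0.95 | 0.94 | 0.95 |
| Misclassification Rate | | 0.058 | 0.036 | 0.056 | 0.044 | 0.06 | 0.038 | 0.033 | 0.058 | 0.05 | 0.048 | 0.031 | 0.035 | 0.054 | 0.063 | 0.0476 |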
Table 3. Accuracy comparison of classification performance on HGD dataset.

| Methods/Subjects | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepConvNet | 81.88 | 91.88 | 93.13 | 92.50 | 90.63 | 93.13 | 84.28 | 90.80 | 96.88 | 85.00 | 88.13 | 91.25 | 89.94 | 83.75 | 89.51 |
| EEGNet | 94.37 | 92.50 | 100 | 96.25 | 96.87 | 98.12 | 93.07 | 96.87 | 98.12 | 91.25 | 80.00 | 96.25 | 95.60 | 79.37 | 93.47 |
| CP-MixedNet | 88.75 | 90.00 | 95.63 | 91.25 | 95.00 | 91.25 | 88.05 | 93.13 | 95.00 | 88.75 | 75.63 | 93.75 | 89.31 | 78.13 | 93.70 |
| TS-SEFFNet | 90.69 | 93.53 | 98.53 | 96.88 | 92.90 | 93.53 | 92.40 | 91.78 | 96.88 | 89.88 | 92.78 | 95.40 | 93.03 | 87.34 | 93.25 |
| MBEEGNet | 95.02 | 95.02 | 100 | 99.40 | 98.17 | 98.80 | 93.13 | 95.52 | 98.18 | 92.14 | 89.43 | 96.02 | 94.45 | 88.88 | 95.30 |
| MBShallowCovNet | 98.25 | 96.23 | 98.80 | 98.18 | 97.65 | 96.90 | 93.80 | 97.00 | 97.52 | 92.50 | 80.78 | 96.25 | 95.62 | 92.04 | 95.11 |
| ANN + A-SVM | 94.2 | 96.4 | 94.4 | 95.6 | 93.9 | 96.2 | 96.7 | 94.2 | 94.9 | 95.2 | 96.9 | 96.5 | 94.6 | 93.7 | 95.24 |
Table 4. The comparison summary of classification performance among different models under different datasets.

| Dataset | Method: Feature Extraction | Method: Classification | Accuracy (%) | F1-Score | Reference |
|---|---|---|---|---|---|
| HGD | ShallowConvNet | CNN | 88.69 | 0.887 | Schirrmeister et al. [27] (2017) |
| HGD | DeepConvNet | CNN | 89.51 | 0.893 | Schirrmeister et al. [27] (2017) |
| HGD | EEGNet | CNN | 93.47 | 0.935 | Lawhern et al. [28] (2018) |
| HGD | CP-MixedNet | CNN | 93.70 | 0.937 | Li et al. [29] (2019) |
| HGD | TS-SEFFNet | CNN | 93.25 | 0.901 | Li et al. [29] (2019) |
| HGD | MBEEGNet | CNN | 95.30 | 0.954 | Altuwaijri and Muhammad [30] (2022) |
| HGD | MBShallowCovNet | CNN | 95.11 | 0.951 | Altuwaijri and Muhammad [30,31,32,33] (2022) |
| HGD | HOL–SSA–ORICA + CSP | ANN + A-SVM | 95.24 | 0.95 | - |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Antony, M.J.; Sankaralingam, B.P.; Khan, S.; Almjally, A.; Almujally, N.A.; Mahendran, R.K. Brain–Computer Interface: The HOL–SSA Decomposition and Two-Phase Classification on the HGD EEG Data. Diagnostics 2023, 13, 2852. https://doi.org/10.3390/diagnostics13172852

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.