In the field of biomedical signal processing, accurately detecting brain activity from electroencephalography (EEG) signals is the focus of much research. In brain-computer interface (BCI) applications based on event-related potentials (ERPs), fast and efficient feature extraction and classification of EEG signals to decode human intention are current research hotspots in this field [1]. For a real-world brain-computer interface, the ERP waveform must be extracted from the EEG signal of a single trial [2]. This is not an easy task, since the ERP is usually submerged in noise. Although the commonly used superposition-and-averaging method can remove part of the random noise, it requires EEG data from multiple trials to be superimposed before a result is obtained; the response speed of the entire BCI system therefore cannot be guaranteed, and the method cannot be applied directly to real-world BCI. Consequently, a feature extraction method is needed to extract classification features from the EEG signal and identify the ERP waveform [3].
1.1. Common EEG Feature Extraction Methods for ERP Classification
At present, one common category of methods uses temporal characteristics to extract ERP features from EEG. In 1988, Farwell and Donchin proposed stepwise discriminant analysis (SWDA) [4] to extract the P300 ERP component from EEG. For multi-channel EEG data, the SWDA method simply concatenates the channels into a matrix, without analyzing spatial characteristics. Later, Independent Component Analysis (ICA) was proposed for ERP feature extraction to overcome this disadvantage of SWDA. ICA can recover the distribution of EEG sources on the cortex and then find classification features related to the ERP. Jung et al. first used ICA to perform spatial analysis of ERPs, using spatial locations to find independent components that represent the ERP and thereby obtaining a clearer waveform from a single trial [5]. Gao et al. combined ICA with the Hilbert-Huang Transform (HHT) [6] to filter artifacts such as the electrooculogram from single-channel EEG and improved the signal-to-noise ratio of the ERP [7]. Lee et al. used one-unit ICA with a reference, a variant of ICA, for single-trial ERP extraction [8]; by analyzing the spatial distribution difference of each ERP waveform, they obtained a more significant difference between the deviant stimulus and the standard stimulus. Eilbeigi et al. used globally optimal constrained ICA to search for movement-related cortical potentials in single-trial EEG data and reached a higher accuracy in a motor-imagery classification experiment [9]. The major shortcoming of ICA-based methods is that ICA requires the separated components to be statistically independent. ICA also needs multi-channel EEG signals to be accurate, which further limits its application in BCI.
Another widely used category of methods is based on the statistical characteristics of the ERP waveform and the noise. Among them, component-estimation methods built on statistical principles, such as Kalman filtering and Bayesian estimation, can separate ERP waveforms from interfering noise and are therefore used to extract ERP classification features from EEG. Zhang et al. combined ICA with Kalman filtering to reduce the interference of white noise on ICA [10] and improve the performance of ICA-based ERP extraction. Fukami et al. used a particle filter to extract P300 waveforms [11] and obtained more accurate estimates of the latency and amplitude of the P300 component. Ting et al. used a Kalman filter to extract ERPs, adding the EM algorithm to the Kalman filter to achieve more accurate amplitude estimation [12]. Delaney-Busch et al. used Bayesian estimation to study semantic comprehension during learning through the trial-by-trial change of the N400 component in the ERP waveform, which cannot be observed with the superposition-and-averaging method [13]. Zeyl et al. used Bayesian ranking to calculate an event-related potential score for each trial and used these scores as time-domain features to improve the accuracy of a P300-based speller [14]. Methods of this kind can perform latency and amplitude estimation on a single signal frame and improve classification accuracy. Their main disadvantage is that the non-stationarity of EEG signals degrades their performance, so the ERP waveform estimation error cannot be bounded.
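To give a flavor of the statistical-estimation idea, the following is a minimal one-dimensional Kalman filter tracking a noisy amplitude; it is a generic textbook sketch, not the specific filters used in the cited works, and the random-walk state model and noise variances are illustrative assumptions:

```python
import numpy as np

def scalar_kalman(z, q=1e-3, r=1.0):
    """Minimal 1-D Kalman filter: random-walk state, noisy observations z.
    q is the process-noise variance, r the measurement-noise variance."""
    x, p = 0.0, 1.0                # initial state estimate and its variance
    out = np.empty_like(z, dtype=float)
    for k, zk in enumerate(z):
        p = p + q                  # predict step: state drifts as a random walk
        K = p / (p + r)            # Kalman gain
        x = x + K * (zk - x)       # update with the new observation
        p = (1.0 - K) * p
        out[k] = x
    return out

rng = np.random.default_rng(1)
true_amp = 2.0
z = true_amp + rng.normal(0.0, 1.0, 500)   # noisy amplitude observations
est = scalar_kalman(z)
```

The filtered estimate converges toward the true amplitude with far lower variance than the raw observations, which is the property the cited ERP amplitude-estimation methods exploit.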
The third category comprises feature extraction methods based on sparse modeling. Sparse modeling is an efficient representation method for high-dimensional data, and for EEG data in particular [15]. Its purpose is to approximate the input data with a linear combination of a small number of dictionary atoms. At the same time, the atoms must be adapted to the data; that is, they should capture certain essential characteristics of the data. The vector of linear-combination coefficients, also called the sparse representation vector, can itself be used as a classification feature. Dai et al. developed a personal identification system using a sparse-modeling-based compressed-sensing method for EEG signals [16]. Thanks to sparse modeling and compressed sensing, the amount of data transmitted while the system operates is reduced, so the system can use low-cost wearable EEG acquisition equipment and run over the Web, which is convenient in practice. Wu et al. used regularized group sparse discriminant analysis to classify EEG signals in a brain-computer interface paradigm and identify the P300 waveform in the EEG [17]. Mo et al. directly used sparse representation coefficients as classification features in a motor-imagery BCI [18]. Shin et al. added an incoherence measure to the sparse dictionary update and used the resulting dictionary to sparsely decompose EEG signals, obtaining better classification features [19]. Yuan et al. used kernel sparse representation to reconstruct EEG data and used the sparse reconstruction coefficients as classification features to identify EEG segments containing epileptic components [20]. Yu et al. used sparse representation to decompose EEG data for visual evoked potential (VEP) extraction [21]. Shin et al. applied sparse representation to a motor-imagery BCI system and constructed a dictionary from Gabor bases to extract recognizable waveforms from EEG [22]. However, because a fixed analytic basis is used, the waveform components are difficult to model accurately, and the essential nature of the signal cannot be expressed.
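The core operation of sparse modeling, approximating a signal with a few dictionary atoms, can be sketched with a minimal orthogonal matching pursuit (OMP) in NumPy; the random Gaussian dictionary and the exactly-sparse test signal are illustrative assumptions:

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x with k atoms of
    dictionary D (columns of D are unit-norm atoms)."""
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))  # atom most correlated with residual
        support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol          # re-fit on the whole support
    coef[support] = sol
    return coef

rng = np.random.default_rng(2)
n, m = 64, 128                                      # overcomplete: more atoms than samples
D = rng.normal(size=(n, m))
D /= np.linalg.norm(D, axis=0)                      # normalize atoms
true = np.zeros(m)
true[[5, 40, 99]] = [1.5, -2.0, 0.7]                # 3-sparse ground truth
x = D @ true                                        # noiseless sparse signal
coef = omp(D, x, k=3)
rec_err = np.linalg.norm(D @ coef - x)
```

The recovered coefficient vector is sparse, and it is this vector (or the atoms it selects) that the methods above use as classification features.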
However, due to the over-completeness of the sparse dictionary, the ERP waveform in an EEG signal is distributed across multiple dictionary atoms, which makes it difficult for the sparse reconstruction coefficients to serve as stable classification features. From the perspective of sparse decomposition theory, since these dictionary atoms reconstruct the EEG signal with low error, they must hold the information needed to identify the ERP. Therefore, an additional step is required to gather the ERP waveform information scattered among the dictionary atoms into classification features for ERP recognition.
Self-organizing mapping (SOM) is an appropriate approach to the problem of ERP information scattered among atoms. SOM produces a low-dimensional, typically two-dimensional, representation of the input space of the training samples. When the training samples are sparse dictionary atoms, SOM can combine the scattered ERP information into the code vectors of the SOM network. SOM was first proposed by Professor T. Kohonen of the University of Helsinki, Finland, in 1981 [23]. Kohonen argued that when a neural network receives external input patterns, it divides itself into distinct regions, each with its own response characteristics to the input, and that this process happens automatically. SOM is now a common method in biomedical signal analysis and is widely used in the analysis of neural activity data. Ngan et al. used SOM to analyze the time-domain activity waveform of each voxel in functional magnetic resonance imaging (fMRI) and clustered the SOM neurons by correlation to find patterns of voxel activity [25]. Wei et al. used SOM for hierarchical cluster analysis of spatio-temporal features in fMRI data to find classification features that represent cognitive activities [26]. Kurth et al. used SOM to perform cluster analysis on EEG signals collected in clinical settings, classifying EEG fragments containing epileptic electrical activity versus normal fragments [27]. Hemanth et al. first extracted features from EEG signals and then analyzed them with SOM to recognize human emotions [28]. Diaz-Sotelo et al. used SOM to extract features from EEG for a BCI system that recognizes human cognitive states [29]. These studies show that SOM can be used effectively in the analysis of biological signals, especially EEG.
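The SOM mechanism described above, code vectors on a 2-D grid pulled toward the inputs under a shrinking neighborhood, can be sketched as a minimal NumPy implementation; the grid size, learning-rate schedule, and synthetic two-cluster data are illustrative assumptions:

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Minimal SOM: each sample pulls its best-matching unit (BMU) and the
    BMU's grid neighbors toward it; the neighborhood shrinks over training."""
    rng = np.random.default_rng(seed)
    h, w = grid
    codes = rng.normal(size=(h * w, data.shape[1]))
    # (row, col) coordinates of each neuron on the 2-D grid
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)                    # decaying learning rate
        sigma = sigma0 * (1 - e / epochs) + 0.1        # decaying neighborhood width
        for x in rng.permutation(data):
            bmu = np.argmin(np.linalg.norm(codes - x, axis=1))
            d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            nb = np.exp(-d2 / (2 * sigma ** 2))        # Gaussian neighborhood weights
            codes += lr * nb[:, None] * (x - codes)
    return codes

# Two well-separated synthetic clusters: after training, the code vectors
# should sit near the data rather than at their random initialization.
rng = np.random.default_rng(3)
data = np.vstack([rng.normal(-3, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
codes = train_som(data)
```

After training, each code vector summarizes a neighborhood of the input space, which is how, with dictionary atoms as inputs, scattered ERP fragments can be merged into a few representative code vectors.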
1.2. The Proposed Method and Article Structure
In this paper, we propose a feature extraction method based on self-organizing mapping (SOM) of dictionary atoms. In this method, we first use the K-SVD dictionary learning algorithm to construct a sparse dictionary. Then, self-organizing mapping is performed on the dictionary atoms, and the code vector of each neuron is compared, as a time-domain waveform, with the target ERP waveform. The code vectors with the largest cosine similarity to the target ERP waveform are selected. For an EEG signal frame to be recognized, the cosine similarity between the frame and each selected code vector is calculated; these similarity values are the extracted classification features. Finally, a classifier is trained on these features to detect the ERP waveform. In the testing phase, the SOM, sparse dictionary, and classifier from the training phase are reused, and the feature extraction procedure is repeated for the EEG samples to be recognized. Compared with the three categories of methods discussed above, the proposed method has the following advantages: (1) it does not rely on multichannel data; (2) it can deal with non-stationary ERP waveforms; and (3) it makes full use of the ERP fragments stored in the sparse dictionary.
This article unfolds as follows: Section 2 provides a brief introduction to EEG sparse decomposition and the procedures of the proposed method. In Section 3, the experimental material is described and the results produced by the proposed method are presented. In Sections 4 and 5, we discuss the advantages of the proposed method and potential further improvements.