1. Introduction
Brain–computer interface (BCI) is a technology that allows users and computers to interact with each other through brain activity. An electroencephalogram (EEG) is used to record brain activity during a given BCI experimental task [1]. For example, users can move a cursor on the screen left and right by imagining left- and right-hand movements, respectively [2]. BCI therefore has a wide range of uses for patients with disabilities, such as those with severe neuromuscular disease or locked-in syndrome [3,4].
Many different types of EEG signals can be used in the BCI field, such as steady-state visual evoked potentials (SSVEP) [5], motor imagery (MI) [6], and P300 [7]. In this article, we are interested in the P300 EEG signal, which is based on event-related potentials. The P300 is a natural response of the brain to a specific external stimulus: in response, the EEG signal shows a positive peak about 300 ms after stimulation [8]. One of the main obstacles to the widespread use of BCI systems is the variability of EEG signals [1,9]. Due to this variability, the feature-space distributions of EEG signals collected from different subjects or different sessions are inconsistent [10]. In addition, a BCI system requires a long calibration phase before each use because, to achieve good performance, each subject's BCI system must be trained on that subject's own EEG signals and cannot use another's [11]. One potential solution to reduce or even eliminate the calibration phase is transfer learning. In this article, we mainly study offline transfer learning for the P300 EEG signal.
In the field of machine learning, transfer learning is defined as the ability to use knowledge learned in a previous task or domain in a new task or domain [12]. Transfer learning has received extensive attention in the BCI field for improving the generalization performance of classifiers. Pieter-Jan et al. [13] proposed combining a Bayesian model with learning from label proportions (LLP). Gayraud et al. [14] completed a cross-session transfer of P300 data using a nonlinear transform obtained by solving an optimal transport problem, reaching a highest AUC of 0.835 for one particular subject. Lu et al. [15] proposed an adaptive classification method: the classifier is initialized subject-independently and, after several minutes of online adaptation, its accuracy converges to that of a fully trained supervised subject-specific model. Morioka et al. [16] proposed learning a dictionary of spatial filters. Other transfer learning methods include semi-supervised learning [17], uniform local binary patterns [18], and artificial data generation [11]. However, one of the most promising approaches is the Riemannian Geometry method [19,20,21].
The Riemannian Geometry classifier is a promising new classification method in the BCI field. The main idea of Riemannian Geometry is to represent the data as symmetric positive definite (SPD) covariance matrices and then map these SPD matrices directly onto the Riemannian manifold. Data on the Riemannian manifold can be manipulated directly, including direct classification using the Riemannian distance. We further study the potential of this classifier in this article. Although the Riemannian Geometry method has achieved many good results in the BCI field, it still has shortcomings. If the data dimension is too large, the method requires many calculations, which is time-consuming and causes statistical deviations [22]. Therefore, the Riemannian Geometry method needs to be combined with dimensionality-reduction algorithms. Thus, we introduce XDAWN spatial filters. XDAWN spatial filters, specially designed for event-related potentials (ERP), were proposed by Rivet et al. [23]. They can enhance the P300 component and reduce the data dimensions, which suits our needs well. We then improve the RGC by affine-transforming the SPD covariance matrices of different subjects using each subject's own Riemannian Geometry mean (RGM), making the data from different subjects comparable. Finally, we use the Riemannian Geometry classifier to complete our transfer learning experiments on the P300-speller paradigm.
Naturally, the performance of a transfer learning algorithm largely depends on the relevance of the two tasks. For example, a P300-speller task performed by two different subjects is more relevant than a P300 task and an MI task performed by the same person. In this paper, transfer learning is defined as follows: the model is trained on Subject A and used to evaluate Subject B, where Subjects A and B come from the same dataset. The structure of this paper is as follows.
Section 2 introduces the method, datasets, and experimental design.
Section 3 presents the experimental results.
Section 4 presents the discussion.
Section 5 concludes the paper.
4. Discussion
This paper proposes an XDAWN + RGC transfer learning algorithm for the P300 EEG signal. The XDAWN spatial filter effectively improves the quality of the evoked P300 components by considering signal and noise simultaneously. XDAWN also greatly reduces the feature dimension for the subsequent Riemannian Geometry classifier and thereby improves its performance. After mapping the covariance matrices to the Riemannian manifold, we first perform an affine transformation on them, so that data from different subjects move in the same direction on the manifold, making the data comparable without changing the Riemannian distances or the geometric structure of the data. Several properties argue for the use of the Riemannian Geometry classifier. Due to its logarithmic nature, the Riemannian distance is robust to extreme values (noise). Moreover, the Riemannian distance between SPD matrices is invariant to matrix inversion and to any invertible linear transformation of the matrices [36]. These characteristics partially explain why Riemannian classification provides good generalization capabilities.
The results of the two experiments show that our proposed method greatly improves transfer learning performance compared with two classic classification methods, E-SVM and SWLDA. The highest average AUC reached 0.836, and the results also show that, with the small amount of data available in Experiment 1, our proposed transfer learning method already achieves fairly good performance. For a more intuitive understanding of the affine transformation, we visualized the data of two subjects, one from each dataset. From the visualizations of the two datasets (Figures 8 and 9), we can see that the covariance matrices after the affine transformation are more concentrated and more consistent in their spatial distribution, which demonstrates that the proposed affine transformation is effective.
The overall performance is good and stable because representing the data with covariance matrices better captures the correlations between features; when these covariance matrices are mapped onto the Riemannian manifold as points, their geometric structure is demonstrated clearly. The affine transformation can be performed on the data without changing its geometric properties, with the Riemannian Geometry mean serving as the reference matrix. We consider that, under the P300 task, the subject's mental state is relatively stable, and we use the Riemannian Geometry mean of all samples to capture this stable state. In addition, the Riemannian Geometry classifier has no parameters to train. We use XDAWN to enhance the P300 signal while reducing the data dimension, which greatly reduces the computational cost of the Riemannian Geometry classifier.
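The "no parameters to train" point can be illustrated with a minimal nearest-class-mean decision rule in the style of the MDM classifier: the only quantities stored are one mean covariance per class, and prediction is the label of the closest mean. The sketch below is illustrative only; for brevity it substitutes the log-Euclidean mean for the full Riemannian mean, and the toy data are not EEG:

```python
import numpy as np
from scipy.linalg import eigh

def riemannian_distance(A, B):
    # affine-invariant distance from the generalized eigenvalues of (B, A)
    lam = eigh(B, A, eigvals_only=True)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def log_euclidean_mean(covs):
    # exp(mean(log(C_i))): a fast one-shot surrogate for the Riemannian mean
    def logm(C):
        w, V = np.linalg.eigh(C)
        return (V * np.log(w)) @ V.T
    w, V = np.linalg.eigh(np.mean([logm(C) for C in covs], axis=0))
    return (V * np.exp(w)) @ V.T

def mdm_predict(C, class_means):
    # minimum distance to mean: assign the label of the closest class mean
    return min(class_means, key=lambda lbl: riemannian_distance(C, class_means[lbl]))

# toy trials: "target" covariances near 3*I, "nontarget" covariances near I
rng = np.random.default_rng(2)

def noisy(base):
    A = rng.standard_normal((4, 4))
    return base + 0.05 * (A @ A.T)  # SPD perturbation keeps the matrix SPD

means = {
    "target": log_euclidean_mean([noisy(3 * np.eye(4)) for _ in range(10)]),
    "nontarget": log_euclidean_mean([noisy(np.eye(4)) for _ in range(10)]),
}
label = mdm_predict(noisy(3 * np.eye(4)), means)
```

Once the class means are computed, there is nothing left to fit, which is why this family of classifiers needs no hyperparameter tuning or iterative training.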