Improvement of the Classification Accuracy of Steady-State Visual Evoked Potential-Based Brain-Computer Interfaces by Combining L1-MCCA with SVM

Canonical correlation analysis (CCA) has long been used in steady-state visual evoked potential (SSVEP) based brain-computer interfaces (BCIs). However, the reference signal of CCA is relatively simple and lacks subject-specific information. Moreover, over-fitting may occur when a short time window (TW) is used in CCA. In this article, an optimized L1-regularized multiway canonical correlation analysis (L1-MCCA) is combined with a support vector machine (SVM) to overcome these shortcomings of CCA. The correlation coefficients obtained by L1-MCCA were fed into an SVM classifier optimized by particle swarm optimization (PSO) to improve the classification accuracy. The performance of the proposed method was evaluated and compared with the traditional CCA and power spectral density (PSD) methods. The results showed that the accuracy of the L1-MCCA-PSO-SVM was 96.36% and 98.18% when the TW lengths were 2 s and 6 s, respectively. This accuracy is higher than that of the traditional CCA and PSD methods.


Introduction
Brain-computer interface (BCI) aims to create new communication pathways without depending on the brain's normal output pathways of peripheral nerves and muscles [1]. The BCI technology has rapidly become a popular direction in the field of neuroscience and neurorehabilitation research since it was proposed by Vidal [2] in the early 1970s. The BCI technology has been utilized in a variety of applications, such as medical [3,4], military [5,6], and daily life [7] scenarios. Over the past few years, researchers have developed BCI systems based on various neuroimaging techniques [8][9][10]. Among them, electroencephalography (EEG)-based BCI is the most popular method owing to its unique advantages, such as non-invasiveness, cost-effectiveness, portability, and high temporal resolution.
At present, the most commonly used EEG-based BCI paradigms include steady-state visual evoked potential (SSVEP) [11][12][13], P300 [14,15], and motor imagery [16,17]. Among them, SSVEP-based BCI has the advantages of simple preparation, high classification accuracy, high signal-to-noise ratio, short response time, and fewer training requirements [18]. SSVEP is an oscillating neural response caused by external stimuli flashing at 5~30 Hz [19]. The neural responses acquired by EEG have peak frequencies at the fundamental and harmonic frequencies of the flicker frequency. Therefore, the user's target can be identified by matching the characteristics of the acquired EEG signals to the flicker frequency associated with a particular command.
The power spectral density (PSD) method was one of the earliest methods applied to SSVEP. Since SSVEP itself carries the frequency characteristics of the visual stimuli, the frequency components corresponding to the stimulation frequency can be used as the characteristic values [20,21]. However, the PSD approach has some drawbacks, such as sensitivity to noise, a low signal-to-noise ratio, and the need for a relatively long time window (TW) when higher accuracy is required [22,23]. To overcome these problems, several techniques have been proposed in the past few years. Canonical correlation analysis (CCA), proposed by Lin et al. [24], analyzes SSVEP signals by studying the linear relationship between two sets of multidimensional vectors. CCA combined with multiway EEG data improves the signal-to-noise ratio. It has been found that CCA can achieve a higher classification accuracy than PSD with a shorter TW [25,26]. In recent years, studies based on CCA have provided valuable information in the field of BCI research. However, the reference signal of CCA is composed only of standard sine and cosine signals and lacks subject-specific information about the neural responses of the brain [27,28].
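The PSD approach described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the implementation used in this study: a single synthetic channel, a Hann-windowed periodogram computed with NumPy, and a hypothetical helper name `psd_detect`. The candidate list reuses the nine stimulation frequencies of this study; the synthetic signal and noise level are made up for illustration.

```python
import numpy as np


def psd_detect(signal, fs, freqs):
    """Pick the candidate frequency with the largest periodogram power."""
    n = signal.size
    # Hann-windowed periodogram of a single channel.
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(n))) ** 2 / n
    bins = np.fft.rfftfreq(n, d=1.0 / fs)
    # Read the power at the FFT bin closest to each candidate frequency.
    powers = [spectrum[np.argmin(np.abs(bins - f))] for f in freqs]
    return freqs[int(np.argmax(powers))], powers


# Synthetic 6 s signal at 1000 Hz: an 11.1 Hz sinusoid buried in white noise.
fs, seconds = 1000, 6
t = np.arange(fs * seconds) / fs
rng = np.random.default_rng(1)
sig = np.sin(2 * np.pi * 11.1 * t) + rng.standard_normal(t.size)

candidates = [6.8, 12.5, 12, 9.7, 11.1, 8.4, 10, 8, 14.7]
picked, powers = psd_detect(sig, fs, candidates)
print(picked)
```

Because the frequency resolution of a periodogram is the reciprocal of the window length, shortening the TW blurs neighbouring candidates together, which is one reason the PSD approach needs a relatively long TW.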
To further improve the performance of BCI systems, multiway canonical correlation analysis (MCCA) based on tensor analysis was proposed. Specifically, MCCA constructs an EEG tensor from the multiway EEG signals recorded over multiple trials and alternately learns, from the channel array of the EEG tensor and the test channel array, the projection vectors used for reference signal optimization. Based on tensor analysis [29] and sparse regularization, Zhang Yu et al. [30] proposed the L1-regularized multiway canonical correlation analysis (L1-MCCA). L1 regularization [31][32][33] was used to further strengthen MCCA's test channel array optimization, so that L1-MCCA learns better projection vectors for reference signal optimization. During training, L1-MCCA has a stronger ability to learn projection vectors and to prevent overfitting. SVM is suitable for small-sample classification problems, places no limitation on data dimension, has good generalization ability and robustness, and can learn multiple relations [34,35]. Inputting the correlation coefficient matrix obtained by L1-MCCA into an SVM for classification may therefore achieve better performance, since the advantages of the two methods can be combined to further improve the classification accuracy of the BCI system. Furthermore, the particle swarm algorithm was used to optimize the c and g parameters of the SVM to achieve higher accuracy.
In this study, a method combining L1-MCCA with a particle swarm optimization (PSO) [36][37][38] optimized SVM is proposed to improve the accuracy of SSVEP-based BCI. SSVEP data of 15 participants were recorded to investigate the reliability and performance of the L1-MCCA-PSO-SVM method. This study is expected to provide useful insights for improving the classification accuracy of SSVEP-based BCI.

Subjects and Study Design
Fifteen healthy subjects (nine males and six females, aged 22-34 years) were recruited for this study. All the subjects were right-handed and had normal or corrected-to-normal vision. They were in good health and had no history of epilepsy, neurological disorders, or psychiatric disorders. Written informed consent was obtained. Each subject was paid 100 Chinese Yuan. The experimental paradigms in this study were approved by the ethics committee of the Beijing Information Science and Technology University and the Institute of Automation, Chinese Academy of Sciences.
The experiment was conducted in a dark, acoustically shielded room. During the experiment, the subjects were required to sit comfortably and stay about 60 cm away from a computer monitor (resolution 1920 × 1080 pixels, screen size 27 inches). In this study, the visual stimuli were black and white checkerboard reversal designs with nine different frequencies: 6.8 Hz, 12.5 Hz, 12 Hz, 9.7 Hz, 11.1 Hz, 8.4 Hz, 10 Hz, 8 Hz, and 14.7 Hz, resulting in nine different conditions. The nine stimuli were arranged at disparate locations, as shown in Figure 1a. During the experiment, the subjects were required to focus on the target stimulus to induce the SSVEP signal. The entire experimental protocol consisted of an initial baseline period (60 s) followed by 135 sessions (comprised of 15 sessions for each of nine conditions). Each session consisted of a 3-s preparation period and a 6-s stimulation period. During the stimulation period, the participants were required to look at the target stimulation to evoke SSVEP. The nine different target stimulation conditions were reversed separately and randomly. The experimental paradigm is illustrated in Figure 1a. During the experiment, to maintain a high level of attention and minimize fatigue, the subjects were allowed a 2-min break after each session. Before the formal experiment, each subject was allowed to practice until they were familiar with the experimental requirements.

EEG Recording and Data Analysis
The SSVEP signals were recorded using the Brain Vision system (Brain Products Ltd., Munich, Germany) with 64 channels at a sampling rate of 5000 Hz. The electrodes arrangement was according to the international EEG system with 64 channels. The reference and ground electrodes were FCz and AFz, respectively. Six channels (O1, O2, Oz, PO3, PO4, and POz) were used to acquire the SSVEP signals. An electrooculographic (EOG) electrode was placed over the outer canthus of the right eye for blink-artifacts correction. The impedances of the electrodes were maintained below 10 kΩ. The layout of the electrodes is shown in Figure 1b.
The EEGLAB toolbox and the MATLAB 2019a platform (MathWorks Inc., Natick, MA, USA) were used to analyze the EEG data. First, the data were down-sampled to 1000 Hz and band-pass filtered between 0.5 and 49 Hz. Then the electrooculogram artifacts were rejected using independent component analysis (ICA). After that, the data were segmented into 6 s epochs (from the start to the end of the stimulation). In addition, baseline correction was applied to remove the DC offset after segmentation. Among the 15 participants, the data of two participants were discarded because of large motion artifacts. Another participant was unable to concentrate because of drowsiness and did not complete the entire experiment. The subsequent data processing was conducted using the data of the remaining 12 participants.
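The down-sampling, segmentation, and baseline-correction steps can be sketched in Python as follows. This is only an illustration under assumptions: the study used EEGLAB in MATLAB, whereas the sketch uses NumPy with a crude block-average down-sampler as a stand-in for proper resampling, omits band-pass filtering and ICA, and uses hypothetical helper names and made-up stimulation onsets (a 3 s preparation period followed by a 6 s stimulation period, repeated).

```python
import numpy as np


def block_average_downsample(data, factor):
    """Downsample (channels x samples) by averaging non-overlapping blocks."""
    n = (data.shape[1] // factor) * factor
    return data[:, :n].reshape(data.shape[0], -1, factor).mean(axis=2)


def extract_epochs(data, onsets, epoch_len):
    """Cut (channels x samples) into (epochs x channels x samples) and
    subtract each channel's epoch mean to remove the DC offset."""
    epochs = np.stack([data[:, s:s + epoch_len] for s in onsets])
    return epochs - epochs.mean(axis=2, keepdims=True)


fs_raw, factor = 5000, 5            # 5000 Hz recording, down-sampled to 1000 Hz
fs = fs_raw // factor
rng = np.random.default_rng(0)
# Synthetic stand-in for 20 s of six-channel EEG with a DC offset.
raw = rng.standard_normal((6, 20 * fs_raw)) + 3.0

down = block_average_downsample(raw, factor)
onsets = [3 * fs, 12 * fs]          # hypothetical stimulation onsets, in samples
epochs = extract_epochs(down, onsets, 6 * fs)
print(down.shape, epochs.shape)
```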

Algorithms
Canonical correlation analysis (CCA) is a multivariable statistical method [24] that can be used to investigate the underlying correlation between two sets of data. CCA has been utilized for analyzing the SSVEP signal by studying the linear relationship between two sets of multidimensional vectors, so as to find a pair of linear transformations that maximizes the correlation of the linear combinations. Specifically, given the two sets of multidimensional vectors $X$ and $Y$, CCA seeks the weight vectors $w$ and $v$ that maximize the correlation coefficient between $x = X^T w$ and $y = Y^T v$. The correlation coefficient between each reference signal $Y_n$ and the signal $X$ is calculated as follows:
$$\rho(x, y) = \max_{w, v} \frac{E[x^T y]}{\sqrt{E[x^T x]\, E[y^T y]}} = \max_{w, v} \frac{w^T X Y^T v}{\sqrt{w^T X X^T w \; v^T Y Y^T v}}$$
When CCA is applied to the SSVEP-based BCI [24], the signal $X$ is the acquired EEG data, and $Y_n$ is the reference signal at the $n$-th stimulation frequency $f_n$. $Y_n$ is composed of several sine and cosine signals [39]:
$$Y_n = \begin{bmatrix} \sin(2\pi f_n t) \\ \cos(2\pi f_n t) \\ \vdots \\ \sin(2\pi N f_n t) \\ \cos(2\pi N f_n t) \end{bmatrix}, \quad t = \frac{1}{F_S}, \frac{2}{F_S}, \ldots, \frac{K}{F_S}$$
where $f_n$ represents the $n$-th stimulation frequency, $N$ is the number of harmonics used for CCA, $K$ represents the signal length, and $F_S$ represents the sampling rate. The SSVEP target frequency (the stimulus frequency on which the subject focused) is determined by the following formula:
$$f_{\mathrm{target}} = \arg\max_{f_n} \rho_n, \quad n = 1, 2, \ldots, S$$
where $\rho_n$ is the maximum correlation coefficient between $X$ and $Y_n$, and $S$ represents the total number of stimulation frequencies used.
Since the reference signal in the standard CCA method is generated only by a series of sine and cosine signals, it does not take into account subject-specific and inter-trial differences. Therefore, the classification accuracy of CCA may not be good enough. A multiway extension of CCA (MCCA), which introduces the concept of tensor analysis, has been proposed to improve classification accuracy by optimizing the reference signals.
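The CCA-based frequency recognition described above can be sketched in Python. This is a minimal sketch under assumptions rather than the code used in this study: the function names are hypothetical, the canonical correlation is obtained from a QR decomposition followed by an SVD (a standard numerical route to the same maximum correlation), and the three-channel test signal is synthetic.

```python
import numpy as np


def make_reference(freq, n_harmonics, n_samples, fs):
    """Sine-cosine reference set of shape (2 * n_harmonics, n_samples)."""
    t = np.arange(1, n_samples + 1) / fs   # t = 1/F_S, 2/F_S, ..., K/F_S
    rows = []
    for h in range(1, n_harmonics + 1):
        rows.append(np.sin(2 * np.pi * h * freq * t))
        rows.append(np.cos(2 * np.pi * h * freq * t))
    return np.vstack(rows)


def max_canonical_corr(X, Y):
    """Largest canonical correlation between X (channels x samples)
    and Y (references x samples)."""
    # Center each row and orthonormalize both sets with a reduced QR.
    Xc = (X - X.mean(axis=1, keepdims=True)).T
    Yc = (Y - Y.mean(axis=1, keepdims=True)).T
    qx, _ = np.linalg.qr(Xc)
    qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(min(s[0], 1.0))


def detect_frequency(X, freqs, fs, n_harmonics=2):
    """Return the candidate frequency with the largest correlation."""
    rhos = [max_canonical_corr(X, make_reference(f, n_harmonics, X.shape[1], fs))
            for f in freqs]
    return freqs[int(np.argmax(rhos))], rhos


# Synthetic 2 s, three-channel signal at 1000 Hz with a 9.7 Hz component.
rng = np.random.default_rng(0)
fs, seconds, target = 1000, 2, 9.7
t = np.arange(1, fs * seconds + 1) / fs
X = np.vstack([np.sin(2 * np.pi * target * t + phase)
               + 0.5 * rng.standard_normal(t.size)
               for phase in (0.0, 0.4, 0.8)])

freqs = [6.8, 12.5, 12, 9.7, 11.1, 8.4, 10, 8, 14.7]
detected, rhos = detect_frequency(X, freqs, fs)
print(detected)
```

The vector `rhos` holds one correlation coefficient per candidate frequency; a vector of this kind is what would be passed on to a classifier when CCA is used as a feature extractor rather than as the final decision rule.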
An $N$-order tensor is expressed as $\mathcal{X} = (x)_{i_1 i_2 \ldots i_N} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$. The projection of $\mathcal{X}$ on a vector $w \in \mathbb{R}^{I_n}$ along the $n$-th mode is calculated as follows:
$$(\mathcal{X} \times_n w^T)_{i_1 \ldots i_{n-1} i_{n+1} \ldots i_N} = \sum_{i_n = 1}^{I_n} x_{i_1 i_2 \ldots i_N}\, w_{i_n}$$
A three-way tensor $\mathcal{X}$ (channel × time × trial) is formed from the multiway EEG data of multiple trials at a specific stimulation frequency and is analyzed together with the original sine-cosine reference signal set $Y$. By analyzing the multidimensional correlation between the three-way EEG tensor and the two-dimensional sine-cosine signal to find the optimal reference signal, MCCA seeks the linear transformations $w_1$, $w_3$ and $v$ that maximize the correlation between $\tilde{x} = \mathcal{X} \times_1 w_1^T \times_3 w_3^T$ and $\tilde{y} = v^T Y$. The maximum correlation coefficient between $\tilde{x}$ and $\tilde{y}$ is expressed by:
$$\rho = \max_{w_1, w_3, v} \frac{E[\tilde{x}\, \tilde{y}^T]}{\sqrt{E[\tilde{x}\, \tilde{x}^T]\, E[\tilde{y}\, \tilde{y}^T]}}$$
Then the optimized reference signal $z$ can be calculated from the optimal linear transformations $w_1$ and $w_3$:
$$z = \mathcal{X} \times_1 w_1^T \times_3 w_3^T$$
On the one hand, CCA can maximize the correlation between the experimental test data and the optimized reference signal to achieve classification; on the other hand, CCA can be used as a feature extraction method. Specifically, the correlation coefficients calculated by CCA can be sent to the SVM classifier for further classification. The classification process is shown in Figure 2a. However, in MCCA, the optimized projection vector for each dimension is learned without regularization and hence lacks the sparsity that can provide greater interpretability for features. Therefore, the reference signal is further optimized through L1 regularization. The essence of regularization is to constrain the parameters to be optimized and prevent over-fitting.
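Since the scaling of $w$ and $v$ does not affect the correlation, the CCA maximization in Equation (2) can be carried out under the constraint $w^T X X^T w = v^T Y Y^T v = 1$, so that $w^T X Y^T v$ is maximized. MCCA is expressed as the following least-squares formulation:
$$\min_{w_1, w_3, v} \frac{1}{2} \left\| \mathcal{X} \times_1 w_1^T \times_3 w_3^T - v^T Y \right\|_2^2 \quad \text{s.t.} \quad \|w_1\|_2 = \|w_3\|_2 = 1$$
The optimized formulation using L1 regularization is:
$$\min_{w_1, w_3, v} \frac{1}{2} \left\| \mathcal{X} \times_1 w_1^T \times_3 w_3^T - v^T Y \right\|_2^2 + \lambda_1 \|w_1\|_1 + \lambda_2 \|v\|_1 + \lambda_3 \|w_3\|_1 \quad \text{s.t.} \quad \|w_1\|_2 = \|w_3\|_2 = 1$$
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the regularization parameters that control the sparsity of $w_1$, $v$, and $w_3$, respectively. L1-MCCA obtains the correlation coefficients corresponding to the different reference signal frequencies by leave-one-out cross-validation. The EEG signals of the 15 collected sessions were split into two parts: 14 sessions were used as the training data and the remaining one as the test data. The classification accuracy was then calculated. This process was repeated 15 times, yielding an average accuracy.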
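The leave-one-out procedure described above (hold one session out, train on the rest, repeat for every session, and average) can be sketched generically in Python. The sketch does not implement L1-MCCA itself: a toy nearest-mean classifier on one-dimensional features stands in for the training and test steps, and all names and numbers are hypothetical.

```python
def leave_one_out_accuracy(samples, labels, fit, predict):
    """Hold out each sample once, train on the rest, and average the hits."""
    n = len(samples)
    correct = 0
    for held_out in range(n):
        train_x = samples[:held_out] + samples[held_out + 1:]
        train_y = labels[:held_out] + labels[held_out + 1:]
        model = fit(train_x, train_y)
        if predict(model, samples[held_out]) == labels[held_out]:
            correct += 1
    return correct / n


def fit_nearest_mean(xs, ys):
    """Toy stand-in for training: store the mean feature of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_nearest_mean(means, x):
    """Assign the class whose stored mean is closest to x."""
    return min(means, key=lambda y: abs(means[y] - x))


# Toy features and labels, not data from the study.
features = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
labels = ["a", "a", "a", "b", "b", "b"]
acc = leave_one_out_accuracy(features, labels, fit_nearest_mean, predict_nearest_mean)
print(acc)
```

With 15 sessions, the same loop would train on 14 sessions and test on the remaining one, 15 times in total.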
The main purpose of SVM is to find the separating hyperplane with the maximum margin in the feature space. SVM has strong generalization ability, is easy to apply, and is suitable for small-sample classification problems and nonlinear features. To improve the classification accuracy of SSVEP, the PSO algorithm was used to optimize the penalty parameter c of the SVM classifier and the parameter g of the kernel function [40]. The PSO algorithm first initializes a group of particles in the feasible solution space; each particle represents a potential optimal solution of the optimization problem, that is, a potential optimal pair of parameters c and g. The three indicators of position, velocity, and fitness value are used to express the characteristics of a particle, and the fitness value represents the accuracy rate. When a particle moves in the solution space, its position is updated by tracking the individual extremum $P_{best}$ and the group extremum $G_{best}$.
During each iteration, the particle updates its velocity and position through the individual and group extreme values according to:
$$V_{id}^{k+1} = w V_{id}^{k} + c_1 r_1 \left(P_{id}^{k} - X_{id}^{k}\right) + c_2 r_2 \left(P_{gd}^{k} - X_{id}^{k}\right)$$
and
$$X_{id}^{k+1} = X_{id}^{k} + V_{id}^{k+1}$$
where $w$ is the inertia factor, a non-negative number that describes how much of the previous velocity is inherited; $c_1$ and $c_2$ are learning factors; $r_1$ and $r_2$ are random numbers in $[0, 1]$; $V_{id}^{k}$ is the velocity vector of the particle after the $k$-th iteration; $P_{id}^{k}$ is the best position of particle $i$ after $k$ iterations; $X_{id}^{k}$ is the position vector of the particle after $k$ iterations; and $P_{gd}^{k}$ is the best position of the group after $k$ iterations.
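The velocity and position updates above can be sketched as a small PSO loop in plain Python. This is an illustrative sketch under assumptions: the swarm size, inertia factor, learning factors, and search bounds are arbitrary choices, positions are clipped to the bounds, and the fitness is a toy function with a known maximum at (c, g) = (10, 0.5). In the setting of this study, the fitness would instead be the classification accuracy of an SVM trained with the candidate parameters c and g.

```python
import random


def pso_maximize(fitness, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` over a box given as a list of (low, high) pairs."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]                      # individual extrema
    p_best_fit = [fitness(p) for p in pos]
    g_idx = max(range(n_particles), key=lambda i: p_best_fit[i])
    g_best, g_best_fit = p_best[g_idx][:], p_best_fit[g_idx]   # group extremum

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus attraction to individual and group bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (p_best[i][d] - pos[i][d])
                             + c2 * r2 * (g_best[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and keep it inside the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            fit = fitness(pos[i])
            if fit > p_best_fit[i]:
                p_best[i], p_best_fit[i] = pos[i][:], fit
                if fit > g_best_fit:
                    g_best, g_best_fit = pos[i][:], fit
    return g_best, g_best_fit


def toy_fitness(params):
    """Toy stand-in for classification accuracy, peaked at c = 10, g = 0.5."""
    c, g = params
    return -((c - 10.0) ** 2 + (g - 0.5) ** 2)


best, best_fit = pso_maximize(toy_fitness, bounds=[(0.1, 100.0), (0.01, 10.0)])
print(best, best_fit)
```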
Figure 2. (a) [...] $w_3$ and $v$) and the original sine-cosine reference signal ($Y$), composed of part of the experimental sequence, are optimized to obtain the optimized reference signal ($z$), and the correlation with the experimental signal is maximized. Classification can be performed by MCCA with or without SVM. (b) L1-MCCA method for reference signal optimization. Tensor-based L1 regularization was introduced to further enhance the optimization of the MCCA test channel array. The correlation between the experimental signal ($X$) and the optimized reference signals ($z_1, z_2, \ldots, z_N$) was maximized, and finally the correlation coefficients were input to the PSO-optimized SVM.
In this study, a method combining L1-MCCA with a PSO-optimized SVM was proposed to improve the classification accuracy. The experimental data were divided into a training set and a test set at a ratio of 7:3. PSO was used to optimize the parameters of the SVM. The number of iterations was set to 200, the penalty parameter c and the kernel parameter g were selected as the optimization variables, and the best accuracy rate was used as the fitness function. The optimized c and g parameters were passed to the SVM for training, and the accuracy was then obtained on the test set. The RBF function was used as the kernel function of the SVM. To reduce the variability of the results, the calculation was repeated ten times, and the average was taken as the final accuracy. In addition, to verify the reliability and performance of the proposed method, it was compared with the traditional CCA and PSD methods.

Statistical Analysis
In this study, all data are indicated as mean ± standard error unless mentioned separately. The paired t-test was used to characterize the differences between different conditions. Differences were accepted as significant when p < 0.05.
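As an illustration of the statistics described above, the following Python sketch computes the mean, the standard error, and the paired t statistic using only the standard library. The helper names and the paired values are toy examples, not data from this study. The p-value would then be read from a t distribution with n - 1 degrees of freedom, for example with `scipy.stats.ttest_rel`, which is not used here.

```python
import math
import statistics


def mean_and_se(values):
    """Mean and standard error (sample standard deviation / sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def paired_t_statistic(a, b):
    """t statistic and degrees of freedom of a paired t-test on a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_d, se_d = mean_and_se(diffs)
    return mean_d / se_d, len(diffs) - 1


# Toy paired values, not data from the study.
after = [2.0, 4.0, 6.0, 8.0, 10.0]
before = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, dof = paired_t_statistic(after, before)
print(round(t_stat, 4), dof)
```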

Results
The individual classification accuracies of the 12 subjects within different TWs for CCA and L1-MCCA are shown in Figure 3. This figure shows that when the TW increased, the accuracies of the CCA and the L1-MCCA methods both improved. Additionally, the accuracies of the L1-MCCA were higher than those of the CCA.
Appl. Sci. 2021, 11, x FOR PEER REVIEW
The correlation coefficients corresponding to the different reference signal frequencies derived from CCA and L1-MCCA for a typical subject (S1) are shown in Figure 4. The classification performance was evaluated by comparing the correlation coefficient values of the two methods under different reference frequencies. Unlike CCA, L1-MCCA obtains the correlation coefficients corresponding to the different reference signal frequencies (6.8, 12.5, 12, 9.7, 11.1, 8.4, 10, 8, and 14.7 Hz) through leave-one-out cross-validation. It can be seen from Figure 4 that, for both methods, the correlation coefficient between the target frequency and the corresponding reference signal is much higher than those corresponding to the other frequencies. At each frequency, the greater the difference between the highest value and the other values, the more distinct the classification. The comparison of the two techniques shows that L1-MCCA separates the target frequency more clearly, and the classification accuracy based on L1-MCCA is therefore expected to be higher than that of CCA.
To test the feasibility and evaluate the performance of the method proposed in this study, the different techniques were compared quantitatively in terms of classification accuracy. The classification accuracies of the different methods with different TWs are shown in Figure 5. As shown in Figure 5, for each method considered alone, the longer the TW, the more experimental information is obtained, and the higher the corresponding classification accuracy. For the PSD method, the accuracy with a 1 s TW was 37.58% and increased to 39.70% after optimization. The accuracy with a 2 s TW was significantly improved to 70.91% (p < 0.05), and further improved to 73.03% after optimization. When the TW was longer than 3 s, the classification accuracies remained stable, and the accuracies with optimization were higher than those without optimization.
For the CCA method, the improvement obtained from PSO-SVM optimization was significantly larger than for the PSD method. Specifically, the accuracy with a 1 s TW was significantly improved from 31.91% to 52.42% (p < 0.05) after optimization. The accuracy with a 2 s TW was significantly increased to 77.28% (p < 0.05) before optimization and to 94.85% after optimization (p < 0.05). In addition, for the TWs longer than 3 s, the classification accuracies of the PSO-SVM optimized method were significantly higher than those of the unoptimized CCA method. Specifically, the accuracies were improved from 88.15% to 96.67% (3 s TW, p < 0.05), from 88.58% to 95.76% (4 s TW, p < 0.05), from 87.41% to 95.96% (5 s TW, p < 0.05), and from 87.72% to 96.36% (6 s TW, p < 0.05), respectively.
For the L1-MCCA method, the effect of PSO-SVM optimization was more distinct. The accuracy was significantly improved from 40.62% to 65.76% (p < 0.05) with a 1 s TW. The accuracy with a 2 s TW was significantly (p < 0.05) improved to 80.62% before optimization. After optimization, it was further significantly increased to 96.36% (p < 0.05). After that, the classification accuracies for the TWs longer than 3 s were stable, and the accuracies after PSO-SVM optimization were higher than before, significantly so except at 3 s. Specifically, the accuracies were increased from 91.36% to 96.97% (3 s TW, p = 0.056, close to significance), from 93.40% to 98.18% (4 s TW, p < 0.05), from 92.10% to 97.58% (5 s TW, p < 0.05), and from 94.63% to 98.18% (6 s TW, p < 0.05), respectively.
As shown in Figure 5, the classification accuracies of the L1-MCCA-related methods were higher than those of the CCA-related and PSD-related methods. More importantly, the PSO-SVM optimization further improved the classification accuracy of the PSD, CCA and L1-MCCA methods, indicating that combining these methods with a classifier is effective. The PSO-SVM optimization provides new ideas for classification in SSVEP-based BCI systems.

Discussion
In the past few years, several methods have been proposed for improving the classification performance of SSVEP-based BCI. SSVEP analysis has gradually shifted from one-way optimization to collaborative multiway optimization. Moreover, discriminant analysis of regularized tensors has begun to appear and has been applied to classifiers to prevent overfitting. In this study, an approach that combines L1-MCCA with a PSO-optimized SVM was proposed.
The performance of L1-MCCA was compared quantitatively with that of CCA and PSD before and after PSO-SVM optimization. The classification accuracies of the L1-MCCA-related methods with varied TWs were consistently higher than those of the CCA-related and PSD-related methods. Our results are in agreement with previous research [30,41,42], which has reported that the L1-MCCA method can improve the classification accuracy of SSVEP-based BCI by optimizing the reference signal. As shown in Figure 4, the correlation coefficients corresponding to the different stimulation frequencies were calculated for both L1-MCCA and CCA. The results showed that, compared with CCA, L1-MCCA increases the correlation coefficient of the target frequency and reduces the correlation coefficients of the non-target frequencies, indicating the effectiveness of L1-MCCA in SSVEP classification.
The key to achieving better classification performance is to separate the target frequency from the non-target frequencies more accurately. The greater the difference between their correlation coefficients, the better the classification performance. The results showed that the combination of multiway analysis and regularization allowed L1-MCCA to perform better than the traditional CCA method.
The length of the TW has a significant influence on the classification accuracy of SSVEP-based BCI. As shown in Figures 3 and 4, the classification accuracies gradually increased as the length of the TW increased. In this study, nine visual stimulations with different frequencies were used to produce the SSVEP data. The accuracy of CCA with a 4 s TW was 88.58%. Islam et al. [43] reported that with a CCA-based method, the average accuracy of 10 subjects with a 4 s TW for 12 stimuli reached about 93%, which was higher than the accuracy in this study. Chen et al. [44] reported that with four stimuli, a CCA-based method reached 62% accuracy with a 4 s TW for nine subjects. The effect of TW on classification in the current study is in agreement with these previous studies: the longer the TW, the more experimental information is obtained, and the higher the classification accuracy. The differences in classification accuracy among studies can be ascribed to several factors, such as differences in experimental design, data processing methods, and individual subjects.
Unlike the traditional CCA, which uses standard sine and cosine waves as reference signals, the L1-MCCA method optimizes the reference signals with subject-specific information to obtain better performance. The results of the current study are in line with the study by Zhang Yu et al. [30]. The average accuracy of L1-MCCA obtained for 10 subjects with a 4 s TW, about 91%, was higher than that of MCCA and CCA, showing the effectiveness of L1-MCCA in SSVEP-based BCI. In this study, because PSO offers advantages such as fast convergence and simple computation, a PSO-optimized SVM classifier was combined with L1-MCCA to further improve the classification accuracy. To test the feasibility and evaluate the performance of the proposed algorithm, the different methods were compared quantitatively in terms of classification accuracy. As shown in Figure 5, at the same TW, the accuracy of L1-MCCA was higher than that of CCA and PSD. Moreover, the classification accuracies of L1-MCCA, CCA, and PSD with optimization were consistently higher than those without optimization. These findings indicate that L1-MCCA is better than CCA and PSD in the classification of SSVEP signals. In addition, combining L1-MCCA with the SVM classifier significantly improved the accuracy of BCI classification. In further exploratory studies, we will try different classifiers or optimization algorithms, aiming to provide useful insights into improving the accuracy of SSVEP-based BCI classification.
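The PSO step discussed above can be sketched as a small particle swarm search over two SVM hyperparameters, expressed as log10(C) and log10(gamma). This is an illustrative sketch under stated assumptions, not the implementation used in this study: the inertia and acceleration constants, the search bounds, and the function names are hypothetical, and `surrogate_cv_error` is a made-up smooth stand-in for the cross-validated SVM error so that the example runs without a dataset.

```python
import numpy as np

def pso_minimize(objective, lower, upper, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest_pos = pos.copy()                               # personal bests
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])

    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + pull toward personal and global bests.
        vel = (inertia * vel
               + c1 * r1 * (pbest_pos - pos)
               + c2 * r2 * (gbest_pos - pos))
        pos = np.clip(pos + vel, lower, upper)           # stay inside the bounds

        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]

        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    return gbest_pos, gbest_val

def surrogate_cv_error(p):
    """Hypothetical error surface with its minimum at log10(C)=1, log10(gamma)=-2."""
    log_c, log_gamma = p
    return 0.02 + 0.05 * (log_c - 1.0) ** 2 + 0.08 * (log_gamma + 2.0) ** 2

# Search log10(C) in [-2, 3] and log10(gamma) in [-4, 1] (assumed bounds).
best_pos, best_val = pso_minimize(surrogate_cv_error, [-2.0, -4.0], [3.0, 1.0])
print("C =", 10 ** best_pos[0], "gamma =", 10 ** best_pos[1], "error =", best_val)
```

To tune a real classifier, `objective` could be replaced by a function that trains an SVM with C = 10**p[0] and gamma = 10**p[1] on the correlation-coefficient features and returns one minus the cross-validated accuracy.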
Other research directions based on the L1-MCCA algorithm have also been explored. One study proposed a sparse Bayesian learning L1-MCCA (SBMCCA), which improves computational efficiency while maintaining a given level of classification accuracy [42]. In addition, Zhao Jing et al. [45] proposed a new decision selector (DMS) that integrates the classification decisions of different CCA-based frequency recognition methods (seven types, including CCA, MsetCCA, and L1-MCCA) for SSVEP. The results showed that DMS-ECCA&TRCA obtained the best classification accuracy, reaching 98.3%. This indicates that combining multiple techniques is also a viable research direction.
One limitation of this study is the relatively small sample size and the small number of stimulus categories. In further studies, larger samples and more stimulus categories will be used to confirm the current findings. The second limitation is that filtering the short data segments used in BCI scenarios introduces edge effects, which may affect decoding accuracy. In a further exploratory study, advanced data processing methods will be utilized to overcome this issue. An additional limitation is that the different stimulations were presented separately and randomly rather than simultaneously, which does not exactly match a real BCI setting. Despite these limitations, this study still provides some insights into the field of BCI. In future work, we will adopt advanced technologies and data processing methods to make the L1-MCCA-based SVM method more applicable clinically.

Conclusions
In this study, the combination of L1-MCCA and a PSO-optimized SVM was proposed to improve the classification accuracy of SSVEP-based BCI. The method introduces the tensor concept to construct multiway EEG data and seeks to maximize the correlation between these data and the sine and cosine reference signals. L1 regularization gives the method a stronger learning ability on the training data. The optimized reference signal is then used with the original CCA for classification. In essence, the method optimizes the reference signal by incorporating information from the experimental data. The results show that the proposed L1-MCCA-PSO-SVM method further increased the classification accuracy of SSVEP-based BCI compared with the traditional CCA and PSD methods. The proposed method reached 96.36% accuracy with the 2 s TW and 98.18% accuracy with the 6 s TW. In a further study, advanced methodologies and technologies will be used to further improve the performance and to promote the applications of SSVEP-based BCI.

Institutional Review Board Statement: The experimental paradigms in this study were approved by the ethics committee of the Beijing Information Science and Technology University and the Institute of Automation, Chinese Academy of Sciences.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the patient(s) to publish this paper.