An N100-P300 Spelling Brain-Computer Interface with Detection of Intentional Control

A brain-computer interface (BCI) is a tool for communicating with a computer via brain signals without the user making any physical movements, thus enabling disabled people to communicate with their environment and with others. P300-based ERP spellers are widely used visual spelling BCIs that rely on the P300 component of the event-related potential (ERP). However, they have a technical problem in that at least 2√N flashes are required to present N characters. This limits the achievable accuracy and restricts the typing speed. To address this issue, we propose a method that uses N100 in addition to P300. We utilize novel stimulus images to detect the user's gazing position by using N100. By using both P300 and N100, the proposed visual BCI reduces the number of flashes and improves the accuracy of the P300 speller. We also propose using N100 to classify non-control (NC) and intentional control (IC) states. In our experiments, the detection accuracy of N100 was significantly higher than that of P300 and the proposed method exhibited a higher information transfer rate (ITR) than the P300 speller.


Introduction
The brain-computer interface (BCI) is an alternative communication pathway for communicating with and controlling devices by discriminating brain signals without the user making any physical movements. The major goal of BCI research is to develop applications that enable disabled or elderly users to communicate with others and control their limbs and/or the environment [1]. Various types of brain signals have been utilized to realize BCIs, such as the P300 event-related potential (ERP), steady state visual evoked potential (SSVEP), auditory steady state response (ASSR), and µ-rhythms from the sensorimotor cortex [2], and various systems have been used to measure them, including electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI). In this paper, we focus on an EEG-based BCI system. EEG-based ERP spellers have been extensively used because of their simplicity and high accuracy. Most ERP spellers use P300 evoked by counting the number of times the target is intensified to detect the desired target command [3,4]. The P300 speller proposed by Farwell and Donchin is a well-known BCI system using P300 [3]. A 6 × 6 matrix containing target characters is used for stimulation. Each row and column of the matrix is flashed in random order and the user silently counts the number of times the desired character is presented. The desired character is determined by detecting P300 evoked by the mental task. In [4], early ERP components such as P1, N1, and P2 are used in addition to P300 as features to detect the target command.
GeoSpell (a geometric speller) is an alternative visual ERP-based spelling system. In the GeoSpell interface, each of the N² characters is assigned to two of the 2N groups, which are arranged in a circle. The user silently counts the number of target stimuli containing the target character in the same manner as the P300 speller. The advantage of GeoSpell is that the user is not required to perform direct eye-gazing. In addition, the probability that an identical target stimulus flashes twice in succession is lower than with the conventional P300 speller [5]. Another promising BCI is the ERP-based Hex-o-Spell. In Hex-o-Spell, the target is determined in two stages. First, a character group containing the target character is selected, and after that, the individual target is determined [4]. In GeoSpell and Hex-o-Spell, a visual stimulus is presented from the center of the screen so that users can fixate on a dot in the center of the screen and focus on the target in their visual periphery.
Existing ERP spellers have several drawbacks: (i) at least 2√N flashes are required to present N commands; (ii) since the stimuli containing a group (e.g., row or column) of the characters flash randomly, at least one character flashes twice in a row in some ERP spellers (including the P300 speller); and (iii) at least two counting tasks are required to type one character, such as counting the row and the column flashes in the matrix of the P300 speller. In Section 2, we discuss these drawbacks in detail.
Hybrid BCIs, which combine two or more BCI paradigms, have been proposed [6]. Some hybrid BCI studies aim to improve the ITR by combining multiple BCI paradigms [7][8][9]. Allison et al. improved reliability by using the SSVEP and event-related desynchronization (ERD) paradigms, especially for users who do not exhibit adequate BCI performance with a single BCI paradigm [10]. Panicker et al. utilized SSVEP to detect the control state in a P300-based ERP speller [11].
In this paper, we propose a new visual ERP speller using N100 in addition to P300, along with efficient visual stimulus images for this purpose. N100 is a kind of visual evoked potential (VEP) that is evoked with P1 and P2 [12]. Unlike P300, N100 is evoked merely by paying attention to the visual stimulus, with no counting task. To the best of the authors' knowledge, this is the first work to use N100 as a feature to classify BCI commands. In the proposed paradigm, unlike [4], P300 and N100 are independently used to determine the target character. By utilizing the two features independently, the proposed BCI overcomes the above drawbacks of the conventional ERP speller. In Section 5, we show through a preliminary experiment that N100 is discriminable, and in Sections 5.1.2 and 5.2, we present two sets of experimental results demonstrating that the ITR of the proposed method improves upon that of the P300 speller by 15 bit/min on average. The proposed method is not a hybrid BCI, because N100 is difficult to use as the sole basis of a BCI.
We furthermore propose using N100 to realize a self-paced (asynchronous) BCI [13,14]. When individuals use an input device, they are not constantly sending information; sometimes they pause to rest, think, and wait for a response. Therefore, classifying non-control (NC) and intentional control (IC) states is required for a practical BCI. Although the original asynchronous BCI does not require a predefined time frame, we here consider classifying NC/IC states using a short time frame (3.4-4.5 s). In previous studies, classifying NC/IC states was done using stopping criteria such as thresholding of the peak amplitude of P1 and N1 or of the outputs of the classifier [14,15]. In these methods, however, it is necessary to tune the threshold depending on the experimental environment and conditions each time. Therefore, we here propose a machine learning-based NC/IC classification method that uses P300 and N100. The classification results of NC/IC states are discussed in Section 5.2. Our preliminary ideas have been published in conference publications [16,17]. In this paper, we systematize our frameworks and add experimental results to show the discriminability of N100, together with detailed experimental results.

ERP Speller
P300 is a positive deflection in the ERP that appears approximately 300 ms after the onset of stimuli. The oddball paradigm is used to observe P300 [18]. P300 is elicited if a user is actively trying to detect the targets. The mental task of counting the number of target stimuli is often used for BCI. P300 is evoked by not only visual but also auditory [19] or tactile [20] stimuli.
The P300 speller is a classical spelling BCI proposed by Farwell and Donchin in 1988. A 6 × 6 matrix containing alphanumeric characters is arranged on a display as shown in Figure 1. Each row and column, containing six characters, is flashed in a random order. The user performs a mental task such as counting how many times the desired character is presented. The system detects the P300 evoked by the counting task in the responses to the target row and column, and their intersection determines the target character [3]. An example of the detection process of the desired character "K" is given in Figure 2. GeoSpell and Hex-o-Spell are improved versions of the ERP speller. They do not require eye-gaze control. The performance of BCIs is usually evaluated by the information transfer rate (ITR) as well as the classification accuracy of discriminating the target character. The ITR depends on three factors: typing speed, classification accuracy, and the number of commands [21]. In the ITR definition of Equation (1), T (s) is the time of one session, P is the classification accuracy, and N is the number of commands.
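Assuming that Equation (1) follows the standard Wolpaw definition of the ITR (an assumption made here, since only its symbols T, P, and N are described in the text), it can be written in bits per second as follows; multiplying by 60 gives bits per minute.

```latex
\begin{equation}
\mathrm{ITR} = \frac{1}{T}\left[\log_2 N + P\log_2 P + (1-P)\log_2\frac{1-P}{N-1}\right]
\end{equation}
```

The bracketed term is the number of bits conveyed by one selection among N commands at accuracy P; dividing by the session time T turns it into a rate.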
Although ERP spellers are widely used because of their simplicity and high ITR, they have several technical problems, as stated in the Introduction. The first is that ERP spellers require at least 2√N flashes to present N commands. Suppose that the classification accuracy is P = 0.9 and the stimulus onset asynchrony (SOA) is t0 = 187.5 ms, so that it takes T = t0 × 2√N to present all commands. Figure 3 shows the relationship between N and ITR obtained by Equation (1). This figure suggests that making the matrix larger than 3 × 3 (nine commands) does not improve the ITR. Moreover, the accuracy P is expected to be lower for large N because the number of classes increases with N. This is the main limitation of the ERP speller.
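The trend described for Figure 3 can be reproduced numerically. The following is a minimal sketch, assuming the standard Wolpaw ITR definition, a square row/column speller with 2√N flashes per pass, and a single pass per selection; the function names are illustrative and not taken from the paper.

```python
import math


def wolpaw_bits(n_commands: int, accuracy: float) -> float:
    """Bits per selection under the standard Wolpaw definition (assumed here)."""
    if n_commands < 2 or not 0.0 < accuracy <= 1.0:
        raise ValueError("need n_commands >= 2 and 0 < accuracy <= 1")
    bits = math.log2(n_commands)
    if accuracy < 1.0:
        bits += accuracy * math.log2(accuracy)
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_commands - 1))
    return bits


def row_column_itr(side: int, accuracy: float = 0.9, soa_s: float = 0.1875) -> float:
    """ITR in bits/s of a side x side row/column speller with one pass per selection."""
    n_commands = side * side
    trial_s = soa_s * 2 * side  # 2 * sqrt(N) flashes, one SOA each
    return wolpaw_bits(n_commands, accuracy) / trial_s


# Sweep matrix sizes from 2 x 2 to 8 x 8 at fixed accuracy P = 0.9.
sweep = {side: row_column_itr(side) for side in range(2, 9)}
best_side = max(sweep, key=sweep.get)
```

Under these assumptions the sweep peaks at the 3 × 3 matrix and declines for larger matrices, which is consistent with the statement above.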
Since enlarging the matrix does not improve the ITR, we next consider shortening the SOA. However, in some ERP spellers, at least one character flashes twice in succession. This leads to a problem called attentional blink (AB): discriminating the second target is made more difficult if both targets are presented less than roughly 500 ms apart [22]. For example, in Figure 1, if (b) is presented after the presentation of (a), "A" flashes twice in succession. If the SOA is too short, the subject cannot follow the stimulation, and P300 will not be observed. Most ERP spellers require the target stimuli to be counted at least two times because of the two-stage selection process. Moreover, if we use averaging to improve accuracy, the number of counting times increases, which increases the risk of the users becoming fatigued.
If we use a large matrix in the P300 speller, all characters are small and close together. This causes users to make mistakes and is not user-friendly, especially for the elderly.

N100 and Its Discriminability
The visual N100 (also referred to as N1) is a negative deflection in the transient VEP that appears approximately 100 ms after the onset of a stimulus [12]. P1 and P2 are also observed around N100 [23], and they would also be useful features for BCI. A previous study found that the P1, N100, and P2 components have amplitudes large enough to discriminate the target intensification [4].
Unlike P300, N100 is not related to the reaction to a specific target, e.g., a counting task to low-frequency stimuli. When a user pays attention to a stimulus area, N100 is evoked by any stimulus. Thus, it is difficult to use N100 solely for BCI. N100 has a larger amplitude when the user focuses on or pays attention to the target position [23][24][25]. We confirm this in our experiment in Section 5.1.

N100-P300 Speller
In a similar manner to the standard 6 × 6 P300 speller, we consider a BCI that has 36 commands: 26 letters (A-Z) and ten numbers (0-9). Since N100 is evoked without any counting task, we propose an efficient stimulus presentation based on rapid serial visual presentation (RSVP) in order to utilize N100 for BCI.
The 36 characters and several blanks are arranged in the stimulus images. Figures 4 and 5 show examples of the proposed images. The positions of characters are fixed, and a user is assumed to know the target position beforehand. The proposed system detects the target characters as follows. (i) The user pays attention to the target position and counts how many times the target character is presented; (ii) the system detects P300 evoked by the counting task and determines the target stimulus image; (iii) the system also detects absent or weak N100 caused by blanks in the stimulus images and determines the position of the target character; and finally (iv) the desired character is determined by the combination of the detected image and position. Figure 6 shows an example of this detection process. Suppose the target character is "K". In this case, the user focuses on the top-right part of the stimulus image. Since N100 is evoked by every stimulus, all stimulus images except for the third image evoke N100. In contrast, P300 is evoked by the user's counting task after the second stimulus image is presented. The system detects the target position and image from N100 and P300, respectively.
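Steps (ii) to (iv) amount to a lookup keyed by the detected image and the detected position. The sketch below illustrates this with a hypothetical three-image toy layout (not the arrangement of Table 1) and hypothetical classifier scores; the names `IMAGES` and `decode_character` are illustrative only.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical toy layout for illustration only (not the paper's Table 1).
# Each stimulus image maps a screen position to a character, or to None for a blank.
IMAGES: List[Dict[str, Optional[str]]] = [
    {"TL": "A", "TR": "J", "BL": None, "BR": "S"},
    {"TL": "B", "TR": "K", "BL": "M", "BR": None},
    {"TL": None, "TR": None, "BL": "N", "BR": "T"},
]


def decode_character(p300_scores: Sequence[float],
                     blank_scores: Dict[str, float]) -> Optional[str]:
    """Combine the P300-based image estimate with the N100-based position estimate."""
    # Step (ii): the image with the strongest P300 evidence is the target image.
    image_index = max(range(len(p300_scores)), key=lambda i: p300_scores[i])
    # Step (iii): the position whose blank responses look most like "N100 absent".
    position = max(blank_scores, key=blank_scores.get)
    # Step (iv): the character at that position in that image (None if it is a blank,
    # which would indicate that the two estimates are inconsistent).
    return IMAGES[image_index][position]
```

With scores that favor the second image and the top-right position, the lookup returns "K", mirroring the example of Figure 6. Returning `None` for a blank cell is a design choice of this sketch, not something the paper specifies.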
In our study, we developed two BCI systems, one with 2 × 2 matrices (Figure 4), and the other with 2 × 3 matrices (Figure 5). In the case of the 2 × 2 matrix, we used 12 stimulus images. To perform averaging for the N100 absence signals, we arranged three blanks for each position. In this case, the number of stimulations is twelve, which is the same as that of the 6 × 6 P300-speller matrix. The arrangement of the characters is listed in Table 1. For the 2 × 3 matrix, we also arranged three blanks for each position and used nine stimulus images. Examples of stimulus images are shown in Figure 5. The arrangement of the characters is listed in Table 2. In this case, the number of stimulations is nine, which is less than that of the 6 × 6 P300-speller matrix. Thus, the input speed is faster than that of the P300 speller.
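The image counts quoted above follow from filling every cell of every image with either a character or a blank. A minimal sketch of that arithmetic, with an illustrative function name and the number of blanks per position as a parameter:

```python
def n_stimulus_images(n_commands: int, rows: int, cols: int,
                      blanks_per_position: int = 1) -> int:
    """Smallest number of rows x cols images that hold all commands plus the blanks."""
    cells = rows * cols
    slots_needed = n_commands + blanks_per_position * cells
    # Integer ceiling division: every image offers `cells` slots.
    return -(-slots_needed // cells)


images_2x2 = n_stimulus_images(36, 2, 2, blanks_per_position=3)
images_2x3 = n_stimulus_images(36, 2, 3, blanks_per_position=3)
```

For 36 commands and three blanks per position this gives 12 images for the 2 × 2 matrix and nine for the 2 × 3 matrix, matching the numbers stated in the text.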
We used a simple signal processing and feature extraction method along with linear support vector machines (SVMs) to classify N100 and P300. The SVMs used to detect P300 and N100 are denoted by SVM1p and SVM1n, respectively. SVM1p is a binary classifier trained on the EEG responses to the target image (positive samples) and the non-target images (negative samples). In the case of the 2 × 2 matrix, there are 12 stimulus images, and only one stimulus image contains the target character. Thus we obtain one positive sample and 11 negative samples from one trial. In the testing stage, the 12 or nine responses to the stimulus images are input to SVM1p, and the response having the maximum output is taken as the estimated target image. These outputs are also used in the subsequent NC/IC classification.

Table 1. Character arrangement of the 2 × 2 matrix. "-" denotes a blank. "T", "B", "L", and "R" mean top, bottom, left, and right, respectively.

No. TL TC TR BL BC BR
In a similar manner, SVM1n is also a binary classifier trained on the EEG responses of the blank positions (positive samples) and the non-blank positions (negative samples). For each position, the feature vector of SVM1n is made by averaging the EEG responses to the images in which that position is blank. For example, in the case of the 2 × 2 matrix, the averaged response for Figure 4a-c is the feature vector of the top-left blank, the averaged response for Figure 4d,e is that of the top-right blank, and so forth.
Thus we obtain four or six feature vectors from one trial. In the training stage, since we use labeled samples, we obtain one positive sample and three (or five) negative samples from one trial. In the testing stage, the four or six feature vectors are input to SVM1n, and the position having the maximum output is taken as the estimated position. These four or six outputs are also used in the subsequent NC/IC classification.
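The per-position averaging that produces the SVM1n feature vectors can be sketched as follows. This is an illustration under assumptions: epochs are plain lists of samples, each image is identified by an integer, and `blank_positions_by_image` is a hypothetical lookup derived from the character arrangement.

```python
from statistics import fmean
from typing import Dict, List, Sequence, Tuple


def average_epochs(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally long epochs."""
    return [fmean(samples) for samples in zip(*epochs)]


def blank_features(epochs: Sequence[Tuple[int, Sequence[float]]],
                   blank_positions_by_image: Dict[int, Sequence[str]]
                   ) -> Dict[str, List[float]]:
    """One averaged feature vector per position, built from the images
    in which that position is blank."""
    grouped: Dict[str, List[Sequence[float]]] = {}
    for image_id, epoch in epochs:
        for position in blank_positions_by_image.get(image_id, ()):
            grouped.setdefault(position, []).append(epoch)
    return {position: average_epochs(group) for position, group in grouped.items()}
```

Each resulting vector would then be scored by SVM1n, and the position with the maximum output taken as the estimated gaze position, as described above.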

Discrimination of NC/IC States
When individuals use a BCI system, they are not constantly typing characters or controlling tools in practice. To realize a practical BCI, a function must be developed to distinguish whether the user intends to spell characters or not. A previous study proposed a stopping criterion whereby the maximum amplitude of VEPs such as P1 and N1 is thresholded. In this method, however, the threshold has to be tuned each time depending on the experimental environment and conditions [15].
To detect the IC state, we again use N100 and P300. The outline of the IC detection system is shown in Figure 7, where SVM1p and SVM1n are classifiers for P300 and N100, respectively, and SVM2 classifies the IC/NC states. The feature vector of SVM2 is made from the outputs of SVM1p and SVM1n. If the user intends to input, the outputs of SVM1p and SVM1n are each expected to contain only one positive value, and if the user does not intend to input, all output values of SVM1p and SVM1n are expected to be negative. Therefore, we use the sorted output values of SVM1p and SVM1n as the feature vector of SVM2. In our experiment, we compare three feature vectors: (i) the outputs of SVM1p, giving a feature vector of 12 or nine dimensions; (ii) the outputs of SVM1n, giving a feature vector of four or six dimensions; and (iii) the concatenated outputs of SVM1p and SVM1n, giving a feature vector of 16 or 15 dimensions. It should be noted that SVM1p and SVM1n are trained only on the intended training data, as shown in Figure 7. SVM2 is trained on the SVM1p and SVM1n outputs of both the intended and non-intended training data.
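The construction of the SVM2 feature vector from sorted first-stage outputs can be sketched as below. The descending sort order and the function name are assumptions of this sketch; the text only states that the outputs are sorted.

```python
from typing import List, Sequence


def svm2_features(p300_outputs: Sequence[float],
                  n100_outputs: Sequence[float],
                  mode: str = "both") -> List[float]:
    """Sorted first-stage outputs used as the IC/NC feature vector."""
    # Sorting removes the dependence on which image or position was the target,
    # so only the shape of the output distribution remains.
    p300_sorted = sorted(p300_outputs, reverse=True)
    n100_sorted = sorted(n100_outputs, reverse=True)
    if mode == "p300":
        return p300_sorted
    if mode == "n100":
        return n100_sorted
    if mode == "both":
        return p300_sorted + n100_sorted
    raise ValueError(f"unknown mode: {mode!r}")
```

For the 2 × 3 matrix, nine SVM1p outputs and six SVM1n outputs give a concatenated vector of 9 + 6 = 15 entries. In the IC state the leading entry of each sorted block is expected to be positive, whereas in the NC state all entries are expected to be negative.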

Experiments
We describe three experiments in this section. All participants signed a consent form approved by the research ethics committee of The University of Electro-Communications.

Purpose and Method
To show the discriminability of N100 and clarify the effect of eye movement, we conducted a preliminary experiment. Although the amplitude of N100 is significantly different between the attended and non-attended conditions [12], the number of averages needed to discriminate them is unclear. Moreover, in the proposed system and in the P300 speller, users may move their eyes while typing. However, some seriously ill patients, such as those in the final stages of amyotrophic lateral sclerosis (ALS), cannot move their eyes [26]. The relationship between the P300 speller and eye movement has previously been reported [27]. We also investigated the effect of eye movement on the N100-based BCI.
We arranged two stimuli as shown in Figure 8. The central gray circle is continuously presented. The left and right white circles are presented randomly six times each, for a total of 12 times in one trial. The flash lasts 125 ms, and the SOA is 187.5 ms. The target circle (left or right) was pointed out on another display before each trial. In experiment I, participants were instructed to gaze at either the left or right circle, and in experiment II, they were to gaze at the central gray circle during the experiment and to pay attention to either the left or right circle. The participants performed these experiments alternately. Two healthy 22-year-old males participated in the experiments. Fifteen trials were recorded for each participant.
The EEG was recorded using an active EEG (Guger Technologies) at a 512 Hz sampling rate and a bio-signal amplifier (Digitec) with a 0.5 Hz analogue high-pass filter and a 100 Hz analogue low-pass filter. FCz, FC2, FC1, Cz, CP1, CP2, Pz, POz, P3, P4, POz, PO3, PO4, O1, O2, and Iz were used. AFz and A2 were used as the ground and the reference, respectively (Figure 9). The locations of electrodes were based on the extended international 10-20 system. For the recorded EEG, we used a second-order Butterworth band-pass filter (1-13 Hz) and a third-order Butterworth band-stop filter (49-51 Hz) to remove the hum noise. The signal was down-sampled from 512 to 64 Hz.
A linear SVM was used to classify the participant's attention. We extracted the EEG signal from the specific range after the onset of the stimulus and averaged six responses for each stimulus. A 160-dimensional feature vector was made by concatenating ten sample points and 16 channels. The soft margin parameter C was selected from {0.1, 1, 10, 100, 1000}. All signal processing tools were implemented in MATLAB, and LibSVM was used [28]. The mean accuracies of five-fold cross-validation were compared.

Results
The averaged waveforms of participant 1 are shown in Figure 10. The left side of the figure shows the case in which the participant gazes at the stimulus circle. From the responses for the stimulus intensification, a negative peak is observed around 175 ms to 200 ms after the onset of the stimulus. Two positive peaks, P1 and P2, around the N100 peak are also observed. In contrast, from the responses for the blank stimulus, N100 is not observed. The right side of Figure 10 shows the case in which the participant gazes at the center and pays attention to the stimulus. In this case, the difference between the two conditions is small. However, P2 around 270 ms after the onset can be observed. Table 3 shows the mean classification accuracy and standard deviation. The classification accuracy of the gazing case is higher than that of the attention case. However, in the attention case, the range 150-300 ms shows a classification performance significantly better than chance (50%). The accuracy is expected to be higher if we average more signals.

Method
We compared the proposed BCI using the 2 × 2 matrix and the 6 × 6 P300 speller. Eleven healthy 22-24-year-old males participated in this experiment. They performed 50 trials each for the P300 speller and the proposed method alternately. In the proposed method, 12 stimulus images are presented twice in one trial in random order, thus the total number of flashes is 24 per trial. In the P300 speller, 12 flashes are presented twice in one trial in random order, so the total number of flashes is also 24. Each flash lasts 125 ms, and the SOA is 187.5 ms. Hence, both the proposed method and the P300 speller take 4.5 s for one trial. Participants were asked to gaze at the target position and silently count the number of times the target character flashed. The target position was pointed out on another display before each trial. The users were informed of the positions of the characters before the experiment.
The EEG recording system and its settings were the same as in the preliminary experiment above. The electrode locations are shown in Figure 11. The signal processing (filtering and down-sampling) was also the same as in the preliminary experiment. P300 and N100 were extracted from 125 to 625 ms and from 100 to 250 ms after the onset of the stimulus, respectively. P300 was averaged for each stimulus image in both methods, and then a 512-dimensional feature vector for SVM1p was made by arranging a 32-sample-point signal and 16 electrodes. N100 was averaged for each position, and then a 160-dimensional feature vector for SVM1n was made by arranging a 10-sample-point signal and 16 electrodes.
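The feature dimensions quoted above follow from the window length, the 64 Hz rate after down-sampling, and the 16 electrodes. The sketch below reproduces that arithmetic and the channel-wise concatenation; rounding the window bounds to the nearest sample is an assumption, and the function names are illustrative.

```python
from typing import List, Sequence


def window_samples(start_ms: float, stop_ms: float, fs_hz: float = 64.0) -> int:
    """Number of samples in a post-stimulus window (rounded to the nearest sample)."""
    return round((stop_ms - start_ms) * fs_hz / 1000.0)


def feature_dim(start_ms: float, stop_ms: float,
                fs_hz: float = 64.0, n_channels: int = 16) -> int:
    """Dimension of the concatenated feature vector."""
    return window_samples(start_ms, stop_ms, fs_hz) * n_channels


def extract_feature(epoch: Sequence[Sequence[float]], start_ms: float,
                    stop_ms: float, fs_hz: float = 64.0) -> List[float]:
    """Cut the window out of every channel and concatenate the pieces.

    `epoch` holds one averaged response per channel, with sample 0 at stimulus onset.
    """
    first = round(start_ms * fs_hz / 1000.0)
    count = window_samples(start_ms, stop_ms, fs_hz)
    feature: List[float] = []
    for channel in epoch:
        feature.extend(channel[first:first + count])
    return feature
```

The 125-625 ms window gives 32 samples per channel and 512 dimensions, and the 100-250 ms window gives 10 samples per channel and 160 dimensions.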

Averaged Waveform
The grand averaged waveforms over P300 are shown in Figure 12. The waveform of a target is averaged over the responses when the subject responds to the target character. From the target response waveform, P300 is observed around 400-500 ms after the onset of the stimulus. The P300 latency of the proposed speller is larger than that of the P300 speller. The left side of Figure 13 shows the averaged waveforms of N100. The waveform of the stimulus is averaged over the responses when a character is presented in the target position. The waveform of the blank is averaged over the responses when no character is presented in the target position. N100 is observed around 150 ms after the onset of the stimulus. The negative peak amplitude of the blank around 150 ms is smaller than that of the stimulus in P3 and P4. The significance of the difference over the N100 peak of P3 was confirmed by t-testing (p < 0.01). Negative peaks around 340 ms and 530 ms in Figure 13 are caused by the subsequent stimuli because the SOA is 187.5 ms. The positive peak amplitude of the stimulus around 150 ms is much larger than that of the blank in FC1 and FC2. The significance of the difference over the P1 peak of FC2 was also confirmed by t-testing (p < 0.01).

Grand averaged waveform over all subjects for N100 on FC1, FC2, P3 and P4. The signal for the target (non-target) stimulus is averaged 9900 (3300) times for the 2 × 2 matrix N100-P300 speller. The signal for the target (non-target) stimulus is averaged 6600 (3300) times for the 2 × 3 matrix N100-P300 speller.

Detection Accuracy of N100 and P300
Table 4 lists the averaged detection accuracy and the standard deviation over the five-fold cross-validation. In the proposed method, "Image" is the detection of the target image using P300 and "Position" is the detection of the target position using N100. The detection accuracy of N100 is more than 10% higher than that of P300 although the peak amplitude of N100 is smaller than that of P300 in Figures 13 and 14. This is because N100 was averaged six times (3 blanks × 2 loops), whereas the P300 signal was averaged only twice for one trial (in both the P300 speller and the proposed method). The accuracy of P300 did not significantly differ between the P300 speller and the proposed method (p = 0.25).

Classification Accuracy of the Target Character
Table 5 shows the averaged classification accuracy and the standard deviation for the target character. The accuracy of the proposed method is on average 11.6% higher than that of the 6 × 6 P300 speller. The significance of this improvement is confirmed by the t-test (p < 0.01).

Information Transfer Rate
Table 6 compares the ITR. The averaged ITR of the proposed method is 0.17 bit/s higher than that of the P300 speller. The significance of this improvement is confirmed by t-testing (p < 0.01, difference: 0.17 bit/s).

2 × 3 Matrix and IC Detection
We compared the proposed BCI using the 2 × 3 matrix and the P300 speller. We also evaluated the accuracy of the IC detection. Ten healthy 22-24-year-old males participated in this experiment, performing 60 trials each for the P300 speller and the proposed method alternately. Every third trial, participants were asked not to type a character, at which point they did not gaze at the display. The other settings (EEG recording system and signal processing) were the same as in the case of the 2 × 2 matrix.

Averaged Waveform
The grand averaged waveforms over P300 are shown in Figure 13. From the target response waveform, P300 is observed around 400-500 ms after the onset of the stimulus, as with the 2 × 2 matrix N100-P300 speller. The P300 latency of the proposed speller is larger than that of the P300 speller. The right side of Figure 14 shows the averaged waveforms of N100. N100 is observed around 150 ms after the onset of the stimulus in P3 and P4. The negative peak amplitude of the blank around 150 ms is smaller than that of the stimulus in P3 and P4. The significance of the difference over the N100 peak of P3 was confirmed by t-testing (p < 0.01). The significance of the difference over the P1 peak of FC2 was also confirmed by t-testing (p < 0.01).
If the number of target locations is increased from the 2 × 2 matrix to the 2 × 3 matrix and the distance between positions becomes small, the response for the blank may show larger N100 and P1 peaks evoked by the neighboring stimuli. To investigate this, we compared the N100 and P1 peaks of the blank response for the 2 × 2 and 2 × 3 cases. The peaks did not differ significantly between the two cases for either P1 in FC2 (p = 0.20) or N100 in P3 (p = 0.35).

Classification Accuracy and ITR
The averaged detection accuracy and the standard deviation of P300 and N100 are shown in Table 7. In this case, the detection of N100 is a six-class classification problem, which is more difficult than the case of the 2 × 2 matrix. Therefore, the detection accuracy of N100 is lower than that of the 2 × 2 matrix. The P300 detection accuracy of the proposed method is significantly lower than that of the P300 speller (p < 0.01). The reason for this will be discussed later. The classification accuracy of the target character is shown in Table 8. Although the accuracy of the proposed method is higher than that of the P300 speller, the improvement is smaller than for the 2 × 2 matrix. However, as shown in Table 9, the improvement of the ITR is higher than for the 2 × 2 matrix. The reason is that only nine stimulus images are used in the case of the 2 × 3 matrix, whereas 12 stimulus images are used for the P300 speller and 2 × 2 matrix cases. Since the SOA is 187.5 ms, the proposed method with the 2 × 3 matrix takes 3.375 s (= 9 stimuli × 2 loops × 187.5 ms) for one trial, whereas the P300 speller takes 4.5 s. The ITR improvement of the 2 × 3 matrix over the 2 × 2 matrix was confirmed by t-testing (p < 0.01, difference: 0.15 bit/s).
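The trial durations above are simple products of the number of stimuli, the number of loops, and the SOA. A short sketch of this arithmetic, with an illustrative function name:

```python
def trial_duration_s(n_stimuli: int, n_loops: int = 2, soa_ms: float = 187.5) -> float:
    """Time needed to present every stimulus n_loops times, in seconds."""
    return n_stimuli * n_loops * soa_ms / 1000.0


duration_2x3 = trial_duration_s(9)    # proposed method, 2 x 3 matrix
duration_p300 = trial_duration_s(12)  # 6 x 6 P300 speller (and 2 x 2 matrix)
speed_ratio = duration_p300 / duration_2x3
```

The ratio of the two durations is 4/3, so at equal accuracy and an equal number of commands the shorter trial alone would raise the ITR by one third; any remaining difference in Table 9 reflects differences in accuracy.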

IC Detection
Table 10 shows the classification accuracy of the IC detection. The accuracies of the proposed method using P300, N100, and P300 + N100 were higher than that of the P300 speller. The significance of the accuracy of the proposed method (N100) was confirmed by t-testing (p < 0.01, difference = 32.7%).

Discussion

Discussion of Experimental Results
From the averaged waveform, we can see that the P300 peak latency of the proposed method is larger than that of the P300 speller. A previous study reported that the latency often depends on the difficulty of the paradigm [29]. Since the proposed method is a kind of rapid serial visual presentation (RSVP) design, its counting task is more difficult than that of the P300 speller.
The P300 detection accuracy of the proposed 2 × 3 matrix method is lower than that of the P300 speller. The first reason for this is that the proposed system has half as many training samples as the P300 speller. In the proposed method, P300 is evoked only once per trial, whereas the P300 speller evokes P300 twice (once for the row and once for the column). The second reason is the number of classes. In the proposed 2 × 3 matrix system, the target image is one of nine, and the detection problem is a nine-class classification problem. In contrast, the detection problem of the P300 speller is a six-class classification problem.
The ITR of the proposed method is higher than that of the P300 speller. This is because the detection accuracy of N100 is higher than that of P300. Furthermore, in the case of the 2 × 3 matrix, the number of flashes is less than that of the P300 speller, and hence the typing speed is improved.
As for the IC detection problem, the classification accuracy of the proposed method is on average 32.7% higher than that of the P300 speller. The results in Table 10 suggest that features from N100 are more informative than features from P300 for detecting the IC state.

Comparison with ERP-Spellers
The P300 speller does not perform well if the user does not move his or her eyes [27]. This may be a drawback for severely disabled patients. Moreover, myoelectric potentials may corrupt the EEG signals. To overcome this problem, gaze-independent ERP spellers such as GeoSpell, Center speller, and Hex-o-Spell have been proposed [4,5,30,31]. These methods exhibit comparable or slightly better performance than the P300 speller (up to 15% improvement), whereas the proposed method exhibits a 40% improvement in terms of ITR. Although the experimental results discussed in Section 5.1 suggest that N100 can be discriminated without eye-gazing, the proposed method essentially has the same problem as the P300 speller.
As explained in Section 2, ERP spellers such as GeoSpell and Hex-o-Spell have several technical problems. Since ERP spellers need to present 2√N flashes in order to issue N commands, the ITR will not be improved even if we enlarge the size of the matrix or the number of groups. Let us consider a general case of the proposed method. Suppose that N commands are arranged in m × n matrices. To detect the absence of N100, we need to arrange at least one blank for each position, that is, at least mn blanks in total. Therefore, since we arrange N commands and mn blanks in mn positions, the minimum number of stimulus images is ⌈(N + mn)/(mn)⌉, where ⌈·⌉ denotes the ceiling function. Even if we increase the number of commands N, the total stimulation time does not change as long as the matrix size mn is enlarged in the same proportion. On the other hand, in the case of the P300 speller, the matrix size is m = n = √N and the total stimulation time is proportional to 2√N. If we enlarge the number of commands from N to 2N and the matrix side from √N to √(2N), the total stimulation time increases from 2√N to 2√(2N). As shown in Figure 3, the ITR of the P300 speller decreases as the matrix size and number of commands increase. Therefore, compared with the P300 speller, the proposed system is more flexible and has the potential for further extensions.
However, the detection accuracy of the target position by N100 depends on the size of the matrix. If it is too big, the number of positions is large, which makes the classification problem difficult. Moreover, if the distance between characters is small, the peak difference of N100 between the blank and target is smaller. As shown in Figure 14 and Section 5.2.2, the amplitude of early visual ERPs for the 2 × 2 matrix is not significantly different from that of the 2 × 3 matrix. The optimal matrix size should be investigated in future work.
In some ERP spellers, including the P300 speller and GeoSpell, at least one character is presented twice in succession, and hence they suffer from the AB problem [22]. The proposed method does not have the AB problem, so the minimum SOA for the proposed method is expected to be shorter. This point should also be investigated in future work.
Because it uses N100, the proposed method requires only a one-stage selection process in one trial, whereas most other ERP spellers require at least a two-stage selection. Our method therefore reduces user fatigue and improves the stability and reliability of the BCI.
The hybrid BCI in [11] discriminated IC/NC by using SSVEP with an average accuracy of 88%, where the window was longer than 4.8 s. In our BCI, the average IC/NC detection accuracy was 90.5%, where one trial took 3.375 s.
The proposed method requires participants to remember the positions of the characters beforehand. In our experiments, we showed images of the character positions prior to the experiment and the participants memorized them. In lieu of this, we can use stimulus images that show all character positions in small, low-contrast print, as shown in Figure 15. The validity of using stimulus images in this manner should be investigated in future work.

Conclusions
We have proposed a spelling BCI using both P300 and N100 to reduce the number of flashes and increase the ITR. To utilize N100 in a BCI, we have arranged uniquely designed stimulus images containing both characters and blanks. The blanks are arranged so as not to elicit N100. Hence, the proposed system can detect the gazing position by using N100. The advantages of the proposed method are that (i) the classification accuracy of N100 is higher than that of P300, since more epochs are averaged for N100 than for P300; (ii) the proposed method takes less time to type one character, since the number of flashes can be reduced to nine in the case of the 2 × 3 matrix; (iii) the number of counting tasks can be reduced because N100 is elicited by visual stimulation without counting tasks, thus reducing user fatigue; and (iv) no characters flash twice in a row, whereas at least one character flashes twice in a row in most other ERP spellers. Therefore, the SOA may be shorter than that for the P300 speller. These advantages have been confirmed by our experiment.

Figure 1. Examples of stimuli of the P300-speller: (a) the first row is intensified; (b) the first column is intensified.

Figure 2. Stimulus and operating principle of the P300-speller. When the subject counts the number of flashes of the character "K", P300 is elicited by the user's response.

Figure 6. Stimulus images and operating principle of the proposed method. The circles in the images indicate the user's gazing position. When the subject counts the number of flashes of the character "K" while gazing at the top right of the stimulus, P300 is elicited by the user's response. N100 is elicited when any character is flashed in the user's gazing position.

Figure 7. Classification procedure of the proposed method.

Figure 8. Stimulus images of the preliminary experiment.

Figure 9. Electrode locations of the preliminary experiment based on the extended international 10-20 system.

Figure 10. Grand averaged waveforms over all subjects for N100 on FC1, FC2, P3, and P4. The number of averaged epochs is 360. The waveforms on the left side were obtained when the subject gazed at the stimulus circle. The waveforms on the right side were obtained when the subject gazed at the center circle and paid attention to the stimulus.

Figure 11. Electrode locations of the experiments of the proposed BCI based on the extended international 10-20 system.

Figure 12. Grand averaged waveforms over all subjects of P300 on FCz and Pz. The signal for the target (non-target) stimulus is averaged 1100 (12,100) times for the 2 × 2 matrix N100-P300 speller. The signal for the target (non-target) stimulus is averaged 2200 (11,000) times for the P300-speller.

Figure 15. An example of a stimulus image of the proposed method (2 × 3 matrix) indicating target positions.

Table 3. Classification accuracy and standard deviation (%) of VEP detection.

Table 10. Classification accuracy and standard deviation (%) of IC detection.