1. Introduction
A brain-computer interface (BCI) is an alternative pathway for communicating with and controlling devices by discriminating brain signals, without the user making any physical movements. The major goal of BCI research is to develop applications that enable disabled or elderly users to communicate with others and to control their limbs and/or the environment [1]. Various types of brain responses, including event-related potentials (ERPs), have been utilized to realize BCIs, such as P300, steady-state visual evoked potential (SSVEP), auditory steady-state response (ASSR), and μ-rhythms from the sensorimotor cortex [2], and various systems have been used to measure them, including electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI). In this paper, we focus on an EEG-based BCI system.
EEG-based ERP spellers have been extensively used because of their simplicity and high accuracy. Most ERP spellers detect the desired target command using the P300 evoked when the user counts the number of times the target is intensified [3,4]. The P300 speller proposed by Farwell and Donchin is a well-known BCI system using P300 [3]. A 6 × 6 matrix containing target characters is used for stimulation. Each row and column of the matrix is flashed in random order, and the user silently counts the number of times the desired character is presented. The desired character is determined by detecting the P300 evoked by this mental task. In [4], early ERP components such as P1, N1, and P2 are used in addition to P300 as features to detect the target command. GeoSpell (a geometric speller) is an alternative visual ERP-based spelling system. In the GeoSpell interface, each N2 character is assigned to two groups arranged in a circle. The user silently counts the number of target stimuli containing the target character in the same manner as in the P300 speller. The advantage of GeoSpell is that the user is not required to gaze directly at the target. In addition, the probability that an identical target stimulus flashes twice in succession is lower than with the conventional P300 speller [5]. Another promising BCI is the ERP-based Hex-o-Spell. In Hex-o-Spell, the target is determined in two stages: first, a character group containing the target character is selected, and then the individual target is determined [4]. In GeoSpell and Hex-o-Spell, a visual stimulus is presented from the center of the screen so that users can fixate on a dot in the center of the screen and attend to the target in their visual periphery.
Existing ERP spellers have several drawbacks: (i) at least $2\sqrt{N}$ flashes are required to present N commands; (ii) since the stimuli containing a group (e.g., a row or column) of characters flash in random order, at least one character flashes twice in a row in some ERP spellers (including the P300 speller); and (iii) at least two counting tasks are required to type one character, such as counting the flashes of both the row and the column of the target in the P300 speller. In Section 2, we discuss these drawbacks in detail.
Hybrid BCIs, which combine two or more BCI paradigms, have been proposed [6]. Some hybrid BCI studies aim to improve the information transfer rate (ITR) by combining multiple BCI paradigms [7,8,9]. Allison et al. improved reliability by using the SSVEP and event-related desynchronization (ERD) paradigms, especially for users who do not exhibit adequate BCI performance with a single BCI paradigm [10]. Panicker et al. utilized SSVEP to detect the control state in a P300-based ERP speller [11].
In this paper, we propose a new visual ERP speller that uses N100 in addition to P300, along with efficient visual stimulus images for this purpose. N100 is a kind of visual evoked potential (VEP) that is evoked together with P1 and P2 [12]. Unlike P300, N100 is evoked merely by paying attention to the visual stimulus, with no counting task. To the best of the authors' knowledge, this is the first work to use N100 as a feature for classifying BCI commands. In the proposed paradigm, unlike [4], P300 and N100 are used independently to determine the target character. By utilizing the two features independently, the proposed BCI overcomes the above drawbacks of the conventional ERP speller. In Section 5, we show through a preliminary experiment that N100 is discriminable, and in Section 5.1.2 and Section 5.2, we present two sets of experimental results demonstrating that the ITR of the proposed method improves upon that of the P300 speller by 15 bit/min on average. The proposed method is not a hybrid BCI, because N100 is difficult to use as the sole signal for a BCI.
We furthermore propose using N100 to realize a self-paced (asynchronous) BCI [13,14]. When individuals use an input device, they are not constantly sending information; sometimes they pause to rest, think, or wait for a response. Therefore, classifying non-control (NC) and intentional control (IC) states is required for a practical BCI. Although the original asynchronous BCI does not require a predefined time frame, we here consider classifying NC/IC states using a short time frame (3.4–4.5 s). In previous studies, NC/IC states were classified using stopping criteria such as thresholding the peak amplitude of P1 and N1 or the outputs of the classifier [14,15]. In these methods, however, the threshold must be tuned each time for the experimental environment and conditions. Therefore, we here propose a machine learning-based NC/IC classification method that uses P300 and N100. The classification results of NC/IC states are discussed in Section 5.2. Our preliminary ideas have been published in conference publications [16,17]. In this paper, we systematize our framework and add detailed experimental results, including results that show the discriminability of N100.
2. ERP Speller
P300 is a positive deflection in the ERP that appears approximately 300 ms after stimulus onset. The oddball paradigm is used to observe P300 [18]. P300 is elicited when a user is actively trying to detect the targets. The mental task of counting the number of target stimuli is often used for BCI. P300 is evoked not only by visual stimuli but also by auditory [19] or tactile [20] stimuli.
The P300 speller is a classical spelling BCI proposed by Farwell and Donchin in 1988. It features a 6 × 6 matrix of alphanumeric characters arranged on a display, as shown in Figure 1. Each row and column, containing six characters, is flashed in random order. The user performs a mental task such as counting how many times the desired character is presented. The system detects the P300 evoked by the counting task, and the target character is determined as the intersection of the row and the column for which P300 is detected [3]. An example of the detection process for the desired character “K” is given in Figure 2. GeoSpell and Hex-o-Spell are improved versions of the ERP speller. They do not require eye-gaze control.
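The row/column selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in [3]: the character layout, the function name decode_character, and the score values are all hypothetical, and the scores stand in for averaged classifier outputs in which a larger value means a more P300-like response.

```python
# Hypothetical 6 x 6 layout; the actual arrangement in Figure 1 may differ.
MATRIX = [
    "ABCDEF",
    "GHIJKL",
    "MNOPQR",
    "STUVWX",
    "YZ1234",
    "56789_",
]


def decode_character(row_scores, col_scores, matrix=MATRIX):
    """Return the character at the intersection of the best row and column.

    row_scores[i] and col_scores[j] are assumed to be averaged classifier
    outputs for the flashes of row i and column j, respectively.
    """
    if len(row_scores) != len(matrix) or len(col_scores) != len(matrix[0]):
        raise ValueError("one score per row and per column is required")
    best_row = max(range(len(row_scores)), key=row_scores.__getitem__)
    best_col = max(range(len(col_scores)), key=col_scores.__getitem__)
    return matrix[best_row][best_col]


# Illustrative scores only: row 1 and column 4 respond most strongly.
row_scores = [0.1, 0.9, 0.2, 0.0, 0.3, 0.1]
col_scores = [0.2, 0.1, 0.0, 0.3, 0.8, 0.1]
print(decode_character(row_scores, col_scores))  # K
```

Because the row and the column are decided separately, an error in either decision yields a wrong character, which is one reason averaging over several rounds of flashes is commonly used.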
The performance of BCIs is usually evaluated by the information transfer rate (ITR) as well as the classification accuracy of discriminating the target character. The ITR depends upon three factors: typing speed, classification accuracy, and the number of commands [21],

$$\mathrm{ITR} = \frac{60}{T}\left[\log_2 N + P\log_2 P + (1-P)\log_2\frac{1-P}{N-1}\right] \quad \text{(bit/min)}, \tag{1}$$

where T (s) is the time of one session, P is the classification accuracy, and N is the number of commands.
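As a rough numerical aid, the ITR can be computed from T, P, and N as sketched below. The sketch assumes the widely used Wolpaw-style definition, ITR = (60/T)[log2 N + P log2 P + (1 − P) log2((1 − P)/(N − 1))] in bit/min; the function name itr_bits_per_min and the example values are hypothetical and are not taken from the experiments in this paper.

```python
import math


def itr_bits_per_min(n_commands, accuracy, session_time_s):
    """Wolpaw-style ITR in bit/min for N commands, accuracy P, and time T (s)."""
    if n_commands < 2:
        raise ValueError("at least two commands are required")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if session_time_s <= 0:
        raise ValueError("session time must be positive")
    bits = math.log2(n_commands)
    # The P*log2(P) and (1-P)*log2(...) terms vanish in the limits P -> 0 and
    # P -> 1, so they are skipped there to avoid evaluating log2(0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_commands - 1))
    return 60.0 / session_time_s * bits


# Hypothetical example: 36 commands, 90% accuracy, 10 s per selection.
print(round(itr_bits_per_min(36, 0.9, 10.0), 2))
```

Under this definition the ITR is zero at chance level (P = 1/N) and equals (60/T) log2 N when every selection is correct.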
Although ERP spellers are widely used because of their simplicity and high ITR, they have several technical problems, as stated in the Introduction. The first is that ERP spellers require at least $2\sqrt{N}$ flashes to present
N commands. Suppose that the classification accuracy is
and the stimulus onset asynchrony (SOA) is
ms, that is, it takes
ms to present all commands.
Figure 3 shows the relationship between N and ITR obtained by Equation (1). This figure suggests that making the matrix larger than 3 × 3 (nine commands) does not improve the ITR. Moreover, the accuracy P is expected to be lower for large N because the number of classes increases with N. This is the main limitation of the ERP speller.
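The trade-off between the number of commands and the presentation time can be reproduced numerically with the sketch below. It rests on several assumptions that are not taken from this paper: a square row/column matrix, so that one round needs 2√N flashes; a single round per selection with no pauses; the Wolpaw-style ITR definition; and illustrative values of 0.9 for the accuracy and 200 ms for the SOA. All function and constant names are hypothetical.

```python
import math


def itr_bits_per_min(n_commands, accuracy, session_time_s):
    """Wolpaw-style ITR in bit/min (assumed definition)."""
    bits = math.log2(n_commands)
    if 0.0 < accuracy:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_commands - 1))
    return 60.0 / session_time_s * bits


def row_column_itr(side, accuracy, soa_ms):
    """ITR of a side x side row/column speller with one flash per row and column."""
    n_commands = side * side
    n_flashes = 2 * side  # equals 2 * sqrt(N) for a square matrix
    session_time_s = n_flashes * soa_ms / 1000.0
    return itr_bits_per_min(n_commands, accuracy, session_time_s)


ASSUMED_ACCURACY = 0.9  # illustrative value, not from the paper
ASSUMED_SOA_MS = 200.0  # illustrative value, not from the paper

sides = range(2, 9)
for side in sides:
    itr = row_column_itr(side, ASSUMED_ACCURACY, ASSUMED_SOA_MS)
    print(f"{side} x {side} (N = {side * side}): {itr:.1f} bit/min")

best_side = max(sides, key=lambda s: row_column_itr(s, ASSUMED_ACCURACY, ASSUMED_SOA_MS))
print("largest ITR at side =", best_side)
```

With a fixed accuracy the ITR scales roughly as log2(N)/√N, which peaks near N ≈ 7.4; among square matrices the 3 × 3 layout therefore gives the largest value under these assumptions, in line with the observation above.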
Since enlarging the matrix does not improve the ITR, we next consider shortening the SOA. However, in some ERP spellers, at least one character flashes twice in succession. This leads to a problem known as the attentional blink (AB): discriminating the second target becomes more difficult if the two targets are presented less than roughly 500 ms apart [22]. For example, in Figure 1, if (b) is presented immediately after (a), “A” flashes twice in succession. If the SOA is too short, the subject cannot follow the stimulation, and P300 will not be observed.
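Why a repeated flash is hard to avoid in a row/column speller can be checked with the short simulation below. It is a sketch under assumptions: each round flashes every row and every column exactly once in a uniformly random order, and the names double_flash_cells and random_round are hypothetical. Whenever a row flash and a column flash are adjacent in the sequence, the character at their intersection is intensified twice in succession.

```python
import random


def double_flash_cells(sequence):
    """Return the (row, col) cells intensified in two consecutive flashes.

    sequence is a list of ("row", i) or ("col", j) flashes.
    """
    cells = []
    for (kind_a, idx_a), (kind_b, idx_b) in zip(sequence, sequence[1:]):
        if kind_a == "row" and kind_b == "col":
            cells.append((idx_a, idx_b))
        elif kind_a == "col" and kind_b == "row":
            cells.append((idx_b, idx_a))
    return cells


def random_round(side, rng):
    """One round: every row and every column flashes once, in random order."""
    flashes = [("row", i) for i in range(side)] + [("col", j) for j in range(side)]
    rng.shuffle(flashes)
    return flashes


rng = random.Random(0)  # fixed seed so the run is repeatable
rounds = [random_round(6, rng) for _ in range(1000)]

# Every round mixes rows and columns, so some cell always flashes twice in a row.
always_repeats = all(double_flash_cells(r) for r in rounds)
print(always_repeats)  # True

# Fraction of rounds in which one particular cell (row 0, column 0) repeats.
target = (0, 0)
rate = sum(target in double_flash_cells(r) for r in rounds) / len(rounds)
print(rate)
```

For a given target, its row and its column are adjacent in a random order of 12 flashes with probability 2/12 = 1/6, so under these assumptions the target itself is hit by a double flash in roughly one round out of six.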
Most ERP spellers require the user to perform the counting task at least twice per character because of the two-stage selection process. Moreover, if averaging over repeated presentations is used to improve accuracy, the amount of counting increases further, which raises the risk of user fatigue.
If a large matrix is used in the P300 speller, the characters become small and closely spaced. This leads to user errors and makes the interface less user-friendly, especially for the elderly.