A Novel Instantaneous Phase Detection Approach and Its Application in SSVEP-Based Brain-Computer Interfaces

This paper proposes a novel phase estimator based on fully-traversed Discrete Fourier Transform (DFT) which takes all possible truncated DFT spectra into account such that it possesses two merits of ‘direct phase extraction’ (namely accurate instantaneous phase information can be extracted without any correction) and suppressing spectral leakage. This paper also proves that the proposed phase estimator complies with the 2-parameter joint estimation model rather than the conventional 3-parameter joint model. Numerical results verify the above two merits and demonstrate that the proposed estimator can extract phase information from noisy multi-tone signals. Finally, real data analysis shows that fully-traversed DFT can achieve a better classification on the phase of steady-state visual evoked potential (SSVEP) brain-computer interface (BCI) than the conventional DFT estimator does. Besides, the proposed phase estimator imposes no restrictions on the relationship between the sampling rates and the stimulus frequencies, thus it is capable of wider applications in phase-coded SSVEP BCIs, when compared with the existing estimators.


Introduction
Estimating the frequency, the phase and the amplitude of a signal is a standard classical problem of signal processing. In particular, phase estimation has been widely applied in synchronization in communication [1], power analysis [2], speech enhancement [3], GPS navigation [4] and much more. Until now, phase estimators based on direct Discrete Fourier Transform (DFT) are the mainstream [5].
In [5], Liguori pointed out that, the existing DFT-based phase estimators heavily rely on the result of frequency estimate. To illustrate this dependency, let us study the sampled version (the sampling rate is F s ) of a complex exponential signal x(t) = a 0 exp[j(2π f 0 t + θ 0 )] as x(n) = x(t) t=n∆t = a 0 e j(2π f 0 n∆t+θ 0 ) , n = 0, ..., N − 1, where a 0 , f 0 , θ 0 are amplitude, frequency and phase respectively and ∆t is the sampling interval 1/F s . Accordingly, the frequency resolution of the N-point DFT X(k) is ∆ f = F s /N = 1/N∆t. Assume the peak DFT bin is at k = k 0 , k 0 ∈ Z + (Z + refers to the set of positive integer numbers). Then, it can be deduced that the ideal phase value of X(k 0 ) is The variable δ in (2) is the fractional frequency offset (−0.5 ≤ δ < 0.5) with the value δ = f 0 /∆ f − k 0 , which is closely related to two sampling cases (coherent sampling or noncoherent sampling). For the The rest of this paper is organized as follows. Section 2 introduces the derivation of the proposed fully-traversed DFT. Section 3 elaborates the properties of the proposed fully-traversed DFT in the noiseless case. Section 4 gives accuracy analysis of this phase estimator in noisy case. Section 5 demonstrates the proposed DFT-based phase estimator's superiority to conventional phase estimators in the phase-coded SSVEP-BCI with auto-calibration. Finally, we conclude with a summary of results in Section 6.

The Proposed Fully-Traversed DFT Spectrum
In this work, we aim to develop a phase estimator capable of removing the dependency on the frequency offset δ. To be specific, our goal is to construct a corrected DFT spectrum whose peak bin provides accurate phase information whether δ = 0 or not.
To derive the proposed fully-traversed DFT, it is necessary to study the relationship between the ideal Fourier transform and the conventional DFT. As known to us, the ideal Fourier transform X(jω) of an infinite-length sequence {x(n)} = {· · · , x(−N+1), · · · , x(0), · · · , x(N−1), · · · } is However, X(jω) cannot be realized, since the calculation in (7) consumes innumerous samples and memory. Thus, in practical applications, X(jω) is replaced by the conventional normalized DFT X(k) defined in (3). Furthermore, if we define a sequence x 0 (n), which is truncated from the infinite-length sequence {x(n), −∞ ≤ n ≤ ∞}, i.e., then, apparently, combining (3) with (7), (8) yields a representation of X(k) as Now, let us focus on the sample x(0) at which the ideal phase θ 0 is located. It can be inferred from (9) that conventional DFT only considers one truncation case. However, there are N truncated sequences x m (m = 0, 1, ..., N − 1) containing x(0) listed as . .
where the elements of sequence x m = {x m (n), n = 0, · · · ,N−1} are Similar to the discrete spectrum of the truncated sequence x 0 expressed in (8), a reasonable discrete spectrum for Furthermore, aiming to rewrite X m (k) in the form of conventional DFT in which variable n ranges from 0 to N−1, we should detach the series summation in (12) into two terms as The first summation in (13) can be denoted as Hence, it is necessary to introduce N new truncated sequencesx m (m = 0, 1, ..., N − 1) whose elements arex In terms of (15), N sequencesx 0 ∼x N−1 are listed as . .

Single-Tone Case
Consider a single-tone signal x(n) = exp(j(ω 0 n + θ 0 )), (12) and (18) yields Obviously, (19) can be rewritten as where the superscript ' * ' represents complex conjugate operation. Then, combining (20) with (4) yields Y(k) = e jθ 0 X(k)X * (k) = e jθ 0 |X(k)| 2 , Taking the phase part of (21), we can obtain the phase spectrum ϕ Y (k) as Equation (22) shows that, for the signal x(n) = exp(j(ω 0 n + θ 0 )), −N + 1 ≤ n ≤ N − 1, the phase values at all N spectral bins (including the peak bin k = k 0 ) uniformly equal the ideal phase θ 0 (also refers to the instantaneous phase of the center sample x(0)). In other words, the synthesized spectrum Y(k) directly provides the accurate phase estimate at any spectral bin, thus removing the conventional DFT-based phase estimator's dependency on the frequency offset δ.
Furthermore, since Y(k) is obtained by linearly averaging the N sub DFT spectra as (18) shows, the proposed fully-traversed DFT spectrum is of linearity, also. Thus, for a single-tone signal with amplitude a 0 (a 0 = 1), the following holds among which the peak spectral bin is

Multi-Tone Case
Consider a signal containing Q (Q ≥ 2) components expressed as Since the conventional DFT is a linear transform, combining (5) and (25), its multi-tone spectrum X(k) is composed of Q single-tone spectra, i.e., Similarly, since the proposed phase corrected DFT spectrum is also of linearity, combining (5) with (23), its multi-tone spectrum Y(k) equals the summation of Q single-tone spectra as Furthermore, to measure the phase of the i-th tone, the peak bin at k = k i should be focused. Therefore, the conventional DFT spectrum X(k i ) can be written as Similarly, the proposed fully-traversed DFT spectrum Y(k i ) can be expressed as In (29), the first term is the expected i-th single-tone spectrum which directly provides the phase estimate, and the second item represents the interference of other tones. Obviously, for either the conventional DFT spectrum or the fully-traversed DFT spectrum, the accuracy of the i-th tone's phase estimation depends on the intensity of interference. Particularly, this accuracy depends on the relative magnitude ratio between X q (k i ), q = i, and X i (k i ). From (28) and (29), we have From (30), we have Please note that 0 < |δ i |, |δ q | < 0.5, |k q − k i | ≥ 1 and thus |k q + δ q − k i | > 0.5. Since sin(δπ)/ sin(δπ/N) is a monotonously descending even function, we have Combining (31) with (32), we have Equation (33) shows that, for the fully-traversed DFT spectrum, the relative magnitude ratio between any other tone and the tone of interest is smaller than that of the conventional DFT. In other words, compared to the conventional DFT spectrum, the fully-traversed DFT magnitude spectrum does better in suppressing spectral leakage and interferences, thus yielding a higher accuracy of phase estimation.

Simplified Dataflow of Phase Estimation
From the aforementioned 4-step procedure listed in Section II, one can find that the fully-traversed DFT experiences N times of DFT operation. Therefore, to reduce computation complexity, this procedure needs to be simplified.
According to the linearity of DFT, the average of N sub-spectra is equivalent to the DFT result of the averaged data. Therefore, if we average the N sub-sequencesx 0 ∼x N−1 in (15) and (16), then one new sequence {y(n), n = 0, 1, · · · ,N−1} can be constructed as Clearly, Equation (35) only involves one time of DFT operation, thus greatly reducing computation complexity. Accordingly, its simplified dataflow is illustrated in Figure 1 (take N = 4 as an example), from which a low-complexity procedure of multi-tone phase estimation can be summarized as follows.
Lastly, collect all the peak indices k 1 , ..., k Q of Y(k). For each index k q , directly taking the phase value of Y(k q ) provides its phase estimatesθ q .

CRLB for the Proposed Phase Estimator
Different from conventional 3-parameter model-based phase estimators, mathematical model for the proposed corrected DFT-based phase estimator can be simplified. The reasons are as three aspects.
Firstly, (29) implies that the proposed phase detector only requires roughly searching the peak spectral bin and then taking the phases directly, i.e., independent of frequency offset δ.
Secondly, as previously mentioned, the proposed fully-traversed DFT spectrum equals the average of N sub DFT spectra. This actually reflects the following mechanism: averaging N sub vectors plays the role of compensating the angles of N sub DFT spectra with each other, which leads the synthesized phase to automatically fall at the ideal phase value, whether the frequency offset δ = 0 or not.
Lastly, as previously mentioned, the proposed phase estimator can directly determine the ideal phase θ 0 , referring to the 'instantaneous phase' of the center sample among the 2N − 1 input samples x(−N + 1) ∼ x(N − 1). Obviously, the position for the center sample is at n = 0, which can be easily determined in advance. Thus, if we rewrite x(n) = a 0 e j(ω 0 n+θ 0 ) as a 0 e jϕ n , then, for the position n = 0, the entire term ϕ n equals the ideal phase θ 0 , which also indicates that estimation of frequency ω 0 can be omitted.
For the above 3 reasons, the error variance of fully-traversed DFT phase estimator obeys a 2-parameter mathematical model with α = [θ 0 , a 0 ] T , in which the estimate of frequency offset δ is bypassed. Now we deduce the CRLB of this 2-parameter model using the classical parameter estimation theory [27].

Numerical Results
To further verify the superiority of the proposed phase estimator, simulations performed under various noisy conditions and different spectral orders are presented. The phase error variance of the proposed estimator was also compared with that of conventional DFT-based estimator (we choose the ratio method based on interpolated DFT [2]). Assume k 0 = 3, N = 128 and θ 0 = 60 • . Then, the specific signal based on (36) is As can be seen in each figure, the majority of phase corrected DFT's error variance curve (marked in ' * ') lies below CRLB 3 , proving that the proposed phase estimator is independent of the conventional 3-parameter joint estimation model. Furthermore, the proposed estimator's error variance curve is bounded by CRLB 2 curve, verifying the correctness of (51). Figure 2 also demonstrates that the error variances of the conventional interpolated DFT estimator are nearly one order of magnitude higher than that of the proposed estimator.

Applying Fully-Traversed DFT in Phase-Coded SSVEP-BCI
Recently, BCIs have become a very hot topic in neural engineering. A BCI detects an user's ongoing brain activities and translates them into meaningful messages, which helps patients with severe motor disabilities to express their messages to external world [30]. In particular, BCI based steady-state visual evoked potential (SSVEP) has received much attention in bioengineering research due to its satisfactory performance [31]. To increase the number of recognizable targets, the phase-coded SSVEP-BCIs use phase information to encode subject's visual intention. In this system, the phase-tagged visual stimuli are characterized with flashing at one frequency but different phases, resulting in that subjects' SSVEPs also differ in phase features. Therefore, through extracting the phase information of SSVEP potentials, a computer is able to distinguish which flicker the subject desires to select.

Experiment Paradigm
In general, SSVEP is always elicited after some latency time L (or labeled as lag phase 'θ L '), which actually corresponds to a phase difference between flicker's phase 'θ F ' and SSVEP's phase 'θ S '. Hence, the relationship between the phase difference 'θ L ' and latency L is described by where f s , q denote the stimulus frequency and the integer cycles, respectively. Under normal condition, L is stable in a short period of time but differs in inter-subject such that θ L cannot be calculated in advance [21,23,24]. From (54), it can be found that the flicker's phase is usually not equal to SSVEP's phase (or θ F = θ S ) due to θ L , which means that we cannot directly use θ s to identify which flicker the subject desires to select if θ L is unknown. As a result, an additional measure of phase calibration is necessary for phase-coded SSVEP-BCI to calibrate this error θ L in the detection algorithm. Hence, we build up a BCI system which uses a half-field phase-tagged stimulus to evoke the SSVEP with two different frequency components f 1 , f 2 at the same time. This system does not adopt any phase calibration since it is able to identify the flicker by introducing the phase difference instead of the phase under the assumption f 1 = f 2 (this frequency distinction makes a subject more sensitive to flickering stimuli than to those with the same frequency), i.e., If . Therefore, the difference between θ S ( f 1 ) and θ S ( f 2 ) approximately equals the difference between θ F ( f 1 ) and θ F ( f 2 ). Hence, the lag phase difference can be removed.
In our phase-coded SSVEP-BCI system, a visual stimulator (ViewSonic, 22 inch, 120 Hz refresh rate, 1680 × 1050 screen resolution) presenting two phase-tagged flickers (with the size 12 cm × 8 cm each) was used to evoke subjects' SSVEPs ( Figure 3 and Table 1). Table 1. The parameters of two phase-tagged half-field flickers.

Left-Field
Right-Field It should be noted that, in our SSVEP-BCI illustrated in Figure 3, the selected flickering frequencies f 1 and f 2 should be as close as possible (this helps to remove the possible jump change of multiple of 360 • for the lag phase difference between θ L ( f 1 ) and θ L ( f 2 ), which was solved in [17] by means of a exhaustive search procedure based on least-square fitting). Otherwise, their lag phase difference (θ L ( f 1 ) − θ L ( f 2 )) would not be removed. However, due to the fact that all the stimulus frequencies in our SSVEP-BCI are acquired by integer dividing a fixed LCD display refresh frequency F r = 120 Hz (see [32]), we can only obtain a limited number of flikering frequencies (they are 120/7 Hz, 120/8 Hz, 120/9 Hz, 120/10 Hz, 120/11 Hz) falling at the visual sensitive region (10 Hz, 20 Hz). Therefore, among these candidate flickering frequencies, the frequency pair with the minimum interval is ( f 1 , f 2 ) = (120/11 Hz, 120/10 Hz) = (10.9 Hz, 12 Hz). Obviously, in this case, the lag phase difference between θ L ( f 1 ) and θ L ( f 2 ) will not be removed as f 1 and f 2 are not close enough. Hence, this phase difference should be taken into account in order to achieve an accurate result. In practice, it can be roughly estimated according to their empirical results of the apparent latency of SSVEPs [24]. In our experiment, (θ L ( f 1 ) − θ L ( f 2 )) cam be roughly estimated as 36 • .
Different from the well-known CCA (Canonical Correlation Analysis) method, which also uses phase information to enhance the classification accuracy of SSVEPs, our proposed scheme consumes lower hardware cost. As [17] pointed out, the CCA method needs multiple stimulus frequencies (6 frequencies were adopted) to remove the possible jump change of multiple of 360 • for the lag phase difference. In contrast, our proposed scheme only employs 2 frequencies, thereby lowering the hard cost.
Three subjects (S1∼S3, two males and one female) were seated on a comfortable chair before the visual stimulator in an illuminated room. The subjects' EEG signals were recorded by a g.USBamp EEG amplifier from 13 electrodes (PO 3 , PO 5 , PO 7 , POZ, PO 4 , PO 6 , PO 8 , P1, PZ, P2, O1, Oz, and O 2 ). Specify the sampling rate F s = 600 samples/s. This experiment consisted of 5 runs containing 10 trials each. Each trial lasted for 8 s. Subjects were instructed to focus on one of flickers according to the following paradigm: From 0 to 2 s a cue appeared indicating which flashing flicker was required to focus on; From 2 s to 8 s the subjects gazed at the specified flicker; Then the next trial started. The order of gazed-flickers was '1212121212' in each run. Thus this dataset had 25 trials for each flicker. The whole experiment lasted about 30 minutes. During this experiment, the subjects' EEG signals were recorded for offline analysis later.

Procedure of SSVEP Phase Extraction
We collected a total amount of 50 trials for each subject. Basically, the procedure of SSVEP phase extraction would contain the following steps: step 1. Apply both conventional DFT and phase corrected DFT to extract the phase values θ S ( f 1 ) and θ S ( f 2 ), respectively; Use the estimate ∆θ to identify the gazed-flicker. If ∆θ is close to 0 (or 180) deg, then the gazed-flicker is judged as flicker 1 (or 2).
The judgement involved in step 3 can actually be extended to distinguish C targets (C ≥ 2) by finding the maximum among C decision variables R k as where ϕ k refers to the ideal phase difference (θ F ( f 1 ) − θ F ( f 2 )), i.e., the ideal clustering center of pattern recognition. For the above 2-category recognition problem, we have ϕ 1 = 0 • , ϕ 2 = 180 • . From (57), it can be found if the detected phase difference ∆θ → ϕ k , then R k → 1. In other words, if R 1 is close to 1, the gazed-flicker is judged as flicker 1 and vice versa. Furthermore, (57) also allows to assume more clustering centers. As will be elaborated, introducing more assumed clustering centers helps to evaluate the performance of phase estimator.

Result of Offline Analysis
The classification rates of this 2-class experiment are listed in Table 2. It can be found that the proposed phase estimator can achieve a higher classification rate than conventional DFT-based estimator does, which is entirely over 10%. Table 3 lists not only these two estimators' detected phase values (θ s (12), θ s (10.9)) but also 4 columns of phase decision values R k (j), k = 1, 2, 3, 4, calculated by (57). Here decision variable R k (j) corresponds to the kth assumed clustering center (4 assumed phase clustering centers are 0 • , 180 • , 90 • , 270 • , respectively) while the subject actually gazes at the flicker j. Therefore, for any flicker index j (j = 1, 2), R j (j) (marked with shadow in Table 3) is close to 1 and tends to be the maximum among R k (j), k = 1, 2, 3, 4, if the accuracy of selected phase estimator is high enough. The data for all trials involved in Table 3 are recorded within 4 s. In summary, fully-traversed DFT is well suitable for the phase-coded SSVEP-BCI, especially in our proposed duel-frequency stimulus-based system which uses phase difference to recognize the gazed flick. Due to the fact that fully-traversed DFT does well in extracting the phase information at any frequency offset, the proposed phase estimator can achieve a higher classification rate. More importantly, as listed in Table 3, we find that the corresponding phase's standard deviation of the proposed estimator is smaller than that of conventional DFT estimator, although the frequency resolution of fully-traversed DFT is less than DFT. The reason is actually mentioned in Section 2, i.e., the mechanism that fully-traversed DFT is the average of N DFT sub-spectra (see formula (18)) also helps to reduce the averaged noise's power.
Based on our preliminary results, we believe that fully-traversed DFT can be applied in the online system. Although the simulated 4-category experiment results show that the corresponding average classification rate is only 80%, the classification rate is surely to be further enhanced once the estimate of the lag phase difference is more accurate. In this experiment, we only roughly estimate it as 36 deg. In fact, we can use another two very close frequencies (for example, we can replace LCD stimuli with LED stimuli) such that the lag phase difference will be close to zero and thus we do not need to estimate it. Moreover, it was also clearly found that the phase variance of fully-traversed DFT estimator does not get worse than conventional DFT under our experiment condition.

Conclusions
A novel phase estimator based on corrected-phase DFT was proposed in this paper. Due to considering all possible truncated sequences containing the center sample, spectral leakage in correctedphase DFT is greatly reduced and thus the instantaneous phase information of the center sample can be directly extracted. In addition, we have also proved that the process of corrected-phase DFT is equivalent to a streamline dataflow, and thus the proposed phase estimator has a relatively low computational complexity. Furthermore, the phase error variance of the proposed phase estimator follows a 2-parameterized mathematical model and thus it has a higher accuracy than the conventional DFT-based phase estimators in noisy circumstances.
We also applied the proposed phase estimators to phase-coded SSVEP-BCIs. In particular, compared to the conventional DFT-based estimators, our offline experiment results demonstrate that the fully-traversed DFT does better in extracting the phase information of phase-coded SSVEP-BCI. Moreover, the proposed phase estimator imposes on restrictions on the relationship between the sampling rates and the stimulus frequencies (i.e., non-synchronous sampling is allowed), it is of wider applications in phase-coded SSVEP BCIs than the existing estimators. Our future work is to improve our system design and apply fully-traversed DFT in the practical SSVEP-BCI online system.

Conflicts of Interest:
The authors declare no conflict of interest.