Time-Series Prediction of the Oscillatory Phase of EEG Signals Using the Least Mean Square Algorithm-Based AR Model

: Neural oscillations are vital for the functioning of a central nervous system because they assist in brain communication across a huge network of neurons. Alpha frequency oscillations are believed to depict idling or inhibition of task-irrelevant cortical activities. However, recent studies on alpha oscillations (particularly alpha phase) hypothesize that they have an active and direct role in the mechanisms of attention and working memory. To understand the role of alpha oscillations in several cognitive processes, accurate estimations of phase, amplitude, and frequency are required. Herein, we propose an approach for time-series forward prediction by comparing an autoregressive (AR) model and an adaptive method (least mean square (LMS)-based AR model). This study tested both methods for two prediction lengths of data. Our results indicate that for shorter data segments (prediction of 128 ms), the AR model outperforms the LMS-based AR model, while for longer prediction lengths (256 ms), the LMS- based AR model surpasses the AR model. LMS with low computational cost can aid in electroencephalography (EEG) phase prediction (alpha oscillations) in basic research to reveal the functional role of the oscillatory phase as well as for applications for brain-computer interfaces. in the EEG signal. By comparing these two methods, our showed that the adaptive LMS-based AR model outperforms the AR model in real time for longer prediction lengths. This new approach may raise the possibility of EEG phase prediction with its simplicity and low computational cost.


Introduction
Rapidly changing neural oscillations are fundamental features of a working central nervous system. These oscillations can be seen as rhythmic changes either in cellular spiking behavior or subthreshold membrane potential in a single neuron. Large ensembles of these neurons can generate synchronous activity that results in rhythmic oscillations in the local field potential (LFP). These oscillations reflect the excitability of neurons. Their key role is to assist brain communication across a huge network of neurons via synchronous excitation [1]. At certain frequencies, oscillations are initiated by specific tasks, the outcome of which determines their amplitude or power [2]. Alpha oscillations have generally been considered a ubiquitous characteristic of neural activity; however, recent observations have demonstrated their roles in active inhibition [3,4] and attention [5]. Although alpha frequencies are slower and tend to distribute frontally in older subjects [6], the largest alpha amplitude is observed at the scalp over the occipital and parietal areas of the brain [7]. Additionally, alpha oscillations are also evident in the form of mu oscillations over motor cortices [8]. These studies suggest that alpha oscillations play a specific role in brain information processing associated with specific functions in motor, perceptual, and cognitive processes. However, the role of oscillations is yet to be revealed.
To understand the role of these oscillations in motor, perceptual, and cognitive processes, accurate estimations of instantaneous phase and amplitude are required. In most studies, instantaneous phase relationships are categorized post-hoc due to the requirement for analysis in the time-frequency domain. The focus on oscillatory phase does not imply that the amplitude of ongoing oscillations has no impact. In fact, the phase of an oscillatory signal can only be reliably computed when this signal has a significant amplitude, both in the mathematical and biophysical sense. Oscillatory amplitude in various frequency bands bears significant relations to sensory perception and attention [5,7,9,10]. The instantaneous frequency of electroencephalography (EEG) oscillations has also been investigated [11][12][13][14]. However, the ongoing oscillatory phase has been largely overlooked, at least up until recent studies, which showed the effects of the instantaneous phase on perceptual performance [15][16][17].
To reveal the functional role of the phase of neural oscillations, real-time prediction of the instantaneous phase is important. Pavlides in 1988 [18] and Holscher in 1997 [19] built analog circuits that triggered stimulation at the peak, trough, and zero crossing of the hippocampal LFP based on the assumption of a sufficiently narrow bandwidth. Hyman [20] employed a dual-window discrimination method for peak detection in theta oscillations. The aforementioned approach required manual calibration in a specific setting; thus, real-time operation would not be possible. More recently, Chen [21] used autoregressive (AR) modelling to accurately estimate the instantaneous phase and frequency of an intracranial EEG theta oscillation and phase-locked stimulation in real time. This study used a genetic algorithm to optimize several parameters before deploying the algorithm. Although the optimization procedure was a major limitation, this system provided a benchmark for comparing oscillations in other frequency bands. Alternatively, wavelet ridge extraction can be utilized to estimate the instantaneous phase [22]. This method is robust when presented with simultaneous multiple oscillatory schemes. However, the implementation of such a system in real time might be computationally expensive. Therefore, an adaptive method is needed. Closed-loop neuroscience has gained significant attention over the past few years with the latest technological advances. Phase information in real-time systems could have applications in brain-computer interfaces [23], closed-loop stimulator devices [24][25][26], and electrical stimulation of animal models based on a specific phase [18,19]. There is substantial experimental and therapeutic potential in state-dependent brain stimulation as the participant or patient performs the task simultaneously. We are mainly interested in estimating the current instantaneous phase utilizing the visual alpha oscillation so that depending on the brain state, we can decide whether oscillatory phase-dependent stimulation should be given or not. This reflection of a closed-loop brain-state-dependent stimulation system can be seen as a new brain-computer interface (BCI) approach. Herein, we propose a novel approach for time-series forward prediction that was developed based on the conventional AR model [21,27] with an extension of an adaptive least mean square (LMS)-based AR model. The AR model can be constructed using various algorithms to calculate model coefficients, each one of them fulfilling different purposes. These methods minimize the prediction error, either only forward prediction error or both backward and forward prediction errors. This study focuses on the forward prediction error and demonstrates that the LMS-based AR model minimizes the forward prediction error because its coefficients can be adjusted dynamically, and it performs the AR model better in real-time for long prediction lengths. Since adaptive methods can dynamically track the coefficients and allow more accurate prediction, we chose the LMS-based AR model. 4. Calculate the time-series forward prediction for twice the prediction length (256 ms) for the LMS-based AR model (See Figure 1).
Assessment of performance of AR model and LMS-based AR model: 5. Compute the means of the predicted segment and original segment and subtract the respective means from the predicted segment and original segment. 6. Estimate the instantaneous phase and frequency of the original and predicted data segments by calculating the analytic signal via the Hilbert transform. 7. Compute the phase difference between the original and predicted data segments. 8. Calculate the phase locking value (PLV) between the original and predicted data.
The last four steps are to be performed for both methods. A flow chart of these steps is shown in Figure  2.

Autoregressive (AR) Model
AR modelling has been successfully applied to EEG signal analysis for diverse applications such as segmentation, forecasting, rejection of artifacts [29,30], and speech analysis [31,32]. The AR model is a robust method for estimating the power spectrum of shorter EEG segments and is less susceptible to spurious results [30].
An AR(p) of order p is a random process defined as follows [21]: where ( ) is an input data at sample n, is a constant, , …, are the coefficients of the AR model, and is white noise at sample n.
The AR model can be created using one of numerous algorithms to calculate model coefficients. These algorithms include the Burg lattice method to solve the lattice filter equations, utilizing the mean of backward-and forward-squared prediction errors and the Yule-Walker method to minimize the forward

Autoregressive (AR) Model
AR modelling has been successfully applied to EEG signal analysis for diverse applications such as segmentation, forecasting, rejection of artifacts [29,30], and speech analysis [31,32]. The AR model is a robust method for estimating the power spectrum of shorter EEG segments and is less susceptible to spurious results [30].
An AR(p) of order p is a random process defined as follows [21]: where x(n) is an input data at sample n, c is a constant, α 1 , . . . , α p are the coefficients of the AR model, and ε n is white noise at sample n.
The AR model can be created using one of numerous algorithms to calculate model coefficients. These algorithms include the Burg lattice method to solve the lattice filter equations, utilizing the mean of backward-and forward-squared prediction errors and the Yule-Walker method to minimize the forward prediction error only. To enable comparisons with previous studies [21,27], we used the Yule-Walker method. The AR model order was selected using Akaike's Information Criterion (AIC) [33].

Least Mean Square (LMS)
For a random signal, the main objective of the AR modelling technique is to find recursively the optimal coefficients that minimize the mean square error (MSE) [34]. The LMS algorithm was developed by Hoff and Widrow in 1960 [35]. It employs a stochastic gradient algorithm to solve the least square problem. This method is obtained from steepest-descent implementation by replacing the required gradient vectors and Hessian matrices by more suitable approximations. The equations that constitute the adaptive LMS algorithm are as follows [36]: where x(n) is an input data at sample n, y(n) is output, e(n) is error, w(n) is filter weight, µ is step size, and M is filter length.
In the current study, two methods were implemented and compared for time-series forward prediction; first Yule-Walker based AR model and second LMS-based AR model. The AR model computes AR coefficients once, whereas the LMS-based AR model computes them at each instant. The LMS-based AR model algorithm starts from an initial condition without having the desired information and then updates the filter weights based on the input data sequence. The filter length for the LMS-based AR model is the same as for the AR model order.

Instantaneous Frequency and Phase
The analytic signal is constructed by combining the original data and its Hilbert transform to calculate the instantaneous phase [37]. The complex signal z x (t) can be constructed as follows: where x(t) is the original real signal, and H x(t) is the Hilbert transform of the real signal x(t) defined as follows: where p.v. denotes Cauchy's principal value. The instantaneous phase of the signal can be calculated from the complex analytic signal as follows: The instantaneous frequency can then be calculated from the instantaneous phase as follows: where ϕ uw x (t) is the unwrapped instantaneous phase. To estimate the instantaneous phase at the edge of the sliding data segment of 500 ms and to avoid edge effects, 64 ms segments were trimmed on both ends. The remaining 372 ms data segment was then introduced into the AR model to generate AR coefficients. The AR model order was fixed to 30 to iteratively forecast the 128 ms in the future [21]. The predicted 128 ms window was then used to calculate the analytic signal based on the Hilbert transform. As suggested by Zrenner [23], there is a trade-off between the length of the predicted segment and the efficacy of the filter. This study used a 128th order two-pass FIR band pass filter. A higher filter order delays the signal longer, whereas a low filter order <100 does not remove frequencies outside the passband effectively. For the LMS-based AR model, the same, and twice (256 ms) prediction lengths were compared with the AR model.

Participants
A total of 21 volunteers (10 males and 11 females; mean age + SD, 26.2 + 7.1) were recruited to this study and provided informed consent for EEG analysis before they participated in the study. To record EEG data, participants were asked to rest with their eyes closed while their EEG signals were measured for 3 minutes. The study was conducted in accordance with the Declaration of Helsinki, and the study was approved by the ethics committee of RIKEN (Wako3 26-24). The EEG data had been used in our previous studies for different purposes [38][39][40].

EEG Recording and Preprocessing
EEG signals were recorded using an EEG amplifier (BrainAmp MR+, Brain Products GmbH, Gilching, Germany) at a sampling rate of 1000 Hz using a 63-channel EEG cap (Easycap, EASYCAP GmbH, Herrsching, Germany). Online low and high cutoff frequencies of the EEG amplifier were set to 0.016 and 250 Hz, respectively. Electrodes were positioned according to the 10/10 system with electrode AFz as the ground electrode and left earlobe as the reference electrode. EEG signals were re-referenced to the average of the right and left earlobe and down sampled to 500 Hz offline. All analysis was performed in MATLAB (Math-Works Inc., Natick, MA, USA) using custom-written scripts and EEGLAB [41].

Statistical Analysis
All statistical analyses were conducted using MATLAB Statistics and Machine Learning Toolbox. The statistical significance level was set at p < 0.05.

Results
The ultimate goal of this study was to replicate the AR model for time-series forward prediction and compare the results with those of the adaptive LMS-based AR model with twice the prediction length. First, we wanted to determine the performance of both methods to assess the phase-locking value (PLV) at different time points [42].
For a given data segment, the difference between the original instantaneous phase and predicted instantaneous phase was computed as shown in Figure 3. The measure yields a number between 0 and 1: with 0 reflecting the high phase variability, and 1 reflecting trials with identical phase. PLV is defined as follows: where N is the total number of trials, m is the trial index, Ø 2 is the instantaneous phase of the original data segment, Ø 1 is the instantaneous phase of the predicted data segment, and n is the time.
The statistical significance of the PLV can be tested by calculating Rayleigh's Z value [43] as follows: where N is the total number of trials.
Appl. Sci. 2020, 10, 3616 7 of 15 where N is the total number of trials, m is the trial index, Ø2 is the instantaneous phase of the original data segment, Ø1 is the instantaneous phase of the predicted data segment, and n is the time.  [42] between origanl and predicted data segments. Intertrial variability of phase differences between the original and predicted data segments from the same electrodes was calculated. PLV was calculated for three different electrodes (O1, O2, and Oz) and at three different time points (0, 128, 256 ms), respectively.
The statistical significance of the PLV can be tested by calculating Rayleigh's Z value [43] as follows: = (12) where is the total number of trials.

Shorter Prediction Length
A paired t-test was used to compare the results of both methods for three different channels (O1, O2, and Oz) at two time points (64 ms and 128 ms). Figure 4 depicts the mean values of Rayleigh's Z value (PLVrz) for all the subjects. Part A shows the results for the AR model, and part B reflects the PLVrz values of the LMS-based AR model. The PLVrz decays slowly, indicating that the prediction performance decreases with time. At 128 ms, there is a significant difference between the two methods where the AR performs better than the LMS-based AR model. For approximately 700 trials per subject across 21 subjects, PLVrz > 2.9957 is considered statistically significant with respect to the p < 0.05 significance level.  [42] between origanl and predicted data segments. Intertrial variability of phase differences between the original and predicted data segments from the same electrodes was calculated. PLV was calculated for three different electrodes (O1, O2, and Oz) and at three different time points (0, 128, 256 ms), respectively.

Shorter Prediction Length
A paired t-test was used to compare the results of both methods for three different channels (O1, O2, and Oz) at two time points (64 ms and 128 ms). Figure

Twice Prediction Length
To investigate how well both methods perform with increasing prediction length, the prediction length was increased twice (256 ms: 128 samples). The prediction performance usually decays with time [44]. However, the LMS-based AR model showed better prediction performance when the length was increased, as shown in Figure 5.

Twice Prediction Length
To investigate how well both methods perform with increasing prediction length, the prediction length was increased twice (256 ms: 128 samples). The prediction performance usually decays with time [44]. However, the LMS-based AR model showed better prediction performance when the length was increased, as shown in Figure 5.

Twice Prediction Length
To investigate how well both methods perform with increasing prediction length, the prediction length was increased twice (256 ms: 128 samples). The prediction performance usually decays with time [44]. However, the LMS-based AR model showed better prediction performance when the length was increased, as shown in Figure 5.   Both methods were compared using a paired t-test for three channels: O1, O2, and Oz. At 128 ms and the 256 ms, the LMS-based AR model showed higher PLVrz than the AR model. Although the performance of both methods declined with increasing prediction length, the LMS-based AR model outperformed the AR model when the length was increased two times, as shown by their mean PLV Rayleigh Z values (Figure 7). Both methods were compared using a paired t-test for three channels: O1, O2, and Oz. At 128 ms and the 256 ms, the LMS-based AR model showed higher PLVrz than the AR model. Although the performance of both methods declined with increasing prediction length, the LMS-based AR model outperformed the AR model when the length was increased two times, as shown by their mean PLV Rayleigh Z values (Figure 7). Both methods were compared using a paired t-test for three channels: O1, O2, and Oz. At 128 ms and the 256 ms, the LMS-based AR model showed higher PLVrz than the AR model. Although the performance of both methods declined with increasing prediction length, the LMS-based AR model outperformed the AR model when the length was increased two times, as shown by their mean PLV Rayleigh Z values (Figure 7).

Sampling Points Crossing the Significant Rayleigh's Z Value
Next, we investigated which of the sampling points in the 800 ms interval per individual subject crossed the significant Rayleigh's Z value (>2.9957). When the prediction length was increased to 800 ms, the performance of both methods declined (Table 1). Amongst the 21 participants, 15 subjects showed higher crossing values with the LMS-based AR model than with the AR model. A significant difference was observed between the two methods for time points <400 (Table 2). At 64 ms, the AR model surpassed the LMS-based AR model, whereas for the 128 ms, 256 ms, and 340 ms, the LMS-based AR model outperformed the AR model. For time points greater than 400 ms, there was no significant difference between the two methods, and their performances declined. Table 1. Time points crossing the significant Rayleigh's PLV for channel O1 for both the autoregressive (AR) model and the least mean square (LMS)-based AR model. Asterisk (*) means that out of 800 ms, the particular channel did not cross the significant value (>2.9957).

Discussion
Estimation of the phase of EEG rhythms is challenging due to their low signal-to-noise ratio and dynamic nature. In this study, we presented an adaptive method to estimate the phase of alpha oscillations and compared the results with those of the AR model. Our aim was to improve real-time EEG applications that depend on phase estimates. This approach estimates instantaneous frequency and phase of an EEG segment (three channels: O1, O2, and Oz) and then forecasts the signal based on these aforementioned methods. Two prediction lengths (128 ms and 256 ms) of the EEG segment were investigated, and performance was evaluated in terms of PLV.
Previously, Zrenner [27] used the AR model [21]. For comparison and consistency purposes, we applied the AR model with the same filtering method, AR model order, length of EEG data segment, and prediction length. Additionally, we assessed how the future prediction window affected the performance of both methods. Earlier studies used the Hilbert-Huang transform [45,46] and complex wavelet transform [47] to extract frequency and phase information from EEG signals. These methods are limited because the task of predicting a future signal is hard to resolve using analysis of non-stationary data. Rendering methods such as AR assume stationarity over short time periods, therefore they are not suitable for closed-loop and real-time applications. Our adaptive method relies on frequent updates to cater for EEG signals with non-stationarity, thus making it possible to predict the future signal while adjusting to dynamic changes over time.
Considering the precise phase dynamics of an ongoing oscillation, the use of closed-loop neurostimulation has escalated significantly in the last few years. Specifically, a prior study [48] used Fast Fourier-and Hilbert transform-based algorithms for phase-locked stimulation of different EEG bands. The authors proposed a short window prediction algorithm based on an intermittent protocol with distinct windows for phase extraction and prediction. As in our study, they used the PLV metric for performance evaluation and exceeded it to 0.6 for alpha band detection. Similarly, they reported that performance declined relative to increasing the size of prediction length in both methods. The major limitations of this study were a relatively small sample size and neglect to show the functioning of a whole closed-loop system. A study published in 2018 reported three different phase prediction methods (AR model, Hilbert-based, and zero crossing) [44]. Different performance metrics were applied, including PLV, phase synchrony based on entropy, and degree deviation. Their study confirmed that PLV decreases with an increasing time-window as indicated by an increase in alpha fluctuations at longer intervals. Drawbacks of the study include a small sample size (eight), use of only one channel (Oz), and failure to determine whether the AR or Hilbert-based methods were optimal.
Most studies estimate the phase based on standard signal processing methods. Recently machine learning techniques, specifically deep learning methods, have been implemented in many BCI systems. An eleven layered Convolutional Neural Network (CNN) model was used to detect schizophrenia [49]. High classification accuracy 81% for the subject and 98% for non-subject based testing were achieved. Despite high classification accuracies, the major limitations include small data pool, and CNN is costly to compute as compared to traditional machine learning algorithms. CNN was also implemented in an automated detection system for Parkinson's disease with an accuracy of 88.25% [50]. Another study employed Principal Component Analysis (PCA)-based CNN for p300 EEG signals [51]. PCA was used to reduce the dimension of the signal and computational cost by retaining the data features of the original signal. A combination of the continuous wavelet transform and simplified CNN named SCNN was implemented to enhance the recognition rate of motor imagery EEG signals [52]. Compared with CNN, the SCNN shortens the training time and reduces the parameters; however, the classification accuracy needs to be improved. An interesting study implemented the deep learning algorithm in neuromarketing utilizing EEG signals for a product-based preference detection. Although the deep neural network produced good results, a random forest algorithm yields similar results on the same dataset [53]. Two-level cascaded CNN was proposed to detect the stego texts generated by substituting synonyms [54]. The proposed steganalysis algorithms showed enhanced prediction performance with an accuracy of 82.2%. A study investigated machine-learning methods to extract the instantaneous phase of an EEG signal centered at POz [55]. Similar to our study, the authors also performed frequency band optimization based on individual alpha frequency. To compose analytic signals, the algorithm was split into two branches: on the first path, to generate an input signal, data were epoched only, and on the second path, to generate an output signal, an FIR band-pass filter was applied on the data. The second path was directly followed by Hilbert transform and then epoching as the last step before evaluation of the model. The data was trained with an optimized filter, and instantaneous phase recovered via application of the Hilbert transform and non-causal filtering to minimize MSE. The main disadvantage of this procedure is the need for preliminary data prior to the principal experiment for training purposes. Since real-time phase estimation does not provide future information, the trained filters depend on the quality of the signal and their underlying properties. As a result, this method does not assume unbiased phase estimation. Although deep learning techniques are highly efficient, they require a copious amount of data for training. Another challenge is that it requires a huge amount of processing power and, therefore, is a costly affair.
Recently, databases comprising abundant similar time-series have been available for many applications. For such data, utilizing traditional univariate methods in time-series forecasting leaves a great possibility for generating a precise forecast unexploited. A comprehensive review article sheds light on utilizing CNN, Long Short Term Memory (LSTM), and Deep Belief Networks for financial time-series forecasting [56]. Recently, recurrent neural networks, particularly LSTM networks, have proven to be able to surpass the traditional state-of-the-art time-series forecasting methods. LSTM has attracted much attention and has been widely used in various domains in data science such as forecasting of the production rate of petroleum oil [57], wind speed forecasting [58], weather forecasting [59], and speech recognition [60]. Although the accuracy of the methods, as mentioned above, may decline in the presence of the heterogeneous time-series data [61], we need to further investigate the possibility of using such recent techniques, which have been applied to other data types already mentioned.
The goal of this study was to estimate the EEG signal phase by utilizing an adaptive method. In support of earlier studies, our results clearly show that for longer prediction intervals, the LMS-based AR model outperforms the AR model. Even for single channel O1, the LMS-based AR model shows higher samples crossing the significance line. Our proposed method may also be deployed as an alternative to the AR model, when longer prediction length is the key. The LMS-based AR model offers a feasible approach for mimicking the results of previous studies while incurring faster adaptation to the underlying EEG signal at lower computational costs. Limitations of the current study include the need to apply (a) the proposed method for different EEG rhythms, (b) performance evaluation in some behavioral tasks (rather than a simple case of resting state), and (c) single performance metric (i.e., PLV). Further work will involve overcoming the limitations and implementation in a real-time scenario, such as by closed-loop stimulation or exploring new vistas for alpha oscillations.

Conclusions
Our analytical approach to estimate EEG phase relies on instantaneous alpha oscillations using the conventional Yule-Walker-based method (AR) and the adaptive method (LMS). The LMS-based AR model serves at least two main purposes. First, it avoids pre-existing knowledge of the exact signal stochastic, which is rarely available in practice when a learning mechanism is used to estimate the required signal statistics. Second, it possesses a tracking mechanism to monitor the variation in the EEG signal. By comparing these two methods, our study showed that the adaptive LMS-based AR model outperforms the AR model in real time for longer prediction lengths. This new approach may raise the possibility of EEG phase prediction with its simplicity and low computational cost. In summary, our proposed technique can help in neurostimulation applications, such as EEG phase prediction aids in versatile brain-machine interface applications.
Funding: This research was partially funded by a research grant from Toyota Motor Corporation.