Motor-Imagery Classification Using Riemannian Geometry with Median Absolute Deviation

Abstract: Motor imagery (MI) from human brain signals can diagnose or aid specific physical activities for rehabilitation, recreation, device control, and technology assistance. It is a dynamic state in learning and practicing movement tracking in which a person mentally imitates a physical activity. Recently, it has been shown that a brain–computer interface (BCI) can support this kind of neurological rehabilitation or mental practice of action. In this context, MI data are captured via non-invasive electroencephalograms (EEGs), and EEG-based BCIs are expected to become a clinically and recreationally ground-breaking technology. However, determining a set of efficient and relevant features for the classification step remains a challenge. In this paper, we focus specifically on feature extraction, feature selection, and classification strategies based on MI-EEG data. In the MI-based BCI domain, covariance matrices can play an important role in extracting discriminatory features from EEG datasets. To explore efficient and discriminatory features for enhancing MI classification, we introduce a median absolute deviation (MAD) strategy that calculates the average of the sample covariance matrices (SCMs) to select an optimal, accurate reference matrix for tangent space mapping (TSM)-based MI-EEG classification. All SCMs are then projected using TSM according to the reference matrix, yielding the feature vectors. To increase performance, we reduce the dimensions and select an optimal number of features using principal component analysis (PCA) along with analysis of variance (ANOVA). The selected features are then used to train a linear discriminant analysis (LDA) classifier. Benchmark datasets were used for the evaluation: the proposed method achieved average recognition accuracies of 88.05% on BCI III-IIIa and 90.33% on BCI IV-IIa, which is better than state-of-the-art methods.


Introduction
Motor imagery (MI) from human brain signals is an important and challenging technology that can be used to diagnose diseases or help with performing certain physical tasks, including rehabilitation, recreation, device control, and technology assistance [1][2][3]. Using a brain-computer interface (BCI), brain signals can be captured by non-invasive electroencephalograms (EEGs), analyzed, and translated into commands that make an output device perform the desired actions. The challenging aspect of a BCI is processing the signals for classification: the control signal must be determined from the brain's activity before it can be applied to BCI tasks. The human brain has different areas and functions and is divided into two hemispheres, right and left. The left side of the body is controlled by the right hemisphere, which is associated with creativity, spatial orientation, imagination, emotion, and multitasking.
The right side of the body is controlled by the left hemisphere, which performs logical tasks involving scientific and mathematical thinking. Each hemisphere comprises four lobes: frontal, parietal, temporal, and occipital. The frontal lobes control behavior, action, and problem-solving ability. The parietal lobes are charged with interpreting reality. The occipital lobe is responsible for receiving information from the eyes and then distributing this information to other parts of the brain. Finally, the temporal lobe is responsible for storing memories, and it enables listening and speaking. EEG is one way for a BCI system to observe this type of brain activity without surgery, using only some inexpensive equipment. The BCI system is therefore used to translate EEGs into the corresponding control signals.
During different imagined movements, an MI-BCI uses the characteristics of the central beta and mu rhythms that can be observed in the sensorimotor area [4]. It is used to investigate imagery of voluntary movements of various parts of the body, such as the fingers, tongue, and feet [5]. Action imagery systematically engages sections of the primary motor cortex and enables specific representations of certain body parts in non-motor regions. However, finding appropriate features and signal processing techniques is of great concern due to noise and interference. We focused on classifying motor imagery tasks, such as controlling the tongue and controlling the foot, by extracting effective features from an EEG-based MI dataset. The features revealed for these MI tasks can be used to recover and rehabilitate a user's motor function.
This paper proposes a robust estimate of the average of the covariance matrices of an EEG signal, which performs better than state-of-the-art methods. We propose a median absolute deviation (MAD) method for selecting the centrality of the EEG covariance matrices. Using the resulting average point as the reference, all sample covariance matrices (SCMs) were mapped using tangent space mapping (TSM). Moreover, we selected optimal feature dimensions using PCA to increase performance when classifying MI tasks via linear discriminant analysis (LDA). This paper continues as follows. Section 2 discusses the background work on MI-BCI-based applications and classification methods. Section 3 defines the basic structure of our proposed motor imagery classification system: we describe the process of calculating average points, the reference matrix, feature extraction and selection, and, finally, the classification steps. Section 4 describes the experimental datasets and evaluation results, followed by a brief discussion of the results and process. Section 5 summarizes this work.

Background
Many researchers have proposed different methods for the application and classification of an MI-BCI based on spatiotemporal and time-frequency analyses. An adaptive autoregressive (AAR) approach with LDA was proposed to classify dual responses corresponding to left and right aspects [6]. The common spatial pattern (CSP) method was applied to decode motor imagery and improve classification accuracy [7][8][9][10][11][12]. However, the performance of CSP was influenced by the frequency band of the EEG segments and by the time window. Nicholas et al. proposed to extract some effective features from non-invasive EEG signals, i.e., visually-evoked potential, P300 response, slow cortical potentials, and sensorimotor rhythm [13]. However, the features of a non-invasive BCI are limited in terms of speed, reliability, and accuracy. The CSP method was also applied to extract the correct frequency band from the individual power associated with event-related synchronization (ERS) patterns [14]. In this case, the narrowband frequency sub-band method was developed to select subject-specific frequency bands for evaluating the effective performance of the BCI [15][16][17]. These methods can effectively explain MI activities, but they are geared towards binary classification, which limits the applications of MI-based BCI.
Furthermore, Riemannian geometry can be used to identify the actual purpose of brain activity from the narrow EEG signal. In [18], the authors employed the concept of a covariance matrix in the Riemannian manifold to process radar signals. The Riemannian distance between symmetric positive definite matrices was applied to motor imagery-based applications in the BCI [19]. In Reference [20], a Riemannian-based kernel was used to extract features from the MI-based BCI using a Riemannian geometry method, i.e., TSM. TSM was used to map all sample covariance matrices onto a linear tangent space defined at their average. However, the main challenge for the Riemannian manifold and TSM is calculating the SCM reference point in the presence of strong outliers. In Reference [21], the authors proposed various mean and median methods for calculating the SCMs' reference matrix. In Reference [22], the authors employed a mean absolute deviation technique to improve the effectiveness of TSM. A system using Riemannian geometry was developed to retain more data variance in PCA for symmetric positive definite (SPD) matrices [23]. However, a higher variance of the EEG signals does not by itself make the method effective. Outliers in the EEG data and test conditions remain a major concern for precisely centering the tangent space.

Proposed Methodology
The proposed method uses multi-channel EEG signals for MI classification. A fourth-order Butterworth filter was used to reduce noise in the input signal. Then, we calculated the SCMs and obtained a reference matrix from them using the MAD technique. TSM, which is grounded in Riemannian geometry, then extracts the feature vectors using the reference matrix. To select the discriminative features, we applied PCA followed by ANOVA. Finally, the MI tasks were classified using LDA for a BCI application. Figure 1 depicts the block diagram of our proposed system.

Use of Covariance Matrices in BCI
In this work, the SCM of the ith trial was computed via Equation (1),
where each trial contains t sample points from n channels of MI-based EEG signals, so that each SCM lies in $\mathbb{R}^{n \times n}$. The SCM was applied to create a spatial filter for extracting EEG features [24].
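The SCM computation can be sketched as follows. This is a minimal illustration that assumes the commonly used estimator $P_i = \frac{1}{t-1} X_i X_i^T$ for a band-pass-filtered trial $X_i \in \mathbb{R}^{n \times t}$; the normalization in Equation (1) may differ by a constant factor. The function name and array layout are hypothetical choices made for this example.

```python
import numpy as np

def sample_covariance_matrices(trials):
    """Compute one SCM per trial.

    trials: array of shape (n_trials, n_channels, n_samples).
    Returns an array of shape (n_trials, n_channels, n_channels).
    Assumes each trial is already band-pass filtered (roughly zero-mean),
    so no explicit mean removal is done here.
    """
    trials = np.asarray(trials, dtype=float)
    n_samples = trials.shape[-1]
    # X_i X_i^T / (t - 1), evaluated for every trial at once
    return trials @ trials.transpose(0, 2, 1) / (n_samples - 1)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4, 200))   # 5 trials, 4 channels, 200 samples
P = sample_covariance_matrices(X)
print(P.shape)  # (5, 4, 4)
```

With many more samples than channels, as in this toy example, each resulting matrix is symmetric and positive definite, which is what the Riemannian treatment in the rest of this section relies on.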
To apply TSM, we consider the Riemannian manifold in which the SCMs lie. The space of all $n \times n$ symmetric matrices is denoted by $S(n) = \{S \in M(n),\ S^T = S\}$, where $M(n)$ is the space of square real matrices, and the set of all $n \times n$ symmetric positive-definite (SPD) matrices is denoted by $P(n) = \{P \in S(n),\ u^T P u > 0,\ \forall u \in \mathbb{R}^n,\ u \neq 0\}$. Because the SCMs are symmetric and positive definite, they lie in $P(n)$, which forms a Riemannian manifold $\mathcal{M}$ of dimension $m = n(n+1)/2$. An SPD matrix is always diagonalizable, with strictly positive eigenvalues [25]. At each point of the manifold, the tangent space is a vector space lying in $S(n)$, with dimension $m = n(n+1)/2$. The tangent space $T_{P_g}$ at the point $P_g$, which is calculated by averaging on the manifold, contains the derivatives at $P_g$.
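As a small illustration of these definitions, the sketch below checks membership in $P(n)$ numerically and evaluates the tangent-space dimension $m = n(n+1)/2$. The helper names and the tolerance are assumptions made for this example, not part of the proposed method.

```python
import numpy as np

def is_spd(matrix, tol=1e-10):
    """Return True if `matrix` is symmetric with strictly positive eigenvalues."""
    matrix = np.asarray(matrix, dtype=float)
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        return False
    if not np.allclose(matrix, matrix.T, atol=tol):
        return False
    return bool(np.all(np.linalg.eigvalsh(matrix) > tol))

def tangent_space_dimension(n):
    """Dimension m = n(n+1)/2 of the space of n x n symmetric matrices."""
    return n * (n + 1) // 2

print(is_spd(np.array([[2.0, 0.5], [0.5, 1.0]])))   # True
print(is_spd(np.array([[1.0, 2.0], [2.0, 1.0]])))   # False (eigenvalues -1 and 3)
print(tangent_space_dimension(60))                  # 1830
```

For the 60-channel recordings of BCI III-IIIa, for example, each tangent vector would have 1830 entries, which is why the later dimensionality reduction matters.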

Reference Matrices Calculation from Covariance Matrices
To apply TSM to the covariance matrices, it was necessary to calculate their centre point. Based on this central point, all data points could be mapped from non-Euclidean space (covariance data) to linear space (the tangent space). The efficiency of the TSM method was strongly dependent on the central point of the covariance matrices. We calculated different types of mean and median for selecting the centrality of the covariance matrices via Equations (2) and (3).
where $P_1, \dots, P_n$ denote the covariance matrices and $d(\cdot,\cdot)$ denotes a distance function on the SPD space, such as the Euclidean distance $d_E(\cdot,\cdot)$. Table 1 lists the functions of the state-of-the-art methods [26,27].

Table 1. The different functions for calculating the centrality of covariance matrices (columns: Function, Equation).
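Equations (2) and (3) are not reproduced in this text. A common formulation, assumed here rather than taken from the source, defines the mean and the median of a set of covariance matrices as minimizers of the summed squared distances and the summed distances, respectively, with $P$ ranging over the SPD matrices:

```latex
\begin{align}
\operatorname{mean}(P_1,\dots,P_n)   &= \arg\min_{P} \sum_{i=1}^{n} d^{2}(P, P_i), \\
\operatorname{median}(P_1,\dots,P_n) &= \arg\min_{P} \sum_{i=1}^{n} d(P, P_i).
\end{align}
```

Under this formulation, choosing the Euclidean distance $d_E$ makes the mean reduce to the arithmetic average $\frac{1}{n}\sum_{i} P_i$, while other distances generally require an iterative solution.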
In this work, we propose the MAD to calculate the central point of the covariance matrices. The MAD procedure is as follows:
Step 1: Sort the data values in ascending order, handling repeated values as necessary within the given data set.
Step 2: If the number of observations is odd, take the middle value as the median; otherwise, take the average of the two middle values.
Step 3: Calculate the deviation of each value from the median by subtracting the median from every value.
Step 4: Calculate the absolute value of each deviation.
Step 5: Sort all absolute deviations in ascending order and calculate their median according to Step 2. This median value is known as the MAD.
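The five steps above can be sketched for a list of scalar values as follows. This is a minimal illustration using only the Python standard library; the function name is hypothetical.

```python
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (Steps 1 to 5)."""
    data = sorted(values)                       # Step 1: ascending order
    if not data:
        raise ValueError("values must not be empty")
    center = median(data)                       # Step 2: middle value, or mean of the two middle values
    deviations = [x - center for x in data]     # Step 3: deviation of each value from the median
    absolute = [abs(d) for d in deviations]     # Step 4: absolute deviations
    return median(sorted(absolute))             # Step 5: median of the sorted absolute deviations

print(median_absolute_deviation([1, 1, 2, 2, 4, 6, 9]))  # 1
```

In the example, the median is 2, the sorted absolute deviations are 0, 0, 1, 1, 2, 4, 7, and their median is 1. How this scalar procedure is lifted to covariance matrices is not spelled out in the steps; applying it element-wise across trials would be one possible reading, and that reading is an assumption.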

Feature Extraction
We extracted features from the input signal that could be used as a basis for differentiating MI tasks. For this, a set of trials was collected for each section of the MI signal, the SCM of each trial was calculated, and each SCM was mapped to the tangent space using Equation (4).
Here, the m-dimensional vector $s_i$ is the normalised tangent-space representation of the covariance matrix $P_i$. In this way, all covariance matrices are mapped to Euclidean space using TSM. After mapping via TSM, we obtained a feature space S, which is a set of vectors of dimension $m = n(n+1)/2$ [22]. The TSM procedure is provided in Algorithm 1.

Algorithm 1: Tangent space mapping (TSM).
Input: a set of I SPD matrices $P_i \in P(n)$
Output: a set of I vectors $s_i$
1: Compute the Riemannian mean of the whole set, $P_G = G(P_i,\ i = 1 \dots I)$
2: for i = 1 to I do
3:   Map $P_i$ to its tangent vector $s_i$ at $P_G$ (Equation (4))
4: end for
5: return $s_i$
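A minimal NumPy sketch of Algorithm 1 is given below. It assumes the standard tangent-space mapping $s_i = \mathrm{upper}\left(\log\left(P_G^{-1/2} P_i P_G^{-1/2}\right)\right)$, where the upper-triangular vectorization weights off-diagonal entries by $\sqrt{2}$; Equation (4) is not reproduced in this text, so this form is an assumption. The reference matrix is passed in as a parameter, and the example uses the arithmetic mean purely as a stand-in for the MAD-based reference.

```python
import numpy as np

def _spd_function(matrix, func):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    return (eigvecs * func(eigvals)) @ eigvecs.T

def tangent_space_mapping(covs, reference):
    """Map SPD matrices to tangent vectors at `reference`.

    covs: array (n_trials, n, n); reference: SPD array (n, n).
    Returns an array (n_trials, n(n+1)/2).
    """
    covs = np.asarray(covs, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = reference.shape[0]
    ref_inv_sqrt = _spd_function(reference, lambda v: 1.0 / np.sqrt(v))
    rows, cols = np.triu_indices(n)
    # Off-diagonal entries get weight sqrt(2) so the Euclidean norm of the
    # vector equals the Frobenius norm of the symmetric matrix.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    features = np.empty((len(covs), n * (n + 1) // 2))
    for i, cov in enumerate(covs):
        whitened = ref_inv_sqrt @ cov @ ref_inv_sqrt
        whitened = (whitened + whitened.T) / 2   # symmetrize against round-off
        log_map = _spd_function(whitened, np.log)
        features[i] = weights * log_map[rows, cols]
    return features

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3, 50))
covs = A @ A.transpose(0, 2, 1) / 49
reference = covs.mean(axis=0)    # arithmetic mean, used only as a stand-in reference
S = tangent_space_mapping(covs, reference)
print(S.shape)  # (6, 6)
```

Mapping the reference matrix itself yields the zero vector, which is a convenient sanity check: the tangent space is centred on the reference point, so a poorly chosen reference shifts every feature vector.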

Feature Selection
To reduce the calculation time and increase the accuracy, a subset of important features was selected and passed to the classification model. To bring the dimension of the feature space in line with the number of trials in each class, the tangent-space feature matrix S can be orthogonalized using a singular value decomposition (SVD), $S = U \Lambda V^T$, as shown in Equation (5).
Here, $S \in \mathbb{R}^{d \times n}$; $U \in \mathbb{R}^{d \times d}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices; and $\Lambda \in \mathbb{R}^{d \times n}$ is a diagonal matrix holding the singular values of S. Therefore, the tangent space S can be estimated by Equation (6) using the orthogonal matrix U, which corresponds to PCA [24]. We applied PCA to the tangent-space vectors, which reduced the dimensions of the feature vectors. We then applied the one-way ANOVA method to select efficient features from the reduced vectors. All PCA components were ranked according to their p-values, and the minimum number of components was set using the weighted false discovery rate (FDR), that is, the expected ratio of false rejections to all rejections. We calculated the p-value for each trial/row from the F scores and determined the threshold value p < 0.8 based on the FDR function.
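The selection pipeline can be sketched as PCA via SVD followed by a one-way ANOVA ranking. This sketch makes two simplifications that are assumptions, not the method described above: features are stored as trials by features (the transpose of the $d \times n$ layout), and components are ranked by F score and truncated to a fixed count instead of applying the weighted FDR threshold. Because all components share the same degrees of freedom, ranking by F score gives the same order as ranking by p-value. All function names are hypothetical.

```python
import numpy as np

def pca_via_svd(features, n_components):
    """Project centered features (n_trials, n_features) onto the leading principal axes."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def anova_f_scores(components, labels):
    """One-way ANOVA F statistic of each column across the classes in `labels`."""
    components = np.asarray(components, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    grand_mean = components.mean(axis=0)
    between = np.zeros(components.shape[1])
    within = np.zeros(components.shape[1])
    for c in classes:
        group = components[labels == c]
        between += len(group) * (group.mean(axis=0) - grand_mean) ** 2
        within += ((group - group.mean(axis=0)) ** 2).sum(axis=0)
    df_between = len(classes) - 1
    df_within = len(labels) - len(classes)
    return (between / df_between) / (within / df_within)

def select_features(features, labels, n_components, n_keep):
    """Reduce with PCA, then keep the n_keep components with the highest F score."""
    components = pca_via_svd(np.asarray(features, dtype=float), n_components)
    scores = anova_f_scores(components, labels)
    keep = np.argsort(scores)[::-1][:n_keep]    # highest F score first
    return components[:, keep], keep

rng = np.random.default_rng(0)
labels = np.array([0] * 20 + [1] * 20)
X = rng.standard_normal((40, 10))
X[labels == 1, 3] += 5.0                        # one feature separates the classes
Z, keep = select_features(X, labels, n_components=5, n_keep=2)
print(Z.shape)  # (40, 2)
```

The indices in `keep` would need to be stored together with the PCA axes so that the same projection and selection can be applied to unseen test trials.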

Classification
The effective feature vector obtained from each EEG trial is passed to the classifier for the corresponding task. Identifying the class of a set of observations based on training data with available class labels was a major concern. LDA has been frequently adopted in recent studies for MI classification [28,29], and it is an effective statistical technique for EEG data classification. The main purpose of LDA is to create a predictive model of group membership that separates data representing different classes using hyperplanes [30]. This predictive model has a linear discriminant function that maximizes the ratio of between-class variance to within-class variance in a dataset. LDA deals efficiently with two-class training data under these variance assumptions. In this paper, we separated the feature data into two classes, foot and tongue, and LDA was used to assign hyperplanes that separate the feature data representing the two classes [31]. Figure 2 shows an example of the LDA projections of two classes. The algorithm projects all feature data to a new location using Equation (7).
where X is the number of class labels and W contains the projection vectors. The data are projected into a new space of (X − 1) dimensions, and all linear projections maximize the cost function of Equation (8), where m and S are defined as the mean and variance of the feature vectors.
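Equations (7) and (8) are not reproduced in this text. The sketch below assumes the standard two-class Fisher formulation, in which a feature vector $x$ is projected as $y = w^T x$ and $w$ maximizes the ratio of the squared distance between the projected class means to the summed projected class variances. The function names, the small ridge term, and the midpoint threshold (which corresponds to equal class priors) are assumptions made for this example.

```python
import numpy as np

def fit_fisher_lda(features, labels, ridge=1e-6):
    """Two-class Fisher LDA. Returns (w, threshold, classes)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    x0, x1 = features[labels == classes[0]], features[labels == classes[1]]
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # Within-class scatter; a small ridge keeps it invertible when trials are few.
    scatter = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    scatter += ridge * np.eye(features.shape[1])
    w = np.linalg.solve(scatter, m1 - m0)
    threshold = w @ (m0 + m1) / 2    # midpoint of the projected class means
    return w, threshold, classes

def predict_fisher_lda(features, w, threshold, classes):
    """Assign classes[1] when the projection exceeds the threshold, else classes[0]."""
    return np.where(np.asarray(features) @ w > threshold, classes[1], classes[0])

rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((30, 2)), rng.standard_normal((30, 2)) + 6.0])
y = np.array([0] * 30 + [1] * 30)
w, thr, classes = fit_fisher_lda(X, y)
pred = predict_fisher_lda(X, w, thr, classes)
print((pred == y).mean())
```

In the setting of this paper, the two classes would be the foot and tongue trials, and the inputs would be the selected tangent-space features rather than the synthetic points used here.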

Dataset Descriptions
We evaluated the effectiveness of our proposed method on two publicly accessible BCI-EEG datasets: the BCI Competition IV benchmark dataset IIa (BCI IV-IIa) and the BCI Competition III benchmark dataset IIIa (BCI III-IIIa).

BCI III-IIIa (binary-class)
BCI III-IIIa contains MI-EEG signals recorded during imagined movements of the left and right hands, both feet, and the tongue [32]. Data were collected from 60 channels, and each class has 60 trials. The signal was sampled at 250 Hz and filtered between 1 Hz and 50 Hz. Data were recorded from a total of three people performing MI tasks, and several runs (at least six), each including 40 trials, were performed. All data were concatenated into a single unit and stored in the general data format (GDF).

BCI IV-IIa (binary-class)
This dataset contains nine labelled EEG recordings, A01T to A09T, one for each of the nine participating subjects [33]. There were four different MI tasks, i.e., imagined movement of the left hand, the right hand, both feet, and the tongue. Two sessions (at least six runs per session) were recorded for each subject. A total of 288 trials were recorded per session, with 48 trials per run, i.e., 12 for each of the four possible classes. In this work, we considered data from five subjects (A01T, A03T, A07T, A08T, A09T) for the feature extraction and classification steps.

Experimental Evaluation, Results, and Discussion
The proposed method was applied to BCI IV-IIa and BCI III-IIIa to classify MI activity. We considered a binary classification process to differentiate between MI of feet and tongue movement. After preprocessing, a feature vector was extracted using TSM and the MI task was classified by LDA. To obtain efficient features, we calculated the central point of the covariance matrices; for the experiment, the average covariance matrix was calculated using our proposed method. Figure 3 compares the average recognition accuracies of our proposed method on all datasets against state-of-the-art methods. Table 2 lists the classification accuracy of tangent space mapping with linear discriminant analysis (TSMLDA) for different central points of the sample covariance matrices on BCI III-IIIa. The average accuracies of our proposed method for the three subjects were 93.33%, 95.83%, and 75%, respectively. The average accuracy over all subjects was 88.05%, which was better than that of the other methods. The accuracies of the feet and tongue MI tasks for each subject are listed in Table 3. From these experimental results, we found that the maximum average accuracy, 95.83%, was obtained for subject "K6B".

Table 4 illustrates the observed differences between the feet and tongue trials for BCI IV-IIa. The results showed that our proposed method calculated more efficient average points from the SCMs and achieved a better performance. The results of the final average classification were determined for 280 trials for each condition. The average accuracy was found to be 90.33% using our proposed method. Table 5 lists the label-wise accuracy of each subject; the highest accuracy, 95.83%, was obtained for subject 'A08'. These results show that the classification effectiveness of our proposed method is comparatively better than that of other methods.
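As a quick consistency check on the figures reported for BCI III-IIIa, the overall average follows directly from the three per-subject accuracies:

```latex
\[
\frac{93.33 + 95.83 + 75.00}{3} = \frac{264.16}{3} \approx 88.05\ \%
\]
```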
Our average calculation technique outperformed the other techniques, as shown in Figure 3. In Reference [34], the authors employed BCI-based EEG datasets for two-class problems with different forms of CSP experiments, including CSP [17], SBCSP [18], FBCSP [20], and the CSP-TSM method. However, only the statistical mean (arithmetic mean) version of the TSM method was reported when comparing CSP, TSM, and CSP-TSM. In our experiment, TSM methods were applied to partial sets of classes (we considered two-class problems) of BCI III-IIIa and IV-IIa. The classification performances of TSM with MAD were 3.14% and 5.4% higher than those of TSM with other reference matrices for BCI III-IIIa and IV-IIa, respectively (see Tables 2 and 4). Table 6 lists the accuracy of the proposed TSM using the MAD method compared with state-of-the-art methods on BCI III-IIIa. These results show that the proposed method increased the performance by 6.94%, 0.42%, 6.39%, 5.09%, 3.8%, and 4.53% compared to SRCSP, MDRM, HOREV-MDRM, WOLA-CSP, and TSGSP, respectively. Table 7 lists the accuracy of the proposed TSM using the MAD method compared with state-of-the-art methods on BCI IV-IIa. These results show that the proposed method increased the performance by 12.85%, 5.75%, 3.62%, 0.05%, and 6.13% compared to CSP+LDA, TLCSP1, FBCSP with LR, WOLA-CSP, and TSLDA, respectively. As a result, we can state that using the MAD-based average can increase the accuracy of MI-based BCI classification, as evaluated with the proposed method.

Conclusions
This paper showed significant improvements in the steps needed to classify MI activity using the MAD-based averaging framework. We considered the binary case of classifying MI tasks for tongue and feet movement, and two benchmark datasets were studied. We proposed a MAD strategy to address the noise and nonstationarity of EEG signals when choosing the reference point for tangent space mapping. Moreover, we compared it to state-of-the-art methods and applied the optimal number of feature dimensions to the classification steps. We proposed TSMLDA as a classifier to categorize MI activity. The experiments achieved average recognition accuracies for MI tasks of 88.05% for BCI III-IIIa and 90.33% for BCI IV-IIa. In summary, the proposed method performed more accurately than the state-of-the-art methods.

Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.