Motor Imagery Classification Based on a Recurrent-Convolutional Architecture to Control a Hexapod Robot

Abstract: Advances in the field of Brain-Computer Interfaces (BCIs) aim, among other applications, to improve the movement capabilities of people suffering from the loss of motor skills. The main challenge in this area is to achieve real-time, accurate bio-signal processing for pattern recognition, especially in Motor Imagery (MI). Meaningful interaction between brain signals and controllable machines requires instantaneous decoding of brain data. In this study, an embedded BCI system based on fist MI signals is developed. It uses an Emotiv EPOC+ Brainwear®, an Altera SoCKit® development board, and a hexapod robot for testing locomotion imagery commands. The system is tested on the imagined movements of closing and opening the left and right hand to control the robot's locomotion. Electroencephalogram (EEG) signals associated with the motion tasks are sensed over the human sensorimotor cortex. The SoCKit then processes the data to identify the commands that drive the robot's locomotion. The classification of MI-EEG signals from the F3, F4, FC5, and FC6 sensors is performed using a hybrid architecture of Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. This method exploits a deep learning recognition model to build a real-time embedded BCI system in which signal processing must be seamless and precise. The proposed method is evaluated using k-fold cross-validation on both a locally created dataset and the public Scientific-Data dataset. Our dataset comprises 2400 trials obtained from four test subjects, each trial lasting three seconds of imagined fist closing and opening. The recognition tasks reach 84.69% and 79.2% accuracy using our data and a state-of-the-art dataset, respectively. Numerical results support that motor imagery EEG signals can be successfully applied in BCI systems to control mobile robots and related applications such as intelligent vehicles.


Introduction
In the last decade, various practical applications have been developed using the electrical signals generated in the brain through Brain-Computer Interfaces (BCIs) [1,2]. In particular, Electroencephalograms (EEGs) are examples of such signals currently used to control specific devices such as service robots, motorized wheelchairs, drones, and several other human support machines. EEG signals have also found many applications in the health sciences to detect mental diseases. For instance, a multi-modal machine learning approach integrating EEG-engineered features was used to detect dementia [3]. Some neurodegenerative brain disorders such as Alzheimer's disease have been studied using EEG signals to improve disease detection [4]. Early detection of schizophrenia risks in children between 9 and 13 years old was developed using EEG patterns classified by traditional machine learning algorithms [5].
Alternatively, using BCIs in specialized applications helps people with motor disabilities move or communicate with the surrounding environment [6,7]. Azmy et al. experimented with a BCI based on brain activation in the scalp using EEG signals for a robot's remote control [8]. Likewise, Palankar et al. tested a BCI to command a mobile robot using electronic systems from different suppliers. Indeed, integrating dissimilar technologies into BCIs has led to new research applications [9]. Among the mobile platforms, hexapod robots have excellent adaptability to terrain due to their zoomorphic structure. They present a quasi-controlled balance of their entire structure [10]. Hence, the experimental BCI contributions based on hexapod robots are still growing [11].
Several BCI architectures use a central computer as the primary signal processing unit [12,13]. Nevertheless, recent works have tested more versatile processing units [14], for instance using a dedicated Raspberry Pi board [15,16], implementing a Finite Impulse Response (FIR) filter on an Advanced RISC Machine (ARM) Linux embedded environment [17], or processing biological signals on embedded systems and the Internet of Things (IoT) [18]. Recently, Belwafi et al. used an Altera Stratix ® IV Field-Programmable Gate Array (FPGA) to develop an embedded BCI system for home devices' control [19]. This trend is currently supported to meet the actual speed requirements of powerful signal processors using embedded systems [19,20].
The main challenges in the embedded BCI based on EEG signals are the classification accuracy, processing speed, low cost development, low power consumption, and hardware reconfiguration capabilities [21][22][23]. In the BCI area, the Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks were recently implemented to classify EEG signals [24,25]. Jun Yang et al. studied the dynamic correlation between the temporal and spatial representation of Motor Imagery (MI)-EEG signals applied to BCI recognition using a CNN-LSTM model and the wavelet transform [24]. Torres et al. implemented a CNN on an FPGA applying an open-source project to convert the trained network model to an executable binary library for FPGA acceleration [26].
This study concerns the development of a real-time embedded BCI system based on MI-EEG signal recognition, using the Emotiv EPOC+ headset and an Altera Cyclone® V System on a Chip (SoC) FPGA card, applied to the locomotion of a hexapod robot. MI-EEG signals are wirelessly sent to the SoCKit platform to generate the corresponding commands allowing robot displacement. For this project, the signal processing modules were integrated into the SoCKit board. MI-EEG signals for closing and opening the right and left fist were captured by the F3, F4, FC5, and FC6 sensors and classified by a CNN-LSTM. Such hand movements were defined as tasks for motion commands sent directly to the hexapod. Under controlled conditions, a local dataset was built by carefully training four subjects, two males and two females, aged 23 to 36 years. For comparison, a public database (Scientific-Data) was also used to validate the proposed embedded BCI system by selecting a hand MI task subset related to the opening and closing of the left or right fist [27]. Finally, k-fold cross-validation was used to evaluate the recognition rate of the commands used to control the hexapod locomotion.

Materials and Methods
The proposed methodology focuses on developing a BCI system by interconnecting an Emotiv EPOC+ headset, an Altera SoCKit Cyclone V SoC board, and a hexapod robot for validation. For this purpose, we created a database using basic MI movements (forward, backward, and stop) with four test subjects. The recognition process was carried out using a CNN-LSTM architecture. The operational system transforms the continuous MI-EEG signals into command instructions to control a hexapod robot's locomotion.

Proposed Framework
The EEG signal acquisition system was chosen following pragmatic criteria such as market accessibility, portability, resolution, sampling rate, compatibility, and scalability. An Emotiv EPOC+ headset consists of sixteen electrodes to be placed on the scalp according to the 10-20 international system of EEG electrode placement. Two of these sixteen electrodes are references, and fourteen are reserved for real-time capturing. On the other hand, the SoCKit module was used to process the MI signals to extract reliable robot commands. It is extensively described in Section 2.2.
In this study, robot perception and control were conceived to operate in real time. Simultaneously, data were stored on the SoCKit. The zoomorphic robot used has two degrees of freedom in each leg. In total, twelve servomotors quickly achieve static and kinematic stability. Figure 1 shows the proposed method's flowchart using an Emotiv EPOC+ for capturing EEG signals and a CNN-LSTM architecture implemented on the SoCKit to control a robot.  Figure 1. Flowchart of the proposed methodology. The EEG signals for the imagined closing-opening of the right and left fist are captured by the F3, F4, FC5, and FC6 sensors and sent to the SoCKit platform. The CNNs process the EEG data and extract feature sequences. A recurrent neural network followed by a dense layer classifies these feature sequences into robot locomotion commands.

A SoCKit Module Configuration
The SoCKit Cyclone V FPGA card, powered by an ARM Cortex® A9 Hard Processor System (HPS), was used to implement the EEG signal processing algorithms and the classifier. Figure 2 shows the basic SoCKit functional blocks. The algorithms' implementation was coded and tested under the Xillybus for SoCKit Linux distribution (Xillinux) (https://www.terasic.com.tw/wiki/images/e/ef/Xillybus_getting_started_sockit.pdf, accessed on 7 February 2021), based on Ubuntu 12.04 LTS.
Communications were established between the processor and the FPGA core by configuring the Xillybus Intellectual Properties (IPs) Core, as shown in Figure 3. Figure 3. The Xillybus IPs Core was used as a data transport mechanism and configured to interconnect the processor core with the FPGA. The primary control signals include the write enable (wr-en), read enable (rd-en), and FIFO full enable (full-en). Adapted from Xillybus Ltd.
Emotiv EPOC+ data are written into a First-In, First-Out (FIFO) buffer whenever it is empty (rd-en enabled). After reading the data, Xillybus communicates with the processor core over the Advanced eXtensible Interface (AXI) bus, generating Direct Memory Access (DMA) requests on the central CPU bus. Simultaneously, the low-level FIFO (FPGA) is released (the full-en signal goes low), and Xillybus carries the data from the processor core to the FPGA to control the hexapod. The project's Xillybus IPs Core was designed to use four FIFOs, two for reading and two for writing data. Each FIFO was configured with a 32 bit data width, a data transmission latency of 5 ms, a bandwidth of 10 MB/s, and the buffering time set to autoset. By configuring the buffering time to autoset and specifying the planned period of maximal processor deprivation, the FPGA is internally forced to control the buffer RAM distribution for continuous reading and writing operations. The RAM size required for the DMA buffers' flow is given by:

RAM = t × BW,

where t is the buffering time and BW is the expected data bandwidth. For reading, all FIFOs must be empty and the enable signals (rd-en) activated. EEG data can then fill the FIFOs until they are all full and the empty signal is disabled. Since the FIFOs work with 32 bit words and the Emotiv EPOC+ device has a 14 bit resolution, a zero-padding operation was applied to each sample at the Most Significant Bit (MSB) positions. As in the previous procedure, writing is enabled (wr-en at the high level) when all write FIFOs are empty (low level). A finite state machine was therefore designed to control the FIFOs' filling and emptying processes.
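As a sanity check of these settings, the buffer sizing rule and the zero-padding step can be sketched in a few lines of Python (the function names are illustrative; the actual implementation runs in Verilog/ANSI-C on the SoCKit):

```python
def dma_buffer_bytes(t_s, bw_bytes_per_s):
    """RAM required for the DMA buffers: RAM = t * BW."""
    return t_s * bw_bytes_per_s

def pack_sample(raw14):
    """Zero-pad a 14 bit Emotiv EPOC+ sample into a 32 bit FIFO word (MSBs stay zero)."""
    if not 0 <= raw14 < (1 << 14):
        raise ValueError("sample outside the 14 bit range")
    return raw14 & 0x0000_3FFF

def unpack_sample(word32):
    """Recover the original 14 bit sample from a 32 bit FIFO word."""
    return word32 & 0x3FFF
```

As an example, a hypothetical 5 ms buffering time at the configured 10 MB/s bandwidth would require 50 kB of buffer RAM; the actual buffering time is determined by the autoset mechanism.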
The EEG signal reading, processing, and classification algorithms were written in Python, the Verilog language, ANSI-C (Nios® II Embedded Design Suite), and the Open Computing Language (OpenCL standard) [28]; all were tested and evaluated on the SoCKit. Table 1 summarizes the SoCKit resources used in the implemented experiments. The hexapod is driven by the SSC-32 channel servo controller (Figure 4), which accepts pulse widths from 0.50 to 2.50 ms and is connected through a USB-TTL adapter (USB to UART converter module). FPGA outputs were wire-connected to the hexapod servo-control board. The Central Pattern Generator (CPG), based on discrete-time neural networks, was adapted to move the hexapod robot [29].
The locomotion law defined by the CPGs, derived from the discrete-time spiking neuronal model [30], is mathematically described by:

V_i[k] = γ V_i[k−1] (1 − Z_i[k−1]) + Σ_j W_ij Z_j[k−1] + I_i^ext,
Z_i[k] = H(V_i[k] − θ),

where Z_i is the firing state of the ith neuron at time k, V_i is the membrane potential, W_ij are the synaptic influences (weights), I_i^ext is the external current, and γ is a dimensionless parameter. Z_i[k] is defined as a thresholded Heaviside function H with firing threshold θ.
Moreover, considering that twelve servomotors control the hexapod movements, twelve neurons were required in this model; the external current was not needed (i.e., I_i^ext = 0), and γ = 1 was set to emulate a linear integrator.
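Under these settings (twelve neurons, I_i^ext = 0, γ = 1), one update step of the CPG network can be sketched in NumPy as follows. This is a minimal sketch: the firing threshold θ and the weight values used here are illustrative assumptions, not the values deployed on the robot:

```python
import numpy as np

def cpg_step(V, Z, W, gamma=1.0, I_ext=0.0, theta=0.0):
    """One discrete-time update of the spiking CPG network:
    V[k] = gamma * V[k-1] * (1 - Z[k-1]) + W @ Z[k-1] + I_ext
    Z[k] = H(V[k] - theta), with H the Heaviside step function.
    """
    V_new = gamma * V * (1.0 - Z) + W @ Z + I_ext
    Z_new = (V_new >= theta).astype(float)   # thresholded Heaviside firing state
    return V_new, Z_new

# Twelve neurons, one per servomotor, with I_ext = 0 and gamma = 1 as in the text.
rng = np.random.default_rng(0)
W = rng.uniform(-1.0, 1.0, (12, 12))   # synaptic weights (illustrative values)
V, Z = np.zeros(12), np.zeros(12)
for _ in range(5):
    V, Z = cpg_step(V, Z, W)
```

Each binary firing vector Z can then be mapped to the on/off pulse pattern of the twelve servomotors.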

BCI Dataset
The test subjects provided written consent to capture the EEG signals after carefully reading the experimental protocol to protect confidentiality. The specialized equipment used in the experiment was entirely commercial, not presenting any potential risk to the participants. Seven subjects were initially selected for the training process, and after addressing the defined paradigm, they followed an individual schedule.
Before and during each training session, the Emotiv Software Development Kit (Emotiv Xavier) monitored the subject's cognitive and emotional performances [15]. Hence, a dataset was created selecting four test subjects between 23 and 36 years old, trained and supervised to collect signals during several experimental tasks lasting three seconds each. According to the given task, subjects were instructed to stay still during the capture and invited to imagine closing and opening the right or the left fist focused on a stimulus video ( Figure 5).

The stimulus video of the fist closing-opening movements (Figure 5, from the starting task to the ending task) was played on the screen according to the temporal task sequence shown in Figure 6.
In the capture sequence, the first five seconds served to prepare the test subject, and this phase ended with the audible Beep 1, followed by Task 1, related to the left fist MI action. The Beep 2 tone concludes this period and marks a pause of 3 s. Beep 3 signals the end of this static period and starts a second preparation phase of 2 s. Beep 4 starts Task 2, related to the right fist MI task, ending with Beep 5. The developed dataset therefore consists of 2400 trials performed by four subjects (600 trials from each subject), representing 2400 × 19 s (12.67 h) of data capture. From each session, only the 3 s of signals corresponding to Task 1 (left fist MI), the 3 s of Task 2 (right fist MI), and the 3 s of neutral action were gathered to build the dataset.
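The stated dataset figures can be checked with a few lines of arithmetic; attributing the remaining 3 s of each 19 s sequence to the neutral segment is an assumption made here, since the text does not place it explicitly in the timeline:

```python
# One capture sequence (Figure 6): 5 s preparation, 3 s Task 1 (left fist MI),
# 3 s pause, 2 s preparation, 3 s Task 2 (right fist MI); the remaining 3 s are
# assumed here to be the neutral segment, giving the 19 s total stated in the text.
session_s = 5 + 3 + 3 + 2 + 3 + 3
useful_s = 3 + 3 + 3                      # Task 1 + Task 2 + neutral kept per trial
trials = 2400                             # 600 trials from each of the four subjects
total_hours = trials * session_s / 3600   # full capture time
useful_hours = trials * useful_s / 3600   # data actually kept in the dataset
```

The totals match the 12.67 h of capture stated here and the six hours of retained signal patterns reported in the evaluation.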

Data Preprocessing
Signals from the F3, F4, FC5, and FC6 sensors were processed in the MI recognition process [31,32]. Such sensors were located in the rear portion of the frontal lobe, as shown in Figure 7. The Emotiv EPOC+ headset was configured with three filters: a low-pass filter with a cutoff frequency at 85 Hz, an operational bandwidth between 0.16 and 43 Hz, and a band-rejection filter with a stop-band between 50 and 60 Hz [15,33]. According to the International EEG Waveform Society, the project paradigm is based on the mu rhythm processing, which occupies frequencies between 8 and 12 Hz [34]. The mu rhythm is the most used pattern in BCI systems considering the nature of the MI movements [35,36]. Thus, the mental imagery of body members' mobility can be perceived through the mu rhythm variations at the sensorimotor cortex, avoiding any real movement of the body limbs [37]. Lotze et al. determined that the left and right hands' physical movements cause an Event-Related Desynchronization (ERD) of the mu rhythm power, captured in different motor cortex areas [38]. Consequently, the F3 and FC5 sensors were selected for the left hemisphere, whereas F4 and FC6 for the right hemisphere on the sensorimotor cortex. Such a choice takes into account the sensor's closeness to the primary motor cortex location associated with the imagined and physical movements of the left and right hands [31,32].
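As an illustration of the mu-rhythm paradigm, the sketch below estimates band power over one 3 s window at the Emotiv EPOC+ rate of 128 SPS on a synthetic signal. The FFT-based estimator is our choice for illustration, not the processing performed on the SoCKit:

```python
import numpy as np

def band_power(x, fs, f_lo=8.0, f_hi=12.0):
    """Mean spectral power of x inside [f_lo, f_hi] Hz (the mu band by default)."""
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return spectrum[mask].mean()

fs = 128                                # Emotiv EPOC+ sampling rate (SPS)
t = np.arange(3 * fs) / fs              # one 3 s MI window (384 samples)
x = np.sin(2 * np.pi * 10 * t)          # synthetic 10 Hz mu-band component
x += 0.1 * np.sin(2 * np.pi * 30 * t)   # weaker out-of-band component
```

For this synthetic window, the mu band (8-12 Hz) power clearly dominates the 25-35 Hz band; an ERD event would appear as a drop of this mu-band power during imagined movement.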

MI-EEG Signals' Classification Based on a CNN-LSTM Architecture
Recurrent neural networks (e.g., LSTM networks) are composed of memory units that temporarily store information [39]. Such a network's layer structure is not unique because the interconnections between neurons are not based on a transferable (mutable) logic. Feature extraction and classification of the EEG signals are done by combining two neural schemes, the CNN and the LSTM. Figure 8 presents the CNN-LSTM architecture integrated into the SoCKit to decode robot commands. The overall network consists of a sequence of layers: a convolutional layer (CNN1), an LSTM layer (LSTM1), a convolutional layer (CNN2) followed by a max-pooling layer, a convolutional layer (CNN3), an LSTM layer (LSTM2), and a dense layer. A 384 × 4 matrix was applied as the input to the CNN1 layer, which performed 32 convolutions with 3 × 3 kernels. In each convolutional layer (CNN1, CNN2, and CNN3), the padding parameter was configured so that input and output data keep the same temporal dimensions. Weights were initialized according to a uniform distribution using the He initialization algorithm [40]. Dropout was applied with rates of 0.4, 0.2, 0.2, and 0.1 for the CNN1, CNN2, CNN3, and LSTM2 layers, respectively. According to the deep learning software interface Keras [41], for a dropout rate of 0.1, 10% of the neurons are zeroed out during the training phase, which reduces overfitting (overtraining).
On the other hand, the LSTM layers contain 32 and 150 cells, respectively, and receive feature matrices from the convolutional layers for processing. The model was implemented in Keras and TensorFlow using the categorical cross-entropy loss function to evaluate the error between the estimated outputs and the ground truth. The network was trained for 8000 epochs to reach the maximum accuracy, using the Nesterov-accelerated Adaptive Moment Estimation (NADAM) optimizer with a batch size of 512. A cyclical learning rate with a step size of nine and minimum and maximum learning rates of 0.000001 and 0.0005, respectively, was used to speed up training [42]. The convolutional layers used the leaky Rectified Linear Unit (ReLU) as the activation function with α = 0.005, which yields a small non-zero gradient when a neuron has a negative net input. The leaky ReLU activation function f(α; x) is defined by:

f(α; x) = x if x > 0, and f(α; x) = αx otherwise,

where α is a small positive constant [43]. SoftMax was used as the activation function of the fully connected layer following the LSTM2 layer to normalize the outputs, such that they may be interpreted as class probabilities [44]. Table 2 depicts the principal parameters of the proposed network model. The convolutional layer CNN1 has only 416 parameters, while the CNN2 and CNN3 layers have 3104 parameters each. It must be highlighted that CNNs do not require a specially designed feature extraction stage because they perform adaptive feature extraction directly on the raw input data. In total, 125,197 parameters were required for all layers. The output used a fully connected layer with SoftMax as the activation function, which produces the three class probabilities. During training, the neuron weights were randomly initialized using the He initialization algorithm.
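A few of the stated quantities can be reproduced in NumPy. The parameter counts of 416 and 3104 are consistent with 1-D convolutions along the 384-sample time axis (3 s × 128 SPS) whose kernels span 3 time steps over the 4 input channels; this 1-D reading of the kernels is our assumption. The activation functions follow the definitions above:

```python
import numpy as np

def leaky_relu(x, alpha=0.005):
    """Leaky ReLU: identity for positive inputs, alpha * x otherwise (alpha = 0.005)."""
    return np.where(x > 0, x, alpha * x)

def softmax(z):
    """Normalize the dense-layer outputs into class probabilities."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def conv1d_params(kernel, in_ch, filters):
    """Trainable parameters (weights + biases) of a 1-D convolution layer."""
    return kernel * in_ch * filters + filters

window_len = 3 * 128          # 3 s at 128 SPS -> the 384-sample input length
cnn1 = conv1d_params(3, 4, 32)    # 4 input channels: F3, F4, FC5, FC6
cnn2 = conv1d_params(3, 32, 32)
cnn3 = conv1d_params(3, 32, 32)
```

Under this reading, cnn1 yields 416 parameters and cnn2/cnn3 yield 3104 each, matching Table 2.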

Experimental Results
The proposed method's performance was evaluated according to the operative interconnection of the Emotiv EPOC+ headset, the SoCKit board, and the hexapod robot. It is worth noting that recognition algorithms were integrated into an embedded BCI system. MI-EEG recognition was achieved by implementing a CNN-LSTM architecture and creating, training, and validating an EEG dataset. Figure 9 shows the servomotor location and associated nomenclature, as well as the diagram of the hexapod locomotion sequence.   Figure 9b illustrates the repeating pattern sequences used to control the servomotors, which are integrated to move the hexapod synchronously. MI-EEG signals for closing-opening the right and left fists were processed by the SoCKit and transferred to the hexapod servomotors as commands to move forward, backward, and stop.

Qualitative Evaluation
The dataset was constituted by signals captured from the four chosen subjects, who were trained to reproduce each task pattern until they became familiar with the experience. The Emotiv Xavier interface evaluated the subjects' mental state metrics before each training and capture session [45], using cognitive, expressive, affective, and inertial sensors.

Quantitative Evaluation
A total of 2400 sessions were validated with the four test subjects over several repetitions. This process allowed obtaining representative samples for the training and validation datasets: 2160 captures were used as training data and 240 captures as validation data. The number of classes was three (i.e., forward, backward, and neutral). Moreover, stratified k-fold Cross-Validation (CV) with k = 10 was used to evaluate the system performance. Table 3 summarizes the dataset structure split into training and validation patterns. The convolutional layer CNN1 used 416 parameters, while the CNN2 and CNN3 layers had 3104 parameters each; in total, 125,197 parameters were necessary for all layers.
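The training/validation split follows directly from the 10-fold partition; the per-class balance shown below assumes the three classes are evenly represented across the 2400 captures:

```python
# Stratified 10-fold split of the 2400 labeled captures, assuming the three
# classes (forward, backward, neutral) are balanced across the dataset.
n_total, n_classes, k = 2400, 3, 10
fold_size = n_total // k                          # validation captures per fold
train_size = n_total - fold_size                  # training captures per fold
per_class_per_fold = (n_total // n_classes) // k  # class balance inside each fold
```

Each fold therefore holds 240 validation captures (80 per class) against 2160 training captures, matching the split reported above.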

EEG Signals' Dataset
The highlighted parameters in the created dataset were age, gender, and the capture sequence duration. The dataset features are summarized in Table 4. The subjects required significant mental focus on the stimulus to produce reliable signals, which was achieved through extensive experimentation that eased their adaptation to the experience.

Model Evaluation
The proposed framework's evaluation considers the accuracy of the hexapod movement commands according to the three predefined tasks identified from the MI signals. At the same time, the designed BCI must optimize real-time signal processing. The neural network training was repeated five times, achieving an average accuracy of 84.69% for the three experimental predefined tasks. The highest accuracy of 87.6% was obtained in the sixth k-fold iteration, whereas the lowest accuracy of 81.88% was found in the second k-fold iteration. The created database provides signal patterns lasting six hours collected from four test subjects. Table 5 shows the classification accuracy for each test subject, while Table 6 summarizes the accuracy reached with different subject combinations. This study was additionally evaluated with the Scientific-Data dataset [27]. Scientific-Data gathers five EEG signal paradigms, captured with the Nihon Kohden Neurofax EEG-1200 electroencephalograph and the JE-921A amplifier. A subset of the Scientific-Data signals was selected to evaluate the proposed BCI, based on left- and right-hand motor imagery of closing and opening the respective fist once, defined as Paradigm #1 [27]. After imagining such actions, the participant remained passive until the next action signal was presented. Table 7 shows that the average accuracy achieved for the three-task classification was 79.2% using the evaluated Scientific-Data set. Table 7. Accuracies achieved with our dataset and the Scientific-Data dataset.

Reference | Dataset Type | Brain Signals | Accuracy
Proposed method | Our dataset (4 subjects) | MI-EEG | 84.69%
Scientific-Data [27] | Public dataset (7 subjects) | MI-EEG | 79.2%

It is worth mentioning that the proposed network model was trained on a computer with an NVIDIA GeForce® GTX 1080 GPU. The descriptive files (weight files, module files) were then migrated to the SoCKit card to optimize the EEG signal processing latency. Using the FIFO configuration and the Emotiv EPOC+ rate of 128 Samples Per Second (SPS), the embedded system took approximately 0.750 s to decide whether the present samples referred to the right fist, the left fist, or neither. The processing time with the Scientific-Data dataset (sampled at 200 SPS) was evaluated as 0.279 s with signals from the C3, C4, and Cz sensors.
Moreover, two closely related embedded BCI approaches [19,20] were compared with the proposed method in Table 8, including the signals' processing time and the number of channels. This work's contributions are therefore the development of an MI-EEG dataset, a CNN-LSTM model implemented on the FPGA SoC board for real-time signal processing, and the implementation of an embedded BCI architecture integrating different technologies.

Discussion
The proposal was implemented on the Altera FPGA SoC environment, building the respective hardware design modules shown in Figure 1. A mechanism to read EEG signals in real time on the SoCKit board was designed with a delay limit of about 10 ms for the Emotiv EPOC+ capture. A proper buffer module synchronization guaranteed the regularity of data being sent to the processor, where the available memory (DMA) was dynamically allocated.
Quartus II Version 13.0 was used to embed the project's descriptive architectures, ensuring the processing and classification of motor imagery EEG signals. Real-time signal processing starts when the Emotiv EPOC+ system sends EEG signals at 128 samples per second to the SoCKit card. The Nios II IDE cache memory was preloaded with the designed neural network parameters, so the SoCKit can instantly convert classified EEG signal features into commands to move the hexapod. Moreover, the SoCKit USB 2.0 On-The-Go (OTG) port was used to connect the Serial Servo Controller SSC-32 board through the USB to TTL serial converter module.
Note that the SoCKit board was physically embedded in the hexapod, and they moved together. Table 5 shows the variability of the accuracy results across subjects, confirming intrinsic differences among EEG signal characteristics [46], in addition to how faithfully each subject reproduced the tasks. In Table 6, different combinations of the subjects' signals are given to assess the classification accuracy as the dataset size varies. A test accuracy of 85.7% was reached by combining the two signal groups of the CD and EF subjects, while combining the three signal groups of the CD, EF, and GH subjects achieved the highest score of 86.1%.

Conclusions
This study presents the development of an embedded BCI system based on EEG signals corresponding to imagined fist movements, captured with an Emotiv EPOC+ headset and processed in real time on the SoCKit Cyclone V SoC card to control a hexapod robot. The designed framework allows controlling the forward and backward movements of the hexapod using two MI tasks, closing and opening the right fist and closing and opening the left fist, together with the neutral (reference) action.
Likewise, an MI-EEG dataset is created. Other hexapod locomotion modalities, including turning right or left, running, sitting, or climbing, are planned for future work, considering this framework as the first experimental and incremental approach. MI-EEG signal recognition was carried out using a hybrid CNN-LSTM. By using stratified 10-fold cross-validation, the average task accuracy is determined as 84.69%.
The digital logic design guarantees adequate functionality in the integral transmission of data. Hence, the FIFOs communicating with the FPGA outputs are implemented at a 32 bit word length, running at 10 MB/s. The task recognition delay on the SoCKit is estimated at 755 ms, with about 500 ms for executing the hexapod movements, including the intrinsic delays of the SSC-32 V2.0 card. This study demonstrates the active and accurate locomotion of a hexapod robot, exploiting EEG brain signals captured by an Emotiv EPOC+ headset and processed by a SoCKit card using a pre-trained CNN-LSTM as a classifier. The research perspectives of this study primarily include building a larger and more robust database with more sophisticated sensing equipment, increasing the number of subjects, and adding modular tasks to control the mobile robot. Likewise, the proposed methodology could be straightforwardly applied using EEG signals to control assistive devices. In such a context, an EEG-based BCI for a wheelchair will be analyzed to support human mobility.

Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki for procedures involving human participants. Ethical review and approval are waived for this kind of study. In the EEG signals' capture, no invasive procedures were involved, nor were special biodata captured that could be used to identify the participants.

Informed Consent Statement:
Formal written consent to capture EEG signals from participants is available on demand.