Article

A Novel Deep Learning Model for Motor Imagery Classification in Brain–Computer Interfaces

1
Key Laboratory of Nondestructive Testing, Fujian Polytechnic Normal University, Fuqing 350300, China
2
Faculty of Humanities and Arts, Macau University of Science and Technology, Avenida Wai Long, Macau 999078, China
3
Faculty of Innovation Engineering, Macau University of Science and Technology, Avenida Wai Long, Macau 999078, China
*
Authors to whom correspondence should be addressed.
Information 2025, 16(7), 582; https://doi.org/10.3390/info16070582
Submission received: 30 May 2025 / Revised: 1 July 2025 / Accepted: 4 July 2025 / Published: 7 July 2025

Abstract

Recent advancements in decoding electroencephalogram (EEG) signals for motor imagery tasks have shown significant potential. However, the intricate time–frequency dynamics and inter-channel redundancy of EEG signals remain key challenges, often limiting the effectiveness of single-scale feature extraction methods. To address this issue, we propose the Dual-Branch Blocked-Integration Self-Attention Network (DB-BISAN), a novel deep learning framework for EEG motor imagery classification. The proposed method includes a Dual-Branch Feature Extraction Module designed to capture both temporal features and spatial patterns across different scales. Additionally, a novel Blocked-Integration Self-Attention Mechanism is employed to selectively highlight important features while minimizing the impact of redundant information. The experimental results show that DB-BISAN achieves state-of-the-art performance. Also, ablation studies confirm that the Dual-Branch Feature Extraction and Blocked-Integration Self-Attention Mechanism are critical to the model’s performance. Our approach offers an effective solution for motor imagery decoding, with significant potential for the development of efficient and accurate brain–computer interfaces.

1. Introduction

The brain–computer interface (BCI) is a rapidly evolving technology that establishes an efficient interaction bridge between the human brain and external environments. This technology converts and interprets neural activity data to enable the perception of external stimuli or the control of external devices. Owing to safety considerations and other factors, EEG signals are extensively utilized in BCI applications. After the collection device captures EEG signals, researchers can achieve various behavioral interactions through EEG information [1,2,3], such as wheelchair control [4], for which the basic processing flow is shown in Figure 1.
Motor imagery based on EEG signals, a technology that does not rely strictly on external stimuli, has demonstrated significant potential in various applications. The recognition and control of motor intentions are achieved by internally simulating different motor scenarios in the brain, which are then captured by the corresponding EEG signals [5,6,7,8]. Leveraging this capability, individuals with mobility impairments or muscle degeneration can effectively use motor imagery alone in daily interventions. This approach adds convenience to their daily lives and substantially enhances their ability to live independently [9]. However, the complex and non-stationary nature of EEG data introduces considerable challenges to the analysis process. Consequently, designing efficient systems that can accurately extract meaningful information from EEG signals to enhance the classification performance of motor imagery tasks is increasingly becoming a focus of research [10].
The rapid advancement of deep learning has significantly enhanced the analysis of EEG motor imagery signals. Unlike traditional machine learning methods, deep learning frameworks can automatically extract intricate, high-dimensional features from complex EEG data, demonstrating superior performance [11]. For instance, Shiam et al. [12] proposed an entropy-based approach that identifies and selects effective EEG channels, thereby enhancing classification accuracy and reducing computational complexity. Similarly, Xie et al. [13] introduced a two-branch parallel CNN model that integrates continuous wavelet transform and Common Spatial Patterns, achieving improved classification accuracy. Despite these advancements, existing methods typically treat feature extraction and classification as separate stages, lacking an end-to-end analysis process for motor imagery tasks [14]. Hwaidi et al. [15] proposed a deep autoencoder (DAE) combined with a CNN for classifying EEG motor imagery signals, an end-to-end approach that outperforms other methods. Because of the temporal characteristics of EEG signals, some methods integrate a CNN with temporal models to improve the accuracy of EEG motor imagery classification; for example, Khademi et al. [16] extract spatial and temporal information using a CNN and an LSTM. Meanwhile, Transformer-based models have also been adopted to model long-range dependencies in EEG signals, with the attention mechanism used to capture global contextual information [17,18]. While multi-scale network architectures and attention mechanisms have shown promise in enhancing model performance, each approach presents its own set of limitations [19]. Existing methods often fail to extract sufficiently comprehensive and detailed features from EEG signals, which limits their ability to characterize these signals. Moreover, traditional attention mechanisms tend to introduce many redundant features that interfere with capturing key discriminative information, degrading both classification performance and computational efficiency.
This paper proposes a Dual-Branch Blocked-Integration Self-Attention Network (DB-BISAN) to address these challenges. Initially, a dual-branch network architecture is employed to extract multi-scale spatio-temporal features from EEG signals. Then, the Blocked-Integration Self-Attention Network is designed to enhance the model’s performance by capturing the long-term dependencies. This module effectively helps capture the unique temporal and spatial characteristics of EEG signals.
The contributions of this article are as follows:
  • A Dual-Branch Feature Extraction Module is introduced to capture multi-scale spatio-temporal features, thereby addressing the limitations of traditional methods in extracting informative representations from EEG motor imagery signals.
  • A Blocked-Integration Self-Attention Mechanism is proposed to improve model efficiency while preserving essential spatial dependencies within EEG data, resulting in more efficient and discriminative feature representations.
  • An end-to-end framework, DB-BISAN, is developed for motor imagery classification in brain–computer interface applications, achieving superior classification accuracy on the BCI Competition IV 2a [20] and 2b [21] datasets.
The rest of the paper is organized as follows: Section 2 introduces the background of EEG motor imagery; Section 3 presents the proposed DB-BISAN; Section 4 conducts experiments and an analysis of the DB-BISAN; Section 5 and Section 6 discuss and summarize the DB-BISAN, respectively.

2. Related Work

2.1. EEG Motor Imagery

In recent years, convolutional neural networks (CNNs) have been extensively employed by researchers to study motor imagery based on EEG signals. These CNNs can primarily be categorized into single-branch and multi-branch architectures. Single-branch CNN models typically rely on temporal or spatial convolution to extract features from EEG signals. However, due to EEG data’s complexity and non-stationary nature, a single convolutional operation often fails to capture the key information contained within the signals. To address this limitation, multi-branch CNNs, which utilize convolutional kernels of different scale sizes to extract information at multiple resolutions from EEG signals, have garnered significant attention [22].
Advanced machine learning algorithms have emerged as pivotal methods in EEG motor imagery, offering robust tools and innovative methodologies for analyzing and comprehending complex cognitive processes. These algorithms typically involve meticulously designed feature extraction procedures, followed by classifiers to categorize specific mental states [23]. Prominent feature extraction techniques include Common Spatial Patterns (CSPs) [24], Principal Component Analysis (PCA) [25], and entropy-based methods [26]. For instance, Ang et al. introduced the Filter Bank Common Spatial Pattern (FBCSP), an extension of the CSP that enhances feature extraction performance by applying CSPs across multiple frequency bands and subsequently selecting the most relevant features [27]. Similarly, Wu et al. leveraged Riemannian geometry to enhance EEG signal quality by employing spatial filters and extracting features from the Riemannian tangent space after reducing the covariance dimension [28]. Despite significant advancements by machine learning approaches in traditional motor imagery tasks, physiological signals like EEG are frequently compromised by noise interference and inherent variability among individuals, making it challenging to obtain high-quality data. These data-related challenges hinder effective feature extraction and constrain the discriminative capabilities of feature-based models.
The rapid advancement of deep learning has further revolutionized the analysis of EEG motor imagery, offering superior performance by automatically extracting high-dimensional and intricate features from complex EEG signals. Chen et al. identified that conventional Filter Bank Common Spatial Pattern (FBCSP) approaches fail to consider the time-domain characteristics of EEG signals. To address this limitation, they proposed a deep learning method based on shared spatial patterns across filter banks, which enhances the classification performance of motor imagery tasks [29]. Similarly, Luo et al. employed the Filter Group CSP (FG-CSP) technique to extract spatial frequency information. They integrated Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models to capture temporal dynamics, achieving robust performance on two motor imagery datasets [30]. However, existing deep learning methods typically treat feature extraction and classification as separate stages, lacking an end-to-end analysis process for motor imagery tasks.
To address this limitation, Schirrmeister et al. proposed two convolutional neural network (CNN) architectures that achieved classification accuracies of 73.7% and 70.7%, respectively, by effectively extracting temporal and spatial information from EEG signals [31]. Similarly, Lawhern et al. developed a compact CNN architecture to classify EEG signals across various brain–computer interface paradigms [32]. Zhang et al. introduced a hybrid CNN architecture that integrates temporal and spatial EEG information, demonstrating satisfactory recognition accuracy in their experiments [33]. To fully leverage the temporal information inherent in EEG signals, researchers have explored combining CNNs with temporal prediction models. For instance, Garcia Moreno et al. proposed a deep learning model that fuses a CNN with Long Short-Term Memory (LSTM) networks, achieving a recognition accuracy of 87% in their experiments [34]. Additionally, Wang et al. developed a two-dimensional CNN-LSTM model, where features are first extracted from different EEG channels using a 2D CNN and then fed into an LSTM for training. This approach showed notable improvements compared to traditional one-dimensional CNN models [35]. Despite the promising performance of CNN-based models in motor imagery tasks, single-scale convolutional architectures often fail to capture the full spectrum of EEG information, resulting in limited discriminative capabilities.
Current research in EEG motor imagery focuses on two primary directions to efficiently extract information from EEG signals. On the one hand, researchers adopt attention-based approaches to filter redundant EEG signal features and improve analysis accuracy. Attention mechanisms are widely used in visual tasks, enabling models to focus on more relevant parts of the input data [36,37]. In the context of EEG motor imagery, attention mechanisms aim to screen out the most relevant features in EEG signals for motor imagery tasks and suppress irrelevant and redundant information [38]. Altuwaijri et al. integrated the Squeeze and Excitation (SE) Attention Block methods into EEGNet to selectively screen the most relevant features through channel correlation [39]. Additionally, with the advent of self-attention mechanisms, Xie et al. employed the Transformer architecture, leveraging self-attention mechanisms for EEG motor imagery, and enhanced overall EEG classification performance by incorporating position embedding modules [40]. Similarly, Amin et al. developed a neural network inspired by attention mechanisms and a lightweight deep learning model composed of Long Short-Term Memory (LSTM) networks to achieve high recognition accuracy [41].
On the other hand, researchers have focused on extracting meaningful and effective features from EEG signals at multiple scales by exploring multi-scale convolutional neural networks (CNNs). Riyad et al. introduced the end-to-end Adaptive Multi-Scale Integration (AMSI) EEGNet, which extracts robust features from EEG signals through multi-scale input features for motor imagery classification [42]. Wang et al. employed multi-scale temporal and frequency convolutions to extract time–frequency features, achieving efficient feature fusion for motor imagery classification [43]. Additionally, Li et al. developed a multi-scale residual network for EEG motor imagery classification, demonstrating robust decoding capabilities and reduced system time costs [44]. These multi-scale CNN approaches enhance the ability to capture comprehensive EEG information, improving classification performance in motor imagery tasks.

2.2. Attention Mechanism

Attention is a complex and advanced cognitive function in humans. Typically, individuals cannot process all received information simultaneously and instead focus on specific local information. This selective focus allows people to allocate more cognitive resources to the areas that require attention. Inspired by this cognitive mechanism, deep learning has achieved significant breakthroughs by integrating attention mechanisms into model designs. Researchers are increasingly exploring how to incorporate human-like attention into neural network architectures to enhance models’ representational capacity and performance. For example, Zeiler et al. introduced a novel visualization technique that reveals which features a given layer responds to, providing deeper insight into the functionality of intermediate feature layers [45]. Yang et al. introduced a hierarchical attention network and visualized its attention layers, finding that the model could select the most informative content [46]. Chen et al. designed a network that integrates spatial and channel attention mechanisms for image captioning tasks; the network utilizes channel attention to select the most efficient feature channels while preserving and enhancing key features [47].
Furthermore, researchers have revisited attention mechanisms and proposed self-attention. The Transformer has achieved outstanding results by modeling long-range dependencies and computing attention through queries, keys, and values, which can help extract more discriminative EEG features across both the spatial and temporal dimensions. Chen et al. implemented efficient sleep-stage classification using a capsule network and an LSTM, supplemented with a self-attention mechanism [48]. Similarly, Zhang et al. combined convolutional neural networks (CNNs) with a self-attention mechanism and a time–frequency common spatial pattern model to classify motor imagery states [49]. Zhang et al. [50] also utilized the Transformer’s self-attention and cross-attention mechanisms for feature extraction, and Chaudhary et al. [51] presented a two-stage Transformer-based network for EEG motor imagery classification. Although these studies validate the potential of self-attention mechanisms in EEG feature modeling, self-attention still faces challenges such as computational complexity and feature redundancy, underlining the need for more targeted and efficient designs.
Leveraging the unique advantages of self-attention mechanisms in capturing internal data relationships, this paper aims to introduce such methods to EEG signal processing. By optimizing the parsing and recognition performance of EEG signals, the proposed model seeks to enhance both accuracy and robustness.

3. Method

3.1. Model Overview

Motor imagery based on EEG signals inherently encompasses rich temporal and spatial information. However, the majority of existing studies employ single-branch architectures for feature extraction. Such single-branch networks are limited to learning only one aspect of the data and fail to capture the dynamic neural changes that occur during motor imagery, which often results in suboptimal recognition accuracy. To overcome these challenges, the Dual-Branch Blocked-Integration Self-Attention Network (DB-BISAN) is proposed for EEG motor imagery. Building upon a dual-branch convolutional neural network architecture, DB-BISAN effectively captures multiple aspects of EEG data by simultaneously processing diverse feature representations, enhancing its ability to recognize and interpret complex motor imagery patterns with higher accuracy. Moreover, compared with mainstream deep learning methods that use fixed weighting, DB-BISAN incorporates an automated channel weighting mechanism based on the proposed Blocked-Integration Self-Attention Module. This mechanism adaptively emphasizes the most informative EEG channels, which helps achieve improved generalization across subjects.
Firstly, by employing dual convolutional operations, the proposed model effectively extracts information from the temporal and spatial domains of EEG motor imagery signals. Moreover, to address the issues of insufficient information fusion and feature redundancy in dual-branch network structures, an innovative Blocked-Integration Self-Attention Module is introduced. The channel-splitting strategy within this module allows the network to isolate and encode information from spatially relevant brain areas, while the integrated self-attention mechanism adaptively highlights discriminative motor imagery features and suppresses redundant patterns. It can help improve the overall efficiency and accuracy of the model. Figure 2 illustrates an overview of the framework of the proposed method.

3.2. Dual-Branch Feature Extraction Module

To effectively extract pertinent features from EEG motor imagery signals, an integrated convolutional approach is employed to leverage the distinct advantages of both temporal and spatial convolutions. In the temporal dimension, a multi-scale convolutional kernel architecture is implemented. Specifically, large convolutional kernels are utilized to capture long-range dependencies and encompass relevant features over extended periods, while small convolutional kernels focus on temporal details and local signal variations. Additionally, spatial convolutions exploit the electrode arrangement to capture interrelationships between different EEG channel locations in the spatial dimension. This dual-dimensional convolutional strategy provides a more comprehensive understanding of the complex patterns within EEG signals, thereby enhancing recognition accuracy in motor imagery tasks. The input EEG signal data are $X \in \mathbb{R}^{B \times 1 \times CH \times T}$, where $B$ is the batch size, $CH$ is the number of EEG channels, and $T$ is the signal length. The feature representation extracted by Branch I is $X_{\mathrm{Branch1}} \in \mathbb{R}^{B \times C_1 \times H_1 \times W_1}$, and the feature representation extracted by Branch II is $X_{\mathrm{Branch2}} \in \mathbb{R}^{B \times C_2 \times H_2 \times W_2}$.
In the extraction module, $H_1 \times W_1$ and $H_2 \times W_2$ denote the feature dimensions extracted by Branch I and Branch II, respectively, while $C_1$ and $C_2$ denote the numbers of feature channels for Branches I and II. The Dual-Branch Feature Extraction Module performs feature extraction through convolution operations in a two-branch architecture, where each branch comprises a temporal convolution layer and a spatial convolution layer. In the first branch (Branch I), the emphasis is on capturing long-range relationships; consequently, 32 large-scale convolution kernels of size $[1 \times 125]$ are utilized for the convolution operation. Afterward, 32 convolution kernels of sizes $[CH \times 1]$ and $[1 \times 1]$ are applied and integrated into one feature representation, denoted as $SC_{AP}$. The overall formulas for this part can be expressed as
$$TC_1 = \mathrm{TemporalConv}_1(X)$$
$$\mathrm{TemporalConv}_1(X) = \mathrm{BN}(\mathrm{Conv}_{1 \times 125}(X))$$
$$TC_2 = \mathrm{TemporalConv}_2(X)$$
$$\mathrm{TemporalConv}_2(X) = \mathrm{BN}(\mathrm{Conv}_{1 \times 30}(X))$$
$$TC_3 = \mathrm{ELU}(\mathrm{BN}(\mathrm{Conv}_{CH \times 1}(TC_1)))$$
$$TC_4 = \mathrm{ELU}(\mathrm{BN}(\mathrm{Conv}_{CH \times 1}(TC_2)))$$
$$SC_1 = \mathrm{Conv}_{1 \times 1}(TC_3)$$
$$SC_2 = \mathrm{Conv}_{1 \times 1}(TC_4)$$
$$\mathrm{BN}(X) = \gamma \frac{X - \mu_X}{\sqrt{\sigma_X^2 + \epsilon}} + \beta$$
$$\mathrm{ELU}(x) = \begin{cases} x, & \text{if } x > 0 \\ \alpha (e^{x} - 1), & \text{if } x \le 0 \end{cases}$$
In the Dual-Branch Feature Extraction Module, $\mathrm{TemporalConv}_1$ and $\mathrm{TemporalConv}_2$ denote the temporal convolution operations of Branches I and II, respectively, and $TC_1$ and $TC_2$ are their outputs. $TC_3$ and $TC_4$ are intermediate variables, and $SC_1$ and $SC_2$ are the outputs of the spatial convolution operations of Branches I and II. $\mathrm{BN}$ denotes batch normalization, where $\mu_X$ is the mean of $X$, $\sigma_X^2$ is the variance of $X$, and $\gamma$ and $\beta$ are the scaling factor and offset, respectively. $\alpha$ is the control parameter of the ELU.
For each branch, a pooling operation is used for dimensionality reduction. To reduce the feature dimension while preserving sufficient feature information, a pooling operation with a kernel size of $[1 \times 32]$ and a stride of 32 is applied to $SC_1$, yielding the feature $X_{\mathrm{Branch1}}$, and a pooling operation with a kernel size of $[1 \times 75]$ and a stride of 25 is applied to $SC_2$, yielding the feature $X_{\mathrm{Branch2}}$. The overall formulas for this part can be expressed as
$$X_{\mathrm{Branch1}} = \mathrm{AvgPooling}(SC_1)$$
$$X_{\mathrm{Branch2}} = \mathrm{AvgPooling}(SC_2)$$
where $\mathrm{AvgPooling}$ denotes the average pooling operation.
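To make the layer layout concrete, the following is a minimal PyTorch sketch of one branch of the Dual-Branch Feature Extraction Module, using the kernel and pooling sizes stated above. The "same"-style temporal padding, the exact layer ordering, and the class/variable names are assumptions for illustration rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class BranchBlock(nn.Module):
    """One branch: temporal conv -> spatial conv over electrodes -> 1x1 conv -> average pooling."""
    def __init__(self, n_channels, temporal_kernel, pool_kernel, pool_stride, n_filters=32):
        super().__init__()
        # Temporal convolution with batch normalization (TemporalConv_1 / TemporalConv_2).
        self.temporal = nn.Sequential(
            nn.Conv2d(1, n_filters, kernel_size=(1, temporal_kernel),
                      padding=(0, temporal_kernel // 2)),   # padding choice is an assumption
            nn.BatchNorm2d(n_filters),
        )
        # Spatial convolution across the electrode dimension, then ELU (TC_3 / TC_4).
        self.spatial = nn.Sequential(
            nn.Conv2d(n_filters, n_filters, kernel_size=(n_channels, 1)),
            nn.BatchNorm2d(n_filters),
            nn.ELU(),
        )
        self.pointwise = nn.Conv2d(n_filters, n_filters, kernel_size=1)   # SC_1 / SC_2
        self.pool = nn.AvgPool2d(kernel_size=(1, pool_kernel), stride=(1, pool_stride))

    def forward(self, x):              # x: (B, 1, CH, T)
        x = self.temporal(x)
        x = self.spatial(x)
        x = self.pointwise(x)
        return self.pool(x)            # X_Branch: (B, 32, 1, T')

# Branch I: large 1x125 kernel, 1x32 pooling with stride 32;
# Branch II: small 1x30 kernel, 1x75 pooling with stride 25.
branch1 = BranchBlock(n_channels=22, temporal_kernel=125, pool_kernel=32, pool_stride=32)
branch2 = BranchBlock(n_channels=22, temporal_kernel=30, pool_kernel=75, pool_stride=25)

x = torch.randn(16, 1, 22, 1000)       # (batch, 1, EEG channels, time samples)
x_branch1, x_branch2 = branch1(x), branch2(x)
print(x_branch1.shape, x_branch2.shape)  # e.g., (16, 32, 1, 31) and (16, 32, 1, 38)
```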

3.3. Blocked-Integration Self-Attention Module

The EEG features derived from motor imagery using convolutional neural networks (CNNs) often encompass rich information but exhibit significant redundancy, complicating subsequent downstream tasks. To effectively extract latent information from dual branches, the self-attention mechanism is introduced. This mechanism dynamically assigns weights to different features, enabling the model to focus more effectively on critical details within EEG signals, thereby enhancing classification performance. However, despite the substantial benefits of self-attention in extracting meaningful EEG signal features, its inherent high computational complexity presents a bottleneck for real-time processing applications.
To address this limitation, a channel-splitting strategy is introduced within the self-attention mechanism. By splitting the high-dimensional channel space into multiple subsets, each subset concentrates on different levels of the original features, facilitating specialized feature refinement. Additionally, the self-attention outputs from each subgroup are integrated and recombined to restore the integrity of the original channel space. This approach enhances the model’s focus and flexibility in information extraction and reduces overall computational complexity, thereby enabling more efficient execution of the self-attention mechanism. The execution details are shown in Figure 3.
For the input, the feature representation extracted by Branch I is $X_{\mathrm{Branch1}} \in \mathbb{R}^{B \times C_1 \times H_1 \times W_1}$, and the feature representation extracted by Branch II is $X_{\mathrm{Branch2}} \in \mathbb{R}^{B \times C_2 \times H_2 \times W_2}$.
Given that the same type of operation is applied to both $X_{\mathrm{Branch1}}$ and $X_{\mathrm{Branch2}}$, without loss of generality, the overall structure of the module is explained using the feature $X_{\mathrm{Branch1}}$. The features are split into $P$ groups along the channel dimension of $X_{\mathrm{Branch1}}$, which can be defined as
$$X_{\mathrm{Branch1}} = \{ G_1, G_2, \ldots, G_j, \ldots, G_P \}$$
where $G_i$ denotes the $i$-th group of split features, with dimension $G_i \in \mathbb{R}^{B \times \frac{C_1}{P} \times H_1 \times W_1}$. Self-attention is then computed for each $G_i$.
Three trainable parameter matrices $W_Q$, $W_K$, and $W_V$ are introduced as linear transformations. Applying them to $G_i$ yields the query ($Q$), key ($K$), and value ($V$) used in the self-attention operation:
$$Q = G_i W_Q$$
$$K = G_i W_K$$
$$V = G_i W_V$$
Next, the dot product of $Q$ and $K$ is computed, scaled by $\sqrt{d_k}$, and passed through the Softmax function to obtain the attention matrix. Multiplying the attention matrix by $V$ and adding the result back to $G_i$ gives the attention-weighted group features:
$$G_i' = G_i + \mathrm{Softmax}\!\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$$
where $d_k$ is the dimension of the key and $G_i'$ denotes the self-attention output of the $i$-th group. The final step is to concatenate all the $G_i'$ to obtain the self-attention-weighted feature representation $X_{\mathrm{Att1}}$:
$$X_{\mathrm{Att1}} = \mathrm{Concat}(G_1', G_2', \ldots, G_i', \ldots, G_P')$$
Similarly, Branch II obtains the features $X_{\mathrm{Att2}}$ through the Blocked-Integration Self-Attention Module.
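A minimal PyTorch sketch of this blocked-integration computation is shown below. Treating each time step of the pooled feature map as a token, and sharing the projections $W_Q$, $W_K$, $W_V$ across the $P$ groups, are interpretation choices on our part; the class name and the group count are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlockedSelfAttention(nn.Module):
    """Split the channel dimension into P groups, apply scaled dot-product
    self-attention (with a residual connection) within each group, then concatenate."""
    def __init__(self, n_channels, n_groups, d_k=None):
        super().__init__()
        assert n_channels % n_groups == 0
        self.n_groups = n_groups
        group_dim = n_channels // n_groups
        self.d_k = d_k or group_dim
        # Trainable linear projections W_Q, W_K, W_V (shared across groups: an assumption).
        self.w_q = nn.Linear(group_dim, self.d_k, bias=False)
        self.w_k = nn.Linear(group_dim, self.d_k, bias=False)
        self.w_v = nn.Linear(group_dim, group_dim, bias=False)

    def forward(self, x):                               # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C): one token per position
        groups = tokens.chunk(self.n_groups, dim=-1)      # P groups of size C/P
        outputs = []
        for g in groups:                                  # g: (B, H*W, C/P)
            q, k, v = self.w_q(g), self.w_k(g), self.w_v(g)
            attn = F.softmax(q @ k.transpose(-2, -1) / self.d_k ** 0.5, dim=-1)
            outputs.append(g + attn @ v)                  # residual form of G_i'
        out = torch.cat(outputs, dim=-1)                  # concatenate groups: X_Att
        return out.transpose(1, 2).reshape(b, c, h, w)

attn = BlockedSelfAttention(n_channels=32, n_groups=2)
x_branch1 = torch.randn(16, 32, 1, 31)
x_att1 = attn(x_branch1)                                  # same shape as the input
```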

3.4. Feature Fusion Module

To achieve further feature integration, the features are fused from different scales to aggregate diverse information and improve the model’s classification performance.
The input features are defined as $X_{\mathrm{Att1}}$, the self-attention-weighted representation from Branch I, and $X_{\mathrm{Att2}}$, the self-attention-weighted representation from Branch II. These weighted feature representations are merged to obtain $X_{\mathrm{Fusion}}$, formulated as
$$X_{\mathrm{Fusion}} = \mathrm{Concat}(X_{\mathrm{Att1}}, X_{\mathrm{Att2}})$$
$X_{\mathrm{Fusion}}$ is passed through fully connected layers and a Log-Softmax to obtain the classification categories. The calculation can be expressed as
$$Z = \mathrm{Conv}(X_{\mathrm{Fusion}})$$
$$\hat{y}_i = \log \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}}, \quad i \in \{1, 2, \ldots, C\}$$
The cross-entropy loss function is employed for model training.
$$Loss = -\frac{1}{B} \sum_{b=1}^{B} \sum_{c=1}^{C_{ls}} y_{b,c} \log(\hat{y}_{b,c})$$
where $B$ is the batch size, $C_{ls}$ is the number of classification categories, $y$ is the true label, and $\hat{y}$ is the predicted label.
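The following hedged sketch shows how the fusion, Log-Softmax output, and cross-entropy objective fit together in PyTorch. Replacing the convolutional projection for $Z$ with a single fully connected layer, and the concrete feature dimensions, are simplifying assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionClassifier(nn.Module):
    """Concatenate the two attention-weighted branches, project to class scores,
    and return log-probabilities (Log-Softmax)."""
    def __init__(self, feat_dim1, feat_dim2, n_classes):
        super().__init__()
        self.fc = nn.Linear(feat_dim1 + feat_dim2, n_classes)   # linear head: an assumption

    def forward(self, x_att1, x_att2):
        fused = torch.cat([x_att1.flatten(1), x_att2.flatten(1)], dim=1)  # X_Fusion
        z = self.fc(fused)                                                # Z
        return F.log_softmax(z, dim=1)                                    # y_hat

# Training step: NLLLoss on log-probabilities is equivalent to the cross-entropy objective.
model = FusionClassifier(feat_dim1=32 * 31, feat_dim2=32 * 38, n_classes=4)
criterion = nn.NLLLoss()
x_att1, x_att2 = torch.randn(16, 32, 1, 31), torch.randn(16, 32, 1, 38)
labels = torch.randint(0, 4, (16,))
loss = criterion(model(x_att1, x_att2), labels)
loss.backward()
```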

4. Experiment

4.1. Dataset Description

This study evaluates the proposed model exclusively through offline classification experiments using two publicly available EEG motor imagery datasets. Detailed descriptions of the datasets are given below:
(1) BCI Competition IV-2a Dataset (BCI IV 2a) [20]: This dataset contains EEG data from nine subjects. For each subject, EEG was recorded from 22 electrodes placed according to the international 10–20 standard, and 576 trials were collected across two sessions at a sampling rate of 250 Hz. There are four categories, left hand, right hand, feet, and tongue, with 72 trials per category in each session.
(2) BCI Competition IV-2b Dataset (BCI IV 2b) [21]: This dataset comprises EEG data from nine subjects, recorded at a sampling frequency of 250 Hz. Data are exclusively collected from the C3, Cz, and C4 electrode channels, which conform to the international 10–20 system standards. The first two sessions each include 400 trials, whereas the subsequent three sessions consist of 320 trials each.
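As an illustration of how such recordings can be prepared for offline classification, the sketch below loads one BCI IV 2a GDF file with MNE-Python and extracts cue-locked trials. The local file name, the 0–4 s epoch window, and the cue annotation codes '769'–'772' (left hand, right hand, feet, tongue) are assumptions for illustration and should be checked against the downloaded data.

```python
import mne
import numpy as np

# Hypothetical local path to one BCI IV 2a recording (GDF format).
raw = mne.io.read_raw_gdf("A01T.gdf", preload=True)

# Map annotation descriptions to event codes; keep only the motor imagery cues.
events, event_map = mne.events_from_annotations(raw)
cue_ids = {k: v for k, v in event_map.items() if k in ("769", "770", "771", "772")}

# Epoch a 0-4 s window after each cue at the native 250 Hz sampling rate.
epochs = mne.Epochs(raw, events, event_id=cue_ids, tmin=0.0, tmax=4.0,
                    baseline=None, preload=True)
X = epochs.get_data()        # (n_trials, n_channels, n_samples)
y = epochs.events[:, -1]     # integer class codes
print(X.shape, np.unique(y))
```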

4.2. Data Processing and Experimental Setting

In the pre-processing of EEG data, a bandpass filter from 0 to 40 Hz is initially applied. This is followed by exponential moving mean normalization to standardize the EEG signals [31]. The Adam optimizer is employed during model training with parameters set to 350 training epochs, a weight decay of 0.075, and a batch size of 16. Cross-entropy is used as the loss function.
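To make this pipeline concrete, the sketch below applies a Butterworth bandpass filter and an exponential moving standardization similar in spirit to [31], followed by the reported optimizer settings. The filter order, the 0.5 Hz lower edge (a strictly zero lower cut-off is not realizable with a Butterworth bandpass), and the smoothing factor are assumptions; DBBISAN is a hypothetical constructor name for the proposed network.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(eeg, fs=250.0, low=0.5, high=40.0, order=4):
    """Band-limit the EEG to roughly the 0-40 Hz range used in the paper."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, eeg, axis=-1)

def exponential_moving_standardize(eeg, factor_new=1e-3, eps=1e-4):
    """Exponential moving mean/variance normalization, computed per channel over time."""
    out = np.zeros_like(eeg)
    mean = eeg[..., 0]
    var = np.ones_like(mean)
    for t in range(eeg.shape[-1]):
        mean = factor_new * eeg[..., t] + (1 - factor_new) * mean
        var = factor_new * (eeg[..., t] - mean) ** 2 + (1 - factor_new) * var
        out[..., t] = (eeg[..., t] - mean) / np.maximum(np.sqrt(var), eps)
    return out

eeg = np.random.randn(22, 1000)            # placeholder trial: 22 channels x 4 s at 250 Hz
eeg = exponential_moving_standardize(bandpass(eeg, fs=250.0))

# Training configuration reported in the paper (sketched as comments):
# model = DBBISAN(...)                                          # hypothetical constructor
# optimizer = torch.optim.Adam(model.parameters(), weight_decay=0.075)
# criterion = torch.nn.CrossEntropyLoss()
# epochs, batch_size = 350, 16
```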

4.3. Validation of Classification

To validate the effectiveness of our proposed method, we compare it with established deep learning approaches, including mainstream baseline methods, multi-scale feature fusion models, and several cutting-edge models.

4.3.1. Comparison with Mainstream Baseline Deep Learning Methods

This study first compares the proposed method with several mainstream baseline deep learning methods based on their reported results, including Shallow ConvNet [31], DeepConvNet [31], and EEGNet [32]. A cross-session experimental protocol is adopted, and the average results are shown in Figure 4, which demonstrates that the proposed method outperforms these mainstream baselines across both datasets. These experiments validate the robust classification performance of the proposed method in cross-session EEG motor imagery tasks.

4.3.2. Comparison with Deep Learning Model Based on Multi-Scale Feature Fusion

The proposed method is then compared with several multi-scale and related deep learning methods, including MCNN [52], MI-EEGNET [53], AMSI EEGNet [42], EEG Inception [54], EEGITNet [55], MSHCNN [56], MMCNN [22], and TBTF-CNN [57]. The results on the BCI IV 2a and 2b datasets are shown in Figure 5, which illustrates that the proposed method achieves superior classification performance compared to other multi-scale convolutional neural network models.

4.3.3. Comparison with Several State-of-the-Art Deep Learning Methods

Finally, the proposed method is compared with several state-of-the-art deep learning methods, including TSFBCSP-MRPGA [58], SincNet [59], DAFS [60], C2CM [61], EEGNeX [62], Spatio Temporal [63], FBCNet [64], DRDA [65], Conformer [66], CD-LOSS [67], NCA [68], and S3T [69]. The experimental results are shown in Table 1 and Table 2. Overall, the proposed method achieves good classification performance compared to the other methods.

4.4. Ablation Experiment

To validate the effectiveness of each module in the proposed DB-BISAN, data from the first five subjects are selected, and ablation experiments are conducted on both the BCI IV 2a and BCI IV 2b datasets. The model variables tested are presented in Table 3, and the experimental results are illustrated in Figure 6 and Figure 7. These experiments demonstrate each module’s contributions to the model’s overall performance.
From the structure outlined in Table 3, the Branch I—Large kernel model utilizes a large-scale temporal convolutional layer in the feature extraction stage, followed by an average pooling layer; in the classification stage, a dense (fully connected) layer classifies the extracted features. The Branch II—Small kernel model likewise comprises a feature extraction module and a classification module: the feature extraction module consists of a single small-scale temporal convolutional layer followed by an average pooling layer, and the classification module comprises one or more fully connected layers responsible for the final classification based on the extracted features. The Dual-Branch model adopts a multi-scale approach and applies convolutions at different scales, aiming to demonstrate the effectiveness of multi-scale networks.
From Figure 6 and Figure 7, it is evident that the main modules of the method proposed in this paper significantly enhance the accuracy of the neural network. In both datasets, the dual-branch convolutional neural network surpasses the performance of the single-branch architecture. Additionally, the proposed Blocked-Integration Self-Attention Mechanism substantially boosts the overall performance of the framework.

4.5. Confusion Matrix

To gain a more comprehensive understanding of the overall performance of the proposed method, confusion matrices are generated to display the classification results. Specifically, Figure 8 and Figure 9 illustrate the confusion matrix for each category of Subject 4 in the BCI IV 2a and 2b datasets, demonstrating its superior accuracy compared to other models.
Specifically, BCI IV 2a reveals significant improvements in the “Right hand” and “Feet” categories, while BCI IV 2b highlights the method’s strong performance in the “Left hand” category. These results ultimately validate the effectiveness of the proposed approach in enhancing classification accuracy across motor imagery tasks.
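Comparable per-class summaries can be produced directly from a model's test-set predictions. The sketch below uses scikit-learn's confusion-matrix display with placeholder predictions, since the actual per-trial outputs are not reproduced here.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# y_true / y_pred would be the labels and model predictions for one subject's test session;
# the random arrays below are placeholders for illustration only.
class_names = ["Left hand", "Right hand", "Feet", "Tongue"]   # BCI IV 2a classes
y_true = np.random.randint(0, 4, size=288)
y_pred = np.random.randint(0, 4, size=288)

disp = ConfusionMatrixDisplay.from_predictions(
    y_true, y_pred, display_labels=class_names, normalize="true", cmap="Blues")
disp.ax_.set_title("Subject 4, BCI IV 2a (illustrative)")
plt.show()
```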

4.6. Feature Visualization

To analyze the feature representations of the Dual-Branch Feature Extraction and Blocked-Integration Self-Attention Modules, t-distributed Stochastic Neighbor Embedding (t-SNE) is employed to visualize the features extracted by different models. We compare the following models: Branch I—Large kernel, Branch II—Small kernel, Dual-Branch, and our proposed method. Figure 10 and Figure 11 show that the dual-branch method has better feature learning ability. Compared with the large kernel, the distribution of the features learned by the small kernel is more separable, suggesting that the small kernel can more effectively capture the discriminative features in EEG signals. When the features of the two branches are concatenated, the t-SNE results exhibit more compact clustering because the model can fully utilize the complementary local and global information. Furthermore, when the Blocked-Integration Self-Attention Module is integrated, the t-SNE projection shows an even more structured and discriminative distribution, verifying that the module effectively strengthens the key features and improves the overall discriminative ability of the model.
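A typical way to generate such projections is sketched below with scikit-learn's t-SNE. The feature array, label array, perplexity, and the stage from which the features are taken are placeholders and assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(features, labels, title):
    """Project high-dimensional features to 2D and colour points by motor imagery class."""
    emb = TSNE(n_components=2, perplexity=30, init="pca",
               random_state=0).fit_transform(features)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=12)
    plt.title(title)
    plt.show()

# features: flattened outputs of a given stage (e.g., X_Fusion) for one subject's trials.
features = np.random.randn(288, 2208)     # placeholder array for illustration
labels = np.random.randint(0, 4, 288)
plot_tsne(features, labels, "Dual-Branch + Blocked-Integration Self-Attention")
```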

4.7. Attention Visualization

To further illustrate the Blocked-Integration Self-Attention Mechanism, one subject is selected from each of the two datasets to visualize the resulting attention weights. Specifically, the features are divided into two subsets along the channel dimension. For the BCI-IV 2a dataset, Subject 1 is chosen, and for the BCI-IV 2b dataset, Subject 2 is selected. Figure 12 demonstrates that the self-attention mechanism enables an adaptive and flexible feature fusion process by re-weighting the feature maps.

5. Discussion

In recent years, convolutional neural networks (CNNs) have been extensively employed by researchers for feature extraction in many application domains [70]. While multi-scale information fusion enhances model performance, it also introduces increased complexity and challenges in effectively integrating multi-scale features. Furthermore, although incorporating attention mechanisms improves feature adaptability, the associated computational complexity poses obstacles to model efficiency. To address these issues, a dual-branch convolutional neural network based on Blocked-Integration Self-Attention is proposed. Firstly, rich information from EEG signals in both the temporal and frequency domains is extracted through convolutional kernels of different scales within the dual pathways. Secondly, an efficient Blocked-Integration Self-Attention Module is designed to enhance model performance comprehensively. This module improves the efficiency of the self-attention mechanism through splitting and integration, further enhancing the model’s ability to understand and process signal features. This type of deep learning model, with the advantages of powerful feature extraction and spatio-temporal analysis, shows great potential for EEG-oriented motor imagery.
First, the proposed method is compared with mainstream baseline deep learning methods, multi-scale feature-fusion deep learning models, and several state-of-the-art deep learning methods. The experimental results demonstrate that the proposed method achieves superior performance. This improvement is attributed to the dual-pathway design, which fully captures richer signal features and enhances the model’s generalization and robustness; moreover, the efficient self-attention mechanism gives the model more comprehensive global information-mining capability, thereby obtaining richer contextual information. Second, the effectiveness of each module of the method is validated through ablation experiments. Third, a visualization approach is employed to examine the Blocked-Integration Self-Attention of each branch, facilitating a deeper understanding of the interrelationships between features and the construction of more efficient feature representations. Finally, to gain a more detailed understanding of the model’s performance, confusion matrices are used to examine the model’s behavior in the various classification tasks.
However, this method presents certain limitations. Firstly, the impact of the number of convolution kernels on model performance has not been addressed. In subsequent research, more detailed investigations into convolutional kernel configurations and architectural design will be conducted. Secondly, the complex process of collecting EEG signals results in a limited sample size, and the current model does not explicitly account for the challenges posed by small EEG datasets. To mitigate this issue, future work will consider cross-center collaborations to expand data sources and improve the generalizability of the model. Finally, the effectiveness of the dual-branch structure has not been thoroughly examined. In future studies, the potential advantages of multi-path or more efficient single-pathway architectures will be systematically explored.

6. Conclusions

This paper proposes a Dual-Branch Blocked-Integration Self-Attention Network (DB-BISAN) for classifying motor imagery in EEG signals. The model introduces a dual-branch structure that extracts multi-scale spatio-temporal features from EEG signals, providing a robust foundation for recognizing motor imagery. In addition, an efficient Blocked-Integration Self-Attention Mechanism is developed to enhance both the performance and the contextual interpretation of EEG features. The experimental results demonstrate that DB-BISAN achieves superior classification performance compared to existing methods, validating its efficiency and generalization ability across subjects. These findings suggest that the proposed approach holds strong potential as a valuable tool for EEG-based brain–computer interfaces, with promising implications for clinical and rehabilitation applications.
In future work, the proposed method will be extended to support multi-modal neural signal integration, aiming to further enhance its capacity for robust and adaptable brain–computer interface development. This will involve the fusion of complementary modalities, such as Functional Near-Infrared Spectroscopy (fNIRS) or Electromyography (EMG), which may provide richer representations of user intent and improve classification accuracy under noisy or uncertain conditions. Moreover, efforts will be made to design a more lightweight model architecture, thereby improving its practicality and deployment feasibility in real-world scenarios. Techniques such as model pruning or knowledge distillation will be considered to reduce computational cost and memory footprint, facilitating the model’s implementation on portable or embedded BCI systems.

Author Contributions

Conceptualization, W.C., Y.P. and H.Z.; Formal Analysis, W.C., S.X., Y.P. and Z.C.; Investigation, Q.H. and Y.P.; Methodology, W.C., Q.H. and Y.P.; Project Administration, H.Z.; Resources, S.X., H.Z. and J.Z.; Software, S.X., Y.P. and Z.C.; Validation, J.Z.; Visualization, Y.P. and Z.C.; Writing—Original Draft, W.C., Q.H. and Y.P.; Writing—Review and Editing, S.X., H.Z., Z.C. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the National Natural Science Foundation of China (Grant No. 62071123), the Natural Science Foundation of Fujian Province (Grant No. 2024J01971), and the Fujian Province Marine Economy Development Subsidy Fund Project (FJHJF-L-2019-7).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are available in a publicly accessible repository: https://www.bbci.de/competition/iv/download (accessed on 22 April 2025).

Conflicts of Interest

The authors declare that there are no conflicts of interest.

References

  1. Altaheri, H.; Muhammad, G.; Alsulaiman, M.; Amin, S.U.; Altuwaijri, G.A.; Abdul, W.; Bencherif, M.A.; Faisal, M. Deep learning techniques for classification of electroencephalogram (EEG) motor imagery (MI) signals: A review. Neural Comput. Appl. 2023, 35, 14681–14722. [Google Scholar]
  2. Al-Saegh, A.; Dawwd, S.A.; Abdul-Jabbar, J.M. Deep learning for motor imagery EEG-based classification: A review. Biomed. Signal Process. Control 2021, 63, 102172. [Google Scholar]
  3. Craik, A.; González-España, J.J.; Alamir, A.; Edquilang, D.; Wong, S.; Sánchez Rodríguez, L.; Feng, J.; Francisco, G.E.; Contreras-Vidal, J.L. Design and validation of a low-cost mobile eeg-based brain–computer interface. Sensors 2023, 23, 5930. [Google Scholar] [PubMed]
  4. Naser, M.Y.M.; Bhattacharya, S. Towards practical BCI-driven wheelchairs: A systematic review study. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 1030–1044. [Google Scholar] [PubMed]
  5. Al-Qaysi, Z.T.; Ahmed, M.A.; Hammash, N.M.; Hussein, A.F.; Albahri, A.S.; Suzani, M.S.; Al-Bander, B.; Shuwandy, M.L.; Salih, M.M. Systematic review of training environments with motor imagery brain–computer interface: Coherent taxonomy, open issues and recommendation pathway solution. Health Technol. 2021, 11, 783–801. [Google Scholar]
  6. Buerkle, A.; Eaton, W.; Lohse, N.; Bamber, T.; Ferreira, P. EEG based arm movement intention recognition towards enhanced safety in symbiotic Human-Robot Collaboration. Robot. Comput.-Integr. Manuf. 2021, 70, 102137. [Google Scholar]
  7. Aung, H.W.; Li, J.J.; Shi, B.; An, Y.; Su, S.W. EEG_GLT-Net: Optimising EEG graphs for real-time motor imagery signals classification. Biomed. Signal Process. Control 2025, 104, 107458. [Google Scholar]
  8. Bouchane, M.; Guo, W.; Yang, S. Hybrid CNN-GRU Models for Improved EEG Motor Imagery Classification. Sensors 2025, 25, 1399. [Google Scholar] [CrossRef]
  9. Lazarou, I.; Nikolopoulos, S.; Petrantonakis, P.C.; Kompatsiaris, I.; Tsolaki, M. EEG-based brain–computer interfaces for communication and rehabilitation of people with motor impairment: A novel approach of the 21st Century. Front. Hum. Neurosci. 2018, 12, 14. [Google Scholar]
  10. Yang, J.; Yao, S.; Wang, J. Deep fusion feature learning network for MI-EEG classification. IEEE Access 2018, 6, 79050–79059. [Google Scholar]
  11. Gu, H.; Chen, T.; Ma, X.; Zhang, M.; Sun, Y.; Zhao, J. CLTNet: A Hybrid Deep Learning Model for Motor Imagery Classification. Brain Sci. 2025, 15, 124. [Google Scholar] [PubMed]
  12. Shiam, A.A.; Hassan, K.M.; Islam, M.R.; Almassri, A.M.; Wagatsuma, H.; Molla, M.K.I. Motor imagery classification using effective channel selection of multichannel EEG. Brain Sci. 2024, 14, 462. [Google Scholar]
  13. Xie, Y.; Oniga, S. Enhancing Motor Imagery Classification in Brain–Computer Interfaces Using Deep Learning and Continuous Wavelet Transform. Appl. Sci. 2024, 14, 8828. [Google Scholar]
  14. Huang, W.; Liu, X.; Yang, W.; Li, Y.; Sun, Q.; Kong, X. Motor imagery EEG signal classification using distinctive feature fusion with adaptive structural LASSO. Sensors 2024, 24, 3755. [Google Scholar] [CrossRef]
  15. Hwaidi, J.F.; Chen, T.M. Classification of motor imagery EEG signals based on deep autoencoder and convolutional neural network approach. IEEE Access 2022, 10, 48071–48081. [Google Scholar]
  16. Khademi, Z.; Ebrahimi, F.; Kordy, H.M. A transfer learning-based CNN and LSTM hybrid deep learning model to classify motor imagery EEG signals. Comput. Biol. Med. 2022, 143, 105288. [Google Scholar]
  17. Wei, Y.; Liu, Y.; Li, C.; Cheng, J.; Song, R.; Chen, X. TC-Net: A Transformer Capsule Network for EEG-based emotion recognition. Comput. Biol. Med. 2023, 152, 106463. [Google Scholar]
  18. Gong, L.; Li, M.; Zhang, T.; Chen, W. EEG emotion recognition using attention-based convolutional transformer neural network. Biomed. Signal Process. Control 2023, 84, 104835. [Google Scholar]
  19. Ahmadi, H.; Mesin, L. Enhancing motor imagery electroencephalography classification with a correlation-optimized weighted stacking ensemble model. Electronics 2024, 13, 1033. [Google Scholar] [CrossRef]
  20. Brunner, C.; Leeb, R.; Müller-Putz, G.; Schlögl, A.; Pfurtscheller, G. BCI Competition 2008–Graz Data Set A; Graz University of Technology: Styria, Austria, 2008; Volume 16, pp. 1–6. [Google Scholar]
  21. Leeb, R.; Brunner, C.; Müller-Putz, G.; Schlögl, A.; Pfurtscheller, G. BCI Competition 2008–Graz Data Set B; Graz University of Technology: Styria, Austria, 2008; Volume 16, pp. 1–6. [Google Scholar]
  22. Jia, Z.; Lin, Y.; Wang, J.; Yang, K.; Liu, T.; Zhang, X. MMCNN: A multi-branch multi-scale convolutional neural network for motor imagery classification. In Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, 14–18 September 2020; Proceedings, Part III. Springer International Publishing: Berlin/Heidelberg, Germany, 2021; pp. 736–751. [Google Scholar]
  23. Chatterjee, R.; Maitra, T.; Islam, S.K.H.; Hassan, M.M.; Alamri, A.; Fortino, G. A novel machine learning based feature selection for motor imagery EEG signal classification in Internet of medical things environment. Future Gener. Comput. Syst. 2019, 98, 419–434. [Google Scholar]
  24. Antony, M.J.; Sankaralingam, B.P.; Mahendran, R.K.; Gardezi, A.A.; Shafiq, M.; Choi, J.-G.; Hamam, H. Classification of EEG using adaptive SVM classifier with CSP and online recursive independent component analysis. Sensors 2022, 22, 7596. [Google Scholar] [CrossRef] [PubMed]
  25. Buzzell, G.A.; Niu, Y.; Aviyente, S.; Bernat, E. A practical introduction to EEG time-frequency principal components analysis (TF-PCA). Dev. Cogn. Neurosci. 2022, 55, 101114. [Google Scholar]
  26. Hu, J.; Xiao, D.; Mu, Z. Application of energy entropy in motor imagery EEG classification. Int. J. Digit. Content Technol. Its Appl. 2009, 3, 83–90. [Google Scholar]
  27. Ang, K.K.; Chin, Z.Y.; Wang, C.; Guan, C.; Zhang, H. Filter bank common spatial pattern algorithm on BCI competition IV datasets 2a and 2b. Front. Neurosci. 2012, 6, 39. [Google Scholar] [CrossRef] [PubMed]
  28. Wu, D.; Lance, B.J.; Lawhern, V.J.; Gordon, S.; Jung, T.-P.; Lin, C.-T. EEG-based user reaction time estimation using Riemannian geometry features. IEEE Trans. Neural Syst. Rehabil. Eng. 2017, 25, 2157–2168. [Google Scholar]
  29. Chen, J.; Yu, Z.; Gu, Z.; Li, Y. Deep temporal-spatial feature learning for motor imagery-based brain–computer interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 2356–2366. [Google Scholar]
  30. Luo, T.; Zhou, C.; Chao, F. Exploring spatial-frequency-sequential relationships for motor imagery classification with recurrent neural network. BMC Bioinform. 2018, 19, 1–18. [Google Scholar]
  31. Schirrmeister, R.T.; Springenberg, J.T.; Fiederer, L.D.J.; Glasstetter, M.; Eggensperger, K.; Tangermann, M.; Hutter, F.; Burgard, W.; Ball, T. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 2017, 38, 5391–5420. [Google Scholar]
  32. Lawhern, V.J.; Solon, A.J.; Waytowich, N.R.; Gordon, S.M.; Hung, C.P.; Lance, B.J. EEGNet: A compact convolutional neural network for EEG-based brain–computer interfaces. J. Neural Eng. 2018, 15, 056013. [Google Scholar]
  33. Zhang, R.; Zong, Q.; Dou, L.; Zhao, X. A novel hybrid deep learning scheme for four-class motor imagery classification. J. Neural Eng. 2019, 16, 066004. [Google Scholar]
  34. Garcia-Moreno, F.M.; Bermudez-Edo, M.; Rodríguez-Fórtiz, M.J.; Garrido, J.L. A CNN-LSTM deep learning classifier for motor imagery EEG detection using a low-invasive and low-cost BCI headband. In Proceedings of the 2020 16th International Conference on Intelligent Environments (IE), Madrid, Spain, 20–23 July 2020; pp. 84–91. [Google Scholar]
  35. Wang, J.; Cheng, S.; Tian, J.; Gao, Y. A 2D CNN-LSTM hybrid algorithm using time series segments of EEG data for motor imagery classification. Biomed. Signal Process. Control 2023, 83, 104627. [Google Scholar] [CrossRef]
  36. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 4–9. [Google Scholar]
  37. Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62. [Google Scholar] [CrossRef]
  38. Liu, X.; Wang, K.; Liu, F.; Zhao, W.; Liu, J. 3D convolution neural network with multiscale spatial and temporal cues for motor imagery EEG classification. Cogn. Neurodyn. 2023, 17, 1357–1380. [Google Scholar] [CrossRef]
  39. Altuwaijri, G.A.; Muhammad, G.; Altaheri, H.; Alsulaiman, M. A multi-branch convolutional neural network with squeeze-and-excitation attention blocks for EEG-based motor imagery signals classification. Diagnostics 2022, 12, 995. [Google Scholar] [CrossRef] [PubMed]
  40. Xie, J.; Zhang, J.; Sun, J.; Ma, Z.; Qin, L.; Li, G.; Zhou, H.; Zhan, Y. A transformer-based approach combining deep learning network and spatial-temporal information for raw EEG classification. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 30, 2126–2136. [Google Scholar] [CrossRef]
  41. Amin, S.U.; Altaheri, H.; Muhammad, G.; Abdul, W.; Alsulaiman, M. Attention-inception and long-short-term memory-based electroencephalography classification for motor imagery tasks in rehabilitation. IEEE Trans. Ind. Inform. 2021, 18, 5412–5421. [Google Scholar] [CrossRef]
  42. Riyad, M.; Khalil, M.; Adib, A. A novel multi-scale convolutional neural network for motor imagery classification. Biomed. Signal Process. Control 2021, 68, 102747. [Google Scholar] [CrossRef]
  43. Wang, C.; Wu, Y.; Wang, C.; Ren, Y.; Shen, J.; Pang, T.; Chan, C.S.; Ren, W.; Yu, Y. MSFNet: A Multi-Scale Space-Time Frequency Fusion Network for Motor Imagery EEG Classification. IEEE Access 2024, 12, 8325–8336. [Google Scholar] [CrossRef]
  44. Li, M.; Li, J.; Zheng, X.; Ge, J.; Xu, G. MSHANet: A multi-scale residual network with hybrid attention for motor imagery EEG decoding. Cogn. Neurodyn. 2024, 18, 3463–3476. [Google Scholar] [CrossRef]
  45. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part I. Springer International Publishing: Berlin/Heidelberg, Germany, 2014; pp. 818–833. [Google Scholar]
  46. Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 1480–1489. [Google Scholar]
  47. Chen, L.; Zhang, H.; Xiao, J.; Nie, L.; Shao, J.; Liu, W.; Chua, T.-S. Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5659–5667. [Google Scholar]
  48. Chen, J.; Han, Z.; Qiao, H.; Li, C.; Peng, H. EEG-based sleep staging via self-attention based capsule network with Bi-LSTM model. Biomed. Signal Process. Control 2023, 86, 105351. [Google Scholar] [CrossRef]
  49. Zhang, R.; Liu, G.; Wen, Y.; Zhou, W. Self-attention-based convolutional neural network and time-frequency common spatial pattern for enhanced motor imagery classification. J. Neurosci. Methods 2023, 398, 109953. [Google Scholar] [CrossRef]
  50. Zhang, D.; Li, H.; Xie, J. MI-CAT: A transformer-based domain adaptation network for motor imagery classification. Neural Netw. 2023, 165, 451–462. [Google Scholar] [CrossRef] [PubMed]
  51. Chaudhary, P.; Dhankhar, N.; Singhal, A.; Rana, K. A two-stage transformer based network for motor imagery classification. Med. Eng. Phys. 2024, 128, 104154. [Google Scholar] [CrossRef] [PubMed]
  52. Amin, S.U.; Alsulaiman, M.; Muhammad, G.; Mekhtiche, M.A.; Hossain, M.S. Deep Learning for EEG motor imagery classification based on multi-layer CNNs feature fusion. Future Gener. Comput. Syst. 2019, 101, 542–554. [Google Scholar] [CrossRef]
  53. Riyad, M.; Khalil, M.; Adib, A. MI-EEGNET: A novel convolutional neural network for motor imagery classification. J. Neurosci. Methods 2021, 353, 109037. [Google Scholar] [CrossRef]
  54. Riyad, M.; Khalil, M.; Adib, A. Incep-eegnet: A convnet for motor imagery decoding. In Proceedings of the Image and Signal Processing: 9th International Conference, ICISP 2020, Marrakesh, Morocco, 4–6 June 2020; Proceedings. Springer International Publishing: Cham, Switzerland, 2020; pp. 103–111. [Google Scholar]
  55. Salami, A.; Andreu-Perez, J.; Gillmeister, H. EEG-ITNet: An explainable inception temporal convolutional network for motor imagery classification. IEEE Access 2022, 10, 36672–36685. [Google Scholar] [CrossRef]
  56. Tang, X.; Yang, C.; Sun, X.; Zou, M.; Wang, H. Motor imagery EEG decoding based on multi-scale hybrid networks and feature enhancement. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 1208–1218. [Google Scholar] [CrossRef]
  57. Yang, J.; Gao, S.; Shen, T. A two-branch CNN fusing temporal and frequency features for motor imagery EEG decoding. Entropy 2022, 24, 376. [Google Scholar] [CrossRef]
  58. Luo, T. Parallel genetic algorithm based common spatial patterns selection on time–frequency decomposed EEG signals for motor imagery brain-computer interface. Biomed. Signal Process. Control 2023, 80, 104397. [Google Scholar] [CrossRef]
  59. Liu, C.; Jin, J.; Daly, I.; Li, S.; Sun, H.; Huang, Y.; Wang, X.; Cichocki, A. SincNet-based hybrid neural network for motor imagery EEG decoding. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 30, 540–549. [Google Scholar] [CrossRef]
  60. Phunruangsakao, C.; Achanccaray, D.; Hayashibe, M. Deep adversarial domain adaptation with few-shot learning for motor-imagery brain-computer interface. IEEE Access 2022, 10, 57255–57265. [Google Scholar] [CrossRef]
  61. Yan, Y.; Zhou, H.; Huang, L.; Cheng, X.; Kuang, S. A novel two-stage refine filtering method for EEG-based motor imagery classification. Front. Neurosci. 2021, 15, 657540. [Google Scholar] [CrossRef]
  62. Chen, X.; Teng, X.; Chen, H.; Pan, Y.; Geyer, P. Toward reliable signals decoding for electroencephalogram: A benchmark study to EEGNeX. Biomed. Signal Process. Control 2024, 87, 105475. [Google Scholar] [CrossRef]
  63. Sudalairaj, S. Spatio-Temporal Analysis of EEG Using Deep Learning. Ph.D. Thesis, University of Cincinnati, Cincinnati, OH, USA, 2022. [Google Scholar]
  64. Mane, R.; Chew, E.; Chua, K.; Ang, K.K.; Robinson, N.; Vinod, A.P.; Lee, S.-W.; Guan, C. FBCNet: A multi-view convolutional neural network for brain-computer interface. arXiv 2021, arXiv:2104.01233. [Google Scholar]
  65. Zhao, H.; Zheng, Q.; Ma, K.; Li, H.; Zheng, Y. Deep representation-based domain adaptation for nonstationary EEG classification. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 535–545. [Google Scholar] [CrossRef]
  66. Song, Y.; Zheng, Q.; Liu, B.; Gao, X. EEG conformer: Convolutional transformer for EEG decoding and visualization. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 31, 710–719. [Google Scholar] [CrossRef]
  67. Yang, L.; Song, Y.; Ma, K.; Xie, L. Motor imagery EEG decoding method based on a discriminative feature learning strategy. IEEE Trans. Neural Syst. Rehabil. Eng. 2021, 29, 368–379. [Google Scholar] [CrossRef]
  68. Malan, N.S.; Sharma, S. Motor imagery EEG spectral-spatial feature optimization using dual-tree complex wavelet and neighbourhood component analysis. IRBM 2022, 43, 198–209. [Google Scholar] [CrossRef]
  69. Song, Y.; Jia, X.; Yang, L.; Xie, L. Transformer-based spatial-temporal feature learning for EEG decoding. arXiv 2021, arXiv:2106.11170. [Google Scholar]
  70. Hu, B.; Gao, B.; Woo, W.L.; Ruan, L.; Jin, J.; Yang, Y.; Yu, Y. A lightweight spatial and temporal multi-feature fusion network for defect detection. IEEE Trans. Image Process. 2020, 30, 472–486. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The basic processing flow of the EEG signal in a BCI system. (1) Collect EEG signal: Record raw brain signals from the scalp. (2) Pre-processing: Remove noise from the raw signals. (3) Feature extraction: Extract spatio-temporal or frequency-domain features. (4) Signal classification: Convert the features into control commands. In addition, the system senses the user’s neural state through neural feedback.
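As a point of reference, the four stages in Figure 1 can be illustrated with a minimal sketch. The 8–30 Hz band-pass filter, log-variance features, and LDA classifier below are generic, assumed choices for illustration, not the processing chain used in this paper.

```python
# A minimal, generic sketch of the four-stage flow in Figure 1.
# The filter band, log-variance features, and LDA classifier are
# illustrative assumptions, not the pipeline proposed in this paper.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def bandpass(x, fs=250.0, band=(8.0, 30.0), order=4):
    """(2) Pre-processing: band-pass filter each channel to suppress noise."""
    b, a = butter(order, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    return filtfilt(b, a, x, axis=-1)

# (1) Collected EEG: trials x channels x samples, plus a class label per trial.
X = np.random.randn(288, 22, 1000)
y = np.random.randint(0, 4, size=288)

X_filt = bandpass(X)                               # (2) de-noising
features = np.log(np.var(X_filt, axis=-1))         # (3) log band-power features
clf = LinearDiscriminantAnalysis().fit(features, y)
commands = clf.predict(features)                   # (4) classification -> control commands
```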
Figure 2. An overview of the framework of the DB-BISAN.
Figure 3. An overview of the framework of the Blocked-Integration Self-Attention Module.
Figure 4. Cross-session classification performance of the proposed method and mainstream baseline deep learning methods on the BCI IV 2a and 2b datasets.
Figure 5. The cross-session classification performance of the proposed method and multi-scale CNN model on the BCI IV 2a and 2b datasets.
Figure 6. Accuracy results of the ablation experiment on the BCI IV 2a and 2b datasets.
Figure 7. Kappa results of the ablation experiment on the BCI IV 2a and 2b datasets.
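For reference, the kappa values in Figure 7 refer to Cohen's kappa, the standard chance-corrected agreement measure:

κ = (p_o − p_e) / (1 − p_e),

where p_o is the observed classification accuracy and p_e is the agreement expected by chance (0.25 for a balanced four-class task such as BCI IV 2a, 0.5 for the two-class 2b task).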
Figure 8. Confusion matrix of the classification results for Subject 4 from BCI IV 2a.
Figure 9. Confusion matrix of the classification results for Subject 4 from BCI IV 2b.
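A per-subject confusion matrix such as those in Figures 8 and 9 is conventionally computed from the true and predicted trial labels; the sketch below uses scikit-learn with dummy labels, and the four class names correspond to BCI IV 2a (the 2b task has only the two hand classes).

```python
# Illustrative only: computing and plotting a per-subject confusion matrix
# (cf. Figures 8 and 9). The labels and predictions here are placeholders.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

classes = ["left hand", "right hand", "feet", "tongue"]   # BCI IV 2a task classes
y_true = np.random.randint(0, 4, 288)                     # placeholder trial labels
y_pred = np.random.randint(0, 4, 288)                     # placeholder model outputs

cm = confusion_matrix(y_true, y_pred, labels=range(len(classes)))
ConfusionMatrixDisplay(cm, display_labels=classes).plot(cmap="Blues")
plt.show()
```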
Figure 10. The t-SNE visualization of the extracted features of Subject 3 from the dual branches in the BCI-IV 2a dataset.
Figure 11. The t-SNE visualization of the extracted features of Subject 3 from the dual branches in the BCI-IV 2b dataset.
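Projections such as those in Figures 10 and 11 can in principle be reproduced from any per-trial feature matrix. The sketch below assumes a generic (n_trials × n_features) array and uses scikit-learn's TSNE with default-style settings, which may differ from the configuration used for the figures.

```python
# Generic sketch of a t-SNE feature visualization (cf. Figures 10 and 11).
# The feature matrix, label array, and perplexity are illustrative assumptions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.random.randn(288, 64)          # (n_trials, n_features) from one branch
labels = np.random.randint(0, 4, 288)        # motor imagery class per trial

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
for c in range(4):
    plt.scatter(emb[labels == c, 0], emb[labels == c, 1], s=8, label=f"class {c}")
plt.legend()
plt.show()
```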
Figure 12. The Blocked-Integration Self-Attention scores of a single subject from the BCI-IV 2a and 2b datasets.
Table 1. Classification results of subject-independent methods on BCI IV 2a.
Method | S01 | S02 | S03 | S04 | S05 | S06 | S07 | S08 | S09 | Avg
TSFBCSP-MRPGA | 86.11% | 61.81% | 86.81% | 71.18% | 62.85% | 56.25% | 90.28% | 80.56% | 78.47% | 74.92%
SincNet | 82.76% | 68.97% | 79.31% | 65.52% | 58.62% | 48.28% | 86.21% | 89.66% | 89.87% | 74.26%
DAFS | 81.94% | 64.58% | 88.89% | 73.61% | 70.49% | 56.67% | 85.42% | 79.15% | 81.60% | 75.85%
C2CMe | 83.3% | 53.7% | 87.0% | 55.6% | 50.0% | 27.3% | 86.1% | 77.8% | 72.70% | 65.9%
EEGNext | 86.25% | 60.71% | 93.38% | 70.27% | 67.14% | 70.63% | 88.84% | 85.89% | 86.16% | 78.81%
Spatio-Temporal | 82.99% | 56.25% | 93.06% | 84.03% | 68.75% | 58.34% | 79.72% | 87.67% | 86.81% | 77.51%
FBCNet | 85.42% | 60.42% | 90.63% | 76.39% | 74.31% | 53.82% | 84.38% | 79.51% | 80.90% | 76.20%
DRDA | 83.19% | 55.14% | 87.43% | 75.28% | 62.29% | 57.15% | 86.18% | 83.31% | 82.00% | 74.74%
Conformer | 88.19% | 61.46% | 93.4% | 78.13% | 52.08% | 65.28% | 92.36% | 89.18% | 88.89% | 78.66%
Our Proposed Method | 86.8% | 51.73% | 93.05% | 79.86% | 77.43% | 70.48% | 91.31% | 78.12% | 87.50% | 79.58%
Table 2. Classification results of subject-independent methods on BCI IV 2b.
Method | S01 | S02 | S03 | S04 | S05 | S06 | S07 | S08 | S09 | Avg
TSFBCSP-MRPGA | 75.63% | 62.5% | 61.25% | 97.81% | 87.81% | 85% | 81.25% | 92.81% | 85.31% | 81.04%
SincNet | 83.33% | 61.76% | 58.33% | 97.3% | 91.89% | 88.89% | 86.11% | 92.11% | 91.67% | 83.49%
DAFS | 70.31% | 73.57% | 80.31% | 94.69% | 95.00% | 83.75% | 93.73% | 95% | 75.31% | 84.63%
CD-LOSS | 79.69% | 60.71% | 82.19% | 96.87% | 94.37% | 89.37% | 82.19% | 93.75% | 90.00% | 85.46%
NCA | 85.6% | 66.3% | 63.7% | 99.4% | 91.3% | 79.2% | 86.9% | 94.4% | 89.4% | 84.02%
S3T | 81.67% | 68.33% | 66.67% | 98.33% | 88.33% | 90% | 85% | 93.33% | 86.67% | 84.26%
DRDA | 81.37% | 62.86% | 63.63% | 95.94% | 93.56% | 88.19% | 85% | 95.25% | 90% | 83.98%
Conformer | 82.5% | 65.71% | 63.75% | 98.44% | 86.56% | 90.31% | 87.81% | 94.38% | 92.19% | 84.63%
Our Proposed Method | 76.56% | 73.21% | 82.18% | 96.56% | 99.37% | 85.37% | 94.68% | 90.62% | 84.37% | 87.01%
Table 3. The compared models of the ablation experiment.
Model Name | Small Kernel | Large Kernel | Attention
Branch I—Large kernel | – | ✓ | –
Branch II—Small kernel | ✓ | – | –
Dual-Branch | ✓ | ✓ | –
Our Proposed Model | ✓ | ✓ | ✓