Abstract
Brain–computer interfaces based on motor imagery (MI-BCIs) offer a promising noninvasive communication pathway between humans and external devices such as robots. For MI-BCIs based on electroencephalography (EEG), however, the reliability of the interface across recording sessions is limited by the temporal non-stationarity of the signal. Overcoming this barrier is critical to translating MI-BCIs from controlled laboratory environments to practical use. In this paper, we present a comprehensive dual-validation framework for rigorously evaluating the temporal robustness of an EEG-based MI-BCI. We collected data from six participants performing four motor imagery tasks (left/right hand and foot). Features were extracted using Common Spatial Patterns (CSP), and ten machine learning classifiers were assessed within a unified pipeline. Our method integrates within-session evaluation (stratified K-fold cross-validation) with cross-session testing (bidirectional train/test splits), complemented by stability metrics and an assessment of performance heterogeneity. The findings reveal minimal performance loss between the two conditions, with an average accuracy drop of just 2.5%. The AdaBoost classifier achieved the highest within-session performance (84.0% system accuracy; F1-scores of 83.8%/80.9% for hand/foot), while the K-nearest neighbors (KNN) classifier demonstrated the best cross-session robustness (81.2% system accuracy; F1-scores of 80.5%/80.2% for hand/foot; robustness score of 0.663). This study shows that robust cross-session performance is attainable for MI-BCIs, supporting the pathway toward reliable, real-world clinical deployment.
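The following is a minimal sketch of the dual-validation scheme the abstract describes, not the authors' exact implementation. It assumes preprocessed EEG epochs of shape (n_trials, n_channels, n_samples) per session; the synthetic data, the number of CSP components, and the KNN hyperparameters are illustrative assumptions. It uses MNE's `CSP` transformer and scikit-learn for the classifier and cross-validation.

```python
import numpy as np
from mne.decoding import CSP
from sklearn.pipeline import Pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def make_session(n_trials=80, n_channels=16, n_samples=250):
    """Stand-in for one recording session: random epochs and binary labels."""
    X = rng.standard_normal((n_trials, n_channels, n_samples))
    y = rng.integers(0, 2, size=n_trials)
    return X, y

X_s1, y_s1 = make_session()  # session 1
X_s2, y_s2 = make_session()  # session 2

# CSP spatial filtering followed by one of the evaluated classifiers
# (KNN shown here, since it gave the best cross-session robustness).
clf = Pipeline([
    ("csp", CSP(n_components=4, log=True)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])

# Within-session evaluation: stratified K-fold on a single session.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
within = cross_val_score(clf, X_s1, y_s1, cv=cv).mean()

# Cross-session evaluation: bidirectional train/test across sessions.
acc_12 = clf.fit(X_s1, y_s1).score(X_s2, y_s2)  # train on s1, test on s2
acc_21 = clf.fit(X_s2, y_s2).score(X_s1, y_s1)  # train on s2, test on s1
cross = (acc_12 + acc_21) / 2

print(f"within-session: {within:.3f}, cross-session: {cross:.3f}")
```

The gap between the within-session and cross-session scores is the quantity the framework targets; on real data, a small gap (the paper reports an average drop of 2.5%) indicates temporal robustness.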