Abstract
Objective. Consumer-grade EEG devices could enable widespread brain–computer interface (BCI) deployment, but they pose significant challenges for emotion recognition because of their reduced spatial coverage and the variable signal quality encountered in uncontrolled environments. Although deep learning approaches have adopted increasingly complex architectures, their efficacy on noisy consumer-grade signals and their cross-system generalizability remain unexplored. We present a systematic comparison of the EEGNet architecture, which has become a benchmark model for consumer-grade EEG analysis, against traditional machine learning (ML), examining when and why domain-specific feature engineering outperforms end-to-end learning in resource-constrained scenarios.
Approach. We conducted within-dataset evaluation on the DREAMER dataset (23 subjects, 14-channel Emotiv EPOC) and cross-dataset validation (DREAMER→SEED-VII transfer). The traditional ML pipeline combined domain-specific feature engineering (statistical, frequency-domain, and connectivity features) with random forest classification. The deep learning pipeline used both optimized and enhanced EEGNet architectures designed for low-channel consumer EEG systems. For cross-dataset validation, we implemented progressive domain adaptation combining anatomical channel mapping, correlation alignment (CORAL), and transfer component analysis (TCA) subspace learning. Statistical validation comprised 345 evaluations (fivefold cross-validation × 3 seeds × 23 subjects), Wilcoxon signed-rank tests, and Cohen’s d effect sizes.
Main results. Traditional ML achieved superior within-dataset performance (F1 = 0.945 ± 0.034 versus 0.567 for EEGNet architectures, p < 0.000001, Cohen’s d = 3.863, a 67% relative improvement) across the 345 evaluations. Cross-dataset validation reached F1 = 0.619 (versus 0.007) through systematic domain adaptation. Progressive improvements came from anatomical channel mapping (5.8×), CORAL domain adaptation (2.7×), and TCA subspace learning (4.5×). Feature analysis showed that inter-channel connectivity patterns contributed 61% of the discriminative power. Traditional ML was also more computationally efficient (95% faster training, 10× faster inference) and highly stable (coefficient of variation, CV = 0.036). Fairness validation experiments showed that the advantage of traditional ML persisted even with minimal feature engineering (F1 = 0.842 versus 0.646 for enhanced EEGNet), and robustness analysis showed that deep learning degraded more under consumer-grade noise conditions (17% versus <1% degradation).
Significance. These findings challenge the assumption that architectural complexity universally improves biosignal processing performance in consumer-grade applications. By comparing traditional ML against the EEGNet architecture, we highlight the potential of domain-specific feature engineering and lightweight adaptation techniques to provide superior accuracy, stability, and practical deployability for consumer-grade EEG emotion recognition. Although our empirical comparison focused on EEGNet, the underlying principles regarding data efficiency, noise robustness, and the value of domain expertise may extend to other complex architectures facing similar constraints, which remains a question for further research. The proposed domain adaptation framework enables robust cross-system deployment, addressing critical gaps in real-world BCI applications.
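As a reading aid, the headline arithmetic can be checked with a minimal sketch. It reproduces the evaluation count (5 folds × 3 seeds × 23 subjects) and the relative F1 improvement from the two reported means. The function names are illustrative, and the pooled-standard-deviation form of Cohen’s d is an assumption: the abstract does not state which variant was used, so this is not necessarily the authors' calculation.

```python
from itertools import product
from math import sqrt
from statistics import mean, variance

# Evaluation grid stated in the abstract: 5 folds x 3 seeds x 23 subjects.
N_FOLDS, N_SEEDS, N_SUBJECTS = 5, 3, 23
evaluation_grid = list(product(range(N_SUBJECTS), range(N_SEEDS), range(N_FOLDS)))
n_evaluations = len(evaluation_grid)  # 345


def relative_improvement(new_score, baseline_score):
    """Fractional gain of new_score over baseline_score."""
    return (new_score - baseline_score) / baseline_score


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation.

    Assumption: the pooled-SD variant is used here for illustration only;
    a paired design might instead use the SD of the paired differences.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Reported mean F1 scores: 0.945 (traditional ML) versus 0.567 (EEGNet).
f1_gain = relative_improvement(0.945, 0.567)  # about 0.667, i.e. the reported 67%
```

Given per-evaluation F1 scores for the two pipelines (not included in the abstract), `cohens_d` could be applied to the two lists of 345 values; the toy inputs used to test it are not data from the study.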