Open Access Article
Multi-Domain Feature Fusion Transformer with Cross-Domain Robustness for Facial Expression Recognition
by Katherine Lin Shu 1 and Mu-Jiang-Shan Wang 2,*
1 Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9PL, UK
2 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
* Author to whom correspondence should be addressed.
Symmetry 2026, 18(1), 15; https://doi.org/10.3390/sym18010015
Submission received: 18 November 2025 / Revised: 4 December 2025 / Accepted: 10 December 2025 / Published: 21 December 2025
(This article belongs to the Section Computer)
Abstract
Facial expression recognition (FER) is a key task in affective computing and human–computer interaction, aiming to decode facial muscle movements into emotional categories. Although deep learning-based FER has achieved remarkable progress, robust recognition under uncontrolled conditions (e.g., illumination change, pose variation, occlusion, and cultural diversity) remains challenging. Traditional Convolutional Neural Networks (CNNs) are effective at local feature extraction but limited in modeling global dependencies, while Vision Transformers (ViTs) provide global context modeling yet often neglect fine-grained texture and frequency cues that are critical for subtle expression discrimination. Moreover, existing approaches usually focus on single-domain representations and lack adaptive strategies to integrate heterogeneous cues across spatial, semantic, and spectral domains, leading to limited cross-domain generalization. To address these limitations, this study proposes a unified Multi-Domain Feature Enhancement and Fusion (MDFEFT) framework that combines a ViT-based global encoder with three complementary branches (channel, spatial, and frequency) for comprehensive feature learning. Taking into account the approximate bilateral symmetry of human faces and the asymmetric distortions introduced by pose, occlusion, and illumination, the proposed MDFEFT framework is designed to learn symmetry-aware and asymmetry-robust representations for facial expression recognition across diverse domains. An adaptive Cross-Domain Feature Enhancement and Fusion (CDFEF) module is further introduced to align and integrate heterogeneous features, achieving domain-consistent and illumination-robust expression understanding. The experimental results show that the proposed method consistently outperforms existing CNN-, Transformer-, and ensemble-based models. The proposed model achieves accuracies of 0.997, 0.796, and 0.776 on KDEF, FER2013, and RAF-DB, respectively. Compared with the strongest baselines, it further improves accuracy by 0.3%, 2.2%, and 1.9%, while also providing higher F1-scores and better robustness in cross-domain testing. These results confirm the effectiveness and strong generalization ability of the proposed framework for real-world facial expression recognition.
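To make the described architecture concrete (a ViT-based global encoder feeding channel, spatial, and frequency branches whose outputs are adaptively fused by the CDFEF module before classification), the following is a minimal PyTorch sketch. It is not the authors' implementation: every module design, name (ChannelBranch, SpatialBranch, FrequencyBranch, CDFEFusion, MDFEFTSketch), and dimension below is a hypothetical stand-in for the components named in the abstract.

import torch
import torch.nn as nn


class ChannelBranch(nn.Module):
    """Channel re-weighting of token features (assumed squeeze-and-excitation-style design)."""
    def __init__(self, dim, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim), nn.Sigmoid())

    def forward(self, x):                        # x: (B, N, D) patch tokens
        w = self.fc(x.mean(dim=1))               # squeeze: global average over tokens
        return x * w.unsqueeze(1)                # excite: per-channel gating


class SpatialBranch(nn.Module):
    """Token-wise (spatial) attention over the patch sequence (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):
        a = torch.softmax(self.score(x), dim=1)  # (B, N, 1) attention over tokens
        return x * a


class FrequencyBranch(nn.Module):
    """Spectral cues: FFT magnitude along the token axis plus a linear projection (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        mag = torch.fft.fft(x, dim=1).abs()      # real-valued magnitude spectrum, (B, N, D)
        return self.proj(mag)


class CDFEFusion(nn.Module):
    """Adaptive cross-domain fusion: learned softmax weights over the three branch outputs."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(3 * dim, 3)

    def forward(self, c, s, f):
        pooled = torch.cat([c.mean(1), s.mean(1), f.mean(1)], dim=-1)
        w = torch.softmax(self.gate(pooled), dim=-1)   # (B, 3) branch weights
        return w[:, 0:1, None] * c + w[:, 1:2, None] * s + w[:, 2:3, None] * f


class MDFEFTSketch(nn.Module):
    """Global encoder + three branches + adaptive fusion + classifier (illustrative only)."""
    def __init__(self, dim=192, num_classes=7):
        super().__init__()
        # A real system would use a pretrained ViT; a small Transformer encoder stands in here.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.channel, self.spatial = ChannelBranch(dim), SpatialBranch(dim)
        self.freq, self.fusion = FrequencyBranch(dim), CDFEFusion(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, tokens):                   # tokens: (B, N, dim) patch embeddings
        g = self.encoder(tokens)
        fused = self.fusion(self.channel(g), self.spatial(g), self.freq(g))
        return self.head(fused.mean(dim=1))      # logits over expression classes


model = MDFEFTSketch()
logits = model(torch.randn(2, 196, 192))         # e.g., 14 x 14 = 196 ViT patch tokens

For simplicity the frequency branch here operates on the ViT token sequence; the paper may instead apply spectral analysis to image or feature maps, and the actual branch and fusion designs may differ.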