This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

SFE-DETR: An Enhanced Transformer-Based Face Detector for Small Target Faces in Open Complex Scenes

School of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
* Author to whom correspondence should be addressed.
Sensors 2026, 26(1), 125; https://doi.org/10.3390/s26010125
Submission received: 20 November 2025 / Revised: 19 December 2025 / Accepted: 22 December 2025 / Published: 24 December 2025
(This article belongs to the Section Optical Sensors)

Abstract

Face detection is a core computer vision task with wide application. In open, complex scenes with dense faces, occlusion, and image degradation, however, detecting small faces remains difficult because the targets are extremely small, hard to localize, and subject to severe background interference. To address these issues, this paper proposes SFE-DETR, a small face detector for open complex scenes that aims to improve detection accuracy and computational efficiency together. The backbone adopts an inverted residual shift convolution and a dilated reparameterization structure, which enhance shallow features and enable self-adaptation of deep features, thereby better preserving small-scale information while reducing the number of parameters. A multi-head multi-scale self-attention mechanism fuses multi-scale convolutional features with channel-wise weighting, capturing fine-grained facial features while suppressing background noise. A redesigned SFE-FPN introduces high-resolution layers and a novel feature fusion module with local, large-scale, and global branches, which efficiently aggregates multi-level features and significantly improves small face detection performance. In experiments on two challenging small face detection datasets, SFE-DETR has 28.1% fewer parameters than the original RT-DETR-R18 model and achieves a mAP50 of 94.7% and an AP-s of 42.1% on the SCUT-HEAD dataset, and a mAP50 of 86.3% on the WIDER FACE (Hard) subset. These results indicate that SFE-DETR achieves the best detection performance among models of the same scale while remaining efficient.
Keywords: object detection; small face detection; feature extraction; feature fusion pyramid; RT-DETR; open complex scenes
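The mAP50 figures in the abstract count a detection as correct only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The sketch below is a minimal illustration, not code from the paper, of why the "difficult localization" of small faces matters under that criterion: the same pixel offset costs a small box far more IoU than a large one. The box sizes and the 4-pixel shift are hypothetical values chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted(box, dx):
    """Return the box moved horizontally by dx pixels."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1, x2 + dx, y2)


# Hypothetical ground-truth boxes: a 10x10 face and a 100x100 face.
small_face = (0, 0, 10, 10)
large_face = (0, 0, 100, 100)
shift = 4  # hypothetical localization error in pixels

small_iou = iou(small_face, shifted(small_face, shift))
large_iou = iou(large_face, shifted(large_face, shift))

print(f"small face IoU: {small_iou:.3f}")
print(f"large face IoU: {large_iou:.3f}")
print("small face counted at IoU >= 0.5:", small_iou >= 0.5)
print("large face counted at IoU >= 0.5:", large_iou >= 0.5)
```

With these assumed values, the shifted prediction for the small face falls below the 0.5 threshold while the one for the large face stays well above it.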

Share and Cite

MDPI and ACS Style

Yang, C.; Jiang, Y.; Song, C. SFE-DETR: An Enhanced Transformer-Based Face Detector for Small Target Faces in Open Complex Scenes. Sensors 2026, 26, 125. https://doi.org/10.3390/s26010125

AMA Style

Yang C, Jiang Y, Song C. SFE-DETR: An Enhanced Transformer-Based Face Detector for Small Target Faces in Open Complex Scenes. Sensors. 2026; 26(1):125. https://doi.org/10.3390/s26010125

Chicago/Turabian Style

Yang, Chenhao, Yueming Jiang, and Chunyan Song. 2026. "SFE-DETR: An Enhanced Transformer-Based Face Detector for Small Target Faces in Open Complex Scenes" Sensors 26, no. 1: 125. https://doi.org/10.3390/s26010125

APA Style

Yang, C., Jiang, Y., & Song, C. (2026). SFE-DETR: An Enhanced Transformer-Based Face Detector for Small Target Faces in Open Complex Scenes. Sensors, 26(1), 125. https://doi.org/10.3390/s26010125

