
Search Results (343)

Search Parameters:
Keywords = adaptive feature pyramid network

26 pages, 8183 KB  
Article
MEE-DETR: Multi-Scale Edge-Aware Enhanced Transformer for PCB Defect Detection
by Xiaoyu Ma, Xiaolan Xie and Yuhui Song
Electronics 2026, 15(3), 504; https://doi.org/10.3390/electronics15030504 - 23 Jan 2026
Abstract
Defect inspection of printed circuit boards (PCBs) is essential for maintaining the safety and reliability of electronic products. With the continuing trend toward smaller components and higher integration levels, identifying tiny imperfections on densely packed PCB structures has become increasingly difficult and remains a major challenge for current inspection systems. To tackle this problem, this study proposes the Multi-Scale Edge-Aware Enhanced Detection Transformer (MEE-DETR), a deep learning-based object detection method. Building upon the Transformer-based RT-DETR framework, the proposed approach introduces enhancements at three levels: backbone feature extraction, feature interaction, and multi-scale feature fusion. First, the proposed Edge-Strengthened Backbone Network (ESBN) constructs multi-scale edge extraction and semantic fusion pathways, effectively strengthening the structural representation of shallow defect edges. Second, the Entanglement Transformer Block (ETB) integrates frequency self-attention, spatial self-attention, and a frequency–spatial entangled feed-forward network, enabling deep cross-domain information interaction and consistent feature representation. Finally, the proposed Adaptive Enhancement Feature Pyramid Network (AEFPN), incorporating the Adaptive Cross-scale Fusion Module (ACFM) for cross-scale adaptive weighting and the Enhanced Feature Extraction C3 Module (EFEC3) for local nonlinear enhancement, substantially improves detail preservation and semantic balance during feature fusion. Experiments on the PKU-Market-PCB dataset show that MEE-DETR delivers notable performance gains: Precision, Recall, and mAP50–95 improve by 2.5%, 9.4%, and 4.2%, respectively, while the parameter count is reduced by 40.7%. These results indicate that MEE-DETR achieves strong detection performance with a lightweight network architecture.
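
The abstract above does not spell out how the ACFM's cross-scale adaptive weighting works. As a rough illustration only, the following PyTorch sketch shows one common way such fusion is implemented, with learnable per-scale weights normalized on the fly (in the style of BiFPN's fast normalized fusion); the class name, layer choices, and weighting scheme are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveCrossScaleFusion(nn.Module):
    """Fuse same-channel feature maps from different pyramid levels with
    learnable, softly normalized weights (illustrative sketch)."""
    def __init__(self, num_inputs: int, channels: int):
        super().__init__()
        # One non-negative scalar weight per input scale.
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, feats):
        # Resize every input to the spatial size of the first one.
        target = feats[0].shape[-2:]
        feats = [f if f.shape[-2:] == target
                 else F.interpolate(f, size=target, mode="nearest")
                 for f in feats]
        w = F.relu(self.weights)
        w = w / (w.sum() + 1e-4)          # fast normalized fusion
        fused = sum(wi * fi for wi, fi in zip(w, feats))
        return self.conv(fused)

# Usage: fuse a stride-8 map with an upsampled stride-16 map, both 256-channel.
p3, p4 = torch.randn(1, 256, 80, 80), torch.randn(1, 256, 40, 40)
out = AdaptiveCrossScaleFusion(2, 256)([p3, p4])
```
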
28 pages, 8014 KB  
Article
YOLO-UMS: Multi-Scale Feature Fusion Based on YOLO Detector for PCB Surface Defect Detection
by Hong Peng, Wenjie Yang and Baocai Yu
Sensors 2026, 26(2), 689; https://doi.org/10.3390/s26020689 - 20 Jan 2026
Abstract
Printed circuit boards (PCBs) are critical in the electronics industry. As PCB layouts grow increasingly complex, defect detection often encounters challenges such as low image contrast, uneven brightness, minute defect sizes, and irregular shapes, making rapid and accurate automated inspection difficult. To address these challenges, this paper proposes a novel object detector, YOLO-UMS, designed to enhance the accuracy and speed of PCB surface defect detection. First, a lightweight plug-and-play Unified Multi-Scale Feature Fusion Pyramid Network (UMSFPN) is proposed to process and fuse multi-scale information across different resolution layers. The UMSFPN uses a Cross-Stage Partial Multi-Scale Module (CSPMS) and an optimized fusion strategy, balancing the integration of fine-grained edge information from shallow layers with coarse-grained semantic information from deep layers. Second, the paper introduces a lightweight RG-ELAN module, based on the ELAN network, to enhance feature extraction for small targets in complex scenes; it uses low-cost operations to generate redundant feature maps and reduce computational complexity. Finally, the Adaptive Interaction Feature Integration (AIFI) module enriches high-level features by eliminating redundant interactions among shallow-layer features, and the channel-priority convolutional attention module (CPCA), deployed in the detection head, strengthens the representation of small-target features. Experimental results show that the UMSFPN neck improves AP50 by 3.1% and AP by 2% on the self-collected PCB-M dataset, outperforming the original PAFPN neck; UMSFPN also achieves strong results across different detectors and datasets, verifying its broad applicability. Without pre-trained weights, YOLO-UMS achieves 84% AP50 on the PCB-M dataset, a 6.4% improvement over the YOLO11 baseline. Comparisons with existing detection algorithms show that YOLO-UMS performs well in detection accuracy, providing a feasible solution for efficient and accurate detection of PCB surface defects in industry.
(This article belongs to the Section Physical Sensors)
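
For readers unfamiliar with the channel-priority convolutional attention (CPCA) mentioned above, the sketch below shows the general pattern: channel attention applied first, followed by cheap depthwise strip convolutions for spatial re-weighting. Kernel sizes and layer layout are illustrative assumptions, not the module as published.

```python
import torch
import torch.nn as nn

class ChannelPriorityAttention(nn.Module):
    """Channel attention followed by depthwise strip convolutions for
    spatial attention, in the spirit of CPCA (illustrative sketch)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Depthwise strip convolutions approximate large spatial kernels cheaply.
        self.strip_h = nn.Conv2d(channels, channels, (1, 7), padding=(0, 3), groups=channels)
        self.strip_v = nn.Conv2d(channels, channels, (7, 1), padding=(3, 0), groups=channels)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        x = x * self.channel_mlp(x)               # channel attention first
        spatial = self.proj(self.strip_v(self.strip_h(x)))
        return x * torch.sigmoid(spatial)         # then spatial re-weighting

x = torch.randn(1, 128, 40, 40)
y = ChannelPriorityAttention(128)(x)
```
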

23 pages, 54360 KB  
Article
ATM-Net: A Lightweight Multimodal Fusion Network for Real-Time UAV-Based Object Detection
by Jiawei Chen, Junyu Huang, Zuye Zhang, Jinxin Yang, Zhifeng Wu and Renbo Luo
Drones 2026, 10(1), 67; https://doi.org/10.3390/drones10010067 - 20 Jan 2026
Abstract
UAV-based object detection faces critical challenges, including extreme scale variations (targets occupy 0.1–2% of the image area), bird's-eye-view complexities, and all-weather operational demands. Single RGB sensors degrade under poor illumination, while infrared sensors lack spatial detail. We propose ATM-Net, a lightweight multimodal RGB–infrared fusion network for robust UAV vehicle detection. ATM-Net integrates three innovations: (1) an Asymmetric Recurrent Fusion Module (ARFM) that performs "extraction→fusion→separation" cycles across pyramid levels, balancing cross-modal collaboration and modality independence; (2) Tri-Dimensional Attention (TDA), which recalibrates features through orthogonal Channel–Width, Height–Channel, and Height–Width branches for comprehensive multi-dimensional feature enhancement; and (3) a Multi-scale Adaptive Feature Pyramid Network (MAFPN) that constructs enhanced representations via bidirectional flow and multi-path aggregation. Experiments on the VEDAI and DroneVehicle datasets demonstrate superior performance (92.4% mAP50 and 64.7% mAP50-95 on VEDAI, 83.7% mAP on DroneVehicle) with only 4.83M parameters. ATM-Net achieves a favorable accuracy–efficiency balance for resource-constrained UAV edge platforms.
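
The Tri-Dimensional Attention described above rotates the feature cube so attention can be computed over three orthogonal planes. A minimal PyTorch sketch of that idea (in the spirit of triplet attention) follows; the pooling and gating details are assumptions, not the ATM-Net implementation.

```python
import torch
import torch.nn as nn

class PlaneGate(nn.Module):
    """Attention over one 2-D plane of the (C, H, W) cube, computed from
    max- and mean-pooling along the remaining axis."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, 7, padding=3)

    def forward(self, x):  # pools over dim 1 (whichever axis was permuted there)
        pooled = torch.cat([x.max(dim=1, keepdim=True).values,
                            x.mean(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class TriDimensionalAttention(nn.Module):
    """Three orthogonal branches (H-W, C-W, H-C), averaged; an
    illustrative sketch of the TDA idea, not the authors' code."""
    def __init__(self):
        super().__init__()
        self.hw, self.cw, self.hc = PlaneGate(), PlaneGate(), PlaneGate()

    def forward(self, x):                                         # x: (B, C, H, W)
        b1 = self.hw(x)                                           # H-W plane
        b2 = self.cw(x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)   # C-W plane
        b3 = self.hc(x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)   # H-C plane
        return (b1 + b2 + b3) / 3.0

y = TriDimensionalAttention()(torch.randn(1, 64, 32, 32))
```
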

25 pages, 9860 KB  
Article
Symmetry-Aware SXA-YOLO: Enhancing Tomato Leaf Disease Recognition with Bidirectional Feature Fusion and Task Decoupling
by Guangyue Du, Shuyu Fang, Lianbin Zhang, Wanlu Ren and Biao He
Symmetry 2026, 18(1), 178; https://doi.org/10.3390/sym18010178 - 18 Jan 2026
Abstract
Tomatoes are an important economic crop in China, and crop diseases often reduce their yield. Deep learning-based visual recognition has become a common approach to disease identification; however, challenges remain due to complex background interference in the field and the diversity of disease manifestations. To address these issues, this paper proposes the SXA-YOLO symmetric perception recognition model (an improvement on YOLO, where S stands for the SAAPAN architecture, X for the XIoU loss function, and A for the AsDDet module). First, a comprehensive symmetry architecture is established. The backbone network creates a hierarchical feature foundation through C3k2 (Cross-stage Partial Concatenated Bottleneck Convolution with Dual-kernel Design) and SPPF (Spatial Pyramid Pooling - Fast) modules; the neck employs the SAAPAN (Symmetry-Aware Adaptive Path Aggregation Architecture) bidirectional feature pyramid, utilizing multiple modules to achieve balanced fusion of multi-scale features; and the detection head is based on the AsDDet (Adaptive Symmetry-aware Decoupled Detection Head) module for functional decoupling, combining dynamic label assignment and the XIoU (Extended Intersection over Union) loss function to jointly optimize classification, regression, and confidence prediction. Ultimately, a complete recognition framework is formed through triple symmetric optimization of "feature hierarchy, fusion path, and task functionality." Experimental results indicate that this method effectively enhances recognition performance, achieving a Precision (P) of 0.992 and an mAP50 (mean Average Precision at a 50% IoU threshold) of 0.993. Furthermore, across ten disease categories, the SXA-YOLO model outperforms the comparison models in both P and mAP50. The improved algorithm enhances the recognition of tomato foliar diseases with a high level of accuracy.
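
The AsDDet head's internals are not given in the abstract, but task decoupling itself has a standard shape: separate convolutional branches for classification and regression, each with its own output projection. The sketch below illustrates that pattern under assumed channel counts; it is not the AsDDet module.

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Separate classification and regression branches, as used by
    task-decoupled detection heads (illustrative sketch)."""
    def __init__(self, in_ch: int, num_classes: int):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
                nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU())
        self.cls_branch, self.reg_branch = branch(), branch()
        self.cls_out = nn.Conv2d(in_ch, num_classes, 1)   # per-class scores
        self.box_out = nn.Conv2d(in_ch, 4, 1)             # box offsets
        self.obj_out = nn.Conv2d(in_ch, 1, 1)             # confidence

    def forward(self, x):
        c, r = self.cls_branch(x), self.reg_branch(x)
        return self.cls_out(c), self.box_out(r), self.obj_out(r)

cls, box, obj = DecoupledHead(256, 10)(torch.randn(1, 256, 20, 20))
```
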

27 pages, 32247 KB  
Article
A Dual-Resolution Network Based on Orthogonal Components for Building Extraction from VHR PolSAR Images
by Songhao Ni, Fuhai Zhao, Mingjie Zheng, Zhen Chen and Xiuqing Liu
Remote Sens. 2026, 18(2), 305; https://doi.org/10.3390/rs18020305 - 16 Jan 2026
Abstract
Sub-meter-resolution Polarimetric Synthetic Aperture Radar (PolSAR) imagery enables precise building footprint extraction but introduces complex scattering correlated with fine spatial structures. This renders both traditional methods, which rely on simplified scattering models, and existing deep learning approaches, which sacrifice spatial detail through multi-looking, inadequate for high-precision extraction tasks. To address this, we propose an Orthogonal Dual-Resolution Network (ODRNet) for end-to-end, precise segmentation directly from single-look complex (SLC) data. Unlike complex-valued neural networks, which suffer from high computational cost and optimization difficulties, our approach decomposes the complex-valued data into its orthogonal real and imaginary components, which are fed concurrently into a Dual-Resolution Branch (DRB) with Bilateral Information Fusion (BIF) to balance the trade-off between semantic and spatial detail. Crucially, we introduce an auxiliary Polarization Orientation Angle (POA) regression task to enforce physical consistency between the orthogonal branches. To tackle diverse building scales, we design a Multi-scale Aggregation Pyramid Pooling Module (MAPPM) to enhance contextual awareness and a Pixel-attention Fusion (PAF) module to adaptively fuse dual-branch features. Furthermore, we construct a VHR PolSAR building footprint segmentation dataset to support related research. Experimental results demonstrate that ODRNet achieves 64.3% IoU and a 78.27% F1-score on our dataset, and 73.61% IoU with an 84.8% F1-score on a large-scale SLC scene, confirming the method's effectiveness for high-precision building extraction directly from SLC data.
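
The core preprocessing idea, decomposing complex-valued SLC data into orthogonal real and imaginary components for two real-valued branches, is easy to make concrete. A minimal sketch (channel layout assumed) follows:

```python
import torch

# A single-look complex (SLC) image is complex-valued; splitting it into its
# orthogonal real and imaginary parts yields two real-valued planes per
# polarimetric channel, which standard real-valued CNN branches can consume.
# This is a sketch of that decomposition only, not the ODRNet pipeline.
slc = torch.randn(1, 4, 512, 512, dtype=torch.complex64)  # e.g. HH, HV, VH, VV

real_part = slc.real          # (1, 4, 512, 512), float32
imag_part = slc.imag          # (1, 4, 512, 512), float32

# Each component then feeds its own branch of a dual-branch network:
# out_real = branch_real(real_part); out_imag = branch_imag(imag_part)
```
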

25 pages, 65227 KB  
Article
SAANet: Detecting Dense and Crossed Stripe-like Space Objects Under Complex Stray Light Interference
by Yuyuan Liu, Hongfeng Long, Xinghui Sun, Yihui Zhao, Zhuo Chen, Yuebo Ma and Rujin Zhao
Remote Sens. 2026, 18(2), 299; https://doi.org/10.3390/rs18020299 - 16 Jan 2026
Abstract
With the deployment of mega-constellations, the proliferation of on-orbit Resident Space Objects (RSOs) poses a severe challenge to Space Situational Awareness (SSA). RSOs produce elongated, stripe-like signatures in long-exposure imagery as a result of their relative orbital motion, and the accurate detection of these signatures is essential for critical applications such as satellite navigation and space debris monitoring. However, on-orbit detection faces two challenges: the obscuration of dim RSOs by complex stray light interference, and their dense, overlapping trajectories. To address these challenges, we propose the Shape-Aware Attention Network (SAANet), establishing a unified shape-aware paradigm. The network features a streamlined Shape-Aware Feature Pyramid Network (SA-FPN) with structurally integrated Two-way Orthogonal Attention (TTOA) to explicitly model linear topologies, preserving dim signals under intense stray light. Concurrently, we propose an Adaptive Linear Oriented Bounding Box (AL-OBB) detection head that leverages a joint geometric constraint mechanism to resolve the ambiguity of regressing targets amid dense, overlapping trajectories. Experiments on the AstroStripeSet and StarTrails datasets demonstrate that SAANet achieves state-of-the-art (SOTA) performance, with Recalls of 0.930 and 0.850 and Average Precisions (APs) of 0.864 and 0.815, respectively.
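
Stripe-like targets reward attention that looks along whole rows and columns rather than local windows. The sketch below shows a strip-pooling-style two-way orthogonal attention as one plausible reading of the TTOA idea; kernel sizes and the fusion step are assumptions, not the published module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoWayOrthogonalAttention(nn.Module):
    """Capture elongated, stripe-like structure by pooling along two
    orthogonal axes (strip-pooling style); an illustrative sketch."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv_w = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        # Pool each row / column to a 1-D profile, then broadcast back.
        row = self.conv_h(F.adaptive_avg_pool2d(x, (h, 1))).expand(-1, -1, h, w)
        col = self.conv_w(F.adaptive_avg_pool2d(x, (1, w))).expand(-1, -1, h, w)
        return x * torch.sigmoid(self.fuse(row + col))

y = TwoWayOrthogonalAttention(64)(torch.randn(1, 64, 64, 64))
```
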

24 pages, 6383 KB  
Article
FF-Mamba-YOLO: An SSM-Based Benchmark for Forest Fire Detection in UAV Remote Sensing Images
by Binhua Guo, Dinghui Liu, Zhou Shen and Tiebin Wang
J. Imaging 2026, 12(1), 43; https://doi.org/10.3390/jimaging12010043 - 13 Jan 2026
Abstract
Timely and accurate detection of forest fires through unmanned aerial vehicle (UAV) remote sensing is of paramount importance. However, multiscale targets and complex environmental interference in UAV remote sensing images pose significant challenges for detection tasks. To address these obstacles, this paper presents FF-Mamba-YOLO, a novel framework based on the principles of Mamba and YOLO (You Only Look Once) that leverages new modules and architectures to overcome these limitations. First, we introduce MFEBlock and MFFBlock, based on state space models (SSMs), in the backbone and neck of the network, respectively, enabling the model to effectively capture global dependencies. Second, we construct CFEBlock, a module that performs feature enhancement before SSM processing, improving local feature processing. Furthermore, we propose MGBlock, which adopts a dynamic gating mechanism to enhance the model's adaptive processing capability and robustness. Finally, we enhance the Path Aggregation Feature Pyramid Network (PAFPN) structure to improve feature fusion quality and introduce DySample to improve upsampling quality without significantly increasing computational cost. Experimental results on our self-constructed forest fire image dataset demonstrate that the model achieves 67.4% mAP@50, 36.3% mAP@50:95, and 64.8% precision, outperforming previous state-of-the-art methods. These results highlight the potential of FF-Mamba-YOLO for forest fire monitoring.
(This article belongs to the Section Computer Vision and Pattern Recognition)
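
Of the modules listed above, the dynamic gating mechanism in MGBlock is the most self-contained to illustrate. The sketch below shows a generic content-dependent gate that blends a transformed feature with its input; the transform and gate design are assumptions, since the abstract does not specify them.

```python
import torch
import torch.nn as nn

class DynamicGateBlock(nn.Module):
    """A content-dependent gate that blends a transformed feature with
    its input (illustrative sketch of a dynamic gating mechanism)."""
    def __init__(self, channels: int):
        super().__init__()
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.SiLU())
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        g = self.gate(x)                              # per-channel gate in (0, 1)
        return g * self.transform(x) + (1 - g) * x    # gated residual blend

y = DynamicGateBlock(96)(torch.randn(1, 96, 40, 40))
```
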

20 pages, 3283 KB  
Article
Small-Target Pest Detection Model Based on Dynamic Multi-Scale Feature Extraction and Dimensionally Selected Feature Fusion
by Junjie Li, Wu Le, Zhenhong Jia, Gang Zhou, Jiajia Wang, Guohong Chen, Yang Wang and Yani Guo
Appl. Sci. 2026, 16(2), 793; https://doi.org/10.3390/app16020793 - 13 Jan 2026
Abstract
Pest detection in the field is crucial for realizing smart agriculture. Deep learning-based target detection algorithms have become an important pest identification method due to their high detection accuracy, but existing methods still suffer from misdetection and omission when detecting small-target pests, especially against complex backgrounds. For this reason, this study improves on YOLO11 and proposes a new model, MSDS-YOLO, for enhanced detection of small-target pests. First, a new dynamic multi-scale feature extraction module (C3k2_DMSFE) is introduced, which adapts to different input features and thus effectively captures multi-scale, diverse feature information. Next, a novel Dimensional Selective Feature Pyramid Network (DSFPN) is proposed, which employs adaptive feature selection and multi-dimensional fusion mechanisms to enhance small-target saliency. Finally, the ability to fit small targets is enhanced by adding a 160 × 160 detection head, removing the 20 × 20 detection head, and using the Normalized Gaussian Wasserstein Distance (NWD) combined with CIoU as the position loss for measuring prediction error. In addition, a real small-target pest dataset, Cottonpest2, is constructed to validate the proposed model. Experimental results show that MSDS-YOLO achieves a mAP50 of 86.7% on Cottonpest2, a 3.0% improvement over the baseline, and attains better detection accuracy than other YOLO models on public datasets. Evaluation on these three datasets shows that MSDS-YOLO has excellent robustness and generalization ability.
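
The Normalized Gaussian Wasserstein Distance used in the position loss has a closed form: each box is modeled as a 2-D Gaussian, the squared 2-Wasserstein distance between the Gaussians is computed, and the result is mapped through an exponential. The sketch below follows the tiny-object-detection formulation; the constant `c` is dataset-dependent, and the value here is an assumption.

```python
import torch

def nwd(box1, box2, c: float = 12.8):
    """Normalized Gaussian Wasserstein Distance between boxes given as
    (cx, cy, w, h) tensors; a sketch, with `c` an assumed constant."""
    cx1, cy1, w1, h1 = box1.unbind(-1)
    cx2, cy2, w2, h2 = box2.unbind(-1)
    # Squared 2-Wasserstein distance between the Gaussians
    # N([cx, cy], diag((w/2)^2, (h/2)^2)).
    d2 = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
          + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
    return torch.exp(-torch.sqrt(d2.clamp(min=1e-7)) / c)

pred = torch.tensor([[50., 50., 12., 8.]])
gt   = torch.tensor([[52., 49., 10., 9.]])
loss = 1.0 - nwd(pred, gt)   # in practice combined with CIoU as a weighted sum
```
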

25 pages, 7611 KB  
Article
BFRI-YOLO: Harmonizing Multi-Scale Features for Precise Small Object Detection in Aerial Imagery
by Xue Zeng, Shenghong Fang and Qi Sun
Electronics 2026, 15(2), 297; https://doi.org/10.3390/electronics15020297 - 9 Jan 2026
Abstract
Identifying minute targets in UAV-acquired imagery continues to pose substantial technical hurdles, primarily due to blurred boundaries, scarce textural detail, and drastic scale variations amid complex backgrounds. In response to these limitations, this paper proposes BFRI-YOLO, an enhanced architecture based on the YOLOv11n baseline. The framework is built on four synergistic components designed to achieve high-precision localization and robust feature representation. First, we construct a Balanced Adaptive Feature Pyramid Network (BAFPN) that utilizes a resolution-aware attention mechanism to promote bidirectional interaction between deep and shallow features. This is complemented by the Receptive Field Convolutional Block Attention Module (RFCBAM), used to refine the backbone: by constructing the C3K2_RFCBAM block, we enhance the feature representation of small objects across diverse receptive fields. To refine the prediction stage, we develop a Four-Shared Detail Enhancement Detection Head (FSDED) to improve both efficiency and stability. Finally, for the loss function, we formulate an Inner-WIoU strategy that integrates auxiliary bounding boxes with dynamic focusing mechanisms to ensure precise target localization. Experimental results on the VisDrone2019 benchmark show that our method attains mAP@0.5 and mAP@0.5:0.95 scores of 42.1% and 25.6%, respectively, outperforming the baseline by 8.8% and 6.2%. Further tests on the TinyPerson and DOTA1.0 datasets validate the model's generalization capability, confirming that BFRI-YOLO strikes a favorable balance between detection accuracy and computational overhead in aerial scenes.
(This article belongs to the Section Artificial Intelligence)
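
The auxiliary-bounding-box idea behind the Inner-WIoU strategy can be shown compactly: compute IoU on copies of both boxes rescaled about their centers, which sharpens or relaxes the overlap signal. The sketch below shows that core Inner-IoU computation; the `ratio` value is an assumed hyperparameter, and the WIoU focusing terms are omitted.

```python
import torch

def inner_iou(b1, b2, ratio: float = 0.75):
    """IoU on auxiliary boxes shrunk (ratio < 1) or grown (ratio > 1)
    about each box center; a sketch of the Inner-IoU core."""
    def scaled_corners(b):
        cx, cy, w, h = b.unbind(-1)
        return (cx - w * ratio / 2, cy - h * ratio / 2,
                cx + w * ratio / 2, cy + h * ratio / 2)

    x11, y11, x12, y12 = scaled_corners(b1)
    x21, y21, x22, y22 = scaled_corners(b2)
    iw = (torch.min(x12, x22) - torch.max(x11, x21)).clamp(min=0)
    ih = (torch.min(y12, y22) - torch.max(y11, y21)).clamp(min=0)
    inter = iw * ih
    union = (x12 - x11) * (y12 - y11) + (x22 - x21) * (y22 - y21) - inter
    return inter / union.clamp(min=1e-7)

iou = inner_iou(torch.tensor([[50., 50., 20., 10.]]),
                torch.tensor([[53., 51., 18., 12.]]))
```
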

24 pages, 3590 KB  
Article
Rotation-Sensitive Feature Enhancement Network for Oriented Object Detection in Remote Sensing Images
by Jiaxin Xu, Hua Huo, Shilu Kang, Aokun Mei and Chen Zhang
Sensors 2026, 26(2), 381; https://doi.org/10.3390/s26020381 - 7 Jan 2026
Abstract
Oriented object detection in remote sensing images remains challenging due to arbitrary target rotations, extreme scale variations, and complex backgrounds. Current rotated detectors still face several limitations: insufficient orientation-sensitive feature representation, feature misalignment for rotated proposals, and unstable optimization of rotation parameters. To address these issues, this paper proposes an enhanced Rotation-Sensitive Feature Pyramid Network (RSFPN) framework. Building upon the effective Oriented R-CNN paradigm, we introduce three core components: (1) a Dynamic Adaptive Feature Pyramid Network (DAFPN) that enables bidirectional multi-scale feature fusion through semantic-guided upsampling and structure-enhanced downsampling paths; (2) an Angle-Aware Collaborative Attention (AACA) module that incorporates orientation priors to guide feature refinement; and (3) a Geometrically Consistent Multi-Task Loss (GC-MTL) that unifies the regression of rotation parameters with periodic smoothing and adaptive weighting mechanisms. Comprehensive experiments on the DOTA-v1.0 and HRSC2016 benchmarks show that RSFPN achieves superior performance, attaining a state-of-the-art mAP of 77.42% on DOTA-v1.0 and 91.85% on HRSC2016 while maintaining efficient inference at 14.5 FPS, a favorable accuracy–efficiency trade-off. Visual analysis confirms that the method produces concentrated, rotation-aware feature responses and effectively suppresses background interference. The approach provides a robust solution for detecting multi-oriented objects in high-resolution remote sensing imagery, with practical value for urban planning, environmental monitoring, and security applications.
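
Periodic smoothing of rotation regression, as in the GC-MTL, typically means removing the angular discontinuity at the period boundary. One standard way to do this is to regress (sin 2θ, cos 2θ) instead of θ, shown below as a hedged sketch rather than the paper's actual loss.

```python
import torch
import torch.nn.functional as F

def periodic_angle_loss(pred_theta, target_theta):
    """Match (sin 2θ, cos 2θ) instead of θ itself, so orientations that
    differ by the π period of a rectangle incur no loss; a common
    periodic-smoothing construction, not the paper's GC-MTL."""
    pred = torch.stack([torch.sin(2 * pred_theta), torch.cos(2 * pred_theta)], -1)
    tgt  = torch.stack([torch.sin(2 * target_theta), torch.cos(2 * target_theta)], -1)
    return F.smooth_l1_loss(pred, tgt)

# Angles π/2 and -π/2 describe the same rectangle orientation: loss ~ 0.
loss = periodic_angle_loss(torch.tensor([1.5708]), torch.tensor([-1.5708]))
```
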

21 pages, 8752 KB  
Article
Remote Sensing Interpretation of Soil Elements via a Feature-Reinforcement Multiscale-Fusion Network
by Zhijun Zhang, Mingliang Tian, Wenbo Gao, Yanliang Wang, Fengshan Zhang and Mo Wang
Remote Sens. 2026, 18(1), 171; https://doi.org/10.3390/rs18010171 - 5 Jan 2026
Abstract
Accurately delineating soil elements from satellite imagery is fundamental for regional geological mapping and surveying. However, vegetation cover and complex geomorphological conditions often obscure diagnostic surface information, weakening the visibility of key geological features. Additionally, long-term tectonic deformation and weathering reshape the spatial organization of soil elements, producing substantial within-class variability, inter-class spectral overlap, and fragmented structural patterns, all of which hinder reliable segmentation by conventional deep learning approaches. To mitigate these challenges, this study introduces a Reinforced Feature and Multiscale Feature Fusion Network (RFMFFNet) tailored for semantic interpretation of soil elements. The model incorporates a rectangular calibration attention (RCA) module into a ResNet101 backbone to recalibrate feature responses in critical regions, improving scale adaptability and the preservation of fine geological structures. A complementary multiscale feature fusion (MFF) component combines sparse self-attention with pyramid pooling, enabling richer context aggregation while reducing computational redundancy. Comprehensive experiments on Landsat-8 and Sentinel-2 datasets verify the effectiveness of the proposed framework: RFMFFNet consistently outperforms several mainstream deep learning models, with oPA and mIoU gains of 2.4% and 2.6% on the Landsat-8 dataset and 4.3% and 4.1% on the Sentinel-2 dataset.
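
The pyramid-pooling half of the MFF component follows a well-known recipe: pool the feature map to a few coarse grids, project each, upsample, and concatenate with the input. The sketch below shows that classic form; the sparse self-attention part is omitted, and the bin sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Classic pyramid pooling for context aggregation; a generic sketch
    of the pooling half of the paper's MFF component."""
    def __init__(self, channels: int, bins=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(b),
                          nn.Conv2d(channels, channels // len(bins), 1))
            for b in bins)
        self.proj = nn.Conv2d(channels * 2, channels, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        # Coarse context at several grid sizes, upsampled back to (h, w).
        ctx = [F.interpolate(s(x), size=(h, w), mode="bilinear",
                             align_corners=False) for s in self.stages]
        return self.proj(torch.cat([x] + ctx, dim=1))

y = PyramidPooling(256)(torch.randn(1, 256, 32, 32))
```
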

23 pages, 41532 KB  
Article
CW-DETR: An Efficient Detection Transformer for Traffic Signs in Complex Weather
by Tianpeng Wang, Qiaoshuang Teng, Shangyu Sun, Weidong Song, Jinhe Zhang and Yuxuan Li
Sensors 2026, 26(1), 325; https://doi.org/10.3390/s26010325 - 4 Jan 2026
Abstract
Traffic sign detection under adverse weather remains challenging due to severe feature degradation caused by rain, fog, and snow, which significantly impairs existing detection systems. This study presents CW-DETR (Complex Weather Detection Transformer), an end-to-end detection framework designed to address weather-induced feature deterioration in real-time applications. Building upon RT-DETR, our approach integrates four key innovations: a multipath feature enhancement network (FPFENet) for preserving fine-grained textures, a Multiscale Edge Enhancement Module (MEEM) for combating boundary degradation, an adaptive dual-stream bidirectional feature pyramid network (ADBF-FPN) for cross-scale feature compensation, and a multiscale convolutional gating module (MCGM) for suppressing semantic–spatial confusion. Extensive experiments on the CCTSDB2021 dataset demonstrate that CW-DETR achieves 69.0% AP and 94.4% AP50, outperforming state-of-the-art real-time detectors by 2.3–5.7 percentage points while maintaining computational efficiency (56.8 GFLOPs). Cross-dataset evaluation on TT100K, the TSRD, CNTSSS, and real-world snow conditions (LNTU-TSD) confirms the robust generalization of the proposed model. These results establish CW-DETR as an effective solution for all-weather traffic sign detection in intelligent transportation systems.
(This article belongs to the Section Remote Sensors)
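
One simple way to realize multiscale edge enhancement of the kind MEEM targets is to treat `x - avgpool(x)` as an edge residual at several smoothing scales and add the projected result back. The sketch below illustrates exactly that; it is an assumed construction, not the MEEM itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeEnhance(nn.Module):
    """Add back high-frequency (edge) residues computed at multiple
    smoothing scales; an assumed sketch of multiscale edge enhancement."""
    def __init__(self, channels: int, kernels=(3, 5, 7)):
        super().__init__()
        self.kernels = kernels
        self.proj = nn.Conv2d(channels * len(kernels), channels, 1)

    def forward(self, x):
        # Subtracting a local average leaves the edge/detail component.
        edges = [x - F.avg_pool2d(x, k, stride=1, padding=k // 2)
                 for k in self.kernels]
        return x + self.proj(torch.cat(edges, dim=1))

y = EdgeEnhance(64)(torch.randn(1, 64, 48, 48))
```
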

30 pages, 18696 KB  
Article
A Lightweight Multi-Module Collaborative Optimization Framework for Detecting Small Unmanned Aerial Vehicles in Anti-Unmanned Aerial Vehicle Systems
by Zhiling Chen, Kuangang Fan, Jingzhen Ye, Zhitao Xu and Yupeng Wei
Drones 2026, 10(1), 20; https://doi.org/10.3390/drones10010020 - 31 Dec 2025
Abstract
In response to the safety threats posed by unauthorized unmanned aerial vehicles (UAVs), anti-UAV systems are becoming increasingly important. In UAV detection tasks, small UAVs are particularly difficult to detect due to their low resolution. This study therefore proposes YOLO-CoOp, a lightweight multi-module collaborative optimization framework for detecting small UAVs. First, a high-resolution feature pyramid network (HRFPN) is proposed to retain more spatial information about small UAVs. Second, a C3k2-WT module integrating wavelet transform convolution is proposed to enhance feature extraction and expand the model's receptive field. Then, a spatial-channel synergistic attention (SCSA) mechanism is introduced to integrate spatial and channel information and strengthen feature fusion. Finally, the DyATF method replaces upsampling with DySample and the confidence loss with adaptive threshold focal loss (ATFL), restoring UAV detail and balancing positive–negative sample weights. Ablation experiments show that YOLO-CoOp achieves 94.3% precision, 93.1% recall, 96.2% mAP50, and 57.6% mAP50-95 on the UAV-SOD dataset, improvements of 3.6%, 10%, 5.9%, and 5% over the baseline model, respectively. Comparison experiments demonstrate that YOLO-CoOp has fewer parameters while maintaining superior detection performance, and cross-dataset validation shows significant performance gains on small object detection tasks.
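
The exact form of the adaptive threshold focal loss (ATFL) is not given in the abstract. As a hedged sketch of the underlying idea, the code below implements a binary focal loss whose focusing exponent switches at a confidence threshold, so samples on each side of the threshold are re-weighted differently; all constants are assumptions.

```python
import torch

def adaptive_focal_loss(p, target, gamma_easy=2.0, gamma_hard=4.0, tau=0.5):
    """Binary focal loss with a threshold-dependent focusing exponent;
    a sketch of the adaptive-threshold idea, not the published ATFL."""
    p = p.clamp(1e-6, 1 - 1e-6)
    pt = torch.where(target > 0, p, 1 - p)        # prob. of the true class
    gamma = torch.where(pt > tau,                 # easy vs. hard split at tau
                        torch.full_like(pt, gamma_easy),
                        torch.full_like(pt, gamma_hard))
    return (-(1 - pt) ** gamma * torch.log(pt)).mean()

loss = adaptive_focal_loss(torch.tensor([0.9, 0.2]), torch.tensor([1.0, 0.0]))
```
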

18 pages, 2002 KB  
Article
YOLOv11-ASV: Research on Classroom Behavior Recognition Method Based on YOLOv11
by Zihao Wang and Tao Fan
Appl. Sci. 2026, 16(1), 432; https://doi.org/10.3390/app16010432 - 31 Dec 2025
Abstract
(1) Background: With the continuous development of intelligent education, classroom behavior recognition has become increasingly important in teaching evaluation and learning analytics. In response to challenges such as occlusion, scale differences, and fine-grained behavior recognition in complex classroom environments, this paper proposes an improved YOLOv11-ASV detection framework. (2) Methods: The framework introduces an Adaptive Spatial Pyramid Network (ASPN) on top of YOLOv11, enhancing contextual modeling through block-level channel partitioning and multi-scale feature fusion mechanisms. Additionally, VanillaNet is adopted as the backbone network to improve global semantic feature representation. (3) Conclusions: Experimental results show that on our self-built classroom behavior dataset (ClassroomDatasets), YOLOv11-ASV achieves 81.5% mAP50 and 62.1% mAP50–95, improvements of 1.6% and 2.9%, respectively, over the baseline model. Performance improves notably on behavior classes such as "reading" and "writing", which are often confused. These results validate the effectiveness of YOLOv11-ASV in improving behavior recognition accuracy and robustness in complex classroom scenarios, providing reliable technical support for practical smart classroom systems.
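
Block-level channel partitioning with multi-scale fusion, as attributed to the ASPN above, generally means splitting the channel dimension into groups and giving each group a different receptive field before re-merging. The sketch below shows one generic realization; the dilated branches are an assumption, since the ASPN design is not specified in the abstract.

```python
import torch
import torch.nn as nn

class ChannelPartitionMultiScale(nn.Module):
    """Split channels into blocks, give each block a different receptive
    field, then re-concatenate; a generic sketch, not the ASPN itself."""
    def __init__(self, channels: int, dilations=(1, 2, 3, 4)):
        super().__init__()
        assert channels % len(dilations) == 0
        c = channels // len(dilations)
        self.branches = nn.ModuleList(
            nn.Conv2d(c, c, 3, padding=d, dilation=d) for d in dilations)
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        parts = torch.chunk(x, len(self.branches), dim=1)   # block-level split
        outs = [b(p) for b, p in zip(self.branches, parts)]
        return self.fuse(torch.cat(outs, dim=1)) + x        # fuse + residual

y = ChannelPartitionMultiScale(128)(torch.randn(1, 128, 40, 40))
```
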

19 pages, 3910 KB  
Article
Defect Detection Algorithm of Galvanized Sheet Based on S-C-B-YOLO
by Yicheng Liu, Gaoxia Fan, Hanquan Zhang and Dong Xiao
Mathematics 2026, 14(1), 110; https://doi.org/10.3390/math14010110 - 28 Dec 2025
Abstract
Galvanized steel sheets are vital anti-corrosion materials, yet their surface quality is prone to defects that impact performance. Manual inspection is inefficient, while conventional machine vision struggles with complex, small-scale defects in industrial settings. Although deep learning offers promising solutions, standard object detection models such as YOLOv5 (You Only Look Once, version 5) exhibit limitations in handling the subtle textures, scale variations, and reflective surfaces characteristic of galvanized sheet defects. To address these challenges, this paper proposes S-C-B-YOLO, an enhanced detection model based on YOLOv5. First, a Squeeze-and-Excitation (SE) attention mechanism is integrated into the deep layers of the backbone network to adaptively recalibrate channel-wise features, improving focus on defect-relevant information. Second, a Transformer block is combined with a C3 module to form a C3TR module, enhancing the model's ability to capture global contextual relationships for irregular defects. Finally, the original path aggregation network (PANet) is replaced with a bidirectional feature pyramid network (Bi-FPN) to enable more efficient multi-scale feature fusion, significantly boosting sensitivity to small defects. Extensive experiments on a dedicated galvanized sheet defect dataset show that S-C-B-YOLO achieves a mean average precision (mAP@0.5) of 92.6% and an inference speed of 62 FPS, outperforming several baselines including YOLOv3, YOLOv7, and Faster R-CNN. The proposed model demonstrates a favorable balance between accuracy and speed, offering a robust and practical solution for automated, real-time defect inspection in galvanized steel production.
(This article belongs to the Special Issue Advance in Neural Networks and Visual Learning)
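
The Squeeze-and-Excitation mechanism the paper integrates is a standard, well-documented block: global average pooling squeezes each channel to a scalar, a bottleneck MLP produces per-channel gates, and the input is rescaled. The canonical form is sketched below; where it sits in the backbone is the paper's design choice and is not shown here.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard Squeeze-and-Excitation block (Hu et al.): squeeze via
    global average pooling, excite via a bottleneck MLP, rescale channels."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # excitation gates
        )

    def forward(self, x):
        return x * self.fc(x)                              # channel recalibration

y = SEBlock(256)(torch.randn(1, 256, 20, 20))
```
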
