Search Results (2,256)

Search Parameters:
Keywords = feature-pyramid

24 pages, 2396 KB  
Article
AD-YOLO: A Unified Method for Traffic-Dense and Small Object Detection in UAV Images
by Yu Deng, Yucong Hu, Yun Ye and Pengpeng Xu
Drones 2026, 10(5), 338; https://doi.org/10.3390/drones10050338 - 1 May 2026
Abstract
The densely distributed, scale-varying objects in unmanned aerial vehicle (UAV) images, together with their dynamic, diverse, and unconstrained backgrounds, make conventional detection methods prone to missed detections, false alarms, and localization biases. To improve UAV vision tasks, we propose AD-YOLO, a unified method tailored for small object detection in traffic-dense settings. First, a module combining an adaptive rotation convolution unit and grouped directional attention with mixed-kernel features is introduced to enhance the model’s orientation invariance and multi-scale discrimination. Then, a dual-path collaborative feature pyramid network is proposed to jointly refine the model’s semantic and spatial details via a multi-directional context aggregation path and a hierarchical semantic progressive fusion path. Last, a hierarchically dense reparameterized large-kernel module is designed to produce broader receptive fields with reduced computational complexity. Extensive experiments on the VisDrone2019 and UAVDT datasets demonstrate that AD-YOLO outperforms state-of-the-art methods in detection accuracy while maintaining favorable computational efficiency. Full article
7904 KB  
Proceeding Paper
Mesh Adaptation on Hybrid Unstructured Meshes for Immersed Boundary Methods with Applications to Industrial Aerodynamics
by Jonatan Núñez-de la Rosa, Esteban Ferrer and Eusebio Valero
Eng. Proc. 2026, 133(1), 62; https://doi.org/10.3390/engproc2026133062 - 30 Apr 2026
Abstract
In this work we present the development and application of a mesh adaptation tool on hybrid unstructured meshes for immersed boundary volume penalization methods in the computational fluid dynamics software from ONERA, DLR, and Airbus. This mesh adaptation tool is capable of refining elements around geometries immersed in unstructured meshes made of different types of elements, like tetrahedra, hexahedra, prisms, and pyramids. This feature allows us to simulate fluid flow problems with the immersed boundary method not only on Cartesian meshes but on general hybrid unstructured meshes. Of special interest in this work is the simulation of turbulent fluid flows in aerodynamics through the numerical solution of the Reynolds-averaged Navier–Stokes equations either on unstructured meshes with only immersed geometries or on unstructured body-fitted meshes along with immersed geometries. As part of the benchmarking, we simulate the subsonic flow past the high-lift multi-element airfoil. The reported numerical simulations are in good agreement with their corresponding full body-fitted meshes. Full article
32 pages, 17164 KB  
Article
A Small Object Detection Transformer for UAV Remote Sensing Imagery via Multi-Scale Perception and Cross-Spatial-Frequency Domain Fusion
by Chenglong Shi, Hui Wang, Xiaolin Fu, Pingping Liu and Hongchang Ke
Remote Sens. 2026, 18(9), 1394; https://doi.org/10.3390/rs18091394 - 30 Apr 2026
Abstract
Small object detection in UAV remote sensing imagery has long faced significant challenges. Existing Transformer-based detectors still suffer from feature degradation and insufficient multi-scale information fusion when handling small objects with sparse pixels and complex backgrounds. To address this, we propose MSF-DETR, a Transformer-based detector with multi-scale perception and cross-spatial-frequency domain fusion. Specifically, we design a multi-scale perception attention feature extraction network that integrates a Poly Kernel Inception module with a bidirectional contextual anchor attention mechanism via a dual-pathway fusion block, enabling simultaneous capture of multi-granularity features and long-range semantic dependencies. We further develop a feature alignment and cross-spatial-frequency enhancement pyramid that enriches shallow-layer spatial details through feature reorganization and leverages a spatial-frequency dual-domain collaborative strategy to capture both local textures and global spectral dependencies. Cross-scale dynamic intensity modulation combined with decoupled lightweight downsampling further effectively suppresses semantic noise, corrects feature misalignment, and preserves critical edge details. Finally, a Shape-NWD loss is devised to incorporate geometric and scale constraints, effectively alleviating the positional sensitivity of IoU for small targets. Extensive experiments on three public benchmarks demonstrate the superior performance of MSF-DETR; notably, on the VisDrone dataset, it achieves improvements of 7.45% and 8.71% in mAP50 and mAP50:95 over the baseline. Full article
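The positional sensitivity of IoU that the abstract refers to can be illustrated with a minimal sketch. It uses the plain Normalized Wasserstein Distance (NWD), which models each box as a 2D Gaussian; it is not the paper's Shape-NWD loss, whose exact form the abstract does not give. The function names, the `(cx, cy, w, h)` box format, and the constant `c=12.8` are assumptions chosen for illustration.

```python
import math

def iou(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union

def nwd(a, b, c=12.8):
    """Plain NWD: each box is treated as a Gaussian with mean (cx, cy)
    and standard deviations (w/2, h/2). c is a dataset-dependent scale
    (assumed value here, for illustration only)."""
    w2_sq = ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
             + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)

gt = (10.0, 10.0, 4.0, 4.0)       # a 4 x 4 pixel target
shifted = (14.0, 10.0, 4.0, 4.0)  # prediction offset by 4 pixels
print(iou(gt, shifted), nwd(gt, shifted))
```

For a 4 × 4 target, a 4-pixel offset already drives IoU to zero, while the Gaussian-based similarity still decays smoothly with distance. That smooth decay is the property such losses rely on for small objects.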
23 pages, 4083 KB  
Article
RD-DETR: A Robust Vehicle Detector via Reaction–Diffusion Mechanisms
by Yi Huang, Yishi Chen, Kaiming Pan, Xiangning Wu, Haoxiang Huang and Yanmei Meng
Appl. Sci. 2026, 16(9), 4378; https://doi.org/10.3390/app16094378 - 30 Apr 2026
Abstract
Vehicle detection is a fundamental perception task in intelligent transportation systems and autonomous driving. Although state-of-the-art detectors achieve competitive performance under normal conditions, their robustness degrades substantially under adverse conditions such as rain, fog, low illumination, and sensor noise. To address this challenge, we propose RD-DETR, a vehicle detector that incorporates reaction–diffusion mechanisms into deep feature learning. The RDNet backbone adopts a pyramid-based enhancement strategy in which shallow layers preserve fine-grained texture details while deep layers employ reaction–diffusion-inspired dynamics to suppress noise and enhance target representations. The Phase-Guided Spatial Attention (PGSA) module leverages phase-related structural cues that are relatively less sensitive to global illumination and contrast variations, helping recover vehicle boundaries when appearance cues become unreliable under adverse imaging conditions. The Content-Aware Adaptive Fusion Module (CA-AFM) dynamically aggregates multi-scale features according to scene complexity, improving detection across diverse traffic scenarios. Experiments on BDD100K and DAWN show that RD-DETR yields mAP@0.5 improvements of 3.2 and 4.0 percentage points over RT-DETR, respectively, while reducing model parameters by 27.6%, indicating a favorable balance between accuracy and efficiency under the evaluated settings. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
27 pages, 1859 KB  
Article
DAFE-Net: Direction-Aware Feature Enhancement Network for SAR Ship Detection
by Junjie Zeng, Xinxin Tang and Shuang Li
Remote Sens. 2026, 18(9), 1380; https://doi.org/10.3390/rs18091380 - 29 Apr 2026
Abstract
Synthetic Aperture Radar (SAR) ship detection is important for maritime surveillance and maritime security. However, existing methods still suffer from insufficient backbone representation, inadequate directional structure modeling, and limited cross-scale interaction under complex backgrounds. To address these issues, we propose a Direction-Aware Feature Enhancement Network (DAFE-Net). First, a Multi-Branch Feature Interaction Module (MBFIM) is designed to improve the collaborative representation of global structures and local details. Second, a Direction-Aware Contrast Enhancement Module (DACEM) is introduced to explicitly model the directional bright–dark coupled structures of SAR ships, thereby improving target–background discrimination under complex clutter. Finally, a Feature-Focused Diffusion Pyramid Network (FFDPN) is constructed to strengthen cross-scale feature interaction and improve the detection of multi-scale ship targets. Experimental results show that the proposed method outperforms several competitive detectors on the merged SSDD and HRSID dataset. Compared with DEIM-D-FINE, our method improves AP by 3.1% and APL by 5.0%. These results demonstrate that the proposed method provides an effective direction-aware modeling approach for SAR ship detection. Full article
(This article belongs to the Special Issue Recent Advances in SAR Object Detection)
21 pages, 12418 KB  
Article
SAR-Based Submesoscale Oceanic Eddy Detection Using Deep Fusion Feature Pyramid Network with Scale-Aware Learning
by Songhao Peng, Yongqiang Chen and Chunle Wang
Remote Sens. 2026, 18(9), 1370; https://doi.org/10.3390/rs18091370 - 29 Apr 2026
Abstract
Submesoscale oceanic eddies play a crucial role in ocean dynamics and climate systems, and Synthetic Aperture Radar (SAR) offers distinct advantages for observing these fine-scale phenomena. However, the advancement of automated detection algorithms is currently hindered by the lack of publicly available, high-quality benchmark datasets. To address this gap, this paper constructs a universal benchmark dataset for submesoscale eddies and presents an improved anchor-free object detection framework based on Fully Convolutional One-Stage (FCOS). We propose two key innovations: (1) a Deep Fusion Feature Pyramid Network (DF-FPN) that integrates adaptive multi-scale feature fusion directly into the pyramid construction process through deep fusion Adaptive Spatial Feature Fusion (ASFF) modules, enabling bidirectional feature enhancement and global context-aware fusion; and (2) a Pixel-level Statistical Description Learning (PSDL) module that enhances feature representation by learning statistical descriptors across multiple scales. The DF-FPN replaces traditional staged optimization with an intrinsic deep fusion paradigm, significantly improving feature quality. Extensive experiments on our constructed dataset demonstrate that our method achieves 66.6% mAP, 91.3% AP50, and 80.5% AP75. These results represent a substantial improvement over the FCOS baseline and outperform other state-of-the-art detectors, providing a robust and efficient solution for operational submesoscale eddy monitoring in SAR imagery. Enhanced detection capacity of this kind offers a critical observational foundation for advancing research on upper-ocean nutrient transport, carbon cycle dynamics, and the dispersion of marine pollutants, thereby supporting broader environmental monitoring and climate-related objectives.
26 pages, 12515 KB  
Article
DAFSDet: Dual-Attention Guided Few-Shot Object Detection in Remote Sensing Images
by Guangshuai Gao, Zhilin Zhang, Wei Zhang, Yunqi Shang, Yan Dong and Jiangtao Xi
Remote Sens. 2026, 18(9), 1345; https://doi.org/10.3390/rs18091345 - 28 Apr 2026
Viewed by 105
Abstract
Few-shot object detection aims to accurately identify and localize novel categories using only a small number of labeled samples. In remote sensing images, however, this task faces significant challenges due to substantial variations in target scale and complex backgrounds. To address these issues, this paper proposes a dual-attention guided few-shot object detection framework, DAFSDet. Specifically, a dual-attention strategy is implemented across the feature modeling and proposal generation stages. For feature fusion, the Content-Aware Strip Pyramid (CASP) is designed to enhance multi-scale feature representation by modeling spatial and contextual information. In the detection stage, a Deformable Attention RPN (DA-RPN) is proposed to improve the localization quality of candidate regions. With these designs, the proposed method effectively mitigates the challenges posed by multi-scale variations and complex backgrounds. Experimental results on the DIOR and NWPU VHR-10 datasets demonstrate consistent improvements over baseline methods, with notable gains of 7.54 mAP on DIOR Split 2 under the 10-shot setting and 2.09 mAP on NWPU VHR-10 under the 3-shot setting. These results indicate that the proposed method offers an effective solution for few-shot object detection in complex remote sensing scenarios. Full article
15 pages, 4149 KB  
Article
LRNet: A Lightweight Detection Model for Foreign Objects on Coal Mine Conveyor
by Lili Xu, Youli Yao and Airan Zhang
Electronics 2026, 15(9), 1848; https://doi.org/10.3390/electronics15091848 - 27 Apr 2026
Viewed by 103
Abstract
Detecting foreign objects on coal mine conveyor belts is critical to the belt transportation of coal. Existing detection models for this task have a large number of parameters, consume substantial computing resources, and detect few types of foreign objects. To address these problems, the original YOLOv13 object detection algorithm is optimized for lightweight design and high precision, yielding a lightweight YOLO network named LRNet that is tailored for foreign object detection on coal mine conveyor belts. First, the lightweight ShuffleNetv2 is used as the backbone network of YOLOv13 to reduce the computational cost and the number of parameters and to improve network parallelism. Second, the Bidirectional Feature Pyramid Network (BiFPN) is used as the feature fusion network to effectively fuse global deep and shallow key detail information. Finally, the Coordinate Attention (CA) mechanism is added to enhance the extraction of key features and strengthen attention on foreign object targets, improving the detection accuracy of the network. The experimental results show that the average detection accuracy of LRNet reaches 91.0% with 3.6 M parameters. The proposed method can quickly and accurately detect foreign objects on coal mine conveyor belts with fewer computational resources, and it also shows strong adaptability and anti-interference ability, which demonstrates the effectiveness of the LRNet model.
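As a rough illustration of how a BiFPN-style node combines deep and shallow features, the sketch below implements the fast normalized fusion rule from the original BiFPN design: learnable weights are clipped to be non-negative and normalized by their sum. It assumes the inputs have already been resized to a common shape and omits the convolution that normally follows. The function name and the example weights are hypothetical, and nothing here is taken from the LRNet implementation.

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Weighted fusion of same-shape feature maps.

    weights would be learnable scalars in a real network; ReLU keeps
    them non-negative and eps avoids division by zero.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    w = w / (eps + w.sum())
    return sum(wi * f for wi, f in zip(w, features))

shallow = np.ones((8, 16, 16))        # e.g. high-resolution detail features
deep = 3.0 * np.ones((8, 16, 16))     # e.g. upsampled semantic features
fused = fast_normalized_fusion([shallow, deep], weights=[1.0, 1.0])
print(fused.mean())
```

Compared with a softmax over the weights, this normalization avoids the exponential while still letting the network learn how much each input level contributes.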
14 pages, 1117 KB  
Article
MS-PANet: Multi-Scale Spatial Pyramid Attention for Effective Drainage Pipeline Image Dehazing
by Ce Li, Xinyi Duan, Zhongbo Jiang, Yijing Ding, Quanzhi Li, Zhengyan Tang and Feng Yang
J. Imaging 2026, 12(5), 189; https://doi.org/10.3390/jimaging12050189 - 27 Apr 2026
Viewed by 136
Abstract
Urban drainage pipelines are crucial for flood control, drainage, and environmental quality. However, fog within pipelines degrades image quality, hindering the identification of damage features such as cracks and leaks. Existing dehazing algorithms struggle with the unique challenges presented by drainage pipelines, such as their cylindrical structure, non-uniform lighting, and multi-scale particulate interference, leading to inadequate feature extraction and weak cross-channel dependency modeling. To address these issues, we propose a novel drainage pipeline image dehazing network based on a pyramid attention mechanism. Specifically, our proposed method incorporates a custom-designed multi-scale spatial pyramid attention (MSPA) module, which combines hierarchical pyramid convolution and spatial pyramid recalibration modules. This enables the dynamic adjustment of multi-scale feature weights and the effective modeling of cross-channel long-range dependencies. Extensive experiments demonstrate that our network achieves superior dehazing performance across diverse underground environments, particularly on a synthetic foggy dataset under real pipeline conditions, outperforming state-of-the-art dehazing algorithms. The proposed approach provides a reliable solution for high-precision visual inspection in complex pipeline scenarios.
28 pages, 12735 KB  
Article
FMW-YOLO: A Frequency-Enhanced and Multi-Scale Context-Aware Framework for PCB Defect Detection
by Yuguo Li, Shuo Tian, Wenzheng Sun, Longfa Chen, Jian Li, Junkai Hu and Na Meng
Micromachines 2026, 17(5), 531; https://doi.org/10.3390/mi17050531 - 27 Apr 2026
Viewed by 186
Abstract
High-precision, efficient surface defect detection for printed circuit boards (PCBs) is critical to ensuring the reliability of electronic systems. However, the presence of complex circuit backgrounds and the small scale of defects often limit the precision and effectiveness of conventional inspection approaches. To address these challenges, this paper proposes FMW-YOLO, a lightweight and accurate detection framework based on YOLO11n. Specifically, a Frequency-Enhanced Channel-Transposed and Local Feature backbone network is developed to improve feature extraction. By designing a Dual-Frequency and Channel Attention Aggregation module and a Lightweight Edge-Gaussian Block, the original C3k2 structure is refined to suppress noise interference while preserving high-frequency details, thereby enhancing feature representation. Furthermore, a neck network incorporating a Multi-Scale Context-Aware Enhancement mechanism is constructed, in which an Attention-Integrated Feature Pyramid is employed to facilitate more effective cross-scale feature interaction. In addition, a Dilated Reparam Residual Module is embedded into the C3k2 structure to expand the receptive field without significantly increasing computational burden. Finally, Wise-IoU is adopted to optimize bounding box regression by assigning greater importance to anchors of moderate quality. Extensive experiments conducted on the HRIPCB and DeepPCB datasets demonstrate that FMW-YOLO improves mAP50 by 2.1% and 0.3%, respectively, while reducing the number of parameters by 23%. These results indicate that the proposed method achieves improved detection accuracy and demonstrates strong potential for practical industrial applications.
(This article belongs to the Topic AI Sensors and Transducers)
50 pages, 17736 KB  
Article
Swin–YOLOv12: A Hybrid Transformer-Based Deep Learning Approach for Enhanced Real-Time Brain Tumor Detection in MRI Images
by Mubashar Tariq and Kiho Choi
Mathematics 2026, 14(9), 1447; https://doi.org/10.3390/math14091447 - 25 Apr 2026
Viewed by 234
Abstract
Brain tumors (BTs) arise from the abnormal growth of cells within brain tissue and may spread rapidly, making them a major cause of mortality worldwide. Early detection of BTs remains highly challenging due to the brain’s complex structure and the heterogeneous nature of tumors. Magnetic Resonance Imaging (MRI) provides detailed information about tumor size, location, and shape, thereby supporting clinical decision-making for treatments such as chemotherapy, radiation therapy, and surgery. Traditional machine learning (ML) approaches mainly rely on manual feature extraction, whereas recent advances in Computer-Aided Diagnosis (CAD) and deep learning (DL) have enabled more accurate detection of small and complex tumor regions. To improve automated tumor detection, we propose a hybrid Swin–YOLO framework that combines the Swin Transformer (ST) with the latest CNN-based YOLOv12 model. In this framework, the Swin Transformer serves as the main backbone for feature extraction, while the Feature Pyramid Network (FPN) and Path Aggregation Network (PANet) are employed in the neck to better capture multi-scale features. For training, we used the publicly available Br35H dataset and applied data augmentation to enhance the model’s robustness and generalization capability. The experimental results show that the proposed framework achieved 99.7% accuracy, 99.4% mAP@50, and 87.2% mAP@50:95. Furthermore, we incorporated Explainable Artificial Intelligence (XAI) techniques, including Grad-CAM and SHAP, to improve the interpretability of the model by visually highlighting the tumor regions that contributed most to the prediction. In addition, we developed NeuroVision AI, a web-based application designed to support faster and more accurate clinical decision-making. Although the proposed model demonstrated strong performance on the dataset, these results should be interpreted within the context of the current experimental setting. Full article
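The FPN plus PANet neck mentioned in the abstract follows a well-known two-pass pattern: a top-down pass that pushes semantics to high-resolution levels, then a bottom-up pass that sends localization detail back to coarse levels. The sketch below shows only that data flow, under strong simplifying assumptions: all levels share one channel count, nearest-neighbor repetition stands in for upsampling, strided slicing stands in for strided convolution, and the lateral and smoothing convolutions of a real neck are omitted. The names `c3`, `c4`, `c5`, and `fpn_pan` are illustrative and not taken from the paper.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbor 2x upsampling over the last two (H, W) axes."""
    return x.repeat(2, axis=-2).repeat(2, axis=-1)

def downsample2x(x):
    """Stride-2 subsampling, a stand-in for a strided convolution."""
    return x[..., ::2, ::2]

def fpn_pan(c3, c4, c5):
    # Top-down pass (FPN): coarse semantics flow to finer levels.
    p5 = c5
    p4 = c4 + upsample2x(p5)
    p3 = c3 + upsample2x(p4)
    # Bottom-up pass (PANet): fine detail flows back to coarser levels.
    n3 = p3
    n4 = p4 + downsample2x(n3)
    n5 = p5 + downsample2x(n4)
    return n3, n4, n5

c3 = np.ones((16, 32, 32))  # stride-8 backbone features (assumed shapes)
c4 = np.ones((16, 16, 16))  # stride-16
c5 = np.ones((16, 8, 8))    # stride-32
for level in fpn_pan(c3, c4, c5):
    print(level.shape)
```

Each output level keeps its input resolution, so three detection heads can be attached at strides 8, 16, and 32.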
22 pages, 14714 KB  
Article
TGL-YOLO: A Multi-Scale Feature Enhancement Method for Plant Disease Detection Based on Improved YOLO11
by Qi Wang and Zhiyu Wang
Agriculture 2026, 16(9), 947; https://doi.org/10.3390/agriculture16090947 - 25 Apr 2026
Viewed by 668
Abstract
Plant disease detection in natural environments is significantly challenged by variations in lesion scales and interference from complicated background clutter. Nevertheless, current models often remain limited in effectively capturing multi-scale features and mitigating background interference simultaneously. To tackle these challenges, we present TGL-YOLO, an improved detection network built on the YOLO11 framework. Methodologically, we introduce the Tri-Scale Dynamic Block (TSDBlock) to adaptively extract fine-grained features across highly variable lesion sizes. Furthermore, a Gated Pyramid Spatial Transformer (GPST) is designed to fuse cross-scale features and suppress background interference, while a Large Separable Pyramid Attention (LSPA) module expands the spatial receptive field to capture global context. Experimental results on two public datasets show that TGL-YOLO demonstrates improved performance over the YOLO11s baseline. On the PlantDoc dataset, it improves mAP50 and mAP50:95 by 4.7% and 3.7%, reaching 0.591 and 0.449, respectively. On the FieldPlant dataset, it reaches 0.793 and 0.608, yielding improvements of 2.3% and 1.9%. The proposed method demonstrates the capability to reduce missed detections and false positives caused by multi-scale lesions and environmental noise, providing a competitive and computationally viable solution for agricultural disease monitoring in natural environments. Full article
29 pages, 4546 KB  
Article
Beyond Scale Variability: Dynamic Cross-Scale Modeling and Efficient Sparse Heads for Wind Turbine Blade Defect Detection
by Xingxing Fan, Manxiang Gao, Yong Wang, Haining Tang, Fengyong Sun and Changpo Song
Processes 2026, 14(9), 1367; https://doi.org/10.3390/pr14091367 - 24 Apr 2026
Viewed by 120
Abstract
Images of wind turbine blades captured by drones often feature complex backgrounds, and small targets such as minor defects, together with low image resolution, lead to reduced recognition rates. To address these complex backgrounds, this paper proposes the PPS-MSDeim model. Based on the lightweight end-to-end detection framework DEIM-N, it introduces three core innovations to tackle the challenge of detecting small, irregular defects on wind turbine blades against complex backgrounds. First, we design an inverted multi-scale depthwise separable convolution module (MDSC). After compressing channels via a bottleneck layer, it applies 3 × 3, 5 × 5, and 7 × 7 inverted depthwise separable convolutions in parallel. By first fusing channel information and then extracting spatial features at multiple receptive fields, this approach enhances the ability to characterize morphologically variable defects while reducing computational overhead. The MDSC is then embedded into the backbone network HGNetv2. Second, we construct a Multi-Scale Feature Aggregation and Diffusion Pyramid Network (MFADPN). Through a Multi-Scale Feature Aggregation Module (MSFAM), it directly fuses features from layers P2 to P5, achieving deep integration of high-level semantics and low-level details. Dilated convolutions with dilation rates of 1, 3, and 5 capture multi-level context, and a Sobel edge branch is introduced to enhance defect contours. A feature diffusion operation then distributes the enhanced features back to each level, shortening information paths and preventing signal decay. In addition, a high-resolution detection head is added at P2 and the P5 head is removed to improve sensitivity to small objects. Finally, we propose the PPSformer module to replace the original Transformer encoding layer. It uses patch embedding to convert images into sequences and introduces a multi-head probabilistic sparse self-attention mechanism that focuses only on key-value pairs during attention computation. This design efficiently captures irregularly varying feature information and globally detects data anomalies induced by external defects. Experiments on real engineering datasets show that PPS-MSDeim, which is built on DEIM, increases mAP@0.5 by 6.7% to 95.1% and mAP@0.5–0.95 by 12.0% to 70.1%. This indicates that the proposed method has a significant advantage in detecting defects in wind turbine blades.
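To see why dilation rates of 1, 3, and 5 yield multi-level context, it helps to compute the spatial extent each branch covers. A dilated kernel of size k with rate d spans k + (k - 1)(d - 1) pixels per axis. The abstract does not state the kernel size, so the 3 × 3 kernel below is an assumption, and the helper name is hypothetical.

```python
def effective_kernel(k, d):
    """Spatial extent (per axis) of a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

# Assumed 3 x 3 kernels; the rates 1, 3, 5 are the ones named in the abstract.
for rate in (1, 3, 5):
    size = effective_kernel(3, rate)
    print(f"dilation {rate}: covers {size} x {size} pixels with 9 weights")
```

Under that assumption the three parallel branches see 3 × 3, 7 × 7, and 11 × 11 neighborhoods while each still uses only nine weights per channel, which is how dilation widens context without the cost of larger dense kernels.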
23 pages, 8014 KB  
Article
MSW-Mamba-Det: Multi-Scale Windowed State-Space Modeling for End-to-End Defect Detection in Photovoltaic Module Electroluminescence Images
by Xiaofeng Wang, Haojie Hu, Xiao Hao and Weiguang Ma
Sensors 2026, 26(9), 2616; https://doi.org/10.3390/s26092616 - 23 Apr 2026
Viewed by 537
Abstract
Electroluminescence (EL) imaging is widely used for photovoltaic (PV) module inspection, yet EL defect detection remains challenging due to the need for high-resolution inputs, low-contrast defects, and strong structured background patterns. To address these issues, we propose MSW-Mamba-Det, an end-to-end defect detection framework built on RT-DETR, comprising three components. (1) MSW-Mamba, a multi-scale windowed state-space module, adopts a Local/Stripe/Grid architecture to jointly model fine details and long-range dependencies; the Stripe branch strengthens directional continuity for elongated defects, while the Grid branch introduces coarse global context to improve cross-region consistency. Saliency- and gradient-guided gating is further used to suppress background-induced false responses. (2) DetailAware compensates for detail attenuation by restoring high-frequency textures and edges through multi-scale local enhancement, and applies pixel-wise adaptive gating to integrate global semantics and mitigate smoothing effects in deep representations. (3) PAFB (Pyramid Attention Fusion Block) aligns adjacent-scale features and improves multi-scale fusion, enhancing localization stability across defect sizes. Experiments on two public EL datasets show that MSW-Mamba-Det achieves AP50:95 of 60.4% on PV-Multi-Defect-main and 68.0% on PVEL-AD, improving over RT-DETR by 2.5 points (from 57.9% to 60.4%) and 2.2 points (from 65.8% to 68.0%), respectively. MSW-Mamba-Det also outperforms 12 representative baselines, including CNN-, Transformer-, and recent YOLO-based models, in AP50:95 on both datasets, with particularly strong performance on medium and large defects. These results demonstrate the effectiveness of the proposed modules for robust PV EL defect inspection under low-contrast and structured-background conditions. Full article
(This article belongs to the Section Sensing and Imaging)
26 pages, 8883 KB  
Article
Strip Steel Defect Detection Algorithm Integrating Dynamic Convolution and Attention
by Changchun Shao, Zhijie Chen and Jianjun Meng
Electronics 2026, 15(9), 1796; https://doi.org/10.3390/electronics15091796 - 23 Apr 2026
Viewed by 132
Abstract
To address the issues of low accuracy, high false positives, and missed detections in hot-rolled strip steel surface defect inspection, this paper proposes an improved detection model named DFEM-NET based on YOLOv8n. First, an efficient feature extraction module (DSC2f) based on Dynamic Snake Convolution is designed to enhance the model’s capability in capturing features of irregular and elongated defects. Second, a Feature Pyramid Shared Convolution module (FPSC) is constructed to expand the model’s receptive field and effectively suppress interference from complex backgrounds. Third, an Enhanced Feature Correction (EFC) strategy is adopted during the feature fusion stage to help the model better learn the detailed features of small defect targets. Finally, a Multi-Scale Attention Aggregation module (MSAA) is introduced before the detection head, enabling the network to focus on critical feature information and thereby comprehensively improve detection accuracy for target defects. Experimental results demonstrate that, compared to the baseline model YOLOv8n, DFEM-NET achieves a detection accuracy (mAP@0.5) of 83.5%, representing an increase of 4.8%; a recall rate of 76.4%, an increase of 3.3%; and a precision of 84.7%, an increase of 3.1%, without a significant increase in model complexity. Furthermore, generalization experiments conducted on the GC10-DET dataset confirm that the proposed algorithm exhibits exceptional generalization capability. Full article