Search Results (166)

Search Parameters:
Keywords = adaptively downsampling

25 pages, 4606 KB  
Article
Denoising and Simplification of 3D Scan Data of Damaged Aero-Engine Blades for Accurate and Efficient Rigid and Non-Rigid Registration
by Hamid Ghorbani and Farbod Khameneifar
Sensors 2025, 25(19), 6148; https://doi.org/10.3390/s25196148 - 4 Oct 2025
Abstract
Point cloud processing of raw scan data is a critical step to enhance the accuracy and efficiency in computer-aided inspection and remanufacturing of damaged aero-engine blades. This paper presents a new methodology to obtain a noise-reduced and simplified dataset from the raw scan data while preserving the underlying geometry of the damaged blade in high-curvature and damaged regions. At first, outliers are removed from the scan data, and measurement noise is reduced through local least-squares quadric surface/plane fitting on the adaptive support domain of measured points under the measurement uncertainty constraint of inspection data. Then, a directed Hausdorff distance-based region growing scheme is developed to progressively search within the support domain of denoised data points to obtain a down-sampled dataset while preserving the local geometric shape of the surface. Numerical and experimental case studies have been conducted to evaluate the accuracy and computation time of scan-to-CAD rigid registration and CAD-to-scan non-rigid registration processes using the down-sampled dataset of damaged blades. The results have demonstrated that the proposed methodology effectively removes the measurement noise and outliers and provides a down-sampled dataset from the scan data that can significantly reduce the time complexity of the computer-aided inspection and remanufacturing process of the point cloud of damaged blades with a negligible loss of accuracy. Full article
(This article belongs to the Special Issue Short-Range Optical 3D Scanning and 3D Data Processing)
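
The region-growing simplification above is driven by the directed Hausdorff distance, a standard point-set measure. The following NumPy sketch illustrates that measure and the kind of acceptance test it enables; it is not the authors' implementation, and the tolerance mentioned in the comment is hypothetical.

```python
import numpy as np

def directed_hausdorff(A, B):
    """Directed Hausdorff distance h(A, B) = max_{a in A} min_{b in B} ||a - b||.

    A, B: (N, 3) and (M, 3) arrays of 3D scan points. Intended for small
    local point sets -- the pairwise distance matrix is O(N * M) in memory.
    """
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).max()

# Illustrative acceptance test for region growing (tolerance is hypothetical):
# a grown support region is only collapsed into the simplified dataset while
# directed_hausdorff(region_points, simplified_points) stays below a tolerance.
```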

37 pages, 10380 KB  
Article
FEWheat-YOLO: A Lightweight Improved Algorithm for Wheat Spike Detection
by Hongxin Wu, Weimo Wu, Yufen Huang, Shaohua Liu, Yanlong Liu, Nannan Zhang, Xiao Zhang and Jie Chen
Plants 2025, 14(19), 3058; https://doi.org/10.3390/plants14193058 - 3 Oct 2025
Abstract
Accurate detection and counting of wheat spikes are crucial for yield estimation and variety selection in precision agriculture. However, challenges such as complex field environments, morphological variations, and small target sizes hinder the performance of existing models in real-world applications. This study proposes FEWheat-YOLO, a lightweight and efficient detection framework optimized for deployment on agricultural edge devices. The architecture integrates four key modules: (1) FEMANet, a mixed aggregation feature enhancement network with Efficient Multi-scale Attention (EMA) for improved small-target representation; (2) BiAFA-FPN, a bidirectional asymmetric feature pyramid network for efficient multi-scale feature fusion; (3) ADown, an adaptive downsampling module that preserves structural details during resolution reduction; and (4) GSCDHead, a grouped shared convolution detection head for reduced parameters and computational cost. Evaluated on a hybrid dataset combining GWHD2021 and a self-collected field dataset, FEWheat-YOLO achieved a COCO-style AP of 51.11%, AP@50 of 89.8%, and AP scores of 18.1%, 50.5%, and 61.2% for small, medium, and large targets, respectively, with an average recall (AR) of 58.1%. In wheat spike counting tasks, the model achieved an R2 of 0.941, MAE of 3.46, and RMSE of 6.25, demonstrating high counting accuracy and robustness. The proposed model requires only 0.67 M parameters, 5.3 GFLOPs, and 1.6 MB of storage, while achieving an inference speed of 54 FPS. Compared to YOLOv11n, FEWheat-YOLO improved AP@50, AP_s, AP_m, AP_l, and AR by 0.53%, 0.7%, 0.7%, 0.4%, and 0.3%, respectively, while reducing parameters by 74%, computation by 15.9%, and model size by 69.2%. These results indicate that FEWheat-YOLO provides an effective balance between detection accuracy, counting performance, and model efficiency, offering strong potential for real-time agricultural applications on resource-limited platforms. Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence for Plant Research)
19 pages, 5891 KB  
Article
MS-YOLOv11: A Wavelet-Enhanced Multi-Scale Network for Small Object Detection in Remote Sensing Images
by Haitao Liu, Xiuqian Li, Lifen Wang, Yunxiang Zhang, Zitao Wang and Qiuyi Lu
Sensors 2025, 25(19), 6008; https://doi.org/10.3390/s25196008 - 29 Sep 2025
Abstract
In remote sensing imagery, objects smaller than 32×32 pixels suffer from three persistent challenges that existing detectors inadequately resolve: (1) their weak signal is easily submerged in background clutter, causing high miss rates; (2) the scarcity of valid pixels yields few geometric or textural cues, hindering discriminative feature extraction; and (3) successive down-sampling irreversibly discards high-frequency details, while multi-scale pyramids still fail to compensate. To counteract these issues, we propose MS-YOLOv11, an enhanced YOLOv11 variant that integrates “frequency-domain detail preservation, lightweight receptive-field expansion, and adaptive cross-scale fusion.” Specifically, a 2D Haar wavelet first decomposes the image into multiple frequency sub-bands to explicitly isolate and retain high-frequency edges and textures while suppressing noise. Each sub-band is then processed independently by small-kernel depthwise convolutions that enlarge the receptive field without over-smoothing. Finally, the Mix Structure Block (MSB) employs the MSPLCK module to perform densely sampled multi-scale atrous convolutions for rich context of diminutive objects, followed by the EPA module that adaptively fuses and re-weights features via residual connections to suppress background interference. Extensive experiments on DOTA and DIOR demonstrate that MS-YOLOv11 surpasses the baseline in mAP@50, mAP@95, parameter efficiency, and inference speed, validating its targeted efficacy for small-object detection. Full article
(This article belongs to the Section Remote Sensors)
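
The frequency-domain step above rests on a single-level 2D Haar decomposition, which splits an image into one low-frequency band and three half-resolution detail bands. A minimal NumPy sketch of that transform is given below for illustration; the paper applies the decomposition inside the network rather than as standalone preprocessing, and detail-band naming conventions vary across libraries.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar decomposition of an (H, W) image with even H, W.

    Returns four (H/2, W/2) sub-bands: LL (low-frequency approximation)
    plus three high-frequency detail bands.
    """
    a = x[0::2, 0::2].astype(np.float64)  # top-left of each 2x2 block
    b = x[0::2, 1::2].astype(np.float64)  # top-right
    c = x[1::2, 0::2].astype(np.float64)  # bottom-left
    d = x[1::2, 1::2].astype(np.float64)  # bottom-right
    ll = (a + b + c + d) / 2.0  # smooth content
    lh = (a - b + c - d) / 2.0  # detail along the horizontal axis
    hl = (a + b - c - d) / 2.0  # detail along the vertical axis
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh
```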

26 pages, 35265 KB  
Article
Reconstruction Error Guided Instance Segmentation for Infrared Inspection of Power Distribution Equipment
by Jinbin Luo, Yi Sun, Jian Zhang and Bin Sun
Sensors 2025, 25(19), 6007; https://doi.org/10.3390/s25196007 - 29 Sep 2025
Abstract
The instance segmentation of power distribution equipment in infrared images is a prerequisite for determining its overheating fault, which is crucial for urban power grids. Due to the specific characteristics of power distribution equipment, the objects are generally characterized by small scale and complex structure. Existing methods typically use a backbone network to extract features from infrared images. However, the inherent down-sampling operations lead to information loss of objects. The content regions of small-scale objects are compressed, and the edge regions of complex-structured objects are fragmented. In this paper, (1) the first unmanned aerial vehicle-based infrared dataset PDI for power distribution inspection is constructed with 16,596 images, 126,570 instances, and 7 categories of power equipment. It has the advantages of large data volume, rich geographic scenarios, and diverse object patterns, as well as challenges of distribution imbalance, category imbalance, and scale imbalance of objects. (2) A reconstruction error (RE)-guided instance segmentation framework, coupled with an object reconstruction decoder (ORD) and a difference feature enhancement (DFE) module, is proposed. The former reconstructs the objects, where the reconstruction result indicates the position and degree of information loss of the objects. Therefore, the difference map between the reconstruction result and the input image effectively replays the object features. The latter adaptively compensates for object features by global fusion between the difference features and backbone features, thereby enhancing the spatial representation of objects. Extensive experiments on the constructed and publicly available datasets demonstrate the strong generalization, superiority, and versatility of the proposed framework. Full article
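
The core guidance signal in the framework above is the difference map between the input image and its reconstruction, which marks where down-sampling lost object information. The toy PyTorch sketch below illustrates that idea with a single additive compensation step; `DifferenceGuidance` and its 1×1 `gate_conv` are hypothetical stand-ins, not the ORD/DFE modules, which are learned networks with global fusion.

```python
import torch.nn as nn
import torch.nn.functional as F

class DifferenceGuidance(nn.Module):
    """Toy reconstruction-error guidance: compute |input - reconstruction|
    and use it to additively compensate backbone features."""

    def __init__(self, feat_channels, img_channels=1):
        super().__init__()
        # Hypothetical projection from the difference map to feature space.
        self.gate_conv = nn.Conv2d(img_channels, feat_channels, kernel_size=1)

    def forward(self, image, reconstruction, backbone_feat):
        diff = (image - reconstruction).abs()                      # error map
        diff = F.interpolate(diff, size=backbone_feat.shape[-2:],  # match feature resolution
                             mode="bilinear", align_corners=False)
        return backbone_feat + self.gate_conv(diff)                # compensated features
```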

26 pages, 5592 KB  
Article
AGRI-YOLO: A Lightweight Model for Corn Weed Detection with Enhanced YOLO v11n
by Gaohui Peng, Kenan Wang, Jianqin Ma, Bifeng Cui and Dawei Wang
Agriculture 2025, 15(18), 1971; https://doi.org/10.3390/agriculture15181971 - 18 Sep 2025
Abstract
Corn, as a globally significant food crop, suffers substantial yield reductions due to competitive growth from weeds. Precise detection and efficient control of weeds are critical technical components for ensuring high and stable corn yields. Traditional deep learning object detection models generally suffer from issues such as large parameter counts and high computational complexity, making them unsuitable for deployment on resource-constrained devices such as agricultural drones and portable detection devices. To address this, this paper proposes a lightweight corn weed detection model, AGRI-YOLO, based on the YOLO v11n architecture. First, the DWConv (Depthwise Separable Convolution) module from InceptionNeXt is introduced to reconstruct the C3k2 feature extraction module, enhancing the feature extraction capabilities for corn seedlings and weeds. Second, the ADown (Adaptive Downsampling) module replaces the Conv downsampling layer to address the issue of redundant model parameters, and the LADH (Lightweight Asymmetric Detection) head is adopted to achieve dynamic weight adjustment while ensuring multi-branch output optimization for target localization and classification precision. Experimental results show that the AGRI-YOLO model achieves a precision of 84.7%, a recall of 73.0%, and an mAP50 of 82.8%, largely on par with the baseline YOLO v11n, while the number of parameters, GFLOPs, and model size are reduced by 46.6%, 49.2%, and 42.31%, respectively. AGRI-YOLO thus significantly reduces model complexity while maintaining high recognition precision, providing technical support for deployment on resource-constrained edge devices and thereby promoting agricultural intelligence, maintaining ecological balance, and ensuring food security. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
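
The DWConv block named above factorizes a standard convolution into a per-channel (depthwise) convolution followed by a 1×1 pointwise convolution, which is where much of the parameter saving in the rebuilt C3k2 module comes from. A minimal PyTorch sketch of the generic operator follows (not the InceptionNeXt-specific variant).

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: k x k depthwise + 1 x 1 pointwise.

    Parameter count drops from roughly C_in*C_out*k*k (standard conv) to
    roughly C_in*k*k + C_in*C_out.
    """

    def __init__(self, c_in, c_out, k=3, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, k, stride=stride,
                                   padding=k // 2, groups=c_in, bias=False)
        self.pointwise = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```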

25 pages, 3276 KB  
Article
CPB-YOLOv8: An Enhanced Multi-Scale Traffic Sign Detector for Complex Road Environment
by Wei Zhao, Lanlan Li and Xin Gong
Information 2025, 16(9), 798; https://doi.org/10.3390/info16090798 - 15 Sep 2025
Abstract
Traffic sign detection is critically important for intelligent transportation systems, yet persistent challenges like multi-scale variation and complex background interference severely degrade detection accuracy and real-time performance. To address these limitations, this study presents CPB-YOLOv8, an advanced multi-scale detection framework based on the YOLOv8 architecture. A Cross-Stage Partial-Partitioned Transformer Block (CSP-PTB) is incorporated into the feature extraction stage to preserve semantic information during downsampling while enhancing global feature representation. For feature fusion, a four-level bidirectional feature pyramid BiFPN integrated with a P2 detection layer significantly improves small-target detection capability. Further enhancement is achieved via an optimized loss function that balances multi-scale objective localization. Comprehensive evaluations were conducted on the TT100K, the CCTSDB, and a custom multi-scenario road image dataset capturing urban and suburban environments at 1920 × 1080 resolution. Results demonstrate compelling performance: On TT100K, CPB-YOLOv8 achieved 90.73% mAP@0.5 with a 12.5 MB model size, exceeding the YOLOv8s baseline by 3.94 percentage points and achieving 6.43% higher small-target recall. On CCTSDB, it attained a near-saturation performance of 99.21% mAP@0.5. Crucially, the model demonstrated exceptional robustness across diverse environmental conditions. Rigorous analysis on partitioned CCTSDB subsets based on weather and illumination, alongside validation using a separate self-collected dataset reserved solely for inference, confirmed strong adaptability to real-world distribution shifts and low-visibility scenarios. Cross-dataset validation and visual comparisons further substantiated the model’s robustness and its effective suppression of background interference. Full article
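
The BiFPN used in the fusion stage above combines feature maps with learnable, fast-normalized weights rather than plain addition. The sketch below shows only that two-input weighted-fusion node, in the usual EfficientDet-style formulation; the resizing, convolutions, and P2 wiring of the full four-level pyramid are omitted.

```python
import torch
import torch.nn as nn

class FastNormalizedFusion(nn.Module):
    """BiFPN-style fusion of two same-shape feature maps:
    out = (w1*x1 + w2*x2) / (w1 + w2 + eps), with w_i >= 0 learned."""

    def __init__(self, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(2))
        self.eps = eps

    def forward(self, x1, x2):
        w = torch.relu(self.weights)      # keep fusion weights non-negative
        w = w / (w.sum() + self.eps)      # fast normalization
        return w[0] * x1 + w[1] * x2
```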

22 pages, 9649 KB  
Article
DTC-YOLO: Multimodal Object Detection via Depth-Texture Coupling and Dynamic Gating Optimization
by Wei Xu, Xiaodong Du, Ruochen Li and Lei Xing
Sensors 2025, 25(18), 5731; https://doi.org/10.3390/s25185731 - 14 Sep 2025
Abstract
To address the inherent limitations of single-modality sensors constrained by physical properties and data modalities, we propose DTC-YOLO (Depth-Texture Coupling Mechanism YOLO), a depth-texture coupled multimodal detection framework. The main contributions are as follows: RGB-LiDAR (RGB-Light Detection and Ranging) Fusion: We propose a depth-color mapping and weighted fusion strategy to effectively integrate depth and texture features. ADF3-Net (Adaptive Dimension-aware Focused Fusion Network): A feature fusion network with hierarchical perception, channel decoupling, and spatial adaptation. A dynamic gated fusion mechanism enables adaptive weighting across multidimensional features, thereby enhancing depth-texture representation. Adown Module: A dual-path adaptive downsampling module that separates high-frequency details from low-frequency semantics, reducing GFLOPs (Giga Floating-point Operations Per Second) by 10.53% while maintaining detection performance. DTC-YOLO achieves substantial improvements over the baseline: +3.50% mAP50, +3.40% mAP50-95, and +3.46% precision. Moreover, it maintains moderate improvements for medium-scale objects while significantly enhancing detection of extremely large and small objects, effectively mitigating the scale-related accuracy discrepancies of vision-only models in complex traffic environments. Full article
(This article belongs to the Section Sensing and Imaging)
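
The RGB-LiDAR step above first renders depth as an image and then blends it with the RGB frame through weighted fusion. The toy NumPy sketch below illustrates the idea with a grayscale depth rendering and a fixed blend weight `alpha`; the actual depth-color mapping and the learned fusion weights in DTC-YOLO are more elaborate.

```python
import numpy as np

def fuse_rgb_depth(rgb, depth, alpha=0.6):
    """Blend an RGB image (H, W, 3, uint8) with a depth map (H, W).

    Depth is min-max normalized and replicated to three channels; `alpha`
    is a hypothetical fixed weight standing in for the learned fusion.
    """
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-6)           # normalize to [0, 1]
    depth_img = np.repeat(d[..., None], 3, axis=-1) * 255.0  # pseudo depth image
    fused = alpha * rgb.astype(np.float32) + (1.0 - alpha) * depth_img
    return fused.clip(0, 255).astype(np.uint8)
```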

19 pages, 4802 KB  
Article
Enhanced SOLOv2: An Effective Instance Segmentation Algorithm for Densely Overlapping Silkworms
by Jianying Yuan, Hao Li, Chen Cheng, Zugui Liu, Sidong Wu and Dequan Guo
Sensors 2025, 25(18), 5703; https://doi.org/10.3390/s25185703 - 12 Sep 2025
Abstract
Silkworm instance segmentation is crucial for individual silkworm behavior analysis and health monitoring in intelligent sericulture, as the segmentation accuracy directly influences the reliability of subsequent biological parameter estimation. In real farming environments, silkworms often exhibit high density and severe mutual occlusion, posing significant challenges for traditional instance segmentation algorithms. To address these issues, this paper proposes an enhanced SOLOv2 algorithm. Specifically, (1) in the backbone network, Linear Deformable Convolution (LDC) is incorporated to strengthen the geometric feature modeling of curved silkworms. A Haar Wavelet Downsampling (HWD) module is designed to better preserve details for partial visible targets, and an Edge-Augmented Multi-attention Fusion Network (EAMF-Net) is constructed to improve boundary discrimination in overlapping regions. (2) In the mask branch, Dynamic Upsampling (Dysample), Adaptive Spatial Feature Fusion (ASFF), and Simple Attention Module (SimAM) are integrated to refine the quality of segmentation masks. Experiments conducted on a self-built high-density silkworm dataset demonstrate that the proposed method achieves an Average Precision (AP) of 85.1%, with significant improvements over the baseline model in small- (APs: +10.2%), medium- (APm: +4.0%), and large-target (APl: +2.0%) segmentation accuracy. This effectively advances precision in dense silkworm segmentation scenarios. Full article
(This article belongs to the Special Issue Vision Sensors for Object Detection and Tracking)

25 pages, 69171 KB  
Article
CrackNet-Weather: An Effective Pavement Crack Detection Method Under Adverse Weather Conditions
by Wei Wang, Xiaoru Yu, Bin Jing, Ziqi Tang, Wei Zhang, Shengyu Wang, Yao Xiao, Shu Li and Liping Yang
Sensors 2025, 25(17), 5587; https://doi.org/10.3390/s25175587 - 7 Sep 2025
Abstract
Accurate pavement crack detection under adverse weather conditions is essential for road safety and effective pavement maintenance. However, factors such as reduced visibility, background noise, and irregular crack morphology make this task particularly challenging in real-world environments. To address these challenges, we propose CrackNet-Weather, which is a robust and efficient detection method that systematically incorporates three key modules: a Haar Wavelet Downsampling Block (HWDB) for enhanced frequency information preservation, a Strip Pooling Bottleneck Block (SPBB) for multi-scale and context-aware feature fusion, and a Dynamic Sampling Upsampling Block (DSUB) for content-adaptive spatial feature reconstruction. Extensive experiments conducted on a challenging dataset containing both rainy and snowy weather demonstrate that CrackNet-Weather significantly outperforms mainstream baseline models, achieving notable improvements in mean Average Precision, especially for low-contrast, fine, and irregular cracks. Furthermore, our method maintains a favorable balance between detection accuracy and computational complexity, making it well suited for practical road inspection and large-scale deployment. These results confirm the effectiveness and practicality of CrackNet-Weather in addressing the challenges of real-world pavement crack detection under adverse weather conditions. Full article
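
Strip pooling, the operation inside the SPBB above, captures long, thin structures such as cracks by averaging along entire rows and columns instead of square windows. A minimal PyTorch sketch of the basic operation follows; the SPBB itself wraps it in a bottleneck with additional convolutions, which are not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StripPooling(nn.Module):
    """Pool features along full rows and full columns, then broadcast the
    two strip descriptors back to the original resolution and fuse them."""

    def __init__(self, channels):
        super().__init__()
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (H, 1): average over width
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (1, W): average over height
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        xh = F.interpolate(self.pool_h(x), size=(h, w), mode="bilinear", align_corners=False)
        xw = F.interpolate(self.pool_w(x), size=(h, w), mode="bilinear", align_corners=False)
        return x * torch.sigmoid(self.fuse(xh + xw))   # strip-attention re-weighting
```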

19 pages, 11410 KB  
Article
A Pool Drowning Detection Model Based on Improved YOLO
by Wenhui Zhang, Lu Chen and Jianchun Shi
Sensors 2025, 25(17), 5552; https://doi.org/10.3390/s25175552 - 5 Sep 2025
Abstract
Drowning constitutes the leading cause of injury-related fatalities among adolescents. In swimming pool environments, traditional manual surveillance exhibits limitations, while existing technologies suffer from poor adaptability of wearable devices. Vision models based on YOLO still face challenges in edge deployment efficiency, robustness in complex water conditions, and multi-scale object detection. To address these issues, we propose YOLO11-LiB, a drowning object detection model based on YOLO11n, featuring three key enhancements. First, we design the Lightweight Feature Extraction Module (LGCBlock), which integrates the Lightweight Attention Encoding Block (LAE) and effectively combines Ghost Convolution (GhostConv) with dynamic convolution (DynamicConv). This optimizes the downsampling structure and the C3k2 module in the YOLO11n backbone network, significantly reducing model parameters and computational complexity. Second, we introduce the Cross-Channel Position-aware Spatial Attention Inverted Residual with Spatial–Channel Separate Attention module (C2PSAiSCSA) into the backbone. This module embeds the Spatial–Channel Separate Attention (SCSA) mechanism within the Inverted Residual Mobile Block (iRMB) framework, enabling more comprehensive and efficient feature extraction. Finally, we redesign the neck structure as the Bidirectional Feature Fusion Network (BiFF-Net), which integrates the Bidirectional Feature Pyramid Network (BiFPN) and Frequency-Aware Feature Fusion (FreqFusion). The enhanced YOLO11-LiB model was validated against mainstream algorithms through comparative experiments, and ablation studies were conducted. Experimental results demonstrate that YOLO11-LiB achieves a drowning class mean average precision (DmAP50) of 94.1%, with merely 2.02 M parameters and a model size of 4.25 MB. This represents an effective balance between accuracy and efficiency, providing a high-performance solution for real-time drowning detection in swimming pool scenarios. Full article
(This article belongs to the Section Intelligent Sensors)
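
The GhostConv named above produces half of its output channels with an ordinary convolution and derives the other half from those primary channels with a cheap depthwise convolution. A hedged PyTorch sketch of the generic GhostConv idea follows; it is not the LGCBlock itself, which additionally combines LAE and DynamicConv.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution: produce c_out channels by computing only c_out/2
    'primary' channels and deriving the rest with a 5x5 depthwise conv."""

    def __init__(self, c_in, c_out, k=1, stride=1):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, stride, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```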

18 pages, 2778 KB  
Article
YOLO-MARS for Infrared Target Detection: Towards near Space
by Bohan Liu, Yeteng Han, Pengxi Liu, Sha Luo, Jie Li, Tao Zhang and Wennan Cui
Sensors 2025, 25(17), 5538; https://doi.org/10.3390/s25175538 - 5 Sep 2025
Abstract
To address problems such as large target scale variations, strong background noise, and blurred features caused by low contrast in infrared target detection in near-space environments, this paper proposes an efficient detection model, YOLO-MARS, based on YOLOv8. The model introduces a Space-to-Depth (SPD) convolution module into the backbone, which retains the detailed features of smaller targets by downsampling without information loss, alleviating the feature loss caused by traditional downsampling. A Grouped Multi-Head Self-Attention (GMHSA) module is added after the backbone's SPPF module to improve cross-scale global modeling of target-area feature responses while suppressing complex thermal-noise background interference. In addition, a Light Adaptive Spatial Feature Fusion (LASFF) detection head is designed to mitigate the scale sensitivity of infrared targets (especially smaller targets) in the feature pyramid; it uses a shared weighting mechanism to achieve adaptive fusion of multi-scale features, reducing computational complexity while improving target localization and classification accuracy. To address the extreme scarcity of near-space data, we integrated 284 near-space images with the HIT-UAV dataset through physical equivalence analysis (atmospheric transmittance, contrast, and signal-to-noise ratio) to construct the NS-HIT dataset. Experimental results show that, compared with YOLOv8, YOLO-MARS increases mAP@0.5 by 5.4% while increasing the number of parameters by only 10%. YOLO-MARS thus improves detection accuracy significantly while keeping model complexity in check, providing an efficient and reliable solution for near-space infrared target detection. Full article
(This article belongs to the Section Sensing and Imaging)
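
The SPD module above replaces strided convolution with a lossless rearrangement: each 2×2 spatial block is moved into the channel dimension, halving resolution while quadrupling channels, so no pixels are discarded before the next convolution. A minimal PyTorch sketch follows; the trailing 1×1 projection is a simplification of the non-strided convolution typically placed after space-to-depth.

```python
import torch.nn as nn
import torch.nn.functional as F

class SPDConv(nn.Module):
    """Space-to-depth downsampling: (B, C, H, W) -> (B, 4C, H/2, W/2)
    via pixel_unshuffle, then a 1x1 conv to mix the stacked channels."""

    def __init__(self, c_in, c_out):
        super().__init__()
        self.proj = nn.Conv2d(4 * c_in, c_out, kernel_size=1, bias=False)

    def forward(self, x):
        x = F.pixel_unshuffle(x, downscale_factor=2)  # lossless 2x downsampling
        return self.proj(x)
```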

23 pages, 1476 KB  
Article
Dynamically Optimized Object Detection Algorithms for Aviation Safety
by Yi Qu, Cheng Wang, Yilei Xiao, Haijuan Ju and Jing Wu
Electronics 2025, 14(17), 3536; https://doi.org/10.3390/electronics14173536 - 4 Sep 2025
Abstract
Infrared imaging technology demonstrates significant advantages in aviation safety monitoring due to its exceptional all-weather operational capability and anti-interference characteristics, particularly in scenarios requiring real-time detection of aerial objects such as airport airspace management. However, traditional infrared target detection algorithms face critical challenges in complex sky backgrounds, including low signal-to-noise ratio (SNR), small target dimensions, and strong background clutter, leading to insufficient detection accuracy and reliability. To address these issues, this paper proposes the AFK-YOLO model based on the YOLO11 framework: it integrates an ADown downsampling module, which utilizes a dual-branch strategy combining average pooling and max pooling to effectively minimize feature information loss during spatial resolution reduction; introduces the KernelWarehouse dynamic convolution approach, which adopts kernel partitioning and a contrastive attention-based cross-layer shared kernel repository to address the challenge of linear parameter growth in conventional dynamic convolution methods; and establishes a feature decoupling pyramid network (FDPN) that replaces static feature pyramids with a dynamic multi-scale fusion architecture, utilizing parallel multi-granularity convolutions and an EMA attention mechanism to achieve adaptive feature enhancement. Experiments demonstrate that the AFK-YOLO model achieves 78.6% mAP on a self-constructed aerial infrared dataset—a 2.4 percentage point improvement over the baseline YOLO11—while meeting real-time requirements for aviation safety monitoring (416.7 FPS), reducing parameters by 6.9%, and compressing weight size by 21.8%. The results demonstrate the effectiveness of dynamic optimization methods in improving the accuracy and robustness of infrared target detection under complex aerial environments, thereby providing reliable technical support for the prevention of mid-air collisions. Full article
(This article belongs to the Special Issue Computer Vision and AI Algorithms for Diverse Scenarios)
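
The ADown module named above downsamples through two parallel branches, one built around average pooling and one around max pooling, so that both smooth context and salient peaks survive the resolution reduction. The PyTorch sketch below follows the widely used YOLOv9-style ADown layout as an illustration; kernel sizes and the channel split may differ from the paper's exact variant.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_act(c_in, c_out, k, s, p):
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p, bias=False),
                         nn.BatchNorm2d(c_out), nn.SiLU())

class ADown(nn.Module):
    """Dual-branch downsampling: half the channels go through
    avg-pool + strided 3x3 conv, the other half through max-pool + 1x1 conv."""

    def __init__(self, c_in, c_out):
        super().__init__()
        c_half = c_out // 2
        self.branch_avg = conv_bn_act(c_in // 2, c_half, 3, 2, 1)
        self.branch_max = conv_bn_act(c_in // 2, c_half, 1, 1, 0)

    def forward(self, x):
        x = F.avg_pool2d(x, 2, 1, 0)                  # light 2x2 pre-smoothing, stride 1
        x1, x2 = x.chunk(2, dim=1)                    # split channels across branches
        y1 = self.branch_avg(x1)                      # stride-2 conv branch
        y2 = self.branch_max(F.max_pool2d(x2, 3, 2, 1))  # max-pool branch
        return torch.cat([y1, y2], dim=1)
```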

18 pages, 3670 KB  
Article
Photovoltaic Cell Surface Defect Detection via Subtle Defect Enhancement and Background Suppression
by Yange Sun, Guangxu Huang, Chenglong Xu, Huaping Guo and Yan Feng
Micromachines 2025, 16(9), 1003; https://doi.org/10.3390/mi16091003 - 30 Aug 2025
Abstract
As the core component of photovoltaic (PV) power generation systems, PV cells are susceptible to subtle surface defects, including thick lines, cracks, and finger interruptions, primarily caused by stress and material brittleness during the manufacturing process. These defects substantially degrade energy conversion efficiency by inducing both optical and electrical losses, yet existing detection methods struggle to precisely identify and localize them. In addition, the complexity of background noise and other factors further increases the challenge of detecting these subtle defects. To address these challenges, we propose a novel PV Cell Surface Defect Detector (PSDD) that extracts subtle defects both within the backbone network and during feature fusion. In particular, we propose a plug-and-play Subtle Feature Refinement Module (SFRM) that integrates into the backbone to enhance fine-grained feature representation by rearranging local spatial features into the channel dimension, mitigating the loss of detail caused by downsampling. SFRM further employs a general attention mechanism to adaptively enhance key channels associated with subtle defects, improving the representation of fine defect features. In addition, we propose a Background Noise Suppression Block (BNSB) as a key component of the feature aggregation stage, which employs a dual-path strategy to fuse multiscale features, reducing background interference and improving defect saliency. Specifically, the first path uses a Background-Aware Module (BAM) to adaptively suppress noise and emphasize relevant features, while the second path adopts a residual structure to retain the original input features and prevent the loss of critical details. Experiments show that PSDD outperforms other methods, achieving the highest mAP50 score of 93.6% on the PVEL-AD dataset. Full article
(This article belongs to the Special Issue Thin Film Photovoltaic and Photonic Based Materials and Devices)

24 pages, 17568 KB  
Article
Super-Resolved Pseudo Reference in Dual-Branch Embedding for Blind Ultra-High-Definition Image Quality Assessment
by Jiacheng Gu, Qingxu Meng, Songnan Zhao, Yifan Wang, Shaode Yu and Qiurui Sun
Electronics 2025, 14(17), 3447; https://doi.org/10.3390/electronics14173447 - 29 Aug 2025
Abstract
In the Ultra-High-Definition (UHD) domain, blind image quality assessment remains challenging due to the high dimensionality of UHD images, which exceeds the input capacity of deep learning networks. Motivated by the visual discrepancies observed between high- and low-quality images after down-sampling and Super-Resolution (SR) reconstruction, we propose a SUper-Resolved Pseudo References In Dual-branch Embedding (SURPRIDE) framework tailored for UHD image quality prediction. SURPRIDE employs one branch to capture intrinsic quality features from the original patch input and the other to encode comparative perceptual cues from the SR-reconstructed pseudo-reference. The fusion of the complementary representation, guided by a novel hybrid loss function, enhances the network’s ability to model both absolute and relational quality cues. Key components of the framework are optimized through extensive ablation studies. Experimental results demonstrate that the SURPRIDE framework achieves competitive performance on two UHD benchmarks (AIM 2024 Challenge, PLCC = 0.7755, SRCC = 0.8133, on the testing set; HRIQ, PLCC = 0.882, SRCC = 0.873). Meanwhile, its effectiveness is verified on high- and standard-definition image datasets across diverse resolutions. Future work may explore positional encoding, advanced representation learning, and adaptive multi-branch fusion to align model predictions with human perceptual judgment in real-world scenarios. Full article

28 pages, 14886 KB  
Article
Efficient Conditional Diffusion Model for SAR Despeckling
by Zhenyu Guo, Weidong Hu, Shichao Zheng, Binchao Zhang, Ming Zhou, Jincheng Peng, Zhiyu Yao and Minghao Feng
Remote Sens. 2025, 17(17), 2970; https://doi.org/10.3390/rs17172970 - 27 Aug 2025
Abstract
Speckle noise inherent in Synthetic Aperture Radar (SAR) images severely degrades image quality and hinders downstream tasks such as interpretation and target recognition. Existing despeckling methods, both traditional and deep learning-based, often struggle to balance effective speckle suppression with structural detail preservation. Although Denoising Diffusion Probabilistic Models (DDPMs) have shown remarkable potential for SAR despeckling, their computational overhead from iterative sampling severely limits practical applicability. To mitigate these challenges, this paper proposes the Efficient Conditional Diffusion Model (ECDM) for SAR despeckling. We integrate the cosine noise schedule with a joint variance prediction mechanism, accelerating the inference speed by an order of magnitude while maintaining high denoising quality. Furthermore, we integrate wavelet transforms into the encoder’s downsampling path, enabling adaptive feature fusion across frequency bands to enhance structural fidelity. Experimental results demonstrate that, compared to a baseline diffusion model, our proposed method achieves an approximately 20-fold acceleration in inference and obtains significant improvements in key objective metrics. This work contributes to real-time processing of diffusion models for SAR image enhancement, supporting practical deployment by mitigating prolonged inference in traditional diffusion models through efficient stochastic sampling. Full article
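
The cosine noise schedule adopted above follows the common improved-DDPM formulation, which keeps the cumulative signal coefficient alpha_bar_t from collapsing too early in the forward process. A small NumPy sketch of that schedule is given below with the customary offset s = 0.008; the paper's joint variance prediction and wavelet-based encoder are not shown.

```python
import numpy as np

def cosine_alpha_bar(T, s=0.008):
    """Cumulative signal coefficients alpha_bar_t for t = 1..T under the
    cosine schedule: alpha_bar(t) = f(t)/f(0), f(t) = cos^2(((t/T)+s)/(1+s) * pi/2)."""
    t = np.arange(T + 1, dtype=np.float64)
    f = np.cos(((t / T + s) / (1.0 + s)) * np.pi / 2.0) ** 2
    alpha_bar = f / f[0]
    # Per-step betas, clipped as is common in practice to avoid singularities near t = T.
    betas = np.clip(1.0 - alpha_bar[1:] / alpha_bar[:-1], 0.0, 0.999)
    return alpha_bar[1:], betas
```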
