Search Results (102)

Search Parameters:
Keywords = CBAM-YOLO

21 pages, 7458 KB  
Article
Dynamic and Lightweight Detection of Strawberry Diseases Using Enhanced YOLOv10
by Huilong Jin, Xiangrong Ji and Wanming Liu
Electronics 2025, 14(19), 3768; https://doi.org/10.3390/electronics14193768 - 24 Sep 2025
Viewed by 139
Abstract
Strawberry cultivation faces significant challenges from pests and diseases, which are difficult to detect due to complex natural backgrounds and the high visual similarity between targets and their surroundings. This study proposes an advanced and lightweight detection algorithm, YOLO10-SC, based on the YOLOv10 model, to address these challenges. The algorithm integrates the convolutional block attention module (CBAM) to enhance feature representation by focusing on critical disease-related information while suppressing irrelevant data. Additionally, the Spatial and Channel Reconstruction Convolution (SCConv) module is incorporated into the C2f module to improve the model’s ability to distinguish subtle differences among various pest and disease types. The introduction of DySample, an ultra-lightweight dynamic upsampler, further enhances feature boundary smoothness and detail preservation, ensuring efficient upsampling with minimal computational resources. Experimental results demonstrate that YOLO10-SC outperforms the original YOLOv10 and other mainstream algorithms in precision, recall, mAP50, F1 score, and FPS while reducing model parameters, GFLOPs, and size. These improvements significantly enhance detection accuracy and efficiency, making the model well-suited for real-time applications in natural agricultural environments. The proposed algorithm offers a robust solution for strawberry pest and disease detection, contributing to the advancement of smart agriculture.
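
Nearly every result in this listing builds on the same CBAM block, so a reference sketch is useful. Below is a minimal PyTorch implementation of the standard module (channel attention followed by spatial attention, after Woo et al., 2018); the reduction ratio of 16 and the 7×7 spatial kernel are common defaults, not values reported by this paper.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # squeeze via global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # squeeze via global max pooling
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # per-pixel channel average
        mx = x.amax(dim=1, keepdim=True)     # per-pixel channel max
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale

class CBAM(nn.Module):
    """Channel attention followed by spatial attention (Woo et al., 2018)."""
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.sa(self.ca(x))
```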

32 pages, 6397 KB  
Article
Enhancing YOLO-Based SAR Ship Detection with Attention Mechanisms
by Ranyeri do Lago Rocha and Felipe A. P. de Figueiredo
Remote Sens. 2025, 17(18), 3170; https://doi.org/10.3390/rs17183170 - 12 Sep 2025
Viewed by 607
Abstract
This study enhances Synthetic Aperture Radar (SAR) ship detection by integrating three attention mechanisms, namely Bi-Level Routing Attention (BRA), the Swin Transformer, and the Convolutional Block Attention Module (CBAM), into state-of-the-art YOLO architectures (YOLOv11 and v12). Addressing challenges like small ship sizes and complex maritime backgrounds in SAR imagery, we systematically evaluate the impact of adding and replacing attention layers at strategic positions within the models. Experiments reveal that replacing the original attention layer at position 4 (C3k2 module) with the CBAM in YOLOv12 achieves optimal performance, attaining an mAP@0.5 of 98.0% on the SAR Ship Dataset (SSD), surpassing baseline YOLOv12 (97.8%) and prior works. The optimized CBAM-enhanced YOLOv12 also reduces computational costs (5.9 GFLOPS vs. 6.5 GFLOPS in the baseline). Cross-dataset validation on the SAR Ship Detection Dataset (SSDD) confirms consistent improvements, underscoring the efficacy of targeted attention-layer replacement for SAR-specific challenges. Additionally, tests on the SADD and MSAR datasets demonstrate that this optimization generalizes beyond ship detection, yielding gains in aircraft detection and multi-class SAR object recognition. This work establishes a robust framework for efficient, high-precision maritime surveillance using deep learning.
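
The variable of interest in this study is where the attention block sits, not only which block is used. The following is a purely illustrative sketch of replacing the module at a fixed backbone position with CBAM (reusing the CBAM class from the sketch above); the layer stack, channel widths, and index 4 below are placeholders, not the authors' YOLOv12 code.

```python
import torch
import torch.nn as nn

# Hypothetical backbone expressed as an indexable stack of stages; real YOLO
# implementations are generated from config files, so treat this as a sketch.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1),    # 0: stem
    nn.Conv2d(32, 64, 3, stride=2, padding=1),   # 1
    nn.Identity(),                               # 2: stand-in for a C3k2-style block
    nn.Conv2d(64, 128, 3, stride=2, padding=1),  # 3
    nn.Identity(),                               # 4: the attention position studied here
)

# Replace the module at position 4 with CBAM (from the sketch above), keeping
# the channel width of the incoming feature map so shapes still line up.
backbone[4] = CBAM(channels=128)

out = backbone(torch.randn(1, 3, 256, 256))      # sanity check: (1, 128, 32, 32)
```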

21 pages, 10256 KB  
Article
Dual-Path Attention Network for Multi-State Safety Helmet Identification in Complex Power Scenarios
by Wei Li, Rong Jia, Xiangwu Chen, Ge Cao and Ziyan Zhao
Processes 2025, 13(9), 2750; https://doi.org/10.3390/pr13092750 - 28 Aug 2025
Viewed by 420
Abstract
The environment of the power operation site is complex and changeable, and accurate identification of the wearing status of workers’ safety helmets is significant for ensuring personal safety and the stable operation of the power system. Existing research suffers from high rates of missed detections and limited ability to discriminate fine-grained states, especially the “wrongly wearing” state. Therefore, this paper proposes an intelligent identification method of safety helmet status for power workers based on a dual-path attention network. We embed the convolutional block attention module (CBAM) in the two paths of the backbone and neck layers of YOLOv5 and enhance feature focusing on the key areas of the helmet through channel-spatial attention coordination, suppressing interference from complex backgrounds. In addition, a dedicated dataset covering power scenarios is constructed, including fine-grained state annotations under various lighting, pose, and occlusion conditions to improve the generalization of the model. Finally, the proposed method is applied to images of electric power operation sites for experimental verification. The experimental results show that the proposed YOLO-CBAM achieves an outstanding mean average precision of 98.81% for identifying all helmet states, providing reliable technical support for intelligent safety monitoring.

20 pages, 3978 KB  
Article
Cotton-YOLO: A Lightweight Detection Model for Falled Cotton Impurities Based on Yolov8
by Jie Li, Zhoufan Zhong, Youran Han and Xinhou Wang
Symmetry 2025, 17(8), 1185; https://doi.org/10.3390/sym17081185 - 24 Jul 2025
Viewed by 445
Abstract
As an important pillar of the global economic system, the cotton industry faces critical challenges from non-fibrous impurities (e.g., leaves and debris) during processing, which severely degrade product quality, inflate costs, and reduce efficiency. Traditional detection methods suffer from insufficient accuracy and low efficiency, failing to meet practical production needs. While deep learning models excel in general object detection, their massive parameter counts render them ill-suited for real-time industrial applications. To address these issues, this study proposes Cotton-YOLO, an optimized YOLOv8 model. By leveraging principles of symmetry in model design and system setup, the study integrates the CBAM attention module—with its inherent dual-path (channel-spatial) symmetry—to enhance feature capture for tiny impurities and mitigate insufficient focus on key areas. The C2f_DSConv module, exploiting functional equivalence via quantization and shift operations, reduces model complexity by 12% (to 2.71 million parameters) without sacrificing accuracy. Considering angle and shape variations in complex scenarios, the loss function is upgraded to Wise-IoU for more accurate bounding box regression. Experimental results show that Cotton-YOLO achieves 86.5% precision, 80.7% recall, 89.6% mAP50, 50.1% mAP50–95, and a detection speed of 50.51 fps, a 3.5% speed increase over the original YOLOv8. This work demonstrates the effective application of symmetry concepts (in algorithmic structure and performance balance) to create a model that balances lightweight design and high efficiency, providing a practical solution for industrial impurity detection and key technical support for automated cotton sorting systems.
(This article belongs to the Section Computer)
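
Since Wise-IoU is named as the upgraded regression loss, a hedged sketch of its v1 form (Tong et al., 2023) follows; the published loss also has a v3 variant with a dynamic focusing coefficient, which this sketch omits, and boxes are assumed to be in (x1, y1, x2, y2) format.

```python
import torch

def wise_iou_v1(pred, target, eps=1e-7):
    """Sketch of Wise-IoU v1: a distance-based attention factor R_WIoU
    rescales the plain IoU loss. pred, target: (N, 4) boxes."""
    # Intersection and IoU
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest enclosing box (detached so it rescales, not reshapes, gradients)
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    # Center distance between predicted and target boxes
    dx = (pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) / 2
    dy = (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) / 2
    r_wiou = torch.exp((dx**2 + dy**2) / (cw**2 + ch**2 + eps).detach())

    return (r_wiou * (1 - iou)).mean()
```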

22 pages, 6496 KB  
Article
Real-Time Search and Rescue with Drones: A Deep Learning Approach for Small-Object Detection Based on YOLO
by Francesco Ciccone and Alessandro Ceruti
Drones 2025, 9(8), 514; https://doi.org/10.3390/drones9080514 - 22 Jul 2025
Viewed by 2345
Abstract
Unmanned aerial vehicles are increasingly used in civil Search and Rescue operations due to their rapid deployment and wide-area coverage capabilities. However, detecting missing persons from aerial imagery remains challenging due to small object sizes, cluttered backgrounds, and limited onboard computational resources, especially when operations are managed by civil agencies. In this work, we present a comprehensive methodology for optimizing YOLO-based object detection models for real-time Search and Rescue scenarios. A two-stage transfer learning strategy was employed using VisDrone for general aerial object detection and Heridal for Search and Rescue-specific fine-tuning. We explored various architectural modifications, including enhanced feature fusion (FPN, BiFPN, PB-FPN), additional detection heads (P2), and modules such as CBAM, Transformers, and deconvolution, analyzing their impact on performance and computational efficiency. The best-performing configuration (YOLOv5s-PBfpn-Deconv) achieved an mAP@50 of 0.802 on the Heridal dataset while maintaining real-time inference on embedded hardware (Jetson Nano). Further tests at different flight altitudes and explainability analyses using EigenCAM confirmed the robustness and interpretability of the model in real-world conditions. The proposed solution offers a viable framework for deploying lightweight, interpretable AI systems for UAV-based Search and Rescue operations managed by civil protection authorities. Limitations and future directions include the integration of multimodal sensors and adaptation to broader environmental conditions.

26 pages, 7857 KB  
Article
Investigation of an Efficient Multi-Class Cotton Leaf Disease Detection Algorithm That Leverages YOLOv11
by Fangyu Hu, Mairheba Abula, Di Wang, Xuan Li, Ning Yan, Qu Xie and Xuedong Zhang
Sensors 2025, 25(14), 4432; https://doi.org/10.3390/s25144432 - 16 Jul 2025
Viewed by 571
Abstract
Cotton leaf diseases can lead to substantial yield losses and economic burdens. Traditional detection methods are challenged by low accuracy and high labor costs. This research presents the ACURS-YOLO network, an advanced cotton leaf disease detection architecture developed on the foundation of YOLOv11. By integrating a medical image segmentation model, it effectively tackles challenges including complex background interference, the missed detection of small targets, and restricted generalization ability. Specifically, the U-Net v2 module is embedded in the backbone network to boost multi-scale feature extraction performance in YOLOv11. Meanwhile, the CBAM attention mechanism is integrated to emphasize critical disease-related features. To lower the computational complexity, the SPPF module is substituted with SimSPPF. The C3k2_RCM module is appended for long-range context modeling, and the ARelu activation function is employed to alleviate the vanishing gradient problem. A database comprising 3000 images covering six types of cotton leaf diseases was constructed, and data augmentation techniques were applied. The experimental results show that ACURS-YOLO attains impressive performance indicators, including an mAP_0.5 value of 94.6%, an mAP_0.5:0.95 value of 83.4%, 95.5% accuracy, 89.3% recall, an F1 score of 92.3%, and a frame rate of 148 frames per second. It outperforms YOLOv11 and other conventional models with regard to both detection precision and overall functionality. Ablation tests additionally validate the efficacy of each component, affirming the framework’s advantage in addressing complex detection environments. This framework provides an efficient solution for the automated monitoring of cotton leaf diseases, advancing the development of smart sensors through improved detection accuracy and practical applicability.
(This article belongs to the Section Smart Agriculture)
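
Of the substitutions listed, SPPF-to-SimSPPF is the most self-contained to picture: SimSPPF (introduced with YOLOv6) keeps SPPF's cascaded max pooling but swaps the SiLU activations for ReLU. A minimal sketch follows, assuming the usual channel-halving convention; the paper's exact hyperparameters are not given in the abstract.

```python
import torch
import torch.nn as nn

class SimSPPF(nn.Module):
    """Sketch of SimSPPF: an SPPF block whose 1x1 convs use ReLU instead
    of SiLU, trading a little accuracy headroom for a cheaper activation."""
    def __init__(self, c_in: int, c_out: int, k: int = 5):
        super().__init__()
        c_hidden = c_in // 2
        self.cv1 = nn.Sequential(
            nn.Conv2d(c_in, c_hidden, 1, bias=False),
            nn.BatchNorm2d(c_hidden), nn.ReLU(inplace=True))
        self.cv2 = nn.Sequential(
            nn.Conv2d(c_hidden * 4, c_out, 1, bias=False),
            nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)        # three cascaded 5x5 poolings emulate 5/9/13 kernels
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        return self.cv2(torch.cat([x, y1, y2, y3], dim=1))
```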

21 pages, 3406 KB  
Article
ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification
by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu
Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025
Viewed by 900
Abstract
Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze and Excitation blocks, and Convolutional Block Attention Modules (the ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows feature classes to be added or removed without retraining the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high-defect-rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional deep learning models, such as parametric YOLO-series models, in defect detection, achieving higher accuracy and lower miss rates, highlighting its potential for high-reliability industrial deployment.
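
The training-then-classification split described here (triplet-loss embedding learning followed by a KNN over embeddings) is a standard metric-learning pattern; a compact sketch under assumed shapes follows. The encoder below is a toy placeholder, not the paper's ResNet-SE-CBAM branch.

```python
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier

# Toy embedding network standing in for the paper's ResNet-SE-CBAM branch;
# any encoder that maps images to fixed-length vectors fits this pattern.
embed = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128))
triplet = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(embed.parameters(), lr=1e-3)

# One training step on an (anchor, positive, negative) triplet of images.
anchor, positive, negative = (torch.randn(8, 1, 64, 64) for _ in range(3))
loss = triplet(embed(anchor), embed(positive), embed(negative))
opt.zero_grad()
loss.backward()
opt.step()

# At inference, a KNN over embeddings replaces a parametric head, so defect
# classes can be added or removed without retraining the encoder.
knn = KNeighborsClassifier(n_neighbors=3)
train_emb = embed(torch.randn(40, 1, 64, 64)).detach().numpy()
train_lbl = [0] * 20 + [1] * 20                  # e.g., good vs. defect
knn.fit(train_emb, train_lbl)
pred = knn.predict(embed(torch.randn(4, 1, 64, 64)).detach().numpy())
```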

24 pages, 15100 KB  
Article
Sugarcane Feed Volume Detection in Stacked Scenarios Based on Improved YOLO-ASM
by Xiao Lai and Guanglong Fu
Agriculture 2025, 15(13), 1428; https://doi.org/10.3390/agriculture15131428 - 2 Jul 2025
Viewed by 404
Abstract
Improper regulation of sugarcane feed volume can lead to harvester inefficiency or clogging. Accurate recognition of feed volume is therefore critical. However, visual recognition is challenging due to sugarcane stacking during feeding. To address this, we propose YOLO-ASM (YOLO Accurate Stereo Matching), a novel detection method. At the target detection level, we integrate a Convolutional Block Attention Module (CBAM) into the YOLOv5s backbone network. This significantly reduces missed detections and low-confidence predictions in dense stacking scenarios, improving detection speed by 28.04% and increasing mean average precision (mAP) by 5.31%. At the stereo matching level, we enhance the SGBM (Semi-Global Block Matching) algorithm through improved cost calculation and cost aggregation, resulting in Opti-SGBM (Optimized SGBM). This double-cost fusion approach strengthens texture feature extraction in stacked sugarcane, effectively reducing noise in the generated depth maps. The optimized algorithm yields depth maps with smaller errors relative to the original images, significantly improving depth accuracy. Experimental results demonstrate that the fused YOLO-ASM algorithm reduces sugarcane volume error rates across feed volumes of one to six by 3.45%, 3.23%, 6.48%, 5.86%, 9.32%, and 11.09%, respectively, compared to the original stereo matching algorithm. It also accelerates feed volume detection by approximately 100%, providing a high-precision solution for anti-clogging control in sugarcane harvester conveyor systems.
(This article belongs to the Section Agricultural Technology)
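
The stereo-matching side starts from SGBM, which OpenCV provides out of the box; a baseline usage sketch follows for orientation. The parameter values are typical starting points, the file names and calibration numbers are placeholders, and the paper's Opti-SGBM changes to cost computation and aggregation are internal modifications this sketch does not reproduce.

```python
import cv2
import numpy as np

# Baseline SGBM as shipped with OpenCV.
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,      # must be divisible by 16
    blockSize=5,
    P1=8 * 3 * 5 ** 2,       # smoothness penalty for small disparity changes
    P2=32 * 3 * 5 ** 2,      # larger penalty for big disparity jumps
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0  # fixed-point to pixels

# Depth from disparity for a calibrated rig: Z = f * B / d
focal_px, baseline_m = 1000.0, 0.06   # assumed calibration values
depth = focal_px * baseline_m / np.where(disparity > 0, disparity, np.nan)
```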

25 pages, 4471 KB  
Article
A Novel Lightweight Framework for Non-Contact Broiler Face Identification in Intensive Farming
by Bin Gao, Yongmin Guo, Pengshen Zheng, Kaisi Yang and Changxi Chen
Sensors 2025, 25(13), 4051; https://doi.org/10.3390/s25134051 - 29 Jun 2025
Viewed by 534
Abstract
Efficient individual identification is essential for advancing precision broiler farming. In this study, we propose YOLO-IFSC, a high-precision and lightweight face recognition framework specifically designed for dense broiler farming environments. Building on the YOLOv11n architecture, the proposed model integrates four key modules to overcome the limitations of traditional methods and recent CNN-based approaches. The Inception-F module employs a dynamic multi-branch design to enhance multi-scale feature extraction, while the C2f-Faster module leverages partial convolution to reduce computational redundancy and parameter count. Furthermore, the SPPELANF module reinforces cross-layer spatial feature aggregation to alleviate the adverse effects of occlusion, and the CBAM module introduces a dual-domain attention mechanism to emphasize critical facial regions. Experimental evaluations on a self-constructed dataset demonstrate that YOLO-IFSC achieves a mAP@0.5 of 91.5%, alongside a 40.8% reduction in parameters and a 24.2% reduction in FLOPs compared to the baseline, with a consistent real-time inference speed of 36.6 FPS. The proposed framework offers a cost-effective, non-contact alternative for broiler face recognition, significantly advancing individual tracking and welfare monitoring in precision farming.
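
The efficiency of C2f-Faster-style blocks comes from partial convolution (PConv, from FasterNet, Chen et al., 2023), which convolves only a slice of the channels and passes the rest through untouched. A minimal sketch follows; the 1/4 split ratio is FasterNet's common default, assumed here.

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """Sketch of partial convolution: apply a 3x3 conv to the first
    channels // div channels only, cutting FLOPs and parameters."""
    def __init__(self, channels: int, div: int = 4):
        super().__init__()
        self.c_conv = channels // div   # channels that actually get convolved
        self.conv = nn.Conv2d(self.c_conv, self.c_conv, 3, padding=1, bias=False)

    def forward(self, x):
        x1, x2 = torch.split(x, [self.c_conv, x.shape[1] - self.c_conv], dim=1)
        return torch.cat([self.conv(x1), x2], dim=1)  # untouched channels pass through
```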

15 pages, 1949 KB  
Article
High-Performance and Lightweight AI Model with Integrated Self-Attention Layers for Soybean Pod Number Estimation
by Qian Huang
AI 2025, 6(7), 135; https://doi.org/10.3390/ai6070135 - 24 Jun 2025
Viewed by 801
Abstract
Background: Soybean is an important global crop in food security and agricultural economics. Accurate estimation of soybean pod counts is critical for yield prediction, breeding programs, precision farming, etc. Traditional methods, such as manual counting, are slow, labor-intensive, and prone to errors. With rapid advancements in artificial intelligence (AI), deep learning has enabled automatic pod number estimation in collaboration with unmanned aerial vehicles (UAVs). However, existing AI models are computationally demanding and require significant processing resources (e.g., memory). These resources are often not available in rural regions and small farms. Methods: To address these challenges, this study presents a set of lightweight, efficient AI models designed to overcome these limitations. By integrating model simplification, weight quantization, and squeeze-and-excitation (SE) self-attention blocks, we develop compact AI models capable of fast and accurate soybean pod count estimation. Results and Conclusions: Experimental results show a comparable estimation accuracy of 84–87%, while the AI model size is significantly reduced by a factor of 9–65, thus making them suitable for deployment in edge devices, such as Raspberry Pi. Compared to existing models such as YOLO POD and SoybeanNet, which rely on over 20 million parameters to achieve approximately 84% accuracy, our proposed lightweight models deliver a comparable or even higher accuracy (84.0–86.76%) while using fewer than 2 million parameters. In future work, we plan to expand the dataset by incorporating diverse soybean images to enhance model generalizability. Additionally, we aim to explore more advanced attention mechanisms—such as CBAM or ECA—to further improve feature extraction and model performance. Finally, we aim to implement the complete system in edge devices and conduct real-world testing in soybean fields.
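
Two ingredients named here are easy to illustrate: the squeeze-and-excitation block and post-training weight quantization. The sketch below shows a standard SE block plus one common PyTorch quantization route (dynamic int8 on linear layers); the paper's actual quantization scheme is not specified in the abstract, so treat that part as an assumption.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation (Hu et al., 2018): reweight channels by a
    gating vector computed from globally pooled statistics."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # squeeze, excite, reshape
        return x * w

# One standard route to smaller models: post-training dynamic quantization,
# which stores Linear weights as int8 (placeholder model for illustration).
model = nn.Sequential(nn.Flatten(), nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 1))
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
```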

21 pages, 83137 KB  
Article
RGB-FIR Multimodal Pedestrian Detection with Cross-Modality Context Attentional Model
by Han Wang, Lei Jin, Guangcheng Wang, Wenjie Liu, Quan Shi, Yingyan Hou and Jiali Liu
Sensors 2025, 25(13), 3854; https://doi.org/10.3390/s25133854 - 20 Jun 2025
Viewed by 644
Abstract
Pedestrian detection is an important research topic in the fields of visual cognition and autonomous driving. The YOLO model has significantly improved the speed and accuracy of detection. To achieve full-day detection performance, multimodal YOLO models based on RGB-FIR image pairs have become a research hotspot. Existing work has focused on the design of fusion modules applied after feature extraction in the RGB and FIR branch backbone networks, yielding a multimodal backbone framework based on back-end fusion. However, these methods overlook the complementarity and prior knowledge between modalities and scales in the front-end raw feature extraction of the RGB and FIR branches. As a result, the performance of the back-end fusion framework largely depends on how well each modality’s raw features are represented at the front-end. This paper proposes a novel RGB-FIR multimodal backbone network framework based on a cross-modality context attentional model (CCAM). Unlike existing works, a multi-level fusion framework is designed. At the front-end of the RGB-FIR parallel backbone network, a CCAM model is constructed for the raw features at each scale: the RGB-FIR fusion results of the lower-level features are used to optimize the spatial weights of the upper-level RGB and FIR features, achieving cross-modality and cross-scale complementarity between adjacent scale feature extraction modules. At the back-end of the RGB-FIR parallel network, a channel-spatial joint attention model (CBAM) and self-attention models are combined to obtain the final RGB-FIR fusion features at each scale from the RGB and FIR features optimized by CCAM. Comparative experiments on multiple public RGB-FIR datasets, across different performance evaluation indicators, indicate that this method significantly enhances the accuracy and robustness of pedestrian detection compared with current RGB-FIR multimodal YOLO models.
(This article belongs to the Section Intelligent Sensors)
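
Reading the description literally, the fused lower-level feature produces a spatial weight map that gates the next scale's RGB and FIR features. The following is a speculative sketch of that wiring only, assuming a sigmoid-gated 1×1 convolution; the authors' actual CCAM design may differ substantially.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleGate(nn.Module):
    """Speculative reading of the CCAM idea: a lower-level RGB-FIR fusion
    result yields a spatial weight map that modulates the next scale's raw
    RGB and FIR features before they are fused in turn."""
    def __init__(self, c_low: int):
        super().__init__()
        self.to_weight = nn.Sequential(nn.Conv2d(c_low, 1, kernel_size=1),
                                       nn.Sigmoid())

    def forward(self, fused_low, rgb_up, fir_up):
        w = self.to_weight(fused_low)                # (B, 1, H_low, W_low)
        w = F.interpolate(w, size=rgb_up.shape[2:])  # match upper-level resolution
        return rgb_up * w, fir_up * w
```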

24 pages, 6594 KB  
Article
GAT-Enhanced YOLOv8_L with Dilated Encoder for Multi-Scale Space Object Detection
by Haifeng Zhang, Han Ai, Donglin Xue, Zeyu He, Haoran Zhu, Delian Liu, Jianzhong Cao and Chao Mei
Remote Sens. 2025, 17(13), 2119; https://doi.org/10.3390/rs17132119 - 20 Jun 2025
Viewed by 659
Abstract
The problem of inadequate object detection accuracy in complex remote sensing scenarios has been identified as a primary concern. Traditional YOLO-series algorithms encounter challenges such as poor robustness in small object detection and significant interference from complex backgrounds. In this paper, a multi-scale feature fusion framework based on an improved version of YOLOv8_L is proposed. The combination of a graph attention network (GAT) and a Dilated Encoder network significantly improves the algorithm’s detection and recognition performance for space remote sensing objects. The approach abandons the original Feature Pyramid Network (FPN) structure, proposes an adaptive fusion strategy based on multi-level features of the backbone network, enhances the expression of multi-scale objects through upsampling and feature stacking, and reconstructs the FPN. The local features extracted by convolutional neural networks are mapped to graph-structured data, and the nodal attention mechanism of GAT is used to capture the global topological associations of space objects, compensating for the convolution operation’s limitations in weight allocation and realizing GAT integration. The Dilated Encoder network is introduced to cover targets of different scales by differentiating receptive fields, and feature weight allocation is optimized by combining it with a Convolutional Block Attention Module (CBAM). According to the characteristics of space missions, an annotated dataset containing 8000 satellite and space station images is constructed, covering a variety of lighting, attitude, and scale conditions and providing benchmark support for model training and verification. Experimental results on the space object dataset reveal that the enhanced algorithm achieves a mean average precision (mAP) of 97.2%, representing a 2.1% improvement over the original YOLOv8_L. Comparative experiments with six other models demonstrate that the proposed algorithm outperforms its counterparts. Ablation studies further validate the synergistic effect between the graph attention network (GAT) and the Dilated Encoder. The results indicate that the model maintains high detection accuracy under challenging conditions, including strong light interference, multi-scale variations, and low-light environments.
(This article belongs to the Special Issue Remote Sensing Image Thorough Analysis by Advanced Machine Learning)
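
For the GAT component, a single-head graph attention layer over a dense adjacency matrix is sketched below (after Velickovic et al., 2018). How the paper maps CNN feature-map locations to graph nodes is its own contribution and is not reproduced here; self-loops in the adjacency are assumed so every node has at least one neighbor.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATLayer(nn.Module):
    """Single-head graph attention layer over a dense 0/1 adjacency matrix."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared node projection
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # attention scorer

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) adjacency with self-loops
        z = self.W(h)                                    # (N, out_dim)
        n = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                           z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), 0.2)  # raw attention logits
        e = e.masked_fill(adj == 0, float('-inf'))        # attend only along edges
        alpha = torch.softmax(e, dim=-1)                  # normalize over neighbors
        return alpha @ z                                  # aggregate neighbor features
```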

31 pages, 12794 KB  
Article
Enhanced Defect Detection in Additive Manufacturing via Virtual Polarization Filtering and Deep Learning Optimization
by Xu Su, Xing Peng, Xingyu Zhou, Hongbing Cao, Chong Shan, Shiqing Li, Shuo Qiao and Feng Shi
Photonics 2025, 12(6), 599; https://doi.org/10.3390/photonics12060599 - 11 Jun 2025
Cited by 1 | Viewed by 2164
Abstract
Additive manufacturing (AM) is widely used in industries such as aerospace, medical, and automotive. Within this domain, defect detection technology has emerged as a critical area of research in the quality inspection phase of AM. The main challenge is that, under extreme lighting conditions, strong reflected light obscures defect features, leading to a significant decrease in the defect detection rate. This paper introduces a novel methodology for intelligent defect detection in AM components with reflective surfaces, leveraging virtual polarization filtering (IEVPF) and an improved YOLO V5-W model. The IEVPF algorithm is designed to enhance image quality through the virtual manipulation of light polarization, thereby improving defect visibility. The YOLO V5-W model, integrated with CBAM attention, DenseNet connections, and an EIoU loss function, demonstrates superior performance in defect identification across various lighting conditions. Experiments show a 40.3% reduction in loss, a 10.8% improvement in precision, a 10.3% improvement in recall, and a 13.7% improvement in mAP compared to the original YOLO V5 model. Our findings highlight the potential of combining virtual polarization filtering with advanced deep learning models for enhanced AM surface defect detection.
(This article belongs to the Special Issue Advances in Micro-Nano Optical Manufacturing)
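
Of the YOLO V5-W ingredients, the EIoU loss is the most self-contained to illustrate. A hedged sketch of the standard formulation (Zhang et al., 2022) follows, adding normalized center-distance, width, and height penalties to the IoU term; boxes are assumed to be in (x1, y1, x2, y2) format.

```python
import torch

def eiou_loss(pred, target, eps=1e-7):
    """Sketch of the EIoU loss: 1 - IoU plus center-distance, width, and
    height penalties, each normalized by the smallest enclosing box."""
    inter_w = (torch.min(pred[:, 2], target[:, 2]) - torch.max(pred[:, 0], target[:, 0])).clamp(0)
    inter_h = (torch.min(pred[:, 3], target[:, 3]) - torch.max(pred[:, 1], target[:, 1])).clamp(0)
    inter = inter_w * inter_h
    wp, hp = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    wt, ht = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    iou = inter / (wp * hp + wt * ht - inter + eps)

    # Smallest enclosing box dimensions
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    # Center distance between predicted and target boxes
    dx = (pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) / 2
    dy = (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) / 2
    loss = (1 - iou
            + (dx**2 + dy**2) / (cw**2 + ch**2 + eps)
            + (wp - wt)**2 / (cw**2 + eps)
            + (hp - ht)**2 / (ch**2 + eps))
    return loss.mean()
```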

15 pages, 4176 KB  
Article
Wind Turbine Surface Crack Detection Based on YOLOv5l-GCB
by Feng Hu, Xiaohui Leng, Chao Ma, Guoming Sun, Dehong Wang, Duanxuan Liu and Zixuan Zhang
Energies 2025, 18(11), 2775; https://doi.org/10.3390/en18112775 - 27 May 2025
Viewed by 405
Abstract
Wind towers are a fundamental element of the wind power generation system, so the timely detection and rectification of surface cracks and other defects are imperative to ensure the stable function of the entire system. A new wind tower surface crack detection model, You Only Look Once version 5l GhostNetV2-CBAM-BiFPN (YOLOv5l-GCB), is proposed to accomplish accurate classification of wind tower surface cracks. Ghost Network Version 2 (GhostNetV2) is integrated into the backbone of YOLOv5l to make it lightweight, which reduces model complexity and enhances inference speed; the Convolutional Block Attention Module (CBAM) is added to strengthen the model’s attention to the target region; and the bidirectional feature pyramid network (BiFPN) is incorporated to enhance the model’s detection accuracy in complex scenes. The proposed improvement strategy is verified through ablation experiments. The experimental results indicate that the precision, recall, F1 score, and mean average precision of YOLOv5l-GCB reach 91.6%, 99.0%, 75.0%, and 84.6%, which are 4.7%, 2%, 1%, and 10.4% higher than those of YOLOv5l, and it can accurately recognize multiple types of cracks, detecting an average of 28 images per second, which improves detection speed.
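
BiFPN's distinguishing step is its fast normalized weighted fusion, in which each input feature map gets a learnable non-negative weight. A minimal sketch of that fusion step (from EfficientDet, Tan et al., 2020) follows; the surrounding top-down and bottom-up BiFPN topology is omitted.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Sketch of BiFPN's fast normalized fusion: learnable non-negative
    weights, normalized without a softmax, combine same-shape feature maps."""
    def __init__(self, n_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, feats):
        w = torch.relu(self.w)           # keep weights non-negative
        w = w / (w.sum() + self.eps)     # fast normalization (cheaper than softmax)
        return sum(wi * f for wi, f in zip(w, feats))
```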

22 pages, 11736 KB  
Article
A Precise Detection Method for Tomato Fruit Ripeness and Picking Points in Complex Environments
by Xinfa Wang, Xuan Wen, Yi Li, Chenfan Du, Duokuo Zhang, Chengxiu Sun and Bihua Chen
Horticulturae 2025, 11(6), 585; https://doi.org/10.3390/horticulturae11060585 - 25 May 2025
Cited by 1 | Viewed by 1585
Abstract
Accurate identification of tomato ripeness and precise detection of picking points are key to realizing automated picking. To address problems faced in practical applications, such as the low accuracy of tomato ripeness and picking-point detection in complex greenhouse environments, which leads to wrong picking, missed picking, and fruit damage by robots, this study proposes the YOLO-TMPPD (Tomato Maturity and Picking Point Detection) model. YOLO-TMPPD is structurally improved and algorithmically optimized based on the YOLOv8 baseline architecture. Firstly, the Depthwise Convolution (DWConv) module is utilized to substitute the C2f module within the backbone network. This substitution not only cuts the model’s computational load but also enhances detection precision. Secondly, the Content-Aware ReAssembly of FEatures (CARAFE) operator is utilized to enhance the up-sampling operation, enabling precise content-aware processing of tomatoes and picking keypoints to improve accuracy and recall. Finally, the Convolutional Block Attention Module (CBAM) is incorporated to enhance the model’s ability to detect key tomato-picking regions over a large field of view in both channel and spatial dimensions. Ablation experiments were conducted to validate the effectiveness of each proposed module (DWConv, CARAFE, CBAM), and the architecture was compared with YOLOv3, v5, v6, v8, v9, and v10. The experimental results reveal that, compared with the original network model, the YOLO-TMPPD model brings remarkable improvements: it improves the object detection F1 score by 4.48%, enhances keypoint detection accuracy by 4.43%, and reduces the model’s size by 8.6%. This study holds substantial theoretical and practical value. In the complex environment of a greenhouse, it contributes significantly to computer-vision-enabled detection of tomato ripening. It can also help robots accurately locate picking points and estimate posture, which is crucial for efficient, precise, and damage-free tomato-picking operations.
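
The DWConv substitution is the classic depthwise-separable factorization: a per-channel 3×3 convolution followed by a 1×1 pointwise convolution. A minimal sketch follows; the SiLU activation matches YOLOv8's convention, and pairing the depthwise and pointwise convs exactly this way is an assumption rather than the paper's verified design.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 conv (groups = channels) followed by a 1x1 pointwise
    conv, costing far fewer parameters than a dense 3x3 convolution."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in, bias=False)
        self.pointwise = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```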