Search Results (3,811)

Search Parameters:
Keywords = improved YOLOv9

22 pages, 12312 KB  
Article
ES-YOLO: Multi-Scale Port Ship Detection Combined with Attention Mechanism in Complex Scenes
by Lixiang Cao, Jia Xi, Zixuan Xie, Teng Feng and Xiaomin Tian
Sensors 2025, 25(24), 7630; https://doi.org/10.3390/s25247630 - 16 Dec 2025
Abstract
With the rapid development of remote sensing technology and deep learning, port ship detection based on single-stage algorithms has achieved remarkable results in optical imagery. However, most existing methods are designed and verified in specific scenes, such as a fixed viewing angle, a uniform background, or the open sea, which makes it difficult to handle ship detection in complex environments involving cloud occlusion, wave fluctuation, complex harbor buildings, and multi-ship aggregation. To this end, the ES-YOLO framework is proposed to address these limitations. A novel edge-perception channel–spatial attention mechanism (EACSA) is proposed to enhance the extraction of edge information and improve the ability to capture feature details. A lightweight spatial–channel decoupled down-sampling module (LSCD) is designed to replace the down-sampling structure of the original network and reduce the complexity of the down-sampling stage. A new hierarchical scale structure is designed to balance detection across targets of different scales. A remote sensing ship dataset, TJShip, is constructed from Gaofen-2 images, covering multi-scale targets from small fishing boats to large cargo ships. Using TJShip as the data source, ablation and comparison experiments were conducted with the ES-YOLO model. The results show that the EACSA attention mechanism, LSCD, and multi-scale structure improve the mAP of ship detection by 0.83%, 0.54%, and 1.06%, respectively, compared with the baseline model, while also performing well in precision, recall, and F1 score. Under the same experimental conditions, ES-YOLO improves mAP by 46.87%, 8.14%, 1.85%, 1.75%, and 0.86% over Faster R-CNN, RetinaNet, YOLOv5, YOLOv7, and YOLOv8, respectively, providing a useful reference for ship detection research.
(This article belongs to the Section Remote Sensors)
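The mAP figures reported above all rest on matching predicted boxes to ground truth by Intersection over Union. A minimal sketch of box IoU (illustrative only; the function name and box format are assumptions, not the paper's code):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])  # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A prediction counts as a true positive only when its IoU with a ground-truth box exceeds the evaluation threshold (0.5 for mAP@0.5).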

29 pages, 12360 KB  
Article
Vision-Guided Dynamic Risk Assessment for Long-Span PC Continuous Rigid-Frame Bridge Construction Through DEMATEL–ISM–DBN Modelling
by Linlin Zhao, Qingfei Gao, Yidian Dong, Yajun Hou, Liangbo Sun and Wei Wang
Buildings 2025, 15(24), 4543; https://doi.org/10.3390/buildings15244543 - 16 Dec 2025
Abstract
In response to the challenges posed by the complex evolution of risks and the static nature of traditional assessment methods during the construction of long-span prestressed concrete (PC) continuous rigid-frame bridges, this study proposes a risk assessment framework that integrates visual perception with dynamic probabilistic reasoning. By combining an improved YOLOv8 model with the Decision-Making Trial and Evaluation Laboratory–Interpretive Structural Modeling (DEMATEL–ISM) algorithm, the framework achieves intelligent identification of risk elements and causal structure modelling. On this basis, a dynamic Bayesian network (DBN) is constructed, incorporating a sliding window and a forgetting-factor mechanism to enable adaptive updating of conditional probability tables. Using the Tongshun River Bridge as a case study, the identification layer refines onsite targets into 14 risk elements (F1–F14). For visualization, these are aggregated into four categories—"Bridge, Person, Machine, Environment"—to enhance readability. In the methodology layer, leveraging a priori causal information provided by DEMATEL–ISM, risk elements are mapped to scenario probabilities, enabling scenario-level risk assessment and grading. This establishes a traceable closed-loop process from "elements" to "scenarios." The results demonstrate that the proposed approach effectively identifies key risk chains within the "human–machine–environment–bridge" system, revealing phase-specific peaks in human-related risks and cumulative increases in structural and environmental risks. The particle filter and Monte Carlo prediction outputs generate short-term risk evolution curves with confidence intervals, facilitating the quantitative classification of risk levels. Overall, this vision-guided dynamic risk assessment method significantly enhances the real-time responsiveness, interpretability, and foresight of bridge construction safety management and provides a promising pathway for proactive risk control in complex engineering environments.
(This article belongs to the Special Issue Big Data and Machine/Deep Learning in Construction)
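The DBN's adaptive updating combines a sliding window with a forgetting factor. The paper's exact update rule is not given here; a generic exponentially weighted sketch illustrates the underlying idea (the function name and the λ value are illustrative assumptions):

```python
def forgetting_update(prev_estimate, observation, lam=0.8):
    """Exponentially weighted update: recent evidence outweighs old evidence.
    lam is the forgetting factor (closer to 1 = longer memory)."""
    return lam * prev_estimate + (1.0 - lam) * observation

# A probability estimate drifting toward repeated new evidence:
p = 0.5
for obs in [1.0, 1.0, 1.0]:
    p = forgetting_update(p, obs, lam=0.8)
```

With λ = 0.8, three consecutive observations of 1.0 pull the estimate from 0.5 to 0.744, showing how stale conditional probabilities are gradually overwritten.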

22 pages, 2204 KB  
Article
A Lightweight YOLOv8-Based Network for Efficient Corn Disease Detection
by Deao Song, Yiran Peng, Xinyuan Gu and KinTak U
Mathematics 2025, 13(24), 4002; https://doi.org/10.3390/math13244002 - 16 Dec 2025
Abstract
To address the pressing need for accurate and efficient detection of corn diseases, we propose a novel, lightweight object detection framework, CBS-YOLOv8 (C2f-BiFPN-SCConv YOLOv8), which builds upon the YOLOv8 architecture to enhance performance for corn disease detection. The model incorporates two key components: the GhostNetV2 block and SCConv (Selective Convolution). The GhostNetV2 block improves feature representation while reducing computational complexity, and SCConv dynamically optimizes convolution operations, adjusting to the input to keep computational overhead minimal. Together, these components maintain high detection accuracy while keeping the network lightweight. Additionally, the model integrates the C2f-GhostNetV2 module to eliminate redundancy, and the SimAM attention mechanism improves lesion–background separation, enabling more accurate disease detection. The Bi-directional Feature Pyramid Network (BiFPN) enhances feature representation across multiple scales, strengthening detection across varying object sizes. Evaluated on a custom dataset of over 6000 corn leaf images across six categories, CBS-YOLOv8 achieves improved accuracy and reliability in object detection. With a lightweight architecture of just 8.1M parameters and 21 GFLOPs, it enables real-time deployment on edge devices in agricultural settings. CBS-YOLOv8 offers high detection performance while maintaining computational efficiency, making it well suited to precision agriculture.
(This article belongs to the Special Issue Intelligent Mathematics and Applications)

18 pages, 2808 KB  
Article
Lightweight Structure and Attention Fusion for In-Field Crop Pest and Disease Detection
by Zijing Luo, Yunsen Liang, Naimin Kong, Lirui Liang, Wenjun Peng, Yujie Yao, Chi Qin, Xiaohan Lu, Mingman Xu, Yining Zhang, Chenyang Lin, Chengyao Jiang, Mengyao Li, Yangxia Zheng, Yameng Jiang and Wei Lu
Agronomy 2025, 15(12), 2879; https://doi.org/10.3390/agronomy15122879 - 15 Dec 2025
Abstract
In agricultural production, plant diseases and pests are among the major threats to crop yield and quality, and existing identification methods struggle with small target scales, complex background environments, and unbalanced sample distributions. This paper proposes a lightweight improved target detection model, YOLOv5s-LiteAttn. Based on YOLOv5s, the model introduces GhostConv and depthwise convolution to reduce the number of parameters and the computational complexity, and it combines the CBAM and Coordinate Attention mechanisms to enhance the network's feature representation capability. Experimental results show that, compared with the base YOLOv5s model, the number of parameters of the improved model is reduced by 22.75% and the computational load by 16.77%. At the same time, mAP@0.5–0.95 is increased by 3.3 percentage points and recall by 1.1 percentage points. In addition, the inference speed increases from 121 FPS to 142 FPS at an input resolution of 640 × 640, further confirming that the proposed model achieves a favorable trade-off between accuracy and efficiency. The average precision of YOLOv5s-LiteAttn is 97.1%, outperforming existing mainstream lightweight detection models. Moreover, an independent test set of 4328 newly collected field images was established to evaluate generalization and practical applicability. Despite a slight performance decrease compared with the validation results, the model maintained an mAP@0.5–0.95 of 95.8%, significantly outperforming the baseline model and confirming its robustness and cross-domain adaptability. These results confirm that the model is both highly accurate and lightweight, making it effective for the detection of agricultural diseases and pests.
(This article belongs to the Section Pest and Disease Management)
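The parameter savings from swapping standard convolutions for depthwise (separable) ones, as in the abstract above, follow from a simple count. A sketch of the arithmetic (function names are illustrative; bias terms ignored):

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution: every output channel
    mixes all input channels."""
    return k * k * c_in * c_out

def dw_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1x1 pointwise conv that mixes channels."""
    return k * k * c_in + c_in * c_out

std = conv_params(64, 128, 3)            # 73,728 weights
lite = dw_separable_params(64, 128, 3)   # 8,768 weights, ~8.4x fewer
```

The exact reduction a full model sees (22.75% here) depends on how many layers are replaced, but the per-layer ratio explains why the technique is standard in lightweight detectors.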

20 pages, 4533 KB  
Article
YOLOv11-LADC: A Lightweight Detection Framework for Micro–Nano Damage Precursors in Thermal Barrier Coatings
by Cong Huang, Xing Peng, Feng Shi, Ci Song, Hongbing Cao, Xinjie Zhao and Hengrui Xu
Nanomaterials 2025, 15(24), 1878; https://doi.org/10.3390/nano15241878 - 14 Dec 2025
Abstract
Performance breakthroughs and safety assurance of aerospace equipment are critical to the advancement of modern aerospace technology. As a key protective system for the hot-end components of aeroengines, thermal barrier coatings (TBCs) play a vital role in ensuring safe engine operation and overall flight safety. To address the core challenge of detecting micro–nano damage precursors in aerospace TBCs, this study proposes an enhanced detection framework, YOLOv11-LADC. Specifically, the framework integrates the LSKA attention mechanism to construct the C2PSA-LA module, enhancing the detection of micro–nano damage precursors and the adaptability to complex small-sample datasets. Additionally, it introduces deformable convolutions (DeformConv) to build the C3k2-DeformCSP module, which dynamically adapts to the irregular deformations of micro–nano damage precursors while reducing computational complexity. A data augmentation strategy incorporating 19 transformations expands the dataset to 5140 images. Experimental results demonstrate that, compared with the YOLOv11 baseline, the proposed model achieves a 1.6% improvement in precision (P) and a 2.0% increase in recall (R), while keeping mAP50 and mAP50-95 at near-constant levels. Meanwhile, the computational complexity is reduced to 6.2 GFLOPs, validating the superiority of the enhanced framework in detection accuracy and training efficiency. This confirms the feasibility and practicality of the YOLOv11-LADC algorithm for detecting multi-scale micro–nano damage precursors in aerospace TBCs and provides an effective solution for their intelligent, high-precision, real-time detection.
(This article belongs to the Section Nanoelectronics, Nanosensors and Devices)

21 pages, 2820 KB  
Article
Research on Small Target Detection Method for Poppy Plants in UAV Aerial Photography Based on Improved YOLOv8
by Xiaodan Feng, Lijun Yun, Chunlong Wang, Haojie Zhang, Rou Guan, Yuying Ma and Huan Jin
Agronomy 2025, 15(12), 2868; https://doi.org/10.3390/agronomy15122868 - 14 Dec 2025
Abstract
In response to the challenges in unmanned aerial vehicle (UAV)-based poppy plant detection, such as dense small targets, occlusions, and complex backgrounds, an improved YOLOv8-based detection algorithm with multi-module collaborative optimization is proposed. First, the lightweight Efficient Channel Attention (ECA) mechanism was integrated into the YOLOv8 backbone network to construct a composite feature extraction module with enhanced representational capacity. Subsequently, a Bidirectional Feature Pyramid Network (BiFPN) was introduced into the neck network to establish adaptive cross-scale feature fusion through learnable weighting parameters. Furthermore, the Wise Intersection over Union (WIoU) loss function was adopted to improve the accuracy of bounding box regression. Finally, a dedicated 160 × 160 pixel detection head was added to leverage high-resolution features from shallow layers, enhancing the detection of small targets. Under five-fold cross-validation, the proposed model achieved mAP@0.5 and mAP@0.5:0.95 of 0.989 ± 0.003 and 0.850 ± 0.013, respectively, with average increases of 1.3 and 3.2 percentage points over YOLOv8. Statistical analysis confirmed that these gains were significant, demonstrating the effectiveness of the proposed method as a reliable solution for poppy plant detection.
(This article belongs to the Special Issue Agricultural Imagery and Machine Vision)
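BiFPN's "learnable weighting parameters" mentioned above are commonly implemented as fast normalized fusion, where non-negative learned weights are normalized by their sum. A NumPy sketch under that assumption (not the paper's code; names and values are illustrative):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fusion: out = sum(w_i * f_i) / (sum(w_i) + eps)."""
    w = np.maximum(weights, 0.0)  # ReLU keeps learned weights non-negative
    num = sum(wi * fi for wi, fi in zip(w, features))
    return num / (w.sum() + eps)

# Two same-shape feature maps, one weighted 3x more than the other:
a, b = np.ones((4, 4)), np.zeros((4, 4))
fused = fast_normalized_fusion([a, b], np.array([3.0, 1.0]))
```

In training, the weights are parameters updated by backpropagation, so each fusion node learns how much each input scale should contribute.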

28 pages, 4422 KB  
Article
Enhanced Object Detection Algorithms in Complex Environments via Improved CycleGAN Data Augmentation and AS-YOLO Framework
by Zhen Li, Yuxuan Wang, Lingzhong Meng, Wenjuan Chu and Guang Yang
J. Imaging 2025, 11(12), 447; https://doi.org/10.3390/jimaging11120447 - 12 Dec 2025
Abstract
Object detection in complex environments, such as challenging lighting conditions, adverse weather, and target occlusions, poses significant difficulties for existing algorithms. To address these challenges, this study introduces a collaborative solution integrating improved CycleGAN-based data augmentation and an enhanced object detection framework, AS-YOLO. The improved CycleGAN incorporates a dual self-attention mechanism and spectral normalization to enhance feature capture and training stability. The AS-YOLO framework integrates a channel–spatial parallel attention mechanism, an AFPN structure for improved feature fusion, and the Inner_IoU loss function for better generalization. The experimental results show that, compared with YOLOv8n, the mAP@0.5 and mAP@0.95 of the AS-YOLO algorithm increase by 1.5% and 0.6%, respectively. After data augmentation and style transfer, mAP@0.5 and mAP@0.95 increase by 14.6% and 17.8%, respectively, demonstrating the effectiveness of the proposed method in improving model performance in complex scenarios.
(This article belongs to the Special Issue Advances in Machine Learning for Computer Vision Applications)

15 pages, 1730 KB  
Article
Research on Printed Circuit Board (PCB) Defect Detection Algorithm Based on Convolutional Neural Networks (CNN)
by Zhiduan Ni and Yeonhee Kim
Appl. Sci. 2025, 15(24), 13115; https://doi.org/10.3390/app152413115 - 12 Dec 2025
Abstract
Printed circuit board (PCB) defect detection is critical for quality control in electronics manufacturing. Traditional manual inspection and classical Automated Optical Inspection (AOI) methods face challenges in speed, consistency, and flexibility. This paper proposes a CNN-based approach for automatic PCB defect detection using the YOLOv5 model. The method leverages a convolutional neural network to identify various PCB defect types (e.g., open circuits, short circuits, and missing holes) from board images. A model was trained on a PCB image dataset with detailed annotations, and data augmentation techniques, such as sharpening and noise filtering, were applied to improve robustness. The experimental results showed that the proposed approach could locate and classify multiple defect types on PCBs, with overall detection precision and recall above 90% and 91%, respectively, enabling reliable automated inspection. A brief comparison with the more recent YOLOv8 model is also presented, showing that the proposed CNN-based detector offers competitive performance. This study shows that deep learning-based defect detection can significantly improve PCB inspection efficiency and accuracy, paving the way for intelligent manufacturing and quality assurance in PCB production. From a sensing perspective, the system is framed around an industrial RGB camera and controlled illumination, emphasizing how imaging-sensor choices and settings shape defect visibility and model robustness, and sketching future sensor-fusion directions.
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)
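The precision and recall figures quoted above come from counting true positives, false positives, and false negatives after matching detections to ground truth. A minimal sketch (function name and counts are illustrative, not from the paper):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP): fraction of detections that are correct.
    Recall = TP/(TP+FN): fraction of real defects that were found."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# e.g. 90 correct detections, 10 spurious ones, 9 defects missed:
p, r = precision_recall(tp=90, fp=10, fn=9)
```

High precision with lower recall means few false alarms but some missed defects; inspection systems usually tune the confidence threshold to trade one against the other.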

28 pages, 27801 KB  
Article
Optimising Deep Learning-Based Segmentation of Crop and Soil Marks with Spectral Enhancements on Sentinel-2 Data
by Andaleeb Yaseen, Giulio Poggi, Sebastiano Vascon and Arianna Traviglia
Remote Sens. 2025, 17(24), 4014; https://doi.org/10.3390/rs17244014 - 12 Dec 2025
Abstract
This study presents the first systematic investigation into the influence of spectral enhancement techniques on the segmentation accuracy of specific soil and vegetation marks associated with palaeochannels. These marks are often subtle and can be seasonally obscured by vegetation dynamics and soil variability. Spectral enhancement methods, such as spectral indices and statistical aggregations, are routinely applied to improve their visual discriminability and interpretability. Despite recent progress in automated detection workflows, no prior research has rigorously quantified the effects of these enhancement techniques on the performance of deep learning-based segmentation models. This gap at the intersection of remote sensing and AI-driven analysis is critical, as addressing it is essential for improving the accuracy, efficiency, and scalability of subsurface feature detection across large and heterogeneous landscapes. Two state-of-the-art deep learning architectures, U-Net and YOLOv8, were trained and tested to assess the influence of these spectral transformations on model performance, using Sentinel-2 imagery acquired across three seasonal windows. Across all experiments, spectral enhancement techniques led to clear improvements in segmentation accuracy compared with raw multispectral inputs. The multi-temporal Median Visualisation (MV) composite provided the most stable performance overall, achieving mean IoU values of 0.22 ± 0.02 in April, 0.07 ± 0.03 in August, and 0.19 ± 0.03 in November for U-Net, outperforming the full 12-band Sentinel-2 stack, which reached only 0.04, 0.02, and 0.03 in the same periods. FCC and VBB also performed competitively, with FCC reaching 0.21 ± 0.02 and VBB 0.18 ± 0.03 in April, showing that compact three-band enhancements consistently exceed the segmentation quality obtained from using all spectral bands. Performance varied with environmental conditions, with April yielding the highest accuracy, while August remained challenging across all methods. These results highlight the importance of seasonally informed spectral preprocessing and establish an empirical benchmark for integrating enhancement techniques into AI-based archaeological and geomorphological prospection workflows.
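For segmentation, the IoU values above are computed over pixel masks rather than boxes. An illustrative NumPy version (names and the toy masks are assumptions, not the study's code):

```python
import numpy as np

def mask_iou(pred, target):
    """IoU for binary segmentation masks (boolean arrays of equal shape):
    overlapping pixels divided by pixels in either mask."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 0.0

pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt   = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
iou = mask_iou(pred, gt)  # 2 shared pixels / 4 covered pixels = 0.5
```

Because palaeochannel marks cover few pixels, even small boundary errors shift IoU noticeably, which is why mean IoU values in the 0.2 range can still represent useful detections here.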

14 pages, 2582 KB  
Article
Seafood Object Detection Method Based on Improved YOLOv5s
by Nan Zhu, Zhaohua Liu, Zhongxun Wang and Zheng Xie
Sensors 2025, 25(24), 7546; https://doi.org/10.3390/s25247546 - 12 Dec 2025
Abstract
To address the false positives and missed detections commonly observed in traditional underwater seafood object detection algorithms, this paper proposes an improved detection method based on YOLOv5s. Specifically, we introduce a Spatial–Channel Synergistic Attention (SCSA) module after the Fast Spatial Pyramid Pooling layer in the backbone network. This module adopts a synergistic mechanism in which channel attention guides spatial localization and spatial attention feeds back to optimize channel weights, dynamically enhancing the distinctive features of aquatic targets (such as sea cucumber folds) while suppressing seawater background interference. In addition, we replace some C3 modules in YOLOv5s with our three-scale convolution dual-path variable-kernel module based on Pinwheel-shaped Convolution (C3k2-PSConv). This module strengthens the model's ability to capture multi-dimensional features of aquatic targets, especially small and occluded ones, reducing the false detection rate while keeping the model lightweight. The enhanced model is evaluated on the URPC dataset, which contains real-world underwater imagery of echinus, starfish, holothurian, and scallop. The experimental results show that, compared with the baseline YOLOv5s and while maintaining real-time inference speed, the proposed method increases the mean average precision (mAP) by 2.3% and reduces the number of parameters by approximately 2.4%, significantly improving the model's operational efficiency.
(This article belongs to the Section Sensing and Imaging)

19 pages, 2659 KB  
Article
A Structure-Aware Masked Autoencoder for Sparse Character Image Recognition
by Cheng Luo, Wenhong Wang, Junhang Mai, Tianwei Mu, Shuo Guo and Mingzhe Yuan
Electronics 2025, 14(24), 4886; https://doi.org/10.3390/electronics14244886 - 12 Dec 2025
Abstract
Conventional vehicle character recognition methods often treat detection and recognition as separate processes, resulting in limited feature interaction and potential error propagation. To address this issue, this paper proposes a structure-aware self-supervised Masked Autoencoder (CharSAM-MAE) framework, combined with an independent region extraction preprocessing stage. A YOLOv8n detector is employed solely to crop the region of interest (ROI) from full-frame vehicle images, using 50 samples annotated with a single bounding box each. After cropping, the detector is discarded, and subsequent self-supervised pre-training and recognition are executed entirely by the MAE, without any involvement of YOLO model parameters or labeled data. CharSAM-MAE incorporates a structure-aware masking strategy and a region-weighted reconstruction loss during pre-training to improve both local structural representation and global feature modeling. During fine-tuning, a multi-head attention-enhanced CTC decoder (A-CTC) is applied to mitigate issues such as sparse characters, adhesion, and long-sequence instability. The framework is trained on 13,544 ROI images, with only 5% of the labeled data used for supervised fine-tuning. Experimental results demonstrate that the proposed method achieves 99.25% character accuracy, 88.6% sequence accuracy, and a 0.85% character error rate, outperforming the PaddleOCR v5 baseline (98.92%, 85.2%, and 1.15%, respectively). These results verify the effectiveness of structure-aware self-supervised learning and highlight the applicability of the proposed method for industrial character recognition with minimal annotation requirements.
(This article belongs to the Section Electrical and Autonomous Vehicles)
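The character error rate (CER) reported above is conventionally the edit distance between predicted and reference strings divided by the reference length. A compact sketch (illustrative names; not the paper's evaluation code):

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions
    turning ref into hyp (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, rc in enumerate(ref, 1):
        cur = [i]
        for j, hc in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (rc != hc)))  # substitution
        prev = cur
    return prev[-1]

def char_error_rate(ref, hyp):
    """CER = edit distance / number of reference characters."""
    return levenshtein(ref, hyp) / len(ref)
```

For example, a single wrong digit in a six-character code gives a CER of 1/6 ≈ 16.7%, which is why CERs below 1% imply near-perfect strings.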

23 pages, 7617 KB  
Article
A Dual-Modal Adaptive Pyramid Transformer Algorithm for UAV Cross-Modal Object Detection
by Qiqin Li, Ming Yang, Xiaoqiang Zhang, Nannan Wang, Xiaoguang Tu, Xijun Liu and Xinyu Zhu
Sensors 2025, 25(24), 7541; https://doi.org/10.3390/s25247541 - 11 Dec 2025
Abstract
Unmanned Aerial Vehicles (UAVs) play vital roles in traffic surveillance, disaster management, and border security, highlighting the importance of reliable infrared–visible image detection under complex illumination conditions. However, UAV-based infrared–visible detection still faces challenges in multi-scale target recognition, robustness to lighting variations, and efficient cross-modal information utilization. To address these issues, this study proposes a lightweight Dual-modality Adaptive Pyramid Transformer (DAP) module integrated into the YOLOv8 framework. The DAP module employs a hierarchical self-attention mechanism and a residual fusion structure to achieve adaptive multi-scale representation and cross-modal semantic alignment while preserving modality-specific features. This design enables effective feature fusion at reduced computational cost, enhancing detection accuracy in complex environments. Experiments on the DroneVehicle and LLVIP datasets demonstrate that the proposed DAP-based YOLOv8 achieves mAP50:95 scores of 61.2% and 62.1%, respectively, outperforming conventional methods. The results validate the capability of the DAP module to optimize cross-modal feature interaction and improve real-time infrared–visible target detection, offering a practical and efficient solution for UAV applications such as traffic monitoring and disaster response.
(This article belongs to the Section Remote Sensors)

27 pages, 6470 KB  
Article
Lightweight YOLO-SR: A Method for Small Object Detection in UAV Aerial Images
by Sirong Liang, Xubin Feng, Meilin Xie, Qiang Tang, Haoran Zhu and Guoliang Li
Appl. Sci. 2025, 15(24), 13063; https://doi.org/10.3390/app152413063 - 11 Dec 2025
Abstract
To address challenges in small object detection within drone aerial imagery—such as sparse feature information, intense background interference, and drastic scale variations—this paper proposes YOLO-SR, a lightweight detection algorithm based on attention enhancement and feature reuse mechanisms. First, we designed the lightweight feature extraction module C2f-SA, which incorporates Shuffle Attention. By integrating channel shuffling and grouped spatial attention mechanisms, this module dynamically enhances edge and texture feature responses for small objects, effectively improving the discriminative power of shallow-level features. Second, the Spatial Pyramid Pooling Attention (SPPC) module captures multi-scale contextual information through spatial pyramid pooling; combined with dual-path (channel and spatial) attention mechanisms, it optimizes feature representation while significantly suppressing complex background interference. Finally, the detection head employs a decoupled architecture separating classification and regression tasks, supplemented by a dynamic loss weighting strategy to mitigate small object localization inaccuracies. Experimental results on the RGBT-Tiny dataset demonstrate that, compared to the baseline YOLOv5s, our algorithm achieves a 5.3% improvement in precision, a 13.1% increase in recall, and gains of 11.5% and 22.3% in mAP0.5 and mAP0.75, respectively, while reducing the number of parameters by 42.9% (from 7.0 × 10⁶ to 4.0 × 10⁶) and the computational cost by 37.2% (from 60.0 GFLOPs to 37.7 GFLOPs). The comprehensive improvement across multiple metrics validates the superiority of the proposed algorithm in both accuracy and efficiency.
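The channel shuffling used inside Shuffle Attention is typically a reshape–transpose–reshape that interleaves channels across groups so grouped branches can exchange information. An illustrative NumPy sketch (not the authors' implementation; the layout is an assumption):

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave channels across groups: (C, H, W) -> (C, H, W)
    via reshape to (groups, C//groups, H, W), swap the first two axes,
    and flatten back."""
    c, h, w = x.shape
    return (x.reshape(groups, c // groups, h, w)
             .transpose(1, 0, 2, 3)
             .reshape(c, h, w))

x = np.arange(8).reshape(8, 1, 1)  # channel i holds the value i
shuffled = channel_shuffle(x, groups=2)
```

With 8 channels in 2 groups, the channel order becomes 0, 4, 1, 5, 2, 6, 3, 7, so each subsequent grouped operation sees channels from both original groups.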

17 pages, 1940 KB  
Article
Detection and Segmentation of Chip Budding Graft Sites in Apple Nursery Using YOLO Models
by Magdalena Kapłan, Damian I. Wójcik and Kamil Buczyński
Agriculture 2025, 15(24), 2565; https://doi.org/10.3390/agriculture15242565 - 11 Dec 2025
Viewed by 142
Abstract
The use of convolutional neural networks in nursery production remains limited, emphasizing the need for advanced vision-based approaches to support automation. This study evaluated the feasibility of detecting chip-budding graft sites in apple nurseries using YOLO object detection and segmentation models. A dataset of 3630 RGB images of budding sites was collected under variable field conditions. The models achieved high detection precision and consistent segmentation performance, confirming strong convergence and structural maturity across YOLO generations. The YOLO12s model demonstrated the most balanced performance, combining high precision with superior localization accuracy, particularly under higher Intersection-over-Union threshold conditions. In the segmentation experiments, both architectures achieved nearly equivalent performance, with only minor variations observed across evaluation metrics. The YOLO11s-seg model showed slightly higher precision and overall stability, whereas YOLOv8s-seg retained a small advantage in recall. Inference efficiency was assessed on both high-performance (RTX 5080) and embedded (Jetson Orin NX) platforms. YOLOv8s achieved the highest inference efficiency with minimal latency, while TensorRT optimization further improved throughput and reduced latency across all YOLO models. These results demonstrate that framework-level optimization can provide substantial practical benefits. The findings confirm the suitability of YOLO-based methods for precise detection of grafting sites in apple nurseries and establish a foundation for developing autonomous systems supporting nursery and orchard automation. Full article
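The "higher Intersection-over-Union threshold" criterion used to compare localization accuracy reduces to a standard box-overlap ratio. A minimal sketch of that metric for axis-aligned boxes (illustrative helper, not tied to any particular YOLO implementation):

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle; width/height clamp to 0 when boxes are disjoint.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping in a 5x5 corner: IoU = 25 / 175
score = iou((0, 0, 10, 10), (5, 5, 15, 15))
```

A detection counts as correct only when its IoU with a ground-truth box exceeds the chosen threshold, so raising the threshold rewards models with tighter localization.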

15 pages, 11915 KB  
Article
Weld Seam ROI Detection and Segmentation Method Based on Active–Passive Vision Fusion
by Ming Hu, Xiangtao Hu, Jiuzhou Zhao and Honghui Zhan
Sensors 2025, 25(24), 7530; https://doi.org/10.3390/s25247530 - 11 Dec 2025
Viewed by 165
Abstract
Rapid detection and precise segmentation of the weld seam region of interest (ROI) remain a core challenge in robotic intelligent grinding. To address this issue, this paper proposes a method for weld seam ROI detection and segmentation based on the fusion of active and passive vision. The proposed approach primarily consists of two stages: weld seam image instance segmentation and weld seam ROI point cloud segmentation. In the image segmentation stage, an enhanced segmentation network is constructed by integrating a convolutional attention module into YOLOv8n-seg, which effectively improves the localization accuracy and mask extraction quality of the weld seam region. In the point cloud segmentation stage, the 3D point cloud is first mapped onto a 2D pixel plane to achieve spatial alignment. Subsequently, a coarse screening of the projected point cloud is performed based on the bounding boxes output from the instance segmentation, eliminating a large amount of redundant data. Furthermore, a grayscale matrix is constructed based on the segmentation masks, enabling precise extraction of the weld seam ROI point cloud through point-wise discrimination. Experimental results demonstrate that the proposed method achieves high-quality segmentation of the weld seam region, providing a reliable foundation for robotic automated grinding. Full article
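The coarse-to-fine idea of projecting the 3D point cloud onto the image plane and then keeping only points whose pixels fall inside the segmentation mask can be sketched as follows. This is a minimal NumPy illustration assuming a standard pinhole camera with intrinsics `K`; the function name and interface are hypothetical, not the paper's code:

```python
import numpy as np

def mask_filter_points(points, K, mask):
    """Keep 3D points whose pixel projections land inside a binary mask.

    points: (N, 3) array in the camera frame with z > 0.
    K: 3x3 pinhole intrinsics matrix.
    mask: (H, W) boolean segmentation mask (True = weld seam region).
    """
    z = points[:, 2]
    uv = (K @ points.T).T          # homogeneous pixel coordinates (u*z, v*z, z)
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    h, w = mask.shape
    inb = (u >= 0) & (u < w) & (v >= 0) & (v < h)  # drop off-image projections
    keep = np.zeros(len(points), dtype=bool)
    keep[inb] = mask[v[inb], u[inb]]               # point-wise mask lookup
    return points[keep]

# Toy example: only points projecting to the single "on" pixel survive.
K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
mask = np.zeros((100, 100), dtype=bool)
mask[50, 50] = True
pts = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 0.0, 2.0]])
kept = mask_filter_points(pts, K, mask)
```

In the paper's pipeline a bounding-box pre-filter would run before this mask lookup, discarding most points cheaply before the per-point discrimination step.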
