Search Results (386)

Search Parameters:
Keywords = Light-YOLO

25 pages, 2513 KB  
Article
YOLO-DAA: Directional Area Attention for Lightweight Tiny Object Detection in Maritime UAV Imagery
by Kuan-Chou Chen, Vinay Malligere Shivanna and Jiun-In Guo
Drones 2026, 10(4), 283; https://doi.org/10.3390/drones10040283 - 14 Apr 2026
Abstract
Tiny object detection in maritime Unmanned Aerial Vehicle (UAV) imagery remains challenging due to low-resolution targets, dynamic lighting, and vast water backgrounds that obscure fine spatial cues. This study introduces You Only Look Once – Directional Area Attention (YOLO-DAA), a lightweight yet direction-aware detection framework designed to enhance spatial reasoning and feature discrimination for maritime environments. The proposed model integrates two key components: the Spatial Reconstruction Unit (SRU), which dynamically filters redundant activations and reconstructs informative spatial features, and the Directional Area Attention (DAA), which introduces controllable row–column attention to model anisotropic dependencies. Together, they enable the network to capture orientation-sensitive structures such as elongated vessels and vertically aligned swimmers while maintaining real-time efficiency. Experimental results on the Common Objects in Context (COCO) and SeaDronesSee datasets demonstrate that YOLO-DAA achieves significant improvements in both precision and recall, outperforming the YOLOv12-turbo baseline across multiple scales. In particular, the lightweight YOLO-DAA-n variant achieves a 12.5% AP95 gain on SeaDronesSee with minimal computational overhead. The findings confirm that directional attention and spatial reconstruction jointly enhance the representation of tiny maritime targets, offering an effective balance between accuracy and efficiency for real-world UAV deployments.

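The controllable row–column attention that DAA describes can be pictured with a short PyTorch sketch. The module below is an illustrative reconstruction, not the authors' code: each pixel attends along its own row and its own column, and a learned gate mixes the two directions.

```python
import torch
import torch.nn as nn

class DirectionalAreaAttentionSketch(nn.Module):
    """Illustrative row-column attention: each pixel attends along its own row
    and its own column; a learned gate balances the two directions."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.gate = nn.Parameter(torch.zeros(2))  # row/column mixing weights

    def _attend(self, q, k, v):
        # q, k, v: (batch, length, channels) -> attention along `length`
        scores = q @ k.transpose(1, 2) / q.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ v

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Row direction: sequences of length w, one per (image, row)
        row = self._attend(*(t.permute(0, 2, 3, 1).reshape(b * h, w, c) for t in (q, k, v)))
        row = row.reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Column direction: sequences of length h, one per (image, column)
        col = self._attend(*(t.permute(0, 3, 2, 1).reshape(b * w, h, c) for t in (q, k, v)))
        col = col.reshape(b, w, h, c).permute(0, 3, 2, 1)
        g = torch.softmax(self.gate, dim=0)
        return x + g[0] * row + g[1] * col
```
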
26 pages, 4957 KB  
Article
Detection of Traffic Lights and Status (Red, Yellow and Green) in Images with Different Environmental Conditions Using Architectures from Yolov8 to Yolov12
by Julio Saucedo-Soto, Viridiana Hernández-Herrera, Moisés Márquez-Olivera, Octavio Sánchez-García and Antonio-Gustavo Juárez-Gracia
Vehicles 2026, 8(4), 90; https://doi.org/10.3390/vehicles8040090 - 10 Apr 2026
Abstract
Given that approximately 70% of traffic accidents are attributable to driver-related factors, it is necessary for vehicles to incorporate technologies that reduce risk through preventive actions derived from traffic-scene analysis. Interpreting the driving environment is non-trivial and is commonly decomposed into sub-tasks; among them, traffic light perception is critical due to its role in regulating vehicular flow. This paper evaluates five YOLO CNN families (YOLOv8–YOLOv12) on two tasks: (i) traffic light detection and (ii) traffic light state recognition (green, yellow, red). The evaluation uses a hybrid dataset comprising the public LISA traffic light dataset and a custom dataset with images from Mexico City captured under diverse lighting conditions, a relevant setting given the city's high traffic intensity. The results show mAP@0.50 = 94.4–96.3% for traffic light detection and mAP@0.50 = 99.3–99.4% for traffic light state recognition, indicating that modern YOLO variants provide highly reliable performance for both tasks under natural illumination variability.
(This article belongs to the Special Issue AI-Empowered Assisted and Autonomous Driving)

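For the state-recognition task, inference with a trained Ultralytics model reduces to reading the predicted class of each detected light. A minimal sketch, assuming hypothetical weights (tl_state_best.pt) and a placeholder image path:

```python
from ultralytics import YOLO

model = YOLO("tl_state_best.pt")            # hypothetical fine-tuned weights
result = model("intersection.jpg")[0]       # placeholder test image
for box in result.boxes:
    state = result.names[int(box.cls)]      # e.g. "green", "yellow", "red"
    print(f"{state}: conf={float(box.conf):.2f}, xyxy={box.xyxy[0].tolist()}")
```
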
22 pages, 4431 KB  
Article
LA-YOLO: Robust Tea-Shoot Detection Under Dynamic Illumination via Input Illumination Stabilization and Discriminative Feature Learning
by Menghua Liu, Fanghua Liu and Junchao Chen
Agriculture 2026, 16(7), 809; https://doi.org/10.3390/agriculture16070809 - 4 Apr 2026
Abstract
Accurate tea-shoot detection in real tea gardens is essential for intelligent harvesting, yet dynamic illumination (low light, strong light, and shadows) can cause brightness/contrast fluctuations and feature distribution shifts, degrading detection stability and localization accuracy. This paper proposes LA-YOLO, a dynamic-light tea-shoot detector based on YOLOv11. First, we construct a dynamic-light benchmark dataset and a difficulty-stratified evaluation protocol with four single-light subsets (A–D) and a mixed-light subset (E). Second, we design LA-CSNorm, an input-side brightness-adaptive preprocessing module that applies gated enhancement to dark samples followed by channel-selective normalization to reduce illumination-induced drift. Third, we propose RECA, a residual efficient channel-attention module to enhance discriminative channels and improve localization stability. Ablation studies show that LA-CSNorm and RECA provide complementary gains, and their combination improves the YOLOv11 baseline to 0.831 mAP@0.5 and 0.621 mAP@0.5:0.95, with only 0.01 M additional parameters. On the mixed-light subset E, LA-YOLO achieves 0.816 mAP@0.5 and 0.613 mAP@0.5:0.95, and consistently outperforms mainstream YOLO variants (e.g., YOLOv11m) under dynamic lighting conditions. These results demonstrate that LA-YOLO offers a robust and deployment-friendly solution for tea-shoot detection in complex natural illumination.

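RECA is described as a residual efficient channel-attention module. A plausible reading, sketched below in the spirit of ECA-Net (the exact LA-YOLO formulation may differ), adds an identity path around the usual channel re-weighting:

```python
import torch
import torch.nn as nn

class RECASketch(nn.Module):
    """ECA-style channel attention with a residual connection: a 1-D conv over
    the pooled channel descriptor yields per-channel weights (sketch only)."""
    def __init__(self, k=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, 1, c)                     # squeeze to (b, 1, c)
        w = torch.sigmoid(self.conv(w)).view(b, c, 1, 1)   # excite per channel
        return x + x * w                                   # identity path keeps the original signal
```
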
20 pages, 7512 KB  
Article
PDA-YOLO: An Early Detection Method for Egg Fertilization Rate Based on Position-Decoupled Attention
by Yifan Zhou, Zhengxiang Shi, Geqi Yan, Haiqing Peng, Fuwei Li, Wei Liu and Dapeng Li
Agriculture 2026, 16(7), 784; https://doi.org/10.3390/agriculture16070784 - 2 Apr 2026
Abstract
This study addresses the inefficiencies, subjectivity, and poor adaptability to lighting variations inherent in traditional candling methods used in large-scale egg incubation. We developed a high-throughput transmissive imaging system capable of capturing 30 eggs simultaneously. Based on this system, we propose PDA-YOLO, an enhanced YOLOv8-based object detection model featuring a position-decoupled attention strategy. Specifically, a lightweight C2f-SE module is integrated into the backbone to amplify subtle feature responses in low-contrast regions, while a Convolutional Block Attention Module (CBAM) is deployed prior to the detection head to mitigate background clutter through precise spatial attention. Experimental results on a self-constructed Hailan White egg dataset show that at the critical 60 h incubation stage, PDA-YOLO achieves a Recall of 91.5% and an mAP@0.5 of 97.4%, outperforming the YOLOv8 baseline while maintaining a real-time inference speed of 62.1 FPS. Grad-CAM visualizations confirm the model's ability to focus on vascular textures and suppress noise. Furthermore, the model demonstrates robust performance under varying illumination (180–540 lumens), effectively mitigating missed detections in low light and recognition degradation from overexposure. This work provides a scalable, real-time solution for non-destructive, early-stage detection of poultry health and fertilization status in commercial hatcheries.
(This article belongs to the Special Issue Computer Vision Analysis Applied to Farm Animals)

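The C2f-SE module presumably wraps a standard squeeze-and-excitation block around the C2f bottleneck; the generic SE block looks like this (a sketch, not the paper's exact wiring):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: pool to a channel descriptor, pass it through a
    bottleneck MLP, and rescale the feature map channel-wise."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)
```
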
23 pages, 7126 KB  
Article
Dual-Modal Chicken Mortality Detection Using Dynamic Hybrid Convolution-Based Feature Fusion
by Tian Hua, Qian Fan, Runhao Chen, Yulin Bi, Hao Bai, Zhixiu Wang, Guobin Chang and Wenming Zhao
Animals 2026, 16(7), 1057; https://doi.org/10.3390/ani16071057 - 31 Mar 2026
Abstract
In large-scale caged broiler farms, daily inspection of dead broilers is essential for flock health management and disease prevention. To address the significant performance degradation of existing methods under challenging conditions such as poor lighting, severe occlusion, and complex backgrounds, this paper proposes a dual-modal dynamic hybrid convolutional feature fusion method for dead bird detection based on an improved YOLO11 framework, termed YOLO11-DualDynConv-FF. First, a dual-modal fusion network architecture was developed to combine RGB and infrared (IR) images, enabling the model to simultaneously process both modalities. By integrating complementary information from RGB and IR data, the proposed method significantly improved detection accuracy and efficiency under low-light conditions. Second, a dynamic hybrid convolution feature fusion module was designed to merge multi-scale feature maps with contextual information, allowing the network to capture fine-grained details and adapt better to complex farming environments. In addition, an occlusion-aware module was introduced to specifically address the physical occlusion challenges prevalent in crowded cage settings. Comparative experiments and ablation studies involving multiple object detection networks were conducted to evaluate the proposed method. The results show that the improved YOLO11 model achieves superior performance, with precision, recall, F1-score, and mAP@0.5 reaching 92.6%, 79.0%, 0.85, and 80.1%, respectively. These results represent improvements of 2.0%, 5.0%, 0.17, and 12.1%, respectively, over the original YOLO11 model. Compared with existing approaches, the proposed model is better suited to complex real-world poultry farming environments and achieves higher detection accuracy, providing a valuable reference for intelligent monitoring in caged poultry farming.
(This article belongs to the Section Poultry)

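The dual-modal idea, two input streams whose features are merged before detection, can be illustrated with a toy fusion stem (the paper's dynamic hybrid convolution is more elaborate; the names below are hypothetical):

```python
import torch
import torch.nn as nn

class DualStreamStem(nn.Module):
    """Toy RGB+IR fusion: separate stems per modality, concatenation, then a
    1x1 conv to mix the two feature sets into one map for the detector."""
    def __init__(self, c_out=64):
        super().__init__()
        self.rgb_stem = nn.Conv2d(3, c_out, 3, stride=2, padding=1)
        self.ir_stem = nn.Conv2d(1, c_out, 3, stride=2, padding=1)
        self.mix = nn.Conv2d(2 * c_out, c_out, 1)

    def forward(self, rgb, ir):
        fused = torch.cat([self.rgb_stem(rgb), self.ir_stem(ir)], dim=1)
        return self.mix(fused)
```
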
24 pages, 4742 KB  
Article
Comparative Evaluation of YOLOv8 and YOLO11 for Image-Based Classification of Sugar Beet Seed Treatment Levels
by Cihan Unal, Ilkay Cinar, Zulfi Saripinar and Murat Koklu
Sensors 2026, 26(7), 2137; https://doi.org/10.3390/s26072137 - 30 Mar 2026
Abstract
This study addresses the automatic classification of sugar beet seeds according to their spraying levels using RGB images, aiming to enable a fast, practical, and non-destructive early warning system without chemical analysis. A dataset of 16,519 seed images acquired under controlled lighting conditions was used to evaluate YOLOv8-CLS and YOLO11-CLS architectures, including the n, s, m, l, and x scale variants within the Ultralytics framework. All experiments were conducted using a 10-fold cross-validation strategy, with models trained under different batch size and learning rate configurations. The results indicate that both architectures achieve reliable performance, with accuracy values ranging from approximately 78–83% for YOLOv8-CLS and 80–82% for YOLO11-CLS models. ROC-AUC scores consistently above 0.94 demonstrate strong inter-class discrimination. Misclassification analysis shows that errors mainly occur between visually similar intermediate treatment levels, particularly 25% and 50%. Despite this challenge, low log-loss values and balanced precision–recall profiles indicate stable decision behavior. Overall, the findings confirm that sugar beet seed treatment levels can be effectively distinguished using only RGB imagery, providing a potentially low-cost and scalable approach for early warning and quality control in seed treatment processes.
(This article belongs to the Section Smart Agriculture)

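The reported metrics (accuracy, ROC-AUC, log-loss) are standard scikit-learn computations over the pooled cross-validation predictions. A sketch with dummy four-class data standing in for the per-fold model outputs:

```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 4, 500)           # dummy labels: four treatment levels
y_prob = rng.dirichlet(np.ones(4), 500)    # dummy class-probability outputs

print("accuracy:", accuracy_score(y_true, y_prob.argmax(axis=1)))
print("log-loss:", log_loss(y_true, y_prob))
print("ROC-AUC (one-vs-rest):", roc_auc_score(y_true, y_prob, multi_class="ovr"))
```
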
27 pages, 6255 KB  
Article
Lightweight Safety Helmet Wearing Detection Algorithm Based on GSA-YOLO
by Haodong Wang, Qiang Zhou, Zhiyuan Hao, Wentao Xiao and Luqing Yan
Sensors 2026, 26(7), 2110; https://doi.org/10.3390/s26072110 - 28 Mar 2026
Abstract
Electric power station confined spaces are high-risk and complex environments characterized by significant illumination variations. Whether safety helmets are properly worn directly affects the operational safety of workers in confined spaces. However, helmet detection in such environments faces several challenges, including drastic lighting changes and difficulties in small-object detection. Moreover, existing object detection models typically contain a large number of parameters, making real-time helmet detection difficult to deploy on field devices with limited computational resources. To address these issues, this paper proposes a lightweight safety helmet wearing detection algorithm named GSA-YOLO. To mitigate the effects of severe illumination variation and detail loss in confined spaces, a GCA-C2f module integrating GhostConv and the CBAM attention mechanism is embedded into the backbone network. This design reduces the number of parameters and computational cost while enhancing the model's feature extraction capability under challenging lighting conditions. To improve detection performance for occluded targets, an improved efficient channel attention (I-ECA) mechanism is introduced into the neck structure, which suppresses irrelevant channel features and enhances occluded object detection accuracy. Furthermore, to alleviate missed detections of small objects and inaccurate localization under low-light conditions, a P2 detection branch is added to the head, and the WIoU loss function is adopted to dynamically adjust the weights of hard and easy samples, thereby improving small-object detection accuracy and localization robustness. A confined space helmet detection dataset containing 5000 images was constructed through on-site data collection for model training and validation. Experimental results demonstrate that the proposed GSA-YOLO achieves an mAP@0.5 of 91.2% on the self-built dataset with only 2.3 M parameters, outperforming the baseline model by 2.9% while reducing the parameter count by 23.6%. The experimental results verify that the proposed algorithm is suitable for environments with significant illumination variation and small-object detection challenges. It provides a lightweight and efficient solution for on-site helmet detection in confined space scenarios, thereby helping to reduce industrial safety accidents.
(This article belongs to the Section Sensing and Imaging)

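GhostConv, one half of the GCA-C2f module, halves the cost of a convolution by deriving half of the output channels with a cheap depthwise operation. A generic sketch of the GhostNet-style block, not necessarily the paper's exact variant:

```python
import torch
import torch.nn as nn

class GhostConvSketch(nn.Module):
    """A primary conv produces half the output channels; a cheap 5x5 depthwise
    conv generates the remaining "ghost" half, and the two are concatenated."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```
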
27 pages, 4296 KB  
Article
Research on Lightweight Apple Detection and 3D Accurate Yield Estimation for Complex Orchard Environments
by Bangbang Chen, Xuzhe Sun, Xiangdong Liu, Baojian Ma and Feng Ding
Horticulturae 2026, 12(3), 393; https://doi.org/10.3390/horticulturae12030393 - 22 Mar 2026
Abstract
Severe foliage occlusion and dynamically changing lighting conditions in complex orchard environments pose significant challenges for visual perception systems in automated apple harvesting, including low detection accuracy, poor robustness, and insufficient real-time performance. To address these issues, this study proposes an improved lightweight detection network based on YOLOv11, named YOLO-WBL, along with a precise yield estimation algorithm based on 3D point clouds, termed CLV. The YOLO-WBL network is optimized in three aspects: (1) A C3K2_WT module integrating wavelet transform is introduced into the backbone network to enhance multi-scale feature extraction capability; (2) A weighted bidirectional feature pyramid network (BiFPN) is adopted in the neck network to improve the efficiency of multi-scale feature fusion; (3) A lightweight detection head with shared convolution and separated batch normalization (Detect-SCGN) is designed to significantly reduce the parameter count while maintaining accuracy. Based on this detection model, the CLV algorithm deeply integrates depth camera point cloud information through 3D coordinate mapping, irregular point cloud reconstruction, and convex hull volume calculation to achieve accurate estimation of individual fruit volume and total yield. Experimental results demonstrate the following: (1) The YOLO-WBL model achieves a precision of 93.8%, recall of 79.3%, and mean average precision (mAP@0.5) of 87.2% on the apple test set; (2) The model size is only 3.72 MB, a reduction of 28.87% compared to the baseline model; (3) When deployed on an NVIDIA Jetson Xavier NX edge device, its inference speed reaches 8.7 FPS, meeting real-time requirements; (4) In scenarios with an occlusion rate below 40%, the mean absolute percentage error (MAPE) of yield estimation can be controlled within 8%. Experimental validation was conducted using apple images selected from the dataset under varying lighting intensities and fruit occlusion conditions. The results demonstrate that the CLV algorithm significantly outperforms traditional average-weight-based estimation methods. This study provides an efficient, accurate, and deployable visual solution for intelligent apple harvesting and yield estimation in complex orchard environments, offering practical reference value for advancing smart orchard production.
(This article belongs to the Special Issue AI for a Precision and Resilient Horticulture)

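The convex-hull volume step of the CLV algorithm maps directly onto scipy. A sketch with a dummy point cloud standing in for the fruit points recovered from the depth camera:

```python
import numpy as np
from scipy.spatial import ConvexHull

points = np.random.rand(500, 3) * 0.08   # dummy fruit point cloud (~8 cm cube, meters)
hull = ConvexHull(points)
volume_ml = hull.volume * 1e6            # m^3 -> milliliters
print(f"estimated fruit volume: {volume_ml:.1f} mL")
```
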
28 pages, 14845 KB  
Article
Spatial Relation Reasoning Based on Keypoints for Railway Intrusion Detection and Risk Assessment
by Shanping Ning, Feng Ding and Bangbang Chen
Appl. Sci. 2026, 16(6), 3026; https://doi.org/10.3390/app16063026 - 20 Mar 2026
Abstract
Foreign object intrusion in railway tracks is a major threat to train operation safety, yet current detection methods face challenges in identifying small distant targets and adapting to low-light conditions. Moreover, existing systems often lack the ability to assess intrusion risk levels, limiting real-time warning and graded response capabilities. To address these gaps, this paper proposes a novel method for intrusion detection and risk assessment based on keypoint spatial discrimination. First, an XS-BiSeNetV2-based track segmentation network is developed, incorporating cross-feature fusion and spatial feature recalibration to improve track extraction accuracy in complex scenes. Second, an enhanced STI-YOLO detection model is introduced, integrating a Shuffle attention mechanism for better feature interaction, a high-resolution Transformer detection head to improve small-target sensitivity, and the Inner-IoU loss function to refine bounding box regression. Detected targets' bottom keypoints are then analyzed relative to track boundaries to determine intrusion direction. By combining lateral distance and motion state features, a multi-level risk classification system is established for quantitative threat assessment. Experiments on the RailSem19 and GN-rail-Object datasets show that the method achieves a track segmentation mIoU of 88.19% and a detection mAP of 82.6%. The risk assessment module effectively quantifies threats across scenarios and maintains stable performance under low-light and strong-glare conditions. This work offers a quantifiable risk assessment solution for intelligent railway safety systems.

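The keypoint-based spatial discrimination reduces to a point-in-polygon test against the segmented track region, plus a lateral distance to the boundary. A sketch with shapely and dummy coordinates:

```python
from shapely.geometry import Point, Polygon

track = Polygon([(100, 0), (300, 0), (400, 720), (0, 720)])  # dummy track region outline
bottom_keypoint = Point(220, 500)                            # detected box's bottom-center

intruding = track.contains(bottom_keypoint)
lateral_px = bottom_keypoint.distance(track.exterior)        # distance to track boundary
print("intruding" if intruding else "clear", f"(lateral distance: {lateral_px:.1f} px)")
```
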
22 pages, 6052 KB  
Article
HSMD-YOLO: An Anti-Aliasing Feature-Enhanced Network for High-Speed Microbubble Detection
by Wenda Luo, Yongjie Li and Siguang Zong
Algorithms 2026, 19(3), 234; https://doi.org/10.3390/a19030234 - 20 Mar 2026
Abstract
Underwater micro-bubble detection entails multiple challenges, including diminutive target sizes, sparse pixel information, pronounced specular highlights and water scattering, indistinct bubble boundaries, and adhesion or overlap between instances. To address these issues, we propose HSMD-YOLO, an improved detector tailored for high-resolution micro-bubble detection and built upon YOLOv11. The model incorporates three novel components: the Scale Switch Block (SSB), a scale-transformation module that suppresses artifacts and background noise, thereby stabilizing edges in thin-walled bubble regions and enhancing sensitivity to geometric contours; the Global Local Refine Block (GLRB), which achieves efficient global relationship modeling with an asymptotic linear complexity (O(N)) in spatial dimensions while further refining local features, thereby strengthening boundary perception and improving bubble–background separability; and the Bidirectional Exponential Moving Attention Fusion (BEMAF), which accommodates the multi-scale nature of bubbles by employing a parallel multi-kernel architecture to extract spatial features across scales, coupled with a multi-stage EMA-based attention mechanism to enhance detection robustness under weak boundaries and complex backgrounds. Experiments conducted on a Side-Illuminated Light Field Bubble Database (SILB-DB) and a public gas–liquid two-phase flow dataset (GTFD) demonstrate that HSMD-YOLO achieves mAP@50 scores of 0.911 and 0.854, respectively, surpassing mainstream detection methods. Ablation studies indicate that SSB, GLRB, and BEMAF contribute performance gains of 1.3%, 2.0%, and 0.4%, respectively, thereby corroborating the effectiveness of each module for micro-scale object detection.
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)

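BEMAF's parallel multi-kernel extraction can be pictured as several depthwise convolutions of different kernel sizes applied to the same input and summed; the sketch below shows only that skeleton, not the EMA-based attention stages:

```python
import torch.nn as nn

class MultiKernelBranches(nn.Module):
    """Parallel depthwise convs at several kernel sizes, summed with a residual
    path, to cover bubbles across a range of scales (skeleton only)."""
    def __init__(self, channels, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels, bias=False)
            for k in kernels)

    def forward(self, x):
        return x + sum(branch(x) for branch in self.branches)
```
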
20 pages, 3218 KB  
Article
MIP-YOLO11: An Underwater Object Detection Model Based on Improved YOLO11
by Xinyu Qu, Ying Shao, Zheng Wang and Man Chang
J. Mar. Sci. Eng. 2026, 14(6), 572; https://doi.org/10.3390/jmse14060572 - 19 Mar 2026
Abstract
Due to challenges such as inadequate lighting, water scattering, high density of small objects, and complex object morphology in underwater environments, traditional YOLO11 models face difficulties including interference from complex backgrounds, weak perception of small objects, and insufficient feature extraction when applied underwater. This paper proposes an improved MIP-YOLO11 model for underwater object detection based on the YOLO11 framework. First, an MCEA module is designed in the backbone network to replace the basic CBS convolution module. Through a lightweight multi-branch convolutional structure, the perception ability for small objects, object edges, contours, and morphological features in underwater scenes is enhanced without significantly increasing computational overhead. Second, an IMCA module based on the coordinate attention mechanism is introduced at the end of the backbone network to replace the C2PSA module, reducing the number of model parameters while maintaining detection accuracy. Finally, the Bottleneck module in C3k2 is improved by incorporating a PConv and a dual residual connection mechanism, thereby expanding the receptive field and enhancing the efficiency of complex feature extraction. Experimental results demonstrate that MIP-YOLO11 significantly outperforms the traditional YOLO11 in underwater environments. Precision (P) and recall (R) are improved by 2.5% and 4.1%, respectively. Moreover, the mAP@0.5 and mAP@0.5:0.95 metrics are increased by 4.2% and 7.5%, respectively. The improved model achieves a good balance between high accuracy and a lightweight design, and can provide a more reliable underwater object detection scheme for AUV underwater detection and other application scenarios.
(This article belongs to the Section Ocean Engineering)

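The PConv used in the improved C3k2 bottleneck follows the FasterNet idea: convolve only a fraction of the channels and forward the rest untouched, trading a small accuracy cost for far fewer FLOPs. A generic sketch:

```python
import torch
import torch.nn as nn

class PConvSketch(nn.Module):
    """FasterNet-style partial convolution: a 3x3 conv touches the first
    1/n_div of the channels; the remainder passes through unchanged."""
    def __init__(self, channels, n_div=4):
        super().__init__()
        self.c_conv = channels // n_div          # assumes channels % n_div == 0
        self.conv = nn.Conv2d(self.c_conv, self.c_conv, 3, padding=1, bias=False)

    def forward(self, x):
        x1, x2 = x[:, :self.c_conv], x[:, self.c_conv:]
        return torch.cat([self.conv(x1), x2], dim=1)
```
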
25 pages, 6302 KB  
Article
Artificial Intelligence-Based Detection of On-Ground Chestnuts Toward Automated Picking
by Kaixuan Fang, Yuzhen Lu and Xinyang Mu
AgriEngineering 2026, 8(3), 116; https://doi.org/10.3390/agriengineering8030116 - 19 Mar 2026
Abstract
Traditional mechanized chestnut harvesting is too costly for small producers, non-selective, and prone to damaging nuts. Accurate, reliable detection of chestnuts on the orchard floor is crucial for developing low-cost, vision-guided automated harvesting technology. However, developing a reliable chestnut detection system faces challenges in complex environments with shading, varying natural light conditions, and interference from weeds, fallen leaves, stones, and other foreign on-ground objects, which have remained unaddressed. This study collected 319 images of chestnuts on the orchard floor, containing 6524 annotated chestnuts. A comprehensive set of 29 state-of-the-art real-time object detectors, including 14 in the YOLO (v11–v13) and 15 in the RT-DETR (v1–v4) families at various model scales, was systematically evaluated through replicated modeling experiments for chestnut detection. Experimental results show that the YOLOv12m model achieved the best mAP@0.5 of 95.1% among all the evaluated models, while RT-DETRv2-R101 was the most accurate variant among the RT-DETR models, with mAP@0.5 of 91.1%. In terms of mAP@[0.5:0.95], the YOLOv11x model achieved the best accuracy of 80.1%. All models demonstrated significant potential for real-time chestnut detection, and YOLO models outperformed RT-DETR models in terms of both detection accuracy and inference speed, making them better suited for on-board deployment. This work lays a foundation for developing AI-based, vision-guided intelligent chestnut harvest systems.
(This article belongs to the Special Issue Applications of Computer Vision in Agriculture)

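A family comparison of this kind can be scripted against the Ultralytics API, which exposes both YOLO and RT-DETR model classes (only some of the 29 evaluated variants ship with Ultralytics; the dataset YAML below is a placeholder):

```python
from ultralytics import RTDETR, YOLO

candidates = [("yolo11m.pt", YOLO), ("yolo12m.pt", YOLO), ("rtdetr-l.pt", RTDETR)]
scores = {}
for weights, Model in candidates:
    model = Model(weights)
    model.train(data="chestnuts.yaml", epochs=100, imgsz=640)   # placeholder dataset
    metrics = model.val(data="chestnuts.yaml")
    scores[weights] = (metrics.box.map50, metrics.box.map)      # mAP@0.5, mAP@[0.5:0.95]
print(scores)
```
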
30 pages, 26587 KB  
Article
Research on Synthetic Data Methods and Detection Models for Micro-Cracks
by Yaotong Jiang, Tianmiao Wang, Xuanhe Chen and Jianhong Liang
Sensors 2026, 26(6), 1883; https://doi.org/10.3390/s26061883 - 17 Mar 2026
Abstract
Micro-crack detection on concrete surfaces is challenging because labeled micro-crack data are scarce, crack cues are extremely weak (often only a few pixels wide), and complex backgrounds (e.g., non-uniform illumination, shadows, and stains) degrade feature extraction; this study aims to improve both data availability and detection robustness for practical inspection. A Poisson image editing-based synthesis strategy is developed to generate visually coherent micro-crack samples via gradient-domain blending, and a Complex-Scene-Tolerant YOLO (CST-YOLO) detector is proposed on top of YOLOv10, following a "lighting decoupling–global perception–micro-feature enhancement" design. CST-YOLO integrates a Lighting-Adaptive Preprocessing Module (LAPM) to suppress illumination/shadow perturbations, a Spatial–Channel Sparse Transformer (SCS-Former) to model long-range crack topology efficiently, and a Small Object Focus Block (SOFB) to enhance micro-scale cues under cluttered backgrounds. Experiments are conducted on a 650-image dataset (200 real and 450 synthesized), in which synthesized samples are used only for training, and the validation/test sets contain only real images, with a 7:2:1 split. CST-YOLO achieves 0.990 mAP@0.5 and 0.926 mAP@0.5:0.95 at 139 FPS, and ablation results indicate complementary contributions from LAPM, SCS-Former, and SOFB. These results support the effectiveness of combining realistic synthesis and architecture-level robustness for real-time micro-crack detection in complex scenes.
(This article belongs to the Section Fault Diagnosis & Sensors)

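The Poisson image editing step has a close off-the-shelf analogue in OpenCV's seamlessClone, which performs the same gradient-domain blending. A sketch with placeholder file names:

```python
import cv2
import numpy as np

background = cv2.imread("concrete.jpg")        # placeholder clean surface
crack_patch = cv2.imread("crack_patch.jpg")    # placeholder crack cut-out
mask = 255 * np.ones(crack_patch.shape[:2], dtype=np.uint8)
center = (background.shape[1] // 2, background.shape[0] // 2)

# MIXED_CLONE keeps the stronger gradient, which suits thin cracks on texture.
synthetic = cv2.seamlessClone(crack_patch, background, mask, center, cv2.MIXED_CLONE)
cv2.imwrite("synthetic_crack.jpg", synthetic)
```
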
19 pages, 6716 KB  
Article
Multi-Type Weld Defect Detection in Galvanized Sheet MIG Welding Using an Improved YOLOv10 Model
by Bangzhi Xiao, Yadong Yang, Yinshui He and Guohong Ma
Materials 2026, 19(6), 1178; https://doi.org/10.3390/ma19061178 - 17 Mar 2026
Abstract
Shop-floor weld inspection may appear to be a solved problem until a camera is deployed near a galvanized-sheet MIG welding line. The seam reflects light, the texture changes from frame to frame, and the defects of interest are often small and visually subtle. Additionally, the hardware near the line is rarely a data-center GPU. With those constraints in mind, this paper presents YOLO-MIG, a compact detector built on YOLOv10n for weld-seam inspection in practical production conditions. We make three focused changes to the baseline: a C2f-EMSCP backbone block to better preserve weak defect cues with modest parameter growth, a BiFPN neck to keep small-target information alive during feature fusion, and a C2fCIB head to clean up predictions that otherwise get distracted by seam edges and illumination artifacts. On a workshop-collected dataset containing 326 original images, with the training subset expanded through augmentation to 2608 labeled samples in total, YOLO-MIG achieves 98.4% mAP@0.5 and 56.29% mAP@0.5:0.95 on the test set while remaining lightweight (1.83 M parameters, 3.87 MB FP16 weights). Compared with YOLOv10n, the proposed model improves mAP@0.5 by 9.36 points and mAP@0.5:0.95 by 4.89 points, while reducing parameters, GFLOPs, and model size by 43.4%, 19.9%, and 29.9%, respectively. The results suggest that YOLO-MIG is not only accurate but also realistic to deploy at the edge for intelligent weld quality control.
(This article belongs to the Section Manufacturing Processes and Systems)

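Footprint figures like "1.83 M parameters, 3.87 MB FP16 weights" follow from a simple parameter count (two bytes per FP16 weight, plus some serialization overhead). A quick check of the same quantities for any Ultralytics model, here the YOLOv10n baseline:

```python
from ultralytics import YOLO

model = YOLO("yolov10n.pt").model                 # underlying nn.Module
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.2f} M parameters, ~{n_params * 2 / 1e6:.2f} MB at FP16")
```
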
25 pages, 2978 KB  
Article
Performance Analysis of the YOLO Object Detection Algorithm in Embedded Systems: Generated Code vs. Native Implementation
by Pablo Martínez Otero, Alberto Tellaeche and Mar Hernández Melero
Computation 2026, 14(3), 67; https://doi.org/10.3390/computation14030067 - 12 Mar 2026
Abstract
This paper evaluates the current maturity of automatic code-generation workflows for deploying modern CNN-based object detectors on embedded GPU platforms. We compare a native pipeline against a code generation pipeline through a Model-Based Engineering (MBE) approach, using YOLOv8/YOLOv9 inference on NVIDIA Jetson Orin Nano and Jetson AGX Orin as representative edge-GPU workloads. We report detection-quality metrics (mAP, PR curves) and system-level metrics (latency distribution and initialization overhead) under a controlled single-class scenario based on a CARLA-generated sequence with frame-level annotations. Absolute accuracy and latency values are scenario-dependent and may vary under different camera optics, illumination, motion blur, sensor noise, occlusion patterns, and multi-class scenes. Results quantify the performance gap between code generation and native pipelines and show that, for the evaluated workloads, the automated pipeline remains less competitive in both latency and accuracy. We discuss the implications of this gap for deployment workflows in safety-oriented domains, and we outline bottlenecks that should be addressed. The study is intended as a controlled traffic-light detection micro-benchmark and does not aim to validate full ADAS perception stacks.
(This article belongs to the Special Issue Object Detection Models for Transportation Systems)

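Latency distributions of the kind reported here are typically measured with a warmup phase (to exclude initialization overhead) followed by repeated timed inferences. A sketch with a placeholder model and a dummy frame:

```python
import time
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
frame = np.zeros((640, 640, 3), dtype=np.uint8)   # dummy input frame

for _ in range(20):                               # warmup: loading, allocations, caches
    model(frame, verbose=False)

latencies_ms = []
for _ in range(200):
    t0 = time.perf_counter()
    model(frame, verbose=False)
    latencies_ms.append((time.perf_counter() - t0) * 1e3)

print(f"p50 = {np.percentile(latencies_ms, 50):.1f} ms, "
      f"p99 = {np.percentile(latencies_ms, 99):.1f} ms")
```
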