Search Results (1,560)

Search Parameters:
Keywords = YOLO object detection

31 pages, 6430 KB  
Article
Glare-Aware Resi-YOLO: Tiny-Vessel Detection with Dual-Brain Edge Deployment for Maritime UAVs
by Shang-En Tsai and Chia-Han Hsieh
Drones 2026, 10(3), 226; https://doi.org/10.3390/drones10030226 - 23 Mar 2026
Abstract
Maritime UAV perception must reliably detect and track tiny vessels under harsh specular glare. In practice, detection failures are dominated by two coupled factors: (i) vessels often occupy only a few pixels, causing small-object recall collapse and (ii) sun glint and sea-surface reflections generate over-exposed regions that trigger false positives and unstable associations. This paper presents Resi-YOLO, a system-level pipeline that improves tiny-vessel sensitivity while preserving embedded throughput on a Jetson Orin Nano. At the model level, Resi-YOLO combines a P2-enhanced feature path with CBAM-based glare suppression to strengthen high-resolution semantics and suppress glare-induced artifacts; optional SAHI-style slicing is supported for ultra-high-resolution scenes. At the system level, we adopt a heterogeneous dual-brain deployment, where the Orin Nano performs primary inference and an MCU-based safety-island tracker mitigates delay/jitter via time-stamped measurement replay and IMM-UKF updates. We further define a Glare Severity Score (GSS) to stratify robustness by illumination intensity. Experiments show that Resi-YOLO improves APsmall by 13.1 percentage points over YOLOv8n (18.4% to 31.5%), raises high-glare mAP@0.5 from 41.2% to 53.7%, and runs at 12.8 FPS end-to-end (~100 ms latency) on Jetson Orin Nano, while TensorRT inference-only throughput exceeds 30 FPS. Full article
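The SAHI-style slicing mentioned for ultra-high-resolution scenes can be sketched generically: split a large frame into overlapping tiles, run the detector per tile, and merge results. The function below is an illustrative sketch of the tiling step only; the function name, tile size, and overlap default are assumptions, not values from the paper.

```python
def slice_coords(img_w, img_h, tile=640, overlap=0.2):
    """Yield (x1, y1, x2, y2) tile boxes covering an image with fractional overlap."""
    step = max(1, int(tile * (1 - overlap)))
    xs = list(range(0, max(img_w - tile, 0) + 1, step))
    ys = list(range(0, max(img_h - tile, 0) + 1, step))
    # Make sure the right and bottom image edges are covered by a final tile.
    if xs[-1] + tile < img_w:
        xs.append(img_w - tile)
    if ys[-1] + tile < img_h:
        ys.append(img_h - tile)
    return [(x, y, min(x + tile, img_w), min(y + tile, img_h))
            for y in ys for x in xs]
```

Detections from each tile would then be shifted back by the tile origin and de-duplicated with NMS across tiles.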

20 pages, 7591 KB  
Article
Research on Landslide Hazard Detection in Ya’an Region Based on an Improved YOLO Model
by Kewei Cui, Meng Huang, Weiling Zhang, Guang Yang, Yongxiong Huang, Zhengyi Wu, Zhiwei Zhai and Chao Cheng
Remote Sens. 2026, 18(6), 957; https://doi.org/10.3390/rs18060957 - 23 Mar 2026
Abstract
Landslide hazards occur frequently in the Ya’an region; therefore, accurately identifying and delineating potential landslide areas is crucial for disaster prevention and mitigation. Although deep learning-based detection methods using optical remote sensing imagery are widely adopted, the complex terrain and diverse land cover in this area often result in blurred boundaries and weakened textural features, making it difficult to precisely define spatial extents. To overcome these challenges, this study proposes an improved YOLOv11 model for landslide detection. Building on the YOLOv11 baseline, we designed a novel Multi-Scale Detail Enhancement module and integrated it into the neck network to effectively aggregate shallow-level details with deep-level semantic information, thereby enhancing the model’s ability to represent ambiguous boundaries. Additionally, we incorporated the lightweight SimAM attention mechanism into the backbone network. This mechanism dynamically suppresses background noise based on an energy minimization principle, improving feature discriminability within landslide regions and enabling precise boundary boxes. We conducted validation experiments in the Ya’an region using a custom dataset constructed from high-resolution UAV orthoimagery, comparing our method against mainstream models such as YOLOv8 and YOLOv10. The results show that the proposed improved YOLOv11 model achieves a precision of 90.2%, a recall of 84.8%, and an mAP of 92.7%. This enhanced performance demonstrates the model’s effectiveness in detecting landslides under complex terrain conditions, providing a practical technical reference for efficient hazard screening and dynamic monitoring. Full article
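The SimAM attention used in the backbone is parameter-free: each position's weight comes from a closed-form energy-minimization objective. A minimal NumPy sketch, following the formulation in the SimAM paper (the regularizer `lam` is a tunable constant, and this is a reference implementation, not the authors' code):

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map.
    Positions far from the channel mean get low inverse energy and are
    down-weighted, which suppresses homogeneous background responses."""
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2
    v = d.sum(axis=(1, 2), keepdims=True) / n      # per-channel variance
    e_inv = d / (4 * (v + lam)) + 0.5              # inverse energy per position
    return x * (1.0 / (1.0 + np.exp(-e_inv)))      # sigmoid gating
```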

22 pages, 14276 KB  
Article
DualFOD: A Dual-Modality Deep Learning Framework for UAS-Based Foreign Object Debris Detection Using Thermal and RGB Imagery
by Owais Ahmed, Caleb S. Caldwell and Adeel Khalid
Drones 2026, 10(3), 225; https://doi.org/10.3390/drones10030225 - 23 Mar 2026
Abstract
Foreign Object Debris (FOD) poses critical risks to aircraft during takeoff and landing, resulting in billions of dollars in losses annually due to infrastructure damage and flight delays. Advancements in automated inspection technologies have enabled the use of Unmanned Aerial Systems (UAS) combined with Artificial Intelligence (AI) for rapid FOD identification. While prior research has extensively evaluated optical sensors such as RGB imaging and radar, limited work has investigated the potential of thermal imaging for improved FOD visibility under challenging environmental conditions. This study proposes DualFOD, a dual-modality detection framework that integrates a supervised YOLO12-based RGB detector with an unsupervised thermal anomaly extraction pipeline for identifying debris on runway surfaces. A decision-level fusion algorithm combines detections from both branches using spatial proximity matching to produce a unified FOD inventory. The RGB branch achieves a precision of 0.954 and mAP@0.5 of 0.890 on the held-out test set. Cross-site validation at the Cobb County Sport Aviation Complex demonstrates that thermal detection recovers debris missed by RGB at higher altitudes, with the fused output consistently outperforming either single-modality branch. This research contributes toward scalable autonomous FOD monitoring that enhances operational safety in aviation environments. Full article
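Decision-level fusion by spatial proximity matching can be sketched as below. The detection format, the distance threshold, and the greedy match policy are illustrative assumptions, not the paper's exact algorithm: an RGB detection absorbs any thermal anomaly near it, and unmatched thermal anomalies survive into the fused inventory (this is what lets thermal recover debris that RGB missed).

```python
import math

def fuse_detections(rgb_dets, thermal_dets, max_dist=30.0):
    """Decision-level fusion: each detection is (cx, cy, score).
    A thermal detection within max_dist pixels of an RGB detection is
    treated as the same object; everything unmatched is kept."""
    fused, matched = list(rgb_dets), set()
    for i, (tx, ty, ts) in enumerate(thermal_dets):
        for rx, ry, rs in rgb_dets:
            if math.hypot(tx - rx, ty - ry) <= max_dist:
                matched.add(i)
                break
    fused += [d for i, d in enumerate(thermal_dets) if i not in matched]
    return fused
```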

43 pages, 6336 KB  
Systematic Review
A Systematic Literature Review of You Only Look Once Architectures (v1–v12) in Healthcare Systems
by Ozgur Koray Sahingoz, Gozde Karatas Baydogmus and Emin Kugu
Diagnostics 2026, 16(6), 935; https://doi.org/10.3390/diagnostics16060935 - 22 Mar 2026
Abstract
Background/Objectives: The integration of deep learning and computer vision into healthcare has improved medical diagnosis and image analysis. Among object detection algorithms, the YOLO family has attracted substantial attention due to its ability to analyze images in real time with reported improvements in detection performance across multiple studies. This systematic review examines the evolution of YOLO algorithms for diagnostic applications in healthcare from YOLOv1 to YOLOv12. Methods: Peer-reviewed scientific articles published up to 1 January 2026 were retrieved from major scientific databases in accordance with PRISMA 2020 guidelines. The included studies applied YOLO models to medical imaging tasks, including disease and lesion detection and support for clinical procedures. Performance was synthesized using reported metrics such as average precision, accuracy, inference time, and computational efficiency. Results: The reviewed literature suggests progressive architectural refinements associated with reported improvements in diagnostic performance. YOLOv5 and YOLOv8 are the most frequently used architectures in diagnostic settings, reflecting a favorable trade-off between accuracy and computational complexity. YOLO-based methods have demonstrated strong performance across radiological, pathological, ophthalmological, and endoscopic applications. Conclusions: YOLO models have matured into robust and optimized solutions for medical image analysis; however, challenges remain in interpretability, cross-institution generalization, and deployment on edge devices. Future work on explainable YOLO-based diagnostics and energy-efficient model design will be particularly valuable. Full article

18 pages, 4159 KB  
Article
Advancing Breast Cancer Lesion Analysis in Real-Time Sonography Through Multi-Layer Transfer Learning and Adaptive Tracking
by Suliman Thwib, Radwan Qasrawi, Ghada Issa, Razan AbuGhoush, Hussein AlMasri and Marah Qawasmi
Mach. Learn. Knowl. Extr. 2026, 8(3), 82; https://doi.org/10.3390/make8030082 - 21 Mar 2026
Abstract
Background: Real-time and accurate analysis of breast ultrasounds is crucial for diagnosis but remains challenging due to issues like low image contrast and operator dependency. This study aims to address these challenges by developing an integrated framework for real-time lesion detection and tracking. Methods: The proposed system combines Contrast-Limited Adaptive Histogram Equalization (CLAHE) for image preprocessing, a transfer learning-enhanced YOLOv11 model following a continual learning paradigm for cross-center generalization in lesion detection, and a novel Detection-Based Tracking (DBT) approach that integrates Kernelized Correlation Filters (KCF) with periodic detection verification. The framework was evaluated on a dataset comprising 11,383 static images and 40 ultrasound video sequences, with a subset verified through biopsy and the remainder annotated by two radiologists based on radiological reports. Results: The proposed framework demonstrated high performance across all components. The transfer learning strategy (TL12) significantly improved detection outcomes, achieving a mean Average Precision (mAP) of 0.955, a sensitivity of 0.938, and an F1 score of 0.956. The DBT method (KCF + YOLO) achieved high tracking accuracy, with a success rate of 0.984, an Intersection over Union (IoU) of 0.85, and real-time operation at 54 frames per second (FPS) with a latency of 7.74 ms. The use of CLAHE preprocessing was shown to be a critical factor in improving both detection and tracking stability across diverse imaging conditions. Conclusions: This research presents a robust, fully integrated framework that bridges the gap between speed and accuracy in breast ultrasound analysis. The system’s high performance and real-time efficiency underscore its strong potential for clinical adoption to enhance diagnostic workflows, reduce operator variability, and improve breast cancer assessment. Full article
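The periodic detection verification in the DBT scheme hinges on an IoU check between the tracker's box and a fresh detector box. A minimal sketch of that check; the 0.5 re-seed threshold is an illustrative placeholder, not a value reported by the study:

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def verify_track(track_box, det_box, thresh=0.5):
    """Periodic verification: keep the KCF track while it still overlaps the
    detector output; otherwise signal that the tracker should be re-seeded."""
    return iou(track_box, det_box) >= thresh
```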

28 pages, 3863 KB  
Article
DeepSORT-OCR: Design and Application Research of a Maritime Ship Target Tracking Algorithm Incorporating Hull Number Features
by Jing Ma, Xihang Su, Kehui Xu, Hongliang Yin, Zhihong Xiao, Jiale Wang and Peng Liu
Mathematics 2026, 14(6), 1062; https://doi.org/10.3390/math14061062 - 20 Mar 2026
Abstract
Maritime ship target tracking plays an important role in applications such as maritime patrol and maritime surveillance. However, complex sea conditions, similar target appearances, and long-distance imaging often lead to target identity confusion and unstable trajectories. To address these issues, in this paper, a ship multi-object tracking algorithm, DeepSORT-OCR, that integrates hull number semantic features is proposed. Based on the YOLO detection framework and the DeepSORT tracking architecture, a CBAM-ResNet network is introduced to enhance the representation of ship appearance features. An Inner-SIoU metric is adopted to improve the geometric matching of slender ship targets, while an LSTM-Adaptive Kalman Filter is employed to model the nonlinear motion patterns of ships and improve trajectory prediction stability. In addition, a Hull Number Feature Extraction module is designed in order to recognize ship hull numbers using OCR and match them with a hull number database. The extracted hull number semantic features are dynamically fused with visual appearance features to strengthen identity constraints during target association. The experimental results show that the proposed method achieves an MOTA of 66.53% on the MOT16 dataset, representing an improvement of 5.13% over DeepSORT. On the self-constructed maritime ship dataset, the method achieves an MOTA of 70.89% and an MOTP of 80.84%. Furthermore, on the hull-number subset, the MOTA further increases to 77.18%, an improvement of 7.31% compared with DeepSORT, while the number of ID switches is significantly reduced. In addition, experiments conducted on pure real data, pure synthetic data, and cross-domain evaluation settings demonstrate the stability and strong generalization capability of the proposed algorithm under different data distributions. The proposed method effectively improves the stability and identity consistency of ship multi-object tracking in complex maritime environments. Full article
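One way to fuse OCR-derived hull-number semantics with appearance features during association is a weighted cost that rewards a confirmed hull-number match and penalizes a confirmed mismatch. The weighting scheme, `alpha`, and the neutral value for unreadable hull numbers below are illustrative assumptions, not the paper's exact fusion rule:

```python
def association_cost(app_dist, det_hull, track_hull, alpha=0.7):
    """Combine appearance distance with hull-number evidence.
    det_hull / track_hull are OCR strings, or None when unreadable."""
    if det_hull and track_hull:
        sem = 0.0 if det_hull == track_hull else 1.0  # hard identity evidence
    else:
        sem = 0.5                                     # unreadable: neutral
    return alpha * app_dist + (1 - alpha) * sem
```

A matched hull number pulls the cost down even when appearance features are ambiguous, which is the mechanism that reduces ID switches on the hull-number subset.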

29 pages, 9360 KB  
Article
Spatial Relation Reasoning Based on Keypoints for Railway Intrusion Detection and Risk Assessment
by Shanping Ning, Feng Ding and Bangbang Chen
Appl. Sci. 2026, 16(6), 3026; https://doi.org/10.3390/app16063026 - 20 Mar 2026
Abstract
Foreign object intrusion in railway tracks is a major threat to train operation safety, yet current detection methods face challenges in identifying small distant targets and adapting to low-light conditions. Moreover, existing systems often lack the ability to assess intrusion risk levels, limiting real-time warning and graded response capabilities. To address these gaps, this paper proposes a novel method for intrusion detection and risk assessment based on keypoint spatial discrimination. First, an XS-BiSeNetV2-based track segmentation network is developed, incorporating cross-feature fusion and spatial feature recalibration to improve track extraction accuracy in complex scenes. Second, an enhanced STI-YOLO detection model is introduced, integrating a Shuffle attention mechanism for better feature interaction, a high-resolution Transformer detection head to improve small-target sensitivity, and the Inner-IoU loss function to refine bounding box regression. Detected targets’ bottom keypoints are then analyzed relative to track boundaries to determine intrusion direction. By combining lateral distance and motion state features, a multi-level risk classification system is established for quantitative threat assessment. Experiments on the RailSem19 and GN-rail-Object datasets show that the method achieves a track segmentation mIoU of 88.19% and a detection mAP of 82.6%. The risk assessment module effectively quantifies threats across scenarios and maintains stable performance under low-light and strong-glare conditions. This work offers a quantifiable risk assessment solution for intelligent railway safety systems. Full article
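The multi-level risk grading from lateral distance and motion state can be sketched as a small rule table. The thresholds, labels, and number of levels below are hypothetical placeholders; the paper's classifier combines the same two inputs but its actual boundaries are not given here:

```python
def risk_level(lateral_m, approaching):
    """Grade intrusion risk from the signed lateral distance (metres) between
    a target's bottom keypoint and the track boundary, plus motion state.
    Negative distance means the keypoint is inside the track gauge."""
    if lateral_m <= 0:
        return "critical"
    if lateral_m < 2.0:
        return "high" if approaching else "medium"
    return "medium" if approaching else "low"
```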
22 pages, 6052 KB  
Article
HSMD-YOLO: An Anti-Aliasing Feature-Enhanced Network for High-Speed Microbubble Detection
by Wenda Luo, Yongjie Li and Siguang Zong
Algorithms 2026, 19(3), 234; https://doi.org/10.3390/a19030234 - 20 Mar 2026
Abstract
Underwater micro-bubble detection entails multiple challenges, including diminutive target sizes, sparse pixel information, pronounced specular highlights and water scattering, indistinct bubble boundaries, and adhesion or overlap between instances. To address these issues, we propose HSMD-YOLO, an improved detector tailored for high-resolution micro-bubble detection and built upon YOLOv11. The model incorporates three novel components: the Scale Switch Block (SSB), a scale-transformation module that suppresses artifacts and background noise, thereby stabilizing edges in thin-walled bubble regions and enhancing sensitivity to geometric contours; the Global Local Refine Block (GLRB), which achieves efficient global relationship modeling with an asymptotic linear complexity (O(N)) in spatial dimensions while further refining local features, thereby strengthening boundary perception and improving bubble–background separability; and the Bidirectional Exponential Moving Attention Fusion (BEMAF), which accommodates the multi-scale nature of bubbles by employing a parallel multi-kernel architecture to extract spatial features across scales, coupled with a multi-stage EMA-based attention mechanism to enhance detection robustness under weak boundaries and complex backgrounds. Experiments conducted on a Side-Illuminated Light Field Bubble Database (SILB-DB) and a public gas–liquid two-phase flow dataset (GTFD) demonstrate that HSMD-YOLO achieves mAP@50 scores of 0.911 and 0.854, respectively, surpassing mainstream detection methods. Ablation studies indicate that SSB, GLRB, and BEMAF contribute performance gains of 1.3%, 2.0%, and 0.4%, respectively, thereby corroborating the effectiveness of each module for micro-scale object detection. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)

14 pages, 18688 KB  
Article
Outdoor Motion Capture at Scale
by Michael Zwölfer, Martin Mössner, Helge Rhodin and Werner Nachbauer
Sensors 2026, 26(6), 1951; https://doi.org/10.3390/s26061951 - 20 Mar 2026
Abstract
Capturing kinematic data in outdoor sports is challenging, as motions span large capture volumes and occur under difficult environmental conditions. Video-based approaches, particularly with pan–tilt–zoom cameras, offer a practical solution, but the extensive manual post-processing required limits their use to short sequences and few athletes. This study presents a motion capture pipeline that automates the detection of both reference points and sport-specific keypoints to overcome this limitation. The field test employed eight cameras covering a 250×80×30 m capture volume with nearly 300 reference points. Ten state-certified ski instructors performed eight standardized maneuvers. Reference points were localized through a hybrid approach combining YOLO object detection and ArUco marker identification. AlphaPose was fine-tuned on a new manually annotated dataset to detect skier-specific keypoints (e.g., skis, poles) alongside anatomical landmarks. Continuous frame-wise calibration and 3D reconstruction were performed using Direct Linear Transformation. Evaluation compared automated detections with manual annotations. Automated reference point detection achieved a mean localization error of 4.1 pixels (0.1% of 4K width) and reduced 3D segment-length variation by 23%. The skier-specific keypoint model reached 98% PCK, mAP of 0.97, and an MPJPE of 10.3 pixels while lowering 3D segment-length variation by 0.5 cm compared to manual digitization and 0.6 cm relative to a pretrained model. Replacing manual digitization with automated detection improves accuracy and facilitates kinematic data collection in large outdoor fields with many athletes and trials. The approach also enables the creation of sport-specific datasets valuable for biomechanical research and training next-generation 3D pose estimation models. Full article
(This article belongs to the Special Issue Advanced Sensors in Biomechanics and Rehabilitation—2nd Edition)
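The Direct Linear Transformation step for 3D reconstruction can be sketched as SVD-based linear triangulation: each camera's projection matrix and pixel observation contribute two rows to a homogeneous system AX = 0, whose least-squares solution is the right singular vector with the smallest singular value. This is the standard DLT formulation, not the authors' exact implementation:

```python
import numpy as np

def triangulate_dlt(projs, points_2d):
    """Triangulate one 3D point from N >= 2 views.
    projs: list of 3x4 camera projection matrices P_i.
    points_2d: list of (u_i, v_i) pixel observations."""
    rows = []
    for P, (u, v) in zip(projs, points_2d):
        rows.append(u * P[2] - P[0])   # u * (row 3) - (row 1)
        rows.append(v * P[2] - P[1])   # v * (row 3) - (row 2)
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]                          # null-space direction of A
    return X[:3] / X[3]                 # de-homogenize
```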

20 pages, 3218 KB  
Article
MIP-YOLO11: An Underwater Object Detection Model Based on Improved YOLO11
by Xinyu Qu, Ying Shao, Zheng Wang and Man Chang
J. Mar. Sci. Eng. 2026, 14(6), 572; https://doi.org/10.3390/jmse14060572 - 19 Mar 2026
Abstract
Due to challenges such as inadequate lighting, water scattering, high density of small objects, and complex object morphology in underwater environments, traditional YOLO11 models face difficulties including interference from complex backgrounds, weak perception of small objects, and insufficient feature extraction when applied underwater. This paper proposes an improved MIP-YOLO11 model for underwater object detection based on the YOLO11 framework. First, an MCEA module is designed in the backbone network to replace the basic CBS convolution module. Through a lightweight multi-branch convolutional structure, the perception ability for small objects, object edges, contours, and morphological features in underwater scenes is enhanced without significantly increasing computational overhead. Second, an IMCA module based on the coordinate attention mechanism is introduced at the end of the backbone network to replace the C2PSA module, reducing the number of model parameters while maintaining detection accuracy. Finally, the Bottleneck module in C3k2 is improved by incorporating a PConv and a dual residual connection mechanism, thereby expanding the receptive field and enhancing the efficiency of complex feature extraction. Experimental results demonstrate that MIP-YOLO11 significantly outperforms the traditional YOLO11 in underwater environments. P and R are improved by 2.5% and 4.1%, respectively. Moreover, the mAP@0.5 and mAP@0.5:0.95 metrics are increased by 4.2% and 7.5%, respectively. The improved model achieves a good balance between high accuracy and light weight, and can provide a more reliable underwater object detection scheme for AUV underwater detection and other application scenarios. Full article
(This article belongs to the Section Ocean Engineering)

25 pages, 6302 KB  
Article
Artificial Intelligence-Based Detection of On-Ground Chestnuts Toward Automated Picking
by Kaixuan Fang, Yuzhen Lu and Xinyang Mu
AgriEngineering 2026, 8(3), 116; https://doi.org/10.3390/agriengineering8030116 - 19 Mar 2026
Abstract
Traditional mechanized chestnut harvesting is too costly for small producers, non-selective, and prone to damaging nuts. Accurate, reliable detection of chestnuts on the orchard floor is crucial for developing low-cost, vision-guided automated harvesting technology. However, developing a reliable chestnut detection system faces challenges in complex environments with shading, varying natural light conditions, and interference from weeds, fallen leaves, stones, and other foreign on-ground objects, which have remained unaddressed. This study collected 319 images of chestnuts on the orchard floor, containing 6524 annotated chestnuts. A comprehensive set of 29 state-of-the-art real-time object detectors, including 14 in the YOLO (v11–v13) and 15 in the RT-DETR (v1–v4) families at various model scales, was systematically evaluated through replicated modeling experiments for chestnut detection. Experimental results show that the YOLOv12m model achieved the best mAP@0.5 of 95.1% among all the evaluated models, while RT-DETRv2-R101 was the most accurate variant among the RT-DETR models, with mAP@0.5 of 91.1%. In terms of mAP@[0.5:0.95], the YOLOv11x model achieved the best accuracy of 80.1%. All models demonstrated significant potential for real-time chestnut detection, and YOLO models outperformed RT-DETR models in terms of both detection accuracy and inference, making them better suited for on-board deployment. This work lays a foundation for developing AI-based, vision-guided intelligent chestnut harvest systems. Full article
(This article belongs to the Special Issue Applications of Computer Vision in Agriculture)

29 pages, 7173 KB  
Article
Research on Detection and Picking Point of Lychee Fruits in Natural Scenes Based on Deep Learning
by Jing Chang and Sangdae Kim
Agriculture 2026, 16(6), 686; https://doi.org/10.3390/agriculture16060686 - 18 Mar 2026
Abstract
China is one of the world’s major lychee producers, and the fruit’s soft texture, small size, and thin peel make non-destructive robotic harvesting particularly challenging. Accurate fruit detection, branch segmentation, and precise picking-point localization are critical for enabling automated harvesting in complex natural orchard environments. This study proposes an integrated perception framework for lychee harvesting that combines object detection, density-based clustering, and semantic segmentation. An improved YOLO11s-based detection network incorporating SimAM attention, CMUNeXt feature enhancement, and MPDIoU loss is developed to enhance robustness under illumination variation, occlusion, and scale changes. The proposed detector achieves a precision of 84.3%, recall of 73.2%, and mAP of 81.6%, outperforming baseline models. Density-based clustering is employed to group individual detections into fruit clusters. Comparative experiments demonstrate that MeanShift achieves the highest clustering consistency, with an average Adjusted Rand Index (ARI) of 0.768, outperforming k-means and other baselines. An improved DeepLab v3+ semantic segmentation network with a ResDenseFocal backbone and Focal Loss is designed for accurate branch extraction under complex backgrounds. Finally, a rule-based geometric picking-point localization algorithm is formulated in the image coordinate system by integrating detection, clustering, and branch segmentation results. Experimental validation demonstrates that the proposed framework can reliably localize picking points in two-dimensional images under natural orchard conditions. The proposed method provides a practical perception solution for intelligent lychee harvesting and establishes a foundation for future 3D robotic manipulation and field deployment. Full article
(This article belongs to the Special Issue Robots for Fruit Crops: Harvesting, Pruning, and Phenotyping)
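Density-based grouping of individual fruit detections into clusters can be illustrated with a simple single-linkage radius clustering over detection centres. This is a stand-in for the MeanShift step, not the paper's algorithm, and the radius is a hypothetical placeholder:

```python
import math

def cluster_centers(centers, radius=50.0):
    """Union-find clustering: centres within `radius` pixels (transitively)
    share a label.  Returns dense labels 0, 1, 2, ... per input centre."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.hypot(centers[i][0] - centers[j][0],
                          centers[i][1] - centers[j][1]) <= radius:
                parent[find(i)] = find(j)
    labels = [find(i) for i in range(len(centers))]
    remap = {}
    return [remap.setdefault(l, len(remap)) for l in labels]
```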

20 pages, 1861 KB  
Article
Design of a Hardware-Optimized High-Performance CNN Accelerator for Real-Time Object Detection Using YOLOv3 with Darknet-19 Architecture
by Shuo Wu, Manasa Kunapareddy and Nan Wang
Electronics 2026, 15(6), 1264; https://doi.org/10.3390/electronics15061264 - 18 Mar 2026
Abstract
This research proposes a novel hardware-optimized design to accelerate Convolutional Neural Networks (CNNs) using Verilog HDL. The design is specifically developed for the DARKNET-19 system model, which serves as the backbone of the YOLOv3-tiny algorithm, a widely used framework for real-time object detection in dynamic environments. The CNN architecture was implemented in Verilog HDL and synthesized using Synopsys Design Compiler, with a focus on improving both object detection accuracy and hardware resource efficiency. The proposed design efficiently performs key CNN operations, including convolution, pooling, and activation, enabling faster real-time object detection compared to many existing methods. To improve performance, the hardware design incorporates parallel processing techniques, allowing multiple computations to be executed simultaneously. This significantly reduces the system latency and power consumption. The convolutional layers of the DARKNET-19 architecture are efficiently mapped onto the hardware platform, ensuring optimized data storage and fast memory access, which further enhances processing speed and detection accuracy. An innovative feature of the design is a 2-dimensional image preprocessing module that prepares input images before they are fed into the CNN. This preprocessing stage includes image resizing, brightness normalization, and color adjustment, which helps the CNN process visual data more effectively. After preprocessing, the images pass through several CNN layers. The convolutional layers extract key features from the images, while the pooling and activation layers refine these features to improve detection performance. Finally, the processed data is analyzed by the YOLOv3-tiny algorithm, which identifies and locates objects in the images with high precision. Experimental results demonstrate that the proposed high-speed and resource-efficient hardware architecture is well-suited for real-time object detection applications, particularly in highly dynamic and unpredictable environments. Full article
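The key CNN operations the accelerator maps onto hardware, convolution and pooling, have simple software reference semantics. The pure-Python sketch below shows what the MAC arrays and pooling units compute (valid-mode correlation, as CNN "convolution" layers typically do); it is a reference for the arithmetic only, not the Verilog design:

```python
def conv2d(img, kernel):
    """Valid 2D convolution (correlation form): slide the kernel over the
    image and accumulate multiply-adds, as a hardware MAC array would."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)] for y in range(out_h)]

def maxpool2(fm):
    """2x2 max pooling with stride 2 over a feature map."""
    return [[max(fm[y][x], fm[y][x + 1], fm[y + 1][x], fm[y + 1][x + 1])
             for x in range(0, len(fm[0]) - 1, 2)]
            for y in range(0, len(fm) - 1, 2)]
```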
15 pages, 3339 KB  
Article
AI-Driven Adaptive Camouflage Pattern Generation for Helicopter Detection Evasion in Aerial Sensor Imagery Using Fine-Tuned YOLOv8 and Stable Diffusion
by Jonghyeok Im, Yeonhong Kim, Heoung-Jae Chun and Kyoungsik Kim
Sensors 2026, 26(6), 1895; https://doi.org/10.3390/s26061895 - 17 Mar 2026
Abstract
In aerial sensor systems, detecting helicopters against diverse backgrounds remains challenging due to environmental camouflage. This paper proposes an end-to-end framework for generating adaptive camouflage patterns to evade YOLO-based object detection. Starting with synthetic sensor imagery (background + transparent helicopter overlay), we employ a fine-tuned YOLOv8m for precise VTOL mask extraction, followed by KMeans clustering with Gaussian blur for dominant color extraction from the background. These colors guide Stable Diffusion inpainting to synthesize full-screen camouflage textures, which are then masked and overlaid onto the helicopter region. Evaluated on a 920-image dataset across multiple backgrounds, our method achieves a 97.6% reduction in mAP@0.5 (from 0.8175 to 0.0196) on 751 camouflaged images against a fine-tuned YOLOv8m model, with recall dropping by 95.9%. Even against a helicopter-specialized Defence model, mAP@0.5 drops by 89.6% (from 0.1178 to 0.0123). Ablation studies confirm the synergy of YOLO masking and color-guided inpainting. This sensor-fusion approach enhances stealth in unmanned aerial surveillance, with implications for civilian aviation safety.
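The dominant-color step (Gaussian blur followed by KMeans clustering) can be sketched as follows. This is a hypothetical NumPy illustration on a toy two-tone background, not the paper's pipeline; it uses a deterministic farthest-point initialization for the cluster centers, which the abstract does not specify.

```python
import numpy as np

def gaussian_blur(img, sigma=1.0, radius=2):
    """Separable Gaussian blur on an (H, W, 3) float image (zero-padded edges)."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)
    return out

def kmeans_colors(pixels, n_colors=4, iters=20):
    """Plain Lloyd's algorithm over (N, 3) RGB rows; farthest-point init."""
    centers = [pixels[0]]
    for _ in range(n_colors - 1):
        d = ((pixels[:, None, :] - np.array(centers)[None]) ** 2).sum(-1).min(1)
        centers.append(pixels[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = ((pixels[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for c in range(n_colors):
            if (labels == c).any():
                centers[c] = pixels[labels == c].mean(0)
    return centers

# Toy background: red-ish left half, blue-ish right half.
img = np.zeros((16, 16, 3))
img[:, :8] = [0.9, 0.1, 0.1]
img[:, 8:] = [0.1, 0.1, 0.9]
palette = kmeans_colors(gaussian_blur(img).reshape(-1, 3), n_colors=2)
```

The blur smooths out high-frequency texture so the clusters track broad background tones, which is what a camouflage palette needs.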
(This article belongs to the Section Sensing and Imaging)
30 pages, 26587 KB  
Article
Research on Synthetic Data Methods and Detection Models for Micro-Cracks
by Yaotong Jiang, Tianmiao Wang, Xuanhe Chen and Jianhong Liang
Sensors 2026, 26(6), 1883; https://doi.org/10.3390/s26061883 - 17 Mar 2026
Abstract
Micro-crack detection on concrete surfaces is challenging because labeled micro-crack data are scarce, crack cues are extremely weak (often only a few pixels wide), and complex backgrounds (e.g., non-uniform illumination, shadows, and stains) degrade feature extraction; this study aims to improve both data availability and detection robustness for practical inspection. A Poisson image editing-based synthesis strategy is developed to generate visually coherent micro-crack samples via gradient-domain blending, and a Complex-Scene-Tolerant YOLO (CST-YOLO) detector is proposed on top of YOLOv10, following a "lighting decoupling–global perception–micro-feature enhancement" design. CST-YOLO integrates a Lighting-Adaptive Preprocessing Module (LAPM) to suppress illumination/shadow perturbations, a Spatial–Channel Sparse Transformer (SCS-Former) to model long-range crack topology efficiently, and a Small Object Focus Block (SOFB) to enhance micro-scale cues under cluttered backgrounds. Experiments are conducted on a 650-image dataset (200 real and 450 synthesized), in which synthesized samples are used only for training, and the validation/test sets contain only real images, with a 7:2:1 split. CST-YOLO achieves 0.990 mAP@0.5 and 0.926 mAP@0.5:0.95 at 139 FPS, and ablation results indicate complementary contributions from LAPM, SCS-Former, and SOFB. These results support the effectiveness of combining realistic synthesis and architecture-level robustness for real-time micro-crack detection in complex scenes.
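Poisson image editing blends a source patch into a destination image by matching the source's gradients inside a mask while keeping the destination's values on the boundary. A minimal single-channel NumPy sketch using Jacobi iteration (an assumption; the paper does not specify its solver) might look like:

```python
import numpy as np

def poisson_blend(src, dst, mask, iters=500):
    """Gradient-domain (Poisson) blending of `src` into `dst` inside `mask`.

    Solves lap(out) = lap(src) on the mask with out = dst on the boundary,
    via Jacobi iteration. All arrays are 2-D floats of equal shape; the mask
    must stay away from the image border because np.roll wraps around.
    """
    out = dst.astype(float).copy()
    # Discrete Laplacian of the source = divergence of the guidance field.
    lap = (np.roll(src, 1, 0) + np.roll(src, -1, 0) +
           np.roll(src, 1, 1) + np.roll(src, -1, 1) - 4 * src)
    inside = mask.astype(bool)
    for _ in range(iters):
        nbr = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1))
        out[inside] = (nbr[inside] - lap[inside]) / 4.0
    return out

# Toy example: paste a patch with curvature (Laplacian = 2) onto a flat image.
src = np.add.outer(np.arange(16.0) ** 2, np.zeros(16))
dst = np.zeros((16, 16))
mask = np.zeros((16, 16), bool)
mask[4:12, 4:12] = True
blended = poisson_blend(src, dst, mask)
```

Outside the mask the destination is untouched; inside, the result carries the source's gradients, which is why synthesized cracks sit in new backgrounds without visible seams. A real pipeline would use a faster sparse solver (or OpenCV's seamless cloning) per color channel.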
(This article belongs to the Section Fault Diagnosis & Sensors)
