Search Results (1,560)

Search Parameters:
Keywords = YOLO object detection

31 pages, 6430 KB  
Article
Glare-Aware Resi-YOLO: Tiny-Vessel Detection with Dual-Brain Edge Deployment for Maritime UAVs
by Shang-En Tsai and Chia-Han Hsieh
Drones 2026, 10(3), 226; https://doi.org/10.3390/drones10030226 - 23 Mar 2026
Abstract
Maritime UAV perception must reliably detect and track tiny vessels under harsh specular glare. In practice, detection failures are dominated by two coupled factors: (i) vessels often occupy only a few pixels, causing small-object recall collapse and (ii) sun glint and sea-surface reflections generate over-exposed regions that trigger false positives and unstable associations. This paper presents Resi-YOLO, a system-level pipeline that improves tiny-vessel sensitivity while preserving embedded throughput on a Jetson Orin Nano. At the model level, Resi-YOLO combines a P2-enhanced feature path with CBAM-based glare suppression to strengthen high-resolution semantics and suppress glare-induced artifacts; optional SAHI-style slicing is supported for ultra-high-resolution scenes. At the system level, we adopt a heterogeneous dual-brain deployment, where the Orin Nano performs primary inference and an MCU-based safety-island tracker mitigates delay/jitter via time-stamped measurement replay and IMM-UKF updates. We further define a Glare Severity Score (GSS) to stratify robustness by illumination intensity. Experiments show that Resi-YOLO improves APsmall by 13.1 percentage points over YOLOv8n (18.4% to 31.5%), raises high-glare mAP@0.5 from 41.2% to 53.7%, and runs at 12.8 FPS end-to-end (~100 ms latency) on Jetson Orin Nano, while TensorRT inference-only throughput exceeds 30 FPS. Full article
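The SAHI-style slicing mentioned for ultra-high-resolution scenes can be sketched generically: split a large frame into overlapping tiles, run the detector per tile, and merge results. The function below is an illustrative sketch of the tiling step only; the function name, tile size, and overlap default are assumptions, not values from the paper.

```python
def slice_coords(img_w, img_h, tile=640, overlap=0.2):
    """Yield (x1, y1, x2, y2) tile boxes covering an image with fractional overlap."""
    step = max(1, int(tile * (1 - overlap)))
    xs = list(range(0, max(img_w - tile, 0) + 1, step))
    ys = list(range(0, max(img_h - tile, 0) + 1, step))
    # Make sure the right and bottom image edges are covered by a final tile.
    if xs[-1] + tile < img_w:
        xs.append(img_w - tile)
    if ys[-1] + tile < img_h:
        ys.append(img_h - tile)
    return [(x, y, min(x + tile, img_w), min(y + tile, img_h))
            for y in ys for x in xs]
```

Detections from each tile would then be shifted back by the tile origin and de-duplicated with NMS across tiles.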

20 pages, 7591 KB  
Article
Research on Landslide Hazard Detection in Ya’an Region Based on an Improved YOLO Model
by Kewei Cui, Meng Huang, Weiling Zhang, Guang Yang, Yongxiong Huang, Zhengyi Wu, Zhiwei Zhai and Chao Cheng
Remote Sens. 2026, 18(6), 957; https://doi.org/10.3390/rs18060957 - 23 Mar 2026
Abstract
Landslide hazards occur frequently in the Ya’an region; therefore, accurately identifying and delineating potential landslide areas is crucial for disaster prevention and mitigation. Although deep learning-based detection methods using optical remote sensing imagery are widely adopted, the complex terrain and diverse land cover in this area often result in blurred boundaries and weakened textural features, making it difficult to precisely define spatial extents. To overcome these challenges, this study proposes an improved YOLOv11 model for landslide detection. Building on the YOLOv11 baseline, we designed a novel Multi-Scale Detail Enhancement module and integrated it into the neck network to effectively aggregate shallow-level details with deep-level semantic information, thereby enhancing the model’s ability to represent ambiguous boundaries. Additionally, we incorporated the lightweight SimAM attention mechanism into the backbone network. This mechanism dynamically suppresses background noise based on an energy minimization principle, improving feature discriminability within landslide regions and enabling precise boundary boxes. We conducted validation experiments in the Ya’an region using a custom dataset constructed from high-resolution UAV orthoimagery, comparing our method against mainstream models such as YOLOv8 and YOLOv10. The results show that the proposed improved YOLOv11 model achieves a precision of 90.2%, a recall of 84.8%, and an mAP of 92.7%. This enhanced performance demonstrates the model’s effectiveness in detecting landslides under complex terrain conditions, providing a practical technical reference for efficient hazard screening and dynamic monitoring. Full article
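The SimAM attention used in the backbone is parameter-free: each position's weight comes from a closed-form energy-minimization objective. A minimal NumPy sketch, following the formulation in the SimAM paper (the regularizer `lam` is a tunable constant, and this is a reference implementation, not the authors' code):

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map.
    Positions far from the channel mean get low inverse energy and are
    down-weighted, which suppresses homogeneous background responses."""
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2
    v = d.sum(axis=(1, 2), keepdims=True) / n      # per-channel variance
    e_inv = d / (4 * (v + lam)) + 0.5              # inverse energy per position
    return x * (1.0 / (1.0 + np.exp(-e_inv)))      # sigmoid gating
```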

22 pages, 14276 KB  
Article
DualFOD: A Dual-Modality Deep Learning Framework for UAS-Based Foreign Object Debris Detection Using Thermal and RGB Imagery
by Owais Ahmed, Caleb S. Caldwell and Adeel Khalid
Drones 2026, 10(3), 225; https://doi.org/10.3390/drones10030225 - 23 Mar 2026
Abstract
Foreign Object Debris (FOD) poses critical risks to aircraft during takeoff and landing, resulting in billions of dollars in losses annually due to infrastructure damage and flight delays. Advancements in automated inspection technologies have enabled the use of Unmanned Aerial Systems (UAS) combined with Artificial Intelligence (AI) for rapid FOD identification. While prior research has extensively evaluated optical sensors such as RGB imaging and radar, limited work has investigated the potential of thermal imaging for improved FOD visibility under challenging environmental conditions. This study proposes DualFOD, a dual-modality detection framework that integrates a supervised YOLO12-based RGB detector with an unsupervised thermal anomaly extraction pipeline for identifying debris on runway surfaces. A decision-level fusion algorithm combines detections from both branches using spatial proximity matching to produce a unified FOD inventory. The RGB branch achieves a precision of 0.954 and mAP@0.5 of 0.890 on the held-out test set. Cross-site validation at the Cobb County Sport Aviation Complex demonstrates that thermal detection recovers debris missed by RGB at higher altitudes, with the fused output consistently outperforming either single-modality branch. This research contributes toward scalable autonomous FOD monitoring that enhances operational safety in aviation environments. Full article
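Decision-level fusion by spatial proximity matching can be sketched as below. The detection format, the distance threshold, and the greedy match policy are illustrative assumptions, not the paper's exact algorithm: an RGB detection absorbs any thermal anomaly near it, and unmatched thermal anomalies survive into the fused inventory (this is what lets thermal recover debris that RGB missed).

```python
import math

def fuse_detections(rgb_dets, thermal_dets, max_dist=30.0):
    """Decision-level fusion: each detection is (cx, cy, score).
    A thermal detection within max_dist pixels of an RGB detection is
    treated as the same object; everything unmatched is kept."""
    fused, matched = list(rgb_dets), set()
    for i, (tx, ty, ts) in enumerate(thermal_dets):
        for rx, ry, rs in rgb_dets:
            if math.hypot(tx - rx, ty - ry) <= max_dist:
                matched.add(i)
                break
    fused += [d for i, d in enumerate(thermal_dets) if i not in matched]
    return fused
```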

43 pages, 6336 KB  
Systematic Review
A Systematic Literature Review of You Only Look Once Architectures (v1–v12) in Healthcare Systems
by Ozgur Koray Sahingoz, Gozde Karatas Baydogmus and Emin Kugu
Diagnostics 2026, 16(6), 935; https://doi.org/10.3390/diagnostics16060935 - 22 Mar 2026
Abstract
Background/Objectives: The integration of deep learning and computer vision into healthcare has improved medical diagnosis and image analysis. Among object detection algorithms, the YOLO family has attracted substantial attention due to its ability to analyze images in real time with reported improvements in detection performance across multiple studies. This systematic review examines the evolution of YOLO algorithms for diagnostic applications in healthcare from YOLOv1 to YOLOv12. Methods: Peer-reviewed scientific articles published up to 1 January 2026 were retrieved from major scientific databases in accordance with PRISMA 2020 guidelines. The included studies applied YOLO models to medical imaging tasks, including disease and lesion detection and support for clinical procedures. Performance was synthesized using reported metrics such as average precision, accuracy, inference time, and computational efficiency. Results: The reviewed literature suggests progressive architectural refinements associated with reported improvements in diagnostic performance. YOLOv5 and YOLOv8 are the most frequently used architectures in diagnostic settings, reflecting a favorable trade-off between accuracy and computational complexity. YOLO-based methods have demonstrated strong performance across radiological, pathological, ophthalmological, and endoscopic applications. Conclusions: YOLO models have matured into robust and optimized solutions for medical image analysis; however, challenges remain in interpretability, cross-institution generalization, and deployment on edge devices. Future work on explainable YOLO-based diagnostics and energy-efficient model design will be particularly valuable. Full article

18 pages, 4159 KB  
Article
Advancing Breast Cancer Lesion Analysis in Real-Time Sonography Through Multi-Layer Transfer Learning and Adaptive Tracking
by Suliman Thwib, Radwan Qasrawi, Ghada Issa, Razan AbuGhoush, Hussein AlMasri and Marah Qawasmi
Mach. Learn. Knowl. Extr. 2026, 8(3), 82; https://doi.org/10.3390/make8030082 - 21 Mar 2026
Abstract
Background: Real-time and accurate analysis of breast ultrasounds is crucial for diagnosis but remains challenging due to issues like low image contrast and operator dependency. This study aims to address these challenges by developing an integrated framework for real-time lesion detection and tracking. Methods: The proposed system combines Contrast-Limited Adaptive Histogram Equalization (CLAHE) for image preprocessing, a transfer learning-enhanced YOLOv11 model following a continual learning paradigm for cross-center generalization in lesion detection, and a novel Detection-Based Tracking (DBT) approach that integrates Kernelized Correlation Filters (KCF) with periodic detection verification. The framework was evaluated on a dataset comprising 11,383 static images and 40 ultrasound video sequences, with a subset verified through biopsy and the remainder annotated by two radiologists based on radiological reports. Results: The proposed framework demonstrated high performance across all components. The transfer learning strategy (TL12) significantly improved detection outcomes, achieving a mean Average Precision (mAP) of 0.955, a sensitivity of 0.938, and an F1 score of 0.956. The DBT method (KCF + YOLO) achieved high tracking accuracy, with a success rate of 0.984, an Intersection over Union (IoU) of 0.85, and real-time operation at 54 frames per second (FPS) with a latency of 7.74 ms. The use of CLAHE preprocessing was shown to be a critical factor in improving both detection and tracking stability across diverse imaging conditions. Conclusions: This research presents a robust, fully integrated framework that bridges the gap between speed and accuracy in breast ultrasound analysis. The system’s high performance and real-time efficiency underscore its strong potential for clinical adoption to enhance diagnostic workflows, reduce operator variability, and improve breast cancer assessment. Full article
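The periodic detection verification in the DBT scheme hinges on an IoU check between the tracker's box and a fresh detector box. A minimal sketch of that check; the 0.5 re-seed threshold is an illustrative placeholder, not a value reported by the study:

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def verify_track(track_box, det_box, thresh=0.5):
    """Periodic verification: keep the KCF track while it still overlaps the
    detector output; otherwise signal that the tracker should be re-seeded."""
    return iou(track_box, det_box) >= thresh
```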

28 pages, 3863 KB  
Article
DeepSORT-OCR: Design and Application Research of a Maritime Ship Target Tracking Algorithm Incorporating Hull Number Features
by Jing Ma, Xihang Su, Kehui Xu, Hongliang Yin, Zhihong Xiao, Jiale Wang and Peng Liu
Mathematics 2026, 14(6), 1062; https://doi.org/10.3390/math14061062 - 20 Mar 2026
Abstract
Maritime ship target tracking plays an important role in applications such as maritime patrol and maritime surveillance. However, complex sea conditions, similar target appearances, and long-distance imaging often lead to target identity confusion and unstable trajectories. To address these issues, in this paper, a ship multi-object tracking algorithm, DeepSORT-OCR, that integrates hull number semantic features is proposed. Based on the YOLO detection framework and the DeepSORT tracking architecture, a CBAM-ResNet network is introduced to enhance the representation of ship appearance features. An Inner-SIoU metric is adopted to improve the geometric matching of slender ship targets, while an LSTM-Adaptive Kalman Filter is employed to model the nonlinear motion patterns of ships and improve trajectory prediction stability. In addition, a Hull Number Feature Extraction module is designed in order to recognize ship hull numbers using OCR and match them with a hull number database. The extracted hull number semantic features are dynamically fused with visual appearance features to strengthen identity constraints during target association. The experimental results show that the proposed method achieves an MOTA of 66.53% on the MOT16 dataset, representing an improvement of 5.13% over DeepSORT. On the self-constructed maritime ship dataset, the method achieves an MOTA of 70.89% and an MOTP of 80.84%. Furthermore, on the hull-number subset, the MOTA further increases to 77.18%, an improvement of 7.31% compared with DeepSORT, while the number of ID switches is significantly reduced. In addition, experiments conducted on pure real data, pure synthetic data, and cross-domain evaluation settings demonstrate the stability and strong generalization capability of the proposed algorithm under different data distributions. The proposed method effectively improves the stability and identity consistency of ship multi-object tracking in complex maritime environments. Full article
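One way to fuse OCR-derived hull-number semantics with appearance features during association is a weighted cost that rewards a confirmed hull-number match and penalizes a confirmed mismatch. The weighting scheme, `alpha`, and the neutral value for unreadable hull numbers below are illustrative assumptions, not the paper's exact fusion rule:

```python
def association_cost(app_dist, det_hull, track_hull, alpha=0.7):
    """Combine appearance distance with hull-number evidence.
    det_hull / track_hull are OCR strings, or None when unreadable."""
    if det_hull and track_hull:
        sem = 0.0 if det_hull == track_hull else 1.0  # hard identity evidence
    else:
        sem = 0.5                                     # unreadable: neutral
    return alpha * app_dist + (1 - alpha) * sem
```

A matched hull number pulls the cost down even when appearance features are ambiguous, which is the mechanism that reduces ID switches on the hull-number subset.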

29 pages, 9360 KB  
Article
Spatial Relation Reasoning Based on Keypoints for Railway Intrusion Detection and Risk Assessment
by Shanping Ning, Feng Ding and Bangbang Chen
Appl. Sci. 2026, 16(6), 3026; https://doi.org/10.3390/app16063026 - 20 Mar 2026
Abstract
Foreign object intrusion in railway tracks is a major threat to train operation safety, yet current detection methods face challenges in identifying small distant targets and adapting to low-light conditions. Moreover, existing systems often lack the ability to assess intrusion risk levels, limiting real-time warning and graded response capabilities. To address these gaps, this paper proposes a novel method for intrusion detection and risk assessment based on keypoint spatial discrimination. First, an XS-BiSeNetV2-based track segmentation network is developed, incorporating cross-feature fusion and spatial feature recalibration to improve track extraction accuracy in complex scenes. Second, an enhanced STI-YOLO detection model is introduced, integrating a Shuffle attention mechanism for better feature interaction, a high-resolution Transformer detection head to improve small-target sensitivity, and the Inner-IoU loss function to refine bounding box regression. Detected targets’ bottom keypoints are then analyzed relative to track boundaries to determine intrusion direction. By combining lateral distance and motion state features, a multi-level risk classification system is established for quantitative threat assessment. Experiments on the RailSem19 and GN-rail-Object datasets show that the method achieves a track segmentation mIoU of 88.19% and a detection mAP of 82.6%. The risk assessment module effectively quantifies threats across scenarios and maintains stable performance under low-light and strong-glare conditions. This work offers a quantifiable risk assessment solution for intelligent railway safety systems. Full article
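The multi-level risk grading from lateral distance and motion state can be sketched as a small rule table. The thresholds, labels, and number of levels below are hypothetical placeholders; the paper's classifier combines the same two inputs but its actual boundaries are not given here:

```python
def risk_level(lateral_m, approaching):
    """Grade intrusion risk from the signed lateral distance (metres) between
    a target's bottom keypoint and the track boundary, plus motion state.
    Negative distance means the keypoint is inside the track gauge."""
    if lateral_m <= 0:
        return "critical"
    if lateral_m < 2.0:
        return "high" if approaching else "medium"
    return "medium" if approaching else "low"
```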
22 pages, 6052 KB  
Article
HSMD-YOLO: An Anti-Aliasing Feature-Enhanced Network for High-Speed Microbubble Detection
by Wenda Luo, Yongjie Li and Siguang Zong
Algorithms 2026, 19(3), 234; https://doi.org/10.3390/a19030234 - 20 Mar 2026
Abstract
Underwater micro-bubble detection entails multiple challenges, including diminutive target sizes, sparse pixel information, pronounced specular highlights and water scattering, indistinct bubble boundaries, and adhesion or overlap between instances. To address these issues, we propose HSMD-YOLO, an improved detector tailored for high-resolution micro-bubble detection and built upon YOLOv11. The model incorporates three novel components: the Scale Switch Block (SSB), a scale-transformation module that suppresses artifacts and background noise, thereby stabilizing edges in thin-walled bubble regions and enhancing sensitivity to geometric contours; the Global Local Refine Block (GLRB), which achieves efficient global relationship modeling with an asymptotic linear complexity (O(N)) in spatial dimensions while further refining local features, thereby strengthening boundary perception and improving bubble–background separability; and the Bidirectional Exponential Moving Attention Fusion (BEMAF), which accommodates the multi-scale nature of bubbles by employing a parallel multi-kernel architecture to extract spatial features across scales, coupled with a multi-stage EMA-based attention mechanism to enhance detection robustness under weak boundaries and complex backgrounds. Experiments conducted on a Side-Illuminated Light Field Bubble Database (SILB-DB) and a public gas–liquid two-phase flow dataset (GTFD) demonstrate that HSMD-YOLO achieves mAP@50 scores of 0.911 and 0.854, respectively, surpassing mainstream detection methods. Ablation studies indicate that SSB, GLRB, and BEMAF contribute performance gains of 1.3%, 2.0%, and 0.4%, respectively, thereby corroborating the effectiveness of each module for micro-scale object detection. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)

14 pages, 18688 KB  
Article
Outdoor Motion Capture at Scale
by Michael Zwölfer, Martin Mössner, Helge Rhodin and Werner Nachbauer
Sensors 2026, 26(6), 1951; https://doi.org/10.3390/s26061951 - 20 Mar 2026
Abstract
Capturing kinematic data in outdoor sports is challenging, as motions span large capture volumes and occur under difficult environmental conditions. Video-based approaches, particularly with pan–tilt–zoom cameras, offer a practical solution, but the extensive manual post-processing required limits their use to short sequences and few athletes. This study presents a motion capture pipeline that automates the detection of both reference points and sport-specific keypoints to overcome this limitation. The field test employed eight cameras covering a 250×80×30 m capture volume with nearly 300 reference points. Ten state-certified ski instructors performed eight standardized maneuvers. Reference points were localized through a hybrid approach combining YOLO object detection and ArUco marker identification. AlphaPose was fine-tuned on a new manually annotated dataset to detect skier-specific keypoints (e.g., skis, poles) alongside anatomical landmarks. Continuous frame-wise calibration and 3D reconstruction were performed using Direct Linear Transformation. Evaluation compared automated detections with manual annotations. Automated reference point detection achieved a mean localization error of 4.1 pixels (0.1% of 4K width) and reduced 3D segment-length variation by 23%. The skier-specific keypoint model reached 98% PCK, mAP of 0.97, and an MPJPE of 10.3 pixels while lowering 3D segment-length variation by 0.5 cm compared to manual digitization and 0.6 cm relative to a pretrained model. Replacing manual digitization with automated detection improves accuracy and facilitates kinematic data collection in large outdoor fields with many athletes and trials. The approach also enables the creation of sport-specific datasets valuable for biomechanical research and training next-generation 3D pose estimation models. Full article
(This article belongs to the Special Issue Advanced Sensors in Biomechanics and Rehabilitation—2nd Edition)
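The Direct Linear Transformation step for 3D reconstruction can be sketched as SVD-based linear triangulation: each camera's projection matrix and pixel observation contribute two rows to a homogeneous system AX = 0, whose least-squares solution is the right singular vector with the smallest singular value. This is the standard DLT formulation, not the authors' exact implementation:

```python
import numpy as np

def triangulate_dlt(projs, points_2d):
    """Triangulate one 3D point from N >= 2 views.
    projs: list of 3x4 camera projection matrices P_i.
    points_2d: list of (u_i, v_i) pixel observations."""
    rows = []
    for P, (u, v) in zip(projs, points_2d):
        rows.append(u * P[2] - P[0])   # u * (row 3) - (row 1)
        rows.append(v * P[2] - P[1])   # v * (row 3) - (row 2)
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]                          # null-space direction of A
    return X[:3] / X[3]                 # de-homogenize
```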

20 pages, 3218 KB  
Article
MIP-YOLO11: An Underwater Object Detection Model Based on Improved YOLO11
by Xinyu Qu, Ying Shao, Zheng Wang and Man Chang
J. Mar. Sci. Eng. 2026, 14(6), 572; https://doi.org/10.3390/jmse14060572 - 19 Mar 2026
Abstract
Due to challenges such as inadequate lighting, water scattering, high density of small objects, and complex object morphology in underwater environments, traditional YOLO11 models face difficulties including interference from complex backgrounds, weak perception of small objects, and insufficient feature extraction when applied underwater. This paper proposes an improved MIP-YOLO11 model for underwater object detection based on the YOLO11 framework. First, an MCEA module is designed in the backbone network to replace the basic CBS convolution module. Through a lightweight multi-branch convolutional structure, the perception ability for small objects, object edges, contours, and morphological features in underwater scenes is enhanced without significantly increasing computational overhead. Second, an IMCA module based on the coordinate attention mechanism is introduced at the end of the backbone network to replace the C2PSA module, reducing the number of model parameters while maintaining detection accuracy. Finally, the Bottleneck module in C3k2 is improved by incorporating a PConv and a dual residual connection mechanism, thereby expanding the receptive field and enhancing the efficiency of complex feature extraction. Experimental results demonstrate that MIP-YOLO11 significantly outperforms the traditional YOLO11 in underwater environments. P and R are improved by 2.5% and 4.1%, respectively. Moreover, the mAP@0.5 and mAP@0.5:0.95 metrics are increased by 4.2% and 7.5%, respectively. The improved model achieves a good balance between high accuracy and light weight, and can provide a more reliable underwater object detection scheme for AUV underwater detection and other application scenarios. Full article
(This article belongs to the Section Ocean Engineering)

25 pages, 6302 KB  
Article
Artificial Intelligence-Based Detection of On-Ground Chestnuts Toward Automated Picking
by Kaixuan Fang, Yuzhen Lu and Xinyang Mu
AgriEngineering 2026, 8(3), 116; https://doi.org/10.3390/agriengineering8030116 - 19 Mar 2026
Abstract
Traditional mechanized chestnut harvesting is too costly for small producers, non-selective, and prone to damaging nuts. Accurate, reliable detection of chestnuts on the orchard floor is crucial for developing low-cost, vision-guided automated harvesting technology. However, developing a reliable chestnut detection system faces challenges in complex environments with shading, varying natural light conditions, and interference from weeds, fallen leaves, stones, and other foreign on-ground objects, which have remained unaddressed. This study collected 319 images of chestnuts on the orchard floor, containing 6524 annotated chestnuts. A comprehensive set of 29 state-of-the-art real-time object detectors, including 14 in the YOLO (v11–v13) and 15 in the RT-DETR (v1–v4) families at various model scales, was systematically evaluated through replicated modeling experiments for chestnut detection. Experimental results show that the YOLOv12m model achieved the best mAP@0.5 of 95.1% among all the evaluated models, while RT-DETRv2-R101 was the most accurate variant among the RT-DETR models, with mAP@0.5 of 91.1%. In terms of mAP@[0.5:0.95], the YOLOv11x model achieved the best accuracy of 80.1%. All models demonstrated significant potential for real-time chestnut detection, and YOLO models outperformed RT-DETR models in terms of both detection accuracy and inference, making them better suited for on-board deployment. This work lays a foundation for developing AI-based, vision-guided intelligent chestnut harvest systems. Full article
(This article belongs to the Special Issue Applications of Computer Vision in Agriculture)

29 pages, 7173 KB  
Article
Research on Detection and Picking Point of Lychee Fruits in Natural Scenes Based on Deep Learning
by Jing Chang and Sangdae Kim
Agriculture 2026, 16(6), 686; https://doi.org/10.3390/agriculture16060686 - 18 Mar 2026
Abstract
China is one of the world’s major lychee producers, and the fruit’s soft texture, small size, and thin peel make non-destructive robotic harvesting particularly challenging. Accurate fruit detection, branch segmentation, and precise picking-point localization are critical for enabling automated harvesting in complex natural orchard environments. This study proposes an integrated perception framework for lychee harvesting that combines object detection, density-based clustering, and semantic segmentation. An improved YOLO11s-based detection network incorporating SimAM attention, CMUNeXt feature enhancement, and MPDIoU loss is developed to enhance robustness under illumination variation, occlusion, and scale changes. The proposed detector achieves a precision of 84.3%, recall of 73.2%, and mAP of 81.6%, outperforming baseline models. Density-based clustering is employed to group individual detections into fruit clusters. Comparative experiments demonstrate that MeanShift achieves the highest clustering consistency, with an average Adjusted Rand Index (ARI) of 0.768, outperforming k-means and other baselines. An improved DeepLab v3+ semantic segmentation network with a ResDenseFocal backbone and Focal Loss is designed for accurate branch extraction under complex backgrounds. Finally, a rule-based geometric picking-point localization algorithm is formulated in the image coordinate system by integrating detection, clustering, and branch segmentation results. Experimental validation demonstrates that the proposed framework can reliably localize picking points in two-dimensional images under natural orchard conditions. The proposed method provides a practical perception solution for intelligent lychee harvesting and establishes a foundation for future 3D robotic manipulation and field deployment. Full article
(This article belongs to the Special Issue Robots for Fruit Crops: Harvesting, Pruning, and Phenotyping)
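Density-based grouping of individual fruit detections into clusters can be illustrated with a simple single-linkage radius clustering over detection centres. This is a stand-in for the MeanShift step, not the paper's algorithm, and the radius is a hypothetical placeholder:

```python
import math

def cluster_centers(centers, radius=50.0):
    """Union-find clustering: centres within `radius` pixels (transitively)
    share a label.  Returns dense labels 0, 1, 2, ... per input centre."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if math.hypot(centers[i][0] - centers[j][0],
                          centers[i][1] - centers[j][1]) <= radius:
                parent[find(i)] = find(j)
    labels = [find(i) for i in range(len(centers))]
    remap = {}
    return [remap.setdefault(l, len(remap)) for l in labels]
```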

20 pages, 1861 KB  
Article
Design of a Hardware-Optimized High-Performance CNN Accelerator for Real-Time Object Detection Using YOLOv3 with Darknet-19 Architecture
by Shuo Wu, Manasa Kunapareddy and Nan Wang
Electronics 2026, 15(6), 1264; https://doi.org/10.3390/electronics15061264 - 18 Mar 2026
Abstract
This research proposes a novel hardware-optimized design to accelerate Convolutional Neural Networks (CNNs) using Verilog HDL. The design is specifically developed for the DARKNET-19 system model, which serves as the backbone of the YOLOv3-tiny algorithm, a widely used framework for real-time object detection in dynamic environments. The CNN architecture was implemented in Verilog HDL and synthesized using Synopsys Design Compiler, with a focus on improving both object detection accuracy and hardware resource efficiency. The proposed design efficiently performs key CNN operations, including convolution, pooling, and activation, enabling faster real-time object detection compared to many existing methods. To improve performance, the hardware design incorporates parallel processing techniques, allowing multiple computations to be executed simultaneously. This significantly reduces the system latency and power consumption. The convolutional layers of the DARKNET-19 architecture are efficiently mapped onto the hardware platform, ensuring optimized data storage and fast memory access, which further enhances processing speed and detection accuracy. An innovative feature of the design is a 2-dimensional image preprocessing module that prepares input images before they are fed into the CNN. This preprocessing stage includes image resizing, brightness normalization, and color adjustment, which helps the CNN process visual data more effectively. After preprocessing, the images pass through several CNN layers. The convolutional layers extract key features from the images, while the pooling and activation layers refine these features to improve detection performance. Finally, the processed data is analyzed by the YOLOv3-tiny algorithm, which identifies and locates objects in the images with high precision. Experimental results demonstrate that the proposed high-speed and resource-efficient hardware architecture is well-suited for real-time object detection applications, particularly in highly dynamic and unpredictable environments. Full article
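The key CNN operations the accelerator maps onto hardware, convolution and pooling, have simple software reference semantics. The pure-Python sketch below shows what the MAC arrays and pooling units compute (valid-mode correlation, as CNN "convolution" layers typically do); it is a reference for the arithmetic only, not the Verilog design:

```python
def conv2d(img, kernel):
    """Valid 2D convolution (correlation form): slide the kernel over the
    image and accumulate multiply-adds, as a hardware MAC array would."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)] for y in range(out_h)]

def maxpool2(fm):
    """2x2 max pooling with stride 2 over a feature map."""
    return [[max(fm[y][x], fm[y][x + 1], fm[y + 1][x], fm[y + 1][x + 1])
             for x in range(0, len(fm[0]) - 1, 2)]
            for y in range(0, len(fm) - 1, 2)]
```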
15 pages, 3339 KB  
Article
AI-Driven Adaptive Camouflage Pattern Generation for Helicopter Detection Evasion in Aerial Sensor Imagery Using Fine-Tuned YOLOv8 and Stable Diffusion
by Jonghyeok Im, Yeonhong Kim, Heoung-Jae Chun and Kyoungsik Kim
Sensors 2026, 26(6), 1895; https://doi.org/10.3390/s26061895 - 17 Mar 2026
Abstract
In aerial sensor systems, detecting helicopters against diverse backgrounds remains challenging due to environmental camouflage. This paper proposes an end-to-end framework for generating adaptive camouflage patterns to evade YOLO-based object detection. Starting with synthetic sensor imagery (background + transparent helicopter overlay), we employ a fine-tuned YOLOv8m for precise VTOL mask extraction, followed by KMeans clustering with Gaussian blur for dominant color extraction from the background. These colors guide Stable Diffusion inpainting to synthesize full-screen camouflage textures, which are then masked and overlaid onto the helicopter region. Evaluated on a 920-image dataset across multiple backgrounds, our method achieves a 97.6% reduction in mAP@0.5 (from 0.8175 to 0.0196) on 751 camouflaged images against a fine-tuned YOLOv8m model, with recall dropping by 95.9%. Even against a helicopter-specialized Defence model, mAP@0.5 drops by 89.6% (from 0.1178 to 0.0123). Ablation studies confirm the synergy of YOLO masking and color-guided inpainting. This sensor-fusion approach enhances stealth in unmanned aerial surveillance, with implications for civilian aviation safety.
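The dominant-color step (Gaussian blur followed by KMeans clustering) can be sketched as follows. This is a hypothetical NumPy illustration on a toy two-tone background, not the paper's pipeline; it uses a deterministic farthest-point initialization for the cluster centers, which the abstract does not specify.

```python
import numpy as np

def gaussian_blur(img, sigma=1.0, radius=2):
    """Separable Gaussian blur on an (H, W, 3) float image (zero-padded edges)."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)
    return out

def kmeans_colors(pixels, n_colors=4, iters=20):
    """Plain Lloyd's algorithm over (N, 3) RGB rows; farthest-point init."""
    centers = [pixels[0]]
    for _ in range(n_colors - 1):
        d = ((pixels[:, None, :] - np.array(centers)[None]) ** 2).sum(-1).min(1)
        centers.append(pixels[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = ((pixels[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for c in range(n_colors):
            if (labels == c).any():
                centers[c] = pixels[labels == c].mean(0)
    return centers

# Toy background: red-ish left half, blue-ish right half.
img = np.zeros((16, 16, 3))
img[:, :8] = [0.9, 0.1, 0.1]
img[:, 8:] = [0.1, 0.1, 0.9]
palette = kmeans_colors(gaussian_blur(img).reshape(-1, 3), n_colors=2)
```

The blur smooths out high-frequency texture so the clusters track broad background tones, which is what a camouflage palette needs.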
(This article belongs to the Section Sensing and Imaging)
30 pages, 26587 KB  
Article
Research on Synthetic Data Methods and Detection Models for Micro-Cracks
by Yaotong Jiang, Tianmiao Wang, Xuanhe Chen and Jianhong Liang
Sensors 2026, 26(6), 1883; https://doi.org/10.3390/s26061883 - 17 Mar 2026
Abstract
Micro-crack detection on concrete surfaces is challenging because labeled micro-crack data are scarce, crack cues are extremely weak (often only a few pixels wide), and complex backgrounds (e.g., non-uniform illumination, shadows, and stains) degrade feature extraction; this study aims to improve both data availability and detection robustness for practical inspection. A Poisson image editing-based synthesis strategy is developed to generate visually coherent micro-crack samples via gradient-domain blending, and a Complex-Scene-Tolerant YOLO (CST-YOLO) detector is proposed on top of YOLOv10, following a "lighting decoupling–global perception–micro-feature enhancement" design. CST-YOLO integrates a Lighting-Adaptive Preprocessing Module (LAPM) to suppress illumination/shadow perturbations, a Spatial–Channel Sparse Transformer (SCS-Former) to model long-range crack topology efficiently, and a Small Object Focus Block (SOFB) to enhance micro-scale cues under cluttered backgrounds. Experiments are conducted on a 650-image dataset (200 real and 450 synthesized), in which synthesized samples are used only for training, and the validation/test sets contain only real images, with a 7:2:1 split. CST-YOLO achieves 0.990 mAP@0.5 and 0.926 mAP@0.5:0.95 at 139 FPS, and ablation results indicate complementary contributions from LAPM, SCS-Former, and SOFB. These results support the effectiveness of combining realistic synthesis and architecture-level robustness for real-time micro-crack detection in complex scenes.
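Poisson image editing blends a source patch into a destination image by matching the source's gradients inside a mask while keeping the destination's values on the boundary. A minimal single-channel NumPy sketch using Jacobi iteration (an assumption; the paper does not specify its solver) might look like:

```python
import numpy as np

def poisson_blend(src, dst, mask, iters=500):
    """Gradient-domain (Poisson) blending of `src` into `dst` inside `mask`.

    Solves lap(out) = lap(src) on the mask with out = dst on the boundary,
    via Jacobi iteration. All arrays are 2-D floats of equal shape; the mask
    must stay away from the image border because np.roll wraps around.
    """
    out = dst.astype(float).copy()
    # Discrete Laplacian of the source = divergence of the guidance field.
    lap = (np.roll(src, 1, 0) + np.roll(src, -1, 0) +
           np.roll(src, 1, 1) + np.roll(src, -1, 1) - 4 * src)
    inside = mask.astype(bool)
    for _ in range(iters):
        nbr = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1))
        out[inside] = (nbr[inside] - lap[inside]) / 4.0
    return out

# Toy example: paste a patch with curvature (Laplacian = 2) onto a flat image.
src = np.add.outer(np.arange(16.0) ** 2, np.zeros(16))
dst = np.zeros((16, 16))
mask = np.zeros((16, 16), bool)
mask[4:12, 4:12] = True
blended = poisson_blend(src, dst, mask)
```

Outside the mask the destination is untouched; inside, the result carries the source's gradients, which is why synthesized cracks sit in new backgrounds without visible seams. A real pipeline would use a faster sparse solver (or OpenCV's seamless cloning) per color channel.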
(This article belongs to the Section Fault Diagnosis & Sensors)
