Search Results (6,421)

Search Parameters:
Keywords = YOLOv10

31 pages, 3582 KB  
Article
A Stage-Aware Cascaded Detection–Segmentation Framework for Leaf Phenotyping and Leaf Dry Biomass Estimation of Pepper Seedlings
by Han Li, Dongyuan Shi, Hui Shi, Ming Li and Ming Diao
Plants 2026, 15(12), 1912; https://doi.org/10.3390/plants15121912 (registering DOI) - 20 Jun 2026
Abstract
Quantitative phenotyping of pepper seedlings is important for greenhouse plug tray seedling cultivation, but it remains constrained by inefficient manual monitoring, complex greenhouse backgrounds, and growth-stage-dependent discrepancies between two-dimensional image traits and actual leaf biomass. In this study, a cascaded vision framework with stage-specific morphological correction was developed for nondestructive seedling phenotyping. The framework integrated Visual Dynamic Momentum YOLO (VDM-YOLO) for individual seedling localization and growth-stage recognition, Variance Guided Strip Ghost Gated UNet (VSG-UNet) for lightweight, high-resolution leaf segmentation, and a stage-aware correction model for leaf dry biomass estimation. In performance evaluation, VDM-YOLO achieved a mean average precision at an intersection over union threshold of 0.5 (mAP0.5) of 89.27%, improving mAP0.5 by 1.82 percentage points over YOLOv12. VSG-UNet achieved a mean intersection over union (mIoU) of 83.9% and a Dice coefficient of 81.8%, while reducing floating point operations (FLOPs) and parameters by 44.2% and 61.2%, respectively, compared with U-Net. After stage-aware calibration, the coefficient of determination (R²) between segmented area and leaf dry weight increased from 0.764 to 0.813, and the root mean square error (RMSE) decreased from 0.0210 g to 0.0190 g. These results demonstrate that the proposed framework provides a proof-of-concept approach based on RGB images for the nondestructive assessment of leaf area and leaf dry biomass in pepper seedlings under restricted experimental conditions. Full article
(This article belongs to the Section Plant Modeling)
19 pages, 4732 KB  
Article
YOLO-OBB and Two-Stage Geometric Correction for RGB-LED Array Optical Camera Communication
by Jiaqi Ju, Pan Qiu, Yipeng Tan and Zhengguang Shi
Photonics 2026, 13(6), 599; https://doi.org/10.3390/photonics13060599 (registering DOI) - 20 Jun 2026
Abstract
In Optical Camera Communication (OCC), precise localization of LED arrays under complex tilt conditions is a core challenge for reliable decoding. This paper proposes an OCC reception scheme for RGB-LED arrays that integrates YOLO-OBB rotated object detection with two-stage geometric correction. The system first employs a YOLOv8n-OBB model to extract a quadrilateral region of interest that tightly encloses the LED array boundary. This effectively suppresses background interference caused by superimposed perspective tilt and in-plane rotation. A coarse-to-fine two-stage correction framework is then applied. The first stage rapidly eliminates the dominant perspective distortion based on the detected bounding-box corners. The second stage performs a refined correction using the actual LED center positions. Two homography matrices are cascaded into a combined transformation, achieving two-stage correction accuracy through a single coordinate mapping. In the corrected image, K-Means clustering constructs a 16 × 16 LED topological grid. A locking strategy is adopted so that subsequent frames skip repeated LED detection and clustering. The steady-state per-frame processing time is reduced to approximately 78.9 ms. Experiments covered 16 cross-combinations of vertical tilt from 0° to 45° (0°, 15°, 30°, 45°) and in-plane rotation from 0° to 40° (0°, 15°, 30°, 40°). The uncorrected scheme and the horizontal-box scheme experienced severe bit errors or complete failure under complicated distortion. The proposed scheme maintained error-free transmission under all 16 tested conditions. The ratios of opposite sides of the corrected LED grid remained stable between 0.997 and 1.004. The system simultaneously achieves high reliability and low-latency real-time processing under complex geometric distortions. Full article
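The statement that two cascaded homographies collapse into a single coordinate mapping follows from matrix multiplication: applying the first-stage matrix and then the second-stage matrix is the same as applying their product once. The following is a minimal sketch in plain Python, not the authors' implementation; the matrices `h_coarse` and `h_fine` and the helper names are hypothetical values chosen only for illustration.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


# Hypothetical matrices: h_coarse stands in for the corner-based first stage,
# h_fine for the LED-center-based second stage.
h_coarse = [[1.0, 0.2, 5.0], [0.1, 1.1, -3.0], [0.001, 0.002, 1.0]]
h_fine = [[1.01, 0.0, 0.5], [0.0, 0.99, -0.2], [0.0001, 0.0, 1.0]]

# Stage 1 is applied first, so it sits on the right of the product.
h_combined = matmul3(h_fine, h_coarse)

pixel = (120.0, 80.0)
two_step = apply_homography(h_fine, apply_homography(h_coarse, pixel))
one_step = apply_homography(h_combined, pixel)
```

Here `two_step` and `one_step` agree up to floating-point rounding, so each pixel needs to be warped only once per frame.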
31 pages, 34272 KB  
Article
Reliable Vision-Based PPE Detection for Construction Safety in Adverse Environmental Conditions
by Sujan Gyawali, Ali Mohammadjafari, Saurav Ghimire and Mahmoud Habibnezhad
Buildings 2026, 16(12), 2447; https://doi.org/10.3390/buildings16122447 (registering DOI) - 20 Jun 2026
Abstract
Adverse imaging conditions such as fog, rain, and low light degrade the reliability of vision-based Personal Protective Equipment (PPE) detection systems on construction sites, yet most existing models are trained under clear-weather assumptions. This paper introduces a physics-based weather augmentation framework integrated with the YOLOv8n architecture to improve PPE detection robustness under adverse environmental conditions. The original Color Helmet and Vest (CHV) dataset was expanded from 1330 clear-weather images to 6650 images across five conditions using four physically grounded augmentation models: the Koschmieder atmospheric scattering model for fog, the Garg–Nayar streak model for rain, gamma-corrected attenuation with Poisson–Gaussian noise for low light, and a PSF-based glare model for bright sunlight. The weather-resistant model, a clear-weather baseline, and an augmented baseline were evaluated on the same 665-image weather-augmented test set. The weather-resistant model achieves 89.2% mAP50, a 5.7 percentage-point improvement over the clear-weather baseline (83.5%), with a nearly four-fold improvement in cross-condition stability (standard deviation 1.5% vs. 5.7%). Under matched training-data volume, the weather-resistant model still outperforms a conventionally augmented baseline across all five simulated conditions, indicating that these gains stem from physics-based modeling rather than larger training-data volume. The largest gain occurs under low light, where mAP50 improves from 73.4% to 87.9%. Gradient-weighted Class Activation Mapping (Grad-CAM) analysis confirms that the weather-resistant model directs more attention toward PPE regions across all conditions, with the largest improvement under low light (+10.0 percentage points). 
The lightweight design (3.0 M parameters) and quantitative and qualitative validation on 205 annotated real-world construction site images under normal and low-light conditions provide preliminary evidence of practical applicability. Full article
(This article belongs to the Special Issue Intelligent Monitoring for Health and Safety in Built Environments)
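For readers unfamiliar with the Koschmieder atmospheric scattering model named in the abstract above, the standard form blends the clear-scene intensity J with the airlight A using a transmission term t = exp(-beta * d): I = J * t + A * (1 - t). The sketch below applies that formula to a single intensity value. The function name and the default values of `beta` and `airlight` are assumptions for illustration; the paper's actual parameters are not given in the abstract.

```python
import math


def add_fog(pixel, depth_m, beta=0.05, airlight=0.9):
    """Blend a clear-scene intensity with airlight according to Koschmieder's law.

    pixel    : clear-scene intensity in [0, 1]
    depth_m  : distance from camera to the scene point, in metres
    beta     : scattering coefficient (assumed value; larger means denser fog)
    airlight : intensity of the atmospheric light (assumed value)
    """
    transmission = math.exp(-beta * depth_m)
    return pixel * transmission + airlight * (1.0 - transmission)


near = add_fog(0.2, depth_m=1.0)
far = add_fog(0.2, depth_m=100.0)
```

A nearby point keeps most of its original intensity, while a distant point converges toward the airlight value, which is why the model needs a per-pixel depth estimate to produce plausible fog.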
23 pages, 2264 KB  
Article
Real-Time Leaf Disease Detection with Boundary-Aware and Texture-Sensitive Feature Enhancement
by Jinyang Qiu, Qiuyi Du, Yonggang Wang, Yuhan Tao, Yue Guo, Ye Zhang and Yue Gao
Symmetry 2026, 18(6), 1059; https://doi.org/10.3390/sym18061059 (registering DOI) - 19 Jun 2026
Abstract
Accurate and robust detection of leaf diseases is a key enabler for precision agriculture and large-scale crop health monitoring. Despite the strong generalization of modern one-stage detectors (e.g., YOLOv8), two domain-specific challenges remain: (i) weak or blurry lesion boundaries hinder precise localization, and (ii) low color contrast between diseased and healthy tissues forces models to rely on subtle texture patterns rather than salient shapes. To tackle these challenges, we reframe the core agricultural disease detection task as the identification of “asymmetric morphological anomalies” and propose a domain-tailored enhancement framework. First, we introduce an Edge Enhancement Module (EEM) that explicitly strengthens boundary-aware representations. Inspired by the natural symmetry of healthy leaves, our EEM is specifically designed to capture symmetry-breaking boundary discontinuities and localized asymmetric edges caused by disease lesions. Our method enhances edge and texture cues that are indicative of disease lesions, which often exhibit local asymmetries and boundary discontinuities. The EEM includes a Differential Normalized Pooling Block (DNPB) that highlights edge responses through discrepancies between max pooling and average pooling, which also models cross-group edge correlations. Second, the Lightweight Texture-Sensitive Feature Enhancement (LTSFE) mechanism amplifies texture-discriminative channels under low-contrast conditions by leveraging complementary global statistics and efficient channel mixing, all with negligible computational overhead. We evaluated our method on a self-constructed dataset of 106,434 images with 225,640 annotations covering diverse crops. Experiments show that the proposed method achieves state-of-the-art accuracy (81.54% mAP@0.5:0.95) while maintaining real-time inference (142 FPS), consistently outperforming strong baselines. 
Ablations confirm the effectiveness and complementarity of EEM and LTSFE, demonstrating that domain-specific architectural design, inspired by biological symmetry, can substantially improve agricultural vision systems. Full article
(This article belongs to the Section Engineering and Materials)
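The idea of highlighting edges through the discrepancy between max pooling and average pooling can be illustrated on a toy image: in a flat region the two pooled values coincide, while a window straddling an edge has a maximum well above its mean. This is only a sketch of that principle in plain Python. It is not the DNPB described in the abstract above (it has no normalization and no cross-group modeling), and the function name and window size are assumptions.

```python
def pool_difference(image, k=3):
    """Return max-pool minus average-pool over k x k windows (stride 1, no padding)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            window = [image[r + i][c + j] for i in range(k) for j in range(k)]
            out_row.append(max(window) - sum(window) / len(window))
        out.append(out_row)
    return out


# Toy 4 x 6 image: dark on the left, bright on the right, with a vertical edge.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
response = pool_difference(image)
```

The response is zero in the uniform dark and bright regions and positive only for windows that overlap the edge.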
17 pages, 15918 KB  
Article
ADA-YOLO: An Adaptive Dynamic Aggregation Network for Small Object Detection in UAV Imagery
by Jiajun Chen, Shaochen Jiang, Yongming Li, Sulaiman Tuersunayi and Yong Liu
Sensors 2026, 26(12), 3908; https://doi.org/10.3390/s26123908 (registering DOI) - 19 Jun 2026
Abstract
Unmanned Aerial Vehicle (UAV) image object detection holds significant application value in the low-altitude economy, traffic monitoring, intelligent agriculture, and disaster rescue. However, due to the top-down perspective, UAV images typically suffer from challenges such as small target scales, dense object distribution, severe occlusions, and complex backgrounds. These issues often limit the recall and localization accuracy of general-purpose detectors when they are directly applied to UAV small-object detection scenarios. To address these challenges, this paper proposes an Adaptive Dynamic Aggregation YOLO network, termed ADA-YOLO. The novelty of ADA-YOLO lies in its efficient combinatorial design tailored for UAV small-object detection. While retaining the efficient backbone of YOLOv8, we systematically reconstruct the neck and detection head to improve accuracy. Specifically, a high-resolution P2 detection branch is incorporated to construct a P2–P5 multi-scale prediction structure. Furthermore, the lightweight DySample dynamic upsampling module is adopted to replace traditional upsampling methods, and an Adaptive Spatial Feature Fusion (ASFF) mechanism is introduced to alleviate semantic conflicts and noise interference during multi-scale feature fusion. This synergistic combination explicitly addresses multi-scale representation challenges and enhances small-object detection performance in complex scenes. Comparative experiments with the baseline YOLOv8n on the VisDrone2019 dataset demonstrate that ADA-YOLO achieves an improvement of 11.3% in mAP@0.5 and 8.2% in mAP@0.5:0.95. The improved model achieves these performance gains with a modest parameter increase and acceptable computational complexity. Finally, ablation experiments further validate the effectiveness of each individual module and their synergistic gains. Full article
(This article belongs to the Section Remote Sensors)
24 pages, 13145 KB  
Article
Real-Time Assistive System Integrating Geometric Topology Analysis and State-Adaptive Warning Logic for the Visually Impaired
by Bilie Hu, Peishen Gao, Yan Liu, Xi Xia and Guoping Huo
Sensors 2026, 26(12), 3905; https://doi.org/10.3390/s26123905 (registering DOI) - 19 Jun 2026
Abstract
Traditional white canes offer a limited perception range, whereas end-to-end visual models face challenges in real-time deployment on edge devices. To address these limitations, this paper proposes a lightweight real-time assistive system that integrates geometric topology reconstruction with state-adaptive warning logic. The system utilizes YOLOv9 to extract discrete semantic primitives of tactile paving. It constructs a dual-branch perception framework based on Median Absolute Deviation and the Minimum Spanning Tree algorithm to analyze the topological structure of tactile paving. For complex intersections characterized by warning indicators, a one-dimensional connectivity clustering algorithm based on longitudinal topology is proposed. It generates accurate macroscopic feasible directional prompts under field-of-view boundary constraints. Additionally, a hierarchical scheduling framework dynamically orchestrates scenario-specific finite state machines to enable continuous dynamic interaction across typical high-risk scenarios. Evaluated on a custom real-world dataset, the system achieves a 95.21% frame-level comprehensive accuracy for straight-path deviation correction and intersection directional prompting. Dynamic temporal stress tests confirm the temporal stability and logical coherence of state transitions. Furthermore, latency evaluations demonstrate the logic layer’s minimal computational overhead, proving its theoretical feasibility for real-time edge deployment. This approach provides an effective, low-latency solution for delivering directional prompts and hazard warnings to visually impaired users. Full article
(This article belongs to the Section Intelligent Sensors)
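The abstract above names Median Absolute Deviation (MAD) as one basis of the perception framework but does not say how it is applied. One common use of MAD is robust outlier rejection, sketched below on hypothetical horizontal positions of detected paving tiles. The function name, the threshold of 3, and the sample values are all assumptions rather than details from the paper.

```python
from statistics import median


def mad_inliers(values, threshold=3.0):
    """Keep values whose distance from the median is within threshold * MAD."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All deviations collapse to zero: keep only values equal to the median.
        return [v for v in values if v == center]
    return [v for v in values if abs(v - center) <= threshold * mad]


# Hypothetical x-centers (in pixels) of detected tiles; 640 is a stray detection.
x_centers = [310, 318, 305, 322, 314, 640]
kept = mad_inliers(x_centers)
```

Because both the center and the spread are medians, a single stray detection barely shifts the acceptance band, which is the usual reason for preferring MAD over a mean and standard deviation.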
13 pages, 3658 KB  
Article
TR-ABFT: Tile-Resilient Fault Detection for Neural Processing Units
by Yang Hua, Yunhong Bai, Bo Wang, Wei Zhuang and Yuanfu Zhao
Electronics 2026, 15(12), 2715; https://doi.org/10.3390/electronics15122715 - 19 Jun 2026
Abstract
Spaceborne neural processing units (NPUs) increasingly support real-time deep-learning inference, but their dense multiply-accumulate arrays are vulnerable to radiation-induced soft errors. Conventional radiation-hardening methods improve reliability through hardware redundancy, but they incur substantial area, performance and compiler-mapping overheads. This paper proposes tile-resilient algorithm-based fault tolerance (TR-ABFT), a software-scheduled, detection-oriented scheme for quantized NPU inference. TR-ABFT generates checksum information at tile granularity and maps checking tasks onto the original processing element (PE) array without changing the hardware topology. To make ABFT compatible with INT8 datapaths, we design two checksum-coding strategies: checksum decomposition and modulo-239 checksum coding. The modulo-239 scheme removes structural missed detections for two-bit flips with bit-position spacings in (1, 31), while preserving compatibility with signed INT8 inputs. Evaluations on ResNet, YOLOv8, and RT-DETR show that, on a 16×16 array, TR-ABFT introduces only 6.37% to 24.61% additional computational overhead. By converting spatial redundancy into schedulable temporal redundancy, TR-ABFT preserves systolic-array regularity and provides a low-overhead reliability-enhancement mechanism for space-grade neural-network accelerators. Full article
(This article belongs to the Special Issue Artificial Intelligence and Microsystems)
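As background for the checksum idea in the abstract above, classic algorithm-based fault tolerance for a product C = A·B relies on the identity (1ᵀA)·B = 1ᵀC: the column sums of C must match the checksum row of A multiplied by B. The sketch below performs that comparison on residues modulo 239, the modulus named in the abstract. It is a generic illustration only. The paper's tile scheduling, its exact checksum coding, and its INT8 datapath handling are not reproduced, and all function names are hypothetical.

```python
MOD = 239  # modulus named in the abstract; the rest of this sketch is generic ABFT


def matmul(a, b):
    """Integer matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def column_checksums(m):
    """Sum each column of m, reduced modulo MOD."""
    return [sum(row[j] for row in m) % MOD for j in range(len(m[0]))]


def tile_is_consistent(a, b, c):
    """Check that c is consistent with a @ b using one checksum row."""
    checksum_row = [column_checksums(a)]          # 1 x K row vector
    expected = [v % MOD for v in matmul(checksum_row, b)[0]]
    return expected == column_checksums(c)


# Signed INT8-range operands for one hypothetical 2 x 2 tile.
a = [[12, -7], [-128, 127]]
b = [[3, -5], [90, -64]]
c = matmul(a, b)

ok_before = tile_is_consistent(a, b, c)
c[1][0] ^= 1 << 9          # inject a single bit flip into one accumulator
ok_after = tile_is_consistent(a, b, c)
```

A single bit flip changes one output by plus or minus a power of two, which is never a multiple of the odd prime 239, so the residue check in this sketch always catches it. Multi-bit patterns are where the choice of modulus matters, which is the case the abstract's modulo-239 scheme targets.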
43 pages, 13866 KB  
Article
Research on Multi-Source Heterogeneous Collaborative Perception System Based on Unmanned Aerial Vehicle and Unmanned Ground Vehicle
by Yufeng Li, Erming Tian, Xiaofeng Chen, Huiyan Han and Xinya Zhang
Drones 2026, 10(6), 470; https://doi.org/10.3390/drones10060470 (registering DOI) - 19 Jun 2026
Abstract
Complex urban scenarios impose high demands on the environmental perception capabilities of unmanned systems, which serve as a prerequisite for executing autonomous missions such as disaster response, infrastructure inspection, and smart city operations. UAVs, leveraging their high mobility, can provide accurate prior maps and wide-area aerial observation for unmanned ground vehicles. However, their long-range perception accuracy is limited. Conversely, UGVs can achieve high-precision environmental perception along their navigation paths using prior maps, but suffer from a constrained field of view. The collaboration between the two platforms complements their respective strengths, thereby enhancing 3D object perception and mapping accuracy in complex scenarios. To address the aforementioned challenges, this study proposes a cross-platform feature fusion method for 3D object perception and an incremental map updating approach for UAVs and UGVs. First, a dynamic SLAM method that integrates an optimized YOLOv8 with ORB-SLAM3 is employed to mitigate map blurring caused by dynamic noise, providing prior map information for UGVs. Second, a multimodal fusion perception model is constructed for UGVs, utilizing attention mechanisms to achieve deep fusion of multimodal Bird’s-Eye-View (BEV) features. This overcomes issues such as diminishing complementarity between modalities and weak temporal feature associations. Finally, an air-ground fusion model based on a cross-attention mechanism is developed to fuse aerial view features with ground-based fused BEV features across platforms, yielding a unified feature representation for 3D object detection and generating a fused high-precision map. 
Experimental results demonstrate that under complex occlusion scenarios in a simulated dataset, the proposed collaborative perception system improves the mean Average Precision (mAP) by 12.7% and 15.7% compared to using a single UAV or a single UGV, respectively, while increasing the map accuracy F1-score by 0.21. This study provides technical support for achieving real-time and accurate air-ground collaborative perception in complex dynamic environments. Full article
(This article belongs to the Section Innovative Urban Mobility)
27 pages, 23377 KB  
Article
YOLO-Crack: Geometry-Guided Real-Time Crack Detection Framework Toward Edge Deployment
by Zhe Wei, Rui Wang, Rong Dai, Haibo Xu, Huan Zhang and Yurong Zou
Sensors 2026, 26(12), 3892; https://doi.org/10.3390/s26123892 (registering DOI) - 18 Jun 2026
Abstract
Crack detection in mobile inspection scenarios is constrained by both the extremely slender geometry of crack targets and the real-time inference requirements on edge devices, which expose systematic limitations of general-purpose object detectors. This paper proposes YOLO-Crack, a closed-loop solution that couples geometry-statistics-driven module design with end-to-end edge deployment validation. On the algorithmic side, we first quantify crack geometric properties and then introduce (i) a crack-aware cross-dimensional fusion attention (CFCA) module to strengthen feature representations, (ii) a dual-path feature enhancement module (DFEM) to preserve fine details during upsampling, and (iii) an empirical smooth quality window adjustment with shape consistency regularization to stabilize bounding-box regression for slender cracks. Experiments on the Crack500 dataset show that YOLO-Crack achieves 78.8% precision, 51.4% recall, and 65.7% mAP@0.5, improving over the YOLOv11n baseline by 4.2, 1.7, and 2.9 percentage points, respectively. On the engineering side, we deploy YOLO-Crack on a Jetson Orin NX mobile robot platform and evaluate it in a real ROS pipeline; the measured end-to-end throughput reaches 25.5 FPS, meeting real-time video processing requirements. The proposed framework provides a practical reference workflow for edge vision tasks, from geometry analysis to engineering verification. Full article
(This article belongs to the Special Issue Image-Based Surface Damage Detection)
29 pages, 6688 KB  
Article
CGMSN: CFAR-Guided Mode-Selective Network for SAR Target Detection
by Lingjuan Yu, Xinya Xiong, Xiaochun Xie, Miaomiao Liang, Xiangchun Yu, Xuan Jiao and Wen Hong
Remote Sens. 2026, 18(12), 2040; https://doi.org/10.3390/rs18122040 - 18 Jun 2026
Abstract
Improving detection performance across diverse synthetic aperture radar (SAR) scenes remains challenging because different datasets exhibit different levels of target–background separability. To address this issue, we propose a constant false alarm rate (CFAR)-guided mode-selective network (CGMSN), which selects an appropriate feature-fusion mode according to the CFAR target–background separation margin. Specifically, CFAR is used as an interpretable statistical tool to construct an anomaly response map. The separation margin is then calculated by comparing the average CFAR anomaly responses of annotated target regions and their surrounding contextual backgrounds. Based on this indicator, a You Only Look Once version 8 (YOLOv8)-based mode-selective detector is constructed with three key components. First, a lightweight representation-enhanced backbone that integrates ResNet18 and a dilated convolutional spatial pyramid (DCSP) module is adopted to improve contextual representation while maintaining moderate model complexity. Second, a mode-selective neck (MSN) is designed with three predefined fusion modes, where the appropriate fusion depth is selected according to the CFAR-guided target–background separation margin of each dataset. Third, a complete intersection over the union modulated head (CMH) is developed to enhance classification-regression alignment and suppress clutter-induced responses. Experiments on SAR-Aircraft-1.0, High-Resolution SAR Images Dataset (HRSID), and SAR Ship Detection Dataset (SSDD) indicate that datasets with smaller CFAR target–background separation margins benefit from deeper fusion, while datasets with larger separation margins can adopt shallower fusion. Moreover, the proposed CGMSN achieves superior performance over representative detectors, demonstrating its effectiveness on the evaluated SAR datasets with diverse scene characteristics. Full article
26 pages, 3882 KB  
Article
Remote Sensing Small Object Detection Network Based on Wavelet-Convolution and Fine-Grained Preservation
by Hangyu Li and Tiecheng Song
Information 2026, 17(6), 609; https://doi.org/10.3390/info17060609 (registering DOI) - 18 Jun 2026
Abstract
Small object detection in remote sensing imagery is a fundamental task for visual information extraction, yet it remains challenging due to extremely small target scales, complex backgrounds, and the loss of discriminative feature information caused by repeated downsampling. To address these issues, this paper proposes a Wavelet-Convolution and Fine-Grained Preservation Network (WCFPNet) based on YOLOv8n. Specifically, a Wavelet-Convolution Module (WCM) is introduced into the backbone to decompose feature maps into low- and high-frequency sub-bands, thereby enhancing structural feature modeling and preserving subtle target details. To compensate for the weakened fine-grained information after repeated downsampling, an Enhanced Spatial Pyramid Pooling-Fast (ESPPF) module is embedded at the end of the backbone to strengthen multi-scale contextual aggregation. In addition, an Enhanced Feature Pyramid Network (EFPN) is designed in the neck to facilitate the propagation of shallow and intermediate fine-grained features to high-level semantic features through cross-level fusion and the Convolutional Block Attention Module (CBAM). Experiments on the NWPU VHR-10 dataset show that WCFPNet achieves 0.879 mAP@0.5 and 0.515 mAP@0.5:0.95, outperforming YOLOv8n by 1.7 and 2.5 percentage points, respectively. Moreover, the proposed WCFPNet achieves a competitive performance compared with several representative detectors while maintaining moderate model complexity. These results demonstrate the effectiveness of WCFPNet in challenging remote sensing scenes characterized by complex backgrounds, dense object distributions, and weak textures. Full article
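To make the phrase "decompose feature maps into low- and high-frequency sub-bands" concrete, the sketch below runs a one-level 2D Haar transform on a tiny array. The Haar wavelet is an assumption, since the abstract does not name the wavelet used in the WCM, and the function and band names are hypothetical. Each 2 x 2 block is turned into one local average and three detail coefficients.

```python
def haar_dwt2(image):
    """One-level 2D Haar transform of an image with even height and width.

    Returns four half-resolution sub-bands: the low-frequency average and
    three high-frequency detail bands.
    """
    low, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, len(image), 2):
        rows = ([], [], [], [])
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2)   # local average (low frequency)
            rows[1].append((a - b + d - e) / 2)   # change across columns
            rows[2].append((a + b - d - e) / 2)   # change across rows
            rows[3].append((a - b - d + e) / 2)   # diagonal change
        for band, row in zip((low, col_diff, row_diff, diag), rows):
            band.append(row)
    return low, col_diff, row_diff, diag


image = [[4, 4, 0, 8],
         [4, 4, 0, 8]]
low, col_diff, row_diff, diag = haar_dwt2(image)
```

In this toy input the flat left block produces only a low-frequency value, while the right block, which contains a vertical edge, also produces a nonzero column-difference coefficient. That separation is what lets a network treat fine detail and coarse structure in different branches.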
22 pages, 14170 KB  
Article
A YOLO-Based Workflow for Detecting and Mapping Archaeological Stone Cairns in Satellite Imagery: A Case Study from Western Ennedi, Chad
by Ebrahim Ghaderpour, Clarisse Djetounako Nekoulnang, Hamdji Milman Noudjiko, Pier Paolo Rossi, Rocco Rotunno and Savino di Lernia
Heritage 2026, 9(6), 237; https://doi.org/10.3390/heritage9060237 - 18 Jun 2026
Abstract
Automated detection of archaeological stone cairns using high-resolution satellite imagery offers a scalable approach for documenting vulnerable heritage landscapes in the Ennedi Massif, where extensive and remote terrain limits traditional field survey, and rapid documentation is required. This study presents a GIS and deep learning framework based on the YOLOv8 model to identify and map stone cairns using Google Satellite RGB imagery at 28.5 cm spatial resolution. Ground-truth data collected via GPS field survey were used to train and validate YOLOv8n. The study area was divided into two regions with contrasting terrain and illumination conditions to evaluate model transferability. The training region included 149 verified cairns, while the independent test region included 103 cairns. Early stopping reduced overfitting, reaching mAP50 of 99.5% and mAP50–95 of 94.3%. A density-based spatial clustering algorithm was applied to merge overlapping detections and generate circular cairn representations. On the test set, the model achieved 83.5% precision, recall, and F1-score, indicating stable performance under the selected operational configuration. Comparison with YOLOv5n showed slightly higher localization accuracy for YOLOv8n, while YOLOv5n yielded marginally higher precision and F1-score. Overall, the framework provides a non-invasive tool for large-scale archaeological prospection and heritage monitoring in remote desert environments. Full article
30 pages, 11823 KB  
Article
YOLO-MOD: An Instance Segmentation Algorithm for Pomelo Fruit and Fruit Stem Based on YOLOv11-Seg
by Wei Zhou, Leina Gao, Fuchun Sun, Qiurong Lv, Yuechao Bian, Chi Hu and Senlin Yang
Horticulturae 2026, 12(6), 744; https://doi.org/10.3390/horticulturae12060744 - 18 Jun 2026
Abstract
This study aims to develop an instance segmentation model for the joint segmentation of pomelo fruits and stems in complex natural orchard environments, with particular emphasis on slender, small-scale, and easily occluded stem targets. To this end, YOLO-MOD, an improved instance segmentation algorithm based on YOLOv11-seg, is proposed. Specifically, Omni-Dimensional Dynamic Convolution (ODConv) is introduced into the C3k2 module to enhance complex feature representation; a Multi-Scale Dilated Attention (MSDA) module is embedded to improve the multi-scale semantic perception of slender stem regions; and the original upsampling operator is replaced with DySample to strengthen fine-grained boundary recovery. Experimental results show that, compared with the original YOLOv11-seg, YOLO-MOD improves the Box mAP@50 and Mask mAP@50 by 2.9% and 3.9%, respectively. For the Stem class, the Box mAP@50 and Mask mAP@50 increase from 71.9% to 77.8% and from 68.4% to 76.2%, respectively. These results indicate that YOLO-MOD can achieve fine-grained segmentation of pomelo fruits and stems on the dataset used in this study. However, its generalization capability across different orchards, seasons, pomelo varieties, and fruit types still requires further evaluation, and its practical effectiveness in an integrated robotic harvesting system remains to be further validated. Full article

18 pages, 5048 KB  
Article
AI-Driven Pavement Condition Assessment from Dash-Cam Imagery: A Comparative Analysis of YOLOv8-Based PCI Estimation, Manual Inspections, and Automated PASER Ratings in Urban Networks
by Giulia Del Serrone, Giuseppe Loprencipe and Laura Moretti
Infrastructures 2026, 11(6), 207; https://doi.org/10.3390/infrastructures11060207 - 18 Jun 2026
Abstract
This study presents an AI-enabled framework for automated pavement condition assessment in urban environments by integrating YOLOv8-based distress detection, computational Pavement Condition Index (PCI) estimation, and comparative validation against manual PCI inspections and Pavement Surface Evaluation and Rating (PASER) scores. A YOLOv8 object-detection model, implemented in Python and trained on the publicly available N-RDD2024 dataset, was developed to identify longitudinal cracks, transverse cracks, alligator cracking, and potholes. The model achieved an accuracy of 84.6%, a precision of 89.6%, and a recall of 86.3%, demonstrating robust detection performance under heterogeneous environmental conditions. Dash-cam imagery collected along 6.3 km of urban flexible pavements was processed through an automated workflow that detects pavement distresses, estimates their severity and extent, and computes PCI values according to ASTM D6433-20 procedures. Automated PCI values were compared with manual PCI inspections and PASER ratings generated by the Blyncsy platform across 23 pavement sections. Statistical validation between automated and manual PCI assessments returned an R-squared of 0.925, a Pearson correlation coefficient of 0.962, a Spearman correlation coefficient of 0.955, a Mean Absolute Error of 5.0 PCI points, and a Root Mean Square Error of 6.1 PCI points. Compared with the proposed framework, PASER ratings exhibited lower agreement with manual PCI assessments and generally overestimated the pavement condition. The results demonstrate the potential of low-cost AI-based systems for large-scale pavement monitoring. Nevertheless, performance degradation was observed under challenging environmental conditions and in heavily deteriorated sections, highlighting the need for improved distress quantification, dataset balancing, and multimodal sensing integration.
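The agreement statistics named in the abstract above (Pearson, Spearman, R-squared, MAE, RMSE) can be sketched as follows. This is a minimal illustration using only the Python standard library. The PCI values are invented placeholders, not data from the article, and the R-squared shown here is assumed to be the squared Pearson coefficient of a simple linear fit, which is consistent with the reported pair (0.962 squared is about 0.925) but is not confirmed by the abstract.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def mae(x, y):
    """Mean absolute error between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def rmse(x, y):
    """Root mean square error between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical PCI values for five sections (placeholders, not study data).
manual_pci = [85.0, 72.0, 60.0, 45.0, 30.0]
automated_pci = [80.0, 75.0, 58.0, 50.0, 28.0]

r = pearson(manual_pci, automated_pci)
print("Pearson r :", round(r, 3))
print("R-squared :", round(r ** 2, 3))  # assumed: squared Pearson of a linear fit
print("Spearman  :", round(spearman(manual_pci, automated_pci), 3))
print("MAE       :", round(mae(manual_pci, automated_pci), 2))
print("RMSE      :", round(rmse(manual_pci, automated_pci), 2))
```

Pearson and Spearman measure association only, so a method with a constant offset from the manual PCI could still score highly on both. MAE and RMSE, reported in PCI points, are the statistics that capture the size of the disagreement.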
(This article belongs to the Special Issue Smart Mobility and Transportation Infrastructure)

20 pages, 6258 KB  
Article
A Lightweight Tea Bud Detector via Cascaded Gated Modulation and Multi-Scale Feature Enhancement
by Zewei Mi and Minming Gu
AI 2026, 7(6), 227; https://doi.org/10.3390/ai7060227 - 18 Jun 2026
Abstract
Accurate detection of tea buds is a key technology for enabling automated tea harvesting. However, in natural environments, tea buds present challenges such as scale variation, dense distribution, and high similarity to the background, making it difficult for traditional methods to balance accuracy and efficiency. To address these issues, this paper proposes a lightweight detection framework, PCM-YOLO. The model introduces a cascaded gated feature modulation network into the YOLOv11 architecture, combining feedforward structures and gating mechanisms to selectively emphasize informative features, thereby improving tea bud detection performance. In addition, a feature-enhanced downsampling module is proposed, which employs a stepwise pooling-based feature enhancement mechanism to progressively expand the receptive field while preserving feature resolution, effectively incorporating multi-scale contextual information. Finally, a multi-scale feature enhancement module is designed to reduce the computational complexity of the model while maintaining detection performance as much as possible. Experimental results on public datasets demonstrate notable performance improvements over YOLOv11-N: Precision increases from 86.7% to 90.6% (an absolute increase of 3.9 percentage points), mAP50-95 increases by 1.6%, and the number of parameters is reduced by 20.6%. These results indicate that PCM-YOLO achieves a substantial reduction in model complexity while effectively improving detection accuracy, providing a feasible technical solution for deploying high-precision, real-time tea bud detection systems at the edge in tea plantation environments.
(This article belongs to the Section AI Systems: Theory and Applications)