Quantitative phenotyping of pepper seedlings is important for greenhouse plug tray seedling cultivation, but it remains constrained by inefficient manual monitoring, complex greenhouse backgrounds, and growth-stage-dependent discrepancies between two-dimensional image traits and actual leaf biomass. In this study, a cascaded vision framework with stage-specific morphological correction was developed for nondestructive seedling phenotyping. The framework integrated Visual Dynamic Momentum YOLO (VDM-YOLO) for individual seedling localization and growth-stage recognition, Variance Guided Strip Ghost Gated UNet (VSG-UNet) for lightweight, high-resolution leaf segmentation, and a stage-aware correction model for leaf dry biomass estimation. In performance evaluation, VDM-YOLO achieved a mean average precision at an intersection over union threshold of 0.5 (mAP_0.5) of 89.27%, improving mAP_0.5 by 1.82 percentage points over YOLOv12. VSG-UNet achieved a mean intersection over union (mIoU) of 83.9% and a Dice coefficient of 81.8%, while reducing floating point operations (FLOPs) and parameters by 44.2% and 61.2%, respectively, compared with U-Net. After stage-aware calibration, the coefficient of determination (R²) between segmented area and leaf dry weight increased from 0.764 to 0.813, and the root mean square error (RMSE) decreased from 0.0210 g to 0.0190 g. These results demonstrated that the proposed framework provided a proof of concept approach based on RGB images for the nondestructive assessment of leaf area and leaf dry biomass in pepper seedlings under restricted experimental conditions. Full article

(This article belongs to the Section Plant Modeling)

19 pages, 4732 KB

Open AccessArticle

YOLO-OBB and Two-Stage Geometric Correction for RGB-LED Array Optical Camera Communication

by Jiaqi Ju, Pan Qiu, Yipeng Tan and Zhengguang Shi

Photonics 2026, 13(6), 599; https://doi.org/10.3390/photonics13060599 (registering DOI) - 20 Jun 2026

Abstract

In Optical Camera Communication (OCC), precise localization of LED arrays under complex tilt conditions is a core challenge for reliable decoding. This paper proposes an OCC reception scheme for RGB-LED arrays that integrates YOLO-OBB rotated object detection with two-stage geometric correction. The system first employs a YOLOv8n-OBB model to extract a quadrilateral region of interest that tightly encloses the LED array boundary. This effectively suppresses background interference caused by superimposed perspective tilt and in-plane rotation. A coarse-to-fine two-stage correction framework is then applied. The first stage rapidly eliminates the dominant perspective distortion based on the detected bounding-box corners. The second stage performs a refined correction using the actual LED center positions. Two homography matrices are cascaded into a combined transformation, achieving two-stage correction accuracy through a single coordinate mapping. In the corrected image, K-Means clustering constructs a 16 × 16 LED topological grid. A locking strategy is adopted so that subsequent frames skip repeated LED detection and clustering. The steady-state per-frame processing time is reduced to approximately 78.9 ms. Experiments covered 16 cross-combinations of vertical tilt from 0° to 45° (0°, 15°, 30°, 45°) and in-plane rotation from 0° to 40° (0°, 15°, 30°, 40°). The uncorrected scheme and the horizontal-box scheme experienced severe bit errors or complete failure under complicated distortion. The proposed scheme maintained error-free transmission under all 16 tested conditions. The ratios of opposite sides of the corrected LED grid remained stable between 0.997 and 1.004. The system simultaneously achieves high reliability and low-latency real-time processing under complex geometric distortions. Full article
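The cascading step described above relies on a standard property of homographies: applying one after the other is equivalent to applying their matrix product once. The following is a minimal sketch of that property only, not the authors' implementation; the matrices `H1` and `H2`, the test point, and the helper names are hypothetical values chosen for illustration.

```python
# Sketch: two homographies cascaded into one combined mapping.
# H1 (coarse) and H2 (fine) are made-up matrices, not values from the paper.

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_h(h, pt):
    """Apply homography h to a 2D point using homogeneous coordinates."""
    x, y = pt
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)

H1 = [[1.0, 0.2, 5.0], [0.0, 1.1, -3.0], [0.0, 0.001, 1.0]]  # stage 1 (assumed)
H2 = [[0.98, 0.0, 1.5], [0.01, 1.0, 0.5], [0.0, 0.0, 1.0]]   # stage 2 (assumed)

# The second-stage matrix multiplies on the left because it is applied last.
H = matmul3(H2, H1)

p = (120.0, 80.0)
two_step = apply_h(H2, apply_h(H1, p))  # correct twice
one_step = apply_h(H, p)                # correct once with the combined matrix
```

Both paths give the same coordinates up to floating-point rounding, which is why an image needs to be resampled only once when the combined matrix is used.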

(This article belongs to the Special Issue Editorial Board Members’ Collection Series: Optical Wireless Communication)

31 pages, 34272 KB

Open AccessArticle

Reliable Vision-Based PPE Detection for Construction Safety in Adverse Environmental Conditions

by Sujan Gyawali, Ali Mohammadjafari, Saurav Ghimire and Mahmoud Habibnezhad

Buildings 2026, 16(12), 2447; https://doi.org/10.3390/buildings16122447 (registering DOI) - 20 Jun 2026

Abstract

Adverse imaging conditions such as fog, rain, and low light degrade the reliability of vision-based Personal Protective Equipment (PPE) detection systems on construction sites, yet most existing models are trained under clear-weather assumptions. This paper introduces a physics-based weather augmentation framework integrated with the YOLOv8n architecture to improve PPE detection robustness under adverse environmental conditions. The original Color Helmet and Vest (CHV) dataset was expanded from 1330 clear-weather images to 6650 images across five conditions using four physically grounded augmentation models: the Koschmieder atmospheric scattering model for fog, the Garg–Nayar streak model for rain, gamma-corrected attenuation with Poisson–Gaussian noise for low light, and a PSF-based glare model for bright sunlight. The weather-resistant model, a clear-weather baseline, and an augmented baseline were evaluated on the same 665-image weather-augmented test set. The weather-resistant model achieves 89.2% mAP50, a 5.7 percentage-point improvement over the clear-weather baseline (83.5%), with a nearly four-fold improvement in cross-condition stability (standard deviation 1.5% vs. 5.7%). Under matched training-data volume, the weather-resistant model still outperforms a conventionally augmented baseline across all five simulated conditions, indicating that these gains stem from physics-based modeling rather than larger training-data volume. The largest gain occurs under low light, where mAP50 improves from 73.4% to 87.9%. Gradient-weighted Class Activation Mapping (Grad-CAM) analysis confirms that the weather-resistant model directs more attention toward PPE regions across all conditions, with the largest improvement under low light (+10.0 percentage points). 
The lightweight design (3.0 M parameters) and quantitative and qualitative validation on 205 annotated real-world construction site images under normal and low-light conditions provide preliminary evidence of practical applicability. Full article
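For readers unfamiliar with the Koschmieder model named above, its usual form is I = J·t + A·(1 − t) with transmission t = exp(−β·d), where J is the clear-scene intensity, A the airlight, β the scattering coefficient, and d the scene depth. The sketch below applies it to a single normalized pixel value; the function name and the default values of `beta` and `airlight` are assumptions for illustration, and the abstract does not state which parameters the authors used.

```python
import math

def add_fog(pixel, depth_m, beta=0.05, airlight=0.9):
    """Koschmieder scattering for one normalized intensity in [0, 1].

    beta (1/m) and airlight are illustrative defaults, not the paper's values.
    """
    transmission = math.exp(-beta * depth_m)
    return pixel * transmission + airlight * (1.0 - transmission)

near = add_fog(0.2, 0.0)     # zero depth: the pixel is unchanged
far = add_fog(0.2, 1000.0)   # large depth: the pixel fades toward the airlight
```

A full augmentation pipeline would apply this per pixel with a depth map, which is not reproduced here.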

(This article belongs to the Special Issue Intelligent Monitoring for Health and Safety in Built Environments)

23 pages, 2264 KB

Open AccessArticle

Real-Time Leaf Disease Detection with Boundary-Aware and Texture-Sensitive Feature Enhancement

by Jinyang Qiu, Qiuyi Du, Yonggang Wang, Yuhan Tao, Yue Guo, Ye Zhang and Yue Gao

Symmetry 2026, 18(6), 1059; https://doi.org/10.3390/sym18061059 (registering DOI) - 19 Jun 2026

Abstract

Accurate and robust detection of leaf diseases is a key enabler for precision agriculture and large-scale crop health monitoring. Despite the strong generalization of modern one-stage detectors (e.g., YOLOv8), two domain-specific challenges remain: (i) weak or blurry lesion boundaries hinder precise localization, and (ii) low color contrast between diseased and healthy tissues forces models to rely on subtle texture patterns rather than salient shapes. To tackle these challenges, we reframe the core agricultural disease detection task as the identification of “asymmetric morphological anomalies” and propose a domain-tailored enhancement framework. First, we introduce an Edge Enhancement Module (EEM) that explicitly strengthens boundary-aware representations. Inspired by the natural symmetry of healthy leaves, our EEM is specifically designed to capture symmetry-breaking boundary discontinuities and localized asymmetric edges caused by disease lesions. Our method enhances edge and texture cues that are indicative of disease lesions, which often exhibit local asymmetries and boundary discontinuities. The EEM includes a Differential Normalized Pooling Block (DNPB) that highlights edge responses through discrepancies between max pooling and average pooling, which also models cross-group edge correlations. Second, the Lightweight Texture-Sensitive Feature Enhancement (LTSFE) mechanism amplifies texture-discriminative channels under low-contrast conditions by leveraging complementary global statistics and efficient channel mixing, all with negligible computational overhead. We evaluated our method on a self-constructed dataset of 106,434 images with 225,640 annotations covering diverse crops. Experiments show that the proposed method achieves state-of-the-art accuracy (81.54% mAP@0.5:0.95) while maintaining real-time inference (142 FPS), consistently outperforming strong baselines. 
Ablations confirm the effectiveness and complementarity of EEM and LTSFE, demonstrating that domain-specific architectural design, inspired by biological symmetry, can substantially improve agricultural vision systems. Full article
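The idea of highlighting edges through the gap between max pooling and average pooling can be illustrated on a toy grid: in a flat window the two poolings agree, while a window that straddles an edge produces a positive difference. This is a simplified sketch of that principle only; the function name is hypothetical, and the normalization and cross-group modeling of the DNPB are not reproduced.

```python
def pool_difference(img, k=2):
    """Return max-pool minus average-pool over non-overlapping k x k windows."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(0, rows - k + 1, k):
        out_row = []
        for c in range(0, cols - k + 1, k):
            window = [img[r + i][c + j] for i in range(k) for j in range(k)]
            out_row.append(max(window) - sum(window) / len(window))
        out.append(out_row)
    return out

# Left window is flat; right window contains a vertical edge (0 next to 4).
toy = [[1, 1, 0, 4],
       [1, 1, 0, 4]]
response = pool_difference(toy)
```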

(This article belongs to the Section Engineering and Materials)

17 pages, 15918 KB

Open AccessArticle

ADA-YOLO: An Adaptive Dynamic Aggregation Network for Small Object Detection in UAV Imagery

by Jiajun Chen, Shaochen Jiang, Yongming Li, Sulaiman Tuersunayi and Yong Liu

Sensors 2026, 26(12), 3908; https://doi.org/10.3390/s26123908 (registering DOI) - 19 Jun 2026

Abstract

Unmanned Aerial Vehicle (UAV) image object detection holds significant application value in the low-altitude economy, traffic monitoring, intelligent agriculture, and disaster rescue. However, due to the top-down perspective, UAV images typically suffer from small target scales, dense object distribution, severe occlusions, and complex backgrounds. These issues often limit the recall and localization accuracy of general-purpose detectors when they are directly applied to UAV small-object detection scenarios. To address these challenges, this paper proposes an Adaptive Dynamic Aggregation YOLO network, termed ADA-YOLO. The novelty of ADA-YOLO lies in its efficient combinatorial design tailored for UAV small-object detection: while retaining the efficient backbone of YOLOv8, we systematically reconstruct the neck and detection head to improve accuracy. Specifically, a high-resolution P2 detection branch is incorporated to construct a P2–P5 multi-scale prediction structure. Furthermore, the lightweight DySample dynamic upsampling module is adopted to replace traditional upsampling methods, and an Adaptive Spatial Feature Fusion (ASFF) mechanism is introduced to alleviate semantic conflicts and noise interference during multi-scale feature fusion. This combination explicitly addresses multi-scale representation challenges and enhances small-object detection performance in complex scenes. Comparative experiments with the baseline YOLOv8n on the VisDrone2019 dataset demonstrate that ADA-YOLO achieves an improvement of 11.3% in mAP@0.5 and 8.2% in mAP@0.5:0.95. The improved model achieves these performance gains with a modest parameter increase and acceptable computational complexity. Finally, ablation experiments further validate the effectiveness of each individual module and their synergistic gains. Full article

(This article belongs to the Section Remote Sensors)

24 pages, 13145 KB

Open AccessArticle

Real-Time Assistive System Integrating Geometric Topology Analysis and State-Adaptive Warning Logic for the Visually Impaired

by Bilie Hu, Peishen Gao, Yan Liu, Xi Xia and Guoping Huo

Sensors 2026, 26(12), 3905; https://doi.org/10.3390/s26123905 (registering DOI) - 19 Jun 2026

Abstract

Traditional white canes offer a limited perception range, whereas end-to-end visual models face challenges in real-time deployment on edge devices. To address these limitations, this paper proposes a lightweight real-time assistive system that integrates geometric topology reconstruction with state-adaptive warning logic. The system utilizes YOLOv9 to extract discrete semantic primitives of tactile paving. It constructs a dual-branch perception framework based on Median Absolute Deviation and the Minimum Spanning Tree algorithm to analyze the topological structure of tactile paving. For complex intersections characterized by warning indicators, a one-dimensional connectivity clustering algorithm based on longitudinal topology is proposed. It generates accurate macroscopic feasible directional prompts under field-of-view boundary constraints. Additionally, a hierarchical scheduling framework dynamically orchestrates scenario-specific finite state machines to enable continuous dynamic interaction across typical high-risk scenarios. Evaluated on a custom real-world dataset, the system achieves a 95.21% frame-level comprehensive accuracy for straight-path deviation correction and intersection directional prompting. Dynamic temporal stress tests confirm the temporal stability and logical coherence of state transitions. Furthermore, latency evaluations demonstrate the logic layer’s minimal computational overhead, proving its theoretical feasibility for real-time edge deployment. This approach provides an effective, low-latency solution for delivering directional prompts and hazard warnings to visually impaired users. Full article
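Median Absolute Deviation (MAD) is a robust way to reject outliers, for example stray detections lying far from a line of tactile-paving tiles. The sketch below shows generic MAD filtering; how the paper applies it is not stated in the abstract, so the function name, the threshold of 3, and the sample offsets are assumptions. The constant 1.4826 is the usual factor that makes the MAD comparable to a standard deviation for normally distributed data.

```python
from statistics import median

def mad_inliers(values, threshold=3.0):
    """Keep values whose robust z-score (based on MAD) is within threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: more than half the values coincide with the median.
        return [v for v in values if v == med]
    scale = 1.4826 * mad
    return [v for v in values if abs(v - med) / scale <= threshold]

# Hypothetical horizontal offsets (pixels) of detected paving tiles.
offsets = [100, 102, 98, 101, 99, 300]
kept = mad_inliers(offsets)
```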

(This article belongs to the Section Intelligent Sensors)

13 pages, 3658 KB

Open AccessArticle

TR-ABFT: Tile-Resilient Fault Detection for Neural Processing Units

by Yang Hua, Yunhong Bai, Bo Wang, Wei Zhuang and Yuanfu Zhao

Electronics 2026, 15(12), 2715; https://doi.org/10.3390/electronics15122715 - 19 Jun 2026

Abstract

Spaceborne neural processing units (NPUs) increasingly support real-time deep-learning inference, but their dense multiply-accumulate arrays are vulnerable to radiation-induced soft errors. Conventional radiation-hardening methods improve reliability through hardware redundancy, but they incur substantial area, performance and compiler-mapping overheads. This paper proposes tile-resilient algorithm-based fault tolerance (TR-ABFT), a software-scheduled, detection-oriented scheme for quantized NPU inference. TR-ABFT generates checksum information at tile granularity and maps checking tasks onto the original processing element (PE) array without changing the hardware topology. To make ABFT compatible with INT8 datapaths, we design two checksum-coding strategies: checksum decomposition and modulo-239 checksum coding. The modulo-239 scheme removes structural missed detections for two-bit flips with bit-position spacings in (1, 31), while preserving compatibility with signed INT8 inputs. Evaluations on ResNet, YOLOv8, and RT-DETR show that, on a 16 × 16 array, TR-ABFT introduces only 6.37% to 24.61% additional computational overhead. By converting spatial redundancy into schedulable temporal redundancy, TR-ABFT preserves systolic-array regularity and provides a low-overhead reliability-enhancement mechanism for space-grade neural-network accelerators. Full article
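Checksum-based ABFT for matrix multiplication rests on the identity that the column sums of C = A·B equal the column sums of A multiplied by B, and the identity still holds when both sides are reduced modulo a constant. The sketch below shows that generic check on one small tile with the modulus 239 named in the abstract. It is not the TR-ABFT scheduling or coding scheme itself; the function names and the tile contents are hypothetical.

```python
MOD = 239  # modulus named in the abstract

def matmul(a, b):
    """Plain integer matrix product, standing in for the PE array."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def column_checksum(m, mod=MOD):
    """Sum each column and reduce modulo mod."""
    return [sum(col) % mod for col in zip(*m)]

def tile_is_consistent(a, b, c, mod=MOD):
    """Check that (checksum(A) x B) mod m equals checksum(C) mod m."""
    a_sum = column_checksum(a, mod)
    expected = [sum(a_sum[k] * b[k][j] for k in range(len(b))) % mod
                for j in range(len(b[0]))]
    return expected == column_checksum(c, mod)

a = [[1, -2], [3, 4]]    # signed 8-bit style operands (illustrative)
b = [[5, 6], [-7, 8]]
c = matmul(a, b)
clean_ok = tile_is_consistent(a, b, c)

c[0][0] ^= 1 << 4        # inject a single-bit flip into one output element
faulty_ok = tile_is_consistent(a, b, c)
```

A single-bit flip changes an output by plus or minus a power of two, which is never a multiple of the odd prime 239, so this check always flags it. The behaviour for multi-bit flips depends on the coding details described in the paper.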

(This article belongs to the Special Issue Artificial Intelligence and Microsystems)

43 pages, 13866 KB

Open AccessArticle

Research on Multi-Source Heterogeneous Collaborative Perception System Based on Unmanned Aerial Vehicle and Unmanned Ground Vehicle

by Yufeng Li, Erming Tian, Xiaofeng Chen, Huiyan Han and Xinya Zhang

Drones 2026, 10(6), 470; https://doi.org/10.3390/drones10060470 (registering DOI) - 19 Jun 2026

Abstract

Complex urban scenarios impose high demands on the environmental perception capabilities of unmanned systems, which serve as a prerequisite for executing autonomous missions such as disaster response, infrastructure inspection, and smart city operations. UAVs, leveraging their high mobility, can provide accurate prior maps and wide-area aerial observation for unmanned ground vehicles. However, their long-range perception accuracy is limited. Conversely, UGVs can achieve high-precision environmental perception along their navigation paths using prior maps, but suffer from a constrained field of view. The collaboration between the two platforms complements their respective strengths, thereby enhancing 3D object perception and mapping accuracy in complex scenarios. To address the aforementioned challenges, this study proposes a cross-platform feature fusion method for 3D object perception and an incremental map updating approach for UAVs and UGVs. First, a dynamic SLAM method that integrates an optimized YOLOv8 with ORB-SLAM3 is employed to mitigate map blurring caused by dynamic noise, providing prior map information for UGVs. Second, a multimodal fusion perception model is constructed for UGVs, utilizing attention mechanisms to achieve deep fusion of multimodal Bird’s-Eye-View (BEV) features. This overcomes issues such as diminishing complementarity between modalities and weak temporal feature associations. Finally, an air-ground fusion model based on a cross-attention mechanism is developed to fuse aerial view features with ground-based fused BEV features across platforms, yielding a unified feature representation for 3D object detection and generating a fused high-precision map.
Experimental results demonstrate that under complex occlusion scenarios in a simulated dataset, the proposed collaborative perception system improves the mean Average Precision (mAP) by 12.7% and 15.7% compared to using a single UAV or a single UGV, respectively, while increasing the map accuracy F1-score by 0.21. This study provides technical support for achieving real-time and accurate air-ground collaborative perception in complex dynamic environments. Full article

(This article belongs to the Section Innovative Urban Mobility)

27 pages, 23377 KB

Open AccessArticle

YOLO-Crack: Geometry-Guided Real-Time Crack Detection Framework Toward Edge Deployment

by Zhe Wei, Rui Wang, Rong Dai, Haibo Xu, Huan Zhang and Yurong Zou

Sensors 2026, 26(12), 3892; https://doi.org/10.3390/s26123892 (registering DOI) - 18 Jun 2026

Abstract

Crack detection in mobile inspection scenarios is constrained by both the extremely slender geometry of crack targets and the real-time inference requirements on edge devices, which expose systematic limitations of general-purpose object detectors. This paper proposes YOLO-Crack, a closed-loop solution that couples geometry-statistics-driven module design with end-to-end edge deployment validation. On the algorithmic side, we first quantify crack geometric properties and then introduce (i) a crack-aware cross-dimensional fusion attention (CFCA) module to strengthen feature representations, (ii) a dual-path feature enhancement module (DFEM) to preserve fine details during upsampling, and (iii) an empirical smooth quality window adjustment with shape consistency regularization to stabilize bounding-box regression for slender cracks. Experiments on the Crack500 dataset show that YOLO-Crack achieves 78.8% precision, 51.4% recall, and 65.7% mAP@0.5, improving over the YOLOv11n baseline by 4.2, 1.7, and 2.9 percentage points, respectively. On the engineering side, we deploy YOLO-Crack on a Jetson Orin NX mobile robot platform and evaluate it in a real ROS pipeline; the measured end-to-end throughput reaches 25.5 FPS, meeting real-time video processing requirements. The proposed framework provides a practical reference workflow for edge vision tasks, from geometry analysis to engineering verification. Full article

(This article belongs to the Special Issue Image-Based Surface Damage Detection)

29 pages, 6688 KB

Open AccessArticle

CGMSN: CFAR-Guided Mode-Selective Network for SAR Target Detection

by Lingjuan Yu, Xinya Xiong, Xiaochun Xie, Miaomiao Liang, Xiangchun Yu, Xuan Jiao and Wen Hong

Remote Sens. 2026, 18(12), 2040; https://doi.org/10.3390/rs18122040 - 18 Jun 2026

Abstract

Improving detection performance across diverse synthetic aperture radar (SAR) scenes remains challenging because different datasets exhibit different levels of target–background separability. To address this issue, we propose a constant false alarm rate (CFAR)-guided mode-selective network (CGMSN), which selects an appropriate feature-fusion mode according to the CFAR target–background separation margin. Specifically, CFAR is used as an interpretable statistical tool to construct an anomaly response map. The separation margin is then calculated by comparing the average CFAR anomaly responses of annotated target regions and their surrounding contextual backgrounds. Based on this indicator, a You Only Look Once version 8 (YOLOv8)-based mode-selective detector is constructed with three key components. First, a lightweight representation-enhanced backbone that integrates ResNet18 and a dilated convolutional spatial pyramid (DCSP) module is adopted to improve contextual representation while maintaining moderate model complexity. Second, a mode-selective neck (MSN) is designed with three predefined fusion modes, where the appropriate fusion depth is selected according to the CFAR-guided target–background separation margin of each dataset. Third, a complete intersection over the union modulated head (CMH) is developed to enhance classification-regression alignment and suppress clutter-induced responses. Experiments on SAR-Aircraft-1.0, High-Resolution SAR Images Dataset (HRSID), and SAR Ship Detection Dataset (SSDD) indicate that datasets with smaller CFAR target–background separation margins benefit from deeper fusion, while datasets with larger separation margins can adopt shallower fusion. Moreover, the proposed CGMSN achieves superior performance over representative detectors, demonstrating its effectiveness on the evaluated SAR datasets with diverse scene characteristics. Full article

(This article belongs to the Special Issue Target Recognition and Detection Based on High Resolution Radar Images (Second Edition))

26 pages, 3882 KB

Open AccessArticle

Remote Sensing Small Object Detection Network Based on Wavelet-Convolution and Fine-Grained Preservation

by Hangyu Li and Tiecheng Song

Information 2026, 17(6), 609; https://doi.org/10.3390/info17060609 (registering DOI) - 18 Jun 2026

Abstract

Small object detection in remote sensing imagery is a fundamental task for visual information extraction, yet it remains challenging due to extremely small target scales, complex backgrounds, and the loss of discriminative feature information caused by repeated downsampling. To address these issues, this paper proposes a Wavelet-Convolution and Fine-Grained Preservation Network (WCFPNet) based on YOLOv8n. Specifically, a Wavelet-Convolution Module (WCM) is introduced into the backbone to decompose feature maps into low- and high-frequency sub-bands, thereby enhancing structural feature modeling and preserving subtle target details. To compensate for the weakened fine-grained information after repeated downsampling, an Enhanced Spatial Pyramid Pooling-Fast (ESPPF) module is embedded at the end of the backbone to strengthen multi-scale contextual aggregation. In addition, an Enhanced Feature Pyramid Network (EFPN) is designed in the neck to facilitate the propagation of shallow and intermediate fine-grained features to high-level semantic features through cross-level fusion and the Convolutional Block Attention Module (CBAM). Experiments on the NWPU VHR-10 dataset show that WCFPNet achieves 0.879 mAP@0.5 and 0.515 mAP@0.5:0.95, outperforming YOLOv8n by 1.7 and 2.5 percentage points, respectively. Moreover, the proposed WCFPNet achieves a competitive performance compared with several representative detectors while maintaining moderate model complexity. These results demonstrate the effectiveness of WCFPNet in challenging remote sensing scenes characterized by complex backgrounds, dense object distributions, and weak textures. Full article

(This article belongs to the Special Issue Emerging Research in Target Detection and Recognition in Remote Sensing Images, 2nd Edition)

22 pages, 14170 KB

Open AccessArticle

A YOLO-Based Workflow for Detecting and Mapping Archaeological Stone Cairns in Satellite Imagery: A Case Study from Western Ennedi, Chad

by Ebrahim Ghaderpour, Clarisse Djetounako Nekoulnang, Hamdji Milman Noudjiko, Pier Paolo Rossi, Rocco Rotunno and Savino di Lernia

Heritage 2026, 9(6), 237; https://doi.org/10.3390/heritage9060237 - 18 Jun 2026

Abstract

Automated detection of archaeological stone cairns using high-resolution satellite imagery offers a scalable approach for documenting vulnerable heritage landscapes in the Ennedi Massif, where extensive and remote terrain limits traditional field survey, and rapid documentation is required. This study presents a GIS and deep learning framework based on the YOLOv8 model to identify and map stone cairns using Google Satellite RGB imagery at 28.5 cm spatial resolution. Ground-truth data collected via GPS field survey were used to train and validate YOLOv8n. The study area was divided into two regions with contrasting terrain and illumination conditions to evaluate model transferability. The training region included 149 verified cairns, while the independent test region included 103 cairns. Early stopping reduced overfitting, reaching mAP50 of 99.5% and mAP50–95 of 94.3%. A density-based spatial clustering algorithm was applied to merge overlapping detections and generate circular cairn representations. On the test set, the model achieved 83.5% precision, recall, and F1-score, indicating stable performance under the selected operational configuration. Comparison with YOLOv5n showed slightly higher localization accuracy for YOLOv8n, while YOLOv5n yielded marginally higher precision and F1-score. Overall, the framework provides a non-invasive tool for large-scale archaeological prospection and heritage monitoring in remote desert environments. Full article

30 pages, 11823 KB

Open AccessArticle

YOLO-MOD: An Instance Segmentation Algorithm for Pomelo Fruit and Fruit Stem Based on YOLOv11-Seg

by Wei Zhou, Leina Gao, Fuchun Sun, Qiurong Lv, Yuechao Bian, Chi Hu and Senlin Yang

Horticulturae 2026, 12(6), 744; https://doi.org/10.3390/horticulturae12060744 - 18 Jun 2026

Abstract

This study aims to develop an instance segmentation model for the joint segmentation of pomelo fruits and stems in complex natural orchard environments, with particular emphasis on slender, small-scale, and easily occluded stem targets. To this end, YOLO-MOD, an improved instance segmentation algorithm based on YOLOv11-seg, is proposed. Specifically, Omni-Dimensional Dynamic Convolution (ODConv) is introduced into the C3k2 module to enhance complex feature representation; a Multi-Scale Dilated Attention (MSDA) module is embedded to improve the multi-scale semantic perception of slender stem regions; and the original upsampling operator is replaced with DySample to strengthen fine-grained boundary recovery. Experimental results show that, compared with the original YOLOv11-seg, YOLO-MOD improves the Box mAP@50 and Mask mAP@50 by 2.9% and 3.9%, respectively. For the Stem class, the Box mAP@50 and Mask mAP@50 increase from 71.9% to 77.8% and from 68.4% to 76.2%, respectively. These results indicate that YOLO-MOD can achieve fine-grained segmentation of pomelo fruits and stems on the dataset used in this study. However, its generalization capability across different orchards, seasons, pomelo varieties, and fruit types still requires further evaluation, and its practical effectiveness in an integrated robotic harvesting system remains to be further validated. Full article

(This article belongs to the Special Issue Application of Artificial Intelligence in the Processing of Horticultural Crops)

18 pages, 5048 KB

Open AccessArticle

AI-Driven Pavement Condition Assessment from Dash-Cam Imagery: A Comparative Analysis of YOLOv8-Based PCI Estimation, Manual Inspections, and Automated PASER Ratings in Urban Networks

by Giulia Del Serrone, Giuseppe Loprencipe and Laura Moretti

Infrastructures 2026, 11(6), 207; https://doi.org/10.3390/infrastructures11060207 - 18 Jun 2026

Abstract

This study presents an AI-enabled framework for automated pavement condition assessment in urban environments by integrating YOLOv8-based distress detection, computational Pavement Condition Index (PCI) estimation, and comparative validation against manual PCI inspections and Pavement Surface Evaluation and Rating (PASER) scores. A YOLOv8 object-detection model, implemented in Python and trained on the publicly available N-RDD2024 dataset, was developed to identify longitudinal cracks, transverse cracks, alligator cracking, and potholes. The model achieved an accuracy of 84.6%, a precision of 89.6%, and a recall of 86.3%, demonstrating robust detection performance under heterogeneous environmental conditions. Dash-cam imagery collected along 6.3 km of urban flexible pavements was processed through an automated workflow that detects pavement distresses, estimates their severity and extent, and computes PCI values according to ASTM D6433-20 procedures. Automated PCI values were compared with manual PCI inspections and PASER ratings generated by the Blyncsy platform across 23 pavement sections. Statistical validation between automated and manual PCI assessments returned an R-squared of 0.925, a Pearson correlation coefficient of 0.962, a Spearman correlation coefficient of 0.955, a Mean Absolute Error of 5.0 PCI points, and a Root Mean Square Error of 6.1 PCI points. Compared with the proposed framework, PASER ratings exhibited lower agreement with manual PCI assessments and generally overestimated the pavement condition. The results demonstrate the potential of low-cost AI-based systems for large-scale pavement monitoring. Nevertheless, performance degradation was observed under challenging environmental conditions and in heavily deteriorated sections, highlighting the need for improved distress quantification, dataset balancing, and multimodal sensing integration. Full article
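The agreement statistics quoted above (Pearson correlation, Mean Absolute Error, Root Mean Square Error) follow standard definitions, sketched below for paired section scores. The reported R-squared of 0.925 is close to the square of the reported Pearson coefficient (0.962² ≈ 0.925), as expected for a simple linear fit. The PCI values in the example are made up for illustration and are not data from the study.

```python
import math

def agreement_metrics(reference, estimate):
    """Return (pearson_r, mae, rmse) for two equal-length score lists."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_est = sum(estimate) / n
    cov = sum((r - mean_ref) * (e - mean_est) for r, e in zip(reference, estimate))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_est = sum((e - mean_est) ** 2 for e in estimate)
    pearson_r = cov / math.sqrt(var_ref * var_est)
    mae = sum(abs(r - e) for r, e in zip(reference, estimate)) / n
    rmse = math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)
    return pearson_r, mae, rmse

manual_pci = [80, 60, 40]      # hypothetical manual inspection scores
automated_pci = [78, 65, 38]   # hypothetical automated scores
r, mae, rmse = agreement_metrics(manual_pci, automated_pci)
r_squared = r * r              # R-squared of a simple linear regression
```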

(This article belongs to the Special Issue Smart Mobility and Transportation Infrastructure)

20 pages, 6258 KB

Open AccessArticle

A Lightweight Tea Bud Detector via Cascaded Gated Modulation and Multi-Scale Feature Enhancement

by Zewei Mi and Minming Gu

AI 2026, 7(6), 227; https://doi.org/10.3390/ai7060227 - 18 Jun 2026

Abstract

Accurate detection of tea buds is a key technology for enabling automated tea harvesting. However, in natural environments, tea buds present challenges such as scale variation, dense distribution, and high similarity to the background, making it difficult for traditional methods to balance accuracy and efficiency. To address these issues, this paper proposes a lightweight detection framework, PCM-YOLO. The model introduces a cascaded gated feature modulation network into the YOLOv11 architecture, combining feedforward structures and gating mechanisms to selectively emphasize informative features, thereby improving tea bud detection performance. In addition, a feature-enhanced downsampling module is proposed, which employs a stepwise pooling-based feature enhancement mechanism to progressively expand the receptive field while preserving feature resolution, effectively incorporating multi-scale contextual information. Finally, a multi-scale feature enhancement module is designed to reduce the computational complexity of the model while maintaining detection performance as much as possible. Experimental results on public datasets demonstrate notable performance improvements over YOLOv11-N: Precision increases from 86.7% to 90.6% (an absolute increase of 3.9 percentage points), mAP50-95 increases by 1.6%, and the number of parameters is reduced by 20.6%. These results indicate that PCM-YOLO achieves a substantial reduction in model complexity while effectively improving detection accuracy, providing a feasible technical solution for deploying high-precision, real-time tea bud detection systems at the edge in tea plantation environments. Full article

(This article belongs to the Section AI Systems: Theory and Applications)
