MDPI - Publisher of Open Access Journals

31 pages, 3582 KB

Open Access Article

A Stage-Aware Cascaded Detection–Segmentation Framework for Leaf Phenotyping and Leaf Dry Biomass Estimation of Pepper Seedlings

by Han Li, Dongyuan Shi, Hui Shi, Ming Li and Ming Diao

Plants 2026, 15(12), 1912; https://doi.org/10.3390/plants15121912 (registering DOI) - 20 Jun 2026

Abstract

Quantitative phenotyping of pepper seedlings is important for greenhouse plug tray seedling cultivation, but it remains constrained by inefficient manual monitoring, complex greenhouse backgrounds, and growth-stage-dependent discrepancies between two-dimensional image traits and actual leaf biomass. In this study, a cascaded vision framework with stage-specific morphological correction was developed for nondestructive seedling phenotyping. The framework integrated Visual Dynamic Momentum YOLO (VDM-YOLO) for individual seedling localization and growth-stage recognition, Variance Guided Strip Ghost Gated UNet (VSG-UNet) for lightweight, high-resolution leaf segmentation, and a stage-aware correction model for leaf dry biomass estimation. In performance evaluation, VDM-YOLO achieved a mean average precision at an intersection over union threshold of 0.5 (mAP_0.5) of 89.27%, improving mAP_0.5 by 1.82 percentage points over YOLOv12. VSG-UNet achieved a mean intersection over union (mIoU) of 83.9% and a Dice coefficient of 81.8%, while reducing floating point operations (FLOPs) and parameters by 44.2% and 61.2%, respectively, compared with U-Net. After stage-aware calibration, the coefficient of determination (R²) between segmented area and leaf dry weight increased from 0.764 to 0.813, and the root mean square error (RMSE) decreased from 0.0210 g to 0.0190 g. These results demonstrated that the proposed framework provided a proof-of-concept approach based on RGB images for the nondestructive assessment of leaf area and leaf dry biomass in pepper seedlings under restricted experimental conditions.
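The abstract does not specify the form of the stage-aware correction model. As a minimal sketch, assuming the correction amounts to fitting a separate linear relation between segmented leaf area and dry weight for each recognized growth stage, the idea could look like the following. The function names, stage labels, and toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a * x + b for one group of samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def fit_stage_aware(samples):
    """samples: iterable of (stage, area, dry_weight). Returns {stage: (a, b)}."""
    groups = defaultdict(lambda: ([], []))
    for stage, area, weight in samples:
        groups[stage][0].append(area)
        groups[stage][1].append(weight)
    return {stage: fit_line(xs, ys) for stage, (xs, ys) in groups.items()}

def predict(models, stage, area):
    """Estimate dry weight from area using the line fitted for that stage."""
    a, b = models[stage]
    return a * area + b

# Hypothetical toy data: (growth stage, leaf area in cm^2, dry weight in g).
toy = [
    ("early", 2.0, 0.010), ("early", 4.0, 0.020), ("early", 6.0, 0.030),
    ("late", 10.0, 0.080), ("late", 12.0, 0.100), ("late", 14.0, 0.120),
]
models = fit_stage_aware(toy)
```

Fitting one line per stage lets the area-to-weight slope change as seedlings grow, which is one plausible way to address the growth-stage-dependent discrepancy the abstract describes.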

(This article belongs to the Section Plant Modeling)

31 pages, 34272 KB

Open Access Article

Reliable Vision-Based PPE Detection for Construction Safety in Adverse Environmental Conditions

by Sujan Gyawali, Ali Mohammadjafari, Saurav Ghimire and Mahmoud Habibnezhad

Buildings 2026, 16(12), 2447; https://doi.org/10.3390/buildings16122447 (registering DOI) - 20 Jun 2026

Abstract

Adverse imaging conditions such as fog, rain, and low light degrade the reliability of vision-based Personal Protective Equipment (PPE) detection systems on construction sites, yet most existing models are trained under clear-weather assumptions. This paper introduces a physics-based weather augmentation framework integrated with the YOLOv8n architecture to improve PPE detection robustness under adverse environmental conditions. The original Color Helmet and Vest (CHV) dataset was expanded from 1330 clear-weather images to 6650 images across five conditions using four physically grounded augmentation models: the Koschmieder atmospheric scattering model for fog, the Garg–Nayar streak model for rain, gamma-corrected attenuation with Poisson–Gaussian noise for low light, and a PSF-based glare model for bright sunlight. The weather-resistant model, a clear-weather baseline, and an augmented baseline were evaluated on the same 665-image weather-augmented test set. The weather-resistant model achieves 89.2% mAP50, a 5.7 percentage-point improvement over the clear-weather baseline (83.5%), with a nearly four-fold improvement in cross-condition stability (standard deviation 1.5% vs. 5.7%). Under matched training-data volume, the weather-resistant model still outperforms a conventionally augmented baseline across all five simulated conditions, indicating that these gains stem from physics-based modeling rather than larger training-data volume. The largest gain occurs under low light, where mAP50 improves from 73.4% to 87.9%. Gradient-weighted Class Activation Mapping (Grad-CAM) analysis confirms that the weather-resistant model directs more attention toward PPE regions across all conditions, with the largest improvement under low light (+10.0 percentage points). 
The lightweight design (3.0 M parameters) and quantitative and qualitative validation on 205 annotated real-world construction site images under normal and low-light conditions provide preliminary evidence of practical applicability.
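For readers unfamiliar with the Koschmieder atmospheric scattering model mentioned above, the sketch below applies its standard form, I = J·t + A·(1 − t) with transmission t = exp(−β·d), to normalized RGB pixels. The parameter values (`beta`, `airlight`) and the function names are illustrative assumptions, not the settings used in the paper.

```python
import math

def add_fog(pixel, depth_m, beta=0.05, airlight=0.9):
    """Koschmieder model for one pixel: I = J * t + A * (1 - t), t = exp(-beta * d).

    pixel    -- clear-scene RGB values J in [0, 1]
    depth_m  -- scene depth d for this pixel (assumed known or estimated)
    beta     -- scattering coefficient (illustrative value)
    airlight -- atmospheric light A (illustrative value)
    """
    t = math.exp(-beta * depth_m)
    return tuple(c * t + airlight * (1.0 - t) for c in pixel)

def fog_image(image, depth_map, beta=0.05, airlight=0.9):
    """Apply the model to a nested-list image with a matching depth map."""
    return [
        [add_fog(px, d, beta, airlight) for px, d in zip(row, depth_row)]
        for row, depth_row in zip(image, depth_map)
    ]
```

At zero depth the pixel is unchanged, and as depth grows every channel converges to the airlight value, which is the washed-out appearance of distant objects in fog.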

(This article belongs to the Special Issue Intelligent Monitoring for Health and Safety in Built Environments)

23 pages, 2264 KB

Open Access Article

Real-Time Leaf Disease Detection with Boundary-Aware and Texture-Sensitive Feature Enhancement

by Jinyang Qiu, Qiuyi Du, Yonggang Wang, Yuhan Tao, Yue Guo, Ye Zhang and Yue Gao

Symmetry 2026, 18(6), 1059; https://doi.org/10.3390/sym18061059 (registering DOI) - 19 Jun 2026

Abstract

Accurate and robust detection of leaf diseases is a key enabler for precision agriculture and large-scale crop health monitoring. Despite the strong generalization of modern one-stage detectors (e.g., YOLOv8), two domain-specific challenges remain: (i) weak or blurry lesion boundaries hinder precise localization, and (ii) low color contrast between diseased and healthy tissues forces models to rely on subtle texture patterns rather than salient shapes. To tackle these challenges, we reframe the core agricultural disease detection task as the identification of “asymmetric morphological anomalies” and propose a domain-tailored enhancement framework. First, we introduce an Edge Enhancement Module (EEM) that explicitly strengthens boundary-aware representations. Inspired by the natural symmetry of healthy leaves, our EEM is specifically designed to capture symmetry-breaking boundary discontinuities and localized asymmetric edges caused by disease lesions. Our method enhances edge and texture cues that are indicative of disease lesions, which often exhibit local asymmetries and boundary discontinuities. The EEM includes a Differential Normalized Pooling Block (DNPB) that highlights edge responses through discrepancies between max pooling and average pooling, which also models cross-group edge correlations. Second, the Lightweight Texture-Sensitive Feature Enhancement (LTSFE) mechanism amplifies texture-discriminative channels under low-contrast conditions by leveraging complementary global statistics and efficient channel mixing, all with negligible computational overhead. We evaluated our method on a self-constructed dataset of 106,434 images with 225,640 annotations covering diverse crops. Experiments show that the proposed method achieves state-of-the-art accuracy (81.54% mAP@0.5:0.95) while maintaining real-time inference (142 FPS), consistently outperforming strong baselines. 
Ablations confirm the effectiveness and complementarity of EEM and LTSFE, demonstrating that domain-specific architectural design, inspired by biological symmetry, can substantially improve agricultural vision systems.
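The core idea attributed to the DNPB, highlighting edges through the discrepancy between max pooling and average pooling, can be illustrated on a single-channel feature map. This is only a sketch of that one operation: the normalization and cross-group correlation parts of the block are not modeled, and the window size and function name are assumptions.

```python
def pool_difference(feature, k=3):
    """Max-pool minus average-pool over k x k windows (stride 1, no padding).

    In flat regions the two pooled values coincide and the response is zero;
    near an intensity edge the maximum exceeds the mean and the response grows.
    """
    h, w = len(feature), len(feature[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            window = [feature[i + di][j + dj] for di in range(k) for dj in range(k)]
            row.append(max(window) - sum(window) / len(window))
        out.append(row)
    return out

# Toy feature map with a vertical step edge between columns 2 and 3.
step_edge = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
response = pool_difference(step_edge)
```

On the toy step edge, windows that lie entirely on one side of the edge give zero response, while windows straddling it give a positive one.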

(This article belongs to the Section Engineering and Materials)

17 pages, 15918 KB

Open Access Article

ADA-YOLO: An Adaptive Dynamic Aggregation Network for Small Object Detection in UAV Imagery

by Jiajun Chen, Shaochen Jiang, Yongming Li, Sulaiman Tuersunayi and Yong Liu

Sensors 2026, 26(12), 3908; https://doi.org/10.3390/s26123908 (registering DOI) - 19 Jun 2026

Abstract

Unmanned Aerial Vehicle (UAV) image object detection holds significant application value in the low-altitude economy, traffic monitoring, intelligent agriculture, and disaster rescue. However, due to the top-down perspective, UAV images typically suffer from challenges such as small target scales, dense object distribution, severe occlusions, and complex backgrounds. These issues often limit the recall and localization accuracy of general-purpose detectors when they are directly applied to UAV small-object detection scenarios. To address these challenges, this paper proposes an Adaptive Dynamic Aggregation YOLO network, termed ADA-YOLO. The novelty of ADA-YOLO lies in its highly efficient combinatorial design specifically tailored for UAV small-object detection: while retaining the efficient backbone of YOLOv8, we systematically reconstruct the neck and detection head to improve accuracy. Specifically, a high-resolution P2 detection branch is incorporated to construct a P2–P5 multi-scale prediction structure. Furthermore, the lightweight DySample dynamic upsampling module is adopted to replace traditional upsampling methods, and an Adaptive Spatial Feature Fusion (ASFF) mechanism is introduced to alleviate semantic conflicts and noise interference during multi-scale feature fusion. This synergistic combination explicitly addresses multi-scale representation challenges and enhances small-object detection performance in complex scenes. Comparative experiments with the baseline YOLOv8n on the VisDrone2019 dataset demonstrate that ADA-YOLO achieves an improvement of 11.3% in mAP@0.5 and 8.2% in mAP@0.5:0.95. The improved model achieves these performance gains with a modest parameter increase and acceptable computational complexity. Finally, ablation experiments further validate the effectiveness of each individual module and their synergistic gains.

(This article belongs to the Section Remote Sensors)

13 pages, 3658 KB

Open Access Article

TR-ABFT: Tile-Resilient Fault Detection for Neural Processing Units

by Yang Hua, Yunhong Bai, Bo Wang, Wei Zhuang and Yuanfu Zhao

Electronics 2026, 15(12), 2715; https://doi.org/10.3390/electronics15122715 - 19 Jun 2026

Abstract

Spaceborne neural processing units (NPUs) increasingly support real-time deep-learning inference, but their dense multiply-accumulate arrays are vulnerable to radiation-induced soft errors. Conventional radiation-hardening methods improve reliability through hardware redundancy, but they incur substantial area, performance and compiler-mapping overheads. This paper proposes tile-resilient algorithm-based fault tolerance (TR-ABFT), a software-scheduled, detection-oriented scheme for quantized NPU inference. TR-ABFT generates checksum information at tile granularity and maps checking tasks onto the original processing element (PE) array without changing the hardware topology. To make ABFT compatible with INT8 datapaths, we design two checksum-coding strategies: checksum decomposition and modulo-239 checksum coding. The modulo-239 scheme removes structural missed detections for two-bit flips with bit-position spacings in (1, 31), while preserving compatibility with signed INT8 inputs. Evaluations on ResNet, YOLOv8, and RT-DETR show that, on a 16 × 16 array, TR-ABFT introduces only 6.37% to 24.61% additional computational overhead. By converting spatial redundancy into schedulable temporal redundancy, TR-ABFT preserves systolic-array regularity and provides a low-overhead reliability-enhancement mechanism for space-grade neural-network accelerators.
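As background, classic algorithm-based fault tolerance for a matrix product C = A·B relies on the identity (1ᵀA)·B = 1ᵀC: the column sums of the output must equal the column-sum row of A multiplied by B. The sketch below checks that identity modulo 239 for one small tile. It is a generic illustration of modular checksum verification, not the TR-ABFT scheduling or its exact coding scheme; the tile size, function names, and fault injection are assumptions.

```python
MOD = 239  # modulus named in the abstract; everything else here is illustrative

def matmul(a, b):
    """Plain integer matrix product, standing in for one NPU tile."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def column_checksum(m, mod=MOD):
    """Sum each column and reduce modulo `mod` (result is always in 0..mod-1)."""
    return [sum(col) % mod for col in zip(*m)]

def tile_is_consistent(a, b, c, mod=MOD):
    """Check (1^T A) B == 1^T C (mod `mod`) for a claimed product c = a @ b."""
    a_sum = column_checksum(a, mod)
    expected = [sum(a_sum[k] * b[k][j] for k in range(len(b))) % mod
                for j in range(len(b[0]))]
    return expected == column_checksum(c, mod)

# Small signed-integer tile, as would occur with INT8 operands.
a = [[1, -2], [3, 4]]
b = [[5, 6], [-7, 8]]
c = matmul(a, b)

# Simulate a soft error: flip one bit in one output element.
faulty = [row[:] for row in c]
faulty[0][0] ^= 1 << 4
```

A single bit flip changes an element by plus or minus a power of two, which is never a multiple of the odd prime 239, so this check always catches it. Errors whose net effect on a column sum is a multiple of 239 would pass unnoticed, which is the kind of structural missed detection that the choice of modulus has to account for.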

(This article belongs to the Special Issue Artificial Intelligence and Microsystems)

43 pages, 13866 KB

Open Access Article

Research on Multi-Source Heterogeneous Collaborative Perception System Based on Unmanned Aerial Vehicle and Unmanned Ground Vehicle

by Yufeng Li, Erming Tian, Xiaofeng Chen, Huiyan Han and Xinya Zhang

Drones 2026, 10(6), 470; https://doi.org/10.3390/drones10060470 (registering DOI) - 19 Jun 2026

Abstract

Complex urban scenarios impose high demands on the environmental perception capabilities of unmanned systems, which serve as a prerequisite for executing autonomous missions such as disaster response, infrastructure inspection, and smart city operations. UAVs, leveraging their high mobility, can provide accurate prior maps and wide-area aerial observation for unmanned ground vehicles. However, their long-range perception accuracy is limited. Conversely, UGVs can achieve high-precision environmental perception along their navigation paths using prior maps, but suffer from a constrained field of view. The collaboration between the two platforms complements their respective strengths, thereby enhancing 3D object perception and mapping accuracy in complex scenarios. To address the aforementioned challenges, this study proposes a cross-platform feature fusion method for 3D object perception and an incremental map updating approach for UAVs and UGVs. First, a dynamic SLAM method that integrates an optimized YOLOv8 with ORB-SLAM3 is employed to mitigate map blurring caused by dynamic noise, providing prior map information for UGVs. Second, a multimodal fusion perception model is constructed for UGVs, utilizing attention mechanisms to achieve deep fusion of multimodal Bird’s-Eye-View (BEV) features. This overcomes issues such as diminishing complementarity between modalities and weak temporal feature associations. Finally, an air-ground fusion model based on a cross-attention mechanism is developed to fuse aerial view features with ground-based fused BEV features across platforms, yielding a unified feature representation for 3D object detection and generating a fused high-precision map. Experimental results demonstrate that under complex occlusion scenarios in a simulated dataset, the proposed collaborative perception system improves the mean Average Precision (mAP) by 12.7% and 15.7% compared to using a single UAV or a single UGV, respectively, while increasing the map accuracy F1-score by 0.21. This study provides technical support for achieving real-time and accurate air-ground collaborative perception in complex dynamic environments.

(This article belongs to the Section Innovative Urban Mobility)

27 pages, 23377 KB

Open Access Article

YOLO-Crack: Geometry-Guided Real-Time Crack Detection Framework Toward Edge Deployment

by Zhe Wei, Rui Wang, Rong Dai, Haibo Xu, Huan Zhang and Yurong Zou

Sensors 2026, 26(12), 3892; https://doi.org/10.3390/s26123892 (registering DOI) - 18 Jun 2026

Abstract

Crack detection in mobile inspection scenarios is constrained by both the extremely slender geometry of crack targets and the real-time inference requirements on edge devices, which expose systematic limitations of general-purpose object detectors. This paper proposes YOLO-Crack, a closed-loop solution that couples geometry-statistics-driven module design with end-to-end edge deployment validation. On the algorithmic side, we first quantify crack geometric properties and then introduce (i) a crack-aware cross-dimensional fusion attention (CFCA) module to strengthen feature representations, (ii) a dual-path feature enhancement module (DFEM) to preserve fine details during upsampling, and (iii) an empirical smooth quality window adjustment with shape consistency regularization to stabilize bounding-box regression for slender cracks. Experiments on the Crack500 dataset show that YOLO-Crack achieves 78.8% precision, 51.4% recall, and 65.7% mAP@0.5, improving over the YOLOv11n baseline by 4.2, 1.7, and 2.9 percentage points, respectively. On the engineering side, we deploy YOLO-Crack on a Jetson Orin NX mobile robot platform and evaluate it in a real ROS pipeline; the measured end-to-end throughput reaches 25.5 FPS, meeting real-time video processing requirements. The proposed framework provides a practical reference workflow for edge vision tasks, from geometry analysis to engineering verification.

(This article belongs to the Special Issue Image-Based Surface Damage Detection)

29 pages, 6688 KB

Open Access Article

CGMSN: CFAR-Guided Mode-Selective Network for SAR Target Detection

by Lingjuan Yu, Xinya Xiong, Xiaochun Xie, Miaomiao Liang, Xiangchun Yu, Xuan Jiao and Wen Hong

Remote Sens. 2026, 18(12), 2040; https://doi.org/10.3390/rs18122040 - 18 Jun 2026

Abstract

Improving detection performance across diverse synthetic aperture radar (SAR) scenes remains challenging because different datasets exhibit different levels of target–background separability. To address this issue, we propose a constant false alarm rate (CFAR)-guided mode-selective network (CGMSN), which selects an appropriate feature-fusion mode according to the CFAR target–background separation margin. Specifically, CFAR is used as an interpretable statistical tool to construct an anomaly response map. The separation margin is then calculated by comparing the average CFAR anomaly responses of annotated target regions and their surrounding contextual backgrounds. Based on this indicator, a You Only Look Once version 8 (YOLOv8)-based mode-selective detector is constructed with three key components. First, a lightweight representation-enhanced backbone that integrates ResNet18 and a dilated convolutional spatial pyramid (DCSP) module is adopted to improve contextual representation while maintaining moderate model complexity. Second, a mode-selective neck (MSN) is designed with three predefined fusion modes, where the appropriate fusion depth is selected according to the CFAR-guided target–background separation margin of each dataset. Third, a complete intersection over union modulated head (CMH) is developed to enhance classification-regression alignment and suppress clutter-induced responses. Experiments on SAR-Aircraft-1.0, High-Resolution SAR Images Dataset (HRSID), and SAR Ship Detection Dataset (SSDD) indicate that datasets with smaller CFAR target–background separation margins benefit from deeper fusion, while datasets with larger separation margins can adopt shallower fusion. Moreover, the proposed CGMSN achieves superior performance over representative detectors, demonstrating its effectiveness on the evaluated SAR datasets with diverse scene characteristics.
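The separation margin described above can be sketched in two steps: build an anomaly response map, then compare the mean response inside an annotated box with the mean response in the surrounding context ring. The abstract does not give the CFAR variant or window sizes, so the cell-averaging ratio, the `guard`, `train`, and `context` parameters, and the function names below are all assumptions made for illustration.

```python
def ca_cfar_response(img, guard=1, train=2):
    """Cell-averaging CFAR-style response: pixel value over mean of training cells.

    Training cells are those within (guard + train) of the pixel in Chebyshev
    distance, excluding the inner guard region around the pixel itself.
    """
    h, w = len(img), len(img[0])
    r = guard + train
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            cells = [
                img[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
                if max(abs(y - i), abs(x - j)) > guard
            ]
            out[i][j] = img[i][j] / (sum(cells) / len(cells) + 1e-9)
    return out

def separation_margin(resp, box, context=2):
    """Mean response inside box minus mean response in the surrounding ring.

    box = (y0, x0, y1, x1) with exclusive upper bounds.
    """
    y0, x0, y1, x1 = box
    h, w = len(resp), len(resp[0])
    target, ring = [], []
    for y in range(max(0, y0 - context), min(h, y1 + context)):
        for x in range(max(0, x0 - context), min(w, x1 + context)):
            inside = y0 <= y < y1 and x0 <= x < x1
            (target if inside else ring).append(resp[y][x])
    return sum(target) / len(target) - sum(ring) / len(ring)

# Toy scene: uniform clutter with one bright 2 x 2 target in the middle.
scene = [[1.0] * 9 for _ in range(9)]
for y in (4, 5):
    for x in (4, 5):
        scene[y][x] = 10.0
margin = separation_margin(ca_cfar_response(scene), (4, 4, 6, 6))
```

A large margin means targets stand out clearly from nearby clutter. In the abstract's terms, datasets with small margins are the ones that benefit from deeper fusion.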

(This article belongs to the Special Issue Target Recognition and Detection Based on High Resolution Radar Images (Second Edition))

30 pages, 11823 KB

Open Access Article

YOLO-MOD: An Instance Segmentation Algorithm for Pomelo Fruit and Fruit Stem Based on YOLOv11-Seg

by Wei Zhou, Leina Gao, Fuchun Sun, Qiurong Lv, Yuechao Bian, Chi Hu and Senlin Yang

Horticulturae 2026, 12(6), 744; https://doi.org/10.3390/horticulturae12060744 - 18 Jun 2026

Abstract

This study aims to develop an instance segmentation model for the joint segmentation of pomelo fruits and stems in complex natural orchard environments, with particular emphasis on slender, small-scale, and easily occluded stem targets. To this end, YOLO-MOD, an improved instance segmentation algorithm based on YOLOv11-seg, is proposed. Specifically, Omni-Dimensional Dynamic Convolution (ODConv) is introduced into the C3k2 module to enhance complex feature representation; a Multi-Scale Dilated Attention (MSDA) module is embedded to improve the multi-scale semantic perception of slender stem regions; and the original upsampling operator is replaced with DySample to strengthen fine-grained boundary recovery. Experimental results show that, compared with the original YOLOv11-seg, YOLO-MOD improves the Box mAP@50 and Mask mAP@50 by 2.9% and 3.9%, respectively. For the Stem class, the Box mAP@50 and Mask mAP@50 increase from 71.9% to 77.8% and from 68.4% to 76.2%, respectively. These results indicate that YOLO-MOD can achieve fine-grained segmentation of pomelo fruits and stems on the dataset used in this study. However, its generalization capability across different orchards, seasons, pomelo varieties, and fruit types still requires further evaluation, and its practical effectiveness in an integrated robotic harvesting system remains to be further validated.

(This article belongs to the Special Issue Application of Artificial Intelligence in the Processing of Horticultural Crops)

18 pages, 5048 KB

Open Access Article

AI-Driven Pavement Condition Assessment from Dash-Cam Imagery: A Comparative Analysis of YOLOv8-Based PCI Estimation, Manual Inspections, and Automated PASER Ratings in Urban Networks

by Giulia Del Serrone, Giuseppe Loprencipe and Laura Moretti

Infrastructures 2026, 11(6), 207; https://doi.org/10.3390/infrastructures11060207 - 18 Jun 2026

Abstract

This study presents an AI-enabled framework for automated pavement condition assessment in urban environments by integrating YOLOv8-based distress detection, computational Pavement Condition Index (PCI) estimation, and comparative validation against manual PCI inspections and Pavement Surface Evaluation and Rating (PASER) scores. A YOLOv8 object-detection model, implemented in Python and trained on the publicly available N-RDD2024 dataset, was developed to identify longitudinal cracks, transverse cracks, alligator cracking, and potholes. The model achieved an accuracy of 84.6%, a precision of 89.6%, and a recall of 86.3%, demonstrating robust detection performance under heterogeneous environmental conditions. Dash-cam imagery collected along 6.3 km of urban flexible pavements was processed through an automated workflow that detects pavement distresses, estimates their severity and extent, and computes PCI values according to ASTM D6433-20 procedures. Automated PCI values were compared with manual PCI inspections and PASER ratings generated by the Blyncsy platform across 23 pavement sections. Statistical validation between automated and manual PCI assessments returned an R-squared of 0.925, a Pearson correlation coefficient of 0.962, a Spearman correlation coefficient of 0.955, a Mean Absolute Error of 5.0 PCI points, and a Root Mean Square Error of 6.1 PCI points. Compared with the proposed framework, PASER ratings exhibited lower agreement with manual PCI assessments and generally overestimated the pavement condition. The results demonstrate the potential of low-cost AI-based systems for large-scale pavement monitoring. Nevertheless, performance degradation was observed under challenging environmental conditions and in heavily deteriorated sections, highlighting the need for improved distress quantification, dataset balancing, and multimodal sensing integration.
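The agreement statistics quoted above follow standard definitions, which the sketch below computes for paired automated and manual scores. The reported values are mutually consistent with R-squared being the square of the Pearson coefficient (0.962² ≈ 0.925), so that convention is assumed here. The function name and the toy PCI values are hypothetical.

```python
import math

def agreement_metrics(auto, manual):
    """Pearson r, r squared, MAE, and RMSE between two paired score lists."""
    n = len(auto)
    mean_a = sum(auto) / n
    mean_m = sum(manual) / n
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(auto, manual))
    var_a = sum((a - mean_a) ** 2 for a in auto)
    var_m = sum((m - mean_m) ** 2 for m in manual)
    r = cov / math.sqrt(var_a * var_m)
    mae = sum(abs(a - m) for a, m in zip(auto, manual)) / n
    rmse = math.sqrt(sum((a - m) ** 2 for a, m in zip(auto, manual)) / n)
    return {"pearson_r": r, "r_squared": r * r, "mae": mae, "rmse": rmse}

# Hypothetical PCI values for four pavement sections.
metrics = agreement_metrics([80, 60, 40, 20], [85, 55, 45, 15])
```

MAE and RMSE are expressed in PCI points, so they are directly comparable to the 5.0 and 6.1 reported in the abstract, whereas the correlation coefficients are unitless.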

(This article belongs to the Special Issue Smart Mobility and Transportation Infrastructure)

20 pages, 6258 KB

Open Access Article

A Lightweight Tea Bud Detector via Cascaded Gated Modulation and Multi-Scale Feature Enhancement

by Zewei Mi and Minming Gu

AI 2026, 7(6), 227; https://doi.org/10.3390/ai7060227 - 18 Jun 2026

Abstract

Accurate detection of tea buds is a key technology for enabling automated tea harvesting. However, in natural environments, tea buds present challenges such as scale variation, dense distribution, and high similarity to the background, making it difficult for traditional methods to balance accuracy and efficiency. To address these issues, this paper proposes a lightweight detection framework, PCM-YOLO. The model introduces a cascaded gated feature modulation network into the YOLOv11 architecture, combining feedforward structures and gating mechanisms to selectively emphasize informative features, thereby improving tea bud detection performance. In addition, a feature-enhanced downsampling module is proposed, which employs a stepwise pooling-based feature enhancement mechanism to progressively expand the receptive field while preserving feature resolution, effectively incorporating multi-scale contextual information. Finally, a multi-scale feature enhancement module is designed to reduce the computational complexity of the model while maintaining detection performance as much as possible. Experimental results on public datasets demonstrate notable performance improvements over YOLOv11-N: Precision increases from 86.7% to 90.6% (an absolute increase of 3.9 percentage points), mAP50-95 increases by 1.6%, and the number of parameters is reduced by 20.6%. These results indicate that PCM-YOLO achieves a substantial reduction in model complexity while effectively improving detection accuracy, providing a feasible technical solution for deploying high-precision, real-time tea bud detection systems at the edge in tea plantation environments.

(This article belongs to the Section AI Systems: Theory and Applications)

17 pages, 4000 KB

Open Access Article

A Lightweight and High-Precision PCB Surface Defect Detection Method Based on YOLOv8

by Zhenling Wang, Ya Gao, Ying Xiao and Qiurui He

J. Imaging 2026, 12(6), 266; https://doi.org/10.3390/jimaging12060266 - 18 Jun 2026

Abstract

In response to the diverse types and large number of PCB surface defects, this paper proposes an improved YOLOv8-based method for PCB surface defect detection. First, a lightweight modification is performed by introducing RepGhostBottleNeck as the lightweight backbone network, which reduces the number of parameters in the training model. It should be noted that the term “lightweight” in this paper is relative to the original YOLOv8L baseline. Compared with extremely lightweight detectors, the model in this paper places greater emphasis on the balance between accuracy and efficiency. Additionally, an attention mechanism module and a small object detection head module are added to the backbone network. Furthermore, the loss function of the network is improved. Experimental results show that the improved model achieves an average mAP@0.5 of 0.976, demonstrating high-precision detection on the constructed dataset.

(This article belongs to the Special Issue AI-Driven Image and Video Understanding)

29 pages, 13586 KB

Open Access Article

Visual Recognition of Coal–Biomass Blend Ratios on a Conveyor Belt Using YOLO-Series Models with Oriented Bounding Boxes

by Yisheng Mao, Huijin Yang, Cuihua Zhang, Weihui Liao, Zhilong Ruan, Haibing Pu, Xu Huang, Xiaolong Wu and Zhimin Lu

Processes 2026, 14(12), 1979; https://doi.org/10.3390/pr14121979 - 18 Jun 2026

Abstract

Real-time perception of coal–biomass blending during conveyor-belt transport remains challenging because of local aggregation, particle overlap, and illumination variation. In this study, a laboratory-scale conveyor-belt image dataset covering different coal mass fractions, illumination conditions, and particle sizes was constructed. Whole-image classification, cropped-ROI classification, direct regression, horizontal bounding box (HBB)-based detection, oriented bounding box (OBB)-based detection, and RT-DETR-L detection baselines were compared using YOLO-series and auxiliary models. Coal mass fraction was estimated using a frequency-weighted statistical strategy that converts frame-level predictions into continuous estimates. YOLOv8-cls achieved an average RMSE of 13.98 percentage points (pp), indicating the influence of background interference in whole-image classification. Among HBB models, YOLOv8m achieved the lowest mean RMSE of 6.10 pp but required higher computational cost. Compared with YOLOv8n, YOLOv8n-OBB reduced the average RMSE from 9.02 to 6.90 pp by providing a more compact material-region representation and reducing background redundancy. These results show that OBB representation improves the stability of lightweight models. The proposed method provides a feasible vision-based soft-sensing approach for online trend monitoring of coal–biomass blending under lightweight deployment.
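The abstract does not detail the frequency-weighted strategy. One plausible reading, sketched below as an assumption, is that each frame yields a discrete coal-fraction class and the continuous estimate is the mean of those class values weighted by how often each was predicted. The function name and the example class levels are hypothetical.

```python
from collections import Counter

def frequency_weighted_fraction(frame_predictions):
    """Turn per-frame class predictions (coal mass fraction, %) into one estimate.

    Each distinct predicted level is weighted by its relative frequency across
    the frames in the window.
    """
    counts = Counter(frame_predictions)
    total = sum(counts.values())
    return sum(level * n for level, n in counts.items()) / total

# Hypothetical window of four frames: one predicted as 40 %, three as 60 %.
estimate = frequency_weighted_fraction([40, 60, 60, 60])
```

Because the estimate can fall between the discrete class levels, errors can be reported as an RMSE in percentage points, as the abstract does.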

(This article belongs to the Section AI-Enabled Process Engineering)
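
The coal–biomass abstract above mentions a frequency-weighted strategy that turns frame-level predictions into a continuous estimate, scored by RMSE in percentage points. The abstract does not give the formula, so the following is only a minimal sketch of one plausible reading: each frame yields a discrete coal mass fraction class, and the estimate is the mean of those class values weighted by how often each was predicted. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def frequency_weighted_fraction(frame_predictions):
    """Mean of predicted class values (in %), weighted by prediction frequency.

    Assumption: each frame-level prediction is a discrete coal mass
    fraction class expressed in percent (e.g. 20, 30, 40).
    """
    counts = Counter(frame_predictions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one frame prediction is required")
    return sum(value * n for value, n in counts.items()) / total


def rmse_pp(estimates, targets):
    """Root mean square error between estimates and targets, in percentage points."""
    if len(estimates) != len(targets) or not estimates:
        raise ValueError("estimates and targets must be non-empty and equal in length")
    squared = [(e - t) ** 2 for e, t in zip(estimates, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical frame-level predictions for one conveyor-belt sequence.
frames = [20, 20, 30, 30, 30, 40]
estimate = frequency_weighted_fraction(frames)
print(round(estimate, 2))
```

A weighted mean of this kind lets a classifier with coarse classes report values between classes, which is one way a frame-level model could support trend monitoring.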

25 pages, 19355 KB

Open AccessArticle

REB-Tea: An Intelligent Detection Model for Tea Buds with Clarity and Multi-Scale Feature Enhancement

by Zhuoxun Wu, Jun Lyu, Jingfan Pan, Junyi Luo and Lin Wang

Agriculture 2026, 16(12), 1340; https://doi.org/10.3390/agriculture16121340 - 17 Jun 2026

Abstract

Tea bud detection is a fundamental prerequisite for accurate tea yield estimation and intelligent mechanical harvesting. However, existing detection methods face several critical challenges, including ineffective extraction of multi-scale features, weak feature saliency for small tea bud targets, and the prevalent imaging issue in which the central regions of tea images are in focus while peripheral areas suffer from defocus blur. These factors collectively result in a high rate of missed detections, severely limiting detection accuracy and subsequent application performance. To overcome these technical bottlenecks, this paper proposes a novel tea bud detection framework, termed REB-Tea, which integrates image clarity optimization with multi-scale feature enhancement. First, the Restormer image restoration network is employed to improve overall image clarity and enhance the discriminative representation of tea bud features. Subsequently, a bidirectional feature pyramid network (BiFPN) structure and an efficient multi-scale attention (EMA) mechanism are incorporated into the neck of the YOLOv5 model to strengthen multi-scale feature fusion and guide the network to focus on fine-grained tea bud features across different scales, thereby improving detection performance for small and densely distributed targets. Experimental results based on 10-fold cross-validation demonstrate that the proposed REB-Tea model achieves an average mAP₅₀ of 95.5% on the Longjing 43 tea test set, representing a 9.9 percentage point improvement over the baseline YOLOv5 model, and Welch’s independent two-sample t-test verifies that this accuracy increment is highly statistically significant. Moreover, the model exhibits reliable detection performance across different tea varieties, including Cuifeng and Fuding White Tea. Specifically, the mAP₅₀ reaches 88.3% on Cuifeng, which shares similar appearance characteristics with Longjing, and 78.1% on Fuding White Tea, which has noticeably different appearance characteristics from Longjing. These results confirm the effectiveness of the REB-Tea framework in addressing challenges such as out-of-focus blurring, weak feature saliency, and multi-scale feature extraction. Overall, the proposed approach significantly enhances tea bud detection accuracy in natural environments and provides robust technical support for intelligent tea harvesting applications.

(This article belongs to the Topic Multidisciplinary Advances in Tea Science: Smart Cultivation, Digital Processing, and Health Innovation)
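
The REB-Tea abstract above reports that Welch's two-sample t-test was applied to the cross-validation results. As a rough illustration of that test, and not of the authors' actual analysis, the sketch below computes the Welch t statistic and the Welch–Satterthwaite degrees of freedom for two samples with possibly unequal variances. The function name `welch_t` and the placeholder samples are assumptions; the numbers are not scores from the paper.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    # Per-sample variance of the mean, using the unbiased sample variance.
    va, vb = variance(a) / na, variance(b) / nb
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch–Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Placeholder samples, e.g. standing in for per-fold scores of two models.
sample_a = [1, 2, 3, 4, 5]
sample_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(sample_a, sample_b)
print(round(t_stat, 3), round(dof, 3))
```

The p-value would then come from a Student t distribution with `dof` degrees of freedom. Welch's variant is a common choice when the two groups of fold scores cannot be assumed to share a variance.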

20 pages, 1672 KB

Open AccessArticle

CASDA: Enhancing Steel Defect Detection Through Context-Aware Data Augmentation Framework

by Ho-Jun Han and Il-Young Moon

Appl. Sci. 2026, 16(12), 6137; https://doi.org/10.3390/app16126137 - 17 Jun 2026

Abstract

Defect detection in manufacturing has evolved from manual inspection to deep learning-based Automated Visual Inspection (AVI) systems; however, acquiring sufficient defect samples in real industrial environments remains challenging, causing severe data sparsity and class imbalance. We propose CASDA (Context-Aware Steel Defect Augmentation), a five-stage framework that classifies defect morphology and background surface properties, constructs a compatibility matrix encoding their contextual relationship, and synthesizes defect images via a ControlNet pipeline conditioned on a three-channel hint image. Experiments on the Severstal steel dataset demonstrate that CASDA achieves an 83.0% quality validation pass rate. Under multi-seed evaluation (seeds 42 and 456), CASDA improved EB-YOLOv8’s overall mAP@0.5 by 2.60 pp over the raw baseline and achieved a Class 2 AP gain of 22.09 pp over Copy-Paste, suggesting that context-aware synthesis produces more discriminative minority-class training samples than simple patch reuse under the tested settings. Performance gains are architecture-dependent; YOLO-MFD did not show overall improvement, indicating that augmentation sensitivity varies with backbone feature representation.

(This article belongs to the Special Issue Intelligent Automation Technologies for Industry 4.0)

Search Results (4,514)
