MDPI - Publisher of Open Access Journals

21 pages, 1784 KB

Open AccessArticle

Development and Application of an AI Visual Defect Detection System for Warp-Knitted Lace Based on 5G+ Technology

by Taohai Yan, Yongze Wu, Yajing Shi, Chaowang Lin and Li Ji

Information 2026, 17(7), 623; https://doi.org/10.3390/info17070623 (registering DOI) - 24 Jun 2026

Conventional defect inspection for warp-knitted lace relies on manual work and negative-sample-based training, resulting in low efficiency, frequent false detections and poor adaptability. This study presents a novel AI visual inspection system centered on positive-sample learning, which is built upon a five-layer 5G + Industrial Internet distributed architecture. Supported by modified looms, high-precision imaging devices and an optimized YOLOv5s model, the system accomplishes intelligent defect detection. A positive-sample self-learning paradigm and a dual-model collaboration mechanism are proposed to reduce the demand for negative samples and cut labeling expenses. The integration of CBAM, an FPN + PAN structure, self-supervised learning and a hybrid loss further strengthens the recognition performance for subtle defects under complex patterns. Industrial tests show that the system reaches a grid-level classification accuracy of 95% and a frame-level detection rate over 98%, with a detection speed of 30 m/min. It reduces labor costs and product reject rates by 40% and 30%, respectively, while running stably in real production. This method breaks the constraints of traditional training modes, provides a scalable intelligent solution for the digital upgrading of the warp-knitted lace industry, and promotes the high-quality development of textile manufacturing.

(This article belongs to the Section Information Applications)

20 pages, 20750 KB

Open AccessArticle

Does Facility Provision Translate into Vitality? Video-Based Evidence from Renovated Public Open Spaces in Old Communities

by Guiwen Liu, Yipin Huang, Hongjuan Wu and Heng Zhang

Land 2026, 15(7), 1119; https://doi.org/10.3390/land15071119 (registering DOI) - 24 Jun 2026

Abstract

Public open spaces (POS) in old communities are important settings for daily neighborhood life, yet many renovated POS remain underused after physical upgrading. Existing evaluations often rely on subjective perceptions, providing limited evidence on how facilities are associated with vitality. This study analyzes the associations between facility provision and POS vitality in 63 renovated POS across 11 old communities in Jiulongpo District, Chongqing, China. POS vitality is operationalized through two behavioral dimensions, use frequency and stay duration, derived from video detection and tracking using YOLOv8 and ByteTrack. Facility provision was then classified by facility type and examined in relation to the vitality indicators through descriptive analysis and Generalized Estimating Equations models. Descriptive evidence indicates substantial heterogeneity in both facility provision and POS vitality. Resting amenities and landscape elements are more commonly provided, whereas children’s facilities show the lowest provision and greater spatial selectivity. Higher use frequency and longer stay duration are concentrated in some POS. The Generalized Estimating Equations analysis further indicates that facilities are not associated with vitality in a uniform way. Children’s facilities show the strongest positive associations with both use frequency and stay duration despite their limited provision, supporting their key role in POS vitality. Landscape elements and lighting facilities are more closely associated with stay duration, highlighting the role of environmental support in sustaining longer use. In contrast, the negative associations for fitness facilities, together with the non-significant results for resting and sanitation amenities, suggest that not all facility provision translates into stronger vitality. 
Taken together, renovation performance should be judged not by the quantity of upgraded facilities alone, but by whether facilities support the behavioral dimensions of vitality that a POS is expected to achieve.

(This article belongs to the Section Urban Contexts and Urban-Rural Interactions)

22 pages, 3680 KB

Open AccessArticle

Tomato Visual Object Detection Method Based on the Mamba State Space Model

by Wenhao Li, Hengyi Zheng, Chengheng Zhao, Wei Liu, Shunjie Li and Mengbo Qian

Horticulturae 2026, 12(7), 770; https://doi.org/10.3390/horticulturae12070770 (registering DOI) - 24 Jun 2026

Abstract

Tomato harvesting still relies heavily on manual labor, while factors such as clustered fruit growth, inconsistent ripening stages, occlusion, and complex cultivation environments pose significant challenges to automated harvesting systems and place higher demands on target detection accuracy. To address these issues, a tomato detection method based on the Mamba state space model was proposed, and an improved model termed YOLO-VCW was developed based on YOLOv8n. Specifically, the original C2f module in the backbone network was replaced with the C2f-VSS module to enhance global contextual feature extraction. A Coordinate Attention mechanism was introduced into the feature fusion stage to improve the model’s ability to focus on tomato target regions under complex background and occlusion conditions. In addition, the WIoUv3 loss function was adopted in the detection head to improve localization accuracy and training stability in overlapping fruit scenarios. Experimental results showed that YOLO-VCW achieved a precision of 91.33%, a recall of 86.79%, and an F1-score of 89.00% on the tomato dataset. Compared with YOLOv8n, the proposed model improved precision, recall, F1-score, and mAP₅₀ by 1.90%, 4.43%, 3.25%, and 4.44%, respectively, with only a slight increase in parameters to 3.9 M. These results demonstrate that YOLO-VCW provides effective and robust performance for tomato target detection in complex environments.

(This article belongs to the Special Issue Intelligent Agricultural Equipment Monitoring Technology for Vegetable Production)
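
The precision, recall, and F1-score reported in this abstract can be cross-checked against the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is only an illustration of that arithmetic; the function name `f1_score` is chosen here and does not come from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (output uses the input's units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for YOLO-VCW, in percent.
precision, recall = 91.33, 86.79
print(f"F1 = {f1_score(precision, recall):.2f}%")  # prints "F1 = 89.00%"
```

The computed value agrees with the 89.00% F1-score stated in the abstract.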

27 pages, 7020 KB

Open AccessArticle

MSA-YOLO: An Optimized UAV Object Detection Algorithm for Low-Visibility Maritime

by Longcheng Huang, Mengguang Liao, Shaoning Li, Chuanguang Zhu and Sichun Long

Remote Sens. 2026, 18(13), 2065; https://doi.org/10.3390/rs18132065 (registering DOI) - 23 Jun 2026

Abstract

Maritime search and rescue is an important component of emergency response frameworks and primarily relies on Unmanned Aerial Vehicles (UAVs) for maritime object detection. However, maritime accidents frequently occur in low-visibility environments, such as foggy or low-light conditions, which lead to low contrast, blurred object boundaries, and degraded texture representations. Most existing maritime object detection algorithms are developed for natural light scenes, and their performance deteriorates markedly when deployed directly in low-visibility environments, primarily due to reduced image quality that hinders feature extraction and semantic information aggregation. Although several studies incorporate image enhancement techniques prior to detection to improve image quality, these approaches often introduce significant additional computational overhead, limiting their practical deployment on UAV platforms. To tackle these challenges, this paper proposes a lightweight model built upon a recent YOLO framework, termed Multi-Scale Adaptive YOLO (MSA-YOLO), for maritime detection using UAVs in low-visibility environments. The proposed model systematically optimizes the backbone, neck, and detection head networks. Specifically, an improved StarNet backbone is designed by integrating Efficient Channel Attention (ECA) mechanisms and multi-scale convolutional kernels, which strengthen feature extraction capability while maintaining low computational overhead. In the neck network, a high-frequency enhanced residual block branch is inserted into the C3k2 module to capture richer detailed information, while depthwise separable convolution is utilized to further reduce computational cost. Moreover, a non-parametric attention module is incorporated into the detection head to adaptively optimize features in the classification and regression branches. 
Finally, a joint loss function that combines bounding box regression, classification, and distribution focal losses is utilized to improve detection accuracy and training stability. Experimental results on the constructed AFO, Zhoushan Island, and Shandong Province datasets demonstrate that, relative to YOLOv11-s, MSA-YOLO reduces model parameters and FLOPs by 52.07% and 41.36%, respectively, while achieving improvements of 1.11% and 1.33% in mAP@0.5:0.95 and mAP@0.5. These results indicate that the proposed method effectively balances computational efficiency and detection accuracy, rendering it suitable for practical maritime search and rescue applications in low-visibility environments.

17 pages, 8857 KB

Open AccessArticle

An Interpretable Deep Learning System for Fine-Grained Classification and Longitudinal Tracking of Neonatal Auricular Deformities

by Yihui Feng, Xujun Hu, Xiwen Zhang, Xiaobao Ma, Jialin Xie, Jianyong Chen and Yangyang Yuan

Biology 2026, 15(13), 985; https://doi.org/10.3390/biology15130985 (registering DOI) - 23 Jun 2026

Abstract

Early non-invasive correction of neonatal auricular deformities is highly dependent on timely and precise diagnosis. However, clinical practice is often compromised by the subjectivity of visual assessments and the lack of objective tracking metrics, which frequently leads to missed optimal treatment windows. To address these challenges, we developed an interpretable deep learning-based diagnostic system for the automated screening and fine-grained classification of these deformities. Methodologically, a large-scale, multi-source dataset (n = 4644) was curated to support model training. The system pairs an automated object detector (YOLOv11) for background-reduced region-of-interest isolation with a cascaded classification pipeline optimized via ConvNeXt-Tiny. Crucially, we introduced a supervised contrastive learning module to project high-dimensional morphological features into a continuous severity score, enabling quantitative longitudinal tracking of therapeutic efficacy. To evaluate generalization and robustness, the framework underwent rigorous evaluation across three independent real-world cohorts and one controlled synthetic stress test. The system achieved 88.2% accuracy (Area Under the Curve (AUC): 0.949) in binary screening and 87.4% accuracy (macro-AUC: 0.976) in multi-class subtyping on the internal baseline. To enhance interpretability and build clinical trust, Gradient-weighted Class Activation Mapping (Grad-CAM) was utilized to explore the spatial distribution of the model’s attention, which frequently aligned with key anatomical landmarks. Furthermore, the learned severity scores robustly quantified post-intervention improvements (p = 0.0004), effectively capturing subtle anatomical normalization. 
While validation for rare subtypes remains underpowered, and the severity score currently functions mainly as a learned morphological similarity index requiring future clinical calibration, this study ultimately provides an objective and standardized web-based tool to facilitate the early intervention and precision management of neonatal auricular anomalies.

(This article belongs to the Special Issue AI Deep Learning Approach to Study Biological Questions (3rd Edition))

26 pages, 5787 KB

Open AccessArticle

CNS-YOLOv8: An Improved YOLOv8-Based Defect Detection Method

by Runhua Geng, Yuan Jiang, Jin Li, Kaiwen Wu, Yingjian Yang, Ziheng Li and Yaohui Chang

Electronics 2026, 15(12), 2730; https://doi.org/10.3390/electronics15122730 (registering DOI) - 21 Jun 2026

Abstract

Steel surface defect inspection plays an essential role in maintaining product quality and production safety in industrial manufacturing. However, existing detection methods still encounter difficulties in accurately identifying tiny defects, suppressing interference from complex backgrounds, and balancing detection accuracy with computational cost. To address these challenges, this paper proposes CNS-YOLOv8, an improved defect detection model based on YOLOv8n. First, a C2f_SCConv module is introduced to enhance multi-scale feature extraction and spatial representation capability. Second, a Normalization-based Attention Module (NAM) is embedded after the high-level semantic feature layer to improve the model’s sensitivity to critical defect regions. Third, a SlimNeck structure is adopted to strengthen feature fusion while reducing computational overhead. Experimental results on the NEU-DET dataset demonstrate that CNS-YOLOv8 achieves 83.1% mAP@0.5 and 49.6% mAP@0.5:0.95, surpassing YOLOv8n by 3.9 and 1.2 percentage points, respectively. In addition, comparative experiments show that CNS-YOLOv8 outperforms Faster R-CNN and YOLOv7 in terms of mAP@0.5 while requiring substantially fewer GFLOPs. Overall, the proposed method effectively balances detection accuracy and computational efficiency, highlighting its potential for real-time industrial surface defect detection.

(This article belongs to the Special Issue Advanced Technologies and Applications for Computer Vision and Recognition Systems)

26 pages, 35295 KB

Open AccessArticle

A Lightweight Framework for Tea Shoot Detection and Plucking Point Localization Enabled by Modified YOLOv11s-Seg Model

by Yongmao Huang, Yuankai Luo, Yuanxi Mu and Haiyan Jin

Agriculture 2026, 16(12), 1357; https://doi.org/10.3390/agriculture16121357 (registering DOI) - 20 Jun 2026

Abstract

In this work, a lightweight framework enabled by the modified YOLOv11s-seg model for tea shoot detection and plucking point localization is proposed. Detecting tea shoots and localizing plucking points with higher accuracy generally requires a larger model size and more model parameters, making it difficult to balance accuracy and lightweighting. To overcome this limitation, a modified lightweight YOLOv11s-seg model is developed. First, multi-scale edge information enhancement is introduced into the conventional YOLOv11s-seg to better extract edge features and improve the detection accuracy of tea shoots. Meanwhile, context anchor attention is utilized to modify the cross stage partial spatial attention module in the backbone network to improve the detection capability for small objects. Moreover, a detail calibration reconstruction feature pyramid network is proposed. It utilizes spatial and contextual semantic information to reconstruct and calibrate features in key regions, enhancing the capability for object fusion and recognition at various scales. Furthermore, with the modified model performing instance segmentation to acquire the contour of each tea shoot, the coordinates of the three lowest pixel points in the contour are captured to localize the plucking point based on the average coordinates. In addition, the layer-adaptive magnitude-based pruning (LAMP) method is used to lighten the model. The experimental results show that the LAMP-pruned modified YOLOv11s-seg model with a speedup ratio of 1.5 achieves a mAP@0.5 of 86.5% for tea shoot detection, exhibiting a 4.7 percentage point improvement over the conventional YOLOv11s-seg model. Moreover, it exhibits an accuracy of 81.9% for plucking point localization on the validation and test subsets with 232 images in total, and its number of parameters, model size and floating point operations (FLOPs) are reduced by 67.3%, 66.2%, and 24.9%, respectively, relative to the conventional model. Therefore, the proposed LAMP-pruned modified model shows a good balance between lightweighting and detection accuracy. Finally, the modified LAMP-pruned YOLOv11s-seg model is deployed on a Jetson Orin NX edge module and measured in a tea plantation, with the measured results exhibiting a detection speed of 34.1 FPS and verifying its usability in practical applications.

(This article belongs to the Special Issue Advances in Precision Agriculture in Orchard)
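
The plucking point rule described in this abstract (average the coordinates of the three lowest pixel points on the segmented contour) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes image coordinates in which `y` grows downward, so "lowest" means largest `y`. The name `plucking_point` and the tie-breaking order are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) pixel coordinates, y increasing downward (assumption)


def plucking_point(contour: List[Point]) -> Point:
    """Average the three contour points with the largest y (lowest in the image)."""
    if len(contour) < 3:
        raise ValueError("contour needs at least three points")
    # Sort by y descending and keep the three lowest points.
    lowest = sorted(contour, key=lambda p: p[1], reverse=True)[:3]
    x = sum(p[0] for p in lowest) / 3
    y = sum(p[1] for p in lowest) / 3
    return (x, y)


# Toy contour, not data from the article.
contour = [(10, 5), (12, 20), (14, 22), (16, 21), (18, 8)]
print(plucking_point(contour))  # prints "(14.0, 21.0)"
```

In practice the contour would come from the instance segmentation mask of each tea shoot. How the article handles ties or noisy mask edges is not stated in the abstract.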

18 pages, 6162 KB

Open AccessArticle

YOLO-UTD: A Domain-Specific Detection Framework for Small Objects in UAV Traffic Surveillance

by Hailang Huang, Meng Li, Jiebao Zhang and Yitong Li

Sensors 2026, 26(12), 3931; https://doi.org/10.3390/s26123931 (registering DOI) - 20 Jun 2026

Abstract

Detecting objects in drone-captured aerial imagery is particularly difficult because small targets are numerous and densely distributed. To address these challenges, this paper introduces YOLO-UTD (YOLO-UAV Traffic Detection), a dedicated small object detector tailored for drone traffic surveillance. Built upon the YOLOv8 framework, the proposed model incorporates three principal enhancements. First, a specialized small-object detection head replaces the original large-object head to increase the sensitivity to fine-grained features. Second, we introduce a shallow-augmented feature pyramid network (SFPN) into the neck module. The SFPN enriches the semantic content of high-resolution shallow features via dense multiscale interactions and CARAFE upsampling, boosting performance on small targets. Finally, a C2fA layer is integrated into the deep backbone stages to adaptively fuse spatial details and semantic context through a dual-path architecture and a cross-attention mechanism, thereby dynamically refining features critical for small objects. Extensive experiments on the VisDrone2019 dataset validate that YOLO-UTD achieves a 3.6% higher mean average precision (mAP) than YOLOv8 while preserving a low parameter footprint, with a particularly significant gain of 5.3% in vehicle detection accuracy. These findings confirm the model’s efficacy and strong potential for application in smart city drone surveillance.

(This article belongs to the Topic Transformer and Deep Learning Applications in Image Processing)

24 pages, 13146 KB

Open AccessArticle

Real-Time Assistive System Integrating Geometric Topology Analysis and State-Adaptive Warning Logic for the Visually Impaired

by Bilie Hu, Peishen Gao, Yan Liu, Xi Xia and Guoping Huo

Sensors 2026, 26(12), 3905; https://doi.org/10.3390/s26123905 (registering DOI) - 19 Jun 2026

Abstract

Traditional white canes offer a limited perception range, whereas end-to-end visual models face challenges in real-time deployment on edge devices. To address these limitations, this paper proposes a lightweight real-time assistive system that integrates geometric topology reconstruction with state-adaptive warning logic. The system utilizes YOLOv9 to extract discrete semantic primitives of tactile paving. It constructs a dual-branch perception framework based on Median Absolute Deviation and the Minimum Spanning Tree algorithm to analyze the topological structure of tactile paving. For complex intersections characterized by warning indicators, a one-dimensional connectivity clustering algorithm based on longitudinal topology is proposed. It generates accurate macroscopic feasible directional prompts under field-of-view boundary constraints. Additionally, a hierarchical scheduling framework dynamically orchestrates scenario-specific finite state machines to enable continuous dynamic interaction across typical high-risk scenarios. Evaluated on a custom real-world dataset, the system achieves a 95.21% frame-level comprehensive accuracy for straight-path deviation correction and intersection directional prompting. Dynamic temporal stress tests confirm the temporal stability and logical coherence of state transitions. Furthermore, latency evaluations demonstrate the logic layer’s minimal computational overhead, proving its theoretical feasibility for real-time edge deployment. This approach provides an effective, low-latency solution for delivering directional prompts and hazard warnings to visually impaired users.

(This article belongs to the Section Intelligent Sensors)
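
The abstract names Median Absolute Deviation (MAD) as one basis of the perception framework but does not say how it is applied. The sketch below assumes, purely for illustration, that MAD is used to reject outlier detections, for example tactile-paving boxes whose lateral offset deviates strongly from the rest. The function name `mad_outlier_mask`, the 0.6745 scaling constant, and the 3.5 threshold follow the common modified z-score convention and are not taken from the article.

```python
from statistics import median
from typing import List, Sequence


def mad_outlier_mask(values: Sequence[float], threshold: float = 3.5) -> List[bool]:
    """Flag values whose modified z-score exceeds the threshold.

    Modified z-score: 0.6745 * |x - median| / MAD, a robust analogue of
    the usual z-score (convention assumed here, not from the article).
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Values are too tightly clustered to scale deviations; flag nothing.
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


# Toy lateral offsets in pixels (hypothetical, not data from the article).
offsets = [10, 11, 10, 12, 11, 50]
print(mad_outlier_mask(offsets))  # prints "[False, False, False, False, False, True]"
```

Because both the center and the spread are medians, a single stray detection cannot inflate the scale estimate the way it would with a mean and standard deviation.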

43 pages, 13866 KB

Open AccessArticle

Research on Multi-Source Heterogeneous Collaborative Perception System Based on Unmanned Aerial Vehicle and Unmanned Ground Vehicle

by Yufeng Li, Erming Tian, Xiaofeng Chen, Huiyan Han and Xinya Zhang

Drones 2026, 10(6), 470; https://doi.org/10.3390/drones10060470 (registering DOI) - 19 Jun 2026

Abstract

Complex urban scenarios impose high demands on the environmental perception capabilities of unmanned systems, which serve as a prerequisite for executing autonomous missions such as disaster response, infrastructure inspection, and smart city operations. UAVs, leveraging their high mobility, can provide accurate prior maps and wide-area aerial observation for unmanned ground vehicles (UGVs). However, their long-range perception accuracy is limited. Conversely, UGVs can achieve high-precision environmental perception along their navigation paths using prior maps, but suffer from a constrained field of view. The collaboration between the two platforms complements their respective strengths, thereby enhancing 3D object perception and mapping accuracy in complex scenarios. To address the aforementioned challenges, this study proposes a cross-platform feature fusion method for 3D object perception and an incremental map updating approach for UAVs and UGVs. First, a dynamic SLAM method that integrates an optimized YOLOv8 with ORB-SLAM3 is employed to mitigate map blurring caused by dynamic noise, providing prior map information for UGVs. Second, a multimodal fusion perception model is constructed for UGVs, utilizing attention mechanisms to achieve deep fusion of multimodal Bird’s-Eye-View (BEV) features. This overcomes issues such as diminishing complementarity between modalities and weak temporal feature associations. Finally, an air-ground fusion model based on a cross-attention mechanism is developed to fuse aerial view features with ground-based fused BEV features across platforms, yielding a unified feature representation for 3D object detection and generating a fused high-precision map. Experimental results demonstrate that under complex occlusion scenarios in a simulated dataset, the proposed collaborative perception system improves the mean Average Precision (mAP) by 12.7% and 15.7% compared to using a single UAV or a single UGV, respectively, while increasing the map accuracy F1-score by 0.21. This study provides technical support for achieving real-time and accurate air-ground collaborative perception in complex dynamic environments.

(This article belongs to the Section Innovative Urban Mobility)

25 pages, 19355 KB

Open AccessArticle

REB-Tea: An Intelligent Detection Model for Tea Buds with Clarity and Multi-Scale Feature Enhancement

by Zhuoxun Wu, Jun Lyu, Jingfan Pan, Junyi Luo and Lin Wang

Agriculture 2026, 16(12), 1340; https://doi.org/10.3390/agriculture16121340 - 17 Jun 2026

Abstract

Tea bud detection is a fundamental prerequisite for accurate tea yield estimation and intelligent mechanical harvesting. However, existing detection methods face several critical challenges, including ineffective extraction of multi-scale features, weak feature saliency for small tea bud targets, and the prevalent imaging issue in which the central regions of tea images are in focus while peripheral areas suffer from defocus blur. These factors collectively result in a high rate of missed detections, severely limiting detection accuracy and subsequent application performance. To overcome these technical bottlenecks, this paper proposes a novel tea bud detection framework, termed REB-Tea, which integrates image clarity optimization with multi-scale feature enhancement. First, the Restormer image restoration network is employed to improve overall image clarity and enhance the discriminative representation of tea bud features. Subsequently, a bidirectional feature pyramid network (BiFPN) structure and an efficient multi-scale attention (EMA) mechanism are incorporated into the neck of the YOLOv5 model to strengthen multi-scale feature fusion and guide the network to focus on fine-grained tea bud features across different scales, thereby improving detection performance for small and densely distributed targets. Experimental results based on 10-fold cross-validation demonstrate that the proposed REB-Tea model achieves an average mAP₅₀ of 95.5% on the Longjing 43 tea test set, representing a 9.9 percentage point improvement over the baseline YOLOv5 model, and Welch’s independent two-sample t-test verifies that this accuracy increment is highly statistically significant. Moreover, the model exhibits reliable detection performance across different tea varieties, including Cuifeng and Fuding White Tea. 
Specifically, the mAP₅₀ reaches 88.3% on Cuifeng, which shares similar appearance characteristics with Longjing, and 78.1% on Fuding White Tea, which has noticeably different appearance characteristics from Longjing. These results confirm the effectiveness of the REB-Tea framework in addressing challenges such as out-of-focus blurring, weak feature saliency, and multi-scale feature extraction. Overall, the proposed approach significantly enhances tea bud detection accuracy in natural environments and provides robust technical support for intelligent tea harvesting applications.

(This article belongs to the Topic Multidisciplinary Advances in Tea Science: Smart Cultivation, Digital Processing, and Health Innovation)
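
The abstract reports that Welch's independent two-sample t-test was applied to the cross-validation results. As a reminder of what that test computes, the sketch below derives the t statistic and the Welch-Satterthwaite degrees of freedom from two samples using only the standard library. The function name `welch_t` is chosen here for illustration, and the inputs are toy numbers, not per-fold scores from the article.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Toy samples with unequal variances (not results from the article).
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t:.3f}, df = {df:.2f}")
```

Unlike Student's t-test, this form does not assume equal variances in the two groups. Turning `t` and `df` into a p-value needs the t distribution's CDF; in SciPy, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test.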

19 pages, 30860 KB

Open AccessArticle

CASDA: Enhancing Steel Defect Detection Through Context-Aware Data Augmentation Framework

by Ho-Jun Han and Il-Young Moon

Appl. Sci. 2026, 16(12), 6137; https://doi.org/10.3390/app16126137 - 17 Jun 2026

Abstract

Defect detection in manufacturing has evolved from manual inspection to deep learning-based Automated Visual Inspection (AVI) systems; however, acquiring sufficient defect samples in real industrial environments remains challenging, causing severe data sparsity and class imbalance. We propose CASDA (Context-Aware Steel Defect Augmentation), a five-stage framework that classifies defect morphology and background surface properties, constructs a compatibility matrix encoding their contextual relationship, and synthesizes defect images via a ControlNet pipeline conditioned on a three-channel hint image. Experiments on the Severstal steel dataset demonstrate that CASDA achieves an 83.0% quality validation pass rate. Under multi-seed evaluation (seeds 42 and 456), CASDA improved EB-YOLOv8’s overall mAP@0.5 by 2.60 pp over the raw baseline and achieved a Class 2 AP gain of 22.09 pp over Copy-Paste, suggesting that context-aware synthesis produces more discriminative minority-class training samples than simple patch reuse under the tested settings. Performance gains are architecture-dependent; YOLO-MFD did not show overall improvement, indicating that augmentation sensitivity varies with backbone feature representation.

(This article belongs to the Special Issue Intelligent Automation Technologies for Industry 4.0)

15 pages, 32174 KB

Open AccessArticle

YOLO-FSEP: An Improved YOLOv8n Algorithm for Sugar Orange Detection in Orchards

by Tianfa Deng, Jinchao Sun, Qingjuan Zhao and Faguo Huang

Sensors 2026, 26(12), 3848; https://doi.org/10.3390/s26123848 - 17 Jun 2026

Abstract

Sugar orange fruits in complex natural orchard environments are frequently occluded by leaves and branches and may occlude one another due to dense growth, leading to missed detections, false positives, and low detection confidence. To address these challenges, we propose an improved algorithm based on YOLOv8n, named YOLO-FSEP. A Spatial-Channel Synergistic Attention (SCSA) module is introduced into the main network to enhance feature extraction capabilities; the IoU loss function is replaced with Focal_SIOU to improve the detection accuracy for difficult samples; and an SE attention mechanism is embedded in the detection head, with the addition of a P6 high-resolution detection layer to optimize multi-scale object performance. Experimental results on a self-built sugar orange dataset show that, compared to the baseline YOLOv8n, the improved model achieves a 0.9% increase in accuracy, a 1.3% increase in recall, and a 3.2% increase in mAP50-95, while maintaining an inference speed of 62.6 FPS. To evaluate the model under dynamic conditions, we performed a 200-frame continuous test of the 3D localization pipeline on a laptop with a RealSense D435i camera. The average YOLO inference time was 49.90 ms, post-processing (depth extraction and 3D coordinate conversion) took 0.24 ms, and the total processing time was 50.15 ms. Given that the typical response time for a robotic arm’s single positioning operation is 100–200 ms, this real-time performance meets the dynamic localization requirements of sugar orange harvesting.

(This article belongs to the Special Issue Smart Sensors in Precision Agriculture)
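
The post-processing step mentioned in the abstract above (depth extraction and 3D coordinate conversion) can be illustrated with a pinhole-camera deprojection. This is a minimal sketch, not the authors' pipeline: the intrinsics, the bounding box, and the depth value are hypothetical, and lens distortion is ignored. It also checks the reported 50.15 ms per-frame total against the lower end of the quoted 100–200 ms arm response time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera parameters, all in pixels."""
    fx: float  # focal length along x
    fy: float  # focal length along y
    cx: float  # principal point x
    cy: float  # principal point y


def box_center(x1, y1, x2, y2):
    """Centre pixel of a detection box given as corner coordinates."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def deproject(u, v, depth_m, k):
    """Map pixel (u, v) with depth in metres to camera-frame (X, Y, Z)."""
    x = (u - k.cx) / k.fx * depth_m
    y = (v - k.cy) / k.fy * depth_m
    return (x, y, depth_m)


# Hypothetical intrinsics and detection, for illustration only.
K = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
u, v = box_center(300.0, 200.0, 380.0, 280.0)
point = deproject(u, v, 0.9, K)  # depth of 0.9 m is an assumed reading

# Latency figure taken from the abstract; the budget is the lower bound
# of the quoted arm response time.
total_ms = 50.15
arm_response_ms = 100.0
fps = 1000.0 / total_ms
within_budget = total_ms < arm_response_ms

print(point, round(fps, 1), within_budget)
```

In practice the depth value would come from the aligned depth frame at the box centre (or a robust statistic over the box), and the camera SDK's own calibrated intrinsics would replace the placeholder values.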


36 pages, 13556 KB

Open AccessArticle

OAD-YOLOv8n: A Lightweight Direction-Adaptive Framework for Steel Strip Surface Defect Detection

by Yuji Liu and Piwei Chen

Metals 2026, 16(6), 666; https://doi.org/10.3390/met16060666 - 16 Jun 2026

Abstract

Steel strip surface defect detection remains challenging because defects are often elongated, weakly bounded, low-contrast, and sensitive to imaging degradation. To address these issues, this paper proposes Orthogonal Direction-Adaptive YOLOv8n (OAD-YOLOv8n), a lightweight detector based on You Only Look Once version 8 nano (YOLOv8n) and centered on Orthogonal Direction-Adaptive Efficient Multi-Scale Attention (OA-EMA), an orthogonal direction-adaptive attention module that combines debiased strip descriptors, adaptive direction selection, and local directional convolution. Dynamic upsampling by learning to sample (DySample), a lightweight neck structure (SlimNeck), and Adaptive Threshold Focal Loss (ATFL) are further integrated to improve detail-preserving upsampling, efficient multi-scale fusion, and hard-sample optimization. Across five independent runs on NEU-DET, OAD-YOLOv8n improves Precision, Recall, mAP50, and mAP50:95 by 5.0, 3.6, 4.4, and 3.7 percentage points over YOLOv8n, while reducing FLOPs and parameters by approximately 10.3% and 7.0%, respectively. Complementary experiments on GC10-DET, cross-dataset transfer/adaptation, simulated practical image perturbations, failure cases, and measured inference speed provide a broader characterization of the model’s benchmark-level generalization, robustness, and deployment-related behavior. These results indicate that OAD-YOLOv8n provides an effective accuracy–efficiency trade-off for lightweight steel strip surface defect detection.
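
The abstract does not give the form of ATFL, so the sketch below shows only the standard binary focal loss that focal-style losses build on, to illustrate what "hard-sample optimization" means here. It is not the ATFL used in the paper; the `gamma` and `alpha` defaults are the commonly used values and are assumptions in this context.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for one prediction.

    p: predicted probability of the positive class
    y: ground-truth label, 1 or 0
    gamma: focusing parameter; larger values down-weight easy samples more
    alpha: class-balancing weight for the positive class
    """
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to keep log() finite
    p_t = p if y == 1 else 1.0 - p      # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # confident, correct prediction
hard = focal_loss(0.30, 1)  # poorly classified positive sample
print(easy, hard)
```

The modulating factor `(1 - p_t) ** gamma` shrinks the contribution of well-classified samples, so the gradient is dominated by hard ones. With `gamma=0` the expression reduces to alpha-weighted cross-entropy.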


26 pages, 6707 KB

Open AccessArticle

BDRNet: Background-Aware Dynamic-Scale Routing Network for UAV Remote Sensing Object Detection

by Xuelong Zheng, Faming Shao, Qing Liu, Juying Dai, Yiming Yue, Tao Zhang and Caian Chen

Remote Sens. 2026, 18(12), 1987; https://doi.org/10.3390/rs18121987 - 15 Jun 2026

Abstract

Object detection in UAV remote sensing imagery remains challenging due to severe scale variation, dense object distributions, complex background clutter, and localization ambiguity caused by extremely small objects. To address these issues, this paper proposes BDRNet, a lightweight background-aware dynamic-scale routing network for UAV remote sensing object detection. First, a background-aware feature enhancement (BAFE) module is introduced into the backbone to enhance feature representation through horizontal and vertical contextual modeling, improving target-related responses in complex aerial scenes. Second, a dynamic-scale routing pyramid (DSRP) is designed to retain the high-resolution $P_{2}$ branch and adaptively integrate multi-scale features through spatially dynamic routing, alleviating the loss of fine-grained information and improving the representation of small and scale-varied objects. Third, a scale- and geometry-aware normalized Wasserstein distance (SGNW) loss is proposed by modeling bounding boxes as two-dimensional Gaussian distributions. By incorporating aspect-ratio-guided geometric weighting and scale-aware dynamic fusion, SGNW improves regression stability for small objects while preserving geometric constraints for medium and large targets. Extensive experiments on the VisDrone2019 and UAVDT datasets demonstrate that BDRNet consistently improves detection accuracy over the YOLOv10s detector while maintaining a comparable model size and computational cost. Compared with several mainstream lightweight detectors, BDRNet achieves a favorable accuracy–efficiency trade-off, demonstrating its effectiveness for UAV remote sensing object detection in complex aerial scenarios.
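
To make the idea of modeling bounding boxes as two-dimensional Gaussians concrete, the sketch below computes the plain normalized Wasserstein distance between two axis-aligned boxes. It covers only this base similarity: the aspect-ratio-guided weighting and scale-aware fusion that define SGNW are not described in enough detail in the abstract to reproduce. The normalizing constant `c` is a hypothetical value chosen for illustration.

```python
import math


def nwd(box_a, box_b, c=10.0):
    """Normalized Wasserstein distance between two boxes (cx, cy, w, h).

    Each box is treated as a 2D Gaussian with mean (cx, cy) and
    covariance diag(w^2 / 4, h^2 / 4). For such Gaussians the squared
    2-Wasserstein distance has the closed form computed below.
    c is a normalizing constant (assumed value, typically tied to the
    dataset's object scale).
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2_sq = (
        (ax - bx) ** 2
        + (ay - by) ** 2
        + ((aw - bw) / 2.0) ** 2
        + ((ah - bh) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(box_a, box_b, c=10.0):
    """Regression loss derived from the similarity: 0 for identical boxes."""
    return 1.0 - nwd(box_a, box_b, c)


# Two small boxes that do not overlap at all, so their IoU is 0.
pred = (10.0, 10.0, 4.0, 4.0)
target = (16.0, 18.0, 4.0, 4.0)
print(nwd(pred, target), nwd_loss(pred, target))
```

Unlike IoU, which is zero for every pair of non-overlapping boxes, this similarity still decreases smoothly with centre distance, which is the usual argument for Wasserstein-based losses on very small objects.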


Search Results (2,823)
