Search Results (3,401)

Search Parameters:
Keywords = Yolo

28 pages, 5944 KB  
Article
3D Vision-Guided Adaptive 3D Ultrasonic Scanning for Robotic Arms: Nondestructive Testing of Aerospace Components
by Xiaolong Wei, Zijian Kang, Yizhen Yin, Jingtao Zhang, Caizhi Li, Yu Cai and Weifeng He
Sensors 2026, 26(7), 2129; https://doi.org/10.3390/s26072129 (registering DOI) - 30 Mar 2026
Abstract
In view of the bottleneck problems existing in the 3D ultrasonic testing of aircraft composite laminated structures—including heavy reliance on manual operation, resulting in low detection efficiency, and the inability of traditional robotic arms to adapt to the testing of complex curved surfaces due to their dependence on predefined fixed trajectories—this paper proposes an automated 3D ultrasonic testing method based on 3D vision guidance for robotic arms. Firstly, the proposed Yolo-Mask model is adopted to realize the visual recognition and segmentation of composite component regions, after which the segmentation results are mapped to the depth map and further converted into the surface point cloud of the material. Secondly, on the basis of point cloud preprocessing and trajectory point extraction, the automatic planning of the robotic arm’s scanning trajectory is achieved, which drives the robotic arm to perform precise motion and to synchronously collect spatial pose and ultrasonic testing data. Finally, 3D reconstruction is completed via a fusion algorithm, and 3D images of the material’s internal structures are generated. Experimental verification shows that the proposed method achieves a Segm-mAP of 97.4%, a detection speed of 11.7 fps, and a 3D imaging error of less than 0.1 mm, thereby realizing fully automated detection throughout the entire process. This research provides an effective solution for the non-destructive testing of aircraft composite structures. Full article
(This article belongs to the Special Issue AI-Driven Analytics and Intelligent Sensing for Industrial Systems)
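The abstract above describes mapping segmentation results onto a depth map and converting them into a surface point cloud. A minimal sketch of that step, assuming a standard pinhole camera model, could look like the following. The function name `mask_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and not taken from the paper, whose actual pipeline is not given in the listing.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def mask_to_point_cloud(
    mask: Sequence[Sequence[int]],
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project masked depth pixels into camera-frame 3D points.

    Assumes a pinhole camera; fx, fy, cx, cy are illustrative intrinsics.
    Pixels outside the mask or with non-positive depth are skipped.
    """
    points: List[Point] = []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth)):
        for u, (inside, z) in enumerate(zip(mask_row, depth_row)):
            if not inside or z <= 0.0:
                continue
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, z))
    return points


# Toy 2x2 example: one pixel is outside the mask, one has invalid depth.
mask = [[0, 1], [1, 1]]
depth = [[0.5, 0.5], [0.0, 1.0]]
cloud = mask_to_point_cloud(mask, depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

In practice the depth map would have to be registered to the colour image before the mask is applied, and the resulting points would still need the preprocessing the abstract mentions.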
24 pages, 4811 KB  
Article
Lightweight Power Line Defect Detection Based on Improved YOLOv8n
by Yuhan Yin, Xiaoyi Liu, Kunxiao Wu, Ruilin Xu, Jianyong Zheng and Fei Mei
Sensors 2026, 26(7), 2112; https://doi.org/10.3390/s26072112 - 28 Mar 2026
Viewed by 51
Abstract
To address the challenges of small targets, severe background clutter, and high deployment cost in UAV-based power-line defect detection, this paper proposes a lightweight defect detection model based on an improved YOLOv8n. In the downsampling stage, we design an improved lightweight adaptive downsampling module (ADownPro) to replace some of the conventional convolutions, which uses a dual-branch parallel structure for stronger feature interaction and depthwise separable convolutions (DSConv) for complexity reduction. In the feature extraction stage, an integration of cross-stage partial connections and partial convolution (CSPPC) is proposed to replace the C2F module for efficient multi-scale feature fusion. In the detection head, mixed local channel attention (MLCA), which combines channel-spatial information and local–global contextual features, is introduced to strengthen defect-focused representations under complex backgrounds. For the loss function, a scale-annealed mixed-quality EIoU loss (SAMQ-EIoU) is proposed by combining iso-center scale transformation, scale factor annealing and focal-style quality reweighting to improve localization accuracy at high IoU thresholds. Experiments on a constructed dataset covering six typical defect categories show that the improved YOLOv8n achieves 91.4% mAP@0.50 and 64.5% mAP@0.50:0.95, with only 1.59 M parameters and 4.9 GFLOPs. Compared with mainstream detectors, the proposed model achieves a better balance between detection accuracy and lightweight design. In particular, compared with the recently proposed YOLOv8n-DSN and IDD-YOLO, it improves mAP@0.50 by 0.6% and 0.8%, and mAP@0.50:0.95 by 1.2% and 4.8%, respectively, while further reducing the parameter count by 1.00 M and 1.26 M, and the FLOPs by 1.7 G and 0.2 G. Moreover, the cross-dataset evaluation on the public UPID and SFID datasets further demonstrates the robustness and generalization ability of the proposed method. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)
29 pages, 30542 KB  
Article
Identification of Allergenic Plant Distribution and Pollen Exposure Risk Assessment in Beijing Based on the YOLO Model
by Shuxin Xu, Shengbei Zhou, Jun Wu and Pengbo Li
Forests 2026, 17(4), 428; https://doi.org/10.3390/f17040428 (registering DOI) - 28 Mar 2026
Viewed by 50
Abstract
With the continuous renewal of urban greening, pollen released by allergenic tree species has become a prominent environmental issue affecting residents’ health. However, existing research still lacks city-wide, rapidly replicable methods for identifying allergenic tree species and assessing exposure risks. Taking Beijing’s central urban districts as a case study, this research establishes a method for the automated identification of allergenic tree species and the assessment of pollen exposure risks based on high-resolution satellite imagery. This study coupled tree species distribution results derived from model inference with population density per unit area to delineate three tiers of exposure risk zones. Subsequently, these risk zones were overlaid with the road network within the study area to determine the distribution of roads with low, medium, and high exposure risk. Public transport stop locations were then introduced as a proxy variable for areas of high population mobility. Lorenz curves and Gini coefficients were calculated to quantify the spatial equity of pollen exposure risk. The results indicate that the model reliably identifies target tree species, with approximately 117,000 valid targets. Exposure risks exhibit significant clustering characteristics and can form continuous expansions along road networks. Incorporating population factors shows minimal change in risk concentration, suggesting pollen exposure risk is primarily driven by the spatial clustering of allergenic tree species and their accessibility within road networks. This risk is highly correlated with the spatial distribution patterns and accessibility characteristics of allergenic tree species, rather than being solely determined by population size. This study provides foundational data and methodological support for urban tree species identification, pollen exposure risk management, and optimised greening configurations. Full article
(This article belongs to the Special Issue Urban Forestry: Management of Sustainable Landscapes)
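The abstract above uses Lorenz curves and Gini coefficients to quantify how evenly pollen exposure risk is distributed. As a hedged illustration of that calculation only, the sketch below computes both from a list of non-negative per-unit risk values. The helper names and the toy inputs are assumptions, and the paper's choice of spatial units and weighting is not stated in the listing.

```python
from typing import List, Sequence, Tuple


def lorenz_curve(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return Lorenz curve points (cumulative unit share, cumulative value share)."""
    if not values or any(v < 0 for v in values):
        raise ValueError("values must be a non-empty sequence of non-negative numbers")
    ordered = sorted(values)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("values must not all be zero")
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def gini_coefficient(values: Sequence[float]) -> float:
    """Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule."""
    points = lorenz_curve(values)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return 1.0 - 2.0 * area


# Hypothetical exposure values for four spatial units.
even = gini_coefficient([1.0, 1.0, 1.0, 1.0])          # perfectly even exposure
concentrated = gini_coefficient([0.0, 0.0, 0.0, 4.0])  # all exposure in one unit
```

A value near 0 indicates that exposure is spread evenly across units, while values approaching 1 indicate that a few units carry most of the risk.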
27 pages, 6255 KB  
Article
Lightweight Safety Helmet Wearing Detection Algorithm Based on GSA-YOLO
by Haodong Wang, Qiang Zhou, Zhiyuan Hao, Wentao Xiao and Luqing Yan
Sensors 2026, 26(7), 2110; https://doi.org/10.3390/s26072110 - 28 Mar 2026
Viewed by 62
Abstract
Electric power station confined spaces are high-risk and complex environments characterized by significant illumination variations. Whether safety helmets are properly worn directly affects the operational safety of workers in confined spaces. However, helmet detection in such environments faces several challenges, including drastic lighting changes and difficulties in small-object detection. Moreover, existing object detection models typically contain a large number of parameters, making real-time helmet detection difficult to deploy on field devices with limited computational resources. To address these issues, this paper proposes a lightweight safety helmet wearing detection algorithm named GSA-YOLO. To mitigate the effects of severe illumination variation and detail loss in confined spaces, a GCA-C2f module integrating GhostConv and the CBAM attention mechanism is embedded into the backbone network. This design reduces the number of parameters and computational cost while enhancing the model’s feature extraction capability under challenging lighting conditions. To improve detection performance for occluded targets, an improved efficient channel attention (I-ECA) mechanism is introduced into the neck structure, which suppresses irrelevant channel features and enhances occluded object detection accuracy. Furthermore, to alleviate missed detections of small objects and inaccurate localization under low-light conditions, a P2 detection branch is added to the head, and the WIoU loss function is adopted to dynamically adjust the weights of hard and easy samples, thereby improving small-object detection accuracy and localization robustness. A confined space helmet detection dataset containing 5000 images was constructed through on-site data collection for model training and validation. 
Experimental results demonstrate that the proposed GSA-YOLO achieves an mAP@0.5 of 91.2% on the self-built dataset with only 2.3 M parameters, outperforming the baseline model by 2.9% while reducing the parameter count by 23.6%. The experimental results verify that the proposed algorithm is suitable for environments with significant illumination variation and small-object detection challenges. It provides a lightweight and efficient solution for on-site helmet detection in confined space scenarios, thereby contributing to the reduction in industrial safety accidents. Full article
(This article belongs to the Section Sensing and Imaging)
7 pages, 866 KB  
Proceeding Paper
Inspection for Solder Joint Defects in Voltage Regulator ICs of Automotive Charging Applications
by Yi-Hsuan Chiu and Kuang-Chyi Lee
Eng. Proc. 2026, 134(1), 6; https://doi.org/10.3390/engproc2026134006 - 27 Mar 2026
Viewed by 58
Abstract
In automated production lines for automotive chargers, solder joint inspection is critical due to the widespread adoption of automotive electronics and electric vehicles. This study establishes a You Only Look Once Version 8 (YOLOv8)-based single-pin solder joint classification model for an 8-pin automotive voltage regulator IC. Solder joints were categorized into four types: normal, misalignment, insufficient fillet, and cold joint. The model achieved a single-pin training accuracy of 0.987 (4000 samples) and a test accuracy of 0.973 (4800 samples), while overall IC-level evaluation exceeded 0.90. Normal and cold joint categories were detected with the highest reliability, whereas occasional misclassifications occurred in the insufficient fillet and misalignment categories. These results demonstrate that the proposed method is feasible for efficient and accurate detection of solder joint defects, providing a practical approach to support automated inspection and ensure consistent production quality. Full article
17 pages, 1166 KB  
Article
An Integrated 60 GHz Radar and AI-Guided Infrared System for Non-Contact Heart Rate and Body Temperature Monitoring
by Sangwook Sim and Changgyun Kim
Appl. Sci. 2026, 16(7), 3272; https://doi.org/10.3390/app16073272 - 27 Mar 2026
Viewed by 139
Abstract
The growing need for remote patient monitoring, accelerated by the global pandemic and an aging population, necessitates the development of advanced non-contact technologies for measuring vital signs. In this study, an integrated, non-contact system for accurately measuring heart rate (HR) and body temperature (BT) is developed and validated. The proposed system combines a 60 GHz radar sensor and an infrared (IR) sensor for HR and BT measurements, respectively, enhanced with advanced signal processing and an AI-based computer vision algorithm. A Window Filter and a Peak Uniformity algorithm were applied to the raw radar signal to mitigate noise and motion artifacts. For BT measurement, an IR sensor with a narrow five-degree field of view (FOV) was integrated with a YOLO Pose-based tracking system using a camera and servo motors to automatically orient the sensor towards the user’s face. The system was validated with 30 healthy adult participants, benchmarked against a MAX30102 PPG sensor and a Braun ThermoScan 7 for HR and BT measurements, respectively. The advanced signal processing reduced the HR Mean Absolute Error (MAE) from 13.73 BPM to 5.28 BPM (p = 0.002), while the AI-guided IR sensor reduced the BT MAE from 4.10 °C to 1.64 °C (p < 0.001). These findings demonstrate that integrating 60 GHz radar with AI-driven tracking provides a promising approach for home-based trend monitoring. Full article
(This article belongs to the Special Issue AI-Based Biomedical Signal Processing—2nd Edition)
24 pages, 17498 KB  
Article
Vertebra-Level Completeness Analysis in Thoracolumbar Ultrasound Using a YOLO-Based Detection Framework
by Sumartini Dana, Chen Zhang, Yongping Zheng and Sai Ho Ling
Sensors 2026, 26(7), 2101; https://doi.org/10.3390/s26072101 - 27 Mar 2026
Viewed by 240
Abstract
Ultrasound enables radiation-free longitudinal monitoring of scoliosis, but rib shadowing and speckle noise often obscure vertebral structures. Current deep-learning methods present results in terms of localisation accuracy, without directly measuring anatomical completeness. We introduce a vertebra-level completeness model that includes a YOLO-based detection framework and an explicit representation of completeness, the Vertebra Presence Matrix (VPM). The VPM provides visibility into detections across 17 ordinal vertebral levels (T1–T12, L1–L5), allowing us to measure completeness across anatomy rather than just detections. Thoracolumbar ultrasound scans were annotated and divided into train/test sets using a patient-wise split to avoid data leakage. Four model variants were evaluated, including full-spine and vertebra-centric crop representations with single-class and 17-class detection heads. The full-spine detector was less stable in regions of high anatomical variability, such as the upper thoracic and lower lumbar spine. Crops of individual vertebrae were more stable under partial fields of view. The 17-class crop model achieved an mAP50 of 0.929 and a scan-level completeness score of 0.74 using the VPM. These results demonstrate that vertebral completeness can be explicitly quantified and integrated with localisation-based metrics for completeness-aware automated scoliosis evaluation. Full article
(This article belongs to the Special Issue Ultrasound Sensors and MEMS Devices for Biomedical Applications)
16 pages, 1544 KB  
Article
Evaluating and Enhancing YOLOv8’s Soft Error Resilience
by Aonan Yang, Zhi Liu, Yihao Guan, Tan Tan and Jinrong Shen
Electronics 2026, 15(7), 1404; https://doi.org/10.3390/electronics15071404 - 27 Mar 2026
Viewed by 169
Abstract
Soft errors pose a critical reliability threat to deep neural networks in safety-critical systems. We present YOLO-FI, a fault injection framework for YOLO models that supports module-wise analysis, multiple formats, and customizable fault models. Using YOLO-FI, we evaluate YOLOv8 in floating- and fixed-point forms on COCO. Results show that floating-point YOLOv8 is vulnerable across modules, while fixed-point models are resilient except for the final module. Range restriction improves resilience in most floating-point modules but fails in the final one. To address this, we propose selective hardening of the vulnerable module, achieving effective fault mitigation with modest overhead. Full article
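The abstract above refers to fault injection and range restriction for floating-point models. The sketch below illustrates the general idea with a single-bit flip in an IEEE 754 float32 value followed by clamping to a profiled range. It is not the YOLO-FI API: the function names, the single-bit fault model, and the bounds of -6.0 and 6.0 are assumptions chosen for illustration.

```python
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant, 31 = sign) of a float32 value."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


def restrict_range(value: float, low: float, high: float) -> float:
    """Clamp a possibly corrupted activation to bounds seen in fault-free runs.

    NaN is mapped to 0.0 here; that choice is an assumption of this sketch.
    """
    if value != value:  # NaN is the only value not equal to itself
        return 0.0
    return min(max(value, low), high)


# Flipping the top exponent bit of 1.0 (0x3F800000) yields +inf (0x7F800000).
corrupted = flip_bit_float32(1.0, 30)
# Hypothetical bounds; real bounds would be profiled per layer.
repaired = restrict_range(corrupted, low=-6.0, high=6.0)
```

Exponent-bit flips like this one produce very large deviations, which is one reason clamping can help in intermediate layers. The abstract reports that it does not suffice for the final module.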
50 pages, 7780 KB  
Systematic Review
Intelligent Eyes on Buildings: A Scientometric Mapping and Systematic Review of AI-Based Crack Detection and Predictive Diagnostics of Building Structures
by Mehdi Mohagheghi, Ali Bahadori-Jahromi and Shah Room
Encyclopedia 2026, 6(4), 75; https://doi.org/10.3390/encyclopedia6040075 - 27 Mar 2026
Viewed by 229
Abstract
Artificial Intelligence (AI)-based crack detection in buildings uses computer vision and deep learning to automatically identify structural cracks from inspection images. In recent years, many studies have explored this topic, but the overall development of the field, its methodological practices, and the remaining challenges are still not fully clear. Unlike most previous reviews that focus mainly on technical methods, this study combines a large-scale scientometric mapping of the research field with a focused technical analysis of recent AI-based crack detection methods specifically applied to building structures. This study therefore provides a dual-layer review covering research published between 2015 and 2025. A total of 146 Scopus-indexed publications were analysed using Visualization of Similarities viewer (VOSviewer) to examine publication growth, thematic evolution, collaboration patterns, and citation structures. In addition, a focused technical review of 36 highly relevant studies was carried out to analyse task formulations, model families, datasets, evaluation protocols, and methodological practices. The results show a rapid increase in research activity after 2020, largely driven by advances in deep-learning and Unmanned Aerial Vehicle (UAV)-based inspections. At the same time, collaboration networks remain uneven, and citation influence is concentrated in a limited number of research communities. The technical review further shows that most studies focus on detection-level tasks, particularly You Only Look Once (YOLO)-based models, while predictive diagnostics, automated inspection reporting, and decision-oriented Structural Health Monitoring (SHM) are still rarely addressed. Current datasets and evaluation protocols also remain mostly perception-oriented, which makes it difficult to assess robustness, generalisability and long-term predictive capability. Full article
24 pages, 15151 KB  
Article
SG-YOLO: A Multispectral Small-Object Detector for UAV Imagery Based on YOLO
by Binjie Zhang, Lin Wang, Quanwei Yao, Keyang Li and Qinyan Tan
Remote Sens. 2026, 18(7), 1003; https://doi.org/10.3390/rs18071003 - 27 Mar 2026
Viewed by 186
Abstract
Object detection in unmanned aerial vehicle (UAV) imagery remains a crucial yet challenging task due to complex backgrounds, large scale variations, and the prevalence of small objects. Visible-spectrum images lack robustness under all-weather and all-illumination conditions; by contrast, multispectral sensing provides complementary cues (e.g., thermal signatures) that improve detection robustness. However, existing multispectral solutions often incur high computational costs and are therefore difficult to deploy on resource-constrained UAV platforms. To address these issues, SG-YOLO, a lightweight and efficient multispectral object detection framework that aims to balance accuracy and efficiency, is proposed. First, a Spectral Gated Downsampling Stem (SGDS) is designed, in which grouped convolutions and a gating mechanism are employed at the early stage of the network to extract band-specific features, thereby maximizing spectral complementarity while minimizing redundancy. Second, a Spectral–Spatial Iterative Attention Fusion (SSIAF) module is introduced, in which spectral-wise (channel) attention and spatial-wise attention are iteratively coupled and cascaded in a multi-scale manner to jointly model cross-band dependencies and spatial saliency, thereby aggregating high-level semantic information while suppressing redundant spectral responses. Finally, a Spatial–Channel Synergistic Fusion (SCSF) module is designed to enhance multi-scale and cross-channel feature integration in the neck. Experiments on the MODA dataset show that SG-YOLO achieves 72.4% mAP50, outperforming the baseline by 3.2%. Moreover, compared with a range of mainstream one-stage detectors and multispectral detection methods, SG-YOLO delivers the best overall performance, providing an effective solution for UAV object detection while maintaining a favorable trade-off between model size and detection accuracy. Full article
25 pages, 9555 KB  
Article
EFSL-YOLO: An Improved Model for Small Object Detection in UAV Vision
by Meng Zhou, Shuke He, Chang Wang and Jing Wang
Drones 2026, 10(4), 243; https://doi.org/10.3390/drones10040243 - 27 Mar 2026
Viewed by 109
Abstract
To address the challenges in UAV remote sensing imagery, such as small object size, dense occlusion and complex background interference, this paper proposes an enhanced small object detection algorithm based on an improved YOLOv13 model for drone applications in complex weather environments. First, an enhanced feature fusion attention network (EFFA-Net) is designed in the preprocessing stage to reduce image degradation and suppress the interference caused by smoke and haze. Then, in the backbone, a swish-gated convolution (SwiGLUConv) module is designed to adaptively expand the receptive field and enhance multi-scale feature extraction, which strengthens the representation of small targets while maintaining efficient computation. Furthermore, a locally enhanced multi-scale context fusion (LF-MSCF) module is integrated into the feature fusion neck of YOLO, combining multi-head self-attention, channel attention, and spatial attention to suppress background noise and redundant responses, thereby improving detection accuracy. Extensive experiments on the VisDrone-DET2019 dataset, UAVDT dataset, and HazyDet dataset demonstrate that the proposed algorithm outperforms other mainstream methods, showcasing excellent detection accuracy and robustness in complex UAV aerial scenarios. Full article
36 pages, 7711 KB  
Article
Integrating Visual Perception with Conservative Enhanced Bio-Inspired Optimization for Safe UAV Trajectory Planning
by Qiushuang Gao, Zhenshen Qu, Qihang Zhang and Yuhao Shang
Appl. Sci. 2026, 16(7), 3245; https://doi.org/10.3390/app16073245 - 27 Mar 2026
Viewed by 94
Abstract
Unmanned Aerial Vehicle (UAV) trajectory planning in complex three-dimensional environments with threats remains a challenging optimization problem requiring efficient algorithms and threat detection capabilities. This study proposes the Conservative Enhanced Dwarf Mongoose Optimization Algorithm (CEDMOA), which introduces four key innovations to the original DMOA: hybrid population initialization, adaptive vocalization parameters, an elite-guided learning strategy, and intelligent restart mechanisms. This work also integrates CEDMOA with a novel vision-based threat detection system using YOLO object detection technology, enabling the identification and incorporation of threats into the optimization process. CEDMOA was comprehensively evaluated on the CEC2022 benchmark test suite, demonstrating superior performance compared to other state-of-the-art algorithms in solution quality and convergence stability. The results show that the approach successfully generates an optimal collision-free flight trajectory for UAV trajectory planning in complex environments with both static and dynamic threats. Combining metaheuristic optimization with computer vision technology provides a robust framework for autonomous navigation that adapts to changing threat conditions. Experimental results validate the effectiveness of both the enhanced algorithm and the vision-based threat integration approach for practical UAV operations. Full article
(This article belongs to the Special Issue Latest Research on Computer Vision and Its Application)
19 pages, 2359 KB  
Article
MSAdaNet: An Adaptive Multi-Scale Network for Surface Defect Detection of Smartphone Components
by Jianqing Wu, Hong Chen, Xiangchun Yu, Shuxin Yang, Weidong Huang, Fei Xie, Hanlin Hong and Hui Wang
Sensors 2026, 26(7), 2091; https://doi.org/10.3390/s26072091 - 27 Mar 2026
Viewed by 230
Abstract
The detection of surface defects on smartphone components is a critical step in quality assurance for industrial manufacturing. However, existing deep learning-based methods struggle with the extreme variations in defect morphology and scale, while labeled training data remains scarce due to the high cost of expert annotation. To address these challenges, we propose a twofold solution. First, we introduce MSAdaNet, a Multi-Scale Adaptive Defect Detection Network, which integrates three novel modules: a Parallel Multi-Scale Feature Aggregation (PMSFA) backbone, a Focusing Diffusion Pyramid Network (FDPN) neck, and a Scale-Adaptive Shared Detection (SASD) head. Second, to combat data scarcity, we propose a novel data generation pipeline, creating the synthetic Smartphone Camera Bezel Dataset (SCBD) of 4936 images. Extensive experiments on both real-world and synthetic datasets validate our approach. On the challenging public SSGD, MSAdaNet achieves a state-of-the-art mAP@0.5 of 54.8%, outperforming prominent frameworks and improving upon the strong YOLOv11m baseline by +10.6 points in mAP@0.5 and +18.3 points in recall. Furthermore, on our synthetic SCBD, the model achieves an impressive 94.0% mAP@0.5, confirming the quality of our data generation pipeline and the robustness of our architecture across different data distributions. Ablation studies systematically confirm the significant contribution of each proposed module, validating MSAdaNet as an effective and efficient solution for industrial defect detection. Full article
(This article belongs to the Topic Industrial Big Data and Artificial Intelligence)
19 pages, 3480 KB  
Article
Adapting Vision–Language Models for Few-Shot Industrial Defect Detection
by Chayanon Sub-r-pa and Rung-Ching Chen
Algorithms 2026, 19(4), 259; https://doi.org/10.3390/a19040259 - 27 Mar 2026
Viewed by 140
Abstract
Automated surface defect detection often faces a “cold-start” problem due to limited annotated data for new anomalies. Traditional object detectors struggle to converge in such few-shot settings. To address this, we adapt Vision–Language Models (VLMs), specifically YOLO-World. We use semantic pre-training to mitigate data scarcity. We evaluate this approach on the MVTec AD dataset in bounding-box format. We use a strict 1:9 train-validation split, resulting in an average of 11.8 defect instances per category. YOLO-World surpasses traditional baselines, like YOLOv11s and YOLOv26s, in 12 of 15 categories. The optimized VLM pipeline achieves up to 64.9% mAP@50 on texture-heavy categories, such as Tile, with only nine training instances. Ablation studies show standard optimization techniques are limited under 10-shot constraints. We find a critical augmentation divide. Disabling spatial distortions (Mosaic) is vital to preserving rigid-object geometry. The Normalized Wasserstein Distance (NWD) improves the localization of microscopic anomalies. Varifocal Loss (VFL) often causes model collapse. Ultimately, VLMs offer a superior foundation for cold-start inspection but require carefully tailored pipelines for robustness. Full article
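The abstract above credits the Normalized Wasserstein Distance (NWD) with better localization of very small anomalies. A rough sketch of the commonly published form of NWD is shown below, where each box is modelled as a 2D Gaussian and the similarity is exp(-W2 / C). The (cx, cy, w, h) box format and the constant `c=12.8` are assumptions for illustration. C is dataset-dependent, and the listing does not state how the paper sets it.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (center x, center y, width, height)


def nwd(box_a: Box, box_b: Box, c: float = 12.8) -> float:
    """Normalized Wasserstein Distance similarity between two boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). The value of c is a placeholder constant.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


# Two 4x4 boxes whose centers are 6 px apart do not overlap (IoU = 0),
# yet NWD still returns a graded, non-zero similarity.
same = nwd((10.0, 10.0, 4.0, 4.0), (10.0, 10.0, 4.0, 4.0))
apart = nwd((10.0, 10.0, 4.0, 4.0), (16.0, 10.0, 4.0, 4.0))
```

Because the score decays smoothly with center offset instead of dropping to zero once boxes stop overlapping, it gives a usable training signal for tiny targets where IoU changes abruptly.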
22 pages, 4435 KB  
Article
Semantic Mapping in Public Indoor Environments Using Improved Instance Segmentation and Continuous-Frame Dynamic Constraint
by Yumin Lu, Xueyu Feng, Zonghuan Guo, Jianchao Wang, Lin Zhou and Yingcheng Lin
Electronics 2026, 15(7), 1392; https://doi.org/10.3390/electronics15071392 - 26 Mar 2026
Viewed by 211
Abstract
Reliable semantic perception is crucial for service robots operating in complex public indoor environments. However, existing semantic mapping approaches often face the dual challenges of high computational overhead and semantic redundancy in maps. To address these limitations, this paper proposes a low-resource semantic mapping framework based on improved instance segmentation and dynamic constraints from consecutive frames. First, we design the lightweight model MS-YOLO, which adopts MobileNetV4 as its backbone network and incorporates the SHViT neck module, effectively optimizing the balance between detection accuracy and computational cost. Second, we propose a consecutive frame dynamic constraint method that eliminates redundant object annotations through consecutive frame stability verification. Experimental results relating to both fusion and custom datasets demonstrate that compared to YOLOv8n-seg, MS-YOLO achieves improvements in accuracy, recall, and mAP@0.5, while reducing the number of parameters by 11.7% and floating-point operations (FLOPs) by 32.2%. Furthermore, compared to YOLOv11n-seg and YOLOv5n-seg, its FLOPs are reduced by 17.2% and 25.5%, respectively. Finally, the successful deployment and field validation of this system on the Jetson Orin NX platform demonstrate its real-time capability and engineering practicality for edge computing in public indoor service robots. Full article
(This article belongs to the Section Artificial Intelligence)