Search Results (2,891)

Search Parameters:
Keywords = YOLOv8 object detection

20 pages, 30829 KB  
Article
Crop-IRM: An Intelligent Recognition and Management System for Organ Characteristics of Crop Germplasm Resources
by Jie Zhang, Chenyao Yang, Hailin Peng, Xintong Wei, Jiaqi Zou, Shiyu Wang, Zhaohong Lu, Xianming Tan and Feng Yang
Agriculture 2026, 16(9), 996; https://doi.org/10.3390/agriculture16090996 - 30 Apr 2026
Abstract
The traditional methods of field-based phenotypic data collection for crop germplasm resources are often inefficient and highly subjective. As the foundation for breeding innovation, these resources require precise identification of phenotypic traits for effective evaluation and utilization. Therefore, efficient and standardized management of germplasm data is critical during the breeding process. To address this, we have developed an intelligent recognition and management system focused on crop organ characteristics. The system consists of a web client for overall project management and data download, and a WeChat Mini Program for data collection and uploading; both components are integrated with image analysis models. Using a soybean variety screening experiment as a case study, we constructed multiple high-definition datasets for soybean phenotypic traits and employed YOLOv11 series models for object detection, image classification, instance segmentation, and pose estimation to build analytical models for each of these traits. All models achieved a mean average precision (mAP@0.5) exceeding 94%, along with a top-1 accuracy of 0.999. In practical evaluations, all models took between 0.71 and 3.03 s to make predictions for 100 images, achieving an accuracy rate of over 98%. This system delivers a comprehensive solution for field phenotypic identification of crop germplasm resources, substantially enhancing the efficiency and objectivity of data collection and analysis. It serves as a valuable decision-support tool for precision breeding and digital agriculture.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
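Nearly every entry in these results reports mAP@0.5, i.e., mean average precision with predictions matched to ground truth at an intersection-over-union (IoU) threshold of 0.5. As a reference point only (not code from any of the listed papers), the underlying IoU computation for axis-aligned boxes is:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A prediction counts as a true positive at mAP@0.5 only when it overlaps
# a ground-truth box with IoU >= 0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 50 / 150 ≈ 0.333 → no match at 0.5
```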
28 pages, 2497 KB  
Article
Research on the Application of Time-Frequency Characteristics of GPR in Railway Mud Pumping Intelligent Detection
by Wenxing Shi, Shilei Wang, Feng Yang, Chi Zhang, Fanruo Li and Suping Peng
Remote Sens. 2026, 18(9), 1393; https://doi.org/10.3390/rs18091393 - 30 Apr 2026
Abstract
Ground penetrating radar (GPR), as an efficient non-destructive testing technique, plays a crucial role in the structural condition assessment and defect identification of railway ballast. Typical defects such as mud pumping generally exhibit characteristics in B-scan images including weak reflections, blurred boundaries, and irregular structures, which pose significant challenges for stable detection and precise localization using existing methods that rely primarily on spatial feature modeling. Most current deep learning approaches focus on modeling spatial or temporal information while lacking effective utilization of frequency-domain features, thereby limiting their discriminative capability in complex electromagnetic environments. To address these issues, this paper proposes a single-stage object detection framework, termed YOLO-DGW, based on time-frequency collaborative modeling. Built upon YOLOv8, the proposed method introduces a structure-aware spatial enhancement module to improve the representation of continuous GPR echo structures. Meanwhile, frequency-domain information is incorporated as a modulation prior to guide spatial feature learning, enhancing the model’s sensitivity to weak reflections and complex-shaped targets. In addition, an A-CIoU loss function is designed to improve localization accuracy and stability for defect regions of varying scales. Experimental results demonstrate that YOLO-DGW achieves an F1-score of 63.06% and an AP@0.50 of 62.07%, improvements of approximately 7.41% and 2.8%, respectively, over the strongest baseline method. Compared with several mainstream object detection models, the proposed approach exhibits superior performance in both detection accuracy and cross-region generalization capability. These findings indicate that integrating frequency-domain information into spatial feature learning through a modulation mechanism can effectively enhance the model’s ability to discriminate weak-reflection anomalies, providing a novel time-frequency collaborative modeling paradigm for railway GPR defect detection.
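The abstract names an A-CIoU loss but does not define it. For orientation, a plain Complete-IoU (CIoU) loss — the family such variants typically extend — penalizes non-overlap, center distance, and aspect-ratio mismatch. The sketch below is an illustrative reference implementation of standard CIoU, not the paper's A-CIoU:

```python
import math

def ciou_loss(pred, gt):
    """Complete-IoU loss for boxes (x1, y1, x2, y2): 1 - IoU + center-distance
    penalty + aspect-ratio penalty."""
    # plain IoU
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter)
    # squared center distance over the enclosing box diagonal
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan((gt[2] - gt[0]) / (gt[3] - gt[1]))
                              - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1]))) ** 2
    alpha = v / ((1 - iou) + v) if v else 0.0
    return 1 - iou + rho2 / c2 + alpha * v

print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes → 0.0
```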
32 pages, 91311 KB  
Article
From Geometric Exploration to Semantic Completion: Scene Exploration Convolution and Large Format Perception for Adverse-Weather UAV Aerial Object Detection
by Yize Zhao, Bo Wang and Jialei Zhan
Sensors 2026, 26(9), 2802; https://doi.org/10.3390/s26092802 - 30 Apr 2026
Abstract
Object detection from unmanned aerial vehicle (UAV) imagery is essential for applications such as traffic monitoring, disaster response, and urban surveillance, yet most existing methods are developed and evaluated under clear-sky conditions. In real-world UAV operations, adverse weather including fog, rain, and snow introduces severe image degradation that simultaneously disrupts both the geometric and photometric properties of targets. This paper identifies two fundamental bottlenecks underlying this performance collapse: the lack of geometric invariance in standard convolutional operators and the inability of fixed receptive fields to reconstruct features corrupted by atmospheric interference. To address these bottlenecks, we propose SELPNet (Scene Exploration and Large Format Perception Network), a unified framework that integrates geometric alignment and multi-scale contextual perception into the YOLOv13 head. SELPNet consists of two key modules: (1) The Scene Exploration Convolution (SEC) leverages affine Lie group theory to construct a discrete manifold of rotation and scale transformations, actively probing multiple geometric views and selecting the most coherent response via a Maxout mechanism. (2) The Large Format Perception Module (LPM) introduces a dynamic dilation strategy with depthwise separable convolutions, progressively enlarging the receptive field from fine-grained edge preservation to scene-level contextual perception for semantic completion of degraded regions. We further construct and release AWU-OBB, a large-scale benchmark containing over 18,000 oriented bounding box-annotated UAV images across four representative scene categories. Ablation experiments demonstrate that SEC and LPM yield complementary gains, achieving a combined improvement of +4.26% mAP50 over the YOLOv13-n baseline with only 0.11 M additional parameters and 0.2 extra GFLOPs. The source code will be publicly released upon acceptance of this paper.
(This article belongs to the Section Intelligent Sensors)
16 pages, 13549 KB  
Article
YOLO-ALD: An Efficient and Robust Lightweight Model for Apple Leaf Disease Detection in Complex Orchard Environments
by Lei Liu, Yinyin Li, Qingyu Liu, Huihui Sun, Yeguo Sun and Xiaobo Shen
Horticulturae 2026, 12(5), 550; https://doi.org/10.3390/horticulturae12050550 - 30 Apr 2026
Abstract
Real-time detection of apple leaf diseases in orchard environments faces ongoing challenges, particularly in preserving fine-grained disease features with limited computing resources. To address these issues, we propose a high-precision lightweight model based on YOLOv10n, called YOLO-ALD. First, we introduce Spatial and Channel Reconstruction Convolution into the deeper backbone layers to replace standard downsampling layers and convolutions. This suppresses spatial and channel redundancy caused by environmental noise and optimizes feature representation. Second, we design a new C2f-Faster-SimAM module for the neck network. This module combines the inference efficiency of FasterNet with a parameter-free 3D attention mechanism to adaptively focus on early lesions, effectively distinguishing them from leaf veins without increasing model complexity. Third, in the detection head, we use the Focaler-ShapeIoU loss function to optimize bounding box regression. It utilizes a dynamic focusing mechanism and geometric constraints to ensure the localization accuracy of irregular shapes and hard-to-detect samples. Experimental results on our self-built dataset covering four specific diseases and healthy leaves showed that the mAP@0.5 of YOLO-ALD reached 92.1%, a 2.1-percentage-point increase over YOLOv10n. In addition, the model has an inference speed of 105 FPS, with only 2.1 M parameters and 5.6 GFLOPs. YOLO-ALD therefore achieves a good balance between efficiency and robustness, showing strong potential for resource-constrained mobile agricultural diagnosis.
(This article belongs to the Special Issue Emerging Technologies in Smart Agriculture)
24 pages, 4665 KB  
Article
Human Fall Detection with Infrared Imaging: A Comparison of Graph Convolutional Networks and YOLO
by Karol Perliński, Artur Faltyński and Aleksandra Świetlicka
Sensors 2026, 26(9), 2794; https://doi.org/10.3390/s26092794 - 30 Apr 2026
Abstract
This paper presents a comparative study of two artificial intelligence approaches—graph convolutional networks (GCNs) and the YOLO object detection algorithm—for analyzing human fall events using infrared imaging. From the AI perspective, the study introduces a GCN model that achieves over 99% classification accuracy by modeling 2D and 3D skeletal data as graph structures and evaluates the real-time detection capabilities of YOLOv8 on infrared video frames. On the engineering side, the research addresses practical challenges in elderly care and healthcare monitoring systems by demonstrating how these AI methods can accurately detect and classify fall directions under infrared conditions. The results highlight each model’s strengths and propose a hybrid framework combining YOLO’s spatial localization with GCN’s motion-pattern analysis for future real-world applications.
(This article belongs to the Section Sensing and Imaging)
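For readers unfamiliar with how skeletal data becomes a GCN input: joints are graph nodes, bones are edges, and a standard GCN layer propagates joint features through a symmetrically normalized adjacency matrix D^(-1/2)(A + I)D^(-1/2) (Kipf–Welling style). The sketch below is illustrative only; the paper's exact graph construction is not given in the abstract:

```python
import math

def normalized_adjacency(edges, n):
    """Build D^-1/2 (A + I) D^-1/2, the propagation matrix of a standard
    GCN layer, from an undirected edge list over n nodes."""
    # adjacency with self-loops (A + I)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]  # degrees including self-loops
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

# toy 3-joint chain: hip — knee — ankle
a_hat = normalized_adjacency([(0, 1), (1, 2)], 3)
```

A GCN layer would then compute `a_hat @ X @ W` for joint features `X` and learned weights `W`.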
20 pages, 12707 KB  
Article
SWUAV-DANet: A Severe-Weather UAV Dataset and Dynamic AlignAir Network for Robust Aerial Vehicle Detection
by Longze Zhang and Yihong Li
Sensors 2026, 26(9), 2793; https://doi.org/10.3390/s26092793 - 30 Apr 2026
Abstract
Unmanned aerial vehicle (UAV) aerial object detection is increasingly important for traffic monitoring, emergency rescue, and environmental perception. However, vehicle detection in heavy rain, dense fog, blizzards, and backlit night scenes suffers from target information loss, feature misalignment, and unstable performance. We therefore construct a new severe-weather UAV dataset, Severe-Weather UAV (SWUAV), and propose the real-time Dynamic AlignAir Network (DANet). SWUAV contains 18,195 red–green–blue (RGB) aerial images covering 12 adverse weather/illumination conditions with 236,392 vehicle instances. After the high-resolution backbone features, we insert a cross-scale adaptive alignment module that performs adaptive channel calibration, contrastive self-attention, and geometric/semantic remapping to reduce scale drift/mismatch, suppress noise, and strengthen degraded target cues; we then design a dynamic adaptive alignment head (DAAH) with a shared encoder and a deformable regression branch to mitigate classification–regression mismatch under adverse conditions while further reducing complexity. On SWUAV, DANet raises the YOLOv11-s baseline average precision (AP)/AP50 (AP at intersection over union, IoU = 0.50) from 43.9%/62.6% to 46.9%/64.8%, with only 8.65 M parameters, 22.7 giga floating-point operations (GFLOPs), and a 323.47 frames-per-second (FPS) end-to-end throughput (3.09 ms per image at batch size 16), outperforming EdgeYOLO-s and RT-DETR. The dataset and code are publicly available.
(This article belongs to the Section Vehicular Sensing)
14 pages, 3627 KB  
Article
Efficient YOLOv11 with a FasterNet Backbone and Attention for Multi-Class Underwater Object Detection in Nearshore Waters
by Yinghao He, Wenjie Yin, Ruomiao Song, Siyi Zhou, Shimin Shan and Shuo Liu
J. Mar. Sci. Eng. 2026, 14(9), 827; https://doi.org/10.3390/jmse14090827 - 29 Apr 2026
Abstract
Underwater multi-class object detection in nearshore waters is essential for intelligent cleaning operations and ecological monitoring. However, strong reflection and scattering interference, color attenuation, frequent occlusion, and non-rigid deformation often cause fine-grained information loss and feature misalignment in conventional detectors, leading to missed and false detections. To address these challenges, we propose an enhanced YOLOv11 framework integrating FasterNet and attention mechanisms. Specifically, we adopt FasterNet in place of the YOLOv11 baseline backbone to improve fine-grained feature preservation while reducing computational redundancy. Furthermore, a Deformable Underwater Attention Module (DUAM) is introduced to capture local texture variations and deformation-aware features, enhancing discrimination among heterogeneous categories. Additionally, a Submerged Occlusion-Aware Head (SOAH) is designed to recalibrate features based on occlusion visibility, improving the detection of small-scale and partially occluded objects in the high-resolution P2 layer. Performance gains mainly stem from the recalibration strategy and its synergy with multi-scale optimization objectives. Experiments on a nearshore underwater multi-class dataset (8610 images across 40 classes) show that the proposed method increases mAP from 66.9% to 82.3%, a 15.4-percentage-point improvement over baseline YOLOv11, with superior robustness under complex backgrounds.
(This article belongs to the Special Issue Assessment and Monitoring of Coastal Water Quality)
34 pages, 36077 KB  
Article
Modular Multi-Attribute Vehicle Analysis by Color, License Plate, Make and Sub-Model Using YOLO and OCR: A Benchmark Across YOLO Versions
by Cristian Japhet Islas-Yañez, Viridiana Hernández-Herrera and Moisés Márquez-Olivera
Sensors 2026, 26(9), 2785; https://doi.org/10.3390/s26092785 - 29 Apr 2026
Abstract
We present a modular multi-attribute vehicle analysis pipeline that integrates YOLO-based models and an OCR engine into a single workflow. The system detects vehicles, classifies color, recognizes make and sub-model, detects license plates, and extracts plate characters to generate a structured vehicle record. Vehicle detection is reported with standard metrics (precision, recall, and mAP@0.5), while license plate detection is reported at IoU = 0.3 to reflect the small-object nature of plates and downstream OCR usability. Among the evaluated versions, YOLOv8 provides the most balanced overall performance across modules, while maintaining real-time-equivalent throughput of approximately 18–22 FPS for the full pipeline on recorded traffic videos, depending on scene complexity. We emphasize module-level evaluation and runtime benchmarking; instance-level end-to-end identification across unique vehicles is defined as future work once track-based ground truth becomes available.
(This article belongs to the Topic Deep Visual Recognition: Methods, and Applications)
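The modular design described in this entry — independent detectors/classifiers feeding one structured record — can be sketched with stand-in stubs. Every function name and the record schema below are hypothetical; the authors' actual modules are not specified in the abstract:

```python
from dataclasses import dataclass

@dataclass
class VehicleRecord:
    color: str
    make: str
    sub_model: str
    plate_text: str

# Stand-in stubs for the YOLO classification and OCR modules (hypothetical,
# not the authors' code); each would wrap a trained model in practice.
def classify_color(crop):    return "red"
def classify_make(crop):     return "Brand-X"
def classify_submodel(crop): return "Model-Y"
def read_plate(crop):        return "ABC123"

def analyze(vehicle_crop) -> VehicleRecord:
    """Run each independent module on a detected vehicle crop and merge
    the outputs into one structured record."""
    return VehicleRecord(
        color=classify_color(vehicle_crop),
        make=classify_make(vehicle_crop),
        sub_model=classify_submodel(vehicle_crop),
        plate_text=read_plate(vehicle_crop),
    )

record = analyze(None)  # a real pipeline would pass an image crop here
```

The appeal of this structure is that each module can be benchmarked and swapped independently, which is exactly the module-level evaluation the paper emphasizes.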
26 pages, 54080 KB  
Article
MPES-YOLO: A Multi-Scale Lightweight Framework with Selective Edge Enhancement for Loess Landslide Detection
by Hanyu Cheng, Jiali Su, Jiangbo Xi, Haixing Shang, Zhen Zhang, Bingkun Wang and Pan Li
Remote Sens. 2026, 18(9), 1374; https://doi.org/10.3390/rs18091374 - 29 Apr 2026
Abstract
Loess landslides in northwestern China are highly unstable and difficult to distinguish due to sparse vegetation and their spectral and morphological similarity to the surrounding terrain. These landslides demonstrate considerable diversity in manifestation, encompassing shallow translational slides, small-scale features, partially obscured formations, and instances with irregular or poorly defined boundaries. To address the above issues, we propose MPES-YOLO, a multi-scale lightweight YOLO-based framework with selective edge enhancement for loess landslide detection. The model is based on the YOLOv8 architecture and incorporates a multi-scale partial convolution and exponential moving average (MPCE) module to improve multi-scale feature representation while reducing computational cost and enhancing small-target sensitivity. Additionally, to address ambiguous boundaries, a selective edge enhancement (SEE) module is introduced to extract authentic object edges from original images and inject them into key training layers, improving boundary perception. Finally, SIoU is adopted to improve geometric consistency for irregular landslide boundary localization. We first verified the basic detection performance of MPES-YOLO on the publicly available Bijie landslide dataset, then conducted an experimental study on the loess landslides of Yan’an City, Shaanxi Province, where the mAP@0.5 was 91.9% and the parameter count was reduced by 23.3% compared with the baseline model. A generalization experiment on landslides in the Ningxia region achieved an mAP@0.5 of 97.4%. The results show that MPES-YOLO achieves a strong balance between detection accuracy and computational efficiency, providing an effective and scalable solution for automated loess landslide detection and geological disaster early warning.
26 pages, 4074 KB  
Article
Early Diagnosis of Blood Disorders via Enhanced Image Preprocessing and Deep Learning Modeling
by Alpamis Kutlimuratov, Dilshod Eshmurodov, Fotima Tulaganova, Akhmet Utegenov, Piratdin Allayarov, Jamshid Khamzaev, Islambek Saymanov and Fazliddin Makhmudov
BioMedInformatics 2026, 6(3), 25; https://doi.org/10.3390/biomedinformatics6030025 - 29 Apr 2026
Abstract
Background: Accurate and early detection of hematological disorders from microscopic peripheral blood smear images remains a technically challenging task due to inherent imaging limitations, including noise contamination, low contrast, staining variability, and significant cellular overlap. Conventional deep learning-based object detection frameworks often exhibit limited robustness under such conditions and demonstrate reduced sensitivity to small-scale morphological structures, particularly platelets and abnormal cell variants. Methods: To address these challenges, this study proposes a hybrid detection framework that integrates a fuzzy logic-driven image preprocessing module with the YOLOv11 object detection architecture. The proposed preprocessing pipeline employs adaptive fuzzy membership functions to normalize pixel intensity distributions, suppress high-frequency noise, and enhance edge-defined cellular boundaries. This transformation produces a structurally optimized feature representation, improving downstream feature extraction and localization performance. The proposed framework was evaluated on a curated dataset of 3000 annotated microscopic blood smear images spanning five hematological classes. Results: Experimental results show that the fuzzy logic module improves mAP@0.5 by +3.4% and mAP@0.5:0.95 by +3.6%, confirming its effectiveness in enhancing both classification and localization accuracy. Conclusions: These findings demonstrate the robustness and practical applicability of the proposed hybrid approach under challenging imaging conditions.
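This entry's preprocessing rests on fuzzy membership functions over pixel intensities. A classic, much simpler relative is the fuzzy INT contrast operator (Pal–King style): map intensities to [0, 1] memberships, push them away from 0.5, and map back. The sketch below illustrates that idea only and is not the paper's adaptive pipeline:

```python
def fuzzy_intensify(pixels, lo=0, hi=255):
    """Fuzzy INT contrast operator: darkens below-midpoint intensities and
    brightens above-midpoint ones, sharpening edges between the two."""
    out = []
    for p in pixels:
        mu = (p - lo) / (hi - lo)                       # membership in "bright"
        mu = 2 * mu * mu if mu <= 0.5 else 1 - 2 * (1 - mu) ** 2
        out.append(round(lo + mu * (hi - lo)))          # back to intensity scale
    return out

print(fuzzy_intensify([64, 128, 192]))  # → [32, 128, 224]
```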
25 pages, 6442 KB  
Article
YOLOv12-WCIRS: An Improved YOLOv12-Based Framework for Small Intestinal Lesion Detection in WCE
by Shiren Ye, Liangjing Li, Zetong Zhang and Haipeng Ma
Computers 2026, 15(5), 283; https://doi.org/10.3390/computers15050283 - 29 Apr 2026
Abstract
Accurate detection of small intestinal lesions in wireless capsule endoscopy (WCE) images remains challenging because lesions are often small, weakly contrasted, irregular in shape, and easily confused with complex mucosal backgrounds. To address these difficulties, this study proposes YOLOv12-WCIRS, a WCE-oriented improvement of YOLOv12 that jointly enhances local feature extraction, selective multi-scale fusion, background suppression, localization sensitivity, and scale-aware optimization. The proposed framework incorporates a Weighted Convolution (WConv) module, a Contextual Selection Fusion Module (CSFM), an Information Integration Attention Fusion (IIA_Fusion) module, a Receptive Field Attention-based detection head (RFAHeadDetect), and a Scale Dynamic Loss (SD Loss). Experiments on the SEE-AI dataset show that YOLOv12-WCIRS achieves 83.4% mAP@0.5 and 61.1% mAP@0.5:0.95, improving mAP@0.5 from 76.9% to 83.4% over the direct baseline YOLOv12 while maintaining competitive efficiency. Additional analyses, including cross-dataset validation on overlapping categories in Kvasir-Capsule, normal-frame false-alarm evaluation, false-positive/false-negative breakdown, and repeated-run statistical testing, further support the robustness and practical value of the proposed framework. These results indicate that YOLOv12-WCIRS provides an effective solution for automated lesion detection in WCE images and shows promise for computer-aided capsule endoscopy analysis.
(This article belongs to the Special Issue Artificial Intelligence (AI) in Medical Informatics)
24 pages, 8644 KB  
Article
YOLO-REFB: Rectangular Edge Fusion for Cardboard Box Detection in Warehouse Environments Using Mobile Robot
by Narendra Kumar Kolla and Pandu Ranga Vundavilli
Modelling 2026, 7(3), 83; https://doi.org/10.3390/modelling7030083 - 28 Apr 2026
Abstract
Accurate detection of cardboard boxes is essential for mobile manipulators performing pick-and-place operations in warehouses. Conventional object detection methods like YOLOv11 struggle in low-texture and occluded environments. This paper presents YOLO-REFB, a novel object detection framework for real-time cardboard box detection in robotic manipulation using a dual-arm mobile robot (DAMR) operating in indoor warehouse environments. The proposed approach enhances the network by integrating the Rectangular Edge Fusion Block (REFB) into the YOLOv11 architecture, focusing on learning the geometric and structural features of cardboard boxes. Enhanced edge information extraction and feature fusion improve training stability and localization accuracy. A custom dataset of 3501 annotated images, collected under varied conditions, was utilized: the images were randomly split into training and validation sets at an 80:20 ratio and manually annotated in Roboflow, ensuring precise alignment of bounding boxes with cardboard box edges for fair comparison with existing YOLO models. The model outperformed existing YOLO variants (YOLOv8n and YOLOv5n) in precision (89.29%), recall (83.95%), and F1-score (86.54%), and achieved improved localization metrics, including mean Average Precision (mAP)@0.5 (91.68%) and mAP@0.5:0.95 (68.61%). The inclusion of REFB was essential to these gains, enabling effective detection of objects in challenging environments. Future developments may include 3D pose estimation and multi-object grasp planning for advanced robotic manipulation.
36 pages, 1539 KB  
Article
PGT-Net: A Physics-Guided Transformer–CNN Hybrid Network for Low-Light Image Enhancement and Object Detection in Traffic Scenes
by Bin Chen, Jian Qiao, Baowei Li, Shipeng Liu and Wei She
J. Imaging 2026, 12(5), 191; https://doi.org/10.3390/jimaging12050191 - 28 Apr 2026
Abstract
In autonomous driving and intelligent transportation systems, the degradation of image quality under low-light conditions severely impacts the reliability of subsequent object detection. Existing methods predominantly employ data-driven deep learning models for image enhancement, often lacking physical interpretability and struggling to maintain robustness in complex, lighting-varying traffic scenarios. To address this, this paper proposes PGT-Net, a Physics-Guided Transformer–CNN Hybrid Network for end-to-end joint optimization of low-light enhancement and object detection. PGT-Net innovatively integrates the atmospheric scattering physical model with a deep learning architecture: first, a learnable physical guidance branch estimates the scene’s atmospheric illumination map and transmittance map, providing explicit physical priors for the network; second, a dual-branch enhancement backbone is designed, where the local CNN branch (based on an improved UNet) restores fine textures, while the global Transformer branch (based on Swin Transformer) models long-range dependencies to correct global uneven illumination, with features adaptively combined via a Physical Fusion Module to ensure enhancement results align with physical laws while retaining rich visual features; finally, the enhanced images are fed directly into a lightweight detection head (e.g., YOLOv7) for joint training and optimization. Comprehensive experiments on public datasets (ExDark, BDD100K-night, etc.) demonstrate that PGT-Net significantly outperforms mainstream methods (e.g., RetinexNet, KinD, Zero-DCE) in both low-light image enhancement quality (PSNR/SSIM) and object detection accuracy (mAP), while maintaining high inference efficiency. This research offers an interpretable, high-performance solution for visual perception tasks under adverse lighting conditions, holding strong theoretical significance and practical value.
(This article belongs to the Section AI in Imaging)
41 pages, 16618 KB  
Article
Multi-Type Ship Detection in Complex Marine Backgrounds Using an Enhanced YOLO-Based Network
by Anran Du, Huiqi Xu and Wenqiang Yao
Sensors 2026, 26(9), 2718; https://doi.org/10.3390/s26092718 - 28 Apr 2026
Abstract
Accurate detection of ship targets in complex marine environments is fundamental to ensuring maritime security and safeguarding maritime rights. With the increasing diversity of vessel types and configurations, achieving precise identification of multiple ship classes amidst dynamic interference and cluttered backgrounds has emerged as a formidable challenge in marine surveillance. To address three pervasive issues in ship target detection—namely, high false-negative rates for small targets, inadequate feature discrimination, and imprecise localization—this paper proposes AK-DSAM-YOLOv13, a multi-scale detection algorithm specifically tailored for complex marine scenarios. Built upon the YOLOv13n architecture, the proposed algorithm implements integrated optimizations across the backbone network, neck structure, and loss function. First, a lightweight cross-scale feature extraction module, AKC3k2, is constructed by incorporating Alterable Kernel Convolutions (AKConv) to reconstruct the feature extraction path, thereby significantly enhancing the representation of multi-scale targets. Second, a Dynamic Up-Sampling Dual-Stream Attention Merging (DyDSAM) structure is designed, which integrates the DySample operator with a Dual-Stream Attention Mechanism (DSAM) to effectively suppress background clutter and improve feature fusion accuracy. Third, an Accuracy-Intersection-over-Union (AIoU) loss function is introduced to jointly optimize overlap area, center distance, and aspect ratio, enhancing localization robustness for small-scale objects. Experimental results on the self-built CM-Ships dataset, as well as the public SeaShips and McShips datasets, demonstrate that AK-DSAM-YOLOv13 significantly outperforms baseline models in detection accuracy, recall, and generalization capability while maintaining a low computational overhead. This research provides an efficient and reliable technical framework for intelligent maritime visual monitoring in complex environments.
7 pages, 845 KB  
Proceeding Paper
You Only Look Once-Based Bitter Melon Size Classification Enhanced by Harris Corner Detection and Douglas–Peucker Algorithm
by Julian Marc B. Surara, Charles Ivan Matthew C. Nangit, Analyn N. Yumang and Charmaine C. Paglinawan
Eng. Proc. 2026, 134(1), 85; https://doi.org/10.3390/engproc2026134085 - 27 Apr 2026
Abstract
Accurate size classification remains a persistent challenge for agricultural products with irregular morphology, such as bitter melon (Momordica charantia). Proper grading is essential for fair pricing, efficient packaging, and compliance with the Association of Southeast Asian Nations and Philippine National Standards, yet traditional manual sorting often results in inconsistencies. To address this, we introduce an automated classification framework built on the You Only Look Once Version 8 (YOLOv8) model. The system integrates Harris Corner Detection to enhance feature extraction and the Douglas–Peucker algorithm to simplify contour representations, thereby reducing noise and improving shape analysis. A dataset of Ampalaya images was trained and processed to detect and categorize fruit sizes, with evaluation conducted through a confusion matrix. Experimental results showed an overall classification accuracy of 93.75%, demonstrating that the combined approach effectively balances precision with computational efficiency. Beyond improving classification accuracy, the findings highlight the broader potential of combining deep learning and contour-based methods to advance agricultural automation, optimize post-harvest workflows, and strengthen competitiveness in both local and international markets.
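The Douglas–Peucker algorithm referenced in this entry simplifies a contour by recursively discarding vertices that lie within a tolerance `eps` of the chord between the current segment's endpoints. A minimal self-contained sketch:

```python
import math

def douglas_peucker(points, eps):
    """Simplify a polyline: keep a vertex only if it deviates from the
    endpoint chord by more than eps; recurse on both halves otherwise."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy) or 1e-12
    # perpendicular distance of each vertex from the chord
    dists = [abs(dy * (x - x1) - dx * (y - y1)) / norm for x, y in points]
    k = max(range(1, len(points) - 1), key=lambda i: dists[i])
    if dists[k] <= eps:                      # all interior vertices are close
        return [points[0], points[-1]]       # → collapse to the chord
    left = douglas_peucker(points[:k + 1], eps)
    right = douglas_peucker(points[k:], eps)
    return left[:-1] + right                 # drop the shared middle vertex

# a nearly straight contour collapses to its endpoints
print(douglas_peucker([(0, 0), (1, 0.05), (2, -0.04), (3, 0)], 0.1))
# → [(0, 0), (3, 0)]
```

OpenCV exposes the same algorithm as `cv2.approxPolyDP`, where `eps` controls how aggressively the bitter melon contour is smoothed before shape analysis.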