Search Results (949)

Search Parameters:
Keywords = clutter detection

30 pages, 14061 KB  
Article
Power Defect Detection with Improved YOLOv12 and ROI Pseudo Point Cloud Visual Analytics
by Minglang Xu and Jishen Peng
Sensors 2026, 26(2), 445; https://doi.org/10.3390/s26020445 - 9 Jan 2026
Abstract
Power-equipment fault detection is challenging in real-world inspections due to subtle defect cues and cluttered backgrounds. This paper proposes an improved YOLOv12-based framework for multi-class power defect detection. We introduce a Prior-Guided Region Attention (PG-RA) module and design a Lightweight Residual Efficient Layer Aggregation Network (LR-RELAN). In addition, we develop a Dual-Spectrum Adaptive Fusion Loss (DSAF Loss) function to jointly improve classification confidence and bounding box regression consistency, enabling more robust learning under complex scenes. To support defect-oriented visual analytics and system interpretability, the framework further constructs Region of Interest (ROI) pseudo point clouds from detection outputs and compares two denoising strategies, Statistical Outlier Removal (SOR) and Radius Outlier Removal (ROR). A Python-based graphical prototype integrates image import, defect detection, ROI pseudo point cloud construction, denoising, 3D visualization, and result archiving into a unified workflow. Experimental results demonstrate that the proposed method improves detection accuracy and robustness while maintaining real-time performance, and the ROI pseudo point cloud module provides an intuitive auxiliary view for defect-structure inspection in practical applications.
(This article belongs to the Section Fault Diagnosis & Sensors)
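
The paper's prototype is not reproduced in this listing, but the two denoising strategies it compares are standard filters available in the open-source Open3D library. A minimal sketch; the file path and all parameter values are illustrative placeholders, not the paper's settings:

```python
import open3d as o3d

# Load an ROI pseudo point cloud (path is a placeholder).
pcd = o3d.io.read_point_cloud("roi_pseudo_cloud.ply")

# Statistical Outlier Removal (SOR): drop points whose mean distance to their
# nb_neighbors nearest neighbors exceeds std_ratio standard deviations.
sor_pcd, sor_idx = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Radius Outlier Removal (ROR): drop points with fewer than nb_points
# neighbors within the given radius.
ror_pcd, ror_idx = pcd.remove_radius_outlier(nb_points=16, radius=0.05)

print(len(pcd.points), len(sor_pcd.points), len(ror_pcd.points))
```

SOR adapts to local density through a global statistic, while ROR enforces a fixed local-density floor; which suppresses pseudo-point-cloud noise better depends on the quality of the underlying depth estimates.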

25 pages, 4824 KB  
Article
SCMT-Net: Spatial Curvature and Motion Temporal Feature Synergy Network for Multi-Frame Infrared Small Target Detection
by Ruiqi Yang, Yuan Liu, Ming Zhu, Huiping Zhu and Yuanfu Yuan
Remote Sens. 2026, 18(2), 215; https://doi.org/10.3390/rs18020215 - 9 Jan 2026
Abstract
Infrared small target (IRST) detection remains a challenging task due to extremely small target sizes, low signal-to-noise ratios (SNR), and complex background clutter. Existing methods often fail to balance reliable detection with low false alarm rates due to limited spatial–temporal modeling. To address this, we propose a multi-frame network that synergistically integrates spatial curvature and temporal motion consistency. Specifically, in the single-frame stage, a Gaussian Curvature Attention (GCA) module is introduced to exploit spatial curvature and geometric saliency, enhancing the discriminability of weak targets. In the multi-frame stage, a Motion-Aware Encoding Block (MAEB) utilizes MotionPool3D to capture temporal motion consistency and extract salient motion regions, while a Temporal Consistency Enhancement Module (TCEM) further refines cross-frame features to effectively suppress noise. Extensive experiments demonstrate that the proposed method achieves strong overall performance. In particular, under low-SNR conditions, the method improves the detection rate by 0.29% while maintaining a low false alarm rate, providing an effective solution for the stable detection of weak and small targets.
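
The GCA module itself is not public from the abstract, but the spatial cue it builds on, the Gaussian curvature of the intensity surface z = I(x, y), has a standard closed form. A minimal NumPy sketch under that assumption (finite differences; not the authors' implementation):

```python
import numpy as np

def gaussian_curvature_map(img: np.ndarray) -> np.ndarray:
    """Gaussian curvature of the Monge patch z = I(x, y), via finite differences."""
    Iy, Ix = np.gradient(img.astype(np.float64))  # axis 0 = rows (y), axis 1 = cols (x)
    Iyy, Iyx = np.gradient(Iy)
    Ixy, Ixx = np.gradient(Ix)
    # K = (I_xx * I_yy - I_xy^2) / (1 + I_x^2 + I_y^2)^2
    return (Ixx * Iyy - Ixy ** 2) / (1.0 + Ix ** 2 + Iy ** 2) ** 2
```

Blob-like point targets produce elliptic (positive-curvature) peaks, whereas edges and flat clutter yield near-zero or negative curvature, which is why curvature is a useful saliency cue for IRST.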

36 pages, 5941 KB  
Review
Physics-Driven SAR Target Detection: A Review and Perspective
by Xinyi Li, Lei Liu, Gang Wan, Fengjie Zheng, Shihao Guo, Guangde Sun, Ziyan Wang and Xiaoxuan Liu
Remote Sens. 2026, 18(2), 200; https://doi.org/10.3390/rs18020200 - 7 Jan 2026
Abstract
Synthetic Aperture Radar (SAR) is highly valuable for target detection due to its all-weather, day-night operational capability and certain ground penetration potential. However, traditional SAR target detection methods often directly adapt algorithms designed for optical imagery, simplistically treating SAR data as grayscale images. This approach overlooks SAR’s unique physical nature, failing to account for key factors such as backscatter variations from different polarizations, target representation changes across resolutions, and detection threshold shifts due to clutter background heterogeneity. Consequently, these limitations lead to insufficient cross-polarization adaptability, feature masking, and degraded recognition accuracy due to clutter interference. To address these challenges, this paper systematically reviews recent research advances in SAR target detection, focusing on physical constraints including polarization characteristics, scattering mechanisms, signal-domain properties, and resolution effects. Finally, it outlines promising research directions to guide future developments in physics-aware SAR target detection.
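
The review's observation that detection thresholds shift with clutter heterogeneity is the classical motivation for constant false-alarm rate (CFAR) detection. A minimal 1-D cell-averaging CFAR sketch for square-law-detected samples; window sizes and Pfa are illustrative, and the review itself prescribes no single detector:

```python
import numpy as np

def ca_cfar_1d(power, num_train=16, num_guard=4, pfa=1e-3):
    """Cell-averaging CFAR: threshold each cell against the local clutter mean."""
    # Scale factor giving the desired false-alarm rate for exponential clutter.
    alpha = num_train * (pfa ** (-1.0 / num_train) - 1.0)
    half = num_train // 2 + num_guard
    hits = np.zeros(power.shape, dtype=bool)
    for i in range(half, len(power) - half):
        train = np.concatenate([power[i - half : i - num_guard],
                                power[i + num_guard + 1 : i + half + 1]])
        hits[i] = power[i] > alpha * train.mean()
    return hits
```

Heterogeneous clutter (e.g., land-sea boundaries) violates the homogeneity assumption behind the training cells, which is exactly the failure mode the review highlights.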

14 pages, 9038 KB  
Article
BSGNet: Vehicle Detection in UAV Imagery of Construction Scenes via Biomimetic Edge Awareness and Global Receptive Field Modeling
by Yongwei Wang, Yuan Chen, Yakun Xie, Jun Zhu, Chao Dang and Hao Zhu
Drones 2026, 10(1), 32; https://doi.org/10.3390/drones10010032 - 5 Jan 2026
Abstract
Detecting vehicles in remote sensing images of construction sites captured by Unmanned Aerial Vehicles (UAVs) faces severe challenges, including extremely small target scales, high inter-class visual similarity, cluttered backgrounds, and highly variable imaging conditions. To address these issues, we propose BSGNet (Biomimetic Sharpening and Global Receptive Field Network)—a novel detection architecture that synergistically fuses biologically inspired visual mechanisms with global receptive field modeling. Inspired by the Sustained Contrast Detection (SCD) mechanism in frog retinal ganglion cells, we design a Perceptual Sharpening Module (PSM). This module combines dual-path contrast enhancement with spatial attention mechanisms to significantly improve sensitivity to the high-frequency edge structures of small targets while effectively suppressing interfering backgrounds. To overcome the inherent limitation of such biomimetic mechanisms—specifically their restricted local receptive fields—we further introduce a Global Heterogeneous Receptive Field Learning Module (GRM). This module employs parallel multi-branch dilated convolutions and local detail enhancement paths to achieve joint modeling of long-range semantic context and fine-grained local features. Extensive experiments on our newly constructed UAV Construction Vehicle (UCV) dataset demonstrate that BSGNet achieves state-of-the-art performance: obtaining 64.9% APs on small targets and 81.2% on the overall mAP@0.5 metric, with an inference latency of only 31.4 milliseconds, outperforming existing mainstream detection frameworks in multiple metrics. Furthermore, the model demonstrates robust generalization performance on public datasets.
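
The GRM is described only at a high level; the "parallel multi-branch dilated convolutions" pattern it names is straightforward to sketch in PyTorch. A generic illustration with assumed branch dilations, not the published module:

```python
import torch
import torch.nn as nn

class MultiDilationBlock(nn.Module):
    """Parallel 3x3 branches with growing dilation, fused by a 1x1 convolution."""
    def __init__(self, channels: int, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # padding == dilation keeps every branch at the input resolution.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```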

21 pages, 4180 KB  
Article
Mine Exogenous Fire Detection Algorithm Based on Improved YOLOv9
by Xinhui Zhan, Rui Yao, Yun Qi, Chenhao Bai, Qiuyang Li and Qingjie Qi
Processes 2026, 14(1), 169; https://doi.org/10.3390/pr14010169 - 4 Jan 2026
Abstract
Exogenous fires in underground coal mines are characterized by low illumination, smoke occlusion, heavy dust loading and pseudo fire sources, which jointly degrade image quality and cause missed and false alarms in visual detection. To achieve accurate and real-time early warning under such conditions, this paper proposes a mine exogenous fire detection algorithm based on an improved YOLOv9m, termed PPL-YOLO-F-C. First, a lightweight PP-LCNet backbone is embedded into YOLOv9m to reduce the number of parameters and GFLOPs while maintaining multi-scale feature representation suitable for deployment on resource-constrained edge devices. Second, a Fully Connected Attention (FCAttention) module is introduced to perform fine-grained frequency–channel calibration, enhancing discriminative flame and smoke features and suppressing low-frequency background clutter and non-flame textures. Third, the original upsampling operators in the neck are replaced by the CARAFE content-aware dynamic upsampler to recover blurred flame contours and tenuous smoke edges and to strengthen small-object perception. In addition, an MPDIoU-based bounding-box regression loss is adopted to improve geometric sensitivity and localization accuracy for small fire spots. Experiments on a self-constructed mine fire image dataset comprising 3000 samples show that the proposed PPL-YOLO-F-C model achieves a precision of 97.36%, a recall of 84.91%, mAP@50 of 96.49% and mAP@50:95 of 76.6%, outperforming Faster R-CNN, YOLOv5m, YOLOv7 and YOLOv8m while using fewer parameters and lower computational cost. The results demonstrate that the proposed algorithm provides a robust and efficient solution for real-time exogenous fire detection and edge deployment in complex underground mine environments.
(This article belongs to the Section AI-Enabled Process Engineering)
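
The MPDIoU loss the authors adopt follows the published MPDIoU formulation: standard IoU penalized by the squared distances between matching box corners, normalized by the squared image diagonal. A sketch under that assumption, not the paper's exact code:

```python
import torch

def mpdiou(pred, target, img_w, img_h, eps=1e-7):
    """MPDIoU for (x1, y1, x2, y2) boxes; the loss is typically 1 - mpdiou."""
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Corner-distance penalties, normalized by the squared image diagonal.
    diag2 = img_w ** 2 + img_h ** 2
    d_tl = (pred[..., 0] - target[..., 0]) ** 2 + (pred[..., 1] - target[..., 1]) ** 2
    d_br = (pred[..., 2] - target[..., 2]) ** 2 + (pred[..., 3] - target[..., 3]) ** 2
    return iou - d_tl / diag2 - d_br / diag2
```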

30 pages, 8453 KB  
Article
PBZGNet: A Novel Defect Detection Network for Substation Equipment Based on Gradual Parallel Branch Architecture
by Mintao Hu, Yang Zhuang, Jiahao Wang, Yaoyi Hu, Desheng Sun, Dawei Xu and Yongjie Zhai
Sensors 2026, 26(1), 300; https://doi.org/10.3390/s26010300 - 2 Jan 2026
Abstract
As power systems expand and grow smarter, the safe and steady operation of substation equipment has become a prerequisite for grid reliability. In cluttered substation scenes, however, existing deep learning detectors still struggle with small targets, multi-scale feature fusion, and precise localization. To overcome these limitations, we introduce PBZGNet, a defect-detection network that couples a gradual parallel-branch backbone, a zoom-fusion neck, and a global channel-recalibration module. First, BiCoreNet is embedded in the feature extractor: dual-core parallel paths, reversible residual links, and channel recalibration cooperate to mine fault-sensitive cues. Second, cross-scale ZFusion and Concat-CBFuse are dynamically merged so that no scale loses information; a hierarchical composite feature pyramid is then formed, strengthening the representation of both complex objects and tiny flaws. Third, an attention-guided decoupled detection head (ADHead) refines responses to obscured and minute defect patterns. Finally, within the Generalized Focal Loss framework, a quality rating scheme suppresses background interference while distribution regression sharpens the localization of small targets. Across all scales, PBZGNet clearly outperforms YOLOv11. Its lightweight variant, PBZGNet-n, attains 83.9% mAP@50 with only 2.91 M parameters and 7.7 GFLOPs—9.3% above YOLOv11-n. The full PBZGNet surpasses the current best substation model, YOLO-SD, by 7.3% mAP@50, setting a new state of the art (SOTA).
(This article belongs to the Special Issue Deep Learning Based Intelligent Fault Diagnosis)
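
"Global channel recalibration" is not specified further in the abstract; the standard squeeze-and-excitation pattern is one common realization. A hedged PyTorch sketch, with the reduction ratio assumed:

```python
import torch.nn as nn

class ChannelRecalibration(nn.Module):
    """SE-style gate: global average pool -> bottleneck MLP -> per-channel sigmoid."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))  # (b, c) global channel descriptors
        return x * w.view(b, c, 1, 1)    # recalibrate each channel
```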

23 pages, 3593 KB  
Article
CCAI-YOLO: A High-Precision Synthetic Aperture Radar Ship Detection Model Based on YOLOv8n Algorithm
by Hui Liu, Haoyu Dong, Hongyin Shi and Fang Li
Remote Sens. 2026, 18(1), 145; https://doi.org/10.3390/rs18010145 - 1 Jan 2026
Abstract
To tackle core challenges in detecting ship targets within synthetic aperture radar (SAR) images—including coherent speckle noise interference, complex background clutter, and multi-scale target distribution—this paper proposes a high-accuracy detection model, CCAI-YOLO. This model is based on the YOLOv8n framework, achieving systematic enhancements through the collaborative optimisation of key components: within the backbone network, the original C2f structure is replaced with the dynamic convolution module C2f-ODConv, improving the model’s extraction capabilities under noisy interference; the C2f-ACmix module is integrated into the neck network, introducing a self-attention mechanism to strengthen global context information modelling, thereby better distinguishing targets from structured backgrounds; the ASFF detection head optimises multi-scale feature fusion, enhancing detection consistency across different-sized targets. Concurrently, the Inner-SIoU loss function further improves bounding box regression accuracy and accelerates convergence. Experimental results demonstrate that on the public datasets SSDD and SAR-Ship-Dataset, CCAI-YOLO achieves consistent improvements over the baseline model YOLOv8n across key metrics including F1 score, mAP50, and mAP50-95. Its overall performance surpasses current mainstream SAR ship detection methods, providing an effective solution for robust and efficient ship detection in complex scenarios.
(This article belongs to the Special Issue Radar Data Processing and Analysis)
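
The ASFF head weights each scale per pixel before summation. A deliberately simplified sketch of that fusion step; it assumes the three inputs are already projected to a common channel count and spatial size, which the full ASFF handles with extra resize and compression layers:

```python
import torch
import torch.nn as nn

class SimpleASFF(nn.Module):
    """Per-pixel softmax weighting of three aligned feature maps."""
    def __init__(self, channels: int):
        super().__init__()
        self.weight = nn.Conv2d(3 * channels, 3, kernel_size=1)

    def forward(self, f0, f1, f2):
        # Learn one weight map per level, normalized across levels.
        w = torch.softmax(self.weight(torch.cat([f0, f1, f2], dim=1)), dim=1)
        return f0 * w[:, 0:1] + f1 * w[:, 1:2] + f2 * w[:, 2:3]
```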

27 pages, 5048 KB  
Article
MCB-RT-DETR: A Real-Time Vessel Detection Method for UAV Maritime Operations
by Fang Liu, Yongpeng Wei, Aruhan Yan, Tiezhu Cao and Xinghai Xie
Drones 2026, 10(1), 13; https://doi.org/10.3390/drones10010013 - 27 Dec 2025
Abstract
Maritime UAV operations face challenges in real-time ship detection: complex ocean backgrounds, drastic scale variations, and prevalent distant small targets. We propose MCB-RT-DETR, a real-time detection transformer enhanced by multi-component boosting. Built upon the RT-DETR architecture, it significantly improves detection under wave interference, lighting changes, and scale differences. An Orthogonal Channel Attention (Ortho) mechanism preserves high-frequency edge details in the backbone network, while Receptive Field Attention Convolution (RFAConv) enhances robustness against background clutter. A Small Object Detail Enhancement Pyramid (SOD-EPN), which combines SPDConv with multi-scale CSP-OmniKernel transformations, strengthens small-target representation. The neck network integrates ultra-lightweight DySample upsampling, enabling content-aware sampling for precise multi-scale localization while maintaining high computational efficiency. Experiments on the SeaDronesSee dataset show significant improvements: MCB-RT-DETR achieves 82.9% mAP@0.5 and 49.7% mAP@0.5:0.95, gains of 4.5% and 3.4% over the baseline model, while inference maintains 50 FPS for real-time processing. Strong performance in cross-dataset tests on DIOR remote sensing images and VisDrone2019 aerial scenes further validates the algorithm’s generalization capability. The method provides a reliable visual perception solution for autonomous maritime UAV operations.
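
SPDConv, which SOD-EPN builds on, replaces strided downsampling with a lossless space-to-depth rearrangement followed by a convolution. A minimal PyTorch sketch, with the kernel size assumed:

```python
import torch.nn as nn

class SPDConv(nn.Module):
    """Space-to-depth downsampling: no pixel is discarded, unlike stride-2 conv."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.unshuffle = nn.PixelUnshuffle(2)  # (B, C, H, W) -> (B, 4C, H/2, W/2)
        self.conv = nn.Conv2d(4 * in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(self.unshuffle(x))
```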

23 pages, 4261 KB  
Article
Efficient Drone Detection Using Temporal Anomalies and Small Spatio-Temporal Networks
by Abhijit Mahalanobis and Amadou Tall
Sensors 2026, 26(1), 170; https://doi.org/10.3390/s26010170 - 26 Dec 2025
Abstract
Detecting small drones in Infrared (IR) sequences poses significant challenges due to their low visibility, low resolution, and complex cluttered backgrounds. These factors often lead to high false alarm and missed detection rates. This paper frames drone detection as a spatio-temporal anomaly detection problem and proposes a remarkably lightweight pipeline (well-suited for edge applications) that employs a statistical temporal anomaly detector, the temporal Reed-Xiaoli (TRX) algorithm, in parallel with a lightweight convolutional neural network known as the TCRNet. While the TRX detector is unsupervised, the TCRNet is trained to discriminate between drones and clutter using spatio-temporal patches (or chips). The confidence maps from both modules are additively fused to localize drones in video imagery. We compare our method, dubbed TRX-TCRNet, to other state-of-the-art drone detection techniques using the Detection of Aircraft Under Background (DAUB) dataset. Our approach achieves exceptional computational efficiency with only 0.17 GFLOPs and 0.83 M parameters, outperforming methods that require 145–795 times more computational resources. At the same time, the TRX-TCRNet achieves one of the highest detection accuracies (mAP50 of 97.40) while requiring orders of magnitude fewer computational resources than competing methods, demonstrating an unprecedented efficiency–performance trade-off for real-time applications. Experimental results, including ROC and PR curves, confirm the framework’s suitability for resource-constrained environments and embedded systems.
(This article belongs to the Special Issue Signal Processing and Machine Learning for Sensor Systems)
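
The TRX detector applies the classical Reed-Xiaoli anomaly statistic along the temporal axis: each pixel's frame-stack profile is scored by its Mahalanobis distance from the background distribution. A NumPy sketch of that statistic; the paper's windowing and normalization details are not given in the abstract:

```python
import numpy as np

def temporal_rx(cube: np.ndarray) -> np.ndarray:
    """RX anomaly score per pixel; cube is a (T, H, W) stack of registered frames."""
    T, H, W = cube.shape
    X = cube.reshape(T, -1)                 # one T-dimensional profile per pixel
    Xc = X - X.mean(axis=1, keepdims=True)  # subtract the background mean
    cov = Xc @ Xc.T / (X.shape[1] - 1)      # (T, T) temporal covariance
    cov_inv = np.linalg.pinv(cov)           # pseudo-inverse for stability
    score = np.einsum("ij,ik,kj->j", Xc, cov_inv, Xc)  # Mahalanobis distance
    return score.reshape(H, W)
```

Moving drones perturb a pixel's temporal profile and score high; per the abstract, these maps are then additively fused with the TCRNet confidence maps.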

24 pages, 8240 KB  
Article
Multi-Constraint and Shortest Path Optimization Method for Individual Urban Street Tree Segmentation from Point Clouds
by Shengbo Yu, Dajun Li, Xiaowei Xie, Zhenyang Hui, Xiaolong Cheng, Faming Huang, Hua Liu and Liping Tu
Forests 2026, 17(1), 27; https://doi.org/10.3390/f17010027 - 25 Dec 2025
Abstract
Street trees are vital components of urban ecosystems, contributing to air purification, microclimate regulation, and visual landscape enhancement. Thus, accurate segmentation of individual trees from point clouds is an essential task for effective urban green space management. However, existing methods often struggle with noise, crown overlap, and the complexity of street environments. To address these challenges, this paper introduces a multi-constraint and shortest path optimization method for individual urban street tree segmentation from point clouds. In this paper, object primitives are first generated using multi-constraints based on graph segmentation. Subsequently, trunk points are identified and associated with their corresponding crowns through structural cues. To further improve the robustness of the proposed method under dense and cluttered conditions, the shortest-path optimization and stem-axis distance analysis techniques are proposed to further refine the individual tree extraction results. To evaluate the performance of the proposed method, the WHU-STree benchmark dataset is utilized for testing. Experimental results demonstrate that the proposed method achieves an average F1-score of 0.768 and coverage of 0.803, outperforming superpoint graph structure single-tree classification (SSSC) and Nyström spectral clustering (NSC) methods by 17.4% and 43.0%, respectively. The comparison of visual individual tree segmentation results also indicates that the proposed framework offers a reliable solution for street tree detection in complex urban scenes and holds practical value for advancing smart city ecological management.
(This article belongs to the Special Issue LiDAR Remote Sensing for Forestry)
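
The shortest-path refinement can be illustrated with standard tools: build a k-nearest-neighbor graph over the points and assign each point to the trunk seed with the smallest geodesic (graph) distance. A SciPy sketch; k and the seed selection are assumptions, and the paper's constraints are richer:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def assign_points_to_trunks(points, trunk_seed_idx, k=10):
    """Label each 3D point with the trunk reachable by the shortest graph path."""
    n = len(points)
    dist, nbr = cKDTree(points).query(points, k=k + 1)  # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    graph = csr_matrix((dist[:, 1:].ravel(), (rows, nbr[:, 1:].ravel())), shape=(n, n))
    d = dijkstra(graph, directed=False, indices=trunk_seed_idx)  # (num_seeds, n)
    return np.argmin(d, axis=0)  # index of the geodesically nearest trunk
```

Geodesic distance respects connectivity through the canopy, so overlapping crowns split along the cheapest path to each stem rather than by straight-line proximity.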

39 pages, 19666 KB  
Article
WA-YOLO: Water-Aware Improvements for Maritime Small-Object Detection Under Glare and Low-Light
by Hongxin Sun, Hongguan Zhao, Zhao Liu, Guanyao Jiang and Jiansen Zhao
J. Mar. Sci. Eng. 2026, 14(1), 37; https://doi.org/10.3390/jmse14010037 - 24 Dec 2025
Abstract
Maritime vision systems for unmanned surface vehicles confront challenges in small-object detection, specular reflections and low-light conditions. This paper introduces WA-YOLO, a water-aware training framework that incorporates lightweight attention modules (ECA/CBAM) to enhance the model’s discriminative capacity for small objects and critical features, particularly against cluttered water ripples and glare backgrounds; employs advanced bounding box regression losses (e.g., SIoU) to improve localization stability and convergence efficiency under wave disturbances; systematically explores the efficacy trade-off between high-resolution input and tiled inference strategies to tackle small-object detection, significantly boosting small-object recall (APS) while carefully evaluating the impact on real-time performance on embedded devices; and introduces physically inspired data augmentation techniques for low-light and strong-reflection scenarios, compelling the model to learn more robust feature representations under extreme optical variations. WA-YOLO achieves a compelling +2.1% improvement in mAP@0.5 and a +6.3% gain in APS over YOLOv8 across three test sets. When benchmarked against the advanced RT-DETR model, WA-YOLO not only surpasses its detection accuracy (0.7286 mAP@0.5) but crucially maintains real-time performance at 118 FPS on workstations and 17 FPS on embedded devices, achieving a superior balance between precision and efficiency. Our approach offers a simple, reproducible and readily deployable solution, with full code and pre-trained models publicly released.
(This article belongs to the Section Ocean Engineering)
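
The physically inspired augmentations are described only by their targets (low light, strong reflection); one plausible minimal rendition is gamma darkening plus an additive Gaussian glare spot. A NumPy sketch in which all parameter ranges are invented for illustration:

```python
import numpy as np

def augment_lowlight_glare(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """img: float32 HxWx3 in [0, 1]. Darken globally, then add a specular blob."""
    out = img ** rng.uniform(1.5, 3.0)           # gamma > 1 simulates low light
    H, W = img.shape[:2]
    cy, cx = rng.integers(0, H), rng.integers(0, W)
    y, x = np.ogrid[:H, :W]
    sigma = rng.uniform(0.02, 0.10) * max(H, W)  # glare spot size
    glare = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
    return np.clip(out + rng.uniform(0.5, 1.0) * glare[..., None], 0.0, 1.0)
```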

20 pages, 1304 KB  
Article
LSDA-YOLO: Enhanced SAR Target Detection with Large Kernel and SimAM Dual Attention
by Jingtian Yang and Lei Zhu
Symmetry 2026, 18(1), 23; https://doi.org/10.3390/sym18010023 - 23 Dec 2025
Abstract
Synthetic Aperture Radar (SAR) target detection faces significant challenges including speckle noise interference, weak small object features, and multi-category imbalance. To address these issues, this paper proposes LSDA-YOLO, an enhanced SAR target detection framework built upon the YOLO architecture that integrates Large Kernel Attention and SimAM dual attention mechanisms. Our method effectively overcomes these challenges by synergistically combining global context modeling and local detail enhancement to improve robustness and accuracy. Notably, this framework leverages the inherent symmetry properties of typical SAR targets (e.g., geometric symmetry of ships and bridges) to strengthen feature consistency, thereby reducing interference from asymmetric background clutter. By replacing the baseline C2PSA module with Deformable Large Kernel Attention and incorporating parameter-free SimAM attention throughout the detection network, our approach achieves improved detection accuracy while maintaining computational efficiency. The deformable large kernel attention module expands the receptive field through synergistic integration of deformable and dilated convolutions, enhancing geometric modeling for complex-shaped targets. Simultaneously, the SimAM attention mechanism enables adaptive feature enhancement across channel and spatial dimensions based on visual neuroscience principles, effectively improving discriminability for small targets in noisy SAR environments. Experimental results on the RSAR dataset demonstrate that LSDA-YOLO achieves 80.8% mAP50, 53.2% mAP50-95, and 77.6% F1 score, with computational complexity of 7.3 GFLOPS, showing significant improvement over baseline models and other attention variants while maintaining lightweight characteristics suitable for real-time applications.
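
SimAM is parameter-free and has a short closed form: each activation is weighted by a sigmoid of its energy, computed from its squared deviation from the channel mean. A PyTorch sketch following the published SimAM formulation, with λ assumed at its common default:

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free attention: weight = sigmoid of the per-activation energy."""
    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x):
        n = x.shape[2] * x.shape[3] - 1
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)  # squared deviation
        v = d.sum(dim=(2, 3), keepdim=True) / n            # per-channel variance
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5        # inverse energy
        return x * torch.sigmoid(e_inv)
```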

20 pages, 2188 KB  
Article
SAQ-YOLO: An Efficient Small Object Detection Model for Unmanned Aerial Vehicle in Maritime Search and Rescue
by Sichen Li, Hao Yi, Shengyi Chen, Xinmin Chen, Mao Xu and Feifan Yu
Appl. Sci. 2026, 16(1), 131; https://doi.org/10.3390/app16010131 - 22 Dec 2025
Abstract
In Search and Rescue (SAR) missions, UAVs must be capable of detecting small objects from complex and noise-prone maritime images. Existing small object detection methods typically rely on super-resolution techniques or complex structural designs, which often demand significant computational resources and fail to meet the real-time requirements for small mobile devices in SAR tasks. To address this challenge, we propose SAQ-YOLO, an efficient small object detection model based on the YOLO framework. We design a Small Object Auxiliary Query branch, which uses deep semantic information to guide the fusion of shallow features, thereby improving small object capture efficiency. Additionally, SAQ-YOLO incorporates a series of lightweight channel, spatial, and group (large kernel) gated attention mechanisms to suppress background clutter in complex maritime environments, enhancing feature extraction at a low computational cost. Experiments on the SeaDronesSee dataset demonstrate that, compared to YOLOv11s, SAQ-YOLO reduces the number of parameters by approximately 70% while increasing mAP@50 by 2.1 percentage points. Compared to YOLOv11n, SAQ-YOLO improves mAP@50 by 8.7 percentage points. When deployed on embedded platforms, SAQ-YOLO achieves an inference latency of only 35 milliseconds per frame, meeting the real-time requirements of maritime SAR applications. These results suggest that SAQ-YOLO provides an efficient and deployable solution for UAV SAR operations in vast and highly dynamic marine environments. Future work will focus on enhancing the robustness of the detection model.
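
The abstract's "group (large kernel) gated attention" is not specified further; a generic depthwise large-kernel gate conveys the idea at comparable cost. A sketch with the kernel size assumed, not the paper's design:

```python
import torch
import torch.nn as nn

class LargeKernelGate(nn.Module):
    """Gate features with a depthwise large-kernel response: cheap spatial context."""
    def __init__(self, channels: int, kernel_size: int = 13):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, kernel_size,
                            padding=kernel_size // 2, groups=channels)

    def forward(self, x):
        return x * torch.sigmoid(self.dw(x))  # suppress low-response clutter
```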

21 pages, 6979 KB  
Article
A Lightweight Edge-Deployable Framework for Intelligent Rice Disease Monitoring Based on Pruning and Distillation
by Wei Liu, Baoquan Duan, Zhipeng Fan, Ming Chen and Zeguo Qiu
Sensors 2026, 26(1), 35; https://doi.org/10.3390/s26010035 - 20 Dec 2025
Abstract
Digital agriculture and smart farming require crop health monitoring methods that balance detection accuracy with computational cost. Rice leaf diseases threaten yield, while field images often contain small multi-scale lesions, variable illumination and cluttered backgrounds. This paper investigates SCD-YOLOv11n, a lightweight detector designed with these constraints in mind. The model replaces the YOLOv11n backbone with a StarNet backbone and integrates a C3k2-Star module to enhance fine-grained, multi-scale feature extraction. A Detail-Strengthened Cross-scale Detection (DSCD) head is further introduced to improve localization of small lesions. On this architecture, we design a DepGraph-based mixed group-normalization pruning rule and apply channel-wise feature distillation to recover performance after pruning. Experiments on a public rice leaf disease dataset show that the compressed model requires 1.9 MB of storage, achieves 97.4% mAP@50 and 76.2% mAP@50:95, and attains a measured speed of 184 FPS under the tested settings. These results provide a quantitative reference for designing lightweight object detectors for rice disease monitoring in digital agriculture scenarios.
(This article belongs to the Topic Digital Agriculture, Smart Farming and Crop Monitoring)
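
"Channel-wise feature distillation" commonly means matching per-channel spatial distributions between teacher and student with a KL term (the CWD formulation); whether this paper uses exactly that variant is not stated. A hedged sketch:

```python
import torch.nn.functional as F

def channel_distillation_loss(feat_s, feat_t, tau: float = 4.0):
    """KL between per-channel spatial softmax maps of student and teacher."""
    b, c, h, w = feat_s.shape
    log_p_s = F.log_softmax(feat_s.reshape(b, c, -1) / tau, dim=-1)
    p_t = F.softmax(feat_t.detach().reshape(b, c, -1) / tau, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * tau ** 2
```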

23 pages, 7391 KB  
Article
TSE-YOLO: A Model for Tomato Ripeness Segmentation
by Liangquan Jia, Xinhui Yuan, Ze Chen, Tao Wang, Lu Gao, Guosong Gu, Xuechun Wang and Yang Wang
Agriculture 2026, 16(1), 8; https://doi.org/10.3390/agriculture16010008 - 19 Dec 2025
Abstract
Accurate and efficient tomato ripeness estimation is crucial for robotic harvesting and supply chain grading in smart agriculture. However, manual visual inspection is subjective, slow and difficult to scale, while existing vision models often struggle with cluttered field backgrounds, small targets and limited throughput. To overcome these limitations, we introduce TSE-YOLO, an improved real-time detector tailored for tomato ripeness estimation with joint detection and segmentation. In the TSE-YOLO model, three key enhancements are introduced. The C2PSA module is improved with ConvGLU, adapted from TransNeXt, to strengthen feature extraction within tomato regions. A novel segmentation head is designed to accelerate ripeness-aware segmentation and improve recall. Additionally, the C3k2 module is augmented with partial and frequency-dynamic convolutions, enhancing feature representation under complex planting conditions. These components enable precise instance-level localization and pixel-wise segmentation of tomatoes at three ripeness stages: green, semi-ripe, and ripe. Experiments on a self-constructed tomato ripeness dataset demonstrate that TSE-YOLO achieves 92.5% mAP@0.5 for detection and 92.2% mAP@0.5 for segmentation with only 9.8 GFLOPs. Deployed on Android via the NCNN inference framework, the model runs at 30 fps on a Dimensity 9300, offering a practical solution for automated tomato harvesting and grading that accelerates smart agriculture’s industrial adoption.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
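
"Partial convolution" here plausibly refers to the FasterNet-style PConv, which convolves only a slice of the channels and passes the rest through untouched, cutting FLOPs for mobile deployment. A sketch under that reading, with the split ratio assumed:

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """Convolve the first `ratio` fraction of channels; identity on the rest."""
    def __init__(self, channels: int, ratio: float = 0.25, k: int = 3):
        super().__init__()
        self.cp = max(1, int(channels * ratio))
        self.conv = nn.Conv2d(self.cp, self.cp, k, padding=k // 2)

    def forward(self, x):
        x1, x2 = torch.split(x, [self.cp, x.size(1) - self.cp], dim=1)
        return torch.cat([self.conv(x1), x2], dim=1)
```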
