Search Results (62)

Search Parameters:
Keywords = normalized Wasserstein distance loss

20 pages, 2382 KB  
Article
DMN-YOLO: A Lightweight Small-Object Detector for Multi-Species Animal Detection in UAV Grassland Imagery
by Qian Huang, Jun Yang, Mengqi Yang, Dan Jiang and Tan Wang
Animals 2026, 16(11), 1643; https://doi.org/10.3390/ani16111643 - 27 May 2026
Abstract
To meet the requirements of accurate multi-class animal detection and model lightweighting in UAV-based grazing monitoring, this study presents DMN-YOLO, an efficient detector built upon YOLO11n. In particular, a lightweight downsampling module, DSDown, is introduced to alleviate the loss of detailed features of tiny targets during downsampling under complex grassland backgrounds, thereby improving the preservation of edge, texture, and local structural information. Meanwhile, a MACFPN multi-scale feature fusion structure is designed to handle large scale variations and feature confusion among multiple animal targets, enhancing cross-scale feature interaction and background suppression for better small-target representation. In addition, NWDR Loss combines CIoU geometric constraints, normalized Wasserstein distance, and an adaptive weighting strategy to improve overall stability and localization accuracy of small-target bounding box regression. Results indicate that DMN-YOLO attains 93.6% precision, 89.9% recall, and 95.8% mAP@0.5 on the UAV animal detection dataset. Compared with YOLO11n, it reduces the parameter count by 35.7% while lowering the model size by 29.3%. These results show that DMN-YOLO effectively reduces model complexity while maintaining strong detection performance, demonstrating good potential for practical field deployment. Full article
(This article belongs to the Section Animal System and Management)
22 pages, 3526 KB  
Article
PCSNet-YOLOv12: YOLOv12-Based Target Detection Model for Winged Aphids on Sticky Traps with Precise Coordinate Synergy Network
by Bolun Guan, Juanjuan Kong, Jingbo Zhu, Liping Zhang, Meng Zhang and Wei Dong
Agriculture 2026, 16(10), 1058; https://doi.org/10.3390/agriculture16101058 - 13 May 2026
Viewed by 293
Abstract
In the field of smart plant protection, accurate early monitoring of winged aphids is critical, as it enables the interruption of viral disease transmission and reduces dependence on pesticides. In response to the core challenges of low efficiency in manual counting associated with current sticky trap-based monitoring, as well as the insufficient recognition accuracy and poor robustness of computer vision models in dense small-target scenarios, this study aims to develop a high-precision, highly reliable automated identification method for winged aphids. To achieve this, a specialized detection model named PCSNet is proposed. Based on YOLOv12, this model innovatively incorporates a coordinate attention mechanism to enhance the perception of spatial structures for small targets. Simultaneously, a shallow feature enhancement branch (SFEB) is introduced to enrich detailed information, and the Normalized Wasserstein Distance loss function is integrated to optimize bounding box regression. Comparative experiments conducted on a self-constructed dataset of sticky trap images encompassing complex field backgrounds demonstrate that the PCSNet model achieves optimal detection performance, with a mean average precision (mAP) of 0.791 and a precision of 0.866, significantly outperforming mainstream detection models and various attention mechanism variants. This research provides an effective technical solution for constructing a real-time and automated intelligent pest monitoring system, offering substantial application value for advancing the intelligent transformation of pest and disease monitoring and promoting practices in green prevention and control. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
16 pages, 1828 KB  
Article
Recognition of Electricity Meter Digits Based on Improved YOLOv10n and Cascaded Visual-Semantic Processing
by Yan Li and Yanfei Bai
Symmetry 2026, 18(4), 694; https://doi.org/10.3390/sym18040694 - 21 Apr 2026
Viewed by 294
Abstract
Digital electricity meters display readings via digits, but accurate image-based recognition faces a key challenge: the frequent omission of decimal points creates a critical asymmetry between the visual image and its true semantic meaning. To address this visual-semantic asymmetry, we propose an improved YOLOv10n approach incorporating cascaded Visual-Semantic processing. We introduce a Reparameterized Convolution Single-Shot Aggregation (RCSOSA) module and a SimAM attention mechanism to enhance feature extraction, and employ Normalized Wasserstein Distance (NWD) Loss to boost small-target detection. To rectify the visual-semantic asymmetry, we introduce domain-specific format rules based on power industry standards (taking GB/T 17215-2018 as an example) to provide structural constraints for digit recognition. Experimental results show superior performance with 0.870 precision, 0.932 mAP50, and 116 FPS inference speed, outperforming reference models in both precision and efficiency for real-time meter inspection. Full article
27 pages, 4209 KB  
Article
ViTWGAN: An Improved WGAN and Vision Transformer-Based Model for Intrusion Detection
by Xu Lin, Yanhui Liu, Cuihua Wu, Xiaodan Liang and Menghao Fang
Electronics 2026, 15(8), 1617; https://doi.org/10.3390/electronics15081617 - 13 Apr 2026
Viewed by 292
Abstract
This study proposes ViTWGAN, a novel and effective intrusion detection model designed to enhance data privacy protection by detecting malicious traffic within network flows. By improving the discriminator’s loss function, our approach reduces blind spots in the discriminator by explicitly reinforcing the learning of hard negative samples, thereby mitigating the forgetting of negative samples in the generative adversarial network. A Vision Transformer is employed as the backbone architecture for both the generator and the discriminator, while the Wasserstein distance is introduced to prevent mode collapse, enabling the generator to produce diverse normal traffic and consequently improving the discriminator’s detection capability. Extensive experiments on the NSL-KDD and CIC-DDoS2019 datasets demonstrate the superior performance of the proposed model, achieving accuracy rates of 96.45% and 99.37%, respectively. These results highlight the effectiveness of ViTWGAN as a high-performance solution for general intrusion detection systems. Full article
(This article belongs to the Special Issue Recent Advances in Cybersecurity)
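The abstract above relies on the Wasserstein distance as a training signal. As a rough illustration only (not the ViTWGAN critic, which is a learned network), the sketch below computes the 1-Wasserstein distance between two equal-size sets of 1D samples by matching sorted values. The function name `wasserstein_1d` and the toy sample lists are assumptions introduced for this example.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1D empirical samples.

    For equal-size samples on the real line, the optimal transport plan
    pairs the i-th smallest value of xs with the i-th smallest of ys.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)


# Hypothetical toy data: "real" samples have two modes (0 and 10).
real = [0.0, 10.0, 0.0, 10.0]
collapsed = [0.0, 0.0, 0.0, 0.0]   # generator output stuck on one mode
diverse = [10.0, 0.0, 10.0, 0.0]   # generator output covering both modes

dist_collapsed = wasserstein_1d(real, collapsed)
dist_diverse = wasserstein_1d(real, diverse)
```

In this toy setting the collapsed output stays far from the real samples while the diverse output has zero distance, which hints at why a Wasserstein-based objective penalizes a generator that ignores a mode.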
19 pages, 3480 KB  
Article
Adapting Vision–Language Models for Few-Shot Industrial Defect Detection
by Chayanon Sub-r-pa and Rung-Ching Chen
Algorithms 2026, 19(4), 259; https://doi.org/10.3390/a19040259 - 27 Mar 2026
Viewed by 1427
Abstract
Automated surface defect detection often faces a “cold-start” problem due to limited annotated data for new anomalies. Traditional object detectors struggle to converge in such few-shot settings. To address this, we adapt Vision–Language Models (VLMs), specifically YOLO-World. We use semantic pre-training to mitigate data scarcity. We evaluate this approach on the MVTec AD dataset in bounding-box format. We use a strict 1:9 train-validation split, resulting in an average of 11.8 defect instances per category. YOLO-World surpasses traditional baselines, like YOLOv11s and YOLOv26s, in 12 of 15 categories. The optimized VLM pipeline achieves up to 64.9% mAP@50 on texture-heavy categories, such as Tile, with only nine training instances. Ablation studies show standard optimization techniques are limited under 10-shot constraints. We find a critical augmentation divide. Disabling spatial distortions (Mosaic) is vital to preserving rigid-object geometry. The Normalized Wasserstein Distance (NWD) improves the localization of microscopic anomalies. Varifocal Loss (VFL) often causes model collapse. Ultimately, VLMs offer a superior foundation for cold-start inspection but require carefully tailored pipelines for robustness. Full article
25 pages, 3342 KB  
Article
A Novel Spectrum Recognition Model of Spatial Electromagnetic Anomalies Based on VAE-GANGP
by Bin Liu, Jiansheng Bai and Qiongyi Li
Electronics 2026, 15(5), 1062; https://doi.org/10.3390/electronics15051062 - 3 Mar 2026
Viewed by 515
Abstract
To address the issues of sample imbalance, unstable generation quality, and insufficient feature extraction in spectrum anomaly signal detection under complex electromagnetic environments, this paper proposes a VAE-GANGP identification model that integrates a Variational Autoencoder (VAE) with a Gradient Penalty-based Generative Adversarial Network (GAN-GP). First, the VAE is employed to encode the original spectrum, generating structured latent features that follow a standard normal distribution. This replaces the random noise input in traditional GANs, significantly enhancing the semantic consistency of generated samples and training stability. Second, an adversarial training mechanism based on Wasserstein distance with gradient penalty (WGAN-GP) is introduced, effectively mitigating mode collapse and gradient vanishing, thereby improving the model’s capability to fit complex signal distributions. Furthermore, a multi-objective optimization function combining reconstruction error and adversarial loss is constructed, establishing an end-to-end integrated framework for feature learning, signal reconstruction, and anomaly discrimination. Experiments are conducted using a synthetic dataset comprising various modulation types and simulated environments with different signal-to-noise ratios for systematic validation. The results demonstrate that the spectrum data generated by VAE-GANGP closely matches the distribution of real signals. Under AWGN-dominated synthetic test conditions, the model achieves an anomaly detection accuracy of 98.1%. When evaluated under more realistic channel impairments (phase noise, multipath, impulsive interference), the model maintains competitive performance, outperforming existing methods and demonstrating promising potential for practical electromagnetic spectrum monitoring. 
Its performance significantly surpasses traditional detection methods and single deep learning models, providing a highly reliable and adaptive solution for spatial electromagnetic spectrum anomaly detection. Full article
(This article belongs to the Section Artificial Intelligence)
18 pages, 4195 KB  
Article
WeldSimAM and EnNWD Co-Optimization: Enhancing Lightweight YOLOv11 for Multi-Scale Weld Defect Detection
by Wenquan Huang, Qing Cheng and Jing Zhu
Technologies 2026, 14(3), 140; https://doi.org/10.3390/technologies14030140 - 26 Feb 2026
Viewed by 687
Abstract
In the context of Industry 4.0, reliable automatic inspection of weld surface defects is critical for structural safety, yet current deep learning-based detectors struggle with the extreme scale variation and anisotropic shapes characteristic of weld flaws such as pores, cracks, and lack of fusion. Existing YOLO-family models, although effective on general-purpose datasets, often fail to robustly localize tiny defects and long, slender discontinuities while remaining lightweight enough for industrial edge deployment. A critical research gap lies in the lack of task-specific optimization for weld defects: standard attention mechanisms are isotropic and cannot capture linear defect continuity, while existing loss functions ignore scale disparity between tiny pores (area < 100 pixels²) and large incomplete fusion defects (area > 5000 pixels²), leading to unstable regression. Here, we propose a dual-optimized lightweight YOLOv11 framework tailored for weld defect detection that addresses both feature representation and bounding-box regression. First, we introduce WeldSimAM, an enhanced attention module that augments parameter-free SimAM with directional (horizontal/vertical) and channel-wise enhancement to better capture the directional texture of linear weld defects. Second, we develop an Enhanced Normalized Wasserstein Distance (EnNWD) loss, which incorporates scale-disparity penalties and relative-area-based weighting to mitigate sample imbalance and improve regression accuracy for tiny and large-aspect-ratio targets. Validated via 10-fold cross-validation on three datasets (self-built + two public), the method achieves 99.48% mAP@0.5 and 73.29% mAP@0.5:0.95, outperforming YOLOv11 by 0.13 and 3.76 percentage points (p < 0.01, two-tailed t-test), with 5.21 MB and 132 FPS on NVIDIA RTX 4090.
It also surpasses non-YOLO SOTA methods (e.g., EfficientDet-Lite3) by 3.8–5.5 percentage points in mAP@0.5 (p < 0.05), offering a practical real-time solution for industrial inspection. Full article
(This article belongs to the Section Manufacturing Technology)
19 pages, 6089 KB  
Article
Energy-Efficient Automated Detection of OPGW Features for Sustainable UAV-Based Inspection
by Xiaoling Yan, Wuxing Mao, Xiao Li, Ruiming Huang, Chi Ye, Faguang Li and Zheyu Fan
Sensors 2026, 26(2), 658; https://doi.org/10.3390/s26020658 - 19 Jan 2026
Viewed by 558
Abstract
Unmanned Aerial Vehicle (UAV)-based inspection is crucial for the maintenance and monitoring of high-voltage transmission lines, but detecting small objects in inspection images presents significant challenges, especially under complex backgrounds and varying lighting. These challenges are particularly evident when detecting the wire features of optical fiber composite overhead ground wire and conventional ground wires. Optical fiber composite overhead ground wire (OPGW) is a specialized cable designed to replace conventional shield wires on power utility towers. It contains one or more optical fibers housed in a protective tube, surrounded by layers of aluminum-clad steel and/or aluminum alloy wires, ensuring robust mechanical strength for grounding and high-bandwidth capabilities for remote sensing and control. Existing detection methods often struggle with low accuracy, insufficient performance, and high computational demands when dealing with small objects. To address these issues, this paper proposes an energy-efficient OPGW feature detection model for UAV-based inspection. The model incorporates a Feature Enhancement Module (FEM) to replace the C3K2 module in the sixth layer of the YOLO11 backbone, improving multi-scale feature extraction. A P2 shallow detection head is added to enhance the perception of small and edge features. Additionally, the traditional Intersection over Union (IoU) loss is replaced with the Normalized Wasserstein Distance (NWD) loss function, which improves boundary regression accuracy for small objects. Experimental results show that the proposed method achieves a mAP50 of 78.3% and mAP50:95 of 52.0%, surpassing the baseline by 2.3% and 1.1%, respectively. The proposed model offers the advantages of high detection accuracy and low computational resource requirements, providing a practical solution for sustainable UAV-based inspections. Full article
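To make the NWD loss mentioned above concrete, here is a minimal sketch of the commonly used formulation, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4). It is not taken from this paper's code; the function names and the normalizing constant `c = 12.8` are assumptions for illustration, and in practice `c` is tuned per dataset.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    `c` is a dataset-dependent normalizing constant; 12.8 is only a
    placeholder value assumed for this sketch.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians:
    # squared center offset plus squared half-width/half-height offsets.
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    # Map the distance into (0, 1]; identical boxes give exactly 1.
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(pred, target, c=12.8):
    """Regression loss: 0 for a perfect match, approaching 1 when far apart."""
    return 1.0 - nwd(pred, target, c)


# Two tiny boxes that do not overlap at all (IoU would be 0).
pred = (0.0, 0.0, 2.0, 2.0)
target = (10.0, 0.0, 2.0, 2.0)
loss = nwd_loss(pred, target)
```

Unlike an IoU-based loss, the value still changes smoothly as the non-overlapping prediction moves toward the target, which is the property usually cited for small objects.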
20 pages, 4373 KB  
Article
SO-YOLO11-CDP: An Instance Segmentation-Based Approach for Cross-Depth-of-Field Positioning Micro Image Sensor Modules in Precision Assembly
by Xi Lu, Juan Zhang, Yi Yang and Lie Bi
Electronics 2026, 15(2), 411; https://doi.org/10.3390/electronics15020411 - 16 Jan 2026
Viewed by 499
Abstract
During batch soldering and assembly of micro image sensor modules, the initial random pose and partial feature occlusion of the target micro-component image lead to missed and erroneous detections, while cross-depth-of-field detection errors in microscopic vision result in low 3D spatial positioning accuracy. This paper proposes Small object-YOLO11-Cross-Depth-of-field Positioning (SO-YOLO11-CDP), an instance segmentation-based approach for precise cross-depth-of-field positioning of micro-components. First, an improved Small object-YOLO11 (SO-YOLO11) image segmentation algorithm is designed. A coordinate attention mechanism (CA) is incorporated into the segmentation head to enhance localization of micro-targets, the backbone uses non-stride convolution to preserve fine-grained features, and target regression performance is boosted via Efficient-IoU (EIoU) loss combined with normalized Wasserstein distance (NWD). Subsequently, to further improve spatial position detection accuracy in cross-depth-of-field detection, a calibration error compensation model for the image Jacobian matrix is established based on pinhole imaging principles. Experimental results indicate that SO-YOLO11 achieves a 16.1% increase in precision, a 4.0% increase in recall, and a 9.9% increase in mean average precision (mAP@0.5) over the baseline YOLO11. Furthermore, it accomplishes spatial detection accuracy superior to 6.5 μm for target micro-components. The method presented in this paper holds significant engineering application value for high-precision spatial position detection of micro image sensor components. Full article
20 pages, 3283 KB  
Article
Small-Target Pest Detection Model Based on Dynamic Multi-Scale Feature Extraction and Dimensionally Selected Feature Fusion
by Junjie Li, Wu Le, Zhenhong Jia, Gang Zhou, Jiajia Wang, Guohong Chen, Yang Wang and Yani Guo
Appl. Sci. 2026, 16(2), 793; https://doi.org/10.3390/app16020793 - 13 Jan 2026
Cited by 1 | Viewed by 612
Abstract
Pest detection in the field is crucial for realizing smart agriculture. Deep learning-based target detection algorithms have become an important pest identification method due to their high detection accuracy, but existing methods still suffer from false and missed detections when detecting small-target pests, particularly against complex backgrounds. For this reason, this study improves on YOLO11 and proposes a new model called MSDS-YOLO for enhanced detection of small-target pests. First, a new dynamic multi-scale feature extraction module (C3k2_DMSFE) is introduced, which can be adaptively adjusted according to different input features and thus effectively capture multi-scale and diverse feature information. Next, a novel Dimensional Selective Feature Pyramid Network (DSFPN) is proposed, which employs adaptive feature selection and multi-dimensional fusion mechanisms to enhance small-target saliency. Finally, the ability to fit small targets is enhanced by adding 160 × 160 detection heads, removing 20 × 20 detection heads, and using Normalized Gaussian Wasserstein Distance (NWD) combined with CIoU as a position loss function to measure the prediction error. In addition, a real small-target pest dataset, Cottonpest2, is constructed for validating the proposed model. The experimental results show that a mAP50 of 86.7% was achieved on the self-constructed dataset Cottonpest2, an improvement of 3.0% over the baseline. At the same time, MSDS-YOLO achieves better detection accuracy than other YOLO models on public datasets. Model evaluation on these three datasets shows that the MSDS-YOLO model has excellent robustness and model generalization ability. Full article
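Several abstracts in this listing, including the one above, combine NWD with CIoU in the position loss without giving the exact weighting. The sketch below shows one plausible form, a fixed convex combination of the two loss terms. The `nwd_ratio` parameter, the constant `c = 12.8`, and all function names are assumptions for illustration and are not claimed to match any of the listed papers.

```python
import math


def to_corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def ciou(pred, target, eps=1e-9):
    """Complete IoU for two (cx, cy, w, h) boxes."""
    px1, py1, px2, py2 = to_corners(pred)
    tx1, ty1, tx2, ty2 = to_corners(target)
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pred[2] * pred[3] + target[2] * target[3] - inter
    iou = inter / (union + eps)
    # Squared center distance over squared diagonal of the enclosing box.
    rho2 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    c2 = ((max(px2, tx2) - min(px1, tx1)) ** 2
          + (max(py2, ty2) - min(py1, ty1)) ** 2)
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(target[2] / target[3])
                              - math.atan(pred[2] / pred[3])) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return iou - rho2 / (c2 + eps) - alpha * v


def nwd(pred, target, c=12.8):
    """Normalized Wasserstein distance; c is an assumed placeholder constant."""
    w2_sq = ((pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
             + ((pred[2] - target[2]) / 2) ** 2
             + ((pred[3] - target[3]) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)


def box_loss(pred, target, nwd_ratio=0.5, c=12.8):
    """Hypothetical blend: nwd_ratio weights NWD, the remainder weights CIoU."""
    loss_nwd = 1.0 - nwd(pred, target, c)
    loss_ciou = 1.0 - ciou(pred, target)
    return nwd_ratio * loss_nwd + (1.0 - nwd_ratio) * loss_ciou
```

A fixed ratio is the simplest choice; an adaptive scheme would replace `nwd_ratio` with a value computed per box, for example from the target's area, but the form of such a rule is paper-specific.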
21 pages, 5664 KB  
Article
M2S-YOLOv8: Multi-Scale and Asymmetry-Aware Ship Detection for Marine Environments
by Peizheng Li, Dayong Qiao, Jianyi Mu and Linlin Qi
Sensors 2026, 26(2), 502; https://doi.org/10.3390/s26020502 - 12 Jan 2026
Viewed by 597
Abstract
Ship detection serves as a core foundational task for marine environmental perception. However, in real marine scenarios, dense vessel traffic often causes severe target occlusion while multi-scale targets, asymmetric vessel geometries, and harsh conditions (e.g., haze, low illumination) further degrade image quality. These factors pose significant challenges to vision-based ship detection methods. To address these issues, we propose M2S-YOLOv8, an improved framework based on YOLOv8, which integrates three key enhancements: First, a Multi-Scale Asymmetry-aware Parallelized Patch-wise Attention (MSA-PPA) module is designed in the backbone to strengthen the perception of multi-scale and geometrically asymmetric vessel targets. Second, a Deformable Convolutional Upsampling (DCNUpsample) operator is introduced in the Neck network to enable adaptive feature fusion with high computational efficiency. Third, a Wasserstein-Distance-Based Weighted Normalized CIoU (WA-CIoU) loss function is developed to alleviate gradient imbalance in small-target regression, thereby improving localization stability. Experimental results on the Unmanned Vessel Zhoushan Perception Dataset (UZPD) and the open-source Singapore Maritime Dataset (SMD) demonstrate that M2S-YOLOv8 achieves a balanced performance between lightweight design and real-time inference, showcasing strong potential for reliable deployment on edge devices of unmanned marine platforms. Full article
(This article belongs to the Section Environmental Sensing)
22 pages, 4804 KB  
Article
SER-YOLOv8: An Early Forest Fire Detection Model Integrating Multi-Path Attention and NWD
by Juan Liu, Jiaxin Feng, Shujie Wang, Yian Ding, Jianghua Guo, Yuhang Li, Wenxuan Xue and Jie Hu
Forests 2026, 17(1), 93; https://doi.org/10.3390/f17010093 - 10 Jan 2026
Viewed by 527
Abstract
Forest ecosystems, as vital natural resources, are increasingly endangered by wildfires. Effective forest fire management relies on the accurate and early detection of small-scale flames and smoke. However, the complex and dynamic forest environment, along with the small size and irregular shape of early fire indicators, poses significant challenges to reliable early warning systems. To address these issues, this paper introduces SER-YOLOv8, an enhanced detection model based on the YOLOv8 architecture. The model incorporates the RepNCSPELAN4 module and an SPPELAN structure to strengthen multi-scale feature representation. Furthermore, to improve small target localization, the Normalized Wasserstein Distance (NWD) loss is adopted, providing a more robust similarity measure than traditional IoU-based losses. The newly designed SERDet module deeply integrates a multi-scale feature extraction mechanism with a multi-path fused attention mechanism, significantly enhancing the recognition capability for flame targets under complex backgrounds. Depthwise separable convolution (DWConv) is utilized to reduce parameters and boost inference efficiency. Experiments on the M4SFWD dataset show that the proposed method improves mAP50 by 1.2% for flames and 2.4% for smoke, with a 1.5% overall gain in mAP50–95 over the baseline YOLOv8, outperforming existing mainstream models and offering a reliable solution for forest fire prevention. Full article
(This article belongs to the Section Natural Hazards and Risk Management)
22 pages, 5599 KB  
Article
An Adaptive State-Space Convolutional Fusion Network for High-Precision Pest Detection in Smart Agarwood Cultivation
by Zhijie Luo, Rui Chen, Shaoxin Li and Jianjun Guo
Mathematics 2025, 13(24), 3937; https://doi.org/10.3390/math13243937 - 10 Dec 2025
Viewed by 514
Abstract
The sustainable cultivation of agarwood, a high-value tree species, is significantly threatened by foliar pests, requiring efficient and accurate monitoring solutions. While deep learning is widely used, mainstream models face inherent limitations: Convolutional Neural Networks have restricted receptive fields and Transformers incur high computational complexity, complicating the balance of accuracy and efficiency for tiny pest detection in complex environments. To address these challenges, a novel Adaptive State-space Convolutional Fusion Network (ASCNet) is proposed. Its core component, the Adaptive State-space Convolutional Fusion Block (ASBlock), integrates the global context modeling of state-space models—which have linear complexity—with the local feature extraction of convolutional networks through a dual-path adaptive fusion mechanism. A Grouped Spatial Shuffle Downsampling (GSD) module replaces standard strided convolutions to preserve fine-grained spatial details during downsampling. For small object detection, a Normalized Wasserstein Distance (NWD)-based loss function mitigates the sensitivity of traditional IoU to minor localization errors. Evaluations on a new agarwood pest dataset show that ASCNet outperforms state-of-the-art detectors (including the YOLO series, RT-DETR, and Gold-YOLO), achieving a maximum mAP@50 of 93.0 ± 0.2% and mAP@50:95 of 71.2 ± 0.3% with high computational efficiency. The results confirm ASCNet as a robust and effective solution for intelligent pest monitoring in high-value crops like agarwood. Full article
(This article belongs to the Special Issue Deep Learning and Adaptive Control, 4th Edition)
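The claim above that IoU is sensitive to minor localization errors on small objects can be checked with a short worked example. The sketch applies the same 2-pixel diagonal shift to a 6 × 6 box and to a 60 × 60 box; the box sizes and the shift are arbitrary values chosen for illustration.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def shifted(box, dx, dy):
    """Return the box translated by (dx, dy)."""
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


small = (0.0, 0.0, 6.0, 6.0)     # tiny object, 6 x 6 pixels
large = (0.0, 0.0, 60.0, 60.0)   # ordinary object, 60 x 60 pixels

# Identical 2-pixel localization error applied to both predictions.
iou_small = iou(small, shifted(small, 2.0, 2.0))
iou_large = iou(large, shifted(large, 2.0, 2.0))
```

Here `iou_small` is 16/56 (about 0.29) while `iou_large` is 3364/3836 (about 0.88), so with a typical 0.5 matching threshold the same error turns the small prediction into a miss and leaves the large one a clear hit.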
18 pages, 11243 KB  
Article
TCSN-YOLO: A Small-Target Object Detection Method for Fire Smoke
by Cao Yang, Zhou Jun, Wen Hongyuan and Wang Gang
Fire 2025, 8(12), 466; https://doi.org/10.3390/fire8120466 - 29 Nov 2025
Cited by 1 | Viewed by 1579
Abstract
Forest fires continue to pose a significant threat to public and personal safety. Detecting smoke in its early stages or when it is distant from the camera is challenging because it appears in only a small region of the captured images. This paper proposes a small-scale smoke detection algorithm called TCSN-YOLO to address these challenges. First, it introduces a novel feature fusion module called trident fusion (TF), which is incorporated into the neck of the model. TF significantly enhances small target smoke recognition. Additionally, to obtain global contextual information with high computational efficiency, we propose a Cross Attention Mechanism (CAM). CAM captures diverse smoke features by assigning attention weights in both horizontal and vertical directions. Furthermore, we suggest using SoftPool to preserve more detailed information in the feature map. The Normalized Wasserstein Distance (NWD) metric is embedded into the loss function of our detector to distinguish positive and negative samples under the same threshold. Finally, we evaluate the proposed model using the AI For Humankind dataset and the FlgLib dataset. The experimental results demonstrate that our method achieves 37.1% APs, 90.3% AP50, 40.4% AP50:95, 45.34 M Params and 170.5 G FLOPs. Full article
28 pages, 5550 KB  
Article
RMH-YOLO: A Refined Multi-Scale Architecture for Small-Target Detection in UAV Aerial Imagery
by Fan Yang, Min He, Jiuxian Liu and Haochen Jin
Sensors 2025, 25(22), 7088; https://doi.org/10.3390/s25227088 - 20 Nov 2025
Cited by 3 | Viewed by 1228
Abstract
Unmanned aerial vehicle (UAV) vision systems have been widely deployed for aerial monitoring applications, yet small-target detection in UAV imagery remains a significant challenge due to minimal pixel representation, substantial scale variations, complex background interference, and varying illumination conditions. Existing object detection algorithms struggle to maintain high accuracy when processing small targets with fewer than 32 × 32 pixels in UAV-captured scenes, particularly in complex environments where target-background confusion is prevalent. To address these limitations, this study proposes RMH-YOLO, a refined multi-scale architecture. The model incorporates four key innovations: a Refined Feature Module (RFM) that fuses channel and spatial attention mechanisms to enhance weak feature representation of small targets while maintaining contextual integrity; a Multi-scale Focus-and-Diffuse (MFFD) network that employs a focus-diffuse transmission pathway to preserve fine-grained spatial details from high-resolution layers and propagate them to semantic features; an efficient CS-Head detection architecture that utilizes parameter-sharing convolution to enable efficient processing on embedded platforms; and an optimized loss function combining Normalized Wasserstein Distance (NWD) with InnerCIoU to improve localization accuracy for small targets. Experimental validation on the VisDrone2019 dataset demonstrates that RMH-YOLO achieves a precision and recall of 53.0% and 40.4%, representing improvements of 8.8% and 7.4% over the YOLOv8n baseline. The proposed method attains mAP50 and mAP50:95 of 42.4% and 25.7%, corresponding to enhancements of 9.2% and 6.4%, respectively, while maintaining computational efficiency with only 1.3 M parameters and 16.7 G FLOPs. 
Experimental results confirm that RMH-YOLO effectively improves small-target detection accuracy while maintaining computational efficiency, demonstrating its broad application potential in diverse UAV aerial monitoring scenarios. Full article