Search Results (515)

Search Parameters:
Keywords = small target detection layer

31 pages, 13534 KB  
Article
CSFADet: Dual-Modal Anti-UAV Detection via Cross-Spectral Feature Alignment and Adaptive Multi-Scale Refinement
by Heqin Yuan and Yuheng Li
Algorithms 2026, 19(4), 254; https://doi.org/10.3390/a19040254 - 26 Mar 2026
Abstract
Anti-unmanned aerial vehicle (Anti-UAV) detection is critical for airspace security, yet existing single-modality approaches suffer from severe performance degradation under adverse illumination, thermal crossover, and extreme scale variation. In this paper, we propose CSFADet, a dual-modal detection framework that jointly exploits visible and infrared imagery through four tightly integrated modules. First, a Cross-Spectral Feature Alignment (CSFA) module performs early-stage spectral calibration by computing cross-modal query–value attention maps, generating modality-aware channel descriptors that re-weight and concatenate the two spectral streams. Second, a Dual-path Texture Enhancement Module (DTEM) enriches fine-grained spatial details via cascaded convolutions with residual connections. Third, a Dual-path Cross-Attention Module (DCAM) introduces a feature-shrinking token generation strategy followed by symmetric cross-attention branches with learnable scaling factors, Squeeze-and-Excitation recalibration, and a 1×1 convolution fusion head, enabling deep bidirectional interaction between modalities. Fourth, a Dual-path Information Refinement Module (DIRM) embeds Adaptive Residual Groups (ARGs) that cascade Multi-modal Spatial Attention Blocks (MSABs) with channel and dynamic spatial attention, culminating in a Multi-scale Scale-aware Fusion Refinement (MSFR) unit that employs three parallel multi-head attention branches with a Scale Reasoning Gate and Channel Fusion Layer to produce scale-discriminative enhanced features. Experiments on the public Anti-UAV300 benchmark show that CSFADet achieves 91.4% mAP@0.5 and 58.7% mAP@0.5:0.95, surpassing fifteen representative detectors spanning single-stage, two-stage, YOLO-family, and Transformer-based categories. Ablation studies confirm the complementary contributions of each module, and heatmap visualizations verify the model’s capacity to focus on small, distant UAV targets under challenging conditions. Full article
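The re-weight-and-concatenate step that CSFA applies to the two spectral streams can be sketched as follows. This is a heavily simplified, illustrative stand-in: a global-pooled channel descriptor with a sigmoid gate replaces the paper's query–value attention maps, and all function names are invented for the sketch.

```python
import numpy as np

def csfa_fuse(vis, ir):
    """Toy cross-spectral fusion: each modality's global channel
    descriptor gates the *other* stream before concatenation.
    vis, ir: (C, H, W) feature maps."""
    def gate(x):
        d = x.mean(axis=(1, 2))            # channel descriptor, shape (C,)
        return 1.0 / (1.0 + np.exp(-d))    # sigmoid -> weights in (0, 1)

    vis_w = gate(ir)[:, None, None] * vis  # IR descriptor re-weights visible
    ir_w = gate(vis)[:, None, None] * ir   # visible descriptor re-weights IR
    return np.concatenate([vis_w, ir_w], axis=0)  # stack to (2C, H, W)

fused = csfa_fuse(np.random.rand(8, 16, 16), np.random.rand(8, 16, 16))
print(fused.shape)  # (16, 16, 16)
```

The cross-gating (each stream weighted by the other modality's descriptor) is what makes the fusion modality-aware rather than a plain channel concatenation.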

26 pages, 7929 KB  
Article
FirePM-YOLO: Position-Enhanced Mamba for YOLO-Based Fire Rescue Object Detection from UAV Perspectives
by Qingyu Xu, Runtong Zhang, Zihuan Qiu and Fanman Meng
Sensors 2026, 26(7), 2064; https://doi.org/10.3390/s26072064 - 26 Mar 2026
Abstract
Object detection in UAV-based fire rescue scenarios faces multiple challenges, including densely distributed small targets, severe occlusion, and interference from smoke and flames. Existing mainstream detection models, such as the YOLO series, often prioritize inference speed at the expense of modeling global context and spatial positional information, resulting in limited performance in such complex environments. To address these limitations, this paper proposes FirePM-YOLO, an object detection architecture optimized for fire rescue applications. Based on the YOLO framework, the proposed model introduces two key innovations: first, a Position-Aware Enhanced Mamba module (PEMamba) is designed, which incorporates a compact positional encoding mechanism, lightweight spatial enhancement, and an adaptive feature fusion strategy to significantly improve scene perception while maintaining computational efficiency. Second, a PEMBottleneck structure is constructed, which dynamically balances local convolutional features and global PEMamba features via learnable weights. This module is embedded into the shallow layers of the backbone network, forming an enhanced PEM-C3K2 module that captures long-range dependencies with linear complexity while preserving fine local details, thereby enabling holistic contextual understanding of fireground environments. Experimental results on the self-built “FireRescue” dataset demonstrate that compared with the original YOLOv12 and other mainstream detectors, the proposed model achieves improvements in both mean average precision (mAP) and recall while maintaining real-time inference capability. Notably, it exhibits superior detection performance on challenging samples, such as small-scale and partially occluded professional firefighting vehicles. Full article
(This article belongs to the Section Remote Sensors)
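The learnable balance between local convolutional features and global PEMamba features in PEMBottleneck reduces, in its simplest form, to a convex combination controlled by a trained scalar. A minimal sketch (the scalar `alpha` and element-wise mixing are assumptions; the real module operates on full tensors):

```python
import math

def blend(local_feat, global_feat, alpha):
    """Convex mix of local and global features; alpha is a learnable
    scalar (trained elsewhere) squashed by a sigmoid so the two
    weights stay positive and sum to 1."""
    w = 1.0 / (1.0 + math.exp(-alpha))
    return [w * l + (1.0 - w) * g for l, g in zip(local_feat, global_feat)]

print(blend([1.0, 0.0], [0.0, 1.0], 0.0))  # alpha = 0 -> even 0.5/0.5 mix
```

During training, gradients push `alpha` toward whichever feature source is more useful for the fireground scenes at that depth of the backbone.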

26 pages, 7095 KB  
Article
CB-DETR: Symmetry-Guided Density-Adaptive Attention and Posterior Dynamic Query Decoding for Remote Sensing Target Detection
by Xiaodong Zhang, Jiahui Xue and Shengye Zhao
Symmetry 2026, 18(4), 561; https://doi.org/10.3390/sym18040561 - 25 Mar 2026
Abstract
Remote sensing object detection is severely hindered by background clutter and uneven object spatial distribution, limiting the performance of traditional algorithms and the original RT-DETR. To address these issues, this paper proposes an improved RT-DETR-based algorithm, CB-DETR. First, a symmetry-guided Density-Adaptive Attention (DAA) module is designed to tackle insufficient intra-scale feature interaction and poor adaptability to uneven density regions in RT-DETR. Centered on a density estimation network, it predicts target density, generates normalized weights via temperature scaling and softmax, and dynamically adjusts receptive fields through a multi-branch structure to symmetrically adapt to high- and low-density regions, outperforming RT-DETR’s fixed receptive field design. Second, a cross-attention-fused Posterior Dynamic Query Decoder (PDQD) is constructed to overcome fixed query interaction and weak small/occluded object detection in the original decoder. A dynamic query update mechanism optimizes vectors via multi-round iterations, breaking fixed-layer limitations and mining detailed features in complex scenarios, thus improving small/occluded target detection accuracy. Comparative experiments on RSOD, DIOR, and DOTA datasets show that CB-DETR outperforms the original RT-DETR comprehensively: mAP50/mAP50:95 improve by 2.8%/2.1% and Precision (P)/Recall (R) by 4%/2.4% on RSOD; mAP50 improves by 1.3% on DIOR and 3% on DOTA. All core metrics surpass the original model and mainstream improved algorithms, verifying the effectiveness and innovation of the proposed improvements. Full article
(This article belongs to the Special Issue Symmetry-Aware Methods in Image Processing and Computer Vision)
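The DAA weighting pipeline — predicted density, temperature scaling, softmax, normalized branch weights — can be sketched as below. The mapping from density to branch logits is purely hypothetical; only the temperature-softmax normalization follows the abstract.

```python
import math

def branch_weights(density, temperature=0.5):
    """Temperature-scaled softmax over per-branch logits derived from a
    predicted density score; each branch stands for one receptive-field
    size in the multi-branch structure."""
    # Hypothetical logit mapping: high density favors the small-RF branch.
    logits = [density, 1.0 - 2.0 * abs(density - 0.5), 1.0 - density]
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

w = branch_weights(0.9)  # dense region: small-receptive-field branch dominates
```

Lowering the temperature sharpens the selection toward one branch; raising it blends all receptive fields more evenly.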

28 pages, 14283 KB  
Article
FSD-YOLO: A Fusion Framework for Region Segmentation and Deformable Object Detection in Container Yards
by Linghao Dai, Zhihong Liang, Qi Feng, Shihuan Xie and Hongxu Li
Sensors 2026, 26(7), 2029; https://doi.org/10.3390/s26072029 - 24 Mar 2026
Abstract
Safety monitoring in container hoisting operations within rail-road intermodal logistics parks is a critical task in industrial safety management. Such scenarios are characterized by complex environments, large variations in target scales, deformable object shapes, and frequent occlusions, which pose significant challenges to visual perception systems. Conventional single-task models suffer from inherent limitations in handling low recall rates for distant small targets and insufficient adaptability to geometric deformations, making them inadequate for high-precision, real-time safety warning applications. To address these challenges, this study proposes a unified visual analysis framework that integrates semantic segmentation and object detection to enhance the recognition performance of small and deformable targets in complex operational environments, enabling real-time perception and safety warning of key objects and hazardous regions within container yards. Specifically, we introduce FSD-YOLO, a fusion-based architecture composed of the following key components. First, a SegFormer-based semantic segmentation module is employed to achieve pixel-level delineation of different operational regions. Second, an improved object detection network is developed based on the YOLOv8n architecture, incorporating: (1) the integration of C2f modules in the shallow layers of the backbone to enhance high-resolution feature extraction; (2) the embedding of C2fDCN modules within the detection head to improve modeling capability for deformable objects via deformable convolution; (3) the adoption of CARAFE upsampling operators to optimize multi-scale feature fusion; and (4) a dynamic loss-weighting strategy for small objects, where loss weights are adaptively adjusted according to target area to increase training emphasis on small-scale targets. 
Finally, a decision-level fusion strategy is applied to combine segmentation and detection outputs, enabling real-time safety judgment based on semantic rules. Experimental results on a self-constructed container yard dataset demonstrate that the proposed detection model achieves an mAP50-95 of 0.6433 and an mAP50 of 0.9565, significantly outperforming the baseline YOLOv8n model (mAP50-95: 0.5394, mAP50: 0.8435), thereby validating the effectiveness of the proposed framework. Full article
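The dynamic loss-weighting strategy — loss weights adjusted according to target area — can be illustrated with a toy weighting function. The exact schedule below (square-root decay toward 1.0, cap `w_max`) is an assumption; only the idea of up-weighting small boxes comes from the abstract.

```python
import math

def small_object_weight(box_area, img_area, w_max=3.0):
    """Loss weight that rises toward w_max as the box shrinks relative
    to the image and decays to exactly 1.0 for a full-image box."""
    frac = min(max(box_area / img_area, 0.0), 1.0)
    return 1.0 + (w_max - 1.0) * (1.0 - math.sqrt(frac))

# A 16x16 target in a 640x640 frame gets close to the 3x cap,
# while large containers are trained with near-unit weight.
```

Multiplying each box's regression and classification loss by this weight shifts gradient mass toward distant, small-scale targets without changing the loss formulation itself.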

26 pages, 4177 KB  
Article
PPM-YOLOv11: Improved YOLOv11n-Based Algorithm for Small-Object Detection in Aerial Images
by Yuheng Yang, Haiying Zhang and Xiaoya Wang
Sensors 2026, 26(7), 2030; https://doi.org/10.3390/s26072030 - 24 Mar 2026
Abstract
To address the challenges in drone aerial image target detection—including the loss of critical information on small objects during multiple subsampling operations, the disappearance of minute target features, and insufficient detection accuracy due to dense occlusion interference—we propose PPM-YOLOv11, an improved target detection algorithm based on YOLOv11n. The C3K2_PPA module integrates parallelized patch-aware attention with the C3K2 backbone network to better preserve critical information on small objects. A multi-scale detection head P2 specifically designed for detecting ultra-small objects ranging from 4 × 4 to 8 × 8 pixels is introduced. A high-resolution feature layer is added to the neck network to enhance detection accuracy with respect to ultra-small objects from a drone’s perspective. Adding the MultiSEAM module to the neck network enhances detection of occluded small objects by amplifying feature responses in unobstructed regions and compensating for information loss in occluded areas. Experiments on VisDrone2019 and SIMD datasets demonstrate our algorithm achieves a 40.9% mAP50 on VisDrone2019, surpassing the baseline YOLOv11n by 9.3 percentage points. On the SIMD dataset, the mAP50 reached 82.0%, surpassing the baseline network by 3.9 percentage points. Full article
(This article belongs to the Section Sensing and Imaging)
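The case for an extra P2 head on 4×4-to-8×8 pixel objects is a matter of stride arithmetic: a target must span at least about one feature-map cell to survive downsampling. A back-of-envelope sketch:

```python
def cells_covered(obj_px, stride):
    """Feature-map cells an object of obj_px pixels spans at a given
    pyramid stride; well below one cell, the object effectively vanishes."""
    return obj_px / stride

# A 6 px drone-view target spans 1.5 cells on a stride-4 P2 map
# but only 0.75 cells on the usual stride-8 P3 map.
print(cells_covered(6, 4), cells_covered(6, 8))  # 1.5 0.75
```

This is why the added high-resolution feature layer in the neck pays off for ultra-small objects despite its memory cost.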

27 pages, 4488 KB  
Article
AL-YOLOv8: A Small Object Detection Algorithm for Remote Sensing Images Based on an Improved YOLOv8s
by Feng Zhang, Chuanzhao Tian, Xuewen Li, Na Yang and Yanting Zhang
Sensors 2026, 26(7), 2016; https://doi.org/10.3390/s26072016 - 24 Mar 2026
Abstract
To address false detections in small object detection within remote sensing imagery caused by complex backgrounds and minute target sizes, we propose an enhanced YOLOv8s detection algorithm, named AL-YOLOv8. The detection head is designed based on Adaptive Spatial Feature Fusion (ASFF) to resolve issues where shallow-level detail features of small remote sensing targets are easily disrupted by backgrounds, while deep-level semantic features lack sufficient representation. We embed Large-Kernel Separate Attention (LSKA) in the deep feature layer to expand the receptive field and enhance the response intensity of small target features. Additionally, an IFIoU loss function is introduced by combining the dynamic attention mechanism from FocalerIoU with InnerIoU, mitigating regression bias for small target bounding boxes and improving small target localization accuracy. On the DIOR, RSOD, and NWPU VHR-10 datasets, the AL-YOLOv8 model achieves precision rates of 91.5%, 94.2%, and 91.8%, respectively, with mAP@0.5 scores of 89.8%, 96.9%, and 92.2%. These results demonstrate consistent improvements over YOLOv8s and show that AL-YOLOv8 effectively reduces false detections and enhances detection accuracy for small object detection in remote sensing applications. Full article
(This article belongs to the Section Intelligent Sensors)
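The IFIoU combination described above — Focaler-style dynamic re-weighting on top of Inner-IoU — can be sketched as follows. The shrink ratio and remap interval are illustrative defaults; the paper's exact composition may differ.

```python
def iou(b1, b2):
    """Plain IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter + 1e-9)

def shrink(b, ratio):
    """Inner-IoU auxiliary box: same center, sides scaled by ratio."""
    cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    w, h = (b[2] - b[0]) * ratio, (b[3] - b[1]) * ratio
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def if_iou(pred, gt, ratio=0.8, d=0.0, u=0.95):
    """IoU on ratio-scaled 'inner' boxes, then linearly remapped over
    [d, u] so easy samples saturate and hard samples keep gradient."""
    inner = iou(shrink(pred, ratio), shrink(gt, ratio))
    return min(1.0, max(0.0, (inner - d) / (u - d)))
```

The regression loss would then be `1 - if_iou(pred, gt)`, with the inner boxes sharpening gradients for small targets whose plain IoU is insensitive to sub-pixel shifts.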

22 pages, 4595 KB  
Article
Toward Real-Time Industrial Small Object Inspection: Decoupled Attention and Multi-Scale Aggregation for PCB Defect Detection
by Yuting Wang, Bingyang Guo, Liming Sun and Ruiyun Yu
Electronics 2026, 15(6), 1191; https://doi.org/10.3390/electronics15061191 - 12 Mar 2026
Abstract
PCB surface defect detection plays a critical role in ensuring electronics manufacturing quality. To address the challenges of small target defect detection, this study proposes PCB-YOLO, an enhanced lightweight detector based on YOLOv8n. PCB-YOLO introduces three key improvements. First, a RepViT-EMA Fusion Architecture (REFA) module is designed for deep backbone layers to strengthen feature extraction while suppressing background interference from complex circuit patterns. Second, a Multi-Scale Grouped Aggregation (MSGA) module is developed to reduce feature redundancy and improve spatial-semantic information extraction for multi-scale defects. Third, a Pixel-level Intersection over Union (PIoU) loss function is proposed to enable pixel-level IoU calculation with enhanced angular and area constraints for more precise localization. Extensive experiments on the PKU-Market-PCB dataset demonstrate that PCB-YOLO achieves 98.4% mAP@0.5, 97.4% recall, and 96.1% precision with only 2.4 M parameters, 6.9 G FLOPs, and an inference speed of 224 FPS, outperforming multiple state-of-the-art methods while maintaining real-time capability. Additional experiments on the DeepPCB dataset yield 99.0% mAP@0.5 and 80.4% mAP@0.5:0.95, confirming the cross-dataset generalization ability of the proposed method. Full article

27 pages, 9169 KB  
Article
S2D-Net: A Synergistic Star-Attentive Network with Dynamic Feature Refinement for Robust Inshore SAR Ship Detection
by Shentao Wang, Byung-Won Min, Guoru Li, Depeng Gao, Jianlin Qiu and Yue Hong
Electronics 2026, 15(6), 1160; https://doi.org/10.3390/electronics15061160 - 11 Mar 2026
Abstract
Detecting ships using Synthetic Aperture Radar (SAR) in coastal areas is still difficult due to coherent speckle noise from the ocean surface, complex land clutter, and multi-scale target representations in the radar imagery. Most existing ship detection algorithms lose important target features during downsampling and have difficulty recovering those features through upsampling, resulting in a high number of false and missed detections. In this work, we present a new ship detection algorithm called Synergistic Star-Attentive Network with Dynamic Feature Refinement (S2D-Net). First, we create a new backbone called Multi-scale PCCA-StarNet to generate robust feature representations. Within the backbone, we implement a Progressive Channel-Coordinate Attention (PCCA) mechanism that creates a synergy between global channel filtering and adaptive coordinate locking to decouple ship textures from granular speckle noise. Second, we create a Dynamic Feature Refinement Neck. We adopt a content-aware dynamic upsampler, DySample, in place of conventional interpolation to improve the fidelity of upsampled features of small targets. Further, we design a Star-PCCA Feature Aggregation module that fuses features across multiple scales. Using star operations and the PCCA mechanism, this module refines semantic features and removes background clutter during aggregation. Third, we develop a Lightweight Shared Convolutional Detection Head with Quality Estimation (LSCD-LQE). The LSCD-LQE decreases parameter redundancy by using shared convolutional layers and adds a localization quality estimation branch. It thereby effectively reduces false positive detections by aligning classification scores with localization quality, measured by Intersection over Union (IoU), in difficult coastal environments. 
Our experimental results on the SSDD and HRSID datasets show that S2D-Net compares favorably with representative ship detection algorithms. In particular, on the challenging HRSID inshore subset, the proposed method achieves a mean average precision (mAP) of 82.7%, which is 6.9% higher than the YOLOv11n baseline. These results demonstrate that S2D-Net is superior at detecting small coastal vessels and at mitigating the detrimental effects of complex nearshore environments on SAR ship detection performance. Full article
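The quality-estimation idea behind LSCD-LQE — aligning classification scores with predicted localization quality — is commonly realized by multiplying the two at inference time. A one-line sketch (the multiplicative form is an assumption; the paper may use a different calibration):

```python
def calibrated_score(cls_score, iou_pred):
    """Quality-aligned confidence: a confidently classified but poorly
    localized box (high cls, low predicted IoU) ranks below a
    well-localized one, cutting false positives near cluttered shores."""
    return cls_score * iou_pred

# e.g. calibrated_score(0.9, 0.3) < calibrated_score(0.7, 0.8)
```

Under non-maximum suppression, this re-ranking is what actually removes the high-confidence, badly localized duplicates.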

21 pages, 4810 KB  
Article
Target Detection of Trellised Watermelons in Complex Agricultural Scenes Based on Improved RT-DETR
by Weichen Yan, Huixing Qu, Shaowei Wang, Huawei Yang, Yongbing Hao and Guohai Zhang
Horticulturae 2026, 12(3), 333; https://doi.org/10.3390/horticulturae12030333 - 10 Mar 2026
Abstract
To address the problems of severe fruit occlusion, large variations in target scale, and many small-scale targets being missed in the recognition of trellised watermelons under complex agricultural scenarios, this study proposes an improved RT-DETR-based detection model, termed RT-DETR-Watermelon. A context-guided (CG) module is embedded into the backbone network. A dedicated P2 detection layer is added to enhance the model’s sensitivity to small objects. A scale sequence feature fusion (SSFF) module and a triple feature encoder (TFE) module are introduced to improve the model’s capability to detect targets at multiple scales. The original bounding box regression loss is replaced with the MPDIoU (Minimum Point Distance Intersection over Union) loss, which accelerates model convergence and improves localization precision. Finally, the number of channels is adjusted to reduce parameter count, computational complexity, and storage size. The experimental results show that, compared with the original RT-DETR model, the proposed RT-DETR-Watermelon model increases precision, recall, and mean Average Precision (mAP@0.5) by 0.4, 1.8, and 1.0 percentage points, while reducing the number of parameters, computational cost, and model size by 53.5%, 23.5%, and 53.2%, respectively. Full article
(This article belongs to the Special Issue A New Wave of Smart and Mechanized Techniques in Horticulture)
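For reference, MPDIoU as commonly formulated augments plain IoU with the normalized squared distances between corresponding box corners; minimizing 1 − MPDIoU simultaneously tightens overlap and corner alignment. A sketch (helper names are ours):

```python
def iou(b1, b2):
    """Plain IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter + 1e-9)

def mpdiou(pred, gt, img_w, img_h):
    """IoU penalized by the normalized squared distances between the
    boxes' top-left corners (d1) and bottom-right corners (d2),
    with the image diagonal as the normalizer."""
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou(pred, gt) - d1 / norm - d2 / norm
```

Unlike plain IoU, the corner-distance terms keep a useful gradient even when predicted and ground-truth boxes do not overlap at all, which speeds up convergence on occluded fruit.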

19 pages, 4538 KB  
Article
YOLO-EGASF: A Small-Target Detection Algorithm for Surface Residual Film in UAV Imagery of Arid-Region Cotton Fields
by Xiao Yang, Ji Shi, Kailin Yang, Xiaoqing Lian, Shufeng Zhang, Hongbiao Wang and Zheng Li
AgriEngineering 2026, 8(3), 106; https://doi.org/10.3390/agriengineering8030106 - 10 Mar 2026
Abstract
Mulch-film covering technology has been widely adopted in cotton production in arid regions; however, the associated problem of residual-film pollution has become increasingly prominent, creating an urgent demand for efficient and accurate monitoring approaches. Owing to the small target scale, irregular morphology, blurred boundaries, and complex soil backgrounds of residual-film fragments, residual-film detection based on close-range UAV imagery remains a challenging task. To address these issues, this study proposes an improved algorithm, termed YOLO-EGASF, for residual-film detection in arid-region cotton fields, built upon the lightweight YOLOv11n framework. To enhance the detection of small targets with weak boundary characteristics, the baseline model is improved from three aspects. First, a boundary-enhanced multi-branch small-target extraction module (EMSE) is designed to reinforce shallow-layer details and gradient information through multi-scale convolution and explicit edge enhancement. Second, a GLoCA attention module that integrates global coordinate information with local geometric features is constructed to improve the discriminative capability of the model for residual-film targets under complex background conditions. Third, an ASF-layer multi-scale feature fusion structure is introduced, together with an additional small-target detection layer, to strengthen the participation of high-resolution features in cross-scale fusion and prediction. Experimental results on a self-constructed UAV-based residual-film dataset from cotton fields demonstrate that YOLO-EGASF outperforms several mainstream detection models in terms of Precision, Recall, mAP@0.5, and mAP@0.5:0.95, achieving mAP@0.5 and mAP@0.5:0.95 values of 71.9% and 26.8%, respectively. 
These results indicate a significant improvement in detection accuracy and robustness, confirming that the proposed method can effectively meet the practical requirements of fine-grained residual-film monitoring in arid-region cotton fields. Full article
(This article belongs to the Special Issue Applications of Computer Vision in Agriculture)

21 pages, 472 KB  
Article
Efficient CNN–GRU Transfer Learning for Edge IoT Intrusion Detection
by Amjad Gamlo, Sanaa Sharaf and Rania Molla
Electronics 2026, 15(5), 981; https://doi.org/10.3390/electronics15050981 - 27 Feb 2026
Abstract
Intrusion detection in Internet of Things (IoT) environments is challenged by severe class imbalance, evolving attack patterns, and the limited computational resources of edge devices. To address these challenges, this paper proposes a lightweight transfer-learning framework based on a combined architecture of Convolutional Neural Network and Gated Recurrent Unit (CNN–GRU) for IoT intrusion detection. The model is first pretrained on a large-scale source dataset containing mixed benign and attack traffic, then adapted to a smaller and structurally different target dataset using partial finetuning. To enable efficient edge adaptation, early convolutional layers are frozen while only the GRU and classification head are updated on the target domain. A leakage-free, group-aware data preparation strategy with overlapping temporal windows is employed to ensure reliable evaluation. Experimental results demonstrate that the proposed lightweight transfer approach achieves solid macro-level detection performance while reducing training cost compared to full finetuning. Additional analysis using a CPU-based inference proxy shows low latency and a small model footprint. This supports the feasibility of edge deployment. The results confirm that lightweight transfer learning offers an effective balance between detection performance and adaptation efficiency for resource-constrained IoT intrusion detection systems. Full article
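The leakage-free, group-aware windowing can be sketched in a few lines: overlapping windows are built per source group (e.g. one capture session), and the train/test split is made over groups, so window overlap can never straddle the split. Function names and window parameters are illustrative:

```python
def make_windows(seq, size=4, stride=2):
    """Overlapping temporal windows over one group's traffic sequence."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, stride)]

def group_split(windows, groups, test_groups):
    """Assign every window to train or held-out by its source group, so
    no group (hence no overlapping pair of windows) spans both sides."""
    train = [w for w, g in zip(windows, groups) if g not in test_groups]
    held = [w for w, g in zip(windows, groups) if g in test_groups]
    return train, held

wins = make_windows(list(range(8)))            # three overlapping windows
train, held = group_split(wins, ["a", "a", "b"], {"b"})
```

A naive random split over windows would place `seq[2:6]` in training and the overlapping `seq[0:4]` in test, silently inflating the evaluation — the group constraint is what prevents this.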

24 pages, 10647 KB  
Article
Spatio-Temporal Feature Fusion for Anti-UAV Detection: Integrating Inter-Frame Dynamics and Appearance
by Yake Zhang, Xiaoxi Fu, Yunfeng Zhou, Xiaojun Guo, Bei Sun, Yinglong Wang and Yongping Zhai
Sensors 2026, 26(5), 1492; https://doi.org/10.3390/s26051492 - 27 Feb 2026
Abstract
In order to improve the detection capability of low-slow-small UAV targets in complex backgrounds, this paper introduces a novel method that combines spatio-temporal information, which includes (1) an improved YOLO detector for small UAV detection, (2) a motion target detection module, and (3) an integrated combination strategy for static and dynamic judgment. We first present an improved YOLOv11 static detection method that combines SPD-Conv, BiFPN, and a detection head for high-resolution layers; we then design a dynamic target-detection algorithm that helps the YOLO method capture minor movement features; and we finally introduce a strategy for fusing static detection with dynamic judgment. The experimental results on small UAV datasets covering various sky, mountain, and building backgrounds show that the proposed approach increases Precision, Recall, and mAP50 by 12.1%, 29.5%, and 29.6%, respectively, compared with the baseline YOLOv11 detector. The proposed MSM-YOLO achieves Precision, Recall, and mAP50 of 94%, 92%, and 86.3%, enabling the effective detection of small UAV targets in complex scenarios. The ablation experiments also confirm the effectiveness of each module. The proposed method was further deployed on a redesigned RK3588 embedded system, achieving 100 fps after optimization, and has shown effectiveness and practicality in air-to-air UAV detection applications. Full article
(This article belongs to the Section Sensors and Robotics)
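The dynamic-judgment stage — catching minor movement that a per-frame detector misses — can be illustrated with plain frame differencing, the cheapest motion cue; the real module and its fusion rule are more elaborate, and the thresholds here are invented:

```python
import numpy as np

def motion_mask(prev, curr, thresh=0.1):
    """Absolute frame differencing flags pixels with inter-frame change,
    a cheap motion cue to intersect with static detector boxes."""
    return np.abs(curr.astype(float) - prev.astype(float)) > thresh

def box_is_moving(box, mask, min_frac=0.05):
    """A detection box is judged 'dynamic' if enough of its pixels moved.
    box: (x1, y1, x2, y2) in pixel coordinates."""
    x1, y1, x2, y2 = box
    region = mask[y1:y2, x1:x2]
    return region.size > 0 and region.mean() >= min_frac

prev = np.zeros((32, 32))
curr = np.zeros((32, 32)); curr[10:14, 10:14] = 1.0  # a small moving blob
m = motion_mask(prev, curr)
```

Requiring both a static detection and a positive motion judgment is one simple way to suppress stationary distractors (antennas, birds at rest) that resemble UAVs in a single frame.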

23 pages, 2666 KB  
Article
A Study on ACCC Surface Defect Classification Method Using ResNet18 with Integrated SE Attention Mechanism
by Wenlong Xiao and Rui Chen
Appl. Sci. 2026, 16(4), 1899; https://doi.org/10.3390/app16041899 - 13 Feb 2026
Abstract
Surface defect detection in aluminum-based composite core conductors (ACCC) via X-ray imaging has long been constrained by challenges such as small sample sizes, class imbalance, model redundancy, and inadequate adaptation to single-channel industrial images. To address this, this paper proposes SE-ResNet18, a lightweight classification model synergistically designed for industrial single-channel X-ray images. The model features a co-adapted architecture where a single-channel input layer (preserving native image information and eliminating RGB conversion overhead) is coupled with a channel attention mechanism (to amplify subtle defect features), all within a globally optimized lightweight framework. With targeted data augmentation and robust training strategies, the model achieves superior performance on the ACCC defect dataset: classification accuracy reaches 98.39%, while excelling in lightweight design (12.0 million parameters) and real-time capability (0.44 ms/image inference speed). The experiments demonstrate that the proposed model exhibits high classification accuracy in testing while offering superior lightweight characteristics and inference efficiency. This provides a feasible solution for achieving high-precision detection and real-time processing in industrial scenarios, showcasing potential for ACCC online detection applications. Full article
(This article belongs to the Special Issue AI Applications in Modern Industrial Systems)
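The SE channel attention integrated into ResNet18 follows the standard squeeze-excite-scale recipe, sketched below with random weights standing in for the trained FC layers (the reduction ratio r = 4 is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

def se_block(x, w1, w2):
    """Minimal Squeeze-and-Excitation: global average pool per channel
    (squeeze), two FC layers with ReLU then sigmoid (excitation), and
    channel-wise rescaling of the input feature map.
    x: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    z = x.mean(axis=(1, 2))                  # squeeze: (C,)
    s = np.maximum(w1 @ z, 0.0)              # FC + ReLU: (C//r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))      # FC + sigmoid: (C,)
    return x * s[:, None, None]              # scale each channel

C, r = 8, 4
x = rng.standard_normal((C, 6, 6))
out = se_block(x, rng.standard_normal((C // r, C)), rng.standard_normal((C, C // r)))
```

Because the gate is computed from the global channel statistics, it can amplify the few channels that respond to subtle X-ray defect cues while damping the rest, at a cost of only two small FC layers per block.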

25 pages, 3298 KB  
Article
FDE-YOLO: An Improved Algorithm for Small Target Detection in UAV Images
by Jialiang Li, Xu Guo, Xu Zhao and Jie Jin
Mathematics 2026, 14(4), 663; https://doi.org/10.3390/math14040663 - 13 Feb 2026
Abstract
Accurate small object detection in unmanned aerial vehicle (UAV) imagery is fundamental to numerous safety-critical applications, including intelligent transportation, urban surveillance, and disaster assessment. However, extreme scale compression, dense object distributions, and complex backgrounds severely constrain the feature representation capability of existing detectors, leading to degraded reliability in real-world deployments. To overcome these limitations, we propose FDE-YOLO, a lightweight yet high-performance detection framework built upon YOLOv11 with three complementary architectural innovations. The Fine-Grained Detection Pyramid (FGDP) integrates space-to-depth convolution with a CSP-MFE module that fuses multi-granularity features through parallel local, context, and global branches, capturing comprehensive small target information while avoiding the computational overhead of layer stacking. The Dynamic Detection Fusion Head (DDFHead) unifies scale-aware, spatial-aware, and task-aware attention mechanisms via sequential refinement with DCNv4 and FReLU activation, adaptively enhancing discriminative capability for densely clustered targets in complex scenes. The EdgeSpaceNet module explicitly fuses Sobel-extracted boundary features with spatial convolution outputs through residual connections, recovering edge details typically lost in standard operations while reducing parameter count via depthwise separable convolutions. Extensive experiments on the VisDrone2019 dataset demonstrate that FDE-YOLO achieves 53.6% precision, 42.5% recall, 43.3% mAP50, and 26.3% mAP50:95, surpassing YOLOv11s by 2.8%, 4.4%, 4.1%, and 2.8%, respectively, with only 10.25 M parameters. The proposed approach outperforms UAV-specialized methods including Drone-YOLO and MASF-YOLO while using significantly fewer parameters (37.5% and 29.8% reductions, respectively), demonstrating superior efficiency. Cross-dataset evaluations on UAV-DT and NWPU VHR-10 further confirm strong generalization capability with 1.6% and 1.5% mAP50 improvements, respectively, validating FDE-YOLO as an effective and efficient solution for reliable UAV-based small object detection in real-world scenarios. Full article
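The space-to-depth convolution mentioned for the FGDP works by rearranging spatial blocks into channels, so downsampling discards no pixels of a small target. A minimal NumPy sketch of that rearrangement (block size and shapes are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange a (C, H, W) map into (C*block^2, H/block, W/block),
    trading resolution for channels without discarding any pixels
    (the lossless downsampling behind SPD-style convolutions)."""
    c, h, w = x.shape
    assert h % block == 0 and w % block == 0
    x = x.reshape(c, h // block, block, w // block, block)
    x = x.transpose(0, 2, 4, 1, 3)  # gather each within-block offset
    return x.reshape(c * block * block, h // block, w // block)

x = np.arange(16, dtype=float).reshape(1, 4, 4)
y = space_to_depth(x)
print(y.shape)  # (4, 2, 2)
```

A strided convolution with stride 2 would keep only a quarter of the positions; here all 16 input values survive, redistributed across 4 channels, which is why SPD layers are favored when the object itself may be only a few pixels wide.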
(This article belongs to the Special Issue New Advances in Image Processing and Computer Vision)

28 pages, 66640 KB  
Article
SSABNet: Spatial-Semantic Aggregation and Balancing Network for Small-Target Detection in UAV Remote Sensing Images
by Hongxing Zhang, Zhonghong Ou, Siyuan Yao, Shigeng Wang, Yang Guo and Meina Song
Remote Sens. 2026, 18(4), 550; https://doi.org/10.3390/rs18040550 - 9 Feb 2026
Abstract
The precise localization of small objects in UAV-captured remote sensing imagery remains a formidable challenge due to their limited spatial support, coarse resolution, and severe background clutter. These factors often cause weak target cues to be progressively overwhelmed during deep feature extraction. Existing deep learning-based detectors typically suffer from two fundamental limitations: the irreversible loss of fine-grained spatial details during hierarchical feature fusion and the scale-insensitive optimization of conventional loss functions, which inadequately emphasize hard-to-detect small targets. To address these issues, we propose a novel Spatial-Semantic Aggregation and Balancing Network (SSABNet) tailored for UAV-based small-target detection. First, a Spatial-Semantic Aggregation (SSA) module is introduced to establish a high-fidelity restoration pathway that recovers fine-grained texture and boundary information from shallow layers. By employing content-aware operators, SSA effectively reconciles the structural discrepancy between spatial details and semantic abstractions, enabling precise cross-scale feature fusion while suppressing aliasing artifacts. Second, we design a Scale-Aware Balancing Loss (SABL) to mitigate the gradient instability and vanishing-gradient issues commonly encountered when optimizing non-overlapping small targets. SABL adopts a scale-dependent modulation mechanism that smoothly transitions from Wasserstein distance for distributional alignment of small objects to Euclidean distance for geometric refinement of larger targets, thereby ensuring stable and balanced optimization across object scales. Extensive experiments on the VisDrone benchmark demonstrate that SSABNet outperforms state-of-the-art detectors, achieving gains of 1.3% in overall AP and 2.5% in APs. Further evaluation on the UAVDT dataset confirms its strong generalization capability, yielding improvements of 0.5% in AP and 16.9% in APs. These results validate the effectiveness of jointly addressing feature representation and scale-aware optimization for UAV small-target detection. Full article
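The scale-dependent modulation behind SABL can be illustrated with a toy regression loss that blends a Gaussian-Wasserstein term (dominant for small boxes, where non-overlapping predictions would otherwise yield vanishing IoU gradients) with a plain Euclidean center distance (dominant for large boxes). The exact SABL formulation is not given in the abstract; the sigmoid schedule and the hyperparameters `s0`, `tau`, and `const` below are illustrative assumptions.

```python
import numpy as np

def gaussian_w2(box_a, box_b):
    """2-Wasserstein distance between Gaussians fitted to two
    (cx, cy, w, h) boxes, as in NWD-style small-object losses."""
    (cxa, cya, wa, ha), (cxb, cyb, wb, hb) = box_a, box_b
    return np.sqrt((cxa - cxb) ** 2 + (cya - cyb) ** 2
                   + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)

def scale_balanced_loss(pred, target, s0=32.0, tau=8.0, const=12.8):
    """Blend a Wasserstein term (weighted up for small targets) with a
    Euclidean center distance (weighted up for large ones).
    s0/tau/const are illustrative hyperparameters, not the paper's."""
    scale = np.sqrt(target[2] * target[3])             # sqrt(w * h)
    alpha = 1.0 / (1.0 + np.exp((scale - s0) / tau))   # -> 1 for small boxes
    l_wass = 1.0 - np.exp(-gaussian_w2(pred, target) / const)
    l_eucl = np.hypot(pred[0] - target[0], pred[1] - target[1])
    return alpha * l_wass + (1.0 - alpha) * l_eucl

print(scale_balanced_loss((10, 10, 6, 6), (12, 11, 5, 5)))
```

The key property is that the Wasserstein term stays smooth and bounded even when predicted and target boxes do not overlap at all, so tiny targets still receive usable gradients, while larger targets fall back to direct geometric refinement.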
(This article belongs to the Special Issue Deep Learning-Based Small-Target Detection in Remote Sensing)
