Search Results (8)

Search Parameters:
Keywords = MambaIR

27 pages, 49730 KB  
Article
AMSRDet: An Adaptive Multi-Scale UAV Infrared-Visible Remote Sensing Vehicle Detection Network
by Zekai Yan and Yuheng Li
Sensors 2026, 26(3), 817; https://doi.org/10.3390/s26030817 - 26 Jan 2026
Cited by 1 | Viewed by 404
Abstract
Unmanned Aerial Vehicle (UAV) platforms enable flexible and cost-effective vehicle detection for intelligent transportation systems, yet small-scale vehicles in complex aerial scenes pose substantial challenges from extreme scale variations, environmental interference, and single-sensor limitations. We present AMSRDet (Adaptive Multi-Scale Remote Sensing Detector), an adaptive multi-scale detection network fusing infrared (IR) and visible (RGB) modalities for robust UAV-based vehicle detection. Our framework comprises four novel components: (1) a MobileMamba-based dual-stream encoder extracting complementary features via Selective State-Space 2D (SS2D) blocks with linear complexity O(HWC), achieving 2.1× efficiency improvement over standard Transformers; (2) a Cross-Modal Global Fusion (CMGF) module capturing global dependencies through spatial-channel attention while suppressing modality-specific noise via adaptive gating; (3) a Scale-Coordinate Attention Fusion (SCAF) module integrating multi-scale features via coordinate attention and learned scale-aware weighting, improving small object detection by 2.5 percentage points; and (4) a Separable Dynamic Decoder generating scale-adaptive predictions through content-aware dynamic convolution, reducing computational cost by 48.9% compared to standard DETR decoders. On the DroneVehicle dataset, AMSRDet achieves 45.8% mAP@0.5:0.95 (81.2% mAP@0.5) at 68.3 Frames Per Second (FPS) with 28.6 million (M) parameters and 47.2 Giga Floating Point Operations (GFLOPs), outperforming twenty state-of-the-art detectors including YOLOv12 (+0.7% mAP), DEIM (+0.8% mAP), and Mamba-YOLO (+1.5% mAP). Cross-dataset evaluation on Camera-vehicle yields 52.3% mAP without fine-tuning, demonstrating strong generalization across viewpoints and scenarios.
(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)
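As a rough illustration of the adaptive gating the CMGF description suggests, the following PyTorch sketch fuses RGB and IR feature maps with a learned per-pixel, per-channel gate. Module and variable names are ours, not the authors' implementation:

```python
import torch
import torch.nn as nn

class GatedCrossModalFusion(nn.Module):
    """Minimal sketch of gated RGB/IR fusion (illustrative, not the
    authors' CMGF code, which also models global spatial dependencies)."""
    def __init__(self, channels: int):
        super().__init__()
        # A 1x1 conv over the concatenated modalities predicts the gate.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        # g in (0, 1) decides, per pixel and channel, how much each modality
        # contributes, so noisy modality-specific responses are down-weighted
        # rather than summed blindly.
        g = self.gate(torch.cat([rgb, ir], dim=1))
        return g * rgb + (1.0 - g) * ir

fused = GatedCrossModalFusion(64)(torch.randn(1, 64, 32, 32),
                                  torch.randn(1, 64, 32, 32))
```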

23 pages, 53610 KB  
Article
Multispectral Sparse Cross-Attention Guided Mamba Network for Small Object Detection in Remote Sensing
by Wen Xiang, Yamin Li, Liu Duan, Qifeng Wu, Jiaqi Ruan, Yucheng Wan and Sihan Wu
Remote Sens. 2026, 18(3), 381; https://doi.org/10.3390/rs18030381 - 23 Jan 2026
Viewed by 423
Abstract
Remote sensing small object detection remains a challenging task due to limited feature representation and interference from complex backgrounds. Existing methods that rely exclusively on either visible or infrared modalities often fail to achieve both accuracy and robustness in detection. Effectively integrating cross-modal information to enhance detection performance remains a critical challenge. To address this issue, we propose a novel Multispectral Sparse Cross-Attention Guided Mamba Network (MSCGMN) for small object detection in remote sensing. The proposed MSCGMN architecture comprises three key components: Multispectral Sparse Cross-Attention Guidance Module (MSCAG), Dynamic Grouped Mamba Block (DGMB), and Gated Enhanced Attention Module (GEAM). Specifically, the MSCAG module selectively fuses RGB and infrared (IR) features using sparse cross-modal attention, effectively capturing complementary information across modalities while suppressing redundancy. The DGMB introduces a dynamic grouping strategy to improve the computational efficiency of Mamba, enabling effective global context modeling. In remote sensing images, small objects occupy limited areas, making it difficult to capture their critical features. We design the GEAM module to enhance both global and local feature representations for small object detection. Experiments on the VEDAI and DroneVehicle datasets show that MSCGMN achieves mAP50 scores of 83.9% and 84.4%, outperforming existing state-of-the-art methods and demonstrating strong competitiveness in small object detection tasks.
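A common way to make cross-modal attention sparse, as the MSCAG description implies, is to keep only each query's top-k scores before the softmax. The sketch below shows that idea under our own assumptions (the paper's exact sparsification may differ):

```python
import torch
import torch.nn.functional as F

def sparse_cross_attention(q_rgb: torch.Tensor, kv_ir: torch.Tensor,
                           k_top: int = 16) -> torch.Tensor:
    """Cross-attention from RGB queries to IR keys/values, sparsified to
    each query's k_top strongest matches (illustrative sketch).
    q_rgb: (B, Nq, C); kv_ir: (B, Nk, C)."""
    scores = q_rgb @ kv_ir.transpose(-2, -1) / q_rgb.shape[-1] ** 0.5
    # Mask everything below each query's k-th largest score to -inf so the
    # softmax ignores it, suppressing redundant cross-modal correspondences.
    kth = scores.topk(k_top, dim=-1).values[..., -1:]
    scores = scores.masked_fill(scores < kth, float("-inf"))
    return F.softmax(scores, dim=-1) @ kv_ir

out = sparse_cross_attention(torch.randn(2, 100, 64), torch.randn(2, 400, 64))
```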

22 pages, 27042 KB  
Article
MSDF-Mamba: Mutual-Spectrum Perception Deformable Fusion Mamba for Drone-Based Visible–Infrared Cross-Modality Vehicle Detection
by Jiashuo Shen, Jun He, Qiuyu Liu, Zhilong Zhang, Guoyan Wang and Dawei Lu
Remote Sens. 2025, 17(24), 4037; https://doi.org/10.3390/rs17244037 - 15 Dec 2025
Cited by 1 | Viewed by 970
Abstract
To ensure all-day detection performance, unmanned aerial vehicles (UAVs) usually need both visible and infrared images for dual-modality fusion object detection. However, misalignment between RGB-IR image pairs and the complexity of fusion models constrain fusion detection performance. Specifically, typical alignment methods choose only one modality as the reference, leading to excessive dependence on the quality of the chosen modality. Furthermore, current multimodal fusion detection methods still struggle to strike a balance between high accuracy and low computational complexity, making the deployment of these models on resource-constrained UAV platforms a challenge. To solve these problems, this paper proposes a dual-modality UAV image target detection method named Mutual-Spectrum Perception Deformable Fusion Mamba (MSDF-Mamba). First, we designed a Mutual Spectral Deformable Alignment (MSDA) module. This module employs a bidirectional cross-attention mechanism to enable one modality to actively extract the semantic information of the other, generating fusion features rich in cross-modal context as shared references. These fusion features are then used to predict spatial offsets, with deformable convolutions achieving feature alignment. Based on the MSDA module, a Selective Scan Fusion (SSF) module is carefully designed to project the aligned features onto a unified hidden state space. With this method, we achieve full interaction and enhanced fusion of intermodal features with low computational complexity. Experimental results demonstrate that our method outperforms existing state-of-the-art cross-modality detection methods on the mAP metric, achieving a relative improvement of 3.1% compared to baseline models such as DMM, while still maintaining high computational efficiency.
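The offset-predicted alignment step can be sketched with torchvision's deformable convolution: offsets are regressed from the shared cross-modal reference rather than from either modality alone. Names are ours, and the authors' MSDA additionally builds that reference with bidirectional cross-attention, which is omitted here:

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableAlign(nn.Module):
    """Sketch of reference-guided deformable feature alignment."""
    def __init__(self, channels: int):
        super().__init__()
        # 3x3 kernel -> one (x, y) offset pair per sampling point = 18 channels.
        self.offset_head = nn.Conv2d(2 * channels, 18, kernel_size=3, padding=1)
        self.align = DeformConv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_ir: torch.Tensor, fused_ref: torch.Tensor):
        # Offsets come from the fused cross-modal reference, so neither
        # modality alone dictates where the IR features are resampled.
        offsets = self.offset_head(torch.cat([feat_ir, fused_ref], dim=1))
        return self.align(feat_ir, offsets)

aligned = DeformableAlign(32)(torch.randn(1, 32, 40, 40),
                              torch.randn(1, 32, 40, 40))
```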

24 pages, 1626 KB  
Article
Physical Layer Security Enhancement in IRS-Assisted Interweave CIoV Networks: A Heterogeneous Multi-Agent Mamba RainbowDQN Method
by Ruiquan Lin, Shengjie Xie, Wencheng Chen and Tao Xu
Sensors 2025, 25(20), 6287; https://doi.org/10.3390/s25206287 - 10 Oct 2025
Viewed by 795
Abstract
The Internet of Vehicles (IoV) relies on Vehicle-to-Everything (V2X) communications to enable cooperative perception among vehicles, infrastructures, and devices, where Vehicle-to-Infrastructure (V2I) links are crucial for reliable transmission. However, the openness of wireless channels exposes IoV to eavesdropping, threatening privacy and security. This paper investigates an Intelligent Reflecting Surface (IRS)-assisted interweave Cognitive IoV (CIoV) network to enhance physical layer security in V2I communications. A non-convex joint optimization problem involving spectrum allocation, transmit power for Vehicle Users (VUs), and IRS phase shifts is formulated. To address this challenge, a heterogeneous multi-agent (HMA) Mamba RainbowDQN algorithm is proposed, where homogeneous VUs and a heterogeneous secondary base station (SBS) act as distinct agents to simplify decision-making. Simulation results show that the proposed method significantly outperforms benchmark schemes, achieving a 13.29% improvement in secrecy rate and a 54.2% reduction in secrecy outage probability (SOP). These results confirm the effectiveness of integrating IRS and deep reinforcement learning (DRL) for secure and efficient V2I communications in CIoV networks.
(This article belongs to the Section Sensor Networks)
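The secrecy rate the abstract optimizes is a standard physical-layer-security quantity: the legitimate link's capacity minus the eavesdropper's, floored at zero. A small worked example (SNR values are hypothetical; the paper's reward shaping around this quantity is not shown):

```python
import math

def secrecy_rate(snr_legit: float, snr_eve: float) -> float:
    """Secrecy rate in bit/s/Hz: [log2(1 + SNR_legit) - log2(1 + SNR_eve)]+."""
    return max(0.0, math.log2(1 + snr_legit) - math.log2(1 + snr_eve))

# If the IRS lifts the V2I link from 10 dB to 15 dB while the eavesdropper
# stays at 3 dB, the secrecy rate rises accordingly:
print(secrecy_rate(10 ** 1.0, 10 ** 0.3))   # ~1.9 bit/s/Hz
print(secrecy_rate(10 ** 1.5, 10 ** 0.3))   # ~3.4 bit/s/Hz
```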

20 pages, 4585 KB  
Article
MMamba: An Efficient Multimodal Framework for Real-Time Ocean Surface Wind Speed Inpainting Using Mutual Information and Attention-Mamba-2
by Xinjie Shi, Weicheng Ni, Boheng Duan, Qingguo Su, Lechao Liu and Kaijun Ren
Remote Sens. 2025, 17(17), 3091; https://doi.org/10.3390/rs17173091 - 4 Sep 2025
Cited by 2 | Viewed by 1423
Abstract
Accurate observations of Ocean Surface Wind Speed (OSWS) are vital for predicting extreme weather and understanding ocean–atmosphere interactions. However, spaceborne sensors (e.g., ASCAT, SMAP) often experience data loss due to harsh weather and instrument malfunctions. Existing inpainting methods often rely on reanalysis data that is released with delays, which restricts their real-time capability. Additionally, deep-learning-based methods, such as Transformers, face challenges due to their high computational complexity. To address these challenges, we present the Multimodal Wind Speed Inpainting Dataset (MWSID), which integrates 12 auxiliary forecasting variables to support real-time OSWS inpainting. Based on MWSID, we propose the MMamba framework, combining the Multimodal Feature Extraction module, which uses mutual information (MI) theory to optimize feature selection, and the OSWS Reconstruction module, which employs Attention-Mamba-2 within a Residual-in-Residual-Dense architecture for efficient OSWS inpainting. Experiments show that MMamba outperforms MambaIR (state-of-the-art) with an RMSE of 0.5481 m/s and an SSIM of 0.9820, significantly reducing RMSE by 21.10% over Kriging and 8.22% over MambaIR in high winds (>15 m/s). We further introduce MMamba-L, a lightweight 0.22M-parameter variant suitable for resource-limited devices. These contributions make MMamba and MWSID powerful tools for OSWS inpainting, benefiting extreme weather prediction and oceanographic research.
(This article belongs to the Section AI Remote Sensing)
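MI-guided feature selection of the kind the abstract describes can be sketched with scikit-learn's estimator: rank the auxiliary channels by estimated mutual information with the target wind speed and keep the most informative ones. The data here is a synthetic stand-in for MWSID, and the paper's exact estimator and threshold may differ:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Stand-in for MWSID: 12 auxiliary forecast variables (columns) and the
# observed wind speed as the target; only columns 0 and 3 carry signal.
rng = np.random.default_rng(0)
aux = rng.normal(size=(1000, 12))
wind = 2.0 * aux[:, 0] + 0.5 * aux[:, 3] + rng.normal(scale=0.1, size=1000)

# Rank auxiliary channels by estimated MI with OSWS and keep the top four.
mi = mutual_info_regression(aux, wind, random_state=0)
keep = np.argsort(mi)[::-1][:4]
print(keep)  # columns 0 and 3 should rank highest
```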

21 pages, 2657 KB  
Article
A Lightweight Multi-Stage Visual Detection Approach for Complex Traffic Scenes
by Xuanyi Zhao, Xiaohan Dou, Jihong Zheng and Gengpei Zhang
Sensors 2025, 25(16), 5014; https://doi.org/10.3390/s25165014 - 13 Aug 2025
Cited by 1 | Viewed by 1229
Abstract
In complex traffic environments, image degradation due to adverse factors such as haze, low illumination, and occlusion significantly compromises the performance of object detection systems in recognizing vehicles and pedestrians. To address these challenges, this paper proposes a robust visual detection framework that integrates multi-stage image enhancement with a lightweight detection architecture. Specifically, an image preprocessing module incorporating ConvIR and CIDNet is designed to perform defogging and illumination enhancement, thereby substantially improving the perceptual quality of degraded inputs. Furthermore, a novel enhancement strategy based on the Horizontal/Vertical-Intensity color space is introduced to decouple brightness and chromaticity modeling, effectively enhancing structural details and visual consistency in low-light regions. In the detection phase, a lightweight state-space modeling network, Mamba-Driven Lightweight Detection Network with RT-DETR Decoding, is proposed for object detection in complex traffic scenes. This architecture integrates VSSBlock and XSSBlock modules to enhance detection performance, particularly for multi-scale and occluded targets. Additionally, a VisionClueMerge module is incorporated to strengthen the perception of edge structures by effectively fusing multi-scale spatial features. Experimental evaluations on traffic surveillance datasets demonstrate that the proposed method surpasses the mainstream YOLOv12s model in terms of mAP@50–90, achieving a performance gain of approximately 1.0 percentage point (from 0.759 to 0.769). While ensuring competitive detection accuracy, the model exhibits reduced parameter complexity and computational overhead, thereby demonstrating superior deployment adaptability and robustness. This framework offers a practical and effective solution for object detection in intelligent transportation systems operating under visually challenging conditions.
(This article belongs to the Section Sensing and Imaging)
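The enhance-then-detect staging reduces to function composition; a skeleton of that structure, with `nn.Identity` stand-ins where ConvIR, CIDNet, and the Mamba-based detector would plug in (this is our scaffolding, not the released code):

```python
import torch.nn as nn

class StagedDetector(nn.Module):
    """Skeleton of a multi-stage enhance-then-detect pipeline."""
    def __init__(self, dehaze: nn.Module, relight: nn.Module,
                 detector: nn.Module):
        super().__init__()
        self.dehaze, self.relight, self.detector = dehaze, relight, detector

    def forward(self, frame):
        # Stage 1: remove haze; Stage 2: correct illumination; Stage 3: run
        # the detector on the restored frame. Each stage can be trained or
        # frozen independently.
        return self.detector(self.relight(self.dehaze(frame)))

pipeline = StagedDetector(nn.Identity(), nn.Identity(), nn.Identity())
```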

24 pages, 9664 KB  
Article
Frequency-Domain Collaborative Lightweight Super-Resolution for Fine Texture Enhancement in Rice Imagery
by Zexiao Zhang, Jie Zhang, Jinyang Du, Xiangdong Chen, Wenjing Zhang and Changmeng Peng
Agronomy 2025, 15(7), 1729; https://doi.org/10.3390/agronomy15071729 - 18 Jul 2025
Cited by 1 | Viewed by 1405
Abstract
In rice detection tasks, accurate identification of leaf streaks, pest and disease distribution, and spikelet hierarchies relies on high-quality images to distinguish between texture and hierarchy. However, existing images often suffer from texture blurring and contour shifting due to equipment and environment limitations, which degrades detection performance. Since pest and disease patterns are largely global while fine details are mostly local, we propose a rice image reconstruction method based on an adaptive two-branch heterogeneous structure. The method consists of a low-frequency branch (LFB) that uses orientation-aware extended receptive fields to recover global, streak-like features such as pest and disease patterns, and a high-frequency branch (HFB) that sharpens detail edges through an adaptive enhancement mechanism to boost the clarity of local detail regions. By introducing a dynamic weight fusion mechanism (CSDW) and a lightweight gating network (LFFN), we address the unbalanced fusion of frequency information that limits traditional methods on rice images. Experiments on the 4× downsampled rice test set demonstrate that the proposed method achieves a 62% reduction in parameters compared to EDSR, 41% lower computational cost (30 G) than MambaIR-light, and an average PSNR improvement of 0.68% over other methods in the study, while balancing memory usage (227 M) and inference speed. In downstream task validation, rice panicle maturity detection achieves a 61.5% increase in mAP50 (0.480 → 0.775) compared to interpolation methods, and leaf pest detection shows a 2.7% improvement in average mAP50 (0.949 → 0.975). This research provides an effective solution for lightweight rice image enhancement, with its dual-branch collaborative mechanism and dynamic fusion strategy establishing a new paradigm in agricultural rice image processing.
(This article belongs to the Collection AI, Sensors and Robotics for Smart Agriculture)
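The low/high-frequency decomposition feeding the two branches can be sketched with a cheap blur: the blurred image is the low-frequency part, the residual holds edges and fine texture. The fusion weight here is a fixed scalar standing in for the content-dependent weight CSDW would predict; the paper's actual branches are far richer:

```python
import torch
import torch.nn.functional as F

def split_frequencies(img: torch.Tensor, blur_kernel: int = 9):
    """Decompose an image into low- and high-frequency parts via box blur.
    img: (B, C, H, W)."""
    pad = blur_kernel // 2
    low = F.avg_pool2d(F.pad(img, [pad] * 4, mode="reflect"),
                       blur_kernel, stride=1)
    high = img - low  # residual edges and fine texture
    return low, high

low, high = split_frequencies(torch.randn(1, 3, 64, 64))
# CSDW-style fusion would predict this weight from content; a fixed
# scalar stands in for it here.
w = 0.5
fused = w * low + (1.0 - w) * high
```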

26 pages, 14660 KB  
Article
Succulent-YOLO: Smart UAV-Assisted Succulent Farmland Monitoring with CLIP-Based YOLOv10 and Mamba Computer Vision
by Hui Li, Fan Zhao, Feng Xue, Jiaqi Wang, Yongying Liu, Yijia Chen, Qingyang Wu, Jianghan Tao, Guocheng Zhang, Dianhan Xi, Jundong Chen and Hill Hiroki Kobayashi
Remote Sens. 2025, 17(13), 2219; https://doi.org/10.3390/rs17132219 - 28 Jun 2025
Cited by 22 | Viewed by 1640
Abstract
Recent advances in unmanned aerial vehicle (UAV) technology combined with deep learning techniques have greatly improved agricultural monitoring. However, accurately processing images at low resolutions remains challenging for precision cultivation of succulents. To address this issue, this study proposes a novel method that combines cutting-edge super-resolution reconstruction (SRR) techniques with object detection and deploys the combined model in a unified drone framework to achieve large-scale, reliable monitoring of succulent plants. Specifically, we introduce MambaIR, an innovative SRR method leveraging selective state-space models, significantly improving the quality of UAV-captured low-resolution imagery (achieving a PSNR of 23.83 dB and an SSIM of 79.60%) and surpassing current state-of-the-art approaches. Additionally, we develop Succulent-YOLO, a customized target detection model optimized for succulent image classification, achieving a mean average precision (mAP@50) of 87.8% on high-resolution images. The integrated use of MambaIR and Succulent-YOLO achieves an mAP@50 of 85.1% when tested on enhanced super-resolution images, closely approaching the performance on original high-resolution images. Through extensive experimentation supported by Grad-CAM visualization, our method effectively captures critical features of succulents, identifying the best trade-off between resolution enhancement and computational demands. By overcoming the limitations associated with low-resolution UAV imagery in agricultural monitoring, this solution provides an effective, scalable approach for evaluating succulent plant growth. Addressing image-quality issues further facilitates informed decision-making, reducing technical challenges. Ultimately, this study provides a robust foundation for expanding the practical use of UAVs and artificial intelligence in precision agriculture, promoting sustainable farming practices through advanced remote sensing technologies.
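The PSNR figure quoted above is the standard peak signal-to-noise ratio; for reference, a minimal NumPy implementation of the metric (the arrays are random stand-ins; real use would load the UAV image pairs):

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

# Score an SR output against its ground-truth high-resolution counterpart.
hr = np.random.rand(256, 256, 3)
sr = hr + np.random.normal(scale=0.05, size=hr.shape)
print(f"{psnr(hr, sr):.2f} dB")
```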
