Search Results (138)

Search Parameters:
Keywords = multi-target joint detection

34 pages, 1939 KB  
Article
AutoUAVFormer: Neural Architecture Search with Implicit Super-Resolution for Real-Time UAV Aerial Object Detection
by Li Pan, Huiyao Wan, Pazlat Nurmamat, Jie Chen, Long Sun, Yice Cao, Shuai Wang, Yingsong Li and Zhixiang Huang
Remote Sens. 2026, 18(9), 1268; https://doi.org/10.3390/rs18091268 - 22 Apr 2026
Abstract
The widespread deployment of unmanned aerial vehicles (UAVs) in civil and commercial airspace has raised significant safety concerns, driving the demand for reliable and real-time Anti-UAV visual detection systems. However, existing deep learning-based detectors face substantial challenges in complex low-altitude environments, including drastic scale variations, severe background clutter, and weak feature representation of small UAV targets. Moreover, handcrafted Transformer-based architectures often lack adaptability across diverse scenarios and struggle to balance detection accuracy with computational efficiency. To address these limitations, this paper proposes AutoUAVFormer, a super-resolution guided neural architecture search framework for Anti-UAV detection. In contrast to conventional manually designed approaches, AutoUAVFormer leverages joint optimization of a Transformer-based detection objective and a super-resolution reconstruction objective to automatically identify a task-specific optimal network architecture for detecting UAV targets. Specifically, a unified search space is formulated by jointly embedding Transformer hyperparameters and Feature Pyramid Network (FPN) structures, facilitating end-to-end co-optimization of multi-scale feature fusion and global context modeling. To efficiently locate architectures that balance accuracy and computational cost, a three-stage pipeline, combining supernetwork training with evolutionary search, is employed. Additionally, we design a super-resolution auxiliary branch that operates only during training to enhance the model’s ability to learn fine-grained textures and sharpen edge representations of small targets, without introducing any inference overhead. 
Extensive experiments on three challenging Anti-UAV detection benchmarks, namely DetFly, DUT Anti-UAV, and UAV Swarm, confirm the superiority of AutoUAVFormer over current state-of-the-art methods, with mAP@0.5 scores reaching 98.6%, 95.5%, and 89.9% on the respective datasets while sustaining real-time inference speed. These results demonstrate that AutoUAVFormer achieves strong generalization and maintains robust Anti-UAV detection performance under challenging low-altitude conditions.
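The training-only super-resolution auxiliary branch described above reduces to a weighted joint loss at training time, with the SR term simply dropped at inference. Below is a minimal sketch under assumed names (`joint_training_loss`, an L1 reconstruction term, `sr_weight=0.1`); the paper's actual branch design and loss weighting are not given in this abstract.

```python
import numpy as np

def joint_training_loss(det_loss, sr_pred, hr_target, sr_weight=0.1):
    """Combine the detection objective with a super-resolution
    reconstruction objective (L1 here), used only during training."""
    sr_loss = np.mean(np.abs(sr_pred - hr_target))  # reconstruction error
    return det_loss + sr_weight * sr_loss

# A training step sees both terms; inference would call the detector alone,
# so the SR branch adds no runtime overhead.
hr = np.ones((8, 8))
pred = np.full((8, 8), 0.9)
loss = joint_training_loss(det_loss=1.5, sr_pred=pred, hr_target=hr)
```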
29 pages, 45646 KB  
Article
FSMD–Net: Joint Spatial–Channel Spectral Modeling for SAR Ship Detection in Complex Inshore Scenarios
by Xianxun Yao, Yijiang Shen and Yuheng Lei
Remote Sens. 2026, 18(8), 1254; https://doi.org/10.3390/rs18081254 - 21 Apr 2026
Abstract
Synthetic aperture radar (SAR) ship detection in complex inshore scenarios has long been constrained by the coupled effects of speckle noise and small–scale weak scattering targets. Although feature–level frequency–domain denoising methods partially alleviate noise interference, existing studies predominantly focus on spatial frequency modeling and implicitly assume consistent spectral responses and discriminative contributions across channels. This assumption may lead to over–suppression of weak ship targets under complex backgrounds. To address the incomplete dimensionality of current frequency–domain modeling, this paper proposes FSMD–Net, a joint spatial–channel spectral modeling framework for SAR ship detection. During multi–scale feature fusion, a coordinated modulation mechanism integrating multi–spectral channel attention with spatial frequency–domain denoising is introduced. This design enables channel discriminability and frequency–subspace denoising to act synergistically, enforcing structurally consistent spectral constraints throughout multi–scale feature propagation. Extensive experiments on SARDet–100K, HRSID, and AIR–SARShip–2.0 demonstrate that FSMD–Net achieves consistent performance improvements, particularly in small–target and strong–clutter scenarios, exhibiting enhanced detection accuracy and robustness.
(This article belongs to the Special Issue Ship Imaging, Detection and Recognition for High-Resolution SAR)
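The coordinated modulation of channel attention with spatial frequency-domain denoising can be illustrated with a toy NumPy sketch. The function `spectral_modulate`, the sigmoid channel gate, and the low-pass `keep_ratio` are illustrative assumptions, not FSMD–Net's actual modules.

```python
import numpy as np

def spectral_modulate(feat, keep_ratio=0.5):
    """Toy coordination of channel attention with spatial frequency-domain
    denoising on a (C, H, W) feature map."""
    c, h, w = feat.shape
    # Channel attention: squeeze to per-channel means, squash to (0, 1).
    att = 1.0 / (1.0 + np.exp(-feat.mean(axis=(1, 2))))
    # Spatial frequency denoising: keep a central low-frequency square.
    spec = np.fft.fftshift(np.fft.fft2(feat, axes=(1, 2)), axes=(1, 2))
    mask = np.zeros((h, w))
    hk, wk = int(h * keep_ratio) // 2, int(w * keep_ratio) // 2
    mask[h // 2 - hk:h // 2 + hk + 1, w // 2 - wk:w // 2 + wk + 1] = 1.0
    denoised = np.fft.ifft2(
        np.fft.ifftshift(spec * mask, axes=(1, 2)), axes=(1, 2)
    ).real
    # Channel weights modulate the denoised response.
    return denoised * att[:, None, None]
```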
21 pages, 4869 KB  
Article
Joint Adjustment Image Stabilization Method Based on Trajectories of Maritime Multi-Target Detection and Tracking
by Fangjian Liu, Yuan Li and Mi Wang
Appl. Sci. 2026, 16(8), 4029; https://doi.org/10.3390/app16084029 - 21 Apr 2026
Abstract
Existing technologies can achieve relative geometric correction and stabilization of geostationary satellite image sequences through fixed land scene matching or homonymous point adjustment. However, these methods heavily rely on fixed land areas, rendering them completely ineffective in vast ocean regions with only ship targets. Additionally, the trajectories of ship targets after processing still exhibit noticeable jitter, hindering motion information analysis. To address these issues, this paper proposes a joint image adjustment and stabilization method based on multi-target trajectories in marine environments: (1) An optimized target detection algorithm based on a multi-scale heterogeneous convolution module is introduced, which extracts background and target features through convolutions of different scales, enabling accurate detection and tracking of weak small targets in the image sequence frame by frame. (2) Curve fitting is performed on the detected positions of the same ship across multiple frames to simulate its motion trajectory under stabilized conditions. Combined with the prior assumption of uniform motion, an equal-division strategy is adopted to determine the corrected positions of the target in the image sequence. (3) The deviation correction values of multiple targets within the same frame are obtained, and based on the principle of intra-frame deviation consistency, precise image stabilization is achieved under multi-target constraints. Experiments based on Gaofen-4 satellite image sequences demonstrate that this method reduces the average position deviation of ship targets in the original images from 8.5 pixels (425 m) to 3.4 pixels (170 m), a decrease of approximately 59.41%, effectively improving the relative geometric accuracy of the image sequence and significantly eliminating target trajectory jitter.
(This article belongs to the Section Earth Sciences)
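Step (2), curve fitting plus the equal-division strategy under a uniform-motion prior, might look roughly like this. This is a sketch with assumed names; the paper's fitting model and division rule may differ.

```python
import numpy as np

def corrected_positions(frames, xs):
    """Fit a smooth trajectory to jittery per-frame detections, then place
    corrected positions at equal intervals between the fitted endpoints
    (uniform-motion prior)."""
    coeffs = np.polyfit(frames, xs, deg=2)   # smooth the raw track
    fitted = np.polyval(coeffs, frames)
    # Equal division: re-space positions uniformly along the fitted span.
    return np.linspace(fitted[0], fitted[-1], len(frames))

# Jittery detections of one ship along one image axis:
frames = np.arange(5.0)
xs = np.array([0.0, 1.2, 1.9, 3.1, 4.0])
out = corrected_positions(frames, xs)   # equally spaced, jitter removed
```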
26 pages, 956 KB  
Article
Environment-Guided Multimodal Pest Detection and Risk Assessment in Fruit and Vegetable Production Systems
by Jiapeng Sun, Yucheng Peng, Zhimeng Zhang, Wenrui Xu, Boyuan Xi, Yuanying Zhang and Yihong Song
Horticulturae 2026, 12(4), 486; https://doi.org/10.3390/horticulturae12040486 - 16 Apr 2026
Abstract
Pest occurrence in fruit and vegetable horticultural production exhibits strong environmental dependency, pronounced stage characteristics, and high sensitivity to control decision-making. To meet this practical challenge, a multimodal pest recognition and occurrence-risk joint modeling method is proposed, addressing the limitation that conventional intelligent plant protection systems focus primarily on pest identification while lacking risk discrimination capability. Within a unified network framework, pest visual information and environmental temporal data are integrated through the construction of an environment-guided representation learning mechanism, a recognition–risk joint optimization strategy, and a risk-aware decision representation modeling structure. In this manner, pest category recognition and occurrence risk evaluation are conducted simultaneously, thereby providing direct decision support for precision prevention and control in fruit and vegetable production. Systematic experimental evaluation is conducted based on multi-crop and multi-year field data collected from Wuyuan County, Bayannur City, Inner Mongolia. Overall comparative results demonstrate that an identification accuracy of 0.947, a precision of 0.936, and a recall of 0.924 are achieved on the test set, all of which significantly outperform mainstream visual detection models such as YOLOv8, DETR, and Mask R-CNN. In terms of detection performance, mAP@50 and mAP@75 reach 0.962 and 0.821, respectively, indicating stable localization and discrimination capability under complex backgrounds and dense small-target conditions. For the occurrence risk discrimination task, a risk accuracy of 0.887 is obtained, representing an improvement of approximately 4.5 percentage points compared with the simple multimodal feature concatenation method.
Cross-crop, cross-site, and cross-year generalization experiments further show that risk accuracy remains above 0.84 with stable recognition performance under significant distribution shifts. Ablation studies verify the synergistic contributions of the proposed core modules to overall performance improvement. The results indicate that the proposed framework enables the transition from single recognition to risk-driven plant protection decision-making, providing a technically viable pathway for pest diagnosis and control strategy optimization in fruit and vegetable horticulture.
25 pages, 2805 KB  
Article
CAPG: Context-Aware Perturbation Generation for Multi-Label Adversarial Attacks
by Aidos Askhatuly, Dinara Berdysheva, Azamat Berdyshev, Aigul Adamova and Didar Yedilkhan
Technologies 2026, 14(4), 233; https://doi.org/10.3390/technologies14040233 - 16 Apr 2026
Abstract
Multi-label deep learning models are widely used in real-world applications where predictions depend on the joint presence of several semantically correlated labels. However, existing adversarial attacks largely overlook these inter-label dependencies, often perturbing outputs indiscriminately and producing structurally implausible or easily detectable changes. This paper presents CAPG (Context-Aware Perturbation Generation), a white-box, label-space targeted adversarial framework for generating selective and contextually consistent perturbations in multi-label settings. CAPG incorporates correlation-weighted regularization into the adversarial objective, enabling targeted manipulation of specific labels while preserving the contextual integrity of non-target outputs. Using the Pascal VOC 2012 dataset and a ResNet-101 multi-label classifier, we show that CAPG achieves higher Attack Success Rates (ASR) and substantially improved Contextual Consistency Scores (CCSs) than FGSM, PGD, CW, and DeepFool under identical perturbation budgets. CAPG also produces lower perceptual distortion, yielding adversarial examples that better preserve contextual structure. These results highlight the importance of correlation-aware adversarial evaluation for assessing the robustness of modern multi-label deep learning systems.
(This article belongs to the Section Information and Communication Technologies)
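A correlation-weighted adversarial objective of the kind CAPG describes can be sketched as a target-score term plus a penalty on drift of correlated non-target outputs. The function `capg_objective` and its weighting are assumptions for illustration only, not the paper's formulation.

```python
import numpy as np

def capg_objective(scores, orig_scores, target_idx, corr, lam=1.0):
    """Toy correlation-weighted objective: reward suppressing the target
    label's score, penalize drift on non-target labels in proportion to
    how strongly each correlates with the target."""
    attack_term = scores[target_idx]               # drive this score down
    others = np.delete(np.arange(len(scores)), target_idx)
    drift = np.abs(scores[others] - orig_scores[others])
    context_term = np.sum(corr[others] * drift)    # correlation-weighted drift
    return attack_term + lam * context_term

scores = np.array([0.9, 0.5, 0.4])
corr = np.array([0.0, 0.8, 0.2])   # correlation of each label with label 0
base = capg_objective(scores, scores, 0, corr)
attacked = capg_objective(np.array([0.1, 0.5, 0.4]), scores, 0, corr)
```

Minimizing this objective suppresses the target label while the second term keeps correlated context labels near their original values.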
24 pages, 7018 KB  
Article
Robust Multi-Object Tracking in Dense Swarms with Query Propagation and Adaptive Attention
by Sen Zhang, Weilin Du, Zheng Li and Junmin Rao
Drones 2026, 10(4), 280; https://doi.org/10.3390/drones10040280 - 14 Apr 2026
Abstract
The query propagation paradigm provides a unified theoretical framework for end-to-end multi-object tracking, yet it still faces challenges in complex scenarios involving multi-scale variations, dense interactions, and trajectory fragmentation, including insufficient query initialization quality, imprecise feature alignment, and difficult identity recovery. Building upon MOTRv2, this paper proposes three core improvements. First, we design a geometric prior injection strategy based on sine–cosine encoding, which explicitly encodes target location and scale information into detection queries, providing high-quality initialization for tracking queries. Second, we propose a width–height-modulated deformable attention mechanism that dynamically adjusts the sampling range of deformable convolution according to target size, enabling fine-grained feature matching for multi-scale targets. Third, we construct a motion-direction-consistency-based trajectory re-association module that leverages motion continuity to efficiently recover lost trajectories without introducing additional appearance models. Furthermore, we introduce a progressive joint training strategy that optimizes detection and tracking modules in stages, effectively mitigating gradient competition in multi-task learning. Extensive quantitative and qualitative experiments on the BEE24, UAVSwarm, and VTMOT infrared datasets validate the effectiveness of the proposed method. On the UAVSwarm dataset, our method achieves state-of-the-art performance with 52.4% HOTA, 72.1% MOTA, and only 51 identity switches. Ablation studies further reveal the synergistic enhancement mechanism among the proposed modules.
(This article belongs to the Section Artificial Intelligence in Drones (AID))
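The sine–cosine geometric prior can be sketched as a standard sinusoidal encoding of a normalized box (cx, cy, w, h). The function name, frequency ladder, and output dimensionality below are assumptions, not the paper's exact scheme.

```python
import numpy as np

def geometric_prior(box, dim=8):
    """Sine-cosine encoding of normalized box geometry (cx, cy, w, h),
    in the spirit of injecting location and scale priors into queries."""
    freqs = 2.0 ** np.arange(dim // 2)      # geometric frequency ladder
    enc = []
    for v in box:                           # each of cx, cy, w, h
        enc.append(np.sin(freqs * np.pi * v))
        enc.append(np.cos(freqs * np.pi * v))
    return np.concatenate(enc)              # length 4 * dim

enc = geometric_prior((0.5, 0.5, 0.2, 0.1))
```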
24 pages, 4781 KB  
Article
DFDP-QuadDiff: A Dual-Frequency Dual-Polarization Quad-Differential Framework for Weak-Echo Ship Target Detection in GNSS-Based Bistatic Synthetic Aperture Radar
by Gang Yang, Tianwen Zhang, Zhen Chen, Bingxiu Yao, Yucong He, Dunyun He, Tianyi Wei and Qinglin He
Remote Sens. 2026, 18(8), 1130; https://doi.org/10.3390/rs18081130 - 10 Apr 2026
Abstract
Weak-echo ship target detection in GNSS-based bistatic synthetic aperture radar is severely limited by the coupled effects of burst-type strong windows and polarization mismatch, cross-frequency mis-registration, and long-sequence chain drift in dual-frequency dual-polarization observations. To address these issues, this paper proposes DFDP-QuadDiff, a dual-frequency dual-polarization quad-differential framework for weak-echo ship target detection using B1/B3 × horizontal–horizontal (HH)/vertical–vertical (VV) four-channel complex range-time data. The proposed framework integrates polarization-consistency-driven strong-window suppression, intra-band adaptive polarimetric synthesis, joint delay–Doppler–phase cross-frequency registration, segment-wise Jones drift calibration, and quality-aware final fusion in a unified hierarchical processing chain. In this way, multi-source inconsistencies are progressively constrained and suppressed from the polarization level to the segment level before final accumulation and detection are performed. Experimental results on self-developed four-channel GNSS-S demonstrate that, relative to the best raw single-channel result, the proposed framework increases the median SCR from 6.51 dB to 9.04 dB (+2.53 dB), improves the P10 SCR from −1.76 dB to 3.05 dB (+4.81 dB), and raises the track continuity from 0.85 to 0.97. In addition, the standard deviation of segment-wise delay drift is reduced from 0.97 bin to 0.29 bin, and positive multi-scale accumulation gains are maintained up to the second-long integration range. 
These results indicate that the proposed framework not only substantially enhances the stability, continuity, and long-time integrability of weak-target responses under low-SNR maritime conditions, but also maintains robust gains under weak-visibility, interference-dominant, and mismatch-sensitive local conditions in the stratified evaluation, thereby establishing a physically interpretable and implementation-ready solution for collaborative weak-target detection in dual-band dual-polarization GNSS-S.
(This article belongs to the Special Issue Recent Advances in SAR Object Detection)
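The reported SCR statistics (median and P10, in dB) follow from the usual signal-to-clutter definition; the sample powers below are made up for illustration only.

```python
import numpy as np

def scr_db(target_power, clutter_power):
    """Signal-to-clutter ratio in dB for a detected cell."""
    return 10.0 * np.log10(target_power / clutter_power)

# Summarizing a set of detections the way the paper reports it:
# median SCR and the 10th-percentile (P10) SCR over all targets.
scrs = np.array([scr_db(p, 1.0) for p in [2.0, 4.0, 8.0, 16.0]])
median_scr = np.median(scrs)
p10_scr = np.percentile(scrs, 10)
```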
25 pages, 42196 KB  
Article
Frequency–Spatial Domain Jointly Guided Perceptual Network for Infrared Small Target Detection
by Yeteng Han, Minrui Ye, Bohan Liu, Jie Li, Chaoxian Jia, Wennan Cui and Tao Zhang
Remote Sens. 2026, 18(7), 1000; https://doi.org/10.3390/rs18071000 - 26 Mar 2026
Abstract
Infrared small target detection is a critical task in remote sensing. However, it remains highly challenging due to low contrast, heavy background clutter, and large variations in target scale. Traditional convolutional networks are inadequate for joint modeling, as they cannot effectively capture both fine structural details and global contextual dependencies. To address these issues, we propose FSGPNet, a frequency–spatial domain jointly guided perceptual network that explicitly exploits complementary representations in both the frequency and spatial domains. Specifically, a Frequency–Spatial Enhancement Module (FSEM) is introduced to strengthen target details while suppressing background interference through high-frequency enhancement and Perona–Malik diffusion. To enhance global context modeling, we propose a Multi-Scale Global Perception (MSGP) module that integrates non-local attention with multi-scale dilated convolutions, enabling robust background modeling. Furthermore, a Gabor Transformer Attention Module (GTAM) is designed to achieve selective frequency–spatial feature aggregation via self-attention over multi-directional and multi-scale Gabor responses, effectively highlighting discriminative structures of various small targets. Extensive experiments are conducted on two benchmark datasets (IRSTD-1K and NUDT-SIRST) that cover typical remote sensing infrared scenarios. Quantitative and qualitative results demonstrate that FSGPNet consistently outperforms state-of-the-art methods across multiple evaluation metrics. These findings validate the effectiveness and robustness of the proposed FSGPNet for detecting small infrared targets in remote sensing applications.
(This article belongs to the Special Issue Deep Learning-Based Small-Target Detection in Remote Sensing)
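The Perona–Malik diffusion used inside FSEM is a classical anisotropic smoothing scheme that flattens clutter while preserving edges. One explicit iteration in NumPy, with an assumed exponential edge-stopping function and step size:

```python
import numpy as np

def perona_malik_step(img, kappa=0.1, dt=0.2):
    """One explicit Perona-Malik diffusion step: smooth flat regions while
    preserving edges, which helps separate small targets from clutter."""
    # Four-neighbour gradients (zero flux at the border via edge padding).
    p = np.pad(img, 1, mode="edge")
    dn = p[:-2, 1:-1] - img
    ds = p[2:, 1:-1] - img
    de = p[1:-1, 2:] - img
    dw = p[1:-1, :-2] - img
    g = lambda d: np.exp(-(d / kappa) ** 2)   # edge-stopping function
    return img + dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
```

A uniform region is a fixed point of the update, while sharp intensity steps diffuse very slowly because g() vanishes for large gradients.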
22 pages, 17744 KB  
Article
Task-Aware Low-Light Image Enhancement Method for Underground Coal Mine Monitoring
by Zhirui Yan, Yaru Li, Hongwei Wang, Zhixin Jin, Lei Tao and Yide Geng
Sensors 2026, 26(6), 1886; https://doi.org/10.3390/s26061886 - 17 Mar 2026
Abstract
Video AI recognition is crucial for coal mine safety, but complex environments often yield low-quality images, hindering intelligent monitoring. Existing enhancement methods typically focus on image quality alone, lacking adaptability to specific tasks. Therefore, we propose Mine-DCE-YDT: a task-aware low-light image enhancement model that jointly optimizes enhancement with downstream object detection, ensuring enhanced images are both visually clearer and more conducive to accurate detection. Firstly, an improved Zero-DCE algorithm (Mine-DCE) is presented by introducing a Brightness-aware Mask Coordinate Attention (BMCA) module to improve illumination balance in the Value channel of the HSV image and a Multi-scale Detail Enhancement (MDE) module to reinforce textures and suppress noise. Then, Mine-DCE is co-modeled with YOLOv11n by training end-to-end via a joint loss fusing detection and enhancement quality losses to form Mine-DCE-YDT, which can enhance specific details containing image detection targets. Experimental results show that compared with Zero-DCE, Mine-DCE-YDT achieves reductions of 9.5% in NIQE and 35.5% in BRISQUE on the custom-constructed MineDataset and exhibits strong enhancement performance on the public dataset LOL-V1. For the miner detection task in MineDataset, the integration of Mine-DCE-YDT with YOLOv11n achieves increases of 2.8% and 8.3% in mAP@0.5 and mAP@0.5:0.95, demonstrating its effectiveness in enhancing task-critical image features.
(This article belongs to the Section Sensing and Imaging)
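Zero-DCE, which Mine-DCE builds on, enhances an image by iteratively applying a per-pixel quadratic curve LE(x) = x + α·x·(1 − x). A minimal scalar sketch; in the real model the curve maps α are predicted per pixel by the network rather than fixed as here.

```python
def zero_dce_curve(x, alphas):
    """Zero-DCE-style iterative light-enhancement curve:
    x <- x + a * x * (1 - x), applied once per curve parameter a.
    For a in [-1, 1] and x in [0, 1], the output stays in [0, 1]."""
    for a in alphas:
        x = x + a * x * (1.0 - x)
    return x
```

Note the fixed points at 0 and 1: pure black and pure white are unchanged, while mid-tones brighten for positive α.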
35 pages, 6720 KB  
Article
Vision-Based Vehicle State and Behavior Analysis for Aircraft Stand Safety
by Ke Tang, Liang Zeng, Tianxiong Zhang, Di Zhu, Wenjie Liu and Xinping Zhu
Sensors 2026, 26(6), 1821; https://doi.org/10.3390/s26061821 - 13 Mar 2026
Abstract
With the continuous elevation of aviation safety standards, accurate monitoring of ground support vehicles in aircraft stand areas has become a critical task for enhancing overall aircraft stand operational safety. Given the limitations of existing surface movement radar and multi-camera surveillance systems in terms of cost, deployment complexity, and coverage, this paper proposes a lightweight vision-based framework for vehicle state perception and spatiotemporal behavior analysis oriented toward aircraft stand safety. Leveraging existing fixed monocular monitoring resources in the stand area, the framework first establishes a precise mapping from image pixel coordinates to the physical plane through self-calibration and homography transformation utilizing scene line features, thereby achieving unified spatial measurement of vehicle targets. Subsequently, it integrates an improved lightweight YOLO detector (incorporating Ghost modules and CBAM for noise suppression) with the ByteTrack tracking algorithm to enable stable extraction of vehicle trajectories under complex occlusion conditions. Finally, by combining functional zone division within the stand, a semantic map is constructed, and a behavior analysis method based on a spatiotemporal finite state machine is proposed. This method performs joint reasoning by fusing multi-dimensional constraints including position, zone, and time, enabling automatic detection of abnormal behaviors such as “intrusion into restricted areas” and “abnormal stop.” Quantitative evaluations demonstrate the framework’s efficacy: it achieves an average physical localization error (RMSE) of 0.32 m, and the improved detection model reaches an accuracy (mAP@50) of 90.4% for ground support vehicles. In tests simulating typical violation scenarios, the system achieved high recall (96.0%) and precision (95.8%) rates in detecting ‘area intrusion’ and ‘abnormal stop’ violations, respectively. 
These results, achieved using only existing surveillance cameras, validate its potential as a cost-effective and easily deployable tool to augment existing safety monitoring systems for airport ground operations.
(This article belongs to the Special Issue Intelligent Sensing and Control Technology for Unmanned Vehicles)
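The pixel-to-plane mapping obtained from self-calibration is a plain homography application. A sketch assuming the 3×3 matrix H has already been estimated from scene line features:

```python
import numpy as np

def pixel_to_plane(H, uv):
    """Map a pixel coordinate (u, v) to the physical ground plane through
    a 3x3 homography H (assumed known from self-calibration)."""
    u, v = uv
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w   # dehomogenize

# With the identity homography, pixels map to themselves;
# a real H would carry pixels into metric ground coordinates.
```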
20 pages, 4810 KB  
Article
Unauthorized Expressway Parking Detection Based on Spatiotemporal Analysis of Vehicle–Structure Distances Using UAV Aerial Images
by Xiaolong Gong, Haiqing Liu, Yuehao Wang, Yaxin Wei and Guoran Shi
Vehicles 2026, 8(3), 49; https://doi.org/10.3390/vehicles8030049 - 6 Mar 2026
Abstract
Owing to their high-altitude vantage point and maneuverability, unmanned aerial vehicles (UAVs) have emerged as an effective technical solution for real-time parking detection in expressway scenarios. Using UAV cruise-perspective images, this paper proposes an unauthorized parking detection method by analyzing the time-series variations in the relative distances between the moving vehicle and static structure as a reference. Firstly, vehicle and static structure targets are recognized and tracked by DeepSort, and a Vehicle–Structure (V-S) distance matrix is further constructed to describe their frame-wise relative positions in the pixel coordinate system. Then, to eliminate the radial scale errors caused by perspective distortion, a scale factor (SF) index is introduced to correct the original V-S matrix and provide a more accurate spatiotemporal representation. Finally, the stationarity of the distance series in the V-S matrix is tested using the Augmented Dickey–Fuller (ADF) test, and a parking detection method is proposed by introducing the parking support ratio (PSR) to establish a multi-structure joint decision scheme. Experimental results show that the corrected V-S matrix can faithfully describe the spatial positional relationship between road vehicles and static structures. With the optimal PSR threshold ψ0 and time window T, the proposed method achieves better overall parking-detection performance in terms of accuracy, precision, recall, and F1-score in comparison with a traditional speed threshold approach.
(This article belongs to the Special Issue Air Vehicle Operations: Opportunities, Challenges and Future Trends)
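Given per-structure stationarity verdicts (e.g. from ADF tests on the corrected V-S distance series), the PSR joint decision reduces to a voting ratio. The 0.6 threshold below is an assumed placeholder, not the paper's optimal ψ0.

```python
def parking_support_ratio(stationary_flags):
    """Fraction of reference structures whose vehicle-structure distance
    series tested stationary (e.g. via an ADF test)."""
    return sum(stationary_flags) / len(stationary_flags)

def is_parked(stationary_flags, psr_threshold=0.6):
    """Multi-structure joint decision: declare a parking event when enough
    structures agree that the distance series is stationary."""
    return parking_support_ratio(stationary_flags) >= psr_threshold
```

A stationary distance to most static structures means the vehicle is not moving relative to the scene, hence parked.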
32 pages, 7101 KB  
Article
A PMBM Filter for Tracking Coexisting Point and Group Targets with Target Spawning and Generalized Measurement Models
by Jichuan Zhang, Qi Jiang, Longxiang Jiao, Weidong Li and Cheng Hu
Remote Sens. 2026, 18(5), 769; https://doi.org/10.3390/rs18050769 - 3 Mar 2026
Abstract
Accurate multi-target filtering is crucial for low-altitude surveillance, where point and group targets often coexist. Poisson multi-Bernoulli mixture (PMBM) filters provide a unified Bayesian framework for the joint filtering of point and group targets under the assumptions of independent target dynamics and standard measurement models. However, in practical scenarios, group targets may generate new targets through member separation, while point targets may produce multiple measurements due to multi-beam sensing and micro-Doppler signatures. These phenomena violate the assumptions of existing PMBM filters and lead to degraded state estimation and target-type inference. To address these challenges, this paper proposes a modified PMBM filter with group target spawning and generalized measurement models for coexisting point and group targets. Specifically, a group-dependent spawning model is incorporated into the prediction step to enable timely detection of newly spawned targets. In addition, a generalized update function is developed to support point-target density updates with measurement sets of arbitrary cardinality, and a measurement-rate-based correction factor is introduced to improve target-type estimation under nonstandard measurement conditions. Furthermore, an efficient Poisson multi-Bernoulli approximation is derived to reduce computational complexity. The effectiveness of the proposed filter is verified through simulation and experimental results.
(This article belongs to the Special Issue Radar Data Processing and Analysis)
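The measurement-rate evidence behind target-type estimation can be illustrated with Poisson likelihoods: a point target with a low expected measurement rate versus a group target with a high one. The rates below are illustrative assumptions, not the paper's correction factor.

```python
import math

def type_likelihood_ratio(m, rate_point, rate_group):
    """Poisson measurement-rate evidence for target type: likelihood ratio
    of observing m measurements under point-target vs group-target rates."""
    def pois(k, lam):
        return math.exp(-lam) * lam ** k / math.factorial(k)
    return pois(m, rate_point) / pois(m, rate_group)

# A single measurement favors the point hypothesis (ratio > 1);
# many measurements favor the group hypothesis (ratio < 1).
```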
27 pages, 7867 KB  
Article
A Multi-Scale Object Detection Network with Integrated Spatial-Channel Collaborative Attention for Remote Sensing Images
by Lijun Ma, Chengjun Xu, Kun Jiao, Wenming Pei, Hongfei Zhang, Lanfeng Liu, Bin Deng and Juan Wu
Sensors 2026, 26(4), 1370; https://doi.org/10.3390/s26041370 - 21 Feb 2026
Abstract
In remote sensing object detection, current models typically employ feature extraction modules and attention mechanisms to tackle issues such as significant scale variations among targets, cluttered backgrounds, and the subtle characteristics of small objects. Nevertheless, existing feature extraction approaches often depend on convolution kernels with fixed sizes, which can blur the contours of large objects and provide inadequate feature representation for small objects. Moreover, many attention mechanisms simply combine spatial and channel attention, without fully considering the deep integration between spatial and channel features, consequently leading to high-dimensional features and considerable computational overhead. To overcome these shortcomings, this paper introduces a multi-scale object detection network with integrated spatial-channel collaborative attention for remote sensing images. This approach enhances feature perception and representation for multi-scale targets, particularly small targets, through the design of the cross-channel multi-scale feature extraction module (CC-MSFE). Furthermore, a new channel-spatial cross-attention mechanism (CSCA) is introduced, comprising the channel attention mechanism (CA), the spatial attention mechanism (SA), and the cross-attention fusion module (CAFM). This design fosters dynamic interaction and joint optimization across channel and spatial dimensions, thereby improving detection accuracy while effectively reducing computational cost. The efficacy of the proposed model is evaluated on three publicly available remote sensing datasets. Experimental results show that the model achieves a mAP of 78.1% on the DIOR dataset and of 90.6% on the HRRSD dataset, outperforming YOLOv11 by 0.7% and 1.4%, respectively. On the RSOD dataset, it attains a mAP of 96.5%, surpassing YOLOv8 by 2.1%. 
In addition, the proposed method maintains a notably lower parameter count and computational complexity compared to existing approaches, achieving an effective balance between detection accuracy and computational efficiency. Full article
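The CSCA idea described above — a channel-attention branch and a spatial-attention branch whose outputs re-weight each other before fusion — can be illustrated with a minimal NumPy sketch. This is an illustration of the general pattern only, not the paper's CC-MSFE/CSCA implementation; the function names, the sigmoid gating, and the averaging fusion rule are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W); squeeze spatial dims into per-channel weights
    w = sigmoid(feat.mean(axis=(1, 2)))           # (C,)
    return feat * w[:, None, None]

def spatial_attention(feat):
    # collapse channels into a single spatial saliency map
    m = sigmoid(feat.mean(axis=0))                # (H, W)
    return feat * m[None, :, :]

def cross_attention_fusion(feat):
    # toy stand-in for cross-attention fusion: run both branches,
    # then let each branch gate the other before averaging
    ca = channel_attention(feat)
    sa = spatial_attention(feat)
    return 0.5 * (ca * sigmoid(sa) + sa * sigmoid(ca))

feat = np.random.default_rng(0).normal(size=(4, 8, 8))
out = cross_attention_fusion(feat)
print(out.shape)  # (4, 8, 8)
```

The appeal of this layout is that neither branch dominates: channel statistics modulate the spatial map and vice versa, which is the "dynamic interaction and joint optimization across channel and spatial dimensions" the abstract refers to.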
(This article belongs to the Section Remote Sensors)

25 pages, 3654 KB  
Article
MDF2Former: Multi-Scale Dual-Domain Feature Fusion Transformer for Hyperspectral Image Classification of Bacteria in Murine Wounds
by Decheng Wu, Wendan Liu, Rui Li, Xudong Fu, Lin Tao, Yinli Tian, Anqiang Zhang, Zhen Wang and Hao Tang
J. Imaging 2026, 12(2), 90; https://doi.org/10.3390/jimaging12020090 - 19 Feb 2026
Viewed by 368
Abstract
Bacterial wound infection poses a major challenge in trauma care and can lead to severe complications such as sepsis and organ failure. Rapid and accurate identification of the pathogen, along with targeted intervention, is therefore vital for improving treatment outcomes and reducing risks. However, current detection methods are still constrained by procedural complexity and long processing times. In this study, a hyperspectral imaging (HSI) acquisition system for bacterial analysis and a multi-scale dual-domain feature fusion transformer (MDF2Former) were developed for classifying wound bacteria. MDF2Former integrates three modules: a multi-scale feature enhancement and fusion module that generates tokens with multi-scale discriminative representations, a spatial–spectral dual-branch attention module that strengthens joint feature modeling, and a frequency and spatial–spectral domain encoding module that captures global and local interactions among tokens through a hierarchical stacking structure, enabling more efficient feature learning. Extensive experiments on our self-constructed HSI dataset of typical wound bacteria demonstrate that MDF2Former achieves outstanding performance across five metrics: Accuracy (91.94%), Precision (92.26%), Recall (91.94%), F1-score (92.01%), and Kappa coefficient (90.73%), surpassing all comparative models. These results verify the effectiveness of combining HSI with deep learning for bacterial identification and highlight its potential for assisting bacterial species identification and personalized treatment decisions in wound infections. Full article
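The multi-scale token generation step described above can be sketched as follows: average-pool a hyperspectral patch at several spatial scales and flatten each pooled grid into spectral tokens for a transformer encoder. This is a toy NumPy sketch under assumed shapes; the function name and the pooling scales are illustrative, not MDF2Former's actual module.

```python
import numpy as np

def multiscale_tokens(cube, scales=(1, 2, 4)):
    # cube: (H, W, B) hyperspectral patch with B spectral bands.
    # For each scale s, average-pool s x s spatial blocks, then
    # flatten the pooled grid into one spectral token per cell.
    H, W, B = cube.shape
    tokens = []
    for s in scales:
        h, w = H // s, W // s
        pooled = cube[:h * s, :w * s].reshape(h, s, w, s, B).mean(axis=(1, 3))
        tokens.append(pooled.reshape(-1, B))
    # concatenated token sequence: sum over s of (H//s)*(W//s) tokens
    return np.concatenate(tokens, axis=0)

cube = np.random.default_rng(1).normal(size=(8, 8, 16))
tok = multiscale_tokens(cube)
print(tok.shape)  # (84, 16): 64 + 16 + 4 tokens of 16 bands each
```

Coarser scales summarize larger spatial context while the finest scale keeps per-pixel spectra, giving the encoder tokens with multi-scale discriminative representations as the abstract describes.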
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)

27 pages, 4522 KB  
Article
Multi-Object Detection of Forage Density and Dairy Cow Feeding Behavior Based on an Improved YOLOv10 Model for Smart Pasture Applications
by Zhiwei Liu, Jiandong Fang and Yudong Zhao
Sensors 2026, 26(4), 1273; https://doi.org/10.3390/s26041273 - 15 Feb 2026
Viewed by 460
Abstract
In modern smart dairy farms, precise feed management and accurate monitoring of dairy cows’ feeding behavior are crucial for improving production efficiency and reducing feeding costs. However, in practical applications, complex environmental factors such as varying illumination, frequent occlusion, and densely packed targets pose significant challenges to real-time visual perception. To address these issues, this paper proposes a lightweight multi-target detection model, BFDet-YOLO, for the joint detection of dairy cows’ feeding behavior and feed density levels in pasture environments. Based on the YOLOv10 framework, the model incorporates four targeted improvements: (1) a bidirectional feature pyramid network (BiFPN) to address insufficient multi-scale feature interaction between dairy cows (large targets) and feed particles (small targets); (2) a lightweight downsampling module (ADown) to preserve fine-grained features of feed particles and reduce missed detections of small targets; (3) an attention-enhanced detection head (SEAM) to mitigate occlusion caused by overlapping cows and accumulated feed; (4) an improved bounding-box regression loss (DIoU) to improve the localization accuracy of non-overlapping small targets. Additionally, this paper constructs a pasture-specific dataset integrating dairy cows’ feeding behavior and feed distribution information, annotated and expanded by combining public datasets with on-site monitoring data. Experimental results demonstrate that BFDet-YOLO outperforms the original YOLOv10 and other mainstream detection models in detection accuracy and robustness while maintaining a significantly smaller model scale. On the constructed dataset, the model achieves 95.7% mAP@0.5 and 70.7% mAP@0.5:0.95 with only 1.85 M parameters. 
These results validate the effectiveness and deployability of the proposed method, providing a reliable visual perception solution for intelligent feeding systems and smart pasture management. Full article
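The DIoU loss cited in improvement (4) is a standard formulation: it subtracts a normalized center-distance penalty from the IoU, so even boxes with zero overlap still receive a useful localization gradient. A minimal plain-Python/NumPy-free sketch (the box layout `(x1, y1, x2, y2)` and the function name are assumptions for illustration):

```python
def diou_loss(box_a, box_b):
    # boxes as (x1, y1, x2, y2); DIoU loss = 1 - IoU + d^2 / c^2,
    # where d is the distance between box centers and c is the
    # diagonal of the smallest box enclosing both.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection and union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # squared center distance over squared enclosing-box diagonal
    d2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2
    return 1.0 - iou + d2 / c2

# identical boxes -> zero loss; disjoint boxes are still penalized
# in proportion to their center distance, unlike plain IoU loss
print(diou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # 0.0
```

This distance term is what makes DIoU suitable for the paper's non-overlapping small feed-particle targets, where plain IoU loss would give no gradient at all.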
(This article belongs to the Section Sensing and Imaging)
