Search Results (61)

Search Parameters:
Keywords = arbitrary-oriented object detection

24 pages, 3590 KB  
Article
Rotation-Sensitive Feature Enhancement Network for Oriented Object Detection in Remote Sensing Images
by Jiaxin Xu, Hua Huo, Shilu Kang, Aokun Mei and Chen Zhang
Sensors 2026, 26(2), 381; https://doi.org/10.3390/s26020381 - 7 Jan 2026
Viewed by 205
Abstract
Oriented object detection in remote sensing images remains a challenging task due to arbitrary target rotations, extreme scale variations, and complex backgrounds. However, current rotated detectors still face several limitations: insufficient orientation-sensitive feature representation, feature misalignment for rotated proposals, and unstable optimization of rotation parameters. To address these issues, this paper proposes an enhanced Rotation-Sensitive Feature Pyramid Network (RSFPN) framework. Building upon the effective Oriented R-CNN paradigm, we introduce three novel core components: (1) a Dynamic Adaptive Feature Pyramid Network (DAFPN) that enables bidirectional multi-scale feature fusion through semantic-guided upsampling and structure-enhanced downsampling paths; (2) an Angle-Aware Collaborative Attention (AACA) module that incorporates orientation priors to guide feature refinement; (3) a Geometrically Consistent Multi-Task Loss (GC-MTL) that unifies the regression of rotation parameters with periodic smoothing and adaptive weight mechanisms. Comprehensive experiments on the DOTA-v1.0 and HRSC2016 benchmarks show that our RSFPN achieves superior performance. It attains a state-of-the-art mAP of 77.42% on DOTA-v1.0 and 91.85% on HRSC2016, while maintaining efficient inference at 14.5 FPS, demonstrating a favorable accuracy-efficiency trade-off. Visual analysis confirms that our method produces concentrated, rotation-aware feature responses and effectively suppresses background interference. The proposed approach provides a robust solution for detecting multi-oriented objects in high-resolution remote sensing imagery, with significant practical value for urban planning, environmental monitoring, and security applications. Full article
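As illustrative background for the periodic-smoothing idea this abstract mentions (a generic sketch, not code from the paper; all names are hypothetical), an angle-regression loss can wrap the angular difference so that a box and its 180°-rotated copy incur no penalty:

```python
import math

def wrap_angle(delta):
    """Wrap an angle difference into [-pi/2, pi/2), so a box and its
    180-degree rotation count as the same orientation."""
    return (delta + math.pi / 2) % math.pi - math.pi / 2

def smooth_l1(x, beta=1.0 / 9.0):
    """Standard smooth-L1, applied to the wrapped angle difference."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta

def periodic_angle_loss(pred, target):
    # The wrap removes the discontinuity at the +-pi/2 boundary that
    # plain L1 regression on raw angles suffers from.
    return smooth_l1(wrap_angle(pred - target))
```

This sketches only the periodicity handling; the paper's GC-MTL additionally couples the angle term with adaptive task weights.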

29 pages, 3367 KB  
Article
Small Object Detection in Synthetic Aperture Radar with Modular Feature Encoding and Vectorized Box Regression
by Xinmiao Du and Xihong Wu
Remote Sens. 2025, 17(17), 3094; https://doi.org/10.3390/rs17173094 - 5 Sep 2025
Cited by 2 | Viewed by 1785
Abstract
Object detection in synthetic aperture radar (SAR) imagery poses significant challenges due to low resolution, small objects, arbitrary orientations, and complex backgrounds. Standard object detectors often fail to capture sufficient semantic and geometric cues for such tiny targets. To address this issue, a new Convolutional Neural Network (CNN) framework called Deformable Vectorized Detection Network (DVDNet) has been proposed, specifically designed for detecting small, oriented, and densely packed objects in SAR images. The DVDNet consists of Grouped-Deformable Convolution for adaptive receptive field adjustment to diverse object scales, a Local Binary Pattern (LBP) Enhancement Module that enriches texture representations and enhances the visibility of small or camouflaged objects, and a Vector Decomposition Module that enables accurate regression of oriented bounding boxes via learnable geometric vectors. The DVDNet is embedded in a two-stage detection architecture and is particularly effective in preserving fine-grained features critical for small object localization. The performance of DVDNet is validated on two SAR small target detection datasets, HRSID and SSDD, and it is experimentally demonstrated that it achieves 90.9% mAP on HRSID and 87.2% mAP on SSDD. The generalizability of DVDNet is also verified on the self-built SAR ship dataset and the remote sensing optical dataset HRSC2016. All these experiments show that DVDNet outperforms standard detectors. Notably, our framework shows substantial gains in precision and recall for small object subsets, validating the importance of combining deformable sampling, texture enhancement, and vector-based box representation for high-fidelity small object detection in complex SAR scenes. Full article
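The vector-based box representation mentioned above can be sketched generically: an oriented box is determined by its center plus two perpendicular half-axis vectors, from which the corners follow directly. This is an illustration of the general idea, not DVDNet's actual module, and all names are hypothetical:

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Corners of an oriented box from center, size, and angle, via the
    two half-axis vectors (w/2)(cos t, sin t) and (h/2)(-sin t, cos t)."""
    ux, uy = (w / 2) * math.cos(theta), (w / 2) * math.sin(theta)
    vx, vy = -(h / 2) * math.sin(theta), (h / 2) * math.cos(theta)
    # The four corners are the center plus all sign combinations of u and v.
    return [
        (cx + ux + vx, cy + uy + vy),
        (cx - ux + vx, cy - uy + vy),
        (cx - ux - vx, cy - uy - vy),
        (cx + ux - vx, cy + uy - vy),
    ]
```

Regressing the vector components (ux, uy, vx, vy) instead of a raw angle is one way such modules avoid angle discontinuities.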
(This article belongs to the Special Issue Deep Learning Techniques and Applications of MIMO Radar Theory)

20 pages, 28680 KB  
Article
SN-YOLO: A Rotation Detection Method for Tomato Harvest in Greenhouses
by Jinlong Chen, Ruixue Yu, Minghao Yang, Wujun Che, Yi Ning and Yongsong Zhan
Electronics 2025, 14(16), 3243; https://doi.org/10.3390/electronics14163243 - 15 Aug 2025
Cited by 2 | Viewed by 934
Abstract
Accurate detection of tomato fruits is a critical component in vision-guided robotic harvesting systems, which play an increasingly important role in automated agriculture. However, this task is challenged by variable lighting conditions and background clutter in natural environments. In addition, the arbitrary orientations of fruits reduce the effectiveness of traditional horizontal bounding boxes. To address these challenges, we propose a novel object detection framework named SN-YOLO. First, we introduce the StarNet backbone to enhance the extraction of fine-grained features, thereby improving the detection performance in cluttered backgrounds. Second, we design a Color-Prior Spatial-Channel Attention (CPSCA) module that incorporates red-channel priors to strengthen the model’s focus on salient fruit regions. Third, we implement a multi-level attention fusion strategy to promote effective feature integration across different layers, enhancing background suppression and object discrimination. Furthermore, oriented bounding boxes improve localization precision by better aligning with the actual fruit shapes and poses. Experiments conducted on a custom tomato dataset demonstrate that SN-YOLO outperforms the baseline YOLOv8 OBB, achieving a 1.0% improvement in precision and a 0.8% increase in mAP@0.5. These results confirm the robustness and accuracy of the proposed method under complex field conditions. Overall, SN-YOLO provides a practical and efficient solution for fruit detection in automated harvesting systems, contributing to the deployment of computer vision techniques in smart agriculture. Full article

22 pages, 6201 KB  
Article
SOAM Block: A Scale–Orientation-Aware Module for Efficient Object Detection in Remote Sensing Imagery
by Yi Chen, Zhidong Wang, Zhipeng Xiong, Yufeng Zhang and Xinqi Xu
Symmetry 2025, 17(8), 1251; https://doi.org/10.3390/sym17081251 - 6 Aug 2025
Cited by 1 | Viewed by 701
Abstract
Object detection in remote sensing imagery is critical in environmental monitoring, urban planning, and land resource management. However, the task remains challenging due to significant scale variations, arbitrary object orientations, and complex background clutter. To address these issues, we propose a novel orientation module (SOAM Block) that jointly models object scale and directional features while exploiting geometric symmetry inherent in many remote sensing targets. The SOAM Block is constructed upon a lightweight and efficient Adaptive Multi-Scale (AMS) Module, which utilizes a symmetric arrangement of parallel depth-wise convolutional branches with varied kernel sizes to extract fine-grained multi-scale features without dilation, thereby preserving local context and enhancing scale adaptability. In addition, a Strip-based Context Attention (SCA) mechanism is introduced to model long-range spatial dependencies, leveraging horizontal and vertical 1D strip convolutions in a directionally symmetric fashion. This design captures spatial correlations between distant regions and reinforces semantic consistency in cluttered scenes. Importantly, this work is the first to explicitly analyze the coupling between object scale and orientation in remote sensing imagery. The proposed method addresses the limitations of fixed receptive fields in capturing symmetric directional cues of large-scale objects. Extensive experiments are conducted on two widely used benchmarks—DOTA and HRSC2016—both of which exhibit significant scale variations and orientation diversity. Results demonstrate that our approach achieves superior detection accuracy with fewer parameters and lower computational overhead compared to state-of-the-art methods. The proposed SOAM Block thus offers a robust, scalable, and symmetry-aware solution for high-precision object detection in complex aerial scenes. Full article
(This article belongs to the Section Computer)

21 pages, 6270 KB  
Article
Cross-Level Adaptive Feature Aggregation Network for Arbitrary-Oriented SAR Ship Detection
by Lu Qian, Junyi Hu, Haohao Ren, Jie Lin, Xu Luo, Lin Zou and Yun Zhou
Remote Sens. 2025, 17(10), 1770; https://doi.org/10.3390/rs17101770 - 19 May 2025
Cited by 4 | Viewed by 925
Abstract
The rapid progress of deep learning has significantly enhanced the development of ship detection using synthetic aperture radar (SAR). However, the diversity of ship sizes, arbitrary orientations, densely arranged ships, etc., have been hindering the improvement of SAR ship detection accuracy. In response to these challenges, this study introduces a new detection approach called a cross-level adaptive feature aggregation network (CLAFANet) to achieve arbitrary-oriented multi-scale SAR ship detection. Specifically, we first construct a hierarchical backbone network based on a residual architecture to extract multi-scale features of ship objects from large-scale SAR imagery. Considering the multi-scale nature of ship objects, we then resort to the idea of self-attention to develop a cross-level adaptive feature aggregation (CLAFA) mechanism, which can not only alleviate the semantic gap between cross-level features but also improve the feature representation capabilities of multi-scale ships. To better adapt to the arbitrary orientation of ship objects in real application scenarios, we put forward a frequency-selective phase-shifting coder (FSPSC) module for arbitrary-oriented SAR ship detection tasks, which is dedicated to mapping the rotation angle of the object bounding box to different phases and exploits frequency-selective phase-shifting to solve the periodic ambiguity problem of the rotated bounding box. Qualitative and quantitative experiments conducted on two public datasets demonstrate that the proposed CLAFANet achieves competitive performance compared to some state-of-the-art methods in arbitrary-oriented SAR ship detection. Full article
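A standard way to sidestep the periodic ambiguity this abstract describes is to map the angle onto a phase via angle doubling, so that an orientation and its 180°-rotated twin encode identically. The sketch below illustrates that general idea only, not the paper's frequency-selective FSPSC coder; names are hypothetical:

```python
import math

def encode_angle(theta):
    # Doubling the angle makes theta and theta + pi map to the same
    # point on the unit circle, removing the period-pi ambiguity.
    return math.cos(2 * theta), math.sin(2 * theta)

def decode_angle(c, s):
    # Recover the orientation in (-pi/2, pi/2] from the encoded phase.
    return math.atan2(s, c) / 2
```

Regressing the continuous pair (cos 2θ, sin 2θ) avoids the loss spike that occurs when a raw angle target wraps around at the period boundary.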

23 pages, 10175 KB  
Article
Feature-Guided Instance Mining and Task-Aligned Focal Loss for Weakly Supervised Object Detection in Remote Sensing Images
by Jinlin Tan, Chenhao Wang, Xiaomin Tan, Min Zhang and Hai Wang
Remote Sens. 2025, 17(10), 1673; https://doi.org/10.3390/rs17101673 - 9 May 2025
Cited by 1 | Viewed by 1055
Abstract
Weakly supervised object detection (WSOD) in remote sensing images (RSIs) aims to achieve high-value object classification and localization using only image-level labels, and it has a wide range of applications. However, existing popular WSOD models still encounter two challenges. First, these WSOD models typically select the highest-scoring proposal as the seed instance while ignoring lower-scoring ones, resulting in some less-obvious objects being missed. Second, current models fail to ensure consistency between classification and regression, limiting the upper bound of WSOD performance. To address the first challenge, we propose a feature-guided seed instance mining (FGSIM) strategy to mine reliable seed instances. Specifically, FGSIM first selects multiple high-scoring proposals as seed instances and then leverages a feature similarity measure to mine additional seed instances among lower-scoring proposals. Furthermore, a contrastive loss is introduced to construct a credible similarity threshold for FGSIM by leveraging the consistent feature representations of instances within the same category. To address the second challenge, a task-aligned focal (TAF) loss is proposed to enforce consistency between classification and regression. Specifically, the localization difficulty score and classification difficulty score are used as weights for the regression and classification losses, respectively, thereby promoting their synchronous optimization by minimizing the TAF loss. Additionally, rotated images are incorporated into the baseline to encourage the model to make consistent predictions for objects with arbitrary orientations. Ablation studies validate the effectiveness of FGSIM, TAF loss, and their combination. Comparisons with popular models on two RSI datasets further demonstrate the superiority of our approach. Full article

19 pages, 8533 KB  
Article
Rotation-Invariant Feature Enhancement with Dual-Aspect Loss for Arbitrary-Oriented Object Detection in Remote Sensing
by Zhao Hu, Xiangfu Meng, Xinsong Liu and Zhuxiang Sun
Appl. Sci. 2025, 15(10), 5240; https://doi.org/10.3390/app15105240 - 8 May 2025
Viewed by 1452
Abstract
Object detection in remote sensing imagery plays a pivotal role in various applications, including aerial surveillance and urban planning. Despite its significance, the task remains challenging due to cluttered backgrounds, the arbitrary orientations of objects, and substantial scale variations across targets. To address these issues, we propose RFE-FCOS, a novel framework that synergizes rotation-invariant feature extraction with adaptive multi-scale fusion. Specifically, we introduce a rotation-invariant learning (RIL) module, which employs adaptive rotation transformations to enhance shallow feature representations, thereby effectively mitigating interference from complex backgrounds and boosting geometric robustness. Furthermore, a rotation feature fusion (RFF) module propagates these rotation-aware features across hierarchical levels through an attention-guided fusion strategy, resulting in richer, more discriminative representations at multiple scales. Finally, we propose a novel dual-aspect RIoU loss (DARIoU) that simultaneously optimizes horizontal and angular regression tasks, facilitating stable training and the precise alignment of arbitrarily oriented bounding boxes. Evaluated on the DIOR-R and HRSC2016 benchmarks, our method demonstrates robust detection capabilities for arbitrarily oriented objects, achieving competitive performance in both accuracy and efficiency. This work provides a versatile solution for advancing object detection in real-world remote sensing scenarios. Full article

17 pages, 1744 KB  
Article
Lightweight Transformer with Adaptive Rotational Convolutions for Aerial Object Detection
by Sabina Umirzakova, Shakhnoza Muksimova, Abrayeva Mahliyo Olimjon Qizi and Young Im Cho
Appl. Sci. 2025, 15(9), 5212; https://doi.org/10.3390/app15095212 - 7 May 2025
Cited by 4 | Viewed by 1191
Abstract
Oriented object detection in aerial imagery presents unique challenges due to the arbitrary orientations, diverse scales, and limited availability of labeled data. In response to these issues, we propose RASST—a lightweight Rotationally Aware Semi-Supervised Transformer framework designed to achieve high-precision detection under fully and semi-supervised conditions. RASST integrates a hybrid Vision Transformer architecture augmented with rotationally aware patch embeddings, adaptive rotational convolutions, and a multi-scale feature fusion (MSFF) module that employs cross-scale attention to enhance detection across object sizes. To address the scarcity of labeled data, we introduce a novel Pseudo-Label Guided Learning (PGL) framework, which refines pseudo-labels through Rotation-Aware Adaptive Weighting (RAW) and Global Consistency (GC) losses, thereby improving generalization and robustness against noisy supervision. Despite its lightweight design, RASST achieves superior performance on the DOTA-v1.5 benchmark, outperforming existing state-of-the-art methods in supervised and semi-supervised settings. The proposed framework demonstrates high scalability, precise orientation sensitivity, and effective utilization of unlabeled data, establishing a new benchmark for efficient oriented object detection in remote sensing imagery. Full article

29 pages, 9314 KB  
Article
SFRADNet: Object Detection Network with Angle Fine-Tuning Under Feature Matching
by Keliang Liu, Yantao Xi, Donglin Jing, Xue Zhang and Mingfei Xu
Remote Sens. 2025, 17(9), 1622; https://doi.org/10.3390/rs17091622 - 2 May 2025
Viewed by 1059
Abstract
Due to the distant acquisition and bird’s-eye perspective of remote sensing images, ground objects are distributed at arbitrary scales and in multiple orientations. Existing detectors often utilize feature pyramid networks (FPN) and deformable (or rotated) convolutions to adapt to variations in object scale and orientation. However, these methods solve scale and orientation issues separately and ignore their deeper coupling relationships. When the scale features extracted by the network are significantly mismatched with the object, it is difficult for the detection head to effectively capture the orientation of the object, resulting in misalignment between the object and its bounding box. Therefore, we propose a one-stage detector, the Scale First Refinement-Angle Detection Network (SFRADNet), which aims to fine-tune the rotation angle under precise scale feature matching. We introduce the Group Learning Large Kernel Network (GL2KNet) as the backbone of SFRADNet and employ a Shape-Aware Spatial Feature Extraction Module (SA-SFEM) as the primary component of the detection head. Specifically, within GL2KNet, we construct diverse receptive fields with varying dilation rates to capture features across different spatial coverage ranges. Building on this, we utilize multi-scale features within the layers and apply weighted aggregation based on a Scale Selection Matrix (SSMatrix). The SSMatrix dynamically adjusts the receptive field coverage according to the target size, enabling more refined selection of scale features. Based on the precise scale features captured, we design a Directed Guiding Box (DGBox) within the SA-SFEM, using its shape and position information to supervise the sampling points of the convolution kernels, thereby fitting them to object deformations. This facilitates the extraction of orientation features near the object region, allowing for accurate refinement of both scale and orientation. Experiments show that our network achieves a mAP of 80.10% on the DOTA-v1.0 dataset while reducing computational complexity compared to the baseline model. Full article

29 pages, 31432 KB  
Article
GAANet: Symmetry-Driven Gaussian Modeling with Additive Attention for Precise and Robust Oriented Object Detection
by Jiangang Zhu, Yi Liu, Qiang Fu and Donglin Jing
Symmetry 2025, 17(5), 653; https://doi.org/10.3390/sym17050653 - 25 Apr 2025
Viewed by 823
Abstract
Oriented objects in RSI (Remote Sensing Imagery) typically present arbitrary rotations, extreme aspect ratios, multi-scale variations, and complex backgrounds. These factors often result in feature misalignment, representational ambiguity, and regression inconsistency, which significantly degrade detection performance. To address these issues, GAANet (Gaussian-Augmented Additive Network), a symmetry-driven framework for ODD (oriented object detection), is proposed. GAANet incorporates a symmetry-preserving mechanism into three critical components—feature extraction, representation modeling, and metric optimization—facilitating systematic improvements from structural representation to learning objectives. A CAX-ViT (Contextual Additive Exchange Vision Transformer) is developed to enhance multi-scale structural modeling by combining spatial–channel symmetric interactions with convolution–attention fusion. A GBBox (Gaussian Bounding Box) representation is employed, which implicitly encodes directional information through the invariance of the covariance matrix, thereby alleviating angular periodicity problems. Additionally, a GPIoU (Gaussian Product Intersection over Union) loss function is introduced to ensure geometric consistency between training objectives and the SkewIoU evaluation metric. GAANet achieved a 90.58% mAP on HRSC2016, 89.95% on UCAS-AOD, and 77.86% on the large-scale DOTA v1.0 dataset, outperforming mainstream methods across various benchmarks. In particular, GAANet showed a +3.27% mAP improvement over R3Det and a +4.68% gain over Oriented R-CNN on HRSC2016, demonstrating superior performance over representative baselines. Overall, GAANet establishes a closed-loop detection paradigm that integrates feature interaction, probabilistic modeling, and metric optimization under symmetry priors, offering both theoretical rigor and practical efficacy. Full article
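The covariance-invariance argument in this abstract can be made concrete: the 2-D Gaussian induced by an oriented box has a covariance matrix that is unchanged when the box is rotated by 180°, so the angular-periodicity problem disappears in that representation. Below is a minimal generic GBBox conversion (an illustration under standard definitions, not GAANet's implementation):

```python
import math

def obb_to_gaussian(cx, cy, w, h, theta):
    """Map an oriented box to a 2-D Gaussian (mean, covariance):
    Sigma = R(theta) @ diag(w^2/4, h^2/4) @ R(theta).T.
    Since R(theta + pi) = -R(theta), Sigma is identical for theta
    and theta + pi, which removes the angle's period-pi ambiguity."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    cov = (
        (a * c * c + b * s * s, (a - b) * s * c),
        ((a - b) * s * c, a * s * s + b * c * c),
    )
    return (cx, cy), cov
```

Losses such as the paper's GPIoU are then defined between these Gaussians rather than between raw (w, h, θ) tuples.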
(This article belongs to the Special Issue Symmetry and Asymmetry Study in Object Detection)

22 pages, 11601 KB  
Article
ORPSD: Outer Rectangular Projection-Based Representation for Oriented Ship Detection in SAR Images
by Mingjin Zhang, Yuanjun Ouyang, Minghai Yang, Jie Guo and Yunsong Li
Remote Sens. 2025, 17(9), 1511; https://doi.org/10.3390/rs17091511 - 24 Apr 2025
Cited by 7 | Viewed by 977
Abstract
Ship object detection in synthetic aperture radar (SAR) images is both an important and challenging task. Previous methods based on horizontal bounding boxes struggle to accurately locate densely packed ships oriented in arbitrary directions, due to variations in scale, aspect ratio, and orientation, thereby requiring other forms of object representation, like oriented bounding boxes (OBBs). However, most deep learning-based OBB detection methods share a single-stage paradigm to improve detection speed, often at the expense of accuracy. In this paper, we propose a simple yet effective two-stage detector dubbed ORPSD, which enjoys good accuracy and efficiency owing to two key designs. First, we design a novel encoding scheme based on outer-rectangle projection (ORP) for the OrpRPN stage, which could efficiently generate high-quality oriented proposals. Second, we propose a convex quadrilateral rectification (CQR) method to rectify distorted shape proposals into rectangles by finding the outer rectangle based on the minimum area, ensuring correct proposal orientation. Comparative experiments on the challenging public benchmarks RSSDD and RSAR demonstrate the superiority of our ORPSD over previous OBB-based detectors in terms of both detection accuracy and efficiency. Full article

30 pages, 11153 KB  
Article
GCA2Net: Global-Consolidation and Angle-Adaptive Network for Oriented Object Detection in Aerial Imagery
by Shenbo Zhou, Zhenfei Liu, Hui Luo, Guanglin Qi, Yunfeng Liu, Haorui Zuo, Jianlin Zhang and Yuxing Wei
Remote Sens. 2025, 17(6), 1077; https://doi.org/10.3390/rs17061077 - 19 Mar 2025
Cited by 6 | Viewed by 1170
Abstract
Enhancing the detection capabilities of rotated objects in aerial imagery is a vital aspect of the burgeoning field of remote sensing technology. The objective is to identify and localize objects oriented in arbitrary directions within the image. In recent years, the capacity for rotated object detection has seen continuous improvement. However, existing methods largely employ traditional backbone networks, where static convolutions excel at extracting features from objects oriented at a specific angle. In contrast, most objects in aerial imagery are oriented in various directions. This poses a challenge for backbone networks to extract high-quality features from objects of different orientations. In response to the challenge above, we propose the Dynamic Rotational Convolution (DRC) module. By integrating it into the ResNet backbone network, we form the backbone network presented in this paper, DRC-ResNet. Within the proposed DRC module, rotation parameters are predicted by the Adaptive Routing Unit (ARU), employing a data-driven approach to adaptively rotate convolutional kernels to extract features from objects oriented in various directions within different images. Building upon this foundation, we introduce a conditional computation mechanism that enables convolutional kernels to more flexibly and efficiently adapt to the dramatic angular changes of objects within images. To better integrate key information within images after obtaining features rich in angular details, we propose the Multi-Order Spatial-Channel Aggregation Block (MOSCAB) module, which is aimed at enhancing the integration capacity of key information in images through selective focusing and global information aggregation. Meanwhile, considering the significant semantic gap between features at different levels during the feature pyramid fusion process, we propose a new multi-scale fusion network named AugFPN+. This network reduces the semantic gap between different levels before feature fusion, achieves more effective feature integration, and minimizes the spatial information loss of small objects to the greatest extent possible. Experiments conducted on popular benchmark datasets DOTA-V1.0 and HRSC2016 demonstrate that our proposed model has achieved mAP scores of 77.56% and 90.4%, respectively, significantly outperforming current rotated detection models. Full article

27 pages, 10153 KB  
Article
PSMDet: Enhancing Detection Accuracy in Remote Sensing Images Through Self-Modulation and Gaussian-Based Regression
by Jiangang Zhu, Yang Ruan, Donglin Jing, Qiang Fu and Ting Ma
Sensors 2025, 25(5), 1285; https://doi.org/10.3390/s25051285 - 20 Feb 2025
Cited by 1 | Viewed by 1094
Abstract
Conventional object detection methods face challenges in addressing the complexity of targets in optical remote sensing images (ORSIs), including multi-scale objects, high aspect ratios, and arbitrary orientations. This study proposes a novel detection framework called Progressive Self-Modulating Detector (PSMDet), which incorporates self-modulation mechanisms at the backbone, feature pyramid network (FPN), and detection head stages to address these issues. The backbone network utilizes a reparameterized large kernel network (RLK-Net) to enhance multi-scale feature extraction. At the same time, the adaptive perception network (APN) achieves accurate feature alignment through a self-attention mechanism. Additionally, a Gaussian-based bounding box representation and smooth relative entropy (smoothRE) regression loss are introduced to address traditional bounding box regression challenges, such as discontinuities and inconsistencies. Experimental validation on the HRSC2016 and UCAS-AOD datasets demonstrates the framework’s robust performance, achieving the mean Average Precision (mAP) scores of 90.69% and 89.86%, respectively. Although validated on ORSIs, the proposed framework is adaptable for broader applications, such as autonomous driving in intelligent transportation systems and defect detection in industrial vision, where high-precision object detection is essential. These contributions provide theoretical and technical support for advancing intelligent image sensor-based applications across multiple domains. Full article

22 pages, 15409 KB  
Article
A Deformable Split Fusion Method for Object Detection in High-Resolution Optical Remote Sensing Image
by Qinghe Guan, Ying Liu, Lei Chen, Guandian Li and Yang Li
Remote Sens. 2024, 16(23), 4487; https://doi.org/10.3390/rs16234487 - 29 Nov 2024
Cited by 1 | Viewed by 1182
Abstract
To better address the challenges of complex backgrounds, varying object sizes, and arbitrary orientations in remote sensing object detection tasks, this paper proposes a deformable split fusion method based on an improved RoI Transformer called RoI Transformer-DSF. Specifically, the deformable split fusion method contains a deformable split module (DSM) and a space fusion module (SFM). Firstly, the DSM assigns different receptive fields according to the size of the remote sensing object and focuses feature attention on the object to capture richer semantic and contextual information. Secondly, the SFM highlights the spatial location of the object and fuses spatial information at different scales to improve the algorithm's ability to detect objects of different sizes. In addition, this paper presents the ResNext_Feature Calculation_block (ResNext_FC_block) to build the backbone and replaces the original regression loss with the KFIoU loss to improve the feature extraction capability and regression accuracy of the algorithm. Experiments show that the mAP0.5 of this method on the DOTAv1.0 and FAIR1M (plane) datasets is 83.53% and 44.14%, respectively, which is 3% and 1.87% higher than that of the RoI Transformer, and it can be applied to the field of remote sensing object detection. Full article
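The KFIoU loss mentioned above can be sketched in simplified form (a center-aligned approximation for illustration only; the published loss also handles the center offset and rescales the result, and the helper names here are hypothetical): both rotated boxes are treated as 2-D Gaussians, the Kalman-filter product of the two Gaussians plays the role of the overlap region, and sqrt of the determinant stands in for area.

```python
import numpy as np

def rbox_cov(w, h, theta):
    """Covariance of the Gaussian modeling a (w, h) box rotated by theta."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([w / 2.0, h / 2.0])
    return R @ S @ S @ R.T

def kfiou(sig1, sig2):
    """Center-aligned KFIoU sketch: the Kalman fusion of the two Gaussians
    gives the 'overlap' covariance; sqrt(det) stands in for area. Identical
    inputs give the known maximum of 1/3 (the loss rescales this by 3)."""
    fused = sig1 @ np.linalg.inv(sig1 + sig2) @ sig2
    v1 = np.sqrt(np.linalg.det(sig1))
    v2 = np.sqrt(np.linalg.det(sig2))
    v3 = np.sqrt(np.linalg.det(fused))
    return v3 / (v1 + v2 - v3)

sig = rbox_cov(4.0, 2.0, 0.3)
overlap = kfiou(sig, sig)  # maximum overlap: ~1/3
```

Because every step is a smooth matrix operation, the measure stays differentiable for any relative rotation, which is what makes it attractive as a regression target.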

17 pages, 2868 KB  
Technical Note
Boosting Point Set-Based Network with Optimal Transport Optimization for Oriented Object Detection
by Binhuan Yuan, Xiyang Zhi, Jianming Hu and Wei Zhang
Remote Sens. 2024, 16(22), 4133; https://doi.org/10.3390/rs16224133 - 6 Nov 2024
Viewed by 1594
Abstract
When handling complex remote sensing scenarios, rotational angle information can improve detection accuracy and enhance algorithm robustness, providing support for fine-grained detection. Point set representation is one of the most commonly used methods in arbitrary-oriented object detection tasks, leveraging discrete feature points to represent oriented targets and achieve high accuracy in angle prediction. However, due to its inherent discreteness, point set representation is vulnerable to isolated outlier points and to representational ambiguity in harsh application scenarios, leading to inaccurate detection. To address this issue, an efficient aerial object detector named BE-Det is proposed, which uses an optimal transport (OT) strategy to constrain the positions of isolated points. Additionally, a candidate point set quality evaluation scheme is designed to effectively assess the quality of candidate point sets. Experimental results on two challenging aerial datasets demonstrate that the proposed method outperforms several advanced detection methods. Full article
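The optimal transport machinery referred to here can be sketched with a generic entropy-regularized OT solver (Sinkhorn iterations), of the kind commonly used for assignment problems in detection; BE-Det's specific cost design and constraints are not reproduced, and the toy cost matrix below is invented for illustration.

```python
import numpy as np

def sinkhorn(cost, r, c, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport via Sinkhorn iterations.
    cost: (m, n) cost matrix; r, c: source/target marginals (each sums to 1).
    Returns the (m, n) transport plan."""
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(r)
    for _ in range(n_iters):         # alternating marginal projections
        v = c / (K.T @ u)
        u = r / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy example: 3 candidate points transported onto 2 targets; low-cost
# pairings should receive most of the mass.
cost = np.array([[0.1, 1.0],
                 [1.0, 0.1],
                 [0.5, 0.5]])
r = np.full(3, 1 / 3)   # uniform mass over candidates
c = np.full(2, 1 / 2)   # uniform mass over targets
plan = sinkhorn(cost, r, c)
```

In an OT-based detector, the plan's mass is what decides which candidate points are pulled toward which targets, so a stray isolated point with high cost to every target receives little mass and exerts little influence.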
