WDARFNet: A Wavelet-Domain Adaptive Receptive Field Network for Improved Oriented Object Detection in Remote Sensing
Abstract
1. Introduction
- Enriched contextual information boosts detector accuracy. As illustrated in Figure 1a, vehicles and boats typically exhibit similar shapes in remote sensing images due to their small imaging sizes, leading to classification errors. These errors occur because the detector considers only a narrow range of contextual information. Enlarging the effective receptive field introduces more background information, enabling accurate classification based on whether the background is sea or road surface.
- Correctly detecting different objects requires different effective receptive field sizes. Objects vary in size, degree of occlusion, and background complexity, so the receptive field needed for successful detection varies as well. Unoccluded objects such as aircraft, swimming pools, and tennis courts require relatively small receptive fields. In contrast, intersections, bridges, and docks demand very large receptive fields, because such objects often have large sizes and aspect ratios and are obscured by other objects, making accurate recognition challenging. As illustrated in Figure 1b, the tennis court in the upper image is clear and unobstructed and requires only a small receptive field for accurate recognition, whereas the tennis court in the lower image is largely obscured by tree shadows, so recognizing it from appearance alone is difficult and a broader receptive field is needed.
- To address insufficient information utilization in the network and mitigate the impact of noise on its performance, we integrate convolutional neural networks (CNNs) with frequency-domain analysis techniques into a unified model, yielding the Wavelet Domain Channel Selection Module (WDCSM) and the Wavelet Domain Spatial Selection Module (WDSSM). Both modules combine low-frequency and high-frequency components, capturing additional contour and detail information that enhances the network's representational capacity; discarding the diagonal high-frequency component, which carries substantial noise, improves the network's noise robustness. The WDCSM explicitly models the interdependencies between feature-map channels and leverages global information to adaptively recalibrate channel-wise feature responses, further strengthening representation. The WDSSM selects the most suitable spatial features from large-kernel branches of various scales, dynamically adjusting the receptive field for each object according to its spatial needs.
- To adaptively adjust the receptive field of the backbone network, we combine the selective kernel mechanism with the WDSSM. The selective kernel mechanism is implemented as a multi-branch large-kernel network; the WDSSM assigns weights to the features produced by each branch's large kernel and integrates them spatially, allowing the network to adjust the receptive field for each object as its spatial context requires.
- Extensive experiments have been conducted to validate the effectiveness of our proposed model on three widely used datasets: DOTA-v1.0, DIOR-R, and HRSC2016.
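The wavelet-domain idea in the contributions above — decompose features with a 2D discrete wavelet transform, discard the noisy diagonal sub-band, and recalibrate channels from global statistics — can be illustrated with a minimal NumPy sketch. The functions `haar_dwt2` and `channel_gate` are hypothetical stand-ins, not the paper's WDCSM implementation; the gate shown is a squeeze-and-excitation-style approximation.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar DWT of an (H, W) array with even H and W.
    Returns (LL, LH, HL, HH): approximation, horizontal, vertical,
    and diagonal sub-bands, each of shape (H/2, W/2)."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # approximation (low frequency)
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail (carries most noise)
    return ll, lh, hl, hh

def channel_gate(feats):
    """Recalibrate a (C, H, W) stack of sub-band features: weight each
    channel by a sigmoid of its global average response (a simplified
    stand-in for the channel selection in WDCSM)."""
    z = feats.mean(axis=(1, 2))          # squeeze: global average pool
    w = 1.0 / (1.0 + np.exp(-z))         # excitation: sigmoid gate
    return feats * w[:, None, None]

# Decompose, drop the diagonal (HH) sub-band, and recalibrate channels.
x = np.arange(16.0).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(x)
feats = np.stack([ll, lh, hl])           # (3, 2, 2); HH discarded
out = channel_gate(feats)
```

With this orthonormal Haar normalization the transform preserves energy, so discarding HH removes exactly the energy of the diagonal detail, which is why it can be dropped without disturbing the remaining sub-bands.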
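The selective kernel mechanism described above — branches with different receptive fields fused by per-pixel weights — can likewise be sketched. Here a box filter is a cheap stand-in for a large-kernel depthwise convolution branch, and `spatial_select` is a hypothetical illustration of WDSSM-style spatial selection, not the actual module.

```python
import numpy as np

def box_filter(x, k):
    """Blur an (H, W) map with a k x k box kernel and zero padding —
    a simple proxy for one large-kernel convolution branch."""
    H, W = x.shape
    p = k // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def spatial_select(x, kernels=(3, 7)):
    """Selective-kernel-style spatial fusion: each branch sees a
    different receptive field; a per-pixel softmax over the branch
    responses decides how much of each branch to keep at each
    location."""
    branches = np.stack([box_filter(x, k) for k in kernels])  # (B, H, W)
    logits = branches - branches.max(axis=0, keepdims=True)   # stability
    w = np.exp(logits)
    w /= w.sum(axis=0, keepdims=True)    # per-pixel branch weights
    return (w * branches).sum(axis=0)

y = spatial_select(np.ones((8, 8)))      # output is a per-pixel convex
                                         # combination of branch outputs
```

Because the fused output is a convex combination of the branch responses at every pixel, each location effectively picks its own receptive field — the adaptivity the WDSSM is designed to provide.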
2. Related Work
2.1. Oriented Object Detection
2.2. Frequency Domain in Deep Learning
2.3. Large Kernel Convolutional Neural Network
3. Method
3.1. Two-Dimensional Discrete Wavelet Transform
3.2. Wavelet Domain Selective Kernel Network
3.3. Selective Kernel Convolutions
3.4. Wavelet Domain Channel Selection Module
3.5. Wavelet Domain Spatial Selection Module
4. Experiments
4.1. Datasets
4.2. Implementation Details
4.3. Ablation Studies
4.4. Comparison with State-of-the-Art Methods
4.5. Visualization
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Advances in Neural Information Processing Systems; Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R., Eds.; Curran Associates, Inc.: Newry, Northern Ireland, 2015; Volume 28. [Google Scholar]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Yang, X.; Yan, J.; Feng, Z.; He, T. R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object. Proc. AAAI Conf. Artif. Intell. 2021, 35, 3163–3171. [Google Scholar] [CrossRef]
- Yang, X.; Yan, J.; Ming, Q.; Wang, W.; Zhang, X.; Tian, Q. Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss. In Proceedings of the International Conference on Machine Learning, PMLR, Online, 18–24 July 2021; pp. 11830–11841. [Google Scholar]
- Yang, X.; Yang, X.; Yang, J.; Ming, Q.; Wang, W.; Tian, Q.; Yan, J. Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence. Adv. Neural Inf. Process. Syst. 2021, 34, 18381–18394. [Google Scholar]
- Hou, L.; Lu, K.; Xue, J.; Li, Y. Shape-Adaptive Selection and Measurement for Oriented Object Detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; AAAI Press: Palo Alto, CA, USA, 2022; Volume 36, pp. 923–932. [Google Scholar]
- Han, J.; Ding, J.; Li, J.; Xia, G.-S. Align Deep Features for Oriented Object Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–11. [Google Scholar] [CrossRef]
- Hou, L.; Lu, K.; Yang, X.; Li, Y.; Xue, J. G-Rep: Gaussian Representation for Arbitrary-Oriented Object Detection. Remote Sens. 2023, 15, 757. [Google Scholar] [CrossRef]
- Ding, J.; Xue, N.; Long, Y.; Xia, G.-S.; Lu, Q. Learning RoI Transformer for Oriented Object Detection in Aerial Images. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 2844–2853. [Google Scholar]
- Yang, X.; Yan, J. Arbitrary-Oriented Object Detection with Circular Smooth Label. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part VIII 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 677–694. [Google Scholar]
- Cheng, G.; Wang, J.; Li, K.; Xie, X.; Lang, C.; Yao, Y.; Han, J. Anchor-Free Oriented Proposal Generator for Object Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5625411. [Google Scholar] [CrossRef]
- Han, J.; Ding, J.; Xue, N.; Xia, G.-S. ReDet: A Rotation-Equivariant Detector for Aerial Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2786–2795. [Google Scholar]
- Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 3520–3529. [Google Scholar]
- Xu, Y.; Fu, M.; Wang, Q.; Wang, Y.; Chen, K.; Xia, G.-S.; Bai, X. Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 1452–1459. [Google Scholar] [CrossRef]
- Li, Y.; Hou, Q.; Zheng, Z.; Cheng, M.-M.; Yang, J.; Li, X. Large Selective Kernel Network for Remote Sensing Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 16794–16805. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Xie, C.; Wu, Y.; Maaten, L.V.D.; Yuille, A.L.; He, K. Feature Denoising for Improving Adversarial Robustness. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 501–509. [Google Scholar]
- Qin, Z.; Zhang, P.; Wu, F.; Li, X. FcaNet: Frequency Channel Attention Networks. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada, 11–17 October 2021; pp. 763–772. [Google Scholar]
- Yang, X.; Yang, J.; Yan, J.; Zhang, Y.; Zhang, T.; Guo, Z.; Sun, X.; Fu, K. SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8232–8241. [Google Scholar]
- Yang, X.; Yan, J.; Liao, W.; Yang, X.; Tang, J.; He, T. SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 2384–2399. [Google Scholar] [CrossRef]
- Li, W.; Chen, Y.; Hu, K.; Zhu, J. Oriented Reppoints for Aerial Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 1829–1838. [Google Scholar]
- Guo, Z.; Liu, C.; Zhang, X.; Jiao, J.; Ji, X.; Ye, Q. Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8792–8801. [Google Scholar]
- Pu, Y.; Wang, Y.; Xia, Z.; Han, Y.; Wang, Y.; Gan, W.; Wang, Z.; Song, S.; Huang, G. Adaptive Rotated Convolution for Rotated Object Detection. arXiv 2023, arXiv:2303.07820. [Google Scholar]
- Unser, M. Texture Classification and Segmentation Using Wavelet Frames. IEEE Trans. Image Process. 1995, 4, 1549–1560. [Google Scholar] [CrossRef]
- Li, J.; Yuan, G.; Fan, H. Multifocus Image Fusion Using Wavelet-Domain-Based Deep CNN. Comput. Intell. Neurosci. 2019, 2019, 4179397. [Google Scholar] [CrossRef]
- Ehrlich, M.; Davis, L. Deep Residual Learning in the JPEG Transform Domain. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3483–3492. [Google Scholar]
- Wang, Y.; Xu, C.; You, S.; Tao, D.; Xu, C. CNNpack: Packing Convolutional Neural Networks in the Frequency Domain. In Advances in Neural Information Processing Systems; Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R., Eds.; Curran Associates, Inc.: Newry, Northern Ireland, 2016; Volume 29. [Google Scholar]
- Xin, J.; Li, J.; Jiang, X.; Wang, N.; Huang, H.; Gao, X. Wavelet-Based Dual Recursive Network for Image Super-Resolution. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 707–720. [Google Scholar] [CrossRef] [PubMed]
- Mallat, S.G. A Theory for Multiresolution Signal Decomposition: The Wavelet Representation. IEEE Trans. Pattern Anal. Mach. Intell. 1989, 11, 674–693. [Google Scholar] [CrossRef]
- Fujieda, S.; Takayama, K.; Hachisuka, T. Wavelet Convolutional Neural Networks. arXiv 2018, arXiv:1805.08620. [Google Scholar]
- Williams, T.; Li, R. Wavelet Pooling for Convolutional Neural Networks. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Zhao, X.; Huang, P.; Shu, X. Wavelet-Attention CNN for Image Classification. Multimed. Syst. 2022, 28, 915–924. [Google Scholar] [CrossRef]
- Li, Q.; Shen, L.; Guo, S.; Lai, Z. Wavelet Integrated CNNs for Noise-Robust Image Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 7245–7254. [Google Scholar]
- Yang, Y.; Jiao, L.; Liu, X.; Liu, F.; Yang, S.; Li, L.; Chen, P.; Li, X.; Huang, Z. Dual Wavelet Attention Networks for Image Classification. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 1899–1910. [Google Scholar] [CrossRef]
- Wang, S.; Cai, Z.; Yuan, J. Automatic SAR Ship Detection Based on Multifeature Fusion Network in Spatial and Frequency Domains. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4102111. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Wang, W.; Xie, E.; Li, X.; Fan, D.-P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 568–578. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 10012–10022. [Google Scholar]
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training Data-Efficient Image Transformers & Distillation through Attention. In Proceedings of the International Conference on Machine Learning, Online, 18–24 July 2021; pp. 10347–10357. [Google Scholar]
- Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning; AAAI: Washington, DC, USA, 2017; Volume 31. [Google Scholar] [CrossRef]
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Sheng, P.; Shi, Y.; Liu, X.; Jin, H. LSNet: Real-Time Attention Semantic Segmentation Network with Linear Complexity. Neurocomputing 2022, 509, 94–101. [Google Scholar] [CrossRef]
- Li, X.; Wang, W.; Hu, X.; Yang, J. Selective Kernel Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 510–519. [Google Scholar]
- Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11976–11986. [Google Scholar]
- Ding, X.; Zhang, X.; Han, J.; Ding, G. Scaling up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11963–11975. [Google Scholar]
- Liu, S.; Chen, T.; Chen, X.; Chen, X.; Xiao, Q.; Wu, B.; Kärkkäinen, T.; Pechenizkiy, M.; Mocanu, D.; Wang, Z. More Convnets in the 2020s: Scaling up Kernels beyond 51x51 Using Sparsity. arXiv 2022, arXiv:2207.03620. [Google Scholar]
- Li, S.; Florencio, D.; Li, W.; Zhao, Y.; Cook, C. A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain. IEEE Trans. Image Process. 2018, 27, 3918–3930. [Google Scholar] [CrossRef] [PubMed]
- Xia, G.-S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3974–3983. [Google Scholar]
- Liu, Z.; Wang, H.; Weng, L.; Yang, Y. Ship Rotated Bounding Box Space for Ship Extraction from High-Resolution Optical Satellite Images with Complex Backgrounds. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1074–1078. [Google Scholar] [CrossRef]
- Zhou, Y.; Yang, X.; Zhang, G.; Wang, J.; Liu, Y.; Hou, L.; Jiang, X.; Liu, X.; Yan, J.; Lyu, C. MMRotate: A Rotated Object Detection Benchmark Using PyTorch. In Proceedings of the 30th ACM International Conference on Multimedia, Lisbon, Portugal, 10–14 October 2022; pp. 7331–7334. [Google Scholar]
- Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. Imagenet: A Large-Scale Hierarchical Image Database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 248–255. [Google Scholar]
- Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. arXiv 2017, arXiv:1711.05101. [Google Scholar]
- Long, Y.; Xia, G.-S.; Li, S.; Yang, W.; Yang, M.Y.; Zhu, X.X.; Zhang, L.; Li, D. On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-Aid. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 4205–4230. [Google Scholar] [CrossRef]
- Wang, J.; Yang, W.; Li, H.-C.; Zhang, H.; Xia, G.-S. Learning Center Probability Map for Detecting Objects in Aerial Images. IEEE Trans. Geosci. Remote Sens. 2020, 59, 4307–4323. [Google Scholar] [CrossRef]
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327. [Google Scholar] [CrossRef]
- Ming, Q.; Zhou, Z.; Miao, L.; Zhang, H.; Li, L. Dynamic Anchor Learning for Arbitrary-Oriented Object Detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 2355–2363. [Google Scholar]
- Lang, S.; Ventola, F.; Kersting, K. DAFNet: A One-Stage Anchor-Free Deep Model for Oriented Object Detection. arXiv 2021, arXiv:2109.06148. [Google Scholar]
- Pan, X.; Ren, Y.; Sheng, K.; Dong, W.; Yuan, H.; Guo, X.; Ma, C.; Xu, C. Dynamic Refinement Network for Oriented and Densely Packed Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11207–11216. [Google Scholar]
- Yang, X.; Zhou, Y.; Zhang, G.; Yang, J.; Wang, W.; Yan, J.; Zhang, X.; Tian, Q. The KFIoU Loss for Rotated Object Detection. arXiv 2022, arXiv:2201.12558. [Google Scholar]
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: A Simple and Strong Anchor-Free Object Detector. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 1922–1933. [Google Scholar] [CrossRef]
- Zhang, S.; Chi, C.; Yao, Y.; Lei, Z.; Li, S.Z. Bridging the Gap between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 9759–9768. [Google Scholar]
Kernel Design (K, D) | RF | GFLOPs | mAP |
---|---|---|---|
(3, 1), (5, 1) | 7 | 267.7 | 64.80 |
(3, 1), (5, 3) | 15 | 267.7 | 65.11 |
(5, 1), (7, 1) | 11 | 268.3 | 65.02 |
(5, 1), (7, 3) | 23 | 268.3 | 65.33 |
Method | mAP |
---|---|
O-RCNN (ResNet50) | 62.09 |
O-RCNN (WDCSM) | 63.56 |
O-RCNN (WDSSM) | 64.15 |
O-RCNN (WDCSM+WDSSM) | 65.33 |
Method | mAP (Original) | mAP (Noise Level 1) | mAP (Noise Level 2) |
---|---|---|---|
O-RCNN (ResNet50) | 60.32 | 46.52 | 31.18 |
O-RCNN (WDARFNet) | 65.33 | 56.61 | 48.19 |
Method | mAP |
---|---|
WDARFNet (A+H) | 64.49 |
WDARFNet (A+V) | 64.28 |
WDARFNet (A+D) | 62.92 |
WDARFNet (A+H+V) | 65.33 |
Method | Backbone | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Two-stage: | |||||||||||||||||
O-Faster R-CNN [1] | ResNet101 | 79.09 | 69.12 | 17.17 | 63.49 | 34.20 | 37.16 | 36.20 | 89.19 | 69.60 | 58.96 | 49.40 | 52.52 | 46.69 | 44.80 | 46.30 | 52.93 |
RoI-Transformer * [9] | ResNet101 | 88.64 | 78.52 | 43.44 | 75.92 | 68.81 | 73.68 | 83.59 | 90.74 | 77.27 | 81.46 | 58.39 | 53.54 | 62.83 | 58.93 | 47.67 | 69.56 |
CenterMap [53] | ResNet50 | 89.02 | 80.56 | 49.41 | 61.98 | 77.99 | 74.19 | 83.74 | 89.44 | 78.01 | 83.52 | 47.64 | 65.93 | 63.68 | 67.07 | 61.59 | 71.59 |
SCRDet * [19] | ResNet101 | 89.98 | 80.65 | 52.09 | 68.36 | 68.36 | 60.32 | 72.41 | 90.85 | 87.94 | 86.86 | 65.02 | 66.68 | 66.25 | 68.24 | 65.21 | 72.61 |
Gliding Vertex * [14] | ResNet101 | 89.64 | 85.00 | 52.26 | 77.34 | 73.01 | 73.14 | 86.82 | 90.74 | 79.02 | 86.81 | 59.55 | 70.91 | 72.94 | 70.86 | 57.32 | 75.02 |
ReDet [12] | ReR50-ReFPN | 88.79 | 82.64 | 53.97 | 74.00 | 78.13 | 84.06 | 88.04 | 90.89 | 87.78 | 85.75 | 61.76 | 60.39 | 75.96 | 68.07 | 63.59 | 76.25 |
CSL * [10] | ResNet152 | 90.25 | 85.53 | 54.64 | 75.31 | 70.44 | 73.51 | 77.62 | 90.84 | 86.15 | 86.69 | 69.60 | 68.04 | 73.83 | 71.10 | 68.93 | 76.17 |
CenterMap-Net * [54] | ResNet101 | 89.83 | 84.41 | 54.60 | 70.25 | 77.66 | 78.32 | 87.19 | 90.66 | 84.89 | 85.27 | 56.46 | 69.23 | 74.13 | 71.56 | 66.06 | 76.03 |
LSKNet [15] | LSKNet-S | 89.66 | 85.52 | 57.72 | 75.70 | 74.95 | 78.69 | 88.24 | 90.88 | 86.79 | 86.38 | 66.92 | 63.77 | 77.77 | 74.47 | 64.82 | 77.49 |
O-RCNN * [13] | ResNet50 | 89.84 | 85.43 | 61.09 | 79.82 | 79.71 | 85.35 | 88.82 | 90.88 | 86.68 | 87.73 | 72.21 | 70.80 | 82.42 | 78.18 | 74.11 | 80.87 |
Single-stage: | |||||||||||||||||
O-RetinaNet [55] | ResNet101 | 88.82 | 81.74 | 44.44 | 65.72 | 67.11 | 55.82 | 72.77 | 90.55 | 82.83 | 76.30 | 54.19 | 63.64 | 63.71 | 69.73 | 53.37 | 68.72 |
DAL [56] | ResNet101 | 88.68 | 76.55 | 45.08 | 66.80 | 67.00 | 76.76 | 79.74 | 90.84 | 79.54 | 78.45 | 57.71 | 62.27 | 69.05 | 73.14 | 60.11 | 71.44 |
O-RepPoints [21] | ResNet50 | 87.02 | 83.17 | 54.13 | 71.16 | 80.18 | 78.40 | 87.28 | 90.90 | 85.97 | 86.25 | 59.90 | 71.49 | 73.53 | 72.27 | 58.97 | 75.97 |
S2A-Net [7] | ResNet101 | 88.70 | 81.41 | 54.28 | 69.75 | 78.04 | 80.54 | 88.04 | 90.69 | 84.75 | 86.20 | 65.03 | 65.80 | 76.16 | 73.37 | 58.86 | 76.11 |
S2A-Net * [7] | ResNet101 | 89.28 | 84.11 | 56.95 | 79.21 | 80.18 | 82.93 | 89.21 | 90.86 | 84.66 | 87.61 | 71.66 | 68.23 | 78.58 | 78.20 | 65.55 | 79.15 |
R3Det [3] | ResNet152 | 89.80 | 83.77 | 48.11 | 66.77 | 78.76 | 83.27 | 87.84 | 90.82 | 85.38 | 85.51 | 65.67 | 62.68 | 67.53 | 78.56 | 72.62 | 76.47 |
KLD * [5] | ResNet50 | 88.91 | 85.23 | 53.64 | 81.23 | 78.20 | 76.99 | 84.58 | 89.50 | 86.84 | 86.38 | 71.69 | 68.06 | 75.95 | 72.23 | 75.42 | 78.32 |
Anchor-free: | |||||||||||||||||
DRN * [52] | Hourglass104 | 89.71 | 82.34 | 47.22 | 64.10 | 76.22 | 74.43 | 85.84 | 90.57 | 86.18 | 84.90 | 57.65 | 61.93 | 69.0 | 69.63 | 58.48 | 73.23 |
CFA [22] | ResNet101 | 89.26 | 81.72 | 51.81 | 67.17 | 79.99 | 78.25 | 84.46 | 90.77 | 83.40 | 85.54 | 54.86 | 67.75 | 73.04 | 70.24 | 64.96 | 75.05 |
DAFNet * [57] | ResNet101 | 89.40 | 86.27 | 53.70 | 60.51 | 82.04 | 81.17 | 88.66 | 90.37 | 83.81 | 87.27 | 53.93 | 69.38 | 75.61 | 81.26 | 70.86 | 76.95 |
SASM * [6] | ResNext101 | 88.41 | 83.32 | 54.00 | 74.34 | 80.87 | 84.10 | 88.04 | 90.74 | 82.85 | 86.26 | 63.96 | 66.78 | 78.40 | 73.84 | 61.97 | 77.19 |
DARDet * [58] | ResNet50 | 89.08 | 84.30 | 56.64 | 77.83 | 81.10 | 83.39 | 88.46 | 90.88 | 85.44 | 87.56 | 62.77 | 66.23 | 77.97 | 82.03 | 67.40 | 78.74 |
KFIoU * [59] | Swin-T | 89.44 | 84.41 | 62.22 | 82.51 | 80.10 | 86.07 | 88.68 | 90.90 | 87.32 | 88.38 | 72.80 | 71.95 | 78.96 | 74.95 | 75.27 | 80.93 |
Ours: | |||||||||||||||||
WDARFNet | WDARFNet | 88.60 | 82.10 | 55.91 | 77.96 | 74.41 | 84.79 | 88.70 | 90.88 | 85.43 | 85.59 | 62.52 | 63.34 | 78.06 | 75.63 | 74.48 | 77.90 |
WDARFNet * | WDARFNet | 88.87 | 84.93 | 61.35 | 80.39 | 80.77 | 86.58 | 88.54 | 90.07 | 88.36 | 87.25 | 71.04 | 71.45 | 80.64 | 79.92 | 75.72 | 81.06 |
Method | O-RCNN [13] | O-FCOS [60] | S2A-Net [7] | R3Det [3] | Gliding Vertex [14] | KFIoU [59] | SASM [6] | GWD [4] | KLD [5] | O-Faster RCNN [1] | O-ATSS [61] | CFA [22] | O-Reppoints [21] | RoI-Trans. [9] | WDARFNet |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
APL | 62.92 | 62.89 | 67.98 | 62.55 | 62.67 | 58.03 | 64.78 | 69.68 | 66.52 | 63.07 | 62.19 | 61.10 | 67.80 | 63.28 | 69.04 |
APO | 30.54 | 41.38 | 44.44 | 43.44 | 38.56 | 45.41 | 49.90 | 28.83 | 46.80 | 40.22 | 44.63 | 44.93 | 48.01 | 46.05 | 55.41 |
BF | 78.86 | 71.83 | 71.63 | 71.72 | 71.94 | 69.52 | 74.94 | 74.32 | 71.76 | 71.89 | 71.55 | 77.62 | 77.02 | 71.93 | 76.04 |
BC | 81.42 | 81.00 | 81.39 | 81.48 | 81.20 | 81.55 | 80.38 | 81.49 | 81.43 | 81.36 | 81.42 | 84.67 | 85.37 | 81.33 | 81.51 |
BR | 39.77 | 38.01 | 42.66 | 36.49 | 37.73 | 38.82 | 34.52 | 29.62 | 40.81 | 39.67 | 41.08 | 37.69 | 38.55 | 43.71 | 41.84 |
CH | 72.40 | 72.46 | 72.72 | 72.63 | 72.48 | 73.36 | 69.21 | 72.67 | 78.25 | 72.51 | 72.37 | 75.71 | 78.45 | 72.69 | 76.45 |
ESA | 73.95 | 77.73 | 79.03 | 79.50 | 78.62 | 78.08 | 76.28 | 76.45 | 79.23 | 79.19 | 78.54 | 82.68 | 81.13 | 80.17 | 81.13 |
ETS | 61.33 | 67.52 | 70.40 | 64.41 | 69.04 | 66.41 | 61.37 | 63.14 | 66.63 | 69.45 | 67.50 | 72.03 | 72.06 | 70.04 | 73.22 |
DAM | 26.02 | 28.61 | 27.08 | 27.02 | 22.81 | 25.23 | 31.66 | 27.13 | 29.01 | 26.00 | 30.56 | 33.41 | 33.67 | 31.42 | 35.03 |
GF | 65.61 | 74.58 | 75.56 | 77.36 | 77.89 | 79.24 | 72.22 | 77.19 | 78.68 | 77.93 | 75.69 | 77.25 | 76.00 | 78.00 | 77.32 |
GTF | 75.58 | 77.04 | 81.02 | 77.17 | 82.13 | 78.25 | 77.81 | 78.94 | 80.19 | 82.28 | 79.11 | 79.94 | 79.89 | 83.48 | 80.49 |
HA | 32.15 | 40.66 | 43.41 | 40.53 | 46.22 | 44.67 | 44.69 | 39.11 | 44.88 | 46.91 | 42.77 | 46.20 | 45.72 | 49.04 | 48.62 |
OP | 54.80 | 53.92 | 56.45 | 53.33 | 54.76 | 54.45 | 52.08 | 42.18 | 57.23 | 53.90 | 56.31 | 54.27 | 54.27 | 58.29 | 56.77 |
SH | 81.12 | 79.41 | 81.12 | 79.66 | 81.03 | 80.78 | 83.64 | 79.10 | 80.91 | 81.03 | 80.92 | 87.01 | 85.13 | 81.17 | 84.64 |
STA | 67.76 | 66.33 | 68.00 | 69.22 | 74.88 | 68.40 | 62.83 | 70.41 | 74.17 | 75.77 | 67.78 | 70.43 | 76.04 | 77.93 | 78.04 |
STO | 62.44 | 67.57 | 70.03 | 61.10 | 62.54 | 64.52 | 63.91 | 58.69 | 68.02 | 62.54 | 69.24 | 69.58 | 65.27 | 62.61 | 68.31 |
TC | 81.38 | 79.88 | 87.07 | 81.54 | 81.41 | 81.49 | 80.79 | 81.52 | 81.48 | 81.42 | 81.62 | 81.55 | 85.38 | 81.40 | 84.63 |
TS | 50.42 | 48.10 | 53.88 | 52.18 | 54.25 | 51.64 | 56.54 | 47.78 | 54.63 | 54.50 | 55.45 | 55.51 | 59.76 | 56.05 | 61.96 |
VE | 43.26 | 46.22 | 51.12 | 43.57 | 43.22 | 46.03 | 43.58 | 44.47 | 47.80 | 43.17 | 47.79 | 49.53 | 48.02 | 44.18 | 49.58 |
WM | 64.76 | 64.79 | 65.31 | 64.13 | 65.13 | 59.50 | 63.14 | 62.63 | 64.41 | 65.73 | 64.10 | 64.92 | 68.92 | 66.44 | 69.37 |
mAP | 60.32 | 62.00 | 64.50 | 61.91 | 62.91 | 62.29 | 62.21 | 60.31 | 64.63 | 63.41 | 63.52 | 65.25 | 66.31 | 64.97 | 67.48 |
Method | mAP (07) | mAP (12) |
---|---|---|
DRN [54] | - | 92.70 |
GWD [4] | 89.95 | 97.37 |
Gliding Vertex [14] | 88.20 | - |
RoI-Transformer [9] | 86.20 | - |
CenterMap [53] | - | 92.80 |
R3Det [3] | 89.26 | 96.01 |
S2ANet [7] | 90.19 | 95.01 |
ReDet [12] | 90.46 | 97.63 |
O-RCNN [13] | 90.50 | 97.60 |
O-RepPoints [21] | 90.38 | 97.26 |
WDARFNet | 90.61 | 98.27 |
Share and Cite
Yang, J.; Zhou, L.; Ju, Y. WDARFNet: A Wavelet-Domain Adaptive Receptive Field Network for Improved Oriented Object Detection in Remote Sensing. Appl. Sci. 2025, 15, 7035. https://doi.org/10.3390/app15137035