Feature-Enhanced CenterNet for Small Object Detection in Remote Sensing Images
Abstract
1. Introduction
- An anchor-free detector that performs well on small object detection in complex remote sensing scenes.
- A feature-enhanced module that improves feature extraction and representation for small objects by mining multiscale features and integrating an attention mechanism.
- A purpose-built dataset of small, dim vehicles, which helps to assess the performance of detection algorithms on small objects.
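
As a rough illustration of the two ingredients named in the second contribution (multiscale features and an attention mechanism), the following toy 1D sketch may help. It is not the paper's feature-enhanced module: it assumes dilated convolution as one common way to gather context at several scales, and it replaces learned attention layers with a fixed sigmoid of each branch's global average. All function names (`dilated_conv1d`, `channel_attention`, `enhance`) are hypothetical.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1D convolution with zero padding and the given dilation rate."""
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - half + k * dilation  # input position sampled by this kernel tap
            if 0 <= j < len(signal):     # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def channel_attention(channels):
    """Scale each channel by the sigmoid of its global average.

    Simplification (assumption): real attention blocks learn this mapping;
    here it is a fixed function so the example stays dependency-free.
    """
    weights = [1.0 / (1.0 + math.exp(-sum(c) / len(c))) for c in channels]
    scaled = [[w * v for v in c] for w, c in zip(weights, channels)]
    return scaled, weights


def enhance(signal, kernel=(1.0, 1.0, 1.0), dilations=(1, 2, 3)):
    """One branch per dilation rate (multiscale), then attention reweighting."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return channel_attention(branches)


if __name__ == "__main__":
    point_target = [0, 0, 0, 1, 0, 0, 0]  # a single bright sample, like a tiny object
    scaled, weights = enhance(point_target)
    for d, branch in zip((1, 2, 3), scaled):
        print(d, [round(v, 3) for v in branch])
```

Larger dilation rates spread the response of the single bright sample over a wider neighborhood without adding kernel taps, which is the sense in which the branches see the input at different scales.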
2. Related Works
2.1. Anchor-Based Framework for Object Detection
2.2. Anchor-Free Framework for Object Detection
3. Proposed Method
3.1. Feature-Enhanced Module
3.2. Loss Function
4. Experimental Results
4.1. Dim and Small Vehicle Dataset
- The vehicles are small, dim, and low in contrast, with a relatively uniform appearance, which makes it very difficult to obtain a robust feature representation for them. Local regions containing such objects are shown in Figure 3.
- The images cover wide areas and a variety of complex scenes, such as parking lots, roads, and neighborhoods. These scenes also contain many vehicle-like objects that are prone to causing false alarms. Examples of such complicated scenes are shown in Figure 4.
4.2. Evaluation Metrics
4.3. Implementation Details and Ablation Analysis
4.4. Algorithm Performance Comparison
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Methods | Precision | Recall | F1 Score | AP | FPS |
---|---|---|---|---|---|
CenterNet | 79.8% | 74.9% | 77.3% | 70.2% | 17.9 |
CenterNet + FEM | 82.7% | 78.6% | 80.6% | 74.5% | 16.7 |
CenterNet + proposed loss function | 80.8% | 77.0% | 78.9% | 72.9% | 17.9 |
CenterNet + FEM + proposed loss function | 83.5% | 80.8% | 82.1% | 77.4% | 16.6 |
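
The ablation rows above can be read as differences against the CenterNet baseline. The following small Python sketch recomputes those differences from the AP and FPS values in the table; the dictionary layout and the helper name `delta_vs_baseline` are illustrative assumptions, not part of the paper.

```python
# AP (%) and FPS copied from the ablation table above.
ablation = {
    "CenterNet": {"AP": 70.2, "FPS": 17.9},
    "CenterNet + FEM": {"AP": 74.5, "FPS": 16.7},
    "CenterNet + proposed loss function": {"AP": 72.9, "FPS": 17.9},
    "CenterNet + FEM + proposed loss function": {"AP": 77.4, "FPS": 16.6},
}


def delta_vs_baseline(table, baseline="CenterNet"):
    """Return (AP change in points, FPS change) for every non-baseline row."""
    base = table[baseline]
    return {
        name: (
            round(row["AP"] - base["AP"], 1),
            round(row["FPS"] - base["FPS"], 1),
        )
        for name, row in table.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (d_ap, d_fps) in delta_vs_baseline(ablation).items():
        print(f"{name}: AP {d_ap:+.1f} points, FPS {d_fps:+.1f}")
```

With these inputs, FEM alone accounts for +4.3 AP points at a cost of 1.2 FPS, the proposed loss function alone for +2.7 AP points with no change in the reported FPS, and the combination for +7.2 AP points at a cost of 1.3 FPS.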

Methods | Precision | Recall | F1 Score | AP | FPS |
---|---|---|---|---|---|
Cascade-RCNN | 81.0% | 69.1% | 74.6% | 75.6% | 7.7 |
imYOLOv3 | 76.0% | 78.7% | 77.3% | 74.4% | 15.8 |
YOLOv7 | 78.8% | 79.3% | 79.1% | 76.1% | 20.1 |
FII-CenterNet | 77.2% | 74.1% | 75.7% | 74.8% | 14.3 |
FE-CenterNet (ours) | 83.5% | 80.8% | 82.1% | 77.4% | 16.5 |
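
The F1 Score column can be cross-checked against the Precision and Recall columns, since F1 is the harmonic mean of the two. The sketch below assumes that standard definition and uses the values from the comparison table above; the function name `f1_score` and the `rows` layout are illustrative choices.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the comparison table above.
rows = {
    "Cascade-RCNN": (81.0, 69.1, 74.6),
    "imYOLOv3": (76.0, 78.7, 77.3),
    "YOLOv7": (78.8, 79.3, 79.1),
    "FII-CenterNet": (77.2, 74.1, 75.7),
    "FE-CenterNet (ours)": (83.5, 80.8, 82.1),
}

if __name__ == "__main__":
    for name, (p, r, reported) in rows.items():
        print(f"{name}: computed {f1_score(p, r):.2f}, reported {reported}")
```

The recomputed values agree with the reported ones to within 0.1 percentage points, which is consistent with the precision and recall entries having been rounded to one decimal place before publication.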
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shi, T.; Gong, J.; Hu, J.; Zhi, X.; Zhang, W.; Zhang, Y.; Zhang, P.; Bao, G. Feature-Enhanced CenterNet for Small Object Detection in Remote Sensing Images. Remote Sens. 2022, 14, 5488. https://doi.org/10.3390/rs14215488