FiFoNet: Fine-Grained Target Focusing Network for Object Detection in UAV Images
Abstract
1. Introduction
2. Related Work
2.1. SR-Based UAV Object Detection
2.2. Context-Based UAV Object Detection
2.3. MR-Based UAV Object Detection
3. Methodology
3.1. Overview
3.2. Fine-Grained Feature Aggregation (FiFA)
Algorithm 1. The pseudo-code for aggregating fine-grained features with our FiFA module.
Input: Features extracted from different convolution layers .
Output: Features aggregated by our FiFA module .
1: Reshape
2: Concatenate
3: Conv() → =
4: Broadcast() → =
5: Aggregate R with Equation (1).
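The five steps of Algorithm 1 can be illustrated with a toy, pure-Python sketch. Everything specific in it is an assumption made for illustration, because the symbols and Equation (1) are not reproduced above: feature maps are single-channel 2D lists, "Reshape" is taken to be nearest-neighbour resizing to the size of the first map, "Conv" is a 1×1 convolution with hypothetical per-level weights, and the final aggregation is a plain element-wise addition standing in for Equation (1). The names `resize_nearest` and `fifa_sketch` are not from the paper.

```python
def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h x out_w."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fifa_sketch(features, weights):
    """Toy version of Algorithm 1 on single-channel feature maps.

    features: list of 2D lists, one per backbone level (sizes may differ).
    weights:  one hypothetical 1x1-conv weight per level.
    """
    ref_h, ref_w = len(features[0]), len(features[0][0])

    # Steps 1-2: reshape every level to a common size and stack them.
    stacked = [resize_nearest(f, ref_h, ref_w) for f in features]

    # Step 3: a 1x1 convolution is a per-pixel weighted sum over the stack.
    fused = [
        [sum(w * level[r][c] for w, level in zip(weights, stacked))
         for c in range(ref_w)]
        for r in range(ref_h)
    ]

    # Steps 4-5: broadcast the fused map back to each level's size and
    # aggregate (ASSUMED here to be element-wise addition).
    outputs = []
    for f in features:
        h, w = len(f), len(f[0])
        b = resize_nearest(fused, h, w)
        outputs.append([[f[r][c] + b[r][c] for c in range(w)] for r in range(h)])
    return outputs


high_res = [[1.0, 2.0, 3.0, 4.0] for _ in range(4)]   # 4x4 map
low_res = [[10.0, 20.0], [30.0, 40.0]]                # 2x2 map
aggregated = fifa_sketch([high_res, low_res], weights=[0.5, 0.5])
```

Each output keeps the spatial size of its input level, so the aggregated maps could feed the same downstream heads as the originals; a real implementation would operate on multi-channel tensors with learned convolution weights.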
3.3. Target Focusing Block (TFB)
Algorithm 2. The pseudo-code for refining features with the proposed Target-Focusing Block (TFB).
Input: Features aggregated by FiFA ;
the ground truth of objects’ position .
Output: The final enhanced multi-scale features ,
the object position loss .
1: Estimate an object mask map,
2: Compute object position attention loss,
3: Obtain the refined feature map,
4: Resize
5: Output the final :
6:
7:
8:
9:
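Steps 1 to 3 of Algorithm 2 can be sketched in the same toy setting. The formulas are missing from the listing above, so the choices below are assumptions rather than the authors' definitions: the mask is a sigmoid over a hypothetical scalar weight and bias (standing in for a learned predictor), the position loss is a mean binary cross-entropy against a 0/1 ground-truth mask, and refinement is element-wise re-weighting of the feature by the mask. Steps 4 to 9 are not sketched because their content is not recoverable here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tfb_sketch(feature, gt_mask, weight=1.0, bias=0.0):
    """Toy version of steps 1-3 of Algorithm 2 on one single-channel map.

    feature: 2D list of floats (an aggregated feature map).
    gt_mask: 2D list of 0/1 values marking pixels inside ground-truth boxes.
    weight, bias: hypothetical stand-ins for a learned mask predictor.
    """
    h, w = len(feature), len(feature[0])

    # Step 1: estimate a per-pixel object mask in (0, 1).
    mask = [[sigmoid(weight * feature[r][c] + bias) for c in range(w)]
            for r in range(h)]

    # Step 2: position loss, ASSUMED here to be mean binary cross-entropy.
    eps = 1e-7
    loss = 0.0
    for r in range(h):
        for c in range(w):
            p = min(max(mask[r][c], eps), 1.0 - eps)
            y = gt_mask[r][c]
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    loss /= h * w

    # Step 3: refine features by re-weighting them with the mask (ASSUMED).
    refined = [[feature[r][c] * mask[r][c] for c in range(w)] for r in range(h)]
    return refined, mask, loss


feature = [[4.0, -4.0], [-4.0, 4.0]]
gt_mask = [[1, 0], [0, 1]]
refined, mask, loss = tfb_sketch(feature, gt_mask)
```

In this example the mask agrees with the ground truth, so the loss is small and responses at background pixels are suppressed while responses at object pixels are largely preserved.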
3.4. Global-Local Context Collector
4. Experiments
4.1. Datasets and Models
4.2. Implementation and Evaluation Metrics
4.3. Ablation Studies
4.4. Comparison with State-of-the-Art Methods
5. Limitation and Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Avola, D.; Cinque, L.; Diko, A.; Fagioli, A.; Foresti, G.L.; Mecca, A.; Pannone, D.; Piciarelli, C. MS-Faster R-CNN: Multi-stream backbone for improved Faster R-CNN object detection and aerial tracking from UAV images. Remote Sens. 2021, 13, 1670.
2. Stojnić, V.; Risojević, V.; Muštra, M.; Jovanović, V.; Filipi, J.; Kezić, N.; Babić, Z. A method for detection of small moving objects in UAV videos. Remote Sens. 2021, 13, 653.
3. Ma, Y.; Li, Q.; Chu, L.; Zhou, Y.; Xu, C. Real-time detection and spatial localization of insulators for UAV inspection based on binocular stereo vision. Remote Sens. 2021, 13, 230.
4. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Paradise, NV, USA, 26 June–1 July 2016; pp. 779–788.
5. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. Scaled-yolov4: Scaling cross stage partial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 13029–13038.
6. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
7. Zhu, P.; Wen, L.; Du, D.; Bian, X.; Fan, H.; Hu, Q.; Ling, H. Detection and Tracking Meet Drones Challenge. IEEE Trans. Pattern Anal. Mach. Intell. 2021.
8. Wen, L.; Du, D.; Zhu, P.; Hu, Q.; Wang, Q.; Bo, L.; Lyu, S. Detection, tracking, and counting meets drones in crowds: A benchmark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 7812–7821.
9. Deng, S.; Li, S.; Xie, K.; Song, W.; Liao, X.; Hao, A.; Qin, H. A global-local self-adaptive network for drone-view object detection. IEEE Trans. Image Process. 2020, 30, 1556–1569.
10. Yang, X.; Yan, J.; Liao, W.; Yang, X.; Tang, J.; He, T. Scrdet++: Detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing. IEEE Trans. Pattern Anal. Mach. Intell. 2022.
11. Yang, X.; Yang, J.; Yan, J.; Zhang, Y.; Zhang, T.; Guo, Z.; Sun, X.; Fu, K. Scrdet: Towards more robust detection for small, cluttered and rotated objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 8232–8241.
12. Deng, C.; Wang, M.; Liu, L.; Liu, Y.; Jiang, Y. Extended feature pyramid network for small object detection. IEEE Trans. Multimed. 2021, 24, 1968–1979.
13. Noh, J.; Bae, W.; Lee, W.; Seo, J.; Kim, G. Better to follow, follow to be better: Towards precise supervision of feature super-resolution for small object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 9725–9734.
14. Bashir, S.M.A.; Wang, Y. Small object detection in remote sensing images with residual feature aggregation-based super-resolution and object detector network. Remote Sens. 2021, 13, 1854.
15. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
16. Peng, J.; Wang, H.; Yue, S.; Zhang, Z. Context-aware co-supervision for accurate object detection. Pattern Recognit. 2022, 121, 108199.
17. Tang, X.; Du, D.K.; He, Z.; Liu, J. Pyramidbox: A context-assisted single shot face detector. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 797–813.
18. Kong, Y.; Feng, M.; Li, X.; Lu, H.; Liu, X.; Yin, B. Spatial context-aware network for salient object detection. Pattern Recognit. 2021, 114, 107867.
19. Jiao, L.; Gao, J.; Liu, X.; Liu, F.; Yang, S.; Hou, B. Multi-Scale Representation Learning for Image Classification: A Survey. IEEE Trans. Artif. Intell. 2021.
20. Qiao, S.; Chen, L.C.; Yuille, A. Detectors: Detecting objects with recursive feature pyramid and switchable atrous convolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 10213–10224.
21. Dai, X.; Chen, Y.; Xiao, B.; Chen, D.; Liu, M.; Yuan, L.; Zhang, L. Dynamic head: Unifying object detection heads with attentions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 7373–7382.
22. Han, J.; Yao, X.; Cheng, G.; Feng, X.; Xu, D. P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 579–590.
23. Song, L.; Li, Y.; Jiang, Z.; Li, Z.; Sun, H.; Sun, J.; Zheng, N. Fine-grained dynamic head for object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 11131–11141.
24. Du, D.; Qi, Y.; Yu, H.; Yang, Y.; Duan, K.; Li, G.; Zhang, W.; Huang, Q.; Tian, Q. The unmanned aerial vehicle benchmark: Object detection and tracking. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 370–386.
25. Zhou, J.; Vong, C.M.; Liu, Q.; Wang, Z. Scale adaptive image cropping for UAV object detection. Neurocomputing 2019, 366, 305–313.
26. Xi, Y.; Jia, W.; Zheng, J.; Fan, X.; Xie, Y.; Ren, J.; He, X. DRL-GAN: Dual-stream representation learning GAN for low-resolution image classification in UAV applications. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 14, 1705–1716.
27. Yang, F.; Fan, H.; Chu, P.; Blasch, E.; Ling, H. Clustered object detection in aerial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 8311–8320.
28. Li, J.; Liang, X.; Wei, Y.; Xu, T.; Feng, J.; Yan, S. Perceptual generative adversarial networks for small object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1222–1230.
29. Bell, S.; Zitnick, C.L.; Bala, K.; Girshick, R. Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Paradise, NV, USA, 26 June–1 July 2016; pp. 2874–2883.
30. Qiu, H.; Li, H.; Wu, Q.; Meng, F.; Xu, L.; Ngan, K.N.; Shi, H. Hierarchical context features embedding for object detection. IEEE Trans. Multimed. 2020, 22, 3039–3050.
31. Li, Y.; Chen, Y.; Wang, N.; Zhang, Z. Scale-aware trident networks for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 6054–6063.
32. Zou, Z.; Shi, Z. Random access memories: A new paradigm for target detection in high resolution aerial remote sensing images. IEEE Trans. Image Process. 2017, 27, 1100–1111.
33. Bai, Y.; Zhang, Y.; Ding, M.; Ghanem, B. Sod-mtgan: Small object detection via multi-task generative adversarial network. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 206–221.
34. Hu, P.; Ramanan, D. Finding tiny faces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 951–959.
35. Mukhiddinov, M.; Cho, J. Smart glass system using deep learning for the blind and visually impaired. Electronics 2021, 10, 2756.
36. Yuan, Y.; Xiong, Z.; Wang, Q. VSSA-NET: Vertical spatial sequence attention network for traffic sign detection. IEEE Trans. Image Process. 2019, 28, 3423–3434.
37. Liu, Y.; Cao, S.; Lasang, P.; Shen, S. Modular lightweight network for road object detection using a feature fusion approach. IEEE Trans. Syst. Man Cybern. Syst. 2019, 51, 4716–4728.
38. Xiang, W.; Zhang, D.Q.; Yu, H.; Athitsos, V. Context-aware single-shot detector. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Lake Tahoe, NV, USA, 12–15 March 2018; pp. 1784–1793.
39. Ouyang, W.; Wang, K.; Zhu, X.; Wang, X. Chained cascade network for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1938–1946.
40. Singh, B.; Davis, L.S. An analysis of scale invariance in object detection snip. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3578–3587.
41. Lyu, P.; Yao, C.; Wu, W.; Yan, S.; Bai, X. Multi-oriented scene text detection via corner localization and region segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7553–7563.
42. Pang, J.; Chen, K.; Shi, J.; Feng, H.; Ouyang, W.; Lin, D. Libra r-cnn: Towards balanced learning for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 821–830.
43. Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
44. Zoph, B.; Le, Q.V. Neural architecture search with reinforcement learning. Int. Conf. Learn. Represent. 2017, 1–16.
45. Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 10781–10790.
46. Ghiasi, G.; Lin, T.Y.; Le, Q.V. Nas-fpn: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 7036–7045.
47. Narasimhan, S.G.; Nayar, S.K. Vision and the atmosphere. Int. J. Comput. Vis. 2002, 48, 233–254.
48. Ranftl, R.; Bochkovskiy, A.; Koltun, V. Vision transformers for dense prediction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 12179–12188.
49. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
50. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
51. Chalavadi, V.; Jeripothula, P.; Datla, R.; Ch, S.B. mSODANet: A Network for Multi-Scale Object Detection in Aerial Images using Hierarchical Dilated Convolutions. Pattern Recognit. 2022, 126, 108548.
52. Yu, W.; Yang, T.; Chen, C. Towards resolving the challenge of long-tail distribution in UAV images for object detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 3258–3267.
53. Wang, Y.; Yang, Y.; Zhao, X. Object detection using clustering algorithm adaptive searching regions in aerial images. In Proceedings of the ECCV, Glasgow, UK, 23–28 August 2020; pp. 651–664.
54. Liu, Z.; Gao, G.; Sun, L.; Fang, Z. HRDNet: High-resolution detection network for small objects. In Proceedings of the ICME, Shenzhen, China, 5–9 July 2021; pp. 1–6.
55. Zhu, X.; Lyu, S.; Wang, X.; Zhao, Q. TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios. In Proceedings of the ICCVW, Montreal, BC, Canada, 11–17 October 2021; pp. 2778–2788.
56. Everingham, M.; Eslami, S.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The pascal visual object classes challenge: A retrospective. Int. J. Comput. Vis. 2015, 111, 98–136.
57. Jocher, G. YOLOv5. 2021. Available online: https://github.com/ultralytics/yolov5 (accessed on 1 August 2022).
Methods | Advantages | Drawbacks |
---|---|---|
SR-based methods [26,27,28] | Reconstruct the information of RoIs; can effectively recognize small objects. | Difficult to locate RoIs accurately against cluttered backgrounds, where the RoIs are hard to reconstruct. |
Context-based methods [17,29,30] | Can effectively detect small targets against a fixed background (e.g., cars on roads). | Difficult to build such contextual relationships because UAV background scenes are diverse. |
MR-based methods [15,31,32] | Can effectively detect multi-scale objects, including small, medium, and large objects. | Suffer from feature-level imbalance. |
Method | Sum and Average | Concatenation | FiFA |
---|---|---|---|
mAP @.5 | 33.5 | 33.8 | 34.6 |
Method | Train Image Size | Test Image Size | mAP @.5 | Speed (ms) | Pedestrian | People | Bicycle | Car | Van | Truck | Tricycle | Awning-Tricycle | Bus | Motor |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Baseline | 640 | 640 | 33.5 | 7.0 | 39.3 | 32.0 | 11.9 | 73.8 | 36.2 | 31.2 | 20.3 | 12.2 | 39.7 | 38.0 |
Baseline + FiFA | 640 | 640 | 34.6 | 7.0 | 40.9 | 32.9 | 11.4 | 74.5 | 37.5 | 31.3 | 20.5 | 10.9 | 46.8 | 38.8 |
Baseline + TFB | 640 | 640 | 34.5 | 7.0 | 39.8 | 33.1 | 11.0 | 74.5 | 37.0 | 31.7 | 18.6 | 11.2 | 48.7 | 39.0 |
Baseline + tinyHead | 640 | 640 | 37.2 | 7.0 | 45.1 | 35.4 | 12.8 | 79.1 | 40.0 | 34.3 | 22.0 | 12.1 | 48.8 | 42.6 |
Baseline | 640 | 1996 | 35.8 | 7.0 | 52.6 | 34.8 | 14.5 | 81.0 | 39.2 | 21.2 | 22.2 | 10.6 | 40.8 | 42.7 |
Baseline | 1536 | 640 | 34.0 | 7.0 | 34.8 | 30.3 | 14.4 | 73.4 | 36.3 | 32.5 | 22.0 | 11.7 | 48.0 | 36.8 |
Baseline | 1536 | 1996 | 55.6 | 15.6 | 69.2 | 54.9 | 36.2 | 89.1 | 55.6 | 49.4 | 44.7 | 24.8 | 69.6 | 62.6 |
Baseline + tinyHead | 1536 | 1996 | 56.1 | 15.6 | 70.4 | 55.4 | 37.6 | 89.7 | 57.7 | 48.9 | 42.8 | 24.2 | 68.7 | 65.3 |
Baseline + largeModel | 1536 | 1996 | 61.4 | 15.6 | 74.6 | 60.8 | 46.4 | 90.8 | 60.5 | 55.0 | 50.2 | 30.2 | 76.6 | 69.0 |
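As a quick worked check of the ablation table above, the gain contributed by each module at a 640/640 train/test image size can be read off as the difference from the baseline mAP @.5. The snippet only re-uses values printed in the table; the dictionary and variable names are illustrative.

```python
# mAP @.5 (%) at train/test image size 640/640, copied from the table above.
map50_640 = {
    "Baseline": 33.5,
    "Baseline + FiFA": 34.6,
    "Baseline + TFB": 34.5,
    "Baseline + tinyHead": 37.2,
}

baseline = map50_640["Baseline"]

# Gain of each variant over the baseline, rounded to one decimal place.
gains = {
    name: round(score - baseline, 1)
    for name, score in map50_640.items()
    if name != "Baseline"
}

for name, gain in gains.items():
    print(f"{name}: +{gain} mAP @.5")
```

Read this way, the additional detection head gives the largest single gain (+3.7 points), while FiFA and TFB each add roughly one point on their own.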
k = 1, d = 1 | k = 3, d = 1 | k = 3, d = 2 | k = 3, d = 3 | k = 3, d = 4 | k = 3, d = 5 | | | |
---|---|---|---|---|---|---|---|---|
🗸 | | | | | | 21.4 | 33.5 | 20.2 |
🗸 | 🗸 | | | | | 21.6 | 33.9 | 20.5 |
🗸 | 🗸 | 🗸 | | | | 21.8 | 34.2 | 20.7 |
🗸 | 🗸 | 🗸 | 🗸 | | | 21.9 | 34.3 | 20.9 |
🗸 | 🗸 | 🗸 | 🗸 | 🗸 | | 22.0 | 34.3 | 20.9 |
🗸 | 🗸 | 🗸 | 🗸 | 🗸 | 🗸 | 22.0 | 34.4 | 21.0 |
Model | Input Size | mAP @.5 | FLOPs | Params (M) | Time (ms), Server | Time (ms), Edge Device |
---|---|---|---|---|---|---|
Baseline | Small | 55.6 | 15.9 | 7.0 | 18.4 | 69.8 |
Baseline | Large | 61.4 | 204.2 | 86.2 | 79.7 | 480.9 |
Baseline + FiFA | Small | 56.7 | 16.6 | 7.2 | 19.1 | 74.5 |
Baseline + FiFA | Large | 62.9 | 209.0 | 87.7 | 82.2 | 497.8 |
Baseline + TFB | Small | 56.8 | 18.5 | 7.4 | 19.5 | 75.3 |
Baseline + TFB | Large | 63.1 | 220.8 | 88.6 | 82.4 | 501.2 |
Baseline + FiFA + TFB | Small | 57.5 | 18.9 | 7.5 | 19.7 | 79.5 |
Baseline + FiFA + TFB | Large | 63.8 | 225.9 | 89.8 | 82.8 | 509.8 |
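The cost of adding both modules can be expressed as a relative increase over the baseline. The sketch below does this for the "Small" rows of the table above, assuming that the last four values in each row are, in order, FLOPs, Params (M), server time (ms), and edge-device time (ms); that column mapping and all names in the code are assumptions of this example.

```python
# "Small" rows from the table above, read as
# (FLOPs, Params in M, server time in ms, edge-device time in ms).
baseline = {"flops": 15.9, "params_m": 7.0, "server_ms": 18.4, "edge_ms": 69.8}
full = {"flops": 18.9, "params_m": 7.5, "server_ms": 19.7, "edge_ms": 79.5}


def relative_increase(new, old):
    """Percentage increase of new over old, rounded to one decimal place."""
    return round((new - old) / old * 100.0, 1)


# Overhead of Baseline + FiFA + TFB relative to Baseline, per cost measure.
overhead = {key: relative_increase(full[key], baseline[key]) for key in baseline}
print(overhead)
```

Under that reading, FiFA and TFB together add about 7% to the parameter count and server inference time of the small model, about 14% to edge-device time, and about 19% to FLOPs.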
Method | Thin Fog | Medium Thick Fog | Thick Fog |
---|---|---|---|
Baseline | 54.25 | 53.72 | 51.01 |
FiFoNet | 58.15 | 56.31 | 52.85 |
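The fog-robustness table above can be summarized by the gap between FiFoNet and the baseline at each fog level. The snippet below only subtracts the printed values; the metric is not named in the table, so it is referred to simply as a score, and the variable names are illustrative.

```python
# Scores copied from the table above, ordered thin -> medium thick -> thick fog.
fog_levels = ["Thin Fog", "Medium Thick Fog", "Thick Fog"]
baseline_scores = [54.25, 53.72, 51.01]
fifonet_scores = [58.15, 56.31, 52.85]

# Margin of FiFoNet over the baseline at each fog level.
margins = {
    level: round(ours - base, 2)
    for level, base, ours in zip(fog_levels, baseline_scores, fifonet_scores)
}
print(margins)
```

FiFoNet stays ahead of the baseline at every level, but the margin narrows from 3.90 points in thin fog to 1.84 points in thick fog.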
Method | VisDrone2019 | VisDrone2019 | VisDrone2019 | UAVDT | UAVDT | UAVDT |
---|---|---|---|---|---|---|
SSD [49] (ECCV 16) | - | 15.20 | - | 9.30 | 21.40 | 6.70 |
FRCNN [6] + FPN [50] | 21.80 | 41.80 | 20.10 | 11.00 | 23.40 | 8.40 |
YOLOv5 [57] (GitHub 21) | 24.90 | 42.40 | 25.10 | 19.10 | 33.90 | 19.60 |
DSHNet [52] (WACV 21) | 30.30 | 51.80 | 30.90 | 17.80 | 30.40 | 19.70 |
CRENet [53] (ECCV 20) | 33.70 | 54.30 | 33.50 | - | - | - |
GLSAN [9] (TIP 20) | 30.70 | 55.60 | 29.90 | 19.00 | 30.50 | 21.70 |
ClustDet [27] (ICCV 19) | 32.40 | 56.20 | 31.60 | 13.70 | 26.50 | 12.50 |
SAIC-FPN [25] (Neurocomputing 19) | 35.69 | 62.97 | 35.08 | - | - | - |
HRDNet [54] (ICME 21) | 35.50 | 62.00 | 35.10 | - | - | - |
mSODANet [51] (PR 22) | 36.89 | 55.92 | 37.41 | - | - | - |
FiFoNet (Ours) | 36.91 | 63.80 | 36.11 | 21.30 | 36.80 | 22.50 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xi, Y.; Jia, W.; Miao, Q.; Liu, X.; Fan, X.; Li, H. FiFoNet: Fine-Grained Target Focusing Network for Object Detection in UAV Images. Remote Sens. 2022, 14, 3919. https://doi.org/10.3390/rs14163919