Low-Altitude Remote Sensing Opium Poppy Image Detection Based on Modified YOLOv3
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Collection
2.2. Data Processing
2.2.1. Remote Sensing Image
2.2.2. Data Segmentation
2.2.3. Data Enhancement
2.2.4. Mosaic Enhancement
3. Model Improvement
3.1. Basic YOLOv3 Network
3.2. Improved YOLOv3 Model
3.2.1. Construction of the Network for Poppy Detection in Villages
3.2.2. ResNeXt Group Convolution
3.2.3. ASPP Module
4. Experiment and Analysis
4.1. Experiment Dataset
4.2. Model Training
4.3. Evaluation Indicators
4.3.1. Confidence and Intersection-over-Union
4.3.2. Precision, Recall, and F1 Harmonic Average
4.3.3. Average Precision and Mean Average Precision
4.4. Comparison of Different Detection Algorithms
4.5. Comparison of Detection Results
5. Discussion
- (1) Interference from natural factors. Because the UAV captures images at an altitude of 120 m, atmospheric wind can cause the camera to shake. The resulting images may be blurred, weakening the visible features of the poppies, which interferes with the model's extraction of poppy features and reduces recognition accuracy.
- (2) Multiple planting strategies. Illegal poppy growers deliberately cover the plants with transparent plastic sheets, which degrades the UAV imagery while still admitting enough sunlight for the plants. Growers may also intermix poppies with a variety of visually similar crops; in UAV images, the appearance and color of poppies at the flowering stage closely resemble those of green onion, which can cause detection errors.
- (3) Complicated planting background. The backgrounds of the original images captured by the UAV are more complicated than those of the cropped 416 × 416 pixel images in our dataset and may contain buildings, other vegetable crops, bushes, flowers, and so on. Such backgrounds raise the false detection rate, so the model's ability to learn from negative samples needs to be strengthened by expanding the dataset (for example, with mosaic-style augmentation as sketched below), so that it can adapt to the interference of a complicated environment.
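The paper's own mosaic enhancement (Section 2.2.4) is one way to perform such expansion: four training images are tiled into a single composite so that each sample carries more objects and more background variety. The sketch below is a minimal numpy/OpenCV illustration; the `mosaic` function name, the fixed 2 × 2 layout, and the 416-pixel output size are illustrative assumptions, and a real pipeline would also remap each image's bounding boxes into the composite's coordinate frame.

```python
# Illustrative mosaic-style augmentation (cf. Section 2.2.4): tile four
# images into one composite. Function name and layout are hypothetical,
# not the authors' exact implementation.
import numpy as np
import cv2

def mosaic(images, out_size=416):
    """Tile four HxWx3 uint8 images into one out_size x out_size mosaic."""
    assert len(images) == 4, "mosaic expects exactly four images"
    half = out_size // 2
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    # top-left, top-right, bottom-left, bottom-right corner offsets
    anchors = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (y, x) in zip(images, anchors):
        canvas[y:y + half, x:x + half] = cv2.resize(img, (half, half))
    return canvas
```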
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
| Camera Model | Effective Pixels | Maximum Resolution | Color Space |
|---|---|---|---|
| Sony A7RII | 42.4 million | 7952 × 5304 | sRGB |
| Image File Name | Resolution | Ground Resolution/cm | File Type |
|---|---|---|---|
| DSC***** | 7952 × 5304 | 1.2 | JPG |
| Scheme | Add 104 × 104 Prediction | Modify Residual Structure | Add the ASPP Module |
|---|---|---|---|
| 1 | √ | | |
| 2 | √ | √ | |
| 3 | √ | √ | √ |
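The "Modify Residual Structure" column refers to the ResNeXt-style grouped convolution of Section 3.2.2, which is what drives the sharp drop in total parameters visible for Schemes 2 and 3 in the results table below. The following is a minimal PyTorch sketch of such a block; the channel counts and cardinality are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal ResNeXt-style residual block: the 3x3 convolution is split into
# parallel groups (cardinality), cutting parameters versus a dense 3x3 conv.
import torch
import torch.nn as nn

class GroupedResidualBlock(nn.Module):
    def __init__(self, channels=256, cardinality=32, bottleneck=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, bottleneck, kernel_size=1, bias=False),
            nn.BatchNorm2d(bottleneck),
            nn.ReLU(inplace=True),
            # groups=cardinality is the grouped (ResNeXt) convolution
            nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=1,
                      groups=cardinality, bias=False),
            nn.BatchNorm2d(bottleneck),
            nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))  # identity shortcut

x = torch.randn(1, 256, 52, 52)
print(GroupedResidualBlock()(x).shape)  # torch.Size([1, 256, 52, 52])
```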
| Model | Precision/% | Recall/% | F1/% | mAP/% | Total Parameters | Detection Time/ms |
|---|---|---|---|---|---|---|
| YOLOv3 | 95.5 | 90.8 | 93.1 | 91.97 | 61,523,734 | 30.0 |
| Scheme 1 | 94.5 | 91.7 | 93.1 | 92.58 | 61,785,384 | 30.2 |
| Scheme 2 | 91.8 | 89.1 | 90.4 | 90.89 | 3,591,224 | 32.4 |
| Scheme 3 | 94.3 | 90.7 | 92.0 | 92.42 | 8,457,192 | 24.5 |
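The mAP column is the mean of the per-class average precision of Section 4.3.3; with poppy as the single class, mAP reduces to AP, the area under the precision-recall curve. Below is a minimal all-point-interpolation sketch, assuming TP/FP flags have already been assigned to each detection at a fixed IoU threshold.

```python
# Average precision (Section 4.3.3) by all-point interpolation:
# sort detections by confidence, accumulate TP/FP, and integrate
# the monotone precision envelope over recall.
import numpy as np

def average_precision(scores, is_tp, n_ground_truth):
    order = np.argsort(-np.asarray(scores))
    hits = np.asarray(is_tp, dtype=float)[order]
    tp = np.cumsum(hits)
    fp = np.cumsum(1.0 - hits)
    recall = tp / n_ground_truth
    precision = tp / (tp + fp)
    # make precision monotonically decreasing from right to left
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    # integrate precision over recall increments
    return np.sum(np.diff(np.concatenate(([0.0], recall))) * precision)
```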
| Model | P/% | R/% | F1/% | mAP/% | Total Parameters | Detection Time/ms |
|---|---|---|---|---|---|---|
| Faster R-CNN (ResNet50) | 45.5 | 63.1 | 52.9 | 54.38 | 28,342,195 | 134.0 |
| RetinaNet | 76.8 | 38.6 | 51.4 | 49.86 | 36,382,957 | 33.2 |
| CenterNet | 97.2 | 32.8 | 49.0 | 57.50 | 32,718,597 | 36.2 |
| SSD | 87.7 | 65.8 | 75.2 | 70.44 | 23,745,908 | 21.5 |
| YOLOv4 | 88.5 | 36.6 | 51.8 | 47.86 | 63,937,686 | 34.2 |
| YOLOv3 (EfficientNet) | 88.6 | 38.9 | 54.3 | 49.49 | 20,235,902 | 26.4 |
| YOLOv3 (MobileNet) | 94.7 | 90.7 | 92.7 | 91.98 | 24,135,862 | 22.4 |
| YOLOv3 (SqueezeNet) | 85.0 | 84.8 | 84.9 | 81.98 | 31,399,926 | 24.2 |
| Global Multiscale-YOLOv3 | 94.3 | 90.7 | 92.0 | 92.42 | 8,457,192 | 24.5 |
| Model | Number of Detections | TP | FP | FN | Ground Truth | Precision/% | Recall/% | F1/% |
|---|---|---|---|---|---|---|---|---|
| Global Multiscale-YOLOv3 | 1539 | 1451 | 88 | 148 | 1600 | 94.3 | 90.7 | 92.0 |
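These figures follow directly from the raw counts via the definitions in Section 4.3.2: P = TP/(TP + FP) = 1451/1539 ≈ 94.3% and R = TP/(TP + FN) = 1451/1599 ≈ 90.7%. A quick check in Python; note that the F1 computed from the unrounded precision and recall comes out near 92.5%, slightly above the reported 92.0, presumably due to intermediate rounding in the paper.

```python
# Recomputing the table's metrics from its raw counts (Section 4.3.2).
tp, fp, fn = 1451, 88, 148
precision = tp / (tp + fp)   # 1451 / 1539 ≈ 0.943
recall = tp / (tp + fn)      # 1451 / 1599 ≈ 0.907
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
print(f"P={precision:.1%}  R={recall:.1%}  F1={f1:.1%}")
```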
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).