Confidence-Aware Object Detection Based on MobileNetv2 for Autonomous Driving
Abstract
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Model Architecture
3.2. Model Parameters
3.3. Bounding Box Regression with Cross-Entropy Loss
3.4. NMS Based on Bounding Box Confidence
4. Results
4.1. Experiments on the KITTI Dataset
4.2. Experiment on the Pascal VOC Dataset
5. Discussion
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 19–23 June 2018; pp. 4510–4520.
2. Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection over Union: A Metric and a Loss for Bounding Box Regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019.
3. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
4. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
5. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015.
6. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv 2016, arXiv:1506.01497.
7. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
8. Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into High Quality Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
9. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861.
10. Howard, A.; Sandler, M.; Chu, G.; Chen, L.C.; Chen, B.; Tan, M.; Adam, H. Searching for MobileNetV3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019.
11. Wu, B.; Iandola, F.; Jin, P.H.; Keutzer, K. SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017.
12. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2016; pp. 21–37.
13. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
14. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
15. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
16. Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and Efficient Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020.
17. Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. GhostNet: More Features from Cheap Operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020.
18. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
19. Zhou, D.; Fang, J.; Song, X.; Guan, C.; Yin, J.; Dai, Y.; Yang, R. IoU Loss for 2D/3D Object Detection. In Proceedings of the 2019 International Conference on 3D Vision (3DV), Québec City, QC, Canada, 16–19 September 2019.
20. Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020.
21. Choi, J.; Chun, D.; Kim, H.; Lee, H.J. Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019.
22. He, Y.; Zhu, C.; Wang, J.; Savvides, M.; Zhang, X. Bounding Box Regression with Uncertainty for Accurate Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019.
23. Geiger, A.; Lenz, P.; Urtasun, R. Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012.
24. Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Development Kit. 2006. Available online: http://host.robots.ox.ac.uk/pascal/VOC/ (accessed on 28 March 2021).
Operator | t | c | n | s
---|---|---|---|---
Conv2d | – | 32 | 1 | 2
Bottleneck | 1 | 16 | 1 | 1
Bottleneck | 6 | 24 | 2 | 2
Bottleneck | 6 | 32 | 3 | 2
Bottleneck | 6 | 64 | 4 | 2
Bottleneck | 6 | 96 | 3 | 1
Bottleneck | 6 | 160 | 3 | 2
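
In the architecture table above, the columns follow the MobileNetV2 convention: t is the expansion factor of the inverted-residual bottleneck, c the number of output channels, n how many times the block is repeated in a stage, and s the stride of the first repeat. As a rough sketch (assumptions: the standard expand → 3×3 depthwise → linear-projection layout, batch-norm and bias parameters ignored), the table's stages translate into weight counts like this:

```python
def bottleneck_params(c_in, t, c_out):
    """Weights in one inverted-residual block (no BN, no bias)."""
    hidden = c_in * t
    expand = c_in * hidden if t != 1 else 0  # 1x1 expansion, skipped when t == 1
    depthwise = hidden * 3 * 3               # 3x3 depthwise conv
    project = hidden * c_out                 # 1x1 linear projection
    return expand + depthwise + project

def stage_params(c_in, t, c_out, n):
    """A stage repeats the block n times; only the first block changes width."""
    total = bottleneck_params(c_in, t, c_out)
    total += (n - 1) * bottleneck_params(c_out, t, c_out)
    return total

# (t, c, n, s) rows from the table; the first bottleneck stage takes the
# 32-channel output of the initial Conv2d as input.
stages = [(1, 16, 1, 1), (6, 24, 2, 2), (6, 32, 3, 2),
          (6, 64, 4, 2), (6, 96, 3, 1), (6, 160, 3, 2)]

c_in, total = 32, 0
for t, c, n, s in stages:
    total += stage_params(c_in, t, c, n)
    c_in = c
print(f"bottleneck weights: {total:,}")
```

The stride column only affects feature-map resolution, not the weight count, which is why it does not appear in the arithmetic.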
Method | Car | Pedestrian | Cyclist | mAP | GFLOPs
---|---|---|---|---|---
MobileNetv2 + SSD [1] | 89.3 | 65.7 | 73.0 | 76.0 | 12.5
MobileNetv2 + MobileNeck (ours) | 90.2 | 68.1 | 78.2 | 78.8 | 4.6
Conf-MobileDet (ours) | 90.4 | 69.5 | 79.4 | 79.8 | 4.6

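
The gap between the last two rows comes from the confidence-aware components named in Sections 3.3 and 3.4. For orientation, here is a minimal sketch of the standard greedy NMS baseline that a confidence-based variant would modify; the paper's exact confidence weighting is not reproduced here:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]: the two non-overlapping detections survive
```

In plain NMS the classification score alone decides which box wins; a confidence-aware variant would instead fold the predicted localization confidence into that ranking or into the suppressed boxes' coordinates.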

Method | Car | Pedestrian | Cyclist | mAP | GFLOPs
---|---|---|---|---|---
SSD [12] | 85.1 | 48.1 | 50.7 | 61.3 | 88.2
YOLOv3 [15] | 79.5 | 83.1 | 79.0 | 80.5 | 34.5
Gaussian YOLOv3 [21] | 87.3 | 79.9 | 83.6 | 83.6 | 34.5
SqueezeDet [11] | 82.9 | 70.4 | 76.8 | 76.7 | 9.7
SqueezeDet+ [11] | 85.5 | 73.7 | 82.0 | 80.4 | 77.2
Conf-MobileDet (ours) | 90.4 | 69.5 | 79.4 | 79.8 | 4.6


Method | mAP | Inference Time (ms)
---|---|---
MobileNetv2 + SSD [1] | 68.4 | 79.4
MobileNetv2 + MobileNeck (ours) | 70.1 | 58.8
Conf-MobileDet (ours) | 71.3 | 58.8
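
The mAP in the Pascal VOC table above is, by the VOC2007 development kit's convention, the per-class 11-point interpolated average precision, averaged over classes. A minimal sketch of that interpolation (the KITTI numbers earlier use KITTI's own protocol with class-specific IoU thresholds, not this one):

```python
def voc_ap_11pt(recalls, precisions):
    """VOC2007-style 11-point interpolated average precision.

    recalls/precisions: parallel lists from sweeping the score threshold.
    At each recall level r_t in {0.0, 0.1, ..., 1.0}, take the maximum
    precision achieved at any recall >= r_t, then average the 11 values.
    """
    ap = 0.0
    for r_t in [i / 10.0 for i in range(11)]:
        candidates = [p for r, p in zip(recalls, precisions) if r >= r_t]
        ap += max(candidates, default=0.0) / 11.0
    return ap

# A detector that reaches full recall at perfect precision scores AP = 1.0.
print(voc_ap_11pt([0.5, 1.0], [1.0, 1.0]))  # → 1.0
```

Interpolating to the maximum precision at-or-beyond each recall level makes the metric insensitive to local wiggles in the precision-recall curve, which is why two detectors with identical mAP (such as rows in the tables above) can still trade off precision and recall differently per class.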
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, W.; Liu, K. Confidence-Aware Object Detection Based on MobileNetv2 for Autonomous Driving. Sensors 2021, 21, 2380. https://doi.org/10.3390/s21072380