FANet: Improving 3D Object Detection with Position Adaptation
Abstract
1. Introduction
2. Related Work
3. Methods
Position Adaptive Convolution Embedded Network
4. Experimental Results and Analysis
4.1. Dataset
4.2. Evaluation Metrics
4.3. Data and Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Xia, X.; Meng, Z.; Han, X.; Li, H.; Tsukiji, T.; Xu, R.; Zheng, Z.; Ma, J.J. An automated driving systems data acquisition and analytics platform. Transp. Res. Part C Emerg. Technol. 2023, 151, 104120. [Google Scholar] [CrossRef]
- Meng, Z.; Xia, X.; Xu, R.; Liu, W.; Ma, J.J. HYDRO-3D: Hybrid Object Detection and Tracking for Cooperative Perception Using 3D LiDAR. IEEE Trans. Intell. Veh. 2023, 1–13. [Google Scholar] [CrossRef]
- Shi, S.; Jiang, L.; Deng, J.; Wang, Z.; Guo, C.; Shi, J.; Wang, X.; Li, H.J. PV-RCNN++: Point-voxel feature set abstraction with local vector representation for 3D object detection. Int. J. Comput. Vis. 2023, 131, 531–551. [Google Scholar] [CrossRef]
- Li, B. 3D Fully Convolutional Network for Vehicle Detection in Point Cloud. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 1513–1518. [Google Scholar]
- Qi, C.R.; Su, H.; Mo, K.C.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 77–85. [Google Scholar]
- Engelcke, M.; Rao, D.; Wang, D.Z.; Tong, C.H.; Posner, I. Vote3deep: Fast object detection in 3d point clouds using efficient convolutional neural networks. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 1355–1361. [Google Scholar]
- Ye, Y.Y.; Chen, H.J.; Zhang, C.; Hao, X.L.; Zhang, Z.X. SARPNET: Shape attention regional proposal network for LiDAR-based 3D object detection. Neurocomputing 2020, 379, 53–63. [Google Scholar] [CrossRef]
- Deng, J.; Shi, S.; Li, P.; Zhou, W.; Zhang, Y.; Li, H. Voxel r-cnn: Towards high performance voxel-based 3d object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2–9 February 2021; pp. 1201–1209. [Google Scholar]
- Yang, Z.T.; Sun, Y.A.; Liu, S.; Shen, X.Y.; Jia, J.Y. STD: Sparse-to-Dense 3D Object Detector for Point Cloud. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1951–1960. [Google Scholar]
- Mahmoud, A.; Hu, J.S.; Waslander, S.L. Dense voxel fusion for 3D object detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 4–6 January 2023; pp. 663–672. [Google Scholar]
- Li, Y.; Chen, Y.; Qi, X.; Li, Z.; Sun, J.; Jia, J.J. Unifying voxel-based representation with transformer for 3d object detection. arXiv 2022, arXiv:2206.00630. [Google Scholar] [CrossRef] [Green Version]
- Zhou, Y.; Tuzel, O. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 4490–4499. [Google Scholar]
- Yan, Y.; Mao, Y.X.; Li, B. SECOND: Sparsely Embedded Convolutional Detection. Sensors 2018, 18, 3337. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- He, C.; Zeng, H.; Huang, J.; Hua, X.-S.; Zhang, L. Structure Aware Single-Stage 3D Object Detection from Point Cloud. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11870–11879. [Google Scholar] [CrossRef]
- Yin, T.; Zhou, X.; Krahenbuhl, P. Center-based 3d object detection and tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 11784–11793. [Google Scholar]
- Wang, Y.; Solomon, J.M. Object dgcnn: 3d object detection using dynamic graphs. Adv. Neural Inf. Process. Syst. 2021, 34, 20745–20758. [Google Scholar]
- Mao, J.; Xue, Y.; Niu, M.; Bai, H.; Feng, J.; Liang, X.; Xu, H.; Xu, C. Voxel Transformer for 3D Object Detection. In Proceedings of the 18th IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 3144–3153. [Google Scholar]
- Chen, C.; Chen, Z.; Zhang, J.; Tao, D. SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection. Proc. Conf. AAAI Artif. Intell. 2022, 36, 221–229. [Google Scholar] [CrossRef]
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Yang, Z.; Sun, Y.; Liu, S.; Jia, J. 3dssd: Point-based 3d single stage object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11040–11048. [Google Scholar]
- Zheng, W.; Tang, W.; Jiang, L.; Fu, C.-W. SE-SSD: Self-Ensembling Single-Stage Object Detector from Point Cloud. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 14489–14498. [Google Scholar]
- Pan, X.; Xia, Z.; Song, S.; Li, L.E.; Huang, G. 3d object detection with pointformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 7463–7472. [Google Scholar]
- Wang, Q.; Chen, J.; Deng, J.; Zhang, X.J. 3D-CenterNet: 3D object detection network for point clouds with center estimation priority. Pattern Recognit. 2021, 115, 107884. [Google Scholar] [CrossRef]
- Qi, C.R.; Liu, W.; Wu, C.; Su, H.; Guibas, L.J. Frustum pointnets for 3d object detection from rgb-d data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 918–927. [Google Scholar]
- Sheng, H.L.; Cai, S.J.; Liu, Y.; Deng, B.; Huang, J.Q.; Hua, X.S.; Zhao, M.J. Improving 3D Object Detection with Channel-wise Transformer. In Proceedings of the 18th IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 2723–2732. [Google Scholar]
- Shi, S.; Guo, C.; Jiang, L.; Wang, Z.; Shi, J.; Wang, X.; Li, H. Pv-rcnn: Point-voxel feature set abstraction for 3d object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10529–10538. [Google Scholar]
- Shi, S.; Wang, Z.; Shi, J.; Wang, X.; Li, H. From Points to Parts: 3D Object Detection from Point Cloud with Part-Aware and Part-Aggregation Network. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 2647–2664. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Bhattacharyya, P.; Czarnecki, K.J. Deformable PV-RCNN: Improving 3D object detection with learned deformations. Int. J. Comput. Vis. 2022, 131, 531–551. [Google Scholar] [CrossRef]
- Xu, M.T.; Ding, R.Y.; Zhao, H.S.; Qi, X.J. PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 3172–3181. [Google Scholar]
- Thomas, H.; Qi, C.R.; Deschaud, J.-E.; Marcotegui, B.; Goulette, F.; Guibas, L.J. KPConv: Flexible and Deformable Convolution for Point Clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6420–6429. [Google Scholar]
- Chen, Y.; Dai, X.; Liu, M.; Chen, D.; Yuan, L.; Liu, Z. Dynamic convolution: Attention over convolution kernels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11030–11039. [Google Scholar]
- Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R.J. Vision meets robotics: The kitti dataset. Int. J. Robot. Res. 2013, 32, 1231–1237. [Google Scholar] [CrossRef] [Green Version]
- Sun, P.; Kretzschmar, H.; Dotiwalla, X.; Chouard, A.; Patnaik, V.; Tsui, P.; Guo, J.; Zhou, Y.; Chai, Y.; Caine, B. Scalability in perception for autonomous driving: Waymo open dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2446–2454. [Google Scholar]
| Symbol | Significance |
|---|---|
| W | weight bank |
| WT | weight matrix |
| Cp | number of input channels |
| Cq | number of output channels |
| T | number of weight matrices |
| Di | center point |
| Dj | neighboring point |
| ATi,j | location adaptive coefficient |
| ReLU | activation function |
| Softmax | normalization function |
| K | dynamic kernel |
| Fp | input feature |
| τcorr | weight regularization term |
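The symbols above summarize the position adaptive convolution embedded in FANet: the relative position of each neighboring point Dj with respect to the center point Di is mapped, through ReLU and Softmax, to location adaptive coefficients ATi,j, and the dynamic kernel K is assembled as a coefficient-weighted combination of the T weight matrices in the weight bank W before being applied to the input feature Fp. The sketch below illustrates this assembly under stated assumptions: the class name, the ScoreNet layout and layer sizes, the choice of T = 8, and the max-pooling over neighbors are illustrative choices rather than the authors' exact configuration, and the τcorr regularization on the weight bank is omitted.

```python
# Minimal PAConv-style position adaptive convolution sketch (PyTorch).
# Shapes, layer sizes, and pooling are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionAdaptiveConv(nn.Module):
    def __init__(self, c_in, c_out, num_matrices=8):
        super().__init__()
        # Weight bank W: T weight matrices, each mapping Cp -> Cq.
        self.weight_bank = nn.Parameter(torch.randn(num_matrices, c_in, c_out) * 0.02)
        # ScoreNet: predicts location adaptive coefficients from the relative
        # position of a neighboring point Dj w.r.t. the center point Di.
        self.score_net = nn.Sequential(
            nn.Linear(3, 16),
            nn.ReLU(inplace=True),
            nn.Linear(16, num_matrices),
        )

    def forward(self, feats, centers, neighbors):
        # feats:     (N, K, Cp) input features Fp of K neighbors per center
        # centers:   (N, 3)     center points Di
        # neighbors: (N, K, 3)  neighboring points Dj
        rel_pos = neighbors - centers.unsqueeze(1)             # (N, K, 3)
        scores = F.softmax(self.score_net(rel_pos), dim=-1)    # (N, K, T) coefficients ATi,j
        # Assemble the dynamic kernel as a coefficient-weighted sum of the bank.
        kernels = torch.einsum('nkt,tio->nkio', scores, self.weight_bank)  # (N, K, Cp, Cq)
        out = torch.einsum('nki,nkio->nko', feats, kernels)    # (N, K, Cq)
        return out.max(dim=1).values                            # aggregate over neighbors


# Example: 128 centers with 16 neighbors each, 64 -> 128 channels.
# pac = PositionAdaptiveConv(64, 128)
# y = pac(torch.randn(128, 16, 64), torch.randn(128, 3), torch.randn(128, 16, 3))  # (128, 128)
```

In PAConv, a correlation penalty on the weight bank encourages the T matrices to remain diverse; that term corresponds to the τcorr entry in the table and is left out of the sketch for brevity.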
| Method | Car-3D Easy (%) | Car-3D Mod (%) | Car-3D Hard (%) | Car-BEV Easy (%) | Car-BEV Mod (%) | Car-BEV Hard (%) |
|---|---|---|---|---|---|---|
| PV-RCNN | 89.66 | 81.82 | 78.06 | 93.00 | 88.58 | 88.25 |
| FANet | 92.41 | 82.82 | 80.30 | 94.06 | 90.63 | 91.30 |
| Method | Pedestrian-3D Easy (%) | Pedestrian-3D Mod (%) | Pedestrian-3D Hard (%) | Pedestrian-BEV Easy (%) | Pedestrian-BEV Mod (%) | Pedestrian-BEV Hard (%) |
|---|---|---|---|---|---|---|
| PV-RCNN | 63.89 | 56.35 | 51.31 | 66.87 | 59.79 | 55.60 |
| FANet | 65.53 | 58.11 | 52.06 | 66.21 | 60.21 | 55.37 |
| Method | Cyclist-3D Easy (%) | Cyclist-3D Mod (%) | Cyclist-3D Hard (%) | Cyclist-BEV Easy (%) | Cyclist-BEV Mod (%) | Cyclist-BEV Hard (%) |
|---|---|---|---|---|---|---|
| PV-RCNN | 87.03 | 68.70 | 64.38 | 93.32 | 75.07 | 70.49 |
| FANet | 90.10 | 71.27 | 66.42 | 89.53 | 77.34 | 71.10 |
| Method | 3D-mAP Easy (%) | 3D-mAP Mod (%) | 3D-mAP Hard (%) |
|---|---|---|---|
| STD | 77.89 | 67.71 | 62.85 |
| Part-A2 | 77.75 | 66.49 | 61.27 |
| 3DSSD | 78.30 | 67.57 | 62.31 |
| CT3D | 77.77 | 69.77 | 64.92 |
| VoTR | 79.84 | 70.09 | 66.90 |
| PV-RCNN | 80.19 | 68.96 | 64.58 |
| PV-RCNN++ | 80.30 | 69.41 | 64.91 |
| FANet | 82.68 | 70.73 | 66.26 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).