HALNet: Partial Point Cloud Registration Based on Hybrid Attention and Deep Local Features
Abstract
1. Introduction
- An end-to-end partial point cloud registration network, HALNet, is proposed. HALNet extracts deep local features with AGConv and CBAM, and uses the similarity scores of source–target point pairs to remove non-overlapping points from the two point clouds.
- A hybrid attention mechanism composed of self-attention and cross-attention is proposed to refine the extracted features, improving the accuracy of feature grouping for predicting the rigid transformation.
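The hybrid attention idea in the second contribution can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: single-head scaled dot-product attention, the residual connections, and the feature dimension (64) are all assumptions made for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d), axis=-1) @ v

def hybrid_attention(feat_src, feat_tgt):
    # Self-attention refines each cloud's features with its own context;
    # cross-attention then injects context from the other cloud.
    src = feat_src + attention(feat_src, feat_src, feat_src)
    tgt = feat_tgt + attention(feat_tgt, feat_tgt, feat_tgt)
    src_out = src + attention(src, tgt, tgt)
    tgt_out = tgt + attention(tgt, src, src)
    return src_out, tgt_out

rng = np.random.default_rng(0)
fs = rng.normal(size=(100, 64))  # source features (N x d)
ft = rng.normal(size=(80, 64))   # target features (M x d)
out_s, out_t = hybrid_attention(fs, ft)
print(out_s.shape, out_t.shape)  # (100, 64) (80, 64)
```

The per-cloud shapes are preserved, so the module can be dropped between the feature extractor and the matching stage.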
2. Related Work
2.1. Feature-Learning-Based Methods
2.2. End-to-End Methods
2.3. Attention Mechanism
3. Methodology
3.1. Preliminary
3.2. Feature Extraction
3.3. Overlapping Region Estimation
3.4. Hybrid Attention
3.4.1. Self-Attention
3.4.2. Cross-Attention
3.5. Rigid Transformation Calculation
3.6. Loss Function
4. Experiments
4.1. Experimental Setup
4.2. Dataset
4.3. Evaluation Indicators
4.4. Performance Evaluation
4.4.1. Unseen Shapes
4.4.2. Noise
4.4.3. Unseen Categories
4.5. Ablation Experiment
4.5.1. Hybrid Attention
4.5.2. Fully Connected Layer
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Gharineiat, Z.; Kurdi, F.T.; Campbell, G. Review of Automatic Processing of Topography and Surface Feature Identification LiDAR Data Using Machine Learning Techniques. Remote Sens. 2022, 14, 4685. [Google Scholar] [CrossRef]
- Mirzaei, K.; Arashpour, M.; Asadi, E.; Masoumi, H.; Bai, Y.; Behnood, A. 3D point cloud data processing with machine learning for construction and infrastructure applications: A comprehensive review. Adv. Eng. Inform. 2022, 51, 101501. [Google Scholar] [CrossRef]
- Izadi, S.; Kim, D.; Hilliges, O.; Molyneaux, D.; Newcombe, R.; Kohli, P.; Shotton, J.; Hodges, S.; Freeman, D.; Davison, A.; et al. KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, Santa Barbara, CA, USA, 16–19 October 2011; pp. 559–568. [Google Scholar]
- Huang, X.; Mei, G.; Zhang, J.; Abbas, R. A comprehensive survey on point cloud registration. arXiv 2021, arXiv:2103.02690. [Google Scholar]
- Elhousni, M.; Huang, X. Review on 3D Lidar Localization for Autonomous Driving Cars. arXiv 2020, arXiv:2006.00648. [Google Scholar]
- Nagy, B.; Benedek, C. Real-Time Point Cloud Alignment for Vehicle Localization in a High Resolution 3D Map. In Proceedings of the Computer Vision—ECCV 2018 Workshops, Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2019; pp. 226–239. [Google Scholar]
- Fu, Y.; Brown, N.M.; Saeed, S.U.; Casamitjana, A.; Baum, Z.M.C.; Delaunay, R.; Yang, Q.; Grimwood, A.; Min, Z.; Blumberg, S.B.; et al. DeepReg: A deep learning toolkit for medical image registration. J. Open Source Softw. 2020, 5, 2705. [Google Scholar] [CrossRef]
- Wang, Y.; Solomon, J.M. Deep Closest Point: Learning Representations for Point Cloud Registration. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3522–3531. [Google Scholar] [CrossRef]
- Li, J.; Zhang, C.; Xu, Z.; Zhou, H.; Zhang, C. Iterative Distance-Aware Similarity Matrix Convolution with Mutual-Supervised Point Elimination for Efficient Point Cloud Registration. In Proceedings of the Computer Vision—ECCV 2020, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 378–394. [Google Scholar]
- Yuan, W.; Eckart, B.; Kim, K.; Jampani, V.; Fox, D.; Kautz, J. DeepGMR: Learning Latent Gaussian Mixture Models for Registration. In Proceedings of the Computer Vision—ECCV 2020, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 733–750. [Google Scholar]
- Zhang, Z.Y.; Sun, J.D.; Dai, Y.C.; Fan, B.; He, M.Y. VRNet: Learning the Rectified Virtual Corresponding Points for 3D Point Cloud Registration. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 4997–5010. [Google Scholar] [CrossRef]
- Zhou, H.; Feng, Y.; Fang, M.; Wei, M.; Qin, J.; Lu, T. Adaptive Graph Convolution for Point Cloud Analysis. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 4945–4954. [Google Scholar]
- Woo, S.H.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018; pp. 3–19. [Google Scholar] [CrossRef]
- Zeng, A.; Song, S.; Nießner, M.; Fisher, M.; Xiao, J.; Funkhouser, T.A. 3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 199–208. [Google Scholar]
- Gojcic, Z.; Zhou, C.; Wegner, J.D.; Wieser, A. The Perfect Match: 3D Point Cloud Matching With Smoothed Densities. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 5540–5549. [Google Scholar]
- Deng, H.; Birdal, T.; Ilic, S. PPFNet: Global Context Aware Local Features for Robust 3D Point Matching. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 195–205. [Google Scholar]
- Yew, Z.J.; Lee, G.H. 3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018; pp. 630–646. [Google Scholar]
- Yew, Z.J.; Lee, G.H. RPM-Net: Robust Point Matching Using Learned Features. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11821–11830. [Google Scholar]
- Deng, H.W.; Birdal, T.; Ilic, S. 3D Local Features for Direct Pairwise Registration. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3239–3248. [Google Scholar] [CrossRef]
- Lu, W.X.; Wan, G.W.; Zhou, Y.; Fu, X.Y.; Yuan, P.F.; Song, S.Y. DeepVCP: An End-to-End Deep Neural Network for Point Cloud Registration. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 12–21. [Google Scholar] [CrossRef]
- Aoki, Y.; Goforth, H.; Srivatsan, R.A.; Lucey, S. PointNetLK: Robust & Efficient Point Cloud Registration using PointNet. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7156–7165. [Google Scholar] [CrossRef]
- Charles, R.Q.; Su, H.; Kaichun, M.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 77–85. [Google Scholar]
- Elbaz, G.; Avraham, T.; Fischer, A. 3D Point Cloud Registration for Localization using a Deep Neural Network Auto-Encoder. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, HI, USA, 21–26 July 2017; pp. 2472–2481. [Google Scholar] [CrossRef]
- Huang, X.; Mei, G.; Zhang, J. Feature-Metric Registration: A Fast Semi-Supervised Approach for Robust Point Cloud Registration Without Correspondences. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11363–11371. [Google Scholar]
- Yang, J.; Zhang, Q.; Ni, B.; Li, L.; Liu, J.; Zhou, M.; Tian, Q. Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3318–3327. [Google Scholar]
- Zhang, W.; Xiao, C. PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 12428–12437. [Google Scholar]
- Chen, C.; Fragonara, L.Z.; Tsourdos, A. GAPointNet: Graph attention based point neural network for exploiting local feature of point cloud. Neurocomputing 2021, 438, 122–132. [Google Scholar] [CrossRef]
- Zhao, H.; Jiang, L.; Jia, J.; Torr, P.H.S.; Koltun, V. Point Transformer. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 16239–16248. [Google Scholar]
- Lu, H.; Chen, X.; Zhang, G.; Zhou, Q.; Ma, Y.; Zhao, Y. SCANet: Spatial-channel Attention Network for 3D Object Detection. In Proceedings of the ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 1992–1996. [Google Scholar]
- Guo, M.-H.; Cai, J.-X.; Liu, Z.-N.; Mu, T.-J.; Martin, R.R.; Hu, S.-M. PCT: Point cloud transformer. Comput. Vis. Media 2021, 7, 187–199. [Google Scholar] [CrossRef]
- Huang, S.Y.; Gojcic, Z.; Usvyatsov, M.; Wieser, A.; Schindler, K. PREDATOR: Registration of 3D Point Clouds with Low Overlap. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021, Nashville, TN, USA, 20–25 June 2021; pp. 4265–4274. [Google Scholar] [CrossRef]
- Wang, G.H.; Zhai, Q.Y.; Liu, H. Cross self-attention network for 3D point cloud. Knowl.-Based Syst. 2022, 247, 108769. [Google Scholar] [CrossRef]
- Shi, J.T.; Ye, H.L.; Yang, B.; Cao, F.L. An iteration-based interactive attention network for 3D point cloud registration. Neurocomputing 2023, 560, 126822. [Google Scholar] [CrossRef]
- Bahdanau, D.; Chorowski, J.; Serdyuk, D.; Brakel, P.; Bengio, Y. End-to-end attention-based large vocabulary speech recognition. In Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 20–25 March 2016; pp. 4945–4949. [Google Scholar]
- Xu, K.; Ba, J.L.; Kiros, R.; Cho, K.; Courville, A.; Salakhutdinov, R.; Zemel, R.S.; Bengio, Y. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the 32nd International Conference on International Conference on Machine Learning, Lille, France, 6–11 July 2015; Volume 37, pp. 2048–2057. [Google Scholar]
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; Xiao, J. 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1912–1920. [Google Scholar]
- Zhou, R.; Li, X.; Jiang, W. SCANet: A Spatial and Channel Attention based Network for Partial-to-Partial Point Cloud Registration. Pattern Recognit. Lett. 2021, 151, 120–126. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010. [Google Scholar]
Model | RMSE(R) | MAE(R) | RMSE(t) | MAE(t) |
---|---|---|---|---|
DeepGMR | 13.382577 | 8.870650 | 0.041859 | 0.029198 |
DCP-V2 | 5.478204 | 3.538027 | 0.026153 | 0.019312 |
IDAM | 7.621095 | 3.713719 | 0.047063 | 0.024128 |
SCANet | 3.465979 | 1.982889 | 0.014385 | 0.010212 |
VRNet | 5.093359 | 3.364641 | 0.030701 | 0.022927 |
HALNet | 3.096093 | 1.743890 | 0.022106 | 0.017327 |
Model | RMSE(R) | MAE(R) | RMSE(t) | MAE(t) |
---|---|---|---|---|
DeepGMR | 13.352157 | 8.878562 | 0.041774 | 0.029208 |
DCP-V2 | 5.897242 | 3.899607 | 0.026251 | 0.019345 |
IDAM | 7.472706 | 3.453048 | 0.043778 | 0.022701 |
SCANet | 3.484932 | 2.003082 | 0.014277 | 0.010201 |
VRNet | 5.478986 | 3.679850 | 0.030500 | 0.022766 |
HALNet | 3.357249 | 1.948029 | 0.022781 | 0.017951 |
Model | RMSE(R) | MAE(R) | RMSE(t) | MAE(t) |
---|---|---|---|---|
DeepGMR | 14.122125 | 9.947880 | 0.043104 | 0.031322 |
DCP-V2 | 6.295135 | 4.116909 | 0.029018 | 0.021801 |
IDAM | 8.044392 | 3.891389 | 0.048142 | 0.025395 |
SCANet | 4.545772 | 2.933826 | 0.020703 | 0.014922 |
VRNet | 5.947994 | 4.035112 | 0.033973 | 0.025189 |
HALNet | 4.091592 | 2.616030 | 0.026665 | 0.020893 |
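The four indicators in the tables above are root-mean-square and mean-absolute errors of rotation and translation. A minimal sketch of their computation, assuming the common anisotropic convention (errors over Euler angles in degrees and raw translation vectors, averaged over all test pairs and axes); the function name and array shapes are illustrative:

```python
import numpy as np

def registration_metrics(r_pred_deg, r_gt_deg, t_pred, t_gt):
    # r_*: (P, 3) Euler angles in degrees; t_*: (P, 3) translation vectors.
    r_err = np.asarray(r_pred_deg) - np.asarray(r_gt_deg)
    t_err = np.asarray(t_pred) - np.asarray(t_gt)
    return {
        "RMSE(R)": float(np.sqrt(np.mean(r_err ** 2))),
        "MAE(R)":  float(np.mean(np.abs(r_err))),
        "RMSE(t)": float(np.sqrt(np.mean(t_err ** 2))),
        "MAE(t)":  float(np.mean(np.abs(t_err))),
    }

# Toy check: identical predictions give zero error on every indicator.
r = np.array([[10.0, -5.0, 3.0]])
t = np.array([[0.1, 0.0, -0.2]])
print(registration_metrics(r, r, t, t))  # all four values are 0.0
```

By construction RMSE is never smaller than MAE on the same errors, which matches the ordering seen in every row of the tables.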
Condition | Method | RMSE(R) | MAE(R) | RMSE(t) | MAE(t) |
---|---|---|---|---|---|
Unseen Shapes | M1 | 3.934837 | 2.495351 | 0.025345 | 0.019687 |
Unseen Shapes | M2 | 3.469968 | 2.138049 | 0.023129 | 0.018045 |
Unseen Shapes | M0 | 3.096093 | 1.743890 | 0.022106 | 0.017327 |
Noise | M1 | 4.132365 | 2.701966 | 0.025643 | 0.019908 |
Noise | M2 | 3.551702 | 2.234420 | 0.023199 | 0.018124 |
Noise | M0 | 3.357249 | 1.948029 | 0.022781 | 0.017951 |
Unseen Categories | M1 | 5.165231 | 3.534752 | 0.030933 | 0.024414 |
Unseen Categories | M2 | 4.227642 | 2.852501 | 0.028591 | 0.022475 |
Unseen Categories | M0 | 4.091592 | 2.616030 | 0.026665 | 0.020893 |
Condition | Method | RMSE(R) | MAE(R) | RMSE(t) | MAE(t) |
---|---|---|---|---|---|
Unseen Shapes | SVD | 4.905889 | 2.797221 | 0.0222372 | 0.016029 |
Unseen Shapes | FCL | 3.096093 | 1.743890 | 0.022106 | 0.017327 |
Noise | SVD | 5.034445 | 2.943128 | 0.022696 | 0.016189 |
Noise | FCL | 3.357249 | 1.948029 | 0.022781 | 0.017951 |
Unseen Categories | SVD | 6.336225 | 4.035086 | 0.029551 | 0.022089 |
Unseen Categories | FCL | 4.091592 | 2.616030 | 0.026665 | 0.020893 |
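The ablation above compares a closed-form SVD solution against a fully connected layer (FCL) for computing the rigid transformation. The SVD route is typically a weighted Kabsch/Procrustes solve; the sketch below shows that baseline under illustrative names and weighting, not the paper's exact formulation:

```python
import numpy as np

def kabsch(src, tgt, weights=None):
    # Closed-form least-squares rigid transform: find R, t minimizing
    # sum_i w_i * || R @ src_i + t - tgt_i ||^2  (Kabsch / Procrustes).
    w = np.ones(len(src)) if weights is None else np.asarray(weights, float)
    w = w / w.sum()
    mu_s = (w[:, None] * src).sum(axis=0)       # weighted centroids
    mu_t = (w[:, None] * tgt).sum(axis=0)
    H = (src - mu_s).T @ np.diag(w) @ (tgt - mu_t)  # 3x3 covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_t - R @ mu_s
    return R, t

# Sanity check: recover a known 30-degree rotation about z plus a translation.
rng = np.random.default_rng(1)
src = rng.normal(size=(50, 3))
th = np.deg2rad(30.0)
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 0.1])
tgt = src @ R_true.T + t_true
R_est, t_est = kabsch(src, tgt)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))  # True True
```

The weights slot is where per-point overlap/similarity scores would plug in, so non-overlapping points contribute little to the solved transform.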
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, D.; Hao, H.; Zhang, J. HALNet: Partial Point Cloud Registration Based on Hybrid Attention and Deep Local Features. Sensors 2024, 24, 2768. https://doi.org/10.3390/s24092768