Point Cloud Deep Learning Network Based on Balanced Sampling and Hybrid Pooling
Abstract
1. Introduction
- Building on PointNet++, the network replaces the original data sampling module with a weight-based balanced sampling module, which balances the network's ability to extract features across the different kinds of sample data.
- We designed a self-conv (SC) module and integrated it into the downsampling layer of the original network to further improve the network's ability to learn local fine-grained features. The module has three parts:
  - (a) Feature learning between sampling center points and neighbor points. To address shortcomings of current point cloud semantic segmentation networks, the module describes the spatial relationship in the downsampling stage with the Euclidean distance and direction vector, and fuses this positional relationship between points with the intensity values and feature values, which enriches the spatial features of the sampling center point.
  - (b) Learning the structural relationships among neighborhood points. Most existing networks ignore the feature information of the neighborhood space. To address this, we compute a vector between each neighborhood point and the minimum neighborhood point, which further enriches the features obtained in the downsampling stage.
  - (c) Local feature enhancement with an attention model. The features of each sampling center point differ and are incomplete. To address this, an attention model "scores" the different features and combines them according to their importance, so that the salient spatial structure is fully expressed.
- In the hybrid pooling method, segmentation predictions and loss values are computed for the feature models output by the spatial pyramid pooling module and by the max pooling module. The two loss values are compared, and the weight matrix corresponding to the smaller loss is passed to the next training epoch. Our experiments show that this network design improves semantic segmentation accuracy and achieves higher overall accuracy than the other direct point cloud semantic segmentation algorithms compared.
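The selection rule in the last contribution can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: `pyramid_pool` is a hypothetical stand-in that max-pools contiguous bins of points at several scales, the loss values are made up, and `carry_forward` simply returns the weights of whichever branch has the smaller loss.

```python
from typing import Dict, List, Sequence


def max_pool(features: Sequence[Sequence[float]]) -> List[float]:
    """Channel-wise maximum over all points in a region."""
    return [max(channel) for channel in zip(*features)]


def pyramid_pool(features: Sequence[Sequence[float]],
                 levels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Assumed pyramid pooling: max-pool contiguous bins at each scale, then concatenate."""
    n = len(features)
    pooled: List[float] = []
    for level in levels:
        for b in range(level):
            start, stop = b * n // level, (b + 1) * n // level
            # If a bin is empty (fewer points than bins), fall back to all points.
            bin_points = features[start:stop] or features
            pooled.extend(max_pool(bin_points))
    return pooled


def select_branch(losses: Dict[str, float]) -> str:
    """Name of the pooling branch with the smallest loss."""
    return min(losses, key=losses.__getitem__)


def carry_forward(weights: Dict[str, List[float]],
                  losses: Dict[str, float]) -> List[float]:
    """Weights of the lower-loss branch, to be used in the next epoch."""
    return weights[select_branch(losses)]


points = [[1.0, 5.0], [3.0, 2.0], [0.0, 4.0], [2.0, 6.0]]
global_feature = max_pool(points)
pyramid_feature = pyramid_pool(points, levels=(1, 2))
losses = {"pyramid": 0.42, "max": 0.57}  # made-up loss values for illustration
chosen = select_branch(losses)
```

In a real training loop the two losses would come from running the segmentation head on each pooled feature; here they are passed in directly so that the comparison step stands on its own.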
2. Related Studies
2.1. Multiview-Based Point Cloud Segmentation Methods
2.2. Voxel-Based Point Cloud Segmentation Methods
2.3. Direct Point Cloud Processing Methods Based on PointNet
2.4. Direct Point Cloud Processing Method Based on Graph Convolution
3. Materials and Methods
3.1. S3DIS and Vaihingen Dataset
3.2. Weight-Based Balanced Sampling Module
3.3. Network Structure Design
3.3.1. SC Feature Learning Module
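The three parts of the SC module listed in the introduction (distance and direction features, neighbor-to-neighbor vectors, attention-weighted fusion) can be illustrated with a small standard-library sketch. Several points are assumptions made for the example: the "minimum neighborhood point" is taken to be the neighbor closest to the center, the attention scores are supplied by the caller rather than learned, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def relative_features(center: Point, neighbors: Sequence[Point]) -> List[Tuple[float, List[float]]]:
    """(a) Euclidean distance and unit direction vector from the center to each neighbor."""
    feats = []
    for p in neighbors:
        offset = [pi - ci for pi, ci in zip(p, center)]
        dist = math.sqrt(sum(o * o for o in offset))
        direction = [o / dist for o in offset] if dist > 0 else [0.0] * len(offset)
        feats.append((dist, direction))
    return feats


def neighbor_structure_vectors(center: Point, neighbors: Sequence[Point]) -> List[List[float]]:
    """(b) Vector from the nearest neighbor (assumed 'minimum' point) to every neighbor."""
    nearest = min(neighbors, key=lambda p: math.dist(p, center))
    return [[pi - ni for pi, ni in zip(p, nearest)] for p in neighbors]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: Sequence[Sequence[float]], scores: Sequence[float]) -> List[float]:
    """(c) Weight each neighbor feature by its softmax score and sum per channel."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


center = (0.0, 0.0, 0.0)
neighbors = [(3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
rel = relative_features(center, neighbors)
struct = neighbor_structure_vectors(center, neighbors)
fused = attention_pool([[1.0, 3.0], [3.0, 5.0]], scores=[0.0, 0.0])
```

With equal scores the attention step reduces to a plain average; in a trained network the scores would come from a learned layer, so that more informative neighbors contribute more to the fused feature.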
3.3.2. Hybrid Pooling
4. Experiment and Analysis
4.1. Experimental Environment and Evaluation Index
4.2. S3DIS Dataset Experiment
4.2.1. Crossover Trial
4.2.2. Six-Fold Crossover Experiment
4.3. Network Performance Comparison Based on Different Sampling Parameters
4.4. Vaihingen Dataset Experiment
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; IEEE: Piscataway Township, NJ, USA, 2017.
2. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Piscataway Township, NJ, USA, 2016.
3. Su, H.; Maji, S.; Kalogerakis, E.; Learned-Miller, E. Multi-view convolutional neural networks for 3d shape recognition. In Proceedings of the IEEE international conference on computer vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 945–953.
4. Maturana, D.; Scherer, S. VoxNet: A 3D Convolutional Neural Network for real-time object recognition. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 922–928.
5. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. arXiv 2017.
6. Huang, H.; Kalogerakis, E.; Chaudhuri, S.; Ceylan, D.; Kim, V.G.; Yumer, E. Learning Local Shape Descriptors from Part Correspondences with Multiview Convolutional Networks. ACM Trans. Graph. 2017, 37, 6.
7. Feng, Y.; Zhang, Z.; Zhao, X.; Ji, R.; Gao, Y. GVCNN: Group-view convolutional neural networks for 3D shape recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; IEEE Computer Society Press: Los Alamitos, CA, USA, 2018; pp. 264–272.
8. Boulch, A.; Guerry, J.; Le Saux, B.; Audebert, N. SnapNet: 3D point cloud semantic labeling with 2D deep segmentation networks. Comput. Graph. 2018, 71, 189–198.
9. Ye, X.; Li, J.; Huang, H.; Du, L.; Zhang, X. 3D Recurrent Neural Networks with Context Fusion for Point Cloud Semantic Segmentation. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; Springer: Heidelberg, Germany, 2018; pp. 415–430.
10. Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. PointCNN: Convolution on Χ-Transformed Points. arXiv 2018.
11. Hyeon, J.; Lee, W.; Kim, J.H.; Doh, N. NormNet: Point-wise normal estimation network for three-dimensional point cloud data. Int. J. Adv. Robot. Syst. 2019, 16, 172988141985753.
12. Zhao, H.; Jiang, L.; Fu, C.W.; Jia, J. PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; IEEE: Piscataway Township, NJ, USA, 2019.
13. Ye, Z.; Xu, Y.; Huang, R.; Tong, X.; Li, X.; Liu, X.; Luan, K.; Hoegner, L.; Stilla, U. LASDU: A Large-Scale Aerial LiDAR Dataset for Semantic Labeling in Dense Urban Areas. Int. J. Geo-Inf. 2020, 9, 450.
14. Huang, R.; Xu, Y.; Hong, D.; Yao, W.; Ghamisi, P.; Stilla, U. Deep point embedding for urban classification using ALS point clouds: A new perspective from local to global. ISPRS J. Photogramm. Remote Sens. 2020, 163, 62–81.
15. Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic graph cnn for learning on point clouds. ACM Trans. Graph. 2019, 38, 146.
16. Landrieu, L.; Simonovsky, M. Large-Scale Point Cloud Semantic Segmentation with Superpoint Graphs. arXiv 2018, arXiv:1711.09869.
17. Wen, C.; Li, X.; Yao, X.; Peng, L.; Chi, T. Airborne LiDAR point cloud classification with global-local graph attention convolution neural network. ISPRS J. Photogramm. Remote Sens. 2021, 173, 181–194.
18. Huang, R.; Xu, Y.; Stilla, U. GraNet: Global relation-aware attentional network for semantic segmentation of ALS point clouds. ISPRS J. Photogramm. Remote Sens. 2021, 177, 1–20.
19. Chen, C.; Fragonara, L.Z.; Tsourdos, A. GAPNet: Graph attention based point neural network for exploiting local feature of point cloud. arXiv 2019, arXiv:1905.08705.
20. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
21. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
22. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
23. Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015.
24. Armeni, I.; Sener, O.; Zamir, A.R.; Jiang, H.; Brilakis, I.; Fischer, M.; Savarese, S. 3D Semantic Parsing of Large-Scale Indoor Spaces. In Proceedings of the Computer Vision & Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; IEEE Computer Society: New York, NY, USA, 2016.
25. Cramer, M. The DGPF-test on digital airborne camera evaluation overview and test design. Photogramm. Fernerkund. Geoinf. 2010, 2, 73–82.
26. Oquab, M.; Bottou, L.; Laptev, I.; Sivic, J. Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; IEEE: Piscataway Township, NJ, USA, 2014; pp. 1717–1724.
27. Fang, H.; Li, Y. Random undersampling and POSS method for software defect prediction. J. Shandong Univ. Eng. Sci. 2017, 47, 15–21.
28. Lin, W.C.; Tsai, C.F.; Hu, Y.H.; Jhang, J.S. Clustering-based undersampling in class-imbalanced data. Inf. Sci. 2017, 409, 17–26.
29. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
30. Zhu, T.; Lin, Y.; Liu, Y. Synthetic minority oversampling technique for multiclass imbalance problems. Pattern Recognit. 2017, 72, 327–340.
31. Huang, H.S.; Wei, J.A.; Kang, P.D. A new over-sampling SVM classification algorithm based on unbalanced data sample characteristics. Control. Decis. 2018, 33, 1549–1558.
32. Abdi, L.; Hashemi, S. To combat multi-class imbalanced problems by means of over-sampling techniques. IEEE Trans. Knowl. Data Eng. 2016, 28, 238–251.
33. Tan, Z.; Wang, M.; Xie, J.; Chen, Y.; Shi, X. Deep Semantic Role Labeling with Self-Attention. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
34. Tong, Z.; Tanaka, G. Hybrid pooling for enhancement of generalization ability in deep convolutional neural networks. Neurocomputing 2019, 333, 76–85.
35. Chen, L.-Z.; Li, X.-Y.; Fan, D.-P.; Wang, K.; Lu, S.-P.; Cheng, M.-M. LSANet: Feature Learning on Point Sets by Local Spatial Attention. arXiv 2019, arXiv:1905.05442.
36. Lin, Y.; Yan, Z.; Huang, H.; Du, D.; Liu, L.; Cui, S.; Han, X. Fpconv: Learning local flattening for point convolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 4293–4302.
37. Guo, F.; Ren, Q.; Tang, J.; Li, Z. Dilated Multi-scale Fusion for Point Cloud Classification and Segmentation. Multimed. Tools Appl. 2022, 81, 6069–6090.
38. Hu, Q.; Yang, B.; Xie, L.; Rosa, S.; Guo, Y.; Wang, Z.; Trigoni, N.; Markham, A. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; Volume 3, pp. 154–196.
39. Yang, Z.; Tan, B.; Pei, H.; Jiang, W. Segmentation and Multi-Scale Convolutional Neural Network-Based Classification of Airborne Laser Scanner Data. Sensors 2018, 18, 3347.
40. Zhao, R.; Pang, M.; Wang, J. Classifying airborne LiDAR point clouds via deep features learned by a multi-scale convolutional neural network. Int. J. Geogr. Inf. Sci. 2018, 32, 960–979.
41. Wen, C.; Yang, L.; Li, X.; Peng, L.; Chi, T. Directionally Constrained Fully Convolutional Neural Network for Airborne Lidar Point Cloud Classification. ISPRS J. Photogramm. Remote Sens. 2019, 162, 50–62.
42. Li, X.; Wang, L.; Wang, M.; Wen, C.; Fang, Y. DANCE-NET: Density-aware convolution networks with context encoding for airborne LiDAR point cloud classification. ISPRS J. Photogramm. Remote Sens. 2020, 166, 128–139.
43. Li, W.; Wang, F.D.; Xia, G.S. A geometry-attentional network for ALS point cloud classification. ISPRS J. Photogramm. Remote Sens. 2020, 164, 26–40.

| Class | Proportion (%) | Class | Proportion (%) |
|---|---|---|---|
| ceiling | 21.6 | table | 2.7 |
| floor | 19.4 | chair | 3.6 |
| wall | 26.0 | sofa | 0.4 |
| beam | 1.2 | bookcase | 5.5 |
| column | 1.5 | board | 1.0 |
| window | 2.0 | clutter | 9.8 |
| door | 5.3 | | |

| Class | Power Line | Car | Façade | Hedge | Impervious Surface | Low Vegetation | Roof | Shrub | Tree |
|---|---|---|---|---|---|---|---|---|---|
| Training | 546 | 4614 | 27,250 | 12,070 | 193,723 | 180,850 | 152,045 | 47,605 | 135,173 | 
| Proportion | 0.072% | 0.612% | 3.615% | 1.601% | 25.697% | 23.989% | 20.168% | 6.315% | 17.931% | 
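The proportions in the table above follow directly from the training point counts, and they show how skewed the classes are. The sketch below recomputes them and then derives one possible set of balancing weights. The inverse-frequency weighting is an assumption chosen for illustration; this excerpt does not give the formula used by the weight-based balanced sampling module.

```python
from typing import Dict

# Training point counts per class, copied from the table above.
TRAINING_COUNTS: Dict[str, int] = {
    "Power Line": 546,
    "Car": 4614,
    "Façade": 27250,
    "Hedge": 12070,
    "Impervious Surface": 193723,
    "Low Vegetation": 180850,
    "Roof": 152045,
    "Shrub": 47605,
    "Tree": 135173,
}


def class_proportions(counts: Dict[str, int]) -> Dict[str, float]:
    """Fraction of all training points that belongs to each class."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def inverse_frequency_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Illustrative sampling weights: proportional to 1 / count, normalized to sum to 1."""
    inverse = {name: 1.0 / n for name, n in counts.items()}
    norm = sum(inverse.values())
    return {name: w / norm for name, w in inverse.items()}


proportions = class_proportions(TRAINING_COUNTS)
weights = inverse_frequency_weights(TRAINING_COUNTS)
rarest = max(weights, key=weights.__getitem__)

for name in TRAINING_COUNTS:
    print(f"{name:20s} {100 * proportions[name]:7.3f}%  weight={weights[name]:.4f}")
```

Under this weighting the rarest class, Power Line, receives the largest weight, so a sampler that draws points in proportion to the weights would see rare classes far more often than their raw share of the data.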

| Name | Module |
|---|---|
| PointNet++ | Baseline | 
| +ES | Weight-based balanced sampling | 
| +ATT | SC Feature encoding | 
| +PYA | Spatial pyramid pooling | 
| +HYB | Hybrid pooling | 
| ALL | Our method | 

| Module | MIoU | OA | Ceiling | Floor | Wall | Beam | Column | Window | Door | Table | Chair | Sofa | Bookcase | Board | Clutter |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 70.2 | 87.7 | 93.0 | 97.3 | 74.8 | 68.7 | 43.2 | 77.8 | 78.9 | 72.4 | 76.8 | 41.9 | 58.7 | 66.2 | 63.2 | 
| ES | 70.7 | 88.1 | 94.4 | 96.5 | 76.4 | 52.9 | 47.1 | 74.3 | 70.2 | 74.7 | 77.8 | 56.1 | 62.4 | 68.6 | 67.8 | 
| PYA | 72.1 | 89.2 | 93.8 | 97.5 | 79.1 | 66.2 | 48.5 | 74.3 | 78.3 | 74.9 | 77.3 | 44.4 | 65.0 | 72.1 | 66.0 | 
| ATT | 73.3 | 89.6 | 93.3 | 97.1 | 80.1 | 66.0 | 50.9 | 73.2 | 83.5 | 73.7 | 80.2 | 49.3 | 65.7 | 73.9 | 65.6 | 
| HYB | 75.8 | 90.9 | 94.1 | 97.4 | 82.8 | 69.2 | 60.4 | 80.5 | 84.4 | 77.0 | 78.5 | 46.6 | 69.2 | 76.8 | 69.2 | 
| ALL | 80.0 | 93.4 | 94.4 | 97.5 | 82.2 | 74.6 | 63.4 | 83.9 | 83.3 | 80.0 | 87.7 | 64.9 | 75.6 | 79.0 | 73.3 | 
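The MIoU column in the table above is consistent with an unweighted mean of the 13 per-class IoU values. A short check, using the standard definition IoU = TP / (TP + FP + FN) and the Baseline and ALL rows copied from the table, is sketched below; the function names are hypothetical.

```python
from typing import Sequence


def iou(tp: int, fp: int, fn: int) -> float:
    """Intersection over union for one class from confusion counts."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def mean_iou(per_class_iou: Sequence[float]) -> float:
    """Unweighted mean of the per-class IoU values."""
    return sum(per_class_iou) / len(per_class_iou)


# Per-class IoU (%) in table order: ceiling, floor, wall, beam, column, window,
# door, table, chair, sofa, bookcase, board, clutter.
BASELINE = [93.0, 97.3, 74.8, 68.7, 43.2, 77.8, 78.9, 72.4, 76.8, 41.9, 58.7, 66.2, 63.2]
ALL_MODULES = [94.4, 97.5, 82.2, 74.6, 63.4, 83.9, 83.3, 80.0, 87.7, 64.9, 75.6, 79.0, 73.3]

baseline_miou = round(mean_iou(BASELINE), 1)
all_miou = round(mean_iou(ALL_MODULES), 1)
print(baseline_miou, all_miou)
```

Because the mean is unweighted, a gain on a rare class such as sofa (41.9 to 64.9) moves MIoU as much as the same gain on a dominant class would, which is why MIoU is more sensitive to class imbalance than OA.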

| Modules enabled (of ES, PYA, ATT, HYB) | OA | MIoU |
|---|---|---|
| 3 of 4 | 94.0 | 77.1 |
| 3 of 4 | 89.2 | 75.4 |
| 3 of 4 | 88.9 | 73.3 |
| 3 of 4 | 90.0 | 74.9 |
| 4 of 4 | 93.4 | 80.0 |

| Method | OA | MIoU |
|---|---|---|
| PointNet [1] | 78.5 | 47.6 | 
| 3DRCNN [9] | 85.7 | 53.4 | 
| PointNet++ [5] | 81.0 | 54.5 | 
| DGCNN [15] | 84.1 | 56.1 | 
| NormNet [11] | 84.5 | 57.1 | 
| SPGraph [16] | 85.5 | 62.1 |
| LSANet [35] | 86.8 | 62.2 | 
| PointCNN [10] | 88.1 | 65.4 | 
| PointWeb [12] | 87.3 | 66.7 | 
| FPConv [36] | 89.9 | 66.7 | 
| DMSF [37] | 87.9 | 67.2 | 
| RandLA-Net [38] | 88.0 | 70.0 |
| Ours | 90.5 | 66.1 | 

| Model | Power Line | Car | Façade | Hedge | Impervious Surface | Low Vegetation | Roof | Shrub | Tree | OA | Average F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PointNet++ [5] | 57.9 | 66.1 | 54.3 | 31.5 | 90.6 | 79.6 | 91.6 | 41.6 | 77.0 | 81.2 | 65.6 | 
| DPE [14] | 68.1 | 75.2 | 44.2 | 19.5 | 99.3 | 86.5 | 91.1 | 39.4 | 72.6 | 83.2 | 66.2 | 
| WhuY4 [39] | 42.5 | 74.7 | 53.1 | 53.7 | 91.4 | 82.7 | 94.3 | 47.9 | 82.8 | 84.9 | 69.2 | 
| NANJ2 [40] | 62.0 | 66.7 | 42.6 | 40.7 | 91.2 | 88.8 | 93.6 | 55.9 | 82.6 | 85.2 | 69.3 | 
| D-FCN [41] | 70.4 | 78.1 | 60.5 | 37.0 | 91.4 | 80.2 | 93.0 | 46.0 | 79.4 | 82.2 | 70.7 | 
| DANCE-Net [42] | 68.4 | 77.2 | 60.2 | 38.6 | 92.8 | 81.6 | 93.9 | 47.2 | 81.4 | 83.9 | 71.2 | 
| GACNN [43] | 76.0 | 77.7 | 58.9 | 37.8 | 93.0 | 81.8 | 93.1 | 46.7 | 78.9 | 83.2 | 71.5 | 
| GANet [17] | 75.4 | 77.8 | 61.5 | 44.2 | 91.6 | 82.0 | 94.4 | 49.6 | 82.6 | 84.5 | 73.2 | 
| GraNet [18] | 67.7 | 80.9 | 62.0 | 51.1 | 91.7 | 82.7 | 94.5 | 49.9 | 82.0 | 84.5 | 73.6 | 
| Our method | 46.5 | 77.8 | 57.9 | 37.9 | 92.9 | 82.3 | 94.8 | 48.6 | 86.3 | 85.4 | 69.5 | 
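The Average F1 column in the table above appears to be the unweighted mean of the nine per-class F1 scores. The sketch below uses the standard F1 definition and recomputes the average for two rows copied from the table. Since the table values are already rounded, a recomputed average can differ from the published one in the last digit.

```python
from typing import Sequence


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_f1(per_class_f1: Sequence[float]) -> float:
    """Unweighted mean of per-class F1 scores."""
    return sum(per_class_f1) / len(per_class_f1)


# Per-class F1 (%) in table order: power line, car, façade, hedge,
# impervious surface, low vegetation, roof, shrub, tree.
POINTNET2 = [57.9, 66.1, 54.3, 31.5, 90.6, 79.6, 91.6, 41.6, 77.0]
OUR_METHOD = [46.5, 77.8, 57.9, 37.9, 92.9, 82.3, 94.8, 48.6, 86.3]

pointnet2_avg = average_f1(POINTNET2)
our_avg = average_f1(OUR_METHOD)
print(f"PointNet++: {pointnet2_avg:.1f}")
print(f"Our method: {our_avg:.2f}")
```

For the PointNet++ row this reproduces the tabulated 65.6. Reading OA and Average F1 side by side is useful here: the proposed method has the highest OA in the table (85.4), while its Average F1 is pulled down by the power line and hedge classes.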
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Deng, C.; Peng, Z.; Chen, Z.; Chen, R. Point Cloud Deep Learning Network Based on Balanced Sampling and Hybrid Pooling. Sensors 2023, 23, 981. https://doi.org/10.3390/s23020981