Accurate Instance-Based Segmentation for Boundary Detection in Robot Grasping Application
Abstract
1. Introduction
- Adding a 3D processing branch that preserves all pixels belonging to each object by using Difference of Normals-based segmentation [10]. This branch addresses scenes with multiple objects and occlusion, which lead to misclassification in Mask R-CNN.
- Distinguishing the edge regions of interest based on their relationship with the original masks.
- Assigning each edge region to a mask by considering the spatial distance between the edge region and each mask region, using Euclidean cluster extraction [11].
2. Related Works
2.1. 2D Approach
2.2. 3D Approach
3. Experimental Methodology
3.1. Mask R-CNN Pipeline
3.2. Proposed Method
- Estimate the normals for every point using a large support radius of $r_l$.
- Estimate the normals for every point using a small support radius of $r_s$.
- For every point $p$, compute the normalized Difference of Normals, $\Delta\hat{n}(p, r_s, r_l) = \frac{\hat{n}(p, r_s) - \hat{n}(p, r_l)}{2}$ [10].
- Filter the resulting vector field to isolate the points belonging to the scale/region of interest.
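The four steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in the paper: normals are estimated by PCA over a brute-force radius search, they are oriented towards +z as an assumed viewpoint direction, and the toy scene, the radii `r_small` and `r_large`, and the 0.01 threshold are hypothetical values chosen for the example.

```python
import numpy as np

def estimate_normals(points, radius):
    """Estimate one unit normal per point by PCA over neighbours within `radius`."""
    normals = np.zeros_like(points)
    for i, p in enumerate(points):
        nbrs = points[np.linalg.norm(points - p, axis=1) <= radius]
        # The eigenvector of the smallest covariance eigenvalue
        # approximates the surface normal (eigh sorts eigenvalues ascending).
        _, vecs = np.linalg.eigh(np.cov(nbrs.T))
        n = vecs[:, 0]
        # Assumption: orient every normal towards +z so both scales agree in sign.
        normals[i] = n if n[2] >= 0 else -n
    return normals

def difference_of_normals(points, r_small, r_large):
    """Normalized DoN: half the difference of small- and large-scale normals."""
    n_small = estimate_normals(points, r_small)
    n_large = estimate_normals(points, r_large)
    return (n_small - n_large) / 2.0

# Hypothetical toy scene: a flat plane (x <= 0) meeting a 45-degree ramp (x > 0).
xs, ys = np.meshgrid(np.arange(-8.0, 9.0), np.arange(0.0, 9.0))
x, y = xs.ravel(), ys.ravel()
z = np.where(x > 0, x, 0.0)
points = np.column_stack([x, y, z])

r_small, r_large = 1.5, 3.5           # illustrative support radii
don = difference_of_normals(points, r_small, r_large)
magnitude = np.linalg.norm(don, axis=1)

# Filtering step: keep points whose DoN magnitude exceeds an illustrative threshold.
edge_points = points[magnitude > 0.01]
```

Points far from the fold see the same plane at both scales, so their DoN magnitude is close to zero, while points near the fold respond because only the large-radius neighbourhood reaches across it. On real sensor data the radii and threshold would need tuning to the object scale.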
- Create a Kd-tree representation for the input point cloud dataset P.
- Set up an empty list of clusters C, and a queue of the points that need to be checked Q.
- Then, for every point $p_i \in P$, perform the following steps:
  - Add $p_i$ to the current queue Q.
  - For every point $p_i \in Q$ do:
    - Search for the set $P_i^k$ of point neighbors of $p_i$ in a sphere with radius $r < d_{th}$.
    - For every neighbor $p_i^k \in P_i^k$, check if the point has already been processed, and if not, add it to Q.
  - When the list of all points in Q has been processed, add Q to the list of clusters C, and reset Q to an empty list.
- The algorithm terminates when all points of P have been processed and are now part of the list of point clusters C.
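The clustering procedure above can be sketched in plain Python as follows. This is a minimal sketch rather than the PCL implementation: a brute-force distance check stands in for the Kd-tree radius search to keep the example self-contained, and the function name `euclidean_clusters`, the parameter `d_th`, and the sample points are hypothetical.

```python
from collections import deque
from math import dist

def euclidean_clusters(points, d_th):
    """Group point indices so that each point lies within d_th of a cluster member."""
    processed = [False] * len(points)
    clusters = []                      # the list of clusters C
    for seed in range(len(points)):
        if processed[seed]:
            continue
        queue = deque([seed])          # the queue Q of points still to be checked
        processed[seed] = True
        cluster = []
        while queue:
            i = queue.popleft()
            cluster.append(i)
            # Brute-force radius search (assumption: replaces the Kd-tree lookup).
            for j, q in enumerate(points):
                if not processed[j] and dist(points[i], q) < d_th:
                    processed[j] = True
                    queue.append(j)
        clusters.append(cluster)       # Q is exhausted: store it and start afresh
    return clusters

# Two well-separated groups of points (illustrative data).
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0),
         (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
clusters = euclidean_clusters(cloud, d_th=0.15)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

Point 2 is farther than `d_th` from point 0 but still joins its cluster through point 1, which is the chaining behaviour the queue provides. For large clouds a spatial index such as a Kd-tree keeps the neighbor search from becoming quadratic.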
4. Experimental Results and Analysis
4.1. Experiment Preparation
4.2. Result and Evaluation
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- [1] Brachmann, E.; Krull, A.; Michel, F.; Gumhold, S.; Shotton, J.; Rother, C. Learning 6D object pose estimation using 3D object coordinates. In European Conference on Computer Vision (ECCV); Springer: Cham, Switzerland, 2014; pp. 536–551.
- [2] Hu, Y.; Hugonot, J.; Fua, P.; Salzmann, M. Segmentation-Driven 6D object pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 3385–3394.
- [3] Deng, X.; Xiang, Y.; Mousavian, A.; Eppner, C.; Bretl, T.; Fox, D. Self-supervised 6D object pose estimation for robot manipulation. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020.
- [4] Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-based convolutional networks for accurate object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 142–158.
- [5] Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015.
- [6] Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv 2015, arXiv:1506.01497.
- [7] He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- [8] Huang, Z.; Huang, L.; Gong, Y.; Huang, C.; Wang, X. Mask Scoring R-CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019.
- [9] Zhang, Y.; Chu, J.; Leng, L.; Miao, J. Mask-Refined R-CNN: A Network for Refining Object Details in Instance Segmentation. Sensors 2020, 20, 1010.
- [10] Ioannou, Y.; Taati, B.; Harrap, R.; Greenspan, M. Difference of Normals as a Multi-scale Operator in Unorganized Point Clouds. In Proceedings of the 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission, Zurich, Switzerland, 13–15 October 2012.
- [11] pcl.readthedocs.io. Available online: https://pcl.readthedocs.io/en/latest/cluster_extraction.html (accessed on 1 March 2021).
- [12] Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Computer Vision and Pattern Recognition; Springer: Cham, Switzerland, 2015.
- [13] Wu, X.; Wen, S.; Xie, Y. Improvement of Mask-RCNN Object Segmentation Algorithm. In ICIRA 2019: Intelligent Robotics and Applications; Springer: Cham, Switzerland, 2019.
- [14] Rother, C.; Kolmogorov, V.; Blake, A. GrabCut: Interactive Foreground Extraction Using Iterated Graph Cuts. ACM Trans. Graph. 2004, 23, 309–314.
- [15] Xu, C.; Wang, G.; Yan, S.; Yu, J.; Zhang, B.; Dai, S.; Li, Y.; Xu, L. Fast Vehicle and Pedestrian Detection Using Improved Mask R-CNN. Math. Probl. Eng. 2020, 2020, 5761414.
- [16] Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
- [17] Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018.
- [18] Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
- [19] Uijlings, J.R.R.; van de Sande, K.E.A.; Gevers, T.; Smeulders, A.W.M. Selective Search for Object Recognition. Int. J. Comput. Vis. 2012, 104, 154–171.
- [20] Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of a convolutional neural network. In Proceedings of the International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017.
- [21] Redmon, J.; Angelova, A. Real-time grasp detection using convolutional neural networks. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015.
- [22] Rao, D.; Le, Q.V.; Phoka, T.; Quigley, M.; Sudsang, A.; Ng, A.Y. Grasping novel objects with depth segmentation. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, 18–22 October 2010.
- [23] Uckermann, A.; Elbrechter, C.; Haschke, R.; Ritter, H. 3D scene segmentation for autonomous robot grasping. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), Vilamoura-Algarve, Portugal, 7–12 October 2012.
- [24] Sarbolandi, H.; Lefloch, D.; Kolb, A. Kinect Range Sensing: Structured-Light versus Time-of-Flight Kinect. Comput. Vis. Image Underst. 2015, 139, 1–20.
- [25] Kurban, R.; Skuka, F.; Bozpolat, H. Plane Segmentation of Kinect Point Clouds using RANSAC. In Proceedings of the 2015 7th International Conference on Information Technology (ICIT), Huangshan, China, 13–15 November 2015.
- [26] pcl.readthedocs.io. Available online: https://pcl.readthedocs.io/projects/tutorials/en/latest/don_segmentation.html (accessed on 1 March 2021).
- [27] Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In European Conference on Computer Vision (ECCV); Springer: Cham, Switzerland, 2014.
- [28] Rahman, M.A.; Wang, Y. Optimizing intersection-over-union in deep neural networks for image segmentation. In International Symposium on Visual Computing; Springer: Cham, Switzerland, 2016; pp. 234–244.
- [29] Lundell, J.; Verdoja, F.; Kyrki, V. Beyond Top-Grasps through Scene Completion. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 545–551.
- [30] Gualtieri, M.; ten Pas, A.; Saenko, K.; Platt, R. High precision grasp pose detection in dense clutter. In Proceedings of the International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea, 9–14 October 2016.
Model | Cup IoU | Blue Box IoU | Cylinder IoU | White Box IoU |
---|---|---|---|---|
Mask R-CNN (ResNet-101) | 0.835 | 0.772 | 0.825 | 0.812 |
Mask R-CNN (ResNet-50) | 0.821 | 0.768 | 0.826 | 0.792 |
DoN | 0.264 | 0.209 | 0.314 | 0.195 |
Euclidean Cluster Extraction | 0.310 | 0.287 | 0.257 | 0.358 |
Ours (ResNet-101) | 0.872 | 0.837 | 0.865 | 0.881 |
Ours (ResNet-50) | 0.870 | 0.834 | 0.859 | 0.879 |
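
The IoU values in the table above compare a predicted mask with its ground-truth mask. As a minimal sketch of how such a score can be computed for binary masks (the function name `mask_iou` and the toy 6×6 masks are hypothetical and are not taken from the paper's evaluation code):

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0                     # both masks empty: define IoU as 0
    return float(np.logical_and(pred, gt).sum() / union)

# Ground truth: a 4x4 object. Prediction: the same object with one boundary row missed.
gt = np.zeros((6, 6), dtype=bool)
gt[1:5, 1:5] = True                    # 16 pixels
pred = np.zeros((6, 6), dtype=bool)
pred[2:5, 1:5] = True                  # 12 pixels, all inside the ground truth

iou = mask_iou(pred, gt)
print(iou)  # 0.75
```

Losing a single boundary row of a small object already drops the score to 0.75, which illustrates why recovering edge pixels has a visible effect on IoU.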

Model | mAP | mAP (cup) | mAP (blue box) | mAP (cylinder) | mAP (white box) |
---|---|---|---|---|---|
Mask R-CNN (ResNet-101) | 0.39 | 0.399 | 0.336 | 0.465 | 0.365 |
Mask R-CNN (ResNet-50) | 0.38 | 0.387 | 0.345 | 0.448 | 0.365 |
Ours (ResNet-101) | 0.46 | 0.446 | 0.429 | 0.524 | 0.437 |
Ours (ResNet-50) | 0.46 | 0.456 | 0.421 | 0.534 | 0.447 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hoang, H.H.; Tran, B.L. Accurate Instance-Based Segmentation for Boundary Detection in Robot Grasping Application. Appl. Sci. 2021, 11, 4248. https://doi.org/10.3390/app11094248