Diffangle-Grasp: Dexterous Grasp Synthesis via Fine-Grained Contact Generation and Natural Pose Optimization
Abstract
1. Introduction
- We propose DiffCVAE, an improved contact map generation model. Unlike existing models that rely on a single model or a simple combination of CVAEs, DiffCVAE is built around a shared latent space: a diffusion model is embedded in the latent space of the CVAE to refine the reconstruction of the initial latent variables. Applying the iterative refinement capability of the diffusion model in this shared latent space improves the quality of the contact maps. In addition, rather than adjusting the whole hand at once, the method can generate a fine-grained contact map specifically for each gesture.
- We propose PNGR, a grasp optimization method that generates intuitive and effective dexterous hand grasps using the aforementioned fine-grained contact maps. By incorporating supervision from physical constraints and natural poses, this approach achieves grasps that are more natural and physically plausible.
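
The idea of refining a CVAE latent code with a diffusion model can be illustrated with the standard DDPM forward (noising) step and its inversion. The sketch below is not the DiffCVAE implementation: the linear noise schedule, the four-dimensional latent vector, and all function names are assumptions made for illustration. In a trained system, a learned denoiser conditioned on the object would supply the noise estimate; here the true noise is passed back in, so the clean latent is recovered exactly.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_latent(z0, t, alpha_bars, rng):
    """Forward diffusion q(z_t | z_0) applied to a latent code z0."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


def estimate_clean_latent(zt, eps_hat, t, alpha_bars):
    """Invert the forward step given a noise estimate eps_hat."""
    a = alpha_bars[t]
    return [(z - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for z, e in zip(zt, eps_hat)]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(100))

z0 = [0.5, -1.2, 0.3, 0.9]  # hypothetical stand-in for a CVAE encoder output
zt, eps = noise_latent(z0, 60, alpha_bars, rng)

# Oracle noise estimate: a trained denoiser would predict eps from (zt, t, condition).
z0_hat = estimate_clean_latent(zt, eps, 60, alpha_bars)
```

With an imperfect noise estimate, the recovered latent differs from `z0`, which is why diffusion samplers repeat the denoising step over many timesteps before the result is passed to the decoder.
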
2. Materials and Methods
2.1. Dataset and Hand Model
2.2. Generation of Detailed Contact Maps
2.3. Proposed Grasping Optimization
3. Results
3.1. Experimental Platform and Configuration Parameters
3.2. Quantitative Assessment of Contact Map Generation
3.3. Quantitative Assessment of Grasp Generation
3.4. Qualitative Assessment
3.5. Ablation Experiments
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Li, Y.; Jiang, L.; Liu, Y.; Sun, Z.; Zheng, D. A Review of Stable Grasping Methods for Humanoid Dexterous Hands. Acta Armamentarii 2023, 44, 3237–3252.
2. Woo, T.; Park, W.; Jeong, W.; Park, J. A survey of deep learning methods and datasets for hand pose estimation from hand-object interaction images. Comput. Graph. 2023, 116, 474–490.
3. Ma, J.; Zhou, Y.; Wang, Z.; Sang, H.; Jiang, R.; He, B. Geometric-aware RGB-D representation learning for hand-object reconstruction. Expert Syst. Appl. 2024, 257, 124995.
4. Hasson, Y.; Varol, G.; Tzionas, D.; Kalevatykh, I.; Black, M.J.; Laptev, I.; Schmid, C. Learning Joint Reconstruction of Hands and Manipulated Objects. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 11799–11808.
5. Grady, P.; Tang, C.; Twigg, C.D.; Vo, M.; Brahmbhatt, S.; Kemp, C.C. ContactOpt: Optimizing Contact to Improve Grasps. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 20–25 June 2021; pp. 1471–1481.
6. Cha, J.; Kim, J.; Yoon, J.S.; Baek, S. Text2HOI: Text-Guided 3D Motion Generation for Hand-Object Interaction. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 1577–1585.
7. Chen, Z.; Hasson, Y.; Schmid, C.; Laptev, I. AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction. In Proceedings of the Computer Vision – ECCV 2022, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 231–248.
8. Wu, X.; Liu, T.; Li, C.; Ma, Y.; Shi, Y.; He, X. FastGrasp: Efficient Grasp Synthesis with Diffusion. arXiv 2024, arXiv:2411.14786.
9. Debbagh, M. Learning Structured Output Representations from Attributes using Deep Conditional Generative Models. arXiv 2023, arXiv:2305.00980.
10. Jiang, H.; Liu, S.; Wang, J.; Wang, X. Hand-Object Contact Consistency Reasoning for Human Grasps Generation. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Virtual Conference, 10–17 October 2021; pp. 11087–11096.
11. Liu, S.; Zhou, Y.; Yang, J.; Gupta, S.; Wang, S. ContactGen: Generative Contact Modeling for Grasp Generation. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 20552–20563.
12. Shao, L.; Ferreira, F.; Jorda, M.; Nambiar, V.; Luo, J.; Solowjow, E.; Ojea, J.A.; Khatib, O.; Bohg, J. UniGrasp: Learning a Unified Model to Grasp With Multifingered Robotic Hands. IEEE Robot. Autom. Lett. 2020, 5, 2286–2293.
13. Brahmbhatt, S.; Handa, A.; Hays, J.; Fox, D. ContactGrasp: Functional Multi-finger Grasp Synthesis from Contact. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 2386–2393.
14. Brahmbhatt, S.; Ham, C.; Kemp, C.C.; Hays, J. ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 8701–8711.
15. Wu, A.; Guo, M.; Liu, C.K. Learning Diverse and Physically Feasible Dexterous Grasps with Generative Model and Bilevel Optimization. arXiv 2022, arXiv:2207.00195.
16. Zhao, F.; Tsetserukou, D.; Liu, Q. GrainGrasp: Dexterous Grasp Generation with Fine-grained Contact Guidance. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; pp. 6470–6476.
17. Miller, A.T.; Allen, P.K. Grasp it! A versatile simulator for robotic grasping. IEEE Robot. Autom. Mag. 2004, 11, 110–122.
18. Dzidek, B.M.; Adams, M.J.; Andrews, J.W.; Zhang, Z.; Johnson, S.A. Contact mechanics of the human finger pad under compressive loads. J. R. Soc. Interface 2017, 14, 20160935.
19. Romero, J.; Tzionas, D.; Black, M.J. Embodied Hands: Modeling and Capturing Hands and Bodies Together. ACM Trans. Graph. 2017, 36, 245.
20. Zhou, K.; Bhatnagar, B.L.; Lenssen, J.E.; Pons-Moll, G. TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement. In Proceedings of the Computer Vision – ECCV 2022, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 1–19.
21. Chang, A.X.; Funkhouser, T.; Guibas, L.; Hanrahan, P.; Huang, Q.; Li, Z.; Savarese, S.; Savva, M.; Song, S.; Su, H.; et al. ShapeNet: An Information-Rich 3D Model Repository. arXiv 2015, arXiv:1512.03012.
22. Blender Online Community. Blender – A 3D Modelling and Rendering Package; Blender Foundation: Amsterdam, The Netherlands; Blender Institute: Amsterdam, The Netherlands. Available online: http://www.blender.org/ (accessed on 14 November 2023).
23. Yang, L.; Zhan, X.; Li, K.; Xu, W.; Zhang, J.; Li, J.; Lu, C. Learning a Contact Potential Field for Modeling the Hand-Object Interaction. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 5645–5662.
24. Li, P.; Liu, T.; Li, Y.; Geng, Y.; Zhu, Y.; Yang, Y.; Huang, S. GenDexGrasp: Generalizable Dexterous Grasping. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 8068–8074.
25. Taud, H.; Mas, J.F. Multilayer Perceptron (MLP). In Geomatic Approaches for Modeling Land Change Scenarios; Camacho Olmedo, M.T., Paegelow, M., Mas, J.-F., Escobar, F., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 451–455.
26. Chou, G.; Bahat, Y.; Heide, F. Diffusion-SDF: Conditional Generative Modeling of Signed Distance Functions. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 2262–2272.
27. Mousavian, A.; Eppner, C.; Fox, D. 6-DOF GraspNet: Variational Grasp Generation for Object Manipulation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 2901–2910.
28. Ren, S.; Zhang, Y.; Hang, J.; Lin, X. Hand-object information embedded dexterous grasping generation. Pattern Recognit. Lett. 2023, 174, 130–136.
29. Kaya, E.C.; Schwarz, S.; Tabus, I. Refining The Bounding Volumes for Lossless Compression of Voxelized Point Clouds Geometry. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; pp. 3408–3412.
30. Calculate the Intersection Volume of Two Geometries Based on Trimesh. Available online: https://blog.csdn.net/wtd2000/article/details/143625254 (accessed on 14 April 2025).
31. Ratcliff, J.W. Release of V-HACD, version 4.1; GitHub, Inc.: San Francisco, CA, USA. Available online: https://github.com/kmammou/v-hacd/releases/tag/v4.1.0 (accessed on 2 June 2025).
32. Tzionas, D.; Ballan, L.; Srikantha, A.; Aponte, P.; Pollefeys, M.; Gall, J. Capturing Hands in Action Using Discriminative Salient Points and Physics Simulation. Int. J. Comput. Vis. 2016, 118, 172–193.
33. Karunratanakul, K.; Yang, J.; Zhang, Y.; Black, M.J.; Muandet, K.; Tang, S. Grasping Field: Learning Implicit Representations for Human Grasps. In Proceedings of the 2020 International Conference on 3D Vision (3DV), Fukuoka, Japan, 25–28 November 2020; pp. 333–344.
34. Amazon Mechanical Turk. Available online: https://www.mturk.com/ (accessed on 10 May 2025).
35. Zuo, B.; Zhao, Z.; Sun, W.; Yuan, X.; Yu, Z.; Wang, Y. GraspDiff: Grasping Generation for Hand-Object Interaction With Multimodal Guided Diffusion. IEEE Trans. Vis. Comput. Graph. 2024, 1–13.
36. Wu, R.; Zhu, T.; Lin, X.; Sun, Y. Cross-Category Functional Grasp Transfer. IEEE Robot. Autom. Lett. 2024, 9, 10652–10659.
| Angles | Index | Index | Index | Middle | Middle | Middle | Ring | Ring | Ring | Little | Little | Little | Thumb | Thumb | Thumb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 2 | 3 | 4 | 5 | 6 | 10 | 11 | 12 | 7 | 8 | 9 | 13 | 14 | 15 |
| | 15° | 0° | 0° | 5° | 0° | 0° | 5° | 0° | 0° | 0° | 0° | 0° | 15° | 0° | 0° |
| | −5° | 0° | 0° | −5° | 0° | 0° | −15° | 0° | 0° | −15° | 0° | 0° | 0° | 0° | 0° |
| | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° | 90° |
| | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° | 0° |
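
Joint-angle limits such as those tabulated above are commonly turned into a penalty that is zero inside the allowed range and grows quadratically outside it. The sketch below illustrates that general pattern only; whether the natural-pose term in PNGR takes this form is not stated here. The row labels of the table are missing, so reading the first two value rows as the upper and lower limits of one rotation axis, and the last two as the limits of a second axis, is an assumption, as are all names in the code.

```python
# Column order as printed in the table (joint indices, three per finger).
JOINT_ORDER = [1, 2, 3, 4, 5, 6, 10, 11, 12, 7, 8, 9, 13, 14, 15]

# ASSUMPTION: value rows 1-2 are upper/lower limits of one rotation axis and
# value rows 3-4 are upper/lower limits of a second axis (labels are missing).
AXIS_A_UPPER = [15, 0, 0, 5, 0, 0, 5, 0, 0, 0, 0, 0, 15, 0, 0]
AXIS_A_LOWER = [-5, 0, 0, -5, 0, 0, -15, 0, 0, -15, 0, 0, 0, 0, 0]
AXIS_B_UPPER = [90] * 15
AXIS_B_LOWER = [0] * 15


def limits_by_joint(upper_row, lower_row):
    """Map each joint index to its (lower, upper) limit in degrees."""
    return {j: (lo, hi) for j, lo, hi in zip(JOINT_ORDER, lower_row, upper_row)}


def limit_penalty(angles_deg, limits):
    """Sum of squared limit violations in deg^2; zero when all angles are in range."""
    total = 0.0
    for joint, angle in angles_deg.items():
        lo, hi = limits[joint]
        over = max(0.0, angle - hi)    # amount above the upper limit
        under = max(0.0, lo - angle)   # amount below the lower limit
        total += over ** 2 + under ** 2
    return total


AXIS_A_LIMITS = limits_by_joint(AXIS_A_UPPER, AXIS_A_LOWER)
AXIS_B_LIMITS = limits_by_joint(AXIS_B_UPPER, AXIS_B_LOWER)

# Hypothetical pose: joint 1 exceeds its upper limit by 5 degrees on axis A.
pose_axis_a = {1: 20.0, 7: -10.0, 13: 5.0}
penalty = limit_penalty(pose_axis_a, AXIS_A_LIMITS)  # 25.0
```

A squared hinge of this kind leaves in-range poses unaffected, so it can be added to other energy terms without biasing poses that already respect the limits.
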
Parameter | Configuration |
---|---|
CPU | Intel(R) Xeon(R) Platinum 8457C |
GPU | L20 |
GPU Video Memory | 48 GB |
Operating System | Ubuntu 22.04 |
Training Environment | PyTorch 2.0.1 + PyTorch3D 0.77 + Python 3.10 + CUDA 11.8 |
Methods | Volume (cm³) | (cm) | (%) | Ntrl. Scores |
---|---|---|---|---|
GG (complete) [16] | 2.08 | 0.76 | 46.98 | 3.64 |
GG (only opt.) [16] | 1.98 | 0.84 | 48.75 | 3.71 |
GT [4] | 2.20 | 0.75 | 52.28 | 3.51 |
FG [8] | 1.93 | 0.79 | 48.27 | 3.68 |
GF [33] | 2.38 | 0.89 | 20.60 | 2.93 |
GA [10] | 3.65 | 0.80 | 35.61 | 3.43 |
Ours (only opt.) | 2.05 | 0.81 | 48.57 | 3.76 |
Ours (only Diff.) | 2.17 | 0.84 | 47.14 | 3.40 |
Ours (complete) | 1.85 | 0.78 | 49.16 | 3.72 |
Energy | Volume (cm³) | (cm) | (%) |
---|---|---|---|
w/o | 2.67 | 0.86 | 44.58 |
w/o | 2.56 | 0.78 | 45.88 |
w/o | 2.68 | 0.82 | 42.86 |
w/o | 3.86 | 1.21 | 17.07 |
w/o | 2.17 | 0.84 | 47.14 |
w/o | 1.98 | 0.84 | 48.75 |
complete E | 1.85 | 0.78 | 49.16 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ning, M.; Deng, C.; Zhan, Z.; Yin, Q.; Xia, X. Diffangle-Grasp: Dexterous Grasp Synthesis via Fine-Grained Contact Generation and Natural Pose Optimization. Biomimetics 2025, 10, 492. https://doi.org/10.3390/biomimetics10080492