Deep learning has achieved great success on robotic vision tasks. However, when compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset consisting of commonly used objects for benchmarking in 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups by the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested by the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and photo-realistic images are helpful in increasing the performance of pose estimation algorithms.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited