Article

Generative Data Augmentation for ArUco-Free RGB-Based 6-DoF Object Pose Estimation

1
Department of Physics, Informatics and Mathematics, University of Modena and Reggio Emilia, Via Campi 213/B, 41125 Modena, Italy
2
HIPERT s.r.l, Via Inventori, 37, 41121 Modena, Italy
*
Author to whom correspondence should be addressed.
J. Imaging 2026, 12(6), 244; https://doi.org/10.3390/jimaging12060244
Submission received: 23 March 2026 / Revised: 21 May 2026 / Accepted: 27 May 2026 / Published: 29 May 2026
(This article belongs to the Special Issue AI-Driven Image and Video Understanding)

Abstract

In recent years, data-driven approaches have become increasingly important in industrial computer vision applications, particularly for six-degrees-of-freedom (6-DoF) object pose estimation. However, benchmark datasets may unintentionally introduce biases that affect the reliability of learned models. In this work, we investigate the shortcut bias induced by fiducial ArUco markers in the widely used Linemod dataset. Although such markers are typically absent from real industrial environments, they introduce unintended visual cues that neural networks tend to exploit. As a result, model selection based on state-of-the-art benchmarks can be biased, since the reported performance often reflects reliance on these shortcuts rather than robust feature extraction. Using saliency map analysis, we show that a large portion of the model’s attention is often concentrated on these markers, revealing a shortcut that artificially boosts pose estimation performance. To mitigate this issue, we propose a data augmentation pipeline based on generative AI techniques that removes the markers and replaces the background with more realistic synthesized scenes. Experimental results indicate a noticeable drop in performance when models trained on the original Linemod dataset are evaluated in ArUco-free environments, confirming the presence of background-induced biases. Training with the proposed generative-swapped dataset improves robustness and generalization to unseen scenarios, although it does not fully eliminate the problem. Overall, the results highlight the impact of background-related biases in pose estimation benchmarks and suggest that the proposed augmentation strategy is a practical and scalable step toward more reliable 6-DoF pose estimation systems for industrial applications, while leaving room for further improvement.
Keywords: data manipulation; 6D pose estimation; generative data augmentation; explainable artificial intelligence; saliency methods
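
The saliency observation in the abstract (attention concentrated on the markers) can be made concrete with a simple ratio: the share of total saliency that falls on marker pixels. The snippet below is only a minimal sketch under stated assumptions. The function name `marker_saliency_share`, the nested-list representation of the saliency map, and the binary marker mask are all hypothetical, and the abstract does not say which saliency method or metric the authors actually use.

```python
from typing import Sequence


def marker_saliency_share(
    saliency: Sequence[Sequence[float]],
    marker_mask: Sequence[Sequence[int]],
) -> float:
    """Return the fraction of total absolute saliency lying on marker pixels.

    `saliency` is a 2D map of per-pixel attribution values (hypothetical input);
    `marker_mask` is a same-shaped 2D map, non-zero where a marker is present.
    """
    if len(saliency) != len(marker_mask):
        raise ValueError("saliency and marker_mask must have the same height")

    total = 0.0
    on_marker = 0.0
    for sal_row, mask_row in zip(saliency, marker_mask):
        if len(sal_row) != len(mask_row):
            raise ValueError("saliency and marker_mask must have the same width")
        for value, is_marker in zip(sal_row, mask_row):
            weight = abs(value)  # sign of the attribution is ignored here
            total += weight
            if is_marker:
                on_marker += weight

    # An all-zero saliency map carries no attention anywhere.
    return on_marker / total if total > 0.0 else 0.0


# Toy 2x3 example: the only marker pixel is the bottom-right one.
toy_saliency = [[1, 1, 0], [0, 2, 6]]
toy_mask = [[0, 0, 0], [0, 0, 1]]
share = marker_saliency_share(toy_saliency, toy_mask)
```

A share close to 1 would suggest that the model attends mostly to marker regions rather than to the object. Which threshold counts as a shortcut is a judgment call that this sketch does not settle.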

Share and Cite

MDPI and ACS Style

Scribano, C.; Ferrari, I.; Franchini, G.; Govi, E.; Sapienza, D.; Poppi, T.; Verucchi, M.; Bertogna, M. Generative Data Augmentation for ArUco-Free RGB-Based 6-DoF Object Pose Estimation. J. Imaging 2026, 12, 244. https://doi.org/10.3390/jimaging12060244

