Abstract
Reconstructing 3D human motion from monocular video is challenging when frames contain occlusions or blur, as conventional approaches depend on features extracted within limited temporal windows, resulting in structural distortions. In this paper, we introduce a novel framework that combines temporal probability guidance with graph topology learning to achieve robust 3D human mesh reconstruction from incomplete observations. Our method leverages topology-aware probability distributions spanning entire motion sequences to recover missing anatomical regions. The Graph Topological Modeling (GTM) component captures structural relationships among body parts by learning the inherent connectivity patterns of human anatomy. Building upon GTM, our Temporal-alignable Probability Distribution (TPDist) mechanism predicts missing features through probabilistic inference, establishing temporal coherence across frames. Additionally, we propose a Hierarchical Human Loss (HHLoss) that regularizes probability-distribution errors on inter-frame features in a hierarchical manner while accounting for topological variations. Experimental validation demonstrates that our approach outperforms state-of-the-art methods on the 3DPW benchmark, particularly excelling in scenarios involving occlusions and motion blur.