- Article
DPCA-GCN: Dual-Path Cross-Attention Graph Convolutional Networks for Skeleton-Based Action Recognition
- Khadija Lasri,
- Khalid El Fazazy and
- Adnane Mohamed Mahraz
- + 2 authors
Skeleton-based action recognition has achieved remarkable advances with graph convolutional networks (GCNs). However, most existing models process spatial and temporal information within a single coupled stream, which often obscures the distinct patterns of joint configuration and motion dynamics. This paper introduces the Dual-Path Cross-Attention Graph Convolutional Network (DPCA-GCN), an architecture that explicitly separates spatial and temporal modeling into two specialized pathways while maintaining rich bidirectional interaction between them. The spatial branch integrates graph convolution and spatial transformers to capture intra-frame joint relationships, whereas the temporal branch combines temporal convolution and temporal transformers to model inter-frame dependencies. A bidirectional cross-attention mechanism facilitates explicit information exchange between both paths, and an adaptive gating module balances their respective contributions according to the action context. Unlike traditional approaches that process spatial–temporal information sequentially, our dual-path design enables specialized processing while maintaining cross-modal coherence through memory-efficient chunked attention mechanisms. Extensive experiments on the NTU RGB+D 60 and NTU RGB+D 120 datasets demonstrate that DPCA-GCN achieves competitive joint-only accuracies of 88.72%/94.31% and 82.85%/83.65%, respectively, with exceptional top-5 scores of 96.97%/99.14% and 95.59%/95.96%, while maintaining significantly lower computational complexity compared to multi-modal approaches.
15 December 2025




