This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Human Pose Intelligent Detection Algorithm Based on Spatiotemporal Hybrid Dilated Convolution Model

School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(24), 4798; https://doi.org/10.3390/electronics14244798 (registering DOI)
Submission received: 1 November 2025 / Revised: 30 November 2025 / Accepted: 3 December 2025 / Published: 5 December 2025

Abstract

Three-dimensional human pose estimation (3D HPE) converts an input image or video into the 3D coordinates of the keypoints of the human body. At present, the mainstream approach to 3D HPE takes the 2D pose estimation result as an intermediate representation and then regresses it to the 3D pose. The general difficulty of this approach is how to effectively extract the features relating the 2D joint points and regress them to 3D coordinates in a highly nonlinear 3D space. In this paper, we propose a new algorithm, called TSHDC, that addresses this difficulty by considering the temporal and spatial characteristics of human joint points. By introducing the self-attention mechanism and the temporal convolutional network (TCN) into the 3D HPE task, the model needs a temporal receptive field of only 27 frames, has fewer parameters, and converges faster, while its accuracy stays close to that of SOTA-level algorithms (+6.8 mm). The TSHDC model is deployed on the embedded platform JetsonTX2, and with TensorRT the model inference speed improves greatly (13.7 times) with only a small loss of accuracy (5%). Comprehensive experimental results on representative benchmarks show that our method outperforms the state-of-the-art methods in quantitative and qualitative evaluation.
Keywords: 3D human pose estimation; self-attention mechanism; time domain convolution network; jetsonTX2; tensorRT
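
To make the 27-frame temporal receptive field concrete, the sketch below computes how many input frames a stack of stride-1 dilated 1D convolutions can see. The abstract does not state the layer configuration of TSHDC, so the kernel size of 3 and the dilations 1, 3, 9 are assumptions chosen only because they are one common way to reach 27 frames; the function name `receptive_field` is likewise illustrative.

```python
def receptive_field(kernel_size, dilations):
    """Number of input frames seen by one output of a stack of
    stride-1 dilated 1D convolutions that share one kernel size."""
    field = 1  # a single output frame with no layers sees one input frame
    for dilation in dilations:
        # each layer widens the window by (kernel_size - 1) * dilation frames
        field += (kernel_size - 1) * dilation
    return field


# Assumed configuration (not taken from the paper): kernel size 3,
# dilations growing by a factor of 3 per layer.
frames = receptive_field(kernel_size=3, dilations=[1, 3, 9])
print(frames)  # 1 + 2*1 + 2*3 + 2*9 = 27
```

Under the same assumptions, appending a fourth layer with dilation 27 would widen the window to 81 frames, which illustrates how quickly dilation grows temporal context with few extra parameters.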

Share and Cite

MDPI and ACS Style

Zhang, L.; Dai, S.; She, L.; Huo, S. Human Pose Intelligent Detection Algorithm Based on Spatiotemporal Hybrid Dilated Convolution Model. Electronics 2025, 14, 4798. https://doi.org/10.3390/electronics14244798

Chicago/Turabian Style

Zhang, Lili, Shenxi Dai, Lihuang She, and Shuwei Huo. 2025. "Human Pose Intelligent Detection Algorithm Based on Spatiotemporal Hybrid Dilated Convolution Model" Electronics 14, no. 24: 4798. https://doi.org/10.3390/electronics14244798
