Open Access Article
Human Pose Intelligent Detection Algorithm Based on Spatiotemporal Hybrid Dilated Convolution Model
by Lili Zhang, Shenxi Dai *, Lihuang She and Shuwei Huo
School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(24), 4798; https://doi.org/10.3390/electronics14244798 (registering DOI)
Submission received: 1 November 2025 / Revised: 30 November 2025 / Accepted: 3 December 2025 / Published: 5 December 2025
Abstract
Three-dimensional human pose estimation (3D HPE) converts an input image or video into the 3D coordinates of the keypoints of the human body. At present, the mainstream approach to 3D HPE takes a 2D pose estimation result as an intermediate representation and then regresses the 3D pose from it. The main difficulty of this approach is how to effectively extract features from the 2D joint points and regress them to 3D coordinates in a highly nonlinear 3D space. In this paper, we propose a new algorithm, called TSHDC, that addresses this difficulty by considering both the temporal and spatial characteristics of human joint points. By introducing the self-attention mechanism and the temporal convolutional network (TCN) into the 3D HPE task, the model needs a temporal receptive field of only 27 frames, which gives it fewer parameters and faster convergence while its accuracy remains close to that of SOTA-level algorithms (+6.8 mm). The TSHDC model is deployed on the embedded platform Jetson TX2; with TensorRT, model inference is 13.7 times faster at the cost of only a small loss of accuracy (5%). Comprehensive experimental results on representative benchmarks show that our method outperforms the state-of-the-art methods in quantitative and qualitative evaluation.
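As a rough illustration of how a temporal receptive field of 27 frames can arise in a dilated TCN, the sketch below computes the receptive field of a stack of stride-1 dilated 1D convolutions. The kernel size of 3 and the dilation schedule (1, 3, 9) are assumptions chosen only because they happen to yield 27 frames; the abstract does not state the layer configuration that TSHDC actually uses, and the function name `receptive_field` is hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Number of input frames seen by one output of stacked,
    stride-1 dilated 1D convolutions."""
    rf = 1
    for d in dilations:
        # Each layer widens the window by (kernel_size - 1) * dilation frames.
        rf += (kernel_size - 1) * d
    return rf


# Assumed configuration for illustration, not taken from the paper.
frames = receptive_field(kernel_size=3, dilations=[1, 3, 9])
print(frames)  # 27
```

With these assumed values, each layer adds 2, 6, and 18 frames to the initial single frame, so three layers already cover 27 frames of input.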