Human pose estimation is a complicated structured-data sequence modeling task. Most existing methods consider only the pair-wise interactions of human body joints in model learning. Unfortunately, this causes 3D pose estimation to fail in difficult cases such as $\textit{overlapping joints}$ and $\textit{fast-changing poses}$, since pair-wise relations cannot exploit fine-grained human body priors. To this end, we revamp the 3D pose estimation framework with a $\textit{High-order}$ $\textit{Directed}$ $\textit{Transformer}$ (HDFormer), which coherently exploits high-order bone and joint relations to boost pose estimation performance. Specifically, HDFormer adopts both self-attention and high-order attention schemes to build a multi-order attention module that models the information flow across first-order "$\textit{joint$\leftrightarrow$joint}$", second-order "$\textit{bone$\leftrightarrow$joint}$", and high-order "$\textit{hyperbone$\leftrightarrow$joint}$" relationships (a hyperbone is defined as a set of joints), compensating for hard-case predictions in fast-changing and heavily occluded scenarios. Moreover, modernized CNN techniques are applied to upgrade the transformer-based architecture and speed up HDFormer, achieving a favorable trade-off between effectiveness and efficiency. We compare our model with other SOTA models on the Human3.6M and MPI-INF-3DHP datasets. The results demonstrate that the proposed HDFormer achieves superior performance with only $\textbf{1/10}$ of the parameters and a much lower computational cost than current SOTAs. Moreover, HDFormer can be applied to various real-world applications, enabling real-time and accurate 3D pose estimation. The source code is available at https://github.com/hyer/HDFormer.
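To make the multi-order interaction concrete, the sketch below shows one minimal way a joint-feature tensor could attend to derived bone and hyperbone tokens, in the spirit of the "joint$\leftrightarrow$joint", "bone$\leftrightarrow$joint", and "hyperbone$\leftrightarrow$joint" relationships described above. It is an assumption-laden illustration, not the authors' implementation: the module name, the skeleton edges, and the hyperbone groupings are all hypothetical.

```python
# Illustrative sketch only: a minimal multi-order attention block.
# The skeleton edges and hyperbone joint sets below are assumptions
# for demonstration, not the HDFormer reference implementation.
import torch
import torch.nn as nn

# Hypothetical skeleton: (parent, child) joint index pairs defining bones (17 joints).
BONES = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5), (5, 6),
         (0, 7), (7, 8), (8, 9), (9, 10), (8, 11), (11, 12),
         (12, 13), (8, 14), (14, 15), (15, 16)]
# Hypothetical hyperbones: each is a joint set (e.g. a limb or the torso).
HYPERBONES = [[0, 1, 2, 3], [0, 4, 5, 6], [0, 7, 8, 9, 10],
              [8, 11, 12, 13], [8, 14, 15, 16]]


class MultiOrderAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        # First order: self-attention among joint tokens.
        self.joint_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Second / high order: joint tokens attend to bone and hyperbone tokens.
        self.bone_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.hyper_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, joints):  # joints: (B, num_joints, dim)
        # Bone tokens: difference of the two joint features along each bone.
        parents = torch.tensor([p for p, _ in BONES], device=joints.device)
        children = torch.tensor([c for _, c in BONES], device=joints.device)
        bones = joints[:, children] - joints[:, parents]            # (B, num_bones, dim)
        # Hyperbone tokens: mean-pool the features of each joint set.
        hypers = torch.stack(
            [joints[:, idx].mean(dim=1) for idx in HYPERBONES], dim=1)  # (B, num_hyper, dim)

        x = joints
        x = x + self.joint_attn(x, x, x, need_weights=False)[0]          # joint <-> joint
        x = x + self.bone_attn(x, bones, bones, need_weights=False)[0]   # bone -> joint
        x = x + self.hyper_attn(x, hypers, hypers, need_weights=False)[0]  # hyperbone -> joint
        return self.norm(x)


if __name__ == "__main__":
    feats = torch.randn(2, 17, 64)      # batch of 2 poses, 17 joints, 64-d features
    out = MultiOrderAttention()(feats)
    print(out.shape)                    # torch.Size([2, 17, 64])
```

In this sketch, the higher-order tokens act purely as keys and values for the joint queries; how the actual model constructs and directs these interactions is detailed in the paper and the released code.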