Abstract: Deep convolutional neural networks (CNNs) have made remarkable progress on the human pose estimation task. However, there is no explicit understanding of how the locations of body keypoints are predicted by a CNN, and it is also unknown what spatial dependency relationships between structural variables are learned by the model. To explore these questions, we construct an explainable model named TransPose based on the Transformer architecture and low-level convolutional blocks. Given an image, the attention layers built into the Transformer can capture long-range spatial relationships between keypoints and explain which dependencies the predicted keypoint locations rely on. We analyze the rationality of using attention as an explanation to reveal the spatial dependencies in this task. The revealed dependencies are image-specific and vary across keypoint types, layer depths, and trained models. Experiments show that TransPose predicts keypoint positions accurately. It achieves state-of-the-art performance on the COCO dataset, while being more interpretable, lightweight, and efficient than mainstream fully convolutional architectures.
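To make the described pipeline concrete, below is a minimal PyTorch sketch of a TransPose-style model: low-level convolutional blocks extract a feature map, a Transformer encoder applies self-attention over the flattened spatial positions, and a head outputs one heatmap per keypoint. All module names, layer sizes, and hyperparameters here are illustrative assumptions rather than the paper's exact configuration, and positional encodings are omitted for brevity.

```python
# Illustrative sketch of a TransPose-style model (assumed details,
# not the paper's exact configuration).
import torch
import torch.nn as nn

class TransPoseSketch(nn.Module):
    def __init__(self, num_keypoints=17, d_model=256, num_layers=4, nhead=8):
        super().__init__()
        # Low-level convolutional blocks: extract a feature map from the image.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, d_model, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(d_model),
            nn.ReLU(inplace=True),
        )
        # Transformer encoder: self-attention over flattened spatial positions
        # captures long-range dependencies between image locations.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Head: one heatmap channel per keypoint.
        self.head = nn.Conv2d(d_model, num_keypoints, kernel_size=1)

    def forward(self, x):
        f = self.backbone(x)                   # (B, C, H, W)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, H*W, C): one token per position
        # The attention maps computed inside these layers are the quantities
        # inspected as explanations of the spatial dependencies.
        tokens = self.encoder(tokens)
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.head(f)                    # (B, K, H, W) keypoint heatmaps

heatmaps = TransPoseSketch()(torch.randn(1, 3, 256, 192))
print(heatmaps.shape)  # torch.Size([1, 17, 64, 48])
```

Because each token corresponds to a spatial position, the per-layer attention weights directly indicate which image locations a predicted keypoint's activation attends to, which is what makes the dependencies inspectable.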