Abstract: Most real-time human pose estimation approaches are based on detecting joint positions. From the detected joint positions, the yaw and pitch of the limbs can be computed. However, the roll along the limb, which is critical for applications such as sports analysis and computer animation, cannot be computed because this axis of rotation remains unobserved. In this paper we therefore introduce orientation keypoints, a novel approach for estimating the full position and rotation of skeletal joints using only single-frame RGB images. Inspired by how motion-capture systems use a set of point markers to estimate full bone rotations, our method uses virtual markers to generate sufficient information to accurately infer rotations with simple post-processing. The rotation predictions improve upon the best reported mean error for joint angles by 48\% and achieve 93\% accuracy across 15 bone rotations. The method also improves the current state-of-the-art results for joint positions by 14\% as measured by MPJPE on the principal dataset, and generalizes well to in-the-wild datasets. Video available at: https://youtu.be/1EBUrfu_CaE
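
To illustrate why an extra marker makes roll observable, the following is a minimal sketch and not the paper's actual post-processing. It assumes a single hypothetical virtual marker placed to the side of the bone; the function name `bone_rotation` and the axis convention (bone along local y, marker defining local x) are assumptions made for this example only. The joint and its child fix the bone direction, and the marker fixes the rotation about that direction.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    if n < 1e-9:
        raise ValueError("degenerate vector: points coincide or are collinear")
    return tuple(x / n for x in v)

def bone_rotation(joint, child, side_marker):
    """Build a 3x3 rotation matrix (tuple of rows) for one bone.

    Assumed convention (hypothetical): the local y axis runs from the
    joint to its child, and the local x axis points toward a virtual
    marker placed to the side of the bone. The columns of the result
    are the local x, y, z axes expressed in world coordinates.
    """
    y_axis = normalize(sub(child, joint))            # bone direction: yaw and pitch
    side = sub(side_marker, joint)
    # Gram-Schmidt step: drop the component of the marker offset along the bone.
    side = sub(side, tuple(dot(side, y_axis) * c for c in y_axis))
    x_axis = normalize(side)                         # fixes the roll about the bone
    z_axis = cross(x_axis, y_axis)                   # completes a right-handed frame
    return tuple((x_axis[i], y_axis[i], z_axis[i]) for i in range(3))

# Same joint and child positions, two different marker positions:
joint, child = (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
R_rest = bone_rotation(joint, child, (1.0, 0.0, 0.0))
R_rolled = bone_rotation(joint, child, (0.0, 0.0, -1.0))
```

In this sketch `R_rest` and `R_rolled` share the same joint and child positions yet differ by a quarter turn about the bone, which is the ambiguity that joint positions alone leave unresolved.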