Title: DDT: A Diffusion-Driven Transformer-based Framework for Human Mesh Recovery from a Video

Mar 29, 2023

Ce Zheng, Guo-Jun Qi, Chen Chen

Figures 1-4 for DDT: A Diffusion-Driven Transformer-based Framework for Human Mesh Recovery from a Video

Abstract: Human mesh recovery (HMR) provides rich human body information for various real-world applications such as gaming, human-computer interaction, and virtual reality. Compared to single image-based methods, video-based methods can utilize temporal information to further improve performance by incorporating human body motion priors. However, many-to-many approaches such as VIBE suffer from poor motion smoothness and temporal inconsistency, while many-to-one approaches such as TCMR and MPS-Net rely on future frames, which makes them non-causal and time-inefficient during inference. To address these challenges, a novel Diffusion-Driven Transformer-based framework (DDT) for video-based HMR is presented. DDT is designed to decode specific motion patterns from the input sequence, enhancing motion smoothness and temporal consistency. As a many-to-many approach, the decoder of DDT outputs the human mesh for all frames, making DDT more viable for real-world applications where time efficiency is crucial and a causal model is desired. Extensive experiments on widely used datasets (Human3.6M, MPI-INF-3DHP, and 3DPW) demonstrate the effectiveness and efficiency of DDT.
