Timothy Langlois

How to Train Your Dragon: Automatic Diffusion-Based Rigging for Characters with Diverse Topologies

Mar 19, 2025

Zeqi Gu, Difan Liu, Timothy Langlois, Matthew Fisher, Abe Davis

Abstract: Recent diffusion-based methods have achieved impressive results on animating images of human subjects. However, most of that success has built on human-specific body pose representations and extensive training with labeled real videos. In this work, we extend the ability of such models to animate images of characters with more diverse skeletal topologies. Given a small number (3-5) of example frames showing the character in different poses with corresponding skeletal information, our model quickly infers a rig for that character that can generate images corresponding to new skeleton poses. We propose a procedural data generation pipeline that efficiently samples training data with diverse topologies on the fly. We use it, along with a novel skeleton representation, to train our model on articulated shapes spanning a large space of textures and topologies. Then during fine-tuning, our model rapidly adapts to unseen target characters and generalizes well to rendering new poses, both for realistic and more stylized cartoon appearances. To better evaluate performance on this novel and challenging task, we create the first 2D video dataset that contains both humanoid and non-humanoid subjects with per-frame keypoint annotations. With extensive experiments, we demonstrate the superior quality of our results. Project page: https://traindragondiffusion.github.io/

* Accepted to Eurographics 2025

Self-Supervised Generation of Spatial Audio for 360 Video

Sep 07, 2018

Pedro Morgado, Nuno Vasconcelos, Timothy Langlois, Oliver Wang

Abstract: We introduce an approach to convert mono audio recorded by a 360 video camera into spatial audio, a representation of the distribution of sound over the full viewing sphere. Spatial audio is an important component of immersive 360 video viewing, but spatial audio microphones are still rare in current 360 video production. Our system consists of end-to-end trainable neural networks that separate individual sound sources and localize them on the viewing sphere, conditioned on multi-modal analysis of audio and 360 video frames. We introduce several datasets, including one filmed ourselves, and one collected in-the-wild from YouTube, consisting of 360 videos uploaded with spatial audio. During training, ground-truth spatial audio serves as self-supervision and a mixed-down mono track forms the input to our network. Using our approach, we show that it is possible to infer the spatial location of sound sources based only on 360 video and a mono audio track.

* To appear in NIPS 2018
