Barbara Solenthaler

ETH Zurich, TUM - Institute for Advanced Study

Joint Learning of Depth and Appearance for Portrait Image Animation

Jan 15, 2025

Xinya Ji, Gaspard Zoss, Prashanth Chandran, Lingchen Yang, Xun Cao, Barbara Solenthaler, Derek Bradley

Abstract: 2D portrait animation has experienced significant advancements in recent years. Much research has utilized the prior knowledge embedded in large generative diffusion models to enhance high-quality image manipulation. However, most methods only focus on generating RGB images as output, and the co-generation of consistent visual plus 3D output remains largely under-explored. In our work, we propose to jointly learn visual appearance and depth in a diffusion-based portrait image generator. Our method embraces the end-to-end diffusion paradigm and introduces a new architecture suitable for learning this conditional joint distribution, consisting of a reference network and a channel-expanded diffusion backbone. Once trained, our framework can be efficiently adapted to various downstream applications, such as facial depth-to-image and image-to-depth generation, portrait relighting, and audio-driven talking head animation with consistent 3D output.

Learning a Generalized Physical Face Model From Data

Feb 29, 2024

Lingchen Yang, Gaspard Zoss, Prashanth Chandran, Markus Gross, Barbara Solenthaler, Eftychios Sifakis, Derek Bradley

Abstract: Physically-based simulation is a powerful approach for 3D facial animation as the resulting deformations are governed by physical constraints, making it easy to resolve self-collisions, respond to external forces and perform realistic anatomy edits. Today's methods are data-driven, where the actuations for finite elements are inferred from captured skin geometry. Unfortunately, these approaches have not been widely adopted due to the complexity of initializing the material space and learning the deformation model for each character separately, which often requires a skilled artist followed by lengthy network training. In this work, we aim to make physics-based facial animation more accessible by proposing a generalized physical face model that we learn from a large 3D face dataset in a simulation-free manner. Once trained, our model can be quickly fit to any unseen identity and produce a ready-to-animate physical face model automatically. Fitting is as easy as providing a single 3D face scan, or even a single face image. After fitting, we offer intuitive animation controls, as well as the ability to retarget animations across characters. All the while, the resulting animations allow for physical effects like collision avoidance, gravity, paralysis, bone reshaping and more.

An Implicit Physical Face Model Driven by Expression and Style

Jan 27, 2024

Lingchen Yang, Gaspard Zoss, Prashanth Chandran, Paulo Gotardo, Markus Gross, Barbara Solenthaler, Eftychios Sifakis, Derek Bradley

Abstract: 3D facial animation is often produced by manipulating facial deformation models (or rigs) that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression 'style', that is, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression in their own style. To date, style is usually entangled with the expression, and it is not possible to transfer the style of one character to another when considering facial animation. We present a new face model, based on a data-driven implicit neural physics model, that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. Once trained, our method allows generalized physics-based facial animation for any of the trained identities, extending to unseen performances. Furthermore, it grants control over the animation style, enabling style transfer from one character to another or blending styles of different characters. Lastly, as a physics-based model, it is capable of synthesizing physical effects, such as collision handling, setting our method apart from conventional approaches.

* Accepted to SIGGRAPH ASIA 2023. Project page: https://studios.disneyresearch.com/2023/11/29/an-implicit-physical-face-model-driven-by-expression-and-style/ Video: https://www.youtube.com/watch?v=-qM_XUv-JhA&t

Implicit Neural Representation for Physics-driven Actuated Soft Bodies

Jan 26, 2024

Lingchen Yang, Byungsoo Kim, Gaspard Zoss, Baran Gözcü, Markus Gross, Barbara Solenthaler

Abstract: Active soft bodies can affect their shape through an internal actuation mechanism that induces a deformation. Similar to recent work, this paper utilizes a differentiable, quasi-static, and physics-based simulation layer to optimize for actuation signals parameterized by neural networks. Our key contribution is a general and implicit formulation to control active soft bodies by defining a function that enables a continuous mapping from a spatial point in the material space to the actuation value. This property allows us to capture the signal's dominant frequencies, making the method discretization agnostic and widely applicable. We extend our implicit model to mandible kinematics for the particular case of facial animation and show that we can reliably reproduce facial expressions captured with high-quality capture systems. We apply the method to volumetric soft bodies, human poses, and facial expressions, demonstrating artist-friendly properties, such as simple control over the latent space and resolution invariance at test time.

* Accepted to SIGGRAPH 2022. Project page: https://studios.disneyresearch.com/2022/07/24/implicit-neural-representation-for-physics-driven-actuated-soft-bodies/ Video: https://www.youtube.com/watch?v=9EERe_CTazk
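
The "continuous mapping from a spatial point in the material space to the actuation value" can be illustrated with a small sketch. This is not the authors' network: the function `actuation`, its closed-form body, the scalar `latent` and the helper `sample_line` are all hypothetical stand-ins, chosen only to show why a function of position can be queried at any sampling resolution.

```python
import math

def actuation(point, latent):
    """Hypothetical stand-in for a trained network: maps a material-space
    point (x, y, z) and a scalar latent code to an actuation value."""
    x, y, z = point
    return latent * math.exp(-(x * x + y * y + z * z))

def sample_line(n):
    """Return n evenly spaced points on the segment (0,0,0) to (1,0,0)."""
    return [(i / (n - 1), 0.0, 0.0) for i in range(n)]

# The same function is evaluated on a coarse and a fine sampling of the
# material space; nothing in the model depends on the number of samples.
coarse = [actuation(p, 0.8) for p in sample_line(5)]
fine = [actuation(p, 0.8) for p in sample_line(9)]
```

Because the fine sampling contains every coarse sample point, the values at the shared points agree, which is the sense in which such a representation is independent of the discretization.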

Efficient Incremental Potential Contact for Actuated Face Simulation

Dec 03, 2023

Bo Li, Lingchen Yang, Barbara Solenthaler

Abstract: We present a quasi-static finite element simulator for human face animation. We model the face as an actuated soft body, which can be efficiently simulated using Projective Dynamics (PD). We adopt Incremental Potential Contact (IPC) to handle self-intersection. However, directly integrating IPC into the simulation would impede the high efficiency of the PD solver, since the stiffness matrix in the global step is no longer constant and cannot be pre-factorized. We notice that the actual number of vertices affected by the collision is only a small fraction of the whole model, and by utilizing this fact we effectively decrease the scale of the linear system to be solved. With the proposed optimization method for collision, we achieve high visual fidelity at a relatively low performance overhead.

* SIGGRAPH Asia 2023 Technical Communications

Spatially Adaptive Cloth Regression with Implicit Neural Representations

Nov 27, 2023

Lei Shu, Vinicius Azevedo, Barbara Solenthaler, Markus Gross

Abstract: The accurate representation of fine-detailed cloth wrinkles poses significant challenges in computer graphics. The inherently non-uniform structure of cloth wrinkles mandates the employment of intricate discretization strategies, which are frequently characterized by high computational demands and complex methodologies. Addressing this, the research introduced in this paper elucidates a novel anisotropic cloth regression technique that capitalizes on the potential of implicit neural representations of surfaces. Our first core contribution is an innovative mesh-free sampling approach, crafted to reduce the reliance on traditional mesh structures, thereby offering greater flexibility and accuracy in capturing fine cloth details. Our second contribution is a novel adversarial training scheme, which is designed meticulously to strike a harmonious balance between the sampling and simulation objectives. The adversarial approach ensures that the wrinkles are represented with high fidelity, while also maintaining computational efficiency. Our results showcase through various cloth-object interaction scenarios that our method, given the same memory constraints, consistently surpasses traditional discrete representations, particularly when modelling highly-detailed localized wrinkles.

* 16 pages, 13 figures

Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision

Feb 28, 2023

Erik Franz, Barbara Solenthaler, Nils Thuerey

Abstract: We address the challenging problem of jointly inferring the 3D flow and volumetric densities moving in a fluid from a monocular input video with a deep neural network. Despite the complexity of this task, we show that it is possible to train the corresponding networks without requiring any 3D ground truth for training. In the absence of ground truth data we can train our model with observations from real-world capture setups instead of relying on synthetic reconstructions. We make this unsupervised training approach possible by first generating an initial prototype volume which is then moved and transported over time without the need for volumetric supervision. Our approach relies purely on image-based losses, an adversarial discriminator network, and regularization. Our method can estimate long-term sequences in a stable manner, while closely matching the targets for inputs such as rising smoke plumes.

* ICLR 2023 poster, source code: https://github.com/tum-pbs/Neural-Global-Transport

Global Transport for Fluid Reconstruction with Learned Self-Supervision

Apr 13, 2021

Erik Franz, Barbara Solenthaler, Nils Thuerey

Abstract: We propose a novel method to reconstruct volumetric flows from sparse views via a global transport formulation. Instead of obtaining the space-time function of the observations, we reconstruct its motion based on a single initial state. In addition, we introduce a learned self-supervision that constrains observations from unseen angles. These visual constraints are coupled via the transport constraints and a differentiable rendering step to arrive at a robust end-to-end reconstruction algorithm. This makes the reconstruction of highly realistic flow motions possible, even from only a single input view. We show with a variety of synthetic and real flows that the proposed global reconstruction of the transport process yields an improved reconstruction of the fluid motion.

* CVPR 2021 oral, source code: https://github.com/tum-pbs/Global-Flow-Transport

Lagrangian Neural Style Transfer for Fluids

May 02, 2020

Byungsoo Kim, Vinicius C. Azevedo, Markus Gross, Barbara Solenthaler

Abstract: Artistically controlling the shape, motion and appearance of fluid simulations poses major challenges in visual effects production. In this paper, we present a neural style transfer approach from images to 3D fluids formulated in a Lagrangian viewpoint. Using particles for style transfer has unique benefits compared to grid-based techniques. Attributes are stored on the particles and hence are trivially transported by the particle motion. This intrinsically ensures temporal consistency of the optimized stylized structure and notably improves the resulting quality. Simultaneously, the expensive, recursive alignment of stylization velocity fields of grid approaches is unnecessary, reducing the computation time to less than an hour and rendering neural flow stylization practical in production settings. Moreover, the Lagrangian representation improves artistic control as it allows for multi-fluid stylization and consistent color transfer from images, and the generality of the method enables stylization of smoke and liquids alike.

* ACM Trans. Graph. 39, 4, Article 1 (July 2020), 10 pages
* ACM Transactions on Graphics (SIGGRAPH 2020), additional materials: http://www.byungsoo.me/project/lnst/index.html
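
The remark that attributes stored on particles "are trivially transported by the particle motion" can be shown with a minimal sketch. It is not the paper's stylization pipeline: the `Particle` class, the `style` attribute, the `constant_wind` velocity field and the forward Euler step in `advect` are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    x: float
    y: float
    style: float  # hypothetical per-particle attribute (e.g. a stylization value)

def constant_wind(x, y):
    """Hypothetical velocity field: the same velocity everywhere."""
    return (1.0, 0.5)

def advect(particles, velocity, dt):
    """Move each particle with one forward Euler step. The attribute is never
    touched: it travels with the particle because it is stored on it."""
    for p in particles:
        u, v = velocity(p.x, p.y)
        p.x += dt * u
        p.y += dt * v

particles = [Particle(0.0, 0.0, 0.2), Particle(1.0, 2.0, 0.9)]
styles_before = [p.style for p in particles]
for _ in range(10):
    advect(particles, constant_wind, dt=0.1)
```

In a grid-based setting the same attribute would have to be re-sampled or re-aligned every step; here only positions change, so the attribute stays attached to the same piece of fluid over time.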

Latent Space Subdivision: Stable and Controllable Time Predictions for Fluid Flow

Mar 12, 2020

Steffen Wiewel, Byungsoo Kim, Vinicius C. Azevedo, Barbara Solenthaler, Nils Thuerey

Abstract: We propose an end-to-end trained neural network architecture to robustly predict the complex dynamics of fluid flows with high temporal stability. We focus on single-phase smoke simulations in 2D and 3D based on the incompressible Navier-Stokes (NS) equations, which are relevant for a wide range of practical problems. To achieve stable predictions for long-term flow sequences, a convolutional neural network (CNN) is trained for spatial compression in combination with a temporal prediction network that consists of stacked Long Short-Term Memory (LSTM) layers. Our core contribution is a novel latent space subdivision (LSS) to separate the respective input quantities into individual parts of the encoded latent space domain. This makes it possible to alter the encoded quantities distinctly without interfering with the remaining latent space values and hence maximizes external control. By selectively overwriting parts of the predicted latent space points, our proposed method is capable of robustly predicting long-term sequences of complex physics problems. In addition, we highlight the benefits of a recurrent training on the latent space creation, which is performed by the spatial compression network.

* https://ge.in.tum.de/publications/latent-space-subdivision/
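
The idea of "selectively overwriting parts of the predicted latent space points" can be sketched as follows. The latent size, the split into `VELOCITY_PART` and `DENSITY_PART`, and the helper `overwrite_part` are hypothetical and chosen for illustration only; the actual layout used in the paper is not stated in the abstract.

```python
# Hypothetical subdivision of an 8-entry latent code: the first six entries
# are reserved for one quantity, the last two for another.
VELOCITY_PART = slice(0, 6)
DENSITY_PART = slice(6, 8)

def overwrite_part(z_pred, z_known, part):
    """Return a copy of the predicted latent code in which one reserved part
    is replaced by externally known values; the rest is left untouched."""
    z = list(z_pred)
    z[part] = z_known[part]
    return z

z_pred = [0.1 * i for i in range(8)]   # stand-in for a network prediction
z_known = [-1.0] * 8                   # stand-in for an encoded, user-given state
z_next = overwrite_part(z_pred, z_known, DENSITY_PART)
```

Because each quantity occupies its own slice, replacing one slice does not disturb the entries that encode the other quantity, which is what allows external control during long rollouts.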
