Giljoo Nam

3DGH: 3D Head Generation with Composable Hair and Face

Jun 25, 2025

Chengan He, Junxuan Li, Tobias Kirschstein, Artem Sevastopolsky, Shunsuke Saito, Qingyang Tan, Javier Romero, Chen Cao, Holly Rushmeier, Giljoo Nam

Abstract: We present 3DGH, an unconditional generative model for 3D human heads with composable hair and face components. Unlike previous work that entangles the modeling of hair and face, we propose to separate them using a novel data representation with template-based 3D Gaussian Splatting, in which deformable hair geometry is introduced to capture the geometric variations across different hairstyles. Based on this data representation, we design a 3D GAN-based architecture with dual generators and employ a cross-attention mechanism to model the inherent correlation between hair and face. The model is trained on synthetic renderings using carefully designed objectives to stabilize training and facilitate hair-face separation. We conduct extensive experiments to validate the design choices of 3DGH, and evaluate it both qualitatively and quantitatively by comparing with several state-of-the-art 3D GAN methods, demonstrating its effectiveness in unconditional full-head image synthesis and composable 3D hairstyle editing. More details will be available on our project page: https://c-he.github.io/projects/3dgh/.

* Accepted to SIGGRAPH 2025. Project page: https://c-he.github.io/projects/3dgh/
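
The 3DGH abstract above mentions a cross-attention mechanism linking the hair and face generators. The following is a minimal sketch of generic scaled dot-product cross-attention, not the paper's architecture. The function names, the projection matrices `w_q`, `w_k`, `w_v`, and the choice of hair features as queries attending to face features are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, w_q, w_k, w_v):
    """Generic single-head cross-attention (illustrative only).

    queries: (Nq, D) features of one branch (assumed here: hair).
    context: (Nc, D) features of the other branch (assumed here: face).
    Returns the attended features (Nq, Dv) and the attention weights (Nq, Nc).
    """
    q = queries @ w_q                      # (Nq, d)
    k = context @ w_k                      # (Nc, d)
    v = context @ w_v                      # (Nc, Dv)
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return weights @ v, weights
```

Each query row receives a convex combination of the context values, which is one common way to let one branch condition on the other.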

LRSLAM: Low-rank Representation of Signed Distance Fields in Dense Visual SLAM System

Jun 12, 2025

Hongbeen Park, Minjeong Park, Giljoo Nam, Jinkyu Kim

Abstract: Simultaneous Localization and Mapping (SLAM) has been crucial across various domains, including autonomous driving, mobile robotics, and mixed reality. Dense visual SLAM, leveraging RGB-D camera systems, offers advantages but faces challenges in achieving real-time performance, robustness, and scalability for large-scale scenes. Recent approaches utilizing neural implicit scene representations show promise but suffer from high computational costs and memory requirements. ESLAM introduced a plane-based tensor decomposition but still struggled with memory growth. Addressing these challenges, we propose a more efficient visual SLAM model, called LRSLAM, utilizing low-rank tensor decomposition methods. Our approach, leveraging the Six-axis and CP decompositions, achieves better convergence rates, memory efficiency, and reconstruction/localization quality than existing state-of-the-art approaches. Evaluation across diverse indoor RGB-D datasets demonstrates LRSLAM's superior performance in terms of parameter efficiency, processing time, and accuracy, retaining reconstruction and localization quality. Our code will be publicly available upon publication.

* Accepted at ECCV 2024
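
To illustrate why a low-rank representation can save memory, here is a minimal sketch of a generic CP (CANDECOMP/PARAFAC) factorization of a 3D grid. It is not LRSLAM's implementation: the function names, grid size, and rank are hypothetical, and the Six-axis decomposition mentioned in the abstract is not covered.

```python
import numpy as np

def cp_reconstruct(a, b, c):
    """Rebuild a dense (X, Y, Z) grid from CP factors.

    a, b, c have shapes (X, R), (Y, R), (Z, R); the grid is the sum of
    R rank-one outer products a[:, r] x b[:, r] x c[:, r].
    """
    return np.einsum("xr,yr,zr->xyz", a, b, c)

def cp_query(a, b, c, i, j, k):
    """Evaluate a single grid cell without materializing the dense grid."""
    return float(np.sum(a[i] * b[j] * c[k]))

def parameter_counts(shape, rank):
    """Return (dense parameter count, CP parameter count) for a 3D grid."""
    x, y, z = shape
    return x * y * z, rank * (x + y + z)
```

For a hypothetical 64 x 64 x 64 grid at rank 8, `parameter_counts((64, 64, 64), 8)` gives 262144 dense values versus 1536 factor values, at the cost of only representing grids that are well approximated at that rank.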

LUCAS: Layered Universal Codec Avatars

Feb 27, 2025

Di Liu, Teng Deng, Giljoo Nam, Yu Rong, Stanislav Pidhorskyi, Junxuan Li, Jason Saragih, Dimitris N. Metaxas, Chen Cao

Abstract: Photorealistic 3D head avatar reconstruction faces critical challenges in modeling dynamic face-hair interactions and achieving cross-identity generalization, particularly during expressions and head movements. We present LUCAS, a novel Universal Prior Model (UPM) for codec avatar modeling that disentangles face and hair through a layered representation. Unlike previous UPMs that treat hair as an integral part of the head, our approach separates the modeling of the hairless head and hair into distinct branches. LUCAS is the first to introduce a mesh-based UPM, facilitating real-time rendering on devices. Our layered representation also improves the anchor geometry for precise and visually appealing Gaussian renderings. Experimental results indicate that LUCAS outperforms existing single-mesh and Gaussian-based avatar models in both quantitative and qualitative assessments, including evaluations on held-out subjects in zero-shot driving scenarios. LUCAS demonstrates superior dynamic performance in managing head pose changes, expression transfer, and hairstyle variations, thereby advancing the state-of-the-art in 3D head avatar reconstruction.

InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion

Mar 26, 2024

Jihyun Lee, Shunsuke Saito, Giljoo Nam, Minhyuk Sung, Tae-Kyun Kim

Abstract: We present InterHandGen, a novel framework that learns the generative prior of two-hand interaction. Sampling from our model yields plausible and diverse two-hand shapes in close interaction with or without an object. Our prior can be incorporated into any optimization or learning method to reduce ambiguity in an ill-posed setup. Our key observation is that directly modeling the joint distribution of multiple instances imposes high learning complexity due to its combinatorial nature. Thus, we propose to decompose the modeling of the joint distribution into the modeling of factored unconditional and conditional single-instance distributions. In particular, we introduce a diffusion model that learns the single-hand distribution both unconditionally and conditioned on the other hand via conditioning dropout. For sampling, we combine anti-penetration and classifier-free guidance to enable plausible generation. Furthermore, we establish a rigorous evaluation protocol for two-hand synthesis, where our method significantly outperforms baseline generative models in terms of plausibility and diversity. We also demonstrate that our diffusion prior can boost the performance of two-hand reconstruction from monocular in-the-wild images, achieving new state-of-the-art accuracy.

* Accepted to CVPR 2024, project page: https://jyunlee.github.io/projects/interhandgen/
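
The InterHandGen abstract combines conditioning dropout at training time with classifier-free guidance at sampling time. The sketch below shows the generic form of both ideas, not the paper's code: the function names, the null-condition placeholder, and the guidance-scale convention (`w = 0` unconditional, `w = 1` purely conditional) are assumptions, and the anti-penetration guidance term is not included.

```python
import numpy as np

def drop_condition(cond, null_cond, p_drop, rng):
    """Training-time conditioning dropout.

    With probability p_drop the condition is replaced by a null token, so a
    single network learns both the conditional and unconditional predictions.
    """
    return null_cond if rng.random() < p_drop else cond

def guided_noise(eps_uncond, eps_cond, w):
    """Classifier-free guidance on predicted noise (one common convention).

    Extrapolates from the unconditional prediction toward the conditional one.
    """
    return eps_uncond + w * (eps_cond - eps_uncond)
```

Under this convention, values of `w` above 1 push samples further toward the condition than the conditional prediction alone.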

A Local Appearance Model for Volumetric Capture of Diverse Hairstyle

Dec 14, 2023

Ziyan Wang, Giljoo Nam, Aljaz Bozic, Chen Cao, Jason Saragih, Michael Zollhoefer, Jessica Hodgins

Abstract: Hair plays a significant role in personal identity and appearance, making it an essential component of high-quality, photorealistic avatars. Existing approaches either focus on modeling the facial region only or rely on personalized models, limiting their generalizability and scalability. In this paper, we present a novel method for creating high-fidelity avatars with diverse hairstyles. Our method leverages the local similarity across different hairstyles and learns a universal hair appearance prior from multi-view captures of hundreds of people. This prior model takes 3D-aligned features as input and generates dense radiance fields conditioned on a sparse point cloud with color. As our model splits different hairstyles into local primitives and builds a prior at that level, it is capable of handling various hair topologies. Through experiments, we demonstrate that our model captures a diverse range of hairstyles and generalizes well to challenging new hairstyles. Empirical results show that our method improves on state-of-the-art approaches in capturing and generating photorealistic, personalized avatars with complete hair.

Relightable Gaussian Codec Avatars

Dec 06, 2023

Shunsuke Saito, Gabriel Schwartz, Tomas Simon, Junxuan Li, Giljoo Nam

Abstract: The fidelity of relighting is bounded by both geometry and appearance representations. For geometry, both mesh and volumetric approaches have difficulty modeling intricate structures like 3D hair geometry. For appearance, existing relighting models are limited in fidelity and often too slow to render in real-time with high-resolution continuous environments. In this work, we present Relightable Gaussian Codec Avatars, a method to build high-fidelity relightable head avatars that can be animated to generate novel expressions. Our geometry model based on 3D Gaussians can capture 3D-consistent sub-millimeter details such as hair strands and pores on dynamic face sequences. To support diverse materials of human heads such as the eyes, skin, and hair in a unified manner, we present a novel relightable appearance model based on learnable radiance transfer. Together with global illumination-aware spherical harmonics for the diffuse components, we achieve real-time relighting with spatially all-frequency reflections using spherical Gaussians. This appearance model can be efficiently relit under both point light and continuous illumination. We further improve the fidelity of eye reflections and enable explicit gaze control by introducing relightable explicit eye models. Our method outperforms existing approaches without compromising real-time performance. We also demonstrate real-time relighting of avatars on a tethered consumer VR headset, showcasing the efficiency and fidelity of our avatars.

* Website: https://shunsukesaito.github.io/rgca/

Differentiable Display Photometric Stereo

Jun 28, 2023

Seokjun Choi, Seungwoo Yoon, Giljoo Nam, Seungyong Lee, Seung-Hwan Baek

Abstract: Photometric stereo leverages variations in illumination conditions to reconstruct per-pixel surface normals. The concept of display photometric stereo, which employs a conventional monitor as an illumination source, has the potential to overcome limitations often encountered in bulky and difficult-to-use conventional setups. In this paper, we introduce Differentiable Display Photometric Stereo (DDPS), a method designed to achieve high-fidelity normal reconstruction using an off-the-shelf monitor and camera. DDPS addresses a critical yet often neglected challenge in photometric stereo: the optimization of display patterns for enhanced normal reconstruction. We present a differentiable framework that couples basis-illumination image formation with a photometric-stereo reconstruction method. This facilitates the learning of display patterns that lead to high-quality normal reconstruction through automatic differentiation. Addressing the synthetic-real domain gap inherent in end-to-end optimization, we propose the use of a real-world photometric-stereo training dataset composed of 3D-printed objects. Moreover, to reduce the ill-posed nature of photometric stereo, we exploit the linearly polarized light emitted from the monitor to optically separate diffuse and specular reflections in the captured images. We demonstrate that DDPS allows for learning display patterns optimized for a target configuration and is robust to initialization. We assess DDPS on 3D-printed objects with ground-truth normals and diverse real-world objects, validating that DDPS enables effective photometric-stereo reconstruction.
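
For readers unfamiliar with the reconstruction step that photometric stereo relies on, here is a minimal sketch of the classical Lambertian least-squares formulation with known, distant, directional lights. It is a textbook baseline, not DDPS: the function name and array layout are hypothetical, and the learned display patterns and polarization-based separation described in the abstract are not modeled.

```python
import numpy as np

def lambertian_normals(intensities, light_dirs):
    """Classical Lambertian photometric stereo via least squares.

    intensities: (K, P) observed intensity of P pixels under K lights.
    light_dirs:  (K, 3) unit light directions (assumed known and distant).
    Solves light_dirs @ g = intensities for g = albedo * normal per pixel.
    Returns unit normals (P, 3) and albedo (P,).
    """
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, P)
    albedo = np.linalg.norm(g, axis=0)
    normals = (g / np.maximum(albedo, 1e-12)).T
    return normals, albedo
```

At least three non-coplanar light directions are needed for the system to be well posed; shadows and specular highlights violate the Lambertian assumption and are ignored here.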

Event Fusion Photometric Stereo Network

Mar 11, 2023

Wonjeong Ryoo, Giljoo Nam, Jae-Sang Hyun, Sangpil Kim

Abstract: We present a novel method to estimate the surface normal of an object in an ambient-light environment using RGB and event cameras. Modern photometric stereo methods rely on an RGB camera, mainly in a dark room, to avoid ambient illumination. To alleviate the limitations of the darkroom environment and to use essential light information, we employ an event camera with a high dynamic range and low latency. This is the first study that uses an event camera for the photometric stereo task, and it works with continuous light sources in ambient-light environments. In this work, we also curate a novel photometric stereo dataset that is constructed by capturing objects with event and RGB cameras under numerous ambient lighting conditions. Additionally, we propose a novel framework named Event Fusion Photometric Stereo Network (EFPS-Net), which estimates the surface normals of an object using both RGB frames and event signals. Our proposed method interpolates event observation maps that generate light information with sparse event signals to acquire fluent light information. Subsequently, the event-interpolated observation maps are fused with the RGB observation maps. Our numerous experiments showed that EFPS-Net outperforms state-of-the-art methods on a dataset captured in the real world where ambient light is present. Consequently, we demonstrate that incorporating additional modalities with EFPS-Net alleviates the limitations caused by ambient illumination.

* 33 pages, 11 figures

NeuWigs: A Neural Dynamic Model for Volumetric Hair Capture and Animation

Dec 08, 2022

Ziyan Wang, Giljoo Nam, Tuur Stuyck, Stephen Lombardi, Chen Cao, Jason Saragih, Michael Zollhoefer, Jessica Hodgins, Christoph Lassner

Abstract: The capture and animation of human hair are two of the major challenges in the creation of realistic avatars for virtual reality. Both problems are highly challenging, because hair has complex geometry and appearance and exhibits challenging motion. In this paper, we present a two-stage approach that models hair independently from the head to address these challenges in a data-driven manner. The first stage, state compression, learns a low-dimensional latent space of 3D hair states containing motion and appearance, via a novel autoencoder-as-a-tracker strategy. To better disentangle the hair and head in appearance learning, we employ multi-view hair segmentation masks in combination with a differentiable volumetric renderer. The second stage learns a novel hair dynamics model that performs temporal hair transfer based on the discovered latent codes. To enforce higher stability while driving our dynamics model, we employ the 3D point-cloud autoencoder from the compression stage for de-noising of the hair state. Our model outperforms the state of the art in novel view synthesis and is capable of creating novel hair animations without having to rely on hair observations as a driving signal. Project page: https://ziyanw1.github.io/neuwigs/

Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images

Jul 28, 2022

Radu Alexandru Rosu, Shunsuke Saito, Ziyan Wang, Chenglei Wu, Sven Behnke, Giljoo Nam

Abstract: We present Neural Strands, a novel learning framework for modeling accurate hair geometry and appearance from multi-view image inputs. The learned hair model can be rendered in real-time from any viewpoint with high-fidelity view-dependent effects. Our model achieves intuitive shape and style control unlike volumetric counterparts. To enable these properties, we propose a novel hair representation based on a neural scalp texture that encodes the geometry and appearance of individual strands at each texel location. Furthermore, we introduce a novel neural rendering framework based on rasterization of the learned hair strands. Our neural rendering is strand-accurate and anti-aliased, making the rendering view-consistent and photorealistic. Combining appearance with a multi-view geometric prior, we enable, for the first time, the joint learning of appearance and explicit hair geometry from a multi-view setup. We demonstrate the efficacy of our approach in terms of fidelity and efficiency for various hairstyles.

* ECCV 2022. Project page: https://radualexandru.github.io/neural_strands/
