Abstract:We introduce D$^3$-Human, a method for reconstructing Dynamic Disentangled Digital Human geometry from monocular videos. Past monocular video human reconstruction primarily focuses on reconstructing undecoupled clothed human bodies or only reconstructing clothing, making it difficult to apply directly in applications such as animation production. The challenge in reconstructing decoupled clothing and body lies in the occlusion caused by clothing over the body. To this end, the details of the visible area and the plausibility of the invisible area must be ensured during the reconstruction process. Our proposed method combines explicit and implicit representations to model the decoupled clothed human body, leveraging the robustness of explicit representations and the flexibility of implicit representations. Specifically, we reconstruct the visible region as SDF and propose a novel human manifold signed distance field (hmSDF) to segment the visible clothing and visible body, and then merge the visible and invisible body. Extensive experimental results demonstrate that, compared with existing reconstruction schemes, D$^3$-Human can achieve high-quality decoupled reconstruction of the human body wearing different clothing, and can be directly applied to clothing transfer and animation.
Abstract:In this paper, we introduce Neural-ABC, a novel parametric model based on neural implicit functions that can represent clothed human bodies with disentangled latent spaces for identity, clothing, shape, and pose. Traditional mesh-based representations struggle to represent articulated bodies with clothes due to the diversity of human body shapes and clothing styles, as well as the complexity of poses. Our proposed model provides a unified framework for parametric modeling, which can represent the identity, clothing, shape and pose of the clothed human body. Our proposed approach utilizes the power of neural implicit functions as the underlying representation and integrates well-designed structures to meet the necessary requirements. Specifically, we represent the underlying body as a signed distance function and clothing as an unsigned distance function, and they can be uniformly represented as unsigned distance fields. Different types of clothing do not require predefined topological structures or classifications, and can follow changes in the underlying body to fit the body. Additionally, we construct poses using a controllable articulated structure. The model is trained on both open and newly constructed datasets, and our decoupling strategy is carefully designed to ensure optimal performance. Our model excels at disentangling clothing and identity in different shape and poses while preserving the style of the clothing. We demonstrate that Neural-ABC fits new observations of different types of clothing. Compared to other state-of-the-art parametric models, Neural-ABC demonstrates powerful advantages in the reconstruction of clothed human bodies, as evidenced by fitting raw scans, depth maps and images. We show that the attributes of the fitted results can be further edited by adjusting their identities, clothing, shape and pose codes.