Abstract:We present SplatFace, a novel Gaussian splatting framework designed for 3D human face reconstruction without reliance on accurate pre-determined geometry. Our method is designed to simultaneously deliver both high-quality novel view rendering and accurate 3D mesh reconstructions. We incorporate a generic 3D Morphable Model (3DMM) to provide a surface geometric structure, making it possible to reconstruct faces with a limited set of input images. We introduce a joint optimization strategy that refines both the Gaussians and the morphable surface through a synergistic non-rigid alignment process. A novel distance metric, splat-to-surface, is proposed to improve alignment by considering both the Gaussian position and covariance. The surface information is also utilized to incorporate a world-space densification process, resulting in superior reconstruction quality. Our experimental analysis demonstrates that the proposed method is competitive with both other Gaussian splatting techniques in novel view synthesis and other 3D reconstruction methods in producing 3D face meshes with high geometric precision.
Abstract:Existing methods for 3D face reconstruction from a few casually captured images employ deep learning based models along with a 3D Morphable Model(3DMM) as face geometry prior. Structure From Motion(SFM), followed by Multi-View Stereo (MVS), on the other hand, uses dozens of high-resolution images to reconstruct accurate 3D faces.However, it produces noisy and stretched-out results with only two views available. In this paper, taking inspiration from both these methods, we propose an end-to-end pipeline that disjointly solves for pose and shape to make the optimization stable and accurate. We use a face shape prior to estimate face pose and use stereo matching followed by a 3DMM to solve for the shape. The proposed method achieves end-to-end topological consistency, enables iterative face pose refinement procedure, and show remarkable improvement on both quantitative and qualitative results over existing state-of-the-art methods.
Abstract:In this paper, we introduce PeerGAN, a generative adversarial network (GAN) solution to improve the stability of the generated samples and to mitigate mode collapse. Built upon the Vanilla GAN's two-player game between the discriminator $D_1$ and the generator $G$, we introduce a peer discriminator $D_2$ to the min-max game. Similar to previous work using two discriminators, the first role of both $D_1$, $D_2$ is to distinguish between generated samples and real ones, while the generator tries to generate high-quality samples that are able to fool both discriminators. Different from existing methods, we introduce another game between $D_1$ and $D_2$ to discourage their agreement and therefore increase the level of diversity of the generated samples. This property helps avoid early mode collapse by preventing $D_1$ and $D_2$ from converging too fast. We provide theoretical analysis for the equilibrium of the min-max game formed among $G, D_1, D_2$. We offer convergence behavior of PeerGAN as well as stability of the min-max game. It's worth mentioning that PeerGAN operates in the unsupervised setting, and the additional game between $D_1$ and $D_2$ does not need any label supervision. Experiments results on a synthetic dataset and on real-world image datasets (MNIST, Fashion MNIST, CIFAR-10, STL-10, CelebA, VGG) demonstrate that PeerGAN outperforms competitive baseline work in generating diverse and high-quality samples, while only introduces negligible computation cost.