Topic: Pixel2Mesh

T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image

Mar 20, 2024

Shijie Zhang, Boyan Jiang, Keke He, Junwei Zhu, Ying Tai, Chengjie Wang, Yinda Zhang, Yanwei Fu

Abstract: Pixel2Mesh (P2M) is a classical approach for reconstructing 3D shapes from a single color image through coarse-to-fine mesh deformation. Although P2M is capable of generating plausible global shapes, its Graph Convolutional Network (GCN) often produces overly smooth results, losing fine-grained geometric details. Moreover, P2M generates unreliable features for occluded regions and struggles with the domain gap between synthetic data and real-world images, a common challenge for single-view 3D reconstruction methods. To address these challenges, we propose a novel Transformer-boosted architecture, named T-Pixel2Mesh, inspired by the coarse-to-fine approach of P2M. Specifically, we use a global Transformer to control the holistic shape and a local Transformer to progressively refine the local geometric details with graph-based point upsampling. To enhance real-world reconstruction, we present the simple yet effective Linear Scale Search (LSS), which serves as prompt tuning during input preprocessing. Our experiments on ShapeNet demonstrate state-of-the-art performance, while results on real-world data show the method's generalization capability.

* Received by ICASSP 2024
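
The abstract does not say what quantity LSS scales or how candidates are scored, so the following is only a generic sketch of a linear sweep over candidate scale factors, not the authors' procedure. The names `linear_scales`, `linear_scale_search`, and the `score` callback are hypothetical; in practice `score` might rescale the object in the input image and rate the resulting reconstruction.

```python
from typing import Callable, List, Tuple


def linear_scales(low: float, high: float, steps: int) -> List[float]:
    """Return `steps` evenly spaced scale factors from `low` to `high` inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    step = (high - low) / (steps - 1)
    return [low + i * step for i in range(steps)]


def linear_scale_search(
    score: Callable[[float], float],  # hypothetical: higher means a better result
    low: float,
    high: float,
    steps: int,
) -> Tuple[float, float]:
    """Evaluate every candidate scale and return (best_scale, best_score)."""
    best_scale = low
    best_score = float("-inf")
    for scale in linear_scales(low, high, steps):
        value = score(scale)
        if value > best_score:
            best_scale, best_score = scale, value
    return best_scale, best_score


if __name__ == "__main__":
    # Toy score that peaks at a scale of 0.75 (an assumption for illustration).
    print(linear_scale_search(lambda s: -(s - 0.75) ** 2, 0.5, 1.0, 5))
```

An exhaustive sweep like this costs one evaluation per candidate, which is reasonable when the search happens once during input preprocessing.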

Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images

Apr 21, 2022

Chao Wen, Yinda Zhang, Chenjie Cao, Zhuwen Li, Xiangyang Xue, Yanwei Fu

Abstract: We study the problem of generating shapes in a 3D mesh representation from a small number of color images with or without camera poses. While many previous works learn to hallucinate the shape directly from priors, we further improve the shape quality by leveraging cross-view information with a graph convolutional network. Instead of building a direct mapping function from images to 3D shape, our model learns to predict a series of deformations that iteratively improve a coarse shape. Inspired by traditional multiple-view geometry methods, our network samples the area around the initial mesh's vertex locations and reasons about an optimal deformation using perceptual feature statistics built from multiple input images. Extensive experiments show that our model produces accurate 3D shapes that are not only visually plausible from the input perspectives, but also well aligned to arbitrary viewpoints. With the help of its physically driven architecture, our model also generalizes across different semantic categories and numbers of input images. Model analysis experiments show that our model is robust to the quality of the initial mesh and to errors in camera pose, and that it can be combined with a differentiable renderer for test-time optimization.

* Accepted by TPAMI 2022. arXiv admin note: substantial text overlap with arXiv:1908.01491
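
To illustrate what "perceptual feature statistics built from multiple input images" could look like, here is a minimal sketch that pools the feature vectors a 3D point receives from each view into one descriptor. The abstract does not list which statistics are used; mean, max, and standard deviation are an assumption chosen for illustration, and `cross_view_statistics` is a hypothetical name.

```python
import math
from typing import List


def cross_view_statistics(view_features: List[List[float]]) -> List[float]:
    """Pool per-view feature vectors into [means..., maxes..., stds...].

    `view_features[i]` is the feature vector sampled from view i for the
    same 3D point. The result does not depend on the order of the views,
    and its length stays fixed when the number of views changes.
    """
    if not view_features:
        raise ValueError("at least one view is required")
    num_views = len(view_features)
    dim = len(view_features[0])
    means, maxes, stds = [], [], []
    for k in range(dim):
        column = [features[k] for features in view_features]
        mean = sum(column) / num_views
        variance = sum((x - mean) ** 2 for x in column) / num_views
        means.append(mean)
        maxes.append(max(column))
        stds.append(math.sqrt(variance))  # population standard deviation
    return means + maxes + stds


if __name__ == "__main__":
    print(cross_view_statistics([[1.0, 2.0], [3.0, 4.0]]))
```

Order-invariant pooling of this kind is one way a model could accept a varying number of input images, which is the generalization property the abstract reports.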

Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation

Aug 16, 2019

Chao Wen, Yinda Zhang, Zhuwen Li, Yanwei Fu

Abstract: We study the problem of generating shapes in a 3D mesh representation from a few color images with known camera poses. While many previous works learn to hallucinate the shape directly from priors, we further improve the shape quality by leveraging cross-view information with a graph convolutional network. Instead of building a direct mapping function from images to 3D shape, our model learns to predict a series of deformations that iteratively improve a coarse shape. Inspired by traditional multiple-view geometry methods, our network samples the area around the initial mesh's vertex locations and reasons about an optimal deformation using perceptual feature statistics built from multiple input images. Extensive experiments show that our model produces accurate 3D shapes that are not only visually plausible from the input perspectives, but also well aligned to arbitrary viewpoints. With the help of its physically driven architecture, our model also generalizes across different semantic categories, numbers of input images, and qualities of mesh initialization.

* Accepted by ICCV 2019
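
The idea of sampling the area around a vertex and choosing a deformation can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's network: the seven axis-aligned candidate positions, the softmax weighting, and the names `sample_hypotheses` and `deform_vertex` are all hypothetical, and the per-candidate `scores` stand in for whatever a learned model would predict.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def sample_hypotheses(vertex: Point, radius: float) -> List[Point]:
    """Return the vertex itself plus six axis-aligned neighbors at `radius`."""
    x, y, z = vertex
    offsets = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    return [(x + radius * dx, y + radius * dy, z + radius * dz) for dx, dy, dz in offsets]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def deform_vertex(vertex: Point, radius: float, scores: Sequence[float]) -> Point:
    """Move the vertex to the score-weighted average of its candidate positions."""
    hypotheses = sample_hypotheses(vertex, radius)
    if len(scores) != len(hypotheses):
        raise ValueError("one score per hypothesis is required")
    weights = softmax(scores)
    x, y, z = (sum(w * h[k] for w, h in zip(weights, hypotheses)) for k in range(3))
    return (x, y, z)


if __name__ == "__main__":
    # Strongly preferring the +x candidate pulls the vertex toward it.
    print(deform_vertex((0.0, 0.0, 0.0), 0.1, [0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

A weighted average keeps the update differentiable with respect to the scores, and repeating the step with a smaller radius gives an iterative refinement of a coarse shape.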

Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images

Aug 03, 2018

Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, Yu-Gang Jiang

Abstract: We propose an end-to-end deep learning architecture that produces a 3D shape as a triangular mesh from a single color image. Limited by the nature of deep neural networks, previous methods usually represent a 3D shape as a volume or point cloud, and it is non-trivial to convert these to the more ready-to-use mesh model. Unlike existing methods, our network represents the 3D mesh in a graph-based convolutional neural network and produces correct geometry by progressively deforming an ellipsoid, leveraging perceptual features extracted from the input image. We adopt a coarse-to-fine strategy to make the whole deformation procedure stable, and define various mesh-related losses that capture properties at different levels to guarantee visually appealing and physically accurate 3D geometry. Extensive experiments show that our method not only qualitatively produces mesh models with better details, but also achieves higher 3D shape estimation accuracy compared to the state of the art.

* ECCV 2018
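
As a rough illustration of a graph-based convolution on a mesh, the sketch below builds vertex adjacency from triangle faces and updates each vertex feature from its own value and the sum of its neighbors' values. It is a toy version under stated assumptions: the scalar weights `w_self` and `w_neigh` stand in for learned weight matrices, and `build_adjacency` and `graph_conv` are hypothetical names.

```python
from typing import Dict, List, Sequence, Set, Tuple

Face = Tuple[int, int, int]


def build_adjacency(num_vertices: int, faces: Sequence[Face]) -> Dict[int, Set[int]]:
    """Map each vertex index to the set of vertices it shares an edge with."""
    adjacency: Dict[int, Set[int]] = {v: set() for v in range(num_vertices)}
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def graph_conv(
    features: List[List[float]],
    adjacency: Dict[int, Set[int]],
    w_self: float,
    w_neigh: float,
) -> List[List[float]]:
    """One layer: out[v] = w_self * f[v] + w_neigh * (sum of f[n] over neighbors n)."""
    output = []
    for v, own in enumerate(features):
        aggregated = [0.0] * len(own)
        for n in adjacency[v]:
            for k, value in enumerate(features[n]):
                aggregated[k] += value
        output.append([w_self * own[k] + w_neigh * aggregated[k] for k in range(len(own))])
    return output


if __name__ == "__main__":
    # A tetrahedron: every vertex neighbors the other three.
    tetra_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    adj = build_adjacency(4, tetra_faces)
    print(graph_conv([[1.0], [2.0], [3.0], [4.0]], adj, 1.0, 0.5))
```

Because each layer only mixes information between edge-connected vertices, stacking layers spreads image-derived features across the surface gradually.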
