Title: CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance

Mar 27, 2022

Tianchen Zhao, Niansong Zhang, Xuefei Ning, He Wang, Li Yi, Yu Wang

Abstract: Transformers have gained much attention by outperforming convolutional neural networks in many 2D vision tasks. However, they are known to have generalization problems and to rely on massive-scale pre-training and sophisticated training techniques. In 3D tasks, the irregular data structure and limited data scale make transformers even harder to apply. We propose CodedVTR (Codebook-based Voxel TRansformer), which improves data efficiency and generalization ability for 3D sparse voxel transformers. On the one hand, we propose codebook-based attention, which projects the attention space into a subspace represented by combinations of "prototypes" in a learnable codebook. This regularizes attention learning and improves generalization. On the other hand, we propose geometry-aware self-attention, which uses geometric information (geometric pattern, density) to guide attention learning. CodedVTR can be embedded into existing sparse convolution-based methods and brings consistent performance improvements on indoor and outdoor 3D semantic segmentation tasks.
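The codebook idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `codebook_attention`, the inputs `raw_scores` and `codebook`, and the particular way the mixing coefficients are computed are all assumptions made for illustration. It shows only the general principle that the final attention weights are restricted to convex combinations of a few learnable prototypes rather than being free-form.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def codebook_attention(raw_scores, codebook):
    """Hypothetical sketch of codebook-constrained attention.

    raw_scores: (N, K) unconstrained attention logits for N voxels
                over K neighbors (assumed input).
    codebook:   (M, K) learnable prototype logits (assumed shape).
    Returns (N, K) attention weights, each row a convex combination
    of the M prototype distributions.
    """
    # Turn each prototype into a valid attention distribution over K neighbors.
    prototypes = softmax(codebook, axis=-1)                # (M, K)
    # Score each voxel's raw attention against every prototype, then
    # normalize so the mixing coefficients sum to 1 (an assumed choice).
    mixing = softmax(raw_scores @ prototypes.T, axis=-1)   # (N, M)
    # Project into the subspace spanned by the prototypes.
    return mixing @ prototypes                             # (N, K)

rng = np.random.default_rng(0)
weights = codebook_attention(rng.normal(size=(5, 8)), rng.normal(size=(3, 8)))
print(weights.shape)
```

Because both the mixing coefficients and the prototypes are normalized, every output row is itself a valid attention distribution. With M much smaller than the number of possible attention patterns, the codebook acts as the regularizer the abstract describes.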

* Published at CVPR 2022
