Title: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Nov 04, 2024

Yongxin Zhu, Bocheng Li, Yifei Xin, Linli Xu

Abstract: Vector Quantization (VQ) is a widely used method for converting continuous representations into discrete codes, and it has become fundamental in unsupervised representation learning and latent generative models. However, VQ models are often hindered by representation collapse in the latent space, which leads to low codebook utilization and limits the scalability of the codebook for large-scale training. Existing methods designed to mitigate representation collapse typically reduce the dimensionality of the latent space at the expense of model capacity, which does not fully resolve the core issue. In this study, we conduct a theoretical analysis of representation collapse in VQ models and identify its primary cause as the disjoint optimization of the codebook, where only a small subset of code vectors is updated through gradient descent. To address this issue, we propose SimVQ, a novel method which reparameterizes the code vectors through a linear transformation layer based on a learnable latent basis. This transformation optimizes the entire linear space spanned by the codebook, rather than merely updating the code vector selected by the nearest-neighbor search in vanilla VQ models. Although it is commonly understood that the multiplication of two linear matrices is equivalent to applying a single linear layer, our approach works surprisingly well in resolving the collapse issue in VQ models with just one linear layer. We validate the efficacy of SimVQ through extensive experiments across various modalities, including image and audio data with different model architectures. Our code is available at https://github.com/youngsheen/SimVQ.
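
The contrast the abstract draws between updating one selected code vector and optimizing the whole space spanned by the codebook can be illustrated with a toy example. The sketch below is not taken from the SimVQ repository. It assumes a plain squared-error quantization loss, holds the basis fixed while updating only the linear layer `W` (an assumption made for this illustration), and uses hypothetical helper names (`vanilla_step`, `reparam_step`, `count_moved`). It counts how many effective code vectors move after a single gradient step in each setting.

```python
import random

def row_times_matrix(c, W):
    # Row vector c (length d) times matrix W (d x d).
    return [sum(c[i] * W[i][j] for i in range(len(c))) for j in range(len(W[0]))]

def nearest(z, codes):
    # Index of the code closest to z in squared Euclidean distance.
    dists = [sum((a - b) ** 2 for a, b in zip(z, c)) for c in codes]
    return min(range(len(codes)), key=dists.__getitem__)

def vanilla_step(z, codebook, lr=0.1):
    # One SGD step on ||z - c_k||^2: only the selected row k receives a gradient.
    k = nearest(z, codebook)
    updated = [row[:] for row in codebook]
    updated[k] = [c - lr * 2.0 * (c - zi) for c, zi in zip(codebook[k], z)]
    return updated

def reparam_step(z, basis, W, lr=0.1):
    # One SGD step on ||z - c_k W||^2 with respect to W only (basis held fixed,
    # which is an assumption of this sketch).
    codes = [row_times_matrix(c, W) for c in basis]
    k = nearest(z, codes)
    resid = [q - zi for q, zi in zip(codes[k], z)]  # c_k W - z
    d = len(W)
    # dL/dW[i][j] = 2 * c_k[i] * resid[j]
    return [[W[i][j] - lr * 2.0 * basis[k][i] * resid[j] for j in range(d)]
            for i in range(d)]

def count_moved(before, after, tol=1e-12):
    # Number of code vectors whose coordinates changed.
    return sum(any(abs(a - b) > tol for a, b in zip(r1, r2))
               for r1, r2 in zip(before, after))

random.seed(0)
K, d = 8, 4  # toy codebook size and latent dimension
basis = [[random.gauss(0, 1) for _ in range(d)] for _ in range(K)]
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # identity init
z = [random.gauss(0, 1) for _ in range(d)]  # one encoder output

# Vanilla VQ: the codebook rows are the parameters.
moved_vanilla = count_moved(basis, vanilla_step(z, basis))

# Reparameterized: effective codes are basis @ W, and W is the parameter.
W_new = reparam_step(z, basis, W)
before = [row_times_matrix(c, W) for c in basis]
after = [row_times_matrix(c, W_new) for c in basis]
moved_reparam = count_moved(before, after)

print("codes moved (vanilla):", moved_vanilla)
print("codes moved (reparameterized):", moved_reparam)
```

In the vanilla step the gradient reaches a single row, while the step on `W` shifts every effective code whose basis vector is not orthogonal to the selected one. The toy sizes, learning rate, and identity initialization are illustrative choices only.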
