Abstract: The ability to learn compact, high-quality, and easy-to-optimize representations for visual data is paramount to many applications such as novel view synthesis and 3D reconstruction. Recent work has shown substantial success in using tensor networks to design such compact and high-quality representations. However, the ability to optimize tensor-based representations, and in particular the highly compact tensor train representation, is still lacking. This has prevented practitioners from deploying the full potential of tensor networks for visual data. To this end, we propose 'Prolongation Upsampling Tensor Train (PuTT)', a novel method for learning tensor train representations in a coarse-to-fine manner. Our method involves the prolongation or 'upsampling' of a learned tensor train representation, creating a sequence of coarse-to-fine tensor trains that are incrementally refined. We evaluate our representation along three axes: (1) compression, (2) denoising capability, and (3) image completion capability. To assess these axes, we consider the tasks of image fitting, 3D fitting, and novel view synthesis, where our method shows improved performance compared to state-of-the-art tensor-based methods. For full results, see our project webpage: https://sebulo.github.io/PuTT_website/
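A rough illustration of the coarse-to-fine idea follows, not the paper's method: the sketch fits a two-core tensor train (equivalently, a plain rank-r matrix factorization) to an image, then "prolongs" the learned factors by linear interpolation before refining at the next resolution. PuTT itself operates on quantized tensor trains and prolongs the TT cores directly; the names `fit_tt2` and `prolong`, and all hyperparameters here, are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fit_tt2(target, rank=16, steps=500, lr=1e-2, A=None, B=None):
    """Fit a two-core tensor train (a rank-r matrix factorization) to `target` of shape (H, W)."""
    H, W = target.shape
    if A is None:
        A = torch.randn(H, rank, requires_grad=True)
        B = torch.randn(rank, W, requires_grad=True)
    opt = torch.optim.Adam([A, B], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.mse_loss(A @ B, target).backward()
        opt.step()
    return A.detach(), B.detach()

def prolong(factor, new_len, dim):
    """'Upsample' a factor along its spatial mode via linear interpolation."""
    x = factor.movedim(dim, -1).unsqueeze(0)        # shape (1, rank, old_len)
    x = F.interpolate(x, size=new_len, mode="linear", align_corners=False)
    return x.squeeze(0).movedim(-1, dim).contiguous().requires_grad_(True)

# Coarse-to-fine fitting: solve at low resolution, prolong, then refine.
image = torch.rand(256, 256)                        # stand-in for a target image
A = B = None
for res in (64, 128, 256):
    coarse = F.interpolate(image[None, None], size=(res, res),
                           mode="bilinear", align_corners=False)[0, 0]
    if A is not None:                               # start from the prolonged coarse solution
        A, B = prolong(A, res, dim=0), prolong(B, res, dim=1)
    A, B = fit_tt2(coarse, A=A, B=B)
print(F.mse_loss(A @ B, image).item())
```

The point of the loop is that each finer model is initialized from the prolonged coarse solution rather than from scratch, which is the coarse-to-fine refinement the abstract describes.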
Abstract: Training of large neural networks requires significant computational resources. Despite advances using low-rank adapters and quantization, pretraining of models such as LLMs on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates. To address these limitations, we propose LoQT, a method for efficiently training quantized models. LoQT uses gradient-based tensor factorization to initialize low-rank trainable weight matrices that are periodically merged into quantized full-rank weight matrices. Our approach is suitable for both pretraining and fine-tuning of models, which we demonstrate experimentally for language modeling and downstream task adaptation. We find that LoQT enables efficient training of models up to 7B parameters on a consumer-grade 24GB GPU. We also demonstrate the feasibility of training a 13B parameter model using per-layer gradient updates on the same hardware.
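A minimal sketch of the merge-into-quantized-weights idea is given below, under stated assumptions: a toy symmetric per-tensor quantizer stands in for the 4-bit schemes used in practice, and the low-rank factors are initialized randomly rather than from the gradient-based factorization LoQT uses. The class name `LoQTLinearSketch` and all hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

def quantize(w, bits=4):
    """Toy symmetric per-tensor quantization (stand-in for NF4-style schemes)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax).to(torch.int8)
    return q, scale

def dequantize(q, scale):
    return q.float() * scale

class LoQTLinearSketch(nn.Module):
    """Frozen quantized weight plus trainable low-rank factors, periodically merged."""
    def __init__(self, in_f, out_f, rank=8, bits=4):
        super().__init__()
        self.bits = bits
        q, scale = quantize(torch.randn(out_f, in_f) * 0.02, bits)
        self.register_buffer("q", q)
        self.register_buffer("scale", scale)
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # trainable factors
        self.B = nn.Parameter(torch.zeros(out_f, rank))        # zero init: no update at start

    def forward(self, x):
        w = dequantize(self.q, self.scale)                      # frozen full-rank part
        return x @ (w + self.B @ self.A).t()

    @torch.no_grad()
    def merge(self):
        """Fold B @ A into the quantized weight, re-quantize, and restart the factors."""
        w = dequantize(self.q, self.scale) + self.B @ self.A
        q, scale = quantize(w, self.bits)
        self.q.copy_(q)
        self.scale.copy_(scale)
        self.B.zero_()
        nn.init.normal_(self.A, std=0.01)
```

During training, only `A` and `B` (and their optimizer states) receive gradients; calling `merge()` periodically folds the accumulated low-rank update into the quantized full-rank weight, which is what keeps the memory footprint of the full-rank matrices at quantized precision.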
Abstract: Combining insights from machine learning and quantum Monte Carlo, the stochastic reconfiguration method with neural network Ansatz states is a promising new direction for high-precision ground-state estimation of quantum many-body problems. At present, the method is heuristic, lacking a proper theoretical foundation. We initiate a thorough analysis of the learning landscape and show that it reveals universal behavior reflecting a combination of the underlying physics and of the learning dynamics. In particular, the spectrum of the quantum Fisher matrix of complex restricted Boltzmann machine states can dramatically change across a phase transition. In contrast to the spectral properties of the quantum Fisher matrix, the actual weights of the network at convergence do not reveal much information about the system or the dynamics. Furthermore, we identify a new measure of correlation in the state by analyzing the entanglement of the eigenvectors. We show that, generically, the learning-landscape modes with the least entanglement have the largest eigenvalues, suggesting that correlations are encoded in large flat valleys of the learning landscape, favoring stable representations of the ground state.
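The central object here is the quantum Fisher matrix, S_kl = <O_k* O_l> - <O_k*><O_l>, where O_k(s) = d ln psi(s) / d theta_k are the log-derivatives of the Ansatz and the averages are taken over samples drawn from |psi|^2. The sketch below shows how such an estimate and its spectrum might be computed from an array of per-sample log-derivatives; the array shapes, random stand-in data, and function name are assumptions for illustration, not the paper's code.

```python
import numpy as np

def quantum_fisher_matrix(O):
    """Covariance of log-derivatives: S_kl = <O_k* O_l> - <O_k*><O_l>.

    O: complex array of shape (n_samples, n_params), with O[s, k] = d ln psi(s) / d theta_k
    and samples s drawn from |psi|^2 (uniform sample weights assumed).
    """
    Oc = O - O.mean(axis=0)
    return (Oc.conj().T @ Oc) / O.shape[0]

# Toy stand-in for the log-derivatives of a complex RBM ansatz.
rng = np.random.default_rng(0)
O = rng.normal(size=(4096, 32)) + 1j * rng.normal(size=(4096, 32))
S = quantum_fisher_matrix(O)
eigvals = np.linalg.eigvalsh(S)   # S is Hermitian and positive semidefinite
print(eigvals[-5:])               # inspect the largest eigenvalues of the spectrum
```

The spectrum of S is what the abstract analyzes: directions with small eigenvalues are flat directions of the learning landscape, consistent with the statement that correlated (more entangled) modes sit in large flat valleys.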