Patrick Putzky

Choose Your Model Size: Any Compression by a Single Gradient Descent

Feb 03, 2025

Martin Genzel, Patrick Putzky, Pengfei Zhao, Sebastian Schulze, Mattes Mollenhauer, Robert Seidel, Stefan Dietzel, Thomas Wollmann

Abstract: The adoption of Foundation Models in resource-constrained environments remains challenging due to their large size and inference costs. A promising way to overcome these limitations is post-training compression, which aims to balance reduced model size against performance degradation. This work presents Any Compression via Iterative Pruning (ACIP), a novel algorithmic approach to determine a compression-performance trade-off from a single stochastic gradient descent run. To ensure parameter efficiency, we use an SVD-reparametrization of linear layers and iteratively prune their singular values with a sparsity-inducing penalty. The resulting pruning order gives rise to a global parameter ranking that allows us to materialize models of any target size. Importantly, the compressed models exhibit strong predictive downstream performance without the need for costly fine-tuning. We evaluate ACIP on a large selection of open-weight LLMs and tasks, and demonstrate state-of-the-art results compared to existing factorisation-based compression methods. We also show that ACIP seamlessly complements common quantization-based compression techniques.
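
The idea of reparametrizing linear layers by their SVD and materializing a model from a global ranking of singular values can be illustrated with a minimal NumPy sketch. This is not the ACIP implementation: the function names are hypothetical, and singular-value magnitude is used here as a stand-in for the pruning order that the paper obtains from a sparsity-inducing penalty during gradient descent.

```python
import numpy as np

def svd_reparametrize(weight):
    """Factor a dense weight matrix W as U @ diag(s) @ Vt."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return u, s, vt

def global_ranking(singular_values_per_layer):
    """Rank all (layer, index) pairs across layers, most important first.

    Assumption: magnitude replaces the learned pruning order.
    """
    entries = [
        (float(value), layer, idx)
        for layer, values in enumerate(singular_values_per_layer)
        for idx, value in enumerate(values)
    ]
    entries.sort(key=lambda e: e[0], reverse=True)
    return [(layer, idx) for _, layer, idx in entries]

def materialize(factors, ranking, keep):
    """Rebuild every layer from the `keep` top-ranked singular values overall."""
    kept = [[] for _ in factors]
    for layer, idx in ranking[:keep]:
        kept[layer].append(idx)
    layers = []
    for (u, s, vt), idxs in zip(factors, kept):
        idxs = np.array(sorted(idxs), dtype=int)
        # Low-rank reconstruction using only the retained components.
        layers.append((u[:, idxs] * s[idxs]) @ vt[idxs, :])
    return layers
```

Because the ranking is computed once, any target size corresponds to a different `keep` value applied to the same factors, with no further optimization in this sketch.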

Invert to Learn to Invert

Nov 25, 2019

Patrick Putzky, Max Welling

Abstract: Iterative learning-to-infer approaches have become popular solvers for inverse problems. However, their memory requirements during training grow linearly with model depth, which in practice limits model expressiveness. In this work, we propose an iterative inverse model with constant memory that relies on invertible networks to avoid storing intermediate activations. As a result, the proposed approach allows us to train models with 400 layers on 3D volumes in an MRI image reconstruction task. In experiments on a public data set, we demonstrate that these deeper, and thus more expressive, networks achieve state-of-the-art image reconstruction.
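
Why invertibility removes the need to store intermediate activations can be shown with a small sketch. It uses additive coupling layers, a generic invertible building block chosen here as an assumption; the layer design in the paper may differ, and all names below are hypothetical.

```python
import numpy as np

def coupling_forward(x1, x2, f):
    """Additive coupling: the first half passes through, the second is shifted."""
    return x1, x2 + f(x1)

def coupling_inverse(y1, y2, f):
    """Exact inverse of coupling_forward; f itself never has to be inverted."""
    return y1, y2 - f(y1)

def make_layer(rng, dim):
    """Hypothetical sub-network: a small random map followed by tanh."""
    w = 0.1 * rng.standard_normal((dim, dim))
    return lambda h: np.tanh(h @ w)

def run_forward(x1, x2, layers):
    for f in layers:
        x1, x2 = coupling_forward(x1, x2, f)
        x1, x2 = x2, x1  # swap halves so both get updated over depth
    return x1, x2

def run_inverse(y1, y2, layers):
    """Recover the input from the output alone, layer by layer."""
    for f in reversed(layers):
        y1, y2 = y2, y1  # undo the swap
        y1, y2 = coupling_inverse(y1, y2, f)
    return y1, y2
```

During backpropagation, each layer's input can be recomputed from its output in this way (up to floating-point round-off), so memory use does not have to grow with the number of layers.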

i-RIM applied to the fastMRI challenge

Oct 20, 2019

Patrick Putzky, Dimitrios Karkalousos, Jonas Teuwen, Nikita Miriakov, Bart Bakker, Matthan Caan, Max Welling

Abstract: We, team AImsterdam, summarize our submission to the fastMRI challenge (Zbontar et al., 2018). Our approach builds on recent advances in invertible learning-to-infer models as presented in Putzky and Welling (2019). Both our single-coil and our multi-coil models share the same basic architecture.

* Abstract submitted to the fastMRI challenge

Recurrent Inference Machines for Solving Inverse Problems

Jun 13, 2017

Patrick Putzky, Max Welling

Abstract: Much of the recent research on solving iterative inference problems focuses on moving away from hand-chosen inference algorithms and towards learned inference. In the latter, the inference process is unrolled in time and interpreted as a recurrent neural network (RNN), which allows for joint learning of model and inference parameters with back-propagation through time. In this framework, the RNN architecture is directly derived from a hand-chosen inference algorithm, effectively limiting its capabilities. We propose a learning framework, called Recurrent Inference Machines (RIM), in which we turn algorithm construction the other way round: Given data and a task, train an RNN to learn an inference algorithm. Because RNNs are Turing complete [1, 2], they are capable of implementing any inference algorithm. The framework allows for an abstraction which removes the need for domain knowledge. We demonstrate in several image restoration experiments that this abstraction is effective, allowing us to achieve state-of-the-art performance on image denoising and super-resolution tasks and superior across-task generalization.
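
The contrast between a hand-chosen inference algorithm and a learned recurrent update can be sketched as an unrolled loop in which the update rule is a pluggable cell. Everything below is an illustrative assumption rather than the paper's model: a linear forward model with Gaussian noise, a fixed gradient step as the hand-chosen special case, and an untrained random RNN cell as a placeholder for a learned one.

```python
import numpy as np

def grad_log_likelihood(x, y, a):
    """Gradient of -0.5 * ||y - a @ x||^2 with respect to x."""
    return a.T @ (y - a @ x)

def unrolled_inference(y, a, cell, steps, state_dim):
    """Iteratively refine an estimate x; the cell decides each update."""
    x = np.zeros(a.shape[1])
    s = np.zeros(state_dim)
    for _ in range(steps):
        g = grad_log_likelihood(x, y, a)
        dx, s = cell(g, x, s)
        x = x + dx
    return x

def gradient_ascent_cell(step_size):
    """Hand-chosen special case: ignore the state, step along the gradient."""
    def cell(g, x, s):
        return step_size * g, s
    return cell

def random_rnn_cell(rng, dim, state_dim):
    """Untrained placeholder for a learned cell (hypothetical architecture)."""
    w_in = 0.1 * rng.standard_normal((2 * dim, state_dim))
    w_s = 0.1 * rng.standard_normal((state_dim, state_dim))
    w_out = 0.1 * rng.standard_normal((state_dim, dim))
    def cell(g, x, s):
        s_new = np.tanh(np.concatenate([g, x]) @ w_in + s @ w_s)
        return s_new @ w_out, s_new
    return cell

# Usage: the same loop runs with either cell.
a = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_true = np.array([1.0, -2.0])
y = a @ x_true
x_hat = unrolled_inference(y, a, gradient_ascent_cell(0.2), steps=100, state_dim=1)
```

In a learned setting, the weights of the recurrent cell would be trained with back-propagation through time on pairs of measurements and ground-truth signals; the random cell here only shows where such a model plugs in.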
