Vahab Mirrokni

Dima

Nested Learning: The Illusion of Deep Learning Architectures

Dec 31, 2025

Trellis: Learning to Compress Key-Value Memory in Attention Models

Dec 29, 2025

MS-SSM: A Multi-Scale State Space Model for Efficient Sequence Modeling

Dec 29, 2025

TNT: Improving Chunkwise Training for Test-Time Memorization

Nov 10, 2025

Sampling and Loss Weights in Multi-Domain Training

Nov 10, 2025

Data Selection for Fine-tuning Vision Language Models via Cross Modal Alignment Trajectories

Oct 01, 2025

ATLAS: Learning to Optimally Memorize the Context at Test Time

May 29, 2025

Efficient Data Selection at Scale via Influence Distillation

May 25, 2025

Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing

May 21, 2025

TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate

Apr 28, 2025