Meisam Razaviyayn

Efficient DP-SGD for LLMs with Randomized Clipping

May 24, 2026

Early Stopping for Large Reasoning Models via Confidence Dynamics

Apr 06, 2026

Memory Caching: RNNs with Growing Memory

Feb 27, 2026

Less is More: Convergence Benefits of Fewer Data Weight Updates over Longer Horizon

Feb 23, 2026

Nested Learning: The Illusion of Deep Learning Architectures

Dec 31, 2025

Sampling and Loss Weights in Multi-Domain Training

Nov 10, 2025

TNT: Improving Chunkwise Training for Test-Time Memorization

Nov 10, 2025

Memory-Efficient Differentially Private Training with Gradient Random Projection

Jun 18, 2025

ATLAS: Learning to Optimally Memorize the Context at Test Time

May 29, 2025

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization

Apr 17, 2025