Cengiz Pehlevan

Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging

Feb 03, 2026

Universal One-third Time Scaling in Learning Peaked Distributions

Feb 03, 2026

Hyperparameter Transfer with Mixture-of-Expert Layers

Jan 28, 2026

Disordered Dynamics in High Dimensions: Connections to Random Matrices and Machine Learning

Jan 03, 2026

Demystifying LLM-as-a-Judge: Analytically Tractable Model for Inference-Time Scaling

Dec 22, 2025

Pretrain-Test Task Alignment Governs Generalization in In-Context Learning

Sep 30, 2025

A Simplified Analysis of SGD for Linear Regression with Weight Averaging

Jun 18, 2025

Don't be lazy: CompleteP enables compute-efficient deep transformers

May 02, 2025

Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism

Apr 15, 2025

Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining

Apr 10, 2025