Eugene Belilovsky

MILA

Less is More: Undertraining Experts Improves Model Upcycling

Jun 17, 2025

PyLO: Towards Accessible Learned Optimizers in PyTorch

Jun 12, 2025

MuLoCo: Muon is a practical inner optimizer for DiLoCo

May 29, 2025

Incentivizing Permissionless Distributed Learning of LLMs

May 27, 2025

Continual Pre-training of MoEs: How robust is your router?

Mar 06, 2025

Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training

Mar 06, 2025

FairDropout: Using Example-Tied Dropout to Enhance Generalization of Minority Groups

Feb 10, 2025

Non-Uniform Parameter-Wise Model Merging

Dec 20, 2024

Sketch-guided Cage-based 3D Gaussian Splatting Deformation

Nov 19, 2024

Towards motion from video diffusion models

Nov 19, 2024