Pierre Ablin

Ecole normale supérieure, Paris, France

Sparse Repellency for Shielded Generation in Text-to-image Diffusion Models

Oct 10, 2024
Via arXiv

Dynamic Gradient Alignment for Online Data Mixing

Oct 03, 2024
Via arXiv

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

Sep 06, 2024
Via arXiv

The AdEMAMix Optimizer: Better, Faster, Older

Sep 05, 2024
Via arXiv

Optimization without retraction on the random generalized Stiefel manifold

May 02, 2024
Via arXiv

Enhancing Hypergradients Estimation: A Study of Preconditioning and Reparameterization

Feb 26, 2024
Via arXiv

Careful with that Scalpel: Improving Gradient Surgery with an EMA

Feb 05, 2024
Via arXiv

Specialized Language Models with Cheap Inference from Limited Domain Data

Feb 02, 2024
Via arXiv

Understanding the Regularity of Self-Attention with Optimal Transport

Dec 22, 2023
Via arXiv

MultiView Independent Component Analysis with Delays

Dec 01, 2023
Via arXiv