Claudio Mayrink Verdun

Inference-Time Reward Hacking in Large Language Models

Jun 24, 2025
Via arXiv

HeavyWater and SimplexWater: Watermarking Low-Entropy Text Distributions

Jun 06, 2025
Via arXiv

Multi-Group Proportional Representation for Text-to-Image Models

May 29, 2025
Via arXiv

GradPCA: Leveraging NTK Alignment for Reliable Out-of-Distribution Detection

May 21, 2025
Via arXiv

Optimized Couplings for Watermarking Large Language Models

May 13, 2025
Via arXiv

Soft Best-of-n Sampling for Model Alignment

May 06, 2025
Via arXiv

Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models

Jul 31, 2024
Via arXiv

High-Dimensional Confidence Regions in Sparse MRI

Jul 18, 2024
Via arXiv

Non-Asymptotic Uncertainty Quantification in High-Dimensional Learning

Jul 18, 2024
Via arXiv

With or Without Replacement? Improving Confidence in Fourier Imaging

Jul 18, 2024
Via arXiv