Song Mei

Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs

Oct 17, 2024

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

Oct 09, 2024

Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization

Jun 12, 2024

U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models

May 01, 2024

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

Apr 11, 2024

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

Apr 08, 2024

Statistical Estimation in the Spiked Tensor Model via the Quantum Approximate Optimization Algorithm

Feb 29, 2024

Mean-field variational inference with the TAP free energy: Geometric and statistical properties in linear models

Nov 14, 2023

How Do Transformers Learn In-Context Beyond Simple Functions? A Case Study on Learning with Representations

Oct 16, 2023

Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining

Oct 12, 2023