Song Mei

Improving LLM Safety Alignment with Dual-Objective Optimization

Mar 05, 2025
Via arXiv

An Overview of Large Language Models for Statisticians

Feb 25, 2025
Via arXiv

Implicit Bias of Gradient Descent for Non-Homogeneous Deep Networks

Feb 22, 2025
Via arXiv

How Do LLMs Perform Two-Hop Reasoning in Context?

Feb 19, 2025
Via arXiv

A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI

Jan 08, 2025
Via arXiv

Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs

Oct 17, 2024
Via arXiv

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

Oct 09, 2024
Via arXiv

Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization

Jun 12, 2024
Via arXiv

U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models

May 01, 2024
Via arXiv

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

Apr 11, 2024
Via arXiv