Yejin Choi

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
Dec 09, 2024

Negative Token Merging: Image-based Adversarial Feature Guidance
Dec 02, 2024

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions
Nov 12, 2024

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
Nov 05, 2024

RESTOR: Knowledge Recovery through Machine Unlearning
Oct 31, 2024

Towards Visual Text Design Transfer Across Languages
Oct 24, 2024

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation
Oct 22, 2024

Diverging Preferences: When do Annotators Disagree and do Models Know?
Oct 18, 2024

SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs
Oct 17, 2024

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Oct 15, 2024