Andrew Bai

On the loss of context-awareness in general instruction fine-tuning

Nov 05, 2024
Via arXiv

CLUE: Concept-Level Uncertainty Estimation for Large Language Models

Sep 04, 2024
Via arXiv

Embedding Space Selection for Detecting Memorization and Fingerprinting in Generative Models

Jul 30, 2024
Via arXiv

Defending LLMs against Jailbreaking Attacks via Backtranslation

Feb 28, 2024
Via arXiv

Which Pretrain Samples to Rehearse when Finetuning Pretrained Models?

Feb 12, 2024
Via arXiv

Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation

Jan 21, 2024
Via arXiv

Concept Gradient: Concept-based Interpretation Without Linear Assumption

Aug 31, 2022
Via arXiv