Xuanli He

Cut the Deadwood Out: Post-Training Model Purification with Selective Module Substitution

Dec 29, 2024

An Auditing Test To Detect Behavioral Shift in Language Models

Oct 25, 2024

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Oct 21, 2024

Are We Done with MMLU?

Jun 07, 2024

IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models

Jun 05, 2024

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks

May 19, 2024

Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning

Apr 30, 2024

Attacks on Third-Party APIs of Large Language Models

Apr 24, 2024

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Apr 08, 2024

Backdoor Attack on Multilingual Machine Translation

Apr 03, 2024