Picture for Xuanli He

Xuanli He

An Auditing Test To Detect Behavioral Shift in Language Models

Add code
Oct 25, 2024
Viaarxiv icon

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Add code
Oct 21, 2024
Figure 1 for Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Figure 2 for Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Figure 3 for Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Figure 4 for Analysing the Residual Stream of Language Models Under Knowledge Conflicts
Viaarxiv icon

Are We Done with MMLU?

Add code
Jun 07, 2024
Viaarxiv icon

IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models

Add code
Jun 05, 2024
Viaarxiv icon

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks

Add code
May 19, 2024
Viaarxiv icon

Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning

Add code
Apr 30, 2024
Viaarxiv icon

Attacks on Third-Party APIs of Large Language Models

Add code
Apr 24, 2024
Viaarxiv icon

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Add code
Apr 08, 2024
Viaarxiv icon

Backdoor Attack on Multilingual Machine Translation

Add code
Apr 03, 2024
Viaarxiv icon

StarCoder 2 and The Stack v2: The Next Generation

Add code
Feb 29, 2024
Figure 1 for StarCoder 2 and The Stack v2: The Next Generation
Figure 2 for StarCoder 2 and The Stack v2: The Next Generation
Figure 3 for StarCoder 2 and The Stack v2: The Next Generation
Figure 4 for StarCoder 2 and The Stack v2: The Next Generation
Viaarxiv icon