Ruixiang Tang

Disentangling Memory and Reasoning Ability in Large Language Models

Nov 21, 2024
Via arXiv

When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations

Nov 19, 2024
Via arXiv

Taylor Unswift: Secured Weight Release for Large Language Models via Taylor Expansion

Oct 06, 2024
Via arXiv

Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models

Sep 26, 2024
Via arXiv

Assessing and Enhancing Large Language Models in Rare Disease Question-answering

Aug 15, 2024
Via arXiv

Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models

Jul 15, 2024
Via arXiv

LoRA-as-an-Attack! Piercing LLM Safety Under The Share-and-Play Scenario

Feb 29, 2024
Via arXiv

Large Language Models As Faithful Explainers

Feb 07, 2024
Via arXiv

Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots

Oct 28, 2023
Via arXiv

Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks

Oct 20, 2023
Via arXiv