Xinlei He

CL-attack: Textual Backdoor Attacks via Cross-Lingual Triggers

Dec 26, 2024
Via arXiv

Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media

Dec 24, 2024
Via arXiv

On the Generalization Ability of Machine-Generated Text Detectors

Dec 23, 2024
Via arXiv

Quantized Delta Weight Is Safety Keeper

Nov 29, 2024
Via arXiv

Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond

Aug 21, 2024
Via arXiv

Membership Inference Attack Against Masked Image Modeling

Aug 13, 2024
Via arXiv

On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks

Jul 05, 2024
Via arXiv

Jailbreak Attacks and Defenses Against Large Language Models: A Survey

Jul 05, 2024
Via arXiv

JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models

Jun 13, 2024
Via arXiv

Hidden Question Representations Tell Non-Factuality Within and Across Large Language Models

Jun 08, 2024
Via arXiv