Yuliang Sun

Knowledge-to-Jailbreak: One Knowledge Point Worth One Attack

Jun 17, 2024
Via arXiv

WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models

Nov 13, 2023
Via arXiv