Title: CSRT: Evaluation and Analysis of LLMs using Code-Switching Red-Teaming Dataset

Jun 17, 2024

Haneul Yoo, Yongjin Yang, Hwaran Lee

Abstract: Recent studies of large language models (LLMs) have shed light on their multilingual ability and safety, beyond conventional language modeling tasks. Still, current benchmarks cannot evaluate these properties comprehensively and depend excessively on manual annotation. In this paper, we introduce code-switching red-teaming (CSRT), a simple yet effective red-teaming technique that simultaneously tests the multilingual understanding and safety of LLMs. We release the CSRT dataset, which comprises 315 code-switching queries that combine up to 10 languages and elicit a wide range of undesirable behaviors. Through extensive experiments with ten state-of-the-art LLMs, we demonstrate that CSRT significantly outperforms existing multilingual red-teaming techniques, achieving 46.7% more attacks than existing methods in English. We analyze the harmful responses to the CSRT dataset through ablation studies with 16K samples, covering aspects including but not limited to scaling laws, unsafe behavior categories, and input conditions for optimal data generation. Additionally, we validate the extensibility of CSRT by generating code-switching attack prompts with monolingual data.
