Yanjun Qi

TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice

Feb 21, 2025

Preference Optimization via Contrastive Divergence: Your Reward Model is Secretly an NLL Estimator

Feb 06, 2025

Graph of Attacks with Pruning: Optimizing Stealthy Jailbreak Prompt Generation for Enhanced LLM Content Moderation

Jan 28, 2025

AIDE: Task-Specific Fine Tuning with Attribute Guided Multi-Hop Data Expansion

Dec 09, 2024

Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive

Dec 01, 2024

Towards Improved Preference Optimization Pipeline: from Data Generation to Budget-Controlled Regularization

Nov 07, 2024

DFlow: Diverse Dialogue Flow Simulation with Large Language Models

Oct 18, 2024

TaeBench: Improving Quality of Toxic Adversarial Examples

Oct 08, 2024

Towards Building a Robust Toxicity Predictor

Apr 09, 2024

Less is More for Improving Automatic Evaluation of Factual Consistency

Apr 09, 2024