Honglin Mu

Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring

Oct 28, 2024

Concise and Precise Context Compression for Tool-Using Language Models

Jul 02, 2024

Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement

Jun 25, 2024

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Mar 31, 2024

Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities

Mar 27, 2024

Improving Domain Generalization for Sound Classification with Sparse Frequency-Regularized Transformer

Jul 19, 2023

MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning

Apr 19, 2023