Zeyi Liao

AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts

Oct 29, 2024

AdvWeb: Controllable Black-box Attacks on VLM-powered Web Agents

Oct 22, 2024

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Oct 07, 2024

EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

Sep 17, 2024

Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback

Jun 11, 2024

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Apr 18, 2024

AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs

Apr 11, 2024

AttributionBench: How Hard is Automatic Attribution Evaluation?

Feb 23, 2024

A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents

Feb 15, 2024

In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search

Nov 13, 2023