Andy Zhou

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Mar 20, 2025
Via arXiv

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

Mar 19, 2025
Via arXiv

Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search

Mar 13, 2025
Via arXiv

Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models

Mar 13, 2025
Via arXiv

RedCode: Risky Code Execution and Generation Benchmark for Code Agents

Nov 12, 2024
Via arXiv

KnowGraph: Knowledge-Enabled Anomaly Detection via Logical Reasoning on Graph Data

Oct 10, 2024
Via arXiv

Tamper-Resistant Safeguards for Open-Weight LLMs

Aug 01, 2024
Via arXiv

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies

Jun 25, 2024
Via arXiv

Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters

May 30, 2024
Via arXiv

FedSelect: Personalized Federated Learning with Customized Selection of Parameters for Fine-Tuning

Apr 03, 2024
Via arXiv