Jingyuan Huang

Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents

Oct 03, 2024
Via arXiv

Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

Apr 10, 2024
Via arXiv

Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models

Oct 19, 2023
Via arXiv

An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software

Aug 18, 2023
Via arXiv

Validating Multimedia Content Moderation Software via Semantic Fusion

May 23, 2023
Via arXiv