Youliang Yuan

VisFactor: Benchmarking Fundamental Visual Cognition in Multimodal Large Language Models

Feb 23, 2025

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Dec 24, 2024

Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs

Oct 15, 2024

Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs

Oct 10, 2024

Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step

Oct 04, 2024

Learning to Ask: When LLMs Meet Unclear Instruction

Aug 31, 2024

On the Resilience of Multi-Agent Systems with Malicious Agents

Aug 02, 2024

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training

Jul 12, 2024

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments

Mar 18, 2024

A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models

Jan 01, 2024