Ziyang Luo

Aria-UI: Visual Grounding for GUI Instructions

Dec 20, 2024
Via arXiv

ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models

Dec 17, 2024
Via arXiv

ScratchEval: Are GPT-4o Smarter than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges

Nov 28, 2024
Via arXiv

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Nov 20, 2024
Via arXiv

From General to Specific: Utilizing General Hallucination to Automatically Measure the Role Relationship Fidelity for Specific Role-Play Agents

Nov 12, 2024
Via arXiv

Towards Low-Resource Harmful Meme Detection with LMM Agents

Nov 08, 2024
Via arXiv

AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation

Oct 01, 2024
Via arXiv

CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding?

Aug 20, 2024
Via arXiv

MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models

Jun 17, 2024
Via arXiv

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Jun 11, 2024
Via arXiv