Xingyu Fu

FamiCom: Further Demystifying Prompts for Language Models with Task-Agnostic Performance Estimation

Jun 17, 2024
Via arXiv

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

Jun 13, 2024
Via arXiv

Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

Jun 13, 2024
Via arXiv

Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense?

Jun 11, 2024
Via arXiv

BLINK: Multimodal Large Language Models Can See but Not Perceive

Apr 18, 2024
Via arXiv

Deceiving Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?

Nov 16, 2023
Via arXiv

ImagenHub: Standardizing the evaluation of conditional image generation models

Oct 17, 2023
Via arXiv

Typing on Any Surface: A Deep Learning-based Method for Real-Time Keystroke Detection in Augmented Reality

Aug 31, 2023
Via arXiv

Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge

May 30, 2023
Via arXiv

Interpretable by Design Visual Question Answering

May 24, 2023
Via arXiv