Graham Neubig

Carnegie Mellon University

What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length

Nov 04, 2024

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Oct 22, 2024

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Oct 21, 2024

Beyond Browsing: API-Based Web Agents

Oct 21, 2024

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Oct 18, 2024

Harnessing Webpage UIs for Text-Rich Visual Understanding

Oct 17, 2024

Stereotype or Personalization? User Identity Biases Chatbot Recommendations

Oct 08, 2024

Better Instruction-Following Through Minimum Bayes Risk

Oct 07, 2024

Agent Workflow Memory

Sep 11, 2024

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Sep 04, 2024