Anne Beyer

Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models

Jun 20, 2024
Via arXiv

clembench-2024: A Challenging, Dynamic, Complementary, Multilingual Benchmark and Underlying Flexible Framework for LLMs as Multi-Action Agents

May 31, 2024
Via arXiv

Neural Conversation Models and How to Rein Them in: A Survey of Failures and Fixes

Aug 11, 2023
Via arXiv

Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models

May 07, 2021
Via arXiv