Taaren Singh

Vision language models are unreliable at trivial spatial cognition

Apr 22, 2025
Via arXiv