Fengyuan Hu

Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities

Oct 22, 2024
Via arXiv

Efficient In-Context Learning in Vision-Language Models for Egocentric Videos

Nov 29, 2023
Via arXiv

From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning

Oct 24, 2023
Via arXiv