Kechen Fang

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

Oct 15, 2024
Via arXiv

Can Vision-Language Models Think from a First-Person Perspective?

Nov 27, 2023
Via arXiv