Weizhen Wang

Embodied Scene Understanding for Vision Language Models via MetaVQA

Jan 15, 2025
Via arXiv