Ziqiao Ma

Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities

Oct 22, 2024
Via arXiv

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

Jul 09, 2024
Via arXiv

Multi-Object Hallucination in Vision-Language Models

Jul 08, 2024
Via arXiv

Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

Jun 17, 2024
Via arXiv

DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences

Jun 05, 2024
Via arXiv

Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations

May 22, 2024
Via arXiv

GROUNDHOG: Grounding Large Language Models to Holistic Segmentation

Feb 26, 2024
Via arXiv

Inversion-Free Image Editing with Natural Language

Dec 07, 2023
Via arXiv

Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models

Oct 30, 2023
Via arXiv

CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation

Oct 19, 2023
Via arXiv