Jialou Wang

Detect2Interact: Localizing Object Key Field in Visual Question Answering (VQA) with LLMs

Apr 01, 2024
Via arXiv