Junjie Fei

Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents

Nov 23, 2024
Via arXiv

Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding

May 29, 2024
Via arXiv

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning

Jul 31, 2023
Via arXiv

Caption Anything: Interactive Image Description with Diverse Multimodal Controls

May 08, 2023
Via arXiv