Qi Zhi Lim

VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering

Apr 11, 2025
Via arXiv