Jiayin Hu

VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Mar 22, 2024
Via arXiv

VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework

Mar 14, 2024
Via arXiv