Wenhan Dou

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

Oct 10, 2024
Via arXiv

Parameter-Inverted Image Pyramid Networks

Jun 06, 2024
Via arXiv