Gen Luo

$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Oct 17, 2024
Via arXiv

Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training

Oct 10, 2024
Via arXiv

3D-GRES: Generalized 3D Referring Expression Segmentation

Jul 31, 2024
Via arXiv

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models

Jul 31, 2024
Via arXiv

Deep Instruction Tuning for Segment Anything Model

Mar 31, 2024
Via arXiv

Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization

Mar 11, 2024
Via arXiv

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models

Mar 05, 2024
Via arXiv

Towards Omni-supervised Referring Expression Segmentation

Nov 01, 2023
Via arXiv

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation

Aug 31, 2023
Via arXiv

Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models

May 24, 2023
Via arXiv