Mingsheng Li

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

Dec 16, 2024
Via arXiv

Chimera: Improving Generalist Model with Domain-Specific Experts

Dec 08, 2024
Via arXiv

Lightweight Model Pre-training via Language Guided Knowledge Distillation

Jun 17, 2024
Via arXiv

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts

Dec 17, 2023
Via arXiv

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

Nov 30, 2023
Via arXiv

Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning

Sep 06, 2023
Via arXiv