Picture for Shuailei Ma

Shuailei Ma

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning

Add code
Dec 12, 2024
Viaarxiv icon

Learning Visual Generative Priors without Text

Add code
Dec 10, 2024
Viaarxiv icon

LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences

Add code
Dec 02, 2024
Viaarxiv icon

LoTLIP: Improving Language-Image Pre-training for Long Text Understanding

Add code
Oct 07, 2024
Figure 1 for LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Figure 2 for LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Figure 3 for LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Figure 4 for LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Viaarxiv icon

ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction

Add code
Apr 17, 2024
Figure 1 for ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction
Figure 2 for ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction
Figure 3 for ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction
Figure 4 for ControlMTR: Control-Guided Motion Transformer with Scene-Compliant Intention Points for Feasible Motion Prediction
Viaarxiv icon

CoReS: Orchestrating the Dance of Reasoning and Segmentation

Add code
Apr 08, 2024
Viaarxiv icon

DreamLIP: Language-Image Pre-training with Long Captions

Add code
Mar 25, 2024
Viaarxiv icon

Understanding the Multi-modal Prompts of the Pre-trained Vision-Language Model

Add code
Dec 18, 2023
Viaarxiv icon

A Simple Knowledge Distillation Framework for Open-world Object Detection

Add code
Dec 14, 2023
Viaarxiv icon

Detecting the open-world objects with the help of the Brain

Add code
Mar 21, 2023
Viaarxiv icon