Lin Ma

Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding

Jan 03, 2025
Via arXiv

Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning

Dec 27, 2024
Via arXiv

RoboMM: All-in-One Multimodal Large Model for Robotic Manipulation

Dec 10, 2024
Via arXiv

DriveMM: All-in-One Large Multimodal Model for Autonomous Driving

Dec 10, 2024
Via arXiv

Flash Communication: Reducing Tensor Parallelization Bottleneck for Fast Large Language Model Inference

Dec 06, 2024
Via arXiv

RFSR: Improving ISR Diffusion Models via Reward Feedback Learning

Dec 04, 2024
Via arXiv

ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning

Dec 01, 2024
Via arXiv

TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability

Nov 27, 2024
Via arXiv

LESS: Label-Efficient and Single-Stage Referring 3D Segmentation

Oct 17, 2024
Via arXiv

VidCompress: Memory-Enhanced Temporal Compression for Video Understanding in Large Language Models

Oct 15, 2024
Via arXiv